Linear Equations

Table of Contents

Linear equation, Hyperplane, Systems of linear equations, Solution set, Consistency, Forward elimination, Back substitution

And so we begin from the top. There's not much we can do until we've defined some basic constructs. (Another text might regale you with some remarks on the beauty of linear algebra, but partly out of laziness, as most things go for me, I'll let the beauty of linear algebra speak for itself here.1)

Basics

Linear algebra is, at its core, the study of linear equations.

Definition. A linear equation in the variables \(x_1, x_2, \dots, x_n\) is an equation that can be expressed in the form

\[ a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b \]

where \(a_1, a_2, \dots, a_n\) and \(b\) are real numbers.2

The numbers \(a_1, \dots, a_n\) in the definition above are called coefficients. Variables are also sometimes called unknowns.

Example. The equation

\[ 1.2x + \sqrt{3}y + 10z = \pi \]

is a linear equation3 whereas

\[ xy + 3z = 0 \]

is not. Also keep in mind that

\[ 3(x - 4) = z \]

is a linear equation because it can be expressed as

\[ 3x + (-1)z = 12 \]

In your first course on algebra, you should have been introduced to the equation for a line in the plane, a very simple form of linear equation. You likely first learned the slope-intercept form

\[ y = mx + b \]

where \(m\) and \(b\) are real numbers. You hopefully also learned the general form

\[ ax + by = c \]

where \(a\), \(b\), and \(c\) are real numbers and \(x\) and \(y\) are variables.

An equation for a line in the plane is the same as a linear equation in two variables.

Example. The equation \(2x + 3y = -6\) is the general form of the line

\[ y = \left(-\frac{2}{3}\right)x - 2 \]

in slope-intercept form.
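The conversion in this example can be sketched in a few lines of Python. This is only an illustration: the function name `slope_intercept` is made up for these notes, and the intercept is called `k` in the code so that it doesn't clash with the coefficient \(b\) of the general form. It assumes \(b \neq 0\), since a vertical line has no slope-intercept form.

```python
from fractions import Fraction

def slope_intercept(a, b, c):
    """Rewrite a*x + b*y = c as y = m*x + k and return (m, k).

    Assumes integer inputs and b != 0 (a vertical line has no such form).
    """
    if b == 0:
        raise ValueError("vertical line: no slope-intercept form")
    # Solving a*x + b*y = c for y gives y = (-a/b)*x + c/b
    return Fraction(-a, b), Fraction(c, b)

m, k = slope_intercept(2, 3, -6)
print(m, k)  # -2/3 -2
```

Using `Fraction` keeps the slope exact instead of rounding it to a decimal.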

A line in the plane defines a set of points in the plane (\(\mathbb R^2\)). If we plot this set of points, we get (surprise…) a line.

Example. This is the graph of the line \(y = \left(-\frac{2}{3}\right)x - 2\). The line is made up of all points \((a, b)\) such that \(2a + 3b = -6\), e.g., the points \((-9, 4)\), \((6, -6)\), and \((12, -10)\).

Analogously, a linear equation in \(n\) variables defines a set of points in \(\mathbb R^n\). We can think of a point in \(\mathbb R^n\) as something like a tuple in Python, e.g., \((3.3, 12.03, \sqrt{3}, 4)\) is a point in \(\mathbb R^4\).

In set notation, a linear equation expressible as

\[ a_1x_1 + \dots + a_n x_n = b \]

defines the set of points

\[ \{ (c_1, \dots, c_n) : a_1 c_1 + \dots + a_n c_n = b \} \]

which is a subset of \(\mathbb R^n\). This notation expresses the basic fact that a linear equation defines the set of points that satisfy it.
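Since we're thinking of points as Python tuples, the membership condition in the set above can be written as a small check. This is a minimal sketch; the name `satisfies` is hypothetical, and the comparison is exact, so it is best used with integers or fractions (floating-point inputs would need a tolerance).

```python
def satisfies(coeffs, b, point):
    """Return True if a_1*c_1 + ... + a_n*c_n == b for the given point."""
    if len(coeffs) != len(point):
        raise ValueError("point and equation must have the same number of variables")
    return sum(a * c for a, c in zip(coeffs, point)) == b

# The line 2x + 3y = -6 from the earlier example
print(satisfies((2, 3), -6, (-9, 4)))  # True
print(satisfies((2, 3), -6, (0, 0)))   # False
```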

If we plot the set of points defined by a linear equation in three variables, we get a plane in \(\mathbb R^3\). We can imagine this is an infinite sheet of paper floating in space.

Example. This is the graph of the plane defined by the equation \(x + y + z = 1\). The plane includes, for example, the points \((4, 3, -6)\), \((-3, -5, 9)\), and \((-8, 8, 1)\).

We call the set of points defined by a linear equation in more than three variables a hyperplane. We, of course, cannot visualize hyperplanes, but we can often take advantage of our intuitions about lines in the plane (\(\mathbb R^2\)) or planes in Euclidean space (\(\mathbb R^3\)).4

But as helpful as our geometric intuitions may be, linear equations are perhaps more useful when we think about them algebraically. They describe relations that come up consistently in a number of application domains (e.g., combinatorial optimization, machine learning, scientific computing).

Example. Suppose you have \(5\) dollars and you're at a candy store which sells candies by the pound. You're interested in buying the following four candies in unknown amounts.

Candy                       Price
swedish fish                $10.50/lb
chocolate covered peanuts   $7.25/lb
peach rings                 $14.65/lb
gummy bears                 $6.00/lb

If you want to spend all \(5\) dollars, then the amounts of each candy you can buy are related by the following linear equation in the variables \(s\), \(c\), \(p\), and \(g\).

\[ 10.50s + 7.25c + 14.65p + 6g = 5 \]

I will leave it to you to map this example onto a more "useful" domain.
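To make the equation concrete, here is a small sketch that computes the cost of a proposed purchase and checks whether it uses up the whole budget. The names `PRICES`, `BUDGET`, `total_cost`, and `spends_everything` are made up for illustration, and the sample amounts are deliberately not a solution.

```python
from fractions import Fraction

# Prices in dollars per pound, in the order (s, c, p, g)
PRICES = (Fraction("10.50"), Fraction("7.25"), Fraction("14.65"), Fraction("6.00"))
BUDGET = Fraction(5)

def total_cost(amounts):
    """Cost in dollars of buying (s, c, p, g) pounds of each candy."""
    return sum(price * Fraction(amount) for price, amount in zip(PRICES, amounts))

def spends_everything(amounts):
    """True when the amounts satisfy 10.50s + 7.25c + 14.65p + 6g = 5."""
    return total_cost(amounts) == BUDGET

# A tenth, a fifth, a twentieth, and three tenths of a pound
amounts = (Fraction(1, 10), Fraction(1, 5), Fraction(1, 20), Fraction(3, 10))
print(float(total_cost(amounts)))  # 5.0325
print(spends_everything(amounts))  # False
```

This purchase overshoots the budget by a few cents, so it is not a point in the set defined by the equation.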

Exercise. Find two distinct points in the point set defined by the equation in the candy store example. What do these two points represent intuitively?

Exercise. Find a point at which the hyperplane defined by the equation in the candy store example intersects the \(s\) axis. What does this point represent intuitively?

Systems of Linear Equations

The situation is made more interesting by considering multiple linear equations simultaneously.

Definition. A system of linear equations (linear system) in the variables \(x_1, \dots, x_n\) is a collection of linear equations in the same variables.

When we consider a system of linear equations, we're usually interested in the points which lie in the sets defined by every equation in the system.

Definition. A solution to a system of linear equations in \(n\) variables is a point in \(\mathbb R^n\) which satisfies every equation in the system.

Example. The point \((5, 3)\) is a solution to the system

\begin{align*} -x + y &= -2 \\ -2x + y &= -7 \end{align*}

because if we set \(x = 5\) and \(y = 3\) in each equation, then every equation is satisfied:

\begin{align*} -5 + 3 &= -2 \\ -2(5) + 3 = -10 + 3 &= -7 \end{align*}
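The same check can be sketched in Python. The representation here is an assumption made for illustration: a system is a list of `(coefficients, b)` pairs, and `is_solution` is a hypothetical helper name. The important word in the definition is "every", which maps directly onto `all`.

```python
def is_solution(system, point):
    """system is a list of (coefficients, b) pairs; point is a tuple of numbers."""
    return all(
        sum(a * c for a, c in zip(coeffs, point)) == b
        for coeffs, b in system
    )

# -x + y = -2 and -2x + y = -7
system = [((-1, 1), -2), ((-2, 1), -7)]
print(is_solution(system, (5, 3)))  # True
print(is_solution(system, (3, 1)))  # False
```

The point \((3, 1)\) satisfies the first equation but not the second, so it is not a solution to the system.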

Example. The point \((4, -2, 0)\) is a solution to the system

\begin{align*} 2x + 3y + 2z &= 2 \\ x + y + 3z &= 2 \\ x + 3y + 2z &= -2 \end{align*}

because if we set \(x = 4\) and \(y = -2\) and \(z = 0\) in each equation, then every equation is satisfied:

\begin{align*} 2(4) + 3(-2) + 2(0) = 8 + (-6) &= 2 \\ 4 + (-2) + 3(0) &= 2 \\ 4 + 3(-2) + 2(0) = 4 + (-6) &= -2 \end{align*}

We call the set of all solutions to a linear system its solution set (naturally). Geometrically, the solution set of a linear system corresponds to the intersection of the point sets of each linear equation in the system. So the problem of solving a system of linear equations is analogous to the line-intersection problem in the plane.

Example. \((5, 3)\) is the point at which the lines defined by \(-x + y = -2\) and \(-2x + y = -7\) intersect.

Example. \((4, -2, 0)\) is the point at which the planes defined by \(2x + 3y + 2z = 2\) and \(x + y + 3z = 2\) and \(x + 3y + 2z = -2\) intersect.

Exercise. Verify that \((7, 1, 1)\) is a point in the solution set of

\begin{align*} x + 2y &= 9 \\ 3y + z &= 4 \\ -x + z &= -6 \end{align*}

One of our primary concerns moving forward will be: what does the solution set of a given linear system look like? We will eventually be able to exactly describe the "shape" of a solution set, but for now we will be interested in two questions.

  • (Existence) Does the system have a solution?
  • (Uniqueness) If it does have a solution, is it the only solution?

And, as is characteristic in linear algebra, we'll introduce terminology for restating the same thing in different language.

Terminology. A system of linear equations is called consistent if it has a solution. Otherwise it is called inconsistent.

Example. An inconsistent system in two variables with two equations represents parallel lines, e.g., the system

\begin{align*} 2x - 3y &= -5 \\ -4x + 6y &= -14 \end{align*}

is inconsistent. The lines defined by these equations, when graphed in the plane, are parallel.
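For two equations in two variables, the parallel-lines test can be done with a little arithmetic on the coefficients. The following is a sketch under stated assumptions: each equation is a triple `(a, b, c)` meaning \(ax + by = c\) with \(a\) and \(b\) not both zero, the inputs are integers or fractions, and `classify_2x2` is a made-up name.

```python
def classify_2x2(eq1, eq2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    # The lines cross at exactly one point unless their coefficients are proportional
    if a1 * b2 - a2 * b1 != 0:
        return "unique solution"
    # Parallel lines coincide only when the full equations are proportional
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions"
    return "inconsistent"

print(classify_2x2((2, -3, -5), (-4, 6, -14)))  # inconsistent
print(classify_2x2((-1, 1, -2), (-2, 1, -7)))   # unique solution
```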

Exercise. Give an example of a linear system in two variables with more than one solution.

Example. It's also possible to build an inconsistent linear system in three variables with two equations. This system would represent two parallel planes.

Exercise. Give an explicit example of a system of linear equations with three variables and two equations representing parallel planes. Then give a procedure for defining an inconsistent linear system with two equations in any number of variables.

Exercise. Perhaps more interesting, it's possible to define a system of linear equations in three variables with three equations such that every pair of equations forms a consistent system, but the system as a whole is inconsistent. Geometrically, this would represent three planes which each intersect with the others, but which have no point common to all three.

Give an explicit example of such a system.

One nice thing about systems of linear equations (as opposed to, say, systems of polynomial equations) is that, if we're just interested in the number of solutions, it turns out there are only three options.

Theorem. A system of linear equations either has zero, one, or infinitely many solutions.

If a linear system is consistent and it does not have a unique solution, then it must have infinitely many solutions.

Example. If two distinct planes in \(\mathbb R^3\) intersect, then they must intersect in a line. Thus, there are infinitely many points at the intersection of two such planes.

Exercise. (Challenge) Suppose that \((c_1, \dots, c_n)\) and \((d_1, \dots, d_n)\) are distinct solutions to a given linear system. Show that

\[ \left( \frac{c_1 + d_1}{2}, \dots, \frac{c_n + d_n}{2} \right) \]

is also a solution.

Solving Linear Systems

Solving a system of linear equations means finding a solution5 or showing that no such solution exists.

As we just mentioned, it's possible for a linear system to have no solutions or infinitely many solutions. We will deal with these situations soon, but in these notes, we will only consider solving systems of linear equations with unique solutions.

As a warm-up, let's first consider a system of linear equations in two variables (with a unique solution).

Since a linear equation in two variables defines a line in the plane, and solutions represent intersections, finding the solution to such a linear system means determining the point of intersection of two lines in the plane.

Going back again to our first algebra course, you likely learned to solve the line-intersection problem using the substitution method which can be roughly characterized in the case of two variables as:

  • solve for \(x\) in terms of \(y\) in the first equation
  • substitute \(x\) in the second equation
  • solve for \(y\)
  • substitute \(y\) in the first equation
  • solve for \(x\)

Example. Consider the following system of linear equations

\begin{align*} -x -2y &= 1 \\ x + y &= 2 \end{align*}

Solving for \(x\) in the first equation gives us

\[ x = -2y - 1 \]

Substituting the right-hand side for \(x\) in the second equation gives us

\[ (-2y - 1) + y = 2 \]

Solving for \(y\) in this new equation gives us \(y = -3\), and substituting this for \(y\) in the first equation gives us

\[ -x - 2(-3) = 1 \]

Solving for \(x\) finally gives us \(x = 5\), so \((5, -3)\) is a solution.
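The five steps of the substitution method translate into code fairly directly. This is a rough sketch, not a robust solver: `solve_by_substitution` is a hypothetical name, each equation is a triple `(a, b, c)` meaning \(ax + by = c\), and it assumes that the coefficient of \(x\) in the first equation is nonzero and that the system has a unique solution (otherwise the division fails).

```python
from fractions import Fraction

def solve_by_substitution(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by substitution.

    Assumes a1 != 0 and that the system has a unique solution.
    """
    a1, b1, c1 = (Fraction(v) for v in eq1)
    a2, b2, c2 = (Fraction(v) for v in eq2)
    # Solve the first equation for x in terms of y: x = p*y + q
    p, q = -b1 / a1, c1 / a1
    # Substitute into the second equation, a2*(p*y + q) + b2*y = c2, and solve for y
    y = (c2 - a2 * q) / (a2 * p + b2)
    # Substitute y back into the expression for x
    x = p * y + q
    return x, y

x, y = solve_by_substitution((-1, -2, 1), (1, 1, 2))
print(x, y)  # 5 -3
```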

Exercise. Find a solution to the system

\begin{align*} -x + y &= 2 \\ -3x + 2y &= 2 \end{align*}

using the substitution method.

The substitution method works perfectly well, but it doesn't scale well if we want to solve systems with lots of variables.

The method (which, again, you were hopefully also taught) that will be useful now is the elimination method:

  • eliminate the appearance of \(x\) in the second equation by adding to the second equation a multiple of the first (this solves for \(y\))
  • substitute the value for \(y\) into the first equation (this solves for \(x\))

Example. Consider again the system

\begin{align*} -x -2y &= 1 \\ x + y &= 2 \end{align*}

We can eliminate the appearance of \(x\) in the second equation by adding the first equation to the second equation:

\begin{align*} -x - 2y &= 1 \\ x + y &= 2 &+ \\ \hline - y &= 3 \end{align*}

So \(y = -3\) and we can substitute this value for \(y\) into the first equation:

\[ -x -2(-3) = 1 \]

So \(x = 5\) and \((5, -3)\) is a solution.
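In this example the right multiple of the first equation happened to be \(1\). In general, the multiple is chosen so that the \(x\) terms cancel, which the sketch below makes explicit. The name `solve_by_elimination` is made up for illustration, equations are triples `(a, b, c)` meaning \(ax + by = c\), and the code assumes the coefficient of \(x\) in the first equation is nonzero and the solution is unique.

```python
from fractions import Fraction

def solve_by_elimination(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by elimination.

    Assumes a1 != 0 and that the system has a unique solution.
    """
    a1, b1, c1 = (Fraction(v) for v in eq1)
    a2, b2, c2 = (Fraction(v) for v in eq2)
    # Choose the multiple m so that a2 + m*a1 == 0
    m = -a2 / a1
    # Add m times the first equation to the second; x no longer appears
    b2_new, c2_new = b2 + m * b1, c2 + m * c1
    y = c2_new / b2_new
    # Substitute y into the first equation and solve for x
    x = (c1 - b1 * y) / a1
    return x, y

x, y = solve_by_elimination((-1, -2, 1), (1, 1, 2))
print(x, y)  # 5 -3
```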

Exercise. Find a solution to the system

\begin{align*} -x + y &= 2 \\ -3x + 2y &= 2 \end{align*}

using the elimination method.

The elimination method is the basis of Gaussian elimination, one of our next topics. In its simplest form, the elimination method has two phases: a forward elimination phase and a back substitution phase.

Rather than dwelling on how this works in general (we'll get to that), let's consider a rough outline for using the elimination method for a linear system in three variables:

  • (Forward elimination)
    • eliminate \(x\) from all but the first equation
    • eliminate \(y\) from all but the first and second equation
    • solve for the value of \(z\) in the third equation
  • (Back substitution)
    • substitute the value of \(z\) into the first and second equation
    • solve for \(y\) in the second equation
    • substitute the value of \(y\) into the first equation
    • solve for \(x\) in the first equation

Example. Consider the system of linear equations

\begin{align*} x + 2y &= 1 \\ -x - y - z &= -1 \\ 2x + 6y - z &= 1 \end{align*}

We can eliminate the appearance of \(x\) in the second equation by adding the first equation to the second equation:

\begin{align*} x + 2y &= 1 \\ y - z &= 0 \\ 2x + 6y - z &= 1 \end{align*}

We can eliminate the appearance of \(x\) in the third equation by subtracting 2 times the first equation from the third equation:

\begin{align*} x + 2y &= 1 \\ y - z &= 0 \\ 2y - z &= -1 \end{align*}

We can eliminate the appearance of \(y\) in the third equation by subtracting 2 times the second equation from the third equation:

\begin{align*} x + 2y &= 1 \\ y - z &= 0 \\ z &= -1 \end{align*}

So \(z = -1\) and we can substitute this value into the first and second equation:

\begin{align*} x + 2y &= 1 \\ y &= -1 \\ z &= -1 \end{align*}

And \(y = -1\) so we can substitute this into the first equation:

\begin{align*} x &= 3 \\ y &= -1 \\ z &= -1 \end{align*}

So \((3, -1, -1)\) is a solution to the above system of linear equations, and we can verify this by plugging these values into the original equations:

\begin{align*} 3 + 2(-1) = 3 - 2 &= 1 \\ -3 - (-1) - (-1) = -3 + 1 + 1 &= -1 \\ 2(3) + 6(-1) - (-1) = 6 - 6 + 1 &= 1 \end{align*}
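The two phases used in this example can be sketched for \(n\) equations in \(n\) variables. This is a simplified illustration and not the full Gaussian elimination procedure: the name `solve_unique` is hypothetical, each equation is given as a list `[a_1, ..., a_n, b]`, and the code assumes the system has a unique solution and that no equations need to be reordered along the way. The back substitution phase is organized slightly differently from the outline above (it solves for the last variable and works upward), but it computes the same values.

```python
from fractions import Fraction

def solve_unique(rows):
    """Solve n equations in n variables; rows[i] = [a_i1, ..., a_in, b_i].

    Assumes every pivot is nonzero without reordering equations.
    """
    n = len(rows)
    m = [[Fraction(v) for v in row] for row in rows]
    # Forward elimination: remove variable k from every equation below equation k
    for k in range(n):
        if m[k][k] == 0:
            raise ValueError("zero pivot: this sketch does not reorder equations")
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            m[i] = [u - factor * v for u, v in zip(m[i], m[k])]
    # Back substitution: solve for the last variable first, then work upward
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        known = sum(m[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (m[k][n] - known) / m[k][k]
    return tuple(x)

print(*solve_unique([[1, 2, 0, 1], [-1, -1, -1, -1], [2, 6, -1, 1]]))  # 3 -1 -1
```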

The following is a graph of the planes in the above system.

One important observation to make about this example: when we perform the elimination method, we're creating a bunch of intermediate systems of linear equations. Even the last system in the previous example

\begin{align*} x &= 3 \\ y &= -1 \\ z &= -1 \end{align*}

is just a very simple system of linear equations. When graphed in \(\mathbb R^3\), this system is a collection of mutually perpendicular planes.

As we will come to see, the key fact is that all these systems of equations have the same solution set.
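As a sanity check on this claim for the example above, the sketch below confirms that \((3, -1, -1)\) satisfies every intermediate system produced by the elimination method. It uses the same hypothetical `is_solution` helper and `(coefficients, b)` representation as an illustration. Note that this only checks one direction (the solution survives each step), not that the solution sets are equal.

```python
def is_solution(system, point):
    """system is a list of (coefficients, b) pairs; point is a tuple of numbers."""
    return all(
        sum(a * c for a, c in zip(coeffs, point)) == b
        for coeffs, b in system
    )

point = (3, -1, -1)
stages = [
    [((1, 2, 0), 1), ((-1, -1, -1), -1), ((2, 6, -1), 1)],  # original system
    [((1, 2, 0), 1), ((0, 1, -1), 0), ((2, 6, -1), 1)],     # x eliminated from equation 2
    [((1, 2, 0), 1), ((0, 1, -1), 0), ((0, 2, -1), -1)],    # x eliminated from equation 3
    [((1, 2, 0), 1), ((0, 1, -1), 0), ((0, 0, 1), -1)],     # y eliminated from equation 3
    [((1, 0, 0), 3), ((0, 1, 0), -1), ((0, 0, 1), -1)],     # after back substitution
]
print(all(is_solution(s, point) for s in stages))  # True
```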

Exercise. Find a solution to the following system of linear equations.

\begin{align*} x + 2y + 4z &= 17 \\ -x - y - z &= -8 \\ -2x -3y - 4z &= -22 \end{align*}

Exercise. Suppose that there are three companies which each pay for a combination of the cloud computing platforms AWS, Google Cloud Platform, and Azure. For simplicity, also suppose that each company is paying the same amount per terabyte of data (on something like queries). You know how much data each company has used for each service and you know how much they each spent overall.

Company   AWS (TB)   Google Cloud (TB)   Azure (TB)   Total
a         35000      24000               10000        $439,000
b         90000      13000               21000        $813,000
c         41000      19000               34000        $571,000

Is it possible to determine how much each service costs per terabyte? (Hint. This will be a bit of a hassle to do by hand. Give it a shot, but also look ahead to the next chapter to see how to do it with the help of a computer.)

Footnotes:

1

See the lovely notes by Mark Crovella we'll also be using for some exposition along these lines.

2

It's also possible to consider the case in which these are complex numbers, but we'll only consider real numbers.

3

When it comes to unknowns in algebraic equations, it doesn't matter what symbols we use. Sometimes we'll use \(x\), \(y\) and \(z\), other times we'll use \(x_1\), \(x_2\), and \(x_3\). It will always be clear from context which symbols are variables.

4

For example (and this will be clearer as we go on), just like a plane in \(\mathbb R^3\) separates \(\mathbb R^3\) into two disjoint regions, a hyperplane in \(\mathbb R^{1934}\) also separates \(\mathbb R^{1934}\) into two disjoint regions.

5

As we will see, it can also mean describing the set of all possible solutions.