18.06 Pset 2¶

Due Friday Feb 18 at 11am on Canvas. Submit PDF(s) of your handwritten solutions along with your (clearly labeled) Julia code.

In [1]:
using LinearAlgebra

Problem 1 (10 points)¶

Suppose that $A$ is a $4\times 4$ matrix: $$ A = \begin{pmatrix} 1 & 3 & 2 & 5 \\ -2 & 1 & 1 & \\ 4 & 2 & & \\ -1 & & & \end{pmatrix} $$ In Julia:

In [2]:
A = [ 1 3 2 5
     -2 1 1 0
      4 2 0 0
     -1 0 0 0 ]
Out[2]:
4×4 Matrix{Int64}:
  1  3  2  5
 -2  1  1  0
  4  2  0  0
 -1  0  0  0

This is similar to an upper-triangular matrix, except that it is "flipped" horizontally. It doesn't matter in principle — we can still solve $Ax=b$ "bottom to top" similar to backsubstitution — but in practice it is often better to rewrite things in "standard" upper- or lower-triangular form because that's what all of the computer software is written to handle.

(a) Convert $A$ into a standard lower triangular matrix, of the form $$ L = \begin{pmatrix} a & & & \\ b & c & & \\ d & e & f & \\ g & h & i & j \end{pmatrix} = \mbox{products of permutation matrix/matrices with }A $$ as a product of $A$ with one or more permutation matrices on the left and/or right.

… and enter your answer in Julia as a check:

Solution (a)¶

Note that we can convert $A$ to standard lower-triangular form by reversing the order of the rows. Since this is a row operation, it corresponds to multiplying $A$ on the left by a permutation matrix $P$. What does $P$ look like? To see the matrix $P$, we can consider $P I = P$, which re-orders the rows of $I$ according to $P$.

In other words, we have $$ L = PA = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 3 & 2 & 5 \\ -2 & 1 & 1 & \\ 4 & 2 & & \\ -1 & & & \end{pmatrix} = \boxed{ \begin{pmatrix} -1 & & & \\ 4 & 2 & & \\ -2 & 1 & 1 & \\ 1 & 3 & 2 & 5 \end{pmatrix}} $$

In [3]:
P = [ 0 0 0 1; 0 0 1 0; 0 1 0 0; 1 0 0 0]
L = P*A # your answer here as a product of A with some other matrix/matrices
Out[3]:
4×4 Matrix{Int64}:
 -1  0  0  0
  4  2  0  0
 -2  1  1  0
  1  3  2  5

(b) Suppose that we want to solve $Ax = b$ for $b = [26,-1,10,-1]$. Using your answer from (a), write $x = A^{-1} b$ in the following form:

  1. $y = $ something easy with $b$?
  2. forward-substitution solve of $Lz = y$
  3. $x = $ something easy with $z$?

Solution (b)¶

To solve $Ax = b$, we can apply a permutation on both sides to get $PA x = Pb$. Since $L = PA$, we have $L x = Pb$. This represents a permutation, followed by a lower-triangular solve (forward substitution).

Therefore, in step 1 we can let $\boxed{y = Pb}$, which is just $b$ with the rows in reverse order. We remark that in practice you would not compute $y = Pb$ by an explicit matrix-vector multiplication (~ $n^2$ operations)) — instead, you would just apply the linear operator $P$ to $b$, which amounts to reversing the rows (~ $n$ work).

Then, in step 2 we solve $Lz = y$ for $z$ using forward-substitution.

Finally, in step 3 we have $\boxed{x = z}$.

(c) Implement your answer from (b) in Julia and check it by filling in the outline below:

Solution (c)¶

As shown below, we have $\boxed{x = \begin{pmatrix} 1 \\ 3 \\ -2 \\ 4 \end{pmatrix}}$.

In [4]:
using LinearAlgebra

b = [26, -1, 10, -1]

y = b[4:-1:1] # something easy with b: just reverse the order.  b[ [4,3,2,1] ] or reverse(b) also work
z = LowerTriangular(L) \ y    # tells Julia to use forward-substitution
x = z # something (really) easy with z
Out[4]:
4-element Vector{Float64}:
  1.0
  3.0
 -2.0
  4.0
In [5]:
# compare it to naive solution:
A \ b
Out[5]:
4-element Vector{Float64}:
  1.0000000000000007
  2.9999999999999987
 -1.9999999999999984
  4.0

Yay, it matches (up to roundoff errors).

Problem 2 (15 points)¶

Suppose that we have the equation $$ YX+XZ = \underbrace{\begin{pmatrix} 1 & 2 \\ -3 & 4 \end{pmatrix}}_{Y} \underbrace{\begin{pmatrix} a & b \\ c & d \end{pmatrix}}_X + X \underbrace{\begin{pmatrix} 5 & -6 \\ 7 & 8 \end{pmatrix}}_Z = \underbrace{\begin{pmatrix} -7 & 21 \\ -3 & 3 \end{pmatrix}}_{B} $$ for the unknown $2\times 2$ matrix $X$.

(a) Explain why the left-hand side $YX+XZ$ is a linear operation (according to the definition in lecture 1) on $X$, where $X$ is viewed as a "vector" in the vector space of $2 \times 2$ matrices.

Solution (a)¶

We can check that for any $2 \times 2$ matrices $A, B$ and for any scalar $\alpha$, the linearity conditions are satisfied: $$\boxed{Y(A+B) + (A+B)Z = YA + YB + AZ + BZ = (YA + AZ) + (YB + BZ)}$$ and $$\boxed{Y(\alpha A) + (\alpha A) Z = \alpha YA + \alpha AZ = \alpha (YA + AZ)}$$ Therefore the left-hand side $YX + XZ$ is a linear operation on $X$.

(b) If we re-arrange the 4 unknowns into a column vector $x = [a,b,c,d]$, write down a $4\times 4$ matrix equation $Ax = b$ for $x$, i.e. write down the matrix $A$ and the vector $b$.

Solution (b)¶

Note that $$ \begin{pmatrix} 1 & 2 \\ -3 & 4 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 5 & -6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} a+2c & b+2d \\ -3a+4c & -3b+4d \end{pmatrix} + \begin{pmatrix} 5a+7b & -6a+8b \\ 5c+7d & -6c+8d \end{pmatrix} = \begin{pmatrix} 6a+7b+2c & -6a+9b+2d \\ -3a+9c+7d & -3b-6c+12d \end{pmatrix} $$ Therefore we have $$ \boxed{ \begin{pmatrix} 6 & 7 & 2 & 0 \\ -6 & 9 & 0 & 2\\ -3 & 0 & 9 & 7 \\ 0 & -3 & -6 & 12 \end{pmatrix} \begin{pmatrix} a\\b\\c\\d \end{pmatrix} = \begin{pmatrix} -7 \\ 21 \\ -3 \\ 3 \end{pmatrix}}$$

Note that this is sometimes called vectorizing the matrix and is denoted vec($X$), although the "standard" way to do it is by columns as $[a, c, b, d]$. You can then write $A$ in terms of something called "Kronecker products" involving $Y$ and $Z$. We won't cover this topic in 18.06, but it may be fun to read about them on Wikipedia etc.

(c) Enter your $A$ and $b$ in Julia and solve for $x$. Using your x, form the matrix $X$ and check that $YX + XZ = B$ (up to roundoff errors).

Solution (c)¶

As shown below, we can solve $Ax = b$ and get $\boxed{x = \begin{pmatrix} -2 \\ 1 \\ -1 \\ 0 \end{pmatrix}}$.

In [1]:
A = [ 6 7 2 0; -6 9 0 2; -3 0 9 7; 0 -3 -6 12]# your matrix here
b = [-7; 21; -3; 3]# your vector here
x = A \ b
Out[1]:
4-element Vector{Float64}:
 -2.0
  1.0
 -1.0
  0.0
In [2]:
X = [ x[1] x[2]; x[3] x[4] ]# your solution here, in terms of x above.  e.g [ x[1] x[3]; x[4] x[2] ] or whatever

Y = [1 2; -3 4]
Z = [5 -6; 7 8]
B = [-7 21; -3 3]

Y*X + X*Z ≈ B # should return "true"
Out[2]:
true

(d) If $X,Y,Z,B$ were all $n \times n$ matrices and you followed the above strategy, solving $Ax=b$ with ordinary Gaussian elimination and not taking advantage of any special structure of $A$, how would the computational cost scale with $n$? i.e. proportional to $n^2,n^3,n^4,\ldots$?

In fact, this is called a Sylvester equation and there are fancier ways to solve it (not covered in 18.06) that require $\sim n^3$ operations. The following Julia code should give the same answer $X$ up to roundoff errors:

Solution (d)¶

Since $X,Y,Z,B$ are all $n \times n$ matrices, $A$ is an $n^2 \times n^2$ matrix. Therefore, the computational cost of solving $Ax = b$ with ordinary Gaussian elimination is $\sim (n^2)^3 = \boxed{\sim n^6}$.

In [8]:
using LinearAlgebra # in case you didn't load this yet

X_fast = sylvester(Y, Z, -B) # solves YX+XZ = B for X much more efficiently than above
Out[8]:
2×2 Matrix{Float64}:
 -2.0  1.0
 -1.0  2.22045e-16

Yay, this matches our answer above:

In [3]:
X
Out[3]:
2×2 Matrix{Float64}:
 -2.0  1.0
 -1.0  0.0

Problem 3 (12 points)¶

Which of the following sets are vectors spaces (with the usual definition of $\pm$ and multiplication by real scalars), and why?

  1. The set of 4-component vectors whose components sum to 1.
  2. The set of 2-component vectors whose components have product 0.
  3. The union of two subspaces $S$ and $T$: all vectors that are in either $S$ or $T$.
  4. The intersection of two subspaces $S$ and $T$: all vectors that are in both $S$ and $T$.
  5. Given a subspace $V$ of $\mathbb{R}^n$ and an $m\times n$ matrix $A$, the set of all vectors $Ax$ for all $x\in V$.
  6. The set of all functions $f(x)$ such that the integral $\int_{-\infty}^\infty e^{-x^2} f(x) dx$ is zero.

Solution¶

  1. This is not a vector space. This set does not contain the zero vector $\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0\end{pmatrix}$.

  2. This is not a vector space. This set contains the vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, but it does not contain their sum $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

  3. The union of two vector spaces, $S\cup T$, is in general not a vector space. Suppose $S$ is the set of all vectors of the form $\begin{pmatrix} a \\ 0 \end{pmatrix}$ and $T$ is the set of all vectors of the form $\begin{pmatrix} 0 \\ b \end{pmatrix}$. These are both (one-dimensional) subspaces of $\mathbb{R}^2$. However, if $a,b\neq0$, then $\begin{pmatrix} a \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ b \end{pmatrix}$ are both in $S \cup T$, but their sum $v = \begin{pmatrix} a \\ b \end{pmatrix}$ is in neither $S$ nor $T$, and so $v$ is not in $S \cup T$.

  4. The intersection of two vector spaces, $S\cap T$, is a vector space. Consider two vectors $u,v\in S \cap T$. Then $u,v\in S$ and $u, v \in T$. Consider a linear combination $au + bv$, where $a,b \in \mathbb{R}$. Then $au+bv \in S$, since $S$ is a vector space, and similarly $au+bv \in T$ since $T$ is a vector space. Therefore $au + bv \in S \cap T$, and so $S\cap T $ is a vector space.

  5. This is a vector space. Let $X = \{Ax : x\in V\}$, where $V$ is a subspace of $\mathbb{R}^n$. Suppose we have two vectors $u_1, u_2 \in X$. Then there exist two vectors $v_1, v_2\in V$, such that $u_1 = Av_1$ and $u_2 = Av_2$. Let's consider the linear combination $au_1 + bu_2$ for arbitrary scalars $a,b\in \mathbb{R}$. Then

$$ au_1 + bu_2 = a Av_1 + bAv_2 = A\left(av_1 + bv_2\right). $$

But since $V$ is a vector space, $av_1 + bv_2 \in V$ and so $au_1 + bu_2 \in X$. And so $X$ is a subspace of $\mathbb{R}^m$.

  1. This is a vector space. We know that the set of all real-valued functions $f(x)$ is a vector space. To show that the subset of all functions $f(x)$ whose integral $\int_{-\infty}^{\infty} e^{-x^2} f(x) \, dx = 0$ is a subspace, we need to show that $af(x)+bg(x)$ is in this subset, for any real numbers $a,b\in\mathbb{R}$ and functions $f(x), g(x)$ for which $\int_{-\infty}^{\infty} e^{-x^2} f(x) \, dx = \int_{-\infty}^{\infty} e^{-x^2} g(x) \, dx = 0$. Now using standard properties of integration
$$ \int_{-\infty}^{\infty} \left[af(x) + bg(x)\right] = a \int_{-\infty}^{\infty} e^{-x^2} f(x) \, dx + b \int_{-\infty}^{\infty} e^{-x^2} g(x) \, dx = 0 $$

Hence this set is a subspace of the vector space of real-valued functions.

Problem 4 (10 points)¶

(From Strang, section 3.2, problem 30.)

How is the nullspace $N(C)$ related to $N(A)$ and $N(B)$ if $C = \begin{pmatrix} A \\ B \end{pmatrix}$? ($A$ is $m \times n$ and $B$ is $p \times n$.)

Solution¶

Consider the $(m+p)\times n$ matrix $C = \begin{pmatrix} A \\ B \end{pmatrix}$, where $A$ is $m\times n$ and $B$ is $p\times n$. The null space of $C$ is the space $N(C)$ of $n\times 1$ column vectors $x$ for which $Cx = 0$.

Using the properties of matrix multiplication, we can see that $Cx = \begin{pmatrix} Ax \\ Bx \end{pmatrix}$, and so $Cx = 0$ if and only if $Ax= 0$ and $Bx = 0$. Equivalenetly, $x\in N(C)$ if and only if $x\in N(A)$ and $x\in N(B)$. This means that $N(C)$ must be the intersection of $N(A)$ and $N(B)$, i.e. $\boxed{N(C) = N(A) \cap N(B)}$.

Problem 5 (10 points)¶

Explain why $N(B)$ must be a subspace of $N(AB)$ for any $m \times n$ matrix $A$ and any $n \times p$ matrix $B$.

Solution¶

We know that the null space is always a vector space, so $N(B)$ and $N(AB)$ are both vector spaces. We therefore just need to show that $N(B)$ is a subset of $N(AB)$, i.e. that if $x\in N(B)$, then $x\in N(AB)$.

Suppose $x\in N(B)$. This means that $Bx = 0$. But then $(AB)x = A(Bx) = A 0 = 0$, and so $(AB)x = 0$. This means that $x\in N(AB)$, and so $N(B)$ is a subspace of $N(AB)$.

Problem 6 (10 points)¶

In lecture 6, I said that in writing $$ b = A\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} x_1 + \begin{pmatrix} 1 \\ 3 \\ 0 \end{pmatrix} x_2 + \begin{pmatrix} 0 \\ 2 \\ 0 \end{pmatrix} x_3 $$ the third vector $[0,2,0]$ is redundant.

Explain how, for any $[x_1,x_2,x_3]$ above, we can write the same vector $b$ as: $$ b = A\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} x_1' + \begin{pmatrix} 1 \\ 3 \\ 0 \end{pmatrix} x_2' $$ for some $[x_1', x_2']$. That is, give a formula for $[x_1', x_2']$ in terms of $[x_1,x_2,x_3]$.

Solution¶

Note that this just corresponds to finding a vector in the null space of $A$, and then scaling it so that the last component is 1.

In lecture 6, we were given a null-space vector $\begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}$. We can then scale it to $\begin{pmatrix} \frac{1}{3} \\ -\frac{2}{3} \\ 1 \end{pmatrix}$.

Hence, we can write the vector $b$ as: $$ b = A\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix} x_1' + \begin{pmatrix} 1 \\ 3 \\ 0 \end{pmatrix} x_2' $$ where $\boxed{x_1' = x_1 - \frac{x_3}{3}}$ and $\boxed{x_2' = x_2 + \frac{2x_3}{3}}$.

Problem 7 (12 points)¶

(a) Find the special solutions (a basis for the null space) for the matrix $$ A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 6 \\ 0 & 0 & 1 & 2 \end{pmatrix} . $$

(b) How does your answer change if you insert a column of 0's between the 1st and 2nd columns of $A$ (so that $A$ now has 5 columns, one of which is all 0's)?

(c) How does your answer change if you insert a row of zeros between the 1st and 2nd rows (so that $A$ now has 4 rows, one of which is all zeros)?

Solution¶

(a) Note that $$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 6 \\ 0 & 0 & 1 & 2 \end{pmatrix} \xrightarrow{R_2 - R_1}{} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 1 & 2 \end{pmatrix} \xrightarrow{R_3 - R_2}{} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{pmatrix} $$ Now we have an upper triangular matrix. Each special solution is a triangular solve (backsubstitution) from each of the free columns.

Since the free columns are 2 and 4, we can solve with $[x, 1, y, 0]$ for $x$ and $y$, and then solve with $[x, 0, y, 1]$ for $x$ and $y$, and that gives us the 2 special solutions by 2 backsolves.

More specifically, for $[x, 1, y, 0]$, from the second row of the upper-triangular form we have $$ 1\times y + 2\times 0 = 0 \Rightarrow y = 0 $$ Then, from the first row of the upper-triangular form we have $$ 1\times x + 2\times 1 + 3 \times 0 + 4 \times 0= 0 \Rightarrow x = -2 $$ Therefore, one specical solution is $\begin{pmatrix} -2 \\ 1 \\ 0 \\ 0 \end{pmatrix}$.

For $[x, 0, y, 1]$, from the second row of the upper-triangular form we have $$ 1\times y + 2\times 1 = 0 \Rightarrow y = -2 $$ Then, from the first row of the upper-triangular form we have $$ 1\times x + 2\times 0 + 3 \times -2 + 4 \times 1 = 0 \Rightarrow x = 2 $$ Therefore, another specical solution is $\begin{pmatrix} 2 \\ 0 \\ -2 \\ 1 \end{pmatrix}$.

Hence, a basis for the null space is $\boxed{\left\{\begin{pmatrix} -2 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \\ -2 \\ 1 \end{pmatrix} \right\}}$.

(b) Note that with a column of 0's between the 1st and 2nd columns of $A$ added, the new upper triangular matrix is $\begin{pmatrix} 1 & 0 & 2 & 3 & 4 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$.

Therefore, we need to add a 0 between the 1st and 2nd entries of the special solutions above to get $\begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 0 \\ 0 \\ -2 \\ 1 \end{pmatrix}$.

We also need a null-space vector for the second column: $\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$.

Hence, a basis for the null space is $\boxed{\left\{\begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \\ 0 \\ -2 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \right\}}$.

(c) Note that after inserting a row of 0's between the 1st and 2nd rows of $A$, the new upper triangular matrix is $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$ and hence the null space is the same as the one in (a).

In other words, a basis for the null space is $\left\{\begin{pmatrix} -2 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \\ -2 \\ 1 \end{pmatrix} \right\}$.