pset11sol

Problem 1 (5+5)¶

(a) Suppose that $A$ is an $m \times m$ matrix, and $A + A^H$ is positive-definite. It follows that any eigenvalue $\lambda$ of $A$ must have a positive real part — why? (Hint: suppose $Ax=\lambda x$ and plug the eigenvector $x$ into $x^H (A + A^H) x$, which must be _____.)

(b) Consider $A = \begin{pmatrix} -1 & -3 & -3 & -7\\ 3 & -2 & -2 & -2\\ 3 & 2 & -3 & 1\\ 7 & 2 & -1 & -4\\ \end{pmatrix}$. Suppose that we multiply an arbitrary vector $y$ by $e^A$ over and over. With the help of your answer from (a), using no numerical calculation, explain why you would expect the resulting vector to grow, decay to zero, oscillate forever, or approach a nonzero steady state (pick one)?

Solution¶

(a) Assume that $\lambda$ is an eigenvalue of $A$ and let $Ax=\lambda x$ for the corresponding eigenvector $x$ (both $\lambda$ and $x$ might be complex). Following the hint, consider the scalar $x^H(A+A^H)x$, which must be a positive real number since $A+A^H$ is positive-definite. Since $Ax=\lambda x$ we have $$ x^H(A+A^H)x=x^HAx+x^HA^Hx=x^H(Ax)+(Ax)^Hx=\lambda x^Hx+\overline{\lambda}x^Hx=(\lambda+\overline{\lambda})x^Hx>0. $$ Since $x^Hx$ is also a positive real number, $\lambda+\overline\lambda=\frac{x^H(A+A^H)x}{x^Hx}$ is a positive real number as well. Hence $\mathrm{Re}\,(\lambda)=\frac{\lambda+\overline{\lambda}}{2}$ is positive.

(b) We start by checking what we get using (a). Note that $$A+A^H=\begin{pmatrix}-2 &0 &0&0\\ 0&-4&0&0\\0&0&-6&0\\0&0&0&-8\end{pmatrix}.$$ Clearly the matrix $A+A^H$ is negative-defenite: it is a diagonal matrix with eigenvalues $-2, -4, -6, -8$. Hence (a), applied to $-A-A^H$, implies that all eigenvalues of $A$ must have negative real parts.

This is enough to determine the expected behaviour of $e^{nA}y$ for large $n$. If $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are eigenvalues of $A$ with corresponding eigenvectors $x_1, x_2, x_3, x_4$ (which we generally expect to form a basis), then for any vector of the form $y=c_1x_1+\dots+c_4x_4$ we have $$ e^{nA}y=c_1e^{n\lambda_1}x_1+e^{n\lambda_2}c_2x_2+e^{n\lambda_3}c_3x_3+e^{n\lambda_3}c_4x_4. $$ Note that the complex numbers $e^{n\lambda_i}$ must have absolute values $$ |e^{n\lambda_i}|=\sqrt{e^{n\lambda_i}e^{n\overline{\lambda_i}}}=\sqrt{e^{2n\ \mathrm{Re}(\lambda_i)}}=e^{n\ \mathrm{Re}(\lambda_i)}, $$ which exponentially decay as $n$ grows since the real parts of $\lambda_i$ are negative. Hence, we expect $e^{nA}y$ to decay exponentially as $n$ grows.

More details: One might be concerned by our solution of part (b): we use that $A$ has a basis of eigenvectors $x_i$ or equivalently $A$ is diagonalizable (this is used when we write arbitrary vector $y$ as $c_1x_1+\dots+c_4x_4$). For a general matrix $A$ we generically expect $A$ to be diagonalizable, and in fact $A$ from (b) is diagonalizable: by computing eigenvalues we can see that they are four different numbers, which implies that $A$ is indeed diagonalizable:

In [1]:

using LinearAlgebra;
A=[-1 -3 -3 -7; 3 -2 -2 -2; 3 2 -3 1; 7 2 -1 -4];
eigvals(A)

Out[1]:

4-element Vector{ComplexF64}:
  -2.642807608141947 - 0.08744591155269804im
  -2.642807608141947 + 0.08744591155269804im
 -2.3571923918580566 - 8.575146718187343im
 -2.3571923918580566 + 8.575146718187343im

However, it is actually not necessary to know that $A$ is diagonalizable: if eigenvalues of $A$ have negative real parts it is always true that $e^{nA}y$ decays exponentially. Showing this was not required for this pset, since it uses Jordan vectors. More pricesely, for each eigenvalue $\lambda$ of $A$, in addition to an eigenvector $Ax=\lambda x$ we might have a sequence of Jordan vectors (a "Jordan chain") $j_1, j_2, \dots, j_k$ behaving like $$ Aj_1=\lambda j_1+ x, \qquad Aj_r=\lambda j_r+ j_{r-1}, $$ (here $x$ is a regular eignevector of $A$ with eigenvalue $\lambda$). One can check using the definition of $e^{nA}$ via Taylor series that the action of $e^{nA} = (e^A)^n$ on these Jordan vectors is given by polynomials in n times powers of $e^{\lambda}$ $$ e^{nA}j_r=e^{n\lambda}j_r+ne^{(n-1)\lambda}j_{r-1}+n (n-1) e^{(n-2)\lambda}j_{r-2}+\dots+n (n-1) \cdots (n-r+1) e^{(n-r)\lambda}x. $$ The expression above decays exponentially because $n^k e^{n\lambda}$ decays (powers $n^k$ of $n$ grow more slowly than the exponential decay of $e^{n\lambda}$) and since Jordan vectors form a basis, $e^{nA}$ decays as well.

Problem 2 (5+5)¶

(a) Suppose $A=X^H X$ and $B = Y^H Y$ are two positive-definite $n \times n$ matrices, for some $X$ and $Y$ matrices. $A + B = Z^H Z$ where $Z = \_\_\_$ (give an explicit formula in terms of $X$ and $Y$, no calculation required). What must be true of the dimensions of the matrices $X$, $Y$, and $Z$?

(b) Suppose $A = X^H C X$ where $C$ is positive-definite and $X$ has full column rank. Why must $A$ be positive-definite?

Solution¶

(a) Set $Z=\begin{pmatrix} X \\ Y\end{pmatrix}$. Then $$Z^HZ=\begin{pmatrix}X^H & Y^H\end{pmatrix} \begin{pmatrix}X \\Y\end{pmatrix}=X^HX+Y^HY.$$ Note that the number of columns of $X,Y,Z$ must be $n$, since $A=X^HX, B=Y^HY, A+B=Z^HZ$ are $n\times n$ matrices. Since $A,B,A+B$ are positive definite, all three of these matrices must have a zero nullspace (otherwise $x^HAx=0$ for $x\in N(A)$), hence $N(X)=N(Y)=N(Z)=0$. This is only possible if $X, Y, Z$ are full-column rank, which is only possible if the number of rows in all these matrices is at least $n$. So, dimensions of $X, Y, Z$ must be of the form $m\times n$ where $m\geq n$ ($m$ is arbitrary and might vary between $X, Y, Z$).

(b) To show that $A$ is positive-definite it is enough to show that $v^HAv>0$ for any non-zero $v$ (note that $A=A^H$, so $v^HAv$ is a real number). Note that $v^HAv=v^HX^HCXv=(Xv)^HC(Xv)$. Since $C$ is positive definite, this implies that $v^HAv=(Xv)^HC(Xv)>0$ unless $Xv=0$. But $X$ has full column rank, so $N(X)=0$ and $Xv=0$ only if $v=0$. So, $v^HAv>0$ unless $v=0$.

Problem 3 (5+5+5+5+5)¶

You are given the matrix $$ A = \begin{pmatrix} 7 & 4 & -5 \\ 4 & -2 & 4 \\ -5 & 4 & 7 \end{pmatrix} $$

(a) Two of the eigenvalues of $A$ are 6 and –6. Find the third eigenvalue. The eigenvalues must be real because $A$ is ____________. Check your answer in Julia using eigvals(A) (don't forget to do using LinearAlgebra first).

(b) Two eigenvectors of $A$ are the column vectors (1,1,1) and (1,-2,1). Find the third eigenvector. The eigenvectors must be ____________ to one another because $A$ is ____________

(c) Suppose that $x(t)$ is the solution to $\frac{dx}{dt} = Ax - 12x$ with $x(0) = (1,0,0)$. After a long time $x(t)$ is approximately parallel to what vector? Check your answer by computing $x(t) = e^{(A-12I)t} x(0)$ in Julia for a large $t$ using exp((A-12I)*t).

(d) Give an exact expression for $x(t)$ from (c). You should be able to write your answer easily, without doing Gaussian elimination or anything fancy — only a few dot products need to be computed, because the eigenvectors are ____________. Check your answer in Julia using exp((A-12I)*t).

(e) As $t \to \infty$, the matrix $e^{(A-12I)t}$ approaches what projection matrix $P$?

Solution¶

(a) Recall that $\mathrm{tr}\,(A)$ is equal to both the sum of eigenvalues and the sum of diagonal entries of $A$. Hence $\mathrm{tr}\,(A)=12$ and the third eigenvalue $\lambda$ satisfies $\lambda+6-6=12$, hence $\lambda=\boxed{12}$. The eigenvalues are real because $A$ is hermitian (more specifically, real symmetric). The Julia verification is below:

In [2]:

using LinearAlgebra;
A=[7 4 -5; 4 -2 4; -5 4 7];
eigvals(A)

Out[2]:

3-element Vector{Float64}:
 -6.0
  6.0
 12.0

(b) The eigenvectors must be orthogonal to one another because $A$ is hermitian (or real symmetric). Using this orthogonality, the third eigenvector $x$ can be found as a solution of $$ \begin{pmatrix}1&1&1\\1&-2&1\end{pmatrix}x=0. $$ Solving this, we get $$ \begin{pmatrix}1&1&1\\0&-3&0\end{pmatrix}x=0. $$ hence by back-substitution all solutions are proportional to $x=\begin{pmatrix}-1\\0\\1\end{pmatrix}$. So, $\boxed{\begin{pmatrix}-1\\0\\1\end{pmatrix}}$ is the third eigenvector, correpsonding to the eigenvalue $12$.

There are also other ways to find the third eigenvector: we can find it is as a solution of $(A-12I)x=0$, or we could have noted that $x$ is proportional to the cross-product of the two known eigenvectors.

(c) Let $\lambda_1, \lambda_2,\lambda_3$ denote the eigenvalues $6, -6, 12$ from (a), and let $x_1, x_2, x_3$ denote the corresponding eigenvectors $\begin{pmatrix}1\\1\\1\end{pmatrix},\begin{pmatrix}1\\-2\\1\end{pmatrix},\begin{pmatrix}-1\\0\\1\end{pmatrix}$ from (b). Note that $x_i$ are also eigenvectors of $A-12I$ and $e^{A-12I}$, with eigenvalues $\lambda_i-12$ and $e^{\lambda_i-12}$ respectvely. Since the eignevectors form a basis of $\mathbb R^3$, we have an expansion $x(0)=c_1x_1+c_2x_2+c_3x_3$ for some numbers $c_1, c_2, c_3$. The formula for solution of $\frac{dx(t)}{dt}=(A-12I)x$ is $x(t)=e^{(A-12I)t}x(0)$. Plugging the expression of $x(0)$ in terms of the eigenvectors we get $$ x(t)=e^{(A-12I)t}x(0)=c_1 e^{(\lambda_1-12)t} x_1+c_2 e^{(\lambda_2-12)t} x_2+c_3 e^{(\lambda_3-12)t} x_3=c_1 e^{-6t}\begin{pmatrix}1\\1\\1\end{pmatrix} +c_2 e^{-18t} \begin{pmatrix}1\\-2\\1\end{pmatrix}+c_3 \begin{pmatrix}-1\\0\\1\end{pmatrix}. $$ When $t$ is large the expression above is approximately $c_3 \begin{pmatrix}-1\\0\\1\end{pmatrix}$, which is parallel to $\boxed{\begin{pmatrix}-1\\0\\1\end{pmatrix}}$. The Julia verification is below:

In [3]:

t=1000;
x0=[1; 0; 0];
exp(t*(A-12*I))*x0

Out[3]:

3-element Vector{Float64}:
  0.5000000000002276
  0.0
 -0.5000000000002274

(d) To get an exact expression for $x(t)$ we need to compute the coefficents $c_i$ in $x(0)=c_1x_1+c_2x_2+c_3x_3$ from (c). Since $x_1, x_2, x_3$ are orthogonal, we have $c_i=\frac{x_i^Hx(0)}{x_i^Hx_i}$. Plugging expressions for $x_i, x(0)$, we get $$ c_1=\frac{1}{3}, \qquad c_2=\frac{1}{6}, \qquad c_3=-\frac{1}{2}. $$ Hence, $$ x(t)=e^{(A-12I)t}x(0)=\boxed{\frac{1}{3} e^{-6t}\begin{pmatrix}1\\1\\1\end{pmatrix} +\frac{1}{6} e^{-18t} \begin{pmatrix}1\\-2\\1\end{pmatrix}+\frac12 \begin{pmatrix}1\\0\\-1\end{pmatrix}}. $$

(e) As we showed in part (c), for any vector $v$ of the form $v=c_1x_1+c_2x_2+c_3x_3$ we have $$ e^{(A-12I)t}v=c_1 e^{-6t} x_1+c_2 e^{-18t} x_2+c_3 x_3. $$ When $t\to \infty$, the expression above approaches $c_3x_3$. Using the orthogonality of $x_i$ to compute $c_3$ again, we get $$ e^{(A-12I)t}v \to \frac{x_3^Hv}{x_3^Hx_3} x_3 $$ as $t\to\infty$. This limit is exactly the projection of $v$ on $x_3=\begin{pmatrix}-1\\0\\1\end{pmatrix}$, so as $t$ goes to infinity $e^{(A-12I)t}$ approaches the projection onto $\begin{pmatrix}-1\\0\\1\end{pmatrix}$. The matrix $P$ is given by $$ P=\frac{x_3x_3^H}{x_3^Hx_3}=\boxed{\frac{1}{2}\begin{pmatrix} 1&0&-1\\0&0&0\\-1&0&1\end{pmatrix}}. $$

Problem 4 (5+5+5+5)¶

Suppose that $A$ is a Hermitian positive-definite matrix, and we find a sequence of vectors $x^{(1)}, x^{(2)}, x^{(3)}, \ldots$ by repeatedly solving the recurrence equation: $$ x^{(n+1)} - x^{(n)} = -A\frac{x^{(n+1)} + x^{(n)} }{2} $$ starting from a vector $x^{(0)}$.

(a) Derive a formula $x^{(n+1)} = \mbox{(some matrix)} x^{(n)}$ in terms of "some matrix" that does not depend on $n$.

(b) Your formula from (a) involves the inverse of a matrix. Why must that matrix always be invertible?

(c) If $17$ is an eigenvalue of $A$, what is an eigenvalue of your "some matrix" from (a)?

(d) Would you expect the sequence of vectors $x^{(n)}$ to grow, decay, or oscillate as $n$ becomes large?

Solution¶

(a) Rearranging the terms in the defining relation, we get $$ x^{(n+1)} - x^{(n)} = -A\frac{x^{(n+1)} + x^{(n)} }{2} $$ $$ \left(I+\frac12 A\right)x^{(n+1)} = \left(I-\frac12A\right)x^{(n)} $$ $$ \boxed{x^{(n+1)} =\left(I+\frac12 A\right)^{-1} \left(I-\frac12A\right)x^{(n)}} $$ (b) Recall that a sum of two positive-definite matrices is positive-definite. Since both $I$ and $\frac12 A$ are positive-definite, the matrix $I+\frac12 A$ is positive-definite. But a positive-definite matrix is non-singular (for example, since all eigenvalues $>0$ there are no vectors with eigenvalue $0$, that is, the nullsapce is $0$). Hence $I+\frac12 A$ is invertible.

More explicitly, the eigenvalues of $I+\frac12 A$ are equal to $1 + \frac12 \lambda$ for eigenvalues $\lambda$ of $A$. Since $A$ is positive-definite, then the eigenvalues of $I+\frac12 A$ are actually all $> 1$ (whereas a singular matrix has zero eigenvalues).

(c) If $Ax=17x$ for some vector $x$, then $\left(I\pm\frac12A\right)x=(1\pm \frac{17}2)x$. Hence $\left(I+\frac12A\right)^{-1}x=(1+ \frac{17}2)^{-1}x$ and consequently $$ \left(I+\frac12 A\right)^{-1} \left(I-\frac12A\right)x=\frac{1-\frac{17}{2}}{1+\frac{17}{2}}x=-\frac{15}{19}x. $$ So, the matrix from (a) has eigenvalue $\boxed{\frac{15}{19}}$.

(d) Let $B=\left(I+\frac12 A\right)^{-1} \left(I-\frac12A\right)$. Our computation in part (c) has showed that if $x$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $x$ is also an eigenvector of $B$ with eigenvalue $\frac{2-\lambda}{2+\lambda}$. Note that since $A$ is positive-definite we must have $\lambda>0$, and this implies that $\left |\frac{2-\lambda}{2+\lambda}\right |<1$.

Since $A$ is hermitian it is diagonalizable and we can find a basis $x_1, \dots, x_m$ of eigenvectors of $A$ with eigenvalues $\lambda_1, \dots, \lambda_m$. Consequently, $x_1, \dots, x_m$ are also eigenvectors of $B$, with all eigenvalues $\mu_1, \dots, \mu_m$ srictly between $-1$ and $1$ as noted above. Write $x^{(0)}$ in terms of this basis of eigenvectors: $$ x^{(0)}=c_1x_1+\dots+c_mx_m. $$ Then $$ x^{(n)}=B^nx^{(0)}=c_1\mu_1^nx_1+\dots+c_m\mu_m^nx_m. $$ Since $|\mu_i|<1$, $\mu_i^n$ decays as $n$ grows, so $x^{(n)}$ decays as $n$ becomes large.

For the sake of interest, we mention that the equation from this problem arises in Crank–Nicolson discretizations of differential equations, and the analysis in this problem is key to the stability of that technique.

18.06 Problem Set 11 Solutions¶

Problem 1 (5+5)¶

Solution¶

Problem 2 (5+5)¶

Solution¶

Problem 3 (5+5+5+5+5)¶

Solution¶

Problem 4 (5+5+5+5)¶

Solution¶