(a) If $P$ is the projection matrix onto the null space of $A$, then $Py − y$, for any $y$, is in the __________ space of $A$.
(b) If $Ax = b$ has a solution $x$, then the closest vector to $b$ in $N(A^T)$ is __________. (Try drawing a picture.)
(c) If the rows of $A$ (an $m \times n$ matrix) are independent, then the dimension of $N(A^T A)$ is __________.
(d) If a matrix $U$ has orthonormal rows, then $I = \_\_\_\_\_\_\_\_\_\_$ and the projection matrix onto the row space of $U$ is __________. (Your answers should be simplified expressions involving $U$ and $U^T$ only.)
(a) If $P$ is the projection matrix onto the null space of $A$, then $Py − y$, for any $y$, is in the row space of $A$.
$P(Py-y)= Py-Py=0$ (using $P^2 = P$), so the vector $Py - y$ is in the null space of $P$, which is the orthogonal complement of the null space of $A$: the row space. Equivalently, $Py − y = (I-P)(-y)$, and $I-P$ is the projection onto the orthogonal complement.
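(Not part of the required answer, but as a quick numerical sanity check in Julia, with an arbitrary random matrix: $Py - y$ is orthogonal to $N(A)$ and is unchanged by projecting onto the row space.)

using LinearAlgebra

A = randn(3, 5)              # an arbitrary 3×5 matrix
N = nullspace(A)             # columns: an orthonormal basis of N(A)
P = N*N'                     # projection matrix onto N(A)
y = randn(5)
Prow = pinv(A)*A             # projection onto the row space C(Aᵀ)
@show norm(N' * (P*y - y))                # ≈ 0: Py − y is ⟂ N(A)
@show norm(Prow*(P*y - y) - (P*y - y))    # ≈ 0: Py − y lies in C(Aᵀ)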
(b) If $Ax = b$ has a solution $x$, then the closest vector to $b$ in $N(A^T)$ is $\boxed{\vec{0}}$.
$N(A^T)$ is the orthogonal complement of $C(A)$. $b$ is in $C(A)$ so its projection onto the orthogonal complement is $\vec{0}$ (the only vector that is in both $S$ and $S^\perp$ for any subspace $S$).
(c) If the rows of $A$ (an $m \times n$ matrix) are independent, then the dimension of $N(A^T A)$ is $\boxed{n-m}$.
The nullspace of $A^TA$ is the same as the nullspace of $A$, which is $n-m$ dimensional since the rows of $A$ are independent (so the rank of $A$ is $m$).
(d) If a matrix $U$ has orthonormal rows, then $I = UU^T$ and the projection matrix onto the row space of $U$ is $U^TU$.
The entries of $UU^T$ are the dot products of the rows of $U$ with each other, and since the rows are orthonormal these are exactly the entries of the identity matrix. To see that $U^TU$ is the projection onto the row space: its column space is contained in $C(U^T)$, which is the row space of $U$, and it leaves that space unchanged since $U^TU(U^Tc) = U^T(UU^T)c = U^Tc$. It is a projection since $(U^TU)^2 = U^T(UU^T)U = U^TU$ (using $UU^T=I$), and it is an orthogonal projection since its nullspace is the null space of $U$, which is the orthogonal complement of the row space.
Equivalently, $U^T = Q$, a matrix with orthonormal columns. So, we can take the formulas $Q^T Q = I$ and $P = QQ^T$ from class and substitute in $Q = U^T$.
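(Optional check:) both formulas are easy to verify numerically in Julia for an arbitrary matrix with orthonormal rows, e.g. one obtained from an SVD:

using LinearAlgebra

U = svd(randn(3, 5)).Vt      # an arbitrary 3×5 matrix with orthonormal rows
@show norm(U*U' - I)         # ≈ 0:  UUᵀ = I
P = U'*U                     # the claimed projection onto the row space of U
@show norm(P*P - P)          # ≈ 0:  P² = P, so P is a projection
@show norm(P*U' - U')        # ≈ 0:  P leaves the row space C(Uᵀ) unchanged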
In class, we saw that the orthogonal projection $p$ of a vector $b$ onto $C(A)$ is given by $p = A\hat{x}$ where $\hat{x}$ is a solution to the "normal equations" $A^T A \hat{x} = A^T b$. We showed in class that these equations are always solvable, because $A^T b \in C(A^T) = C(A^T A)$.
(a) The least-square solution $\hat{x}$ is unique if $A$ is __________, in which case $A^T A$ is __________.
(b) The least-square solution $\hat{x}$ is not unique if $A$ is __________, in which case $A^T A$ is __________. However, the projection $p = A\hat{x}$ is still unique: if you have two solutions $\hat{x}$ and $\hat{x}'$ to the normal equations, $A\hat{x} - A\hat{x}' = \_\_\_\_\_\_\_\_$ because __________.
(a) The least-square solution $\hat{x}$ is unique if $A$ is full column rank, in which case $A^T A$ is invertible.
This is because the nullspace of $A^TA$ is the same as the nullspace of $A$, and recall from exam 1 that $A^T A \hat{x} = A^T b$ has a unique solution exactly when $N(A^T A) = \{ \vec{0} \}$.
(b) The least-square solution $\hat{x}$ is not unique if $A$ is not full column rank, in which case $A^T A$ is singular. However, the projection $p = A\hat{x}$ is still unique: if you have two solutions $\hat{x}$ and $\hat{x}'$ to the normal equations, then $A\hat{x} - A\hat{x}' = 0$ because $\hat{x}-\hat{x}'$ is in the nullspace of $A^TA$, which is the same as the nullspace of $A$.
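(Optional illustration:) with a made-up rank-deficient matrix in Julia, two different solutions of the normal equations give the same projection $A\hat{x}$:

using LinearAlgebra

A = [1 2; 2 4; 3 6]            # rank 1, so AᵀA is singular (not full column rank)
b = [1, 0, 1]
xhat  = pinv(A) * b            # one least-square solution (the minimum-norm one)
xhat2 = xhat + [2, -1]         # add a vector in N(A) = N(AᵀA): another solution
@show norm(A'*A*xhat - A'*b)   # ≈ 0: xhat solves the normal equations
@show norm(A'*A*xhat2 - A'*b)  # ≈ 0: so does xhat2
@show norm(A*xhat - A*xhat2)   # ≈ 0: but the projection p = A*xhat is the same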
In this problem you will fit the motion of a projectile to a parabola in order to extract its velocity and the Earth's gravitational acceleration g.
If you launch a mass into the air from some position $x(0), y(0)$ with initial velocity $v_x(0), v_y(0)$, and friction is negligible, then you probably remember from 8.01 or high-school physics that the trajectory is a parabola:
$$ x(t) = x(0) + v_x(0) t $$
$$ y(t) = \underbrace{y(0) + v_y(0) t - \frac{g}{2} t^2}_{\mbox{parabola }y(t)} = \underbrace{y(0) + \frac{v_y(0)}{v_x(0)} [x - x(0)] - \frac{g}{2v_x(0)^2} [x - x(0)]^2}_{\mbox{parabola }y(x)} $$
The textbook Physics2000 by E. R. Huggins contains this strobe photograph of a steel ball launched from the left, with a picture taken every 0.1 seconds. The spatial grid in the background is numbered in 10s of cm (so that the numbers "1,2,3,…,10" refer to 10 cm, 20 cm, 30 cm, …, 100 cm, and the fine grid lines are 1 cm).
(a) Extract the data from the image for the 6 locations of the ball, and define Julia arrays x = [...], y = [...], and t = [...] holding the coordinates $x,y,t$ (horizontal, vertical, time) of each ball location in meters and seconds, respectively. You can just zoom in and read the coordinates off the image by counting the fine grid lines, or use a fancier tool like WebPlotDigitizer. Plot the data $x(t)$ and $y(t)$.
(b) Perform a least-square fit to $x(0) + v_x(0) t$ to extract the velocity $v_x(0)$. That is, define a $6\times 2$ matrix X so that the least-square solution X \ x returns the 2-component vector $[x(0), v_x(0)]$ of the unknowns. Plot your linear fit along with the data $x(t)$.
(c) Perform a least-square fit to $y(0) + v_y(0) t - \frac{g}{2} t^2$ to extract the Earth's gravitational constant $g$ in m/s² (you should get roughly 9.8 m/s²). That is, define a $6\times 3$ matrix Y so that the least-square solution Y \ y returns the 3-component vector $[y(0), v_y(0), g]$ of the unknowns. Plot your parabolic fit along with the data $y(t)$.
(a) We extracted the data from the image. Plotting it versus $t$, you can see that it looks reasonable: $x(t)$ is a straight line, and $y(t)$ is a parabola.
x = [.09,.25,.43,.6,.79,.95]  # horizontal ball positions (meters), read off the photo
y = [.79,.90,.90,.80,.60,.30] # vertical ball positions (meters)
t = [0.1,0.2,0.3,0.4,0.5,0.6] # times of the strobe flashes (seconds), 0.1s apart
using PyPlot
plot(t, x, "bo-")
plot(t, y, "r*-")
xlabel("time (seconds)")
ylabel("(meters)")
legend([L"x(t)", L"y(t)"])
title("Problem 3a: Extracted x, y data")
(b) We want the matrix
$$
X = \begin{pmatrix} 1 & t_1 \\ 1 & t_2 \\ \vdots & \vdots \\ 1 & t_m \end{pmatrix}
$$
which is constructed in Julia by X = [t.^0 t.^1]. This is nearly identical to the linear-fitting (linear regression) example from class.
This way,
$$
\left\Vert X \begin{pmatrix} x(0) \\ v_x(0) \end{pmatrix} - \vec{x} \right\Vert^2
$$
is exactly the sum of the squares of the differences between our linear model $x(t) = x(0) + v_x(0) t$ and our data points $x_k$, and X \ x gives the least-square $x(0)$ and $v_x(0)$ coefficients.
Plotting it, we can see visually that it is a good fit: the line almost goes through the data points.
X = [t.^0 t.^1]
xfit = X \ x # least square solution
@show x₀ = xfit[1]
@show vx₀ = xfit[2]
plot(t, x, "bo") # data
tf = range(t[begin], t[end], length=1000) # finely spaced times for plotting curves
plot(tf, x₀ .+ vx₀ .* tf, "k-") # fit ... note that we use .+, .^, .* for elementwise operations
legend(["data", "fit"])
xlabel("time (seconds)")
ylabel("(meters)")
title("Problem 3b: Linear fit of x(t)")
x₀ = xfit[1] = -0.09066666666666674
vx₀ = xfit[2] = 1.74
(c) Similar to above, except that we want the matrix
$$
Y = \begin{pmatrix} 1 & t_1 & -t_1^2/2 \\ 1 & t_2 & -t_2^2/2 \\ \vdots & \vdots & \vdots \\ 1 & t_m & -t_m^2/2 \end{pmatrix}
$$
in order to express our quadratic model as a matrix equation, which is constructed in Julia by Y = [t.^0 t.^1 -t.^2/2]. This is nearly identical to the polynomial regression example from class.
Again, visually it looks like an excellent fit, in that it nearly goes through all of the data points.
Our fit yields $g \approx 10.18\ \mathrm{m/s^2}$, which differs from the true value ($9.807\ \mathrm{m/s^2}$) by about 4%. (You will get slightly different numbers, but should obtain a similar accuracy of a few percent.)
We can conclude that this photo was taken on another planet slightly differing from the Earth, a momentous discovery! Just kidding, a 4% error is well within the tolerances of this kind of data (e.g. even the time spacing of $0.1s$ is probably not that accurate), though we'd have to perform a statistical sensitivity analysis (outside the scope of 18.06) to gain a more precise understanding of the uncertainties.
Y = [t.^0 t.^1 -t.^2/2]
yfit = Y \ y # least square solution
@show y₀ = yfit[1]
@show vy₀ = yfit[2]
@show g = yfit[3]
plot(t, y, "r*") # data
tf = range(t[begin], t[end], length=1000) # finely spaced times for plotting curves
plot(tf, y₀ .+ vy₀ .* tf .+ (-g/2) .* tf.^2, "k-") # fit ... note that we use .+, .^, .* for elementwise operations
legend(["data", "fit"])
xlabel("time (seconds)")
ylabel("(meters)")
title("Problem 3c: Parabolic fit of y(t)")
y₀ = yfit[1] = 0.5850000000000004
vy₀ = yfit[2] = 2.5767857142857142
g = yfit[3] = 10.17857142857143
Suppose we have data (e.g. from some experimental measurement) $b_1,b_2,b_3,\ldots,b_{21}$ at the 21 equally spaced times $t = −10,−9,\ldots,9,10$. All measurements are $b_k = 0$ except that $b_{11} = 1$ at the middle time $t = 0$.
(a) Using least squares, what are the best $c$ and $d$ to fit those 21 points by a straight line $c+dt$?
(b) You are projecting the vector $b$ onto what subspace? (Give a basis.)
(c) Find a nonzero vector perpendicular to that subspace.
(a) Using least squares, what are the best $c$ and $d$ to fit those 21 points by a straight line $c+dt$?
We are trying to find the least squares solution to $A \begin{pmatrix}c\\d\end{pmatrix} = b$ where $b$ is the vector consisting of the $b_i$ and $A$ is the matrix whose first column is all $1$s and whose second column is $-10,-9,\dots,10$.
$$A^TA = \begin{pmatrix} 1 + 1 + \cdots + 1 & -10 + -9 + \cdots + 9 + 10\\-10 + -9 + \cdots + 9 + 10& (10^2 + 9^2 + \cdots + 1^2) \times 2\end{pmatrix} = \begin{pmatrix} 21 & 0\\0&770\end{pmatrix} ,$$where the arithmetic is pretty easy to do by hand (this problem was actually on Fall 2006 exam 2!) and $$A^Tb = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ -10 & -9 & \cdots & 10\end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix}1 \\0\end{pmatrix} . $$
Thus we are trying to solve $$\begin{pmatrix} 21 & 0\\0&770\end{pmatrix} \underbrace{\begin{pmatrix} c \\d\end{pmatrix}}_{\hat{x}} = \begin{pmatrix}1 \\ 0\end{pmatrix} .$$No elimination is required since this is diagonal, and we immediately obtain that $\boxed{c = \frac{1}{21}, d = 0}$.
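(Optional check:) the same answer comes out of Julia's least-square solve with this data:

t = -10:10                  # the 21 equally spaced times
A = [ones(21) t]            # columns: all 1's and the times t
b = zeros(21); b[11] = 1    # all zeros except b₁₁ = 1 at t = 0
@show A \ b                 # least-square [c, d] ≈ [1/21, 0]
@show 1/21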
(b) You are projecting the vector $b$ onto the column space $C(A)$, and a basis is given by the two columns of $A$ (which are linearly independent): the vector of all $1$s and the vector $(-10,-9,\dots,10)$.
(c) A nonzero vector perpendicular to that subspace is any nonzero vector in $N(A^T) = C(A)^\perp$. There are many such vectors, since it is a $21-2 = 19$ dimensional subspace!
We could do an elimination step on $A^T$ (only one row operation) to get it in upper-triangular form, and find one of the special solutions, for example.
Another vector $\perp C(A)$ is the error vector $$b - p = b - A\hat{x} = \frac{1}{21} \begin{pmatrix} (\mbox{ten } -1\mbox{'s}) & 20 & (\mbox{ten } -1\mbox{'s}) \end{pmatrix}^T . $$
But it's also straightforward to find a vector perpendicular to both rows by inspection. For example, the vector whose first and last entries are 1 and middle entry is $-2$, and all other entries are zero, is perpendicular to that subspace.
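(Optional check, re-using the A and b arrays from the sketch in part (a):) both the error vector and the vector found by inspection are annihilated by $A^T$:

using LinearAlgebra

e = b - A*(A \ b)                                 # error vector b − p
v = zeros(21); v[1] = 1; v[11] = -2; v[21] = 1    # the vector found by inspection
@show norm(A'*e)     # ≈ 0: e is ⟂ to both columns of A
@show A'*v           # = [0, 0]: so is v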
Suppose $\hat{x}$ is the least squares solution to $Ax \approx b$ (i.e. it minimizes $\Vert Ax-b\Vert$) and $\hat{y}$ is the least squares solution to $Ay \approx c$, where $A$ has full column rank. Does this tell you the least squares solution $\hat{z}$ to $Az \approx b + c$? If so, what is $\hat{z}$ and why?
$\hat{x}$ is the (unique) solution of the equation $A^TA\hat{x} = A^Tb$, and $\hat{y}$ is the solution of the equation $A^TA\hat{y} = A^Tc$.
Because this equation is linear, the solution for the right-hand side $b+c$ is $\boxed{\hat{z} = \hat{x}+\hat{y}}$, since $$ A^T A \hat{z} = A^T A(\hat{x}+\hat{y}) = A^T A\hat{x} + A^T A\hat{y} = A^Tb + A^Tc = A^T(b+c), $$ i.e. $\hat{z}$ solves the normal equations for $b+c$.
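(Optional illustration:) a quick numerical check in Julia with arbitrary random data:

using LinearAlgebra

A = randn(6, 3)                        # full column rank (with probability 1)
b, c = randn(6), randn(6)
@show norm(A\(b + c) - (A\b + A\c))    # ≈ 0: least-square solutions add linearly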
(a) What matrix $P$ projects every vector in $\mathbb{R}^3$ onto the line that passes through origin and $a = [3, 4, 5]$ (column vector)?
(b) What is the nullspace of that matrix $P$? (Give a basis.)
(c) What is the row space of $P^7$?
(a) What matrix $P$ projects every vector in $\mathbb{R}^3$ onto the line that passes through origin and $a = [3, 4, 5]$ (column vector)?
The orthogonal projection onto $a$ sends a vector $u$ to $\frac{a a^T u}{a^Ta}$. This gives the rank-1 matrix $$P= \frac{aa^T}{a^T a} = \frac{1}{3^2 + 4^2 + 5^2} \begin{pmatrix} 3a & 4a & 5a \end{pmatrix} = \boxed{\frac{1}{50}\begin{pmatrix}9 & 12 & 15\\ 12 &16 & 20\\ 15 & 20 & 25\end{pmatrix}}.$$
(b) What is the nullspace of that matrix $P$? (Give a basis.)
The null space consists of vectors orthogonal to $a$, since each row is a multiple of $a$. That is, we are finding the nullspace of the $1 \times 3$ matrix $a^T = \begin{pmatrix} 3 & 4 & 5 \end{pmatrix}$.
We could find a basis in a variety of ways, e.g. by inspection (3 dimensions are easy), or by finding the special solutions as for exam 1. For example, the special solutions give the basis: $$ \boxed{ \begin{pmatrix} -4/3 \\ 1 \\ 0 \end{pmatrix}, \; \begin{pmatrix} -5/3 \\ 0 \\ 1 \end{pmatrix} } . $$ In any case, we need two vectors that are orthogonal to $a$, since $N(a^T) = N(P)$ is two-dimensional (the rank is 1).
(c) What is the row space of $P^7$?
$P^2=P$ since $P$ is a projection, so $P^7 = P$ as well. Every row of $P$ is proportional to $a^T$, so the row space of $P^7 = P$ (which equals its column space, since $P = P^T$) is simply the line spanned by $a$.
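(Optional check:) all three answers are easy to verify numerically in Julia:

using LinearAlgebra

a = [3, 4, 5]
P = a*a' / (a'*a)               # projection onto the line through a
@show 50 * P                    # the matrix from part (a), scaled by 50
@show norm(P^2 - P)             # ≈ 0: P is a projection, hence P⁷ = P
@show norm(P * [-4/3, 1, 0])    # ≈ 0: first basis vector of N(P)
@show norm(P * [-5/3, 0, 1])    # ≈ 0: second basis vector of N(P)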