Matrix/Linear Algebra Refresher

Terms defined:
------------------------------
matrix, matrix size, matrix element, vector, column vector, row vector,
square matrix, diagonal, main diagonal, off diagonal, upper triangle,
lower triangle, diagonal matrix, identity matrix, Toeplitz matrix,
triangular matrix, trace, ones matrix, zeros matrix, transpose,
symmetric matrix, matrix addition/subtraction, matrix multiplication,
inner product, dot product, outer product, scalar, matrix inverse,
determinant, singular, ill-conditioned, simultaneous equations,
overdetermined, row/column space, null space, singular value
decomposition (SVD/PCA)
------------------------------

A matrix is a table of numbers. The numbers can be anything (positive,
negative, zero, integer, fractional, pi, real, imaginary, complex, etc).
A matrix is characterized by its size, ie, the number of rows and
columns, written "n-by-m", where the first number is the number of rows
and the second is the number of columns. Eg, "3-by-4" or "3x4" means a
matrix with 3 rows and 4 columns. Brackets are usually drawn around the
matrix:

  M = [a11 a12 a13]
      [a21 a22 a23]

Each number in the matrix is called an element. The first subscript is
the row, the second is the column.

A vector is a special matrix in which the number of columns or rows
is 1. A column vector is a vector in which the number of columns is 1.
A row vector is a vector in which the number of rows is 1. A matrix can
be thought of as a collection of row vectors or of column vectors.

A matrix is square if nrows = ncols.

Diagonal: the elements where row = col + constant.
  Main diagonal: constant = 0. Off diagonal: constant != 0.
Triangular matrix: all elements either above or below the main
  diagonal are zero. Upper triangular: all elements below the main
  diagonal are zero. Lower triangular: all elements above the main
  diagonal are zero.
Diagonal matrix: all elements are 0 except the main diagonal.
Identity matrix: diagonal matrix with all 1s on the main diagonal and
  zeros everywhere else.
Toeplitz matrix: elements within a diagonal have the same value
  throughout the matrix.
Trace: sum of the main-diagonal elements of a square matrix.
Ones matrix: all elements are 1. Can be any size.
Zeros matrix: all elements are 0. Can be any size.

Transpose: reflect around the main diagonal, ie, swap rows and columns.
Written with "T", "t", or "'". Changes an n-by-m matrix to m-by-n.
Symmetric matrix: M' = M (symmetric about the main diagonal).

Add/Subtract: just add/subtract the matching elements. For this reason,
the sizes must be the same.

Multiplication: the number of columns in the first matrix must match
the number of rows in the second.

Multiplication, inner product (dot product). For vectors, multiply
matching elements and sum:

  [1 2] * [3 4]'      = 1*3 + 2*4        = [11] (a 1x1 matrix)
  [1 2 3] * [7 9 11]' = 1*7 + 2*9 + 3*11 = [58] (a 1x1 matrix)

For matrices, reduce to a series of vector multiplications:

  [1 2 3]   [ 7  8]
  [4 5 6] * [ 9 10] = [ 58  64]  (a 2x2 matrix)
            [11 12]   [139 154]

The final matrix size will equal the number of rows of the first matrix
by the number of columns of the second matrix.

Multiplication, outer product:

  [1] * [3 4] = [1*3 1*4] = [3 4]  (a 2x2 matrix)
  [2]           [2*3 2*4]   [6 8]

Scalar: a single number without a size designation. Different than a
1x1 matrix. No brackets. Multiplication of a matrix by a scalar:
multiply all elements of the matrix by the scalar value.

  .5 * [1 2] = [0.5 1.0]
       [3 4]   [1.5 2.0]

You could not do this if 0.5 were a 1x1 matrix ([0.5]) because the
dimensions would not match.

Put it all together (verified in the sketch below):

  alpha = .5;          % scalar
  x = [1 -1]';
  y = [3 4]';
  M = [2 5; 3 6];

  x*y' + alpha*M = [ 4.0  6.5]
                   [-1.5 -1.0]
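A minimal sketch of the operations above, runnable in Matlab or Octave
(the values are the ones from the running example):

  % Basic matrix operations (Matlab/Octave)
  alpha = 0.5;        % a scalar
  x = [1 -1]';        % 2x1 column vector
  y = [3  4]';        % 2x1 column vector
  M = [2 5; 3 6];     % 2x2 matrix

  x' * y              % inner product: 1*3 + (-1)*4 = -1 (a 1x1)
  x * y'              % outer product: a 2x2 matrix
  alpha * M           % scalar times matrix: scales every element
  x*y' + alpha*M      % = [4.0 6.5; -1.5 -1.0], as computed by hand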
Inverse: M*inv(M) = I. M must be square (and there are other conditions
on M too). Hard to do (not something you can eyeball).

  M = [1 2]    inv(M) = [-2.0  1.0]
      [3 4]             [ 1.5 -0.5]

Determinant of a 2x2:

  M = [a11 a12]
      [a21 a22]

  delta = a11*a22 - a21*a12   (for the M above: 1*4 - 3*2 = -2)

Inverse of a 1x1: inv([a11]) = [1/a11]
Inverse of a 2x2:

  inv(M) = (1/delta) * [ a22 -a12]
                       [-a21  a11]

But what if:

  M = [1.0 2.0]
      [0.5 1.0]

  delta = 1*1 - 0.5*2 = 0

Cannot compute 1/delta, so cannot compute the inverse! The matrix is
"singular".

Note: you'd never actually do these computations by hand. You'd use
Matlab (or some other software package).

A dumb way to compute the sum of a list of numbers (see the sketch at
the end of this section). List: 1, 2, 3.

  x = [1 2 3]';
  a = [1 1 1]';
  sum = x'*a = 1*1 + 2*1 + 3*1 = 6

A dumb way to compute the length of the list:

  N = a'*a = 1*1 + 1*1 + 1*1 = 3

A dumb way to compute the average:

  mu = x'*a * inv(a'*a) = sum/N

Sum of the squares of a list. List: 1, 2, 3.

  x = [1 2 3]';
  SS = x'*x;

A dumb way to compute the variance:

  e = x - mu*a;
  var = e'*e/(a'*a - 1) = SSE/(N-1)

Selective average. List: 1, 2, 3, 4, 5, but we only want the average of
1, 3, and 5.

  x = [1 2 3 4 5]';
  a = [1 0 1 0 1]';
  sum = x'*a = 1*1 + 2*0 + 3*1 + 4*0 + 5*1 = 9
  N = a'*a = 3
  average = x'*a * inv(a'*a) = 3

Correlation of two lists of numbers:

  x = 1, 2, 3
  y = 7, 9, 11
  sum(x_i * y_i) = 58

  x = [1 2 3]';
  y = [7 9 11]';
  x'*y

Correlation coefficient:

  rho = sum(x_i * y_i)/sqrt(sum(x_i^2) * sum(y_i^2))
      = x'*y/sqrt((x'*x)*(y'*y));

Algebra: y = x*b. Solve for b in terms of y and x:

  b = y/x = inv(x)*y

Simultaneous equations:

  y1 = x11*b1 + x12*b2
  y2 = x21*b1 + x22*b2

Solve for b1 and b2 in terms of y1, y2, and the xij: two equations, two
unknowns. Use one equation to solve for b1 in terms of the xij, y, and
b2, then substitute into the other equation to solve for b2.

Linear algebra is a way to represent simultaneous equations:

  y = [y1 y2]'
  b = [b1 b2]'
  X = [x11 x12]
      [x21 x22]

  y = X*b. Solve for b in terms of y and X: b = inv(X)*y

Fitting a line (slope and offset): y = b + m*x. If we have two measures
of y (y1 and y2) that correspond to two values of x (x1 and x2), then

  y1 = b + m*x1
  y2 = b + m*x2

Two equations, two unknowns (m and b):

  y = [y1 y2]'
  X = [1 x1]
      [1 x2]
  beta = [b m]'

  y = X*beta
  beta = inv(X)*y

Will this always work? Hint: determinant. Will this always work well?

Overdetermined: more equations than unknowns.

  y1 = b + m*x1
  y2 = b + m*x2
  y3 = b + m*x3

Three equations, two unknowns (m and b):

  y = [y1 y2 y3]'
  X = [1 x1]
      [1 x2]
      [1 x3]
  beta = [b m]'

  y = X*beta

Cannot compute inv(X) because X is not square. Instead use

  betahat = inv(X'*X)*X'*y = (X+)*y   (pseudo inverse)

Will this always work? Hint: determinant. Will this always work well?
This is the "best fit". Ill-conditioned vs singular: if the determinant
of X'*X is exactly zero the matrix is singular and no inverse exists;
if it is merely close to zero the matrix is ill-conditioned, and the
inverse, though computable, amplifies noise. (A line-fit sketch appears
at the end of this section.)

Fitting a plane (offset and slopes): z = b + m1*x + m2*y. Three
equations, three unknowns:

  z1 = b + m1*x1 + m2*y1
  z2 = b + m1*x2 + m2*y2
  z3 = b + m1*x3 + m2*y3

  z = [z1 z2 z3]'
  X = [1 x1 y1]
      [1 x2 y2]
      [1 x3 y3]
  beta = [b m1 m2]'

  z = X*beta
  beta = inv(X)*z

Geometric interpretation of vectors -- spaces:

Row space of a matrix: all the vectors that can be represented by a
weighted sum of the row vectors of the matrix.
Column space of a matrix: all the vectors that can be represented by a
weighted sum of the column vectors of the matrix.
Row null space of a matrix: the vectors orthogonal to every row of the
matrix (ie, the v with M*v = 0); these CANNOT be represented by a
weighted sum of the row vectors.
Column null space of a matrix: the vectors orthogonal to every column
of the matrix (ie, the v with M'*v = 0); these CANNOT be represented by
a weighted sum of the column vectors.

Singular Value Decomposition (SVD/PCA):

  M = U*S*V'

S is a diagonal matrix of the singular values.
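To check the "dumb" sum/average bookkeeping above, here is a sketch you
can paste into Matlab or Octave. The names s, v, and avg are mine (the
text's "sum" and "var" would shadow Matlab built-ins of the same name):

  % Sums, averages, and variance done with vector products
  x  = [1 2 3]';
  a  = [1 1 1]';
  s  = x'*a;               % sum of the list = 6
  N  = a'*a;               % length of the list = 3
  mu = x'*a * inv(a'*a);   % average = s/N = 2
  SS = x'*x;               % sum of squares = 14
  e  = x - mu*a;           % residuals around the mean
  v  = e'*e/(a'*a - 1);    % variance = SSE/(N-1) = 1

  % Selective average: only average the 1st, 3rd, and 5th elements
  x2  = [1 2 3 4 5]';
  a2  = [1 0 1 0 1]';
  avg = x2'*a2 * inv(a2'*a2)   % = 9/3 = 3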
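For the overdetermined line fit, a sketch with made-up data (these x
and y values are illustrative only, not from the text). Matlab's
backslash operator and pinv solve the same least-squares problem as the
inv(X'*X)*X' formula:

  % Overdetermined line fit, y = b + m*x
  x = [1 2 3]';              % abscissa values (made up)
  y = [2.1 3.9 6.2]';        % noisy measurements (made up)
  X = [ones(3,1) x];         % 3x2 design matrix [1 x1; 1 x2; 1 x3]
  betahat  = inv(X'*X)*X'*y; % pseudo-inverse formula, gives [b m]'
  betahat2 = pinv(X)*y;      % same answer via the built-in pinv
  betahat3 = X\y;            % same answer; the usual Matlab idiom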
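Finally, the SVD itself: svd is built in, and singularity shows up as a
zero singular value:

  % Singular value decomposition
  M = [2 5; 3 6];
  [U, S, V] = svd(M);      % M = U*S*V'; diag(S) holds the singular values
  U*S*V'                   % reconstructs M (up to round-off)
  svd([1.0 2.0; 0.5 1.0])  % the singular matrix above: one value is 0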