{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Markov matrices\n", "\n", "A matrix $A$ is a **Markov matrix** if\n", "\n", "* Its entries are all $\\ge 0$\n", "* Each **column**'s entries **sum to 1**\n", "\n", "Typicaly, a Markov matrix's entries represent **transition probabilities** from one state to another.\n", "\n", "For example, consider the $2 \\times 2$ Markov matrix:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Array{Float64,2}:\n", " 0.9 0.2\n", " 0.1 0.8" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [0.9 0.2\n", " 0.1 0.8]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us suppose that this represents the fraction of people switching majors each year between math and English literature.\n", "\n", "Let\n", "$$\n", "x = \\begin{pmatrix} m \\\\ e \\end{pmatrix}\n", "$$\n", "\n", "represent the number of math majors $m$ and English majors $e$. Suppose that each year, 10% of math majors and 20% of English majors switch majors. After one year, the new number of math and English majors is:\n", "\n", "$$\n", "m' = 0.9 m + 0.2 e\n", "e' = 0.1 m + 0.8 e\n", "$$\n", "\n", "But this is equivalent to a matrix multiplication! i.e. the numbers $x'$ of majors after one year is\n", "\n", "$$\n", "x' = A x \\,\n", "$$\n", "\n", "Note that the two Markov properties are critical: we never have negative numbers of majors (or negative probabilities), and the probabilities must sum to 1 (the net number of majors is not changing: we're not including new students or people that graduate in this silly model)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Eigenvalues of Markov matrices\n", "\n", "There are two key questions about Markov matrices that can be answered by analysis of their eigenvalues:\n", "\n", "* Is there a **steady state**?\n", " - i.e. is there an $x_0 \\ne 0$ such that $A x_0 = x_0$?\n", " - i.e. is there $\\lambda_0 = 1$ eigenvector $x_0$?\n", "\n", "* Does the system **tend toward a steady state?**\n", " - i.e. does $A^n x \\to \\mbox{multiple of } x_0$ as $n \\to \\infty$?\n", " - i.e. is $\\lambda = 1$ the **largest** $|\\lambda|$?\n", " \n", "The answers are **YES** for **any Markov** matrix $A$, and **YES** for any *positive* Markov matrix (Markov matrices with entries $> 0$, not just $\\ge 0$). For *any* Markov matrix, all of the λ satisfy $|\\lambda| \\le 1$, but if there are zero entries in the matrix we *may* have multiple $|\\lambda|=1$ eigenvalues (though this doesn't happen often in practical Markov problems)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 1.0\n", " 0.7" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try just multipling it many times by a \"random\" vector and see whether it is converging to a steady state:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 14.0\n", " 7.0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A^100 * [17, 4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yes, it seems to be giving a vector that is not changing, which shoud be a multiple $c x_0$ of a steady-state eigenvector:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 14.0\n", " 7.0" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cx₀ = A^1000 * [17, 4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check that this is an eigenvector of $A$ with eigenvalue $\\lambda=1$:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 14.0\n", " 7.0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A * cx₀" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see why, the key idea is to write the columns-sum-to-one property of Markov matrices in linear-algebra terms. It is equivalent to the statement:\n", "\n", "$$\n", "\\underbrace{\\begin{pmatrix} 1 & 1 & \\cdots & 1 & 1 \\end{pmatrix}}_{o^T} A = o^T\n", "$$\n", "\n", "since this is just the operation that sums all of the rows of $A$. Equivalently, if we transpose both sides:\n", "\n", "$$\n", "A^T o = o\n", "$$\n", "\n", "i.e. $o$ is an eigenvector of $A^T$ (called a **left eigenvector of A**) with eigenvalue $\\lambda = 1$.\n", "\n", "But since $A$ and $A^T$ have the **same eigenvalues** (they have the same characteristic polynomial $\\det (A - \\lambda I) = \\det (A^T - \\lambda I)$ because transposed don't change determinants), this means that $A$ **also has an eigenvalue 1** but with a **different eigenvector**." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1×2 RowVector{Float64,Array{Float64,1}}:\n", " 1.0 1.0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "o = [1,1]\n", "o' * A" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 1.0\n", " 1.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A' * o" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The eigenvector of $A$ with eigenvalue $1$ must be a basis for $N(A - I)$:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Array{Float64,2}:\n", " -0.1 0.2\n", " 0.1 -0.2" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A - 1*I" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By inspection, $A - I$ is singular here: the second column is -2 times the first. So, $x_0 = (2,1)$ is a basis for its nullspace, and is the steady state:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 5.55112e-17\n", " 5.55112e-17" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(A - I) * [2,1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check if some arbitrary starting vector $(3,0)$ tends towards the steady state:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " \n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [], "text/plain": [ "Interact.Slider{Int64}(1: \"input\" = 0 Int64 , \"\", 0, 0:100, \"horizontal\", true, \"d\", true)" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " 3.0\n", " 0.0" ] }, "execution_count": 10, "metadata": { "comm_id": "189e3583-3776-47e9-8201-6cf642099b9f", "reactive": true }, "output_type": "execute_result" } ], "source": [ "using Interact\n", "@manipulate for n in slider(0:100,value=0)\n", " A^n * [3,0]\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yes! In fact, it tends to exactly $(2,1)$, because the other eigenvalue is $< 1$ (and hence that eigenvector component decays exponentially fast).\n", "\n", "An interesting property is that the **sum of the vector components is conserved** when we multiply by a Markov matrix. Given a vector $x$, $o^T x$ is the sum of its components. But $o^T A = o^T$, so:\n", "\n", "$$\n", "o^T A x = o^T x = o^T A^n x$\n", "$$\n", "\n", "for any $n$! This is why $(3,0)$ must tend to $(2,1)$, and not to any other multiple of $(2,1)$, because both of them sum to 3. (The \"number of majors\" is conserved in this problem.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why no eigenvalues > 1?\n", "\n", "Why are all $|\\lambda| \\le 1$ for a Markov matrix?\n", "\n", "The key fact is that the **product AB of two Markov matrices A and B is also Markov**. Reasons:\n", "\n", "* If $A$ and $B$ have nonnegative entries, $AB$ does as well: matrix multiplication uses only $\\times$ and $+$, and can't introduce a minus sign.\n", "\n", "* If $o^T A = o^T$ and $o^T B = o^T$ (both have columns summing to 1), then $o^T AB = o^T B = o^T$: the columns of $AB$ sum to 1.\n", "\n", "For example, $A^n$ is a Markov matrix for any $n$ if $A$ is Markov.\n", "\n", "Now, if there were an eigenvalue $|\\lambda| > 1$, the matrix $A^n$ would have to *blow up exponentially* as $n\\to \\infty$ (since the matrix times that eigenvector, or any vector with a nonzero component of that eigenvector, would blow up). But since $A^n$ is Markov, all of its entries must be between 0 and 1. It can't blow up! So we must have all $|\\lambda| \\le 1$." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Array{Float64,2}:\n", " 0.666667 0.666667\n", " 0.333333 0.333333" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A^100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(In fact, $A^n$ is pretty boring for large $n$: it just takes in any vector and redistributes it to the steady state.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Can there be more than one steady state?\n", "\n", "We have just showed that we have *at least one* eigenvalue $\\lambda = 1$, and that *all* eigenvalues satisfy $|\\lambda| \\le 1$. But can there be *more than one* independent eigenvector with $\\lambda = 1$?\n", "\n", "**Yes!** For example, the **identity matrix** $I$ is a Markov matrix, and *all* of its eigenvectors have eigenvalue $1$. Since $Ix = x$ for *any* $x$, *every vector is a steady state* for $I$!\n", "\n", "But this does not usually happen for *interesting* Markov matrices coming from real problems. In fact, there is a theorem:\n", "\n", "* If all the entries of a Markov matrix are $> 0$ (not just $\\ge 0$), then *exactly one* of its eigenvalues $\\lambda = 1$ (that eigenvalue has \"multiplicity 1\": $N(A-I)$ is one-dimensional), and **all other eigenvalues have** $|\\lambda| < 1$. There is a *unique steady state* (up to an overall scale factor).\n", "\n", "I'm not going to prove this in 18.06, however." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Can the solutions oscillate?\n", "\n", "If you have a Markov matrix with zero entries, then there might be more than one eigenvalue with $|\\lambda| = 1$, but these additional solutions might be *oscillating* solutions rather than steady states.\n", "\n", "For example, consider the permutation matrix\n", "$$\n", "P = \\begin{pmatrix} 0 & 1 \\\\ 1 & 0 \\end{pmatrix}\n", "$$\n", "that simply swaps the first and second entries of any 2-component vector.\n", "\n", "If $x = (1,0)$, then $P^n x$ will oscillate forever, never reaching a steady state! It simply oscillates between $(1,0)$ (for even $n$) and $(0,1)$ (for odd $n$):" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Array{Int64,2}:\n", " 0 1\n", " 1 0" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "P = [0 1\n", " 1 0]" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6-element Array{Array{Int64,1},1}:\n", " [1, 0]\n", " [0, 1]\n", " [1, 0]\n", " [0, 1]\n", " [1, 0]\n", " [0, 1]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[P^n * [1,0] for n = 0:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But this is a Markov matrix, so all $|\\lambda|$ are $\\le 1$:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Float64,1}:\n", " -1.0\n", " 1.0" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(P)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The problem is that the $\\lambda = -1$ eigenvalue corresponds to an oscillating solution:\n", "\n", "$$\n", "P^n \\begin{pmatrix} 1 \\\\ -1 \\end{pmatrix} = (-1)^n \\begin{pmatrix} 1 \\\\ -1 \\end{pmatrix}\n", "$$\n", "\n", "for the eigenvector $(1,-1)$.\n", "\n", "The steady state still exists, corresponding to the eigenvector $(1,1)$:\n", "\n", "$$\n", "P^n \\begin{pmatrix} 1 \\\\ 1 \\end{pmatrix} = \\begin{pmatrix} 1 \\\\ 1 \\end{pmatrix}\n", "$$" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Array{Float64,2}:\n", " -0.707107 0.707107\n", " 0.707107 0.707107" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eig(P)[2] # the eigenvectors" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since $(1,0) = [(1,1) + (1,-1)]/2$, we have:\n", "\n", "$$\n", "P^n \\begin{pmatrix} 1 \\\\ 0 \\end{pmatrix} = \\frac{1}{2} \\left[ \\begin{pmatrix} 1 \\\\ 1 \\end{pmatrix} + \n", "(-1)^n \\begin{pmatrix} 1 \\\\ -1 \\end{pmatrix} \\right]\n", "$$\n", "\n", "which alternates between $(1,0)$ and $(0,1)$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Another example\n", "\n", "Let's generate a random 5x5 Markov matrix:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Array{Float64,2}:\n", " 0.738225 0.72637 0.554436 0.472905 0.170035\n", " 0.575379 0.641102 0.421929 0.142563 0.897826\n", " 0.992513 0.0505591 0.682035 0.959969 0.297188\n", " 0.692958 0.408806 0.101324 0.526985 0.310136\n", " 0.121889 0.395139 0.527851 0.710186 0.318843" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M = rand(5,5) # random entries in [0,1]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1×5 Array{Float64,2}:\n", " 3.12096 2.22198 2.28758 2.81261 1.99403" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum(M,1) # not Markov yet" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Array{Float64,2}:\n", " 0.236537 0.326903 0.242368 0.168138 0.0852724\n", " 0.184359 0.288528 0.184444 0.050687 0.450257 \n", " 0.318015 0.0227541 0.298147 0.341309 0.149039 \n", " 0.222033 0.183983 0.0442933 0.187365 0.155532 \n", " 0.0390548 0.177832 0.230747 0.252501 0.159899 " ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M = M ./ sum(M,1)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1×5 Array{Float64,2}:\n", " 1.0 1.0 1.0 1.0 1.0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum(M,1)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Array{Complex{Float64},1}:\n", " 1.0+0.0im \n", " -0.217733+0.0im \n", " 0.116733+0.211049im\n", " 0.116733-0.211049im\n", " 0.154743+0.0im " ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(M)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Array{Float64,1}:\n", " 1.0 \n", " 0.217733\n", " 0.241182\n", " 0.241182\n", " 0.154743" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "abs.(eigvals(M))" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Array{Float64,1}:\n", " 0.338351\n", " 0.142161\n", " 0.139727\n", " 0.16293 \n", " 0.21683 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = rand(5)\n", "x = x / sum(x)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Array{Float64,1}:\n", " 0.222706\n", " 0.231999\n", " 0.220523\n", " 0.157424\n", " 0.167348" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M^100 * x" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Array{Complex{Float64},1}:\n", " 0.222706+0.0im\n", " 0.231999+0.0im\n", " 0.220523+0.0im\n", " 0.157424+0.0im\n", " 0.167348+0.0im" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "λ, X = eig(M)\n", "X[:,1] / sum(X[:,1])" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.6.0", "language": "julia", "name": "julia-0.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.6.0" }, "widgets": { "state": { "e53e5f7b-c65e-4676-a564-3f8ee40c11c0": { "views": [ { "cell_index": 13 } ] } }, "version": "1.2.0" } }, "nbformat": 4, "nbformat_minor": 1 }