Learned Q(s,a) values
+--------+--------+--------+
|        |        |        |
|        |        |        |
|      -->80    -->100     |
|      64<--      |        |
|        |        |   G    |
|        |        |        |
|  64    |  80    |  100   |
|  ^  |  |  ^  |  |  ^     |
+--|--|--+--|--|--+--|-----+
|  |  v  |  |  v  |  |     |
|    51.2|     64 |        |
|        |        |        |
|      -->64    -->80      |
|    51.2<--    64<--      |
|        |        |        |
+--------+--------+--------+
Initial state s1    State s2
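
The twelve values in the diagram are consistent with a deterministic grid world in which any move into G earns a reward of 100, every other move earns 0, and the discount factor is 0.8. None of these parameters is stated in the source; they are inferred from the numbers (100, 80, 64, 51.2). Under those assumptions, a minimal sketch that recomputes the table by repeatedly applying Q(s,a) <- r + gamma * max Q(s',a') might look like this. The names GAMMA, step, and solve_q are illustrative only.

```python
# Sketch: deterministic 2x3 grid world; G is the absorbing top-right cell.
# ASSUMPTIONS (inferred from the diagram, not stated in the source):
# gamma = 0.8, reward 100 for a move that enters G, reward 0 otherwise.
GAMMA = 0.8
ROWS, COLS = 2, 3
GOAL = (0, 2)  # (row, column), row 0 is the top row
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Return (next_state, reward), or None if the move leaves the grid."""
    r = state[0] + MOVES[action][0]
    c = state[1] + MOVES[action][1]
    if not (0 <= r < ROWS and 0 <= c < COLS):
        return None
    return (r, c), (100.0 if (r, c) == GOAL else 0.0)


def solve_q(sweeps=50):
    """Sweep Q(s,a) <- r + gamma * max_a' Q(s',a') a fixed number of times."""
    q = {}
    for r in range(ROWS):
        for c in range(COLS):
            if (r, c) == GOAL:
                continue  # absorbing state: no actions leave G
            for a in MOVES:
                if step((r, c), a) is not None:
                    q[(r, c), a] = 0.0  # start every legal entry at zero
    for _ in range(sweeps):
        for (s, a) in list(q):
            nxt, reward = step(s, a)
            # Best value available from the next state (0 if it is G).
            best_next = max(
                (v for (s2, _), v in q.items() if s2 == nxt), default=0.0
            )
            q[s, a] = reward + GAMMA * best_next
    return q


q = solve_q()
for (s, a), v in sorted(q.items()):
    print(s, a, round(v, 1))
```

States are (row, column) pairs, so (0, 0) is the top-left cell and (1, 0) the bottom-left. The fixed sweep count is a simplification; a tolerance-based stopping rule would work equally well.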