Instead of solving the utility equations for all states at once, incrementally update the utility estimates of the states actually visited
TD (temporal-difference) learning reduces the discrepancy between the utility estimates of successive states
Example: if the previous state has utility -100 and the current state has utility +100, raise the previous state's utility to lessen the discrepancy
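
The -100 / +100 example can be sketched as a single TD(0)-style update. This is a minimal illustration, not a full learner: the function name `td_update`, the learning rate `alpha`, the discount `gamma`, and the zero `reward` are assumptions introduced here, not values given in the notes.

```python
def td_update(utilities, prev_state, curr_state, reward=0.0, alpha=0.1, gamma=1.0):
    """Move U(prev_state) a fraction alpha toward reward + gamma * U(curr_state).

    alpha, gamma and reward are illustrative assumptions, not from the notes.
    Returns the TD error (the discrepancy before the update).
    """
    target = reward + gamma * utilities[curr_state]
    td_error = target - utilities[prev_state]
    utilities[prev_state] += alpha * td_error  # only the visited state changes
    return td_error


# Previous state "A" has utility -100, current state "B" has utility +100.
utilities = {"A": -100.0, "B": 100.0}
err = td_update(utilities, "A", "B")
print(err, utilities["A"])  # the discrepancy, then the raised estimate for A
```

With these assumed settings the discrepancy is 200, so the previous state's utility moves a tenth of the way, from -100 to about -80, while the current state's utility is left alone. Repeating the update on later visits keeps shrinking the gap.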