TD-Gammon [Tesauro, 1995]
- Learn to play Backgammon
- Immediate reward:
  - +100 if win
  - -100 if lose
  - 0 for all other states
- Trained by playing 1.5 million games against itself
- Now plays approximately as well as the best human players
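
The reward scheme above can be sketched in a few lines. This is a minimal illustration only: TD-Gammon itself used a neural network trained with TD(lambda), whereas the sketch below assumes a simple tabular TD(0) update. The names `immediate_reward`, `td0_update`, `alpha`, `gamma`, and the state labels are hypothetical and do not come from Tesauro's program.

```python
from typing import Dict, Hashable, Optional

WIN_REWARD = 100.0    # +100 if win
LOSS_REWARD = -100.0  # -100 if lose


def immediate_reward(outcome: Optional[str]) -> float:
    """Reward for entering a state: 'win', 'lose', or None for any other state."""
    if outcome == "win":
        return WIN_REWARD
    if outcome == "lose":
        return LOSS_REWARD
    return 0.0  # 0 for all other states


def td0_update(
    values: Dict[Hashable, float],
    state: Hashable,
    next_state: Optional[Hashable],
    reward: float,
    alpha: float = 0.1,   # learning rate (assumed value)
    gamma: float = 1.0,   # discount factor (assumed value)
) -> float:
    """One tabular TD(0) backup: V(s) += alpha * (r + gamma * V(s') - V(s)).

    next_state is None when the game has ended, so its value is taken as 0.
    """
    v_s = values.get(state, 0.0)
    v_next = values.get(next_state, 0.0) if next_state is not None else 0.0
    values[state] = v_s + alpha * (reward + gamma * v_next - v_s)
    return values[state]


if __name__ == "__main__":
    # Toy episode s0 -> s1 -> (win), updated backward from the end of the game.
    values: Dict[Hashable, float] = {}
    td0_update(values, "s1", None, immediate_reward("win"))
    td0_update(values, "s0", "s1", immediate_reward(None))
    print(values)
```

Because the reward is zero everywhere except at the end of a game, the only learning signal comes from the final outcome, which the update propagates backward to earlier positions over many self-play games.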