Reinforcement Learning

Learn to select actions in probabilistic settings, for example:

Robot learning to dock on battery charger
Learning to choose actions to optimize factory output
Learning to play Backgammon

Note several problem characteristics:

Delayed reward
Opportunity for active exploration
Possibility that the state is only partially observable
Possible need to learn multiple tasks with same sensors/effectors
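
The first two characteristics, delayed reward and active exploration, can be illustrated with a small sketch. It is not taken from these notes: it assumes a hypothetical five-position corridor loosely modelled on the battery-charger example, with the only reward given on reaching the charger, and it uses tabular Q-learning with epsilon-greedy action choice as one common way to learn in such a setting. All names (N_STATES, step, train, greedy_policy) and parameter values are illustrative assumptions.

```python
import random

# Hypothetical toy world (an assumption, not from the notes):
# positions 0..4 along a corridor, with the charger at position 4.
N_STATES = 5
ACTIONS = (-1, +1)        # move left, move right
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic move; reward 1.0 only on reaching the charger."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0   # delayed reward
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]     # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                      # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)              # explore: random action
            else:
                best = max(q[state])              # exploit, ties broken at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # The discounted value of the next state carries the final
            # reward back to earlier states over repeated episodes.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Preferred move (-1 or +1) in each non-goal state."""
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])] for s in range(GOAL)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Only the final move into the charger is rewarded, so earlier positions acquire value solely through the discounted estimate of the next state; the random moves taken with probability epsilon are what let the learner find the charger in the first place. Partial observability and multi-task learning are not covered by this sketch.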