
Two Methods For Updating Utility Values