Note: Not yet updated for Fall 2022.
Lecture scribing instructions (extra credit):
- Please sign up to scribe here: Lecture Scribing Sign Up
- Scribe template: Template
| Lecture | Date | Subject | Optional Readings | Deadlines |
|---|---|---|---|---|
| PART 1: Dynamic Programming | | | | |
| 1 | T 02/16 | Reinforcement learning overview [slides] [notes] | DPOC vol 1, Ch 1-2 | HW0 out (not due) |
| 2 | R 02/18 | Dynamic programming [slides] | DPOC vol 1, Ch 2-3.4 | HW0 (post solutions); HW1 out (1.5 weeks) |
| R1 | F 02/19 | Dynamic programming (finite horizon) | | |
| 3 | T 02/23 | Markov Decision Processes [slides] [notes] | DPOC vol 1, Ch 5.1, 5.4; DPOC vol 2, Ch 1.1-1.5 | |
| 4 | R 02/25 | Markov Decision Processes | | |
| R2 | F 02/26 | Dynamic programming (stochastic, infinite horizon) | | |
| 5 | T 03/02 | Undiscounted infinite horizon dynamic programming [slides] [notes] | DPOC vol 1, Ch 5.2-5.3, 5.5; DPOC vol 2, Ch 3.1-3.2 | HW1 due; HW2 out (2 weeks) |
| 6 | R 03/04 | Undiscounted infinite horizon dynamic programming [notes] | | |
| R3 | F 03/05 | Infinite horizon dynamic programming | | |
| | T 03/09 | NO CLASS (Monday Schedule) | | |
| PART 2: Approximate Dynamic Programming | | | | |
| 7 | R 03/11 | Model-free reinforcement learning [slides] | | |
| R4 | F 03/12 | Infinite horizon dynamic programming | | |
| 8 | T 03/16 | Stochastic approximation [slides] [notes] | NDP, Ch 3.2, Ch 4.1, 4.3, 5 | HW2 due; HW3 out (2 weeks); Sign-up for student lectures out |
| 9 | R 03/18 | Representation approximation [slides] [notes] | NDP, Ch 3.1, Ch 4.2, Ch 6.1-6.3 | |
| R5 | F 03/19 | Stochastic & representation approximation | NDP, Ch 5 | |
| | T 03/23 | NO CLASS (Student holiday) | | Sign-up for student lectures due |
| 10 | R 03/25 | Representation approximation [notes] | | |
| R6 | F 03/26 | Representation approximation | | |
| 11 | T 03/30 | Policy space methods (Part 1) [slides] [notes] | | |
| 12 | R 04/01 | Policy space methods (Part 2) | | HW3 due; HW4 out (1.5 weeks) |
| R7 | F 04/02 | Policy space methods | | |
| 13 | T 04/06 | Intro to Multi-Arm Bandits [slides] | | Project proposal due |
| 14 | R 04/08 | Application of RL [slides] | | |
| | F 04/09 | NO RECITATION | | |
| 15 | T 04/13 | State Abstraction (Student Lecture) [slides] [notes] | | HW4 due |
| 16 | R 04/15 | Exploration vs Exploitation (Student Lecture) [slides] [notes] | | Project proposal feedback back to students |
| R8 | F 04/16 | Quiz review | | |
| | T 04/20 | NO CLASS (Student holiday) | | |
| 17 | R 04/22 | Quiz | | Quiz; HW5 out (2 weeks) |
| R9 | F 04/23 | Implementation | | |
| 18 | T 04/27 | Transfer and Curriculum Learning (Student Lecture) [slides] [notes] | | |
| 19 | R 04/29 | Off-policy RL (Student Lecture) [slides] [notes] | | |
| 20 | T 05/04 | Multi-agent Deep RL (Student Lecture) [slides] [notes] | | |
| 21 | R 05/06 | Safe RL (Student Lecture) [slides] [notes] | | HW5 due |
| 22 | T 05/11 | Cooperation and Competition (Student Lecture) [slides] [notes] | | |
| 23 | R 05/13 | Model-based RL (Student Lecture) [slides] [notes] | | |
| 24 | T 05/18 | Project presentations | | Project presentation slides (before class) |
| 25 | R 05/20 | Project presentations | | Project presentation slides (before class); Final project reports (5pm) |