Lecture slides from a current course (2020) on Topics in Reinforcement Learning at Arizona State University:
Slides-Lecture 1, Slides-Lecture 2, Slides-Lecture 3, Slides-Lecture 4, Slides-Lecture 5, Slides-Lecture 6, Slides-Lecture 8.
Video of an Overview Lecture on Distributed RL from the IPAM workshop at UCLA, Feb. 2020.
Slides from an Overview Lecture on Distributed RL from the IPAM workshop at UCLA, Feb. 2020.
This research monograph, currently in progress, will be available from the publishing company Athena Scientific sometime in 2020.
The purpose of the monograph is to develop in greater depth some of the methods from the author's recently published textbook on Reinforcement Learning (Athena Scientific, 2019). In particular, we present new research relating to systems involving multiple agents, partitioned architectures, and distributed asynchronous computation. We pay special attention to the contexts of dynamic programming/policy iteration and control theory/model predictive control. We also discuss in some detail the application of the methodology to challenging discrete/combinatorial optimization problems, such as routing, scheduling, assignment, and mixed integer programming, including the use of neural network approximations within these contexts.
A special focus is rollout algorithms for both discrete deterministic and stochastic DP problems, and the development of distributed implementations in both multiagent and multiprocessor settings, aiming to take advantage of parallelism. Rollout can be viewed as a single policy iteration, and when repeated multiple times, it yields a complex policy iteration method that is well suited for the use of multiple value and policy neural networks, which coordinate their training asynchronously. Much of the new research is inspired by the remarkable AlphaZero chess program, where policy iteration, value and policy networks, approximate lookahead minimization, and parallel computation all play an important role.
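As a rough illustration of rollout on a discrete deterministic problem, the following minimal Python sketch applies it to a small traveling salesman instance. It is not taken from the monograph: the choice of a nearest-neighbor base heuristic, the function names (`tour_cost`, `base_policy_complete`, `rollout`), and the example distance matrix are all assumptions made for illustration. At each step, rollout tries every unvisited city as the next stop, completes the tour with the base heuristic, and commits to the city whose completed tour is cheapest.

```python
def tour_cost(tour, dist):
    """Cost of a closed tour that returns to its first city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def base_policy_complete(partial, dist):
    """Hypothetical base heuristic: extend a partial tour by repeatedly
    moving to the nearest unvisited city (ties go to the lowest index)."""
    tour = list(partial)
    unvisited = set(range(len(dist))) - set(tour)
    while unvisited:
        last = tour[-1]
        nxt = min(sorted(unvisited), key=lambda c: dist[last][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def rollout(dist, start=0):
    """One-step lookahead rollout: score each candidate next city by the
    cost of the tour obtained when the base heuristic finishes the job."""
    n = len(dist)
    tour = [start]
    while len(tour) < n:
        candidates = [c for c in range(n) if c not in tour]
        best = min(
            candidates,
            key=lambda c: tour_cost(base_policy_complete(tour + [c], dist), dist),
        )
        tour.append(best)
    return tour


# Illustrative (made-up) asymmetric distance matrix for 5 cities.
dist = [
    [0, 2, 9, 10, 7],
    [1, 0, 6, 4, 3],
    [15, 7, 0, 8, 3],
    [6, 3, 12, 0, 11],
    [9, 7, 5, 6, 0],
]

base_tour = base_policy_complete([0], dist)
rollout_tour = rollout(dist)
print("base heuristic:", base_tour, tour_cost(base_tour, dist))
print("rollout:       ", rollout_tour, tour_cost(rollout_tour, dist))
```

Because the nearest-neighbor heuristic with deterministic tie-breaking continues a partial tour the same way no matter how that partial tour was reached, the rollout tour in this sketch costs no more than the base heuristic's tour. This is the sense in which rollout acts like a single policy improvement step.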
Your comments and suggestions while the book is under development are welcome.
Click here for preface and table of contents.
Chapter 1: Exact Dynamic Programming
Chapter 2: Rollout and Policy Improvement
Chapter 3: Learning Values and Policies
Chapter 4: Approximate Policy Iteration for Infinite Horizon Problems
Reinforcement Learning and Optimal Control
The following papers and reports have a strong connection to material in the book, and amplify on its analysis and its range of applications.