DISTRIBUTED REINFORCEMENT LEARNING

BOOK AND COURSE MATERIAL, 2020

Dimitri P. Bertsekas


CURRENT REINFORCEMENT LEARNING COURSE AT ASU, 2020: SLIDES

Lecture slides from a current course (2020) on Topics in Reinforcement Learning at Arizona State University:

Slides-Lecture 1, Slides-Lecture 2, Slides-Lecture 3, Slides-Lecture 4, Slides-Lecture 5, Slides-Lecture 6, Slides-Lecture 8.

Video of an Overview Lecture on Distributed RL from IPAM workshop at UCLA, Feb. 2020.

Slides from an Overview Lecture on Distributed RL from IPAM workshop at UCLA, Feb. 2020.


DISTRIBUTED REINFORCEMENT LEARNING BOOK, Athena Scientific, 2020

[Book cover image: RLCOVER.jpg]

This research monograph, currently in progress, will be available from Athena Scientific sometime in 2020.

The purpose of the monograph is to develop in greater depth some of the methods from the author's recently published textbook on Reinforcement Learning (Athena Scientific, 2019). In particular, we present new research, relating to systems involving multiple agents, partitioned architectures, and distributed asynchronous computation. We pay special attention to the contexts of dynamic programming/policy iteration and control theory/model predictive control. We also discuss in some detail the application of the methodology to challenging discrete/combinatorial optimization problems, such as routing, scheduling, assignment, and mixed integer programming, including the use of neural network approximations within these contexts.

A special focus is rollout algorithms for both discrete deterministic and stochastic DP problems, and the development of distributed implementations in both multiagent and multiprocessor settings, aiming to take advantage of parallelism. Rollout can be viewed as a single policy iteration, and when repeated multiple times, it yields a complex policy iteration method that is well suited for the use of multiple value and policy neural networks, which coordinate their training asynchronously. Much of the new research is inspired by the remarkable AlphaZero chess program, where policy iteration, value and policy networks, approximate lookahead minimization, and parallel computation all play an important role.
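The one-step rollout idea described above can be sketched in a few lines: at each state, choose the control that minimizes the stage cost plus the cost of following a base heuristic from the resulting next state. The following is an illustrative sketch, not code from the book; the toy countdown problem and all function names are hypothetical, chosen only to make the policy improvement property concrete.

```python
# Hypothetical toy problem: drive an integer state s down to 0.

def controls(s):
    # Available controls at state s: subtract 1, or subtract 2 if s >= 2.
    return [-1, -2] if s >= 2 else [-1]

def step(s, u):
    # Deterministic transition: (next state, stage cost).
    return (s + u, 1.0 if u == -1 else 1.5)

def base_heuristic_cost(s):
    # Cost-to-go of the base policy "always subtract 1": one unit per step.
    return float(s)

def rollout_control(s):
    # One-step rollout: minimize stage cost plus the base heuristic's
    # cost-to-go from the resulting next state.
    best_u, best_q = None, float("inf")
    for u in controls(s):
        s_next, g = step(s, u)
        q = g + base_heuristic_cost(s_next)
        if q < best_q:
            best_u, best_q = u, q
    return best_u

def run(s, policy):
    # Total cost of driving state s to 0 under the given policy.
    total = 0.0
    while s > 0:
        s, g = step(s, policy(s))
        total += g
    return total
```

In this toy instance, starting from state 10 the base policy incurs cost 10.0, while the rollout policy incurs cost 7.5, illustrating the cost improvement that rollout guarantees over its base heuristic.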

Your comments and suggestions while the book is under development are welcome.


BOOK PREFACE, CONTENTS, SELECTED SECTIONS, AND RELATED MATERIAL

Click here for preface and table of contents.

Book chapters:

Chapter 1: Exact Dynamic Programming

Chapter 2: Rollout and Policy Improvement

Chapter 3: Learning Values and Policies

Chapter 4: Approximate Policy Iteration for Infinite Horizon Problems

References.


LINK TO 2019 REINFORCEMENT LEARNING BOOK PAGE

Reinforcement Learning and Optimal Control


The following papers and reports have a strong connection to material in the book, and expand on its analysis and its range of applications.

  • Bertsekas, D., "Multiagent Rollout Algorithms and Reinforcement Learning," arXiv preprint arXiv:1910.00120, September 2019 (revised March 2020).

  • Bhattacharya, S., Badyal, S., Wheeler, W., Gil, S., and Bertsekas, D., "Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems," IEEE Robotics and Automation Letters, to appear, 2020.

  • Bertsekas, D., "Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm," arXiv preprint arXiv:2002.07407, February 2020.

  • Bertsekas, D. P., and Yu, H., "Distributed Asynchronous Policy Iteration in Dynamic Programming," Proc. of 2010 Allerton Conference on Communication, Control, and Computing, Allerton Park, IL, Sept. 2010. (Related Lecture Slides) (An extended version with additional algorithmic analysis) (A counterexample by Williams and Baird that motivates in part this paper).