DISTRIBUTED AND MULTIAGENT REINFORCEMENT LEARNING

BOOK AND COURSE MATERIAL, 2020

Dimitri P. Bertsekas


CURRENT REINFORCEMENT LEARNING COURSE AT ASU, 2020: SLIDES

Lecture slides from the current (2020) course, Topics in Reinforcement Learning, at Arizona State University:

Slides-Lecture 1, Slides-Lecture 2, Slides-Lecture 3, Slides-Lecture 4, Slides-Lecture 5, Slides-Lecture 6, Slides-Lecture 8.

Video of an Overview Lecture on Distributed RL from IPAM workshop at UCLA, Feb. 2020.

Slides from an Overview Lecture on Distributed RL from IPAM workshop at UCLA, Feb. 2020.

Video of the Final Overview Lecture for the course.

Slides of the Final Overview Lecture for the course.


DISTRIBUTED AND MULTIAGENT REINFORCEMENT LEARNING book, Athena Scientific, 2020

RLCOVER.jpg

This research monograph, currently in progress, will be available from the publishing company Athena Scientific sometime in 2020.

The purpose of the monograph is to develop in greater depth some of the methods from the author's recently published Reinforcement Learning textbook (Athena Scientific, 2019). In particular, we present new research relating to systems involving multiple agents, partitioned architectures, and distributed asynchronous computation. We pay special attention to the contexts of dynamic programming/policy iteration and control theory/model predictive control. We also discuss in some detail the application of the methodology to challenging discrete/combinatorial optimization problems, such as routing, scheduling, assignment, and mixed integer programming, including the use of neural network approximations within these contexts.

A special focus is rollout algorithms for both discrete deterministic and stochastic DP problems, and the development of distributed implementations in both multiagent and multiprocessor settings, aiming to take advantage of parallelism. Rollout can be viewed as a single step of policy iteration; when repeated multiple times, it yields a more complex policy iteration method that is well suited for the use of multiple value and policy neural networks, which coordinate their training asynchronously. Much of the new research is inspired by the remarkable AlphaZero chess program, where policy iteration, value and policy networks, approximate lookahead minimization, and parallel computation all play an important role.
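To make the rollout idea concrete, the following is a minimal sketch (on a hypothetical toy problem, not taken from the book) of one-step lookahead with a base policy for a deterministic finite-horizon DP problem: at each stage, the rollout control minimizes the stage cost plus the cost of following the base policy from the resulting next state to the horizon.

```python
# Minimal sketch of rollout for a deterministic finite-horizon DP problem:
# pick the control minimizing (stage cost) + (cost of following the base
# policy from the resulting next state up to the horizon).
# The problem instance below is hypothetical, for illustration only.

def rollout_control(state, controls, step, base_policy, horizon, t):
    """Return the rollout control at `state` and time `t`."""

    def base_cost(s, k):
        # Simulate the base policy from state s at time k to the horizon,
        # accumulating stage costs along the way.
        total = 0.0
        while k < horizon:
            cost, s = step(s, base_policy(s, k), k)
            total += cost
            k += 1
        return total

    best_u, best_val = None, float("inf")
    for u in controls(state, t):
        cost, nxt = step(state, u, t)
        val = cost + base_cost(nxt, t + 1)  # one-step lookahead value
        if val < best_val:
            best_u, best_val = u, val
    return best_u

# Toy instance: integer state x, controls u in {-1, +1}, dynamics x' = x + u,
# stage cost |x + u|, and a (deliberately poor) base policy that always picks +1.
def step(s, u, k):
    nxt = s + u
    return abs(nxt), nxt

def controls(s, t):
    return (-1, +1)

def base_policy(s, k):
    return +1

print(rollout_control(2, controls, step, base_policy, 3, 0))  # → -1 (moves toward 0)
```

Even with a poor base policy, the rollout control steers the state toward zero, illustrating the cost improvement property of rollout over the base policy.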

Your comments and suggestions while the book is under development are welcome.

Click here for preface and table of contents.


RELATED MATERIAL

The following papers and reports have a strong connection to the material in the book, and expand on its analysis and its range of applications.

  • Bertsekas, D., "Multiagent Value Iteration Algorithms in Dynamic Programming and Reinforcement Learning," ASU Report, April 2020.

  • Bertsekas, D., "Multiagent Rollout Algorithms and Reinforcement Learning," arXiv preprint arXiv:1910.00120, September 2019 (revised April 2020).

  • Bhattacharya, S., Badyal, S., Wheeler, W., Gil, S., and Bertsekas, D., "Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems," IEEE Robotics and Automation Letters, to appear, 2020.

  • Bertsekas, D., "Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm," arXiv preprint arXiv:2002.07407, February 2020.

  • Bertsekas, D. P., and Yu, H., "Distributed Asynchronous Policy Iteration in Dynamic Programming," Proc. of 2010 Allerton Conference on Communication, Control, and Computing, Allerton Park, IL, Sept. 2010. (Related lecture slides) (An extended version with additional algorithmic analysis) (A counterexample by Williams and Baird that motivates in part this paper).

LINK TO 2019 REINFORCEMENT LEARNING BOOK

Reinforcement Learning and Optimal Control
