Topics in Multiagent Learning
Fall 2023
While machine learning techniques have had significant success in single-agent settings, an
increasingly large body of literature has been studying settings involving several learning agents
with different objectives. In these settings, standard training methods, such as gradient descent,
are less successful and the simultaneous learning of the agents commonly leads to nonstationary and
even chaotic system dynamics.
Motivated by these challenges, this course presents the foundations of multi-agent systems from a combined game-theoretic, optimization and learning-theoretic perspective, building from matrix games (such as rock-paper-scissors) to stochastic games, imperfect information games, and games with non-concave utilities. We will present manifestations of these models in machine learning applications, from solving Go to multi-agent reinforcement learning, adversarial learning and broader multi-agent deep learning applications. We will discuss aspects of equilibrium computation and learning as well as the computational complexity of equilibria. We will also discuss how the different models and methods have allowed several recent breakthroughs in AI, including human- and superhuman-level agents for established games such as Go, Poker, Diplomacy, and Stratego. A tentative course syllabus can be found below.
Motivated by these challenges, this course presents the foundations of multi-agent systems from a combined game-theoretic, optimization and learning-theoretic perspective, building from matrix games (such as rock-paper-scissors) to stochastic games, imperfect information games, and games with non-concave utilities. We will present manifestations of these models in machine learning applications, from solving Go to multi-agent reinforcement learning, adversarial learning and broader multi-agent deep learning applications. We will discuss aspects of equilibrium computation and learning as well as the computational complexity of equilibria. We will also discuss how the different models and methods have allowed several recent breakthroughs in AI, including human- and superhuman-level agents for established games such as Go, Poker, Diplomacy, and Stratego. A tentative course syllabus can be found below.
Course Info
- Instructors: Gabriele Farina and Constantinos Daskalakis
- Time: Tuesdays & Thursdays, 11:00am - 12:30pm
- Location: 3-333
- Office hours: TBD
- Prerequisites: Discrete Mathematics and Algorithms at the advanced undergraduate level; mathematical maturity.
- This course will use Canvas
Course Structure
The course will be lecture-based. At the end of the course there will be a few lectures of project presentations by students.
Readings will consist of a mixture of textbooks and course notes, which will be uploaded after lectures.
This course will include 2-3 homework sets and a project presentation. Projects may be done individually or in groups of 2-3 students and can be theoretical or experimental.
Grading will be as follows:
- 50% final project
- 40% homework sets
- 10% completion of readings, attendance, and participation in class discussions
Schedule (subject to change)
# |
Date
|
Topic | Reading |
Material
|
---|---|---|---|---|
1 | 9/7 |
Introduction to the course and logistics
|
Slides | |
Part I: Normal-form games | ||||
2 | 9/12 |
Setting and equilibria: Nash equilibrium
Definition of normal-form games. Solution concepts and
Nash
equilibrium. Nash equilibrium existence theorem. Brouwer's fixed point
theorem.
|
Slides | |
3 | 9/14 |
Setting and equilibria: Correlated equilibrium
Definition of Correlated and coarse correlated
equilibria. Their relationships with Nash equilibria in two-player
zero-sum games. Linear programming formulations
|
Slides | |
4 | 9/19 |
Learning in games: Foundations
Regret and hindsight rationality. Phi-regret
minimization and special cases. Connections with equilibrium computation
and saddle-point optimization
|
GGM08 | Notes |
5 | 9/21 |
Learning in games: Algorithms
Regret matching, regret matching plus, FTRL and
multiplicative weights update
|
Blackwell ap. | Notes |
Part Ib: Complexity of equilibrium | ||||
6 | 9/26 |
Nash equilibrium and PPAD complexity
Sperner's lemma, Brouwer's fixed point, and the PPAD
complexity class.
|
Slides | |
7 | 9/28 |
PPAD-completeness of Nash equilibria, and open problems (Part I)
|
Slides | |
# | 10/3 |
(Project Brainstorming)
See Canvas for a list of project ideas.
|
||
8 | 10/5 |
PPAD-completeness of Nash equilibria, and open problems (Part II)
Arithmetic Circuit SAT, and the PLS, PPP, and PPA
complexity classes.
|
Slides | |
Part II: Stochastic games | ||||
9 | 10/12 |
Stochastic games
Minimax theorem, and existence of equilibrium.
Stationary Markov Nash equilibria.
|
Proof notes | Slides |
10 | 10/17 |
Computation and learning of equilibria in stochastic games (Part I)
Part I: Upper bounds
(Guest lecture by Noah Golowich)
|
Notes | |
11 | 10/19 |
Computation and learning of equilibria in stochastic games (Part II)
Part II: Lower bounds
(Guest lecture by Noah Golowich)
|
Notes | |
Part III: Imperfect-information games | ||||
12 | 10/24 |
Foundations of imperfect-information extensive-form games
Complete versus imperfect information. Kuhn's theorem.
Normal-form and sequence-form strategies. Similarities and differences
with normal-form games.
|
Kernelized MWU | Slides |
13 | 10/26 |
Linear programming for Nash equilibrium in two-player zero-sum
extensive-form games
Formulation and implementation details.
|
Slides | |
14 | 10/31 |
Learning in imperfect-information extensive-form games (Part I)
Construction of learning algorithms for
extensive-form games.
|
CFR appendix | Slides |
15 | 11/2 |
Learning in imperfect-information extensive-form games (Part II) and
sequential irrationality
Proof of Counterfactual Regret Minimization (CFR).
Introduction to sequential irrationality.
|
Slides | |
16 | 11/7 |
Equilibrium refinements and team coordination
Extensive-form perfect equilibria and quasi-perfect
equilibrium.
|
Slides | |
17 | 11/9 |
Scalability-enhancing techniques
|
Slides | |
Project break | ||||
B1 | 11/14 |
Project break
No class this week
|
||
B2 | 11/16 |
Project break
No class this week
|
||
Part IV: Nonconcave games | ||||
18 | 11/21 |
Aspects of nonconcave games
Overview, challenges, and local Nash equilibria.
|
Slides | |
19 | 11/28 |
Aspects of nonconcave games
Randomized Nash equilibria. Infinite games, threshold
dimension and Littlestone dimension.
|
Slides | |
20 | 11/30 |
Aspects of nonconcave games
Infinite games. Double-oracle algorithm for infinite
games.
|
Slides | |
Project presentations | ||||
P1 | 12/5 |
Projects
|
||
P2 | 12/7 |
Projects
|
Related Courses
Below is a list of related courses at other schools.
Instructor(s) | Course title | Year | School |
---|---|---|---|
Farina & Sandholm | Computational Game Solving | 2021 | CMU |
Christian Kroer | Economics, AI, and Optimization | 2020 | Columbia |
John P. Dickerson | Applied Mechanism Design for Social Good | 2018 | UMD |
Fei Fang | Artificial Intelligence Methods for Social Good | 2018 | CMU |
Yiling Chen | Topics at the Interface between Computer Science and Economics | 2016 | Harvard |
Vincent Conitzer | Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design | 2016 | Duke |
Sanmay Das | Multi-Agent Systems | 2016 | Wash U |
Ariel Procaccia | Truth, Justice, and Algorithms | 2016 | CMU |
Milind Tambe | Security and Game Theory | 2016 | USC |
Constantinos Daskalakis | Games, Decision, and Computation | 2015 | MIT |
Tuomas Sandholm | Foundations of Electronic Marketplaces | 2015 | CMU |
Tim Roughgarden | Algorithmic Game Theory | 2013 | Stanford |