Topics in Multiagent Learning
Fall 2024
While machine learning techniques have had significant success in single-agent settings, an increasingly large body of literature has been studying settings involving several learning agents with different objectives. In these settings, standard training methods, such as gradient descent, are less successful, and the simultaneous learning of the agents commonly leads to nonstationary and even chaotic system dynamics.
Motivated by these challenges, this course presents the foundations of multi-agent systems from a combined game-theoretic, optimization, and learning-theoretic perspective. We start with basic matrix games, such as rock-paper-scissors, and advance to more complex forms including imperfect-information games and structured games like combinatorial games, polymatrix games, and stochastic games. Topics will cover various equilibrium concepts, aspects of equilibrium computation and learning, as well as the computational complexity of finding equilibria. Additionally, we will examine how these models and methods have driven recent breakthroughs in AI, producing human- and superhuman-level agents for well-known games such as Go, Poker, Diplomacy, and Stratego. A tentative course syllabus can be found below.
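To give a flavor of these phenomena, here is a minimal, illustrative sketch (not part of the course materials) of two players running the multiplicative weights update algorithm against each other in rock-paper-scissors. The day-to-day strategies cycle rather than settle down, yet the time-averaged play approaches the game's unique Nash equilibrium; the step size and horizon below are arbitrary choices for the demo.

```python
import numpy as np

# Row player's payoff matrix for rock-paper-scissors (order: rock, paper, scissors).
# The game is zero-sum: the column player's payoff is the negation.
A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])

def mwu_self_play(T=10000, eta=0.05):
    """Both players run multiplicative weights update in self-play."""
    x = np.array([0.40, 0.35, 0.25])  # row player's mixed strategy (arbitrary start)
    y = np.array([0.25, 0.40, 0.35])  # column player's mixed strategy
    x_sum = np.zeros(3)
    for _ in range(T):
        x_sum += x
        gx, gy = A @ y, -A.T @ x      # expected payoff of each pure action
        x = x * np.exp(eta * gx); x /= x.sum()   # simultaneous MWU updates
        y = y * np.exp(eta * gy); y /= y.sum()
    return x_sum / T                  # time-averaged strategy of the row player

print(mwu_self_play())  # approaches the uniform Nash equilibrium (1/3, 1/3, 1/3)
```

Note that both gradients are computed before either player updates, so the updates are simultaneous; updating sequentially would give one player a one-step informational advantage.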
Course Info
- Instructor: Gabriele Farina (✉️ gfarina@)
- Time: Tuesdays & Thursdays, 11:00am - 12:30pm
- Location: 3-333
- TA: Zhiyuan Fan (✉️ fanzy@)
- Office hours: see Canvas
- Prerequisites: Discrete Mathematics and Algorithms at the advanced undergraduate level; mathematical maturity.
- This course will use Canvas.
Course Structure
The course will be lecture-based. At the end of the course there will be a few lectures of project presentations by students.
Readings will consist of a mixture of textbooks and course notes, which will be uploaded after lectures.
This course will include 2-3 homework sets and a project presentation. Projects may be done individually or in groups of 2-3 students and can be theoretical or experimental.
Grading will be as follows:
- 50% final project
- 50% homework sets
Schedule (subject to change)
# | Date | Topic | Readings | Lecture notes
---|---|---|---|---
1 | 9/5 | Introduction to the course and logistics | Syllabus | Slides
Part I: Normal-form games | | | |
2 | 9/10 | Setting and equilibria: the Nash equilibrium. Definition of normal-form games. Solution concepts and Nash equilibrium. Nash equilibrium existence theorem. Brouwer's fixed point theorem. | Nash '51 | Notes (html, pdf)
3 | 9/12 | Setting and equilibria: the correlated equilibrium. Topological and computational properties of the set of Nash equilibria in normal-form games. Connections with linear programming. Definition of correlated and coarse correlated equilibria; relationships with Nash equilibria. | von Stengel '24 | Notes (html, pdf)
4 | 9/17 (HW1) | Learning in games: Foundations. Regret and hindsight rationality. Definition of regret minimization and relationships with equilibrium concepts. | Intro of BM'07 | Notes (html, pdf)
5 | 9/19 | Learning in games: Algorithms (part I). General principles in the design of learning algorithms. Follow-the-leader, regret matching, multiplicative weights update, online mirror descent. | | Notes (html, pdf)
6 | 9/24 | Learning in games: Algorithms (part II). Optimistic mirror descent and optimistic follow-the-regularized-leader. Accelerated computation of approximate equilibria. | SALS'15 | Notes (html, pdf)
7 | 9/26 | Learning in games: Bandit feedback. From multiplicative weights to Exp3. General principles. Obtaining high-probability bounds. | ZS'21 | Notes (html, pdf)
8 | 10/1 | Learning in games: Φ-regret minimization. Blum–Mansour construction of a swap-regret minimizer. Gordon, Greenwald, and Marks' general construction for Φ-regret minimizers. | GGM'08, BM'07 | Notes (html, pdf)
Part II: Extensive-form games | | | |
9 | 10/3 | Foundations of extensive-form games. Perfect versus imperfect information. Kuhn's theorem. Normal-form and sequence-form strategies. Similarities and differences with normal-form games. | | Notes (html, pdf)
10 | 10/8 (HW2) | Learning in extensive-form games. No-regret algorithms for extensive-form games. Counterfactual utilities and counterfactual regret minimization (CFR). | | Notes (html, pdf)
11 | 10/10 | Equilibrium refinements. Sequential irrationality. Extensive-form perfect equilibria and quasi-perfect equilibria. | | Notes (html, pdf)
12 | 10/15 | No class (student holiday) | |
13 | 10/17 | Project ideas and brainstorming | |
14 | 10/22 | Project break (coincides with INFORMS 2024 Annual Meeting) | |
15 | 10/24 | Project break (coincides with INFORMS 2024 Annual Meeting) | |
16 | 10/29 | Deep reinforcement learning for large-scale games (part I). (Guest lecture by Samuel Sokota.) Rough taxonomy of deep RL methods for games. Deep versions of fictitious play, CFR, double oracle. Game-theoretic version of PPO. Magnetic mirror descent. | | Slides
17 | 10/31 | Deep reinforcement learning for large-scale games (part II). (Guest lecture by Samuel Sokota.) Public belief states. Applications to poker, Hanabi, team games. | | Slides
Part III: Other structured games | | | |
18 | 11/5 | Combinatorial games and kernelized MWU (part I). Examples of combinatorial games. Kernelized multiplicative weights update algorithm. | FLLK'22 | Notes (html, pdf)
19 | 11/7 | Combinatorial games and kernelized MWU (part II). Examples of combinatorial games. Kernelized multiplicative weights update algorithm. | | Notes (html, pdf)
20 | 11/12 (HW3) | Computation of exact equilibria. A second look at the minimax theorem. Hart and Schmeidler's proof of existence of correlated equilibria. Ellipsoid against hope algorithm. | PR'08, JLB'15, FP'24 | Notes (html, pdf)
21 | 11/14 | Markov (aka stochastic) games. Setting and taxonomy of equilibria. Stationary Markov Nash equilibria in infinite-horizon games. Shapley's minimax theorem. | | Notes (html, pdf)
Part IV: Complexity of equilibrium computation | | | |
22 | 11/19 | PPAD-completeness of Nash equilibria (part I). Sperner's lemma. The PPAD complexity class. Nash ∈ PPAD. | | Notes (html, pdf)
23 | 11/21 | PPAD-completeness of Nash equilibria (part II). Arithmetic circuit SAT. PPAD-hardness of Nash equilibria. | |
Project break & presentations | | | |
24 | 11/26 | Project break | |
25 | 11/28 | No class (Thanksgiving) | |
26 | 12/3 | Project presentations | |
27 | 12/5 | Project presentations | |
Related Courses
Below is a list of related courses at other schools.
Professor | Title | Year | School |
---|---|---|---|
Farina & Daskalakis | Topics in Multiagent Learning (Fall 2023 version of this course) | 2023 | MIT |
Farina & Sandholm | Computational Game Solving | 2021 | CMU |
Aaron Roth | Learning in Games (and Games in Learning) | 2023 | UPenn |
Christian Kroer | Economics, AI, and Optimization | 2020 | Columbia |
Tuomas Sandholm | Foundations of Electronic Marketplaces | 2015 | CMU |
John P. Dickerson | Applied Mechanism Design for Social Good | 2018 | UMD |
Fei Fang | Artificial Intelligence Methods for Social Good | 2018 | CMU |
Constantinos Daskalakis | Games, Decision, and Computation | 2015 | MIT |
Yiling Chen | Topics at the Interface between Computer Science and Economics | 2016 | Harvard |
Vincent Conitzer | Computational Microeconomics: Game Theory, Social Choice, and Mechanism Design | 2016 | Duke |
Ariel Procaccia | Truth, Justice, and Algorithms | 2016 | CMU |
Tim Roughgarden | Algorithmic Game Theory | 2013 | Stanford |