Tom Silver

Welcome to my academic website. You are probably here to see pictures of my dog. Otherwise, read on.

I am a third-year computer science PhD student at MIT. I am advised by Leslie Kaelbling and Josh Tenenbaum and am a member of the Learning and Intelligent Systems group and the Computational Cognitive Science group. I am grateful for support from an NSF Graduate Research Fellowship and an MIT Presidential Fellowship. Previously, I was a researcher at Vicarious AI. I received my B.A. in computer science and mathematics from Harvard in 2016.

Feel free to contact me at

[Google Scholar] [Twitter]


My research uses insights from program synthesis, classical planning, reinforcement learning, and machine learning to improve sample efficiency and generalization in decision-making tasks with very long horizons and sparse rewards.

Planning with learned object importance in large problem instances using graph neural networks
Tom Silver*, Rohan Chitnis*, Aidan Curtis, Josh Tenenbaum, Tomas Lozano-Perez, Leslie Kaelbling
AAAI, 2021
[BibTeX] [Video] [Code] [PDF]

We propose a graph neural network architecture that predicts object importance, addressing the challenge of planning in problems that contain very large numbers of objects.

GLIB: Efficient exploration for relational model-based reinforcement learning via goal-literal babbling
Rohan Chitnis*, Tom Silver*, Josh Tenenbaum, Leslie Kaelbling, Tomas Lozano-Perez
AAAI, 2021
[BibTeX] [Video] [Code] [PDF]

We propose goal-literal babbling (GLIB), a simple and general method for efficient exploration when learning transition models in relational model-based reinforcement learning, in settings without extrinsic goals or rewards.

CAMPs: Learning context-specific abstractions for efficient planning in factored MDPs
Rohan Chitnis*, Tom Silver*, Beomjoon Kim, Leslie Kaelbling, Tomas Lozano-Perez
Conference on Robot Learning (CoRL), 2020 (Plenary Talk)
[BibTeX] [Video] [Code] [PDF]

We observe that learning to impose constraints in factored planning problems can induce context-specific abstractions that afford much more efficient planning.

PDDLGym: Gym environments from PDDL problems
Tom Silver, Rohan Chitnis
ICAPS PRL Workshop, 2020
[BibTeX] [Code] [PDF]

We present PDDLGym, a framework that automatically constructs OpenAI Gym environments from PDDL domains and problems.

Integrated Task and Motion Planning
Caelan Reed Garrett, Rohan Chitnis, Rachel Holladay, Beomjoon Kim, Tom Silver, Leslie Pack Kaelbling, Tomas Lozano-Perez
Annual Review of Control, Robotics, and Autonomous Systems, Vol. 4, 2021
[BibTeX] [PDF]

We define a class of TAMP problems and survey algorithms for solving them, characterizing the solution methods in terms of their strategies for solving the continuous-space subproblems and their techniques for integrating the discrete and continuous components of the search.

Online Bayesian Goal Inference for Boundedly-Rational Planning Agents
Tan Zhi-Xuan, Jordyn L. Mann, Tom Silver, Josh Tenenbaum, Vikash K. Mansinghka
NeurIPS, 2020
[BibTeX] [PDF]

We present an architecture capable of inferring an agent's goals online from both optimal and non-optimal sequences of actions.

Learning constraint-based planning models from demonstrations
João Loula, Kelsey Allen, Tom Silver, Josh Tenenbaum
IROS, 2020
[BibTeX] [PDF]

We present a framework for learning constraint-based task and motion planning models using gradient descent.

Learning skill hierarchies from predicate descriptions and self-supervision
Tom Silver*, Rohan Chitnis*, Anurag Ajay, Josh Tenenbaum, Leslie Kaelbling
AAAI GenPlan Workshop, 2020
[BibTeX] [PDF]

We learn lifted, goal-conditioned policies and use STRIPS planning with learned operator descriptions to solve a large suite of unseen test tasks.

Few-shot Bayesian imitation learning with logical program policies
Tom Silver, Kelsey Allen, Leslie Kaelbling, Josh Tenenbaum
AAAI, 2020. Previous versions at RLDM, 2019 and the ICLR SPiRL Workshop.
[BibTeX] [PDF] [Website] [Code] [Video]

We show that policies learned from five or fewer demonstrations can generalize to dramatically different test task instances.

Learning sparse relational transition models
Victoria Xia, Zi Wang, Kelsey Allen, Tom Silver, Leslie Kaelbling
ICLR, 2019
[BibTeX] [PDF]

We present a representation for describing transition models in complex uncertain domains using relational rules.

Learning models for mode-based planning
João Loula, Tom Silver, Kelsey Allen, Josh Tenenbaum
ICML MBRL Workshop, 2019

We present a model that learns mode constraints from expert demonstrations. We show that it is data efficient, and that it learns interpretable representations that it can leverage to effectively plan in out-of-distribution environments.

Discovering a symbolic planning language from continuous experience
João Loula, Tom Silver, Kelsey Allen, Josh Tenenbaum
CogSci, 2019

We present a model that starts out with a language of low-level physical constraints and, by observing expert demonstrations, builds up a library of high-level concepts that afford planning and action understanding.

Residual policy learning
Tom Silver*, Kelsey Allen*, Josh Tenenbaum, Leslie Kaelbling
arXiv, 2018
[BibTeX] [PDF] [Website] [Code] [Video]

We present a simple method for improving nondifferentiable policies using model-free deep reinforcement learning.

Behavior is everything — towards representing concepts with sensorimotor contingencies
Nicholas Hay, Michael Stark, Alexander Schlegel, Carter Wendelken, Dennis Park, Eric Purdy, Tom Silver, D. Scott Phoenix, Dileep George
AAAI, 2018
[BibTeX] [PDF] [Blog Post] [Code]

We propose an interactive, behavior-based model that represents concepts using sensorimotor contingencies grounded in an agent's experience.

Schema networks: zero-shot transfer with a generative causal model of intuitive physics
Ken Kansky, Tom Silver, David A. Mely, Mohamed Eldawy, Miguel Lazaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, Dileep George
ICML, 2017
[BibTeX] [PDF] [Blog Post] [Video] [Press Coverage: TechCrunch, Wired, Science]

We introduce the Schema Network, an object-oriented generative physics simulator capable of disentangling multiple causes of events and reasoning backward through causes to achieve goals.

Luna: a game-based rating system for artificial intelligence
Tom Silver
Harvard undergraduate thesis, 2016
[Blog Post]

I implement a two-player game and a rating system to measure machine intelligence via crowdsourcing.
