Research
My research uses insights from program synthesis, classical planning, reinforcement learning, and machine learning to improve sample efficiency and generalization in decision-making tasks with very long horizons and sparse rewards.
Planning with learned object importance in large problem instances using graph neural networks
Tom Silver*, Rohan Chitnis*, Aidan Curtis, Josh Tenenbaum, Tomas Lozano-Perez, Leslie Kaelbling
AAAI, 2021
[BibTeX]
[Video]
[Code]
[PDF]
We propose a graph neural network architecture that predicts object importance, addressing the challenge of planning in problems that contain very large numbers of objects.
GLIB: Efficient exploration for relational model-based reinforcement learning via goal-literal babbling
Rohan Chitnis*, Tom Silver*, Josh Tenenbaum, Leslie Kaelbling, Tomas Lozano-Perez
AAAI, 2021
[BibTeX]
[Video]
[Code]
[PDF]
We propose goal-literal babbling (GLIB), a simple and general method for efficient exploration in relational model-based reinforcement learning: the agent learns transition models without any extrinsic goals or rewards.
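The core loop can be sketched in a few lines. This is a simplified, hypothetical illustration (the function names, the bounded-retry novelty check, and the planner/fallback interfaces are assumptions, not the paper's implementation): babble a novel goal, try to plan to it under the current learned model, and fall back to a random action when planning fails.

```python
import random


def babble_goal(known_literals, seen_goals, max_size=2, rng=random):
    """Sample a small conjunction of literals the agent has not yet
    tried to achieve (a simplified stand-in for GLIB's novelty check)."""
    for _ in range(100):  # bounded retries
        size = rng.randint(1, max_size)
        goal = frozenset(rng.sample(sorted(known_literals), size))
        if goal not in seen_goals:
            return goal
    return None  # no novel goal found; caller falls back to random actions


def glib_step(state, known_literals, seen_goals, plan_fn, random_action_fn):
    """One exploration decision: babble a novel goal and try to plan to
    it with the current learned model; fall back to a random action if
    no novel goal exists or the (possibly wrong) model yields no plan."""
    goal = babble_goal(known_literals, seen_goals)
    if goal is not None:
        seen_goals.add(goal)
        plan = plan_fn(state, goal)  # may return None under the learned model
        if plan:
            return plan[0]
    return random_action_fn(state)
```

Because the learned model is often wrong early on, plans to babbled goals fail in informative ways, and those failures are exactly the transitions that improve the model.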
CAMPs: Learning context-specific abstractions for efficient planning in factored MDPs
Rohan Chitnis*, Tom Silver*, Beomjoon Kim, Leslie Kaelbling, Tomas Lozano-Perez
Conference on Robot Learning (CoRL), 2020 (Plenary Talk)
[BibTeX]
[Video]
[Code]
[PDF]
We observe that learning to impose constraints in factored planning problems can induce context-specific abstractions that afford much more efficient planning.
PDDLGym: Gym environments from PDDL problems
Tom Silver, Rohan Chitnis
ICAPS PRL Workshop, 2020
[BibTeX]
[Code]
[PDF]
We present PDDLGym, a framework that automatically constructs OpenAI Gym environments from PDDL domains and problems.
Integrated Task and Motion Planning
Caelan Reed Garrett, Rohan Chitnis, Rachel Holladay, Beomjoon Kim, Tom Silver, Leslie Pack Kaelbling, Tomas Lozano-Perez
Annual Review of Control, Robotics, and Autonomous Systems, Vol. 4, 2021
[BibTeX]
[PDF]
We define a class of TAMP problems and survey algorithms for solving them, characterizing the solution methods in terms of their strategies for solving the continuous-space subproblems and their techniques for integrating the discrete and continuous components of the search.
Online Bayesian Goal Inference for Boundedly-Rational Planning Agents
Tan Zhi-Xuan, Jordyn L. Mann, Tom Silver, Josh Tenenbaum, Vikash K. Mansinghka
NeurIPS, 2020
[BibTeX]
[PDF]
We present an architecture capable of inferring an agent's goals online from both optimal and non-optimal sequences of actions.
Learning constraint-based planning models from demonstrations
João Loula, Kelsey Allen, Tom Silver, Josh Tenenbaum
IROS, 2020
[BibTeX]
[PDF]
We present a framework for learning constraint-based task and motion planning models using gradient descent.
Learning skill hierarchies from predicate descriptions and self-supervision
Tom Silver*, Rohan Chitnis*, Anurag Ajay, Josh Tenenbaum, Leslie Kaelbling
AAAI GenPlan Workshop, 2020
[BibTeX]
[PDF]
We learn lifted, goal-conditioned policies and use STRIPS planning with learned operator descriptions to solve a large suite of unseen test tasks.
Few-shot Bayesian imitation learning with logical program policies
Tom Silver, Kelsey Allen, Leslie Kaelbling, Josh Tenenbaum
AAAI, 2020. Previous versions at RLDM, 2019 and the ICLR SPiRL Workshop.
[BibTeX]
[PDF]
[Website]
[Code]
[Video]
We can learn policies from five or fewer demonstrations that generalize to dramatically different test task instances.
Learning sparse relational transition models
Victoria Xia, Zi Wang, Kelsey Allen, Tom Silver, Leslie Kaelbling
ICLR, 2019
[BibTeX]
[PDF]
We present a representation for describing transition models in complex uncertain domains using relational rules.
Learning models for mode-based planning
João Loula, Tom Silver, Kelsey Allen, Josh Tenenbaum
ICML MBRL Workshop, 2019
We present a model that learns mode constraints from expert demonstrations. We show that it is data-efficient and learns interpretable representations that it can leverage to plan effectively in out-of-distribution environments.
Discovering a symbolic planning language from continuous experience
João Loula, Tom Silver, Kelsey Allen, Josh Tenenbaum
CogSci, 2019
We present a model that starts out with a language of low-level physical constraints and, by observing expert demonstrations, builds up a library of high-level concepts that afford planning and action understanding.
Residual policy learning
Tom Silver*, Kelsey Allen*, Josh Tenenbaum, Leslie Kaelbling
arXiv, 2018
[BibTeX]
[PDF]
[Website]
[Code]
[Video]
We present a simple method for improving nondifferentiable policies using model-free deep reinforcement learning.
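The idea admits a very compact sketch: learn an additive correction f_theta on top of a fixed base controller, so the agent acts with pi(s) = pi_0(s) + f_theta(s). The minimal NumPy illustration below (class and parameter names are hypothetical, not the paper's code) shows the key initialization trick of zeroing the residual's output layer, so the combined policy starts out identical to the base policy before any learning.

```python
import numpy as np


class ResidualPolicy:
    """Sketch of residual policy learning: a learned correction is added
    to a fixed (possibly nondifferentiable) base controller. Zero-initializing
    the residual's output weights makes the combined policy start out
    identical to the base policy, preserving its initial competence."""

    def __init__(self, base_policy, obs_dim, act_dim, hidden=32, rng=None):
        rng = rng or np.random.default_rng(0)
        self.base_policy = base_policy
        self.W1 = rng.normal(scale=0.1, size=(obs_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = np.zeros((hidden, act_dim))  # zero init => zero residual
        self.b2 = np.zeros(act_dim)

    def residual(self, obs):
        # Small MLP correction; in practice this is trained with
        # model-free RL while the base policy stays frozen.
        h = np.tanh(obs @ self.W1 + self.b1)
        return h @ self.W2 + self.b2

    def __call__(self, obs):
        return self.base_policy(obs) + self.residual(obs)
```

Only the residual network needs gradients, which is why the base controller can be a hand-coded or otherwise nondifferentiable policy.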
Behavior is everything — towards representing concepts with sensorimotor contingencies
Nicholas Hay, Michael Stark, Alexander Schlegel, Carter Wendelken, Dennis Park, Eric Purdy, Tom Silver, D. Scott Phoenix, Dileep George
AAAI, 2018
[BibTeX]
[PDF]
[Blog Post]
[Code]
We propose an interactive, behavior-based model that represents concepts using sensorimotor contingencies grounded in an agent's experience.
Schema networks: zero-shot transfer with a generative causal model of intuitive physics
Ken Kansky, Tom Silver, David A. Mely, Mohamed Eldawy, Miguel Lazaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, Dileep George
ICML, 2017
[BibTeX]
[PDF]
[Blog Post]
[Video]
[Press Coverage: TechCrunch, Wired, Science]
We introduce the Schema Network, an object-oriented generative physics simulator capable of disentangling multiple causes of events and reasoning backward through causes to achieve goals.
Luna: a game-based rating system for Artificial Intelligence
Tom Silver
Harvard undergraduate thesis, 2016
[Blog Post]
I implement a two-player game and a rating system to measure machine intelligence via crowdsourcing.