ICML 2011 Workshop:
Planning and Acting with Uncertain Models

This workshop addresses the question of how an agent can best use uncertainty in a model of the world to act optimally. When does accounting for such uncertainty help? When can it be (perhaps partially) ignored?


Through speakers, submitted papers, posters, and discussion, this workshop will develop new insights about the advantages, disadvantages, and open questions for various strategies for leveraging knowledge of model uncertainty when planning.


Key Dates

This workshop will be held at ICML 2011 in Bellevue, Washington on July 2, 2011 at the Hyatt Regency Bellevue hotel. Please see the ICML 2011 website for more information on the venue.


Program

The format of the workshop includes a mix of invited speakers and submitted papers on a variety of approaches to planning with uncertain models. Each invited talk will be followed by a discussion of the approach, with the aim of understanding both its theoretical properties and its applications.

9:00-9:05 Opening Remarks
Session One: Learning with Guarantees
9:05-9:55 Invited Talk: Lihong Li - Yahoo! Research
9:55-10:10 Probabilistic Goal Markov Decision Processes, Huan Xu, Shie Mannor
10:10-10:25 Model Estimation Within Planning and Learning, Alborz Geramifard, Josh Redding, Josh Joseph
10:25-10:35 Discussion
10:35-11:00 Coffee Break
Session Two: Bayesian Methods
11:00-11:50 Invited Talk: Joelle Pineau - McGill University
11:50-12:00 Discussion
1:30-1:45 A Behavior Based Kernel for Policy Search via Bayesian Optimization, Aaron Wilson
1:45-2:00 Multiple-Target Reinforcement Learning with a Single Policy, Marc Peter Deisenroth, Dieter Fox
Session Three: Learning Alternate Models
2:00-2:15 Off-Policy Linear Action Models, Hengshuai Yao
2:15-2:30 Reduced-Order Models for Data-Limited Reinforcement Learning, Josh Joseph
2:30-3:20 Invited Talk: Satinder Singh - University of Michigan
3:20-3:30 Discussion
3:30-4:00 Coffee Break
Session Four: Planning in Large Spaces
4:00-4:50 Invited Talk: David Hsu - National University of Singapore
4:50-5:05 Online Planning in Continuous POMDPs with Open-Loop Information-Gathering Plans, Kris Hauser
5:05-5:20 Goal-Directed Knowledge Acquisition, Dan Bryce
5:20-5:40 Discussion

Topic, Motivation and Background

Recent years have seen huge advances in capturing structured uncertainty. We can describe distributions over hierarchies, factors, and many other kinds of structure, giving us rich hypothesis spaces for constructing models of many domains. Modeling this uncertainty is especially important when learning controlled dynamical systems, such as state-space models. Model uncertainty can result from limited training data or from an expert's approximations; in either case, the agent's goal is to leverage its knowledge of model uncertainty to select actions that control the system optimally.

Many action-selection approaches have been developed to leverage knowledge about model uncertainty: Bayesian approaches cast unknown model parameters as additional hidden state; PAC approaches provide guarantees on the agent's final performance; robust approaches guard against worst-case realizations of the unknown model. However, in many applications, simple approximations, such as ignoring much of the uncertainty or making myopic decisions, perform quite well. Does their success imply that more sophisticated machinery is unnecessary? In what problem domains is explicitly planning to gather information, and thereby reduce model uncertainty, crucial to good performance? And do the properties of successful action-selection strategies provide insight into how best to track model uncertainty?
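The Bayesian idea above, treating unknown model parameters as additional hidden state, can be illustrated with a minimal sketch. The example below uses Thompson sampling on a two-armed Bernoulli bandit (a toy stand-in for a full MDP, not any specific method from the workshop): the agent keeps a Beta posterior over each arm's unknown success probability and acts by sampling from that posterior, so exploration falls out of the model uncertainty itself. The function names and the arm probabilities are illustrative assumptions.

```python
import random

def thompson_step(counts, rng):
    """Pick an arm by sampling each arm's success probability from its
    Beta posterior; counts[arm] = [successes, failures]."""
    samples = [rng.betavariate(1 + s, 1 + f) for s, f in counts]
    return max(range(len(samples)), key=lambda a: samples[a])

def run(true_probs, steps=2000, seed=0):
    """Interact with a Bernoulli bandit for `steps` rounds, updating the
    posterior after each pull; returns how often each arm was pulled."""
    rng = random.Random(seed)
    counts = [[0, 0] for _ in true_probs]  # per-arm [successes, failures]
    pulls = [0] * len(true_probs)
    for _ in range(steps):
        arm = thompson_step(counts, rng)
        reward = rng.random() < true_probs[arm]
        counts[arm][0 if reward else 1] += 1
        pulls[arm] += 1
    return pulls

if __name__ == "__main__":
    # With arm 1 clearly better (0.7 vs 0.3), the posterior concentrates
    # and the sampler directs most pulls to arm 1.
    print(run([0.3, 0.7]))
```

Because actions are chosen by sampling rather than by maximizing a point estimate, the agent automatically explores arms whose posteriors are still wide, which is the sense in which model uncertainty drives the policy.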

The goal of this workshop is to identify state-of-the-art methods for planning under uncertainty and to discuss why specific methods work well in specific domains. For example, how does an explicit reward for reducing uncertainty, as in goal-oriented problems, affect which planning algorithms are most successful? When planners take no explicit steps to reduce model uncertainty, is that a deliberate approximation, or does it reflect the fact that, in many domains, most reasonable policies reduce model uncertainty reasonably well anyway? When is it useful to introduce internal rewards for exploration or for reducing model variance? How should agents decide which aspects of the model are worth exploring?

Related Workshops

This topic has been of persistent interest to many communities, and this workshop builds on a number of previous workshops on related themes.


Organizers

Finale Doshi completed her M.Sc. at the University of Cambridge and is currently a Ph.D. candidate at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. Her research focuses on Bayesian nonparametric methods for reinforcement learning in partially observable domains.

David Wingate is a research scientist in the Laboratory for Information and Decision Systems and the Computational Cognitive Science group at MIT. His research focuses on the intersection of dynamical systems modeling, reinforcement learning, hierarchical Bayesian modeling, and machine learning.