Operations Research Center

Fall 2015 Seminar Series

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
OPERATIONS RESEARCH CENTER
FALL 2015 SEMINAR SERIES

DATE: 10/15/15
LOCATION: E51-315
TIME: 4:15pm
Reception immediately following

SPEAKER:
Tamara Broderick

TITLE
Feature Allocations, Paintboxes, and Approximate Bayesian Inference

ABSTRACT
Clustering involves placing entities into mutually exclusive categories. We wish to relax the requirement of mutual exclusivity, allowing objects to belong simultaneously to multiple classes, a formulation that we refer to as "feature allocation." The first step is a theoretical one. In the case of clustering, the class of probability distributions over exchangeable partitions of a dataset has been characterized (via exchangeable partition probability functions and the Kingman paintbox). These characterizations support an elegant nonparametric Bayesian framework for clustering in which the number of clusters is not assumed to be known a priori. We establish an analogous characterization for feature allocation; we define notions of "exchangeable feature probability functions" and "feature paintboxes" that lead to a Bayesian framework that does not require the number of features to be fixed a priori. The second step is a computational one. Since exact Bayesian posterior calculation is typically not feasible in complex models, a large body of work is devoted to schemes for approximating the posterior. Rather than focusing on a good approximation to the full posterior, which has proved computationally prohibitive in complex models, we instead consider trading off some knowledge of the posterior for computational gains (and vice versa). For instance, while Mean Field Variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets, it is well known that MFVB underestimates (sometimes severely) the uncertainty of model variables and provides no information about the covariance between model variables. We generalize linear response methods from statistical physics to augment MFVB and deliver accurate uncertainty estimates for model variables, both for individual variables and coherently across variables.
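To make the distinction between a clustering and a feature allocation concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the talk. The function names, the use of integer indices 0..n-1 for data points, and the example data are all assumptions made for this sketch. A clustering is represented as a list of blocks that are mutually exclusive and cover every data point; a feature allocation drops both requirements, so a point may belong to several features or to none.

```python
from typing import List, Set


def is_feature_allocation(blocks: List[Set[int]], n: int) -> bool:
    """True if every block is a non-empty subset of {0, ..., n-1}.

    Overlapping blocks are allowed, and some points may have no feature.
    """
    universe = set(range(n))
    return all(len(block) > 0 and block <= universe for block in blocks)


def is_partition(blocks: List[Set[int]], n: int) -> bool:
    """True if the blocks are a feature allocation in which every point
    appears in exactly one block (mutually exclusive and exhaustive)."""
    if not is_feature_allocation(blocks, n):
        return False
    covered = [i for block in blocks for i in block]
    return sorted(covered) == list(range(n))


def membership_matrix(blocks: List[Set[int]], n: int) -> List[List[int]]:
    """Binary matrix with one row per data point and one column per block."""
    return [[int(i in block) for block in blocks] for i in range(n)]


# Hypothetical example data: five data points, indexed 0..4.
clustering = [{0, 1}, {2}, {3, 4}]   # each point is in exactly one block
features = [{0, 1, 2}, {1, 3}]       # point 1 has two features, point 4 has none
```

In the matrix view, each row of a clustering sums to exactly one, while the rows of a general feature allocation may sum to any non-negative integer.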