I'm interested in language as a communicative and computational tool. People learn to understand and generate novel utterances from remarkably little data. Having learned language, we use it to acquire new concepts and to structure our reasoning. Current machine learning techniques fall short of human abilities both in their capacity to learn language and in their capacity to learn from language about the rest of the world. My research aims to (1) understand the computational foundations of efficient language learning, and (2) build general-purpose intelligent systems that can communicate effectively with humans and learn from human guidance.
I'm an associate professor at MIT in EECS and CSAIL. I did my PhD work at Berkeley, where I was a member of the Berkeley NLP Group and the Berkeley AI Research Lab. I've also spent time with the Cambridge NLIP Group, and the NLP Group and the (erstwhile) Center for Computational Learning Systems at Columbia.
Prospective students: apply through the MIT graduate admissions portal in the fall. Please read my advising statement if you're considering applying. Prospective visitors: I do not currently have openings for visiting researchers. (I'm afraid I generally can't respond to individual emails about either grad admissions or visitor positions / internships.)
Some current research directions:
Learning from language
Much of humans' abstract knowledge comes from abstract descriptions, but almost all machine learning research focuses on learning from comparatively low-level demonstrations or interactions. How do we enable more natural and efficient learning from natural language supervision instead?
LaMPP: Language models as probabilistic priors for perception and action (preprint)
Skill induction and planning with latent language (ACL 2022)
Leveraging language to learn program abstractions and search heuristics (ICML 2021)
Interpretation and explanation
How can we help humans understand the features and representational strategies that black-box machine learning algorithms discover? To what extent do these strategies reflect abstractions that we already have names for?
What learning algorithm is in-context learning? Investigations with linear models (ICLR 2023)
Implicit representations of meaning in neural language models (ACL 2021)
Compositional explanations of neurons (NeurIPS 2020)
Compositionality and generalization
Compositionality and modularity are core features of representational systems in language, software and biology. How do they arise, and what function do they serve? Can we use descriptions of abstract compositional structure in one domain (e.g. language) to learn modular representations in another (e.g. vision)?
Collaboration graph trivia: My Erdős number is at most three (J Andreas to R Kleinberg to L Lovász to P Erdős). My Kevin Bacon number (and consequently my Erdős-Bacon number) remains lamentably undefined, but my Kevin Knight number (it's a thing) is one. I have never starred in a film with Kevin Knight. Noam Chomsky is my great-great-grand-advisor (J Andreas to D Klein to C Manning to J Bresnan to N Chomsky).
Photo: Gretchen Ertl / MIT CSAIL