Ben Lengerich

Hello! I'm a postdoc in the MIT Computational Biology Lab (PI: Prof. Manolis Kellis) at MIT CSAIL and the Broad Institute. My research is currently supported by the Alana Down Syndrome Center at MIT, with the aim of using contextualized machine learning to elucidate the biological bases of, and therapeutic options for, complex diseases. I completed my PhD in the CS Department at Carnegie Mellon University, advised by Prof. Eric Xing. I've also been very fortunate to spend time working with Rich Caruana at Microsoft Research (2019, 2020) and Chris Potts at Roam Analytics (2017). My work has previously been supported by the CMLH Fellowship.


Email Address:

Office: D-528 Stata Center, MIT

Other: Calendar, ORCID Profile, Conference Deadlines


New preprint: Our study of biases in EHR datasets and their implications ("Death by Round Numbers") is now available on medRxiv!

May 2022

Talk: Presented Contextualized ML for Disease Subtyping at BIRS. Video/slides available.

May 2022

New publication: Our study of Personalized Treatment Effects in COVID-19 is now available in JBI!

May 2022

Code available: The alpha version of Contextualized is now available.

April 2022

Code available: Contextualized GAMs have been open-sourced.

January 2022

New publication: Our paper on the Interaction Effect perspective of Dropout has been accepted to AISTATS 2022.

January 2022

New publication: Our paper on Neural Additive Models has been accepted for an oral presentation at NeurIPS 2021.

December 2021

New publication: Our paper on the counter-intuitive untrustworthiness of GAMs has been accepted to KDD 2021.

June 2021

Research Interests and Selected Publications

I research machine learning methods and build artificial intelligence systems to automate the process of scientific discovery. I am particularly interested in questions of automatically identifying and adapting to changing contexts. This research focus requires advances in interpretable machine learning, multi-task learning, and task representation learning, and finds natural applications in precision medicine and computational genomics of complex diseases such as Alzheimer’s Disease, Down Syndrome, and cancer.

Context-Specific Inferences: What happens if we build models which can adapt to different contexts?
  • NOTMAD: Estimating Bayesian Networks with Sample-Specific Structures and Parameters
    Benjamin Lengerich, Caleb Ellington, Bryon Aragam, Eric P. Xing, Manolis Kellis
    arXiv 2021
  • Discriminative Subtyping of Lung Cancers from Histopathology Images via Contextual Deep Learning
    Benjamin J. Lengerich*, Maruan Al-Shedivat*, Amir Alavi, Jennifer Williams, Sami Labbaki, Eric P. Xing
    medRxiv 2020
  • Code: Contextualized.ML Python package.
Interpretable AI: How can we build models that summarize patterns in ways that we can use to understand the underlying phenomena?
  • Neural Additive Models: Interpretable Machine Learning with Neural Nets.
    Rishabh Agarwal, Levi Melnick, Nicholas Frosst, Xuezhou Zhang, Ben Lengerich, Geoffrey Hinton, Rich Caruana
    NeurIPS 2021
  • Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models
    Benjamin J. Lengerich, Sarah Tan, Chun-Hao Chang, Giles Hooker, Rich Caruana
    AISTATS 2020
  • Code: Interpret.ML Python package.
Personalized and Precision Medicine: How can we deliver optimal care for every individual patient?
  • Automated Interpretable Discovery of Heterogeneous Treatment Effectiveness: A COVID-19 Case Study
    Benjamin J. Lengerich, Mark E. Nunnally, Yin Aphinyanaphongs, Caleb Ellington, Rich Caruana
    JBI 2022
Computational Genomics of Complex Diseases: What are the biological causes, implications, and therapeutic targets of complex diseases?

A more complete list of my publications can be found here.