Nikhil Naik

I am a Director of AI Research/Principal Researcher at Salesforce Research. I lead a team of researchers and engineers working on generative AI for natural language processing and computer vision. I have built large language models and image generation models and developed the first AI copilots at Salesforce, now in use by many customers.

I obtained my PhD from MIT in 2016, advised by Ramesh Raskar. My work has appeared in AI conferences and Nature family journals and has been featured in The Atlantic, The Economist, MIT Technology Review, and The New York Times. I have received several awards, including a Webby Award and a Harvard Prize Fellowship. Email: naik[at]alum.mit.edu /  Google Scholar /  Twitter



Interns and Students Supervised/Co-supervised

Clara Wong-Fannjiang (UC Berkeley), Akash Gokul (UC Berkeley), Aman Shrivastava (U. Virginia), Brian Chen (Columbia), Viraj Prabhu (Georgia Tech), Alvin Chan (NTU), Isabela Albuquerque (U. Montreal), Ankan Bansal (U. Maryland), Abhimanyu Dubey (IIT Delhi→MIT), Bowen Baker (MIT→OpenAI, Winner of 2nd best CS masters thesis at MIT), Karan Dwivedi (Harvard), Otkrist Gupta (MIT→Startup), Jade Philipoom (MIT)



Selected Publications and Preprints

See Google Scholar for full list

Diffusion Model Alignment Using Direct Preference Optimization

Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, Nikhil Naik   CVPR 2024
Paper /   Code /   Blog 
The first scalable method to align text-to-image models to user preferences, now used to align state-of-the-art models like Stable Diffusion 3 and others!


BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models

Senthil Purushwalkam, Akash Gokul, Shafiq Joty, Nikhil Naik   arXiv 2024 (in submission)
Paper /  Code (Coming soon!)  Zero-shot image personalization for text-to-image models


ConRad: Image Constrained Radiance Fields for 3D Generation from a Single Image

Senthil Purushwalkam, Nikhil Naik   NeurIPS 2023
Paper /  Code (Coming soon!)  Single image-to-3D using diffusion models and a constrained Neural Radiance Field


XGen-Image-1: A Foundation Model for Text-to-Image Generation

Bram Wallace, Nikhil Naik et al.   Blog post  How to train a state-of-the-art text-to-image generation model using TPUs and optimize image quality at inference time


End-to-End Diffusion Latent Optimization Improves Classifier Guidance

Bram Wallace, Akash Gokul, Stefano Ermon, Nikhil Naik   ICCV 2023
Paper /  Code  Accurate plug-and-play classifier guidance for diffusion models without the need for noise-aware training


Large language models generate functional protein sequences across diverse families

Ali Madani, Ben Krause, Eric R Greene, ... , Richard Socher, James S Fraser, Nikhil Naik    Nature Biotechnology 2023
Paper / Preprint /  Code /  Blog /  Science Perspective /  Press (50+)   A 1.2 billion parameter LLM can be used to generate novel proteins unseen in nature, as validated by lab experiments


Exact Diffusion Inversion via Coupled Transformations

Bram Wallace, Akash Gokul, Nikhil Naik   CVPR 2023
Paper  / Code  / Hugging Face  An exactly invertible diffusion generative process that enables real image editing with diffusion models


CLIP-Lite: information efficient visual representation learning from textual annotations

Aman Shrivastava, Ramprasaath R Selvaraju, Nikhil Naik, Vicente Ordonez   AISTATS 2022
Paper / Code  Efficient CLIP training using an information-efficient lower bound to maximize the mutual information between input modalities


Deep Extrapolation for Attribute-Enhanced Generation

Alvin Chan*, Ali Madani*, Ben Krause, Nikhil Naik   NeurIPS 2021
Paper /  Code  A generative model that extrapolates in the attribute space using a learned latent space.


CASTing Your Model: Learning to Localize Improves Self-Supervised Representations

Ramprasaath R. Selvaraju*, Karan Desai*, Justin Johnson, Nikhil Naik   CVPR 2021
Paper /  Blog /  Code  Intelligent crop sampling and Grad-CAM supervision improve localization and downstream performance of SSL models


ProGen: Language Modeling for Protein Generation

Ali Madani, Bryan McCann, Nikhil Naik, Nitish Keskar, Namrata Anand, Raphael Eguchi, Po-Ssu Huang, Richard Socher   
arXiv preprint 2020   Paper /  Blog  A language model can successfully generate tailored protein sequences that appear structurally and functionally viable


The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies

Stephan Zheng, Alex Trott, Sunil Srinivasa, Nikhil Naik, Melvin Gruesbeck, David Parkes, Richard Socher   
arXiv preprint 2020   Paper /  Blog /  Code Two-level reinforcement learning can be used to set optimal tax policies in simulated economies


Deep Learning-enabled Breast Cancer Hormonal Receptor Status Determination from Base-level H&E Stains

Nikhil Naik, Ali Madani*, Andre Esteva*, Nitish Keskar, Michael Press, Dan Ruderman, David Agus, Richard Socher  
Nature Communications 2020   Paper /  Blog Deep learning can make hormone therapy decisions from H&E pathology images, without needing more complex IHC testing


Maximum-Entropy Fine Grained Classification

Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, Nikhil Naik   NeurIPS 2018
Paper  /  Code Maximizing entropy of the output probability distribution for training CNNs helps tackle intra-class similarity in FGVC


Big Data and Big Cities: The Promises and Limitations of Improved Measures of Urban Life

Edward Glaeser, Scott Kominers, Michael Luca, Nikhil Naik   Economic Inquiry 2018
Paper (Winner of the 2018 Best EI Article Award) /   Press:  The Atlantic   Chicago Policy Review   HBS Working Knowledge   Computer vision can predict important socioeconomic characteristics from street view images


Pairwise Confusion for Fine-grained Visual Classification

Abhimanyu Dubey, Otkrist Gupta, Pei Guo, Ryan Farrell, Ramesh Raskar, Nikhil Naik   ECCV 2018
Paper  /  Code Reducing overfitting in neural net training by intentionally introducing confusion in activations improves FGVC performance


Accelerating Neural Architecture Search using Performance Prediction

Bowen Baker*, Otkrist Gupta*, Ramesh Raskar, Nikhil Naik   ICLR Workshops 2018
Paper  /  Code  Early stopping based on predicting the final performance of partially trained neural networks accelerates architecture search


Computer Vision Uncovers Predictors of Physical Urban Change

Nikhil Naik, Scott Kominers, Ramesh Raskar, Edward Glaeser, Cesar Hidalgo   PNAS 2017
Paper  /  Website (Winner of 2018 Webby Award for the best use of machine learning on the Internet)
Press:  Citylab   Fast Company   Forbes   Harvard Gazette   MIT News   New York Times   Quartz 
Computer vision measures urban change from time-series street view images, enabling economic analysis of urban dynamics


Green Streets – Quantifying and Mapping Urban Trees with Street-level Imagery and Computer Vision

Ian Seiferling, Nikhil Naik, Carlo Ratti, Raphaël Proulx    Landscape and Urban Planning 2017
Paper  Computer vision can create detailed maps of urban vegetation using street view images


Designing Neural Network Architectures using Reinforcement Learning

Bowen Baker*, Otkrist Gupta*, Nikhil Naik, Ramesh Raskar   ICLR 2017
Paper  /  Code  /  Press: MIT Technology Review  A reinforcement learning agent can automatically generate high-performing CNN architectures


Deep Learning the City: Quantifying Urban Perception At A Global Scale

Abhimanyu Dubey, Nikhil Naik, Devi Parikh, Ramesh Raskar, Cesar Hidalgo   ECCV 2016
Paper  /  Code A neural network predicts perceptual attributes of the built environment from hundreds of cities from six continents


Cities Are Physical Too: Using Computer Vision to Measure the Quality and Impact of Urban Appearance

Nikhil Naik, Ramesh Raskar, Cesar Hidalgo   American Economic Review: Papers and Proceedings 2016
Paper  Computer vision-driven prediction of urban appearance enables studies of its quality and impact on society


A Light Transport Model for Mitigating Multipath Interference in Time-of-flight Sensors

Nikhil Naik, Achuta Kadambi, Christoph Rhemann, Shahram Izadi, Ramesh Raskar, Sing Bing Kang   CVPR 2015
Paper  /  Supplement Separating global and direct components of light transport can reduce multipath interference


Estimating Wide-angle, Spatially Varying Reflectance using Time-resolved Inversion of Backscattered Light

Nikhil Naik, Christopher Barsi, Andreas Velten, Ramesh Raskar   JOSA A 2014
Paper (Selected by Editors to appear in a Special Issue of Virtual Journal of Biomedical Optics) A trillion-frames-per-second camera can measure the reflectance profile of objects imaged through a diffuser


Streetscore – Predicting the Perceived Safety of One Million Streetscapes

Nikhil Naik, Jade Philipoom, Ramesh Raskar, Cesar Hidalgo   CVPR Workshops 2014
Paper /  Website  /  Data  /  Press:   Daily Mail   The Economist   Fast Company   Gizmodo   A computer vision algorithm, trained with an online participatory game, accurately predicts human perception of streetscapes


Frequency Analysis of Transient Light Transport with Applications in Bare Sensor Imaging

Di Wu, Gordon Wetzstein, Christopher Barsi, Matthew O’Toole, Nikhil Naik, Kyros Kutulakos, Ramesh Raskar    ECCV 2012
Paper  Analyzing free space propagation in the frequency domain leads to a new, time-resolved bare sensor imaging system


Single View Reflectance Capture using Multiplexed Scattering and Time-of-flight Imaging

Nikhil Naik, Shuang Zhao, Andreas Velten, Ramesh Raskar, Kavita Bala    ACM SIGGRAPH Asia 2011
Paper  A trillion-frames-per-second camera can measure the reflectance profile of objects by analyzing indirectly scattered light