The research goal of our laboratory is to understand the mechanisms underlying visual object recognition. Specifically, we seek to understand how the brain transforms sensory input from an initial representation (essentially a photograph on the retina) into a new, remarkably powerful form of representation: one that can support our seemingly effortless ability to solve the computationally difficult problem of object recognition. We focus in particular on patterns of neuronal activity at the highest levels of the ventral visual stream (primate inferior temporal cortex, IT), which likely directly underlie recognition. At these high levels, individual neurons can be highly selective for object identity, even though each object's image on the retinal surface is highly variable, for example because of changes in object position, distance, pose, lighting, and background clutter. Understanding how transformations carried out along the ventral visual processing stream create such neuronal responses is the key to understanding visual recognition.
To approach these very difficult problems, the work of our laboratory is directed along three main lines: 1) characterize the computational usefulness of patterns of IT neuronal activity for supporting immediate visual object recognition; 2) test and develop computational theories of how visual input is transformed along the ventral processing stream from a pixel-wise representation to a powerful representation in IT; and 3) understand the spatial organization of this representation. Our primary research approaches are neurophysiology in awake, behaving non-human primates; functional brain imaging (fMRI); human psychophysics; and computational modeling. Across all of these endeavors, we aim to develop innovative methods and tools to facilitate this work in our laboratory and others. Our approaches are often synergistic with those of other MIT laboratories, and this has greatly enhanced our progress.
Because recognition is critical to so much of behavior, the understanding we seek will fundamentally influence the way we think about how the brain processes sensory information and, more generally, about the principles of cortical information processing. Our goal is to use this understanding to inspire artificial vision systems, to aid the development of visual prosthetics, to guide molecular approaches to repairing lost brain function, and to obtain deep insight into how the brain represents sensory information in a way that is highly suited for cognition and action.
Cox DD and DiCarlo JJ. Does Learned Shape Selectivity in Inferior Temporal Cortex Automatically Generalize Across Retinal Position? Journal of Neuroscience 28(40): 10045-10055 (2008).
Li N and DiCarlo JJ. Unsupervised Natural Experience Rapidly Alters Invariant Object Representation in Visual Cortex. Science 321 (5895): 1502-1507 (2008).
Op de Beeck HP, Deutsch JA, Vanduffel W, Kanwisher N, and DiCarlo JJ. A Stable Topography of Selectivity for Unfamiliar Shape Classes in Monkey Inferior Temporal Cortex. Cerebral Cortex, Advance Access published online November 21, 2007.
DiCarlo JJ and Cox DD. Untangling invariant object recognition. Trends in Cognitive Sciences 11: 333-341 (2007).
Kreiman GK, Hung CP, Kraskov A, Quian Quiroga R, Poggio TA, and DiCarlo JJ. Object selectivity of local field potentials and spikes in the macaque inferior temporal cortex. Neuron 49: 433-445 (2006).
Hung CP, Kreiman GK, Poggio T, and DiCarlo JJ. Fast Readout of object identity from macaque inferior temporal cortex. Science 310: 863-866 (2005).
Zoccolan D, Cox DD, and DiCarlo JJ. Multiple object response normalization in monkey inferotemporal cortex. Journal of Neuroscience 25(36): 8150-8164 (2005).
Cox DD, Meier P, Oertelt N, and DiCarlo JJ. “Breaking” position invariant object recognition. Nature Neuroscience 8:1145-1147 (2005).