Neuroscientists break code on sight

Cathryn M. Delude, News Office Correspondent

November 3, 2005

Neurons in a purely visual brain region called the inferotemporal (IT) cortex respond selectively to different images. As pictures were randomly presented to the monkey during specific intervals (top), neurons at different sites in IT produced distinct patterns of activity in response to each picture (bottom). For example, neurons at site 1 favor the toy and the yam, while neurons at site 3 prefer the monkey face and the cat.

Image courtesy of the Poggio and DiCarlo labs

In the sci-fi movie "The Matrix," a cable running from a computer into Neo's brain writes in visual perceptions, and Neo's brain can manipulate the computer-created world. In reality, scientists cannot interact directly with the brain because they do not understand enough about how it codes and decodes information.

Now, neuroscientists in the McGovern Institute at MIT have been able to decipher a part of the code involved in recognizing visual objects. Practically speaking, computer algorithms used in artificial vision systems might benefit from mimicking these newly uncovered codes.

The study, a collaboration between James DiCarlo's and Tomaso Poggio's labs, appears in the Nov. 4 issue of Science.

"We want to know how the brain works to create intelligence," said Poggio, the Eugene McDermott Professor in Brain Sciences and Human Behavior. "Our ability to recognize objects in the visual world is among the most complex problems the brain must solve. Computationally, it is much harder than reasoning." Yet we take it for granted because it appears to happen automatically and almost unconsciously.

"This work enhances our understanding of how the brain encodes visual information in a useful format for brain regions involved in action, planning and memory," said DiCarlo, an assistant professor of neuroscience.

In a fraction of a second, visual input about an object travels from the retina through successively higher levels of the visual stream, each of which reformats the information, until it reaches the highest purely visual level, the inferotemporal (IT) cortex. The IT cortex identifies and categorizes the object and sends that information to other brain regions.

To explore how the IT cortex formats that output, the researchers trained monkeys to recognize different objects grouped into categories, such as faces, toys and vehicles. The images appeared in different sizes and positions in the visual field. Recording the activity of hundreds of IT neurons produced a large database of IT neural patterns generated in response to each object under many different conditions.

Then, the researchers used a computer algorithm, called a classifier, to decipher the code. The classifier was used to associate each object -- say, a monkey's face -- with a particular pattern of neural signals, effectively decoding neural activity. Remarkably, the classifier found that just a split second's worth of the neural signal contained specific enough information to identify and categorize the object, even at positions and sizes the classifier had not previously "seen."
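For readers curious what "decoding with a classifier" can look like in practice, here is a minimal Python sketch. It is not the method used in the study: it swaps in a simple nearest-centroid classifier that matches patterns by correlation, and it runs on synthetic data rather than real recordings. The function names (`simulate_response`, `train_centroids`, `decode`, `run_demo`), the number of sites, the noise level, and the idea of modeling a change in size or position as a single "gain" factor on every site's firing rate are all assumptions made for illustration.

```python
import math
import random


def simulate_response(pattern, gain, rng, noise=0.5):
    """One noisy population response: each site's mean rate scaled by `gain`.

    ASSUMPTION: a change in object size or position is modeled crudely as a
    uniform gain on the whole pattern. Real IT responses are more complex.
    """
    return [gain * rate + rng.gauss(0.0, noise) for rate in pattern]


def correlation(a, b):
    """Pearson correlation between two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def train_centroids(trials):
    """Average the training responses for each object label."""
    sums, counts = {}, {}
    for label, response in trials:
        if label not in sums:
            sums[label] = [0.0] * len(response)
            counts[label] = 0
        sums[label] = [s + r for s, r in zip(sums[label], response)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def decode(centroids, response):
    """Return the label whose average pattern correlates best with `response`."""
    return max(centroids, key=lambda label: correlation(centroids[label], response))


def run_demo(seed=0, n_sites=64, trials_per_condition=10):
    """Train on some viewing conditions, test on conditions never seen in training."""
    rng = random.Random(seed)
    objects = ["monkey face", "toy", "yam", "cat"]
    # Hypothetical "preferred" firing rate of every recording site for each object.
    patterns = {
        name: [rng.uniform(0.0, 10.0) for _ in range(n_sites)] for name in objects
    }
    train_gains = [1.0, 0.8]  # conditions shown to the classifier
    test_gains = [0.6, 1.2]   # held-out conditions
    train = [
        (name, simulate_response(patterns[name], gain, rng))
        for name in objects
        for gain in train_gains
        for _ in range(trials_per_condition)
    ]
    test = [
        (name, simulate_response(patterns[name], gain, rng))
        for name in objects
        for gain in test_gains
        for _ in range(trials_per_condition)
    ]
    centroids = train_centroids(train)
    correct = sum(decode(centroids, response) == name for name, response in test)
    return correct / len(test)


if __name__ == "__main__":
    print(f"Accuracy on held-out conditions: {run_demo():.2f}")
```

Because correlation ignores overall scale, this toy decoder tolerates the simulated gain changes by construction. That is a far easier setting than real changes in position and size, so the sketch shows only the train-then-test-on-new-conditions logic, not the difficulty of the actual problem.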

It was quite surprising that so few IT neurons (several hundred out of millions), recorded over such a short period of time, contained so much precise information. "If we could record a larger population of neurons simultaneously, we might find even more robust codes hidden in the neural patterns and extract even fuller information," Poggio said.

This work was funded by DARPA, the Office of Naval Research and the National Institutes of Health.

A version of this article appeared in MIT Tech Talk on November 16, 2005.