Research
Scenes Statistics

One remarkable aspect of visual recognition is that humans can recognize the meaning (or "gist") of a complex visual scene within 1/20 of a second, regardless of how many objects the image contains. This rapid understanding can be experienced while watching the fast sequences in television advertisements and the quick cuts in modern movie trailers. How is this remarkable feat accomplished? Research over the last decade has made substantial progress toward understanding the mechanisms underlying single-object recognition, but less progress has been made toward understanding the recognition of scenes and natural environments.

Global Properties

Computer systems fall well short of human performance in tasks that require recognizing the gist of a scene. We are taking a novel approach to this challenging question by studying mechanisms of analysis that are global in nature. The approach focuses on statistically robust features that describe the spatial layout of the scene (e.g. its volume, its perspective, its level of clutter; cf. the spatial envelope model, Oliva & Torralba, 2001) and the scene's affordances (such as navigability; Greene & Oliva, 2006, 2008), and not merely its component objects.

Neural representations of scene structure

We will also use this approach to define operational strategies for machine vision systems. This program of research combines a number of methodologies, including behavioral experiments (psychophysics, eye tracking), cognitive neuroscience methods, and computational modeling. Currently, I am working with Michael Ross to determine the image features that are used for classification tasks, and with Soojin Park to determine the neural underpinnings of the global approach. Applications of this work include, among others, scene understanding systems to assist drivers and automatic systems that could provide semantic descriptions of the contents of large image databases.