Role of learning in 3D form perception
One of the most enduring questions about human vision is how we are able
to perceive three-dimensionality in two-dimensional images, even in the absence
of motion, stereo, shading and texture cues. Traditionally, researchers have posited
the use of innately specified brain mechanisms, such as a preference for simplicity.
To test these ideas, we have developed a computational system for recovering 3D
structures from single 2D line drawings, using a fixed set of constraints that partly
capture the notion of perceptual simplicity. While the system is able to mimic human
performance for a small set of inputs, it exhibits significant limitations when analyzing
natural imagery. To account for these shortcomings, we have proposed a learning-based
theory, and have gathered experimental data that provide strong evidence for a role of
object-specific learning in the perception of 3D structure. Together, the computational
and experimental studies provide a good foundation for building a more comprehensive account
of 3D shape perception in single 2D images.
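To make the idea of a simplicity constraint concrete, here is a minimal sketch. It is not the system described above. It assumes one possible simplicity measure, the uniformity of angles between edges that meet at a vertex, and uses it to score two candidate 3D interpretations of the same line drawing of a cube. The function names (angle_std, rotate), the rotation angles, and the choice of measure are illustrative assumptions.

```python
import math
from itertools import combinations, product

def angle_std(points, edges):
    """Standard deviation (radians) of the angles between every pair of
    edges sharing a vertex. Lower values mean a more regular, 'simpler'
    3D interpretation under this assumed criterion."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    angles = []
    for v, nbrs in neighbors.items():
        for p, q in combinations(nbrs, 2):
            u = [points[p][i] - points[v][i] for i in range(3)]
            w = [points[q][i] - points[v][i] for i in range(3)]
            dot = sum(ui * wi for ui, wi in zip(u, w))
            norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in w))
            # Clamp to guard against rounding slightly outside [-1, 1].
            angles.append(math.acos(max(-1.0, min(1.0, dot / norm))))
    mean = sum(angles) / len(angles)
    return math.sqrt(sum((a - mean) ** 2 for a in angles) / len(angles))

def rotate(p, a, b):
    """Rotate point p about the x-axis by a, then about the y-axis by b."""
    x, y, z = p
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    return (x, y, z)

# A unit cube: corners are joined by an edge when they differ in one coordinate.
corners = list(product((0.0, 1.0), repeat=3))
edges = [(i, j) for i, j in combinations(range(8), 2)
         if sum(abs(u - v) for u, v in zip(corners[i], corners[j])) == 1.0]

# Two interpretations with the same orthographic projection (same x, y):
cube_3d = [rotate(p, 0.5, 0.6) for p in corners]   # the rotated cube itself
flat = [(x, y, 0.0) for x, y, _ in cube_3d]        # all vertices at depth 0

cost_cube = angle_std(cube_3d, edges)
cost_flat = angle_std(flat, edges)
print(cost_cube < cost_flat)
```

Both candidates project to the same 2D drawing, yet the cube interpretation scores lower because all of its vertex angles are right angles. A recovery procedure built on such a measure would search over vertex depths for the lowest-cost interpretation; that search is omitted here.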
Sinha, P. & Adelson, E. H. (1993). Recovering reflectance and illumination in a world of painted polyhedra.
Proceedings of the Fourth International Conference on Computer Vision, Berlin, Germany.
Sinha, P. & Poggio, T. (1996). The role of learning in 3-D form perception.
Nature, Vol. 384, No. 6608, pp. 460-463.
Influence of learning on stereo-depth perception
The interaction between depth perception and learning processes has important
implications for the nature of mental object representations and models of hierarchical
organization of visual processing. It is often believed that the computation of depth influences
subsequent high-level object recognition processes, and that depth processing is an early
vision task that is largely immune to learned object-specific influences. We have found experimental
evidence that challenges both of these assumptions in the specific context of stereoscopic depth perception.
Our results suggest that observers not only can recognize depth-scrambled 3D objects but are also perceptually
unaware of the depth anomalies. The first result points to the limited contribution of depth information
to recognition while the second result is indicative of a top-down recognition-based influence whereby
learned expectations about an object’s 3D structure can overwhelm true stereoscopic information.
Bülthoff, I., Bülthoff, H. H. & Sinha, P. (1998). Top-down influences on stereoscopic depth-perception.
Nature Neuroscience, Vol. 1, No. 3, pp. 254-257.
A computational approach for incorporating learning in early vision
Perceptual tasks such as estimation of three-dimensional structure, edge detection and
image segmentation are considered to be low-level or mid-level vision problems and are
traditionally approached in a bottom-up, generic and hard-wired way. However, as described
above, we have found experimental evidence that suggests a top-down, learning-based scheme.
To complement our empirical results, we have developed a simple computational model for
incorporating learned expectations in perceptual tasks. When tested on edge-detection and view-prediction
tasks for three-dimensional objects, the model produces results that are consistent with human perception
and more tolerant of input degradation than those of conventional bottom-up strategies.
This lends support to the idea that even some of the supposedly ‘hard-wired’ perceptual skills in the
human visual system might, in fact, incorporate learned top-down influences.
Jones, M. J., Sinha, P., Vetter, T. & Poggio, T. (1997). Top-down learning of low-level vision tasks.
Current Biology, Vol. 7, pp. 991-994.
Sinha, P. & Poggio, T. (2002). High-level learning of early perceptual tasks.
In Perceptual Learning, Ed. Manfred Fahle, MIT Press, Cambridge, MA.