Context modeling for object detection


There is general consensus that context can be a rich source of information about an object's identity, location and scale. In fact, the structure of many real-world scenes is governed by strong configurational rules akin to those that apply to a single object. There is also increasing evidence that human perception uses contextual information early. Nevertheless, object-centered approaches dominate research in computer vision. Under favorable conditions, the multiplicity of cues (color, shape, texture) in the retinal image produced by an object seems to provide enough information to determine the object category unambiguously. Under such high-quality viewing conditions, object recognition mechanisms could rely exclusively on intrinsic object features and ignore the background. However, when viewing quality is poor (for instance, because the object is far away), context appears to play a major role in making recognition more reliable.

Figure: a car and a person? The blob on the right is identical to the one on the left after a 90-degree rotation. ( paper.pdf )

In the absence of enough local evidence about an object's identity, the scene structure and prior knowledge of world regularities provide additional information for recognizing and localizing an object. Even when objects can be identified via intrinsic information, context can simplify the object discrimination by cutting down on the number of object categories, scales and positions that need to be considered.

This movie illustrates how context can help object detection (movie.avi). Context is first used to predict the most likely location of cars in the scene (center image); this prediction is then used to reduce the false alarms of a car detector (right image).
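The two-step idea described above (predict where cars are expected, then use that prediction to suppress false alarms) can be sketched in a few lines of Python. This is only an illustrative sketch, not the model used in the papers listed below: the Gaussian band over image rows, the multiplicative combination, and all names and numbers (`context_prior`, `filter_detections`, `expected_y`, `sigma`, `threshold`, the sample detections) are assumptions made for the example.

```python
import math


def context_prior(y, expected_y, sigma):
    """Unnormalized Gaussian weight in (0, 1]: how plausible image row y is for a car.

    Assumption: context predicts a horizontal band centered on expected_y.
    """
    return math.exp(-0.5 * ((y - expected_y) / sigma) ** 2)


def filter_detections(detections, expected_y, sigma, threshold):
    """Keep detections whose context-weighted score reaches the threshold.

    detections: list of (x, y, score), with score in [0, 1] from a local detector.
    """
    kept = []
    for x, y, score in detections:
        combined = score * context_prior(y, expected_y, sigma)
        if combined >= threshold:
            kept.append((x, y, combined))
    return kept


# Hypothetical detector output on a 240-row image: (x, y, local score).
detections = [
    (50, 180, 0.80),   # on the road, strong local evidence
    (120, 175, 0.60),  # on the road, moderate local evidence
    (200, 30, 0.85),   # near the top of the image: likely a false alarm
]

kept = filter_detections(detections, expected_y=180, sigma=20, threshold=0.5)
for x, y, combined in kept:
    print(x, y, round(combined, 3))
```

In this sketch the third candidate has the highest local score but lies far from the expected band, so its weighted score falls below the threshold and it is discarded, while the two candidates near the expected row are kept.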

Datasets and code

Context-based vision system for place and object recognition

Publications

Sharing features: efficient boosting procedures for multiclass object detection

Antonio Torralba, Kevin Murphy and William Freeman.

( paper )

Using the forest to see the trees: a graphical model relating features, objects and scenes

Kevin P. Murphy, Antonio Torralba and William T. Freeman.
(NIPS 2003)

( paper.pdf )

Contextual priming for object detection

A. Torralba. (2003).
International Journal of Computer Vision, 53 (2): 153-167, July 2003.

( Abstract ) ( paper.pdf ) ( Images )

Contextual Modulation of Target Saliency

A. Torralba. (2001)
Proc. of the 2001 conference, Adv. in Neural Information Processing Systems, 14, vol.2, pp.1303-1310.

( paper.ps.gz ) (pdf)

Statistical context priming for object detection

A. Torralba, P. Sinha. (2001)
Proceedings of the IEEE International Conference on Computer Vision, ICCV 2001, pp. 763-770, Vancouver, Canada.

(paper.pdf )

Context-based vision system for place and object recognition

A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin.

Proceedings of the IEEE International Conference on Computer Vision, ICCV 2003, vol.1, p.273. Nice, France.

( paper.pdf ) ( demo.avi ) ( data and movies )

Detecting faces in impoverished images

A. Torralba, P. Sinha. (2001).
AI Memo 2001-028, CBCL Memo 208, November 2001.

( paper.pdf )

A. Torralba, A. Oliva, W. T. Freeman. (2003). Object recognition by scene alignment. Vision Sciences Society Meeting.

P. Sinha, A. Torralba. (2002). Detecting faces in impoverished images. Abstract from Vision Sciences Society Meeting.

A. Torralba, P. Sinha. (2001). Contextual priming for object detection. AI Memo 2001-020, CBCL Memo 205, September 2001.

A. Torralba, P. Sinha. (2001). Contextual influences on object recognition. ECVP 2001. Perception, supp.

A. Torralba, P. Sinha, A. Oliva. (2001). Modeling contextual influences on object recognition. Journal of Vision, Abstract from Vision Sciences Society Meeting.