Place recognition

While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. We present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We have trained a system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.

As a test-bed for the proposed approach, we use a helmet-mounted mobile system. The system is built around a web-cam that captures color images at 4 images/second and a resolution of 120x160 pixels. The web-cam is mounted on the helmet so that it follows the user's head movements as they explore the environment. The user receives feedback about system performance through a head-mounted display.

We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and we show how such contextual information introduces strong priors that simplify object recognition.
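To make the idea of a low-dimensional global representation concrete, here is a minimal sketch that summarizes a whole frame with a handful of numbers. It is a stand-in, not the features used by the system: this page does not specify how the representation is computed, so the sketch simply averages pixel intensity over a coarse grid. The function name `global_descriptor`, the `grid` parameter, and the choice of a grayscale image stored as a list of rows are all assumptions made for illustration.

```python
def global_descriptor(image, grid=4):
    """Summarize a 2-D grayscale image (a list of equal-length rows)
    as grid*grid block averages, scanned row by row.

    Hypothetical stand-in for a global image feature: it describes the
    frame as a whole and never segments or detects individual objects.
    """
    rows, cols = len(image), len(image[0])
    if rows < grid or cols < grid:
        raise ValueError("image is smaller than the requested grid")
    descriptor = []
    for by in range(grid):
        # Row range covered by this band of blocks.
        r0, r1 = by * rows // grid, (by + 1) * rows // grid
        for bx in range(grid):
            # Column range covered by this block.
            c0, c1 = bx * cols // grid, (bx + 1) * cols // grid
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            descriptor.append(sum(block) / len(block))
    return descriptor


# A synthetic frame with 120 rows and 160 columns: dark left half, bright right half.
frame = [[0.0] * 80 + [2.0] * 80 for _ in range(120)]
coarse = global_descriptor(frame, grid=2)   # four block averages
fine = global_descriptor(frame)             # sixteen block averages
```

With the default grid, a frame of 19,200 pixels is reduced to 16 values, which is the sense in which such a descriptor is low-dimensional.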

Performance of place recognition for a sequence that starts indoors and then goes outdoors. Top: the solid line represents the true location, and the dots represent the posterior probability associated with each location. There are 63 possible locations, but we only show those with non-negligible probability mass. Middle: estimated category of each location. Bottom: estimated probability of being indoors or outdoors.
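The posterior over locations plotted in the figure can be illustrated with a generic recursive Bayes update over a discrete set of places. This is a sketch under stated assumptions rather than a description of the system's implementation: the page does not give the update rule, and the function name `filter_step`, the transition matrix, and the likelihood values below are hypothetical.

```python
def filter_step(prior, transition, likelihood):
    """One recursive Bayes update over a discrete set of locations.

    prior[i]         : P(location i | frames seen so far)
    transition[i][j] : P(next location j | current location i)  (assumed known)
    likelihood[j]    : P(features of the new frame | location j) (assumed known)
    Returns the normalized posterior after seeing the new frame.
    """
    n = len(prior)
    # Prediction: push the previous belief through the motion model.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight each location by how well it explains the new frame.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("the new frame has zero likelihood under every location")
    return [u / total for u in unnormalized]


# Two hypothetical locations; the user tends to stay where they are.
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.1, 0.9]]
likelihood = [0.8, 0.2]   # the new frame looks more like location 0
posterior = filter_step(prior, transition, likelihood)
```

Calling `filter_step` once per frame and feeding each posterior back in as the next prior yields one probability per location per frame, which is the kind of quantity shown as dots in the top panel.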

Datasets and code

Context-based vision system for place and object recognition

Publications

Using the forest to see the trees: a graphical model relating features, objects and scenes

Kevin P. Murphy, Antonio Torralba and William T. Freeman.
(NIPS 2003)

( paper.pdf )

Context-based vision system for place and object recognition

A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin.

Proceedings of the IEEE International Conference on Computer Vision, ICCV 2003, vol.1, p.273. Nice, France.

( paper.pdf ) ( demo.avi ) ( data and movies )