Computer Science and Artificial Intelligence Laboratory
Dept. of Electrical Engineering and Computer Science
Massachusetts Institute of Technology
Assistant: Fern Deolivera
Address: 32 Vassar Street,
Cambridge, MA 02139
My research is in the areas of computer vision, machine learning and human visual perception. I am interested in scene and object recognition, among other things.
Scene and object recognition are two related visual tasks that are generally studied separately. However, I believe that by devising systems that solve these tasks in an integrated fashion, it is possible to build more efficient and robust recognition systems.
ADE20K dataset. 22,210 fully annotated images with objects, many of them also with parts.
A scene parsing challenge is being held jointly with ILSVRC'16. Winners will be invited to present at the ILSVRC and COCO joint workshop at ECCV 2016. Check the scene parsing challenge website.
Multimodal scene recognition. The data for this work includes thousands of line drawings and textual descriptions of scenes, produced by AMT workers. The dataset is organized with the same categories as the Places database.
Aligning books and movies. Learning to see and read by watching movies and reading books. Check also the MovieQA dataset: MovieQA: Story Understanding Benchmark.
Gaze following demo and dataset. It follows the gaze of the people inside a picture or video and predicts what they are looking at. In this video, frames are first processed independently and then the output is smoothed temporally.
Places2 scene classification challenge, held in conjunction with ILSVRC at ICCV 2015.
Interactive visualization of deep networks: Places-CNN and ImageNet-CNN. For more details on understanding the internal representation built by a CNN trained for scene recognition: Object Detectors Emerge in Deep Scene CNNs.
Places database and scene recognition demo. The Places database contains 205 scene categories and 2.5 million images. More details about the demo appear in: "Learning Deep Features for Scene Recognition using Places Database," B. Zhou, A. Lapedriza, J. Xiao, A. Torralba, and A. Oliva. NIPS 2014 (pdf).
Saliency benchmark. It has 300 test images and fixations from 39 viewers per image. The fixations are not public, so no model can be trained using this data set.
Check the LabelMe App for iPhone and iPad. The app connects with your LabelMe account online and allows you to take pictures and label them on the device. You can then recover the images and annotations with the LabelMe Matlab toolbox. Developed by Josep Marc Mingot Hidalgo, Dolores Blanco, Aina Torralba, David Way and Antonio Torralba.
Aditya Khosla (Grad. student),
Adria Recasens (Grad. student),
Agata Lapedriza (Visiting professor, UOC),
Andrew Owens (Grad. student with Bill Freeman),
Bolei Zhou (Grad. student),
Carl Vondrick (Grad. student),
Xavier Puig Fernandez (Visiting student),
Yusuf Aytar (Post-doctoral Fellow)
Past students and visitors
Joseph J. Lim (Graduated 2015),
Lluis Castrejon (Visiting student, 2015),
Hamed Pirsiavash (Post-doctoral Fellow),
Zoya Gavrilov (Grad. student),
Josep Marc Mingot Hidalgo (Visiting student),
Tomasz Malisiewicz (Post-doctoral Fellow),
Jianxiong Xiao (Graduated 2013),
Dolores Blanco Almazan (Visiting student, 2012),
Biliana Kaneva (Graduated 2011),
Jenny Yuen (Graduated 2011),
Tilke Judd (Graduated 2011),
Myung "Jin" Choi (Graduated 2011),
James Hays (Post-doctoral Fellow),
Hector J. Bernal (Visiting student),
Gunhee Kim (Visiting student),
Bryan C. Russell (Graduated 2008).
Places database. The Places database contains 205 scene categories and 2.5 million images.
3D IKEA dataset. Dataset of IKEA 3D models and aligned images. J. Lim, H. Pirsiavash, and A. Torralba. ICCV 2013.
SUN Database. Scene UNderstanding Database. A database for scene recognition (900 scene categories) and multiclass object detection (>15,000 fully segmented images).
Xiao et al., CVPR 2010. (pdf)
360-SUN Database. A database of 360-degree panoramas organized along the SUN categories.
Xiao et al., CVPR 2012. (pdf)
Out of context objects. The database contains 218 fully annotated images with at least one object out-of-context. Can you detect the out of context object? Project page
LabelMe: the open annotation tool. Explore the online query tool, Matlab toolbox, WordNet hierarchy, and the 3D LabelMe toolbox.
Jenny Yuen et al., ICCV 2009. (pdf)
80 Million tiny images: explore a dense sampling of the visual world. Antonio Torralba, Rob Fergus, William T. Freeman.
Indoor Scene Recognition Database: 67 indoor scene categories. A. Quattoni and A. Torralba. CVPR 2009.