Language and Vision Ambiguities (LAVA) Corpus


Language and Vision Ambiguities (LAVA) is a multimodal corpus that supports the study of ambiguous language grounded in vision. The corpus contains ambiguous sentences, each paired with visual scenes that depict the different interpretations of that sentence. LAVA sentences cover a wide range of linguistic ambiguities, including prepositional phrase (PP) and verb phrase (VP) attachment, conjunctions, logical form, anaphora, and ellipsis.

Examples

Each example lists the sentence, the visual setup for each interpretation, and the corresponding semantic parses. (The Video, Image, and Syntactic Parses columns of the original table are not reproduced here.)

Sentence: Danny approached the chair with a yellow bag.
  Visual Setup:
    1. Danny with bag
    2. Chair with bag
  Semantic Parses:
    1. λx.λy.λz.person(x)∧chair(y)∧bag(z)∧yellow(z)∧has(x,z)∧approach(x,y)
    2. λx.λy.λz.person(x)∧chair(y)∧bag(z)∧yellow(z)∧has(y,z)∧approach(x,y)

Sentence: Danny looked at Andrei picking-up a yellow bag.
  Visual Setup:
    1. Danny picking-up bag
    2. Andrei picking-up bag
  Semantic Parses:
    1. λx.λy.λz.yellow(x)∧bag(x)∧person(y)∧person(z)∧look-at(y,z)∧pick-up(y,x)
    2. λx.λy.λz.yellow(x)∧bag(x)∧person(y)∧person(z)∧look-at(y,z)∧pick-up(z,x)
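
In each example the two semantic parses differ in a single predicate, and that predicate is what the visual scene has to disambiguate. The following Python sketch illustrates this for the first sentence by splitting each logical form into its conjuncts and comparing the resulting sets. The helper names (strip_lambdas, parse_atoms) and the string-based representation are assumptions made for illustration only; they are not part of the corpus or of the authors' tooling.

```python
# Illustrative sketch only: the helpers and the string format are assumptions,
# not the corpus's own file format or API.

def strip_lambdas(form):
    """Drop the leading 'λx.λy.λz.' binder prefix and return the body."""
    while form.startswith("λ"):
        form = form.split(".", 1)[1]
    return form


def parse_atoms(body):
    """Split a conjunction such as 'person(x)∧chair(y)' into a set of atoms."""
    return frozenset(atom.strip() for atom in body.split("∧"))


# The two readings of "Danny approached the chair with a yellow bag."
reading_1 = "λx.λy.λz.person(x)∧chair(y)∧bag(z)∧yellow(z)∧has(x,z)∧approach(x,y)"
reading_2 = "λx.λy.λz.person(x)∧chair(y)∧bag(z)∧yellow(z)∧has(y,z)∧approach(x,y)"

atoms_1 = parse_atoms(strip_lambdas(reading_1))
atoms_2 = parse_atoms(strip_lambdas(reading_2))

shared = atoms_1 & atoms_2   # predicates common to both readings
only_1 = atoms_1 - atoms_2   # predicates specific to reading 1
only_2 = atoms_2 - atoms_1   # predicates specific to reading 2

print(sorted(only_1))  # ['has(x,z)']  -> the person has the bag
print(sorted(only_2))  # ['has(y,z)']  -> the chair has the bag
```

Applying the same comparison to the second sentence isolates pick-up(y,x) versus pick-up(z,x), that is, which of the two people picks up the bag.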

Download

This corpus is available to the public here.

Reference

Yevgeni Berzak, Andrei Barbu, Daniel Harari, Boris Katz, and Shimon Ullman (2015). Do You See What I Mean? Visual Resolution of Linguistic Ambiguities. Conference on Empirical Methods in Natural Language Processing (EMNLP), Lisbon, Portugal. [PDF]

Acknowledgment

This material is based upon work supported by the Center for Brains, Minds, and Machines (CBMM), funded by NSF STC award CCF-1231216.