Grounding spatial prepositions for video search

Tellex, Stefanie; Roy, Deb

dc.contributor.author	Tellex, Stefanie A.
dc.contributor.author	Roy, Deb K.
dc.date.accessioned	2011-09-16T17:30:16Z
dc.date.available	2011-09-16T17:30:16Z
dc.date.issued	2009-11
dc.identifier.isbn	978-1-60558-772-1
dc.identifier.uri	http://hdl.handle.net/1721.1/65868
dc.description.abstract	Spatial language video retrieval is an important real-world problem that forms a test bed for evaluating semantic structures for natural language descriptions of motion on naturalistic data. Video search by natural language query requires that linguistic input be converted into structures that operate on video in order to find clips that match a query. This paper describes a framework for grounding the meaning of spatial prepositions in video. We present a library of features that can be used to automatically classify a video clip based on whether it matches a natural language query. To evaluate these features, we collected a corpus of natural language descriptions about the motion of people in video clips. We characterize the language used in the corpus, and use it to train and test models for the meanings of the spatial prepositions "to," "across," "through," "out," "along," "towards," and "around." The classifiers can be used to build a spatial language video retrieval system that finds clips matching queries such as "across the kitchen."	en_US
dc.description.sponsorship	United States. Office of Naval Research (MURI N00014-07-1-0749)	en_US
dc.language.iso	en_US
dc.publisher	Association for Computing Machinery	en_US
dc.relation.isversionof	http://dx.doi.org/10.1145/1647314.1647369	en_US
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike 3.0	en_US
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/	en_US
dc.source	MIT web domain	en_US
dc.title	Grounding spatial prepositions for video search	en_US
dc.type	Article	en_US
dc.identifier.citation	Tellex, Stefanie, and Deb Roy. “Grounding Spatial Prepositions for Video Search.” Proceedings of the 2009 International Conference on Multimodal Interfaces - ICMI-MLMI ’09. Cambridge, Massachusetts, USA, 2009. 253.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory	en_US
dc.contributor.department	Massachusetts Institute of Technology. Media Laboratory	en_US
dc.contributor.approver	Roy, Deb K.
dc.contributor.mitauthor	Tellex, Stefanie A.
dc.contributor.mitauthor	Roy, Deb K.
dc.relation.journal	Proceedings of the 2009 international conference on Multimodal interfaces	en_US
dc.eprint.version	Author's final manuscript	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
dspace.orderedauthors	Tellex, Stefanie; Roy, Deb	en
dc.identifier.orcid	https://orcid.org/0000-0002-4333-7194
mit.license	OPEN_ACCESS_POLICY	en_US
mit.metadata.status	Complete

Files in this item

Name: Roy_Grounding Spatial.pdf
Size: 2.614Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
