
dc.contributor.author | Tellex, Stefanie A.
dc.contributor.author | Roy, Deb K.
dc.date.accessioned | 2011-09-16T17:30:16Z
dc.date.available | 2011-09-16T17:30:16Z
dc.date.issued | 2009-11
dc.identifier.isbn | 978-1-60558-772-1
dc.identifier.uri | http://hdl.handle.net/1721.1/65868
dc.description.abstract | Spatial language video retrieval is an important real-world problem that forms a test bed for evaluating semantic structures for natural language descriptions of motion on naturalistic data. Video search by natural language query requires that linguistic input be converted into structures that operate on video in order to find clips that match a query. This paper describes a framework for grounding the meaning of spatial prepositions in video. We present a library of features that can be used to automatically classify a video clip based on whether it matches a natural language query. To evaluate these features, we collected a corpus of natural language descriptions about the motion of people in video clips. We characterize the language used in the corpus, and use it to train and test models for the meanings of the spatial prepositions "to," "across," "through," "out," "along," "towards," and "around." The classifiers can be used to build a spatial language video retrieval system that finds clips matching queries such as "across the kitchen." | en_US
dc.description.sponsorship | United States. Office of Naval Research (MURI N00014-07-1-0749) | en_US
dc.language.iso | en_US
dc.publisher | Association for Computing Machinery | en_US
dc.relation.isversionof | http://dx.doi.org/10.1145/1647314.1647369 | en_US
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike 3.0 | en_US
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/ | en_US
dc.source | MIT web domain | en_US
dc.title | Grounding spatial prepositions for video search | en_US
dc.type | Article | en_US
dc.identifier.citation | Tellex, Stefanie, and Deb Roy. “Grounding Spatial Prepositions for Video Search.” Proceedings of the 2009 International Conference on Multimodal Interfaces - ICMI-MLMI ’09. Cambridge, Massachusetts, USA, 2009. 253. | en_US
dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | en_US
dc.contributor.department | Massachusetts Institute of Technology. Media Laboratory | en_US
dc.contributor.approver | Roy, Deb K.
dc.contributor.mitauthor | Tellex, Stefanie A.
dc.contributor.mitauthor | Roy, Deb K.
dc.relation.journal | Proceedings of the 2009 international conference on Multimodal interfaces | en_US
dc.eprint.version | Author's final manuscript | en_US
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US
dspace.orderedauthors | Tellex, Stefanie; Roy, Deb | en
dc.identifier.orcid | https://orcid.org/0000-0002-4333-7194
mit.license | OPEN_ACCESS_POLICY | en_US
mit.metadata.status | Complete

