Abstract / Description of output
We present a system that produces sentential descriptions of video: who did what to whom, and where and how they did it. Action class is rendered as a verb, participant objects as noun phrases, properties of those objects as adjectival modifiers in those noun phrases, spatial relations between those participants as prepositional phrases, and characteristics of the event as prepositional-phrase adjuncts and adverbial modifiers. Extracting the information needed to render these linguistic entities requires an approach to event recognition that recovers object tracks, the track-to-role assignments, and changing body posture.
Original language | English |
---|---|
Title of host publication | Uncertainty in Artificial Intelligence |
Subtitle of host publication | Proceedings of the Twenty-Eighth Conference |
Publisher | Association for Uncertainty in Artificial Intelligence (AUAI) |
Pages | 102-112 |
Number of pages | 11 |
ISBN (Print) | 978-0-9749039-8-9 |
Publication status | Published - 17 Aug 2012 |
Event | Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, Catalina Island, United States. Duration: 15 Aug 2012 → 17 Aug 2012. http://www.auai.org/uai2012/ |
Conference
Conference | Twenty-Eighth Conference on Uncertainty in Artificial Intelligence |
---|---|
Abbreviated title | UAI 2012 |
Country/Territory | United States |
City | Catalina Island |
Period | 15/08/12 → 17/08/12 |