Multimodal Grammar Implementation

Katya Alahverdzhieva, Dan Flickinger, Alex Lascarides

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

This paper reports on an implementation of a multimodal grammar of speech and co-speech gesture within the LKB/PET grammar engineering environment. The implementation extends the English Resource Grammar (ERG; Flickinger, 2000) with HPSG types and rules that capture the form of the linguistic signal, the form of the gestural signal, and their relative timing, which together constrain the meaning of the multimodal action. The grammar yields a single parse tree that integrates the spoken and gestural modalities, so that standard semantic composition techniques can be used to derive the multimodal meaning representation. With the current machinery, the main challenge for the grammar engineer is the nonlinear input: the modalities can overlap temporally. We capture this overlap by giving speech and gesture tokens identical token edges. Further, the semantic contribution of gestures is encoded by lexical rules that transform a speech phrase into a multimodal entity whose semantics conjoins the spoken and gestural semantics.
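The idea of identical speech and gesture token edges can be illustrated with a minimal Python sketch. It is not the LKB/PET input format or the authors' code: the `Token` class, its field names, the `aligned_pairs` helper and the example tokens are all hypothetical, and the sketch assumes only that each token spans a pair of chart vertices and that temporally overlapping speech and gesture tokens are given the same span.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical input token; not the actual LKB/PET token format."""
    form: str      # a word form, or a label describing a gesture
    modality: str  # "speech" or "gesture"
    start: int     # chart vertex where the token's edge begins
    end: int       # chart vertex where the token's edge ends


def aligned_pairs(tokens):
    """Return (speech, gesture) pairs whose edges span identical vertices."""
    speech = [t for t in tokens if t.modality == "speech"]
    gesture = [t for t in tokens if t.modality == "gesture"]
    return [
        (s, g)
        for s in speech
        for g in gesture
        if (s.start, s.end) == (g.start, g.end)
    ]


# Invented example: the gesture overlaps in time with the word "rolled",
# so it is assigned the same edge (2, 3) as that word.
tokens = [
    Token("the", "speech", 0, 1),
    Token("ball", "speech", 1, 2),
    Token("rolled", "speech", 2, 3),
    Token("circular-hand-motion", "gesture", 2, 3),
]

pairs = aligned_pairs(tokens)
```

Under this assumption, a rule that combines speech and gesture only needs to compare edge spans, so the overlapping input can be handled by a parser that otherwise expects linear token sequences.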
Original language: English
Title of host publication: HLT-NAACL
Subtitle of host publication: Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, June 3-8, 2012, Montreal, Canada
Publisher: Association for Computational Linguistics
Number of pages: 5
ISBN (Print): 978-1-937284-20-6
Publication status: Published - 2012

