Using Prosody to Classify Discourse Relations

Janine Kleinhans, Mireia Farrus, Agustin Gravano, Juan Manuel Perez, Catherine Lai, Leo Wanner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This work explores the correlation between the discourse structure of a spoken monologue and its prosody by predicting discourse relations from different prosodic attributes. For this purpose, a corpus of semi-spontaneous monologues in English has been automatically annotated according to Rhetorical Structure Theory, which models coherence in text via rhetorical relations. From the corresponding audio files, prosodic features such as pitch, intensity, and speech rate have been extracted from different contexts of a relation. Supervised classification tasks using Support Vector Machines have been performed to find relationships between prosodic features and rhetorical relations. Preliminary results show that intensity, combined with other features extracted from intra- and intersegmental environments, is the feature with the highest predictability for a discourse relation. The prediction of rhetorical relations from prosodic features and their combinations is straightforwardly applicable to several tasks such as speech understanding or generation. Moreover, the knowledge of how rhetorical relations should be marked in terms of prosody will serve as a basis to improve speech synthesis applications and make voices sound more natural and expressive.
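The supervised classification step described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the feature values are synthetic z-scores invented so that the two classes are separable, the two relation labels ("Elaboration" and "Contrast") are merely example RST relation names, and the classifier is a plain linear SVM trained by subgradient descent on the hinge loss in pure Python, since the abstract does not state which kernel or toolkit the authors used. The bias is folded into the weight vector through a constant feature, so it is regularized along with the other weights.

```python
import random

def make_toy_data(n_per_class, seed):
    """Build synthetic feature vectors [pitch, intensity, speech_rate, 1.0].

    All values are invented z-scores (NOT data from the paper). The trailing
    1.0 is a constant feature that lets the weight vector carry a bias term.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, intensity_center in ((+1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            pitch = rng.uniform(-0.5, 0.5)
            intensity = intensity_center + rng.uniform(-0.5, 0.5)
            rate = rng.uniform(-0.5, 0.5)
            X.append([pitch, intensity, rate, 1.0])
            y.append(label)
    return X, y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=5, seed=0):
    """Binary linear SVM: stochastic subgradient descent on
    lam/2 * ||w||^2 + max(0, 1 - y * <w, x>). Labels must be +1 or -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # Check the margin before touching the weights.
            violated = y[i] * dot(w, X[i]) < 1.0
            # Gradient step on the L2 regularizer (weight shrinkage).
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Gradient step on the hinge loss, only for margin violations.
            if violated:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0.0 else -1

# Hypothetical label names, used only to make the output readable.
RELATION = {1: "Elaboration", -1: "Contrast"}

X_train, y_train = make_toy_data(20, seed=1)
X_test, y_test = make_toy_data(10, seed=2)
w = train_linear_svm(X_train, y_train)
accuracy = sum(predict(w, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on toy data: {accuracy:.2f}")
print("first test segment predicted as:", RELATION[predict(w, X_test[0])])
```

With real data, each row would hold prosodic measurements extracted for one relation context, and a multi-class setup (for example one-vs-rest over all annotated relations) would replace the two-label toy problem.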
Original language: English
Title of host publication: Proceedings Interspeech 2017
Pages: 3201-3205
Number of pages: 5
Publication status: Published - 24 Aug 2017
Event: Interspeech 2017 - Stockholm, Sweden
Duration: 20 Aug 2017 to 24 Aug 2017
http://www.interspeech2017.org/

Publication series

Name: Interspeech 2017
Publisher: ISCA
ISSN (Electronic): 1990-9772

Conference

Conference: Interspeech 2017
Country/Territory: Sweden
City: Stockholm
Period: 20/08/17 to 24/08/17