Abstract
When a speaker leaves a voicemail message, prosodic cues, in addition to lexical content, emphasize the important points in the message. In this paper we compare and visualize the relative contribution of these two types of features within a voicemail summarization system. We describe the system's ability to generate summaries of two test sets, after training and validating on 700 messages from the IBM Voicemail corpus. Results measuring the quality of summary artifacts show that combined lexical and prosodic features are at least as robust as combined lexical features alone across all operating conditions.
Original language | English |
---|---|
Title of host publication | Proceedings of the ITRW on Prosody in Speech Recognition and Understanding |
Subtitle of host publication | Prosody 2001 |
Publisher | ISCA |
Publication status | Published - 2001 |
Event | ITRW on Prosody in Speech Recognition and Understanding (Prosody 2001) - Molly Pitcher Inn, Red Bank, NJ, United States; Duration: 22 Oct 2001 → 24 Oct 2001 |
Workshop
Workshop | ITRW on Prosody in Speech Recognition and Understanding (Prosody 2001) |
---|---|
Country/Territory | United States |
City | Red Bank, NJ |
Period | 22/10/01 → 24/10/01 |