Abstract
This paper describes a system that extracts principal content words from speech-recognized transcripts of voicemail messages and classifies them as proper names, telephone numbers, dates/times, or 'other'. The resulting short text summaries are suitable for mobile messaging applications. The system uses a set of classifiers to identify the summary words, with each word described by a vector of lexical and prosodic features. The features are selected using Parcel, an ROC-based algorithm. We visually compare the roles of a large number of individual features and discuss effective ways to combine them. Finally, we evaluate their performance on manual transcriptions and on automatic transcriptions derived from two different speech recognition systems.
Original language | English |
---|---|
Title of host publication | Proceedings of the 8th European Conference on Speech Communication and Technology |
Subtitle of host publication | Eurospeech 2003 - Interspeech 2003 |
Publisher | ISCA |
Pages | 2785-2788 |
Number of pages | 4 |
Publication status | Published - 2003 |
Event | 8th European Conference on Speech Communication and Technology (Eurospeech 2003) - Geneva, Switzerland. Duration: 1 Sept 2003 → 4 Sept 2003 |
Conference
Conference | 8th European Conference on Speech Communication and Technology (Eurospeech 2003) |
---|---|
Country/Territory | Switzerland |
City | Geneva |
Period | 1/09/03 → 4/09/03 |