This article presents trainable methods for extracting principal content words from voicemail messages. The resulting short text summaries are suitable for mobile messaging applications. The system uses a set of classifiers to identify the summary words, with each word described by a vector of lexical and prosodic features. We use an ROC-based algorithm, Parcel, to select input features (and classifiers). We have performed a series of objective and subjective evaluations using unseen data from two different speech recognition systems, as well as human transcriptions of voicemail speech.
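The abstract does not spell out how Parcel works, so the following is only a minimal sketch of the general idea behind ROC-based selection, not the authors' implementation. It assumes each candidate classifier (or feature subset) is summarised by a single operating point (false positive rate, true positive rate), and that a candidate is worth keeping if it enlarges the area under the convex hull of the retained points. The function names and the example operating points are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Upper convex hull of ROC points, always including (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line
        # joining its predecessor to the new point p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_area(points):
    """Area under the ROC convex hull, by the trapezoid rule."""
    hull = roc_convex_hull(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


def useful_candidates(candidates):
    """Names of candidates whose operating point lies on the hull."""
    hull = set(roc_convex_hull(list(candidates.values())))
    return sorted(name for name, pt in candidates.items() if pt in hull)


# Hypothetical operating points (FPR, TPR) for three candidate classifiers.
candidates = {
    "lexical_only": (0.1, 0.6),
    "lexical_plus_prosodic": (0.3, 0.8),
    "prosodic_only": (0.5, 0.7),
}
selected = useful_candidates(candidates)
area = hull_area(list(candidates.values()))
```

In this toy example `prosodic_only` falls below the hull formed by the other two points, so it would not be retained. A real selection procedure would search over many feature subsets and classifier settings rather than three fixed points.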
Number of pages: 21
Journal: ACM Transactions on Speech and Language Processing
Publication status: Published - 1 Feb 2005
- automatic summarization
- feature subset selection
- receiver operating characteristic
- short message service