| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 5-20 |
| Journal | Speech Communication |
| Volume | 32 |
| Issue number | 1-2 |
| DOIs | |
| Publication status | Published - 1 Sep 2000 |
This paper describes a spoken document retrieval (SDR) system for British and North American Broadcast News. The system is based on a connectionist large vocabulary speech recognizer and a probabilistic information retrieval (IR) system. We discuss the development of a real-time Broadcast News speech recognizer, and its integration into an SDR system. Two advances were made for this task: automatic segmentation and statistical query expansion using a secondary corpus. Precision and recall results using the Text Retrieval Conference (TREC) SDR evaluation infrastructure are reported throughout the paper, and we discuss the application of these developments to a large scale SDR task based on an archive of British English broadcast news.