The THISL SDR System at TREC-8

Dave Abberley, Steve Renals, Dan Ellis, Tony Robinson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper describes the participation of the THISL group in the TREC-8 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of the real-time version of the ABBOT large vocabulary speech recognition system and the THISLIR text retrieval system. The TREC-8 evaluation assessed SDR performance on a corpus of 500 hours of broadcast news material collected over a five-month period. The main test condition involved retrieval of stories defined by a manual segmentation of the corpus, from which non-news material, such as commercials, was excluded. An optional test condition required retrieval of the same stories from the unsegmented audio stream. The THISL SDR system participated in both test conditions. The results show that a system such as THISL can produce respectable information retrieval performance on a realistically sized corpus of unsegmented audio material.

Original language: English
Title of host publication: Proceedings of the Eighth Text REtrieval Conference
Subtitle of host publication: (TREC-8)
Editors: Ellen Voorhees, Donna Harman
Pages: 699-706
Number of pages: 8
Publication status: Published - 2000
Event: Eighth Text REtrieval Conference (TREC-8) - Gaithersburg, MD, United States
Duration: 16 Nov 1999 to 19 Nov 1999

Conference

Conference: Eighth Text REtrieval Conference (TREC-8)
Country/Territory: United States
City: Gaithersburg, MD
Period: 16/11/99 to 19/11/99
