Disfluency and speech recognition profile factors

Matthew P. Aylett

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

This paper reports on work bringing together disfluency coding carried out by Lickley [1] and recognition work carried out as part of the ERF project (Bard, Thompson, Isard [2]) at Edinburgh University. A set of factors is investigated which characterise the behaviour of the ASR during recognition, based on an analysis of the resulting word lattice. These factors can be grouped as: Entropy Factors, the entropy of the acoustic and language model likelihoods within the word lattice over a 10 ms frame; and Arc Factors, the number of non-unique and unique arcs in the word lattice in any given 10 ms time frame, together with the variance of the start and end times of these arcs and the number of arcs starting or ending in the frame. The values of all factors were used to train a simple CART model. The CART model was used to predict recognition failure, interruption point location (the point where a disfluency begins), and whether the location was in a repair or a reparandum. The entropy of the language model values contributed most to the model's prediction of recognition failure and of whether a frame was in a repair or a reparandum. In contrast, the number of unique word hypotheses contributed most to the successful prediction of a frame being close to an interruption point.
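The per-frame factors described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Arc` record, the function names, the overlap rule for deciding which arcs belong to a frame, and the way log-likelihoods are normalised into a distribution before taking the entropy are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from dataclasses import dataclass
from statistics import pvariance


@dataclass
class Arc:
    """Hypothetical word-lattice arc (field names are assumptions)."""
    word: str
    start: float          # start time in seconds
    end: float            # end time in seconds
    acoustic_logp: float  # acoustic model log-likelihood
    lm_logp: float        # language model log-likelihood


def entropy_from_logps(logps):
    """Shannon entropy (nats) after normalising log-likelihoods to sum to 1.

    Assumption: scores of the arcs active in a frame are renormalised
    among themselves; the paper may define the distribution differently.
    """
    if not logps:
        return 0.0
    m = max(logps)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - m) for lp in logps]
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)


def frame_factors(arcs, frame_start, frame_len=0.010):
    """Compute entropy and arc factors for one 10 ms frame."""
    frame_end = frame_start + frame_len
    # An arc is treated as active if it overlaps the frame at all.
    active = [a for a in arcs if a.start < frame_end and a.end > frame_start]
    return {
        "acoustic_entropy": entropy_from_logps([a.acoustic_logp for a in active]),
        "lm_entropy": entropy_from_logps([a.lm_logp for a in active]),
        "n_arcs": len(active),                          # non-unique arcs
        "n_unique_words": len({a.word for a in active}),  # unique hypotheses
        "start_var": pvariance([a.start for a in active]) if active else 0.0,
        "end_var": pvariance([a.end for a in active]) if active else 0.0,
        "n_starting": sum(frame_start <= a.start < frame_end for a in active),
        "n_ending": sum(frame_start < a.end <= frame_end for a in active),
    }


# Toy lattice with made-up values, for illustration only.
toy_arcs = [
    Arc("the", 0.0, 0.2, -1.0, -1.0),
    Arc("a", 0.0, 0.2, -1.0, -1.0),
    Arc("the", 0.1, 0.3, -1.0, -1.0),
]
factors = frame_factors(toy_arcs, frame_start=0.15)
```

Calling `frame_factors` for every 10 ms step of an utterance would give one feature vector per frame, which is the kind of input a CART-style decision tree could then be trained on.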
Original language: English
Title of host publication: Proceedings of DiSS’03, Disfluency in Spontaneous Speech Workshop, 5–8 September 2003, Göteborg University, Sweden.
Editors: Robert Eklund
Publisher: ISCA
Pages: 51-54
Publication status: Published - 1 Sept 2003
Event: Disfluency in Spontaneous Speech (DiSS'03) - Göteborg, Sweden
Duration: 5 Sept 2003 to 8 Sept 2003

Publication series

Name: Gothenburg Papers in Theoretical Linguistics
ISSN (Print): 0349–102

Workshop

Workshop: Disfluency in Spontaneous Speech (DiSS'03)
Country/Territory: Sweden
City: Göteborg
Period: 5/09/03 to 8/09/03
