In this paper we present a method of discriminatively training language models for spoken language understanding, and we show that these improved language models raise named entity F-scores on speech data. Theoretical probabilities associated with manual markup are compared with the actual probabilities of output markup to identify the probabilities that require adjustment. We present results that support our hypothesis that F-scores can be improved by using either previously used training data or held-out development data to improve discrimination among a set of N-gram language models.
|Title of host publication||Eurospeech 2003 - Interspeech 2003|
|Subtitle of host publication||8th European Conference on Speech Communication and Technology|
|Publisher||International Speech Communication Association|
|Number of pages||4|
|ISSN (Print)||1990-9772|
|Publication status||Published - 1 Sep 2003|