Phone Recognition Analysis for Trajectory HMM

Le Zhang, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


The trajectory HMM has been shown to be useful for model-based speech synthesis, where a smoothed trajectory is generated using temporal constraints imposed by dynamic features. To evaluate the performance of such a model on an ASR task, we present a trajectory decoder based on tree search with delayed path merging. An experiment on a speaker-dependent phone recognition task using the MOCHA-TIMIT database shows that the MLE-trained trajectory model, while retaining the attractive properties of a proper generative model, tends to favour over-smoothed trajectories among competing hypotheses and does not perform better than a conventional HMM. We use this to argue that models giving a better fit to the training data may suffer reduced discrimination by being too faithful to that data. This partially explains why alternative acoustic models that try to explicitly model temporal constraints do not achieve significant improvements in ASR.
Original language: English
Title of host publication: Interspeech 2006 - ICSLP
Subtitle of host publication: Ninth International Conference on Spoken Language Processing, Proceedings of the
Publication status: Published - 2006
Event: Ninth International Conference on Spoken Language Processing (INTERSPEECH 2006 - ICSLP) - Pittsburgh, PA, United States
Duration: 17 Sep 2006 to 21 Sep 2006


Conference: Ninth International Conference on Spoken Language Processing (INTERSPEECH 2006 - ICSLP)
Country: United States
City: Pittsburgh, PA
