Abstract / Description of output
This paper describes speech intelligibility enhancement for Hidden Markov Model (HMM) generated synthetic speech in noise. We present a method for modifying the Mel cepstral coefficients generated by statistical parametric models that have been trained on plain speech. We update these coefficients such that the glimpse proportion - an objective measure of the intelligibility of speech in noise - increases, while keeping the speech energy fixed. An acoustic analysis reveals that the modified speech is boosted in the region 1-4 kHz, particularly for vowels, nasals and approximants. Results from listening tests employing speech-shaped noise show that the modified speech is as intelligible as a synthetic voice trained on plain speech whose duration, Mel cepstral coefficients and excitation signal parameters have been adapted to Lombard speech from the same speaker. Our proposed method does not require these additional recordings of Lombard speech. In the presence of a competing talker, both modification and adaptation of spectral coefficients give more modest gains. (C) 2013 Elsevier Ltd. All rights reserved.
Original language | English |
---|---|
Pages (from-to) | 665-686 |
Number of pages | 22 |
Journal | Computer Speech and Language |
Volume | 28 |
Issue number | 2 |
Publication status | Published - Mar 2014 |
Keywords / Materials (for Non-textual outputs)
- Intelligibility of speech in noise
- HMM-based speech synthesis
- Mel cepstral coefficients
- Glimpse proportion measure
- ALGORITHMS
- HEARING
- MODEL
Fingerprint
Dive into the research topics of 'Intelligibility enhancement of HMM-generated speech in additive noise by modifying Mel cepstral coefficients to increase the glimpse proportion'. Together they form a unique fingerprint.
Projects
- 3 Finished
- Deep architectures for statistical speech synthesis
Yamagishi, J.
UK industry, commerce and public corporations
4/09/12 → 3/03/16
Project: Research
- LISTA: LISTA- The Listening Talker (RTGS)
King, S., Mayo, C. & Renals, S.
1/05/10 → 30/04/13
Project: Research
Activities
- 1 Invited talk
- EACL 2014 keynote: Speech synthesis needs YOU!
Simon King (Speaker)
29 Apr 2014
Activity: Academic talk or presentation types › Invited talk
Profiles
- Cassia Valentini Botinhao
- School of Informatics - Senior Researcher
- Institute of Language, Cognition and Computation
- Centre for Speech Technology Research
- Language, Interaction, and Robotics
Person: Academic: Research Active