Abstract / Description of output
Text-to-speech voices created from noisy and reverberant recordings are of lower quality. A simple way to improve this is to increase the quality of the recordings prior to text-to-speech training with speech enhancement methods such as noise suppression and dereverberation. In this paper, we take this approach and use a recurrent neural network to perform the enhancement. The network was trained with parallel data of clean and lower-quality recordings of speech. The lower-quality data was artificially created by adding recordings of environmental noise to studio-quality recordings of speech and by convolving room impulse responses with these clean recordings. We trained separate networks with noise-only data, reverberation-only data, and data containing both reverberation and additive noise. The quality of voices trained with lower-quality data that had been enhanced using these networks was significantly higher in all cases. For the noise-only case, the enhanced synthetic voice ranked as high as the voice trained with clean data. For the most realistic and challenging scenario, when both noise and reverberation were present, the improvements were more modest, but still significant.
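The data-creation step described in the abstract (adding environmental noise to clean speech and convolving it with a room impulse response) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `add_noise` and `add_reverb`, the target signal-to-noise ratio, the synthetic stand-in signals, and the choice to apply reverberation before adding noise are all assumptions made for illustration.

```python
import numpy as np

def add_noise(clean, noise, snr_db):
    """Mix noise into clean speech at a target signal-to-noise ratio (dB)."""
    # Tile or trim the noise so it covers the whole utterance.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the gain so that clean_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise

def add_reverb(clean, rir):
    """Convolve clean speech with a room impulse response, keeping the input length."""
    return np.convolve(clean, rir)[: len(clean)]

# Synthetic stand-ins (assumptions): a tone for speech, white noise, a decaying random RIR.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
noise = rng.standard_normal(4000)
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)
rir[0] = 1.0  # direct path

noisy = add_noise(clean, noise, snr_db=5.0)            # noise-only condition
reverberant = add_reverb(clean, rir)                   # reverberation-only condition
noisy_reverberant = add_noise(reverberant, noise, snr_db=5.0)  # both (order is an assumption)
```

Each degraded signal paired with its clean original gives one parallel training example of the kind the abstract describes.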
Original language | English |
---|---|
Pages (from-to) | 1420-1433 |
Number of pages | 14 |
Journal | IEEE/ACM Transactions on Audio, Speech and Language Processing |
Volume | 26 |
Issue number | 8 |
Early online date | 20 Apr 2018 |
Publication status | Published - 1 Aug 2018 |
Title | Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech |
Datasets
- Noisy speech database for training speech enhancement algorithms and TTS models
  Valentini Botinhao, C. (Creator), Edinburgh DataShare, 21 Aug 2017
  DOI: 10.7488/ds/2117
- Noisy reverberant speech database for training speech enhancement algorithms and TTS models
  Valentini Botinhao, C. (Creator), Edinburgh DataShare, 14 Sept 2017
  DOI: 10.7488/ds/2139
- Reverberant speech database for training speech dereverberation algorithms and TTS models
  Valentini Botinhao, C. (Creator), Edinburgh DataShare, 22 Mar 2016
  DOI: 10.7488/ds/1425