Abstract / Description of output
A problem when developing and tuning speech synthesis systems is that there is no well-established method of automatically rating the quality of the synthetic speech. This research attempts to obtain a new automated measure which is trained on the result of large-scale subjective evaluations employing many human listeners, i.e., the Blizzard Challenge. To exploit the data, we experiment with linear regression, feed-forward and convolutional neural network models, and combinations of them to regress from synthetic speech to the perceptual scores obtained from listeners. The biggest improvements were seen when combining stimulus- and system-level predictions.
Original language | English |
---|---|
Title of host publication | Interspeech 2016 |
Publisher | International Speech Communication Association |
Pages | 342-346 |
Number of pages | 5 |
Publication status | Published - 12 Sept 2016 |
Event | Interspeech 2016 - San Francisco, United States Duration: 8 Sept 2016 → 12 Sept 2016 http://www.interspeech2016.org/ |
Publication series
Name | |
---|---|
Publisher | International Speech Communication Association |
ISSN (Print) | 1990-9772 |
Conference
Conference | Interspeech 2016 |
---|---|
Country/Territory | United States |
City | San Francisco |
Period | 8/09/16 → 12/09/16 |
Internet address | http://www.interspeech2016.org/ |
Fingerprint
Dive into the research topics of 'A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Deep architectures for statistical speech synthesis
  Yamagishi, J.
  UK industry, commerce and public corporations
  4/09/12 → 3/03/16
  Project: Research