Abstract / Description of output
This paper proposes a new approach to duration modelling for statistical parametric speech synthesis in which a recurrent statistical model is trained to output a phone transition probability at each timestep (acoustic frame). Unlike conventional approaches to duration modelling - which assume that duration distributions have a particular form (e.g., a Gaussian) and use the mean of that distribution for synthesis - our approach can in principle model any distribution supported on the non-negative integers. Generation from this model can be performed in many ways; here we consider output generation based on the median predicted duration. The median is more typical (more probable) than the conventional mean duration, is robust to training-data irregularities, and enables incremental generation. Furthermore, a frame-level approach to duration prediction is consistent with a longer-term goal of modelling durations and acoustic features together. Results indicate that the proposed method is competitive with baseline approaches in approximating the median duration of held-out natural speech.
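One way to read median-based generation from per-frame transition probabilities is sketched below. It is a minimal illustration and not the authors' implementation. It assumes that each value is the probability of leaving the current phone at that frame, given that the phone has not ended yet, and that every frame consumed counts towards the duration. The function name `median_duration` and the input format are hypothetical.

```python
from typing import Iterable


def median_duration(transition_probs: Iterable[float]) -> int:
    """Return the smallest duration d (in frames) with P(D <= d) >= 0.5.

    Assumption: transition_probs[t] is the probability that the phone ends
    at frame t + 1, given that it has not ended before that frame.
    """
    survival = 1.0  # P(D > d): probability that the phone is still ongoing
    frames = 0
    for p in transition_probs:
        frames += 1
        survival *= 1.0 - p
        # The median is reached once at least half of the probability mass
        # lies at or below the current duration.
        if survival <= 0.5:
            return frames
    raise ValueError("probabilities ended before the median was reached")
```

Because the loop stops as soon as the cumulative probability reaches one half, the probabilities can be consumed one frame at a time, which matches the incremental generation mentioned in the abstract.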
Original language | English |
---|---|
Title of host publication | 2016 IEEE Spoken Language Technology Workshop (SLT) |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 686-692 |
Number of pages | 7 |
ISBN (Electronic) | 978-1-5090-4903-5 |
ISBN (Print) | 978-1-5090-4904-2 |
Publication status | Published - 9 Feb 2017 |
Event | 2016 IEEE Spoken Language Technology Workshop, San Diego, United States, 13 Dec 2016 → 16 Dec 2016, https://www2.securecms.com/SLT2016//Default.asp |
Conference
Conference | 2016 IEEE Spoken Language Technology Workshop |
---|---|
Abbreviated title | IEEE SLT 2016 |
Country/Territory | United States |
City | San Diego |
Period | 13/12/16 → 16/12/16 |
Fingerprint
Dive into the research topics of 'Median-based generation of synthetic speech durations using a non-parametric approach'. Together they form a unique fingerprint.