Edinburgh Research Explorer

Exemplar-based speech waveform generation for text-to-speech

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Original language: English
Title of host publication: 2018 IEEE Workshop on Spoken Language Technology (SLT)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 332-338
Number of pages: 7
ISBN (Electronic): 978-1-5386-4334-1, 978-1-5386-4333-4
ISBN (Print): 978-1-5386-4335-8
Publication status: Published - 14 Feb 2019
Event: 2018 IEEE Workshop on Spoken Language Technology (SLT) - Athens, Greece
Duration: 18 Dec 2018 to 21 Dec 2018
http://www.slt2018.org/

Conference

Conference: 2018 IEEE Workshop on Spoken Language Technology (SLT)
Abbreviated title: IEEE SLT 2018
Country: Greece
City: Athens
Period: 18/12/18 to 21/12/18

Abstract

This paper presents a hybrid text-to-speech framework that uses a waveform generation method based on exemplars of natural speech waveforms. These exemplars are selected at synthesis time given a sequence of acoustic features generated from text by a statistical parametric speech synthesis model. To match the expected degradation of these target synthesis features, the database of units is constructed such that the units’ target representations are generated from the same parametric model. We evaluate two variants of this framework that differ in the size of the exemplar: a small-unit variant (where unit boundaries are determined by pitch mark locations) and a halfphone variant (where unit boundaries are determined by subphone state forced alignment). We found that for a larger dataset (around four hours of training data) the exemplar-based waveform generation variants are rated higher than the vocoder-based system.
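
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the cost functions or the search procedure, so everything specific here is an assumption made for illustration: a Euclidean target cost between the predicted features and each unit's target representation, a hypothetical join cost that is zero only when two units are consecutive in the database, and a Viterbi search over the units. The names `select_exemplars` and `join_weight` are likewise hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_exemplars(targets, unit_targets, join_weight=0.0):
    """Return one database unit index per target frame (illustrative only).

    targets      -- feature vectors predicted from text by the parametric model
    unit_targets -- one feature vector per database unit, assumed to be
                    generated by the same parametric model
    join_weight  -- penalty (an assumption) for joining two units that are
                    not consecutive in the database
    """
    if not targets or not unit_targets:
        return []
    n_units = len(unit_targets)

    # best[j]: lowest accumulated cost of any path ending in unit j.
    best = [euclidean(targets[0], unit) for unit in unit_targets]
    backpointers = []

    for target in targets[1:]:
        new_best, pointers = [], []
        for j, unit in enumerate(unit_targets):
            # Cheapest predecessor of unit j, including the join penalty.
            prev = min(
                range(n_units),
                key=lambda i: best[i] + (0.0 if j == i + 1 else join_weight),
            )
            join = 0.0 if j == prev + 1 else join_weight
            new_best.append(best[prev] + join + euclidean(target, unit))
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)

    # Trace the cheapest path back from the final frame.
    j = min(range(n_units), key=lambda i: best[i])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


# Toy database of three units with one-dimensional features.
units = [[0.0], [1.0], [0.9]]
predicted = [[0.0], [0.92]]
nearest_only = select_exemplars(predicted, units)
with_join_cost = select_exemplars(predicted, units, join_weight=1.0)
```

In this toy case the nearest unit for the second frame is unit 2, but once non-consecutive joins are penalised the search prefers unit 1, which directly follows unit 0 in the database. In the framework described above, the waveform segments attached to the selected units would then be concatenated to produce the output; that step is not shown.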

Research areas

• text-to-speech, vocoder, unit selection

ID: 76045394