Joint Prosodic and Segmental Unit Selection Speech Synthesis

Robert A. J. Clark, Simon King

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We describe a unit selection technique for text-to-speech synthesis which jointly searches the space of possible diphone sequences and the space of possible prosodic unit sequences in order to produce synthetic speech with more natural prosody. We demonstrate that this search, although currently computationally expensive, can achieve improved intonation compared to a baseline in which only the space of possible diphone sequences is searched. We discuss ways in which the search could be made sufficiently efficient for use in a real-time system.
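
The abstract does not specify the cost functions or the search algorithm, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a Viterbi-style dynamic programming search, a toy inventory in which every diphone candidate carries a single F0 value, and three hypothetical cost functions (`join_cost`, `contour_cost`, `match_cost`). The baseline searches diphone candidates only; the joint search runs the same routine over pairs of (diphone candidate, prosodic candidate).

```python
from itertools import product


def viterbi(candidates, local_cost, trans_cost):
    """Return (total_cost, path) for the cheapest sequence, one candidate per position."""
    # best[s] = (cost of the cheapest path ending in state s, that path)
    best = {s: (local_cost(0, s), [s]) for s in candidates[0]}
    for t in range(1, len(candidates)):
        new_best = {}
        for s in candidates[t]:
            cost, path = min(
                ((c + trans_cost(p, s), pth) for p, (c, pth) in best.items()),
                key=lambda x: x[0],
            )
            new_best[s] = (cost + local_cost(t, s), path + [s])
        best = new_best
    return min(best.values(), key=lambda x: x[0])


# Hypothetical toy inventory: each diphone candidate is (unit_id, f0_hz).
diphones = [
    [("h-e#1", 110.0), ("h-e#2", 140.0)],
    [("e-l#1", 115.0), ("e-l#2", 150.0)],
    [("l-o#1", 105.0), ("l-o#2", 160.0)],
]
# Hypothetical prosodic candidates: (contour_id, target_f0_hz) per position.
prosody = [
    [("flat", 120.0), ("rise", 135.0)],
    [("flat", 120.0), ("rise", 150.0)],
    [("flat", 120.0), ("rise", 165.0)],
]


def join_cost(a, b):
    """Assumed segmental join cost: F0 discontinuity between adjacent diphones."""
    return abs(a[1] - b[1]) / 10.0


def contour_cost(a, b):
    """Assumed prosodic join cost: penalise switching contour mid-utterance."""
    return 0.0 if a[0] == b[0] else 5.0


def match_cost(d, p):
    """Assumed coupling cost: mismatch between a diphone's F0 and the prosodic target."""
    return abs(d[1] - p[1]) / 10.0


# Baseline: search the diphone space only.
base_cost, base_path = viterbi(diphones, lambda t, s: 0.0, join_cost)

# Joint search: each state is a (diphone candidate, prosodic candidate) pair.
joint = [list(product(d, p)) for d, p in zip(diphones, prosody)]
cost, path = viterbi(
    joint,
    local_cost=lambda t, s: match_cost(s[0], s[1]),
    trans_cost=lambda a, b: join_cost(a[0], b[0]) + contour_cost(a[1], b[1]),
)

print("baseline units:", [d[0] for d in base_path])
print("joint units:   ", [d[0] for d, _ in path])
print("joint contour: ", [p[0] for _, p in path])
```

In this sketch the joint state space at each position is the product of the two candidate lists, so the number of transitions evaluated per step grows with the square of that product. That is one plausible reading of why such a search is expensive, though the actual cost in the described system depends on details not given here.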
Original language: English
Title of host publication: Interspeech 2006 - ICSLP
Subtitle of host publication: 9th International Conference on Spoken Language Processing
Publisher: International Speech Communication Association
ISBN (Print): 1990-9772
Publication status: Published - 1 Sep 2006
