HMM-based synthesis of child speech

Oliver Watts, Junichi Yamagishi, Kay Berkling, Simon King

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

The synthesis of child speech presents challenges both in the collection of data and in the building of a synthesiser from that data. Because only limited data can be collected, and the domain of that data is constrained, it is difficult to obtain the type of phonetically-balanced corpus usually used in speech synthesis. As a consequence, building a synthesiser from such data is difficult. Concatenative synthesisers are not robust to corpora with many missing units (as is likely when the corpus content is not carefully designed), so we chose to build a statistical parametric synthesiser using the HMM-based system HTS. This technique has previously been shown to perform well with limited amounts of data, and with data collected under imperfect conditions. We compared six different configurations of the synthesiser, using both speaker-dependent and speaker-adaptive modelling techniques, and using varying amounts of data. The output from these systems was evaluated alongside natural and vocoded speech in a Blizzard-style listening test.
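The "missing units" problem mentioned above can be made concrete with a small illustrative sketch (not from the paper): a concatenative synthesiser typically needs every diphone (phone pair) it might be asked to render, so one quick diagnostic for an undesigned corpus is its diphone coverage. The phone sets and corpus below are hypothetical toy data.

```python
def diphone_coverage(utterances, inventory):
    """Return the fraction of all possible diphones (ordered pairs
    drawn from the phone inventory) that occur in the corpus."""
    seen = set()
    for phones in utterances:
        # Collect adjacent phone pairs within each utterance.
        seen.update(zip(phones, phones[1:]))
    possible = len(inventory) ** 2
    return len(seen) / possible

# Toy corpus: each utterance is a list of phone symbols.
corpus = [["h", "e", "l", "ou"], ["l", "ou", "h", "e"]]
inventory = {"h", "e", "l", "ou"}
print(diphone_coverage(corpus, inventory))  # low coverage => many missing units
```

A statistical parametric system sidesteps this: unseen contexts are handled by parameter sharing across decision-tree-clustered models rather than requiring a recorded exemplar of every unit.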
Original language: English
Title of host publication: Proc. of The 1st Workshop on Child, Computer and Interaction (ICMI'08 post-conference workshop)
Publication status: Published - 2008
