Edinburgh Research Explorer

HMM adaptation and voice conversion for the synthesis of child speech: a comparison

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Rights statement: © Watts, O., Yamagishi, J., King, S., & Berkling, K. (2009). HMM adaptation and voice conversion for the synthesis of child speech: a comparison. In Interspeech 2009, Brighton UK. (pp. 2627-2630)

    Final published version, 266 KB, PDF document

Original language: English
Title of host publication: Interspeech 2009, Brighton UK
Number of pages: 4
Publication status: Published - Sep 2009


This study compares two different methodologies for producing data-driven synthesis of child speech from existing systems that have been trained on the speech of adults. On the one hand, an existing statistical parametric synthesiser is transformed using model adaptation techniques, informed by linguistic and prosodic knowledge, to the speaker characteristics of a child speaker. This is compared with the application of voice conversion techniques to convert the output of an existing waveform concatenation synthesiser with no explicit linguistic or prosodic knowledge. In a subjective evaluation of the similarity of synthetic speech to natural speech from the target speaker, the HMM-based systems evaluated are generally preferred, although this is at least in part due to the higher-dimensional acoustic features supported by these techniques.

ID: 152834