The ADAPT entry to the Blizzard Challenge 2016

Joao P Cabral, Christian Saam, Eva Vanmassenhove, Stephen Bradley, Fasih Haider

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This paper describes the text-to-speech synthesis system developed for the Blizzard Challenge 2016 by members of the ADAPT Centre and colleagues from associated projects. The task was to build a synthetic voice for reading audiobooks to children from an audiobook speech database around 5 hours long. Our entry is an HMM-based parametric speech synthesizer built using a subset of the database (half of the audiobooks in the full dataset). We used only this subset because it was the best-quality data we could obtain under the time constraints imposed by the Challenge's deadlines. The main development work for this challenge was on text chunking, including the splitting of sentences and of quoted segments of text, and on the automatic alignment of speech and text data. We also aimed to synthesize speech with emotions to improve the expressiveness of the synthetic speech. Although we could not complete this task in time for the submission, we plan to continue this work and possibly use it in a future entry to the Blizzard Challenge.
Original language: English
Title of host publication: Proceedings of the Blizzard Challenge 2016
Place of Publication: Cupertino, USA
Publication status: Published - Sep 2016
