The role of higher-level linguistic features in HMM-based speech synthesis

Oliver Watts, Junichi Yamagishi, Simon King

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We analyse the contribution of higher-level elements of the linguistic specification of a data-driven speech synthesiser to the naturalness of the synthetic speech which it generates. The system is trained using various subsets of the full feature-set, in which features relating to syntactic category, intonational phrase boundary, pitch accent and boundary tones are selectively removed. Utterances synthesised by the different configurations of the system are then compared in a subjective evaluation of their naturalness. The work presented forms background analysis for an ongoing set of experiments in performing text-to-speech (TTS) conversion based on shallow features: features that can be trivially extracted from text. By building a range of systems, each assuming the availability of a different level of linguistic annotation, we obtain benchmarks for our ongoing work.
Original language: English
Title of host publication: Proc. Interspeech
Pages: 841-844
Publication status: Published - 2010
