Edinburgh Research Explorer

Objective Distance Measures for Spectral Discontinuities in Concatenative Speech Synthesis

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Original language: English
Title of host publication: ICSLP 2002
Subtitle of host publication: 7th International Conference on Spoken Language Processing
Publisher: International Speech Communication Association
Pages: 2605-2608
Number of pages: 4
ISBN (Print): ISSN 1990-9772
Publication status: Published - 1 Sep 2002

Abstract

In unit-selection-based concatenative speech synthesis systems, the 'join cost', which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. An ideal join cost would measure 'perceived' discontinuity, based on easily measurable spectral properties of the units being joined, in order to ensure smooth and natural-sounding synthetic speech. In this paper we report a perceptual experiment conducted to measure the correlation between 'subjective' human perception and various 'objective' spectrally based measures proposed in the literature. Our experiments used a state-of-the-art unit-selection text-to-speech system: 'rVoice' from Rhetorical Systems Ltd.
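The idea of comparing an objective spectral distance with listener judgements can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not name the measures or the statistic used, so the Euclidean distance, the Pearson correlation, the function names, and all numbers below are assumptions made up purely for illustration.

```python
import math


def euclidean_join_distance(left_frame, right_frame):
    """Euclidean distance between the spectral feature vectors on either
    side of a join (an assumed, commonly used objective measure)."""
    if len(left_frame) != len(right_frame):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(left_frame, right_frame)))


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: (last frame of left unit, first frame of right unit).
# Real systems would use higher-dimensional spectral features.
joins = [
    ([1.0, 2.0], [1.0, 2.0]),
    ([1.0, 2.0], [4.0, 6.0]),
    ([0.0, 0.0], [6.0, 8.0]),
]

# Hypothetical listener ratings of perceived discontinuity, one per join.
ratings = [1.0, 2.0, 3.0]

distances = [euclidean_join_distance(left, right) for left, right in joins]
correlation = pearson_correlation(distances, ratings)
print(distances, correlation)
```

A measure whose distances correlate strongly with the listener ratings would be a better candidate for a join cost than one whose distances do not; with real data the correlation would be computed over many joins and compared across several candidate measures.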

ID: 2077413