Objective Distance Measures for Spectral Discontinuities in Concatenative Speech Synthesis

J. Vepa, S. King, Paul Taylor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


In unit-selection-based concatenative speech systems, the 'join cost', which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. The ideal join cost will measure 'perceived' discontinuity, based on easily measurable spectral properties of the units being joined, in order to ensure smooth and natural-sounding synthetic speech. In this paper we report a perceptual experiment conducted to measure the correlation between 'subjective' human perception and various 'objective' spectrally-based measures proposed in the literature. Our experiments used a state-of-the-art unit-selection text-to-speech system: 'rVoice' from Rhetorical Systems Ltd.
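As a rough illustration of the kind of comparison the abstract describes, the sketch below computes one objective distance at each join and correlates it with listener ratings. The abstract does not say which measures or which correlation statistic were used, so the Euclidean distance, the Pearson coefficient, the function names, and all of the numbers here are assumptions made purely for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length spectral feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: each join is a pair of feature vectors, the last frame
# of the left unit and the first frame of the right unit. Real systems would
# use spectral features extracted from speech; these values are made up.
joins = [
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    ([1.0, 2.0, 3.0], [2.0, 2.0, 3.0]),
    ([0.0, 0.0, 0.0], [2.0, 0.0, 0.0]),
    ([0.0, 0.0, 0.0], [0.0, 3.0, 0.0]),
]

# Hypothetical listener ratings of perceived discontinuity, one per join.
listener_scores = [1.0, 2.0, 2.5, 4.0]

# Objective measure at each join, then its correlation with the ratings.
objective = [euclidean_distance(left, right) for left, right in joins]
r = pearson(objective, listener_scores)
print(f"correlation between objective distance and ratings: {r:.3f}")
```

Under these assumptions, a measure whose correlation with the ratings is higher would be a better candidate for a join cost; swapping `euclidean_distance` for another distance function is enough to compare alternatives on the same joins.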
Original language: English
Title of host publication: ICSLP 2002
Subtitle of host publication: 7th International Conference on Spoken Language Processing
Publisher: International Speech Communication Association
Number of pages: 4
ISBN (Print): ISSN: 1990-9772
Publication status: Published - 1 Sep 2002

