Edinburgh Research Explorer

Testing Evaluation of Speech Synthesis (TESSa)

Project: Research

Acronym: TESSa
Status: Finished
Effective start/end date: 1/01/05 to 31/12/07
Total award: £246,816.00
Funding organisation: EPSRC
Funder project reference: EP/C53042X/1
Period: 1/01/05 to 31/12/07

Description

We proposed a systematic and thorough investigation of:

* what acoustic information listeners pay attention to during
subjective evaluation of unit-selection speech synthesis quality

* how best to manipulate listeners' attention so that it is
directed at the specific aspect of the synthetic speech under
investigation in the evaluation study

* what experimental paradigms are most appropriate for achieving
the most consistent listener ratings

The results of this study will allow us to design more reliable
subjective evaluation techniques, and to develop objective measures
which accurately model human perceptual rating behaviour for synthetic
speech. These advances will allow designers of speech-enabled systems
to accurately determine the perceived quality of their systems, and
users of such systems to make well-informed choices based on these
evaluations.

Layman's description

Speech synthesis - the conversion of text into speech by computer - is hard to evaluate because the judgements made by listeners are subjective. Fully-automated evaluation (without listeners) is currently impossible.
This project was about better understanding what listeners are listening to when they make judgements of synthetic speech quality. By 'unpicking' the listeners' behaviour we hoped to design better evaluation methods that would tell us not only how good a system was overall but also, in some detail, which parts of the system needed the most improvement.

Key findings

We made advances in the evaluation of speech synthesis, including:
* a method based on multi-dimensional scaling that can separate listeners' judgements about overall quality into axes such as "prosody" or "join quality" even though the untrained listeners only perform a very simple "same or different" judgement task
* direct application of these findings to our ongoing work on unit-selection synthesis, leading to improved quality
* a basis for the Blizzard Challenge series of international evaluations, which we now co-ordinate
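
The multi-dimensional scaling idea in the first finding can be illustrated with a minimal sketch. It is not the project's actual analysis, whose details are not given here. It assumes that "same or different" responses are pooled into a dissimilarity matrix (the proportion of "different" responses per stimulus pair) and then embedded with classical (Torgerson) MDS. The function names and the input format are hypothetical, chosen only for illustration.

```python
import numpy as np


def dissimilarity_from_judgements(n_stimuli, judgements):
    """Pool same/different responses into a symmetric dissimilarity matrix.

    `judgements` is an iterable of (i, j, is_different) tuples; this input
    format is an assumption made for the sketch.
    """
    different = np.zeros((n_stimuli, n_stimuli))
    total = np.zeros((n_stimuli, n_stimuli))
    for i, j, is_different in judgements:
        total[i, j] += 1
        total[j, i] += 1
        if is_different:
            different[i, j] += 1
            different[j, i] += 1
    # Proportion of "different" responses; pairs never presented stay at 0.
    d = np.divide(different, total, out=np.zeros_like(different), where=total > 0)
    np.fill_diagonal(d, 0.0)
    return d


def classical_mds(d, n_dims=2):
    """Embed items in `n_dims` dimensions using classical (Torgerson) MDS."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to get a Gram-like matrix.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the largest eigenvalues; clip small negatives caused by noise.
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)
```

Each column of the returned array is one axis of the perceptual space. Attaching a label such as "prosody" or "join quality" to an axis is an interpretive step that the code does not perform.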
