A Statistically Motivated Database Pruning Technique for Unit Selection Synthesis

Peter Rutten, Matthew Aylett, Justin Fackrell, Paul Taylor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

An important topic in unit selection based speech synthesis is the scalability of such systems. A related question is the optimal size of a unit selection database. An ideal system should produce ever better synthesis results as more data is added, but a practical system may not. Unit selection criteria are generally not sufficiently developed to ensure that a system makes optimal use of the data available to it.

In this paper we propose a database reduction technique based on the statistical behaviour of unit selection. We investigate the effect of scaling down the database using objective and subjective criteria. We compare the proposed reduction technique with one that simply limits the size of unit lists to a fraction of their original size (random removal).

The results show that the proposed technique is far better than random removal, and that we can remove a significant portion of our database without causing any severe quality loss.

Original language: English
Title of host publication: Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP2002)
Subtitle of host publication: Interspeech 2002
Editors: John H. L. Hansen, Bryan Pellom
Publisher: ISCA
Number of pages: 4
Publication status: Published - 2002
Event: 7th International Conference on Spoken Language Processing (Interspeech 2002) - Denver, CO, United States
Duration: 16 Sep 2002 to 20 Sep 2002

Conference

Conference: 7th International Conference on Spoken Language Processing (Interspeech 2002)
Country/Territory: United States
City: Denver, CO
Period: 16/09/02 to 20/09/02
