Speaker and language independent voice quality classification applied to unlabelled corpora of expressive speech

J. Kane, S. Scherer, M. Aylett, L. P. Morency, C. Gobl

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Voice quality plays a pivotal role in speech style variation. Therefore, control and analysis of voice quality are critical for many areas of speech technology. Until now, most work has focused on small, purpose-built corpora. In this paper, we apply state-of-the-art voice quality analysis to large speech corpora built for expressive speech synthesis. A fuzzy-input fuzzy-output support vector machine classifier is trained and validated using features extracted from these corpora. We then apply this classifier to freely available audiobook data and demonstrate a clustering of the voice qualities that approximates the performance of human perceptual ratings. The ability to detect voice quality variation in these widely available unlabelled audiobook corpora means that the proposed method may be used as a valuable resource in expressive speech synthesis.
Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 7982-7986
Number of pages: 5
ISBN (Print): 978-1-4799-0356-6
Publication status: Published - 1 May 2013
