Estimating detailed spectral envelopes using articulatory clustering

Yoshinori Shiga, Simon King

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This paper presents an articulatory-acoustic mapping in which detailed spectral envelopes are estimated. During the estimation, harmonics covering a range of F0 values are derived from the spectra of multiple voiced speech signals vocalized with similar articulator settings. The envelope formed by these harmonics is represented by a cepstrum, which is computed by fitting the peaks of all the harmonics using the weighted least squares method in the frequency domain. The experimental results show that the spectral envelopes are estimated with the highest accuracy when the cepstral order is 48 to 64 for a female speaker. This suggests that representing the real response of the vocal tract requires high-quefrency elements that conventional speech synthesis methods are forced to discard in order to eliminate the pitch component of speech.
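The fitting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common real-cepstrum envelope model log|H(f)| = c_0 + 2 Σ c_n cos(2πfn/fs), uniform weights, and a 16 kHz sampling rate, none of which are stated in the abstract. The function names (`cepstral_basis`, `fit_cepstrum_wls`, `envelope`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def cepstral_basis(freqs_hz, order, fs):
    """Cosine basis mapping cepstral coefficients c_0..c_order to log-amplitude."""
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / fs
    n = np.arange(order + 1)
    basis = 2.0 * np.cos(np.outer(omega, n))
    basis[:, 0] = 1.0  # the c_0 term is not doubled
    return basis

def fit_cepstrum_wls(freqs_hz, log_amps, weights, order, fs):
    """Weighted least-squares fit of a cepstrum to harmonic peak log-amplitudes."""
    B = cepstral_basis(freqs_hz, order, fs)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    c, *_ = np.linalg.lstsq(B * sw[:, None],
                            np.asarray(log_amps, dtype=float) * sw,
                            rcond=None)
    return c

def envelope(c, freqs_hz, fs):
    """Evaluate the log-amplitude envelope described by cepstrum c."""
    return cepstral_basis(freqs_hz, len(c) - 1, fs) @ c

# Synthetic demonstration (all values below are assumptions, not paper data).
fs = 16000.0
c_true = np.array([-1.0, 0.8, -0.3, 0.15, 0.05])

# Pool harmonics from several F0 values, mimicking frames that share similar
# articulator settings but differ in pitch, so the envelope is sampled densely.
f0_values = [180.0, 200.0, 220.0, 240.0]
freqs = np.concatenate([np.arange(f0, fs / 2.0, f0) for f0 in f0_values])
log_amps = envelope(c_true, freqs, fs)
weights = np.ones_like(freqs)  # uniform weights: an assumption

c_hat = fit_cepstrum_wls(freqs, log_amps, weights, order=4, fs=fs)
print("max coefficient error:", np.max(np.abs(c_hat - c_true)))
```

Pooling harmonics across several F0 values is what makes a high cepstral order feasible in this sketch: a single frame samples the envelope only at multiples of one F0, whereas the pooled set provides many more distinct frequency points than unknown coefficients.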
Original language: English
Title of host publication: Interspeech 2004 - ICSLP
Subtitle of host publication: 8th International Conference on Spoken Language Processing
Publisher: International Speech Communication Association
Number of pages: 4
ISBN (Print): ISSN: 1990-9772
Publication status: Published - 1 Oct 2004


