Source-filter Separation of Speech Signal in the Phase Domain

Erfan Loweimi, Jon Barker, Thomas Hain

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deconvolution of the speech excitation (source) and vocal tract (filter) components through log-magnitude spectral processing is well established and has led to the well-known cepstral features used in a multitude of speech processing tasks. This paper presents a novel source-filter decomposition based on processing in the phase domain. We show that the separation between source and filter in the log-magnitude spectrum is far from perfect, leading to a loss of vital vocal tract information. It is demonstrated that the same task can be performed better by trend and fluctuation analysis of the phase spectrum of the minimum-phase component of speech, which can be computed via the Hilbert transform. Because the vocal tract and source components are additive in the phase domain, the trend and the fluctuation can be separated by low-pass filtering the phase. This yields separated signals that have a clear relation to the vocal tract and excitation components. The effectiveness of the method is put to the test in a speech recognition task: the vocal tract component extracted in this way is used as the basis of a feature extraction algorithm for speech recognition on the Aurora-2 database. The recognition results show up to 8.5% absolute improvement over MFCC features, averaged over 0-20 dB SNR.
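The trend and fluctuation analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the minimum-phase phase spectrum is obtained by folding the real cepstrum (a standard way to realise the Hilbert-transform relation between log-magnitude and minimum phase), and it assumes a simple Hann-weighted moving average as the low-pass filter. The function names, the FFT length, and the smoothing width are all hypothetical choices made for illustration.

```python
import numpy as np


def minimum_phase_spectrum_phase(frame, n_fft=1024):
    """Phase spectrum of the minimum-phase component of `frame`.

    Assumption: computed by folding the real cepstrum, which is equivalent
    to taking the Hilbert transform of the log-magnitude spectrum.
    Returns the phase for bins 0..n_fft/2 (already unwrapped).
    """
    spectrum = np.fft.fft(frame, n_fft)
    log_mag = np.log(np.abs(spectrum) + 1e-10)  # small floor avoids log(0)
    cepstrum = np.fft.ifft(log_mag).real        # real cepstrum

    # Fold the cepstrum onto non-negative quefrencies (causal cepstrum).
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0

    # FFT of the causal cepstrum is log|X| + j * (minimum-phase phase).
    log_min_phase = np.fft.fft(cepstrum * fold)
    return log_min_phase.imag[: n_fft // 2 + 1]


def split_trend_fluctuation(phase, width=31):
    """Split a phase spectrum into a smooth trend and a residual fluctuation.

    Assumption: the low-pass filter is a normalised Hann window of odd
    length `width`, applied along the frequency axis with reflect padding.
    By construction, trend + fluctuation equals the input phase.
    """
    kernel = np.hanning(width)
    kernel /= kernel.sum()
    padded = np.pad(phase, width // 2, mode="reflect")
    trend = np.convolve(padded, kernel, mode="valid")
    return trend, phase - trend


# Usage on a synthetic frame (a decaying resonance excited by a pulse train).
n = np.arange(400)
pulses = (n % 80 == 0).astype(float)
resonance = np.exp(-0.05 * n) * np.cos(0.3 * np.pi * n)
frame = np.convolve(pulses, resonance)[:400] * np.hamming(400)

phase = minimum_phase_spectrum_phase(frame)
trend, fluctuation = split_trend_fluctuation(phase)
```

In this sketch the smooth `trend` plays the role of the vocal tract (filter) component and the residual `fluctuation` plays the role of the excitation (source) component. The smoothing width controls where the split falls and would need tuning for real speech.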
Original language: English
Title of host publication: Proc. Interspeech 2015
Publisher: ISCA
Pages: 598-602
Number of pages: 5
Publication status: Published - 2015
Event: Interspeech 2015 - Dresden, Germany
Duration: 6 Sep 2015 to 9 Sep 2015

Conference

Conference: Interspeech 2015
Country/Territory: Germany
City: Dresden
Period: 6/09/15 to 9/09/15
