Compression of Model-based Group Delay Function for Robust Speech Recognition

E. Loweimi, Jon Barker, Thomas Hain

Research output: Other contribution

Abstract

In this paper, we improve the performance of the ARGDMF feature by adding a nonlinear filtering block. ARGDMF is a group delay-based feature consisting of four main parts, namely autoregressive (AR) model extraction, group delay function (GDF) calculation, compression, and scale information augmentation. The main problem with the GDF is its spiky nature, which is solved by coupling the GDF with an all-pole model. The compression step includes two stages similar to those of MFCC, but without taking the logarithm of the output energies. The fourth part augments the phase-based feature vector with scale information. The novelty of this paper is the addition of a filtering block to the compression process to make it more efficient. This filter aims to improve the performance of ARGDMF by better adjusting the dynamic range and the sharpness of the formants. The feature was evaluated on the Aurora 2 database. In the presence of both additive and convolutional noise, the proposed method noticeably outperforms MFCCs and other phase-based features, without a notable increase in computational load.
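The first two stages named in the abstract (AR model extraction and GDF calculation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ar_coefficients` and `ar_group_delay`, the autocorrelation method for the AR fit, and the parameters `order` and `n_fft` are all assumptions made for illustration. The compression, nonlinear filtering, and scale-augmentation stages are not specified in enough detail here to sketch.

```python
import numpy as np


def ar_coefficients(frame, order):
    """Fit an all-pole (AR) model with the autocorrelation method (an assumption).

    Returns a with a[0] = 1, so that A(z) = sum_k a[k] z^-k and H(z) = 1 / A(z).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    full = np.correlate(frame, frame, mode="full")
    r = full[n - 1 : n + order]  # autocorrelation at lags 0..order
    # Toeplitz normal equations: R @ alpha = r[1:], predictor x[n] ~ sum alpha_k x[n-k]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    alpha = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -alpha))


def ar_group_delay(a, n_fft=512):
    """Group delay (in samples) of H(z) = 1 / A(z) on n_fft // 2 + 1 frequency bins.

    For a sequence a[k], the group delay of A is Re(FFT(k * a) / FFT(a));
    the group delay of 1 / A is its negative.
    """
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a))
    A = np.fft.rfft(a, n_fft)
    B = np.fft.rfft(k * a, n_fft)
    return -np.real(B / A)


if __name__ == "__main__":
    # Hypothetical usage: a decaying exponential stands in for a windowed speech frame.
    frame = 0.9 ** np.arange(200)
    a = ar_coefficients(frame, order=1)
    gdf = ar_group_delay(a)
    print(a, gdf[0])
```

Because the group delay is computed from the smooth all-pole model rather than from the raw short-time spectrum, zeros close to the unit circle in the signal spectrum do not produce spikes, which is the motivation the abstract gives for coupling the GDF with an AR model.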
Original language: English
Type: Symposium paper
Number of pages: 2
Publication status: Published - 1 Jun 2014

Keywords / Materials (for Non-textual outputs)

  • Robust speech recognition
  • phase spectrum, group delay
  • compression
  • scale information
