Abstract / Description of output
This paper proposes a new set of speech features called Locally-Normalized Cepstral Coefficients (LNCC) that are based on Seneff's Generalized Synchrony Detector (GSD). First, an analysis of the GSD frequency response is provided to show that it generates spurious peaks at harmonics of the detected frequency. Then, the GSD frequency response is modeled as a quotient of two filters centered at the detected frequency. The numerator is a triangular band pass filter centered around a particular frequency similar to the ordinary Mel filters. The denominator term is a filter that responds maximally to frequency components on either side of the numerator filter. As a result, a local normalization is performed without the spurious peaks of the original GSD. Speaker verification results demonstrate that the proposed LNCC features are of low computational complexity and far more effectively compensate for spectral tilt than ordinary MFCC coefficients. LNCC features do not require the computation and storage of a moving average of the feature values, and they provide relative reductions in Equal Error Rate (EER) as high as 47.7%, 34.0% or 25.8% when compared with MFCC, MFCC + CMN, or MFCC + RASTA in one case of variable spectral tilt, respectively. (C) 2014 Elsevier Ltd. All rights reserved.
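The local normalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a power spectrum sampled on uniformly spaced bins, a hypothetical denominator built from two triangular side lobes peaking one half-width either side of the centre bin, and a final DCT-II step borrowed from ordinary MFCC processing. The names `triangle`, `local_norm_features`, `half_width` and `n_ceps` are illustrative only; the exact filter shapes and spacing are defined in the paper itself.

```python
import numpy as np

def triangle(n_bins, center, half_width):
    """Triangular weights peaking at `center`, zero beyond `half_width` bins."""
    k = np.arange(n_bins)
    return np.clip(1.0 - np.abs(k - center) / half_width, 0.0, None)

def local_norm_features(power_spec, centers, half_width=4, n_ceps=5, eps=1e-12):
    """Log ratio of a centre filter to a side-lobe filter, followed by a DCT-II.

    Assumption: the denominator shape below is a stand-in chosen for
    illustration, not the filter specified in the paper.
    """
    n_bins = power_spec.shape[0]
    log_ratios = []
    for c in centers:
        # Numerator: triangular band-pass filter centred on bin c.
        num = triangle(n_bins, c, half_width)
        # Hypothetical denominator: zero at c, maximal at c - half_width
        # and c + half_width, i.e. on either side of the numerator.
        den = (triangle(n_bins, c - half_width, half_width)
               + triangle(n_bins, c + half_width, half_width))
        log_ratios.append(np.log((num @ power_spec + eps) / (den @ power_spec + eps)))
    log_ratios = np.array(log_ratios)
    # DCT-II over the filter index (assumed, by analogy with MFCC).
    m = len(centers)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), np.arange(m) + 0.5) / m)
    return basis @ log_ratios

# Usage: a frame and the same frame with a 10x channel gain give the same features.
rng = np.random.default_rng(0)
spec = rng.uniform(0.1, 1.0, 64)
centers = np.arange(8, 56, 4)
features = local_norm_features(spec, centers)
features_gain = local_norm_features(10.0 * spec, centers)
```

Because each feature is a ratio of two locally weighted energies from the same frame, a constant gain cancels within the frame itself, without any running average over time. A smooth spectral tilt would cancel only approximately in this sketch, to the extent that it is nearly flat across the numerator and denominator bands.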
Original language | English |
---|---|
Pages (from-to) | 1-27 |
Number of pages | 27 |
Journal | Computer Speech and Language |
Volume | 31 |
Issue number | 1 |
Early online date | 7 Nov 2014 |
DOIs | |
Publication status | Published - May 2015 |
Keywords / Materials (for Non-textual outputs)
- Channel robust feature extraction
- Auditory models
- Spectral local normalization
- Synchrony detection
- ROBUST SPEECH RECOGNITION
- AUDITORY-NERVE FIBERS
- LOCALIZED SYNCHRONY DETECTION
- AIRBORNE SOUND INSULATION
- POSITION-DEPENDENT CMN
- JOINT FACTOR-ANALYSIS
- STEADY-STATE VOWELS
- TEMPORAL INFORMATION
- NOISY ENVIRONMENTS
- PHASE SENSITIVITY
Fingerprint
Dive into the research topics of 'A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification'. Together they form a unique fingerprint.
Projects
- 2 Finished
Simple4All: Speech synthesis that improves through adaptive learning
1/11/11 → 31/10/14
Project: Research
Profiles
Simon King
- School of Philosophy, Psychology and Language Sciences - Personal Chair of Speech Processing
- Centre for Speech Technology Research
Person: Academic: Research Active