Robust fundamental frequency estimation in sustained vowels: Detailed algorithmic comparisons and information fusion with adaptive Kalman filtering

Athanasios Tsanas*, Matias Zanartu, Max A. Little, Cynthia Fox, Lorraine O. Ramig, Gari D. Clifford

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

There has been consistent interest among speech signal processing researchers in the accurate estimation of the fundamental frequency (F0) of speech signals. This study examines ten F0 estimation algorithms (some well-established and some proposed more recently) to determine which of these algorithms is, on average, better able to estimate F0 in the sustained vowel /a/. Moreover, a robust method for adaptively weighting the estimates of individual F0 estimation algorithms based on quality and performance measures is proposed, using an adaptive Kalman filter (KF) framework. The accuracy of the algorithms is validated using (a) a database of 117 synthetic realistic phonations obtained using a sophisticated physiological model of speech production and (b) a database of 65 recordings of human phonations where the glottal cycles are calculated from electroglottograph signals. On average, the sawtooth waveform inspired pitch estimator and the nearly defect-free algorithms provided the best individual F0 estimates, and the proposed KF approach resulted in an approximately 16% improvement in accuracy over the best single F0 estimation algorithm. These findings may be useful in speech signal processing applications where sustained vowels are used to assess vocal quality, when very accurate F0 estimation is required. © 2014 Acoustical Society of America.
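The idea of adaptively weighting several F0 estimators inside a Kalman filter can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says the weights are driven by quality and performance measures, which are not specified here, so the sketch substitutes a simple heuristic in which each estimator's measurement noise variance grows with its squared distance from the filter's prediction. The function name `fuse_f0`, the random-walk model for F0, and the parameters `q`, `r_floor`, and `p0` are all assumptions made for illustration.

```python
from statistics import median


def fuse_f0(tracks, q=1.0, r_floor=1.0, p0=100.0):
    """Fuse per-frame F0 estimates (Hz) from several estimators.

    tracks  : list of equal-length lists, one F0 track per estimator
    q       : process noise variance of the random-walk F0 model (Hz^2)
    r_floor : smallest measurement noise variance allowed (Hz^2)
    p0      : initial state variance (Hz^2)
    All parameter names and defaults are illustrative assumptions.
    """
    n_frames = len(tracks[0])
    if any(len(t) != n_frames for t in tracks):
        raise ValueError("all tracks must have the same number of frames")

    x = median(t[0] for t in tracks)  # robust initial F0 guess
    p = p0
    fused = []
    for k in range(n_frames):
        # Predict: F0 is modelled as a random walk, so only the variance grows.
        x_pred, p_pred = x, p + q

        # Adapt (heuristic, not the paper's measures): an estimator that is
        # far from the prediction is assigned a large noise variance.
        r = [r_floor + (t[k] - x_pred) ** 2 for t in tracks]

        # Update (information form): combine the prediction and all
        # measurements, each weighted by the inverse of its variance.
        info = 1.0 / p_pred + sum(1.0 / r_i for r_i in r)
        x = (x_pred / p_pred + sum(t[k] / r_i for t, r_i in zip(tracks, r))) / info
        p = 1.0 / info
        fused.append(x)
    return fused
```

As a usage example, calling `fuse_f0([[120.0] * 50, [121.0] * 50, [240.0] * 50])` fuses two estimators that agree near 120 Hz with a third that makes a persistent octave error. The fused track stays close to the two agreeing estimators, whereas a plain per-frame average of the three would sit near 160 Hz.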

Original language: English
Pages (from-to): 2885-2901
Number of pages: 17
Journal: The Journal of the Acoustical Society of America
Volume: 135
Issue number: 5
Publication status: Published - 9 May 2014

Keywords

  • PITCH DETECTION ALGORITHMS
  • PARKINSONS-DISEASE
  • ELECTROGLOTTOGRAPHIC SIGNALS
  • PERTURBATION MEASUREMENTS
  • VOCAL FOLDS
  • SPEECH
  • VOICE
  • MODEL
  • PHONATION
  • MUSIC
