Edinburgh Research Explorer

A Posterior Approach for Microphone Array Based Speech Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Open Access permissions

Open

Documents

    Rights statement: Wang, D., Himawan, I., Frankel, J., & King, S. (2008). A Posterior Approach for Microphone Array Based Speech Recognition. In Interspeech. (pp. 996-999).

    Accepted author manuscript, 198 KB, PDF document

http://hdl.handle.net/1842/3907
http://www.isca-speech.org/archive/interspeech_2008/i08_0996.html
Original language: English
Title of host publication: Interspeech
Pages: 996-999
Number of pages: 4
Publication status: Published - Sep 2008

Abstract

Automatic speech recognition (ASR) is considerably harder in the meeting domain because of adverse acoustic conditions: more background noise, more echo and reverberation, and frequent cross-talk. Microphone arrays, combined with various beamforming algorithms, have been shown to boost ASR performance dramatically in such noisy and reverberant environments. However, almost all existing beamforming methods work in the acoustic domain, relying on signal processing theory and geometric reasoning. This limits their application and causes significant performance degradation when the geometric properties are unavailable or hard to estimate, or when heterogeneous channels exist in the audio system. In this paper, we present a new posterior-based approach to array-based speech recognition. The main idea is that, instead of enhancing the speech signals, we enhance the posterior probabilities that frames belong to recognition units, e.g., phones. These enhanced posteriors are then transformed into posterior-probability-based features and modeled by HMMs, leading to a tandem ANN-HMM hybrid system as presented by Hermansky et al. Experimental results demonstrate the validity of this posterior approach. With posterior accumulation or enhancement, significant improvement was achieved over the single-channel baseline. Moreover, acoustic enhancement and posterior enhancement can be combined, leading to a hybrid acoustic-posterior beamforming approach that works significantly better than acoustic beamforming alone, especially in the scenario with moving speakers.
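
The idea of enhancing posteriors rather than signals can be illustrated with a minimal sketch. The abstract does not state how per-channel posteriors are accumulated, so the combination rules below (a weighted arithmetic mean and a weighted geometric mean) are assumptions chosen for illustration, not the authors' method. The function names `combine_posteriors` and `to_tandem_features`, the `weights` parameter, and the example values are all hypothetical. The feature step shows only the log transform commonly used in tandem systems; any decorrelation step is omitted.

```python
import math


def combine_posteriors(channel_posteriors, weights=None, rule="sum"):
    """Combine per-channel phone posteriors for a single frame.

    channel_posteriors: one posterior vector per microphone channel.
    weights: hypothetical per-channel weights (uniform if omitted).
    rule: "sum" (weighted arithmetic mean) or "product" (weighted
          geometric mean). Both rules are assumptions for illustration.
    """
    n_channels = len(channel_posteriors)
    n_classes = len(channel_posteriors[0])
    if weights is None:
        weights = [1.0 / n_channels] * n_channels
    eps = 1e-10  # floor to avoid log(0)

    if rule == "sum":
        combined = [
            sum(w * p[k] for w, p in zip(weights, channel_posteriors))
            for k in range(n_classes)
        ]
    elif rule == "product":
        combined = [
            math.exp(sum(w * math.log(max(p[k], eps))
                         for w, p in zip(weights, channel_posteriors)))
            for k in range(n_classes)
        ]
    else:
        raise ValueError("rule must be 'sum' or 'product'")

    # Renormalize so the result is again a probability distribution.
    total = sum(combined)
    return [c / total for c in combined]


def to_tandem_features(posteriors, eps=1e-10):
    """Log-transform combined posteriors into features for an HMM back end."""
    return [math.log(max(p, eps)) for p in posteriors]


# Made-up posteriors over three phone classes from three channels.
frame = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
]
enhanced_sum = combine_posteriors(frame, rule="sum")
enhanced_product = combine_posteriors(frame, rule="product")
features = to_tandem_features(enhanced_sum)
```

Because the combination operates on classifier outputs, a sketch like this needs no knowledge of microphone positions, which is the property the abstract highlights for cases where geometry is unknown or channels are heterogeneous.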

ID: 154012