Neural networks for distant speech recognition

Steve Renals, Pawel Swietojanski

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Distant conversational speech recognition is challenging owing to the presence of multiple, overlapping talkers, additional non-speech acoustic sources, and the effects of reverberation. In this paper we review work on distant speech recognition, with an emphasis on approaches which combine multichannel signal processing with acoustic modelling, and investigate the use of hybrid neural network / hidden Markov model acoustic models for distant speech recognition of meetings recorded using microphone arrays. In particular we investigate the use of convolutional and fully-connected neural networks with different activation functions (sigmoid, rectified linear, and maxout). We performed experiments on the AMI and ICSI meeting corpora, with results indicating that neural network models are capable of significant improvements in accuracy compared with discriminatively trained Gaussian mixture models.
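The abstract compares activation functions including maxout, which is less widely known than sigmoid or rectified linear units. As a hypothetical illustration (not the authors' implementation), a maxout unit takes the maximum over k affine "pieces" of its input; a minimal pure-Python sketch:

```python
def maxout(x, W, b):
    """Maxout activation: each output unit is the max over k affine pieces.

    x: input vector (length in_dim)
    W: list of k weight matrices, each in_dim x out_dim
    b: list of k bias vectors, each length out_dim
    Returns a vector of length out_dim.
    """
    k = len(W)
    out_dim = len(b[0])
    # Compute each of the k affine pieces z_j = x W_j + b_j
    pieces = [
        [sum(x[i] * W[j][i][o] for i in range(len(x))) + b[j][o]
         for o in range(out_dim)]
        for j in range(k)
    ]
    # Elementwise maximum across the k pieces
    return [max(pieces[j][o] for j in range(k)) for o in range(out_dim)]

# Tiny worked example: in_dim=2, out_dim=1, k=2 pieces
x = [1.0, 2.0]
W = [[[0.5], [-1.0]],   # piece 0 weights
     [[1.0], [0.0]]]    # piece 1 weights
b = [[0.0], [-0.5]]     # one bias per piece
y = maxout(x, W, b)     # piece 0 gives -1.5, piece 1 gives 0.5 -> [0.5]
```

Unlike sigmoid or ReLU, maxout learns its own piecewise-linear activation shape, at the cost of k times more parameters per unit.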
Original language: English
Title of host publication: Proceedings 2014 Workshop on Hands-Free Speech Communication and Microphone Arrays
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 172-176
Number of pages: 5
DOIs
Publication status: Published - 2014
