Abstract
Deep neural networks have advanced the state of the art in automatic speech recognition when combined with hidden Markov models (HMMs). Recently there has been interest in using systems based on recurrent neural networks (RNNs) to perform sequence modelling directly, without the requirement of an HMM superstructure. In this paper, we study the RNN encoder-decoder approach for large vocabulary end-to-end speech recognition, whereby an encoder transforms a sequence of acoustic vectors into a sequence of feature representations, from which a decoder recovers a sequence of words. We investigated this approach on the Switchboard corpus using a training set of around 300 hours of transcribed audio data. Without the use of an explicit language model or pronunciation lexicon, we achieved promising recognition accuracy, demonstrating that this approach warrants further investigation.
Index Terms: end-to-end speech recognition, deep neural networks, recurrent neural networks, encoder-decoder.
| Original language | English |
|---|---|
| Title of host publication | INTERSPEECH 2015 16th Annual Conference of the International Speech Communication Association |
| Pages | 3249-3253 |
| Number of pages | 5 |
| Publication status | Published - Sept 2015 |
| Title | A Study of the Recurrent Neural Network Encoder-Decoder for Large Vocabulary Speech Recognition |

Projects
- Natural Speech Technology (Finished)
  Renals, S. (Principal Investigator) & King, S. (Co-investigator)
  1/05/11 → 31/07/16
  Project: Research