Abstract
This paper investigates the use of multitask learning to improve context-dependent deep neural network (DNN) acoustic models. The use of hybrid DNN systems with clustered triphone targets is now standard in automatic speech recognition. However, we suggest that using a single set of DNN targets in this manner may not be the most effective choice, since the targets are the result of a somewhat arbitrary clustering process that may not be optimal for discrimination. We propose to remedy this problem through the addition of secondary tasks predicting alternative context-dependent or context-independent targets. We present a comprehensive set of experiments on a lecture recognition task showing that DNNs trained through multitask learning in this manner give consistently improved performance compared to standard hybrid DNNs. The technique is evaluated across a range of data and output sizes. Improvements are seen when training uses the cross entropy criterion and also when sequence training is applied.
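As a rough illustration of the idea in the abstract, the sketch below combines a primary cross-entropy loss (for example, over clustered triphone targets) with a secondary one (for example, over context-independent targets) for a single frame. The abstract does not state how the task losses are combined, so the weighted sum, the `secondary_weight` parameter, and all function names here are assumptions for illustration only, not the authors' implementation. Both sets of logits are assumed to come from two output layers sharing the same hidden layers.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class index."""
    return -log_softmax(logits)[target]


def multitask_loss(primary_logits, primary_target,
                   secondary_logits, secondary_target,
                   secondary_weight=0.5):
    """Primary loss plus a weighted secondary loss.

    secondary_weight is a hypothetical hyperparameter; the abstract
    does not specify how the tasks are weighted.
    """
    primary = cross_entropy(primary_logits, primary_target)
    secondary = cross_entropy(secondary_logits, secondary_target)
    return primary + secondary_weight * secondary


# One frame: four primary classes, two secondary classes (toy sizes).
loss = multitask_loss([0.0, 0.0, 0.0, 0.0], 1, [0.0, 0.0], 0)
```

In a setup like this, the secondary output layer would typically be used only during training, with decoding relying on the primary outputs; that is a common multitask arrangement and is stated here as an assumption rather than a detail taken from the abstract.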
Original language | English |
---|---|
Pages (from-to) | 238 - 247 |
Number of pages | 10 |
Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Volume | 25 |
Issue number | 2 |
Early online date | 17 Nov 2016 |
DOIs | |
Publication status | Published - 1 Feb 2017 |
Title | Multitask Learning of Context-Dependent Targets in Deep Neural Network Acoustic Models |
Projects
- 2 Finished
- SUMMA - Scalable Understanding of Multilingual Media
Renals, S., Birch-Mayne, A. & Cohen, S.
1/02/16 → 31/01/19
Project: Research
Profiles
- Peter Bell
  - School of Informatics - Reader in Speech Technology
  - Institute of Language, Cognition and Computation
  - Centre for Speech Technology Research
  - Language, Interaction and Robotics
Person: Academic: Research Active