Abstract
This paper proposes a simple yet effective model-based neural network speaker adaptation technique that learns speaker-specific hidden unit contributions given adaptation data, without requiring any form of speaker-adaptive training or labelled adaptation data. An additional amplitude parameter is defined for each hidden unit; the amplitude parameters are tied for each speaker and are learned using unsupervised adaptation. We conducted experiments on the TED talks data, as used in the International Workshop on Spoken Language Translation (IWSLT) evaluations. Our results indicate that the approach can reduce word error rates on standard IWSLT test sets by about 8–15% relative compared to unadapted systems, with a further reduction of 4–6% relative when combined with feature-space maximum likelihood linear regression (fMLLR). The approach can be employed in most existing feed-forward neural network architectures, and we report results using various hidden unit activation functions: sigmoid, maxout, and rectified linear units (ReLU).
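The adaptation scheme the abstract describes can be sketched as a hidden layer whose units are rescaled by a speaker-dependent amplitude vector. This is a minimal illustrative sketch, not the authors' implementation: the function names are hypothetical, and the re-parameterisation of the amplitude as `2 * sigmoid(r)` (bounding it to (0, 2)) is one common choice, assumed here for concreteness.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lhuc_layer(x, W, b, r):
    """Hidden layer with learned hidden unit contributions (sketch).

    W, b are the speaker-independent weights and biases; r holds one
    scalar per hidden unit, tied across all frames of one speaker.
    Only r would be updated during unsupervised adaptation, using
    targets from a first-pass decoding; W and b stay fixed.
    """
    h = sigmoid(x @ W + b)   # speaker-independent hidden activations
    a = 2.0 * sigmoid(r)     # per-unit amplitude in (0, 2)
    return a * h             # rescaled, speaker-adapted activations

# With r = 0 every amplitude is exactly 1, so the adapted layer
# reproduces the unadapted speaker-independent network.
```

Because the amplitudes multiply the activations element-wise, the same idea applies to other activation functions (maxout, ReLU) by replacing `sigmoid` in the hidden layer while keeping the amplitude re-scaling unchanged.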
| Original language | English |
|---|---|
| Title of host publication | Spoken Language Technology Workshop (SLT), 2014 IEEE |
| Publisher | Institute of Electrical and Electronics Engineers |
| Pages | 171-176 |
| Number of pages | 6 |
| ISBN (Print) | 978-1-4799-7129-9 |
| DOIs | |
| Publication status | Published - 2014 |
Projects
1 Finished
Natural Speech Technology
Renals, S. (Principal Investigator) & King, S. (Co-investigator)
1/05/11 → 31/07/16
Project: Research