Abstract
Adapting acoustic models jointly to both speaker and environment has been shown to be effective. In many realistic scenarios, however, either the speaker or environment at test time might be unknown, or there may be insufficient data to learn a joint transform. Generating independent speaker and environment transforms improves the match of an acoustic model to unseen combinations. Using i-vectors, we demonstrate that it is possible to factorise speaker or environment information using multi-condition training with neural networks. Specifically, we extract bottleneck features from networks trained to classify either speakers or environments. We perform experiments on the Wall Street Journal corpus combined with environment noise from the Diverse Environments Multichannel Acoustic Noise Database. Using the factorised i-vectors we show improvements in word error rates on perturbed versions of the eval92 and dev93 test sets, both when one factor is missing and when the factors are seen but not in the desired combination.
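To illustrate the idea described in the abstract, the sketch below (PyTorch, not the authors' code) shows a small network trained to classify either speakers or acoustic environments from i-vectors, with a narrow bottleneck layer whose activations serve as a factorised representation. The layer sizes, i-vector dimension, label count and optimiser settings are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class BottleneckClassifier(nn.Module):
    """Classifier over i-vectors with a narrow bottleneck layer.

    One such network would be trained with speaker labels, another with
    environment labels; only the bottleneck activations are kept at test
    time as factorised features. All dimensions here are illustrative.
    """

    def __init__(self, ivector_dim=100, bottleneck_dim=30, num_classes=100):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(ivector_dim, 512), nn.ReLU(),
            nn.Linear(512, bottleneck_dim),          # bottleneck layer
        )
        self.classifier = nn.Sequential(
            nn.ReLU(),
            nn.Linear(bottleneck_dim, num_classes),  # speaker *or* environment labels
        )

    def forward(self, ivectors):
        z = self.encoder(ivectors)                   # factorised bottleneck features
        return self.classifier(z), z

# Minimal training-step sketch with cross-entropy over the class labels.
model = BottleneckClassifier()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(ivectors, labels):
    optimiser.zero_grad()
    logits, _ = model(ivectors)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimiser.step()
    return loss.item()
```

In this sketch, the bottleneck outputs of the speaker-trained and environment-trained networks would be concatenated and appended to the acoustic-model input features, allowing either factor to be supplied independently when the other is unknown.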
Original language | English |
---|---|
Title of host publication | Proceedings Interspeech 2017 |
Publisher | International Speech Communication Association |
Pages | 749-753 |
Number of pages | 5 |
DOIs | |
Publication status | Published - 24 Aug 2017 |
Event | Interspeech 2017, Stockholm, Sweden, 20 Aug 2017 → 24 Aug 2017, http://www.interspeech2017.org/ |
Publication series
Name | Interspeech |
---|---|
Publisher | International Speech Communication Association |
ISSN (Print) | 1990-9772 |
Conference
Conference | Interspeech 2017 |
---|---|
Country | Sweden |
City | Stockholm |
Period | 20/08/17 → 24/08/17 |
Internet address | http://www.interspeech2017.org/ |
Projects
2 Finished
- SUMMA - Scalable Understanding of Multilingual Media
  Renals, S., Birch-Mayne, A. & Cohen, S.
  1/02/16 → 31/01/19
  Project: Research
- Multi-domain speech recognition
  Non-EU industry, commerce and public corporations
  1/09/15 → 28/02/19
  Project: Research