Multilingual bottleneck features for subword modeling in zero-resource languages

Enno Hermann, Sharon Goldwater

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

How can we effectively develop speech technology for languages where no transcribed data is available? Many existing approaches use no annotated resources at all, yet it makes sense to leverage information from large annotated corpora in other languages, for example in the form of multilingual bottleneck features (BNFs) obtained from a supervised speech recognition system. In this work, we evaluate the benefits of BNFs for subword modeling (feature extraction) in six unseen languages on a word discrimination task. First we establish a strong unsupervised baseline by combining two existing methods: vocal tract length normalisation (VTLN) and the correspondence autoencoder (cAE). We then show that BNFs trained on a single language already beat this baseline; including up to 10 languages results in additional improvements which cannot be matched by just adding more data from a single language. Finally, we show that the cAE can improve further on the BNFs if high-quality same-word pairs are available.
Index Terms: multilingual bottleneck features, subword modeling, unsupervised feature extraction, zero-resource speech technology
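
As a rough illustration of the correspondence autoencoder (cAE) mentioned in the abstract, below is a minimal training-step sketch, assuming PyTorch, 39-dimensional acoustic features, and DTW-aligned frame pairs drawn from discovered same-word pairs. The class name, layer sizes, and the random placeholder batches are hypothetical choices for illustration, not the authors' implementation; the key idea shown is that each frame is trained to reconstruct its aligned partner frame rather than itself.

import torch
import torch.nn as nn

class CorrespondenceAutoencoder(nn.Module):
    """Feed-forward autoencoder whose target is the aligned frame from a
    same-word pair, not the input frame itself (sizes are illustrative)."""
    def __init__(self, input_dim=39, hidden_dim=100, bottleneck_dim=39):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, bottleneck_dim), nn.Tanh(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = CorrespondenceAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Placeholder batches standing in for DTW-aligned frames from same-word pairs.
x_a = torch.randn(256, 39)  # source frames
x_b = torch.randn(256, 39)  # aligned partner frames (the reconstruction target)

loss = nn.functional.mse_loss(model(x_a), x_b)
opt.zero_grad()
loss.backward()
opt.step()

# After training, model.encoder(x) yields the learned subword representation;
# in the paper's setup the cAE input could be BNFs instead of raw features.
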
Original language: English
Title of host publication: Interspeech 2018
Number of pages: 5
Publication status: Accepted/In press - 3 Jun 2018
Event: Interspeech 2018 - Hyderabad International Convention Centre, Hyderabad, India
Duration: 2 Sep 2018 – 6 Sep 2018
http://interspeech2018.org/

Publication series

Name: Proc. Interspeech 2018
Publisher: ISCA
ISSN (Electronic): 1990-9772

Conference

Conference: Interspeech 2018
Country/Territory: India
City: Hyderabad
Period: 2/09/18 – 6/09/18
Internet address: http://interspeech2018.org/
