Edinburgh's End-to-End Multilingual Speech Translation System for IWSLT 2021

Biao Zhang, Rico Sennrich

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper describes Edinburgh's submissions to the IWSLT 2021 multilingual speech translation (ST) task. We aim to improve multilingual translation and zero-shot performance in the constrained setting (without using any extra training data) through methods that encourage transfer learning and larger-capacity modeling with advanced neural components. We build our end-to-end multilingual ST model on the Transformer, integrating techniques including adaptive speech feature selection, language-specific modeling, multi-task learning, deep and big Transformer configurations, sparsified linear attention and root mean square layer normalization. We adopt data augmentation for ST using machine translation models, which converts the zero-shot problem into a zero-resource one. Experimental results show that these methods deliver substantial improvements, surpassing the official baseline by > 15 average BLEU and outperforming our cascading system by > 2 average BLEU. Our final submission achieves competitive performance (runner-up).
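Of the components named in the abstract, root mean square layer normalization (RMSNorm) has a compact formulation: inputs are re-scaled by their root-mean-square statistic and a learned gain, without the mean re-centering of standard LayerNorm. The sketch below is a minimal, generic PyTorch illustration of that technique, not the submission's actual implementation; the class name, parameter names, and the epsilon constant are illustrative assumptions.

import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root mean square layer normalization (illustrative sketch).

    Re-scales the input by its RMS over the feature dimension and a
    learned per-dimension gain; no mean subtraction is performed.
    """
    def __init__(self, dim: int, eps: float = 1e-8):
        super().__init__()
        self.eps = eps                              # small constant for numerical stability (assumed value)
        self.gain = nn.Parameter(torch.ones(dim))   # learned per-dimension gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # RMS computed over the last (feature) dimension
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).sqrt()
        return x / rms * self.gain

Usage note: such a module would typically replace LayerNorm inside each Transformer sub-layer; its appeal is a slightly cheaper computation with comparable training stability.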
Original language: English
Title of host publication: Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Place of publication: Bangkok, Thailand (online)
Publisher: Association for Computational Linguistics
Pages: 160-168
Number of pages: 9
ISBN (Electronic): 978-1-954085-74-9
DOIs
Publication status: Published - 5 Aug 2021
Event: 18th International Conference on Spoken Language Translation - Online
Duration: 5 Aug 2021 - 6 Aug 2021
https://iwslt.org/2021/

Conference

Conference: 18th International Conference on Spoken Language Translation
Abbreviated title: IWSLT 2021
Period: 5/08/21 - 6/08/21
Internet address: https://iwslt.org/2021/
