Edinburgh Research Explorer

Windowed Attention Mechanisms for Speech Recognition

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Original languageEnglish
Title of host publicationICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Place of PublicationBrighton, United Kingdom
PublisherInstitute of Electrical and Electronics Engineers (IEEE)
Pages7100-7104
Number of pages5
DOIs
Publication statusE-pub ahead of print - 17 Apr 2019
Event44th International Conference on Acoustics, Speech, and Signal Processing: Signal Processing: Empowering Science and Technology for Humankind - Brighton , United Kingdom
Duration: 12 May 201917 May 2019
Conference number: 44
https://2019.ieeeicassp.org/

Conference

Conference44th International Conference on Acoustics, Speech, and Signal Processing
Abbreviated titleICASSP 2019
CountryUnited Kingdom
CityBrighton
Period12/05/1917/05/19
Internet address

Abstract

The usual attention mechanisms used for encoder-decoder models do not constrain the relationship between input and output sequences to be monotonic. To address this we explore windowed attention mechanisms which restrict attention to a block of source hidden states. Rule-based windowing restricts attention to a (typically large) fixed-length window. The performance of such methods is poor if the window size is small. In this paper, we propose a fully-trainable windowed attention and provide a detailed analysis on the factors which affect the performance of such an attention mechanism. Compared to the rule-based window methods, the learned window size is significantly smaller yet the model's performance is competitive. On the TIMIT corpus this approach has resulted in a 17% (relative) performance improvement over the traditional attention model. Our model also yields comparable accuracies to the joint CTC-attention model on the Wall Street Journal corpus.

    Research areas

  • End-to-end, Speech recognition, Attention

Download statistics

No data available

ID: 80931062