Windowed Attention Mechanisms for Speech Recognition

Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


The usual attention mechanisms used for encoder-decoder models do not constrain the relationship between input and output sequences to be monotonic. To address this, we explore windowed attention mechanisms, which restrict attention to a block of source hidden states. Rule-based windowing restricts attention to a (typically large) fixed-length window, and the performance of such methods is poor if the window size is small. In this paper, we propose a fully trainable windowed attention mechanism and provide a detailed analysis of the factors that affect its performance. Compared to the rule-based window methods, the learned window size is significantly smaller, yet the model's performance is competitive. On the TIMIT corpus, this approach yields a 17% (relative) performance improvement over the traditional attention model. Our model also yields accuracy comparable to that of the joint CTC-attention model on the Wall Street Journal corpus.
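The rule-based windowing described above can be illustrated with a minimal sketch. This is not the authors' implementation and not the trainable mechanism proposed in the paper: it only shows a fixed-length window, and it assumes the window centre is already given. The function names `windowed_attention` and `context_vector`, and the parameters `center` and `half_width`, are hypothetical names chosen for illustration.

```python
import math


def windowed_attention(scores, center, half_width):
    """Softmax over scores restricted to [center - half_width, center + half_width].

    Assumption: `center` is supplied by the caller; how it is chosen is not
    specified here. Positions outside the window get exactly zero weight.
    """
    n = len(scores)
    lo = max(0, center - half_width)
    hi = min(n - 1, center + half_width)
    if lo > hi:
        raise ValueError("window does not overlap the sequence")
    # Subtract the in-window maximum for numerical stability.
    window = scores[lo:hi + 1]
    peak = max(window)
    exps = [math.exp(s - peak) for s in window]
    total = sum(exps)
    weights = [0.0] * n
    for offset, e in enumerate(exps):
        weights[lo + offset] = e / total
    return weights


def context_vector(weights, hidden_states):
    """Weighted sum of source hidden states (each a list of floats)."""
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]


# Illustrative values only: six source positions, window of width 3 around index 2.
scores = [0.1, 2.0, 0.3, 1.5, 0.2, 0.9]
hidden_states = [[float(i), 1.0] for i in range(len(scores))]
weights = windowed_attention(scores, center=2, half_width=1)
context = context_vector(weights, hidden_states)
```

With a small `half_width`, only a few hidden states can contribute to the context vector, which is one way to see why a fixed window that is too narrow can hurt performance when the centre is misplaced.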
Original language: English
Title of host publication: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Place of Publication: Brighton, United Kingdom
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 5
ISBN (Electronic): 978-1-4799-8131-1
ISBN (Print): 978-1-4799-8132-8
Publication status: E-pub ahead of print - 17 Apr 2019
Event: 44th International Conference on Acoustics, Speech, and Signal Processing: Signal Processing: Empowering Science and Technology for Humankind - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019
Conference number: 44

Publication series

ISSN (Print): 1520-6149
ISSN (Electronic): 2379-190X


Conference: 44th International Conference on Acoustics, Speech, and Signal Processing
Abbreviated title: ICASSP 2019
Country/Territory: United Kingdom


  • End-to-end
  • Speech recognition
  • Attention

