On Sparsifying Encoder Outputs in Sequence-to-Sequence Models

Biao Zhang, Ivan Titov, Rico Sennrich

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Sequence-to-sequence models usually transfer all encoder outputs to the decoder for generation. In this work, by contrast, we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding. We take Transformer as the test bed and introduce a layer of stochastic gates between the encoder and the decoder. The gates are regularized using the expected value of the sparsity-inducing L0 penalty, resulting in completely masking out a subset of encoder outputs. In other words, via joint training, the L0DROP layer forces Transformer to route information through a subset of its encoder states. We investigate the effects of this sparsification on two machine translation and two summarization tasks. Experiments show that, depending on the task, around 40–70% of source encodings can be pruned without significantly compromising quality. The reduced output length gives L0DROP the potential to improve decoding efficiency, yielding a speedup of up to 1.65× on document summarization and 1.20× on character-based machine translation against the standard Transformer. We analyze the behaviour of L0DROP and observe that it exhibits systematic preferences for pruning certain word types, e.g., function words and punctuation are pruned most. Inspired by these observations, we explore the feasibility of specifying rule-based patterns that mask out encoder outputs based on information such as part-of-speech tags, word frequency and word position.
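
A minimal sketch of the gating idea described in the abstract, assuming a hard-concrete-style relaxation of the L0 penalty (Louizos et al., 2018); the class and parameter names (L0Gate, scorer, beta, gamma, zeta) are illustrative and not taken from the authors' released implementation:

    import math
    import torch
    import torch.nn as nn


    class L0Gate(nn.Module):
        """Per-position stochastic gate over encoder outputs with an expected-L0 penalty."""

        def __init__(self, d_model: int, beta: float = 0.5,
                     gamma: float = -0.1, zeta: float = 1.1):
            super().__init__()
            self.scorer = nn.Linear(d_model, 1)  # maps each encoder state to a gate logit
            self.beta, self.gamma, self.zeta = beta, gamma, zeta

        def forward(self, enc_out: torch.Tensor):
            # enc_out: [batch, src_len, d_model]
            log_alpha = self.scorer(enc_out).squeeze(-1)  # [batch, src_len]
            if self.training:
                # Sample from the hard-concrete distribution (reparameterised, differentiable).
                u = torch.rand_like(log_alpha).clamp(1e-6, 1 - 1e-6)
                s = torch.sigmoid((u.log() - (1 - u).log() + log_alpha) / self.beta)
            else:
                # Deterministic gate at inference; stretched-and-clamped values can hit exact zero.
                s = torch.sigmoid(log_alpha)
            s_bar = s * (self.zeta - self.gamma) + self.gamma  # stretch beyond [0, 1]
            z = s_bar.clamp(0.0, 1.0)                          # hard zeros/ones at the ends

            # Expected L0 norm: probability that each gate is non-zero, summed over positions.
            l0_penalty = torch.sigmoid(
                log_alpha - self.beta * math.log(-self.gamma / self.zeta)
            ).sum()

            gated = enc_out * z.unsqueeze(-1)  # masked-out encoder states contribute nothing
            return gated, z, l0_penalty

During joint training, the penalty would be added to the task loss (e.g. loss = nll + lambda_l0 * l0_penalty), pushing the model to route decoding through a subset of encoder states; at inference, positions whose gate is exactly zero can be removed from the sequence passed to the decoder, which is where the reported speedups come from.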
Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Place of Publication: Online
Publisher: Association for Computational Linguistics
Pages: 2888-2900
Number of pages: 13
ISBN (Electronic): 978-1-954085-54-1
DOIs
Publication status: Published - 1 Aug 2021
Event: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing - Bangkok, Thailand
Duration: 1 Aug 2021 - 6 Aug 2021
https://2021.aclweb.org/

Conference

Conference: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing
Abbreviated title: ACL-IJCNLP 2021
Country/Territory: Thailand
City: Bangkok
Period: 1/08/21 - 6/08/21
Internet address: https://2021.aclweb.org/
