A Lightweight Recurrent Network for Sequence Modeling

Biao Zhang, Rico Sennrich

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Recurrent networks have achieved great success on various sequential tasks with the assistance of complex recurrent units, but suffer from severe computational inefficiency due to weak parallelization. One direction to alleviate this issue is to shift heavy computations outside the recurrence. In this paper, we propose a lightweight recurrent network, or LRN.

LRN uses input and forget gates to handle long-range dependencies as well as gradient vanishing and explosion, with all parameter-related calculations factored outside the recurrence. The recurrence in LRN only manipulates the weight assigned to each token, tightly connecting LRN with self-attention networks.

We apply LRN as a drop-in replacement of existing recurrent units in several neural sequential models. Extensive experiments on six NLP tasks show that LRN yields the best running efficiency with little or no loss in model performance.

Source code is available at https://github.com/bzhangGo/lrn.
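
The abstract does not reproduce the paper's equations, but its description is concrete enough to sketch: all parameter-related projections are computed for the whole sequence in parallel, and the recurrence that remains is purely elementwise, only reweighting how much of each token's value enters the state. Below is a minimal PyTorch sketch of such a unit, written as an illustrative assumption based on the abstract rather than the authors' reference implementation (see the linked repository for that); the exact gate equations, sigmoid gates derived from hypothetical query/key projections, are assumed.

```python
# Minimal sketch of a lightweight recurrent unit in the spirit of the
# abstract. NOT the authors' implementation (see
# https://github.com/bzhangGo/lrn); the gate forms below are assumed.
import torch
import torch.nn as nn


class LightweightRecurrentSketch(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # One fused projection applied to the whole sequence at once,
        # so every matrix multiply happens outside the recurrence.
        self.proj = nn.Linear(input_size, 3 * hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_size)
        q, k, v = self.proj(x).chunk(3, dim=-1)
        i_gate = torch.sigmoid(k)  # input gate (assumed form)
        f_gate = torch.sigmoid(q)  # forget gate (assumed form)
        h = torch.zeros_like(v[:, 0])
        outputs = []
        for t in range(x.size(1)):
            # The loop is parameter-free and elementwise: it only
            # rescales the weight given to each token's value v_t.
            h = f_gate[:, t] * h + i_gate[:, t] * v[:, t]
            outputs.append(torch.tanh(h))
        return torch.stack(outputs, dim=1)  # (batch, seq_len, hidden)


if __name__ == "__main__":
    lrn = LightweightRecurrentSketch(input_size=8, hidden_size=16)
    y = lrn(torch.randn(2, 5, 8))
    print(y.shape)  # torch.Size([2, 5, 16])
```

Because the loop body contains no matrix multiplication, the per-step cost is a handful of elementwise operations, which is consistent with the running-efficiency claim in the abstract.
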
Original language: English
Title of host publication: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Editors: Anna Korhonen, David Traum, Lluís Màrquez
Place of publication: Florence, Italy
Publisher: Association for Computational Linguistics (ACL)
Pages: 1538–1548
Number of pages: 11
Publication status: Published - 2 Aug 2019
Event: 57th Annual Meeting of the Association for Computational Linguistics, Fortezza da Basso, Florence, Italy
Duration: 28 Jul 2019 - 2 Aug 2019
Conference number: 57
http://www.acl2019.org/EN/index.xhtml

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics
Abbreviated title: ACL 2019
Country/Territory: Italy
City: Florence
Period: 28/07/19 - 02/08/19
Internet address: http://www.acl2019.org/EN/index.xhtml
