Fast Neural Machine Translation Implementation

Hieu Hoang, Tomasz Dwojak, Rihards Krislauks, Daniel Torregrosa, Kenneth Heafield

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper describes the submissions to the efficiency track for GPUs by members of the University of Edinburgh, Adam Mickiewicz University, Tilde and University of Alicante. We focus on efficient implementation of the recurrent deep-learning model as implemented in Amun, the fast inference engine for neural machine translation. We improve the performance with an efficient mini-batching algorithm, and by fusing the softmax operation with the k-best extraction algorithm.
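
The idea of fusing softmax with k-best extraction can be illustrated with a minimal CPU sketch. Because softmax is monotonic, the k most probable entries are the k largest logits, so a single pass can track both the normalizer and the k best candidates without materializing the full probability vector. This is an illustration under that assumption, not Amun's GPU kernel; the function name `fused_softmax_kbest` and its parameters are hypothetical.

```python
import heapq
import math


def fused_softmax_kbest(logits, k):
    """Return the k best (index, log_prob) pairs from one pass over logits.

    Hypothetical helper for illustration only; assumes a non-empty list
    of finite floats and k >= 1.
    """
    running_max = -math.inf  # largest logit seen so far
    running_sum = 0.0        # sum of exp(logit - running_max) seen so far
    best = []                # min-heap of (logit, index), at most k entries

    for i, x in enumerate(logits):
        # Update the log-sum-exp normalizer in a numerically stable way.
        if x > running_max:
            running_sum = running_sum * math.exp(running_max - x) + 1.0
            running_max = x
        else:
            running_sum += math.exp(x - running_max)

        # Track the k largest logits in the same pass.
        if len(best) < k:
            heapq.heappush(best, (x, i))
        elif x > best[0][0]:
            heapq.heapreplace(best, (x, i))

    log_z = running_max + math.log(running_sum)
    # Only the k survivors are normalized into log-probabilities.
    return [(i, x - log_z) for x, i in sorted(best, reverse=True)]
```

In beam search, a call such as `fused_softmax_kbest(scores, beam_size)` would return the candidates for one hypothesis, best first. The saving comes from normalizing only k values instead of the whole vocabulary.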
Original language: English
Title of host publication: The 2nd Workshop on Neural Machine Translation and Generation 2018
Place of Publication: Melbourne, Australia
Publisher: ACL Anthology
Pages: 116-121
Number of pages: 6
Publication status: Published - Jul 2018
Event: 2nd Workshop on Neural Machine Translation and Generation - Melbourne, Australia
Duration: 15 Jul 2018 to 20 Jul 2018
https://sites.google.com/site/wnmt18/home
https://sites.google.com/site/wnmt18/

Conference

Conference: 2nd Workshop on Neural Machine Translation and Generation
Abbreviated title: WNMT 2018
Country/Territory: Australia
City: Melbourne
Period: 15/07/18 to 20/07/18
