Dynamic Evaluation of Neural Sequence Models

Benjamin Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We explore dynamic evaluation, where sequence models are adapted to the recent sequence history using gradient descent, assigning higher probabilities to re-occurring sequential patterns. We develop a dynamic evaluation approach that outperforms existing adaptation approaches in our comparisons. We apply dynamic evaluation to outperform all previous word-level perplexities on the Penn Treebank and WikiText-2 datasets (achieving 51.1 and 44.3 respectively) and all previous character-level cross-entropies on the text8 and Hutter Prize datasets (achieving 1.19 bits/char and 1.08 bits/char respectively).
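
As a rough illustration of the idea described in the abstract, the following minimal Python sketch contrasts static evaluation (fixed weights) with dynamic evaluation (weights adapted by gradient descent to the recent history) on a toy bigram softmax model. The model, the function name `evaluate`, the segment length, the learning rate, and the plain SGD update are all assumptions made for this illustration; the abstract does not specify the models or the update rule used in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def evaluate(tokens, vocab_size, segment_len=4, lr=0.0):
    """Average negative log-likelihood (nats/token) of a toy bigram model.

    lr == 0.0 gives static evaluation. lr > 0.0 gives dynamic evaluation:
    each segment is scored first, then the weights take one SGD step on it.
    """
    # Hypothetical "pretrained" model: all-zero logits (uniform predictions).
    weights = [[0.0] * vocab_size for _ in range(vocab_size)]
    total_nll, count = 0.0, 0
    for start in range(1, len(tokens), segment_len):
        segment = range(start, min(start + segment_len, len(tokens)))
        grads = [[0.0] * vocab_size for _ in range(vocab_size)]
        # Score the segment with the current weights before adapting to it.
        for i in segment:
            prev, target = tokens[i - 1], tokens[i]
            probs = softmax(weights[prev])
            total_nll -= math.log(probs[target])
            count += 1
            # Cross-entropy gradient w.r.t. logits: probs - one_hot(target).
            for k in range(vocab_size):
                grads[prev][k] += probs[k] - (1.0 if k == target else 0.0)
        # Adapt to the segment just scored, so later segments can benefit.
        for r in range(vocab_size):
            for k in range(vocab_size):
                weights[r][k] -= lr * grads[r][k]
    return total_nll / count


# A sequence with a re-occurring pattern, as in the abstract's motivation.
tokens = [0, 1, 2, 3] * 10
static_nll = evaluate(tokens, vocab_size=4, lr=0.0)
dynamic_nll = evaluate(tokens, vocab_size=4, lr=0.5)
print(f"static: {static_nll:.3f} nats/token, dynamic: {dynamic_nll:.3f} nats/token")
```

Because every token is scored before the model is updated on it, the adapted model never sees a target in advance; it only gains from patterns that recur later in the sequence.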
Original language: English
Title of host publication: Proceedings of the 35th International Conference on Machine Learning
Editors: Jennifer Dy, Andreas Krause
Place of Publication: Stockholmsmässan, Stockholm, Sweden
Publisher: PMLR
Pages: 2766-2775
Number of pages: 10
Publication status: Published - 15 Jul 2018
Event: Thirty-fifth International Conference on Machine Learning - Stockholmsmässan, Stockholm, Sweden
Duration: 10 Jul 2018 to 15 Jul 2018
https://icml.cc/

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 80
ISSN (Electronic): 2640-3498

Conference

Conference: Thirty-fifth International Conference on Machine Learning
Abbreviated title: ICML 2018
Country/Territory: Sweden
City: Stockholm
Period: 10/07/18 to 15/07/18
