Edinburgh Research Explorer

Dynamic Evaluation of Neural Sequence Models

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Original language: English
Title of host publication: Proceedings of the 35th International Conference on Machine Learning
Editors: Jennifer Dy, Andreas Krause
Place of Publication: Stockholmsmässan, Stockholm, Sweden
Publisher: PMLR
Pages: 2766-2775
Number of pages: 10
Volume: 80
Publication status: Published - 1 Oct 2018
Event: Thirty-fifth International Conference on Machine Learning - Stockholmsmässan, Stockholm, Sweden
Duration: 10 Jul 2018 to 15 Jul 2018
https://icml.cc/

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 80
ISSN (Electronic): 2640-3498

Conference

Conference: Thirty-fifth International Conference on Machine Learning
Abbreviated title: ICML 2018
Country: Sweden
City: Stockholm
Period: 10/07/18 to 15/07/18
Internet address: https://icml.cc/

Abstract

We explore dynamic evaluation, where sequence models are adapted to the recent sequence history using gradient descent, assigning higher probabilities to re-occurring sequential patterns. We develop a dynamic evaluation approach that outperforms existing adaptation approaches in our comparisons. We apply dynamic evaluation to outperform all previous word-level perplexities on the Penn Treebank and WikiText-2 datasets (achieving 51.1 and 44.3 respectively) and all previous character-level cross-entropies on the text8 and Hutter Prize datasets (achieving 1.19 bits/char and 1.08 bits/char respectively).
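
To make the idea in the abstract concrete, the following is a minimal sketch of dynamic evaluation on a toy model. It is an illustration under stated assumptions, not the approach developed in the paper: the model is a hypothetical unigram softmax over a small vocabulary, the update is plain SGD, and the names `evaluate`, `lr`, and `segment_len` are invented for this example. Each segment is scored with the current parameters first, and only then used for a gradient step, so the model never sees a token before predicting it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def evaluate(tokens, vocab_size, lr=0.0, segment_len=5):
    """Average negative log-likelihood (nats per token) of a toy unigram model.

    With lr == 0 this is ordinary static evaluation. With lr > 0 the logits
    take one SGD step after each segment has been scored (dynamic evaluation).
    Illustrative assumption: the real setting uses a neural sequence model.
    """
    logits = [0.0] * vocab_size  # start from a uniform distribution
    total_nll = 0.0
    for start in range(0, len(tokens), segment_len):
        segment = tokens[start:start + segment_len]
        probs = softmax(logits)
        # Score the segment with the parameters as they were before seeing it.
        total_nll += -sum(math.log(probs[t]) for t in segment)
        if lr > 0.0:
            # Gradient of the summed NLL w.r.t. the logits: n * probs - counts.
            grad = [len(segment) * p for p in probs]
            for t in segment:
                grad[t] -= 1.0
            logits = [w - lr * g for w, g in zip(logits, grad)]
    return total_nll / len(tokens)


# A sequence with a strongly re-occurring pattern.
tokens = [0, 0, 1, 0, 0] * 8
static_nll = evaluate(tokens, vocab_size=4, lr=0.0)
dynamic_nll = evaluate(tokens, vocab_size=4, lr=0.1)
print("static  perplexity:", math.exp(static_nll))
print("dynamic perplexity:", math.exp(dynamic_nll))
```

On this repetitive sequence the adapted model assigns higher probability to the recurring tokens, so its average loss is lower than the static model's. The metrics quoted in the abstract relate to this loss in the usual way: perplexity is the exponential of the average loss in nats, and bits per character is the average loss in nats divided by ln 2.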

ID: 64283093