Abstract
We explore dynamic evaluation, where sequence models are adapted to the recent sequence history using gradient descent, assigning higher probabilities to re-occurring sequential patterns. We develop a dynamic evaluation approach that outperforms existing adaptation approaches in our comparisons. We apply dynamic evaluation to outperform all previous word-level perplexities on the Penn Treebank and WikiText-2 datasets (achieving 51.1 and 44.3 respectively) and all previous character-level cross-entropies on the text8 and Hutter Prize datasets (achieving 1.19 bits/char and 1.08 bits/char respectively).
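The core idea in the abstract, adapting a model to recent history by gradient descent while it is being evaluated, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: a toy unigram softmax model, a plain SGD update, and the hypothetical names `evaluate`, `seg_len`, and `lr`. It is not the refined update rule or the recurrent models used in the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def segment_loss_and_grad(logits, segment):
    """Summed negative log-likelihood (nats) of a segment under a toy
    unigram softmax model, plus its gradient with respect to the logits."""
    probs = softmax(logits)
    loss = -sum(math.log(probs[t]) for t in segment)
    grad = [len(segment) * p for p in probs]
    for t in segment:
        grad[t] -= 1.0
    return loss, grad

def evaluate(logits, tokens, seg_len=5, lr=0.0):
    """Average NLL per token. lr=0 gives static evaluation; lr>0 takes
    one SGD step on each segment after that segment has been scored."""
    theta = list(logits)  # work on a copy; the caller's parameters stay fixed
    total = 0.0
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        loss, grad = segment_loss_and_grad(theta, segment)
        total += loss  # score first, so only past tokens inform the prediction
        theta = [w - lr * g for w, g in zip(theta, grad)]  # then adapt
    return total / len(tokens)

# Hypothetical "trained" model: uniform over a 4-token vocabulary.
trained_logits = [0.0, 0.0, 0.0, 0.0]
# A test sequence dominated by one re-occurring token.
test_tokens = [2] * 40

static_nll = evaluate(trained_logits, test_tokens, lr=0.0)
dynamic_nll = evaluate(trained_logits, test_tokens, lr=0.1)
```

In this toy setting the static model pays the same cost for every token, while the adapted model assigns the repeated token a higher probability after each segment, so `dynamic_nll` ends up below `static_nll`. Scoring each segment before updating on it keeps the evaluation fair, since no token is predicted using parameters that have already seen it.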
Original language | English |
---|---|
Title of host publication | Proceedings of the 35th International Conference on Machine Learning |
Editors | Jennifer Dy, Andreas Krause |
Place of Publication | Stockholmsmässan, Stockholm, Sweden |
Publisher | PMLR |
Pages | 2766-2775 |
Number of pages | 10 |
Publication status | Published - 15 Jul 2018 |
Event | Thirty-fifth International Conference on Machine Learning - Stockholmsmässan, Stockholm, Sweden. Duration: 10 Jul 2018 → 15 Jul 2018. https://icml.cc/ |
Publication series
Name | Proceedings of Machine Learning Research |
---|---|
Publisher | PMLR |
Volume | 80 |
ISSN (Electronic) | 2640-3498 |
Conference
Conference | Thirty-fifth International Conference on Machine Learning |
---|---|
Abbreviated title | ICML 2018 |
Country/Territory | Sweden |
City | Stockholm |
Period | 10/07/18 → 15/07/18 |
Internet address | https://icml.cc/ |
Fingerprint
Dive into the research topics of 'Dynamic Evaluation of Neural Sequence Models'. Together they form a unique fingerprint.
Projects
- 1 Finished
- SUMMA - Scalable Understanding of Multilingual Media
Renals, S., Birch-Mayne, A. & Cohen, S.
1/02/16 → 31/01/19
Project: Research