Abstract
Neural machine learning models can successfully model language that is similar to their training distribution, but they are highly susceptible to degradation under distribution shift, which occurs in many practical applications when processing out-of-domain (OOD) text. This has been attributed to "shortcut learning": relying on weak correlations over arbitrarily large contexts. We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to automatically disregard OOD context during inference, smoothly transitioning towards a less expressive but more robust model as the data becomes more OOD, while retaining its full context capability when operating in-distribution. We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.
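The OOD-detection idea named in the abstract can be illustrated with a small sketch of Random Network Distillation in general: a predictor is fitted to imitate a frozen, randomly initialised target network on in-distribution inputs, and its imitation error on a new input serves as an OOD score. This is not the authors' implementation. The layer sizes, the ridge-regression predictor (standing in for a gradient-trained network), the synthetic Gaussian data, and the `context_gate` function with its `temperature` parameter are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
IN_DIM, HIDDEN_DIM, OUT_DIM = 8, 32, 4  # illustrative sizes, not from the paper

# Target network: randomly initialised, then frozen (never trained).
W1 = rng.normal(size=(IN_DIM, HIDDEN_DIM)) / np.sqrt(IN_DIM)
W2 = rng.normal(size=(HIDDEN_DIM, OUT_DIM)) / np.sqrt(HIDDEN_DIM)


def target(x):
    return np.tanh(x @ W1) @ W2


def add_bias(x):
    return np.hstack([x, np.ones((x.shape[0], 1))])


def fit_predictor(x, ridge=1e-3):
    """Fit a linear predictor that imitates the target on in-distribution inputs.

    Assumption: closed-form ridge regression replaces a trained neural predictor.
    """
    xb = add_bias(x)
    gram = xb.T @ xb + ridge * np.eye(xb.shape[1])
    return np.linalg.solve(gram, xb.T @ target(x))


def rnd_score(x, weights):
    """Per-example mean squared error between predictor and target outputs."""
    return np.mean((add_bias(x) @ weights - target(x)) ** 2, axis=1)


def context_gate(score, temperature=0.1):
    """Hypothetical gate in [0, 1]: near 1 for low scores, decaying as they grow."""
    return np.exp(-score / temperature)


# Synthetic data: the "shifted" set has a different mean than the training data.
x_train = rng.normal(scale=0.3, size=(2000, IN_DIM))
x_in = rng.normal(scale=0.3, size=(500, IN_DIM))
x_ood = rng.normal(loc=3.0, scale=0.3, size=(500, IN_DIM))

weights = fit_predictor(x_train)
score_in = rnd_score(x_in, weights)
score_ood = rnd_score(x_ood, weights)
gate_in = context_gate(score_in)
gate_ood = context_gate(score_ood)

print("mean score, in-distribution:", score_in.mean())
print("mean score, shifted:", score_ood.mean())
```

In a language-model setting, a gate of this kind could in principle weight how much the model relies on its context representation, which is one possible reading of the smooth transition the abstract describes; the paper itself should be consulted for the actual mechanism.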
Original language | English |
---|---|
Title of host publication | Proceedings of the 7th Workshop on Representation Learning for NLP |
Editors | Spandana Gella, He He, Bodhisattwa Prasad Majumdar, Burcu Can, Eleonora Giunchiglia, Samuel Cahyawijaya, Sewon Min, Maximillian Mozes, Xiang Lorraine Li, Isabelle Augenstein, Anna Rogers, Kyunghyun Cho, Edward Grefenstette, Laura Rimell, Chris Dyer |
Place of Publication | Dublin, Ireland |
Publisher | Association for Computational Linguistics |
Pages | 1-8 |
Number of pages | 8 |
ISBN (Electronic) | 978-1-955917-48-3 |
DOIs | |
Publication status | Published - 3 Jun 2022 |
Event | The 7th Workshop on Representation Learning for NLP - Dublin, Ireland. Duration: 26 May 2022 → 26 May 2022. Conference number: 7. https://sites.google.com/view/repl4nlp2022/home |
Workshop
Workshop | The 7th Workshop on Representation Learning for NLP |
---|---|
Abbreviated title | Repl4NLP 2022 |
Country/Territory | Ireland |
City | Dublin |
Period | 26/05/22 → 26/05/22 |
Internet address | https://sites.google.com/view/repl4nlp2022/home |
Fingerprint
Dive into the research topics of 'Distributionally Robust Recurrent Decoders with Random Network Distillation'. Together they form a unique fingerprint.
Projects
- 4 Finished
- Global Under-Resourced MEdia Translation
  Birch-Mayne, A. (Principal Investigator) & Haddow, B. (Co-investigator)
  1/01/19 → 30/06/22
  Project: Research
- MTStretch: Low-resource Machine Translation
  Birch-Mayne, A. (Principal Investigator)
  29/06/18 → 28/12/21
  Project: Research
- BroadSem-Induction of Broad-Coverage Semantic Parsers
  Titov, I. (Principal Investigator)
  1/05/17 → 30/04/22
  Project: Research