Edinburgh Research Explorer

Exploration by random network distillation

Research output: Contribution to conference › Paper

Open Access permissions: Open

Documents

https://openreview.net/forum?id=H1lJJnR5Ym
Original language: English
Number of pages: 17
Publication status: Published - 2019
Event: Seventh International Conference on Learning Representations - New Orleans, United States
Duration: 6 May 2019 – 9 May 2019
https://iclr.cc/

Conference

Conference: Seventh International Conference on Learning Representations
Abbreviated title: ICLR 2019
Country: United States
City: New Orleans
Period: 6/05/19 – 9/05/19
Internet address: https://iclr.cc/

Abstract

We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations given by a fixed randomly initialized neural network. We also introduce a method to flexibly combine intrinsic and extrinsic rewards. We find that the random network distillation (RND) bonus combined with this increased flexibility enables significant progress on several hard exploration Atari games. In particular, we establish state-of-the-art performance on Montezuma’s Revenge, a game famously difficult for deep reinforcement learning methods. To the best of our knowledge, this is the first method that achieves better than average human performance on this game without using demonstrations or having access to the underlying state of the game, and occasionally completes the first level.
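The core idea from the abstract can be sketched in a few lines: a fixed, randomly initialized "target" network maps observations to features; a trainable "predictor" network is regressed toward those features, and its prediction error serves as the intrinsic reward. The single-layer tanh networks, dimensions, and learning rate below are illustrative assumptions for a minimal numpy sketch, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 16

# Fixed, randomly initialized target network: never trained.
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))
# Trainable predictor network, initialized independently.
W_pred = rng.normal(size=(OBS_DIM, FEAT_DIM))

def target_features(obs):
    return np.tanh(obs @ W_target)

def rnd_bonus(obs):
    """Intrinsic reward: prediction error against the fixed target network."""
    err = np.tanh(obs @ W_pred) - target_features(obs)
    return float(np.mean(err ** 2))

def train_predictor(obs, lr=0.05):
    """One gradient step moving the predictor toward the target features."""
    global W_pred
    pre = obs @ W_pred
    err = np.tanh(pre) - target_features(obs)
    # Chain rule through the tanh nonlinearity.
    W_pred -= lr * np.outer(obs, err * (1.0 - np.tanh(pre) ** 2))

obs = rng.normal(size=OBS_DIM)
before = rnd_bonus(obs)
for _ in range(200):          # repeatedly "visit" the same observation
    train_predictor(obs)
after = rnd_bonus(obs)
```

After training, `after` is smaller than `before`: the bonus shrinks for frequently visited observations while remaining high for novel ones, which is what makes the prediction error usable as an exploration signal. In an agent, this bonus would be combined with the environment's extrinsic reward, e.g. as a weighted sum (the paper's flexible combination scheme is more involved than this).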

ID: 89498446