A Bio-inspired Reinforcement Learning Rule to Optimise Dynamical Neural Networks for Robot Control

Tianqi Wei, Barbara Webb

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Most approaches for optimisation of neural networks are based on variants of back-propagation. This requires the network to be time invariant and differentiable; neural networks with dynamics are thus generally outside the scope of these methods. Biological neural circuits are highly dynamic yet clearly able to support learning. We propose a reinforcement learning approach inspired by the mechanisms and dynamics of biological synapses. The network weights undergo spontaneous fluctuations, and a reward signal modulates the centre and amplitude of fluctuations to converge to a desired network behaviour. We test the new learning rule on a 2D bipedal walking simulation, using a control system that combines a recurrent neural network, a bio-inspired central pattern generator layer and proportional-integral control, and demonstrate the first successful solution to this benchmark task.
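The abstract's core idea (weights that fluctuate spontaneously, with a reward signal adjusting the centre and amplitude of each fluctuation) can be illustrated with a minimal sketch. This is not the authors' learning rule: the function name `fluctuation_learning_sketch`, the running reward baseline, every numeric constant, and the toy quadratic reward (which stands in for the walking simulation) are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def fluctuation_learning_sketch(target, steps=2000, seed=0):
    """Toy reward-modulated weight fluctuation (hypothetical, not the paper's rule)."""
    target = np.asarray(target, dtype=float)
    rng = np.random.default_rng(seed)
    centre = np.zeros_like(target)          # centre of each weight's fluctuation
    amplitude = np.full_like(target, 0.5)   # amplitude of each weight's fluctuation
    baseline = None                         # running estimate of the expected reward

    for _ in range(steps):
        # Spontaneous fluctuation: sample the weights around their centres.
        weights = centre + amplitude * rng.standard_normal(target.shape)

        # Assumed toy reward: negative squared distance to a target weight vector.
        reward = -np.sum((weights - target) ** 2)
        if baseline is None:
            baseline = reward

        if reward > baseline:
            # Better than expected: pull the centre towards the sampled
            # weights and shrink the fluctuation.
            centre += 0.2 * (weights - centre)
            amplitude *= 0.98
        else:
            # Not better than expected: widen the fluctuation to explore more.
            amplitude *= 1.005

        # Keep the amplitude in an assumed range so exploration never stops.
        amplitude = np.clip(amplitude, 0.01, 1.0)

        # Update the running reward baseline.
        baseline += 0.1 * (reward - baseline)

    return centre, amplitude


if __name__ == "__main__":
    goal = np.array([1.0, -0.5, 0.8])
    centre, amplitude = fluctuation_learning_sketch(goal)
    print("squared error of centre:", np.sum((centre - goal) ** 2))
```

Because the update uses only a scalar reward and sampled weights, the sketch needs no gradients, which is the property the abstract highlights for networks with dynamics.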
Original language: English
Title of host publication: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018)
Place of Publication: Madrid, Spain
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 556-561
Number of pages: 6
ISBN (Electronic): 978-1-5386-8094-0
ISBN (Print): 978-1-5386-8095-7
Publication status: Published - 7 Jan 2019
Event: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems - Madrid, Spain
Duration: 1 Oct 2018 to 5 Oct 2018
https://www.iros2018.org/

Publication series

Publisher: IEEE
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems
Abbreviated title: IROS 2018
Country/Territory: Spain
City: Madrid
Period: 1/10/18 to 5/10/18