Edinburgh Research Explorer

On the Efficiency of Recurrent Neural Network Optimization Algorithms

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: OPT2015 Optimization for Machine Learning at the Neural Information Processing Systems Conference, 2015
Number of pages: 5
Publication status: Published - Dec 2015

Abstract

This study compares the sequential and parallel efficiency of training Recurrent Neural Networks (RNNs) with Hessian-free optimization versus a gradient descent variant. Experiments are performed using the long short-term memory (LSTM) architecture and the newly proposed multiplicative LSTM (mLSTM) architecture. The results yield a number of insights into these architectures and optimization algorithms, including that Hessian-free optimization has the potential for large efficiency gains in a highly parallel setup.
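
The abstract does not include code, but the primitive at the core of Hessian-free (truncated-Newton) optimization is the Hessian-vector product, which can be computed with two backward passes and without ever forming the Hessian (Pearlmutter's trick). The sketch below illustrates this in PyTorch; the function name and the toy single-layer model are illustrative assumptions, not taken from the paper.

    import torch

    def hessian_vector_product(loss, params, vec):
        # First backward pass, retaining the graph so we can
        # differentiate the gradient itself.
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # Scalar g^T v, summed over all parameter tensors.
        gv = sum((g * v).sum() for g, v in zip(grads, vec))
        # Second backward pass: d(g^T v)/d(theta) = H v.
        return torch.autograd.grad(gv, params)

    # Toy usage: a single linear map standing in for one RNN step
    # (hypothetical example, not the paper's experimental setup).
    w = torch.randn(3, 3, requires_grad=True)
    x, y = torch.randn(3), torch.randn(3)
    loss = ((w @ x - y) ** 2).sum()
    v = [torch.randn_like(w)]
    hv = hessian_vector_product(loss, [w], v)

Hessian-free methods call a product like this repeatedly inside a conjugate-gradient inner loop; because each product is a pair of independent backward passes over a (mini-)batch, it parallelizes well, which is consistent with the abstract's observation about highly parallel setups.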

