Edinburgh Research Explorer

Multiplicative LSTM for sequence modelling

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Original language: English
Title of host publication: International Conference on Learning Representations - ICLR 2017 - Workshop Track
Number of pages: 9
Publication status: Published - 26 Apr 2017
Event: 5th International Conference on Learning Representations - Palais des Congrès Neptune, Toulon, France
Duration: 24 Apr 2017 to 26 Apr 2017


Conference: 5th International Conference on Learning Representations
Abbreviated title: ICLR 2017


We introduce multiplicative LSTM (mLSTM), a novel recurrent neural network architecture for sequence modelling that combines the long short-term memory (LSTM) and multiplicative recurrent neural network architectures. mLSTM is characterised by its ability to have different recurrent transition functions for each possible input, which we argue makes it more expressive for autoregressive density estimation. We demonstrate empirically that mLSTM outperforms standard LSTM and its deep variants on a range of character-level modelling tasks, and that this improvement increases with the complexity of the task. The model achieves a test error of 1.19 bits/character on the last 4 million characters of the Hutter Prize dataset when combined with dynamic evaluation.
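
The abstract does not state the update equations, so the following is only a minimal sketch of one way an input-dependent recurrent transition can be combined with LSTM gating. It assumes an intermediate state m formed as the elementwise product of a projection of the input and a projection of the previous hidden state, which then replaces the hidden state in the usual LSTM gate computations. Biases are omitted, and all names (`init_params`, `mlstm_step`, the `W_*` keys) are hypothetical and chosen for illustration.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add(a, b):
    """Elementwise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def mul(a, b):
    """Elementwise (Hadamard) product of two vectors."""
    return [x * y for x, y in zip(a, b)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh(v):
    return [math.tanh(x) for x in v]


def init_params(n_in, n_hid, seed=0):
    """Small random weights; the parameter layout is an assumption of this sketch."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {"W_mx": mat(n_hid, n_in), "W_mh": mat(n_hid, n_hid)}
    for gate in ("h", "i", "f", "o"):
        params["W_" + gate + "x"] = mat(n_hid, n_in)
        params["W_" + gate + "m"] = mat(n_hid, n_hid)
    return params


def mlstm_step(params, x, h_prev, c_prev):
    """One recurrent step; returns the new hidden state and cell state."""
    # Intermediate state: the input rescales the projected hidden state
    # elementwise, so each input induces a different hidden-to-hidden transition.
    m = mul(matvec(params["W_mx"], x), matvec(params["W_mh"], h_prev))

    def pre(gate):
        # Pre-activation computed from the input and m (not from h_prev directly).
        return add(matvec(params["W_" + gate + "x"], x),
                   matvec(params["W_" + gate + "m"], m))

    h_hat = tanh(pre("h"))   # candidate update
    i = sigmoid(pre("i"))    # input gate
    f = sigmoid(pre("f"))    # forget gate
    o = sigmoid(pre("o"))    # output gate
    c = add(mul(f, c_prev), mul(i, h_hat))
    h = mul(tanh(c), o)
    return h, c
```

Calling `mlstm_step` with the same previous state but two different one-hot input vectors yields different hidden states, because the input changes how the previous hidden state enters every gate as well as contributing its own additive term.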



ID: 31851610