Merlin: An Open Source Neural Network Speech Synthesis System

Zhizheng Wu, Oliver Watts, Simon King

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We introduce the Merlin speech synthesis toolkit for neural network-based speech synthesis. The system takes linguistic features as input, and employs neural networks to predict acoustic features, which are then passed to a vocoder to produce the speech waveform. Various neural network architectures are implemented, including a standard feedforward neural network, mixture density neural network, recurrent neural network (RNN), and long short-term memory (LSTM) recurrent neural network, amongst others. The toolkit is Open Source, written in Python, and extensible. This paper briefly describes the system and provides some benchmarking results on a freely available corpus.
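The mapping described in the abstract (frames of linguistic features in, frames of acoustic features out) can be illustrated with a minimal sketch. This is not Merlin's API: the function names, layer sizes, and random untrained weights below are all assumptions chosen for illustration, and the vocoder stage that would turn acoustic features into a waveform is left out.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (hypothetical helper)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one fully connected layer to a single feature vector."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out


def predict_acoustic(linguistic_frames, hidden, output):
    """Map each frame of linguistic features to one frame of acoustic features."""
    return [dense(dense(frame, hidden, math.tanh), output) for frame in linguistic_frames]


rng = random.Random(0)

# Illustrative sizes only; real systems use far larger feature vectors.
N_LINGUISTIC, N_HIDDEN, N_ACOUSTIC = 8, 16, 4

hidden = init_layer(N_LINGUISTIC, N_HIDDEN, rng)   # tanh hidden layer
output = init_layer(N_HIDDEN, N_ACOUSTIC, rng)     # linear output layer

# Three frames of made-up binary linguistic features.
frames = [[float(rng.randint(0, 1)) for _ in range(N_LINGUISTIC)] for _ in range(3)]

acoustic = predict_acoustic(frames, hidden, output)
print(len(acoustic), len(acoustic[0]))  # number of frames, acoustic dimensions per frame
```

A feedforward network of this kind treats every frame independently; the recurrent and LSTM variants named in the abstract would instead carry state from one frame to the next.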
Original language: English
Title of host publication: 9th ISCA Speech Synthesis Workshop (2016)
Pages: 202-207
Number of pages: 6
Publication status: Published - 15 Sep 2016
Event: 9th ISCA Speech Synthesis Workshop - Sunnyvale, United States
Duration: 13 Sep 2016 to 15 Sep 2016
http://ssw9.talp.cat/

Conference

Conference: 9th ISCA Speech Synthesis Workshop
Abbreviated title: ISCA 2016
Country/Territory: United States
City: Sunnyvale
Period: 13/09/16 to 15/09/16