Edinburgh Research Explorer

A Deep Neural Network for Acoustic-Articulatory Speech Inversion

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Proc. NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning
Publication status: Published - Dec 2011


In this work, we implement a deep belief network for the acoustic-articulatory inversion mapping problem. We find that adding up to three hidden layers improves inversion accuracy. We also show that this improvement is due to the higher expressive capability of a deep model and is not a consequence of adding more adjustable parameters. Additionally, we show that unsupervised pretraining of the system improves its performance in all cases, even for a model with a single hidden layer. Our implementation obtained an average root mean square error of 0.95 mm on the MNGU0 test dataset, beating all previously published results.
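
The error figure quoted in the abstract can be illustrated with a minimal sketch of an average root mean square error. It assumes the metric is computed per articulatory channel over all test frames and then averaged across channels; the abstract does not spell out this procedure, and the function names and the list-of-frames data layout below are hypothetical choices made for illustration only.

```python
import math


def rmse_per_channel(predicted, measured):
    """Return one RMSE value per articulatory channel.

    predicted, measured: lists of frames, each frame a list of channel
    values in mm (a hypothetical layout chosen for this sketch).
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    n_frames = len(predicted)
    n_channels = len(predicted[0])
    squared_error_sums = [0.0] * n_channels
    for p_frame, m_frame in zip(predicted, measured):
        for c in range(n_channels):
            diff = p_frame[c] - m_frame[c]
            squared_error_sums[c] += diff * diff
    # Mean over frames, then square root, separately for each channel.
    return [math.sqrt(s / n_frames) for s in squared_error_sums]


def average_rmse(predicted, measured):
    """Average the per-channel RMSE values (assumed aggregation)."""
    per_channel = rmse_per_channel(predicted, measured)
    return sum(per_channel) / len(per_channel)


# Toy data, not taken from MNGU0: two frames, two channels.
toy_predicted = [[1.0, 2.0], [3.0, 4.0]]
toy_measured = [[1.0, 0.0], [3.0, 0.0]]
toy_result = average_rmse(toy_predicted, toy_measured)
```

With this toy input the first channel has zero error and the second has errors of 2 mm and 4 mm, so the per-channel values are 0 and the square root of 10, and their average is half the square root of 10.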

ID: 4929931