A Deep Neural Network for Acoustic-Articulatory Speech Inversion

Benigno Uria, S. Renals, K. Richmond

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


In this work, we implement a deep belief network for the acoustic-articulatory inversion mapping problem. We find that adding up to three hidden layers improves inversion accuracy. We also show that this improvement is due to the higher expressive capability of a deep model and not a consequence of adding more adjustable parameters. Additionally, we show that unsupervised pretraining of the system improves its performance in all cases, even for a model with one hidden layer. Our implementation obtained an average root mean square error of 0.95 mm on the MNGU0 test dataset, beating all previously published results.
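As a rough illustration of the reported metric, the sketch below computes a root mean square error per articulatory channel and then averages across channels. The abstract does not state exactly how the average was taken, so this per-channel averaging is an assumption, and the function name `average_rmse` and the toy data are hypothetical.

```python
import math


def average_rmse(predicted, measured):
    """Average of per-channel RMSE values (assumed averaging scheme).

    predicted, measured: lists of frames; each frame is a list of
    channel values in millimetres, one value per articulatory channel.
    """
    if not predicted or len(predicted) != len(measured):
        raise ValueError("inputs must be non-empty and have the same number of frames")

    n_frames = len(predicted)
    n_channels = len(predicted[0])

    per_channel = []
    for c in range(n_channels):
        # Sum of squared errors for channel c over all frames.
        squared_error = sum((p[c] - m[c]) ** 2 for p, m in zip(predicted, measured))
        per_channel.append(math.sqrt(squared_error / n_frames))

    # Assumption: the reported figure averages RMSE across channels.
    return sum(per_channel) / n_channels


# Toy data (not from the paper): two frames, two channels.
predicted = [[1.0, 2.0], [3.0, 4.0]]
measured = [[0.0, 2.0], [4.0, 4.0]]
print(average_rmse(predicted, measured))
```

In this toy case the first channel is off by 1 mm in both frames and the second channel is exact, so the per-channel values are 1.0 and 0.0.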
Original language: English
Title of host publication: Proc. NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning
Publication status: Published - Dec 2011

