Modelling acoustic feature dependencies with artificial neural networks: Trajectory-RNADE

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Given a transcription, sampling from a good model of acoustic feature trajectories should result in plausible realizations of an utterance. However, samples from current probabilistic speech synthesis systems yield low-quality synthetic speech. Henter et al. have demonstrated that the dependencies between acoustic features, conditioned on the phonetic labels, must be captured in order to obtain high-quality synthetic speech. These dependencies are often ignored in neural-network-based acoustic models. We tackle this deficiency by introducing trajectory RNADE, a probabilistic neural network model of acoustic trajectories that is able to capture these dependencies.
Original language: English
Title of host publication: Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Publisher: IEEE Signal Processing Society Press
Number of pages: 5
Publication status: Published - 2015
