Edinburgh Research Explorer

Factorized context modelling for Text-to-Speech synthesis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Original language: English
Title of host publication: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 7849-7853
Number of pages: 5
ISBN (Print): 978-1-4799-0356-6
Publication status: Published - 2013

Abstract

Because speech units are highly context-dependent, HMM-based Text-to-Speech (TTS) synthesis systems generally use a large number of linguistic context features, via context-dependent models. Since it is impossible to train a separate model for every context, decision trees are used to discover the most important combinations of features that should be modelled. The task of the decision tree is very hard: it must generalize from a very small observed part of the context feature space to the rest. Decision trees also have a major weakness: because they subdivide the model space based on one feature at a time, they cannot directly take advantage of factorial properties. We propose a Dynamic Bayesian Network (DBN) based Mixed Memory Markov Model (MMMM) to provide factorization of the context space. The results of a listening test are provided as evidence that the model successfully learns the factorial nature of this space.
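
The contrast between covering every full context and factorizing the context space can be illustrated with a toy sketch. It assumes the general mixed-memory idea of approximating a distribution conditioned on several variables by a convex combination of distributions each conditioned on a single variable. The abstract does not give the paper's actual parameterization, so the feature names, mixture weights, probability tables and discrete outcomes below are all hypothetical and chosen only for illustration.

```python
# Toy illustration only: all features, weights and tables are hypothetical,
# and the discrete outcomes stand in for whatever the real model predicts.

FEATURES = {
    "phone": ["a", "b", "c"],
    "stress": ["stressed", "unstressed"],
    "position": ["initial", "medial", "final"],
}


def num_full_contexts(features):
    """Count distinct full contexts (one per combination of feature values)."""
    n = 1
    for values in features.values():
        n *= len(values)
    return n


def num_factor_contexts(features):
    """Count single-feature conditions a factorized model has to cover."""
    return sum(len(values) for values in features.values())


def mixture_prob(outcome, context, weights, tables):
    """P(outcome | context) as a weighted sum of per-feature conditionals."""
    return sum(
        weights[name] * tables[name][context[name]][outcome] for name in weights
    )


# Mixture weights are non-negative and sum to one (assumed values).
WEIGHTS = {"phone": 0.5, "stress": 0.3, "position": 0.2}

# One small conditional table per feature value, uniform by default.
TABLES = {
    name: {value: {"low": 0.5, "high": 0.5} for value in values}
    for name, values in FEATURES.items()
}
# Pretend training only taught us something about stressed units.
TABLES["stress"]["stressed"] = {"low": 0.1, "high": 0.9}

print(num_full_contexts(FEATURES))    # 18 full contexts
print(num_factor_contexts(FEATURES))  # 8 single-feature conditions

# Any combination of feature values gets a valid distribution, including
# combinations that were never observed together.
context = {"phone": "a", "stress": "stressed", "position": "final"}
print(mixture_prob("high", context, WEIGHTS, TABLES))
```

In this sketch the number of conditions grows with the sum of the feature cardinalities rather than their product, which is the kind of factorial structure that a tree splitting on one feature at a time cannot exploit directly.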

    Research areas

  • belief networks, decision trees, hidden Markov models, linguistics, speech synthesis, DBN, HMM, MMMM, TTS system, context dependent model, context features space, decision tree, dynamic Bayesian network, factorized context modelling, linguistic context feature, mixed memory Markov model, text-to-speech synthesis, Bayes methods, Context, Context modeling, Markov processes, Speech, factorized model, maximum likelihood parameter generation

ID: 11892442