Edinburgh Research Explorer

Grapheme-to-phoneme conversion methods for minority language conditions

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Proceedings of the 2012 International Conference on Speech Database and Assessments, Oriental COCOSDA 2012
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 151-156
Number of pages: 6
ISBN (Print): 9781467328104
Publication status: Published - 1 Feb 2013
Event: 2012 15th International Conference on Speech Database and Assessments, Oriental COCOSDA 2012 - Macau, China
Duration: 9 Dec 2012 - 12 Dec 2012

Conference

Conference: 2012 15th International Conference on Speech Database and Assessments, Oriental COCOSDA 2012
Country: China
City: Macau
Period: 9/12/12 - 12/12/12

Abstract

This study investigates grapheme-to-phoneme conversion approaches for minority language conditions. Whereas isolated-word data is typically used for major languages, sentence-form data is taken to be the appropriate form of training data for minority languages. The Joint-multigram Model and the Hidden Markov Model were examined. A 'treat-sentence-as-word' training method and a forced-alignment process were proposed to extend the Joint-multigram Model and the Hidden Markov Model, respectively, to meet minority language conditions. Results obtained from sentence-form training data using the proposed methods are as good as those obtained from isolated-word training data using previously proposed methods. The Joint-multigram Model performs better on well-designed training data, while the Hidden Markov Model is more tolerant of errors and better suited to minority language conditions.

    Research areas

  • forced-alignment, Grapheme-to-phoneme, HMM, Joint-multigram Model, treat-sentence-as-word

ID: 112495000