Nativization of foreign names in TTS for automatic reading of world news in Swahili

Joseph Mendelson, Pilar Oplustil, Oliver Watts, Simon King

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

When a text-to-speech (TTS) system is required to speak world news, a large fraction of the words to be spoken will be proper names originating in a wide variety of languages. Phonetization of these names based on target-language letter-to-sound rules will typically be inadequate. This is detrimental not only during synthesis, when inappropriate phone sequences are produced, but also during training, if the system is trained on data from the same domain. This is because poor phonetization during forced alignment based on hidden Markov models can pollute the whole model set, resulting in degraded alignment even of normal target-language words. This paper presents four techniques designed to address this issue in the context of a Swahili TTS system: automatic transcription of proper names based on a lexicon from a better-resourced language; the addition of a parallel phone set and a special part-of-speech tag exclusively dedicated to proper names; a manually crafted phone mapping which allows potentially more accurate phones to be substituted in proper names during forced alignment; and the addition, in proper names, of a grapheme-derived frame-level feature supplementing the standard phonetic inputs to the acoustic model. We present results from objective and subjective evaluations of systems built using these four techniques.
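As a rough illustration of the second technique, the sketch below shows one possible reading of a "parallel phone set": phones inside proper names are relabelled with a dedicated suffix, so that they are modelled separately from the phones of ordinary target-language words. The suffix `_pn`, the function names, and the toy phone strings are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical marker for the parallel (proper-name) phone set.
NAME_SUFFIX = "_pn"


def to_parallel_phones(phones: List[str], is_proper_name: bool) -> List[str]:
    """Return phones unchanged for ordinary words, suffixed for proper names."""
    if not is_proper_name:
        return list(phones)
    return [phone + NAME_SUFFIX for phone in phones]


def phonetize_utterance(words: List[Tuple[List[str], bool]]) -> List[str]:
    """Flatten (phones, is_proper_name) pairs into a single phone sequence."""
    sequence: List[str] = []
    for phones, is_proper_name in words:
        sequence.extend(to_parallel_phones(phones, is_proper_name))
    return sequence


if __name__ == "__main__":
    # Toy input: an ordinary word followed by a word flagged as a proper name.
    utterance = [
        (["h", "a", "b", "a", "r", "i"], False),
        (["o", "b", "a", "m", "a"], True),
    ]
    print(phonetize_utterance(utterance))
```

Under this assumption, misaligned foreign names would affect only the suffixed models, leaving the models for ordinary words untouched.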
Original language: English
Title of host publication: Proceedings Interspeech 2017
Number of pages: 5
Publication status: Published - 24 Aug 2017
Event: Interspeech 2017 - Stockholm, Sweden
Duration: 20 Aug 2017 to 24 Aug 2017

Publication series

ISSN (Electronic): 1990-9772


Conference: Interspeech 2017