Using neighbourhood density and selective SNR boosting to increase the intelligibility of synthetic speech in noise

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Motivated by the fact that words are not equally confusable, we explore the idea of using word-level intelligibility predictions to selectively boost the harder-to-understand words in a sentence, aiming to improve overall intelligibility in the presence of noise. First, the intelligibility of a set of words from dense and sparse phonetic neighbourhoods was evaluated in isolation. The resulting intelligibility scores were used to inform two sentence-level experiments. In the first experiment, the signal-to-noise ratio of one word was boosted to the detriment of another word. Sentence intelligibility did not generally improve. The intelligibility of words in isolation and in a sentence was found to differ significantly, both in clean and in noisy conditions. For the second experiment, one word was selectively boosted while slightly attenuating all other words in the sentence. This strategy was successful for words that were poorly recognised in that particular context. However, a reliable predictor of word-in-context intelligibility remains elusive since, as our results indicate, this involves semantic, syntactic and acoustic information about the word and the sentence.
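
The boosting strategy of the second experiment (raise one word, slightly attenuate the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name selective_boost, the sample-index word boundaries, and the choice to keep total sentence energy constant are all assumptions made for illustration. Under a fixed noise level, constant sentence energy would leave the sentence-level SNR unchanged while the local SNR of the chosen word rises by boost_db.

```python
import math


def selective_boost(samples, start, end, boost_db):
    """Boost samples[start:end] by boost_db decibels and attenuate all
    other samples so that the total signal energy stays the same.

    Hypothetical helper: the energy-preserving constraint is an
    assumption, not something stated in the abstract.
    """
    if not 0 <= start < end <= len(samples):
        raise ValueError("word span must lie inside the signal")

    word_energy = sum(x * x for x in samples[start:end])
    total_energy = sum(x * x for x in samples)
    rest_energy = total_energy - word_energy

    # Amplitude gain corresponding to the requested boost in dB.
    boost_gain = 10 ** (boost_db / 20)

    # Energy left over for the other words once the boosted word is accounted for.
    remaining = total_energy - boost_gain ** 2 * word_energy
    if rest_energy <= 0 or remaining < 0:
        raise ValueError("boost too large to offset by attenuating the other words")

    # Single gain applied to everything outside the boosted word.
    rest_gain = math.sqrt(remaining / rest_energy)

    return [
        x * (boost_gain if start <= i < end else rest_gain)
        for i, x in enumerate(samples)
    ]
```

Because the compensation is spread over every other word, each of them is attenuated only slightly when the boosted word is short and quiet relative to the sentence; a boost that cannot be offset this way is rejected rather than silently clipped.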
Original language: English
Title of host publication: 8th ISCA Workshop on Speech Synthesis
Pages: 133-138
Number of pages: 6
Publication status: Published - Aug 2013
