Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech

Petko N. Petkov, W. Bastiaan Kleijn, Gustav Eje Henter

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

The intelligibility of speech in adverse noise conditions can be improved by modifying the characteristics of the clean speech prior to its presentation. An effective and flexible paradigm is to select the modification by optimizing a measure of objective intelligibility. Here we apply this paradigm at the text level and optimize a measure related to the classification error probability in an automatic speech recognition system. The proposed method was applied to a simple but powerful band-energy modification mechanism under an energy preservation constraint. Subjective evaluation results provide a clear indication of a significant gain in subjective intelligibility. In contrast to existing methods, the proposed approach is not restricted to a particular modification strategy and treats the notion of optimality at a level closer to that of subjective intelligibility. The computational complexity of the method is sufficiently low to enable its use in on-line applications.
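
As a rough illustration of the energy preservation constraint mentioned in the abstract, the sketch below rescales a set of per-band energy gains so that the total energy across bands is unchanged. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `renormalize_band_gains`, the four-band layout, and the example gain values are all hypothetical, and the optimization of the recognition-based intelligibility measure that would actually choose the gains is not shown.

```python
def renormalize_band_gains(band_energies, raw_gains):
    """Scale per-band energy gains so the total energy stays unchanged.

    band_energies: energy of the clean speech in each frequency band.
    raw_gains: unconstrained multiplicative energy gains, one per band
               (hypothetical values; the abstract does not specify them).
    """
    if len(band_energies) != len(raw_gains):
        raise ValueError("need exactly one gain per band")

    original_total = sum(band_energies)
    modified_total = sum(g * e for g, e in zip(raw_gains, band_energies))
    if modified_total <= 0:
        raise ValueError("modified energy must be positive")

    # One common scale factor restores the original total energy
    # while keeping the relative shape of the gains.
    scale = original_total / modified_total
    return [g * scale for g in raw_gains]


# Hypothetical example: move energy from the lowest band to the highest.
band_energies = [4.0, 2.0, 1.0, 1.0]
raw_gains = [0.5, 1.0, 2.0, 4.0]

gains = renormalize_band_gains(band_energies, raw_gains)
new_energies = [g * e for g, e in zip(gains, band_energies)]
```

In this sketch the energy is redistributed between bands while `sum(new_energies)` matches `sum(band_energies)` up to floating-point rounding, which is the sense in which the constraint is satisfied here.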
Original language: English
Title of host publication: Proc. Interspeech 2012
Place of Publication: Portland, OR
Pages: 166-169
Number of pages: 4
Volume: 13
Publication status: Published - 1 Sept 2012
Event: INTERSPEECH 2012 - 13th Annual Conference of the International Speech Communication Association - Portland, Oregon, United States
Duration: 9 Sept 2012 to 13 Sept 2012

Conference

Conference: INTERSPEECH 2012 - 13th Annual Conference of the International Speech Communication Association
Country/Territory: United States
City: Portland, Oregon
Period: 9/09/12 to 13/09/12

Keywords / Materials (for Non-textual outputs)

  • speech modification
  • statistical model of speech
  • subjective intelligibility
