The listening talker: A review of human and algorithmic context-induced modifications of speech

Martin Cooke*, Simon King, Maeva Garnier, Vincent Aubanel

*Corresponding author for this work

Research output: Contribution to journal › Literature review › peer-review

Abstract

Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised, at least for some listeners, by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns in response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output. © 2013 Elsevier Ltd. All rights reserved.
Original language: English
Pages (from-to): 543-571
Number of pages: 29
Journal: Computer Speech and Language
Volume: 28
Issue number: 2
Early online date: 30 Aug 2013
Publication status: Published - Mar 2014

Keywords

  • Speech production
  • Modification algorithms
  • INFANT-DIRECTED SPEECH
  • ACOUSTIC-PHONETIC CHARACTERISTICS
  • FLATTENED FUNDAMENTAL-FREQUENCY
  • HEARING-IMPAIRED LISTENERS
  • HIGH NOISE-LEVELS
  • HARD-OF-HEARING
  • CLEAR SPEECH
  • CONVERSATIONAL SPEECH
  • MOTHERS SPEECH
  • WORD RECOGNITION
