Evaluating Cognitive Load of Text-To-Speech (TTS) synthesis

Avashna Govender*, Cassia Valentini-Botinhao, Simon King

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Current evaluation methods for text-to-speech (TTS) synthesis rely solely on subjective rating scores. These tests mostly assess how natural or intelligible the voice is. With state-of-the-art systems, these measures are approaching ceiling, and alternative measures such as cognitive load may therefore become more meaningful. To our knowledge, there is little or no recent work evaluating the cognitive load of state-of-the-art text-to-speech systems. We use pupillometry as a measure of cognitive load: the pupil has been found to dilate with increased cognitive effort during a listening task. We are currently evaluating speech generated by a Deep Neural Network TTS synthesiser. In our method, we generate stimuli that step incrementally from natural speech to synthesised speech by changing only a single feature at a time. Stimuli are presented to listeners in speech-shaped noise conditions.

Original language: English
Title of host publication: Proceedings of the 23rd International Congress on Acoustics
Subtitle of host publication: Integrating 4th EAA Euroregio 2019
Editors: Martin Ochmann, Michael Vorländer, Janina Fels
Publisher: International Commission for Acoustics (ICA)
Number of pages: 5
ISBN (Electronic): 9783939296157
Publication status: Published - 9 Sept 2019
Event: 23rd International Congress on Acoustics: Integrating 4th EAA Euroregio - Aachen, Germany
Duration: 9 Sept 2019 → 23 Sept 2019

Publication series

Name: Proceedings of the International Congress on Acoustics
ISSN (Print): 2226-7808
ISSN (Electronic): 2415-1599


Conference: 23rd International Congress on Acoustics: Integrating 4th EAA Euroregio
Abbreviated title: ICA 2019

Keywords / Materials (for Non-textual outputs)

  • Cognitive load
  • Evaluation
  • Text-to-speech

