Misperceptions of the emotional content of natural and vocoded speech in a car

Jaime Lorenzo-Trueba, Cassia Valentini Botinhao, Gustav Henter, Junichi Yamagishi

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper analyzes a) how often listeners interpret the emotional content of an utterance incorrectly when listening to vocoded or natural speech in adverse conditions; b) which noise conditions cause the most misperceptions; and c) which group of listeners misinterpret emotions the most. The long-term goal is to construct new emotional speech synthesizers that adapt to the environment and to the listener. We performed a large-scale listening test where over 400 listeners between the ages of 21 and 72 assessed natural and vocoded acted emotional speech stimuli. The stimuli had been artificially degraded using a room impulse response recorded in a car and various in-car noise types recorded in a real car. Experimental results show that the recognition rates for emotions and perceived emotional strength degrade as signal-to-noise ratio decreases. Interestingly, misperceptions seem to be more pronounced for negative and lowarousal emotions such as calmness or anger, while positive emotions such as happiness appear to be more robust to noise. An ANOVA analysis of listener meta-data further revealed that gender and age also influenced results, with elderly male listeners most likely to incorrectly identify emotions.
Original languageEnglish
Title of host publicationProceedings Interspeech 2017
PublisherInternational Speech Communication Association
Pages606-610
Number of pages5
DOIs
Publication statusPublished - 24 Aug 2017
EventInterspeech 2017 - Stockholm, Sweden
Duration: 20 Aug 201724 Aug 2017
http://www.interspeech2017.org/

Publication series

NameInterspeech
PublisherInternational Speech Communication Association
ISSN (Electronic)1990-9772

Conference

ConferenceInterspeech 2017
Country/TerritorySweden
CityStockholm
Period20/08/1724/08/17
Internet address

Fingerprint

Dive into the research topics of 'Misperceptions of the emotional content of natural and vocoded speech in a car'. Together they form a unique fingerprint.

Cite this