Affect Recognition through Scalogram and Multi-resolution Cochleagram Features

Research output: Contribution to conference › Paper › peer-review

Abstract

We propose an approach to categorizing voice samples according to the emotions expressed by the speaker, using Multi-Resolution Cochleagram (MRCG) and scalogram features in a novel way. Audio recordings from the EmoDB, EMOVO and SAVEE datasets are used to train and test predictive models built on different sets of speech features. The study systematically evaluates the feature sets most commonly used in computational paralinguistic tasks (i.e. emobase, eGeMAPS and ComParE), together with MRCG- and scalogram-derived features and their fusion, across five different classifiers. The datasets cover speech in three languages (German, Italian and English). MRCG features outperform emobase, eGeMAPS and ComParE on the EmoDB (unweighted average recall, UAR = 59.15%) and SAVEE (UAR = 36.12%) datasets, while eGeMAPS provides the best overall UAR (33.84%) on the EMOVO dataset. A support vector machine (SVM) classifier yields the best UAR for EmoDB (80.05%), through fusion of emobase, eGeMAPS, ComParE and MRCG, and for EMOVO (40.31%), through fusion of emobase, eGeMAPS and ComParE. For SAVEE, random forests provide the best result (46.55%) using the ComParE feature set.
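The results above are reported as unweighted average recall (UAR), i.e. the mean of the per-class recalls, so each emotion class counts equally regardless of how many samples it has. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name and the toy labels are illustrative assumptions.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (UAR).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall is computed per class, then averaged without class weighting.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: 4 "anger" samples and 2 "neutral" samples.
y_true = ["anger", "anger", "anger", "anger", "neutral", "neutral"]
y_pred = ["anger", "anger", "anger", "neutral", "neutral", "anger"]

uar = unweighted_average_recall(y_true, y_pred)
print(f"UAR = {uar:.3f}")
```

In this toy case the per-class recalls are 3/4 and 1/2, so the UAR is 0.625, whereas plain accuracy (4 of 6 correct) would be about 0.667. The gap illustrates why UAR is preferred when emotion classes are imbalanced.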

Original language: English
Pages: 581-585
Number of pages: 5
Publication status: Published - 30 Aug 2021
Event: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021 - Brno, Czech Republic
Duration: 30 Aug 2021 - 3 Sep 2021

Conference

Conference: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Country/Territory: Czech Republic
City: Brno
Period: 30/08/21 - 3/09/21

Keywords

  • Affective computing
  • Emotion recognition
  • Social signal processing
