Emotion Recognition from the Speech Signal by Effective Combination of Generative and Discriminative Models

E. Loweimi, M. Doulaty, Jon Barker, Thomas Hain

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose an effective way of combining discriminative and generative models for emotion recognition from the speech signal. Finding an efficient feature extraction algorithm that captures only the main attribute(s) pertinent to the task and filters out the other aspects of the data is very challenging, if not impossible. We propose an interface between the front-end and the back-end to compensate for the shortcomings of the parameterization block in suppressing the irrelevant dimensions of the signal. This interface is a generative model that performs remarkable dimensionality reduction, allows for extraction of a long-term feature, and paves the way for better classification of the data through a discriminative model. The method yields a 7.6% absolute performance improvement over the baseline system and reaches 87.6% accuracy on the emotion recognition task. Human performance on the same database is reportedly 84.3%.
Original language: English
Title of host publication: USES 2015 - The University of Sheffield Engineering Symposium
Number of pages: 2
Publication status: Published - 1 Jun 2015

Keywords

  • Discriminative model
  • Emotion recognition
  • Front-end
  • Generative model
  • Speech signal
