Extracting audio-visual features for emotion recognition through active feature selection

Fasih Haider, Senja Pollak, Pierre Albert, Saturnino Luz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Research in automatic emotion recognition has seldom addressed the issue of computational resource utilisation. With the advent of ambient technology, which employs a variety of low-power, resource-constrained devices, this issue is gaining increasing interest. This is especially the case for health and elderly care technologies, where interventions aim to maintain the user's independence as unobtrusively as possible. In this context, efforts are being made to model human social signals such as affect using low-cost technologies, which can aid health monitoring. This paper presents an Active Feature Selection (AFS) method based on self-organizing map (SOM) neural networks for emotion recognition in the wild. AFS is used to select feature subsets from three different feature sets: 62 out of 88 features were selected for eGeMAPS, 21 out of 988 for emobase, and 140 out of 2832 for LBPTOP features. The results show that the feature subsets selected by AFS outperform both the full feature sets and PCA-based dimensionality reduction. The largest improvement is observed on emobase features, followed by eGeMAPS. For the visual features, nearly the same accuracy is obtained with a substantial reduction in dimensionality (only 5% of the full feature set is required for the same level of accuracy). Weighted score fusion yields a further improvement, reaching 43.40% and 40.12% accuracy on the EmotiW 2018 validation and test sets respectively.
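The abstract does not spell out how the SOM is used for feature selection. One common way to select features with a self-organizing map is to cluster the feature columns on the SOM grid and keep a single representative per occupied map unit, so redundant features collapse into one. The sketch below is a hypothetical illustration of that idea, not the authors' AFS method: the SOM implementation, the grid size, and the `select_features` heuristic (pick the feature closest to each unit's prototype) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(3, 3), epochs=50, lr0=0.5, sigma0=1.0):
    """Train a small self-organizing map; rows of `data` are the items to cluster."""
    coords = np.array([(i, j) for i in range(grid[0])
                       for j in range(grid[1])], dtype=float)
    weights = rng.normal(size=(len(coords), data.shape[1]))
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)            # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 1e-3  # shrinking neighbourhood
        for x in data:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighbourhood function
            weights += lr * h[:, None] * (x - weights)
    return weights

def select_features(X, n_keep):
    """Cluster feature columns with a SOM; keep one representative per occupied unit."""
    feats = X.T  # one row per feature: its values across all samples
    feats = (feats - feats.mean(axis=1, keepdims=True)) / \
            (feats.std(axis=1, keepdims=True) + 1e-9)
    W = train_som(feats)
    # Assign each feature to its best-matching map unit.
    bmus = np.argmin(((feats[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)
    selected = []
    for unit in np.unique(bmus):
        members = np.where(bmus == unit)[0]
        # Representative: the member feature closest to the unit's prototype.
        dists = ((feats[members] - W[unit]) ** 2).sum(axis=1)
        selected.append(int(members[np.argmin(dists)]))
    return sorted(selected)[:n_keep]

# Toy data: 100 samples, 12 features, where groups of features are noisy copies.
base = rng.normal(size=(100, 4))
X = np.hstack([base + 0.05 * rng.normal(size=(100, 4)) for _ in range(3)])
subset = select_features(X, n_keep=6)
print(subset)
```

On the toy data, correlated copies of the same underlying feature tend to map to the same SOM unit, so the selected subset is much smaller than the original set, mirroring the kind of reduction the paper reports (e.g. 140 of 2832 LBPTOP features).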
Original language: English
Title of host publication: 7th IEEE Global Conference on Signal and Information Processing (GlobalSIP)
Publication status: Published - 11 Nov 2019


  • Affective Computing
  • Emotion Recognition
  • Feature Engineering
  • Feature Extraction
  • Feature selection
  • Feature Transformation


