From hard to soft: Towards more human-like emotion recognition by modelling the perception uncertainty

Jing Han, Zixing Zhang, Maximilian Schmitt, Maja Pantic, Björn Schuller

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

    Abstract

    Over the last decade, automatic emotion recognition has become well established. The gold standard target is usually calculated from multiple annotations by different raters. All related efforts assume that the emotional state of a human subject can be identified by a 'hard' category or a unique value. This assumption tries to ease the human observer's subjectivity when observing patterns such as the emotional state of others. However, as the number of annotators cannot be infinite, uncertainty remains in the emotion target even if it is calculated from several, yet few, human annotators. The common procedure of using this same emotion target in the learning process thus inevitably introduces noise in the form of an uncertain learning target. In this light, we propose a 'soft' prediction framework to provide a more human-like and comprehensive prediction of emotion. In our novel framework, we provide an additional target to indicate the uncertainty of human perception based on the inter-rater disagreement level, in contrast to the traditional framework, which merely produces one single prediction (category or value). To exploit the dependency between the emotional state and the newly introduced perception uncertainty, we implement a multi-task learning strategy. To evaluate the feasibility and effectiveness of the proposed soft prediction framework, we perform extensive experiments, including late fusion results, on a time- and value-continuous spontaneous audiovisual emotion database. We show that the soft prediction framework with multi-task learning of the emotional state and its perception uncertainty significantly outperforms the individual tasks in both the arousal and valence dimensions.
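
The idea of pairing each emotion target with a perception-uncertainty target can be illustrated with a minimal sketch. The abstract does not say how the inter-rater disagreement level is measured or how the two tasks are weighted, so the choices here are assumptions: disagreement is taken as the population standard deviation across raters, and the joint objective is a weighted sum of squared errors. The names `soft_targets`, `multitask_loss`, and `weight` are hypothetical, and the ratings are made-up values.

```python
from statistics import mean, pstdev


def soft_targets(ratings_per_frame):
    """Return one (emotion, uncertainty) pair per frame.

    ratings_per_frame: list of frames, each a list of per-rater values.
    """
    targets = []
    for ratings in ratings_per_frame:
        emotion = mean(ratings)        # conventional 'hard' gold standard
        uncertainty = pstdev(ratings)  # ASSUMPTION: disagreement = std. dev.
        targets.append((emotion, uncertainty))
    return targets


def multitask_loss(pred, target, weight=0.5):
    """Squared error on the emotion task plus a weighted squared error
    on the uncertainty task. ASSUMPTION: the weighting scheme is
    illustrative and not taken from the paper."""
    emotion_error = (pred[0] - target[0]) ** 2
    uncertainty_error = (pred[1] - target[1]) ** 2
    return emotion_error + weight * uncertainty_error


# Three raters annotate arousal for two frames (made-up values).
frames = [[0.2, 0.2, 0.2], [0.0, 0.5, 1.0]]
targets = soft_targets(frames)
```

In this toy input the raters agree fully on the first frame, so its uncertainty target is zero, while the second frame has the same kind of mean-based emotion target but a clearly non-zero uncertainty target. A model trained on both outputs would be penalised for ignoring that difference.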

    Original language: English
    Title of host publication: MM 2017 - Proceedings of the 25th ACM Multimedia Conference
    Publisher: Association for Computing Machinery
    Pages: 890-897
    Number of pages: 8
    ISBN (Electronic): 9781450349062
    Publication status: Published - 23 Oct 2017
    Event: 25th ACM Multimedia Conference, MM 2017 - Mountain View, United States
    Duration: 23 Oct 2017 to 27 Oct 2017
    Conference number: 25
    http://www.acmmm.org/2017/

    Conference

    Conference: 25th ACM Multimedia Conference, MM 2017
    Abbreviated title: MM
    Country/Territory: United States
    City: Mountain View
    Period: 23/10/17 to 27/10/17

    Keywords

    • Emotion recognition
    • Long short-term memory
    • Multi-task learning
    • Perception uncertainty modelling
