String-based audiovisual fusion of behavioural events for the assessment of dimensional affect

Florian Eyben, Martin Wöllmer, Michel F. Valstar, Hatice Gunes, Björn Schuller, Maja Pantic

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

    35 Citations (Scopus)

    Abstract

    The automatic assessment of affect is mostly based on feature-level approaches, such as distances between facial points or prosodic and spectral information when it comes to audiovisual analysis. However, it is known and intuitive that behavioural events such as smiles, head shakes, laughter and sighs also bear highly relevant information regarding a subject's affective display. Accordingly, we propose a novel string-based prediction approach to fuse such events and to predict human affect in a continuous dimensional space. Extensive analysis and evaluation have been conducted using the newly released SEMAINE database of human-to-agent communication. For a thorough understanding of the obtained results, we provide additional benchmarks by more conventional feature-level modelling, and compare these and the string-based approach to fusion of signal-based features and string-based events. Our experimental results show that the proposed string-based approach is the best-performing approach for automatic prediction of the Valence and Expectation dimensions, and improves prediction performance for the other dimensions when combined with at least acoustic signal-based features.
    Original language: Undefined
    Title of host publication: IEEE International Conference on Automatic Face & Gesture Recognition and Workshops (FG 2011)
    Place of Publication: USA
    Publisher: IEEE Computer Society
    Pages: 322-329
    Number of pages: 8
    ISBN (Print): 978-1-4244-9140-7
    Publication status: Published - Mar 2011
    Event: 9th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2011 - Santa Barbara, United States
    Duration: 21 Mar 2011 to 25 Mar 2011
    Conference number: 9

    Publication series

    Publisher: IEEE Computer Society

    Conference

    Conference: 9th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2011
    Abbreviated title: FG
    Country: United States
    City: Santa Barbara
    Period: 21/03/11 to 25/03/11

    Keywords

    • METIS-285032
    • IR-79436
    • Databases
    • Visualization
    • Feature extraction
    • Hidden Markov models
    • HMI-MI: MULTIMODAL INTERACTIONS
    • Pixel
    • Speech
    • EWI-21332
    • EC Grant Agreement nr.: FP7/231287
    • EC Grant Agreement nr.: FP7/211486
    • Face
