A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions

Zhihong Zeng, Maja Pantic, Glenn I. Roisman, Thomas S. Huang

    Research output: Contribution to journal › Article › Academic › peer-review

    1826 Citations (Scopus)


    Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
    Original language: English
    Pages (from-to): 39-58
    Number of pages: 20
    Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
    Issue number: 1
    Publication status: Published - Jan 2009


    • Affective computing
    • Evaluation/methodology
    • Human-centered computing
    • Survey
    • Introductory
    • EC Grant Agreement nr.: FP7/211486

