Latent Trees for Estimating Intensity of Facial Action Units

Sebastian Kaltwang, Sinisa Todorovic, Maja Pantic

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    This paper addresses the estimation of intensity levels of Facial Action Units (FAUs) in videos, an important step toward interpreting facial expressions. As input features, we use the locations of facial landmark points detected in video frames. To address uncertainty in the input, we formulate a generative latent tree (LT) model, its inference, and novel algorithms for efficient learning of both the LT parameters and structure. Our structure learning builds the LT iteratively, starting from initially independent nodes of observable features and adding either a new edge or a new hidden node at each step. In each iteration, the graph-edit operation that maximally increases the likelihood while minimally increasing the model complexity is selected as optimal. For FAU intensity estimation, we derive closed-form expressions for the posterior marginals of all variables in the LT and specify an efficient bottom-up/top-down inference. Our evaluation on the benchmark DISFA and ShoulderPain datasets, in a subject-independent setting, demonstrates that we outperform the state of the art, even under significant noise in facial landmarks. The effectiveness of our structure learning is demonstrated by probabilistically sampling meaningful facial expressions from the LT.
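
The bottom-up/top-down inference mentioned in the abstract can be illustrated with a generic sum-product pass on a small tree. This is only a minimal sketch under stated assumptions, not the paper's closed-form method: all variables are taken to be discrete, the tree has a single root at index 0, and every node's parent has a smaller index. The function name `posterior_marginals` and the toy probabilities are hypothetical and chosen purely for illustration.

```python
import numpy as np

def posterior_marginals(parent, prior, cpt, evidence):
    """Sum-product on a rooted tree with discrete variables (illustrative only).

    parent[i]  : index of node i's parent, -1 for the root (assumes parent[i] < i)
    prior      : 1-D array, distribution of the root
    cpt[i]     : 2-D array P(node i | parent), shape (parent states, node states);
                 None for the root
    evidence   : dict {node index: observed state}
    Returns a list with the posterior marginal of every node given the evidence.
    """
    n = len(parent)
    n_states = [len(prior) if parent[i] < 0 else cpt[i].shape[1] for i in range(n)]
    children = [[] for _ in range(n)]
    for i, p in enumerate(parent):
        if p >= 0:
            children[p].append(i)

    # Local evidence: an indicator vector for observed nodes, all ones otherwise.
    local = [np.ones(k) for k in n_states]
    for i, s in evidence.items():
        local[i] = np.zeros(n_states[i])
        local[i][s] = 1.0

    # Bottom-up pass: children have larger indices, so visit nodes in reverse.
    lam = [None] * n   # evidence from the subtree rooted at each node
    up = [None] * n    # message from each node to its parent
    for i in reversed(range(n)):
        lam[i] = local[i].copy()
        for c in children[i]:
            lam[i] = lam[i] * up[c]
        if parent[i] >= 0:
            up[i] = cpt[i] @ lam[i]

    # Top-down pass: combine what arrives from above with the subtree evidence.
    pi = [None] * n
    post = [None] * n
    for i in range(n):
        p = parent[i]
        if p < 0:
            pi[i] = np.asarray(prior, dtype=float)
        else:
            # Belief at the parent that excludes the message sent up by node i.
            excl = pi[p] * local[p]
            for s in children[p]:
                if s != i:
                    excl = excl * up[s]
            pi[i] = excl @ cpt[i]
        belief = pi[i] * lam[i]
        post[i] = belief / belief.sum()
    return post

# Toy tree: one hidden root (node 0) with two observed leaves (nodes 1 and 2).
leaf_cpt = np.array([[0.9, 0.1],
                     [0.2, 0.8]])
post = posterior_marginals(
    parent=[-1, 0, 0],
    prior=np.array([0.5, 0.5]),
    cpt=[None, leaf_cpt, leaf_cpt],
    evidence={1: 0},          # only leaf 1 is observed, in state 0
)
for node, marginal in enumerate(post):
    print(node, marginal)
```

In this toy setting the unobserved leaf plays the role of a quantity to be estimated from the observed one: its marginal is updated through the hidden root in a single upward and a single downward sweep, so the cost grows linearly with the number of nodes.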
    Original language: Undefined
    Title of host publication: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)
    Place of Publication: USA
    Publisher: IEEE Computer Society
    Pages: 296-304
    Number of pages: 9
    ISBN (Print): 978-1-4673-6964-0
    DOIs: https://doi.org/10.1109/CVPR.2015.7298626
    Publication status: Published - Jun 2015

    Publication series

    Name
    Publisher: IEEE Computer Society

    Keywords

    • EWI-26801
    • IR-99459
    • METIS-316035
    • HMI-HF: Human Factors

    Cite this

    Kaltwang, S., Todorovic, S., & Pantic, M. (2015). Latent Trees for Estimating Intensity of Facial Action Units. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015) (pp. 296-304). USA: IEEE Computer Society. https://doi.org/10.1109/CVPR.2015.7298626