Latent Trees for Estimating Intensity of Facial Action Units

Sebastian Kaltwang, Sinisa Todorovic, Maja Pantic

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

    50 Citations (Scopus)
    124 Downloads (Pure)


    This paper addresses estimating intensity levels of Facial Action Units (FAUs) in videos, an important step toward interpreting facial expressions. As input features, we use the locations of facial landmark points detected in video frames. To address input uncertainty, we formulate a generative latent tree (LT) model, its inference, and novel algorithms for efficiently learning both the LT parameters and structure. Our structure learning builds the LT iteratively, starting from initially independent nodes of observable features and adding either a new edge or a new hidden node in each iteration. The graph-edit operation that maximally increases the likelihood while minimally increasing the model complexity is selected as optimal in each iteration. For FAU intensity estimation, we derive closed-form expressions for the posterior marginals of all variables in the LT and specify an efficient bottom-up/top-down inference. Our evaluation on the benchmark DISFA and ShoulderPain datasets, in a subject-independent setting, demonstrates that we outperform the state of the art, even under significant noise in facial landmarks. The effectiveness of our structure learning is demonstrated by probabilistically sampling meaningful facial expressions from the LT.
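The greedy structure search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`greedy_structure_search`, `toy_score`, `leaves_under`), the similarity table `sim`, and the cost constants are all hypothetical, and the score is a toy stand-in (average pairwise similarity minus a fixed complexity cost) rather than the LT likelihood used in the paper. Only the control flow mirrors the description: start from independent observed nodes, enumerate "add hidden node" and "add edge" operations, apply the best one, and stop when no operation improves the score.

```python
from itertools import combinations


def leaves_under(node, children):
    """Return the observed features (leaves) below `node`."""
    if node not in children:
        return [node]
    out = []
    for child in children[node]:
        out.extend(leaves_under(child, children))
    return out


def toy_score(op, children, sim, hidden_cost=0.3, edge_cost=0.1):
    """Toy proxy for 'likelihood gain minus complexity' (an assumption,
    not the paper's objective): mean similarity between the two groups
    of leaves, minus a fixed cost per new hidden node or new edge."""
    kind, x, y = op
    lx, ly = leaves_under(x, children), leaves_under(y, children)
    gain = sum(sim.get(frozenset((u, v)), 0.0) for u in lx for v in ly)
    gain /= len(lx) * len(ly)
    return gain - (hidden_cost if kind == "add_hidden" else edge_cost)


def greedy_structure_search(observed, score):
    """Greedily grow a forest from initially independent observed nodes."""
    roots = list(observed)
    children = {}   # hidden node -> list of its child nodes
    n_hidden = 0
    while len(roots) > 1:
        candidates = []
        for a, b in combinations(roots, 2):
            candidates.append(("add_hidden", a, b))      # new parent for a, b
            if b in children:
                candidates.append(("add_edge", a, b))    # attach a under hidden b
            if a in children:
                candidates.append(("add_edge", b, a))    # attach b under hidden a
        best = max(candidates, key=lambda op: score(op, children))
        if score(best, children) <= 0:
            break                                        # no edit improves the score
        kind, x, y = best
        if kind == "add_hidden":
            h = f"h{n_hidden}"
            n_hidden += 1
            children[h] = [x, y]
            roots = [r for r in roots if r not in (x, y)] + [h]
        else:
            children[y].append(x)
            roots.remove(x)
    return roots, children


# Hypothetical similarities between four observed features.
sim = {frozenset(("x1", "x2")): 0.9, frozenset(("x3", "x4")): 0.8}
for u, v in combinations(["x1", "x2", "x3", "x4"], 2):
    sim.setdefault(frozenset((u, v)), 0.05)

roots, children = greedy_structure_search(
    ["x1", "x2", "x3", "x4"],
    lambda op, ch: toy_score(op, ch, sim),
)
```

With these toy similarities the search groups `x1` with `x2` and `x3` with `x4` under separate hidden nodes, then stops because no further edit has a positive score. A faithful implementation would replace `toy_score` with the change in model likelihood and complexity after refitting the LT parameters.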
    Original language: Undefined
    Title of host publication: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)
    Place of Publication: USA
    Number of pages: 9
    ISBN (Print): 978-1-4673-6964-0
    Publication status: Published - Jun 2015
    Event: 28th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, United States
    Duration: 7 Jun 2015 to 12 Jun 2015
    Conference number: 28

    Publication series

    Publisher: IEEE Computer Society


    Conference: 28th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
    Abbreviated title: CVPR 2015
    Country/Territory: United States


    • EWI-26801
    • IR-99459
    • METIS-316035
    • HMI-HF: Human Factors
