Latent Trees for Estimating Intensity of Facial Action Units

Sebastian Kaltwang, Sinisa Todorovic, Maja Pantic

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

28 Citations (Scopus)
37 Downloads (Pure)

Abstract

This paper addresses estimating intensity levels of Facial Action Units (FAUs) in videos as an important step toward interpreting facial expressions. As input features, we use the locations of facial landmark points detected in video frames. To address the uncertainty of the input, we formulate a generative latent tree (LT) model, its inference, and novel algorithms for efficiently learning both LT parameters and structure. Our structure learning iteratively builds the LT, starting from initially independent nodes of observable features, by adding either a new edge or a new hidden node. In each iteration, the graph-edit operation that maximally increases the likelihood while minimally increasing the model complexity is selected as optimal. For FAU intensity estimation, we derive closed-form expressions for the posterior marginals of all variables in the LT and specify an efficient bottom-up/top-down inference procedure. Our evaluation on the benchmark DISFA and ShoulderPain datasets, in a subject-independent setting, demonstrates that we outperform the state of the art, even under significant noise in the facial landmarks. The effectiveness of our structure learning is demonstrated by probabilistically sampling meaningful facial expressions from the LT.
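To make the bottom-up/top-down inference concrete, below is a minimal sum-product sketch on a toy latent tree: a hidden root H1 with an observed child X1 and a hidden child H2, which in turn has observed children X2 and X3. Everything in the sketch is an illustrative assumption rather than the paper's model: the variables are binary with made-up conditional probability tables (the paper operates on continuous landmark features), but the two-pass pattern, with leaves sending likelihood messages up and the root sending a message back down, matches the inference scheme the abstract describes.

import numpy as np

# Toy latent tree: hidden root H1 -> {observed X1, hidden H2}; H2 -> {observed X2, X3}.
# All variables are binary; the CPTs below are illustrative assumptions.
p_h1 = np.array([0.6, 0.4])                   # P(H1)
p_h2_given_h1 = np.array([[0.7, 0.3],         # P(H2 | H1); rows index h1, cols h2
                          [0.2, 0.8]])
p_x_given_parent = np.array([[0.9, 0.1],      # P(X | parent); rows index the parent
                             [0.2, 0.8]])     # (shared by X1, X2, X3 for brevity)

def posterior_marginals(x1, x2, x3):
    """Posteriors P(H1 | x) and P(H2 | x) via one bottom-up and one top-down pass."""
    # Bottom-up: each observed leaf sends a likelihood message to its parent.
    m_x1_to_h1 = p_x_given_parent[:, x1]                    # over h1
    m_x2_to_h2 = p_x_given_parent[:, x2]                    # over h2
    m_x3_to_h2 = p_x_given_parent[:, x3]                    # over h2
    # H2 aggregates its children's messages and passes them upward (sum over h2).
    m_h2_to_h1 = p_h2_given_h1 @ (m_x2_to_h2 * m_x3_to_h2)  # over h1
    # Root posterior: P(H1 | x) is proportional to the prior times incoming messages.
    post_h1 = p_h1 * m_x1_to_h1 * m_h2_to_h1
    post_h1 /= post_h1.sum()
    # Top-down: the root sends a message to its hidden child (sum over h1).
    m_h1_to_h2 = (p_h1 * m_x1_to_h1) @ p_h2_given_h1        # over h2
    post_h2 = m_h1_to_h2 * m_x2_to_h2 * m_x3_to_h2
    post_h2 /= post_h2.sum()
    return post_h1, post_h2

print(posterior_marginals(1, 1, 0))  # posteriors given X1=1, X2=1, X3=0

Because the graph is a tree, each message is computed exactly once in each direction, so exact posterior marginals for every node cost time linear in the number of edges; this is the efficiency property the abstract's closed-form bottom-up/top-down scheme relies on.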
Original language: English
Title of host publication: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)
Place of publication: USA
Publisher: IEEE Computer Society
Pages: 296-304
Number of pages: 9
ISBN (Print): 978-1-4673-6964-0
DOIs: 10.1109/CVPR.2015.7298626
Publication status: Published - Jun 2015

Publication series

Publisher: IEEE Computer Society

Keywords

  • EWI-26801
  • IR-99459
  • METIS-316035
  • HMI-HF: Human Factors

Cite this

Kaltwang, S., Todorovic, S., & Pantic, M. (2015). Latent Trees for Estimating Intensity of Facial Action Units. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015) (pp. 296-304). USA: IEEE Computer Society. https://doi.org/10.1109/CVPR.2015.7298626