Predicting Listener Backchannels: A Probabilistic Multimodal Approach

Louis-Philippe Morency, I.A. de Kok, Jonathan Gratch

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    During face-to-face interactions, listeners use backchannel feedback such as head nods to signal to the speaker that the communication is working and that the speaker should continue. Predicting these backchannel opportunities is an important milestone for building engaging and natural virtual humans. In this paper we show how sequential probabilistic models (e.g., Hidden Markov Models (HMMs) or Conditional Random Fields (CRFs)) can automatically learn from a database of human-to-human interactions to predict listener backchannels using the speaker's multimodal output features (e.g., prosody, spoken words and eye gaze). The main challenges addressed in this paper are automatic selection of the relevant features and optimal feature representation for probabilistic models. For prediction of visual backchannel cues (i.e., head nods), our prediction model shows a statistically significant improvement over a previously published approach based on hand-crafted rules.
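
    As a rough illustration of the kind of sequential probabilistic model the abstract mentions, the sketch below runs forward filtering on a two-state HMM to estimate, frame by frame, the probability of a backchannel opportunity. It is not the authors' model: the state names, the discretised speaker-feature symbols ("speaking", "pause", "pause+gaze") and every probability are assumptions chosen only for illustration, whereas the paper learns its parameters from recorded interactions.

```python
# Minimal HMM forward-filtering sketch (all numbers are hypothetical,
# not values from the paper).
# State 0 = "no backchannel", state 1 = "backchannel opportunity".
STATES = (0, 1)
START = [0.9, 0.1]            # assumed initial state distribution
TRANS = [[0.9, 0.1],          # assumed P(next state | current state)
         [0.6, 0.4]]

# Hypothetical discretised speaker features, one symbol per frame.
OBS = {"speaking": 0, "pause": 1, "pause+gaze": 2}
EMIT = [[0.70, 0.25, 0.05],   # assumed P(symbol | no backchannel)
        [0.10, 0.30, 0.60]]   # assumed P(symbol | backchannel opportunity)


def forward_filter(observations):
    """Return P(state == 1 | observations[0..t]) for every frame t."""
    posteriors = []
    belief = START[:]
    for t, symbol in enumerate(observations):
        o = OBS[symbol]
        if t > 0:
            # Prediction step: propagate the belief through the transitions.
            belief = [sum(belief[i] * TRANS[i][j] for i in STATES)
                      for j in STATES]
        # Update step: weight by the emission likelihood, then normalise.
        belief = [belief[j] * EMIT[j][o] for j in STATES]
        total = sum(belief)
        belief = [b / total for b in belief]
        posteriors.append(belief[1])
    return posteriors


if __name__ == "__main__":
    frames = ["speaking", "pause", "pause+gaze"]
    for frame, p in zip(frames, forward_filter(frames)):
        print(f"{frame:12s} P(backchannel opportunity) = {p:.3f}")
```

    Under these assumed parameters, the estimated probability rises when a pause co-occurs with speaker gaze; a system could threshold that value to trigger a head nod. A CRF would instead model the label sequence conditionally on the features rather than generatively.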
    Original language: Undefined
    Title of host publication: Proceedings of the Eight International Conference on Intelligent Virtual Agents 2008
    Editors: Helmut Prendinger, James Lester, Mitsuru Ishizuka
    Place of Publication: Berlin
    Publisher: Springer
    Pages: 176-190
    Number of pages: 15
    ISBN (Print): 978-3-540-85482-1
    DOIs: https://doi.org/10.1007/978-3-540-85483-8_18
    Publication status: Published - 2008
    Event: 8th International Conference on Intelligent Virtual Agents, IVA 2008 - Tokyo, Japan
    Duration: 1 Sep 2008 - 3 Sep 2008
    Conference number: 8

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer Verlag
    Volume: 5208/2008

    Conference

    Conference: 8th International Conference on Intelligent Virtual Agents, IVA 2008
    Abbreviated title: IVA
    Country: Japan
    City: Tokyo
    Period: 1/09/08 - 3/09/08

    Keywords

    • METIS-264251
    • IR-68949
    • HMI-HF: Human Factors
    • HMI-IA: Intelligent Agents
    • EWI-17025

    Cite this

    Morency, L-P., de Kok, I. A., & Gratch, J. (2008). Predicting Listener Backchannels: A Probabilistic Multimodal Approach. In H. Prendinger, J. Lester, & M. Ishizuka (Eds.), Proceedings of the Eight International Conference on Intelligent Virtual Agents 2008 (pp. 176-190). [10.1007/978-3-540-85483-8_18] (Lecture Notes in Computer Science; Vol. 5208/2008). Berlin: Springer. https://doi.org/10.1007/978-3-540-85483-8_18