A probabilistic multimodal approach for predicting listener backchannels

Louis-Philippe Morency, I.A. de Kok, Jonathan Gratch

    Research output: Contribution to journal › Article › Academic › peer-review

    96 Citations (Scopus)

    Abstract

    During face-to-face interactions, listeners use backchannel feedback such as head nods to signal to the speaker that the communication is working and that the speaker should continue. Predicting these backchannel opportunities is an important milestone for building engaging and natural virtual humans. In this paper we show how sequential probabilistic models (e.g., Hidden Markov Models or Conditional Random Fields) can automatically learn from a database of human-to-human interactions to predict listener backchannels using the speaker's multimodal output features (e.g., prosody, spoken words and eye gaze). The main challenges addressed in this paper are automatic selection of the relevant features and optimal feature representation for probabilistic models. For prediction of visual backchannel cues (i.e., head nods), our prediction model shows a statistically significant improvement over a previously published approach based on hand-crafted rules.
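
To make the idea of a sequential probabilistic model concrete, the following is a minimal sketch of Viterbi decoding in a two-state Hidden Markov Model. It is not the model from the paper: the state names, the three discretized speaker cues (`speaking`, `pause`, `gaze`) and every probability are hypothetical values set by hand for illustration, whereas the paper learns its models from recorded human-to-human interactions.

```python
import math

# Hypothetical two-state HMM. All names and numbers below are illustrative
# assumptions, not parameters or features taken from the paper.
STATES = ("none", "backchannel")
START = {"none": 0.9, "backchannel": 0.1}
TRANS = {
    "none": {"none": 0.8, "backchannel": 0.2},
    "backchannel": {"none": 0.5, "backchannel": 0.5},
}
EMIT = {
    "none": {"speaking": 0.7, "pause": 0.2, "gaze": 0.1},
    "backchannel": {"speaking": 0.1, "pause": 0.4, "gaze": 0.5},
}


def viterbi(observations):
    """Return the most likely hidden-state sequence for the observed cues."""
    if not observations:
        return []
    # score[s] is the log-probability of the best path that ends in state s.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in s at this frame.
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


frames = ["speaking", "speaking", "pause", "gaze"]
print(viterbi(frames))  # ['none', 'none', 'backchannel', 'backchannel']
```

In this toy setting the decoder labels the frames with a pause followed by gaze as a backchannel opportunity. A learned model would replace the hand-set tables with parameters estimated from data.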
    Original language: Undefined
    Article number: 10.1007/s10458-009-9092-y
    Pages (from-to): 70-84
    Number of pages: 16
    Journal: Autonomous Agents and Multi-Agent Systems
    Volume: 20
    Issue number: 1
    Publication status: Published - 2010

    Keywords

    • nonverbal behavior prediction
    • conditional random field
    • head nod
    • sequential probabilistic model
    • EWI-17023
    • HMI-HF: Human Factors
    • HMI-IA: Intelligent Agents
    • METIS-270701
    • IR-68958
    • Multimodal
    • Listener backchannel feedback
