Context-based Recognition during Human Interactions: Automatic Feature Selection and Encoding Dictionary

Louis-Philippe Morency, I.A. de Kok, Jonathan Gratch

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    35 Citations (Scopus)


    During face-to-face conversation, people use visual feedback such as head nods to communicate relevant information and to synchronize rhythm between participants. In this paper we describe how contextual information from other participants can be used to predict visual feedback and improve recognition of head gestures in human-human interactions. For example, in a dyadic interaction, speaker contextual cues such as gaze shifts or changes in prosody will influence listener backchannel feedback (e.g., head nods). To automatically learn how to integrate this contextual information into the listener gesture recognition framework, this paper addresses two main challenges: optimal feature representation using an encoding dictionary and automatic selection of optimal feature-encoding pairs. Multimodal integration between context and visual observations is performed using a discriminative sequential model (Latent-Dynamic Conditional Random Fields) trained on previous interactions. In our experiments involving 38 storytelling dyads, our context-based recognizer significantly improved head gesture recognition performance over a vision-only recognizer.
    Original language: Undefined
    Title of host publication: Proceedings of the 10th international conference on Multimodal interfaces (ICMI '08)
    Place of Publication: New York
    Publisher: Association for Computing Machinery (ACM)
    Number of pages: 8
    ISBN (Print): 978-1-60558-198-9
    Publication status: Published - 2008
    Event: 10th International Conference on Multimodal Interfaces, ICMI 2008 - Chania, Crete, Greece
    Duration: 20 Oct 2008 - 22 Oct 2008
    Conference number: 10

    Conference: 10th International Conference on Multimodal Interfaces, ICMI 2008
    Abbreviated title: ICMI
    City: Chania, Crete


    • METIS-264252
    • Contextual information
    • IR-69012
    • EWI-17026
    • HMI-HF: Human Factors
    • human-human interaction
    • head nod recognition
    • HMI-IA: Intelligent Agents
    • visual gesture recognition
