Automatic Role Recognition Based on Conversational and Prosodic Behaviour

Hugues Salamin, Khiet Phuong Truong, Gelareh Mohammadi

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    10 Citations (Scopus)
    1 Downloads (Pure)


    This paper proposes an approach for the automatic recognition of roles in settings like news and talk-shows, where roles correspond to specific functions like Anchorman, Guest or Interview Participant. The approach is based purely on nonverbal vocal behavioral cues: who talks when and how much (turn-taking behavior), and statistical properties of pitch, formants, energy and speaking rate (prosodic behavior). The experiments were performed over a corpus of around 50 hours of broadcast material, and the accuracy, measured as the percentage of time correctly labeled in terms of role, is up to 89%. Both turn-taking and prosodic behavior lead to satisfactory results. Furthermore, on one database, their combination leads to a statistically significant improvement.
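    The accuracy measure named in the abstract (percentage of time correctly labeled in terms of role) can be illustrated with a minimal sketch. The segment layout, the function name `time_weighted_accuracy` and the example data below are assumptions made for illustration; they are not taken from the paper or its corpus.

```python
from typing import List, Tuple

# Hypothetical segment layout: (start_seconds, end_seconds, true_role, predicted_role)
Segment = Tuple[float, float, str, str]


def time_weighted_accuracy(segments: List[Segment]) -> float:
    """Return the fraction of total labeled time whose predicted role matches the true role."""
    total = sum(end - start for start, end, _, _ in segments)
    if total <= 0:
        raise ValueError("segments must cover a positive duration")
    correct = sum(
        end - start
        for start, end, true_role, predicted_role in segments
        if true_role == predicted_role
    )
    return correct / total


# Made-up example data, not drawn from the paper's corpus.
example = [
    (0.0, 30.0, "Anchorman", "Anchorman"),
    (30.0, 90.0, "Guest", "Guest"),
    (90.0, 100.0, "Guest", "Anchorman"),  # 10 seconds mislabeled
]
accuracy = time_weighted_accuracy(example)
```

    Weighting by duration rather than by number of turns means that a long, correctly labeled turn counts for more than a short, mislabeled one, which matches the "percentage of time" wording.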
    Original language: English
    Title of host publication: Proceedings of the ACM International Conference on Multimedia
    Place of Publication: New York
    Publisher: Association for Computing Machinery
    Number of pages: 4
    ISBN (Print): 978-1-60558-933-6
    Publication status: Published - 2010
    Event: 18th ACM Multimedia Conference, MM 2010 - Firenze, Italy
    Duration: 25 Oct 2010 to 29 Oct 2010
    Conference number: 18


    Conference: 18th ACM Multimedia Conference, MM 2010
    Abbreviated title: MM


    • METIS-271131
    • EC Grant Agreement nr.: FP7/231287
    • EWI-18805
    • IR-74618

