Automatic Role Recognition Based on Conversational and Prosodic Behaviour

Hugues Salamin, Khiet Phuong Truong, Gelareh Mohammadi

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

    9 Citations (Scopus)

    Abstract

    This paper proposes an approach for the automatic recognition of roles in settings like news and talk-shows, where roles correspond to specific functions like Anchorman, Guest or Interview Participant. The approach is based on purely nonverbal vocal behavioral cues, including who talks when and how much (turn-taking behavior), and statistical properties of pitch, formants, energy and speaking rate (prosodic behavior). The experiments were performed over a corpus of around 50 hours of broadcast material, and the accuracy (the percentage of time correctly labeled in terms of role) is up to 89%. Both turn-taking and prosodic behavior lead to satisfactory results. Furthermore, on one database, their combination leads to a statistically significant improvement.
    Original language: English
    Title of host publication: Proceedings of the ACM International Conference on Multimedia
    Place of Publication: New York
    Publisher: Association for Computing Machinery (ACM)
    Pages: 847-850
    Number of pages: 4
    ISBN (Print): 978-1-60558-933-6
    Publication status: Published - 2010
    Event: 18th ACM Multimedia Conference, MM 2010 - Firenze, Italy
    Duration: 25 Oct 2010 to 29 Oct 2010
    Conference number: 18
    http://www.sigmm.org/archive/MM/mm10/www.acmmm10.org/index.html

    Conference

    Conference: 18th ACM Multimedia Conference, MM 2010
    Abbreviated title: MM
    Country: Italy
    City: Firenze
    Period: 25/10/10 to 29/10/10

    Keywords

    • METIS-271131
    • EC Grant Agreement nr.: FP7/231287
    • EWI-18805
    • IR-74618
