Robust Canonical Correlation Analysis: Audio-visual fusion for learning continuous interest

Mihalis A. Nicolaou, Yannis Panagakis, Stefanos Zafeiriou, Maja Pantic

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

    20 Citations (Scopus)
    17 Downloads (Pure)


    The problem of automatically estimating the interest level of a subject has been gaining attention from researchers, mostly due to the broad applicability of interest detection. In this work, we obtain a set of continuous interest annotations for the SEMAINE database, which we also analyse in terms of emotion dimensions such as valence and arousal. Most importantly, we propose a robust variant of Canonical Correlation Analysis (RCCA) for performing audio-visual fusion, which we apply to the prediction of interest. RCCA recovers a low-rank subspace that captures the correlations of the fused modalities, while isolating gross errors in the data without making any assumptions regarding Gaussianity. We show experimentally that RCCA is more appropriate than other standard fusion techniques (such as l2-CCA and feature-level fusion), since it both captures interactions between modalities and decontaminates the obtained subspace from the errors that are dominant in real-world problems.
    Original language: Undefined
    Title of host publication: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014
    Place of Publication: USA
    Publisher: IEEE Computer Society
    Number of pages: 5
    ISBN (Print): 978-1-4799-2892-7
    Publication status: Published - May 2014
    Event: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014 - Fortezza da Basso, Florence, Italy
    Duration: 4 May 2014 - 9 May 2014

    Publication series

    Publisher: IEEE Computer Society


    Conference: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014
    Abbreviated title: ICASSP


    • HMI-HF: Human Factors
    • EWI-25821
    • EC Grant Agreement nr.: FP7/2007-2013
    • EC Grant Agreement nr.: FP7/288235
    • METIS-309947
    • Interest Detection
    • Emotion Recognition
    • IR-95228
    • Audio-visual Fusion
    • Multi-modal Fusion
