DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques

Oscar Araque*, Lorenzo Gatti, Jacopo Staiano, Marco Guerini

*Corresponding author for this work

    Research output: Contribution to journal › Article › Academic › peer-review

    Abstract

    Several lexica for sentiment analysis have been developed; while most of these come with word polarity annotations (e.g., positive/negative), attempts at building lexica for finer-grained emotion analysis (e.g., happiness, sadness) have recently attracted significant attention. They are often exploited as a building block for developing emotion recognition learning models, and/or used as baselines to which the performance of the models can be compared. In this work, we contribute two new resources, which we call DepecheMood++ (DM++): a) an extension of an existing and widely used emotion lexicon for English; and b) a novel version of the lexicon, targeting Italian. Furthermore, we show how simple techniques can be used, both in supervised and unsupervised experimental settings, to boost performance on datasets and tasks of varying degrees of domain-specificity. Also, we report an extensive comparative analysis against other available emotion lexica and state-of-the-art supervised approaches, showing that DepecheMood++ emerges as the best-performing non-domain-specific lexicon in unsupervised settings. We also observe that simple learning models on top of DM++ can provide more challenging baselines. We finally introduce embedding-based methodologies to perform a) vocabulary expansion to address data scarcity and b) vocabulary porting to new languages when training data is not available.
    Original language: English
    Journal: IEEE Transactions on Affective Computing
    DOI: 10.1109/TAFFC.2019.2934444
    Publication status: E-pub ahead of print/First online - 14 Aug 2019

    Cite this

    @article{0ae01293754342429b6c921f9344ee9b,
    title = "DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques",
    abstract = "Several lexica for sentiment analysis have been developed; while most of these come with word polarity annotations (e.g., positive/negative), attempts at building lexica for finer-grained emotion analysis (e.g., happiness, sadness) have recently attracted significant attention. They are often exploited as a building block for developing emotion recognition learning models, and/or used as baselines to which the performance of the models can be compared. In this work, we contribute two new resources, that we call DepecheMood++ (DM++): a) an extension of an existing and widely used emotion lexicon for English; and b) a novel version of the lexicon, targeting Italian. Furthermore, we show how simple techniques can be used, both in supervised and unsupervised experimental settings, to boost performance on datasets and tasks of varying degree of domain-specificity. Also, we report an extensive comparative analysis against other available emotion lexica and state-of-the-art supervised approaches, showing that DepecheMood++ emerges as the best-performing non-domain-specific lexicon in unsupervised settings. We also observe that simple learning models on top of DM++ can provide more challenging baselines. We finally introduce embedding-based methodologies to perform a) vocabulary expansion to address data scarcity and b) vocabulary porting to new languages in case training data is not available.",
    author = "Oscar Araque and Lorenzo Gatti and Jacopo Staiano and Marco Guerini",
    year = "2019",
    month = "8",
    day = "14",
    doi = "10.1109/TAFFC.2019.2934444",
    language = "English",
    journal = "IEEE Transactions on Affective Computing",
    issn = "1949-3045",
    publisher = "IEEE",

    }

    DepecheMood++: a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques. / Araque, Oscar; Gatti, Lorenzo; Staiano, Jacopo; Guerini, Marco.

    In: IEEE Transactions on Affective Computing, 14.08.2019.

    TY - JOUR

    T1 - DepecheMood++

    T2 - a Bilingual Emotion Lexicon Built Through Simple Yet Powerful Techniques

    AU - Araque, Oscar

    AU - Gatti, Lorenzo

    AU - Staiano, Jacopo

    AU - Guerini, Marco

    PY - 2019/8/14

    Y1 - 2019/8/14

    N2 - Several lexica for sentiment analysis have been developed; while most of these come with word polarity annotations (e.g., positive/negative), attempts at building lexica for finer-grained emotion analysis (e.g., happiness, sadness) have recently attracted significant attention. They are often exploited as a building block for developing emotion recognition learning models, and/or used as baselines to which the performance of the models can be compared. In this work, we contribute two new resources, that we call DepecheMood++ (DM++): a) an extension of an existing and widely used emotion lexicon for English; and b) a novel version of the lexicon, targeting Italian. Furthermore, we show how simple techniques can be used, both in supervised and unsupervised experimental settings, to boost performance on datasets and tasks of varying degree of domain-specificity. Also, we report an extensive comparative analysis against other available emotion lexica and state-of-the-art supervised approaches, showing that DepecheMood++ emerges as the best-performing non-domain-specific lexicon in unsupervised settings. We also observe that simple learning models on top of DM++ can provide more challenging baselines. We finally introduce embedding-based methodologies to perform a) vocabulary expansion to address data scarcity and b) vocabulary porting to new languages in case training data is not available.

    U2 - 10.1109/TAFFC.2019.2934444

    DO - 10.1109/TAFFC.2019.2934444

    M3 - Article

    JO - IEEE Transactions on Affective Computing

    JF - IEEE Transactions on Affective Computing

    SN - 1949-3045

    ER -