Improving Dutch sentiment analysis in Pattern

Lorenzo Gatti, Judith van Stegeren

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

In this paper we investigate methods for improving the sentiment analysis functionality of Pattern.nl, the Dutch submodule of Pattern, an open-source library for web mining and natural language processing. We discuss the impact on performance of three different potential improvements: extending the module’s internal sentiment lexicon; removing subsets of neutral words from the sentiment lexicon; and improving the algorithm for combining multiple word-level sentiment ratings into a sentence-level sentiment rating. We evaluated the improvements on datasets from the product review domain (books, clothing and music) and a dataset of short emotional stories. The experiments show that lexicon expansion does not lead to better results; new normalization techniques, on the other hand, show a limited but consistent performance increase for sentiment ratings.
Original language: English
Pages (from-to): 73-89
Number of pages: 15
Journal: Computational linguistics in the Netherlands journal
Volume: 10
Publication status: Published - 12 Dec 2020

Keywords

  • Sentiment Analysis
  • Pattern
  • natural language processing
  • Dutch
  • Emotion detection
  • text analysis
  • Text classification
  • Text Mining
