UT-DB: an experimental study on sentiment analysis in twitter

Zhemin Zhu, Djoerd Hiemstra, Peter M.G. Apers, Andreas Wombacher

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review


Abstract

This paper describes our system for participating in SemEval 2013 Task 2-B (Kozareva et al., 2013): Sentiment Analysis in Twitter. Given a message, our system classifies whether it expresses positive, negative or neutral sentiment. It uses a co-occurrence rate model. The training data are constrained to the data provided by the task organizers (no other tweet data are used). We consider 9 types of features and use a subset of them in our submitted system. To see the contribution of each feature type, we conduct an experimental study that leaves out one feature type at a time. Results suggest that unigrams are the most important features, that bigrams and POS tags do not seem helpful, and that stopwords should be retained to achieve the best results. The overall results of our system are promising given the constrained features and data we use.
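The leave-one-feature-type-out study described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the classifier below is a stand-in multinomial Naive Bayes rather than the co-occurrence rate model, accuracy is an assumed metric, and only two of the nine feature types are shown. All function names (`unigrams`, `bigrams`, `extract`, `leave_one_out_ablation`) are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical feature extractors (two of the nine feature types, for illustration).
def unigrams(tokens):
    return [f"uni={t}" for t in tokens]

def bigrams(tokens):
    return [f"bi={a}_{b}" for a, b in zip(tokens, tokens[1:])]

FEATURE_TYPES = {"unigrams": unigrams, "bigrams": bigrams}

def extract(text, active):
    """Collect features from every active feature type."""
    tokens = text.lower().split()
    feats = []
    for name in active:
        feats.extend(FEATURE_TYPES[name](tokens))
    return feats

def train(data, active):
    """Stand-in classifier (assumption): Naive Bayes counts per label."""
    label_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for text, label in data:
        label_counts[label] += 1
        for f in extract(text, active):
            feat_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feat_counts, vocab

def predict(model, text, active):
    """Pick the label with the highest add-one-smoothed log score."""
    label_counts, feat_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(feat_counts[label].values()) + len(vocab)
        for f in extract(text, active):
            if f in vocab:  # features unseen in training are ignored
                score += math.log((feat_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(train_data, test_data, active):
    model = train(train_data, active)
    hits = sum(predict(model, t, active) == y for t, y in test_data)
    return hits / len(test_data)

def leave_one_out_ablation(train_data, test_data):
    """Score the full feature set, then each set with one type removed."""
    all_types = list(FEATURE_TYPES)
    results = {"all": accuracy(train_data, test_data, all_types)}
    for left_out in all_types:
        active = [n for n in all_types if n != left_out]
        results[f"without {left_out}"] = accuracy(train_data, test_data, active)
    return results
```

Comparing each "without ..." score against the "all" score indicates how much a feature type contributes: a large drop suggests the type matters, while no drop (or a gain) suggests it is not helping.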
Original language: Undefined
Title of host publication: Proceedings of the Seventh International Workshop on Semantic Evaluation, SemEval 2013
Place of Publication: USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 384-389
Number of pages: 6
ISBN (Print): not assigned
Publication status: Published - Jun 2013

Publication series

Name
Publisher: Association for Computational Linguistics

Keywords

  • EWI-23378
  • Tweet
  • METIS-297658
  • IR-86472
  • Sentiment Analysis

Cite this

Zhu, Z., Hiemstra, D., Apers, P. M. G., & Wombacher, A. (2013). UT-DB: an experimental study on sentiment analysis in twitter. In Proceedings of the Seventh International Workshop on Semantic Evaluation, SemEval 2013 (pp. 384-389). USA: Association for Computational Linguistics (ACL).