#WhoAmI in 160 characters? Classifying social identities based on Twitter profile descriptions

Anna Priante, Djoerd Hiemstra, Tijs Adriaan van den Broek, Aaqib Saeed, Michel Léon Ehrenhard, Ariana Need

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review



We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.
Original language: English
Title of host publication: Workshop on Natural Language Processing and Computational Social Science
Subtitle of host publication: NLP+CSS, EMNLP
Place of Publication: Austin, Texas
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 11
ISBN (Print): 978-1-945626-26-5
Publication status: Published - Nov 2016


  • EWI-27810
  • IR-102356
  • METIS-319140
