Abstract

We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.
Original language: English
Title of host publication: Workshop on Natural Language Processing and Computational Social Science
Subtitle of host publication: NLP+CSS, EMNLP
Place of Publication: Austin, Texas
Publisher: Association for Computational Linguistics
Pages: 55-65
Number of pages: 11
ISBN (Print): 978-1-945626-26-5
DOIs: 10.18653/v1/W16-5608
State: Published - Nov 2016

Fingerprint

identity
experiment
classification
theory
category
political identity
twitter
method
merger
speaking
input
text
user
test
performance

Keywords

  • EWI-27810
  • IR-102356
  • METIS-319140

Cite this

Priante, A., Hiemstra, D., van den Broek, T. A., Saeed, A., Ehrenhard, M. L., & Need, A. (2016). #WhoAmI in 160 characters? Classifying social identities based on Twitter profile descriptions. In Workshop on Natural Language Processing and Computational Social Science: NLP+CSS, EMNLP (pp. 55-65). Austin, Texas: Association for Computational Linguistics. DOI: 10.18653/v1/W16-5608

Priante, Anna; Hiemstra, Djoerd; van den Broek, Tijs Adriaan; Saeed, Aaqib; Ehrenhard, Michel Léon; Need, Ariana / #WhoAmI in 160 characters? Classifying social identities based on Twitter profile descriptions.

Workshop on Natural Language Processing and Computational Social Science: NLP+CSS, EMNLP. Austin, Texas : Association for Computational Linguistics, 2016. p. 55-65.

Research output: Scientific - peer-review; Conference contribution

@inproceedings{b33495df5a2844a9987d6ab7c2b0ce3b,
title = "{\#WhoAmI} in 160 characters? Classifying social identities based on {Twitter} profile descriptions",
abstract = "We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.",
keywords = "EWI-27810, IR-102356, METIS-319140",
author = "Anna Priante and Djoerd Hiemstra and {van den Broek}, {Tijs Adriaan} and Aaqib Saeed and Ehrenhard, {Michel Léon} and Ariana Need",
year = "2016",
month = "11",
doi = "10.18653/v1/W16-5608",
isbn = "978-1-945626-26-5",
pages = "55--65",
booktitle = "Workshop on Natural Language Processing and Computational Social Science",
publisher = "Association for Computational Linguistics",
address = "Austin, Texas",
}

Priante, A, Hiemstra, D, van den Broek, TA, Saeed, A, Ehrenhard, ML & Need, A 2016, #WhoAmI in 160 characters? Classifying social identities based on Twitter profile descriptions. in Workshop on Natural Language Processing and Computational Social Science: NLP+CSS, EMNLP. Association for Computational Linguistics, Austin, Texas, pp. 55-65. DOI: 10.18653/v1/W16-5608


TY  - CHAP
T1  - #WhoAmI in 160 characters?
T2  - Classifying social identities based on Twitter profile descriptions
AU  - Priante, Anna
AU  - Hiemstra, Djoerd
AU  - van den Broek, Tijs Adriaan
AU  - Saeed, Aaqib
AU  - Ehrenhard, Michel Léon
AU  - Need, Ariana
PY  - 2016/11
Y1  - 2016/11
AB  - We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.
KW  - EWI-27810
KW  - IR-102356
KW  - METIS-319140
DO  - 10.18653/v1/W16-5608
M3  - Conference contribution
SN  - 978-1-945626-26-5
SP  - 55
EP  - 65
BT  - Workshop on Natural Language Processing and Computational Social Science
PB  - Association for Computational Linguistics
ER  - 

Priante A, Hiemstra D, van den Broek TA, Saeed A, Ehrenhard ML, Need A. #WhoAmI in 160 characters? Classifying social identities based on Twitter profile descriptions. In Workshop on Natural Language Processing and Computational Social Science: NLP+CSS, EMNLP. Austin, Texas: Association for Computational Linguistics. 2016. p. 55-65. DOI: 10.18653/v1/W16-5608