Decoding Online Hate in the United States: A BERT-CNN Analysis of 36 Million Tweets from 2020 to 2022

Shasank Sekhar Pandey*, Alberto Garcia-Robledo, Mahboobeh Zangiabady

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Since its inception, social media has enabled people worldwide to connect with like-minded individuals and freely express their thoughts and opinions. However, its widespread nature has not only had an immeasurable impact on society but also presented significant challenges. One such challenge is online hate speech. Consequently, the identification of hate speech has recently gained considerable attention, ranging from reactive methods, such as classifying individual posts, to proactive strategies that utilize contextual information to decipher the complex lexicon of online discussions. Despite these efforts, current research lacks a comprehensive analysis of hate speech on Twitter during the crucial 2020-2022 period, marked by significant events such as the COVID-19 pandemic. In this paper, we present a BERT-based model for classifying hate speech. To this end, we collected 36 million tweets posted in the United States on Twitter during this period. We developed, trained, and tested a BERT-based Convolutional Neural Network (BERT-CNN), using it to classify the collected tweets. The classification of this dataset revealed a high incidence of targets motivated by ethnicity, with gender and nationality as other prominent categories. This work provides insightful data on the sentiments of individuals across the United States during the events of 2020-2022.
Original language: English
Title of host publication: 18th IEEE International Conference on Semantic Computing (ICSC2024)
Publisher: IEEE
Pages: 329-334
Number of pages: 6
ISBN (Electronic): 9798350385359
Publication status: Published - 22 Mar 2024

Keywords

  • Sentiment analysis
  • Social Network Analysis (SNA)
  • BERT
  • Convolutional Neural Networks (CNN)
  • Twitter
  • Hate speech detection
