Experts and Machines United Against Cyberbullying

M. Dadvar

Abstract

One form of online misbehaviour which has deeply affected society with harmful consequences is known as cyberbullying. Cyberbullying can simply be defined as an intentional act that is conducted through digital technology to hurt someone. Cyberbullying is a widely covered topic in the social sciences. There are many studies in which the problem of cyberbullying has been introduced and its origins and consequences have been explored in detail. There are also studies which have investigated intervention and prevention strategies and have proposed guidelines for parents and adults in this regard. However, studies on the technical dimensions of this topic are relatively rare.
In this research the overall goal was to bridge the gap between social science approaches and technical solutions. To be able to suggest solutions that could contribute to minimizing the risk and impact of cyberbullying, we have investigated the phenomenon from different angles. We have thoroughly studied the origin of cyberbullying and its growth over time, as well as the role of technology both in the emergence of this type of virtual behaviour and in the potential for reducing the extent of the social concern it raises.
First, we introduced a novel outlook on the cyberbullying phenomenon. We looked into the gradual changes which have occurred in relationships and social communication with the emergence of the Internet. We argued that one should look at virtual environments as virtual communities, because the human needs projected on these environments, the relationships, the human concerns and the misbehaviour have the same nature as in real-life societies. Therefore, to make virtual communities safe, we need to take safety measures and precautions similar to the ones that are common in non-virtual communities.
We derived the assumption that if cyberbullying is recognized and treated as a social problem, and not just seen as some random mischief conducted by individuals with the use of technology, the methods for handling its consequences are likely to be more realistic, effective and comprehensive. This part of our study led to the conviction that, for combating cyberbullying, behavioural and psychological studies and the study of technical solutions should go hand in hand.
One of the main limitations that we faced when we started our research was the lack of a comprehensive dataset for cyberbullying studies. We needed a dataset which included real instances of bullying incidents. Moreover, it was essential for our studies to also have the demographic information of the social media users as well as the history of their activities. We started our preliminary experiments using a dataset that was collected from MySpace forums. This dataset did not meet all the requirements for our experiment, namely in terms of size and sufficiency of information. Therefore, we developed our own YouTube dataset, with the aim of encompassing extensive information about the users and their activities as well as larger numbers of bullying comments. We collected information on user activities and posted textual comments, as well as personal and demographic details of the users involved.
Detecting a bullying comment or post at the earliest possible moment can substantially decrease the negative effects of cyberbullying incidents. We started our experiments by showing that, besides the conventional features used for text mining methods such as sentiment analysis and specifically bullying detection, more personal features (in this experiment, gender) can improve the accuracy of the detection models. As expected, the models which were optimized accordingly resulted in a more accurate classification.
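As a rough illustration of how a personal feature such as gender might be combined with conventional text features, the sketch below copies each content feature under a gender-specific name, so that a linear classifier could learn separate weights per gender. This is an assumed, simplified technique chosen for illustration, not the feature set or method used in the thesis; the word lists, feature names and the function `extract_features` are all hypothetical.

```python
import re

# Hypothetical placeholder lexicons; a real system would use curated lists.
OFFENSIVE_TERMS = {"idiot", "loser", "stupid"}
SECOND_PERSON = {"you", "your", "yourself", "u"}


def extract_features(comment, gender):
    """Return content features plus gender-specific copies of each one."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    n = max(len(tokens), 1)  # avoid division by zero on empty comments
    content = {
        "offensive_ratio": sum(t in OFFENSIVE_TERMS for t in tokens) / n,
        "second_person_ratio": sum(t in SECOND_PERSON for t in tokens) / n,
    }
    features = dict(content)
    # Assumption: duplicating a feature under a gender-tagged name lets a
    # linear model weight the same signal differently for each gender.
    for name, value in content.items():
        features[f"{name}|gender={gender}"] = value
    return features


if __name__ == "__main__":
    print(extract_features("You are such a loser", "f"))
```

The resulting dictionary could be passed to any classifier that accepts named numeric features; which classifier and which features work best is an empirical question that this sketch does not address.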
The improved outcome motivated us to look into other personal features as well, such as age and the writing style of users. By adding more personal information, we outperformed the previous classification results and improved the detection accuracy even further.
In the last experiment we made use of experts’ knowledge to identify potential bully users in social networks. To better understand and interpret the intentions underlying the online activities of social media users, we decided to incorporate human reasoning and knowledge into a bulliness rating system by developing a Multi-Criteria Evaluation System. Moreover, to have more sources of information and to make use of the potential of both human and machine, we designed a hybrid approach, incorporating machine learning models on top of the expert system. The hybrid approach reached an optimum model which outperformed the results obtained from the machine learning models and the expert system individually.
Our hybrid model illustrates, for the first time, the added value of integrating technical solutions with insights from the social sciences. As argued in this thesis, the integration of social studies into a software-enhanced monitoring workflow could pave the way towards tackling this kind of online misbehaviour. The ideas and algorithms proposed for this purpose can be a stepping stone for future research in this direction. The work carried out also demonstrates the added value of frameworks for text categorization, sentiment mining and user profiling in applications addressing societal issues. This work can be viewed as a contribution to the more general societal challenge of increasing the level of cybersecurity, in particular for the younger generations of social network users. By turning the internet into a safer place for children, the chances increase that they will be able to benefit from the informational richness that it also offers.
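The hybrid idea described in the abstract, an expert-driven multi-criteria score combined with a machine learning model, can be sketched in a few lines. The abstract does not specify the criteria, the weights or the way the two sources are combined, so everything below is an assumption made for illustration: the criteria names, the weights in `EXPERT_WEIGHTS`, and the simple linear blend in `hybrid_score` are hypothetical and are not taken from the thesis.

```python
# Hypothetical expert-assigned weights for illustrative criteria (sum to 1).
EXPERT_WEIGHTS = {
    "offensive_language": 0.5,
    "second_person_address": 0.3,
    "activity_history": 0.2,
}


def expert_score(criteria):
    """Weighted sum of criteria values, each expected to lie in [0, 1]."""
    return sum(EXPERT_WEIGHTS[name] * criteria[name] for name in EXPERT_WEIGHTS)


def hybrid_score(criteria, ml_probability, ml_weight=0.5):
    """Blend the expert score with a classifier's bullying probability.

    Assumption: a linear blend; other combinations (for example, feeding
    the expert score to the classifier as a feature) are equally plausible.
    """
    return ml_weight * ml_probability + (1.0 - ml_weight) * expert_score(criteria)


if __name__ == "__main__":
    user = {
        "offensive_language": 1.0,
        "second_person_address": 0.5,
        "activity_history": 0.0,
    }
    print(hybrid_score(user, ml_probability=0.9))
```

A user would then be flagged for review when the blended score exceeds a chosen threshold; both the threshold and `ml_weight` would need to be tuned on labelled data.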
Original language: Undefined
Awarding Institution
  • University of Twente
Supervisors/Advisors
  • de Jong, Franciska M.G., Supervisor
Date of Award: 12 Sep 2014
Place of Publication: Enschede
Print ISBNs: 978-90-365-3739-1
DOIs
  • 10.3990/1.9789036537391
State: Published - 12 Sep 2014

Fingerprint

Experiment
Bullying
Social sciences
Machine learning
Learning model
Social networks
Demographics
Virtual community
Expert system
Digital technology
Monitoring
Software
Virtual environments
Expert knowledge
Psychological
Social problems
Hybrid model
Profiling
Precaution
Sentiment

Keywords

  • METIS-304975
  • Information Retrieval
  • EWI-25014
  • Sentiment Analysis
  • Cyberbullying
  • Expert Systems
  • IR-91720

Cite this

Dadvar, M. / Experts and Machines United Against Cyberbullying. Enschede, 2014. 159 p.
@misc{426a5a9099b04f539fdc1184a5a4b5c7,
title = "Experts and Machines United Against Cyberbullying",
keywords = "METIS-304975, Information Retrieval, EWI-25014, Sentiment Analysis, Cyberbullying, Expert Systems, IR-91720",
author = "M. Dadvar",
note = "SIKS Dissertation series no. 2014-37",
year = "2014",
month = "9",
doi = "10.3990/1.9789036537391",
isbn = "978-90-365-3739-1",
school = "University of Twente",

}

Dadvar, M 2014, 'Experts and Machines United Against Cyberbullying', University of Twente, Enschede. DOI: 10.3990/1.9789036537391

Experts and Machines United Against Cyberbullying. / Dadvar, M.

Enschede, 2014. 159 p.

Research output: Scientific; PhD Thesis - Research UT, graduation UT

TY - THES

T1 - Experts and Machines United Against Cyberbullying

AU - Dadvar,M.

N1 - SIKS Dissertation series no. 2014-37

PY - 2014/9/12

Y1 - 2014/9/12

KW - METIS-304975

KW - Information Retrieval

KW - EWI-25014

KW - Sentiment Analysis

KW - Cyberbullying

KW - Expert Systems

KW - IR-91720

U2 - 10.3990/1.9789036537391

DO - 10.3990/1.9789036537391

M3 - PhD Thesis - Research UT, graduation UT

SN - 978-90-365-3739-1

ER -

Dadvar M. Experts and Machines United Against Cyberbullying. Enschede, 2014. 159 p. DOI: 10.3990/1.9789036537391