A master's thesis from Aalborg University
A comparative analysis of different predictive analytics models in predicting cyberbullying

Term

4th semester

Publication year

2024

Abstract

Social media, especially the platform X (formerly Twitter), has blurred the boundaries of free speech and can enable harmful behavior such as cyberbullying. This thesis examines how to detect cyberbullying more effectively by comparing several machine learning models and by testing whether sentiment analysis and Psychosocial Safety Climate (PSC) principles make the models more effective. Using 2,000 tweets, we apply TextBlob for sentiment analysis and convert text into numbers with TF-IDF vectorization (a method that weights words by how informative they are). We train and test Multinomial Naive Bayes, Random Forests, XGBoost, and Support Vector Machines (SVM). Performance is assessed with 5-fold cross-validation, and model settings are fine-tuned with GridSearchCV (a systematic search for the best parameters). We compare accuracy (how often the model is right), precision (how often flagged posts are truly bullying), recall (how many bullying posts it finds), and F1 score (the balance of precision and recall). Results show that Random Forest and XGBoost achieve the highest overall scores, 0.761 and 0.740 respectively. Multinomial Naive Bayes is exceptionally fast, making it suitable for real-time use. Adding sentiment analysis improves detection by capturing emotional context, and PSC principles enhance effectiveness by incorporating features such as "number_negative_words" and "number_positive_words". Overall, the study highlights that combining machine learning with psychosocial theory strengthens cyberbullying detection. Model choice should match the application: Random Forest for a balance of performance and interpretability, XGBoost for high accuracy, and Multinomial Naive Bayes for efficiency. Future work should expand datasets, address privacy concerns, and add features such as social network analysis, while involving administrators and moderators to improve online safety in practice.
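The TF-IDF weighting and the PSC-inspired word-count features mentioned in the abstract can be sketched in plain Python. The toy "tweets" and the tiny positive/negative lexicons below are invented for illustration only; the thesis used TextBlob and a real lexicon on 2,000 tweets.

```python
import math

# Toy corpus standing in for tweets; illustrative only, not the thesis data.
docs = [
    "you are so stupid and ugly",
    "have a great day friend",
    "nobody likes you go away",
    "great game last night friend",
]

def tf_idf(term, doc, corpus):
    """Weight a term by its frequency in one document, discounted by how
    many documents contain it, so common words get a low weight."""
    words = doc.split()
    tf = words.count(term) / len(words)
    df = sum(1 for d in corpus if term in d.split())
    idf = math.log(len(corpus) / df)  # df > 0 whenever term occurs in doc
    return tf * idf

# "friend" appears in half the corpus, "stupid" in only one document,
# so "stupid" carries more information about its document.
print(tf_idf("stupid", docs[0], docs) > tf_idf("friend", docs[1], docs))  # → True

# Hypothetical mini-lexicons for the PSC-style count features named
# in the abstract ("number_negative_words", "number_positive_words").
NEGATIVE = {"stupid", "ugly", "nobody", "away"}
POSITIVE = {"great", "friend"}

def psc_features(doc):
    words = doc.split()
    return {
        "number_negative_words": sum(w in NEGATIVE for w in words),
        "number_positive_words": sum(w in POSITIVE for w in words),
    }

print(psc_features(docs[0]))  # → {'number_negative_words': 2, 'number_positive_words': 0}
```

These hand-rolled features would be concatenated with the TF-IDF matrix before training, which is how the word-count features can complement the purely lexical representation.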

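The tuning and evaluation loop described in the abstract (TF-IDF features, 5-fold cross-validation, GridSearchCV) can be sketched with scikit-learn. This is a minimal sketch assuming a pipeline with Multinomial Naive Bayes on an invented toy dataset, not the thesis's 2,000-tweet corpus, and it tunes only the smoothing parameter alpha for brevity.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import GridSearchCV

# Invented toy data: 1 = cyberbullying, 0 = benign.
bullying = [
    "you are so stupid", "nobody likes you loser", "go away ugly freak",
    "you are worthless trash", "everyone hates you idiot",
    "shut up you pathetic loser", "you are a dumb failure",
    "disgusting freak nobody cares", "stupid ugly waste of space",
    "you idiot just quit",
]
benign = [
    "great game last night", "have a wonderful day",
    "congrats on the new job", "love this sunny weather",
    "thanks for the kind words", "happy birthday my friend",
    "what a beautiful photo", "see you at lunch tomorrow",
    "proud of your hard work", "enjoy the weekend everyone",
]
texts = bullying + benign
labels = [1] * len(bullying) + [0] * len(benign)

# TF-IDF vectorization feeding a Multinomial Naive Bayes classifier.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),
])

# GridSearchCV runs 5-fold cross-validation for every parameter
# combination and keeps the one with the best F1 score.
search = GridSearchCV(
    pipe,
    param_grid={"clf__alpha": [0.1, 0.5, 1.0]},
    cv=5,
    scoring="f1",
)
search.fit(texts, labels)
print(search.best_params_, round(search.best_score_, 3))
```

Swapping the classifier step for `RandomForestClassifier`, `xgboost.XGBClassifier`, or `SVC` and widening `param_grid` accordingly reproduces the kind of model comparison the abstract reports.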

[This abstract has been rewritten with the help of AI based on the project's original abstract]