A master's thesis from Aalborg University


The Comparative analysis of different predictive analytics models in predicting cyberbullying

Term

4. Semester

Publication year

2024

Submitted on

Abstract

The emergence of social media, especially X, has led to misunderstandings about freedom of speech, resulting in issues like cyberbullying. This study investigates cyberbullying's effects by comparing machine learning algorithms on accuracy, precision, recall, and F1 score. It also assesses how sentiment analysis and Psychosocial Safety Climate (PSC) principles can improve model efficiency. Drawing on PSC theory, which promotes prosocial behavior to prevent bullying, the study merges technological solutions with insights into human behavior to support online safety. The study analyzes 2,000 tweets, using TextBlob for sentiment analysis and TF-IDF vectorization for feature extraction, and trains four models: Multinomial Naive Bayes, Random Forest, XGBoost, and Support Vector Machines (SVM). The models are evaluated using 5-fold cross-validation, with hyperparameters tuned by GridSearchCV. Results show that Random Forest and XGBoost achieved the highest scores, 0.761 and 0.740, respectively. Multinomial Naive Bayes demonstrated exceptional computational efficiency, making it suitable for real-time applications. Sentiment analysis improved detection by emphasizing emotional context, and PSC principles enhanced model effectiveness through features like "number_negative_words" and "number_positive_words". The research underscores the value of combining machine learning with psychosocial theory to detect cyberbullying. It recommends choosing models based on application needs: Random Forest for a balance between performance and interpretability, XGBoost for high accuracy, and Multinomial Naive Bayes for efficiency. Future research should expand datasets, address privacy concerns, and incorporate features like social network analysis to enhance practicality and improve online safety by involving administrators and moderators.
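
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the thesis's code: the names `BinaryMetrics`, `evaluate`, `y_true`, and `y_pred` are hypothetical, the label 1 is assumed to mean "cyberbullying", and the toy labels are invented for illustration and do not reproduce the reported results.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task.

    `positive` is the label treated as the class of interest
    (assumed here to be 1 = cyberbullying).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions
    # or no positive samples), a common convention.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels, invented for illustration only (not from the thesis dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = evaluate(y_true, y_pred)
print(
    f"accuracy={metrics.accuracy:.3f} precision={metrics.precision:.3f} "
    f"recall={metrics.recall:.3f} f1={metrics.f1:.3f}"
)
# prints: accuracy=0.800 precision=0.750 recall=0.750 f1=0.750
```

In a 5-fold cross-validation setting such as the one the abstract describes, a function like this would typically be applied to each held-out fold and the per-fold values averaged.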