Boosting Methods for Multi-label Data Cyberbullying

Fidya Farasalsabila; Mhd Adi Setiawan Aritonang; Faradiba Jabnabillah; Anip Moniva; Verra Budhi Lestari; Rizky Handayani

doi:10.30865/jurikom.v12i3.8721

Authors

Fidya Farasalsabila Institut Teknologi Batam, Batam
Mhd Adi Setiawan Aritonang Institut Teknologi Batam, Batam
Faradiba Jabnabillah Institut Teknologi Batam, Batam
Anip Moniva Politeknik AI Budi Mulia Dua, Yogyakarta
Verra Budhi Lestari Universitas Media Nusantara Citra, Jakarta
Rizky Handayani Institut Teknologi Bisnis dan Kesehatan Bhakti Putra Bangsa Indonesia, Jawa Tengah

Keywords:

Cyberbullying, Multi-label Classification, Boosting Methods, Sentiment Analysis.

Abstract

Easy access to the internet and social media allows individuals to communicate anonymously, which creates opportunities for abusive and harmful behavior. The psychological impact of cyberbullying can be severe: it can trigger stress and depression and, in the most serious cases, contribute to suicide. This paper presents a sentiment analysis of cyberbullying using four boosting methods, namely Gradient Boosting, XGBoost, AdaBoost, and LightGBM, on a public multi-label dataset covering six categories. The aim is to compare the relative performance of these methods on the challenges of multi-label sentiment analysis in the context of cyberbullying. The results indicate that XGBoost and LightGBM tend to detect cyberbullying more effectively in the more complex categories, which supports the development of better detection systems for multi-label sentiment analysis. The study contributes a comparative analysis of state-of-the-art boosting algorithms, highlights their strengths in multi-label classification tasks, and offers practical insights for building more accurate and reliable cyberbullying detection systems. The findings are expected to serve as a reference for future machine learning-based tools that can help mitigate the psychological harm caused by online abuse, particularly by detecting subtle and complex forms of cyberbullying.
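
As a rough illustration of the kind of comparison the abstract describes, the sketch below trains two boosting classifiers on a six-label problem and reports a micro-averaged F1 score for each. Everything in it is an assumption made for illustration: the data are synthetic random features rather than the cyberbullying dataset, each label is handled by an independent binary classifier through scikit-learn's MultiOutputClassifier (the abstract does not say how the multi-label setting was handled), and only the two boosting methods that ship with scikit-learn are included. XGBoost and LightGBM could be swapped in the same way if those libraries are installed.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Synthetic stand-in for a six-label dataset (NOT the paper's data):
# each label is a noisy linear function of 20 random features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
W = rng.normal(size=(20, 6))
Y = (X @ W + rng.normal(scale=0.5, size=(400, 6)) > 0).astype(int)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0
)

# Hypothetical settings chosen for speed, not taken from the paper.
models = {
    "GradientBoosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, base in models.items():
    # One independent binary classifier per label (binary relevance).
    clf = MultiOutputClassifier(base)
    clf.fit(X_train, Y_train)
    Y_pred = clf.predict(X_test)
    # Micro-averaging pools true/false positives across all six labels.
    scores[name] = f1_score(Y_test, Y_pred, average="micro", zero_division=0)
    print(f"{name}: micro-F1 = {scores[name]:.3f}")
```

Micro-averaged F1 is only one possible metric; per-label scores would be needed to see which categories are harder to detect, which is the comparison the abstract emphasizes.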
