Sentiment Analysis of Hate Speech on Twitter Public Figures with AdaBoost and XGBoost Methods
DOI:
https://doi.org/10.30865/mib.v6i3.4394
Keywords:
Twitter, Hate Speech, Sentiment, Analysis, AdaBoost, XGBoost
Abstract
Public figures are often scrutinized by social media users, either because of what they say or because of their role in a television series. Public figures generally post on their social media accounts to help shape their image, but not everyone who sees a post likes it, and some respond with open dislike. This study aims to determine public sentiment towards the public figure Anya Geraldine as expressed on Twitter in Indonesian. Classification uses the Adaptive Boosting (AdaBoost) and Extreme Gradient Boosting (XGBoost) methods, with text preprocessing consisting of cleaning, case folding, tokenization, and filtering. The data are Indonesian-language tweets collected with the keyword “@anyaselalubenar”, for a total of 7,475 tweets divided into 6,887 positive and 588 negative tweets. Because the resulting labels are imbalanced, oversampling is used to reduce overfitting. Features are weighted with TF-IDF. Four experimental scenarios were carried out to validate the effectiveness of the models: first, model performance without oversampling; second, model performance with oversampling; third, model performance with undersampling; and fourth, model performance with hyperparameter tuning. The experimental results show that XGBoost with SMOTE and hyperparameter tuning achieved 95%, compared with 87% for AdaBoost with SMOTE and hyperparameter tuning. Applying SMOTE and hyperparameter tuning is shown to overcome the data imbalance problem and give better classification results.
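The preprocessing and TF-IDF weighting steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The stopword list, the regular expressions, the example tweets, and the function names `preprocess` and `tfidf` are all assumptions made for illustration, and the TF-IDF variant shown (raw term frequency times log(N / df)) may differ from the one used in the study. The SMOTE and boosting steps are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only; the study's actual
# Indonesian stopword list is not given in the abstract.
STOPWORDS = {"yang", "dan", "di", "itu", "ini"}

def preprocess(tweet):
    """Cleaning, case folding, tokenization, and filtering for one tweet."""
    text = re.sub(r"https?://\S+", " ", tweet)   # cleaning: drop URLs
    text = re.sub(r"[@#]\w+", " ", text)         # cleaning: drop mentions and hashtags
    text = re.sub(r"[^A-Za-z\s]", " ", text)     # cleaning: drop digits and punctuation
    text = text.lower()                          # case folding
    tokens = text.split()                        # tokenization on whitespace
    return [t for t in tokens if t not in STOPWORDS]  # filtering: stopword removal

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses raw term frequency times log(N / df); libraries such as
    scikit-learn apply a smoothed variant, so their values will differ.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Made-up example tweets, not taken from the study's dataset.
tweets = [
    "Aktingnya bagus banget! https://example.com",
    "@anyaselalubenar aktingnya jelek dan membosankan",
    "Foto ini bagus",
]
tokenized = [preprocess(t) for t in tweets]
vectors = tfidf(tokenized)
```

In this sketch, a term that appears in only one tweet receives a higher weight than a term shared by several tweets. That is the property that makes TF-IDF useful as input to classifiers such as AdaBoost and XGBoost.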
License

This work is licensed under a Creative Commons Attribution 4.0 International License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).