Comparative Analysis of Naive Bayes Model Performance in Hate Speech Detection in Media Social Twitter
DOI:
https://doi.org/10.30865/jurikom.v10i1.5493
Keywords:
Naïve Bayes, Multinomial, Gaussian, Bernoulli, Hate Speech
Abstract
Twitter is a popular social media platform in Indonesia, and many people use it to find and disseminate information. Hate speech is aggressive behavior directed at individuals or groups on the basis of race, gender, religion, nationality, ethnicity, sexual orientation, gender identity, or disability. In this study, hate speech is modeled using three Naïve Bayes variants: Multinomial, Bernoulli, and Gaussian. These methods were chosen because Naïve Bayes is simple yet performs well in sentiment analysis. The research aims to identify which variant achieves the highest accuracy in classifying hate speech, so that the best-performing model can be applied to hate speech detection. The data were collected from Twitter, processed, and then classified with the Multinomial, Gaussian, and Bernoulli Naïve Bayes models into hate speech (HS) and non-hate-speech (non-HS) categories. Two different scenarios were tested to find the best accuracy. The Multinomial Naïve Bayes model achieved the highest accuracy, 82.13%, outperforming the other two models.
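As a rough illustration of the classification step described in the abstract, the following is a minimal sketch of a Multinomial Naïve Bayes classifier with add-one (Laplace) smoothing, written in plain Python. It is not the authors' implementation: the function names, the smoothing value, and the four toy English token lists are assumptions made for illustration only, and the study's own data, preprocessing, and scenarios are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a Multinomial Naive Bayes model on tokenized documents.

    alpha is the additive smoothing constant (1.0 = Laplace smoothing).
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_counts.items():
        # Class prior: share of training documents with this label.
        model["log_prior"][label] = math.log(n_docs / len(docs))
        # Smoothed per-word likelihood within this class.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, tokens):
    """Return the label with the highest log posterior score."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        ll = model["log_likelihood"][label]
        # Words outside the training vocabulary are ignored.
        scores[label] = log_prior + sum(ll[w] for w in tokens if w in ll)
    return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return correct / len(docs)


# Hypothetical toy data (not from the study), labeled HS / non-HS.
train_docs = [
    ["i", "hate", "those", "people"],
    ["they", "are", "disgusting", "people"],
    ["i", "love", "this", "city"],
    ["nice", "weather", "this", "morning"],
]
train_labels = ["HS", "HS", "non-HS", "non-HS"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, ["hate", "disgusting"]))
print(predict(model, ["love", "weather"]))
print(accuracy(model, train_docs, train_labels))
```

Comparing variants, as the study does, would amount to fitting each model on the same training split and reporting `accuracy` on a held-out split; the Bernoulli variant would use word presence or absence instead of counts, and the Gaussian variant would model continuous feature values.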