Comparative Analysis of Naive Bayes Model Performance in Hate Speech Detection in Media Social Twitter

Muhammad Hadyan Baqi; Yuliant Sibaroni; Sri Suryani Prasetiyowati

doi:10.30865/jurikom.v10i1.5493

Authors

Muhammad Hadyan Baqi Telkom University, Bandung
Yuliant Sibaroni Telkom University, Bandung
Sri Suryani Prasetiyowati Telkom University, Bandung

Keywords:

Naïve Bayes, Multinomial, Gaussian, Bernoulli, Hate Speech

Abstract

Twitter is a popular social media platform in Indonesia, and for some people it is a place to find and disseminate information. Hate speech is aggressive behavior directed at individuals or groups on the basis of attributes such as race, gender, religion, nationality, ethnicity, sexual orientation, gender identity, or disability. In this study, hate speech is modeled using three Naïve Bayes models: Multinomial, Bernoulli, and Gaussian Naïve Bayes. These methods were chosen because Naïve Bayes is simple yet performs well in sentiment analysis. This research aims to identify the model with the highest accuracy in detecting hate speech, so that the best-performing Naïve Bayes model can be offered as a solution to the hate speech detection problem. The process carried out in this study is to preprocess data obtained from Twitter and then classify it into hate speech (HS) and non-HS categories using the Multinomial, Gaussian, and Bernoulli Naïve Bayes models. To obtain the best accuracy, two different scenarios were used. The Multinomial Naïve Bayes model achieved the best accuracy of the three models, at 82.13%.
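To make the difference between the three models concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The toy corpus, the whitespace tokenizer, the smoothing constant `alpha`, the variance floor `var_floor`, and all function names are assumptions made for illustration; the study's dataset, preprocessing, and two test scenarios are not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy corpus (NOT the study's data): label 1 = HS, 0 = non-HS.
TRAIN = [
    ("you are stupid and useless", 1),
    ("i hate you stupid people", 1),
    ("useless people go away", 1),
    ("have a nice day friend", 0),
    ("i love this nice weather", 0),
    ("good morning my friend", 0),
]

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

VOCAB = sorted({w for text, _ in TRAIN for w in tokenize(text)})
CLASSES = sorted({label for _, label in TRAIN})

def log_prior(c):
    # log P(c), estimated from class frequencies in the training set.
    return math.log(sum(1 for _, y in TRAIN if y == c) / len(TRAIN))

def multinomial_score(tokens, c, alpha=1.0):
    """Multinomial NB: each token occurrence contributes log P(word | c)."""
    counts = Counter(w for text, y in TRAIN if y == c for w in tokenize(text))
    total = sum(counts.values())
    score = log_prior(c)
    for w in tokens:
        if w in VOCAB:  # out-of-vocabulary words are ignored
            score += math.log((counts[w] + alpha) / (total + alpha * len(VOCAB)))
    return score

def bernoulli_score(tokens, c, alpha=1.0):
    """Bernoulli NB: every vocabulary word contributes, present or absent."""
    docs = [set(tokenize(text)) for text, y in TRAIN if y == c]
    present = set(tokens)
    score = log_prior(c)
    for w in VOCAB:
        p = (sum(w in d for d in docs) + alpha) / (len(docs) + 2 * alpha)
        score += math.log(p if w in present else 1.0 - p)
    return score

def gaussian_score(tokens, c, var_floor=1e-2):
    """Gaussian NB: each word count is modeled as a per-class normal variable."""
    rows = [Counter(tokenize(text)) for text, y in TRAIN if y == c]
    x = Counter(tokens)
    score = log_prior(c)
    for w in VOCAB:
        values = [r[w] for r in rows]
        mean = sum(values) / len(values)
        # The floor keeps the variance positive for words with constant counts.
        var = sum((v - mean) ** 2 for v in values) / len(values) + var_floor
        score += -0.5 * math.log(2 * math.pi * var) - (x[w] - mean) ** 2 / (2 * var)
    return score

MODELS = {
    "multinomial": multinomial_score,
    "bernoulli": bernoulli_score,
    "gaussian": gaussian_score,
}

def predict(text, score_fn):
    # Pick the class with the highest log-posterior score.
    tokens = tokenize(text)
    return max(CLASSES, key=lambda c: score_fn(tokens, c))

if __name__ == "__main__":
    for name, fn in MODELS.items():
        print(name, predict("you stupid useless people", fn))
```

The three scoring functions share the same prior and differ only in the likelihood: word counts for Multinomial, word presence or absence for Bernoulli, and a normal density over counts for Gaussian. On real data, accuracy would be measured on a held-out test split rather than on hand-picked sentences.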
