Hate Speech Hashtag Classification on Twitter Using the Hybrid Classifier Method

Authors

  • Aulia Rayhan Syaifullah, Telkom University, Bandung
  • Yuliant Sibaroni, Telkom University, Bandung

DOI:

https://doi.org/10.30865/jurikom.v9i4.4548

Keywords:

Twitter, Hate Speech, Hybrid Classifier, Classification, Social Media

Abstract

Hate speech on social media, especially Twitter, often takes the form of racism, sexism, or politically motivated attacks aimed at certain individuals or groups. Such speech can trigger crime, riots, violence, and even resistance against individuals or groups. A process for classifying whether a tweet is hate speech is therefore needed to reduce the abuse that occurs on Twitter. Neural networks, which require user data and metadata, are commonly used for hate speech classification. In previous studies, the Naïve Bayes (NB) method, using unigram and bigram features with feature selection, reached an accuracy of 80-85%. The k-Nearest Neighbor (kNN) method reached an accuracy of 70-85% on the classification of hate speech by political figures. The most widely used method, the Support Vector Machine (SVM), has reported accuracies ranging from 70% to 95%. To obtain higher accuracy, this study applies a Hybrid Classifier that combines MLP, kNN, and NB to the classification of hate speech hashtags. The data consist of tweets about trending hashtags collected from November 2021 to June 2022. The average accuracies obtained with MLP, kNN, and NB individually were 72%, 63%, and 73%, respectively. The three methods were then combined in a Hybrid Classifier to improve on these results. Experimental results show that the Hybrid Classifier with the voting method can increase accuracy up to 74%. The hybrid therefore performs better than each of the three classifiers it is composed of, namely kNN, NB, and MLP.
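The voting step that combines the three classifiers can be sketched as follows. This is a minimal illustration that assumes hard (majority) voting over predicted labels; the abstract does not say whether hard or soft voting was used. The function names and the toy predictions are hypothetical and are not the study's data or results.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label list by hard voting."""
    if not predictions:
        raise ValueError("need at least one classifier's predictions")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all classifiers must predict the same number of items")
    combined = []
    for votes in zip(*predictions):
        # Pick the label with the most votes; with three voters and two
        # classes a tie cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of items whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy labels (1 = hate speech, 0 = not hate speech).
mlp_pred = [1, 0, 1, 0, 0, 0]
knn_pred = [1, 1, 0, 1, 0, 1]
nb_pred = [0, 0, 1, 1, 1, 0]
truth = [1, 0, 1, 1, 0, 0]

hybrid_pred = majority_vote([mlp_pred, knn_pred, nb_pred])
for name, pred in [("MLP", mlp_pred), ("kNN", knn_pred),
                   ("NB", nb_pred), ("Hybrid", hybrid_pred)]:
    print(f"{name}: {accuracy(pred, truth):.2f}")
```

In this toy case each classifier errs on different tweets, so the majority label is correct more often than any single classifier. This is the effect a voting hybrid relies on, although the size of the gain on real data depends on how correlated the classifiers' errors are.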

 

Published

2022-08-30

How to Cite

Syaifullah, A. R., & Sibaroni, Y. (2022). Hate Speech Hashtag Classification on Twitter Using the Hybrid Classifier Method. JURNAL RISET KOMPUTER (JURIKOM), 9(4), 828–833. https://doi.org/10.30865/jurikom.v9i4.4548