Identifikasi Ujaran Kebencian Multilabel Pada Teks Twitter Berbahasa Indonesia Menggunakan Convolution Neural Network

Authors

  • Aditya Perwira Joan Dwitama, Universitas Islam Indonesia, Yogyakarta
  • Syarif Hidayat, Universitas Islam Indonesia, Yogyakarta

DOI:

https://doi.org/10.30865/json.v3i2.3610

Keywords:

Indonesian, CNN, Machine Learning, Twitter, Hate Speech

Abstract

Communication between internet users in online media has increased significantly as the number of social media users has grown. Twitter users, for instance, send messages through their tweets. Tweets can also carry negative meaning, however, and deserve special attention because they may contain hate speech. The government has deemed it necessary to regulate hate speech, for example through Article 28 paragraph 2 of the Information and Electronic Transactions Law (ITE Law), issued in 2008. Machine Learning (ML) is one technique for identifying patterns, and it can be applied to many types of data, including text (known as text analytics). Previous research used the Support Vector Machine (SVM) method to identify hate speech in Twitter text with more than one label (multilabel). The purpose of this study was to identify multilabel hate speech on Twitter using a Convolutional Neural Network (CNN). The best CNN model obtained in the study reached an accuracy of 98.76% on the multilabel dataset of hate speech in Indonesian text.
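The abstract reports a single accuracy figure for a multilabel task but does not say how that accuracy is computed. As a hedged illustration only, the sketch below contrasts two common definitions: per-label accuracy (every tweet-label decision counts separately) and exact-match accuracy (a tweet counts only if all of its labels are right). The label names, the probabilities, and the 0.5 threshold are hypothetical and are not taken from the study.

```python
from typing import List


def binarize(probs: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-label sigmoid-style scores into 0/1 decisions (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def per_label_accuracy(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of individual (tweet, label) decisions that are correct."""
    total = sum(len(row) for row in y_true)
    correct = sum(
        t == p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return correct / total


def exact_match_accuracy(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of tweets whose full label vector is predicted correctly."""
    matches = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return matches / len(y_true)


# Hypothetical label columns: hate_speech, abusive, targets_individual
y_true = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
# Hypothetical model scores for the same four tweets
probs = [[0.9, 0.2, 0.4], [0.1, 0.3, 0.2], [0.8, 0.7, 0.1], [0.6, 0.9, 0.3]]
y_pred = binarize(probs)

print(f"per-label accuracy:   {per_label_accuracy(y_true, y_pred):.3f}")    # 0.833
print(f"exact-match accuracy: {exact_match_accuracy(y_true, y_pred):.3f}")  # 0.500
```

On the same predictions the two definitions give different numbers, so a reported multilabel accuracy is easiest to compare with other work when the definition is stated alongside it.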

Published

2021-12-31

How to Cite

Dwitama, A. P. J., & Hidayat, S. (2021). Identifikasi Ujaran Kebencian Multilabel Pada Teks Twitter Berbahasa Indonesia Menggunakan Convolution Neural Network. Jurnal Sistem Komputer Dan Informatika (JSON), 3(2), 117–127. https://doi.org/10.30865/json.v3i2.3610