Multilabel Hate Speech Identification in Indonesian-Language Twitter Text Using a Convolutional Neural Network

Aditya Perwira Joan Dwitama; Syarif Hidayat

doi:10.30865/json.v3i2.3610

Authors

Aditya Perwira Joan Dwitama Universitas Islam Indonesia, Yogyakarta
Syarif Hidayat Universitas Islam Indonesia, Yogyakarta

Keywords:

Indonesian, CNN, Machine Learning, Twitter, Hate Speech

Abstract

Communication between internet users in online media has increased significantly, driven by the growth in social media users. Twitter users, for instance, exchange messages through tweets. Tweets can also carry negative meanings, however, and deserve special attention because they may contain hate speech. The government has even found it necessary to issue regulations to deal with hate speech cases, such as Article 28 paragraph 2, on hate speech, of the Information and Electronic Transactions Law (ITE Law) issued in 2018. Machine Learning (ML) is one technique for identifying patterns, and it can be applied to many types of data, including text (known as text analytics). Previous research used the Support Vector Machine (SVM) method to identify hate speech in Twitter text carrying more than one label (multilabel). The purpose of this study was to identify multilabel hate speech on Twitter using a Convolutional Neural Network (CNN). The best CNN model obtained in the study reached an accuracy of 98.76% on a multilabel dataset of hate speech in Indonesian text.
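The abstract does not state how accuracy is computed in the multilabel setting, so the following is only a minimal sketch of two common definitions, not the authors' evaluation code. It assumes a model that emits one logit per label and that each label is decided independently with a sigmoid and a 0.5 threshold; the logits, the three-label layout, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(logits, threshold=0.5):
    """Turn one row of per-label logits into independent 0/1 decisions."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]

def per_label_accuracy(y_true, y_pred):
    """Fraction of individual label decisions that are correct."""
    total = sum(len(row) for row in y_true)
    correct = sum(
        t == p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return correct / total

def exact_match_accuracy(y_true, y_pred):
    """Fraction of tweets whose whole label vector is predicted correctly."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

# Hypothetical logits for 3 tweets and 3 labels (illustrative values only).
logits = [[2.0, -1.0, 0.5], [-3.0, -2.0, -0.1], [1.2, 0.3, -4.0]]
y_true = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
y_pred = [predict_labels(row) for row in logits]

print(per_label_accuracy(y_true, y_pred))
print(exact_match_accuracy(y_true, y_pred))
```

Per-label accuracy is never lower than exact-match accuracy on the same predictions, so a reported multilabel accuracy is easier to interpret once the definition is known.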
