Improved Text Classification for Indonesian Hate Speech Detection: FastText-LSTM Model with Easy Data Augmentation
DOI: https://doi.org/10.30865/json.v7i3.9637
Keywords: Bayesian Optimization, Easy Data Augmentation, FastText, Hate Speech Detection, Long Short-Term Memory
Abstract
The swift expansion of social media in Indonesia has led to a significant rise in hate speech, highlighting the urgent need for effective automated detection techniques. This research evaluates the performance of the proposed FastText-Long Short-Term Memory with Easy Data Augmentation (FastText-LSTM-WE) compared with the baseline model, FastText-Convolutional Neural Network with Easy Data Augmentation (FastText-CNN-WE). To further investigate the impact of data augmentation, the effectiveness of both FastText-Long Short-Term Memory without Easy Data Augmentation (FastText-LSTM-WO) and FastText-Convolutional Neural Network without Easy Data Augmentation (FastText-CNN-WO) was also assessed. Bayesian Optimization was employed to identify the best hyperparameter configurations for each model. The experiments were carried out on a dataset comprising 14,306 samples while maintaining consistent experimental conditions. Model performance was measured using precision, recall, F1-score, and accuracy derived from the confusion matrix. The results indicate that FastText-LSTM-WE achieved the highest performance, with precision, recall, F1-score, and accuracy of 84.02%, 83.16%, 83.59%, and 81.37%, respectively. These findings demonstrate that the proposed model provides a robust and reliable solution for detecting hate speech within the Indonesian context, thereby improving automated content moderation systems in practical applications.
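As a minimal sketch of how the four reported metrics follow from a confusion matrix, the Python below computes precision, recall, F1-score, and accuracy for a binary task. The class name, function names, and the example counts are hypothetical illustrations, not taken from the paper, and the paper's own averaging scheme (for example, per-class or weighted) is not stated in the abstract. The final check only uses the reported precision and recall to confirm that the reported F1-score is their harmonic mean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix counts (hypothetical container, not from the paper)."""
    tp: int  # hate speech correctly flagged
    fp: int  # non-hate speech wrongly flagged
    fn: int  # hate speech missed
    tn: int  # non-hate speech correctly passed


def precision(cm: ConfusionMatrix) -> float:
    flagged = cm.tp + cm.fp
    return cm.tp / flagged if flagged else 0.0


def recall(cm: ConfusionMatrix) -> float:
    actual_positive = cm.tp + cm.fn
    return cm.tp / actual_positive if actual_positive else 0.0


def accuracy(cm: ConfusionMatrix) -> float:
    total = cm.tp + cm.fp + cm.fn + cm.tn
    return (cm.tp + cm.tn) / total if total else 0.0


def f1_from(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts for illustration only; they are NOT the paper's results.
example = ConfusionMatrix(tp=80, fp=20, fn=10, tn=90)
example_f1 = f1_from(precision(example), recall(example))

# Consistency check on the figures reported in the abstract (percentages):
# precision 84.02 and recall 83.16 give an F1-score of 83.59 after rounding.
reported_f1 = round(f1_from(84.02, 83.16), 2)
```

Because F1 is the harmonic mean, it always lies between precision and recall and sits closer to the lower of the two, which is why the reported 83.59% falls between 83.16% and 84.02%.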
License
Copyright (c) 2026 Jurnal Sistem Komputer dan Informatika (JSON)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

This work is licensed under a Creative Commons Attribution 4.0 International License

