Handling Imbalance Dataset on Hoax Indonesian Political News Classification using IndoBERT and Random Sampling
Abstract
Keywords
References
M. A. Rahmat, Indrabayu, and I. S. Areni, “Hoax Web Detection For News in Bahasa Using Support Vector Machine,” 2019 International Conference on Information and Communications Technology (ICOIACT), 2019, doi: 10.1109/ICOIACT46704.2019.8938425.
Hanadian Nurhayati Wolff, “Internet usage in Indonesia - statistics & facts.” Accessed: Nov. 11, 2023. [Online]. Available: https://www.statista.com/topics/2431/internet-usage-in-indonesia/
Simon Kemp, “Digital 2020: Indonesia.” Accessed: Nov. 11, 2023. [Online]. Available: https://datareportal.com/reports/digital-2020-indonesia
P. Utami, “Hoax in Modern Politics: The Meaning of Hoax in Indonesian Politics and Democracy,” Jurnal Ilmu Sosial dan Ilmu Politik, vol. 22, no. 2, p. 85, Jan. 2019, doi: 10.22146/jsp.34614.
J. A. Nasir, O. S. Khan, and I. Varlamis, “Fake news detection: A hybrid CNN-RNN based deep learning approach,” International Journal of Information Management Data Insights, vol. 1, no. 1, Apr. 2021, doi: 10.1016/j.jjimei.2020.100007.
A. Wani, I. Joshi, S. Khandve, V. Wagh, and R. Joshi, “Evaluating Deep Learning Approaches for Covid19 Fake News Detection,” doi: 10.48550/arXiv.2101.04012.
R. K. Kaliyar, A. Goswami, and P. Narang, “FakeBERT: Fake news detection in social media with a BERT-based deep learning approach,” Multimed Tools Appl, vol. 80, no. 8, pp. 11765–11788, Mar. 2021, doi: 10.1007/s11042-020-10183-2.
F. Koto, A. Rahimi, J. H. Lau, and T. Baldwin, “IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP,” Nov. 2020, doi: 10.48550/arXiv.2011.00677.
M. N. Fakhruzzaman, S. Z. Jannah, R. A. Ningrum, and I. Fahmiyah, “Clickbait Headline Detection in Indonesian News Sites using Multilingual Bidirectional Encoder Representations from Transformers (M-BERT),” Feb. 2021, [Online]. Available: http://arxiv.org/abs/2102.01497
D. R. Faisal and R. Mahendra, “Two-Stage Classifier for COVID-19 Misinformation Detection Using BERT: a Study on Indonesian Tweets,” Jun. 2022, doi: 10.48550/arXiv.2102.01497.
Muhammad Ikram Kaer Sinapoy, Yuliant Sibaroni, and Sri Suryani Prasetyowati, “Comparison of LSTM and IndoBERT Method in Identifying Hoax on Twitter,” Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), vol. 7, no. 3, pp. 657–662, Jun. 2023, doi: 10.29207/resti.v7i3.4830.
S. Al-Azani and E. S. M. El-Alfy, “Imbalanced Sentiment Polarity Detection Using Emoji-Based Features and Bagging Ensemble,” in 1st International Conference on Computer Applications and Information Security, ICCAIS 2018, Institute of Electrical and Electronics Engineers Inc., Aug. 2018. doi: 10.1109/CAIS.2018.8441956.
H. A. Najada and X. Zhu, “iSRD: Spam review detection with imbalanced data distributions,” Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), 2014, doi: 10.1109/IRI.2014.7051938.
S. Al-Azani and E. M. El-Alfy, “Imbalanced Sentiment Polarity Detection Using Emoji-Based Features and Bagging Ensemble,” 2018 1st International Conference on Computer Applications & Information Security (ICCAIS), pp. 1–5, 2018, doi: 10.1109/CAIS.2018.8441956.
H. A. Najada and X. Zhu, “iSRD: Spam review detection with imbalanced data distributions,” Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014), 2014.
Fransiscus and A. S. Girsang, “Sentiment Analysis of COVID-19 Public Activity Restriction (PPKM) Impact using BERT Method,” International Journal of Engineering Trends and Technology, vol. 70, no. 12, pp. 281–288, Dec. 2022, doi: 10.14445/22315381/IJETT-V70I12P226.
W. Satriaji and R. Kusumaningrum, “Effect of Synthetic Minority Oversampling Technique (SMOTE), Feature Representation, and Classification Algorithm on Imbalanced Sentiment Analysis,” 2018 2nd International Conference on Informatics and Computational Sciences (ICICoS), Semarang, Indonesia, 2018, doi: 10.1109/ICICOS.2018.8621648.
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” Oct. 2018, doi: 10.18653/v1/N19-1423.
B. Wilie et al., “IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding,” Sep. 2020, doi: 10.48550/arXiv.2009.05387.
L. H. Suadaa, I. Santoso, and A. T. B. Panjaitan, “Transfer Learning of Pre-trained Transformers for Covid-19 Hoax Detection in Indonesian Language,” IJCCS (Indonesian Journal of Computing and Cybernetics Systems), vol. 15, no. 3, p. 317, Jul. 2021, doi: 10.22146/ijccs.66205.
Y. Muliono, F. L. Gaol, B. Soewito, and H. L. H. S. Warnars, “Hoax Classification in Imbalanced Datasets Based on Indonesian News Title using RoBERTa,” in 2022 3rd International Conference on Artificial Intelligence and Data Sciences: Championing Innovations in Artificial Intelligence and Data Sciences for Sustainable Future, AiDAS 2022 - Proceedings, Institute of Electrical and Electronics Engineers Inc., 2022, pp. 264–268. doi: 10.1109/AiDAS56890.2022.9918747.
A. D. Sanya and L. H. Suadaa, “Handling Imbalanced Dataset on Hate Speech Detection in Indonesian Online News Comments,” 2022 10th International Conference on Information and Communication Technology (ICoICT), pp. 380–385, 2022, doi: 10.1109/ICoICT55009.2022.9914883.
W. Obaid and A. Nassif Bou, “The Effects of Resampling on Classifying Imbalanced Datasets,” 2022 Advances in Science and Engineering Technology International Conferences (ASET), 2022, doi: 10.1109/ASET53988.2022.9735021.
DOI: https://doi.org/10.30865/mib.v8i1.7099
Copyright (c) 2024 JURNAL MEDIA INFORMATIKA BUDIDARMA

This work is licensed under a Creative Commons Attribution 4.0 International License.
JURNAL MEDIA INFORMATIKA BUDIDARMA
Universitas Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
Email: mib.stmikbd@gmail.com