AI Explanation related Covid Hoax Detection Using Support Vector Machine and Logistics Regression Methods

Authors

  • Naufal Haritsah Luthfi Telkom University, Bandung
  • Agus Hartoyo Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v7i1.5386

Keywords:

Detection, Explainable-AI, Hoax, Logistic Regression, Support Vector Machine, Tf-Idf, Word

Abstract

Hoax news about Covid-19 continues to circulate, especially on social media, and this disinformation can divide communities. Existing technology can classify news as hoax or non-hoax, but no system shows why a model assigns a given item to one class or the other. This study therefore develops a system that reveals which words drive the decisions of a detector that classifies hoax and non-hoax news with the Support Vector Machine (SVM) and Logistic Regression methods. The Explainable AI method used is Local Interpretable Model-agnostic Explanations (LIME). Test results show that the SVM and Logistic Regression methods reach highest accuracies of 91% and 95%, respectively. The words collected in the dataset are sufficient to differentiate hoax from non-hoax news. Hoax news about Covid-19 contains many words related to Covid-19, religion, politics, and medicine, as well as words unrelated to Covid-19, among them "lockdown", "masjid", "rezim", "ventilator", and "kiamat". Non-hoax news about Covid-19 contains many words related to Covid-19, government, and medicine, among them "protokol", "isolasi", "infeksi", "menteri", and "nakes".
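
The pipeline described in the abstract (Tf-Idf features, a linear classifier, and a word-level explanation) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the six toy documents and their labels are invented from the example words quoted above and are not the study's dataset, the function names are hypothetical, and the explanation step is a simple leave-one-word-out perturbation rather than LIME itself, which fits a weighted local surrogate model over many random perturbations.

```python
import math
from collections import Counter

# Toy, made-up corpus (NOT the study's dataset); label 1 = hoax, 0 = non-hoax.
DOCS = [
    ("lockdown masjid rezim kiamat", 1),
    ("ventilator rezim kiamat lockdown", 1),
    ("masjid kiamat ventilator", 1),
    ("protokol isolasi infeksi menteri", 0),
    ("nakes protokol isolasi", 0),
    ("menteri nakes infeksi protokol", 0),
]

def fit_idf(texts):
    """Smoothed inverse document frequency for every word in the corpus."""
    n = len(texts)
    df = Counter(w for t in texts for w in set(t.split()))
    return {w: math.log((1 + n) / (1 + c)) + 1 for w, c in df.items()}

def tfidf(text, idf):
    """L2-normalised Tf-Idf vector as a dict; unknown words are ignored."""
    tf = Counter(w for w in text.split() if w in idf)
    vec = {w: c * idf[w] for w, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {w: v / norm for w, v in vec.items()}

def predict_proba(text, idf, weights, bias):
    """Probability that the text is a hoax under the logistic model."""
    z = bias + sum(weights[w] * v for w, v in tfidf(text, idf).items())
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(docs, idf, epochs=200, lr=0.5):
    """Logistic regression fitted with plain stochastic gradient descent."""
    weights = {w: 0.0 for w in idf}
    bias = 0.0
    for _ in range(epochs):
        for text, label in docs:
            err = predict_proba(text, idf, weights, bias) - label
            for w, v in tfidf(text, idf).items():
                weights[w] -= lr * err * v
            bias -= lr * err
    return weights, bias

def explain(text, idf, weights, bias):
    """Leave-one-word-out scores: the drop in hoax probability when a word
    is removed. Positive scores push toward hoax, negative toward non-hoax."""
    base = predict_proba(text, idf, weights, bias)
    words = text.split()
    scores = {}
    for w in set(words):
        reduced = " ".join(x for x in words if x != w)
        scores[w] = base - predict_proba(reduced, idf, weights, bias)
    return sorted(scores.items(), key=lambda kv: -abs(kv[1]))

idf = fit_idf([text for text, _ in DOCS])
weights, bias = train_logreg(DOCS, idf)
sample = "lockdown masjid protokol"
print("P(hoax) =", round(predict_proba(sample, idf, weights, bias), 3))
for word, score in explain(sample, idf, weights, bias):
    print(word, round(score, 3))
```

On real data, a library classifier and the LIME text explainer would replace the hand-written training and perturbation steps; the sketch only shows how per-word scores can be read off a bag-of-words model.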

Published

2023-01-28