Hoax Detection of Indonesian News Media on Twitter Using IndoBERT with Word Embedding Word2Vec | Bhagaskara S M | JURNAL MEDIA INFORMATIKA BUDIDARMA

Hoax Detection of Indonesian News Media on Twitter Using IndoBERT with Word Embedding Word2Vec

Pernanda Arya Bhagaskara S M, Sri Suryani Prasetiyowati, Yuliant Sibaroni

Abstract


A hoax is information that has been added to or subtracted from the news of an actual event. In the digital age, hoaxes spread increasingly widely and people are quickly affected by them, especially hoaxes about Indonesian news media that circulate on social media. Disseminating information that has not been confirmed as accurate can cause public concern and anxiety. Social media has become a key communication channel through which people start thinking about, discussing, and acting on social issues. This research therefore combines the IndoBERT model with Word2Vec feature expansion to classify hoax news in Indonesian news media. The model was built using K-Fold cross-validation to improve its performance across large data sets. The data consists of tweets shared on Twitter by the Indonesian public. The experiments show that combining Word2Vec with IndoBERT is effective at detecting hoaxes: the classification results give an overall accuracy of 88% on the entire dataset, and the best score for an individual iteration is almost 99%. The study's objective is to identify hoax news in Indonesian news media disseminated via social media, which should encourage people to be more cautious when reading and sharing news, since untrue information can significantly affect individuals.
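The abstract names K-Fold cross-validation and reports both an overall score and a best per-iteration score. The sketch below illustrates how those two kinds of figures are commonly computed. It is not the authors' pipeline: the function names (`k_fold_indices`, `cross_validate`, `fit_keyword_baseline`), the contiguous unshuffled split, and the keyword classifier standing in for IndoBERT with Word2Vec are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[str], int]
Fitter = Callable[[List[str], List[int]], Predictor]


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns one (train_indices, test_indices) pair per fold.
    A real experiment would usually shuffle or stratify first.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    splits = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # first `extra` folds get one more item
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        splits.append((train, test))
        start += size
    return splits


def cross_validate(
    texts: Sequence[str], labels: Sequence[int], fit: Fitter, k: int = 5
) -> Tuple[List[float], float]:
    """Return (accuracy per fold, accuracy pooled over all held-out items)."""
    fold_accuracy = []
    correct_total = 0
    for train, test in k_fold_indices(len(texts), k):
        predict = fit([texts[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(texts[i]) == labels[i] for i in test)
        fold_accuracy.append(correct / len(test))
        correct_total += correct
    return fold_accuracy, correct_total / len(texts)


def fit_keyword_baseline(train_texts: List[str], train_labels: List[int]) -> Predictor:
    """Placeholder model (hypothetical): flag a text as hoax (1) if it contains
    a word that appeared only in hoax training texts."""
    hoax_words, real_words = set(), set()
    for text, label in zip(train_texts, train_labels):
        (hoax_words if label == 1 else real_words).update(text.lower().split())
    hoax_only = hoax_words - real_words
    return lambda text: int(any(word in hoax_only for word in text.lower().split()))
```

Calling `cross_validate(texts, labels, fit_keyword_baseline, k=5)` on a labelled list of tweets returns the per-fold accuracies and the pooled accuracy. Because each fold is scored on a different held-out subset, the best single fold can score well above the pooled figure.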

Keywords


Indonesian News Media; Hoax; IndoBERT; Word2Vec; Social Media




DOI: https://doi.org/10.30865/mib.v7i3.6367

Copyright (c) 2023 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
Universitas Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
Email: mib.stmikbd@gmail.com