Hoax Detection Tweets of the COVID-19 on Twitter Using LSTM-CNN with Word2Vec

Prisla Novia Anggreyani; Warih Maharani

doi:10.30865/mib.v6i4.4564

Authors

Prisla Novia Anggreyani, Telkom University, Bandung (ORCID: http://orcid.org/0000-0002-8370-0365)
Warih Maharani, Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v6i4.4564

Keywords:

Hoax, Twitter, LSTM, CNN, Word2Vec

Abstract

The number of Twitter users grows every year, and with it the spread of hoaxes across social media platforms. During the COVID-19 pandemic the volume of hoaxes has risen further, because it is now very easy for people to interact, share opinions, and exchange information. Hoaxes about the COVID-19 virus are among the most common. A method for detecting hoaxes is therefore needed, especially for COVID-19 topics in Indonesia. This study uses LSTM-CNN with Word2Vec for hoax detection. The dataset consists of more than 1,000 tweets, divided into hoax and non-hoax categories. Word2Vec converts the tweet text into vectors for classification, and the LSTM-CNN model classifies them. The results show that the LSTM-CNN model with Word2Vec achieves 79.71% accuracy, surpassing the standalone LSTM and CNN models.
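As a rough illustration of the vectorization step mentioned in the abstract, the Python sketch below shows how a tweet could be turned into a fixed-size matrix of word vectors and how accuracy is computed from predictions. It is a minimal sketch under stated assumptions: toy two-dimensional vectors stand in for a trained Word2Vec model, and the function names `tokenize`, `embed`, and `accuracy`, the sample tweet, and the labels are hypothetical rather than taken from the study. The LSTM-CNN classifier itself is not shown.

```python
# Illustrative sketch only: toy vectors stand in for a trained Word2Vec model,
# and all names and values here are hypothetical, not taken from the paper.

def tokenize(tweet):
    """Lowercase a tweet and split it on whitespace."""
    return tweet.lower().split()

def embed(tokens, vectors, dim, max_len):
    """Map tokens to vectors, then truncate or zero-pad to max_len rows."""
    zero = [0.0] * dim
    # Unknown words fall back to the zero vector (an assumption of this sketch).
    rows = [vectors.get(tok, zero) for tok in tokens[:max_len]]
    rows += [zero] * (max_len - len(rows))
    return rows

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical 2-dimensional embeddings; a real Word2Vec model learns these.
toy_vectors = {"vaksin": [0.1, 0.9], "covid": [0.8, 0.2]}

# The resulting max_len x dim matrix is the kind of input a sequence
# classifier such as an LSTM-CNN would consume.
matrix = embed(tokenize("Vaksin covid berbahaya"), toy_vectors, dim=2, max_len=4)

# Made-up labels (1 = hoax, 0 = non-hoax): 3 of 4 predictions match.
score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```

Padding every tweet to the same length keeps the input shape constant, which is what batch-trained neural classifiers generally expect.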

