Comparison of Naïve Bayes and Support Vector Machine Methods for Sentiment Analysis of the Astrazeneca Vaccine on Twitter

Authors

  • Eva Rahma Indriyani Institut Teknologi Telkom Purwokerto, Banyumas http://orcid.org/0000-0002-0361-3721
  • Paradise Paradise Institut Teknologi Telkom Purwokerto, Banyumas
  • Merlinda Wibowo Institut Teknologi Telkom Purwokerto, Banyumas

DOI:

https://doi.org/10.30865/mib.v6i3.4220

Keywords:

Sentiment Analysis, Astrazeneca Vaccine, Support Vector Machine, Naïve Bayes, Twitter

Abstract

The rollout of Covid-19 vaccination in Indonesia has drawn both supportive and opposing opinions from the public. Disinformation and misinformation about vaccines spread through social media content affect how people absorb information, which in turn leads to vaccination delays. In fact, vaccination is one of the largest and most effective contributions to controlling the Covid-19 pandemic. Astrazeneca is one of the vaccines provided by the Indonesian government. It was controversial among the public over its halal status and its safety because of concerns that it contained swine trypsin. Twitter has become a place where users express their concerns and opinions about the Covid-19 vaccine. Data obtained from Twitter becomes useful once it is analyzed, and sentiment analysis is one such approach. In this study, data were collected with the snscrape library, yielding 3105 tweets from May through June 2021. The collected dataset was then preprocessed to clean it for analysis. After preprocessing, each tweet was assigned a class label using a lexicon-based dictionary, which resulted in 1275 tweets labeled as positive opinion and 1830 tweets labeled as negative opinion. The aim of this study is to examine the performance of Naïve Bayes and Support Vector Machine combined with the TF-IDF (Term Frequency-Inverse Document Frequency) weighting method. The evaluation results show that the Support Vector Machine achieves higher accuracy, precision, recall, and F1-score (87.27%, 90.41%, 77.34%, and 83.37%) than Naïve Bayes (76.81%, 72.40%, 70.70%, and 71.52%).
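The four evaluation metrics reported above are related in a simple way, and the F1-score in particular can be checked from the stated precision and recall. The following is a minimal sketch, not the authors' evaluation code: the function names and the confusion-matrix counts are hypothetical and chosen only for illustration, while the percentages passed to `f1_score` are the ones given in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Both inputs must use the same scale (fractions or percentages).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall, and F1 for the positive class,
    computed from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Hypothetical counts, NOT taken from the paper.
example = metrics_from_counts(tp=8, fp=2, fn=2, tn=8)
print(example)

# Consistency check using the precision and recall stated in the abstract.
svm_f1 = f1_score(90.41, 77.34)  # Support Vector Machine
nb_f1 = f1_score(72.40, 70.70)   # Naïve Bayes
print(round(svm_f1, 2), round(nb_f1, 2))
```

Under these assumptions, the harmonic mean of the reported Support Vector Machine precision and recall rounds to the reported 83.37%. For Naïve Bayes it comes out within a few hundredths of a percentage point of the reported 71.52%; a small gap like this may come from rounding of the published figures or from how the scores were averaged, which the abstract does not specify.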

Published

2022-07-25

Section

Articles