Sentiment Analysis of the Covid-19 Vaccine Issue in Indonesia Using the Naive Bayes Classifier Method

Authors

  • Fitria Septianingrum, Universitas Singaperbangsa Karawang, Karawang
  • Jajam Haerul Jaman, Universitas Singaperbangsa Karawang, Karawang
  • Ultach Enri, Universitas Singaperbangsa Karawang, Karawang

DOI:

https://doi.org/10.30865/mib.v5i4.3260

Keywords:

Sentiment Analysis, Information Gain, Classification, Naive Bayes Classifier, Covid-19 Vaccine

Abstract

The Covid-19 pandemic in Indonesia, and around the world, has not yet ended. The Indonesian government has made various efforts to limit the spread of the virus, including a lockdown, Large-Scale Social Restrictions (PSBB), and a ban on homecoming travel during the Eid al-Fitr holiday. One of its newer policies is the vaccination program, which has been rolled out to the Indonesian public since early 2021 and aims to increase antibodies so that people are protected from the Covid-19 virus. Sentiment analysis can be used to find out the opinions, comments, and feedback the public has given on this policy. The process begins with data collection: tweets are crawled from the Twitter social media platform. The data is then selected and pre-processed so that it is clean and ready for classification. Next, the data is labeled through sentiment weighting, using a lexicon dictionary and a list of negative words. The terms are then weighted with TF-IDF, followed by feature selection using Information Gain. Finally, the Naive Bayes Classifier algorithm classifies the data into three classes: positive, negative, and neutral sentiment. The resulting model achieves an accuracy of 78%, a recall of 80%, and an AUC score of 0.904.
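
The pipeline described above (lexicon-based labeling, TF-IDF weighting, Information Gain feature selection, Naive Bayes classification) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy lexicon, the negation list, the English example tweets, the cutoff `k=8`, the smoothing value, and all function names are hypothetical, and Information Gain is computed here on simple term presence or absence.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon and negation words (not taken from the paper).
LEXICON = {"good": 1, "safe": 1, "effective": 1,
           "bad": -1, "afraid": -1, "dangerous": -1}
NEGATIONS = {"not", "no", "never"}

def lexicon_label(tokens):
    """Sum lexicon scores; a negation word flips the sign of the next token only."""
    score, negate = 0, False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        value = LEXICON.get(tok, 0)
        score += -value if negate else value
        negate = False
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def tfidf(docs):
    """Return one {term: tf * idf} dict per document, plus the idf table."""
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in df}
    return [vectorize(d, idf) for d in docs], idf

def vectorize(tokens, idf):
    """Weight a token list with a previously fitted idf table."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """Entropy reduction from splitting the documents on presence of `term`."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    cond = sum(len(p) / len(labels) * entropy(p) for p in (present, absent) if p)
    return entropy(labels) - cond

def select_features(docs, labels, k):
    """Keep the k terms with the highest information gain (ties: alphabetical)."""
    vocab = sorted({t for d in docs for t in d})
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return set(ranked[:k])

def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes on TF-IDF weights with additive smoothing."""
    weight = defaultdict(Counter)
    for vec, y in zip(vectors, labels):
        for t, w in vec.items():
            if t in vocab:
                weight[y][t] += w
    model = {}
    for y, n_docs in Counter(labels).items():
        total = sum(weight[y].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(labels))
        log_lik = {t: math.log((weight[y][t] + alpha) / total) for t in vocab}
        model[y] = (log_prior, log_lik)
    return model

def predict(model, vec):
    """Return the class with the highest log posterior for a weighted vector."""
    def score(y):
        log_prior, log_lik = model[y]
        return log_prior + sum(w * log_lik[t] for t, w in vec.items() if t in log_lik)
    return max(model, key=score)

# Hypothetical, already pre-processed example tweets.
tweets = ["vaccine is safe and effective", "vaccine is good",
          "i am afraid of the vaccine", "vaccine is dangerous",
          "vaccine schedule starts today", "vaccine is not safe"]
docs = [t.split() for t in tweets]
labels = [lexicon_label(d) for d in docs]
vectors, idf = tfidf(docs)
vocab = select_features(docs, labels, k=8)
model = train_nb(vectors, labels, vocab)
prediction = predict(model, vectorize("vaccine is good and effective".split(), idf))
print(labels)
print(prediction)
```

In this sketch a term such as "vaccine", which appears in every tweet, has zero information gain and is dropped by the feature selection step, which is the kind of pruning the Information Gain stage is meant to provide.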

Published

2021-10-26