A Comparative Study of the Naïve Bayes, SVM, and Logistic Regression Sentiment Analysis Methods on the 2022 World Cup

Authors

  • Muhamad Zaki Anbari Universitas Islam Negeri Sunan Kalijaga, Yogyakarta
  • Bambang Sugiantoro Universitas Islam Negeri Sunan Kalijaga, Yogyakarta

DOI:

https://doi.org/10.30865/mib.v7i2.5383

Keywords:

WorldCup2022, Bernoulli Naïve Bayes, Support Vector Classifier, Logistic Regression, Sentiment Analysis

Abstract

The World Cup is the most popular sporting event in the world. The 2022 World Cup was the first to be held in the Middle East, in Qatar. The tournament was marked by controversies, including human rights issues, LGBT+ issues, and issues surrounding alcoholic beverages, all of which were widely covered in the mainstream media. A range of sentiments and opinions about the tournament emerged on social media, some positive and some negative. This study carries out sentiment analysis to identify the prevailing public opinion on the 2022 World Cup; the results can serve as input and consideration for policy makers. The study uses the snscrape library for Python to collect tweets about the 2022 World Cup from Twitter on the first day of the tournament. The collected data then passes through pre-processing, data splitting, and TF-IDF weighting before it is used for modeling. Three methods are compared: Bernoulli Naïve Bayes, Support Vector Machine (Support Vector Classifier), and Logistic Regression. The evaluation shows that Bernoulli Naïve Bayes achieves a precision of 71%, a recall of 99%, and an accuracy of 76%. The Support Vector Classifier achieves a precision of 94%, a recall of 93%, and an accuracy of 92%. Logistic Regression achieves a precision of 93%, a recall of 93%, and an accuracy of 92%.
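The three evaluation measures reported in the abstract can be illustrated with a minimal sketch. It assumes a binary task in which "positive" sentiment is the positive class; the class and function names (`ConfusionCounts`, `confusion_counts`, and so on) and the eight example labels are hypothetical and are not taken from the study's code or data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper)."""
    tp: int  # positive tweets predicted as positive
    fp: int  # negative tweets predicted as positive
    fn: int  # positive tweets predicted as negative
    tn: int  # negative tweets predicted as negative


def confusion_counts(y_true: Sequence[str], y_pred: Sequence[str],
                     positive: str = "positive") -> ConfusionCounts:
    """Count true/false positives and negatives for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def precision(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive."""
    predicted_positive = c.tp + c.fp
    return c.tp / predicted_positive if predicted_positive else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of truly positive items that were predicted positive."""
    actual_positive = c.tp + c.fn
    return c.tp / actual_positive if actual_positive else 0.0


def accuracy(c: ConfusionCounts) -> float:
    """Share of all items that were labeled correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Illustrative labels only: a classifier that over-predicts "positive".
y_true = ["positive"] * 4 + ["negative"] * 4
y_pred = ["positive"] * 6 + ["negative"] * 2

counts = confusion_counts(y_true, y_pred)
print(counts)
print(f"precision={precision(counts):.2f}",
      f"recall={recall(counts):.2f}",
      f"accuracy={accuracy(counts):.2f}")
```

In this toy case every positive tweet is found (recall 1.0), but two negative tweets are also labeled positive, so precision drops to 4/6 and accuracy to 6/8. A classifier that labels most tweets as positive shows this same pattern of high recall with lower precision and accuracy, which is the shape of the Bernoulli Naïve Bayes figures above.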

Author Biographies

Muhamad Zaki Anbari, Universitas Islam Negeri Sunan Kalijaga, Yogyakarta

Master of Informatics

Bambang Sugiantoro, Universitas Islam Negeri Sunan Kalijaga, Yogyakarta

Master of Informatics

Published

2023-04-27

Section

Articles