A Comparative Study of Naïve Bayes, SVM, and Logistic Regression Sentiment Analysis Methods on the 2022 World Cup
DOI:
https://doi.org/10.30865/mib.v7i2.5383
Keywords:
WorldCup2022, Bernoulli Naïve Bayes, Support Vector Classifier, Logistic Regression, Sentiment Analysis
Abstract
The World Cup is the most popular sporting event in the world. The 2022 World Cup was held in the Middle East for the first time, in Qatar. The event was marked by controversies over human rights, LGBT+ issues, alcoholic beverages, and other matters that were widely covered in the mainstream media. A range of sentiments and opinions about the tournament emerged on social media, some positive and some negative. This study carries out sentiment analysis to identify the prevailing public opinion on the 2022 World Cup; the results can serve as input and consideration for policy makers. Tweets related to the 2022 World Cup, posted on Twitter on the first day of the tournament, were collected with the snscrape library for Python. The collected data then passed through pre-processing, data splitting, and TF-IDF weighting before being used for modeling. Three methods were compared: Bernoulli Naïve Bayes, Support Vector Machine (Support Vector Classifier), and Logistic Regression. The evaluation shows that Bernoulli Naïve Bayes achieved 71% precision, 99% recall, and 76% accuracy. The Support Vector Classifier achieved 94% precision, 93% recall, and 92% accuracy. Logistic Regression achieved 93% precision, 93% recall, and 92% accuracy.
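The three evaluation parameters reported in the abstract (precision, recall, and accuracy) can be illustrated with a minimal sketch. The function `evaluate`, the `Metrics` container, and the toy labels below are hypothetical and are not taken from the study; they only show how a classifier that labels almost everything as positive can reach very high recall with noticeably lower precision, the pattern reported for Bernoulli Naïve Bayes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    precision: float
    recall: float
    accuracy: float


def evaluate(y_true, y_pred, positive="positive"):
    """Compute precision, recall, and accuracy for a binary sentiment task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # correctly predicted positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # correctly predicted negative

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return Metrics(precision, recall, accuracy)


# Toy labels (assumed for illustration only, not data from the paper).
y_true = ["positive"] * 4 + ["negative"] * 4
y_pred = ["positive"] * 4 + ["positive", "positive", "negative", "negative"]

m = evaluate(y_true, y_pred)
print(f"precision={m.precision:.2f} recall={m.recall:.2f} accuracy={m.accuracy:.2f}")
```

In this toy case every true positive is found (recall 1.00), but two negative tweets are also labeled positive, which lowers precision to about 0.67 and accuracy to 0.75.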
License

This work is licensed under a Creative Commons Attribution 4.0 International License.