Prediction of Indonesian Presidential Election Results using Sentiment Analysis with Naïve Bayes Method

 Asno Azzawagama Firdaus (Universitas Ahmad Dahlan, Yogyakarta, Indonesia)
 (*) Anton Yudhana (Universitas Ahmad Dahlan, Yogyakarta, Indonesia)
 Imam Riadi (Universitas Ahmad Dahlan, Yogyakarta, Indonesia)
 Mahsun Mahsun (Universitas Mataram, Mataram, Indonesia)

(*) Corresponding Author

Submitted: November 19, 2023; Published: January 9, 2024


Social media gives politicians a campaign tool that costs less than conventional campaigning. The 2024 Indonesian Presidential Election has drawn public attention, especially among social media users, and Twitter, one of the most widely used platforms in Indonesia, functions as an effective campaign forum. The problem is how to collect social media data about the presidential race automatically and draw conclusions from it, since doing so manually is impractical. Sentiment analysis is one approach to this problem. Data was collected shortly after the registration of presidential and vice-presidential candidates in November 2023; at the time of collection, candidate numbers had not yet been assigned by the election organizers. This study aims to obtain sentiment results from this recent data, to identify the best model using the Naïve Bayes method, and to analyze how sentiment can be used to predict the presidential election results. The collected data amounted to 11,569 records, labeled using the Valence Aware Dictionary and sEntiment Reasoner (VADER) library. After duplicate tweets were removed, the data was reduced to 4,893 records, with 1,631 records for each candidate pair. The sentiment classification model was built with the Naïve Bayes method using Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction. The highest percentage of positive sentiment was found in the Ganjar Pranowo - Mahfud MD data at 69.16%, and the highest percentage of negative sentiment was found in the Prabowo Subianto - Gibran Rakabuming Raka data at 52.12%. Common words in positive sentiment for Ganjar Pranowo - Mahfud MD include "strong," "corruption," "support," and "reward," among others. Frequently appearing words in negative sentiment for Prabowo Subianto - Gibran Rakabuming Raka include "child," "eldest," "mk," and "young," among others. The Naïve Bayes method achieved an average accuracy of 76.67% on the entire dataset, which suggests it is reasonably reliable for similar cases.
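The classification step described in the abstract (TF-IDF features fed to a Naïve Bayes classifier) can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the class name `TfidfNaiveBayes`, the whitespace tokenizer, the smoothed IDF formula, the Laplace smoothing value `alpha`, and the four toy training texts are all assumptions made for illustration, and the real study used VADER-labeled tweets rather than hand-labeled sentences.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n_docs) / (1 + count)) + 1.0
            for term, count in df.items()}


def tfidf(doc, idf):
    """Map a document to {term: tf * idf}, ignoring terms not seen during fitting."""
    counts = Counter(t for t in tokenize(doc) if t in idf)
    return {term: tf * idf[term] for term, tf in counts.items()}


class TfidfNaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights (hypothetical helper class)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.idf = fit_idf(docs)
        self.classes = sorted(set(labels))
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(class_counts[c] / len(labels))
                          for c in self.classes}
        # Accumulate the TF-IDF weight of every term per class.
        weight = {c: defaultdict(float) for c in self.classes}
        for doc, label in zip(docs, labels):
            for term, value in tfidf(doc, self.idf).items():
                weight[label][term] += value
        vocab_size = len(self.idf)
        self.log_likelihood = {}
        for c in self.classes:
            total = sum(weight[c].values()) + self.alpha * vocab_size
            self.log_likelihood[c] = {
                term: math.log((weight[c][term] + self.alpha) / total)
                for term in self.idf
            }
        return self

    def predict(self, doc):
        """Return the class with the highest log posterior score."""
        features = tfidf(doc, self.idf)
        scores = {
            c: self.log_prior[c]
            + sum(v * self.log_likelihood[c][t] for t, v in features.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Toy, hand-labeled texts (hypothetical; not taken from the study's dataset).
train_docs = [
    "strong support for the candidate",
    "great program strong leader",
    "bad decision weak policy",
    "weak and disappointing debate",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = TfidfNaiveBayes().fit(train_docs, train_labels)
print(model.predict("strong support"))
print(model.predict("weak policy"))
```

On this toy data the two example inputs are classified as positive and negative respectively. A study at the scale described here would more likely rely on an established library for both the TF-IDF and the Naïve Bayes steps; the hand-written version only makes the arithmetic visible.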







Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

STMIK Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
