Sentiment Analysis on Indonesian Movie Review Using KNN Method With the Implementation of Chi-Square Feature Selection

DOI:

https://doi.org/10.30865/mib.v7i1.5522

Keywords:

Movie Review, Sentiment Analysis, Chi-Square, KNN

Abstract

The internet now supports many sectors, one of which is the film industry. People can easily access movies from a wide range of sites, and this convenience has made a large number of movie reviews readily available. These reviews have a strong influence on the movies they discuss. Because people can express themselves freely on the internet, the reviews of any given movie vary widely. For this reason, it is necessary to analyze whether the sentiment of a movie review is positive or negative. In this research, a sentiment analysis model is built using chi-square feature selection with the K-Nearest Neighbor (KNN) algorithm. The best classification model is obtained when stemming is applied, with k = 267 in SelectKBest at the chi-square feature selection stage and K = 11 as the KNN parameter. This model achieves an F1 score of 86.98%.
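The two stages named in the abstract, chi-square feature selection followed by KNN classification, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the six toy reviews are invented, terms are scored by presence or absence in a 2x2 contingency table, neighbors are ranked by cosine similarity, and the helper names (`chi2_score`, `select_k_best`, `knn_predict`) are hypothetical. It uses only the Python standard library, and it uses small values in place of the paper's k = 267 and K = 11 because the toy vocabulary is tiny.

```python
from collections import Counter
import math

# Invented toy reviews (1 = positive, 0 = negative); NOT the paper's dataset.
DOCS = [
    ("film ini bagus sekali", 1),
    ("akting bagus cerita menarik", 1),
    ("cerita menarik dan seru", 1),
    ("film ini buruk sekali", 0),
    ("akting buruk cerita membosankan", 0),
    ("cerita membosankan dan lambat", 0),
]


def chi2_score(term, docs):
    """Chi-square statistic of the 2x2 table: term present/absent vs. label."""
    a = sum(1 for text, y in docs if term in text.split() and y == 1)
    b = sum(1 for text, y in docs if term in text.split() and y == 0)
    c = sum(1 for text, y in docs if term not in text.split() and y == 1)
    d = sum(1 for text, y in docs if term not in text.split() and y == 0)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def select_k_best(docs, k):
    """Keep the k terms with the highest chi-square score (ties: alphabetical)."""
    vocab = sorted({term for text, _ in docs for term in text.split()})
    ranked = sorted(vocab, key=lambda t: (-chi2_score(t, docs), t))
    return ranked[:k]


def vectorize(text, features):
    """Term-count vector restricted to the selected features."""
    counts = Counter(text.split())
    return [counts[f] for f in features]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_predict(text, docs, features, n_neighbors):
    """Majority vote among the n_neighbors most similar training reviews."""
    query = vectorize(text, features)
    scored = [(cosine(query, vectorize(t, features)), y) for t, y in docs]
    scored.sort(key=lambda pair: -pair[0])
    votes = Counter(y for _, y in scored[:n_neighbors])
    return votes.most_common(1)[0][0]


features = select_k_best(DOCS, k=4)  # the paper reports k = 267 on its own data
label = knn_predict("film bagus dan menarik", DOCS, features, n_neighbors=3)
print(features, label)
```

Because the abstract mentions SelectKBest, the original experiments were presumably run with scikit-learn, where `SelectKBest` with the `chi2` score function and `KNeighborsClassifier` play the roles of the hand-written helpers above. Note that scikit-learn's `chi2` scores feature values summed per class rather than a presence/absence table, so its scores would differ from this sketch.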

Published

2023-01-28