Sentiment Analysis using Random Forest and Word2Vec for Indonesian Language Movie Reviews
DOI: https://doi.org/10.30865/mib.v7i3.6299
Keywords: Sentiment Analysis, Random Forest, Word2Vec, Movie Review
Abstract
In recent years, the film industry has become one of the industries that attracts the most public interest. The convenience of watching movies through streaming services is one reason for this popularity. This ease of access has produced a large selection of available movies and encouraged the public to look for movie reviews to find out whether a movie is good or bad. Freedom of expression on the internet has led to a large number of movie reviews being published. Therefore, sentiment analysis was conducted to determine whether these reviews are positive or negative. The method used in this research is Random Forest for classification, with Word2Vec Skip-Gram for feature extraction. Random Forest was chosen because it is a highly flexible and highly accurate method, while Word2Vec Skip-Gram was chosen because it is an efficient model for learning a large number of word vectors from unstructured text. The best model obtained from this experiment was built with stemming, Word2Vec with 300 dimensions, and a max_depth value of 23, achieving an F1-score of 83.59%.
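The two ends of the pipeline described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. It assumes that a review is turned into a single feature vector by averaging the Word2Vec vectors of its tokens (the abstract does not state the aggregation method), and that the F1-score is computed for the positive class. The names `document_vector`, `f1_score`, and `toy_embeddings` are hypothetical, and the toy vectors have 2 dimensions rather than the 300 used in the paper. Training the Word2Vec model and the Random Forest classifier would normally be delegated to a library and is not shown.

```python
from typing import Dict, List, Sequence


def document_vector(tokens: Sequence[str],
                    embeddings: Dict[str, List[float]],
                    dim: int) -> List[float]:
    """Average the vectors of known tokens (assumed aggregation step).

    Tokens missing from the vocabulary are skipped; if no token is known,
    a zero vector of length `dim` is returned.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int],
             positive: int = 1) -> float:
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 2-dimensional embeddings for illustration only.
toy_embeddings = {
    "bagus": [1.0, 0.0],  # "good"
    "film": [0.0, 1.0],   # "movie"
}

# "sekali" is not in the toy vocabulary, so it is skipped.
features = document_vector(["film", "bagus", "sekali"], toy_embeddings, dim=2)

# Toy labels: 1 = positive review, 0 = negative review.
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

In a full pipeline, vectors such as `features` would be the input rows for the classifier, and a metric such as `score` would be computed on held-out reviews to compare settings like the embedding dimension and `max_depth`.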
License

This work is licensed under a Creative Commons Attribution 4.0 International License