Sentiment Analysis of Movie Reviews Using Word2Vec and the LSTM Deep Learning Method

Authors

  • Widi Widayat, Institut Teknologi Telkom Purwokerto, Purwokerto

DOI:

https://doi.org/10.30865/mib.v5i3.3111

Keywords:

Sentiment, Classification, RNN, LSTM, word2vec

Abstract

The growth in the number of internet users goes hand in hand with the growth of data on the internet that is available for analysis, especially data in text form. The availability of this text data has encouraged a great deal of sentiment analysis research. However, the abundance of text data is also one of the challenges in sentiment analysis research: datasets that consist of long and complex text documents require a different approach. In this study, LSTM was chosen as the sentiment classification method. This research uses a movie review dataset that consists of 25,000 review documents, with an average length of 233 words per review. The research uses the CBOW and Skip-Gram methods of word2vec to form a vector representation of each word (word vector) in the corpus. Several word vector dimensions were used in this research, namely 50, 60, 100, 150, 200, and 500; this parameter was tuned to determine its effect on the resulting accuracy. The best accuracy, around 88.17%, was obtained with 100-dimensional word vectors, and the lowest accuracy, 85.86%, with 500-dimensional word vectors.
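The difference between the two word2vec methods named in the abstract can be illustrated with a minimal sketch of how each one frames its training examples. This is not the study's code: the window size of 2, the sample sentence, and the function names are assumptions made for illustration only. CBOW predicts a centre word from its surrounding context, while Skip-Gram predicts each context word from the centre word.

```python
from typing import List, Tuple


def context_window(tokens: List[str], i: int, window: int) -> List[str]:
    """Return the words within `window` positions of tokens[i], excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """CBOW framing: (context words, centre word to predict)."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_examples(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Skip-Gram framing: (centre word, one context word to predict)."""
    return [(tokens[i], ctx)
            for i in range(len(tokens))
            for ctx in context_window(tokens, i, window)]


# Hypothetical sample sentence, not taken from the dataset.
review = "this movie was surprisingly good".split()
cbow = cbow_examples(review)
skipgram = skipgram_examples(review)

print(cbow[2])       # context and target for the centre word "was"
print(skipgram[:2])  # first (centre, context) pairs for the word "this"
```

In a full pipeline, a library such as gensim would learn the word vectors from examples framed this way, and the vector dimension (50 to 500 in the abstract) would be a separate setting of that training step.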

Published

2021-07-31

Section

Articles