Sentiment Analysis on Twitter Social Media towards Climate Change in Indonesia Using the IndoBERT Model

Climate change is a long-term shift in temperature and weather patterns. It has become a frightening prospect for everyone because, whether people are aware of it or not, its harmful effects are already visible. The issue is urgent for all levels of society, and it has become a prominent topic on social media, especially Twitter. Tweets about climate change in Indonesia can therefore be analyzed to reveal public sentiment towards this phenomenon. This research uses IndoBERT, a Transformer-based model developed from the BERT architecture by the IndoNLU team and trained on 74 million words from various Bahasa Indonesia sources. The method was chosen to support sentiment analysis on the topic of climate change so that public sentiment can be mapped. Testing yielded an F1-score of 95.6% with a learning rate of 0.00002 and a batch size of 16. The results of this research are expected to be useful for future work.


INTRODUCTION
Climate change is a long-term change in temperature and weather patterns. The phenomenon initially occurred naturally, but since the 1800s it cannot be separated from human activity, chiefly the use of fossil fuels (oil, coal, and gas), which produces heat-trapping gases [1]. Climate change has been under way for a long time but has only become a widely discussed issue in the past few years. In Indonesia, climate change is still a tertiary issue that politicians rarely raise, yet it is quite popular among young people. This is reflected in the many young citizens who are worried and express this sentiment on social media, including Twitter.
Twitter is a social networking and microblogging service on which users send and read text-based messages called tweets [2]. According to Twitter's third-quarter 2021 financial report [3], the platform had 211 million daily active users. Indonesia's daily active users are among the most active in Southeast Asia. Nayomi Kankanamge [4] describes the use of social media as one of the promising approaches to assessing citizens' knowledge. This study shows how Twitter can be used to analyze sentiment on various topics, one of which is climate change.
Research on Twitter sentiment analysis has been widely published, and much of it uses long-established methods such as the Naive Bayes classifier, Word2Vec, and Support Vector Machines. One previous study using neural networks reported fairly high accuracy, but only under particular scenario settings for data shuffling, learning rate, hidden-layer nodes, and dropout. This suggests that the neural network approach is reliable only when these factors are set appropriately.
Sentiment analysis is the study of people's opinions, sentiments, evaluations, judgments, attitudes, and emotions towards entities such as products, services, organizations, individuals, problems, events, topics, and their attributes [5]. Sentiment analysis is divided into two processes: sentiment extraction and sentiment classification. Sentiment extraction is the process of extracting the aspects that have been evaluated [5]. Sentiment classification is the process of determining whether the opinions about those aspects are positive, negative, or neutral [5].
Study [6] conducted a sentiment analysis of tweets regarding flood disaster management, particularly in West Java. It used a neural network algorithm with Term Frequency-Inverse Document Frequency (TF-IDF) features and tested several scenarios that varied data shuffling, learning rate, hidden-layer nodes, and dropout. The best accuracy, 73.87%, came from the eighth scenario: no data shuffling, a learning rate of 0.001, 128 hidden-layer nodes, and no dropout.
Study [7] conducted a sentiment analysis of a Twitter dataset about post-disaster conditions. It used the naive Bayes classifier with n-gram features, splitting sentences into single terms (unigrams, n=1) and bigrams (n=2). Across several tests with different ratios of training to test data, the highest accuracy was 93.33% for unigrams and 86.67% for bigrams.
Methods for sentiment analysis have developed rapidly over time, and IndoBERT is one of them. IndoBERT is an extension of BERT, a language model developed by Google researchers. It is an Indonesian pre-trained model built from more than 220 million words drawn from three main sources: the Indonesian Wikipedia (74 million words), news articles from Kompas, Liputan6 and Tempo (55 million words), and an Indonesian Web Corpus (Medved and Suchomel; 90 million words). IndoBERT is one of the state-of-the-art model options for sentiment analysis in Bahasa Indonesia [8], as in a paper by Bens Pardamean et al. [9]. That paper, entitled Fine-Tuning IndoBERT to Understand Indonesian Stock Trader Slang Language, reports that IndoBERT achieved the highest accuracy among 10 previous studies, at 68%.
The contribution of this research is a fine-tuned IndoBERT [10] model that classifies the sentiment of tweets collected from Twitter as positive or negative. This research differs from previous work in the parameter tuning applied to the model, which is explained further in the Results and Discussion section.

Research Phases
The system used in this research is a system for sentiment analysis on Twitter social media on climate change in Indonesia using the IndoBERT method. The flow diagram of this system is as follows.

Data Crawling
The data crawling process was carried out with a Python program using the twint library. The data collected are tweets about climate change in Indonesia. Crawling used the keyword 'perubahan iklim' and collected 1533 Indonesian-language tweets posted between 2 July 2022 and 13 July 2022. The collected data were saved in a CSV file.
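The crawl described above could be configured roughly as follows. This is only a sketch: the keyword and date range come from the text, while the `CrawlSettings` class, the output file name, the language code and the exact handling of the end date are assumptions for illustration. The `crawl` function needs the twint package and network access, so it is defined here but not called.

```python
from dataclasses import dataclass


@dataclass
class CrawlSettings:
    """Search settings taken from the text; other values are hypothetical."""
    keyword: str = "perubahan iklim"
    language: str = "id"                              # assumed language code
    since: str = "2022-07-02"
    until: str = "2022-07-13"
    output_csv: str = "tweets_perubahan_iklim.csv"    # hypothetical file name


def crawl(settings: CrawlSettings) -> None:
    """Run a twint search and store the results as CSV (needs network access)."""
    import twint  # imported lazily so the settings can be inspected without twint

    config = twint.Config()
    config.Search = settings.keyword
    config.Lang = settings.language
    config.Since = settings.since
    config.Until = settings.until
    config.Store_csv = True
    config.Output = settings.output_csv
    twint.run.Search(config)


settings = CrawlSettings()
```

Calling `crawl(settings)` would then write the matching tweets to the CSV file named in the settings.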

Data Labelling
The initial stage in designing this system is labelling the data manually, assigning '1' to sentences with positive sentiment and '0' to sentences with negative sentiment. Labelling was done collectively by three people, and the sentiment category of each tweet was then determined by adding up the sentiment scores. The positive category covers tweets that contain positive words, report on climate change management, praise efforts to deal with climate change, and the like. The negative category covers tweets that contain negative words, complain about climate change conditions, report on the effects of climate change, and the like. This labelling is done in the hope of increasing the accuracy of the system to be built. An example of the labelling process is shown in Table 1 below.
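One way to read the score-summing step is as a majority vote over the three annotators. The sketch below assumes that a summed score of 2 or 3 means positive and 0 or 1 means negative; the text does not state the threshold, and the example votes are hypothetical rather than data from the study.

```python
from typing import Sequence


def aggregate_label(votes: Sequence[int]) -> int:
    """Return 1 (positive) if most annotators voted 1, else 0 (negative)."""
    if len(votes) != 3 or any(v not in (0, 1) for v in votes):
        raise ValueError("expected exactly three votes, each 0 or 1")
    score = sum(votes)              # summed sentiment score, ranges from 0 to 3
    return 1 if score >= 2 else 0   # assumed threshold: simple majority


# Hypothetical votes for three tweets (not data from the study).
example_votes = [(1, 1, 0), (0, 0, 1), (1, 1, 1)]
labels = [aggregate_label(v) for v in example_votes]
```

With three annotators and binary labels, a tie is impossible, so every tweet receives exactly one final label.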

Preprocessing
Before the data are used to train the IndoBERT model [11], they must be preprocessed to meet the standard of the model. Preprocessing is the process of preparing data to conform to predetermined standards so that the knowledge extraction process can be applied [12]. The preprocessing step was carried out with a Python program using the nltk and Sastrawi libraries. Preprocessing is an important step for producing quality data [13]. The steps taken during preprocessing are as follows.

Data Cleaning
In the first stage of preprocessing, data cleaning removes hashtags, URLs, mentions, username links, punctuation, and all other characters that are not letters.
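A minimal sketch of this cleaning step with regular expressions is given below. The patterns are assumptions about what counts as a URL, mention or hashtag, and the sketch assumes that a hashtag is dropped together with its word; the sample tweet is hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # @username mentions
HASHTAG_RE = re.compile(r"#\w+")                # hashtags, including the word
NON_LETTER_RE = re.compile(r"[^A-Za-z\s]")      # digits, punctuation, symbols


def clean_tweet(text: str) -> str:
    """Strip URLs, mentions, hashtags and non-letter characters from a tweet."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return " ".join(text.split())               # collapse repeated whitespace


sample = "@user Perubahan iklim makin terasa! #iklim https://example.com 2022"
cleaned = clean_tweet(sample)
```

URLs are removed first so that the punctuation inside them does not leave stray fragments behind.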

Case Folding
The next stage of preprocessing is case folding, which converts all letters to lowercase.

Stopword Removal
The next stage is stopword removal, which removes words that are considered to carry no meaning.
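Stopword removal can be sketched as a simple token filter. The stopword set below is a tiny illustrative sample of common Indonesian function words, not the list used in this research, which relies on the nltk and Sastrawi libraries; the input sentence is hypothetical and assumed to be already cleaned and lowercased.

```python
from typing import AbstractSet

# Tiny illustrative stopword set; NOT the list used in the study.
STOPWORDS = frozenset({"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk"})


def remove_stopwords(text: str, stopwords: AbstractSet[str] = STOPWORDS) -> str:
    """Drop every whitespace-separated token that appears in the stopword set."""
    return " ".join(tok for tok in text.split() if tok not in stopwords)


result = remove_stopwords("perubahan iklim ini berdampak untuk petani di desa")
```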

Stemming
The last stage is stemming, which removes affixes from words. An example of the preprocessing process is shown in Table 2 below.

Splitting Data
The third stage in designing this system is data splitting. This stage separates the dataset into training data and test data. The training data are used to build the model and the test data are used to test it. The split is 70% training data and 30% test data.
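The 70/30 split can be sketched with the standard library alone. The shuffling, the fixed seed and the example rows are assumptions for illustration; the text does not say whether the data were shuffled before splitting.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(rows: Sequence[T], train_ratio: float = 0.7,
                  seed: int = 42) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the rows and cut it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)        # seeded for reproducibility
    cut = round(len(shuffled) * train_ratio)     # size of the training part
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (text, label) rows standing in for the labelled tweets.
rows = [(f"tweet {i}", i % 2) for i in range(10)]
train_rows, test_rows = split_dataset(rows)
```

Every row ends up in exactly one of the two parts, so no tweet is used for both training and testing.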

BERT, or Bidirectional Encoder Representations from Transformers, is one of the main innovations in contextualized representation learning [14]. BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. This allows the pre-trained BERT model to be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

Figure 2. Pre-training and Fine-tuning BERT [8]
The BERT framework is divided into two steps, pre-training and fine-tuning. In pre-training, the model is trained on unlabeled data over different pre-training tasks. In fine-tuning, the BERT model is first initialized with the pre-trained parameters, and all of the parameters are then fine-tuned using labeled data from the downstream task. Each downstream task has a separate fine-tuned model, even though all of them are initialized with the same pre-trained parameters. As Figure 2 shows, the pre-training process of BERT uses two unsupervised tasks.

IndoBERT Modelling
IndoBERT is a modification of BERT Base initiated by the IndoNLU team and is widely used for sentiment analysis in Indonesian. The model has become popular recently because it is reported to be trained on a corpus of about 4 billion words [9]. The model is trained using over 220M words aggregated from three main sources: (1) Indonesian Wikipedia, (2) news articles from Kompas, Tempo and Liputan6, and (3) an Indonesian Web Corpus (Medved and Suchomel). The resources of the pre-trained model are accessible and easy to reproduce [15]. IndoBERT uses the transformer mechanism, which learns the relationships between words in a text or sentence, and it is trained purely as a masked language model using the Hugging Face framework [11].
There are two stages in the pre-training process of IndoBERT. The first is the Masked Language Model (Masked LM) task, in which the model tries to predict the original words at positions where a [MASK] token has been randomly inserted. An example is shown in Table 3 below. The second is Next Sentence Prediction (NSP). In this task, the model is given two sentences (A and B) and asked to predict whether B follows A. Training data can be generated trivially: 50% of the time B is the sentence that follows A in the corpus, and 50% of the time B is a random sentence from the corpus. The special tokens used are (1) [CLS], the first token of every sequence, (2) [SEP], which separates two sentences, and (3) [PAD], which is used for padding. An example is shown in Table 4 below.
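The two pre-training tasks can be illustrated with a small data-preparation sketch. It is a simplification, not the actual IndoBERT pipeline: the 15% masking rate is the value reported for the original BERT and is assumed here, every selected token is simply replaced by [MASK], whole words stand in for subword tokens, and the random sentence is drawn from a tiny hypothetical corpus.

```python
import random
from typing import List, Tuple

CLS, SEP, PAD, MASK = "[CLS]", "[SEP]", "[PAD]", "[MASK]"


def mask_tokens(tokens: List[str], rng: random.Random,
                mask_prob: float = 0.15) -> Tuple[List[str], List[int]]:
    """Masked LM input: replace randomly chosen tokens with [MASK]."""
    positions = [i for i in range(len(tokens)) if rng.random() < mask_prob]
    if not positions and tokens:                 # mask at least one token
        positions = [rng.randrange(len(tokens))]
    masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    return masked, positions


def make_nsp_example(sent_a: List[str], sent_b: List[str],
                     corpus: List[List[str]], rng: random.Random,
                     max_len: int = 16) -> Tuple[List[str], bool]:
    """NSP input: keep the true next sentence half of the time, else a random one."""
    is_next = rng.random() < 0.5
    second = sent_b if is_next else rng.choice(corpus)
    tokens = [CLS] + sent_a + [SEP] + second + [SEP]
    if len(tokens) > max_len:
        raise ValueError("sequence longer than max_len")
    return tokens + [PAD] * (max_len - len(tokens)), is_next


rng = random.Random(0)
sent_a = "perubahan iklim semakin terasa".split()    # hypothetical sentence A
sent_b = "suhu udara terus meningkat".split()        # hypothetical sentence B
corpus = ["harga saham naik hari ini".split(),
          "tim nasional menang besar".split()]       # hypothetical corpus

masked, positions = mask_tokens(sent_a, rng)
tokens, is_next = make_nsp_example(sent_a, sent_b, corpus, rng)
```

The model would be trained to recover the original tokens at `positions` and to predict `is_next` from the padded sequence.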

Evaluation
At this stage, the results obtained from the previous stages are evaluated and analyzed. Evaluation is carried out using the confusion matrix, a table that allows visualization of the performance of an algorithm [6], to derive the accuracy, precision, recall and F1-score values. The matrix consists of the variables listed in Table 5 below.
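The four metrics follow directly from the confusion matrix counts. The sketch below uses the standard definitions and assumes the usual variables (true positives, false positives, false negatives, true negatives); the labels and predictions are hypothetical, not results from this research. The macro F1 averages the per-label F1-scores.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count TP, FP, FN and TN, treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(t == positive and p == positive for t, p in pairs),
        "fp": sum(t != positive and p == positive for t, p in pairs),
        "fn": sum(t == positive and p != positive for t, p in pairs),
        "tn": sum(t != positive and p != positive for t, p in pairs),
    }


def scores(c: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall and F1 from the four counts."""
    total = c["tp"] + c["fp"] + c["fn"] + c["tn"]
    precision = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
    recall = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (c["tp"] + c["tn"]) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels and predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
positive_scores = scores(counts)
negative_scores = scores(confusion_counts(y_true, y_pred, positive=0))
macro_f1 = (positive_scores["f1"] + negative_scores["f1"]) / 2
```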

RESULTS AND DISCUSSION
In this research, the system is tested to determine its success rate in terms of the average F1-score across labels. Six scenarios were built with different hyperparameter tuning; each scenario has a different learning rate and batch size. The scenario values are presented in Table 6 below.
Table 6 (fragment). Learning rate: 5e-5; batch size: 32.

Testing Results
Following the hyperparameter tuning above, Table 7 below shows that model SMA_01 has the highest F1-score among the models. SMA_01 achieves an F1-score of 95.6%, which makes it the best model under the tuning parameters tested. A lower batch size helps the model increase its F1-score.

Analysis of Test Results
Table 8 below shows the results for model SMA_01 from the experiment described in detail in the Research Method section. The dataset holds 1533 records, divided into a training set of 1087 tweets and validation and test sets of 233 tweets each. From Table 8 it can be concluded that this scenario is the best, with an accuracy of 95.3% on the training set of 1087 tweets. On the validation set of 233 tweets, the proposed model achieved an accuracy of 89.6%. On the test set of 233 tweets, it achieved 95.3%. These are the best results obtained from the proposed model, which is attributed to its lower learning rate and batch size. The lower the learning rate and batch size, the longer the model takes to train, but the higher the accuracy it produces [18]. Besides the choice of tuning parameters, the performance and accuracy of every scenario are also affected by the size of the dataset and by potentially mislabelled data, since the labelling process still relies on humans.