Analysis of Twitter Trending Topics with FastText Feature Expansion Using the Logistic Regression Method
DOI:
https://doi.org/10.30865/jurikom.v9i1.3791
Keywords:
Twitter, Trending Topics, FastText, Expansion Feature, Logistic Regression
Abstract
Twitter is a social media platform that carries information such as the latest news, personal biographies, and tweets from users. Its trending topics feature shows which topics are currently popular. In practice, however, it is often difficult to understand what a given trending topic is about, so trending topics need to be classified into general categories. This study analyzes and classifies Twitter trending topic information by assigning it to several topic labels, using FastText feature expansion to reduce vocabulary mismatch in tweets. Classification is performed with the Logistic Regression method. The best result was obtained with a 90:10 split between training and test data, reaching an accuracy of 76.39%. The most discussed trending topic from September 2021 to October 2021 was politics with 15.83%, followed by religion with 12.64% and technology with 10.42%.
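The idea of feature expansion can be illustrated with a toy sketch. It is not the study's implementation: the similarity table, the vocabulary, and the function names below are hypothetical, and the hand-written table merely stands in for the similar words that a FastText model would supply. The sketch assumes a binary bag-of-words representation in which a vocabulary term that is absent from a tweet is switched on when one of its similar words is present.

```python
# Hypothetical similarity table: in the study this role is played by FastText;
# the words and pairings below are invented purely for illustration.
SIMILAR_WORDS = {
    "election": ["vote", "ballot"],
    "phone": ["smartphone", "gadget"],
}

# Hypothetical feature vocabulary.
VOCABULARY = ["election", "phone", "prayer"]


def to_binary_vector(tokens, vocabulary):
    """Return a 0/1 bag-of-words vector over the given vocabulary."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


def expand_features(tokens, vocabulary, similar_words):
    """Switch on a vocabulary term when the tweet contains a similar word."""
    present = set(tokens)
    vector = to_binary_vector(tokens, vocabulary)
    for index, term in enumerate(vocabulary):
        neighbours = similar_words.get(term, [])
        if vector[index] == 0 and any(word in present for word in neighbours):
            vector[index] = 1
    return vector


tweet = "everyone should vote with a new smartphone".split()
plain = to_binary_vector(tweet, VOCABULARY)
expanded = expand_features(tweet, VOCABULARY, SIMILAR_WORDS)
print(plain, expanded)
```

Without expansion the tweet shares no term with the vocabulary, so its vector is all zeros; after expansion the features for "election" and "phone" are set. Vectors of this kind would then be passed to a classifier such as Logistic Regression.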