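Analisis Trending Topik Twitter dengan Fitur Ekspansi FastText Menggunakan Metode Logistic Regression (Twitter Trending Topic Analysis with FastText Feature Expansion Using the Logistic Regression Method)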

Izzan Faikar Ramadhy; Yuliant Sibaroni

doi:10.30865/jurikom.v9i1.3791

Authors

Izzan Faikar Ramadhy, Telkom University, Bandung
Yuliant Sibaroni, Telkom University, Bandung

Keywords:

Twitter, Trending Topics, FastText, Expansion Feature, Logistic Regression

Abstract

Twitter is a social media platform that carries information such as the latest news, personal biographies, and user tweets. Its trending topics feature shows which topics are currently popular. In practice, however, it is often hard to tell what a given trending topic is about, so trending topics need to be classified into general categories. This study analyzes and classifies Twitter trending topic information into several topic labels using FastText feature expansion, which is used to reduce vocabulary mismatch in tweets. Classification is performed with the Logistic Regression method. The best result, an accuracy of 76.39%, was obtained with a 90:10 split of training to test data. The most discussed trending topic from September 2021 to October 2021 was politics, at 15.83%, followed by religion at 12.64% and technology at 10.42%.
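The abstract does not spell out how feature expansion is carried out, so the following is only a minimal sketch of the general idea: when a vocabulary word is absent from a tweet but one of its similar words is present, the corresponding feature is switched on. The similarity table, the vocabulary, and the function names are hypothetical stand-ins. In the study, the similar words would come from a FastText model, and the resulting vectors would be passed to a Logistic Regression classifier. Binary features are used here purely to keep the example short.

```python
# Hypothetical similarity table. In practice it would be built from the
# nearest neighbours of each vocabulary word in a trained FastText model.
SIMILAR_WORDS = {
    "election": ["vote", "ballot", "campaign"],
    "phone": ["smartphone", "gadget", "android"],
    "prayer": ["worship", "mosque"],
}

# Hypothetical feature vocabulary (one feature per word).
VOCABULARY = ["election", "phone", "prayer", "president"]


def bag_of_words(tokens, vocabulary):
    """Binary vector: 1 if the vocabulary word occurs in the tweet, else 0."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def expand_features(tokens, vocabulary, similar_words):
    """Set a zero-valued feature to 1 when a similar word is in the tweet."""
    present = set(tokens)
    vector = bag_of_words(tokens, vocabulary)
    for i, word in enumerate(vocabulary):
        neighbours = set(similar_words.get(word, []))
        if vector[i] == 0 and present & neighbours:
            vector[i] = 1
    return vector


tweet = "the president asked everyone to vote on their smartphone".split()
plain = bag_of_words(tweet, VOCABULARY)
expanded = expand_features(tweet, VOCABULARY, SIMILAR_WORDS)
print(plain)     # features before expansion
print(expanded)  # features after expansion
```

In this toy tweet, "vote" and "smartphone" never match the vocabulary directly, so the plain vector misses the "election" and "phone" features. Expansion recovers them, which is the vocabulary mismatch the abstract refers to.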

