Identify User Behavior Based on Tweet Type on Twitter Platform Using Mean Shift Clustering
DOI:
https://doi.org/10.30865/mib.v6i3.4329
Keywords:
Mean Shift, Politic, Centrality, TF-IDF Vectorizer, User Behavior
Abstract
Twitter is a social media platform where users obtain information on a wide range of topics, and many issues are debated there. For example, in Indonesian politics, students and the public held demonstrations in DKI Jakarta over what they regarded as the poor performance of the President of Indonesia and his staff, and demanded that the President step down from office. While the issue was trending, some users showed positive (praising) behavior and others negative (abusive) behavior, which is the subject of this study. Before clustering, the data are preprocessed so that they can be processed more efficiently, and the words are weighted with the TF-IDF Vectorizer. The Mean Shift clustering algorithm is then applied to identify user behavior based on the type of tweet. This method can find information in a vast data set in a short time. The Mean Shift algorithm produced 67 clusters, of which 5 were selected to identify user behavior. User behavior in clusters 0, 2, 3, and 4 is negative because the tweets discuss people who want the President of the Republic of Indonesia to resign immediately. User behavior in cluster 1 is positive because its tweets only report that people from Lampung have already arrived in Jakarta.
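The pipeline described in the abstract (TF-IDF weighting followed by Mean Shift clustering) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the four example tweets are invented, and the smoothed IDF formula, the flat kernel, the `bandwidth` value, and the helper names `tfidf`, `dist`, and `mean_shift` are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, rows): one L2-normalised TF-IDF vector per document."""
    tokenised = [doc.lower().split() for doc in docs]  # minimal preprocessing
    vocab = sorted({tok for toks in tokenised for tok in toks})
    n = len(docs)
    df = Counter(tok for toks in tokenised for tok in set(toks))
    # Assumed smoothed IDF variant: log((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenised:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        rows.append([v / norm for v in vec])
    return vocab, rows


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: each seed moves to the mean of its neighbours."""
    modes = []
    for p in points:
        centre = list(p)
        for _ in range(max_iter):
            near = [q for q in points if dist(centre, q) <= bandwidth]
            new = [sum(col) / len(near) for col in zip(*near)]
            moved = dist(centre, new)
            centre = new
            if moved < tol:
                break
        modes.append(centre)
    # Merge seeds that converged to (almost) the same mode into one cluster.
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if dist(m, c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels, centres


# Invented example tweets (not taken from the study's data set).
tweets = [
    "president must resign now",
    "president must resign immediately",
    "lampung people arrived in jakarta",
    "lampung people already in jakarta",
]

vocab, rows = tfidf(tweets)
labels, centres = mean_shift(rows, bandwidth=1.0)  # bandwidth is an assumed value
print(labels)  # [0, 0, 1, 1]
```

Unlike k-means, Mean Shift does not take the number of clusters as input; the count follows from the bandwidth, which is consistent with the study reporting the number of clusters (67) as a result rather than a setting.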
License

This work is licensed under a Creative Commons Attribution 4.0 International License