Personality Detection On Twitter User With RoBERTa

Rianda Khusuma; Warih Maharani; Prati Hutari Gani

doi:10.30865/mib.v7i1.5598

Authors

Rianda Khusuma, Telkom University, Bandung
Warih Maharani, Telkom University, Bandung
Prati Hutari Gani, Telkom University, Bandung

Keywords:

Twitter, Personality Classification, Big Five Personality, RoBERTa, Hyperparameter

Abstract

Social media lets users post status updates about themselves. Twitter is one such platform: users can easily express themselves by posting tweets to their accounts. This activity can indirectly reveal the personality of the account owner. One personality classification scheme that can be used is the Big Five, which groups individual character into five personality types: openness, conscientiousness, extraversion, agreeableness, and neuroticism. In the workplace, personality significantly affects which kind of work suits a person. A conventional personality test is administered manually, which takes longer and costs more. Machine learning that detects personality from social media is therefore needed. Using the RoBERTa model for personality classification, supported by a dataset of tweets from Twitter, a personality detection system can be built. By determining the optimal ratio of training data to test data and by tuning hyperparameters, the RoBERTa model reaches a classification accuracy of 57.14%.
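The search over train/test ratios and hyperparameters described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the candidate ratios, and the learning rate and batch size values are hypothetical, and `train_and_predict` is a majority-class stand-in, not the authors' RoBERTa fine-tuning code.

```python
import random
from collections import Counter
from itertools import product


def split(data, train_ratio, seed=0):
    """Shuffle (text, label) pairs and cut them into train and test parts."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def train_and_predict(train, test, learning_rate, batch_size):
    """Hypothetical placeholder for fine-tuning a classifier.

    A real system would fine-tune RoBERTa here with the given
    hyperparameters. This stand-in ignores them and predicts the
    most frequent training label for every test item.
    """
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return [majority for _ in test]


def search(data, ratios, learning_rates, batch_sizes):
    """Try every combination and keep the one with the best test accuracy."""
    best = None
    for ratio, lr, bs in product(ratios, learning_rates, batch_sizes):
        train, test = split(data, ratio)
        predictions = train_and_predict(train, test, lr, bs)
        acc = accuracy([label for _, label in test], predictions)
        if best is None or acc > best["accuracy"]:
            best = {
                "train_ratio": ratio,
                "learning_rate": lr,
                "batch_size": bs,
                "accuracy": acc,
            }
    return best


if __name__ == "__main__":
    # Toy data: invented tweet placeholders with Big Five labels.
    toy = [
        ("tweet %d" % i, "openness" if i % 3 else "neuroticism")
        for i in range(20)
    ]
    print(search(toy, [0.7, 0.8, 0.9], [1e-5, 2e-5], [8, 16]))
```

Selecting a configuration by test accuracy, as this sketch does, mirrors the procedure the abstract outlines; a stricter setup would hold out a separate validation set for the selection step.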
