Comparative Analysis of Personality Detection using Random Forest and Multinomial Naive Bayes

Authors

  • Azka Zainur Azifa, Telkom University, Bandung
  • Warih Maharani, Telkom University, Bandung
  • Prati Hutari Gani, Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v7i1.5592

Keywords:

Twitter, Personality Detection, Big Five Personality, Random Forest, Multinomial Naïve Bayes

Abstract

Personality refers to the differences between individuals in how they think, feel, and behave. It is an individual characteristic shaped by both biological parents and environmental influences, and personality type is one of the determinants of the type of work a person performs. The Big Five model is a framework used to detect personality; it divides characteristics into five dimensions: Openness, Conscientiousness, Extraversion, Neuroticism, and Agreeableness. Several studies have shown that personality can be identified through social media, including Twitter. Much research on personality detection has been carried out using machine learning, but most of it focuses on a single model. In text-based detection, Multinomial Naive Bayes performs more stably than Random Forest, while Random Forest achieves better accuracy than Multinomial Naive Bayes. This study therefore conducts a comparative analysis of Random Forest and Multinomial Naive Bayes. The best accuracy, 60.71%, was produced by the system using the Random Forest model, with a precision of 62% for the Openness trait and 57% for the Agreeableness trait.
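
The abstract reports an overall accuracy and a separate precision value for each trait. As a rough illustration of how such figures are typically computed, the sketch below derives accuracy and per-class precision from lists of true and predicted labels. The function names and the toy labels are hypothetical and are not taken from the study's data or code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_per_class(y_true, y_pred):
    """Per-class precision: true positives / all predictions of that class."""
    predicted = Counter(y_pred)
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    # Only classes that were predicted at least once appear in the result.
    return {label: true_pos[label] / count for label, count in predicted.items()}


# Hypothetical toy labels for illustration only (not data from the study).
y_true = ["openness", "openness", "agreeableness", "agreeableness", "openness"]
y_pred = ["openness", "agreeableness", "agreeableness", "openness", "openness"]

acc = accuracy(y_true, y_pred)
prec = precision_per_class(y_true, y_pred)

print(f"accuracy = {acc:.2f}")
for label, value in sorted(prec.items()):
    print(f"precision[{label}] = {value:.2f}")
```

Reporting precision per trait, as the abstract does, shows how reliable a prediction of a given trait is, which a single accuracy figure can hide when the classes are imbalanced.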

Published

2023-02-03