Work Readiness Prediction of Telkom University Students Using Multinomial Logistic Regression and Random Forest Method

Authors

  • Haura Athaya Salka, Telkom University, Bandung
  • Kemas Muslim Lhaksmana, Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v6i4.4546

Keywords:

People Analytics, Work Readiness, Students Performance, Multinomial Logistic Regression, Random Forest

Abstract

Work readiness is essential for college graduates who want to find a job soon after graduation. In practice, many graduates remain unemployed or take jobs that do not match the majors they studied for more than four years. Using a people analytics approach, this study aims to predict the work readiness of Telkom University students and to identify the factors that affect it. The model built is a multi-class classification model. It uses the Chi-square test for feature selection, Multinomial Logistic Regression and Random Forest as classification methods, and a confusion matrix for evaluation. Multinomial Logistic Regression is used because several studies have applied it to categorical data, while Random Forest is used as a comparison to determine which model yields better accuracy. Several test scenarios were run, and the best model was obtained by performing hyperparameter tuning and handling imbalanced data with SMOTE-ENN. SMOTE-ENN is applied to improve accuracy and to predict all classes well, especially the minority class. The best accuracy achieved is 53.9% for Multinomial Logistic Regression and 48.5% for Random Forest.
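
The two bookend steps named in the abstract, Chi-square scoring of a categorical feature against the class label and accuracy read from a multi-class confusion matrix, can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' implementation: the function names, the feature names, and the toy records are all hypothetical, and the classifiers and SMOTE-ENN resampling are left out.

```python
from collections import Counter


def chi_square_statistic(feature, labels):
    """Pearson chi-square statistic between two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, labels))   # joint counts per (value, class)
    feature_totals = Counter(feature)          # row totals
    label_totals = Counter(labels)             # column totals
    stat = 0.0
    for value, value_count in feature_totals.items():
        for cls, cls_count in label_totals.items():
            expected = value_count * cls_count / n
            stat += (observed[(value, cls)] - expected) ** 2 / expected
    return stat


def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    # Hypothetical toy records; not data from the study.
    readiness = ["ready", "ready", "partly", "partly", "not_ready", "not_ready"]
    features = {
        "internship": ["yes", "yes", "yes", "no", "no", "no"],
        "campus": ["A", "B", "A", "B", "A", "B"],
    }
    # A higher statistic suggests a stronger association with the label.
    for name, values in features.items():
        print(name, round(chi_square_statistic(values, readiness), 3))

    predicted = ["ready", "partly", "partly", "partly", "not_ready", "ready"]
    classes = ["ready", "partly", "not_ready"]
    matrix = confusion_matrix(readiness, predicted, classes)
    print(matrix, round(accuracy(matrix), 3))
```

In a feature-selection setting, features would be ranked by this statistic (or by its p-value) and only the top-ranked ones passed to the classifiers. Per-class rows of the confusion matrix show whether the minority class is being predicted at all, which overall accuracy alone can hide.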

Published

2022-10-25