Analysis of Expertise Group Using The Fuzzy K-NN Classification Algorithm (Case Study: School of Computing Telkom University)

Jodi Kusuma; Angelina Prima Kurniati; Ichwanul Muslim Karo Karo

doi:10.30865/jurikom.v9i3.4215

Authors

Jodi Kusuma Universitas Telkom, Bandung http://orcid.org/0000-0002-5141-1569
Angelina Prima Kurniati Universitas Telkom, Bandung
Ichwanul Muslim Karo Karo Universitas Negeri Medan, Medan http://orcid.org/0000-0002-2824-5654

DOI:

https://doi.org/10.30865/jurikom.v9i3.4215

Keywords:

School of Computing, Expertise Group, Classification, Fuzzy K-Nearest Neighbor, Undersampling, Oversampling

Abstract

The School of Computing at Telkom University has four Expertise Groups that define the courses students take. The choice of Expertise Group influences the choice of elective courses and the topic of the Final Project. Many students still have difficulty choosing an Expertise Group and end up picking the most popular one without considering their own potential and abilities. A wrong choice of Expertise Group can delay graduation, which in turn affects the accreditation of the study program and the university ranking, especially on the timely graduation indicator. Therefore, a system is needed that can predict the Expertise Group for School of Computing students based on their academic scores. In this study, the Fuzzy K-Nearest Neighbor classification algorithm was chosen because it determines the class from the nearest neighbors and can handle ambiguous data through the weight it assigns to each class. Five tests were carried out to find the best model: (1) examine the best split of training and validation data, (2) examine the best K value, (3) compare Fuzzy K-Nearest Neighbor with Naïve Bayes and Decision Tree (C4.5), which are commonly used classification algorithms, (4) examine the values of accuracy, precision, recall, and F1-score, and (5) examine accuracy using the cross-validation method. The Fuzzy K-Nearest Neighbor model achieved an accuracy of 72% on the imbalanced data, 62% with undersampling, and 56% with oversampling.
Compared with the other two algorithms, Fuzzy K-Nearest Neighbor has higher accuracy on the imbalanced data and with undersampling, but lower accuracy with oversampling, because of its weakness in handling minority-class data with little variation.
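
As an illustration of how per-class weighting works in a fuzzy nearest-neighbor classifier, the following is a minimal Python sketch. It is not the authors' implementation. It assumes the common formulation in which each of the K nearest neighbors votes for its own class with weight 1 / d^(2/(m-1)), crisp training labels, Euclidean distance, and a fuzzifier m = 2. The function name fuzzy_knn_predict, the parameter eps, and the toy scores and group labels are hypothetical and do not come from the study.

```python
import math
from collections import defaultdict


def fuzzy_knn_predict(train_X, train_y, query, k=3, m=2.0, eps=1e-9):
    """Return (predicted_label, memberships) for one query point.

    Assumption: crisp training labels, so each neighbour belongs fully
    to its own class. Neighbours are weighted by 1 / distance^(2/(m-1)).
    """
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples.
    neighbours = sorted(distances, key=lambda pair: pair[0])[:k]

    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    for distance, label in neighbours:
        # eps avoids division by zero when the query equals a training point.
        weights[label] += 1.0 / (distance ** exponent + eps)

    # Normalise so the class memberships sum to 1.
    total = sum(weights.values())
    memberships = {label: w / total for label, w in weights.items()}
    predicted = max(memberships, key=memberships.get)
    return predicted, memberships


# Hypothetical toy data: two course scores per student, made-up group labels.
scores = [(85, 60), (80, 65), (55, 90), (60, 85), (58, 88)]
groups = ["A", "A", "B", "B", "B"]

label, memberships = fuzzy_knn_predict(scores, groups, query=(70, 75), k=3)
print(label, memberships)
```

With this toy data the query is assigned to group "B" with a membership of about 0.62, while group "A" keeps a membership of about 0.38. Unlike a plain majority vote, the memberships show how ambiguous the decision is, which is the property the abstract refers to.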

