Analysis of the K-Nearest Neighbor (KNN) Algorithm for Gender Classification Based on Voice Characteristics
DOI:
https://doi.org/10.30865/jurikom.v12i5.9117
Keywords:
Classification, K-Nearest Neighbor, Gender, Voice Characteristics, Knowledge Discovery in Databases
Abstract
Voice-based gender recognition systems still face challenges, such as a dependence on MFCC (Mel Frequency Cepstral Coefficients) features, which do not fully represent the complexity of human voice patterns. To address this, this study uses 20 voice characteristics and the K-Nearest Neighbor (KNN) algorithm. KNN is non-parametric, can handle non-linear relationships between features, and works intuitively by classifying each sample according to its nearest neighbors in the feature space, which makes it suitable for voice patterns that do not always follow linear relationships. The purpose of this study is to develop and analyze a KNN model for classifying gender based on voice characteristics. Fifty values of K were tested using K-Fold Cross Validation and Euclidean distance; at K = 3, 5, and 7 the average accuracies were 0.9740, 0.9700, and 0.9712. K = 3 was selected as the optimal parameter because it produced the highest average accuracy. Testing on 634 test samples with K = 3 produced 619 correct predictions and 15 incorrect predictions, an accuracy of about 98% (619/634 ≈ 0.976, in line with the cross-validated average of 0.9740). Precision, recall, and F1-score were 0.98, 0.97, and 0.98 for the Female class and 0.97, 0.98, and 0.98 for the Male class.
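The procedure summarized in the abstract (Euclidean-distance KNN, with K chosen by K-Fold Cross Validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-feature data, the helper names (`euclidean`, `knn_predict`, `k_fold_accuracy`, `make_toy_data`), the choice of 5 folds, and the random seeds are all assumptions made for illustration, since the abstract does not specify them. The study itself uses 20 voice characteristics and tests 50 values of K; only K = 3, 5, and 7 are compared here.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, k, n_folds=5, seed=0):
    """Average accuracy of KNN over n_folds folds (n_folds=5 is an assumption)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        tr_X = [X[i] for i in idx if i not in held_out]
        tr_y = [y[i] for i in idx if i not in held_out]
        correct = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


def make_toy_data(n_per_class=40, seed=1):
    """Hypothetical stand-in for the voice dataset: two Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("female", 0.0), ("male", 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


X, y = make_toy_data()

# Compare candidate K values by cross-validated accuracy and keep the best.
cv_scores = {k: k_fold_accuracy(X, y, k) for k in (3, 5, 7)}
best_k = max(cv_scores, key=cv_scores.get)

# Test accuracy implied by the counts reported in the abstract.
test_accuracy = 619 / 634

print(cv_scores, best_k, round(test_accuracy, 4))
```

Odd values of K are used so that a two-class vote cannot tie. The final ratio reproduces the arithmetic behind the reported test result: 619 correct predictions out of 634 samples is roughly 0.976.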



