Comparative Analysis of Machine Learning Models Using Stratified K-Fold Cross Validation for Heart Disease Classification

Authors

  • Avrilyan Putra Bintang Pratama, Universitas Dian Nuswantoro, Semarang
  • Wahyu Aji Eko Prabowo, Universitas Dian Nuswantoro, Semarang

DOI:

https://doi.org/10.30865/jurikom.v13i2.9670

Keywords:

Heart Disease, Classification, Random Forest, Support Vector Machine, K-Nearest Neighbor

Abstract

Heart disease is one of the leading causes of death worldwide. Conventional diagnostic approaches still have limitations, such as subjective interpretation and relatively long analysis times. This study therefore proposes using machine learning to improve the accuracy of heart disease risk prediction, comparing the performance of the Random Forest, Support Vector Machine (SVM), and K-Nearest Neighbor (KNN) algorithms. The methodology comprises data preprocessing, splitting the dataset into training and testing sets, and hyperparameter optimization using Stratified K-Fold Cross Validation with K = 5, 10, 15, and 20 folds. Models are evaluated with accuracy, precision, recall, F1-score, and ROC-AUC to measure classification performance comprehensively and objectively. The results show that Random Forest achieves the best performance: at the optimal configuration of K = 15, it attains an accuracy of 93.17%, a precision of 0.92, a recall of 0.95, an F1-score of 0.94, and an ROC-AUC of 0.97. This model also minimizes classification errors, particularly False Negatives, making it more effective at identifying at-risk patients. The main contribution of this study is demonstrating that combining Random Forest with Stratified K-Fold Cross Validation can significantly improve classification performance and produce a model that is accurate, stable, and reliable for use in medical decision support systems.
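
The comparison described in the abstract can be illustrated with a minimal sketch in scikit-learn. This is not the authors' code: the synthetic dataset from `make_classification`, the feature count, the scaling pipelines, and all hyperparameter values are assumptions chosen only to make the example runnable, and the sketch scores each model directly with cross validation rather than reproducing the study's train/test split and hyperparameter search.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): the study's heart disease dataset
# is not reproduced here.
X, y = make_classification(
    n_samples=400, n_features=13, n_informative=6, random_state=0
)

# The three algorithms compared in the abstract; hyperparameters are
# illustrative defaults, not the tuned values from the study.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# The five metrics named in the abstract, as scikit-learn scorer names.
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]

results = {}
for k in (5, 10, 15, 20):
    # Stratification keeps the class ratio roughly equal in every fold.
    cv = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        # Average each metric over the k held-out folds.
        results[(name, k)] = {m: scores[f"test_{m}"].mean() for m in scoring}

for (name, k), metrics in results.items():
    summary = ", ".join(f"{m}={v:.3f}" for m, v in metrics.items())
    print(f"{name:13s} K={k:2d}: {summary}")
```

Each printed row is the mean of a metric over the K held-out folds, so rows for different K values can be compared to see how stable a model's estimate is as the fold count changes. The numbers produced on synthetic data say nothing about the results reported in the abstract.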

Published

2026-04-30

How to Cite

Pratama, A. P. B., & Prabowo, W. A. E. (2026). Analisis Perbandingan Model Machine Learning menggunakan Teknik Stratified K-Fold Cross Validation untuk Klasifikasi Penyakit Jantung. JURNAL RISET KOMPUTER (JURIKOM), 13(2). https://doi.org/10.30865/jurikom.v13i2.9670

Section

Articles