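Comparative Analysis of Random Forest and XGBoost Models for Heart Disease Classification Based on Clinical Data (Analisis Komparatif Model Random Forest dan XGBoost untuk Klasifikasi Penyakit Jantung Berbasis Data Klinis)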

Authors

  • Angga Guardi Zunus Saputra, Universitas Dian Nuswantoro, Semarang
  • Fikri Budiman, Universitas Dian Nuswantoro, Semarang

DOI:

https://doi.org/10.30865/jurikom.v13i1.9427

Keywords:

Random Forest, XGBoost, Feature Importance, SMOTE, Heart Disease Classification

Abstract

Heart disease remains a major global health challenge, and its rising prevalence calls for accurate, reliable early diagnostic systems. This study compares the performance of the Random Forest (RF) and XGBoost algorithms for heart disease classification in order to identify the more suitable model for data-driven clinical decision support. To address a gap in ensemble learning research, it conducts a comprehensive comparative evaluation on the UCI Heart Disease Dataset. The methodology comprises data preprocessing, feature encoding, normalization, class imbalance handling with the Synthetic Minority Oversampling Technique (SMOTE), and hyperparameter optimization with RandomizedSearchCV. Model performance is evaluated using accuracy, precision, recall, F1-score, ROC-AUC, and the Matthews Correlation Coefficient (MCC), supported by feature importance analysis. Both ensemble models achieve strong predictive performance, with F1-scores consistently above 0.88. XGBoost exhibits superior overall performance as measured by F1-score, achieving the highest F1-score (0.8995) and precision (0.8785), which makes it more effective at minimizing false positive predictions. Random Forest, in contrast, shows superior sensitivity, with the highest recall (0.9510) and ROC-AUC (0.9582), along with better cross-validation stability. These findings indicate that the choice of heart disease classification algorithm should be aligned with specific clinical objectives, and the results are expected to contribute to the development of effective machine learning-based clinical decision support systems.
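
The precision/recall trade-off described in the abstract can be made concrete with a small sketch that derives the threshold-based metrics from confusion-matrix counts (ROC-AUC is omitted because it needs predicted scores, not counts). The function name `classification_metrics` and the counts passed to it are hypothetical, chosen only for illustration; they are not the study's code or results.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Return binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (sensitivity): share of true positives that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC: correlation between predicted and true labels, in [-1, 1].
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT results from the study.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this sketch, lowering the number of false positives raises precision, while lowering the number of false negatives raises recall. This is why a model with higher precision and a model with higher recall can each be preferable, depending on whether false alarms or missed cases are the costlier error in a given clinical setting.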

Published

2026-02-28

How to Cite

Zunus Saputra, A. G., & Budiman, F. (2026). Analisis Komparatif Model Random Forest dan XGBoost untuk Klasifikasi Penyakit Jantung Berbasis Data Klinis. JURNAL RISET KOMPUTER (JURIKOM), 13(1), 254–264. https://doi.org/10.30865/jurikom.v13i1.9427