Performance Comparison of Multiple Machine Learning Algorithms with Two Validation Strategies for Rainfall Classification
DOI: https://doi.org/10.30865/json.v7i3.9475
Keywords: Model Comparison, Rainfall Classification, Machine Learning, Random Forest, Disaster Mitigation
Abstract
Accurate rainfall prediction remains challenging because of the complexity of atmospheric processes and its impact on many sectors. The performance of machine learning algorithms in rainfall classification depends strongly on data characteristics and on the validation method, so a comparative evaluation is needed to identify the model best suited to the local context. This study compares the performance of five machine learning models, namely Random Forest, XGBoost, Support Vector Machine, K-Nearest Neighbor, and Decision Tree, for rainfall classification in Tapanuli Tengah Regency, using 32,796 daily observation records for the period 2015–2024 obtained from the FL Tobing Meteorological Station. Evaluation is carried out with a data split scheme and with 10-fold cross-validation, using precision, recall, and F1-score as metrics. The results show that Random Forest consistently performs best under both validation schemes, with F1-scores of 62% and 63%, and is more stable than the other models when the class distribution is imbalanced. These findings indicate that ensemble approaches are more adaptive in capturing nonlinear relationships among meteorological parameters, and they provide a methodological basis for selecting rainfall classification models to support hydrometeorological disaster mitigation.
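The evaluation protocol described in the abstract (k-fold cross-validation scored with precision, recall, and F1) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the class labels (`no_rain`, `light`, `heavy`), the 70/20/10 class proportions, the stratified fold assignment, the macro averaging, and the `majority_baseline` classifier are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def macro_prf(y_true, y_pred):
    """Macro-averaged (precision, recall, F1) over all labels seen."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def stratified_folds(y, k=10):
    """Assign sample indices to k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def cross_validate(fit_predict, X, y, k=10):
    """Mean macro F1 over k folds; fit_predict(X_tr, y_tr, X_te) -> y_pred."""
    scores = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        y_pred = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        scores.append(macro_prf([y[i] for i in test_idx], y_pred)[2])
    return sum(scores) / len(scores)


def majority_baseline(X_train, y_train, X_test):
    """Hypothetical baseline: always predict the most frequent training class."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Hypothetical imbalanced labels (assumed proportions, not the study's data).
y = ["no_rain"] * 70 + ["light"] * 20 + ["heavy"] * 10
X = list(range(len(y)))  # placeholder features; the baseline ignores them

score = cross_validate(majority_baseline, X, y, k=10)
print(f"macro F1 of majority baseline: {score:.3f}")
```

On this toy data the majority baseline is correct on 70% of samples, yet its macro F1 is only about 0.27, because the two minority classes each contribute an F1 of zero. This is one reason F1-based comparison is more informative than accuracy when, as the abstract notes, the class distribution is imbalanced. Any of the five models could be plugged in through the `fit_predict` argument.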
License
Copyright (c) 2026 Jurnal Sistem Komputer dan Informatika (JSON)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

This work is licensed under a Creative Commons Attribution 4.0 International License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).

