Implementation of Support Vector Machine and Logistic Regression Models to Predict Stroke
DOI:
https://doi.org/10.30865/jurikom.v10i1.5478
Keywords:
Stroke Disease, Machine Learning Algorithm, K-Fold Cross Validation, Confusion Matrix, Prediction.
Abstract
This study addresses stroke prediction. According to statistics from the Indonesian Health Service Research Agency (2018), stroke ranks first among diseases in Indonesia. Previous research on stroke includes a study of patient data from Abdul Wahab Sjahranie Hospital that used a Decision Tree machine learning model and reached an accuracy of 87.5%. To improve on that accuracy, this study predicts stroke with two algorithms: support vector machine (SVM) and logistic regression (LR). The stroke dataset has 4981 rows and 11 columns: 10 input attributes (gender, age, hypertension, heart_disease, ever_married, work_type, Residence_type, avg_glucose_level, bmi, smoking_status) and 1 output (stroke). The study combines SVM with SMOTE oversampling and evaluates the models with a confusion matrix and K-Fold cross-validation. The results show that SVM outperforms LR on this dataset when oversampling and cross-validation are used. The SVM + SMOTE model classifies the stroke data effectively, reaching an accuracy of 95.3% with 10-fold cross-validation.
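The evaluation procedure named in the abstract (SMOTE oversampling, K-Fold cross-validation, confusion matrix) can be illustrated with a minimal, dependency-free Python sketch. Everything in it is an assumption made for illustration, not the authors' code: the toy two-feature data, the helper names, the choice to oversample only the training part of each fold, and above all the nearest-centroid classifier, which is a simple stand-in for SVM so that the example runs with the standard library alone. In practice one would more likely rely on libraries such as scikit-learn and imbalanced-learn.

```python
import random
from collections import Counter

def make_toy_data(seed=0, n_major=200, n_minor=20):
    """Hypothetical imbalanced 2-feature data: label 1 is the rare class."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_major)]
    X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(n_minor)]
    return X, [0] * n_major + [1] * n_minor

def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def smote(minority, n_new, k_neighbors, rng):
    """SMOTE's core idea: each synthetic point lies on the segment between
    a minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        return []  # nothing to interpolate with
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def fit_centroids(X, y):
    """Stand-in classifier (NOT an SVM): store the mean point of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )

def confusion_matrix(y_true, y_pred):
    """Return the counts (TN, FP, FN, TP) for binary labels 0/1."""
    c = Counter(zip(y_true, y_pred))
    return c[(0, 0)], c[(0, 1)], c[(1, 0)], c[(1, 1)]

def cross_validate(X, y, k=10, seed=0):
    """K-Fold evaluation; SMOTE is applied to the training folds only,
    so synthetic points never leak into the held-out fold."""
    rng = random.Random(seed)
    total = [0, 0, 0, 0]
    for test_idx in k_fold_indices(len(X), k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        minority = [x for x, t in zip(X_tr, y_tr) if t == 1]
        n_new = max(len(y_tr) - 2 * len(minority), 0)  # balance the classes
        extra = smote(minority, n_new, 5, rng)
        model = fit_centroids(X_tr + extra, y_tr + [1] * len(extra))
        preds = [predict(model, X[i]) for i in test_idx]
        fold = confusion_matrix([y[i] for i in test_idx], preds)
        total = [a + b for a, b in zip(total, fold)]
    tn, fp, fn, tp = total
    return total, (tp + tn) / (tn + fp + fn + tp)

X, y = make_toy_data()
counts, accuracy = cross_validate(X, y, k=10)
print("TN, FP, FN, TP =", counts, "accuracy =", round(accuracy, 3))
```

Accuracy here is computed as (TP + TN) divided by the total number of predictions, pooled over all ten held-out folds. On imbalanced data it is worth reading the FN and FP counts alongside it, since a model can score high accuracy while missing most of the rare class.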



