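Application of the AdaBoost Algorithm to Improve Data Mining Classification Performance on an Imbalanced Diabetes Dataset (Penerapan Algoritma Adaboost Untuk Peningkatan Kinerja Klasifikasi Data Mining Pada Imbalance Dataset Diabetes)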

Authors

  • Nia Novianti Universitas Sumatera Utara, Medan
  • Muhammad Zarlis Universitas Sumatera Utara, Medan
  • Poltak Sihombing Universitas Sumatera Utara, Medan

DOI:

https://doi.org/10.30865/mib.v6i2.4017

Keywords:

Improvement, Performance, Classification, Adaboost, Data Mining

Abstract

According to the World Health Organization (WHO), more than 150 million people are recorded as having diabetes, across all age groups and both sexes. Early indications of diabetes can be identified from the records of patients who already have the disease. Such patient records are typically stored and organized in a data warehouse, commonly referred to as a dataset, and they must be processed before they yield useful knowledge. Data mining provides techniques for this processing, among them classification. K-Nearest Neighbor (K-NN) is one method used for classification, and the reliability of its results is judged by the accuracy obtained. One issue needs particular attention: the dataset used for classification has an unbalanced class distribution (class imbalance), and classifying imbalanced data is a significant problem because it can reduce performance. AdaBoost is a data mining technique that can be used to raise the accuracy of classification methods. The results of this research show that the AdaBoost algorithm improves the classification performance of the K-Nearest Neighbor algorithm on a diabetes dataset, as seen in the accuracy obtained before and after applying AdaBoost. Across five tests with K = 7, 13, 19, 25, and 31, accuracy increased after the AdaBoost algorithm was applied.
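The before-and-after comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not say how K-NN was combined with AdaBoost. The sketch assumes one common adaptation: each neighbour votes with its AdaBoost sample weight, so that re-weighting changes the K-NN decision. The synthetic two-dimensional data, the class sizes (150 versus 40), the 70/30 split, and the helper names `nearest`, `knn_predict`, `adaboost_knn`, and `ensemble_predict` are all hypothetical; only the K values come from the abstract.

```python
import math
import random

def nearest(train_X, x, k):
    """Indices of the k training points closest to x (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return order[:k]

def knn_predict(train_X, train_y, weights, x, k):
    """k-NN vote in which each neighbour votes with its sample weight (labels +1/-1)."""
    score = sum(weights[i] * train_y[i] for i in nearest(train_X, x, k))
    return 1 if score > 0 else -1

def adaboost_knn(train_X, train_y, k, rounds=10):
    """Return a list of (alpha, weights) pairs, one per boosting round."""
    n = len(train_X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        preds = [knn_predict(train_X, train_y, w, x, k) for x in train_X]
        err = sum(wi for wi, p, t in zip(w, preds, train_y) if p != t)
        if err <= 0.0 or err >= 0.5:
            # Perfect or useless learner: keep it only if it is the first, then stop.
            if not ensemble:
                ensemble.append((1.0, w))
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, w))
        # Raise the weight of misclassified points, lower the rest, renormalise.
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, train_y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def ensemble_predict(train_X, train_y, ensemble, x, k):
    """Alpha-weighted vote of all boosting rounds."""
    idx = nearest(train_X, x, k)  # the neighbours are the same in every round
    score = 0.0
    for alpha, w in ensemble:
        vote = sum(w[i] * train_y[i] for i in idx)
        score += alpha * (1 if vote > 0 else -1)
    return 1 if score > 0 else -1

def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)

# Hypothetical imbalanced data: 150 majority (-1) and 40 minority (+1) points.
rng = random.Random(0)

def make_points(n, centre, label):
    return [((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label) for _ in range(n)]

data = make_points(150, 0.0, -1) + make_points(40, 1.5, 1)
rng.shuffle(data)
split = int(0.7 * len(data))
X_train = [p for p, _ in data[:split]]
y_train = [t for _, t in data[:split]]
X_test = [p for p, _ in data[split:]]
y_test = [t for _, t in data[split:]]

uniform = [1.0 / len(X_train)] * len(X_train)
for k in (7, 13, 19, 25, 31):  # K values reported in the abstract
    plain = accuracy(lambda x: knn_predict(X_train, y_train, uniform, x, k), X_test, y_test)
    ens = adaboost_knn(X_train, y_train, k)
    boosted = accuracy(lambda x: ensemble_predict(X_train, y_train, ens, x, k), X_test, y_test)
    print(f"K={k:2d}  k-NN accuracy={plain:.3f}  boosted accuracy={boosted:.3f}")
```

On synthetic data like this, the boosted accuracy is not guaranteed to exceed the plain K-NN accuracy for every K. The sketch shows only the shape of the comparison and does not reproduce the reported results.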

Published

2022-04-25

Section

Articles