Recovering Missing Data in a Dataset Using the K-Nearest Neighbor Imputation Algorithm in Data Mining

Budianto Bangun, Abdul Karim Karim

Abstract


Complete data is one of the main goals of any data collection effort. In research, incomplete data affects the results obtained because the analysis cannot be carried out optimally. A dataset is a collection of information that is stored over a long period and accumulates into a large body of data. Missing values in a dataset are therefore an important problem that must be handled in research, and data recovery is needed. Data mining is a process used in computer research to process data that has already been collected, either by the researcher (primary data) or from an existing dataset (secondary data). Recovery is the process of restoring data that is lost or cannot be found. The K-Nearest Neighbor Imputation (KNNI) algorithm uses a supervised learning approach and aims to discover new data patterns by connecting existing data patterns with new data. KNNI identifies objects based on certain information, namely the closest distance to the object.
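The idea of filling a missing value from the closest records can be illustrated with a minimal sketch. The abstract does not state the distance metric, the value of k, or how neighbor values are combined, so the choices below (Euclidean distance over the features observed in both rows, k = 2, and the mean of the neighbors' values) are assumptions made for illustration only. The function name `knn_impute` and the sample data are hypothetical, not taken from the paper.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows with None entries filled from the k nearest rows.

    Assumptions (not from the paper): Euclidean distance averaged over the
    features observed in both rows, and the mean of the neighbors' values.
    """
    def distance(a, b):
        # Compare only the positions where both rows have a value.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical data: the second record is missing its third attribute.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [8.0, 9.0, 10.0],
]
completed = knn_impute(data, k=2)
print(completed[1])
```

In this example the two records closest to the incomplete one are the first and third, so the missing entry is filled with the mean of their values in that column, and the distant fourth record is ignored.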

Keywords


Return; Data; Lost; Datasets; K-Nearest Neighbor Imputation algorithm; Data Mining


DOI: https://doi.org/10.30865/mib.v8i3.8014



Copyright (c) 2024 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
Universitas Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
Email: mib.stmikbd@gmail.com
