Analysis of Missing Value Imputation Application with K-Nearest Neighbor (K-NN) Algorithm in Dataset

 (*)Achmad Fikri Sallaby (Universitas Dehasen, Bengkulu, Indonesia)
 Azlan Azlan (STMIK Triguna Dharma, Medan, Indonesia)

(*) Corresponding Author

Submitted: July 16, 2021; Published: August 1, 2021

Abstract

Missing values remain a common problem in many studies. A missing value occurs when data or data features are not completely available. Missing values are still frequent in datasets used for research, and they arise from many factors, such as human error, unavailable data, or even a virus in the database. Because data is central to research, incomplete data affects the results obtained. Data mining processes, including classification, depend heavily on the data, and classification can be carried out only when the data is complete. These problems can be addressed by combining imputation with the K-Nearest Neighbor algorithm, a process known as K-Nearest Neighbor Imputation (K-NNI). In this research, the K-NNI algorithm was able to overcome the problem of missing values in the dataset. This is reflected in the classification accuracy, which was 77.01% before the missing values were handled and 78.31% after imputation, an improvement of 1.30 percentage points.
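
The K-NNI idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `knn_impute`, the toy data, the choice of k = 2, the distance computed only over features observed in both rows, and the use of the neighbors' mean as the imputed value are all assumptions made for illustration.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Hypothetical helper for illustration; distances use only the features
    that are observed in both rows being compared.
    """
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows where feature j is observed.
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue  # no common features, distance undefined
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # nothing to impute from; leave the gap as None
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Toy dataset (assumed values): the second row is missing its third feature.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [5.0, 6.0, 9.0],
    [1.2, 1.9, 3.2],
]
filled = knn_impute(data, k=2)
print(filled[1])
```

In this toy example the two nearest neighbors of the incomplete row are the first and fourth rows, so the gap is filled with the mean of their third features. A classifier can then be trained on the completed dataset, which is the comparison the abstract reports.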

Keywords


Missing Value; Data Mining; Datasets; Imputation; K-Nearest Neighbor Imputation (K-NNI)


Copyright (c) 2021 Ahmad Fikri Sallaby, Azlan Azlan

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.


The IJICS (International Journal of Informatics and Computer Science)
Published by STMIK Budi Darma.
Jl. Sisingamangaraja No.338 Simpang Limun, Medan, North Sumatera
Email: ijics.stmikbudidarma@gmail.com
