Improving the Accuracy of the K-Nearest Neighbor Method with Symmetrical Uncertainty Feature Selection

Authors

  • Anirma Kandida Br Ginting, Universitas Sumatera Utara, Medan
  • Maya Silvi Lydia, Universitas Sumatera Utara, Medan
  • Elviawaty Muisa Zamzami, Universitas Sumatera Utara, Medan

DOI:

https://doi.org/10.30865/mib.v5i4.3254

Keywords:

K-Nearest Neighbor, Symmetrical Uncertainty, Feature Selection, Classification, Accuracy Improvement

Abstract

The accuracy of K-Nearest Neighbor (KNN) tends to be lower than that of other classification methods. This is related to the attributes used and to how strongly each attribute influences the classification of a data set: attributes with little relevance can mislead the determination of the class of a new instance. One way to address this is feature selection. In this research, the authors apply feature selection to K-Nearest Neighbor, using Symmetrical Uncertainty to remove attributes that have an unfavorable effect from the data set. The proposed method is tested on data sets obtained from the UCI Machine Learning Repository. The results show that feature selection with Symmetrical Uncertainty increases the classification accuracy of KNN, with an increase in accuracy of 3.00% after feature selection.
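
As an illustration of the filtering step described in the abstract, the sketch below scores each attribute by its Symmetrical Uncertainty with the class label, using the standard definition SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), and keeps the attributes whose score reaches a cutoff. The function names, the toy data, and the cutoff of 0.1 are assumptions made for this example; the abstract does not state the threshold or any discretization the authors used, and continuous attributes would need to be discretized before this calculation applies.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 means fully dependent."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant; treat SU as 0 by convention
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy   # I(X; Y), also called information gain
    return 2.0 * mutual_info / (h_x + h_y)


def select_features(rows, labels, threshold):
    """Return (kept column indices, SU scores); a column is kept if its SU >= threshold."""
    n_features = len(rows[0])
    scores = [
        symmetrical_uncertainty([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    kept = [j for j, score in enumerate(scores) if score >= threshold]
    return kept, scores


# Hypothetical toy data: column 0 determines the class, column 1 is unrelated to it.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = [0, 0, 1, 1]

# The threshold of 0.1 is an assumed value, not one taken from the paper.
kept, scores = select_features(rows, labels, threshold=0.1)
print(kept, scores)
```

The retained columns would then be passed to a KNN classifier in place of the full attribute set, so that irrelevant attributes no longer contribute to the distance computation.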

Published

2021-10-26