Combining Symmetrical Uncertainty Weighting with K-Means Clustering to Improve Data Clustering Performance

Authors

  • Suranta Bill Fatric Ginting Universitas Sumatera Utara, Medan
  • Sawaluddin Sawaluddin Universitas Sumatera Utara, Medan
  • Muhammad Zarlis Universitas Sumatera Utara, Medan

DOI:

https://doi.org/10.30865/mib.v6i1.3366

Keywords:

Clustering, K-Means Clustering, Symmetrical Uncertainty, Davies-Bouldin Index

Abstract

Several studies of K-Means Clustering report that one of its weaknesses lies in determining the cluster center points, which in turn affects the distance calculations used to measure similarity between data points and to assign them to a cluster. The result is also influenced by how much each attribute contributes: if the attributes used have low relevance and contribute little to the data, the clustering results can be significantly affected. To address this, this research proposes weighting the data attributes in the clustering process using Symmetrical Uncertainty. The proposed method is tested on datasets from the UCI Machine Learning Repository: Iris with 150 records and Wine Quality with 178 records. Clustering performance is evaluated using the Davies-Bouldin Index (DBI). The test results show that the proposed method produces a significantly smaller DBI value.
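As a rough illustration of the weighting idea described in the abstract, the sketch below computes Symmetrical Uncertainty using its standard definition, SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), and applies the resulting weights in a weighted Euclidean distance. This is not the authors' implementation. The function names, the choice of the variable against which each attribute is scored, and the way the weights enter the distance are all assumptions made for illustration. The sketch also assumes attribute values are already discrete, so continuous attributes such as those in Iris would need to be binned first.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    info_gain = hx + hy - hxy       # mutual information IG(X; Y)
    return 2.0 * info_gain / (hx + hy)


def weighted_distance(a, b, weights):
    """Euclidean distance with one weight per attribute (assumed form)."""
    return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))
```

Under these assumptions, each attribute column would be scored with `symmetrical_uncertainty` against a reference variable, and the scores would be passed as `weights` when K-Means compares a data point with a cluster center. Attributes with a score near 0 then contribute almost nothing to the distance.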

Published

2022-01-25