Cluster Analysis using K-Means and K-Medoids Methods for Data Clustering of Amil Zakat Institutions Donor

Authors

  • Hotmaida Lestari Siregar, University of North Sumatra, Medan
  • Muhammad Zarlis, Bina Nusantara University, Jakarta
  • Syahril Efendi, University of North Sumatra, Medan

DOI:

https://doi.org/10.30865/mib.v7i2.5315

Keywords:

K-Means, K-Medoids, RFM Model, DBI, Average Silhouette Score

Abstract

Cluster analysis is a multivariate method that groups objects according to shared characteristics. Choosing the number of clusters in advance is important because it determines the quality of the resulting clusters. This study determines the optimal number of clusters for grouping donor data using the K-Means and K-Medoids methods. The data were prepared with the RFM (Recency, Frequency, Monetary) model, and the two methods were compared on the Davies-Bouldin Index (DBI) and on cluster compactness as measured by the average silhouette score. K-Means achieved its smallest DBI, 0.485, and its highest average silhouette score, 0.781, at k=6, while K-Medoids achieved its smallest DBI, 1.096, and its highest average silhouette score, 0.517, at k=3. Because a lower DBI and a higher silhouette score both indicate better clustering, the results show that K-Means with 6 clusters is the best method for clustering the donor data of Amil Zakat Institutions.
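The selection procedure summarized in the abstract can be illustrated with a minimal sketch: cluster the data for several values of k, score each result with the DBI and the average silhouette score, and keep the k with the lowest DBI and the highest silhouette. Everything below is an assumption made for illustration and is not the authors' implementation: the nine toy points stand in for scaled (recency, frequency, monetary) donor features, the K-Means routine uses a deterministic farthest-point initialization, and the helper names (`kmeans`, `davies_bouldin`, `avg_silhouette`) are hypothetical. K-Medoids is not implemented here, but the two scoring functions accept any labeling, so they would apply to its output in the same way.

```python
import math
from statistics import mean

# Hypothetical scaled (recency, frequency, monetary) values; NOT the study's data.
points = [
    (0.10, 0.10, 0.10), (0.15, 0.10, 0.12), (0.12, 0.14, 0.10),
    (0.90, 0.10, 0.50), (0.88, 0.15, 0.52), (0.92, 0.12, 0.48),
    (0.50, 0.90, 0.90), (0.55, 0.88, 0.92), (0.48, 0.93, 0.88),
]

def centroid(members):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(mean(col) for col in zip(*members))

def groups_of(pts, labels):
    """Map each cluster label to the list of points carrying it."""
    groups = {}
    for p, l in zip(pts, labels):
        groups.setdefault(l, []).append(p)
    return groups

def kmeans(pts, k, iters=100):
    """Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centers = [pts[0]]
    while len(centers) < k:
        centers.append(max(pts, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in pts]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for j, members in groups_of(pts, labels).items():
            centers[j] = centroid(members)
    return labels

def davies_bouldin(pts, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / center distance."""
    groups = groups_of(pts, labels)
    cents = {j: centroid(g) for j, g in groups.items()}
    scatter = {j: mean(math.dist(p, cents[j]) for p in g) for j, g in groups.items()}
    return mean(
        max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in groups if j != i)
        for i in groups
    )

def avg_silhouette(pts, labels):
    """Mean of (b - a) / max(a, b); singleton clusters score 0 by convention."""
    groups = groups_of(pts, labels)
    scores = []
    for p, l in zip(pts, labels):
        own = groups[l]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(mean(math.dist(p, q) for q in g) for j, g in groups.items() if j != l)
        scores.append((b - a) / max(a, b))
    return mean(scores)

# Score each candidate k, then pick the best k under each criterion.
results = {}
for k in (2, 3, 4):
    labels = kmeans(points, k)
    results[k] = (davies_bouldin(points, labels), avg_silhouette(points, labels))

best_k_by_dbi = min(results, key=lambda k: results[k][0])
best_k_by_silhouette = max(results, key=lambda k: results[k][1])
print(results, best_k_by_dbi, best_k_by_silhouette)
```

On this toy data the two criteria agree on the same k, which mirrors the situation reported in the abstract, where each method's smallest DBI and highest silhouette score occur at the same number of clusters. When the criteria disagree, the choice of k needs a further judgment that this sketch does not cover.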

Published

2023-04-27

Section

Articles