Cluster Analysis using K-Means and K-Medoids Methods for Data Clustering of Amil Zakat Institutions Donor

Hotmaida Lestari Siregar; Muhammad Zarlis; Syahril Efendi

doi:10.30865/mib.v7i2.5315

Authors

Hotmaida Lestari Siregar, University of North Sumatra, Medan
Muhammad Zarlis, Bina Nusantara University, Jakarta
Syahril Efendi, University of North Sumatra, Medan

DOI:

https://doi.org/10.30865/mib.v7i2.5315

Keywords:

K-Means, K-Medoids, RFM Model, DBI, Average Silhouette Score

Abstract

Cluster analysis is a multivariate analysis method that groups objects according to shared characteristics. Choosing the number of clusters in advance is important because it determines the quality of the resulting clusters. This study analyzes the optimal number of clusters for grouping the donor data of Amil Zakat Institutions using the K-Means and K-Medoids methods. The data were prepared using the RFM (Recency, Frequency, Monetary) model, and the two methods were compared on the Davies-Bouldin Index (DBI) and on cluster compactness, assessed by the average silhouette score. The K-Means method produces its smallest DBI value of 0.485 and its highest average silhouette score of 0.781 at k=6, while the K-Medoids method produces its smallest DBI value of 1.096 and its highest average silhouette score of 0.517 at k=3. The results show that K-Means with 6 clusters is the better method for clustering the donor data of Amil Zakat Institutions.
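The two validity measures named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. The function names, the toy RFM-like vectors, and the two labelings are hypothetical and chosen only to show how the measures behave: a lower DBI and a higher average silhouette score both indicate more compact, better separated clusters.

```python
import math
from collections import defaultdict


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    return clusters


def centroid(cluster):
    """Coordinate-wise mean of the points in one cluster."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def davies_bouldin(points, labels):
    """Davies-Bouldin Index with Euclidean distance (lower is better)."""
    clusters = group_by_label(points, labels)
    centers = {k: centroid(c) for k, c in clusters.items()}
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = {
        k: sum(math.dist(p, centers[k]) for p in c) / len(c)
        for k, c in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case similarity ratio between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def average_silhouette(points, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for k, other in clusters.items()
            if k != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical donors described by scaled (recency, frequency, monetary) values.
rfm = [
    (0.10, 0.20, 0.10), (0.20, 0.10, 0.20), (0.15, 0.15, 0.10),
    (0.90, 0.80, 0.90), (0.80, 0.90, 0.85), (0.85, 0.85, 0.90),
]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the two visible groups
bad_labels = [0, 1, 0, 1, 0, 1]   # mixes the two groups

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    print(name, davies_bouldin(rfm, labels), average_silhouette(rfm, labels))
```

In a comparison like the one in the study, the same two scores would be computed for each candidate k and each method, and the configuration with the lowest DBI and highest average silhouette score would be preferred.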

