Comparison of Distance Calculation Methods on Centroid Values and Data Grouping Using K-Means Clustering

Budi Hartono, Sri Eniyati, Kristophorus Hadiono

Abstract


This study observes the process of grouping data, or forming clusters, using K-Means clustering with three distance measures: Euclidean distance, Manhattan distance, and Minkowski distance. The observations focus on changes in the centroid values, the resulting data groupings, and the number of iterations required. The experiments used test sets of 20, 30, 40, and 50 data points, each grouped into 2 clusters. The study also summarizes applications of K-Means clustering, which has been widely used in various fields, including health, education, and disasters. The groupings produced by the three distance measures differ only slightly: the largest difference is 2 data members, on the 50-point test set. The most iterations, 7, occurred on the 40-point test set with Euclidean distance, and the fewest, 3, on the 20-point test set with Minkowski distance. The 50-point test set required 4 iterations. The amount of test data is therefore not directly proportional to the number of iterations needed for the clusters to reach a stable state.
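The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the initial centroids, the Minkowski order p = 3, and the way iterations are counted (including the final pass that confirms the clusters are stable) are all assumptions made for illustration. The sketch also keeps the usual mean-based centroid update for every distance measure, which is one common choice rather than something the abstract confirms.

```python
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def minkowski(a: Point, b: Point, p: float) -> float:
    """Minkowski distance of order p between two points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def manhattan(a: Point, b: Point) -> float:
    """Manhattan distance: Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance: Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def k_means(
    data: List[Point],
    centroids: List[Point],
    distance: Callable[[Point, Point], float],
    max_iter: int = 100,
) -> Tuple[List[Point], List[int], int]:
    """Run K-Means and return (centroids, labels, iterations).

    Iterations are counted up to and including the pass in which the
    assignments stop changing (an assumed counting convention).
    """
    labels: List[int] = []
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: distance(x, centroids[k]))
            for x in data
        ]
        if new_labels == labels:
            return centroids, labels, iteration  # stable state reached
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [x for x, lab in zip(data, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(tuple(centroids[k]))  # keep an empty cluster's centroid
        centroids = updated
    return centroids, labels, max_iter


# Hypothetical toy data and starting centroids, not the study's test data.
toy_data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
start = [(1, 1), (9, 8)]

measures = {
    "Euclidean": euclidean,
    "Manhattan": manhattan,
    "Minkowski (p=3)": lambda a, b: minkowski(a, b, 3),
}
results = {name: k_means(toy_data, start, fn) for name, fn in measures.items()}
for name, (final_centroids, final_labels, n_iter) in results.items():
    print(name, final_labels, n_iter)
```

On well-separated toy data like this, all three measures produce the same grouping; differences in membership and iteration count of the kind reported in the abstract only appear when points lie near the boundary between clusters.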

Keywords


Data Cluster; Euclidean Distance; Centroid; K-Means Cluster






DOI: https://doi.org/10.30865/json.v4i3.6021



Copyright (c) 2023 Budi Hartono, Sri Eniyati, Kristophorus Hadiono

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Jurnal Sistem Komputer dan Informatika (JSON)
Managed by Universitas Budi Darma
Secretariat: Jln. Sisingamangaraja No. 338, Tel. 061-7875998
Email: lppm.ubd@gmail.com

