Clustering YouTube Comments on Mental Health in Indonesia Using the K-Means Algorithm

Authors

  • Agung Nugroho, Universitas Islam Negeri Sumatera Utara, Medan
  • Raissa Amanda Putri, Universitas Islam Negeri Sumatera Utara, Medan

DOI:

https://doi.org/10.30865/jurikom.v13i2.9552

Keywords:

Mental Health, YouTube Comments, Text Mining, K-Means Clustering, TF-IDF

Abstract

This study analyzes mental health expressions in Indonesian-language YouTube comments using a text mining approach and the K-Means clustering algorithm. Social media is increasingly used to express psychological conditions, producing large volumes of unstructured text that are difficult to analyze manually. The study therefore applies text preprocessing (case folding, tokenization, stopword removal, and stemming), followed by Term Frequency–Inverse Document Frequency (TF-IDF) weighting to transform the text into numerical representations. Clustering is performed with the K-Means algorithm, and the optimal number of clusters is determined using the Elbow Method and the Silhouette Coefficient. The results show that the optimal number of clusters is k = 3, the value at which the Silhouette Coefficient is highest, indicating good cluster quality. A total of 2,411 YouTube comments were grouped into three clusters representing different types of mental health expression: complaint expressions, personal experience narratives, and general responses. The study contributes a social media comment clustering model for analyzing mental health expressions in the Indonesian digital context. The results demonstrate that K-Means can identify meaningful patterns in large-scale textual data without requiring labeled datasets, making it useful for supporting data-driven mental health analysis.
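The pipeline described in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the six comments and the stopword list are hypothetical, stemming is omitted, the TF-IDF variant (log(N/df) + 1 with L2 normalization) is one common choice, and a deterministic farthest-point initialization is swapped in for random seeding so that the run is reproducible. The function names (`preprocess`, `tfidf`, `kmeans`, `silhouette`) are illustrative only.

```python
import math
from collections import Counter

# Hypothetical toy comments for illustration only; not data from the study.
COMMENTS = [
    "Saya merasa cemas dan lelah",
    "Cemas terus dan lelah sekali",
    "Dulu saya depresi lalu ikut terapi",
    "Pengalaman saya depresi dan terapi membantu",
    "Video bagus terima kasih",
    "Terima kasih video nya sangat bagus",
]
# Tiny illustrative stopword list (an assumption, not the study's list).
STOPWORDS = {"saya", "dan", "nya", "lalu", "ikut", "terus", "sekali", "dulu"}

def preprocess(text):
    """Case folding, whitespace tokenization, stopword removal (no stemming)."""
    return [t for t in text.lower().split() if t not in STOPWORDS]

def tfidf(docs):
    """Return L2-normalized TF-IDF vectors, using idf = log(N / df) + 1."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        v = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centers(X, k):
    """Deterministic farthest-point seeding (assumption: replaces random init)."""
    centers = [list(X[0])]
    while len(centers) < k:
        far = max(X, key=lambda x: min(dist(x, c) for c in centers))
        centers.append(list(far))
    return centers

def kmeans(X, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, sum of squared errors)."""
    centers = init_centers(X, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(x, centers[c])) for x in X]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [x for x, l in zip(X, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(dist(x, centers[l]) ** 2 for x, l in zip(X, labels))
    return labels, sse

def silhouette(X, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for i, x in enumerate(X):
        same = [dist(x, y) for j, y in enumerate(X) if j != i and labels[j] == labels[i]]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        b = min(
            sum(dist(x, y) for y, l in zip(X, labels) if l == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

X = tfidf([preprocess(c) for c in COMMENTS])
results = {}
for k in range(2, 5):
    labels_k, sse = kmeans(X, k)
    results[k] = (labels_k, sse, silhouette(X, labels_k))
    print(f"k={k}  SSE={sse:.3f}  silhouette={results[k][2]:.3f}")

# Pick the k with the highest mean silhouette; SSE is printed for an elbow check.
best_k = max(results, key=lambda k: results[k][2])
labels = results[best_k][0]
print("best k by silhouette:", best_k, "labels:", labels)
```

On this toy input the silhouette peaks at k = 3, mirroring the selection procedure the abstract describes, although the toy result says nothing about the study's 2,411 comments. For a corpus of that size a library implementation with a proper Indonesian stemmer and stopword list would be the more practical choice.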

Published

2026-04-28

How to Cite

Nugroho, A., & Putri, R. A. (2026). Clustering YouTube Comments on Mental Health in Indonesia Using the K-Means Algorithm. JURNAL RISET KOMPUTER (JURIKOM), 13(2), 537–544. https://doi.org/10.30865/jurikom.v13i2.9552

Section

Articles