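Sentence Similarity Test of Final Project Titles Using the Cosine Similarity Method and TF-IDF Weighting (Uji Kemiripan Kalimat Judul Tugas Akhir dengan Metode Cosine Similarity dan Pembobotan TF-IDF)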
DOI:
https://doi.org/10.30865/mib.v5i2.2935
Keywords:
Cosine Similarity, Text Mining, TF-IDF Weighting, Final Project Report, Machine Learning
Abstract
Deli Husada Health Institute is a health campus that has been established for 34 years and currently has 30,000 students. Every year, each final-year student submits a final project to the study program, and before writing the final project report each student must first propose a title to the study program. To reduce the level of similarity among the titles of students' final reports, the study program usually checks them manually. This manual check has proven ineffective in determining final project titles, so many titles resemble one another and many final project reports look the same. Given these conditions, a sentence similarity test of final project titles was carried out using the Cosine Similarity method and TF-IDF weighting at the Deli Husada Delitua Health Institute campus. When the training data were tested against the training data, 43% of the submitted titles were found not eligible to be submitted again and 53% were found eligible to be submitted as final project titles, according to their similarity to existing final project report titles. The average time obtained was 0.12117 minutes.
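The abstract names the technique but not its exact formulas, so the following is only a minimal sketch of how a title check based on TF-IDF weighting and cosine similarity could work. The whitespace tokenizer, the smoothed IDF variant, the example titles, and the helper names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `most_similar`) are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(title):
    # Lowercase and split on whitespace. A real system would probably also
    # remove stop words and apply stemming (assumption, not from the paper).
    return title.lower().split()


def tfidf_vectors(titles):
    # Raw term counts per title.
    docs = [Counter(tokenize(t)) for t in titles]
    n_docs = len(docs)
    # Document frequency: number of titles that contain each term.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF, one common variant (assumed here), so every weight is positive.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1
           for term, count in df.items()}
    return [{term: tf * idf[term] for term, tf in doc.items()} for doc in docs]


def cosine_similarity(a, b):
    # Cosine of the angle between two sparse vectors stored as dicts.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(new_title, existing_titles):
    # Return the closest existing title and its similarity score.
    if not existing_titles:
        return None, 0.0
    vectors = tfidf_vectors(existing_titles + [new_title])
    query = vectors[-1]
    scores = [cosine_similarity(query, v) for v in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return existing_titles[best], scores[best]


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    archive = [
        "hospital information system design",
        "sentiment analysis of twitter data",
    ]
    match, score = most_similar("hospital information system", archive)
    print(match, round(score, 3))
```

A study program could compare the returned score against a cutoff of its own choosing (for example, a hypothetical 0.8) to decide whether a proposed title is too close to one already on file. The abstract does not state which threshold the authors used.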
License

This work is licensed under a Creative Commons Attribution 4.0 International License