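Uji Kemiripan Kalimat Judul Tugas Akhir dengan Metode Cosine Similarity dan Pembobotan TF-IDF
(Sentence Similarity Test of Final Project Titles Using the Cosine Similarity Method and TF-IDF Weighting)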

 Indra Mawanta (Universitas Potensi Utama, Medan, Indonesia)
 T S Gunawan (Universitas Potensi Utama, Medan, Indonesia)
 (*) Wanayumini Wanayumini (Universitas Potensi Utama, Medan, Indonesia)

(*) Corresponding Author

DOI: http://dx.doi.org/10.30865/mib.v5i2.2935

Abstract

Deli Husada Health Institute is a health campus that has been established for 34 years and currently has 30000 students. Every year, each final-year student submits a final project to their study program, and before writing the final project report the student must first submit its title to the study program. To reduce similarity among final project titles, the study program usually checks them manually. This manual check has not been effective, so many titles, and consequently many final project reports, end up looking alike. To address this, a sentence similarity test of final project titles was carried out using the Cosine Similarity method with TF-IDF weighting at the Deli Husada Delitua Health Institute campus. In the test of the training data against the training data, 43% of the submitted titles were found not eligible to be submitted again because of their high similarity to existing final project report titles, while 53% were eligible to be submitted as final project titles. The average time obtained was 0.12117 minutes.
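
The approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the raw-count TF with log(N / df) IDF variant, the function names, and the sample titles are all assumptions made for illustration, and the abstract does not state which preprocessing or similarity threshold was used.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase a title and split on whitespace (assumed; no stemming or stop words)."""
    return title.lower().split()


def tfidf_vectors(titles):
    """Return one {term: weight} dict per title, using raw TF times log(N / df)."""
    docs = [Counter(tokenize(t)) for t in titles]
    n_docs = len(docs)
    # Document frequency: number of titles that contain each term.
    df = Counter(term for doc in docs for term in doc)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in doc.items()}
        for doc in docs
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a vector with no weight cannot be compared
    return dot / (norm_a * norm_b)


def most_similar(new_title, existing_titles):
    """Return (best_score, best_title) for new_title against a non-empty title list."""
    vectors = tfidf_vectors(existing_titles + [new_title])
    query = vectors[-1]
    scores = [cosine_similarity(query, v) for v in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], existing_titles[best]


# Hypothetical titles, used only to exercise the functions.
existing = [
    "sistem pakar diagnosa penyakit",
    "analisis kualitas air minum",
]
score, match = most_similar("sistem pakar diagnosa penyakit kulit", existing)
print(f"closest existing title: {match!r} (cosine similarity {score:.3f})")
```

A study program could compare the returned score against a cutoff of its own choosing to decide whether a proposed title is too close to an existing one; any such cutoff would be a local policy decision rather than something given in the abstract.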

Keywords


Cosine Similarity; Text Mining; TF-IDF Weighting; Final Project Report; Machine Learning





Copyright (c) 2021 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
STMIK Budi Darma
Secretariat: Jln. Sisingamangaraja No. 338, Tel. 061-7875998
Email: mib.stmikbd@gmail.com
