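Sentence Similarity Test of Final Project Titles Using the Cosine Similarity Method and TF-IDF Weighting (Uji Kemiripan Kalimat Judul Tugas Akhir dengan Metode Cosine Similarity dan Pembobotan TF-IDF)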

Authors

  • Indra Mawanta, Universitas Potensi Utama, Medan
  • T S Gunawan, Universitas Potensi Utama, Medan
  • Wanayumini Wanayumini, Universitas Potensi Utama, Medan

DOI:

https://doi.org/10.30865/mib.v5i2.2935

Keywords:

Cosine Similarity, Text Mining, TF-IDF Weighting, Final Project Report, Machine Learning

Abstract

Deli Husada Health Institute is a health campus that has been established for 34 years and currently has 30,000 students. Every year, final-year students submit a final project to their study program, and before writing the final project report each student must first propose a title to the study program. To reduce similarity among final report titles, the study program usually checks them manually. This manual check has not been effective for screening final project titles, so many titles end up similar to one another and many final project reports look the same. Given these conditions, a sentence similarity test of final project titles was carried out using the Cosine Similarity method with TF-IDF weighting at the Deli Husada Delitua Health Institute campus. When the training data were tested against the training data, 43% of the submitted titles were found not eligible to be submitted again because of their high similarity to existing final project report titles, while 53% were eligible to be submitted as final project titles. The average time obtained was 0.12117 minutes.
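The title check described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the whitespace tokenizer, the IDF variant log(N / df) + 1, the sample titles, and the 0.8 threshold are all assumptions, since the abstract does not state the preprocessing steps or the cut-off used.

```python
import math
from collections import Counter


def tokenize(title):
    """Lowercase a title and split on whitespace (assumed: no stemming, no stop words)."""
    return title.lower().split()


def tfidf_vectors(titles):
    """Return one sparse TF-IDF vector (term -> weight) per title.

    Assumed weighting: raw term frequency times (log(N / df) + 1).
    """
    docs = [Counter(tokenize(t)) for t in titles]
    n_docs = len(docs)
    # Document frequency: in how many titles each term appears.
    doc_freq = Counter(term for doc in docs for term in doc)
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    return [{term: tf * idf[term] for term, tf in doc.items()} for doc in docs]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(new_title, existing_titles):
    """Return (best_score, best_title) for new_title against a non-empty title list."""
    vectors = tfidf_vectors(existing_titles + [new_title])
    query = vectors[-1]
    scored = [
        (cosine_similarity(query, vec), title)
        for vec, title in zip(vectors[:-1], existing_titles)
    ]
    return max(scored)


# Hypothetical titles and threshold, for illustration only.
EXISTING = [
    "sistem informasi rekam medis berbasis web",
    "analisis kualitas pelayanan rumah sakit",
]
THRESHOLD = 0.8

if __name__ == "__main__":
    score, match = most_similar("sistem informasi rekam medis berbasis web", EXISTING)
    verdict = "not eligible" if score >= THRESHOLD else "eligible"
    print(f"{score:.3f} against '{match}': {verdict}")
```

A proposed title whose best score reaches the assumed threshold would be flagged as too similar to an existing title; in practice the cut-off would have to be chosen by the study program.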

Author Biographies

Indra Mawanta, Universitas Potensi Utama, Medan

Faculty of Engineering and Computer Science, Master of Computer Science Study Program

T S Gunawan, Universitas Potensi Utama, Medan

Faculty of Engineering and Computer Science, Master of Computer Science Study Program

Wanayumini Wanayumini, Universitas Potensi Utama, Medan

Faculty of Engineering and Computer Science, Master of Computer Science Study Program

Published

2021-04-25

Section

Articles