Design of a Text Document Similarity Detection Application Using the Shingling Algorithm

(*) Irwan Saputra Simanullang (Universitas Budi Darma, Medan, Indonesia)

(*) Corresponding Author

Submitted: September 2, 2020; Published: September 30, 2020

Abstract

The ease of copying information from one document to another is one effect of advances in information technology, and it leads to unwanted plagiarism. Detecting similarity between text documents is needed to avoid the accumulation of information in a database or file system, and comparing two documents for similarity is one way to detect plagiarism between them. The Shingling algorithm, an algorithm used for near-duplicate document search, is implemented in this study to detect similarity in text documents so that plagiarism can be minimized. The test results show that the resulting application can detect the similarity of text documents that have gone through various manipulations, namely scaling (enlargement or reduction), rotation, cropping (partial cuts), and other document manipulation.
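As an illustration of the general idea behind shingling for near-duplicate detection, the sketch below splits each document into overlapping word sequences (shingles) and compares the two shingle sets with the Jaccard coefficient. It is a minimal sketch and not the application described in the abstract: the function names `shingles` and `resemblance`, the word-level tokenization, and the shingle size `w=3` are all assumptions chosen for the example.

```python
import re


def shingles(text, w=3):
    """Return the set of w-word shingles (contiguous word sequences) in text.

    Tokenization (lowercase, word characters only) and the default w=3
    are illustrative assumptions, not taken from the paper.
    """
    words = re.findall(r"\w+", text.lower())
    if len(words) < w:
        # Too short for a full shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(doc_a, doc_b, w=3):
    """Jaccard coefficient of the two shingle sets, a value in [0, 1]."""
    a, b = shingles(doc_a, w), shingles(doc_b, w)
    if not a and not b:
        return 1.0  # two empty documents are considered identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    score = resemblance(
        "the quick brown fox jumps",
        "the quick brown fox sleeps",
    )
    print(f"resemblance: {score:.2f}")
```

A score close to 1 suggests the documents are near-duplicates, while a score close to 0 suggests they share little text. The threshold at which a pair is flagged as similar is a design choice that depends on the collection being checked.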

Keywords


Shingling, Detection, Similarity, Document, Text





Copyright (c) 2020 Irwan Saputra Simanullang

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Jurnal Sistem Komputer dan Informatika (JSON)
Managed by STMIK Budi Darma
Secretariat: Jln. Sisingamangaraja No. 338, Tel. 061-7875998
Email: jurnal.json@gmail.com


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.