The Effect of N-Grams on Book Classification Using Feature Extraction and Feature Selection with Multinomial Naïve Bayes

 (*) Esti Mulyani (Politeknik Negeri Indramayu, Indramayu, Indonesia)
 Fachrul Pralienka Bani Muhamad (Politeknik Negeri Indramayu, Indramayu, Indonesia)
 Kurnia Adi Cahyanto (Politeknik Negeri Indramayu, Indramayu, Indonesia)

(*) Corresponding Author

Submitted: December 14, 2020; Published: January 22, 2021



A core task of libraries is processing library materials, which includes classifying books according to an established scheme. Dewey Decimal Classification (DDC) is the method most commonly used worldwide to determine book classification (labeling) in libraries. DDC has the advantage of being universal and systematic. However, applying it manually is inefficient given the large number of books a library must classify, and labels must also be kept in step with updates to the DDC. An automatic classification system can address this problem, and such a system can be built with text mining methods. In this study, the words in each book title were turned into features using N-Grams (Unigram, Bigram, Trigram). The generated features then underwent feature selection, and the book titles were classified using the Multinomial Naïve Bayes algorithm. The study examines how Unigram, Bigram, and Trigram features affect book title classification when feature extraction and feature selection are combined with Multinomial Naïve Bayes. The test results show that Unigram achieves the highest accuracy, 74.4%.
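The pipeline described in the abstract (N-Gram feature generation, feature selection, Multinomial Naïve Bayes) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not state which feature selection criterion was used, so a minimum document frequency threshold (`min_df`) stands in for it here as an assumption. The function names, the add-one (Laplace) smoothing, the toy titles, and the DDC main-class labels "000" and "900" are likewise hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(title, n):
    """Return the word-level n-grams of a title as space-joined strings."""
    tokens = title.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def select_features(docs, min_df=1):
    """Keep n-grams found in at least min_df documents (assumed criterion)."""
    doc_freq = Counter(g for doc in docs for g in set(doc))
    return {g for g, count in doc_freq.items() if count >= min_df}


def train(docs, labels, vocab):
    """Estimate log priors and add-one smoothed log likelihoods per class."""
    class_docs = Counter(labels)
    counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        counts[label].update(g for g in doc if g in vocab)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(counts[label].values())
        log_prior = math.log(n_docs / len(labels))
        log_like = {
            g: math.log((counts[label][g] + 1) / (total + len(vocab)))
            for g in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, doc):
    """Return the class with the highest posterior log-probability."""
    def score(label):
        log_prior, log_like = model[label]
        # n-grams outside the selected vocabulary are ignored
        return log_prior + sum(log_like[g] for g in doc if g in log_like)
    return max(model, key=score)


# Hypothetical toy data: book titles with DDC main-class labels.
titles = [
    "introduction to machine learning",
    "machine learning with python",
    "history of ancient rome",
    "ancient history of egypt",
]
labels = ["000", "000", "900", "900"]

n = 1  # 1 = Unigram, 2 = Bigram, 3 = Trigram
docs = [ngrams(t, n) for t in titles]
vocab = select_features(docs, min_df=1)
model = train(docs, labels, vocab)
prediction = predict(model, ngrams("python machine learning", n))
```

Changing `n` to 2 or 3 switches the features to Bigrams or Trigrams. Short titles produce few higher-order n-grams, and an unseen title may share none of them with the training data, which is one plausible reason for Unigram features to perform well on book titles.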


Classification; Feature Extraction; Feature Selection; Multinomial Naïve Bayes; N-Gram

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

This work is licensed under a Creative Commons Attribution 4.0 International License.