Detection of Gereflekter Content in Children's Stories Using a Naïve Bayes Classifier

Mayya Tania Wewengkang, Dana Sulistiyo Kusumo, Widi Astuti

Abstract


Textbooks and storybooks are widely used as sources of knowledge. When children read a book, they try to interpret each word and sentence in it. This becomes a problem when the book contains vulgar words and indecent sentences, which are not appropriate for children at the elementary school level. In this research, we call such content gereflekter content. To address this problem, we built a system that detects gereflekter content in the text of children's stories, which serve as the data set. The system uses a Naïve Bayes Classifier (NBC) and is evaluated in two scenarios using accuracy, precision, and recall, because the data set is imbalanced: the negative class contains more data than the positive class. In the evaluation, the test scenario produced a high average precision of 99.01%, whereas average recall was above 50%. Taken together, these two values indicate that the model does not yet detect the class reliably, but its detections are highly trustworthy when it does make them.
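The relationship between the three metrics on an imbalanced data set can be illustrated with a minimal sketch. The labels below are hypothetical toy data, not the data set or the results of this study, and the helper name `binary_metrics` is an assumption introduced only for illustration. The sketch shows how a classifier that flags only half of the positive items, but never flags a negative one, can score high accuracy and precision while recall stays at 50%.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Precision: of the items flagged positive, how many truly are.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the truly positive items, how many were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical imbalanced toy labels: 10 positive items, 90 negative items.
y_true = [1] * 10 + [0] * 90
# Hypothetical predictions: 5 of the 10 positives are flagged, no negatives are.
y_pred = [1] * 5 + [0] * 5 + [0] * 90

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is dominated by the large negative class, which is why precision and recall are reported alongside it when the classes are imbalanced.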

Keywords


Children Story, Imbalanced Data, Gereflekter Content, Text Classification, Naïve Bayes Classifier

DOI: https://doi.org/10.30865/mib.v4i2.2015

Copyright (c) 2020 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
Universitas Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
Email: mib.stmikbd@gmail.com
