Klasifikasi SMS Spam Berbahasa Indonesia Menggunakan Algoritma Multinomial Naïve Bayes (Classification of Indonesian-Language SMS Spam Using the Multinomial Naïve Bayes Algorithm)
DOI:
https://doi.org/10.30865/mib.v5i4.3119
Keywords:
Classification of SMS Spam, Multinomial Naïve Bayes, Indonesian SMS Spam, Text Mining, SMS
Abstract
According to the Truecaller Insights Report 2020, Indonesia ranked sixth among the countries with the most spam messages, and SMS is one of the channels through which spam is delivered. Spam SMS consists of unwanted or unsolicited messages, including advertisements and scams. These messages inconvenience the users who receive them, and some recipients have become victims of crime after responding. To reduce the inconvenience and crime caused by spam messages, this study filters SMS spam by classifying messages with the Multinomial Naïve Bayes (MNB) algorithm, searching for the combination of parameters that gives the best model performance. In model testing, the MNB and SVM models share the highest precision at 93%. The SVM model achieves the highest recall (94%), the highest F1-score (94%), and the highest accuracy (95%), while the MNB model has the fastest test time at 2.66 ms.
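The abstract does not say which library, preprocessing steps, or parameter grid the study used, so the following is only a minimal from-scratch sketch of the general technique: a Multinomial Naïve Bayes classifier with additive smoothing, a small search over the smoothing parameter, and precision, recall, and F1-score for the spam class. The toy messages, the `alpha` candidates, and all function and class names are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


class MultinomialNB:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing strength (must be > 0 here)

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.vocab = set()
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        docs_per_class = Counter(labels)
        self.log_prior = {c: math.log(docs_per_class[c] / len(labels))
                          for c in self.classes}
        self.total = {c: sum(self.word_counts[c].values())
                      for c in self.classes}
        return self

    def predict_one(self, text):
        vocab_size = len(self.vocab)
        best, best_score = None, -math.inf
        for c in self.classes:
            # log P(c) + sum over tokens of log P(token | c)
            score = self.log_prior[c]
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # skip words never seen during training
                num = self.word_counts[c][tok] + self.alpha
                den = self.total[c] + self.alpha * vocab_size
                score += math.log(num / den)
            if score > best_score:
                best, best_score = c, score
        return best

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


def precision_recall_f1(y_true, y_pred, positive="spam"):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented toy data (hypothetical, not the study's dataset).
train_texts = [
    "selamat anda menang hadiah 100 juta hubungi nomor ini",
    "promo pulsa gratis klik link sekarang",
    "pinjaman dana cepat tanpa jaminan hubungi kami",
    "nanti sore jadi rapat di kantor ya",
    "aku sudah sampai rumah terima kasih",
    "besok kita makan siang bareng ya",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
test_texts = ["anda menang hadiah pulsa gratis", "rapat besok jadi ya"]
test_labels = ["spam", "ham"]

# Hypothetical parameter search: keep the alpha with the best F1-score.
alpha_candidates = [0.1, 0.5, 1.0]
best_alpha, best_f1 = None, -1.0
for alpha in alpha_candidates:
    model = MultinomialNB(alpha=alpha).fit(train_texts, train_labels)
    _, _, f1 = precision_recall_f1(test_labels, model.predict(test_texts))
    if f1 > best_f1:
        best_alpha, best_f1 = alpha, f1

final_model = MultinomialNB(alpha=best_alpha).fit(train_texts, train_labels)
predictions = final_model.predict(test_texts)
scores = precision_recall_f1(test_labels, predictions)
```

In practice a parameter search like this would be scored on a separate validation split or with cross-validation rather than on the test messages, and the tokenizer would likely include steps such as stop-word removal or stemming for Indonesian text.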
License

This work is licensed under a Creative Commons Attribution 4.0 International License