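Multi-Label Topic Classification of Sahih Bukhari Hadith Using K-Nearest Neighbor and Latent Semantic Analysis (original title: Klasifikasi Topik Multi Label pada Hadis Shahih Bukhari Menggunakan K-Nearest Neighbor dan Latent Semantic Analysis)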
DOI: https://doi.org/10.30865/jurikom.v7i1.2013
Keywords: Hadith, Classification, Latent Semantic Analysis, K-Nearest Neighbor, Binary Relevance
Abstract
Hadith is the second source of Islamic law after the Al-Quran, which makes it important to study. However, learning hadith involves some difficulties, such as determining whether a hadith belongs to the topic of suggestions, prohibitions, or information. This obstructs the hadith learning process, especially for Muslims. It is therefore necessary to classify hadiths into the topics of suggestions, prohibitions, information, or a combination of the three, which is known as multi-label topic classification. The classification can be done with K-Nearest Neighbor (KNN), one of the best methods in the Vector Space Model and a simple but quite effective one. However, KNN handles high-dimensional vectors poorly, which results in long classification times. For that reason, this research classifies the hadiths of Sahih Bukhari into the topics of suggestions, prohibitions, and information using the Latent Semantic Analysis - K-Nearest Neighbor (LSA-KNN) method. The Binary Relevance method is also employed to process the multi-label data. The results show that LSA-KNN achieves a performance of 90.28% with a computation time of 19 minutes 21 seconds, while KNN achieves 90.23% with a computation time of 37 minutes 06 seconds. LSA-KNN therefore performs slightly better than KNN and takes roughly half the computation time.
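The Binary Relevance idea described in the abstract can be sketched as one independent yes/no KNN vote per topic. The following is a minimal illustration under stated assumptions, not the authors' implementation: the toy English sentences, the names `TRAIN`, `vectorize`, `cosine`, and `predict`, the use of raw term counts, and the majority-vote threshold are all hypothetical. The LSA step is omitted here; in an LSA-KNN pipeline the vectors would first be projected into a lower-dimensional space before similarities are computed.

```python
import math
from collections import Counter

# Toy English stand-ins for hadith texts, invented for illustration only.
TRAIN = [
    ("pray on time and give charity", {"suggestion"}),
    ("do not lie and do not steal", {"prohibition"}),
    ("the prophet travelled to the city", {"information"}),
    ("give charity and do not waste food", {"suggestion", "prohibition"}),
    ("the prophet said pray on time", {"suggestion", "information"}),
]
LABELS = ["suggestion", "prohibition", "information"]


def vectorize(text):
    """Bag-of-words term counts (the weighting scheme is an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(text, k=3):
    """Binary Relevance: each label gets its own majority vote among the k nearest texts."""
    query = vectorize(text)
    ranked = sorted(
        TRAIN,
        key=lambda item: cosine(query, vectorize(item[0])),
        reverse=True,
    )
    neighbours = ranked[:k]
    predicted = set()
    for label in LABELS:
        votes = sum(1 for _, labels in neighbours if label in labels)
        if votes * 2 > k:  # strict majority of the k neighbours carry this label
            predicted.add(label)
    return predicted


print(predict("give charity and do not lie", k=3))
```

Because each label is decided separately, a single text can receive several topics at once, which is what makes the output multi-label.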