A Multi-label Classification on Topic of Hadith Verses in Indonesian Translation using CART and Bagging

Authors

  • Rendi Kustiawan, Telkom University, Bandung
  • Adiwijaya Adiwijaya, Telkom University, Bandung
  • Mahendra Dwifebri Purbolaksono, Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v6i2.3787

Keywords:

Classification, Hadith Bukhari, Preprocessing, Feature Extraction, CART, Bagging

Abstract

Hadith is a source of law for Muslims after the Al-Qur'an and contains guidance in the form of words, actions, attitudes, and more. Muslims are expected to study and practice hadith and to use it as a way of life after the Al-Qur'an. Classifying hadith makes it easier for Muslims to learn it: this study looks at text patterns in the Indonesian translation of Bukhari hadith and assigns each hadith to three classes, namely suggestions, prohibitions, and information. Because a hadith can carry more than one of these labels, the task is a multi-label classification. The classification process uses N-gram and TF-IDF for feature extraction, CART and bagging as classification methods, and Hamming loss as the evaluation metric. Bagging is used to compensate for a weakness of CART, namely its instability: a slight change in the training data can significantly change the resulting model. Several test scenarios were run to obtain the best Hamming loss value. The best Hamming loss obtained is 0.1914, which corresponds to an accuracy of 80.86%. These results indicate that bagging can help increase accuracy by 5%.
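The relationship between the reported Hamming loss and the accuracy figure can be illustrated with a minimal sketch. It assumes the standard definition of Hamming loss (the fraction of individual label slots predicted incorrectly) and that the accuracy is taken as 1 minus that loss. The abstract does not spell out the exact evaluation code, so the function name, the label order, and the toy label matrices below are hypothetical and not taken from the paper.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Return the fraction of label slots that were predicted incorrectly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of samples")
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each sample must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    if total == 0:
        raise ValueError("no labels to evaluate")
    return wrong / total


# Toy data, invented for illustration only (not from the paper's dataset).
# Assumed column order: suggestion, prohibition, information.
y_true = [[1, 0, 1], [0, 1, 0], [0, 0, 1], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 1, 0]]

loss = hamming_loss(y_true, y_pred)   # 2 wrong slots out of 12
accuracy = 1.0 - loss                 # label-wise accuracy under this assumption

# The figures reported in the abstract follow the same arithmetic.
reported_loss = 0.1914
reported_accuracy_pct = round((1.0 - reported_loss) * 100, 2)
```

Under this reading, a lower Hamming loss is better, and the value 0.1914 means that roughly 19 out of every 100 label decisions were wrong.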


Published

2022-04-25
