Single-Label and Multi-Label Text Classification using ANN and Comparison with Naïve Bayes and SVM

M. Mahfi Nurandi Karsana; Kemas Muslim L.; Widi Astuti

doi:10.30865/mib.v7i2.6024

Authors

M. Mahfi Nurandi Karsana, Telkom University, Bandung
Kemas Muslim L., Telkom University, Bandung
Widi Astuti, Telkom University, Bandung

DOI:

https://doi.org/10.30865/mib.v7i2.6024

Keywords:

ANN, F1-Macro, Naive Bayes, Text Classification, SVM

Abstract

Machine learning has become useful in daily life thanks to improvements in its techniques, and text classification is an important part of it. Many methods are already used for text classification, such as Artificial Neural Networks (ANN), Naïve Bayes, Support Vector Machines (SVM), and Decision Trees. An ANN is a machine learning model that approximates the function of a natural neural network, and ANNs have been used extensively for classification. This research uses an ANN architecture that is relatively simple compared to the cutting edge of ANN development and research, in order to show the potential that ANNs have relative to other classification methods. The performance of ANN, Naïve Bayes, and SVM is measured using F1-macro on multiple single-label and multi-label datasets. In single-label classification, ANN achieves a comparable F1-macro of 0.79, against 0.82 for SVM. In multi-label classification, ANN achieves the best F1-macro, 0.48, against 0.44 for SVM.
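
Since every comparison in the abstract rests on F1-macro, a minimal sketch of how that score can be computed for both single-label and multi-label predictions may help. This is an illustration only, not the authors' evaluation code: the function names `f1_from_counts` and `macro_f1`, the set-based label representation, and the toy labels are all assumptions made for this example. The sketch also assumes that a class with no true and no predicted instances scores 0.

```python
from typing import Hashable, Sequence, Set


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 for one class; assumed to be 0.0 when tp, fp and fn are all zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Set[Hashable]], y_pred: Sequence[Set[Hashable]]) -> float:
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred.

    Each item is the set of labels for one document, so a single-label
    document is simply a set with one element.
    """
    labels = set().union(*y_true, *y_pred)
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(label in t and label in p for t, p in pairs)
        fp = sum(label not in t and label in p for t, p in pairs)
        fn = sum(label in t and label not in p for t, p in pairs)
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores) if scores else 0.0


# Toy single-label case (hypothetical labels): one label per document.
single_true = [{"a"}, {"a"}, {"b"}, {"c"}]
single_pred = [{"a"}, {"b"}, {"b"}, {"c"}]
single_score = macro_f1(single_true, single_pred)

# Toy multi-label case (hypothetical labels): several labels per document.
multi_true = [{"x", "y"}, {"y"}]
multi_pred = [{"x"}, {"y", "z"}]
multi_score = macro_f1(multi_true, multi_pred)
```

Because the per-class scores are averaged without weighting, a rare class counts as much as a frequent one, which is why F1-macro suits datasets with imbalanced classes.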
