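Comparison of Classification Accuracy, Recall, and Precision for the C4.5, Random Forest, SVM, and Naive Bayes Algorithms (Perbandingan Akurasi, Recall, dan Presisi Klasifikasi pada Algoritma C4.5, Random Forest, SVM dan Naive Bayes)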

Mulkan Azhari; Zakaria Situmorang; Rika Rosnelly

doi:10.30865/mib.v5i2.2937

Authors

Mulkan Azhari, Universitas Potensi Utama, Medan
Zakaria Situmorang, Universitas Katolik Santo Thomas Medan, Medan
Rika Rosnelly, Universitas Potensi Utama, Medan

DOI:

https://doi.org/10.30865/mib.v5i2.2937

Keywords:

Data Mining, Classification, SVM, C4.5, Random Forest, Naive Bayes

Abstract

This study compares the performance of four classification algorithms: C4.5, Random Forest, SVM, and Naive Bayes. The research data consist of 200 JISC participant records, split into 140 training records (70%) and 60 testing records (30%). The classification was simulated in the RapidMiner data mining tool. The results show an accuracy of 86.67% for C4.5, 83.33% for Random Forest, 95% for SVM, and 86.67% for Naive Bayes. SVM achieved the highest accuracy and Random Forest the lowest.
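As a reader's aid, the sketch below shows how the three metrics named in the title are commonly computed from a binary confusion matrix, and what the reported accuracies imply on a 60-record test set. It is not the authors' procedure (the study used RapidMiner). The function names, the binary-class setting, and the example confusion matrix are assumptions made for illustration; the abstract reports only accuracy figures.

```python
def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) for a binary confusion matrix."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


TEST_SIZE = 60  # 30% of the 200 records, as stated in the abstract

# Accuracies (in percent) as reported in the abstract.
reported = {
    "C4.5": 86.67,
    "Random Forest": 83.33,
    "SVM": 95.0,
    "Naive Bayes": 86.67,
}


def implied_correct(accuracy_percent, n=TEST_SIZE):
    """Number of correctly classified test records implied by an accuracy figure."""
    return round(accuracy_percent / 100 * n)


correct = {name: implied_correct(acc) for name, acc in reported.items()}

# Hypothetical confusion matrix (NOT from the paper) whose accuracy
# happens to match the 95% reported for SVM: 57 of 60 records correct.
example = metrics(tp=30, fp=2, fn=1, tn=27)

if __name__ == "__main__":
    for name, count in correct.items():
        print(f"{name}: {count}/{TEST_SIZE} correct")
    print("example accuracy, precision, recall:", example)
```

On 60 test records, the reported figures correspond to 52, 50, 57, and 52 correct predictions for C4.5, Random Forest, SVM, and Naive Bayes respectively, so the gap between the best and worst algorithm is 7 records.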

Author Biographies

Mulkan Azhari, Universitas Potensi Utama, Medan

Master of Computer Science Program, Faculty of Engineering and Computer Science

Rika Rosnelly, Universitas Potensi Utama, Medan

Master of Computer Science Program, Faculty of Engineering and Computer Science
