Enhancing Machine Learning Accuracy in Detecting Preventable Diseases using Backward Elimination Method

 Muhammad Dliyauddin (Universitas Dian Nuswantoro, Semarang, Indonesia)
 Guruh Fajar Shidik (Universitas Dian Nuswantoro, Semarang, Indonesia)
 Affandy Affandy (Universitas Dian Nuswantoro, Semarang, Indonesia)
 (*) M. Arief Soeleman (Universitas Dian Nuswantoro, Semarang, Indonesia)

(*) Corresponding Author

Submitted: December 5, 2023; Published: January 9, 2024


High-dimensional datasets are now abundant, which makes classification increasingly challenging. Prior studies have applied Backward Elimination (BE) effectively to disease detection, but none has demonstrated the method's significance through a comprehensive comparison across diverse datasets. This study extends that work by applying BE with three machine learning algorithms (MLAs), namely Naïve Bayes (NB), k-Nearest Neighbors (KNN), and Support Vector Machine (SVM), to datasets associated with preventable diseases: heart failure (HF), breast cancer (BC), and diabetes. It aims to identify significant differences in the effect of BE across datasets and machine learning (ML) methods and to make recommendations based on them. Testing covered four datasets: raisin, HF, BC, and early-stage diabetes risk prediction. Each dataset was evaluated with all three MLAs. BE eliminated non-significant attributes and retained only the influential ones in the model. The t-test results showed a significant impact on accuracy across all datasets (p-value < 0.05). SVM achieved the highest accuracy on the raisin dataset at 87.22%. KNN achieved the highest accuracy on the other three datasets: 86.31% on HF, 83.56% on BC, and 96.15% on diabetes. These results underscore the efficacy of BE in improving the performance of MLAs for disease detection.
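To make the idea of backward elimination concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which elimination criterion was used, so this sketch assumes a wrapper-style criterion: a feature is dropped when removing it does not lower the leave-one-out accuracy of a small k-NN classifier. The function names (`knn_predict`, `loo_accuracy`, `backward_elimination`), the choice of k = 3, and the toy data are all hypothetical and chosen only for illustration.

```python
def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of k-NN restricted to the given feature indices."""
    proj = [[row[j] for j in features] for row in X]
    hits = 0
    for i in range(len(proj)):
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, proj[i], k) == y[i]
    return hits / len(proj)


def backward_elimination(X, y, k=3):
    """Start from all features; greedily drop one while accuracy does not fall."""
    selected = list(range(len(X[0])))
    best = loo_accuracy(X, y, selected, k)
    while len(selected) > 1:
        # Score every subset obtained by removing exactly one feature.
        scored = [
            (loo_accuracy(X, y, [f for f in selected if f != drop], k), drop)
            for drop in selected
        ]
        score, drop = max(scored)
        if score < best:
            break  # every possible removal hurts accuracy, so stop
        best = score
        selected.remove(drop)
    return selected, best


# Toy data (assumed, not from the study): feature 0 separates the classes,
# feature 1 is small harmless noise, and feature 2 is built to mislead k-NN
# by placing each sample right next to a sample of the opposite class.
X, y = [], []
for i in range(6):
    X.append([0.1 * i, 0.2 * (i % 3), 100.0 * i])
    y.append(0)
    X.append([5.0 + 0.1 * i, 0.2 * ((i + 1) % 3), 100.0 * i + 1.0])
    y.append(1)

baseline = loo_accuracy(X, y, [0, 1, 2])
selected, best = backward_elimination(X, y)
print("baseline accuracy:", baseline)
print("selected features:", selected, "accuracy:", best)
```

On this toy data the procedure is expected to discard the misleading and the redundant feature and keep only the informative one. A statistical variant that removes attributes by p-value, or a library routine, could be substituted without changing the overall loop.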


Feature Selection; Backward Elimination; Machine Learning Algorithms; Disease Detection; KNN





Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

STMIK Budi Darma
Secretariat: Sisingamangaraja No. 338 Telp 061-7875998
Email: mib.stmikbd@gmail.com
