Cancer Detection Based on Microarray Data Using the Naïve Bayes Method and Hybrid Feature Selection

 (*) Bintang Peryoga (Universitas Telkom, Bandung, Indonesia)
 Adiwijaya Adiwijaya (Universitas Telkom, Bandung, Indonesia)
 Widi Astuti (Universitas Telkom, Bandung, Indonesia)

(*) Corresponding Author

DOI: http://dx.doi.org/10.30865/mib.v4i3.2096

Abstract

Cancer is a deadly disease that was responsible for 9.6 million deaths in 2018, according to WHO data, so early detection is needed so that patients can be treated immediately and cancer deaths reduced. Microarray technology can monitor and analyze the expression of cancer genes, but microarray data have high dimensionality and few samples, so dimension reduction is needed for the classification process to perform well. Dimension reduction lowers the number of features used in classification by selecting the most influential ones. The hybrid method is one form of dimension reduction: it combines a Filter method with a Wrapper method to gain the advantages of both. In this study, the researchers combined Naïve Bayes with Hybrid Feature Selection (Information Gain - Genetic Algorithm) on microarray data for Lung Cancer, Ovarian Cancer, Breast Cancer, Colon Tumors, and Prostate Tumors. These data were obtained from the Kent-Ridge Biomedical Dataset. The results showed that, of the 5 datasets used, 4 achieved an accuracy between 87% and 100%, while the prostate tumor data obtained the lowest accuracy, 61.14%. The feature selection and classification methods applied to these 5 cancer datasets used fewer than 63 features to obtain these accuracies.
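
The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names only Information Gain, a Genetic Algorithm, and Naïve Bayes, so everything else here is an assumption made for illustration, including the median split used to compute information gain, the Gaussian likelihood, the 3-fold accuracy used as fitness, the GA settings (`top_k`, `pop_size`, `generations`, a 0.05 mutation rate), and the synthetic data in place of the Kent-Ridge datasets.

```python
import math
import random
from statistics import mean, median, pvariance

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(column, labels):
    """IG of one feature after a binary split at its median (assumed discretization)."""
    cut = median(column)
    low = [y for x, y in zip(column, labels) if x <= cut]
    high = [y for x, y in zip(column, labels) if x > cut]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (low, high) if part)
    return entropy(labels) - remainder

def nb_fit(X, y, feats):
    """Gaussian Naive Bayes: class prior plus (mean, variance) per selected feature."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = [(mean([r[f] for r in rows]), pvariance([r[f] for r in rows]) + 1e-6)
                 for f in feats]
        model[c] = (len(rows) / len(y), stats)
    return model

def nb_predict(model, x, feats):
    def log_posterior(c):
        prior, stats = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * v) - (x[f] - m) ** 2 / (2 * v)
            for f, (m, v) in zip(feats, stats))
    return max(model, key=log_posterior)

def accuracy(X, y, feats, folds=3):
    """k-fold accuracy of Naive Bayes on the chosen features (the GA fitness)."""
    if not feats:
        return 0.0
    hits = 0
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        test = [i for i in range(len(X)) if i % folds == k]
        model = nb_fit([X[i] for i in train], [y[i] for i in train], feats)
        hits += sum(nb_predict(model, X[i], feats) == y[i] for i in test)
    return hits / len(X)

def ig_ga_select(X, y, top_k=10, pop_size=12, generations=15, seed=0):
    rng = random.Random(seed)
    # Filter stage: keep the top_k features ranked by information gain.
    ranked = sorted(range(len(X[0])),
                    key=lambda f: information_gain([r[f] for r in X], y),
                    reverse=True)[:top_k]

    # Wrapper stage: a GA searches bit masks over the filtered features.
    def fitness(mask):
        return accuracy(X, y, [f for f, bit in zip(ranked, mask) if bit])

    pop = [[rng.randint(0, 1) for _ in ranked] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # truncation selection (assumed)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(ranked))  # single-point crossover
            child = a[:cut] + b[cut:]
            children.append([bit ^ (rng.random() < 0.05) for bit in child])
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(ranked, best) if bit], fitness(best)

if __name__ == "__main__":
    # Synthetic stand-in data: 40 samples, 30 features, only features 0 and 1 informative.
    data_rng = random.Random(1)
    y = [i % 2 for i in range(40)]
    X = [[data_rng.gauss(4.0 * label if f < 2 else 0.0, 1.0) for f in range(30)]
         for label in y]
    genes, acc = ig_ga_select(X, y)
    print("selected features:", genes, "cv accuracy:", round(acc, 2))
```

The filter stage is cheap and shrinks the search space before the more expensive wrapper stage runs, which is the usual motivation for combining the two on data with many features and few samples.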

Keywords


Cancer, Microarray, Naïve Bayes, Information Gain, Genetic Algorithm, Hybrid Feature Selection

Copyright (c) 2020 JURNAL MEDIA INFORMATIKA BUDIDARMA

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



JURNAL MEDIA INFORMATIKA BUDIDARMA
STMIK Budi Darma
Secretariat: Jln. Sisingamangaraja No. 338, Tel. 061-7875998
Email: mib.stmikbd@gmail.com

