Classification of Lung Cancer using Vision Transformer on Histopathological Images
DOI: https://doi.org/10.30865/json.v7i3.9399

Keywords: Lung Cancer Classification, Histopathological Image Analysis, Vision Transformer (ViT), Deep Learning in Medical Imaging, Computer-Aided Diagnosis (CAD)

Abstract
Lung cancer is the leading cause of cancer-related deaths worldwide, and early diagnosis is often hindered by morphological variation in histopathological images. The central problem is accurately and rapidly distinguishing cancer types such as adenocarcinoma and squamous cell carcinoma from benign tissue. This research takes histopathological images as input and produces a three-class classification: adenocarcinoma, squamous cell carcinoma, and benign tissue. Early detection of lung cancer can improve survival rates by up to 50%, but manual diagnosis by pathologists depends on subjective experience, causing error rates of up to 20% in ambiguous cases. In developing countries such as Indonesia, for example, the shortage of pathologists exacerbates treatment delays. This gap demands a reliable automated approach to support more timely clinical decisions. The proposed solution implements the Vision Transformer (ViT) in two architectures: ViT-B/16 (base model, 86 million parameters) and ViT-L/16 (large model, 304 million parameters). Histopathological images are normalized and divided into 16×16-pixel patch embeddings, and features are extracted with the self-attention mechanism. The models are trained with transfer learning from ImageNet-21k and fine-tuned on a lung cancer histopathology image dataset. The pipeline includes splitting the data into training (70%), validation (15%), and testing (15%) sets, as well as data augmentation to improve robustness. The ViT-B/16 model achieved a testing accuracy of 98.40% with an F1-score of 0.984, while ViT-L/16 achieved an accuracy of 98.18% with an F1-score of 0.982. Both models detected benign tissue perfectly (precision 1.00). The average AUC-ROC reached 0.999 for ViT-B/16 and 0.998 for ViT-L/16, indicating very high discriminative power.
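The patch-embedding step described in the abstract can be illustrated with a short sketch. This is not the authors' code: the 224×224 input resolution is an assumption (the standard ViT pretraining size), and the image is a random stand-in for a normalized histopathology slide.

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): cutting an image into
# the 16x16-pixel patches that ViT-B/16 and ViT-L/16 consume.
H = W = 224          # assumed input resolution (standard for ViT pretraining)
P = 16               # patch size for both ViT-B/16 and ViT-L/16
img = np.random.rand(H, W, 3)  # stand-in for a normalized RGB histopathology image

# Split into non-overlapping PxP patches, then flatten each patch to a vector.
patches = (img.reshape(H // P, P, W // P, P, 3)
              .swapaxes(1, 2)
              .reshape(-1, P * P * 3))
print(patches.shape)  # (196, 768): 14x14 = 196 patches of 16*16*3 = 768 values
```

A learned linear projection then maps each flattened patch into the transformer's embedding width (768 dimensions for ViT-B/16, 1024 for ViT-L/16) before the self-attention layers operate on the resulting sequence.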
The main contribution of this research is a comprehensive comparison between two scales of Vision Transformer for automated lung cancer diagnosis, demonstrating that the smaller model (ViT-B/16) can match or exceed the larger model's performance at a lower computational cost.
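The F1-scores reported above (0.984 and 0.982) are macro-averages over the three classes. As a small, self-contained sketch of how that metric is computed, the following uses made-up toy predictions, not the paper's data:

```python
# Toy illustration of the macro F1-score for a three-class task.
# Labels: 0 = adenocarcinoma, 1 = squamous cell carcinoma, 2 = benign tissue.
y_true = [0, 0, 1, 1, 2, 2]  # hypothetical ground-truth labels
y_pred = [0, 1, 1, 1, 2, 2]  # hypothetical model predictions

def macro_f1(y_true, y_pred, n_classes=3):
    """Average the per-class F1 over all classes (macro averaging)."""
    f1s = []
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / n_classes

print(round(macro_f1(y_true, y_pred), 3))  # 0.822
```

Macro averaging weights each class equally, which matters here because a model could otherwise score well while failing on a minority class.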
License
Copyright (c) 2026 Jurnal Sistem Komputer dan Informatika (JSON)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (Refer to The Effect of Open Access).

