A Clinically Applicable Automated Risk Classification Model for Pulmonary Nodules
Lung cancer accounts for the largest share of cancer-related deaths, owing to its rapid progression and to its detection at advanced stages. This paper proposes a novel method for predicting the malignancy risk of a pulmonary nodule (PN), the presence of which can be an indication of lung cancer, with the aim of reducing the number of unnecessary biopsies and preventing anxiety among patients. The study considers different morphological features of the nodule along with the clinical history of the patient, as described in the medical literature. Based on these features, the malignancy risk of each pulmonary nodule is classified into one of two classes: low-risk (benign) or high-risk (malignant). The model is designed on a retrospective dataset of 476 PNs (401 malignant or high-risk and 75 benign or low-risk). Classification is performed with the Recursive Partitioning Algorithm (RPA). RPA not only improves accuracy but also helps to interpret how the morphological features determine the malignancy risk of the nodules.
Keywords: Pulmonary nodule, Morphological features, Decision tree, Recursive partitioning, Imbalance class problem, ROC curve, Low-risked, High-risked
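To illustrate the idea behind recursive partitioning, the following is a minimal sketch of a binary tree grown by repeatedly choosing the threshold split that most reduces Gini impurity. It is not the authors' implementation: the Gini criterion, the depth limit, the two feature columns (a size in millimetres and a 0/1 margin flag) and all eight toy rows are assumptions invented purely for illustration and are not taken from the study's dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature index, threshold) that lowers weighted Gini the most,
    or None when no split improves on the unsplit node."""
    n = len(rows)
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; leaves hold the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical toy data: (size_mm, irregular_margin_flag). Not from the study.
rows = [(4, 0), (5, 0), (6, 0), (9, 1), (12, 0), (15, 1), (18, 1), (22, 1)]
labels = ["benign"] * 3 + ["malignant"] * 5

tree = grow(rows, labels)
print(tree)
print(predict(tree, (20, 1)))
```

Because every internal node is a single threshold on a single feature, the fitted tree can be read directly as a set of rules, which is the interpretability property the abstract attributes to RPA. With class counts as uneven as 401 versus 75, a real model would also need some handling of class imbalance, which this sketch omits.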
We are thankful to the Centre of Excellence in Systems Biology and Biomedical Engineering (TEQIP II and III) and the UGC UPE-II projects of the University of Calcutta for financial support of this research, and to Peerless Hospital for providing their valuable dataset.
Compliance with Ethical Standards
The collection of patient images and pathological reports for research purposes was approved by the Ethical Committee of Peerless Hospital and B. K. Roy Research Centre Ltd.