1 Introduction

Lung cancer stands as the foremost cause of cancer-related mortality worldwide [1], constituting a significant global public health concern and ranking as the primary cancer in men and the second most prevalent in women. With approximately 1 in 16 men and 1 in 17 women projected to receive a diagnosis in their lifetime [2], the urgency of addressing this pervasive threat becomes evident.

The disease manifests when cells in the lungs undergo uncontrolled growth, disrupting normal cellular division processes and culminating in tumor formation. These tumors may be either malignant (cancerous) or benign (non-cancerous). Lung cancer typically presents in two main types: small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), each necessitating distinct approaches to treatment [3]. NSCLC, in particular, emerges as the more prevalent subtype.

Notably, lung cancer does not discriminate based on age, gender, or race, underscoring the imperative of early intervention and treatment. In this regard, medical imaging emerges as a cornerstone of diagnostic endeavors, offering invaluable insights into pathological processes. A myriad of modalities, including magnetic resonance imaging (MRI), positron emission tomography (PET), mammography, computed tomography (CT), radiography, ultrasound, and duplex ultrasound, serve as indispensable tools in the evaluation of abnormalities by facilitating the detection of cancerous cells within human organs [4].

Among the various imaging modalities, chest X-rays and CT scans are the most widely utilized for lung cancer detection, owing to their widespread availability and cost-effectiveness. However, the current paradigm of reviewing and analyzing medical images predominantly relies on manual interpretation, necessitating specialized expertise. This approach is not without its limitations, being susceptible to time constraints, human errors, biases, and the potential for erroneous diagnoses.

Furthermore, lung nodules, exhibiting diverse characteristics in terms of size, shape, texture, and density, pose formidable challenges for manual interpretation by healthcare professionals. Recent studies [5, 6] have revealed a tendency to overlook many small nodules during manual readings, with others [7] highlighting the prevalence of nodules characterized by fuzzy contours, which complicate precise cancer detection. Consequently, a burgeoning body of research advocates for the adoption of soft computing-based solutions to mitigate the challenges associated with the heterogeneity of lung nodules [8,9,10,11,12,13,14].

Another persistent challenge in lung cancer detection, underscored in recent studies, pertains to the confounding phenomenon of overlapping features between lung nodules and surrounding anatomical structures [10, 12, 13, 15,16,17]. The superimposition of nodules with elements such as blood vessels, lung walls, and pleura often obscures diagnostic clarity, leading to erroneous interpretations grounded in the intricate feature information gleaned from raw medical images. Moreover, the presence of concomitant lung pathologies, such as tuberculosis and pneumonia, further complicates matters, given their propensity to manifest overlapping opacities within the pulmonary region [18].

These factors collectively impede a system's capacity to discern the salient and distinctive attributes of nodules indicative of early-stage lung cancer. The manual scrutiny of scans exacerbates the issue, consuming valuable time and resources. Consequently, researchers have directed their efforts towards automating and refining the lung cancer screening process. By harnessing soft computing, artificial intelligence (AI), machine learning (ML), and deep learning (DL) methodologies, these endeavors aim to mitigate the likelihood of misdiagnosis and augment the early detection of lung cancer.

1.1 Motivation for this Survey

Prior to 2020, most reviews centered on AI for disease diagnosis in general, as shown in Fig. 1. While these reviews provided a comprehensive overview of the potential of AI as a clinical tool, they lacked a detailed analysis of the technical advancements in computer-aided diagnosis (CAD) development for lung cancer. Since 2019, there has been a continuous increase in targeted surveys concentrating on CAD development for lung cancer, while the number of general-scoped surveys has decreased since 2022.

Fig. 1 Distribution of survey articles by scope and year

A significant number of high-quality reviews on AI integration in lung cancer were published between 2017 and 2023. Li et al. [19] provided an overview of ML-based approaches that strengthen the varying aspects of lung cancer diagnosis and therapy, including early detection, auxiliary diagnosis, prognosis prediction, and immunotherapy practice. Ladbury et al. [20] summarized the current literature on AI specific to lung cancer and how it applies to the multidisciplinary team taking care of these complex patients.

For the CAD system regarding lung nodule detection and classification, Mohammad et al. [21] comprehensively discussed the common factors (section thickness, dose, nodule location, and nodule size) affecting CAD performance, but they did not include the CAD design and algorithm as part of the discussion. It is worth noting that the factors affecting CAD performance depend heavily on the model design and algorithm and should not be treated as a black box.

With the emerging development of convolutional neural networks (CNNs), Sathyakumar et al. [22] conducted a narrative review of a total of 648 articles and performed comparative experiments on four variations of CNN models for lung nodule cancer detection. However, no detailed analysis or comprehensive discussion was presented in the paper. On the other hand, Gu et al. [23] based their analysis on a CAD system's main stages and tasks while reviewing DL algorithms, mainly CNN-based models, developed from 2019 up to November 2020.

More recently, Zhou and Xin [24] reviewed 104 studies on lung cancer screening from the perspectives of federated, multi-modal, and interpretable DL models. Tomassini et al. [25] presented an investigation from a data-driven perspective, comprehensively reviewing slice-based and scan-based approaches using CNNs for lung nodule diagnosis and cancer histology classification from CT data. Shah and Parveen [26] reviewed and summarized 32 papers on medical imaging for lung cancer classification from 2012 to 2022 but did not discuss development trends in CAD systems based on model architecture.

Although these survey articles have summarized past literature, they lack a comprehensive comparison of the methods and algorithms used by each author for lung image analysis. This aspect holds particular significance today, given the increasing use of customized CNNs with dedicated architectures for specific tasks and objectives. In addition, certain reviews contain tables summarizing results and significant model characteristics across the years but do not provide a dependable overview of the research trend. This limitation originates from their focus on a general perspective, lacking in-depth exploration of the model architectures and designs.

1.2 Contributions of this Survey

In addressing the aforementioned challenges, this survey endeavors to furnish a comprehensive overview of DL applications in the domain of lung cancer medical image analysis. Through the provision of overview tables and figures, the survey aims to elucidate the intricate architectures underpinning these DL systems, facilitating a nuanced understanding of their operational mechanisms.

Specifically, this survey embarks on a meticulous exploration of recent advancements in CAD systems tailored for lung cancer diagnosis through medical image analysis. It underscores the indispensable role of CAD systems in augmenting both the efficiency and accuracy of lung cancer diagnosis, while also introducing quantitative metrics to gauge the comparative robustness of diverse algorithms and techniques embedded within these systems.

Moreover, the study delves into the evolutionary trajectories of CAD development for lung cancer diagnosis, scrutinizing the manifold ways in which model configurations shape diagnostic efficacy. Through a systematic classification and discussion of new models for automated lung cancer detection using medical images, this survey provides insights from both the constituent model and designed task perspectives. Notably, the absence of prior review articles encompassing such a comprehensive analysis for lung cancer detection systems underscores the novelty of this endeavor.

Additionally, the survey identifies superior approaches for the detection and classification of lung cancer from medical images, while also delineating typical strategies employed to enhance algorithmic efficiency and performance. This model-driven perspective offers a unique synthesis of state-of-the-art solutions, methodologies, challenges, and potential research avenues within the domain, thus bridging a significant gap in existing literature reviews.

Furthermore, the survey highlights the pivotal role of various AI methods, particularly ML and DL, underscoring their transformative impact on the advancement of lung cancer diagnosis through image data analysis. By elucidating the utilization of these AI paradigms, the survey not only contributes to the theoretical underpinnings of the field but also provides practical insights for researchers and practitioners alike.

1.3 Organization of this Survey

Figure 2 serves as a reading map to help the reader understand the contents and the relationships between each section and the research questions. The review begins with an introduction in Sect. 1, providing context on the motivation for CAD systems in lung cancer. The procedures for including and excluding articles, together with the primary research questions, are detailed in Sect. 2. Section 3 provides a concise overview of CAD systems for disease detection, setting the background for the discussion. Section 4 reviews recent methodologies dedicated to lung diseases, categorizing the included articles into seven distinct model groups and reviewing the references from a model-driven perspective. In Sect. 5, each research question is addressed and discussed. Section 6 highlights the difficulties encountered by the methodologies examined and concludes the review.

Fig. 2 Reading map for this survey

2 Methodology

To sharpen the focus of the review process and ensure comprehensive coverage, specific research questions (RQs) have been formulated to explore the application of AI in CAD systems, model architecture designs, and comparative studies of the latest lung cancer detection systems. The RQs are as follows:

RQ1: What are the types of CAD techniques used for diagnosing lung cancer, and how do CAD systems contribute to enhancing the efficiency and accuracy of lung cancer diagnosis?

RQ2: What are the key advancements in CAD for lung cancer diagnosis, and how do different model configurations impact the performance of lung cancer diagnosis?

RQ3: Which approach has demonstrated superior performance in detecting and classifying lung cancer from medical images, and what are the typical methods used to enhance the efficiency and performance of the algorithms?

This survey adopts the PRISMA guidelines [27] to systematically review the existing literature. Figure 3 is a flowchart that illustrates how articles were selected and carefully screened using the inclusion and exclusion criteria. Ultimately, a total of 119 articles were included in this survey. Table 1 provides an overview of the article search outcomes with respect to their sources.

Fig. 3 PRISMA flowchart

Table 1 Results of article search

3 An Overview of Computer-Aided Diagnosis and Detection Systems

Machine-assisted and computer-aided diagnosis and detection (CADD) techniques are revolutionizing the detection of lung cancer. CADD systems utilize ML algorithms to detect diseases in patients, reducing observer bias [24]. The general types of lung cancer prediction and/or detection systems are shown in Fig. 4.

Fig. 4 General types of lung cancer CADD systems

Diagnosis report-based systems operate on textual data extracted from clinical or diagnostic reports. A classifier is used to learn information from textual data. However, the datasets often suffer from missing values and irrelevant features, preventing the classifier from learning sufficiently from the data. Research in this field focuses on data pre-processing techniques such as feature reduction and feature extraction with the integration of soft computing approaches.

Medical image-based systems operate on medical images such as chest X-rays, CT scans, and histopathological images to detect the presence of lung cancer. These systems typically undergo certain pre-processing procedures to locate a region of interest (ROI) and then extract features from the images. Medical image-based systems can be categorized into two types, the feature engineering-based approach and the DL-based approach; the general stages of each are presented in Fig. 5.

Fig. 5 Existing lung cancer diagnosis frameworks

The feature engineering-based approach involves three stages: data pre-processing, feature extraction and selection, and classification. Data pre-processing includes filtering and segmentation, followed by post-processing correction. The system extracts visual features and then applies appropriate classifiers, usually ML techniques. However, handcrafted features limit the optimality of the results, especially for subtle distinctions between benign and malignant lesions.
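To make this pipeline concrete, below is a minimal, hypothetical sketch of the feature engineering pattern: handcrafted texture features (GLCM statistics are one common choice) are extracted from pre-segmented nodule patches and fed to an SVM. The feature set and the synthetic data are illustrative assumptions, not taken from any surveyed system.

```python
# Minimal sketch of a feature engineering-based pipeline (illustrative only).
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def handcrafted_features(roi):
    """Simple texture descriptors (GLCM contrast/homogeneity) from an 8-bit ROI."""
    glcm = graycomatrix(roi, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    return np.array([graycoprops(glcm, "contrast")[0, 0],
                     graycoprops(glcm, "homogeneity")[0, 0],
                     roi.mean(), roi.std()])

# Stand-ins for pre-segmented nodule patches and benign/malignant labels.
rois = [np.random.randint(0, 256, (32, 32), dtype=np.uint8) for _ in range(20)]
labels = np.random.randint(0, 2, 20)
X = np.stack([handcrafted_features(r) for r in rois])
clf = SVC(kernel="rbf").fit(X, labels)      # classification stage
print(clf.predict(X[:3]))
```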

DL approaches and neural network (NN)-based models are increasingly popular in medical image processing due to advancements in computational intelligence and frontier technologies. In particular, a growing body of literature has explored and applied convolutional neural networks (CNNs) and other DL models for analyzing medical images [28].

In contrast to traditional feature engineering-based systems, DL-based systems have demonstrated their ability to process a massive number of images directly without the need for handcrafted features. Instead, the system depends on the visual patterns revealed in the data to detect malignancy [24, 29].

An end-to-end detection framework has recently emerged. This framework simplifies the traditional pipeline by skipping the explicit segmentation, feature extraction, and feature selection stages. It uses a stack of convolutional and pooling layers to automatically extract deep, high-level features from input images, guided by the objective function specified by the user [28]. As a result, the system can perform predictions directly.

A main drawback of most existing CADD systems is that the performance of each step depends heavily on the accuracy of the previous step [30]. For instance, segmentation accuracy is a crucial factor when segmented images are used to detect and classify lung nodules and cancer cases: inaccurate or vague segmentations can lead the system to produce incorrect findings. In addition, the selection of features fed into a classifier or detection network is closely tied to the model's accuracy and specificity. The solution to this problem lies in robust end-to-end detection and/or classification algorithms that can operate directly on lung cancer medical images.

4 Recent Advances in CAD System Design for Lung Cancer

A review of 119 research articles on AI algorithms for diagnosing lung cancer and lung diseases using medical images was conducted. Of these, 108 articles introduced novel AI algorithms, referred to here as "New Algorithms". Most of these algorithms were built from combinations of four foundational models: CNN, generative adversarial network (GAN), other NN (other derivatives of NN), and ML (conventional non-NN architectures). Furthermore, fuzzy ML and metaheuristic search optimization algorithms were integrated as complementary techniques to enhance the performance of the new algorithms.

A consistent pattern prevailed throughout the observed articles: a strong emphasis on utilizing CNN-based models or components. Many researchers either devised entirely new CNN algorithms tailored for specific diagnostic purposes or enhanced existing publicly available CNN models by incorporating customized or modified layers. However, beyond this predominant pattern, several noteworthy trends were identified over the years:

i. Direct Application of Transfer Learning: A prevalent strategy involved directly adopting readily available CNN models, such as VGG, ResNet, and AlexNet, and leveraging transfer learning techniques to fine-tune them for lung disease and cancer diagnosis [31,32,33,34,35].

ii. Integration of GANs: Some researchers [36,37,38,39,40] incorporated GANs for image synthesis to enlarge the training data and improve model generalization, others [11, 41,42,43] applied GANs for segmentation, and still others [44,45,46,47] applied GANs for classification.

iii. Diversification with Different Neural Network Models: In addition to CNNs, other NN-based models were explored [48,49,50]. It was noted that in most cases, these studies incorporated other NN-based models in conjunction with CNNs rather than employing them as standalone models.

iv. Exploration of Non-Neural Network Models: A distinct approach was taken by some researchers who ventured beyond NNs, experimenting with conventional ML approaches or alternative models to devise effective diagnostic solutions [51,52,53,54]. A trend similar to that of the other NN models was noted: the prevailing approach employed traditional ML classifiers to classify extracted features in the final phase of the CAD system, after an image processing phase in which a foundational CNN model was utilized.

v. Hybrid Model Development: A trend emerged where hybrid models were built, combining different AI and/or non-AI techniques to harness their collective strengths for improved diagnostic outcomes [11, 30, 55,56,57,58,59].

vi. Utilization of Commercial CAD Tools: Additionally, some studies [60,61,62,63] examined commercially available CAD tools for accurately diagnosing lung diseases.

4.1 Categorization of the New Algorithms into Distinct Model Groups

All the New Algorithms have been systematically classified into seven distinct model groups, as outlined below:

(i) Pure CNN: Entries constructed solely using CNN architecture, without integrating optimization algorithms and/or Fuzzy ML.

(ii) Pure GAN: Entries constructed solely using GAN, without integrating optimization algorithms and/or Fuzzy ML.

(iii) Pure Other NN: Entries constructed solely using Other-NN, without integrating optimization algorithms and/or Fuzzy ML.

(iv) Pure ML: Entries constructed solely using ML, without integrating optimization algorithms and/or Fuzzy ML.

(v) Other: Entries constructed solely using approaches other than CNN, GAN, Other-NN, and ML, without integrating optimization algorithms and/or Fuzzy ML.

(vi) Commercial CAD system: Entries that applied commercially available CAD systems or prototypes.

(vii) Hybrid ML: Entries meeting at least one of the following conditions:

  a. Combining two or more foundational models from {CNN, GAN, Other-NN, ML}.

  b. Combining one foundational model from {CNN, GAN, Other-NN, ML} with other ML methods.

  c. Incorporating one foundational model from {CNN, GAN, Other-NN, ML} along with an optimization algorithm and/or Fuzzy ML.

This categorization scheme encompasses all 108 New Algorithms, providing a clear framework for their analysis and comparison throughout the subsequent sections. Figure 6 shows the distribution of the New Algorithms across the seven model groups, illustrating the proportional representation of each. Further, Fig. 7 offers a chronological view, arranging the 108 New Algorithm entries by publication year, whereas Fig. 8 arranges them by model group type.

Fig. 6 Distribution of new algorithms across 7 model groups

Fig. 7 Distribution of model groups of the new algorithm by year of publication

Fig. 8 Categories of new algorithm entries by model group types

Figures 6 and 7 reveal a noteworthy pattern: both Pure CNN and Hybrid ML garner significant attention, comprising the largest segments in the pie chart. An intriguing observation is the pronounced focus on Pure CNN since 2019, coinciding with the rise of Hybrid ML studies. Notably, Hybrid ML experienced a marked surge in 2023. This surge can be attributed to the recognition that Pure CNN may not comprehensively address the demands of CAD workflows and often lacks optimal generalization within end-to-end CNN frameworks. To address these limitations, numerous investigations [9, 15, 58, 59, 64,65,66,67,68,69,70,71,72,73,74] adopt multi-stage models, merging diverse approaches into a single framework. A number of studies [11, 14, 46, 55, 57, 75,76,77,78,79,80,81,82,83,84,85,86,87] navigate this challenge by leveraging metaheuristic search techniques for hyperparameter optimization, mitigating the prolonged training issue. The observed trend suggests a continued influx of novel Pure CNN and Hybrid ML algorithms, driven by the ongoing pursuit of refined and specialized CAD solutions.
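To illustrate the general idea behind metaheuristic hyperparameter search (without reproducing any specific algorithm from the cited studies), the sketch below performs a random-mutation hill climb over two hyperparameters; `train_and_score` is a hypothetical placeholder for training a model and returning its validation score, and the search ranges are assumptions.

```python
# Illustrative metaheuristic-style hyperparameter search (hill climbing).
import random

def train_and_score(lr, dropout):
    # Placeholder objective; in practice this would train a CNN with the
    # given hyperparameters and return its validation accuracy.
    return 1.0 - 100 * abs(lr - 1e-3) - abs(dropout - 0.3)

best = {"lr": 10 ** random.uniform(-5, -1), "dropout": random.uniform(0.1, 0.7)}
best_score = train_and_score(**best)
for _ in range(50):  # mutate the incumbent; keep only improvements
    cand = {"lr": best["lr"] * 10 ** random.uniform(-0.3, 0.3),
            "dropout": min(0.9, max(0.0, best["dropout"] + random.uniform(-0.1, 0.1)))}
    score = train_and_score(**cand)
    if score > best_score:
        best, best_score = cand, score
print(best, best_score)
```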

4.2 A More Comprehensive Categorization of the New Algorithms Based on Foundational Models and Methodology

This section provides a breakdown of how the New Algorithm entries are distributed by their constituent models and methodologies. Figure 9 presents a diagram in which the size of each region reflects the number of entries falling within that specific combination. The pink shading represents the entries categorized under Hybrid ML, while the black numbers correspond to entries falling within model groups (i) to (vi), as defined in Sect. 4.1.

Fig. 9 Euler diagram illustrating the distribution of new algorithm entries based on foundational models and methodologies

Optimization and fuzzy instances are not considered distinct foundational models because they are typically used as accompanying algorithms rather than standalone entities within New Algorithms. Thus, foundational models coupled with an optimization algorithm and/or Fuzzy ML are classified as Hybrid ML.

Several studies integrate metaheuristic optimization algorithms to enhance network performance. These investigations are represented in the region bounded by the green box in Fig. 9 and are detailed in Table 2. Furthermore, the exploration of Fuzzy ML within this domain is limited, with only two entries [30, 56] incorporating Fuzzy ML alongside primary models. In contrast, both Hybrid ML and Pure CNN investigations have gained significant attention, each with 42 entries.

Table 2 Classification of all the 108 new algorithms based on foundational models

Pure CNN has advantages in extracting informative deep features and operating seamlessly as an end-to-end model, making it more user-friendly for clinical applications. However, practical applications in medical image processing are often complex due to the high noise levels in raw images, which therefore require preprocessing, segmentation, and detection steps before optimal classification and diagnostic outcomes can be achieved. Hence, a significant portion of research work remains focused on improving the established CAD workflow. This explains the prevalent interest surrounding the exploration of Hybrid ML and Pure CNN.

4.3 In-Depth Review of the Model Category: Pure CNN

This section analyzes 43 Pure CNN entries, exclusively constructed upon CNN models. The objective is to examine each entry's unique characteristics, design principles, and applications, offering readers a profound comprehension of CNN-driven algorithms.

4.3.1 CNN Architectures Overview

CNNs are the preferred method for computer vision, especially in CAD for lung cancer, due to their specialized design for grid-like data [134]. Figure 10 shows the core foundational elements collectively adopted by Pure CNN models.

Fig. 10 Common structures of a typical CNN framework

The CNN workflow involves two main parts: extracting features from input images and classifying them. Different layers, such as convolutional, pooling, and fully connected layers, perform various operations on the input data; a minimal code sketch follows the list below.

  • Convolutional layer. This layer extracts features from 3D tensor data through convolution operations to produce feature maps.

  • Pooling layer. This layer adeptly reduces the dimensionality of the feature maps, facilitating a more compact representation.

  • Fully connected layer. This layer forms connections between each neuron in the preceding layer and the current layer, producing a feature vector representing the final model prediction.
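The sketch below shows this conv-pool-fully connected pattern in PyTorch; the layer sizes, input resolution, and two-class head are illustrative assumptions, not a reconstruction of any surveyed model.

```python
# Minimal PyTorch sketch of the typical CNN structure in Fig. 10.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolution -> feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),                             # pooling -> compact representation
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)  # fully connected head

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

logits = TinyCNN()(torch.randn(4, 1, 64, 64))  # e.g. 64x64 grayscale CT patches
print(logits.shape)  # torch.Size([4, 2])
```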

4.3.2 Utilization of Transfer Learning

CNN models require significant computational resources and extensive training data to achieve optimal performance. Transfer learning allows a model to leverage existing knowledge and customize it for a specific domain, such as classifying lung cancer images. In basic terms, the model inherits knowledge through saved weights and is then further trained on a specialized domain to excel in it; this speeds up learning compared with training from scratch.

Publicly available CNN models, such as VGG, ResNet, DenseNet, MobileNet, AlexNet, and Inception, have been extensively trained on natural images from the ImageNet dataset and can be useful for transfer learning. Several investigations [31, 32, 34, 35, 59, 92, 96, 100, 114, 118] have showcased the incorporation of transfer learning into their proposed models.
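A hedged sketch of this transfer learning recipe is shown below: load ImageNet weights, optionally freeze the backbone, and replace the classification head before fine-tuning on lung images. The choice of ResNet-50 and a two-class head is an assumption for illustration.

```python
# Transfer learning sketch: reuse ImageNet weights, retrain a new head.
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)  # ImageNet weights
for p in model.parameters():
    p.requires_grad = False                       # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)     # new head, e.g. benign vs malignant
# Fine-tuning then trains model.fc (or the whole network at a small learning
# rate) on the lung image dataset instead of starting from random weights.
```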

These studies did not investigate how well the knowledge transferred from the ImageNet dataset, which comprises natural images, would perform in classifying lung cancer, given the fundamental differences between natural images and lung images. While these studies boast high accuracy and commendable overall performance, they lack further validation to rule out overfitting or inadequate generalization.

4.3.3 Readily Available CNNs

A minority of studies utilized well-known, readily available CNN algorithms and were therefore not counted among the 108 New Algorithms. These studies applied and compared the publicly available CNNs for lung cancer detection and classification tasks.

For instance, Teramoto et al. [31] used VGG-16 to classify cancer cells in histopathology images, but the accuracy of 79.20% was below the desired level despite high specificity (83.30%) and sensitivity (89.30%). In a recent study by Pandian et al. [34], both VGG-16 and GoogleNet were used to classify data from private hospitals into three classes; the study found that GoogleNet performed better than VGG-16 in this context. On a different note, Rajasekar et al. [35] conducted foundational research on six CNN models: a plain CNN, a CNN with gradient descent, Inception-V3, ResNet-50, VGG-16, and VGG-19. However, the study lacked a comprehensive analysis among the models, and crucial details about datasets and experimental settings were omitted.

Notably, Bicakci et al. [32] were among the first to experiment with readily available CNNs, such as SqueezeNet, VGG-16, and VGG-19, in the context of PET imaging for classifying input into adenocarcinoma and squamous cell carcinoma. Unfortunately, the reported F-score and AUC values fell considerably short of the desired standards, ranging between 54 and 74 and between 52 and 70, respectively.

The studies mentioned above reveal a trend where researchers commonly utilized readily available CNNs as a "black box" tool, without exploring the underlying mathematical models. Among the readily available CNNs, VGG models were the preferred option for direct application, as evidenced by their usage in all four cited studies [31, 32, 34, 35]. However, a notable challenge observed in these investigations is the exclusive use of private datasets, which hinders the ability to validate and reproduce the reported findings.

4.3.4 Customized CNN Models Dedicated to Lung Cancer Diagnosis

In total, this survey examines 43 customized CNNs that are designed to detect lung cancer. These custom CNN architectures are built by modifying at least one existing CNN model. Researchers have used various techniques to improve the pre-existing CNN models, such as fusing multiple CNNs or enhancing the layers within a CAD pipeline. Some studies have added customized internal layers to address limitations, while others have integrated attention modules as an integral component of the CNN architecture. As a result, there is a wide range of variations among the deployed CNN models. The survey will examine these customized CNN architectures based on their backbone models. Table 3 gives the summary of results obtained by the 43 customized CNNs.

Table 3 Summary of the Pure CNN model group

U-Net is a popular architecture in CAD for lung cancer detection and segmentation. It functions as an encoder-decoder network with skip connections: as the network goes deeper, the encoder extracts increasingly high-level features, while the skip connections reintroduce detailed features into the decoder phase. Among the surveyed studies, 11 instances [10, 13, 16, 91, 94, 96, 101, 103, 106, 111, 117] were identified with U-Net as the backbone model for constructing customized CNN architectures.
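The following minimal sketch shows the skip-connection mechanism that these U-Net variants build upon: an encoder feature map is concatenated with the upsampled decoder path. The channel counts and single resolution level are illustrative; real U-Nets repeat this pattern over several levels.

```python
# Minimal U-Net-style encoder-decoder with one skip connection (PyTorch).
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(1, 16, 3, padding=1)          # encoder features
        self.down = nn.MaxPool2d(2)
        self.bottleneck = nn.Conv2d(16, 32, 3, padding=1)  # high-level features
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # decoder upsampling
        self.dec = nn.Conv2d(32, 1, 3, padding=1)          # 32 = 16 upsampled + 16 skipped

    def forward(self, x):
        e = torch.relu(self.enc(x))
        b = torch.relu(self.bottleneck(self.down(e)))
        u = self.up(b)
        return self.dec(torch.cat([u, e], dim=1))          # skip connection

mask = TinyUNet()(torch.randn(1, 1, 64, 64))
print(mask.shape)  # torch.Size([1, 1, 64, 64])
```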

Wang et al. [106] introduced a CAD system that unified two CNN components within a single framework. U-Net was adopted for the segmentation task, while a custom-designed CNN with a conventional structure was tailored for classification. The segmentation model acted as a trainable preprocessing module, generating a classification-guided ‘attention’ weight map from the raw CT data. This map indicates the significance of distinct regions for the classification task, thereby enhancing diagnostic performance.

Similarly, Cao et al. [16] introduced a two-stage CNN (TSCNN) to detect lung nodules. In the initial stage of the model, U-Net was employed as the foundational architecture and was enhanced with a ResDense structure, a novel sampling strategy, and a two-phase prediction approach. In the subsequent stage, the proposed dual pooling mechanism was integrated into the CNN classifiers (ResNet, DenseNet, and Inception). Likewise, Han et al. [111] adopted a two-stage methodology, combining the strengths of 3D-RPN and U-Net for pulmonary nodule detection. Detected nodules were then forwarded to a ResNet classifier for classification.

Xie et al. [91] introduced an innovative approach to address the challenge of limited data availability in model training. They proposed a novel deep NN model called the multi-view knowledge-based collaborative (MV-KBC) model. In their approach, instead of using full images, they utilized multi-patches for training. The MV-KBC model employed the U-Net architecture to extract 2D nodule slices from multiple views (a total of 9 views). These slices were subsequently fed into a knowledge-based collaborative (KBC) submodel. Within each KBC submodel, the fusion of three ResNet-50 networks facilitated the learning of various aspects: overall appearance (OA), heterogeneity in voxel values (HVV), and heterogeneity in shapes (HS) extracted from segmented patches. Finally, all nine KBC submodels were integrated to create the overall MV-KBC model.

Dutande et al. [103] introduced a cascaded CNN approach named LNCDS, which operates in 2D and 3D dimensions. They addressed limitations in the U-Net architecture, such as the insufficient representation of significant features and the long skip connections between the encoder and decoder, by incorporating short skip connections (residuals) and harnessing the effectiveness of squeeze-and-excitation blocks. In the subsequent stage of their approach, they employed a 3D-NodNet classification model that utilized 3D cubes as input, leveraging the 3D nodule structure’s capacity to encompass more structural and geometrical details.

Suzuki et al. [117] introduced a modified 3D U-Net model for automated lung nodule detection on chest CT images. Their adaptation allowed any feature map to be reached from a marginal output map within three steps, thereby preventing the vanishing gradient problem. Lei et al. [94] enhanced the U-Net architecture by incorporating a high-level feature-enhanced soft activation mapping (HESAM). This integration combined high-level convolutional features with detailed lung nodule shape and margin features, enhancing feature analysis. Chen et al. [101] introduced the DC-U-Net model, which combines the U-Net network with dilated convolution. This approach expanded the feature receptive field without sacrificing spatial resolution, enabling the model to capture more information from images while maintaining parameter efficiency.
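The dilated-convolution idea used by DC-U-Net can be seen in a two-line comparison: with dilation 2, a 3x3 kernel covers a 5x5 neighborhood while keeping the same parameter count and spatial resolution. This sketch demonstrates the mechanism only, not the DC-U-Net configuration itself.

```python
# Dilated vs. standard convolution: larger receptive field, same parameters.
import torch
import torch.nn as nn

x = torch.randn(1, 16, 64, 64)
standard = nn.Conv2d(16, 16, kernel_size=3, padding=1)             # 3x3 receptive field
dilated = nn.Conv2d(16, 16, kernel_size=3, padding=2, dilation=2)  # effective 5x5 field
print(standard(x).shape, dilated(x).shape)   # both preserve the 64x64 resolution
print(sum(p.numel() for p in standard.parameters())
      == sum(p.numel() for p in dilated.parameters()))  # True: identical parameter count
```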

Conversely, Liu et al. [96] proposed an end-to-end detection framework involving a modified U-Net network. They integrated a residual attention network as a shortcut into the backbone U-Net. Additionally, they incorporated weight transfer learning, leveraging both image-level tag annotations and mask annotations. Shi et al. [10] presented the Multiscale Residual U-Net model (MCA-ResUNet), designed for accurate lung nodule segmentation, particularly for nodules with detailed geometric shapes. Their encoder featured multiple multiscale residual blocks, with Atrous spatial pyramid pooling as a bridging module, and facilitated feature-layer connection through layer-crossed context attention. Wang et al. [13] introduced the Multi-Granularity Scale-Aware Networks (MGSA-Net), which unified path-level and global approaches within a single framework. This design preserved global contextual information and local fine details through multi-granularity feature map sharing, enhancing feature map fusion and information preservation.

V-Net is a modified version of the U-Net architecture. Unlike U-Net, V-Net does not incorporate batch normalization; conversely, U-Net does not employ element-wise sums at the end of successive convolutional layers. Moreover, while V-Net utilizes four concatenations, U-Net employs only three. Liu and Chan [5] introduced an integrated segmentation and classification network, utilizing V-Net as the foundational model. In this end-to-end approach, segmentation and voxel-based feature learning occurred concurrently, facilitated by the voxel-based feature extraction layer. In contrast, Ozdemir et al. [99] initially claimed to have deployed an end-to-end model, but their proposed approach was actually executed in two stages. They employed V-Net for segmentation in the first stage, followed by two consecutive basic CNNs for malignancy ranking and classification in the second stage.

Six studies [8, 29, 93, 100, 113, 118] opted for VGG as the foundational model for their classification tasks. Malik et al. [113] introduced the Best Diagnostic Classifier Network (BDCNet), utilizing VGG-19 as its backbone and customizing it with various typical CNN layer structures to enable 4-class classification (normal, COVID-19, pneumonia, lung cancer). Bishnoi and Goel [118] devised a weighted VGG deep network (WVDN) within a high-speed real-time transfer learning framework tailored for real-time applications. Zuo et al. [8] presented a multi-resolution CNN and knowledge transfer approach, employing VGG as the backbone network. Their objective was to extract features of differing levels and resolutions from distinct depth layers in the network, enhancing the classification of lung nodule candidates through transfer learning. Notably, they improved the loss function and objective equation to operate at an image-wise calculation level rather than pixel-wise.

Apostolopoulos et al. [100] introduced the Feature Fusion VGG19 (FF-VGG19) technique. This approach relied on VGG networks as the backbone for feature fusion using a self-training strategy. Bharati et al. [93] proposed a novel hybrid DL framework named VDSNet, combining VGG, data augmentation, and a spatial transformer network (STN) with CNN; this niche is worth mentioning, as it was the only surveyed article exploring the integration of a CNN with a transformer network. Ibrahim et al. [29] embarked on a pioneering venture in the realm of multi-modality data classification, incorporating a fusion of CNN models (VGG19 + CNN, ResNet-152 V2, ResNet-152 V2 + GRU, and ResNet-152 V2 + Bi-GRU). Their models operated end-to-end, relying solely on high-level features automatically extracted during the model training phase.

Huang et al. [88] devised a rapid and fully automated end-to-end system combining Faster R-CNN and CNN for nodule detection and FCN2s for nodule segmentation. Despite the authors’ assertion, this system functions in multiple stages. Initially, a 2D Faster R-CNN was employed to identify pulmonary nodule patches. To mitigate false positives, the authors incorporated a conventional CNN before directing the patches to a modified FCN (with VGG-18 as the backbone) for precise nodule segmentation.

Huang et al. [112] introduced a 3D OSAF-YOLOv3 model for lung nodule detection. This model is created by fusing the 3D YOLOv3 architecture with the one-shot aggregation module, the receptive field block, and the feature fusion scheme. Sahu et al. [90] leveraged MobileNet as the backbone network for their novel multi-section CNN. The unique aspect of their model is incorporating a view pooling layer, enabling it to aggregate information from cross sections from different angles, effectively encoding the nodule’s volumetric characteristics. Furthermore, the presented model capitalizes on the advantages of MobileNet, which is known for its lightweight nature and is conducive to deployment on mobile devices.

Three investigations [18, 89, 114] have developed novel customized CNN algorithms with AlexNet as the backbone model. Mehmood et al. [114] introduced CSIP-TL, a new model that combines class-selective image processing, transfer learning, and AlexNet for classification. This approach improved undesirable class outcomes by using Histogram Equalization (HE) to enhance image quality and retrained the model with improved images for the problematic classes. Kasinathan et al. [89] developed an Enhanced CNN classifier with segmentation using an active contour model. The article did not detail the enhancements applied to the AlexNet model but emphasized their integration. In another study, Souza et al. [18] proposed a method for lung segmentation in chest X-rays using AlexNet and ResNet as backbone models. AlexNet generated precise lung contours to produce initial segmentation maps, while ResNet-18 reconstructed missing lung regions. The final segmentation result was obtained by combining the outputs of both CNNs.
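As a brief illustration of the histogram equalization step reported for CSIP-TL, the sketch below applies OpenCV's equalizer to an 8-bit grayscale image; the file names are hypothetical, and the call shows only the preprocessing idea, not the full CSIP-TL pipeline.

```python
# Histogram equalization as an image-enhancement preprocessing step.
import cv2

img = cv2.imread("chest_xray.png", cv2.IMREAD_GRAYSCALE)  # 8-bit grayscale input
equalized = cv2.equalizeHist(img)  # spread the intensity histogram to boost contrast
cv2.imwrite("chest_xray_he.png", equalized)
```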

References [104, 107, 115, 120] utilized the ResNet architecture as their foundational framework. Zhao et al. [120] introduced an adaptive and attentive 3D CNN with ResNet at its core. Their model incorporated a high-resolution fused attention module in the initial stage for candidate nodule detection. Subsequently, they developed an adaptive 3D CNN with an adaptive 3D convolution kernel to reduce false positives. This was achieved by extracting multilevel contextual information. Ozdemir and Sonmez [115] integrated a feature-wise attention layer into ResNet-50 to enhance the discriminative features acquired by the network. This modification aimed to improve the network’s ability to capture important information. In a distinct approach, Xiao et al. [107] investigated contour representation in polar coordinates instead of Cartesian coordinates. They proposed a nucleus segmentation model based on polar representation, utilizing the ResNet architecture as a foundational component.

Chen et al. [102] devised the Lung Dense Neural Network (LDNNET), built upon the DenseNet network structure. Their model functions as an end-to-end framework designed for lung nodule classification tasks. Li et al. [95] presented a multi-resolution patch-based CNN that adopts DenseNet as its foundational backbone for lung nodule detection. Features extracted from patches were fused using a feature fusion strategy, enhancing the model’s detection capabilities. In contrast, Pandit et al. [119] introduced a multi-space image pool layer to an autoencoder. This innovative addition enabled the model to take the reconstruction loss into account by subtracting it from the accuracy, thereby contributing to improved performance.

Xu et al. [17] presented ISANET, an innovative approach for multiclass classification that combined various attention mechanisms (channel attention, Squeeze-and-Excitation (SE), and spatial attention (SA)) with the InceptionV3 architecture. InceptionV3 served as the backbone model and was tuned with a channel attention module positioned before the final layer. Masood et al. [97] introduced an advanced multidimensional region-based fully convolutional network (mRFCN) tailored for lung nodule detection and classification. Their model featured a multi-layer fusion region proposal network (mLRPN) designed to enhance ROI selection by incorporating position-sensitive score maps. Additionally, during the downsampling process, a deconvolutional layer was integrated to recover any potential loss of small objects, such as lung nodules.
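A minimal squeeze-and-excitation (SE) block of the kind ISANET attaches to its backbone is sketched below: global pooling "squeezes" each channel to a scalar, and a small bottleneck MLP "excites" (re-weights) the channels. The reduction ratio of 4 is an illustrative assumption.

```python
# Minimal squeeze-and-excitation (SE) channel attention block (PyTorch).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # squeeze to a bottleneck
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),  # expand back to channels
            nn.Sigmoid(),                                # per-channel weights in (0, 1)
        )

    def forward(self, x):
        w = x.mean(dim=(2, 3))            # squeeze: global average pool -> (N, C)
        w = self.fc(w)[:, :, None, None]  # excite: learned channel weights
        return x * w                      # recalibrate the feature maps

out = SEBlock(32)(torch.randn(2, 32, 8, 8))
print(out.shape)  # torch.Size([2, 32, 8, 8])
```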

Guo et al. [104] developed an end-to-end model, ProNet, for classifying histological subtypes. They utilized the ResNet architecture, incorporating a sequence of two batch normalization layers, followed by multiple building blocks and a global average pooling module in their research.

Moreover, a collection of studies [12, 53, 92, 98, 105, 108, 110, 116] has introduced fully customized CNN models, deliberately avoiding the use of pre-existing CNN architectures as their backbones. These models typically encompass a conventional CNN structure, incorporating convolutional, pooling, and fully connected layers. The models vary in the number of layers employed and often integrate additional enhancement modules like attention mechanisms.

Suresh and Mohan [116] developed a deep CNN that adheres to the typical CNN model structure and asserted its status as an end-to-end solution that eliminates the need for manual feature extraction. Ashraf et al. [53] formulated a distinctive approach, constructing a customized model featuring three branches of identical customized CNN models. These branches were subsequently fused to produce the final output, effectively fusing information from the three sub-models. In contrast, using histopathological images, Civit-Masot et al. [108] introduced the Explainable Deep Learning (xDL) framework for non-small cell lung cancer diagnosis. This novel approach harnessed the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm, which directed the system’s focus toward specific regions one at a time rather than analyzing the entire images collectively.

Saradhi et al. [12] introduced the Multiscale CNN with Compound Fusions (MCNN-CF) model, which takes multiscale 3D patches as inputs and performs feature fusion within the network. This fusion process occurs at two different network depths in two distinct manners. The model is organized into three stages, each consisting of submodules. Notably, the first stage simultaneously learns low-level features from 3D patches of varying sizes corresponding to each nodule candidate. In the subsequent two stages, high-level features are acquired through multiple fusions of features from the previous stage. Two distinct fusion techniques, namely Concatenation and Addition, are employed in the second and third stages, respectively. These fusion techniques contribute to integrating learned features, enhancing the model’s overall performance and capabilities.

Two studies [105, 110] have recognized the importance of focusing on specific regions within the input and addressed this by integrating attention mechanisms into CNN architectures. Sun et al. [105] introduced the attention-embedded complementary-stream CNN (AECS-CNN) to achieve this objective. The AECS-CNN model consists of three key functional blocks: the attention-guided multi-scale feature extraction block ensures the acquisition of multi-scale features with attention-driven focus; the complementary-stream block, augmented with an attention module, guides the network in weighing features from diverse scale inputs to prioritize the nodule region; and the classification block utilizes the integrated features for accurate classification. These efforts collectively enable the network to concentrate on significant regions, enhancing information accuracy and classification performance.

Fu et al. [110] extended this approach by integrating multiple attention-based learning modules to concurrently assess nine distinct visual attributes of lung nodules across complete CT image volumes. They employed a slice attention module to eliminate insignificant slices, cross-attribute attention modules to explicitly leverage inter-attribute relationships by combining high-level CNN representations, and attribute specialization attention modules to ensure the meaningfulness of high-level CNN representations for the specific attributes.

Ali et al. [92] introduced the Transferable Texture CNN, an end-to-end model. Comprising merely three convolutional layers and an energy layer, this model extracts texture features from the convolutional layer. The energy layer’s role is twofold: it preserves texture information from the preceding layer and dynamically learns during forward and backward propagation. This strategic inclusion of the energy layer not only retains the essential energy/texture information but also reduces the network’s learnable parameters, subsequently contributing to reduced computational complexity.

4.4 In-Depth Review of the Model Category: Pure GAN

GANs have gained attention lately. Among the 119 articles reviewed, only three [38,39,40] utilized established GANs without modifications. Specifically, Toda et al. [39] employed the Style Pix2pix model, Moris et al. [38] utilized CycleGAN, and Mendes et al. [40] employed Pix2Pix and cCGAN for lung image generation. Eight studies [36, 37, 41,42,43,44,45, 47] introduced novel Pure GAN algorithms for detecting and classifying lung cancer, and three studies [11, 46, 47] combined GANs with optimization algorithms. Table 4 summarizes the results obtained by the customized GANs.

Table 4 Summary of the pure GAN model group

In two recent research works, Nishio et al. [36] and Jin et al. [37] harnessed GANs to expand the available dataset for lung cancer by generating images. Nishio et al. [36] tailored a 3D GAN model closely following a 3D pix2pix network. Their model utilized the ResNet architecture as its foundation. Noteworthy modifications included integrating 3D random erasing to infuse noise into the generator and introducing a loss function called PathGAN loss, which combined L1 loss and GAN loss. This model’s adaptability enabled the creation of masked images containing nodules of varying sizes rather than being restricted to a specific size.

Jin et al. [37] approached the augmentation challenge as a free-form image generation problem, considering the complexities of arbitrary and irregular shapes. Their model, FRGAN, allowed users to specify their areas of interest for synthetic tumor generation with shape and size preferences. The process involved masking and erasing the chosen region, after which FRGAN reconstructed the erased portion using a learned mapping between the mask and the tumor. The model predominantly relied on a dilated-gated generator, leveraging dilation operations for an expanded receptive field and a more comprehensive convolutional feature connection branch for reconstructing tumor shape. A hybrid loss function, fusing multi-mask loss, style loss, and perceptual loss, facilitated the integration of complementary information.
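The combined objective behind such pix2pix-style models can be sketched as an adversarial term plus an L1 reconstruction term. The weighting (lambda_l1 = 100, following the original pix2pix paper) is an assumption here, not a value reported in [36] or [37].

```python
# Sketch of a pix2pix-style generator objective: adversarial + L1 loss.
import torch
import torch.nn.functional as F

def generator_loss(disc_fake_logits, fake, real, lambda_l1=100.0):
    adv = F.binary_cross_entropy_with_logits(
        disc_fake_logits, torch.ones_like(disc_fake_logits))  # fool the discriminator
    recon = F.l1_loss(fake, real)                             # stay close to the target image
    return adv + lambda_l1 * recon
```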

GANs have been utilized in various other tasks, including segmenting and classifying lung nodules. Notably, several methods have been explored for segmentation, often involving a series of steps requiring manual parameter adjustments at each stage. However, Jain et al. [41] introduced a novel LGAN approach, which automates the segmentation process. The model adopts U-Net as its foundational architecture and innovatively incorporates an Earth Mover distance-based loss function. This approach aims to achieve end-to-end segmentation, pushing the generated lung segmentation mask to align closely with the actual ground truth mask.

In a similar effort, Pawar and Talbar [42] introduced an approach named LungSeg-Net, which operates on the principles of a conditional GAN. Their primary focus was on extracting pertinent features to construct segmentation maps. They integrated a multi-scale dense feature extraction module between the encoder and decoder blocks to enhance their approach. This module consists of four inception blocks interconnected through dense connections, allowing for robust multi-scale feature extraction from the encoded feature maps.

Likewise, Tyagi and Talbar [43] devised a 3D conditional GAN termed CSE-GAN for lung nodule detection. Their strategy considered data distribution to address the challenge of class imbalance, which can lead to model overfitting. To combat this, they adopted a patch-based training methodology. The generator within their proposed network is based on the well-known U-Net architecture, tailored with a concurrent squeeze and excitation module. In contrast, the discriminator is a typical CNN network, enhanced with a spatial squeeze and channel excitation module. This setup allowed the discriminator to differentiate between ground truth and generated segmentation.

In addition, the inherent classification task of the discriminator in a GAN has been exploited for alternative purposes in certain studies [44, 45], where the focus shifted to classifying lung nodules as benign or malignant. For instance, Salama et al. [45] proposed the Deep GAN framework, which leveraged a convolutional variational auto-encoder (CVAE) as the generator. This architecture was employed to create a dense, class-balanced dataset for training the classifier model. In this case, the discriminator was a ResNet-50 network configured to classify lung tumors. Likewise, Xie et al. [44] introduced the MK-SSAC model, which operates on semi-supervised adversarial classification. This model employs multi-view knowledge-based collaborative learning, utilizing three semi-supervised adversarial classification modules to handle different aspects of benign-malignant lung nodule classification: overall appearance, shape heterogeneity, and texture heterogeneity. The model incorporates an adversarial autoencoder-based unsupervised reconstruction network, a supervised classification network, and transition layers. These transition layers enable the transfer of image representation abilities learned by the reconstruction network to the classifier, enhancing the overall classification performance.

4.5 In-Depth Review of the Model Category: Pure Other NN

Pure Other NN comprises models that do not primarily rely on convolution layers as their fundamental structure; such models include Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM) networks, and Recurrent Neural Networks (RNN). These networks may not excel at image processing alone, but they often serve as classifiers in the final stages of a CAD workflow, classifying the features extracted in earlier phases. Consequently, a significant portion of the studies involving Other NN models are complemented by accompanying algorithms and are therefore categorized as Hybrid ML instead [14, 15, 30, 55, 56, 66,67,68,69, 72, 74, 75, 80, 82, 83, 85, 86, 130,131,132,133].

Among the surveyed studies, only one study [50] falls under the category of Other NN. In this study, the authors tackled the challenge of classifying lung nodules with varying appearances by introducing the Progressive Growing Channel Attentive Non-Local (ProCAN) network. The ProCAN approach was devised to address this challenge through three strategic solutions. Firstly, the non-local network was enhanced by integrating channel-wise attention mechanisms, enhancing its capability to capture essential features. Secondly, the principles of Curriculum Learning were adopted, enabling the model to begin training on simpler examples before progressing to more complex ones. Thirdly, a progressive growth technique was employed to gradually modify the network’s depth, facilitating its adaptation to the increasing difficulty of the classification task. The expansion of the network was realized by introducing new layers facilitated by the Bernoulli Blending algorithm. This comprehensive strategy empowered ProCAN with the competence to handle the challenge inherent in lung nodule classification, thereby establishing it as a robust solution for this demanding problem. Notably, ProCAN’s foundational architecture is based on NNs, consisting of seven CAN (Channel Attentive Non-Local) blocks, followed by a global average pooling (GAP) layer and a fully connected layer.

4.6 In-Depth Review of the Model Category: Pure ML

In this survey, Pure ML methods refer to ML approaches that are not based on NNs. Although ML methods are frequently used as classifiers in the final stages of a CAD system, their standalone usage is limited due to a shift towards more efficient approaches, such as end-to-end models or using NNs to extract high-level features for better classification performance.

Only three identified studies [51, 52, 121] were categorized as Pure ML. These studies emphasized a multi-stage CAD system, wherein pre-processing, handcrafted feature extraction, and selection are performed in dedicated stages. The extracted features are subsequently input into an ML classifier for final classification.

For instance, in the work by Savitha and Jidesh [51], a two-stage CAD system was developed to classify lung nodules into solid and subsolid types. Image denoising, segmentation, and feature extraction were conducted as initial steps. In the first stage, the system separated nodules from non-nodules using SVM, FCM, and RF classifiers. The identified nodules were then further classified into solid and subsolid categories using K-means clustering and SVM.

Huang et al. [121] focused on quantitatively extracting features from CT images to classify subtypes of non-small-cell lung cancer. The Partition Around Medoids (PAM) consensus clustering algorithm was employed for this purpose. In another study, conventional SVM classification was used to address the nodule versus non-nodule classification problem. Before classification, segmentation and feature extraction were performed to derive quantitative features for the SVM classifier [52]. The authors introduced a hybrid segmentation approach, Adaptive Morphology-Based Segmentation Technique (AMST), which combined k-means clustering, morphological top-hat and bottom-hat operations, and adaptive structuring elements.

These studies underscore the application of traditional ML techniques in lung nodule classification, often integrated within multi-stage CAD systems that involve preprocessing, feature extraction, and final classification. Table 5 gives the summary of results obtained by the Pure ML methodologies.

Table 5 Summary of the pure ML model group

4.7 In-Depth Review of the Model Category: Other

An additional seven articles [6, 7, 122,123,124,125,126] fall under the “Other” category in this classification. This category comprises models that do not fit into the defined categories of Pure CNN, Pure GAN, Pure Other NN, or Pure ML. Instead, they are built using traditional non-ML methods combined within multi-stage CAD systems, which involve multiple stages for the analysis and processing of lung nodule data. Table 6 gives the summary of results obtained by these methodologies.

Table 6 Summary of the other model group

CAD systems were developed to automate the detection and classification of pulmonary nodules in lung CT images [122, 124]. These systems could simultaneously detect and categorize three types of nodules (ground-glass opacity, part-solid, and solid) using a combination of morphological, texture, and nodular features unique to the nodules.

Furthermore, Cui et al. [6] introduced an in-house DL-CAD system that embraced a double reading approach for enhanced accuracy, involving both DL- and radiologist-based reading. Heidari et al. [126] pioneered a global model named FBCLC-Rad, reported as the first federated learning model in this body of literature. Their innovation lies in utilizing data from multiple hospitals via blockchain-based federated learning to train a global CapsNets model, thereby optimizing information sharing. In another study, Kavithaa et al. [123] utilized the Spatial Image Clustering Technique and the Linear Subspace Image Classification Algorithm (LSICA) for the segmentation and classification of lung cancer, respectively.

Within the realm of lung segmentation, Wang et al. [7] introduced a novel method for Solitary Pulmonary Nodule (SPN) segmentation. This approach combined a multiscale total-variation pyramid and improved GrabCut techniques to address issues related to nodule inhomogeneity and fuzzy contours. Shariaty et al. [125] proposed a Texture Appearance Model (TAM) that utilized extracted features from CT scans to create a Texture Representation of Image (TRI). This approach aimed to differentiate between lung nodules and lung tissue within lung CT images, focusing on texture-based distinctions.

4.8 In-Depth Review of the Model Category: Hybrid ML

Most studies in this category integrate multiple foundational models to derive a new algorithm. Table 7 gives the summary of results obtained by the Hybrid ML methodologies, and the developments are discussed according to the combination pairs:

Table 7 Summary of the hybrid ML model group

4.8.1 Eight Studies Combined CNN and ML: [9, 58, 59, 64, 70, 71, 73, 127]

A recent trend in lung cancer classification involves the utilization of deep features extracted by one or more Deep NN models, which are then passed to another Deep NN or Conventional ML classifier for the final classification. Several studies have explored this approach, including [9, 58, 59, 71].

Kumar et al. [70] compared the performance of handcrafted features and deep features for classification purposes. References [58, 59, 73] demonstrated the use of pre-trained CNNs to extract deep features, followed by their utilization in conventional classifiers for final classification. While these studies focused on comparative analyses of various models, further performance enhancement was not a primary objective.

Wang et al. [9] introduced CNN-AvgFea-Norm3-based RF, a weakly supervised approach for efficient classification of whole-slide lung cancer images. Their method employed a patch-based fully convolutional network for extracting discriminative blocks and generating representative deep features. Different strategies for context-aware block selection and feature aggregation were explored, and the aggregated features were then fed into a random forest classifier for image-level prediction.
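
The deep-features-plus-conventional-classifier pattern recurring in these studies can be sketched as follows; the ResNet-18 backbone, the toy inputs, and the labels are illustrative assumptions, not the configuration of any particular study.

```python
# A pre-trained CNN backbone produces pooled feature vectors that a random
# forest then classifies.
import numpy as np
import torch
import torchvision.models as models
from sklearn.ensemble import RandomForestClassifier

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()       # expose the 512-d pooled features
backbone.eval()

@torch.no_grad()
def deep_features(batch: torch.Tensor) -> torch.Tensor:
    return backbone(batch)              # (N, 512) average-pooled deep features

images = torch.randn(16, 3, 224, 224)               # stand-in image patches
feats = deep_features(images).numpy()
y = np.random.randint(0, 2, size=len(feats))        # toy labels
rf = RandomForestClassifier(n_estimators=200).fit(feats, y)
```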

Alshayeji and Abed [71] introduced NoduleDiag, an end-to-end conventional ML approach that fused handcrafted features and deep features from various networks (ResNet-50, AlexNet, EfficientNet-B0) for classifying cancerous CT images and determining malignancy stages based on observable nodule characteristics. Furthermore, Savitha and Jidesh [64] combined a conditional random field (CRF), a non-NN model, with DL for semantic segmentation, deep feature extraction, and classification within a CAD system.

Hu et al. [127] proposed Mask R-CNN Lung Map for automatic lung segmentation from CT images, incorporating supervised and unsupervised ML methods such as Bayes, SVM, K-means, and Gaussian Mixture Models. However, limitations in Mask R-CNN were noted due to memory usage and detection speeds associated with region proposals.

These studies collectively illustrate the diverse approaches and methods used to leverage deep features for lung cancer classification, incorporating both DL and conventional ML techniques to enhance accuracy and efficiency in this critical medical application.

4.8.2 Six Studies Combined CNN and Other NN: [68, 69, 72, 130,131,132]

Marentakis et al. [130] proposed a novel method that does not require detailed segmentation. They explored combinatorial models, specifically the combination of LSTM, CNN, and radiomics techniques. Wang et al. [68] utilized DL techniques to classify lung adenocarcinoma subtypes in CT images. They introduced an ensemble model by combining a modified 3D-ResNet-34 with radiomics strategies.

Wankhade and Vigneshwari [132] introduced a new approach called Cancer Cell Detection using Hybrid Neural Network (CCDC-HNN). They employed parallel CNN and RNN models for segmentation, feature extraction, and classification, with results combined at a merge layer. Bushra et al. [72] developed a unique DL framework, LCD-CapsNet, which encapsulates both CNN and Capsule Neural Network (CapsNet). This framework leveraged the strengths of these networks to minimize data requirements and achieve spatial invariance for lung cancer detection and classification in CT images. The proposed LCD-CapsNet achieved high accuracy, but the complexity of its algorithm, particularly the inner loop of the dynamic routing procedure, resulted in slower performance compared to CNNs.

Chen et al. [131] introduced a multi-task learning model for histologic subtype classification of non-small cell lung cancer using CT images. Their approach aimed to optimize both subtype and staging classifications simultaneously. The multi-task learning model demonstrated better subtype classification performance than radiomics-based methods due to its ability to automatically extract higher-level features. Halder et al. [69] introduced a DL framework, 2-Pathway Morphology-based CNN (2PMorphCNN), for accurate lung nodule classification. This framework combined morphological and textural features using two trainable parallel paths: one path employed Gabor filters for CNN-based feature learning, while the other used adaptive morphology-based feature extraction. The 2PMorphCNN outperformed other nodule classification methods by capturing and combining textural and morphological features from lung nodule images.

These studies collectively highlight various innovative approaches using DL, multi-task learning, ensemble models, and hybrid models to enhance lung cancer detection and classification using medical images, while also addressing challenges such as efficient feature extraction and classification.

4.8.3 Two Studies Combined CNN and Other: [128, 129]

Bae et al. [128] developed a novel CAD system for classifying lung diseases. They introduced a unique approach by utilizing Perlin noise for data augmentation, which generates natural-looking textures efficiently. This augmented data was then used as input for the FusionNet classifier, enabling precise pixel-level disease classification.

Addressing the issue of varying-resolution screening data, Xu et al. [129] introduced DeepLN, an integrated solution employing CNNs, a multi-level feature combination strategy, and a Region Proposal Network (RPN). This system effectively handles multi-resolution challenges by using neural-network-based detectors to identify lung nodules. They tackled the class imbalance problem using hard negative mining and a modified focal loss function. Furthermore, they proposed an innovative ensemble technique based on non-maximum suppression to merge outcomes from various NN models trained on different CT image resolutions.
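
As a concrete illustration of the class-imbalance remedy mentioned above, here is a minimal binary focal-loss sketch; the alpha and gamma values are common defaults, not the settings reported for DeepLN.

```python
# Binary focal loss: down-weights easy negatives so rare positives
# (e.g., nodules) dominate the gradient.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)                            # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

logits = torch.randn(8)
targets = torch.tensor([1., 0., 0., 0., 0., 0., 0., 1.])  # imbalanced batch
loss = focal_loss(logits, targets)
```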

4.8.4 One Study Combined ML and Other: [65]

Li et al. [65] designed a CAD system named Relief-SVM to classify different subtypes of lung cancer using histopathology images. The process involved initial extraction of traditional features, subsequent application of RELIEF for selecting relevant features, and ultimately employing SVM for the classification task.
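
For illustration, the classic binary Relief weighting at the core of such a pipeline can be sketched as below; this is a textbook version with toy data, not the authors’ implementation.

```python
# Relief feature weighting: features that separate nearest same-class and
# different-class neighbours receive higher weights.
import numpy as np

def relief_weights(X: np.ndarray, y: np.ndarray, n_iter: int = 100,
                   seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    span = X.max(0) - X.min(0) + 1e-12
    X = (X - X.min(0)) / span                   # scale features to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(X), n_iter):
        d = np.abs(X - X[i]).sum(1)             # L1 distance to all samples
        d[i] = np.inf                           # exclude the sample itself
        hit = np.argmin(np.where(y == y[i], d, np.inf))
        miss = np.argmin(np.where(y != y[i], d, np.inf))
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter

X = np.random.rand(100, 12)
y = (X[:, 0] + X[:, 3] > 1).astype(int)         # toy data: features 0, 3 matter
top5 = np.argsort(relief_weights(X, y))[-5:]    # strongest features for an SVM
```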

4.8.5 Five Studies Combined Other NN and ML: [15, 66, 67, 74, 133]

In most of these studies, SVM was utilized as the primary classifier within their respective CAD systems as seen in references [15, 66, 67, 74, 133].

Addressing the imperfections in raw images, Shakeel et al. [133] introduced an innovative hybrid pre-processing technique called the Improved Profuse Clustering Technique (IPCT) to enhance image quality, combined with a Deep Learning Instantaneously Trained Neural Network (DITNN) for classifying cancer and non-cancer cases.

Due to the challenge of overlapping cells, Kavitha et al. [15] proposed an Efficient Classification Model for Cancer Stage Diagnosis (ECM-CSD). They employed Region-Based Fuzzy C-Means Clustering (FCM) for lung region segmentation and SVM for classifying cancer stages.

Nanglia et al. [66] presented the Kernel Attribute Selected Classifier (KASC) incorporating three key blocks. Their approach involved preprocessing, utilizing Speeded Up Robust Features (SURF) optimized by Genetic Algorithm (GA) for feature extraction, and employing SVM integrated with a Feed-Forward Back Propagation Neural Network for classification.

Rey et al. [67] introduced a hybrid CAD system involving fuzzy clustering, SVM, and ANN. Their system featured automated detection through a combination of Modified Spatial Kernelized Fuzzy C-Means (MSKFCM) and Back Propagation Neural Network (BPNN). They also incorporated volume of interest creation using a growing algorithm, feature selection through PCA, and utilized SVM and ANN classifiers with various regularization techniques.

Siddiqui et al. [74] proposed the GFSVM-EDBN method, which employs an Enhanced-DBN (E-DBN) consisting of cascaded Gaussian-Bernoulli and Bernoulli-Bernoulli Restricted Boltzmann Machines (RBMs) for feature selection. They combined this with an SVM for classifying lung CT images. This cascading approach simplifies layer-to-layer operations and offers improved feature selection compared to conventional DBN methods.

4.8.6 Three Studies Combined {CNN, GAN, Other-NN, ML} and Fuzzy: [30, 56, 78]

To date, the utilization of Fuzzy ML is rare, as evidenced by only three studies [30, 56, 78] incorporating a fuzzy component as part of their algorithms. Dey et al. [78] introduced an optimized fuzzy ensemble of CNNs through the Sugeno integral-based ensemble approach, which was further enhanced using eight well-established optimization algorithms for the purpose of lung disease screening. Meanwhile, Tian et al. [56] integrated the existing fuzzy possibilistic c-ordered mean alongside enhanced capsule networks (ECN), incorporating the converged search and rescue (CSAR) algorithm to optimize the clustering processes.

In multi-stage CAD pipelines, the effectiveness of each stage heavily relies on the accuracy achieved in the preceding stage, which constitutes a significant drawback. In their work, Tiwari et al. [30] addressed this concern by employing a Target-based Weighted Elman DL Neural Network (TWEDLNN) alongside a Farthest First Fuzzy C-Means (3FCM) algorithm for lung cancer detection. While Elman Deep Neural Networks (EDNN) are adept at handling discrete time series challenges, they may exhibit convergence issues or prolonged execution times. To mitigate this, the study introduced target-based weight values to enhance the control and performance of the EDNN. In traditional Fuzzy C-Means (FCM) algorithms, cluster centroids are chosen randomly, which can yield suboptimal outcomes. To overcome this limitation, the proposed 3FCM approach leverages the Farthest Point First Clustering (FPFC) algorithm to select initial centroids in a more effective manner.
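
A brief sketch of the farthest-point-first seeding idea is given below; the greedy selection rule is the standard one, and the toy data and cluster count are illustrative.

```python
# Farthest-point-first centroid seeding: each new centroid is the sample
# farthest from all centroids chosen so far, avoiding the poor starts that
# random FCM initialisation can give.
import numpy as np

def farthest_first_centroids(X: np.ndarray, k: int, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    centroids = [X[rng.integers(len(X))]]           # arbitrary first centroid
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[np.argmax(d)])           # farthest remaining point
    return np.stack(centroids)

X = np.random.rand(500, 2)
init = farthest_first_centroids(X, k=3)             # feed these to FCM
```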

4.8.7 Eighteen Studies Combined {CNN, GAN, Other-NN, ML} and Optimization algorithm: [11, 14, 46, 47, 55, 57, 75,76,77, 79,80,81,82,83,84,85,86,87]

The interest in integrating optimization techniques has been steadily increasing over the years due to the growing demand for computational resources and the advancement of DL models. The number of studies in this area has grown rapidly and is expected to continue growing in the future.

CNN layers generate high-dimensional deep features, which can lead to the curse of dimensionality. The curse of dimensionality refers to the challenges and inefficiencies that arise when dealing with high-dimensional data. To address this, a subset of studies [76, 79, 81] has attempted to optimize the models using specific algorithms to find optimal hyperparameters and improve performance.

Li et al. [76] employed a genetic algorithm (GA) to optimize a basic CNN. Their visual analysis indicated that the GA-optimized CNN exhibited improved accuracy, although no quantitative measurements were provided. In another study, Xu et al. [81] introduced a novel CAD system for lung cancer using CT-scan images. They utilized a modified Bowerbird optimization algorithm to enhance an AlexNet model. Modifications incorporating opposition-based learning and chaos mechanisms were introduced to address previous challenges of low accuracy and slow convergence speed with the Bowerbird algorithm.
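
A GA-based hyperparameter search of this kind can be sketched as below; the search space, population settings, and the surrogate fitness function are all hypothetical stand-ins for actually training and validating a CNN.

```python
# A toy genetic algorithm over CNN hyperparameters.
import random

SEARCH_SPACE = {"lr": [1e-4, 3e-4, 1e-3, 3e-3],
                "filters": [16, 32, 64],
                "depth": [2, 3, 4]}

def train_and_score(cfg: dict) -> float:
    # Hypothetical fitness: in practice, train a CNN configured by `cfg`
    # and return its validation accuracy. A toy surrogate keeps this runnable.
    return cfg["filters"] / 64 + cfg["depth"] / 4 - abs(cfg["lr"] - 1e-3) * 100

def crossover(a: dict, b: dict) -> dict:
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(cfg: dict) -> dict:
    key = random.choice(list(SEARCH_SPACE))
    return {**cfg, key: random.choice(SEARCH_SPACE[key])}

population = [{k: random.choice(v) for k, v in SEARCH_SPACE.items()}
              for _ in range(8)]
for generation in range(5):
    population.sort(key=train_and_score, reverse=True)
    parents = population[:4]                    # keep the fittest half
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(4)]
    population = parents + children
best = max(population, key=train_and_score)
```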

Addressing the challenge of high-dimensional deep features generated by the concatenation layer in multi-input CNN methods, Huang et al. [79] introduced a manifold-based DL model named the Deep Feature Optimization Framework (DFOF). In this research, optimization techniques were integrated into the two-stream feature extraction stage to segregate interclass samples and to form the embedding features fed into a classifier.

Meanwhile, the integration of optimization techniques is also gaining traction in the realm of research involving customized CNNs. Jiang et al. [77] introduced NASLung, a customized CNN that incorporates a convolutional-based attention module (CBAM) with A-Softmax loss function and employed a neural architecture search approach known as Partial Order Pruning to search for low-latency neural architecture. An ensemble of diverse NNs was utilized to enhance prediction accuracy and overall robustness. Notably, the model achieved remarkably competitive performance while utilizing less than 1/40th of the parameters typically employed.

Kanipriya et al. [80] presented an innovative approach called Improved Capuchin Search Algorithm (ICSA) optimized hybrid architecture, combining CNN and LSTM, denoted as ICSA-LSTM-CNN, for the classification of lung nodule subtypes. The Capuchin search algorithm draws inspiration from the energetic foraging behavior of capuchin monkeys, and the algorithm was improved with Opposition-Based Learning and Chaotic Local Search to optimize hyperparameters.

Ajai and Anitha [57] introduced the Shuffled Social Sky Optimizer-based Multi-Object Rectified Attention Network (SSSO-based MORAN) for lung cancer classification. This model utilizes a novel algorithm, the Shuffled Shepherd Optimization Algorithm (SSOA), combined with the Social Ski-Driver (SSD) algorithm. The SSOA is inspired by animal instinct, mimicking a shepherd’s ability to find optimal paths, while SSD addresses imbalanced data issues by considering velocity and previous positions for solution updates. Notably, the SSSO-based MORAN is heavyweight as it incorporates multiple advanced components, including Deep Renyi entropy fuzzy clustering (DREFC) for segmentation, artificial feature extraction, and a grid-based scheme for detection, culminating in an advanced classification approach.

Several models had encountered a significant issue of overfitting, displaying effectiveness for specific classes but lacking performance across all categories. Addressing this concern, Rajagopal et al. [84] demonstrated an alternative solution for lung disease detection. They employed a deep convolutional spiking NN optimized with the arithmetic optimization algorithm (LDC-DCSNN-AOA). Impressively, their approach achieved heightened sensitivity in comparison to the competing models.

Sengodan et al. [86] introduced a method named Multipopulational Neighborhood Particle Swarm Optimized Modified Ensemble Faster Learning (MNPS-MEFL). Their approach involved adaptively tuning an ensemble of SVM and Faster R-CNN classifiers using this optimization algorithm. The adaptive tuning aimed to strengthen accuracy specifically for precise detection of benign and malignant lung nodules.

Previous research did not focus on optimizing the assigned weights within the ensemble model through a meta-heuristic-based strategy. Srivastava et al. [87] implemented ensemble learning across six DCNN classifiers. They employed Differential Evolution optimization to determine optimal assigned weights for these classifiers during ensemble model training. Additionally, a majority voting mechanism based on Condorcet’s Jury Theorem was introduced. This innovation significantly reduced computational efforts by eliminating the need for training meta-learners.
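
The weighted-ensemble idea can be illustrated with the sketch below, where differential evolution searches for fusion weights that maximize validation accuracy; the six-model setup and the random data are toy placeholders, not the cited study’s configuration.

```python
# Differential evolution over ensemble fusion weights.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(0)
n_models, n_samples = 6, 200
probs = rng.random((n_models, n_samples, 2))       # per-model class probs
probs /= probs.sum(-1, keepdims=True)
y_val = rng.integers(0, 2, n_samples)              # validation labels

def neg_accuracy(w: np.ndarray) -> float:
    w = w / (w.sum() + 1e-12)                      # normalise weights
    fused = np.tensordot(w, probs, axes=1)         # weighted soft vote
    return -np.mean(fused.argmax(-1) == y_val)

result = differential_evolution(neg_accuracy, bounds=[(0, 1)] * n_models,
                                seed=0, maxiter=50)
best_weights = result.x / result.x.sum()
```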

In addition to models based on CNNs, two studies also delved into the optimization of GAN models. Jain et al. [11] applied the Salp Shuffled Shepherd Optimization Algorithm, while Kumar et al. [46] applied the Sunflower Optimization Algorithm. Both approaches were employed to address challenges associated with high levels of overfitting in GAN models.

Murthy and Prasad [47] proposed the Transformer-Aided GAN (T-GAN) technique to address the issue of spatial feature degradation within a GAN-based classification approach. They introduced a transformer into the GAN network, aiming to preserve spatial features within the image by optimizing the various layers of the GAN framework. Furthermore, to fine-tune the network model, they introduced a novel technique termed Dynamic Levy Flight Chimp Optimization (DyLF-CO).

The challenges in detecting tumors with varying appearances and characteristics prompted Braveen et al. [14] to propose the Ant Lion-based Autoencoders (ALbAE) model for optimal high-level feature extraction to allow precise classification based on discriminative features. Moreover, Lakshmanaprabu et al. [75] introduced an Optimal Deep Learning (ODNN) classifier, composed of a Deep Belief Network and a Restricted Boltzmann Machine, using a Modified Gravitational Search Algorithm (MGSA) for weight optimization, aimed at detecting lung cancer.

The concept of Opposition-Based Learning (OBL) involves generating opposite candidate solutions to diversify the search space, potentially leading to improved optimization outcomes. Priya et al. [55] employed OBL alongside Deep Belief Networks, using the Opposition-Based Pity Beetle Algorithm (OPBA), inspired by beetle behavior. Sabzalian et al. [85] enhanced Bidirectional Recurrent Neural Networks (BRNNs) for lung cancer diagnosis using an Improved Ebola Search Optimization Algorithm, outperforming standard gradient-based optimization techniques. Meanwhile, Prakash et al. [83] optimized the Lung Cancer Classification system (EESNN classifier) with the Flamingo Search Optimization Algorithm.
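
The opposition operation itself is simple, as the following short sketch shows for a bounded parameter vector; the bounds are illustrative.

```python
# Opposition-based learning: for a candidate x in [lo, hi], its opposite is
# lo + hi - x; evaluating both doubles search-space coverage.
import numpy as np

lo, hi = np.array([0.0, 1e-4]), np.array([1.0, 1e-1])   # parameter bounds
x = np.random.uniform(lo, hi)                            # candidate solution
x_opp = lo + hi - x                                      # opposite candidate
# Keep whichever of x / x_opp scores better under the fitness function.
```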

Convergence challenges in Mask R-CNN were addressed by Indumathi and Siva [82], who introduced a hybrid Mask R-CNN-Bidirectional Long Short-Term Memory (BiDLSTM) model for more accurate lung disease prediction. This model utilizes the Crystal algorithm to optimize Mask R-CNN through hyperparameter tuning, improving scalability and convergence, and enhancing segmentation accuracy through region selection.

4.9 In-Depth Review of the Model Category: Commercial CAD System

Four studies [60,61,62,63] revolved around the implementation of readily available CAD systems. Table 8 gives the summary of results obtained by the Commercial CAD methodologies.

Table 8 Summary of Commercial CAD system group

Morozov et al. [61] introduced FAnTom, a practical software tool grounded in a cluster model and made accessible on GitHub. This tool was designed for nodule localization and could accommodate variations in interpretations from diverse individual readers annotating CT scans. Lancaster et al. [63] conducted an evaluation and comparison of a DL-based lung cancer screening method called AVIEW LCS. The performance of this AI system was assessed in comparison to manual reading methods.

Tam et al. [62] introduced AI integration into cancer diagnosis. They employed Red Dot for initial classification and then directed patients towards further treatment or radiologist review based on the AI reader’s assessment. Hsu et al. [60] conducted a study to compare the effectiveness and reading time of different readers using the ClearReadCT system, an automatic AI-powered CAD system, for lung nodule detection across various reading modes.

5 Discussion

In this section, a detailed analysis will be conducted in relation to each individual research question.

5.1 RQ1: What are the types of CAD techniques used for diagnosing lung cancer, and how do CAD systems contribute to enhancing the efficiency and accuracy of lung cancer diagnosis?

Medical imaging analysis for lung cancer CAD systems can be classified into three main categories based on feature engineering, DL, and end-to-end frameworks. These systems are constructed using foundational models, including Pure CNN, Pure ML, Pure GAN, Other NN, Pure Other, Hybrid ML, and Commercial CAD systems. The most popular technique among them is the Pure CNN approach. The goal of CAD systems is to automate lung cancer detection tasks and provide accurate diagnostic assessments.

5.2 RQ2: What are the Key Advancements in CAD for Lung Cancer Diagnosis, and How Do Different Model Configurations Impact the Performance of Lung Cancer Diagnosis?

In essence, the landscape of medical imaging analysis has been marked by these compelling trends: the direct utilization of transfer learning, the incorporation of GANs, the exploration of diverse NN models, the investigation of non-NN approaches, the development of hybrid models, and the integration of commercial CAD tools. Notably, many researchers have either introduced entirely novel CNN algorithms tailored to specific diagnostic objectives or have enhanced publicly available CNN models by incorporating custom or modified layers.

5.2.1 RQ2a: What are the advantages and disadvantages of CNN-based approaches?

CNNs have revolutionized the field of lung cancer imaging analysis through their capacity to automatically extract hierarchical features from data. Nevertheless, like any other technology, CNN-based strategies come with both merits and demerits.

Advantages: CNNs excel at automatically learning and extracting hierarchical features and discriminative patterns from raw data. Their translation invariance properties enable them to identify patterns irrespective of their location within the input data, proving advantageous for tasks like image classification where nodule location can vary. Moreover, the hierarchical learning nature of CNNs facilitates an efficient understanding of varying levels of abstraction, from low-level structural features to high-level semantic features. Furthermore, CNNs inherently capture spatial hierarchies, making them appropriate for tasks involving spatial relationships such as object segmentation and detection.

Disadvantages: Within the CNN architecture, deeper networks do not necessarily yield improved performance, and determining the optimal depth for effectiveness remains unclear [135]. Greater network depth corresponds to an increase in trainable parameters, rendering the model computationally intensive and reliant on robust hardware. Moreover, CNNs are data-hungry, demanding a substantial volume of data to perform effectively; training them from scratch with limited data can lead to overfitting or weak generalization. In addition, task-specific CNNs may not generalize well across different tasks, necessitating tailored architectural fine-tuning for varying objectives.

5.2.2 RQ2b: What are the key advancements in CNN-based approaches for lung cancer detection, and how do these approaches address the limitations and challenges faced in lung cancer detection?

Several key advancements in CNN-Based Approaches for lung cancer CAD development were observed. One notable advancement is the widespread utilization of transfer learning in several studies [31, 32, 34, 35, 59, 92, 96, 100, 114, 118], where pre-trained CNN models, often trained on large datasets like ImageNet, are fine-tuned for lung cancer detection. This approach enables the extraction of relevant features from medical images without requiring massive amounts of labeled data.
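
A minimal sketch of this fine-tuning pattern follows, assuming a frozen ImageNet-pre-trained ResNet-50 backbone and a new two-class head; all hyperparameters and the toy batch are illustrative.

```python
# Transfer learning: freeze the pre-trained backbone, fine-tune a new head.
import torch
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():
    p.requires_grad = False                            # freeze the backbone
model.fc = torch.nn.Linear(model.fc.in_features, 2)    # new 2-class head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()

x, y = torch.randn(4, 3, 224, 224), torch.tensor([0, 1, 1, 0])
loss = criterion(model(x), y)                          # one illustrative step
loss.backward()
optimizer.step()
```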

Secondly, researchers have developed customized CNN architectures optimized for lung cancer detection. These proposed architectures often incorporate specialized layers, attention mechanisms, or skip connections to capture intricate features and relationships within lung images [10, 90, 92, 109, 113]. In particular, attention mechanisms have been implemented within CNN architectures to focus on relevant ROIs within the lung images, enhancing the models’ ability to accurately identify and classify cancerous regions in a number of studies [17, 96, 105, 110, 115, 120]. Additionally, ensemble techniques, which combine predictions from multiple CNN models, have gained traction for improving overall performance and reducing potential overfitting [18, 29, 88, 91, 93, 103, 106, 107, 109, 111].

An ongoing challenge is class imbalance. In medical imaging, the number of positive (cancerous) cases is usually significantly smaller than that of negative cases. Most studies addressed class imbalance with techniques like augmentation through geometric transformation [5, 8, 17, 29, 53, 92, 97, 99, 102, 104, 105, 115, 116, 118], and one study [100] explored GAN-based image synthesis.
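
The geometric transformations in question are typically as simple as the following torchvision sketch; the specific transform set and patch size vary across the cited studies.

```python
# Label-preserving geometric augmentation for minority-class (nodule) images.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(64, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
# Applying `augment` repeatedly to minority-class images yields additional
# training samples and softens the class imbalance.
```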

Apart from that, lung nodules can exhibit variations in shape, orientation, size, and intensity, leading to potential challenges in feature extraction, especially for the detection of small nodules. Some studies used multi-scale approaches [10, 12, 13, 53], some leveraged multi-resolution approaches [8, 95], and some demonstrated that multi-dimensional approaches can help handle these variations [97, 117].

In addition, models trained on specific datasets might not generalize well to different datasets due to variations in imaging protocols and demographics. Many studies addressed this issue with cross-validation on diverse datasets [10, 17, 29, 53, 92, 95, 96, 99, 102, 103, 109, 111, 113, 117].

Lastly, CNN-based methods are computationally intensive, especially for large datasets. Efficient model architectures, hardware acceleration, and optimization techniques are possible solutions to this challenge. Notably, many studies attempted to mitigate this by embedding an optimization algorithm into the CNN network structure [9, 57, 64, 68, 69, 72, 73, 77, 80, 84, 86, 87, 127, 129, 131, 132].

5.2.3 RQ2c: How have GANs been applied to lung cancer diagnosis?

Research on lung cancer CAD development utilizing GANs is relatively confined, with only seven notable studies [36, 37, 41,42,43,44,45] solely dedicated to the development of GAN models. These studies comprise a diverse spectrum of applications, each contributing to the expanding potential of GANs in lung cancer research. Specifically, for segmentation tasks, studies have emerged showcasing the capacity of GANs to delineate object boundaries [11, 41,42,43]. Alternatively, works [44,45,46,47] have exploited the power of GANs in refining classification models. Fundamentally, GANs, originally designed to generate synthetic images, have also been exploited for lung cancer image generation. References [36, 37] delved into the augmentation domain, leveraging GANs to create synthetic data that expands the diversity of training samples. This aids in minimizing overfitting and ensuring robust generalization of ML models.
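
A compact sketch of the augmentation use case is given below: a toy generator and discriminator are trained adversarially so the generator can later synthesize nodule-like patches. The network sizes, the 32x32 patch shape, and the training schedule are illustrative assumptions.

```python
# A minimal GAN training loop for synthetic-patch augmentation.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                  nn.Linear(256, 32 * 32), nn.Tanh())          # generator
D = nn.Sequential(nn.Linear(32 * 32, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                           # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(16, 32 * 32) * 2 - 1        # stand-in real nodule patches
for step in range(100):
    z = torch.randn(16, 64)
    fake = G(z)
    # Discriminator step: separate real from generated patches.
    loss_d = bce(D(real), torch.ones(16, 1)) + \
             bce(D(fake.detach()), torch.zeros(16, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: fool the discriminator.
    loss_g = bce(D(fake), torch.ones(16, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
# G(torch.randn(n, 64)) now yields n synthetic patches for augmentation.
```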

While the current landscape showcases a relatively modest count of studies, the trajectory of GAN-based approaches for lung cancer CAD systems appears promising, and growing interest in this domain is expected, especially in the context of augmentation and segmentation tasks. The anticipated surge in GAN usage signifies a paradigm shift, propelling future CAD development towards more sophisticated and effective ML solutions that directly target the challenges of data scarcity, feature enhancement, and complex multi-modal analysis.

5.2.4 RQ2d: What are the key advancements in non-DL approaches for lung cancer detection?

Non-DL methods rely on manual feature extraction from medical images, employing techniques like texture analysis, shape-based features, and intensity-based features to capture discriminative characteristics that play a critical role in the classification of lung cancer. Nevertheless, as the drive to automate the CAD process intensifies, the conventional ML approach is losing relevance due to its limited capacity to learn relevant lung nodule features, unlike CNN models. This decline in interest and usage is evident, as only a few studies exclusively utilized ML algorithms [51, 52, 121], whereas most models either employed pure CNN architectures or integrated ML with other techniques, forming a Hybrid ML approach.

Among these studies, eight [9, 58, 59, 70, 71, 73, 79, 127] introduced Hybrid ML, combining CNN and ML for classification. These models exploited CNNs to learn high-level features and then fed these features into ML classifiers to derive accurate diagnosis outcomes.

Additionally, numerous studies adopted other NN based methods and clustering approaches in conjunction with ML classifiers [14, 15, 66, 67, 74, 82, 133]. These proposed strategies typically follow a two-stage process: initially, segmentation, detection, or feature extraction is performed using other NNs or clustering techniques, followed by ML classifiers for final classification.

5.3 RQ3: Which approach has demonstrated superior performance in detecting and classifying lung cancer from medical images, and what are the typical methods used to enhance the efficiency and performance of the algorithms?

In terms of accuracy, the ProNet model reported the lowest accuracy at 71.6%, utilizing CT images for a classification task [104]. On the contrary, the highest classification accuracy was achieved by the Jury-based ensemble model at 99.88%, employing a hybrid approach by integrating multiple methods [87].

On the topic of Pure CNN models, their accuracy ranged from 72.34% [104] to 99.69% [92, 108], specificity spanned from 66.36% [53] to 100% [93], and sensitivity exhibited a range of 63% [93] to 99.897% [102] for a two-class classification task. In the case of multiclass classification, Pure CNN models achieved accuracy ranging from 72.17% [106] using a basic customized CNN to 99.5% [119] employing a CNN enhanced with a Multispace Image (MIR) pool. Specificity ranged from 91.78% [5] to 100% [29], while sensitivity varied from 88.79% [5] to 98% [29].

Pure GAN models, employed for segmentation tasks, reported a Dice coefficient ranging from 80.74% [43] to 98.99% [42] and a Jaccard index ranging from 72.52% [43] to 98% [42]. Interestingly, a modified GAN structure [45] for classification tasks yielded impressive results, reporting accuracy, precision, F1-score, sensitivity, and recall of 98.91%, 97.72%, 97.89%, 98.46%, and 98.85%, respectively. These findings indicate the promising potential of GANs for classification tasks.
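
For reference, the two overlap metrics quoted above are computed on binary segmentation masks as in the short sketch below; the masks are toy examples.

```python
# Dice coefficient and Jaccard index on binary masks.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    inter = np.logical_and(pred, truth).sum()
    return 2 * inter / (pred.sum() + truth.sum() + 1e-12)

def jaccard(pred: np.ndarray, truth: np.ndarray) -> float:
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / (union + 1e-12)

pred = np.zeros((64, 64), bool); pred[20:40, 20:40] = True
truth = np.zeros((64, 64), bool); truth[25:45, 22:42] = True
print(f"Dice={dice(pred, truth):.3f}, Jaccard={jaccard(pred, truth):.3f}")
```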

It is worth noting that studies often lack systematic evaluations, which limits this survey from offering a comprehensive and in-depth comparison of model performance. Instead, it provides a general overview of performance ranges reported in various studies. Furthermore, a trend emerged where studies reporting state-of-the-art accuracy values sometimes exhibited slightly lower, or even significantly weaker, sensitivity. This highlights the risk of false detection and misdiagnosis in proposed models. As a result, the survey underscores the significance of sensitivity and false positive rates, emphasizing that they are just as crucial as accuracy and should not be neglected in upcoming research.

Moreover, a total of 19 studies [11, 14, 46, 47, 55, 57, 75,76,77,78,79,80,81,82,83,84,85,86,87] incorporated embedded optimization algorithms to guide models in selecting optimal hyperparameters for performance maximization. The outcomes of these methods surpassed several state-of-the-art approaches, indicating their potential for enhancing model performance.

6 Conclusion and Future Work

This research undertook a comprehensive analysis of 119 papers published between 2019 and 2023, focusing on the development of Computer-Aided Diagnosis (CAD) systems for lung cancer from a model-driven perspective. The study presents a review and trend analysis exploring the composition of constituent models in lung cancer detection. The review elucidated essential deep learning (DL) concepts and popular architectural frameworks, categorizing lung cancer CAD systems into three main categories: feature engineering, DL, and end-to-end frameworks. Emerging trends such as the direct utilization of transfer learning, incorporation of Generative Adversarial Networks (GANs), exploration of diverse neural network (NN) models, investigation of non-NN approaches, development of hybrid models, and integration of commercial CAD tools have significantly influenced the landscape of lung imaging analysis.

Convolutional neural networks (CNNs) emerge as a predominant approach, revolutionizing lung cancer imaging analysis by automatically extracting hierarchical features from data. Researchers have leveraged CNN algorithms tailored to specific diagnostic objectives, enhancing publicly available models with custom or modified layers. Attention mechanisms and ensemble techniques have further augmented CNN architectures to improve performance and mitigate potential overfitting. Nonetheless, CNN-based methods entail significant computational costs, which can be addressed through the adoption of efficient model architectures, hardware acceleration, and optimization techniques.

Although research on GANs in this context remains limited, promising applications have emerged, particularly in data augmentation and segmentation tasks. GANs have facilitated the generation of synthetic data, augmenting training samples' diversity and mitigating overfitting. However, challenges such as systematically evaluating the quality of synthetic images and addressing the susceptibility of GANs to mode collapse and training instability require careful consideration.

Furthermore, a decline in conventional ML approaches is observed, with a discernible shift towards hybrid models integrating CNNs and ML for enhanced feature learning and classification accuracy. State-of-the-art models showcase impressive results, highlighting the potential of GANs in classification tasks and embedded optimization algorithms in enhancing model performance.

Despite significant strides in lung cancer detection through CAD systems, challenges persist, necessitating further research endeavors. Careful assessment of sensitivity and false positive rates is imperative, alongside the development of image processing methods capable of handling low-resolution and blurred nodule images. Moreover, precise detection methods are essential for borderline cases, requiring the utilization of sophisticated mathematical tools such as fuzzy logic and deep learning algorithms.

In conclusion, while substantial progress has been made in leveraging AI-driven CAD systems for lung cancer detection, ongoing research efforts are essential to address remaining challenges and advance the field towards more robust and accurate diagnostic solutions.