1 Introduction

The word cancer comes from the ancient Greek καρκίνος (karkinos), which means both crab and tumor. The term was introduced to the medical world in the 1600s and is associated with abnormally growing cells that can invade or spread to other parts of the body [136]. The uncontrolled growth of cells starts at one site in the human body and can spread to other body parts, a process known as cancer metastasis [43, 172]. Tumors are categorized as benign or malignant. Benign cells do not spread to other parts of the body, while malignant cells metastasize and are considered more destructive. Because of cancer's high mortality and recurrence rates, its treatment is long and costly, so it needs to be diagnosed accurately and early to improve patients' survival rate. Cancer is a genetic disease triggered by mutations in the genes that control how cells function, especially how they grow and divide. As the tumor cells continue to grow, additional changes occur. In a nutshell, cancer cells have more genetic changes, such as mutations in DNA, than normal cells [110, 116]. Although the immune system generally removes damaged or abnormal cells from the body, some cancer cells can hide from it. The tumor can also use the immune system to grow and stay alive [179]. A cancer is named after the site where the tumor cells first grow; for example, cancer that arises in the lungs and spreads to the liver is still called lung cancer. Cancer diagnosis includes three predictive tasks: cancer risk assessment, cancer recurrence prediction, and cancer survivability prediction. First, the probability of cancer occurrence is assessed; second, cancer recurrence is predicted; and last, aspects such as progression, life expectancy, tumor-drug sensitivity, and survivability are predicted [95].

1.1 Motivation

The motivation behind this research is the rapid growth in cancer incidence and mortality cases worldwide [10]. The reasons are complex but reflect both aging and growth of the population and changes in the prevalence and distribution of the main risk factors for cancer. Figure 1 depicts the cancer incidence cases and death statistics reported by the American Cancer Society and other reliable resources.

Fig. 1

Estimated number of new cases and deaths in 2020 for common cancer types (www.cancer.net)

Multiple investigations have been conducted in cancer research. For example, Rong et al. [142] carried out a mortality and survival study by gender. Dolatkhah et al. [49] presented a study of survival data and trend analysis of breast cancer in Iran. Goodarzi et al. [65] presented an assessment based on distinct cross-sectional cancer studies. Azamjah et al. [13] aimed to determine the 25-year breast cancer mortality rate in the 7 super regions defined by the Institute for Health Metrics and Evaluation (IHME). Momenimovahed et al. [115] presented a study showing that breast cancer incidence varies significantly with race and ethnicity and is higher in developed countries. Haggar et al. [66] presented a study of the incidence, mortality, and survival rates for colorectal cancer, with attention paid to regional variations and changes over time. Zhang et al. [184] conducted a study that gathered colorectal cancer (CRC) incidence data from Cancer Incidence in Five Continents. Wong et al. [174] observed a positive correlation between incidence and country-specific socio-economic development. Nguyen et al. [124] summarized the diagnosis and treatment of thyroid cancer, with recommendations from the American Thyroid Association regarding thyroid nodules and differentiated thyroid cancer. Lee et al. [176] analyzed 800 patients with a diagnosis of cancer and symptomatic COVID-19 between March 18 and April 26, 2020: 412 (52%) patients had a mild COVID-19 disease course, 226 (28%) patients died, and the risk of death was significantly associated with advancing patient age. Al-Zhou et al. [6] evaluated the demographic characteristics and histological trends of skin cancer in Southern areas of Yemen.

Artificial Intelligence (AI) is one of the exceptional achievements of computer science, conceived around the 1940s [5, 130]. AI has marked its significance in advanced clinical diagnostics by providing unique opportunities to incorporate its tools into healthcare [4, 131]. AI aims to analyze the associations between treatment techniques and patient outcomes. In cancer research, AI has shown its potential to affect several facets of cancer therapy, improving the accuracy and speed of diagnosis and providing more reliable clinical decisions, which leads to better health outcomes [182, 183]. AI-based models have been reported to predict cancer more accurately than a general statistical expert [152, 180]. Thus, AI-based cancer detection models can assist in health centers and help medical experts confirm their medical verdicts. Hence, this article aims to highlight the contributions made by researchers applying artificial intelligence techniques to the early detection and diagnosis of cancer.

1.2 Contribution and Organization of Paper

We conducted an extensive survey of the conventional machine learning and deep learning models proposed in cancer research. The paper presents a comparative analysis of existing research on AI-based techniques, medical imaging, and automated analysis for cancer diagnosis. Most of the techniques proposed in the surveyed papers were based on deep learning frameworks and provided appreciable prediction outcomes. The paper describes cancer complications and clinical applications, cancer classification using AI-based techniques, the role of deep learning in cancer research, the limitations of cancer prediction using automated learning, multiple investigations, and the challenges of cancer research using AI-based techniques.

The rest of the paper is organized as follows. Section 2 elaborates the research methodology and discusses the approach used for selecting the literature. Section 3 highlights cancer complications and clinical applications. Section 4 presents the related work, which covers the deep learning perspective in cancer, and further discusses a comparative analysis that includes the challenges of the current work and performance evaluation using various parameters. Section 5 delivers a thorough discussion of all the investigations. Section 6 concludes the paper and discusses future directions.

2 Research Methodology

We conducted this systematic review under the PRISMA guidelines [40]. We performed a systematic search for research articles on three electronic databases: Web of Science, EBSCO, and EMBASE. These are all openly available web indexes that list the full content or metadata of academic writings. The articles were selected using the query ((Artificial Intelligence) or (Cancer Diagnosis) or (Early Detection) or (Machine Learning) or (Deep Learning)). The exclusion and inclusion criteria used to select the articles are discussed in Sect. 2.1. Figure 2 presents the PRISMA flowchart depicting the detailed screening of the collected papers.

Fig. 2

PRISMA flow chart

Articles published from 2009 to April 2021 were included in this study. In total, 350 studies were identified, and 275 remained after duplicates were removed. Subsequently, 210 papers were retained after excluding studies that focused on diseases other than cancer, on treatment and surgery, or that were written in a language other than English. After this phase, the full articles were evaluated, and those that used methods other than AI-based techniques were also excluded from further analysis. Finally, the 185 selected articles were analyzed in the study.

2.1 Investigations

  • Investigation 1: Which learning approach has most extensively provided appreciable prediction outcomes?

  • Investigation 2: Which cancer site and training data have been explored most extensively?

  • Investigation 3: In which year were most of the cancer prediction studies published?

  • Investigation 4: Which sorts of images have attained the highest prediction accuracy?

  • Investigation 5: What are the challenges faced by researchers in the construction of AI-based prediction models?

3 Cancer Complications and Clinical Applications

The DNA present inside a cell is packaged into a vast number of individual genes and carries the instructions that direct the cell's functions [15]. DNA mutations are the reason for cancer development. Normally functioning cells ultimately turn cancerous when errors disrupt a multistage process [104, 185].

Figure 3 shows different factors that affect the spread of cancers. Tobacco, alcohol, improper diet, and physical inactivity are the leading cancer risk factors worldwide. Some chronic infections are also risk factors for cancer and are particularly significant in low- and middle-income countries.

Fig. 3

Causes of cancers [26]

3.1 Cancer Complications

While undergoing cancer treatment, a patient can experience many complications that affect their health. Not all cancers are painful, but patients may still experience some pain during treatment, and several medications and other approaches help treat cancer-related pain [129, 184]. Patients can experience fatigue and many other symptoms during cancer, but these are usually manageable [3]. Tiredness occurs because of radiation therapy or chemotherapy; however, it is generally short-term. Difficulty breathing is another complication of cancer or cancer treatment [120], although treatments may bring relief. Some types of cancer and cancer treatment can also lead to nausea [34]. Cancerous cells deprive normal cells of required nutrients, which may ultimately cause weight loss. In most cases, even providing nutrients artificially via tubes into a vein or the stomach does not reverse the weight loss [21, 169]. Cancer can also cause severe complications by upsetting the normal chemical balance of the human body. Frequent urination, confusion, excessive thirst, and constipation may be signs and symptoms of chemical imbalances [46]. In some instances, cancer can cause the body's immune system to attack normal, healthy cells. This very uncommon reaction, called paraneoplastic syndrome, can bring on several signs and symptoms, such as difficulty walking and seizures [7]. Cancer can severely affect the functioning of a body part because it may press on nearby nerves. If it involves the brain, it can cause headaches, stroke-like signs and symptoms, and weakness on one side of the body [47]. Even when a patient successfully defeats cancer once, the relief may be temporary, because cancer survivors always remain at risk of recurrence [36]. The patient therefore needs to hear from the doctor about the precautions to take.

3.2 Clinical Applications

Doctors can develop a follow-up plan consisting of scans and examinations at regular intervals (over months or years) after the patient's treatment. In radiation treatment, cancerous cells are targeted [30, 54]. A significant fraction of cancer cases and deaths could be prevented through a sound epidemiological and mechanistic understanding of environmental and behavioral risk factors. Cancer therapeutics currently have the lowest clinical trial success rate of all major diseases. Because of the scarcity of successful anti-cancer drugs, cancer is set to become the leading cause of mortality in developed countries. As a disease embedded in the fundamentals of our biology, cancer presents difficult challenges that would benefit from bringing together specialists from a wide cross-section of related and unrelated fields [55]. Alongside the causes, there are factors that help identify the initial stage of cancer. Diagnosing cancer at an early stage ultimately leads to higher survival rates, less morbidity, and less expensive treatment [27]. Three essential steps need to be taken in a timely way:

  • Awareness and taking precautions

  • Clinical evaluation, diagnosis, and staging

  • Access to treatment

Early diagnosis is relevant in every setting and for most cancers. Programs can be designed to reduce delays in, and barriers to, care, allowing patients to receive treatment in time [31].

3.2.1 Current methodologies applied in the medical sector for cancer prediction

This section describes the clinical practices currently applied in the medical sector for cancer prediction. The methodologies are as follows:

  1. Screening: Screening aims to find people with a particular cancer or pre-cancer who have not developed any symptoms and to refer them quickly for diagnosis and treatment. For specific types of cancer, screening can be effective when tests are used according to need and stage [149]. Screening is, however, a more complicated process than early diagnosis. An accurate diagnosis is essential [10], mainly because every type of cancer needs its own treatment regimen that includes one or more modalities, such as chemotherapy, surgery, and radiotherapy [16]. The main aim is to treat the tumor and significantly extend lifespan; improving the patient's quality of life is also an important goal [28].

  2. Chemotherapy: The main aim of chemotherapy is to kill cancerous cells with medications that target rapidly dividing cells. The drugs used to shrink tumors can have severe side effects [71].

    • Hormone-level therapy: Hormone-level therapy alters how certain hormones act in the body. Hormones play a substantial role in prostate and breast cancers [53].

    • Immunotherapy: Immunotherapy aims to strengthen the body's immune system to fight cancerous cells. Checkpoint inhibitors and adoptive cell transfers are examples of immunotherapy [150].

    • Personalized medication: Personalized medication is a newly developed approach that uses genetic testing to determine a suitable treatment for a specific cancer. However, it has yet to be shown whether personalized medication can treat all kinds of cancers [24].

    • Radiation treatment: Radiation therapy kills cancerous cells or slows their growth by damaging their DNA. Medical experts often recommend this treatment to shrink tumors or minimize cancer symptoms before surgery [89].

    • Stem cell transplant: Stem cell transplant is helpful for blood-related cancers, such as leukemia or lymphoma. The process involves removing cells, such as red blood cells (RBCs) and white blood cells (WBCs), that have been destroyed by chemotherapy [34].

    • Surgery: Surgery is often part of the treatment plan when a person has a cancerous tumor. It is also used to limit the spread of the disease by removing lymph nodes [48].

    • Targeted therapies: Targeted therapies are used to prevent the spread of cancer and improve immunity. Small-molecule drugs and monoclonal antibodies are examples of targeted therapies [90].

4 Related Work

Over the last couple of years, artificial intelligence has captured society's imagination and created interest in its potential to improve our lives [91]. The use of AI is now increasing rapidly to improve disease recognition, disease management, and the development of therapies. The growing number of patients diagnosed with cancer and the large amount of data gathered during the treatment process [77, 119] create a need for AI to improve oncologic care. Cancer prediction can reduce the mortality rate [57, 118]. This section covers cancer diagnosis based on deep learning methods, medical imaging for cancer, the mortality rate for different cancers, cancer datasets, and automated and semi-automated methods for cancer detection.

4.1 Artificial Intelligence in Medical Imaging for Cancers Diagnosis

In clinical imaging, computer-aided detection (CADe) or computer-aided diagnosis (CADx) is a computer-based system that helps specialists make decisions rapidly [70]. Medical imaging deals with the information in images that clinicians and specialists need to assess and analyze for abnormalities in a short time [182, 183]. Clinical images processed with AI techniques can improve accuracy at various cancer stages [121]. Medical imaging is therefore a robust method for early cancer diagnosis and detection. Indeed, medical imaging has been widely used for early cancer detection, monitoring, and follow-up after treatment [44, 101, 102].

Figure 4 shows different kinds of scans used for cancer diagnosis. A computed tomography (CT) scan can help doctors diagnose cancer and determine the shape and size of the tumor. Nuclear medicine scans can help medical experts determine cancer metastasis. The most common nuclear scans are bone scans, PET (positron emission tomography) scans, thyroid scans, MUGA (multigated acquisition) scans, and gallium scans. MRI helps specialists find cancer in the body and look for signs that it has spread. X-ray imaging can additionally help specialists plan cancer treatment, such as surgery or radiation, and mammograms are low-dose X-rays that can help detect breast cancer. Detection of cancer usually includes radiological imaging that examines the extent of cancer and the improvement after treatment. Oncological imaging is steadily becoming more wide-ranging and precise [95]. Suberi et al. [162] proposed an image-based computer-aided system for cancer immunotherapy. The proposed approach enhanced the preparation of the vaccine for Dendritic Cell (DC) immunotherapy. The study incorporated various image-based algorithms into the system with low computational time.

Fig. 4

Types of imaging for cancer test

Nirupama and Damodhar [126] predicted lung cancer using MRI scans (DICOM images). Win et al. [171] developed a computer-aided decision system to detect cancer cells in cytological pleural effusion images. Initially, median filtering and intensity adjustment were applied to enhance the quality of the image. They used a hybrid segmentation method based on simple linear iterative clustering and K-means clustering to extract cell nuclei. In the K-means clustering algorithm, the error of each data point is computed as the squared Euclidean distance between the data point and the nearest centroid, and these errors are summed to obtain the total sum of squared errors, as shown in Eq. (1).

$${\text{D}} = \mathop \sum \limits_{i = 1}^{m} \mathop \sum \limits_{j = 1}^{n} \left\| {x_{j}^{\left( i \right)} - c_{i} } \right\|^{2}$$
(1)

In Eq. (1), \({\text{D}}\), \(m,\) and \(n\) represent the objective function, the number of clusters, and the number of cases, respectively. Also, \(x_{j}^{\left( i \right)}\) represents the \(j{\text{th}}\) case of the \(i{\text{th}}\) cluster and \(c_{i}\) is the centroid of the \(i{\text{th}}\) cluster. Another measure used in K-means clustering is cosine similarity, expressed mathematically in Eq. (2).

$$\cos \left( \theta \right) = \frac{a \cdot b}{{\left\| a \right\|\left\| b \right\|}}$$
(2)
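As an illustration only, the squared-error objective of Eq. (1) and the cosine similarity of Eq. (2) can be sketched in a few lines of Python. The function names and the toy points below are hypothetical and are not taken from the cited study [171]; the sketch assumes that every point has already been assigned to a cluster.

```python
import math

def squared_distance(x, c):
    """Squared Euclidean distance between a point x and a centroid c."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def kmeans_objective(clusters, centroids):
    """Sum of squared errors D; clusters[i] holds the points assigned to centroids[i]."""
    return sum(
        squared_distance(x, c)
        for points, c in zip(clusters, centroids)
        for x in points
    )

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)

# Toy data (hypothetical): two clusters of 2-D points and their centroids.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(5.0, 5.0), (5.0, 7.0)]]
centroids = [(1.0, 0.0), (5.0, 6.0)]
total_error = kmeans_objective(clusters, centroids)
similarity = cosine_similarity((1.0, 0.0), (1.0, 1.0))
```

In this toy case every point lies at distance 1 from its centroid, so the objective is simply the number of points.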

In Eq. (2), the denominator is the product of the Euclidean norms of vector \(a\) and vector \(b\). Rosalidar et al. [140] presented the asymmetrical thermal distribution on breast thermograms using computer-assisted technology. The reported work showed that current neural learning models have increased the classification accuracy of breast cancer thermograms. Taher et al. [165] worked on a CAD system to diagnose lung cancer. They used a database of 100 sputum color images of different patients collected from the Tokyo Centre of lung cancer. The new CAD system processed the sputum images and classified the cells as benign or cancerous. The study also observed that Bayesian classification performed better than rule-based heuristic classification. The Bayesian algorithm works by computing posterior probabilities, as shown in Eq. (3).

$$f\left( {c{|}x} \right) = \frac{{f\left( {x{|}c} \right)f\left( c \right)}}{f\left( x \right)}$$
(3)
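A small worked example may make Eq. (3) concrete. The priors and likelihoods below are invented for illustration and do not come from Taher et al. [165]; the evidence \(f(x)\) is obtained by summing \(f(x|c)f(c)\) over the classes.

```python
def posterior(prior, likelihood):
    """Return f(c|x) for every class c, given priors f(c) and likelihoods f(x|c)."""
    # The evidence f(x) is the sum over classes of f(x|c) * f(c).
    evidence = sum(likelihood[c] * prior[c] for c in prior)
    return {c: likelihood[c] * prior[c] / evidence for c in prior}

# Hypothetical numbers for a single observed feature value x.
prior = {"benign": 0.8, "cancerous": 0.2}
likelihood = {"benign": 0.1, "cancerous": 0.6}

post = posterior(prior, likelihood)
predicted = max(post, key=post.get)  # class with the largest posterior
```

Although the prior favors the benign class here, the observation is six times more likely under the cancerous class, so the posterior shifts toward it.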

In Eq. (3), \(f\left( c \right)\) and \(f\left( x \right)\) are the prior probabilities of the class and the predictor, respectively. Also, \(f\left( {c{|}x} \right)\) and \(f\left( {x{|}c} \right)\) denote the posterior probability of the target (\(c\)) given the predictor (\(x\)) and the probability of \(x\) given \(c\), respectively. Naeem et al. [117] introduced machine learning (ML) strategies for liver cancer classification using a fused dataset of two-dimensional (2D) computed tomography (CT) and magnetic resonance imaging (MRI) scans. The combination of the MRI and CT scan datasets produced a fused, optimized hybrid-feature dataset. The multilayer perceptron (MLP) showed a promising accuracy of 99% among all the deployed classifiers. Kalaiselvi et al. [80] proposed a fuzzy c-means method to automatically detect brain tumors from T2-weighted MRI brain images using the principle of modified minimum error thresholding (MET). Lee et al. [99] reviewed the most common cancer types, namely breast cancer, prostate cancer, lung cancer, and skin cancer. A newly proposed cloud computing framework has motivated researchers to use current work on image-based cancer analysis and develop a more flexible CAD system for detection. The study in [87] introduced an edge technique for segmenting mammographic images to identify breast cancer in its early stages. The study in [127] evaluated a computer-aided diagnosis (CADx) system for lung nodule classification. The retrospective study combined hand-crafted imaging features with machine learning algorithms and compared support vector machine (SVM) and gradient tree boosting (XGBoost) classifiers. Gradient boosting classifiers work by first computing the weighted error over the misclassified instances, as shown in Eq. (4), and then increasing the weight of the misclassified instances for the next weak learner, as shown in Eq. (5).

$$E_{p} = \frac{{\mathop \sum \nolimits_{m = 1}^{M} w_{m }^{\left( p \right)} *C\left( {s_{m} \ne \hbar_{p} \left( {s_{m} } \right)} \right)}}{{\mathop \sum \nolimits_{m = 1}^{M} w_{m }^{\left( p \right)} }}$$
(4)

Here, \(E_{p}\) denotes the weighted error of the \(p{\text{th}}\) weak learner, \(w_{m}^{\left( p \right)}\) is the weight associated with the \(m{\text{th}}\) instance, and \(M\) is the size of the dataset. The hypothesis \(\hbar_{p} \left( {s_{m} } \right)\) for each instance \(s_{m}\) is evaluated under the condition function \(C\), which equals 1 when the instance is misclassified and 0 otherwise. The weight update formula is given in Eq. (5).

$$w_{m}^{{\left( {p + 1} \right)}} = w_{m}^{\left( p \right)} * exp\left( {\mu_{p} *C\left( {s_{m} \ne \hbar_{p} \left( {s_{m} } \right)} \right)} \right)$$
(5)
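The following sketch illustrates Eqs. (4) and (5) on a toy example. It is not the implementation evaluated in [127]. The text does not define the coefficient \(\mu_{p}\), so the sketch assumes the choice \(\mu_{p} = \log\left((1 - E_{p})/E_{p}\right)\) commonly used in AdaBoost-style reweighting; the data and function names are hypothetical.

```python
import math

def weighted_error(weights, labels, predictions):
    """Eq. (4): weighted fraction of misclassified instances."""
    missed = sum(w for w, s, h in zip(weights, labels, predictions) if s != h)
    return missed / sum(weights)

def update_weights(weights, labels, predictions, mu):
    """Eq. (5): multiply the weight of every misclassified instance by exp(mu)."""
    return [
        w * math.exp(mu * (1 if s != h else 0))
        for w, s, h in zip(weights, labels, predictions)
    ]

# Toy data (hypothetical): four instances with uniform initial weights.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 0, 1, 0]
predictions = [1, 0, 0, 0]  # the third instance is misclassified

error = weighted_error(weights, labels, predictions)
# Assumption: mu follows the AdaBoost-style choice log((1 - E) / E).
mu = math.log((1 - error) / error)
new_weights = update_weights(weights, labels, predictions, mu)
```

With these numbers the single misclassified instance ends up with three times its original weight, while correctly classified instances keep theirs, so the next weak learner concentrates on the hard case.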

4.2 Deep learning methods for cancer detection

Deep learning is a sub-field of machine learning, which in turn falls under artificial intelligence. Deep learning is a technique that learns features directly from data such as text, images, or sound, and it is one of the most significant branches of AI [101, 102]. Traditional machine learning methodologies require several steps to achieve the classification task, including pre-processing, feature extraction, careful feature selection, learning, and classification [113]. The performance of these systems depends strongly on the selected features, which may not be the right features to discriminate between classes. In contrast to standard machine learning methods, deep learning enables automated learning of the features for different tasks. It can achieve learning and classification in one shot [114].

Figure 5 shows the deep learning process for cancer diagnosis and detection, which analyzes medical images in several steps. This section discusses the purpose of various deep learning models such as auto-encoders, transfer learning, Convolutional Neural Networks, Gradient Descent, Generative Adversarial Networks, and Boltzmann Machines for cancer diagnosis and detection. Yu et al. [178] developed a knowledge-based detection technique that used deep learning for lincRNA detection and DNA genome analysis [82]. A second objective was to validate the annotated lincRNA transcript regions and to test the performance of the deep learning method by comparing it with conventional techniques. For the first objective, the auto-encoder method achieved a 100% rate.

Fig. 5

Deep learning process for cancer diagnosis [1]

An auto-encoder method is composed of three main steps, as shown in Fig. 6: building, pre-training, and validating. In the first step, the basic architecture is built, including the input layer, the hidden layer, and the activation functions. Second, the encoder and the decoder are trained layer by layer following the pre-training procedure. Third, fine-tuning/validation is performed through the whole model. In short, the first step constructs the basic structure of the deep neural network, the second trains the nodes layer-wise, and the last passes through all layers for validation. Brosch et al. [35] described a method that learned from 3D brain images using a deep belief network. Their approach took low computational time and less memory. Kadam et al. [79] also proposed feature ensemble learning based on Sparse Auto-encoders and Softmax Regression for the classification of breast cancer as benign (non-cancerous) or malignant (cancerous). An auto-encoder is an artificial neural network, consisting of an encoder part and a decoder part, that is trained by unsupervised learning using back-propagation. A Sparse Auto-encoder (SA) is an auto-encoder with sparsity constraints imposed on all hidden nodes through a sparsity penalty term. The cost function for training a Sparse Auto-encoder, given by Eq. (6), includes three terms. The first term is the mean squared error, which measures the discrepancy between the input and its reconstruction over the whole training data.

$$E = {\text{MSE}} + \lambda \times \left( {\text{L2 regularization term}} \right) + \beta \times \left( {\text{sparsity regularization term}} \right)$$
(6)

where \(\lambda\) is the coefficient for the L2 regularization term and \(\beta\) is the coefficient for the sparsity regularization term.

Fig. 6

Working of auto-encoder method [126]

The Mean Squared Error (MSE) is the average squared difference between the predicted and the actual values. It is expressed mathematically in Eq. (7), where \(G\) and \(G^{i}\) are the vectors of observed and predicted values, respectively.

$$MSE = E\left[ {\left( {G_{h} \left( x \right) - G_{h}^{i} \left( x \right)} \right)^{2} } \right]$$
(7)
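To show how the three terms of Eq. (6) combine, a minimal sketch follows. The text does not give the exact form of the two regularizers, so the sketch assumes two common choices: half the sum of squared weights for the L2 term, and a Kullback-Leibler divergence between a target activation and the mean activation of each hidden unit for the sparsity term. All names, default values, and numbers are hypothetical.

```python
import math

def mse(observed, predicted):
    """Eq. (7): average squared difference between observed and predicted values."""
    return sum((g - p) ** 2 for g, p in zip(observed, predicted)) / len(observed)

def l2_term(weights):
    """Assumed L2 regularizer: half the sum of squared weights."""
    return 0.5 * sum(w * w for w in weights)

def sparsity_term(mean_activations, target=0.05):
    """Assumed sparsity regularizer: KL divergence from the target activation."""
    return sum(
        target * math.log(target / a)
        + (1 - target) * math.log((1 - target) / (1 - a))
        for a in mean_activations
    )

def sparse_autoencoder_cost(observed, predicted, weights, mean_activations,
                            lam=0.001, beta=1.0):
    """Eq. (6): MSE + lambda * L2 term + beta * sparsity term."""
    return (mse(observed, predicted)
            + lam * l2_term(weights)
            + beta * sparsity_term(mean_activations))

# Toy values (hypothetical): an input, its reconstruction, two weights,
# and the mean activations of two hidden units.
x = [1.0, 0.0, 1.0]
x_hat = [0.9, 0.1, 0.8]
cost = sparse_autoencoder_cost(x, x_hat, weights=[0.5, -0.5],
                               mean_activations=[0.05, 0.2])
```

Under these assumptions the sparsity term vanishes for a hidden unit whose mean activation equals the target and grows as the unit becomes more active, which is what pushes the hidden representation toward sparseness.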

Li [100] also proposed a practical and self-interpretable invasive cancer diagnosis solution for breast cancer. Krithiga et al. [88] carried out a systematic review on breast cancer that focused on the call for specific action in the diagnostic processes. Similarly, Bulten et al. [32] and Sajja et al. [145] also proposed a deep neural network based on GoogleNet with a maximum dropout ratio to reduce the processing time for the detection of lung cancer from CT scan images. In the proposed approach, 60% of the neurons at the fully connected layer are dropped, a higher dropout rate than in the existing GoogleNet. Experiments were conducted using three pre-trained CNN architectures, namely AlexNet, GoogleNet, and ResNet50, on the pre-processed LIDC dataset. ResNet50 produced higher accuracy than the other pre-trained architectures and the state-of-the-art methods. The main components of a deep learning architecture are the "neurons", each of which computes a weighted sum of the values of an input vector \(k\), where \(q\) denotes the column vector of weights. The working is mathematically expressed in Eq. (8).

$$z = q_{1} k_{1} + q_{2} k_{2} + q_{3} k_{3} + ... + q_{n} k_{n} = q^{t} . k$$
(8)

Further, a bias (\(b\)), which is updated with each training iteration, is added to adjust the output, as shown in Eq. (9).

$$z = q^{t} \cdot k + b$$
(9)

The functioning of neuron \(k\) in layer \(l\) is explained in Eq. (10), where \(g\) is the non-linear activation function and \(a\) denotes the resulting activations.

$$z_{k}^{\left[ l \right]} = q_{k}^{t} \cdot a^{{\left[ {l - 1} \right]}} + b_{k} , \quad a_{k}^{\left[ l \right]} = g^{\left[ l \right]} \left( {z_{k}^{\left[ l \right]} } \right)$$
(10)

The output of the network is then computed by applying the activation function to the final weighted sum, as shown in Eq. (11).

$$\hat{y} = g\left( z \right)$$
(11)
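The forward computation described by Eqs. (8) to (11) can be sketched as follows. The sigmoid is used for the non-linear function \(g\) only as an assumption, since the text does not fix a particular activation, and the weights and inputs are hypothetical.

```python
import math

def sigmoid(z):
    """Assumed choice for the non-linear function g."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(q, k, b):
    """Eqs. (8)-(9): weighted sum of the inputs k with weights q, plus the bias b."""
    return sum(qi * ki for qi, ki in zip(q, k)) + b

def layer(weight_rows, biases, a_prev, g=sigmoid):
    """Eq. (10): each neuron applies g to its weighted sum of the previous activations."""
    return [g(neuron(q, a_prev, b)) for q, b in zip(weight_rows, biases)]

# Toy network (hypothetical weights): 2 inputs -> 2 hidden neurons -> 1 output.
k = [1.0, 2.0]
hidden = layer([[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0], k)
y_hat = layer([[1.0, -1.0]], [0.0], hidden)[0]  # Eq. (11)
```

The weights are chosen so that every weighted sum is zero, which makes each sigmoid output exactly 0.5 and keeps the example easy to check by hand.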

Kassani et al. [78] proposed a deep learning-based technique using a DCNN descriptor and a pooling operation to classify breast cancer. The authors also used different data augmentation strategies to boost classification performance and investigated the effect of various stain normalization methods. The proposed approach using the pre-trained Xception model achieved 92.50% classification accuracy. Chen et al. [37] proposed a transfer learning-based snapshot ensemble (TLSE) method that integrates snapshot ensemble learning with transfer learning in a unified and coordinated way. Snapshot ensembling provides ensemble benefits within a single model training procedure, while transfer learning addresses the small-sample problem in cervical cell classification.

Figure 7 portrays the transfer learning-based snapshot ensemble strategy for cervical cell classification. The TLSE method is evaluated on a pap-smear dataset called the Herlev dataset and is shown to have several advantages over existing methods. It shows that TLSE can improve accuracy with only one training process for small samples in fine-grained cervical cell classification. Alzubaidi et al. [9] introduced a hybrid deep convolutional neural network to classify hematoxylin–eosin-stained breast biopsy images into four classes: invasive carcinoma, in-situ carcinoma, benign tumor, and normal tissue. The model combined two ideas: parallel convolutions with different filter sizes and residual connections. The prominent characteristics of the proposed model's design are a better feature representation and the combination of features at multiple levels. This study achieved a precision of 90% in predicting breast cancer. Sasikala et al. [151] classified skin cancer lesions as malignant (melanoma) or benign using a CNN. The system's performance was evaluated using the accuracy and error rate with varying learning rates. Hosny et al. [76] introduced an automatic skin lesion classification system with a higher classification rate using transfer learning and a pre-trained deep neural network. Transfer learning was applied to AlexNet in various ways, including replacing the classification layer with a softmax layer. The performance of the system was measured on the ISIC dataset and reached 93% precision. Nivaashini and Soundariya [128] proposed a system that uses a Deep Boltzmann Machine (DBM) to find an efficient set of features. A Deep Neural Network (DNN) classifier is then used to classify the tumor as benign or malignant breast cancer. The proposed system obtained a detection rate of 99.73%, higher than that of conventional machine learning models.

Fig. 7

Transfer learning-based snapshot ensemble method [37]

Figure 8 shows the typical segmentation with Deep Learning: A Convolutional Neural Network (CNN) based model is discovered. It first packs up the source picture with a heap of various convolution, actuation, and pooling layers. The inverse operation extends the compacted latent representation. The organization is kept from start to finish trainable. At the test time, a forward pass gives the segmentation labels, which first packs the information picture measurements with a heap of convolutional and pooling layers. Altaf et al. [1], Gomez et al. [59] also proposed a CNN-based breast disease diagnosis technique by utilizing thermal pictures. The creators showed that an all-around delimited data set split method is required to decrease the bias and overfitting during the training process. They likewise introduced the studies on the DMR-IR data set. Exploratory outcomes affirmed that the data set split approach limits the overfitting and bias during training. The creators also passed on that state-of-the-art benchmark of CNN models, for example, ResNet, SeResNet, VGG16, Inception, InceptionResNetV2, and Xception, the DMR-IR data set. Albahar [8] proposed a prediction model that grouped skin injuries into kind-hearted or harmful sores dependent on a novel regularize method. The proposed model accomplished a standard exactness of 97.49%, which indicated its prevalence over other state-of-the-art strategies. The presentation of CNN as far as AUC-ROC with an implanted novel regularizer was tried on various use cases. The Area under the curve (AUC) accomplished for nevus against melanoma sore is 77%. Ragab et al. [135] proposed a computer-aided diagnosis (CAD) structure for requesting thoughtful and undermining mass tumors in breast mammography pictures. The deep convolutional neural association (DCNN) is used to incorporate extraction. An outstanding DCNN design named AlexNet is used and is aligned to mastermind two classes instead of 1,000 classes. 
The last layer of the network is connected to a support vector machine (SVM) classifier to improve accuracy. The results are obtained using the following publicly available datasets: (1) the Digital Database for Screening Mammography (DDSM) and (2) the Curated Breast Imaging Subset of DDSM (CBIS-DDSM). The linear, polynomial, and radial basis function (RBF) kernels are expressed mathematically in Eqs. (12), (13), and (14), respectively.

Fig. 8 Deep learning-based CNN model for segmentation of MRI imaging [1]

$$k\left( x_{i}, w_{j} \right) = x_{i} \cdot w_{j}$$
(12)

Here, \(x_{i}\) and \(w_{j}\) are n-dimensional inputs.

$$k\left( x_{i}, w_{j} \right) = \left( x_{i} \cdot w_{j} + r \right)^{t}$$
(13)

Here, \(r\) is a constant and \(t\) is the degree of the polynomial.

$$k\left( x_{i}, w_{j} \right) = \exp\left( - \frac{\left\| x_{i} - w_{j} \right\|^{2}}{\sigma^{2}} \right)$$
(14)

Here, \(\sigma\) is the free parameter.
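The three kernels in Eqs. (12) to (14) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in the cited studies: the function names, the default values of `r`, `t`, and `sigma`, and the example vectors are assumptions made here. The RBF kernel follows Eq. (14) as written, with \(\sigma^{2}\) rather than \(2\sigma^{2}\) in the denominator.

```python
import math


def linear_kernel(x, w):
    """Eq. (12): dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, w))


def polynomial_kernel(x, w, r=1.0, t=2):
    """Eq. (13): (x . w + r)^t, with constant r and degree t (assumed defaults)."""
    return (linear_kernel(x, w) + r) ** t


def rbf_kernel(x, w, sigma=1.0):
    """Eq. (14): exp(-||x - w||^2 / sigma^2), with free parameter sigma."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, w))
    return math.exp(-squared_distance / sigma ** 2)


# Hypothetical two-dimensional inputs, chosen only for illustration.
x_i = [1.0, 2.0]
w_j = [3.0, 4.0]

print(linear_kernel(x_i, w_j))                   # 11.0
print(polynomial_kernel(x_i, w_j, r=1.0, t=2))   # 144.0
print(rbf_kernel(x_i, x_i))                      # 1.0 (identical inputs)
```

The RBF value decays from 1 toward 0 as the two inputs move apart, and a larger `sigma` slows that decay.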

Saraf and Kalpana [148] presented work on classifying benign and malignant thyroid nodules in ultrasound images. The authors performed pre-processing, segmentation, feature extraction, and classification for thyroid detection. Edge detection techniques were used for segmentation, and malignant nodules were detected using an ANN. Similarly, Dov et al. [51] presented work on predicting thyroid malignancy from ultra-high-resolution whole-slide cytopathology images. A deep-learning-based algorithm was used for the cytopathologist's task of diagnosing the slides. The proposed algorithm assigns local malignancy scores to the relevant image regions, which are then combined into a global malignancy score. The reported performance using the MIL method is an area under the curve (AUC) of 0.87 and an average precision (AP) of 0.743. Ma et al. [106] proposed a CNN to diagnose thyroid diseases using SPECT images. The proposed method used a modified DenseNet architecture together with an improved training method. The accuracy achieved is 99.08% for Graves' disease, 99.25% for Hashimoto's disease, and 99.67% for subacute thyroiditis. Sokoutil et al. [161] presented work on detecting tumors in the thyroid gland. The reported work uses an image processing technique together with a simple intelligent system such as the hill-climbing algorithm. Malathi et al. [107] presented a CNN method for the segmentation of brain tumors and achieved high prediction accuracy. The authors of [132] compared three segmentation algorithms and proposed a Random Forest (RF) classifier and a convolutional neural network; RF and CNN yielded an average Dice coefficient (DC) of 0.862 and 0.876, respectively. The RF classification method computes the information gain for a split using entropy (E), which is expressed in Eq. (15). Here, \(y\) is the number of classes (binary or multi-class) and \(\rho_{n}\) is the probability that an instance belongs to class n.

$$E = - \sum\limits_{n = 1}^{y} \rho_{n} \log_{2} \rho_{n}$$
(15)
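To make Eq. (15) concrete, the following sketch computes the entropy of a set of class labels and the information gain of a candidate binary split, which is how a decision tree inside a Random Forest could score that split. The function names and the toy labels are assumptions for illustration, the class probabilities \(\rho_{n}\) are estimated as simple label frequencies, and every label list is assumed to be non-empty.

```python
import math
from collections import Counter


def entropy(labels):
    """Eq. (15): E = -sum_n rho_n * log2(rho_n), with rho_n the class frequency."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two child nodes."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    )
    return entropy(parent) - weighted_children


# Hypothetical node with two benign ("b") and two malignant ("m") samples.
parent = ["b", "b", "m", "m"]

print(entropy(parent))                                      # 1.0
print(information_gain(parent, ["b", "b"], ["m", "m"]))     # 1.0 (perfect split)
print(information_gain(parent, ["b", "m"], ["b", "m"]))     # 0.0 (uninformative split)
```

A split that separates the classes completely recovers the full parent entropy as gain, while a split that leaves both children as mixed as the parent gains nothing.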

Image processing techniques have been widely used in various health sectors, especially for detecting and diagnosing cancer early. Huidrom et al. [75] proposed a fully automated lung segmentation method with juxta-pleural nodule inclusion, consisting of two main stages. In the first stage, the lung region was extracted (lung field extraction); in the second stage, the lungs were segmented using boundary analysis and segmentation techniques. Their proposed method yielded better results than the existing ones. Asideu et al. [12] proposed a technique in which features were automatically extracted and classified for acetic acid and Lugol's iodine cervigrams. The study employed various techniques for combining the features in cervigrams and used a support vector machine model to classify them. Cheng et al. [38] used a CAD system to detect and classify breast cancer in four stages, i.e., pre-processing, segmentation, feature extraction, and feature classification. Patil et al. [131] presented an automated system for building a mammogram-based breast cancer detection model with improved hybrid classifiers. The proposed breast cancer detection consists of four steps: image processing, tumor segmentation, feature extraction, and diagnosis. The authors of [122] introduced an automated multi-strategy-based lung nodule detection and classification system aimed at reducing false positives at the early stages. Cui et al. [41] proposed a method to detect lung nodules in chest CT images with an improved DICOM window display. In this experiment, nodule detection achieved a sensitivity of 92.65% with 0.2468 false positives (FPs) per scan.

4.3 Comparative Analysis

This section compares the studies of different researchers on cancer detection using AI techniques. The prediction outcomes are compared on the basis of parameters such as accuracy, sensitivity/recall, precision, specificity, Dice score, and area under the curve (AUC). Figure 9 describes these evaluation parameters.

Fig. 9 Evaluation parameters
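As a rough illustration of how these evaluation parameters relate to one another, the sketch below derives accuracy, sensitivity/recall, specificity, precision, and the Dice score from the four confusion-matrix counts of a binary classifier. It uses the standard textbook definitions, which are assumed here to match those in Fig. 9; the function names and the toy labels are hypothetical, and all denominators are assumed to be non-zero. AUC is not covered because it requires continuous prediction scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn


def evaluation_metrics(tp, tn, fp, fn):
    """Standard definitions; assumes every denominator is non-zero."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),       # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 5 1 1
for name, value in evaluation_metrics(tp, tn, fp, fn).items():
    print(f"{name}: {value:.3f}")
```

For segmentation tasks the same Dice formula applies per pixel, with the predicted and reference masks flattened into label lists.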

Table 1 comprises the comparative analysis based on multiple evaluation parameters for various cancer types.

Table 1 Comparative analysis using AI techniques for different cancers

As shown in the comparative analysis, many research works on cancer diagnosis and detection using conventional machine learning and deep learning methods have been analyzed. Most of the deep learning techniques performed well and achieved high accuracy in terms of the prediction scores obtained. In addition, most of the research articles were published recently (2020), and most of the studies addressed the diagnosis of breast cancer.

5 Discussion

In the current review, we have presented recently published research studies that employed AI-based learning techniques for predicting malignancy. This study highlights research works on cancer diagnosis prediction and on predicting the post-operative life expectancy of cancer patients using AI-based learning techniques.

  • Investigation 1: Which learning approach has provided appreciable prediction outcomes most extensively?

    AI-based techniques have contributed significantly to the field of cancer research. The research works mentioned in the literature have focussed mainly on deep learning techniques, and deep learning classifiers have dominated over machine learning models in this field. Among deep learning models, Convolutional Neural Networks (CNN) have been used most commonly for cancer prediction; approximately 41% of studies used a CNN to classify cancer. Neural networks (NN) and Deep Neural Networks (DNN) have also been used extensively in the literature. Apart from deep learning approaches, ensemble learning techniques (Random Forest classifier, weighted voting, Gradient Boosting Machines) and support vector machines (SVM) are the most widely used in the literature. The distribution of literature based on AI-based prediction models is shown in Fig. 10.

  • Investigation 2: Which cancer site and training data have been explored most extensively? Most of the research papers explored in this review focused on automated cancer diagnosis and prediction. The most extensively explored sites are the breast (22) followed by the kidney (17). Other than breast and kidney, most researchers have worked on brain, colorectal, cervical, and prostate cancer prediction. Figure 11 depicts the distribution of the research works based on cancer sites.

    The type of data used to train the prediction model significantly affects the performance of the model. The reliability and the prediction outcomes depend on the data used to train the classification model. Most of the research studies reviewed in this paper have used Magnetic Resonance Imaging (MRI). The second most commonly used data type is Computed Tomography (CT) scan images. Other image types, such as dermoscopic, mammographic, endoscopic, and pathological images, were also used in the literature. Figure 12 highlights the distribution of papers based on the type of data used to train the prediction model.

  • Investigation 3: In which year were most of the cancer prediction studies published?

Fig. 10 AI-Based Prediction Models

Fig. 11 Cancer site-wise distribution of papers

Fig. 12 Distribution of papers based on the type of training data

The research works published between 2009 and April 2021 are selected in this review article. Figure 13 shows the distribution of the articles by year of publication. Most of the research works were published in the years 2020 (35), 2019 (32), and 2018 (30). There are few papers from the year 2021 because we could only extract papers published up to April 2021. Based on the analysis of Fig. 13, we can conclude that the number of research studies has increased gradually in recent years.

  • Investigation 4: Which types of images have attained the highest prediction accuracy? Most of the studies have used MRI images for cancer diagnosis prediction. Approximately 23% of the literature used Computed Tomography scans for training the model. Many studies have also employed mammographic, endoscopic, and pathological images. Low contrast in CT scan images makes the classification task difficult because it becomes hard to differentiate the object from the background. Some cancers, such as prostate cancer and certain liver cancers, are difficult to detect using a CT scan. In such scenarios, Digital Imaging and Communications in Medicine (DICOM) images generated from MRI can help achieve the purpose with greater prediction accuracy.

    Regarding the classification models used for specific cancers: Convolutional Neural Network models have been used to predict almost every type of cancer, such as brain, colorectal, skin, thyroid, and lung cancer. Most of the studies that explored the prediction of breast cancer diagnosis used hybrid models or novel approaches for the purpose. Neural networks have also been applied to almost all breast and cervical cancer datasets. For stomach cancer, only Convolutional Neural Networks have been used. Support vector machines have been used for the prediction of liver and breast cancer. In a nutshell, Convolutional Neural Networks can be applied to different datasets, and ensemble learners have been used with almost every kind of cancer.

  • Investigation 5: Challenges faced by the researchers in the construction of AI-based prediction models.

Fig. 13 Year-wise distribution of papers

Although AI-based techniques have marked their significance in the field of cancer prediction research, there are still many challenges faced by the researchers that need to be addressed.

  i. Limited data size: The most common challenge faced by most of the studies was insufficient data to train the model. A small sample size implies a smaller training set, which cannot adequately validate the efficiency of the proposed approaches. A sufficiently large sample trains the model better than a limited one.

  ii. High dimensionality: Another data-related issue faced in cancer research is high dimensionality, which refers to a very large number of features relative to the number of cases. Although multiple dimensionality reduction techniques [155] are available to deal with this issue, a generic approach to handling it is still required.

  iii. Class imbalance problem: A leading challenge faced by medical data sets, especially cancer data, is the uneven distribution of classes. Class imbalance arises from a mismatch in the sample size of each class. Classification models tend to be biased towards the class with the majority of samples. Most of the existing techniques handle the imbalance well on binary classes but fail in multi-class patterns.

  iv. Computational time: About 90% of studies have preferred deep learning approaches over other techniques to predict cancer using medical images. However, deep learning-based approaches are highly complex. About 41% of the studies have used the CNN classifier, which has performed well but at the cost of high computational time and space.

  v. Efficient feature selection technique: Many studies have achieved exceptional prediction outcomes. However, a computationally effective feature selection method is still required to eliminate the data cleaning procedures while achieving high cancer prediction accuracy.

  vi. Model generalizability: A shift in research towards improving the generalizability of the model is required. Most of the studies have proposed a prediction model that is validated on a single site. There is a need to validate the models on multiple sites, which can help improve the model's generalizability.

  vii. Clinical implementation: AI-based models have proved their dominance in cancer research; still, the practical implementation of the models in clinics has not yet been realized. These models need to be validated in a clinical setting to assist medical practitioners in confirming diagnoses.
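The class imbalance problem described in item iii can be illustrated with a small sketch of inverse-frequency class weighting, one common way to counteract a classifier's bias towards the majority class. This is not a technique taken from the reviewed studies: the function name, the class labels, and the 90/10 split are assumptions chosen for illustration. The formula used, number of samples divided by (number of classes times class count), is the usual "balanced" weighting heuristic.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced data set: 90 benign cases and 10 malignant cases.
labels = ["benign"] * 90 + ["malignant"] * 10

weights = inverse_frequency_weights(labels)
print(weights["malignant"])            # 5.0
print(round(weights["benign"], 3))     # 0.556
```

With these weights, each class contributes the same total weight to a weighted loss (here 50 for each class), so errors on the rare malignant class are no longer swamped by the benign majority. The same formula extends to more than two classes, although the source notes that multi-class imbalance remains harder to handle in practice.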

6 Conclusions and Future Directions

This review summarizes the various research directions for AI-based cancer prediction models. AI has marked its significance in healthcare, especially in cancer prediction. The paper provides a critical and analytical examination of current state-of-the-art approaches to cancer diagnosis and detection, including a thorough examination of the machine learning and deep learning models used for early cancer detection from medical imaging. AI techniques play a significant role in early cancer prognosis and detection, using machine learning and deep learning to extract and classify disease features. Our study found that most previous works employed deep learning techniques, especially Convolutional Neural Networks. Another significant observation is that most studies worked on breast cancer data. We also observed that deep learning models perform better on classification metrics such as AUC, sensitivity, Dice coefficient, and accuracy when they are applied to pre-processed and segmented medical images. There is scope to work on the early detection of head and neck cancers, because fewer studies have been conducted on these cancer types. Also, a federated learning model can be used for cancer detection based on distributed datasets. Hence, we intend to use a federated learning model for cancer detection by creating a decentralized training model for cancer datasets held in remote places. This study also highlights the challenges faced by researchers in the construction of AI-based prediction models. Although multiple studies have reported significant results, the challenges in cancer research still need to be addressed in the future.
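As a minimal sketch of what decentralized training over remote cancer datasets might involve, the code below shows the aggregation step of federated averaging: each site trains locally and shares only its model parameters, and a coordinator averages them weighted by local data set size. The review does not specify an aggregation scheme, so the choice of federated averaging, the function name, and the two hypothetical hospital sites are all assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local sample counts.

    client_weights: list of equal-length parameter lists, one per site.
    client_sizes:   number of training samples held at each site.
    """
    total_samples = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total_samples
        for i in range(n_params)
    ]


# Hypothetical parameters from two hospitals holding 100 and 300 local samples.
site_a = [1.0, 2.0]
site_b = [3.0, 4.0]

global_weights = federated_average([site_a, site_b], [100, 300])
print(global_weights)  # [2.5, 3.5]
```

Only the parameter vectors leave each site in this scheme; the patient images themselves stay local, which is the property that makes the approach attractive for distributed medical data.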