1 Introduction

Breast cancer is one of the most prevalent and harmful malignancies in the world. Owing to its complex biological roots in the mammary gland, the disease has become a significant public health concern, highlighting the critical need for accurate and sophisticated diagnostic tools. A closer look at the nature of breast cancer reveals that comprehension of the disease's molecular underpinnings is essential to appreciating its complexity.

An intricate combination of epithelial cells, stromal elements, and an extensive network of ductal structures, the mammary gland is an organ that is universally recognized for its dual functionalities in breastfeeding and as a repository of glandular tissue. Genetic, hormonal, and environmental variables interact intricately to induce neoplastic transformation of the mammary gland [1]. Significantly, breast cancer is mostly caused by the malignant transformation of epithelial cells, which are the backbone of the mammary gland. These cells experience gradual genetic changes that trigger the uncontrolled expansion and multiplication that characterizes cancer.

Pre-invasive lesions, such as Ductal Carcinoma in Situ (DCIS) and Lobular Carcinoma in Situ (LCIS), are the first signs of breast cancer. In these lesions, the epithelial cells remain confined within the ductal or lobular boundaries but exhibit aberrant proliferation and concerning atypia [2]. If left unchecked, these cancerous precursors have the potential to penetrate the surrounding tissue and spread to other organs. Thus, a fundamental challenge in the study of breast cancer biology is comprehending the transition from benign epithelial cells to highly invasive and metastatic cancer cells.

Distinct molecular subtypes of breast cancer are further subdivided by the presence or absence of certain receptors on the cancer cells. These receptors, including the Human Epidermal Growth Factor Receptor 2 (HER2), Progesterone Receptor (PR), and Oestrogen Receptor (ER), provide valuable information on the biology of the tumour and have an impact on therapeutic strategies [3]. For example, HER2-positive breast cancers overexpress the HER2 protein, making them sensitive to targeted therapy, whereas hormone receptor-positive breast cancers depend on oestrogen and progesterone signalling.

Concurrently, the tumour microenvironment, comprising stromal cells, extracellular matrix, and immunological components, has attracted considerable attention. Tumour growth and treatment outcomes are significantly influenced by the dynamic interaction between cancer cells and their microenvironment [4]. This microenvironment also enables tumour invasion, angiogenesis, and immune evasion.

Research endeavours in the modern era have harnessed the potential of innovative technology in their pursuit of early diagnosis and intervention. Among these, one of the most critical weapons in the fight against breast cancer is medical imaging, especially mammography. A key component of early detection is mammography, which examines the anatomical and morphological features of breast tissue.

With an emphasis on advanced Convolutional Neural Networks (CNNs), a deep learning technique that leverages the potential of computational biology, this research study focuses on the realm of breast cancer diagnosis [5]. By combining the complexity of breast cancer biology with the accuracy of modern artificial intelligence, our work has the potential to significantly advance the early detection and treatment of this debilitating illness.

In this work, we propose a new method using CNNs to analyse mammography images, allowing for more accurate detection of cancerous areas [6]. We aim to improve clinical decision-making and, ultimately, patient outcomes by bringing the molecular complexities of breast cancer into the digital domain.

The outcomes of this study are summarized as follows:

  1. Developing a CNN model with integrated EarlyStopping and ReduceLROnPlateau callbacks for improved breast cancer classification.

  2. Enhancing the model's interpretability and feature extraction capabilities to ensure accurate identification of intricate cancerous tissue patterns.

  3. Optimizing the training process to minimize computational costs and expedite the diagnostic timeline for streamlined clinical decision-making.

  4. Validating the CNN model's efficacy across diverse patient demographics and imaging modalities to ensure its applicability in a real-world clinical setting.

The remainder of the paper is structured as follows: Sect. 2 provides a review of the literature. Section 3 introduces the workflow of the proposed methodology. Section 4 presents the results and discussion, along with a comparison to state-of-the-art existing works, and Sect. 5 concludes the study with future directions.

2 Related Work

Breast cancer detection systems have seen significant advancements in recent years, driven by the potential to enhance early diagnosis and improve patient outcomes. These systems typically employ various medical imaging techniques, such as mammography, ultrasound, and magnetic resonance imaging (MRI), to identify potential malignant regions within breast tissue. Machine learning and deep learning approaches have been instrumental in refining the accuracy and efficiency of these systems.

Recent research has focused on developing more accurate and robust breast cancer detection systems. For instance, researchers have designed hybrid models that combine CNNs with other techniques, such as transfer learning and attention mechanisms, to improve the model's ability to distinguish between benign and malignant tissue. Additionally, the integration of multimodal data, including both imaging and clinical data, has shown promise in enhancing the performance of breast cancer detection systems.

Several studies have also explored the potential of explainable AI models, which not only provide accurate predictions but also generate interpretable results. This is crucial for medical professionals who need to understand the basis of a system's diagnosis.

The Wisconsin Breast Cancer Diagnostic Data Set contains fine needle aspirate (FNA) cytology samples of breast tissue, serving as a valuable resource for distinguishing between benign and malignant samples based on extracted features. Similarly, the Breast Cancer Wisconsin (Original) Data Set offers digitized histological images of breast tissue, enabling researchers to develop classification models for the differentiation of benign and malignant breast tumours.

In addition, the Breast Cancer TCGA Provisional Clinical Data Release provides a comprehensive dataset that includes both clinical and genomic information from breast cancer patients. This dataset serves as a fundamental tool for understanding the correlations between various clinical characteristics, molecular features, and patient outcomes. On the other hand, the CBIS-DDSM Breast Cancer Dataset offers high-resolution mammograms alongside corresponding pathology reports, playing a critical role in the development and evaluation of computer-aided diagnosis (CAD) systems for breast cancer. Finally, the MIAS Mammographic Database consists of digitized mammograms and pathology reports, serving as a crucial resource for the research and development of computer-aided detection and diagnosis systems, as well as for the evaluation of various image processing and machine learning algorithms. These datasets collectively provide a solid foundation for the development of advanced diagnostic tools and models, aiming to improve the early detection and treatment of breast cancer, thereby making significant strides in the fight against this life-threatening disease. Table 1 shows the different datasets used for breast cancer classification.

Table 1 Different Publicly Available Breast Cancer Datasets

Building on these datasets, several researchers have carried out notable studies; the most significant of these are summarized in Table 2.

Table 2 Research in Breast Cancer Detection Using CNN

In previous methodologies, it was observed that the dataset can be refined and made more accurate [14]. With modest changes through masking and cropping, it was observed that fusing complementary techniques into the CNN can reduce outliers and make the technique more accurate [15].

A close look at the existing methodologies shows that several architectures have been proposed, but these architectures often do not fit the available datasets well, or the datasets prove too heterogeneous for them.

Over the span of 2021 to 2023, a series of studies focused on refining breast cancer detection methods. These investigations were characterized by a diverse array of methodologies and dataset applications. Employing sophisticated techniques such as deep learning models, transfer learning strategies, and specialized processing algorithms tailored to unique datasets, these studies made significant strides in improving the accuracy of breast cancer detection. The reported accuracies varied notably, ranging from 74% to an impressive high of 98.02%. Researchers utilized a spectrum of models and approaches, including U-Net, Mask RCNN, ensemble techniques, and the introduction of novel CNN architectures specifically designed to accommodate and leverage the distinct features of various datasets. This variety in methodologies and datasets underscores the continuous endeavor to optimize detection accuracy and highlights the multifaceted nature of breast cancer detection research during this period.


3 Methodology

This study optimizes breast cancer detection using Convolutional Neural Networks (CNNs). Diverse medical images underwent uniform preprocessing. The CNN model, designed for precision, included essential layers and dropout. The training involved loss functions and dynamic learning rate adjustment. Statistical analysis provided insights into the model's performance, laying the groundwork for future research. Figure 1 depicts the workflow of the proposed model.

Fig. 1
figure 1

Workflow of the proposed model

3.1 Data Collection and Pre-processing

The first phase of our methodology involved the acquisition of a diverse and comprehensive dataset encompassing various breast imaging modalities, including mammograms, histopathological images, and radiological scans. The dataset was sourced from reputable repositories and clinical databases, ensuring the inclusion of high quality, annotated medical images representative of diverse patient populations and disease subtypes. The dataset encompassed a wide range of patient demographics, including age, ethnicity, and clinical history, thereby enabling a holistic understanding of the multifaceted nature of breast cancer and its manifestations across different population groups. Figure 2 shows some sample images from the dataset.

Fig. 2
figure 2

Sample images from dataset

Subsequently, the acquired dataset underwent a meticulous pre-processing phase, where a series of standardized procedures were implemented to ensure data uniformity, quality, and compatibility for subsequent analysis. The pre-processing pipeline encompassed the following key steps:

3.1.1 Data Cleaning and Standardization

The raw dataset was thoroughly inspected for any inconsistencies, missing values, or data irregularities. Any erroneous or incomplete entries were meticulously removed or corrected to ensure the integrity and reliability of the dataset. Moreover, standardized protocols were applied to normalize the data formats, resolutions, and image orientations, facilitating seamless integration and analysis across different imaging modalities. The abnormalities in cancer are recorded in Table 3 and Fig. 3 depicts the same.

Table 3 Abnormalities in cancer
Fig. 3
figure 3

Abnormalities in cancer

3.1.2 Image Rescaling and Normalization

Given the variations in image resolutions and pixel intensities across different datasets, a standardized image rescaling and normalization procedure was implemented to maintain uniformity and consistency in the dataset. Images were rescaled to a predefined resolution, ensuring uniform dimensions and pixel densities for all samples. Furthermore, intensity normalization techniques were employed to enhance the comparability and interpretability of image features, minimizing any potential biases introduced by variations in pixel intensities. Figure 4 shows the cropped image.

Fig. 4
figure 4

Cropped image
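The rescaling and intensity normalization described above can be sketched with NumPy as follows. This is illustrative only: the paper does not specify the exact resampling method or normalization, so a nearest-neighbour resize and min-max normalization are assumed here (production pipelines typically use library resizing with interpolation).

```python
import numpy as np

def rescale_nearest(img, out_h, out_w):
    """Nearest-neighbour resize to a fixed resolution (stand-in for library resizing)."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h   # source row index for each output row
    cols = np.arange(out_w) * w // out_w   # source column index for each output column
    return img[rows][:, cols]

def normalize_intensities(img):
    """Min-max normalize pixel intensities to [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)
```

Applying both functions to every image yields samples of uniform shape and comparable intensity range, as required for batching.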

3.1.3 Annotation and Labelling

Each image in the dataset was meticulously annotated and labelled by experienced radiologists and pathologists to facilitate supervised learning and classification tasks. Annotations encompassed a comprehensive assessment of key pathological features, including tumour characteristics, tissue textures, calcification patterns, and lesion margins. The detailed annotations provided crucial ground truth information for training and validating the CNN model, enabling the accurate classification and identification of cancerous and non-cancerous breast tissue. Fig. 5 shows the annotated images.

Fig. 5
figure 5

Annotated image

3.1.4 Data Augmentation

To mitigate potential overfitting and enhance the robustness of the CNN model, data augmentation techniques were implemented to artificially expand the dataset and introduce variations in the training samples. Augmentation strategies included random rotations, flips, zooms, and shifts, thereby generating augmented images with diverse perspectives and orientations. The augmented dataset served to enrich the learning capacity of the CNN model, enabling it to generalize effectively to unseen data and variations in imaging parameters. Figure 6 shows the mammogram image. Table 4 shows the dataset images and Fig. 7 depicts the distribution.

Fig. 6
figure 6

Mammogram image

Table 4 Dataset images
Fig. 7
figure 7

Dataset distribution

Image Rotation: The rotation operation rotates an image by a specified angle.

Image Shift: The image shift operation shifts an image by a specified number of pixels along the width and height dimensions.

Shear Transformation: The shear transformation changes the shapes of objects in an image, like tilting.

Image Zoom: The zoom operation zooms into or out of an image by a specified factor.

Horizontal Flip: The horizontal flip operation flips the image along the y-axis.
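The operations above can be sketched with NumPy primitives. This is a minimal illustration under simplifying assumptions: rotation is restricted to 90-degree steps and the shift wraps around, whereas an augmentation library applies arbitrary angles with interpolation and padding.

```python
import numpy as np

def augment(img, rng):
    """Randomly flip, rotate (90-degree steps), and shift a 2-D image."""
    if rng.random() < 0.5:
        img = np.fliplr(img)                      # horizontal flip along the y-axis
    img = np.rot90(img, k=rng.integers(0, 4))     # rotation
    dy, dx = rng.integers(-2, 3, size=2)
    img = np.roll(img, (dy, dx), axis=(0, 1))     # shift along height and width
    return img
```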

3.2 Feature Extraction and Representation

Upon completion of the data preprocessing phase, we employed advanced feature extraction and representation techniques to capture salient visual patterns and discriminative image features indicative of breast cancer. The feature extraction process involved the following methodological steps:

3.2.1 Filter Banks and Convolutional Kernels

A comprehensive set of filter banks and convolutional kernels were applied to the pre-processed images to extract localized visual features and texture descriptors. The filter banks encompassed a diverse array of spatial filters, including Gabor filters, Sobel filters, and Laplacian of Gaussian (LoG) filters, enabling the extraction of edge orientations, texture gradients, and fine-scale image structures. The convolutional kernels facilitated the convolutional feature mapping of the input images, capturing hierarchical representations of local and global image features essential for accurate cancer classification. Figure 8 shows the ROI Masked Image.

Fig. 8
figure 8

ROI masked image

3.2.1.1 Feature Descriptor Encoding

The extracted image features were encoded and represented using various feature descriptor encoding techniques, including histogram of oriented gradients (HOG), scale-invariant feature transform (SIFT), and local binary patterns (LBP). The feature descriptor encoding facilitated the quantization and aggregation of local image descriptors, enabling the construction of robust and discriminative feature vectors representing the underlying visual patterns and textures characteristic of cancerous tissue regions. Additionally, feature vector normalization and dimensionality reduction techniques were applied to enhance the efficiency and effectiveness of the subsequent classification models.
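As an illustration of one of these descriptors, a basic 8-neighbour local binary pattern can be computed as below. This is a didactic sketch; libraries such as scikit-image provide optimized and rotation-invariant LBP variants.

```python
import numpy as np

def lbp_basic(img):
    """Basic LBP: for each interior pixel, build an 8-bit code from
    comparisons with its eight neighbours (clockwise from top-left)."""
    h, w = img.shape
    c = img[1:-1, 1:-1]
    code = np.zeros(c.shape, dtype=int)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (nb >= c).astype(int) << bit
    return code
```

A histogram of these per-pixel codes over a tissue region then serves as the texture feature vector.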

3.2.1.2 Feature Fusion and Dimensionality Reduction

To enhance the richness and discriminative power of the extracted image features, feature fusion and dimensionality reduction methods, such as principal component analysis (PCA) and linear discriminant analysis (LDA), were employed. Feature fusion techniques enabled the integration of multi-modal image features, including texture, shape, and intensity-based descriptors, into a unified feature representation, facilitating the comprehensive characterization of cancerous tissue regions. Furthermore, dimensionality reduction techniques were applied to reduce the computational complexity and memory footprint of the feature vectors, enabling efficient model training and classification without compromising the representational capacity of the extracted features. The different calcification types are shown in Table 5.
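The PCA step, for instance, can be implemented via the singular value decomposition of the centred feature matrix. This is one standard formulation; LDA and the feature-fusion step follow analogous linear-algebra recipes.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project feature vectors (rows of X) onto their top principal components."""
    Xc = X - X.mean(axis=0)                      # centre each feature dimension
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # scores in the reduced space
```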

Table 5 Calcification type

3.3 Model Architecture and Development

Following the feature extraction and representation phase, we proceeded to develop a sophisticated CNN architecture tailored specifically for the automated classification of breast cancer. The CNN model was designed to extract intricate image features, patterns, and textures indicative of malignant tumours, enabling the accurate discrimination between cancerous and non-cancerous breast tissue. The CNN architecture comprised the following key components and layers:

3.3.1 Convolutional Layers

A series of convolutional layers were integrated into the CNN model to perform feature extraction from the input images as seen in the below equation. Each convolutional layer utilized a set of learnable filters to convolve across the input image, capturing spatial hierarchies and local patterns indicative of cancerous lesions and tissue abnormalities. The convolutional layers were equipped with rectified linear unit (ReLU) activation functions to introduce non-linearities and enhance the model's representational capacity for complex image features. The convolution operation can be defined as shown in equation (1).

$$S\left(i,j\right)=\left(I*K\right)\left(i,j\right)={\sum }_{m}{\sum }_{n}I\left(i+m,j+n\right)\cdot K\left(m,n\right)$$
(1)

where S(i,j) is the output of the convolution operation, I is the input image, and K is the kernel (or filter).
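Equation (1) translates directly into code; the sketch below is a didactic NumPy implementation of the valid-mode convolution (deep learning frameworks perform the same operation with heavily optimized kernels).

```python
import numpy as np

def conv2d_valid(I, K):
    """S(i, j) = sum_m sum_n I(i + m, j + n) * K(m, n), as in equation (1)."""
    kh, kw = K.shape
    oh, ow = I.shape[0] - kh + 1, I.shape[1] - kw + 1
    S = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            S[i, j] = np.sum(I[i:i + kh, j:j + kw] * K)
    return S
```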

3.3.1.1 Pooling Layers

To reduce the spatial dimensions and computational complexity of the feature maps, pooling layers, including max-pooling and average-pooling operations, were incorporated into the CNN architecture. The pooling layers enabled the downsampling of the feature maps, retaining salient features while minimizing the effects of spatial variations and noise. The pooling layers contributed to the model's ability to capture robust and invariant representations of cancerous tissue characteristics, facilitating enhanced generalization and model performance.
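A 2 × 2 max-pooling operation, for example, halves each spatial dimension; a minimal NumPy sketch is shown below (average pooling is identical with `mean` in place of `max`).

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h, w = x.shape
    x = x[:h - h % size, :w - w % size]          # trim so windows tile exactly
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))
```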

3.3.1.2 Batch Normalization

To stabilize and accelerate the training process, batch normalization layers were integrated into the CNN architecture. The batch normalization layers facilitated the normalization of input activations within each mini batch, reducing internal covariate shifts and accelerating the convergence of the optimization process. By standardizing the input distributions, batch normalization enhanced the overall stability and efficiency of the CNN model, leading to improved gradient flow and convergence rates during training. The batch normalization operation for each dimension d of the input can be defined as in equation (2).

$$\widehat{{x}^{\left(d\right)}}=\frac{{x}^{\left(d\right)}-E\left[{x}^{\left(d\right)}\right]}{\sqrt{Var\left[{x}^{\left(d\right)}\right]+\epsilon }}*{\gamma }^{\left(d\right)}+{\beta }^{\left(d\right)}$$
(2)

where x(d) is the input to the batch normalization layer for a specific dimension d, x̂(d) is the normalized output for dimension d, E[x(d)] denotes the mean of the values in dimension d of the input over the current mini-batch, Var[x(d)] represents the variance of the values in dimension d of the input over the current mini-batch, γ(d) is a learnable scaling factor for dimension d, β(d) is a learnable shifting factor for dimension d, and ϵ is a small value added to the variance to avoid division by zero.
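Equation (2) can be verified with a small NumPy sketch that normalizes each dimension over a mini-batch (illustrative; frameworks additionally track running statistics for use at inference time).

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Equation (2): rows of x are samples in the mini-batch, columns are dimensions d."""
    mean = x.mean(axis=0)                        # E[x^(d)]
    var = x.var(axis=0)                          # Var[x^(d)]
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```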

3.3.1.3 Fully Connected Layers

A series of fully connected layers were appended to the CNN architecture to facilitate the integration of extracted features and enable high-level abstractions necessary for the final classification task. The fully connected layers encompassed a configurable number of neurons, each connected to every neuron in the preceding layer, enabling comprehensive feature aggregation and nonlinear transformations. Furthermore, the fully connected layers were equipped with dropout and activation functions to mitigate overfitting and introduce non-linear decision boundaries, respectively.

Max Pooling Layer: The max pooling operation can be defined as selecting the maximum value from a local region. For example, if the pooling size is 2x2, it selects the maximum value from a 2x2 window.

Flatten Layer: The flatten operation reshapes the 2D output of the previous layer into a 1D array.

Dense Layer: The dense layer is a fully connected layer in the neural network. The output of a dense layer can be calculated using equation (3).

$$y=f\left(Wx+b\right)$$
(3)

where W is the weight matrix, x is the input vector, b is the bias vector, and f is the activation function.

Dropout Layer: The dropout layer randomly sets a fraction of input units to 0 at each update during training, which helps prevent overfitting.
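Equation (3) and the dropout rule can be sketched together in NumPy. Inverted dropout, which rescales the surviving units so the expected activation is unchanged, is assumed here; the paper does not state which variant was used.

```python
import numpy as np

def dense(x, W, b, activation=lambda z: np.maximum(z, 0.0)):
    """Equation (3): y = f(Wx + b); ReLU is used as the default f."""
    return activation(W @ x + b)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)
```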

3.3.1.4 Model’s Output Layer

The output layer of the CNN model comprised a configurable number of neurons corresponding to the target classification categories, namely, cancerous, and non-cancerous breast tissue. The output layer was activated using a SoftMax activation function, enabling the computation of class probabilities and the final classification decision. The predicted class probabilities facilitated the identification of the class label for a given input image, thereby enabling the accurate and automated detection of breast cancer with high sensitivity and specificity. Table 6 shows the Layer-wise Parameters for CNN Training and Figure 9 depicts the layered structure.
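The SoftMax activation that produces these class probabilities can be written compactly; subtracting the maximum logit before exponentiation is a standard numerical-stability trick.

```python
import numpy as np

def softmax(z):
    """Map logits z to probabilities that are positive and sum to one."""
    e = np.exp(z - z.max())
    return e / e.sum()
```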

Table 6 Layer-wise parameters for CNN training
Fig. 9
figure 9

Detailed network structure

3.4 Model Training and Optimization

The CNN model was trained using a combination of supervised learning algorithms and optimization techniques to optimize the model parameters and enhance its predictive capabilities. The model training and optimization phase encompassed the following key procedures:

3.4.1 Loss Functions

Cross-entropy loss functions were employed to quantify the dissimilarity between the predicted class probabilities and the ground truth labels. The cross-entropy loss facilitated the computation of the model's prediction errors and enabled the adjustment of the model parameters to minimize classification discrepancies and enhance predictive accuracy.

3.4.1.1 Optimization Techniques Employed

Stochastic gradient descent (SGD) and its variants, including Adam and RMSprop, were employed as the primary optimization algorithms for training the CNN model. The optimization algorithms facilitated the iterative update of the model parameters based on the computed loss gradients, enabling the minimization of the loss function and the enhancement of the model's predictive performance. Additionally, learning rate schedules and momentum adjustments were incorporated to optimize the convergence rates and prevent the model from getting stuck in local minima.

The model is compiled using the Adam optimizer as in equation (4) and binary cross-entropy loss function as in equation (5).

$${\uptheta }_{{\text{t}}+1}={\uptheta }_{{\text{t}}}-\frac{\upeta }{\sqrt{{{\text{v}}}_{{\text{t}}}}+\upepsilon }\odot {{\text{m}}}_{{\text{t}}}$$
(4)

where θt is the parameter vector at time step t, η is the learning rate, vt is the exponentially weighted average of the squared past gradients, mt is the exponentially weighted average of the past gradients, ⊙ denotes the element-wise product, and ϵ is a small constant to avoid division by zero.

$$L\left(y,\hat{y}\right)=-\left(y\log\left(\hat{y}\right)+\left(1-y\right)\log\left(1-\hat{y}\right)\right)$$
(5)

where y is the true label (either 0 or 1) and ŷ is the predicted probability of the positive class.
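Equation (5), averaged over a batch, is a few lines of NumPy; the clipping guard against log(0) is an implementation detail, not part of the equation.

```python
import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-12):
    """Equation (5) averaged over samples; y in {0, 1}, y_hat in (0, 1)."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))))
```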

Early Stopping: Early stopping is a technique to prevent overfitting. It stops training the model if there is no improvement in the validation loss for a certain number of epochs.

EarlyStopping: monitor = 'val_loss', patience = 3, restore_best_weights = True

Learning Rate Reduction (ReduceLROnPlateau): Learning rate reduction adjusts the learning rate if there is no improvement in the validation loss for a certain number of epochs.

Learning Rate Reduction: monitor = 'val_loss', factor = 0.2, patience = 5, min_lr = 0.001
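The core logic behind these two callbacks can be sketched in plain Python. This is an illustrative re-implementation, not the Keras classes themselves; note that with the settings above, the stopping patience of 3 is reached before the learning-rate patience of 5.

```python
class TrainingController:
    """Tracks validation loss per epoch; stops early and shrinks the learning rate on plateaus."""

    def __init__(self, lr=0.01, es_patience=3, lr_patience=5, factor=0.2, min_lr=0.001):
        self.lr = lr
        self.best = float("inf")
        self.wait = 0
        self.es_patience = es_patience
        self.lr_patience = lr_patience
        self.factor = factor
        self.min_lr = min_lr

    def on_epoch_end(self, val_loss):
        """Return True if training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
            return False
        self.wait += 1
        if self.wait >= self.lr_patience:          # plateau long enough: reduce the learning rate
            self.lr = max(self.lr * self.factor, self.min_lr)
        return self.wait >= self.es_patience       # no improvement for es_patience epochs: stop
```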

3.4.2 Hyperparameter Tuning

A systematic hyperparameter tuning process was conducted to optimize the CNN model's architecture and learning parameters. Hyperparameters, including learning rates, batch sizes, dropout rates, and activation functions, were fine-tuned to achieve optimal model convergence and generalization capabilities. The hyperparameter tuning process involved the implementation of grid search and random search strategies, enabling the identification of the most effective parameter configurations for maximizing the model's classification accuracy and robustness.
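A minimal grid-search driver is sketched below over a hypothetical search space; the paper does not report its exact grids, so the values shown are placeholders.

```python
from itertools import product

# Hypothetical search space -- placeholder values, not the paper's actual grid.
GRID = {
    "learning_rate": [1e-2, 1e-3],
    "batch_size": [16, 32],
    "dropout_rate": [0.25, 0.5],
}

def grid_search(evaluate, grid):
    """Evaluate every hyperparameter combination; return the best config and score."""
    best_cfg, best_score = None, float("-inf")
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = evaluate(cfg)                     # e.g. validation accuracy
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

Random search replaces the exhaustive product with a fixed number of randomly sampled configurations, which scales better when the grid is large.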

3.4.2.1 Model Evaluation

The trained CNN model was rigorously evaluated using a comprehensive set of performance metrics, including accuracy as shown in equation (6), precision as shown in equation (7), recall as shown in equation (8), F1 score as shown in equation (9), model’s accuracy as shown in equation (10) and model’s loss as shown in equation (11). The performance metrics provided quantitative insights into the model's classification efficacy, sensitivity to true positives, specificity to true negatives, and overall discriminative power [16,17,18].

  • True Positive (TP): The count of instances correctly predicted as positive.

  • True Negative (TN): The count of instances correctly predicted as negative.

  • False Positive (FP): The count of instances incorrectly predicted as positive when they are actually negative (Type I error).

  • False Negative (FN): The count of instances incorrectly predicted as negative when they are actually positive (Type II error).

Accuracy:

$${\text{Accuracy}}=\frac{{\text{TP}}+{\text{TN}}}{{\text{TP}}+{\text{TN}}+{\text{FP}}+{\text{FN}}}$$
(6)

Precision:

$${\text{Precision}}=\frac{{\text{TP}}}{{\text{TP}}+{\text{FP}}}$$
(7)

Recall:

$${\text{Recall}}=\frac{{\text{TP}}}{{\text{TP}}+{\text{FN}}}$$
(8)

F1 Score:

$${\text{F}}1\mathrm{ Score}=2\times \frac{{\text{Precision}}\times {\text{Recall}}}{{\text{Precision}}+{\text{Recall}}}$$
(9)

Model Accuracy:

$$\mathrm{Model\, Accuracy}=\frac{\mathrm{Number\, of\, Correct\, Predictions}}{\mathrm{Total\, Number\, of\, Predictions}}$$
(10)

Model Loss:

$$\mathrm{Model\, Loss}=\mathrm{Loss \,Function}\left(\mathrm{True\, Values},\mathrm{Predicted\, Values}\right)$$
(11)
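Equations (6) to (9) follow directly from the four confusion-matrix counts:

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 score per equations (6)-(9)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```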

3.5 Proposed Algorithm

The algorithm outlines the process for building a CNN model used for identifying patterns in medical images, particularly for breast cancer detection. The first step involves setting up the structure of the CNN model, which includes defining layers responsible for learning features from the input images. These layers work together to understand the complex patterns within the images, aiding in accurate classification.

To optimize the training process, the algorithm incorporates two vital components: ReduceLROnPlateau and EarlyStopping. ReduceLROnPlateau dynamically adjusts the learning rate during training to ensure the model continues to improve its performance even as it approaches a plateau. On the other hand, EarlyStopping helps prevent the model from overfitting by stopping the training process when the model's performance no longer improves significantly.

During the training phase, the model learns from a dataset of labeled images, continuously updating its internal parameters to improve its ability to classify between cancerous and non-cancerous images. The algorithm evaluates the model's performance using various metrics to assess its accuracy. Additionally, it visualizes the model's learning progress over time, providing insights into how well the model is learning and whether it might be overfitting to the training data. These measures collectively ensure that the model is effective and reliable for breast cancer detection in real-world scenarios.

figure a

3.6 Statistical Analysis and Interpretation

The outcomes of the CNN-based breast cancer classification were subjected to rigorous statistical analysis and interpretation to derive meaningful insights and conclusions from the experimental results. Descriptive statistics, including mean values, standard deviations, and confidence intervals, were computed to summarize the distribution and variability of the performance metrics across different model configurations and datasets. Moreover, inferential statistical tests, such as t-tests, ANOVA, and chi-square tests, were employed to assess the statistical significance of observed differences and variations in the model's predictive performance. The statistical analysis enabled the identification of key trends, patterns, and correlations in the experimental data, facilitating the formulation of evidence-based conclusions and recommendations for future research directions.

Through the comprehensive methodology outlined above, we aim to establish a robust and reliable framework for the development and evaluation of CNN-based diagnostic models for breast cancer detection. By integrating advanced data preprocessing techniques, sophisticated CNN architectures, and rigorous performance evaluation protocols, we endeavour to contribute significantly to the ongoing advancements in AI-driven medical imaging and diagnostics. The methodology serves as a foundational framework for the subsequent analysis and discussion of the experimental results, enabling a comprehensive understanding of the implications and potential applications of CNN-based breast cancer detection in clinical practice and research settings.

4 Experimental Results and Discussion

The experimental results of our study revealed the exceptional performance and robustness of the proposed CNN-based breast cancer classification model in accurately differentiating between cancerous and non-cancerous breast tissue. The model achieved outstanding classification accuracy, with an overall accuracy of 95.2% on the independent testing dataset, demonstrating its superior capacity to effectively identify and classify malignant lesions with high precision and sensitivity. The precision and recall scores for cancerous tissue classification were 94.8% and 95.6%, respectively, underscoring the model's ability to minimize false positives and false negatives and enable reliable cancer detection with minimal diagnostic errors. Table 7 depicts the confusion matrix.

Table 7 Confusion matrix
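To make the reported metrics concrete, the following sketch computes a confusion matrix, accuracy, precision, and recall with scikit-learn. The labels below are small synthetic stand-ins, not the counts in Table 7.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Synthetic ground-truth and predicted labels (1 = cancerous, 0 = benign);
# illustrative only, not the study's data.
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

cm = confusion_matrix(y_true, y_pred)  # rows: true class, cols: predicted class
print(cm)
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
```

Precision controls the false-positive rate and recall the false-negative rate, which is why the paper reports both alongside overall accuracy.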

Moreover, the CNN model generalized well, maintaining consistent and reliable performance across patient demographics, imaging modalities, and disease subtypes. This consistency across diverse population groups and imaging datasets supports its potential for widespread clinical applicability in real-world diagnostic settings. Figure 10 plots accuracy against training epoch, and Figure 11 plots loss against training epoch.

Fig. 10 Model accuracy during training and testing

Fig. 11 Model loss during training and testing
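Curves like those in Figures 10 and 11 are typically produced from the per-epoch training history. A minimal sketch is shown below; the history values are synthetic placeholders, not the study's measurements.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; no display required
import matplotlib.pyplot as plt

# Hypothetical per-epoch metrics (e.g. as returned by a framework's
# training history); illustrative values only.
epochs = list(range(1, 11))
history = {
    "accuracy":     [0.70, 0.78, 0.83, 0.87, 0.90, 0.92, 0.93, 0.94, 0.95, 0.952],
    "val_accuracy": [0.68, 0.75, 0.80, 0.85, 0.88, 0.89, 0.91, 0.92, 0.93, 0.94],
}

plt.plot(epochs, history["accuracy"], label="training")
plt.plot(epochs, history["val_accuracy"], label="testing")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.savefig("accuracy_curve.png")  # analogous code yields the loss curve
```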

Furthermore, our comparative analysis showed that the proposed CNN model surpasses conventional machine learning algorithms and existing deep learning architectures in classification accuracy and predictive performance. The CNN outperformed traditional machine learning models, including support vector machines (SVMs) and random forests, by a significant margin, reflecting its ability to exploit complex image features and hierarchical representations for precise and reliable cancer classification. It also outperformed sequence-oriented deep learning architectures such as recurrent neural networks (RNNs) and long short-term memory networks (LSTMs), confirming its suitability for image-based diagnostic tasks that require high-dimensional feature extraction and analysis.
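A baseline comparison of this kind can be sketched as follows, using scikit-learn's built-in (tabular) breast-cancer dataset as a stand-in for the paper's image pipeline, which cannot be reproduced here. The resulting accuracies are illustrative only and are not the paper's numbers.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Tabular stand-in dataset (30 features per sample, binary labels).
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Fit each baseline and record held-out accuracy.
results = {}
for name, clf in [("SVM", SVC()),
                  ("RandomForest", RandomForestClassifier(random_state=42))]:
    clf.fit(X_tr, y_tr)
    results[name] = accuracy_score(y_te, clf.predict(X_te))
    print(f"{name}: {results[name]:.3f}")
```

For the image data of the study, the baselines would instead be fed flattened pixels or hand-crafted features, which is precisely the representational disadvantage the CNN avoids.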

The interpretability and explainability analysis showed that specific visual features and texture patterns indicative of breast cancer, including irregular tissue margins, spiculated masses, and microcalcifications, were the key discriminative factors behind the model's classification of malignant tumours. The CNN captured and emphasized these visual cues, identifying subtle pathological characteristics that are often imperceptible to the human eye. Table 8 shows the classification report. By exposing the model's decision-making process, the interpretability analysis improves the transparency and trustworthiness of its diagnostic predictions and supports clinical acceptance and adoption.
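One common way to probe which image regions drive a classifier's prediction is occlusion sensitivity: slide a masking patch over the image and record how much the model's score drops. The sketch below uses a dummy `score` function as a stand-in for the trained CNN's malignancy probability, and a synthetic image; it is illustrative only and is not the attribution method used in the study unless stated there.

```python
import numpy as np

def score(img):
    # Hypothetical model: responds to bright pixels near the centre,
    # mimicking a lesion-sensitive classifier.
    return img[16:32, 16:32].mean()

# Synthetic 48x48 image with a bright "lesion" region.
img = np.zeros((48, 48))
img[20:28, 20:28] = 1.0

patch, stride = 8, 4
positions = range(0, 48 - patch + 1, stride)
base = score(img)
heatmap = np.zeros((len(positions), len(positions)))

for i, y in enumerate(positions):
    for j, x in enumerate(positions):
        occluded = img.copy()
        occluded[y:y + patch, x:x + patch] = 0.0  # mask one patch
        heatmap[i, j] = base - score(occluded)    # importance = score drop

print(heatmap.shape)  # large entries mark regions that drove the prediction
```

Regions whose occlusion causes the largest score drop correspond to the discriminative features the model relies on, which is the kind of evidence behind the visual cues discussed above.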

Table 8 Classification report

Figure 12 depicts the classification report on the Breast Histopathology Images (BHS) dataset.

Fig. 12 Classification report on BHS

Table 9 compares the proposed model with existing state-of-the-art works; the proposed model outperforms all of them with an accuracy of 95.2%.

Table 9 Comparative analysis

Overall, the experimental results indicate that the proposed CNN-based classification model is a reliable, accurate, and interpretable tool for automated breast cancer detection and diagnosis. Its robustness, generalization capacity, and strong performance, combined with its interpretability and explainability, make it a promising means of improving the efficiency, accuracy, and accessibility of breast cancer screening and early detection, and thereby of improving patient outcomes and clinical decision-making in oncology and radiology.

5 Conclusion

In summary, this research presented and evaluated a CNN model for breast cancer classification. With an accuracy of 95.2%, the model shows strong potential for accurate and reliable detection, offering a valuable tool for early diagnosis and improved patient outcomes. By leveraging complex image features together with interpretability, it provides transparent and trustworthy results, marking a meaningful step forward in diagnostic capability for breast cancer. This study lays a foundation for future AI-driven diagnostic solutions, contributing to the evolution of precision medicine and personalized healthcare.