1 Introduction

Breast cancer is the most prevalent cancer among adults, with over 2.3 million cases occurring annually. In 95% of countries, breast cancer ranks as the first or second most frequent cause of cancer-related death in women (World Health Organization, 2023). Breast imaging is crucial for detecting breast cancer at an early stage and for monitoring and assessing the effectiveness of treatment. Research indicates (Balkenende et al., 2022) that Deep Learning (DL) algorithms demonstrate comparable or superior performance to radiologists in breast cancer imaging; however, extensive clinical trials are still necessary, particularly for ultrasound, to precisely determine the added benefits of DL in this field. Ultrasound is a safe, painless, and non-invasive screening technique compared to procedures that use ionizing radiation, such as X-rays in mammography. It is also fast and somewhat less expensive. Considering these facts, more and more young women choose an ultrasound examination instead of a mammography.

When abnormalities are detected by other imaging modalities or on palpation, ultrasonography is utilized to detect and diagnose breast lesions due to its benefits, such as safety, accessibility, and low cost. Moreover, breast ultrasound is expected to become an additional screening technique for women with mammographically dense breasts. This screening approach is anticipated to identify tumours at an early stage and to decrease the risk of women dying of breast cancer. Therefore, the proposed intelligent system identifies breast lesions in ultrasound images. In the first phase, the system locates the tumour, if any. In the second phase, it fits the tissue into one of three categories: healthy, benign, or malignant.

Automatic tumour segmentation and classification remain difficult because of the high noise, low contrast, weak or blurry boundaries, and significant quantity of shadows in breast ultrasound images. Recent works in medical image segmentation, such as Das et al. (2022), utilize non-deep-learning techniques like hybrid ellipse fitting based on bounded opening and Fast Radial Symmetry. These methods are ingenious but can face challenges, particularly when dealing with images that exhibit high variability, noise, and complex morphological structures. Similarly, traditional techniques reviewed in Ganesan et al. (2013), like thresholding and edge detection, are foundational for understanding the limitations of non-deep-learning methods: they often require manual tuning and may not effectively handle the subtle nuances present in medical images, such as overlapping tissues or varying densities. These limitations are significant given the precision required for medical diagnosis, where deep learning approaches like CNNs demonstrate superior performance. Their capacity to learn representations that capture underlying data distributions makes deep learning methods indispensable in modern medical image analysis.

Efficiency is vital for the system, given the delicate issue it has to address. A large and complex dataset is essential for proper algorithm training. Considering the limited public data available at the moment, the system initially employs a data augmentation step to enhance the dataset, followed by image segmentation and classification steps to ensure robust performance. The augmentation phase consists of creating new images to handle the limited and unbalanced dataset. These new images are built by a Generative Adversarial Network (GAN), a recent development in Machine Learning able to yield new data instances that mimic the existing training data. The UNet model, a popular deep architecture in medical imaging for disease detection and diagnosis, is utilized for image segmentation. Further, a classic Convolutional Neural Network is employed for image classification. This work addresses the significant challenge of diagnosing breast cancer from ultrasound images using deep learning models by focusing on data limitations, on end-to-end model validation, and on the impact of data on performance. Firstly, public datasets in this domain are often limited in size and imbalanced, hindering model training. This work proposes a novel GAN-based data augmentation approach, generating realistic and diverse synthetic images to enrich the dataset and achieve performance comparable to state-of-the-art (SOTA) methods. Secondly, this research introduces and validates a novel end-to-end deep learning model for breast cancer segmentation and classification. The model streamlines the analysis process by combining segmentation and classification tasks into a single framework, offering advantages in efficiency compared to traditional approaches employing separate models. Finally, this research investigates the impact of data quality and quantity on the performance of the proposed model. The systematic investigation of data augmentation techniques, including the proposed GAN-based approach, highlights the potential benefits of data augmentation for overcoming limitations in public datasets.

This work proposes a novel approach to address the challenges of limited and imbalanced datasets in breast cancer segmentation using ultrasound images. A Generative Adversarial Network (GAN)-based data augmentation technique that generates realistic and diverse synthetic images is introduced to enhance the training data and improve the performance of the proposed end-to-end deep learning model. This continuous augmentation strategy aims to address data imbalances and increase training data diversity. By employing this novel GAN-based approach, the main aim is to contribute to the field by overcoming data limitations and fostering further research into reliable and efficient breast cancer diagnosis tools. This technique brings consistent benefits to the learning process and helps the proposed model outperform the state-of-the-art model developed in Yap et al. (2019).

In summary, the contributions of this paper are three-fold:

  • the design, implementation, and validation of the end-to-end model, followed by a comparison with other models from the literature. The validation and comparison, performed on a benchmark dataset, indicate the efficiency and robustness of the proposed system.

  • an investigation of the training data’s quality and quantity, since the automatic learning of the end-to-end model is somewhat restricted by the limited and unbalanced ground truth provided by the radiologists. Observations reveal that the generated data consistently enhance the learning process. A GAN-based method is introduced for automatic augmentation, addressing the challenges of unbalanced and limited datasets, a common issue in the medical field.

  • the experimental results on a dataset that contains the required annotations (segmentation masks and classification labels) show that the proposed end-to-end model outperforms Yap et al. (2019), even though images with a Dice score lower than 0.5 were also considered when computing the overall accuracy.

Therefore, the proposed developments support answering the research questions addressed in this study:

  • RQ\(_1\): What is the most efficient approach to designing a robust system for breast tumour identification and characterization: a sequential system or an end-to-end one?

  • RQ\(_2\): How do the quality and quantity of training data impact the performance of automatic staging of breast tumours?

The paper is structured as follows: Section 2 outlines the significance of this research. Section 3 gives a brief overview of algorithms used for breast cancer identification. Section 4 presents the proposed approach, including the preprocessing, augmentation, segmentation, and classification stages. Section 5 describes the dataset, the applied performance metrics, and the conducted experiments, together with their results. Section 6 provides answers to the research questions proposed in the introduction. Finally, Section 7 contains the conclusions, a short evaluation of the proposed end-to-end model, and proposals for future improvements.

2 Research Significance

This proposed work addresses the significant challenge of diagnosing breast cancer from ultrasound images using deep learning models by focusing on:

  • addressing data limitations: public datasets in this domain are often limited in size and imbalanced, hindering model training. This work proposes a novel GAN-based data augmentation approach, generating realistic and diverse synthetic images to enrich the dataset and achieve performance comparable to state-of-the-art (SOTA) methods.

  • end-to-end model validation: this research introduces and validates a novel end-to-end deep learning model for breast cancer segmentation and classification. The model streamlines the analysis process by combining segmentation and classification tasks into a single framework, offering advantages in efficiency compared to traditional approaches employing separate models.

  • investigating data impact on performance: this research investigates the impact of data quality and quantity on the performance of the proposed model. The systematic investigation of data augmentation techniques, including the proposed GAN-based approach, highlights the potential benefits of data augmentation for overcoming limitations in public datasets.

In summary, this research contributes to the field by:

  • proposing a novel approach that leverages Generative Adversarial Networks (GANs) to address limited and imbalanced public datasets.

  • systematically investigating the impact of data quality and quantity on the performance of the proposed end-to-end model.

  • prioritizing computational efficiency while maintaining accuracy, considering the sensitive nature of breast cancer diagnosis.

  • conducting a rigorous evaluation of the proposed end-to-end model by validating its performance on an established benchmark dataset.

3 Related Work

Ultrasound imaging is used increasingly often, and radiologists spend a very long time examining large volumes of these images. This has become a major problem in many countries because it increases medical expenses and worsens the quality of medical services.

According to MD et al. (2019), in recent years AI has revolutionized medical research for detecting and diagnosing cancer. These methods use different types of algorithms, e.g., Convolutional Neural Network (CNN) architectures and learning procedures, for cancer classification, and have achieved outstanding performance. Lately, Deep Learning technologies have been applied to radiological images, for example, to detect tuberculosis in chest X-rays, lung nodules, or cranial tumours in MRI. Also, in the case of breast cancer, recent advanced AI methods have proved useful in analysing various medical modalities (ultrasound images, MRIs, CTs). In the case of breast ultrasound images, datasets with sizes comparable to those used in the current numerical experiments have been cited 325 times (Al Saleh et al., 2021).

According to Roslidar et al. (2019), research on the classification of breast cancer based on histological images using CNNs has reached an accuracy of 98%. Using mammography, some studies in which Convolutional Neural Networks were applied to tumour classification achieved an accuracy of 97%. Meanwhile, an improved approach using a support vector machine (Chen et al., 2017) for processing ultrasound images achieved 76.8% accuracy in binary classification, the two classes being benign and malignant. Considering these results, and the fact that the number of studies on ultrasound images is significantly lower than the number using mammography, this article focuses on improving the performance of ultrasound cancer diagnosis.

Next, approaches focused on tumour identification are presented, grouped by image type: mammography or breast ultrasound images (BUSI). In some approaches, the use of pre-trained models helps to speed up the process of adapting networks, leading to faster problem-solving.

Table 1 Performance of various models discriminating between benign and malignant breast tumours
Table 2 Performance of two detectors of benign and malignant breast tumours

Several models have been developed to detect and discriminate between various breast tumours in mammography images. In Ragab et al. (2019), the authors used the AlexNet network to achieve binary classification, which they modified by introducing Support Vector Machines on the last layer. They also used a threshold-based method to automate the segmentation process. They successfully classified benign and malignant tumours, working on mammographic datasets (the DDSM (Heath et al., 2007) and CBIS-DDSM (Lee et al., 2016) datasets) and obtaining an accuracy of 87.2%. Levy and Jain (Lévy & Jain, 2016) compared AlexNet, GoogLeNet, and a simple CNN architecture, to which they added transfer learning techniques, batch normalization, preprocessing, and augmentation. Their best model achieves 0.934 recall and 0.924 precision on the DDSM dataset (Heath et al., 2007) for discriminating between benign and malignant tumours. Jung et al. (2018) proposed the RetinaNet network, with weights pre-trained on an in-house dataset (GURO), to demonstrate that pre-trained models achieve performance similar to models trained directly on the public INbreast dataset (Moreira et al., 2012), which contains both benign and malignant tumours. Their detection model obtained an average false positive rate of 0.34. In William Hang and Hannun (2017), an adversarial network was used to detect tumours. The study was conducted on a convolutional network, followed by Conditional Random Fields (CRF) for structured learning. The adversarial model was also used to control the overfitting that could occur, since the authors had access to a small amount of data. They used two datasets, INbreast (Moreira et al., 2012) and DDSM-BCRP (Heath et al., 2007), on which they obtained 66.18% accuracy, which can be considered state-of-the-art performance for discriminating between benign, malignant, and healthy tissue. In another paper, Bakkouri and Afdel (Bakkouri & Afdel, 2017) proposed a new discriminative technique for supervised learning, using the Softmax layer as a classifier. The network was improved using Gaussian pyramids to highlight the regions of interest. Results on DDSM (Heath et al., 2007) and BCDR (Oliveira et al., 2011) revealed an accuracy of 97.28% on 2 classes: benign and malignant.

In what follows, the results obtained on breast ultrasound images are presented, starting from approaches already reviewed in T et al. (2020).

Several models have been built, and their performance in differentiating benign and malignant breast tumours has been evaluated. The input of these models is represented by B-mode images (Han et al., 2017; Fujioka et al., 2019; Mango et al., 2020) or shear wave elastography images (Zhang et al., 2016; Fujioka et al., 2020), while the architecture varies from traditional backbones such as GoogLeNet (Han et al., 2017; Fujioka et al., 2019) or DenseNet (Fujioka et al., 2020) to various Boltzmann machines (Mango et al., 2020). Even though the results are promising (Table 1), no access to the datasets used in these experiments is provided (Mango et al., 2020). Regarding the systems able to perform the detection task (that is, localizing and categorizing the lesion in an image), different input types (hand-held B-mode images (Cao et al., 2019) or automated breast B-mode images (Jiang et al., 2018)) and different detection models (YOLO or SSD (Cao et al., 2019)) can be identified. The experiments’ datasets are not published, even though the findings are encouraging (Table 2).

In addition to detectors, some methods enable high-precision identification of all the pixels that belong to a particular object in the image, even if they are sometimes computationally intensive. The task is called semantic segmentation, and the ML algorithms learn a label (from a prefixed set) for every pixel of an image. Regarding breast lesion segmentation, many CNN architectures used in various approaches are based on the UNet model, one of the most popular neural networks in the medical field because it produces a satisfactory segmentation with very few data samples available (Ronneberger et al., 2015). For instance, in Zhuang et al. (2019), the authors proposed a U-Net-based model able to segment benign and malignant tumours in breast ultrasound images. They trained and validated their model on a large dataset (but only a part of it is public) and obtained very good performance in segmenting various lesions in breast images.

Other approaches are based on deep architectures different from UNet. In Hu et al. (2019), the authors combined a dilated fully convolutional network with a phase-based active contour model to segment breast tumours, achieving a Dice score of 88.97%. Recently, Kumar et al. (2020) used a contextual-information-aware deep adversarial learning framework to propose an effective model for breast tumour segmentation in BUS images. In this framework, they applied a deep learning paradigm to capture both the textural features and the contextual dependencies in the BUS images. Two datasets were involved in their experiments: a semi-public dataset of BUS images (without ground truth masks, which constrains the possibility of replicating their results) and a public dataset. Even though the results on the public dataset indicate good performance of this approach (a Dice score over 86%), its segmentation accuracy was limited on some BUS images.

Ensemble methods have been investigated for breast cancer classification, achieving promising results. For instance, the study titled "The stratified K-folds cross-validation and class-balancing methods with high-performance ensemble classifiers for breast cancer classification" (T R et al., 2023) employed stratified K-folds cross-validation alongside ensemble classifiers. While their approach addressed class imbalance to some degree, the authors recognized the potential for bias in highly imbalanced datasets. To tackle this issue, they proposed the incorporation of the Synthetic Minority Over-sampling Technique (SMOTE). This technique generates synthetic data points for the minority class by interpolation: specifically, by taking a weighted average between an original data point and one of its nearest neighbours. This approach achieved an accuracy of 99.3% and a precision of 99.2% for the majority-voting ensemble, demonstrating the potential of SMOTE to improve model performance when dealing with class imbalance.
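To make the interpolation step concrete, the following minimal NumPy sketch shows how one SMOTE-style synthetic sample can be produced; the function name is illustrative, and a complete implementation (such as the SMOTE class in the imbalanced-learn library) would also perform the nearest-neighbour search and class bookkeeping.

```python
import numpy as np

def smote_sample(x: np.ndarray, x_neighbour: np.ndarray,
                 rng: np.random.Generator) -> np.ndarray:
    """Create one synthetic minority-class sample by interpolating
    between a minority point and one of its nearest neighbours."""
    lam = rng.uniform(0.0, 1.0)          # random interpolation weight in [0, 1]
    return x + lam * (x_neighbour - x)   # point on the segment between the two samples

# Example: interpolate two minority-class feature vectors
rng = np.random.default_rng(42)
synthetic = smote_sample(np.array([1.0, 2.0]), np.array([3.0, 6.0]), rng)
```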

Building upon the foundation laid by hybrid CNN models and ensemble frameworks, recent studies published in Biomedical Signal Processing and Control (Sahu et al., 2023, 2024) have made significant strides in deep learning for breast cancer detection using ultrasound images. These studies have introduced innovative approaches, such as combining efficient deep CNN networks into hybrid models that utilize weight factors for enhanced accuracy and speed, and ensembling powerful transfer learning models like AlexNet, ResNet, and MobileNetV2 for their synergistic benefits. The application of advanced image processing techniques, such as Laplacian-of-Gaussian-based modified high-boosting filters, further refines the quality of ultrasound images, leading to more precise detection capabilities. Moreover, the focus on small datasets, as highlighted in the proceedings of the 22nd International Conference on Intelligent Systems Design and Applications (ISDA’22), showcases the evolving ability of deep learning frameworks to achieve remarkable performance even with limited data availability (Sahu et al., 2023). These developments highlight the continuous advancements in deep learning architectures and image processing techniques, contributing significantly to both the reliability and the efficiency of diagnostic processes for breast cancer through ultrasound imaging.

As previously mentioned, substantial advancements have been achieved in the domain of artificial intelligence (AI)-enabled detection of breast cancer through ultrasound imaging. However, many important challenges still prevent it from being widely adopted and achieving its full potential:

  • Data Quantity and Quality - small and potentially biased datasets: many studies rely on limited and retrospective datasets, which can restrict the generalizability of the models and potentially introduce bias into the results. These limitations can lead to models with reduced accuracy, reliability, and generalizability, hindering their real-world application in clinical settings (Fujioka et al., 2020).

  • Overfitting: the concern of overfitting, a limitation often discussed in the realm of deep learning models, remains relevant for ultrasound-based breast cancer detection. Even when working with larger datasets, there is a risk of models becoming overly focused on the specific characteristics present in their training data, leading to reduced generalizability and poor performance on unseen images (Sahu et al., 2023).

This article aims to address these challenges by utilizing data augmentation techniques to artificially create diverse variations of existing data points. This method allows models to learn from a richer and more varied representation of the underlying patterns, potentially improving generalizability and performance without requiring extensive data collection. To the best of our knowledge, the approaches mentioned previously have focused on a single task: either classification, detection, or segmentation of tumours. Furthermore, the numerical experiments have been conducted on datasets that provide only task-specific information, such as tumour labels, bounding boxes, or masks. The proposed end-to-end approach is able to perform both tumour segmentation and tumour categorization. Nevertheless, for this purpose, a corresponding dataset (with both labels and masks) must be considered.

Fig. 1 Pipeline diagram

4 Materials and Methods

This study utilized a publicly available dataset of breast ultrasound images, obtained from Dataset-BUSI-with-GT (W et al., 2020). As the dataset was publicly accessible and did not involve direct interactions with human participants or animals, formal review and approval were not applicable. However, ethical considerations associated with the original collection of the data were considered.

The main aim is to develop an end-to-end decision support system that automatically performs both steps (cancer segmentation and stratification) and could be an efficient solution (in terms of quality of predictions, but also speed) to the unmet medical need for proper delimitation and characterization of the tumour. The input of the intelligent system is a breast ultrasound image. In the first phase, the intelligent system locates the lesion, if any. In the second phase, it fits the tissue into one of several categories (e.g., normal, benign, malignant). Using a dataset that includes ultrasound breast images of 600 women aged between 25 and 75 years, the intention is to validate the hypothesis that an end-to-end system outperforms the two-stage systems developed so far in the literature. In addition to its computational efficiency (reduced time and space complexity), the proposed system can improve the process of cancer identification because of its architecture: being end-to-end, the training of the decision core algorithm benefits simultaneously from both cost functions, which measure the quality of predictions in terms of lesion segmentation and lesion discrimination. Furthermore, the entire learning procedure behind the AI algorithm is agnostic to the input type or size; the current results are obtained on 2-dimensional B-mode ultrasound images, with new experiments planned on SWE ultrasound images and 3D tomosynthesis images. The ground truth data required by such a system must include two annotations for every breast image: the location of the lesion and its type. To the best of our knowledge, such datasets are not available elsewhere, except for the investigated set of images. This lack of similar datasets could affect the validity of the proposed system, as there are currently no other sets on which to test and confirm its robustness. However, good results were obtained when validating specific components of the approach.

Figure 1 illustrates a graphical overview of the applied pipeline. Starting with the preprocessing steps, continuing with the augmentation, followed by segmentation combined with classification through 4 intermediate layers, the end-to-end model was obtained. Two filters (a Gamma correction and a Gaussian blur) and thresholding are applied during the preprocessing stage (more details are given in Section 4.2). To handle the small and unbalanced initial dataset and to increase the training data required by the segmentation model, a GAN is employed in the augmentation step to generate new images (the details of the image generation process are given in Section 4.3). The next step partitions the images into specific and meaningful regions and is performed by a UNet-based model (see Section 4.4). Finally, the segmented regions are labelled as benign, malignant, or normal by a trained classifier (see Section 4.5). By integrating segmentation and classification into a single and efficient framework, the proposed approach not only enriches the dataset but also enhances the learning process, showcasing a significant leap over traditional data augmentation techniques.

Fig. 2 Gamma Correction/Logarithmic Correction for various image samples: (a) benign (b) malignant (c) normal

4.1 Dataset

The dataset used is the Breast Ultrasound Images dataset (breast-ultrasound-images-dataset: Dataset-BUSI-with-GT) (W et al., 2020), which includes breast ultrasound images of women aged between 25 and 75 years. The data were collected in 2018 from 600 female patients. The dataset consists of 780 images, with an average image size of 500×500 pixels, stored in PNG format. The 2D images are divided into 3 categories: benign, malignant, and normal.

  • The “benign” category contains 891 files: approximately 437 ultrasound images and their corresponding masks.

  • The “malignant” category contains 421 files: approximately 210 ultrasound images and their corresponding masks.

  • The “normal” category contains 266 files: approximately 133 ultrasound images and their corresponding masks.

In addition to the category label, the dataset contains ground truth images (the breast tumour masks). By providing both elements (lesion label and contour), this dataset matches the aim of the proposed approach: an end-to-end model able to localize the breast tumour and classify it as benign or malignant.

4.2 Preprocessing

The proposed system reads the pixel matrix in RGB format and applies different techniques to extract the most relevant information. At each step, various filters were tested, and only those showing improvements were retained for further processing. Next, images with the individual filters applied (one image from each category: benign/malignant/healthy tissue) are shown, followed by more details about the filters chosen for further analysis.

4.2.1 Step 1 - Gamma Correction

Gamma correction is a nonlinear operation that defines the relationship between the numerical value of a pixel and its true brightness (in Colour, 2021). With this correction, the shades captured by a device are brought as close as possible to those perceived by the human eye.

$$\begin{aligned} s = c \cdot r^{\gamma } \end{aligned}$$
(1)

where:

  • r = input pixel value;

  • s = output pixel value;

  • c = constant (scaling factor);

  • \(\gamma \) = the exponent responsible for changing the brightness threshold of the image.

Two cases are possible:

  • Case 1: \(\gamma < 1\): Gamma encoding is useful when there is a narrow range of dark pixels in the original image and the range of output values needs to be expanded. The effect of this curve is to accentuate the bright areas of an image.

  • Case 2: \(\gamma > 1\): Gamma decoding (or Gamma correction) is useful when there is a wide range of dark pixels in the original image and the range of output values needs to be narrowed. The effect of this curve is to reduce the bright areas of an image.

Logarithmic correction is useful for enhancing images with low contrast, since it compresses the dynamic range of pixel values, bringing out more details in dark areas while avoiding overexposure of bright areas. However, logarithmic correction may also result in a loss of detail in bright regions, especially when the image has a high dynamic range (Akram & Hussain, 2015).

Comparing the images resulting from the Gamma and Logarithmic corrections, the former was preferred. For this dataset, a Gamma correction with \(\gamma = 2\) was applied to reduce the brightness of the pixels and emphasize the darker areas of the image, which may be suspicious for tumours (some examples are given in Fig. 2).
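As an illustration, a minimal NumPy sketch of applying (1) to an 8-bit grayscale image, with \(\gamma = 2\) as used here, could look as follows (the function name is illustrative):

```python
import numpy as np

def gamma_correction(image: np.ndarray, gamma: float = 2.0, c: float = 1.0) -> np.ndarray:
    """Apply Eq. (1), s = c * r**gamma, to an 8-bit grayscale image."""
    r = image.astype(np.float32) / 255.0       # normalize pixel values to [0, 1]
    s = c * np.power(r, gamma)                 # gamma > 1 darkens bright regions
    return (np.clip(s, 0.0, 1.0) * 255.0).astype(np.uint8)
```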

4.2.2 Step 2 - Noise removal by Gaussian Blur

The Gaussian filter is applied to an image for noise and detail reduction (ScienceDirect, 2021). It uses a Gaussian function to compute the transformation of each pixel in the image. Thus, the use of this filter represents the application of the convolution operation on the image, with a Gaussian function.

The Gaussian formula for two-dimensional space is:

$$\begin{aligned} G(x,y)=\frac{1}{2 \pi {\sigma ^2}} e^{-{\frac{x^2+y^2}{2 \sigma ^2}}} \end{aligned}$$
(2)

where:

  • x = distance from the origin, on the Ox axis;

  • y = distance from the origin, on the Oy axis;

  • \(\sigma \) = standard deviation.

The amount of smoothing depends on the value of the standard deviation: the smaller the deviation, the more concentrated the kernel weights are around the center, and the weaker the blurring effect. A Gaussian kernel features the highest value at the center, which decreases symmetrically towards the edges.

In the proposed approach, the images are filtered (through convolution operations) by the following Gaussian kernel:

$$ A_{5\times 5} = \frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1\\ 4 & 16 & 24 & 16 & 4\\ 6 & 24 & 36 & 24 & 6\\ 4 & 16 & 24 & 16 & 4\\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} $$

Convolution: Mathematically, convolution is an operation that combines two signals and produces a third signal. Given two functions f(t) and g(t), their convolution is defined as the integral that measures the overlap of the function g as it is shifted over the function f:

$$\begin{aligned} (f*g)(t)=\int _{-\infty }^{\infty } {f(\tau )}{g(t-\tau )}d\tau \end{aligned}$$

In image processing, convolution is the process in which each pixel of the output image is computed as a weighted sum of its local neighbours, with the weights given by the kernel (Saha, 2020). A kernel is a small matrix, with an odd number of rows/columns, in which each cell holds a number, together with an anchor point (used to find the position of the kernel relative to the image). Each pixel covered by the kernel is multiplied by the corresponding kernel element and added to the sum; in the end, the resulting matrix is composed of these sums.

Convolution is very important in image processing because it can be used for blurring, sharpening, edge detection, and noise reduction.
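As an illustration, the following sketch convolves a grayscale image with the kernel \(A_{5\times 5}\) defined above using SciPy; the border-handling mode is an assumption, since the text does not specify how image edges are treated.

```python
import numpy as np
from scipy.ndimage import convolve

# The 5x5 Gaussian kernel A from above (integer weights divided by 256)
KERNEL = np.array([[1,  4,  6,  4, 1],
                   [4, 16, 24, 16, 4],
                   [6, 24, 36, 24, 6],
                   [4, 16, 24, 16, 4],
                   [1,  4,  6,  4, 1]], dtype=np.float32) / 256.0

def gaussian_blur(image: np.ndarray) -> np.ndarray:
    """Convolve a grayscale image with the 5x5 Gaussian kernel."""
    return convolve(image.astype(np.float32), KERNEL, mode="reflect")
```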

Some examples of applying these filters are given in Fig. 3.

Fig. 3 Edges obtained by applying Canny / Box Blur / Gaussian Blur filters for various image samples: (a) benign (b) malignant (c) normal

4.2.3 Step 3 - Binarization by Sauvola Thresholding

At this step, image pixel values are normalized to the [0, 1] range, and only pixels whose value exceeds a certain threshold are kept: pixels above the threshold receive the value 1, while all others are set to 0. In the experiments, the value 0.45 was chosen for the threshold.

Sauvola Thresholding is an extension (it can be considered an improvement) of Niblack’s algorithm. This technique is used for images with a nonuniform background.

$$\begin{aligned} T = m \cdot \left( 1 + k \cdot \left( \frac{stdN}{R} - 1\right) \right) \end{aligned}$$
(3)

where:

  • m = mean of the neighbourhood;

  • k = constant in the range [0.2, 0.5] (default 0.5);

  • stdN = standard deviation of pixel values in the neighbourhood;

  • R = dynamic range of the standard deviation (default 128).

Instead of calculating a single global threshold for the entire image, this technique calculates a threshold for each pixel, considering the mean and the standard deviation of the local neighbourhood (defined by a window centered around the pixel). Some examples of thresholded images are given in Fig. 4.
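A minimal sketch of this local thresholding, based on scikit-image's threshold_sauvola, is shown below; the window size is an assumed value, as it is not stated in the text.

```python
import numpy as np
from skimage.filters import threshold_sauvola

def sauvola_binarize(image: np.ndarray, window_size: int = 25, k: float = 0.2) -> np.ndarray:
    """Binarize an image with the per-pixel Sauvola threshold of Eq. (3)."""
    thresholds = threshold_sauvola(image, window_size=window_size, k=k)
    return (image > thresholds).astype(np.uint8)  # 1 above the local threshold, 0 otherwise
```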

Fig. 4 Filtered images obtained by Sauvola Thresholding for various sample images: (a) benign (b) malignant (c) normal

Fig. 5 Sketch of the augmentation process by a GAN

4.3 Augmentation

4.3.1 General description

During some preliminary experiments, a classification accuracy between 94% and 99% was obtained on the training set, and between 65% and 71% on the test set. Analysing this discrepancy between the training and test results, it was concluded that the number of images in each set has a major impact on the output of the method.

Augmentation is one of the main methods used to create new input samples by manipulating the original data. There are two frequent situations where the use of augmentation should be considered: an unbalanced dataset or a small dataset. As observed in the input of the algorithm, both situations are encountered here.

Generally, augmentation is limited to an approach that flips, rotates, or randomly changes the hue, saturation, brightness, and contrast of an image. This augmentation procedure is simple and can be done without much effort. The disadvantage of these techniques is that they do not introduce new synthetic data into the model, but only present the same samples in a different form. Therefore, the model already knows these samples, and the impact on the result is limited.

Generating new realistic data is a difficult task that involves learning to imitate the original distribution of the available dataset. For this purpose, it is possible to use a generative model, such as a Generative Adversarial Network (GAN) (Goodfellow et al., 2014), able to create new and sufficiently realistic images from an existing dataset. This technique was chosen because it can generate better synthetic data samples, which may improve the performance of the model. A GAN is composed of two important parts: the Generator and the Discriminator (Baeldung, 2022).

  • The Generator is a CNN that learns to create new plausible data. It receives as input a random vector of fixed length and learns to produce samples that imitate the distribution of the original dataset. Then, the generated samples become negative examples for the discriminator.

  • The Discriminator is a CNN that learns to distinguish the synthetic data of the Generator from the real data. It receives a sample as input and classifies it as “real” (it comes from the original dataset) or “synthetic” (it comes from the Generator). The Discriminator penalizes the Generator for producing implausible samples.

Thus, the Discriminator and the Generator play a “game” with two participants, in which the Generator tries to mislead the Discriminator (to classify the synthetic samples as real) – see Fig. 5.

4.3.2 Augmentation Model

  1. Network architecture:

    • Generator architecture: The Generator is architecturally configured to progressively upscale input latent vectors into higher-resolution images. This is achieved through a series of upsampling blocks, each consisting of transposed convolutional operations, batch normalization, and leaky rectified linear unit (LeakyReLU) activations. This series of operations systematically increases the spatial dimensions while concurrently decreasing the depth of feature maps, culminating in a high-fidelity image representation. The final output layer employs a hyperbolic tangent (Tanh) activation function to ensure the pixel values of the generated images are normalized.

    • Discriminator architecture: The Discriminator is designed to perform the inverse operation of the Generator. It progressively downscales the input images into more abstract representations. This is facilitated through a series of downsampling blocks, each comprising convolutional operations, layer normalization, and LeakyReLU activations. These operations systematically reduce the spatial dimensions while increasing the depth of feature maps, allowing the network to assess the features of the input images.

    Fig. 6 Images generated using GAN - (a) benign, (b) malignant, (c) normal

  2. Block components:

    • Upsampling blocks in the Generator: Each upsampling block employs a transposed convolutional layer to increase the feature map’s spatial dimensions, followed by batch normalization to stabilize the learning process and LeakyReLU activation to introduce non-linearity.

    • Downsampling blocks in the Discriminator: Each downsampling block utilizes a convolutional layer to reduce the feature map’s spatial dimensions, followed by layer normalization for effective re-centering and scaling of the activations, and LeakyReLU activation to introduce non-linearity.

  3. Training parameters: In the training phase, the Wasserstein loss function (Adler & Lunz, 2018) is employed. Both the Generator and the Discriminator are optimized using the Adam optimizer with a learning rate of 0.0001 and beta parameters set to (0.0, 0.9), ensuring a balanced optimization trajectory. The training is conducted over 800 epochs, a duration carefully chosen to allow sufficient model convergence. A ratio of five Discriminator updates for every Generator update is maintained to preserve the adversarial balance, a critical aspect for the success of GAN training. Additionally, a gradient penalty with a weight term of 10 is applied. A minimal sketch of this training configuration is given after this list.
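The following TensorFlow sketch illustrates the stated configuration (Wasserstein critic loss with a gradient penalty of weight 10, Adam with learning rate 0.0001 and betas (0.0, 0.9), five critic updates per generator update). It is a generic WGAN-GP recipe under these assumptions, not the authors' exact code.

```python
import tensorflow as tf

GP_WEIGHT = 10.0      # gradient penalty weight, as stated above
CRITIC_STEPS = 5      # discriminator updates per generator update

def gradient_penalty(discriminator, real, fake):
    """Push the critic's gradient norm towards 1 on random interpolations
    between real and generated images (WGAN-GP)."""
    batch = tf.shape(real)[0]
    alpha = tf.random.uniform([batch, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real + (1.0 - alpha) * fake
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = discriminator(interpolated, training=True)
    grads = tape.gradient(scores, interpolated)
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean(tf.square(norms - 1.0))

def critic_loss(real_scores, fake_scores, gp):
    # Wasserstein critic loss plus the weighted gradient penalty term
    return tf.reduce_mean(fake_scores) - tf.reduce_mean(real_scores) + GP_WEIGHT * gp

gen_opt = tf.keras.optimizers.Adam(learning_rate=1e-4, beta_1=0.0, beta_2=0.9)
disc_opt = tf.keras.optimizers.Adam(learning_rate=1e-4, beta_1=0.0, beta_2=0.9)
```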

In Fig. 6, augmented images generated by generative adversarial networks, along with their corresponding masks, can be observed.

Fig. 7 U-Net architecture (Ronneberger et al., 2015)

4.4 Segmentation

4.4.1 General description

U-Net is a convolutional neural network that has been designed for biomedical image segmentation tasks (Ronneberger et al., 2015). It is a fully convolutional network with a modified architecture that allows it to work with fewer images in the training set while still producing accurate results. The diagram of the network layers is shown in Fig. 7.

The U-Net was trained to predict the masks of the tumours within the ultrasound images, utilizing the reference masks already available in the dataset as ground truth. This segmentation model was selected based on indications of its potential in previous research (Zhuang et al., 2019). However, those authors applied a flavour of UNet only on a balanced dataset, whereas the dataset involved in the current experiments is not balanced.

4.4.2 Segmentation Model

  1. Network architecture: The architecture of the U-Net initially captures a broad context of the input image through its contracting path, comprised of convolution layers that increase the depth of the network while reducing the spatial dimensions of the image, effectively expanding the feature representation. This expansion is achieved by applying a series of 3x3 convolutions, each followed by a rectified linear unit (ReLU), and 2x2 max pooling at each level, which doubles the number of feature channels while halving the image dimensions. In the expansive path, the process is reversed. Here, the network performs up-convolutions (also known as transposed convolutions or deconvolutions), which increase the spatial dimensions of the feature maps. These upsampled features are then concatenated with the corresponding feature maps from the contracting path, ensuring that fine-grained details are carried through to the final layers. This concatenation helps preserve important spatial information that might be lost due to pooling operations. As a result, the expansive path gradually decreases the number of feature channels while restoring the spatial dimensions, leading to the final segmentation map. Dropout is incorporated throughout the network to regularize the model and prevent overfitting, ensuring that the model generalizes well to new data.

  2. Training parameters: A small batch size of 16 is chosen, taking into consideration the relatively limited number of ultrasound images available. The model is trained over 50 epochs, a duration deemed sufficient to thoroughly learn the features present in the medical images. Each image is resized to a uniform dimension of 256x256 pixels, a necessary step as the model architecture is designed to process square images. The Adam optimizer is employed for its proven efficiency in handling sparse gradients, a common characteristic of binary masks, where the region of interest, mapped with ones, is significantly smaller than the background, mapped with zeros. The learning rate is fixed at 0.003, striking a balance between rapid convergence and stability. The loss function used is Binary Crossentropy, a commonly utilized metric in binary segmentation models. It measures the similarity between the predicted mask and the actual mask as a probabilistic distribution, effectively enabling the model to classify each pixel into one of two categories: zero for the background or one for the region of interest. A compact sketch of this configuration is given after this list.
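For illustration, a compact Keras sketch with the stated training parameters (Adam, learning rate 0.003, Binary Crossentropy, 256x256 inputs) is given below; it uses only two resolution levels for brevity, whereas the actual network is deeper, and the dropout placement and rate are assumptions.

```python
from tensorflow.keras import layers, models, optimizers

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU, as in the contracting/expansive paths
    x = layers.Conv2D(filters, 3, activation="relu", padding="same")(x)
    return layers.Conv2D(filters, 3, activation="relu", padding="same")(x)

def build_unet(input_shape=(256, 256, 1)):
    inputs = layers.Input(shape=input_shape)
    # Contracting path: convolutions + 2x2 max pooling
    c1 = conv_block(inputs, 64)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, 128)
    p2 = layers.MaxPooling2D(2)(c2)
    # Bottleneck with dropout for regularization (rate assumed)
    b = conv_block(layers.Dropout(0.3)(p2), 256)
    # Expansive path: up-convolutions + skip connections
    u2 = layers.Conv2DTranspose(128, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.concatenate([u2, c2]), 128)
    u1 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.concatenate([u1, c1]), 64)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)  # per-pixel mask
    model = models.Model(inputs, outputs)
    model.compile(optimizer=optimizers.Adam(learning_rate=0.003),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

# model = build_unet()
# model.fit(x_train, y_train, epochs=50, batch_size=16)
```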

4.5 Classification

4.5.1 General description

Convolutional Neural Networks (CNNs) are leveraged for classification to provide an analytical perspective similar to human visual perception. Just as the human eye deconstructs an image into smaller segments for analysis, a CNN applies a similar strategy through three fundamental steps: Convolution, Max Pooling, and Flattening.

Fig. 8 Basic principle of MaxPooling (Karpathy & Li, 2015)

Convolution layers act as feature detectors from the input image, while Max Pooling layers reduce the spatial size of these representations, as illustrated in Fig. 8. This process not only diminishes computational complexity but also helps in achieving translational invariance in feature detection. Flattening, depicted in Fig. 9, converts the pooled feature maps into a one-dimensional array, laying out the extracted features end-to-end. This flattened array then feeds into the dense layers of the network, where classification decisions are made based on the presence of learned features. These steps ensure that CNNs can process and interpret images effectively, leading to accurate image classifications.

Fig. 9 Basic idea of Flattening (AI, 2021)

4.5.2 Classification Model

  1. Network architecture: The classification model is structured as a sequential convolutional neural network, designed to process and classify 400x400 pixel ultrasound images. It starts with convolutional layers, each followed by max pooling to reduce dimensionality and dropout to prevent overfitting. The convolutional layers have 16, 32, and 64 filters, respectively, each employing the ReLU activation function for non-linearity. After convolution and pooling, the data is flattened and passed through a dense layer with 512 neurons, also activated by ReLU. The network’s final layer is a dense layer with 3 neurons, corresponding to the number of image classes, using the softmax activation function to output class probabilities.

  2. Training parameters: For training, the Adam optimizer is used due to its effectiveness in managing learning rates and enabling rapid convergence. The Categorical Crossentropy loss function guides the training, measuring the difference between the predicted class probabilities and the actual class distribution. The model trains over 30 epochs with a batch size of 32. This setup, including a limited number of epochs and a modest batch size, is chosen considering the dataset’s size and the goal of achieving efficient and effective training without overfitting. A sketch of this architecture and setup follows the list.
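The following Keras sketch matches the description above; the convolution kernel sizes and dropout rates are assumptions, as they are not stated in the text.

```python
from tensorflow.keras import layers, models

def build_classifier(input_shape=(400, 400, 1), num_classes=3):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Dropout(0.25),      # dropout rate assumed
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Dropout(0.25),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# model = build_classifier()
# model.fit(x_train, y_train, epochs=30, batch_size=32)
```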

5 Results

Two scenarios were investigated to identify the benign and malignant tumours:

  • a sequential approach - two models, one for segmentation and another one for classification, trained independently

  • a unified approach - an end-to-end model, composed of a segmentation block followed by a classification block, trained simultaneously.

5.1 Dataset Specification

The experiments were conducted on two distinct datasets. The first dataset consists of the original 780 ultrasound images from the Breast Ultrasound Images (BUSI) dataset, which includes 437 benign, 210 malignant, and 133 normal images. The second dataset expands on the first by including augmented images, resulting in a total of 3000 images, with each category (malignant, benign, normal) equally represented by 1000 images.

For both datasets, an 80/20 split was implemented for training and testing purposes. This split resulted in the following sample distributions:

  • For the first dataset, out of the total 780 images, 624 images were utilized for training and 156 images for testing.

  • For the second dataset, from the total of 3000 images, 2400 images were allocated for training while 600 were set aside for testing.
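For reference, such an 80/20 split can be reproduced with scikit-learn as sketched below; whether the original split was stratified by class is an assumption here.

```python
from sklearn.model_selection import train_test_split

# images: array of ultrasound images; labels: benign / malignant / normal
x_train, x_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, stratify=labels, random_state=42)
```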

5.2 Metrics

For segmentation, the Dice coefficient was chosen as the evaluation metric, considering the need to assess the similarity between the mask produced by the algorithm and the real mask from the initial dataset. The Dice coefficient is computed as twice the area of overlap divided by the total number of pixels in the two masks. Its value lies in the range [0, 1]; the closer this value is to 1, the greater the similarity between the two images.
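In symbols, for a predicted mask \(A\) and a ground truth mask \(B\), the metric can be written as:

$$\begin{aligned} \text {Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|} \end{aligned}$$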

Accuracy was utilized as an evaluation metric in the classification task. Accuracy, the percentage of the image labels that are classified correctly, has values in the range [0%, 100%]. The closer the value is to 100%, the more performant the classification algorithm is.

Additionally, the inference time was calculated for both segmentation and classification across three classes: benign, malignant, and normal. This represents the time required for the algorithm to apply the trained neural network model to new input data.

Table 3 The segmentation results in the case of two samples (one benign and one malignant)

Without formally analysing the computational complexity of the proposed algorithms, empirical processing times can be provided as a practical indicator of computational effort: the augmentation phase took approximately 3 hours for the entire dataset, segmentation training roughly 1 hour, classification training about 30 minutes, and end-to-end model training approximately 1-2 hours. All phases were executed on a P100 GPU, which reflects the practical computational requirements.

5.3 Experiment 1 - Segmentation

Initially, two segmentation models were trained, SegModel-A and SegModel-B, both based on the U-Net architecture. SegModel-A was trained using only the original images, while SegModel-B was trained using both original and augmented images.

The investigated models were trained using a variety of hyperparameters that were carefully chosen to optimize performance: 50 epochs, batch size of 16, Adam optimizer with a learning rate of 0.003, and the Binary Cross-Entropy loss function.

Table 4 Inference time - segmentation. The second column corresponds to SegModel-A, while the third column corresponds to SegModel-B

Segmentation accuracy in this study is defined as the proportion of pixels correctly classified as tumour or non-tumour in the segmentation output relative to the reference standard, which is the ground truth mask. Although accuracy is a common metric, it is acknowledged that in the context of medical image segmentation, where the region of interest such as a tumour might occupy a relatively small portion of the image, accuracy alone may not be sufficient to capture the effectiveness of the model. Therefore, alongside accuracy, the Dice coefficient is employed, as it is more indicative of the model’s performance by measuring the overlap between the predicted segmentation and the ground truth masks.
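Both metrics can be computed directly from binary masks, as in the following NumPy sketch (the function names are illustrative):

```python
import numpy as np

def pixel_accuracy(pred: np.ndarray, truth: np.ndarray) -> float:
    """Proportion of pixels labelled identically in both binary masks."""
    return float(np.mean(pred == truth))

def dice_score(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Overlap metric 2|A ∩ B| / (|A| + |B|) for binary masks."""
    intersection = np.sum(pred * truth)
    return float((2.0 * intersection + eps) / (np.sum(pred) + np.sum(truth) + eps))
```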

After training on benign and malignant BUS images, a segmentation accuracy of 94% was achieved by SegModel-B, and likewise 94% when training only on the original dataset (SegModel-A). However, it was noticed that the masks predicted by the model trained only on the original dataset (SegModel-A) are much more precise than those predicted by the model trained on original and augmented images (SegModel-B). Table 3 illustrates this difference for most of the segmentations estimated by these models. Additionally, a Dice score of 0.8911 was obtained for SegModel-B and 0.9015 for SegModel-A. Since the tumour occupies a very small area of the image, identifying it within a more restricted area does not radically change the Dice score. Therefore, the two values obtained are close and do not reflect the fact that the augmented segmentation is less precise.

In Table 3, the first three rows of images contain the original BUS images, the ground truth masks, and the predictions in the case of a benign tumour, while the next three rows contain the original BUS images, the ground truth masks, and the predictions in the case of a malignant tumour. The first column of images refers to the model trained on original and GAN-based augmented images (SegModel-B), while the second column refers to the model trained on original images only (SegModel-A). The Dice coefficients at the top refer to the average segmentation performance obtained on the test dataset.

The inference time for image segmentation was computed for both models. The inference time represents the time required for the algorithm to apply the trained model to new input data. The inference results were obtained using 10 random images from the augmented dataset (first row from Table 4) and 10 random images from the original BUSI-Dataset (W et al., 2020) (second row from Table 4) and computing their average inference times.

Fig. 10 Training and validation process - ClassifModel-A (using only original images)

5.4 Experiment 2 - Classification

Figure 10 presents the training and validation progress of ClassifModel-A, which is trained solely using original images. Throughout the 30 epochs, there is an observable trend where training accuracy significantly improves, and training loss decreases, reflecting the model’s capacity to learn effectively from the dataset. However, the validation accuracy and loss demonstrate some volatility, indicating the model’s challenge in generalizing to new data.

In contrast, Figure 11 showcases the enhanced performance of ClassifModel-B, which benefits from a richer training dataset that includes both original and synthetic images generated by a GAN. The training curve for ClassifModel-B reveals a steadier and more consistent improvement in accuracy and a more substantial reduction in loss, both for training and validation phases. This suggests that the inclusion of GAN-synthesized images contributes positively to the model’s ability to generalize, leading to better performance when compared to ClassifModel-A. The smoother convergence of ClassifModel-B, as evidenced by the less erratic and generally higher validation accuracy, underlines the value of diversifying the training dataset with additional synthetic data.

Table 5 Classification results on test set
Fig. 11 Training and validation process - ClassifModel-B (using both original and synthetic images)

After training the previously described CNN classifier (see Section 4.5) on the BUSI-Dataset (W et al., 2020), accuracy was used as an evaluation metric on test images. The results are presented in Table 5.

Furthermore, the inference time for classification in one of the three classes: benign, malignant, and normal, was computed. This represents the time required for the algorithm to apply the trained neural network model to new input data.

The columns of Table 6 represent the two classification models. The original model (ClassifModel-A) was trained on the original BUSI-Dataset (W et al., 2020), while the augmented model (ClassifModel-B) was trained on the original dataset combined with the augmented images. The inference results were obtained using 10 random images from the augmented dataset (first row of Table 6) and 10 random images from the original BUSI-Dataset (W et al., 2020) (second row of Table 6) and computing their average inference times.

Table 6 Inference time - classification. The second column corresponds to ClassifModel-A, while the third column corresponds to ClassifModel-B
Fig. 12 Architecture of the end-to-end model

5.5 Experiment 3 - Segmentation and Classification

To calculate the final accuracy of the algorithm, the segmentation model was merged with the classification model to obtain the end-to-end model. Below is the layer-wise representation of the end-to-end model, where ’Functional’ is the segmentation model and ’Sequential’ is the classification model (see Fig. 12). Again, two end-to-end models were trained: one using the original images only and another using both original and synthetic images; their performance is presented in Table 7.
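A minimal Keras sketch of this composition is given below; the intermediate layers connecting the two blocks are not specified in detail in the text, so a simple resizing layer is used here as a placeholder adapter.

```python
from tensorflow.keras import layers, models

def build_end_to_end(seg_model, clf_model, input_shape=(256, 256, 1)):
    """Chain the segmentation block ('Functional') and the classification
    block ('Sequential') into a single model."""
    inputs = layers.Input(shape=input_shape)
    mask = seg_model(inputs)               # predicted tumour mask
    x = layers.Resizing(400, 400)(mask)    # placeholder adapter to the classifier's input size
    outputs = clf_model(x)                 # benign / malignant / normal probabilities
    return models.Model(inputs, outputs)
```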

Table 7 illustrates the performance of the end-to-end model, which is composed of two components: the segmentation and classification models. The classification models are represented by the rows, while the columns display the segmentation models. Each of these models has two scenarios: training on either an augmented or original dataset. For example, the result of 68.7% was obtained in the test process of the end-to-end model using the classification model trained on the original BUSI (W et al., 2020) dataset and the segmentation model trained on the augmented dataset.

The results of the end-to-end model are slightly inferior to those obtained in the sequential scenario, where the input of the classification model is the real ground truth mask. However, this result is a consequence of the fact that the predicted mask overlaps the real one in a proportion of only about 90%. Table 8 provides a comparative analysis of studies employing the end-to-end approach, encompassing the research detailed in this article as well as findings from other studies. It focuses on aspects such as the datasets utilized, the techniques implemented, and the metrics used to assess performance. This comparison seeks to underscore the variety of approaches in the field, illustrating how different strategies influence the overall performance of the models.

6 Discussion

In the current study, an end-to-end model for breast cancer identification in BUS images has been developed, aiming to address the well-known efficiency issue introduced by the pipeline of detection and characterization of breast cancer in images (RQ\(_1\)). The study also investigated the impact of dataset quality and size on the training process (RQ\(_2\)).

Regarding the first RQ, the obtained results revealed that the end-to-end approach performed best (both in terms of quality and speed) across all test images. Being an end-to-end system, the training of the decision core algorithm benefits simultaneously from both cost functions that measure the quality of predictions in terms of lesion segmentation and lesion discrimination. Furthermore, the entire learning procedure behind the AI algorithm is agnostic to the input type or size. The current results are obtained on 2-dimensional B-mode ultrasound images, with new experiments planned on SWE ultrasound images and 3D tomosynthesis images. The ground truth data required by such a system must include two annotations for every breast image: the location of the lesion and its type; to the best of our knowledge, such datasets are not available apart from the investigated one. The absence of other doubly annotated datasets limits the possibility of training a performant lesion detector/classifier. The augmentation methods helped to overcome this constraint: the newly generated data, similar to the original samples, influenced the performance of the intelligent system. Furthermore, by using an augmentation method, it is possible to balance the dataset so as to obtain a uniform distribution of samples over lesion position and type.

In response to the second RQ, it was observed that the manual generation of new data (e.g., by rotating or translating an image) positively impacts the quality of breast cancer identification, while the automatic generation (e.g., by using a Generative Adversarial Network) was not as useful. In the latter scenario, better results could be obtained by finer tuning of the GAN's parameters.

Furthermore, when the proposed model is compared with others from the recent literature, some common elements can be noticed, but also some new features. In Sahu et al. (2023, 2024), the authors proposed a hybrid framework for training two stand-alone classification models, so the inference stage is considerably more expensive. The advantage of the system proposed in this paper is precisely that a single (end-to-end) model learns everything, so the inference is faster as well (it saves time). In addition, those authors framed the issue as a binary classification problem, whereas the proposed model advances this approach by simultaneously predicting both the class and the location of the lesion. This dual-prediction model offers computational benefits, as it determines both outcomes (the lesion's location and type) within a single computational process. Moreover, the proposed model is theoretically capable of classifying lesions into multiple categories, not just two.

The current results have several limitations. First, all the data used for validation was provided by retrospective and single-center medical studies. Therefore, a prospective multi-center analysis should be carried out to investigate the power of predictive models. Second, only ultrasound images have been included in the test data. Other medical image modalities (such as mammographic images or tomosynthesis images) could be worth exploring.

Table 7 Accuracy comparison on 4 end-to-end models
Table 8 Comparison of end-to-end models trained on the BUSI dataset

The design and validation of a novel end-to-end model specifically tailored to BUS images indicate that the results of this study mark an important advance in the field of breast cancer diagnosis. Compared to the recent literature, the proposed model offers several advantages: it is a single end-to-end model, learning both segmentation and classification simultaneously; its inference is faster, saving time compared to hybrid frameworks with separate models; and it predicts both the class and the location of the lesion, offering flexibility for classifying lesions into multiple categories.

7 Conclusions and Future Work

In this work, a novel approach was proposed for breast cancer recognition; it leverages Generative Adversarial Networks (GANs) to address limited and imbalanced public datasets in breast cancer diagnosis. A systematic investigation analysed the impact of data quality and quantity on the performance of the proposed end-to-end model by prioritizing computational efficiency while maintaining accuracy and by considering the sensitive nature of breast cancer diagnosis. A rigorous evaluation of the proposed end-to-end model was conducted by validating its performance on an established benchmark dataset.

The performed experiments addressed the objectives established for this study. The findings suggest that ultrasound imaging can serve as a valuable adjunct to mammography in breast cancer screening.

Based on the conducted analysis, it was found that an end-to-end model is highly effective for analysing ultrasound images of the breast. By incorporating both segmentation and classification into a single pipeline, highly accurate results were achieved in identifying and diagnosing breast lesions. Compared to traditional methods of breast cancer screening, an end-to-end model offers several advantages, including improved speed and efficiency, as well as the ability to automatically learn and adapt to new data.

Overall, the obtained results suggest that an end-to-end model is a highly promising approach for ultrasound imaging in breast cancer screening. By continuing to refine and develop this methodology, the potential exists to further improve the accuracy and efficacy of breast cancer diagnosis, ultimately leading to better outcomes for patients. The current accuracy of 86% on the test set, while not clinically sufficient, serves as an encouraging baseline for this proof-of-concept study. The segmentation model has performed well, with a Dice coefficient of 0.90, showing promise in the accurate delineation of regions of interest. Additionally, the successful generation of synthetic ultrasound images is a significant achievement that will support the training of more advanced classification models in future work. A classification model with improved performance is recognized as necessary and is identified as a principal focus for ongoing development. Given the encouraging outcomes of the segmentation model and the creation of synthetic images, considerable enhancements are expected in future versions of the comprehensive model.

Reflecting on the work of Sahu et al. (2023), it becomes evident that the journey of breast cancer detection and diagnosis is continuously evolving. Looking ahead, the future of this domain holds promising avenues:

  • Exploring GANs for data augmentation: Advancing data augmentation techniques using GANs, especially those able to capture textural features, could significantly enhance the discrimination between benign and malignant lesions.

  • Improving model accuracy: Exploring different neural network architectures and fine-tuning hyperparameters might yield improvements in model accuracy.

  • Dataset diversification: Expanding the dataset to include more diverse images, such as mammography, MRI, or CT scans, and images from different hospitals, countries, and ethnicities to make it more representative.

  • Integration with multiple modalities: Incorporating additional data types such as clinical data, genomics, or proteomics alongside medical imaging might offer a more holistic approach to diagnosis.

  • Deployment in clinical settings: The development of a user-friendly interface that allows radiologists to upload medical images and receive automated predictions, coupled with integration into Electronic Health Records (EHRs), could streamline the diagnostic process, making it more efficient and accessible.