Dense Tissue Pattern Characterization Using Deep Neural Network

Breast tumors are among the most common diseases affecting women around the world. Classifying the various types of breast tumors contributes to treating them more efficiently. However, this classification task is often hindered by dense tissue patterns captured in mammograms. The present study proposes a dense tissue pattern characterization framework using a deep neural network. A total of 322 mammograms from the mini-MIAS dataset and 4880 mammograms from the DDSM dataset have been used, and an ROI of fixed size 224 × 224 pixels has been extracted from each mammogram. Extensive experiments have been carried out using different combinations of training and testing sets and different activation functions with the AlexNet and ResNet-18 models. Data augmentation has been used to create similar virtual images for proper training of the DL model. The testing set is then applied to the trained model to validate the proposed model. During the experiments, four different activation functions, 'sigmoid', 'tanh', 'ReLu', and 'leakyReLu', are used, and the outcome for each function has been reported. The 'ReLu' activation function has been found to consistently outperform the others. For each experiment, classification accuracy and kappa coefficient have been computed. The accuracy and kappa value obtained for the MIAS dataset using the ResNet-18 model are 91.3% and 0.803, respectively. For the DDSM dataset, an accuracy of 92.3% and a kappa coefficient of 0.846 are achieved. For the combination of both datasets, the achieved accuracy is 91.9% and the kappa coefficient is 0.839 using the ResNet-18 model. Finally, it has been concluded that the ResNet-18 model with the ReLu activation function yields the best performance for the task.


Introduction
The statistical report produced by the American Cancer Society shows that one in twelve women is at risk of developing breast-related lesions. Among the various breast lesions, breast tumor is one of the most frequently occurring diseases among women in every region of the world [1][2][3][4][5]. The major risk factors for developing breast cancer are breast tissue pattern density, body weight, age, genetic history of breast lesions, type of radiation therapy, alcohol consumption, etc. [4,5]. The relation between the development of cancer and these risk factors is shown in Fig. 1.
Past statistics and Fig. 1 show that tissue density plays a crucial role in the development of breast cancer. Breast tissue density is the ratio of fibro-glandular tissue to fatty tissue. According to the Breast Imaging-Reporting and Data System (BIRADS) notation [6], breast tissue is classified into four levels according to the tissue density visible in mammograms. The four classes of tissue characterization [7][8][9][10][11][12][13] are BIRADS-I, which covers almost completely fatty tissue, i.e., a ratio of less than 25%; BIRADS-II, which covers a ratio between 25 and 50%; BIRADS-III, which covers a ratio between 51 and 75%; and BIRADS-IV, which covers a ratio between 76 and 100%.
In most cases, BIRADS-I and BIRADS-II are combined into the fatty breast tissue class, and BIRADS-III and BIRADS-IV are combined into the dense breast tissue class. A sample image of each ACR-BIRADS class, taken from the Digital Database for Screening Mammography (DDSM) dataset [14], is shown in Fig. 2.
The characterization of breast tissue pattern density is important because (i) it supports adequate scheduling of treatment related to breast lesions, and (ii) lesions can be masked behind dense tissue, which may misdirect the treatment; in such cases, the expert may seek the opinion of secondary imaging modalities. It has also been observed that early prediction of increased breast tissue density reduces the development of cancerous cells and improves the adequacy of treatment. Machine learning and deep learning models have played a crucial role in the characterization of breast tissue patterns [7,8]. However, the development of an efficient computer-aided classification (CAC) system remains a challenging task for the research community. Much work based on machine learning has already been done on this problem, but these models suffer from either high complexity or low accuracy [9,10,15,16,17].
Based on the challenges identified in previous studies, a dense tissue pattern classification framework has been proposed using a deep neural network model. The major contributions of the proposed work are as follows:
1. A deep neural network-based dense tissue pattern classification framework is proposed for the prediction of breast tissue pattern. This framework is suitable for clinical practice as a secondary opinion tool for the prediction of breast density.
2. Extensive experiments have been carried out on 322 mammograms taken from the mini-MIAS dataset and 4880 mammograms taken from the DDSM dataset.
3. The present work differs from past studies in terms of complexity because it is based on the region of interest: a region of interest (ROI) is cropped from the center region of each mammogram and passed to the framework, which generates the decision of the system.
The rest of the paper is organized as follows. The "Related Studies" section reviews the literature and illustrates the major findings and limitations of past studies. The "Materials and Methods" section describes the data preparation protocols, ROI extraction, data augmentation, and dataset bifurcation. The "Experiments and Results" section discusses the proposed model, the experiments, and the results, and finally, the complete work is summarized in the "Conclusion" section.
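The class boundaries and the two-class grouping described above can be sketched as a small lookup. This is only an illustration: the function names are hypothetical, and the handling of values that fall between the stated integer ranges (for example 50.5%) is an assumption, since the text only gives the ranges 25 to 50%, 51 to 75%, and 76 to 100%.

```python
def birads_class(density_percent):
    """Map a fibro-glandular density percentage to a BIRADS density class.

    Thresholds follow the ranges quoted in the text; treating values
    between two integer ranges as the higher class is an assumption.
    """
    if not 0 <= density_percent <= 100:
        raise ValueError("density must be between 0 and 100 percent")
    if density_percent < 25:
        return "BIRADS-I"
    if density_percent <= 50:
        return "BIRADS-II"
    if density_percent <= 75:
        return "BIRADS-III"
    return "BIRADS-IV"


def two_class_label(birads):
    """Merge the four BIRADS classes into the fatty/dense grouping."""
    return "fatty" if birads in ("BIRADS-I", "BIRADS-II") else "dense"


for value in (10, 40, 60, 90):
    cls = birads_class(value)
    print(value, cls, two_class_label(cls))
```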

Related Studies
A detailed study of the literature related to dense tissue pattern characterization shows that tissue can be characterized by two methods: (i) segmented tissue-based characterization (ST) and (ii) region of interest tissue-based characterization (ROIT). The ROI-based method is simpler than the segmentation-based method because the segmentation-based approach needs additional preprocessing steps, whereas in the ROI-based approach a fixed-size ROI is cropped and passed to the model without any additional preprocessing. Therefore, the present work uses the ROI-based method for dense tissue pattern characterization.
Many past studies have addressed breast tissue pattern characterization using ST and ROIT [7,8,18,19]. For these methodologies, machine learning (ML) and deep learning (DL) are prominently used [20,21]. In ML, support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (kNN), probabilistic neural network (PNN), smooth SVM (SSVM), etc. [7,8,18] are applied with spatial domain feature extraction, transform domain-based feature extraction, and Laws' texture energy features [18]. The past studies show some promising results with ML models, but these models suffer when the number of input samples is large. For such problems, DL models yield better performance, which is why DL models have received more attention in the last few years.
Over time, several deep learning model-based computer-aided diagnostic systems have been developed for the detection or classification of tumors. In the study [22], a convolutional neural network (CNN) is used for lesion classification on 736 mammograms and attains an accuracy of 82.6%. In this model, the computed features are passed to the classification module. The experiment performed by Qiu et al. [23] used an eight-layer DL model for the classification task. This model extracts the features automatically, and the extracted features are then used for classification. Designing a DL model from scratch is a demanding task; therefore, pre-trained models can also be used for the classification problem [24]. The study [25] used a pre-trained model for benign and malignant cancer classification on the DDSM and InBreast datasets [26], and the obtained ROC curve shows an accuracy of 90.0%. For tissue density classification, the study [15] shows an accuracy of 83.6% for two-class classification on the MIAS dataset using a pre-trained VGG16 model. Breast tissue density can also be classified into four classes, as reported in previously published work [27]. In that work, a DL model is used and validated on 200,000 mammographic samples, achieving an accuracy of 84.2%, a promising result for the four-class classification task. A similar task has been performed in [16] using the Inception V3 model, which achieves an accuracy of 84.4% on 3813 samples.
DL models are not limited to the classification task. Such models can also be used for feature extraction, with the computed features passed to a machine learning model; this approach is called transfer learning [28][29][30].
The study [28] applies transfer learning and achieves outstanding results. In that work, a DL model is used for feature extraction, and the extracted features are passed to an ML algorithm for the classification task. The model is validated on self-collected mammograms. The concept of transfer learning is also used in [29], where 22000 samples are used and a CNN model is used for feature extraction. The reported accuracy for this study is 92.6% for testing and 94.2% for training.
In view of the previously published results, it has been concluded that most of the work has concentrated on processing the complete mammogram. Past studies have also found that the density of the tissue pattern is highest behind the nipple and at the center of the breast. This fact has been experimentally demonstrated by Li [31], and the same has been observed by radiologists and medical experts [7,8]. Therefore, the ROIT concept is used to design the dense tissue pattern characterization system. In the proposed work, a fixed-size ROI is extracted from each mammogram and then passed to the DL model.
In this work, two pre-trained DL models AlexNet and ResNet-18 [19,25,32] have been used to develop the proposed system. For each model, four activation functions 'sigmoid', 'tanh', 'ReLu', and 'leakyReLu' [33] are used for activation of neurons. The training and testing image samples are taken from mini-MIAS and DDSM database. The image augmentation has been performed to increase the number of samples. After that, training and testing of DL model is performed and obtained results are evaluated in terms of accuracy, misclassification accuracy, and kappa coefficient.

Dataset Preparation
In this work, two scientific datasets (mini-MIAS and DDSM) are used. Both datasets are freely available for research purposes. The experiments have been performed first on each dataset individually and then on the combination of both datasets, and the findings are reported in the results sections.
The mini-MIAS dataset [34] consists of 322 mammograms of 161 patients. Each mammogram of the dataset is digitized at 50-micron pixels. The density of each mammogram is labeled by three expert radiologists as fatty tissue class, fatty-glandular tissue class, or dense-glandular class. For two-class tissue pattern classification, the fatty tissue class is considered as class 1, i.e., fatty, whereas the fatty-glandular and dense-glandular classes are treated as class 2, i.e., dense. The total number of samples in the fatty class is 106 mammograms (106 samples ∈ fatty tissue class), and the number of samples in the dense class is 216 mammograms (104 samples ∈ fatty-glandular class and 112 samples ∈ dense-glandular class).
In the DDSM dataset [14], 10000 multi-view (CC view and MLO view) mammograms of 2500 patients are available in three classes: benign, malignant, and normal. Each mammogram of the dataset is digitized at 42 to 50 microns. Each study includes two projections (MLO and CC view) of each breast, along with essential patient information such as patient age, tissue pattern density rating, rating for lesions, and description of architectural distortion. The density label of each mammogram falls into one of four classes, i.e., BIRADS-I, BIRADS-II, BIRADS-III, and BIRADS-IV. To attain the objective of the proposed work, a total of 4880 mammographic images of the MLO view are used, divided into a fatty class and a dense class. The fatty class comprises 2460 mammograms (620 samples ∈ BIRADS-I and 1840 samples ∈ BIRADS-II), and the dense class comprises 2420 mammograms (1440 samples ∈ BIRADS-III and 980 samples ∈ BIRADS-IV). The details of dataset preparation for the MIAS and DDSM datasets are given in Fig. 3.

ROI Extraction
Past studies have observed that the maximum tissue pattern density is found at the center of the mammogram, just behind the nipple [7,8,18,31]. An expert radiologist consulted for this work confirmed the same observation. Therefore, an ROI of size 224 × 224 pixels has been cropped from the center of each mammogram for all experiments. A total of 322 ROIs have been cropped from the mini-MIAS dataset, and a total of 4880 ROIs have been cropped from the DDSM dataset. The steps for ROI extraction are shown in Fig. 4.
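As an illustration of fixed-size cropping, the sketch below takes a 224 × 224 window around the geometric centre of an image stored as a list of rows. This is a simplification and an assumption: the paper states that ROIs are extracted manually from the centre region, so the exact crop location used by the authors may differ. The function name and the synthetic 1024 × 1024 test image are hypothetical.

```python
def center_crop(image, size=224):
    """Return a size x size crop around the geometric centre of a 2D image.

    `image` is a list of equally long rows (list of lists of pixel values).
    """
    height = len(image)
    width = len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the requested ROI")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


# Synthetic 1024 x 1024 image whose pixel value encodes its position.
image = [[r * 1024 + c for c in range(1024)] for r in range(1024)]
roi = center_crop(image, size=224)
print(len(roi), len(roi[0]))  # 224 224
```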

Data Augmentation
It is well known that the performance of deep learning models depends on the amount of training data. When only a small amount of data is available, data augmentation is used to generate a large number of virtual samples from the available samples [35][36][37]. The virtual images are generated with an angle rotation of 5°, with the height, width, shear, and zoom ranges set to 0. For the DDSM mammograms, ten virtual samples are generated from each sample with the above-mentioned parameters, and for the mini-MIAS mammograms, twenty virtual images are generated from each sample with the same parameters. After the augmentation, 48800 mammographic samples from the DDSM dataset and 6440 samples from the mini-MIAS dataset are available. A sample image of the augmented dataset is shown in Fig. 5.
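A rotation-only augmentation of this kind can be sketched in plain Python as follows. Several points here are assumptions rather than details from the paper: the angles are drawn uniformly from [-5°, 5°], nearest-neighbour sampling is used, and pixels that rotate in from outside the image are filled with 0. The function names are hypothetical, and the authors' actual augmentation tool is not stated.

```python
import math
import random


def rotate_nearest(image, angle_deg, fill=0):
    """Rotate a 2D list-of-lists image about its centre (nearest neighbour)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            sx = int(round(cos_t * dx + sin_t * dy + cx))
            sy = int(round(-sin_t * dx + cos_t * dy + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def augment(image, n_copies, max_angle=5.0, seed=0):
    """Generate n_copies rotated versions with angles in [-max_angle, max_angle]."""
    rng = random.Random(seed)
    return [rotate_nearest(image, rng.uniform(-max_angle, max_angle))
            for _ in range(n_copies)]


# Tiny 5 x 5 stand-in for an ROI; ten virtual samples as for DDSM.
roi = [[r * 5 + c for c in range(5)] for r in range(5)]
samples = augment(roi, n_copies=10)
print(len(samples))
```

With this scheme, shifts, shear, and zoom are left untouched, which matches the ranges of 0 quoted in the text.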

Dataset Bifurcation
The complete set of samples is further bifurcated into training and testing samples in a balanced and an unbalanced manner. In the balanced manner, a 50:50 ratio is used for the training and testing samples. Accordingly, 24400 samples of the DDSM dataset and 3220 samples of the mini-MIAS dataset are used as the training set, and the same numbers are used as the testing set. Similarly, for the unbalanced bifurcation, a 70:30 ratio is maintained for the training and testing samples. Thus, 34160 samples of the DDSM dataset and 4508 samples of the mini-MIAS dataset are used as the training set, and 14640 samples of the DDSM dataset and 1932 samples of the mini-MIAS dataset are used as the testing set. A brief description of dataset preparation and bifurcation is given in Table 1.
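The sample counts quoted above follow directly from the augmentation factors and the split ratios. The short check below reproduces them with integer arithmetic; the helper name `split_counts` is hypothetical.

```python
def split_counts(total, train_percent):
    """Return (training, testing) sample counts for a percentage split."""
    train = total * train_percent // 100
    return train, total - train


ddsm_total = 4880 * 10   # ten virtual samples per DDSM ROI
mias_total = 322 * 20    # twenty virtual samples per mini-MIAS ROI

for name, total in (("DDSM", ddsm_total), ("mini-MIAS", mias_total)):
    for percent in (50, 70):
        train, test = split_counts(total, percent)
        print(f"{name} {percent}:{100 - percent} -> {train} train, {test} test")
```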

Proposed Work
The proposed workflow chart is shown in Fig. 6. The proposed work is divided into three main levels as a pre-processing module, model building module, and decision section. In the preprocessing section, dataset preparation, ROI extraction, data augmentation, and ROIs bifurcation are performed, and in model building section, DL-based model is trained using the training set, and trained model is used to predict the testing samples of testing dataset described in decision section.

Deep Learning Model
In recent trends of artificial intelligence, deep learning (DL) models play a significant role in the development of computer-assisted frameworks [25] and in different domains of research [38]. DL is a subset of ML that learns underlying features from data using a neural network. It is well known that the performance of ML-based frameworks degrades with a large amount of data, whereas DL models show promising results on large datasets. Another important limitation of ML algorithms is that they learn from hand-engineered features, which are time consuming, brittle, and non-scalable, whereas a DL model tries to learn high-level features from the data itself. A DL model is a neural network with multiple hidden layers, including convolution layers, pooling layers, fully connected layers, activation functions, etc. Some popular neural network architectures, such as CNN and recurrent neural network (RNN), are suitable for classification and object detection problems [19,33]. The generalized architecture of the DL model (CNN) as a classifier is shown in Fig. 7.

Convolutional Layer
In a CNN, the convolution layer is used as the feature extractor for the input image. To compute the features, the input image is convolved with the weight matrix of the convolutional layer [39]. The output of every neuron is obtained as the dot product between the weight matrix of the convolutional layer and the corresponding part of the input image. The convolution operation is illustrated in Fig. 8.
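The sliding dot product described above can be written out directly. The following is a minimal sketch assuming a single-channel image, a square kernel, and no padding; as in most CNN libraries, the kernel is not flipped. The function name and the small example matrices are hypothetical.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide a square kernel over a 2D image and take dot products."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    out = []
    for top in range(0, h - k + 1, stride):
        row = []
        for left in range(0, w - k + 1, stride):
            # Dot product between the kernel and the image patch below it.
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [0, 1, 2, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```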
The size of the output image for an input image I(W, H) is given by Eq. (1):

$$W_{out} = \frac{W - F + 2P}{S} + 1, \qquad H_{out} = \frac{H - F + 2P}{S} + 1 \tag{1}$$

where W and H are the width and height of the image, F is the size of the kernel filter, P is the padding, and S is the stride value.
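The output-size relation can be evaluated with a one-line helper. The sketch assumes the standard form (W − F + 2P)/S + 1 with P read as the padding and the division rounded down; both the reading of P and the rounding are assumptions, and the function name is hypothetical.

```python
def output_shape(width, height, kernel, padding=0, stride=1):
    """Output width and height of a convolution or pooling layer."""
    out_w = (width - kernel + 2 * padding) // stride + 1
    out_h = (height - kernel + 2 * padding) // stride + 1
    return out_w, out_h


print(output_shape(224, 224, kernel=3, padding=1, stride=1))
print(output_shape(32, 28, kernel=5))
print(output_shape(7, 7, kernel=3, stride=2))
```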

Activation Layer
In this layer, a nonlinear function, known as the activation function, is applied element-wise to the input matrix.
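The four activation functions compared in this work have simple closed forms, sketched below for a single input value. The negative slope of 0.01 for the leaky ReLU is an assumed default; the paper does not state the value used.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    return math.tanh(x)


def relu(x):
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    # alpha is the slope for negative inputs (assumed value).
    return x if x > 0 else alpha * x


for value in (-2.0, 0.0, 3.0):
    print(value, sigmoid(value), tanh(value), relu(value), leaky_relu(value))
```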

Pooling Layer
This layer is used to reduce the spatial dimension of the input image so that fewer operations have to be performed at the next layer. The most frequently used pooling layer operations are average pooling and max pooling. The example of average pooling operation is shown in Fig. 9.
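Both pooling operations can be illustrated on a small matrix. The sketch below assumes a 2 × 2 window with a stride of 2, which is a common choice rather than a value taken from the paper; the function name is hypothetical.

```python
def pool2d(image, size=2, stride=2, mode="max"):
    """Apply max or average pooling to a 2D list-of-lists image."""
    h, w = len(image), len(image[0])
    out = []
    for top in range(0, h - size + 1, stride):
        row = []
        for left in range(0, w - size + 1, stride):
            window = [image[top + i][left + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


image = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 11, 10, 12],
    [13, 15, 14, 16],
]
print(pool2d(image, mode="max"))
print(pool2d(image, mode="average"))
```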
In this work, two different pre-trained DL models (AlexNet, ResNet-18) are used for dense tissue pattern characterization. The brief description of each model is given below.

AlexNet
The first deep learning model used, 'AlexNet', was developed by Alex Krizhevsky [40] and brought about a major shift in the ML and AI research field. The model consists of five convolutional layers, two fully connected layers, and a softmax layer. The architecture of the 'AlexNet' model is given in Fig. 10.
Each convolutional layer uses a convolutional filter followed by a non-linear activation function. Out of these five convolutional layers, three are followed by a pooling layer, as shown in Fig. 9. The input size of the AlexNet model is 224 × 224 × 3, so the ROI of every mammogram has been resized to 224 × 224 × 3. A convolution filter of size 11 × 11 with a stride of 4 is applied to the input image, and 96 kernels are used. Therefore, the size of the resulting layer is 55 × 55 × 96, following {(224 − 11)/4} + 1, which gives 55 once the padding of the first layer is taken into account. After that, a max-pooling filter of size 3 × 3 with a stride of 2 is applied, which reduces the spatial size to 27 × 27, and the next convolutional layer, which uses 256 kernels, gives a layer of size 27 × 27 × 256.
Similar calculations have been done for the other layers, and the experimental structure of AlexNet with an input sample taken from the dataset used is shown in Fig. 11.
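The layer sizes can be traced through the whole network with the output-size relation. The kernel sizes, strides, paddings, and channel counts below come from the commonly published AlexNet configuration, not from this paper, so they should be read as assumptions; in particular, a padding of 2 in the first layer is what makes a 224 × 224 input produce a 55 × 55 output.

```python
# (name, kernel, stride, padding, output channels); None keeps the channels.
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 2, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]


def trace_shapes(input_size=224, channels=3, layers=ALEXNET_LAYERS):
    """Return (name, height, width, channels) after every layer."""
    size = input_size
    shapes = []
    for name, kernel, stride, padding, out_channels in layers:
        size = (size - kernel + 2 * padding) // stride + 1
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)
```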

ResNet
The ResNet model is similar to the GoogleNet model and has seven layers [41,42]. Each layer consists of an identity block. The structures of identity block-1 and identity block-2 are shown in Fig. 12. Each block consists of a convolutional filter, batch normalization, and a nonlinear activation function. The convolutional filter size of block-1 is 3 × 3, and a 1 × 1 convolutional filter is used for block-2. The resultant vector of residual block-1 is added directly to the input vector, i.e., by element-wise addition, so that the extracted features are preserved and the DL model learns as many features as possible from low to high level. Block-2 performs the same set of operations with a minor difference, as shown in Fig. 12. In this work, the ResNet-18 model is used for dense tissue pattern characterization.
Algorithm 1 Train a deep neural network model with defined batch size for SGD approach
The complete architecture of ResNet-18 is shown in Fig. 13. ResNet-18 [19] is composed of 8 residual blocks. Among these, residual block-1, residual block-2, residual block-4, residual block-6, and residual block-8 are composed of identity block-1, and residual block-3, residual block-5, and residual block-7 are composed of identity block-2. An input image of size 224 × 224 × 3 pixels is applied to the ResNet model. At layer one, a convolution filter of size 3 × 3 with 64 kernels, a stride of [2,2], and padding of [3,3,3,3] is applied. This operation is followed by batch normalization and the activation function, and finally, max pooling with stride and padding is applied to the normalized vector. The resultant vector of layer one is passed to layer two, which comprises identity block-1. The set of operations performed at layer 2 is given in Fig. 11. Layer 3 is divided into two residual blocks, residual block 3 and residual block 4. Residual block 3 consists of identity block-2, and residual block 4 consists of identity block-1. The remaining layers are built in the same manner, and the details of each layer are shown in Fig. 12. To train the DL model, the stochastic gradient descent algorithm is used.
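The element-wise addition that defines a residual block can be sketched on plain vectors. In this illustration the learned part of the block is replaced by an arbitrary function, and the optional `shortcut` stands in for the 1 × 1 convolution of identity block-2; applying the ReLU after the addition follows the usual ResNet design and is an assumption here. All names are hypothetical.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def residual_block(x, transform, shortcut=None):
    """Compute ReLU(F(x) + shortcut(x)); the shortcut defaults to identity."""
    fx = transform(x)
    sx = x if shortcut is None else shortcut(x)
    if len(fx) != len(sx):
        raise ValueError("branch outputs must have the same length")
    # Element-wise addition of the learned branch and the shortcut branch.
    return relu([a + b for a, b in zip(fx, sx)])


x = [2.0, -4.0, 0.0]
learned = lambda v: [0.5 * t - 1.0 for t in v]   # stand-in for conv + BN
projection = lambda v: [2.0 * t for t in v]      # stand-in for a 1x1 conv

print(residual_block(x, learned))                # identity shortcut
print(residual_block(x, learned, projection))    # projection shortcut
```

Because the input is carried forward unchanged (or through a light projection), the block only has to learn the difference between its input and the desired output.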

Stochastic Gradient Descent
Past studies have found that stochastic gradient descent (SGD) is frequently used for DL model training and attains promising results [43]. SGD is an optimization technique that is mathematically defined by the expression given in Eq. (2) for a training sample tr(x) with label tr_b(y). The complete training algorithm is given as Algorithm 1.
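Mini-batch SGD training can be illustrated on a much smaller model than a CNN. The sketch below trains a logistic-regression classifier with a fixed batch size; it is not the authors' Algorithm 1, and the learning rate, batch size, number of epochs, and toy data are all assumed values chosen for the example.

```python
import math
import random


def predict_proba(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sgd_train(samples, labels, lr=0.1, batch_size=4, epochs=100, seed=0):
    """Train logistic regression with mini-batch stochastic gradient descent."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    indices = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random batches in every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * n_features
            grad_b = 0.0
            for i in batch:
                error = predict_proba(w, b, samples[i]) - labels[i]
                for j in range(n_features):
                    grad_w[j] += error * samples[i][j]
                grad_b += error
            # Step against the average gradient of the batch.
            for j in range(n_features):
                w[j] -= lr * grad_w[j] / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b


xs = [[-2.5], [-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0], [2.5]]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = sgd_train(xs, ys)
print(w, b)
```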

Performance Evaluation Parameters
In this work, the proposed work has been evaluated through accuracy and kappa coefficient. For accuracy calculation, a confusion matrix (CM) is used as shown in Fig. 14, and the mathematical expression of accuracy is also given in the same.

Kappa Coefficient
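The kappa coefficient [44] is a statistical measure that shows the significance or reliability of the proposed work. The kappa coefficient is calculated using Eq. (3):

$$\kappa = \frac{P_0 - P_e}{1 - P_e} \tag{3}$$

where $\kappa$ is the kappa value, $P_0$ is the probability of observed agreement, and $P_e$ is the probability of hypothetical agreement. With the help of the CM, whose entries are denoted here as TP, FN, FP, and TN with N = TP + FN + FP + TN, $P_0$ and $P_e$ are computed as:

$$P_0 = \frac{TP + TN}{N}$$

and

$$P_e = P_{class1} + P_{class2}$$

where $P_{class1}$ and $P_{class2}$ are computed as:

$$P_{class1} = \frac{TP + FN}{N} \cdot \frac{TP + FP}{N}, \qquad P_{class2} = \frac{FP + TN}{N} \cdot \frac{FN + TN}{N}$$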
The relation between kappa coefficient and the significance of the system is shown in Fig. 15.
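Accuracy and the kappa coefficient can both be computed from the four entries of a two-class confusion matrix. The sketch below uses the standard Cohen's kappa formulation. The example matrix is derived from figures reported in the paper for the DDSM 70:30 experiment with ResNet-18 (14640 test ROIs, of which 568 fatty and 559 dense ROIs are misclassified), under the assumption that the test set contains 7380 fatty and 7260 dense ROIs, i.e., 30% of each class.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (accuracy, kappa) for a two-class confusion matrix."""
    n = tp + fn + fp + tn
    p_o = (tp + tn) / n                                # observed agreement
    p_class1 = ((tp + fn) / n) * ((tp + fp) / n)       # chance agreement, class 1
    p_class2 = ((fp + tn) / n) * ((fn + tn) / n)       # chance agreement, class 2
    p_e = p_class1 + p_class2
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, kappa


# Fatty is treated as class 1; the per-class test counts are assumed.
fatty_total, dense_total = 7380, 7260
fatty_wrong, dense_wrong = 568, 559
accuracy, kappa = accuracy_and_kappa(
    tp=fatty_total - fatty_wrong,
    fn=fatty_wrong,
    fp=dense_wrong,
    tn=dense_total - dense_wrong,
)
print(round(accuracy, 3), round(kappa, 3))
```

Under these assumptions, the computed values agree with the accuracy of 92.3% and the kappa value of 0.846 reported for the DDSM dataset.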

Experimental Setup
The complete experimentation has been performed at HP Z4 G4 workstation. The specification of the system is given as Intel Xeon W-2014 CPU @ 3.2 GHz, 64 GB RAM, 4 GB NVIDIA Quadro P1000, 256 GB SSD, and 2 TB SATA HDD. All the images and ROIs are stored in this system, and the Python environment is used for performing the experiments.

Experiments
In this work, meticulous experimentations have been carried out for the dense tissue pattern characterization using deep learning models. To achieve the desired outcome, AlexNet and ResNet-18 deep learning models have been used. The input mammograms were taken from mini-MIAS and DDSM databases. Due to the limited number of samples available in MIAS and DDSM dataset, virtual images were generated using data augmentation, for the training and testing purpose of the model. After the augmentation, 322 × 20 = 6440 ROIs of MIAS and 4880 × 10 = 48800 ROIs of the DDSM dataset are generated. Further, the complete set of ROIs is bifurcated into the training and testing set. The description of the bifurcation of ROIs belonging from DDSM and MIAS dataset is shown in Table 2.
The sample of the original and augmented dataset with true class and predicted class is shown in Fig. 16.
In this work, experiments have been carried out for MIAS, DDSM, and MIAS + DDSM mammograms. Initially, a model is designed for MIAS images and tested with the test samples from MIAS mammograms. In the next model, the input mammogram ROIs are taken from the DDSM dataset, and the test set is also generated from the same dataset. In the subsequent experiments, the input ROIs are taken from both the DDSM and MIAS datasets; the model is trained on the training set ROIs and then tested on the testing set. The bifurcation into training and testing sets is performed by the balanced and unbalanced methods. The list of experiments carried out for the work is given in Table 3.

Experiment 1
In this experiment, a total of 6440 ROIs have been used for dense tissue pattern classification using the AlexNet and ResNet-18 models. From the 6440 ROIs, training and testing sets are created using balanced bifurcation; therefore, 3220 ROIs are used as the training set, and the remaining 3220 ROIs are used as the testing set. Of the 3220 ROIs of the training set, 1060 ROIs belong to the fatty class, and 2160 ROIs belong to the dense tissue class. The testing set is created in the same manner. The four activation functions 'ReLU', 'Sigmoid', 'Tanh', and 'Leaky ReLU' are used for the experiment, and the obtained results are reported in Table 4. The distribution of ROIs is given as Total images fatty class ∶ 2120{1060 training + 1060 testing} Total images dense class ∶ 4320{2160 training + 2160 testing}.

Experiment 2
In this experiment, a total of 6440 ROIs have been used for dense tissue pattern classification using the AlexNet and ResNet-18 models. From the 6440 ROIs, training and testing sets are created using unbalanced bifurcation; therefore, 4508 ROIs are used as the training set, and the remaining 1932 ROIs are used as the testing set. The obtained results for the four different activation functions are reported in Table 5. The description of the ROI distribution is given as

Experiment 3
In this experiment, 4880 cases are taken from the DDSM dataset, and an ROI has been extracted from each mammogram according to the previously mentioned steps and area. After the augmentation, a total of 48800 ROIs have been generated for dense tissue pattern classification using the AlexNet and ResNet-18 models. From the 48800 ROIs, training and testing sets are created using balanced bifurcation; therefore, 24400 ROIs are used as the training set, and the remaining 24400 ROIs are used as the testing set. The obtained results for the four different activation functions are reported in Table 6. The description of the ROI distribution is given as

Experiment 4
In this experiment, 4880 ROIs are extracted, one from each mammogram taken from the DDSM dataset. To improve the learning of the DL model, a large number of input samples is required; therefore, augmentation is used to create 48800 virtual ROIs for dense tissue pattern classification using the AlexNet and ResNet-18 models. Further, these samples are divided into training and testing sets in an unbalanced manner, i.e., in a 70:30 ratio. The obtained results for the four different activation functions are reported in Table 7, and the description of the ROI distribution is given as Total images fatty class ∶ 24600{12300 training + 12300 testing} Total images dense class ∶ 24200{12100 training + 12100 testing}.

Experiment 5
In this experiment, a total of 55240 ROIs (6440 belonging to MIAS and 48800 belonging to DDSM) are considered. The total number of fatty class ROIs is 26720, and 28520 ROIs belong to the dense tissue class. In this experiment, balanced bifurcation of the dataset is used to create the training and testing sets. The AlexNet and ResNet-18 models are used for dense tissue pattern classification. The obtained results for the four different activation functions are reported in Table 8, and the description of the ROI distribution is given as

Experiment 6
In this experiment, the same number of samples as in Experiment 5 is used, but the training and testing sets are created in the unbalanced manner, with a ratio of 70:30. The obtained results for the different activation functions are reported in Table 9. The description of the ROI distribution is given as:

Results Analysis
To obtain an efficient model for dense tissue characterization, extensive experiments have been carried out on MIAS and DDSM mammograms using AlexNet and ResNet-18. The datasets are first used separately for model design, and later the combined samples are used, so that the maximum variability of input samples is considered. After the experiments, the following major outcomes have been observed.
For the MIAS dataset, two experiments (Experiment 1 and Experiment 2) have been performed using the AlexNet and ResNet-18 models. In these experiments, training and testing sets have been created using the balanced and unbalanced bifurcation approaches. The various activation functions have been checked for the convolutional layer, and the results have been reported in Tables 4 and 5. The performance of each experiment has been evaluated using classification accuracy (Acc) and kappa coefficient (κ). The highest accuracy of 89.8% (2892/3220) has been reported for Experiment 1 using the ResNet-18 model with the ReLu non-linear activation function, and the value of the kappa coefficient (κ) is 0.770. For the same testing set, an accuracy of 88.9% (2863/3220) has also been achieved.
In Experiment 4, the same number of samples as in Experiment 3 is used, but the samples are distributed between the training and testing sets in the ratio of 70:30, i.e., 70% of the total samples are used as the training set, and the remaining 30% are used as the testing set. The highest classification accuracy of 92.3% (13513/14640) is achieved in this experiment.
The results of Experiments 5 and 6 are reported in Tables 8 and 9, respectively. From Table 8, it has been found that the maximum classification accuracy of 88.3% (24388/27620) is attained for the ReLu activation function. Table 9 reports the corresponding results for the AlexNet and ResNet-18 models (see also Fig. 17).

Misclassification Analysis
After the completion of the experiments, a number of samples remain misclassified. The analysis of the misclassification for every experiment is given in Table 10. From Table 10, it has been observed that the minimum misclassification accuracy is 7.7% (1127/14640), obtained using the ResNet-18 model for the DDSM dataset. This means that 1127 samples are not correctly predicted. Of these 1127 samples, 568 samples of the fatty class and 559 samples of the dense class have been incorrectly classified. The minimum misclassification accuracy for the MIAS dataset is 8.7% using the ResNet-18 model. In the same manner, the minimum misclassification accuracy for the MIAS + DDSM dataset is 8.1% using the ResNet-18 model.

Comparative Analysis
The performance of the proposed work has been compared with the current state of the art, and the comparative analysis is shown in Table 11. The comparison has been performed on the basis of the DL model used, the total number of cases considered for the experiment, the accuracy, and the kappa coefficient. Figure 18 shows the comparative analysis between the proposed work and previously published work. In Table 11, previously published work [9,10,15,16,26] is compared with the present work. The comparative table shows that the accuracy of the present work is better than that of the state-of-the-art work in [9,10,15,16]. The accuracy of the work reported in study [9] is 90.7%, which is the highest among the studies reported in [10,15,16,26]. The accuracy of the proposed work is 92.3%, which is higher than the accuracy previously reported in study [9].
Among the previously published work [9,10,15,16,26], only the study by Gandomkar et al. [16] evaluated the kappa coefficient. The kappa value obtained for the proposed work is 0.846, and the kappa value for the study [16] is 0.775. The kappa coefficient obtained for the proposed work shows that its significance is higher than that of the previously published work. Thus, the present study is more suitable for clinical use in dense tissue pattern characterization.

Conclusion
It is well known that dense tissue is a major risk factor for the growth of cancerous cells in women's breasts. Therefore, the present study reports the performance of the proposed dense tissue pattern characterization model using deep neural networks. The MIAS and DDSM datasets are used as the source of input image samples. Because the number of samples is small, data augmentation has been performed to generate virtual samples, so that the models are trained and tested properly. After the augmentation, the problems of under-fitting and over-fitting of the model are reduced.
In this work, two deep learning models (AlexNet and ResNet-18) are used, and four activation functions ('Sigmoid', 'Tanh', 'ReLu', and 'LeakyReLU') have been tested with each model. To achieve the desired objective, extensive experiments have been performed with different combinations of training and testing samples and different activation functions using the AlexNet and ResNet-18 models. For every experiment, the outcome has been measured in terms of accuracy and kappa coefficient. The obtained accuracy and kappa values show that the 'ReLu' activation function and the ResNet-18 deep neural network model are the most suitable for dense tissue pattern characterization. Finally, it has been concluded that the designed model is suitable for clinical purposes, and it should help an expert in scheduling treatment properly and adequately.
This work also has the limitation of manual ROI extraction. If ROI extraction were performed automatically, the execution time and performance would improve. This limitation will be addressed as future work. In the near future, the same set of approaches will also be used for designing a computerized framework for breast density classification on full-field screen mammograms.

Declarations
Ethical approval The reported research was carried out using secondary data. Hence, ethical approval was not required.
Research involving human participants and/or animals This research did not directly involve any human participant or animal.

Conflict of Interest
The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.