Abstract
Breast cancer is the most common cancer in women worldwide. While early diagnosis and treatment can significantly reduce the mortality rate, it is challenging for pathologists to accurately assess cancerous cells and tissues. Machine learning techniques therefore play a significant role in assisting pathologists and improving diagnostic results. This paper proposes a hybrid architecture that combines three of the most recent deep learning techniques for feature extraction (DenseNet_201, Inception_V3, and MobileNet_V2) with a random forest classifier to classify breast cancer histological images from the BreakHis dataset at its four magnification factors: 40X, 100X, 200X and 400X. The study evaluated and compared: (1) the developed random forest models with their base learners, (2) the designed random forest models with the same architecture but a different number of trees, (3) the decision tree classifiers with the best random forest models, and (4) the best random forest models of each feature extractor. The empirical evaluations used four classification performance criteria (accuracy, sensitivity, precision and F1-score), 5-fold cross-validation, the Scott-Knott statistical test, and the Borda count voting method. The best random forest model achieved a mean accuracy of 85.88%; it was constructed using 9 trees, the 200X magnification factor, and Inception_V3 as the feature extractor. The experimental results demonstrated that combining random forest with deep learning models is effective for the automatic classification of malignant and benign tumors using histopathological images of breast cancer.
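The abstract names Borda count as the voting method used to rank models across the four performance criteria, but does not give the exact variant. The following is a minimal sketch of one common variant, in which each model earns, per criterion, one point for every model it strictly outperforms. The function name `borda_count`, the model names, and all score values are hypothetical placeholders for illustration only; they are not results from the paper.

```python
from typing import Dict, List, Tuple


def borda_count(scores: Dict[str, Dict[str, float]]) -> List[Tuple[str, int]]:
    """Rank candidates by summed Borda points across criteria.

    `scores` maps candidate name -> {criterion name -> score}, where a
    higher score is better. This is an assumed variant of Borda count:
    per criterion, a candidate earns one point for each candidate it
    strictly beats, so tied candidates receive the same points.
    """
    candidates = list(scores)
    criteria = list(next(iter(scores.values())))
    totals = {name: 0 for name in candidates}
    for criterion in criteria:
        for name in candidates:
            # Count how many candidates this one strictly beats here.
            totals[name] += sum(
                scores[name][criterion] > scores[other][criterion]
                for other in candidates
            )
    # Highest total first; sorted() is stable, so equal totals keep input order.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores (NOT taken from the paper), one entry per criterion.
example_scores = {
    "model_a": {"accuracy": 0.80, "sensitivity": 0.85, "precision": 0.78, "f1": 0.81},
    "model_b": {"accuracy": 0.84, "sensitivity": 0.82, "precision": 0.83, "f1": 0.82},
    "model_c": {"accuracy": 0.79, "sensitivity": 0.80, "precision": 0.80, "f1": 0.80},
}

ranking = borda_count(example_scores)
print(ranking)  # [('model_b', 7), ('model_a', 4), ('model_c', 1)]
```

In this toy input, `model_a` has the best sensitivity but still ranks second overall, which illustrates why a voting scheme over several criteria can be preferable to picking a winner on a single metric.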
Acknowledgement
This work was conducted under the research project “Machine Learning based Breast Cancer Diagnosis and Treatment”, 2020–2023. The authors would like to thank the Moroccan Ministry of Higher Education and Scientific Research, Digital Development Agency (ADD), CNRST, and UM6P for their support.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nakach, FZ., Zerouaoui, H., Idri, A. (2022). Random Forest Based Deep Hybrid Architecture for Histopathological Breast Cancer Images Classification. In: Gervasi, O., Murgante, B., Hendrix, E.M.T., Taniar, D., Apduhan, B.O. (eds) Computational Science and Its Applications – ICCSA 2022. ICCSA 2022. Lecture Notes in Computer Science, vol 13376. Springer, Cham. https://doi.org/10.1007/978-3-031-10450-3_1
DOI: https://doi.org/10.1007/978-3-031-10450-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10449-7
Online ISBN: 978-3-031-10450-3
eBook Packages: Computer Science, Computer Science (R0)