A structured combination of ensemble classifier and filter-based feature selection to improve breast cancer diagnosis

Research article, Journal of Cancer Research and Clinical Oncology

Abstract

Introduction

Advances in technology have led to the emergence of computerized diagnostic systems that act as intelligent medical assistants. Machine learning approaches cannot replace human professionals, but they can support clinicians and change how diseases such as cancer are treated.

Background

Breast cancer treatment can be very effective, especially when the disease is detected in its early stages. Feature selection and classification are common data mining techniques in machine learning that can support fast, low-cost and precise breast cancer diagnosis.

Methodology

This paper proposes a new intelligent approach for breast cancer diagnosis that combines an integrated filter and evolutionary search-based feature selection method with an optimized ensemble classifier. The feature selection method picks the most informative features from the original feature set by integrating adaptive-threshold information gain-based feature selection with evolutionary gravitational-search-based feature selection. The selected features are then used in the breast cancer classification process. Classification is performed by a new intelligent ensemble classifier based on multi-layer perceptron neural networks.
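To make the filter stage concrete, the following is a minimal sketch of information gain-based feature selection with a data-dependent threshold. It is not the authors' implementation: the abstract does not state how the adaptive threshold is computed, so the rule used here (keep features whose gain is at least the mean gain) is an assumption, as are the function names and the toy data. The gravitational-search stage and the MLP ensemble are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature column."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_by_gain(columns, labels):
    """Keep features whose gain is at least the mean gain.

    The mean-gain threshold is an assumed stand-in for the paper's
    adaptive threshold, which the abstract does not specify.
    """
    gains = [information_gain(col, labels) for col in columns]
    threshold = sum(gains) / len(gains)
    selected = [i for i, g in enumerate(gains) if g >= threshold]
    return selected, gains, threshold


# Toy data, made up for illustration: 3 discretised features, 8 samples.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly predictive of the label
    [0, 1, 0, 1, 0, 1, 0, 1],  # carries no information about the label
    [0, 0, 0, 1, 1, 1, 1, 1],  # partly predictive
]
selected, gains, threshold = select_by_gain(columns, labels)
print(selected, [round(g, 3) for g in gains], round(threshold, 3))
```

In this toy case the uninformative second feature is dropped and the other two are kept. Real-valued features, such as those in the Wisconsin datasets, would need to be discretised before this kind of gain computation.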

Results

The simulation results show that the proposed method outperforms state-of-the-art algorithms on criteria such as accuracy, sensitivity and specificity. Specifically, the proposed method achieves an average accuracy of 99.42% on the WBCD, WDBC and WPBC datasets from the Wisconsin database while using only 56.7% of the features.
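For readers unfamiliar with the evaluation criteria named above, the sketch below computes accuracy, sensitivity and specificity from true and predicted labels using their standard definitions. The function name and the example labels are hypothetical and are not taken from the paper's experiments.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of malignant cases detected
        "specificity": tn / (tn + fp),  # share of benign cases cleared
    }


# Made-up labels for illustration: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity matters most when a missed malignant case is costly, while specificity tracks how often benign cases are correctly cleared, which is why diagnostic studies usually report both alongside accuracy.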

Conclusion

Systems based on intelligent medical assistants configured with machine learning approaches are an important step toward helping doctors to detect breast cancer early.

Data availability

The supporting data and materials are not available.

Funding

Not applicable.

Author information

Contributions

All authors reviewed the manuscript.

Corresponding authors

Correspondence to Dengru Zheng or Sajjad Saberi.

Ethics declarations

Conflict of interest

The authors have no actual or potential conflict of interest in the subject matter discussed in the manuscript.

Ethical approval

The paper reflects the authors’ own research and analysis in a truthful and complete manner.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, D., Tang, P., Lu, D. et al. A structured combination of ensemble classifier and filter-based feature selection to improve breast cancer diagnosis. J Cancer Res Clin Oncol 149, 14519–14534 (2023). https://doi.org/10.1007/s00432-023-05238-4

  • DOI: https://doi.org/10.1007/s00432-023-05238-4
