Abstract
Breast cancer is the most commonly diagnosed cancer and the most common cause of cancer death among women worldwide. There is evidence that early detection and treatment can increase the survival rate of breast cancer patients. The traditional method of diagnosing the disease relies on human experience to identify certain patterns in the data; it is prone to human error, time-consuming, and labour-intensive. This work therefore proposes an automatic breast cancer diagnosis technique that uses a genetic algorithm (GA) for simultaneous feature selection and parameter optimization of an artificial neural network (ANN). The proposed algorithm is implemented with three variants of the back-propagation technique for fine-tuning the ANN weights, namely resilient back-propagation (GAANN_RP), Levenberg–Marquardt (GAANN_LM), and gradient descent with momentum (GAANN_GD), and their performances are compared. The effects of feature selection and of manually determining the hidden node size are also investigated. One of the proposed algorithms, GAANN_RP, achieves best and average correct classification rates of 99.24% and 98.29%, respectively, on the Wisconsin breast cancer dataset, which is comparable with results reported in other works in the literature.
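The idea of evolving a feature subset and a network parameter together can be illustrated with a minimal sketch. It is not the authors' implementation: the chromosome layout (9 feature bits for the 9 input attributes of the original Wisconsin dataset, plus 5 bits for the hidden node count), the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), and the names `decode`, `toy_fitness`, and `evolve` are all assumptions made for illustration. In particular, `toy_fitness` is only a placeholder for the step that would train the ANN with back-propagation and return its validation accuracy.

```python
import random

N_FEATURES = 9   # assumption: 9 input attributes, as in the original Wisconsin dataset
HIDDEN_BITS = 5  # assumption: hidden node count encoded in 5 bits (1..32)


def decode(chromosome):
    """Split a bit list into (selected feature indices, hidden node count)."""
    features = [i for i, bit in enumerate(chromosome[:N_FEATURES]) if bit]
    hidden = 1 + int("".join(map(str, chromosome[N_FEATURES:])), 2)
    return features, hidden


def toy_fitness(chromosome):
    """Placeholder for 'train the ANN and return validation accuracy'.

    Hypothetical scoring: reward three arbitrarily chosen 'informative'
    features and lightly penalise larger subsets and larger hidden layers.
    """
    features, hidden = decode(chromosome)
    if not features:
        return 0.0
    useful = {0, 2, 5}  # hypothetical, not taken from the paper
    hits = len(useful & set(features))
    return hits / len(useful) - 0.01 * len(features) - 0.001 * hidden


def evolve(fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Simple generational GA over fixed-length bit strings."""
    rng = random.Random(seed)
    length = N_FEATURES + HIDDEN_BITS
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = [ind[:] for ind in ranked[:2]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a, b = (max(rng.sample(pop, 3), key=fitness) for _ in range(2))
            cut = rng.randrange(1, length)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = evolve(toy_fitness)
    print("selected features, hidden nodes:", decode(best))
```

In a real wrapper of this kind, the fitness call would be the expensive step, since each chromosome implies training one network on the selected features; caching fitness values per chromosome is a common way to keep such a loop tractable.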
Acknowledgments
This research is partially supported by Universiti Sains Malaysia’s Research University Postgraduate Research Grant Scheme (USM-RU-PGRS) entitled ‘Genetic Algorithm-Artificial Neural Network Hybrid Intelligence’ and the Universiti Sains Malaysia’s Research University Grant entitled ‘Study on Compatibility of FTIR Spectral Characteristics for the Development of Intelligent Cervical Pre-cancerous Diagnostic System’. The author also wishes to thank Universiti Teknologi MARA for its financial assistance.
Cite this article
Ahmad, F., Mat Isa, N.A., Hussain, Z. et al. A GA-based feature selection and parameter optimization of an ANN in diagnosing breast cancer. Pattern Anal Applic 18, 861–870 (2015). https://doi.org/10.1007/s10044-014-0375-9
DOI: https://doi.org/10.1007/s10044-014-0375-9