Cluster Computing, Volume 22, Supplement 3, pp 6899–6906

Improving medical diagnosis performance using hybrid feature selection via relieff and entropy based genetic search (RF-EGA) approach: application to breast cancer prediction

  • Ilangovan Sangaiah
  • A. Vincent Antony Kumar


This research presents a new hybrid prediction algorithm for breast cancer, developed on a breast cancer dataset. Many approaches are available for diagnosing diseases, such as genetic algorithms, ant colony optimization, particle swarm optimization, and the cuckoo search algorithm. The proposed algorithm combines ReliefF attribute reduction with an entropy-based genetic algorithm for breast cancer detection. This hybrid combination is used to handle high-dimensional datasets with uncertainties. The data are obtained from the Wisconsin breast cancer dataset and have been categorized based on different properties. The performance of the proposed method is evaluated and compared with other well-known feature selection methods. The results show that the proposed method has a remarkable ability to generate a reduced-size subset of salient features while yielding significant classification accuracy for large datasets.
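The abstract gives no algorithmic details, so the following is only a minimal Python sketch of the general two-stage idea (a Relief-style filter ranks the attributes, then a genetic search picks a subset of the top-ranked ones). Everything in it is an assumption made for illustration: the scoring is the simplified binary-class Relief with one nearest hit and one nearest miss rather than full ReliefF, the fitness is leave-one-out 1-NN accuracy with a size penalty rather than the paper's entropy-based measure, and the function names, parameters, and toy data are hypothetical.

```python
import random

def relief_scores(X, y):
    """Simplified binary-class Relief (one nearest hit, one nearest miss)."""
    n, d = len(X), len(X[0])
    # Scale differences by each feature's range so features are comparable.
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    diff = lambda a, b, j: abs(a[j] - b[j]) / spans[j]
    dist = lambda a, b: sum(diff(a, b, j) for j in range(d))
    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        h = min(hits, key=lambda k: dist(X[i], X[k]))
        m = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward separation from the other class, penalize spread within the class.
            w[j] += diff(X[i], X[m], j) - diff(X[i], X[h], j)
    return [wj / n for wj in w]

def loo_1nn_accuracy(X, y, feats):
    """Leave-one-out 1-NN accuracy restricted to the columns in feats."""
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in feats))
        correct += y[nearest] == y[i]
    return correct / len(X)

def ga_select(X, y, candidates, pop_size=12, generations=15, penalty=0.01, seed=0):
    """Tiny genetic search over bit masks of the candidate features."""
    rng = random.Random(seed)
    def fitness(mask):
        feats = [c for c, bit in zip(candidates, mask) if bit]
        # Assumed fitness: accuracy minus a small cost per selected feature.
        return loo_1nn_accuracy(X, y, feats) - penalty * len(feats)
    pop = [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]            # truncation selection, elites survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates)) if len(candidates) > 1 else 0
            child = a[:cut] + b[cut:]            # one-point crossover
            children.append([bit ^ (rng.random() < 0.1) for bit in child])  # bit-flip mutation
        pop = parents + children
    best = max(pop, key=fitness)
    return [c for c, bit in zip(candidates, best) if bit]

def make_toy(n=40, seed=1):
    """Hypothetical toy data: column 0 tracks the label, columns 1-3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.gauss(0, 0.1), rng.random(), rng.random(), rng.random()])
        y.append(label)
    return X, y

X, y = make_toy()
scores = relief_scores(X, y)
# Stage 1: keep the three best-ranked attributes; stage 2: search subsets of them.
top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:3]
selected = ga_select(X, y, top)
print("Relief scores:", scores)
print("Selected features:", selected)
```

The filter stage shrinks the search space cheaply before the more expensive wrapper-style search runs, which is the usual motivation for hybrid schemes of this kind; the cut-off of three attributes here is arbitrary.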


Feature selection, ReliefF ranking, Entropy based genetic algorithm, Classification, Breast cancer diagnosis



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Information Technology, K.L.N. College of Engineering, Madurai, India
  2. Department of Information Technology, PSNA College of Engineering and Technology, Dindigul, India
