
Optimal and Novel Hybrid Feature Selection Framework for Effective Data Classification

Advances in Systems, Control and Automation

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 442))

Abstract

Data mining methods are frequently applied to data classification. Among these methods, feature selection (FS) algorithms are essential for handling data sets of small, medium, and large dimensionality. A large number of features raises problems for both classifier accuracy and running time. This chapter proposes a novel hybrid feature selection technique built on symmetrical uncertainty and a genetic algorithm. Experiments on UCI datasets show that the proposed feature selector is efficient, in that it reduces the number of initial features, and accurate, in that it gives classification algorithms better detection performance than other feature selectors in the literature. Comparison with earlier research work indicates that the proposed method aids optimization and improves performance. In summary, the proposed feature selection method outperforms other methods by selecting fewer features, improving classification performance, and reducing execution time.
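The abstract names symmetrical uncertainty (SU) as the filter half of the hybrid method but does not give details. As a rough illustration only, the sketch below computes the standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), for discrete features and ranks features by their SU with the class label. The function names `entropy`, `symmetrical_uncertainty`, and `rank_features` are hypothetical, the features are assumed to be already discretized, and nothing here should be read as the chapter's actual implementation or its genetic algorithm stage.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Standard SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is no information to share.
        return 0.0
    hxy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy      # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2.0 * mutual_info / (hx + hy)


def rank_features(columns, labels):
    """Return (feature name, SU) pairs sorted by descending SU with the label.

    `columns` is assumed to map a feature name to its list of discrete values.
    """
    scores = {name: symmetrical_uncertainty(col, labels)
              for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In a hybrid scheme of the kind the abstract describes, a ranking like this would plausibly be used to discard low-scoring features before a genetic algorithm searches the remaining subsets. That ordering is an assumption here, since the preview does not state how the two stages are combined.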



Author information

Corresponding authors

Correspondence to Sivakumar Venkataraman or Rajalakshmi Selvaraj.


Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Venkataraman, S., Selvaraj, R. (2018). Optimal and Novel Hybrid Feature Selection Framework for Effective Data Classification. In: Konkani, A., Bera, R., Paul, S. (eds) Advances in Systems, Control and Automation. Lecture Notes in Electrical Engineering, vol 442. Springer, Singapore. https://doi.org/10.1007/978-981-10-4762-6_48


  • DOI: https://doi.org/10.1007/978-981-10-4762-6_48


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-4761-9

  • Online ISBN: 978-981-10-4762-6

  • eBook Packages: Engineering, Engineering (R0)
