A Hybrid Model for Optimum Gene Selection of Microarray Datasets

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 740)

Abstract

Gene selection is one of the most demanding tasks in the study of microarray data because of the large number of features, which can run to tens of thousands. Feature selection is therefore a crucial step for proper analysis and classification of microarray data. Filter methods are pre-processing algorithms that are independent of the classifier used. Wrapper methods estimate the benefit of adding or removing a feature from the dataset by using the induction algorithm itself together with cross-validation. In our proposed technique, we aim for a significant reduction in the dimensionality of the feature sets of the Leukaemia, Prostate Cancer and DLBCL datasets by passing them through three filters: T-test, Bhattacharyya and ReliefF. A second layer reduces the dimension further with the Mutual Information Maximisation (MIM) filter, which is further optimised by the Adaptive Genetic Algorithm (AGA).
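The two-layer filtering idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Welch t-statistic for the first layer (the abstract only says "T-test"), a median-binarised mutual information score for the second layer, and synthetic data in place of the Leukaemia, Prostate Cancer and DLBCL datasets. The function names (`welch_t`, `mutual_information`, `two_layer_filter`) and the parameters `k1` and `k2` are hypothetical, and the Bhattacharyya, ReliefF and AGA stages are left out.

```python
import math
import random
from collections import Counter


def welch_t(a, b):
    """Absolute Welch t-statistic between two samples of one gene."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    denom = math.sqrt(va / len(a) + vb / len(b))
    return abs(ma - mb) / denom if denom > 0 else 0.0


def mutual_information(values, labels):
    """Mutual information (bits) between a gene, binarised at its
    median (an assumption of this sketch), and the class label."""
    median = sorted(values)[len(values) // 2]
    bits = [int(v >= median) for v in values]
    n = len(labels)
    joint = Counter(zip(bits, labels))
    px, py = Counter(bits), Counter(labels)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def two_layer_filter(X, y, k1, k2):
    """X: list of samples (rows of expression values); y: 0/1 labels.
    Layer 1 keeps the k1 genes with the largest |t|; layer 2 keeps the
    k2 of those with the highest mutual information with the label."""
    n_genes = len(X[0])
    columns = [[row[g] for row in X] for g in range(n_genes)]

    def t_score(g):
        a = [v for v, lab in zip(columns[g], y) if lab == 0]
        b = [v for v, lab in zip(columns[g], y) if lab == 1]
        return welch_t(a, b)

    layer1 = sorted(range(n_genes), key=t_score, reverse=True)[:k1]
    layer2 = sorted(layer1,
                    key=lambda g: mutual_information(columns[g], y),
                    reverse=True)[:k2]
    return layer1, layer2


# Synthetic toy data: 40 samples, 50 genes; only genes 0-2 differ by class.
rng = random.Random(0)
y = [0] * 20 + [1] * 20
X = [[rng.gauss(4.0 * lab if g < 3 else 0.0, 1.0) for g in range(50)]
     for lab in y]

layer1, layer2 = two_layer_filter(X, y, k1=10, k2=3)
print("layer 1 genes:", sorted(layer1))
print("layer 2 genes:", sorted(layer2))
```

In this sketch each layer is a plain ranking cut-off, so the second layer can only choose among genes that survived the first. An optimiser such as the AGA mentioned in the abstract would search over subsets of the surviving genes instead of taking a fixed top-`k2`.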

Author information

Corresponding author

Correspondence to Shemim Begum.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Begum, S., Ansari, A.A., Sultan, S., Dam, R. (2019). A Hybrid Model for Optimum Gene Selection of Microarray Datasets. In: Kalita, J., Balas, V., Borah, S., Pradhan, R. (eds) Recent Developments in Machine Learning and Data Analytics. Advances in Intelligent Systems and Computing, vol 740. Springer, Singapore. https://doi.org/10.1007/978-981-13-1280-9_39
