
Performance Evaluation of ANOVA and RFE Algorithms for Classifying Microarray Dataset Using SVM

  • Conference paper
Information Systems (EMCIS 2020)

Abstract

A significant application of microarray gene expression data is the classification and prediction of biological models. Dimension reduction is an essential component of such data analysis. This study compares two feature selection techniques for dimension reduction, Analysis of Variance (ANOVA) and Recursive Feature Elimination (RFE), and evaluates the relative performance of a Support Vector Machine (SVM) classifier on the reduced data. Accuracy and computational performance were measured on a microarray colon cancer dataset: SVM-RFE achieved 93% classification accuracy, compared with 87% for ANOVA.
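As a rough illustration of the ANOVA side of this comparison, the sketch below scores each feature with a one-way ANOVA F statistic and ranks features by that score. It is not the authors' implementation: the function names (`anova_f_score`, `rank_features`) and the toy data are hypothetical, and it assumes at least two classes and more samples than classes.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature, grouped by class label.

    Assumes at least two distinct labels and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(X, y):
    """Return (feature indices by descending F score, list of F scores)."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: 6 samples, 3 "genes", 2 classes.
# Only the first feature clearly separates the classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.4],
    [3.0, 5.5, 0.8],
    [3.2, 4.5, 0.3],
    [2.9, 5.0, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

order, scores = rank_features(X, y)
print(order)
```

A filter of this kind scores every feature independently of the classifier. RFE, by contrast, repeatedly retrains the classifier and discards the lowest-weighted features, which is typically more expensive.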




Corresponding author

Correspondence to Micheal Olaolu Arowolo.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Abdulsalam, S.O. et al. (2020). Performance Evaluation of ANOVA and RFE Algorithms for Classifying Microarray Dataset Using SVM. In: Themistocleous, M., Papadaki, M., Kamal, M.M. (eds) Information Systems. EMCIS 2020. Lecture Notes in Business Information Processing, vol 402. Springer, Cham. https://doi.org/10.1007/978-3-030-63396-7_32


  • DOI: https://doi.org/10.1007/978-3-030-63396-7_32


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63395-0

  • Online ISBN: 978-3-030-63396-7

  • eBook Packages: Computer Science (R0)
