Ensemble of Artificial Bee Colony Optimization and Random Forest Technique for Feature Selection and Classification of Protein Function Family Prediction

  • Conference paper
Soft Computing in Data Analytics

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 758)

Abstract

Protein function prediction is a widely studied problem in bioinformatics and computational biology. Producing predictions that are efficient, accurate, and statistically significant remains a challenging task. This work proposes a method that combines an optimization approach with a machine learning classifier to predict the function family of a protein from its sequence, irrespective of sequence similarity. The method is denoted Prot-RF (ABC) (predicting protein family using random forest with artificial bee colony). Features of the protein sequences are selected using the artificial bee colony (ABC) method, and the sequences are then classified using a random forest classifier. The UniProt and PDB benchmark databases are used to assess the proposed Prot-RF (ABC) method against well-known existing methods such as SVM-Prot, K-nearest neighbor, AdaBoost, probabilistic neural network, Naïve Bayes, random forest, and J48. In classification accuracy, the proposed Prot-RF (ABC) method outperforms these existing methods.
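
The abstract gives no algorithmic detail, so the following is only a minimal sketch of how ABC-style feature selection could be wired to a classifier score. It is not the authors' implementation. It assumes a binary variant of ABC in which each food source is a feature mask and a neighbour is produced by flipping one bit. The function name, parameters, and defaults are hypothetical. The fitness is pluggable: in a setting like the one described, it could be the cross-validated accuracy of a random forest trained on the selected features, whereas the toy fitness used here is a stand-in that simply rewards three arbitrarily chosen "informative" features.

```python
import random


def abc_feature_selection(n_features, fitness, n_sources=10, limit=5,
                          n_iter=30, seed=0):
    """Binary ABC sketch: returns (best_mask, best_score).

    `fitness` maps a list of booleans (one per feature) to a score >= 0.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)

    def random_mask():
        mask = [rng.random() < 0.5 for _ in range(n_features)]
        if not any(mask):  # keep at least one feature selected
            mask[rng.randrange(n_features)] = True
        return mask

    def neighbour(mask):
        new = mask[:]
        j = rng.randrange(n_features)
        new[j] = not new[j]
        if not any(new):
            new[j] = True  # undo a flip that would empty the subset
        return new

    sources = [random_mask() for _ in range(n_sources)]
    scores = [fitness(m) for m in sources]
    trials = [0] * n_sources
    best_i = max(range(n_sources), key=lambda i: scores[i])
    best = [sources[best_i][:], scores[best_i]]

    def try_improve(i):
        # Greedy replacement: keep the neighbour only if it scores higher.
        candidate = neighbour(sources[i])
        score = fitness(candidate)
        if score > scores[i]:
            sources[i], scores[i], trials[i] = candidate, score, 0
            if score > best[1]:
                best[0], best[1] = candidate[:], score
        else:
            trials[i] += 1

    for _ in range(n_iter):
        # Employed bees: one local search step per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: revisit sources with probability proportional to score.
        weights = scores if sum(scores) > 0 else [1.0] * n_sources
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)
        # Scout bees: abandon sources that failed to improve `limit` times.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_mask()
                scores[i] = fitness(sources[i])
                trials[i] = 0
                if scores[i] > best[1]:
                    best[0], best[1] = sources[i][:], scores[i]
    return best[0], best[1]


# Toy stand-in fitness (assumption): features 0, 2, and 5 are "informative".
INFORMATIVE = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for j, on in enumerate(mask) if on and j in INFORMATIVE)
    extras = sum(1 for j, on in enumerate(mask) if on and j not in INFORMATIVE)
    return hits / (len(INFORMATIVE) + extras)


best_mask, best_score = abc_feature_selection(8, toy_fitness)
print([j for j, on in enumerate(best_mask) if on], round(best_score, 3))
```

Passing the fitness as a callable keeps the search loop independent of the classifier, so a random forest accuracy estimate (or any other non-negative score) can be substituted without changing the ABC code.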

Author information

Corresponding author

Correspondence to Ranjani Rani Rangasamy.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Rangasamy, R.R., Duraisamy, R. (2019). Ensemble of Artificial Bee Colony Optimization and Random Forest Technique for Feature Selection and Classification of Protein Function Family Prediction. In: Nayak, J., Abraham, A., Krishna, B., Chandra Sekhar, G., Das, A. (eds) Soft Computing in Data Analytics. Advances in Intelligent Systems and Computing, vol 758. Springer, Singapore. https://doi.org/10.1007/978-981-13-0514-6_17
