Abstract
Protein function prediction is a widely used technique in bioinformatics and computational biology. Even so, producing efficient and statistically significant predictions remains a challenging task. This work proposes a method that combines an optimization approach with a machine learning classifier to predict the functional families of a protein from its sequence, irrespective of sequence similarity. The method is denoted Prot-RF (ABC) (predicting protein family using random forest with artificial bee colony). Features of the protein sequences are selected with the artificial bee colony (ABC) method and then classified with a random forest classifier. The UniProt and PDB benchmark databases were used to assess Prot-RF (ABC) against well-known existing methods: SVM-Prot, K-nearest neighbor, AdaBoost, probabilistic neural network, Naïve Bayes, random forest, and J48. In terms of classification accuracy, the proposed Prot-RF (ABC) method outperforms these existing methods.
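The feature-selection step described in the abstract can be illustrated with a minimal sketch of a simplified binary ABC-style search over feature subsets. This is not the authors' implementation: the function name `abc_feature_selection`, its parameters, and the toy fitness function are all assumptions made for illustration. In a setting like the one the abstract describes, the fitness would presumably be something like the cross-validated accuracy of a random forest trained on the selected features; here a stand-in fitness keeps the example self-contained.

```python
import random
from typing import Callable, List, Sequence, Tuple

Mask = List[int]  # 1 = feature kept, 0 = feature dropped


def abc_feature_selection(
    n_features: int,
    fitness: Callable[[Sequence[int]], float],
    n_sources: int = 10,   # number of food sources (candidate subsets)
    limit: int = 5,        # failed trials before a source is abandoned
    n_iter: int = 30,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Simplified binary ABC search; returns the best mask and its fitness."""
    rng = random.Random(seed)

    def random_mask() -> Mask:
        return [rng.randint(0, 1) for _ in range(n_features)]

    def neighbour(mask: Mask) -> Mask:
        # Flip one randomly chosen bit to get a nearby feature subset.
        new = list(mask)
        j = rng.randrange(n_features)
        new[j] = 1 - new[j]
        return new

    sources = [random_mask() for _ in range(n_sources)]
    scores = [fitness(m) for m in sources]
    trials = [0] * n_sources
    best_i = max(range(n_sources), key=scores.__getitem__)
    best, best_score = list(sources[best_i]), scores[best_i]

    def try_improve(i: int) -> None:
        # Greedy replacement: keep the neighbour only if it scores higher.
        nonlocal best, best_score
        cand = neighbour(sources[i])
        s = fitness(cand)
        if s > scores[i]:
            sources[i], scores[i], trials[i] = cand, s, 0
            if s > best_score:
                best, best_score = list(cand), s
        else:
            trials[i] += 1

    for _ in range(n_iter):
        # Employed bees: one local move per food source.
        for i in range(n_sources):
            try_improve(i)
        # Onlooker bees: revisit sources with probability tied to fitness.
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]
        for i in rng.choices(range(n_sources), weights=weights, k=n_sources):
            try_improve(i)
        # Scout bees: replace sources that stopped improving.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_mask()
                scores[i] = fitness(sources[i])
                trials[i] = 0
                if scores[i] > best_score:
                    best, best_score = list(sources[i]), scores[i]

    return best, best_score


# Hypothetical stand-in fitness: rewards three "informative" feature
# indices and penalises subset size. A real pipeline would score a
# classifier trained on the selected features instead.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask: Sequence[int]) -> float:
    return sum(mask[i] for i in INFORMATIVE) - 0.1 * sum(mask)


best_mask, best_score = abc_feature_selection(12, toy_fitness)
```

The search only needs a callable that maps a 0/1 mask to a score, so a classifier-based fitness can be swapped in without changing the loop. The size penalty in `toy_fitness` is one possible way to favour smaller subsets and is likewise an assumption.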
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rangasamy, R.R., Duraisamy, R. (2019). Ensemble of Artificial Bee Colony Optimization and Random Forest Technique for Feature Selection and Classification of Protein Function Family Prediction. In: Nayak, J., Abraham, A., Krishna, B., Chandra Sekhar, G., Das, A. (eds) Soft Computing in Data Analytics. Advances in Intelligent Systems and Computing, vol 758. Springer, Singapore. https://doi.org/10.1007/978-981-13-0514-6_17
DOI: https://doi.org/10.1007/978-981-13-0514-6_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0513-9
Online ISBN: 978-981-13-0514-6
eBook Packages: Intelligent Technologies and Robotics (R0)