Abstract
A new method is proposed to identify whether a query protein is singleplex (one subcellular location) or multiplex (more than one), with the aim of improving the quality of protein subcellular localization prediction. Based on transductive learning, the approach uses information from both the query proteins and the known proteins to estimate the number of subcellular locations of every query protein, so that singleplex and multiplex proteins can be recognized and distinguished. Each query protein is then passed to a targeted single-label or multi-label predictor to achieve a high-accuracy prediction. We assess the performance of the proposed approach by applying it to three groups of protein sequence datasets. Simulation experiments show that the approach can effectively identify singleplex and multiplex proteins. A comparison also verifies the reliability of this method for enhancing the power of protein subcellular localization prediction.
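The two-stage idea in the abstract (estimate the number of locations transductively, then route each query to a single-label or multi-label predictor) can be illustrated with a minimal sketch. This is not the authors' algorithm: a simple nearest-neighbour averaging scheme stands in for the transductive learner, and the function names, the feature vectors, and the parameters `k`, `iters`, `alpha`, and `threshold` are all hypothetical choices made for illustration.

```python
import math


def estimate_location_counts(known, known_counts, queries, k=3, iters=20, alpha=0.5):
    """Estimate the number of subcellular locations of each query protein.

    known, queries: lists of numeric feature vectors (hypothetical encodings).
    known_counts: annotated number of locations for each known protein.
    """
    points = known + queries
    n_known = len(known)

    # Known proteins keep their counts; queries start at the mean known count.
    mean_count = sum(known_counts) / n_known
    scores = [float(c) for c in known_counts] + [mean_count] * len(queries)

    # Neighbours are drawn from known AND query proteins together,
    # which is what makes this sketch transductive.
    neighbours = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbours.append(others[:k])

    # Repeatedly pull each query score toward the average of its neighbours.
    for _ in range(iters):
        updated = scores[:]
        for i in range(n_known, len(points)):
            avg = sum(scores[j] for j in neighbours[i]) / len(neighbours[i])
            updated[i] = alpha * scores[i] + (1 - alpha) * avg
        scores = updated
    return scores[n_known:]


def route(estimated_count, threshold=1.5):
    """Decide which predictor a query protein should be sent to."""
    return "multiplex" if estimated_count >= threshold else "singleplex"


# Toy data: one cluster of single-location proteins, one of two-location proteins.
known = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
known_counts = [1, 1, 1, 2, 2, 2]
queries = [[0.5, 0.5], [10.5, 10.5]]

estimates = estimate_location_counts(known, known_counts, queries)
print([route(e) for e in estimates])  # ['singleplex', 'multiplex']
```

In a real pipeline the label returned by `route` would select between a single-label and a multi-label localization predictor; the threshold of 1.5 is only a convenient midpoint for this toy example.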
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61174027); National Science and Technology Mega-Project Program of China (2010ZX04007-011-5); Program for Liaoning Excellent Talents in University (LJQ2012005); Specialized Research Fund for the Doctoral Program of Higher Education (20120041110008).
Cite this article
Cao, J., Liu, W., He, J. et al. Identifying the singleplex and multiplex proteins based on transductive learning for protein subcellular localization prediction. Biotechnol Lett 35, 1107–1113 (2013). https://doi.org/10.1007/s10529-013-1186-6
DOI: https://doi.org/10.1007/s10529-013-1186-6