Amino Acids

Volume 32, Issue 4, pp 483–488

Using ensemble classifier to identify membrane protein types

  • H.-B. Shen
  • K.-C. Chou


Predicting membrane protein type is both an important and a challenging topic in current molecular and cellular biology, because knowledge of the type often provides useful clues to, or sheds light on, the function of an uncharacterized membrane protein. With the explosion of newly found protein sequences in the post-genomic era, there is great demand for a computational method that can quickly and reliably identify the type of a membrane protein from its primary sequence. In this paper, a novel classifier, the so-called “ensemble classifier”, is introduced. It is formed by fusing a set of nearest neighbor (NN) classifiers, each of which is defined in a different pseudo amino acid composition space. The type of a query protein is determined by voting among these constituent classifiers. The self-consistency test, jackknife test, and independent dataset test demonstrated that the ensemble classifier outperformed other existing classifiers widely used in the biological literature. It is anticipated that the ensemble classifier idea can also be used to improve prediction quality in classifying other protein attributes from their sequences.
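The fusion-by-voting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the constituent spaces are taken to differ only in the number of sequence-order terms (`lam`), the correlation terms use a single property scale (Kyte-Doolittle hydropathy as a stand-in), distances are Euclidean, and the vote is a plain unweighted majority. The function names, the `weight` value, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values; an assumed stand-in property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pseaa_vector(seq, lam, weight=0.05):
    """Return a simplified (20 + lam)-dimensional pseudo amino acid composition.

    The first 20 components are residue frequencies; the last `lam` components
    are sequence-order correlation terms for residue gaps 1..lam.
    """
    n = len(seq)
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        diffs = [(HYDROPATHY[seq[i]] - HYDROPATHY[seq[i + k]]) ** 2
                 for i in range(n - k)]
        thetas.append(sum(diffs) / (n - k))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


def nearest_neighbor_label(query_vec, train_vecs, train_labels):
    """Return the label of the training vector closest to the query."""
    best = min(range(len(train_vecs)),
               key=lambda i: math.dist(query_vec, train_vecs[i]))
    return train_labels[best]


def ensemble_predict(query, train_seqs, train_labels, lams=(1, 2, 3)):
    """Fuse one NN classifier per value of lam by majority vote."""
    votes = []
    for lam in lams:
        train_vecs = [pseaa_vector(s, lam) for s in train_seqs]
        query_vec = pseaa_vector(query, lam)
        votes.append(nearest_neighbor_label(query_vec, train_vecs, train_labels))
    # Ties go to the label that was voted first (an arbitrary choice here).
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up toy sequences and labels, for illustration only.
    train_seqs = ["LLIVVALLIVAAGLLIVV", "DEKRDEKSNQDEKRST"]
    train_labels = ["multi-pass", "type-I"]
    print(ensemble_predict("LLIVAALLVVAGGLLIVA", train_seqs, train_labels))
```

Each value in `lams` defines one feature space and therefore one constituent classifier; adding or removing values changes the size of the ensemble without touching the voting step.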

Keywords: Type-I – Type-II – Multi-pass transmembrane – Lipid-chain-anchored – GPI-anchored – Pseudo-amino acid composition – Ensemble classifier – Fusion – Voting 





Copyright information

© Springer-Verlag 2006

Authors and Affiliations

  • H.-B. Shen (1)
  • K.-C. Chou (1, 2)

  1. Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China
  2. Gordon Life Science Institute, San Diego, U.S.A.
