Amino Acids, Volume 42, Issue 5, pp 1619–1625

Predicting subcellular location of apoptosis proteins with pseudo amino acid composition: approach from amino acid substitution matrix and auto covariance transformation

  • Xiaoqing Yu
  • Xiaoqi Zheng
  • Taigang Liu
  • Yongchao Dou
  • Jun Wang
Original Article


Apoptosis proteins are central to understanding the mechanism of programmed cell death, and knowing their subcellular location helps clarify that mechanism. In this paper, we introduce a new sequence-based model built on an amino acid substitution matrix and the auto covariance transformation. The model not only describes the differences between amino acids quantitatively but also partially incorporates sequence-order information. We apply it, with a support vector machine classifier, to predict the subcellular location of apoptosis proteins in two widely used datasets. The results of the jackknife test are quite promising, indicating that the proposed method might serve as an efficient model for predicting the subcellular location of apoptosis proteins.
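The feature construction described above can be illustrated with a minimal sketch. It assumes the commonly used form of the auto covariance transformation (the mean-centred product of a column's values at positions i and i + lag, averaged over the sequence); the abstract does not state the exact formula. The three-letter alphabet, the scores in `TOY_MATRIX`, and the function names are hypothetical placeholders, not the matrix or code used in the paper.

```python
# Toy 3-letter alphabet with made-up substitution scores.
# Assumption: these values are illustrative only, NOT a real substitution matrix.
TOY_MATRIX = {
    "A": [4.0, 0.0, -2.0],
    "C": [0.0, 9.0, -3.0],
    "D": [-2.0, -3.0, 6.0],
}


def encode(sequence, matrix):
    """Replace each residue by its row of substitution scores."""
    return [matrix[residue] for residue in sequence]


def auto_covariance(encoded, max_lag):
    """Return a flat feature vector of length max_lag * n_columns.

    For each lag and each column j, average the product of the
    mean-centred values at positions i and i + lag.
    """
    length = len(encoded)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    n_cols = len(encoded[0])
    # Column means over the whole sequence.
    means = [sum(row[j] for row in encoded) / length for j in range(n_cols)]
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (encoded[i][j] - means[j]) * (encoded[i + lag][j] - means[j])
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Sequences of different lengths map to vectors of the same size.
features = auto_covariance(encode("ACDAACD", TOY_MATRIX), max_lag=2)
```

Because the output length depends only on the maximum lag and the number of matrix columns, proteins of different lengths yield fixed-size vectors, which is what a classifier such as a support vector machine requires as input.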


Apoptosis proteins; Subcellular location; Substitution matrix; Auto covariance transformation; Support vector machine



This work was partially supported by the National Natural Science Foundation of China (No. 10731040), Shanghai Leading Academic Discipline Project (No. S30405) and Innovation Program of Shanghai Municipal Education Commission (No. 09zz134).



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Xiaoqing Yu (1)
  • Xiaoqi Zheng (1, 2)
  • Taigang Liu (3)
  • Yongchao Dou (4)
  • Jun Wang (1, 2)
  1. Department of Mathematics, Shanghai Normal University, Shanghai, China
  2. Scientific Computing Key Laboratory of Shanghai Universities, Shanghai, China
  3. College of Information Sciences and Engineering, Shandong Agricultural University, Taian, China
  4. School of Mathematical Sciences, Dalian University of Technology, Dalian, China
