The Journal of Membrane Biology, Volume 248, Issue 6, pp 1033–1041

iCataly-PseAAC: Identification of Enzymes Catalytic Sites Using Sequence Evolution Information with Grey Model GM (2,1)

  • Xuan Xiao
  • Meng-Juan Hui
  • Zi Liu
  • Wang-Ren Qiu


Enzymes play pivotal roles in most biological reactions. The catalytic residues of an enzyme are defined as the amino acids that are directly involved in chemical catalysis; knowledge of these residues is important for understanding enzyme function. Given an enzyme, which residues are catalytic sites and which are not? This is the first important problem for an in-depth understanding of the catalytic mechanism and for drug development. With the explosive growth of protein sequences generated during the post-genomic era, it is highly desirable for both basic research and drug design to develop fast and reliable methods for identifying the catalytic sites of enzymes from their sequences. To address this problem, we proposed a new predictor, called iCataly-PseAAC. In the prediction system, each peptide sample was formulated with sequence evolution information via the grey system model GM(2,1). The rigorous jackknife test and an independent dataset test showed that iCataly-PseAAC was superior to existing predictors even though it uses only sequence information. iCataly-PseAAC is freely accessible as a user-friendly web server. For the convenience of most experimental scientists, a step-by-step guide has been provided on how to use the web server to get the desired results.


Catalytic active sites; Pseudo amino acid composition; Grey system model; Web server; iCataly-PseAAC



This work was partially supported by the National Natural Science Foundation of China (Nos. 31260273, 61261027), the Natural Science Foundation of Jiangxi Province, China (Nos. 20114BAB211013, 20122BAB211033, 20122BAB201044, 20122BAB201020), the Department of Education of JiangXi Province (GJJ12490), the LuoDi plan of the Department of Education of JiangXi Province (KJLD12083), the JiangXi Provincial Foundation for Leaders of Disciplines in Science (20113BCB22008), and the Graduate Innovation Fund of Jingdezhen Ceramic Institute (JYC1310, JYC201427).

Supplementary material

Supplementary material 1: 232_2015_9815_MOESM1_ESM.doc (DOC, 864 kb)
Supplementary material 2: 232_2015_9815_MOESM2_ESM.doc (DOC, 2544 kb)
Supplementary material 3: 232_2015_9815_MOESM3_ESM.doc (DOC, 292 kb)
Supplementary material 4: 232_2015_9815_MOESM4_ESM.doc (DOC, 829 kb)



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Xuan Xiao (1, 2, 3)
  • Meng-Juan Hui (1)
  • Zi Liu (1)
  • Wang-Ren Qiu (1)

  1. Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China
  2. Information School, ZheJiang Textile & Fashion College, NingBo, China
  3. Gordon Life Science Institute, Belmont, USA
