Amino Acids, Volume 38, Issue 3, pp 915–921

Recognition of β-hairpin motifs in proteins by using the composite vector

Original Article

Abstract

A composite vector method for predicting β-hairpin motifs in proteins is proposed. It expresses sequence information by combining the scoring-matrix score, the increment of diversity, the distance value and auto-correlation information. The prediction is based on an analysis of 3,088 non-homologous protein chains containing 6,035 β-hairpin motifs and 2,738 non-β-hairpin motifs. The overall prediction accuracy and Matthews correlation coefficient are 83.1% and 0.59, respectively. Applying the same method to another dataset of 2,878 non-homologous protein chains, which contains 4,884 β-hairpin motifs and 4,310 non-β-hairpin motifs, gives an accuracy of 80.7% and a Matthews correlation coefficient of 0.61. Better results are also obtained in predicting β-hairpin motifs on the CASP6 dataset.
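As an illustration of two quantities named in the abstract, the sketch below computes the increment of diversity between two count vectors and the accuracy and Matthews correlation coefficient from confusion counts. It assumes the standard definitions, D(X) = N log N − Σ n_i log n_i and ID(X, Y) = D(X + Y) − D(X) − D(Y), as introduced in the diversity-measure literature; the abstract does not state which features the authors feed into these measures, so the amino acid composition vector and all function names here are hypothetical choices for the example, not the authors' implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequence):
    """Count vector of the 20 standard amino acids (an assumed feature choice)."""
    return [sequence.count(aa) for aa in AMINO_ACIDS]

def diversity(counts):
    """Measure of diversity D(X) = N log N - sum_i n_i log n_i; zero counts add 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)

def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); small when the two sources are similar."""
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)

def accuracy_and_mcc(tp, tn, fp, fn):
    """Overall accuracy and Matthews correlation coefficient from confusion counts."""
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report MCC as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

if __name__ == "__main__":
    # Hypothetical sequences, used only to exercise the functions.
    query = composition("VKVTYNGKE")
    reference = composition("TVKVNGQEY")
    print(increment_of_diversity(query, reference))
    print(accuracy_and_mcc(tp=50, tn=30, fp=10, fn=10))
```

Because the two classes in the first dataset are unbalanced (6,035 versus 2,738 motifs), accuracy and the Matthews correlation coefficient need not move together, which is why the abstract reports both.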

Keywords

β-Hairpin; Scoring matrix; Increment of diversity; Auto-correlation function

Notes

Acknowledgments

This work was supported by the National Natural Science Foundation of China (30560039) and the Project for Excellent Subject-directors of Inner Mongolia Autonomous Region.

Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  1. College of Sciences, Inner Mongolia University of Technology, Hohhot, People’s Republic of China
  2. Laboratory of Theoretical Biophysics, College of Physical Science and Technology, Inner Mongolia University, Hohhot, People’s Republic of China