Recognition of β-hairpin motifs in proteins by using the composite vector
Abstract
A composite vector method for predicting β-hairpin motifs in proteins is proposed. The vector combines a scoring-matrix score, the increment of diversity, a distance value and auto-correlation information to represent the sequence. The prediction is based on an analysis of 3,088 non-homologous protein chains containing 6,035 β-hairpin motifs and 2,738 non-β-hairpin motifs. The overall prediction accuracy and the Matthews correlation coefficient are 83.1% and 0.59, respectively. Applying the same method to another dataset of 2,878 non-homologous protein chains, which contains 4,884 β-hairpin motifs and 4,310 non-β-hairpin motifs, gives an accuracy of 80.7% and a Matthews correlation coefficient of 0.61. Better results are also obtained for the prediction of β-hairpin motifs in the CASP6 dataset.
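The abstract names the increment of diversity as one feature and reports accuracy and the Matthews correlation coefficient, but gives no formulas. The sketch below uses the standard textbook definitions, which are assumed here rather than taken from the paper: the measure of diversity D(X) = N log N − Σ nᵢ log nᵢ, the increment ID(X, Y) = D(X + Y) − D(X) − D(Y), and the usual confusion-matrix form of the Matthews coefficient. The function names, the natural logarithm and the 20-letter amino acid alphabet are illustrative choices, not the authors' implementation.

```python
import math

# Assumption: features are counts over the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_counts(sequence):
    """Count each standard amino acid in a sequence (other letters are ignored)."""
    return [sequence.count(aa) for aa in AMINO_ACIDS]


def diversity(counts):
    """Measure of diversity: N*log(N) - sum(n_i*log(n_i)); zero counts add nothing."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for two count vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def accuracy_and_mcc(tp, tn, fp, fn):
    """Overall accuracy and Matthews correlation coefficient from a confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when the coefficient is undefined (zero denominator).
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc
```

A small increment of diversity means the two count vectors have similar composition: it is zero when the vectors are identical and grows as they diverge. In a classifier built on it, a candidate segment would be compared against a hairpin source and a non-hairpin source, and the two increments used as features.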
Keywords
β-Hairpin, Scoring matrix, Increment of diversity, Auto-correlation function
Notes
Acknowledgments
This work was supported by the National Natural Science Foundation of China (30560039) and the Project for Excellent Subject-directors of Inner Mongolia Autonomous Region.