Prediction of Protein Beta-Sheets: Dynamic Programming versus Grammatical Approach

  • Yuki Kato
  • Tatsuya Akutsu
  • Hiroyuki Seki
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5265)

Abstract

Protein secondary structure prediction is a major task in bioinformatics, and various methods from pattern recognition and machine learning have been applied to it. Predicting β-sheet structures is particularly challenging because they span several discontinuous regions of an amino acid sequence. In this paper, we propose a dynamic programming algorithm for a class of antiparallel β-sheets; the approach can be extended to more general classes of β-sheets. Experimental results on real data show that our prediction algorithm achieves good accuracy. We also show a relation between the proposed algorithm and a grammar-based method. Furthermore, we prove that prediction of planar β-sheet structures is NP-hard.

Keywords

β-sheet, dynamic programming, formal grammar, computational complexity

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yuki Kato (1)
  • Tatsuya Akutsu (1)
  • Hiroyuki Seki (2)
  1. Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Japan
  2. Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Japan
