Protein Secondary Structure Prediction with Bidirectional Recurrent Neural Nets: Can Weight Updating for Each Residue Enhance Performance?

  • Michalis Agathocleous
  • Georgia Christodoulou
  • Vasilis Promponas
  • Chris Christodoulou
  • Vassilis Vassiliades
  • Antonis Antoniou
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 339)

Abstract

Successful protein secondary structure prediction (PSSP) is an important step towards modelling protein 3D structure, with several practical applications. Although many PSSP algorithms have been proposed over the last four decades, prediction accuracy remains far from perfect. The Bidirectional Recurrent Neural Network (BRNN) architecture of Baldi et al. [1] is currently considered one of the best-performing neural network architectures for this problem. In this paper, we implement the same BRNN architecture but use a modified training procedure. More specifically, we aim to identify the relative contribution of local versus global information by varying the length of the segment on which the recurrent neural networks operate for each residue position considered. The network is trained with the backpropagation learning algorithm using an online procedure in which the weights are updated after every amino acid, whereas Baldi et al. [1] apply the weight updates after the presentation of the entire protein. Our results with a single BRNN exceed those of Baldi et al. [1] by three percentage points (Q3) and are comparable to the results of [1] obtained with an ensemble of six BRNNs. Our results improve further when the sequence-to-structure output is filtered in a post-processing step with a novel Hidden Markov Model-based approach.
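The contrast between the two update schedules described in the abstract (weights updated after every residue versus once per protein) can be illustrated with a minimal sketch. This is not the BRNN of [1] or the authors' implementation: a single softmax layer stands in for the network so that only the update schedule differs, and every name here (`train_epoch`, `per_residue`, `q3`, the integer class labels) is a hypothetical choice made for illustration. The `q3` helper assumes the usual reading of Q3 as the percentage of residues assigned the correct three-state label.

```python
import math

CLASSES = "HEC"  # three-state labels: helix, strand, coil (assumed encoding)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(weights, x):
    # weights holds one row of len(x) coefficients per class
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def predict(weights, x):
    probs = predict_probs(weights, x)
    return CLASSES[probs.index(max(probs))]

def gradient(weights, x, label):
    # Cross-entropy gradient of a softmax layer: (p_k - onehot_k) * x
    probs = predict_probs(weights, x)
    return [[(p - (1.0 if k == label else 0.0)) * xi for xi in x]
            for k, p in enumerate(probs)]

def apply_update(weights, grad, lr):
    for row, grad_row in zip(weights, grad):
        for j, g in enumerate(grad_row):
            row[j] -= lr * g

def train_epoch(weights, proteins, lr, per_residue):
    """proteins: list of (features, labels), one feature vector per residue."""
    for features, labels in proteins:
        if per_residue:
            # Online schedule: update the weights after every residue.
            for x, y in zip(features, labels):
                apply_update(weights, gradient(weights, x, y), lr)
        else:
            # Per-protein schedule: accumulate gradients over the whole
            # sequence and apply them once at the end.
            acc = [[0.0] * len(row) for row in weights]
            for x, y in zip(features, labels):
                for acc_row, grad_row in zip(acc, gradient(weights, x, y)):
                    for j, g in enumerate(grad_row):
                        acc_row[j] += g
            apply_update(weights, acc, lr)

def q3(predicted, observed):
    """Percentage of residues whose three-state label is predicted correctly."""
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

With the online schedule, each residue's gradient is computed from weights already changed by the preceding residues of the same protein. With the per-protein schedule, all gradients for a protein are computed from the same weights. The two schedules therefore generally produce different weights even from identical data and learning rate.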

Keywords

Protein Secondary Structure Prediction; Bidirectional Recurrent Neural Networks; Bioinformatics and Computational Biology

References

  1. Baldi, P., Brunak, S., Frasconi, P., Soda, G., Pollastri, G.: Exploiting the past and the future in protein secondary structure prediction. Bioinformatics 15, 937–946 (1999)
  2. Anfinsen, C.: Studies on the principles that govern the folding of protein chains. Les Prix Nobel en 1972 1, 103–119 (1973)
  3. Richards, F., Kundrot, C.: Identification of structural motifs from protein coordinate data: Secondary structure and first-level super secondary structure. Proteins 3, 71–84 (1988)
  4. Pollastri, G., Przybylski, D., Rost, B., Baldi, P.: Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins 47, 228–235 (2002)
  5. Rost, B., Eyrich, V.: EVA: large-scale analysis of secondary structure prediction. Proteins 5, 192–199 (2001)
  6. Qian, N., Sejnowski, T.: Predicting the secondary structure of globular proteins using neural network models. Journal of Molecular Biology 202, 865–884 (1988)
  7. Rost, B., Sander, C.: Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proc. Natl. Acad. Sci. USA 90(16), 7558–7562 (1993)
  8. Rost, B., Sander, C.: Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19, 55–72 (1994)
  9. Salamov, A., Soloveyev, V.: Protein secondary structure prediction using local alignments. Journal of Molecular Biology 268, 31–36 (1997)
  10. Cuff, J., Barton, G.: Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins 34, 508–519 (1999)
  11. Chen, J., Chaudhari, N.S.: Cascaded bidirectional recurrent neural networks for protein secondary structure prediction. IEEE/ACM Trans. Comput. Biology Bioinform. 4(4), 572–582 (2007)
  12. Zemla, A., Venclovas, C., Fidelis, K., Rost, B.: A modified definition of Sov, a segment-based measure for protein secondary structure prediction assessment. Proteins 34(2), 220–223 (1999)
  13. Hobohm, U., Scharf, M., Schneider, R., Sander, C.: Selection of representative protein data sets. Protein Science 1, 409–417 (1992)
  14. Kabsch, W., Sander, C.: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22, 2577–2637 (1983)
  15. Schneider, R., Sander, C.: The HSSP database of protein structure-sequence alignments. Nucleic Acids Research 24, 201–205 (1996)
  16. Elman, J.: Finding structure in time. Cognitive Science 14, 179–211 (1990)
  17. Baram, D., Yonath, A.: From peptide-bond formation to cotranslational folding: dynamic, regulatory and evolutionary aspects. FEBS Lett. 579(4), 948–954 (2005)
  18. Rumelhart, D., Hinton, G., Williams, R.: Learning representations by back-propagating errors. Nature 323, 533–536 (1986)

Copyright information

© IFIP 2010

Authors and Affiliations

  • Michalis Agathocleous (1)
  • Georgia Christodoulou (1)
  • Vasilis Promponas (2)
  • Chris Christodoulou (1)
  • Vassilis Vassiliades (1)
  • Antonis Antoniou (1)

  1. Dept. of Computer Science
  2. Dept. of Biological Sciences, University of Cyprus, Nicosia, Cyprus
