Finding Signal Peptides in Human Protein Sequences Using Recurrent Neural Networks

  • Conference paper
Algorithms in Bioinformatics (WABI 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2452))

Abstract

Sigfind, a new approach for predicting signal peptides in human protein sequences, is introduced. The method is based on the bidirectional recurrent neural network architecture. Modifications to this architecture, together with an improved learning algorithm, yield very accurate identification of signal peptides (99.5% correct in fivefold cross-validation). The Sigfind system is available on the WWW for predictions (http://www.stepc.gr/synaptic/sigfind.html).
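
The fivefold cross-validation behind the reported accuracy can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_fold_splits` and `cross_validated_accuracy`, the `train_and_predict` callback, and the shuffling seed are all assumptions made for illustration. The idea is that the data set is partitioned into five disjoint folds, and each sequence is classified by a model trained only on the other four.

```python
import random


def k_fold_splits(n_items, k=5, seed=0):
    """Partition indices 0..n_items-1 into k disjoint test folds.

    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # hypothetical fixed seed for repeatability
    folds = [indices[i::k] for i in range(k)]  # every k-th index goes to the same fold
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def cross_validated_accuracy(labels, train_and_predict, k=5, seed=0):
    """Fraction of items labelled correctly by models that never saw them.

    train_and_predict(train_indices, test_indices) is a hypothetical callback
    that fits a classifier on the training indices and returns one predicted
    label per test index, in order.
    """
    correct = 0
    for train, test in k_fold_splits(len(labels), k, seed):
        predictions = train_and_predict(train, test)
        correct += sum(1 for idx, pred in zip(test, predictions) if pred == labels[idx])
    return correct / len(labels)
```

Here `labels` would hold 1 for sequences with a signal peptide and 0 otherwise, and the callback would wrap whatever classifier is being evaluated. Every item lands in exactly one test fold, so the resulting figure is an accuracy over the whole data set.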

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Reczko, M., Fiziev, P., Staub, E., Hatzigeorgiou, A. (2002). Finding Signal Peptides in Human Protein Sequences Using Recurrent Neural Networks. In: Guigó, R., Gusfield, D. (eds) Algorithms in Bioinformatics. WABI 2002. Lecture Notes in Computer Science, vol 2452. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45784-4_5

  • DOI: https://doi.org/10.1007/3-540-45784-4_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44211-0

  • Online ISBN: 978-3-540-45784-8

  • eBook Packages: Springer Book Archive
