Getting the Most from Your Protein Sequence

  • Protocol
Protein Sequencing Protocols

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 211)

Abstract

The diversity and complexity of the bioinformatics tools currently available for protein sequence analysis can make it difficult to know where to begin when presented with a new sequence. In this chapter we assume that the reader has a protein sequence (full-length or partial), identified by mass spectrometry or by translation of a putative gene, and wishes to identify aspects of its structure and function using bioinformatics. We go through a protocol outlining one approach that should give the most complete picture possible given the limits of the available tools, and then provide a worked example to illustrate the procedures involved. The scope of this chapter does not allow complete details of all the methods discussed; we refer the reader to the references and websites described in the text for more information.

References

  1. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

  2. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.

  3. Barton, G. J. (1993) An efficient algorithm to locate all locally optimal alignments between two sequences allowing for gaps. Comput. Appl. Biosci. 9, 729–734.

  4. Perutz, M. F. (1999) Glutamine repeats and neurodegenerative diseases: molecular aspects. Trends Biochem. Sci. 24, 58–63.

  5. Wootton, J. C. (1994) Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput. Chem. 18, 269–285.

  6. Lupas, A., Van Dyke, M., and Stock, J. (1991) Predicting coiled coils from protein sequences. Science 252, 1162–1164.

  7. Berger, B., Wilson, D. B., Wolf, E., Tonchev, T., Milla, M., and Kim, P. S. (1995) Predicting coiled coils by use of pairwise residue correlations. Proc. Natl. Acad. Sci. USA 92, 8259–8263.

  8. Wolf, E., Kim, P. S., and Berger, B. (1997) MultiCoil: a program for predicting two- and three-stranded coiled coils. Protein Sci. 6, 1179–1189.

  9. Hofmann, K. and Stoffel, W. (1993) TMbase: a database of membrane spanning proteins segments. Biol. Chem. Hoppe-Seyler 374, 166.

  10. Sonnhammer, E. L. L., von Heijne, G., and Krogh, A. (1998) A hidden Markov model for predicting transmembrane helices in protein sequences, in Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology (Glasgow, J., Littlejohn, T., Major, F., Lathrop, R., Sankoff, D., and Sensen, C., eds.), AAAI Press, Menlo Park, CA, pp. 175–182.

  11. Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E. L. L. (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580.

  12. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1–6.

  13. Ladunga, I. (1999) PHYSEAN: PHYsical SEquence ANalysis for the identification of protein domains on the basis of physical and chemical properties of amino acids. Bioinformatics 15, 1028–1038.

  14. Nakai, K. (2000) Protein sorting signals and prediction of subcellular localization. Adv. Prot. Chem. 54, 277–344.

  15. Eisenhaber, B., Bork, P., and Eisenhaber, F. (1999) Prediction of potential GPI-modification sites in proprotein sequences. J. Mol. Biol. 292, 741–758.

  16. Emanuelsson, O., Nielsen, H., and von Heijne, G. (1999) ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci. 8, 978–984.

  17. Emanuelsson, O., Nielsen, H., Brunak, S., and von Heijne, G. (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005–1016.

  18. Hofmann, K., Bucher, P., Falquet, L., and Bairoch, A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res. 27, 215–221.

  19. Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Finn, R. D., and Sonnhammer, E. L. L. (1999) Pfam 3.1: 1313 multiple alignments match the majority of proteins. Nucleic Acids Res. 27, 260–262.

  20. Schultz, J., Copley, R. R., Doerks, T., Ponting, C. P., and Bork, P. (2000) SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res. 28, 231–234.

  21. Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

  22. Eddy, S. R. (1996) Hidden Markov models. Curr. Opin. Struct. Biol. 6, 361–365.

  23. Eddy, S. R. (1998) Profile hidden Markov models. Bioinformatics 14, 755–763.

  24. Karplus, K., Barrett, C., and Hughey, R. (1998) Hidden Markov models for detecting remote protein homologies. Bioinformatics 14, 846–856.

  25. Brown, N. P., Leroy, C., and Sander, C. (1998) MView: a web-compatible database search or multiple alignment viewer. Bioinformatics 14, 380–381.

  26. Galtier, N., Gouy, M., and Gautier, C. (1996) SeaView and Phylo_win, two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12, 543–548.

  27. Gibson, T. J. and Spring, J. (1998) Genetic redundancy in vertebrates: polyploidy and persistence of genes encoding multidomain proteins. Trends Genet. 14, 46–49.

  28. Wilson, C. A., Kreychman, J., and Gerstein, M. (2000) Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores. J. Mol. Biol. 297, 233–249.

  29. Peitsch, M. C. (1996) ProMod and Swiss-Model: Internet-based tools for automated comparative protein modelling. Biochem. Soc. Trans. 24, 274–279.

  30. Rost, B. and Sander, C. (1993) Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232, 584–599.

  31. Jones, D. T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292, 195–202.

  32. Cuff, J. A. and Barton, G. J. (2000) Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins 40, 502–511.

  33. Jones, D. T., Taylor, W. R., and Thornton, J. M. (1992) A new approach to protein fold recognition. Nature 358, 86–89.

  34. Rost, B. (1995) TOPITS: threading one-dimensional predictions into three-dimensional structures. Ismb 3, 314–321.

  35. Kelley, L. A., MacCallum, R. M., and Sternberg, M. J. (2000) Enhanced genome annotation using structural profiles in the program 3D-PSSM. J. Mol. Biol. 299, 499–520.

  36. White, J. H., Wise, A., Main, M. J., Green, A., Fraser, N. J., Disney, G. H., et al. (1998) Heterodimerization is required for the formation of a functional GABA(B) receptor. Nature 396, 679–682.

  37. Galvez, T., Parmentier, M. L., Joly, C., Malitschek, B., Kaupmann, K., Kuhn, R., et al. (1999) Mutagenesis and modeling of the GABAB receptor extracellular domain support a venus flytrap mechanism for ligand binding. J. Biol. Chem. 274, 13362–13369.

  38. Dekel, I., Russek, N., Jones, T., Mortin, M. A., and Katzav, S. (2000) Identification of the Drosophila melanogaster homologue of the mammalian signal transducer protein, Vav. FEBS Lett. 472, 99–104.

  39. Kraulis, P. J. (1991) Molscript: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.

Copyright information

© 2003 Humana Press Inc.

About this protocol

Cite this protocol

Copley, R.R., Russell, R.B. (2003). Getting the Most from Your Protein Sequence. In: Smith, B.J. (ed.) Protein Sequencing Protocols. Methods in Molecular Biology™, vol 211. Humana Press. https://doi.org/10.1385/1-59259-342-9:411

  • DOI: https://doi.org/10.1385/1-59259-342-9:411

  • Publisher Name: Humana Press

  • Print ISBN: 978-0-89603-975-9

  • Online ISBN: 978-1-59259-342-2

  • eBook Packages: Springer Protocols
