Protein Identification and Analysis Tools on the ExPASy Server

  • Elisabeth Gasteiger
  • Christine Hoogland
  • Alexandre Gattiker
  • Séverine Duvaud
  • Marc R. Wilkins
  • Ron D. Appel
  • Amos Bairoch

Abstract

Protein identification and analysis software plays a central role in the investigation of proteins from two-dimensional (2-D) gels and mass spectrometry. For protein identification, the user matches empirically acquired information against a protein database to determine whether a protein is already known or novel. For protein analysis, information in protein databases can be used to predict properties of a protein, which can guide its empirical investigation. The two processes are thus complementary. Although numerous programs are available for these applications, we have developed a set of original tools with a few main goals in mind. Specifically, these are:
  1. To utilize the extensive annotation available in the Swiss-Prot database (1) wherever possible, in particular the position-specific annotation in the Swiss-Prot feature tables to take into account posttranslational modifications and protein processing.
  2. To develop tools specifically, but not exclusively, applicable to proteins prepared by two-dimensional gel electrophoresis and peptide mass fingerprinting experiments.
  3. To make all tools available on the World-Wide Web (WWW), and freely usable by the scientific community.

References

  1. Boeckmann, B., Bairoch, A., Apweiler, R., et al. (2003) The Swiss-Prot protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 31, 365–370.
  2. Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D., and Bairoch, A. (2003) ExPASy—the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 31, 3784–3788.
  3. Apweiler, R., Bairoch, A., Wu, C. H., et al. (2004) UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 32, D115–D119.
  4. Jung, E., Gasteiger, E., Veuthey, A.-L., and Bairoch, A. (2001) Annotation of glycoproteins in the SWISS-PROT database. Proteomics 1, 262–268.
  5. Farriol-Mathis, N., Garavelli, J. S., Boeckmann, B., et al. (2004) Annotation of post-translational modifications in the Swiss-Prot knowledgebase. Proteomics, in press.
  6. Gill, S. C. and von Hippel, P. H. (1989) Calculation of protein extinction coefficients from amino acid sequence data. Anal. Biochem. 182, 319–326.
  7. Kyte, J. and Doolittle, R. F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132.
  8. Wilkins, M. R., Lindskog, I., Gasteiger, E., et al. (1997) Detailed peptide characterization using PEPTIDEMASS—a World-Wide-Web-accessible tool. Electrophoresis 18, 403–408.
  9. Keil, B. (1992) Specificity of proteolysis. Springer-Verlag Berlin-Heidelberg New York, p. 335.
  10. Wilkins, M. R., Gasteiger, E., Sanchez, J.-C., Appel, R. D., and Hochstrasser, D. F. (1996) Protein identification with sequence tags. Curr. Biol. 6, 1543–1544.
  11. Wilkins, M. R., Gasteiger, E., Tonella, L., et al. (1998) Protein identification with N- and C-terminal sequence tags in proteome projects. J. Mol. Biol. 278, 599–608.
  12. Ashburner, M., Ball, C. A., Blake, J. A., et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29.
  13. Wilkins, M. R., Pasquali, C., Appel, R. D., et al. (1996) From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis. Bio/Technology 14, 61–65.
  14. Wilkins, M. R., Ou, K., Appel, R. D., et al. (1996) Rapid protein identification using N-terminal “sequence tag” and amino acid analysis. Biochem. Biophys. Res. Commun. 221, 609–613.
  15. Hobohm, U. and Sander, C. (1995) A sequence property approach to searching protein databases. J. Mol. Biol. 251, 390–399.
  16. Cordwell, S. J., Wilkins, M. R., Cerpa-Poljak, A., et al. (1995) Cross-species identification of proteins separated by two-dimensional gel electrophoresis using matrix-assisted laser desorption time of flight mass spectrometry and amino acid composition. Electrophoresis 16, 438–443.
  17. Wheeler, C. H., Berry, S. L., Wilkins, M. R., et al. (1996) Characterisation of proteins from 2-D gels by matrix-assisted laser desorption mass spectrometry and amino acid compositional analysis. Electrophoresis 17, 580–587.
  18. Wilkins, M. R., Gasteiger, E., Wheeler, C., et al. (1998) Multiple parameter cross-species protein identification using MultiIdent—a world wide web accessible tool. Electrophoresis 19, 3199–3206.
  19. Wilkins, M. R., Gasteiger, E., Gooley, A. A., et al. (1999) High-throughput mass spectrometric discovery of protein post-translational modifications. J. Mol. Biol. 289, 645–657.
  20. Hulo, N., Sigrist, C. J., Le Saux, V., et al. (2004) Recent improvements to the PROSITE database. Nucleic Acids Res. 32, D134–D137.
  21. Henikoff, S. and Henikoff, J. G. (1993) Performance evaluation of amino acid substitution matrices. Proteins 17, 49–61.
  22. Cooper, C. A., Gasteiger, E., and Packer, N. (2001) GlycoMod—a software tool for determining glycosylation compositions from mass spectrometric data. Proteomics 1, 340–349.
  23. Cooper, C. A., Gasteiger, E., and Packer, N. (2003) Predicting glycan composition from experimental mass using GlycoMod. In: Handbook of Proteomic Methods (Conn, P. M., ed.), Humana, Totowa, NJ, pp. 225–231.
  24. Gattiker, A., Bienvenut, W. V., Bairoch, A., and Gasteiger, E. (2002) FindPept, a tool to identify unmatched masses in peptide mass fingerprinting protein identification. Proteomics 2, 1435–1444.
  25. Bjellqvist, B., Hughes, G., Pasquali, C., et al. (1993) The focusing positions of polypeptides in immobilized pH gradients can be predicted from their amino acid sequences. Electrophoresis 14, 1023–1031.
  26. Bjellqvist, B., Basse, B., Olsen, E., and Celis, J. E. (1994) Reference points for comparisons of two-dimensional maps of proteins from different human cell types defined in a pH scale where isoelectric points correlate with polypeptide compositions. Electrophoresis 15, 529–539.
  27. Hoogland, C., Sanchez, J.-C., Tonella, L., et al. (2000) The 1999 SWISS-2DPAGE database update. Nucleic Acids Res. 28, 286–288.
  28. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997) Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10, 1–6.
  29. Bachmair, A., Finley, D., and Varshavsky, A. (1986) In vivo half-life of a protein is a function of its amino-terminal residue. Science 234, 179–186.
  30. Gonda, D. K., Bachmair, A., Wunning, I., Tobias, J. W., Lane, W. S., and Varshavsky, A. J. (1989) Universality and structure of the N-end rule. J. Biol. Chem. 264, 16,700–16,712.
  31. Tobias, J. W., Shrader, T. E., Rocap, G., and Varshavsky, A. (1991) The N-end rule in bacteria. Science 254, 1374–1377.
  32. Ciechanover, A. and Schwartz, A. L. (1989) How are substrates recognized by the ubiquitin-mediated proteolytic system? Trends Biochem. Sci. 14, 483–488.
  33. Guruprasad, K., Reddy, B. V. B., and Pandit, M. W. (1990) Correlation between stability of a protein and its dipeptide composition: a novel approach for predicting in vivo stability of a protein from its primary sequence. Protein Eng. 4, 155–161.
  34. Ikai, A. J. (1980) Thermostability and aliphatic index of globular proteins. J. Biochem. 88, 1895–1898.
  35. Gattiker, A., Gasteiger, E., and Bairoch, A. (2002) ScanProsite: a reference implementation of a PROSITE scanning tool. Applied Bioinform. 1, 107–108.
  36. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
  37. Golaz, O., Wilkins, M. R., Sanchez, J.-C., Appel, R. D., Hochstrasser, D. F., and Williams, K. L. (1996) Identification of proteins by their amino acid composition: an evaluation of the method. Electrophoresis 17, 573–579.
  38. Wilkins, M. R. and Williams, K. L. (1997) Cross-species protein identification using amino acid composition, peptide mass fingerprinting, isoelectric point and molecular mass: a theoretical evaluation. J. Theor. Biol. 186, 7–15.
  39. Hara, S., Rosenfeld, R., and Lu, H. S. (1996) Preventing the generation of artifacts during peptide map analysis of recombinant human insulin-like growth factor-I. Anal. Biochem. 243, 74–79.
  40. Parker, K. C., Garrels, J. I., Hines, W., et al. (1998) Identification of yeast proteins from two-dimensional gels: working out spot cross-contamination. Electrophoresis 19, 1920–1932.

Copyright information

© Humana Press Inc., Totowa, NJ 2005

Authors and Affiliations

  • Elisabeth Gasteiger (1)
  • Christine Hoogland (1)
  • Alexandre Gattiker (1)
  • Séverine Duvaud (1)
  • Marc R. Wilkins (2)
  • Ron D. Appel (3)
  • Amos Bairoch (4)

  1. Swiss Institute of Bioinformatics, Geneva, Switzerland
  2. Proteome Systems Ltd., Sydney, Australia
  3. Swiss Institute of Bioinformatics, University and Geneva University Hospital, Geneva, Switzerland
  4. Swiss Institute of Bioinformatics, Geneva, Switzerland
