A Guide to UniProt for Protein Scientists

  • Claire O’Donovan
  • Rolf Apweiler
Part of the Methods in Molecular Biology book series (MIMB, volume 694)


One of the essential requirements of the proteomics community is a high-quality, annotated, nonredundant protein sequence database with stable identifiers and an archival service to enable protein identification and characterization. This chapter illustrates how the Universal Protein Resource (UniProt) (The UniProt Consortium, Nucleic Acids Res. 38:D142–D148, 2010) can best be used for proteomics, with a particular focus on exploiting the knowledge captured in the UniProt databases, the services provided, and the availability of complete proteomes.

Key words

Protein sequence database; Annotation; Stable identifiers; Complete proteome; Archive; Nonredundant


  1. The UniProt Consortium. (2010) The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res. 38, D142–D148.
  2. Leinonen R, Diez FG, Binns D, Fleischmann W, Lopez R, Apweiler R. (2004) UniProt archive. Bioinformatics 20, 3236–3237.
  3. Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH. (2007) UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 23, 1282–1288.
  4. Kersey P, Hermjakob H, Apweiler R. (2000) VARSPLIC: alternatively-spliced protein sequences derived from Swiss-Prot and TrEMBL. Bioinformatics 16, 1048–1049.
  5. Gattiker A, Michoud K, Rivoire C, Auchincloss AH, Coudert E, Lima T, Kersey P, Pagni M, Sigrist CJ, Lachaize C, et al. (2003) Automated annotation of microbial proteomes in SWISS-PROT. Comput. Biol. Chem. 27, 49–58.
  6. Fleischmann W, Moller S, Gateau A, Apweiler R. (1999) A novel method for automatic functional annotation of proteins. Bioinformatics 15, 228–233.
  7. Wu CH, Nikolskaya A, Huang H, Yeh L-S, Natale DA, Vinayaka CR, Hu ZZ, Mazumder R, Kumar S, Kourtesis P, et al. (2004) PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res. 32, D112–D114.
  8. Natale DA, Vinayaka CR, Wu CH. (2004) Large-scale, classification-driven, rule-based functional annotation of proteins. In: Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics – Subramaniam S, ed. Bioinformatics. John Wiley, West Sussex, England.
  9. Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, et al. (2008) The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 36, D1009–D1014.
  10. Hong EL, Balakrishnan R, Dong Q, Christie KR, Park J, Binkley G, Costanzo MC, Dwight SS, Engel SR, Fisk DG, et al. (2008) Gene Ontology annotations at SGD: new data sources and annotation methods. Nucleic Acids Res. 36, D577–D581.
  11. Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, et al. (2008) Ensembl 2008. Nucleic Acids Res. 36, D707–D714.
  12. The Gene Ontology Consortium. (2000) Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29.
  13. Kersey PJ, Duarte J, Williams A, Karavidopoulou Y, Birney E, Apweiler R. (2004) The International Protein Index: an integrated database for proteomics experiments. Proteomics 4, 1985–1988.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Claire O’Donovan (1)
  • Rolf Apweiler (1)
  1. The European Bioinformatics Institute, Cambridge, UK
