Protein-Centric Data Integration for Functional Analysis of Comparative Proteomics Data

  • Peter B. McGarvey
  • Jian Zhang
  • Darren A. Natale
  • Cathy H. Wu
  • Hongzhan Huang
Part of the Methods in Molecular Biology book series (MIMB, volume 694)


High-throughput proteomic, microarray, protein interaction, and other experimental methods all generate long lists of proteins and/or genes that have been identified or that varied in accumulation under the experimental conditions studied. Such lists can be difficult for biologists to sort through and interpret. Here we describe a next step in data analysis: a bottom-up approach to data integration. Starting with protein sequence identifications, we map them to a common representation of the protein and then bring in a wide variety of structural, functional, genetic, and disease information derived from annotated knowledge bases. This information is then used to categorize the lists using Gene Ontology (GO) terms and mappings to biological pathway databases. We illustrate with examples how this approach can aid in identifying important processes from large, complex lists.
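The two core steps named above, mapping identifiers to a common protein representation and then categorizing the list by GO terms, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the identifiers, accessions, and GO labels are made-up placeholders, the lookup tables are small in-memory dictionaries rather than a real knowledge base, and the function names `map_to_common` and `categorize_by_go` are hypothetical. It is not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: every identifier and annotation here is a placeholder.
ID_TO_ACCESSION = {
    "gi|0001": "ACC_A",
    "locus_0002": "ACC_A",  # two source IDs that resolve to the same protein
    "gi|0003": "ACC_B",
    "gi|0004": "ACC_C",
}

ACCESSION_TO_GO = {
    "ACC_A": {"GO:example_transport"},
    "ACC_B": {"GO:example_transport", "GO:example_stress_response"},
    "ACC_C": set(),  # a protein with no GO annotation
}


def map_to_common(ids, id_map):
    """Collapse source identifiers to unique common accessions.

    Returns the accessions in first-seen order plus the IDs that could not be mapped.
    """
    mapped, unmapped, seen = [], [], set()
    for ident in ids:
        acc = id_map.get(ident)
        if acc is None:
            unmapped.append(ident)
        elif acc not in seen:
            seen.add(acc)
            mapped.append(acc)
    return mapped, unmapped


def categorize_by_go(accessions, go_map):
    """Count proteins per GO term and record which proteins fall under each term."""
    counts = Counter()
    members = defaultdict(list)
    for acc in accessions:
        # Proteins without annotation are grouped under an explicit label.
        terms = go_map.get(acc) or {"unannotated"}
        for term in sorted(terms):
            counts[term] += 1
            members[term].append(acc)
    return counts, dict(members)


experiment_ids = ["gi|0001", "locus_0002", "gi|0003", "gi|0004", "gi|9999"]
accessions, unmapped = map_to_common(experiment_ids, ID_TO_ACCESSION)
go_counts, go_members = categorize_by_go(accessions, ACCESSION_TO_GO)

for term, n in go_counts.most_common():
    print(term, n, go_members[term])
print("unmapped:", unmapped)
```

Collapsing redundant identifiers before counting keeps a protein that appears under several source IDs from inflating a category, and reporting unmapped IDs separately makes gaps in the mapping visible instead of silently dropping them.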

Key words

Gene Ontology; Biological pathways; Protein database; UniProtKB; Proteomics; Bioinformatics



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Peter B. McGarvey (1)
  • Jian Zhang (1)
  • Darren A. Natale (1)
  • Cathy H. Wu (2)
  • Hongzhan Huang (2)
  1. Department of Biochemistry and Molecular & Cellular Biology, Georgetown University Medical Center, Washington, USA
  2. Department of Computer and Information Sciences, University of Delaware, Newark, USA
