Abstract
High-throughput proteomic, microarray, protein interaction, and other experimental methods all generate long lists of proteins and/or genes that have been identified or that varied in accumulation under the experimental conditions studied. Biologists can find these lists difficult to sort through and interpret. Here we describe a next step in data analysis: a bottom-up approach to data integration. Starting with protein sequence identifications, we map them to a common representation of the protein and then bring in a wide variety of structural, functional, genetic, and disease information related to proteins, derived from annotated knowledge bases. We then use this information to categorize the lists using Gene Ontology (GO) terms and mappings to biological pathway databases. We illustrate with examples how this can aid in identifying important processes from large, complex lists.
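The two core steps named in the abstract, mapping heterogeneous identifiers to a common protein representation and then grouping the resulting list by GO term, can be illustrated with a minimal sketch. This is not the chapter's implementation: the identifiers, accessions, GO annotations, and the function names `map_to_accessions` and `categorize_by_go` are all hypothetical placeholders, and a real analysis would draw the mappings from annotated knowledge bases rather than from in-memory dictionaries.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: these identifiers and annotations are placeholders,
# not taken from the chapter or from any real database.
ID_TO_ACCESSION = {
    "gene_a": "ACC0001",
    "probe_17": "ACC0001",  # two source IDs resolving to the same protein
    "gi_123": "ACC0002",
    "gene_c": "ACC0003",
}

ACCESSION_TO_GO = {
    "ACC0001": ["transport", "membrane"],
    "ACC0002": ["transport"],
    "ACC0003": ["metabolic process"],
}


def map_to_accessions(ids, id_map):
    """Map source identifiers to protein accessions, dropping duplicates.

    Returns the unique accessions in first-seen order, plus the list of
    identifiers that could not be mapped.
    """
    accessions, unmapped = [], []
    for source_id in ids:
        acc = id_map.get(source_id)
        if acc is None:
            unmapped.append(source_id)
        elif acc not in accessions:
            accessions.append(acc)
    return accessions, unmapped


def categorize_by_go(accessions, go_map):
    """Group accessions under each GO term they are annotated with."""
    groups = defaultdict(list)
    for acc in accessions:
        for term in go_map.get(acc, []):
            groups[term].append(acc)
    return dict(groups)


experiment_ids = ["gene_a", "probe_17", "gi_123", "gene_c", "unknown_9"]
accessions, unmapped = map_to_accessions(experiment_ids, ID_TO_ACCESSION)
go_groups = categorize_by_go(accessions, ACCESSION_TO_GO)
term_counts = Counter({term: len(accs) for term, accs in go_groups.items()})
```

Collapsing identifiers to one accession per protein before counting keeps a protein that was reported under several source identifiers from inflating a GO category, and returning the unmapped identifiers separately makes it visible how much of the list was lost at the mapping step.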
Copyright information
© 2011 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
McGarvey, P.B., Zhang, J., Natale, D.A., Wu, C.H., Huang, H. (2011). Protein-Centric Data Integration for Functional Analysis of Comparative Proteomics Data. In: Wu, C., Chen, C. (eds) Bioinformatics for Comparative Proteomics. Methods in Molecular Biology, vol 694. Humana Press. https://doi.org/10.1007/978-1-60761-977-2_20
DOI: https://doi.org/10.1007/978-1-60761-977-2_20
Publisher Name: Humana Press
Print ISBN: 978-1-60761-976-5
Online ISBN: 978-1-60761-977-2
eBook Packages: Springer Protocols