The Bioverse API and Web Application

Guerquin, Michal; McDermott, Jason; Frazier, Zach; Samudrala, Ram

doi:10.1007/978-1-59745-243-4_22

Part of the book series: Methods in Molecular Biology (MIMB, volume 541)

Abstract

The Bioverse is a framework for creating, warehousing and presenting biological information based on hierarchical levels of organisation. The framework is guided by the goal of representing all relationships between all components of biological systems, working towards a holistic picture of organismal biology. Data from various sources are combined into a single repository, and a uniform interface is exposed to access it. The power of the Bioverse approach is that, owing to its inclusive nature, patterns emerge from the acquired data and new predictions are made. This chapter discusses the implementation of the repository (beginning with acquisition of source data, continuing with processing in a pipeline, and concluding with storage in a relational database) and the interfaces to the data it contains, from a programmatic application interface to a user-friendly web application.

Acknowledgements

We acknowledge the invaluable help in the form of comments, contributions, and critiques of the Bioverse from all members of the Samudrala group and the Department of Microbiology at the University of Washington.

Many researchers have helped in the creation of the Bioverse and Protinfo web servers. We thank the scientific community (more properly attributed in Section 3.2) for making available data and techniques we have used and relied on.

This work has been, and continues to be, supported in part by the University of Washington’s Advanced Technology Initiative in Infectious Diseases, Puget Sound Partners in Global Health, an NSF CAREER Grant, NSF Grant DBI-0217241, NIH Grant GM068152 and a Searle Scholar Award to Ram Samudrala.

Author information

Authors and Affiliations

Department of Microbiology, University of Washington, Seattle, WA, USA
Michal Guerquin, Zach Frazier & Ram Samudrala
Computational Biology and Bioinformatics, Pacific Northwest National Laboratory, Richland, WA, USA
Jason McDermott

Cite this protocol

Guerquin, M., McDermott, J., Frazier, Z., Samudrala, R. (2009). The Bioverse API and Web Application. In: Ireton, R., Montgomery, K., Bumgarner, R., Samudrala, R., McDermott, J. (eds) Computational Systems Biology. Methods in Molecular Biology, vol 541. Humana Press. https://doi.org/10.1007/978-1-59745-243-4_22

DOI: https://doi.org/10.1007/978-1-59745-243-4_22
Published: 10 March 2009
Publisher Name: Humana Press
Print ISBN: 978-1-58829-905-5
Online ISBN: 978-1-59745-243-4
eBook Packages: Springer Protocols
