Encyclopedia of Complexity and Systems Science

Living Edition
Editors: Robert A. Meyers

Biological Data Integration and Model Building

  • James A. Eddy
  • Nathan D. Price
Living reference work entry
DOI: https://doi.org/10.1007/978-3-642-27737-5_34-3

Definition of the Subject

Data integration and model building have become essential activities in biological research as technological advances enable the measurement of biological data of increasing diversity and scale. High-throughput technologies provide a wealth of global data sets (e.g., genomics, transcriptomics, proteomics, metabolomics), and the challenge becomes how to integrate these data to maximize the amount of useful biological information that can be extracted. Integrating biological data is both important and challenging because of the nature of biology. Biological systems have evolved over billions of years, and in that time biological mechanisms have become very diverse, with molecular machines of intricate detail. Thus, while there are certainly great general scientific principles to be distilled – such as the foundational evolutionary theory – much of biology is found in the details of these evolved systems. This emphasis on the details of...


Keywords: Bayesian Network; Interaction Network; Protein Interaction Network; Boolean Network; Flux Balance Analysis



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Bioengineering, University of Illinois, Urbana-Champaign, USA