Inferring Transcription Networks from Data

Abstract

Reverse engineering of transcription networks is a challenging bioinformatics problem. Ordinary differential equation (ODE) network models are rooted in the physicochemical basis of these networks, but they are difficult to build by conventional means. Modeling therefore needs to be automated, and knowledge discovery in data using computational intelligence methods is one solution. The authors have developed a methodology, based on genetic programming (GP), for automatically inferring ODE system models from omics data, and illustrate it on a real transcription network. The methodology allows the network to be separated from the complex of interacting cellular networks, and each of its nodes to be separated in turn, without destroying their interactions. The structure of the network is not imposed but discovered from data, and further assumptions can be made about the parameters' values and the mechanisms involved. The algorithms can deal with unmeasured regulatory variables, such as transcription factors (TFs) and microRNAs (miRNA or miR). This is made possible by introducing the concept of regulome probabilities and techniques to compute them, which are based on the statistical thermodynamics of regulatory molecular interactions. Thus, the resulting models are mechanistic and theoretically founded, not merely data fits. To our knowledge, this is the first reverse engineering approach capable of dealing with missing variables, and the accuracy of all the models developed is greater than 99%.
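As a rough illustration of what a mechanistic, thermodynamically motivated ODE model of a single transcript might look like, the sketch below couples a binding probability to an mRNA rate equation and scores a trajectory with the RMSE. It is not the authors' methodology: the single-site occupancy form, the rate equation dm/dt = k_syn * P - k_deg * m, and all function names and parameter values are assumptions chosen for illustration only, and no genetic programming step is shown.

```python
import math


def occupancy_probability(tf_conc, k_d):
    """Probability that one binding site is occupied by a regulator.

    Assumed single-site equilibrium form, P = [TF] / (K_d + [TF]);
    the chapter's regulome probabilities may be defined differently.
    """
    return tf_conc / (k_d + tf_conc)


def simulate_mrna(tf_conc, k_d, k_syn, k_deg, m0=0.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of dm/dt = k_syn * P - k_deg * m.

    tf_conc is held constant here (hypothetical simplification).
    Returns the list of mRNA levels at each time step, starting with m0.
    """
    p_bound = occupancy_probability(tf_conc, k_d)
    steps = int(round(t_end / dt))
    m = m0
    trajectory = [m]
    for _ in range(steps):
        m += dt * (k_syn * p_bound - k_deg * m)
        trajectory.append(m)
    return trajectory


def rmse(predicted, observed):
    """Root mean squared error between two equally long sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the chapter.
    traj = simulate_mrna(tf_conc=2.0, k_d=1.0, k_syn=3.0, k_deg=0.5)
    print("final mRNA level:", traj[-1])
```

With a constant regulator level the trajectory relaxes toward k_syn * P / k_deg, which gives a simple sanity check on the integration before comparing any model against measured time series.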

Keywords

Root Mean Square Error, Genetic Programming, Reverse Engineering, Biochemical Network, Symbolic Regression
These keywords were added by machine and not by the authors.

Abbreviations

ACTB

actin cytoplasmic

COPASI

complex pathway simulator

EMT

epithelial-to-mesenchymal transition

GEO

gene expression omnibus

GP

genetic programming

NCBI

National Center for Biotechnology Information

ODE

ordinary differential equation

RMS

root-mean-square

RMSE

root mean squared error

RODES

reversing ordinary differential equation system

SBW

Systems Biology Workbench

SSE

sum of squares due to error

TF

transcription factor

TGF

transforming growth factor

mRNA

messenger RNA

miRNA

microRNA

Copyright information

© Springer-Verlag 2014

Authors and Affiliations

  1. SAIA, OncoPredict Cancer Institute Cluj-Napoca, Cluj-Napoca, Romania
  2. SAIA Institute, Cluj-Napoca, Romania
