Statistics and Computing, Volume 22, Issue 6, pp 1257–1271

Reverse engineering gene regulatory networks using approximate Bayesian computation

  • Andrea Rau
  • Florence Jaffrézic
  • Jean-Louis Foulley
  • R. W. Doerge


Gene regulatory networks are collections of genes that interact with one another and with other substances in the cell. By measuring gene expression over time using high-throughput technologies, it may be possible to reverse engineer, or infer, the structure of the gene network involved in a particular cellular process. These gene expression data typically have high dimensionality and a limited number of biological replicates and time points. Because of these issues and the complexity of biological systems, the problem of reverse engineering networks from gene expression data demands a specialized suite of statistical tools and methodologies. We propose a non-standard adaptation of a simulation-based approach known as Approximate Bayesian Computing based on Markov chain Monte Carlo sampling. This approach is particularly well suited for the inference of gene regulatory networks from longitudinal data. The performance of this approach is investigated via simulations and using longitudinal expression data from a genetic repair system in Escherichia coli.
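To make the general idea of likelihood-free MCMC concrete, the following is a minimal sketch of a generic ABC-MCMC sampler on a toy problem. It is not the adaptation proposed in the article. The function name `abc_mcmc`, the Normal(theta, 1) data model, the Uniform(-10, 10) prior, the sample-mean summary statistic, and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_mcmc(observed, n_iter=5000, eps=0.2, step=0.5, seed=0):
    """Toy ABC-MCMC for the mean theta of Normal(theta, 1) data.

    Illustrative assumptions: Uniform(-10, 10) prior on theta, the sample
    mean as summary statistic, and a symmetric Gaussian random-walk proposal.
    """
    rng = random.Random(seed)
    n = len(observed)
    s_obs = statistics.fmean(observed)
    theta = s_obs  # start the chain at a value compatible with the data
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        # Flat prior and symmetric proposal: the Metropolis-Hastings ratio
        # is 1 inside the prior support and 0 outside it.
        if -10.0 <= proposal <= 10.0:
            # Simulate data under the proposal instead of evaluating a likelihood.
            simulated = [rng.gauss(proposal, 1.0) for _ in range(n)]
            # Accept only if the simulated summary is close to the observed one.
            if abs(statistics.fmean(simulated) - s_obs) <= eps:
                theta = proposal
        chain.append(theta)
    return chain
```

Calling `abc_mcmc(data)` on a list of numeric observations returns one sampled value of theta per iteration. A smaller `eps` brings the approximation closer to the exact posterior at the cost of a lower acceptance rate.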


Approximate Bayesian computation · Gene regulatory networks · Longitudinal gene expression · Markov chain Monte Carlo





Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Andrea Rau (1, 2)
  • Florence Jaffrézic (2)
  • Jean-Louis Foulley (2)
  • R. W. Doerge (1, 3)

  1. Department of Statistics, Purdue University, West Lafayette, USA
  2. UMR 1313 GABI, INRA, Jouy-en-Josas, France
  3. Department of Agronomy, Purdue University, West Lafayette, USA
