Gene Regulatory Networks from Single Cell Data for Exploring Cell Fate Decisions

  • Thalia E. Chan
  • Michael P. H. Stumpf
  • Ann C. Babtie (corresponding author)
Part of the Methods in Molecular Biology book series (MIMB, volume 1975)


Single cell experimental techniques now allow us to quantify gene expression in up to thousands of individual cells. These data reveal the changes in transcriptional state that occur as cells progress through development and adopt specialized cell fates. In this chapter we describe in detail how to use our network inference algorithm (PIDC)—and the associated software package NetworkInference.jl—to infer functional interactions between genes from the observed gene expression patterns. We exploit the large sample sizes and inherent variability of single cell data to detect statistical dependencies between genes that indicate putative (co-)regulatory relationships, using multivariate information measures that can capture complex statistical relationships. We provide guidelines on how best to combine this analysis with other complementary methods designed to explore single cell data, and how to interpret the resulting gene regulatory network models to gain insight into the processes regulating cell differentiation.

Key words

Information theory; Gene regulation; Network inference; Cell development; Single cell



Acknowledgments

This work was supported by a Biotechnology and Biological Sciences Research Council (BBSRC) DTP Studentship to T.E.C., and a BBSRC Future Leaders Fellowship (grant reference BB/N011597/1) to A.C.B. We thank Joe Greener, Gal Horesh, and Ananth Pallaseni for sharing code with us, Suhail Islam for computing support, and Ben MacArthur, Patrick Stumpf, and members of the theoretical systems biology group for useful discussions.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Thalia E. Chan (1)
  • Michael P. H. Stumpf (1)
  • Ann C. Babtie (1, corresponding author)
  1. Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London, UK
