Skip to main content
Log in

Markov chain Monte Carlo simulation of a Bayesian mixture model for gene network inference

  • Research Article
  • Published:
Genes & Genomics Aims and scope Submit manuscript

Abstract

Background

Simultaneous measurement of gene expression level for thousands of genes contains the rich information about many different aspects of biological mechanisms. A major computational challenge is to find methods to extract new biological insights from this wealth of data. Complex biological processes are often regulated under the various conditions or circumstances and associated gene interactions are dynamically changed depending on different biological contexts. Thus, inference of such dynamic relationships between genes with consideration of biological conditions is very challenging.

Method

In this study, we propose a comprehensive and integrated approach to infer the dynamic relationships between genes and evaluate this approach on three distinct gene networks.

Results

This study demonstrates the advantage of integrating Markov chain Monte Carlo (MCMC) simulation into a Bayesian mixture model to overcome the high-dimension, low sample size (HDLSS) problem as well as to identify context-specific biological modules. Such biological modules were identified through the summarization of sampled network structures obtained from MCMC simulation.

Conclusion

This novel approach gives a comprehensive understanding of the dynamically regulated biological modules.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  • Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th international conference on very large data bases (VLDB), pp 487–499

  • Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD international conference on management of data (SIGMOD), pp 207–216

  • Bar-Joseph Z, Gerber GK, Lee TI, Rinaldi NJ, Yoo JY, Robert F, Gordon DB, Fraenkel E, Jaakkola TS, Young RA et al (2003) Computational discovery of gene modules and regulatory networks. Nat Biotechnol 21:1337–1342

    Article  CAS  PubMed  Google Scholar 

  • Burdick D, Calimlim M, Flannick J, Gehrke J, Yiu T (2005) MAFIA: a maximal frequent itemset algorithm. IEEE Trans Knowl Data Eng 17:1490–1504

    Article  Google Scholar 

  • Butte AJ, Kohane IS (2000) Mutual information relevance networks: functional genomic clustering using pairwise entropy measurements. In: Proceedings of pacific symposium on biocomputing (PSB), vol 5, pp 415–426

  • Chan TE, Stumpf MPH, Babtie AC (2017) Gene regulatory network inference from single-cell data using multivariate information measures. Cell Syst 5:251–267 e253

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chen G, Jensen ST, Stoeckert CJ Jr (2007) Clustering of genes into regulons using integrated modeling-COGRIM. Genome Biol 8:R4

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chen G, Cairelli MJ, Kilicoglu H, Shin D, Rindflesch TC (2014) Augmenting microarray data with literature-based knowledge to enhance gene regulatory network inference. PLoS Comput Biol 10:e1003666

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95:14863–14868

    Article  CAS  PubMed  Google Scholar 

  • Friedman N, Linial M, Nachman I, Pe’er D (2000) Using Bayesian networks to analyze expression data. J Comput Biol 7:601–620

    Article  CAS  PubMed  Google Scholar 

  • Grzegorczyk M, Husmeier D (2008) Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move. Mach Learn 71:265–305

    Article  Google Scholar 

  • Guo S, Jiang Q, Chen L, Guo D (2016) Gene regulatory network inference using PLS-based methods. BMC Bioinform 17:545

    Article  Google Scholar 

  • Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109

    Article  Google Scholar 

  • Husmeier D, Werhli AV (2007) Bayesian integration of biological prior knowledge into the reconstruction of gene regulatory networks with Bayesian networks. Comput Syst Bioinform Conf 6:85–95

    Google Scholar 

  • Husmeier D, Dybowski R, Roberts S (2005) Probabilistic modeling in bioinformatics and medical informatics. Springer, New York

    Book  Google Scholar 

  • Huynh-Thu VA, Irrthum A, Wehenkel L, Geurts P (2010) Inferring regulatory networks from expression data using tree-based methods. PLoS One 5:e12776

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Imoto S, Tamada Y, Araki H, Yasuda K, Print CG, Charnock-Jones SD, Sanders D, Savoie CJ, Tashiro K, Kuhara S et al (2006) Computational strategy for discovering druggable gene networks from genome-wide RNA expression profiles. Pac Symp Biocomput:559–571

  • Ishida T, Schatz GC (1998) Monte Carlo sampling methods for determining potential energy surfaces using Shepard interpolation. The O(D-1) + H-2 system. Chem Phys Lett 298:285–292

    Article  CAS  Google Scholar 

  • Ko Y, Zhai C, Rodriguez-Zas S (2007) Inference of gene pathways using Gaussian mixture models. In: Proceedings of 2007 IEEE international conference on bioinformatics and biomedicine (BIBM), pp 362–367

  • Ko Y, Zhai C, Rodriguez-Zas S (2009) Inference of gene pathways using mixture Bayesian networks. BMC Syst Biol 3:54

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Ko Y, Zhai C, Rodriguez-Zas SL (2010) Discovery of gene network variability across samples representing multiple classes. Int J Bioinform Res Appl 6:402–417

    Article  PubMed  PubMed Central  Google Scholar 

  • Kuffner R, Petri T, Tavakkolkhah P, Windhager L, Zimmer R (2012) Inferring gene regulatory networks by ANOVA. Bioinformatics 28:1376–1382

    Article  CAS  PubMed  Google Scholar 

  • Lemmens K, De Bie T, Dhollander T, De Keersmaecker SC, Thijs IM, Schoofs G, De Weerdt A, De Moor B, Vanderleyden J, Collado-Vides J et al (2009) DISTILLER: a data integration framework to reveal condition dependency of complex regulons in Escherichia coli. Genome Biol 10:R27

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Liu F, Zhang SW, Guo WF, Wei ZG, Chen L (2016) Inference of gene regulatory network based on local Bayesian networks. PLoS Comput Biol 12:e1005024

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Madigan D, York J (1995) Bayesian graphical models for discrete-data. Int Stat Rev 63:215–232

    Article  Google Scholar 

  • Marbach D, Roy S, Ay F, Meyer PE, Candeias R, Kahveci T, Bristow CA, Kellis M (2012) Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks. Genome Res 22:1334–1349

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092

    Article  CAS  Google Scholar 

  • Mordelet F, Vert JP (2008) SIRENE: supervised inference of regulatory networks. Bioinformatics 24:i76–i82

    Article  PubMed  Google Scholar 

  • Nariai N, Kim S, Imoto S, Miyano S (2004) Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks. Pac Symp Biocomput:336–347

  • Nir Friedman DK (2003) Being Bayesian about network structure. A Bayesian approach to structure discovery in Bayesian networks. Mach Learn 50:95–125

    Article  Google Scholar 

  • Paolo Giudici RC (2003) Improving Markov chain Monte Carlo model search for data mining. Mach Learn 50:127–158

    Article  Google Scholar 

  • Qiu J, Noble WS (2008) Predicting co-complexed protein pairs from heterogeneous data. PLoS Comput Biol 4:e1000054

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Reiss DJ, Baliga NS, Bonneau R (2006) Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks. BMC Bioinform 7:280

    Article  CAS  Google Scholar 

  • Riffle M, Malmstrom L, Davis TN (2005) The yeast resource center public data repository. Nucleic Acids Res 33:D378–D382

    Article  CAS  PubMed  Google Scholar 

  • Tamada Y, Kim S, Bannai H, Imoto S, Tashiro K, Kuhara S, Miyano S (2003) Estimating gene networks from gene expression data by combining Bayesian network model with promoter element detection. Bioinformatics 19(Suppl 2):ii227-236

    Article  Google Scholar 

  • Werhli AV, Husmeier D (2007) Reconstructing gene regulatory networks with bayesian networks by combining expression data with multiple sources of prior knowledge. Stat Appl Genet Mol Biol 6:Article15

    Article  PubMed  Google Scholar 

  • Werhli AV, Husmeier D (2008) Gene regulatory network reconstruction by Bayesian integration of prior knowledge and/or different experimental conditions. J Bioinform Comput Biol 6:543–572

    Article  CAS  PubMed  Google Scholar 

  • Yeung KY, Medvedovic M, Bumgarner RE (2003) Clustering gene-expression data with repeated measurements. Genome Biol 4:R34

    Article  PubMed  PubMed Central  Google Scholar 

  • Zitnik M, Zupan B (2015) Gene network inference by fusing data from diverse distributions. Bioinformatics 31:i230–i239

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

This study was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education Grant 2017R1D1A1B03032457 and Hankuk University of Foreign Studies Research Fund (to Y.K.), and the Ministry of Science and ICT of Korea Grant 2014M3C9A3063544 and the Ministry of Education of Korea Grant 2016R1D1A1B03930209 (to J.K.).

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Jaebum Kim or Sandra L. Rodriguez-Zas.

Ethics declarations

Conflict of interest

Younhee Ko, Jaebum Kim, and Sandra L. Rodriguez-Zas declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human subjects or animals performed by any of the authors.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ko, Y., Kim, J. & Rodriguez-Zas, S.L. Markov chain Monte Carlo simulation of a Bayesian mixture model for gene network inference. Genes Genom 41, 547–555 (2019). https://doi.org/10.1007/s13258-019-00789-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13258-019-00789-8

Keywords

Navigation