Abstract
Background
Simultaneous measurement of expression levels for thousands of genes provides rich information about many aspects of biological mechanisms. A major computational challenge is to find methods that extract new biological insights from this wealth of data. Complex biological processes are often regulated differently under various conditions, and the associated gene interactions change dynamically with biological context. Inferring such dynamic relationships between genes while accounting for biological conditions is therefore very challenging.
Method
In this study, we propose a comprehensive and integrated approach to infer the dynamic relationships between genes and evaluate this approach on three distinct gene networks.
Results
This study demonstrates the advantage of integrating Markov chain Monte Carlo (MCMC) simulation into a Bayesian mixture model to overcome the high-dimension, low sample size (HDLSS) problem as well as to identify context-specific biological modules. Such biological modules were identified through the summarization of sampled network structures obtained from MCMC simulation.
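As a rough illustration of what summarizing sampled network structures can look like, the sketch below counts how often each edge appears across a set of sampled networks and groups the frequently sampled edges into connected components. This is an assumption made for illustration only: the abstract does not specify the summarization procedure, and the function names, the 0.5 threshold, and the toy samples are all hypothetical.

```python
from collections import Counter


def edge_frequencies(samples):
    """Fraction of sampled networks that contain each undirected edge."""
    counts = Counter()
    for edges in samples:
        # Each sample is a collection of (gene_a, gene_b) pairs.
        # frozenset ignores direction; the set removes duplicates within a sample.
        counts.update({frozenset(e) for e in edges})
    n = len(samples)
    return {edge: c / n for edge, c in counts.items()}


def modules(freqs, threshold=0.5):
    """Connected components of the graph built from edges with frequency >= threshold."""
    adj = {}
    for edge, f in freqs.items():
        if f < threshold or len(edge) != 2:  # skip rare edges and self-loops
            continue
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, result = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        result.append(sorted(comp))
    return result


# Hypothetical edge sets standing in for networks sampled by MCMC.
samples = [
    [("g1", "g2"), ("g2", "g3"), ("g4", "g5")],
    [("g1", "g2"), ("g4", "g5")],
    [("g2", "g1"), ("g2", "g3"), ("g3", "g4")],
    [("g1", "g2"), ("g2", "g3"), ("g4", "g5")],
]
freqs = edge_frequencies(samples)
found = modules(freqs, threshold=0.5)
```

Averaging over many sampled structures, rather than relying on a single best-scoring network, is one common way to obtain stable edge estimates when the number of samples is small relative to the number of genes.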
Conclusion
This novel approach gives a comprehensive understanding of dynamically regulated biological modules.
Acknowledgements
This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (Grant 2017R1D1A1B03032457), and by the Hankuk University of Foreign Studies Research Fund (to Y.K.), and by the Ministry of Science and ICT of Korea (Grant 2014M3C9A3063544) and the Ministry of Education of Korea (Grant 2016R1D1A1B03930209) (to J.K.).
Ethics declarations
Conflict of interest
Younhee Ko, Jaebum Kim, and Sandra L. Rodriguez-Zas declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human subjects or animals performed by any of the authors.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ko, Y., Kim, J. & Rodriguez-Zas, S.L. Markov chain Monte Carlo simulation of a Bayesian mixture model for gene network inference. Genes Genom 41, 547–555 (2019). https://doi.org/10.1007/s13258-019-00789-8
DOI: https://doi.org/10.1007/s13258-019-00789-8