Abstract
Inference of gene interaction networks from expression data usually focuses on either supervised or unsupervised edge prediction from a single data source. However, in many real-world applications, multiple data sources, such as microarray and in situ hybridization (ISH) measurements of mRNA abundance, are available and offer multi-view information about the same set of genes. We propose NP-MuScL (nonparanormal multi-source learning) to estimate a gene interaction network that is consistent with such multiple data sources, which are expected to reflect the same underlying relationships between the genes. NP-MuScL casts network estimation as the problem of estimating the structure of a sparse undirected graphical model. We use the semiparametric Gaussian copula to model the distribution of each data source, with the different copulas sharing the same precision (i.e., inverse covariance) matrix, and we present an efficient algorithm to estimate such a model in the high-dimensional scenario. Results are reported on synthetic data, where NP-MuScL significantly outperforms baseline algorithms, even in the presence of noisy data sources. Experiments are also run on two real-world scenarios: two yeast microarray data sets, and three Drosophila embryonic gene expression data sets, where NP-MuScL predicts a higher number of known gene interactions than existing techniques.
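To make the shared-precision idea concrete, the following is a minimal Python sketch and not the NP-MuScL algorithm itself, which the abstract does not spell out. It assumes each data source is a samples-by-genes matrix over the same genes, maps every column to Gaussian scores through its ranks (a simplified nonparanormal-style transform), and pools the per-source correlation matrices weighted by sample size. That weighting follows from summing Gaussian log-likelihoods that share one precision matrix. The function names, the ridge value, and the synthetic data are hypothetical, and the ridge-regularized inverse is only a stand-in for a sparse (L1-penalized) precision estimator.

```python
from statistics import NormalDist

import numpy as np


def normal_scores(X):
    """Map each column of X (samples x genes) to Gaussian scores via its ranks."""
    n = X.shape[0]
    # Ordinal ranks 1..n per column; ties are broken by order of appearance.
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    # Dividing by n + 1 keeps the quantile levels strictly inside (0, 1).
    return inv_cdf(ranks / (n + 1.0))


def pooled_correlation(sources):
    """Sample-size-weighted average of the per-source correlation matrices."""
    total = sum(X.shape[0] for X in sources)
    p = sources[0].shape[1]
    S = np.zeros((p, p))
    for X in sources:
        Z = normal_scores(X)
        S += X.shape[0] * np.corrcoef(Z, rowvar=False)
    return S / total


def partial_correlations(S, ridge=0.1):
    """Stand-in for a sparse estimator: ridge-regularized inverse, rescaled."""
    theta = np.linalg.inv(S + ridge * np.eye(S.shape[0]))
    d = np.sqrt(np.diag(theta))
    pc = -theta / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc


# Hypothetical synthetic example: gene 0 and gene 1 interact, gene 2 is independent.
rng = np.random.default_rng(0)


def latent(n):
    z0 = rng.normal(size=n)
    z1 = 0.8 * z0 + 0.6 * rng.normal(size=n)
    z2 = rng.normal(size=n)
    return np.column_stack([z0, z1, z2])


# Two "sources" observe the same latent structure through different monotone maps.
sources = [np.exp(latent(200)), latent(150) ** 3]
S = pooled_correlation(sources)
pc = partial_correlations(S)
```

Because the rank transform is invariant to monotone changes of scale, the two sources contribute comparable correlation estimates even though their raw marginals differ. In a high-dimensional setting the last step would be replaced by a sparse precision estimator such as the graphical lasso.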
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Puniyani, K., Xing, E.P. (2013). NP-MuScL: Unsupervised Global Prediction of Interaction Networks from Multiple Data Sources. In: Deng, M., Jiang, R., Sun, F., Zhang, X. (eds) Research in Computational Molecular Biology. RECOMB 2013. Lecture Notes in Computer Science, vol 7821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37195-0_15
DOI: https://doi.org/10.1007/978-3-642-37195-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37194-3
Online ISBN: 978-3-642-37195-0
eBook Packages: Computer Science, Computer Science (R0)