Machine Learning, Volume 92, Issue 2–3, pp 251–283

Recovering networks from distance data

  • Sandhya Prabhakaran
  • David Adametz
  • Karin J. Metzner
  • Alexander Böhm
  • Volker Roth

Abstract

A fully probabilistic approach to reconstructing Gaussian graphical models from distance data is presented. The main idea is to extend the central Wishart model used in traditional methods to a likelihood that depends only on pairwise distances, and is therefore independent of geometric assumptions about the underlying Euclidean space. This extension has two advantages: the model becomes invariant against potential bias terms in the measurements, and it can be applied whenever the input is a kernel or distance matrix, without requiring direct access to the underlying vectors. The latter aspect opens up a huge new application field for Gaussian graphical models, as networks can now be reconstructed from any Mercer kernel, be it on graphs, strings, probabilities or more complex objects. We combine this likelihood with a suitable prior to enable Bayesian network inference, present an efficient MCMC sampler for this model, and discuss the estimation of module networks. Experiments demonstrate the high quality and usefulness of the inferred networks.
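The key to working from distances alone is the classical identity relating a squared-distance matrix to a centered inner-product (Gram) matrix via double centering: any likelihood that depends on the data only through pairwise distances can equivalently be written in terms of this matrix. A minimal NumPy sketch of the conversion (illustrative only; `distances_to_kernel` is a hypothetical helper, not the authors' code):

```python
import numpy as np

def distances_to_kernel(D2):
    """Turn a matrix of squared pairwise Euclidean distances into a
    centered inner-product (kernel) matrix via double centering:
    K = -0.5 * J @ D2 @ J with J = I - (1/n) * ones.
    K is the Gram matrix of the mean-centered underlying vectors."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * J @ D2 @ J

# Toy check: squared distances computed from known vectors recover
# exactly the Gram matrix of those vectors after mean-centering.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = distances_to_kernel(D2)
Xc = X - X.mean(axis=0)
assert np.allclose(K, Xc @ Xc.T)
```

Because the recovered Gram matrix corresponds to mean-centered vectors, any constant shift of the data leaves it unchanged, which is one way to see the invariance against bias terms mentioned above.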

Keywords

Network inference · Gaussian graphical models · Pairwise Euclidean distances · MCMC

Supplementary material

10994_2013_5370_MOESM1_ESM.pdf (PDF 1.9 MB)


Copyright information

© The Author(s) 2013

Authors and Affiliations

  • Sandhya Prabhakaran (1)
  • David Adametz (1)
  • Karin J. Metzner (2)
  • Alexander Böhm
  • Volker Roth (1)

  1. Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
  2. Department of Medicine, Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, Zürich, Switzerland
