Abstract
Network inference is the reverse-engineering problem of inferring graphs from data. With the ever-increasing availability of data, methods based on probabilistic assumptions that infer multiple intertwined networks have been proposed in the literature. These methods, while extremely flexible, have the major drawback of a large number of hyper-parameters that need to be tuned. In unsupervised settings, hyper-parameters can be tuned through criteria based on likelihood or on stability. Likelihood-based scores generalise easily to the setting with multiple hyper-parameters, but their computation is feasible only under certain probabilistic assumptions. Stability-based methods, in contrast, are generally applicable and, for a single hyper-parameter, have been shown to outperform likelihood-based scores. In this work we present a multi-parameter extension of stability-based methods that can be easily applied to complex models. We extensively compare this extension with likelihood-based scores on synthetic Gaussian data. Experiments show that our extension provides better estimates for models of increasing complexity, making it a valuable alternative to existing likelihood-based model selection methods.
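To make the idea of a stability-based criterion concrete, the following is a minimal sketch, not the method proposed in this chapter. It computes an edge-instability score in the spirit of StARS (Liu, Roeder and Wasserman, 2010) on a grid of two hyper-parameters. The estimator `estimate_support`, its parameters `alpha` (ridge shrinkage) and `tau` (threshold), and the helper names are all hypothetical stand-ins chosen for illustration; any network inference model with several hyper-parameters could take their place.

```python
import numpy as np


def estimate_support(X, alpha, tau):
    """Hypothetical two-parameter estimator (an assumption, not from the chapter).

    Inverts a ridge-regularised sample covariance and keeps the off-diagonal
    entries whose absolute value exceeds tau as edges.
    """
    p = X.shape[1]
    cov = np.cov(X, rowvar=False)
    precision = np.linalg.inv(cov + alpha * np.eye(p))
    support = np.abs(precision) > tau
    np.fill_diagonal(support, False)
    return support


def edge_instability(X, alpha, tau, n_subsamples=20, frac=0.8, seed=0):
    """Average edge instability over random subsamples drawn without replacement.

    For each pair of nodes, theta is the fraction of subsamples in which the
    edge is selected; its instability is 2 * theta * (1 - theta), which is 0
    when the edge is always or never selected and 0.5 when it appears in
    half of the subsamples.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = int(frac * n)
    freq = np.zeros((p, p))
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=m, replace=False)
        freq += estimate_support(X[idx], alpha, tau)
    freq /= n_subsamples
    xi = 2.0 * freq * (1.0 - freq)
    upper = np.triu_indices(p, k=1)  # count each undirected edge once
    return xi[upper].mean()


def instability_grid(X, alphas, taus, **kwargs):
    """Instability for every (alpha, tau) pair; rows follow alphas, columns taus."""
    return np.array(
        [[edge_instability(X, a, t, **kwargs) for t in taus] for a in alphas]
    )
```

Calling `instability_grid(X, alphas=[0.1, 1.0], taus=[0.05, 0.1, 0.5])` on a samples-by-variables array `X` returns a 2 by 3 array of scores between 0 and 0.5. How a single configuration is then picked from such a grid is exactly the question a multi-parameter criterion has to answer, and the sketch deliberately leaves that step open.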
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Tozzo, V., Barla, A. (2020). Multi-parameters Model Selection for Network Inference. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_47
DOI: https://doi.org/10.1007/978-3-030-36687-2_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36686-5
Online ISBN: 978-3-030-36687-2
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)