Multi-parameters Model Selection for Network Inference

Tozzo, Veronica; Barla, Annalisa

doi:10.1007/978-3-030-36687-2_47

Part of the book series: Studies in Computational Intelligence (SCI, volume 881)

Included in the following conference series:

International Conference on Complex Networks and Their Applications

Abstract

Network inference is the reverse-engineering problem of inferring graphs from data. With the ever-increasing availability of data, methods based on probabilistic assumptions that infer multiple intertwined networks have been proposed in the literature. These methods, while extremely flexible, have the major drawback of a high number of hyper-parameters that need to be tuned. In unsupervised settings, hyper-parameters can be tuned through criteria based on likelihood or stability. Likelihood-based scores generalise easily to the setting with multiple hyper-parameters, but their computation is feasible only under certain probabilistic assumptions. Stability-based methods, in contrast, are generally applicable and, for a single hyper-parameter, have been shown to outperform likelihood-based scores. In this work we present a multi-parameter extension of stability-based methods that can be easily applied to complex models. We extensively compared this extension with likelihood-based scores on synthetic Gaussian data. Experiments show that our extension provides better estimates for models of increasing complexity, offering a valuable alternative to existing likelihood-based model selection methods.
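
As a rough illustration of what a stability-based score measures, the sketch below computes the average edge instability used by the single-parameter StARS criterion of Liu, Roeder and Wasserman (2010): for each pair of nodes, the fraction of subsamples in which the edge is inferred is turned into a disagreement score, and the scores are averaged over all pairs. This is not the multi-parameter extension presented in the chapter, whose details the abstract does not give. The function name `edge_instability`, the grid keyed by two hypothetical hyper-parameters `(alpha, beta)`, and the hand-written toy graphs are all assumptions made for the example; in practice each stack of graphs would come from refitting the model on subsamples of the data.

```python
import numpy as np


def edge_instability(adjacencies):
    """Average StARS-style edge instability across subsample graphs.

    adjacencies: array-like of shape (n_subsamples, p, p) holding
    symmetric 0/1 adjacency matrices inferred on different subsamples.
    """
    A = np.asarray(adjacencies, dtype=float)
    theta = A.mean(axis=0)            # how often each edge was selected
    xi = 2.0 * theta * (1.0 - theta)  # 0 if subsamples agree, 0.5 at a coin flip
    upper = np.triu_indices(A.shape[1], k=1)  # count each node pair once
    return float(xi[upper].mean())


# Toy graphs on three nodes (hypothetical, written by hand).
g_with = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]     # edge between nodes 0 and 1
g_without = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]  # empty graph

# Four subsamples per grid point; (alpha, beta) values are placeholders.
stable = np.array([g_with, g_with, g_with, g_with])
unstable = np.array([g_with, g_without, g_with, g_without])

scores = {
    (0.5, 0.1): edge_instability(stable),
    (0.1, 0.1): edge_instability(unstable),
}
for params, score in scores.items():
    print(params, round(score, 4))
```

Minimising instability alone would trivially favour empty or complete graphs, which is why StARS instead selects the least regularised model whose instability stays below a chosen threshold. How such a rule should be carried over to a grid of several hyper-parameters is the question the chapter addresses.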

Notes

1. https://github.com/veronicatozzo/regain/

Author information

Authors and Affiliations

Università degli Studi di Genova, 16146, Genoa, GE, Italy
Veronica Tozzo & Annalisa Barla

Corresponding author

Correspondence to Veronica Tozzo.

Editor information

Editors and Affiliations

University of Burgundy, Dijon Cedex, France
Hocine Cherifi
Università degli Studi di Milano, Milan, Italy
Sabrina Gaito
University of Aveiro, Aveiro, Portugal
José Fernando Mendes
Universidad Carlos III de Madrid, Leganés, Madrid, Spain
Esteban Moro
Indiana University, Bloomington, IN, USA
Luis Mateus Rocha

About this paper

Cite this paper

Tozzo, V., Barla, A. (2020). Multi-parameters Model Selection for Network Inference. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_47

DOI: https://doi.org/10.1007/978-3-030-36687-2_47
Published: 26 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36686-5
Online ISBN: 978-3-030-36687-2
eBook Packages: Intelligent Technologies and Robotics (R0)