Journal of Statistical Physics, Volume 164, Issue 3, pp 531–574

Cycle-Based Cluster Variational Method for Direct and Inverse Inference

  • Cyril Furtlehner
  • Aurélien Decelle

Abstract

Large-scale inference problems of practical interest can often be addressed with the help of Markov random fields (MRFs). In principle this requires solving two related problems: the first (the inverse problem) is to find the parameters of the MRF offline from empirical data; the second (the direct problem) is to set up the inference algorithm so as to make it as precise, robust and efficient as possible. In this work we address both the direct and the inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and the associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feedback loops as much as possible by selecting a minimal cycle basis. Following this line, we are led to propose a two-level algorithm, in which a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem to be solved per region. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular, in the random Ising context we propose two complementary methods, based respectively on fixed point equations and on the minimization of a one-parameter log likelihood function. Numerical experiments confirm the effectiveness of this approach for both direct and inverse MRF inference. Heterogeneous problems of size up to \(10^5\) are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
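To make the remark about the loop geometry concrete, the following is a minimal sketch of why a single cycle is cheap to handle. It is not the authors' algorithm. It assumes an Ising ring with zero external field (a simplification introduced here, not stated in the abstract), for which the nearest-neighbour correlations have a closed form in terms of \(t_i = \tanh J_i\). The function names `ring_correlations` and `brute_force_correlations` are hypothetical and chosen only for this illustration; the second one enumerates all spin configurations to check the first.

```python
import itertools
import math


def ring_correlations(couplings):
    """Closed-form <s_i s_{i+1}> on a zero-field Ising ring (assumed setting).

    couplings[i] is the coupling J_i on the edge between spin i and spin i+1
    (indices taken modulo the ring length, which should be at least 3).
    """
    t = [math.tanh(j) for j in couplings]
    loop = math.prod(t)  # product of tanh(J) around the whole cycle
    result = []
    for i in range(len(t)):
        # Product over every edge of the cycle except edge i.
        rest = math.prod(t[:i] + t[i + 1:])
        result.append((t[i] + rest) / (1.0 + loop))
    return result


def brute_force_correlations(couplings):
    """Same quantity by summing over all 2^n spin configurations."""
    n = len(couplings)
    z = 0.0
    corr = [0.0] * n
    for s in itertools.product((-1, 1), repeat=n):
        energy = sum(couplings[i] * s[i] * s[(i + 1) % n] for i in range(n))
        w = math.exp(energy)  # Boltzmann weight with inverse temperature 1
        z += w
        for i in range(n):
            corr[i] += w * s[i] * s[(i + 1) % n]
    return [c / z for c in corr]


if __name__ == "__main__":
    J = [0.3, -0.7, 1.1, 0.5]  # arbitrary illustrative couplings
    print(ring_correlations(J))
    print(brute_force_correlations(J))
```

The closed form costs a number of operations polynomial in the cycle length, whereas the enumeration grows exponentially, which is the sense in which a per-cycle subproblem stays tractable in this simplified setting.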

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Inria Saclay - LRI, Tao Project Team, Université Paris Sud, Orsay Cedex, France
  2. LRI, AO Team, Université Paris Sud, Orsay Cedex, France
