# Cycle-Based Cluster Variational Method for Direct and Inverse Inference


## Abstract

Large-scale inference problems of practical interest can often be addressed with the help of Markov random fields (MRFs). This requires, in principle, solving two related problems: the first is to determine offline the parameters of the MRF from empirical data (the inverse problem); the second (the direct problem) is to set up the inference algorithm so that it is as precise, robust and efficient as possible. In this work we address both the direct and the inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and the associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be handled in a systematic way on pairwise Markov random fields by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified so as to avoid feedback loops as much as possible, by selecting a minimal cycle basis. Following this line we are led to propose a two-level algorithm, in which a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem to solve per region. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular, in the random Ising context we propose two complementary methods based respectively on fixed-point equations and on a one-parameter log-likelihood function minimization. Numerical experiments confirm the effectiveness of this approach for both the direct and the inverse MRF inference problems. Heterogeneous problems of size up to \(10^5\) are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
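The following is a minimal sketch, not the authors' code, of the region-selection step described above: extracting a minimal cycle basis of a pairwise MRF graph so that each basis cycle can serve as a region in a generalized belief propagation scheme. It assumes `networkx` and NumPy; the grid topology, the random couplings, and the helper `cycle_edges` are illustrative choices, not taken from the paper.

```python
# Sketch: define GBP regions from a minimal cycle basis of a pairwise MRF.
# Assumptions: a small 2D grid topology with random Ising couplings J_uv.
import itertools
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)

# Pairwise MRF topology: a 4x4 grid with heterogeneous couplings on the edges.
G = nx.grid_2d_graph(4, 4)
for u, v in G.edges():
    G[u][v]["J"] = rng.normal()

# Minimal cycle basis: each basis cycle becomes one region of the region graph.
# networkx returns each cycle as a list of nodes (not necessarily in cycle order).
cycles = nx.minimum_cycle_basis(G)

def cycle_edges(cycle_nodes):
    """Edges of G induced by the cycle's nodes; for the chordless faces of a
    grid these are exactly the cycle's own edges."""
    nodes = set(cycle_nodes)
    return {frozenset((u, v)) for u, v in G.edges() if u in nodes and v in nodes}

regions = [cycle_edges(c) for c in cycles]

# Inter-region structure: two cycle regions interact through the edges they
# share; these intersections carry the messages of the outer (inter-region) level.
shared = {
    (a, b): regions[a] & regions[b]
    for a, b in itertools.combinations(range(len(regions)), 2)
    if regions[a] & regions[b]
}

print(f"{len(cycles)} basis cycles, {len(shared)} overlapping region pairs")
```

In the two-level scheme described in the abstract, an inner belief propagation pass would run inside each such cycle region while an outer pass exchanges messages across the shared edges; the sketch above only fixes the region geometry, which is the part that the minimal cycle basis determines.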
