# A Markovianity based optimisation algorithm

## Abstract

Several Estimation of Distribution Algorithms (EDAs) based on Markov networks have recently been proposed. The key idea behind these EDAs was to factorise the joint probability distribution of the solution variables in terms of the cliques of the undirected graph. As such, they made use of the global Markov property of the Markov network in one form or another. This paper presents a Markov network based EDA that instead relies on the local Markov property, the Markovianity, and does not directly model the joint distribution. We call it the Markovianity based Optimisation Algorithm. The algorithm combines a novel method for extracting the neighbourhood structure from the mutual information between the variables with a Gibbs sampler that generates new points. We present an extensive empirical validation of the algorithm on problems with complex interactions, comparing its performance with that of other EDAs that use higher-order interactions. We extend the analysis to other functions with discrete representation, where EDA results are scarce, comparing the algorithm with state-of-the-art EDAs that use marginal product factorisations.
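The two ingredients named in the abstract, a neighbourhood structure derived from mutual information and a Gibbs sampler driven by local conditionals, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the extraction rule or the sampler settings, so the function names, the `threshold` and `max_neighbours` parameters, the Laplace smoothing constant `alpha`, and the restriction to binary variables are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(pop, i, j):
    """Empirical mutual information (in nats) between variables i and j."""
    n = len(pop)
    ci = Counter(x[i] for x in pop)
    cj = Counter(x[j] for x in pop)
    cij = Counter((x[i], x[j]) for x in pop)
    mi = 0.0
    for (a, b), c in cij.items():
        # p(a, b) * log(p(a, b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (ci[a] * cj[b]))
    return mi


def neighbourhoods(pop, threshold, max_neighbours):
    """Assumed rule: per variable, keep the highest-MI partners above a threshold."""
    n_vars = len(pop[0])
    result = []
    for i in range(n_vars):
        scored = [(mutual_information(pop, i, j), j)
                  for j in range(n_vars) if j != i]
        scored.sort(reverse=True)
        result.append([j for mi, j in scored[:max_neighbours] if mi > threshold])
    return result


def conditional_p1(pop, x, i, neigh, alpha=1.0):
    """Laplace-smoothed estimate of P(x_i = 1 | neighbours of i as set in x)."""
    matches = [y for y in pop if all(y[j] == x[j] for j in neigh)]
    ones = sum(y[i] for y in matches)
    return (ones + alpha) / (len(matches) + 2 * alpha)


def gibbs_sample(pop, neighs, sweeps, rng):
    """Generate one binary solution by Gibbs sweeps over the local conditionals."""
    n_vars = len(pop[0])
    x = [rng.randint(0, 1) for _ in range(n_vars)]
    for _ in range(sweeps):
        # Visit the variables in a fresh random order on every sweep.
        for i in rng.sample(range(n_vars), n_vars):
            p1 = conditional_p1(pop, x, i, neighs[i])
            x[i] = 1 if rng.random() < p1 else 0
    return x


# Toy "selected population": bits 0 and 1 always agree, bit 2 is independent.
rng = random.Random(0)
pop = []
for _ in range(200):
    a = rng.randint(0, 1)
    pop.append([a, a, rng.randint(0, 1)])

neighs = neighbourhoods(pop, threshold=0.1, max_neighbours=2)
sample = gibbs_sample(pop, neighs, sweeps=5, rng=rng)
```

In this toy population only bits 0 and 1 are strongly dependent, so the thresholded mutual information links them to each other and leaves bit 2 without neighbours. Each variable is then resampled from a conditional that depends only on its neighbours, which is the sense in which the local Markov property is used in place of a model of the full joint distribution.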

## Keywords

Estimation of distribution algorithms, Markov networks, Competent genetic algorithms

## Notes

### Acknowledgments

This work has been partially supported by the TIN2010-20900-C04-04, Consolider Ingenio 2010 – CSD2007-00018 projects (Spanish Ministry of Science and Innovation) and the Cajal Blue Brain project. Jose A. Lozano has been partially supported by the Saiotek, Etortek and Research Groups 2007–2012 (IT-242-07) programs (Basque Government), TIN2010-14931.
