DICE: exploiting all bivariate dependencies in binary and multary search spaces

Abstract

Although some of the earliest Estimation of Distribution Algorithms (EDAs) utilized bivariate marginal distribution models, all discrete bivariate EDAs have so far shared one serious limitation: they could exploit only a limited \(O(d)\) subset of the \(O(d^{2})\) possible bivariate dependencies. We present, for the first time, a family of discrete bivariate EDAs that can learn and exploit all \(O(d^{2})\) dependencies between variables, yet have the same run-time complexity as their more limited counterparts. This family of algorithms, which we label DICE (DIscrete Correlated Estimation of distribution algorithms), is based on sound statistical principles, and in particular on a modelling technique from statistical physics: dichotomised multivariate Gaussian distributions. Initially (Lane et al. in European Conference on the Applications of Evolutionary Computation, Springer, 2017), DICE was trialled on a suite of combinatorial optimization problems over binary search spaces. Our proposed dichotomised Gaussian (DG) model in DICE significantly outperformed existing discrete bivariate EDAs; crucially, the performance gap widened as the dimensionality of the problems increased. In this comprehensive treatment, we generalise DICE by extending it to multary search spaces that also allow for categorical variables. Because correlation is not wholly meaningful for categorical variables, interactions between such variables cannot be fully modelled by correlation-based approaches such as the original formulation of DICE. We therefore extend our original DG model to deal with such situations. We test DICE on a challenging suite of combinatorial optimization problems, most of which are defined on multary search spaces. While each of the two versions of DICE outperforms the other on some problem instances, both outperform all the state-of-the-art bivariate EDAs on almost all of the problem instances. This further illustrates that the DICE methods constitute a significant step change in the domain of discrete bivariate EDAs.
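To make the dichotomised Gaussian idea concrete, the following is a minimal sketch of how binary vectors with pairwise dependencies can be drawn by thresholding a latent multivariate Gaussian. It is an illustration under stated assumptions, not the authors' implementation: the function names `cholesky` and `sample_dg` are hypothetical, the latent correlation matrix is assumed to be given and positive definite, and the step of fitting latent correlations to observed binary statistics is omitted entirely.

```python
import math
import random
from statistics import NormalDist


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def sample_dg(p, latent_corr, n_samples, rng):
    """Sample binary vectors by dichotomising a latent Gaussian (hypothetical helper).

    p           : target marginals P(x_i = 1), each strictly between 0 and 1
    latent_corr : correlation matrix of the latent Gaussian (assumed given)
    """
    d = len(p)
    # Threshold t_i is chosen so that P(z_i > t_i) = p_i for z_i ~ N(0, 1).
    thresholds = [NormalDist().inv_cdf(1.0 - pi) for pi in p]
    L = cholesky(latent_corr)
    samples = []
    for _ in range(n_samples):
        e = [rng.gauss(0.0, 1.0) for _ in range(d)]            # independent normals
        z = [sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(d)]  # correlated
        samples.append([1 if z[i] > thresholds[i] else 0 for i in range(d)])
    return samples


rng = random.Random(0)
p = [0.5, 0.5, 0.2]
latent_corr = [[1.0, 0.8, 0.0],
               [0.8, 1.0, 0.0],
               [0.0, 0.0, 1.0]]
samples = sample_dg(p, latent_corr, 20000, rng)

n = len(samples)
marginals = [sum(row[i] for row in samples) / n for i in range(len(p))]
p11 = sum(1 for row in samples if row[0] == 1 and row[1] == 1) / n
print("empirical marginals:", marginals)
print("empirical P(x0=1, x1=1):", p11)
```

Because every pair of latent variables has its own correlation entry, a model of this kind carries one parameter per pair of variables, which is the sense in which all \(O(d^{2})\) bivariate dependencies can be represented, in contrast to a chain or tree model with only \(d-1\) edges.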

Notes

  1. Strictly speaking, CMA-ES does not quite fall into the canonical EDA framework. However, it shares almost all of the core features of a typical EDA.

  2. Downloadable from: http://www.math.nus.edu.sg/~matsundf/.

References

  1. Baluja S, Caruana R (1995) Removing the genetics from the standard genetic algorithm. In: 12th International conference on machine learning, pp 38–46

  2. Baluja S, Davies S (1997) Using optimal dependency-trees for combinatorial optimization. In: 14th International conference on machine learning, pp 30–38

  3. Boros E, Hammer P, Tavares G (2007) Local search heuristics for quadratic unconstrained binary optimization (QUBO). J Heuristics 13(2):99–132

  4. Caprara A et al (2014) Generation of antipodal random vectors with prescribed non-stationary 2-nd order statistics. IEEE Trans Signal Process 62(6):1603–1612

  5. Chow C, Liu C (1968) Approximating discrete probability distributions with dependence trees. IEEE Trans Inf Theory 14(3):462–467

  6. Crawford J, Auton L (1996) Experimental results on the crossover point in random 3-SAT. Artif Intell 81(1):31–57

  7. De Bonet JS, Isbell CL Jr, Viola PA (1997) MIMIC: finding optima by estimating probability densities. In: Mozer MC, Jordan MI, Petsche T (eds) Advances in neural information processing systems 9. MIT Press, pp 424–430

  8. Emrich L, Piedmonte M (1991) A method for generating high-dimensional multivariate binary variates. Am Stat 45(4):302–304

  9. Etxeberria R, Larrañaga P (1999) Global optimization using Bayesian networks. In: Second symposium on artificial intelligence (CIMAF-99), Habana, Cuba, pp 332–339

  10. Gallo G et al (1980) Quadratic knapsack problems. In: Combinatorial optimization, Springer, pp 132–149

  11. Gange S (1995) Generating multivariate categorical variates using the iterative proportional fitting algorithm. Am Stat 49(2):134–138

  12. Glover F, Hao JK, Kochenberger G (2011) Polynomial unconstrained binary optimisation-part 2. Int J Metaheuristics 1(4):317–354

  13. Hansen N, Kern S (2004) Evaluating the CMA evolution strategy on multimodal test functions. In: PPSN VIII, pp 282–291

  14. Harik G, Lobo F et al (1999) The compact genetic algorithm. IEEE Trans Evolut Comput 3(4):287–297

  15. Higham N (2002) Computing the nearest correlation matrix: a problem from finance. IMA J Numer Anal 22(3):329–343

  16. Hyrš M, Schwarz J (2014) Multivariate Gaussian copula in estimation of distribution algorithm with model migration. In: IEEE Foundations of computational intelligence (FOCI), pp 114–119

  17. Jin R, Wang S et al (2015) Generating spatial correlated binary data through a copulas method. Sci Res 3(4):206–212

  18. Krząkała F (2005) How many colors to color a random graph? Cavity, complexity, stability and all that. Progress Theor Phys Suppl 157:357–360

  19. Lane F, Azad R, Ryan C (2017) DICE: a new family of bivariate estimation of distribution algorithms based on dichotomised multivariate Gaussian distributions. In: European conference on the applications of evolutionary computation, Springer, pp 670–685

  20. Larrañaga P, Etxeberria R et al (2000) Combinatorial optimization by learning and simulation of Bayesian networks. In: 16th conference on uncertainty in artificial intelligence, pp 343–352

  21. Lee A (1993) Generating random binary deviates having fixed marginal distributions and specified degrees of association. Am Stat 47(3):209–215

  22. Li B, Wang X, Zhong R, Zhuang Z (2006) Continuous optimization based-on boosting Gaussian mixture model. In: IEEE 18th international conference on pattern recognition, vol 1, pp 1192–1195

  23. Li R, Emmerich M et al (2006) Mixed-integer NK landscapes. In: PPSN IX, pp 42–51

  24. Macke J, Berens P et al (2009) Generating spike trains with specified correlation coefficients. Neural Comput 21(2):397–423

  25. Mühlenbein H (1997) The equation for response to selection and its use for prediction. Evolut Comput 5(3):303–346

  26. Ohlsson E (1998) Sequential Poisson sampling. J Off Stat 14(2):149

  27. Pelikan M, Goldberg D, Cantú-Paz E (1999) BOA: the Bayesian optimization algorithm. GECCO 1999:525–532

  28. Pelikan M, Mühlenbein H (1999) The bivariate marginal distribution algorithm. In: Advances in soft computing, pp 521–535

  29. Qi H, Sun D (2006) A quadratically convergent Newton method for computing the nearest correlation matrix. SIAM J Matrix Anal Appl 28(2):360–385

  30. Rosén B (1997) On sampling with probability proportional to size. J Stat Plan Inference 62(2):159–191

  31. Zhang Q, Sun J, Tsang E, Ford J (2002) Estimation of distribution algorithm based on mixture: preliminary experimental results. In: The 2002 UK workshop on computational intelligence (UKCI’02). University of Birmingham, UK

Acknowledgements

This work was supported by Science Foundation Ireland grant 13/RC/2094.

Author information

Corresponding author

Correspondence to Fergal Lane.

About this article

Cite this article

Lane, F., Azad, R.M.A. & Ryan, C. DICE: exploiting all bivariate dependencies in binary and multary search spaces. Memetic Comp. 10, 245–255 (2018). https://doi.org/10.1007/s12293-017-0246-1

Keywords

  • Dichotomised Gaussian models
  • Bivariate estimation of distribution algorithms
  • Combinatorial optimization