Soft Computing, Volume 15, Issue 7, pp 1351–1371

Model accuracy in the Bayesian optimization algorithm

  • Claudio F. Lima
  • Fernando G. Lobo
  • Martin Pelikan
  • David E. Goldberg
Original Paper

Abstract

Evolutionary algorithms (EAs) are particularly suited to solving problems for which little information is available. From this standpoint, estimation of distribution algorithms (EDAs), which guide the search by using probabilistic models of the population, have brought a new view to evolutionary computation. While solving a given problem with an EDA, the user has access to a set of models that reveal probabilistic dependencies between variables, an important source of information about the problem. However, as the complexity of the models used increases, so does the chance of overfitting, which in turn reduces model interpretability. This paper investigates the relationship between the probabilistic models learned by the Bayesian optimization algorithm (BOA) and the underlying problem structure. The purpose of the paper is threefold. First, model building in BOA is analyzed to understand how the problem structure is learned. Second, it is shown how the selection operator can lead to model overfitting in Bayesian EDAs. Third, the scoring metric that guides the search for an adequate model structure is modified to take into account the non-uniform distribution of the mating pool generated by tournament selection. Overall, this paper contributes to understanding and improving model accuracy in BOA, providing more interpretable models to assist efficiency-enhancement techniques and human researchers.
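To make the setting concrete, the sketch below illustrates, in Python, the two ingredients the abstract refers to: tournament selection, whose winners form a non-uniform, duplicate-laden mating pool, and a BIC-style scoring metric that structure search in a Bayesian EDA can use to evaluate candidate parent sets. This is a minimal sketch under stated assumptions, not the paper's method: the function names, the plain conditional-probability-table parameterization, and the toy onemax usage are illustrative choices, whereas BOA itself typically uses decision graphs for local structures and the paper's modified metric differs from plain BIC.

    import numpy as np
    from itertools import product

    def tournament_selection(pop, fitness, s=2, rng=None):
        """Pick len(pop) winners of size-s tournaments. The resulting mating
        pool usually contains duplicate copies of good individuals, i.e. a
        non-uniform distribution over the original population."""
        rng = np.random.default_rng() if rng is None else rng
        n = len(pop)
        idx = rng.integers(0, n, size=(n, s))             # s random entrants per tournament
        winners = idx[np.arange(n), np.argmax(fitness[idx], axis=1)]
        return pop[winners]

    def bic_contribution(pool, child, parents):
        """BIC-style score of variable `child` given a candidate parent set,
        estimated from a binary mating pool (rows = individuals). Greedy
        structure search keeps the parent set with the largest contribution."""
        N = pool.shape[0]
        log_lik = 0.0
        for cfg in product((0, 1), repeat=len(parents)):
            mask = np.all(pool[:, parents] == cfg, axis=1) if parents else np.ones(N, dtype=bool)
            m = mask.sum()
            for v in (0, 1):
                m_v = int(np.sum(pool[mask, child] == v))
                if m_v > 0:
                    log_lik += m_v * np.log2(m_v / m)     # conditional log-likelihood term
        penalty = 0.5 * np.log2(N) * (2 ** len(parents))  # one free parameter per parent configuration
        return log_lik - penalty

    # Toy usage: 4-bit onemax, tournament selection, then score a candidate edge.
    rng = np.random.default_rng(0)
    pop = rng.integers(0, 2, size=(200, 4))
    pool = tournament_selection(pop, pop.sum(axis=1), s=2, rng=rng)
    print(bic_contribution(pool, child=0, parents=[]),    # empty parent set
          bic_contribution(pool, child=0, parents=[1]))   # candidate edge 1 -> 0

Because the mating pool contains repeated individuals, frequency counts taken from it are biased relative to the selected distribution, which is the effect the paper's modified scoring metric is designed to account for.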

Keywords

Estimation of distribution algorithms · Bayesian optimization algorithm · Bayesian networks · Model overfitting · Selection


Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Claudio F. Lima (1)
  • Fernando G. Lobo (2)
  • Martin Pelikan (3)
  • David E. Goldberg (4)

  1. Centre for Plant Integrative Biology, University of Nottingham, Loughborough, UK
  2. Department of Electronics and Informatics Engineering, University of Algarve, Faro, Portugal
  3. Department of Mathematics and Computer Science, 320 CCB, University of Missouri at St. Louis, St. Louis, USA
  4. Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Urbana, USA
