Vine Estimation of Distribution Algorithms with Application to Molecular Docking

  • Marta Soto
  • Alberto Ochoa
  • Yasser González-Fernández
  • Yanely Milanés
  • Adriel Álvarez
  • Diana Carrera
  • Ernesto Moreno
Part of the Adaptation, Learning, and Optimization book series (ALO, volume 14)

Abstract

Four undirected graphical models based on copula theory are investigated for use within an estimation of distribution algorithm (EDA) applied to the molecular docking problem. The two simplest algorithms considered are built on the product and normal copulas. The other two construct high-dimensional dependence models using the powerful and flexible concept of vine copulas. An empirical investigation on a set of molecular complexes used as test systems shows state-of-the-art performance of the copula-based EDAs on the docking problem. The results also show that the vine-based algorithms are more efficient, robust, and flexible than the other two. This might suggest that the use of vines opens new research opportunities for more appropriate modeling of search distributions in evolutionary optimization.
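To make the idea of a copula-based EDA concrete, the following is a minimal sketch of the simplest dependence model mentioned above, a normal copula, reduced to two variables. It is not the authors' implementation and does not use any docking software: the `sphere` objective, the choice of normal margins, the population sizes, and every function name are assumptions made for illustration. With normal margins and a normal copula the model coincides with a bivariate normal distribution; the point of the copula form is that the margins could be replaced without touching the dependence step.

```python
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()  # standard normal used by the normal copula


def sphere(x):
    """Toy objective (assumption): minimum 0 at the origin."""
    return sum(v * v for v in x)


def clamp(p, eps=1e-12):
    """Keep probabilities strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(p, eps), 1.0 - eps)


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den if den > 0 else 0.0


def learn(selected):
    """Fit one normal margin per variable and the copula correlation."""
    cols = list(zip(*selected))
    margins = [NormalDist(mean(c), max(stdev(c), 1e-12)) for c in cols]
    # Map each variable to uniform scores with its margin, then to normal scores.
    z = [[STD_NORMAL.inv_cdf(clamp(m.cdf(v))) for v in c]
         for m, c in zip(margins, cols)]
    rho = max(-0.999, min(0.999, pearson(z[0], z[1])))
    return margins, rho


def sample(margins, rho, n, rng):
    """Draw from the normal copula, then invert the margins."""
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u = (STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2))
        out.append([m.inv_cdf(clamp(ui)) for m, ui in zip(margins, u)])
    return out


def copula_eda(fitness, pop_size=60, n_select=20, generations=30, seed=0):
    """Truncation selection, model learning, and sampling, with one elite kept."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        margins, rho = learn(pop[:n_select])
        pop = pop[:1] + sample(margins, rho, pop_size - 1, rng)
    return min(pop, key=fitness)
```

Calling `copula_eda(sphere)` returns the best two-dimensional point found. A product-copula variant would correspond to fixing `rho = 0`, so that the variables are sampled independently; a vine-based variant would replace the single correlation with a structured set of bivariate copulas, which this sketch does not attempt.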

Keywords

Particle Swarm Optimization; Molecular Docking; Travel Salesman Problem; Copula Model; Distribution Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer Berlin Heidelberg 2012

Authors and Affiliations

  • Marta Soto (1)
  • Alberto Ochoa (1)
  • Yasser González-Fernández (1)
  • Yanely Milanés (2)
  • Adriel Álvarez (2)
  • Diana Carrera (2)
  • Ernesto Moreno (3)

  1. Institute of Cybernetics, Mathematics, and Physics, Havana, Cuba
  2. University of Havana, Havana, Cuba
  3. Center of Molecular Immunology, Havana, Cuba
