
Investigating Problem Hardness of Real Life Applications

  • Leonardo Vanneschi
Part of the Genetic and Evolutionary Computation Series book series (GEVO)

This chapter represents a first attempt to characterize the fitness landscapes of real-life Genetic Programming applications by means of a predictive algebraic difficulty indicator. The indicator used is the Negative Slope Coefficient, whose efficacy has recently been demonstrated empirically on a large set of hand-tailored theoretical test functions and well-known GP benchmarks. The real-life problems studied belong to the field of biomedical applications and consist of automatically assessing a mathematical relationship between a set of molecular descriptors from a given dataset of drugs and some important pharmacokinetic parameters. The parameters considered here are Human Oral Bioavailability, Median Oral Lethal Dose, and Plasma Protein Binding levels. The availability of good prediction tools for pharmacokinetic parameters like these is critical for optimizing the efficiency of therapies, maximizing the medical success rate, and minimizing toxic effects. The experimental results presented in this chapter show that the Negative Slope Coefficient seems to be a reasonable tool to characterize the difficulty of these problems, and that it can be used to choose the most effective Genetic Programming configuration (fitness function, representation, parameter values) from a given set.
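The abstract names the Negative Slope Coefficient without restating how it is computed. The sketch below illustrates the general idea as it is usually described in the literature: sample a fitness cloud (pairs of an individual's fitness and a neighbor's fitness), bin the cloud along the abscissa, join the bin averages with segments, and sum only the negative slopes. The function name, the `num_bins` parameter, and the equal-width binning are assumptions made for this illustration; the published measure uses a different partitioning scheme, so this is not the chapter's implementation.

```python
from statistics import mean


def negative_slope_coefficient(parent_fitness, offspring_fitness, num_bins=10):
    """Illustrative NSC over a fitness cloud (hypothetical helper).

    parent_fitness[i] is the fitness of a sampled individual and
    offspring_fitness[i] the fitness of one of its neighbors.
    Equal-width binning is a simplifying assumption of this sketch.
    """
    if len(parent_fitness) != len(offspring_fitness):
        raise ValueError("both fitness sequences must have the same length")
    if not parent_fitness:
        raise ValueError("the fitness cloud must contain at least one point")

    lo, hi = min(parent_fitness), max(parent_fitness)
    if hi == lo:
        return 0.0  # a single abscissa value gives no segments to measure

    # Assign each cloud point to an equal-width bin along the abscissa.
    width = (hi - lo) / num_bins
    bins = [[] for _ in range(num_bins)]
    for x, y in zip(parent_fitness, offspring_fitness):
        index = min(int((x - lo) / width), num_bins - 1)
        bins[index].append((x, y))

    # Average abscissa and ordinate of every non-empty bin.
    centroids = [
        (mean(x for x, _ in b), mean(y for _, y in b)) for b in bins if b
    ]

    # Sum the slopes of the segments joining consecutive averages,
    # counting only the negative ones.
    nsc = 0.0
    for (x0, y0), (x1, y1) in zip(centroids, centroids[1:]):
        nsc += min(0.0, (y1 - y0) / (x1 - x0))
    return nsc
```

Under the usual reading of the measure, a value of zero suggests an easy problem and increasingly negative values suggest increasing difficulty. Comparing the values obtained for several candidate configurations is the kind of use the abstract describes.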

Keywords

problem difficulty; fitness landscapes; real-life applications; fitness clouds; negative slope coefficient



Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Leonardo Vanneschi (1)
  1. Dipartimento di Informatica, Sistemistica e Comunicazione (D.I.S.Co.), University of Milano-Bicocca, Milan, Italy
