Environmental Sensing of Expert Knowledge in a Computational Evolution System for Complex Problem Solving in Human Genetics

  • Casey S. Greene
  • Douglas P. Hill
  • Jason H. Moore
Part of the Genetic and Evolutionary Computation book series (GEVO)


The relationship between interindividual variation in our genomes and variation in our susceptibility to common diseases is expected to be complex, involving multiple interacting genetic factors. A central goal of human genetics is to identify which DNA sequence variations predict disease risk in human populations. Our success in this endeavour will depend critically on the development and implementation of computational intelligence methods that embrace, rather than ignore, the complexity of the genotype-to-phenotype relationship. To this end, we have developed a computational evolution system (CES) to discover genetic models of disease susceptibility involving complex relationships between DNA sequence variations. The CES approach is hierarchically organized and is capable of evolving operators of arbitrary complexity. The ability to evolve operators distinguishes this approach from artificial evolution approaches that use fixed operators such as mutation and recombination. Our previous studies have shown that a CES that can utilize expert knowledge about the problem in evolved operators significantly outperforms a CES unable to use this knowledge. This environmental sensing of external sources of biological or statistical knowledge is important when the search space is both rugged and large, as in the genetic analysis of complex diseases. We show here that the CES is also capable of evolving operators that exploit one of several sources of expert knowledge to solve the problem. This is important both for the discovery of highly fit genetic models and because the particular source of expert knowledge used by evolved operators may provide additional information about the problem itself. This study brings us a step closer to a CES that can solve complex problems in human genetics in addition to discovering genetic models of disease.


Keywords: Genetic Epidemiology; Symbolic Discriminant Analysis; Epistasis





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Casey S. Greene (1)
  • Douglas P. Hill (1)
  • Jason H. Moore (1)
  1. Dartmouth College, Lebanon, USA
