Exploring Interestingness in a Computational Evolution System for the Genome-Wide Genetic Analysis of Alzheimer’s Disease

  • Jason H. Moore
  • Douglas P. Hill
  • Andrew Saykin
  • Li Shen
Part of the Genetic and Evolutionary Computation book series (GEVO)


Susceptibility to Alzheimer’s disease is likely due to complex interactions among many genetic and environmental factors. Identifying complex genetic effects in large data sets will require computational methods that extend beyond what parametric statistical methods such as logistic regression can provide. We have previously introduced a computational evolution system (CES) that uses genetic programming (GP) to represent genetic models of disease and to search for optimal models in a rugged fitness landscape that is effectively infinite in size. The CES approach differs from other GP approaches in that it learns how to solve the problem by generating its own operators. A key feature is the ability of the operators to use expert knowledge to guide the stochastic search. We have previously shown that CES is able to discover nonlinear genetic models of disease susceptibility in both simulated and real data. The goal of the present study was to introduce a measure of interestingness into the modeling process. Here, we define interestingness as a measure of non-additive gene-gene interactions. That is, we are more interested in those CES models that include attributes with synergistic effects on disease risk. To implement this new feature, we first pre-processed the data to measure all pairwise gene-gene interaction effects using entropy-based methods. We then provided these pre-computed measures to CES both as expert knowledge and as one of three fitness criteria in three-dimensional Pareto optimization. We applied this new CES algorithm to an Alzheimer’s disease data set with approximately 520,000 genetic attributes. We show that this approach discovers more interesting models, with the added benefit of improved classification accuracy. This study demonstrates the applicability of CES to genome-wide genetic analysis using expert knowledge derived from measures of interestingness.
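The two ideas named above, an entropy-based pairwise interaction score and Pareto selection over three criteria, can be illustrated with a minimal Python sketch. This is not the CES implementation. It assumes the interaction score is the information gain of the attribute pair beyond the two single-attribute gains, I(A,B;C) - I(A;C) - I(B;C), and it assumes all three objectives are to be maximized. The abstract names only interestingness and mentions classification accuracy, so the third objective used in the usage note (negated model size) is a placeholder. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_gain(a, b, c):
    """Assumed interestingness score: I(A,B;C) - I(A;C) - I(B;C).

    Positive values suggest that attributes a and b carry information
    about class c jointly that neither carries alone (synergy).
    """
    joint = list(zip(a, b))
    return (mutual_information(joint, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


def dominates(p, q):
    """True if p is no worse than q on every objective and better on one.

    Assumes every objective is maximized.
    """
    return (all(pi >= qi for pi, qi in zip(p, q))
            and any(pi > qi for pi, qi in zip(p, q)))


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

As a usage example, for a pure XOR pattern with `a = [0, 0, 1, 1]`, `b = [0, 1, 0, 1]` and `c = [0, 1, 1, 0]`, neither attribute alone is informative about `c`, and `interaction_gain(a, b, c)` is 1 bit. For candidate models scored as hypothetical tuples of (accuracy, interestingness, negated size), `pareto_front([(0.7, 0.2, -3), (0.6, 0.1, -4), (0.65, 0.3, -5)])` keeps the first and third tuples, because the second is worse than the first on all three objectives.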


Keywords: Computational evolution; Genetic epidemiology; Epistasis; Gene-gene interactions



This work was supported by NIH grants LM011360, LM009012, LM010098 and AI59694. We would like to thank the participants of present and past Genetic Programming Theory and Practice Workshops (GPTP) for their stimulating feedback and discussion that helped formulate some of the ideas in this paper.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Jason H. Moore (1)
  • Douglas P. Hill (1)
  • Andrew Saykin (1)
  • Li Shen (1)

  1. The Geisel School of Medicine at Dartmouth, Lebanon, USA
