Identification of Novel Genetic Models of Glaucoma Using the “EMERGENT” Genetic Programming-Based Artificial Intelligence System

  • Jason H. Moore
  • Casey S. Greene
  • Douglas P. Hill
Part of the Genetic and Evolutionary Computation book series (GEVO)


The genetic basis of primary open-angle glaucoma (POAG) is not yet understood, but it is likely the result of many interacting genetic variants that influence risk in the context of our local ecology. The complexity of the genotype-to-phenotype mapping for common diseases like POAG necessitates analytical approaches that move beyond parametric statistical methods, such as logistic regression, that assume a particular mathematical model. This is particularly important in the era of big data, where it is routine to collect and analyze data sets with hundreds of thousands of measured genetic variants in thousands of human subjects. We introduce here the Exploratory Modeling for Extracting Relationships using Genetic and Evolutionary Navigation Techniques (EMERGENT) algorithm as an artificial intelligence approach to the genetic analysis of common human diseases. EMERGENT builds models of genetic variation from lists of mathematical functions using a form of genetic programming called computational evolution. A key feature of the system is its ability to use pre-processed expert knowledge, which lets it explore model space much as a human would. We describe this system in detail and then apply it to the genetic analysis of POAG in the Glaucoma Gene Environment Initiative (GLAUGEN) study, which included 1,272 subjects with the disease and 1,057 healthy controls. A total of 657,366 single-nucleotide polymorphisms (SNPs) from across the human genome were measured in these subjects and available for analysis. Analysis using the EMERGENT framework revealed a best model consisting of six SNPs that map to at least six different genes. Two of these genes have previously been associated with POAG in several studies. The others represent new hypotheses about the genetic basis of POAG. All of the SNPs are involved in non-additive gene-gene interactions. Further, the six genes are all directly or indirectly related through biological interactions to the vascular endothelial growth factor (VEGF) gene, which is an actively investigated drug target for POAG. This study demonstrates the routine application of an artificial intelligence-based system for the genetic analysis of complex human diseases.
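As a rough illustration of what a model built "from lists of mathematical functions" can look like, the sketch below scores subjects with a candidate model and measures its classification accuracy. It is not the EMERGENT implementation: the function set, the model encoding as (function, SNP index, SNP index) steps, the threshold, and the toy data are all assumptions made for this example, and no evolutionary search is performed.

```python
import operator

# Hypothetical function set; the abstract only says models are built from
# "lists of mathematical functions" and does not enumerate them.
FUNCTIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "xor": lambda a, b: (a % 2) ^ (b % 2),  # parity of minor-allele counts
}

def model_score(model, genotypes):
    """Sum the outputs of each (function, snp_i, snp_j) step for one subject."""
    return sum(FUNCTIONS[name](genotypes[i], genotypes[j]) for name, i, j in model)

def accuracy(model, data, labels, threshold=0.5):
    """Fraction of subjects whose thresholded score matches the case/control label."""
    predictions = [int(model_score(model, g) > threshold) for g in data]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Toy genotypes (0, 1, or 2 copies of the minor allele) at two SNPs.
# The labels follow a purely non-additive, XOR-like pattern (assumed data).
data = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (1, 2)]
labels = [0, 1, 1, 0, 1, 1]

additive_model = [("add", 0, 1)]      # candidate with an additive step
interaction_model = [("xor", 0, 1)]   # candidate with an interaction step

print(accuracy(additive_model, data, labels))
print(accuracy(interaction_model, data, labels))
```

On this toy data the interaction model classifies every subject correctly while the additive model does not, which is the kind of non-additive gene-gene pattern the abstract refers to. A genetic programming system would search over many such candidate models rather than evaluate two fixed ones.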


Exploratory modeling for extracting relationships using genetic and evolutionary navigation techniques; Artificial intelligence; Genetic programming; Glaucoma



The work on glaucoma was supported by NIH grant EY022300. Aspects of the algorithm development were also supported by NIH grants LM009012, LM010098, and AI59694. Computational infrastructure was partially supported by NIH grants P20 GM103506 and P20 GM103534. We would like to thank the participants of present and past Genetic Programming Theory and Practice Workshops (GPTP) for their stimulating feedback and discussion that helped formulate some of the ideas in this paper.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jason H. Moore (1)
  • Casey S. Greene (1)
  • Douglas P. Hill (1)
  1. The Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
