Evaluation of Parameter Contribution to Neural Network Size and Fitness in ATHENA for Genetic Analysis

  • Ruowang Li
  • Emily R. Holzinger
  • Scott M. Dudek
  • Marylyn D. Ritchie
Part of the Genetic and Evolutionary Computation book series (GEVO)


The vast amount of available genomics data gives us an unprecedented ability to survey the entire genome and search for the genetic determinants of complex diseases. Until now, genome-wide association studies (GWAS) have been the predominant method for associating DNA variations with disease traits. GWAS have successfully uncovered many genetic variants associated with complex diseases when the effect loci are strongly associated with the trait. However, methods for studying interaction effects among multiple loci are still lacking. Established machine learning methods such as grammatical evolution neural networks (GENN) can be adapted to help uncover the interaction effects that GWAS do not capture. We used an implementation of GENN distributed in the software package ATHENA (Analysis Tool for Heritable and Environmental Network Associations) to investigate the effects of multiple GENN parameters and data noise levels on model detection and network structure. We concluded that the models produced by GENN were greatly affected by algorithm parameters and data noise levels. We also obtained complex, multi-layer networks that were not produced in the previous study. In summary, GENN can produce complex, multi-layer networks when the data require them for higher fitness and when the parameter settings allow a wide search of the complex model space.


Keywords: Grammatical evolution; Neural networks; Data mining; Human genetics; Systems biology; XOR model



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Ruowang Li (1)
  • Emily R. Holzinger (1)
  • Scott M. Dudek (1)
  • Marylyn D. Ritchie (1)
  1. Center for Systems Genomics, Pennsylvania State University, University Park, USA
