Meta-Dimensional Analysis of Phenotypes Using the Analysis Tool for Heritable and Environmental Network Associations (ATHENA): Challenges with Building Large Networks

  • Marylyn D. Ritchie
  • Emily R. Holzinger
  • Scott M. Dudek
  • Alex T. Frase
  • Prabhakar Chalise
  • Brooke Fridley
Part of the Genetic and Evolutionary Computation book series (GEVO)


The search for the underlying heritability of complex traits has led to an explosion of data generation and analysis in the field of human genomics. These technological advances have enabled some progress in identifying genes and proteins associated with common, complex human diseases. Still, our understanding of the genetic architecture of complex traits remains limited, and additional research is needed to illuminate the genetic and environmental factors important for the disease process. Much of that research will involve examining variation in DNA, RNA, protein, and other data types in a meta-dimensional analysis framework. We have developed a machine learning technique, ATHENA (Analysis Tool for Heritable and Environmental Network Associations), to address the problem of integrating data from multiple “-omics” technologies to identify models that explain or predict the genetic architecture of complex traits. In this chapter, we discuss the challenges of handling meta-dimensional data using grammatical evolution neural networks (GENN), one modeling component of ATHENA, and we characterize the models identified in simulation studies to explore the ability of GENN to build complex, meta-dimensional models. Challenges remain in further understanding the evolutionary process for GENN and in explaining the simplicity of the models it produces. This work highlights potential areas for extension and improvement of the GENN approach within ATHENA.

Key words

Grammatical evolution; Neural networks; Data mining; Human genetics; Systems biology; Meta-dimensional data



ERH was supported by NIH/NIGMS training grant T32 GM080178. MDR was supported by NIH grants LM010040 and P-STAR. P-STAR (PGRN Statistical Analysis Resource) is supported by funding from NIGMS and is part of the PGRN (Pharmacogenomics Research Network). P-STAR is a component of HL065962.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Marylyn D. Ritchie (1, corresponding author)
  • Emily R. Holzinger (1)
  • Scott M. Dudek (1)
  • Alex T. Frase (1)
  • Prabhakar Chalise (2)
  • Brooke Fridley (2)

  1. Center for Systems Genomics, Pennsylvania State University, University Park, USA
  2. Biostatistics Department, University of Kansas, Kansas City, USA
