Grammatical Evolution Decision Trees for Detecting Gene-Gene Interactions

  • Conference paper
Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBIO 2010)

Abstract

A fundamental goal of human genetics is the discovery of polymorphisms that predict common, complex diseases. It is hypothesized that complex diseases are due to a myriad of factors, including environmental exposures and complex genetic risk models such as gene-gene interactions. Such interactive models present an important analytical challenge, requiring that methods perform both variable selection and statistical modeling to generate testable genetic model hypotheses. Decision trees are a highly successful, easily interpretable data-mining method, but they are typically optimized with a hierarchical model-building approach, which limits their potential to identify interactive effects. To overcome this limitation, we use evolutionary computation, specifically grammatical evolution, to build decision trees that detect and model gene-gene interactions. In this study, we introduce the Grammatical Evolution Decision Trees (GEDT) method and demonstrate that GEDT has power to detect interactive models in a range of simulated data, revealing GEDT to be a promising new approach for human genetics.
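
As an illustration of the general idea of building a decision tree with grammatical evolution, the following minimal sketch maps an integer genome onto a tree over SNP genotypes coded 0, 1, 2. It is not the GEDT implementation: the toy grammar, the names `map_genome` and `predict`, the depth limit, and the wrapping limit are all assumptions made for this example. The only standard ingredient is the usual grammatical evolution rule that a codon selects a production by taking its value modulo the number of available productions.

```python
from typing import List, Tuple, Union

# A tree is either a leaf (class label 0 or 1) or a split on one SNP
# with one child per genotype value (0, 1, 2).
Tree = Union[int, Tuple[int, "Tree", "Tree", "Tree"]]


def map_genome(genome: List[int], n_snps: int,
               max_depth: int = 3, max_wraps: int = 2) -> Tree:
    """Map a list of integer codons to a decision tree (hypothetical grammar).

    Toy grammar assumed for this sketch:
        <tree> ::= <leaf> | <split>
        <leaf> ::= 0 | 1
        <split> ::= (snp, <tree>, <tree>, <tree>)
    """
    pos = 0
    limit = len(genome) * (max_wraps + 1)  # codons available with wrapping

    def next_codon() -> int:
        nonlocal pos
        if pos >= limit:
            # Mapping failed; such individuals are usually treated as invalid.
            raise ValueError("genome exhausted before the tree was complete")
        codon = genome[pos % len(genome)]
        pos += 1
        return codon

    def expand(depth: int) -> Tree:
        # At the depth limit only <leaf> is allowed, so no codon is consumed
        # for the choice; otherwise the codon picks one of two productions.
        if depth >= max_depth or next_codon() % 2 == 0:
            return next_codon() % 2
        snp = next_codon() % n_snps
        return (snp, expand(depth + 1), expand(depth + 1), expand(depth + 1))

    return expand(0)


def predict(tree: Tree, genotypes: List[int]) -> int:
    """Follow the branch matching each tested SNP genotype down to a leaf."""
    while not isinstance(tree, int):
        snp, *children = tree
        tree = children[genotypes[snp]]
    return tree
```

For example, `map_genome([1, 0, 0, 1, 0, 0, 0, 1], n_snps=2)` yields a single split on SNP 0 with one leaf per genotype. Because the genome encodes the whole tree at once, an evolutionary search over genomes evaluates complete trees rather than adding one split at a time, which is the property the abstract contrasts with hierarchical model building.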

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Deodhar, S., Motsinger-Reif, A. (2010). Grammatical Evolution Decision Trees for Detecting Gene-Gene Interactions. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2010. Lecture Notes in Computer Science, vol 6023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12211-8_9

  • DOI: https://doi.org/10.1007/978-3-642-12211-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12210-1

  • Online ISBN: 978-3-642-12211-8

  • eBook Packages: Computer Science (R0)
