Abstract
A fundamental goal of human genetics is the discovery of polymorphisms that predict common, complex diseases. It is hypothesized that complex diseases arise from a myriad of factors, including environmental exposures and complex genetic risk models that involve gene-gene interactions. Such interactive models present an important analytical challenge, requiring that methods perform both variable selection and statistical modeling to generate testable genetic model hypotheses. Decision trees are a highly successful, easily interpretable data-mining method, but they are typically optimized with a hierarchical model-building approach, which limits their potential to identify interactive effects. To overcome this limitation, we use evolutionary computation, specifically grammatical evolution, to build decision trees that detect and model gene-gene interactions. Here, we introduce the Grammatical Evolution Decision Trees (GEDT) method and demonstrate that GEDT has the power to detect interactive models in a range of simulated data, revealing GEDT to be a promising new approach for human genetics.
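To make the idea of building decision trees by grammatical evolution concrete, the following is a minimal sketch of a genotype-to-phenotype mapping, not the GEDT implementation itself. The grammar, the genotype coding (0, 1, 2 copies of an allele), the depth limit, and the names `decode` and `predict` are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Tuple, Union

# A tree is either a leaf (class label 0 or 1) or a split on one SNP with
# one child per genotype value (0, 1, 2). Hypothetical representation.
Tree = Union[int, Tuple[int, "Tree", "Tree", "Tree"]]


def decode(genome: List[int], n_snps: int, max_depth: int = 3) -> Tree:
    """Map a list of integer codons to a decision tree (GE-style mapping).

    Assumed grammar:
        <tree> ::= <leaf> | split(<snp>, <tree>, <tree>, <tree>)
        <leaf> ::= 0 | 1
        <snp>  ::= 0 .. n_snps - 1
    Each choice is made by taking the next codon modulo the number of options.
    """
    pos = 0

    def next_codon() -> int:
        nonlocal pos
        codon = genome[pos % len(genome)]  # wrap around if codons run out
        pos += 1
        return codon

    def build(depth: int) -> Tree:
        # Force a leaf at the depth limit so the mapping always terminates.
        if depth >= max_depth or next_codon() % 2 == 0:
            return next_codon() % 2
        snp = next_codon() % n_snps
        return (snp, build(depth + 1), build(depth + 1), build(depth + 1))

    return build(0)


def predict(tree: Tree, genotypes: List[int]) -> int:
    """Follow the branch matching each tested SNP's genotype down to a leaf."""
    while not isinstance(tree, int):
        snp, *children = tree
        tree = children[genotypes[snp]]
    return tree


# Hypothetical genome that decodes to a purely interactive two-SNP model:
# the predicted class depends on both SNPs jointly, not on either one alone.
genome = [7, 4, 3, 9, 2, 5, 8, 11, 1, 3, 6, 9, 5, 7, 0, 1, 4]
tree = decode(genome, n_snps=2, max_depth=2)
```

In a full evolutionary search, a population of such genomes would be scored by a fitness function (for example, classification accuracy on case-control data) and varied by crossover and mutation. Because the whole tree is decoded and evaluated at once, a split that is uninformative on its own can still be retained when it is useful in combination with another.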
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Deodhar, S., Motsinger-Reif, A. (2010). Grammatical Evolution Decision Trees for Detecting Gene-Gene Interactions. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2010. Lecture Notes in Computer Science, vol 6023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12211-8_9
DOI: https://doi.org/10.1007/978-3-642-12211-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12210-1
Online ISBN: 978-3-642-12211-8
eBook Packages: Computer Science, Computer Science (R0)