Genetic programming (GP) shows great promise for solving complex problems in human genetics. Unfortunately, many of these methods are not accessible to biologists. This is partly due to the complexity of the algorithms, which limits their ready adoption and integration into an analysis or modeling paradigm that might otherwise use only univariate statistical methods. It is also partly due to the lack of user-friendly, open-source, platform-independent, and freely available software packages designed for routine use by biologists. Our objective is to develop, distribute, and support a comprehensive software package that puts powerful GP methods for genetic analysis in the hands of geneticists. Our working hypothesis is that the most effective use of such a package would result from interactive analysis by both a biologist and a computer scientist (i.e., human-human-computer interaction). We present here the design and implementation of an open-source software package called Symbolic Modeler (SyMod) that seeks to facilitate geneticist-bioinformaticist-computer interactions for problem solving in human genetics. We present the results of applying SyMod to real data and discuss the challenges associated with delivering a user-friendly GP-based software package to the genetics community.
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Moore, J.H., Barney, N., White, B.C. (2008). Solving Complex Problems in Human Genetics Using Genetic Programming: The Importance of Theorist-Practitioner-Computer Interaction. In: Riolo, R., Soule, T., Worzel, B. (eds) Genetic Programming Theory and Practice V. Genetic and Evolutionary Computation Series. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-76308-8_5
DOI: https://doi.org/10.1007/978-0-387-76308-8_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-76307-1
Online ISBN: 978-0-387-76308-8
eBook Packages: Computer Science, Computer Science (R0)