Exploiting Expert Knowledge in Genetic Programming for Genome-Wide Genetic Analysis
Human genetics is undergoing an information explosion. The availability of chip-based technology facilitates the measurement of thousands of DNA sequence variations from across the human genome. The challenge is to sift through these high-dimensional datasets to identify combinations of interacting DNA sequence variations that are predictive of common diseases. The goal of this paper was to develop and evaluate a genetic programming (GP) approach for attribute selection and modeling that uses expert knowledge, such as Tuned ReliefF (TuRF) scores, during selection to ensure that trees with good building blocks are recombined and reproduced. We show here that using expert knowledge to select trees performs as well as a multiobjective fitness function but requires only a tenth of the population size. This study demonstrates that GP may be a useful computational discovery tool in this domain.
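The abstract does not spell out how expert knowledge enters selection, so the following is only a minimal sketch of one plausible reading: each tree is scored by the mean expert-knowledge score of the attributes it uses, and parents are sampled with probability proportional to that score. The names `tree_score` and `select_parents`, the representation of a tree as a list of attribute names, and the example scores are all hypothetical, and the sketch assumes the scores have been rescaled to be non-negative (raw ReliefF-style scores can be negative).

```python
import random

def tree_score(attributes, expert_scores):
    """Mean expert-knowledge score of the attributes used by one tree."""
    return sum(expert_scores[a] for a in attributes) / len(attributes)

def select_parents(population, expert_scores, n_parents, rng):
    """Sample parents with probability proportional to tree_score.

    Assumes every tree uses at least one attribute and that all
    scores are non-negative, as random.choices requires for weights.
    """
    weights = [tree_score(tree, expert_scores) for tree in population]
    return rng.choices(population, weights=weights, k=n_parents)

# Hypothetical, made-up scores standing in for rescaled TuRF output.
expert_scores = {"snp1": 0.9, "snp2": 0.8, "snp3": 0.1, "snp4": 0.05}

# Each "tree" is reduced to the list of attributes it references.
population = [["snp1", "snp2"], ["snp3", "snp4"], ["snp1", "snp3"]]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
parents = select_parents(population, expert_scores, n_parents=1000, rng=rng)
```

Under this scheme, trees built from highly scored attributes are recombined more often, which is the intuition behind needing a smaller population; the actual selection operator used in the paper may differ.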