Combinatorial Methods for Disease Association Search and Susceptibility Prediction
- Cite this paper as:
- Brinza D., Zelikovsky A. (2006) Combinatorial Methods for Disease Association Search and Susceptibility Prediction. In: Bücher P., Moret B.M.E. (eds) Algorithms in Bioinformatics. WABI 2006. Lecture Notes in Computer Science, vol 4175. Springer, Berlin, Heidelberg
Accessibility of high-throughput genotyping technology makes possible genome-wide association studies for common complex diseases. When dealing with common diseases, it is necessary to search and analyze multiple independent causes resulted from interactions of multiple genes scattered over the entire genome. This becomes computationally challenging since interaction even of pairs gene variations require checking more than 1012 possibilities genome-wide. This paper first explores the problem of searching for the most disease-associated and the most disease-resistant multi-gene interactions for a given population sample of diseased and non-diseased individuals. A proposed fast complimentary greedy search finds multi-SNP combinations with non-trivially high association on real data. Exploiting the developed methods for searching associated risk and resistance factors, the paper addresses the disease susceptibility prediction problem. We first propose a relevant optimum clustering formulation and the model-fitting algorithm transforming clustering algorithms into susceptibility prediction algorithms. For three available real data sets (Crohn’s disease (Daly et al, 2001), autoimmune disorder (Ueda et al, 2003), and tick-borne encephalitis (Barkash et al, 2006)), the accuracies of the prediction based on the combinatorial search (respectively, 84%, 83%, and 89%) are higher by 15% compared to the accuracies of the best previously known methods. The prediction based on the complimentary greedy search almost matches the best accuracy but is much more scalable.
Unable to display preview. Download preview PDF.