Abstract
Both environmental and genetic factors play a role in the development of some diseases. Complex diseases, such as Crohn’s disease or type II diabetes, are caused by a combination of environmental factors and mutations in multiple genes. Patients diagnosed with such diseases cannot easily be treated. However, many diseases can be avoided if people at high risk change their lifestyle, for example their diet. How, then, can susceptibility to disease be determined before symptoms appear, so that people can make informed decisions about their health? Susceptibility to complex diseases can be predicted by analyzing genetic data. With the development of DNA microarray technology, it has become possible to access human genetic information related to specific diseases. This paper uses a combinatorial method to analyze genetic case-control data for Crohn’s disease. A distance-based clustering method is applied to publicly available genotype data on Crohn’s disease for epidemiological study and achieves a highly accurate result.
Cite this article
Tu, Y., Mao, W. A distance-based cluster algorithm for genomic analysis in genetic disease. Interdiscip Sci Comput Life Sci 4, 90–96 (2012). https://doi.org/10.1007/s12539-012-0124-y