Abstract
The International HapMap Project is a partnership of scientists and funding agencies from different countries to develop a public resource that helps researchers find genes associated with human disease and with response to pharmaceuticals. The project has collected large amounts of single-nucleotide polymorphism (SNP) data from individuals of different human populations, and many researchers have extracted evolutionary information from these data. Finding all the SNPs related to human evolution nevertheless remains difficult. In most cases these SNPs act together to produce the differences between human populations, and the number of possible SNP combinations is so large that checking all of them is infeasible. In this paper, a novel algorithm is proposed to find SNP combinatorial patterns whose frequencies differ markedly between two populations. The number of such multi-SNP combinations is taken as the difference between each pair of human populations, and a hierarchical clustering algorithm is then used to construct evolutionary trees for the populations. The trees obtained from 4 chromosomes are consistent with one another, and the result agrees with the existing literature, which indicates that the evolutionary information is well mined. The multi-SNP combinations found by our method can be studied further in many respects.
This work is supported in part by the National Natural Science Foundation of China under grants No. 61232001, No. 61379108 and No. 61370172, and by the Program for New Century Excellent Talents in University (NCET-12-0547).
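The pipeline outlined in the abstract (count the multi-SNP combinations whose pattern frequencies differ between two populations, use that count as the pairwise difference, then cluster hierarchically) can be illustrated with a minimal sketch. This is not the authors' algorithm: it enumerates every combination exhaustively, which the abstract notes is infeasible at real scale, so it is only suitable for toy input. The binary allele encoding, the frequency threshold `min_diff`, the choice of average linkage, the function names, and the populations `POP_A`, `POP_B`, `POP_C` are all assumptions made for illustration.

```python
from itertools import combinations


def pattern_freqs(pop, snps):
    """Frequency of each allele pattern observed at the chosen SNP positions."""
    counts = {}
    for individual in pop:
        key = tuple(individual[s] for s in snps)
        counts[key] = counts.get(key, 0) + 1
    return {k: v / len(pop) for k, v in counts.items()}


def count_differing(pop_a, pop_b, size=2, min_diff=0.5):
    """Count SNP combinations having at least one pattern whose frequency
    differs by min_diff or more between the two populations.
    Exhaustive enumeration: an assumption for toy data, not the paper's search."""
    n_snps = len(pop_a[0])
    total = 0
    for snps in combinations(range(n_snps), size):
        fa = pattern_freqs(pop_a, snps)
        fb = pattern_freqs(pop_b, snps)
        patterns = set(fa) | set(fb)
        if any(abs(fa.get(p, 0.0) - fb.get(p, 0.0)) >= min_diff for p in patterns):
            total += 1
    return total


def average_linkage_tree(names, dist):
    """Agglomerative clustering with average linkage (assumed linkage rule).
    Returns the tree as nested tuples of population names."""
    clusters = [(name, [name]) for name in names]  # (subtree, member names)
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            mi, mj = clusters[i][1], clusters[j][1]
            d = sum(dist[frozenset((a, b))] for a in mi for b in mj)
            d /= len(mi) * len(mj)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def build_tree(pops, size=2, min_diff=0.5):
    """Pairwise differences from combination counts, then clustering."""
    names = list(pops)
    dist = {
        frozenset((a, b)): count_differing(pops[a], pops[b], size, min_diff)
        for a, b in combinations(names, 2)
    }
    return average_linkage_tree(names, dist)


# Hypothetical toy data: 3 populations, 4 individuals each, 3 binary SNPs.
toy = {
    "POP_A": [(0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0)],
    "POP_B": [(0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 0, 1)],
    "POP_C": [(1, 1, 0), (1, 1, 0), (1, 1, 1), (1, 1, 0)],
}
tree = build_tree(toy)
```

On this toy input the two populations with similar pattern frequencies are joined first, and the third population attaches to that pair at the root. A practical implementation would need a pruned or heuristic search over combinations and a correction for multiple testing, neither of which is attempted here.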
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ding, X., Gu, H., Zhang, Z., Li, M., Wu, F. (2014). Searching SNP Combinations Related to Evolutionary Information of Human Populations on HapMap Data. In: Basu, M., Pan, Y., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2014. Lecture Notes in Computer Science, vol 8492. Springer, Cham. https://doi.org/10.1007/978-3-319-08171-7_25
DOI: https://doi.org/10.1007/978-3-319-08171-7_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08170-0
Online ISBN: 978-3-319-08171-7
eBook Packages: Computer Science, Computer Science (R0)