HapTree: A Novel Bayesian Framework for Single Individual Polyplotyping Using NGS Data
Using standard genotype calling tools, it is possible to accurately identify the number of “wild type” and “mutant” alleles (A, C, G, or T) for each single-nucleotide polymorphism (SNP) site. In the case of two heterozygous SNP sites, however, genotype calling tools cannot determine whether “mutant” alleles from different SNP loci are on the same or different chromosomes. While in many cases the former would be healthy, the latter can cause loss of function; it is therefore important to identify the phase (the copies of a chromosome on which the mutant alleles occur) in addition to the genotype. This calls for efficient algorithms that obtain an accurate and comprehensive haplotype reconstruction (the phase of heterozygous SNPs in the genome) directly from next-generation sequencing (NGS) read data. Nearly all previous haplotype reconstruction studies have focused on diploid genomes, and their methods rarely scale to genomes of higher ploidy; however, computational investigations into polyploid genomes carry great importance, impacting plant, yeast and fish genomics, as well as studies into the evolution of modern-day eukaryotes and (epi)genetic interactions between copies of genes.
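To make the phasing ambiguity concrete, the following minimal sketch enumerates every set of haplotypes that is consistent with a given genotype. It is an illustration only and is not the HapTree algorithm; the function name `enumerate_phasings` and the 0/1 encoding (0 for “wild type”, 1 for “mutant”) are assumptions introduced here. For a diploid with two heterozygous SNP sites it returns the two candidate phases (mutant alleles on the same chromosome, or on different chromosomes), which the genotype alone cannot tell apart.

```python
from itertools import combinations, product


def enumerate_phasings(mutant_counts, ploidy):
    """List all distinct haplotype sets consistent with a genotype.

    mutant_counts[i] is the number of chromosome copies that carry the
    mutant allele at SNP site i (hypothetical encoding for illustration).
    Each haplotype is a tuple of 0/1 values, one entry per SNP site.
    """
    # For every site, list the ways to choose which copies carry the mutant.
    per_site_choices = [
        list(combinations(range(ploidy), count)) for count in mutant_counts
    ]

    phasings = set()
    for carriers in product(*per_site_choices):
        # Build one haplotype per chromosome copy.
        haplotypes = [
            tuple(1 if copy in site_carriers else 0 for site_carriers in carriers)
            for copy in range(ploidy)
        ]
        # Chromosome copies are unlabeled, so sort to merge equivalent sets.
        phasings.add(tuple(sorted(haplotypes)))
    return sorted(phasings)


# Diploid, two heterozygous sites: same chromosome vs. different chromosomes.
print(enumerate_phasings([1, 1], 2))
# [((0, 0), (1, 1)), ((0, 1), (1, 0))]

# Tetraploid, two mutant copies at each of two sites: three candidate phases.
print(len(enumerate_phasings([2, 2], 4)))
# 3
```

The brute-force enumeration is only practical for a handful of sites, since the number of candidate phasings grows quickly with both the number of SNPs and the ploidy; this is the reason read data and an efficient search are needed in practice.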