Genotype Error Detection Using Hidden Markov Models of Haplotype Diversity
Genotyping errors can invalidate statistical tests for linkage and disease association, particularly for methods based on haplotype analysis. Becker et al. recently proposed a simple likelihood ratio approach for detecting errors in trio genotype data. Under this approach, a SNP genotype is flagged as a potential error if deleting it increases the likelihood of the trio genotype data by a multiplicative factor exceeding a user-selected threshold. In this paper we give improved error detection methods that combine the likelihood ratio test approach with likelihood functions that can be computed efficiently from a Hidden Markov Model of haplotype diversity in the population under study. Experimental results on both simulated and real datasets show that the proposed methods achieve significantly improved detection accuracy compared to previous methods, with highly scalable running time.
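The likelihood ratio test described above can be illustrated with a minimal sketch. It is not the paper's method: it assumes a single haploid allele sequence rather than trio genotypes, and a toy two-state HMM whose parameters (`toy_init`, `toy_trans`, `toy_emit`) are made up for illustration. The names `hmm_likelihood` and `flag_errors` are hypothetical. A site is flagged when treating it as missing multiplies the likelihood by more than the chosen threshold.

```python
import numpy as np


def hmm_likelihood(obs, init, trans, emit):
    """Forward-algorithm likelihood of an allele sequence under an HMM.

    obs   : list with entries 0, 1, or None (None = deleted/missing site)
    init  : (K,) initial state probabilities
    trans : (K, K) transition probabilities between adjacent sites
    emit  : (n, K) probability of allele 1 at site j given state k
    """
    alpha = np.asarray(init, dtype=float).copy()
    for j, allele in enumerate(obs):
        if j > 0:
            alpha = alpha @ trans  # move from site j-1 to site j
        if allele is not None:
            p1 = emit[j]
            alpha = alpha * (p1 if allele == 1 else 1.0 - p1)
        # a missing site contributes a factor of 1 (it is summed out)
    return float(alpha.sum())


def flag_errors(obs, init, trans, emit, threshold):
    """Return indices of sites whose deletion raises the likelihood
    by a multiplicative factor greater than `threshold`."""
    base = hmm_likelihood(obs, init, trans, emit)
    flagged = []
    for j, allele in enumerate(obs):
        if allele is None:
            continue
        deleted = list(obs)
        deleted[j] = None
        with_deletion = hmm_likelihood(deleted, init, trans, emit)
        ratio = float("inf") if base == 0.0 else with_deletion / base
        if ratio > threshold:
            flagged.append(j)
    return flagged


# Toy model (assumed values): two founder haplotypes, all-0 and all-1,
# with a 1% chance of emitting the other allele and a 5% switch rate.
n_sites = 5
toy_init = np.array([0.5, 0.5])
toy_trans = np.array([[0.95, 0.05],
                      [0.05, 0.95]])
toy_emit = np.tile(np.array([0.01, 0.99]), (n_sites, 1))

suspect = [0, 0, 1, 0, 0]  # the middle allele disagrees with its neighbours
print(flag_errors(suspect, toy_init, toy_trans, toy_emit, threshold=10.0))
```

In this toy setting the middle site is the only one whose deletion changes the likelihood substantially, so it is the only candidate for flagging. The threshold trades sensitivity against false positives and is left to the user, as in the approach described above.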
Keywords: Hidden Markov Model, Likelihood Function, Error Detection, Haplotype Diversity, Real Dataset