Abstract
DNA microarrays have been used successfully in human genetics research to assay single nucleotide polymorphisms (SNPs) throughout the genome. They work by using a photometric assay to measure, for each allele, the relative amount of input DNA that binds to a set of complementary oligonucleotide probes. Once the raw data are collected, they must be converted into genotype calls automatically and with high accuracy. Over the past decade, many groups have published calling algorithms that achieve greater than 99.5% accuracy. However, these algorithms work best for common SNPs and are less accurate for low-frequency and rare variants (minor allele frequency < 5%). With the widespread use of microarrays that target rare variants, such as the Exome Chip and MetaboChip, new calling algorithms that call rare variants accurately have recently been published. In this chapter, we describe how DNA microarrays work (see section “Microarray Technology”), give a brief overview of genotype calling algorithms (see section “Genotype Calling Algorithms”), and summarize the algorithms designed for rare variants and how well they perform (see section “Application to Rare Variants”).
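As a rough illustration of the conversion step described above, the sketch below turns a pair of allele probe intensities into a genotype call and then computes the minor allele frequency for one SNP. It is not any of the published algorithms: the function names, the fixed angle cutoffs, and the minimum-intensity threshold are all assumptions chosen for illustration. Published callers typically estimate cluster positions from the data rather than relying on hand-picked cutoffs.

```python
import math


def call_genotype(x, y, low=0.2, high=0.8, min_intensity=0.2):
    """Naive genotype call from two allele intensities.

    x, y: normalized intensities for allele A and allele B (hypothetical inputs).
    low, high, min_intensity: illustrative cutoffs, not taken from any real caller.
    """
    if x + y < min_intensity:
        return None  # too little total signal: no call
    # Map the intensity pair to an angle in [0, 1]: 0 = pure A, 1 = pure B.
    theta = (2 / math.pi) * math.atan2(y, x)
    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"


def minor_allele_frequency(calls):
    """Frequency of the less common allele among called genotypes."""
    called = [g for g in calls if g is not None]
    if not called:
        return None
    b_count = sum(g.count("B") for g in called)
    freq_b = b_count / (2 * len(called))
    return min(freq_b, 1 - freq_b)


# Example: four samples with signal, one with too little signal to call.
intensities = [(1.0, 0.05), (0.9, 0.1), (0.5, 0.5), (1.1, 0.02), (0.05, 0.05)]
calls = [call_genotype(x, y) for x, y in intensities]
maf = minor_allele_frequency(calls)
```

Under the 5% threshold mentioned in the abstract, a SNP whose `maf` falls below 0.05 would count as low-frequency or rare. In that regime few samples carry the minor allele, which is the setting the rare-variant callers are designed for.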
Copyright information
© 2015 Springer Science+Business Media New York
About this chapter
Cite this chapter
Goldstein, J.I., Neale, B.M. (2015). Calling Rare Variants from Genotype Data. In: Zeggini, E., Morris, A. (eds) Assessing Rare Variation in Complex Traits. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2824-8_1
DOI: https://doi.org/10.1007/978-1-4939-2824-8_1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2823-1
Online ISBN: 978-1-4939-2824-8
eBook Packages: Biomedical and Life Sciences; Biomedical and Life Sciences (R0)