New Genome-Wide Methods for Elucidation of Candidate Copy Number Variations (CNVs) Contributing to Alzheimer’s Disease Heritability

Part of the Methods in Molecular Biology book series (MIMB, volume 1303)

Abstract

The complexity of human genetic variation has been extended by the observation of abundant and widespread variation in the copy number of submicroscopic DNA segments. The discovery of this novel level of genome organization opened new possibilities concerning the genetic variation that may confer susceptibility to or cause disease. Copy number variants (CNVs) influence gene expression, phenotypic variation and adaptation by altering gene dosage and genome organization. Concordant with the common disease, common variant hypothesis, these structural variants are now subject to interrogation for disease association. Alzheimer’s disease (AD) is a progressive neurodegenerative disease with an estimated heritability of 60–80 %. Large scale genome-wide association studies (GWAS) using high frequency single nucleotide polymorphism (SNP) variants identified ten loci, which do not account for the measured heritability. To find the missing heritability, systematic assessment of all mutational mechanisms needs to be performed. Between the powerful SNP-GWAS studies and the planned whole genome sequencing projects, the contribution of copy number variation (CNV) to the genetic architecture of AD needs to be studied fully.

Key words

Copy number variation; CNV; Alzheimer’s disease; Heritability; GWAS

1 Introduction

Alzheimer’s disease (AD) is the most common form of dementia and leads to unrelenting cognitive decline [1]. With increased longevity the prevalence of AD in the elderly represents a major public health problem. The heritability of AD is estimated at 60–80 % [2]. Several large scale genome-wide association studies (GWAS) using high frequency variants identified ten loci including APOE with a combined population attributable fraction of 0.51–0.6 [3]. To find the missing heritability systematic assessment of all mutational mechanisms needs to be performed.

The advent of whole-genome scanning methods revealed widespread variation in the copy number of submicroscopic DNA segments. Copy number variation (CNV) is defined as a DNA segment that is 1 kb or larger and is present at variable copy number in comparison with the reference genome [4]. CNVs are a group of structural variants and can be classified as deletions, duplications, deletions and duplications at the same locus, multi-allelic loci, and complex rearrangements.

CNVs are major contributors to genetic variance, thus it is conceivable that they may contribute to the heritability of disease [5]. CNVs influence gene expression, phenotypic variation and adaptation by altering gene dosage [5]; 18 % of the gene expression traits are associated with CNVs [6].

CNVs have been identified in Mendelian disease and were found to be associated with complex neurological traits. Duplication of amyloid precursor protein (APP) causes autosomal dominant early-onset AD with cerebral amyloid angiopathy [7], duplication and triplication of α-synuclein (SNCA) cause familial Parkinson disease [8], and lamin B1 (LMNB1) duplication causes leukodystrophy [9]; all were confirmed by segregation of the disease phenotype with the CNV in autosomal dominant families. CNV GWAS studies implicated several candidate loci contributing to the AD phenotype [10, 11, 12, 13, 14, 15, 16, 17].

The recombination events resulting in CNVs may be frequent. At the whole genome level about 0.3 % of biallelic CNV genotypes exhibit Mendelian discordance in parent-offspring trios [5, 18]. End tissue mosaicism could add additional complexity and introduce overlap between CNV states [19].

CNV studies leveraging the single nucleotide polymorphism (SNP) arrays used in traditional GWAS face multiple challenges, including variable coverage per platform, batch effects, and limited resolution due to inferior dynamic range [20]. To overcome these difficulties, the first iteration of CNV analyses of SNP arrays in AD applied very similar workflows, concentrating on high stringency calls [10, 11, 12, 13, 14, 15, 16, 17]. The majority of the studies performed genotyping on the Illumina platform with a coverage in the 600 k range, except the Translational Genomics Research Institute (TGEN) study, which used the Affymetrix 6.0 array with two million probes (Table 1). The analysis methods were strikingly similar. To comply with the high stringency principle for inferred CNVs shared by all the studies, CNVs were excluded from the analysis based on number of probes, size, and overlap with CNV variant regions or segmental duplications, and sometimes even based on frequency (Table 2). These studies investigated only the tip of the iceberg, with high specificity but low sensitivity for CNV detection. Importantly, these association studies addressed the question of whether rare, large CNVs contribute to the genetic architecture of AD; however, because the allele frequencies of these large variants are very low, most studies were underpowered, and very large sample sizes of over 10,000 cases and controls are needed. Variants with higher frequencies blur the CNV calls: the kernel distributions of the copy number states overlap, and the derived reference genome tends to deviate from the diploid state. Both effects increase the genotyping error rate and thus decrease power. Such variants are harder to study on SNP arrays and were eliminated in the first iteration.

Table 1

Published CNV GWAS studies; case-control

| Study | Platform | Input DNA | AD | MCI | Control |
|---|---|---|---|---|---|
| GERAD | Illumina 610-quad | 200 ng | 3,260 | 0 | 1,290 |
| ADNI | Illumina Human610-Quad | NA | 288 | 183 | 184 |
| Caribbean Hispanics | Illumina HumanHap 650Y | NA | 559 | 0 | 554 |
| Duke | Illumina Human Hap550K | NA | 331 | 0 | 368 |
| TGEN | Affymetrix 6.0 | NA | 1,022 | 0 | 595 |
| NCRAD | Illumina Human610-Quad | NA | 711 | 0 | 171 |

Table 2

Outline of the methodology applied in published CNV GWAS studies; case-control

| Study | LogR calculation | Reference file | Segmentation algorithm | Model | CNV exclusion |
|---|---|---|---|---|---|
| GERAD | BeadStudio | Not mentioned | PennCNV | Hidden Markov Model | <20 probes, <100 kb, density <1/15 kb, >50 % overlap with segdup |
| ADNI | GenomeStudio | Not mentioned | PennCNV | Hidden Markov Model | <10 probes, overlap with centromeric and immunoglobulin regions |
| Caribbean Hispanics | BeadStudio | Not mentioned | QuantiSNP, iPattern, PennCNV, CNVpartition | Multiple | <5 probes, <100 kb, overlap with centromeric and immunoglobulin regions, 50 % overlap with segdup, >1 % frequency |
| Duke | BeadStudio | Not mentioned | PennCNV | Hidden Markov Model | 10 SNPs, 50 % overlap with previously published regions |
| TGEN | Unknown | Not mentioned | PennCNV | Hidden Markov Model | 10 SNPs, 50 % overlap with centromeric, telomeric and immunoglobulin regions |
| NCRAD | GenomeStudio | Not mentioned | PennCNV | Hidden Markov Model | <10 probes, likelihood ratio <10, centromeric, immunoglobulin |

An alternative analysis strategy is to use segmentation only to reduce the dataset to regions where events may occur, perform the test of association on the numeric segmented data, and validate the CNV calls if a replicated association signal is detected [16, 17]. This approach detects association signals from smaller events that would have been discarded when performing the high confidence calls and overcomes the need to determine exact dosage, which is often problematic at common CNV loci as the reference may deviate from the diploid state. This approach is a screen and requires diligent validation and replication.

As GWAS studies are performed with increasing sample sizes [3, 21, 22, 23], it is becoming clear that in disorders with marked genetic heterogeneity, where the marker specific risk is low in case-control sets, it is difficult to distinguish the true positives from the false positives and to replicate the results [24, 25]. In addition, the case-control design in AD suffers from additional confounders, such as misclassification bias due to age-dependent penetrance. To further empower association studies, quantitative endophenotypes may replace traditional case-control designs, examples being age at onset analysis or the use of expression quantitative trait loci (eQTL). Genetic variation, both single nucleotide variations (SNVs) and copy number variations (CNVs), contributes to changes in gene expression. In some cases these variations are meaningfully correlated with disease states [26].

2 Methods

Three major high-resolution methods are currently available for detection of gene dosage at the genome level [27, 28]. Deep sequencing methodologies can detect CNVs but are prohibitively expensive for whole genome studies of large patient cohorts. Most of the available copy number data has been collected using array comparative genomic hybridization (aCGH) or derived from SNP arrays (Table 3). Both approaches are subject to rapid technical improvements. The methodology in SNP arrays includes an amplification step, which reduces the resolution of the CN calls. Another important difference is the derivation of CNV state in relation to a reference genome: while aCGH uses a single genome in every experiment as a common denominator (1 to 1 comparison), the SNP arrays use a bioinformatically generated reference genome from multiple cases (1 to average comparison). In aCGH the labeling is controlled at every single array, while in a SNP array the reference value will depend on the normalization efficiency and the allele frequency of any given CNV. Whole genome and exon sequencing methods are in development [29]. However, AD cohorts with adequate sample sizes are not available yet.

Table 3

Comparison of SNP array and array comparative genome hybridization principles

| | aCGH | SNP array |
|---|---|---|
| Design | Main application | Secondary application |
| Probes empirically tested | Yes | No |
| Amplification step | No | Yes |
| Reference sample | Intraexperimental | Extraexperimental, reference mean of >40 samples |
| Interarray variability | Compensated for by reference sample | Compensated for by normalization |
| Intraarray variability | Compensated for by normalization | Compensated for by normalization |
| Optimization for sensitivity and specificity | For CNV | For SNP calling |

3 Data Analysis Workflows

Several analysis workflows have been proposed and used. The first iteration workflows focused on high stringency calls optimizing specificity over sensitivity, and these often aim for redundancy to further enhance true positives. Other workflows focus on sensitivity over specificity, enhance resolution and apply complementary methods to fully explore datasets with the implied necessity for replication in additional cohorts with orthogonal methods. The selection of the workflow depends on the research question and requires orthogonal validation methods with locus specific high throughput assays (e.g. PCR or long range PCR for breakpoint, qPCR, TaqMan assay or multiplex ligation-dependent probe amplification (MLPA) for dosage).

3.1 Workflow for SNP Array Data

3.1.1 Quality Control (QC)

  1. Experimental quality control: Most major platforms developed QC packages independently and these are incorporated in the data capturing process. Most of these QC parameters focus on signal to noise ratio as this is the key for the segmentation algorithms. For the Affymetrix arrays contrast QC and median absolute pair-wise differences are calculated, while for the Illumina arrays mean, median and standard deviation of the logR ratios and the B allele frequency are determined. Arrays with CNV calls more than two SD from the mean are eliminated as this reflects uneven baseline with false positive CNV calls.

  2. Data quality control: The parallel detected SNP alleles allow additional QC measures including SNP call rate (exclude <97 %), gender mismatch (by X chromosome logR ratio), and related or duplicate samples (Pi > 0.95) by determining IBD with PLINK software using the genotype data.
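
The two numeric QC filters above (call rate and CNV count outliers) could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the dictionary keys `call_rate` and `n_cnv_calls` are hypothetical names, and reading "more than two SD from the mean" as a two-sided cutoff is an assumption.

```python
from statistics import mean, stdev

def qc_filter(samples, min_call_rate=0.97, max_sd=2.0):
    """Return (kept, dropped) sample IDs.

    `samples` maps a sample ID to a dict with the hypothetical keys
    'call_rate' (fraction of SNPs called) and 'n_cnv_calls'.
    """
    counts = [s["n_cnv_calls"] for s in samples.values()]
    mu, sd = mean(counts), stdev(counts)
    kept, dropped = [], []
    for sid, s in samples.items():
        low_call_rate = s["call_rate"] < min_call_rate
        # Assumption: outlying CNV counts in either direction are removed.
        cnv_outlier = sd > 0 and abs(s["n_cnv_calls"] - mu) > max_sd * sd
        (dropped if low_call_rate or cnv_outlier else kept).append(sid)
    return kept, dropped
```

Gender mismatch and relatedness checks are not covered here, since they depend on the genotype data and the PLINK output.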

     

3.1.2 Population Substructure/Admixture by the SNP Dataset

The concomitant SNP detection allows the determination of underlying substructure or admixture using principal component analysis (PCA). Most of the commercially available software packages incorporate this feature, and the most commonly used package, Eigensoft, is available open access. Primary analysis focuses on Caucasian subjects; in addition, the principal components (PCs) are used as covariates in the statistical model.

3.1.3 LogR Ratio Calculation

The log2 ratio calculation is one of the key elements, especially for the Affymetrix arrays, where the analyst can define the samples contributing to the derived reference genome. Because of batch effects, using within-study samples as the reference genome enhances data quality, and many more arrays (up to 155 more) will pass QC measures. In the first pass, the reference file is generated from all controls in the given dataset. In the second pass, the reference for the complete analysis is generated from the top 100 control samples ranked by derivative log ratio spread (DLRS) to optimize the elimination of noise. Normalization of logR data is performed by cRMAv2 (Bioconductor). The logR ratio data is subjected to numeric principal component analysis (GoldenHelix) and corrected for the number of PCs that yields a QQ plot devoid of inflation.
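
The two-pass reference construction can be sketched roughly as below. Several details are assumptions rather than the published method: the reference is taken as the per-probe mean (following Table 3), DLRS is computed with a common IQR-based estimator, "top" is read as lowest DLRS, and all function names are illustrative. Normalization and PC correction are omitted.

```python
import math
from statistics import fmean, quantiles

def log2_ratios(sample, reference):
    """Per-probe log2(sample / reference) for two equal-length intensity lists."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]

def pooled_reference(arrays):
    """Per-probe mean intensity across a list of arrays."""
    return [fmean(probe) for probe in zip(*arrays)]

def dlrs(log_ratios):
    """Derivative log ratio spread: robust SD of adjacent-probe differences."""
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _, q3 = quantiles(diffs, n=4)
    # IQR / 1.349 approximates an SD; sqrt(2) accounts for differencing.
    return (q3 - q1) / 1.349 / math.sqrt(2)

def two_pass_reference(controls, top_n=100):
    """First pass: all controls. Second pass: the top_n least noisy controls."""
    first = pooled_reference(list(controls.values()))
    noise = {sid: dlrs(log2_ratios(arr, first)) for sid, arr in controls.items()}
    quietest = sorted(noise, key=noise.get)[:top_n]
    return pooled_reference([controls[sid] for sid in quietest]), quietest
```

Here `controls` maps a control sample ID to its probe intensities in genomic order; the returned reference would then be used to recompute log2 ratios for every array in the study.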

3.1.4 Numeric Array Data or Segmentation

  1. Numerical array data: Quantile normalized numeric data is used in the analysis as independent variable. This approach suffers from marked multiple testing burden but has the advantage of highest resolution.

  2. Segmented numeric data: Normalized, PC corrected numeric data is segmented to identify probes where a CNV is detected in any of the samples in the set. The segmentation results in a reduced dataset while maintaining the advantages of the numeric data without binned CNVs.

  3. Inferred CNVs: Different algorithms give different results even on a single dataset. Algorithms developed for data from a specific platform appear to perform better than generic CNV calling algorithms; a head to head comparison of CNV calling algorithms on data captured on various platforms found that generic algorithms, or algorithms developed for a different dataset, have lower specificity and sensitivity [20]. CNV calls are collapsed into regions, and CNVs called by either algorithm or by both algorithms are entered into the analysis, depending on the goals regarding sensitivity and specificity. Validation of CNV calls with aCGH is depicted in Fig. 1. The two algorithms used generated distinct but overlapping CNV calls, and several of the single algorithm calls were validated by aCGH. This suggests that some of the algorithms are complementary.

    Fig. 1

    Validation of CNV calls inferred by a Circular Binary Segmentation and a Hidden Markov Model by aCGH. Segmentation on 50 samples was performed using the Hidden Markov Model (HMM) algorithm implemented in Genotyping Console and the Circular Binary Segmentation (CBS) algorithm DNAcopy implemented in R. We used aCGH to reference the CNV calls to a gold standard in the same subjects. The number of events ascertained by the two segmentation algorithms was 2,282. CBS generated 2,060 CNV calls in the 50 subjects, while HMM generated 1,264 calls. 1,042 calls were overlapping between the two algorithms. There were 1,018 CBS only and 222 HMM only calls. aCGH validated a high percentage of single algorithm calls and all the double algorithm calls in regions where coverage was comparable. This suggests that the two segmentation algorithms are complementary. The CNV calls are depicted to the right of the karyotype: HMM (light gray/blue), CBS (dark gray/purple) and aCGH (medium gray/orange) from left to right
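
Splitting two call sets into shared and single algorithm calls, as in the counts reported for Fig. 1 (1,042 + 1,018 + 222 = 2,282), could be sketched as below. The 50 % reciprocal overlap criterion and the tuple layout are assumptions made for illustration; the source does not state how overlap between the CBS and HMM calls was defined.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions for intervals given as (start, end)."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return 0.0
    return min(shared / (a[1] - a[0]), shared / (b[1] - b[0]))

def compare_callsets(calls_a, calls_b, min_overlap=0.5):
    """Split calls into (both, a_only, b_only).

    Each call is (sample, chrom, start, end). A call counts as shared when a
    call from the other set, in the same sample and chromosome, reaches the
    reciprocal overlap threshold.
    """
    def matches(call, others):
        return any(
            call[:2] == other[:2]
            and reciprocal_overlap(call[2:], other[2:]) >= min_overlap
            for other in others
        )

    both = [c for c in calls_a if matches(c, calls_b)]
    a_only = [c for c in calls_a if not matches(c, calls_b)]
    b_only = [c for c in calls_b if not matches(c, calls_a)]
    return both, a_only, b_only
```

The pairwise scan is quadratic in the number of calls, which is adequate for a sketch of this size; an interval index would be the natural choice for full cohorts.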

     

3.1.5 Test of Association

  1. Numerical array data as independent variable: This approach searches for genomically contiguous regions where CN state has an effect on case-control status. To enhance the analysis a “thin and bin” approach is applied.

    Thinning and Binning: Every other oligo is sampled to divide the data in half. In each half, K genomically adjacent oligos are binned and case-control association is performed on the mean CNV state within each thinned bin. False discovery rate (FDR) values for each thin bin p value are calculated, and the q-values for the CNV’s coefficient are ranked from lowest (near 0) to highest (near 1) in each half. K = 2 and K = 100 are tested empirically. The K at which maximum concordance is attained with FDR q values less than 0.05 in each data half is selected. The direction of effect (sign of the beta coefficient) is verified to be concordant.

    Effects of moderate size: The case-control association is performed on the entire dataset, removing the thinning but retaining data aggregation into K oligo bins. FDR q values are calculated.

  2. Segmented numeric data as independent variable: Appropriate statistical methods including T-test, or various regression models are used depending upon the dependent variable.

  3. Inferred CNVs as independent variable: Appropriate statistical methods including T-test, or various regression models are used depending upon the dependent variable.
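
The thinning, binning, and FDR steps described in item 1 above could be sketched as follows. This is an illustrative outline with hypothetical function names: the association test itself is left out, so per-bin p values from whichever test suits the dependent variable would be passed to `bh_qvalues`. Using the Benjamini-Hochberg procedure for the q values is an assumption, as the source does not name the FDR method.

```python
from statistics import fmean

def thin(values):
    """Split probe-ordered values into two interleaved halves."""
    return values[0::2], values[1::2]

def bin_means(values, k):
    """Mean of each run of k adjacent values; a trailing partial bin is dropped."""
    usable = len(values) - len(values) % k
    return [fmean(values[i:i + k]) for i in range(0, usable, k)]

def bh_qvalues(pvalues):
    """Benjamini-Hochberg q values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

def thin_and_bin(sample_profile, k):
    """Binned means for the two halves of one sample's probe-ordered log2 ratios."""
    even, odd = thin(sample_profile)
    return bin_means(even, k), bin_means(odd, k)
```

Running the association on each half separately and comparing which bins reach q < 0.05 with the same sign of effect gives the concordance used to choose K.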

     

3.1.6 Visualization of Log2 Ratio Data

  1. Kernel distribution: These distribution plots delineate the separation of the various copy number states and assess the probability of the CNV events.

  2. Log2 ratio data as genomic location: These visualization strategies depict the size of the CNV, the signal to noise ratio, the number of probes covering the CNV and the consistency of adjacent log2 ratios, also assessing probability.
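
The kernel distribution view in item 1 can be approximated without plotting: a Gaussian kernel density estimate of the per-sample log2 ratios at one locus, followed by a search for its modes, shows whether copy number states form separate peaks. The sketch below is an illustration under assumptions; the bandwidth and the grid are free parameters that would need tuning for real data.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function estimating the density of `values` at a point x."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

def local_maxima(density, lo, hi, steps=400):
    """Grid positions where the density is higher than at both neighbours."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [density(x) for x in xs]
    return [xs[i] for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1]]
```

Well separated peaks support binning into discrete CNV calls, whereas merged peaks suggest that the numeric values should be retained.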

     

3.2 Workflow for aCGH Data

At the present time only Agilent microarrays are available. The aCGH workflow is similar, although there are a few distinct features of the data, including 1 to 1 comparison, higher dynamic range and the potential to customize. Data from AD cohorts on aCGH is extremely limited and additional studies are needed.

3.2.1 Quality Control (QC)

The QC parameter for aCGH is MAPD, and values < 0.3 fulfill stringent criteria; even up to 0.35 yields good quality segmentation. Arrays with CNV calls more than two SD from the mean are eliminated. Data with gender mismatch (by X chromosome logR ratio) are eliminated.
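
A minimal sketch of the MAPD check, assuming MAPD is the median absolute difference between log2 ratios of genomically adjacent probes and that the input is already ordered by position. The grade labels are illustrative names for the thresholds given above.

```python
from statistics import median

def mapd(log2_ratios):
    """Median absolute pairwise difference of adjacent log2 ratios."""
    return median(abs(b - a) for a, b in zip(log2_ratios, log2_ratios[1:]))

def mapd_grade(value, stringent=0.3, acceptable=0.35):
    """Classify an array by MAPD using the thresholds quoted in the text."""
    if value < stringent:
        return "stringent"
    if value <= acceptable:
        return "acceptable"
    return "fail"
```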

3.2.2 Population Substructure/Admixture by the SNP Dataset

Agilent microarray design may incorporate SNPs; however, adding SNPs to the Agilent design reduces the density of the copy number probes and thus degrades resolution. For most of the AD sample collections at least one type of SNP GWAS data is available, and that can be used for population substructure/admixture analysis.

3.2.3 Log2 Ratio Calculation

Normalized log-ratio data is generated with the manufacturer’s microarray scanner and quantification software (CGH analytics, Agilent). The log2 ratio is calculated between the sample and an intraexperimental control sample, a 1 to 1 comparison.

3.2.4 Numeric Array Data or Segmentation

Similar to the SNP arrays, numeric, numeric segmented, and inferred CNV data can be used in downstream analyses.

  1. Numerical array data: Quantile normalized numeric data is used in the analysis as independent variable.

  2. Segmented numeric data: Numeric PC corrected data is segmented to identify probes where a CNV is detected in any of the samples in the set. The segmentation results in a reduced dataset while maintaining the advantages of the numeric data without binned CNVs.

  3. Inferred CNVs: The Agilent package CNV calling algorithm is used. The high dynamic range results in superior accuracy compared to SNP arrays, especially for multi copy gains. Due to the dynamic range and the uniformity of the data (mostly Agilent at this point, since NimbleGen stopped manufacturing aCGH), algorithm development is stable. The sensitivity and specificity data indicate that 70 % of CNV events detected with three probes and over 90 % of CNV events detected with five probes are validated by orthogonal methods. CNV calls are collapsed into regions and entered into the analysis as independent variable.

     

3.2.5 Test of Association

  1. Numerical array data as independent variable: This approach searches for genomically contiguous regions where CN state has an effect on case control status. To enhance the analysis a “thin and bin” approach is applied, as described in the SNP array workflow.

  2. Segmented numeric data as independent variable: Appropriate statistical methods including T-test, or various regression models are used depending upon the dependent variable.

  3. Inferred CNVs as independent variable: Appropriate statistical methods including T-test, or various regression models are used depending upon the dependent variable.

     

3.2.6 Visualization of Log2 Ratio Data

  1. Kernel distribution: These distribution plots delineate the separation of the various copy number states and assess the probability of the CNV events.

  2. Log2 ratio data as genomic location: These visualization strategies depict the size of the CNV, the signal to noise ratio, the number of probes covering the CNV and the consistency of adjacent log2 ratios, also assessing probability.

     

For all statistical methods multiple testing correction is applied. The Bonferroni correction appears too conservative; the FDR approach and simulation, performing at least 1,000 permutations, are reasonable alternatives. As the analysis is redundant across the three types of data, controlling for the multiple testing burden within each result set (numeric, numeric segmented and inferred CNVs) is arguably sufficient. Validation of the CNV calls with orthogonal methods is important to assess the locus specific genotyping error rate. Well-powered replication studies with a locus specific orthogonal method are necessary to replicate the association.
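
The permutation alternative mentioned above could look like the following sketch for a single bin or region. The test statistic (difference in mean log2 ratio between cases and controls) and the add-one correction are assumptions chosen for illustration; the function name and arguments are hypothetical.

```python
import random
from statistics import fmean

def permutation_pvalue(values, is_case, n_perm=1000, seed=0):
    """Two-sided permutation p value for the case-control difference in means."""
    def mean_diff(labels):
        cases = [v for v, c in zip(values, labels) if c]
        controls = [v for v, c in zip(values, labels) if not c]
        return fmean(cases) - fmean(controls)

    observed = abs(mean_diff(is_case))
    rng = random.Random(seed)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between status and copy number
        if abs(mean_diff(labels)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)
```

With 1,000 permutations the smallest attainable p value is about 0.001, so more permutations are needed when many regions are tested.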

4 Concluding Notes

For Alzheimer’s disease, only SNP array datasets are available currently, thus we discuss the caveats for these platforms and their analysis methods. The development of normalization and segmentation algorithms is a rapidly evolving field and vigilant monitoring is recommended. It is worthwhile to evaluate novel algorithms on a subset of data that has orthogonally validated events for a couple of regions. Size and density of probes enrich for true positives but reduce resolution. Deletions are easier to detect due to the difference of distance from 0 between log2 of 1/2 versus log2 of 3/2. Common CNV regions further reduce the dynamic range, as the calculated reference diploid genome is likely not diploid for that specific region. For common CNVs, binning into CNV calls results in a very high genotyping error rate (Fig. 2). For these regions the numeric data or the numeric segmented data is superior for power. All GWAS studies require replication on independent datasets with alternative methods.
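
The asymmetry between deletions and duplications can be made explicit with a short worked calculation. The first line uses a truly diploid reference; the second line is an illustration with an assumed reference mean copy number of r = 1.8, a value chosen only as an example of a locus with a common deletion.

```latex
\[
\log_2\frac{1}{2} = -1, \qquad
\log_2\frac{3}{2} \approx 0.585, \qquad
\left|\log_2\tfrac{1}{2}\right| - \left|\log_2\tfrac{3}{2}\right| \approx 0.415
\]
\[
\text{expected log2 ratio for copy number } c:\ \log_2\frac{c}{r}, \qquad
r = 1.8:\quad
\log_2\frac{1}{1.8} \approx -0.848, \quad
\log_2\frac{2}{1.8} \approx 0.152, \quad
\log_2\frac{3}{1.8} \approx 0.737
\]
```

Every state is shifted by the same amount, so the diploid cluster no longer sits at 0 and thresholds anchored at 0 misclassify samples, which is consistent with the high genotyping error rate described for common CNVs.
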
Fig. 2

Genotyping error rates for a frequent (10 %) variant, CHRFAM7A on the Affymetrix 6.0 and Illumina 610 arrays. CNVs were inferred based on the Kernel distributions of the segmented numeric data for the Affymetrix and the numeric data for the Illumina array as this latter dataset failed the segmentation algorithm. Breakpoint specific TaqMan assay was performed on the same samples to assess genotyping error rate. Rectangles represent concordant calls between the SNP array and the breakpoint specific TaqMan assay. The striking genotyping error rate for the Illumina 610 array further emphasizes the risks involved in assigning exact dosage; in these situations retaining the numeric values without binning implies a lower genotyping error rate, thus increases power

Notes

Acknowledgements

This work was supported by an Alzheimer Association New Investigator Research Grant to K.S.

References

  1. Kukull WA, Higdon R, Bowen JD et al (2002) Dementia and Alzheimer disease incidence: a prospective cohort study. Arch Neurol 59:1737–1746
  2. Gatz M, Pedersen NL, Berg S et al (1997) Heritability for Alzheimer’s disease: the study of dementia in Swedish twins. J Gerontol A Biol Sci Med Sci 52:M117–M125
  3. Naj AC, Jun G, Beecham GW et al (2011) Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nat Genet 43:436–441
  4. Feuk L, Carson AR, Scherer SW (2006) Structural variation in the human genome. Nat Rev Genet 7:85–97
  5. Redon R, Ishikawa S, Fitch KR et al (2006) Global variation in copy number in the human genome. Nature 444:444–454
  6. Stranger BE, Forrest MS, Dunning M et al (2007) Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315:848–853
  7. Rovelet-Lecrux A, Hannequin D, Raux G et al (2006) APP locus duplication causes autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy. Nat Genet 38:24–26
  8. Singleton AB, Farrer M, Johnson J et al (2003) alpha-Synuclein locus triplication causes Parkinson’s disease. Science 302:841
  9. Padiath QS, Saigoh K, Schiffmann R et al (2006) Lamin B1 duplications cause autosomal dominant leukodystrophy. Nat Genet 38:1114–1123
  10. Chapman J, Rees E, Harold D et al (2013) A genome-wide study shows a limited contribution of rare copy number variants to Alzheimer’s disease risk. Hum Mol Genet 22:816–824
  11. Swaminathan S, Huentelman MJ, Corneveaux JJ et al (2012) Analysis of copy number variation in Alzheimer’s disease in a cohort of clinically characterized and neuropathologically verified individuals. PLoS One 7:e50640
  12. Swaminathan S, Kim S, Shen L et al (2011) Genomic copy number analysis in Alzheimer’s disease and mild cognitive impairment: an ADNI study. Int J Alzheimers Dis 2011:729478
  13. Swaminathan S, Shen L, Kim S et al (2012) Analysis of copy number variation in Alzheimer’s disease: the NIALOAD/NCRAD family study. Curr Alzheimer Res 9:801–814
  14. Heinzen EL, Need AC, Hayden KM et al (2010) Genome-wide scan of copy number variation in late-onset Alzheimer’s disease. J Alzheimers Dis 19:69–77
  15. Ghani M, Pinto D, Lee JH et al (2012) Genome-wide survey of large rare copy number variants in Alzheimer’s disease among Caribbean Hispanics. G3 (Bethesda) 2:71–78
  16. Shaw CA, Li Y, Wiszniewska J et al (2011) Olfactory copy number association with age at onset of Alzheimer disease. Neurology 76:1302–1309
  17. Szigeti K, Lal D, Li Y et al (2013) Genome-wide scan for copy number variation association with age at onset of Alzheimer’s disease. J Alzheimers Dis 33:517–523
  18. Conrad DF, Keebler JE, DePristo MA et al (2011) Variation in genome-wide mutation rates within and between human families. Nat Genet 43:712–714
  19. McConnell MJ, Lindberg MR, Brennand KJ et al (2013) Mosaic copy number variation in human neurons. Science 342:632–637
  20. Pinto D, Darvishi K, Shi X et al (2011) Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants. Nat Biotechnol 29:512–520
  21. Harold D, Abraham R, Hollingworth P et al (2009) Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer’s disease. Nat Genet 41:1088–1093
  22. Seshadri S, Fitzpatrick AL, Ikram MA et al (2010) Genome-wide analysis of genetic loci associated with Alzheimer disease. JAMA 303:1832–1840
  23. Lambert JC, Heath S, Even G et al (2009) Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nat Genet 41:1094–1099
  24. Ku CS, Loy EY, Pawitan Y et al (2010) The pursuit of genome-wide association studies: where are we now? J Hum Genet 55:195–206
  25. Florez JC (2008) Clinical review: the genetics of type 2 diabetes: a realistic appraisal in 2008. J Clin Endocrinol Metab 93:4633–4642
  26. Nicolae DL, Gamazon E, Zhang W et al (2010) Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet 6:e1000888
  27. Carter NP (2007) Methods and strategies for analyzing copy number variation using DNA microarrays. Nat Genet 39(7 Suppl):S16–S21
  28. Scherer SW, Lee C, Birney E et al (2007) Challenges and standards in integrating surveys of structural variation. Nat Genet 39(7 Suppl):S7–S15
  29. Duan J, Zhang JG, Deng HW, Wang YP (2013) Comparative studies of copy number variation detection methods for next-generation sequencing technologies. PLoS One 8:e59128

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Neurology, School of Medicine and Biomedical Sciences, University at Buffalo SUNY, Buffalo, USA
