Transmission-ratio distortion in the Framingham Heart Study

Paterson, Andrew D; Waggott, Daryl; Schillert, Arne; Infante-Rivard, Claire; Bull, Shelley B; Yoo, Yun Joo; Pinnaduwage, Dushanthi

doi:10.1186/1753-6561-3-S7-S51

Proceedings
Open access
Published: 15 December 2009

Volume 3, article number S51, (2009)
BMC Proceedings

Andrew D Paterson^1,2, Daryl Waggott^3, Arne Schillert^4, Claire Infante-Rivard^5, Shelley B Bull^2,3, Yun Joo Yoo^3 & Dushanthi Pinnaduwage^3

Abstract

Transmission-ratio distortion (TRD) is a phenomenon in which the segregation of alleles does not obey Mendel's laws. As a simple example, a recessive locus that causes fetal lethality will lead to live-born individuals sharing more alleles at this locus than expected under Mendel's laws. This could result in apparent linkage of the phenotype of 'being alive' to such a chromosomal region. Further, this could result in false-positive linkage when 'affected-only' parametric or non-parametric linkage analysis is performed. Similarly, loci demonstrating TRD may be detectable in family-based association tests as deviant transmission of alleles. Therefore, TRD could result in confounding of family-based association studies of diseases. The Framingham Heart Study data available for Genetic Analysis Workshop 16 are a suitable dataset to determine whether there are loci in the genome that reveal TRD because of the large number of individuals from families, the high-resolution genotyping, and the population-based nature of the study. We have used both genome-wide linkage and family-based association methods to determine whether there are loci that demonstrate TRD in the Framingham Heart Study. Family-based association analysis identified thousands of loci with apparent TRD. However, the vast majority of these are likely the result of genotyping errors. With application of strict quality control criteria to the genotype data, and automated inspection of the intensity plots, we identify a small number of loci that may show true TRD, including rs1000548 in intron 6 of S-antigen (arrestin, SAG) on chromosome 2 (p = 7 × 10^-10).

Background

A critical assumption for the majority of genetic mapping approaches (including both linkage and family-based association) is that Mendel's law of segregation is obeyed. Transmission-ratio distortion (TRD) refers to the deviation from the expected Mendelian inheritance of alleles. Violation of this assumption could result in false-positive linkage, particularly within 'affected-only' or 'non-parametric' linkage analysis frameworks. Furthermore, within a family-based association design, the presence of TRD could produce spurious association if transmissions are assessed only to affected, and not to unaffected, offspring. In addition, it is feasible that the presence of TRD could also reduce the power to detect true disease loci. The presence of TRD in humans has been addressed in only a few studies, using either linkage [1] or family-based association methods [2]. However, these studies had limited sample sizes, which may have resulted in low power. This limitation has recently been emphasized, when it was shown that hundreds or thousands of trios would be needed to detect loci even with large TRD deviations [3].
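To give a feel for why sample size matters, the sketch below uses a textbook normal approximation for a one-sample binomial test to estimate how many informative transmissions (transmissions from heterozygous parents) would be needed to detect a given transmission probability. This is an illustration only and is not the calculation reported in [3]; the function name, the default significance level, and the default power are assumptions made for the example.

```python
from statistics import NormalDist


def transmissions_needed(t: float, alpha: float = 1e-8, power: float = 0.8) -> float:
    """Approximate number of informative transmissions needed to detect
    a transmission probability t (different from 0.5) with a two-sided
    test at level alpha, using the normal approximation to the binomial.

    Assumption: every transmission is an independent Bernoulli trial.
    """
    if not 0.0 < t < 1.0 or t == 0.5:
        raise ValueError("t must lie in (0, 1) and differ from 0.5")
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # critical value under H0: t = 0.5
    z_beta = z.inv_cdf(power)               # quantile for the desired power
    numerator = z_alpha * 0.5 + z_beta * (t * (1.0 - t)) ** 0.5
    return (numerator / (t - 0.5)) ** 2


# Hypothetical usage: a 55:45 distortion at a genome-wide threshold.
n_required = transmissions_needed(0.55, alpha=1e-8, power=0.8)
```

The number of trios required is larger than the number of informative transmissions, because only heterozygous parents contribute; the gap widens as the minor allele frequency falls.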

For a variety of reasons, including that of statistical power, the majority of genome-wide association studies have used a case-control design, which is not able to detect loci that are subject to TRD. However, some studies are employing a family-based design, but it is typical for them to study only affected offspring, and they are thus susceptible to identifying loci that demonstrate TRD and falsely concluding that they are associated with the disease of interest. Unless unaffected sibs are genotyped, one cannot determine whether association signals are the result of confounding by TRD. Therefore, we took advantage of the large sample size, pedigree-based design and genome-wide genotyping of the Framingham Heart Study Problem 2 data from Genetic Analysis Workshop 16 (GAW16) to determine whether we could identify loci demonstrating TRD.

Methods

Subject and genotype data

We used data from Affymetrix 500 k and 50 k single-nucleotide polymorphism (SNP) datasets from Problem 2 of GAW16, the Framingham Heart Study. Genotype data were called by the data providers using BRLMM [4], but no details were provided about how samples were batched for genotype calling. Data providers removed relationship errors and sample mix-ups but not any remaining Mendelian errors.

Linkage analysis

All genotyped individuals in the last generation were coded as 'affected' and we used non-parametric linkage approaches (Cox and Kong non-parametric linkage (NPL)) to determine whether there are regions in the genome linked to the phenotype of 'being alive in the last generation' (Merlin v 1.1.2) [5]. We dealt with linkage disequilibrium among the ~500 k SNPs by selecting a subset of SNPs based on: minor allele frequency (MAF) > 45%, Hardy-Weinberg equilibrium (HWE) p-value > 0.05, individual genotype missing rate < 5%, SNP missing rate < 2%, pairwise r² < 0.05, and Mendelian error rate < 5%. Individuals from Cohort 1 were not used in the analysis; large pedigrees were therefore split into smaller pedigrees using the R kinship package (makefamid function [6]) to allow the computation of NPL statistics.
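The per-SNP part of this selection can be sketched as follows. The code computes the MAF and a textbook one-degree-of-freedom chi-square HWE test from genotype counts, then applies the thresholds quoted above. It is a minimal illustration under stated assumptions: the function names are hypothetical, the chi-square test may differ from the HWE test actually used for the selection, and the pairwise r² pruning and the per-individual missing rate are not covered because they are not properties of a single SNP.

```python
import math
from typing import Tuple


def maf_and_hwe(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (minor allele frequency, HWE chi-square p-value with 1 df)
    from genotype counts at a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    maf = min(p, q)
    if maf == 0.0:
        return 0.0, 1.0  # monomorphic SNP: HWE cannot be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return maf, p_value


def passes_linkage_qc(maf: float, hwe_p: float,
                      snp_missing_rate: float, mendel_error_rate: float) -> bool:
    """Apply the per-SNP thresholds quoted in the text for the linkage panel."""
    return (maf > 0.45 and hwe_p > 0.05
            and snp_missing_rate < 0.02 and mendel_error_rate < 0.05)
```

In pedigree data, genotype counts for an HWE test are usually taken from unrelated individuals such as founders; whether that was done here is not stated.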

Family-based association analysis

We also performed family-based association tests (i.e., the transmission-disequilibrium test, or TDT) to examine the transmissions of alleles for all SNPs across the genome to all genotyped individuals in the dataset using PLINK v1.02 [7, 8] with the Affymetrix 500 k and HuGeneFocused 50 k SNP genotype data. SNPs were initially selected to have MAF>1%, call rates >90%, and HWE p > 10^-5.
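For reference, the basic TDT statistic for one allele at a biallelic SNP depends only on how often heterozygous parents transmitted that allele (b) and how often they did not (c): chi-square = (b - c)² / (b + c), with one degree of freedom. The sketch below computes it directly; the function name is hypothetical, and the code is an illustration of the statistic rather than a reproduction of the software run used in this study.

```python
import math
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float, float]:
    """Basic TDT for one allele.

    transmitted / untransmitted: number of times heterozygous parents did
    or did not pass the allele to a genotyped offspring.
    Returns (chi-square statistic, p-value with 1 df, transmission ratio).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    return chi2, p_value, transmitted / n
```

Under Mendelian segregation the transmission ratio is expected to be 0.5, so a ratio far from 0.5 with a small p-value is what is reported here as apparent TRD.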

Results

Linkage analysis

Genome-wide linkage analyses used ~5 k SNPs from 1,028 pedigrees that were informative. There were no loci that met genome-wide criteria for significant linkage.

Family-based association analysis

Genome-wide TDT analysis was performed and identified 2,722 autosomal SNPs with TDT p < 10^-8, which was an unexpectedly large number. However, when we investigated this further, we suspected that the majority of these results were false positives due to genotyping error. It has been reported previously that, in the presence of certain common types of genotyping error, there is a bias to excess transmission of the major as opposed to minor allele for SNPs [9]. Indeed, in these data there was a striking bias in the transmission rates based on whether the major or minor allele showed excess transmission. Specifically, there were 2,701 SNPs with TDT p < 10^-8, HWE p > 10^-5, and MAF > 1% in which the major allele showed excess transmission. This compared to only 21 SNPs using the same criteria in which the minor allele showed excess transmission.

To confirm our suspicions that genotyping error was the major cause of the large number of positive results, we took advantage of the fact that it is more difficult to detect Mendelian errors for SNPs with lower MAF [10]. This would lead us to expect that low-allele-frequency SNPs would be disproportionately represented in those SNPs that demonstrate excess transmission of the major allele compared to those where the minor allele showed excess transmission. Consistent with this expectation, when we compared the MAF as a function of the transmission of the major or minor allele for these 2,722 SNPs, the MAF was significantly lower for those SNPs where the major allele showed over-transmission (3.8 ± 4.4%, mean ± SD) compared with those where the minor allele was over-transmitted (33% ± 12%, p < 0.0001).
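The comparison described in the last two paragraphs amounts to splitting the significant SNPs by the direction of the distortion and summarizing the MAF in each group. A minimal sketch is given below; the record type, field names, and example values are hypothetical, and no significance test is included because the text does not say which test produced the reported p-value.

```python
from statistics import mean, stdev
from typing import Dict, Iterable, List, NamedTuple


class SnpResult(NamedTuple):
    snp_id: str
    maf: float
    minor_transmitted: int    # minor allele passed on by heterozygous parents
    minor_untransmitted: int  # minor allele not passed on


def summarize_by_direction(results: Iterable[SnpResult]) -> Dict[str, Dict[str, float]]:
    """Group SNPs by which allele is over-transmitted and summarize MAF."""
    groups: Dict[str, List[float]] = {"major_over": [], "minor_over": []}
    for r in results:
        if r.minor_transmitted > r.minor_untransmitted:
            groups["minor_over"].append(r.maf)
        elif r.minor_transmitted < r.minor_untransmitted:
            groups["major_over"].append(r.maf)
        # Equal counts show no distortion and are left out of both groups.
    summary: Dict[str, Dict[str, float]] = {}
    for name, mafs in groups.items():
        summary[name] = {
            "n": len(mafs),
            "mean_maf": mean(mafs) if mafs else float("nan"),
            "sd_maf": stdev(mafs) if len(mafs) > 1 else float("nan"),
        }
    return summary


# Hypothetical records, for illustration only.
example = [
    SnpResult("snp_a", 0.02, 10, 60),
    SnpResult("snp_b", 0.04, 15, 70),
    SnpResult("snp_c", 0.30, 200, 120),
]
example_summary = summarize_by_direction(example)
```

At a biallelic SNP, under-transmission of the minor allele is the same event as over-transmission of the major allele, so one pair of counts per SNP is enough to assign the direction.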

Visual inspection of the cluster-plots of thousands of SNPs is labor intensive, so we next investigated whether we could use automated methods to help distinguish which SNPs had good quality genotype calls. We then applied a less stringent criterion for TRD (i.e., p < 10^-5), and for these 4,501 SNPs we ran automated cluster plot analysis (ACPA) [11]. We limited this analysis to SNPs with MAF > 0.01, missing rate < 0.02, and HWE p-value > 10^-4. At a genome-wide significance criterion of p < 10^-8, only one SNP was predicted by the ACPA procedure to have good quality genotype clustering, rs1000548 (TDT p = 7.4 × 10^-10; Figure 1). Details about this SNP, and about other SNPs that were also predicted using ACPA to have good quality genotype clustering under a more relaxed significance criterion (TDT p < 10^-5), are provided in Table 1. For these 8 SNPs, there was no significant heterogeneity between the paternal and maternal transmission rates (p > 0.08).
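One simple way to check for heterogeneity between paternal and maternal transmission rates is a Pearson chi-square test on the 2 × 2 table of parent (father, mother) by outcome (allele transmitted, allele not transmitted). The sketch below shows that test. It is an assumption made for illustration: the text does not state which heterogeneity test was used, and the function name is hypothetical.

```python
import math
from typing import Tuple


def parent_of_origin_heterogeneity(pat_t: int, pat_u: int,
                                   mat_t: int, mat_u: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 df) that paternal and maternal
    transmission rates of an allele are equal.

    pat_t / pat_u: transmitted / untransmitted counts from heterozygous fathers.
    mat_t / mat_u: the same counts from heterozygous mothers.
    Returns (chi-square statistic, p-value).
    """
    n = pat_t + pat_u + mat_t + mat_u
    row_pat, row_mat = pat_t + pat_u, mat_t + mat_u
    col_t, col_u = pat_t + mat_t, pat_u + mat_u
    denom = row_pat * row_mat * col_t * col_u
    if denom == 0:
        raise ValueError("a row or column of the 2 x 2 table is empty")
    chi2 = n * (pat_t * mat_u - pat_u * mat_t) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival, 1 df
    return chi2, p_value
```

When both parents are heterozygous and the offspring is also heterozygous, the parental origin of each allele is ambiguous; how such trios are counted is a design choice this sketch leaves to the caller.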

Table 1 SNPs showing TRD (TDT p < 10^-5) with genotype clustering passing ACPA

Discussion and conclusion

The results of the TDT analyses performed here highlight the problems of using high-throughput genotype data that contain even a small proportion of genotyping errors to detect phenomena such as TRD. The gross over-transmission of the common allele for SNPs with a pattern consistent with TRD, and the marked allele frequency difference between them and the SNPs where the minor allele shows excess transmission, are consistent with genotyping error being the major force behind the unexpectedly large number of apparent positive results. Further contributing to the bias described by Mitchell et al. [9], in which genotyping errors are more difficult to detect for SNPs with low MAF, is the concern that the genotype error rate for rarer SNPs may be higher due to batch-calling of genotypes. These concerns make it challenging to distinguish true effects from artifact. Alternative genotype calling algorithms, which call genotypes from all or larger sets of samples at once, or even across multiple studies, have been shown to improve the quality of genotype calling, e.g., CHIAMO [12]. In addition, this work has implications for the implementation of imputation strategies for ungenotyped SNPs (which are common in genome-wide association studies). Because we found that >1% of SNPs in this dataset likely have poor quality genotype calling even after applying conventional quality control criteria, ungenotyped SNPs that are imputed from SNPs with genotyping errors are likely to be subject to considerable error.

In addition to the complexities that have arisen in the interpretation of our analysis, there is concern that the use of HWE as a criterion to filter SNPs for the analysis of TRD is a double-edged sword. Some SNPs showing true TRD may also deviate significantly from HWE because the assumption of no selection is violated, and may end up being removed from datasets in an attempt to remove genotyping errors. Similarly, automatic exclusion of SNPs with low MAF may bias against the detection of true TRD loci, because negative selection is likely to keep the MAF of SNPs that show TRD low. Another caveat of this study is that in each of the eight regions with evidence for TRD (Table 1), only one SNP shows that evidence. Given the general selection of SNPs on the Affymetrix 500 k chip, we would expect that in some regions there would be other SNPs with similar TRD results, so this makes us cautious about over-interpretation of these results.
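The HWE caveat can be made concrete with the recessive fetal lethality example from the abstract. The sketch below assumes a deliberately simple model that is not taken from the analysis in this paper: random mating, a biallelic locus with alleles A and a, and survival probability 1 - s for aa zygotes. Under that model it returns the genotype frequencies among live-born offspring, the excess of heterozygotes relative to HWE, and the transmission ratio of allele a in Aa × Aa matings. All function names are hypothetical.

```python
from typing import Tuple


def live_born_genotype_freqs(q: float, s: float) -> Tuple[float, float, float]:
    """Frequencies of (AA, Aa, aa) among live-born offspring of random
    mating when allele a has gamete frequency q and aa zygotes survive
    with probability 1 - s."""
    if not (0.0 < q < 1.0 and 0.0 <= s <= 1.0):
        raise ValueError("need 0 < q < 1 and 0 <= s <= 1")
    p = 1.0 - q
    weights = (p * p, 2.0 * p * q, q * q * (1.0 - s))
    total = sum(weights)
    return weights[0] / total, weights[1] / total, weights[2] / total


def heterozygote_excess(q: float, s: float) -> float:
    """Observed heterozygote frequency in live-born offspring divided by
    the HWE expectation computed from their own allele frequencies."""
    _, f_het, f_hom_a = live_born_genotype_freqs(q, s)
    q_live = f_hom_a + f_het / 2.0
    return f_het / (2.0 * q_live * (1.0 - q_live))


def het_by_het_transmission(s: float) -> float:
    """Share of transmissions carrying allele a among live-born offspring
    of Aa x Aa matings. Surviving offspring occur in proportions
    1/4 AA, 1/2 Aa, (1 - s)/4 aa, and each receives two transmissions."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("need 0 <= s <= 1")
    return (1.0 - s / 2.0) / (2.0 - s / 2.0)
```

With s = 0 the model reduces to Mendelian segregation, with a ratio of 0.5 and no heterozygote excess. With complete lethality (s = 1) the ratio in Aa × Aa matings falls to 1/3 and heterozygotes are over-represented relative to HWE, which is the situation in which an HWE filter could discard a true TRD locus.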

There are some interesting genes near the SNPs in Table 1 that show TRD. For example, rs3786228 is in intron 4 of CTDP1 (carboxy-terminal domain, RNA polymerase II, polypeptide A phosphatase, subunit 1) on chromosome 18; autosomal recessive mutations in this gene have been shown to result in 'congenital cataracts facial dysmorphism neuropathy' (CCFDN), a developmental disorder prevalent in Roma Gypsies [13]. Similarly, autosomal recessive mutations in SAG (S-antigen, arrestin) have been found in Oguchi disease, a rare form of night blindness [14]. In this study we observed marked TRD of rs1000548, in intron 6 of SAG. It may be that in populations similar to Framingham, variation in these genes contributes to phenotypes that can result in TRD, including the failure of fertilization or implantation, or the differential survival of fetuses. Identifying loci that demonstrate TRD could provide insight into the mechanisms of these processes.

Abbreviations

ACPA: Automated cluster plot analysis
GAW: Genetic Analysis Workshop
HWE: Hardy-Weinberg equilibrium
MAF: Minor allele frequency
NPL: Non-parametric linkage
TDT: Transmission-disequilibrium test
TRD: Transmission-ratio distortion
SNP: Single-nucleotide polymorphism

References

1. Zöllner S, Wen X, Hanchard NA, Herbert MA, Ober C, Pritchard JK: Evidence for extensive transmission distortion in the human genome. Am J Hum Genet. 2004, 74: 62-72. 10.1086/381131.
2. The International HapMap Consortium: A haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.
3. Evans DM, Morris AP, Cardon LR, Sham PC: A note on the power to detect transmission distortion in parent-child trios via the transmission disequilibrium test. Behav Genet. 2006, 36: 947-950. 10.1007/s10519-006-9087-2.
4. BRLMM: an Improved Genotype Calling Method for the GeneChip® Human Mapping 500 K Array Set, Revision Date: 2006-04-14, Revision Version: 1.0. [http://www.affymetrix.com/support/technical/whitepapers/brlmm_whitepaper.pdf]
5. Abecasis GR, Cherny SS, Cookson WO, Cardon LR: Merlin-rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet. 2002, 30: 97-101. 10.1038/ng786.
6. Atkinson B, Therneau T: Kinship: mixed-effects Cox models, sparse matrices, and modeling data from large pedigrees. R package version 1.1.0-21. 2008, [http://www.r-project.org]
7. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, Sham PC: PLINK: a toolset for whole-genome association and population-based linkage analysis. Am J Hum Genet. 2007, 81: 559-575. 10.1086/519795.
8. PLINK. [http://pngu.mgh.harvard.edu/purcell/plink/]
9. Mitchell AA, Cutler DJ, Chakravarti A: Undetected genotyping errors cause apparent overtransmission of common alleles in the transmission/disequilibrium test. Am J Hum Genet. 2003, 72: 598-610. 10.1086/368203.
10. Douglas JA, Skol AD, Boehnke M: Probability of detection of genotyping errors and mutations as inheritance inconsistencies in nuclear-family data. Am J Hum Genet. 2002, 70: 487-495. 10.1086/338919.
11. Schillert A, Schwarz DF, Vens M, Szymczak S, König IR, Ziegler A: ACPA: automated cluster plot analysis of genotype data. BMC Proc. 2009, 3 (suppl 7): S58. 10.1186/1753-6561-3-S7-S58.
12. Wellcome Trust Case Control Consortium: Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007, 447: 661-678. 10.1038/nature05911.
13. Varon R, Gooding R, Steglich C, Marns L, Tang H, Angelicheva D, Yong KK, Ambrugger P, Reinhold A, Morar B, Baas F, Kwa M, Tournev I, Guerguelcheva V, Kremensky I, Lochmüller H, Müllner-Eidenböck A, Merlini L, Neumann L, Bürger J, Walter M, Swoboda K, Thomas PK, von Moers A, Risch N, Kalaydjieva L: Partial deficiency of the C-terminal-domain phosphatase of RNA polymerase II is associated with congenital cataracts facial dysmorphism neuropathy syndrome. Nat Genet. 2003, 35: 185-189. 10.1038/ng1243.
14. Fuchs S, Nakazawa M, Maw M, Tamai M, Oguchi Y, Gal A: A homozygous 1-base pair deletion in the arrestin gene is a frequent cause of Oguchi disease in Japanese. Nat Genet. 1995, 10: 360-362. 10.1038/ng0795-360.

Acknowledgements

The Genetic Analysis Workshops are supported by NIH grant R01 GM031575 from the National Institute of General Medical Sciences. ADP holds a Canada Research Chair in Genetics of Complex Diseases and is supported by Genome Canada through Ontario Genomics Institute, and NIH.

This article has been published as part of BMC Proceedings Volume 3 Supplement 7, 2009: Genetic Analysis Workshop 16. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/3?issue=S7.

Author information

Authors and Affiliations

Program in Genetics and Genome Biology, Hospital for Sick Children, 101 College Street, TMDT East Tower, Toronto, ON, M5G 1X8, Canada
Andrew D Paterson
Dalla Lana School of Public Health, University of Toronto, 155 College Street, Toronto, Ontario, M5T 3M7, Canada
Andrew D Paterson & Shelley B Bull
Samuel Lunenfeld Research Institute of Mount Sinai Hospital, Prosserman Centre for Health Research, 60 Murray Street, Toronto, ON, M5T 3L9, Canada
Daryl Waggott, Shelley B Bull, Yun Joo Yoo & Dushanthi Pinnaduwage
Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Maria-Goeppert Str. 1, 23562, Lübeck, Germany
Arne Schillert
Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine, McGill University, 1110 Pine Avenue West, Montréal, Québec, H3A 1A3, Canada
Claire Infante-Rivard

Corresponding author

Correspondence to Andrew D Paterson.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

ADP, CI-R and SBB conceived of the idea. DW, DP, and YJY performed the linkage and association analysis. AS ran the ACPA analysis. ADP wrote a draft of the manuscript, which all authors edited. All authors read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Paterson, A.D., Waggott, D., Schillert, A. et al. Transmission-ratio distortion in the Framingham Heart Study. BMC Proc 3 (Suppl 7), S51 (2009). https://doi.org/10.1186/1753-6561-3-S7-S51
