Human Genetics, Volume 118, Issue 5, pp 626–639

Admixture-matched case-control study: a practical approach for genetic association studies in admixed populations

Authors

  • H. Tsai
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Jennifer Y. Kho
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Nishat Shaikh
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Shweta Choudhry
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Mariam Naqvi
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Daniel Navarro
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Henry Matallana
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Richard Castro
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
  • Craig M. Lilly
    • Brigham and Women’s Hospital
  • H. George Watson
    • The James A. Watson Wellness Center
  • Kelley Meade
    • Children’s Hospital and Research Institute
  • Michael LeNoir
    • Bay Area Pediatrics
  • Shannon Thyne
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
  • Elad Ziv
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Center for Human Genetics, Study of African American, Asthma, Genes and Environments (SAGE), University of California
  • Esteban González Burchard
    • Department of Medicine, Study of African American, Asthma, Genes and Environments (SAGE), University of California
    • Lung Biology Center, Study of African American, Asthma, Genes and Environments (SAGE), San Francisco General Hospital
    • Center for Human Genetics, Study of African American, Asthma, Genes and Environments (SAGE), University of California
Original Investigation

DOI: 10.1007/s00439-005-0080-2

Cite this article as:
Tsai, H., Kho, J.Y., Shaikh, N. et al. Hum Genet (2006) 118: 626. doi:10.1007/s00439-005-0080-2

Abstract

Case-control genetic association studies in admixed populations are known to be susceptible to genetic confounding due to population stratification. The transmission/disequilibrium test (TDT) approach can avoid this problem. However, the TDT is expensive and impractical for late-onset diseases. Case-control study designs in which cases and controls are matched by admixture can be an appealing and suitable alternative for genetic association studies in admixed populations. In this study, we applied this matching strategy when recruiting our African American participants in the Study of African American, Asthma, Genes and Environments. Group admixture in this cohort consists of 83% African ancestry and 17% European ancestry, which is consistent with reports from other studies. By carrying out several complementary analyses, we show that there is substructure in the cohort, but that the admixture distributions are almost identical in cases and controls, and also within cases. We performed association tests for asthma-related traits with ancestry, and found that only FEV1, a measure of baseline pulmonary function, was associated with ancestry after adjusting for socio-economic and environmental risk factors (P=0.01). We did not observe an excess type I error rate in our association tests for ancestry informative markers and asthma-related phenotypes when ancestry was not adjusted for in the analyses. Furthermore, using as an example the association tests between genetic variants in a known asthma candidate gene, β2 adrenergic receptor (β2AR), and ΔFEF25–75, an asthma-related phenotype, we demonstrated that population stratification was not a confounder in our genetic association. Our present work demonstrates that admixture-matched case-control strategies can efficiently control population stratification confounding in admixed populations.

Introduction

Population stratification is a potential confounding factor of case-control genetic association studies in admixed populations, such as African Americans (Cardon and Bell 2001; Cardon and Palmer 2003). Population stratification occurs when there are different allele frequencies between cases and controls due to heterogeneity in ancestry, which is unrelated to disease affectation status. Ignoring population stratification in association tests may lead to a potential excess of both false positive and false negative results (Burchard et al. 2003b; Lander and Schork 1994; Ziv and Burchard 2003).

The history of African Americans is notable for the admixture between Africans, Europeans, and Native Americans (Parra et al. 1998). The first attempts to estimate admixture proportions in African Americans were made in the 1950s (Glass and Li 1953). Since then, this field has been underdeveloped due to the limited availability of ancestry informative markers (AIMs) and data from ancestral populations. Recent studies have provided fruitful results on AIM discovery and the development of methodologies for estimating individual and group admixture (Akey et al. 2002; Pfaff et al. 2001; Shriver et al. 2005, 2003).

Although the transmission/disequilibrium test (TDT) approach is robust against population stratification and has been proposed for finding susceptibility genes in complex traits (Allison 1997; Spielman et al. 1993), the TDT approach is often expensive and impractical for late-onset disorders. One promising solution to control population stratification is to match cases and controls carefully based on their genetic background. Well-matched case-control designs may avoid the confounding effect due to population stratification (Wacholder et al. 2002; Zondervan et al. 2002). To control population stratification confounding, a previous study reported several analytical strategies for matching cases and controls as part of association tests in admixed populations (Hinds et al. 2004). In addition, several groups have proposed detecting and controlling population stratification confounding in case-control association tests by using two powerful approaches: (1) identifying and including ancestry in the analysis; (2) using genomic control to adjust for potential existing population stratification (Bacanu et al. 2000; Devlin et al. 2001; Freedman et al. 2004; Hoggart et al. 2003; Parra et al. 2004; Pritchard and Donnelly 2001).

Significant worldwide variations in asthma prevalence have been reported by the International Study of Asthma and Allergies in Childhood (ISAAC) and the European Community Respiratory Health Survey (ECRHS) (1998; Pearce et al. 2000). In the US, it is well known that there are racial and ethnic differences in asthma prevalence, morbidity, and mortality. Specifically, asthma prevalence and mortality among African Americans is greater than among European Americans (Akinbami et al. 2005; Akinbami and Schoendorf 2002; Mannino et al. 2002). It is important to investigate the genetic, environmental, and socio-economic factors, which may lead to the racial and ethnic variations.

The β2 adrenergic receptor (β2AR) is one of the candidate genes most consistently identified as being associated with asthma-related phenotypes (Choudhry et al. 2005; Evans et al. 2001; Holloway et al. 2000; Silverman et al. 2003). The Gly16 polymorphism has been associated with asthma severity and lower bronchodilator responsiveness, while the Arg16 allele has been shown to be associated with increased bronchodilator responsiveness (Martinez et al. 1997). Possibly because of the difficulty of controlling for population stratification, the effect of the β2AR genetic variants on asthma-related phenotypes among African American asthmatics remains unclear.

In this study, we have recruited African American subjects participating in the Study of African American, Asthma, Genes and Environments (SAGE) through well-matched case-control strategies. We have detected the population substructure and recent admixture. We have also evaluated the group admixture and individual admixture using two programs—ADMIXMAP and Structure 2.1. We have examined the relationship between asthma-related phenotypes and ancestry. Moreover, we have demonstrated that there is no evidence of confounding due to population stratification in our genetic association tests of asthma-related traits with AIMs, and with the β2AR genetic variants. Our results have indicated that the inflation of type I error rate in association tests can be efficiently controlled in an admixture-matched case-control study of asthma in African Americans.

Subjects and methods

Study participants

One hundred and seventy-six African American asthmatics were recruited from three clinics as part of the ongoing Study of African American, Asthma, Genes and Environments (SAGE). One clinic is the San Francisco General Hospital, and the remaining two clinics are located less than 2 miles away from each other in Oakland, California. Eligible cases were between the ages of 8 and 40 years, had physician-diagnosed asthma, and had experienced two or more asthma symptoms (wheezing, coughing, and/or shortness of breath) in the previous 2 years. We recruited 176 matched controls whose ages were between 8 and 40. Controls were eligible only if they reported no history of asthma or allergies, no history or report of having experienced symptoms of coughing, wheezing or shortness of breath in the past 2 years, no other history of lung diseases or chronic illness or medications, less than 10-pack-per-year smoking history, and no smoking in the last year. Subjects were enrolled into the study only if they self-identified as African American, and both biological parents and all biological grandparents were identified as African American.

Phenotype measurement

Asthma is characterized by recurrent episodes of wheeze, cough, and airway obstruction. Airway obstruction is an indicator of asthma severity and can be measured using spirometry. Standard measurements of the severity of the airway obstruction are FEV1, FEV1/FVC and FEF25–75, all expressed as a percentage of normal predicted values. The lower the value, the more severe the airway obstruction. Airway obstruction is reversible with the inhalation of medications such as albuterol, the most commonly prescribed asthma medication in the world. The reversibility of airway obstruction is a measure of drug responsiveness. Reversibility can be measured by performing spirometry before and after the administration of albuterol and measuring the difference (ΔFEV1, ΔFEV1/FVC, and ΔFEF25–75).

Asthmatic subjects were instructed to withhold their bronchodilator medications for at least 8 h before lung function tests. Spirometry was performed according to the American Thoracic Society standards (1995). Pulmonary function test results are expressed as a percentage of the predicted normal value using age-adjusted prediction equations from Hankinson (Hankinson et al. 1999). Baseline pulmonary function results are reported as pre-FEV1, pre-FEV1/FVC, and pre-FEF25–75. Albuterol was administered using an extension tube connected to a standard metered dose inhaler (180 μg or two puffs for subjects <16 years old and 360 μg or four puffs for subjects ≥16 years old). Fifteen minutes after albuterol administration, FEV1, FEV1/FVC, and FEF25–75 were measured again. Bronchodilator drug responsiveness to albuterol is reported as a percent change in FEV1, FEV1/FVC, and FEF25–75 between baseline and after albuterol administration (expressed herein as ΔFEV1, ΔFEV1/FVC, and ΔFEF25–75, respectively).

Quantitative measures of asthma severity were defined as pre-FEV1, pre-FEV1/FVC, and pre-FEF25–75. Qualitative measures of asthma severity were classified as “mild” or “moderate–severe” asthma based on four “yes/no” questions related to medication use, asthma symptoms, nocturnal awakenings, and pre-FEV1 (Burchard et al. 2003a). Total plasma IgE, a measure for determining atopic asthmatic cases, was collected in duplicate for asthmatic subjects using Uni-Cap technology (Pharmacia, Kalamazoo, MI, USA).

Selection of ancestry informative markers (AIMs)

We selected these 31 AIM SNP variants based on their informativeness of ancestry with a large difference of allele frequencies (δ) between Native American, African and European ancestral populations (Bonilla et al. 2004; Parra et al. 1998). For dimorphic variants, δ=|p1 − p2|, where p1 and p2 are defined as the allele frequencies in ancestral populations 1 and 2, respectively. The allele frequencies among these three ancestral populations were obtained by genotyping individuals of the following populations: Irish, English, German and Spanish (Europeans, N=243); Nigerian, Central African Republic and Sierra Leone (Africans, N=481); and Mayan, Pima, Cheyenne and Pueblo (Native Americans, N=148). Detailed information on these 31 AIMs regarding chromosomal location, allele frequencies among different ancestral populations, and difference of allele frequencies between different ancestral populations is provided in Supplementary Table S1. Flanking sequence and other relevant information for these 31 AIMs can be obtained from the dbSNP website (http://www.ncbi.nlm.nih.gov/SNP/) and were also described elsewhere (Choudhry et al. 2005, in press).
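As an illustration of the δ measure defined above, the following minimal sketch computes the absolute allele-frequency difference for one marker. The function name and the example frequencies are hypothetical and are not taken from Supplementary Table S1.

```python
def delta(p1: float, p2: float) -> float:
    """Absolute allele-frequency difference between two ancestral populations."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    return abs(p1 - p2)


# Hypothetical frequencies of the same allele in two ancestral populations.
p_population_1, p_population_2 = 0.90, 0.10
print(round(delta(p_population_1, p_population_2), 2))
```

A marker with δ close to 1 is nearly fixed for different alleles in the two populations and is therefore highly informative for ancestry.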

Genotyping

All 31 AIMs and two β2AR SNP variants (SNP-468 in the promoter region and SNP+46 [Arg/Gly 16] within the β2AR coding region) were genotyped using the AcycloPrime-FP™ (PerkinElmer) method (Chen and Kwok 1999). The PCR conditions were as follows: 2.4–4.0 ng genomic DNA, 0.1–0.2 μM primers, 50 μM dNTPs, 0.1–0.2 U Platinum Taq (Invitrogen), 6 μl volume with Platinum Taq PCR buffer, 2.5 mM MgCl2 plus 1 μl extra water to counteract evaporation. Cycling conditions were: 95°C for 2 min, 35 cycles of 92°C for 10 s, 58°C for 20 s, 68°C for 30 s, and final extension at 68°C for 10 min. Enzymatic cleanup and single base extension genotyping reactions were performed with AcycloPrime-FP kits. Plates were read on an EnVision fluorescence polarization plate reader (PerkinElmer) for genotyping calls.

Statistical analyses

Allele frequencies and Hardy–Weinberg equilibrium

Allele frequencies of each AIM were computed by using genotype data of all individuals, cases and controls, separately. We tested whether there was a significant difference between cases and controls by χ2 test. We tested whether AIMs and β2AR SNPs in our SAGE cohort (N=352) were under Hardy–Weinberg equilibrium (HWE) by using the exact Hardy–Weinberg test, which calculates the probability of the exact number of heterozygotes conditional on the copies of the minor SNP allele. This test has been implemented in the PEDSTATS program (Abecasis et al. 2000). For each AIM, we calculated FST between Africans and Europeans, a measure of ancestral informativeness, as \( \delta^{2} / (\bar{p}(1 - \bar{p})), \) where \( \delta^{2} \) denotes the variance and \( \bar{p} \) the mean of the allele frequency (Wright 1969).

Recent admixture

We examined the presence of recent admixture using pair-wise combinations of 30 AIMs in all individuals, cases and controls, respectively. For each marker pair, we first estimated haplotype frequencies using the expectation maximization algorithm (EM) (Excoffier and Slatkin 1995) and computed a likelihood ratio statistic to test the strength of linkage disequilibrium based on the observed genotype data. We then permuted genotype data and computed the same likelihood ratio statistic for 10,000 permutations. Ranks were assigned for the observed and permuted likelihood ratio statistics. The sum of the ranks across all combinations in observed data was compared to the null distribution of the rank sums from 10,000 permutations. This is a global test for evaluating excess linkage disequilibrium across the genome. This statistical approach and original R code were kindly provided by Dr. Hua Tang.
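The global rank-sum procedure described above can be sketched as follows. This is a simplified illustration, not the original R code: it swaps the EM-based likelihood ratio statistic for the squared correlation between genotype counts (coded 0/1/2), and it assumes that permutation means shuffling each marker independently across individuals. All function names are hypothetical.

```python
import random
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker carries no LD information
    return sxy * sxy / (sxx * syy)


def global_ld_test(genotypes, n_perm=1000, seed=0):
    """Empirical P value for excess LD across all marker pairs.

    genotypes: list of markers, each a list of genotype counts per individual.
    """
    rng = random.Random(seed)
    pairs = list(combinations(range(len(genotypes)), 2))
    # Row 0 holds the observed statistics; rows 1..n_perm hold permuted ones.
    stats = [[r_squared(genotypes[i], genotypes[j]) for i, j in pairs]]
    for _ in range(n_perm):
        shuffled = [rng.sample(marker, len(marker)) for marker in genotypes]
        stats.append([r_squared(shuffled[i], shuffled[j]) for i, j in pairs])
    # Rank each pair's statistic across all datasets, then sum ranks per dataset.
    # Ties are broken by dataset order, so the observed data (row 0) receives
    # the lowest rank among tied values, which is conservative.
    rank_sums = [0] * (n_perm + 1)
    for k in range(len(pairs)):
        order = sorted(range(n_perm + 1), key=lambda b: stats[b][k])
        for rank, b in enumerate(order, start=1):
            rank_sums[b] += rank
    exceed = sum(1 for s in rank_sums[1:] if s >= rank_sums[0])
    return (exceed + 1) / (n_perm + 1)
```

With 30 markers the loop covers 435 pairs per dataset, so a pure-Python run with 10,000 permutations would be slow; the sketch is meant to show the logic rather than to be used at that scale.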

Population stratification

To detect population stratification, we fit clustering models with K=1, 2, and 3 clusters, where K is the number of substructures, by using the Structure 2.1 program (Falush et al. 2003; Pritchard et al. 2000). We obtained the likelihoods for different K through the MCMC algorithm implemented in the Structure 2.1 program. We then selected the most likely K according to the maximum likelihood from the outputs.
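The selection rule described above amounts to picking the K with the largest log likelihood. A minimal sketch, using the values that Table 2 reports for all subjects combined (the function name is hypothetical):

```python
def most_likely_k(log_likelihood_by_k):
    """Return the K whose reported log likelihood is largest."""
    return max(log_likelihood_by_k, key=log_likelihood_by_k.get)


# Log likelihoods for all subjects combined (N=352), as reported in Table 2.
all_subjects = {1: -27332, 2: -20276, 3: -22314}
print(most_likely_k(all_subjects))  # prints 2
```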

Group admixture and individual admixture

The ADMIX 2.0 program based on a coalescent approach was used for estimating group admixture (Bertorelle and Excoffier 1998; Dupanloup and Bertorelle 2001). Admixture proportions are estimated based on the genotype frequencies of the AIMs and their level of divergence—number of generations. Standard deviation of group admixture estimates is calculated according to 10,000 bootstraps.

In order to make sure that we obtained proper estimates of individual admixture, we computed individual admixture estimates (IAEs) using the ADMIXMAP and Structure 2.1 programs, respectively. We then assessed the consistency of IAEs obtained from both programs. A combination of Bayesian and classical approaches has been implemented in the program ADMIXMAP (Hoggart et al. 2003). We input AIMs and trait data from the admixed population and AIMs data from ancestral populations to calculate IAEs by ADMIXMAP with 1,000 burn-in and 20,000 further iterations.

The admixture model implemented in the program Structure 2.1 assumes that each individual inherits some proportion of their ancestry from each ancestral population (Falush et al. 2003; Pritchard et al. 2000). To compute IAEs, we input genotype data from each ancestral population, specified as known populations, and admixed subjects, specified as an unknown population, assumed an admixture model and used default values for other parameters by Structure 2.1 with 50,000 burn-in and 50,000 further iterations.

The detailed methodologies of these two programs and their differences were described elsewhere (Tsai et al. 2005, in press).

Admixture-matched evaluation

We first obtained individual admixture for all subjects from ADMIXMAP and Structure 2.1. To compare the distribution of admixture background between cases and controls, we generated quantile–quantile (Q–Q) plots and performed the Wilcoxon rank sum test to examine whether the admixture distribution in cases was similar to the distribution in controls. To evaluate the admixture distribution in cases only, we carried out 10,000 permutations. We first randomly assigned cases into two groups, then compared the distribution between these two groups by the Wilcoxon rank sum test, recorded P value for each permutation and calculated the empirical P value for 10,000 permutations based on whether the P value for each permutation was less than 0.05.
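The within-cases evaluation can be sketched as below. This is an illustrative reimplementation under stated assumptions: the Wilcoxon rank sum P value uses the normal approximation without a tie correction, and the function returns the fraction of random splits with P below 0.05, since the text does not spell out how that fraction is converted into the reported empirical P value. Function names are hypothetical.

```python
import math
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank sum P value (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return math.erfc(abs(w - mean) / sd / math.sqrt(2))


def split_rejection_rate(values, n_perm=10000, alpha=0.05, seed=0):
    """Fraction of random half-splits whose rank sum P value is below alpha."""
    rng = random.Random(seed)
    half = len(values) // 2
    rejections = 0
    for _ in range(n_perm):
        shuffled = rng.sample(values, len(values))
        if rank_sum_p(shuffled[:half], shuffled[half:]) < alpha:
            rejections += 1
    return rejections / n_perm
```

If admixture is homogeneous within cases, roughly 5% of random splits are expected to fall below 0.05, and the complement of that fraction is close to 0.95.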

Tests of association and evaluation of type I error rate

We first applied the regression analyses for association tests for asthma-related traits and ancestry as defined by IAEs (individual African and European ancestry estimates). We only incorporated African ancestry estimates in regression analyses to avoid co-linearity. We also performed regression models to test for association between the asthma disease status and AIMs under the additive genetic model assumption. For asthmatics, we applied regression models to test for association between the asthma-related traits and AIMs. We assessed the normality of quantitative asthma-related phenotypes (asthma severity as defined by: pre-FEV1, pre-FEV1/FVC, and pre-FEF25–75). Since drug response traits—ΔFEV1, ΔFEV1/FVC, ΔFEF25–75, and IgE were not normally distributed, we took logarithm transformation of these traits in our regression models.

To evaluate the inflation of the type I error rate, we first tested the association between asthma-related phenotypes and AIMs with or without including covariates: age, gender, socio-economic status (SES), asthma duration, regular use of asthma medication, and body mass index (BMI) in the models. We then performed association tests with adjustment for the same covariates and IAEs, specifically, individual African ancestry estimates. We used a P value less than 0.05 as the significance level and recorded the number of positives from regression analyses based on this corresponding threshold.
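A rough yardstick for judging the recorded counts of positives is the number expected by chance. The sketch below assumes 30 markers tested at the 0.05 level and, as a simplifying assumption not made in the text, treats the tests as independent so that the count of false positives is binomial. Function names are hypothetical.

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests with P < alpha when every null is true."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float = 0.05) -> float:
    """P(X >= k) for X ~ Binomial(n_tests, alpha); assumes independent tests."""
    return sum(comb(n_tests, i) * alpha ** i * (1 - alpha) ** (n_tests - i)
               for i in range(k, n_tests + 1))


print(expected_false_positives(30))  # 1.5 markers expected by chance
print(prob_at_least(3, 30))          # chance of 3 or more positives under the null
```

Under these assumptions, seeing two or three nominally significant markers out of 30 is not unusual, which is the sense in which a count near 1.5 indicates no inflation.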

One way to detect and control population stratification is incorporating ancestry as defined by IAEs in the analysis and examining the results obtained from the models with and without adjusting ancestry. To demonstrate that population stratification did not confound the genetic association in our cohort, we applied linear regression analysis to test the association between two β2AR SNP variants and ΔFEF25–75, an asthma-related trait with and without including IAEs in the models. Data analyses were carried out using statistical packages R 1.9.0 and STATA 8.0 S/E (College Station, TX, USA).
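The comparison of models with and without ancestry adjustment can be illustrated with a small ordinary least squares sketch. It only returns point estimates of the genotype coefficient (no t scores or P values), it is not the R or STATA code used in the study, and all names and data below are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(columns, y):
    """Least squares fit of y on an intercept plus the given predictor columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)]
           for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(k)]
    return solve(xtx, xty)  # [intercept, coefficient 1, coefficient 2, ...]


def genotype_effect(genotype, ancestry, trait):
    """Genotype coefficient without and with adjustment for ancestry."""
    unadjusted = ols([genotype], trait)[1]
    adjusted = ols([genotype, ancestry], trait)[1]
    return unadjusted, adjusted


# Hypothetical data: additive genotype coding and African ancestry proportions.
genotype = [0, 1, 2, 0, 1, 2, 0, 1]
ancestry = [0.90, 0.80, 0.70, 0.85, 0.60, 0.75, 0.95, 0.65]
trait = [2.0 * g + 1.0 for g in genotype]
print(genotype_effect(genotype, ancestry, trait))
```

If the genotype coefficient changes little once ancestry enters the model, ancestry is unlikely to be confounding that association; a large shift would point to stratification.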

Results

Demographic, clinical and AIMs characteristics

We have recruited 176 African American asthmatic cases and 176 matched controls in the SAGE study (Table 1). We carried out a χ2 test to examine whether or not there was a difference in the SES among the subjects recruited from different clinics. Based on our result, SES did not differ significantly among clinic sites (P=0.26). However, there was a significant difference in age between cases and controls (P<0.001). Hence, we included age as a covariate in all the analyses. We genotyped 31 AIMs. One out of the 31 AIMs, rs2816, deviated from Hardy–Weinberg equilibrium in the SAGE cohort (N=352; Supplementary Table S1). Therefore, we excluded this marker from the analyses. The results based on χ2 tests indicated that there was no difference in allele frequencies of the 30 remaining AIMs between cases and controls. The average FST of these 30 AIMs, a measure of ancestral informativeness, between African and European populations was 0.35.
Table 1

Demographic and clinical characteristics in SAGE subjects

|  | Controls (n=176) | Cases (n=176) |
|---|---|---|
| Demographic features |  |  |
| Age (year) | 31.3 (24.5:36.1)^a | 19.4 (13:31) |
| Gender (% male) | 38.07% | 35.22% |
| BMI (kg/m2) |  | 25.8 (22:32) |
| Socio-economic status |  |  |
| Low | 44.38% | 33.73% |
| Moderate | 31.36% | 40.36% |
| Middle | 20.71% | 19.28% |
| Upper | 3.55% | 6.62% |
| Asthma onset (year) |  | 5 (2:9) |
| Asthma duration (year) |  | 12 (7:18) |
| Asthma severity (% severe) |  | 65.91% |
| Serum IgE (IU/ml) |  | 164 (47.5:417) |
| Baseline spirometry |  |  |
| FEV1 (% predicted) |  | 93.15 (84.44:101.69) |
| FEV1/FVC (% predicted) |  | 93.79 (85.38:98.86) |
| FEF25–75 (% predicted) |  | 76.28 (59.33:95.65) |
| Bronchodilator responsiveness |  |  |
| FEV1 reversibility (%) |  | 7.45 (4.03:14.04) |
| FEV1/FVC reversibility (%) |  | 4.04 (1.23:8.15) |
| FEF25–75 reversibility (%) |  | 22.31 (11.22:39.44) |

^a Values expressed as median (25th:75th percentile)

Recent admixture, population substructure, and group admixture

To examine the presence of a recent admixture, we applied a global test for evaluating excess genome-wide linkage disequilibrium (LD) by comparing the rank scores of the combination of marker pairs from the observed AIMs data and 10,000 permutations. There were 62 (14.3%), 32 (7.4%), and 57 (13.1%) of the 435 marker pairs with significant excess of LD in all subjects, cases and controls, respectively. The results from global rank tests showed significantly higher LD inflation than expected under the null in all subjects, cases and controls, respectively (all three P values <0.001). The Q–Q plot in Fig. 1 showed that the global observed LD was higher than the null distribution. These results demonstrated the presence of the recent admixture in African Americans.
Fig. 1

Quantile–quantile plot for comparing global LD patterns calculated from observed AIMs data in all African American subjects to the expected null distribution

We applied Structure 2.1 to assess the presence of population substructures within cases, within controls, and within all subjects combined. The results in Table 2 indicated that our African American subjects most likely descended from two ancestral populations, instead of one or three ancestral populations. We also applied ADMIX to estimate group admixture of the cohort, assuming that the cohort descended from either two or three ancestral populations. The admixture proportions based on three ancestral populations (Africans, Europeans, and Native Americans) were 83.2±1, 16.5±1, and 0.3±2%, respectively. The admixture proportions based on two ancestral populations (Africans and Europeans) were 83.3±0.8 and 16.7±0.8%, respectively. The concordant results from Structure 2.1 and ADMIX suggested that our cohort was derived from two ancestral populations (Africans and Europeans).
Table 2

Number of population substructures within the SAGE cohort estimated by Structure 2.1

| No. of population substructures | Log likelihood: all subjects combined (N=352) | Log likelihood: asthmatic cases (N=176) | Log likelihood: healthy controls (N=176) |
|---|---|---|---|
| 1 | −27,332 | −21,494 | −21,468 |
| 2 | −20,276^a | −14,610^a | −14,570^a |
| 3 | −22,314 | −16,765 | −16,691 |

^a Number of substructures with the highest log likelihood

Admixture background in cases and controls

We calculated IAEs by using ADMIXMAP and Structure 2.1, separately. The IAEs obtained from ADMIXMAP were highly correlated with the IAEs computed from Structure 2.1 in all subjects (correlation coefficients ρ=0.99). We observed similar results when evaluating IAEs in asthmatic cases and controls, separately.

In addition, we examined the distributions of admixture proportions between cases and controls by Q–Q plots (Fig. 2), and carried out the Wilcoxon rank sum test for comparing the admixture distributions between cases and controls. The results indicated that there was no difference in the distributions of admixture proportions between African American asthmatic cases and controls (P=0.49 and 0.48 for IAEs computed by ADMIXMAP and Structure 2.1, respectively). We also checked the admixture distribution within cases by carrying out 10,000 permutations (details provided in the ‘Subjects and methods’). The results showed that there was no difference in the admixture distribution within cases (P=0.95 and 0.94 for IAEs calculated by ADMIXMAP and Structure 2.1, respectively).
Fig. 2

Quantile–quantile plots of the distribution of admixture proportions in African American cases and controls

Admixture estimates using different priors

A restriction with respect to studying genetic association in admixed populations is collecting genotyping data of subjects from appropriate ancestral populations. We examined how the ADMIXMAP and Structure 2.1 programs performed by including either only the prior data from one ancestral population, or including no prior data from the ancestral populations. We then compared the admixture estimates with those obtained by using all prior information from both ancestral populations. The results in Fig. 3 showed that admixture estimates obtained by using all priors and only using the data from African subjects were highly concordant when using ADMIXMAP to estimate admixture. In contrast, when using Structure 2.1, a high correlation of admixture estimates was observed by using all priors and only using data from European subjects (Fig. 4). Both programs provided poor admixture estimates when using no priors.
Fig. 3

Estimates of African proportion based on different prior information using ADMIXMAP

Fig. 4

Estimates of African proportion based on different prior information using Structure 2.1

Association tests of asthma-related traits with ancestry, and with AIMs

We tested the association of asthma-related traits—affection status, severity (pre-FEV1, pre-FEV1/FVC, pre-FEF25–75), drug response (ΔFEV1, ΔFEV1/FVC, ΔFEF25–75), and IgE with ancestry after adjusting for covariates—age, gender, SES, asthma duration, regular use of asthma medication, and body mass index (BMI). Of note, since individual African and European ancestral proportions were summed to one, we only included individual African proportions as a covariate to account for ancestral information in the analyses. The results in Table 3 showed that a significant association was only observed between pre-FEV1 and ancestry (P<0.01). Figure 5 shows that individuals with more African background had lower pre-FEV1 values. We also examined the association between asthma-related phenotypes and 30 AIMs with adjustment of admixture background and covariates. We observed only a slight inflation of type I error in the association tests between pre-FEV1 and 30 AIMs (Table 4).
Table 3

Association of asthma-related traits with ancestry in the SAGE subjects

| Trait | ADMIXMAP^a coefficient^b | ADMIXMAP^a P value | Structure 2.1^a coefficient | Structure 2.1^a P value |
|---|---|---|---|---|
| Disease status | −0.303 | 0.818 | −0.160 | 0.856 |
| Asthma severity | −0.671 | 0.750 | −0.515 | 0.716 |
| Pre-FEV1 | 39.322 | 0.008 | 25.980 | 0.009 |
| Pre-FEV1/FVC | −0.817 | 0.929 | −0.041 | 0.995 |
| Pre-FEF25–75 | −18.596 | 0.442 | −11.935 | 0.461 |
| ΔFEV1 | 0.472 | 0.226 | 0.298 | 0.254 |
| ΔFEV1/FVC | −0.278 | 0.521 | −0.175 | 0.542 |
| ΔFEF25–75 | 0.301 | 0.455 | 0.196 | 0.466 |
| IgE | 0.196 | 0.753 | 0.131 | 0.752 |

^a Estimates of ancestry, individual ancestral estimates (IAEs), are obtained either by ADMIXMAP or Structure 2.1

^b Covariates included in the analyses are age, gender, SES, asthma duration, regular use of asthma medication, and BMI

Fig. 5

Relationship between African ancestry and pre- FEV1 in African American asthmatics

Table 4

Association of asthma-related traits with 30 ancestry informative markers in the SAGE subjects

                  Count of positives (a) (P value <0.05) (b)
Trait             No adjustment   Covariates, no IAEs (c)   Covariates and IAEs
Disease status    0               0                         0
Asthma severity   1               0                         1
Pre-FEV1          3               3                         2
Pre-FEV1/FVC      1               1                         1
Pre-FEF25–75      1               2                         2
ΔFEV1             2               2                         2
ΔFEV1/FVC         1               0                         0
ΔFEF25–75         0               0                         0
IgE               2               2                         2

(a) Under 5% type I error rate (P=0.05), counts of expected false positives are 1.5 markers

(b) Analyses were carried out by using multiple regression analysis

(c) Covariates included in the analyses are age, gender, SES, asthma duration, regular use of asthma medication, and BMI
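The expected count of 1.5 false positives in footnote (a) is 30 markers times a 5% error rate. As a rough guide to how surprising a count of 3 would be, the sketch below computes a binomial tail probability. It assumes the 30 tests are independent, which is a simplification introduced here and not a claim made in the paper; the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha)."""
    below_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below_k


n_markers = 30
alpha = 0.05

# Expected number of nominally significant markers when no marker is associated.
expected_false_positives = n_markers * alpha

# Chance of seeing 3 or more positives purely by chance (independence assumed).
p_three_or_more = prob_at_least(3, n_markers, alpha)

print(expected_false_positives, round(p_three_or_more, 3))
```

Under these assumptions, three positives out of 30 occur by chance in a bit under one in five replicates, which is in line with the paper's description of the inflation for pre-FEV1 as slight.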

Genetic association tests of an asthma-related trait with two β2AR SNPs

We have performed comprehensive analyses of the association between the β2AR gene and asthma-related traits in our ongoing genetic study (complete results will be presented elsewhere). Here, we present the results of association tests between one asthma-related phenotype, ΔFEF25–75, and two β2AR SNP variants, SNP-468 and SNP+46 [Arg/Gly 16], to examine whether IAEs, our estimates of ancestral background, were a confounder in our cohort. The results for both SNPs were essentially unchanged by ancestry adjustment (Table 5). The significant association between SNP-468 and ΔFEF25–75 did not remain after adjusting for the other covariates (Table 5). These results indicate that population stratification did not confound our genetic association tests of variants in β2AR, a well-recognized asthma candidate gene.
Table 5

Genetic association of ΔFEF25–75 with β2AR SNP variants in African American asthmatics

             No adjustment            Covariates, no IAEs (b)   Covariates and IAEs
β2AR SNP     t scores (a)   P value   t scores   P value        t scores   P value
−468 C/G     2.3            0.023     1.87       0.063          1.91       0.059
+46 A/G      2.04           0.043     2.2        0.03           2.23       0.027

(a) Analyses were carried out by using regression analysis

(b) Covariates included in the analyses are age, gender, SES, asthma duration, regular use of asthma medication and BMI
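Table 5 checks for confounding by comparing each SNP's test result with and without the IAE in the model. The sketch below illustrates that comparison for a single covariate. It relies on a standard property of least squares (the Frisch-Waugh-Lovell result): the ancestry-adjusted SNP coefficient equals the slope obtained after removing the linear effect of ancestry from both genotype and trait. The function names and the synthetic data are assumptions for illustration, and the other covariates in the paper's models are omitted.

```python
import random


def slope_and_residuals(x, y):
    """Least-squares slope of y on x (with intercept) and the residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    residuals = [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]
    return slope, residuals


def adjusted_slope(genotype, trait, ancestry):
    """Genotype coefficient from regressing trait on genotype and ancestry."""
    _, genotype_res = slope_and_residuals(ancestry, genotype)
    _, trait_res = slope_and_residuals(ancestry, trait)
    slope, _ = slope_and_residuals(genotype_res, trait_res)
    return slope


# Synthetic stratified example (not study data): allele frequency tracks
# ancestry, and the trait depends on ancestry only, so the SNP has no effect.
random.seed(1)
ancestry = [random.uniform(0.5, 1.0) for _ in range(300)]
genotype = [sum(random.random() < a for _ in range(2)) for a in ancestry]
trait = [50.0 * a + random.gauss(0, 5) for a in ancestry]

crude, _ = slope_and_residuals(genotype, trait)
adjusted = adjusted_slope(genotype, trait, ancestry)
print(f"crude={crude:.2f}, adjusted={adjusted:.2f}")
```

If the crude and adjusted estimates are close, as they are for both SNPs in Table 5, ancestry is not acting as an appreciable confounder for that marker.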

Discussion

Our results support the notion that a well-matched case-control study design is a feasible way to overcome population stratification confounding when initiating genetic association studies in admixed populations (Cardon and Palmer 2003). To match admixture background, we recruited self-identified African American cases and controls from three clinics, two of which are in the same census tract. As expected, this targeted recruitment minimized differences in the degree of admixture. Our results show that the subjects are highly similar in genetic background and SES, and that the small remaining differences in admixture do not confound the genetic association tests. An admixture-matched case-control design among African Americans can therefore avoid inflation of the type I error rate in genetic association tests, and our recruitment strategies achieved the goal of matching cases and controls on admixture background and SES.

In a well-designed case-control study, the source population from which cases are ascertained should be the one from which controls are also ascertained (Schlesselman and Stolley 1982). Our strategy for matching admixture is to recruit cases and controls on the basis of geographic proximity, self-reported ancestry, and similar SES background. This admixture-matched study design is less expensive, is practical for late age-of-onset diseases, and can minimize the confounding effect of population stratification. Our findings provide evidence that it is feasible to control population stratification confounding at the study-design stage.

According to our previous work, three main factors affect the accuracy of admixture estimates: the number of markers, the informativeness of the markers, and the number of ancestral subjects. Of these, the most important is the number of AIMs (Tsai et al. 2005, in press). Although we used only 30 AIMs to obtain admixture estimates, the group admixture estimates from our African American cohort agree with previous reports (Hoggart et al. 2003; Parra et al. 1998; Reiner et al. 2005; Shriver et al. 2003). The European admixture proportion in our cohort is approximately 20%, which is consistent with the European admixture proportion in northern or western African American populations from other studies. In addition, we applied two different programs, ADMIXMAP and Structure 2.1, to estimate individual admixture proportions. Admixture estimates from the two programs showed a very high degree of correlation. Although the admixture estimates here cannot be perfectly accurate, they should be highly correlated with the underlying individual admixture proportions. Detailed information is available elsewhere (Tsai et al. 2005, in press).

It has been reported that genetic background from different ancestral populations may be associated with SES (Burchard et al. 2003b). SES is considered an important indicator related to all-cause mortality within and across different racial groups (Lin et al. 2003). We examined the interaction between asthma-related phenotypes, SES, and ancestry in our cohort, but did not find any association. One possible explanation is that admixture proportion and SES were well matched between our African American cases and controls.

We observed an association of pre-FEV1 with ancestry (Table 3 and Fig. 5): among asthmatics, higher African ancestry proportions were associated with more severe asthma as defined by lower pre-FEV1 values. Because pre-FEV1 is measured at least 8 h after the use of inhaled beta-agonists, it is presumably an acceptable index of asthma severity. The NHLBI guidelines currently use pre-FEV1 as an objective measurement in grading asthma severity. Pre-FEV1 has been validated as a measure of airway obstruction, as it closely correlates with pathologic scores of airway diameter (Hogg et al. 1968). Decreased pre-FEV1 has been shown to be associated with the risk of future attacks and with response to therapy among children with asthma (Enright et al. 1994; Fuhlbrigge et al. 2001). A previous study of Latino Americans suggested that ancestry may influence asthma severity in Mexican Americans (Salari et al. 2005). Although no single measure accurately captures all facets of asthma severity, pre-FEV1 percent predicted has several advantages as a marker of severity, including its objectivity and reproducibility (Enright et al. 1991, 1994; Kitch et al. 2004).

Previous studies based on U.S. vital statistics collected from the Third National Health and Nutrition Examination Survey (NHANES III) have reported that African Americans have a higher prevalence of asthma than European Americans (Rodriguez et al. 2002; Romieu et al. 2004). We observed a minor excess of false positives in the association tests of pre-FEV1 with AIMs before adjusting for ancestry (Table 4). The type I error rate returned to the expected level after including ancestry in the models. It will be important to determine whether the association between ancestry and asthma severity is reproducible in African American asthmatics across the United States. It will also be important to explore interactions of ancestry with SES and/or environmental factors.

A known limitation of genetic association studies in admixed populations is the difficulty of recruiting subjects from appropriate ancestral populations. Recent studies have shown that the principal component approach may be an appealing alternative for handling population stratification confounding, especially when investigators have no data from ancestral populations (Zhang et al. 2003). However, a significant limitation of this approach is how it handles missing data. Because many study participants commonly lack complete genotype information for all markers, the power of the principal component approach may be limited. We approached this issue by incorporating only partial prior information from ancestral populations into ADMIXMAP and Structure 2.1. The results in Fig. 3 indicated that ADMIXMAP provided similar admixture estimates when using all priors and when using only data from African subjects. In contrast, Structure 2.1 gave comparable estimates when using all priors and when using only data from European subjects (Fig. 4). The difference was likely due to the two programs weighting ancestral information differently when inferring admixture proportions. In future work, we will assess the difference in performance between ADMIXMAP and Structure 2.1 through realistic simulations. In addition, both programs provided poor admixture estimates when no priors from ancestral populations were used. If investigators do not have genotype data from proper ancestral populations, we recommend that they use the 'genome-control' approach to adjust for population stratification confounding, instead of including poor estimates in the model.
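The genome-control recommendation can be illustrated with a short sketch. The paper does not give an implementation, so this follows the commonly used median-based form of genomic control: an inflation factor λ is estimated as the median of 1-df chi-square statistics at presumed-null markers divided by the median of a 1-df chi-square distribution (about 0.455), and the candidate statistic is divided by λ. The function name and all statistic values are hypothetical.

```python
from statistics import median

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549


def genomic_control(null_chi2, candidate_chi2):
    """Return (lambda, corrected statistic) using median-based genomic control."""
    # Lambda is floored at 1 so the correction never inflates the statistic.
    lam = max(1.0, median(null_chi2) / CHI2_1DF_MEDIAN)
    return lam, candidate_chi2 / lam


# Invented 1-df chi-square statistics at presumed-null markers (e.g. AIMs).
null_marker_chi2 = [0.2, 0.5, 0.9, 1.4, 2.1, 0.7, 1.1]

lam, corrected = genomic_control(null_marker_chi2, candidate_chi2=5.2)
print(f"lambda={lam:.2f}, corrected chi-square={corrected:.2f}")
```

With only a handful of null markers the estimate of λ is itself noisy, so in practice a larger marker panel would be preferable.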

Population stratification arises from nonrandom mating, which allows marker allele frequencies to vary among segments of the population as a result of genetic drift or founder effects (Slatkin 1991). As a consequence, a disease with high prevalence in one subpopulation will also be associated with any alleles that are at high frequency in that subpopulation. Since we detected two subgroups in the cohort, it was of interest to explore whether 'group membership' was correlated with asthma disease status or asthma-related traits. We grouped the subjects into two clusters based on their IAEs by using k-means cluster analysis (data not shown). We then checked the correlation between group membership and disease status, and between group membership and asthma-related traits. Correlation coefficients were below 0.1 in absolute value for disease status and asthma-related phenotypes, except for pre-FEV1 (ρ=0.25). Taken together, the substructure observed in our cohort was not appreciably correlated with asthma disease status or with asthma-related traits other than pre-FEV1.
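The clustering step described above can be sketched as follows. This is a minimal version under stated assumptions: a two-cluster k-means on a single scalar (the African IAE), initialized at the minimum and maximum values, followed by a Pearson correlation between the 0/1 membership labels and a trait. The function names and all data values are hypothetical, and the software and settings actually used in the study are not specified in the text.

```python
import math


def two_means_1d(values, n_iter=100):
    """Two-cluster k-means (Lloyd's algorithm) on scalars; returns 0/1 labels."""
    c0, c1 = min(values), max(values)
    labels = []
    for _ in range(n_iter):
        new_labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            c0 = sum(group0) / len(group0)
        if group1:
            c1 = sum(group1) / len(group1)
    return labels


def pearson_r(x, y):
    """Pearson correlation; undefined if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented IAEs and pre-FEV1 values for six subjects.
iae = [0.55, 0.60, 0.62, 0.85, 0.90, 0.95]
pre_fev1 = [92.0, 75.0, 88.0, 80.0, 85.0, 70.0]

membership = two_means_1d(iae)
r = pearson_r(membership, pre_fev1)
print(membership, round(r, 2))
```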

With our admixture-matched study design, we did not observe inflation of type I error in our association tests, even without adjustment for ancestry (Table 4). To demonstrate that population stratification can be a confounding factor if investigators do not match admixture background during recruitment, we deliberately created a mismatched subset from our 352 SAGE subjects. For this subset, we selected the 100 asthmatic cases with the highest African ancestral proportions and the 100 healthy controls with the lowest African ancestral proportions. We then performed association tests of disease status with the 30 AIMs in this subset. The results in Table S2 showed an excess of false positives when ancestral information was not adjusted for in the analysis. These results reinforce that our recruitment scheme, which matches admixture background at the study-design stage, can efficiently control population stratification confounding in admixed populations.
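The construction of the deliberately mismatched subset can be sketched as a simple selection step. The record layout, the function name, and the synthetic cohort below are assumptions for illustration; only the rule itself (the 100 cases with the highest and the 100 controls with the lowest African ancestry) comes from the text.

```python
import random


def mismatched_subset(subjects, n_each=100):
    """Pick the n_each cases with the highest and the n_each controls with
    the lowest African ancestry proportion.

    subjects: list of (subject_id, is_case, african_proportion) tuples.
    """
    cases = sorted(
        (s for s in subjects if s[1]), key=lambda s: s[2], reverse=True
    )
    controls = sorted((s for s in subjects if not s[1]), key=lambda s: s[2])
    return cases[:n_each], controls[:n_each]


# Synthetic cohort of 352 subjects; the even case/control split and the
# ancestry range are invented, not taken from the SAGE data.
random.seed(2)
cohort = [(i, i % 2 == 0, random.uniform(0.4, 1.0)) for i in range(352)]

cases, controls = mismatched_subset(cohort)
mean_case = sum(s[2] for s in cases) / len(cases)
mean_control = sum(s[2] for s in controls) / len(controls)
print(f"mean African ancestry: cases={mean_case:.2f}, controls={mean_control:.2f}")
```

Selecting in this way forces a large ancestry difference between cases and controls, which is the condition under which ancestry informative markers would be expected to show spurious association with disease status.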

To adjust for potential confounding due to population stratification in the analysis, we applied a two-step approach: we first estimated IAEs using ADMIXMAP or Structure 2.1, and then included these estimates as a covariate in a conventional regression model. The ADMIXMAP program also provides a one-step approach, in which inference of admixture proportions, regression modeling, and testing for association are combined in a single model. We compared the two-step and one-step approaches in comprehensive simulation scenarios described elsewhere (Tsai et al. 2005, in press). That work showed that the most important factor in determining the accuracy of IAEs and in minimizing the type I error rate was the number of AIMs used to estimate ancestry. For both the one-step and two-step approaches, after accounting for precise ancestry information in the association tests, the type I error rate was controlled at the 5% level when 100 AIMs were used to calculate IAEs.

In summary, our present study demonstrates that an admixture-matched case-control study design is capable of controlling for confounding due to population stratification in admixed populations. Our results indicate that recruiting admixed subjects in a targeted way, such as from the same clinic or from very nearby geographic locations, can minimize differences in the degree of admixture. The minimal differences in admixture in our SAGE cohort do not confound the genetic association tests. The genetic background of our cohort is similar to that previously reported for northern and western African Americans. Ancestry is likely to be associated with asthma severity. We do not observe an excess of false positives in our genetic association tests, and population stratification does not confound the association tests of β2AR SNPs and asthma in our cohort. Our work supports the admixture-matched case-control study design as a promising strategy for studying genetic association in admixed populations.

Electronic-Database Information

The URLs for data presented herein are as follows:

dbSNP website, National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/SNP/

ADMIX web site, Center of Integrative Genomics, University of Lausanne, http://web.unife.it/progetti/genetica/Isabelle/admix2_0.html

ADMIXMAP website, Conway Institute of Biomolecular and Biomedical Research, http://www.ucd.ie/conway/cv1_324.html

Structure 2.1 website, Division of Biological Sciences, University of Chicago, http://pritch.bsd.uchicago.edu/

Acknowledgements

The support for this manuscript came from: National Institutes of Health K23 HL04464, HL07185, GM61390, NCMHD Health Disparities Scholar, Extramural Clinical Research Loan Repayment Program for Individuals from Disadvantaged Backgrounds, 2001–2003, American Lung Association of California and The National Center on Minority Health and Health Disparities to EGB, American Lung Association of California Research Training Fellowship to HJT, Sandler Center for Basic Research in Asthma and the Sandler Family Supporting Foundation. We would like to acknowledge the families and the patients for their participation. We would also like to thank the numerous health care providers for their support and participation in the SAGE study. We thank Dr. Mark D. Shriver for his assistance in development of the AIMs and for providing ancestral DNA. We thank Dr. Hua Tang for providing the R code implementing a global test for estimating recent admixture. We would like to thank Dr. Neil Risch for his support and guidance. Finally, we would like to thank the Sandler Family Foundation, the main sponsor of this project.

Supplementary material

Copyright information

© Springer-Verlag 2005