Human Genetics, Volume 131, Issue 11, pp 1699–1708

Serum vitamins A and E as modifiers of lipid trait genetics in the National Health and Nutrition Examination Surveys as part of the Population Architecture using Genomics and Epidemiology (PAGE) study

Authors

  • Logan Dumitrescu
    • Center for Human Genetics Research, Vanderbilt University
  • Robert Goodloe
    • Center for Human Genetics Research, Vanderbilt University
  • Kristin Brown-Gentry
    • Center for Human Genetics Research, Vanderbilt University
  • Ping Mayo
    • Center for Human Genetics Research, Vanderbilt University
  • Melissa Allen
    • Center for Human Genetics Research, Vanderbilt University
  • Hailing Jin
    • Center for Human Genetics Research, Vanderbilt University
  • Niloufar B. Gillani
    • Center for Human Genetics Research, Vanderbilt University
  • Nathalie Schnetz-Boutaud
    • Center for Human Genetics Research, Vanderbilt University
  • Holli H. Dilks
    • Center for Human Genetics Research, Vanderbilt University
    • Department of Molecular Physiology and Biophysics, Vanderbilt University
    • Center for Human Genetics Research, Vanderbilt University
    • Department of Molecular Physiology and Biophysics, Vanderbilt University
Original Investigation

DOI: 10.1007/s00439-012-1186-y

Cite this article as:
Dumitrescu, L., Goodloe, R., Brown-Gentry, K. et al. Hum Genet (2012) 131: 1699. doi:10.1007/s00439-012-1186-y

Abstract

Both environmental and genetic factors impact lipid traits. Environmental modifiers of known genotype–phenotype associations may account for some of the “missing heritability” of these traits. To identify such modifiers, we genotyped 23 lipid-associated variants identified previously through genome-wide association studies (GWAS) in 2,435 non-Hispanic white, 1,407 non-Hispanic black, and 1,734 Mexican-American samples collected for the National Health and Nutrition Examination Surveys (NHANES). Along with lipid levels, NHANES collected environmental variables, including serum levels of the fat-soluble micronutrients vitamin A and vitamin E. As part of the Population Architecture using Genomics and Epidemiology (PAGE) study, we modeled gene–environment interactions between vitamin A or vitamin E and 23 variants previously associated with high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride (TG) levels. We identified three SNP × vitamin A and six SNP × vitamin E interactions at a significance threshold of p < 2.2 × 10⁻³. The most significant interaction was APOB rs693 × vitamin E (p = 8.9 × 10⁻⁷) for LDL-C levels among Mexican-Americans. The nine significant interaction models individually explained 0.35–1.61 % of the variation in any one of the lipid traits. Our results suggest that vitamins A and E may modify known genotype–phenotype associations; however, these interactions account for only a fraction of the overall variability observed for HDL-C, LDL-C, and TG levels in the general population.

Introduction

The importance of both genetics and environment in shaping an individual’s lipid profile is intuitively obvious. However, the search for gene–environment interactions that influence levels of high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglycerides (TG) began only relatively recently. One driving force for expanding beyond the standard single-variant models is the observation that single-variant main effects do not account for the majority of the heritability attributed to additive genetics for most complex human traits (Manolio et al. 2009). For the lipid traits, heritability estimates are as high as 80 % (Heller et al. 1993; O’Connell et al. 1988; Snieder et al. 1999), yet the largest and most comprehensive lipid meta-analysis to date was only able to explain about 25–30 % of the genetic variance (Teslovich et al. 2010). The identification of gene–environment interactions may help find a proportion of this “missing heritability”.

Within a statistical framework, a gene–environment interaction describes the effect of a genotype and an environmental factor that deviates from their additive effects. Within a biological framework, the environment (or its by-product) modifies the function or amount of a gene product (Hunter 2005). The latter approach to identify gene–environment interactions is difficult in outbred populations such as humans given that both genetic background and environmental exposures vary within and across populations. Model organisms are more suited to identify biological interactions, but it is difficult to automate these studies, and the findings of these experiments may not generalize to humans (Ober and Vercelli 2011). In contrast, methods to identify statistical interactions can be automated, making them an attractive option for detecting gene–environment interactions important for complex human traits (Hunter 2005).

A number of candidate environmental factors may affect lipoprotein phenotypes, including diet and nutrition. More specifically, fat-soluble micronutrients such as vitamin E (α-tocopherol) and vitamin A may influence lipid metabolism because their metabolic pathways are tightly linked: fat-soluble vitamins and vitamin precursors are absorbed together with dietary fat. Following absorption, both lipophilic molecules are transported to the liver via lipoproteins (as retinyl esters in the case of vitamin A). Vitamins E and A are then re-secreted by the liver into the circulation (as retinol in the case of vitamin A). However, here their metabolic pathways diverge, with the majority (90 %) of vitamin E found in LDL or HDL particles and the majority of circulating vitamin A found in complex with a specific transporter, retinol-binding protein (Norum and Blomhoff 1992; Zingg and Azzi 2004). Despite this, both vitamins A and E are known to be positively correlated with cholesterol levels. Furthermore, previous studies have indicated that variants in genes which influence lipid metabolism also influence plasma α-tocopherol levels (Borel et al. 2007, 2009; Gomez-Coronado et al. 2002; Ortega et al. 2005) and trans-retinol levels (Gomez-Coronado et al. 2002).

Despite evidence that genetic variants and environmental factors are independently associated with lipid traits, relatively few studies have investigated the interaction between the two (Bernstein et al. 2002; Corella et al. 2001a, b; Hagberg et al. 2000; Lai et al. 2006; Weinberg 2002). To our knowledge, no published study has explicitly tested the effect of non-additive interactions between lipid-associated SNPs and vitamin E or A on lipid levels. We present here an investigation of the effects of 23 lipid-associated SNPs in the context of circulating levels of vitamins A and E using data from the National Health and Nutrition Examination Surveys (NHANES) as part of the Population Architecture using Genomics and Epidemiology (PAGE) study (Matise et al. 2011). Analysis of more than 5,500 participants from this diverse population-based survey provides a first step in finding the “missing heritability” for lipid traits by accounting for nutritional modifiers.

Materials and methods

Study population

Study samples were drawn from three National Health and Nutrition Examination Surveys (NHANES III, NHANES 1999–2000, and NHANES 2001–2002). Participant ascertainment and data collection for NHANES have been previously described (Centers for Disease Control and Prevention 1996; Centers for Disease Control and Prevention 2010). Fasting adults (age ≥18 years) were included in this analysis, regardless of self-reported lipid-lowering medication use, as fewer than 4 % of participants reported such use and previous sensitivity analyses showed that excluding participants based on medication use did not appreciably alter the results of single-SNP tests of association (Dumitrescu et al. 2011). Race/ethnicity was self-described. Body mass index (BMI) was calculated from height and weight measured in the Mobile Examination Center by CDC medical personnel. Current smoking was defined by an affirmative response to “do you smoke cigarettes now?” or by cotinine levels >15 ng/ml.

All procedures were approved by the CDC Ethics Review Board and written informed consent was obtained from all participants. Because no identifying information was accessed by the investigators, this study was considered exempt from human subjects review by Vanderbilt University’s Institutional Review Board.

Laboratory and dietary measurements

Serum HDL-C, triglycerides, and total cholesterol were measured using standard enzymatic methods. LDL-C was calculated using the Friedewald equation, with missing values assigned for samples with triglyceride levels greater than 400 mg/dl. Serum levels of vitamin E (α-tocopherol) and vitamin A (retinol) were measured with isocratic high-performance liquid chromatography (Centers for Disease Control and Prevention 1996; Centers for Disease Control and Prevention (CDC) 2002).
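
The LDL-C calculation described above can be sketched as follows. This is a minimal illustration that assumes the standard form of the Friedewald equation with all values in mg/dl (LDL-C = total cholesterol − HDL-C − triglycerides/5); the function and parameter names are hypothetical and are not taken from the NHANES processing code.

```python
from typing import Optional


def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float,
                   tg_cutoff: float = 400.0) -> Optional[float]:
    """Estimate LDL-C (mg/dl) with the Friedewald equation.

    Returns None (missing) when triglycerides exceed the cutoff,
    mirroring the exclusion rule described in the text.
    """
    if triglycerides > tg_cutoff:
        return None
    # Triglycerides / 5 approximates VLDL cholesterol when units are mg/dl.
    return total_chol - hdl - triglycerides / 5.0


print(friedewald_ldl(200.0, 50.0, 150.0))  # 120.0
print(friedewald_ldl(200.0, 50.0, 450.0))  # None
```

Returning a missing value rather than an estimate above the cutoff reflects the rule stated in the text; the cutoff is exposed as a parameter only for illustration.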

Data for dietary intake were collected via a 24-h dietary recall administered by a trained dietary interviewer. Total nutrient intake was calculated using the US Department of Agriculture’s survey nutrient database. Total energy intake from fat, protein, carbohydrates, and alcohol was calculated by multiplying the grams of intake by the appropriate conversion factor: 9, 4, 4, and 7 kcal/g, respectively.
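
As a small worked example of the energy conversion above, the sketch below multiplies grams of each macronutrient by the conversion factors given in the text. The intake values and all names are hypothetical and serve only to show the arithmetic.

```python
# Conversion factors (kcal per gram) as stated in the text.
KCAL_PER_GRAM = {"fat": 9.0, "protein": 4.0, "carbohydrate": 4.0, "alcohol": 7.0}


def energy_from_macronutrients(grams: dict) -> dict:
    """Convert grams of each macronutrient into kcal."""
    return {name: amount * KCAL_PER_GRAM[name] for name, amount in grams.items()}


# Hypothetical 24-h recall totals in grams.
recall = {"fat": 70.0, "protein": 80.0, "carbohydrate": 250.0, "alcohol": 10.0}
energy = energy_from_macronutrients(recall)
print(energy)                # {'fat': 630.0, 'protein': 320.0, 'carbohydrate': 1000.0, 'alcohol': 70.0}
print(sum(energy.values()))  # 2020.0
```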

SNP selection and genotyping

A total of 23 SNPs were considered in this analysis (Table 1). All SNPs were previously associated with HDL-C, LDL-C, and/or triglycerides in published (as of early 2009) candidate gene and genome-wide association studies (Kathiresan et al. 2008; Lu et al. 2008; Sandhu et al. 2008; Willer et al. 2008; Aulchenko et al. 2009; Kathiresan et al. 2009) and were subsequently analyzed for single-SNP associations with lipid levels in a large meta-analysis by the PAGE study (Dumitrescu et al. 2011). The 23 SNPs tested for gene–environment interactions were either accessed from existing data in the Genetic NHANES database or directly genotyped by the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study, one of the four large population-based studies of the PAGE network, using Sequenom or Illumina BeadXpress. For rs3890182 (ABCA1) and rs12654264 (HMGCR) in NHANES III, we accessed existing data (Keebler et al. 2009). APOB rs693 genotyping in NHANES III was performed using the Illumina GoldenGate assay (as part of a custom 384 OPA) by the Center for Inherited Disease Research (CIDR) through the National Heart, Lung, and Blood Institute’s Resequencing and Genotyping Service. The remaining genotyping was performed in the Vanderbilt DNA Resources Core and in the laboratory of Dr. Jonathan Haines. In addition to genotyping experimental NHANES samples, we genotyped blinded duplicates provided by CDC and HapMap controls (n = 360). All EAGLE SNPs considered here were genotyped in all three NHANES (NHANES III, NHANES 1999–2000, and NHANES 2001–2002), had minor allele frequencies >5 % in all three racial/ethnic populations, passed CDC quality control metrics, and are available for secondary analyses through NCHS/CDC.
Table 1

List of 23 candidate genes and GWAS-identified SNPs genotyped in NHANES

SNP         Chr.  Build 37 location (bp)  Coded allele  Function        Gene of interest
rs11206510  1     55495789                C             Intergenic      PCSK9
rs1748195   1     63049343                C             Intronic        ANGPTL3
rs693       2     21231945                T             Synonymous      APOB
rs754523    2     21311441                C             Intergenic      APOB
rs780094    2     27740987                A             Intronic        GCKR
rs12654264  5     74648353                A             Intronic        HMGCR
rs1501908   5     156397919               C             Intergenic      TIMD4
rs2197089   8     19826123                C             Downstream      LPL
rs2954029   8     126560154               A             Intergenic      TRIB1
rs4149268   9     107647220               A             Intronic        ABCA1
rs3890182   9     107647405               A             Intronic        ABCA1
rs1883025   9     107664051               G             Intronic        ABCA1
rs174547    11    61570533                C             Intronic        FADS1
rs3135506   11    116662157               C             Non-synonymous  APOA1/C3/A4/A5
rs2338104   12    109894918               C             Intronic        MMAB-MVK
rs4775041   15    58674445                C             Intergenic      LIPC
rs9989419   16    56984889                A             Upstream        CETP
rs3764261   16    56993074                G             Upstream        CETP
rs2271293   16    67901820                A             Intronic        LCAT
rs2156552   18    47181418                A             Intergenic      LIPG
rs2967605   19    8469488                 G             Downstream      ANGPTL4
rs6102059   20    39228784                C             Intergenic      MAFB
rs7679      20    44576252                C             Downstream      PLTP

For each SNP (denoted by rs number), we list the chromosome and genomic location (base pair), the coded allele, the putative function of the SNP (based on SNP location), and the nearest gene of interest

Statistical analysis

Regression modeling was used to investigate the effect of interactions between lipid-associated variants and vitamin levels on HDL-C, LDL-C, and triglycerides. Gene–environment interactions were modeled using a multiplicative interaction term between the environmental variable and the additively encoded SNP. All models were adjusted for the main effect of the SNP and the environmental variable, along with age and sex. Significant associations were further adjusted for BMI and current smoking status, with or without five dietary variables (total fiber and total energy intake from carbohydrates, protein, fat, and alcohol). Triglycerides and vitamin E levels were natural-log transformed because of their skewed, non-normal distributions. A linear model was used for the main effects of both serum vitamins for all analyses. Visual inspection of the relevant scatter plots failed to suggest that more complicated models would better fit these main effects. Given that misspecification of the model will reduce our power to detect an interaction, only linear models were considered here.
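
The model described above can be sketched as an ordinary least squares fit with a product term. The study itself used SAS v9.2; the Python below is only an illustrative stand-in, and the function name, variable names, and simulated data are all assumptions rather than the authors' code or data.

```python
import numpy as np


def fit_interaction_model(y, snp, env, age, sex):
    """Ordinary least squares fit of
    y = b0 + b1*snp + b2*env + b3*(snp*env) + b4*age + b5*sex + error,
    where snp is coded additively as 0, 1, or 2 copies of the coded allele.
    """
    snp = np.asarray(snp, dtype=float)
    env = np.asarray(env, dtype=float)
    design = np.column_stack([
        np.ones_like(snp),             # intercept
        snp,                           # SNP main effect
        env,                           # environment (vitamin) main effect
        snp * env,                     # multiplicative interaction term
        np.asarray(age, dtype=float),
        np.asarray(sex, dtype=float),
    ])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    names = ["intercept", "snp", "env", "snp_x_env", "age", "sex"]
    return dict(zip(names, coef))


# Simulated, hypothetical data (not NHANES values).
rng = np.random.default_rng(0)
n = 2000
snp = rng.binomial(2, 0.3, size=n)
log_vit_e = rng.normal(7.0, 0.4, size=n)   # natural-log transformed exposure
age = rng.uniform(18, 80, size=n)
sex = rng.integers(0, 2, size=n)
ldl = (100 - 5.0 * snp + 3.0 * log_vit_e + 2.0 * snp * log_vit_e
       + 0.2 * age + 1.0 * sex + rng.normal(0, 5, size=n))

fit = fit_interaction_model(ldl, snp, log_vit_e, age, sex)
print(fit["snp_x_env"])  # estimate of the simulated interaction effect (true value 2.0)
```

In this sketch the interaction is judged by the coefficient on the product term; standard errors and p values, which the study reports, would require a statistics package and are omitted here.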

All analyses were stratified by self-reported race/ethnicity to minimize possible confounding due to population stratification and were conducted in SAS v9.2 (SAS Institute, Cary, NC) using the Analytic Data Research by Email (ANDRE) portal of the CDC Research Data Center in Hyattsville, MD. Associations were deemed significant if the p value was less than or equal to the Bonferroni-corrected threshold of 2.2 × 10⁻³ (0.05/23 SNPs ≈ 2.17 × 10⁻³). Aggregate statistics related to this work will be available via dbGaP as part of the PAGE study.
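
The threshold arithmetic above is simple enough to check directly. The sketch below is illustrative only; the function names are hypothetical, and the three p values passed to it are taken from the interaction results reported later in the text.

```python
ALPHA = 0.05
N_SNPS = 23


def bonferroni_threshold(alpha: float = ALPHA, n_tests: int = N_SNPS) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value: float, threshold: float) -> bool:
    # "Less than or equal to" the threshold, as stated in the text.
    return p_value <= threshold


threshold = bonferroni_threshold()
print(f"{threshold:.2e}")                  # 2.17e-03
print(is_significant(8.94e-7, threshold))  # True
print(is_significant(2.16e-3, threshold))  # True
print(is_significant(0.11, threshold))     # False
```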

To detect an interaction of a certain effect size, a general rule of thumb is that four times the sample size required to detect a comparable main effect is needed (Smith and Day 1984; Thomas 2010). Based on this assumption, we have 80 % power to detect a gene–environment interaction with an effect size as low as R² = 2.3 % in non-Hispanic whites, R² = 3.8 % in non-Hispanic blacks, and R² = 3.1 % in Mexican-Americans (α = 2.2 × 10⁻³; additive genetic model). Quanto (Gauderman 2002) was used to estimate statistical power.
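
To make the rule of thumb concrete, the sketch below divides each group's sample size by four to give the size of a main-effect study with roughly comparable power. It illustrates the heuristic only and does not reproduce the Quanto power calculation; the function name is hypothetical.

```python
def main_effect_equivalent_n(n_participants: int, inflation: float = 4.0) -> float:
    """Sample size of a main-effect study with roughly comparable power,
    under the four-fold rule of thumb quoted in the text."""
    return n_participants / inflation


# Group sample sizes as reported for this study.
sample_sizes = {
    "Non-Hispanic whites": 2435,
    "Non-Hispanic blacks": 1407,
    "Mexican-Americans": 1734,
}
for group, n in sample_sizes.items():
    print(group, main_effect_equivalent_n(n))
```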

Results

Population characteristics

Table 2 displays descriptive statistics for the key variables in this study. Both vitamin A and E levels were significantly different among the three racial/ethnic groups (p < 0.001, one-way ANOVA). Non-Hispanic whites had higher mean vitamin A and vitamin E levels (60.6 and 1,322 μg/dl, respectively) than non-Hispanic blacks (53.1 and 1,002 μg/dl) and Mexican-Americans (52.8 and 1,135 μg/dl). Non-Hispanic blacks and Mexican-Americans had similar mean vitamin A levels, although vitamin E levels were higher in Mexican-Americans.
Table 2

NHANES participant characteristics

Trait                   Non-Hispanic whites   Non-Hispanic blacks   Mexican-Americans
N                       2,435                 1,407                 1,734
Age (years)             51.9 ± 20             42.5 ± 17             42.8 ± 18
Female (%)              54                    56                    51
Vitamin A (μg/dl)       60.6 ± 16             53.1 ± 17             52.8 ± 15
Vitamin E (μg/dl)       1,322 ± 615           1,002 ± 379           1,135 ± 459
HDL-C (mg/dl)           51.3 ± 16             54.1 ± 17             48.3 ± 14
LDL-C (mg/dl)           126.9 ± 36            122.2 ± 39            120.9 ± 34
Triglycerides (mg/dl)   146.7 ± 93            107.0 ± 72            156.3 ± 104

Values are represented as mean ± SD unless otherwise indicated

It is important to note that vitamins A and E were highly correlated with most lipid traits in all three NHANES populations (Table 3). More specifically, vitamin A was associated with all three lipid traits in most of the populations studied. For triglycerides, the amount of variance explained (R²) by vitamin A was as high as 14 % in non-Hispanic whites. R² was smaller for the other two lipid traits (max R² < 5 % between LDL-C and vitamin A in Mexican-Americans; Table 3), although it was still larger than the average amount of variance explained by single common genetic variants (~3 %). Vitamin E was also very strongly correlated with LDL-C and triglyceride levels (p ≤ 4.05 × 10⁻⁴⁵) across all racial/ethnic groups. Furthermore, vitamin E levels explained 17–24 % of the variance in LDL-C levels and 25–40 % of the variance in triglyceride levels (Table 3).
Table 3

Associations between lipid traits and vitamins A and E

              Non-Hispanic whites                  Non-Hispanic blacks                  Mexican-Americans
Lipid trait   β (SE)          p value      R²      β (SE)          p value      R²      β (SE)          p value      R²
Vitamin A
  HDL-C       0.05 (0.02)     2.40E−03*    <0.01   0.08 (0.03)     4.37E−03*    0.01    0.06 (0.02)     9.28E−03*    <0.01
  LDL-C       0.24 (0.24)     2.88E−05*    0.02    0.17 (0.08)     0.03         0.02    0.38 (0.07)     1.58E−08*    0.05
  TG          0.01 (0.001)    5.50E−56*    0.14    0.01 (0.001)    2.14E−27*    0.11    0.01 (0.001)    1.26E−30*    0.12
Vitamin E
  HDL-C       1.25 (0.82)     0.13         <0.01   0.65 (1.53)     0.67         <0.01   −2.18 (0.97)    0.02         <0.01
  LDL-C       36.21 (2.29)    1.03E−52*    0.17    57.87 (3.88)    4.05E−45*    0.24    46.91 (2.83)    2.04E−55*    0.23
  TG          0.68 (0.03)     4.95E−116*   0.26    0.78 (0.04)     1.91E−68*    0.25    1.01 (0.03)     8.23E−151*   0.40

Associations between lipid traits and vitamin levels were tested using linear regression, adjusted for age and sex. Both triglycerides and vitamin E levels were natural-log transformed. Measures of variance explained (R²) are also provided for each association based on unadjusted regressions. Significant associations (p < 0.01) are marked with an asterisk (*)

SNP × vitamin interactions

We tested for gene–environment interaction effects between our 23 lipid-associated variants and vitamins A and E on HDL-C, LDL-C, and triglyceride levels. A total of nine associations, comprising eight distinct gene–environment interactions, were statistically significant at p < 2.2 × 10⁻³ and are summarized in Table 4. Full association results are reported in Supplementary Tables S1–S6. The association between LDL-C and APOB rs693 × vitamin E in Mexican-Americans was the most significant at p = 8.94 × 10⁻⁷. This same interaction was significant in non-Hispanic whites (p = 2.67 × 10⁻⁴) but not in non-Hispanic blacks (p = 0.11, Table S5). In addition, other interactions with this APOB variant (rs693 × vitamin A and rs693 × vitamin E) were significantly associated with triglyceride levels among non-Hispanic whites at p = 2.16 × 10⁻³ and 4.65 × 10⁻⁵, respectively.
Table 4

Significant SNP × environment interactions in NHANES

                                                       SNP main effect              Environment main effect   SNP × Environment interaction effect
Interaction         Lipid trait  Population            β (SE)            p value    β (SE)         p value    β (SE)          p value    R² (%)
rs693 × VitA        TG           Non-Hispanic whites   −0.16 (0.06)      6.11E−03   0.01 (0.001)   1.01E−22   0.003 (0.001)   2.16E−03   0.39
rs693 × VitE        LDL-C        Non-Hispanic whites   −74.86 (21.54)    5.22E−04   31.86 (2.76)   1.39E−29   11.11 (3.04)    2.67E−04   0.67
rs693 × VitE        LDL-C        Mexican-Americans     −155.52 (31.82)   1.17E−06   38.98 (3.25)   2.51E−31   22.71 (4.60)    8.94E−07   1.61
rs693 × VitE        TG           Non-Hispanic whites   −0.99 (0.25)      8.59E−09   0.60 (0.03)    3.48E−62   0.14 (0.04)     4.65E−05   0.60
rs1748195 × VitA    HDL-C        Non-Hispanic whites   −5.15 (1.67)      2.07E−03   −0.05 (0.04)   0.18       0.09 (0.03)     1.16E−03   0.39
rs1748195 × VitE    HDL-C        Non-Hispanic whites   −23.13 (7.55)     2.22E−03   −3.12 (0.65)   0.06       3.28 (1.06)     2.06E−03   0.35
rs11206510 × VitA   LDL-C        Mexican-Americans     −30.58 (8.08)     1.63E−04   0.25 (0.07)    9.22E−04   0.58 (0.15)     7.65E−05   1.26
rs11206510 × VitE   TG           Non-Hispanic whites   1.03 (0.33)       1.63E−03   0.74 (0.03)    1.89E−49   −1.15 (0.05)    1.27E−03   0.36
rs3135506 × VitE    TG           Non-Hispanic blacks   −3.02 (0.85)      4.16E−04   0.74 (0.04)    1.14E−56   0.46 (0.12)     2.45E−04   0.80

Associations with significant interaction terms (p < 2.17E−03, Bonferroni-corrected p value for 23 SNPs) are listed. Both triglycerides and vitamin E levels were natural-log transformed. βs, standard errors (SE), and p values for main effects of the SNP and the environment are shown, along with the amount of trait variance explained (R²) by the interaction term

Interactions between ANGPTL3 rs1748195 and both vitamins A and E were associated with HDL-C levels in non-Hispanic whites (p = 1.16 × 10⁻³ and p = 2.06 × 10⁻³). The ANGPTL3 rs1748195 × vitamin A interaction trended toward significance in non-Hispanic blacks (p = 0.01) but was not associated with HDL-C in Mexican-Americans (p = 0.64, Table S1). Similarly, the rs1748195 × vitamin E interaction was not associated with HDL-C in the other two populations.

Two interactions with a variant in PCSK9 are also listed in Table 4. The PCSK9 rs11206510 × vitamin A interaction was associated with LDL-C in Mexican-Americans at p = 7.65 × 10⁻⁵. In addition, the PCSK9 rs11206510 × vitamin E interaction was associated with transformed triglycerides in non-Hispanic whites at p = 1.27 × 10⁻³. Lastly, the only significant gene–environment interaction observed in non-Hispanic blacks was between the APOA1/C3/A4/A5 cluster variant rs3135506 and vitamin E, which was associated with triglyceride levels at p = 2.45 × 10⁻⁴.

The nine significant interaction models individually explained 0.35–1.61 % of the variation in one of the lipid traits. Interactions rs693 × vitamin E and rs11206510 × vitamin A had the greatest R² values and accounted for 1.61 % and 1.26 %, respectively, of the variation in LDL-C among Mexican-Americans. The seven other interaction terms had R² values <1 %.
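
One common way to attribute variance to an interaction term is the gain in R² when the product term is added to a model that already contains both main effects and the covariates. The text does not state how the reported R² values were derived, so the sketch below is an assumption about method as well as implementation; the function names and simulated data are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R² of an ordinary least squares fit of y on X (intercept added here)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - float(residuals @ residuals) / float(((y - y.mean()) ** 2).sum())


def interaction_r2(y, snp, env, covariates):
    """Gain in R² from adding the SNP × environment product term."""
    snp = np.asarray(snp, dtype=float)
    env = np.asarray(env, dtype=float)
    reduced = np.column_stack([snp, env, covariates])
    full = np.column_stack([snp, env, snp * env, covariates])
    return r_squared(full, y) - r_squared(reduced, y)


# Simulated, hypothetical data (not NHANES values).
rng = np.random.default_rng(1)
n = 1500
snp = rng.binomial(2, 0.4, size=n)
vitamin = rng.normal(0.0, 1.0, size=n)
age_sex = np.column_stack([rng.uniform(18, 80, size=n), rng.integers(0, 2, size=n)])
trait = 2.0 * snp + 5.0 * vitamin + 1.5 * snp * vitamin + rng.normal(0, 10, size=n)

print(f"{100 * interaction_r2(trait, snp, vitamin, age_sex):.2f} %")
```

Because the reduced model is nested in the full model, the gain is never negative (up to floating-point error), which makes it a convenient summary of how much an interaction adds beyond the main effects.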

Adjustment for lifestyle and dietary variables

Additional factors, both dietary and environmental, may influence both serum lipid and serum vitamin levels and, possibly, the interactions modeled here. To account for these variables, we adjusted our nine most significant associations for (1) BMI and current smoking status and (2) BMI, current smoking status, and five dietary variables (total fiber and total energy intake from carbohydrates, protein, fat, and alcohol), along with age and sex. For five of the nine associations tested, adjustment for BMI and smoking did not appreciably alter the results compared to the models minimally adjusted for age and sex (Table S7). Of these five associations, only one (rs1748195 × vitamin E with HDL-C in non-Hispanic whites) was no longer significant (p = 0.02) after including dietary variables in the model. Interestingly, all four of the associations that no longer remained significant (p > 0.04) after adjustment for BMI and current smoking status included an interaction with rs693. Indeed, the p value for the previously most significant interaction (rs693 × vitamin E with LDL-C in Mexican-Americans) rose from p = 8.94 × 10⁻⁷ to p = 0.59 and the amount of variance explained dropped from R² = 1.61 % to only R² = 0.30 % (Table S7).

Discussion

In this study, we have identified three novel SNP × vitamin A and six novel SNP × vitamin E interactions. The largest share of the significant interactions involved triglycerides (4/9), and most were among non-Hispanic whites (6/9), which may be a result of the stronger associations between triglycerides and serum vitamin levels (Table 3) and the larger sample size for non-Hispanic whites compared to non-Hispanic blacks and Mexican-Americans (Table 2). When dietary and lifestyle variables were included in the model, the four vitamin interactions with rs693 were no longer significant (Table S7).

Although we identified several statistically significant interactions, the overall contribution each interaction term made toward the observed trait variability for any of the lipid traits was small. For example, after adjusting for age and sex, the interactions discovered here explained only 0.35–0.39 %, 0.67–1.61 %, and 0.36–0.80 % of the variability in HDL-C, LDL-C, and triglyceride levels, respectively. Our most significant finding (APOB rs693 × vitamin E) explained only 1.61 % of the variance in LDL-C among Mexican-Americans, a trait that is up to 80 % heritable. In comparison, the effect of age and sex together accounted for 5.9 % of the variance in LDL-C among Mexican-Americans. Furthermore, after adjusting our most significant interactions for BMI, current smoking status, and dietary intake, all the R² values decreased, with the rs11206510 × vitamin A interaction with LDL-C in Mexican-Americans resulting in the largest R² of only 1.12 %.

All of the genes implicated here play key roles in lipid metabolism. ANGPTL3 encodes a protein which can suppress lipoprotein lipase (LPL) activity, leading to increases in plasma triglycerides and HDL-C. PCSK9 encodes proprotein convertase subtilisin/kexin type 9, a protein that binds the LDL receptor and induces its degradation. The APOA1/C3/A4/A5 gene cluster lies within a 17-kb region on chromosome 11. Proteins made by this gene cluster are the major constituents of very low-density lipoprotein (VLDL) and/or HDL, act to inhibit LPL activity, and influence dietary fat absorption and chylomicron synthesis (Delgado-Lista et al. 2010). The gene products of APOB, apoB-48 and apoB-100, are the main apolipoproteins of chylomicrons and LDL particles, respectively.

Interestingly, four out of the nine (44 %) significant interactions included the variant rs693 in APOB (Table 4). One study showed that genetically modified mice that do not express APOB in the intestine do not form chylomicrons and display defective absorption of fats and fat-soluble vitamins (Young et al. 1995). Furthermore, mutations in APOB have been shown to cause familial hypobetalipoproteinemia (FHBL), which is characterized by low levels of apolipoprotein B-containing lipoproteins and fat-soluble vitamin malabsorption, resulting in neurological complications from lack of vitamin E (Young 1990).

Both vitamin E and vitamin A precursors are incorporated into chylomicrons for delivery to the liver. In addition, circulating vitamin E is found exclusively in plasma lipoproteins (VLDL, LDL, and HDL) (Borel et al. 2007). The interdependence of these vitamins and lipids (as demonstrated in Table 3) suggests that the interactions described in this study may either simply reflect the strong correlation between vitamins and lipids or be biologically relevant. In support of the latter interpretation, micronutrients have previously been implicated in affecting the expression of important lipid-metabolizing genes (Hagberg et al. 2000; Gatica et al. 2006; Mooradian et al. 2006a, b; Oliveros et al. 2007). For example, Mooradian et al. (2006a) demonstrated that high concentrations of vitamin E were associated with significant decreases in apoA-I expression (which is sensitive to the oxidative state of the cell) in hepatic HepG2 cells by reducing apoA-I promoter activity.

In this study, we tested for gene–environment interactions regardless of whether any of the 23 SNPs had a significant main effect in our previous meta-analysis as part of the larger PAGE study (Dumitrescu et al. 2011). Indeed, only about half (13 HDL-C, 12 LDL-C, and 12 TG; 37/69 = 54 %) of the associations tested in Dumitrescu et al. were significant at p < 0.05 in European Americans, the largest population studied in PAGE (n ≈ 20,000). Even fewer single-SNP associations were significant in African Americans (8 HDL-C, 3 LDL-C, and 8 TG; 19/69 = 28 %; n ≈ 9,000) and in Mexican-Americans/Hispanics (8 HDL-C, 6 LDL-C, and 5 TG; 19/69 = 28 %; n ≈ 2,500). Differences in SNP main effects across racial/ethnic groups may help to explain the differences we observed across our three study populations. Indeed, of the significant interactions identified, only the rs693 × vitamin E interaction with LDL-C was significant in more than one racial/ethnic group (non-Hispanic whites and Mexican-Americans). As discussed in Dumitrescu et al. (2011), the lack of generalization of SNP main effects between the different racial/ethnic groups may be due to differences in linkage disequilibrium or differences in power.

It has also been argued that gene–environment heterogeneity may be, in part, to blame for the lack of replication among GWAS and among different ancestral populations (Lasky-Su et al. 2008; Ober and Vercelli 2011). In the single-SNP PAGE meta-analysis detailed in Dumitrescu et al. (2011), APOB rs693 was strongly associated with LDL-C in European Americans (p = 3.38 × 10⁻²¹), marginally associated in African Americans (p = 0.02), but not associated in Mexican-Americans/Hispanics (p = 0.18). However, in this analysis, which represents a subset of the PAGE study sample, the main effect of rs693 was significantly associated in Mexican-Americans (p = 1.17 × 10⁻⁶, Table 4) after adjusting for the interaction with vitamin E. Indeed, nine significant gene–environment interactions were identified here and involved seven independent SNP main effects. Of those seven, only four had significant main effects in our earlier single-SNP meta-analysis for the same lipid trait and study population. However, after adjusting for the gene–environment interaction, the SNP main effect was significant (p ≤ 6.11 × 10⁻³) for all seven. Accounting for environmental modifiers in genetic studies of lipid levels may not only uncover new biology but also improve the generalizability of findings from genome-wide association studies.

In interpreting our findings, we should consider several aspects. First, NHANES is a cross-sectional study and, therefore, we are unable to determine the temporal sequence of our results. Second, the issue of sample size and the ‘curse of dimensionality’ (Bellman 1961; Dumitrescu et al. 2011) are relevant to this study. As the number of factors under study increases (as with the addition of interaction terms), so does the number of strata. With a set sample size, increasing the number of terms in the model quickly increases the degrees of freedom and reduces the per-stratum sample size, thus decreasing statistical power. For this reason, even with relatively large sample sizes in NHANES, we had to restrict our analysis to SNPs with minor allele frequencies >5 %. To better study less common variants, collaborative studies and/or other non-regression-based approaches (such as Multifactor Dimensionality Reduction) (Ritchie et al. 2001) may be appropriate, although they are not without their own limitations.

In addition, it is important to note that correcting for multiple testing in gene–environment interaction studies is inherently more complicated than in standard single-SNP association studies. For GWAS, it is well known that a strict Bonferroni adjustment using the total number of SNPs tested is overly conservative as many SNPs are in linkage disequilibrium and, therefore, not all tests are independent. This concern holds true for studies of gene–environment interactions and is compounded by correlations between the SNP and the environmental variable (i.e., main effects) and, possibly, correlations among the different environmental variables tested. And while permutation testing has become very popular in single-SNP analyses as a way to correct for multiple testing, for gene–gene and gene–environment interaction studies, permutation testing is not available in most situations and does not guarantee strong control of the family-wise error rate (Anderson and Robinson 2001; Buzkova et al. 2011). As this was a discovery study, we corrected for only 23 tests even though we conducted 414 tests (23 SNPs × 2 environmental variables × 3 race/ethnicities × 3 lipid traits), although many of these tests are highly correlated. Indeed, replication may be the most acceptable approach to filter true findings from the false positives, but this approach, like all others, is not without limitations.
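
The test count above, and the difference between the two possible Bonferroni thresholds, can be checked with a few lines. This is an illustration only: the stricter threshold was not applied in the study, the variable names are hypothetical, and the p values are the interaction p values listed in Table 4.

```python
from math import prod

# Dimensions of the analysis as described in the text.
design = {"SNPs": 23, "vitamins": 2, "race/ethnicity groups": 3, "lipid traits": 3}

n_tests = prod(design.values())
alpha = 0.05
per_snp_threshold = alpha / design["SNPs"]   # threshold used in the study
all_tests_threshold = alpha / n_tests        # stricter, hypothetical alternative

# Interaction p values from Table 4, in row order.
table4_interaction_p = [2.16e-3, 2.67e-4, 8.94e-7, 4.65e-5, 1.16e-3,
                        2.06e-3, 7.65e-5, 1.27e-3, 2.45e-4]
surviving = [p for p in table4_interaction_p if p <= all_tests_threshold]

print(n_tests)  # 414
print(f"{per_snp_threshold:.2e}  {all_tests_threshold:.2e}")
print(len(surviving), "of", len(table4_interaction_p), "pass the stricter threshold")
```

Because many of the 414 tests are correlated, the stricter threshold is best read as a conservative bound rather than as the correct alternative.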

Conclusion

The differences in lipid traits between individuals and between populations may partly result from interactions of known lipid-associated genetic variants and fat-soluble micronutrients. The results presented here highlight the fact that effect sizes of gene–environment interactions tend to be small and that large sample sizes are needed to detect them. Nevertheless, understanding the mechanism of the interaction between these lipid-associated variants and environmental factors, such as serum vitamin E and A levels, is imperative to determining the etiology of a poor lipid profile and could, therefore, have implications in clinical care.

Acknowledgments

Genotyping in NHANES was supported in part by The Population Architecture Using Genomics and Epidemiology (PAGE) study, which is funded by the National Human Genome Research Institute (NHGRI). Data included in this report resulted from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study, as part of the NHGRI PAGE study (U01HG004798). Genotyping services for select EAGLE NHANES III SNPs presented here were also provided by the Johns Hopkins University under federal contract number (N01-HV-48195) from NHLBI. We at EAGLE would like to thank Dr. Geraldine McQuillan and Jody McLean for their help in accessing the Genetic NHANES data. The Vanderbilt University Center for Human Genetics Research, Computational Genomics Core provided computational and/or analytical support for this work. The NHANES DNA samples are stored and plated by the Vanderbilt DNA Resources Core, managed by Cara Sutcliffe. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the National Institutes of Health or the Centers for Disease Control and Prevention.

Supplementary material

439_2012_1186_MOESM1_ESM.docx (36 kb)
Supplementary material 1 (DOCX 35 kb)

Copyright information

© Springer-Verlag 2012