Methods

The nuclear factor-kappaB (NF-κB) family of transcription factors regulates the expression of hundreds of genes, including pro-inflammatory and apoptosis genes [1-3]. Transcription of these genes is activated by five NF-κB subunits (NFKB1 encoding p50, NFKB2 encoding p52, REL encoding c-Rel, RELA encoding p65, and RELB encoding Rel-B). The NF-κB pathway is a critical candidate gene pathway for numerous cancers and cardiovascular endpoints.

Samples and data availability

Genetic Analysis Workshop 15 (GAW15) Problem 1 included data on 14 three-generation pedigrees (two sets of grandparents, one set of parents, and a sibship of eight individuals) consisting of Utah residents with ancestry from northern and western Europe (CEPH-Utah, CEU). Pedigree members had genotypes on ~2882 single-nucleotide polymorphisms (SNPs) spread throughout the genome and ~3554 phenotypes consisting of expression levels from lymphoblastoid cells hybridized onto Affymetrix Genome Focus Arrays [4]. Expression density was scaled to 500 and transformed by log2 [4]. Forty-two participants (14 trios) were also studied by the International HapMap Consortium; thus, additional genotype data were available on selected individuals (including 28 unrelated individuals) in families 1340, 1341, 1345, 1346, 1347, 1362, 1408, 1416, and 1454 [5, 6].
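The scaling and log2 transformation described above can be illustrated with a minimal sketch. It assumes that "scaled to 500" means multiplying every probe on an array by one factor so that the array's mean intensity equals 500; the source does not state the exact scaling rule (array software often uses a trimmed mean instead), and the function name and input values are hypothetical.

```python
import math

def scale_and_log2(intensities, target=500.0):
    """Scale one array so its mean intensity equals `target`, then take log2.

    Assumption: a plain arithmetic mean is used for scaling; the original
    processing may have used a trimmed mean or another summary.
    """
    mean_intensity = sum(intensities) / len(intensities)
    factor = target / mean_intensity
    return [math.log2(x * factor) for x in intensities]

# Hypothetical raw intensities for four probes on one array.
raw = [100.0, 400.0, 200.0, 300.0]
transformed = scale_and_log2(raw)
```

Back-transforming the output with 2**value recovers scaled intensities whose mean is the target.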

Genotype selection

Genotypes from 21 GAW15-provided SNPs in regions of roughly 20 cM surrounding each candidate gene were analyzed: NFKB1 (90.6 cM to 117.5 cM on chromosome 4), NFKB2 (94.5 cM to 119.8 cM on chromosome 10), REL (45.7 cM to 73.4 cM on chromosome 2), RELA (44.7 cM to 78.4 cM on chromosome 11), and RELB (41.2 cM to 58.0 cM on chromosome 19). Denser HapMap genotypes within 5 kb of each gene were also used: NFKB1 (106 SNPs, mean r2 = 0.25), NFKB2 (3 SNPs, mean r2 = 0.01), REL (16 SNPs, mean r2 = 0.41), RELA (3 SNPs, mean r2 = 0.07), and RELB (8 SNPs, mean r2 = 0.18).
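As a rough illustration of the mean r2 values quoted per gene, the sketch below averages the squared Pearson correlation over all SNP pairs, with genotypes coded as minor-allele counts (0, 1, 2). This is an assumption: the source does not say how r2 was computed, and a haplotype-based r2 from phased data can differ from this genotype-based approximation. The SNP names and genotypes are hypothetical.

```python
from itertools import combinations

def pearson_r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # monomorphic SNP: correlation is undefined
    return sxy * sxy / (sxx * syy)

def mean_pairwise_r2(genotypes):
    """Average r2 over all SNP pairs; `genotypes` maps SNP name -> list of counts."""
    values = [pearson_r2(genotypes[a], genotypes[b])
              for a, b in combinations(sorted(genotypes), 2)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None

# Hypothetical genotypes for three SNPs in four unrelated individuals.
example = {
    "snp_a": [0, 0, 2, 2],
    "snp_b": [0, 2, 0, 2],
    "snp_c": [0, 0, 2, 2],
}
gene_mean_r2 = mean_pairwise_r2(example)
```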

Phenotype selection and heritability

Regulatory targets of NF-κB transcription (N = 165) were compiled from a review of the literature [1-3] and online catalogs [7]. Expression levels of 75 of these target genes were available in the GAW15 Problem 1 data. We estimated heritability (h2) using the Splus/R library multic [9], assuming a polygenic model in the 14 pedigrees. Fifteen phenotypes with h2 greater than 0.4 (p-value < 0.001) were included in the current analysis (Table 1). Additional h2 estimates are available upon request.
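For orientation, a polygenic variance-components model is usually written in the following textbook form. The symbols are introduced here for illustration, and the exact parameterization used by multic may differ.

```latex
\begin{aligned}
y_i &= \mu + g_i + e_i, \\
\operatorname{Cov}(y_i, y_j) &= 2\phi_{ij}\,\sigma_g^2 + \delta_{ij}\,\sigma_e^2, \\
h^2 &= \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}.
\end{aligned}
```

Here y_i is the expression level of individual i, \phi_{ij} is the kinship coefficient between relatives i and j, \delta_{ij} equals 1 when i = j and 0 otherwise, and \sigma_g^2 and \sigma_e^2 are the additive polygenic and residual variances.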

Table 1 Heritability (h2), association testing (minimum p-values of SNP and haplotype association test), and linkage analysis (maximum LOD scores)

Linkage analysis in extended pedigrees

Variance-components multipoint linkage analysis of the 15 expression levels was performed using multic [9] with GAW15 genotype data in the 14 extended pedigrees (194 individuals), assuming that 1 Mb corresponds to approximately 1 cM.
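The two quantities in this step can be sketched as follows. A LOD score is the base-10 logarithm of the likelihood ratio between a model with a locus-specific variance component and the polygenic-only model, and the map approximation is a simple unit conversion. The function names and the example log-likelihoods are hypothetical, and this is not multic's interface.

```python
import math

def lod_score(loglik_null, loglik_linkage):
    """LOD score from natural-log likelihoods of the two nested models.

    The value is floored at zero because the locus-specific variance
    is constrained to be non-negative in variance-components linkage.
    """
    return max(0.0, (loglik_linkage - loglik_null) / math.log(10))

def mb_to_cm(position_mb, cm_per_mb=1.0):
    """Convert a physical position to a genetic position, assuming 1 Mb ~ 1 cM."""
    return position_mb * cm_per_mb

# Hypothetical log-likelihoods for one phenotype at one map position.
example_lod = lod_score(-250.0, -245.4)
```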

Family-based association

Family-based association tests (single-SNP and three-SNP haplotypes) were performed using the program FBAT [10] to examine the null hypothesis of no association and no linkage. Two analyses were conducted for each phenotype: the first used dense HapMap genotypes in 14 trios, and the second used GAW15 genotypes in 14 extended pedigrees.
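To make the logic of the family-based test concrete, the sketch below computes a simplified single-SNP statistic for parent-offspring trios with a quantitative trait. It follows the general FBAT form (trait deviation multiplied by the deviation of the offspring genotype from its expectation given the parents), but it is an illustration under stated assumptions, not the FBAT program: it handles one offspring per family, requires both parental genotypes, uses the sample mean as the trait offset, and does not cover haplotypes or extended pedigrees. All names and data are hypothetical.

```python
import math

def fbat_trio_z(trios, offset=None):
    """Simplified FBAT-style Z statistic for trios.

    `trios` is a list of (father, mother, child, trait) tuples, with
    genotypes coded as the count (0, 1, 2) of one allele.
    Returns None when no family is informative.
    """
    traits = [t[3] for t in trios]
    mu = sum(traits) / len(traits) if offset is None else offset
    u = 0.0  # sum of trait deviation * genotype deviation
    v = 0.0  # variance of u under Mendelian transmission
    for father, mother, child, trait in trios:
        t = trait - mu
        expected = (father + mother) / 2.0            # E[child | parents]
        n_het = (father == 1) + (mother == 1)         # heterozygous parents
        var_x = 0.25 * n_het                          # Var[child | parents]
        u += t * (child - expected)
        v += t * t * var_x
    if v == 0:
        return None
    return u / math.sqrt(v)

def two_sided_p(z):
    """Two-sided p-value from a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical trios: (father, mother, child, child's expression level).
example_trios = [(1, 0, 1, 2.0), (1, 0, 0, 0.0)]
z = fbat_trio_z(example_trios)
```

Only families with at least one heterozygous parent contribute to the statistic, which is one reason family-based tests can have fewer effective observations than tests in unrelated individuals.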

Association in unrelated individuals

Using data on 28 unrelated individuals, we applied analysis of variance (ANOVA) to test associations between the 15 heritable expression levels and genotypes at dense HapMap SNPs surrounding the five candidate genes. Haplotype associations were assessed with score tests using the Splus library HaploStat [8].
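The single-SNP ANOVA can be sketched as a one-way comparison of expression levels across genotype groups. The example below computes only the F statistic and its degrees of freedom; the source does not state whether genotype was modeled with two degrees of freedom or as an additive score, so the grouping shown here is an assumption, and the data are hypothetical.

```python
def group_by_genotype(genotypes, expression):
    """Split expression values into groups defined by genotype (e.g. 0, 1, 2)."""
    groups = {}
    for g, y in zip(genotypes, expression):
        groups.setdefault(g, []).append(y)
    return [groups[g] for g in sorted(groups)]

def anova_f(groups):
    """One-way ANOVA: return (F, df_between, df_within)."""
    groups = [g for g in groups if g]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and n > number of groups")
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation")
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within

# Hypothetical genotypes and log2 expression levels for nine individuals.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expression = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
f_stat, df_b, df_w = anova_f(group_by_genotype(genotypes, expression))
```

The p-value follows from the F distribution with the returned degrees of freedom, which a statistics package would normally supply.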

Results

Linkage analysis

Linkage analysis showed elevated LOD scores in the NFKB1 (FAS and IRF1 expression), NFKB2 (IRF1 and SLC2A5 expression), REL (CD40, BCL2A1, and MYC expression), and RELA regions (CD40, BCL2A1, and BIRC2 expression). Linkage regions and maximum LOD scores for each gene are presented in Tables 1 and 2.

Table 2 Linkage analysis and family-based association tests (FBAT) using GAW15 data

Family-based association tests (FBAT)

Analyses using GAW15-provided genotypes surrounding NFKB1 suggested an association between rs721412 at 111.3 cM and both FAS and IRF1 expression. Haplotypes containing this SNP were also associated with FAS and IRF1 expression (Table 2). Using the HapMap data, we found that rs4648134 at 103.9 cM was associated with the CD80, FAS, and ICAM1 phenotypes across three different methods (association, linkage, and FBAT) (Table 3). Analysis of REL GAW15 data revealed associations between genotypes at rs1363062 and rs1106577 and CD40, BCL2A1, and MYC expression levels (Table 2). FBAT analysis of the denser REL HapMap data did not suggest an association between any SNP or haplotype and any phenotype (Table 4). Using GAW15 data in the RELA region, we found that genotypes of rs1867791 at 44.9 cM had FBAT p-values of 0.02. Haplotype FBAT analysis indicated that two haplotypes were point-wise significantly associated with CD40, BCL2A1, and BIRC2 expression (Table 2). Using HapMap data, genotypes at rs11820062 were associated with each phenotype (p-values ≈ 0.02), and haplotype rs2306365-rs732072-rs11820062 was associated with all phenotypes (p-values ≈ 0.03).

Table 3 Association analysis of NFKB1 using HapMap data
Table 4 Association analysis of REL using HapMap data

Association in unrelated individuals

We examined associations between the 15 expression phenotypes and genotypes at HapMap SNPs. Haplotype analyses overlapped with the single-SNP association results for NFKB1 and REL (Table 1). Among nine phenotypes associated with SNPs in NFKB1, six (BCL2A1, CD44, CD80, ICAM1, IRF1, and VCAM1) had suggestive haplotype associations. All six phenotypes associated with REL SNPs also had suggestive haplotype associations (Table 1). More detailed results are available upon request.

Discussion

We used a variety of methods (association, linkage, and family-based association) in an attempt to understand the relationship between variation in NF-κB genes and the expression levels of 15 NF-κB target genes. We consider this to be an exploratory analysis of publicly available data with a limited sample size, intended to reveal avenues for future study within the NF-κB pathway. As an assessment of these methods, we concluded that haplotype analysis combined with single-SNP analysis, family-based association tests, and linkage analysis helped inform our understanding of the NF-κB pathway. Analyses revealed association and linkage between NFKB1 and the FAS and IRF1 expression phenotypes, and between REL and the CD40 expression phenotype. FAS is a cell surface protein that belongs to the tumor necrosis factor (TNF) receptor family; signals through FAS are able to induce apoptosis. IRF1 is a member of the interferon regulatory transcription factor family, which regulates apoptosis and tumor suppression. CD40 also belongs to the TNF receptor family and is essential in mediating a broad variety of immune and inflammatory responses. Based upon our results, we concluded that variation in the NFKB1 and REL genes may play a role in downstream regulation of FAS, IRF1, and CD40 expression.

There are several limitations to this study, including the lack of adjustment for multiple tests across multiple loci and the small sample size; tests on a sample of 14 families warrant cautious interpretation. No results were statistically significant after taking the multiple comparisons into account. Nonetheless, these exploratory analyses provide clues for further, larger-scale studies.
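To give a sense of how demanding a multiple-comparison adjustment is at this scale, the sketch below applies a Bonferroni correction. The source does not say which correction was considered or how many tests were counted, so both the method and the test count are assumptions; the count used here (15 phenotypes by 106 NFKB1 HapMap SNPs) is taken from the numbers reported above purely as an illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests

def survives_bonferroni(p_values, alpha=0.05):
    """Flag which p-values remain significant after Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]

# Illustrative test count: 15 phenotypes x 106 SNPs (an assumption, see above).
n_tests = 15 * 106
threshold = bonferroni_threshold(0.05, n_tests)
# A point-wise p-value of 0.02 is far above this threshold.
nominal_hit_survives = 0.02 <= threshold
```

Because neighboring SNPs are correlated, Bonferroni is conservative here; it is shown only as the simplest bound.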

Conclusion

We draw three general conclusions. First, single-SNP association testing was less conservative than haplotype and FBAT analysis: where haplotype analyses indicated association, results of single-SNP association testing were also significant; however, association found by single-SNP testing was not always revealed by haplotype analysis. Because these are not simulated data, we do not know whether the single-SNP results represent true or false positives. Second, because haplotype analysis requires two or more SNPs, it might not be appropriate for genes with only one or very few SNPs. Third, FBAT analysis was relatively conservative compared to single-SNP and haplotype association analyses, finding fewer SNPs and haplotypes with point-wise significance. In summary, we suggest that single-SNP and haplotype association analyses be used in first-stage analysis to generate a smaller set of candidate SNPs; FBAT and linkage analysis can then narrow down the list of potentially important loci.