Background

Elevated triglyceride and cholesterol levels are two risk factors for cardiovascular diseases. These risk factors are often correlated with each other. In order to map the possible pleiotropic/clustered genes underlying the inheritance of these two traits, we performed a bivariate linkage analysis using a score statistic developed by Wang [1]. This score statistic is asymptotically equivalent to the likelihood ratio statistic and is straightforward to compute. We apply this statistic to data from Cohort 1 and Cohort 2 of the Framingham Heart Study.

Methods

Data

Participants in Cohort 1 had up to 16 reported cholesterol levels and up to 3 reported triglyceride levels. For participants in Cohort 2, cholesterol and triglyceride levels were reported up to 5 times. Together, the two cohorts provided 22,040 measurements of cholesterol level and 9,155 measurements of triglyceride level (including all repeated measurements on all individuals). Individuals with no cholesterol measurement or no triglyceride measurement were excluded. A single linear regression of cholesterol on age was fit to the pooled data, across all individuals and all repeated measurements. The residuals from this fit were averaged within each individual, and the average was used as the age-adjusted cholesterol level for that individual. The same method was used to obtain the age-adjusted triglyceride level for each individual. Sib pairs from the same nuclear family, or from different nuclear families within the same pedigree, were treated as independent of one another. For univariate traits, there are reports showing that treating dependent sib pairs as independent does not increase the type I error rate of the test [2].
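The age adjustment described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `age_adjust`, the `(person_id, age, value)` record layout, and the toy numbers are all assumptions made for the example.

```python
from collections import defaultdict


def age_adjust(records):
    """records: list of (person_id, age, value) tuples, one per measurement.

    Fits one pooled least-squares line value ~ age over all measurements,
    then returns {person_id: mean residual for that person}.
    """
    n = len(records)
    mean_age = sum(age for _, age, _ in records) / n
    mean_val = sum(val for _, _, val in records) / n
    sxx = sum((age - mean_age) ** 2 for _, age, _ in records)
    sxy = sum((age - mean_age) * (val - mean_val) for _, age, val in records)
    slope = sxy / sxx
    intercept = mean_val - slope * mean_age

    # Collect the residuals of each person, then average them.
    residuals = defaultdict(list)
    for pid, age, val in records:
        residuals[pid].append(val - (intercept + slope * age))
    return {pid: sum(r) / len(r) for pid, r in residuals.items()}


# Hypothetical toy data: person id, age at exam, cholesterol level.
records = [
    ("A", 40, 200.0), ("A", 50, 215.0),
    ("B", 40, 180.0), ("B", 60, 205.0),
    ("C", 55, 230.0),
]
adjusted = age_adjust(records)
```

The same function would be applied separately to the triglyceride measurements.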

All sib pairs in all the pedigrees in Cohort 1 and Cohort 2 were generated, but not all of these sib pairs could be used at the same time because of missing marker data. Genetic Analysis Workshop 13 (GAW13) provided identity-by-descent (IBD) sharing probabilities for some relative pairs (including sib pairs) at each of the scanned markers. For a given sib pair, the IBD sharing probabilities were available only at some markers. To simplify the programming, we excluded markers at which fewer than 1000 sib pairs had IBD sharing probabilities available. Then, for each chromosome, we used only those sib pairs whose IBD sharing probabilities were available at all the remaining markers on that chromosome. See Table 1 for a summary of the number of markers excluded and the number of sib pairs used for each chromosome.
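The two-step selection for one chromosome might look like the sketch below. The dictionary layout (marker name mapped to the set of sib-pair identifiers with IBD probabilities) and the function name are hypothetical; only the threshold of 1000 pairs comes from the text.

```python
def select_markers_and_pairs(ibd_available, min_pairs=1000):
    """ibd_available: {marker: set of sib-pair ids with IBD probabilities}
    for the markers of ONE chromosome.

    Step 1 drops markers with fewer than min_pairs informative sib pairs.
    Step 2 keeps only the sib pairs available at every remaining marker.
    Returns (kept_markers, usable_pairs).
    """
    kept = [m for m, pairs in ibd_available.items() if len(pairs) >= min_pairs]
    if not kept:
        return [], set()
    usable = set.intersection(*(ibd_available[m] for m in kept))
    return kept, usable


# Toy example with a small threshold so the behaviour is visible.
toy = {
    "m1": {1, 2, 3, 4},
    "m2": {2, 3, 4, 5},
    "m3": {1, 2},  # too few pairs, so this marker is dropped
}
kept, usable = select_markers_and_pairs(toy, min_pairs=3)
```

Dropping sparse markers first keeps the intersection in the second step from shrinking to a handful of sib pairs.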

Table 1 Number of markers available and analyzed on each chromosome

Analysis

The bivariate score statistic is computed from the observed phenotypic data on sib pairs. The phenotypic data of a sib pair can be written as a vector of four (adjusted) measurements: the cholesterol levels of sib 1 and sib 2, followed by the triglyceride levels of sib 1 and sib 2. Let $x_i$ be the phenotypic data on the $i$th sib pair and $\Sigma_0$ be the sample variance-covariance matrix of the $x_i$. Because each adjusted value is an average of regression residuals, the sample mean of the cholesterol levels on sib 1 and sib 2 is 0, as is the sample mean of the triglyceride levels. $\Sigma_0$ is a 4 × 4 symmetric matrix whose $(j,k)$ element is denoted by $a_{jk}$. Note that $a_{11}$ and $a_{33}$ are the variances of the cholesterol and triglyceride levels, respectively, of the first sib in the pairs. Similarly, $a_{22}$ and $a_{44}$ are the variances of the cholesterol and triglyceride levels of the second sib. The off-diagonal terms are covariances: $a_{13} = a_{31}$ is the covariance between cholesterol and triglycerides for the first sib, and $a_{24} = a_{42}$ is the corresponding covariance for the second sib. Since the sib-sib relationship in a sib pair is symmetric, we expect that $a_{11} \approx a_{22}$, $a_{33} \approx a_{44}$, and $a_{13} \approx a_{24}$ when the sample size is large. Alternatively, we can use the (adjusted) cholesterol and triglyceride measurements on all sibs, without distinguishing sib 1 from sib 2, to calculate the entries of $\Sigma_0$. In that case $a_{11} = a_{22}$, $a_{33} = a_{44}$, and $a_{13} = a_{24}$ hold exactly. Since the sample size is fairly large, we expect both methods to give similar $\Sigma_0$.
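Both ways of estimating the variance-covariance matrix can be sketched in a few lines. The second function pools the sibs by adding each pair a second time with sib 1 and sib 2 swapped; this is one possible reading of "do not distinguish sib 1 from sib 2", assumed here for illustration, and the function names and toy values are hypothetical.

```python
def sample_cov(rows):
    """Sample variance-covariance matrix (divisor n - 1) of equal-length rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]


def symmetrized_cov(rows):
    """Pool sib 1 and sib 2 by adding each pair again with the sibs swapped.

    Row layout: (chol_sib1, chol_sib2, tg_sib1, tg_sib2).
    """
    swapped = [(c2, c1, t2, t1) for c1, c2, t1, t2 in rows]
    return sample_cov(list(rows) + swapped)


# Hypothetical age-adjusted values for five sib pairs.
pairs = [
    (12.0, 8.0, -5.0, 3.0),
    (-7.0, -2.0, 10.0, 4.0),
    (3.0, 9.0, -1.0, -6.0),
    (-8.0, -15.0, -4.0, -1.0),
    (1.0, 2.0, 0.5, 1.5),
]
sigma0 = sample_cov(pairs)           # a11 and a22 are only approximately equal
sigma0_sym = symmetrized_cov(pairs)  # a11 = a22, a33 = a44, a13 = a24
```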

Define

$$w_i = (w_{i1}, w_{i2}, w_{i3}, w_{i4})^t = \Sigma_0^{-1} x_i$$

and

$$z_i = w_{i1} w_{i2} a_{11} + (w_{i1} w_{i3} + w_{i2} w_{i4}) a_{13} + w_{i3} w_{i4} a_{33}.$$
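As a rough sketch, the transformed vector and the per-pair quantity defined above could be computed as follows. The formula is transcribed as written in the text. The helper names `solve` and `z_value` are hypothetical, and solving the linear system directly rather than inverting the matrix is a choice made for this example.

```python
def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination
    with partial pivoting (a is a square list of lists)."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w


def z_value(sigma0, x):
    """z for one sib pair, with x = (chol_sib1, chol_sib2, tg_sib1, tg_sib2)."""
    w1, w2, w3, w4 = solve(sigma0, x)  # w = inverse(sigma0) applied to x
    a11, a13, a33 = sigma0[0][0], sigma0[0][2], sigma0[2][2]
    return w1 * w2 * a11 + (w1 * w3 + w2 * w4) * a13 + w3 * w4 * a33


# With an identity matrix, w equals x and only the a11 and a33 terms remain.
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
z_example = z_value(identity, [1.0, 2.0, 3.0, 4.0])
```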

Denote the proportion of alleles shared IBD by the $i$th sib pair by $\pi_i$. Let $\bar{\pi}$ and $\bar{z}$ be the sample means of $\{\pi_i\}$ and $\{z_i\}$, respectively. Define

$$b = \sum_{i=1}^{N} (\pi_i - \bar{\pi})(z_i - \bar{z}),$$

where $N$ is the total number of sib pairs. When the putative locus is not linked to any trait locus, the expectation of $b$ is 0 and its variance is $\mathrm{Var}(b) = N s_\pi^2 s_z^2$, where $s_\pi^2$ and $s_z^2$ are the sample variances of $\{\pi_i\}$ and $\{z_i\}$, respectively. The score statistic $S$ for the bivariate phenotypes is defined by $S = b^2/\mathrm{Var}(b)$ if $b > 0$, and $S = 0$ otherwise. When the putative locus is not linked to any quantitative trait locus (QTL), the asymptotic distribution of this one-sided test statistic is $0.5\chi^2_0 + 0.5\chi^2_1$ [1]. The score statistic $S$ is a special case of the statistic described by Wang [1], in which the locus-specific variances and covariance for the two traits are assumed to be proportional to their total variances and covariance.
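The statistic and its p-value at one marker can be sketched as below. Two assumptions are made for this example: b is taken to be the sum over sib pairs of the products of the centered IBD proportions and centered z values, which is the form consistent with the stated variance, and the sample variances use the n - 1 divisor. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def score_statistic(pi, z):
    """Return (S, p_value) for IBD proportions pi and z values (equal length)."""
    n = len(pi)
    pi_bar, z_bar = mean(pi), mean(z)
    # Assumed form of b: sum of cross-products of the centered values.
    b = sum((p - pi_bar) * (v - z_bar) for p, v in zip(pi, z))
    var_b = n * variance(pi) * variance(z)  # N * s_pi^2 * s_z^2 (n - 1 divisor)
    s = b * b / var_b if b > 0 else 0.0
    # Null distribution 0.5*chi2_0 + 0.5*chi2_1, where chi2_0 is a point mass
    # at 0. For s > 0 only the chi2_1 half contributes, and
    # P(chi2_1 > s) = erfc(sqrt(s / 2)).
    p_value = 0.5 * math.erfc(math.sqrt(s / 2)) if s > 0 else 1.0
    return s, p_value


# Toy data: pairs sharing more alleles IBD have larger z, so b > 0.
s_pos, p_pos = score_statistic([0.0, 0.5, 1.0, 0.5], [-1.0, 0.0, 1.0, 0.0])
# Reversed relationship: b < 0, so the one-sided statistic is set to 0.
s_neg, p_neg = score_statistic([0.0, 0.5, 1.0, 0.5], [1.0, 0.0, -1.0, 0.0])
```

Because the test is one-sided, a negative b carries no evidence for linkage and yields a p-value of 1 in this sketch.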

Results

The score statistic S was calculated for every screened marker. In addition, the univariate score statistic of Wang and Huang [3] was calculated for cholesterol level and triglyceride level separately. For sib-pair data, the type of data used in our analyses, this univariate score statistic is equivalent to other methods [4, 5]. The p-values of these three score statistics (one for the bivariate phenotypes and one for each of the two univariate phenotypes) at each marker location are plotted in Figure 1. Markers with p-values below the significance level of α = 0.005 are shown in Table 2.

Figure 1 p-values for the three statistics throughout the genome. Bivariate score statistic (A), univariate score statistic for adjusted cholesterol level (B), and univariate score statistic for adjusted triglyceride level (C).

Table 2 Summary of markers that are significant at significance level 0.005

At the significance level 0.005, 10 markers were identified by the bivariate score statistic: 2 each on chromosomes 1 (at 212 cM and 233 cM) and 7 (at 128 cM and 155 cM), and 1 each on chromosomes 3 (at 112 cM), 4 (at 105 cM), 5 (at 19 cM), 6 (at 166 cM), 8 (at 140 cM), and 16 (at 64 cM). Five of the 10 markers were also identified by the univariate score statistic for the age-adjusted triglyceride level: the two on chromosome 1, one on chromosome 7 (at 155 cM), the one on chromosome 8, and the one on chromosome 16. None of the 10 markers was identified by the univariate score statistic for the age-adjusted cholesterol level. Thus the linkage signals of the bivariate score statistic overlapped substantially with those of the univariate score statistic for triglyceride level, but not at all with those for cholesterol level. The remaining 5 markers were identified by the bivariate score statistic but by neither univariate score statistic. Three markers had p-values below 0.001: one on chromosome 1 at 212 cM, one on chromosome 8 at 140 cM, and one on chromosome 16 at 64 cM. The regions suggested by these 3 markers may be investigated in future genotyping and analysis.

Discussion

We performed a bivariate analysis of cholesterol and triglyceride levels on sib-pair data from the Framingham Heart Study using a method recently developed by Wang [1]. This method is asymptotically equivalent to the likelihood ratio statistic but is straightforward to calculate. We also calculated the univariate score statistics for cholesterol and triglyceride levels separately. Five markers were identified by both the bivariate score statistic and the univariate score statistic for the age-adjusted triglyceride level, while the results of the bivariate score statistic had no overlap with those of the univariate score statistic for the age-adjusted cholesterol level.

The method in Wang [1] is general enough to handle general pedigrees, but we applied it only to sib pairs extracted from general pedigrees, because the programming for sib pairs is relatively easy and was feasible given the time constraint for GAW13. Some linkage information may have been lost because dependent sib pairs were treated as independent sib pairs, but the type I error rate of the test statistic is expected to remain valid.

In a related study, Shearman et al. [6] used the ratio of triglyceride level to high-density lipoprotein cholesterol level as the phenotype of interest. Linkage evidence was reported at marker GATA112F07 (155 cM on chromosome 7), a marker that gave a p-value of 0.0020 for the bivariate score statistic used in the current report. These authors also reported a LOD score of 1.5 at 70 cM on chromosome 16 with multipoint mapping. We used single-point IBD sharing probabilities with the bivariate score statistic and obtained a significant linkage signal (p = 0.0001) for marker ATA55A11 (64 cM on chromosome 16), 6 cM away from the locus they identified. Other markers in Table 2 that have small p-values for the bivariate or univariate score statistics but did not show evidence for linkage in Shearman et al. [6] include GATA48B01, 036yb8, GATA21C12, and GATA3F02.

One caveat about bivariate analyses is that they are not always more powerful than univariate analyses. Theoretical [7] and simulation studies [1, 8, 9] demonstrate that when the polygenic correlation is in the same direction as the major gene correlation, a bivariate analysis may have lower power than a univariate analysis.