Multivariate modelling of endophenotypes associated with the metabolic syndrome in Chinese twins

Abstract

Aims/hypothesis

The common genetic and environmental effects on endophenotypes related to the metabolic syndrome have been investigated using bivariate and multivariate twin models. This paper extends the pairwise analysis approach by introducing independent and common pathway models to Chinese twin data. The aim was to explore the common genetic architecture in the development of these phenotypes in the Chinese population.

Methods

Three multivariate models including the full saturated Cholesky decomposition model, the common factor independent pathway model and the common factor common pathway model were fitted to 695 pairs of Chinese twins representing six phenotypes including BMI, total cholesterol, total triacylglycerol, fasting glucose, HDL and LDL. Performances of the nested models were compared with that of the full Cholesky model.

Results

Cross-phenotype correlation coefficients gave clear indication of common genetic or environmental backgrounds in the phenotypes. Decomposition of phenotypic correlation by the Cholesky model revealed that the observed phenotypic correlation among lipid phenotypes had genetic and unique environmental backgrounds. Both pathway models suggest a common genetic architecture for lipid phenotypes, which is distinct from that of the non-lipid phenotypes. The declining performance with model restriction indicates biological heterogeneity in development among some of these phenotypes.

Conclusions/interpretation

Our multivariate analyses revealed common genetic and environmental backgrounds for the studied lipid phenotypes in Chinese twins. Model performance showed that physiologically distinct endophenotypes may follow different genetic regulations.

Introduction

The metabolic syndrome is a complex and composite disorder in which multiple phenotypes pertaining to morphological (e.g. BMI) and biochemical (e.g. fasting glucose, total cholesterol, triacylglycerol, etc.) alterations of an affected individual are implicated. Genetic epidemiological studies using related individuals, including twins, have shown that genetic and environmental factors make important contributions to the development of the disorder and its associated phenotypes. Instead of phenotype-wise univariate analysis, there has been increasing interest in the study of the so-called pleiotropic effects that affect multiple traits associated with the metabolic syndrome. Such approaches have used bivariate [1–3] and multivariate [4] modelling, with results indicating common genetic and environmental mechanisms among sub-clusters of phenotypes related to the metabolic syndrome.

Common and independent pathway models represent an important multivariate modelling approach in twin data analysis. The pathway models have been applied in twin studies of complex human traits, including, for example, behaviour [5, 6] and cardiovascular disorders [7], and have revealed common genetic and environmental factors underlying the sub-phenotypes associated with these composite phenotypes. Likewise, pathway modelling of endophenotypes associated with the metabolic syndrome could help identify the pleiotropy in the development of the disorder’s sub-phenotypes, eventually leading to more efficient treatment and prevention strategies. This paper reports results from our pathway modelling of multiple sub-phenotypes of the metabolic syndrome collected by the Qingdao Twin Registry (Qingdao Center for Disease Control and Prevention, Qingdao) in China, where prevalence of metabolic disorders is increasing rapidly due to changing life style and dietary structures, both of which have been undergoing steady westernisation.

Methods

Participants

Since 2001, when the Qingdao Twin Registry, the largest twin registry in China, was established at the Qingdao Center for Disease Control and Prevention, anthropometric and experimental data on metabolic phenotypes have been collected from monozygotic and dizygotic twins in three phases, conducted in 2001, 2004 and 2008. At each phase, twins were sampled through the local disease control network and residence registry. Those who were pregnant, breastfeeding, had known diabetes and/or cardiovascular disease, or had taken weight-reducing medication within 1 month of sampling were excluded; incomplete twin pairs were also excluded. A total of 695 pairs of twins (405 monozygotic, 290 dizygotic pairs) with a mean age of 37 years were available for analysis. Zygosity of same-sex twin pairs was determined by DNA testing using 16 short tandem repeat DNA markers.

All participants gave informed consent and the study was approved by the local Ethics Committee at Qingdao Center for Disease Control and Prevention.

Phenotypes

Balancing representativeness against the demands of pathway modelling in a sample of limited size, we selected six phenotypes associated with the metabolic syndrome: BMI, LDL, HDL, fasting glucose, total cholesterol and triacylglycerol. All phenotypes were measured using standard procedures. For individuals with repeated measurements over the three phases, only the earliest measurement was kept, to avoid the correlated data structure that arises from multiple measurements on the same individual. All phenotype values were natural-log-transformed to reduce skewness in the phenotype distributions. Any measurement more than 3 SD below or above the mean was treated as missing.
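The transformation and outlier rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and it assumes the 3 SD rule was applied on the log scale, which the text does not state explicitly.

```python
import math
import statistics


def log_transform_and_trim(values, n_sd=3.0):
    """Natural-log transform, then mark values beyond n_sd SDs of the mean as missing.

    Assumption: trimming is done on the log scale; missing values are
    represented by None.
    """
    logged = [math.log(v) for v in values]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)  # sample SD (n - 1 denominator)
    return [x if abs(x - mean) <= n_sd * sd else None for x in logged]
```

Because a single extreme value inflates the SD it is judged against, a rule of this kind only flags values that are far from the bulk of a reasonably large sample.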

Data analysis

We introduced and fitted three multivariate models, the Cholesky decomposition [8] (also known as the full saturated model), the common factor independent pathway (IP) model (also called the biometric model) and the common factor common pathway (CP) model (the psychometric model) [9]. The Cholesky decomposition includes as many factors as phenotypes (six in this study) for each independent source of variance (additive genetic, A; common environment, C; and unique environment, E), with the first factor loading on all phenotypes, the second on all but the first, the third on all but the first two and so on. The saturated least restrictive Cholesky decomposition provides the fullest potential explanation of the data and thus can serve as the base model for comparison with the nested IP and CP models.
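To make the structure of the Cholesky decomposition concrete, the sketch below builds the expected covariances from three lower-triangular path matrices. It assumes the standard classical twin design, in which monozygotic pairs share all additive genetic effects, dizygotic pairs share half of them, and the common environment is fully shared within a pair. The function names are hypothetical and the code is independent of Mx.

```python
def llt(L):
    """Return L L^T for a square matrix given as a list of rows."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cholesky_ace(La, Lc, Le):
    """Expected covariances under a Cholesky ACE model.

    La, Lc, Le are lower-triangular path matrices for the A, C and E sources.
    Returns the within-person covariance and the cross-twin covariances for
    MZ and DZ pairs (standard twin-design assumptions, see lead-in).
    """
    A, C, E = llt(La), llt(Lc), llt(Le)
    n = len(A)
    within = [[A[i][j] + C[i][j] + E[i][j] for j in range(n)] for i in range(n)]
    cross_mz = [[A[i][j] + C[i][j] for j in range(n)] for i in range(n)]
    cross_dz = [[0.5 * A[i][j] + C[i][j] for j in range(n)] for i in range(n)]
    return within, cross_mz, cross_dz


def n_cholesky_paths(n_phenotypes):
    """Free path coefficients: one lower triangle per source (A, C, E)."""
    return 3 * n_phenotypes * (n_phenotypes + 1) // 2
```

With six phenotypes, `n_cholesky_paths(6)` gives 63 path coefficients (means and covariate effects not counted), which is why the Cholesky model is the least restrictive of the three.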

The CP model is a restrictive model under the hypothesis that the covariance between phenotypes is due to a single shared latent phenotype determined in turn by latent genetic and environmental factors (Fig. 1). In contrast, the IP model is a more flexible factor model under the hypothesis that the variance and covariance between phenotypes are caused by one or more common factors, with the residual variance accounted for by phenotype-specific genetic and environmental effects (Fig. 2). Both the IP and CP models are nested sub-models of the fully saturated Cholesky model, so model comparison can be done using a likelihood ratio test with the degrees of freedom set to the difference in the number of estimated parameters. As a balance between model fit and parsimony in the number of parameters, Akaike’s Information Criterion (AIC) [10] was also used to determine the suitability of models, with the best model being that with the lowest AIC.

Fig. 1 The common factor CP model restricts the common genetic and environmental factors, leading to the same genetic and environmental architecture for all phenotypes by introducing a single latent phenotype. GLU, fasting glucose; TC, total cholesterol; TG, triacylglycerol. Ac, Cc, and Ec are the additive genetic, shared and unshared environmental components for the common factors; As, Cs, and Es are the additive genetic, shared and unshared environmental components for the phenotype-specific effects.

Fig. 2 The common factor IP model assumes common genetic and environmental factors but with different effects on the six phenotypes, while restricting and decomposing the cross-phenotype correlation into common genetic and environmental (including shared and unique) components. GLU, fasting glucose; TC, total cholesterol; TG, triacylglycerol. Ac, Cc, and Ec are the additive genetic, shared and unshared environmental components for the common factors; As, Cs, and Es are the additive genetic, shared and unshared environmental components for the phenotype-specific effects.
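The difference between the two pathway models in Figs 1 and 2 can be expressed through the covariance that each source of variance (A, C or E) contributes. The sketch below is illustrative only: the function names are hypothetical, and it assumes a single common factor per source in the IP model, as drawn in Fig. 2.

```python
def ip_component(common, specific):
    """Covariance from one source (A, C or E) in the IP model.

    common:   loadings of the phenotypes on that source's common factor
    specific: phenotype-specific path coefficients for that source
    """
    n = len(common)
    return [[common[i] * common[j] + (specific[i] ** 2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cp_component(loadings, latent_path, specific):
    """Covariance from one source (A, C or E) in the CP model.

    loadings:    loadings of the phenotypes on the single latent phenotype
    latent_path: path from this source to the latent phenotype
    specific:    phenotype-specific path coefficients for this source
    """
    n = len(loadings)
    return [[(latent_path ** 2) * loadings[i] * loadings[j]
             + (specific[i] ** 2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]
```

In the CP sketch the same loading vector is used for all three sources, so the shared parts of the A, C and E covariances are forced to be proportional to one another. In the IP sketch each source has its own loading vector, which is what makes that model less restrictive.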

In all model fitting, age and sex were included as covariates to adjust for their effects on the six phenotypes. Robustness of the analysis was assessed using bootstrap re-sampling to calculate empirical 95% CIs for the estimated parameters.
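A percentile bootstrap of the kind mentioned above can be sketched as follows. The text does not specify the resampling scheme, so this sketch makes two assumptions: whole twin pairs are resampled with replacement (to preserve within-pair dependence), and the interval is taken from order statistics of the bootstrap estimates. `estimator` is a hypothetical placeholder for whatever refits the model and returns one parameter.

```python
import math
import random


def bootstrap_ci(pairs, estimator, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap CI, resampling whole twin pairs with replacement.

    pairs:     sequence of twin-pair records
    estimator: function mapping a resampled list of pairs to one number
    """
    rng = random.Random(seed)
    n = len(pairs)
    estimates = sorted(
        estimator([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[math.floor(alpha * (n_boot - 1))]
    upper = estimates[math.ceil((1.0 - alpha) * (n_boot - 1))]
    return lower, upper
```

In a twin analysis the resampling would typically be done separately within the monozygotic and dizygotic groups so that each replicate keeps the original group sizes; that refinement is omitted here for brevity.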

All twin modelling was performed using the Mx package, which is freely available at www.psy.vu.nl/mxbib/ (accessed 17 March 2010).

Results

Table 1 shows the cross-phenotype Pearson correlation coefficients and basic statistics for each of the six phenotypes, adjusted for age and sex. Except for fasting glucose with HDL and LDL, all other combinations displayed significant cross-phenotype co-variation with: (1) high correlation between total cholesterol and LDL (0.84); (2) moderate to high correlation between LDL and HDL (0.62), and total cholesterol and HDL (0.58); (3) moderate correlation between triacylglycerol and total cholesterol (0.45), triacylglycerol and LDL (0.41), and triacylglycerol and BMI (0.39); and (4) low but significant correlation between BMI and all other phenotypes, and between triacylglycerol and HDL and fasting glucose. All these significant cross-phenotype correlations give a clear indication of common genetic or environmental backgrounds among the six endophenotypes to be decomposed by twin modelling.

Table 1 Inter-phenotype correlation, mean and standard deviation for all log-transformed phenotypes

The genetic and environmental effects in phenotype correlation were first modelled by the full Cholesky decomposition with genetic and environmental correlation coefficients estimated (Table 2). Here, correlations observed were as follows: (1) very high genetic correlations for total cholesterol and LDL (0.93); (2) moderate to high genetic correlation for HDL with total cholesterol (0.73) and HDL with LDL (0.68); (3) very high shared environmental correlation between total cholesterol and triacylglycerol (0.97); (4) moderate to high shared environmental correlation between LDL and total cholesterol (0.75), LDL and triacylglycerol (0.70), BMI and LDL (0.62), BMI and total cholesterol (0.60), and BMI and triacylglycerol (0.76); (5) moderate to high unique environmental correlation between total cholesterol and HDL (0.78), total cholesterol and LDL (0.88), and between LDL and HDL (0.63).

Table 2 Estimated genetic and environmental correlation coefficients for the full Cholesky decomposition model
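Genetic and environmental correlations of the kind reported for the Cholesky model are obtained by standardising each component covariance matrix (A, C or E) by the square roots of its diagonal. The sketch below shows that step for one component; the function name is hypothetical and the input matrix is assumed to have strictly positive variances on the diagonal.

```python
import math


def component_correlations(cov):
    """Convert one component covariance matrix (A, C or E) to correlations.

    r_ij = cov_ij / sqrt(cov_ii * cov_jj); assumes every cov_ii > 0.
    """
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]
```

A genetic correlation close to 1 between two phenotypes therefore says that their genetic influences largely overlap, irrespective of how much of each phenotype's total variance is genetic.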

In the IP model (Fig. 2), strong factor loadings for common factor genetic effect, shared environmental effect and unique environmental effects were found for total cholesterol and LDL. Strong factor loadings were also found with triacylglycerol for common factor genetic effect and with HDL for non-genetic common factors. The above suggests common genetic and environmental mechanisms among these lipid phenotypes (Table 3). Common factor genetic effect was also estimated for BMI and accounts for 15% of the total variation in BMI. Of special interest is the finding that high proportions of variation in BMI and fasting glucose are due to their unique pathways and include genetic, shared environmental and unique environmental components, with specific genetic effect accounting for as high as 61% of variation in BMI. The results mean that these two phenotypes are more distinct from the lipid phenotypes. Even among the lipid phenotypes, about 40% of variation in HDL can be ascribed to its specific pathway.

Table 3 Estimates and corresponding variance components for the IP model

In the more restricted CP model (Fig. 1), the latent phenotype was mainly characterised by a common genetic factor (Table 4) with 55% of variance accounted for by the genetic factor (path coefficient 0.74), 18% by shared environment (path coefficient 0.42) and 27% by unique environment (path coefficient 0.52). Factor loadings on the latent phenotype were mainly by the four lipid phenotypes, i.e. LDL (0.24), triacylglycerol (0.23), total cholesterol (0.22) and HDL (0.17). Similarly to the IP model, BMI and fasting glucose were mainly characterised by their phenotype-specific genetic and environmental mechanisms (Table 4).

Table 4 Estimates and corresponding variance components for the CP model
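The variance proportions quoted for the latent phenotype follow from squaring the reported path coefficients. The short check below uses only the three coefficients given in the text; the variable names are illustrative.

```python
# Path coefficients from the latent sources to the latent phenotype (from the text).
paths = {"A": 0.74, "C": 0.42, "E": 0.52}

# Share of latent-phenotype variance attributed to each source.
variance_share = {source: coeff ** 2 for source, coeff in paths.items()}
percent = {source: round(100 * share) for source, share in variance_share.items()}
total = sum(variance_share.values())

print(percent)           # rounded percentages per source
print(round(total, 4))   # sum of squared paths
```

The squared coefficients reproduce the 55%, 18% and 27% reported above and sum to approximately 1 (0.9944 with the rounded coefficients), which is what would be expected if the latent phenotype is scaled to unit variance.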

In Table 5, performance of the three multivariate models is compared using AIC and the likelihood ratio test, with the full additive genetic, common environment, and unique environment (source of variance) (ACE) Cholesky model as the baseline model. The results in Table 5 show that the full ACE Cholesky model performed significantly better than the restricted IP and the CP models. Comparing the AICs, it is clear that the more restricted the model, the worse the performance.

Table 5 Comparison of model performance for the three multivariate models
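The nested-model comparison can be sketched as below, taking minus twice the log-likelihood and the number of free parameters of each model as inputs. This is a generic illustration with hypothetical function names: the chi-square tail probability uses the standard closed forms for integer degrees of freedom, and the AIC is written in its textbook form, which may differ by a constant from the convention used by a given modelling package.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(chi / math.sqrt(2.0))
            + math.sqrt(2.0 / math.pi) * math.exp(-half) * total)


def likelihood_ratio_test(m2ll_full, k_full, m2ll_nested, k_nested):
    """Return (statistic, df, p) for a nested model against the full model."""
    stat = m2ll_nested - m2ll_full
    df = k_full - k_nested
    return stat, df, chi2_sf(stat, df)


def aic(m2ll, k):
    """Textbook AIC: -2 log-likelihood plus twice the number of parameters."""
    return m2ll + 2 * k
```

A small p value indicates that the restrictions imposed by the nested model significantly worsen the fit relative to the full model, while the AIC additionally rewards the nested model for using fewer parameters.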

Discussion

There have been many multivariate studies on phenotypes related to the metabolic syndrome [3, 4] or obesity [2] using bivariate or multivariate Cholesky decompositions. The current study is characterised by two points. First, comparative modelling of multiple phenotypes was done by fitting the full saturated Cholesky model and its nested sub-models, with restrictions in common and independent pathways, aiming to simultaneously evaluate the genetic and environmental correlations among multiple phenotypes. Going beyond pairwise analysis [1, 3], our multivariate approach allows the fitting of pathway models with a view to examining the shared genetic and environmental mechanisms among groups of phenotypes. Second, our study included multiple phenotypes of biochemical measurements instead of closely correlated morphological measurements, and assumed that physiologically related biochemical endophenotypes could share a common genetic architecture.

As shown in Table 1, modest to high cross-phenotype correlations were mainly seen in the lipid phenotypes, i.e. among total cholesterol, HDL and LDL. Fasting glucose was least correlated with all phenotypes except BMI. This is interesting because such a pattern in phenotypic correlation could indicate that: (1) glucose and lipid metabolisms could be controlled by different biological processes; and (2) BMI is representative of all endophenotypes and thus a valuable measurement for studying metabolic disorders.

By decomposing phenotypic correlation into additive genetic, shared and unique environmental correlations, the variable estimates by Cholesky decomposition (Table 2) further revealed that the observed phenotype correlation among lipid phenotypes had both genetic and unique environmental backgrounds. High common environmental correlations are estimated to explain the observed phenotypic correlations of BMI and triacylglycerol with all phenotypes. Our results could indicate that cross-phenotype correlation between BMI and other phenotypes may lack a strong genetic background. Consequently, genes identified as affecting BMI may not overlap with those associated with other endophenotypes.

Assuming the existence of common genetic and environmental factors but with different effects on the six phenotypes, the IP model restricts and decomposes the cross-phenotype correlation into common genetic and environmental (including shared and unique) components (Fig. 2). As expected, the factor loadings for the common factors are mainly on the lipid phenotypes (Table 3), meaning that more genetic components are shared among these phenotypes than with the other phenotypes. In a recent large-scale genome-wide association (GWA) analysis on loci affecting lipid levels, many loci were associated with multiple lipid phenotypes including total cholesterol, HDL, LDL and triacylglycerol [11]. In contrast, fasting glucose was largely (95%) affected by its own phenotype-specific pathway. For BMI, the phenotype-specific pathway accounted for more than 80% of its total variance, thus distinguishing it from the lipid phenotypes. Of special interest is HDL, which was affected by both the common factor (59%) and phenotype-specific (41%) pathways. Although the common factor pathway explained more variance than the phenotype-specific pathway, it is non-genetic, which suggests that HDL may be subject to independent genetic regulation that is different from that of the other lipid phenotypes.

The CP model further restricts the common genetic and environmental factors in the IP model leading to the same genetic and environmental architecture for all phenotypes by introducing a single latent phenotype (Fig. 1). As expected, factor loading from the latent phenotype was mainly by the lipid phenotypes, with loading for non-lipid phenotypes as low as 0.02 for fasting glucose and 0.03 for BMI (Table 4). This suggests that the latent variable could represent a common mechanism for the lipid phenotypes. Similarly to the IP model, BMI, fasting glucose and HDL had high factor loading on their phenotype-specific pathways.

Although meaningful outputs were obtained by all three models, results from model comparison in Table 5 imply that the greater the restriction in the pathway assumption, the worse the model performance, indicating biological heterogeneity in the development of these phenotypes. All our models point to a clear distinction between most of the lipid phenotypes and the non-lipid phenotypes. Table 6 shows the performance of the full, CP and IP models when applied only to the four lipid phenotypes, with data on BMI and fasting glucose dropped. As in Table 5, the performance of the nested CP and IP models was again compared with that of the full model. However, in contrast to Table 5, the CP model (AIC −13,383.14) outperformed the IP model (AIC −13,193.28), giving a clear indication of shared genetic and environmental mechanisms among the four lipid phenotypes (HDL, LDL, triacylglycerol and total cholesterol), a finding consistent with conclusions from an American study on lipid phenotypes using a pairwise approach [1].

Table 6 Comparison of model performance for the three multivariate models applied to lipid phenotypes

The existence of genetic pleiotropy in lipid phenotypes can have an important implication for gene mapping practice. It has already been shown that, by a joint analysis of multiple related phenotypes, multivariate linkage analysis can have increased power in mapping pleiotropic genes [12–14]. Moreover, current rapid development in high-throughput genotyping techniques is enabling large-scale GWA studies for the mapping of common genetic variations with low to modest effects on possibly multiple phenotypes including lipid phenotypes. For example, a recent GWA study identified genetic variants that influence obesity and osteoporosis phenotypes [15]. Based on our results, future studies may well find that pleiotropic genes affect lipid phenotypes in the Chinese population.

Abbreviations

ACE: Additive genetic, common environment, and unique environment (source of variance)
AIC: Akaike’s Information Criterion
CP: Common pathway model
GWA: Genome-wide association
IP: Independent pathway model

References

  1. Feitosa MF, Rice T, Rankinen T et al (2005) Common genetic and environmental effects on lipid phenotypes: the HERITAGE family study. Hum Hered 59:34–40
  2. Hasselbalch AL, Benyamin B, Visscher PM, Heitmann BL, Kyvik KO, Sørensen TI (2008) Common genetic components of obesity traits and serum leptin. Obesity (Silver Spring) 16:2723–2729
  3. Zhang S, Liu X, Yu Y et al (2009) Genetic and environmental contributions to phenotypic components of metabolic syndrome: a population-based twin study. Obesity (Silver Spring) 17:1581–1587
  4. Benyamin B, Sørensen TI, Schousboe K, Fenger M, Visscher PM, Kyvik KO (2007) Are there common genetic and environmental factors behind the endophenotypes associated with the metabolic syndrome? Diabetologia 50:1880–1888
  5. Dawood K, Kirk KM, Bailey JM, Andrews PW, Martin NG (2005) Genetic and environmental influences on the frequency of orgasm in women. Twin Res Hum Genet 8:27–33
  6. Tozzi F, Aggen SH, Neale BM et al (2004) The structure of perfectionism: a twin study. Behav Genet 34:483–494
  7. Williams FM, Cherkas LF, Spector TD, MacGregor AJ (2004) A common genetic factor underlies hypertension and other cardiovascular disorders. BMC Cardiovasc Disord 4:20
  8. Medland S, Hatemi PK (2009) Political science, biometric theory, and twin studies: a methodological introduction. Polit Anal 17:191–214
  9. Rijsdijk FV, Sham PC (2002) Analytic approaches to twin data using structural equation models. Brief Bioinform 3:119–133
  10. Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automat Contr 19:716–723
  11. Aulchenko YS, Ripatti S, Lindqvist I et al (2009) Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nat Genet 41:5–6
  12. Gray-McGuire C, Song Y, Morris NJ, Stein CM (2009) Comparison of univariate and multivariate linkage analysis of traits related to hypertension. BMC Proc 3(Suppl 7):S99
  13. Gorlova OY, Amos CI, Zhu DK, Wang W, Turner S, Boerwinkle E (2002) Power of a simplified multivariate test for genetic linkage. Ann Hum Genet 66:407–417
  14. Turner ST, Kardia SL, Boerwinkle E, de Andrade M (2004) Multivariate linkage analysis of blood pressure and body mass index. Genet Epidemiol 27:64–73
  15. Liu Y, Pei Y, Liu J et al (2009) Powerful bivariate genome-wide association analyses suggest the SOX6 gene influencing both obesity and osteoporosis phenotypes in males. PLoS One 4:e6827

Acknowledgements

This study was supported by the Novo Nordisk Foundation Grant for Medical Research (2006) for ‘A study on heritability of glucose tolerance and indices of insulin sensitivity and secretion in Chinese twins’. It was also partially supported by National Natural Science Foundation of China (grant number 30872170).

Duality of interest

The authors declare that there is no duality of interest associated with this manuscript.

Author information

Corresponding authors

Correspondence to Z. Pang or D. Zhang.

Additional information

Z. Pang and D. Zhang contributed equally to this work.

About this article

Cite this article

Pang, Z., Zhang, D., Li, S. et al. Multivariate modelling of endophenotypes associated with the metabolic syndrome in Chinese twins. Diabetologia 53, 2554–2561 (2010). https://doi.org/10.1007/s00125-010-1907-5

Keywords

  • Chinese twins
  • Endophenotypes
  • Metabolic phenotype
  • Multivariate analysis
  • Pathway models