Univariate assessment of lipoprotein and FA gender differences
The Wilcoxon–Mann–Whitney (WMW) rank sum test (Wilcoxon 1945; Mann and Whitney 1947) was used to calculate univariate p-values for comparing the median concentrations of FAs and lipoprotein features for men and women. The null hypothesis was identical medians. The Bonferroni correction for multiple testing was subsequently applied, treating the FAs and the lipoproteins as two separate families of tests since they belong to different classes of molecules and are measured independently by different instrumental procedures. After Bonferroni correction, serum concentrations show significant gender differences (Supplementary material 1) for 16:1 n−9 (q < 0.00005), 18:1 n−9 (q < 0.0002), ALA (q < 0.003) and DPA (q < 0.04), all being higher for men than women. Using the FDR procedure developed by Benjamini and Hochberg (1995) for comparing gender on the 20 tests for FAs, a value of p = 0.0075 from the WMW test is found sufficient for a significance level of 0.025 in the multiple testing, and 18:1 n−7 and TFA are found to have significantly higher concentrations in serum of men in addition to those already found by the more conservative Bonferroni test. For the essential FAs, DHA, EPA and AA and the ratio EPA/AA, gender differences are not significant in the analyzed cohort.
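As an illustration of the univariate step, the following minimal sketch computes a WMW rank sum test and Bonferroni-adjusted values. It is not the implementation used in this work: the two-sided p-value relies on a normal approximation without tie or continuity correction (an assumption), and the function names `midranks`, `wmw_test` and `bonferroni` are hypothetical.

```python
import math

def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def wmw_test(x, y):
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumption: no tie correction
    z = (u1 - mean_u) / sd_u
    return u1, math.erfc(abs(z) / math.sqrt(2))

def bonferroni(p_values):
    """Multiply each p-value by the number of tests in the family, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Toy values only, not data from the cohort.
u, p = wmw_test([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
adjusted = bonferroni([0.001, 0.5])
```

In the setting described above, `bonferroni` would be applied once to the 20 FA p-values and once to the 24 lipoprotein p-values, since the two groups are treated as separate families.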
The lipoprotein features (Supplementary material 2) also reveal systematic gender differences. After Bonferroni correction, average size and concentration of VLDL are higher in men than in women (q < 0.000001), and so are the concentrations of CM, VLDL-VL, VLDL-L and TG (all with q < 0.000001), VLDL-M (q < 0.0003) and apolipoprotein B (apoB; q < 0.01). Average size and concentration of HDL are higher in women than in men, as are the concentrations of HDL-VL, HDL-L and HDL-M (all subclasses have q < 0.000001). Average size of LDL particles is higher in women than in men (q = 0.01), while the concentrations of the atherogenic subclasses LDL-S (q = 0.0005) and LDL-VS (q = 0.002) and the suspected atherogenic subclasses (Freedman et al. 1998) HDL-S (q = 0.0065) and HDL-VS (q < 0.002) are higher in men than in women. Calculation of FDR for a significance level of 0.01 for the multiple testing of the 24 lipoprotein features provides the corresponding limit as p = 0.00875 for the WMW test. By this approach, the median concentrations of LDL and the subclass LDL-M are also significantly different, leaving only the median concentrations of Chol and the subclasses VLDL-S and LDL-L as statistically indistinguishable between genders. Systematic gender differences in serum lipoprotein features have also been observed by Furusyo et al. (2013). They found, e.g., significantly higher levels of CM and VLDL subclasses in Japanese men and higher levels of the subclasses of very large, large and medium HDL particles for women.
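The two FDR limits quoted for the WMW test are consistent with the Benjamini and Hochberg step-up critical value (k/m)·α, where m is the number of tests and k the number of rejected hypotheses. The sketch below assumes k = 6 for the FAs (the four Bonferroni-significant FAs plus 18:1 n−7 and TFA) and k = 21 for the lipoprotein features (24 minus the three that remain indistinguishable); these counts are inferred from the text, and the function names are hypothetical.

```python
import math

def bh_critical_value(k, m, alpha):
    """Benjamini-Hochberg critical value for the k-th smallest of m p-values."""
    return k * alpha / m

def bh_cutoff(p_values, alpha):
    """Step-up procedure: return (number of rejections k, p-value limit k*alpha/m)."""
    m = len(p_values)
    k = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= bh_critical_value(rank, m, alpha):
            k = rank  # keep the largest rank that passes its critical value
    return k, bh_critical_value(k, m, alpha)

# Limits implied by the inferred counts (assumptions, see lead-in).
fa_limit = bh_critical_value(6, 20, 0.025)            # close to 0.0075
lipoprotein_limit = bh_critical_value(21, 24, 0.01)   # close to 0.00875

# Hypothetical p-values for a family of 20 tests, not data from the cohort.
toy_p = [0.0001, 0.0002, 0.001, 0.002, 0.005, 0.007] + [0.05 + 0.04 * i for i in range(14)]
k, limit = bh_cutoff(toy_p, alpha=0.025)
```

Because the procedure is step-up, every p-value at or below the returned limit is declared significant, even if an intermediate ranked p-value exceeded its own critical value.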
Multivariate assessment of lipoprotein and FA gender differences
The multivariate approach described in Sect. 2.6 is now used to reveal lipoprotein and FA patterns related to gender differences.
PLS-DA accounted for 38.2 % of the gender variable (men = 0, women = 1) with lipoprotein features as input using RDCV by testing each PLS component before inclusion as described in Sect. 2.6 in order to optimize the predictive ability of the discriminatory model. Post processing of the PLS-DA model was performed by TP to obtain a single predictive multivariate component and from this component SRs were calculated for each lipoprotein feature. The SRs were also given plus or minus signs depending on the sign of their loadings in the TP component. Figure 1 displays the SR profile with confidence limits on each lipoprotein feature corresponding to two standard deviations determined from RDCV.
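The calculation of signed SRs from a target projection can be sketched as follows. This is a minimal illustration, assuming that the TP weights are the normalized regression coefficients, that the data matrix is column-centred, and that SR is the ratio of explained to residual variance on the TP component; the function name and the toy regression vector are hypothetical stand-ins for the RDCV-validated PLS-DA model.

```python
import math

def signed_selectivity_ratios(X, b):
    """X: rows are subjects, columns are centred features; b: regression coefficients.

    Returns one signed selectivity ratio per feature.
    """
    n_cols = len(b)
    norm_b = math.sqrt(sum(c * c for c in b))
    w = [c / norm_b for c in b]                                   # TP weights
    t = [sum(row[j] * w[j] for j in range(n_cols)) for row in X]  # TP scores
    tt = sum(s * s for s in t)
    ratios = []
    for j in range(n_cols):
        column = [row[j] for row in X]
        loading = sum(x * s for x, s in zip(column, t)) / tt      # TP loading
        explained = loading * loading * tt                        # sum of squares on TP
        residual = sum(x * x for x in column) - explained         # remainder is orthogonal
        ratio = explained / residual if residual > 0 else math.inf
        ratios.append(math.copysign(ratio, loading))              # sign follows the loading
    return ratios

# Toy data: three subjects, two centred features, hypothetical coefficients.
X_toy = [[-2.0, -1.0], [0.0, 2.0], [2.0, -1.0]]
b_toy = [1.0, -1.0]
sr_toy = signed_selectivity_ratios(X_toy, b_toy)
```

With gender coded as men = 0 and women = 1, a positive sign then indicates a feature that is higher in women and a negative sign one that is higher in men.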
The size of SR is proportional to the importance of each individual lipoprotein feature for discriminating between genders. SR = ±1 for a variable corresponds to 50 % of the variance in that variable being predictive, and 50 % left in the residual. Larger average size of the HDL particles in women is the most striking feature with SR = 7.7. This is a reflection of larger concentrations of the subclasses HDL-VL (SR = 2.5), HDL-L (SR = 6.0) and HDL-M (SR = 1.6) in women compared to men. Furthermore, the total concentrations of HDL (SR = 2.2) and ApoA1 (SR = 1.2) are also significantly higher in women than in men. Other major differences are the concentrations of TG (SR = −1.2) and VLDL (SR = −1.5) and its subclasses VLDL-VL (SR = −1.6) and VLDL-L (SR = −1.6), and the average size of VLDL particles (SR = −1.4), which are all higher in men than in women since the signs of the TP loadings and thus the SRs are negative. The ranking of the features obtained from the SR plot closely matches the results from the univariate analysis for the most discriminating variables, but has the advantages of also showing whether a discriminating variable is higher or lower in one gender and of including confidence limits in a single graph.
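The statement that SR = ±1 corresponds to a 50/50 split can be generalized to any SR value. Assuming SR is the ratio of predictive to residual variance (an assumption consistent with that statement), the predictive share of the variance is |SR| / (1 + |SR|); the helper name below is hypothetical.

```python
def predictive_fraction(sr):
    """Share of a variable's variance that is predictive, given its (signed) SR."""
    magnitude = abs(sr)
    return magnitude / (1.0 + magnitude)

# SR values quoted in the text: average HDL size, the SR = 1 reference, VLDL-L.
fractions = {sr: predictive_fraction(sr) for sr in (7.7, 1.0, -1.6)}
```

Under this reading, SR = 7.7 for the average size of HDL particles corresponds to roughly 89 % of the variance in that feature being predictive of gender.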
We can relate the SR to the discriminatory ability of the lipoprotein features with respect to gender. This is performed by calculating the RSCR for the subjects as a function of abs(SR) as explained in Sect. 2.6.
Figure 2 shows the RSCR for the gender differences in the lipoprotein features. With almost the same number of subjects for both genders as we have here, the expectation value of the RSCR is approx. 50 % if a variable has SR = 0 within the confidence limits. Figure 2 shows that the RSCR is 77 % for abs(SR) = 1.0 and for abs(SR) = 0.5 it is still as high as 70 %. Concentrations of small (SR = −0.5) and very small (SR = −0.4) LDL particles (see Fig. 1) are thus higher in men, underlining a more atherogenic lipoprotein pattern in men compared to women (see Furusyo et al. 2013 and refs. therein). Similar to the WMW test, RSCR measures how well two groups separate on a variable, but the information content presented as a percentage is easier to comprehend than a p-value.
The cross validated PLS-DA of the FA profiles accounted for 38.8 % of the variance in the gender variable. Figure 3 shows the SR profile for the gender differences in FA profiles.
For the gender differences in FA pattern, the largest differences are observed in C14-C18 FAs, i.e. 16:1 n−9 (SR = −0.52), 18:1 n−9 (SR = −0.45) and ALA (SR = −0.52), which all have RSCR higher than 70 % and Bonferroni corrected p-values less than 0.003. As they all have negative sign, their levels are higher in men than in women. The SRs for EPA, DHA, AA, and the ratio EPA/AA are all zero within the confidence limits obtained from cross validation and thus the same in men and women. Of the C20-C24 FAs, only docosapentaenoic acid (DPA) shows a gender difference.
The substantial gender differences in lipoprotein and FA patterns imply that men and women have to be treated separately in the multivariate analyses to follow.
Correlations between lipoproteins and FAs in women
For the female cohort, joint PCA of the 24 lipoprotein features and 18 individual FAs, TFA and the ratio of EPA to AA showed that 60.8 % of the total variance was accounted for by the two major PCs, emphasizing a pronounced correlation between lipoproteins and FAs. Figure 4 displays the (correlation) loadings of the 44 variables on the two PCs.
Since all features are standardized to unit variance prior to PCA, the distance from origin to the location of a variable represents a quantitative measure of explained variance in that variable. Furthermore, the cosine of the angle between a pair of variables represents a quantitative measure of their explained correlation in the two-component PC model.
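The two geometric readings of the loading plot can be written out explicitly. The sketch below assumes each variable is represented by its pair of correlation loadings on PC1 and PC2; the explained variance is then the squared distance from the origin, and the explained correlation is read from the cosine of the angle between two loading vectors. The function names and loading values are hypothetical, not values from Fig. 4.

```python
import math

def explained_variance(loading):
    """Squared distance from the origin: share of variance captured by the PCs."""
    return sum(c * c for c in loading)

def angle_cosine(loading_a, loading_b):
    """Cosine of the angle between two variables in the loading plot."""
    dot = sum(a * b for a, b in zip(loading_a, loading_b))
    return dot / math.sqrt(explained_variance(loading_a) * explained_variance(loading_b))

# Hypothetical correlation loadings (PC1, PC2) for three variables.
var_a = (0.6, 0.8)    # lies on the unit circle: fully explained
var_b = (-0.8, 0.6)   # orthogonal to var_a: cosine 0
var_c = (0.3, 0.4)    # same direction as var_a but only 25 % explained

cos_ab = angle_cosine(var_a, var_b)
cos_ac = angle_cosine(var_a, var_c)
```

The example with `var_c` shows why both readings are needed: a variable can point in the same direction as another while having only a small part of its variance represented in the two-component model.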
Most of the features related to HDL particles are located in the left upper part of the loading plot: (i) the average size of the HDL particles, (ii) the total concentration of HDL particles, and, (iii) their fractions of very large, large and medium sized HDL particles. All these features associate with good cardiovascular (CV) health. As expected, concentration of apolipoprotein A1 (ApoA1), being a major protein component of HDL particles, falls close to the cluster of the majority of HDL features. The ratio EPA/AA and concentration of EPA cluster together with size and concentrations of HDL and fractions of very large and large HDL particles. Furthermore, the concentration of EPA and the ratio EPA/AA show a negative correlation to average size of VLDL particles. According to Freedman et al. (1998) reduced size of VLDL particles reduces severity of CVD. Thus, the positions of EPA and EPA/AA in the lipoprotein correlation pattern comply with the findings of Dyerberg et al. (1978) and later confirmed by, e.g., Ninomiya et al. (2013) of the preventive effects of these two biomarkers on risk for developing CVDs.
In the lower right corner of the plot, oppositely correlated to the major part of the HDL features, characteristics related to VLDL particles appear: (i) average size of VLDL particles, (ii) total concentration of VLDL, and, (iii) concentrations of fractions of very large, large and medium sized VLDL particles. As expected, concentrations of CM and TG also fall in the same region. The display shows that the concentrations of very small and small HDL particles correlate with this group of features. Freedman et al. (1998) found that a global measure of coronary artery disease (CAD) severity was positively associated with levels of large VLDL and small HDL particles.
In the upper right corner of the loading plot, all the features measured for the LDL particles cluster together with the concentration of ApoB and Chol. Correlations between HDL and LDL features are weak except for concentrations of small and very small LDL particles which correlate positively with very small HDL particles. Concentration of small LDL particles is well-known to represent a strong predictor of CVD (Hirayama and Miida 2012). Thus, small dense LDL has been accepted as a risk factor for CV events by the National Cholesterol Education Program Adult Treatment Panel III (NCEP III 2002). The correlation of small and very small HDL particles with large VLDL particles and small and very small LDL particles supports the hypothesis that these subclasses of HDL are atherogenic.
Average size of LDL particles is not well correlated to the major lipoprotein patterns of women on the two major PCs.
DHA is located in-between the HDL and LDL clusters. DHA and DPA are both most strongly correlated to large LDL particles. This observation complies with previous investigations showing that DHA correlates positively to large LDL particles and negatively to small dense atherogenic LDL particles (Neff et al. 2011). Thus, it appears that high levels of EPA/AA, EPA, DHA and DPA correlate strongly to lipoprotein features suggestive of good CV health, but that the effects are different, with EPA having the strongest effect on HDL and DHA and DPA on LDL particles.
TFA is positively correlated to apoB and very small LDL particles, both being indicators of poor CV health. In summary, the loading plot provides an interpretation in line with many findings in earlier work, but shows a stronger connection between EPA and HDL properties than found previously, while DHA (and DPA) is strongly connected to LDL properties, but less connected to HDL. The FA loading pattern on PC1 and PC2 can be interpreted as a multivariate CVD risk scale with C14-18 saturated and mono-unsaturated FAs leading to increased CVD risk on one side and long chain PUFAs leading to decreased CVD risk (i.e. EPA/AA, EPA and DHA) on the other side. This observation is in line with the univariate risk scale defined by Chowdhury et al. (2014). The FA pattern also matches the lipoprotein scale representing CV health with HDL features clustering together with EPA/AA, EPA and DHA, and VLDL features and the suspected atherogenic small and very small HDL particles clustering together with the saturated and mono-unsaturated C14-C18 FAs.
Correlations between lipoproteins and FAs in men
Figure 5 displays the loadings on the two major PCs for the lipoproteins and FAs for men.
The plot, which accounts for 56.6 % of the total variance, is similar to the loading plot for women (Fig. 4), but with some striking differences. The average size of LDL particles, which was almost uncorrelated to the major lipoprotein patterns for women, falls in the cluster of HDL features for men and is negatively correlated to average size of VLDL particles. Thus, large average size of LDL particles is suggestive of good CV health for men. We also observe that concentration of small VLDL particles coincides with large LDL particles in men. In some methods for lipoprotein separation according to particle size, small VLDL as defined here are classified as intermediate density lipoprotein (IDL). High levels of small and very small HDL particles are associated with high levels of apoB.
We observe from Fig. 5 that the ratio EPA/AA is strongly positively correlated to average size of LDL particles and to the HDL lipoprotein features associated with good CV health. The strongest correlation is with average size of LDL particles. On the other hand, the concentrations of EPA, DHA, DPA and 24:1 n−9 are just as strongly associated with concentrations of large LDL and small VLDL particles as with the good HDL features. Concentration of TFAs correlates with apoB, total concentrations of LDL particles, and, concentrations of medium, small, and very small LDL particles. TFA also correlates with small and medium sized HDL particles. Most VLDL features cluster together and with strongest correlations to 16:1 n−9.
In summary, EPA and EPA/AA correlate most strongly with good HDL features for men as for women, but the correlation is weaker for men than for women. For men, average LDL particle size correlates more strongly to the favorable HDL properties than in women. EPA clusters together with DHA, DPA and nervonic acid in men, a pattern that is different from what was observed for women where EPA and the related ratio EPA/AA were more closely related to favorable HDL properties than the other long-chain PUFAs. As for women, the loading plot for men can be interpreted as indicating a multivariate CVD risk scale, but more pronounced, since the atherogenic small and very small LDL particles are located in close proximity to the VLDL features.
Agglomerative hierarchical cluster analysis (HCA) of lipoproteins
Using HCA, the similarity of the lipoprotein features for both genders based on their FA correlation patterns (supplementary material 3 (women) and 4 (men)) is assessed. The result of HCA of the lipoprotein features is presented as a dendrogram. This is a quantitative display of the similarities in FA patterns among the lipoprotein features. Men and women are treated separately. The result is displayed in Fig. 6.
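Agglomerative clustering of features by their FA correlation patterns can be sketched as below. The linkage criterion and distance measure are not stated in this section, so average linkage and Euclidean distance between correlation profiles are assumptions made here for illustration only. The feature labels and correlation values are hypothetical placeholders, not entries from supplementary material 3 or 4.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two correlation profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles):
    """profiles: dict mapping a feature label to its list of correlations with the FAs.

    Returns the merge history as (cluster_1, cluster_2, distance) tuples,
    which is the information a dendrogram displays.
    """
    def cluster_distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    clusters = [(label,) for label in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical correlation profiles of four features against two FAs.
toy_profiles = {
    "feature_a": [0.9, -0.1],
    "feature_b": [0.8, 0.0],
    "feature_c": [-0.7, 0.6],
    "feature_d": [-0.6, 0.4],
}
history = average_linkage(toy_profiles)
```

Running the procedure separately on the correlation tables for women and men would give one merge history per gender, matching the separate treatment described above.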
For women, three main clusters are evident. ApoA1, average size of LDL and all the HDL features, except small and very small HDL particles, are connected in one cluster and all LDL features are connected in another cluster together with ApoB, Chol and small VLDL particles. In the last cluster, all the remaining VLDL features are found together with CM, TG and small and very small HDL particles. Thus, the picture revealed by HCA is identical to what was found by PCA on the two major PCs (Fig. 4).
For men, the result of HCA reveals one cluster which is identical to the one for women, with most HDL features, ApoA1 and average size of LDL particles connected, and a second cluster which embraces all the other lipoprotein features. Again, the overall picture from HCA is identical to the one obtained by PCA (Fig. 5). However, the second cluster splits into an LDL and a VLDL cluster, located respectively in the middle and at the right end of the dendrogram, which are similar to the ones for women, but the small and very small LDL particles have changed place and are located in the VLDL cluster for men with closest distance to the small and very small HDL particles.
Predicting lipoproteins from FA patterns in women
As a last step in the analysis, we used PLS regression as a tool to validate predictive FA patterns for the lipoprotein features. Supplementary material 5 summarizes the most important characteristics for the lipoprotein regression models for women. The squared correlation coefficient (R2Y) between measured and predicted values of a lipoprotein feature represents a quantitative measure of the strength of associations between the FA patterns and the lipoprotein feature for the optimal model as determined by RDCV with confidence intervals corresponding to p = 0.05 for the PLS component selection (see Sect. 2.6). As complementary measures of predictive association between lipoprotein features and FA patterns, we have included Q2Y which measures the squared correlation between measured and predicted values for the subjects kept out from the final modelling step and subsequently predicted, and, the root mean square error of prediction (RMSEP). The most important FAs have been ranked in accordance with their strength of association to each modelled lipoprotein feature. This ranking was obtained by making a TP to separate the lipoprotein-related information carried by each FA from systematic interfering variation and then calculating the SRs (supplementary material 7). As an additional measure of strength of association to the lipoprotein features, we have included the correlation coefficients (supplementary material 3) in the raw data in parentheses for the highest ranked and lowest ranked of the most important FAs.
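The three summary measures can be computed from paired measured and predicted values as sketched below. Following the definitions above, R2Y and Q2Y use the same squared correlation and differ only in whether the predictions come from the fitted model or from subjects kept out of the modelling. The function names and numerical values are hypothetical.

```python
import math

def squared_correlation(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    s_mp = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_mp * s_mp / (s_mm * s_pp)

def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy values only, not data from the cohort.
measured = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 2.5, 3.5, 4.5]      # predictions for subjects used in modelling
held_out = [2.0, 1.0, 4.0, 3.0]    # predictions for subjects kept out

r2y = squared_correlation(measured, fitted)
q2y = squared_correlation(measured, held_out)
error = rmsep(measured, held_out)
```

The toy example also shows why RMSEP is reported alongside the squared correlations: the fitted values correlate perfectly with the measurements despite a constant offset, which only an error measure reveals.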
It should be noted that measures of physical activity/non-activity have been shown to have a large impact on lipoprotein distribution, and the largest impact on HDL features (Aadland et al. 2013). Physical activity triggers reverse cholesterol transport (RCT) whereby Chol is transported from the fat deposit to the liver. This process increases total HDL concentration and shifts the distribution towards larger average size of HDL particles. On the other hand, a sedentary lifestyle leads to a reduction in total HDL concentration and reduction in average size of HDL. The lack of measures of physical activity in the modelling leads to a reduction in predictive power of some PLS models, but our models are developed for the purpose of strengthening the interpretation, not to substitute lipoprotein features with predictions from FA profiles.
Supplementary material 5 reveals positive predictive associations of total concentration of HDL particles to DHA and EPA in women, but also to linoleic acid (LA), 16:0, TFA and ALA. This mixed pattern is disentangled when we look at models for subclasses of HDL. Thus, strong predictive positive associations exist between concentrations of large HDL particles and EPA, DHA and DPA in that order. The strongest association is, however, with the ratio EPA/AA, which also possesses the strongest positive association to average size of HDL particles. Large HDL and average size of HDL particles both show strong negative association with 20:3 n−6, dihomo-γ-linolenic acid (DGLA), which has the strongest positive association to TG. The picture for very large HDL particles is similar to the pattern for large HDL particles, but the predictive strength is weaker. Concentrations of small and very small HDL particles have strong associations to DGLA, but also to saturated and mono-unsaturated C16–C18 FAs. Total concentration of LDL is predicted by concentrations of saturated FAs, DPA and DHA, ALA and LA. Furthermore, we observe strong predictive relationships between concentration of large and medium sized LDL particles and DPA and DHA, and weak predictive relationships with EPA in women, but with some saturated FAs also showing strong associations with these lipoprotein features. Small and very small LDL particles correlate to saturated FAs. ApoB shows almost the same pattern, emphasizing its connection to the atherogenic subclasses of LDL. Total concentration and size of VLDL particles and the subclasses of very large, large and medium VLDL have strong predictive associations with DGLA and C16–C18 omega-7 and omega-9 FAs. The pattern is similar to what we obtained for small and very small HDL. LA and C16–C18 omega-7 FAs have the strongest association to concentration of ApoA1.
Predicting lipoproteins from FA patterns in men
Supplementary material 6 shows R2Y, Q2Y, RMSEP and the most important predictive FAs for the lipoprotein features for men. The complete SR profiles are listed in supplementary material 8.
Average size of LDL particles is positively correlated to EPA/AA, EPA and DHA and negatively correlated to ALA and C16-C18 omega-7 and omega-9 FAs. Large average size of LDL has been shown to reduce risk of atherosclerosis, so this observation is in accordance with protective effects of EPA and DHA. Total LDL is predicted by total concentration of FAs and saturated FAs, i.e. 16:0 and 18:0. Large LDL particles are most strongly associated to concentration of TFAs and saturated FAs, but DHA, DPA and EPA also have strong to moderate associations to large LDL particles. Small and very small LDL particles show identical predictive patterns, with TFAs, saturated and omega-9 FAs with 16 and 18 carbons possessing the strongest associations to these lipoprotein features. The same pattern is evident for apoB, replicating the connection to saturated FAs that was found for women. Very small HDL particles also exhibit a similar association pattern in men as in women. Very large, large and medium sized HDL particles showed no predictive FA patterns in men. These subclasses are strongly influenced by physical activity, which triggers RCT and thus impacts the distribution of HDL. Total concentrations of HDL and apoA1 share weak predictive relationships to DPA, AA, and EPA. Average size of HDL particles is most strongly positively associated to EPA/AA and EPA and negatively to DGLA, just as for women, but the associations are weaker. Concentrations of very large, large and medium sized VLDL particles have similar predictive FA patterns in men and women with strongest associations to C16-C18 omega-9 FAs.