Background

High mammographic breast density (MBD) is a strong risk factor for breast cancer development [1]. Women with extremely dense breasts have a four- to sixfold increased risk of developing breast cancer compared to women with scattered areas of fibroglandular density [1, 2]. It is estimated that ~29% of breast cancers in premenopausal women could potentially be prevented if women with dense or heterogeneously dense breasts reduced their density to scattered areas of fibroglandular density [3]. Further, MBD is an intermediate phenotype [4] for breast cancer development; hence, understanding the mechanisms underlying high MBD may open up new avenues for breast cancer prevention.

Adiposity is strongly associated with MBD measures. We and others have demonstrated that childhood, early adulthood, and adulthood adiposity, as well as changes in adiposity over the life course, are associated with MBD in premenopausal women [5,6,7]. Adiposity explains the greatest percentage of variability in MBD, approximately 22% in non-Hispanic black women and 26% in non-Hispanic white women [8]. However, studies on the associations of adiposity biomarkers with MBD have yielded inconsistent results [9,10,11,12]. Thus, there is an urgent need to investigate the associations of novel biomarkers of adiposity with MBD.

The lipidome is the complete lipid profile within a tissue [13]. Lipidomics, the comprehensive analysis of lipid molecules, is essential to understanding lipid biology. Recent technological advances have allowed unbiased quantification of lipid species with increasing accuracy. Lipids play essential roles in cell structure, generation of cell membranes, signaling, and fuel/storage [14]. Lipids also function as second messengers and as hormones and are responsible for various cell functions. Changes in lipid metabolism have been shown to impact cell growth, proliferation, differentiation, and motility [15], and perturbations in lipid metabolism are implicated in metabolic diseases [16, 17]. Applying lipidomics to MBD should, therefore, provide meaningful insights into the biologic mechanisms underlying MBD, but to the best of our knowledge, no study has characterized the lipidome of MBD. Our study aimed to characterize, for the first time, the comprehensive lipidome of MBD in premenopausal women.

Methods

Study population

The study population comprised 705 premenopausal women who were recruited during their annual screening mammogram at the Joanne Knight Breast Health Center at the Siteman Cancer Center at Washington University School of Medicine (WU) in St. Louis, MO. Women were eligible to participate in the study if they were premenopausal, not pregnant, and able to comply with study procedures. Women with a history of cancer or breast augmentation (implants or reduction), and women who were currently using or had used selective estrogen receptor modulators in the past six months, were excluded [6]. Women were considered premenopausal if they had a regular menstrual cycle in the past 12 months and did not have a history of hormone replacement therapy or bilateral oophorectomy [6]. All participants provided written informed consent, and the study was performed in accordance with the Declaration of Helsinki. We received approval for this study from the WU Institutional Review Board.

Participants completed a questionnaire on behavioral, reproductive, demographic, and clinical characteristics. They also had their height, weight, and body fat percentage measured on the day of their screening mammogram visit [6]. Women were asked to fast on the day of their mammogram prior to providing a blood sample [6]. Blood samples were sent to the Tissue Procurement Core at WU Siteman Cancer Center within 30 min of collection and stored at −80 °C [18].

Lipidomics profiling

Blood samples from each woman were sent to Metabolon® (Durham, NC) for comprehensive quantitative lipidomic profiling. Lipidomic profiling quantified 982 lipid species in 3 lipid super pathways and 14 sub-pathways: phospholipids (phosphatidylcholines (PC), lysophosphatidylcholines (LPC), phosphatidylethanolamines (PE), lysophosphatidylethanolamines (LPE), and phosphatidylinositols (PI)), sphingolipids (ceramides (CER), dihydroceramides (DCER), hexosylceramides (HCER), lactosylceramides (LCER), and sphingomyelins (SM)), and neutral complex lipids (cholesteryl esters (CE), diacylglycerols (DAG), triacylglycerols (TAG), and monoacylglycerols (MAG)). Individual lipid species were quantified by taking the ratio of the signal intensity of each target compound to that of its assigned internal standard, then multiplying by the concentration of internal standard added to the sample [19]. Quality control samples, drawn from a large pool of human plasma maintained by Metabolon, were included in the assay runs, and median relative standard deviation (RSD) percent values were calculated for each run. The overall median RSD across runs was 8%. Lipid sub-pathway concentrations were calculated from the sum of all molecular species within a sub-pathway [19].
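The quantification rule described above can be illustrated with a minimal Python sketch. The function names, signal intensities, and internal-standard concentrations below are hypothetical and are not taken from the Metabolon platform; the sketch only shows the arithmetic of the ratio-to-internal-standard calculation and the summation of species within a sub-pathway.

```python
def quantify(target_signal, standard_signal, standard_conc):
    """Concentration = (target signal / internal-standard signal) x standard concentration."""
    if standard_signal <= 0:
        raise ValueError("internal-standard signal must be positive")
    return target_signal / standard_signal * standard_conc


def sub_pathway_totals(species_conc, species_class):
    """Sum the concentrations of all species assigned to each sub-pathway."""
    totals = {}
    for name, conc in species_conc.items():
        cls = species_class[name]
        totals[cls] = totals.get(cls, 0.0) + conc
    return totals


# Hypothetical signals and internal-standard concentrations (arbitrary units).
species = {
    "LPC(18:1)": quantify(2000.0, 1000.0, 5.0),
    "LPC(16:0)": quantify(3000.0, 1000.0, 5.0),
    "PI(18:1/18:1)": quantify(500.0, 2000.0, 8.0),
}
classes = {"LPC(18:1)": "LPC", "LPC(16:0)": "LPC", "PI(18:1/18:1)": "PI"}
totals = sub_pathway_totals(species, classes)
print(totals)
```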

Mammographic breast density assessment

We assessed volumetric measures of MBD (volumetric percent density (%) (VPD), dense volume (cm³) (DV), and non-dense volume (cm³) (NDV)) using Volpara 1.5 (Volpara Health®). VPD was calculated by using the maximum fibroglandular volume between the left and right breasts. VPD was calculated by dividing DV by total breast volume and multiplying by 100. VPD was also categorized into: < 3.5%, 3.5–7.5%, > 7.5–15.5%, and > 15.5%, corresponding to the MBD groups of almost entirely fatty, scattered areas of fibroglandular density, heterogeneously dense, and extremely dense.
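As an illustration of the VPD calculation and categorization, a minimal Python sketch follows. It assumes that total breast volume equals DV plus NDV, which is our simplification and not a description of the Volpara algorithm; the function names and example volumes are hypothetical, and the choice of breast (left or right) is not modeled.

```python
def volumetric_percent_density(dense_cm3, non_dense_cm3):
    """VPD (%) = DV / total breast volume x 100, assuming total = DV + NDV."""
    total = dense_cm3 + non_dense_cm3
    if total <= 0:
        raise ValueError("total breast volume must be positive")
    return 100.0 * dense_cm3 / total


def vpd_category(vpd):
    """Map VPD (%) to the four density groups using the cut points in the text."""
    if vpd < 3.5:
        return "almost entirely fatty"
    if vpd <= 7.5:
        return "scattered areas of fibroglandular density"
    if vpd <= 15.5:
        return "heterogeneously dense"
    return "extremely dense"


# Hypothetical volumes: DV = 64 cm3, NDV = 448 cm3.
vpd = volumetric_percent_density(64.0, 448.0)
print(vpd, vpd_category(vpd))
```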

Statistical analysis

Distributions of demographic, reproductive, and clinical factors were summarized across the four VPD categories as means and standard deviations for continuous variables and counts and percentages for categorical variables. One hundred twenty-five lipid species missing in 300 or more of the 705 samples were excluded from the analyses, leaving 857 lipid species. Lipid species missing in fewer than 300 samples were imputed using the 10-nearest neighbor method in the R package impute [20]. We excluded 5 women with missing MBD measures, leaving 700 in the analytic sample. Spearman correlation coefficients were calculated between the lipid sub-pathways and lipid species and the MBD measures.
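The filtering and imputation steps can be sketched as follows. This is a simplified Python stand-in written for illustration, not the algorithm of the R package impute: it treats samples as rows, measures distance over the features observed in both rows, and fills a missing entry with the mean of that feature in the k nearest rows. The toy matrix and the thresholds (2 missing values, k = 2) are hypothetical; the analysis above used 300 missing samples and k = 10.

```python
import math


def drop_sparse_columns(rows, max_missing):
    """Keep columns with fewer than `max_missing` missing (None) entries."""
    n_cols = len(rows[0])
    keep = [j for j in range(n_cols)
            if sum(r[j] is None for r in rows) < max_missing]
    return [[r[j] for j in keep] for r in rows], keep


def _distance(a, b):
    """Root mean squared difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k):
    """Fill each missing entry with the column mean over the k nearest donor rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled


# Toy matrix: 4 samples (rows) x 3 lipid species (columns); None marks a missing value.
data = [
    [1.0, 10.0, None],
    [1.2, None, None],
    [0.9, 12.0, 7.0],
    [5.0, 50.0, None],
]
kept_rows, kept = drop_sparse_columns(data, max_missing=2)
imputed = knn_impute(kept_rows, k=2)
print(kept, imputed)
```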

We fitted covariate-adjusted multivariable linear regression models for each MBD outcome with the concentration of each lipid sub-pathway and lipid species, on a quantitative scale and in quartiles, to determine their associations. MBD measures were log10 transformed for normality. On the quantitative scale, each lipid sub-pathway/species was standardized to have a zero mean and a unit standard deviation. The following covariates were accounted for in covariate-adjusted analyses: age (continuous), age at menarche (continuous), body shape at age 10 (based on Stunkard pictogram), body fat % (continuous), race (non-Hispanic white, non-Hispanic black, other), family history of breast cancer (yes, no), oral contraceptive use (never, less than 1 year, 1–4 years, 5–9 years, more than 10 years), alcohol consumption (never, < 1 drink/week, 1–2 drinks/week, 3–5 drinks/week, and 6 + drinks/week), and parity/age at first birth (nulliparous, 1–2 children & < 25 years, 1–2 children & 25–29 years, 1–2 children & ≥ 30 years, ≥ 3 children & < 25 years, ≥ 3 children & ≥ 25 years). Body fat % was used instead of body mass index (BMI) because it explained a slightly greater proportion of variation in VPD (R² = 0.45) than BMI (R² = 0.43). BMI and body fat % were highly correlated (r = 0.88). Some of the covariates included in the analyses had > 1% missingness, including body shape at age 10 (N = 40, 5.7%) and body fat % (N = 25, 3.5%); hence, missing values in all covariates were imputed using multivariate imputation by chained equations via the R package mice [21]. We report the least squares means (LSM) of VPD, DV, and NDV by quartile of lipid sub-pathway and lipid species. We calculated the p-value for trend in the adjusted LSM along quartiles by setting all values within each quartile to the median of that quartile range and operationalizing it as an ordinal variable. Linear coefficients were back-transformed. Residuals from the linear regression models were graphically examined for model goodness-of-fit. The proportional odds model was applied to the four VPD categories with each lipid class as a predictor, with the inclusion of the covariates, to estimate the adjusted odds ratio with 95% CI. We corrected for multiple testing using the false discovery rate (FDR) and controlled the family-wise error rate using the Bonferroni method.
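The two multiple-testing corrections can be illustrated with a short Python sketch. The text does not name the FDR procedure; the sketch assumes the Benjamini-Hochberg step-up method, which is a common choice, and the p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four lipid species.
raw = [0.01, 0.04, 0.03, 0.20]
bonf = bonferroni(raw)
fdr = benjamini_hochberg(raw)
print(bonf, fdr)
```

The Bonferroni values are never smaller than the FDR-adjusted values, which is why fewer lipid species remain significant under the Bonferroni threshold.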

To identify variables that influenced VPD the most, we bootstrapped the original dataset 200 times and applied least absolute shrinkage and selection operator (LASSO) penalized multivariable linear regression to each bootstrapped dataset to fit VPD with consideration of all available variables (the lipid sub-pathways, lipid species, and the covariates). We calculated the frequency of each variable being retained in the model fitted from the 200 bootstrapped datasets as an importance measure of the variables in predicting VPD. Model goodness-of-fit was evaluated based on the adjusted R² value after sequentially adding lipid species to the model.
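The bootstrap selection-frequency idea can be sketched in Python as below. This is an illustrative implementation under stated assumptions, not the software used in the analysis: the LASSO is fitted by plain coordinate descent on standardized predictors, the penalty `lam` is fixed rather than tuned (the text does not describe how the penalty was chosen), and the three predictors and the outcome are simulated.

```python
import random


def standardize(column):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in column]


def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(columns, y, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n = len(y)
    cols = [standardize(c) for c in columns]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    beta = [0.0] * len(cols)
    for _ in range(n_iter):
        max_change = 0.0
        for j, col in enumerate(cols):
            rho = sum(x * r for x, r in zip(col, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, col)]
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def selection_frequency(columns, y, lam, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each predictor has a non-zero coefficient."""
    rng = random.Random(seed)
    n = len(y)
    counts = [0] * len(columns)
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        beta = lasso([[c[i] for i in idx] for c in columns], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Simulated data: only the first predictor truly influences the outcome.
sim = random.Random(1)
n_obs = 60
x0 = [sim.gauss(0, 1) for _ in range(n_obs)]
x1 = [sim.gauss(0, 1) for _ in range(n_obs)]
x2 = [sim.gauss(0, 1) for _ in range(n_obs)]
outcome = [2.0 * a + sim.gauss(0, 0.5) for a in x0]
freq = selection_frequency([x0, x1, x2], outcome, lam=0.3)
print(freq)
```

A predictor retained in every resample has a frequency of 1.0, which corresponds to the "100% of the 200 bootstrapped datasets" measure used as the importance metric.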

Results

The mean age, BMI, and body fat % of study participants were 46 years, 30 kg/m², and 40.4%, respectively; approximately 71.8% of participants were non-Hispanic white, and 23.1% were non-Hispanic black (Table 1). Almost half of the women in the study (44.9%) had scattered areas of fibroglandular density (VPD of 3.5–7.5%), and 28.1% had heterogeneously dense breasts.

Table 1 Characteristics of Premenopausal Women Recruited During Annual Screening Mammogram across Volumetric Percent Density Categories

Correlations between lipid sub-pathways/species and MBD

Lipid species in the TAG, LCER, LPC, and PC sub-pathways were most strongly correlated with VPD, NDV, and DV (Additional file 1: Fig. 1, and Additional file 2: Table 1). The strongest positive correlations were LPC(18:1) with VPD (r = 0.39, p-value = 4.4E−27), TAG54:6-FA20:4 with NDV (r = 0.51, p-value = 3.3E−47), and TAG58:10-FA20:4 with DV (r = 0.14, p-value = 0.0003). The strongest inverse correlations were TAG54:6-FA20:4 with VPD (r = −0.47, p-value = 8.6E−40), LPC(18:1) with NDV (r = −0.44, p-value = 1.9E−34), and PC(18:1/18:2) with DV (r = −0.10, p-value = 0.008) (Additional file 2: Table 1).
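For readers less familiar with the Spearman coefficient reported here, a minimal Python sketch is given below. It computes the Pearson correlation of average ranks, which is the standard definition; the lipid concentrations and VPD values are hypothetical and do not come from the study data.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical lipid concentrations and VPD values for five women.
lipid = [0.2, 0.5, 0.9, 1.4, 2.0]
vpd_values = [4.0, 5.5, 5.0, 9.0, 12.0]
rho = spearman(lipid, vpd_values)
print(rho)
```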

Multivariable linear regression between lipid sub-pathways/species and MBD

At the sub-pathway level, 4 of the 14 lipid sub-pathways (DCER, LCER, DAG, and TAG) were significantly associated with VPD (Bonferroni p-value < 0.05) in the covariate-adjusted linear regression analyses (Fig. 1A and Additional file 1: Table 2). Fifty-six lipid species across seven lipid sub-pathways were significantly associated with VPD: TAG (N = 43), DAG (N = 7), PC (N = 2), PI (N = 1), LPC (N = 1), LCER (N = 1), and CE (N = 1). Four of the 56 lipid species were positively associated: LCER(14:0), LPC(18:1), PC(18:1/18:1), and PI(18:1/18:1), while the remaining 52 were inversely associated with VPD (Fig. 2A). Of the 52 lipid species that were inversely associated with VPD, 43 were from the TAG sub-pathway. The TAG species with similar chain lengths that were significantly associated with VPD were strongly positively correlated with one another, with correlations ranging from 0.72–0.99 for TAG50 species, 0.58–0.99 for TAG52, and 0.69–0.97 for TAG54. A one standard deviation increase in TAG54:6-FA20:4 was associated with a > 10% decrease in VPD, while a one standard deviation increase in LCER(14:0) and PI(18:1/18:1) was associated with a > 10% increase in VPD. Three lipid sub-pathways (DCER, DAG, and TAG) were significantly associated with NDV (Fig. 1B and Additional file 1: Table 2). One hundred and seventeen lipid species were significantly associated with NDV, with 113 lipid species from the DAG (N = 14) and TAG (N = 99) sub-pathways showing positive associations, and 4 species (LCER(14:0), LPC(18:1), PC(18:1/18:1), and PI(18:1/18:1)) showing inverse associations (Fig. 2B). Three lipid sub-pathways and 48 lipid species (38.4%) were significantly associated with both VPD and NDV (Fig. 1C and Fig. 2C), and in opposite directions as expected. Of note, a one standard deviation increase in TAG54:6-FA20:4 was associated with a > 10% change in both VPD (decrease) and NDV (increase). No lipid sub-pathway or species was significantly associated with DV.
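The percent changes per standard deviation follow from back-transforming coefficients estimated on the log10 scale. A minimal Python sketch, assuming the usual back-transformation for a log10-transformed outcome (percent change = (10^β − 1) × 100), is shown below; the coefficients are hypothetical and are not the estimates from this study.

```python
def percent_change_per_sd(beta_log10):
    """Percent change in the outcome per 1-SD increase in a standardized lipid,
    given a coefficient estimated on the log10-transformed outcome."""
    return (10.0 ** beta_log10 - 1.0) * 100.0


# Hypothetical coefficients on the log10(VPD) scale.
decrease = percent_change_per_sd(-0.05)   # negative coefficient: lower VPD
increase = percent_change_per_sd(0.045)   # positive coefficient: higher VPD
print(round(decrease, 2), round(increase, 2))
```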

Fig. 1

Covariate-adjusted Associations between Lipid Sub-pathways with VPD and NDV. A Volcano plot of lipid sub-pathways for VPD; B volcano plot of lipid sub-pathways for NDV; C Venn diagram of lipid sub-pathways for VPD, DV, and NDV. Abbreviations: volumetric percent density (VPD), non-dense volume (NDV), dihydroceramide (DCER), lactosylceramide (LCER), diacylglycerol (DAG), triacylglycerol (TAG)

Fig. 2

Covariate-adjusted Associations between Lipid Species with VPD and NDV. A Volcano plot of lipid species for VPD; B volcano plot of lipid species for NDV; C Venn diagram of lipid species for VPD, DV, and NDV. Abbreviations: volumetric percent density (VPD), non-dense volume (NDV), lysophosphatidylcholine (LPC), phosphatidylinositol (PI), lactosylceramide (LCER), diacylglycerol (DAG), phosphatidylcholine (PC), triacylglycerol (TAG)

Multivariable covariate-adjusted least square means of volumetric percent density by quartiles of lipid sub-pathways/species

We categorized the lipid sub-pathways and species into quartiles and then calculated the least squares means (LSM) of VPD and NDV across the quartiles for those that were significant in the multivariable linear regression analysis (Figs. 1 and 2). The four lipid sub-pathways, DCER, LCER, DAG, and TAG, were still associated with VPD after both FDR and Bonferroni correction (Table 2). All 56 lipid species that were associated with VPD on the continuous scale (Fig. 2A) were still significantly associated with VPD in quartiles at an FDR p-value < 0.05, and 48 at the more stringent Bonferroni p-value < 0.05 (Table 3). PI(18:1/18:1) was the lipid species with the strongest positive association with VPD. VPD increased across quartiles of PI(18:1/18:1) (Q1 = 7.5%, Q2 = 7.7%, Q3 = 8.4%, Q4 = 9.4%; FDR p-trend = 8.8E−05; Bonferroni p-trend = 0.02) (Table 3). All 117 lipid species that were significantly associated with NDV on the continuous scale (Fig. 2B) were still significantly associated with NDV at an FDR p-trend < 0.05, but only 101 were associated with NDV at a Bonferroni p-trend < 0.05 (Additional file 1: Table 3). Quartiles of DCER, LCER, DAG, and TAG were associated with NDV (Additional file 1: Table 4). There were no significant associations between lipid sub-pathways/lipid species and DV (Additional file 1: Table 4).
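The trend variable used for the p-trend values (each value replaced by the median of its quartile, as described in the statistical analysis) can be sketched in Python as follows. The quartile cut points here come from the standard-library `statistics.quantiles` function with its default method, which is an assumption on our part and may differ slightly from the cut points used in the analysis; the concentrations are hypothetical.

```python
import statistics


def quartile_trend_scores(values):
    """Return (quartile index 0-3, quartile median) for each value."""
    q1, q2, q3 = statistics.quantiles(values, n=4)

    def quartile(v):
        if v <= q1:
            return 0
        if v <= q2:
            return 1
        if v <= q3:
            return 2
        return 3

    groups = [quartile(v) for v in values]
    medians = {
        g: statistics.median([v for v, gi in zip(values, groups) if gi == g])
        for g in set(groups)
    }
    return groups, [medians[g] for g in groups]


# Hypothetical concentrations of one lipid species in eight women.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
groups, scores = quartile_trend_scores(conc)
print(groups, scores)
```

The resulting scores would then be entered as a single ordinal predictor in the covariate-adjusted model to test for a linear trend across quartiles.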

Table 2 Covariate-adjusted Least Square Means of Volumetric Percent Density by Quartiles of the 14 Lipid Sub-pathways
Table 3 Covariate-adjusted Least Square Means of Volumetric Percent Density (VPD) by Quartiles of Lipid Species that were Significantly Associated with VPD at a Bonferroni P-value < 0.05

Proportional odds model

Additional file 1: Fig. 2 presents the results from the covariate-adjusted proportional odds model investigating the associations between lipid sub-pathways and the four VPD categories. LPC and LCER were significantly positively associated with VPD categories, while DCER, DAG, and TAG were inversely associated.

LASSO regression and bootstrapping

Body fat %, body shape at age 10, parity/age at first birth, and family history of breast cancer were most strongly associated with VPD and were retained in the final LASSO-penalized multivariable linear regression model in 100% of the 200 bootstrapped datasets. Some lipid species (e.g., PI(18:1/18:1): 99%; LCER(14:0) and PE(O-16:0/22:6): 98.5%) were selected at higher frequencies than classic covariates that are associated with VPD, such as age, race, and age at menarche. For VPD, the adjusted R² (a model goodness-of-fit measure) derived from the model containing only the classic covariates was 0.45. A reasonably selected model (based on the scree plot of the adjusted R²), which included all the covariates and 57 lipid species, yielded an improved adjusted R² of 0.59 (results not shown).

Discussion

Using untargeted comprehensive lipidomics profiling, we demonstrate, for the first time, the associations of lipid species with MBD in premenopausal women. We observed 56 lipid species that were associated with VPD and 117 lipid species with NDV. Most of these lipid species were from the large TAG sub-pathway. Some of the lipid species explain greater variation in VPD than well-established determinants of MBD and appear to improve model goodness-of-fit: the covariate-only model had an adjusted R² of 0.45, whereas additionally including 57 lipid species increased the adjusted R² to 0.59.

To the best of our knowledge, this is the first study to characterize the lipidome of MBD. Several lipid species were associated with MBD after adjustment for adiposity. Studies have investigated the associations of biomarkers of adiposity, such as adipokines and insulin-like growth factors, with MBD, but the findings are not consistent. The association of most of these biomarkers with MBD, particularly the adipokines, is influenced by BMI. Several studies found a null association between these biomarkers and MBD after adjusting for measures of adiposity [10, 11, 22,23,24,25], although some studies reported an inverse association between leptin and percent density [10, 11, 26]. The association of C-peptide with MBD is similarly attenuated once adjusted for BMI [9, 27]. Similarly, the associations of insulin-like growth factors with MBD are not consistent [12, 28,29,30,31]. Our study, therefore, provides important novel insights on the biological mechanisms underlying MBD.

Although no population-based studies have comprehensively characterized the lipidome of breast cancer, preclinical studies suggest that breast cancer subtypes have unique lipidomic profiles [32,33,34], and a few studies have evaluated the utility of a limited set of lipid species for their diagnostic performance in breast cancer development. One study profiled 110 lipid species in 121 breast cancer cases and 45 healthy controls; 19 lipid species distinguished women with triple-negative breast cancer (TNBC) from non-cancer controls, and 5 lipid species distinguished women with TNBC from women with other types of breast cancer [35]. Another study compared lipid profiles in breast tissue of 42 women with breast cancer to 19 healthy controls, reporting 48 significantly differential lipid species. They identified higher levels of lipid species in the LPC sub-pathway but lower levels of lipids in ceramides, DAG, PC, and PE sub-pathways in breast tissues of women with breast cancer compared to healthy controls [36]. They found higher levels of LPC(18:1) in cancer tissue compared to normal tissue [36]. We found a significant positive association between LPC(18:1) and VPD; hence, LPC(18:1) may be a novel biomarker of VPD and breast cancer development. LPC(18:1) has been found to be abundant in MCF-7 cell lines (ER+/PR+) compared to MCF10A cell lines (normal) [32].

Lipids are generated in cells through de novo synthesis or by intake from exogenous sources [15]. Lipid species are a diverse group of compounds that are categorized by their primary function, which is determined by their polar head, and are further differentiated by the addition of a hydrocarbon chain [37]. Lipids are functionally important in maintaining cell membranes and fueling the cell [15]. Lipids within the phospholipid and sphingolipid super pathways are key components in cell membrane development, while neutral complex lipids, particularly DAG and TAG, provide energy to fuel the cell [34].

Three of the 4 lipid species (PC(18:1/18:1), LPC(18:1), PI(18:1/18:1)) that were positively associated with VPD were from the phospholipid super pathway, and the associations are biologically plausible. They contain at least one monounsaturated fatty acid (MUFA) chain. Higher levels of MUFAs have been reported in breast tumor tissues compared to normal tissues from the same woman [38], and some studies, but not others, have suggested associations of MUFAs with breast cancer risk [39]. PI(18:1/18:1) promotes cell survival and is responsive to diet and stress [40]. LPCs have an impact on innate immunity and have been shown to possess both anti- and pro-inflammatory properties [41]. Two studies exploring the relationship between circulating triglycerides and MBD observed null or inverse findings [42, 43]. Our findings on the inverse associations of several TAG species with VPD, therefore, provide new information on triglycerides and MBD. Because of the very strong correlations of TAG species with the same chain lengths, studies are needed to determine which of the TAG species are more amenable to targeting for breast cancer prevention.

Strengths/limitations

Our study population is large and well annotated, with detailed information on demographic, reproductive, and anthropometric measures. This allowed us to perform robust analyses with control for confounders. The study population is diverse and mirrors the underlying population attending screening mammography at our institution; hence, our results are generalizable to the source population from which our study participants were recruited and to other populations with similar characteristics [8]. We performed lipidomic profiling on fasting blood samples, which should provide a more reliable measure of the lipid species than using non-fasted samples [44].

There are also some limitations. The study is cross-sectional; therefore, we cannot establish longitudinal trajectories of lipids. Due to the large number of lipid species evaluated, some findings may be due to chance. Nevertheless, we applied the stringent Bonferroni correction; hence, the potential for chance findings is considerably reduced. Our study is limited to premenopausal women; therefore, our results cannot be generalized to postmenopausal women.

Conclusion

Our study offers new insights into the biological mechanisms underlying high MBD in premenopausal women. Additional studies are needed to validate our findings.