Sex-steroid hormones and risk of postmenopausal estrogen receptor-positive breast cancer: a case–cohort analysis

Purpose Sex-steroid hormones are associated with postmenopausal breast cancer, but potential confounding from other biological pathways is rarely considered. We estimated risk ratios for sex-steroid hormone biomarkers in relation to postmenopausal estrogen receptor (ER)-positive breast cancer, while accounting for biomarkers from insulin/insulin-like growth factor-signaling and inflammatory pathways.

Methods This analysis included 1208 women from a case–cohort study of postmenopausal breast cancer within the Melbourne Collaborative Cohort Study. Weighted Poisson regression with a robust variance estimator was used to estimate risk ratios (RRs) and 95% confidence intervals (CIs) of postmenopausal ER-positive breast cancer per doubling of plasma concentration of progesterone, estrogens, androgens, and sex-hormone binding globulin (SHBG). Analyses included sociodemographic and lifestyle confounders, and other biomarkers identified as potential confounders.

Results Increased risks of postmenopausal ER-positive breast cancer were observed per doubling of plasma concentration of progesterone (RR 1.22, 95% CI 1.03 to 1.44), androstenedione (RR 1.20, 95% CI 0.99 to 1.45), dehydroepiandrosterone (RR 1.15, 95% CI 1.00 to 1.34), total testosterone (RR 1.11, 95% CI 0.96 to 1.29), free testosterone (RR 1.12, 95% CI 0.98 to 1.28), estrone (RR 1.21, 95% CI 0.99 to 1.48), total estradiol (RR 1.19, 95% CI 1.02 to 1.39) and free estradiol (RR 1.22, 95% CI 1.05 to 1.41). A possible decreased risk was observed for SHBG (RR 0.83, 95% CI 0.66 to 1.05).

Conclusion Progesterone, estrogens and androgens likely increase postmenopausal ER-positive breast cancer risk, whereas SHBG may decrease risk. These findings strengthen the causal evidence surrounding the sex-hormone-driven nature of postmenopausal breast cancer.

Supplementary Information The online version contains supplementary material available at 10.1007/s10552-024-01856-6.

a Physical activity was measured as total weighted minutes of walking, moderate- and vigorous-intensity recreation- and transport-related physical activity (MVPA) per week at the second follow-up wave.
Insufficiently active was defined as < 150 total weighted minutes of MVPA per week, sufficiently active was defined as 150 to ≤ 300 total weighted minutes of MVPA per week, and highly active was defined as > 300 total weighted minutes of MVPA per week.
Missing data for covariates include: 8 for socioeconomic disadvantage, 773 for physical activity and 44 for lifetime alcohol consumption.
Southern European Migrant status, socioeconomic disadvantage, education, smoking status, lifetime alcohol consumption, body mass index, dietary calcium intake and total carotenoid intake from diet were measured at baseline. Age at blood collection and physical activity were measured at the second follow-up wave. Age at menopause was not estimated for all eligible women and was thus not presented in the table.

Online Resource 2: Handling of competing risks
To minimize the impact of death as a competing risk and maximize the retention of eligible cases in this relatively older cohort, follow-up was chosen to end on participants' 86th birthday (as informed by Online Resource Table 2.1). Accommodating competing risks is acknowledged as a challenge in the causal inference literature, especially in the absence of time-varying data [1,2].

b Total number in the case-cohort after 32 women were retrospectively excluded due to estradiol values at or above 29.3 pg/mL (or 107.6 pmol/L), one woman was excluded due to non-participation in the second follow-up wave despite providing a blood sample, and four cases outside the subcohort were retrospectively disqualified (diagnosis by death certificate only or non-adenocarcinoma breast cancer).

c Calculated as the cumulative number of breast cancer cases by the attained age divided by the total number of breast cancer cases (N = 437), multiplied by 100.
a The ULOQ is 100 ng/mL for IGF-1 measured in batch 5 only.

Online Resource 4: Assessment of reliability of biomarker measurements
Three quality control (QC) samples were created for the reliability study, one for each of three body mass index (BMI) categories at baseline: normal (≥ 18.5 kg/m² to < 25 kg/m², QC1), overweight (≥ 25 kg/m² to < 30 kg/m², QC2), and obese (≥ 30 kg/m², QC3). Each QC sample contained the pooled plasma of 38 women who met the initial eligibility criteria for the case-cohort study and had at least three vials of plasma remaining at the second follow-up wave (F2). Two replicates of each QC sample were used in each batch. Intra-assay and inter-assay coefficients of variation (CVs) were calculated for each quality control sample for each biomarker to assess the reliability of biomarker measurements within and across batches, respectively (Online Resource Table 4.1).
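The source does not state the exact CV formulas, so the following is only a minimal sketch under one common convention: the intra-assay CV is taken as the mean of the within-batch CVs of the replicates, and the inter-assay CV as the CV of the batch means. The function names and the example concentrations are hypothetical.

```python
from statistics import mean, stdev


def intra_assay_cv(batches):
    """Mean of the within-batch CVs (%), one CV per batch of replicates.

    Assumed convention; the source does not specify its formula.
    """
    return mean(100 * stdev(reps) / mean(reps) for reps in batches)


def inter_assay_cv(batches):
    """CV (%) of the batch means across batches (assumed convention)."""
    batch_means = [mean(reps) for reps in batches]
    return 100 * stdev(batch_means) / mean(batch_means)


# Hypothetical data: two replicates of one QC sample in each of three batches.
qc1 = [[10.0, 10.4], [9.6, 9.8], [10.6, 10.2]]
print(f"intra-assay CV: {intra_assay_cv(qc1):.2f}%")
print(f"inter-assay CV: {inter_assay_cv(qc1):.2f}%")
```

With two replicates per batch, as in the reliability study, each within-batch standard deviation rests on a single degree of freedom, so individual batch CVs are noisy and only their average is informative.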
In addition, linear mixed-effects regression models with a fixed effect for dispatch and random crossed-effects for BMI and batch were specified for each biomarker. These models were used to estimate within-batch and between-batch intra-class correlation coefficients (ICCs) (Online Resource Table 4.2). The delta method was used to calculate 95% confidence intervals. The within-batch ICCs estimate the correlation between biomarker measurements within the same batch and BMI category, and the between-batch ICCs estimate the correlation between biomarker measurements within the same BMI category (but not the same batch), after correction for dispatch effects [3].

a CV for dispatch one: 13.44%; CV for dispatch two: 13.77%.

b CV for dispatch one: 12.11%; CV for dispatch two: 11.50%.
Normalization of biomarker values before analyses accounted for batch effects (Online Resource 6).
The estimated 95% confidence intervals presented in Online Resource Table 4.2 were wide and/or required truncation. Constructing confidence intervals for ICCs is difficult when one or more factors have few levels (in this case, BMI) [4]. This may explain why estimated lower bounds and upper bounds could be less than 0% and exceed 100%, respectively. Bounds were truncated at 0% or 100% where appropriate.

Online Resource 6: Normalization of biomarkers
To prepare biomarker data for normalization, values above the upper limit of quantification (ULOQ), or below the lower limit of quantification (LLOQ) or lower limit of detection (LOD), were first imputed as the ULOQ or LLOQ/2, respectively. Biomarker concentrations were converted to standard molar units where possible before log2-transformation. Reasons for missing data were also identified and were mostly attributable to missing samples or technical issues pertaining to a specific biomarker measurement.
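As an illustration of the imputation and transformation steps just described, the sketch below assumes that out-of-range measurements are available as numeric values and that the limits are known for each biomarker. The function name `prepare_value` and the example limits are hypothetical, and the unit conversion step is omitted.

```python
import math


def prepare_value(value, lloq, uloq):
    """Impute an out-of-range concentration, then log2-transform it.

    Values above the ULOQ are set to the ULOQ; values below the LLOQ
    are set to LLOQ/2. In-range values are left unchanged.
    """
    if value > uloq:
        value = uloq
    elif value < lloq:
        value = lloq / 2
    return math.log2(value)


# Hypothetical limits: LLOQ = 1.0 and ULOQ = 100.0 in the biomarker's units.
print(prepare_value(8.0, lloq=1.0, uloq=100.0))    # in range
print(prepare_value(0.2, lloq=1.0, uloq=100.0))    # below LLOQ, imputed as 0.5
print(prepare_value(150.0, lloq=1.0, uloq=100.0))  # above ULOQ, imputed as 100.0
```

Because the values are on the log2 scale, a one-unit increase corresponds to a doubling of concentration, which is why the risk ratios in the main analysis are reported per doubling.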
The linear mixed-effects models for normalization were specified for each biomarker to include a batch-specific random effect, and fixed effects for dispatch, time since last meal, and biological sources of variation. Time since last meal was centered at twelve hours and modelled using restricted cubic splines. Biological sources of variation were identified a priori and included case status, age at blood collection, BMI at F2, Southern European migrant status and smoking status at F2. Normalized values were estimated as the sum of the residual error, the regression term for the constant, and biological covariates [5]. This calculation retains biological sources of variation but removes nuisance variation by essentially 'fixing' the data as if every measurement came from the first dispatch (reference category) and all participants were fasting for T = 12 hours [5].
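The normalization described above can be written out as follows. The notation is an assumption introduced here for illustration and does not appear in the source: y_ij is the log2 concentration for participant i in batch j, D_ij indicates a dispatch other than the first, T_ij is the time since last meal in hours, f is the restricted cubic spline with f(0) = 0, x_ij holds the biological covariates, u_j is the batch random effect, and the normal distributions are the usual mixed-model assumptions.

```latex
% Assumed notation, for illustration only
\begin{align}
y_{ij} &= \beta_0 + \beta_D D_{ij} + f(T_{ij} - 12)
          + \mathbf{x}_{ij}^{\top}\boldsymbol{\gamma} + u_j + \varepsilon_{ij},
          \qquad u_j \sim N(0, \sigma_u^2),\quad
          \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2) \\
\tilde{y}_{ij} &= \hat{\beta}_0 + \mathbf{x}_{ij}^{\top}\hat{\boldsymbol{\gamma}}
          + \hat{\varepsilon}_{ij}
\end{align}
```

Under this sketch, the normalized value in the second line is the fitted model evaluated at the first dispatch (D_ij = 0), at T_ij = 12 hours and with the batch effect removed, plus the residual, so variation due to dispatch, fasting time and batch drops out while the biological covariates remain.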
Prior to normalization, the linear mixed-effects models were used to estimate residual ICCs for the total proportion of variation attributable to batch (Online Resource Table 6.1). All estimated residual ICCs were below 5% except for IGFBP-3 (6%), adiponectin (8%), IGF-1 (11%), and TNF-α (16%) (Online Resource Table 6.1).

The causal diagram considers sociodemographic and lifestyle covariates only. Biomarkers of the sex-steroid hormone pathway and physical activity were measured at the second follow-up wave (F2). The measure for age was age at blood collection. Age at menopause was measured when the cessation of periods for 12 months was first reported (baseline, the first follow-up wave, or F2). Postmenopausal breast cancer was measured after F2. All other covariates were measured at baseline. Health consciousness was not measured (indicated by *), but is assumed to influence lifestyle factors (i.e., diet, alcohol consumption, smoking status, physical activity) and be influenced by sociodemographic factors (i.e., country of birth, socioeconomic disadvantage, education).

Highest level of attained education was self-reported at baseline in the Melbourne Collaborative Cohort Study (MCCS) and analyzed as a categorical variable. Country of birth was recorded at baseline and modelled as Southern European migrant status (yes or no).

Socioeconomic disadvantage was measured using the Index of Relative Socioeconomic Disadvantage from the Socioeconomic Indexes for Areas (SEIFA) and categorized into quintiles (from the most to the least disadvantaged; modelled as a continuous variable in the analyses), which were derived at baseline from residential addresses and Australian census data [6].
Dietary factors associated with postmenopausal breast cancer include dietary intake of calcium and dietary intake of carotenoids [7]. Baseline dietary intakes of carotenoids (continuous mcg/d) and dietary calcium intake (continuous mg/d) were derived from a 144-item food frequency questionnaire [6]. Dietary intake of carotenoids was the summation of total dietary intake of alpha-carotene, beta-carotene, beta-cryptoxanthin, lutein and zeaxanthin, and lycopene.
Lifetime alcohol consumption to baseline (continuous g/d) was calculated using self-reported consumption of alcoholic beverages. Baseline smoking status was modelled as a binary variable (ever or never smoked).
Physical activity can arguably influence adiposity, but due to the temporal order of measured variables, adiposity at baseline was assumed to influence physical activity at F2. Adiposity was marked by continuous BMI (kg/m²) at baseline, derived from height and mass measured at the study center. Physical activity data at F2 were used because the assessment at F2 was more comprehensive than that at baseline. At F2, the duration and frequency of recreation- and transport-related walking, moderate-intensity and vigorous-intensity physical activity of 10 minutes duration or longer across three months was self-reported using a physical activity questionnaire based on the short form of the International Physical Activity Questionnaire (IPAQ-short) [8]. Physical activity was organized into total weighted minutes of moderate-vigorous physical activity (MVPA) and categorized according to the Physical Activity and Exercise Guidelines for Australians [9]: insufficiently active (< 150 total weighted minutes of MVPA per week); sufficiently active (150 to ≤ 300 total weighted minutes of MVPA per week); and highly active (> 300 total weighted minutes of MVPA per week).
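The three activity categories above can be expressed as a simple threshold rule. The sketch below takes the total weighted minutes of MVPA per week as already computed; the weighting scheme itself is not described in this section, and the function name is hypothetical.

```python
def activity_category(weighted_mvpa_minutes):
    """Classify weekly total weighted MVPA minutes into the three categories."""
    if weighted_mvpa_minutes < 150:
        return "insufficiently active"
    if weighted_mvpa_minutes <= 300:
        return "sufficiently active"
    return "highly active"


# Boundary values fall in the middle category: 150 and 300 are both
# "sufficiently active".
for minutes in (149, 150, 300, 301):
    print(minutes, activity_category(minutes))
```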
Age was modelled as continuous age at blood collection (years) using restricted cubic splines with three degrees of freedom. Where possible, age at menopause (≤ 48; 49-50; 51-52; ≥ 53 years) was estimated from the data corresponding to when the cessation of periods for 12 months was first reported (baseline, the first follow-up wave, or F2).
to be upstream of the insulin/insulin-like growth factor (IGF)-signaling pathway, which was assumed to be upstream of the sex-steroid hormone pathway.
Three biomarkers of the sex-steroid hormone pathway were not assumed to have direct effects on postmenopausal breast cancer (Online Resource Fig. 8.1): sex hormone binding globulin (SHBG); progesterone; and DHEA. The inverse association between SHBG and postmenopausal breast cancer risk observed in the literature is assumed to be driven by the role of this glycoprotein in reducing the bioavailability of estrogens and androgens [10][11][12][13].
Drummond et al. [13] found moderate-quality evidence that progesterone was not associated with breast cancer, and thus progesterone was assumed to influence breast carcinogenesis indirectly (e.g., via its role as a precursor in steroidogenesis).There was also evidence to suggest that DHEA was not associated with breast cancer in this review [13].
Online Resource Table 8.1 Selection of biomarkers that may be potential confounders in primary analyses
c Analyses of free testosterone and estradiol will not adjust for SHBG.
r ≥ 0.50 was considered a strong correlation.

Online Resource 9: Characteristics of the case-cohort after post-hoc exclusions
Online Resource Table 9 C-Peptide (ng/mL)

Inflammatory Pathway
Leptin (pg/mL) 17586 (
a Physical activity was measured as total weighted minutes of walking, moderate- and vigorous-intensity recreation- and transport-related physical activity (MVPA) per week at the second follow-up wave. Insufficiently active was defined as < 150 total weighted minutes of MVPA per week, sufficiently active was defined as 150 to ≤ 300 total weighted minutes of MVPA per week, and highly active was defined as > 300 total weighted minutes of MVPA per week.
b Age at menopause was measured for naturally postmenopausal women only, when the cessation of periods for 12 months was first documented (baseline, the first follow-up wave, or the second follow-up wave).
Missing data for normalized biomarkers are as follows: 19 for progesterone; 18 for androstenedione; 18 for testosterone; 18 for DHEA; 20 for estrone; 27 for estradiol; 18 for SHBG; 2 for insulin; 2 for IGF-1; 2 for IGFBP-3; 2 for C-peptide; 2 for leptin; 3 for adiponectin; 2 for TNF-α; 2 for IL-6; 2 for IL-8; 2 for IFN-γ; 6 for CRP. Missing data for other covariates include: 1 for socioeconomic disadvantage; 86 for physical activity; 491 for age at menopause (including 55 naturally postmenopausal women). Southern European Migrant status, socioeconomic disadvantage, education, smoking status, lifetime alcohol consumption, body mass index, dietary calcium intake and total carotenoid intake from diet were measured at baseline. Biomarker concentrations, age at blood collection and physical activity were measured at the second follow-up wave.
Online Resource 10: Sensitivity analyses excluding estrogen receptor-negative/progesterone receptor-positive tumors and tumors of unknown hormone receptor status
Online Resource Table 10
The results of the primary analyses and sensitivity analyses were adjusted for sociodemographic and lifestyle confounders (education, socioeconomic disadvantage, Southern European Migrant status, dietary intake of carotenoids at baseline, dietary intake of calcium at baseline, lifetime alcohol consumption at baseline, smoking status at baseline, adiposity at baseline, physical activity at the second follow-up wave and age at blood collection) and other biomarkers identified as potential confounders, where applicable (Online Resource 8). The sensitivity analyses additionally excluded cases that were ER-/PR+ or of unknown hormone receptor status.
Online Resource 11: Sensitivity analyses excluding cases and deaths that occurred within one year of blood draw at the second follow-up wave
Online Resource Table 11
The results of the primary analyses and sensitivity analyses were adjusted for sociodemographic and lifestyle confounders (education, socioeconomic disadvantage, Southern European Migrant status, dietary intake of carotenoids at baseline, dietary intake of calcium at baseline, lifetime alcohol consumption at baseline, smoking status at baseline, adiposity at baseline, physical activity at the second follow-up wave and age at blood collection) and other biomarkers identified as potential confounders, where applicable (Online Resource 8). The sensitivity analyses additionally excluded cases and deaths that occurred within one year of blood draw at F2.

Results were adjusted for sociodemographic and lifestyle confounders (education, socioeconomic disadvantage, Southern European Migrant status, dietary intake of carotenoids at baseline, dietary intake of calcium at baseline, lifetime alcohol consumption at baseline, smoking status at baseline, adiposity at baseline, physical activity at the second follow-up wave and age at blood collection). Results were also adjusted for age at menopause and other biomarkers identified as potential confounders (where applicable, Online Resource 8), unless otherwise specified.

Online Resource 5: Reference standards for estradiol and testosterone
Reference standards for estradiol and testosterone were used to evaluate assay performance at the Nutrition and Metabolism Branch, IARC. Reference standards with known plasma concentrations for estradiol and testosterone were obtained from the Clinical Standardization Programs at the US Centers for Disease Control and Prevention. For each biomarker, fifteen samples were chosen to be representative of the range of values for postmenopausal women. The reference samples were aliquoted into 60 samples of four replicates and measured in two batches. Two sets of replicates were measured in each batch using a liquid chromatography-mass spectrometry system. Assay details are described in the main text and Online Resource 3.
The concentrations of the reference standards and their corresponding measurements at IARC were highly correlated. The calculated Pearson correlation coefficients (also known as the validity coefficients) for estradiol and testosterone were 0.987 and 0.997, respectively. For estradiol, the measured values slightly overestimated the true values at higher concentrations (Online Resource Fig. 5.1). The measured values for testosterone were very close to the true values (Online Resource Fig. 5.2).

Online Resource Fig. 5.1 Correlation plot for estradiol
E2: Estradiol. The beta-coefficient for the regression line used to generate predicted values was 1.15.

Online Resource Fig. 5.2 Correlation plot for testosterone
T: Testosterone. The beta-coefficient for the regression line used to generate predicted values was 1.09.

Online Resource 7: Identification, measurement and modelling of sociodemographic and lifestyle confounders
Online Resource Fig. 7.1 Causal diagram of the relationship between biomarkers of the sex-steroid hormone pathway and postmenopausal breast cancer

Online Resource 1: Characteristics of the subcohort and all eligible women for the case-cohort study within the Melbourne Collaborative Cohort Study

Online Resource Table 1.1 Characteristics of the subcohort and all eligible women
Number. IQR: Interquartile range. g/d: Grams per day. mg/d: Milligrams per day. mcg/d: Micrograms per day. kg/m²: Kilograms per meter squared.

Online Resource Table 2.1 Competing risk of death by age

d Calculated as the cumulative number of deaths from other causes by the attained age among subcohort non-failures divided by the remaining subcohort non-failures by the attained age, multiplied by 100.

Online Resource 3: Measured biomarkers and methods of measurement
All biomarkers were measured at the Nutrition and Metabolism Branch, International Agency for Research on Cancer (IARC) as outlined in Online Resource Table 3.1.

Online Resource Table 3.1 Biomarkers and methods of measurement

Online Resource Table 4.1 Overall intra-assay and inter-assay coefficients of variation for

Online Resource 12: Sensitivity analyses for naturally postmenopausal women with a recorded age at menopause
Online Resource Table 12.1 Risk ratios for postmenopausal estrogen receptor-positive breast cancer per doubling of biomarker concentration, for naturally postmenopausal women with a recorded age at menopause
Confidence interval. DHEA: Dehydroepiandrosterone. SHBG: Sex hormone binding globulin. IGF-1: Insulin-like growth factor-1. IL-6: Interleukin-6. TNF-α: Tumor necrosis factor-alpha. nmol/L: Nanomoles per liter. pmol/L: Picomoles per liter.