Background

Breast cancer (BC), one of the most common malignant diseases in women worldwide, can be highly invasive and metastatic [1]. According to Global Cancer Statistics 2020, female breast cancer has surpassed lung cancer as the most commonly diagnosed cancer, with an estimated 2.3 million new cases (11.7%) [2]. Female breast cancer in China accounts for approximately 18% (0.42 million) of the global breast cancer deaths [3]. According to data published in 2022, an estimated 43,250 women died from BC in the United States, which corresponds to nearly 118 BC-related deaths in women per day [4]. More than 90% of BC-related deaths are related to cancer cell metastasis [5]. Previous studies have shown that breast tumor cells can easily metastasize to important organs, such as the lung, liver, bone, or brain, through blood vessels and lymphatic vessels [1]. BC is an intricate disease characterized by genetic, epigenetic, and phenotypic changes [6]. At present, the carcinogenic mechanism of BC is not fully understood. Some studies have indicated that the occurrence of BC is influenced by genes, lifestyles, and other factors [7,8,9]. Age, family history, obesity, oral contraceptive use, menopausal status, smoking, alcohol consumption, lifestyle, and genetic factors are significantly linked to the occurrence and development of BC [10], with genetic factors being especially important. Several association studies have identified single nucleotide polymorphisms (SNPs) correlated with BC susceptibility [11,12,13].

Cytochrome P450 (CYP), a superfamily of cysteine heme monooxygenases, is responsible for biotransforming a large number of endogenous and exogenous substances via oxidative reactions [14]. CYP4B1, one of the CYP isoforms, is mainly expressed in lung and bladder tissues and can also be detected in BC tissues and cell lines (https://gtexportal.org/home/gene/CYP4B1). Studies have shown that altered CYP4B1 gene expression may be associated with certain cancers [15], suggesting that CYP4B1 may play a role in the development of cancer. High CYP4B1 expression increases the chance of bioactivation of carcinogenic aromatic amines, thereby leading to a high risk of cancer growth [15,16,17]. Gene mutations are associated with protein expression levels, and gene polymorphisms may play a role in disease susceptibility by affecting gene expression and/or enzyme activity [18, 19]. Previous studies have reported that CYP4B1 polymorphisms are significantly correlated with the risk of various cancers, such as gastric cancer [20], bladder cancer [21], and lung cancer [22]. Using reverse transcription-polymerase chain reaction (RT-PCR), Iscan et al. showed that CYP4B1 mRNA is expressed in both breast tumor and tumor-free tissues [23]. However, to date, little is known about the association of CYP4B1 SNPs with BC risk.

This case-control study of 1,143 Chinese women was conducted to evaluate the association between CYP4B1 SNPs and the risk of BC in the Chinese Han population. Additionally, stratified analyses based on demographic and clinicopathological features were used to examine the relationship between these SNPs and BC susceptibility, which may provide a theoretical basis for the prevention and early detection of BC.

Materials and methods

Study population and clinical data

In this study, a total of 571 Chinese women diagnosed with BC by clinical examination and the Breast Imaging Reporting and Data System (BI-RADS) were randomly selected as the case group, and 572 healthy Chinese women with no personal or family history of cancer were chosen as the control group. There was no genetic relationship between cases and controls. Patients aged < 20 years or > 70 years, or with a prior history of cancer, metastatic cancer, radiotherapy, or chemotherapy, were excluded. All healthy volunteers were recruited as controls from the health examination center of the same hospital during the same period. Demographic information (age, weight, height, smoking and drinking habits, reproductive number, and menopausal status) and clinical information (estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor-2 (HER2), cell proliferation marker (Ki67), lymph node metastasis (LNM), and staging) for all participants were collected from questionnaires and medical records, respectively. Each participant signed an informed consent form after learning about the purpose of the study. This study was approved by the Ethics Committee of the First Affiliated Hospital of Xi’an Jiaotong University School of Medicine, and all experimental methods were strictly in compliance with the 1964 Helsinki Declaration.

ER, PR, HER2, and the cell proliferation marker Ki67 in patient tissue samples were stained by immunohistochemical (IHC) assay. According to Breast Cancer, version 3.2018 [24], patients were classified as ER-positive (ER+) or ER-negative (ER-), PR-positive (PR+) or PR-negative (PR-), and HER2-positive (HER2+) or HER2-negative (HER2-). Samples in which more than 25% of BC cells showed Ki67 staining were classified as Ki67-positive (Ki67+); all others were classified as Ki67-negative (Ki67-). Tumor staging and lymph node metastasis (LNM) were identified based on the 2010 American Joint Committee on Cancer (AJCC) criteria [25].

DNA extraction and SNP genotyping

Fasting venous blood samples were collected from all participants in the morning using a vacutainer containing EDTA. Genomic DNA was extracted and purified according to the kit instructions (GoldMag Co., Ltd., Xi’an, China). DNA was then stored at -80 °C until use. Three SNPs of CYP4B1 (rs2297813 G/T, rs12142787 G/A, and rs3766197 C/T) were selected as follows: (1) 113,617 variations of CYP4B1 were retrieved from the Ensembl GRCh37 database (http://asia.ensembl.org/Homo_sapiens/Info/Index); (2) 923 variations with a global minor allele frequency (MAF) > 0.05 were retained; (3) of the 402 biallelic variations, 30 were randomly selected; and (4) these candidates were filtered using the MassARRAY primer design software together with the criteria of Hardy-Weinberg equilibrium (HWE) p > 0.05, MAF > 0.05, and call rate > 95% in our study population. The MassARRAY system (Agena, San Diego, CA, USA) was used for genotyping, and all primers were designed with the MassARRAY Assay Design software.

Data analysis

The expression of the CYP4B1 gene in BC was analyzed using the GEPIA database (http://gepia.cancer-pku.cn/index.html). The Kaplan-Meier Plotter database (https://kmplot.com/analysis/) was used to assess the correlation between CYP4B1 expression and survival in patients with BC. HaploReg v4.1 (https://pubs.broadinstitute.org/mammals/haploreg/haploreg.php), RegulomeDB (https://regulome.stanford.edu/regulome-search/), and the GTEx Portal database (https://gtexportal.org/home/) were used to predict the potential functions of the SNPs.

Data analysis was performed with SPSS software version 21.0 (SPSS, Chicago, IL, USA), and two-sided p < 0.05 indicated statistical significance. Differences in demographic characteristics (age, smoking, and drinking) between BC patients and healthy controls were analyzed by t-test or χ2-test. Hardy-Weinberg equilibrium (HWE) was tested by χ2-test to determine whether the genotype distributions in our study population were in genetic equilibrium. Logistic regression models adjusted for age, drinking, and smoking were used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) with the SNPStats online software (https://www.snpstats.net/start.htm?q=snpstats/start.htm). ORs and CIs represented the effect of CYP4B1 SNPs on the risk of BC (OR = 1: no effect; OR < 1: protective factor; OR > 1: risk factor). The association of SNP-SNP interactions with BC risk was evaluated with multi-factor dimensionality reduction (MDR) 3.0.2 software.
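To illustrate how an OR and its 95% CI relate to genotype counts under the genetic models named in the Results, the following is a minimal sketch and not the SNPStats procedure used in this study. It computes a crude (unadjusted) OR with a Woolf-type CI, whereas the study used logistic regression adjusted for age, drinking, and smoking. The function names and the genotype counts in the usage note are hypothetical. The log-additive model requires regression and is not covered here.

```python
import math


def collapse(counts, model):
    """Collapse genotype counts into (exposed, unexposed) under a genetic model.

    counts = (major homozygote, heterozygote, minor homozygote).
    """
    maj_hom, het, min_hom = counts
    if model == "allele":
        # Count chromosomes rather than people: minor vs. major alleles.
        return het + 2 * min_hom, 2 * maj_hom + het
    if model == "dominant":
        return het + min_hom, maj_hom
    if model == "recessive":
        return min_hom, maj_hom + het
    if model == "over-dominant":
        return het, maj_hom + min_hom
    raise ValueError(f"unsupported model: {model}")


def odds_ratio(case_counts, control_counts, model, z=1.96):
    """Crude OR with a Woolf (log-scale) 95% CI; no covariate adjustment."""
    a, b = collapse(case_counts, model)      # cases: exposed, unexposed
    c, d = collapse(control_counts, model)   # controls: exposed, unexposed
    if 0 in (a, b, c, d):
        raise ValueError("a zero cell makes the crude OR undefined")
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper
```

For example, with hypothetical counts, `odds_ratio((400, 150, 21), (430, 125, 17), "dominant")` returns the OR for carrying at least one minor allele. A CI that excludes 1 corresponds to a two-sided p < 0.05 under this approximation.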

Results

The characteristics of study participants

The basic characteristics of participants are summarized in Table 1. The average ages of cases and controls were 52.11 ± 10.14 years and 51.88 ± 9.78 years, respectively. No significant difference in age (p = 0.700), smoking (p = 0.190), or drinking (p = 0.260) was detected between the two groups. Among the 571 patients, 42.4% were menopausal and 42.0% had more than one birth. The proportions of early-stage (stage I, II) and late-stage (stage III, IV) tumors were 60.6% and 25.2%, respectively. The IHC results showed that 63.4% of cases were ER+, 54.8% were PR+, 22.1% were HER2+, and 62.9% had a high level of Ki67. In the BC group, there were 194 patients (34.0%) with LNM and 241 patients (42.2%) without LNM.

Table 1 Characteristics of patients with breast cancer and healthy individuals

CYP4B1 gene expression and its correlation with the prognosis of BC patients and SNP genotype in silico

The expression of the CYP4B1 gene in normal and BC tissues was analyzed using the GEPIA database, which indicated that the expression level of CYP4B1 in BC tissues was significantly lower than that in normal tissues (p < 0.05, Fig. 1A). Moreover, Kaplan-Meier Plotter analysis showed that higher CYP4B1 expression was associated with worse survival in BC patients (p = 0.048, Fig. 1B).

Fig. 1

The expression of CYP4B1 mRNA in BC tissue (A) and its correlation with prognosis of BC patients (B). Data from GEPIA database (http://gepia.cancer-pku.cn/index.html) and Kaplan-Meier Plotter database (https://kmplot.com/analysis/). *P < 0.01. Red indicates BC tissues and grey indicates normal tissues

In the bioinformatics analysis, these SNPs might be associated with the regulation of promoter/enhancer histone marks, motif changes, selected eQTL hits, GRASP QTL hits, transcription factor (TF) binding, and chromatin accessibility peaks (Table 2). Based on the GTEx Portal database, the genotypes of rs2297813 (p = 1.76e-7), rs12142787 (p = 9.06e-9), and rs3766197 (p = 2.17e-6) were associated with the mRNA expression of CYP4B1 in breast tissues (Fig. 2). Compared with the rs2297813-GG and rs12142787-GG genotypes, the GT and TT genotypes of rs2297813 and the GA and AA genotypes of rs12142787 may be associated with lower expression of CYP4B1 mRNA. In addition, the CT and TT genotypes of rs3766197 may be associated with higher CYP4B1 mRNA expression than the CC genotype.

Table 2 The basic information and HWE about the selected SNPs of CYP4B1
Fig. 2

The violin plot for the association between the genotypes of rs2297813, rs12142787, and rs3766197 and the mRNA expression of CYP4B1 in breast tissues. Data from GTEx Portal database (https://gtexportal.org/home/)

Information about candidate SNPs

Basic information about the three candidate genetic loci of CYP4B1 (rs2297813 G/T, rs12142787 G/A, and rs3766197 C/T) is presented in Table 2. The MAF of each candidate SNP was higher than 0.05, and all genotype distributions were in line with HWE (p > 0.05).
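The HWE check reported above can be sketched as a χ2 goodness-of-fit test on genotype counts. This is an illustrative implementation under the usual assumptions (one biallelic locus, 1 degree of freedom) and is not necessarily the exact routine used in this study. The function name is hypothetical.

```python
import math


def hwe_chi2(n_maj_hom, n_het, n_min_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2 statistic, p-value) for one biallelic locus (1 df).
    """
    observed = (n_maj_hom, n_het, n_min_hom)
    n = sum(observed)
    # Major-allele frequency estimated from the observed genotypes.
    p = (2 * n_maj_hom + n_het) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        raise ValueError("monomorphic locus: HWE test is undefined")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

A SNP would pass the filter described in the Methods when the returned p-value exceeds 0.05 in the study population. For small expected counts an exact test is generally preferred over this approximation.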

Association between CYP4B1 SNPs and BC risk

There was no significant correlation between CYP4B1 SNPs and BC risk in the overall analysis (Table 3). Further, stratified analysis was conducted to explore the association between CYP4B1 polymorphisms and BC risk in different subgroups based on age, smoking and drinking habits, BMI, and tumor location.

Table 3 Analysis of the association between susceptibility of BC and CYP4B1 SNPs (Overall analysis)

BMI (Table 4): CYP4B1 rs2297813 might be a risk factor for BC in participants with BMI ≤ 24 kg/m2 under the allele (OR = 1.53, 95% CI 1.01–2.32, p = 0.040), dominant (OR = 1.69, 95% CI 1.06–2.69, p = 0.027), over-dominant (OR = 1.72, 95% CI 1.07–2.77, p = 0.026), and log-additive (OR = 1.56, 95% CI 1.02–2.39, p = 0.041) models.

Table 4 The SNPs of CYP4B1 associated with susceptibility of BC in participants (BMI)

Smoking and drinking (Table 5): When stratified by smoking, rs12142787 was associated with an increased incidence of BC in smokers under the allele model (OR = 1.32, 95% CI 1.01–1.74, p = 0.045). Among non-drinkers, rs2297813 (allele: OR = 1.48, 95% CI 1.04–2.10, p = 0.029; heterozygote: OR = 1.75, 95% CI 1.16–2.64, p = 0.026; dominant: OR = 1.69, 95% CI 1.14–2.52, p = 0.009; over-dominant: OR = 1.75, 95% CI 1.16–2.63, p = 0.007; log-additive: OR = 1.53, 95% CI 1.06–2.19, p = 0.021) and rs12142787 (allele: OR = 1.36, 95% CI 1.03–1.79, p = 0.029; dominant: OR = 1.51, 95% CI 1.07–2.13, p = 0.020; over-dominant: OR = 1.47, 95% CI 1.03–2.09, p = 0.034; log-additive: OR = 1.34, 95% CI 1.02–1.78, p = 0.037) were found to be related to an increased incidence of BC.

Table 5 The SNPs of CYP4B1 associated with susceptibility of breast cancer in participants (Smoking and drinking)

Tumor stage (Table 6): CYP4B1 rs3766197 was associated with a higher risk of advanced-stage (stage III/IV) BC under the heterozygote (OR = 1.57, 95% CI 1.04–2.38, p = 0.035) and over-dominant (OR = 1.61, 95% CI 1.05–2.46, p = 0.031) models.

Table 6 The SNPs of CYP4B1 associated with susceptibility of breast cancer in patients (Tumor staging)

Reproductive number (Table 7): CYP4B1 rs2297813 (allele: OR = 1.55, 95% CI 1.07–2.26, p = 0.021; dominant: OR = 1.61, 95% CI 1.05–2.47, p = 0.029; over-dominant: OR = 1.55, 95% CI 1.00–2.39, p = 0.049; log-additive: OR = 1.58, 95% CI 1.06–2.35, p = 0.024) and rs12142787 (allele: OR = 1.38, 95% CI 1.02–1.87, p = 0.037; dominant: OR = 1.53, 95% CI 1.03–2.25, p = 0.033; log-additive: OR = 1.38, 95% CI 1.01–1.89, p = 0.044) might be associated with BC risk in patients with more than one birth.

Table 7 The SNPs of CYP4B1 associated with susceptibility of breast cancer in patients (Reproductive number)

Other factors

After stratification by age and tumor location, no significant association was found (Table S1, p > 0.05). Patients were also grouped according to menstrual status, tumor biomarkers, and LNM to analyze the relationship between these SNPs and the risk of BC; however, no significant correlation was observed (Table S2, p > 0.05).

CYP4B1 SNP-SNP interaction in BC risk

MDR was used to analyze and evaluate SNP-SNP interaction (Fig. 3). The blue line indicates that rs12142787 and rs3766197 have a redundant effect on BC risk. As shown in Table 8, the three-locus model (a combination of rs2297813, rs12142787, and rs3766197) had the highest cross-validation consistency (CVC, 10/10) and testing balanced accuracy (Testing Bal. Acc., 0.507), making it the best predictive model for BC risk.
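The core idea behind MDR and the balanced-accuracy metric can be sketched as follows. This is a simplified illustration, not the MDR 3.0.2 software: it omits cross-validation and model search, and the function names are hypothetical. Each multilocus genotype combination is labeled high-risk when its case:control ratio exceeds the overall ratio, and balanced accuracy is the mean of sensitivity and specificity.

```python
from collections import Counter


def mdr_labels(genotypes, is_case):
    """Label each multilocus genotype cell as high-risk (True) or low-risk (False)."""
    cases = Counter(g for g, c in zip(genotypes, is_case) if c)
    controls = Counter(g for g, c in zip(genotypes, is_case) if not c)
    # Threshold: overall case:control ratio (requires at least one control).
    threshold = sum(cases.values()) / sum(controls.values())
    # cases > threshold * controls avoids dividing by an empty cell.
    return {
        cell: cases[cell] > threshold * controls[cell]
        for cell in set(cases) | set(controls)
    }


def balanced_accuracy(genotypes, is_case, labels):
    """Mean of sensitivity and specificity for the high/low-risk labels."""
    tp = fn = tn = fp = 0
    for g, c in zip(genotypes, is_case):
        predicted_case = labels.get(g, False)  # unseen cells count as low-risk
        if c and predicted_case:
            tp += 1
        elif c:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

In a full MDR analysis, the labels are derived on training folds and the balanced accuracy is computed on held-out folds, which gives the testing balanced accuracy reported in Table 8. For reference, a classifier that guesses at random scores 0.5 on this metric.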

Fig. 3

Analysis of SNP-SNP interactions (A, dendrogram; B, Fruchterman-Reingold). Green and blue represent redundancy or correlation. Values in nodes represent the information gains of individual attributes (main effects). Values between nodes are information gains of each pair of attributes (interaction effects)

Table 8 SNP-SNP interaction models analyzed by the MDR method

Discussion

BC is a heterogeneous disease, and BC patients with different demographic characteristics and tumor status may exhibit different clinical manifestations [7]. To facilitate diagnosis, prognosis, and treatment, patients are classified according to their clinical data, tumor morphology, genomics, proteomics, and other characteristics [8]. The World Health Organization has identified 21 histological types of BC [26]. Although BC mortality has declined in recent years, it remains high among female malignancies [4, 27]. Identifying reliable susceptibility genes associated with BC and investigating the correlation of genetic variants with BC risk is crucial for the effective prevention and treatment of BC [28].

Cytochrome P450 4B1 (CYP4B1) is involved in the biotransformation of exogenous substances, and its major extrahepatic expression is associated with some tissue-specific toxicity [16]. CYP4B1 can hydroxylate common endobiotic substrates such as fatty acids, as well as activate xenobiotics such as 4-ipomeanol (4-IPO), perilla ketone (PK), or valproic acid (VPA) [29]. The contribution of CYP4B1 to cancer is of particular interest, as CYP4B1 gene expression has been found to be altered in, and presumably associated with, certain cancers [30]. In the bioinformatics analysis, CYP4B1 expression in BC tissues was lower than that in normal tissues, and higher CYP4B1 expression was correlated with worse survival. Of note, the relationship between CYP4B1 variants and BC risk has not been reported previously. Our results showed that CYP4B1 rs2297813 was linked to an increased risk of BC in women with BMI ≤ 24 kg/m2 and in non-drinkers. CYP4B1 rs12142787 had the same effect on BC risk in smokers and in non-drinkers. Moreover, CYP4B1 rs3766197 was related to tumor stage, and the association of rs2297813 and rs12142787 with BC risk was related to reproductive number. Bioinformatics analysis showed that these SNPs might be associated with the regulation of promoter/enhancer histone marks, motif changes, selected eQTL hits, GRASP QTL hits, transcription factor (TF) binding, and chromatin accessibility peaks. Based on the GTEx Portal database, the genotypes of rs2297813, rs12142787, and rs3766197 were associated with the mRNA expression of CYP4B1 in breast tissues. These results suggest that these SNPs may be involved in BC etiology by regulating the expression or function of CYP4B1, which provides a theoretical basis for subsequent mechanistic studies.

Increased BMI is clearly correlated with an increased incidence of BC, and obese women have a higher risk of dying from BC than non-obese women [31, 32]. The risk of developing BC is therefore strongly affected by BMI. Wang et al. have reported that SNPs of NBS1 (rs1805812, rs2735385, and rs6999227), BRIP1 (rs7220719), and PTEN (rs2299941) are associated with BC risk in subjects with BMI < 25 kg/m2 [27]. Additionally, TP53 rs12951053 and BRIP1 rs16945628 are related to BC risk in people with BMI ≥ 25 kg/m2 [27]. Our data revealed an association between CYP4B1 SNPs and BC risk in participants with BMI ≤ 24 kg/m2.

It has been reported that compounds such as polycyclic hydrocarbons contained in tobacco smoke can induce BC [33]. Jones et al. have shown that smoking significantly increases the risk of BC in women [34]. In a study by Terry et al., the relationship between smoking and BC risk was affected by certain health indices of smokers and by tumor type, such as the BMI of smokers and the ER+/- or PR+/- status of BC patients [33]. Studies over the past few decades have shown positive, negative, or no association between smoking and BC risk [33,34,35]. Hsieh et al. have reported that rs73229797 can increase the activity of the CHRNA9 gene promoter and affect the risk of BC under smoking exposure (both active and passive smoking) [36]. Furthermore, Naif et al. have reported that the C and G alleles of CYP1A SNPs are associated with the risk of BC in Iraqi smokers [37]. Our research is the first to indicate that CYP4B1 rs12142787 is a risk factor for BC in Chinese female smokers under the allele model. Studies have revealed that smoking status is associated with BC risk [38, 39]. However, the association between rs12142787 and BC risk needs to be further confirmed in subgroup studies based on participants’ smoking status (active or passive).

Reproductive number and tumor stage are closely associated with the progression and metastasis of tumor cells [40, 41]. A previous study has shown that rs2735385 and rs6999227 in NBS1 are significantly associated with BC in stage III [27]. LincRNA H19 rs2071095 is associated with BC in stage I [42]. Our results showed that CYP4B1 rs3766197 was also associated with BC risk according to stratified analysis by tumor stage. CYP4B1 rs2297813 and rs12142787 were risk factors in the stratified analysis by reproductive number. Yan et al. have reported a remarkable interaction between lncRNA HOTAIR and reproductive factors influenced by various factors, such as ER status, menopause, and family history [40, 43]. Our data revealed that CYP4B1 SNPs were correlated with BC risk after analysis stratified by reproductive numbers.

However, this study has several limitations. Firstly, the molecular mechanisms of CYP4B1 in the occurrence of BC have not been analyzed. In subsequent studies, it is necessary to understand the exact molecular pathway of CYP4B1 in cells and to investigate the effect of SNPs on CYP4B1 gene expression. Secondly, the role of these polymorphisms in treatment response and relapse has not been explored; we will continue to collect samples and refine the experimental design to address this question. Thirdly, missing data in this study may have affected some of the results. Our findings need to be further confirmed in future studies with high-quality and large samples. Nevertheless, to our knowledge, our research remains the first to demonstrate that CYP4B1 polymorphisms are related to BC risk in different subgroups of Chinese women.

Conclusion

In summary, CYP4B1 SNPs could be risk factors for BC in Chinese women, especially in patients with BMI ≤ 24 kg/m2, smokers, non-drinkers, patients in advanced stages (III/IV stage), and patients who reproduced once. These findings shed light on the relationship between CYP4B1 SNPs and BC risk in Chinese women and provide a theoretical and molecular foundation for the prevention of BC in the future.