Introduction

Breast cancer (BC) is the most common cancer among women and the leading cause of cancer death in women worldwide [1]. BC is the most prevalent cancer globally, with around 8 million women alive in 2020 who had been diagnosed in the preceding five years [2, 3]. In Egypt, over 22,000 new cases are identified each year, representing 33% of all female cancer cases, and this proportion is predicted to climb dramatically in the coming years due to the expanding population [4, 5].

BC is a complex disease involving both environmental and genetic factors. Single nucleotide polymorphisms (SNPs) are often used to predict disease risk, clinical outcome, and prognosis [6].

Long non-coding RNAs (lncRNAs) have attracted growing attention in recent years, including SNPs within them that may alter the risk of cancer and other human diseases. LncRNAs are transcripts longer than 200 nucleotides with no protein-coding potential [7, 8]. SNPs, copy-number changes, and mutations in the non-coding genome can greatly affect lncRNA expression [9, 10].

HOTAIR is a lncRNA transcribed from the HOXC locus, and its role in the invasion and development of several tumor types is well established [11, 12]. Many researchers have investigated the relationship between cancer prognosis and HOTAIR expression and have suggested that HOTAIR acts as an oncogene. Its genetic variants can increase intronic enhancer activity and raise HOTAIR expression in specific cancer cells [13,14,15]. Two HOTAIR SNPs, rs920778 C > T and rs4759314 A > G, were selected to test their association with breast cancer susceptibility because they have previously been linked to elevated cancer risk.

The HOTAIR rs920778 polymorphism lies in intron 2 of the HOTAIR gene and results from the replacement of cytosine with thymine (C→T). It resides within a novel intronic enhancer, and T allele carriers show higher HOTAIR expression [16]. The rs4759314 polymorphism (A > G) results from the replacement of adenine with guanine (A→G). Furthermore, the GG genotype was found to enhance HOTAIR expression by boosting HOXC promoter activity [11].

In carcinomas, the human epidermal growth factor receptor 2 oncogene (HER2) encodes a protein that activates cell signaling networks that influence various malignant cells. Through a complementary target site in its final exon, HOTAIR acts as a competitive endogenous RNA that negatively regulates miR-331-3p, thereby preventing miR-331-3p-mediated suppression of the oncogene HER2 [17]. Accordingly, the main purpose of our case-control study is to evaluate the association of the HOTAIR polymorphisms rs4759314 and rs920778 with BC susceptibility, clinical and laboratory parameters, and hormone receptor status in a sample of Egyptian women.

Patients and methods

Study participants

Our research was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Egyptian Medical Research Ethics Committee, Faculty of Medicine, Mansoura University, Egypt (IRP code R.22.06.1746). Before enrolling in this study, all female participants provided a completed informed consent form. The study procedures were conducted in accordance with the approved protocols.

This case-control study initially considered 250 potentially eligible patients diagnosed with BC, recruited between September 2022 and January 2023 from the outpatient clinics of the Oncology Center, Mansoura University Hospital, Egypt. Of these 250 cases, 100 newly diagnosed cases were included in the study. The remaining 150 cases were excluded because of a history of cancer, metastasis to other sites, radiation exposure, autoimmune disease, immunological syndromes, or the use of any medication, including hormonal therapy, chemotherapy, or radiotherapy. The control group comprised 100 age-matched, apparently healthy females with no history of health problems, normal routine checkups, and comparable socioeconomic characteristics. BC diagnosis was confirmed by histopathological examination of tumor biopsies, which pathologists also used for tumor staging [18] and grading [19]. The BC prognostic biomarkers (HER2 and estrogen and progesterone receptors (ER/PR)) were examined by immunohistochemistry [20].

Sample collection

Blood samples (5 ml) from each participant were divided into two portions. One portion was placed in vacutainer tubes without additives for tumor marker and biochemical evaluation. The remainder was collected in vacutainer tubes containing the anticoagulant EDTA for hematological and genetic examination.

Assessment of tumor markers and biochemical and hematological parameters

Cancer antigen 15–3 (CA 15–3) was measured using enzyme-linked immunosorbent assay (ELISA) kits. A hematological cell analyzer (CELL-DYN 3700 SL, Abbott Diagnostics, USA) was used to measure hematological parameters, including leukocyte count, lymphocyte count, erythrocyte count, hemoglobin level, and platelet count. Biochemical estimations of serum transaminases (aspartate transaminase (AST) and alanine transaminase (ALT)), alkaline phosphatase (ALP), total bilirubin, albumin, uric acid, and creatinine were conducted using a fully automated biochemical analyzer (Cobas c501, Roche Diagnostics, Mannheim, Germany).

Detection of gene polymorphisms (genotyping)

DNA extraction

Using the Qiagen DNA purification (Valencia, CA) kit, genomic DNA was obtained from peripheral blood according to the manufacturer’s recommendations.

PCR amplification and genetic typing assay

The rs920778 polymorphism was genotyped using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) technique (Xavier-Magalhães et al. 2017). PCR was run on a thermal cycler (Applied Biosystems, Foster City, CA). Briefly, rs920778 was amplified in a 22 μl reaction volume including DNA template (4 μl), forward and reverse primers (4 μl), and PCR master mix (10 μl, Thermo Scientific). The cycling conditions began with an initial denaturation at 95 °C for 5 min, followed by 35 cycles of 95 °C for 60 s, 58 °C for 60 s, and 72 °C for 60 s, and a final extension at 72 °C for 10 min to complete all PCR fragments. A PCR product of 234 bp was amplified using the forward primer 5′-TTA CAG CTT AAA TGT CTG AAT GTT CC and the reverse primer 5′-TAT GCG CTT TGC TTC CAG.

For rs920778, the 234 bp PCR products were digested with the MspI restriction enzyme (Thermo Fisher Scientific). The resulting digestion fragments were separated by electrophoresis on a 2% agarose gel and stained with ethidium bromide for visualization under UV light. Genotypes were identified as follows: the homozygous wild type (CC) generated two fragments of 218 bp and 26 bp; the heterozygous (CT) genotype generated three fragments of 234 bp, 218 bp, and 26 bp; and the homozygous (TT) genotype produced a single fragment of 234 bp.
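The band-to-genotype mapping described above can be expressed as a simple lookup. The following Python sketch is only an illustration: the function name `call_rs920778` is hypothetical, the fragment sizes are those quoted in the text, and exact band matching is an assumption (real gel reading involves size tolerance and visual judgment).

```python
# Expected MspI digestion patterns for rs920778, using the fragment
# sizes (in bp) quoted in the text. Exact matching is an assumption.
PATTERNS = {
    frozenset({218, 26}): "CC",        # both copies cut
    frozenset({234, 218, 26}): "CT",   # one copy cut, one uncut
    frozenset({234}): "TT",            # neither copy cut
}

def call_rs920778(bands_bp):
    """Return the genotype for the observed band sizes, or None if unrecognized."""
    return PATTERNS.get(frozenset(bands_bp))

print(call_rs920778([218, 26]))       # CC
print(call_rs920778([234, 218, 26]))  # CT
print(call_rs920778([234]))           # TT
print(call_rs920778([234, 26]))       # None (pattern not expected)
```

Returning `None` for an unexpected pattern, rather than guessing, mirrors the usual practice of repeating a digestion when the bands are ambiguous.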

The rs4759314 polymorphism was genotyped using the tetra-primer amplification refractory mutation system PCR (T-ARMS-PCR). The thermal cycling program consisted of an initial denaturation at 94 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 45 s, annealing at 54.5 °C for 45 s, and extension at 72 °C for 55 s, with a final extension at 72 °C for 10 min. The primer sequences were as follows:

reverse outer primer (5’- 3’) CCAAGGTAGGGAAGTCTCTATTTCTCTG;

forward outer primer (5’- 3’) AAACCATATCCTGACAGAAGCCAAATAC;

reverse inner primer (G allele) TTATCACGTTTTATTAACTTGCATCCTCC;

forward inner primer (A allele) GCATGGAAGAGATATAAACAGGCGAA.

The resulting amplification products were separated by electrophoresis on a 2% agarose gel and stained with ethidium bromide for visualization under UV light. The fragment sizes were 24 bp for the outer primers, 121 bp for the G allele, and 181 bp for the A allele.

Sample size and statistical analysis

The sample size was calculated using the GAS Power Calculator (2017). The calculation was based on a previous study by Lv et al. [21], who showed an elevated frequency of the G allele for rs920778 in patients with breast cancer compared to the control group. Assuming an expected odds ratio of 1.7, a breast cancer prevalence of 13%, and a disease allele frequency of 23%, a minimum sample size of 100 cases and 100 controls is required to achieve a power of 80% at a significance level of 5%.

The data were revised, coded, tabulated, and entered into IBM SPSS Statistics for Windows, version 25.0 (IBM Corporation, Armonk, New York; released 2017). The t-test and Mann-Whitney test were used to compare two groups, while one-way analysis of variance (ANOVA) and the Kruskal-Wallis test were used to compare more than two groups. Deviation from Hardy–Weinberg equilibrium among controls was assessed using the chi-squared test. Odds ratios (OR) and 95% confidence intervals (CI) were obtained using logistic regression analysis. All reported p-values were two-tailed, and a p-value < 0.05 was considered statistically significant.
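The Hardy–Weinberg check can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure used in the study: the function name `hwe_chi_square` is hypothetical, and the control counts for rs4759314 (AA = 45, AG = 50, GG = 5) are inferred from the percentages reported in the Results (AG 50%, GG 5%, 100 controls), with AA obtained by subtraction.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Control genotype counts for rs4759314 (inferred from reported percentages)
chi2, p_value = hwe_chi_square(45, 50, 5)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

For these inferred counts the statistic is about 3.63, which is below the 3.84 critical value for one degree of freedom at the 5% level, so the controls would not be flagged as deviating from equilibrium.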

Results

The baseline characteristics, biochemical assessment, and clinicopathological variables of the study population

This study was performed on 100 female BC patients with a mean age of 48 ± 10.6 years. BC was significantly associated with a positive family history. Tumor marker assessment identified significantly higher serum CA 15–3 levels in BC patients (24.4 U/ml) than in controls (21.185 U/ml) (p = 0.001), while no significant differences in hematological and biochemical markers were identified between patients and controls (Table 1). By stage, 87/100 (87%) cases were localized (non-metastatic; stages 1 and 2), while 13/100 (13%) were metastatic (stages 3 and 4). By grade, 52/100 (52%) cases were grade 2. Regarding hormonal features, 66% of cases were ER/PR positive, 77% were HER2 negative, and 46% were ER/PR negative-HER2 positive (Table 2).

Table 1 Demographic and laboratory data of the studied subjects
Table 2 Clinico-pathological tumor features among cases

Genotype and allelic distribution in studied groups and risk for BC

The present study identified two alleles of rs4759314: allele A (70% in controls, 64% in cases; 181 bp) and allele G (30% in controls, 36% in cases; 121 bp). Three genotypes were observed (AA, AG, and GG), with a low frequency of the GG genotype among patients (0%) and controls (5%) (Table 3; Fig. 1a). Two alleles of rs920778 were also identified: allele C (42% in controls, 47.5% in cases; 218 bp and 26 bp) and allele T (58% in controls, 52.5% in cases; 234 bp), giving three genotypes: CC, CT, and TT (Table 3; Fig. 1a).
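The reported allele percentages can be cross-checked by allele counting from genotype counts. The sketch below is illustrative: the function name `allele_frequencies` is hypothetical, and the rs4759314 genotype counts are inferred from the reported percentages (AG 72% in cases and 50% in controls, GG 0% and 5%, 100 subjects per group), with AA obtained by subtraction.

```python
def allele_frequencies(n_ref_hom, n_het, n_alt_hom):
    """Return (reference, alternate) allele frequencies from genotype counts."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    ref = (2 * n_ref_hom + n_het) / total_alleles
    alt = (2 * n_alt_hom + n_het) / total_alleles
    return ref, alt

# rs4759314 genotype counts (AA, AG, GG); AA is inferred by subtraction
cases = allele_frequencies(28, 72, 0)
controls = allele_frequencies(45, 50, 5)
print(f"cases: A = {cases[0]:.2f}, G = {cases[1]:.2f}")           # A = 0.64, G = 0.36
print(f"controls: A = {controls[0]:.2f}, G = {controls[1]:.2f}")  # A = 0.70, G = 0.30
```

The computed values agree with the allele frequencies stated in the text, which supports the inferred genotype counts.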

Fig. 1 Genotype and haplotype distribution. (A) Frequency of rs4759314 and rs920778 genotypes among the studied groups. (B) Frequency of rs4759314-rs920778 haplotypes among the studied groups

The association between the rs4759314 and rs920778 SNPs and the risk of developing BC was investigated using regression analysis. For rs4759314, patients had a considerably greater prevalence of the AG genotype than controls (72% versus 50%) (p = 0.005, OR = 1.689, 95% CI = 1.168–2.441). Significant associations were also observed under the dominant model (AA versus AG + GG; p = 0.013, OR = 1.592, 95% CI = 1.105–2.293), the co-dominant model (AG versus AA; p = 0.006, OR = 2.314, 95% CI = 1.278–4.191), and the overdominant model (AG versus AA + GG; p = 0.002, OR = 2.571, 95% CI = 1.430–4.624), suggesting increased susceptibility to BC (Table 3).
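For a single binary contrast, the odds ratio from logistic regression equals the cross-product ratio of the 2×2 table, with a Wald confidence interval on the log scale. The Python sketch below illustrates this under stated assumptions: the function name `odds_ratio_ci` is hypothetical, and the rs4759314 genotype counts (cases: AA 28, AG 72, GG 0; controls: AA 45, AG 50, GG 5) are inferred from the reported percentages rather than taken from Table 3.

```python
import math

def odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Unadjusted odds ratio with a Wald 95% confidence interval."""
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) from the four cell counts
    se = math.sqrt(1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Co-dominant contrast: AG versus AA (counts inferred from percentages)
print("AG vs AA:      OR = %.3f, 95%% CI = %.3f-%.3f" % odds_ratio_ci(72, 28, 50, 45))
# Overdominant contrast: AG versus AA + GG
print("AG vs AA + GG: OR = %.3f, 95%% CI = %.3f-%.3f" % odds_ratio_ci(72, 28, 50, 50))
```

With these inferred counts the two contrasts give odds ratios of about 2.314 and 2.571, matching the co-dominant and overdominant estimates reported above. The function requires all four cells to be non-zero, so it cannot be applied to contrasts involving the GG genotype in cases.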

Table 3 Association of rs4759314 A > G genotypes and alleles with BC

In contrast, the rs920778 polymorphism genotypes did not differ significantly between cases and controls (p > 0.05) under any genetic model, including the dominant and recessive models (Table 4).

Table 4 Association of rs920778 C > T genotypes and alleles with BC

Correlation of gene polymorphism variants with laboratory, clinical, and hormonal features in a patient group

The rs4759314 and rs920778 genotypes showed no significant associations with the studied demographic data or laboratory measurements (Table 5). Regarding clinical and hormonal features, ER/PR positivity with HER2 negativity was significantly associated with the AA genotype compared to the AG genotype of rs4759314. Otherwise, no significant associations were found between the two SNPs and the patients’ clinical stage, histological grade, ER/PR status, or HER2 protein expression (Table 6).

Table 5 Association of rs4759314 A > G and rs920778 C > T genotypes with demographic and laboratory parameters among BC cases
Table 6 Association of rs4759314 A > G and rs920778 C > T genotypes with tumor features

The rs4759314-rs920778 haplotypes’ association and risk for BC in the studied groups

A haplotype is a group of alleles inherited together from a single parent. Statistical analysis of the rs4759314-rs920778 haplotypes showed that the AC haplotype had the highest frequency among cases (34.8%), while the AT haplotype had the highest frequency among controls (37.5%). The GC haplotype showed the lowest frequency in both groups. No association between haplotypes and the risk of BC was found (Table 7; Fig. 1b). The non-random association of alleles at two or more loci in a population is referred to as “linkage disequilibrium” (LD). D′ can vary from 0 (no disequilibrium) to 1 (maximum disequilibrium).

Table 7 Association of rs4759314 - rs920778 haplotypes with BC
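The normalized disequilibrium coefficient D′ follows directly from one haplotype frequency and the two allele frequencies. The Python sketch below is illustrative only: the function name `d_prime` is hypothetical, and the example inputs (AC haplotype 34.8%, allele A 64%, allele C 47.5% in cases) are taken from the text under the assumption that the estimated haplotype and allele frequencies are mutually consistent. It is not a reproduction of the LD estimate obtained in the study.

```python
def d_prime(p_ab, p_a, p_b):
    """Normalized linkage disequilibrium D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele a and allele b
    p_a, p_b: frequencies of allele a (locus 1) and allele b (locus 2)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # A monomorphic locus gives d_max = 0; report no disequilibrium
    return 0.0 if d_max == 0 else abs(d) / d_max

# Illustrative inputs from the case group: AC haplotype, allele A, allele C
print(f"D' = {d_prime(0.348, 0.64, 0.475):.3f}")
```

Dividing by the largest value D can take for the given allele frequencies is what confines D′ to the range from 0 to 1.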

The genomic features of the HOTAIR gene are shown in Fig. 2. HOTAIR (ENSG00000228630) is located on the long arm of chromosome 12 (12q13.13), spans about 12,649 bases (chr12:53,962,308-53,974,956), and is oriented on the reverse strand. According to its genomic structure, the HOTAIR gene has six splice variants (HOTAIR-201 to HOTAIR-206) (data source: Ensembl database). HOTAIR is a lncRNA with no protein-coding potential and is highly expressed in multiple tumors.

Fig. 2 Genomic structure of the human HOTAIR gene. (A) Location of the HOTAIR gene on chromosome 12q13.13, where it spans 12,649 nt (chr12:53,962,308-53,974,956) along the reverse strand. (B) Genomic structure of the HOTAIR transcripts. The HOTAIR gene has six splice variants (HOTAIR-201, HOTAIR-202, HOTAIR-203, HOTAIR-204, HOTAIR-205, and HOTAIR-206), all lncRNAs with no protein-coding potential [Data source: NCBI database, Ensembl.org]

Discussion

According to genome-wide association studies, more than 80% of cancer-related SNPs lie in non-coding regions of the genome. Most known lncRNAs are related to various forms of cancer, although their expression patterns are frequently specific to cell type and cancer type. One of these lncRNAs, HOTAIR, has been identified as a BC risk factor and a biomarker for various malignancies [22]. Earlier research has shown that HOTAIR expression is considerably upregulated in both the plasma and tissues of BC patients. Detection of HOTAIR expression in plasma can be used instead of tissue biopsy as a biomarker for BC because it is noninvasive and has high sensitivity and specificity [23,24,25]. The HOTAIR SNPs rs920778 (C > T) and rs4759314 (A > G) have both been reported to be related to higher HOTAIR expression.

This study found that rs4759314 (A > G) was associated with an elevated BC risk for the heterozygous AG genotype and under the dominant, co-dominant, and overdominant models. However, there was no significant difference in rs920778 (C > T) genotype or allele frequencies, and no association between the HOTAIR variants (rs4759314, rs920778) and disease stage or histological grade.

Similarly, Minn et al. concluded that the HOTAIR SNP rs920778 did not affect BC susceptibility in a Japanese population. On the other hand, Lv et al. [26] found a strong relationship between rs920778 and rs4759314 and an elevated incidence of BC in a Northeastern Chinese population. A significant association between the rs920778 polymorphism and an increased risk of BC has been reported among Southeast Iranian women [27], the Turkish population [28], the Indian population [29], and Chinese cases [30]. Furthermore, Yan et al. [30] and Hassanzarei et al. [27] investigated the link between rs4759314 and breast cancer susceptibility, but their findings contradict the results of the current study. Also contrary to our findings, Khorshidi et al. [31] investigated three HOTAIR SNPs (rs12826786, rs1899663, and rs4759314) in Iranian patients and reported that these polymorphisms do not appear to be associated with breast cancer risk. These discrepancies may be due to ethnic genetic diversity with different gene-gene and gene-environment interactions, or to limiting factors related to sampling and sample size. Abd El-Fattah et al. [32] investigated the serum expression levels of HOTAIR, MALAT1, and NEAT1 in Egyptian patients using quantitative real-time PCR (qRT-PCR) and observed that serum HOTAIR expression was significantly higher in breast cancer patients than in fibroadenoma patients and control subjects. No other studies have linked these two SNPs to cancer among Egyptians.

In prior research on other cancers, the allelic frequencies of the HOTAIR SNPs rs12826786 and rs920778 did not differ significantly between cancer-free controls and glioma patients [33]. Oliveira et al. [34] showed that rs12826786 and rs920778 are not significantly correlated with prostate cancer susceptibility in a Portuguese population. Kim et al. [35] tested the correlation between colorectal cancer susceptibility and HOTAIR variants and found no association of rs920778 or rs4759314 with colorectal cancer in the Korean population. This may reflect the fact that a population’s susceptibility to a disease can vary depending on the cancer type and the individual’s sex [36].

In a meta-analysis of the association between HOTAIR polymorphisms and the risks of BC, cervical cancer, and ovarian cancer, only rs4759314 was substantially correlated with a lower risk of these cancers, whereas rs920778 and rs1899663 were linked to breast, cervical, and ovarian cancer [37]. By meta-analysis, Liu et al. [15] found a link between overall cancer risk and the rs920778 and rs4759314 polymorphisms. Other meta-analyses showed that the HOTAIR rs920778 variant contributes to elevated cancer risk, whereas rs4759314 had no significant association [38, 39]. Further meta-analyses found no association of HOTAIR rs920778 or rs4759314 with breast cancer susceptibility [40, 41]. A meta-analysis by Wang et al. [6] showed a strong link between HOTAIR rs920778 and BC risk but no strong link between the rs4759314 polymorphism and BC risk.

The uncommon GG genotype of rs4759314 had a low frequency among patients (0%) and controls (5%); therefore, evaluating its relationship with BC requires a larger sample size. To our knowledge, this is the first study to examine the association between these two polymorphisms and BC among Egyptians.

Conclusion

Susceptibility and disease progression differ from one population to another because of gene-gene and gene-environment interactions; thus, gene expression could be population-specific. The results of this study suggest that rs4759314 (A > G) could be a BC risk factor among Egyptian women, and that ER/PR positivity with HER2 negativity was significantly associated with the AA genotype compared to the AG genotype. However, larger case-control studies are recommended to evaluate the impact of HOTAIR SNPs on BC and to measure HOTAIR levels in plasma.