Discovery and function exploration of microRNA-155 as a molecular biomarker for early detection of breast cancer

Background MicroRNA-155 (miR-155) may function as a diagnostic biomarker of breast cancer (BC). Nevertheless, the available evidence is controversial. Therefore, we performed this study to summarize the global predicting role of miR-155 for early detection of BC and preliminarily explore the functional roles of miR-155 in BC. Methods We first collected published studies and applied the bivariate meta-analysis model to generate the pooled diagnostic parameters of miR-155 in diagnosing BC such as sensitivity, specificity and area under curve (AUC). Then, we applied function enrichment and protein–protein interactions (PPI) analyses to explore the potential mechanisms of miR-155. Results A total of 21 studies were finally included. The results indicated that miR-155 allowed for the discrimination between BC patients and healthy controls with a sensitivity of 0.87 (95% CI 0.78–0.93), specificity of 0.82 (0.72–0.89), and AUC of 0.91 (0.88–0.93). In addition, the overall sensitivity, specificity and AUC for circulating miR-155 were 0.88 (0.76–0.95), 0.83 (0.72–0.90), and 0.92 (0.89–0.94), respectively. Function enrichment analysis revealed several vital ontologies terms and pathways associated with BC occurrence and development. Furthermore, in the PPI network, ten hub genes and two significant modules were identified to be involved in some important pathways associated with the pathogenesis of BC. Conclusions We demonstrated that miR-155 has great potential to facilitate accurate BC detection and may serve as a promising diagnostic biomarker for BC. However, well-designed cohort studies and biological experiments should be implemented to confirm the diagnostic value of miR-155 before it can be applied to routine clinical procedures.


Background
Breast cancer (BC) remains the most prevalent cause of cancer mortality in females worldwide [1]. Like all other malignant tumors, the prognosis of BC is highly associated with the extent how early this disease is diagnosed [2]. Currently, several conventional methods, including imaging examination and biopsy, and some tumor markers such as CEA and CA15-3 have been confirmed to be beneficial for detection of BC [3]. Nevertheless, there are some deficiencies as they are invasive and harmful procedure or not sensitive enough 1 3 for accurate BC diagnosis. Therefore, the development of efficient diagnostic methods or tumor biomarkers for BC is urgently needed.
MicroRNAs are a class of 18-25 nucleotide, non-coding RNAs that regulate gene expression through targeting messenger RNAs and triggering either translational repression or RNA degradation [4]. Increasing studies have demonstrated that microRNAs were highly involved in various physiological and pathological processes such as cell proliferation, differentiation, and apoptosis [5]. Moreover, a large number of articles have proved that the abnormal expressions of micro-RNAs have a direct association with the development and progression of various cancers including BC. Subsequent evidence also indicated that circulating microRNAs were highly stabile and could be easily extracted and measured, revealing that circulating microRNAs might serve as promising biomarkers for early detection of BC [6].
Among a wide spectrum of microRNAs, microRNA-155 (miR-155) has gained great attention and numerous studies have revealed the critical role of miR-155 in the development of some cancers including colorectal cancer, oesophageal squamous cell carcinoma, lung cancer and BC [7]. Recently, several studies have revealed that aberrant expression of miR-155 may become a potential diagnostic marker of several types of cancers including BC [8]. Nevertheless, the diagnostic accuracy of miR-155 in BC was inconsistent or even contradictory among different studies. In addition, the biomarker roles of miR-155 during the initiation and progression of BC are still poorly illustrated by previous studies.
Thus, we first performed a meta-analysis, to synthesize available evidence on miR-155 as a diagnostic biomarker in patients with BC. Furthermore, we designed a bioinformatics study to uncover the potential biomarker roles of miR-155.

Literature search strategy
To identify the relevant studies, we searched through the online PubMed, EMBASE and Web of Science databases until May 1, 2020. The key words used for literature retrieval were as follows: ("microRNA-155" OR "miR-155" OR "miRNA-155") and ("breast cancer" OR "breast tumor" OR "breast carcinoma" OR "breast neoplasm"). The references of all the relevant publications were manually searched to obtain additional eligible studies.

Inclusion and exclusion criteria
Studies were considered eligible if they met the following criteria: (1) all the patients with BC must have been confirmed by pathological examination; (2) the levels of miR-155 were measured; (3) they presented sufficient data to allow the establishment of two-by-two tables, including true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).
Studies were excluded if they were: (1) reviews, metaanalysis, letters, commentaries, or abstracts presented in conferences; (2) lacking sufficient data for calculation of specificity and sensitivity with 95% confidence intervals (CIs); (3) duplication of previous publications.

Data extraction and quality assessment
Two investigators (Xuemin Liu and Qingyu Chang) independently extracted all the data from eligible studies. A third investigator (Haiqiang Wang) was responsible for reconciling disagreements when the results were controversial. The extracted data elements of this study for diagnosis included: the first author's name, country of research, year of publication, age of patients, ethnicity, sample size, sample source, detection method, sensitivity, specificity, TP, FP, FN, and TN. We estimated the quality of each study using the revised Quality Assessment of Diagnostic Accuracy Studies (QUA-DAS-2), which has been confirmed to be an effective tool for assessing the quality of diagnostic tests [9].

Statistical analysis
The data analysis to assess the diagnostic role of miR-155 in BC was performed by STATA 14.0. The bivariate metaanalysis model was applied to obtain the pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) [10]. Furthermore, the summary receiver operator characteristic (SROC) curve was generated based on pooled sensitivity and specificity of included studies and the corresponding area under the SROC curve (AUC) was calculated to estimate the accuracy of miR-155 in BC detection [11]. Statistical heterogeneity was evaluated using Cochran-Q test and I 2 test. P value < 0.05 for Cochran's Q test, or I 2 > 50%, suggested an existence of significant heterogeneity [12]. The Spearman correlation coefficient of logarithm sensitivity and 1-specificity was calculated for measuring the threshold effect. In addition, subgroup, meta-regression and sensitivity analyses were employed to investigate the potential sources of heterogeneity among included studies. Finally, publication bias was checked with the Deeks' funnel plot [13]. A P value less than 0.05 for two-tailed was considered statistically significant.

Target genes prediction of miR-155
We then performed a bioinformatics analysis to explore the potential function of miR-155. First, the microRNA prediction data were downloaded from the microRNA target gene prediction website miRTarBase, which has been one of the most comprehensive databases with experimentally validated microRNA-target interactions [14]. We only selected the human microRNA-target information for further analysis.

GO enrichment and pathway analysis
Gene ontology (GO) analysis was carried out for annotating the parental genes according to the biological processes (BP), cellular components (CC) and molecular functions (MF) [15]. Besides, we provided pathway enrichment analyses for miR-155 targets using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [16]. To enable better recognition of the biological functions of miR-155, we used the online tool Database for Annotation, Visualization and Integrated Discovery (DAVID) to perform GO enrichment and KEGG analyses [17]. P value < 0.05 was considered statistically significant.

Establishing the PPI network, selection of hub genes and module analysis
To identify the hub regulatory genes and to investigate the interactions among the target genes of miR-155, a protein-protein interaction (PPI) network analysis was conducted. The Search Tool for the Retrieval of Interacting Genes (STRING) is a biological database constructed for collecting PPI information, and then analyzing the functional interactions between proteins [18]. All the targets of miR-155 were uploaded onto STRING to retrieve their PPI data and only the interactions with a combined score > 0.4 validated by experiments or collected by text mining were set to be the cutoff criterion. Afterwards, the PPI network was visualized with Cytoscape software [19]. The CytoNCA plug-in of Cytoscape was used to identify the hub genes from PPI network, and the hub genes were selected according to three different centrality measures, containing betweenness centrality, closeness centrality and degree centrality [20]. Furthermore, the plug-in of Molecular Complex Detection (MCODE) in Cytoscape software was employed to search the significant modules in PPI network [21]. Finally, KEGG pathway enrichment analyses were performed with the hub genes and genes involved in the selected modules.
The threshold P value < 0.05 was considered statistically significant.
The main characteristics of each enrolled publication were summarized in Table 1. The publication years of the included studies were from 2012 to 2020. All the studies performed reverse transcription quantitative PCR (RT-qPCR) for miR-155 detection, and the specimen type included serum, plasma, blood, urine, and tissue. The quality of the included articles was evaluated by querying the QUADAS-2 scores. As a result, the overall methodological qualities of the study designs of selected studies were generally good.

Test of heterogeneity and subgroup analysis
The Spearman correlation coefficient was 0.35 with a P value of 0.12, indicating no obvious heterogeneity from threshold effect in the present data pooling.
To investigate the between-study heterogeneity, subgroup analyses based on ethnicity (Asian or non-Asian), sample  Subsequently, we attempted to explain the heterogeneity by exploring some study characteristics including sample size, sample source, and ethnicity, through metaregression analyses. However, no significant results were identified.

Sensitivity analysis and publication bias
Afterwards, influence analysis was employed to evaluate the influence of single studies on overall estimates (Fig. 4). The random-effect bivariate model was demonstrated to be robust for the calculation of the pooled parameters by conducting the goodness of fit and bivariate normality analyses. From the illustrated results, a study from Zaleski et al. [31] was identified as an outlier. After excluding the outlier, the overall pooled sensitivity for miR-155 increased from 0.87 to 0.89, specificity increased from 0.82 to 0.83, DOR increased from 30 to 39, PLR increased from 4.8 to Fig. 2 Forest plots of sensitivity and specificity for miR-155 test in breast cancer 5.2, NLR decreased from 0.16 to 0.13, and AUC increased from 0.91 to 0.92. Moreover, the I 2 values for sensitivity and specificity were not significantly influenced by the outlier study. In general, the overall results did not show significant changes, indicating that the overall pooled estimates could not be affected by a single study.
We then used the Deeks' funnel plot to check for publication bias. The slope coefficient presented a P value of 0.71 (Fig. 5), revealing a low likelihood of publication bias.

Functional and pathway enrichment analyses
GO functional annotation and KEGG signaling pathway enrichment analysis were performed with the genes regulated by miR-155 to obtain an in-depth and comprehensive understanding of the biological activities of miR-155. The target genes of miR-155 were retrieved from miRTarBase and then conducted with functional GO enrichment analysis. GO analysis results indicated that BP terms of the PPI network were significantly enriched in cell cycle arrest, regulation of pri-miRNA transcription, peptidyl-serine phosphorylation, small GTPase-mediated signal transduction and negative regulation of protein ubiquitination, CC terms in nucleoplasm, cytosol, nuclear chromatin, focal adhesion and extracellular exosome, and MF terms in RNA polymerase II core promoter proximal region sequence-specific DNA binding, chromatin binding, ATP binding and GTP binding (Fig. 6).   KEGG pathway enrichment analysis indicated that the genes regulated by miR-155 were significantly enriched in pathways in cancer, microRNAs in cancer, TNF signaling pathway, FoxO signaling pathway, prolactin signaling pathway, T cell receptor signaling pathway, signaling pathways regulating pluripotency of stem cells, HIF-1 signaling pathway, and PI3K-Akt signaling pathway (Fig. 7). According to previous studies and KEGG results, it could be speculated that the prolactin signaling pathway may be one of the most important pathways. The chart arising from the KEGG pathway analysis is presented at Fig. 8.

PPI network integration and selection of hub genes
We applied the STRING online database to analyze the 965 identified target genes of miR-155 and to set up a PPI network, which consisted of 907 nodes interacting with the 11.742 average numbers of neighbors. The results were downloaded and visualized by Cytoscape software for further analysis. The hub genes were selected by CytoNCA plug-in, and the top significant hub genes were obtained by betweenness centrality, closeness centrality and degree centrality methods, respectively (Fig. 9). At last, 10 hub genes were identified, including TP53, AKT1, EGFR, MYC, CTNNB1, IL6, JUN, STAT3, CASP3, and CCND1. To clarify the signal pathways associated with the hub genes during the initiation and progression of BC, we performed KEGG pathway enrichment analysis using DAVID software. The results in Fig. 7 disclosed the most vital KEGG pathways of the hub genes regulated by miR-155 associated with BC including pathways in cancer, proteoglycans in cancer, Jak-STAT signaling pathway, PI3K-Akt signaling pathway, microRNAs in cancer, MAPK signaling pathway, and FoxO signaling pathway.

Module analysis
Based on the MCODE plug-in, the top two significant modules of the above PPI network were identified and reconstructed with Cytoscape (Fig. 10). Then, KEGG pathway analysis was conducted to predicate the potential module function and cellular pathways. A list of pathways that were notably enriched by the module nodes were pathways in cancer, FoxO signaling pathway, proteoglycans in cancer, PI3K-Akt signaling pathway, prolactin signaling pathway, cell cycle, microRNAs in cancer, HIF-1 signaling pathway, focal adhesion, T cell receptor signaling pathway, TNF signaling pathway, and sphingolipid signaling pathway (Fig. 10).

Discussion
Sensitive and specific tumor biomarkers are vital for early cancer detection and diagnosis and for adopting novel therapeutic trials and prevention strategies in clinic. Increasing evidence has demonstrated that microRNAs can take part in various critical processes containing tumor cell mutation, proliferation, invasion, progression and metastasis. Abnormal microRNA expression has been identified in extensive researches, and microRNAs could function as tumor suppressor or promoter in a variety of cancers. Among all studied microRNAs, miR-155 is no doubt one of the most attractive members, which has been explored to be a potential valuable biomarker for cancer detection and survival prediction. Nevertheless, the sample sizes in most studies are relatively small. Moreover, it is still inconclusive regarding the diagnostic roles of miR-155 expression in BC. In the present study, we systematically reviewed clinical studies on the application of miR-155 for diagnosis of BC in recent years and performed this study to clarify the clinical significance of miR-155 expression in BC.
In the comprehensive analysis of diagnostic accuracy of miR-155 and BC, the pooled sensitivity and specificity of miR-155 in the detection of BC was 0.87 and 0.82, respectively. The AUC, which is recommended for evaluating the accuracy of a diagnostic test, was calculated to be 0.91, suggesting that miR-155 has relatively high clinical significance for diagnosis of BC. The value of a DOR is also an indicator of test performance; it ranges from 0 to infinity, with higher values suggesting better test identification. The pooled DOR was 30 in our study, revealing that miR-155 assays had excellent test performance in the diagnosis of BC. The overall PLR value was 4.8, revealing that the probability of having BC in a person with a positive test result was approximately fivefold higher compared to non-cancer patients. The pooled NLR was 0.16, meaning that the probability of a patient having BC is 16% if the miR-155 assay exhibits a negative result. Taken together, these results suggest a higher level of accuracy and better diagnostic characteristics for miR-155 in the diagnosis of BC than traditional detection methods.
To explore the potential sources of the heterogeneity, we carried out subgroup analysis and meta-regression analysis. Our subgroup analysis based on the sample size indicated that miR-155 with large sample size had a higher accuracy than miR-155 with small sample size, suggesting a large samples of clinical study is needed. Meanwhile, the subgroup analyses based on specimen types indicated that serum had relatively higher diagnostic value compared to plasma, indicating serum may be applied as more suitable sources of clinical specimens than others for BC detection. However, this should be interpreted with caution as levels could differ significantly if the source was serum versus plasma, because of coagulation processes. More studies should be conducted to further explore the most suitable sources of specimens. Due to the genetic heterogeneity, a difference in diagnostic accuracy has been displayed among different ethnicities. The meta-regression result revealed that these factors may not be the potential sources contributing to heterogeneity in this study. In addition, the sensitivity analysis indicated that the overall pooled estimates could not be affected by a single study.
Currently, some efforts have been devoted to exploring the functional roles of miR-155 in the pathogenic mechanism and malignant progression of cancer. However, the underlying molecular mechanisms involved in cancer progression are still largely unclear. Thus, we used some bioinformatics analysis methods to explore the functional activities of miR-155 targets involved in the occurrence and development of BC. GO analysis indicated that miR-155 was associated with some vital biological processes, basic cell components and the binding function of some key molecules. Moreover, KEGG functional enrichment analysis identified a series of important pathways, which are closely associated with the initiation and progression of BC. For example, pathways in cancer and microRNAs in cancer directly reflected the associations between miR-155 and some vital pathways involved in BC. TNF signaling has been one of the most significant cellular pathways with important functions in homeostasis and disease pathogenesis [40]. Studies have convinced the roles of FoxO signaling pathway in regulating genes essential for cell proliferation, cell death, senescence, angiogenesis, cell migration and metastasis [41]. The prolactin pathway, known as a hormone involved in normal breast development and lactation, has been long demonstrated to be involved in the progression of human breast cancer [42]. It has been confirmed that T cell receptor signaling pathway has important roles in the host adaptive immune system and T cell-based adoptive immunotherapy has been shown to be a promising treatment for various types of cancers. The abnormal expression of T cell receptor may lead to carcinogenesis [43]. HIF-1 has been recognized as an essential component in changing the transcriptional response of tumors under hypoxia and plays important roles in crucial aspects of cancer biology, including angiogenesis, cell survival, glucose metabolism and invasion [44]. PI3K-Akt signaling pathway was demonstrated to be hyperactivated in most of the breast carcinomas and could regulate various biological processes such as cell growth, differentiation, migration, and survival, as well as angiogenesis and metabolism [45]. Moreover, previous evidence has reported that microRNAs have critical roles in cellular activities implicated in BC cell growth, migration and metastasis by targeting the PI3K/ AKT oncogenic signaling pathway [46]. These functional enrichment results may help us understand the mechanisms of miR-155 during the initiation and progression of BC.
PPI analysis has been recognized as an entry point for better interpretation of associations between different proteins on a genome-wide scale, and might be helpful to provide novel insights into protein function explanation. We constructed the PPI network with the target genes of miR-155 and identified the top ten hub genes. These hub genes were predominantly associated with some important pathways, most of which have been confirmed to be related to BC. In addition, there is growing evidence that proteoglycans can adjust extensive normal and pathological activities, such as morphogenesis, tissue repair, inflammation, vascularization and cancer metastasis [47]. Rapidly emerging groundbreaking discoveries have revealed that Jak-STAT signaling pathway regulates almost all immune regulatory processes, containing those that are involved in tumor cell recognition and tumor-driven immune escape [48]. Alterations in this pathway may contribute to the development of BC and metastatic spread and targeting of JAK-STAT signaling in BC may provide potential therapeutic strategies to overcome drug resistance [49]. The MAPK signaling pathway plays an important part in organizing great constitution network that mediates a variety of physiological processes, such as cell growth, differentiation, as well as apoptotic cell death and dysregulation of the MAPK signaling cascades has been confirmed to be associated with the pathogenesis of various human cancer types including BC [50,51]. The hub genes may provide potential predictor and therapeutic targets for BC patients.
The module analysis of the constructed PPI network was then carried out and the top two significant modules were selected for conducting KEGG pathway enrichment analysis of the genes included. The enriched results indicated that the nodes involved in the modules were highly related to some vital pathways. In addition to the most pathways which were already discussed above, cell cycle has been identified as a hallmark of cancer involved in cellular proliferation and cancer development. The dysregulation of cell cycle control may lead to cancer progression [52]. Moreover, focal adhesion kinase (FAK) is a cytoplasmic non-receptor tyrosine kinase involved in almost every aspect of cancer, from invasion to metastasis to epithelial-mesenchymal transition (EMT) and maintenance of cancer stem cells [53]. Sphingolipids have been the largest family of bioactive lipids that participate in a wide variety of biological mechanisms, including cell death and proliferation and are linked with numerous aspects of tumorigenesis [54]. The PPI analysis further uncovered the potential mechanisms of miR-155 involved in BC occurrence and development. However, further studies are still urgently demanded to validate the hub genes and pathways, and deeper mechanisms would be uncovered.
We should point out that there are some limitations in our study. First, the cutoff values of miR-155 in the enrolled diagnostic tests were not uniform, which might contribute to the observed heterogeneity. More studies should be conducted to reach an appropriate cutoff value to be accurate enough. Second, the sample sizes for plasma, whole blood, tissue and urine are relatively small. It is difficult to make a definitive conclusion about the accuracy of miR-155 in diagnosing BC from these sample sources. Thus, further evaluations of the utility of miR-155 from different sample sources as biomarkers for BC detection should be performed. Third, the lack of studies for different ethnicities evaluating the miR-155 diagnostic performance complicated the exploration of its clinical application. More studies should be conducted to evaluate the diagnostic accuracy of miR-155 for different ethnicities. In addition, due to the fact that all studies that are used for the data pooling have patients of various TNM stage (from stage I to stage IV), there is considerable heterogeneity in detection of BC. Without patient-level data, it would be very difficult to tell whether miR-155 is actually useful as a marker in early detection of BC. Besides, BRCA1-deficient tumors have been shown to have increased levels of miR-155, which could be relevant for identifying possible respondents to PARP-1 inhibitors [55,56]. Most of the mechanistic studies on miR-155 and BC have been done on these patients, which was ignored in this study. Last, due to lack of experiment verification, the function and mechanisms of miR-155 need to be further studied in vivo and in vitro.
Despite all this, we revealed that miR-155 had great potential in early diagnosis for BC and further studies to explore its clinical application are warranted and worthwhile. Some interesting findings will stimulate further research with rigid criteria and large study populations to resolve any remaining controversy of the diagnostic value of miR-155 in BC patients. The bioinformatics analysis results definitely point out the molecular basis to better understand the pathogenesis of miR-155 involved in BC and provide valuable novel markers or targets for the diagnosis and treatment of BC.

Conclusion
In conclusion, the present study according to published articles demonstrated that miR-155 has strong potential to serve as a novel noninvasive biomarker for detection of BC. In addition, integrated bioinformatics analysis identified some key hub genes and pathways, which may help understand why miR-155 could possess excellent biomarker characteristics and provide some new clues to explore the exact etiology and mechanism underlying the initiation and progression of BC. To strengthen our findings, the diagnostic value of miR-155 in BC should be further confirmed by large-scale and standard investigations. Moreover, further studies are urgently demanded to validate the hub genes and pathways, and further mechanisms would be uncovered.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.