Introduction

Endometriosis is a common disease where tissue similar to normal endometrium (ectopic endometrium) grows outside of the uterus. Most commonly affected sites are the pelvic peritoneum, ovaries, uterosacral ligaments, pouch of Douglas, and the rectovaginal septum [1]. Endometriosis affects approximately 10% of women in their reproductive age. The condition is more common in women suffering with chronic pelvic pain and infertility. Although associated with several shortcomings, the most commonly used classification system is the revised American Society for Reproductive Medicine classification [2]. According to this, endometriosis is classified into sub-phenotypic stages from I to IV (minimal–severe), based on lesion number and size, presence of adhesions, and ovarian vs peritoneal involvement [3]. The diagnosis is made by direct surgical visualisation with histological confirmation of the endometrial tissue in biopsied lesions [4]. The nature of disease is often progressive, with gradually worsening pain which can lead to absence from social and work obligations [5]. This can result in significant burden to healthcare systems [6]. Non-invasive diagnostic testing could provide earlier diagnosis and improve disease management for these women [7]. Measurable biologic markers (biomarkers) from eutopic endometrium (an innermost lining layer of the uterus) and body fluids are currently the most promising non-invasive diagnostic approaches [8].

The most widely accepted aetiology of endometriosis is Sampson's theory [9], whereby retrograde menstrual bleeding through the Fallopian tubes into the peritoneal cavity leads to migration, attachment, and growth of endometrial tissues at ectopic sites. However, retrograde menstrual bleeding by itself does not always lead to development of endometriosis, and multiple dysregulated mechanisms in the eutopic endometrium potentially govern the pathophysiology of endometriosis. These include altered processes of apoptosis, immunosurveillance, adhesion, steroid responsiveness, tissue remodelling, neovascularisation, and an enhanced inflammatory response [10], thus providing a source for biomarker identification [11].

The endometrium periodically undergoes morphological changes caused by fluctuations in the ovarian steroid hormones. Each menstrual cycle starts with the menstrual phase (M-phase; days 1–4 in a normal 28-day cycle), characterised by shedding of the endometrial tissue. This is followed by the proliferative phase (P-phase; days 4–14 in a normal 28-day cycle), during which higher levels of oestrogen stimulate the proliferation of the stroma and glands, resulting in thickening of the endometrium. After ovulation, the secretory phase (S-phase) begins: progesterone levels start to rise in the early secretory phase (ES-phase; days 15–20 in a normal 28-day cycle), stimulating the glands to secrete glycogen and mucus. In the mid-secretory phase (MS-phase; days 21–24 in a normal 28-day cycle), the window of implantation (WOI) opens, representing the optimal time for blastocyst implantation. This coincides with the differentiation of the endometrial stromal cells into decidual cells, known as decidualisation. In the absence of pregnancy, the endometrial tissue starts to degrade in the late secretory phase (LS-phase; days 25–28 in a normal 28-day cycle) [12,13,14]. Changes in the endometrium through the menstrual cycle phases are therefore reflected in changes in the transcriptome [15,16,17] and in regulation patterns [18]. Consequently, differentially expressed transcripts or proteins may shed light on the important biological processes occurring at the time endometrial biopsy samples are collected. Integrating multi-omics data can provide insights into the underlying physiological and pathophysiological mechanisms [19].

Due to endometrial periodical transitions through the cycle, endometrial dating of tissue biopsies from women with and without endometriosis may play an important role in identification of the molecular patterns specific for endometriosis [20]. Although candidate endometrial biomarkers from genome-wide studies associated with endometriosis were reviewed [21], dysregulated molecular patterns of eutopic endometrium throughout the menstrual cycle among studies have not yet been integrated. Data synthesis of differentially expressed transcripts and proteins may provide an additional insight into endometrial molecular signature in women with endometriosis.

The aim of this study was therefore to (1) screen for published genome-wide studies reporting genetic causes distinguishing eutopic endometrium between women with and without endometriosis, (2) develop a catalogue of genetic causes (mRNAs, ncRNAs, and proteins) associated with altered molecular patterns in eutopic endometrium in women with endometriosis, (3) modify the gene nomenclature of extracted loci according to the HGNC system, (4) sort genes according to the phases of the menstrual cycle, and (5) perform gene set enrichment analysis (GSEA) associated with each phase of the menstrual cycle.

Material and methods

Workflow of the study is presented in Fig. 1.

Fig. 1 Workflow of the study

Literature screening and data extraction

Original publications with genome-wide approach were retrieved from PubMed database [22] using keyword “endometriosis” with combination of terms including “eutopic endometrium”, “molecular dysregulation”, “gene expression”, “transcriptomics”, “omics”, “proteomics”, “epigenomics”, “genomics”, and “biomarkers”. The time span of the literature screening was set from 1999 to December 2018. Some studies were additionally identified using the reference list of review or retrieved articles. We performed literature screening for studies that compared eutopic endometrium of women with and without endometriosis in the same phase of the menstrual cycle. We screened for studies performed at the genomic (DNA), transcriptomic (mRNAs, miRNAs, lncRNAs), proteomic, and epigenomic levels. Proteomic studies including the measuring of different spectral intensities of the peptide peaks and the peaks which could not be associated with candidate proteins were excluded. High-throughput data obtained from in vitro experiments using treated primary human endometrial cells were also excluded. Only studies published in English language were included.

The following data was extracted from publications: indications for endometrial tissue sample collection for case and control study groups, stage and type of endometriosis and other uterine/pelvis pathologies of the case group, gynaecological condition of women in the control group, clinical symptoms of pelvic pain and/or infertility for both groups, age of participants, endometrial dating (phase of the menstrual cycle) and number of endometrial tissue samples used for genome-wide analysis, procedures for processing of endometrial tissue samples (collection, storage, and nucleic acid/protein isolation), platform used for genome-wide analysis, cut-off values for expression/methylation change identification, genetic causes associated with endometriosis, determined expression/methylation fold change of transcripts/proteins, and corresponding significance values.

Adoption of HGNC gene nomenclature for the development of the catalogue

The gene catalogue was developed from the extracted data of published studies associated with altered expression at the RNA and protein levels in eutopic endometrium of endometriosis. Up to 15 differentially expressed (either up- or down-regulated) transcripts (mRNAs, lncRNAs, and sncRNAs) and proteins associated with a specific phase of the menstrual cycle were extracted from publications, together with the fold change of expression and the statistical significance value, if available.

The nomenclature of genes coding for transcripts and proteins reported to be dysregulated in endometriosis was adopted according to the HGNC nomenclature system (version updated February 25th, 2019). The HGNC database (https://www.genenames.org/) is the resource for approved human gene nomenclature [23]. In addition, corresponding Gene ID numbers were obtained from the National Center for Biotechnology Information (NCBI) database, release 230. NCBI is the resource that provides biological information and data [22]. The workflow of nomenclature editing for each type of extracted locus is described below. The identification name or synonym for each extracted mRNA transcript was entered into the UniProt Knowledgebase (UniProtKB), release 2019_2, and/or the HGNC database. UniProtKB (https://www.uniprot.org/) is a source of sequences and annotations for over 120 million proteins [24], which also provides web links to the HGNC database. Gene symbols for transcripts marked with expressed sequence tag (EST) clusters with the prefix “Hs.” (Homo sapiens) were retrieved from NCBI’s UniGene database. In July 2019, the UniGene web pages were retired, but UniGene cluster numbers are matched with gene records. For example, the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene) provided BPIFB1 for “Hs.65551”, which was extracted from the study performed by Burney et al. [25]. To obtain gene symbols for lncRNAs, the SeqName IDs of lncRNA transcripts were uploaded into the Ensembl/BioMart tool (version 96) (https://www.ensembl.org/biomart/martview/). BioMart is a web-based tool that provides access to the gene annotation of Ensembl data [26]. For example, Ensembl/BioMart returned the MAP4K3-DT gene for the transcript “ENST00000451547”, obtained from the study performed by Wang et al. [27]. To obtain the official gene symbols for miRNAs, the miRBase MIMAT accession number or the miRNA name with its strand-specifying -3p or -5p suffix was entered into the miRBase database, release 22. 
The miRBase database (http://www.mirbase.org/) provides microRNA sequences and annotation [28]. In cases where a mature miRNA without an available ID number was obtained from the reference source, gene names for both stem-loop sequences were included in the gene catalogue, as they give rise to mature miRNAs with the same sequence. For example, for the extracted transcript “hsa-miR-138-5p” from the study performed by Zhou et al. [29], miRBase provided the MIR138-1 and MIR138-2 genes. Synonyms, names, or UniProt accession IDs of proteins were entered into the UniProtKB database to retrieve the gene symbol. When a GenInfo Identifier (“GI” number) was available, the NCBI Protein database was used to obtain the gene symbol or Gene ID number. For example, the extracted “gi|825671” from the study performed by Rai et al. [30] was entered into the NCBI Protein database, which provided a web link to the HGNC database, from which the NPM1 gene was obtained.
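The routing of extracted identifiers to the appropriate resource can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which was performed manually: the function names are hypothetical, the prefix rules are simplifying assumptions, and the lookup table contains only the four worked examples given in the text rather than a query against HGNC, NCBI, Ensembl/BioMart, miRBase, or UniProtKB.

```python
import re

# Hypothetical lookup table holding only the worked examples from the text.
# A real workflow would query the databases named above instead.
EXAMPLE_LOOKUP = {
    "Hs.65551": ["BPIFB1"],
    "ENST00000451547": ["MAP4K3-DT"],
    "hsa-miR-138-5p": ["MIR138-1", "MIR138-2"],
    "gi|825671": ["NPM1"],
}


def clean_identifier(raw):
    """Trim whitespace and drop thousands separators from an identifier."""
    return raw.strip().replace(",", "")


def suggest_resource(raw):
    """Suggest which resource to consult, based on assumed prefix rules."""
    identifier = clean_identifier(raw)
    if identifier.startswith("Hs."):
        return "NCBI Gene (former UniGene cluster)"
    if identifier.startswith("ENST"):
        return "Ensembl/BioMart"
    if identifier.startswith("MIMAT") or re.match(r"hsa-(miR|let)-", identifier):
        return "miRBase"
    if identifier.startswith("gi|"):
        return "NCBI Protein"
    # Names, synonyms, and UniProt accession IDs fall through to here.
    return "UniProtKB/HGNC"


def to_gene_symbols(raw):
    """Return the gene symbols for an identifier, or an empty list if unknown."""
    return EXAMPLE_LOOKUP.get(clean_identifier(raw), [])


for example in EXAMPLE_LOOKUP:
    print(example, suggest_resource(example), to_gene_symbols(example))
```

Returning a list rather than a single symbol keeps the case of a mature miRNA that maps to two stem-loop genes explicit, and an empty list marks transcripts for which no official symbol is available.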

Gene set enrichment analysis

All genes from developed catalogue regardless of clinical and experimental characteristics from extracted studies were further sorted according to the phases of the menstrual cycle (M, P, S, ES, MS, and LS) and N/S, when information regarding endometrial dating was missing. Extracted transcripts and proteins with no confirmed differences in the expression levels after validation of the source references were excluded from gene sorting. Gene set enrichment analysis (also named functional enrichment analysis) was performed for each list of genes associated with specific phases of the menstrual cycle using the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resource (release 6.8). DAVID is a bioinformatics tool that accepts gene list and performs functional analysis using background algorithms and knowledge of annotation databases, including Kyoto Encyclopedia of Genes and Genomes (KEGG), BioCarta, Enzyme nomenclature database and Reactome, to understand biological meaning behind the genes of interest [31]. In the present GSEA, Gene Ontology (GO) terms such as biological process (BP), cellular component (CC), molecular function (MF), and KEGG pathways specific to Homo sapiens with a p ≤ 0.05 were considered as statistically significant. GO [32] and KEGG [33] are databases that collect knowledge regarding the function of gene products and their roles in the biological system and phenotypes.
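The statistical idea behind this kind of enrichment testing can be sketched in a few lines of Python. The sketch is an approximation under stated assumptions: DAVID reports a modified Fisher exact p-value (the EASE score), whereas the code below computes a plain one-sided hypergeometric tail probability, and the function names and toy numbers are hypothetical rather than taken from the study.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background
    K: background genes annotated to the term
    n: genes in the submitted list
    k: submitted genes annotated to the term
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy numbers (hypothetical): 2 of 4 listed genes carry a term that
# annotates 3 of the 10 background genes; 5 terms were tested.
p = hypergeom_upper_tail(k=2, n=4, K=3, N=10)
print(p, bonferroni(p, n_tests=5))
```

A term counts as nominally significant when the unadjusted p-value is at most 0.05, the threshold used here, and survives Bonferroni correction only when the adjusted value also stays at or below that threshold.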

Results

The main findings of the present study are overviewed in Fig. 2.

Fig. 2 Study overview and main results. Different omics levels of studies that analysed eutopic endometrium in women with and without endometriosis. Identified enriched pathways associated with the phases of the menstrual cycle characterising eutopic endometrium of endometriosis

Classification of retrieved studies

The database mining (Fig. 3) provided 28 genome-wide studies that analysed the eutopic endometrium in women with and without endometriosis. The analysis of the obtained data revealed that datasets are heterogeneous and included dysregulation of expression patterns at the transcriptomics and proteomics levels. Additionally, some of the transcriptomics studies reported expression of both mRNAs and ncRNAs (sncRNAs or lncRNAs); therefore, these datasets were considered as transcriptomics and ncRNomics studies. Some retrieved studies reported dysregulation of sncRNAs only, and therefore these studies fell into both the transcriptomics and ncRNomics levels. Some studies also overlapped with epigenomics, because they reported aberrant DNA methylome and associated gene expression levels.

Fig. 3 Workflow of the database mining

Out of 28, 21 studies that reported an association of altered molecular patterns related to eutopic endometrium with endometriosis were further used for the development of the catalogue. Eight out of 21 studies were performed at the transcriptomics level: five [25, 34,35,36,37] profiled only mRNAs, two [27, 38] profiled mRNAs and lncRNAs, and one [29] reported dysregulation of mRNAs and sncRNAs (miRNAs and snoRNAs). Five additional studies analysed dysregulation of sncRNAs only [39,40,41,42,43]. Six [30, 44,45,46,47,48] out of 21 studies were performed at the proteomics level. Additionally, we also included differentially expressed protein-coding genes from two epigenomics studies [49, 50], which reported altered mRNA expression levels of differentially methylated genes. Studies included in the present analysis are listed in Table 1. Figure 4 presents data extraction (differentially expressed transcripts and proteins) for multi-omics integration.

Table 1 Clinical and experimental characteristics of included studies

Fig. 4 Omics levels of extracted data used for the development of the gene catalogue. CpG islands present sites of methylation. mRNA, messenger RNA; miRNA, micro RNA; snoRNA, small nucleolar RNA; lncRNA, long non-coding RNA

Out of 28, seven studies were not included in the catalogue development because they did not meet the inclusion criteria of the present study (no expression analysis or data not available, expression analysis performed using separated cell types, platforms with limited number of target genes, and negative association reported) [51,52,53,54,55,56,57]. Although not included in the developed catalogue, these studies contribute to the understanding of endometrial biology in endometriosis and are therefore summarised in Supplementary file S2.

Description of the gene catalogue

From 21 publications, 760 dysregulated transcripts (mRNAs, lncRNAs, and sncRNAs) and proteins in endometriosis were extracted and were used for the gene catalogue development. Among 760 genetic causes, 37, 425, 169, and 129 were associated with endometriosis at the epigenomics, transcriptomics, ncRNomics, and proteomics level, respectively. The gene catalogue is provided in Supplementary file S1. After nomenclature editing according to HGNC database, 647 official gene symbols were retrieved. The official gene coding for 113 dysregulated transcripts/proteins is currently not available. Among 647, 24 genes were excluded since additional validation analysis using qPCR, immunohistochemistry, or western blot did not confirm genome-wide screening findings. Therefore, the final set of genes included 623 genes: 33, 368, 103, and 119 genes associated with endometriosis at the epigenomics, transcriptomics, ncRNomics, and proteomics level, respectively. Sorting of these 623 genes according to phases of the menstrual cycle resulted in 3, 200, 87, 87, 181, 37, and 28 genes associated with M-, P-, S-, ES-, MS-, LS-phase, and N/S, respectively. Analysis of the catalogue revealed that same genes were associated with different menstrual phases or were associated with the same phase in different studies. Among 623 genes, 72 genes were repeated 2 times (55 genes) or at least 3 times (17 genes) in the same or in different phases of the menstrual cycle. For example, FOS was associated with the MS-phase by two different studies [29, 37] as well as with S- [34], P-, and ES-phases [37]. Eleven genes were repeated within the same menstrual phase in the same study. For example, two transcripts of SNORD3A gene were reported by Cui et al. [38] and two isoforms of the VIM protein were shown to be up- and down-regulated [30]. Among 72 repeated genes, there are 61 genes that were repeated in different phases of the menstrual cycle. 
For example, down-regulated MIR374B was associated with the P-phase [43] and with the MS-phase [29]. A list of the 72 repeated genes with the associated phase of the menstrual cycle, omics level, and source reference is presented in Supplementary file S5. Table 2 includes 15 genes associated with the same phase of the menstrual cycle by more than one study, for example up-regulated CCN1 in the MS-phase [29, 37]. Removing duplicated genes from the same phase of the menstrual cycle resulted in 591 unique phase-specific genes, i.e. M = 3, P = 188, S = 81, ES = 82, MS = 173, LS = 36, and N/S = 28 (Supplementary file S3.1).
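The phase-specific sorting and removal of within-phase duplicates can be sketched as follows. The records are toy data built only from the FOS, MIR374B, and CCN1 examples quoted above, the function names are hypothetical, and the final line merely checks that the reported per-phase counts sum to 591.

```python
from collections import defaultdict

# Toy records (gene, phase, reference number) from the examples in the text.
records = [
    ("FOS", "MS", 29), ("FOS", "MS", 37), ("FOS", "S", 34),
    ("FOS", "P", 37), ("FOS", "ES", 37),
    ("MIR374B", "P", 43), ("MIR374B", "MS", 29),
    ("CCN1", "MS", 29), ("CCN1", "MS", 37),
]


def unique_genes_by_phase(rows):
    """Group genes by phase, keeping each gene once per phase."""
    phases = defaultdict(set)
    for gene, phase, _reference in rows:
        phases[phase].add(gene)
    return {phase: sorted(genes) for phase, genes in phases.items()}


def replicated_within_phase(rows):
    """Return (gene, phase) pairs reported by more than one study."""
    references = defaultdict(set)
    for gene, phase, reference in rows:
        references[(gene, phase)].add(reference)
    return sorted(pair for pair, refs in references.items() if len(refs) > 1)


# Per-phase counts of unique genes as reported in the text.
REPORTED_UNIQUE = {"M": 3, "P": 188, "S": 81, "ES": 82,
                   "MS": 173, "LS": 36, "N/S": 28}

print(unique_genes_by_phase(records))
print(replicated_within_phase(records))
print(sum(REPORTED_UNIQUE.values()))
```

Using a set per phase removes duplicates only within a phase, so a gene such as FOS still appears in every phase with which it was associated.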

Table 2 Genes associated with the same phase of the menstrual cycle in at least two studies

Gene set enrichment analysis

A total of 591 unique genes sorted according to the phases of the menstrual cycle were used for GSEA. A set of 574 genes was identified by the DAVID bioinformatics tool. The number of genes included in the GSEA and the results of functional analysis are summarised in Table 3 (GO-BP, -CC, -MF terms, and KEGG pathways). The highest number of statistically significant enriched terms was obtained by GO-BP analysis (207), followed by GO-CC (93), GO-MF (73), and KEGG (36); however, the majority of identified enriched pathways did not remain significant after Bonferroni correction. The top 10 enriched GO-BP, -CC, -MF, and KEGG pathway terms with corresponding annotated genes, ranked by lowest p values and Bonferroni correction for each phase of the menstrual cycle, are listed in Supplementary file S3.2, while a complete list of GSEA results is provided in Supplementary file S4. The pathway most significantly associated with the P- (p = 5.25E-07) and S- (p = 8.11E-08) phases was movement of cell or subcellular component. In addition, the oestrogen signalling pathway (p = 0.03) and the TNF signalling pathway (p = 0.04) were associated with the P-phase. The PI3K-Akt signalling pathway (p = 1.47E-05) was enriched by genes associated with the S-phase. Negative regulation of growth (p = 6.41E-05) was associated with the ES-phase. Endothelial cell chemotaxis (p = 3.85E-05), inflammatory response (p = 4.50E-05), and chemokine-mediated signalling pathway (p = 2.39E-04) were associated with the MS-phase. Extracellular matrix organization (p = 1.88E-04) and pathways in cancer (p = 0.03) were enriched by the LS-phase gene list. The study also revealed that the same pathways were associated with different phases of the menstrual cycle (Table 4). Among 243 statistically significant enriched GO-BP and KEGG pathways, 30 GO-BP and 3 KEGG pathways were associated with at least two phases. For example, negative regulation of apoptotic process was associated with four phases, i.e. P (p = 0.0004), S (p = 0.0001), ES (p = 0.0411), and MS (p = 0.0003).
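Identifying terms shared by at least two phases amounts to a simple cross-tabulation of the phase-specific result lists, sketched below in Python. The p values are those quoted in this section, but each phase list is deliberately partial, and the data structure and function name are hypothetical.

```python
from collections import defaultdict

# p values quoted in this section; each phase list is deliberately partial.
significant_terms = {
    "P": {"negative regulation of apoptotic process": 0.0004,
          "oestrogen signalling pathway": 0.03,
          "TNF signalling pathway": 0.04},
    "S": {"negative regulation of apoptotic process": 0.0001,
          "PI3K-Akt signalling pathway": 1.47e-05},
    "ES": {"negative regulation of apoptotic process": 0.0411,
           "negative regulation of growth": 6.41e-05},
    "MS": {"negative regulation of apoptotic process": 0.0003,
           "inflammatory response": 4.50e-05},
    "LS": {"pathways in cancer": 0.03},
}


def shared_terms(results, alpha=0.05, min_phases=2):
    """Map each term significant in at least min_phases phases to those phases."""
    phases_by_term = defaultdict(list)
    for phase, terms in results.items():
        for term, p_value in terms.items():
            if p_value <= alpha:
                phases_by_term[term].append(phase)
    return {term: phases for term, phases in phases_by_term.items()
            if len(phases) >= min_phases}


print(shared_terms(significant_terms))
```

With this partial input, only negative regulation of apoptotic process is returned, listed for the P-, S-, ES-, and MS-phases.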

Table 3 A summary of the gene catalogue and GSEA results for each phase of the menstrual cycle
Table 4 Biological pathways associated with at least two phases of the menstrual cycle. Only pathways with p ≤ 0.05 are shown

Discussion

In the present study, we developed a catalogue of genes reported to have altered molecular patterns in eutopic endometrium in endometriosis. The analysis of 21 studies including 39 women with endometriosis and 236 women without endometriosis revealed that the reported datasets are heterogeneous. Data obtained from publications included dysregulated molecular patterns at diverse omics levels, i.e. transcriptomics (mRNAs, sncRNAs, and lncRNAs), proteomics, and epigenomics (differentially expressed genes associated with altered methylation level). Because the extracted data extended beyond a single omics level, the official HGNC nomenclature system for human genes was adopted, which enabled downstream GSEA. Sorting of the obtained 591 unique genes resulted in 7 groups according to the phases of the menstrual cycle, and functional enrichment analysis was then performed.

The nomenclature editing and analysis of the gene catalogue

From the literature survey, we gathered mRNA and ncRNA transcripts and proteins, and developed the catalogue by manually editing their gene nomenclature. Menstrual phase–specific sorting of 623 genes revealed that 15 genes were repeated within the same phase of the menstrual cycle, since dysregulated expression was reported by at least two studies (Table 2); these genes could therefore represent stronger candidate biomarkers associated with affected endometrial function in endometriosis. Repeated genes associated with the P-phase (ACTB, ANXA4, BPIFB1, EPHX1, MUC5B, PRDX2, and VIM) could indicate stronger genetic causes associated with the pathophysiology of endometriosis. Repeated genes from the MS-phase (CCN1, CRISP3, EGR1, FOS, FOSB, and TRPM6) could be associated with affected receptivity in endometriosis. However, independent validation studies are now needed to verify these hypotheses. Most of the data in the catalogue comprised dysregulated mRNA transcripts, which were supplemented with approved HGNC gene symbols. However, data extracted from publications also included proteins and genes coding for epigenetic regulators (sncRNAs and lncRNAs), which were also supplemented with the corresponding HGNC gene symbols. In the case of epigenetic marks (altered DNA methylation), the gene symbol of the associated differentially expressed gene was added to the catalogue. The adopted gene nomenclature of reported RNA transcripts and proteins with altered expression levels in endometriosis at various omics levels enabled downstream analysis, since most bioinformatics tools require the input of official gene symbols. LncRNAs whose official gene symbols are not yet available were also listed in the gene catalogue. Since the HGNC nomenclature is continuously updated, this catalogue will enable re-analysis of the gene list in the future.

Types of omics data in the gene catalogue

Five genes (ANXA4, CDA, CDK10, DST, and HSP90AB1) from the catalogue were reported to be associated with endometriosis at two omics levels, for example ANXA4 at the transcriptomics [34] and proteomics [30] levels. As reported previously, there is no complete correlation between the transcriptome (the expressed portion of the genome) and the proteome (the set of proteins expressed from the genome). This is because post-transcriptional mechanisms such as miRNA-mediated regulation impact gene expression by degrading target mRNAs and/or inhibiting their translation [58]. Additionally, alternative splicing of precursor mRNA [59] and post-translational modifications [60] result in several protein isoforms. The potential role of epigenetic mechanisms in endometriosis should be further investigated, since it may provide an insight into the molecular basis of altered expression. Wang et al. [27] reported an association between the altered expression levels of AC002454.1 antisense lncRNA and target CDK6 mRNA in endometriosis. In addition, combined analyses of both mRNAs and ncRNAs by Zhou et al. [29] and Cui et al. [38] reported an association of differentially expressed sncRNAs and lncRNAs with their putative mRNA targets. Epigenomics studies by Naqvi et al. [49] and Houshdaran et al. [50] also provided a molecular explanation for the observed differences in gene expression levels through the associated aberrant DNA methylation status.

Functional enrichment analysis by synthesised data

Functional enrichment analysis using genes sorted according to the phases of the menstrual cycle identified GO terms and KEGG pathways potentially related to the pathogenesis of endometriosis as well as to affected physiological processes required for normal endometrial function and receptivity. Among the top significantly enriched pathways, the oestrogen signalling pathway was associated with the P-phase. This is in agreement with Makieva et al. [14], who reviewed the important contribution of oestradiol to the downstream pathways that enhance mitotic activity, causing the thickening of the functional layer in the P-phase of normo-ovulatory women. Further, negative regulation of growth and G1/S transition of the mitotic cell cycle, associated with the ES-phase, and the MAPK (mitogen-activated protein kinase) signalling pathway, associated with the S-phase gene set, could indicate endometrial dysfunction in endometriosis. This is in accordance with published studies [61, 62]. Velarde et al. demonstrated an increased MAPK and ERK kinase 1/2 signalling cascade that inhibited cAMP-dependent cell cycle regulation in endometrial stromal fibroblasts from women with endometriosis, which was further associated with potential persistence of endometrial cell proliferation from the P- to the S-phase [61]. Similarly, Yotova et al. showed an association of higher Ras/B-Raf/MAPK signalling activity with increased proliferation and migration rates in primary eutopic endometrial stromal cells of patients with endometriosis [62].

Extracellular matrix (ECM) organization, ECM-receptor interaction, and focal adhesion pathways that were found to be associated with the S-phase gene list could also be related to pathophysiological mechanisms in endometriosis. In the literature, increased levels of metalloproteinases with a role in rearrangement of the ECM [63] and up-regulated levels of the combined adhesion molecule ITGAV/ITGB3 integrin [64] were associated with greater invasiveness and susceptibility of sloughed menstrual cells to implant at ectopic sites in endometriosis. In the present study, the pathways in cancer term was associated with the S-, MS-, and LS-phases. Although not statistically significant, the MicroRNAs in cancer term was enriched in the P-phase sub-group of genes. Sapalidies et al. reviewed genetic and epigenetic interactions that may contribute to the rare event of malignant transformation of the endometriosis lesions [65]. In the present analysis, the PI3K-Akt signalling pathway was also found to be associated with the S-phase group of genes. This is in accordance with the study by Kim et al. [66], who found that increased activation of the AKT pathway in Ptenf/+ and PRcre/+Ptenf/+ mice with autologous implantation of human endometrial tissue promoted development of ectopic lesions. The antigen processing and presentation pathway associated with the S-phase in the present study may indicate an association of endometriosis with other pathogenetic mechanisms. Matarese et al. [67] reviewed endometriosis as a chronic inflammatory disease where the immune system induces autoimmunity, which favours the formation of endometriotic lesions.

In the present study, the cell-cell adhesion pathway enriched by the S-phase gene list may characterise affected endometrial receptivity in endometriosis. Khorram and Lessey [68] associated decreased expression levels of ITGA5/ITGB3 integrin during the WOI in women with endometriosis with an unfavourable environment for embryo implantation. Furthermore, endothelial cell chemotaxis, inflammatory response, monocyte chemotaxis, and lymphocyte chemotaxis pathways associated with the MS-phase may also characterise affected receptivity. This is in accordance with the study by Lee et al. [69], which reviewed the role of chemokines and cytokines in the recruitment of innate and adaptive immune cells for successful embryo implantation.

Some pathways overlapped across different phases of the menstrual cycle. A total of 33 GO-BP and KEGG pathway terms (Table 4) were found to be associated with more than one phase-specific gene list. For example, the oestrogen signalling pathway was associated with the P- and S-phases, negative regulation of apoptotic process was associated with the P-, S-, ES-, and MS-phases, cell-cell signalling was associated with the P- and MS-phases, and female pregnancy was associated with the ES- and MS-phases. These pathways could indicate stronger potential for pathway-based identification of biomarkers associated with endometrial function in endometriosis.

The heterogeneity of retrieved studies

Studies that were included in the present integrative analysis exhibit heterogeneity at various levels, i.e. study design, recruitment criteria for participating women, sample size, procedures of endometrial tissue processing, analysed omics level, platforms for genome-wide profiling, and data presentation. Different techniques for endometrial tissue sampling, preservation, and extraction were used. Also, the number of endometrial samples used and the platforms applied for genome-wide profiling vary across the studies. Studies had heterogeneous enrolment criteria for participating women, i.e. different types and stages of endometriosis, often with coexisting uterine/pelvic pathologies, different gynaecological conditions of control women, as well as symptoms of chronic pelvic pain and/or infertility. In addition, indications for endometrium tissue sample collection were heterogeneous, including endometrial biopsy, laparoscopy, hysteroscopy, or hysterectomy.

In the present GSEA, all extracted genes across 21 studies were included, regardless of the type and stage of endometriosis, coexisting uterine/pelvic pathologies, and symptoms of chronic pelvic pain and/or infertility. Therefore, identified enriched terms should be interpreted with caution. It was suggested by Painter et al. [70] that severe endometriosis may have a different genetic origin compared with mild endometriosis, because the most significant single nucleotide polymorphisms (SNPs) were associated with stage III/IV endometriosis in a genome-wide association study (GWAS). In addition, transcriptomics studies [37, 71] distinguished endometrial expression signatures between mild and severe endometriosis. Some studies recruited women with endometriosis complicated by additional uterine/pelvic pathologies and/or infertility in the case group [25, 37, 41, 47]. It has been reported [72, 73] that non-endometriosis gynaecological pathologies and infertility impact endometrial expression patterns as well. Hever et al. [72] distinguished endometrial transcriptome signatures according to the presence of adenomyosis and fibroids, while Koot et al. [73] observed a unique endometrium gene expression signature during the MS-phase in women with recurrent implantation failure (RIF) and associated endometrium dysfunction as one of the determining factors of infertility in assisted reproductive technology (ART) treatments. Burney et al. [40] segregated endometriosis cases and control women with leiomyomas on the basis of endometrial miRNA expression patterns when unsupervised hierarchical clustering was applied. However, one endometriosis case was clustered together with controls, since this patient had a coexisting leiomyoma [40].

Some studies recruited women with leiomyomas, adenomyosis, uterine adhesions, menorrhagia, dysmenorrhoea, chronic pelvic pain, and/or infertility in the control group, which could limit identification of loci specific for endometriosis [25, 37, 41, 45, 47]. Tamaresis et al. demonstrated that different uterine/pelvic pathologies leave their own fingerprints in the endometrial transcriptome signature when comparing four different groups of women, i.e. women with mild endometriosis, women with severe endometriosis, healthy women, and women with non-endometriosis uterine/pelvic pathologies, including leiomyomas, adenomyosis, benign ovarian cysts, and endometrial polyps [37].

The catalogue developed in our study, which combines the assembled genetic data (Supplementary file S1) with the characteristics of the corresponding extracted studies (Table 1), enables researchers to perform additional downstream bioinformatics analyses according to different sorting criteria (experimental characteristics and clinical data). For example, the data can be sorted according to stage or type of endometriosis or according to coexisting gynaecological conditions.
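As an illustration of such sorting, the following minimal Python sketch filters and groups catalogue entries by study characteristics. It is not the workflow used in this study: the record fields (gene, omics, phase, stage, coexisting), the placeholder gene names, and the helper functions filter_catalogue and group_genes are all hypothetical and do not reflect the actual layout of Supplementary file S1 or Table 1.

```python
from collections import defaultdict

# Hypothetical catalogue rows; field names and values are placeholders,
# not the real structure of Supplementary file S1.
catalogue = [
    {"gene": "GENE_A", "omics": "transcriptomics", "phase": "MS",
     "stage": "III-IV", "coexisting": "none"},
    {"gene": "GENE_B", "omics": "transcriptomics", "phase": "ES",
     "stage": "I-II", "coexisting": "none"},
    {"gene": "GENE_C", "omics": "proteomics", "phase": "MS",
     "stage": "III-IV", "coexisting": "leiomyoma"},
    {"gene": "GENE_A", "omics": "epigenomics", "phase": "ES",
     "stage": "I-II", "coexisting": "none"},
]


def filter_catalogue(records, **criteria):
    """Keep the records whose fields equal every supplied criterion."""
    return [
        r for r in records
        if all(r.get(field) == value for field, value in criteria.items())
    ]


def group_genes(records, key):
    """Collect the sorted, de-duplicated gene names for each value of `key`."""
    groups = defaultdict(set)
    for r in records:
        groups[r[key]].add(r["gene"])
    return {value: sorted(genes) for value, genes in groups.items()}


# Severe endometriosis without coexisting gynaecological conditions.
severe_only = filter_catalogue(catalogue, stage="III-IV", coexisting="none")

# Phase-specific gene lists that could feed a downstream enrichment analysis.
genes_by_phase = group_genes(catalogue, "phase")
```

Gene lists produced in this way could then be passed to an enrichment tool, so that each subgroup (for example, one stage or one coexisting condition) is analysed separately.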

Guidelines for study design harmonisation

The World Endometriosis Research Foundation (WERF) Endometriosis Phenome and Biobanking Harmonisation Project (EPHect) aims to overcome study design variability in endometriosis research. Harmonising standard operating procedures (SOPs) for sampling, processing, and storage of endometrial tissue biopsies from participating women would reduce biases and measurement errors and provide detailed surgical characterisation, including determination of the menstrual phase at the time of eutopic endometrium sample collection from women with and without endometriosis and the surgical and clinical phenotypic presentation of lesions [20]. In addition, documenting nonsurgical aspects, including chronic pelvic pain, subfertility, reproductive history, menstrual history, hormone therapies used, medical and surgical history, and personal information, would improve comparison across studies and enable large-scale collaborative research [74]. WERF EPHect provides directions for a harmonised workflow across studies to identify reliable loci specific for sub-types of patients with endometriosis. We additionally recommend reporting gene symbols and names in publications using standardised HGNC nomenclature, which will facilitate data integration across studies and data exchange among researchers.
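The nomenclature recommendation can be pictured with a short Python sketch that maps reported gene identifiers to approved symbols before studies are merged. This is only a sketch under stated assumptions: the two-entry alias table is hand-written for illustration rather than taken from an HGNC download, the function harmonise is hypothetical, and upper-casing the input is a simplification, since some approved symbols contain lower-case letters.

```python
# Tiny hand-written lookup for illustration only; a real analysis would
# load the complete, current symbol and alias lists from HGNC.
APPROVED = {"ESR1", "PGR"}
ALIAS_TO_APPROVED = {
    "NR3A1": "ESR1",  # nuclear receptor nomenclature alias
    "NR3C3": "PGR",   # nuclear receptor nomenclature alias
}


def harmonise(symbols):
    """Map reported symbols to approved ones and list those left unresolved.

    Returns (sorted unique approved symbols, unresolved inputs as given).
    """
    mapped = set()
    unresolved = []
    for raw in symbols:
        # Simplification: trim whitespace and upper-case before lookup.
        symbol = raw.strip().upper()
        if symbol in APPROVED:
            mapped.add(symbol)
        elif symbol in ALIAS_TO_APPROVED:
            mapped.add(ALIAS_TO_APPROVED[symbol])
        else:
            unresolved.append(raw)  # flag for manual curation
    return sorted(mapped), unresolved


reported = ["esr1", " NR3A1", "PGR", "NR3C3", "XYZ_UNKNOWN"]
approved_symbols, needs_review = harmonise(reported)
```

Returning the unresolved identifiers separately, rather than dropping them, keeps ambiguous or outdated names visible for manual curation.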

Limitations of the study

Besides contributing to the research field, the present study also has some limitations: (1) The gene catalogue was constructed from the top differentially expressed transcripts and proteins measured by high-throughput methodologies; most of them were not further validated and may therefore suffer from biases that could affect the GSEA. (2) Studies were unequally distributed across omics levels, with a higher number of transcriptomics studies than epigenomics and proteomics studies. (3) Studies were unequally distributed across phases of the menstrual cycle. For example, because few ncRNomics studies were performed in the S-phase, no enriched pathways associated with miRNA categories were obtained for specific sub-phases of the S-phase. (4) Determination of the menstrual cycle phase can be subjective, which may lead to incorrect sorting into phase-specific gene lists and influence the GSEA. Endometrial dating in the extracted publications was mainly determined by histological examination of an endometrial biopsy or was categorised based on the self-reported date of the last menstrual period and the cycle day at the time of sampling. In addition, some publications reported the stage of the menstrual cycle only as the general S-phase, while others reported specific sub-phases of the S-phase (ES-, MS-, or LS-phase). (5) Indications for endometrial tissue sample collection varied among participating women, who were often affected by non-endometriosis uterine/pelvic pathologies and/or infertility, which may affect molecular patterns in endometriosis cases. (6) Some articles that do not include the term “genome-wide study” in the title or keywords might have been missed by our literature screening and should be included in the next update of the study.

Conclusions

This study presents the first multi-omics data synthesis of reported altered molecular patterns associated with eutopic endometrium in endometriosis. Editing the heterogeneous nomenclature of the reported genetic information resulted in the gene catalogue, which was further sorted according to the phases of the menstrual cycle. Functional enrichment analysis was applied to study the role of the obtained genes in the eutopic endometrium of women with endometriosis. The findings present a source of stronger candidate genes and pathways for further experiments in endometriosis. The current gene catalogue of endometriosis is expected to be expanded by future studies. All reported data from additional omics levels (e.g. interactomics, microbiomics) should be captured to deepen insight into endometrial organisation in endometriosis. The identified pathophysiological and physiological mechanisms in the eutopic endometrium of women with endometriosis could contribute to better diagnosis and treatment of endometriosis and could increase the chances of successful pregnancy in infertile women seeking ART treatments.