Background

Uterine leiomyosarcoma (ULMS) is a malignant soft tissue tumor with distinctive morphologic features and molecular signatures [1]. ULMS has a poor prognosis and a high recurrence rate [2,3,4]. Currently, ULMS is treated mainly by surgery combined with adjuvant therapies such as cytotoxic chemotherapy and radiotherapy [5,6,7]. Owing to the complex molecular heterogeneity of ULMS and the lack of targeted therapies, the five-year survival rate remains low [8]. Gene expression profiling has been used to categorize a number of malignant tumors, including breast cancer, gastric cancer, and uterine carcinosarcoma [9,10,11,12,13], into molecular subtypes. Based on subtype information, patients may receive a more precise diagnosis and more effective therapeutic options [14]. Thus, it is important to classify the molecular subgroups of ULMS, which may provide a better understanding of the disease mechanism and guide future precision treatment. Previously, we analyzed leiomyosarcoma (LMS) cases from uterine and extra-uterine sites to classify molecular subtypes [15]. In clinical practice, however, treatment differs between uterine and extra-uterine LMS patients. Italiano et al. demonstrated the molecular heterogeneity of extra-uterine LMS by genetic profiling [16], but the molecular heterogeneity of ULMS has so far received less attention.

In this study, by analyzing gene expression data sets, we identified and defined two molecular subtypes of ULMS, each with a subtype-specific gene expression pattern. Genes and pathways enriched in subtype I ULMS were associated with smooth muscle function, while genes and pathways involved in epithelial-to-mesenchymal transition (EMT) and tumorigenesis were enriched in subtype II ULMS. Our findings may provide a better understanding of ULMS pathogenesis and facilitate the development of more effective and individualized therapies.

Methods

Bioinformatic analyses and immunohistochemistry staining

To identify the molecular subtypes of ULMS, we collected expression profile data sets from The Cancer Genome Atlas (TCGA) database and determined the optimal number of molecular subtypes by consensus clustering (R package ConsensusClusterPlus [17]). The accuracy of subtype assignments was evaluated by silhouette analysis (R package cluster [18]). Subtype-specific gene expression patterns were investigated by Gene Set Enrichment Analysis (GSEA) and Significance Analysis of Microarrays (SAM-seq). To identify the pathways enriched in each subtype, KEGG pathway analysis was performed online (https://david.ncifcrf.gov/). Cluster 3.0 and TreeView were used for hierarchical clustering and visualization of the top 500 significantly over-expressed genes from each subtype. To develop subtype-specific diagnostic biomarkers, genes over-expressed in subtype I (LMOD1; 1:20 dilution, Sigma, CAT#HPA028325) and subtype II (ARL4C; 1:120 dilution, Sigma, CAT#HPA028927) ULMS were selected for immunohistochemistry (IHC) staining based on the SAM-seq results and antibody availability. IHC was performed and scored as described previously [15].
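The consensus clustering idea can be illustrated with a minimal Python sketch. It is not the R implementation used in this study: the toy coordinates, the function names, the 80% subsampling fraction, and the choice of average-linkage clustering on Euclidean distance are all assumptions made for illustration, and only samples (not genes) are resampled. The area under the consensus CDF is computed from the identity that, for values in [0, 1], it equals one minus their mean.

```python
import random
from itertools import combinations
from math import dist

def average_linkage(points, k):
    """Agglomerative clustering with average linkage, cut at k clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(pair):
        a, b = pair
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def consensus_matrix(points, k, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of items is clustered together."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), max(k, int(frac * n)))
        labels = average_linkage([points[i] for i in idx], k)
        for a, b in combinations(range(len(idx)), 2):
            i, j = idx[a], idx[b]
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [[1.0 if i == j else
             (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]

def cdf_area(matrix):
    """Area under the empirical CDF of the off-diagonal consensus values."""
    n = len(matrix)
    vals = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    return 1.0 - sum(vals) / len(vals)

# Hypothetical toy "expression profiles": two well-separated groups of four cases
toy = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
for k in (2, 3):
    print(k, round(cdf_area(consensus_matrix(toy, k)), 3))
```

In practice the area would be compared across candidate values of k, and a k whose consensus values sit close to 0 or 1 indicates stable assignments.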

Statistical analyses

Statistical significance was assessed by the chi-square and Fisher exact tests. For all statistical analyses, a p value of less than 0.05 was considered statistically significant.
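For a 2 x 2 contingency table, both tests can be sketched in a few lines of standard-library Python. This is an illustration only: the counts below are hypothetical and are not taken from Table 1, the function names are ours, and the chi-square version applies no continuity correction, which may differ from the software actually used.

```python
from math import comb, erfc, sqrt

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p value (1 degree of freedom, uncorrected)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2))
    return stat, erfc(sqrt(stat / 2))

# Hypothetical counts: rows = subtype I / II, columns = responder / non-responder
table = (6, 1, 1, 5)
p_fisher = fisher_exact_two_sided(*table)
stat, p_chi = chi_square_2x2(*table)
print(f"Fisher p = {p_fisher:.4f}; chi-square = {stat:.3f}, p = {p_chi:.4f}")
```

With small expected cell counts, as is likely for a cohort of this size, the Fisher exact test is generally the more appropriate of the two.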

Results

Consensus clustering of gene expression profiles revealed two molecular subtypes of uterine leiomyosarcoma

Level 3 RNAseq expression data for 29 ULMS cases were collected from The Cancer Genome Atlas (TCGA) and used to assess the molecular heterogeneity of ULMS by consensus clustering (Fig. 1a), a method that estimates cluster stability by iterative resampling of genes and samples [17]. Consensus clustering indicated that two was the optimal number of subtypes for ULMS, as shown by the empirical cumulative distribution function (CDF) plots, which showed the greatest increase in the area under the CDF curve (Additional file 1: Figure S1A and B). Next, the confidence of the subtype assignments from consensus clustering was evaluated by silhouette analysis (Fig. 1b), which showed that all cases in both subtypes had a positive silhouette value, supporting the two molecular subtypes of ULMS.
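The silhouette value of a case compares its mean distance to the other members of its own cluster (a) with its mean distance to the nearest other cluster (b), as (b - a) / max(a, b); a positive value means the case sits closer to its assigned cluster. The following Python sketch illustrates the calculation under stated assumptions: it uses Euclidean distance on hypothetical two-dimensional points, whereas the analysis in this study was run with the R package cluster on expression data.

```python
from math import dist

def silhouette_values(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = set(labels)
    values = []
    for i, p in enumerate(points):
        # Mean distance from point i to the members of each cluster (excluding itself)
        mean_dist = {}
        for c in clusters:
            d = [dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        a = mean_dist.get(labels[i])
        others = [v for c, v in mean_dist.items() if c != labels[i]]
        # Singleton clusters and degenerate cases are assigned 0 by convention
        if a is None or not others or max(a, min(others)) == 0:
            values.append(0.0)
            continue
        b = min(others)
        values.append((b - a) / max(a, b))
    return values

# Hypothetical toy data: two compact groups of three cases each
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["I", "I", "I", "II", "II", "II"]
for lab, s in zip(labels, silhouette_values(points, labels)):
    print(lab, round(s, 3))
```

A case assigned to the wrong group would receive a negative value, which is why uniformly positive values are read as support for the assignments.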

Fig. 1

Identification of two distinct molecular subtypes of ULMS. a Consensus clustering reveals two distinct molecular subtypes of ULMS. Each column corresponds to a case of ULMS. b Silhouette analysis validates the subtype assignments from consensus clustering

Clinicopathologic features of ULMS molecular subtypes

Next, we compared the clinicopathologic features of subtype I and subtype II ULMS patients. As shown in Table 1, ULMS subtype was significantly associated with clinical treatment response. Specifically, subtype I patients responded significantly better to chemotherapy than subtype II patients. However, there was no significant association between molecular subtype and other clinicopathologic characteristics, including tumor weight, metastasis status, invasion and necrosis (Table 1).

Table 1 Clinicopathologic characteristics (N = 29)

Distinct molecular subtypes of ULMS have different gene expression patterns

We next investigated the subtype-specific gene expression patterns of ULMS by Gene Set Enrichment Analysis (GSEA) [19]. GSEA showed that 2669 gene sets were enriched: 1568 in subtype II and 1101 in subtype I (Fig. 2a). Gene sets associated with leiomyosarcoma and myogenic targets were enriched in subtype I (Fig. 2b), whereas gene sets involved in EMT and tumorigenesis were enriched in subtype II (Fig. 2c). Interestingly, genes over-expressed in subtype II were also associated with cell cycle, proliferation, organ development and tumorigenesis. These genes include CDK6, BMP1, MAPK13, PDGFRL and HOXA1 (Fig. 3). Subtype I ULMS was enriched with genes involved in smooth muscle function (Fig. 3), including LMOD1, SLMAP, MYLK, and MYH11, all of which are smooth muscle-specific markers [20,21,22].
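The enrichment score at the core of GSEA is a running-sum statistic: genes are ranked by their differential expression between the two subtypes, the sum increases at each gene that belongs to the gene set and decreases otherwise, and the score is the largest deviation from zero. The Python sketch below illustrates this under stated assumptions: the ranking scores are invented for illustration (positive meaning higher in subtype I), the function name and the toy gene sets are ours, and the permutation step used to assess significance is omitted.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score.

    ranked: list of (gene, score) pairs sorted by score, highest first.
    gene_set: set of gene names; hits are weighted by |score| ** weight.
    """
    hit_weights = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked, hit_weights):
        running += w / total_hit if gene in gene_set else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best

# Hypothetical ranking scores; only the gene names come from the text
ranked = [("LMOD1", 3.0), ("MYH11", 2.5), ("MYLK", 2.0), ("SLMAP", 1.0),
          ("BMP1", -1.0), ("HOXA1", -2.0), ("MAPK13", -2.5), ("CDK6", -3.0)]
smooth_muscle = {"LMOD1", "MYH11", "MYLK", "SLMAP"}
proliferation = {"CDK6", "MAPK13", "HOXA1"}
print(round(enrichment_score(ranked, smooth_muscle), 3))
print(round(enrichment_score(ranked, proliferation), 3))
```

A positive score indicates that the set is concentrated at the top of the ranking (here, the subtype I end) and a negative score that it is concentrated at the bottom.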

Fig. 2

Different gene sets enriched in distinct molecular subtypes. a The summary of GSEA results. b and c The gene sets enriched in subtype I and subtype II, respectively. Permutation = 1000, p < 0.05

Fig. 3

Different gene expression signatures enriched in distinct molecular subtypes. Subtype I and subtype II ULMSs have different gene expression signatures, as revealed by GSEA. Each row denotes a gene and each column corresponds to a case of ULMS. Red, over-expressed genes; Blue, under-expressed genes

To further explore the different gene expression patterns of the two subtypes of ULMS, we performed significance analysis by SAM-seq. Our results showed that 1947 genes were significantly differentially expressed between the two subtypes. Among these genes, 1050 were over-expressed in subtype II ULMS and 897 were over-expressed in subtype I. Next, we analyzed the top 500 over-expressed genes in each subtype by hierarchical clustering. As the heatmap shows, these genes were over-expressed in subtype I and subtype II, respectively (Additional file 2: Figure S2). Consistent with the GSEA results, the pathways enriched in subtype I were associated with smooth muscle function, such as vascular smooth muscle contraction, the calcium signaling pathway, and regulation of the actin cytoskeleton (Table 2). Pathways involved in tumorigenesis were associated with subtype II, such as pathways in cancer and the TGF-β and Hedgehog signaling pathways (Table 2).

Table 2 Pathways enriched in each molecular subtype

Immunohistochemistry staining of ULMS subtype specific biomarkers

To further validate the ULMS subtypes and develop subtype-specific diagnostic biomarkers, we selected subtype-specific genes for IHC staining based on the SAM-seq results and the availability of commercial antibodies. The subtype I biomarker LMOD1 stained positively in 51 of 68 ULMS cases (75%), while the subtype II biomarker ARL4C was positive in 13 of 68 cases (19%) (Fig. 4a). The correlation coefficient between the staining results of these two genes across all ULMS cases was −0.43 (Fig. 4b).
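A negative correlation between two markers scored on the same cases can be computed as in the Python sketch below. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here; the numeric encoding of staining (0 = negative, 1 = weak, 2 = strong) and the example scores are likewise hypothetical and are not the study data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical IHC scores for eight cases (0 = negative, 1 = weak, 2 = strong)
lmod1 = [2, 2, 1, 2, 0, 0, 1, 2]
arl4c = [0, 0, 0, 1, 2, 1, 0, 0]
r = pearson_r(lmod1, arl4c)
print(f"r = {r:.2f}")
```

Because IHC scores are ordinal, a rank-based coefficient such as Spearman's would be a reasonable alternative.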

Fig. 4

Immunohistochemistry staining of ULMS subtype-specific biomarkers. a Representative staining of LMOD1 and ARL4C for a subtype I ULMS case (case # pt69) and a subtype II ULMS case (case # pt103). b Heatmap of LMOD1 and ARL4C IHC staining results for 68 ULMS cases. Bright red and dull red represent strong and weak staining, while green and black indicate negative and equivocal staining

Discussion

Uterine sarcomas comprise leiomyosarcoma, endometrial stromal sarcoma and carcinosarcoma. Among these, leiomyosarcoma is the most common subclass and is found mainly in postmenopausal women [1, 23]. Although early diagnosis could improve the survival rate of ULMS patients, treating late-stage ULMS remains challenging because of its high invasiveness and relatively high resistance to radiotherapy and chemotherapy [24]. Molecular subtyping of tumors based on gene expression profiling has guided subtype-specific diagnosis and prognosis and aided the development of subtype-targeted therapies [17]. In our study, we identified two molecular subtypes of ULMS and found that these two subtypes exhibited significantly different gene expression patterns and distinct sensitivities to chemotherapy. Nevertheless, it should be noted that the number of patients with clinical treatment response information in this study is low, and large-scale validation of treatment responses in the distinct molecular subtypes will be needed.

Among the genes and pathways enriched, subtype I ULMS showed over-expression of smooth muscle-specific markers, including LMOD1, SLMAP, MYLK and MYH11. LMOD1, also known as Leiomodin-1, can be activated by serum response factor (SRF) or myocardin (MYOCD) and functions in smooth muscle cell differentiation [20]. SLMAP, or sarcolemmal membrane-associated protein, is involved in microtubule organization [25], excitation-contraction coupling [26] and myoblast fusion [27]. Myosin light chain kinase and Myosin-11, encoded by MYLK and MYH11, respectively, are components of the smooth muscle cell (SMC) contractile apparatus [28]. MYLK is a Ca2+/CaM-dependent kinase and is involved in smooth muscle contraction by promoting the interaction between myosin and actin filaments [29]. Myosin-11 belongs to the myosin heavy chain family and generates contractile force by hydrolyzing ATP [30].

Subtype II ULMS showed over-expression of genes enriched in pathways including pathways in cancer, TGF-β signaling and Hedgehog signaling. In particular, CDK6, MAPK13 and HOXA1 were over-expressed. CDK6 (Cyclin-dependent kinase 6) is a cell cycle regulator that forms a complex with cyclin D to initiate the G1 to S phase transition by phosphorylating and inactivating Rb [31,32,33]. MAPK13 (Mitogen-activated protein kinase 13) belongs to the MAP kinase family and functions in cell proliferation, differentiation and development. MAPK13 is also involved in cell motility and invasion and has been reported as a diagnostic marker for cholangiocarcinoma [34]. MAPK13 is highly expressed in uterine and ovarian tumor tissues, especially in gynecological cancer stem cells, and has tumor-initiating activity that contributes to tumorigenesis [35]. HOXA1 (Homeobox A1), which is highly expressed in many types of tumor cells, is a DNA-binding protein involved in facilitating cell proliferation, invasion, metastasis, and tumor progression [36, 37]. These proto-oncogenes were all over-expressed in subtype II, suggesting that subtype II ULMS may be more malignant than subtype I and may represent high-grade ULMS.

Conclusions

In conclusion, we characterized distinct intrinsic molecular subtypes of ULMS with different gene signatures. Our findings provide new insights into the pathogenesis of ULMS and may facilitate the development of subtype-specific diagnostic biomarkers and targeted treatments for ULMS. Furthermore, our findings may provide valuable information for developing individualized medicine for ULMS patients.