To the Editor

Myelodysplastic syndromes (MDSs) are a heterogeneous group of clonal hematopoietic neoplasms. Although recent studies have shown that patients with MDS and acute myeloid leukemia (AML) have different gene mutation patterns [1,2,3,4], the molecular underpinnings remain unknown [5,6,7,8,9,10]. To identify differentially expressed genes (DEGs) related to pediatric MDS (PMDS), we performed RNA-seq in 4 patients with primary PMDS and in 2 pediatric control samples (Additional file 1: Figures S1A-B). Because the number of samples was small, and to limit false positives, we used two independent bioinformatics pipelines, STAR + DESeq2 and SALMON + edgeR, and considered only genes differentially expressed in both pipelines. Hierarchical clustering showed that PMDS patients and controls clustered in two distinct groups (Fig. 1a). In total, 651 DEGs were identified by STAR + DESeq2 and 616 DEGs by SALMON + edgeR (Fig. 1b; Additional file 1: Figures S1C-D). Both pipelines identified 291 DEGs, of which 136 were upregulated and 155 downregulated in patients (Additional file 1: Table 1). As a further validation, we used the LPEseq method. The concordance of gene ranks across the differential gene lists was remarkably high (Additional file 1: Figures S1E-G). We then used gene set enrichment analysis (GSEA) to identify altered pathways from the Reactome database (Web reference 1) (Fig. 1c). The Enrichr enrichment analysis tool revealed that DEGs in PMDS are mainly related to pathways associated with abnormal cell activity, the immune and inflammatory systems, and erythropoiesis (Additional file 1: Figure S2A).
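The consensus step described above (keeping only genes called by both pipelines) can be illustrated with a minimal sketch. It assumes each pipeline's output has already been filtered for significance and reduced to a mapping from gene symbol to log2 fold change. The function name `consensus_degs`, the toy gene names and values, and the requirement that both pipelines agree on the direction of change are assumptions made for illustration, not details taken from the analysis itself.

```python
from typing import Dict, Set, Tuple


def consensus_degs(
    pipeline_a: Dict[str, float],
    pipeline_b: Dict[str, float],
) -> Tuple[Set[str], Set[str]]:
    """Return (upregulated, downregulated) genes called by both pipelines.

    Each input maps a gene symbol to its log2 fold change and is assumed
    to contain only genes that were significant in that pipeline.
    """
    shared = pipeline_a.keys() & pipeline_b.keys()
    # Assumption: a gene counts only if both pipelines agree on direction.
    up = {g for g in shared if pipeline_a[g] > 0 and pipeline_b[g] > 0}
    down = {g for g in shared if pipeline_a[g] < 0 and pipeline_b[g] < 0}
    return up, down


# Hypothetical toy data; GENE_D is discordant and GENE_C/GENE_E are
# called by one pipeline only, so none of them is retained.
star_deseq2 = {"GENE_A": 2.1, "GENE_B": -1.8, "GENE_C": 1.2, "GENE_D": -0.9}
salmon_edger = {"GENE_A": 1.7, "GENE_B": -2.2, "GENE_D": 1.1, "GENE_E": 3.0}

up, down = consensus_degs(star_deseq2, salmon_edger)
print(sorted(up), sorted(down))
```

Taking the intersection trades sensitivity for specificity, which is a reasonable choice when the cohort is small and false positives are the main concern.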

Fig. 1

a Z-score hierarchical clustering analysis and heatmap of differentially expressed genes. The color scale shows gene expression as standard deviations from the mean (green). b Scatterplot of the differentially expressed genes obtained using the SALMON and STAR pipelines (different colors highlight genes identified as differentially expressed in none, one, or both pipelines). c Gene set enrichment analysis (GSEA) rank plots for the top statistically significant Reactome pathways, with Normalized Enrichment Score (NES)

Further, we compared our data with the transcriptomic profiles from The Cancer Genome Atlas (TCGA) database. Interestingly, we found a clear distinction of PMDS from all other types of tumors (Fig. 2a; Additional file 1: Figure S2B). Moreover, the DEG profile was able to divide tumors into three distinct groups (Additional file 1: Figure S3A). As for control samples, we integrated the transcriptomic data from GTEx (Web reference 2) and observed a clear separation between blood-related tissues and other normal tissues (Additional file 1: Figure S3B). Finally, we compared the DEG list with the gene sets available in the Enrichr database, specifically for the “Diseases/Drugs” and “Cell types” categories (Additional file 1: Tables 2–3). We confirmed that the DEGs identified in PMDS are significantly connected with blood tissues and blood disorders (Additional file 1: Figure S3C).

Fig. 2

a T-distributed stochastic neighbor embedding (t-SNE) plot in the expression space of several cancer datasets, showing the two principal dimensions. The data were obtained from the GDC-PAN cancer data portal. The PMDS samples do not cluster near other tumor types, AML in particular (black arrowhead), showing a distinct profile. b Boxplot: ddPCR analysis of twelve genes, comparing expression levels between controls and PMDS patients. For each gene, box–whisker plots of concentration values are shown. Genes are classified as upregulated (red), downregulated (blue) and reference (grey). Significant changes in cDNA concentration between controls and patients are highlighted (one-tailed t test, corrected for unequal variances; *p < 0.05, **p < 0.01, ***p < 0.001)
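A t test corrected for unequal variances, as named in the legend of panel b, is commonly implemented as Welch's t test. The sketch below computes the Welch statistic and the Welch–Satterthwaite degrees of freedom using only the standard library. The function name `welch_t` and the example concentrations are hypothetical, and the code is an illustration of the technique rather than the analysis actually run for the figure.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(patients: Sequence[float], controls: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's t test.

    A positive t means the patient mean is higher than the control mean.
    Raises ZeroDivisionError if both groups have zero variance.
    """
    n1, n2 = len(patients), len(controls)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two values")
    # Squared standard error contributed by each group (sample variance / n).
    se1 = variance(patients) / n1
    se2 = variance(controls) / n2
    t = (mean(patients) - mean(controls)) / sqrt(se1 + se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical cDNA concentrations for one gene.
t_stat, dof = welch_t([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
print(t_stat, dof)
```

The one-tailed p-value is the upper (or lower) tail of a t distribution with `dof` degrees of freedom evaluated at `t_stat`. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False, alternative="greater")` performs the whole test in one call.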

A comparison of our PMDS DEGs with multiple RNA-seq datasets from adult MDS samples revealed a statistically significant overlap (67 of the 136 upregulated DEGs). Nonetheless, the remaining 69 upregulated genes and almost all downregulated genes were unique to PMDS (Additional file 1: Figure S4A-B; Additional file 1: Table 4).
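One common way to judge whether an overlap between two gene lists is larger than expected by chance is a one-sided hypergeometric test. The sketch below is an illustration of that test, not necessarily the method used for the comparison above. The function name `overlap_p_value` is hypothetical, and the background size (20,000 genes) and adult MDS list size (500 genes) in the example are placeholder assumptions; only the counts 136 and 67 come from the text.

```python
from math import comb


def overlap_p_value(n_background: int, n_reference: int, n_query: int, n_overlap: int) -> float:
    """P(overlap >= n_overlap) under the hypergeometric null.

    n_background: total genes tested (the universe)
    n_reference:  genes in the reference list (e.g. adult MDS DEGs)
    n_query:      genes in the query list (e.g. PMDS upregulated DEGs)
    n_overlap:    genes found in both lists
    """
    total = comb(n_background, n_query)
    upper = min(n_reference, n_query)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_reference, i) * comb(n_background - n_reference, n_query - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# 136 and 67 are from the text; 20000 and 500 are assumed placeholders.
p = overlap_p_value(n_background=20000, n_reference=500, n_query=136, n_overlap=67)
print(p)
```

The result depends strongly on the choice of background, which should be the set of genes that could have been called in both studies rather than the whole genome.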

We then validated the most statistically significant and biologically relevant DEGs, both up- and downregulated. Analysis by ddPCR showed significant differences between patient and control samples (Fig. 2b). The log2 fold-change values for all 10 genes were highly correlated between RNA-seq and ddPCR (Additional file 1: Figure S5). We also validated the DEGs in 6 new PMDS patients (Additional file 1: Figure S6). Additionally, we compared our data with those from 36 pediatric patients [3]. The comparative data on the 10 DEGs in PMDS and the validation are shown in Additional file 1: Figure S7.
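The agreement between fold changes measured by two methods can be summarized with a correlation coefficient. The sketch below computes log2 fold changes from group means and their Pearson correlation. The function names and all numeric values are hypothetical, and Pearson's r is an assumed choice, since the text does not state which correlation measure was used.

```python
from math import log2, sqrt
from typing import Sequence


def log2_fold_change(patient_mean: float, control_mean: float) -> float:
    """log2 of the patient/control ratio; both means must be positive."""
    return log2(patient_mean / control_mean)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical log2 fold changes for the same genes from two methods.
rnaseq_lfc = [2.3, 1.8, -1.5, -2.1, 3.0]
ddpcr_lfc = [2.0, 1.5, -1.2, -2.4, 2.7]
print(pearson_r(rnaseq_lfc, ddpcr_lfc))
```

With only 10 genes, a rank-based measure such as Spearman's coefficient is a reasonable alternative when the relationship is monotonic but not linear.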

In conclusion, we have identified 291 DEGs that correlate with PMDS and might represent novel candidate genes for therapeutic intervention. Although a larger study cohort would be desirable, our data suggest that, at the level of gene expression, PMDS is indeed a distinct disorder.