Deep analysis of neuroblastoma core regulatory circuitries using online databases and integrated bioinformatics shows their pan-cancer roles as prognostic predictors

Aim Neuroblastoma is a heterogeneous childhood cancer derived from the neural crest. The dual cell identities of neuroblastoma include Mesenchymal (MES) and Adrenergic (ADRN). These identities are conferred by a small set of tightly-regulated transcription factors (TFs) binding super enhancers, collectively forming core regulatory circuitries (CRCs). The purpose of this study was to gain a deep understanding of the role of MES and ADRN TFs in neuroblastoma and other cancers as potential indicators of disease prognosis, progression, and relapse. Methods To that end, we first investigated the expression and mutational profile of MES and ADRN TFs in neuroblastoma. Moreover, we established their correlation with neuroblastoma risk groups and overall survival while establishing their extended networks with long non-coding RNAs (lncRNAs). Furthermore, we analysed the pan-cancer expression and mutational profile of these TFs and their correlation with patient survival and finally their network connectivity, using a panel of bioinformatic tools including GEPIA2, human pathology atlas, TIMER2, Omicsnet, and Cytoscape. Results We show the association of multiple MES and ADRN TFs with neuroblastoma risk groups and overall survival and find significantly higher expression of various MES and ADRN TFs compared to normal tissues and their association with overall survival and disease-free survival in multiple cancers. Moreover, we report the strong correlation of the expression of these TFs with the infiltration of stromal and immune cells in the tumour microenvironment and with stemness and metastasis-related genes. Furthermore, we reveal extended pan-cancer networks comprising these TFs that influence the tumour microenvironment and metastasis and may be useful indicators of cancer prognosis and patient survival. Conclusion Our meta-analysis shows the significance of MES and ADRN TFs as indicators of patient prognosis and the putative utility of these TFs as potential novel biomarkers. Supplementary Information The online version contains supplementary material available at 10.1007/s12672-021-00452-3.


Gene Ontology of MES and ADRN genes and their correlation with NB risk groups
A list of 485 MES and 369 ADRN genes was subjected to Gene Ontology enrichment analysis using PANTHER (Additional file 1: Materials S1) [5][6][7]34]. Despite the paucity of somatic mutations in NB and the importance of epigenetic regulation in this cancer [35], we subjected 854 MES and ADRN extended CRC network genes to the cBioportal genetic alteration tool, having selected 1472 NB patient samples including AMC Amsterdam, Nature 2012 (87 samples), Broad, Nature Genetics 2013 (240 samples), Broad, Nature 2015 (56 samples) and Paediatric NB, TARGET, 2018 (1089 samples) (Additional file 1: Materials S1) [36,37].
The expression correlation analyses were performed on cBioPortal RNA sequencing data of 143 NB patient samples (TARGET, 2018) for the 38 MES and ADRN genes. The top 20 positively correlated genes were shortlisted for each of the MES and ADRN genes in order to select the strongest co-expression correlations. Amongst them, 9 lncRNAs positively correlated with 10 ADRN and MES genes, most of which with strong correlations (Spearman > 0.7).
The correlation between TFs and lncRNAs expression with patient OS was obtained from cBioPortal and the following method was used: the samples were grouped based on the upregulation (high) or downregulation (low) of each queried gene, using z-score equals 1 or a consistent criterion of at least 10% of samples displaying upregulation and relevant survival plots were generated. In this statistical test, the variable of interest was the time elapsed before event occurrence. Accordingly, increased and decreased OS accounted for an elongated or shortened duration for which the patient was alive.

Gene expression analysis
Expression profiles of the 38 TFs were investigated in more than 30 cancer types using TCGA (tumour) and matching normal tissue data using the GEPIA2 expression DIY module [38]. This programme collated expression data from tumour and normal tissue as the log2 fold change (log2FC) with a cut-off of 1 and p < 0.01 and used one-way ANOVA to assign statistical significance. Cancer name abbreviations have been provided in the list of abbreviations. We also linked the expression of the 38 TFs in over 30 cancer types with their mutational profile in these cancers using cBioPortal (Additional file 1: Materials S2).

Independent prognostic summary
The 38 TFs were further studied using TIMER2 across over 30 cancer types to determine the clinical relevance of their expression [39,40]. This tool used a Cox proportional hazard (PH) evaluation to assess the clinical significance of the expression of a gene in a cancer type and provided Kaplan Meier OS analyses.
Furthermore, OS and EFS maps and Kaplan Meier curves were generated using the GEPIA2 tool, which also used the COX PH model. OS and EFS represented the time for which a patient was alive and the duration for which the patient did not display signs/ symptoms of cancer, respectively. The median expression was utilised as a cut-off between low and high expression, with the number of patient tissue samples per cancer type reported in Additional file 1: Materials S3 (although this tool did not specify whether these samples were primary or secondary tumours).

The prognostic value of expression of the 38 TFs
The prognostic significance of the 38 TFs was investigated across multiple cancer types using the human pathology atlas (https:// www. prote inatl as. org/). The classification of favourable and unfavourable prognoses in this database is based on Kaplan Meier survival analyses.

The correlation between immune cell infiltration and gene expression across cancer types
The 38 TFs were further studied using TIMER2 for immune cell infiltration correlations for over 30 cancer types [39,40]. TIMER2 utilised deconvolution statistical methods to determine the distribution of tumour infiltrating cells in the context of TCGA gene expression profiles. Partial Spearman's correlation allowed for the assessment of immune cell infiltration with adjustment for tumour purity. The number of patient tissue samples utilised for each cancer is reported in Additional file 1: Materials S4 (this tool did not specify whether these samples were primary or secondary tumours).

The association of the 38 TFs with EMT and cancer cell genes across cancer types
The TIMER2 gene correlation module was used to establish the association of MES and ADRN TFs with a list of genes involved in EMT and cancer stemness markers including CD44, CDH1, CDH2, FN1, FOXC2, NANOG, SOX2, TWIST1, and VIM [32].

Omicsnet, Cytoscape, and dbCorC database for network analysis for the 38 TFs
Omicsnet was used to investigate different types of network interactions [41]. We used the 38 TF gene IDs and built networks using default settings (transcription factor gene interactome; TGI) and 2D visualisation. TGI utilised the information in TRRUST, JASPAR, and ENCODE databases and the network was subjected to enrichment analysis by selecting the KEGG gene option. Further, we utilised Cytoscape 3.8.0 [42] and NDEx v2.4.5 [43] to investigate and visualise the mRNA versus miRNA networks of GATA3 and SOX9 in hepatocellular cancer (HCC) and diffuse large B cell lymphoma (DLBCL), respectively.
Finally, the dbCoRC database was utilised to integrate the mRNA expression of core TFs with their reconstructed circuitry. This database archived information about CRC components, including CRC TFs, binding sites for TFs, and super enhancer genomic coordinates, allowing for the interrogation of specific TFs in CRCs of defined cell types and the subsequent visualisation of the CRCs constructed [44].
cBioPortal analysis was conducted of 1472 individual NB primary patient tissues previously included in large studies (i.e., Neuroblastoma (AMC Amsterdam), Neuroblastoma (Broad 2013 and 2015), Paediatric Neuroblastoma (TARGET)) using the 485 MES and 369 ADRN associated genes revealing genetic alterations of putative drivers, e.g., EPHA3 A629 splice mutation, FGFR1 N546K missense mutation and LATS2 P479_A480insPP (Additional file 1: Materials S1). EPHA3 A629 splice mutation, per se, was identified in sample NBL44 data deposited to cBioPortal and is likely oncogenic, while FGFR1 N546K missense mutation was identified in TARGET-30-PARCRR sample revealing an allele frequency of 0.47, represented in less than 1% of the TARGET cohort. Finally, LATS2 P479_A480insPP identified in NBL35 sample is an in-frame mutation that is predicted to be oncogenic. Apart from these predicted drivers, we have also linked each genetic alteration with unknown significance reported in Additional file 1: Materials S1, with specific identifiers, which will allow obtaining more information about the alteration and the corresponding patient data. For instance, a missense mutation of unknown significance, DLC1 S978C, is linked to TARGET-30-PAMVLG sample. The inspection of this alteration using cBioportal reveals that the allelic frequency is 0.55 and specific clinical information about the corresponding patient from which this sample was obtained. These data are significant, since, apart from MYCN and ALK genetic alterations, others, especially those with a putative oncogenic driver description, have not previously been described in NB. These discoveries could provide a blueprint for future diagnostic and therapeutic endeavours.
Given these results, we sought to determine correlations between these TFs, risk groups, and patient survival. Multiple criteria influence high-risk stratification in NB, and for clarity, we have referred to the International Neuroblastoma Risk Group (INRG) classification, where EFS cut-offs are 75-85% (low-risk), 50-75% (intermediate-risk), and < 50% (high-risk) [3].
Among the MES genes, MEOX1, CBFB and DCAF6 were downregulated, (p = 0.046, 0.0001 and 0.0001, respectively), while SMAD3, ID1, SOX11, ZNF217 and EGR3 were upregulated in NB high-risk groups (p = 0.009, 0.0115, 0.014, 0.021 and 0.035 respectively) (Fig. 1B). Therefore, these TFs (isolated or in clusters) might represent biomarkers for assigning   Given these significant findings, we sought to follow up these results by assessing the correlation of these TFs with NB prognosis and patient survival. Of the CRC TF genes, 6 were associated with NB patient OS, based on reads per kilobase of transcripts per million reads (RPKM) values of RNA sequencing (Additional file 2: Fig. S1A) or Agilent microarray data (Additional file 2: Fig. S1B) deposited to cBioportal. For instance, KLF7 and TFAP2B were upregulated in patients with increased survival. In contrast, HAND1, EGR3, PBX3, and ASCL1 were upregulated in patients with reduced survival (although both TFAP2B and HAND1 show trends) (Additional file 2: Fig. S1A, B). Notably, KLF7 was associated with better survival outcomes based on both RNA sequencing and microarray analyses (Additional file 2: Fig. S1A, B). These data suggest a significant association of expression of these TFs with NB high-risk groups and their association with patient survival outcomes. However, these data will need to be validated in preclinical models and in a prospective clinical trial scenario, whereby a predictive outcome algorithm can be developed. These findings lay the groundwork for potential new biomarker discovery in both MES and ADRN subtypes of NB. A further factor that affects poor prognosis in NB is MYCN amplification status. Since a component of MYCN regulation and NB stratification depends on interactions with lncRNAs, we investigated potential links between lncRNAs with MES and ADRN TFs and NB features [46] The correlation analyses of Spearman and Pearson found lncRNAs that positively correlate with MES and ADRN CRC genes ( Fig Two other lncRNAs of particular interest were also identified, SIX3-AS1 and GATA3-AS1, since their expression positively correlated with SIX3 and GATA3, respectively (Fig. 2B). Both were upregulated in the NB high-risk group (SIX3-AS1 p = 0.0298 and GATA3-AS1 p = 0.015) (Fig. 2C). GATA3-AS1 was associated with reduced survival (p = 0.024), even though for SIX3-AS1 the log-rank test was not significant and did not reveal an association with patient OS (Fig. 2D).
DBH-AS1 expression was also decreased in the NB high-risk group (p < 0.0001) (Fig. 2C) and was upregulated in patients with increased survival, although this result was not statistically significant (as defined in the Methods section, log-rank Test (p = 0.0642) (Fig. 2D). Other lncRNA, such as LIFR-AS1 and KIF9-AS1, were downregulated in high-risk NB (p < 0.0001 and p < 0.0001, respectively) (Additional file 2: Fig. S1D). All the expression data presented for both TFs and lncRNAs and their NB risk correlation are displayed as RNA sequencing RPKM values. These findings suggested that MES and ADRN TFs are coexpressed with lncRNAs and display similar association patterns with NB risk and patient survival. Notably, lncRNAs can interact with TFs, mostly by stabilising them and promoting their downstream activity [47].
Accordingly, DBH-AS1 has been linked to viral-mediated hepatocellular carcinoma [48], while GATA3-AS1 is associated with poor prognosis in breast cancer via the GATA3-AS1/miR-495-3p/CENPU axis [49]. Furthermore, LIFR-AS1 regulates invasion and metastasis in thyroid cancers [50], while the downregulation of EMX2OS is associated with poor patient prognosis in clear cell carcinoma of the kidney [51]. In addition, the association of these lncRNAs with the 38 TFs in NB and other cancers led us to profile the role of these TFs and their extended gene networks in cancers.

Gene expression analysis across cancer types using GEPIA2 shows that MES and ADRN TFs are widely expressed in cancers
The expression of 20 MES and 18 ADRN CRC associated genes were analysed in over 30 cancer types using GEPIA2.
In addition, we linked the expression of the 38 TFs in over 30 cancer types with their mutational profile in these cancers (Additional file 1: Materials S2). For instance, we have displayed up-or downregulation of the TF genes based on the data presented in Fig. 3 and we have linked this information with various patient sample mutations for that cancer type obtained from cBioportal. Notably, GATA3 is upregulated in BRCA and we linked this to multiple putative driver . C, D LncRNAs are associated with NB risk group; for instance, GATA3-AS1 and SIX3-AS1 are overexpressed in highrisk NB (C) and survival in the same cohorts of patients. GATA3-AS1 and SIX3-AS1 are also upregulated in patients with lower OS, despite the latter not being significant (D) (red and blue lines represent increased and reduced expression, respectively). However, GATA2-AS1 expression is not associated with NB groups, while it is upregulated in patients with reduced survival. GATA2-AS1, GATA3-AS1 and SIX3-AS1 survival data were obtained from RNA sequencing of 143 NB patient tissue, and DBH-AS1 survival data were reported from Agilent microarray of 249 NB samples (TARGET, 2018). Statistics was calculated on mean ± SEM with student`s t test in (C). NS: not significant, SEM: standard error of mean mutations including GATA3 M293K, GATA3 S402Lfs*105, GATA3 F430Lfs*45, GATA3 L396Pfs*111, GATA3 P408Afs*99. This observation also applied to STAD in which we linked the upregulation of GATA3 with some driver mutations including GATA3 S237Afs*28 and GATA3 S237Afs*28, although these links do not necessarily convey causative effects. These results suggest a significant role for these TFs in a wider range of cancers.

TIMER2 analysis shows an association of the 38 TFs with patient survival across cancer types
Kaplan Meier survival analysis was conducted for 20 MES and 18 ADRN genes in over 30 cancer types and their subtypes using the TIMER 2 gene_outcome module. This tool uses a Cox proportional hazard (PH) to assess the clinical significance of the expression of a gene in a cancer type in the form of Kaplan Meier OS analysis (Additional file 1: Materials S3). For example, for 290 Kidney renal papillary cell carcinoma (KIRP) and 545 Uterine corpus endometrial carcinoma (UCEC) patients, the expression of 16/38 and 12/38 genes respectively (e.g., CREG1, SIX1, and MEOX1), was linked with a significantly higher risk and a resulting reduced OS (p < 0.05, Spearman's p > 0) (Additional file 1: Materials S3). For instance, the cumulative survival for KIRP and UCEC patients expressing CREG1 (HR = 1.52, p = 0.0217), SIX1 (HR = 2.04, p = 0.00046), and MEOX1 (HR = 2.25, p = 0.00098) was significantly reduced. Similarly, cumulative survival was reduced for UCEC patients, expressing CREG1 (HR = 1.32, p = 0.048), SIX1 (HR = 1.85, p = 0.0001), and MEOX1 (HR=1.43, p=0.0079) (Fig. 4A, B. These results suggest that the 38 TFs are associated with patient survival in a wide range of cancers, highlighting overlapping mechanisms, processes, and gene networks and indicating that these could be used as biomarkers for outcome for a range of cancers. Indeed, considering GEPIA2, survival maps for OS and DFS were generated for the 20 MES and 18 ADRN TFs over 30 cancer types (Additional file 3: Fig. S2 and Additional file 4: Fig. 3, respectively). These analyses revealed an association of the expression of these genes with OS and DFS (Additional file 3: Fig. S2 and Additional file 4: Fig. 3). For example, significantly higher risk and consequent reduced OS and DFS were detected for KIRP patients expressing MEOX1 and MEOX2 (Additional file 3: Fig. S2) [53]. A similar result was also found for adrenocortical adenocarcinoma (ACC) patients expressing GATA3, GATA2, and SOX11 (Additional file 4: Fig. S3) [54].
Furthermore, OS and DFS for ACC patients expressing CBFB (HR = 7.5, log-rank test p = 4.9E−06 and HR = 3.7, log-rank test p = 0.00023) were reduced (Fig. 4C, top and bottom). Similarly, OS and DFS for ACC patients expressing SOX11 (HR = 5.2, log-rank test p = 0.00032 and HR = 3.7, log-rank test p = 0.00037) (Fig. 4D, top and bottom) and ISL1 (HR = 6.3, logrank test p =4.4E−05 and HR = 3.7, log-rank test p = 0.00031) were also reduced (Fig. 4E, top and bottom). These data reveal a lower OS and DFS for ACC patients expressing CBFB, SOX11, and ISl1 individually, demonstrating the significance of these genes not only in NB but also other solid cancers. Given these results, we sought to investigate the association of these TFs with patient prognosis using other databases.

Human pathology atlas analysis of the 38 TFs shows their association with prognosis in cancers
We found an association of the 38 TFs with prognosis (both favourable and unfavourable) in various cancers including renal, lung, pancreatic, endometrial, ovarian, liver, breast, glial, urothelial, and melanoma (Fig. 5A). For instance, ELK4, CBFB, IFI16, PRRX1, AEBP1, and GATA3 expression was associated with an unfavourable outcome in renal cancer revealing the significance of these TFs (pink and blue colours represent unfavourable and favourable prognosis, respectively) [55] (Fig. 5A).
Further to the association of expression of these TFs with patient survival, previous studies have linked their expression with various immune cells in the TME [27], leading us to address the impact of MES and ADRN TFs on the TME.

Immune cell infiltration correlates with gene expression across cancer types
Using TIMER2 to associate gene expression with immune cell infiltration, we compared Tregs, CAFs, ECs, B cells, and gamma-delta T cells with 20 MES and 18 ADRN genes for KIRP, ACC, DLBCL, PAAD, THYM, STAD, and GBM adjusting for tumour purity. We found a strong positive association (p < 0.05, Spearman's Rho > 0) between the expression of multiple MES and ADRN TFs with the infiltration of CAFs, ECs, and B cells in KIRP and PAAD, while this only applied to CAFs and ECs in STAD and THYM (Additional file 1: Materials S4). Further, the expression of these TFs in ACC and DLBCL showed a negative association with Treg cells (p < 0.05, Spearman's Rho < 0), while the picture for gamma-delta T cells was mixed (Additional file 1: Materials S4).
A positive association of expression of EGR3 with CAFs in KIRP (p = 7.67E−08, Spearmann's Rho = 0.327), PAAD (p = 2.54E−05, Spearmann's Rho=0.316) and STAD (p = 1.24E−10, Spearmann's Rho = 0.323) was detected (Fig. 5B, D). These results suggest that the expression of EGR3 in KIRP, PAAD, and STAD shows a strong positive correlation with TME elements, including stromal and immune cells [56]. Accordingly, the interaction of stromal cells such as CAFs with tumour cells in the TME can shape the immunosuppressive microenvironment and tumour-promoting phenotypes and potentially inform on therapeutic strategies that aim to reverse CAF-mediated immunosuppression [57]. Hence, the positive association of EGR3 gene expression with CAFs in these cancers suggests the contribution of this gene to promoting the immunosuppressive roles of CAFs, while negative correlations obtained for Tregs in DLCBCL would suggest the reverse. Moreover, different components of the TME, including tumour-associated macrophages (TAMs), induce EMT and cancer cell migration, while EMT, chiefly triggered by hypoxia can activate stemness factors [33]. Hence, we sought to understand the influence of these TFs on the occurrence of EMT and the expression of stemness factors.

MES and ADRN TFs are strongly associated with EMT and stemness factors across cancers
Given the link between MES and ADRN TFs and cancer cell migration and the activation of stemness factors [33,57], we used the TIMER2 gene correlation module to establish such associations, detailed further in Additional file 1: Materials S5. For example, a positive correlation with all markers tested including CD44, CDH1, CDH2, FN1, FOXC2, NANOG, SOX2, TWIST1, and VIM was observed with the following MES and ADRN TFs: SMAD3, CBFB, ZFP36L, KLF7, DACH1, and SATB1 in liver hepatocellular carcinoma (LIHC). This positive association implies that SMAD3 may contribute to EMT and stemness programmes in these cancers, potentially contributing to their aggressiveness and metastatic potential. Previous studies reported the role of TGF-β/SMAD signalling in inducing both stemness markers and EMT in prostate cancer cells by post-transcriptional modification of CD44 [58]. These findings collectively suggest that MES and ADRN TFs also show a positive association with genes involved in EMT and cancer stemness characteristics in other solid cancers. Thus, we compared the extended network comprising TF mRNAs and miRNAs that influence the TME, EMT, stemness markers, and aggressive behaviour in cancers.

Analysis of TF-gene networks, integration, and visualisation of network data shows that these extended networks influenced the TME and cancer progression
We sought to determine the extended networks of the 38 TFs and miRNA in impacting the TME, progression, and aggressive behaviour of cancers by using software packages including Omicsnet, Cytoscape, and dbCoRC. GATA3 and GATA2 displayed the highest degree of connectivity with 38 and 19 connections, respectively (Fig. 6A), and KEGG gene enrichment analysis revealed enrichment for immunity functions and processes, and miRNA in cancer terms (Fig. 6B). SATB1 also showed a degree of connectivity of 15 (not shown in Fig. 6B to improve the legibility of the network). In addition, MES TFs, SMAD3, SOX9, WWTR1, and IFI16 had 32, 23, 6, and 5 connections, respectively (Fig. 6C), and KEGG gene enrichment analysis revealed enrichment for 'breast cancer' , 'prostate cancer' , and 'hepatocellular carcinoma' processes in addition to 'miRNA in cancer' terms (Fig. 6D). This analysis suggests that the extended network of TF-gene interactomes for MES and ADRN TFs, also play roles in cancer pathways, gene regulation, and immune-related processes. Consistent with the miRNA links identified, we used Cytoscape to search for miRNA-mRNA networks of GATA3 and SOX9 in DLBCL and LIHC, respectively, as two examples of such networks (Additional file 5: Fig. S4). GATA3 associated with TCGA data for DLBCL (GO enrichment q-value = 6.55). Also, a negative correlation of GATA3 with hsa-mir-431 (p-value = 4.9−E07, correlation = − 0.65) and hsa-mir-433 (p = 3.33E−5, correlation = − 0.566) was observed (Additional file 5: Fig. 3 Gene expression analysis of MES TFs in 31 cancer types. A Expression of TFs in cancer types based on TCGA records in comparison to matched normal tissue: significant overexpression in TCGA over normal tissue is displayed in red, while overexpression in normal tissue compared to TCGA data are displayed in green. B PRRX1 is significantly overexpressed in TCGA samples for diffuse large B cell Lymphoma (DLBCL), pancreatic adenocarcinoma (PAAD), thymoma (THYM) and stomach adenocarcinoma (STAD) tumours compared to normal samples expressed in log2 (TPM + 1), C PRRX1 gene expression fold change in TCGA in comparison to normal samples. The parameters for this analysis were set as Log2FC cut-off of 1 and p-value < 0.01. One-way ANOVA was used to test for differences in expression between normal and cancer tissue. D Expression of TFs in cancer types based on TCGA records in comparison to matched normal tissue, significant overexpression in TCGA over normal is displayed in red, while overexpression in normal over TCGA is displayed in green. E ASCL1 is significantly overexpressed in TCGA samples for glioblastoma multiforme (GBM), low grade glioma (LGG) and thymoma (THYM) tumours compared to normal samples expressed in log2 (TPM + 1), F ASCL1 gene expression fold change in TCGA in comparison to normal samples. The parameters for this analysis were set as Log2FC cut-off of 1 and p-value < 0.01. One-way ANOVA was used to test for differences in expression between normal and cancer tissue Fig. S4B). Closer inspection of the network also revealed the presence of ASCL1, another member of the MES CRC TFs, which also displayed negative regulation with hsa-mir-431 (p = 3.58E−6, correlation = − 0.61), hsa-mir-432 (p = 1.7E-4, correlation = − 0.52) and hsa-mir-433 (p = 1.48E-5, correlation=− 0.58) (Additional file 5: Fig. S4C). hsa-mir-433, a tumour suppressor in breast cancer, inhibits the growth of cancer cells by affecting cell migration [59]. Similarly, SOX9 was associated with TCGA data for LIHC (miRNA vs RNA) (GO enrichment q-value = 6.75) (Additional file 5: Fig. S4D). SOX9 positively correlated with hsa-mir-429 (= 4.03E−32, correlation = 0.56), hsa-mir-200a (p = 1.54E−32, correlation = 0.56) and hsamir-200b (p = 3.8E−30, correlation = 0.54) (Additional file 5: Fig. S4E). The role of hsa-mir-429 as a biomarker of cancer initiation and progression for breast cancer has been previously validated, and, similarly, the mir-200 family members have been reported as potential prognostic biomarkers in multiple cancers [60,61]. Collectively these results suggest the intricate connectivity of genes such as GATA3 and SOX9 that play essential roles in NB differentiation and neural crest development, respectively, with networks of miRNAs. These miRNAs have prognostic value in various cancers and should be investigated further in NB [13]. In Table 1, we have provided examples of the predicted prognostic potential of the 38 TFs with NB risk groups and patient survival in NB and other cancers.
Finally, we used the dbCoRC database to integrate mRNA expression of core TFs with their reconstructed circuitry in cancers. For instance, SMAD3 was indicated in the CRC of representative cell lines of DLBCL, colorectal cancer, pancreatic, hepatocellular carcinoma, breast, and gastric cancers ( Table 2). These results suggested similarities in transcriptional regulatory mechanisms governed by these TFs in various cancer types [44], hence pointing towards the versatility and commonalities of these networks in cancers. These core TFs may display similar or contrasting diagnostic and prognostic values, a result warranting further investigation and validation.

Discussion
Core regulatory circuitries can regulate lineage or cell-specific gene expression and thereby confer a specific identity. These CRCs comprise super enhancers that are marked with high deposition of permissive H3K27 acetyl histone marks that drive a set of highly regulated TFs, which in turn self-regulate and regulate the expression of other TFs within the CRC [8][9][10][11].
Two specific cell identities have been identified in NB tumours; ADRN and MES, the latter bearing higher therapy resistance and being enriched in relapsed samples [5][6][7]. Despite conferring specific yet interconvertible cell identities in a rare paediatric cancer by the 38 MES and ADRN TFs, we were intrigued to study these TFs in both NB and the wider context of cancers to assess their potential as indicators of disease prognosis, progression, and relapse. Accordingly, MES-specific minimal residual disease (MRD) markers in NB were identified as PRRX1, EMO3, and POSTN in previous studies [62]. The detection of the mRNA of these genes in peripheral blood was effectively used for assessing MRD and correlated with low OS and DFS in NB patients, collectively suggesting the potential use of PRRX1 as a biomarker for prognosis, treatment, and remission in NB [62]. On a similar note, in several reports, GATA2 was shown to be associated with low-risk disease and better NB patient prognosis [63,64], while EYA1 was expressed in earlier stages of NB [24]. Given this background, we aimed to dissect such links more comprehensively in NB and other cancers, specifically from the perspective of MES-and ADRN-specific CRC TFs. To that end, the extended list of MES and ADRN genes were initially subjected to cBioportal for NB samples retrieving identified genetic alterations with unknown or predicted oncogenic functions, including amplification, missense, in-frame, and truncating mutations, in addition to fusions and deletions in 853 genes, with the vast majority bearing very low mutation rates per sample (< 0.5%). The identification and study of potential drivers in cancers will not only lead to the greater dissection of genomic alterations but may also inform future diagnostic and therapeutic studies. Gene Ontology enrichment studies revealed the enrichment of JAK/ STAT signalling for the MES gene list perhaps suggesting a requirement for the signalling input from this pathway in cell proliferation, were obtained (red and blue lines representing increased and reduced expression in patients, respectively). In all cases, increased expression of these genes correlates with reduced cumulative survival of patients. C-E OS and DFS Kaplan Meier curves generated by GEPIA2 for ACC patients for CBFB (top: log-rank test p = 4.9E-06, HR = 7.5, p(HR) = 7.2E-5 and bottom: log-rank test p = 0.00023, HR = 3.7, p(HR) = 0.00056), SOX11 (top: log-rank test p = 0.00032, HR = 5.2, p(HR) = 0.0012 and bottom: log-rank test p = 0.00037, HR = 3.7, p(HR) = 0.00088), and ISL1 (top: log-rank test p = 4.4E-05, HR = 6.3, p(HR) = 0.00033 and bottom: log-rank test p = 0.00031, HR = 3.7, p(HR) = 0.00073) revealing significant decreases in OS and DFS for patients. HR = Hazard risk tumourigenesis, and migration [65][66][67][68][69][70]. Furthermore, we elected to focus on the list of 20 MES and 18 ADRN CRC TFs, initially testing the expression of MES and ADRN genes with NB risk levels. This study revealed that ADRN-associated SATB1, GATA2, TFAP2B, KLF13, KLF7, and PBX3 were downregulated in high-risk NB cases, while SIX3 and GATA3 were upregulated in this group. Similarly, MES-associated MEOX1, CBFB, and DCAF6 genes were downregulated in high-risk NB cases, while SMAD3, ID1, SOX11, ZNF217, and EGR3 were upregulated. Evidence supports the potential of GATA2 as an early cancer detection marker and a good prognostic factor in NB, with high GATA2-expressing patient groups displaying greater OS compared to those with low levels of GATA2 [22,63,64]. Consistently, we showed that GATA2 was downregulated in high-risk NB. Consistently, ID1, SOX11, TFAP2B, KLF7, and GATA3 were linked to neurogenesis or NB differentiation [13,18,19,23,71,72], but to our knowledge, these studies did not discuss their association with survival and risk groups in NB, hence our findings are novel.
Evidence suggested the importance of lncRNAs in NB risk group stratification, leading us to compare the network formed by the 38 TFs and lncRNAs expression [46]. We identified that DBH-AS1 expression was strongly associated with TFAP2B and GATA2. Furthermore, its expression was lower in high-risk NB and was associated with better prognosis. In keeping with these data, the downregulation of DBH-AS1 in osteosarcoma was shown to be an indicator of good prognosis in these patients [73], amongst other reports in the literature highlighting the significance of this lncRNA in various cancers [48,74].
Given the connection between the 38 TFs and the reported lncRNAs in other cancers [48][49][50][51], we elected to expand our investigation to other cancers. We used several bioinformatic tools, including GEPIA2, TIMER2, Omicsnet, Cytoscape, and dbCoRC to gain a deeper understanding of the role of these TFs in other cancers including their overexpression, prognostic value, association with cells of the TME, miRNA-TF network connectivity, and utility in other CRCs. Both MES and ADRN TFs displayed patterns of overexpression in TCGA tumour samples compared to matched normal samples. For   [5,6] was overexpressed in tumour samples compared to matched normal tissue of multiple cancers including low-grade glioma, GBM, DLBCL, STAD, PAAD, and THYM. Consistent with this finding, the role of PRRX1 in promoting stemness and angiogenesis in glioma has been reported [75]. In our study, ADRN-specific GATA3 was overexpressed in breast invasive carcinoma (BRCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), testicular germ cell tumours (TGCT), STAD, PAAD, and THYM, data that we matched with the mutational profiles of these genes. Consistent with these findings, the predictive value of SMAD4/GATA3 for OS and relapse-free survival (RFS) for breast invasive ductal carcinoma has been reported in previous studies with SMAD4-/GATA3+ patients displaying better OS and RFS [76]. The association of these TFs with lower survival was determined using the TIMER2 tool across various cancer types. For more than half of the MES and ADRN TFs (e.g., SIX1, MEOX1, and ZNF536), decreased cumulative survival was observed for two cancer types in particular; KIRP and UCEC, but to a lower extent in other cancers, indicating the influence of these TFs on patient survival in these cancers. Consistently, the high expression of SIX1 in endometrial cancer could be a predictor of unfavourable prognosis in these patients [77]. Similarly, higher Spearman's risk and reduced survival were observed for LGG, STAD, mesothelioma (MESO), pheochromocytoma and paraganglioma (PCPG), and thyroid carcinoma (THCA) patients expressing SIX1, collectively suggesting the prognostic value of SIX1 in these cancers. On a similar note, a significant association between the expression of ELK4, CBFB, IFI16, PRRX1, AEBP1, and GATA3 with an unfavourable outcome in renal cancer has been identified in previous reports [78,79]. In light of these findings, the most significant result obtained in this study is the potential prognostic values of the 38 TFs across the cancer types reported in Table 1.
We next investigated the association between MES and ADRN TFs with the infiltration of cells associated with cells in the TME including Treg, CAFs, ECs, B cells, and gamma-delta T cells. We found a positive correlation of MES and ADRN TFs with the infiltration of CAFs and ECs in KIRP, PAAD, STAD, and THYM. Recent studies have suggested that CAFs produce growth factors and cytokines that may facilitate angiogenesis, promote tumour growth and modulate cancer stem cell characteristics [80]. In keeping, the role of CAFs in promoting angiogenesis in gastric cancer has previously been reported [81]. We also found strong correlations between MES and ADRN TFs with markers of cancer cell stemness and cancer cell motility including CDH1, CDH2, NANOG, SOX2, TWIST1, VIMENTIN, and Fibronectin across cancers. For instance, we found positive correlations between SOX9 and SATB1 with these markers in colon adenocarcinoma (COAD) and lung adenocarcinoma (LUAD), respectively. Accordingly, SATB1 promotes metastasis and cell growth in colorectal cancer [82]. In another study, TGF-β secreted by TAMs was shown to promote metastasis in non-small cell lung cancer (NSCLC) through promoting TGF-β/SOX9 axis expression [83]. These observations suggest that MES TFs may be associated with genes that mediate tumour cell motility and metastasis.
Network connectivity followed by KEGG term enrichment of MES TFs revealed that the extended network of genes linked to TFs such as SMAD3, IFI16, and SOX9 was also involved in various cancer regulatory pathways, cancer types, and transcriptional misregulation, while the extended network of GATA2 and GATA3 connected genes has roles in various immune pathways and miRNAs in cancers. Recent studies have shown that SMAD3 can have a dual role in repressing and promoting cancer by inhibiting cell proliferation and regulating transcriptional output favouring metastasis, respectively [84]. On the other hand, a study identified GATA3 upregulation and its association with favourable outcomes in breast cancer [85]. Moreover, we reveal that both GATA3 and SOX9 are negatively and positively associated with hsa-mir-433 in DLBCL and hsa-mir-429 in LIHC, respectively, suggesting crucial CRC TFs may overlap in function in various cancer types, govern similar regulatory networks, and display similar roles as disease biomarkers and prognostic predictors. Finally, we show that SMAD3, ZNF217, KLF13, GATA2, GATA3, KLF7, and PHOX2B are components of CRCs in other cancers, Gastric cancer (T2001206), small cell lung cancer (NCI-H69) GATA2 Prostate cancer (LNCaP) GATA3 Breast cancer (MCF7) KLF7 Gastric cancer (T2000085), small cell lung cancer (NCI-H69) PHOX2B Small cell lung cancer (NCI-H82)