1 Introduction

Acute myeloid leukemia (AML), a very aggressive and heterogeneous neoplasm, is characterized by varied patient prognoses and a high mortality rate. Currently, the risk stratification and therapeutic approaches for patients suffering from AML are based on their cytogenetic and molecular abnormalities; however, the precise underlying mechanisms of this disease remain unclear [1, 2]. The advent of targeted agents has improved the outcomes of personalized therapy and survival rates; however, the outcomes of monotherapy or monotherapy combined with traditional chemotherapy are unsatisfactory [3]. Therefore, screening novel biomarkers could enhance our comprehension of the molecular basis of AML. This would facilitate diagnosis, prediction of prognosis and therapeutic response, monitoring of residual disease, and the development of targeted drugs.

Rho guanine nucleotide exchange factor 5 (ARHGEF5), a member of the Dbl family of guanine nucleotide exchange factors (GEFs), regulates Rho GTPases [4]. ARHGEF5 comprises two isoforms encoded by a single mRNA, and transforming immortalized mammary (TIM) is the short isoform of ARHGEF5. A high TIM expression level was observed in lung carcinoma, and TIM could activate RhoA in vivo, thereby regulating RhoA-mediated stress fiber reorganization [5]. A study has shown that TIM could be involved in breast cancer progression [6]. Studies have shown a significant increase in ARHGEF5 expression in cell lines and tissues of patients with lung adenocarcinoma, and ARHGEF5 overexpression correlates with a shorter survival time [7, 8]. ARHGEF5 activates RhoA to promote thick stress fiber formation and links the Src and PI3K pathways to promote Src-induced podosome formation [9]. A study has shown that the Src-ARHGEF5-PI3K complex is expressed in LuM1 cells, a highly metastatic colon adenocarcinoma cell line, while it is not expressed in NM11 cells, which exhibit moderate metastatic potential [10]. ARHGEF5 interacts with the cAMP and NOTCH1 signaling pathways, influencing cell migration, tumor invasion, and immune cell function. Aberrant activation of cAMP signaling can lead to increased proliferation and survival of leukemic cells [11]. Salah demonstrated that NOTCH-1 gene mutations were associated with a poor clinical outcome, shorter overall survival, and failure to achieve complete remission [12]. However, the ARHGEF5 expression profile in patients with AML and its significance in predicting their prognosis are still unclear.

Herein, ARHGEF5 expression was first analyzed in patients diagnosed with AML. Next, the functions associated with ARHGEF5 were explored through Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses, gene set enrichment analysis (GSEA), immune cell infiltration (ICI) analysis, and protein–protein interaction (PPI) network analysis. Finally, Kaplan–Meier (KM) analysis and Cox regression analysis (CRA) were conducted, followed by the construction of a prognostic nomogram model to ascertain the clinical relevance of ARHGEF5 in patients with AML. Our findings suggest that ARHGEF5 may be involved in AML and that it could serve as a prognostic indicator and aid in identifying treatment targets. Together, these analyses clarify the significance of ARHGEF5 in AML and its potential implications for research and treatment.

2 Materials and methods

2.1 The collection and processing of data

The present study obtained gene expression data and clinical information of patients from two databases, namely "the Cancer Genome Atlas" (TCGA; https://portal.gdc.cancer.gov/repository) and "the Genotype-Tissue Expression Project" (GTEx; https://commonfund.nih.gov/gtex). Subsequently, the level 3 HTSeq-FPKM data were normalized to transcripts per million (TPM) reads. RNA-sequencing data in TPM format were obtained from UCSC-Xena (https://xenabrowser.net/datapages/) and the GTEx databases to facilitate pan-cancer analysis.
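The conversion from FPKM to TPM within one sample can be illustrated with a minimal Python sketch. It is not the code used in this study; the function name and the toy gene values are hypothetical. It assumes the standard relation TPM_i = FPKM_i / sum(FPKM) × 10^6, applied per sample.

```python
def fpkm_to_tpm(fpkm):
    """Convert one sample's FPKM values (gene -> FPKM) to TPM."""
    total = sum(fpkm.values())
    if total == 0:
        raise ValueError("sample has no expressed genes")
    # Rescale so that the values of the sample sum to one million.
    return {gene: value / total * 1e6 for gene, value in fpkm.items()}


# Toy sample with hypothetical FPKM values.
sample = {"ARHGEF5": 5.0, "GENE_B": 15.0, "GENE_C": 30.0}
tpm = fpkm_to_tpm(sample)
print(tpm["ARHGEF5"])
```

Because every sample is rescaled to the same total, TPM values are comparable across samples in a way that raw FPKM values are not.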

2.2 Differentially expressed genes (DEGs) analysis

First, we categorized patients with AML from TCGA into low- (LAEG) and high-ARHGEF5 expression groups (HAEG) using the median ARHGEF5 expression score as the threshold value. Next, we employed the “DESeq2” R package for DEGs screening between both groups [13]. We identified DEGs based on the following thresholds: “adjusted P-value < 0.05” and “|log2-fold-change (FC)|> 1.” Finally, heat maps were constructed to visualize the top ten DEGs.
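The grouping and filtering steps can be sketched in Python as follows. This is an illustration only: the differential expression statistics themselves come from DESeq2 and are not recomputed here, the function names and toy records are hypothetical, and placing values equal to the median in the high group is an assumption.

```python
from statistics import median


def split_by_median(expression):
    """Split patients (id -> ARHGEF5 expression) into low and high groups."""
    cutoff = median(expression.values())
    # Assumption: values equal to the median go to the high group.
    low = sorted(p for p, v in expression.items() if v < cutoff)
    high = sorted(p for p, v in expression.items() if v >= cutoff)
    return low, high


def filter_degs(results, padj_max=0.05, lfc_min=1.0):
    """Keep genes with adjusted P < padj_max and |log2FC| > lfc_min."""
    return [r["gene"] for r in results
            if r["padj"] < padj_max and abs(r["log2fc"]) > lfc_min]


# Hypothetical expression values and test results.
expression = {"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0, "P5": 5.0}
low, high = split_by_median(expression)
results = [
    {"gene": "G1", "log2fc": 2.3, "padj": 0.001},
    {"gene": "G2", "log2fc": -1.4, "padj": 0.03},
    {"gene": "G3", "log2fc": 0.6, "padj": 0.0001},
    {"gene": "G4", "log2fc": 3.0, "padj": 0.2},
]
degs = filter_degs(results)
print(low, high, degs)
```

With an odd number of patients, this tie rule yields a high group that is one patient larger than the low group.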

2.3 Functional enrichment analysis

The study performed functional enrichment analysis on the DEGs meeting the criteria of “|logFC|> 1.5” and “p-adj < 0.05”. Subsequently, a GO functional analysis was conducted, wherein enriched GO terms were identified across the cellular component (CC), molecular function (MF), and biological process (BP) categories. Additionally, KEGG pathway enrichment analysis was conducted utilizing the “clusterProfiler” R package [14].
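Over-representation analyses of GO terms and KEGG pathways are typically based on a hypergeometric test. The Python sketch below illustrates that test under this assumption; it is not the package's implementation, the function name is hypothetical, and the counts in the usage example are invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """Upper-tail hypergeometric P-value for term over-representation.

    n_background: genes in the background universe
    n_term:       background genes annotated to the term
    n_degs:       DEGs submitted for enrichment
    n_overlap:    DEGs annotated to the term
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    # Probability of observing n_overlap or more annotated genes by chance.
    tail = sum(comb(n_term, k) * comb(n_background - n_term, n_degs - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Hypothetical counts: 20,000 background genes, 100 in the term,
# 412 DEGs, 10 of which are annotated to the term.
p_value = hypergeom_enrichment_p(20000, 100, 412, 10)
print(p_value)
```

In practice the P-values of all tested terms are then adjusted for multiple testing before the “p-adj < 0.05” threshold is applied.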

2.3.1 GSEA

We performed GSEA through the “clusterProfiler” R package (3.6.3) to identify the differences in functions and pathways between both groups [14]. “p-adj < 0.05” and “FDR q < 0.25” indicated a significant difference.
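The enrichment score at the core of GSEA is a running-sum statistic over a ranked gene list. The Python sketch below shows only that statistic, assuming the commonly used weighted form; the function name, the toy genes, and the ranking scores are hypothetical, and the permutation step that yields P-values and FDR is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Signed maximum deviation of the GSEA running sum.

    ranked_genes must already be sorted by decreasing ranking score.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(scores[g]) ** weight
                   for g, h in zip(ranked_genes, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("gene set must partly overlap the ranked list")
    running, best = 0.0, 0.0
    for g, h in zip(ranked_genes, hits):
        if h:
            running += abs(scores[g]) ** weight / hit_norm  # step up on a hit
        else:
            running -= 1.0 / n_miss                         # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list with hypothetical log2 fold changes as ranking scores.
scores = {"a": 2.0, "b": 1.0, "c": -1.0, "d": -2.0}
ranked = sorted(scores, key=scores.get, reverse=True)
es_top = enrichment_score(ranked, scores, {"a", "b"})
es_bottom = enrichment_score(ranked, scores, {"c", "d"})
print(es_top, es_bottom)
```

A positive score indicates that the gene set concentrates at the top of the ranking, and a negative score that it concentrates at the bottom.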

2.4 Analyzing PPI network

The study established a PPI network utilizing the DEGs through the web-based STRING (http://string-db.org/) database, applying a confidence score > 0.4 and default parameters. Subsequently, the PPI network visualization was performed utilizing the “Cytoscape” software (version 3.5.0) [15]. Finally, the significant modules in the PPI network were identified utilizing MCODE (version 1.8.0) [16]. The criteria employed for this analysis were “MCODE scores > 3” and default parameters.

2.5 Analysis of ICI

The single-sample GSEA (ssGSEA) was conducted via the “GSVA” R package to analyze 24 immune cell types and their relative enrichment score for determining ICI degree in patients with AML [17]. Next, the association between ARHGEF5 expression and these immune cells was determined through Spearman's correlation analysis. Finally, we compared ICI levels in patients between both groups utilizing the Wilcoxon rank-sum test (WRST).
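Spearman's correlation is the Pearson correlation of the ranks of the two variables. The Python sketch below illustrates the calculation with average ranks for ties; it is not the code used in this study, and the function names and toy values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ARHGEF5 expression and NK-cell enrichment scores.
expression = [1.2, 3.4, 2.2, 5.1, 4.0]
nk_score = [0.10, 0.35, 0.20, 0.50, 0.30]
rho = spearman(expression, nk_score)
print(rho)
```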

2.6 Survival analysis

We performed survival analysis based on the KM method and log-rank test, setting the cut-off as the median ARHGEF5 expression value. Next, we performed univariate CRA (UCRA) and multivariate CRA (MCRA) to determine the influence of clinical features on patient outcomes. Finally, we performed MCRA on prognostic factors with P < 0.05 identified using the UCRA. We visualized the forest plot using the “ggplot2” R package.
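The KM estimator underlying the survival curves can be sketched in Python as follows. This is an illustration rather than the code used in this study; the function name and the toy follow-up data are hypothetical, and the log-rank test and Cox models are not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Hypothetical follow-up in months; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Computing one such curve per expression group and comparing the two curves is what the median cut-off described above makes possible.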

2.7 Constructing and validating the nomogram

We constructed a nomogram using the prognostic factors that MCRA identified as independent predictors of the overall survival (OS) of patients. Next, we used calibration plots to assess the agreement between predicted and observed survival, and the concordance index (C-index) was utilized to measure the discriminatory ability of the nomogram. The “rms” R package (version 5.1–3) was utilized for generating the nomogram and calibration plots. The study also evaluated the nomogram accuracy in predicting patient prognosis through a time-dependent receiver-operating characteristic (ROC) curve, employing the “timeROC” package.
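The C-index can be illustrated with a short Python sketch of Harrell's concordance calculation. It is a simplified version written for illustration, not the routine used in this study: the function name and toy data are hypothetical, and the bootstrap correction is omitted.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair is comparable when the patient with the shorter follow-up
    had an observed event. Higher risk scores should mean shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, event indicators, and nomogram points.
times = [5, 12, 20, 30]
events = [1, 1, 0, 1]
points = [180, 90, 120, 60]
c_index = concordance_index(times, events, points)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.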

2.8 Statistical analysis

We statistically analyzed the data using R (version 3.6.2) [18]. First, a paired t-test and WRST were conducted to establish the statistical significance of ARHGEF5 expression in paired and non-paired samples, respectively. Subsequently, the study conducted Kruskal–Wallis, Wilcoxon signed-rank, and logistic regression analyses to examine the association between clinical/cytogenetic features and ARHGEF5 expression. Next, we employed CRA and the KM method to evaluate the prognostic factors. Finally, we used MCRA to determine the effect of ARHGEF5 expression and other clinical features on the patient's survival. P < 0.05 was deemed significant for all analyses.

The patient’s age, cytogenetic risk, and ARHGEF5 expression were selected as independent prognostic factors based on their statistical significance in univariate and multivariate Cox regression analyses, as well as their established importance in AML prognosis. These factors demonstrated the strongest and most consistent associations with patient outcomes in our analyses.

3 Results

3.1 ARHGEF5 expression in Pan-Cancer and patients with AML

We retrieved RNA-seq data of patients from TCGA and GTEx using UCSC-XENA, which had been uniformly processed via the Toil pipeline. ARHGEF5 was significantly overexpressed in 20 types of cancer (Fig. 1A), including in patients with AML from TCGA, compared to normal samples from TCGA and GTEx (Fig. 1B). In addition, ROC analysis demonstrated that the sensitivity and specificity of ARHGEF5 in predicting the patient's outcomes were high (AUC 0.872; Fig. 1C).

Fig. 1

ARHGEF5 expression in patients suffering from AML. A Comparison of ARHGEF5 expression between cancer tissues of different types and normal tissues from TCGA. B ARHGEF5 overexpression in patients with AML compared to normal tissues. C The ROC curve indicates that ARHGEF5 could serve as a potential diagnostic marker

3.2 Identifying DEGs in patients with AML in both groups

LAEG and HAEG, defined by the median ARHGEF5 expression level, were compared for differences in mRNA expression. Based on the RNA-seq HTSeq-Counts data, we identified 412 significant DEGs between both groups, of which 216 were upregulated and 196 were downregulated (Fig. 2A). The heatmap shows the top five upregulated and top five downregulated DEGs in patients in both groups (Fig. 2B).

Fig. 2

412 DEGs were identified in patients in both groups. A The volcano plot shows DEGs, including 216 upregulated and 196 downregulated DEGs. Descending order of normalized expression from red to blue. B Ten DEGs were revealed by heat map: five overexpressed and five suppressed genes

3.3 Functional enrichment analysis of DEGs

To identify the functions enriched among these 412 DEGs in patients with AML, GO and KEGG enrichment analyses were conducted (Fig. 3). The DEGs showed significant enrichment in GO-BP terms, including pattern specification process, synapse organization, and regulation of ion transmembrane transport. In addition, GO-CC terms, such as collagen-containing extracellular matrix, transporter complex, and ion channel complex, and GO-MF terms, such as passive transmembrane transporter activity, substrate-specific channel activity, and ion channel activity, were enriched. The enriched KEGG pathways included cytokine–cytokine receptor interaction, the cAMP signaling pathway, and chemical carcinogenesis.

Fig. 3

GO and KEGG pathway enrichment analyses of DEGs in patients having AML in both groups. A–C GO analysis reveals the BP, CC, and MF enriched by DEGs with ARHGEF5. D KEGG analysis of DEGs with ARHGEF5 showed significantly enriched KEGG pathways

Next, GSEA was conducted to identify the biological pathways involved in AML in patients expressing varying ARHGEF5 levels. We compared the datasets of patients in both groups to identify signaling pathways involved in AML. Enrichment of these pathways in the MSigDB collection (C2.all.v7.0.symbols.gmt) differed significantly (FDR < 0.05, p-adj < 0.05, Fig. 4). The results revealed an association between ARHGEF5 and natural killer (NK) cell-mediated cytotoxicity, the Notch, T-cell receptor, and NOD-like receptor signaling pathways, and cytokine–cytokine receptor interaction.

Fig. 4

GSEA enrichment plots. A NK cell-mediated cytotoxicity. B The Notch and C NOD-like receptor signaling pathways. D Cytokine–cytokine receptor interaction. E The T-cell receptor signaling pathway. NES normalized enrichment scores, FDR false discovery rate

3.4 PPI enrichment analysis in patients with AML

A PPI network of ARHGEF5 and its potentially co-expressed genes was built from the 412 ARHGEF5-associated DEGs utilizing the STRING database and a confidence score threshold of 0.4. The constructed PPI network comprised 303 nodes and 389 edges and was visualized through Cytoscape-MCODE (Figure S1A). The most significant module had an MCODE score of 4.667 and comprised ten nodes and 21 edges (Figure S1B).

3.5 Analyzing ICI in patients with AML

Spearman's correlation analysis was used to assess the relation between ARHGEF5 expression and ICI, quantified via ssGSEA, in patients having AML. ARHGEF5 exhibited a positive correlation with activated dendritic cells (aDC), cytotoxic cells, NK cells, plasmacytoid dendritic cells (pDC), T helper (Th) cells, and follicular helper T (Tfh) cells (Fig. 5A–G).

Fig. 5

Correlation between ARHGEF5 expression and ICI. A–F Relation between ARHGEF5 expression and ICI, including A aDC, B cytotoxic cells, C NK cells, D pDC, E Th cells, F Tfh cells. G Correlation between ARHGEF5 expression and 24 tumor-infiltrating lymphocytes

3.6 The relation between ARHGEF5 expression, clinical characteristics, and cytogenetic risk

Table 1 lists the primary clinical features of patients with AML from TCGA. We analyzed 151 patients with AML, comprising 68 females and 83 males. The average age of patients was 56.7 years. ARHGEF5 expression was low in 75 (49.7%) patients and high in 76 (50.3%) patients with AML. The correlation analysis results indicated a significant relation between ARHGEF5 expression and cytogenetic risk (P = 0.011) and harboring mutations in FLT3 (P < 0.001) and NPM1 (P = 0.018).

Table 1 Correlation between ARHGEF5 expression and clinical features in patients with AML based on TCGA

The WRST was conducted to compare ARHGEF5 expression across patients with varying clinical and pathological features. ARHGEF5 was significantly overexpressed in Black or African American patients, in patients with del7 or a complex karyotype, in the high-risk cytogenetics group, in patients harboring mutations in FLT3 and NPM1, and in patients harboring wild-type IDH1 R140 (Fig. 6A–F).

Fig. 6

Relation between ARHGEF5 expression, clinical features, and cytogenetic risks. A Correlation between ARHGEF5 expression and race, B cytogenetics, C cytogenetic risk, D FLT3, E NPM1, and F IDH1 R140 mutations

3.7 The relation between ARHGEF5 expression and poor prognosis of patients with AML

The Kaplan–Meier analysis indicated that patients in the HAEG had a significantly worse prognosis than those in the LAEG (hazard ratio: 1.79 (1.17–2.73); P = 0.007; Fig. 7A). Subgroup analysis indicated that, within the HAEG, prognosis was poor for male patients (P = 0.025), patients aged ≤ 60 (P = 0.03), patients with intermediate cytogenetic risk (P = 0.003), the M2 subtype (P = 0.025), or a normal karyotype (P = 0.003), patients harboring mutations in FLT3 (P = 0.003) and NPM1 (P = 0.027), and patients harboring wild-type RAS (P = 0.011) (Fig. 7B–I).

Fig. 7

Relation between high ARHGEF5 expression and poor OS of patients having AML. A KM survival curves of patients with AML. B KM survival curves of male patients with AML and C–I patients with age ≤ 60, intermediate risk, FAB classification-M2, normal chromosome karyotype, FLT3, RAS, and NPM1-positive mutations. FAB French–American–British

The UCRA results revealed that age > 60 years, intermediate/poor cytogenetic risk, and high ARHGEF5 expression were significant risk factors for poor OS in patients with AML (Fig. 8A). Furthermore, MCRA showed that ARHGEF5 overexpression could be an independent risk factor for predicting patient prognosis (Fig. 8B).

Fig. 8

UCRA and MCRA show a correlation between clinical features and the OS of patients with AML. A The forest plot shows a UCRA and B MCRA of OS

3.8 ARHGEF5 prognostic model for patients with AML

We used the MCRA results to construct a nomogram to improve the accuracy of predicting the patient's prognosis. Three independent prognostic factors, namely the patient's age, cytogenetic risk, and ARHGEF5 expression, were incorporated into the prognostic model. The nomogram revealed that higher total scores were correlated with poorer patient prognosis (Fig. 9A). Additionally, we used the calibration plot to determine the nomogram's predictive efficacy. The bootstrap-corrected C-index of the nomogram was 0.715 (95% CI 0.690–0.754), thus revealing that the nomogram accuracy in predicting the patient's OS was moderate (Fig. 9B).

Fig. 9

A prognosis-predictive model of ARHGEF5 in patients with AML. A Nomogram and B calibration plot predicting the 1- and 3-year OS probability among patients having AML

4 Discussion

AML can be defined as a heterogeneous clonal disease characterized by the clonal proliferation of primitive hematopoietic stem cells or progenitor cells [19]. Despite the availability of various therapeutic strategies, the success rate of these treatments in patients with AML is low, and the mortality rate is high due to cancer relapse [20]. Establishing a standardized therapeutic strategy for relapsed AML is challenging due to genetic and clinical heterogeneity [21]. Molecular targets are likely to become an established strategy for both induction and consolidation therapy, in addition to maintenance therapy followed by consolidation therapy [22]. Several studies have focused on assessing genetic alterations at the molecular level for predicting outcomes and identifying prognostic markers [23]. Recently, there has been a significant emphasis on investigating epigenetic mutations in DNMT3A, TET2, and ASXL1 [24]; however, the underlying immunological mechanisms of AML pathogenesis are poorly understood. Our results revealed an increase in ARHGEF5 expression in patients with AML. Furthermore, the study revealed a significant association between ARHGEF5 overexpression and complex chromosomal karyotypes, poor risk classification, and an unfavorable prognosis. These results indicate that ARHGEF5 overexpression might serve as a prognostic biomarker for patients with AML.

ARHGEF5 belongs to the GEF family that regulates Rho GTPases [4, 25]. Mounting evidence has demonstrated that ARHGEF5 promotes the metastasis and infiltration of cancer cells by activating Rho GTPase, which alters cell adhesion and cytoskeletal functions [26, 27]. Debily et al. demonstrated an increase in ARHGEF5 expression level in breast cancer, and a high ARHGEF5 expression level could significantly impact breast cancer progression [6]. In addition, ARHGEF5 could alter the growth characteristics and the development of tumors in mice [25]. A significant elevation in ARHGEF5 expression was detected in non-small cell lung cancer cell lines compared to normal lung tissue [5]. ARHGEF5 played a critical role in malignant progression, particularly in colorectal cancer cells that have acquired a mesenchymal phenotype through EMT [28]. However, the understanding of the expression and prognostic implication of ARHGEF5 among individuals diagnosed with AML remains limited. The findings of our study indicate a significant elevation in the expression levels of ARHGEF5 in individuals diagnosed with AML. Moreover, high ARHGEF5 expression is strongly correlated with intermediate-to-high cytogenetic risk and poor patient prognosis.

Our findings revealed unfavorable survival outcomes for patients with overexpressed ARHGEF5. MCRA revealed that high ARHGEF5 expression could independently predict patient prognosis. Furthermore, we constructed a nomogram prediction model, revealing the significance of ARHGEF5 expression in predicting the patient's prognosis. Taken together, these results indicate that ARHGEF5 could be used to predict an adverse prognosis in patients suffering from AML.

The prognosis of patients harboring FLT3 mutations and expressing high ARHGEF5 levels was found to be poor. FLT3 mutations occur in 10–30% of patients having AML, while being relatively infrequent in the elderly population [29, 30]. FLT3 mutations, such as internal tandem duplications (ITD) and point mutations in the tyrosine kinase domain, result in the constant activation of receptors independent of ligands [31]. Moreover, a study has highlighted an association between ITD mutations, increased incidence of cancer relapse, and poor OS of patients [32]. Our results revealed that patients diagnosed with AML who harbored FLT3/ITD mutations and expressed low ARHGEF5 levels exhibited a more favorable prognosis than those with both FLT3/ITD mutations and high ARHGEF5 expression. However, additional investigations are necessary to validate the impact of upregulated ARHGEF5 expression in patients diagnosed with AML harboring FLT3/ITD mutations and to clarify the underlying mechanisms.

ARHGEF5 overexpression in AML is closely associated with the cAMP and Notch pathways. Activation of the cAMP signaling pathway could inhibit DNA damage-induced p53 accumulation and apoptosis in acute lymphoblastic leukemia cells [33]. Maintaining c-Myc and Bcl2 expression in the HL60 human promyelocytic leukemia cell line may require reciprocal Notch signaling, which could contribute to both cell proliferation and survival [34]. Yan et al. showed that the Notch signaling pathway-related genes could mediate drug resistance in patients with AML [35]. We demonstrated a potential correlation between ARHGEF5 and the Notch signaling pathway, indicating that ARHGEF5 could be involved in developing and maintaining leukemia cells. Hence, further research is necessary to corroborate these findings and explore the regulatory mechanisms linking ARHGEF5 and the Notch signaling pathway.

The tumor immune microenvironment is of critical importance in tumorigenesis and tumor progression. The ssGSEA algorithm was utilized to generate a comprehensive map of 24 distinct immune cell types. Subsequently, the correlation between ARHGEF5 expression and the identified immune cell types was analyzed, revealing a significant association between ARHGEF5 and various immune cell subtypes, comprising aDC, cytotoxic cells, NK cells, pDC, Th cells, and Tfh cells. These immune cells are involved in tumorigenesis and cancer development. Therefore, our results suggest that ARHGEF5 could impact AML onset and progression by regulating ICI.

Our study findings indicate that ARHGEF5 may be a prognostic factor for unfavorable outcomes in patients suffering from AML, even following the adjustment for routine clinical features. CRA indicated that high ARHGEF5 expression, age above 60, and intermediate/poor cytogenetic risk could independently predict the poor OS of patients. To improve the accuracy of ARHGEF5 in predicting prognosis, a nomogram model was constructed by combining ARHGEF5 expression, cytogenetic risk, and age. The C-index of the nomogram model for the OS prediction was 0.715 (0.690–0.754). The calibration plot showed an agreement between 1-year and 3-year OS predictions by the nomogram and actual observations. These findings suggest that ARHGEF5 could act as a biomarker for predicting the prognosis of patients having AML and for stratifying them in combination with the NCCN cytogenetic risk group.

However, our study has several limitations that should be addressed. First, due to time and resource constraints, we were unable to increase our sample size to validate our findings more robustly. Future studies with larger cohorts are needed to confirm the prognostic value of ARHGEF5 in AML. Second, our current dataset lacks comprehensive information on specific epigenetic mutations such as DNMT3A, TET2, and ASXL1 for all patients. This prevents us from performing a robust analysis of their relationship with ARHGEF5 expression. Investigating these relationships could provide valuable insights for treatment strategies and prognosis in AML and should be a focus of future research. Additionally, our dataset does not contain detailed information on the specific treatment regimens for each patient. This limitation prevents us from providing a comprehensive analysis of treatment options and their potential interactions with ARHGEF5 expression. Future studies incorporating treatment data will be crucial for understanding the relationship between ARHGEF5 expression and treatment outcomes in AML patients.

In conclusion, our study showed significant ARHGEF5 overexpression in patients with AML. Further, a correlation was observed between high ARHGEF5 expression, disease progression, and poor patient survival. Dysregulated Notch and cAMP pathways could be involved in high ARHGEF5-mediated leukemogenesis. These results shed light on the pathogenesis and molecular targets of AML. However, additional studies are required to identify the specific mechanism underlying AML progression [28].