Background

Acute myeloid leukemia (AML) is a heterogeneous disease in terms of genetic basis, clinical, biological and prognostic, and is a malignant clonal disease of leukemia stem cells (LSCs). According to the prognosis, AML patients can be classified based on favorable-risk, intermediate-risk and adverse-risk [1, 2]. AML can achieve 60% to 70% complete response (CR) post-chemotherapy regimens with supportive care, and with long-term survival reaches 25% to 35% [3]. After active treatment in young AML patients, the 5-year survival rate is between 40 and 50%, and there are still treatment challenges in elderly AML patients [3]. However, recurrence is a major obstacle to cure after AML patients achieve CR [4]. The high recurrence rate of AML is considered to be the persistence of LSC [5]. Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is the best treatment for many AML patients [6, 7]. Especially for high-risk AML patients, allo-HSCT can achieve the first complete remission (CR1) and become an effective treatment [8]. As for intermediate-risk acute myeloid leukemia (IR-AML), allo-HSCT is an independent favorable factor for EFS and OS [9].

Nearly half of adult AML patients exhibit a cytogenetic normal acute myeloid leukemia (CN-AML) [10]. CN-AML was classified as IR-AML. Next-generation sequencing detected mutations in CN-AML for 19 AML-related genes, and each patient detected at least one mutation: DNMT3A, IDH2, IDH1, NRAS, NPM1, TET2, ASXL1, PTPN11, and RUNX1 [11]. The prognosis can be further stratified according to different gene mutation combinations in CN-AML patients. For example, mutations in NPM1 and CEBPA are associated with a good prognosis in CN-AML [12, 13], whereas DNMT3, TET2, and RUNX1 mutations are associated with poor prognosis in CN-AML [14,15,16]. Besides, high expression of genes including ITPR2, MAPKBP1, CPNE3, RUNX1 and ATP1B1 are associated with poor prognosis of CN-AML, while high expression of LEF1 is considered as a favorable prognostic factor for CN-AML [17,18,19,20,21,22]. Therefore, the identification of new biomarkers can help to predict the prognosis risk of CN-AML.

NCALD (Neurocalcin Delta) is a member of the Neuronal Calcium Sensor family of calcium binding proteins. The expression of NCALD gene is more abundant in the brain, testis, ovary and small intestine [23]. Experiments have shown that NCALD is a potential hippocampal memory-related factor related to obesity [24]. NCALD expression is down-regulated in lung cancer tissues, while low NCALD expression levels are associated with poor prognosis in non-small cell lung cancer (NSCLC) patients [25]. NCALD gene expression was significantly different in survival analysis, indicating that low expression of NCALD gene predicts poor prognosis in patients with ovarian cancer [26]. However, the expression level of NCALD gene in CN-AML patients has never been reported. Thus, we analyzed the relationship between the expression of NCALD and the prognosis of CN-AML.

Materials and methods

Data source and gene expression analysis

The method of calculating the probe set measurements for all Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo) arrays uses RMA algorithm (robust multiarray averaging). The relative expression values of RNA were log-transformed using log2. Gene expression profiles of The Cancer Genome Atlas (TGCA) (https://portal.gdc.cancer.gov/) AML patients were detected from the RNA-seq data showed with RPKM (Reads of Per Kilobase per Million mapped reads) and further log2 transformed. We calculated the P value (and hazard ratio) of each gene expression from the RNA-seq data and EFS/OS of TCGA CN-AML patients using survcomp package. NCALD located in the top position through the rank of the P-value. We divided patients into high-NCALD group and low-NCALD group by NCALD expression levels using maximally selected rank statistics algorithm from survminer package. We also resort to a different method for patient stratification division based on NCALD gene expression quartiles. Top 75% patients are NCALD high expression group and the other 25% patients are NCALD low expression group by ranking NCALD gene expression from high to low. EFS and OS are clinical end points. The time from diagnosis to the first event (last visit or no complete remission or relapse) was defined as EFS. The time from diagnosis to death for any cause or to the last follow-up was defined as OS. The written informed consent of all patients in this study was consistent with the Helsinki Declaration.

We studied totally 165 AML patients from TCGA dataset [27]. Of these, 75 AML patients are CN-AML, 67 patients underwent allo-HSCT and 94 patients received chemotherapy. All the gene expression was detected by RNA-seq method in the TCGA AML patients. Survival prognosis of EFS and OS in AML patients was analyzed by NCALD gene expression.

We studied totally 240 CN-AML patients from the GSE12417 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12417) [28]. Gene expression was performed by using the Affymetrix Human Genome U133A/B Array and Affymetrix Human Genome U133 Plus 2.0 Array. The relationship between NCALD gene expression and OS prognosis in patients with CN-AML was analyzed.

We studied totally 78 CN-AML patients from the GSE22778 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE22778) [29]. The relationship between NCALD gene expression and OS prognosis of CN-AML patients was analyzed.

We studied totally 104 CN-AML patients from the GSE71014 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE71014) [30]. Gene expression was performed by using the Illumina HumanHT-12 V4.0 expression beadchip. The relationship between NCALD gene expression and OS prognosis CN-AML patients was analyzed.

Gene expression was performed by using the NanoString Expression Assay system in GSE76004 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE76004) [31]. We analyzed the expression of NCALD genes in different LSC subgroups from 78 AML patients.

Statistical analysis

Statistical analysis of this study was performed using R software v3.1.3 (ggplot2 and survminer packages). Descriptive statistics are used to summarize the clinical and molecular characteristics of the patient. Survival analysis was assessed by log rank test and Kaplan–Meier curves. Comparison of NCALD expression levels between CD34/CD38 subgroups using one-way analysis of variance (Anova test). The expression of NCALD between the LSC- and LSC + groups was determined by unpaired t test. The statistical method for quantitative variables is t test or Anova test. The statistical method for categorical variables is Fisher’s exact test. Univariate and multivariate analysis of EFS and OS in 74 TCGA CN-AML were performed using Cox regression analysis. One TCGA CN-AML patient without the mutation information was not used in the univariate and multivariate Cox regression analysis. Variables of with P < 0.05 in univariate analysis were selected for multivariate analysis. The time dependent ROC curves estimation from EFS and OS were conducted using survival ROC package. In all statistical methods, P value < 0.05 was considered statistically significant.

Results

Clinical baseline characteristics of CN-AML

75 TCGA CN-AML patients were divided into NCALD-high group and NCALD-low group, according to NCALD expression level using maximally selected rank statistics algorithm, and the clinical and molecular characteristics between the two groups were compared (Table 1). Patients in the NCALD-high group had more relapses than the NCALD-low group (P = 0.001; Fisher’s exact test). There were no significant differences in basic characteristics between the two groups, such as sex, race, FAB subtypes, WBC, transplant, chemotherapy, genetic mutations (DNMT3A, NPM1, TET2, FLT3, IDH2, IDH1, RUNX1, NRAS, WT1, CEBPA, PTPN11, KRAS), complete response and relapse of before transplantation (all P > 0.05; Fisher’s exact test). Age, bone marrow blast (BM-blast), peripheral blood blast (PB-blast) and WBC also had no significant differences between the two groups (all P > 0.05; unpaired t test).

Table 1 Baseline patient characteristics according to the expression level of NCALD in TCGA CN-AML patients

CN-AML patients from other GEO datasets (GSE12417, GSE22778) were divided into NCALD-high group and NCALD-low group by the same method. The clinical and molecular characteristics between the two groups were compared. FAB subtypes, sex and gene mutations (NPM1, RUNX1, TET2) were not significantly differences in the basic characteristics between the two groups (Additional file 1: Tables S1–S4, all P > 0.05; Fisher’s exact test), and the age was not significantly difference between the two groups (Additional file 1: Tables S1–S4, all P > 0.05; unpaired t test).

High NCALD expression predicts worse survival in CN-AML patients

We analyzed the gene expression profiles of NCALD of EFS and OS in 75 CN-AML patients from TCGA dataset. There was a significant difference between the NCALD-high group and the NCALD-low group. High expression of NCALD gene in EFS and OS of CN-AML patients has a lower prognosis than low expression NCALD gene (Fig. 1a, EFS, P < 0.0001; OS, P = 0.00011; log rank test). We analyzed the NCALD gene expression profiles of OS in 240 patients with CN-AML from the GSE12417 dataset. Patients with high NCALD expression in CN-AML have a worse prognosis in OS than low NCALD expression (Fig. 1b, OS, P < 0.0001; log rank test). Then, we also found that patients with high NCALD expression had a worse prognosis in OS than low NCALD expression. These patients were 78 CN-AML patients from GSE22778 (Fig. 1c, OS, P < 0.0001; log rank test) and 104 CN-AML patients from GSE71014 (Additional file 2: Figure S1, OS, P < 0.0001; log rank test).

Fig. 1
figure 1

High NCALD expression predicts worse survival of CN-AML. The X axis represents time and the Y axis represents survival probability. A, Kaplan–Meier curves were used for EFS and OS in different NCALD expression groups of CN-AML in the TCGA dataset. The cutoff value for the high and low NCALD groups was 4.74. EFS, P < 0.0001; OS, P = 0.00011; log rank test. B, Kaplan–Meier curves were used for OS in different NCALD expression groups of CN-AML in GSE12417. The cutoff values for the high and low NCALD groups were 8.1794 (left)/5.882 (right). OS, P < 0.0001; log rank test. C, Kaplan–Meier curves were used for OS in different NCALD expression groups in GSE22778. The cutoff values for the high and low NCALD groups were − 1.384 (left)/− 1.095 (right). OS, P < 0.0001; log rank test

We used a different method for patient classification based on NCALD gene expression quartiles. All datasets showed that patients with high NCALD expression have a poor prognosis (Additional file 2: Figure S2, all P < 0.001; Additional file 2: Figure S3, P < 0.0001, log rank test).

NCALD expression predicts survival after treatment in AML patients

To investigate the gene expression of NCALD in AML patients after allo-HSCT or chemotherapy, we examined the gene expression profiles of NCALD in EFS and OS of 67 allo-HSCT patients from TCGA dataset. In EFS and OS, the NCALD-high group was associated with poor survival of allo-HSCT patients, while the NCALD-low group had good survival (Fig. 2a, EFS, P = 0.0051; OS, P = 0.028; log rank test). We additionally analyzed the prognostic value of NCALD gene expression in 94 AML patients after chemotherapy. Over expression of NCALD gene predicts worse EFS and OS in AML patients after chemotherapy from the TCGA dataset (Fig. 2b, EFS, P = 0.011; OS, P = 0.0056; log rank test).

Fig. 2
figure 2

High NCALD expression predicts worse survival of allo-HSCT or post-chemotherapy AML patients from the TCGA dataset. The X axis represents time (months) and the Y axis represents survival probability. A, Kaplan–Meier curves were used of EFS and OS in different NCALD expression groups after allo-HSCT. The cutoff value for the high and low NCALD groups was 5.30. EFS, P = 0.0051; OS, P = 0.028; log rank test. B, Kaplan–Meier curves were used of EFS and OS in different NCALD expression group after chemotherapy. The cutoff value for the high and low NCALD groups was 4.66. EFS, P = 0.011; OS, P = 0.0056; log rank test

We divided allo-HSCT or post-chemotherapy patients from the TCGA dataset into two groups based on the quartile of NCALD expression levels. The result showed that patients with high NCALD expression have a poor prognosis (Additional file 2: Figure S4, all P < 0.05, log rank test).

Univariate and multivariate analysis of EFS and OS

Univariate analysis for 17 variables including NCALD, age, WBC count, PB-blast, BM-blast and genes mutation (DNMT3A, NPM1, TET2, FLT3, IDH2, IDH1, RUNX1, NRAS, WT1, CEBPA, PTPN11, KRAS, mutation vs. wild type) from 74 TCGA CN-AML patients was conducted (Additional file 1: Table S5). Univariate analysis showed that FLT3 mutations are associated with shorter EFS (P = 0.0214; Cox regression), while high NCALD expression and DNMT3A mutations contributed to worse EFS and OS (NCALD, EFS, P = 3.90E−05, OS, P = 0.000115; DNMT3A, EFS, P = 0.0193, OS, P = 0.0402; Cox regression). NCALD, DNMT3A, and FLT3 have meaningful P values (P < 0.05) and HRs > 1 in EFS, while NCALD and DNMT3A have meaningful P values (P < 0.05) in OS and their HRs > 1 (Additional file 2: Figure S5). Among them, the NCALD gene is the most significant predictor of CN-AML survival (Additional file 2: Figure S5).

We selected NCALD expression levels and genes mutation (DNMT3A, FLT3, mutation vs. wild type) for multivariate analysis of EFS and OS (Table 2). Multivariate analysis assessed independent risk factors for clinical prognostic of EFS and OS in CN-AML. High NCALD expression and DNMT3A mutations were significantly associated with EFS (NCALD, P = 3.84E−05; DNMT3A, P = 0.0106; Cox regression). High NCALD expression and mutation of DNMT3A were significantly associated with OS (NCALD, P = 8.53E−05; DNMT3A, P = 0.0243; Cox regression).

Table 2 Multivariate analysis for EFS and OS of TCGA CN-AML patients

NCALD is a potential marker for predicting prognosis

To further investigate the potential clinical value of NCALD, we performed an analysis of the area under the receiver operating characteristic curve (AUC-ROC) of the survival model for variables from TCGA CN-AML. In the ROC curve analysis of age (Additional file 2: Figure S6), WBC (Additional file 2: Figure S7), DNMT3A (Additional file 2: Figure S8), and NCALD (Fig. 3), NCALD expression reached the largest area under the curve (AUC) of both EFS and OS. These results suggest that NCALD seems to be an excellent predictor of CN-AML prognosis.

Fig. 3
figure 3

ROC curves of survival for NCALD gene expression in TCGA CN-AML patients. The X axis represents false positive (FP) and the Y axis represents true positive (TP). A, ROC curves for NCALD expression and EFS of 75 TCGA CN-AML patients were performed. AUC reached 0.747 (1 year survival, left panel), 0.809 (2 year survival, middle panel) and 0.873 (3 year survival, right panel) respectively. B, ROC curves for NCALD expression and OS of 75 TCGA CN-AML patients were performed. AUC reached 0.711 (1 year survival, left panel), 0.727 (2 year survival, middle panel) and 0.783 (3 year survival, right panel) respectively. The grey shadow of the curves represents 95% confidence interval

Expression level of NCALD in LSC

Stem cells are divided into 4 cell populations based on fractions of the CD34 and CD38 phenotypes. The expression of NCALD was different in each subgroups (Additional file 2: Figure S9a, P = 0.00019; Anova test). Compared with the mean of all subgroups, the expression of NCALD gene was reduced in the CD34/CD38−/ −group (Additional file 2: Figure S9a, P ≤ 0.001; unpaired t test, two sided), and the expression of CD34/CD38 ± group was increased (Additional file 2: Figure S9a, P ≤ 0.01; unpaired t test, two sided), while the CD34/CD38∓ group and CD34/CD38 +/+ group showed no significant difference (Additional file 2: Figure S9a, P > 0.05; unpaired t test, two sided). To investigate the expression of NCALD gene in LSC, we analyzed 78 AML patients from the GSE76004 dataset. The data showed that the expression of NCALD gene in LSC + group was higher than it in LSC-group (Additional file 2: Figure S9b, P = 0.01; unpaired t test, two sided).

Discussion

The NCALD gene is a regulator of G-protein coupled receptor signaling, and there are several alternatively spliced gene variants that all encode the same protein. So far, little has been reported known about the relationship between NCALD and cancer, and the prognostic relationship between this gene and CN-AML has not been reported. We analyzed the gene expression of NCALD from AML patients in 5 independent datasets. Our study reports that the expression of NCALD is associated with the prognosis of CN-AML and is an independent risk factor of CN-AML.

Previous studies have reported that low expression of the NCALD gene is associated with the prognosis of some kinds of cancer. NCALD expression is down-regulated in the poor prognosis group of advanced ovarian cancer which suggests that NCALD is a prognostic biomarker for these ovarian cancer patients [32]. Low NCALD expression levels predict poor prognosis in patients with non-small cell lung cancer (NSCLC) [25]. However, in our survival analysis study, we found that the high expression of NCALD of CN-AML patients has a lower prognosis than patients with low NCALD expression. These results suggest that NCALD can predict different prognosis in different cancers and may play a different role in each type of human cancers.

Predicting the prognosis of AML after allo-HSCT treatment is particularly clinically important. Recent studies have shown that high expression of GAS6-mRNA is associated with poor prognosis of allo-HSCT in AML patients [33]. Previous studies of TMEM18, TP53, IDH and DNMT3A are potential biomarkers for AML [34,35,36]. However, there are still challenges to the prediction of the prognosis of AML, especially the prediction of the prognosis of AML after allo-HSCT. Our results suggest that high NCALD expression in AML patients after allo-HSCT can still predict poor prognosis. High NCALD expression also predicts poor prognosis in AML patients after chemotherapy. Therefore, NCALD may be a biomarker for predicting the prognosis of AML with or without allo-HSCT.

In univariate and multivariate analyses, we found that the DNMT3A mutations are associated with EFS and OS in CN-AML, and that the FLT3 mutations are associated with EFS in CN-AML. This is consistent with previous findings that DNMT3A mutations are associated with shorter EFS and OS [37,38,39]. Multivariate analysis showed that high expression of NCALD in CN-AML patients was confirmed to be associated with worse EFS and OS. Furthermore, in the ROC curve survival analysis, we found that the relation between NCALD gene expression levels and the prognosis of CN-AML patients was superior to other variables. These results suggest that NCALD should be an independent predictor of prognosis in patients with CN-AML.

The LSC immunophenotypic analysis of all subtypes of AML is CD34+ CD38− [40, 41]. Most of CD34+ and a few of CD34- fractions contain LSC, and LSC was detected in all CD34/CD38 phenotype fractions [31, 42]. AML is initiated by a small fraction of LSCs that maintain extensive proliferation and unrestricted self-renewal of leukemia mother cells, leading to chemoresistance and relapse [43, 44]. High expression of LSC gene is independently associated with poor prognosis in patients with AML [45]. In our study, more LSC+ was present in AML patients with high NCALD gene expression, suggesting that the NCALD gene is enriched in stem cell expression. This result suggests that the NCALD gene may affect LSC.

However, we did not further study the specific function of NCALD in the AML pathogenic signaling pathway. This study only explored the expression of individual genes, and future studies combining different types of potential biomarkers can be conducted to further predict the prognosis of patients with AML.

Conclusions

In summary, our results indicate that high expression of NCALD gene is a poor prognostic factor for CN-AML. NCALD can be considered as independent predictors of CN-AML patients and can be used as a biomarker for the prognosis of CN-AML.