Discovery
For the pilot study of 3 ESCC patients and 3 healthy volunteers (Fig. 1A), the isolated exosomes from either saliva or cell lysate were confirmed by TEM (Fig. S1A) and immunoblot using antibodies against specific exosomal markers (Alix, TSG101, CD63, CD9) and Calnexin, an intracellular protein that is not present in exosomes (Fig. S1B). Nanoparticle tracking showed that the average diameter of exosomes from saliva was 95 nm (Fig. S1C). Compared to the controls, 1366 differentially (p < 0.05) expressed sesncRNAs (excluding miRNAs) were identified in ESCC patients. Among them, 32 were highly expressed in ESCC patients with log2 fold changes > 2, and the top five candidates with the most significant fold changes were selected for further investigation (Fig. 2A).
In total, 33 ESCC patients and 33 controls were recruited at CHSUMC to evaluate the levels of the top five sesncRNAs by RT-qPCR (Fig. 1B). Two of the five sesncRNAs met the predetermined significance level of 0.01 (Fig. 2B): a tsRNA (tRNA-GlyGCC-5) [34] and a previously uncharacterized small non-coding RNA located on chromosome 1. BLAST analysis at the National Center for Biotechnology Information (NCBI) found that tRNA-GlyGCC-5 is derived from the 5’ end of TRG-GCC. Since the uncharacterized RNA is a small RNA identified in Exosomes from the Saliva of ESCC patients, we named it sRESE. Further analysis found that the sRESE gene resides between SSX2IP and lncRNA LOC102724892. The expression levels of these two sesncRNAs were examined in exosomes (Fig. S2A) and cell lines (Fig. S2B). Compared with immortalized esophageal epithelial cells, both tRNA-GlyGCC-5 and sRESE were highly expressed in ESCC cell lines and in the exosomes secreted into their conditioned media. Sanger sequencing further confirmed that these sesncRNAs were present and detectable in exosomes (Fig. S3). To understand the biological role of these sesncRNAs, we transfected ESCC cells with antisense RNAs against them and found that proliferation, migration, and invasion were all significantly suppressed in both KYSE150 (Fig. 2C) and TE-12 cells (Fig. S4), suggesting that these sesncRNAs could be involved in the regulation of cell proliferation, migration, and invasion.
Detection of the presence of ESCC
A prospective study (ChiCTR2000031507) is currently underway to collect saliva samples for the study of exosomal biomarkers. Because the pre-specified sample size has not yet been reached, the study is ongoing. The demographic and clinicopathological characteristics of the patients in this interim analysis are summarized in Table 1. We analyzed the levels of these sesncRNAs in the salivary samples collected as of January 1, 2018 (Fig. 1C and D) and found that both tRNA-GlyGCC-5 and sRESE were significantly (p < 0.001) increased in ESCC patients compared with healthy volunteers in both the CHSUMC cohort (200 ESCC patients and 120 controls, Fig. 2D) and the ATH cohort (140 ESCC patients and 60 controls, Fig. 2E). In addition, the AUROCs for tRNA-GlyGCC-5 and sRESE were 0.878 and 0.871, respectively, in the CHSUMC subjects (training cohort) (Fig. 2F). With the optimal cutoff values for tRNA-GlyGCC-5 and sRESE determined by the Youden indices in the receiver operating characteristic analyses of the training cohort, the sensitivities of tRNA-GlyGCC-5 and sRESE for the prediction of ESCC were 79.00% and 77.00%, respectively, in the training cohort (Table 2).
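The cutoff selection described above can be illustrated with a minimal sketch of the Youden index, J = sensitivity + specificity − 1, maximized over candidate thresholds. The function name, the "score at or above the cutoff is positive" convention, and the toy values are assumptions made for illustration; they are not the study's data or its actual analysis code.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) that maximizes Youden's J.

    labels: 1 = case, 0 = control.
    Assumed convention: a sample is called positive when score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Keep the first cutoff that reaches the highest J
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical expression values, invented purely to exercise the function
toy_scores = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.08]
toy_labels = [0, 0, 0, 1, 0, 1, 1]
cutoff, sens, spec = youden_cutoff(toy_scores, toy_labels)
print(cutoff, sens, spec)
```

In practice the thresholds would come from the ROC curve of the training cohort, and the chosen cutoff would then be carried unchanged into the validation cohort.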
Table 1 Patient demographics and clinicopathological characteristics of the training and validation cohorts
Table 2 Performance of sesncRNAs test to differentiate ESCC patients from healthy subjects in CHSUMC and ATH cohorts
To investigate the efficacy of a bi-sesncRNA signature (tRNA-GlyGCC-5 and sRESE), we performed a logistic regression analysis using the expression values of tRNA-GlyGCC-5 and sRESE to predict the presence of ESCC. The bi-sesncRNA signature risk score for diagnosis (RSD) was defined as:
$$\mathrm{RSD}=111.01\times\left(\text{expression value of tRNA-GlyGCC-5}\right)+27.198\times\left(\text{expression value of sRESE}\right)-4.029.$$
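As a worked illustration of the RSD formula, the following sketch evaluates the score for one sample. The coefficients are taken from the equation above; the function name and the input values are hypothetical, and the scale of the "expression value" (for example, a normalized RT-qPCR quantity) is not specified in this section and is left to the caller.

```python
# Coefficients copied from the RSD equation in the text
RSD_COEF_TRNA_GLYGCC5 = 111.01
RSD_COEF_SRESE = 27.198
RSD_INTERCEPT = -4.029


def risk_score_diagnosis(trna_glygcc5, srese):
    """Linear risk score for diagnosis (RSD) from the two expression values."""
    return (RSD_COEF_TRNA_GLYGCC5 * trna_glygcc5
            + RSD_COEF_SRESE * srese
            + RSD_INTERCEPT)


# Hypothetical expression values, chosen only to show the arithmetic:
# 111.01 * 0.02 + 27.198 * 0.05 - 4.029 = 2.2202 + 1.3599 - 4.029
example_rsd = risk_score_diagnosis(0.02, 0.05)
print(round(example_rsd, 4))
```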
ROC analysis indicated that the bi-sesncRNA signature RSD performed better than either sesncRNA alone (AUROC 0.933 vs. 0.878 or 0.871, DeLong test, both p < 0.001). Based on the optimal cutoff value (0.040) given by the Youden index of the ROC curve, the RSD discriminated ESCC patients in the CHSUMC cohort from controls with a sensitivity of 90.50% and a specificity of 94.20%. The positive predictive value (PPV) was 96.28%, and the negative predictive value (NPV) was 85.61% (Table 2). When the cutoff value from the CHSUMC training cohort was applied to the ATH validation cohort, the sensitivity, specificity, PPV, and NPV were 87.14%, 85.00%, 93.13%, and 75.00%, respectively (Table 2). Interestingly, with this cutoff, patients with stage I ESCC could also be discriminated from controls in both the CHSUMC and ATH cohorts (Table S1). Therefore, the bi-sesncRNA signature could robustly distinguish ESCC patients (including those with stage I disease) from healthy subjects, indicating high translational potential.
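The relationship between these four metrics can be made concrete with a short sketch. The confusion-matrix counts below are not reported in this section; they are inferred, as an assumption, from the CHSUMC cohort sizes (200 ESCC patients, 120 controls) and the reported percentages.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts for the CHSUMC cohort (inferred, not taken from the source):
# 181 of 200 patients test positive, 113 of 120 controls test negative.
chsumc = diagnostic_metrics(tp=181, fn=19, tn=113, fp=7)
chsumc_percent = {name: round(100 * value, 2) for name, value in chsumc.items()}
print(chsumc_percent)
```

Under these assumed counts the sensitivity, PPV, and NPV match the reported 90.50%, 96.28%, and 85.61%, while the specificity comes out at 94.17% rather than the reported 94.20%.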
Prognostic prediction of ESCC
To assess the potential clinical utility of a bi-sesncRNA signature score in ESCC prognosis, we developed a logistic regression formula that models vital status and used it to calculate a Risk Score for Prognosis (RSP) for each patient from the expression levels of the two sesncRNAs:
$$\mathrm{RSP}=22.979\times\left(\text{expression value of tRNA-GlyGCC-5}\right)+5.741\times\left(\text{expression value of sRESE}\right)-2.199.$$
The medians of tRNA-GlyGCC-5, sRESE, and bi-sesncRNA signature RSP were used to divide the patients in the CHSUMC cohort into high (above median) and low (at or below median) groups. The bi-sesncRNA signature RSP was highly correlated with lymph node metastasis and histological differentiation (Table 3). Kaplan-Meier analysis revealed that ESCC patients with high RSP had significantly shorter OS (Fig. 3C, HR = 4.95, 95%CI 2.90–8.46, p < 0.001) and PFS (Fig. 3G, HR = 3.69, 95%CI 2.24–6.10, p < 0.001) than those with low RSP. Notably, the bi-sesncRNA signature predicted OS and PFS better than either sesncRNA alone (Fig. 3A and B, OS: HR 4.95 [95%CI 2.90–8.46] vs 2.63 [1.65–4.19] or 2.93 [1.82–4.72]; PFS: HR 3.69 [2.24–6.10] vs 2.22 [1.41–3.50] or 2.46 [1.53–3.94]).
Table 3 Association of tRNA-GlyGCC-5 expression, sRESE expression and bi-sesncRNAs signature RSP with demographic and clinicopathological characteristics of the CHSUMC cohort
When the ESCC patients in the ATH cohort were divided into high-RSP and low-RSP groups using the above-defined cutoff value (−0.436), Kaplan-Meier analysis revealed that patients with high RSP had shorter OS (Fig. 3D, HR = 2.06, 95%CI 1.23–3.46, p = 0.005) and PFS (Fig. 3H, HR = 2.05, 95%CI 1.21–3.46, p = 0.006) than those with low RSP, suggesting that a high bi-sesncRNA-derived RSP can serve as an indicator of poor prognosis in ESCC. Multivariate Cox regression analysis indicated that histological differentiation, bi-sesncRNA signature RSP, and TNM stage were independent prognostic factors for OS and PFS in both the CHSUMC and ATH cohorts (Table 4 and S2). To obtain a more precise prediction of an individual ESCC patient’s survival while controlling for TNM stage and histological differentiation, a nomogram prediction model was established based on the multivariate regression analysis (Fig. 4A). The 3-year OS was predicted well in both the CHSUMC cohort (Fig. 4B, C-index = 0.718) and the ATH cohort (Fig. 4C, C-index = 0.711). Collectively, these findings demonstrate that the bi-sesncRNA signature RSP could serve as an independent predictor of clinical outcomes in ESCC.
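The stratification used in these survival analyses can be sketched as follows. The coefficients and the cutoff of −0.436 come from the text; the function names and the input values are hypothetical, and the grouping rule (high if strictly above the cutoff, low if at or below it) is assumed to follow the median-split convention stated for the CHSUMC cohort.

```python
# Coefficients and cutoff copied from the text
RSP_COEF_TRNA_GLYGCC5 = 22.979
RSP_COEF_SRESE = 5.741
RSP_INTERCEPT = -2.199
RSP_CUTOFF = -0.436


def risk_score_prognosis(trna_glygcc5, srese):
    """Linear risk score for prognosis (RSP) from the two expression values."""
    return (RSP_COEF_TRNA_GLYGCC5 * trna_glygcc5
            + RSP_COEF_SRESE * srese
            + RSP_INTERCEPT)


def rsp_group(trna_glygcc5, srese, cutoff=RSP_CUTOFF):
    """Assumed rule: 'high' if RSP is above the cutoff, otherwise 'low'."""
    return "high" if risk_score_prognosis(trna_glygcc5, srese) > cutoff else "low"


# Hypothetical patients, invented only to show both branches
group_a = rsp_group(0.05, 0.20)  # RSP = 1.14895 + 1.1482 - 2.199, about 0.098
group_b = rsp_group(0.02, 0.10)  # RSP = 0.45958 + 0.5741 - 2.199, about -1.165
print(group_a, group_b)
```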
Table 4 Univariate and multivariate Cox proportional hazards analyses of survival in ESCC patients of CHSUMC cohort
Prediction of benefit of adjuvant therapy
We next examined whether the bi-sesncRNA signature RSP could guide postoperative adjuvant treatment. Using the above-established cutoff value (−0.436) of the bi-sesncRNA-derived RSP, patients were stratified into high-RSP and low-RSP groups to retrospectively analyze the effect of adjuvant therapy on ESCC clinical outcomes. In the CHSUMC cohort, 54 ESCC patients with high RSP and 41 ESCC patients with low RSP received adjuvant therapy. When the Kaplan-Meier survival analysis was stratified by the bi-sesncRNA signature RSP, adjuvant therapy was associated with improved OS (Fig. 5A, HR 0.47, 95%CI 0.29–0.77; p = 0.002) and PFS (Fig. 5E, HR 0.36, 95%CI 0.21–0.62; p < 0.001) in patients with high RSP but not in those with low RSP (Fig. 5B and F, OS: HR 0.62, 95%CI 0.22–1.77; p = 0.370; PFS: HR 1.06, 95%CI 0.44–2.51; p = 0.903). Similar findings were observed in the ATH cohort, in which 55 of the 140 patients received adjuvant therapy (Fig. 5C-D and G-H; high-RSP patients, OS: HR 0.28, 95%CI 0.12–0.70; p = 0.003; PFS: HR 0.32, 95%CI 0.13–0.78; p = 0.008; low-RSP patients, OS: HR 0.71, 95%CI 0.32–1.58; p = 0.403; PFS: HR 1.05, 95%CI 0.49–2.27; p = 0.898). To check whether differences in clinicopathological characteristics between patients with and without adjuvant therapy biased this comparison, a χ2 test was performed, and no significant difference was found between the two groups (Tables S3 and S4). These results suggest that only ESCC patients with a high bi-sesncRNA signature RSP might benefit from adjuvant therapy in terms of PFS and OS. Therefore, the bi-sesncRNA signature RSP might be a potential tool to predict which pre-operative patients might benefit from adjuvant therapy.