Background

Gastric cancer (GC) is one of the most common malignant tumours worldwide, with GC-associated deaths ranking fourth among cancer deaths [1]. Several classification systems are currently used for GC, including the Lauren classification and the WHO classification. The Lauren classification divides GC into intestinal and diffuse types according to histological subtype, whereas the WHO classification describes four types, including signet ring cell carcinoma (SRCC) and mucinous types [2, 3]. SRCC is defined as a histological type of GC in which tumour cells contain abundant mucin (occupying > 50% of the cell) and the nucleus is displaced to the edge of the cell. SRCC belongs to diffuse-type and undifferentiated GC and predicts a poorer prognosis than other types of GC [4]. Regarding survival, the 5-year disease-free survival (DFS) rate of SRCC is 86.9% for stage I patients, 38.3% for stage II patients, and only 16.2% for stage III patients [5]. SRCC also serves as an independent risk factor for hepatic metastasis and peritoneal metastasis [4]. In addition, according to age at diagnosis, GC is divided into early-onset GC (EOGC) and late-onset GC (LOGC); EOGC is defined as a tumour diagnosed before 45 years of age [6]. The incidence of diffuse-type GC is increasing while that of intestinal-type GC is declining; likewise, the incidence of EOGC has been steadily increasing while that of LOGC is decreasing [6, 7]. EOGC is also considered distinct from LOGC and particularly worrisome, because younger patients have had less exposure to environmental factors [8].

Given the disparities between EOGC and LOGC and the high malignancy of SRCC, we sought to investigate the rate of metastasis in early- and late-onset SRCC. Through this study, we aim to enhance the clinical understanding of these cancer types and provide evidence for therapeutic selection, with the ultimate goal of improved outcomes. We collected data on 2052 SRCC patients from the SEER database, including 234 EOGC and 1818 LOGC patients diagnosed from 2010 to 2015, and performed a comprehensive analysis that was validated in an external cohort of 403 SRCC patients, including 54 EOGC and 349 LOGC patients.

Methods

Patients

All patients with GC in the SEER database were retrieved using the National Cancer Institute’s SEER*Stat software (version 8.3.6). Informed consent was not required because the SEER database is publicly available. All patients underwent surgery. According to the International Classification of Diseases for Oncology, third edition (ICD-O-3), tumours with code 8490 were identified as SRCC. Patients were included according to the following criteria: (1) older than 20 years and diagnosed with GC by positive histology from 2010 through 2015; (2) SRCC histopathology; (3) available survival information; and (4) available detailed information, including age, race, grade, examined lymph nodes (LNs), tumour size, T stage, N stage and M stage. Detailed information on the excluded patients is listed in Fig. 1. In addition, we extracted 403 patients diagnosed with SRCC from March 2011 to March 2019 at the First Affiliated Hospital of Nanchang University. These patients were included according to the following criteria: (1) aged more than 20 years and underwent surgery, (2) diagnosed with SRCC by histology from March 2013 to March 2019, and (3) no serious chronic diseases, such as chronic renal failure. Patients were excluded according to the following criteria: (1) no record of TNM staging, tumour size, lymphatic vessel invasion or examined LNs or (2) chemotherapy before surgery. The study was approved by the Ethics Committee of the First Affiliated Hospital of Nanchang University. The detailed information is shown in Fig. 2.

Fig. 1 The flowchart of extracting patient information from the SEER database in our study

Fig. 2 The flowchart of extracting patient information from the First Affiliated Hospital of Nanchang University in our study

Clinicopathological factors

The cases in the SEER database and from our hospital were divided into the EOGC group and the LOGC group according to age at diagnosis: ≤ 45 and > 45 years. Race was classified into three types: white, black and other. T stage was recorded as T1, T2, T3 and T4. Lymph node metastasis (LNM) was described as N0 (negative), N1 (1–2 positive LNs), N2 (3–6 positive LNs) and N3 (> 6 positive LNs). Distant metastasis was recorded as M0 (no) or M1 (yes). Tumour size was categorized into three groups: ≤ 2 cm, > 2 to ≤ 5 cm, and > 5 cm. With respect to examined LNs, the cut-off value was 16 according to previous studies [9]. Primary sites were recorded as cardia, fundus, body, antrum, and overlapping lesion/NOS. The status of chemotherapy was recorded as no or yes. The medicines used for chemotherapy included Xeloda (capecitabine), tegafur and oxaliplatin. The methods and courses of chemotherapy were determined according to the TNM stage of the patients [10]. All information on patients from the SEER database is shown in Table 1; that on patients from our hospital is shown in Table 2. The primary outcome was distant metastasis, which was assessed when the patients were first admitted to the hospital.
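The grouping rules in this subsection can be written down as a few small functions. The following is only a minimal sketch, not the authors' code: the function names and label strings are hypothetical, the middle tumour-size group is read as > 2 to ≤ 5 cm, and the side of the examined-LN cut-off on which the value 16 itself falls is an assumption, since the text does not state it.

```python
def age_group(age_years):
    """Early-onset (EOGC) if aged 45 or younger at diagnosis, else late-onset (LOGC)."""
    return "EOGC" if age_years <= 45 else "LOGC"


def n_stage(positive_lns):
    """Map the number of positive lymph nodes to the N categories listed in the text."""
    if positive_lns == 0:
        return "N0"
    if positive_lns <= 2:
        return "N1"
    if positive_lns <= 6:
        return "N2"
    return "N3"


def tumour_size_group(size_cm):
    """Three size groups; the middle one is assumed to mean > 2 and <= 5 cm."""
    if size_cm <= 2:
        return "<=2 cm"
    if size_cm <= 5:
        return ">2 to <=5 cm"
    return ">5 cm"


def examined_ln_group(examined_lns, cutoff=16):
    """Split examined LNs at the cut-off; treating exactly 16 as '>=16' is an assumption."""
    return ">=16" if examined_lns >= cutoff else "<16"


# Hypothetical patient, for illustration only.
patient = {"age": 43, "positive_lns": 4, "size_cm": 3.5, "examined_lns": 21}
print(age_group(patient["age"]), n_stage(patient["positive_lns"]),
      tumour_size_group(patient["size_cm"]), examined_ln_group(patient["examined_lns"]))
```

Keeping each rule in its own function makes the boundary choices (for example, whether a 45-year-old counts as early-onset) explicit and easy to check.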

Table 1 Basic information of extracted patients from SEER diagnosed in 2010–2015
Table 2 Basic information of included patients from our hospital diagnosed in 2003–2019

Statistical analysis

For baseline comparisons, patients were divided into two groups, namely, LOGC and EOGC, and Pearson’s chi-squared test was used to investigate associations between categorical variables. To explore potential risk factors for distant metastasis, we performed univariate and multivariate logistic regression analyses, and the results are reported as odds ratios (ORs) with 95% confidence intervals (CIs). Kaplan–Meier (K-M) survival curve analysis was performed for the overall survival (OS) and cancer-specific survival (CSS) of patients with SRCC.
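For a single binary factor, both the chi-squared test and the unadjusted OR can be computed by hand from a 2 × 2 table. The sketch below illustrates the arithmetic only; it is not the software used in the study, the function names are hypothetical, and the counts are invented for illustration. It uses Pearson's statistic without continuity correction and a Wald interval on the log-odds scale. The adjusted ORs from multivariate logistic regression require a model-fitting routine and are not reproduced here.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_wald_ci(a, b, c, d, z=1.959964):
    """Unadjusted odds ratio (a*d)/(b*c) with a Wald 95% CI computed on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not study data): rows = group 1 / group 2,
# columns = metastasis yes / no.
chi2, p_value = pearson_chi2_2x2(30, 70, 10, 90)
odds_ratio, lower, upper = odds_ratio_wald_ci(30, 70, 10, 90)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An OR whose 95% CI excludes 1 corresponds to a significant association at the 0.05 level under the Wald approximation.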

To address the imbalance between the LOGC and EOGC groups, we performed propensity-score matching (PSM) with the MatchIt package in R software to obtain a matched dataset for analysis. The caliper was set to 0.02, and balance after matching was evaluated based on the P value. The detailed process was as follows. First, we calculated the propensity score of each patient for age group (LOGC or EOGC) with a multivariate logistic regression model. Then, we matched patients between the two groups at a ratio of 1:1. Next, we analysed differences in all variables between the EOGC and LOGC groups with the chi-squared test. Finally, we explored the correlation between age and distant metastasis using a univariate logistic regression model.
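The matching step can be illustrated with a greedy 1:1 nearest-neighbour procedure on precomputed propensity scores. This is a minimal sketch under stated assumptions rather than a reproduction of what MatchIt does: the function name, patient identifiers and scores are hypothetical, matching is done without replacement in identifier order, and the caliper of 0.02 is applied to the raw score difference (the text does not say whether the caliper was on the raw scale or in standard-deviation units).

```python
def match_one_to_one(early_scores, late_scores, caliper=0.02):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    early_scores, late_scores: dicts mapping patient id -> propensity score.
    Returns a list of (early_id, late_id) pairs whose scores differ by at most
    `caliper`; early-onset patients with no late-onset patient inside the
    caliper stay unmatched.
    """
    available = dict(late_scores)  # late-onset patients not yet used
    pairs = []
    for early_id, score in sorted(early_scores.items()):
        if not available:
            break
        # Closest remaining late-onset patient; ties broken by id for determinism.
        best_id = min(available, key=lambda k: (abs(available[k] - score), k))
        if abs(available[best_id] - score) <= caliper:
            pairs.append((early_id, best_id))
            del available[best_id]
    return pairs


# Hypothetical propensity scores, for illustration only.
early = {"E1": 0.30, "E2": 0.55, "E3": 0.90}
late = {"L1": 0.31, "L2": 0.27, "L3": 0.56, "L4": 0.70}
print(match_one_to_one(early, late))
```

In this toy example E3 has no late-onset counterpart within the caliper and is dropped, which is how a narrow caliper trades sample size for closer balance.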

Unless otherwise stated, statistical analyses were performed with R software (version 3.6.1; R Foundation for Statistical Computing, Vienna, Austria). The chi-squared test for categorical variables, Student’s t-test for normally distributed continuous variables, and the nonparametric Kruskal–Wallis rank sum test for non-normally distributed continuous variables or ordinal categorical variables were used for comparisons among patient groups. The chi-squared test was carried out with SPSS (version 24.0). Results were considered statistically significant when the P value was less than 0.05.

Results

Basic information of extracted cases

According to the inclusion and exclusion criteria, we collected patients diagnosed with SRCC from the SEER database and from our hospital. As shown in Figs. 1 and 2, we extracted 2052 cases from the SEER database and 403 from our hospital. The basic information of the patients in the two cohorts is listed in Tables 1 and 2. In the SEER cohort, patients with early-onset SRCC were more frequently female (59.83% vs 47.8%, P < 0.001) and more often had tumours located in the body of the stomach (16.7% vs 10.6%, P < 0.05) than patients with late-onset SRCC. In addition, early-onset SRCC patients more often had distant metastasis (19.66% vs 10.34%, P < 0.001), though they less often had more than 2 tumours (7.26% vs 22.33%, P < 0.001). Among patients from the First Affiliated Hospital of Nanchang University (Table 2), early-onset SRCC patients were similarly more likely to be female than late-onset SRCC patients (59.25% vs 32.09%, P < 0.05). In addition, the proportions of patients with distant metastasis and with lymphatic vessel invasion were larger in early-onset SRCC than in late-onset SRCC (P < 0.05). Interestingly, the proportion of patients undergoing chemotherapy was significantly higher for early-onset SRCC than for late-onset SRCC (46.3% vs 30.09%, P = 0.017).

Survival analysis and identification of risk factors for metastasis

To investigate the survival of patients with early-onset and late-onset SRCC, we generated K-M survival curves (Fig. 3). Regarding overall survival, patients with early-onset SRCC had 1-, 3- and 5-year survival rates of 74.25%, 56.32% and 45.84%, respectively, whereas patients with late-onset SRCC had rates of 65.48%, 47.29% and 36.45%, respectively; the difference was significant (Fig. 3a, P = 0.0044). Similarly, for cancer-specific survival, early-onset SRCC patients had better survival than late-onset SRCC patients (Fig. 3b, P = 0.038). In addition, we divided patients into negative-metastasis and positive-metastasis groups and drew K-M survival curves. For both early-onset and late-onset SRCC, patients with distant metastasis had markedly poorer survival than those without metastasis (Fig. 3c, d). To identify potential risk factors for distant metastasis, we performed univariate and multivariate logistic regression analyses. For patients from the SEER database, black race was a favourable factor compared to other races (OR = 0.462, 95% CI, 0.272–0.787; P = 0.004). Additionally, advanced T stage was an independent risk factor for distant metastasis. Interestingly, age was an independent factor for metastasis, and patients diagnosed with early-onset SRCC more frequently developed metastasis (Table 3). To validate these findings, we analysed data from our own hospital and found that patients aged > 45 years had metastasis less often than patients aged ≤ 45 years (OR = 0.301, 95% CI, 0.135–0.672; P = 0.003) (Table 4). Moreover, smoking, advanced T stage and lymphatic vessel invasion were independent risk factors for metastasis (P < 0.05) (Table 4). To account for confounding factors, we performed PSM on the two datasets. As shown in Tables 5 and 6, the matched groups were balanced on the baseline variables (P > 0.05), and early-onset SRCC patients remained more likely to develop metastasis (P < 0.05).
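The 1-, 3- and 5-year survival rates in this subsection are read off K-M curves. As a reminder of how such a curve is built, the following minimal sketch implements the product-limit estimator in plain Python. It is illustrative only: the function name is hypothetical, the follow-up times and event flags are invented and are not study data, and the log-rank comparison behind the reported P values is not included.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times:  follow-up time for each patient (e.g. months).
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0  # deaths plus censorings at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data in months (not study data).
months = [6, 12, 12, 20, 30, 36]
died = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(months, died):
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate differs from a simple proportion of patients alive.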

Fig. 3 Survival of SRCC patients from the SEER database with early onset and late onset. a Overall survival of early-onset SRCC and late-onset SRCC patients. b Cancer-specific survival of early-onset SRCC and late-onset SRCC patients. c Survival of early-onset SRCC with negative metastasis or positive metastasis. d Survival of late-onset SRCC with negative metastasis or positive metastasis

Table 3 Univariate and multivariate logistic regression analysis of EOGC and LOGC patients from SEER for metastasis
Table 4 Univariate and multivariate logistic regression analysis of EOGC and LOGC patients from our hospital for metastasis
Table 5 Basic information of extracted patients from SEER diagnosed in 2010–2015 after PSM
Table 6 Basic information of included patients from our hospital diagnosed in 2003–2019 after PSM

Discussion

SRCC is a rare type of GC and predicts poor survival. Although the overall incidence of GC has decreased, that of SRCC is on the rise [11]. In general, SRCC is known to have a female predominance, to affect a younger population and to be located in the middle and distal stomach [12]. According to data from the United States, SRCC is more common in black people [11, 12]. A large proportion of patients are at a late stage when diagnosed and even have distant metastasis, owing to the insidious symptoms and high malignancy of SRCC [12]. Moreover, approximately 10% of GC is detected in those younger than 45 years old, and this proportion is increasing [13]. However, a knowledge gap has been reported for early-onset SRCC [14, 15]. In our study, we extracted 2052 cases from the SEER database and 403 from our hospital. Univariate and multivariate analyses revealed that an age of 45 years or younger was an independent risk factor for metastasis, a finding confirmed by PSM in both the SEER data and our data. To our knowledge, this is the first study to demonstrate the disparity in metastasis between early-onset and late-onset SRCC using the SEER database and to validate it in an external cohort.

EOGC has increased over the past several decades and is regarded as more challenging than LOGC. With regard to genetic variation, TP53 mutation occurs less often in EOGC than in LOGC, whereas MUC5B, CDH1, and TGFBR1 present higher mutation rates, suggesting that the poor prognosis for EOGC might be associated with these mutated genes [16]. The histopathological characteristics and clinical behaviour are also quite distinct between EOGC and LOGC. For example, early-onset colorectal cancer is reported to more frequently present with lymph node metastasis or distant metastasis [17, 18]. Another study using the SEER database found that the proportion of EOGC patients with metastatic disease was greater than that of LOGC patients (49.5% vs 40.9%, P < 0.01) [15]. In line with this previous study [19], our results showed that early-onset SRCC patients from the SEER database more commonly had metastasis (19.66% vs 10.34%, P < 0.001), and early-onset SRCC patients from our hospital had similar clinical characteristics (20.37% vs 9.17%, P < 0.05). Furthermore, multivariate logistic regression analysis demonstrated that an age of 45 years or younger was a risk factor, which was consistent with the results of the PSM analysis. Multivariate analysis and PSM analysis reduce the effect of confounding factors, which strengthens the reliability of our results. There are several potential explanations for these findings. Regarding genomic mutations, several studies have reported that early-onset SRCC is associated with a de novo deletion of CDH1, which encodes a protein that functions in adherens junctions; this to some extent explains why early-onset SRCC is more likely to be metastatic [15, 20]. Regarding clinical characteristics, the distinct sex distribution between early-onset SRCC and late-onset SRCC may also account for the high malignancy [21]. In addition to distant metastasis, we found that lymphatic vessel invasion was more frequent in early-onset patients, which suggests that early-onset SRCC has greater malignancy.

In line with other studies [15, 22], we found that early-onset SRCC patients had a better prognosis than late-onset patients based on SEER data. Although this result could not be validated with data from our hospital because of the limited number of patients, we offer some explanations. The proportion of patients who underwent chemotherapy was larger among early-onset patients than among late-onset patients (Table 2), which partly explains the result. Similarly, other studies found that chemoradiotherapy was more frequent in early-onset patients; moreover, surgical complications were less frequent in early-onset patients than in late-onset patients [23]. In early-onset patients, the main cause of death is advanced disease, whereas in older patients, it is associated comorbid conditions. In addition, mutation of TP53 occurs less often in EOGC than in LOGC, whereas MUC5B, CDH1, and TGFBR1 present higher mutation rates, suggesting that the poor prognosis of EOGC might be associated with these mutated genes [16]. Overall, the difference in survival between early-onset and late-onset GC remains controversial, and some studies report that when imbalanced baseline characteristics are matched, the prognosis of early-onset patients does not differ from that of late-onset patients [24].

Our study has some limitations that should be discussed. First, TNM staging in the SEER database was performed using the 7th edition rather than the 8th edition, which may influence the identification of risk factors for metastasis. However, our own data were staged according to the 8th edition, which compensated for this limitation. Second, some important factors, such as lymphatic invasion and smoking, are not recorded in the SEER database, although these data were available at our hospital. Third, we excluded many patients who had missing data for the collected variables, which may have introduced selection bias. Fourth, variables such as examined LNs and positive LNs may vary between clinicians and clinical centres. Finally, we only enrolled patients who received surgical resection, which can be a critical limitation and may affect our results to some degree. Despite these limitations, our study has some strengths, such as internal and external validation and the complementary nature of the two datasets.

Conclusion

In conclusion, our study showed that distant metastasis is more common in early-onset SRCC than in late-onset SRCC. The increased frequency of distant metastasis in SRCC patients younger than 45 years of age offers a unique opportunity to gain a better understanding of carcinogenesis, which might be exploited during diagnosis and management. Regardless, further studies are needed to explore the potential aetiologic basis for the disparity.