Introduction

Cervical cancer is the fourth most prevalent malignancy among women globally (Small et al. 2017). In 2020, the incidence of cervical cancer was approximately 13.3 per 100,000 females, and mortality was 7.2 per 100,000 females, with higher rates among young women aged 20–39 years (roughly nine deaths per week) (Cohen et al. 2019; Siegel et al. 2019). Cervical adenocarcinoma (CA), accounting for approximately 10–25% of all cervical cancer cases, is the second most prevalent subtype after cervical squamous cell carcinoma (CSCC) (Rivera-Colon et al. 2020). However, as cervical screening methods improve, particularly among young women under 40, CSCC rates are declining while CA rates are increasing (Bray et al. 2005). Moreover, CA patients are more likely to experience lymph node metastasis and recurrence with a worse prognosis (Irie et al. 2000).

The current approaches to treatment and factors influencing prognosis for cervical cancer are primarily determined by the 7th edition of the Cancer Staging Manual by the American Joint Committee on Cancer (AJCC 7th), as well as the staging guidelines provided by the International Federation of Gynecology and Obstetrics in 2018 (Bhatla and Denny 2018). However, increasing evidence suggests that those who share a similar stage of cervical cancer may exhibit varying prognoses, particularly individuals with large tumor size or those diagnosed with endocervical adenocarcinoma (Farley et al. 2003; Xie et al. 2018). Nomograms are assessment tools that quantify risks and benefits, allowing clinicians to predict the occurrence and progression of disease based on meaningful clinical indicators. Nomograms have found extensive application in anticipating the prognosis of individuals with diverse cancer types, such as gastric cancer, lung cancer, and breast cancer (Pan et al. 2019; Yang et al. 2022; Yu and Zhang 2019). Herein, we established and verified the accuracy of nomogram models for anticipating the overall survival (OS) rate at 1, 3, and 5 years and the cancer-specific survival (CSS) rate for CA patients. These models can provide clinicians with more accurate and personalized prognosis estimates, helping them create reliable clinical treatment plans for CA patients.

Methods

Data source and patients

This study included patients diagnosed with cervical adenocarcinoma between 2004 and 2015. The data were obtained from the Surveillance, Epidemiology, and End Results (SEER) database of the National Cancer Institute using SEER*Stat software (version 8.4.0.1; http://seer.cancer.gov/). The researchers obtained authorization and consent from the SEER program to access and use the data, and patient privacy was protected throughout the process.

The inclusion criteria were: (1) site recode ICD-O-3/WHO 2008: Cervix Uteri; (2) ICD-O-3 histologic type codes 8089/3, 8140/3, 8144/3, 8200/3, 8210/3, 8244/3, 8255/3, 8260/3, 8261/3, 8262/3, 8263/3, 8310/3, 8313/3, 8323/3, 8384/3, 8441/3, 8460/3, 8480/3, 8481/3, 8482/3, and 8574/3; (3) year of diagnosis 2004–2015; (4) known CS tumor size (2004–2015); and (5) known cause of death and survival duration. The exclusion criteria were: (1) unknown survival time; (2) unknown AJCC stage; (3) unknown TNM stage; (4) unknown SEER cause-specific death classification; (5) unknown grade; and (6) unknown tumor size. The screening flowchart is shown in Fig. 1. Overall, 4485 patients were randomly assigned to a training cohort (n = 2679) and a testing cohort (n = 1806) in a 3:2 ratio.
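The random assignment to cohorts can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name `split_cohort`, the seed, and the integer patient identifiers are assumptions, and only the cohort sizes come from the text.

```python
import random

def split_cohort(patient_ids, n_train, seed=0):
    """Randomly assign patients to a training and a testing cohort.

    n_train is the size of the training cohort. The seed is an arbitrary
    value chosen here so that the split is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers 0..4484 standing in for the SEER records.
train, test = split_cohort(range(4485), n_train=2679)
print(len(train), len(test))  # 2679 1806
```

Fixing the seed makes the split repeatable, which matters when the same cohorts are reused for model fitting and validation.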

Fig. 1

Flow chart for screening patients with cervical adenocarcinoma. AJCC American Joint Committee on Cancer; TNM tumor node metastasis stages

Study variables

The SEER database provided the clinical variables for this study: age at diagnosis, race, histologic type, TNM stage, AJCC stage, chemotherapy, radiotherapy, histologic grade, tumor size (in cm), cause-specific death classification, vital status recode, and survival duration. Age was categorized by the investigators as < 30, 30–50, 50–70, or ≥ 70 years. Treatment and tumor variables included radiotherapy (yes or no/unknown), chemotherapy (yes or no/unknown), and tumor size (< 2 cm, 2–4 cm, or ≥ 4 cm). Tumor grades I–IV were classified as well differentiated, moderately differentiated, poorly differentiated, and undifferentiated, respectively.
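The categorization described above can be expressed as simple binning functions. This is a sketch only: the function names are hypothetical, and because the stated ranges share their end points (30–50 and 50–70; 2–4 cm and ≥ 4 cm), the code assumes that each boundary value belongs to the higher group.

```python
def age_group(age_years):
    """Map age at diagnosis to the four age categories."""
    if age_years < 30:
        return "<30"
    if age_years < 50:   # assumption: 30 <= age < 50
        return "30-50"
    if age_years < 70:   # assumption: 50 <= age < 70
        return "50-70"
    return ">=70"

def tumor_size_group(size_cm):
    """Map tumor size in cm to the three size categories."""
    if size_cm < 2:
        return "<2 cm"
    if size_cm < 4:      # assumption: 2 <= size < 4
        return "2-4 cm"
    return ">=4 cm"

# Grade labels as given in the text.
GRADE_LABELS = {
    "I": "well differentiated",
    "II": "moderately differentiated",
    "III": "poorly differentiated",
    "IV": "undifferentiated",
}

print(age_group(45), tumor_size_group(3.5), GRADE_LABELS["II"])
# 30-50 2-4 cm moderately differentiated
```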

The primary outcome of this study was overall survival (OS), and the secondary outcome was cancer-specific survival (CSS). OS time is the time from diagnosis until death from any cause. If a patient was lost to follow-up, the last recorded follow-up time was typically used in place of the time of death. CSS time is the time from diagnosis until cancer-related death; deaths from other causes are not counted as events.
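One common way to encode the two endpoints for survival analysis is shown below. It is a sketch under stated assumptions: the field names are hypothetical and are not SEER variable names, patients alive at last contact are treated as censored, and for CSS a death from another cause is also treated as censored. The text does not specify these details.

```python
from dataclasses import dataclass

@dataclass
class Record:
    # Field names are hypothetical, not actual SEER variable names.
    survival_months: int   # diagnosis to death or last contact
    dead: bool             # vital status at last contact
    cancer_death: bool     # death attributed to cervical cancer

def os_event(r: Record) -> int:
    """OS event indicator: 1 for death from any cause, else 0."""
    return int(r.dead)

def css_event(r: Record) -> int:
    """CSS event indicator: 1 for cancer-related death, else 0."""
    return int(r.dead and r.cancer_death)

records = [
    Record(60, dead=False, cancer_death=False),  # alive at last contact
    Record(24, dead=True, cancer_death=True),    # died of cervical cancer
    Record(48, dead=True, cancer_death=False),   # died of another cause
]
for r in records:
    print(r.survival_months, os_event(r), css_event(r))
# prints:
# 60 0 0
# 24 1 1
# 48 1 0
```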

Statistical analysis

Categorical variables were presented as frequencies (n) and percentages (%). Univariate and multivariate Cox analyses were performed in the training cohort to identify independent prognostic factors and to develop nomograms for overall survival (OS) and cancer-specific survival (CSS). Variables with P-values less than 0.05 in both the univariate and multivariate analyses were deemed statistically significant. The Cox analysis was performed using SPSS software (version 25.0, Chicago, IL, USA).
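For reference, the standard form of the Cox proportional hazards model that underlies these analyses is given here. The notation (covariate vector $x_i$, coefficient vector $\beta$, event indicator $\delta_i$, and risk set $R(t_i)$ of patients still under observation at time $t_i$) is introduced for illustration and does not appear in the original text; tied event times are ignored for simplicity.

```latex
h(t \mid x_i) = h_0(t)\,\exp\!\left(\beta^\top x_i\right),
\qquad
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ \beta^\top x_i - \log \sum_{j \in R(t_i)} \exp\!\left(\beta^\top x_j\right) \right]
```

Here $h_0(t)$ is the unspecified baseline hazard and $\ell(\beta)$ is the log partial likelihood maximized to estimate $\beta$. For a single covariate $k$, $\exp(\beta_k)$ is the hazard ratio associated with a one-unit increase in that covariate, holding the others fixed.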

The prognostic nomograms were developed using LASSO regression applied to the variables identified in the multivariate Cox analysis of the training cohort. The concordance index (C-index) and calibration curves were used to assess the prognostic accuracy of the model. In addition, the predictive performance of the nomograms for 1-, 3-, and 5-year OS was assessed using receiver operating characteristic (ROC) curves. The nomograms, C-index, calibration curves, and ROC curves were generated with R software (version 4.2.1) in RStudio.
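To make the C-index concrete, the following is a minimal pure-Python sketch of Harrell's concordance index for right-censored data. It is illustrative only: the study computed the C-index in R, the function name and toy data here are assumptions, and pairs with tied survival times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i has the shorter time and
    an observed event. The pair is concordant when patient i also has
    the higher risk score; tied scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): months, event flags, model risk scores.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.2]
print(concordance_index(times, events, risk))  # 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk.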

Results

Patient baseline characteristics

A total of 4485 patients diagnosed with cervical adenocarcinoma between 2004 and 2015 were identified in the SEER database and randomly assigned to a training cohort (n = 2679) and a testing cohort (n = 1806) in a 3:2 ratio. Across all patients, 216 (4.82%) were aged < 30 years, 2520 (56.19%) were 30–50, 1387 (30.93%) were 50–70, and 362 (8.07%) were ≥ 70. By race, 3708 (82.68%) patients were white, 255 (5.69%) were black, and 522 (11.64%) were of other races. By AJCC stage, 3047 (67.94%), 442 (9.86%), 645 (14.38%), and 351 (7.83%) patients had stage I, II, III, and IV disease, respectively. The two chemotherapy groups comprised 1617 (36.05%) and 2868 (63.95%) patients, and the two radiotherapy groups comprised 1953 (43.55%) and 2532 (56.45%) patients (see Table 1 for the group definitions). By tumor grade, 1425 (31.77%), 1851 (41.27%), 1056 (23.55%), and 153 (3.41%) patients had grade I, II, III, and IV tumors, respectively. Tumor size was < 2 cm in 1790 (39.91%) patients, 2–4 cm in 1284 (28.63%), and ≥ 4 cm in 1411 (31.46%). Chi-square tests showed no significant differences between the training and testing cohorts for any variable (all P > 0.05). Table 1 presents the baseline characteristics of the enrolled patients.

Table 1 The demographics and clinical characteristics of patients in training cohort and testing cohort

Exploring independent prognosis-predictive factors of OS within the training set

In the univariate analysis, all variables except for “unmarried or domestic partner” in the marriage group were significantly associated with overall survival (OS), as shown in Table 2. In the multivariate analysis for OS, ten variables (age, race, AJCC T, N, and M stages, SEER stage, chemotherapy, radiotherapy, grade, and tumor size) were independent prognostic factors. LASSO regression was then applied to these 10 variables in the training set to build a predictive model. Ultimately, seven variables (age, AJCC T, N, and M stages, SEER stage, grade, and tumor size) were retained as statistically significant factors and used to establish a nomogram that predicts 1-, 3-, and 5-year OS.
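The variable selection step can be written compactly. In its usual form, LASSO-penalized Cox regression maximizes the Cox log partial likelihood, written here as $\ell(\beta)$, minus an L1 penalty on the coefficients. The symbols and the absence of scaling constants are assumptions for illustration, since the text does not state the exact objective or software settings.

```latex
\hat{\beta}(\lambda) = \arg\max_{\beta}
  \left\{ \ell(\beta) - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\}
```

Larger values of the tuning parameter $\lambda$ shrink more coefficients to exactly zero. Variables whose coefficients remain nonzero at the value of $\lambda$ chosen by cross-validation are the ones retained for the nomogram.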

Table 2 Association between variables and overall survival

Construction of prognostic nomograms

Nomograms were developed to predict 1-, 3-, and 5-year OS in the training set based on the statistically significant prognostic factors identified through LASSO regression analysis (Figs. 2 and 3). The nomogram assigns each variable a score according to its value on the corresponding point scale and sums these scores to obtain the total score. To estimate the probability of 1-, 3-, and 5-year OS for an individual patient, a straight line is drawn from the “total point” axis to the respective survival axes. Age and T stage were more important predictive factors than the other variables in the OS nomogram.
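The scoring procedure just described amounts to a table lookup, a sum, and a reading from an axis. The sketch below illustrates that logic only. Every number in it is hypothetical: the point values and axis tick marks are invented for the example and are not taken from Fig. 3, and reading the axis by linear interpolation between ticks is an approximation.

```python
# Hypothetical point assignments; real values must be read from the nomogram.
POINTS = {
    "age": {"<30": 0, "30-50": 10, "50-70": 40, ">=70": 100},
    "t_stage": {"T1": 0, "T2": 35, "T3": 60, "T4": 90},
    "grade": {"I": 0, "II": 10, "III": 30, "IV": 40},
}

# Hypothetical tick marks of a 5-year OS axis: (total points, probability).
OS5_AXIS = [(0, 0.95), (100, 0.80), (200, 0.40), (300, 0.05)]

def total_points(patient):
    """Sum the points for each variable level of one patient."""
    return sum(POINTS[var][level] for var, level in patient.items())

def read_axis(points, axis):
    """Linearly interpolate a probability between neighbouring ticks."""
    if points <= axis[0][0]:
        return axis[0][1]
    for (x0, y0), (x1, y1) in zip(axis, axis[1:]):
        if points <= x1:
            return y0 + (y1 - y0) * (points - x0) / (x1 - x0)
    return axis[-1][1]

patient = {"age": "50-70", "t_stage": "T2", "grade": "III"}
score = total_points(patient)
print(score)                                  # 105
print(round(read_axis(score, OS5_AXIS), 2))   # 0.78
```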

Fig. 2

Selection of significant parameters among the clinicopathologic variables in the training set and definition of the linear predictor. Seven-fold cross-validation was used for tuning parameter selection in the LASSO model. The LASSO was used for regression of high-dimensional predictors. The method uses an L1 penalty to shrink some regression coefficients to exactly zero. The binomial deviance curve was plotted versus log(λ), where λ is the tuning parameter. LASSO least absolute shrinkage and selection operator

Fig. 3

Nomograms for predicting 1-, 3-, and 5-year OS in the training cohort. Grade: I, well differentiated; II, moderately differentiated; III, poorly differentiated; IV, undifferentiated or anaplastic. OS overall survival

Validation and calibration of the nomogram for OS and CSS

The nomogram was validated using the C-index. It predicted OS with a C-index of 0.832 (95% CI, 0.817–0.847) in the training cohort and 0.823 (95% CI, 0.805–0.841) in the testing cohort, suggesting that it is a promising tool for predicting OS in CA patients. Calibration plots comparing the nomogram with an ideal curve showed strong concordance between predicted and observed values in both the training and testing cohorts (Fig. 4).

Fig. 4

Calibration plots of 1-, 3-, and 5-year OS prediction in training cohort (A, B, C) and testing cohort (D, E, F)

Furthermore, we evaluated the performance of the model in predicting OS in the testing cohort and assessed its predictive value for CSS. In the training cohort, the areas under the ROC curve (AUC) for 1-, 3-, and 5-year OS were 0.917, 0.885, and 0.867, respectively; in the testing cohort, the corresponding values were 0.893, 0.873, and 0.859. The AUC values for CSS at 1, 3, and 5 years were 0.927, 0.893, and 0.874 in the training cohort and 0.904, 0.884, and 0.878 in the testing cohort, respectively (Fig. 5).
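An AUC at a fixed time horizon can be illustrated with a simplified calculation. In this sketch, which is not necessarily the method used in the study, patients with an observed event by the horizon are cases, patients still under observation beyond the horizon are controls, and patients censored before the horizon are excluded. The function name and toy data are assumptions.

```python
def auc_at_horizon(times, events, scores, horizon):
    """Simplified time-specific AUC for a risk score.

    Cases: event observed at or before the horizon.
    Controls: follow-up time beyond the horizon.
    Patients censored before the horizon are left out.
    The AUC is the fraction of case-control pairs in which the case has
    the higher score, with tied scores counted as one half.
    """
    cases = [s for t, e, s in zip(times, events, scores)
             if e == 1 and t <= horizon]
    controls = [s for t, s in zip(times, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Toy data (hypothetical): survival in months, event flags, risk scores.
times = [6, 20, 40, 70, 80]
events = [1, 1, 0, 1, 0]
scores = [0.9, 0.5, 0.7, 0.3, 0.2]
auc_3y = auc_at_horizon(times, events, scores, horizon=36)
print(round(auc_3y, 3))  # 0.833
```

Excluding early-censored patients is the simplest choice; dedicated time-dependent ROC methods handle censoring with weighting instead.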

Fig. 5

ROC curves verifying the predictive value of the nomogram. ROC curves for 1-, 3-, and 5-year OS in the training cohort (A) and testing cohort (C). ROC curves for 1-, 3-, and 5-year CSS in the training cohort (B) and testing cohort (D)

Clinical applicability of the nomogram

To assess the practicality and effectiveness of the prognostic nomogram, patients were classified into high-risk and low-risk groups according to their prognostic scores. The optimal cutoff value was the score that maximized the Youden index on the 5-year ROC curve. Kaplan–Meier survival analysis for OS and CSS was performed in both the training and testing cohorts. Patients with high-risk prognostic scores had significantly reduced OS and CSS in both cohorts (P < 0.05) (Fig. 6). Furthermore, age and AJCC stage were selected as important clinical prognostic factors, and patients were categorized into high-risk and low-risk groups according to the corresponding prognostic scores for further validation. The survival curves of the different age and AJCC stage groups showed similar trends (P < 0.05) (Fig. 7). The median OS and CSS of the high-risk and low-risk groups are shown in Table S1 and Table S2.
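Selecting a cutoff by the maximum Youden index can be sketched as follows. The Youden index is J = sensitivity + specificity − 1, evaluated at each candidate cutoff. The function name, the convention that scores at or above the cutoff are high risk, and the toy data are assumptions made for illustration.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels are 1 for patients with the event (for example, death within
    5 years) and 0 otherwise. A score >= cutoff is called high risk.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both outcome classes")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Toy data (hypothetical): nomogram total points and 5-year outcome.
scores = [50, 80, 120, 150, 200, 230]
labels = [0, 0, 0, 1, 0, 1]
cutoff, j = youden_cutoff(scores, labels)
print(cutoff, j)  # 150 0.75
```

Patients with a total score at or above the returned cutoff would then form the high-risk group for the Kaplan–Meier comparison.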

Fig. 6

Kaplan–Meier curves of OS and CSS in the training cohort (A, C) and testing cohort (B, D)

Fig. 7

Kaplan–Meier curves of OS and CSS based on age in the training cohort (A, B, I, J) and testing cohort (C, D, K, L). Kaplan–Meier curves of OS and CSS based on AJCC stage in the training cohort (E, F, M, N) and testing cohort (G, H, O, P)

Discussion

Cervical cancer is a prevalent malignancy in women, ranking fourth in incidence among female malignant tumors (Small et al. 2017; Johnson et al. 2019). Globally, more than 500,000 women are diagnosed with cervical cancer each year, and more than 300,000 die of the disease (Cohen et al. 2019). It remains one of the primary cancer burdens worldwide. While cervical cancer screening and HPV vaccination have reduced the incidence in many countries, the incidence of cervical adenocarcinoma has been steadily increasing (Bray et al. 2005; de Martel et al. 2017). Moreover, patients with adenocarcinoma tend to be younger than those with squamous cell carcinoma, and there is mounting evidence that they are more likely to still be of reproductive age (Loureiro and Oliva 2014). The reason for the rise in adenocarcinoma incidence is not entirely clear, but it may relate to the internal growth pattern of endocervical glands, which hinders accurate measurement of invasion depth and diagnosis (Rivera-Colon et al. 2020). The FIGO staging system remains an essential standard for determining cervical cancer treatment. However, patients with the same clinical stage may exhibit different behaviors and survival outcomes because of different histologic subtypes. In 2013, Silva et al. proposed a new model for risk stratification of cervical adenocarcinoma based on the pattern of stromal invasion of the tumor (Silva classification) (Roma et al. 2015). This model is reproducible and predicts the risk of lymph node metastasis better than the FIGO clinical staging criteria. However, the Silva classification is still in its early stages and requires further clinicopathological studies for verification (Roma et al. 2015, 2016; Rutgers et al. 2016). Therefore, an individualized prediction model for cervical adenocarcinoma is needed to accurately predict prognosis and optimize treatment planning.

In this study, we identified seven independent risk factors and provided a convenient nomogram prediction model to estimate individual probabilities of overall survival in cervical adenocarcinoma. We further explored this nomogram for cancer-specific survival in cervical adenocarcinoma.

According to the Cox analysis, age at diagnosis and T stage were independent predictors of OS, and they were also the two largest contributors to OS in the nomograms. Our study shows that the risk of poor prognosis increases with T stage. In the nomograms, age contributes the most to the total risk score for OS (Fig. 3). The impact of age on the prognosis of patients with cervical cancer remains a topic of debate. Some researchers conclude that the risk of poor survival outcomes increases with age (Chen et al. 1999; Quinn et al. 2019; Sharma et al. 2012; Wright et al. 2005; Yagi et al. 2019). Older patients are less likely to receive aggressive treatment or may refuse it altogether. Conversely, other research indicates that younger patients have less favorable prognoses and lower survival rates, particularly in advanced stages (Lau et al. 2009). One possible explanation is that young women have a significantly higher incidence of cervical adenocarcinoma (Lee et al. 2011; Loureiro and Oliva 2014; Sasieni and Adams 2001), which is harder to detect and is frequently missed by cytology screening (Kiser and Butler 2020). Contributing pathogenic factors include infection with human papillomavirus (HPV) types 16, 18, and 45, oral contraceptive use, hormone replacement therapy, obesity, and others (Dahlström et al. 2010). Therefore, age must be considered when analyzing the prognosis of CA. Radiotherapy and chemotherapy have also played an important role in the treatment of cervical cancer, thanks to advances in medical technology and treatment plans.

Table 2 shows that radiotherapy and chemotherapy were each independent prognostic factors for OS. Interestingly, radiotherapy was associated with a poorer prognosis for OS. One possible explanation is that the long-term survival of CA patients is negatively affected by the side effects of radiotherapy, which can be severe and may occur several years after treatment (Bryant et al. 2017). Recent evidence suggests that neoadjuvant chemoradiotherapy followed by radical surgery is more effective at improving OS and progression-free survival (PFS) than concomitant chemotherapy and radiotherapy (CCRT) for locally advanced cervical adenocarcinoma (Tian et al. 2021). In a study by Suzuki et al., adjuvant CCRT following radical hysterectomy did not significantly improve survival in patients with FIGO stage IIIC1 cervical adenocarcinoma (Suzuki et al. 2021). Another retrospective study found that although adjuvant radiotherapy decreased the recurrence rate from 44% to 9% in patients with adenocarcinomas and adenosquamous carcinomas, it did not provide a significant survival benefit (Galic et al. 2012). Although combining chemotherapy and radiotherapy is not a new concept, the optimal chemotherapy regimen and the optimal sequence of radiotherapy and chemotherapy remain unclear, especially for cervical adenocarcinoma (Benard et al. 2017). Therefore, new treatment approaches should be considered for this cancer type.

In the nomograms for OS, TNM stage, SEER stage, tumor grade, and tumor size were also identified as independent risk factors affecting the prognosis of CA patients. In the United States, black women have been observed to have lower survival rates than white women (Benard et al. 2017). The TNM system is the most widely used tumor staging system globally and is determined by laboratory and postoperative pathological examinations. It is an independent prognostic factor for patients with HPV-related cervical adenocarcinoma. However, TNM stage has limitations, as it does not provide a personalized prognosis for an individual patient. Tumor grade has also been reported as an independent prognostic factor in cervical cancer patients (Chung et al. 1981). Although few nomograms have been developed specifically to predict OS and CSS in CA patients, this study evaluated many influencing factors, and predictions can be made from all significant factors included in the nomogram. These nomograms will be useful in designing individualized treatments for CA patients.

This research has several limitations. First, it was a retrospective study based on the SEER database, which makes it susceptible to selection bias that may affect the results. More large-scale prospective studies are therefore required to verify the findings and explore potential prognostic factors. Second, the SEER database lacks information on specific treatments, such as endocrine therapy and chemotherapy regimens, cycles, and doses. In addition, family history, gene mutation status, and HER-2 status were unavailable and could not be included in this study. Third, there is a lack of external data to further validate the results of the competing risk model analysis and the predictive power of the nomogram. We therefore plan to use additional data sources from the United States and other countries to improve the model in subsequent analyses.

Conclusion

In this study, we identified independent prognostic factors for patients with cervical adenocarcinoma using the SEER database. We also constructed nomograms to predict 1-, 3-, and 5-year overall survival. The model demonstrated good predictive performance and may aid clinicians in evaluating survival prognosis and creating personalized treatment plans for these patients. However, because of the study’s retrospective nature, some degree of selection bias is present. Future research should focus on prospective studies to further investigate these findings.