Enhancing prognostic accuracy: a SEER-based analysis for overall and cancer-specific survival prediction in cervical adenocarcinoma patients

Background Cervical adenocarcinoma (CA) is the second most prevalent histological subtype of cervical cancer, following cervical squamous cell carcinoma (CSCC). According to the National Comprehensive Cancer Network guidelines, the two subtypes are staged and treated similarly. However, compared with CSCC patients, CA patients are more prone to lymph node metastasis and recurrence and have a poorer prognosis. The objective of this research was to identify prognostic indicators and develop nomograms to predict the overall survival (OS) and cancer-specific survival (CSS) of patients diagnosed with CA.
Methods Using the Surveillance, Epidemiology, and End Results (SEER) database, individuals with CA who received their diagnosis between 2004 and 2015 were identified. The total cohort (n = 4485) was randomly divided in a 3:2 ratio into a training cohort (n = 2679) and a testing cohort (n = 1806). Overall survival (OS) was the primary outcome measure and cancer-specific survival (CSS) was the secondary outcome measure. Univariate and multivariate Cox analyses were employed to select significant independent factors, and Least Absolute Shrinkage and Selection Operator (LASSO) Cox regression analysis was utilized to develop predictive nomogram models. The predictive accuracy and discriminatory ability of the nomograms were assessed using the calibration curve, the receiver operating characteristic (ROC) curve, and the concordance index (C-index).
Results Age, Tumor Node Metastasis stages (T, N, and M), SEER stage, grade, and tumor size were identified as common independent predictors of both OS and CSS. The C-index of the nomograms for predicting OS was 0.832 (95% CI 0.817–0.847) in the training cohort and 0.823 (95% CI 0.805–0.841) in the testing cohort.
Conclusion We developed and verified nomogram models for predicting 1-, 3-, and 5-year OS and CSS among patients with cervical adenocarcinoma. These models exhibited excellent performance in prognostic prediction and can assist clinicians in assessing survival prognosis and devising personalized treatments for CA patients.
Supplementary Information The online version contains supplementary material available at 10.1007/s00432-023-05399-2.


Introduction
Cervical cancer is the fourth most prevalent malignancy among women globally (Small et al. 2017). In 2020, the incidence of cervical cancer was approximately 13.3 per 100,000 females, and mortality was 7.2 per 100,000 females, with higher rates among young women aged 20-39 years (roughly nine deaths per week) (Cohen et al. 2019; Siegel et al. 2019). Cervical adenocarcinoma (CA), accounting for approximately 10-25% of all cervical cancer cases, is the second most prevalent subtype after cervical squamous cell carcinoma (CSCC) (Rivera-Colon et al. 2020). However, as cervical screening methods improve, particularly among young women under 40, CSCC rates are declining while CA rates are increasing (Bray et al. 2005). Moreover, CA patients are more likely to experience lymph node metastasis and recurrence with a worse prognosis (Irie et al. 2000).
The current approaches to treatment and factors influencing prognosis for cervical cancer are primarily determined by the 7th edition of the Cancer Staging Manual by the American Joint Committee on Cancer (AJCC 7th), as well as the staging guidelines provided by the International Federation of Gynecology and Obstetrics in 2018 (Bhatla and Denny 2018). However, increasing evidence suggests that those who share a similar stage of cervical cancer may exhibit varying prognoses, particularly individuals with large tumor size or those diagnosed with endocervical adenocarcinoma (Farley et al. 2003; Xie et al. 2018). Nomograms are assessment tools that quantify risks and benefits, allowing clinicians to predict the occurrence and progression of disease based on meaningful clinical indicators. Nomograms have found extensive application in anticipating the prognosis of individuals with diverse cancer types, such as gastric cancer, lung cancer, and breast cancer (Pan et al. 2019; Yang et al. 2022; Yu and Zhang 2019). Herein, we established and verified the accuracy of nomogram models for anticipating the overall survival (OS) rate at 1, 3, and 5 years and the cancer-specific survival (CSS) rate for CA patients. These models can provide clinicians with more accurate and personalized prognosis estimates, helping them create reliable clinical treatment plans for CA patients.

Data source and patients
This study encompassed individuals who received a diagnosis of cervical adenocarcinoma during the period between 2004 and 2015. The data were procured from the SEER database of the National Cancer Institute employing SEER*Stat software (version 8.4.0.1; http://seer.cancer.gov/). The researchers acquired official authorization and explicit consent from the SEER program to access and utilize the data, ensuring that patient privacy was protected throughout the entire process.

Study variables
The SEER database provided clinical variables for this study, including age at diagnosis, race, histologic type, TNM stage, AJCC stage, chemotherapy, radiotherapy, histologic grade, tumor size (in cm), classification of cause-specific mortality, recoding of vital status, and the duration of survival. Age was subjectively categorized as less than 30, 30-50, 50-70, or 70 or older. Treatment and tumor variables included radiotherapy (yes or no/unknown), chemotherapy (yes or no/unknown), and tumor size (< 2 cm, 2-4 cm, or 4 cm or greater). Tumor grades I-IV were further classified as well differentiated, moderately differentiated, poorly differentiated, or undifferentiated.
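The categorization described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the handling of boundary values (for example, whether a 50-year-old falls in the 30-50 or the 50-70 band) is an assumption, since the text does not specify it.

```python
def age_group(age_years: float) -> str:
    """Map age at diagnosis to one of the four age bands.

    Assumption: lower bounds are inclusive, upper bounds exclusive.
    """
    if age_years < 30:
        return "<30"
    if age_years < 50:
        return "30-50"
    if age_years < 70:
        return "50-70"
    return ">=70"


def size_group(size_cm: float) -> str:
    """Map tumor size in cm to one of the three size bands.

    Assumption: a tumor of exactly 4 cm is placed in the largest band,
    matching the "4 cm or greater" wording.
    """
    if size_cm < 2:
        return "<2 cm"
    if size_cm < 4:
        return "2-4 cm"
    return ">=4 cm"
```

Making the boundary rule explicit in code avoids ambiguity when the same bands are reapplied to a validation cohort.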
The primary outcome measure of this study was overall survival (OS), while the secondary outcome measure was cancer-specific survival (CSS). OS time refers to the length of time a patient survives from diagnosis until death from any cause. If a patient was lost to follow-up, their last recorded time was typically used as their time of death. CSS time, on the other hand, represents the length of time a patient survives with cancer from diagnosis until cancer-related death, excluding other factors that may have contributed to their death.
Fig. 1 Flow chart for screening patients with cervical adenocarcinoma. AJCC American Joint Committee on Cancer; TNM tumor node metastasis stages

Statistical analysis
Categorical variables were presented as frequencies (n) and percentages (%). Univariate and multivariate Cox analyses were performed in the training cohort to identify independent prognostic determinants and to develop nomograms for assessing overall survival (OS) and cancer-specific survival (CSS). Variables displaying P-values less than 0.05 in both the univariate and multivariate analyses were deemed statistically significant. The Cox analysis was performed using SPSS software (version 25.0, Chicago, IL, USA).
The prognostic nomograms were developed through LASSO regression, utilizing the findings of the multivariate Cox analysis in the training cohort. To assess the prognostic accuracy of the model, the concordance index (C-index) and calibration curve were employed. Furthermore, the predictive performance of the nomograms for 1-, 3-, and 5-year OS was assessed using the receiver operating characteristic (ROC) curve. R software (version 4.2.1) in RStudio was used to construct the nomograms and to compute the C-index, calibration curves, and ROC curves.
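To make the C-index concrete, the following is a minimal pure-Python sketch of Harrell's concordance index. It is not the authors' code (the study used R), and the function name and toy data are hypothetical. As a simplifying assumption, pairs with tied survival times are skipped, and tied risk scores count as half-concordant.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) strictly before patient j's follow-up time.
    The pair is concordant when i also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up in months, 1 = death, 0 = censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.7, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-index values should be read.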

Exploring independent prognosis-predictive factors of OS within the training set
In the univariate analysis, all variables except for "unmarried or domestic partner" in the marriage group were found to be significant regarding overall survival (OS), as shown in Table 2. In the multivariate analysis for OS, ten variables (age, race, AJCC T, N, and M stages, SEER stage, chemotherapy, radiotherapy, grade, and tumor size) were found to be independent prognostic factors. LASSO regression was then used to build a predictive model from these ten variables in the training set. Ultimately, seven variables (age, AJCC T, N, and M stages, SEER stage, grade, and tumor size) were retained as statistically significant factors and used to establish a nomogram that predicts 1-, 3-, and 5-year OS.
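The way LASSO drops variables can be illustrated with the soft-thresholding operator. In the special case of orthonormal predictors and squared-error loss, the LASSO estimate is the unpenalized coefficient shrunk toward zero by the penalty λ and set to exactly zero when its magnitude is at most λ. The LASSO Cox model used in this study is fitted iteratively rather than in closed form, so the sketch below shows only the mechanism. The coefficient values and names are hypothetical.

```python
from typing import Dict


def soft_threshold(beta: float, lam: float) -> float:
    """Shrink beta toward zero by lam; return exactly 0.0 if |beta| <= lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


def lasso_orthonormal(coefs: Dict[str, float], lam: float) -> Dict[str, float]:
    """Apply soft-thresholding to each unpenalized coefficient.

    Valid as a closed form only under the orthonormal-design assumption.
    """
    return {name: soft_threshold(b, lam) for name, b in coefs.items()}


# Hypothetical unpenalized coefficients, for illustration only.
unpenalized = {"age": 0.8, "race": 0.05, "grade": -0.3}
shrunk = lasso_orthonormal(unpenalized, lam=0.1)
selected = [name for name, b in shrunk.items() if b != 0.0]
```

Here the small coefficient is removed while the larger ones are retained in shrunken form, which mirrors how ten candidate variables can be reduced to seven.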

Construction of prognostic nomograms
Nomograms were developed to predict 1-, 3-, and 5-year OS in the training set based on the statistically significant prognostic factors identified through LASSO regression analysis (Figs. 2 and 3). The nomogram assigns points to each variable according to its value and then sums the points of all variables to obtain the total score. To obtain the predicted probability of 1-, 3-, and 5-year OS for an individual patient, a straight line is drawn from the "total point" axis to the respective "survival axes". Age and T stage were found to be more important predictive factors than other variables in the OS nomogram.
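The point-scoring step can be sketched as follows, under one common convention: each level's points are proportional to its model coefficient, scaled so that the variable with the widest coefficient range spans 0 to 100 points. The coefficients, variable names, and function names below are hypothetical and are not taken from the fitted model in this study.

```python
from typing import Dict

# variable -> level -> coefficient (log hazard ratio, reference level = 0.0)
Coefs = Dict[str, Dict[str, float]]


def build_point_table(coefs: Coefs) -> Dict[str, Dict[str, float]]:
    """Convert coefficients into nomogram points on a 0-100 scale."""
    ranges = {
        var: max(levels.values()) - min(levels.values())
        for var, levels in coefs.items()
    }
    widest = max(ranges.values())
    if widest == 0:
        raise ValueError("all coefficients are identical; nothing to scale")
    return {
        var: {
            level: 100.0 * (beta - min(levels.values())) / widest
            for level, beta in levels.items()
        }
        for var, levels in coefs.items()
    }


def total_points(table: Dict[str, Dict[str, float]], patient: Dict[str, str]) -> float:
    """Sum the points of each variable level observed for one patient."""
    return sum(table[var][level] for var, level in patient.items())


# Hypothetical coefficients, for illustration only.
example_coefs: Coefs = {
    "age": {"<30": 0.0, "30-50": 0.2, "50-70": 0.8, ">=70": 1.6},
    "t_stage": {"T1": 0.0, "T2": 0.6, "T3": 1.0, "T4": 1.2},
}
point_table = build_point_table(example_coefs)
example_total = total_points(point_table, {"age": ">=70", "t_stage": "T2"})
```

Under this convention, the variable whose axis spans the most points contributes most to the total score, which is the sense in which age and T stage dominate the OS nomogram.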

Validation and calibration of the nomogram for OS and CSS
The nomogram was validated using the C-index. Figure 4 demonstrates that the nomogram predicted OS with a C-index of 0.832 (95% CI, 0.817-0.847) in the training cohort and a C-index of 0.823 (95% CI, 0.805-0.841) in the testing cohort. These findings imply that the nomogram exhibits potential as a promising tool for anticipating OS in patients. Calibration plots were constructed to compare the performance of the nomogram with an ideal curve, and the outcomes demonstrated strong concordance between the predicted values from the nomogram and the actually observed values in both the training and testing cohorts.
Furthermore, we evaluated the performance of the model in predicting overall survival (OS) in the testing cohort and assessed its predictive value for cancer-specific survival (CSS). In the training cohort, the area under the receiver operating characteristic (ROC) curve (AUC) for 1-, 3-, and 5-year OS was 0.917, 0.885, and 0.867, respectively. Similarly, in the testing cohort, the corresponding AUC values were 0.893, 0.873, and 0.859. Additionally, the AUC values for CSS at 1, 3, and 5 years were 0.927, 0.893, and 0.874 in the training set, and 0.904, 0.884, and 0.878 in the testing set, respectively (Fig. 5).

Clinical applicability of the nomogram
To assess the practicality and effectiveness of the prognosis-predictive nomogram, patients were classified into two groups, high-risk and low-risk, according to their prognostic scores. The optimal cutoff value was determined by selecting the maximum Youden index at 5 years using the ROC curve. Kaplan-Meier survival analysis for OS and CSS was performed for both training and testing cohorts, and a significant difference in prognosis was detected between the two groups. The results indicated that patients with high-risk prognostic scores had significantly reduced OS and CSS in both the training and testing cohorts (P < 0.05) (Fig. 6). Furthermore, age and AJCC stage were selected as important clinical prognostic factors and were categorized into two groups, namely high-risk and low-risk, according to their respective prognostic scores for further validation. The survival curves of different age and AJCC stage groups showed similar trends (P < 0.05) (Fig. 7). The median OS and CSS of the high-risk and low-risk groups are shown in Table S1 and Table S2.
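The cutoff selection can be sketched as follows. The Youden index is J = sensitivity + specificity - 1, and the chosen cutoff is the score threshold that maximizes J. As a simplifying assumption, this sketch treats the 5-year outcome as a plain binary label and ignores censoring, which a time-dependent ROC analysis would handle. The function name and toy data are hypothetical.

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    labels: 1 = event by the time horizon, 0 = no event.
    A patient is classified high-risk when score >= cutoff.
    If several cutoffs tie, the lowest one is returned.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")

    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        # sensitivity + specificity - 1 equals TPR - FPR
        j = true_pos / n_pos - false_pos / n_neg
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy nomogram scores and 5-year outcomes (hypothetical).
toy_scores = [10, 20, 30, 40, 50, 60]
toy_labels = [0, 0, 1, 0, 1, 1]
cutoff, j_stat = youden_cutoff(toy_scores, toy_labels)
risk_groups = ["high" if s >= cutoff else "low" for s in toy_scores]
```

The resulting high-risk and low-risk labels are what would then be compared with Kaplan-Meier curves.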

Discussion
Cervical cancer is a prevalent form of malignant tumor in women, ranking fourth in the incidence of female malignant tumors (Small et al. 2017; Johnson et al. 2019). Globally, more than 500,000 women are diagnosed with cervical cancer each year, and more than 300,000 die of the disease (Cohen et al. 2019). It continues to be one of the primary burdens of cancer worldwide. While cervical cancer screening and HPV vaccination have reduced the incidence in many countries, the incidence of cervical adenocarcinoma has been steadily increasing (Bray et al. 2005; de Martel et al. 2017). Moreover, patients with adenocarcinoma tend to be younger than those with squamous cell carcinoma, and there is mounting evidence that they are more likely to still be in the reproductive stage (Loureiro and Oliva 2014).
The reason for the rise in adenocarcinoma incidence is not entirely clear, but it may relate to the internal growth pattern of endocervical glands, which hinders accurate measurement of invasion depth and diagnosis (Rivera-Colon et al. 2020). The FIGO staging system remains an essential standard for determining cervical cancer treatment. Patients with the same clinical stage may exhibit different behaviors and survival outcomes due to different histologic subtypes. In 2013, Silva et al. proposed a new model for risk stratification of cervical adenocarcinoma based on the pattern of stromal invasion of the tumor (Silva classification) (Roma et al. 2015). This model is reproducible and better predicts the risk of lymph node metastasis compared with FIGO clinical staging criteria. However, Silva's classification is still in the early stages and requires further clinicopathological studies to verify (Roma et al. 2015, 2016; Rutgers et al. 2016). Therefore, developing an individual prediction model for cervical adenocarcinoma is necessary to accurately predict prognosis and optimize treatment planning. In this study, we identified seven independent risk factors and provided a convenient nomogram prediction model to evaluate individual probabilities for the overall survival of cervical adenocarcinoma. We further explored this nomogram for cancer-specific survival in cervical adenocarcinoma.
According to the Cox analysis, age at diagnosis and T stage are independent predictive variables for OS, and they were also the top two contributing factors for OS in the nomograms. Our study reveals that as T stage increases, the risk of poor prognosis also increases accordingly. In the nomograms, age contributes the most to the eventual risk score for OS (Fig. 2). Currently, the impact of age on the prognosis of patients with cervical cancer is a topic of debate. Some researchers support the conclusion that the risk of poor survival outcomes increases with age (Chen et al. 1999; Quinn et al. 2019; Sharma et al. 2012; Wright et al. 2005; Yagi et al. 2019). Older patients are less likely to receive aggressive treatment or may refuse it altogether. Conversely, other research indicates that younger patients experience less favorable prognoses and decreased survival rates, particularly in the advanced stages (Lau et al. 2009). One possible explanation is that young women have a significantly higher incidence of cervical adenocarcinoma (Lee et al. 2011; Loureiro and Oliva 2014; Sasieni and Adams 2001), which is harder to detect and has a high rate of missed diagnosis by cytology screening (Kiser and Butler 2020). The pathogenic factors behind this include infection by human papillomavirus (HPV) types 16, 18, and 45, oral contraceptive use, hormone replacement therapy, obesity, and others (Dahlström et al. 2010). Therefore, analyzing the prognosis of CA requires considering the factor of age. Radiotherapy and chemotherapy have also played an important role in the treatment of cervical cancer, thanks to advances in medical technology and treatment plans.
Table 2 shows that radiotherapy and chemotherapy were each independent prognostic factors for OS. Interestingly, radiotherapy was associated with a poorer prognosis for OS. It could be inferred that the long-term survival of CA patients may be negatively affected by the side effects of radiotherapy, which can often be severe and occur several years after treatment (Bryant et al. 2017). Recent evidence suggests that neoadjuvant chemoradiotherapy followed by radical surgery is more effective for improving OS and progression-free survival (PFS) than concomitant chemotherapy and radiotherapy (CCRT) for locally advanced cervical adenocarcinoma (Tian et al. 2021). In a study conducted by Suzuki et al., adjuvant CCRT following radical hysterectomy did not significantly improve survival in FIGO stage IIIC1 cervical adenocarcinoma patients (Suzuki et al. 2021). Another retrospective study found that while adjuvant radiotherapy decreased the recurrence rate from 44% to 9% in patients with adenocarcinomas and adenosquamous carcinomas, it did not provide any significant survival benefits (Galic et al. 2012). Although the concept of combining chemotherapy and radiotherapy is not new, the optimal chemotherapy regimen and the sequence in which to deliver radiotherapy and chemotherapy remain unclear, especially for cervical adenocarcinoma (Benard et al. 2017). Therefore, new treatment approaches should be considered for this cancer type.
Nomograms were used to predict overall survival (OS), and other variables such as TNM stage, SEER stage, tumor grade, and tumor size were identified as independent risk factors affecting the prognosis of CA patients. In the United States, it has been observed that black women exhibit lower survival rates than white women (Benard et al. 2017). TNM stage stands as the most prevalent tumor staging system globally and is determined by laboratory and postoperative pathological examinations. It is an independent prognostic factor for cervical HPV-related adenocarcinoma patients. However, the TNM stage has its limitations, as it does not provide personalized prognosis prediction for a patient. Tumor grade has also been reported as an independent prognostic factor in cervical cancer patients (Chung et al. 1981). Although few nomograms have been developed explicitly to anticipate OS and CSS among CA patients, this study evaluated many influencing factors, and accurate predictions can be made based on all significant factors included in the nomogram. The establishment of these nomograms will be useful in designing individualized treatments for CA patients.
There are several constraints to this research. Firstly, it was a retrospective study based on the SEER database, which makes it susceptible to selection biases that may affect the analysis results. As a result, more large-scale prospective studies are required to verify and explore potential prognostic factors. Secondly, the SEER database lacks information regarding specific treatments, such as endocrine therapy and chemotherapy programs, cycles, and doses. Family history, gene mutation status, and HER-2 status were likewise unavailable and could not be included in this study. Thirdly, there is a lack of external data to further validate the results of the competing risk model analysis and the predictive power of the nomogram. Therefore, we plan to utilize additional repositories in the United States and various countries to enhance the model during subsequent analyses.

Conclusion
In this study, we established independent prognostic factors for patients with cervical adenocarcinoma using the SEER database. We also constructed nomograms to predict 1-, 3-, and 5-year overall survival rates. The model demonstrated good predictive performance and may aid clinicians in evaluating survival prognosis and creating personalized treatment plans for these patients. However, because of the study's retrospective nature, there is some degree of selection bias. Future research should focus on prospective studies to further investigate these findings.

Fig. 2 Selection of significant parameters in clinicopathologic variables in the training set and definition of linear predictor. Seven-time cross-validation for tuning parameter selection in the LASSO model. The LASSO was used for regression of high dimensional predictors. The method uses an L1 penalty to shrink some regression coefficients to exactly zero. The binomial deviance curve was plotted versus log (λ), where λ is the tuning parameter. LASSO least absolute shrinkage and selection operator

Fig. 5 ROC curves verified the predictive value of nomogram. ROC of 1-, 3-, and 5-year OS in the training cohort (A) and testing cohort (C). ROC of 1-, 3-, and 5-year CSS in the training cohort (B) and testing cohort (D)

Fig. 6 Kaplan-Meier curves of OS and CSS in the training cohort (A, C) and testing cohort (B, D)

Fig. 7 Kaplan-Meier curves of OS and CSS based on age in the training cohort (A, B, I, J) and testing cohort (C, D, K, L). Kaplan-Meier curves of OS and CSS based on AJCC stage in the training cohort (E, F, M, N) and testing cohort (G, H, O, P)