Deep learning models for predicting the survival of patients with medulloblastoma based on a surveillance, epidemiology, and end results analysis

Sun, Meng; Sun, Jikui; Li, Meng

doi:10.1038/s41598-024-65367-9

Open access
Published: 24 June 2024

Volume 14, article number 14490, (2024)
Scientific Reports

Abstract

Medulloblastoma is a malignant neuroepithelial tumor of the central nervous system. Accurate prediction of prognosis is essential for therapeutic decisions in medulloblastoma patients. We analyzed data from 2,322 medulloblastoma patients in the SEER database and randomly divided the dataset into training and testing datasets in a 7:3 ratio. We built three models: one based on neural networks (DeepSurv), one based on ensemble learning (Random Survival Forest, RSF), and a typical Cox proportional-hazards (CoxPH) model. The DeepSurv model outperformed the RSF and classic CoxPH models, with C-indexes of 0.751 and 0.763 for the training and test datasets, respectively. Additionally, the DeepSurv model showed better accuracy in predicting 1-, 3-, and 5-year survival rates (AUC: 0.767–0.793). Therefore, our prediction model based on deep learning algorithms can predict the survival rate and survival period of patients with medulloblastoma more accurately than the other models.

Introduction

Medulloblastoma is an embryonal tumor that arises from the cerebellum and has the potential to spread throughout the nervous system. It is the most common type of paediatric embryonal tumor, with an incidence ranging from 5 to 11 cases per 1 million individuals^1,2. According to current international consensus, there are four subgroups of medulloblastoma: Wingless (WNT), Sonic Hedgehog (SHH), group 3 (G3), and group 4 (G4)³. Multimodal therapy, which includes surgery, external beam irradiation, and/or cytotoxic chemotherapy, can result in survival rates ranging from 50 to 80% based on clinical staging⁴. Certain prognostic features, such as age at diagnosis, extent of resection, histological subtype, and molecular subgroup classification, have been found to affect survival predictions in individual patients.

Previous studies have used the Cox proportional-hazards model (CoxPH) to evaluate the survival rate of medulloblastoma patients^5,6,7. This model incorporates survival outcomes and time as target variables, allowing for the simultaneous analysis of multiple factors' impact on survival time. It is extensively used for predicting outcome events when the survival distribution of the analyzed data is unknown⁸. A nomogram is a commonly used method for quantifying and combining important clinical characteristics of patients to calculate the probabilities of outcome events based on the CoxPH model⁹. However, the model assumes that each predictor variable has the same effect throughout the follow-up time, which ignores variations in their impact on individual patients at different times. Therefore, a new method is required to improve the accuracy of predicting the survival rate of cancer patients.
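For reference, the proportional-hazards assumption discussed above can be written in the standard textbook form of the Cox model. The symbols below (baseline hazard $h_0$, coefficient vector $\beta$, covariate vectors $x_i$ and $x_j$) are notation introduced here for illustration and do not appear in the article.

```latex
h(t \mid x_i) = h_0(t)\,\exp\!\left(\beta^{\top} x_i\right),
\qquad
\frac{h(t \mid x_i)}{h(t \mid x_j)} = \exp\!\left(\beta^{\top}(x_i - x_j)\right)
```

The hazard ratio on the right does not depend on $t$, which is the constant-effect assumption that the paragraph above identifies as a limitation.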

In recent years, advances in computer and information technology have given artificial intelligence (AI) revolutionary potential in the healthcare industry^10,11,12. Machine learning models have stronger nonlinear modeling capabilities than traditional linear models and can better capture complex relationships among clinical variables. The analysis of these models can provide accurate personalized survival predictions and decision-making support for treatment strategies to improve patient survival rates^13,14. Deep learning is a subfield of machine learning that involves discovering the distributed features of sample data by learning the underlying laws and levels of representation^15,16. Neural networks are at the heart of deep learning algorithms and consist of input, hidden, and output layers that can be used to solve complex, multi-factor, and non-linear problems. Owing to continuous advances in deep learning techniques and the abundance of biomedical big data, deep learning-based models have become highly effective predictors of clinical outcomes across various disease domains. Jiang et al.¹⁷ used an artificial neural network model to predict the survival of patients diagnosed with pancreatic neuroendocrine neoplasms from clinical information. Katzman et al.¹⁸ combined a Cox proportional-hazards framework with a multilayer neural network architecture, known as the DeepSurv model, resulting in a personalized treatment recommendation system that showed remarkable performance.

To our knowledge, there is a lack of research combining deep learning techniques with the study of medulloblastoma. Therefore, this study aimed to fill this research gap by utilizing data obtained from the Surveillance, Epidemiology, and End Results (SEER) database, which contains information on patients diagnosed with medulloblastoma in the United States. The DeepSurv model was then used to predict their survival.

Methods

Data source and patient selection

The data of this retrospective cohort study were obtained from the SEER database, which encompasses information from 18 cancer registries representing approximately 28% of the entire US population¹⁹. This database offers extensive and detailed patient data, including demographic characteristics, tumor-related information, cause of death, and survival duration. The SEER*Stat software (version 8.3.6) was used to identify patients with medulloblastoma. The dataset covering the years 2000 to 2019 in the United States was accessed.

The patients included in the study had to meet the following criteria: (1) a confirmed pathological diagnosis of medulloblastoma; (2) identification of medulloblastoma cases based on the third edition of the International Classification of Diseases for Oncology (ICD-O-3) using specific ICD-O-3 histology codes, including 9470/3 for medulloblastoma, NOS; 9471/3 for desmoplastic nodular medulloblastoma; and 9474/3 for large cell medulloblastoma. Furthermore, patients were required to have a known survival status and survival time. The eligible patients were then randomly divided into a training group and a testing group at a 7:3 ratio. A flowchart in Fig. 1 illustrates the process of patient selection.

Variable definitions

Several parameters were collected from the samples, including age at diagnosis, sex, race, histological type, tumor size, surgery, chemotherapy, radiation therapy, and survival time. To evaluate the prognostic value of age and tumor size in patients with medulloblastoma objectively, the patients were categorized into groups based on the optimal cutoff values obtained using the X-tile software (https://x-tile.software.informer.com, Yale School of Medicine, New Haven, CT, United States). Age was divided into ≤ 3 years and > 3 years, and tumor size into ≤ 3.4 cm, > 3.4 cm, and unknown. For detailed visual representations, please refer to Fig. 2.
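The grouping described above can be sketched in a few lines of Python. This is a minimal illustration that assumes age is recorded in years and tumor size in centimetres, with a missing size stored as `None`; the function names and label strings are hypothetical and are not taken from the study's code.

```python
from typing import Optional


def age_group(age_years: float) -> str:
    """Dichotomize age at the X-tile cutoff of 3 years reported in the text."""
    return "<=3" if age_years <= 3 else ">3"


def tumor_size_group(size_cm: Optional[float]) -> str:
    """Group tumor size at the 3.4 cm cutoff; a missing size becomes 'unknown'."""
    if size_cm is None:
        return "unknown"
    return "<=3.4" if size_cm <= 3.4 else ">3.4"


# Hypothetical patients: (age in years, tumor size in cm or None)
patients = [(2.0, 3.0), (3.0, None), (10.5, 4.2)]
groups = [(age_group(a), tumor_size_group(s)) for a, s in patients]
print(groups)
```

Keeping "unknown" as its own category, rather than dropping those rows, mirrors the three tumor-size groups reported in the text.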

Model development

This study selected three models for training: DeepSurv, RSF, and CoxPH. DeepSurv is a deep feedforward neural network used to predict patients’ survival time or survival probability. It employs a multi-layer neural network to capture the complex nonlinear relationship between patients’ survival probability and input features. This study utilized deep-learning calculations based on the DeepSurv calculation method described by Katzman et al.¹⁸ to predict the survival outcome of patients diagnosed with medulloblastoma. The term RSF refers to Random Survival Forests, which is a survival analysis method based on random forests. When constructing a random survival forest, subsets of samples and features are randomly selected, and multiple decision trees are built using these subsets²⁰. Each decision tree splits the samples based on features in the nodes and determines the optimal splitting based on the evaluation of survival time differences. The predictions from multiple decision trees in the random survival forests are combined to obtain the final survival prediction. The CoxPH is a semi-parametric regression model used to analyse survival data and estimate the risk of event occurrence. The Cox proportional-hazards model is used to compare the relative risks of events between different groups and study the impact of various factors on event occurrence. The model functions by modeling the relationship between time and event occurrence as a function of hazard ratios.
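To make the link between CoxPH and DeepSurv-style networks concrete, the sketch below computes the negative log partial likelihood of the Cox model, which is the objective used in the DeepSurv formulation of Katzman et al. It is a plain-Python illustration only: the function name, the averaging over events, and the toy data are assumptions made here, tied event times are not treated specially, and this is not the authors' implementation. In CoxPH the risk score is a linear combination of the covariates, whereas in a DeepSurv-type model it is the output of the neural network.

```python
import math


def neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log Cox partial likelihood over observed events.

    risk_scores: predicted log-risk per patient (higher means higher hazard)
    times:       follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    """
    total, n_events = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: everyone still under observation at time t_i
        risk_set_sum = sum(
            math.exp(risk_scores[j])
            for j in range(len(times))
            if times[j] >= times[i]
        )
        total += risk_scores[i] - math.log(risk_set_sum)
        n_events += 1
    return -total / n_events if n_events else 0.0


# Toy data: three patients who die at times 1, 2, and 3
times, events = [1, 2, 3], [1, 1, 1]
good = neg_log_partial_likelihood([2.0, 1.0, 0.0], times, events)
bad = neg_log_partial_likelihood([0.0, 1.0, 2.0], times, events)
print(good < bad)  # risk scores ordered like the event times give a lower loss
```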

We performed hyperparameter tuning for the DeepSurv model using grid search and fivefold cross-validation on the training dataset, selecting the parameter combination with the highest average C-index across the cross-validation folds as the optimal setting.
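The tuning procedure can be outlined as follows. This is a generic sketch of grid search with k-fold cross-validation using only the Python standard library; the helper names, the example grid, and the stand-in scoring function are hypothetical. In a real run, `score_fn` would train a model on the training indices and return the C-index on the validation indices.

```python
import itertools
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def grid_search(param_grid, n_samples, score_fn, k=5):
    """Return the parameter combination with the highest mean fold score."""
    best_params, best_score = None, float("-inf")
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[key] for key in keys)):
        params = dict(zip(keys, values))
        scores = [
            score_fn(params, train, val)
            for train, val in k_fold_indices(n_samples, k)
        ]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical grid and a stand-in score that peaks at dropout 0.2, lr 0.001
grid = {"dropout": [0.1, 0.2, 0.5], "learning_rate": [0.001, 0.01]}


def toy_score(params, train_idx, val_idx):
    return -abs(params["dropout"] - 0.2) - abs(params["learning_rate"] - 0.001)


best, _ = grid_search(grid, n_samples=1625, score_fn=toy_score)
print(best)
```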

For the implementation of the algorithms in this research, CoxPH and RSF were implemented using the Python package “Scikit-learn (version 0.24.1)” and DeepSurv was implemented using the open-source Python package “Tensorflow-gpu (version 2.6.2)”.

Model evaluation

The study evaluated the models’ performance using several metrics, including the C-index, Brier score, integrated Brier score (IBS), receiver operating characteristic (ROC) curves, and area under the curve (AUC) values.

The C-index is a commonly used metric for evaluating the accuracy of survival predictions²¹. It measures the concordance or correlation between the predicted survival risk and the actual observed survival time. A C-index of 0.5 indicates random predictions, while a value of 1.0 indicates perfect predictions. The Brier score assesses the mean squared difference between the observed patient statuses (event occurrence or censoring) and the predicted survival probabilities. It ranges from 0 to 1, with 0 indicating a perfect match between predictions and observations. In practice, models with Brier scores less than 0.25 are considered useful^22,23. The IBS is a metric that evaluates the overall performance of a survival model across all available time points²⁴. It takes into account the model’s sensitivity and specificity to time-dependent events, providing a comprehensive measure of predictive accuracy. Receiver Operating Characteristic (ROC) curves are frequently used to assess a model’s sensitivity and specificity at various discrimination thresholds. The ROC curve plots the true positive rate against the false positive rate. The Area Under the Curve (AUC) values, which range from 0 to 1, are computed to quantify the overall performance of the model. A higher AUC indicates better discrimination ability. This study calculated AUC values to assess the model's performance at different time points: 1, 3, and 5-year survival rates.
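As an illustration of how the C-index is obtained, the sketch below implements Harrell's concordance index in plain Python. It assumes that a higher risk score means a shorter expected survival, counts tied risk scores as half-concordant, and skips pairs with tied survival times; the function name and toy data are hypothetical, and the study itself may have used a library routine.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ordered correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter follow-up time than patient j.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort: the third patient is censored at time 3
times, events = [1, 2, 3, 4], [1, 1, 0, 1]
print(concordance_index(times, events, [4, 3, 2, 1]))  # perfectly ordered risks
print(concordance_index(times, events, [1, 1, 1, 1]))  # uninformative risks
```

The two calls reproduce the reference points given in the text: 1.0 for perfect ranking and 0.5 for predictions that carry no information.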

Statistical analysis

In the clinical data, continuous variables are expressed as mean ± standard deviation (SD), while categorical variables are described using frequencies and percentages. Statistical tests such as chi-square tests and unpaired t-tests are used to compare variables between groups.

Results

Basic characteristics

This study analysed data from 2,322 medulloblastoma patients registered in the SEER database between 2000 and 2019. Table 1 presents the demographic features of the patients, with 869 cases (37.42%) being female and 1,453 cases (62.58%) being male. The racial distribution was as follows: 185 patients (7.97%) were Black, 1,939 (83.51%) were White, and 198 (8.53%) belonged to other races. Regarding the subtypes of medulloblastoma, 329 patients (14.17%) had desmoplastic/nodular medulloblastoma (DMB), 1,866 (80.36%) had medulloblastoma, not otherwise specified (MB, NOS), and 127 (5.47%) had large-cell/anaplastic medulloblastoma (LC). In terms of surgical interventions, 1,616 patients (69.60%) underwent total resection, 244 (10.51%) underwent subtotal resection, 343 (14.77%) underwent local excision or biopsy, and 119 (5.12%) did not undergo surgery. Of the patients, 1,849 (79.63%) received chemotherapy, 1,766 (76.06%) underwent radiation therapy, and 713 (30.71%) died. The cutoff values for age and tumor size were determined using X-tile analysis (Fig. 2). Specifically, 324 patients (13.95%) were ≤ 3 years old, and 1,998 patients (86.05%) were older than 3 years. Regarding tumor size, 314 patients (13.52%) had tumors ≤ 3.4 cm, 1,269 patients (54.65%) had tumor size > 3.4 cm, and the tumor size was unknown for 739 patients (31.83%).

Table 1 Characteristic distribution of data in the training and test sets.

The predictive model was generated by partitioning the complete dataset into two mutually exclusive subsets. 70% of the dataset was allocated for the training set, while the remaining 30% was used for the testing set. Model generation was performed on 1,625 randomly assigned patients from the training set, while the accuracy of the model was estimated using 697 randomly assigned patients from the testing set. No statistically significant differences in characteristics were found between the two groups (refer to Table 1). Additionally, survival outcomes showed no differences between the two groups (refer to Fig. S1).
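The 7:3 partition can be reproduced in outline with the standard library. The sketch below assumes patients are identified by integer IDs and uses an arbitrary seed; it shows only how 2,322 patients yield the reported group sizes and is not the authors' splitting code.

```python
import random


def train_test_split_ids(ids, train_fraction=0.7, seed=42):
    """Shuffle IDs and split them into disjoint training and testing lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


train_ids, test_ids = train_test_split_ids(range(2322))
print(len(train_ids), len(test_ids))  # 1625 697
```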

Cox proportional-hazard (CoxPH) model

The CoxPH model was developed using the training set (refer to Fig. 3). Only variables that showed statistical significance in the univariate analysis were included in the multivariate analysis. In the univariate analysis, the survival of medulloblastoma patients was significantly associated with non-surgical treatment, LC, white race, tumor size ≤ 3.4 cm, total resection, age > 3 years, chemotherapy, and radiotherapy. These features remained significantly associated with survival in the multivariate analysis. The collinearity analysis also revealed a high correlation between age and radiotherapy, as well as between chemotherapy and radiotherapy (refer to Fig. S2). Ultimately, we included seven features (age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy) in the model development.

Random survival forests (RSF)

Prediction error was calculated using the out-of-bag (OOB) samples from the training set (Fig. 4A). The predicted survival probability functions for patients in the test cohort are plotted in Fig. 4B. Variable importance (VIMP) indicates the extent to which each sample characteristic contributes to the prediction, as shown in Fig. 4C. A higher VIMP value indicates a greater influence of that variable on accurately predicting the outcome²⁵. The interactions between variables in the analyzed data are illustrated in Fig. 4D. If one variable's split in a decision tree affects the split of another variable, this suggests an interaction between those variables^26,27. The extent of interactions is assessed based on the minimum depth, which represents the distance from the root node to the node where the variable first splits. In this case, chemotherapy and radiotherapy had the lowest minimum depth among the variables considered and were therefore expected to be associated with other variables.
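The notion of minimum depth can be illustrated on a single toy tree. In the sketch below a tree is a nested dictionary with the keys `var`, `left`, and `right`, and a leaf is `None`; this representation, the function name, and the example tree are all hypothetical and are unrelated to the fitted forest in the study.

```python
def minimal_depths(node, depth=0, out=None):
    """Return, for each variable, the shallowest depth at which it splits.

    The root node has depth 0, so a smaller value means the variable is
    used closer to the root of the tree.
    """
    if out is None:
        out = {}
    if node is None:  # reached a leaf
        return out
    var = node["var"]
    out[var] = min(depth, out.get(var, depth))
    minimal_depths(node.get("left"), depth + 1, out)
    minimal_depths(node.get("right"), depth + 1, out)
    return out


# Hypothetical tree: chemotherapy splits at the root
toy_tree = {
    "var": "chemotherapy",
    "left": {
        "var": "radiotherapy",
        "left": None,
        "right": {"var": "age", "left": None, "right": None},
    },
    "right": {"var": "age", "left": None, "right": None},
}
print(minimal_depths(toy_tree))
```

In a forest, these per-tree depths would be summarized across trees; how variables absent from a tree are handled differs between implementations and is not specified in the text.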

DeepSurv

The hyperparameters of DeepSurv were tuned, with reference to previous studies, using grid search and fivefold cross-validation on the training dataset^18,28. The model with the optimal set of hyperparameters achieved an accuracy of 91.06% and a corresponding R² value of 0.6455. The best combination of model hyperparameters included 2000 epochs, the Adam optimizer, binary cross-entropy loss, four layers (nodes: 32, 64, 128, 256), a dropout rate of 0.2, and a learning rate of 0.001. The performance of the model was then evaluated on the testing set.

The loss function curve illustrates the relationship between the loss and the number of iterations, providing valuable information about the convergence and performance of the model²⁹. In addition, the C-index is a commonly used metric for evaluating the performance of survival analysis models. If the C-index is measured only on the training set, the possibility of overfitting cannot be completely ruled out, as the model may overfit the training data, leading to a decrease in generalization performance on test data³⁰. In this study, the C-index was measured on two mutually exclusive datasets (training and test), and no overfitting was observed. The learning process of DeepSurv, a survival prediction model based on deep learning, was visualized (Fig. 5). The figure showed a good model fit, indicating that the model was effectively learning and capturing the underlying patterns in the data.

Model comparisons

The predictive performance of the three models is shown in Table 2. In the test dataset, the DeepSurv and RSF models exhibited better discrimination (C-index: DeepSurv 0.763, RSF 0.759) than the CoxPH model (C-index: 0.757), with DeepSurv achieving the highest C-index of the three. The C-indexes obtained on the training dataset (DeepSurv: 0.751, RSF: 0.750, CoxPH: 0.748) differed only slightly from those on the test dataset, indicating that the models did not overfit. The IBS values for the three models were as follows: DeepSurv (0.150), RSF (0.160), and CoxPH (0.166). Lower IBS values indicate better model performance.
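The size of the train-test differences can be checked directly from the reported values. The short script below uses only the C-indexes quoted in the text; the dictionary layout and variable names are illustrative.

```python
# C-indexes as reported in the text (training and test datasets)
c_index = {
    "DeepSurv": {"train": 0.751, "test": 0.763},
    "RSF": {"train": 0.750, "test": 0.759},
    "CoxPH": {"train": 0.748, "test": 0.757},
}

# Test-minus-train gap per model, rounded to three decimals
gaps = {
    name: round(scores["test"] - scores["train"], 3)
    for name, scores in c_index.items()
}
best_on_test = max(c_index, key=lambda name: c_index[name]["test"])

for name, gap in gaps.items():
    print(f"{name}: test - train = {gap:+.3f}")
print("Highest test C-index:", best_on_test)
```

All three gaps are about 0.01, and in each case the test C-index is slightly higher than the training C-index.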

Table 2 Performance of three survival models.

Furthermore, in terms of the Brier score (Fig. S3), DeepSurv outperformed the other two models, indicating its superior accuracy. The AUC for DeepSurv was also higher than that of the other models at every time point (Fig. 6). The 1-year AUCs were 0.793 (95% CI 0.754–0.833) for DeepSurv, 0.757 (95% CI 0.720–0.795) for RSF, and 0.736 (95% CI 0.699–0.772) for CoxPH; the 3-year AUCs were 0.775 (95% CI 0.736–0.814), 0.738 (95% CI 0.701–0.774), and 0.712 (95% CI 0.677–0.778), respectively; and the 5-year AUCs were 0.767 (95% CI 0.729–0.806), 0.734 (95% CI 0.697–0.770), and 0.704 (95% CI 0.669–0.740), respectively. These results demonstrate that DeepSurv outperforms both RSF and the classical CoxPH model in accurately predicting the prognosis of patients with medulloblastoma.

Discussion

Medulloblastoma, a malignant brain tumor that mainly affects children, continues to pose a substantial challenge in pediatric oncology. Precisely predicting the individual prognosis of patients is crucial for customizing treatment approaches and enhancing survival rates. Prior research has identified several prognostic factors that affect the survival duration of medulloblastoma patients, including age, extent of surgical removal, and the administration of radiotherapy or chemotherapy^7,31,32. Moreover, as medical advancements progress, an increasing amount of imaging data⁵ and genetic data³³ is being used for survival analysis of medulloblastoma patients. However, classical survival analysis methods, such as the Cox proportional-hazards model, assume a linear relationship between variables, which may limit their usefulness for multidimensional data. With the advancement of artificial intelligence, machine learning methods are being applied to clinical, imaging, and genetic data, allowing for the discovery of potential nonlinear relationships within the data^34,35,36. Within machine learning, deep learning is a specific class of methods that utilizes multilayered neural networks to extract high-order features. Deep learning has gained increasing popularity in the field of cancer survival analysis and has demonstrated excellent performance^37,38,39. As far as we know, this approach has not been applied to medulloblastoma. Therefore, we applied a deep learning model (DeepSurv) to predict the overall survival (OS) of medulloblastoma patients and compared its performance to that of a machine learning model (RSF) and a classical model (CoxPH).

By extracting potentially significant features from the SEER database, this research developed multiple models to forecast the survival rates of individuals diagnosed with medulloblastoma. Initially, we utilized the X-tile tool to determine the optimal cutoff values for age and tumor size from a cohort of 2,322 medulloblastoma patients. We identified two high-risk factors, age ≤ 3 years old and tumor size > 3.4 cm, that significantly impact the survival duration of patients with medulloblastoma. Subsequently, we employed Cox proportional hazards regression to identify variables associated with the prognosis of medulloblastoma patients. Age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy were selected for inclusion in the modeling process (p < 0.05). We established RSF, DeepSurv and CoxPH models and evaluated their performance using metrics such as the C-index, IBS, and ROC curve. The study results demonstrated that the DeepSurv model outperformed both the CoxPH and RSF models, as indicated by its higher C-index in both the training and testing sets. Moreover, the DeepSurv model exhibited the lowest IBS and the largest AUC values when predicting 1-, 3-, and 5-year survival. These findings collectively suggest that the DeepSurv model is more accurate in predicting the survival of patients with medulloblastoma.

In previous studies, Guo et al.⁷ and Zhou et al.⁵ utilized Cox proportional hazard regression for survival analysis of medulloblastoma and developed nomograms. Compared with their studies, the C-index values obtained from the DeepSurv model were higher in both the training and the testing cohorts, indicating its superior accuracy in predicting the prognosis of patients with medulloblastoma. This finding is consistent with the results reported in several previous studies focusing on cancer prognosis^40,41. The main advantage of the DeepSurv model is its ability to handle both linear and non-linear predictive variables using a multi-layer neural network. It has a powerful ability to capture arbitrarily complex non-linear interactions in the data, allowing such models to discover correlations that are difficult for the human eye or traditional statistical techniques to detect.

Nevertheless, our study encountered several limitations. Firstly, the data collected from the SEER database for medulloblastoma patients contain some missing information that may affect survival outcomes, including important details such as molecular subgroups, specific radiotherapy doses, and chemotherapy regimens. Among other things, molecular diagnostics are critical for treatment and prognosis prediction of tumours, especially medulloblastoma. Nevertheless, the availability and completeness of these data depend on continuous improvements in data collection in the SEER database. Secondly, our model has yet to undergo external validation, and it is necessary to validate its performance on new data. Conducting further validations using independent datasets would enhance the reliability and generalizability of the findings. Another inherent limitation lies within the DeepSurv model itself. Due to its utilization of hidden layers in its architecture, the model operates as a black-box, making it challenging to fully comprehend the computations involved in the model construction process and its associated limitations. Future research should aim to address these concerns and explore the inner workings of the model to improve interpretability.

Conclusions

This study employed Cox proportional hazards regression analysis to examine the prognostic factors influencing medulloblastoma patients’ outcomes, which include age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy. Subsequently, we developed a groundbreaking DeepSurv prediction model, which exhibited strong predictive capabilities in assessing the prognosis of patients diagnosed with medulloblastoma. This innovative DeepSurv model holds significant potential in accurately predicting the survival duration of medulloblastoma patients.

Data availability

The datasets analyzed during the current study are available in the SEER database repository (https://seer.cancer.gov/).

References

Gajjar, A. J. & Robinson, G. W. Medulloblastoma-translating discoveries from the bench to the bedside. Nat. Rev. Clin. Oncol. 11, 714–722. https://doi.org/10.1038/nrclinonc.2014.181 (2014).
Ostrom, Q. T., Cioffi, G., Waite, K., Kruchko, C. & Barnholtz-Sloan, J. S. CBTRUS statistical report: Primary brain and other central nervous system tumors diagnosed in the United States in 2014–2018. Neuro. Oncol. https://doi.org/10.1093/neuonc/noab200 (2021).
Taylor, M. D. et al. Molecular subgroups of medulloblastoma: The current consensus. Acta Neuropathol. 123, 465–472. https://doi.org/10.1007/s00401-011-0922-z (2012).
Ramaswamy, V. & Taylor, M. D. Medulloblastoma: From myth to molecular. J. Clin. Oncol. 35, 2355–2363. https://doi.org/10.1200/JCO.2017.72.7842 (2017).
Zhou, L. et al. Automatic image segmentation and online survival prediction model of medulloblastoma based on machine learning. Eur. Radiol. https://doi.org/10.1007/s00330-023-10316-9 (2023).
Li, X. & Gong, J. Survival nomogram for medulloblastoma and multi-center external validation cohort. Front. Pharmacol. 14, 1247812. https://doi.org/10.3389/fphar.2023.1247812 (2023).
Guo, C. et al. External validation of a nomogram and risk grouping system for predicting individual prognosis of patients with medulloblastoma. Front. Pharmacol. 11, 590348. https://doi.org/10.3389/fphar.2020.590348 (2020).
Baek, E. T. et al. Survival time prediction by integrating cox proportional hazards network and distribution function network. BMC Bioinform. 22, 192. https://doi.org/10.1186/s12859-021-04103-w (2021).
Iasonos, A., Schrag, D., Raj, G. V. & Panageas, K. S. How to build and interpret a nomogram for cancer prognosis. J. Clin. Oncol. 26, 1364–1370. https://doi.org/10.1200/JCO.2007.12.9791 (2008).
Schwalbe, N. & Wahl, B. Artificial intelligence and the future of global health. Lancet 395, 1579–1586. https://doi.org/10.1016/S0140-6736(20)30226-9 (2020).
Hamet, P. & Tremblay, J. Artificial intelligence in medicine. Metabolism 69S, S36–S40. https://doi.org/10.1016/j.metabol.2017.01.011 (2017).
Hunter, D. J. & Holmes, C. Where medical statistics meets artificial intelligence. N. Engl. J. Med. 389, 1211–1219. https://doi.org/10.1056/NEJMra2212850 (2023).
Connor, C. W. Artificial intelligence and machine learning in anesthesiology. Anesthesiology 131, 1346–1359. https://doi.org/10.1097/ALN.0000000000002694 (2019).
Bhat, M., Rabindranath, M., Chara, B. S. & Simonetto, D. A. Artificial intelligence, machine learning, and deep learning in liver transplantation. J. Hepatol. 78, 1216–1233. https://doi.org/10.1016/j.jhep.2023.01.006 (2023).
Choi, R. Y., Coyner, A. S., Kalpathy-Cramer, J., Chiang, M. F. & Campbell, J. P. Introduction to machine learning, neural networks, and deep learning. Transl. Vis. Sci. Technol. 9, 14. https://doi.org/10.1167/tvst.9.2.14 (2020).
Greener, J. G., Kandathil, S. M., Moffat, L. & Jones, D. T. A guide to machine learning for biologists. Nat. Rev. Mol. Cell Biol. 23, 40–55. https://doi.org/10.1038/s41580-021-00407-0 (2022).
Jiang, C. et al. Predicting the survival of patients with pancreatic neuroendocrine neoplasms using deep learning: A study based on surveillance, epidemiology, and end results database. Cancer Med. 12, 12413–12424. https://doi.org/10.1002/cam4.5949 (2023).
Katzman, J. L. et al. DeepSurv: Personalized treatment recommender system using a cox proportional hazards deep neural network. BMC Med. Res. Methodol. 18, 24. https://doi.org/10.1186/s12874-018-0482-1 (2018).
Hankey, B. F., Ries, L. A. & Edwards, B. K. The surveillance, epidemiology, and end results program: a national resource. Cancer Epidemiol. Biomark. Prev. 8, 1117–1121 (1999).
Rahman, S. A. et al. Prediction of long-term survival after gastrectomy using random survival forests. Br. J. Surg. 108, 1341–1350. https://doi.org/10.1093/bjs/znab237 (2021).
Alexiuk, M. & Tangri, N. Prediction models for earlier stages of chronic kidney disease. Curr. Opin. Nephrol. Hypertens 33, 325–330. https://doi.org/10.1097/MNH.0000000000000981 (2024).
Jiang, F. et al. Automated machine learning-based model for the prediction of pedicle screw loosening after degenerative lumbar fusion surgery. Biosci. Trends 18, 83–93. https://doi.org/10.5582/bst.2023.01327 (2024).
Ding, H., Yuan, M., Yang, Y., Gupta, M. & Xu, X. S. Evaluating prognostic value of dynamics of circulating lactate dehydrogenase in colorectal cancer using modeling and machine learning. Clin. Pharmacol. Ther. 115, 805–814. https://doi.org/10.1002/cpt.3052 (2024).
Wang, X. et al. Quantifying and interpreting the prediction accuracy of models for the time of a cardiovascular event-moving beyond c statistic: A review. JAMA Cardiol. 8, 290–295. https://doi.org/10.1001/jamacardio.2022.5279 (2023).
Taylor, J. M. Random survival forests. J. Thorac. Oncol. 6, 1974–1975. https://doi.org/10.1097/JTO.0b013e318233d835 (2011).
Gilhodes, J. et al. Comparison of variable selection methods for high-dimensional survival data with competing events. Comput. Biol. Med. 91, 159–167. https://doi.org/10.1016/j.compbiomed.2017.10.021 (2017).
Kretowska, M. Tree-based models for survival data with competing risks. Comput. Methods Progr. Biomed. 159, 185–198. https://doi.org/10.1016/j.cmpb.2018.03.017 (2018).
Adeoye, J. et al. Deep learning predicts the malignant-transformation-free survival of oral potentially malignant disorders. Cancers https://doi.org/10.3390/cancers13236054 (2021).
Du, J., Zhou, Y., Liu, P., Vong, C. M. & Wang, T. Parameter-free loss for class-imbalanced deep learning in image classification. IEEE Trans. Neural Netw. Learn. Syst. 34, 3234–3240. https://doi.org/10.1109/TNNLS.2021.3110885 (2023).
Serghiou, S. & Rough, K. Deep learning for epidemiologists: An introduction to neural networks. Am. J. Epidemiol. 192, 1904–1916. https://doi.org/10.1093/aje/kwad107 (2023).
Dasgupta, A. et al. Nomograms based on preoperative multiparametric magnetic resonance imaging for prediction of molecular subgrouping in medulloblastoma: Results from a radiogenomics study of 111 patients. Neuro. Oncol. 21, 115–124. https://doi.org/10.1093/neuonc/noy093 (2019).
Liu, H. & Sun, P. A nomogram model for predicting prognosis of patients with medulloblastoma. Turk. Neurosurg. 34, 38–45. https://doi.org/10.5137/1019-5149.JTN.40397-22.3 (2024).
Zhu, S. et al. Identification of a twelve-gene signature and establishment of a prognostic nomogram predicting overall survival for medulloblastoma. Front. Genet. 11, 563882 (2020).
Erickson, B. J., Korfiatis, P., Akkus, Z. & Kline, T. L. Machine learning for medical imaging. Radiographics 37, 505–515. https://doi.org/10.1148/rg.2017160130 (2017).
Eraslan, G., Avsec, Z., Gagneur, J. & Theis, F. J. Deep learning: New computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403. https://doi.org/10.1038/s41576-019-0122-6 (2019).
Handelman, G. S. et al. eDoctor: Machine learning and the future of medicine. J. Intern. Med. 284, 603–619. https://doi.org/10.1111/joim.12822 (2018).
She, Y. et al. Deep learning for predicting major pathological response to neoadjuvant chemoimmunotherapy in non-small cell lung cancer: A multicentre study. eBioMedicine 86, 104364. https://doi.org/10.1016/j.ebiom.2022.104364 (2022).
Tran, K. A. et al. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med. 13, 152. https://doi.org/10.1186/s13073-021-00968-x (2021).
Foersch, S. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat. Med. 29, 430–439. https://doi.org/10.1038/s41591-022-02134-1 (2023).
Huang, B. et al. Deep learning for the prediction of the survival of midline diffuse glioma with an H3K27M alteration. Brain Sci. https://doi.org/10.3390/brainsci13101483 (2023).
Zhang, X. et al. Deep learning-based pathology image analysis predicts cancer progression risk in patients with oral leukoplakia. Cancer Med. 12, 7508–7518. https://doi.org/10.1002/cam4.5478 (2023).

Author information

Authors and Affiliations

Department of Neurosurgery, The First Affiliated Hospital of Shandong First Medical University, Jinan, 250014, Shandong, China
Meng Sun, Jikui Sun & Meng Li

Authors

Meng Sun
Jikui Sun
Meng Li

Contributions

Meng Sun: Conceptualization, Data curation, Investigation, Methodology, Software, Visualization, Writing-original draft; Jikui Sun: Conceptualization, Data curation, Formal analysis; Meng Li: Methodology, Project administration, Supervision-review and review. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Jikui Sun or Meng Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sun, M., Sun, J. & Li, M. Deep learning models for predicting the survival of patients with medulloblastoma based on a surveillance, epidemiology, and end results analysis. Sci Rep 14, 14490 (2024). https://doi.org/10.1038/s41598-024-65367-9

Received: 21 February 2024
Accepted: 19 June 2024
Published: 24 June 2024
DOI: https://doi.org/10.1038/s41598-024-65367-9
Springer Nature Limited
