A total of 10,841 first-ever ischaemic stroke admissions between January 2003 and December 2016 were initially extracted from the NNSTR. Patients with missing follow-up data (n = 81) or missing discharge data (n = 221) were excluded, as were those aged under 45 (n = 175), because the underlying mechanisms of disease are likely to differ in younger patients. Therefore, a total of 10,366 ischaemic stroke patients aged 45 years and over were included in the analysis (Supplementary Fig. 1).
Table 1 presents summary characteristics on admission for the entire cohort. Mean age was 78.5 years (standard deviation 10.9 years). There were 5409 females (52%). A total of 3534 (38.0%) patients suffered a Partial Anterior Circulation Stroke (PACS), 2401 (25.8%) had a Lacunar Stroke (LACS), 1830 (19.7%) had a Total Anterior Circulation Stroke (TACS) and 1533 (16.5%) had a Posterior Circulation Stroke (POCS). A total of 2916 patients (28.1%) had coronary heart disease and 6377 (61.5%) had hypertension. Heart failure was diagnosed in 1481 (14.3%) patients, atrial fibrillation in 3393 (32.7%) patients, cancers in 1660 (16.0%) patients, liver disease in 156 (1.5%) patients and peripheral vascular disease in 437 (4.2%) patients. Over the 10 years of follow-up, 4879 patients (47.1%) died. Baseline 10-year survival was 44%. Median (95% CI) follow-up was 5.47 (5.35–5.58) years.
The following six candidate predictors had missing data (ranging from 2.2 to 10.3%): OCSP stroke classification (10.3%), pre-stroke modified Rankin score (5.9%), haemoglobin (4.9%), white blood cell count (2.2%), sodium (2.6%), and creatinine (2.5%). Supplementary Tables 2–7 detail the characteristics of the included cohort, stratified by whether each variable in question had missing data. Patients with liver disease, patients without hypertension, as well as older patients were more likely to have missing data.
CRP, cholesterol and serum glucose were not included in the model since they were missing for a high proportion of patients (17.16%, 36.1%, and 24.19%, respectively).
Cox proportional-hazards model
Candidate predictors included in the final Cox proportional-hazards model were age, sex, pre-stroke disability (measured using the modified Rankin score), type of stroke, haemoglobin, sodium, white blood cell count, ischaemic heart disease, atrial fibrillation, cancers, hypertension, chronic obstructive pulmonary disease, liver disease, peripheral vascular disease, heart failure [20, 21] and eGFR. Diabetes was the only candidate predictor that was excluded from the final model.
Table 2 details the association between each predictor and mortality within 10 years as hazard ratios and parameter estimates from the Cox proportional-hazards model. The baseline survival at 10 years was 0.44. Patients with TACS were almost 3 times as likely to die as those with LACS (hazard ratio [HR] 2.87, 95% confidence interval [CI] 2.62–3.14). Those with eGFR below 15 were almost twice as likely to die after stroke as those with eGFR over 90 (HR 1.97, 95% CI 1.55–2.52). Of all the comorbidities included, liver disease had the strongest association with mortality after stroke (HR 1.50, 95% CI 1.20–1.87). Hypertension was inversely associated with 10-year mortality (HR 0.77, 95% CI 0.72–0.82). A 1-year increase in age was associated with a 4% increase in the hazard of mortality (HR 1.04, 95% CI 1.04–1.05). Hazard ratios for haemoglobin, sodium and white blood cell count values were computed using RCSs and are shown in Supplementary Fig. 2.
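As an illustration of how the reported baseline survival and hazard ratios combine under a proportional-hazards model, predicted survival can be written as S(t) = S0(t)^exp(LP), where LP is the sum of the log hazard ratios that apply to a patient. The short Python sketch below uses only the figures quoted above. It assumes, for illustration, that the baseline of 0.44 refers to a patient at the reference level of every predictor; how the baseline was centred is not stated here, so the outputs are not predictions from the published model. The function name is hypothetical.

```python
import math

# Baseline 10-year survival reported for the Cox model.
BASELINE_SURVIVAL_10Y = 0.44


def survival_from_hazard_ratios(hazard_ratios, baseline=BASELINE_SURVIVAL_10Y):
    """Return baseline ** exp(LP), where LP is the sum of log hazard ratios.

    Assumption (illustrative only): the baseline describes a patient at the
    reference level of every predictor, so only non-reference hazard ratios
    are passed in.
    """
    linear_predictor = sum(math.log(hr) for hr in hazard_ratios)
    return baseline ** math.exp(linear_predictor)


# Reference patient: no hazard ratios applied, so survival equals the baseline.
reference = survival_from_hazard_ratios([])

# Same patient but with TACS instead of LACS (HR 2.87).
tacs = survival_from_hazard_ratios([2.87])

# TACS plus liver disease (HR 1.50): the hazard ratios multiply.
tacs_liver = survival_from_hazard_ratios([2.87, 1.50])

print(f"reference: {reference:.3f}")
print(f"TACS: {tacs:.3f}")
print(f"TACS + liver disease: {tacs_liver:.3f}")
```

Because hazard ratios act multiplicatively on the hazard, they act as exponents on the survival probability, which is why a hazard ratio close to 3 lowers 10-year survival by much more than a factor of 3.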
The R-squared for the model was 0.32 and the C-statistic was 0.76, which is recognised as ‘fair’ discrimination ability. After internal validation, the optimism-adjusted R-squared reduced slightly to 0.31, whilst the C-statistic remained the same. The calibration slope was 0.98, indicating good model fit. The relationship between the predicted and observed probabilities was also assessed visually using a calibration plot and showed good agreement (Fig. 1). The blue line on the plot represents the bootstrap bias-corrected calibration curve and displays evidence of slight overprediction for good prognosis patients. However, for poor prognosis patients (survival < 40%), the model appears to be accurate.
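For readers less familiar with the C-statistic, the Python sketch below shows one common definition for censored survival data (Harrell's concordance index): among pairs of patients whose ordering of death times is known, it is the proportion in which the patient who died earlier had the higher predicted risk. This is a didactic sketch and not the code used in the analysis; the function name and the five-patient data set are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Concordance index for right-censored data.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if censored
    risk_scores: model-predicted risk (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            # A pair is comparable when patient i died before time j.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = harrell_c(toy_times, toy_events, toy_risk)
print(f"C-statistic on toy data: {c_index:.3f}")
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which places the reported 0.76 in context.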
Figure 2 details the resulting score nomogram based on the results of the Cox proportional-hazards model. For example, a 60-year-old female patient with a haemoglobin level of 120, sodium of 135, white blood cell count of 6.5, eGFR of 45, a pre-stroke modified Rankin score of 1, a history of AF and no other comorbidities, who suffered a partial anterior circulation stroke, would receive 71.66 points. This corresponds to a 10-year survival of 0.74.
Supplementary Fig. 4 details the distribution of the total score points across the study population, calculated using the provided nomogram. The total score points were normally distributed across the study population with mean (SD): 113 (38.6), Min = 11.54 and Max = 250.36.
Supplementary Fig. 5 details the observed 10-year survival curves, stratified by score quintiles (Fifth 1: 11.5–79.5, Fifth 2: 79.4–100.9, Fifth 3: 100.9–120.9, Fifth 4: 120.9–145.8, Fifth 5: 145.8–250.4). The score discriminates well between strata according to their risk of death within 10 years.
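To make the stratification concrete, the Python sketch below assigns a total nomogram score to one of the five strata using the cut points quoted above. It is a minimal sketch under two assumptions that are not stated in the text: a score exactly on a cut point is placed in the higher fifth, and 79.5 is taken as the boundary between the first and second fifths. The function and constant names are hypothetical.

```python
from bisect import bisect_right

# Upper cut points of Fifths 1 to 4, taken from the quoted score ranges.
# Assumption: 79.5 is used as the boundary between Fifths 1 and 2.
FIFTH_UPPER_BOUNDS = [79.5, 100.9, 120.9, 145.8]


def score_fifth(total_points):
    """Return the fifth (1 to 5) that a total nomogram score falls into.

    Assumption: a score equal to a cut point goes to the higher fifth.
    """
    return bisect_right(FIFTH_UPPER_BOUNDS, total_points) + 1


# The worked example patient from the nomogram (71.66 points).
print(score_fifth(71.66))

# The cohort mean score (113 points).
print(score_fifth(113))
```

Under these assumptions the example patient with 71.66 points falls in the lowest-scoring fifth, consistent with the comparatively high predicted 10-year survival of 0.74.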