Key Summary Points

Postherpetic neuralgia (PHN) is a prevalent chronic pain condition that is challenging to treat.

Early intervention can reduce the incidence and severity of PHN, making it crucial to identify high-risk patients.

By analyzing PHN risk factors using six machine learning methods, we have developed an efficient statistical model and scoring table for predicting PHN.

For individuals at high risk of developing PHN, prompt pain intervention is recommended.

Introduction

The most prevalent complication of herpes zoster infection is postherpetic neuralgia (PHN), defined by current European guidelines as pain persisting for more than 90 days after the resolution of the herpes zoster rash [1]. The incidence of PHN in herpes zoster patients is approximately 13.7% [2] and is higher among individuals aged over 50 years [3]. PHN significantly compromises patients' quality of life [4], affecting sleep patterns, physical sensations, and mental health, and imposing additional economic burdens [5,6,7,8].

The current strategy for preventing PHN is through proactive vaccination against herpes zoster [9]. The recombinant herpes zoster vaccine's efficacy in preventing herpes zoster has been shown in numerous studies, demonstrating safety and effectiveness for up to 10 years [10,11,12,13]. Despite this, a substantial population remains at risk for PHN due to additional costs and limited public awareness about herpes zoster [14, 15]. According to a survey, only 16.57% of people in China aged 50–69 are willing to get vaccinated against herpes zoster [16]. Management of PHN involves analgesic therapy and interventional approaches. Early pain interventions such as epidural and paravertebral blocks, percutaneous electrical nerve stimulation, and stellate ganglion blocks in herpes zoster patients can effectively reduce the occurrence of PHN [17, 20, 21]. However, treatment outcomes for established PHN are often unsatisfactory [18], and, consequently, the prediction of PHN holds significant clinical relevance, prompting global medical attention towards prognostication and early intervention strategies [19].

Logistic regression results indicate risk factors for PHN, such as advanced age, preherpetic pain, severe acute pain, enlarged lesion area, and ocular involvement [22,23,24,25]. However, these factors only indicate an increased likelihood of incidence. Machine learning models can transform these risk factors into predictive probabilities [26], thereby improving clinical decision-making.

To the best of our knowledge, previous studies have used machine learning methods, including support vector machine (SVM) and random forest (RF) models, to predict PHN [27, 28]. However, each of these studies relied on a single machine learning approach and did not compare models, potentially missing the opportunity to construct an optimal predictive model. In addition, existing PHN prediction models do not include objective serological indicators as potential risk factors, although some serological indicators may hold significant predictive value for PHN [29]. In this study, we established six models, namely SVM, logistic regression (LR), RF, K-nearest neighbors (KNN), gradient boosting decision tree (GBDT), and neural network (NN), and incorporated serological indicators associated with PHN, providing a more comprehensive approach. This is a new attempt to use machine learning models to predict PHN, aiming to identify the optimal algorithm for prediction. The significance of building a predictive model for PHN is multifaceted. For patients predicted to develop PHN, the model helps them prepare mentally and supports early intervention, which may reduce the likelihood of PHN and alleviate pain. For patients not predicted to develop PHN, it helps reduce overtreatment and the associated financial burden. For doctors, predictive models aid decision-making, suggesting hospitalization for patients predicted to have PHN and outpatient follow-up for those not predicted to have it. For researchers, our modeling approach may provide insights for research on other diseases.

To our knowledge, no authoritative PHN predictive scale currently exists. While machine learning models may exhibit superior predictive efficiency compared to simple scales, the convenience and comprehensibility of scales hold significant value for clinicians' decision-making and patient education. Therefore, our secondary objective is to develop a robust predictive scale based on outcomes from the machine learning model, ensuring its high predictive performance.

Materials and Methods

Data Source and Extraction

A retrospective, observational study was conducted with 524 patients with herpes zoster hospitalized at The First Affiliated Hospital of Zhejiang Chinese Medical University from December 2020 to December 2023. For each subject, the following data were collected. First, basic information: gender, age, height, weight, body mass index (BMI), smoking and drinking habits, and history of general anesthesia surgery. Second, pain characteristics: presence of prodromal pain, type of pain (needle-like, knife-like, distention, dull, non-rash site), and NRS score. Third, rash characteristics: type of lesion (blistering erythema or papule), location of lesion (head and face; chest and back; waist and abdomen; neck and shoulder; upper limb; lower limb; left side only/right side only/bilateral), and time to regression of skin lesions. Fourth, treatment details: time to treatment after onset, use of antiviral or hormone therapy. Fifth, other disease information: history of hypertension along with systolic/diastolic blood pressure readings, history of diabetes with blood glucose levels, history of hyperlipidemia with triglyceride/high-density lipoprotein cholesterol levels, presence/absence of rheumatic immune-related diseases, presence/absence of malignant tumors in various systems, and Charlson comorbidity index (CCI). Sixth, objective serological parameters: white blood cell count, neutrophil ratio, lymphocyte ratio, monocyte ratio, eosinophil ratio, basophil ratio, hypersensitive C-reactive protein level, serum neurospecific enolase level, varicella-zoster virus IgG and IgM levels. And seventh, follow-up: data on the occurrence of PHN 3 months post-discharge. The NRS score was assessed on the first day of admission. Serological parameters were collected following these criteria: after an 8-h overnight fast, venous blood samples were obtained from the antecubital vein and analyzed in our central laboratory using standardized procedures. 
The protocol was approved by the Institutional Review Board of The First Affiliated Hospital of Zhejiang Chinese Medical University (2024-KL-304–01), and the ethics committee exempted our study from obtaining informed consent from participants as we conducted a retrospective analysis using an existing database containing relevant data. The flowchart depicting patient recruitment and processing is presented in Fig. 1.

Fig. 1 Recruitment and processing of patients

The diagnosis of herpes zoster was made by experienced dermatologists in accordance with the 2016 European consensus guidelines for herpes zoster treatment [30], which recommend basing the diagnosis on typical clinical symptoms. PHN diagnosis was defined by pain persisting for more than 90 days after rash resolution [30]. Following this classification, a 3-month follow-up was conducted after discharge to determine PHN incidence. The diagnosis of diabetes, hypertension, and hypertriglyceridemia was established through comprehensive medical history assessment and serological examination per the diagnostic criteria outlined by the International Diabetes Association consensus on metabolic diseases in 2009 [31]. Other diseases were diagnosed by acquiring the patient's medical history. The Numeric Rating Scale was used to assess pain intensity, particularly suited for monitoring chronic pain conditions [32]. The Charlson Comorbidity Index (CCI), a robust measure for assessing comorbidity in clinical research, was employed to investigate the association between comorbidities and PHN [33]. As part of our objective to independently investigate the impact of diabetes, malignant tumors, and rheumatic immune-related diseases on herpes zoster, these conditions were considered separate potential risk factors, and their scores were excluded from the CCI. Smoking was defined as continuous or cumulative tobacco use for a minimum of 6 months in an individual's lifetime, while drinking was defined as consuming alcohol at least once per week within the past year. Diagnostic criteria and scoring guidelines are provided in Table 1.

Table 1 Diagnostic and scoring criteria for assessment

Inclusion and Exclusion Criteria

The inclusion criteria for this study are as follows: (1) age greater than 18 years; and (2) adherence to the medical diagnostic criteria for herpes zoster and no prior treatment. The exclusion criteria are: (1) cases with incomplete data; (2) individuals unable to complete the follow-up visit; (3) those with mental disorders, neurological diseases, or other conditions affecting somatosensory perception; (4) individuals with other pain-related disorders that may confound pain perception; (5) patients with atypical manifestations of herpes zoster, including meningeal and visceral involvement; (6) patients not administered conventional treatment for herpes zoster due to various factors, including drug-related allergies; and (7) patients in a state of pregnancy or lactation.

Model Endpoint Definition

This study proposes a binary classification model for predicting PHN incidence in patients diagnosed with herpes zoster.

Model Input Features

A total of 47 potential risk factors were collected and processed as follows: gender was coded as 1 for male and 0 for female; dichotomous variables (smoking, drinking, prodromal pain, lesion type and location, and pain type) were coded as 1 for yes and 0 for no. Features were then standardized using maximum–minimum standardization with the formula: E(x) = (x − min)/(max − min).
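The standardization formula above can be illustrated with a minimal Python sketch. This is not the authors' R code; the function name `min_max_scale` and the example ages are hypothetical, and the handling of a constant column is an assumption made here to avoid division by zero.

```python
def min_max_scale(values):
    """Rescale numbers to the range [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical ages, used only to illustrate the formula.
ages = [30, 45, 60, 90]
scaled_ages = min_max_scale(ages)
print(scaled_ages)
```

The source does not state whether the minimum and maximum were taken from the training set alone; when a held-out test set is used, taking them from the training data is the usual way to avoid leaking information.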

The least absolute shrinkage and selection operator (LASSO) method was used for variable selection. LASSO regression's objective function incorporates a penalty term into the least squares approach, compressing some regression coefficients to zero, achieving effective feature selection. LASSO regression identified potential risk factors, and non-zero features were included in both the model and scoring table.
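For reference, the penalized least-squares objective described above can be written as follows. The notation is introduced here for illustration and is not taken from the source: n is the number of patients, p the number of candidate features, x_i the feature vector and y_i the outcome of patient i, and λ the penalty weight chosen by cross-validation.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
  + \lambda\sum_{j=1}^{p}\lvert\beta_j\rvert
```

Larger values of λ drive more coefficients exactly to zero, which is what makes the method usable for feature selection. For a binary outcome such as PHN, the squared-error term is commonly replaced by the negative binomial log-likelihood; the source does not state which form was used.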

Establishment and Comparison of Model and Scoring Table

To obtain the optimal prediction model, we established six models: SVM, LR, RF, KNN, GBDT, and NN. The prediction score table was created from nomogram results obtained through logistic regression output, and variables were manually grouped and scored in a simplified manner.

Performance evaluation for both the model and prediction score table included accuracy, precision, recall, specificity, AUC, Youden index, Kappa index, receiver operating characteristic (ROC) curve analysis, calibration curve analysis, and decision curve analysis.
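The threshold-based metrics listed above all follow from a 2 × 2 confusion matrix. The Python sketch below shows the standard definitions; the function name and the counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    youden = recall + specificity - 1
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_expected = p_yes + p_no
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "youden": youden,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```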

Statistical Methods, Sample Size Calculation and Grouping

Normally distributed continuous data are presented as mean ± standard deviation, while skewed continuous data are expressed as median (upper quartile, lower quartile). Categorical variables are reported as the number of patients and corresponding percentages. Normality was assessed with the Shapiro–Wilk test and histogram analysis; a p value greater than 0.05 was considered indicative of a normal distribution. Group comparisons were performed using the t test for normally distributed continuous variables, the Wilcoxon–Mann–Whitney test for skewed continuous variables, and the Pearson chi-square test for categorical variables. Statistical significance was defined as a p value less than 0.05. All statistical analyses were conducted using R version 4.3.1 (R Foundation for Statistical Computing, Vienna, Austria).

The sample size was calculated using the pmsampsize package in R, which implements a previously published sample size calculation method for prediction models [34], with the following parameters: type = "b", cstatistic = 0.90, parameters = 8, prevalence = 0.4. The cstatistic value was obtained from previously published literature [28]. The calculation showed that at least 369 cases, including at least 148 positive cases, were required. We therefore divided the 524 cases into a training set and a test set at a ratio of 4:1; the training set contained 419 cases, of which 180 were positive, which satisfies both requirements and helps to prevent overfitting and to ensure accurate estimation of the key parameters in the prediction model.
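The arithmetic of the split can be checked with a short Python sketch. The source does not say whether the split was purely random or stratified, nor which seed was used, so the shuffling below, the function name, and the seed are assumptions for illustration.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into training and test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is arbitrary (assumption)
    n_train = round(n * (1 - test_fraction))
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(524)
print(len(train_idx), len(test_idx))

# Minimum sizes reported from the sample size calculation in the text.
MIN_CASES = 369
meets_minimum = len(train_idx) >= MIN_CASES
print(meets_minimum)
```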

The LASSO algorithm was solved using the glmnet package in R, and the optimal parameters were selected through tenfold cross-validation before being incorporated into the models and scoring table.

The cut-off values were calculated using the cutoff package in R, which determines threshold values that balance sensitivity and specificity. The Kappa coefficient, an index of agreement ranging from −1 to 1, was calculated using the fmsb package in R. A Kappa value exceeding 0.7 indicates a commendable level of model consistency.
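One common way to balance sensitivity and specificity is to choose the threshold that maximizes the Youden index (sensitivity + specificity − 1). The Python sketch below illustrates that criterion; it is an assumption that this matches the rule applied by the cutoff package, and the scores and labels are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when its score is >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        # sensitivity - (1 - specificity) equals the Youden index J.
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical predicted probabilities and true outcomes (1 = PHN).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j_value = youden_cutoff(scores, labels)
print(cutoff, j_value)
```

When several thresholds tie on J, this sketch keeps the lowest one, which favors sensitivity; other tie-breaking rules are equally defensible.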

The logistic regression and random forest models were implemented using the glmnet and randomForest packages in R, respectively; the support vector machine model was implemented using the e1071 package, the KNN model using the kknn package, the GBDT model using the xgboost package, and the NN model using the neuralnet package. The performance metrics were computed exclusively using the test dataset that was set aside for evaluation purposes.

Results

Patient Characteristics

The study included a total of 524 subjects who were hospitalized for herpes zoster, of whom 229 (43.70%) had PHN. Our study used a well-integrated database and excluded patients with unavailable follow-up data, so that the complete dataset could be used to evaluate variables associated with PHN. Two variables, serum neurospecific enolase and hypersensitive C-reactive protein, contained missing values, which accounted for 11.2% and 15.8% of the total dataset, respectively. To address this issue, we used RF regression to impute the missing data.

Lasso Regression Results

Before conducting LASSO regression, we performed ROC curve analysis and univariate analysis on individual features; the complete set of findings is available in Table 2. The single-variable AUC values of age (AUC = 0.812, p < 0.001), NRS score (AUC = 0.792, p < 0.001), rash recovery time (AUC = 0.680, p < 0.001), serum neurospecific enolase (AUC = 0.659, p < 0.001), diabetes (AUC = 0.638, p < 0.001), varicella-zoster virus IgM (AUC = 0.620, p < 0.001), height (AUC = 0.616, p < 0.001), CCI score (AUC = 0.613, p < 0.001), and time to treatment after onset (AUC = 0.612, p < 0.001) exceeded 0.6, indicating their potential inclusion in the model. Subsequently, we conducted correlation tests among all features to avoid incorporating highly correlated factors that could compromise the model's performance. The single-variable ROC curves, AUC values, and correlation tests are presented in Fig. 2A–C.
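The AUC of a single feature can be read as the probability that a randomly chosen PHN patient has a higher value of that feature than a randomly chosen non-PHN patient, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; the function name and the example values are hypothetical and do not reproduce the study data.

```python
def auc_single_feature(values, labels):
    """AUC of one feature: P(positive value > negative value), ties count 0.5."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical ages and outcomes (1 = PHN), for illustration only.
ages = [45, 52, 60, 63, 70, 71, 78, 80]
phn = [0, 0, 0, 1, 0, 1, 1, 1]
age_auc = auc_single_feature(ages, phn)
print(age_auc)
```

This pairwise form is equivalent to the Mann–Whitney U statistic divided by the number of positive–negative pairs; it is quadratic in sample size, which is acceptable for a cohort of a few hundred patients.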

Table 2 Patient characteristics in the PHN group and non-PHN group and individual variable AUCs
Fig. 2 Results of single factor ROC curve (A), AUC nomogram (B), and correlation analysis (C)

The optimal penalty parameter (lambda) in the LASSO regression was selected through tenfold cross-validation; the dotted lines in Fig. 3A, B mark the values given by the minimum criterion and by the 1-SE (standard error) criterion. The LASSO regression retained eight variables with non-zero coefficients: age, NRS score, rash recovery time, time to treatment after onset, history of malignancy, diabetes, varicella-zoster virus IgM, and serum neurospecific enolase. Because their correlations with one another were minimal, these variables were included in our final model.

Fig. 3 The results of LASSO regression

Comparability Between the Test Set and the Training Set

The prediction model and scoring table were developed using the training set, comprising 80% of the subjects, while the test set, comprising the remaining 20%, was used to validate them. Table 3 provides comprehensive information on subject characteristics. No significant differences were observed between the training set and the test set in terms of their characteristics, indicating comparability.

Table 3 Patient characteristics in the test set and the training set

Model Efficiency

To enhance reproducibility, we report the model parameters and cut-off values. Model performance was evaluated using accuracy, precision, recall, specificity, AUC, Youden index, and Kappa index. Detailed results for the training and test sets are provided in Tables 4 and 5, respectively. Fig. 4 shows the ROC, calibration, and decision curve analyses for both sets. The closer the ROC curve lies to the upper left corner, the larger the area under it and the better the model's performance. A calibration curve that closely follows the 45° line indicates close agreement between predicted probabilities and observed outcomes. Over the range where the model's decision curve lies above the model-free baseline, intervention guided by the model yields a higher clinical net benefit than the baseline strategy.
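Decision curve analysis rests on the net benefit at a chosen threshold probability p_t, conventionally defined as TP/n − (FP/n) × p_t/(1 − p_t). The Python sketch below evaluates this quantity for a model and for the "treat all" baseline; the function names and the data are hypothetical, and the sketch does not reproduce the curves in Fig. 4.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    # Odds of the threshold weigh the harm of a false positive against a true positive.
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the baseline strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and outcomes (1 = PHN), for illustration only.
probabilities = [0.1, 0.2, 0.3, 0.45, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
nb_model = net_benefit(probabilities, labels, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print(nb_model, nb_all)
```

The "treat none" baseline has a net benefit of zero by definition, so a model is useful at a given threshold only if its net benefit exceeds both zero and the "treat all" value.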

Table 4 Performance metrics for six models and PHNPD scale in training dataset
Table 5 Performance metrics for six models and PHNPD scale in testing dataset
Fig. 4 ROC curves for the training (A) and testing (B) sets, calibration curves for the training (C) and testing (D) sets, and decision curves for the training (E) and testing (F) sets

LR Model

We established the logistic regression model using the glmnet package. On the training set, the model achieved an AUC of 0.927 (95% confidence interval [CI] 0.904–0.951) and an accuracy of 0.852 (95% CI 0.814–0.885), with a cut-off value of 0.402; on the test set, it achieved an AUC of 0.942 (95% CI 0.903–0.982) and an accuracy of 0.848 (95% CI 0.764–0.910).

KNN Model

We established the KNN model using the kknn package, with parameters K = 20 and kernel = "rectangular". On the training set, the KNN model achieved an AUC of 0.930 (95% CI 0.907–0.953) and an accuracy of 0.854 (95% CI 0.817–0.887), with a cut-off value of 0.400. On the test set, it achieved an AUC of 0.943 (95% CI 0.904–0.981) and an accuracy of 0.829 (95% CI 0.743–0.895).

SVM Model

We established the SVM model with a sigmoid kernel using the e1071 package, with parameters gamma = 0.1 and coef0 = 3. On the training set, the SVM model achieved an AUC of 0.902 (95% CI 0.874–0.930) and an accuracy of 0.821 (95% CI 0.781–0.857), with a cut-off value of 0.391. On the test set, it achieved an AUC of 0.897 (95% CI 0.840–0.954) and an accuracy of 0.771 (95% CI 0.679–0.848).

RF Model

We established the RF model using the randomForest package with parameters set as (ntree = 428). Our RF model results demonstrate that the training set achieved an AUC index of 0.999 (95% CI 0.998–1.000), an accuracy rate of 0.981 (95% CI 0.963–0.992), and a cut-off value of 0.489. Furthermore, the RF model exhibited an AUC index of 0.936 (95% CI 0.892–0.979) and an accuracy rate of 0.838 (95% CI 0.753–0.903) on the test set.

GBDT Model

We established the GBDT model using the xgboost package in R 4.3.1, with parameters eta = 0.3, max_depth = 3, subsample = 1, colsample_bytree = 1, and gamma = 0.25. On the training set, the GBDT model achieved an AUC of 0.993 (95% CI 0.988–0.998) and an accuracy of 0.967 (95% CI 0.945–0.982), with a cut-off value of 0.555. On the test set, it achieved an AUC of 0.931 (95% CI 0.882–0.980) and an accuracy of 0.886 (95% CI 0.809–0.940).

NN Model

The NN model was constructed using the neuralnet package of R version 4.3.1, with the parameter set as (hidden = 1, err.fct = "ce"). Our results demonstrated that the training set achieved an AUC index of 0.923 (95% CI 0.904–0.951), an accuracy rate of 0.852 (95% CI 0.814–0.885), and a cut-off value of 0.413. Furthermore, the NN model exhibited an AUC index of 0.934 (95% CI 0.891–0.978) and an accuracy rate of 0.848 (95% CI 0.764–0.910) on the test set.

Prediction Table for PHN in Inpatients (PHNPD)

We derived a prediction table for PHN in inpatients using the nomogram of logistic regression on the training set. The specific distribution of scores can be found in Table 6, while Fig. 5 displays the nomogram of logistic regression. On the training set, our prediction table achieved an AUC index of 0.913 (95% CI 0.886–0.940), an accuracy rate of 0.814 (95% CI 0.773–0.850), and a cut-off value of 9 points. On the test set, the prediction table yielded an AUC index of 0.820 (95% CI 0.869–0.970) and an accuracy rate of 0.790 (95% CI 0.700–0.864).

Table 6 Prediction table for postherpetic neuralgia in inpatients (PHNPD)
Fig. 5 Nomogram for logistic regression

Discussion

In comparing the models, both the GBDT and RF models exhibited exceptional performance on the training set, surpassing the other models in terms of AUC, accuracy, precision, recall, and specificity. On the test set, the GBDT model continued to show superior accuracy, precision, specificity, Kappa index, and Youden index, and maintained strong performance across the ROC curve, calibration curve, and decision curve analyses. However, the RF model's performance on the test set was weaker, likely due to overfitting during training. Additionally, our GBDT prediction model demonstrated higher accuracy than the SVM and RF models reported in previous literature [27, 28].

We contend that GBDT represents the optimal predictive model for forecasting PHN. GBDT builds its ensemble iteratively: each new tree is fitted to the residuals left by the trees before it, so instances that earlier iterations predicted poorly are progressively corrected, and the approach is robust in handling intricate data structures [35]. This performance suggests that the GBDT model could facilitate PHN prognostication in hospitalized herpes zoster patients, allowing for targeted early intervention measures to reduce patient distress.
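The residual-fitting idea behind gradient boosting can be shown with a deliberately simplified Python sketch. It uses one-split trees (stumps), a single feature, and squared-error loss, whereas xgboost uses deeper trees, a logistic loss for classification, and regularization, so this is an illustration of the principle rather than the model used in this study. All names and data are hypothetical; the learning rate of 0.3 simply mirrors the eta value reported above.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that minimizes squared error of the residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    ordered = sorted(set(xs))
    for lo, hi in zip(ordered, ordered[1:]):
        split = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, rounds=20, eta=0.3):
    """Fit stumps sequentially, each one to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        split, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((split, left_mean, right_mean))
        # Shrink each correction by eta so that no single stump dominates.
        preds = [p + eta * (left_mean if x <= split else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds


def mse(ys, preds):
    """Mean squared error between outcomes and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical feature values and outcomes (1 = PHN), for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
base, _, preds_1 = boost(xs, ys, rounds=1)
_, _, preds_20 = boost(xs, ys, rounds=20)
mse_base = mse(ys, [base] * len(ys))
mse_1 = mse(ys, preds_1)
mse_20 = mse(ys, preds_20)
print(mse_base, mse_1, mse_20)
```

The training error falls with every round, which also illustrates why boosted models can overfit: the gap between training and test performance, not the training error alone, is what matters when comparing models.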

Although the predictive score based on the logistic regression nomogram performed less well than the machine learning models, the complexity of machine learning algorithms may hinder comprehension by medical staff and patients [36]. This motivated the development of our PHNPD scoring system. The scoring system requires only simple inquiry and routine serological examination, making it convenient for clinics and wards, and its visually intuitive presentation helps patients understand why treatment is needed. The scoring table demonstrated commendable accuracy and specificity, though with a relatively lower recall rate. This implies that patients predicted to have PHN by our table are highly likely to actually have PHN, while those predicted not to have PHN still possess a certain probability of having the condition. We consider this classification profile acceptable, as our primary objective is to accurately identify patients who will develop PHN. Accordingly, patients with a PHNPD score exceeding 9 should receive proactive treatment to reduce the risk of developing PHN.

In our study, we integrated the following eight risk factors into the model: age, NRS score, time to regression of skin lesions, serum neurospecific enolase levels, diabetes status, varicella-zoster virus IgM antibody levels, time to treatment after onset, and history of malignancy. The latency of varicella-zoster virus is primarily regulated by cell-mediated immunity, while reactivation is believed to occur due to the loss of immune surveillance [37]. The association between age and PHN is attributed to the decline in cellular immunity associated with aging, leading to a reduced clearance rate of the varicella-zoster virus, resulting in increased inflammation and nerve damage [38]. This mechanism may also contribute to the development of PHN in individuals with tumors and diabetes [39, 40]. The correlation between time to treatment and PHN may arise because early treatment of herpes zoster expedites viral clearance [41], thereby reducing nerve damage. The time taken for skin lesion regression and the NRS score serve as outcome indicators reflecting the extent of viral damage to the skin and nerves, directly impacting the onset of PHN.

We aimed to primarily discuss serological markers not incorporated in previous prognostic models, such as varicella-zoster virus IgM and serum neurospecific enolase. Due to a median hospital admission time of five days in our cohort, most patients had already entered the eruption phase following the acute febrile phase of herpes zoster [42], causing a delay in testing varicella-zoster virus IgM, resulting in negative results for most patients. However, we observed that testing varicella-zoster virus IgM in non-acute phase patients could potentially aid in predicting PHN, as prolonged viral replication often leads to increased inflammation and nerve damage. Therefore, we strongly recommend including this test for all patients, irrespective of their acute phase status, due to its significant predictive value for PHN.

Serum neurospecific enolase, an acidic glycolytic enzyme exhibiting the highest activity in brain tissues and intermediate levels in peripheral nerve and neuroendocrine tissues, is widely recognized as a preferred marker for monitoring small-cell lung cancer [43]. It has also been observed to be elevated in various neurological disorders, including stroke, spinal cord injury, and Alzheimer's disease [44]. Our study demonstrated elevated serum levels of neurospecific enolase in patients with PHN, potentially indicating the influence of the varicella-zoster virus on the central or peripheral nervous system. Peripheral nerve damage caused by the varicella-zoster virus is well established, and numerous studies have reported various aspects of its effects on the central nervous system [45,46,47]. The specific underlying mechanism remains unclear and requires further validation through rigorous biological experiments. Nevertheless, this should not hinder the use of this biomarker for predicting PHN, and we recommend including its assessment for patients diagnosed with herpes zoster.

However, there were certain limitations in this study. Firstly, as a retrospective study, the data may be subject to bias, though we attempted to mitigate this by incorporating objective indicators into our analysis. Secondly, this study was conducted at a single tertiary care medical center, necessitating caution when generalizing findings to the broader population. Future research aims to gather external validation datasets to improve and refine the model's performance, and we invite fellow researchers to employ our model parameters to validate their own datasets. Additionally, this retrospective study faced challenges in classifying some subjective indicators. For instance, herpes lesions were categorized into regions such as head and face or waist and abdomen, but herpes zoster affecting the head and face can be further subdivided based on specific nerves involved. Lastly, the scope of nerve injury-related indicators in this study remains somewhat limited, impeding the full realization of our model's potential.

Despite these limitations, our study has notable advantages: firstly, the database's integrity is commendable, as all patients underwent examination and blood sample collection at the same center; secondly, objective serological indicators were incorporated into the prediction model for the first time, yielding commendable outcomes that may offer valuable insights for future research endeavors; lastly, we devised a PHN prediction score with potential for widespread clinical utilization and patient education.

Conclusions

Early intervention for pain is a critical measure in preventing postherpetic neuralgia. Therefore, identifying patients at risk of developing PHN is essential. We utilized six machine learning models to predict the likelihood of PHN in patients with shingles, finding the gradient boosting model to be the most effective. Additionally, we developed a high-performance scoring table for predicting PHN.