Background

Chronic obstructive pulmonary disease (COPD) seriously endangers human health, imposes a heavy socioeconomic and health burden worldwide, and has become a major public health challenge [1, 2]. According to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2021 report, acute exacerbation of COPD is an important event in disease management [3]. Because of the enhanced chronic inflammatory response of the airways and lungs to toxic particles or gases, airflow limitation in COPD patients is often progressive, and acute exacerbations recur over the course of the disease [1]. Studies have shown that the 30-day readmission rate of COPD patients with acute exacerbation is 6.70–22.60%. Because the interval between exacerbations is so short, readmission for acute exacerbation within 30 days not only seriously damages lung function and increases the risk of death, but also consumes substantial medical resources [4].

Research shows that early identification of populations at high risk of readmission, combined with appropriate measures, can prevent readmission in some populations [5]. Lau et al. [6] used the RACE scale and LACE to predict readmission of COPD patients in the United States, Australia and other countries, and tested their effectiveness. Donze et al. [7] used the HOSPITAL scale to predict readmission in US and international multicenter populations. In addition, some scholars have predicted and compared the risk of 30-day readmission for acute exacerbation in COPD patients using accelerometer-based activity monitoring, predictive scales and related models [8,9,10,11]. These studies all show that model-based risk predictions are more reliable, but it remains inconclusive which modeling approach is most suitable for readmission risk screening. Although Goto et al. [5] compared a logistic regression model with decision tree and deep neural network models, specific prediction tools are still lacking. Support vector machine (SVM), an emerging modeling method, is an important classification technique in data mining, and its predictive performance has gradually been recognized in the medical field, but reports on its use for predicting readmission risk in COPD patients are still lacking [12].

We hypothesized that an SVM model can predict the risk of readmission in COPD patients with useful accuracy and that its results have reference value. We therefore used SVM to build a 30-day acute exacerbation readmission risk prediction model for elderly COPD patients and evaluated its predictive performance, to provide a basis for early identification of patients at high risk of readmission.

Methods

Subject selection

From April 2019 to August 2020, COPD patients who met the inclusion and exclusion criteria in the respiratory departments of 13 general hospitals in Ningxia, China were investigated. Patients were followed up from May 2019 to September 2020. Ethics approval for the data collection and the use of clinical data in the study was obtained from the Ethics Committee of the General Hospital of Ningxia Medical University (2020-643), and all subjects signed informed consent. A total of 1200 patients were enrolled. Inclusion criteria: (1) meeting the GOLD (2021) diagnostic criteria for COPD, that is, having the following characteristics: ① previous diagnosis of COPD; ② smoking more than 20 cigarettes a day for more than 15 years; ③ a long-term history of heavy exposure to biofuels or occupational dust in an enclosed space; ④ onset of symptoms after 40 years of age, with a chronic course and progressive aggravation; ⑤ chronic cough or expectoration, and gradually worsening dyspnea on exertion; ⑥ persistent symptoms with little daytime variation. After pulmonary tuberculosis was excluded, the peak expiratory flow rate (PEFR) was measured: PEFR improved by less than 20% on repeated measurement 15 min after inhalation of the β2 receptor agonist salbutamol, and FEV1/FVC < 0.7; (2) COPD patients in stable condition, with stable or mild cough, expectoration, shortness of breath and other symptoms; (3) age ≥ 60 years. A total of 142 patients were excluded according to the following exclusion criteria: (1) 83 cases whose questionnaires had errors or missing items exceeding 20%, or were completely identical; (2) 16 cases with audio-visual impairment who were unable to communicate; (3) 21 cases who requested to withdraw from the study; (4) 22 cases who could not be contacted during the study because of a change of contact information or other reasons. A total of 1058 patients were included (Fig. 1).

Fig. 1 Flowchart of inclusion and exclusion criteria

Research tools

General data questionnaire: designed by the researchers according to the purpose and content of the research, on the basis of a literature review and a preliminary investigation. It covers age, sex, marital status, education level, smoking status, comorbidity, number of hospitalizations for acute exacerbation of COPD in the past year, etc.

The Modified Medical Research Council Dyspnea Scale (mMRC) is used to assess the severity of dyspnea in COPD patients and has been verified to have good testing characteristics in the Chinese population. It is divided into five levels. Level 0: difficulty breathing only during strenuous activity; Level 1: shortness of breath when walking briskly on flat ground or walking up a hill; Level 2: walking on flat ground slower than peers or needing to stop to rest because of shortness of breath; Level 3: needing to stop for breath after walking about 100 m or for a few minutes on flat ground; Level 4: unable to leave the house because of severe breathing difficulty, or breathless when dressing or undressing [13].

The Activities of Daily Living (ADL) scale is used to evaluate patients' ability to perform activities of daily living. First published in 1965 by Dorothea Barthel and Florence Mahoney, it has been verified to have good test characteristics among Chinese people [14] and includes 10 items. The total score is 100 points, and the evaluation standard is: 81–100 points, completely self-caring; 61–80 points, mild dysfunction, able to complete daily activities independently; 41–60 points, moderate dysfunction, needing help in daily life; ≤ 40 points, severe dysfunction or total dependence, unable to complete most daily activities or requiring care from others [15].

The Geriatric Depression Scale (GDS) measures depression in the elderly over the past week. Developed by Brink et al. in 1982, it has been shown to have good test characteristics in the Chinese population [16] and includes 30 items. The total score is 30, and the assessment standard is: 0–10, normal, that is, no depression; 11–20, likely to have depressive symptoms; 21–30, depression [17].

The Mini Nutritional Assessment-Short Form (MNA-SF) is designed to evaluate nutritional status. It was proposed by Rubenstein et al. in 2001 on the basis of the Mini Nutritional Assessment, has been verified to have good testing characteristics in the Chinese population [18], and includes 8 questions. With a total score of 14, 12 to 14 is classified as normal nutritional status, 8 to 11 as at risk of malnutrition and 0 to 7 as undernourished [19].

The COPD assessment test (CAT) scale is used to assess the severity of COPD. Proposed by Jones in 2009 on the basis of the St George Respiratory Questionnaire (SGRQ), it has been shown to have good test characteristics in a Chinese population [20] and includes 8 questions. The total score is 40 points: a total score < 10 indicates mild illness, 10 ≤ total score ≤ 20 moderate illness, 20 < total score ≤ 30 serious illness, and > 30 very serious illness [21].
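The cut-offs quoted above for ADL, GDS and MNA-SF can be written as a small lookup, which makes the band boundaries explicit. This is only an illustrative sketch: the names `BANDS` and `classify` are hypothetical, the band labels are shortened from the text, and the mMRC and CAT scales are left out.

```python
from bisect import bisect_right

# For each scale: (maximum total score, lower bound of each band, band labels).
# Bounds and labels follow the cut-offs stated in the text; names are illustrative.
BANDS = {
    "ADL": (100, [0, 41, 61, 81],
            ["severe dysfunction or total dependence", "moderate dysfunction",
             "mild dysfunction", "self-care"]),
    "GDS": (30, [0, 11, 21],
            ["no depression", "possible depressive symptoms", "depression"]),
    "MNA-SF": (14, [0, 8, 12],
               ["undernourished", "at risk of malnutrition", "normal"]),
}

def classify(scale, score):
    """Return the band label for a total score on the given scale."""
    max_total, lower_bounds, labels = BANDS[scale]
    if not 0 <= score <= max_total:
        raise ValueError(f"{scale} score must be between 0 and {max_total}")
    # bisect_right finds how many lower bounds are <= score.
    return labels[bisect_right(lower_bounds, score) - 1]

print(classify("ADL", 55))     # moderate dysfunction
print(classify("GDS", 11))     # possible depressive symptoms
print(classify("MNA-SF", 7))   # undernourished
```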

Data collection methods

The researchers conducted a preliminary survey of 48 elderly patients with COPD and improved the questionnaire accordingly. The investigators at each hospital were trained uniformly. Investigators were required to have worked in a respiratory medicine department for 5 years or more, to hold a bachelor's degree or above and a supervisor nurse qualification certificate, and to have experience in questionnaire surveys. The training covered the research purpose, research content, questionnaire completion requirements, etc. Trained investigators asked patients about each item of the questionnaire and filled in each item according to the patient's answers. Each questionnaire was checked for missing items before it was collected. Patients were followed up for 30 days after discharge to record whether they were readmitted for acute exacerbation within 30 days of discharge, whether they took medication regularly, whether they did rehabilitation exercise, etc., and the season of readmission.

Construction and verification of risk prediction model

A total of 28 variables were screened as candidate indicators for the risk prediction model. After single factor analysis, 15 variables, such as age, education level and smoking status, entered the regression analysis. Finally, the 8 factors that were significant in the logistic regression analysis were included in the construction of the model. The subjects were divided into a training set (n = 741) and a test set (n = 317) at a ratio of 7:3. Based on the influencing factors, SVM models with four kernel functions (linear, polynomial, sigmoid and RBF) were constructed. The predictive performance of the four kernel functions was analyzed and compared using precision, recall, accuracy, F1 index and the area under the ROC curve (AUC).
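For readers unfamiliar with the four kernel functions, the sketch below writes them out in plain Python. It assumes the widely used LIBSVM-style parameterization; the parameter names `gamma`, `coef0` and `degree` and their default values are assumptions for illustration, since the text reports only that the models were built in MATLAB R2020b.

```python
import math

def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    # K(u, v) = u . v
    return dot(u, v)

def polynomial_kernel(u, v, gamma=1.0, coef0=0.0, degree=3):
    # K(u, v) = (gamma * u . v + coef0) ** degree
    return (gamma * dot(u, v) + coef0) ** degree

def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    # K(u, v) = tanh(gamma * u . v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)

def rbf_kernel(u, v, gamma=1.0):
    # K(u, v) = exp(-gamma * ||u - v||^2); depends only on the distance between u and v
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

# Two hypothetical patients described by two normalized features.
x1, x2 = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x1, x2), polynomial_kernel(x1, x2), rbf_kernel(x1, x1))
```

The RBF kernel equals 1 for identical inputs and decays toward 0 as two patients' feature vectors move apart, which is the distance-based behavior the study later refers to.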

Data analysis

Epidata 3.1 was used for data entry, and SPSS 25.0 and MATLAB R2020b were used for statistical analysis. Measurement data were described by median and quartiles, while count data were described by frequency and percentage. Measurement data of the two groups were compared with the Mann–Whitney U test, ranked data with the Wilcoxon W test, and count data with the χ2 test. Precision, recall, accuracy, F1 index and AUC were used to evaluate the predictive ability of the models. In all analyses, statistical significance was set at P < 0.05.
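The five evaluation measures can be computed directly from true labels, predicted labels and decision scores. The sketch below is a plain-Python illustration, not the MATLAB code used in the study; it assumes readmission is coded 1 and no readmission 0, and the function names and toy data are hypothetical. AUC is computed in its rank form, as the proportion of readmitted/non-readmitted pairs in which the readmitted patient has the higher score.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy and F1 for binary labels (1 = readmitted)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example with six hypothetical patients.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```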

Results

General information of subjects

This study investigated 1058 elderly patients with COPD, including 645 males (60.96%) and 413 females (39.04%), aged from 60 to 96 years old. For other general information, see Table 1.

Table 1 Single factor analysis of different characteristics of the two groups

Screening of indicators for risk prediction model construction

Single factor analysis showed statistically significant differences in age, education level, smoking status, diabetes, coronary heart disease (CHD), number of hospitalizations for acute exacerbation of COPD in the past year, seasonal factors, long-term home oxygen therapy, regular medication, rehabilitation exercise, course of disease, ADL, GDS, mMRC and MNA-SF (P < 0.05) (Table 1).

The factors that were statistically significant in the single factor analysis were entered as independent variables, with readmission (yes or no) as the dependent variable, in a binary logistic regression analysis. The results showed that education level, smoking status, CHD, number of hospitalizations for acute exacerbation of COPD in the past year, long-term home oxygen therapy, regular medication, MNA-SF and seasonal factors were the influencing factors (P < 0.05) (Table 2). These eight significant factors were incorporated into the model construction.

Table 2 Logistic regression of acute exacerbation readmission in elderly patients with COPD within 30 days

Construction and verification of risk prediction model

The eight factors that were significant in the logistic regression analysis were used to build the SVM model. The subjects were divided into a training set (n = 741) and a test set (n = 317) at a 7:3 ratio.
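A 7:3 split of 1058 subjects gives the reported set sizes of 741 and 317. The sketch below shows one way to produce such a split with a simple random shuffle; the function name, the seed and the absence of stratification are assumptions, since the text does not describe how the split was drawn.

```python
import random

def split_indices(n_subjects, train_fraction=0.7, seed=0):
    """Shuffle subject indices and cut them into training and test sets."""
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)  # fixed seed only for reproducibility
    n_train = round(n_subjects * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(1058)
print(len(train_idx), len(test_idx))  # 741 317
```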

Because the variables are on different scales with wide value ranges, the data were normalized to [−1, 1] before SVM modeling, to simplify computation and to prevent variables with small values from being overlooked. Grid search and cross-validation were used to find the optimal (c, g) pair, which was then used to build SVM models with the linear, polynomial, sigmoid and RBF kernel functions, respectively.
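Normalization to [−1, 1] is commonly done with per-variable min-max scaling, sketched below in plain Python. The function names and the example values are hypothetical, and fitting the ranges on the training set and reusing them on the test set is an assumed design choice that the text does not confirm.

```python
def fit_ranges(rows):
    """Return the (min, max) of every column in a list of feature rows."""
    return [(min(column), max(column)) for column in zip(*rows)]

def scale_rows(rows, ranges):
    """Map each value linearly so that its column min -> -1 and max -> +1."""
    scaled = []
    for row in rows:
        new_row = []
        for value, (low, high) in zip(row, ranges):
            if high == low:
                new_row.append(0.0)  # constant column carries no information
            else:
                new_row.append(2.0 * (value - low) / (high - low) - 1.0)
        scaled.append(new_row)
    return scaled

# Hypothetical training rows: [age in years, coded education level].
train_rows = [[60, 0], [78, 1], [96, 2]]
ranges = fit_ranges(train_rows)
print(scale_rows(train_rows, ranges))  # [[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]]
```

With this choice, a test-set value outside the training range maps slightly outside [−1, 1], which avoids using test data when fitting the scaling.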

Comparison of predictive performance of risk prediction model

Predictive performance was evaluated by precision, recall, accuracy, F1 index and AUC. In both the training set and the test set, the RBF-SVM model outperformed the Linear-SVM, Polynomial-SVM and Sigmoid-SVM models on all five measures (Table 3, Figs. 2 and 3).

Table 3 Predictive performance indicators of the four kernel functions

Fig. 2 ROC of the training sets of four functions

Fig. 3 ROC of the test sets of four functions

Discussion

Studies have shown that acute exacerbations of COPD recur over the course of the disease and may lead to readmission within a short period after discharge. Because the interval between exacerbations is so short, readmission for acute exacerbation within 30 days not only severely damages lung function and increases the risk of death, but also consumes substantial medical resources [4]. This study used SVM with linear, polynomial, sigmoid and RBF kernel functions to develop and verify a risk prediction model for 30-day acute exacerbation readmission in elderly patients with COPD. Twenty-eight variables were analyzed, and the 8 factors that were significant in the logistic regression analysis were finally included in the model: education level, smoking status, CHD, number of hospitalizations for acute exacerbation of COPD in the past year, seasonal factors, home oxygen therapy, regular medication and nutritional status. The prediction model performed well on both the training set and the test set, and the RBF kernel model fitted best on both. This result is consistent with the comparison of the four kernel functions by Du et al. [12]. It may be related to the characteristics of the RBF kernel function itself: the RBF kernel determines the classification boundary according to the distance to each support vector, which makes it the most flexible of the four. The eight variables required by the model are readily available during follow-up of elderly patients with COPD, suggesting that the model may be a convenient tool for predicting the risk of readmission for acute exacerbation within 30 days. It could help medical staff assess and manage the risk of acute exacerbation, slow disease progression and reduce mortality.

In this study, education level and smoking status were influencing factors of readmission for acute exacerbation. Collinsworth et al. [22] showed that a high level of self-management education was a protective factor against readmission. In this study, patients with a higher education level (β = −0.358) had a lower risk of readmission for acute exacerbation, which may be because such patients are better able to understand health education and can obtain more knowledge about COPD prevention and treatment through various media and the Internet. Smoking is significantly associated with readmission for acute exacerbations. An analysis of four studies including 21,270 subjects with 52 weeks of observation found that smokers had a higher risk of readmission than former smokers and never-smokers. That analysis also found that current or heavy smokers had a higher frequency of exacerbations than former or light smokers [RR: (0.81–0.99): (0.92–1.29)], which may be because smoking counteracts the therapeutic effect of inhaled glucocorticoids and aggravates the patient's inflammatory state, causing greater damage to lung function [22]. Therefore, patients with a low education level should receive health education in an easily accepted form, such as video, and patients who smoke should be informed of the impact of smoking on the disease and the harm it does to the body.

CHD, the number of hospitalizations for acute exacerbation of COPD in the past year, and seasonal factors were influencing factors of readmission for acute exacerbation. COPD often coexists with other diseases, most commonly cardiovascular disease, represented by CHD. Patients with CHD have more severe shortness of breath after exercise and lower exercise tolerance [23]. In addition, this study found that patients with ≥ 2 hospitalizations for acute exacerbation in the past year were 3.891 times more likely to be readmitted than patients with < 2. Other studies have also confirmed that the number of exacerbations in the past year can predict the risk of early readmission in COPD patients and that, regardless of disease severity, a history of exacerbations is the best predictor of later exacerbations [24]. Acute exacerbation of COPD is usually caused by viral or bacterial infection, which not only increases the risk of readmission for acute exacerbation but also leads to declines in lung function and quality of life and to increased mortality. Preventing acute exacerbations of COPD is therefore particularly important. Medical staff should provide health education to patients hospitalized for acute exacerbation ≥ 2 times in the past year to help them master the knowledge of pulmonary rehabilitation. Acute exacerbation of COPD has also been shown to be closely related to seasonal changes. At low temperatures, patients have obvious immune dysfunction, and chronic airway inflammation is likely to worsen after stimulation by cold air [25]. In addition, COPD patients are mostly elderly; their airway reactivity is increased, they are more sensitive to changes in air temperature, and they are prone to airway spasm [23]. This study found that the readmission rates of elderly COPD patients in spring, summer, autumn and winter were 20.60%, 21.36%, 26.66% and 31.38%, respectively. The risk of readmission in spring and winter was 3.814 times and 5.070 times that in summer, respectively, similar to the results of a study by Fang Jiaying [26].

Home oxygen therapy, regular medication and nutritional status were influencing factors of readmission for acute exacerbation. Long-term home oxygen therapy, as one of the treatment measures in the stable-phase management of COPD patients, can delay the progression of the disease. In addition, COPD is mainly managed with long-term drug treatment. Many kinds of drugs are currently used to treat COPD, and elderly patients have poor memory and poor medication compliance. In this study, only 23.35% of the patients took medication regularly as prescribed by the doctor. The multivariate analysis showed that the risk of readmission in patients who took medication regularly was 0.313 times that of patients who did not, which was consistent with the results of Kume [27]. This may be because regular medication can improve pulmonary ventilation, effectively prevent the development of the disease, and thus reduce the number of hospitalizations for acute exacerbations. COPD is a chronic wasting disease that commonly leads to malnutrition through increased energy consumption, electrolyte disturbances, digestive disturbances and the influence of drugs [28,29,30]. In this study, the proportions of elderly COPD patients at risk of malnutrition and with malnutrition were 44.71% and 28.07%, respectively. The results of multivariate analysis showed that patients with poorer nutritional status (β = 1.541) had a higher risk of readmission, which was similar to the findings of Wu et al. [31]. This may be because poor nutritional status causes muscle atrophy, which weakens respiratory muscle strength, further impairs respiratory function and promotes acute exacerbation of COPD.
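The regression coefficients quoted in this discussion (β = −0.358 for education level and β = 1.541 for nutritional status) can be put on the same footing as the "times" figures by converting them to odds ratios, OR = exp(β). The short sketch below does this conversion; it assumes the reported β values are binary logistic regression coefficients, and the labels are illustrative because the text does not state how each variable was coded.

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

# Coefficients quoted in the discussion; labels are illustrative only.
coefficients = {
    "education level": -0.358,
    "nutritional status (MNA-SF)": 1.541,
}

for name, beta in coefficients.items():
    print(f"{name}: beta = {beta:+.3f}, OR = {odds_ratio(beta):.3f}")
```

An odds ratio below 1 (about 0.70 for education level) indicates lower odds of readmission, and one above 1 (about 4.67 for nutritional status) indicates higher odds.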

Although the SVM-based model predicted the 30-day acute exacerbation readmission risk of elderly COPD patients with good performance on both the training set and the test set, this study has some limitations. First, the training set and the test set come from the same sample, which may limit the generalizability of the model to other populations, hospitals and regions. Second, because C-reactive protein and sputum culture, which are relevant to COPD, are not routine tests in some of the surveyed hospitals, these indicators were not included in this study, which may have biased the results; they can be added to future analyses.

Conclusions

In summary, this study identified factors that may affect readmission. The SVM model built on these factors showed a certain predictive effect for patients' readmission risk and can provide reference information for risk assessment and for clinical prevention and treatment of readmission.