Background

Stroke, the second most common cause of death in Korea, is a considerable burden on Korean society. According to an annual report of the National Health Insurance Corporation in 2009, the estimated cost to treat and manage cerebrovascular diseases totals 920 billion won (approximately 0.87 billion dollars) [1].

Historically, traditional Korean medicine (TKM), which considers the location, cause, and nature of a patient’s disease, has been the major modality for treating stroke patients in the Korean medical system.

TKM has a unique diagnostic system called pattern identification (PI), which distinguishes it from modern Western medicine. PI is a process of overall analysis of clinical data using an integrative approach that addresses the etiology, pathology, and treatment method [2]. This diagnostic system is theoretically well organized and involves the classification of patients into certain categories. However, other medical systems have not recognized the pattern classifications made by TKM doctors as a diagnostic system, which prompted these doctors to study whether they could make the same diagnosis in identical patients (reliability) and whether their diagnoses were accurate (validity). Considering the importance of PI in determining treatment regimens, an accurate questionnaire is warranted.

In the last five years, many TKM doctors have studied the diagnostic validity and reliability of TKM for various health conditions [2-6]. Lee et al. [7] released the Korean Standard Pattern Identification for Stroke (K-SPI-Stroke) in 2011, which included the signs and symptoms used by TKM doctors to diagnose patterns of stroke in Korean patients. In this study, we evaluated the reliability and validity of the K-SPI-Stroke.

Methods

Study subjects and data collection

This study was a community-based multi-center trial of stroke patients admitted to 11 TKM hospitals. We enrolled stroke patients within 30 days of the onset of their symptoms, as confirmed by imaging diagnosis such as computerized tomography (CT) or magnetic resonance imaging (MRI). We excluded traumatic strokes, such as subarachnoid, subdural, and epidural hemorrhages. We also excluded patients who exhibited a blood stasis pattern. Blood stasis was originally one of the possible patterns, but a previous study [7] reduced the number of patterns from five to four, dropping blood stasis because of its small sample size. Informed consent was obtained from all of the subjects, and PI was performed using the 44 signs and symptoms related to the four patterns. To minimize differences between the two diagnoses, each patient was independently diagnosed with one of the four patterns by two physicians on the same day. The physicians had at least three years of clinical experience with stroke after finishing their regular (six-year) education in TKM, and they took regular training courses on standard operating procedures (SOP) twice a year. The number of physicians at each site ranged from 2 to 25, and a total of 105 physicians were included in this study. On the basis of the physicians’ diagnostic results, a patient was classified as having a given pattern when the two physicians’ decisions were in agreement. Over 40 months (from September 2006 to December 2010), 2,905 patients participated in this study. This study was approved by the Institutional Review Boards of the Korea Institute of Oriental Medicine (KIOM) and each of the participating TKM hospitals.

Pattern identification questionnaire

The K-SPI-Stroke questionnaire was developed in a previous study via a multistage process [8]. First, the signs and symptoms included in the K-SPI-Stroke were selected from the TKM literature, textbooks, and clinical papers by an expert committee organized by the KIOM. Second, a K-SPI-Stroke prototype featuring various signs and symptoms was applied to test the data from recruited patients. The K-SPI-Stroke was then reexamined by a committee of experts and adjusted to maximize its validity. Consequently, the final version of the K-SPI-Stroke included 44 sign and symptom entries (11 Qi deficiency pattern items, 7 Dampness-phlegm pattern items, 7 Yin deficiency pattern items and 19 Fire-heat pattern items) [7, 9, 10]. The severity for each item is graded on the following scale: 1 = not significant, 2 = significant and 3 = very significant. To convert this scale to a dichotomous scoring system, 0 points were assigned for a grade score of 1, and 1 point was given for each item receiving a graded score of 2 or 3. The dichotomous scores of each item were summed, and the K-SPI-Stroke score (Qi deficiency pattern score: QDP score, Dampness-phlegm pattern score: DPP score, Yin deficiency pattern score: YDP score, and Fire-heat pattern score: FHP score) was calculated.
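The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the source gives the number of items per pattern (11, 7, 7 and 19) but not their order on the form, so the index ranges in `PATTERN_ITEMS` are an assumption, and the example patient is invented.

```python
# Hypothetical item layout: the paper reports only the item counts per pattern,
# so these index ranges are assumed for illustration.
PATTERN_ITEMS = {
    "QDP": range(0, 11),   # 11 Qi deficiency items
    "DPP": range(11, 18),  # 7 Dampness-phlegm items
    "YDP": range(18, 25),  # 7 Yin deficiency items
    "FHP": range(25, 44),  # 19 Fire-heat items
}


def dichotomize(grade):
    """Map a severity grade to 0 (grade 1) or 1 (grade 2 or 3)."""
    if grade not in (1, 2, 3):
        raise ValueError(f"grade must be 1, 2 or 3, got {grade!r}")
    return 0 if grade == 1 else 1


def pattern_scores(grades):
    """Return the four pattern scores for one patient's 44 graded items."""
    if len(grades) != 44:
        raise ValueError("expected 44 graded items")
    binary = [dichotomize(g) for g in grades]
    return {name: sum(binary[i] for i in idx) for name, idx in PATTERN_ITEMS.items()}


# Invented patient: every item graded 1 except three Fire-heat items.
example = [1] * 44
example[25], example[26], example[30] = 2, 3, 2
print(pattern_scores(example))
```

Because grades 2 and 3 both map to 1 point, each pattern score is simply the count of items in that pattern that the physician judged at least "significant".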

Statistical methods

The internal consistency of the questionnaire was assessed using Cronbach’s α coefficient. It has been suggested that values of α > 0.5 are acceptable, although ideally the values should be > 0.7 [11, 12]. The K-SPI-Stroke score of each pattern was calculated by summing all of the relevant item scores for each pattern category. As an indication of discriminant validity, the mean score of the patients’ diagnosed pattern should be significantly higher than the mean scores of the other, nondiagnosed patterns, indicating the scores’ ability to differentiate among the patterns. The underlying hypothesis was that stroke patients would have higher scores for their diagnosed pattern and lower scores for the nondiagnosed patterns. To assess discriminant validity, that is, whether measures that are supposed to be unrelated are in fact unrelated [13], we used a one-way analysis of variance (ANOVA) to compare the mean scores of the four patterns. Post hoc comparisons were performed using Scheffé’s test. To evaluate the predictive validity, we used a classification accuracy test with a linear discriminant function for the four patterns. A P value of less than 0.05 was considered statistically significant. All analyses were performed with SAS version 9.1.3 (SAS Institute, Cary, NC).
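For readers unfamiliar with Cronbach’s α, the following Python sketch shows the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). It is not the SAS procedure used in the study; the function name and the small data set are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-patient item-score lists.

    Every inner list must hold the same k items in the same order.
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented dichotomous responses: 5 respondents, 4 items.
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
print(round(cronbach_alpha(data), 3))
```

When all items rise and fall together across respondents, the variance of the total score dominates the sum of the item variances and α approaches 1.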

Results

Demographic characteristics of patients with stroke

The mean age of the 2,905 subjects was 66.98 (standard deviation, 11.55) years. There were 1,530 male (52.67%) and 1,375 female (47.33%) patients. Approximately 65% of the patients had histories of drinking and smoking. Patients with diabetes and hyperlipidemia accounted for 26.73% and 12.63% of the population, respectively. The frequency of stroke patients with hypertension (60.54%) was higher than the frequency of patients with diabetes or hyperlipidemia. Other variables are also presented in Table 1.

Table 1 Demographic characteristics of patients with stroke

Reliability analysis

Cronbach’s α coefficient was computed to test the internal consistency of the 44 items on the K-SPI-Stroke questionnaire. Cronbach’s α was 0.70 for the 44 signs and symptoms, indicating that the K-SPI-Stroke exhibited good internal consistency overall. The α values for the individual patterns ranged from 0.42 to 0.67 (Table 2).

Table 2 Internal consistency (Cronbach’s coefficient α) total and within each pattern and mean scores for each item of patterns

Validity analysis

Because we expected to find that the mean scores of each pattern would differ depending on the diagnosed pattern of the patient, we assessed the discriminant validity using an ANOVA for the mean scores of the four patterns. We compared the means of the scores (FHP, DPP, QDP and YDP scores) to determine whether differences could be detected among the four patterns. The mean FHP score for the FHP patients was significantly higher than that for the DPP, QDP and YDP patients. The mean score of patients’ diagnosed pattern was significantly higher than the mean scores of the other patterns (Table 3).
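The comparison above rests on the one-way ANOVA F statistic, which can be sketched as follows. This is a minimal pure-Python illustration, not the SAS analysis used in the study: the scores are invented, the function name is hypothetical, and neither the P value (which requires the F distribution) nor Scheffé’s test is computed.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented FHP scores, grouped by the pattern the physicians diagnosed.
fhp_scores_by_diagnosis = {
    "FHP": [8, 9, 10],
    "DPP": [4, 5, 6],
    "QDP": [3, 4, 5],
    "YDP": [5, 6, 7],
}
f_stat, df_b, df_w = one_way_anova_f(list(fhp_scores_by_diagnosis.values()))
print(f_stat, df_b, df_w)
```

A large F indicates that the pattern score differs between diagnostic groups by more than the within-group scatter would explain; a post hoc test is still needed to establish which groups differ.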

Table 3 Discriminant validity of the four patterns

To evaluate the predictive validity, we used a discriminant analysis to classify the four patterns. The linear discriminant function, which consisted of the 44 items for the four patterns, is not shown. The predictive function took the distribution of the patterns into account, and its performance was assessed with the classification rates for the whole model and for the individual patterns. Table 4 shows the classification results for the four patterns. The overall accuracy of the classification of the four patterns was 65.2%, and the individual classification accuracy for QDP, DPP, YDP and FHP was 64.13%, 72.61%, 41.77%, and 68.23%, respectively.
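Overall and per-pattern classification accuracy of this kind are derived from a confusion matrix of diagnosed versus predicted patterns. The sketch below shows the arithmetic only; the counts are invented and are not the values in Table 4, and the function name is hypothetical.

```python
def classification_accuracy(confusion):
    """Return (overall %, {pattern: %}) from confusion[diagnosed][predicted] counts."""
    per_pattern = {}
    correct = total = 0
    for diagnosed, row in confusion.items():
        row_total = sum(row.values())
        if row_total == 0:
            raise ValueError(f"no patients diagnosed with {diagnosed}")
        hits = row.get(diagnosed, 0)  # predicted pattern matches the diagnosis
        per_pattern[diagnosed] = 100 * hits / row_total
        correct += hits
        total += row_total
    return 100 * correct / total, per_pattern


# Invented counts: rows are physician diagnoses, columns are predictions.
confusion = {
    "QDP": {"QDP": 60, "DPP": 20, "YDP": 10, "FHP": 10},
    "DPP": {"QDP": 15, "DPP": 75, "YDP": 5, "FHP": 5},
    "YDP": {"QDP": 20, "DPP": 10, "YDP": 40, "FHP": 30},
    "FHP": {"QDP": 10, "DPP": 10, "YDP": 10, "FHP": 70},
}
overall, per_pattern = classification_accuracy(confusion)
print(overall, per_pattern)
```

The overall figure weights each pattern by its number of patients, so a pattern with few patients and low accuracy can be masked by larger, better-classified patterns.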

Table 4 Results using the classification of discriminant linear function

Discussion

This study demonstrates that the K-SPI-Stroke questionnaire is a reliable and valid instrument. Reliability and validity are two important factors in designing a questionnaire. Reliability is concerned with the repeatability or reproducibility of the measurements, while validity reflects the accuracy of the data and ensures that responses are a true reflection of the issues of interest [14-17]. To test the reliability of the K-SPI-Stroke questionnaire, we evaluated its internal consistency using Cronbach’s α. Cronbach’s α equals zero when the true score is not measured at all and the data show only error or noise; when it equals 1.0, all of the items measure the true score alone without any error contribution.

The K-SPI-Stroke questionnaire had strong internal consistency, with a Cronbach’s α of 0.700 for the total signs and symptoms. However, the α values for the individual patterns were unsatisfactory, ranging from 0.424 to 0.674. The poor internal consistency within each pattern may suggest that the pattern constructs are not homogeneous, or perhaps that the signs and symptoms are not appropriate measures of these constructs for stroke. Supplementing the patterns of the K-SPI-Stroke with additional signs and symptoms, or eliminating poorly performing signs and symptoms, may improve the internal consistency of this questionnaire.

Cronbach’s α increased to a maximum of 0.715 when “wheezing in throat with sputum” was removed. Although this result implies that “wheezing in throat with sputum” does not measure stroke with the same validity as the other items, this item was not removed because this symptom had little influence on the overall internal consistency of the K-SPI-Stroke questionnaire.
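The check described here is commonly called "α if item deleted": α is recomputed with each item left out in turn. A minimal Python sketch follows, using invented data in which one item runs against the others; the function names are hypothetical and this is not the SAS output used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-patient item-score lists."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Return {item index: alpha recomputed without that item}."""
    k = len(rows[0])
    if k < 3:
        raise ValueError("need at least three items to drop one")
    return {
        drop: cronbach_alpha([[v for i, v in enumerate(row) if i != drop] for row in rows])
        for drop in range(k)
    }


# Invented responses: items 0-2 move together, item 3 does not.
data = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
full_alpha = cronbach_alpha(data)
deleted = alpha_if_item_deleted(data)
best_drop = max(deleted, key=deleted.get)
print(round(full_alpha, 3), best_drop, round(deleted[best_drop], 3))
```

An item whose removal raises α is a candidate for revision, although, as the authors note for their own questionnaire, a small gain may not justify dropping a clinically meaningful symptom.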

In the analysis of validity, we used two methods: a comparison of scores and a classification accuracy test. The mean score of the patients’ diagnosed pattern was significantly higher than the mean scores of the other patterns. In other words, the QDP, DPP, YDP and FHP patients reported the highest mean score for the QDP, DPP, YDP and FHP pattern, respectively. Thus, each score was a good reflection of the patient’s pathologic pattern. The second method was to compare the classification results with the physicians’ diagnoses to show the classification accuracy. The overall classification accuracy of the four patterns was 65.2%, and the classification accuracy of the QDP, DPP, YDP and FHP was 64.13%, 72.61%, 41.77%, and 68.23%, respectively.

Among the YDP patients, the YDP score was significantly higher than the scores of the other patterns (Table 3). However, the classification accuracy of the YDP (41.77%) was lower than that of the other patterns (Table 4). This result indicates that some of the YDP items may not have accurately reflected the YDP pattern. In other words, the K-SPI-Stroke questionnaire does not discriminate YDP well from the other patterns because, in TKM, YDP simultaneously includes a Heat element of the FHP and a deficiency element of the QDP.

Conclusion

A 44-item K-SPI-Stroke questionnaire was developed, which included 33 items related to signs and 11 items related to symptoms of stroke. The K-SPI-Stroke questionnaire had satisfactory reliability (α = 0.700), predictive validity (classification accuracy of 65.2%) and discriminant validity, with significant differences in the mean scores among the patterns. Further studies are warranted to overcome the limitation in predictive validity.