Background

The Japanese national health insurance system covers 148 Kampo formulas—which are traditional Japanese herbal formulas mainly derived from ancient China—making them widely available, and each formula has some indications for Western diseases and/or symptoms. Approximately 90 % of Japanese physicians who learned Western medicine use Kampo formulas in daily practice [1, 2]. However, in medical school or continuing medical education, medical students and physicians have limited exposure to Kampo medical education. Thus, it is difficult for most Japanese physicians who do not specialize in Kampo medicine to prescribe Kampo formulas according to the traditional medicine pattern diagnoses and theories of Kampo medicine, which are far different from those of Western medicine. Therefore, the Japanese non-specialist physicians prescribe Kampo formulas according to the indications based on Western diseases and/or symptoms (i.e., maoto for influenza, yokukansan for dementia, and shakuyakukanzoto for muscle cramps [37]). This “Western disease based” prescription of Kampo formula is easy to perform for non-specialist physicians but far different from the original way of prescription based on the pattern diagnoses. For example, patients diagnosed with a certain Western disease may have various subjective symptoms and objective findings, and will be classified by pattern diagnoses in Kampo medicine. Moreover, different Kampo formulas can be prescribed for patients with the same pattern diagnosis. Questions about safety, effectiveness, and cost of “Western disease based” prescription of Kampo medicine still remain.

A pattern diagnosis in Kampo medicine refers to the complete clinical presentation of the patient at a given moment in time. Physicians specialized in Kampo medicine use four methods of procedures to make their diagnoses including: inspection, hearing, enquiry, and palpation. Based on the information obtained through these complex procedures, the diagnoses are formed through the process of applying the differential diagnoses for “disharmony symptoms” in the areas of excess–deficiency, heat–cold, and body constituents (Qi, blood, and fluid) to chronic health conditions [8]. However, because there are many ingredients of Kampo formulas and countless traditional medicine pattern diagnoses, it is difficult for non-specialist physicians to accurately and promptly choose a suitable Kampo formula. In this situation, a decision support system (DSS) is required for non-specialist physicians.

We are developing a DSS for non-specialist physicians based on a clinical database created by specialist physicians. This system does not rely on traditional measurement methods such as inspection, hearing, and palpation (including pulse and abdominal examinations), which are difficult for non-specialist physicians to perform. Our DSS comprises two parts: (1) the prediction of the traditional medicine pattern diagnosis, and (2) the prediction of the appropriate Kampo formula. We have already reported on the first part, including the two essential diagnoses—excess–deficiency pattern [9] and heat–cold pattern [10]. However, the second part has not yet been reported on. When the non-specialist physicians can select the appropriate Kampo formula with our DSS, the formula selection will be more safe, efficient, and cost-effective. The standardized and reproducible formula selection will be used for the clinical trials of Kampo medicine which include the idea of pattern diagnoses and proper Kampo formulas.

Herein, we discuss the preliminary results on the use of the DSS for predicting Japanese Kampo formulas for non-specialist physicians.

Methods

Patient enrollment

Keio University first introduced a browser-based questionnaire for collecting clinical information in May 2008. The present observational study included patients who made their first visit to the Kampo Clinic at Keio University Hospital between May 2008 and March 2013.

Inclusion criteria were a willingness to be in the study and having more than 20 subjective symptoms as determined by 128 questions in the browser-based questionnaire. We excluded patients who answered “yes” to less than 20 symptoms because they tended to be outliers in our previous research [9]. The participants had to be taking at least one Kampo formula, but not more than two.

Exclusion criteria were missing data on age, sex, body mass index (BMI), subjective symptoms, traditional measurements, information on Western diagnoses, and traditional medicine pattern diagnoses, as well as not being prescribed a Kampo formula. Actually, data collection of BMI was started from January 2012, and most of patients who made their first visit until December 2011 were excluded.

Data collection

The browser-based questionnaire collected clinical information about patient’s subjective symptoms and symptom severity via visual analogue scales (VAS), along with information on age, sex, BMI, traditional measurements, Western diagnoses based on the International Statistical Classification of Diseases and Related Health Problems (ICD-10), traditional medicine pattern diagnoses, and Kampo formulas prescribed by Kampo specialists. We collected information about patients’ subjective symptoms using a questionnaire comprising 128 binary questions [11]. You can find all the items in Appendix: Tables 4 and 5. Among these questions, 106 were also assessed using a VAS if patients answered with “yes” to the binary portion. To normalize within each patient, we divided each patient’s VAS score by the maximum VAS possible. Data from each question on the traditional measurements, Western diagnoses, traditional medicine pattern diagnoses, and Kampo formulas were binary.

Model fitting procedure

We applied random forests in accordance with our previous reports [9, 10] to predict the frequently used Kampo formulas. Kampo formula for each data record was based on the browser-based questionnaire, and was selected by a Kampo specialist at the first consultation for the patient in Keio University. The random forests method was developed by Breiman [12, 13] and is a classification algorithm that uses an ensemble of classification trees. Random forests build a large collection of decorrelated trees and then averages them [14]. Random forests often have very good predictive accuracy [15] and have been widely used in many fields (e.g., body pose recognition for Microsoft’s popular Kinect sensor [16]).

We set the number of candidate Kampo formulas from among the most frequent two or more. If there were Kampo formulas with the same frequency, we added either one or all of them as candidates.

We used two sets of predictor variables: the non-specialist variable set and the specialist variable set. In this analysis, we did not perform the variable selection. We intended to validate the difference in discriminant rates between these two variable sets to see if the non-specialist variable set, which did not include all of the items Kampo specialists use in selecting a proper Kampo formula, was appropriate for selecting Kampo formulas. The non-specialist variable set included age, sex, BMI, subjective symptoms, and the two essential and predictable traditional medicine pattern diagnoses—excess–deficiency and heat–cold—according to Kampo specialists. The specialist variable set included ten abdominal examination findings, which are especially important traditional measurement methods, and eight body constituent patterns, in addition to all of the predictor variables of the non-specialist variable set [8, 17].

The two essential patterns of the non-specialist variable set were used because we found that they can be accurately predicted using BMI and subjective symptoms [9, 10]. In contrast, the body constituent patterns were included only in the specialist variable set because they are rather difficult to predict. Abdominal examination findings were also included only in the specialist variable set because it is difficult for non-specialist physicians to perform abdominal examinations without training in Kampo medicine. Furthermore, the findings of abdominal examinations using the traditional method have been reported to vary between observers [18].

An internal validation of the prediction model was based on leave-one-out cross-validation (LOOCV) [19]. In this procedure, we set one patient’s data as validation set and the remaining data as training set. For example about the two candidates, hachimijiogan (54 patients) and kamishoyosan (46 patients), we obtained the prediction model for the Kampo formulas with 99 training data, and predicted Kampo formulas for rest one validation data. We calculated the discriminant rate repeating this procedure for all the 100 data. We did not perform any external validation in this preliminary analysis.

Variable importance and the marginal effects of random forests

Random forests can calculate the mean decrease in the Gini coefficient of each tree. This is called the importance. At each split in each tree, the improvement in the split criterion is an indicator of the importance attributed to the splitting variable; this importance is accumulated over all the trees in the forest separately for each variable. A higher importance means that the variable makes a more sizable contribution to predicting the Kampo formulas.

Random forests also calculate the marginal effect of a variable on the class probability, called the partial dependency. A positive value means that the variable contributes positively to selection of a Kampo formula. In contrast, a negative value indicates that the variable contributes negatively to selection of the Kampo formula. Finally, a value of zero indicates that the variable does not contribute at all to the selection of the Kampo formula.

Statistical analyses

All statistical analyses were conducted with the use of R version 3.1.1 (The R Foundation for Statistical Computing; July 10, 2014, see also: http://www.r-project.org). We used the package “randomForest” [20] and parameters were used as default settings. Data are presented as means ± standard deviations (SDs).

Results

Participant information

We registered 4057 patients who made their first visit to the Kampo Clinic at Keio University Hospital between May 2008 and March 2013. The largest reason of ineligibility was missing data on BMI, which was happened for 2776 patients (75.8 %) who made their first visit until December 2011. About a half of patients had 19 or fewer subjective symptoms, and about one third of patients were prescribed two or more Kampo formulas. Finally, we decided to use data from 393 patients in this analysis, including 57 male and 336 (85.5 %) female (Fig. 1). The mean age was 57.9 ± 16.3 years old, and the mean BMI was 21.5 ± 3.3 kg/m2. Of the 393 patients, 17 were categorized as showing an excess pattern, 55 a slight excess pattern, 189 an intermediate (between excess and deficiency) pattern, 79 a slight deficiency pattern, and 53 a deficiency pattern. Furthermore, of the total sample, 16 were categorized as showing a heat pattern, 52 an intermediate (neither heat nor cold) pattern, 214 a cold pattern, and 111 a tangled heat and cold (both heat and cold) pattern (Table 1).

Fig. 1
figure 1

Participant recruitment flow diagram. The largest cause of ineligibility was missing data of on body mass index, which was happened for 2776 patients (75.8 %) who made their first visit between May 2008 and December 2011. Actually, data collection of body mass index was started from January 2012, and most of patients who made their first visit until December 2011 were excluded

Table 1 Baseline characteristics of participants

Hachimijiogan was the most frequently prescribed Kampo formula (for 54 patients) in our data set, followed by kamishoyosan (46 patients), keishikaryukotsuboreito (33 patients), and keishibukuryogan (29 patients). Shimbuto, bukuryoingohangekobokuto, and yokukansan were each used for 17 patients. We summarized the top 10 frequently used Kampo formulas in Table 2. The mean age for patients prescribed hachimijiogan was the highest of the ten Kampo formulas, and the mean BMI for patients prescribed keishikaryukotsuboreito was the lowest of the ten formulas.

Table 2 Demographics and standard traditional pattern diagnosis of frequently used Kampo formulas

Model fitting procedure

We applied random forests for the prediction of Kampo formulas from the selected candidates using either the non-specialist or specialist variable set. The discriminant rate via LOOCV was highest when we tried to predict a Kampo formula from among two candidates, hachimijiogan and kamishoyosan (Fig. 2). The discriminant rate decreased as the number of candidate Kampo formulas increased; in particular, it showed a marked drop in shifting from three to four candidates, meaning when we added the fourth Kampo formula keishibukuryogan to the top three formulas, hachimijiogan, kamishoyosan, and keishikaryukotsuboreito (see also Table 2).

Fig. 2
figure 2

Leave-one-out cross-validation accuracy of prediction according to the number of candidate Kampo formulas. The discriminant rate via leave-one-out cross-validation was highest when we tried to predict a Kampo formula from among two candidates, hachimijiogan and kamishoyosan. The discriminant rate decreased as the number of candidate Kampo formulas increased; in particular, it showed a marked drop in shifting from three to four candidates, meaning when we added the fourth Kampo formula keishibukuryogan to the top three formulas, hachimijiogan, kamishoyosan, and keishikaryukotsuboreito (see also Table 2). The non-specialist variable set included age, sex, body mass index, and subjective symptoms, as well as the two essential and predictable traditional medicine pattern diagnoses (excess–deficiency and heat–cold) according to Kampo specialists. The specialist variable set included abdominal examination findings and body constituent patterns in addition to all of the predictor variables of the non-specialist variable set

The discriminant rates were 4 to 11.7 % higher when we used the specialist variable set than when using the non-specialist variable set. This gap was smallest when we tried to predict Kampo formulas from among two candidates.

Variable importance and marginal effects from random forests

Non-specialists would appear to benefit from having fewer candidates when selecting Kampo formulas. In addition, the discriminant rate dropped considerably when the number of candidate Kampo formulas reached four as noted in the previous section. As such, for the non-specialist variable set, we selected a prediction model wherein Kampo formulas were predicted from among three candidates. We then analyzed how this model worked with our data set.

Among the top 30 items in terms of importance, we found that age and BMI had the highest and second highest values, respectively (Table 3). Here, we focused on the top 10 variables with higher importance, and analyzed how they worked in this model. Figure 3 shows the partial dependency plot about the top 10 important variables in Table 3. For example, in the upper left plot for age and BMI, the solid line indicates that those with an age of greater than 60 and a BMI over 20 had a positive value for hachimijiogan. In other words, the model suggests that hachimijiogan be selected when the patients are older than 60 and have a BMI over 20. In contrast, the model suggests that hachimijiogan is not as suitable when the patients are younger than 60 and have a BMI under 20. Similarly, an age under 60 and a BMI over 20 suggests that kamishoyosan be selected, while a BMI under 20 indicates the selection of keishikaryukotsuboreito. The partial dependency of age for keishikaryukotsuboreito was consistently negative, meaning that all ages contributed negatively to selection of this Kampo formula, especially ages over 60 (Fig. 3; age and BMI). These findings indicate that an age of around 60 and a BMI of around 20 did not contribute to the selection of Kampo formulas, except for negatively affecting selection of keishikaryukotsuboreito. These findings on the partial dependency of age and BMI were consistent with the characteristics of patients prescribed the three Kampo formulas shown in Table 2.

Table 3 Top 30 most important variables for predicting top 3 Kampo formulas with non-specialist variable set. Higher importance values indicate that the item makes a more sizable contribution to predicting the Kampo formulas
Fig. 3
figure 3

Partial dependency of the top 10 important variables for each Kampo formula. This figure shows the partial dependency plot about the top 10 important variables in Table 3. A positive value in the plot means that the variable contributes positively to selection of the Kampo formula. In contrast, a negative value indicates that the variable contributes negatively to selection of the Kampo formula. A value of zero indicates the variable does not contribute at all to the Kampo formula. Older age, higher body mass index, excess pattern, lower depressed mood and irritability, higher early-morning awakening, lower heat sensation in the face and hot flashes, negative heat pattern, and lower neck stiffness indicated the use of hachimijiogan. In the same way, younger age, higher body mass index, excess pattern, depressed mood, higher irritability, lower early-morning awakening, higher heat sensation in the face and hot flashes, heat pattern, and higher neck stiffness indicated the use of kamishoyosan. Lower body mass index and deficiency pattern indicated keishikaryukotsuboreito, but all other items in the top 10 negatively contributed to use of this Kampo formula. Solid lines are used for hachimijiogan, broken lines for kamishoyosan, and dotted lines for keishikaryukotsuboreito

We also found that the two traditional medicine pattern diagnoses, excess–deficiency and heat–cold, had high importance (Table 3). The partial dependency plot for excess–deficiency patterns showed that the deficiency pattern indicated use of keishikaryukotsuboreito and not hachimijiogan. In contrast, the excess pattern suggested the use of hachimijiogan or kamishoyosan and not keishikaryukotsuboreito (Fig. 3; excess–deficiency pattern). In fact, Kampo specialists rarely prescribed hachimijiogan for treating patients diagnosed with a deficiency or a slight deficiency pattern. In contrast, keishikaryukotsuboreito was not prescribed for patients diagnosed with a slight excess pattern or an excess pattern (Fig. 4). The partial dependency plot for the heat pattern showed that the heat pattern indicated the use of kamishoyosan, and not hachimijiogan or keishikaryukotsuboreito (Fig. 3; heat pattern). Patients who were diagnosed with a heat pattern or a tangled heat and cold pattern by Kampo specialists were generally not prescribed hachimijiogan. In contrast, over 60 % of patients who were prescribed kamishoyosan were diagnosed as having a heat pattern or a tangled heat and cold pattern (Fig. 5).

Fig. 4
figure 4

Excess–deficiency pattern diagnoses of patients by Kampo specialists. Kampo specialists rarely prescribed hachimijiogan for treating patients diagnosed with a deficiency pattern or a slight deficiency pattern. In contrast, keishikaryukotsuboreito was not prescribed for patients diagnosed with a slight excess pattern or an excess pattern

Fig. 5
figure 5

Heat–cold pattern diagnoses of patients by Kampo specialists. Patients who were diagnosed with a heat pattern or a tangled heat and cold pattern by Kampo specialists were rare among patients who were prescribed hachimijiogan. In contrast, over 60 % of patients who were prescribed kamishoyosan were diagnosed with a heat pattern or a tangled heat and cold pattern

In a similar manner with the subjective symptoms, lack of depressed mood and irritability; early-morning awakening; and lack of heat sensation in the face, hot flashes, or neck stiffness indicated the use of hachimijiogan. In contrast, presence of depressed mood and irritability; lack of early-morning awakening; and presence of hot flashes, heat sensation in the face, and neck stiffness indicated the use of kamishoyosan. Lack of irritability and hot flashes and the presence of a depressed mood and early-morning awakening indicated the use of keishikaryukotsuboreito (Fig. 3; depressed mood, irritability, early-morning awakening, heat sensation in the face, hot flashes, and neck stiffness). This was consistent with the results related to the severity of symptoms summarized in Fig. 6. Many of the patients who were prescribed hachimijiogan did not have a depressed mood, irritability, heat sensation in the face, or hot flashes. They also tended not to have neck stiffness. In contrast, many patients who were prescribed kamishoyosan exhibited a depressed mood, irritability, heat sensation in the face, and hot flashes, but did not exhibit early-morning awakening. Patients who were prescribed keishikaryukotsuboreito tended to exhibit a more depressed mood and early-morning awakening and to not exhibit irritability or hot flashes.

Fig. 6
figure 6

Severity of subjective symptoms with top 10 importance rankings. Many of the patients who were prescribed hachimijiogan did not have depressed mood, irritability, heat sensation in the face, or hot flashes. They also tended to not have neck stiffness. In contrast, many patients who were prescribed kamishoyosan had a more depressed mood, irritability, heat sensation in the face, and hot flashes, but did not exhibit early-morning awakening. Patients who were prescribed keishikaryukotsuboreito tended to have a more depressed mood and early-morning awakening and to not have irritability or hot flashes

To further understanding, we provided the results from multinomial logistic regression when we tried to predict a Kampo formula from among the same three candidates with using non-specialist variable set in Appendix: Tables 4 and 5. You can obtain area under the receiver operating characteristic curves, and crude odds ratios with 95 % confidential intervals for each item of non-specialist variable set.

Discussion

Approximately 90 % of Japanese physicians who are well educated in Western medicine use Kampo formulas regularly despite not having specific education in Kampo medicine. We believe that they will benefit most from the DSS to make more standardized traditional medicine pattern diagnoses and prescribe appropriate Kampo formulas [2]. In this article, we discussed the preliminary results of our DSS in predicting use of appropriate Kampo formulas.

In this study, we used only 9.7 % of overall patients’ data. Although it is uncommon, we provided the baseline characteristics of participants not only about eligible patients but also ineligible patients. Our data were primarily obtained from female patients, and this sex difference is common in Japanese Kampo clinics [21, 22]. This is because some Kampo formulas, including kamishoyosan and keishibukuryogan, are especially used for female patients. These Kampo formulas are originally designed for perimenopausal or post-partum women, respectively. We could obtain similar results using only data from female patients. When we excluded male patients, female sex was disappeared from the table of important variables but the other variables had almost the same rank with same value of importance (Data not shown).

Our results suggested that a greater number of candidate Kampo formulas led to a worse discriminant rate. Specifically, the DSS could only handle around two or three frequently used Kampo formulas, even though there are far more officially approved formulas in Kampo medicine. Indeed, upon including a fourth Kampo formula, keishibukuryogan, the discriminant rate dropped by approximately 30 %. This was likely because it was difficult to differentiate between the second Kampo formula, kamishoyosan, and the fourth, keishibukuryogan (data not shown). Our data suggest that differences between the top three Kampo formulas were rather large, whereas the differences between these top three formulas and the more minor formulas were relatively small. If our DSS could handle only these two or three candidate Kampo formulas, it would not be able to support non-specialist physicians in daily practice.

Our DSS might be able to cover more Kampo formulas using cluster analysis, which has already given us other candidate formulas [23]. Our DSS will support non-specialist physicians in daily practice when the DSS will be combined with the cluster analysis. We have already reported that cluster analysis can reproduce some traditional medicine pattern diagnoses, and that frequently used Kampo formulas differed among each cluster, in accordance with the traditional medicine pattern diagnoses they best suited [23]. In other words, our model can be applied to different clusters of diagnoses, which may help us obtain the most appropriate Kampo formulas from among those most frequently used. However, there are only a few papers reporting the results of a cluster analysis on this topic. Ishizuka et al. also reported on the results of a cluster analysis of patients who visited a Kampo institution in Japan. They concluded that their cluster analysis results supported the rationale behind the empirical determination of traditional medicine pattern diagnosis [24]. However, their results supported only kidney and liver deficiency patterns, and did not discuss Kampo formulas. There are two more articles reporting cluster analyses, but they are not widely accessible because they are written in Chinese [25, 26].

We utilized two sets of predictor variables, and found that the non-specialist variable set worked well in predicting appropriate Kampo formulas. Indeed, the discriminant rates were more affected by the number of candidate Kampo formulas than by the change in the variable set. We found that we could achieve higher discriminant rates when we used the specialist variable set including the abdominal examination findings and the body constituent patterns. However, the findings from traditional measurements can vary between physicians, and thus the standardization of such traditional measurements remains an issue [27]. Use of standardized traditional measurements will help also in the prediction of body constituent patterns.

We analyzed the importance and partial dependency of the variables. It must be noted that the variables with high importance are useful only for the random forests and would not directly relate to the clinical decisions of Kampo specialists. However, we noted that many of these items were compatible with our clinical experience. While this experience is based on ancient knowledge and traditional theory, a significant amount of it has proven relevant today. For example, Odaguchi et al., in 2007, reported that a Kampo formula, goshuyuto, was effective for some patients with headaches with specific characteristics; these characteristics were similar to those written about in traditional medical textbooks [28].

We also tried using logistic regression with Lasso penalty as a linear model [29] and classification and regression trees (CART) as a non-linear model [30], but random forests performed the best in terms of the discriminant rate. For example, the discriminant rates were 85.0 and 79.0 % by logistic regression, or by CART, respectively, when we tried to predict a Kampo formula from among two candidates with non-specialist variable set. In contrast, the discriminant rates via LOOCV were both 90.0 % with specialist variable set by logistic regression methods, or by CART. From this result, the statistical methods are not important issue with enough predictors, but are important with restricted predictors.

Herein, we predicted specialists’ selections of Kampo formulas in daily clinical situations, and the selection didn’t depend on an objective golden standard or pre-defined standards. The Kampo specialists select a “proper” Kampo formula based on traditional medicine pattern diagnosis and theory. We provided the standard pattern diagnosis for each Kampo formula defined by the Japanese Society of Oriental Medicine in Table 2. However, the diagnosis in the table was not the only one way, and different pattern diagnosis can be made for the patients with the same Kampo formulas as shown in the Figs. 4 and 5. We can understand the actual connection between Kampo formula and traditional medicine pattern diagnosis with this study, but we can also recognize a problem about standardization of diagnosis and Kampo formula selection. One possible reason of this discrepancy is the indications for each Kampo formula do not include traditional medicine pattern diagnosis; rather they rely only on Western diseases and symptoms.

Additionally, we did not consider efficacy. “True” Kampo formulas must be defined by their efficacy. Although most prescriptions by specialists are effective, sometimes there may be better options. We have already noticed that a few patients got better after changing the Kampo formula which was selected at the first consultation. Then, we must carefully confirm the proper Kampo formula, especially for patients whose formulas were difficult to predict.

We did not perform variable selection in this analysis, and used many variables for small number of patients. The prediction models with small number of variables may be better for generalization like applying this method to other medical record databases. We, however, showed the results with all of the potential variables because we thought it would help non-specialist physicians to understand specialists’ tacit knowledge (See also Appendix: Tables 4 and 5).

Our aim is to further progress this DSS reflecting clinical outcomes and the experience of other institutions. Now, we are collecting clinical information through the browser-based questionnaire at some representative institutions about Kampo medicine. We have collected more than forty thousand of record from more than eight thousand patients. This data will further progress our DSS.

Conclusions

Our decision support system for non-specialist physicians works well in selecting a proper Kampo formula from two or three candidates. Additional studies are required to integrate such statistical analysis in clinical practice.