Background

Western medicine classifies coronary heart disease (CHD) as a kind of myocardial dysfunction and organic lesion, occasionally accompanied by coronary artery stenosis and vertebrobasilar insufficiency [1]. In contrast, Chinese medicine (CM) classifies CHD as a type of chest paralysis and heart pain, for which effective diagnosis and treatment are available [2].

CM treatment is based primarily on syndrome differentiation and physiology and pathology of Zang-fu organs and meridians. In CM, a symptom represents an observable indicator of abnormality, while a syndrome is the disease state manifested by symptoms. The connections between symptoms and syndromes in CM are not clearly defined. Therefore, it is necessary to delineate different relationships between symptoms and syndromes and explain the diagnosis results in comprehensible terms [3].

Machine learning builds empirical models from data for analysis and forecasting, and has recently been applied to CM data analysis. Huang and Gao [4] reviewed several data mining classifiers used in CM. Li and Huang [5] used a fuzzy neural network for the analysis of CM ingredients. Wang et al. [6] used a decision tree method to generate prediction models for CM hepatitis data and liver cirrhosis data. Zhang et al. [7] combined factor and cluster analysis in the classification of CM syndromes related to post-hepatitic cirrhosis. Zhang et al. [8] used latent tree models to aid CM diagnosis. Knowledge discovery in databases (KDD) [9], rough sets [10], and expert systems [11] have also been applied to CM.

Most CM machine learning work does not consider the medical meaning of features or the links among them. However, CM data contain a large number of symptoms and syndromes that have specific medical meanings. Therefore, finding the links between features, including symptoms and syndromes, is also important in CM data analysis.

Conventional methods usually use only one numerical value to describe the relationship between two symptoms. In this study, we use a pair of characteristic values, the relative associated density (RAD), to describe the relative link between two symptoms. By analysing the characteristic value pairs, we searched for significant one-way links between symptoms and confirmed the links according to CM theory [12, 13]. The RAD method was also used to find one-way links among multiple syndromes in the clinical data.

Among the large number of symptoms in CM diagnosis data sets for a given disease, some symptoms may be redundant. Therefore, selecting major or relevant symptoms is crucial to the performance of machine learning. Wang et al. [14] used a support vector machine (SVM) to generalize symptom weights in CHD predictions. Liu et al. [15] used symptom frequency analysis to enhance modelling results in learning. Zhou et al. [16] developed a clinical reference information model (RIM) and a physical data model to manage various entities and relationships in CM clinical data. Principal component analysis (PCA) [17], partial least squares (PLS) [18], and maximum relevance and minimum redundancy (MRMR) [19] have been used to perform symptom selection to improve prediction accuracy.

The results from conventional primary symptom selection or reduction methods are difficult to interpret in CM. For instance, PCA reduces symptom dimensionality at the expense of medical meaning [20]. Although MRMR can predict fairly well using only a few major symptoms [21], the results are often inconsistent with basic CM theory [12, 13]. This study aims to use RAD to perform symptom selection and to evaluate whether the results can be better explained by CM theory [12, 13].

Methods

Data set of CHD in CM

A total of 555 clinical cases were collected from the cardiology departments of Longhua Hospital, Shuguang Hospital, Shanghai Renji Hospital, and Shanghai Hospital of CM from March 2007 to May 2008 to compile the CHD data set used in this study. The data set is available at http://levis.tongji.edu.cn/gzli/publication.htm [15].

Of the 555 cases, 265 patients (47.7%) were male (age, mean ± standard deviation: 65.15 ± 13.17) and 290 patients (52.3%) were female (age: 65.24 ± 13.82). The symptoms collected from inquiry diagnosis include 125 symptoms in eight dimensions (cold or warm, sweating, head, body, chest and abdomen, urine and stool, appetite, sleeping, mood, and gynecology). The differentiation diagnosis includes 15 syndromes, as described in Liu et al. [15].

To unify the results, the specific types and sensation details of some symptoms were combined, and some symptoms unique to females were deleted. The variables analyzed in this study include 63 symptoms and 10 syndromes. The 63 included symptoms are listed in Table 1. The 10 included syndromes were (I) heart-qi deficiency syndrome; (II) heart-yang deficiency syndrome; (III) heart-yin deficiency syndrome; (IV) heart-blood deficiency syndrome; (V) turbid phlegm syndrome; (VI) blood stasis syndrome; (VII) qi stagnation syndrome; (VIII) heart-fire hyperactivity syndrome; (IX) heart-kidney yang deficiency syndrome; and (X) cardiopulmonary-qi deficiency syndrome.

Table 1 The 63 symptoms in the data set

The RAD method

Probability and statistics

In the medical diagnosis of CHD, symptoms occur with different frequencies. For instance, chest tightness and dizziness are frequent symptoms, while sleepiness and diarrhea with undigested food are rare symptoms. In the data analysis, the first step is to distinguish between the frequent and the rare symptoms.

In terms of symptom probability, $Pf_i$ stands for the appearance probability of the $i$th symptom across all cases, which is defined as

$$Pf_i = \frac{\sum_{m=1}^{N} F_{im}}{N} \tag{1}$$

where $F_{im} = 1$ if the $i$th symptom appears in the $m$th case, or else $F_{im} = 0$. $N$ denotes the number of cases.

Similarly, $Pl_i$ stands for the appearance probability of the $i$th syndrome across all cases, which is defined as

$$Pl_i = \frac{\sum_{m=1}^{N} L_{im}}{N} \tag{2}$$

where $L_{im} = 1$ if the $i$th syndrome appears in the $m$th case, or else $L_{im} = 0$.
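As an illustration of equations (1) and (2), the following minimal sketch computes appearance probabilities from a 0/1 indicator matrix. The function name `appearance_probability` and the toy data are hypothetical and are not taken from the study's data set; the same function applies to symptom indicators and to syndrome indicators.

```python
def appearance_probability(indicator_rows):
    """Return, for each feature, the fraction of cases in which it appears.

    indicator_rows: list of cases, each a list of 0/1 flags (one per feature).
    """
    n_cases = len(indicator_rows)
    if n_cases == 0:
        raise ValueError("at least one case is required")
    n_features = len(indicator_rows[0])
    counts = [0] * n_features
    for row in indicator_rows:
        for i, flag in enumerate(row):
            counts[i] += flag  # adds 1 when the feature is present in this case
    return [count / n_cases for count in counts]


# Toy data (hypothetical): 4 cases x 3 symptoms.
toy_symptoms = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
pf = appearance_probability(toy_symptoms)
print(pf)
```

Sorting the returned list in descending or ascending order would give rankings of frequent and rare symptoms, respectively.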

Building the symptom-symptom interaction network

Equations (1) and (2) calculate the appearance probability of all symptoms and syndromes, but these values cannot reveal their potential connections. To find the connections, a symptom-symptom interaction (SSI) network was built in the same manner as is used for human social networks [21, 22].

When two different symptoms occur simultaneously in the same case, the indicator $G_{ijm} = 1$ denotes that symptom $F_i$ and symptom $F_j$ appear at the same time in the $m$th case, or else $G_{ijm} = 0$. $F_iF_j$ stands for the number of simultaneous occurrences of $F_i$ and $F_j$. Then, for $N$ cases,

$$F_iF_j = \sum_{m=1}^{N} G_{ijm} \tag{3}$$

which contains two types of information: the frequency of features and the relevancy of two features.
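The co-occurrence count in equation (3) can be sketched as follows. This is an illustrative implementation under the assumption that each case is stored as a list of 0/1 symptom flags; the function name `cooccurrence_counts` and the toy data are hypothetical.

```python
def cooccurrence_counts(rows):
    """Return a square matrix whose entry [i][j] counts the cases in which
    symptoms i and j are both present (the sum over cases of G_ijm).

    The diagonal entry [i][i] is the number of cases showing symptom i.
    """
    n_features = len(rows[0])
    counts = [[0] * n_features for _ in range(n_features)]
    for row in rows:
        present = [i for i, flag in enumerate(row) if flag]
        for i in present:
            for j in present:
                counts[i][j] += 1
    return counts


# Toy data (hypothetical): 4 cases x 3 symptoms.
toy_symptoms = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
ssi = cooccurrence_counts(toy_symptoms)
print(ssi)
```

The matrix is symmetric, so a single count cannot express a directed (one-way) relationship between two symptoms.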

Relative associated density

Equation (3) is largely determined by the frequency of symptoms. In other words, frequent relationships between symptoms are obvious, while less frequent relationships are hard to detect; the difference can be more than 300-fold. Therefore, this study used RAD, which uses conditional probability to measure the relationships between symptoms and syndromes.

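The term $C(F_i, F_j)$ represents the RAD value of symptom $F_i$ associated with $F_j$, and $C(F_j, F_i)$ represents the RAD value of symptom $F_j$ associated with $F_i$. Accordingly,

$$C(F_i, F_j) = \frac{F_iF_j}{\sum_{m=1}^{N} F_{im}} \tag{4a}$$

$$C(F_j, F_i) = \frac{F_iF_j}{\sum_{m=1}^{N} F_{jm}} \tag{4b}$$

That is, $C(F_i, F_j)$ is the fraction of cases showing $F_i$ that also show $F_j$, so the two values of a pair generally differ.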
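Equations (4a) and (4b) can be sketched in code as below. This is a minimal illustration, not the authors' implementation; the function name `rad_matrix`, the choice to return 0.0 for a symptom that never appears, and the toy data are assumptions.

```python
def rad_matrix(rows):
    """Return rad[i][j] = C(F_i, F_j): the number of cases with both
    symptoms i and j divided by the number of cases with symptom i."""
    n_features = len(rows[0])
    joint = [[0] * n_features for _ in range(n_features)]
    for row in rows:
        present = [i for i, flag in enumerate(row) if flag]
        for i in present:
            for j in present:
                joint[i][j] += 1
    rad = [[0.0] * n_features for _ in range(n_features)]
    for i in range(n_features):
        total_i = joint[i][i]  # number of cases in which symptom i appears
        if total_i == 0:
            continue  # assumption: leave the row at 0.0 for an absent symptom
        for j in range(n_features):
            rad[i][j] = joint[i][j] / total_i
    return rad


# Toy data (hypothetical): 4 cases x 3 symptoms.
toy_symptoms = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
rad = rad_matrix(toy_symptoms)
print(rad[0][1], rad[1][0])
```

In the toy data, symptom 1 always appears together with symptom 0, but symptom 0 appears with symptom 1 in only one of its three cases, so the pair of values is asymmetric.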

Symptom selection with RAD

In the $m$th case, if symptom $F_i$ appears with syndrome $L_j$, $H_{ijm} = 1$; otherwise, $H_{ijm} = 0$. Then, for all $N$ cases,

$$F_iL_j = \sum_{m=1}^{N} H_{ijm} \tag{5}$$

RAD estimates the influence of the appearance probability on the interaction between a symptom and a syndrome. Equation (6) calculates the RAD value between symptoms and syndromes,

$$C(F_i, L_j) = \frac{F_iL_j}{\sum_{m=1}^{N} L_{jm}} \tag{6}$$

This kind of association could be recognized as the contribution of one symptom to the syndrome.
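A sketch of equations (5) and (6) is given below, assuming symptoms and syndromes are stored as parallel 0/1 matrices with one row per case. The function name `symptom_syndrome_rad`, the handling of a syndrome with no cases, and the toy data are hypothetical.

```python
def symptom_syndrome_rad(symptom_rows, syndrome_rows):
    """Return rad[i][j] = C(F_i, L_j): the number of cases with both
    symptom i and syndrome j divided by the number of cases with syndrome j."""
    n_symptoms = len(symptom_rows[0])
    n_syndromes = len(syndrome_rows[0])
    joint = [[0] * n_syndromes for _ in range(n_symptoms)]
    syndrome_counts = [0] * n_syndromes
    for s_row, l_row in zip(symptom_rows, syndrome_rows):
        for j, has_syndrome in enumerate(l_row):
            if not has_syndrome:
                continue
            syndrome_counts[j] += 1
            for i, has_symptom in enumerate(s_row):
                if has_symptom:
                    joint[i][j] += 1
    return [
        [
            joint[i][j] / syndrome_counts[j] if syndrome_counts[j] else 0.0
            for j in range(n_syndromes)
        ]
        for i in range(n_symptoms)
    ]


# Toy data (hypothetical): 4 cases, 3 symptoms, 2 syndromes.
toy_symptoms = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 0, 1]]
toy_syndromes = [[1, 0], [1, 1], [0, 1], [0, 0]]
contrib = symptom_syndrome_rad(toy_symptoms, toy_syndromes)
print(contrib)
```

Each column of the result ranks the symptoms by their contribution to one syndrome.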

Each syndrome was treated as a single label, and the corresponding symptoms were selected for each syndrome according to their RAD values. For each single-label prediction, the symptoms with low RAD values were removed one by one, and the prediction performance was calculated with SVM and KNN. The symptom subset that led to the best prediction performance was recorded as the result of symptom selection.
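The elimination procedure described above can be sketched as follows. This is an assumption-laden illustration: the source does not specify the evaluation protocol, so the classifier is abstracted behind a hypothetical `evaluate` callback (for example, a cross-validated G-means score from SVM or KNN), and the function name `select_by_rad` and the toy scorer are invented for the example.

```python
def select_by_rad(rad_to_label, evaluate):
    """Remove symptoms one by one in order of increasing RAD value and
    return the subset with the highest score.

    rad_to_label: RAD values C(F_i, L_j) of every symptom for one syndrome.
    evaluate: hypothetical callback mapping a list of kept symptom indices
              to a prediction score (higher is better).
    """
    order = sorted(range(len(rad_to_label)), key=lambda i: rad_to_label[i])
    kept = list(range(len(rad_to_label)))
    best_subset, best_score = list(kept), evaluate(kept)
    for i in order[:-1]:  # always keep at least one symptom
        kept.remove(i)
        score = evaluate(kept)
        if score > best_score:
            best_subset, best_score = list(kept), score
    return best_subset, best_score


def toy_score(kept):
    """Toy scorer (hypothetical): rewards keeping symptoms 0 and 2 and
    penalises subset size. A real scorer would train and test a classifier."""
    return 10 * len({0, 2} & set(kept)) - len(kept)


toy_rad = [0.9, 0.1, 0.6, 0.2]
subset, score = select_by_rad(toy_rad, toy_score)
print(subset, score)
```

Passing the scorer as a callback keeps the selection loop independent of the classifier, so the same loop can be run once with SVM and once with KNN.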

MRMR symptom selection was used for a comparison [19]. The idea of MRMR is to search the optimal subset by maximizing relevance while minimizing redundancy based on mutual information. To maintain consistency with the RAD method, we used SVM [23] and KNN [24] for classification.
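For orientation, a greedy MRMR selection for binary features can be sketched as below. This is a generic illustration using the difference form of the criterion (relevance minus mean redundancy); it is an assumption that this matches the variant in [19], and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(
        (c / n) * math.log((c * n) / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def mrmr(columns, label, k):
    """Greedily pick k feature indices, maximising relevance to the label
    minus mean redundancy with the features already selected."""
    relevance = [mutual_information(col, label) for col in columns]
    selected = [max(range(len(columns)), key=lambda i: relevance[i])]
    while len(selected) < min(k, len(columns)):
        best, best_score = None, float("-inf")
        for i in range(len(columns)):
            if i in selected:
                continue
            redundancy = sum(
                mutual_information(columns[i], columns[j]) for j in selected
            ) / len(selected)
            if relevance[i] - redundancy > best_score:
                best, best_score = i, relevance[i] - redundancy
        selected.append(best)
    return selected


# Toy data (hypothetical): feature 1 duplicates feature 0, while feature 2
# carries complementary information about the label.
toy_label = [1, 1, 1, 1, 0, 0, 0, 0]
toy_columns = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]
chosen = mrmr(toy_columns, toy_label, k=2)
print(chosen)
```

The duplicate feature is skipped because its redundancy with the first selected feature outweighs its relevance.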

To evaluate the prediction results, we calculated the true positive rate (TPR) and true negative rate (TNR) criteria: TPR = TP/(TP + FN), TNR = TN/(FP + TN), where TP is the number of true positives, TN is that of true negatives, FP is that of false positives, and FN is that of false negatives. The G-means criterion was used to describe the equilibrium of the positive and negative classes of the prediction results, where $\text{G-means} = (\mathrm{TPR} \times \mathrm{TNR})^{1/2}$.
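The three criteria can be computed as in the following sketch. The function name `tpr_tnr_gmeans`, the convention of returning 0.0 when a class is absent, and the toy labels are assumptions made for illustration.

```python
import math


def tpr_tnr_gmeans(y_true, y_pred):
    """Return (TPR, TNR, G-means) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Assumption: a rate is reported as 0.0 when its class has no cases.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr, math.sqrt(tpr * tnr)


# Toy labels (hypothetical): TP = 3, FN = 1, TN = 2, FP = 2.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
tpr, tnr, gmeans = tpr_tnr_gmeans(y_true, y_pred)
print(tpr, tnr, gmeans)
```

Because G-means is a geometric mean, it drops to zero whenever a classifier ignores one class entirely, which is why it is informative for imbalanced syndromes.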

Results and discussion

RAD performed better than MRMR in feature selection for machine learning and revealed CM relationships among the symptoms, among the syndromes, and between the symptoms and syndromes in a CHD data set. RAD analysis found one-way connections among the symptoms and the syndromes that are consistent with CM theory. RAD not only improved prediction accuracy but also enhanced interpretability.

Common and rare symptoms

We used equation (1) to determine the symptom frequency in the data set. The 20 most frequent symptoms are listed in Table 2, and Table 3 lists the 10 rarest symptoms in the data set.

Table 2 The most frequent symptoms and their appearance probability
Table 3 The 10 rarest symptoms and their frequency

SSI was calculated by equation (3). Figure 1 shows a network constructed from the SSI results, i.e., the frequency and relationship among the symptoms. Table 4 lists the important symptoms shown in Figure 1.

Figure 1 The network of SSI. The points denote the symptoms; solid lines connect symptom pairs with high SSI.

Table 4 Symptoms with high SSI values shown in Figure 1

CHD was identified as a kind of deficiency syndrome or excess syndrome. As shown in Tables 2 and 4, CHD was associated with kidney deficiency, improper diet, mental disturbance, cold pathogen invasion, and other factors. CHD occurred in the heart but was related to the liver, the kidney, and the spleen. CHD was also linked with heart-qi deficiency, heart-yang deficiency, heart-blood deficiency, and heart-yin deficiency. The imbalance of liver, kidney, and spleen was often accompanied by turbid phlegm syndrome, qi stagnation syndrome, and blood stasis syndrome. Among the 20 most frequent symptoms, chest distress, hard breath/dyspnoea/suffocation, palpitation, and chest pain were found to be the locating symptoms of the heart syndrome patterns, consistent with modern clinical practice for CHD in CM. Other symptoms among the top 20 were also basic factors in the diagnosis of CM heart system diseases [12, 13].

Table 3 lists the 10 rarest symptoms and their probabilities. Among them, the symptoms of the heart syndrome patterns were hunger without desire to eat and water-like stool. This result was also consistent with CM theory [12, 13].

Analysis using the RAD method

RAD analysis of the SSI networks was used to determine the connections between symptoms and to identify major symptoms in CHD.

Equations (4a) and (4b) were used to determine the RAD values of SSI, as shown in Table 5.

Table 5 Some RAD values of SSI

The values P ij and P ji always appeared as a pair. Some symptom pairs showed obvious one-way connections. For example, only 11.4% of occurrences of the hard breath symptom were accompanied by the hot flash symptom, while 74.6% of occurrences of the hot flash symptom appeared with the hard breath symptom. This was a typical one-way connection between two symptoms.
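One way to screen value pairs for such one-way connections is sketched below. The source does not state a numerical criterion, so the thresholds `high` and `low`, the function name `one_way_links`, and the dictionary layout are assumptions; only the two percentages come from the example above.

```python
def one_way_links(rad, high=0.7, low=0.2):
    """Return tuples (a, b, forward, reverse) where C(a, b) >= high and
    C(b, a) <= low, i.e. a is usually accompanied by b but not vice versa.

    rad maps an ordered pair (a, b) to C(a, b), the fraction of cases with
    symptom a that also show symptom b. The thresholds are illustrative.
    """
    links = []
    for (a, b), forward in rad.items():
        reverse = rad.get((b, a))
        if reverse is None:
            continue  # the pair is incomplete, so no direction can be judged
        if forward >= high and reverse <= low:
            links.append((a, b, forward, reverse))
    return links


# Values from the hard breath / hot flash example in the text.
rad_values = {
    ("hard breath", "hot flash"): 0.114,
    ("hot flash", "hard breath"): 0.746,
}
links = one_way_links(rad_values)
print(links)
```

With these illustrative thresholds, the only link reported runs from hot flash to hard breath.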

Table 6 lists more connections between two symptoms. CM theory holds that chills occur with yang asthenia [12, 13] and that yin asthenia occurs with hot flashes and night sweats [12, 13]. The probabilities of chills appearing with hot flashes and with night sweats were low, at 0.087 and 0.061, respectively.

Table 6 One-way connections between symptoms

Table 6 also lists the RAD values of one-way connections between symptoms. For instance, the probability of chills being accompanied by body coldness was 71.5%, while the probability of body coldness being accompanied by chills was only 45.4%. These unequal results indicate that a patient suffering from chills would be more likely to have the body coldness symptom. By contrast, a patient suffering from body coldness would be less likely to have the chills symptom.

Furthermore, the locating symptom of chest distress occurred with qualitative and locating symptoms, such as paroxysmal night dyspnoea or orthopnoea, tastelessness and tediousness, nausea and vomiting, epigastric upset, deficient urine, dark urine, feverish palms and soles, intake of fluid failing to resolve thirst, and stool resembling sheep's droppings. When paroxysmal night dyspnoea or orthopnoea occurred, chest distress symptoms rarely appeared at the same time. Therefore, the one-way connections between the symptoms calculated by RAD explained the clinical results in CM.

For example, chills were a manifestation of yang asthenia, and when chills were present, distending pain in the hypochondrium and urine astringent pain appeared at the same time. However, the latter two symptoms did not represent chills; thus, they would not be accompanied by the symptom of chills. As another example, spontaneous sweating was an expression of qi asthenia and possibly appeared with distending pain in the hypochondrium, a sticky slimy sensation in the mouth, and dark urine, but not vice versa. From these two examples, we can see that the contribution of chills to yang asthenia was greater than that of spontaneous sweating to qi asthenia. We may also infer that distending pain in the hypochondrium, a sticky slimy sensation in the mouth, and dark urine are not typical features of qi asthenia and yang asthenia.

This association analysis of symptoms can show which symptoms are major features and identify possible relationships between symptoms and syndromes. This kind of analysis would provide an objective basis for the standardization of dialectical diagnosis.

Relationships among the syndromes

Table 7 shows the frequencies of all 10 syndromes calculated using equation (2). Table 8 lists the RAD values of the syndromes.

Table 7 Frequency values of 10 syndromes
Table 8 RAD values of syndromes

High correlation of the syndromes

Correlation analysis of the relationships between syndromes found that heart-qi insufficiency was highly correlated with heart-yin deficiency, heart-blood deficiency, turbid phlegm, blood stasis, qi stagnation, heart-fire hyperactivity, and cardiopulmonary qi deficiency. Likewise, blood stasis was highly correlated with heart-qi insufficiency, heart-yang insufficiency, heart-yin deficiency, heart-blood deficiency, turbid phlegm, qi stagnation, heart-kidney yang deficiency, and cardiopulmonary qi deficiency. The one-way RAD values of these syndromes were 0.80, 0.73, 0.75, 0.63, 0.87, 0.84, 0.63, and 0.86, respectively.

The high correlation of heart-qi insufficiency with heart-blood deficiency and heart-yin deficiency is consistent with the CM theory that a long period of heart-qi insufficiency would affect yin blood, causing fluid and blood deficiency and then qi yin deficiency [25]. Consistent with this theory, qi yin deficiency syndrome was common. The correlations of heart-qi insufficiency with turbid phlegm, blood stasis, qi stagnation, heart-fire hyperactivity, and cardiopulmonary qi deficiency were high and consistent with the characterization of CHD as a deficiency syndrome or excess syndrome [12, 13]. According to CM theory [12, 13], turbid phlegm, qi stagnation, and blood stasis are manifestations, while qi deficiency is the root cause that leads to heart vessel stagnation and then CHD. The high RAD values between turbid phlegm and cardiopulmonary qi deficiency can be explained by cardiopulmonary qi deficiency causing retention of water and dampness, and then sputum and more turbid phlegm [12, 13].

The high degree of correlation of blood stasis with heart-qi insufficiency, heart-yang insufficiency, heart-yin deficiency, heart-blood deficiency, turbid phlegm, qi stagnation, heart-kidney yang deficiency, and cardiopulmonary qi deficiency indicates that blood stasis appeared in these syndromes. According to CM theory [12, 13], the heart controls the blood vessels, and yang asthenia and qi asthenia may weaken the ability to drive the blood and thus cause blood stasis. Heart-fire hyperactivity, with heat scorching the blood and making it viscous, may cause blood stasis [12, 13]. Qi stagnation and poor blood flow may also cause blood stasis [12, 13]. Blood stasis may be the basic pathogenesis of CHD [26].

One-way connection of the syndromes

Table 8 shows some syndrome pairs with obvious one-way connections. For example, the RAD value of heart-qi insufficiency to insufficiency of the heart blood was 0.69, but the reversed RAD value was only 0.03. The RAD value of heart-qi insufficiency to heart-fire hyperactivity was 0.60, while the reversed RAD was 0.05. Table 9 summarizes the one-way connections of the syndrome pairs.

Table 9 One-way connections of the syndrome pairs

Taking heart-qi insufficiency and insufficiency of the heart blood as an example, CM theory [12, 13] emphasizes the interdependence between qi and blood, and long-term qi insufficiency will cause blood deficiency. However, insufficiency of the heart blood is not always accompanied by heart-qi insufficiency [12, 13]. In elderly patients, viscera function is weak, a pure sthenic syndrome is rare, and an asthenia with sthenia syndrome is more common. The RAD value of heart-qi insufficiency to heart-yin deficiency was 0.81, indicating that most CHD patients had qi asthenia together with yin asthenia. According to CM theory [12, 13], heart-fire hyperactivity is not directly related to heart-qi insufficiency or insufficiency of heart-yin. High one-way connections were found for blood stasis to cardiopulmonary qi deficiency, insufficiency of the heart blood, heart-fire hyperactivity, qi stagnation, and heart-kidney yang deficiency. However, the RAD values of the reverse connections were low, indicating that blood stasis was not the only cause of CHD.

Two-way connections of the syndromes

In addition to the one-way connections, two-way connections were also found. For example, the mutual RAD values of blood stasis and qi asthenia were 0.80 and 0.64, respectively, indicating that these two syndromes were highly correlated. CM theory [12, 13] holds that qi asthenia, followed by poor blood flow, would lead to blood stasis; conversely, long-term blood stasis may also cause qi asthenia. These two syndromes causally influence each other.

Relationships between symptoms and syndromes

According to CM theory [12, 13], a symptom is an expression of an internal syndrome, and a syndrome is essential to the appearance of symptoms. The RAD results (Table 10) calculated by equation (6) showed the one-way connections of symptoms to syndromes, which could be viewed as the contributions of symptoms to syndromes.

Table 10 Some RAD values between symptoms and syndromes

Figure 2 illustrates the data in Table 10, where the x-axis represents the 63 symptoms and the y-axis represents the 10 syndromes. Red rectangles represent high RAD values, and blue rectangles represent low RAD values. From Figure 2, the correlations between symptoms and syndromes were determined. As shown in Figure 2, the symptoms of palpitation, chest distress, short breath, weakness, and soreness and weakness of waist and knees were related to most of the syndromes. At the same time, chills and some other symptoms showed strong connections to particular syndromes, such as heart-kidney yang deficiency and yang asthenia.

Table 11 lists the symptoms and syndromes with high and low RAD values. In Table 11, chills showed a weak relation to most of the syndromes except for heart-yang insufficiency and heart-kidney yang deficiency, indicating that chills were closely related to these two syndromes. CM theory [12, 13] holds that weakness of yang and qi and lack of warmth may cause chills. The high RAD values of night sweats to insufficiency of heart-yin confirmed the CM theory that, in yin asthenia, yang cannot be restrained, so that deficiency fire disturbs the interior and causes night sweats [12, 13].

Constipation and insufficiency of the heart blood showed a strong connection. The Inner Canon of the Yellow Emperor points out that "people over 40 years old may lose half of the yin qi", and CM theory [12, 13] holds that insufficiency of the heart blood causes body fluid deficiency, which in turn causes insufficient lubrication of the colon, leading to constipation. The strong connection between nocturnal frequent micturition and heart-kidney yang deficiency can be explained by the lack of yang in the heart and kidney, which results in a decrease of the controlling and qi transformation functions, bladder retention failure, and then nocturnal frequent micturition.

Figure 2 The RAD values of symptoms to syndromes.

Table 11 Symptoms with relatively high and low RAD values to syndromes

The weak connections (Table 11) of chest pain and insufficiency of the heart blood, nocturnal frequent micturition and insufficiency of the heart blood, and edema and insufficiency of the heart blood were also significant and consistent with CM theory [12, 13].

Symptom selection with RAD

In this study, RAD was used for symptom selection, and then SVM [23] and K-nearest neighbours (KNN) [24] were used for the prediction.

Table 11 shows individual contributions of symptoms to the syndromes.

The predictions for syndromes 4, 8, 9, and 10 were not sound because these syndromes showed serious class imbalance in this data set; therefore, we omitted these results. For syndromes 1, 2, 3, 5, 6, and 7 (Table 12), the results were much better. Table 12 indicates that the prediction results with MRMR favoured either the positive class or the negative class. The maximum G-means values for these syndromes were obtained by the RAD method, indicating that RAD achieved a good balance between the positive class and the negative class. For some syndromes, the prediction results of RAD and MRMR were close when the TPR, TNR, and G-means values were all considered; in general, however, the results obtained by RAD were more reasonable.

Table 12 Statistical results of TPR, TNR, and G-means using SVM and KNN with RAD, with MRMR, or without symptom selection

Conclusions

The RAD method is effective for CM clinical data analysis, particularly for analysing relationships between symptoms in diagnosis and for generating compact and comprehensible symptom feature subsets.