A Belief Rule Based Expert System to Assess Tuberculosis under Uncertainty

The primary diagnosis of Tuberculosis (TB) is usually carried out by examining the various signs and symptoms of a patient. However, these signs and symptoms cannot be measured with 100 % certainty, since they are associated with various types of uncertainty such as vagueness, imprecision, randomness, ignorance and incompleteness. Consequently, the traditional primary diagnosis carried out by physicians on the basis of these signs and symptoms cannot deliver reliable results. Therefore, this article presents the design, development and application of a Belief Rule Based Expert System (BRBES) that can handle various types of uncertainty when diagnosing TB. The knowledge base of this system is constructed from experts' suggestions and from an analysis of historical data of TB patients. Experiments carried out with the data of 100 patients demonstrate that the results generated by the BRBES are more reliable than those of a human expert as well as those of a fuzzy rule based expert system.


Introduction
Tuberculosis (TB) is one of the life-threatening infectious diseases worldwide and is usually caused by the bacterium Mycobacterium tuberculosis. It is usually of two types, namely Pulmonary TB (PTB) and Extra-pulmonary TB (ETB). PTB affects the lungs, while ETB can attack any organ of the body except the brain, spine, heart, pancreas, skeletal striated muscle, and thyroid. The rate of occurrence of PTB is much higher than that of ETB [1][2][3].
In 2014, about 9.6 million people became ill with TB and 1.5 million died from it worldwide. Over 95 % of TB deaths occur in low and middle income countries. TB is considered one of the top five causes of death for women aged 15 to 44 [4]. In people with a healthy immune system, the TB bacteria are usually encapsulated in tiny capsules called tubercles. This stage is known as latent TB.
In this stage, the bacteria remain inactive and cannot spread to other people. In contrast, when a person's immune system becomes weak, it is unable to prevent the growth of the bacteria, and TB eventually becomes active in the body. Only active pulmonary TB is contagious: the bacteria spread into the air through the coughs and sneezes of affected people, and nearby people can easily be infected by inhaling them. ETB, however, is not contagious. TB can be fatal if it is not treated in time, causing serious complications in the lungs such as holes forming between nearby airways and breathing difficulty because of blocked airways. The primary signs and symptoms of TB are coughing for more than three weeks, coughing up blood, fatigue, unintentional weight loss, chest pain, prolonged fever, lack of appetite and night sweating [1][2][3].
A physician generally determines the suspicion of TB based on these signs and symptoms. Signs are measured by the physician, while symptoms are expressed by the patient [5,6]. Patients usually express their symptoms using linguistic terms such as high, medium and low, which are imprecise, ambiguous and vague. Therefore, these linguistic terms cannot express the level of a symptom with 100 % certainty, and hence they carry uncertainty due to imprecision, ambiguity and vagueness.
In some cases, patients may ignore the importance of coughing because they consider it to be related to other common diseases, which is an example of uncertainty due to ignorance. Sputum smear microscopy, which is a method to detect the presence of active TB, sometimes fails to detect it. This is an example of uncertainty due to incompleteness. A comprehensive survey has been carried out in consultation with the physicians of various TB hospitals located in Chittagong District of Bangladesh to identify the types of uncertainty associated with each of the signs and symptoms of TB, which are described in Table 1.
Since the traditional way of determining the suspicion of TB relies on physicians looking at the signs and symptoms, it does not consider the uncertainty described above, which jeopardizes the accuracy of TB detection. An expert system, which emulates the decision making process of a human being, can be considered an appropriate tool to address this uncertainty and to assess the suspicion of TB accurately. An expert system consists of two components, namely the knowledge base and the inference mechanism. Such an expert system should have a knowledge representation schema that can capture uncertain clinical knowledge. At the same time, the inference mechanism should have robust reasoning algorithms capable of handling the various types of uncertainty mentioned above. Therefore, this study develops a belief rule-based expert system, where a belief rule base is used to represent uncertain knowledge and evidential reasoning is used as the inference mechanism.
The remainder of the article is organized as follows. Section "Literature review" presents the literature review, "An overview of belief rule based expert system's methodology" gives an overview of the belief rule based expert system methodology, and "BRBES to diagnose TB" describes the design, architecture and knowledge-base construction along with an overview of the system interface. Section "Results and discussion" includes the results and discussion, while "Conclusion" concludes the article.

Literature review
Several studies have been conducted on the diagnosis of Tuberculosis (TB). Multilayer neural network (MLNN) structures were used to facilitate the analysis of tuberculosis [7]. Back-propagation with momentum and the Levenberg-Marquardt algorithm were used to train the MLNN. However, this approach provides no explicit relationship between the input and output data, which is necessary to ensure a transparent diagnosis of TB. Moreover, this approach does not consider the uncertainty associated with the signs and symptoms that form the input data.
Ensemble classifiers such as Bagging, AdaBoost and Random Forest were used to estimate the performance in detecting pulmonary tuberculosis [8]. A coughing sound detection algorithm [9] and lung auscultation software [10] were also used for TB diagnosis. These approaches considered most of the signs and symptoms that appear in Table 1 in diagnosing TB. However, they lack procedures to measure these signs and symptoms while taking account of the various uncertainties; rather, their measuring approach is Boolean in nature.
In [11], K-means clustering was combined with different classifiers, namely Support Vector Machine (SVM), Naïve Bayesian Classifier and K-Nearest Neighbor (K-NN), to support the prediction of tuberculosis [5].
Moreover, systems have been developed to accurately classify tumors and epilepsy using a Genetic Algorithm in combination with a multiclass classification method and a Support Vector Machine (SVM) [12,13].
However, SVM lacks transparency [14,15], since with this method the relationship between the signs and symptoms of the patient and the diagnosis cannot be established in an understandable way. In TB diagnosis, by contrast, the continuous generation of scenarios that relate the signs and symptoms to the treatment as well as to the diagnostic process is essential [16,17].
Fuzzy Cluster Means (FCM or Fuzzy C-Means) analysis [18,19] used a clustered data set to identify different types of TB. TUBERDIAG [20] is a fuzzy rule based system for detecting TB which combines positive and negative knowledge. Although fuzzy rule based expert systems can handle uncertainty due to vagueness, imprecision and ambiguity, they are not equipped to address uncertainty due to ignorance, incompleteness and randomness, which also arises with the signs and symptoms of TB as shown in Table 1. In addition, the above-mentioned approaches [7][8][9][10][11][14][15][16][17][18][19][20] lack appropriate inference procedures to address different types of uncertainty, such as vagueness, imprecision, ambiguity, ignorance, incompleteness and randomness, in a single integrated framework. Such an integrated framework plays an important role in making robust decisions to support the TB diagnostic decision making process.
Therefore, it is necessary to employ an appropriate knowledge representation schema to capture the uncertain knowledge associated with the signs and symptoms of TB. A Belief Rule Base (BRB) is widely used to represent this type of knowledge [21][22][23]. In addition, a BRB can express the explicit non-linear relationship between the input and output data, which is necessary to ensure a transparent diagnosis of TB. Evidential reasoning in combination with the BRB can be used as the inference mechanism of the expert system, with the capability to handle all types of uncertainty in an integrated framework [20][21][22][23][24][25][26][27][28]. Therefore, the following section presents the belief rule based expert system (BRBES) methodology, consisting of knowledge-base construction and the inference procedures.

An overview of belief rule based expert system's methodology
An expert system mainly consists of two components, namely the knowledge base and the inference procedures. In BRBESs, a belief rule base is used to represent the domain knowledge under uncertainty. The inference procedures of BRBESs consist of input transformation, rule activation weight calculation, belief update and rule aggregation using evidential reasoning [21]. Each of them is elaborated below.

Domain knowledge representation using BRB
A belief rule is an extension of the traditional IF-THEN rule in which a belief structure is used in the consequent part. The antecedent part of a belief rule consists of one or more antecedent attributes with associated referential values, while the consequent part consists of one consequent attribute. Knowledge representation parameters such as the rule weight and the antecedent attribute weights are also used. A belief rule base (BRB) consists of one or more belief rules. The IF-THEN rule is adopted because it is considered an appropriate mechanism to represent human knowledge [21].
In addition, non-linear causal relationship between the antecedent and consequent attributes can be established in a belief-rule-base. Equation 1 represents an example of belief rule.
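For reference, a belief rule of the kind described above is commonly written in the BRB literature [21] in the general form below. This is a sketch of the generic notation, not a reproduction of the article's Eq. 1; the attribute symbol $P_i$ and the way the weights $\theta_k$ and $\delta_{ki}$ are attached to the rule are assumptions.

```latex
R_k:\ \text{IF } (P_1 \text{ is } A_1^k) \wedge (P_2 \text{ is } A_2^k) \wedge \cdots \wedge (P_{T_k} \text{ is } A_{T_k}^k)
\ \text{THEN } \{(C_1, \beta_{k1}), (C_2, \beta_{k2}), \ldots, (C_N, \beta_{kN})\},
\qquad \beta_{ki} \ge 0, \quad \sum_{i=1}^{N} \beta_{ki} \le 1,
\\
\text{with rule weight } \theta_k \text{ and attribute weights } \delta_{k1}, \ldots, \delta_{kT_k}.
```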
In Eq. 1, each antecedent attribute takes one of its referential values $\{A_{i1}, \ldots, A_{iJ_i}\}$, where $i$ indexes the antecedent attributes. $C_1, C_2, \ldots, C_N$ are the referential values of the consequent attribute, while $\beta_{ki}$ is the belief degree to which $C_i$ is believed to be true. If $\sum_{i=1}^{N} \beta_{ki} = 1$, the belief rule is said to be complete; otherwise it is incomplete. In this way, BRB addresses the uncertainty resulting from incompleteness. Equation 2 represents an example of a belief rule from the domain of TB. Here, the consequent attribute is "TB Suspicion" with three referential values, "High", "Medium" and "Low", with degrees of belief 0.8, 0.2 and 0.0, respectively. The rule is said to be complete since the belief degrees sum to 1.
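The structure of a belief rule and the completeness check can be illustrated with a minimal Python sketch. The class name `BeliefRule`, its fields and the shortened antecedent part are hypothetical; only the consequent belief degrees (0.8, 0.2, 0.0) come from the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BeliefRule:
    """One belief rule: antecedent referential values plus a belief distribution."""
    antecedents: Dict[str, str]   # attribute name -> referential value
    beliefs: Dict[str, float]     # consequent referential value -> belief degree
    rule_weight: float = 1.0      # relative weight of the rule

    def belief_sum(self) -> float:
        return sum(self.beliefs.values())

    def is_complete(self, tol: float = 1e-9) -> bool:
        # Complete when the belief degrees sum to 1; a smaller sum leaves
        # part of the belief unassigned (incompleteness).
        return abs(self.belief_sum() - 1.0) <= tol


# Consequent beliefs as quoted in the text; the antecedent part is hypothetical.
rule = BeliefRule(
    antecedents={"A1": "High", "A2": "High"},
    beliefs={"High": 0.8, "Medium": 0.2, "Low": 0.0},
)
# A hypothetical incomplete rule whose belief degrees sum to less than 1.
incomplete = BeliefRule(
    antecedents={"A1": "Low"},
    beliefs={"High": 0.1, "Medium": 0.2, "Low": 0.5},
)
print(rule.is_complete(), incomplete.is_complete())
```

Keeping the belief degrees in a mapping rather than forcing them to sum to 1 is what lets an incomplete rule be stored as-is.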
Each antecedent attribute of this rule also has three referential values, and hence the total number of rules in this BRB, calculated by applying Eq. 3, is 6,561.
In Eq. 3, $L$ is the total number of rules in a sub rule base and $J_i$ is the number of referential values of the ith antecedent attribute.
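The rule count can be checked with a short Python sketch, assuming that every combination of referential values forms one rule, so that the total is the product of the $J_i$. The function name `rule_count` is hypothetical.

```python
from math import prod


def rule_count(referential_value_counts):
    """Number of rules when every combination of referential values forms one rule."""
    return prod(referential_value_counts)


# Eight antecedent attributes with three referential values each, as in the text.
total_rules = rule_count([3] * 8)
print(total_rules)  # 6561
```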

BRBESs inference procedures
Each of the inference procedures used in a BRBES is described below.

Input transformation
Input transformation distributes the input value of an antecedent attribute over its referential values by applying Eq. 4. The value distributed to each referential value of the antecedent attribute is called the matching degree or degree of belief. Note that once the matching degree has been calculated for a referential value of an antecedent attribute (AA), it is assigned only to those rules in which this referential value occurs for that AA.
As an example, Eq. 2 consists of eight antecedent attributes, each with three referential values. When the input data for one antecedent attribute is collected from the TB patient, its matching degrees to the corresponding referential values are calculated by applying Eq. 4. The matching degree for a referential value is then assigned to the corresponding antecedent attribute of each rule in which that value appears. In this way, input data for the eight antecedent attributes can be collected for a patient and the corresponding matching degrees assigned in a rule. Once a rule has been assigned its matching degrees, it is said to be active and is called a packet antecedent. By analogy, the active rule resides in RAM while the initial rule base resides in secondary memory.
In Eq. 4, $u(A_{i,h+1})$ and $u(A_{ih})$ are the grade values of $A_{i,h+1}$ and $A_{ih}$, respectively. Table 2 shows the matching degrees of an antecedent attribute value for its different referential values. For example, the input value of the antecedent attribute "Cough" is identified as "low", which is weighted as 10 % by the expert, and its matching degrees for the referential values (in this case "High", "Medium" and "Low") are obtained by using Eq. 4.
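The input transformation can be sketched in Python as a linear split of the input between the two adjacent referential values whose grade values bracket it, which is the form commonly used in the BRB literature. The grade values in `GRADES` (Low = 0.0, Medium = 0.5, High = 1.0) and the reading of 10 % as 0.10 are assumptions for illustration, so the result need not match Table 2.

```python
def matching_degrees(value, grades):
    """Distribute a numeric input over referential values.

    `grades` maps each referential value to its grade (utility) value.
    The input is shared linearly between the two adjacent referential
    values whose grades bracket it; all other values get 0.
    """
    ordered = sorted(grades.items(), key=lambda item: item[1])
    degrees = {name: 0.0 for name, _ in ordered}
    lowest, highest = ordered[0], ordered[-1]
    if value <= lowest[1]:
        degrees[lowest[0]] = 1.0
        return degrees
    if value >= highest[1]:
        degrees[highest[0]] = 1.0
        return degrees
    for (lo_name, lo_grade), (hi_name, hi_grade) in zip(ordered, ordered[1:]):
        if lo_grade <= value <= hi_grade:
            degrees[lo_name] = (hi_grade - value) / (hi_grade - lo_grade)
            degrees[hi_name] = 1.0 - degrees[lo_name]
            break
    return degrees


GRADES = {"Low": 0.0, "Medium": 0.5, "High": 1.0}  # assumed grade values
cough = matching_degrees(0.10, GRADES)
print(cough)
```

Under these assumed grade values, an input of 0.10 is shared between "Low" and "Medium" only, and the matching degrees of one attribute always sum to 1.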

Calculation of activation weights
Rule activation weight calculation consists of calculating the combined matching degree ($\alpha_k$) as well as the activation weight of each rule in the BRB. Equation 2 consists of eight antecedent attributes, and hence it is important to calculate their combined matching degree. This can be done with a multiplicative function (as shown in Eq. 5), which accounts for the inter-relationship between the attributes [21].
In Eq. 5, $0 \le \delta_{ki} \le 1$, where $\delta_{ki}$ ($i = 1, \ldots, T_k$) is the relative weight of the ith antecedent attribute in the kth belief rule and $T_k$ is the total number of antecedent attributes in the kth rule. Here, $\delta_{ki} = 0$ means that the attribute has zero importance and hence no impact on the aggregation process, while $\delta_{ki} = 1$ indicates a significant impact. Moreover, the overall belief increases as the individual beliefs of the antecedent attributes increase. The rule activation weight is calculated by using Eq. 6 [21].
In Eq. 6, $\theta_k$ is the relative weight of the kth rule and $L$ is the total number of belief rules in the belief rule base. When the activation weight $W_k$ of a rule is zero, the rule has no impact in the BRB, while when it is 1 its importance is high.
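The two steps above can be sketched in Python as follows. The sketch assumes the form commonly given in the BRB literature: the combined matching degree is a product of individual matching degrees raised to attribute weights normalised by the largest weight, and the activation weight is $\theta_k \alpha_k$ normalised over all rules. The function names and the two example rules are hypothetical, and at least one attribute weight is assumed to be positive.

```python
from math import prod


def combined_matching_degree(alphas, deltas):
    """Multiplicative aggregation of one rule's individual matching degrees.

    alphas: matching degree of each antecedent attribute in the rule
    deltas: attribute weights; each is normalised by the largest weight
            so that the exponents lie in [0, 1]
    """
    max_delta = max(deltas)
    return prod(a ** (d / max_delta) for a, d in zip(alphas, deltas))


def activation_weights(combined, thetas):
    """Normalise theta_k * alpha_k over all rules so that the weights sum to 1."""
    weighted = [t * a for t, a in zip(thetas, combined)]
    total = sum(weighted)
    if total == 0:
        return [0.0] * len(weighted)  # no rule is activated
    return [w / total for w in weighted]


# Two hypothetical rules with two antecedent attributes each.
alpha = [
    combined_matching_degree([0.8, 0.5], [1, 1]),
    combined_matching_degree([0.2, 0.5], [1, 1]),
]
weights = activation_weights(alpha, thetas=[1, 1])
print(alpha, weights)
```

Because the aggregation is multiplicative, a matching degree of zero on any attribute with non-zero weight sets the combined matching degree of that rule to zero, so the rule is not activated.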

Belief degree update
Equation 2 shows the presence of eight antecedent attributes, which are necessary to assess the suspicion of TB. However, there may be situations in which the data for some attributes are not available. In such situations, the initial belief degrees that were assigned to the referential values of the consequent attribute need to be updated by applying Eq. 7 [21]. This is an example of uncertainty due to ignorance.
In Eq. 7,
$$\tau(k, t) = \begin{cases} 1, & \text{if the } t\text{th antecedent attribute is used in defining } R_k \ (t = 1, \ldots, T_k) \\ 0, & \text{otherwise} \end{cases}$$
Here, $\bar{\beta}_{ki}$ is the original belief degree, while $\beta_{ki}$ is the updated belief degree. The original belief degree is updated whenever ignorance is noticed. For example, if the antecedent attribute "cough" is ignored, the initial belief degrees are updated as shown in Table 3.
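A minimal Python sketch of the belief update follows, assuming the form commonly used in the BRB literature: each original belief degree is scaled by the share of antecedent input that is actually available. The function name `update_beliefs` and the example input are hypothetical, and the output is not meant to reproduce Table 3.

```python
def update_beliefs(original_beliefs, matching_sums, used=None):
    """Scale a rule's belief degrees by the share of antecedent input available.

    original_beliefs: initial belief degrees of the rule's consequent
    matching_sums:    for each antecedent attribute, the sum of its matching
                      degrees (1.0 for a fully observed input, 0.0 if missing)
    used:             tau flags, 1 if the attribute is used in the rule
    """
    if used is None:
        used = [1] * len(matching_sums)
    factor = sum(t * s for t, s in zip(used, matching_sums)) / sum(used)
    return [b * factor for b in original_beliefs]


# Eight attributes with the first one ("cough") missing; the belief degrees
# are those of the example rule in the text.
updated = update_beliefs([0.8, 0.2, 0.0], [0.0] + [1.0] * 7)
print(updated)
```

With one of eight inputs missing, every belief degree is scaled by 7/8, so the updated degrees no longer sum to 1 and the shortfall represents the ignorance.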

Inference using evidential reasoning
To obtain the aggregated value of the referential values of the consequent attribute, based on the input data of the antecedent attributes, either the recursive or the analytical evidential reasoning (ER) algorithm, as shown in Eq. 8, can be applied [22]. The analytical ER algorithm is found to be effective in reducing the computational complexity of calculating the final belief degree $\beta_j$.
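The aggregation step can be sketched in Python using the analytical ER form commonly given in the ER literature. The sketch is written from that general form and does not reproduce the article's Eq. 8; the function name, the argument layout and the two example rules are hypothetical, and the activation weights are assumed to sum to 1.

```python
from math import prod


def evidential_reasoning(weights, beliefs):
    """Analytical ER aggregation of activated belief rules.

    weights: activation weight of each rule (assumed to sum to 1)
    beliefs: beliefs[k][j] is the belief degree of rule k in consequent value j
    Returns the aggregated belief degree for each consequent value.
    """
    n_values = len(beliefs[0])
    row_sums = [sum(row) for row in beliefs]
    # Product over rules of the mass not assigned to any single consequent value.
    unassigned = prod(1 - w * s for w, s in zip(weights, row_sums))
    # One product over rules for each consequent value j.
    per_value = [
        prod(w * row[j] + 1 - w * s for w, row, s in zip(weights, beliefs, row_sums))
        for j in range(n_values)
    ]
    mu = 1.0 / (sum(per_value) - (n_values - 1) * unassigned)
    denominator = 1 - mu * prod(1 - w for w in weights)
    return [mu * (p - unassigned) / denominator for p in per_value]


# Two hypothetical activated rules over (High, Medium, Low).
aggregated = evidential_reasoning(
    weights=[0.8, 0.2],
    beliefs=[[0.8, 0.2, 0.0], [0.0, 0.3, 0.7]],
)
print(aggregated)
```

When all activated rules are complete, the aggregated belief degrees sum to 1; with a single rule of activation weight 1, the output equals that rule's belief degrees.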
The final combined result or output generated by ER is represented by $\{(C_1, \beta_1), (C_2, \beta_2), \ldots, (C_N, \beta_N)\}$, where $\beta_j$ is the final belief degree attached to the jth referential value $C_j$ of the consequent attribute $C$, which is obtained after all activated rules in the BRB are combined by using ER. This output can be converted into a crisp numerical value (as shown in Eq. 9) by assigning a utility score to each referential value of the consequent attribute [20].
In Eq. 9, $H(A^*)$ is the expected score expressed as a numerical value and $u(C_j)$ is the utility score of the jth referential value.
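The conversion to a crisp value can be sketched in Python as a utility-weighted sum of the final belief degrees. The utility scores in `UTILITY` (High = 1.0, Medium = 0.5, Low = 0.0) are assumptions for illustration; the scores used by the authors are not given in the text.

```python
def expected_score(beliefs, utilities):
    """Weighted sum of utility scores, using the final belief degrees as weights."""
    return sum(utilities[name] * degree for name, degree in beliefs.items())


UTILITY = {"High": 1.0, "Medium": 0.5, "Low": 0.0}  # assumed utility scores
score = expected_score({"High": 0.8, "Medium": 0.2, "Low": 0.0}, UTILITY)
print(round(score, 4))  # 0.9
```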

BRBES to diagnose TB
This section presents the architecture along with implementation strategy of the belief rule-based expert system (BRBES) to diagnose Tuberculosis (TB). This is followed by the presentation of the knowledge-base construction as well as a description of the BRBES's interface.

Architecture and implementation
A system architecture describes how the components of a system are organized. It is also important to know the pattern of system organization, which is known as the architectural style. The BRBES presented in this article follows a three-layer architecture, consisting of web-based interface, application and database management layers, as shown in Fig. 1. Since the BRBES is to be used by physicians, patients and researchers at various hospitals of Bangladesh, especially in rural areas, a user-friendly web-based interface is necessary. Therefore, to ensure that these users can use the system at any time and at any place, a web-based interface has been developed for the BRBES by integrating various web technologies such as JavaScript, jQuery, HTML and CSS. The application layer, which facilitates inference and database access, has been developed using PHP because of its simplicity, its shorter development cycle and its suitability for online use. The database management layer, which holds the clinical data and the knowledge base, has been developed using MySQL, a relational database management system. MySQL is flexible and user friendly, and it ensures security as well as faster data access.

Knowledge base construction in BRB
The knowledge-base construction consists of developing a BRB tree by identifying the necessary antecedent and consequent attributes. Figure 2 shows the single-level BRB structure to diagnose TB. The leaf nodes represent the antecedent attributes, while the root node represents the consequent attribute. The eight antecedent attributes, each having three referential values, were identified and verified in consultation with physicians at various hospitals of Chittagong City of Bangladesh. The BRB consists of 6,561 rules, since it comprises eight antecedent attributes each with three referential values. The number of rules is calculated by using Eq. 3. A BRB can usually be established [20] by acquiring domain expert knowledge, by collecting historical data, by using earlier rules if they are available, or by developing rules in a random way without any prior knowledge.
In this study, the initial BRB has been constructed by acquiring the knowledge of physicians, who are the domain experts, as well as by using patients' historical data. Each rule of the BRB has been given a rule weight of 1, and each antecedent attribute's weight is also set to 1. An example of such a rule is illustrated in Eq. 2. Table 4 illustrates the initial BRB of the BRBES.
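The skeleton of such an initial rule base can be sketched in Python by enumerating every combination of referential values. The attribute labels A1 to A8, the dictionary layout and the function name are hypothetical, and the belief degrees are left empty because in the study they come from the physicians and the historical data.

```python
from itertools import product

ATTRIBUTES = [f"A{i}" for i in range(1, 9)]    # eight antecedent attributes
REFERENTIAL_VALUES = ("High", "Medium", "Low")


def build_initial_rule_base():
    """Enumerate every combination of referential values as one rule."""
    rules = []
    for combo in product(REFERENTIAL_VALUES, repeat=len(ATTRIBUTES)):
        rules.append({
            "antecedents": dict(zip(ATTRIBUTES, combo)),
            "rule_weight": 1.0,                           # as stated in the text
            "attribute_weights": [1.0] * len(ATTRIBUTES),  # as stated in the text
            "beliefs": None,  # to be supplied by experts or historical data
        })
    return rules


rule_base = build_initial_rule_base()
print(len(rule_base))  # 6561
```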

BRB interface
An interface can be defined as a medium that enables users to interact with the system. Figure 3 illustrates the interface of the BRBES to diagnose TB, which allows the input value associated with each of the eight antecedent attributes to be acquired either from the patients or from the physicians. Figure 3 also illustrates the matching degrees associated with the antecedent attributes for the data of the first patient shown in Table 5. For example, the input value of the antecedent attribute A1 (coughing more than three weeks) is obtained as "98". This is acquired by asking the patient about the intensity level of A1, which is in the range of 0-100. This input value is then distributed over the three referential values of A1, which are "(High, 0.98)", "(Medium, 0.0)" and "(Low, 0.2)". This distributed value is called the matching degree and is obtained by using Eq. 4. The suspicion level of TB for this patient is obtained by using Eqs. 5, 6, 7 and 8. It is measured as the degree of belief associated with each of the referential values of the suspicion level of Tuberculosis and found to be (High, 90.72), (Medium, 0.06) and (Low, 0.22). These fuzzy values are converted into a crisp value by using Eq. 9, giving 95.25 %.

Results and discussion
To demonstrate the applicability and reliability of the BRBES to diagnose tuberculosis (TB), the system was fed with input data received from the TB patients of a hospital located in Chittagong City of Bangladesh (Fig. 4).
The input data, associated with the eight attributes, of 77 TB patients have been collected. The real laboratory test results of those patients were also collected and considered as the benchmark data. If the laboratory test result for a patient is positive for TB, the benchmark is set to "1"; otherwise it is "0" for that patient. Table 5 illustrates the collected data on the eight attributes of a patient (columns 2 through 9) along with the level of TB suspicion generated by the BRBES (column 11). Table 5 also illustrates the expert opinion on the level of TB suspicion. Table 5 presents the data of only 15 of the 77 patients. Receiver Operating Characteristic (ROC) curves are usually used to analyze the accuracy and reliability of diagnostic tests having ordinal or continuous results. Therefore, this method was used to measure the reliability of the BRBES in comparison with expert opinion and a fuzzy rule based expert system (FRBES). The accuracy of a system in assessing the level of suspicion of TB can be measured by calculating the Area Under Curve (AUC) [29,30]. For example, if the AUC is found to be one for the results generated by the BRBES, then the system can be considered 100 % reliable. Figure 5 illustrates the ROC curves plotted for both the BRBES and the Expert Opinion. The ROC curve plotted by the blue line in this figure is associated with the results generated by the BRBES, with an AUC of 0.910 (95 % confidence interval 0.848-0.972).
The ROC curve plotted by the green line in Fig. 5 is obtained from the physician's opinion, and its AUC is 0.710 (95 % confidence interval 0.587-0.815). Figure 6 illustrates the ROC curves for the BRBES, the FRBES and the Expert Opinion.
The ROC curve plotted by the red line in Fig. 6 is obtained from the FRBES, and its AUC is 0.777 (95 % confidence interval 0.680-0.873). The AUC values in Fig. 6 for both the BRBES and the Expert Opinion are the same as those in Fig. 5. Table 6 summarizes the above results associated with the BRBES, the FRBES and the Expert Opinion. From Figs. 5 and 6 as well as from Table 6, it can be observed that the AUC of the Expert Opinion is much lower than that of both the BRBES and the FRBES.
The reason for this is that, during interviews and conversations with the physicians, it was observed that they are not aware of the uncertainty issues related to the signs and symptoms of TB. Therefore, the reliability of their assessment of the level of TB suspicion is much lower than that of the BRBES and the FRBES. From Table 6 as well as from Figs. 5 and 6, it can also be observed that the AUC of the BRBES is much larger than that of the FRBES. The reason for this is that the fuzzy rule based expert system (FRBES) only considers uncertainty due to vagueness, imprecision and ambiguity, whereas the BRBES also covers uncertainty due to ignorance, incompleteness and randomness. Further, the inference procedures of the FRBES, which use either the Mamdani or the T-S method, are not equipped to process uncertainty during the reasoning process. In contrast, the BRBES uses evidential reasoning as its inference engine, which is equipped to handle the types of uncertainty mentioned before.
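The AUC used in this comparison can be computed directly from benchmark labels and suspicion scores. The Python sketch below uses the pairwise (Mann-Whitney) form, in which the AUC is the fraction of positive-negative pairs where the positive case receives the higher score, with ties counted as one half. The function name and the toy data are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for a positive benchmark result, 0 for a negative one
    scores: the suspicion level assigned to each patient
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, not taken from the study.
benchmark = [1, 1, 0, 0]
suspicion = [0.9, 0.4, 0.5, 0.1]
print(auc(benchmark, suspicion))  # 0.75
```

An AUC of 1 means every positive case is scored above every negative case, while 0.5 corresponds to a ranking no better than chance.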

Conclusion
This article described the design, implementation and application of a Belief Rule Based Expert System (BRBES) that measures the level of suspicion of TB by taking account of its various signs and symptoms. The system allows the relationship between the signs and symptoms and the level of suspicion of TB of a patient to be understood in an explicit and transparent way. This allows the identification of the signs and symptoms that are responsible for increasing the suspicion level of TB of a patient. In this way, various scenarios that take account of the signs and symptoms of a patient from various perspectives can be examined by using the BRBES. The physicians will eventually be able to select the appropriate medicine for a patient. In addition, all types of uncertainty, such as vagueness, imprecision, ambiguity, ignorance and incompleteness, can be handled in an integrated framework, which makes the system more robust, as is evident from the comparison with the results of both the manual system and the fuzzy rule based expert system in Table 6. Therefore, the BRBES can provide a decision support platform to physicians and would serve as a savior by offering primary health care to people with reduced time and low diagnosis cost. In future, the BRBES will be extended to support optimal learning by training the knowledge representation parameters such as rule weights, attribute weights and belief degrees.
Funding This study was funded by Swedish Research Council under Grant 2014-4251.
Ethical approval All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent Informed consent was obtained from all individual participants included in the study.