A dialogue-based approach for dealing with uncertain and conflicting information in medical diagnosis

In this paper, we propose a multi-agent framework to deal with situations involving uncertain or inconsistent information located in a distributed environment which cannot be combined into a single knowledge base. To this end, we introduce an inquiry dialogue approach based on a combination of possibilistic logic and a formal argumentation-based theory, where possibilistic logic is used to capture uncertain information, and the argumentation-based approach is used to deal with inconsistent knowledge in a distributed environment. We also modify the framework of earlier work, so that the system is not only easier to implement but also more suitable for educational purposes. The suggested approach is implemented in a clinical decision-support system in the domain of dementia diagnosis. The approach allows the physician to suggest a hypothetical diagnosis in a patient case, which is verified through the dialogue if sufficient patient information is present. If not, the user is informed about the missing information and potential inconsistencies in the information as a way to provide support for continuing medical education. The approach is presented, discussed, and applied to one scenario. The results contribute to the theory and application of inquiry dialogues in situations where the data are uncertain and inconsistent.


Introduction
Clinical decision-support systems (CDSSs) use Artificial Intelligence (AI) to help doctors reach diagnostic decisions and are playing an increasingly important role in clinical practice [29]. Knowledge bases are key elements in CDSSs. As the decisive factor for the precision in the decision-making process, they constitute the basis for the system. In practice, however, there are situations where two or more knowledge bases cannot be merged or efficiently combined due to security, privacy, or other concerns, but the knowledge stored therein is needed in the decision-making process. In such cases, a multi-agent system (MAS) [8] is a natural choice for a CDSS solution with practical usefulness. Moreover, a MAS can illustrate different viewpoints in a clinical teamwork situation, e.g., when a primary care practitioner treats a patient with progressing dementia without having much experience in this particular disease domain and may need to consult an expert physician during the diagnostic process. In sparsely populated areas, it is also common that experts may be located at specialist hospitals a long distance from the primary care center. In our work, we take this perspective a step further by building a decision-support system based on international consensus on diagnostic criteria [22], which may act as an expert agent in a MAS that can be consulted through the CDSS. We also choose dialogue-based systems to simulate the dialogue that a clinician could have with an expert physician in a patient case for educational purposes.
Walton and Krabbe defined several types of agent dialogues [33,41]. Among them, the inquiry dialogue allows agents to collaborate in order to find the best solution and new knowledge. However, the value of inquiry dialogue-based MASs in dealing with situations where knowledge bases could not be easily put together has only been explored to a limited extent. For instance, Black and Hunter describe an inquiry dialogue approach in [2]. However, to the best of our knowledge, their theories have not been implemented in practical applications. Actually, some of their theories are not straightforward to realize in applications for clinical situations.
Against this background, we have extended Black and Hunter's theoretical dialogue approach [2] in a way that allows for a significantly simplified implementation. The modification mainly concerns the structure. In Black and Hunter's structure, a top-level warrant inquiry (wi) dialogue consists of several argument inquiry (ai) dialogues at different hierarchical levels, whereas in our structure, wi dialogues occur not only at the top level but also at all sublevels. The modified structure allows agents to reach partial conclusions within the nested dialogues, so that the strongest arguments are aggregated and serve the argument evaluation that decides the major topic. As a result, the structure of our system is clearer and the system is more time efficient, which is also in line with how humans reason. Details for generating wi and ai dialogues are provided, and a strategy for dealing with endless loops is given.
Moreover, since the agents have different roles and knowledge bases that include rules and behaviors, the data to be fused are often inconsistent and uncertain. In our earlier research [21,43], inquiry dialogues were explored, which were based on defeasible logic and integrated preference levels. The purpose was to integrate different preferences regarding knowledge sources (e.g., different potentially conflicting clinical practice guidelines). However, the ability of defeasible logic programming to deal with explicit uncertainty and fuzzy knowledge which often exist in the medical domain is fundamentally limited [4]. Thus, the approach presented in this article is built on the formal argumentation for dealing with inconsistent information introduced in [43]. We extend our earlier research by integrating possibilistic logic to capture uncertain information and degrees of confidence in sources of knowledge [6]. The reason for applying possibilistic logic in the clinical dialogue between agents is to extend defeasible logic programming capabilities for qualitative reasoning by incorporating the treatment of possibilistic uncertainty and vague knowledge [4]. It has applicability in specific individual patient cases. Reasoning with probabilities that require statistical information presents problems since the statistical distributions provided by evidence-based medicine (EBM) are not coherent; there are overlapping conditions [13], and the individual case may not be a standard case. In contrast, reasoning with possibilities gives a different starting point, where if knowing nothing at all, all potential diagnoses are possible, which may help the novice clinician to not jump to conclusions too soon [34]. 
This approach also follows the terminology applied in some diagnostic criteria, where the medical community has tried to translate the uncertainty in EBM-based statistical information into formulations useful in clinical practice, using terms like possible, probable, and unlikely (e.g., [28]). It allows arguments to be evaluated based on the reliability of the source of generic knowledge (e.g., clinical diagnostic criteria) or the source of patient information (e.g., the patient vs. a relative).
Hence, using the methods we develop based on possibilistic logic and argumentation theory, traditionally difficult situations where the data are uncertain and inconsistent can now be properly dealt with transparently. This approach provides transparency so the clinician can follow the reasoning and decision-making process and make more well-founded medical decisions, and using the system will also provide continued medical education during everyday clinical practice. Consequently, our results contribute significantly to the research field of knowledge-based systems and practical application in the medical domain.
We implemented our methodologies in the CDSS that we developed for dementia diagnosis and management. The results are illustrated with representative diagnostic situations from the dementia domain. The domain knowledge related to dementia is a good example of a medical domain where uncertain and conflicting information is present [19,22] and where interactive decision support could increase the physician's knowledge [20].
The paper is organised as follows. The next section presents a short background of possibilistic logic. In Section 3, the developed methods for dialogue generation are described. Section 4 describes the details of the realisation of the MAS embedded in the clinical decision-support system. Next, one scenario is used to show how the MAS is integrated into the clinical system and how it works. We compare our work with other related research in Section 6. The paper ends with conclusions and an outline of future work.

Background: possibilistic knowledge bases
Our approach is based on possibilistic logic [6]. We begin by presenting the syntax of our rule-based knowledge bases. In the logic we use, formulas are classical propositional formulas annotated with numbers between 0 and 1. Within possibilistic logic, we consider the framework of necessity formulas; hence, the number attached to a formula denotes its degree of necessity. A literal denotes either an atom $\alpha$ or its negation $\neg\alpha$. The contradiction of a literal $l$ is given by the function $c$, such that $c(l) \equiv \neg l$ and $c(\neg l) \equiv l$. The symbols, such as the conjunction $\wedge$, the implication $\rightarrow$, and the negation $\neg$, are the same as in propositional logic.
A rule is denoted by $\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \beta$, such that $\alpha_i$ ($1 \le i \le n$) and $\beta$ are literals. Each $\alpha_i$ ($1 \le i \le n$) is called a premise of the rule and $\beta$ is called the conclusion of the rule. Given a rule $r = \alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \beta$, $concl(r) = \beta$. When $n = 0$, a rule is denoted by $\rightarrow \alpha$ and called a fact.
A rule together with a number $p$ forms a possibilistic belief. Unlike in probability theory, $p$ is not a probability; it induces a certainty (or confidence) scale. This value is determined by the expert providing the knowledge base.
Definition 1. A possibilistic belief, denoted by $B$, is a tuple of the form $(\phi, p)$ where $\phi$ is a rule and $p \in (0, 1]$ is the lower bound of the belief in terms of necessity measures, which means the formula $\phi$ is certain at least to the level $p$. Given a belief $(\phi, p)$, if $\phi$ is a fact, we call this belief a state belief; otherwise, we call it a domain belief.
In this paper, unless otherwise stated, a possibilistic belief is called a belief since all beliefs in this paper are possibilistic. We refer to the rule's premise as the belief's premise and to the rule's conclusion as the belief's conclusion. Given a belief $B = (\phi, p)$, $N(\phi) \ge p$, where $N$ is a necessity measure modeling the possibly incomplete state of knowledge [6,30], and $B^* = \phi$. A pair $(l, p)$ is called a possibilistic literal, where $l$ is a literal and $p$ is its possibilistic value.
A belief base of an agent $x$, denoted by $\Sigma_x$, is a finite set of beliefs. The set of all rules in $\Sigma_x$ is denoted by $\Sigma_x^* = \{\phi \mid (\phi, p) \in \Sigma_x\}$. To project a set of rules from a belief base of a given agent $x$ regarding a particular conclusion, we define the concept of related belief base as follows:
Definition 2. The related belief base about literal $\alpha$ regarding agent $x$, denoted by $\Sigma_x^{\alpha}$, is defined as $\Sigma_x^{\alpha} = \{(\phi, p) \mid (\phi, p) \in \Sigma_x \text{ and } (concl(\phi) = \alpha \text{ or } concl(\phi) = \neg\alpha)\}$. We use the function $relatedBeliefBase_x(\alpha)$ to return $\Sigma_x^{\alpha}$ from $\Sigma_x$.
Following Ref. [6], the possibilistic inference rule, under the framework of necessity formulas, is $(\phi_1, p_1), (\phi_1 \rightarrow \phi_2, p_2) \vdash (\phi_2, \min(p_1, p_2))$. The necessity number of the conclusion is the minimum of the necessity numbers of all the participating formulas. We use $\vdash_{pl}$ to denote inference in possibilistic logic. From this rule, if we know that $(\phi, p_1)$ is true, we can infer that all beliefs $(\phi, p_2)$ with $p_2 \le p_1$ are true.
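To make the related belief base and the inference rule concrete, the following is a minimal Python sketch. The `Belief` class, the `~` prefix for negation, the function names, and the degrees 0.8 and 0.6 are assumptions made for illustration only; they are not taken from the implementation described later in the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Belief:
    """A possibilistic belief (premises -> conclusion, p); a fact has no premises."""
    premises: Tuple[str, ...]
    conclusion: str
    p: float


def negate(literal: str) -> str:
    """Contradiction of a literal; a leading '~' stands for the negation sign."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def related_belief_base(base: List[Belief], alpha: str) -> List[Belief]:
    """Beliefs of the base whose conclusion is alpha or its negation."""
    return [b for b in base if b.conclusion in (alpha, negate(alpha))]


def infer(rule: Belief, known: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Fire a rule once; the derived degree is the minimum of all degrees used."""
    if any(premise not in known for premise in rule.premises):
        return None  # some premise has not been derived yet
    return rule.conclusion, min([rule.p] + [known[q] for q in rule.premises])


# Hypothetical belief base; the degrees are illustrative, not from the paper.
base = [
    Belief((), "a", 1.0),
    Belief(("a", "b"), "f", 0.8),
    Belief(("a", "c"), "f", 0.6),
    Belief(("d",), "~f", 0.6),
]
print(related_belief_base(base, "f"))
print(infer(base[2], {"a": 1.0, "c": 0.7}))
```

In this sketch the rule concluding `f` from `a` and `c` yields the degree of its weakest ingredient, which mirrors the min-based inference rule above.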
An argument is usually defined as a set of propositions such that a set of premises supports a conclusion. Depending on the goal of a given dialogue, an argument can be used, for instance, to persuade someone of something or present reasons for accepting a conclusion.
Definition 3. An argument is a tuple of the form $\langle \Phi, (l, p) \rangle$ constructed from a belief base $\Sigma$, where $\Phi$ is a set of beliefs and $(l, p)$ is a possibilistic literal called the claim of the argument, so that the following conditions hold: 1. $\Phi \subseteq \Sigma$ and $\Phi \vdash_{pl} (l, p)$; 2. $\Phi^*$ is consistent; 3. there is no $\Phi_1^* \subset \Phi^*$ such that item 1 and item 2 are satisfied.
In this definition, item 2 describes an argument as being derived from a consistent set of beliefs. Consistency is understood in the standard way of classical logic, which means that one cannot infer both $a$ and $\neg a$ from the same logic theory. Therefore, no conflicting arguments can be derived from the same set. Item 3 states that the consistent set is minimal (i.e. no proper subset of this consistent set can derive the argument with the same fact or its contradiction).
Rebut and undercut are also known as counterarguments in the literature [1]. Beliefs in a belief base are sometimes inconsistent, leading to conflicts between arguments constructed from this belief base. In the following section, we introduce an argumentation-based approach to resolve conflicts and calculate the acceptability of a literal.

Modeling dialogues
In this section, we present our approach to generate dialogues and evaluate arguments between agents.

Dialogue representation
The inquiry dialogue, among other types of dialogues, was defined by Walton and Krabbe [41], with the purpose of collaboratively building new knowledge. The goal of an inquiry dialogue is to prove, or contradict and possibly falsify, the hypothesis in a process of collaborative reasoning. In our approach, there are always exactly two agents taking part in an inquiry dialogue. In the following sections, we use $x$ to represent one agent and $\bar{x}$ to represent the other. That is, if $x = 1$ then $\bar{x} = 2$ and vice versa. Each agent has its own belief base $\Sigma_x$. However, the agents cannot prove the truth of the hypothesis by themselves; hence, they need to collaborate to find and verify the evidence regarding the given hypothesis (topic).
To formalize our dialogue system, we follow the dialogue style introduced by Black and Hunter [2]. Two participating agents use moves to communicate with each other in our argumentation system. Three types of moves are allowed: open, assert, and close. An open move means an agent opens a new dialogue. An assert move means an agent believes a given belief is true. A close move indicates an agent wants to close the current dialogue; however, if another agent does not agree, this dialogue will not be closed.
In the same way as in [2], we use two kinds of inquiry dialogues in our framework: wi and ai dialogues. An ai dialogue generates an argument that can be used in a wi dialogue (i.e. if all the premises of a rule can be proven true, an argument is generated). A wi dialogue contains 0 to $n$ ai dialogues and generates new knowledge by comparing these arguments. The two types of dialogues are nested within each other.
A move $m$ is a tuple of the form $\langle agent, move\ type, dialogue\ type, topic \rangle$, where agent is the agent that makes this move; move type denotes the kind of move (open, assert or close); dialogue type means the type of the dialogue (wi or ai); and topic is the content of the move. If the move is an open/close wi move, the topic is a literal; if it is an open/close ai move, the topic is a domain belief; otherwise, the topic is a state belief. The topic of a dialogue is determined by the topic of the first move made in this dialogue, so the topic of a wi dialogue is a literal and the topic of an ai dialogue is a domain belief. Since we have two types of inquiry dialogues and three types of moves, there are a total of six move formats (Table 1). More details are listed below.
$\langle x, open/close, wi, \alpha \rangle$ means that agent $x$ opens/closes a wi dialogue, and the topic of the dialogue is $\alpha$.
$\langle x, open/close, ai, (\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \beta, p) \rangle$ means that agent $x$ opens/closes an ai dialogue, and the topic of the dialogue is $(\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \beta, p)$.
$\langle x, assert, wi, (\rightarrow \alpha, p) \rangle$ means that this move is within a wi dialogue whose topic is $\alpha$ or $\neg\alpha$, and agent $x$ asserts that $(\rightarrow \alpha, p)$ is true.
$\langle x, assert, ai, (\rightarrow \alpha, p) \rangle$ means that this move is within an ai dialogue, agent $x$ asserts that $(\rightarrow \alpha, p)$ is true, and $\alpha$ or $\neg\alpha$ is one of the premises of the ai topic. Let us observe that here $(\rightarrow \alpha, p)$ is taken not from the agent's belief base but from the result store (described in the next section in Definition 11), whose entries have already been proved to be true, false, or unknown by the two agents.
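The six move formats can be captured in a small data type that rejects ill-formed moves. The sketch below is only one possible encoding: the tuple layout `(premises, conclusion, p)` for beliefs, the class name `Move`, and the helper `is_belief` are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass
from typing import Any


def is_belief(topic: Any) -> bool:
    """A belief is encoded here as (premises, conclusion, p)."""
    return isinstance(topic, tuple) and len(topic) == 3


@dataclass(frozen=True)
class Move:
    agent: str
    move_type: str      # "open", "assert" or "close"
    dialogue_type: str  # "wi" or "ai"
    topic: Any          # a literal (str) or a belief tuple

    def __post_init__(self) -> None:
        if self.move_type not in ("open", "assert", "close"):
            raise ValueError("unknown move type")
        if self.dialogue_type not in ("wi", "ai"):
            raise ValueError("unknown dialogue type")
        if self.move_type == "assert":
            # assert moves carry a state belief (a fact, no premises)
            ok = is_belief(self.topic) and not self.topic[0]
        elif self.dialogue_type == "wi":
            # open/close wi moves carry a literal
            ok = isinstance(self.topic, str)
        else:
            # open/close ai moves carry a domain belief (at least one premise)
            ok = is_belief(self.topic) and bool(self.topic[0])
        if not ok:
            raise ValueError("topic does not match the move format")


# Illustrative moves; the degree 0.8 is a placeholder value.
m1 = Move("PA", "open", "wi", "f")
m2 = Move("DA", "open", "ai", (("a", "b"), "f", 0.8))
m3 = Move("PA", "assert", "wi", ((), "a", 1.0))
```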
To formalize our dialogue system, we use the definitions of dialogue, sub-dialogue and well-formed dialogue in [2]. Here, we only give a brief description. For detailed descriptions of these concepts, we refer to [2].
A dialogue, denoted by $D_r^t$ ($r, t \in \mathbb{N}$ and $r \le t$), is a sequence of moves $[m_r, \dots, m_t]$ with two agents participating, in which: (1) the first move of the dialogue is an open move, and (2) the agents take turns to make moves. A sub-dialogue is a sub-sequence of another dialogue. A well-formed dialogue is a dialogue that satisfies the following conditions: (1) the last two moves must be close moves made by the two agents successively, which means both agents agree to close the dialogue; (2) the dialogue only terminates once; and (3) all the sub-dialogues are also well-formed and terminate before their parent dialogue. We use the function $Topic(D_r^t)$ to return the topic of the dialogue $D_r^t$.
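The well-formedness conditions can be checked mechanically with a stack of open dialogues. The following Python sketch assumes that moves are encoded as plain tuples `(agent, move_type, dialogue_type, topic)`; the function name `is_well_formed` and this encoding are illustrative assumptions, and the formal definitions remain those of [2].

```python
from typing import Any, List, Tuple

MoveT = Tuple[str, str, str, Any]  # (agent, move type, dialogue type, topic)


def is_well_formed(moves: List[MoveT]) -> bool:
    """Check turn-taking, nesting, and agreed termination of a dialogue."""
    if not moves or moves[0][1] != "open":
        return False
    stack = []             # dialogues that are currently open, innermost last
    pending_close = False  # previous move was a first close of the innermost one
    last_agent = None
    for index, (agent, move_type, dialogue_type, topic) in enumerate(moves):
        if agent == last_agent:
            return False   # the agents must take turns
        if index > 0 and not stack:
            return False   # no move may follow the end of the top dialogue
        if move_type == "open":
            stack.append((dialogue_type, topic))
            pending_close = False
        elif move_type == "close":
            if (dialogue_type, topic) != stack[-1]:
                return False  # only the innermost dialogue may be closed
            if pending_close:
                stack.pop()   # second successive close: both agents agree
                pending_close = False
            else:
                pending_close = True
        else:
            pending_close = False  # an assert keeps the dialogue open
        last_agent = agent
    return not stack


# A wi dialogue in which one agent first tries to close and the other asserts.
dialogue = [
    ("PA", "open", "wi", "a"),
    ("DA", "close", "wi", "a"),
    ("PA", "assert", "wi", ((), "a", 1.0)),
    ("DA", "close", "wi", "a"),
    ("PA", "close", "wi", "a"),
]
print(is_well_formed(dialogue))
```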

Generating dialogues
We first define some notations which are needed to generate the dialogues and then provide specific protocols for generating the two different types of dialogues: wi and ai.
In a wi dialogue, each agent uses a Possible Beliefs Queue (PBQ) to store its related belief base (see Definition 2) for the topic; consequently, the agent can pick up the first belief from this queue when it needs to make a move.

Definition 6. A PBQ is a queue of beliefs that the agent can legally use for selecting the next move in the current wi dialogue. Let $D_r^t$ be the current dialogue and $I$ be the set of participants. For all $x \in I$, the PBQ is used by agent $x$ to select the next explicit move in a given wi dialogue. Each agent has its own PBQ, where state beliefs and domain beliefs are both stored. When agent $x$ opens a wi dialogue with topic $\alpha$, it updates its PBQ, and at the next move, agent $\bar{x}$ updates its PBQ. Within the wi dialogue, the agent retrieves and deletes the first belief in its PBQ (since the PBQ is a First-In-First-Out queue data structure) and uses this belief for its next move. If the belief is a state belief, the agent makes an assert wi move (see step 3.3.1 in Table 3); if it is a domain belief, the agent makes an open ai move (see step 3.3.2 in Table 3); if the queue is empty, it makes a close wi move (see step 3.2 in Table 3).
Example 1. The example comes from a medical practice scenario. There are two agents: the professional agent (PA) simulates a novice physician, and the domain agent (DA) simulates a medical domain expert. PA investigates a patient regarding possible Lewy body dementia. However, it does not have enough knowledge to reach a decision. Therefore, it consults DA, which has more experience in diagnosing Lewy body dementia. The belief bases associated with DA and PA are: …, $(a, 1)$, $(d, 1)\}$. Here $p_1$ and $p_2$ mean probable and possible, respectively, and probable is more credible than possible. The atoms represent the symptoms or the disease listed in Table 2.
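As an illustration of how a PBQ might be filled when a wi dialogue opens, the sketch below queues the beliefs of an agent that conclude the topic or its negation. The tuple encoding `(premises, conclusion, p)`, the `~` prefix for negation, the queueing order (that of the belief base), and the sample beliefs are assumptions for this example; they are not the belief bases of Example 1.

```python
from collections import deque
from typing import Deque, List, Tuple

Belief = Tuple[Tuple[str, ...], str, float]  # (premises, conclusion, p)


def negate(literal: str) -> str:
    """Contradiction of a literal; a leading '~' stands for the negation sign."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def update_pbq(belief_base: List[Belief], topic: str) -> Deque[Belief]:
    """Queue the beliefs that conclude the topic or its negation (FIFO order)."""
    return deque(b for b in belief_base if b[1] in (topic, negate(topic)))


# Hypothetical belief base of an expert-like agent; degrees are placeholders.
expert_base: List[Belief] = [
    (("a", "b"), "f", 0.8),
    ((), "e", 1.0),
    (("d",), "~f", 0.6),
]
pbq = update_pbq(expert_base, "f")
first = pbq.popleft()  # the belief used for the agent's next move
print(first, list(pbq))
```

A `deque` is used because beliefs are consumed strictly from the front, matching the First-In-First-Out behaviour stated in Definition 6.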
In the running example, if PA makes the move $\langle PA, open, wi, f \rangle$ at $t = 1$, the PBQ of each agent is as shown in Fig. 1.
When an agent opens an ai dialogue with the topic $(\Phi, p)$, a Query Store (QS) associated with this topic is created and shared between the agents. Within an ai dialogue, if an agent needs to make a move, it can consult the QS for the specific topic, retrieve the first premise, and then make an open wi move.
Definition 7. For the current ai dialogue with topic $(\Phi, p)$, a QS, denoted by $QS_{\Phi}^t$, is a queue with a finite number of literals, namely the premises of $\Phi$ that remain to be examined at time $t$. Both agents share the same QS, which only stores premises. The QS is used by the agents to select the next explicit move in a given ai dialogue. When an agent makes an open ai move, the premises of its topic rule are stored in the QS. Within this ai dialogue, if the QS is empty, the agent makes a close ai move (see step 3.2 in Table 4); otherwise, it makes an open wi or assert ai move. If the move $\langle x, open, wi, \alpha_k \rangle$ (see step 3.3.3 in Table 4) or $\langle x, assert, ai, (\rightarrow \alpha_k, p) \rangle$ (see step 3.3.1 in Table 4) is made, $\alpha_k$ is removed from the QS. If the move $\langle x, assert, ai, (\rightarrow \neg\alpha_k, p) \rangle$ is made (see step 3.3.2 in Table 4), the QS is reset to $\emptyset$ directly since this premise has already been proved to be false, and the ai dialogue can then be closed without consulting the QS anymore. Hence, the QS of a well-formed dialogue is empty at the last move of the dialogue. In the running example, the QS is created at $t = 2$ (see Fig. 1).
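The update behaviour of the QS can be sketched in a few lines of Python. The function name `update_qs`, the `~` prefix for negation, and the use of a `deque` are illustrative assumptions rather than part of the formal definition.

```python
from collections import deque
from typing import Deque


def negate(literal: str) -> str:
    """Contradiction of a literal; a leading '~' stands for the negation sign."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def update_qs(qs: Deque[str], literal: str) -> None:
    """Update the shared QS after an open wi or assert ai move about `literal`.

    If the literal is a pending premise, it is removed from the queue.
    If it contradicts a pending premise, the premise has been refuted,
    so the whole queue is emptied and the ai dialogue can be closed.
    """
    if literal in qs:
        qs.remove(literal)
    elif negate(literal) in qs:
        qs.clear()


# QS created for a hypothetical topic whose premises are a, b and c.
qs = deque(["a", "b", "c"])
update_qs(qs, "a")    # premise a is handled
after_first = list(qs)
update_qs(qs, "~b")   # premise b is refuted
print(after_first, list(qs))
```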
To manage committed data that becomes public to other agents, a Commitment Store (CS) is used, which is a set of possibilistic literals initiated as an empty set. To identify the state of the CS of each agent participating in a given dialogue $D_r^t$, $CS_x^t$ denotes the CS of agent $x$ at timepoint $t$ in the dialogue $D_r^t$. The updates of a CS, the outcome of an ai dialogue ($Outcome_{ai}$), and the outcome of a wi dialogue ($Outcome_{wi}$) are obtained recursively: to update the CS, we need $Outcome_{ai}$; to get $Outcome_{ai}$, we need to calculate $Outcome_{wi}$; and to calculate $Outcome_{wi}$, we need to know the CS. The update of the CS of each agent is done as follows ($Outcome_{ai}$ and $Outcome_{wi}$ are defined in Definition 9 and Definition 10, respectively).
The CS of each agent is initiated as an empty set. It is updated whenever the agent performs an assert wi move to assert a state belief or when an ai dialogue closes and a state belief is calculated according to $Outcome_{ai}$, which is defined in Definition 9. An important consequence of this update is that the information added to the CS is public to the other agents taking part in the given dialogue.
When an ai dialogue terminates, its outcome is calculated. If all the premises of its topic are considered to be true (i.e. $Outcome_{wi} = \langle T, p \rangle$ for each premise; $Outcome_{wi}$ is given in Definition 10), the outcome is a belief constructed with the rule's conclusion and a necessity value $p$; otherwise, the outcome is null.
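Definition 9. Let $D_r^t$ be a well-formed ai dialogue and $(\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \beta, p)$ be its topic. The outcome of an ai dialogue, $Outcome_{ai}(D_r^t)$, is a state belief $(\rightarrow \beta, p')$ with necessity value $p'$ if every premise $\alpha_i$ ($1 \le i \le n$) has been proven true, and null otherwise.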
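A possible reading of the ai outcome is sketched below in Python. It assumes that the necessity value of the derived state belief is the minimum of the rule's degree and the degrees of its premises, in line with the possibilistic inference rule of the background section; this combination, the dictionary used as result store, and the function name `outcome_ai` are assumptions of the sketch.

```python
from typing import Dict, Optional, Tuple

Belief = Tuple[Tuple[str, ...], str, float]   # (premises, conclusion, p)
Outcome = Tuple[str, Optional[float]]         # (T/F/U, necessity degree)


def outcome_ai(topic: Belief, results: Dict[str, Outcome]) -> Optional[Belief]:
    """Return the derived state belief, or None when a premise is not proven."""
    premises, conclusion, p = topic
    degrees = [p]
    for premise in premises:
        result, degree = results.get(premise, ("U", None))
        if result != "T":
            return None          # a premise is false or unknown
        degrees.append(degree)
    # Assumption: degrees are combined with min, as in the inference rule.
    return ((), conclusion, min(degrees))


# Hypothetical results of earlier wi dialogues; degrees are placeholders.
results = {"a": ("T", 1.0), "b": ("U", None), "c": ("T", 0.7)}
print(outcome_ai((("a", "b"), "f", 0.8), results))
print(outcome_ai((("a", "c"), "f", 0.6), results))
```

With these inputs the first rule produces no belief because `b` is unknown, whereas the second rule produces a state belief about `f`.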
Within a wi dialogue, several arguments defending or attacking the topic $\alpha$ may be generated. When this wi dialogue terminates, its outcome is calculated. The outcome is a tuple $\langle r, p \rangle$ where $r \in \{T, F, U\}$. If the defending arguments win, $r = T$, meaning $\alpha$ is true. If the attacking arguments win, $r = F$, meaning $\alpha$ is false. In both cases, $p$ is a necessity value calculated by Algorithm 1 or Algorithm 2. However, if the two sides are evenly matched, $r = U$, which means the result is unknown and $p$ is null.
Definition 10. Let $D_r^t$ be a well-formed wi dialogue and $\alpha$ be its topic. The outcome of a wi dialogue is the value of the function $Outcome_{wi}(D_r^t)$. To meet different users' requirements, we provide two alternative algorithms (Algorithm 1 and Algorithm 2) to obtain $Outcome_{wi}$. The following notations are used in the two algorithms: let $\Lambda$ be a set of possibilistic literals, $\alpha$ be a propositional atom and $p \in (0, 1]$, …
The main idea of Algorithm 1 is as follows: 1) classify the beliefs from the union of the two CSs into two sets, $\Lambda_d$ and $\Lambda_a$; 2) make a copy of the two sets so that all the arguments in a set can be shown to the user if this set wins in the end; 3) retrieve the GLB from each set and compare these two numbers; 4) the set with the larger GLB wins; 5) if they have the same GLB, count the beliefs whose possibilistic value equals the GLB in each set; 6) if the counts are equal, remove these beliefs from each set (this operation is usually called a cut in possibilistic logic), which yields two new sets; 7) compare the new sets, repeating until one set wins or both become empty.
Algorithm 1
Require: a well-formed wi dialogue $D_r^t$ with $\alpha$ as its topic
Ensure: $\langle r, p \rangle$
1: …
9: If $Cardinality(\Lambda_{dt}) > Cardinality(\Lambda_{at})$, return $\langle T, GLB(\Lambda_d) \rangle$.
10: If $Cardinality(\Lambda_{dt}) < Cardinality(\Lambda_{at})$, return $\langle F, GLB(\Lambda_a) \rangle$.
11: …
The main idea of Algorithm 2 is similar to that of Algorithm 1. However, it omits the cut and loop parts (i.e. steps 2, 11, and 12 of Algorithm 1 are omitted) so that only the arguments with the highest weight in both sets are compared. If no comparison can be made from these arguments, $\langle U, null \rangle$ is returned. The motivation for Algorithm 2 is grounded in the medical domain, where clinicians typically evaluate the strongest evidence that supports a conclusion.
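The comparison described for the two algorithms can be sketched in Python as follows. Several points are assumptions of this sketch rather than statements of the paper: the GLB of a set is read as its largest necessity degree, an empty set is given degree 0, the committed literals are passed as a list so that equal literals may be counted more than once, and the flag `cut` switches between the looping behaviour described for Algorithm 1 and the single comparison described for Algorithm 2.

```python
from typing import List, Optional, Tuple

PLiteral = Tuple[str, float]            # possibilistic literal (l, p)
Outcome = Tuple[str, Optional[float]]   # (T/F/U, necessity degree or None)


def negate(literal: str) -> str:
    """Contradiction of a literal; a leading '~' stands for the negation sign."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def outcome_wi(alpha: str, commitments: List[PLiteral], cut: bool = True) -> Outcome:
    """Compare defending and attacking literals taken from both CSs."""
    defend = [p for lit, p in commitments if lit == alpha]
    attack = [p for lit, p in commitments if lit == negate(alpha)]
    top_d = max(defend, default=0.0)    # degree reported if the defenders win
    top_a = max(attack, default=0.0)    # degree reported if the attackers win
    d, a = list(defend), list(attack)   # working copies shrunk by the cut
    while d or a:
        glb_d, glb_a = max(d, default=0.0), max(a, default=0.0)
        if glb_d != glb_a:
            return ("T", top_d) if glb_d > glb_a else ("F", top_a)
        count_d, count_a = d.count(glb_d), a.count(glb_a)
        if count_d != count_a:
            return ("T", top_d) if count_d > count_a else ("F", top_a)
        if not cut:
            break                       # single comparison only
        d = [p for p in d if p != glb_d]  # cut the tied strongest literals
        a = [p for p in a if p != glb_a]
    return ("U", None)


tied = [("f", 0.8), ("~f", 0.8), ("f", 0.5)]
print(outcome_wi("f", tied, cut=True), outcome_wi("f", tied, cut=False))
```

In the tied case the looping variant falls back on the remaining weaker defender, whereas the single-comparison variant reports the result as unknown.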
Algorithm 2
Require: a well-formed wi dialogue $D_r^t$ with $\alpha$ as its topic
Ensure: $\langle r, p \rangle$
1: …
It is common that the same atom or its negation is used as a premise in different rules. If the premise has already been proved before (i.e. a wi dialogue with this premise as topic has already terminated), the system should not prove it twice. We use a Result Store (RS) to store the intermediate results in order to avoid repetitive work.
Definition 11. An RS is a set of tuples of the form $\langle \alpha, Outcome_{wi}(D_r^t) \rangle$ where $D_r^t$ is a well-formed wi dialogue and $\alpha$ is its topic.
When a wi dialogue opens, a new tuple $\langle \alpha, \langle U, null \rangle \rangle$ is added to the RS, where $\alpha$ is the topic of the dialogue. When the dialogue closes, the $\langle U, null \rangle$ part is updated with $Outcome_{wi}(D_r^t)$.
The outcome of the wi dialogue with topic $\alpha$ is a tuple $\langle r, p \rangle$. The values $r$ and $p$ are returned by the functions $Result(\alpha)$ and $PL(\alpha)$, respectively, which read the tuple stored for $\alpha$ in the RS at time $t$. Once $Result(\alpha)^t$ is known, $Result(\neg\alpha)^t$ can also be inferred from it. If the wi dialogue with topic $\alpha$ has been opened but has not yet terminated, $Result(\alpha)^t$ and $PL(\alpha)^t$ are $U$ and null, respectively. This, however, is different from the case where the dialogue has not been opened at all. In the latter case, $Result(\alpha)^t$ and $PL(\alpha)^t$ both return null. In fact, when the wi dialogue opens, $\langle \alpha, \langle U, null \rangle \rangle$ is added to the RS as an indicator so that we can distinguish whether the dialogue has been opened or not.
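A result store with the lookup functions can be sketched as follows. The class and method names are hypothetical, and two behaviours are assumptions of this sketch: the result for a negated literal is obtained by swapping T and F while leaving U unchanged, and the degree reported for the negated literal is the stored degree.

```python
from typing import Dict, Optional, Tuple

Outcome = Tuple[str, Optional[float]]   # (T/F/U, necessity degree or None)


def negate(literal: str) -> str:
    """Contradiction of a literal; a leading '~' stands for the negation sign."""
    return literal[1:] if literal.startswith("~") else "~" + literal


class ResultStore:
    """Stores one outcome per wi topic; unopened topics have no entry."""

    def __init__(self) -> None:
        self._entries: Dict[str, Outcome] = {}

    def open(self, topic: str) -> None:
        """Mark the wi dialogue as opened: result U, degree null."""
        self._entries[topic] = ("U", None)

    def close(self, topic: str, outcome: Outcome) -> None:
        """Replace the placeholder with the outcome of the closed dialogue."""
        self._entries[topic] = outcome

    def _lookup(self, literal: str) -> Optional[Outcome]:
        if literal in self._entries:
            return self._entries[literal]
        if negate(literal) in self._entries:
            result, degree = self._entries[negate(literal)]
            # Assumption: T and F are swapped for the contradictory literal.
            return {"T": "F", "F": "T", "U": "U"}[result], degree
        return None

    def result(self, literal: str) -> Optional[str]:
        entry = self._lookup(literal)
        return None if entry is None else entry[0]

    def pl(self, literal: str) -> Optional[float]:
        entry = self._lookup(literal)
        return None if entry is None else entry[1]


rs = ResultStore()
before = rs.result("a")      # dialogue never opened
rs.open("a")
during = rs.result("a")      # opened, not yet terminated
rs.close("a", ("T", 1.0))
print(before, during, rs.result("a"), rs.result("~a"), rs.pl("a"))
```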
When $Result(\alpha)^t$ is $U$, two cases can be distinguished. The first is more common: the dialogue has already terminated before time $t$ and the result is $U$ (see step 3 in Algorithm 1). The second is special: the dialogue remains open at time $t$. In this case, if an ai dialogue with topic $(\alpha \rightarrow \beta)$ exists inside the wi dialogue with topic $\alpha$, an endless loop may occur if the agent now opens a wi dialogue with topic $\alpha$ inside this ai dialogue. To address this problem, we develop a fast end strategy, in which the agent makes an assert ai move in both cases.
Definition 12. The fast end strategy for an agent $x$ participating in an ai dialogue is that the ai dialogue terminates directly if one of the premises of the ai dialogue's topic is the topic of a wi dialogue that has not been closed yet. Specifically, $Result(\alpha)$ returns $U$ when the wi dialogue with topic $\alpha$ has not terminated. The agents then make close ai moves in turn in the following two steps, and the QS of this ai dialogue is reset to $\emptyset$.
In agent dialogue frameworks, protocols are usually presented to determine the agents' next moves [2,15,26,35]. In this paper, we regard a protocol as a function that, given a particular type of dialogue, returns a specific move according to the agent's belief base and the moves already made. This differs from the approaches described in [2,26], whose protocols only return a set of legal moves, so that additional computation is needed to select exactly one move. Using our protocols, exactly one move is selected at each time point.
If the current dialogue is a wi dialogue, the wi dialogue protocol (presented in Table 3) should be followed by each agent to determine its next move. When agent $x$ opens a wi dialogue at time $r$ (i.e. $m_r = \langle x, open, wi, \alpha \rangle$), agent $x$ updates its PBQ according to Definition 6. At the next time point (i.e. $r + 1$), $\bar{x}$ updates its PBQ. Then, within this wi dialogue, an agent makes its next move according to its PBQ. If the PBQ is empty, the agent makes a close wi move (see step 3.2 in Table 3); else if the first element in the PBQ is a state belief, the agent asserts this belief (see step 3.3.1 in Table 3); otherwise, the agent opens an ai dialogue since the element is a domain belief (see step 3.3.2 in Table 3).
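The move selection inside a wi dialogue can be written as a function that returns exactly one move. In the Python sketch below, moves and beliefs are plain tuples and the function name `next_wi_move` is hypothetical; the sketch covers only the choice of the next move and does not model the nested ai dialogue that an open ai move starts.

```python
from collections import deque
from typing import Any, Deque, Tuple

Belief = Tuple[Tuple[str, ...], str, float]   # (premises, conclusion, p)
MoveT = Tuple[str, str, str, Any]             # (agent, move type, dialogue type, topic)


def next_wi_move(agent: str, topic: str, pbq: Deque[Belief]) -> MoveT:
    """Select the single next move of `agent` in a wi dialogue about `topic`."""
    if not pbq:
        return (agent, "close", "wi", topic)     # nothing left to contribute
    belief = pbq.popleft()                       # first-in, first-out
    premises = belief[0]
    if not premises:
        return (agent, "assert", "wi", belief)   # state belief: assert it
    return (agent, "open", "ai", belief)         # domain belief: open an ai dialogue


# Hypothetical PBQ holding one state belief and one domain belief.
pbq = deque([((), "a", 1.0), (("a", "c"), "f", 0.6)])
moves = [next_wi_move("PA", "f", pbq) for _ in range(3)]
print(moves)
```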
Similarly, if the current dialogue is an ai dialogue, the ai dialogue protocol (presented in Table 4) should be followed by each agent. When agent $x$ opens an ai dialogue at time $r$ (i.e. $m_r = \langle x, open, ai, (\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \alpha, p) \rangle$), the QS is updated according to Definition 7. Within this ai dialogue, the agents make the next move according to the QS and the RS. If the QS is empty, the agent makes a close ai move (see step 3.2 in Table 4); otherwise, the next move is determined by the RS. If the premise $\alpha_k$ is not in the RS (i.e. the wi dialogue with topic $\alpha_k$ has never been opened), the agent makes an open wi move (see step 3.3.3 in Table 4); otherwise, it makes an assert ai move. There are two cases of the assert ai move. If $Result(\alpha_k) = T$, $\langle x, assert, ai, (\rightarrow \alpha_k, p) \rangle$ is performed (see step 3.3.1 in Table 4) and the ai dialogue can continue; otherwise, $\langle x, assert, ai, (\rightarrow \neg\alpha_k, p) \rangle$ is performed (see step 3.3.2 in Table 4) and the ai dialogue will be closed in the following two steps since the agents cannot prove the current premise.
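The move selection inside an ai dialogue, including the fast end behaviour for a premise whose wi dialogue is still open, can be sketched in the same style. The dictionary standing in for the RS, the `~` prefix for negation, and the function name `next_ai_move` are assumptions of this Python sketch; registering a newly opened wi dialogue in the RS is left to the wi dialogue and is not modelled here.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional, Tuple

Belief = Tuple[Tuple[str, ...], str, Optional[float]]
MoveT = Tuple[str, str, str, Any]
Outcome = Tuple[str, Optional[float]]


def negate(literal: str) -> str:
    """Contradiction of a literal; a leading '~' stands for the negation sign."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def next_ai_move(agent: str, topic: Belief, qs: Deque[str],
                 rs: Dict[str, Outcome]) -> MoveT:
    """Select the single next move of `agent` in an ai dialogue about `topic`."""
    if not qs:
        return (agent, "close", "ai", topic)
    premise = qs[0]
    if premise not in rs:
        qs.popleft()                      # the premise is now being examined
        return (agent, "open", "wi", premise)
    result, degree = rs[premise]
    if result == "T":
        qs.popleft()                      # premise proven, continue with the rest
        return (agent, "assert", "ai", ((), premise, degree))
    # Premise false, unknown, or still under discussion (fast end): the rule
    # cannot fire, so the QS is emptied and the dialogue closes next.
    qs.clear()
    return (agent, "assert", "ai", ((), negate(premise), degree))


# Hypothetical situation: a was proven earlier, c has never been examined.
rs = {"a": ("T", 1.0)}
topic = (("a", "c"), "f", 0.6)
qs = deque(topic[0])
trace = [next_ai_move(agent, topic, qs, rs) for agent in ("DA", "PA", "DA")]
print(trace)
```

The first move of the trace asserts the already proven premise instead of opening a new wi dialogue for it, which is the role the RS plays in avoiding repeated work.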
Example 3. Continuing the running example, the whole dialogue generated by PA and DA is shown in Fig. 1.
In this example, eight dialogues are generated. Inside the top wi dialogue $D_1^{31}$, there are three ai dialogues: $D_2^{12}$, $D_{13}^{20}$, and $D_{22}^{29}$. Inside each ai dialogue, some wi dialogues are also generated. To determine the outcome of an outer-layer dialogue, the inner layer should be determined first.
Table 4. The ai dialogue protocol (steps 1 to 3.2).
1. Agent $x$ starts an ai dialogue $D_r^t$ with the move $m_r = \langle x, open, ai, (\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \alpha, p) \rangle$ and let $t = r$.
2. The QS is updated according to Definition 7.
3. Loop:
3.1. Set $x$ = the opposite agent and $t = t + 1$.
3.2. If $QS^t(D_r^t)$ is empty, $m_t = \langle x, close, ai, (\alpha_1 \wedge \dots \wedge \alpha_n \rightarrow \alpha, p) \rangle$. If $m_{t-1}$ is also a close ai move with the same topic, the ai dialogue closes and goes to 5.
…
To diagnose whether the patient has $f$ or not, PA opens the wi dialogue with topic $f$. Meanwhile, $\langle f, \langle U, null \rangle \rangle$ is added to the RS. Then, PA and DA both update their PBQs. DA has two domain beliefs in its PBQ and opens an ai dialogue with the first one as its topic. The premises of the topic are added to the QS. Next, PA opens another wi dialogue with the first literal $a$ in the QS as its topic. Meanwhile, $\langle a, \langle U, null \rangle \rangle$ is added to the RS. DA has no beliefs regarding $a$. Therefore, it tries to close the dialogue. However, since PA has a state belief, it asserts $(\rightarrow a, 1)$, and the dialogue remains open. $(a, 1)$ is added to $CS_{PA}$. When the wi dialogue successfully terminates at $t = 7$, its outcome is calculated according to Algorithm 1 or 2. Since there is only one argument, $\langle \{(\rightarrow a, 1)\}, (a, 1) \rangle$, supporting the topic $a$, we have $Outcome_{wi}(D_3^7) = \langle T, 1 \rangle$. Meanwhile, the RS entry is updated to $\langle a, \langle T, 1 \rangle \rangle$. $Outcome_{wi}(D_8^{10}) = \langle U, null \rangle$ is obtained since no beliefs related to $b$ are found in either belief base. Therefore, $Outcome_{ai}(D_2^{12}) = null$ since one of the topic's premises cannot be proven to be true.
At $t = 13$, PA opens the ai dialogue with topic $(a \wedge c \rightarrow f, p_2)$. In the next step, DA makes the assert ai move $\langle DA, assert, ai, (\rightarrow a, 1) \rangle$ instead of opening a wi dialogue with $a$ because the wi dialogue with $a$ as the topic has already been opened before and

Implementation
The argumentation-based approach presented in Section 2 and Section 3 has been implemented and integrated into the application DMSS-W (Dementia Diagnosis and Management Support System - Web version) [22], which is a clinical decision-support system for diagnosing dementia diseases. DMSS-W utilizes data from the ACKTUS platform [23,24], which is a web-based tool for modeling medical knowledge into rules and claims in natural language. We used the Java Agent Development Framework (JADE) to develop the MAS and the Java language platform to integrate it into DMSS-W as one of its inference engines. In the MAS, DA and PA are defined as two agents.
In this section, we will explain how the whole system works, where the knowledge comes from, and how it is mapped as facts and rules and used by DA and PA. First, the architecture of the whole system will be given.

Architecture
The whole system can be divided into four layers: interface layer, web service layer, storage layer, and inference layer, as depicted in Fig. 2. It includes two systems, ACKTUS and DMSS-W; one web service provider; two ontology repositories, the domain repository and the actor repository; one kind of relational database, patientCase; and two kinds of agents, DA and PA. The architecture has two kinds of users: domain experts and regular professionals. We detail each of them in the following paragraphs.
Domain experts in a medical domain model the evidence-based domain knowledge into formal machine-interpretable representations using ACKTUS. The knowledge is stored in the domain repository, which is used by the DA. The domain repository is located on the same server as ACKTUS and DMSS-W. Consequently, DA has access to a large set of rules based upon clinical guidelines, which are consensus documents involving a large international community of domain experts [12,42].
A regular professional can also formulate his or her knowledge using the same structures; however, since this knowledge is not official and evidence-based, it is stored in the actor repository, which is used by PA. PA can also retrieve factual data (state beliefs, i.e., symptoms observed in the patient) from the PatientCaseDB. These factual data are entered by the user through the DMSS-W user interface, transferred via the web service, and stored in the PatientCaseDB.
PA initiates a dialogue, and DA responds. Subsequently, they generate a dialogue with nested sub-dialogues using the approach presented in Section 3. When the top dialogue terminates, a result about a hypothesis is reached. The dialogue with all the moves and the

Construction and Design of Knowledge Bases
Below we show how to interpret the knowledge as beliefs and use them in the reasoning process.

State beliefs
The domain experts model interaction objects (IOs) through ACKTUS and store them in the domain repository. The domain repository is used for elaborated knowledge such as symptom manifestations, syndromes and diseases, and the evaluated observations obtained from laboratory examinations. Each IO contains one or more scales with different scale values (see Fig. 3). The scale can be simple, with two or three possible values. For this type, if the scale value expresses certain knowledge (e.g., normal, affected), we use an atom, such as α, to represent affected and its opposite ¬α to represent normal, and we assign the possibilistic value 1. If the knowledge is uncertain, we map the data to two opposite possibilistic beliefs with a necessity value less than 0.5 (assigning this value is the user's task). We give a concrete example to demonstrate our method. The IO "Orientation to time" in Fig. 3 has a simple scale type with three values (normal/unknown/affected). We use OOT to represent "Orientation to time-affected" and ¬OOT to represent "Orientation to time-normal". Therefore, from the interface:
1. If "Orientation to time-normal" is clicked, the state belief (→ ¬OOT, 1) is generated;
2. If "Orientation to time-unknown" is clicked, both (→ OOT, p) and (→ ¬OOT, p) are generated, where 0 < p < 0.5;
3. If "Orientation to time-affected" is clicked, (→ OOT, 1) is generated.
The presence of a symptom can, in some cases, make it possible to choose an additional scale value, such as the severity level. Such successor values are treated as follows, again illustrated with a concrete example. The IO "Judgement" in Fig. 3 contains a reliability scale [normal, unknown, affected], and the scale value "affected" triggers the severity scale [not specified, mild, significant]. In this example, the atom Judge represents "Judgement-affected" and ¬Judge represents "Judgement-normal". Therefore:
1. If "Judgement-normal" is clicked, (→ ¬Judge, 1) is generated;
2. If "Judgement-unknown" is clicked, both (→ Judge, p) and (→ ¬Judge, p) are generated, where 0 < p < 0.5;
3. If "Judgement-affected" is clicked,
(a) if "not specified" is chosen, both (→ Judge, 1) and (→ Judge_n, p) are generated, where 0 < p < 0.5;
(b) if "mild" is chosen, both (→ Judge, 1) and (→ Judge_m, 1) are generated;
(c) if "significant" is chosen, both (→ Judge, 1) and (→ Judge_s, 1) are generated.
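The mapping from clicked scale values to state beliefs can be illustrated with a minimal Python sketch. It is not the DMSS-W implementation: the function name `state_beliefs`, the tuple encoding of a belief as (literal, necessity), the suffix convention `_n`/`_m`/`_s`, and the default value 0.3 for p are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

# A state belief (-> literal, p) is modelled as a (literal, necessity) pair.
StateBelief = Tuple[str, float]

# Hypothetical suffixes for the severity scale [not specified, mild, significant].
SEVERITY_SUFFIX = {"not specified": "_n", "mild": "_m", "significant": "_s"}


def state_beliefs(atom: str, value: str, p: float = 0.3,
                  severity: Optional[str] = None) -> List[StateBelief]:
    """Map one clicked scale value (plus optional severity) to state beliefs."""
    if not 0 < p < 0.5:
        raise ValueError("p must satisfy 0 < p < 0.5")
    if value == "normal":
        return [("¬" + atom, 1.0)]
    if value == "unknown":
        # Uncertain knowledge: two opposite beliefs with the same low necessity.
        return [(atom, p), ("¬" + atom, p)]
    if value == "affected":
        beliefs = [(atom, 1.0)]
        if severity is not None:
            suffix = SEVERITY_SUFFIX[severity]  # KeyError on an unknown severity
            # An unspecified severity is itself uncertain, so it receives p.
            weight = p if severity == "not specified" else 1.0
            beliefs.append((atom + suffix, weight))
        return beliefs
    raise ValueError("unknown scale value: " + value)


print(state_beliefs("OOT", "unknown"))
print(state_beliefs("Judge", "affected", severity="mild"))
```

In this sketch the value of p is passed in by the caller, which mirrors the statement that assigning it is the user's task.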
In addition to these state beliefs, domain beliefs are applied during the inference procedure.

Domain beliefs
An IO associated with one of its scale values composes a premise or a conclusion. Based on the premises and conclusion, a rule can be created. The domain expert who models the knowledge in ACKTUS selects a scale for a certain conclusion that mirrors the underlying medical guideline, and this scale is used for extracting the rule's possibilistic value. Typically, this scale contains the values present, absent, or unknown, where present represents pathology and absent represents a normal condition. The values of a scale can also represent a broader range of possibilities, such as [excluded, unlikely, possible-, possible, possible+, probable-, probable, probable+]. The rules are assigned their possibilistic values based on the scale and values the domain expert applies to symptoms and diseases.
In this way, we can map the knowledge base into the logic framework and use the logic-based method to conduct a reasoning process. The following is an example of the construction of a domain belief. The knowledge base contains a structure with the premise "Episodic memory is normal" and the conclusion "Alzheimer's disease is excluded". This information can be formalised as a domain belief (¬α → ¬β, excluded), where ¬α is the premise, ¬β is the conclusion, and excluded is the possibilistic value of the belief.

Arguments
An argument is constructed from a set of beliefs to deduce a conclusion. Its possibilistic value p is calculated based on the beliefs forming the argument. We continue the previous example. If "Episodic memory is normal" is proved to be true, an argument can be formalized as ⟨((→ ¬α, 1), (¬α → ¬β, excluded)), (¬β, excluded)⟩.
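As a rough illustration of how an argument could be assembled from a state belief and a domain belief, consider the following Python sketch. Two points are assumptions and not taken from the text: the value of the argument is computed as the minimum over the values of its beliefs (the usual weakest-link reading in possibilistic logic), and the qualitative value excluded is replaced by a hypothetical numeric value 0.8 so that the minimum is defined. The name `build_argument` and the tuple encoding are likewise illustrative.

```python
from typing import List, Optional, Tuple

# A belief is (premises, conclusion, value); a state belief has no premises.
Belief = Tuple[Tuple[str, ...], str, float]
Argument = Tuple[List[Belief], Tuple[str, float]]


def build_argument(facts: List[Belief], rule: Belief) -> Optional[Argument]:
    """Return (support, (conclusion, p)), or None if some premise has no fact."""
    premises, conclusion, _ = rule
    fact_for = {fact[1]: fact for fact in facts}
    support: List[Belief] = []
    for premise in premises:
        if premise not in fact_for:
            return None  # the rule cannot fire without this premise
        support.append(fact_for[premise])
    support.append(rule)
    # Assumption: weakest-link aggregation over all beliefs in the support.
    p = min(belief[2] for belief in support)
    return support, (conclusion, p)


fact = ((), "¬alpha", 1.0)              # (-> ¬alpha, 1)
rule = (("¬alpha",), "¬beta", 0.8)      # (¬alpha -> ¬beta, 0.8), 0.8 is hypothetical
print(build_argument([fact], rule))
```

With these inputs the support consists of the state belief followed by the domain belief, and the conclusion carries the smaller of the two values.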
Since the data in the knowledge bases can be uncertain and inconsistent, conflicting arguments can be deduced. Our strategy to resolve the conflict is based on Algorithm 1 or 2, depending on the user's preference.

Dealing with uncertain and inconsistent data via an inquiry dialogue
In this section, we show one scenario of how the MAS is integrated into the DMSS-W application and demonstrate how it works.
A physician diagnoses a potential dementia patient. The physician uses different methods (e.g., observation, asking the patient or the relatives, laboratory examination) to collect information about the patient's condition and enters the information into DMSS-W. A part of the system interface is shown in Fig. 4. The symptoms used in this scenario are the following:
- Status shows mild executive dysfunction (the red button in the upper right corner in Fig. 4). Based on this, the state beliefs (→ a, 1) and (→ a_m, 1) are retrieved.
The symptoms are stored in the PatientCaseDB and used as state beliefs by the PA in the reasoning process. The hypothesis "mild cognitive impairment (MCI) is present" is the topic that the physician selects for the dialogue between PA and DA.
The DA, representing an expert, has a large set of domain beliefs in its belief base, which are related to the topic of the dialogue and retrieved from the domain repository. It should be noted that these domain beliefs are retrieved from different, sometimes conflicting sources (in this case, two different guidelines for diagnosis). For readability, only a subset of the domain beliefs is applied in our examples. We apply five domain beliefs that are associated with two different, conflicting diagnostic criteria for MCI. These domain beliefs have a conclusion that leads to a hypothesis about MCI, as listed below. The DA has no state beliefs about the patient in its belief base since, unlike the PA, it does not have management responsibility for patients. Although DA only has domain beliefs in our practice, we note that the theory presented in Section 3 does not impose such a restriction on the agents.
The conclusions of the first three domain beliefs are all "MCI is present", which supports the hypothesis. The conclusions of the last two domain beliefs are both "MCI is absent", which attacks the hypothesis. Some domain beliefs in this scenario are listed here. When the wi dialogue with the topic ¬f closes, the result ⟨¬f, T, 1⟩ is reached and stored in the RS. Later, when needed, the two agents do not need to check this literal again; instead, they retrieve it from the RS and reuse it. For example, DBelief3 also has the premise ¬f. When DA makes the move ⟨DA, open, ai, (¬f ∧ g ∧ a_m ∧ ¬e → MCI, 1)⟩ (move 132 in Fig. 5), the next move for PA to make is ⟨PA, assert, ai, (→ ¬f, 1)⟩, claiming that f is false instead of making an open wi move.
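The reuse of stored results can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `ResultStore`, the function `next_move`, and the tuple encoding of moves are hypothetical names, and the sketch covers only the two cases described in the text (a literal already proven true leads to an assert move, and a literal not yet examined leads to an open wi move). How the agents act on stored F or U results is not specified in this passage, so the sketch returns None there.

```python
from typing import Dict, Optional, Tuple

Result = Tuple[str, Optional[float]]  # (truth value "T"/"F"/"U", possibilistic value)


class ResultStore:
    """Hypothetical RS: maps a literal to the outcome of its wi dialogue."""

    def __init__(self) -> None:
        self._results: Dict[str, Result] = {}

    def record(self, literal: str, truth: str, p: Optional[float]) -> None:
        self._results[literal] = (truth, p)

    def lookup(self, literal: str) -> Optional[Result]:
        return self._results.get(literal)


def next_move(agent: str, literal: str, rs: ResultStore) -> Optional[tuple]:
    """Choose the move for one premise of the current ai dialogue topic."""
    cached = rs.lookup(literal)
    if cached is None:
        # Never examined: a nested wi dialogue is needed.
        return (agent, "open", "wi", literal)
    truth, p = cached
    if truth == "T":
        # Already proven: assert the stored belief instead of re-opening.
        return (agent, "assert", "ai", (literal, p))
    return None  # F or U: behaviour not specified in this passage


rs = ResultStore()
print(next_move("PA", "¬f", rs))
rs.record("¬f", "T", 1.0)
print(next_move("PA", "¬f", rs))
```

The second call corresponds to PA asserting (→ ¬f, 1) for DBelief3 once ¬f has been proven in an earlier wi dialogue.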
Then, the two agents open another wi dialogue with the topic "cognitive symptoms are present" (g). In this example, it can also be proved to be true. The third premise of DBelief1, "proven no disability to perform self care" (¬e), can also be proved since PA has a symptom (the last state belief of PA) showing this, and there are no other beliefs attacking it in this example.
The two agents then open the fourth wi dialogue with the last premise, "Status shown no executive dysfunction" (¬a), as the topic. This premise is proved to be false since its opposite, "Status shown mild executive dysfunction", is proved to be true (a and a_m are true); therefore, the outcome of the wi dialogue is ⟨F, null⟩. At this point, the ai dialogue with topic DBelief1 is closed, and its outcome is null since ¬a is false.
From the previous explanation, we can see that four wi dialogues have been generated and nested within the ai dialogue based on DBelief1, since it has four premises. However, if the order of its premises is changed, the number of wi dialogues can be different. For example, if the last premise ¬a is the first one to be checked, there will be only one wi dialogue (with this premise as its topic). The reason is that this premise is found to be false, which already makes the outcome of the ai dialogue null, so the agents need not check the other premises.
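The effect of premise order on the number of nested wi dialogues can be illustrated with a small Python sketch. It assumes that the premises are checked strictly in order, that the ai dialogue stops at the first premise that cannot be proven, and that no result is reused from the RS; the function name `count_wi_dialogues` and the truth table are illustrative only.

```python
from typing import Dict, List


def count_wi_dialogues(premises: List[str], provable: Dict[str, bool]) -> int:
    """Count wi dialogues opened before the ai dialogue can be closed."""
    opened = 0
    for premise in premises:
        opened += 1  # one nested wi dialogue per premise that is examined
        if not provable[premise]:
            break    # the ai outcome is already null; skip the rest
    return opened


# Premises of DBelief1 in the scenario: only ¬a cannot be proven.
provable = {"¬f": True, "g": True, "¬e": True, "¬a": False}

print(count_wi_dialogues(["¬f", "g", "¬e", "¬a"], provable))  # 4
print(count_wi_dialogues(["¬a", "¬f", "g", "¬e"], provable))  # 1
```

The two calls reproduce the counts discussed above: four wi dialogues in the original premise order, and one when ¬a is checked first.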
Since DA has four more domain beliefs related to MCI, it opens the other ai dialogues in sequence. When all five domain beliefs have been checked, the wi dialogue with MCI as its topic is closed. A total of 237 moves are made by the two agents, which can be seen in the screenshot in Fig. 5. The timepoints in Fig. 5 are not consecutive because the sub-dialogues are hidden; they can be expanded by clicking the small triangles.
In this scenario, two conflicting arguments are generated due to the two conflicting guidelines, which can be seen in Fig. 6. In this case, the outcome of the dialogue is ⟨U, null⟩ since the two arguments are equal in weight.
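One way to picture how equally weighted conflicting arguments lead to an undecided outcome is the following Python sketch. It is not Algorithm 1 or 2 of this paper. It is a simplified stand-in that assumes the strongest supporting argument is compared with the strongest attacking argument, and it reproduces only the outcome patterns reported in the scenario: ⟨T, p⟩, ⟨F, null⟩, and ⟨U, null⟩. The function name `wi_outcome` is hypothetical.

```python
from typing import List, Optional, Tuple

Outcome = Tuple[str, Optional[float]]


def wi_outcome(support: List[float], attack: List[float]) -> Outcome:
    """Compare the strongest supporting and attacking argument values."""
    best_for = max(support, default=None)
    best_against = max(attack, default=None)
    if best_for is None and best_against is None:
        return ("U", None)       # no beliefs about the topic at all
    if best_against is None or (best_for is not None and best_for > best_against):
        return ("T", best_for)   # support wins
    if best_for is None or best_against > best_for:
        return ("F", None)       # attack wins
    return ("U", None)           # equal weights: undecided


print(wi_outcome([1.0], []))      # single supporting argument
print(wi_outcome([1.0], [1.0]))   # two conflicting arguments of equal weight
```

Under these assumptions, the second call mirrors the scenario in which the two guidelines yield arguments of equal weight and the topic remains undecided.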

Related Work
MASs are widely studied in a variety of fields such as task allocation [16], online trading [36], disaster response [9], and modeling social structures [38]. There are some works related to MASs in the healthcare domain, such as in medical data management [40], patient scheduling [25], and remote care (e.g., for senior citizens) [5,10,17,39]. There are also some works on decision-support applications in MASs to help physicians diagnose patients [11,37]. However, there are few works on how agents communicate and make decisions in collaboration in a transparent manner for the purpose of educating medical professionals.
The authors in [11] present a decision-support system for diagnosing brain tumors and predicting the progress. In their system, there are several medical centres where each centre has several agents (e.g., a classifier agent, database agent, preprocessing agent) contributing to different roles. The classifier agent aims to provide tumor classifications based on its case data to support the decision-making process when receiving a request to diagnose a new patient from another centre. However, this classification process is only based on its local database, which means a classifier agent does not interact with other classifier agents to make the decision. As the authors mentioned in [11], the accuracy of the classifiers depends heavily on the volume of cases. Therefore, the result is less accurate when it is only based on the case data in its local database than when it uses the data in all the classifier agents. In our system, the agents make a decision based on the combination of their belief bases, and they provide transparent reasons for the decision which are comprehensible to the clinician.
The authors in [37] present the architecture of a healthcare intelligent assistant using various existing organizational knowledge to solve medical cases. They use grid technology and do not need to consider the underlying details of each node, such as resource management, security, and others. In [18], a generic computational model is proposed to implement the development of interoperable intelligent software agents for medical applications. None of these approaches provide details about how agents interact to collaboratively make a decision.
Although the approaches presented in [11,37] are related to decision-support systems using case-based reasoning in MASs, the inference procedure is still carried out within one agent. The reasoner agent receives a new case from the sender agent, infers with its database to match the received case, and then forwards the result to the sender or a third agent. This approach resembles standard programs that receive an input and return an output. From a software development perspective, communication between agents in these systems is simpler than in our system. In our system, neither of the agents can execute the reasoning process alone since a single agent does not have enough knowledge to do so. The data needed for reasoning depends on all the agents. Before the dialogue, no agent has an explicit requirement to get all the necessary data from another agent. Only during execution does an agent ask for or assert new knowledge, based on the current topic, its local knowledge, and the knowledge retrieved from the other agent. Therefore, we cannot model this system with the standard approach using inputs and outputs.
MASs are still mostly studied in academia and rarely widely deployed in the practical medical domain [14]. Our framework solves the problem of the necessary knowledge being located in a distributed environment, which opens the possibility to provide a generic, consensus-based domain expert agent (DA) that can support physicians in reasoning about a particular patient case, regardless of the physical location, while keeping the patient anonymous to the DA to protect sensitive information. The solution is implemented for experimental purposes in an existing medical application. We anticipate it could be widely used in the near future since the hypothesis-based dialogue complements the diagnostic reasoning support that is currently implemented in the system. However, user studies need to be conducted in clinical practice among users with different levels of knowledge and skills.
Our architecture allows users to store patient data in their local databases/repositories as well as share data and make decisions together with the latest DA knowledge. DA's knowledge (mainly represented as rules in our application) grows dynamically. Therefore, it is not a good idea to store this knowledge in the client's local repository to make decisions locally.
We should note that although DA has rules and no facts in our application, the inquiry dialogues presented in Sections 2 and 3 can be used when both agents have both facts and rules, for example, when the two agents represent different complementary disciplines in a team.
From a theoretical perspective, to the best of our knowledge, there are only a few works on generating inquiry dialogues apart from [2]. The authors of [32] define locutions (similar to moves in our system) and attitudes. Locutions describe which legal moves can be made at a specific point and how to update the commitment store after the move. Attitudes control the assertion and acceptance of propositions. Their approach also gives a precise notion of the outcome of the dialogue. Similar to the approach presented in [32], McBurney and Parsons [27] define utterance rules to generate legal moves. They also define a rule to preclude infinite regress caused by malevolent participants, so that the same move may only be executed once by only one of the participants. However, neither of these two approaches provides a strategy for generating inquiry dialogues.
Fan and Toni propose a generic dialogue model that is not tailored to any particular dialogue type [7]. Their approach and ours share some similarities: both provide legal-move and outcome functions, and in both the agents can jointly construct arguments and play both roles (defend and attack). However, their system is based on the assumption-based argumentation (ABA) framework. To use it in real applications, a specific kind of dialogue and a specific knowledge representation language must be chosen before it can be grounded. To our knowledge, it has not been implemented in a real application.
Argumentation is very useful in dealing with conflicts, as evidenced by recent research in healthcare. Chalaguine et al. [3] use argumentation to change patients' behavior. They propose four main dimensions to categorize the arguments in their domain model: ontological, functional, context, and topic types. Noor et al. [31] extract arguments from patient experience expressed on the social web to assess a particular drug. They formalise ten classification rules to sort the arguments, evaluate the arguments, and compare with user ratings of the drug. Unlike these works which obtained arguments directly from practical domains, we let the agents make dialogues to generate arguments since the beliefs are distributed in different places.

Conclusions and future work
In domains such as the medical domain, knowledge bases sometimes cannot be combined due to various constraints, and the knowledge applied in reasoning and decision making is often uncertain and inconsistent. To deal with these complex medical situations, inquiry dialogues in MASs are pursued, which simulate reality well. However, this field is relatively unexplored, and the very limited number of existing pioneering research studies (e.g., [2]) are theoretically oriented and not trivial to implement.
The presented research contributes to the topic in at least three respects. First, the theoretical framework of an earlier approach is modified and made more practical and feasible for real applications. For example, a wi dialogue is introduced not only at the top level but also at each sublevel of reasoning. Thus, the structure is more confined, allowing the reasoning to be performed in well-defined steps. Each problem is isolated and solved one by one, avoiding unnecessary complexity and uncertainty. This clearer and more efficient system is also very suitable for educational purposes. Second, possibilistic logic is used for capturing uncertain information, and an argumentation framework is used for dealing with inconsistent knowledge in reasoning about a diagnosis. An approach for generating two kinds of inquiry dialogues in a MAS is presented. Possibilistic logic is identified as a key factor for achieving the goal of managing uncertain information. Other methods for dealing with fuzzy situations (e.g., the probability-based approach) present challenges due to the lack of statistical data, which is frequently the case in the medical domain when treating individual cases. Last and most importantly, the new methods and theories facilitate the realization of inquiry dialogue systems. By implementing the results in DMSS-W, the decision-support system developed for the dementia domain, it is demonstrated how uncertain and inconsistent data are properly dealt with in the practice of dementia diagnosis. One scenario is presented and discussed in detail. Real applications of this kind are particularly rare in earlier work in the field; however, they are necessary to demonstrate the strong potential of applying CDSSs in real medical practice and to show how uncertainty can be managed using possibilistic logic and argumentation.
The study develops the solution a step further, when compared to earlier work [43], in the application of possibilistic logic and argumentation, providing a systematic and comprehensive solution that can be implemented.
In future work, a graphical display will be used instead of the current text display (e.g., Fig. 5), so that the system appears more intuitive and user-friendly. This way, the user can more easily get an overview of the dialogues about the related diagnostic reasoning and decision-making. This is important for medical educational purposes. Also, the theories and methods will be further refined so that even more complicated situations can be handled with more precision in the final diagnosis results.
Finally, the MAS will be evaluated in clinical practice. The log data will be analyzed to detect reasoning patterns of the users. In this way, the educational function of the system can be verified by observing how novice users develop their reasoning and decision-making skills while using the system.