Combining theory of mind and abductive reasoning in agent-oriented programming

This paper presents a novel model, called TomAbd, that endows autonomous agents with Theory of Mind capabilities. TomAbd agents are able to simulate the view of the world that their peers have and reason from their perspective. Furthermore, TomAbd agents can reason from the perspective of others down to an arbitrary level of recursion, using Theory of Mind of nth order. By combining the previous capability with abductive reasoning, TomAbd agents can infer the beliefs that others were relying upon when selecting their actions, hence putting them in a more informed position when it comes to their own decision-making. We have tested the TomAbd model in the challenging domain of Hanabi, a game characterised by cooperation and imperfect information. Our results show that the abilities granted by the TomAbd model boost the performance of the team along a variety of metrics, including final score, efficiency of communication, and uncertainty reduction.


INTRODUCTION
Theory of Mind (ToM) is the human cognitive ability to perceive, interpret and reason about others in terms of their mental states, such as their beliefs, goals and intentions [2]. It is an essential requirement for effective participation in social life, and is strongly linked to the feeling of empathy [4] and to moral judgement [3]. The emergent field of social AI acknowledges the need for ToM-like capabilities for software agents to successfully interact with other agents as well as with humans [1,6].
In this work, we present an overview of our novel TomAbd agent model [5], which builds on the Belief-Desire-Intention (BDI) agent architecture. A TomAbd agent i is designed to operate in the following generic scenario. A different (not necessarily TomAbd or even BDI) agent j, denoted as the actor, executes some action α observable by i. Upon observing the action, i uses ToM and substitutes its belief base with the beliefs it estimates j has. Once this change of perspective is complete, i engages in abductive reasoning to derive the possible beliefs that might have led j to decide on α. After some post-processing, i incorporates the information thus derived into its own belief base, to inform its posterior decision-making.
Besides ToM, the second main component of the TomAbd agent model is abductive reasoning. Abduction is a logical inference paradigm differing from traditional deductive reasoning [7]. Classical deduction follows the modus ponens rule: from knowledge of p and of the implication p → q, q is inferred. In contrast, abduction makes inferences in the opposite direction: from knowledge of the implication p → q and the observation of q, p is inferred as a possible explanation for q. The explanation p may be further constrained by the need to be consistent with prior knowledge. In the TomAbd agent model, observations refer to actions executed by other agents (action(j, α)), while explanations refer to the beliefs that might have led j towards α.
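To make the contrast concrete, here is a minimal, self-contained Python sketch (the rule base, literal names, and helper functions are ours for illustration; the model's actual meta-interpreter is a logic-programming construct, not this code):

```python
# Toy rule base: Horn clauses represented as (head, body) pairs.
RULES = [
    ("wet_grass", ["rained"]),        # rained -> wet_grass
    ("wet_grass", ["sprinkler_on"]),  # sprinkler_on -> wet_grass
]
FACTS = {"rained"}                       # known ground literals
ABDUCIBLES = {"rained", "sprinkler_on"}  # literals that may serve as explanations

def deduce(goal):
    """Modus ponens: `goal` holds if it is a fact, or if some rule concludes
    it and all of that rule's premises recursively hold."""
    if goal in FACTS:
        return True
    return any(all(deduce(p) for p in body)
               for head, body in RULES if head == goal)

def abduce(observation):
    """Abduction: for each rule concluding `observation`, propose its body as
    a candidate explanation, provided every premise is an abducible."""
    return [body for head, body in RULES
            if head == observation and all(p in ABDUCIBLES for p in body)]

print(deduce("wet_grass"))  # True, via rained -> wet_grass
print(abduce("wet_grass"))  # [['rained'], ['sprinkler_on']]
```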
In the remainder of this paper, we present the key features of the TomAbd agent model (Section 2) as well as its performance in the cooperative board game Hanabi and the main takeaways from this work (Section 3).

THE TOMABD AGENT MODEL
Figure 1 presents the architecture of the TomAbd agent model. The core ToM functionality is provided by the AdoptViewpoint function. Agent i operates according to the logic program P_i contained in its belief base. P_i is composed of a set of ground literals and of Horn clauses h ← l_1 ∧ ... ∧ l_m. Among the clauses are domain-dependent ToM clauses. ToM clauses have literals believes(Ag, F) as their head, to express the fact that agent Ag believes some fact F to be true. The ToM capabilities of TomAbd agents come from their ability to substitute their logic program P_i with the logic program they estimate that others have, and to reason from that perspective. Hence, the program that agent i estimates that agent j has is denoted by P_{i,j}, and given by:

P_{i,j} = {F | P_i ⊨ believes(j, F)}    (1)

The switch from P_i to P_{i,j} constitutes first-order ToM, as agent i adopts the beliefs that it estimates agent j has. However, eq. (1) can be extended down to an arbitrary level of recursion:

P_{i,j_1,...,j_n} = {F | P_{i,j_1,...,j_{n-1}} ⊨ believes(j_n, F)}    (2)

Hence, eq. (2) constitutes nth-order ToM. We denote the sequence [j_1, ..., j_n] of agents whose beliefs i recursively incorporates as the perspective ν that i adopts. The AdoptViewpoint function essentially takes as input a perspective ν and adopts it by applying eq. (2). First, however, it saves a copy of the agent's original program P_i in a back-up belief base, so that it can return to it later and continue the reasoning from i's own perspective. Now that we understand how TomAbd agents implement ToM, we present how they combine it with abductive reasoning to derive the beliefs motivating the actions of other agents. The integration of ToM and abductive reasoning into a single functionality is provided by TomAbductionTask, which constitutes the core function of the TomAbd agent model.
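As a rough illustration of eqs. (1) and (2), the following Python sketch treats a belief base as a set of literals and encodes believes(Ag, F) facts as tuples; the data representation and function bodies are our assumptions, not the model's implementation:

```python
def estimate_program(program, agent):
    """Eq. (1): the program the current reasoner estimates `agent` to have,
    i.e. every fact F such that believes(agent, F) is in `program`.
    (A full implementation would derive believes/2 literals via the
    domain-dependent ToM clauses rather than just look them up.)"""
    return {lit[2] for lit in program
            if isinstance(lit, tuple) and len(lit) == 3
            and lit[0] == "believes" and lit[1] == agent}

def adopt_viewpoint(program, perspective):
    """Eq. (2): apply eq. (1) once per agent in `perspective` = [j1, ..., jn],
    yielding the program adopted under nth-order ToM. The original program is
    kept as a back-up so the agent can later return to its own view."""
    backup = set(program)
    for agent in perspective:
        program = estimate_program(program, agent)
    return program, backup

# Second-order ToM: what i thinks j thinks k believes.
P_i = {("believes", "j", ("believes", "k", "p")), "q"}
adopted, backup = adopt_viewpoint(P_i, ["j", "k"])
print(adopted)  # {'p'}
```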
Function TomAbductionTask takes as input: (i) an observer perspective ν; (ii) an actor agent j; and (iii) the action α executed by j. It starts by adopting the perspective of the actor, generated by appending j to the observer perspective ν. Once agent i is operating from the logic program it estimates the actor j to have (with as many intermediate perspective switches as indicated by ν), i proceeds to generate explanations for the observation action(j, α). This step relies on an abductive meta-interpreter specific to the TomAbd agent model. This meta-interpreter is based on classical SLD clause resolution, with a small extension to handle abducible literals. We use Φ to denote the set of explanations generated by the abductive meta-interpreter. Φ is first revised for consistency against the logic program that i estimates the actor to have. Next, Φ is also checked for consistency against the logic program that i estimates the observer to have. Again, the observer's perspective is obtained by calling the AdoptViewpoint function. From these two consistency checks, agent i obtains two (possibly different) sets of abductive explanations: Φ_act and Φ_obs, revised from the actor's and the observer's perspectives, respectively. Φ_act and Φ_obs are then transformed into a suitable format to be incorporated into agent i's own belief base, to be queried during its own decision-making process.
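Putting the pieces together, a hypothetical outline of TomAbductionTask is sketched below (it reuses adopt_viewpoint from the sketch above; the parameter names and the abduce/consistent callbacks are placeholders for the abductive meta-interpreter and the explanation revision function, not the model's actual API):

```python
def tom_abduction_task(program, nu, actor, action, abduce, consistent):
    """Sketch: abduce why `actor` chose `action`, then revise the candidate
    explanations from both the actor's and the observer's perspectives."""
    # (i) adopt the actor's perspective: observer perspective nu plus the actor
    actor_prog, backup = adopt_viewpoint(program, nu + [actor])
    # (ii) generate candidate explanations Phi for the observed action
    phi = abduce(actor_prog, ("action", actor, action))
    # (iii) revise Phi for consistency against the actor's estimated program
    phi_act = [e for e in phi if consistent(e, actor_prog)]
    # (iv) revise Phi against the observer's estimated program
    observer_prog, _ = adopt_viewpoint(backup, nu)
    phi_obs = [e for e in phi if consistent(e, observer_prog)]
    return phi_act, phi_obs
```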
Next, we provide an overview of the other components from Figure 1 that have not been mentioned so far. The explanation revision function, ERF, is called by TomAbductionTask to ensure the consistency of explanations from both the actor's and the observer's perspectives. BuildAbdLit is an auxiliary function that transforms abductive explanations into a format suitable to be added to agent i's belief base. The explanation update function, EUF, is called from the standard BDI belief update function, BUF. EUF is triggered at every perception step of the BDI cycle, and is in charge of removing explanations from agent i's belief base once they are no longer informative. Finally, the SelectAction function is in charge of selecting agent i's next action, taking into account the abductive explanations currently present in its belief base.
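As an example of the bookkeeping involved, one possible reading of EUF is sketched below; treating "no longer informative" as "already entailed or contradicted by the updated belief base" is our interpretation, and `entails` stands in for the agent's query mechanism:

```python
def explanation_update(belief_base, explanations, entails):
    """Sketch of EUF: run at every perception step of the BDI cycle, keep only
    the explanations that still carry information for decision-making."""
    kept = []
    for phi in explanations:
        if entails(belief_base, phi):           # already known: redundant
            continue
        if entails(belief_base, ("not", phi)):  # contradicted: stale
            continue
        kept.append(phi)
    return kept
```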

RESULTS AND CONCLUSIONS
We have tested the TomAbd agent model on the Hanabi benchmark, a cooperative board game where agents can see each other's cards, but not their own, and can provide hints to other agents about their cards. The objective is to maximize a single team score by playing the cards. A full description of the game rules is available elsewhere. We have compared the performance in Hanabi of teams of TomAbd agents that use first-order ToM versus teams where agents do not use ToM capabilities. Our results show that teams where players use ToM score significantly higher, with acceptably low execution overhead. Besides the score, players with ToM exchange information more efficiently, and we also observed that a large percentage of the gain in score could be attributed to the information acquired by the agents through ToM reasoning.
In summary, our work presents a novel model for agents with Theory of Mind. It provides the cognitive machinery for agents to adopt and reason from multiple levels of perspective of their peers, and to abduce the potential reasons leading their peers to act the way they do, hence increasing the agents' own understanding of their environment. Our model endows autonomous agents with essential social abilities, which are becoming increasingly important in the current AI landscape.
Our model opens the door to several avenues for future work. Namely, the trade-offs between higher levels of ToM reasoning, the increased uncertainty about the abduced information, and the computational cost associated with recursive changes in perspective should be explored. This line of work can potentially inform psychological research on the limits that humans face when adopting high levels of ToM.

Figure 1: Architecture of the TomAbd agent model.