1 Introduction

Conversational agents have been proposed and designed to enable seamless interactions with people through computer-based means for communication, language processing, interpretation, and dialogue exchange (Adamopoulou and Moussiades 2020). These agents have substantially evolved since their first incarnation, the seminal ELIZA developed by Joseph Weizenbaum (Weizenbaum 1966). Ever since, conversational agents have leveraged Natural Language Processing (NLP), state-machine engines, and pattern matching with the intent of engaging in purposeful conversations with human users. Several milestones mark the technological evolution of conversational bots. Towards the end of the 1980s, Rollo Carpenter developed Jabberwacky (Rollo 1997), a self-learning agent mainly employing contextual pattern matching to identify the best answer (accessible over the internet only later, in 1997). In 1994, the term ChatterBot made its first appearance, used by Michael Mauldin to describe conversational programs (Mauldin 1994). Nowadays, this term has been shortened to chatbot and is used daily to describe these technologies. In the 1990s, considerable progress was made on conversational agent technologies, building on advances in Artificial Intelligence. For example, Richard Wallace developed ALICE (Artificial Linguistic Internet Computer Entity), which leveraged heuristic pattern matching.

In the 2010s, chatbot technologies started to gain adoption outside the academic sphere, in industrial and mainstream applications. Apple was among the first to commercialize a personal assistant with conversational capabilities, releasing Siri in 2011. Initially based on the Active platform (Guzzoni 2008), it assisted iPhone users by recognizing both written and spoken language. Other major technology companies released their virtual assistants shortly after. Google Now for Android and iOS devices appeared in 2012, evolving from a simple recommendation engine into a personal assistant able to dialogue with the user (similar to Siri). Microsoft followed with Cortana, released in 2014. The same year, Amazon launched Alexa, primarily targeting home automation and online shopping; although not tied to any particular OS, it quickly gained market adoption (Etherington 2014). The widespread acceptance of these virtual assistants and their use of asynchronous text-based interactions stimulated instant-messaging applications (e.g., Telegram, Facebook Messenger, and WhatsApp) to release APIs for third-party chatbot development, in addition to the chatbots mainly dedicated to customer service on companies' web pages.

The increasing adoption of chatbots has been boosted by anywhere/anytime availability, immediate response, confidentiality, social acceptance, and massive scalability. Leveraging these aspects, chatbots have proven effective in a wide range of domains such as eCommerce (Cui et al. 2017), education (Winkler and Söllner 2018), and in particular for motivation (e.g., social-network campaigns (Calvaresi et al. 2019)) and support (e.g., customer management (Xu et al. 2017), eHealth (Calbimonte et al. 2019), and assisted-living scenarios (Fadhil and Gabrielli 2017)).

Remarkable recent technological advancements are pushing the evolution of chatbots from keyword-based text recognition or static finite state machines (FSMs) for interpreting and orchestrating user interactions (today still representing a significant share of the market) towards hybrid solutions merging NLP (for text recognition) and FSMs (for the management of intents and user stories) (DeepLink 2022). However, purely FSM-based solutions still expose significant limitations, such as inadequate personalization; lack of real-time monitoring, reporting, and customization; lack of mechanisms to integrate communities of chatbots; limited knowledge-sharing capabilities; and the impossibility of deploying multi-domain campaigns within the same framework. These limitations stem from the predominantly rigid architectures proposed in most existing approaches, which rely on very specific scenarios translated into chatbot logic that must be reprogrammed every time a new scenario arises. This raises the cost of modifying a chatbot's behavior and prevents administrators from adapting it to specific situations. Moreover, most chatbot solutions rely on monolithic and centralized data-management strategies, making it hard to comply with privacy regulations (e.g., the European Union's General Data Protection Regulation, GDPR (Voigt and Von dem Bussche 2017)). The sensitive nature of the data collected through chatbot interactions makes it necessary to shift the control of personal data towards the users themselves, empowering them in the process. Many chatbot systems have used AI to boost the accuracy and user experience of their interactions; examples include the use of NLP to generate asynchronous follow-up questions (Rao et al. 2021) and the application of neural networks to perform emotion detection in chatbot conversations (Huddar et al. 2021). However, these AI techniques focus on response generation and the monitoring of conversational context, without considering the autonomous, decentralized, and collaborative nature of chatbots.
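To make the hybrid NLP+FSM design concrete, the following minimal sketch pairs a keyword-based intent detector (a stand-in for a real NLP classifier) with an FSM that manages the user story. All intents, states, and replies here are invented for illustration and do not correspond to any specific system from the literature:

```python
from typing import Optional

# Hypothetical intents; a production system would use an NLP classifier
# instead of this keyword lookup.
INTENT_KEYWORDS = {
    "greet": {"hello", "hi"},
    "order": {"order", "buy"},
    "bye": {"bye", "goodbye"},
}

# FSM transition table: (state, intent) -> (next state, reply).
TRANSITIONS = {
    ("start", "greet"): ("menu", "Hello! How can I help you?"),
    ("menu", "order"): ("ordering", "What would you like to order?"),
    ("ordering", "bye"): ("start", "Order saved. Goodbye!"),
}

def detect_intent(text: str) -> Optional[str]:
    """Naive keyword-based stand-in for an NLP intent classifier."""
    words = set(text.lower().split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None

class HybridChatbot:
    """Dialogue manager: the FSM orchestrates the user story."""

    def __init__(self) -> None:
        self.state = "start"

    def respond(self, text: str) -> str:
        transition = TRANSITIONS.get((self.state, detect_intent(text)))
        if transition is None:
            return "Sorry, I did not understand that."
        self.state, reply = transition
        return reply
```

The rigidity discussed above is visible even in this toy: supporting a new scenario means rewriting the transition table, which is precisely the cost that agent-based approaches aim to reduce.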

In the last decade, the trend of combining chatbots with multi-agent system (MAS) models and technologies has sought to mitigate the limitations mentioned above. Particular emphasis is given to application domains where the social and collaborative dimensions (e.g., crowd-sourcing, user profiling, and personalization) are essential in the interaction with users. These features are particularly relevant for domains such as healthcare and the fostering of behavioral change (Pereira and Díaz 2019), where the majority of the studies bridging chatbots and MAS can be found (Calbimonte et al. 2019; Calvaresi et al. 2019).

To better understand the current panorama of chatbot technology solutions employing agent-based approaches, this work presents a Systematic Literature Review (SLR) investigating the application domains, end users, requirements, objectives, technology readiness levels (TRL) (European Commission 2017), designs, strengths, limitations, and future challenges of the solutions found in the literature. The goal is to provide a tool for researchers, software engineers, innovation managers, and other practitioners to investigate the current state of the art and discuss the open challenges.

The rest of the paper is structured as follows: Sect. 2 presents the methodology applied for performing the SLR. Section 3 presents the review planning phase, including the definition of the protocol and the research questions. Section 4 describes how the review was performed. Section 5 analyses the outcomes of the applied methodology structured according to the research questions. Section 6 discusses the obtained results, projecting them into the stated (by the primary studies) and envisioned (by the authors of this paper) future directions. Finally, Sect. 7 concludes the paper.

2 Systematic literature review methodology

The approach employed in this paper aims at being both rigorous and reproducible. It relies on the methodology outlined by Kitchenham (Kitchenham et al. 2009), which has also been employed in similar contexts (Palmarini et al. 2018; Calvaresi et al. 2021b; Anjomshoae et al. 2019; Mualla et al. 2019; Calvaresi et al. 2018). Figure 1 presents a schematic representation of the adopted procedure, which comprises three stages:

P1: Planning the review. This phase consists of defining the main generic question(s), deriving the Structured Research Questions, characterizing the entire search protocol, matching the requirements (rigor and reproducibility), and validating the protocol.

P2: Performing the review. This phase entails the execution of the planned activities: literature collection and selection, literature elaboration, and disagreement resolution.

P3: Dissemination. This phase includes the analysis, documentation, and reporting of the results, and a summary of the lessons learned.

Fig. 1 Systematic literature review phases (Kitchenham et al. 2009)

3 Review planning

This section describes the definition of the structured research questions and the development of the review protocol describing the search strategy, the inclusion and exclusion criteria, the biases and disagreement resolution, and the quality criteria.

3.1 Research questions

As introduced in Sect. 1, the research community has proposed multi-agent-based chatbots in recent years for different domains, stakeholders, and purposes. The main research question can therefore be formulated as follows: How are agent-based chatbots characterized, envisioned, and employed? To investigate this question, we follow the Goal-Question-Metric (GQM) approach (Galster et al. 2014; Kitchenham et al. 2010). This approach has been employed in several other studies in computer-science-related domains (e.g., augmented reality for maintenance (Palmarini et al. 2018), virtual reality for education (Radianti et al. 2020), explainable agents and robots (Anjomshoae et al. 2019), agents and blockchains (Calvaresi et al. 2018)) and in other domains (e.g., tourism (Yang et al. 2017; Calvaresi et al. 2021b)). The dimensions targeted in this study apply to "intelligent" technologies and research: scientific interest over the years, application domains, stakeholders, requirements, goals, technologies, advantages, limitations, countermeasures, and future research. By formulating questions addressing these aspects, we provide investigations and analyses in support of practitioners (an aggregated understanding of current work), new tech pioneers (an overview of what has been tried and what might be future targets), and industrial researchers (bringing research ideas to the real-world market). Thus, we devised a set of ten structured research questions.

SRQ1:

To establish an understanding of the demographic evolution of agent-based chatbots, we inquire: How are the research efforts temporally and geographically distributed?

SRQ2:

To elicit the domains on which the agent-based chatbots research focuses, we inquire: Which application domains have employed agent-based chatbots?

SRQ3:

To clarify who are the stakeholders of agent-based chatbots, we inquire: Who are the users of the chatbot systems relying on the agent paradigm?

SRQ4:

To formalize the requirements arranged w.r.t. the given stakeholders, we inquire: What are the requirements standing behind the employment of agent-based chatbots?

SRQ5:

To explore what research tried to achieve with agent-based chatbots, we inquire: What are the objectives set for agent-based chatbots?

SRQ6:

To better understand the technological characterization, we structured SRQ6 into four sub-questions:

a):

Which chatbot design (e.g., paradigms) and implementations have been proposed?

b):

Which technologies have been employed in the proposed solutions?

c):

Which technologies have been previously employed?

d):

What is the Technology Readiness Level (European Commission 2017) of the solutions proposed in the primary studies?

SRQ7:

To explore the benefit of existing solutions, we inquire: What are the strengths of employing agent-based chatbots?

SRQ8:

To identify the shortcomings of the existing solutions, we inquire: What are the limitations of employing agent-based chatbots?

SRQ9:

To understand the measures employed by the authors to achieve their objectives and overcome the limitations, we inquire: What are the proposed solutions for the limitations identified in SRQ8?

SRQ10:

Finally, to foster the establishment of future objectives, we inquire: What are the future challenges for chatbot-based solutions envisioned by the primary studies?

3.2 Review protocol

The search strategy included the selection of the following information sources: IEEE Xplore, ScienceDirect, ACM Digital Library, CiteSeerX, and PubMed. The selection of keywords relied on the reviewers' background and knowledge of agent-based chatbots; the keywords include: multi-agent system, MAS, agent-based, chatbot, conversational agent, virtual assistant, personal assistant. To increase the results' accuracy, some keywords were combined. For example, MAS was expanded into three different queries: (i) MAS + chatbot + virtual assistant, (ii) MAS + chatbot + personal assistant, and (iii) MAS + chatbot + conversational agent.
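The expansion of a keyword into several queries amounts to taking the Cartesian product of keyword groups, one term per group. A small sketch (the grouping of keywords is our illustration, not the reviewers' exact procedure):

```python
from itertools import product

def build_queries(*keyword_groups):
    """Return every AND-combination taking one keyword per group."""
    return [" + ".join(combo) for combo in product(*keyword_groups)]

# Reproduces the three MAS-based expansions mentioned in the text.
queries = build_queries(
    ["MAS"],
    ["chatbot"],
    ["virtual assistant", "personal assistant", "conversational agent"],
)
```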

Each search query produced a set of articles added to the list of papers to be considered. The reviewers screened the results of each query to evaluate the articles' coherence with the study; in particular, titles and abstracts were assessed according to the criteria presented in the next section.

3.2.1 Inclusion and exclusion criteria

The initial search collected 108 papers, hereafter referred to as primary studies. Additional filtering criteria were then applied (see Table 1). In particular, the criteria were selected to (i) avoid multiple (usually incremental) papers describing the same work, (ii) bound the time window of the investigation (e.g., excluding older, less relevant works, given the technological advancements), (iii) select works contributing to the actual investigated topic, and (iv) ensure the presence of a tangible theoretical/practical contribution, avoiding purely visionary and blue-sky studies. Criteria definitions are usually quite specific per topic/review; nevertheless, several studies adopt similar criteria (Yang et al. 2017; Anjomshoae et al. 2019). Applying the criteria defined in Table 1, we purged unrelated papers and narrowed the set down to 38 contributions. Three reviewers were instructed to verify the compliance of the papers with the aforementioned inclusion criteria. Each reviewer operated independently while filtering the list of papers. After the filtering, a paper was included if at least two reviewers agreed on it.
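The two-of-three agreement rule used above can be expressed compactly; the boolean vote encoding and paper identifiers below are assumptions of this sketch:

```python
def select_papers(assessments):
    """Keep papers that at least two independent reviewers voted to include.

    `assessments` maps a paper identifier to a tuple of boolean votes,
    one per reviewer.
    """
    return [paper for paper, votes in assessments.items() if sum(votes) >= 2]
```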

Table 1 Inclusion and exclusion criteria

3.2.2 Biases and disagreement resolution policy

The policy for bias and disagreement resolution allows the reviewers to cross-examine each task, limiting biases and resolving disagreements among themselves. In particular, during the article selection task, three reviewers cross-validated the inclusion/exclusion decisions. During the elaboration of the articles, uncertainties were discussed in periodic meetings.

3.2.3 Features and quality criteria

Assessing the quality of the extracted information is crucial. The following set of features has been chosen to answer the structured research questions: publication year, geographical localization, main purpose, context, kind of users involved, scenarios, level of abstraction\(\dagger\), architectures and designs, development methodologies, techniques, technologies and devices, user needs coverage\(\ddagger\), need–offered support relation, kind of disease or difficulties supported\(\ddagger\), awareness provided, architectural evidence\(\ddagger\), technological evidence\(\ddagger\), technical evidence\(\ddagger\), architectural limitations\(\ddagger\), technological limitations\(\ddagger\), technical limitations\(\ddagger\), identified future directions, identified future challenges. The feature annotated with (\(\dagger\)) takes C, P, or T as possible values, which stand for: C = conceptual; P = prototype architectures and frameworks, no results provided; T = tested architectures and frameworks, results provided. The features annotated with (\(\ddagger\)) take Y, P, or N values, which stand for: Y = the information is explicitly defined/evaluated; P = the information is implicit/stated; N = the information is not inferable. This categorization of the collected features follows the DARE criteria elaborated and proposed by (Kitchenham et al. 2009).
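The coding scheme can be captured as a small validation helper over one row of the extraction table. The field names below are hypothetical and serve only to illustrate the C/P/T and Y/P/N value sets:

```python
ABSTRACTION_LEVELS = {"C", "P", "T"}  # Conceptual / Prototype / Tested
EVIDENCE_LEVELS = {"Y", "P", "N"}     # explicit / implicit / not inferable

# Hypothetical field names for one extraction-table row.
EVIDENCE_FIELDS = (
    "user_needs_coverage",
    "architectural_evidence",
    "technological_evidence",
)

def validate_record(record):
    """Check one extracted-feature row against the coding scheme."""
    if record["level_of_abstraction"] not in ABSTRACTION_LEVELS:
        raise ValueError("invalid level of abstraction")
    for field in EVIDENCE_FIELDS:
        if record[field] not in EVIDENCE_LEVELS:
            raise ValueError(f"invalid value for {field}")
    return True
```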

4 Review execution

This section details the Perform Review task in Fig. 1. In particular, it elaborates on the review's execution, including details on article collection, selection, and elaboration. The semi-automatic search presented in Sect. 2 resulted in a total of 108 selected articles. The assessment of the primary studies to be finally included in the elaboration phase was conducted by a total of three reviewers. In particular, the articles were organized into three equally sized groups, each elaborated by two reviewers (in rotation), with the third one involved in case of conflict. Table 2 details the selection assessments, referring to the reviewers with the letters \({\mathcal {A}}\), \({\mathcal {B}}\), and \({\mathcal {C}}\).

Table 2 Summary of the inclusion/exclusion phase of the collected papers

The papers have been listed following the collection order and respecting the relevance-based sorting obtained when querying the scientific web collectors. Notably, the third set of papers recorded a drastic reduction in the acceptance rate. This observation suggests two possible interpretations: (i) the stopping criterion was too loose, and/or (ii) titles and abstracts do not properly mirror the papers' content.

The filtering phase concluded with 38 papers to be elaborated out of the 108 initially collected (21.1% total acceptance rate). In turn, the features presented in Sect. 3.2 were extracted and collected in a tabular format to facilitate their elaboration and the identification of possible correlations to be discussed. Nevertheless, in some cases, the extraction of relevant information was challenging due to the lack of explicit statements (e.g., very few studies clearly mention the limitations of their approaches). To cope with this situation, the reviewers leveraged their knowledge of the topic to produce a more comprehensive understanding and offer the reader additional information (rigorously decoupled from the presentation of the results and solely addressed in the discussion).

5 Review results and analysis

In the following, we structure the results of the SLR according to the research questions defined in Sect. 3.1.

5.1 Demographics

Referring to question SRQ1, Figs. 2 and 4 show the temporal and geographical distribution of papers targeting agent-based chatbots. Figure 2 reports the primary studies' distribution over the time window selected for this study. A slight upward trend can be observed in recent years; nevertheless, the research field of multi-agent-based chatbots still seems to be a niche area. Indeed, looking at Fig. 4, the geographical localization of the first authors' institutions (organized per country) reflects the distribution of research groups in the field of multi-agent systems (i.e., centered in the US and Europe). Finally, Fig. 3 provides a further view of the selected primary studies by grouping the papers per continent.

Fig. 2 Total papers per year

Fig. 3 Number of papers per continent per year

Fig. 4 Number of papers per country

5.2 Application domains

Regarding SRQ2, Fig. 5 graphically represents the application domains addressed by the primary studies. The panorama of application domains is remarkably broad and diversified, ranging from education (Alencar and Netto 2014) to healthcare (Kökciyan et al. 2021) and finance (de Bayser et al. 2018). Nevertheless, personalized assistive purposes appear to have attracted the most effort across domains.

Fig. 5 Contributions per application domain

Fig. 6 Type of studies

5.3 Intended user classes

Concerning SRQ3, Fig. 7 shows the distribution of the intended user classes identified by the selected primary studies, which is a direct consequence of the application domains. On the one hand, it is evident that most of the literature operates in the context of education, having students, tutors, or professors as the main users. On the other hand, although a minority, a considerable number of studies are solely conceptual or general (see Fig. 6) and do not tackle a specific use case. Overall, the majority (\(57.89\%\)) of the primary studies present some form of prototype, \(23.69\%\) deal with technical or scientific concepts, and \(18.42\%\) of the selected papers contain extensively tested artifacts.

Fig. 7 Number of papers per type of users

5.4 Requirements

Concerning question SRQ4, we elicited the requirements expressed by the primary studies. We can see the evolution of the main features captured by these requirements in Fig. 9. We categorized the requirements as follows:

  • Functional Requirements: requirements affecting the behavior of the platforms (see Table 3);

  • Architectural Requirements: requirements steering the system or the back end of the platforms (see Table 4);

  • Front-end Requirements: requirements applied to the front end of the platforms (see Table 5).

Figure 8 depicts the distribution of the types of requirements characterizing the primary studies. The authors of the elaborated papers focus primarily on functional (41.7%) and architectural (40.0%) requirements. Front-end requirements were explicitly formalized in only 18.3% of the studies.

Fig. 8 Type of requirements

Table 3 Functional requirements
Table 4 Architectural requirements
Table 5 Front end requirements
Fig. 9 Evolution of features in agent-based chatbots according to the requirements

5.5 Objectives of the studies

Investigating SRQ5, we collected and clustered the objectives of the primary studies, as depicted in Fig. 10. Most of the papers tackle the theoretical foundations of MAS-based chatbots (i.e., nine studies focus primarily on conceptual aspects of the current state of the art or on non-concrete systems). Among them, (Augello et al. 2017) define a notion of "social intelligence" for chatbots and link it to current technologies' capability to develop social chatbots, while (Hung et al. 2009) define an evaluation process to assess the "naturalness" of a chatbot system.

Concerning more practical studies, goal-driven behaviors (e.g., intended to tackle user personalization) have been studied for dietary and entertainment purposes. (Angara et al. 2017) describe a chatbot designed to support users in the kitchen by providing recipe recommendations while adhering to their dietary goals, medical conditions, preferences, and available ingredients. Similarly, (Wong et al. 2012) describe a goal-oriented virtual chat companion for children, focusing on structured entertainment (e.g., story-telling, collaborative games) and on engaging in "free-flowing" dialogue with unstructured responses. Concerning behavioral change, studies such as (Calvaresi et al. 2019; Calbimonte et al. 2019) target profiling and craving analysis to tailor smoking-cessation support, while (Calvaresi et al. 2021a) target the maintenance/improvement of physical balance capabilities with personalized exercises. (Chapman et al. 2019) and (Kökciyan et al. 2021) demonstrate the development of a chatbot system helping stroke patients manage their care: the system processes data from multiple inputs (e.g., blood pressure monitor, electronic health record) to feed a computational argumentation engine and respond to user queries.

From a different perspective, data-driven behavior has been addressed in contributions including (Agostaro et al. 2005; Pilato et al. 2007; Augello et al. 2009), which deal with the limitations of conventional, rule-based semantics by introducing the paradigm of Latent Semantic Analysis (LSA). Indeed, according to (Landauer et al. 1998), LSA overcomes the limits of rule-based pattern matching and introduces an element of intuitiveness by constructing a conceptual space. Another targeted objective is the integration of multiple domain-specific knowledge sources into one chatbot system. For example, (Jiang et al. 2015; Augello et al. 2011) deal with the integration of different static sources (i.e., vector-space-model-based indices, XML, relational databases, SPARQL queries, and AIML), while (Pilato et al. 2011; Tarau and Figa 2004) manage knowledge dynamically based on the current dialogue context.
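The conceptual space built by LSA can be illustrated with a truncated singular value decomposition of a toy term-document matrix. The corpus, terms, and counts below are invented purely for illustration:

```python
import numpy as np

# Toy term-document co-occurrence counts (rows = terms, columns = documents).
terms = ["diet", "recipe", "story", "game"]
X = np.array([
    [2.0, 1.0, 0.0],  # "diet"   appears in docs 1 and 2
    [1.0, 2.0, 0.0],  # "recipe" appears in docs 1 and 2
    [0.0, 0.0, 2.0],  # "story"  appears in doc 3 only
    [0.0, 1.0, 1.0],  # "game"   appears in docs 2 and 3
])

# Truncated SVD: keep k latent "concepts" to build the conceptual space.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vectors = U[:, :k] * s[:k]  # term coordinates in concept space

def term_similarity(i: int, j: int) -> float:
    """Cosine similarity between two terms in the latent concept space."""
    a, b = term_vectors[i], term_vectors[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Terms that co-occur across documents (e.g., "diet" and "recipe") end up close together in the latent space even when exact keyword matching would treat them as unrelated strings, which is the "intuitiveness" the cited studies exploit.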

While the studies mentioned above operate in a user-to-single-agent scope, a few studies operate in a user-to-multiple-agents (i.e., chatbots) scope. For example, (de Bayser et al. 2017, 2018) address the coordination of multiple bots providing financial advice within the same chat, with the final goal of moderating the user–bots interaction. Finally, (Calvaresi et al. 2021a) focus, among other aspects, on the facets of data protection and data privacy.

Fig. 10 Primary studies' objectives

5.6 Technology characterization

Studying SRQ6, we classified the primary studies according to their technology readiness level (European Commission 2017) (see Table 6). In turn, we analyzed the technologies, architectures, and design principles employed in the primary studies.

Assessing the TRL is a valuable way to measure the maturity of a technology/system. The scale was originally devised by NASA (Sadin et al. 1989) and is nowadays used in many areas in various forms. In this context, we rely on the definition provided by the European Commission for research and innovation projects (European Commission 2017), as shown in Table 6.

Table 6 Technology readiness levels according to the definition provided by (European Commission 2017)

The TRL distribution of the primary studies is depicted in Fig. 11. Most of the studies lie at Levels 3 and 4 (68.1%), meaning that their final outcome is either a non-validated prototype (TRL 3) or at the laboratory-test stage (TRL 4). Two studies (i.e., (Calvaresi et al. 2019) and (Calvaresi et al. 2021a)) are classified as TRL 5: they have been deployed and validated in real-world health- and social-related campaigns.

Fig. 11 Technology readiness level distribution of the primary studies

In addition to the TRL of each study, the front-end and back-end technologies applied in the presented systems were analyzed. All studies with a TRL of 3 or higher were considered. Figure 12 depicts the distribution of the back-end technologies used in the primary studies. The majority (38.7%) of the systems employ Java-based back ends. This prevalence can be related to the wide use of MAS frameworks such as JADE and MaSMT. For example, (Alencar and Netto 2014), (de M. Batista et al. 2009), and (Bentivoglio et al. 2010) rely on JADE, and (Hettige and K. 2015) implement their system on top of MaSMT. Although not relying on a pre-existing MAS framework, (Pilato et al. 2007) and (Tarau and Figa 2004) implemented their own ad-hoc Java-based systems. Moreover, (Estes 2011) exploits features of the Java Enterprise Edition platform (Java EE) to develop a chatbot system, and (Memon et al. 2018) use communication sockets of the Java Standard Edition (Java SE). Several studies use unconventional technologies to develop MAS. For example, (de Bayser et al. 2017) use Akka, an actor-based framework, and (Z. et al. 2016) rely on ActiveMQ, a multi-protocol messaging server.

Fig. 12 Overview of utilized back-end technologies

Python-based back ends account for 9.7% of the total. In particular, (Jiang et al. 2015) and (Calvaresi et al. 2019) developed ad-hoc systems, while (Calvaresi et al. 2021a) rely on the SPADE framework.

Several studies (9.7%) relied on existing proprietary systems. For example, (Kalia et al. 2017) and (Angara et al. 2017) rely on IBM Watson's Conversation Platform, and (Zolitschka 2020) relies on Aimpulse Spectrum.

A number of studies (9.7%) developed their back ends as ad-hoc solutions using JavaScript (i.e., (de Bayser et al. 2018), (Thosani et al. 2020), and (Bosse 2021)).

6.5% of the studies (i.e., (Tarau and Figa 2004) and (Bosse 2021)) implemented a Prolog-based back end. Finally, with a share of 25.8%, a substantial number of studies developed prototypes but failed to mention details about their back-end implementation. One such example is (Kökciyan et al. 2021): although the authors specify the human interface, they do not detail how the actual back end is implemented.

Fig. 13 Overview of utilized front-end technologies

Figure 13 displays the distribution of the front-end technologies used in the developed chatbot systems. Web-based technologies have received the most attention (31.3%), mostly using JavaScript or JavaServer Pages (JSP) in Java.

Using existing web/mobile messaging platforms is a choice made by 15.6% of the studies. In particular, (Calvaresi et al. 2019) rely on Facebook Messenger, (Calvaresi et al. 2021a) offer Telegram Messenger among the available interfaces, (Tarau and Figa 2004) use Yahoo Instant Messenger (deprecated since 2012), and (Bentivoglio et al. 2010) adopt Jabber.

The development of ad-hoc solutions accounts for 15.6%; the programming languages involved are Java (e.g., (Hettige and K. 2015) or (Tatai et al. 2003)), C#, and C++ (e.g., (Huang et al. 2008)).

6.3% of the elaborated solutions' front ends use cross-platform frameworks, which allow the same code base to serve both web and smartphone app development. For example, (Thosani et al. 2020) use Ionic, and (Calvaresi et al. 2021a) offer, among the possible interfaces, HemerApp, which is written in Flutter.

3.1% of the systems use an Android application as front end (e.g., (Kökciyan et al. 2021)).

Finally, 28.1% of the studies do not mention the technologies used in their solutions or provide only simplistic, non-classifiable descriptions. For example, (de Bayser et al. 2018) focus primarily on the conception of the back end without mentioning how their human-interfacing system was implemented.

5.7 Strengths of the primary studies

Referring to question SRQ7, the strengths of the primary studies are listed in Table 7. Overall, 22% of the strengths are classified as Y (explicitly defined and evaluated), 21% as P (implicitly stated), and 57% as N (not inferable) (see Fig. 14). Figure 15 shows the classification per strength.

Table 7 Strengths of the primary studies
Fig. 14 Overview of strength assessment according to the YPN classification. In particular, Y = the information is explicitly defined/evaluated; P = the information is implicit/stated; N = the information is not inferable

Fig. 15 Qualitative assessment of the strengths (Y-P-N criteria). S1: dynamic update of knowledge base; S2: adaptability to different domains; S3: profiling (according to user behavior); S4: personalization (according to user input); S5: reusability of components; S6: scalability; S7: performance

5.8 Limitations and solutions of the primary studies

Referring to questions SRQ8 and SRQ9, the limitations stated in the studies and their proposed solutions were analyzed. Table 8 lists all limitations acknowledged by the authors and the proposed solutions. Only five of the ten papers pointing out limitations propose solutions to address them; as an unfortunate habit, limitations are often overlooked. Among those that do mention limitations, two main categories can be identified: architectural and functional. Architectural limitations are of a technical nature and can be solved by changing the applied architecture or technologies. An example is (de Bayser et al. 2017), which reports performance problems when raising the number of participants in a chat group; to solve this problem, the authors suggest switching to a micro-service architecture. Another example is (Calvaresi et al. 2019), which emphasizes several limitations of the system architecture, specifically scaling issues with more complex behaviors, a lack of standardized inter-agent communication, and no means of integrating third-party data-analysis tools. The solution to these limitations is an entirely new platform based on a MAS. Functional limitations are issues at the functional level that can usually be overcome by exploring alternative approaches to a problem. Examples are (Hettige and K. 2015) and (Jiang et al. 2015), both of which mention limitations related to semantic processing. (Hettige and K. 2015) propose updating the corresponding subsystem, while (Jiang et al. 2015) propose analyzing the user input with domain-independent analyzers (e.g., linguistic or keyword analysis).

Table 8 Study limitations and proposed solutions

5.9 Future challenges stated in the primary studies

Concerning SRQ10, given the heterogeneous perspectives of the primary studies, the stated future challenges are rather disparate. However, they can generally be divided into three categories:

  • System-related challenges relate to extending already existing functionalities.

  • Functionality-related challenges refer to new functionality to be implemented.

  • User-related challenges refer to collecting user experiences (usually in the form of trials).

The studies were analyzed for these three categories. Figure 16 shows the breakdown of the three categories across all studies. With 57.9%, most studies aim to enhance their current system’s stability or expand already implemented functionalities. For example, (Shashaj et al. 2019) see improving the system component stability and interoperability with other FIPAFootnote 25-compliant MAS environments as a future goal, whereas (Calvaresi et al. 2019) wish to adapt their architecture to allow distributed computing among several servers to increase performance and to handle agent migration from one server instance to another. A complete list of system-related challenges can be seen in Table 9. At 28.9%, about one-third of the studies endeavor to add new functionalities to their existing systems. (Vasconcelos et al. 2017) aim to implement additional metrics to test more aspects of a chatbot system, and (Memon et al. 2018) seek to expand their chatbot with a graphical user interface and to extend its user input capabilities with voice recognition and interpretation. All functionality-related future challenges are listed in Table 10. Finally, 13.2% of the future challenges focus on capturing user feedback. (Alencar and Netto 2014) seek to test their tutoring system with the help of students and to improve it based on the collected feedback, while (Kökciyan et al. 2021) are conducting two pilot studies with patients to test different aspects of their system. Table 11 lists all user-related challenges stated in the primary studies.

Fig. 16
figure 16

Distribution of future challenges per category

Table 9 Future challenges: system-related
Table 10 Future challenges: functionality-related
Table 11 Future challenges: user-related

6 Discussion

Analyzing the primary studies, it emerges that the adoption of the MAS paradigm has increased over the past twenty years, although only moderately. The elaborated works acknowledge the suitability and the intrinsic added value of agent-based systems, including autonomy, goal-setting, and behavior definition. Nevertheless, these technologies appear to be mostly at an early stage of development. On the one hand, the TRL of most primary studies did not exceed level 3 or 4 (as shown in Fig. 11), and it is questionable whether these early-stage systems would be capable of meeting the requirements of a real-world scenario. On the other hand, a few systems have been studied in real-world scenarios (i.e., Calvaresi et al. (2021a), testing the developed chatbot in a physical balance-preserving campaign, and Kökciyan et al. (2021), letting both experts and real users analyze the system). However, such systems still remain to be tested in fully operational environments.

Several studies focused on aspects revolving around the management and reconciliation of different knowledge bases. However, only one (Calvaresi et al. 2021a) has directly addressed the topic of data privacy and user consent. This remains a pressing concern that practitioners must address. Indeed, too many studies dealing with topics such as user profiling and the processing of user input to enhance chatbot knowledge have either ignored data privacy or not tackled it explicitly. Whenever people are involved, it is of paramount importance to ensure their control over their data. With the enforcement of stricter data privacy laws such as the GDPR, next-generation systems have no room left to neglect this topic.

The analysis of the technologies’ distribution within the primary studies reveals several trends. Figure 17a shows the back-end technologies used over the years. Java-based systems have been used extensively. However, since 2015, Python-based systems have emerged, likely due to Python’s prevalence in machine learning and data science. Moreover, since 2017, proprietary platforms (e.g., IBM Watson) have been increasingly considered. Although initially rather rudimentary, such platforms now offer a wide range of possibilities, such as integrating machine learning modules or extensive analytical capabilities. Figure 17b shows that a shift occurred in the area of front-end technologies too. In addition to the increasing prevalence of web-based solutions, messaging services such as Facebook Messenger or Telegram have become increasingly popular since 2015. Nevertheless, in recent years, the use of cross-platform frameworks has become a consistent practice. Cross-platform frameworks such as Ionic or Flutter make it possible to develop front-end solutions for mobile phones and web browsers from a single code base. Moreover, a trend can be observed toward more complex multi-agent chatbots (e.g., Bosse 2021) that blend the IoT and micro-service domains with highly scalable multi-agent chatbot networks.

Fig. 17
figure 17

MAS-based chatbot technologies over the years

Most studies have used MAS to let agents abstract individual components such as language processing or output composition. (Calvaresi et al. 2019) and (Calvaresi et al. 2021a) have taken a different approach by coupling the users themselves with personalized agents. According to these studies, the goal of this 1:1 relation is to facilitate user profiling, data management, privacy preservation, and personalization. Indeed, by interacting with the user, the respective agent is expected to increase its knowledge and enhance the accuracy of its personalization over time.
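The 1:1 user-agent coupling described above can be illustrated with a minimal sketch. The class names, the token-counting profile, and the registry are illustrative assumptions for this survey's discussion, not the implementation of the cited studies:

```python
class PersonalAgent:
    """Hypothetical per-user agent that accumulates a profile over time."""

    def __init__(self, user_id: str):
        self.user_id = user_id
        self.profile: dict[str, int] = {}  # knowledge gathered from past interactions

    def handle(self, message: str) -> str:
        # Record observed terms so personalization can improve with each exchange.
        for token in message.lower().split():
            self.profile[token] = self.profile.get(token, 0) + 1
        return f"[agent of {self.user_id}] ack: {message}"


class AgentRegistry:
    """Maintains the 1:1 mapping between users and their personal agents."""

    def __init__(self):
        self._agents: dict[str, PersonalAgent] = {}

    def agent_for(self, user_id: str) -> PersonalAgent:
        # Lazily create one dedicated agent per user; reuse it afterwards.
        if user_id not in self._agents:
            self._agents[user_id] = PersonalAgent(user_id)
        return self._agents[user_id]


registry = AgentRegistry()
agent = registry.agent_for("alice")
agent.handle("I like hiking")
agent.handle("hiking trails near me")
# The same agent instance serves the user across interactions,
# so its profile (here, term frequencies) grows over time.
```

Keeping the agent (and its profile) as the single custodian of a user's data is what, in those studies, is argued to ease privacy preservation and consent management.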

Looking at the evaluation of the strengths of the primary studies in Fig. 14, it is noticeable that S2 (i.e., adaptability to different domains) and S6 (i.e., scalability) have an above-average number of implicitly defined and evaluated strengths. In the case of S2, this is primarily because studies justified their system’s adaptability with the implementation of a single case study, concluding that the system can also be applied to other domains. This is not necessarily a wrong assumption, but implementing several distinct scenarios would have demonstrated this strength more convincingly. Compared to S2, S6 is a more generic strength. Since most of the studies are at an early prototype stage, even when a system’s scalability was reported as a strength, it was mostly not evaluated. This raises the question of which methods can be used to evaluate a chatbot platform’s scalability. All studies use the term scalability as a synonym for size scalability as defined by (Neuman 1994): a system scales easily with the number of users and resources without noticeable loss of performance. To evaluate this aspect, a load test with several simulated users, analyzing the system’s response times and hardware load, could theoretically be sufficient.
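As a rough illustration of such a size-scalability check, the following sketch simulates concurrent users against a stub chatbot and reports latency statistics. The endpoint stub, user counts, and reported metrics are hypothetical assumptions; a real test would target the deployed system over its actual interface and also record hardware load:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def query_chatbot(message: str) -> str:
    """Stub standing in for a real chatbot endpoint (e.g., an HTTP call)."""
    time.sleep(0.01)  # simulated processing latency
    return f"echo: {message}"


def load_test(num_users: int, requests_per_user: int) -> dict:
    """Simulate concurrent users and collect per-request response times."""

    def user_session(user_id: int) -> list[float]:
        times = []
        for i in range(requests_per_user):
            start = time.perf_counter()
            query_chatbot(f"user {user_id}, msg {i}")
            times.append(time.perf_counter() - start)
        return times

    with ThreadPoolExecutor(max_workers=num_users) as pool:
        sessions = list(pool.map(user_session, range(num_users)))
    latencies = [t for session in sessions for t in session]
    return {
        "requests": len(latencies),
        "mean_s": statistics.mean(latencies),
        "p95_s": statistics.quantiles(latencies, n=20)[-1],  # ~95th percentile
    }


# Size scalability (in Neuman's sense) holds if latency does not degrade
# disproportionately as the simulated user population grows.
baseline = load_test(num_users=2, requests_per_user=5)
scaled = load_test(num_users=20, requests_per_user=5)
print(baseline)
print(scaled)
```

Comparing the two latency distributions (rather than a single run) is what makes the result a statement about scalability and not just about raw performance.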

7 Conclusions

This paper has analyzed the current state of the art of chatbot solutions leveraging the multi-agent approach and agent-based frameworks by performing an SLR. In particular, it employs a well-established methodology characterized by ten structured research questions. The investigation focused on aspects including application domains, end-users, requirements, objectives, technology readiness level, designs, strengths, limitations, and future challenges of the solutions found in the literature. Such aspects have been analyzed per feature and then aggregated in a reconciling discussion. The insights elicited in this work can be beneficial for both theoretical and practical future research.