Taxonomy Development Procedure
This paper develops a taxonomy of design elements for chatbots based on scientific literature and empirical data in order to provide a systematic representation of existing scientific knowledge on chatbot design and to develop a deeper understanding of the degree to which domain-specific chatbots integrate conceptually grounded characteristics in practice. Our taxonomy therefore not only provides a structure to differentiate domain-specific chatbots according to archetypal qualities, but also reflects the extent of their current technological development and allows gaps between research and practice to be identified.
To develop our taxonomy, we followed the seven-step framework of Nickerson et al. (2013). The first step begins with the determination of a meta-characteristic, which embodies a superordinate and abstract description of the taxonomy’s focus (Nickerson et al. 2013). We defined the meta-characteristic as the design elements for domain-specific chatbots. For the purpose of this analysis, the term “design elements” refers to the distinctive technical, situational and knowledge features that frame the structure of chatbots and act as delimiting factors of the extent to which domain-specific chatbots can maintain a human-like interactive communication process with awareness for and understanding of the discussed topic. The second step consists of determining the objective and subjective ending conditions that define when the iterative development process can be considered complete. To this end, we adopted all the objective and subjective ending conditions (see Table A.5 in the Appendix, available online via https://doi.org/10.1007/s12599-020-00644-1) suggested by Nickerson et al. (2013, p. 344). In the third step, the process provides the possibility to combine conceptual knowledge and empirical findings either through an empirical-to-conceptual or a conceptual-to-empirical path (Nickerson et al. 2013), which can be applied alternately until all ending conditions are met. For the development of our taxonomy for design elements of chatbots, we adopted a conceptual-to-empirical path as a starting point. Hence, in the fourth step, we used a deductive concept modeling approach based on prior research to abstract a preliminary conceptual taxonomic structure, which we subsequently refined in the fifth step through an iterative analysis of existing domain-specific chatbots. After conducting five iterations, we obtained a taxonomic structure. Subsequently, in the sixth step, we evaluated the taxonomic structure using three focus group discussions.
Below we provide a description of the procedure executed in each individual iteration.
Iteration 1
In the first iteration, we conceptualized an initial collection of dimensions and characteristics through deductive reasoning and extraction, using a set of English-language, peer-reviewed scientific articles published in high-quality academic journals or conference proceedings belonging to the field of information systems (IS). These articles were identified by means of an explorative literature review. We selected the electronic databases EBSCOhost Business Source Premier, AISeL, ScienceDirect and ACM, which cover relevant literature in both IS and computer science.
To consider the various terms used to describe chatbots, we first performed an explorative search to identify relevant keywords. This explorative search formed the basis for the search string (“chatbot*” OR “conversational agent*” OR “dialog system*” OR “computer user communication*” OR “conversational robot*”), which we applied to titles and abstracts. The search yielded a total of 1076 hits across the four databases, which we reduced to 72 articles after excluding literature that contains our search string but is unrelated to chatbot design. Through a full-text review, we further discarded articles that do not match our conception of “design elements” or provide elemental classification frameworks, narrowing our initial set to 24 relevant scientific articles. This set was extended by a backward and forward reference search that identified four additional scientific articles from the areas of computer science and software engineering (i.e., Mittal et al. 2016; Saravanan et al. 2017; Wei et al. 2018) and language technology (i.e., McTear 2016). This procedure yielded a final sample of 28 articles (see Table A.1 in online appendix) that deal concretely with specific technical, situational and knowledge structure features of chatbots.
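The search string and the screening funnel described above can be restated as a short Python sketch. The keywords and counts are taken from the text; the function and variable names are illustrative assumptions, and the exact query syntax accepted by each database may differ.

```python
# Keywords as reported in the search string above.
KEYWORDS = [
    "chatbot*",
    "conversational agent*",
    "dialog system*",
    "computer user communication*",
    "conversational robot*",
]


def build_search_string(keywords):
    """Quote each keyword, join with OR, and wrap the result in parentheses."""
    return "(" + " OR ".join(f'"{keyword}"' for keyword in keywords) + ")"


# Screening funnel with the counts reported in the text.
hits = 1076                      # title and abstract search, four databases
after_relevance_screening = 72   # unrelated articles excluded
after_full_text_review = 24      # articles matching the notion of design elements
added_by_reference_search = 4    # backward and forward search
final_sample = after_full_text_review + added_by_reference_search

print(build_search_string(KEYWORDS))
print(final_sample)
```

The final line reproduces the sample of 28 articles as the sum of the 24 retained articles and the four added through the reference search.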
Consistent with Nickerson et al. (2013), we applied a deductive development approach to derive an initial set of conceptually grounded dimensions and characteristics in line with our meta-characteristic from the identified scientific literature on chatbot design. The taxonomy was developed in such a way that all characteristics of a dimension are mutually exclusive. This means that, for any given chatbot, only one characteristic can be true within one dimension. A description of each characteristic from the final taxonomy can be found in Table A.2 of the online appendix.
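The exclusivity requirement can be illustrated with a minimal Python sketch that checks whether a chatbot classification assigns one characteristic per dimension. This is not the procedure used by the authors. The dimension service provider integration and its characteristics are those introduced in Iteration 4; the characteristics shown for additional human support, the function name, and the additional requirement that every dimension receives a value are assumptions made for illustration.

```python
from collections import Counter

# Illustrative excerpt of a taxonomy: dimension -> allowed characteristics.
TAXONOMY = {
    "service provider integration": {"none", "single integration", "multiple integration"},
    "additional human support": {"yes", "no"},  # assumed characteristics
}


def find_violations(taxonomy, assignments):
    """Return a list of problems; an empty list means the classification is valid.

    `assignments` is a list of (dimension, characteristic) pairs for one chatbot.
    """
    problems = []
    counts = Counter(dimension for dimension, _ in assignments)
    # Each dimension must be assigned exactly one characteristic.
    for dimension in taxonomy:
        if counts[dimension] != 1:
            problems.append(
                f"{dimension}: expected 1 characteristic, found {counts[dimension]}"
            )
    # Every assigned value must exist in the taxonomy.
    for dimension, characteristic in assignments:
        if dimension not in taxonomy:
            problems.append(f"{dimension}: unknown dimension")
        elif characteristic not in taxonomy[dimension]:
            problems.append(f"{dimension}: unknown characteristic '{characteristic}'")
    return problems


valid = [
    ("service provider integration", "none"),
    ("additional human support", "yes"),
]
ambiguous = valid + [("service provider integration", "single integration")]

print(find_violations(TAXONOMY, valid))
print(find_violations(TAXONOMY, ambiguous))
```

A classification that assigns two characteristics to the same dimension, as in `ambiguous`, is reported as a violation, which mirrors the rule that only one characteristic can hold within a dimension.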
In line with our working definition of “design elements”, the dimensions were allocated to three overarching perspectives: (i) intelligence (knowledge structure features), (ii) interaction (technical features), and (iii) context (situational features) to facilitate the comprehension of the taxonomy. The adoption of these overarching perspectives is in line with the primary aim for the development of chatbots which is to emulate the process of human communication using AI. Here, the perspectives of intelligence, interaction and context are envisioned as natural attributes of the human communication process. As described by Littlejohn and Foss (2010, p. 8), human communication is “[…] the primary process by which human life is experienced; communication constitutes reality. How we communicate about our experience [Intelligence] helps to shape that experience. The many types of experience are the result of many forms of communication [Interaction]. Our meanings change from one group to another, from one setting to another, and from one time period to another because communication itself is dynamic across situations [Context]”. In this respect, the notions of interaction and intelligence are two common levels of abstraction that have been widely used in the IS scientific literature to describe the structural characteristics of chatbots (see Maedche et al. 2016; Knote et al. 2019; Stoeckli et al. 2019). On the other hand, the notion of context has been commonly used to frame the extension of the mediated environment (i.e., general-purpose and domain-specific, see Gnewuch et al. 2017; Diederich et al. 2019) in which the chatbot is used and hence has an influence on the chatbot construction (Knote et al. 2018).
Iteration 2
In this iteration we chose to follow an empirical approach to substantiate our conceptual taxonomic structure (T1) (Nickerson et al. 2013). We distributed the empirical investigation of all chatbots among the authors. To determine the characteristics of a sample of real-world chatbots, we used the definitions provided in Table A.2 of the online appendix and jointly determined selection criteria for non-self-explanatory dimensions. The chatbots were classified primarily through targeted interaction with each chatbot and, where necessary, by consulting available videos and reports. To this end, we classified an initial sample of 12 chatbot interfaces (see Table A.3 in online appendix) within the taxonomic structure (T1). This sample was composed of the most popular chatbots in the areas of communication, cryptocurrency, analytics and education according to the ranking provided by the third-party database BotList.co (2019).
Within this iteration, we removed all dimensions that were important from a conceptual point of view but could not be empirically determined from the outside by testing a chatbot, as detailed in Fig. A.1 in the online appendix. These include, e.g., type of artificial intelligent system (AIS), memory, and sequentiality of process structure. After reviewing the aforementioned chatbots, we systematically readjusted our conceptual taxonomy by (i) removing the characteristics that were not empirically observable in any of the analyzed objects; (ii) merging redundant characteristics (i.e., conversational chatbots and interactive chatbots); (iii) separating characteristics that proved to have individual descriptive power (i.e., the compound characteristic daily life and family was divided into the individual characteristics daily life and family); and (iv) adding the new characteristics identified during the examination (i.e., utility in the dimension motivation for chatbot use). In addition to these adjustments, we merged the dimensions of personality processing and sentiment detection because of their overlapping nature, and added to the taxonomy a new empirically observed dimension named additional human support to reflect the interactive design of those chatbots that connect the digital and physical world by integrating human support into their collection of interactive capabilities.
Iteration 3
To obtain a sample composed of chatbots from different application domains and platforms, we searched for a database that would allow us to include chatbots from multiple domains. Accordingly, we analyzed five different chatbot databases (botlist.co, chatbottle.co, chatbots.org, 50bots.com, botfinder.io). The most suitable database for our purposes turned out to be chatbots.org, given that it allows a total of 1194 chatbots to be filtered according to 27 application domains. This feature enabled us to view 10% of the chatbots from each area (chatbots.org, 2019). In this iteration we categorized a collection of 66 chatbots (see Table A.3 in online appendix), comprising ten percent of the chatbots listed on the third-party database chatbots.org (2019) within the areas finance and legal (n = 15), social (n = 11), home and living (n = 5), body health (n = 5), government (n = 5), education (n = 5), electronics and hardware (n = 4), career and education (n = 3), cooking (n = 3), children (n = 2), environmental (n = 2), fashion (n = 2), sport (n = 2), culture (n = 1) and beauty (n = 1).
During this iteration, we merged the characteristics crowd setting and two or more humans of the dimension number of participants due to their overlapping nature. Additionally, we identified the feature only rule-based knowledge as an additional descriptive characteristic of the intelligence quotient dimension; likewise, the characteristics advice and customer support were added to the dimension motivation for chatbot use to enhance its descriptive power.
Iteration 4
Subsequently, we analyzed 13 additional chatbots from the areas of telecommunication and utilities (n = 6), mobility (n = 2), mental and spirituality (n = 2), news and gossip (n = 2), and leisure (n = 1) from the database Chatbots.org (2019). In this iteration, we added a new dimension named service provider integration, consisting of the characteristics none, single integration and multiple integration, to describe the capacity of different chatbots to integrate supplementary services.
Iteration 5
As the ending conditions were not fulfilled in the previous iteration due to the addition of one dimension, we carried out a further empirical iteration. In this iteration, we integrated into the taxonomy an additional subset of 12 chatbot interfaces from the areas of travel (n = 5), TV, visual entertainment, creation and gaming (n = 4) and trade (n = 3), likewise indexed in the Chatbots.org (2019) database. As a result of this iteration, in the dimension motivation for chatbot use, we changed the name of the characteristic work support to work and career. Likewise, to enhance the explanatory power of the taxonomy, we also changed the name of the characteristic multiple to text understanding plus further elements in the intelligence quotient dimension.
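The sample sizes reported for the empirical iterations can be cross-checked with a short tally. The following Python sketch uses only the counts stated in the text; the variable names are illustrative, and the overall total is derived here by summation rather than quoted from the text.

```python
# Iteration 2: BotList.co sample, reported without a per-domain breakdown.
ITERATION_2_TOTAL = 12

# Iterations 3 to 5: per-domain counts from chatbots.org as reported in the text.
ITERATION_3 = {
    "finance and legal": 15, "social": 11, "home and living": 5,
    "body health": 5, "government": 5, "education": 5,
    "electronics and hardware": 4, "career and education": 3, "cooking": 3,
    "children": 2, "environmental": 2, "fashion": 2, "sport": 2,
    "culture": 1, "beauty": 1,
}
ITERATION_4 = {
    "telecommunication and utilities": 6, "mobility": 2,
    "mental and spirituality": 2, "news and gossip": 2, "leisure": 1,
}
ITERATION_5 = {
    "travel": 5, "TV, visual entertainment, creation and gaming": 4, "trade": 3,
}

totals = {
    2: ITERATION_2_TOTAL,
    3: sum(ITERATION_3.values()),
    4: sum(ITERATION_4.values()),
    5: sum(ITERATION_5.values()),
}
overall = sum(totals.values())

for iteration, count in totals.items():
    print(f"Iteration {iteration}: {count} chatbots")
print(f"Overall: {overall} chatbots")
```

The per-domain counts sum to the reported 66, 13 and 12 chatbots for iterations 3, 4 and 5, giving 103 classified chatbots across the four empirical iterations.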
Evaluation
To evaluate the taxonomy, we considered and answered three questions: “who”, “what” and “how” within the framework for taxonomy evaluation by Szopinski et al. (2019). With regard to the subject of evaluation (the “who”), we chose individuals who had no previous contact with the development of the taxonomy. For the evaluation of the taxonomy in terms of both method and content, we involved three sets of participants in three separate focus group discussions: practitioners with domain knowledge about chatbots, academics with methodological knowledge about taxonomy development, and academics with chatbot domain knowledge. This heterogeneity is intended to avoid inconsistencies and to ensure broad applicability and usefulness for academic and practical purposes. With regard to the object of evaluation (the “what”), we determined “the design of a chatbot” as the real-world problem to be investigated. Focus group discussions were chosen as the method of evaluation (the “how”) because they allow the taxonomy to be analyzed jointly and new thoughts and ideas to be discussed.
As mentioned above, we conducted three focus group discussions, each of which began with a presentation of the taxonomy and the distribution of a handout containing the taxonomy and all definitions. Each participant then received a worksheet and was asked, as a first step, to note individually which perspectives, dimensions and characteristics should be deleted, added, merged, relocated or modified in wording, together with the rationale for each proposed change. This was followed by a discussion on the fulfillment of the subjective ending conditions (see Table A.5 in online appendix) and the criteria of comprehensiveness, understandability, wording, and extendibility for the individual dimensions and characteristics explored by Szopinski et al. (2019).
Group 1 consisted of five participants with an academic background, all with methodological knowledge and two with chatbot domain knowledge. As a result of the discussion, which lasted 40 min, the characteristic text understanding and further abilities and the dimension intelligence quotient were renamed. The dimension socio-emotional behavior was particularly discussed, since emotional intelligence is currently gaining importance. This dimension was assigned to the intelligence perspective. The descriptions of the dimensions and characteristics were seen as appropriate and understandable.
Group 2 consisted of three participants with doctoral and post-doctoral backgrounds, one with strong methodological taxonomy knowledge and two with knowledge about the introduction of chatbots within the context of a research project on the development of a digital assistant for e-learning. Within this discussion, which lasted 105 min, we debated the results of the first group and placed a special emphasis on the evaluation of the definitions of dimensions and characteristics. The results were to rename dimension D5 to service integration, to rewrite the corresponding definition, to rename D14 to relation duration and to rename C15,1 to e-customer service. Furthermore, it was suggested to change the order of the characteristics at D4, D11, D13 and D14.
The third focus group discussion was held in an industrial company with four participants, each with previous experience in the development and implementation of domain-specific chatbots. The discussion lasted 75 min and was aimed at evaluating the taxonomy in terms of its comprehensibility for practitioners as well as the potential applicability and usefulness of the taxonomy in practice. Participants reported that the use of the taxonomy would provide great added value before and during the development of chatbots. They saw it as a helpful overview that can be used as a template for guiding the fundamental questions that every chatbot development team should ask itself before starting the process of chatbot design, such as whether a chatbot should be embodied or disembodied, whether socio-emotional behavior should be incorporated into the chatbot architecture, or what role a chatbot should play within the intended interaction with users. Furthermore, the taxonomy was considered to provide a useful synthesis of design elements that is independent of chatbot design providers and industries. Participants also stated that it would be helpful not only to classify their own chatbots in the taxonomy, but also to use this classification to analyze the chatbots of competitors in a structured way, which in turn would serve as a basis for decision-making.
Since no more dimensions or characteristics were merged, split, added or eliminated during the focus group discussion with Group 3, the ending conditions were fulfilled as shown in Table A.5 of the online appendix; consequently, the taxonomy development process ended after six iterations. The taxonomy development over iterations is shown in Fig. A.1 in the online appendix.
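The stopping rule applied here, namely that no dimensions or characteristics were merged, split, added or eliminated in the last round, can be expressed as a simple check over a change log. The Python sketch below is a simplified illustration of this single condition only; it does not reproduce the full set of objective and subjective ending conditions in Table A.5, and the change-log format and function names are assumptions.

```python
# Change types that alter the structure of the taxonomy.
# Renames and reordering are treated as non-structural (an assumption
# based on the wording of the stopping rule above).
STRUCTURAL_CHANGES = {"add", "remove", "merge", "split"}


def structural_changes(change_log):
    """Return the (kind, target) entries that alter the taxonomy's structure."""
    return [(kind, target) for kind, target in change_log if kind in STRUCTURAL_CHANGES]


def may_stop(change_log):
    """True if the round produced no structural change."""
    return not structural_changes(change_log)


# Iteration 4 added a dimension, so another iteration was required.
iteration_4 = [("add", "service provider integration")]

# A round containing only renames, like those reported for Iteration 5.
renames_only = [("rename", "work support -> work and career")]

print(may_stop(iteration_4))
print(may_stop(renames_only))
```

Under this simplified rule, the addition of service provider integration in Iteration 4 blocks termination, whereas a round that only renames elements does not.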