1 Introduction

Crowd-scale discussion platforms are poised to become the next generation of platforms for democratic citizen involvement. Such platforms require support functions that can integrate ideas, opinions, and arguments, discourage the publication of toxic content, and even achieve consensus (Malone and Klein 2007; Malone 2018a; Ito et al. 2019). An example of such platforms is the “COLLAGREE” system, which works jointly with human facilitators to promote crowd-scale online discussions (Sengoku et al. 2016; Ito et al. 2015, 2014; Ito 2018).

Despite their ability to promote citizen participation, human facilitators face cognitive challenges due to the possible scale of discussions and the complexity of the themes discussed (Kawase et al. 2018; Nishida et al. 2018, 2017). For instance, in the case of “COLLAGREE” discussions, some threads had over one thousand opinions that were posted simultaneously by the users of the system. In this paper, we propose to address such facilitation challenges by building an automated facilitation agent that can manage online discussions in a new crowd-scale discussion support system called “D-agree”. The automated facilitation agent extracts the structure of the discussion, analyzes it, and posts targeted messages to effectively facilitate the discussion.

To evaluate our system, we conducted small- and large-scale social experiments within the city of Nagoya (Japan) with the collaboration of the local municipal government. We initially posited the following three hypotheses:

Hypothesis 1

The agent can incentivize the participants to submit more postings and to diversify these postings.

Hypothesis 2

When the agent works collaboratively with the human facilitator, the overall performance of the facilitation increases.

Hypothesis 3

The satisfaction of the participants in the discussions facilitated by the agent is above the neutral midpoint of the rating scale. This means that the participants were at least not dissatisfied with the discussion facilitated by the agent.

The results of our experiments verify the above three hypotheses. Moreover, in the experiment with the collaboration of the municipal government of Nagoya, the collected insights were later analyzed and used to elaborate upon social decisions and policies.

The contribution of the paper is twofold. First, we propose an agent platform that can intelligently interact with humans and extract insights from their discussions. Second, the platform successfully guided humans in their discussions using facilitation mechanisms that were evaluated in real social experiments.

The paper is structured as follows. In Sect. 2, we cover the relevant literature on crowd-scale platforms and the underlying technologies. In Sect. 3, we present an outline of our system, including the automated facilitation agent. In Sect. 4, we review the large-scale experiment and its results. In Sect. 5, we cover the small-scale controlled experiment. Finally, we summarize our work and highlight the future directions.

2 Related Work

2.1 Online Platforms

Online platforms are becoming crucial for empowering citizens and implementing sustainability goals (Savaget et al. 2019). They can now collect opinions and even lead to advanced forms of social agreements (Malone 2018b; Malone and Klein 2007). For instance, the Climate CoLab system (Malone and Klein 2007) was used to integrate the collective intelligence of thousands of people worldwide to address climate change. The Deliberatorium (Iandoli et al. 2007) is another system where people submit ideas by following an argumentation map through which participants frame their ideas. The first difference between our system and the Deliberatorium is that our discussions are structured around issues, or critical questions, to be addressed based on the Issue-Based Information System (IBIS) (Conklin and Begeman 1989). The second difference is that participants in the Deliberatorium create their discussions according to a predefined argumentation map, while our system does not constrain the participants to use such a map. Instead, the system builds the argumentation structure automatically from their posts after classifying them into IBIS elements.

Another similar system that shares many aspects with our proposal is “COLLAGREE” (Sengoku et al. 2016; Ito et al. 2015, 2014; Ito 2018). COLLAGREE has been employed for large-scale social experiments in Japan. The system was used in collaboration with the local government to gather opinions from the public about next-generation planning. Its real social impact was that it succeeded in gathering opinions from younger people at a lower cost. The main difference between our current platform and COLLAGREE is that the latter used human facilitators. Other platforms for citizen participation rely on decision theory with insights from the social sciences. For instance, the work of Mkude et al. (2014) focused on participatory budgeting and the assessment of added public value. The Participatory On-Line Interactive System (POLIS) is another platform that allows multi-method, multi-stage, and semi-structured electronic public participation for citizens (Williams 2010). Our proposed system shares the same motivation with respect to the future of deliberative democracy and public sociology.

In practice, intelligent discussion platforms combine algorithmic and statistical techniques to harness the intelligence of the crowds. In our work, we focus on artificial agents because of their ability to adapt to complex human behavior, particularly in argumentative domains. We first address these argumentative domains using argumentation mining and then implement a facilitation agent that operates on the mined structures.

2.2 Argumentation Mining

A crucial component of our platform is the ability to manipulate argumentative text in online discussions. This task falls under argumentation mining, the research area most closely related to our study. Argumentation mining aims at identifying the structure of arguments in natural language texts. For instance, many studies in the field extract structures from essays (Stab and Gurevych 2014b; Nguyen and Litman 2016), reviews (Kim 2014), and legal texts (Palau and Moens 2009), in the same way as we propose for extracting structures from online discussions. These essays, reviews, and legal texts are represented according to different data models such as the Issue-Based Information System (IBIS) model (Conklin and Begeman 1989). Among the studies in the field, the subtasks of component classification and structure identification (Stab and Gurevych 2017) are particularly related to our subtasks of node and link extraction. The difference between classical argumentation mining and our approach is that we perform the mining in real time as people discuss and alter the mined text. In addition, our agent dynamically posts its facilitation messages so that the entire discussion grows according to the IBIS model. In our mining approach, we extract the IBIS nodes from the text and then add the links that connect these nodes in the original text. The links are crucial for obtaining the final IBIS hierarchy that represents the argumentative structure of the discussion as illustrated, for instance, in Fig. 7.

In our extraction results, the F scores for extracting issues exceeded 0.80, and the precision of identifying the links among the IBIS elements was around 0.88; these scores are higher than those reported for state-of-the-art argumentation mining. Our results depended heavily on manual annotation of over 38 discussion datasets: after carefully defining our annotation scheme, the annotations reached a Fleiss’ Kappa of around 0.66 (Yamaguchi et al. 2019). Here, Fleiss’ Kappa is a statistical quantity that measures inter-rater agreement for qualitative items. One of the main limitations of our method is that we focus on simplified discussion structures using the IBIS model. This model assumes that there are only four clearly classifiable components: issues, ideas, pros, and cons. This restriction is one reason we obtained higher accuracy. That being said, our major goal in this paper is not the classification of arguments but a clarification of the effect that a facilitation agent can have on online “argumentative” discussions. In the field of argumentation mining, more general components are often considered, such as major claims, minor claims, and premises (Stab and Gurevych 2014a). Another alternative to IBIS is to use coarse discourse acts and their richer set of argumentative types (Zhang et al. 2017). In the end, maintaining a limited set of argumentative utterances made the extraction more tractable and allowed the agent to interact in real time with the participants. In addition, it allowed us to create around 200 tractable facilitation rules that were carefully assembled after consultation with professional human facilitators. By combining these rules and the obtained IBIS structures, we could generate and use the facilitation messages in real time.
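To make the inter-rater agreement measure mentioned above concrete, the following minimal sketch shows how such a score can be computed with the statsmodels implementation of Fleiss’ Kappa. The annotation counts here are hypothetical placeholders, not our actual annotation data.

```python
import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

# Hypothetical annotation matrix: one row per annotated post, one column per
# IBIS category (issue, idea, pro, con). Each cell counts how many of the
# three annotators assigned that category to the post.
ratings = np.array([
    [3, 0, 0, 0],  # unanimous: issue
    [0, 2, 1, 0],  # majority: idea
    [0, 0, 3, 0],  # unanimous: pro
    [0, 1, 0, 2],  # majority: con
])

# Fleiss' Kappa: 1.0 means perfect agreement, 0.0 means chance-level agreement.
print(round(fleiss_kappa(ratings), 3))
```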

2.3 Chatbots

The key component of our platform is the facilitation agent and its ability to interact with humans in online discussions. Such an agent is also called a conversational agent, a chatbot, or a social bot, and it is defined as a computer program designed to converse using natural language (Almansor and Hussain 2020; Tavanapour and Bittner 2018; Tavanapour et al. 2019). Such agents can generally be classified into task-oriented and non-task-oriented agents (Chen et al. 2017; Yan et al. 2017). Task-oriented agents are designed for a particular task and are set up to have short conversations, usually within a closed domain such as online shopping, customer support, or medical expertise. Many techniques can be adopted to build this type of agent, such as parsing (Weizenbaum 1966), pattern matching (Wallace 2009), and, more recently, neural networks (Nuez Ezquerra 2018; Csaky 2019). The approach we adopt is rule-based and relies on deep learning for classification, which gives the agent the ability to respond to a given message in a way that facilitates argumentative discussions. That is, our facilitation agent can identify argumentative utterances, build the corresponding semantic structure, and post adequate facilitation messages based on this structure.

2.4 Evaluation Methodology of Crowd-Scale Systems

The evaluation of crowd-scale systems requires an appropriate methodology for assessing the usefulness of the system and its acceptance among the crowds. Examples of such methodologies include the Technology Acceptance Model (TAM) (Davis 1985; Dasgupta et al. 2002; Venkatesh and Davis 2000), user satisfaction (Zviran and Erlich 2003), usability evaluation (Lewis 2018), and so forth. Due to the large scale of our studies and the ill-defined nature of the discussions, we relied on a quantitative method that combines questionnaires, annotated data, and statistical analyses of the argumentative data generated from the discussions. We particularly looked at how many IBIS elements are generated in a discussion and how many of these are generated as a result of the facilitation messages. We then combined such measures with the satisfaction levels of the users (Joshi et al. 2015). To this end, we used questionnaires created by experts in social psychology as well as psychological measurement scales (Hori and Yamamoto 2001; Hori and Yoshida 2001; Hori and Matsui 2001; Hori et al. 2007, 2011). A detailed investigation based on TAM is left for future work.

Finally, we looked at the interaction among the types of replies, i.e., those from participants to other participants, from participants to the facilitator, and from the facilitator to participants.

3 The D-Agree Platform

The D-agree system is composed of our artificial agent and the Web platform that hosts the participants and allows them to exchange messages with each other and with the agent. An example of such an exchange is shown on the left side of Fig. 1, where the first person submits a question in the form “How can we solve congestion in Nagoya city?”. The automated facilitation agent identifies this post as an issue, labels it accordingly, and stores it in the database. The second person submits the post “How about introducing a traffic tax mechanism?”. Our agent identifies this as an idea corresponding to the issue submitted by the first person. This new post is labeled as an idea and stored in the database with a link to the corresponding issue. By following this process, the agent constructs a typed hierarchical structure of the discussion. Finally, given predefined facilitation rules, the agent posts a facilitation message such as “What are the merits of this idea?” whenever the number of pros for the idea under discussion is small.

For the extraction of the discussion structure, we adopted the Issue-Based Information System (IBIS) (Kunz and Rittel 1970), shown on the right side of Fig. 1. This choice is justified by the need to lead the discussions while allowing people to clarify issues and ideas and then debate their merits and demerits. The IBIS model can comprehensively distinguish such elements in general argumentative text (Lawrence and Reed 2017). Once the IBIS structure is automatically extracted, the facilitation agent posts facilitation messages in relation to the discussion to incentivize the users to cover more issues, ideas, pros, and cons. The resulting structured discussion is stored in the discussion database and can later be consulted as a reference in future discussions.

Fig. 1

D-agree: Web interface and automated facilitation agent

The system’s Web interface is shown in Fig. 2. The example is taken from an experiment conducted during an official governmental meeting in Afghanistan (Haqbeen et al. 2020). The features of the interface are described as follows.

  1. The phase of the discussion.
  2. Discussion topic posted by the moderator.
  3. Facilitation message posted by the human facilitator.
  4. Facilitation message of the agent.
  5. Ranking that includes user performance aspects such as the number of posted items and the activity-based points.
  6. Summary of agent activities such as classification, analysis, and visualization.
  7. The post form used to post discussion topics.
  8. The reply function used by users to post opinions.
  9. Search function used to refer to current and past discussions using keywords.
  10. Menu bar that includes account settings and a logout button.
  11. Discussion theme and media. Users can see the total number of discussants, posted items, discussion time, and live discussion videos.
  12. Ranking of the posted topics.
  13. Discussion points earned through participation.

Fig. 2

User interface of proposed system

3.1 Issue-Based Information System (IBIS)

The Issue-Based Information System (IBIS) is a practical model for structuring arguments in textual discourse (Noble and Rittel 1989). This is done by categorizing sentences into issues, positions, and arguments in a graphical manner. There have been previous attempts to use the IBIS model in the context of face-to-face meetings (Noble and Rittel 1989). Similarly, Conklin (2003) proposed a related approach called Dialog Mapping, where “idea” is used instead of “position” and arguments are split into “pros” and “cons.” Here, we use a similar formalism, as illustrated in the example of Fig. 3. The root node is typically the main question to be addressed by adding new ideas or arguments.

Fig. 3

Example of IBIS-based discussion structure
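To make the structure of Fig. 3 concrete, the sketch below shows one possible in-memory representation of an IBIS discussion tree. The class and field names are illustrative assumptions and are not taken from the D-agree implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

IBIS_TYPES = ("issue", "idea", "pro", "con")

@dataclass
class IBISNode:
    """A single post classified into one of the IBIS types."""
    node_id: str
    node_type: str           # one of IBIS_TYPES (or "n/a" for unclassified posts)
    text: str
    parent: Optional["IBISNode"] = None
    children: List["IBISNode"] = field(default_factory=list)

    def attach(self, child: "IBISNode") -> None:
        """Link a new post under this node, e.g. an idea under an issue."""
        child.parent = self
        self.children.append(child)

# Root issue with one idea and one supporting argument (pro).
root = IBISNode("n0", "issue", "How can we solve congestion in Nagoya city?")
idea = IBISNode("n1", "idea", "How about introducing a traffic tax mechanism?")
pro = IBISNode("n2", "pro", "It could reduce traffic during peak hours.")
root.attach(idea)
idea.attach(pro)
```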

3.2 Automated Facilitation Agent

We developed an automated facilitation agent software that performs the following tasks:

  1. Observing the textual content posted by the users,
  2. Extracting the argumentative utterances from the content,
  3. Generating facilitation messages according to predefined rules, and
  4. Posting the messages to the discussion board in response to other posts.

The agent has additional functions such as filtering inappropriate posts and visualizing the IBIS elements as a tree. The agent consists of two main modules:

  1. Observation and posting module, and
  2. Data extraction module.

The observation and posting module was implemented using Amazon CloudWatch (Wittig and Wittig 2018) and AWS Lambda functions to enable scalable observation and posting (Varia and Mathew 2014). Accordingly, the agent is activated when events happen within the discussion, such as the detection of certain utterances, or when it receives events from external triggers (CloudWatch). The posting function is activated when a particular clue is detected, which allows the agent to post a message based on predefined rules. For instance, if three posts are added to the discussion and the last post is an issue, then the agent could post a message that asks the user to elaborate on the issue or propose a solution.
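As an illustration of this rule-based triggering, the following minimal sketch mimics the behavior described above (intervene after every three new posts, and ask for elaboration when the latest post is an issue). The function name, threshold, and rule wording are hypothetical and do not reproduce the deployed Lambda code.

```python
from typing import List, Optional

POST_THRESHOLD = 3  # number of new posts observed before the agent intervenes

def facilitation_trigger(new_posts: List[dict]) -> Optional[str]:
    """Called periodically (e.g. by a scheduled cloud trigger) with the posts
    that arrived since the last check. Returns a facilitation message or None."""
    if len(new_posts) < POST_THRESHOLD:
        return None  # not enough activity yet; keep observing

    last_type = new_posts[-1]["ibis_type"]  # type assigned by the classifier
    if last_type == "issue":
        # Ask the participants to elaborate on the issue or propose a solution.
        return "That is an important issue. What can we do to solve it?"
    if last_type == "idea":
        return "What are the merits of this idea?"
    # Fallback for pros, cons, or unclassified posts.
    return "Does anyone have another perspective on this point?"

# Example: three new posts, the last one classified as an issue.
posts = [
    {"ibis_type": "idea", "text": "Introduce a traffic tax."},
    {"ibis_type": "pro", "text": "It could fund public transport."},
    {"ibis_type": "issue", "text": "How do we keep the tax fair?"},
]
print(facilitation_trigger(posts))
```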

To detect the types of the posts, the agent needs to classify the text according to the IBIS types. To this end, we implemented the data extraction module using a Bidirectional Long Short-Term Memory (BiLSTM) classifier (Suzuki et al. 2019; Lample et al. 2016). The module captures the sentences and their IBIS word constituents (issues, ideas, pros, and cons). It then identifies the links that connect these nodes within the textual data. Finally, the module adds these relationships to the IBIS data model of the agent. Our proposed extraction method relies on previous works in argumentation mining (Suzuki et al. 2019; Lawrence and Reed 2017; Stab and Gurevych 2017, 2014a, b) while remaining better suited to the IBIS data types. Such types include an issue component, which is absent from conventional argumentation structures. Indeed, most of the literature on argumentation mining focuses on claims and premises and thus lacks issue components in its structures (Cabrio and Villata 2018). Issues are critical for ill-defined discussions and wicked problems (Churchman 1967). To overcome this limitation, we adopt a mapping in which claims decompose into issues and ideas, while premises correspond to arguments (pros and cons). Adopting this mapping in the IBIS model provides a richer data model for argumentative discussions. More details on our implementation of the extraction method can be found in a previous study (Suzuki et al. 2019).
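The following Keras sketch shows the general shape of a BiLSTM sentence classifier of the kind referenced above. The layer sizes, vocabulary, and label set are placeholders and do not correspond to the tuned model of Suzuki et al. (2019).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 5      # issue, idea, pro, con, n/a (placeholder label set)
VOCAB_SIZE = 20000   # placeholder vocabulary size
MAX_LEN = 100        # maximum number of tokens per post

def build_bilstm_classifier() -> tf.keras.Model:
    """Sequence classifier: token ids -> IBIS type probabilities."""
    model = models.Sequential([
        layers.Input(shape=(MAX_LEN,), dtype="int32"),
        layers.Embedding(VOCAB_SIZE, 128, mask_zero=True),
        layers.Bidirectional(layers.LSTM(64)),   # reads the post in both directions
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_bilstm_classifier()
model.summary()
```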

The generation of the facilitation messages is controlled by two parameters: a period of 1 minute specific to Amazon CloudWatch (Wittig and Wittig 2018) and a threshold of 3 messages. This threshold sets the number of messages that the agent should count before taking part in the discussion. That is, the agent waits for this many messages before extracting the node type of the last message and selecting an adequate reply. The messages are selected based on rules that map each IBIS type to a randomly chosen sentence. For example, a reply to an idea could be “That is a good perspective. Anybody else agree with your idea?” or “You are absolutely right. Anyone else support {user}’s idea?”. The variable “{user}” is the name of the participant to whom the agent is replying (Hadfi et al. 2020).
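A minimal sketch of how such rules might map an IBIS type to a randomly chosen template, including the “{user}” placeholder, is given below. The template texts for ideas are taken from the examples above, while the rule table itself and the function name are hypothetical.

```python
import random

# Hypothetical rule table: each IBIS type maps to a pool of reply templates.
TEMPLATES = {
    "idea": [
        "That is a good perspective. Anybody else agree with your idea?",
        "You are absolutely right. Anyone else support {user}'s idea?",
    ],
    "issue": [
        "That is an important question. What can we do to solve it?",
    ],
}

def select_facilitation_message(ibis_type: str, user: str) -> str:
    """Pick a random template for the detected IBIS type and fill in the user name."""
    pool = TEMPLATES.get(ibis_type, ["Could you tell us more about this point?"])
    return random.choice(pool).format(user=user)

print(select_facilitation_message("idea", "Alice"))
```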

3.3 Architecture of System

Figure 4 illustrates the architecture of our system and its user interface. The system operates on Amazon Web Services (AWS) to manage the scale of the discussions (Varia and Mathew 2014). Here, the discussions are conducted in Japanese or English. The Web server component manages discussion boards and all of the data stored in the database. Users can access our system using Web browsers or iPhone and Android applications. The red boxed area in Fig. 4 shows the automated facilitation agent and its constituents.

Fig. 4

System Architecture and User Interface

4 Large-Scale Societal Experiment

4.1 Setting

The objective of this experiment was to gather opinions on next-generation planning in the city of Nagoya (Japan). The resulting comprehensive plan will be the basis for the city’s administrative decisions over the next few decades. The D-agree system was used for this task and allowed Nagoya citizens to discuss five themes about the future of their city. As a result, the system recorded 15,199 page views and was visited by 798 participants, with 157 registered participants submitting 432 opinions. These discussions were also held in the context of more than 10 face-to-face meetings with the city’s administrative staff. In a typical town meeting, more than 100 people gathered, and each person had an opportunity to provide opinions to the city administrators during the two-hour session. People who attend such town meetings are generally senior citizens, since the meetings are held during the daytime. In contrast, our online platform attracted younger people at a lower participation cost.

The experiment was conducted from November 1 to December 7, 2018. The campaign was advertised on Google Ads, on the homepage of the Nagoya municipal government, in the town meeting announcements of the Nagoya municipal government, and on various social media (Facebook, Twitter, Line, etc.).

The plan has five main themes in total.

Theme 1: Human rights and diversity.
Theme 2: Secure childcare.
Theme 3: Disaster prevention.
Theme 4: City environment.
Theme 5: Attractiveness to industry and the world.

Themes 1 and 2 were facilitated by expert human facilitators. In particular, for theme 1, the facilitators used their own facilitation methodologies, while for theme 2, their facilitation was based on the IBIS model. Themes 3 and 4 were facilitated by automated facilitation agents only. Theme 5 was facilitated through the cooperation of humans and agents, and here human facilitators used IBIS.

The choice of themes and the differences between them are paramount to conducting significant evaluations of the system’s output. In our case, theme differences could in fact give rise to distinct distributions of ideas, issues, and arguments, depending on the initial questions and the populations. Here, we were not mainly focusing on comparing the discussions and the resulting IBIS data, since they revolve around completely different themes. For example, some topics could naturally lead to more questions (for example, unresolved social problems), while other topics might lead to more ideas (for example, well-understood topics). Our main goal was to globally assess the behaviors of the agent facilitator, human facilitator, and participants within their discussions.

We established two phases with the goal of summarizing the discussed ideas. The first phase consisted of 30 days of discussion, and the second phase of 7 days. In the first phase, people discussed issues using the D-agree system. In the second phase, administrative staff members summarized the discussions into several concrete ideas on which the citizens voted.

In the first phase, we launched the D-agree system on the internet. Anyone could register with the platform and post comments on the discussion threads. To register, users provided their email address, nickname, gender, and home region (town level). We did not gather actual names or exact home addresses. The collected information was carefully secured by the administrative staff to protect the privacy of the participants, who knew each other only by their registered nicknames.

4.2 Results of Large-scale Experiment

As mentioned in the Introduction section, we proposed to study the following three hypotheses:

Hypothesis 1

The agent can incentivize the participants to submit more postings and to diversify these postings.

Hypothesis 2 

When the agent works collaboratively with the human facilitator, the overall performance of the facilitation increases.

Hypothesis 3

The satisfaction of the participants in the discussions facilitated by the agent is above the neutral midpoint of the rating scale. This means that the participants were at least not dissatisfied with the discussion facilitated by the agent.

4.2.1 Example

Figure 5 shows an example of a discussion where the facilitation agent responds to citizens’ posts in Japanese. Here, Issue 1 was raised by a participant, and the automated facilitation agent identified this post as an issue. The agent then asked “What can we do to solve it?”, and a participant posted Idea 1. The facilitation agent identified this post as an idea and raised an issue to deepen the idea. A participant then posted Idea 2. This exchange illustrates successful facilitation. However, there are cases where the agent misidentifies the IBIS nodes and links. This is due to the non-determinism of the classification method and the possibility that it encounters elements that do not fall within the IBIS taxonomy. For instance, if a participant submits a generic text that is neither an idea, issue, pro, nor con, then the agent cannot identify the post correctly. Our solution to this problem is to use generic facilitation messages that do not alter the ongoing discourse. This situation is common given the complexity of the linguistic domains encountered in online discussions, and our proposed method is a practical way to keep the discussions within the limits of argumentative discourse. Another solution to this limitation would be to generate messages only when the classification is deterministic, or to adopt an extended discussion model that accounts for more node types (Zhang et al. 2017).
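The fallback behavior described above can be sketched as follows; the confidence threshold, function name, and message wording are illustrative assumptions, not the deployed configuration.

```python
GENERIC_MESSAGES = [
    "Thank you for your input. Could you tell us a bit more?",
    "Interesting point. How does it relate to the main question?",
]

CONFIDENCE_THRESHOLD = 0.6  # hypothetical cut-off for trusting the classifier

def choose_reply(ibis_type: str, confidence: float, typed_reply: str) -> str:
    """Use the type-specific reply only when the classification is reliable;
    otherwise fall back to a generic message that does not alter the discourse."""
    if ibis_type in ("issue", "idea", "pro", "con") and confidence >= CONFIDENCE_THRESHOLD:
        return typed_reply
    return GENERIC_MESSAGES[0]

# A low-confidence, off-taxonomy post triggers the generic fallback.
print(choose_reply("n/a", 0.35, "What are the merits of this idea?"))
```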

Fig. 5

Example of automated facilitation from discussions among citizens of Nagoya City

4.2.2 Number of Postings

We started by looking at the total number of posts resulting from the experiments, as illustrated in Table 1. The analysis was done using Student’s t-test with \(N>700\) in a setting similar to that of an earlier work (Woolley et al. 2010). The “Human” rows show the number of postings by human facilitators, and the “Agent” rows show the number of postings by the automated facilitation agent. The column “Participants” shows the number of postings by the participants. Themes 3, 4, and 5 were facilitated by the automated facilitation agent, and they obtained more posts than the themes facilitated by human facilitators only. This implies that the automated facilitation agent incentivized participants to submit more postings. This result supports our Hypothesis 1.
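As a hedged illustration of this kind of comparison, the snippet below runs an independent two-sample t-test on two hypothetical arrays of per-participant posting counts; the numbers are randomly generated placeholders, not the experimental data of Table 1.

```python
import numpy as np
from scipy import stats

# Hypothetical per-participant posting counts under two facilitation conditions.
rng = np.random.default_rng(0)
human_facilitated = rng.poisson(lam=2.0, size=350)
agent_facilitated = rng.poisson(lam=2.6, size=350)

# Welch's t-test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(agent_facilitated, human_facilitated, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```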

In Theme 5, the agent and humans worked together and thus generated more postings from the participants. This is a complementary result on the interaction between human and agent facilitators: humans can understand complex linguistic content, while an agent can work with large amounts of textual content over long periods of time. This allows human facilitators to interact more with the participants and deepen the discussion. Note that the number of posts alone is not sufficient to assess the quality of a discussion and the performance of the crowd (Hong et al. 2016). We therefore also look at the satisfaction levels of the users as well as the types of generated argumentative content.

Table 1 Number of postings depending on the themes and the facilitator

4.2.3 User Satisfaction

Fig. 6

User satisfaction scores for the five different themes. The scores answer the question “Are you satisfied with the discussion on city planning?” for each of the five themes.

We used questionnaires to evaluate the satisfaction of the system’s users (Joshi et al. 2015). Typical questions were of the form “Are you satisfied with the discussion of the city plan?”. The participants selected their level of satisfaction from strongly satisfied (5), satisfied (4), neutral (3), dissatisfied (2), and strongly dissatisfied (1). As illustrated in Fig. 6, the satisfaction scores were nearly the same, ranging from 3.2 to 3.7, across all themes. This suggests that users experienced satisfying discussions even when they were managed by the automated facilitation agent (Agent). We also note that the collaboration between the automated facilitation agent and the humans (Agent and Human) achieved the highest satisfaction score. We believe this is due to a complementarity effect in which the agent responds efficiently to multiple users’ postings while humans can post well-thought-out comments. These results support Hypotheses 2 and 3.

4.2.4 Diversity of IBIS Nodes

We analyzed the discussion data and labeled all of the postings in the discussions based on their IBIS structure. Table 2 shows the number of IBIS nodes obtained in each discussion theme. Here, issues, ideas, pros, and cons are the IBIS node types. “Facilitation” represents the number of facilitation messages produced by the agent. The label N/A refers to unclassified nodes. As explained above, the human facilitators for Theme 1 did not follow the IBIS structure, while the human facilitators of Theme 2 did. The human facilitators of Theme 1 clearly posted more than the facilitators of the other themes; without clear semantics to guide the facilitation, it was difficult to achieve adequate facilitation. This suggests that a facilitation structure would reduce the number of facilitation messages needed. The collaboration between the automated facilitation agent and the human facilitators (Theme 5) can be viewed as working well because it obtained 101 ideas and 47 pros, which is large compared with the results for the other themes. The diversification of the IBIS nodes and the successful combination of human and agent facilitation support our Hypotheses 1 and 2.

Table 2 Number of IBIS nodes

4.2.5 Agent Performance in Terms of Post-Type Generation

As a general measure of the performance of the automated facilitation agent, we investigated how many nodes were generated from a single facilitation message. This is computed as the ratio of the number of nodes to the number of facilitation messages. The results are illustrated in Table 3. We can see that the automated facilitation agent increased the number of postings. The performance of the human facilitator in Theme 1 is clearly lower than that of the facilitators in the other themes. The performances of our agent for Theme 3 and Theme 4 are 41.0% and 50.0%, respectively, which are competitive with the performance of 40.2% for the human facilitator who followed the IBIS structure in Theme 2. Accordingly, the performance of our agent is at the same level as that of a human facilitator who follows IBIS. Additionally, in the case of collaborative facilitation by a human and an agent in Theme 5, the performances in eliciting ideas and pros are 280.6% and 130.6%, respectively, which are dramatically better than in the other cases. From the viewpoint of facilitation performance, collaborative facilitation between humans and agents worked well. The performance of the agent when alone and when coupled with humans supports Hypotheses 1 and 2, respectively.
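For clarity, this performance measure can be reproduced in a few lines; the counts below are placeholders, not the values behind Table 3.

```python
def facilitation_performance(node_count: int, facilitation_messages: int) -> float:
    """Nodes elicited per facilitation message, expressed as a percentage."""
    return 100.0 * node_count / facilitation_messages

# Hypothetical counts: 30 nodes elicited by 75 facilitation messages.
print(f"{facilitation_performance(30, 75):.1f}%")  # -> 40.0%
```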

Table 3 Facilitation performance in terms of node-type count (%)

4.2.6 Qualitative Analysis of the Discussions

We qualitatively analyzed the discussions with the goal of identifying the most common IBIS topologies. Figure 7 illustrates the discussion trees labeled manually with the IBIS types. In each tree, the boxes colored blue, orange, green, purple, and gray are issues, ideas, pros, cons, and N/A nodes, respectively. The tree obtained for Theme 5 is the largest, widest, and deepest, which suggests that collaboration between humans and agents helps crowd discussions. On the other hand, we cannot see a big difference among Themes 1 to 4. This suggests that our automated facilitation agent worked in the same way as the human facilitators. These results support Hypotheses 1 and 2.

Fig. 7

Obtained Discussion Trees

5 Small-Scale Controlled Experiment

In order to verify the direct effect of automated facilitation on online discussion, we conducted small-scale controlled experiments. The discussion theme was set to “city development.” The subjects were randomly selected graduate students from 22 Japanese universities. We randomly separated them into two isolated groups. The discussion durations were set to 45 and 60 minutes. We chose six themes for discussion, and each of the two groups conducted six online discussions on these themes.

Human and agent facilitators participated in the discussions, and both conducted their facilitation according to the IBIS model. We informed the subjects that a facilitator was involved in the discussions, but we did not announce the facilitator type (human or agent). In the discussion experiment, the human facilitators led the discussion based on the IBIS structure, and they were aware of the presence of the agent facilitator.

In the experiment, the subjects were gathered in one venue, as shown in Fig. 8. We prohibited spoken communication between subjects so that the online discussions would not be affected. Subjects conducted anonymous discussions using arbitrarily assigned nicknames so that opinions could be expressed freely. Table 4 shows an outline of the discussion experiment.

Fig. 8

The small-scale controlled experiments

Table 4 Outline of Discussion Experiment

The timing and frequency of the automated facilitation are critical for the comfort of the participants. The automated facilitation agent was set to reply once every five participant posts in discussions B and C, and once every three participant posts in discussions F, G, J, and K. In addition, the facilitation agent retrieved the new postings from the discussion board once every minute. This particular setting was based on our trial experiments and the agent configuration.

5.1 Results

In order to identify the differences between the facilitation agent and the human facilitator, we analyzed the number of responses, the number of nodes in the IBIS data, the number of replies for each node, and the questionnaire data. The obtained nodes were labelled, classified, and validated by six researchers. An outline of the experiment is shown in Table 4. The label “FA X” means that the facilitator posted a message targeting a node of type X, requesting the participants to address that node. For example, “FA Issue” is a facilitator message that targets an issue, and “FA Response” is a facilitation message that requests an opinion. “Issue (FA)” is an issue submitted by a facilitator. The general discussion theme was treated as “Issue (FA),” which is the top level of the IBIS discussion structure.

5.1.1 Number of Posts and Interval Time

In order to clarify the differences between human and agent facilitation, we summarized the number of posts by participants within certain interval times. Table 5 shows the case where the facilitator is a human, and Table 6 shows the case where the facilitator is an artificial agent.

In Tables 5 and 6, “Posting (FA)” refers to the number of postings by the facilitator and “Posting (PA)” refers to the number of postings by the participants. “Avg. Interval (PA\(\rightarrow \)FA)” refers to the time in seconds between the last participant posting and the posting of the facilitator. From Table 5, the average number of posts by the human facilitators was 23.5, while Table 6 shows that the average number of posts made by the automated facilitation agent was 33. A t-test showed a significant difference at the 1% level: the automated facilitation agent replied about 1.4 times more often than the human facilitator in this experiment. As the number of participants in a discussion increases, the number of posts will increase too, so this difference is likely to widen. This result is expected because the agent is automated. However, it is not easy to post facilitation messages at an optimally chosen timing. Our agent analyzes all of the postings in order to submit facilitation messages at a meaningful timing while trying to maintain user satisfaction at the same level as that of human facilitation. In contrast, human facilitators cannot post facilitation messages frequently, because a human facilitator cannot perceive all of the ongoing sub-threads and has limited time to post elaborate facilitation messages.

Table 5 shows that the average response interval of the human facilitators was 377.5 seconds, while Table 6 shows that the average response interval of the automated facilitation agent was 57.5 seconds. There is a significant difference at the 1% level by a t-test. The automated facilitation agent replied at roughly one-sixth of the interval of the human facilitator in this experiment. Of course, this result is due to the agent facilitator having a quicker reaction time than the human facilitator. From these results, we can say that our agent can analyze more postings, and by analyzing more postings, it can post adequate facilitation messages, which provides substantial support for our hypotheses.

Table 5 Number of postings by human facilitator
Table 6 Number of postings by facilitation agent

5.1.2 Number of Nodes

Here, we summarize the number of nodes obtained at each layer of the discussion threads. A layer refers to the depth in the IBIS tree starting from the root node of the discussion, or root issue. Table 7 illustrates the number of nodes at each level. When a ratio test was performed on the number of nodes for each node type among the participants, there was no significant difference at the 5% level. Thus, both the human facilitator and the automated facilitation agent could incentivize the participants to post at a similar rate. In addition, when looking at the ratio for each node type, there was no significant difference at the 5% level. This result indicates that the distribution of node types posted by the participants was similar under human and agent facilitation. This result supports our Hypothesis 1.

Table 7 Number of posts by facilitator

In the following, we look at the interactions between the types of posts by the participants and facilitators.

5.1.3 Types of Participant Replies given the Types of Facilitator Posts

We are particularly interested in whether the facilitation agent receives more ideas after requesting posts from the participants. Therefore, we looked at the types of replies for each posting type by the facilitators, as illustrated in Table 8. For instance, the replies from participants to the human facilitator’s “FA Response” messages comprise 0 issues, 8 ideas, 8 pros, 6 cons, and 3 N/A, whereas the replies to the agent’s “FA Response” messages comprise 0 issues, 22 ideas, 1 pro, 2 cons, and 2 N/A. Here, “FA Response” means that the facilitator is requesting a response from participants. Hence, the facilitation agent brought out more ideas and fewer pros and cons compared to the human facilitator. There was a significant difference for “FA Response” at the 5% level by a chi-square test. This shows that the agent can incentivize the participants to post more messages and to diversify their types. This result also supports our Hypothesis 1.
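As a hedged illustration of this kind of test, the snippet below runs a chi-square test of independence on a small contingency table of reply types; the counts are placeholders and are not the values reported in Table 8.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table of participant reply types to "FA Response":
# rows = facilitator condition (human, agent), columns = (idea, pro, con, n/a).
table = np.array([
    [10, 9, 7, 4],   # replies under human facilitation (placeholder counts)
    [21, 3, 2, 3],   # replies under agent facilitation (placeholder counts)
])

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```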

Table 8 Participants’ reply types for each facilitator node type

5.1.4 Types of Facilitator Replies given the Types of Participant Posts

Similarly, the following analysis found that the human and the agent facilitator elicited replies at the same ratio. Table 9 shows the number of the facilitator’s replying node types for each posting node type by participants. For instance, there is 1 issue node in reply to the “FA Response” nodes in the case of the human facilitator. Again, “FA Response” means that the facilitator is requesting a response from participants. There was no significant difference at the 5% level for any posting node type, which means that the human and the agent facilitator elicited replies from the participants at the same ratio. The agent and the humans could thus incentivize the participants to submit posts at the same ratio, which again supports our Hypothesis 1.

Table 9 Facilitator reply type for each participant node type

5.1.5 Types of Participant Replies given the Types of Participant Posts

Here, the automated facilitation agent obtained more ideas in response to the issues raised by the participants. Table 10 shows the number of participants’ replying node types for each posting node type by participants. In Table 10, we can see, for example, that there are 34 issue nodes in reply to the idea nodes in the case of the human facilitator. In the case of the facilitation agent, we obtained three times as many idea nodes (15) as in the case of human facilitation (5). There was a significant difference at the 5% level, using a chi-square test, in the cases of idea and N/A. Consequently, the considerable increase in the number of proposed ideas supports our Hypothesis 1.

Table 10 Number of participant reply types for each participant posting type

5.1.6 Measuring Facilitation Effect by Questionnaires

In order to understand the differences in the psychological effects of facilitation, we conducted a questionnaire evaluation using a five-level Likert scale for the discussions and facilitators (Joshi et al. 2015). A questionnaire was sent to all participants after the discussions. We received 62 answers about the agent’s facilitation and 63 answers about the human’s facilitation. The averages of the answers were calculated. Figure 9 shows the evaluation value for the discussion, and Fig. 10 shows the evaluation value for the facilitator. The content of the questions, the rating scale, and the t-test results are also shown in the figures. Here, the Likert scale is used to precisely assess the level of satisfaction of the users with respect to agent facilitation and human facilitation, each evaluated separately. The users rate each method using the same Likert scale. The rating is also used to compare user satisfaction between the two facilitation methods as shown in Fig. 9.

Fig. 9

Evaluation of Discussion by Questionnaire (each N=63, t-test *: \(p<0.05\))

Fig. 10

Evaluation of the facilitation using questionnaires (N=63, t-test* \(p < 0.05\))

In Fig. 9, the evaluation of the criterion “Good discussion method?” gave 3.68 in both cases. The evaluations of “Easiness of posting?” are 4.27 for the case of human facilitation and 4.18 for the agent. The evaluations of “Did you get new knowledge?” are 4.24 for the human case and 4.27 for the agent case. The results for these questions do not differ significantly at the 5% level.

The evaluations for “Facilitation frequency?” shown in Fig. 10 are 2.51 for the human case and 3.24 for the agent case. This suggests that the agent adopted an appropriate frequency of facilitation, whereas the human facilitation frequency was rated lower. In other words, participants were comfortable with the frequency of facilitation even though our agent responded more often than the human facilitator. The ease of posting criterion gave 3.98 for the case of human facilitation and 3.45 for the agent; the subjects found it easier to post when the human facilitator was present. Concerning the need for a facilitator in the discussion, the human facilitator obtained 4.22 and the facilitation agent obtained 3.61. This difference could be caused by the short duration (45 and 60 minutes) of the discussions and the inability to see the effect of the agent over prolonged periods of time, as in the case of the large-scale social experiment. That is, the effect of humans in brief discussions is stronger than that of artificial agents, due in part to the linguistic abilities of humans compared with the limited, rule-based abilities of the artificial agent. Overall, however, the scores are higher than 3, which suggests that the participants were satisfied with the facilitators. This result supports our Hypothesis 3.

6 Conclusion

We presented our implementation of a crowd-scale discussion support system based on an automated facilitation agent that can extract argumentative content, analyze it, and post facilitation messages. We performed large-scale experiments within the city of Nagoya in the context of local governmental meetings. We analyzed several aspects of the experiments, including the number of postings, the satisfaction of the participants, the performance of the facilitation agent, and the structure of the discussion trees. We found that the automated facilitation agent worked in a manner similar to human facilitators. Moreover, we found that the results were best when human and agent facilitators worked together. In addition to the large-scale experiment, we conducted small-scale controlled experiments to verify the direct effects of the automated facilitation.

We verified the following hypotheses through the large-scale and small-scale experiments:

Hypothesis 1

The agent can incentivize the participants to submit more postings and to diversify these postings.

Hypothesis 2

When the agent works collaboratively with the human facilitator, the overall performance of the facilitation increases.

Hypothesis 3

The satisfaction of the participants in the discussions facilitated by the agent is above the neutral midpoint of the rating scale. This means that the participants were at least not dissatisfied with the discussion facilitated by the agent.

We consequently obtained the following conclusions.

  • The agent posted 1.4 times more replies than the human facilitator and had response intervals about six times shorter.

  • There was no difference in the ratio of the number of nodes in the IBIS structure created by the discussion. This means that our automated agent could facilitate discussions as effectively as a human facilitator in terms of eliciting IBIS elements from the discussion content.

  • When replying to ideas, the agent elicited about three times more issues than the humans did.

  • There was no difference in the ratio of the number of reply nodes between automated facilitation and human facilitation.

  • The use of the agent produced more ideas for any given issue.

  • From the results of the questionnaires and the frequency of the facilitation, we found that the frequency of the agent facilitation suited the participants’ reflection and comfort time, despite the fact that the agent responded more often than the human facilitator, as shown in Fig. 10. The timing of the automated facilitation could be optimized in future experiments to account for cognitive differences in the participant pool or the difficulty of the discussed topics.

As a future direction, we would like to investigate the extent to which the agent can polarize a discussion, thus leading to ethical issues. Although our current agent is implemented to generate fair facilitation, such considerations need to be scrutinized thoroughly within controlled experiments. To this end, we would like to investigate how much bias people can accept in a given discussion. We are currently investigating such questions in similar social experiments (Haqbeen et al. 2020; Hadfi et al. 2020, 2021). As a second direction, we would like to investigate how the platform could be used to automatically foster inclusion by focusing on subsets of participants such as social minorities. Finally, we would like to devise mechanisms that allow the agent to keep track of the effects of its messages and adaptively adjust its facilitation policies.