1 Introduction

Communication is a complex activity that can be affected by numerous factors, including emotions. Two people may perceive very different meanings and attitudes in the choice of words and/or body language during a communication act. Furthermore, the same participant in a communication activity could interpret two different meanings from a message when she/he is in two different emotional states, i.e., two different moods. This phenomenon is amplified when communication is computer-mediated. In particular, the overwhelming use of online social networks (OSNs) has changed the perception of communication among individuals. All of us experience every day how computer-mediated communication (CMC) through OSNs is sometimes more difficult than face-to-face (F2F) communication. In the latter setting, nonverbal cues carry a great part of the communication; see, e.g., Knapp et al. (2013). In CMC, however, it has been shown that emotions are neither absent nor particularly difficult to communicate (see Derks et al. 2008). Along this direction, Eligio et al. (2012) analysed the impact of moods on human activities and affirmed that individuals participating in computer-mediated collaboration sessions benefit from receiving information that helps them understand one another. In other words, awareness is improved.

The contribution of this paper is the definition and preliminary evaluation of a system devoted to increasing the awareness of users involved in a CMC scenario to help them make informed decisions: the virtual counselling system (VCS). The objectives of the VCS are twofold: (1) to improve awareness of the emotional situation in a CMC scenario and (2) to recommend actions to users. A key concept to achieve these objectives is that of emotional signatures of an individual and a group that we frame in the cognitive framework of situation awareness (SA).

The term counselling usually refers to professional guidance using psychological methods and techniques and to the development of online or web counselling systems devoted to addressing issues such as mental health diseases (Kato et al. 2011; Shiono et al. 2009) or general online psychotherapeutic interventions (Palaniappan and Jun-E 2006). In this paper, we borrow the term counselling and focus its application domain on any interactive process, such as CMC. The definition we consider is that of the European Association for Counselling (EAC). This definition emphasizes the role of the counsellor in relation to the ability to address issues that require decision support and increased situational awareness and to help individuals in critical and conflict situations.

The cognitive framework underlying the work proposed in this paper is that of SA, developed through the formal setting of fuzzy set theory. Endsley (1995) defines SA as “the perception of elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future”. SA is a cognitive construct devoted to supporting humans and agents in making informed decisions. SA helps to interpret and understand information in the context of a larger concept called situation, which is an abstract state of affairs related to specific applications. The situations that the VCS must identify are related to the individual and group emotional dynamics in a CMC scenario. Fuzzy set theory (Zadeh 1996) offers us a formal tool for modeling and reasoning about such situations. The advantage offered by this machinery is that it allows us to consider the fuzziness of emotional dynamics and moods, which do not always appear to be classifiable in a precise way.

As done for the term “counselling”, we clarify what we refer to by emotional signature. According to Champion (2016), the concept of emotional signature is a development of the earlier concept of emotional scheme (Greenberg and Paivio 2003), which is defined as “a set of organizing principles constructed from the individual’s innate response repertoire and past experience that interact with the current situation and generate current experience”. The emotional signature of an individual is therefore constructed by considering his past behaviours in relation to particular situations, and in general, the study of emotional signatures aims at understanding dysfunctional behaviours. We take this concept of emotional signature and generalize it for use in CMC processes. The concept of emotional signature is of interest to us because it captures a snapshot of the behaviour of the individual in different situations. For example, it can help us understand that an individual in situations of anger has nevertheless had positive interactions with other individuals. The emotional signature, to some extent, provides information about how others perceive an individual. An emotional signature helps individuals increase their SA, and by aggregating the emotional signatures of individuals, a measure of the group SA can be obtained.

Even if the VCS model is generic enough to apply to different contexts, in the paper, we limit ourselves to the identification of emotional dynamics based on the phenomenon of emotional contagion. This is a phenomenon in which user emotions and related moods activate or influence, directly or indirectly, similar emotions and moods in other people. To some extent, this effect can extend beyond the limits of individuals and characterize a group or team (Barsade 2002; Kelly and Barsade 2001). The reason that led us to start with the study of these dynamics is the possibility of influencing, through the advice and recommendations of the VCS, the group emotional dynamics towards the achievement of empathy that can be advantageous in different application sectors. There is, in fact, a relation between empathy and emotional contagion, with the latter being the inner level of an empathetic model according to the Russian Dolls model of empathy (De Waal 2007). Thus, according to this model, establishing emotional contagion among individuals involved in communication is required to achieve empathy.

The paper is organized as follows. Section 2 reports background information, analyses related works and compares these works with our proposal. Section 3 describes the VCS model. Section 4 presents the architecture of the VCS and describes the current state of the implementation. Section 5 describes the experimentation devoted to validating the capabilities of our prototype to recognize emotional situations and recommend corrective actions. The section also discusses the results achieved. Section 6 concludes the paper and presents future works. A separate Appendix in Sect. 7 reports an illustrative example.

2 Background and related works

This section reports background information on theories of emotion, some related works on recognition of emotional dynamics and, finally, background information on the concept of the fuzzy user signature developed in Yager and Reformat (2012) that is used to derive the emotional signature.

2.1 Theories of emotion

There are two main schools in the theory of emotions. The first, which includes authors such as Russell and Barrett (1999), hypothesizes that emotions can be classified on a dimensional scale. For example, Russell (1980) proposes a circumplex model with a two-dimensional polar space, whose axes are valence (the level of pleasure) and arousal (the level of activation). The second models emotions in a discrete way. The best known case is Plutchik’s Wheel of Emotions (Plutchik 2001), which places emotions on petals arranged around a circle, where the distance between two petals measures similarity: adjacent emotions are more similar than distant ones. This model allows the combination of primary emotions to create secondary or even tertiary emotions (TenHouten 2006). In this paper, we use both the circumplex model (to map discrete moods on the positive, neutral and negative elements of a circumplex model) and discrete emotions from Plutchik. Specifically, the outputs of a wearable device devoted to detecting moods are mapped on the circumplex model. Then, an emotional signature correlates these moods with emotions derived from text analysis. This process is described in Sect. 4.1.

In addition to these two main lines, other theories of emotion are emerging. Two of these that leverage situational awareness are of interest for our work. The first is appraisal theory, which hypothesizes that emotions result from people’s interpretations and explanations of their circumstances even in the absence of physiological arousal (Aronson et al. 2010). In the absence of arousal, according to this theory, an individual decides how to feel about a situation after having interpreted and explained the phenomena, in a way that may resemble the three levels of an SA model. The second is emotional contagion (Hatfield et al. 1993), which states that one person’s emotions and related behaviours directly trigger similar emotions and behaviours in other people. In Cheshin et al. (2011), this theory is examined in the context of textual cues and to assess the effects of emotion in virtual teams. This objective is close to ours to the extent that the VCS intends to recommend actions to improve emotional dynamics in groups with the aim of supporting the creation of empathetic groups.

2.2 Recognition of emotional dynamics in conversation

In this subsection, we report a brief overview of some recent works that focus on the recognition of emotional dynamics (such as shifts and variations) in conversations and share some commonalities with our approach. We avoid comparing our work with the numerous research results that focus on the detection of emotions from text, as this is not our aim, and it is not our research objective to find new methods and algorithms for this purpose.

A recent research work that pursues a direction similar to ours is Ghosal et al. (2020a), where the authors leverage common-sense knowledge to develop a framework that incorporates different elements of common sense such as mental states, events, and causal relations. The goal is to learn interactions between interlocutors participating in a conversation to predict emotion shifts and understand differences between closely related emotions. The same research group, in another recent work, Ghosal et al. (2020b), explores the role of context and employs perturbations to distort the context of a given utterance to study its impact. These works share numerous aspects with ours. First, the use of common-sense knowledge provides, to some extent, a description of the emotional situation, as it allows us to understand elements such as intention and reaction as well as emotion. Furthermore, the modeling of the situation that can be provided with this approach is, to a certain extent, even more descriptive than the one we use in the VCS, which is based on fuzzy logic and the concept of the fuzzy emotional signature. On the other hand, the approach we propose is computationally more tractable, as it requires only relatively simple similarity measures for reasoning purposes. Another added value of our approach is the possibility of using the models of the signatures to carry out actions such as searching for users with similar or dissimilar emotional states.

Another interesting work that aims at identifying shifts in emotional patterns is Servi and Elson (2014). Their objective is to identify emotional levels, detect influence and forecast emotions as part of a larger process whose goal is to explain the meaning of shifts. For this purpose, Servi and Elson (2014) propose a mathematical model based on the concept of breakpoints. The forecasting is rule based. In this case too, a common aspect is the aim of providing meaning to the variation of emotional dynamics. The VCS does this through a situational analysis and is not based on a mathematical model. The reasoning supported by the VCS is of an approximate type and, in fact, relies on three-way decision techniques that aim to provide rapid decision making with reduced cognitive effort.

As discussed, the concept of the emotional signature appears to be a distinctive aspect of our work. A study that uses a similar concept is Mokryn et al. (2020), where an emotional signature is defined to characterize movie-evoked emotions via reviews. In that case, the emotional signature of a film characterizes the emotions that it tends to evoke in viewers. Once derived, these signatures are leveraged in affective recommender systems and affective multimedia retrieval. We use the concept of (fuzzy) signatures to characterize a user in a conversation, and in our case, the signatures serve to support informed decision making with the VCS. Beyond the differences in the formal settings adopted to model signatures, we see the contribution of the signature in the wider context of the recognition and comprehension of emotional situations. The contextualization of this concept in the SA decision-making process differentiates our use of emotional signatures.

2.3 Fuzzy signature

The fuzzy user signature was proposed by Yager and Reformat (2012) to analyse and process user tagging activities on the social web. The fuzzy user signature is a fuzzy representation of the activities of a user. Specifically, following the nomenclature and formalism proposed in Yager and Reformat (2012), the user signature of Eq. (1) is a fuzzy relation between two fuzzy sets: a set representing resource attractiveness and a set representing tag popularity.

$$\begin{aligned} UserSignature(r,t) = ResAttract(r) \times TagPop(t) \end{aligned}$$
(1)

To clarify how a signature can be constructed, let us consider Fig. 1. The fuzzy set ResAttract(r) can be constructed by considering the last row of Fig. 1 as follows:

$$\begin{aligned} ResAttract(r) = \bigg \lbrace \frac{b_{1}}{r_{1}}, \frac{b_{2}}{r_{2}}, \ldots , \frac{b_{n}}{r_{n}} \bigg \rbrace \end{aligned}$$
(2)

with

$$\begin{aligned} b_{i} = \frac{\# \; of \; tags \; used \; for \; r_{i}}{max \; \# \; of \; different \; tags \; used \; for \; a \; single \; resource} \end{aligned}$$
(3)

where \(i = 1, 2, \ldots , n\) and n equals the total number of resources.

In a similar way, the fuzzy set TagPop(t) is defined as:

$$\begin{aligned} TagPop(t) = \bigg \lbrace \frac{a_{1}}{t_{1}}, \frac{a_{2}}{t_{2}}, \ldots , \frac{a_{m}}{t_{m}} \bigg \rbrace \end{aligned}$$
(4)

with

$$\begin{aligned} a_{j} = \frac{\# \; of \; times \; t_{j} \; is \; used}{max \; \# \; of \; resources \; tagged \; with \; a \; single \; tag} \end{aligned}$$
(5)

where \(j = 1, 2, \ldots , m\) and m equals the total number of tags.

Fig. 1

Matrix of activities for user \(u_{k}\) (from Yager and Reformat (2012))

These two fuzzy sets provide information about the activities of a single user, specifically the degrees of interest in resources and the degrees of usage of tags. To consider both aspects and characterize a user, the concept of the user signature is defined in Eq. (1). For a single resource \(r_{i}\) and a single tag \(t_{j}\), the value of the relation is:

$$\begin{aligned} UserSignature(r_{i},t_{j}) = min \lbrace ResAttract(r_{i}), TagPop(t_{j}) \rbrace \end{aligned}$$
(6)

For the example shown in Fig. 1, the values of the relations are reported in Fig. 2.

Fig. 2

Fuzzy signature for user \(u_{k}\) (from Yager and Reformat (2012))

In Yager and Reformat (2012), the concept of fuzzy user signature is used to determine a level of like-mindedness between users using a similarity measure, a signature of a group of users using aggregation operators and a level of like-mindedness between a single user and a group of users. Interested readers can refer to Yager and Reformat (2012) for further details.
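As an illustration, the construction of Eqs. (2)–(6) can be sketched in a few lines of Python. The tagging matrix below is hypothetical and stands in for the matrix of Fig. 1; this is a minimal sketch, not the implementation of Yager and Reformat (2012).

```python
# Hypothetical tagging matrix for a single user: rows are tags, columns are
# resources, and each cell counts how often the user applied that tag.
tag_matrix = {
    "t1": {"r1": 1, "r2": 0, "r3": 2},
    "t2": {"r1": 1, "r2": 1, "r3": 0},
}
resources = ["r1", "r2", "r3"]

# Eq. (3): number of different tags used for each resource, normalized by the
# maximum number of different tags used for a single resource.
tags_per_res = {r: sum(1 for row in tag_matrix.values() if row[r] > 0)
                for r in resources}
max_tags = max(tags_per_res.values())
ResAttract = {r: tags_per_res[r] / max_tags for r in resources}  # Eq. (2)

# Eq. (5): number of resources tagged with each tag, normalized by the maximum
# number of resources tagged with a single tag.
res_per_tag = {t: sum(1 for r in resources if row[r] > 0)
               for t, row in tag_matrix.items()}
max_res = max(res_per_tag.values())
TagPop = {t: res_per_tag[t] / max_res for t in tag_matrix}  # Eq. (4)

# Eq. (6): the signature is the fuzzy Cartesian product (min of memberships).
UserSignature = {(r, t): min(ResAttract[r], TagPop[t])
                 for r in resources for t in tag_matrix}
```

With this toy matrix, resource r1 receives both tags and obtains full attractiveness, while r2 and r3 obtain degree 0.5, so the signature entry for (r2, t2) is min(0.5, 1.0) = 0.5.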

3 The virtual counselling system

The VCS is a system able to determine the emotional signatures of the users involved in a CMC scenario and, on the basis of these signatures, to recognize emotional situations and to recommend actions to the users to pursue an objective. The starting point for the VCS is the ability to detect the moods and emotions of users involved in a conversation. As reported in Beedie et al. (2005), the distinction between mood and emotion is “clouded, in part, because an emotion and a mood may feel very much the same from the perspective of an individual experiencing either”. Duration seems to be a differentiator in the sense that moods are considered more persistent than emotions, which are instead highly volatile. For our objectives, moods are discrete elements representing emotional states that affect the experience and behaviour of a person (Scherer 2000), and emotions are elements defined according to discrete models such as Plutchik’s discrete emotion model (Plutchik 2001). The technologies we use to detect moods and emotions will be explained in Sect. 4. For now, consider a conversation between two users A and B.

For each message sent by a user, we collect the mood that is detected while sending the message and the emotion that is extracted from the written text message. The VCS uses these moods and emotions to build emotional signatures. The initialization of the VCS occurs during the first communication session [0, T]. In this period, the VCS collects the information necessary to characterize individuals A, B and a group (A and B) from an emotional point of view. We will focus on dyadic interaction behaviours by analysing relationships between pairs of participants. In Fig. 3, we considered discrete time sessions and depicted individual emotional signatures with emojis.

Fig. 3

Emotional signature and situation

The VCS, once the emotional signatures of individuals have been determined, classifies a situation with respect to an objective or a strategy that must be pursued. As we will see later, we experiment with a tripartite classification method of situations. Moreover, the objectives and actions that the VCS takes after identifying the situations are described with an expert-based method based on goal-directed task analysis (GDTA) (Endsley 2016), which is a requirement elicitation method allowing the identification of goals and critical decision-making tasks as well as the information requirements needed to make decisions.

With reference to Fig. 3, at the end of the first communication session (T), the VCS classifies the situation as safe with respect to its objectives and perceives a positive emotional contagion between the two users, since they present similar positive emotions. The VCS can decide not to intervene in the communication, suggesting that B send a message to A. After a second communication session (2T), the VCS detects a change in the emotional signatures of A and B and in the situation. It can then decide to suspend communication between the users, who now appear very dissimilar, since A is less happy and B is angrier, in the context of an unsafe situation. The VCS can thus suggest that user A have a conversation with a user C whose emotional signature is more similar to A’s.

To perform these tasks, the VCS has to be able to build individual and group signatures, perform some operations on signatures and, finally, classify a situation to decide actions according to the GDTA. The following subsections describe how the VCS works to execute the above mentioned operations. An illustrative example that shows how the VCS derives emotional signatures, comprehends situations and makes decisions is reported in the Appendix.

3.1 Emotional signatures

An emotional signature, ES, provides a snapshot of a user’s emotional state during a communication session. This signature consists of \(\langle mood, emotion \rangle\) pairs that characterize a user. To determine the emotional signature of a user, we use the concept of the fuzzy user signature defined by Yager and Reformat (2012) and briefly described in Sect. 2.3.

We consider two users, A and B, involved in a conversation and a time interval, \([t, t + \delta ]\). \(T_{A \rightarrow B} \subseteq T_{A}\) is the set of messages that A has sent to B during the time period under consideration. \(M_{A \rightarrow B} \subseteq M_{A}\) and \(E_{A \rightarrow B} \subseteq E_{A}\) are the sets of moods and emotions associated with the messages that A has sent to B. We define two fuzzy sets:

$$\begin{aligned} {\tilde{E}}_{A \rightarrow B}(e) = \bigg \lbrace \frac{x_{1}}{e_{1}}, \frac{x_{2}}{e_{2}}, \ldots , \frac{x_{k}}{e_{k}} \bigg \rbrace \end{aligned}$$
(7)
$$\begin{aligned} {\tilde{M}}_{A \rightarrow B}(m) = \bigg \lbrace \frac{y_{1}}{m_{1}}, \frac{y_{2}}{m_{2}}, \ldots , \frac{y_{l}}{m_{l}} \bigg \rbrace \end{aligned}$$
(8)

where:

$$\begin{aligned} x_{i} = \frac{\# \, times \, e_{i} \, is \, detected \, in \, messages \, that \, A \, sends \, to \, B}{|T_{A \rightarrow B}|} \end{aligned}$$
(9)

for \(i = 1, \ldots , k\) is the membership degree of the element \(e_{i}\), and

$$\begin{aligned} y_{j} = \frac{\# \, times \, m_{j} \, is \, detected \, in \, messages \, that \, A \, sends \, to \, B}{|T_{A \rightarrow B}|} \end{aligned}$$
(10)

for \(j=1, \ldots , l\) is the membership degree of the element \(m_{j}\), and \(|T_{A \rightarrow B}|\) is the cardinality of the set \(T_{A \rightarrow B}\). This cardinality value is the total number of messages that A sent to B.

We define the emotional signature of user A when interacting with B as:

$$\begin{aligned} ES_{A \rightarrow B}(e,m) = {\tilde{E}}_{A \rightarrow B}(e) \times {\tilde{M}}_{A \rightarrow B}(m) \end{aligned}$$
(11)

where the value of the relation for a pair \(\langle e_{i}, m_{j} \rangle\) is the minimum of the two membership degrees. The adoption of the minimum in Eq. (11) comes from the fact that the signature is a binary relation between two fuzzy sets: given two fuzzy sets, A and B, their Cartesian product is the set of all pairs from A and B, each with the minimum of the associated memberships.

Tables 1 and 2 help us understand how we can derive an emotional signature. Table 1 shows the moods and emotions that belong to the subsets \(M_{A \rightarrow B}\) and \(E_{A \rightarrow B}\). For each intersection, we count the number of messages from which we have detected the specific mood and emotion. Next, we build the fuzzy sets \({\tilde{E}}\) and \({\tilde{M}}\) as in Eqs. (7) and (8) and derive the value of Eq. (11). The results are shown in Table 2.

Table 1 Moods and emotions of user A interacting with B
Table 2 Emotional signature of user A
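A minimal Python sketch of this construction follows; the per-message (mood, emotion) detections below are hypothetical, and the mood and emotion labels are illustrative.

```python
from collections import Counter

# Hypothetical (mood, emotion) detections, one per message that A sent to B.
messages = [("M1", "E2"), ("M1", "E4"), ("M2", "E2"),
            ("M3", "E3"), ("M3", "E3"), ("M3", "E2")]
n = len(messages)  # |T_{A->B}|, the number of messages A sent to B

# Eqs. (9)-(10): membership degrees are relative detection frequencies.
mood_counts = Counter(m for m, _ in messages)
emo_counts = Counter(e for _, e in messages)
M_tilde = {m: c / n for m, c in mood_counts.items()}  # Eq. (8)
E_tilde = {e: c / n for e, c in emo_counts.items()}   # Eq. (7)

# Eq. (11): the emotional signature is the fuzzy Cartesian product, taking
# the minimum of the two membership degrees for each <emotion, mood> pair.
ES = {(e, m): min(E_tilde[e], M_tilde[m]) for e in E_tilde for m in M_tilde}
```

For instance, with these detections, E2 appears in 3 of 6 messages and M3 in 3 of 6, so the signature entry for ⟨E2, M3⟩ is min(0.5, 0.5) = 0.5.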

Emotional signatures can be aggregated by the VCS to obtain the complete behaviour of a user from partial behaviours or group signatures. Signatures can also be compared with similarity measures to identify users with similar or dissimilar emotional behaviours and/or to evaluate how the emotional situations of users and groups evolve over different communication sessions.

3.1.1 Aggregation of emotional signatures

We can use an aggregation operator to obtain: (1) the overall emotional signature of a user and (2) a group signature:

$$\begin{aligned} ES_{A} = \bigodot _{i \in U - \lbrace A \rbrace }(ES_{A \rightarrow i}) \end{aligned}$$
(12)
$$\begin{aligned} GS = \bigodot _{u \in U}(ES_{u}) \end{aligned}$$
(13)

where U is the set of users belonging to a group and \(\bigodot\) is an aggregation operator.

Equation (12) aggregates the partial signatures of a user to obtain an overall emotional signature. Equation (13) aggregates the signatures of all the users involved in a communication session. The aggregation operators can be more or less complex and vary from simple averages to more complex ordered weighted averaging (OWA) operators (Yager 1993) that can also work on the basis of linguistic quantifiers.
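As a sketch, the simplest choice for \(\bigodot\) is the arithmetic mean; the signatures below are hypothetical, and a pair missing from a signature is treated as having membership degree 0.

```python
def aggregate(signatures):
    """Elementwise average of signatures, each a dict keyed by (emotion, mood).
    A pair absent from a signature contributes a membership degree of 0."""
    keys = set().union(*(s.keys() for s in signatures))
    return {k: sum(s.get(k, 0.0) for s in signatures) / len(signatures)
            for k in keys}

# Hypothetical partial signatures of user A towards users B and C.
ES_A_to_B = {("E1", "M1"): 0.8, ("E2", "M1"): 0.4}
ES_A_to_C = {("E1", "M1"): 0.6, ("E3", "M2"): 0.2}

ES_A = aggregate([ES_A_to_B, ES_A_to_C])  # overall signature of A, Eq. (12)
# A group signature (Eq. 13) is obtained the same way over all users' ES_u.
```

An OWA operator would replace the plain mean with a weighted mean over the sorted membership degrees, allowing quantifier-guided aggregation such as "most".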

3.1.2 Comparing signatures with similarity measures

As proposed also in Yager and Reformat (2012), we can use a similarity measure to identify similar users. Considering \(ES_{A}\) and \(ES_{B}\) as the emotional signatures of users A and B, we define the absolute similarity between A and B:

$$\begin{aligned} Sim_{Abs}(A, B) = \frac{|ES_{A} \cap ES_{B}|}{|ES_{A} \cup ES_{B}|} \end{aligned}$$
(14)

and relative similarity between A and B:

$$\begin{aligned} Sim_{Rel}(A, B) = \frac{|ES_{A} \cap ES_{B}|}{|ES_{A}|} \end{aligned}$$
(15)

where \(|\cdot|\) is a cardinality measure and we model \(\cup\) and \(\cap\) with the max t-conorm and the min t-norm, respectively. There is a difference between Eqs. (14) and (15): in the latter, we compute the common elements (i.e., the common \(\langle mood, emotion \rangle\) pairs) between A and B and relate this value to the elements of A alone. Strictly speaking, Eq. (15) is not a similarity measure, because \(Sim_{Rel}(A, B) \ne Sim_{Rel}(B, A)\); it provides information on how much user B presents the same emotional elements as those in A.

The two measures of Eqs. (14) and (15) have different objectives: the first allows the VCS to recognize emotional situations characterized by similar states between two subjects involved in communication. The second allows the VCS to suggest to a user A another user B with an emotional signature compliant with A.

Equations (14) and (15) can also be used to compare group signatures. For instance, \(Sim_{Abs}(GS_{t}, GS_{t+\delta })\) compares the group signature at time t with that at time \(t+\delta\) and allows the VCS to understand if there have been significant emotional variations in a group between two time sessions.
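Using the sigma-count (sum of membership degrees) as the cardinality measure, a common choice that the text leaves open, Eqs. (14) and (15) can be sketched as follows; the example signatures are illustrative.

```python
def sim_abs(es_a, es_b):
    """Eq. (14): |A ∩ B| / |A ∪ B|, with min as intersection and max as union,
    cardinality taken as the sum of membership degrees (sigma-count)."""
    keys = set(es_a) | set(es_b)
    inter = sum(min(es_a.get(k, 0.0), es_b.get(k, 0.0)) for k in keys)
    union = sum(max(es_a.get(k, 0.0), es_b.get(k, 0.0)) for k in keys)
    return inter / union if union else 0.0

def sim_rel(es_a, es_b):
    """Eq. (15): |A ∩ B| / |A|; asymmetric by construction."""
    inter = sum(min(v, es_b.get(k, 0.0)) for k, v in es_a.items())
    card_a = sum(es_a.values())
    return inter / card_a if card_a else 0.0

# Illustrative signatures: B shares only one <emotion, mood> pair with A.
es_a = {("E1", "M1"): 0.5, ("E2", "M1"): 0.5}
es_b = {("E1", "M1"): 0.5}
```

Here sim_rel(es_b, es_a) = 1.0 while sim_rel(es_a, es_b) = 0.5, illustrating the asymmetry discussed above: everything B expresses is also in A, but not vice versa.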

3.2 Situation identification and recommendations

The analysis of the emotional signatures supports the phase of identification of the situation by the VCS.

A situation is defined as a tuple \(Sit=(S, R, \lbrace ES_{S \rightarrow R} \rbrace , GS)\) where:

  • \(S \subseteq U\) is the set of senders;

  • \(R \subseteq U\) is the set of receivers;

  • \(\lbrace ES_{S \rightarrow R} \rbrace\) is a family of emotional signatures related to the sender-receiver interactions;

  • GS is the emotional signature of the set \(S \cup R\).

As already mentioned, the individual and group emotional signatures are developed starting with a subset of moods and emotions detected. Limiting our study to dyadic interactions, with reference to Fig. 4, we can show two cases of templates for a situation: case (a) refers to a rare case in which there is no overlapping of roles during a communication session. This includes, for example, an asynchronous communication situation wherein the recipient generally reacts over a longer period of time and may take time to reflect. Case (b) is a more general situation wherein the sender and receiver can exchange roles during a communication session.

Fig. 4

Templates of a situation

Based on this definition, it is clear that if the VCS has a comprehension of the constituent elements, then it can correctly identify the situation. Thus, based on the comprehension of the emotional signatures and on a specific strategy, it can classify a situation and take action. The following subsections describe these two aspects.

3.2.1 Comprehension of the emotional signatures

One way to comprehend and label an emotional signature is to use the \(\alpha\)-cut concept of fuzzy sets. In short, given a fuzzy set, an \(\alpha\)-cut allows us to eliminate from the set all the elements whose degree of membership is less than a fixed threshold, say \(\alpha\). Thus, for example, suppose \(\alpha = 0.3\) and consider the signature of Table 2. With this cut, we remove all \(\langle mood, emotion \rangle\) pairs that have a membership degree lower than 0.3, allowing us to easily classify the signature as \(\langle M3, E4 \rangle\) (e.g., \(\langle Excited, Sad \rangle\)). Such simplicity, however, comes at the cost of a very coarse-grained description of a user’s emotional behaviour. Again with reference to Table 2, observe that with a slightly reduced value of \(\alpha\) (e.g., 0.25), elements relating to the emotion E2 emerge with all the moods detected. In this case, the labeling is not immediate, but what matters most to the VCS is to understand the behaviour described by the signature, which can be formulated as follows: when the mood is M1, the user feels the emotion E2 or E4; when the mood is M2, the user experiences only the emotion E2; and when the mood is M3, the user experiences the emotion E4 more often than E2. In this way, the VCS gains an improved awareness of the emotional behaviour of a user that can be used to classify situations.
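The \(\alpha\)-cut operation itself is straightforward; here is a sketch in Python, with membership degrees that are hypothetical and only loosely patterned on Table 2.

```python
# Hypothetical signature with pairs keyed as (emotion, mood); the degrees
# are illustrative, not the actual values of Table 2.
ES = {("E4", "M3"): 0.33,
      ("E2", "M1"): 0.25, ("E2", "M2"): 0.25, ("E2", "M3"): 0.25,
      ("E4", "M1"): 0.25}

def alpha_cut(signature, alpha):
    """Keep only the pairs whose membership degree is at least alpha."""
    return {k: v for k, v in signature.items() if v >= alpha}

coarse = alpha_cut(ES, 0.30)  # only <M3, E4> survives: a one-pair label
finer = alpha_cut(ES, 0.25)   # E2 emerges with all the detected moods
```

Lowering \(\alpha\) trades the simplicity of a one-pair label for a richer behavioural description, exactly the tradeoff discussed above.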

From the behavioural description mentioned above, it is easy to see that the VCS already has the ability to predict that if the user is in the mood M2, the message he will send to one of his interlocutors will carry the emotion E2. Whether or not to recommend sending this message depends, however, on the strategy and the objectives of the VCS.

3.2.2 Situation classification and decision making

When the VCS has a good understanding of the elements of a situation, it classifies the situation and proposes actions. For this purpose, the VCS makes use of a GDTA as defined by Endsley (1995). A GDTA is a form of cognitive task analysis, key to acquiring good SA, that focuses on the goals the human operator must achieve and the information requirements needed to make appropriate decisions. In a GDTA structure, a decision is associated with a sub-goal. For each sub-goal, the GDTA clearly reports the information to acquire at the perception, comprehension and projection levels of a situation. The use of the GDTA has an advantage related to the reuse of the VCS in other domains (e.g., education): the methodology remains unchanged, and only the domain expert knowledge has to be acquired anew, possibly also with machine learning techniques.

In the example reported in the Appendix of Sect. 7, we show how the VCS can make decisions related to a sub-goal such as “avoiding conflict situations”. First, however, we clarify the methodological approach underlying the classification of an identified situation.

The classification mechanism is based on the three-way decisions approach (Yao 2016). The fundamental idea is to divide a universal set into three pairwise disjoint regions, or more generally a whole into three distinctive parts, and to act upon each region or part by developing an appropriate strategy. The tripartition is based on the comparison of an evaluation function with two thresholds: \(\beta\) and \(\gamma\). In three-way decision models, the determination of the thresholds is of fundamental importance. Several approaches have been proposed based on Shannon entropy (Deng and Yao 2012), the Chi-square statistic (Gao and Yao 2016), and game theory (Herbert and Yao 2011). If the value of the function is greater than or equal to \(\gamma\), the situation is classified in a positive region. If the value of the function is less than or equal to \(\beta\), the situation is classified in a negative region. Otherwise, the situation is classified in a boundary region.

Our purpose is to classify situations with respect to the objectives of the VCS (e.g., SAFE or UNSAFE with respect to the goal of avoiding conflicts) and to the emotional contagion dynamic (e.g., to enforce empathy in a group).

Consider the template of situation (b) of Fig. 4. The VCS can label this situation with the GS. For instance, applying an \(\alpha\)-cut to GS, the VCS can label this situation as \(\langle M1, E5 \rangle\), e.g., \(\langle Calm, Joy \rangle\), which may indicate a SAFE situation with respect to a specific sub-goal. Of course, there are other group characterizations that can be considered SAFE for this specific objective, such as \(\langle Excitement, Joy \rangle\) and combinations of these pairs. Let us call SafeSit the set of these situations. Now, the VCS must determine how to classify this situation with respect to the dynamics of emotional contagion and, to this purpose, apply a three-way decision. The VCS uses the similarity between emotional signatures of sender and receiver, \(Sim_{Abs}(ES_{S \rightarrow R}, ES_{R \rightarrow S})\), as an evaluation function. For the sake of simplicity, let us label as SAFE the situation characterized by \(GS = (\langle Calm, Trust \rangle )\). This yields the following subsets:

$$\begin{aligned} \begin{aligned} POS(SafeSit) =&\bigg \lbrace GS \in SafeSit | \\&Sim_{Abs}(ES_{S \rightarrow R}, ES_{R \rightarrow S}) \ge \gamma \bigg \rbrace \end{aligned} \end{aligned}$$
(16)
$$\begin{aligned} \begin{aligned} NEG(SafeSit) =&\bigg \lbrace GS \in SafeSit | \\&Sim_{Abs}(ES_{S \rightarrow R}, ES_{R \rightarrow S}) \le \beta \bigg \rbrace \end{aligned} \end{aligned}$$
(17)
$$\begin{aligned} \begin{aligned} BND(SafeSit) =&\bigg \lbrace GS \in SafeSit | \\&\beta< Sim_{Abs}(ES_{S \rightarrow R}, ES_{R \rightarrow S}) < \gamma \bigg \rbrace \end{aligned} \end{aligned}$$
(18)

where SafeSit is the set of all situations characterized as SAFE, i.e., having a GS that can be labeled as SAFE.

In general terms, the expert decides which combinations of pairs \(\langle mood, emotion \rangle\) in the GDTA can be considered SAFE for a particular goal (i.e., the set SafeSit). Similarly, the GDTA provides information on combinations of pairs that can be considered UNSAFE and DOUBT. By evaluating the similarity between the individual emotional signatures, the VCS applies three-way decisions: the set of all SAFE, UNSAFE or DOUBT situations is partitioned into three regions (POS, NEG, BND) by comparing the emotional signature similarity against the two thresholds. It is then possible to determine in which region the current situation falls and act according to the indications of the GDTA. An example related to our experiment is shown in Table 3 of Sect. 4.2.

The expressions of Eqs. (16), (17) and (18) are meaningful for our classification purposes. Equation (16) highlights that the similarity between the emotional signatures of the individuals involved is high. Considering Eq. (14), this means that the two emotional signatures share many elements relative to all possible elements, so the \(\langle mood, emotion \rangle\) pairs of the users are aligned during the communication session under examination. The VCS can infer a consonance between the emotional dynamics of the individuals. Similar reasoning applies to Eq. (17): in this case, the similarity is low, so the \(\langle mood, emotion \rangle\) pairs of the users differ, indicating distance between the emotions and moods expressed in the communication session. The VCS then infers a strong difference between the emotional dynamics of the individuals. The third case, Eq. (18), refers to a boundary situation in which the VCS cannot make a reliable decision about the classification with respect to the emotional dynamic.
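Equation (14) is not reproduced in this section, but the surrounding prose describes \(Sim_{Abs}\) as the elements two signatures share relative to all possible elements. A hypothetical sketch under that reading, with signatures modeled as sets of \(\langle mood, emotion \rangle\) pairs over the vocabularies introduced in Sect. 4.1, follows; this is an illustration, not the paper's exact Eq. (14).

```python
# Hypothetical reading of Sim_Abs: shared (mood, emotion) pairs relative to
# all possible pairs. The vocabularies come from the VYVO API (moods) and the
# Tone Analyzer (emotions); the normalization choice is an assumption.

MOODS = ["Depression", "Excitement", "Calm"]                        # VYVO
EMOTIONS = ["Anger", "Fear", "Joy", "Sadness", "Disgust", "None"]   # Tone Analyzer

ALL_PAIRS = {(m, e) for m in MOODS for e in EMOTIONS}  # 18 possible elements

def sim_abs(sig_sr: set, sig_rs: set) -> float:
    """Shared (mood, emotion) pairs relative to all possible pairs."""
    return len(sig_sr & sig_rs) / len(ALL_PAIRS)
```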

4 The prototype of the virtual counselor system

In this section, the VCS implementation is described from two perspectives: technological and behavioural. The first presents an overview of the architecture and the technological assets used to implement the prototype; the second describes the behaviour of the prototype. A constraint of this work is to design a low-cost solution for the users. Furthermore, the current prototype has been implemented with the objective of evaluating the distinctive features of our approach: the construction of individual and group emotional signatures, situation identification and classification with the three-way decisions approach, and decision making.

4.1 Technological perspective of the VCS

Figure 5 shows the architecture of the VCS. The VCS is deployed as a web service on a remote server and provides its functionalities by exploiting components able to detect emotions from text messages (i.e., a sentiment analysis tool) and to detect moods by accessing a cloud API that reads data traced by the users' wearable devices (i.e., a smart watch). The main role of the VCS is to generate emotion-driven advice for the users of a specific communication channel. Such advice is expressed as emotional awareness information, represented and visualized in the communication tool plug-in, to provide users with higher levels of emotional awareness and thus improve communication quality.

Fig. 5
figure 5

Overall architecture of the virtual counselor system

To implement these functions, we need to employ technological assets to detect emotions and moods. The first asset is the IBM Watson Tone AnalyzerFootnote 3 service (Tone Analyzer), which uses linguistic analysis to detect emotional and language tones in written text. The Tone Analyzer service is accessed through an SDK provided by IBM, and we used a pretrained model provided by IBM. The second asset (for mood detection through a low-cost wearable device) is the VYVO smart watch accessible through VYVO APIs.

To represent moods and sentiments, it is important to also consider what the selected components (Tone Analyzer and the VYVO API) offer. In particular, with respect to moods detected by VYVO wearable devices, the adopted dictionary includes three different values: DEPRESSION, EXCITEMENT, and CALM. The VYVO API provides a representative value for each quadrant of the arousal-valence schema of the Circumplex Model of Affect (Russell 1980), except for EXCITEMENT, for which it is not possible (through the VYVO approach) to determine whether the mood is positive or negative (see Fig. 6).

Fig. 6
figure 6

Mapping between Circumplex model and VYVO approach

Furthermore, the possible output of Tone Analyzer comprises the values ANGER, FEAR, JOY, SADNESS, and DISGUST, plus an additional neutral value, NONE. These values can be easily mapped onto the basic emotions of Izard (2009) and onto Plutchik's discrete emotion model (Plutchik 2001).
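The Fig. 6 mapping between the VYVO mood values and the Circumplex quadrants can be sketched as follows. The quadrant assignments follow the usual Circumplex placement (CALM pleasant/deactivated, DEPRESSION unpleasant/deactivated); the exact encoding is an assumption for illustration.

```python
# Sketch of the mapping between VYVO mood values and the arousal-valence
# quadrants of Russell's Circumplex Model (cf. Fig. 6). EXCITEMENT covers
# both high-arousal quadrants because VYVO cannot resolve its valence.
# The (valence, arousal) tuples are an illustrative encoding.

VYVO_TO_QUADRANT = {
    "CALM":       [("positive", "low")],                          # pleasant, deactivated
    "DEPRESSION": [("negative", "low")],                          # unpleasant, deactivated
    "EXCITEMENT": [("positive", "high"), ("negative", "high")],   # valence unresolved
}
```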

4.2 Behavioural perspective of the VCS

Figure 7 provides an overview of the VCS behaviour. The VCS collects emotions and moods of the users of a conversation with IBM Tone Analyzer and VYVO API. For each user, the VCS builds individual and group signatures. The values of \(\alpha\) to implement an \(\alpha\)-cut for signature classification are given as external parameters. Similarly, the values of \(\beta\) and \(\gamma\) to classify the recognized situations are external parameters in the current implementation of the prototype. These three parameters allow the VCS to perform analysis at different levels of granularity for the classification of signatures (e.g., as already mentioned, high values of \(\alpha\) simplify the classification, but at a coarse grain level) and with variable precision for the classification of a situation.

Fig. 7
figure 7

Main behaviour of the virtual counselor

In this version of the prototype, the situations are labeled as SAFE, UNSAFE and DOUBT. Taking into account the set of moods and emotions we are able to detect with our technological choices, the situations described in Sect. 5 will be labeled as follows:

  • SAFE, if the \(\langle mood, emotion \rangle\) pairs of a GS with a specific value of \(\alpha\)-cut consist only of combinations of positive or neutral moods, such as Calm and Excitement, and positive emotions such as Joy and neutral None;

  • UNSAFE, if the \(\langle mood, emotion \rangle\) pairs of a GS with a specific value of \(\alpha\)-cut consist only of combinations of negative or neutral moods, such as Depression and Excitement, and negative emotions such as Anger, Fear, Sadness, and Disgust;

  • DOUBT, for the other cases.
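The three labeling rules above can be sketched as follows, with the group signature (after the \(\alpha\)-cut) modeled as a set of \(\langle mood, emotion \rangle\) pairs. The function and set names are illustrative.

```python
# Sketch of the SAFE/UNSAFE/DOUBT labeling rules described above.
# A group signature is modeled as a set of (mood, emotion) pairs.

POSITIVE_OR_NEUTRAL_MOODS = {"Calm", "Excitement"}
NEGATIVE_OR_NEUTRAL_MOODS = {"Depression", "Excitement"}
POSITIVE_OR_NEUTRAL_EMOTIONS = {"Joy", "None"}
NEGATIVE_EMOTIONS = {"Anger", "Fear", "Sadness", "Disgust"}

def label_situation(gs_pairs: set) -> str:
    """Label a group signature as SAFE, UNSAFE, or DOUBT."""
    if all(m in POSITIVE_OR_NEUTRAL_MOODS and e in POSITIVE_OR_NEUTRAL_EMOTIONS
           for m, e in gs_pairs):
        return "SAFE"
    if all(m in NEGATIVE_OR_NEUTRAL_MOODS and e in NEGATIVE_EMOTIONS
           for m, e in gs_pairs):
        return "UNSAFE"
    return "DOUBT"

print(label_situation({("Calm", "Joy"), ("Excitement", "None")}))  # SAFE
print(label_situation({("Depression", "Anger")}))                  # UNSAFE
print(label_situation({("Calm", "Joy"), ("Depression", "Fear")}))  # DOUBT
```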

Using three-way decisions, each of these cases can be further classified as POS, NEG and BND. Therefore, we have 9 classes, and we have to recommend actions for these classes on the basis of a GDTA. Table 3 reports the corresponding actions for the objective “avoiding conflict situations”.

The common strategy underlying the reported actions is to support dynamics of positive emotional contagion leading towards empathy. In the POS column, the similarity of the users' emotional behaviour is high: for SAFE situations, the message is simply sent, while for UNSAFE situations it is recommended to temporarily suspend the conversation. The general strategy underlying the BND column is to have A reflect, in both SAFE and UNSAFE situations, before sending the message. The reflection follows an explanation of B's emotional state, so as to prevent the emotional dynamics from turning towards a negative type of influence (in the SAFE case) or to promote a possible positive influence (in the UNSAFE case); the VCS may also suggest reformulating the message. The actions in the NEG column account for the differences between the emotional states of the two users: in all cases, a better assessment is recommended by showing B's emotional state. With respect to the DOUBT row, the VCS tries to move this classification towards SAFE by recommending the sending of positive messages when the distance between the emotional states of the users is not large.
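The 9 classes and their recommended actions can be laid out as a lookup table. The action strings below paraphrase the strategy described in the text for the goal "avoiding conflict situations"; the exact wording of Table 3 may differ.

```python
# Sketch of the 9-class action table: situation label x three-way region.
# Action wording paraphrases the strategy described in the text.

ACTIONS = {
    ("SAFE",   "POS"): "send the message",
    ("SAFE",   "BND"): "show B's emotional state; invite A to reflect before sending",
    ("SAFE",   "NEG"): "show B's emotional state; recommend a better assessment",
    ("UNSAFE", "POS"): "temporarily suspend the conversation",
    ("UNSAFE", "BND"): "show B's emotional state; invite A to reflect, possibly reformulate",
    ("UNSAFE", "NEG"): "show B's emotional state; recommend a better assessment",
    ("DOUBT",  "POS"): "recommend sending positive messages",
    ("DOUBT",  "BND"): "recommend sending positive messages",
    ("DOUBT",  "NEG"): "show B's emotional state; recommend a better assessment",
}

def recommend(label: str, region: str) -> str:
    """Look up the course of action for a classified situation."""
    return ACTIONS[(label, region)]
```

Note that DOUBT-POS and DOUBT-BND share the same action, consistent with the discussion of Table 9 in Sect. 5.3.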

Table 3 Courses of action when A has to send a message to B

We recall that the development of the GDTA for the VCS is an expert-based task and that the set of actions reported above has been defined only for evaluation purposes.

5 Experimentation

This section describes the experimentation carried out through our prototype of the VCS. The experimentation involved 40 international students attending the Decision Model Course, held in English, at the DISA-MIS department of the University of Salerno. The students were divided into pairs to carry out project work for the exam. To this end, they worked for 1 day at the beginning of September 2020.

5.1 Experimental prototype

To execute the first experimentation activity, an app for Microsoft Teams was developed. The students involved were already skilled in this application, as it is used for distance learning at the University of Salerno. Once the app is installed in the Teams environment, users sign up for the counselor service in a given channel. They are then provided with a new button (fully integrated into the Teams user interface) to submit their messages to the Teams chat tool. Clicking this button activates an action command that serializes the current text message and information about the sender, receivers and channel into a JSON payload. This payload is sent to the web service, which invokes the two components and executes its inner behaviour. In the next, complete release of the software, the sender will be shown a dialog box with the suggestions generated by the web service; in particular, the sender will be notified that the message was sent or, for example, that reflection is required, with specific motivations related to the other user's emotional signature. The release used for this experimentation activity does not provide any feedback to the sender, but it traces and stores input, intermediate and output data in a log file (in CSV format).
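The payload built by the action command can be sketched as follows. The field names are illustrative; the source only states that the message, sender, receivers and channel are serialized into JSON.

```python
# Sketch of the JSON payload the Teams action command could produce.
# Field names are illustrative assumptions, not the prototype's actual schema.

import json

def build_payload(text: str, sender: str, receivers: list, channel: str) -> str:
    """Serialize a message and its routing information into a JSON string."""
    return json.dumps({
        "content": text,
        "sender": sender,
        "receivers": receivers,
        "channel": channel,
    })

payload = build_payload("Hi, any progress on the model?", "userA",
                        ["userB"], "project-chat")
```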

5.2 Methodology

The 40 students were divided into 20 groups of 2 users. Each group had an objective (project work related to the topics of the course) and worked for 4 h. The 4 h consisted of three communication sessions of 1 h each, with two breaks of 30 min each (see Fig. 8).

Fig. 8
figure 8

Experimental sessions for a group

During each session, the VCS worked as explained. At the end of the first session, the VCS created the individual and group signatures and classified a situation. This information was updated at the end of the second and third sessions. The system prototype traced data and produced a CSV file with the following columns:

  • sender ID

  • receiver ID

  • content (text message)

  • senderSignature (a pointer to the individual sender signature)

  • receiverSignature (a pointer to the individual receiver signature)

  • advice (generated by the Virtual Counselor)

  • groupSignature (a pointer to the group signature)
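The trace row can be sketched with the columns listed above. Writing to an in-memory buffer here for illustration; the real prototype writes a file, and the exact column spellings in the CSV are assumptions.

```python
# Sketch of the CSV trace produced by the prototype, using the columns listed
# above (spellings normalized; signature fields hold pointers, modeled as IDs).

import csv, io

COLUMNS = ["senderID", "receiverID", "content", "senderSignature",
           "receiverSignature", "advice", "groupSignature"]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow({
    "senderID": "userA", "receiverID": "userB",
    "content": "Hi, any progress?",
    "senderSignature": "sig/A-001", "receiverSignature": "sig/B-001",
    "advice": "send the message", "groupSignature": "sig/G-001",
})
```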

During the 30-min break, the students were asked to (1) complete an Excel table reproducing the table of the emotional signatures (see Table 2) by indicating the mood-emotion pairs they perceived during the session, (2) assess how they perceived the given advice (i.e., whether the advice was correct or not), and (3) assess whether the situation they experienced during the communication session was perceived as serene (SAFE) or conflictual (UNSAFE), or whether they did not know (DOUBT), and whether they perceived commonality of emotions (POS) or not (NEG), or whether they were uncertain about this (BND).

5.3 Results

We have a total of 60 situations, 40 recommendations and 120 individual emotional signatures to assess. The parameters are as follows: \(\alpha = 0.3\) for the cut of the signature, and \(\beta = 0.3\) and \(\gamma = 0.6\) for the three-way decisions. The \(\alpha\)-cut removes the signature pairs with low membership values, which would otherwise reduce the understandability of the signatures. The trade-off, as discussed, is the removal of some \(\langle mood, emotion \rangle\) pairs.
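The \(\alpha\)-cut can be sketched directly: pairs whose membership value falls below \(\alpha\) are dropped from the fuzzy signature. The signature values below are illustrative.

```python
# Sketch of the alpha-cut on a fuzzy signature: (mood, emotion) pairs with
# membership below alpha are removed. Membership values are illustrative.

def alpha_cut(signature: dict, alpha: float) -> set:
    """Keep only the (mood, emotion) pairs with membership >= alpha."""
    return {pair for pair, mu in signature.items() if mu >= alpha}

sig = {("Calm", "Joy"): 0.8, ("Calm", "None"): 0.35, ("Depression", "Fear"): 0.1}
print(sorted(alpha_cut(sig, 0.3)))  # the low-membership Depression/Fear pair is removed
```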

The ground truth derived from the students for the situations is reported in Table 4. The predictions of the VCS are reported in Table 5. The confusion matrix and the values of the accuracy measures evaluated with the R caretFootnote 4 package are shown in Tables 6 and 7. In the confusion matrix of Table 6, each row represents the instances in a predicted class (prediction), while each column represents the instances in an actual class (truth). The classes are labeled with abbreviations consisting of the first letters (e.g., N-S stands for NEGATIVE-SAFE, P-S stands for POSITIVE-SAFE, and so on). The accuracy measures evaluated are sensitivity (which is equal to recall), specificity, positive predictive value (PPV, which is equal to precision), negative predictive value (NPV), F-measure and balanced accuracy (BA).
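The per-class measures named above are standard one-vs-rest statistics computed from confusion-matrix counts, as the R caret package does; a sketch:

```python
# Sketch of the per-class accuracy measures: sensitivity (recall),
# specificity, PPV (precision), NPV, F-measure, and balanced accuracy,
# computed one-vs-rest from confusion-matrix counts.

def class_measures(tp: int, fp: int, fn: int, tn: int):
    sensitivity = tp / (tp + fn)          # = recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                  # = precision
    npv = tn / (tn + fn)
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, ppv, npv, f_measure, balanced_accuracy
```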

Table 4 Ground for situations
Table 5 Prediction for situations
Table 6 Confusion matrix
Table 7 Accuracy measures

From the analysis of Table 4, we can observe that there is only one UNSAFE situation, belonging to the N-U class. This is explained by the fact that the students were friends and course colleagues who knew each other well and had decided to carry out a collaborative project, which limited the occurrence of UNSAFE situations. From the confusion matrix of Table 6, it can also be observed that for this unique N-U instance, the VCS made a prediction error, classifying it as B-U. The VCS was therefore wrong in the tripartition, not in the comprehension of the group signature. The accuracy measures are good for the other classes, with BA values higher than 0.9.

Regarding the recommendations, 29 of the 40 (72.5%) were correct and 11 were incorrect. The distribution of correct and incorrect recommendations with respect to the correct classification of the situation (i.e., the ground truth from the students) is reported in Tables 8 and 9. For a better discussion of this result, Table 10 also reports the ground truth for sessions 1 and 2 (excluding the 20 third sessions, for which no recommendation is foreseen). We can also observe that most of the NEG cases belong to the third session. The likely reason is that, for some groups, 3 working hours were not enough to successfully finish the project, so some misalignment between the emotional signatures emerged.

It can be observed that the recommendations proposed in the SAFE-NEG situation are always wrong. This is partly due to a limitation of the current prototype, which does not visualize a dialog box with an explanation, so the users do not understand the motivation for a better assessment. Additionally, the DOUBT-BND situation presents 3 out of 5 incorrect recommendations. In this case, the recommended action is the same as in DOUBT-POS, which has a large number of correct recommendations. The likely reason is that BND is characterized by different emotional states of the users (lower similarity than POS); thus, a user may disagree with the recommendation to send because s/he perceives this different emotional setting with her/his interlocutor.

Table 8 Distribution of correct recommendations
Table 9 Distribution of incorrect recommendations
Table 10 Ground for sessions 1 and 2

Finally, we report some considerations on the individual emotional signatures. We evaluated the similarity between the signatures detected by the VCS and those compiled by the students. Figures 9, 10 and 11 report the similarity values; the captions also report the average value for each session. In all cases, the average similarity is approximately 0.6.

Fig. 9
figure 9

Similarities of signatures for session 1 (mean = 0.59)

Fig. 10
figure 10

Similarities of signatures for session 2 (mean = 0.62)

Fig. 11
figure 11

Similarities of signatures for session 3 (mean = 0.60)

6 Conclusions

We have presented the first implementation of a system to enhance CMC by improving the situational awareness of the emotional dynamics of individuals and groups. VCS has concrete applications in social networks, working or learning environments that leverage CMC. In particular, the study carried out focused on the ability of the system to acquire a certain awareness of the emotional dynamics characterized by emotional contagion, in which emotions and behaviours of a person can trigger similar emotions and behaviours in other people.

The current implementation of the VCS presents some limitations. Specifically, it presents limitations with respect to (1) the way in which users are informed about their emotional state, (2) how multiple actions can be suggested on the basis of an individual emotional state, and (3) the ethical effects that such a system can have on the privacy of communication.

With respect to the first issue, we are studying the best way to visualize the emotion-mood couples in a user interface in light of the entire emotional process. An increase in emotional awareness can in fact be obtained when the whole process, including the trigger event for the emotion, the emotion itself and the action consequent to the emotional state, is represented in an intuitive way. A correct visualization of the emotional signature must therefore be contextualized in the communication process to support individuals' awareness of the affective information linked to their activities.

With regard to the recommendation of multiple actions, the study of multicriteria decision analysis methods is ongoing. In particular, the dominance-based rough set approach (Greco et al. 2007) seems to be a good candidate since it works on the basis of a dominance relation. With this approach, a set of actions described with different attributes (that also need to include emotional aspects) can be ordered according to the dominance relation, and thus, the VCS may recommend a set of multiple actions ordered by a priority ranking. However, an order of preference must be established for the emotional-affective attributes so that the dominance relationship can be applied.
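The dominance relation on which the dominance-based rough set approach rests can be sketched as follows: action a dominates action b when a is at least as good on every preference-ordered attribute and strictly better on at least one. The numeric encoding (higher = better) and the attribute names are assumptions for illustration.

```python
# Sketch of the dominance relation underlying the dominance-based rough set
# approach. Actions are tuples of preference-ordered attribute scores
# (higher = better); the encoding is an illustrative assumption.

def dominates(a: tuple, b: tuple) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

# Two candidate actions scored on, e.g., (expected empathy, urgency fit):
print(dominates((3, 2), (2, 2)))  # True: weakly better on both, strictly on one
print(dominates((3, 1), (2, 2)))  # False: the actions are incomparable
```

Actions not dominated by any other form the top of the priority ranking the VCS could present.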

With respect to ethical and privacy issues, this is an open research topic. Emotional data clearly differ from other “my data” and pose additional constraints on traditional privacy problems; these constraints relate to the ethical implications of detecting and processing personal emotions. Certain privacy levels of user information can be preserved by integrating into the VCS privacy models and techniques such as those of Blundo et al. (2017). On the other hand, there are no negative ethical effects arising from individual decisions based on emotions, since the VCS only provides advice: the final decision is always up to the user.

The path taken is nevertheless promising, and the results of the preliminary experimentation, with all the limitations discussed, allow us to hope for the achievement of a broader goal: the possibility of directing communications towards emotional empathy, whose effects we also intend to evaluate in previously investigated domains such as enterprise workplaces (Gaeta et al. 2012). For instance, we are investigating the combination of the FOAF schemes of Gaeta et al. (2012) with the fuzzy signature. In brief, Gaeta et al. (2012) report results achieved in the domain of enterprise learning, specifically the definition and validation of semantic models to represent enterprise resources and knowledge. Among these models, they present a “Worker Model” based on FOAF, SKOS and other ontologies, which contains a representation of the tasks and competencies of a worker. We believe that the VCS can have important applications in the working environment, but we need to correlate the emotional signatures of users/workers with the concrete operational tasks and the competencies required to achieve them. Our current research extends the models of Gaeta et al. (2012) to allow reasoning on operational tasks and emotions.