Introduction

Affective experiences are a ubiquitous feature of learning (Boekaerts & Pekrun, 2016) and often originate and are constructed in social interaction (Järvenoja et al., 2017). The social dimension of affective experiences is particularly present in collaborative learning (Baker et al., 2013), which is increasingly implemented in both physical classrooms and online learning environments. When working together toward a common goal, group members can experience shared enjoyment of learning (Anttila et al., 2018) or encounter different kinds of socio-emotional challenges (Näykki et al., 2014), creating unique affective experiences for group members. Affect is constantly present as a condition influencing group members’ interactions and behaviors (Winne & Hadwin, 2008) and can foster processes beneficial for collaborative learning (Barron, 2003; Rogat & Adams-Wiggins, 2015) but, if socio-emotional challenges are not successfully regulated, it can have detrimental effects on group members’ collaboration (Bakhtiar et al., 2018). Understanding emotional variations in group members’ shared affective space could be instrumental in studying the role of affect in the collaborative learning process and, for example, in locating emotionally relevant situations to study and support group level emotion regulation in various learning contexts. While the role of affect for collaboration has been largely recognized, only a few studies have explicitly addressed group level affective processes (Bakhtiar et al., 2018; Isohätälä et al., 2018). This study tapped into this underemphasized dimension and targeted groups’ affective states during a collaborative learning process.

Emotions are multifaceted phenomena consisting of multiple psychological processes, including affective, cognitive, physiological, motivational, and expressive components (Pekrun, 2016; Russell & Barrett, 1999). Through socio-emotional interactions, collaborative learning brings an additional dimension that influences affective experiences (Baker et al., 2013). Capturing different individual and group level affective components during learning processes requires technologies and data sources that can capture temporal fluctuation of different components as the learning progresses (Järvenoja et al., 2018). Recently, the implementation of physiological measures has gained growing interest implying that combined with another data channel, such as video observations, they can reveal the multilayered functioning of affect in collaborative learning (Ahonen et al., 2018; Gillies et al., 2016; Haataja et al., 2018; Harley, 2015; Järvelä et al., 2019). This study adopted this interest and combined video data observations with measurement of students’ physiological activation and explored groups’ affective states and situational factors triggering them during collaborative learning.

Theoretical framework

Valence and activation—the dimensions contributing to affective effects on learning

Affect can be considered either as an affective trait or as a state. Where affective traits are relatively stable tendencies in a person’s emotional responding, affective states, including emotions, are situation-specific responses to the changing environment (Rosenberg, 1998). In this study, affective state is used to refer to situation-specific affective responses during collaborative learning process. The dimensions of valence and activation can be used as an organizational framework for studying various forms of affect such as academic emotions (Linnenbrink-Garcia et al., 2016; Pekrun et al., 2002). Valence separates positive emotions from negative, whereas activation refers to the extent of physiological arousal the emotion is causing (Ben-Eliyahu & Linnenbrink-Garcia, 2013; Boekaerts & Pekrun, 2016). By utilizing the affective circumplex (Russell & Barrett, 1999), academic emotions can be divided into four groups: positive activating (e.g., enjoyment), negative activating (e.g., anxiety), positive deactivating (e.g., relief), and negative deactivating (e.g., boredom) emotions, each making a unique contribution to individuals’ learning process (Fig. 1) (Pekrun, 2016; Pekrun et al., 2002).

Fig. 1 The affective circumplex model (adapted from Russell and Barrett, 1999)

Previous research has found that, when directed toward the task, positive emotions can enhance students’ engagement, motivation, interest, and the use of flexible, creative, and deep learning strategies, whereas contrary to positive emotions, negative emotions can more often lead to negative results (Pekrun, 2016). Robinson et al. (2017) used four affective profiles (positive, deactivating, negative, and moderate-low) to explore the relationships among affective experiences, engagement, and academic achievement. Results of Robinson et al. (2017) strengthen the previous findings indicating that students who reported positive activating and positive deactivating affect (positive profile) and students who reported moderate-high levels of both positive deactivating and negative deactivating affect (deactivating profile) experienced lower levels of disengagement and achieved higher exam scores. However, the emotional effects on learning are not always that straightforward. Positive deactivating emotions can sometimes result in decreased task motivation and disengagement, and when facing a difficult learning task, for example, students’ negative activating emotions can enhance motivation and effort investment in order to prevent failure (Pekrun et al., 2002).

Affective states in the context of collaborative learning

In collaborative learning situations, students are working in small groups with a goal of seeking solutions, constructing understanding and meanings, or creating a product together (Roschelle & Teasley, 1995). When a group of people collaborate, it is probable that individual group members converge in their affective states (Duffy et al., 2015), leading to synchronous and interactive experiences of group affect (Barsade & Knight, 2015). In the collaborative learning process, individual and collective affective states can be seen as conditions influencing behaviors and interactions among group members, but also as products shaped by those operations (Winne & Hadwin, 2008). When constructing the group’s affective state and socio-emotional climate during the learning process, socio-emotional interactions including emotional expressions serve as operations shaping the group members’ perceptions of the group’s affect (Bakhtiar et al., 2018; Kwon et al., 2014) and, thus, offer observable indicators of the group’s current affective state.

Concerning individuals’ affective states during collaboration, negative affect has been linked to disengagement and social loafing, while positive affect has been linked positively to group interactions, collaboration, and conceptual understanding (Linnenbrink-Garcia et al., 2011; Pietarinen et al., 2018). Zschocke et al. (2016) studied individual group work appraisals and emotions arising in the group work context and found that appraisals of the cognitive benefits of group work were a significant predictor of positive activating emotions, and experiences of negative activating and deactivating affect were mostly associated with task management and group assessment aspects. Positive socio-emotional interactions have been linked to positive affect (Linnenbrink-Garcia et al., 2011), favorable socio-emotional atmosphere (Bakhtiar et al., 2018; Kwon et al., 2014), processes beneficial for collaborative learning such as high-level cognitive processes (Barron, 2003; Isohätälä et al., 2018; Järvelä et al., 2016a), and facilitative group level regulation (Rogat & Adams-Wiggins, 2015; Rogat & Linnenbrink-Garcia, 2011). In turn, negative interaction has been linked to negative affect (Linnenbrink-Garcia et al., 2011) and, when persistent, shown to constrain groups’ regulatory actions (Bakhtiar et al., 2018).

Thus, affect is constantly present in groups’ collaborative interactions, influencing and shaping the learning process (Baker et al., 2013). To align empirical research on groups’ affective states during learning with the theoretical assumption of the multi-componential and situation-specific nature of affect, a need to employ multiple data sources that provide versatile information of different components during the process is evident (Azevedo et al., 2016; Harley, 2015). Video observation analysis, for example, can reveal students’ affective states continuously throughout the learning session (Linnenbrink-Garcia et al., 2011; Porayska-Pomsta et al., 2013). It also enables a more in-depth examination considering the situational factors present, especially in authentic learning settings (Linnenbrink-Garcia et al., 2016), but does not reveal non-observable reactions and interpretations (Duffy et al., 2015). Physiological measures, when combined with video data that contextualizes physiological reactions, offer a way to detect these hidden processes within and between collaborating students (Malmberg et al., 2018; Palumbo et al., 2017).

Studying affect in collaborative learning with physiological data

The physiological component of affect is closely linked to the activity of the autonomic nervous system (Kreibig, 2010). One way to detect the level of physiological activation is through sympathetic arousal, that is, the activation of the sympathetic nervous system responsible for fight or flight responses (Palumbo et al., 2017). Some previous results have indicated that, during learning situations, high arousal can be related to negative affect (Ahonen et al., 2018; Harley et al., 2019; Malmberg et al., 2018) and lower performance (Mason et al., 2018; Pizzie & Kraemer, 2018). However, there is also research supporting the beneficial aspects of high arousal (e.g., Harley et al., 2019). For example, in a recent study, Pijeira-Díaz et al. (2018) examined the relationship between high school students’ arousal and achievement during a collaborative physics course. A strong positive correlation was found between arousal during the exam and the grades achieved. When noting the concept of optimal arousal (Yerkes & Dodson, 1908), it is relevant to point out that studies often do not indicate the level of arousal defined as detrimental or beneficial. Furthermore, the role of complementary data that reveal, for example, the valence of the arousal episode has often been ignored. That is to say, firm conclusions about the effects on learning and performance cannot be made without contextualizing physiological reactions (Harley et al., 2019; Mason et al., 2018; Pizzie & Kraemer, 2018).

Since physiological arousal is an individual measure, inter-individual processes, such as collaboration, have been studied mostly through physiological synchrony, which is defined as any interdependent or associated activity in the physiological processes of two or more individuals (Palumbo et al., 2017). Previous research attempting to explore social interactions via physiological synchrony has indicated that physiological synchrony episodes in social interaction are often emotionally relevant (Mønster et al., 2016). Earlier, Kaplan et al. (1963) found that the synchrony in two individuals’ sympathetic arousal was more likely to occur in dyads that have a strong positive or negative affective relationship. Current findings have related physiological synchrony to emotional engagement (Slovák et al., 2014), feelings of non-belonging to the group (Mønster et al., 2016), and construction and maintenance of a common social and affective space (Cornejo et al., 2017). Also, Slovák et al. (2014) have suggested that synchrony in sympathetic arousal reflects emotional reactivity, that is, situations in which two people react to each other emotionally.

The implementation of physiological measures in collaborative learning research is, however, only just emerging (Ahonen et al., 2018; Gillies et al., 2016; Haataja et al., 2018; Malmberg et al., 2018; Pijeira-Díaz et al., 2018). Gillies et al. (2016) found that the level of synchrony among students in the classroom was higher during teacher-driven whole-class periods than during small group cooperative learning. Because collaboration is defined as coordinated and synchronous activity (Roschelle & Teasley, 1995), it could be assumed that, when true collaboration happens, physiological synchrony would also exist among the students in the same group (Pijeira-Díaz et al., 2018). Haataja et al. (2018) used three case groups to study relations between physiological synchrony and monitoring events during a collaborative learning process. They found that students in the same group showed a significant amount of physiological synchrony, and synchrony was positively connected to the shared monitoring events.

Thus, when exploring collaborative groups’ affective states, situations including physiological synchrony and increased physiological activation might be considered to be characterized by observable affective features, such as emotional expressions or socio-emotional interaction, and to point out emotionally relevant situations for collaborative learning processes (Mønster et al., 2016). With negative valence, those situations might indicate a strong negative activating affective state caused by a cognitive or emotional challenge calling for emotion regulation (Järvenoja & Järvelä, 2009). In turn, situations with positive valence might indicate a strong positive activating affective state, which could be a favorable state for collaborative learning and a fruitful condition for strengthening the state through positive socio-emotional interaction (Bakhtiar et al., 2018; Isohätälä et al., 2018). So far, existing research has concentrated either on positive or negative convergence in affective states and ignored the role of mixed situations in which group members’ affective states diverge from each other. Those situations might be detrimental for groups’ performance by hampering group members’ shared and coordinated behavior (Barsade & Gibson, 2012). This study takes a position that recognizes these variations in collaborative groups’ affective states and explores the potential of capturing varying affective states during authentic learning situations with physiological and video data.

Aims

This study investigated group level affective states during a collaborative learning session. Specifically, it explored the relations between groups’ observed affective states and group members’ physiological activation. Based on the affective circumplex model (Russell & Barrett, 1999), we hypothesized that emotional activation could be captured with physiological activation (i.e., sympathetic arousal) measurement (Kreibig, 2010). Accordingly, the overall hypothesis was that physiological activation would differentiate positive and negative situations from neutral ones, but would not differentiate positive valence from negative valence. Based on previous research findings (e.g., Harley et al., 2019; Malmberg et al., 2018; Pijeira-Díaz et al., 2018), students were expected to have mostly low levels of physiological activation. Thus, high group level physiological activation was expected to reveal emotionally relevant situations in which the group members would also be expressing activating emotions. Furthermore, this study aimed to understand what type of affective situations are related to the physiological activation of several group members. We hypothesized that high physiological activation could be provoked by various kinds of situational factors that students interpret to be meaningful in terms of the learning situation. The research questions were the following:

  1. How are observed group level affective states related to collaborative group members’ physiological activation?

  2. What kind of factors trigger physiological activation on a group level (a) in relation to activating positive, negative, and mixed affects and (b) without observable emotional expressions?

Methods

Participants and task

The participants were volunteer 6th grade primary school students (N = 41, 12–13 years old, 23 female, 18 male) from Finland. The students were assigned to 13 groups of 3–4 students, heterogeneously, based on their gender and interest in science measured with the Task Interest Inventory (Cleary, 2006). The students completed the questionnaire at school, guided by their teachers, before the data collection began. To ensure the validity of the analysis, data from 10 groups and 31 students (17 female, 14 male) were used, since electrodermal activity (EDA) data of sufficient quality were not obtained from all of the group members in the three remaining groups.

The data collection was conducted as a part of the students’ environmental studies. That is, one environmental studies class was held in a classroom-like learning and research space (LeaForum, University of Oulu). Accordingly, both the topic of the task (heat energy) and a collaborative way of working derived from the curriculum. Environmental studies are included in the Finnish curriculum of basic education already from the first grade and the curriculum highlights collaborative ways of working (Finnish National Agency for Education, 2016). Thus, the students had prior experiences of collaborative learning.

The collaborative task was instructed by one of the researchers. Before the task, some basic information about heat energy was presented. The task was to design and construct a model of an energy-efficient house that makes use of solar energy. There were a number of rules for the house design. For example, there had to be a door to access the house, the house had to be comfortable on a sunny day, as well as on a cold winter night, and a family consisting of two parents and two children should be able to live in the house comfortably. The pedagogical structure of the collaborative task included four phases: (1) Becoming an expert (15 min), (2) Brainstorming (10 min), (3) Sketching (20 min), and (4) Building (60 min). Each phase was timed, and the groups were shown the passing of time. In phase 1, each group member received an individual subtopic to specialize in and was instructed to make notes on how the topic was related to the building of houses. The students were provided with guiding questions to help the notetaking (e.g., Where should you put insulation in your house?). The subtopics assigned to the group members were heat capacity of different materials, heat conduction, and heat convection. Thus, the expertise was divided among the different group members so that they needed to work collaboratively to be able to perform the next phases. In phase 2, the students were instructed to share their unique expertise using the notes they had made during phase 1. They were required to synthesize the discrete pieces of knowledge by forming a shared list of facts they would need to consider when designing and building the house. Phase 3 focused on sketching the house. The sketch had to make clear how the house would be built and what materials (e.g., cardboard, tape, aluminum foil, cotton wool, and styrofoam sheets) would be used. 
After five minutes of sketching, students received additional information about the dominant direction of the wind and angle of the sun during summer and winter. In phase 4, the groups collaboratively constructed a scale model of the house. After the data collection, the students presented their outcomes in their own classrooms.

Data collection

In their own classrooms before the actual data collection, students completed the Task Interest Inventory self-report (Cleary, 2006), which measured their interest in science with six items (e.g., “I like studying science”) and a 5-point Likert scale (“How much do you agree or disagree with the following statements?”) ranging from 1 (strongly disagree) to 5 (strongly agree). The measurement was used to group the students into high, medium, and low interest clusters. The groups were then formed heterogeneously to include students from different interest clusters. During the data collection, the groups were working in a classroom-like learning and research space (LeaForum, University of Oulu). Phases 2–4 were recorded with the MORE observation system, which is designed to observe social-interaction in real-life contexts (e.g., classroom learning situations). It enables a multichannel approach to collect data simultaneously with three 360° cameras and multiple microphones (for technical information, see Keskinarkaus et al., 2016). In this case, three groups worked simultaneously in the LeaForum space while their collaboration was recorded with one 360° camera for each group and an individual microphone for each student. Altogether, observational data included 16 h of video (session mean duration 95 min). To track group members’ physiological activation, a measurement of electrodermal activity (EDA) was utilized. The advantage of EDA is that it produces temporal online process data with high granularity, which can be analyzed on either an individual or a group level. EDA is related to the function of sweat glands, and it is the sole measure of sympathetic arousal and, thus, closely linked to cognitive and emotional processing (Braithwaite et al., 2013; Dawson et al., 2007).

EDA measurement is divided into phasic short-term skin conductance response (SCR) and tonic skin conductance level (SCL) (Boucsein, 2012). SCR peaks are considered to be strongly associated with emotional responses caused by an external stimulus and more reactive to variations in experimental conditions than SCL (Christopoulos et al., 2016; Dawson et al., 2007). EDA values can also increase without a specific external stimulus, and those fluctuations are called non-specific (NS)-SCRs (Dawson et al., 2007). In situations with continuous stimuli, such as collaborative learning, the frequency of NS-SCRs can be used as an indicator of the current arousal state (Braithwaite et al., 2013). In this study, EDA was recorded using Empatica E4 wristbands (Empatica Inc., Cambridge, MA, USA) (Garbarino et al., 2014) placed on the non-dominant hand of each student. Empatica E4 measures EDA through two silver-coated electrodes with a sampling rate of 4 Hz. The recorded EDA data were analyzed in relation to the video data, so the duration of each student’s EDA recording used for the analysis was equal to the recorded video of the group in question.

Analysis

To find commensurable indicators of the groups’ affective state from two different data channels, both video and EDA data were segmented into 30-s segments. The 30-s period was based on the mean duration (24.6 s) of socio-emotional interaction episodes coded in the preliminary analysis of three videos, in which all episodes including socio-emotional expressions were located and the duration of each episode was coded. Furthermore, 30 s was long enough to enable valid judgements of affective states (Porayska-Pomsta et al., 2013), and it also was reasonable in terms of combining the two data sets with different granularity. For this study, the 30-s segment was also deemed valid considering the nature of the task. The focus was on collaborative group members’ affective states that were made visible in the socio-emotional interaction episodes. As stated above, preliminary data analyses revealed that, on average, these episodes spanned approximately 30 s, which provided enough time for all the participating group members to contribute to the interaction. After segmentation, both video and EDA data included 1922 30-s segments.

Phase 1: Observing the valence of the groups’ affective states

The video data were processed with Observer XT software (Noldus Information Technology). The valence of the groups’ affective state in the video segments was coded into four categories (positive, negative, mixed, neutral) based on the group members’ emotional expressions. The framework of academic emotions (Pekrun et al., 2002) and the affective circumplex model (Russell & Barrett, 1999) were used as a theoretical basis to separate expressions of positive and negative emotions. In addition, the study of Linnenbrink-Garcia et al. (2011) was utilized as an example when building the video coding scheme for emotional expressions. Before proceeding to the actual video data coding, three videos were coded multiple times, and unclear cases were discussed by two researchers to develop and adjust the coding scheme. The final criteria for emotional expressions included clearly stated verbal (e.g., “We are so good”) or clearly visible bodily (e.g., laughing, lack of focus) indicators of positive or negative affect and negatively or positively charged interactions (e.g., joking, arguing). The groups’ valence was coded as positive when at least two group members expressed clear signs of positive affect or made a positively charged comment, and as negative in the opposite cases. If the valence was mixed within one group member (e.g., negative verbal sign with positive bodily sign) or among different group members (e.g., two students had a positive and one had a negative valence), the valence was coded as mixed. Segments that included no emotional expressions or expressions from only one group member were considered neutral. The valence coding categorization is presented in Table 1. The table also illustrates how individual level emotional expressions are turned into a group level factor of the groups’ valence. For the valence coding, the interrater reliability analysis was performed for 40% of the coded videos using Cohen’s kappa statistics. Substantial agreement was reached (κ = 0.723). Then, the discrepancies between the two coders were discussed to reach a consensus for the final codes.
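The aggregation rule described above, from individual expressions to a group level valence, can be illustrated with a short Python sketch. This is only one possible reading of the coding scheme, not the coders' actual procedure: the function name `group_valence` and the label strings are hypothetical, and the handling of a single member who alone shows mixed signs (treated here as neutral, because only one member is expressing affect) is an assumption that the text leaves open.

```python
from typing import Optional, Sequence


def group_valence(expressions: Sequence[Optional[str]]) -> str:
    """Aggregate one 30-s segment into a group level valence.

    `expressions` holds one entry per group member: "positive",
    "negative", "mixed" (conflicting signs within that member), or
    None when the member showed no observable emotional expression.
    """
    expressed = [e for e in expressions if e is not None]
    # No expressions, or expressions from only one member: neutral.
    # (Assumption: this also covers a lone member with mixed signs.)
    if len(expressed) < 2:
        return "neutral"
    # Mixed within a member, or diverging valences among members.
    if "mixed" in expressed or len(set(expressed)) > 1:
        return "mixed"
    # At least two members, all with the same valence.
    return expressed[0]


# Hypothetical segments for a three-member group.
print(group_valence(["positive", "positive", None]))
print(group_valence(["positive", "positive", "negative"]))
print(group_valence(["negative", None, None]))
```

In this reading, divergence among members takes precedence over the "at least two members" criterion, which matches the example of two positive and one negative student being coded as mixed.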

Table 1 Coding categories for the valence of groups’ affective state

Phase 2: Analyzing the physiological data in relation to the video coding

The EDA data were processed using the Python programming language. The baseline was computed using a third-order low-pass filter, and NS-SCR peaks were detected using the minimum value of 0.05 µS between the baseline and peak (Boucsein, 2012; Dawson et al., 2007). The EDA data were further processed using Excel. First, each group member’s individual EDA recordings were synchronized. Second, the EDA data sets were synchronized with the groups’ video data segment coding. The number of NS-SCR peaks in each 30-s segment was counted for each student. At rest, a frequency of 1–3 peaks/min occurs (Dawson et al., 2007), and frequencies higher than 20 peaks/min are considered as high arousal (Boucsein, 2012). The criteria were applied in 30-s time frames, and frequencies from 0 to 1 were considered as low, 2–9 as medium, and 10 or more as high arousal. In order to locate the segments including physiological activation of several group members, the group level variable of the group’s activation level (activating/deactivating) was formed based on the number of students with medium or high arousal during the segment. Segments in which two or more students were in medium/high arousal were considered as activating and other segments as deactivating.
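The thresholds above can be expressed as a minimal Python sketch. It assumes that NS-SCR peak times (in seconds from the start of the recording) have already been detected for each student. The function names are hypothetical, and the sketch does not reproduce the filtering and peak detection step, whose parameters beyond the 0.05 µS criterion are not reported here.

```python
from typing import List, Sequence


def peaks_per_segment(peak_times_s: Sequence[float], n_segments: int,
                      segment_s: float = 30.0) -> List[int]:
    """Count NS-SCR peaks falling into each consecutive 30-s segment."""
    counts = [0] * n_segments
    for t in peak_times_s:
        idx = int(t // segment_s)
        if 0 <= idx < n_segments:
            counts[idx] += 1
    return counts


def arousal_level(peaks_in_segment: int) -> str:
    """Classify one student's arousal in a 30-s segment (0-1, 2-9, 10+)."""
    if peaks_in_segment <= 1:
        return "low"
    if peaks_in_segment <= 9:
        return "medium"
    return "high"


def group_activation(peak_counts: Sequence[int]) -> str:
    """Group level activation: two or more members in medium/high arousal."""
    aroused = sum(1 for n in peak_counts if arousal_level(n) != "low")
    return "activating" if aroused >= 2 else "deactivating"


# Hypothetical peak counts of three group members in one segment.
print(group_activation([2, 0, 10]))   # two members at medium/high arousal
print(group_activation([1, 0, 12]))   # only one member at medium/high arousal
```

Because the group level variable only distinguishes low from medium/high arousal, the boundary between medium and high does not affect whether a segment is labelled activating.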

Phase 3: Exploring the relationships between observed affective states and physiological activation

Since the variables were categorical and included repeated observations of the groups’ collaboration, non-parametric tests were considered suitable for testing the relationships between the different categorical variables. Accordingly, the relationships between observed affective states and physiological activation were explored with a chi-square test of independence. Significant relationships were further explored with z scores from adjusted residuals at an alpha level of 0.01 (|z| > 2.58).

Phase 4: Exploring triggers for physiological activation and activating affect qualitatively

To identify possible triggers for physiological activation, segments labelled as activating and segments preceding those were further observed and transcribed. Although, in neutral cases, the activating segments did not include observable emotional expressions, the preceding segments could include emotional expressions from the group members. From the transcribed interactions, possible triggers for physiological activation were identified using inductive qualitative content analysis (Elo & Kyngäs, 2008) focusing on the situational factors that provoked students’ emotional expressions. Two categories of factors emerged from the qualitative content analysis: task-related and socially-related factors. The interrater reliability analysis was performed for 30% of the transcribed interactions using Cohen’s kappa statistics. Almost perfect agreement was reached (κ = 0.870). Again, the discrepancies were discussed to reach an agreement between the coders. The analysis proceeded with exploring the relationships between different types of factors and observed valence of the activating segments with a chi-square test of independence. Significant relations were further explored with z scores from adjusted residuals at an alpha level of 0.01 (|z| > 2.58).
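Cohen’s kappa, which was used for interrater reliability in both coding phases, corrects the observed proportion of agreement for the agreement expected by chance. The following Python sketch shows the standard computation. The function name and the example labels are hypothetical and are not taken from the study’s data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable],
                 coder_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two coders who labelled the same units."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of units")
    n = len(coder_a)
    # Observed proportion of agreement.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if p_e == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for four transcribed interactions.
a = ["task", "task", "social", "social"]
b = ["task", "task", "social", "task"]
print(cohens_kappa(a, b))
```

In the hypothetical example the coders agree on three of four units (0.75), chance agreement is 0.5, and kappa is therefore (0.75 − 0.5) / (1 − 0.5) = 0.5.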

Results

Research question 1: how are observed group level affective states related to collaborative group members’ physiological activation?

The valence of the groups’ affective state observed from the video was neutral in 41.1% (f = 790) of the segments. Of the segments of the groups’ collaborative work, 58.9% included observable indications of affect. The valence of the groups’ affective state was negative in 27.9% (f = 537), mixed in 16.5% (f = 318), and positive in 14.4% (f = 277) of all segments. Although the groups showed observable emotional expressions over half of the time they engaged in collaborative work, the groups’ physiological activation level was deactivating in the majority of the segments (93.4%, f = 1795) and activating in only 6.6% (f = 127) of the segments.

As there was a clear difference in the frequency of manifestation of the groups’ observable emotional expressions and physiological activation states, it was next analyzed in more detail if the activating and deactivating states were related to certain types of emotional valence. Combining the valence and activation dimensions resulted in eight valence-activation pairings: positive activating, negative activating, mixed activating, neutral activating, positive deactivating, negative deactivating, mixed deactivating, and neutral deactivating. The activating affective states were rare compared with the deactivating affective states, negative activating being the most common (f = 42), followed by neutral activating (f = 36), mixed activating (f = 28), and positive activating (f = 21). In terms of the deactivating states, the groups’ affective states were mostly neutral deactivating (f = 754), followed by negative deactivating (f = 495), mixed deactivating (f = 290), and positive deactivating (f = 256). The relationships between physiological activation and observed valence of the situations were explored with a χ2 test of independence. The results showed that, in physiologically activating segments, group members also indicated significantly more emotional expressions (the interaction had emotional valence) in the video than in physiologically deactivating situations (χ2 (3) = 9.579, V = 0.071, f = 1922, p < 0.05). The valence of these situations varied between negative, mixed, and positive.

The relationships between each type of group level valence and activation level were next explored further with z scores from adjusted residuals. A significant negative relationship was found between neutral valence and physiologically activating segments (z = −3.0, p < 0.01) (Table 2). This indicated that when the group members’ interaction did not show any observable emotional expressions (valence was considered as neutral), the group members were low also in terms of physiological activation. That is, if the groups did not express affect, neither were they in a physiologically activating state as often as they were when in segments including emotional expressions. However, there were no significant relationships between positive, negative, or mixed valence and physiological activation. This indicated that, for the increased physiological activation, it did not matter what the valence of the emotional expression was.
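The test statistics above can be recomputed from the reported cell frequencies. The Python sketch below uses the standard formulas for the chi-square statistic, Cramér’s V, and adjusted standardized residuals. It is an illustrative check with hypothetical function and variable names, not the software used in the study, which is not specified.

```python
import math
from typing import Dict, List, Tuple

# Reported frequencies: valence -> (deactivating, activating) segments.
observed: Dict[str, Tuple[int, int]] = {
    "neutral": (754, 36),
    "negative": (495, 42),
    "mixed": (290, 28),
    "positive": (256, 21),
}


def chi_square_with_residuals(table: Dict[str, Tuple[int, ...]]):
    """Return chi-square, Cramér's V, and adjusted residuals per cell."""
    rows = list(table)
    n_cols = len(table[rows[0]])
    row_tot = {r: sum(table[r]) for r in rows}
    col_tot = [sum(table[r][c] for r in rows) for c in range(n_cols)]
    n = sum(col_tot)
    chi2 = 0.0
    residuals: Dict[str, List[float]] = {}
    for r in rows:
        residuals[r] = []
        for c in range(n_cols):
            expected = row_tot[r] * col_tot[c] / n
            diff = table[r][c] - expected
            chi2 += diff ** 2 / expected
            # Adjusted residual: difference scaled by its standard error.
            se = math.sqrt(expected * (1 - row_tot[r] / n) * (1 - col_tot[c] / n))
            residuals[r].append(diff / se)
    k = min(len(rows), n_cols)
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    return chi2, cramers_v, residuals


chi2, v, z = chi_square_with_residuals(observed)
print(f"chi2 = {chi2:.3f}, V = {v:.3f}")
for valence, (z_deact, z_act) in z.items():
    print(f"{valence}: z(activating) = {z_act:.1f}")
```

With these frequencies, only the neutral row yields an adjusted residual whose absolute value exceeds 2.58 in the activating column, which is consistent with the pattern described above.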

Table 2 Frequencies of different valences and adjusted residuals (z) in deactivating and activating segments

Research question 2: what kind of factors trigger physiological activation on a group level?

When exploring the situational factors triggering both emotional expressions with a different valence and physiological activation, two categories for the triggering factors emerged from the qualitative analysis: task-related (f = 68) and socially-related (f = 56) factors. Task-related factors included three qualitatively different types of activities triggering an activating affect on a group level: monitoring and reflecting, facing a challenge, and external factors (Table 3). These activities were not exclusive and could be present at the same time. In situations with monitoring and reflecting, group members were monitoring their progress in relation to time and other groups, task interest, difficulty, or emotional state, and reflecting on their performance. In situations including a challenge, group members were facing either cognitive challenges or challenges related to their skills or task equipment. External factors included task instructions given by the researcher, asking for help from or discussing with the researcher, and paying attention to the research equipment. Monitoring and reflecting served as triggers for physiological activation in general, whereas challenges were present in negative and neutral cases and external factors in positive, mixed, and neutral cases.

Table 3 Examples of task-related factors

Socially-related factors included, in positive cases, social reinforcement in which group members were joking together or praising/encouraging each other (Table 4). In negative and mixed cases, group members were squabbling with each other. Some mixed cases also included joking, which annoyed some group members and was then followed by squabbling. Most of the situations with socially-related factors (f = 41) also included a task-related factor as a trigger. Due to the low frequency of purely socially-related factors, instead of having three different categories, combinations of task- and socially-related factors were coded as socially-related. In three neutral segments, students executed the task but no visible triggering factor could be identified. Those segments were excluded from the statistical analysis.

Table 4 Examples of socially-related factors

The relationships between different types of factors and observed valence of the situations were explored with a chi-square test of independence. The relationship between these variables was significant (χ² (3) = 21.355, V = 0.415, f = 124, p < 0.001).

(a) What kind of factors trigger physiological activation in relation to activating positive, negative, and mixed affects?

In segments with mixed valence coded from the video, physiological activation was triggered significantly more often through socially-related factors (z = 3.6, f = 21, p < 0.001) than through pure task-related factors (z = −3.6, f = 7, p < 0.001). In segments with negative valence, physiological activation was triggered equally through both factor categories. In segments with positive valence, socially-related factors were identified more often, but the positive relationship (z = 1.2) was not statistically significant. Frequencies and adjusted residuals are presented in Table 5.

(b) What kind of factors trigger physiological activation without observable emotional expressions?

Physiological activation with neutral valence (i.e., segments without emotional expressions) was triggered through task-related (f = 27) and socially-related (f = 6) factors. Although, in this case, the activating segments did not include observable emotional expressions, the factors were also identified from the preceding segment, which could include emotional expressions from the group members (e.g., emotionally charged interactions such as joking or squabbling). Physiological activation in neutral segments was triggered significantly more often through pure task-related factors (z = 3.6, f = 27, p < 0.001) than through socially-related factors (z = −3.6, f = 6, p < 0.001). Frequencies and adjusted residuals are presented in Table 5.

Table 5 Frequencies of different types of triggers and adjusted residuals (z) in segments with different valences

Discussion and conclusions

This study explored the relationships between groups’ observed affective states and group members’ physiological activation in collaborative learning. The results revealed that physiologically activating situations were rare compared with the emotional expressions observed in the video. However, when several group members were in a physiologically activating state, they also showed visible emotional expressions more often than in deactivating situations. In terms of physiological activation, the results are in line with previous findings indicating that students experience quite low levels of arousal during learning situations, and simultaneous high arousal among group members is rare (Harley et al., 2019; Malmberg et al., 2018; Pijeira-Díaz et al., 2018). As discussed also by Harley et al. (2019), low arousal levels found in this study can be due to the fact that the learning session was not directly tied to students’ science grades and, thus, might not have had enough extrinsic and instrumental value to cause a high intensity affect (Lavoué et al., 2020; Pekrun, 2006). However, this does not mean that the activating affect would not be experienced or expressed in those situations. For example, higher arousal emotions might be physiologically experienced at lower levels, but based on the other components and contextual factors, still be constructed as activating affect (Barrett, 2017; Harley et al., 2015). Students’ emotional expressions observed in the video might also be related to constructing and maintaining a favorable socio-emotional atmosphere continuously through socio-emotional interaction in the course of collaborative learning and thus, not directly linked to intense affective reactions.

In terms of optimal performance, students might be expected to operate at their optimal level of arousal (Yerkes & Dodson, 1908) and, hence, might mostly be experiencing moderate levels of arousal throughout the learning process. In this study, group members’ simultaneous high arousal episodes were usually related to interactions with emotional valence, which indicated that situations with simultaneous high arousal were also emotionally relevant on a group level. According to the results, high arousal episodes might indicate emotionally relevant situations by revealing the situations in which group members jointly experience an activating affect with high intensity. Despite, or because of, their rare occurrence, these situations might reveal the critical moments during the collaborative learning process and, therefore, are relevant moments to track and study (Baker et al., 2013; Linnenbrink-Garcia et al., 2011). Recognizing emotionally critical moments could be especially important in learning settings where socio-emotional and situational cues are limited, such as online learning.

To follow up this assumption, this study continued to explore the simultaneous high arousal episodes qualitatively, in order to reveal what kind of situational factors triggered a joint physiologically activating affect among the group members. Successful collaboration requires effective coordination of cognitive and social group processes, and both can also trigger different kinds of emotions among group members (Barron, 2003; Isohätälä et al., 2018; Kwon, 2020). Results of this study reflected these general dimensions of the collaborative process and showed that convergent positive and negative activating affects were triggered equally by task- and by socially-related factors. However, when the valences of the group members’ affective states differed from each other, it typically involved some type of social factor as a trigger. Accordingly, the results indicated that socially-related factors were more likely to trigger a mixed activating affect on a group level than task-related factors. In turn, solely task-related factors were more likely to trigger physiological activation without visible emotional expressions.

It can be discussed whether certain types of socially-related factors are more likely to be appraised differently by the group members than task-related factors. While challenges in task comprehension, for example, can be addressed by referring to the task instructions if the group members share the same ultimate goal, challenges originating from more severe differences among the group members’ goals or priorities can result in conflicting views or experiences that further result in differences in group members’ emotional appraisals of the situation (Järvenoja & Järvelä, 2009). Furthermore, affect triggered by those interactions might be expressed in more covert ways and, thus, be more complex for other group members to interpret and create mixed reactions (Duffy et al., 2015). Expressions and interpretations can be even more complex in, for example, online collaborative learning contexts where group members have limited opportunities to receive and interpret emotional cues from each other. This study indicates that, when group members experience divergent affective states, they could benefit from support targeted specifically at the socio-emotional aspects of collaboration. While advanced technologies can be used to track students’ socio-emotional processes, they could also be used to support these processes and make them visible to the students (Järvelä et al., 2016b).

In this study, the most frequent task-related factors involved some type of monitoring and reflecting activities. Compared with the social factors, task-related factors might cause more convergent affective appraisals, since group members have a shared goal to aim for, and monitor and reflect on task-related factors through shared standards (Hadwin et al., 2018). Affect triggered by task-related factors might also be more acceptable to express without, for example, social masking. In this study, the remaining time was constantly visible for the groups, which might have prompted the groups to monitor their progress and increased the amount of monitoring. Still, the results support the assumption that the group level analysis of physiological data might also reveal relevant cognitive processes such as shared monitoring events (Haataja et al., 2018).

While the multimethod approach used in this study provided a possibility to explore affective processes from multiple data channels, it also posed some empirical challenges and limitations (Azevedo & Gašević, 2019). First, poor quality EDA data of some students limited the already small sample size. Since the analysis was done on a group level, problems with only one individual’s data excluded the whole group from the analysis. Given that this study was one of the first attempts to collect EDA data from primary school students during an authentic collaborative learning activity, some loss of the EDA data is understandable. Some data loss has also been documented in prior research implementing physiological measures, especially in authentic learning contexts (e.g., Pijeira-Díaz et al., 2018). Nevertheless, the data of the current study enabled a reliable analysis of the selected groups with good data quality. Second, in the EDA data analysis, a moving window with 30-s steps was used when counting the frequency of NS-SCR peaks, even though the sampling rate of 4 Hz would have enabled a moving step of 250 ms (Pijeira-Díaz et al., 2018). This was a necessary choice in order to combine the EDA data set with the video coding. With the manual video coding, it was not reasonable to code valence using, for example, a second-by-second moving window. Finally, the effect size for the relationship between the group level valence and physiological activation was small. This is understandable since variations in physiological activation can be linked to cognition and physical activity in addition to emotional activation (Dawson et al., 2007). Moreover, prior research has argued that even though emotions are multi-componential, the components are not necessarily tightly coupled (Harley et al., 2015). When taking into account these arguments, the effect size for the relationship is indicative. Considering the limitations of this study, generalizable conclusions cannot be made from the results. However, combining physiological data analysis with detailed video coding is a novel approach and, hence, gives valuable information even with the small sample size.
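The granularity trade-off described above can be made concrete with a small sketch. It is illustrative only: the 4 Hz sampling rate and the 30-s step come from the text, whereas the window length (`WINDOW_S`), the function name `count_peaks_per_window`, and the example peak times are assumptions and not study data. Peak detection itself is outside the scope of the sketch, which starts from already identified NS-SCR peak onset times.

```python
SAMPLING_RATE_HZ = 4  # stated in the text
STEP_S = 30           # stated in the text
WINDOW_S = 30         # ASSUMPTION: the window length is not given in this section


def count_peaks_per_window(peak_times_s, duration_s, window_s=WINDOW_S, step_s=STEP_S):
    """Count peaks falling in [start, start + window_s) for each window start."""
    counts = []
    start = 0
    while start + window_s <= duration_s:
        end = start + window_s
        counts.append(sum(1 for t in peak_times_s if start <= t < end))
        start += step_s
    return counts


# Hypothetical peak onset times in seconds (illustrative only, not study data).
example_peaks = [3.25, 12.0, 29.75, 31.5, 64.0, 88.25]
counts = count_peaks_per_window(example_peaks, duration_s=90)
print("peaks per window:", counts)

# One 30-s step spans 120 samples at 4 Hz; the finest possible step is one sample.
print("samples per step:", SAMPLING_RATE_HZ * STEP_S)
print("finest possible step (ms):", 1000 / SAMPLING_RATE_HZ)
```

Aligning the step with the 30-s video coding segments means each window count maps onto exactly one coded valence segment, which is the reason given in the text for not using the finer step.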

In the future, more studies with larger sample sizes are needed to explore the groups’ affective states. As sample sizes grow along with experience in integrating multiple data sources, multichannel process data will also open up many possibilities for different analytical approaches. Future studies could also utilize other complementary process-data channels. For example, since detailed manual video coding is labor-intensive, other data channels, such as facial expression recognition, could be used to track groups’ emotional valence with high granularity to reveal, for example, transitions from one affective state to another. Furthermore, log data from online collaborative learning environments and tools could be used to reveal emotion-related cognitive processes in relation to the task progress.

To conclude, while emerging technologies enable more systematic analysis and understanding of the role of emotions in collaborative learning, the information gained can also be used to design new technologies to prompt and support socio-emotional aspects of collaboration (Järvelä et al., 2016b; Lavoué et al., 2020). This study contributes to finding new ways to capture group members’ affective states during collaborative learning and sheds more light on the antecedents of group level affective experiences. Acknowledging group level factors of affective experiences can contribute to developing a theoretical understanding of emotions experienced during collaborative learning and, therefore, to harnessing the benefits of technology to support collaborative learning processes. With a more profound understanding of what different data channels indicate, the potential of multichannel process data lies in its use for designing advanced technologies that can provide learners with targeted and timely support when needed (Järvenoja et al., 2020).