1 Introduction

Collaborative learning (CL) denotes situations in which two or more students work together on a shared goal (Dillenbourg 1999). Although collaborative problem solving has been shown to be effective for learning (e.g., Kyndt et al. 2014), adequate support is necessary to ensure its effectiveness. Without such support, problems may arise, such as groups straying off-task or social conflict within a group. The field of computer-supported collaborative learning (CSCL) is based on the idea of facilitating or supporting CL by means of digital tools (Stahl et al. 2006). CSCL environments can, for example, provide collaboration scripts that aid students in interacting with each other, or explanation prompts that help students solve the task at hand. Because collaboration in CSCL occurs in a digital setting, students’ activities can often be captured automatically by the system. A wealth of data thus becomes available that can be used to study which activities predict the effectiveness of CL, and these insights can in turn be used to provide automated, real-time support that is tailored to a group’s needs. Such automated support systems have indeed been shown to be effective in contexts of individual work (Gerard et al. 2015; VanLehn 2011). However, as Gerard et al. describe, these studies mostly concern mathematical problems with clear-cut correct answers, in which case it is relatively easy to apply if-then rules to provide automated guidance. In the context of collaboration, by contrast, it is much more difficult to provide automated, adaptive support (Walker et al. 2009). This difficulty arises because students often work on open-ended problems that have multiple pathways to a solution, or even multiple correct solutions (Munneke et al. 2007). Furthermore, during collaboration students need guidance not only on the cognitive level but also, to a large extent, in the social domain: the way students collaborate and interact with each other needs to be monitored and supported (Kaendler et al. 2015). Although promising steps have been made toward adaptive collaboration support, it remains challenging to automatically determine the appropriate support at any given moment, because assistance needs to be delivered on several levels during student collaboration (Walker et al. 2009).

In the context of CL and CSCL, the teacher therefore plays a large role in ensuring that the types of interaction between students that are effective for learning actually occur (Van Leeuwen and Janssen 2019). As Swidan et al. (this issue) note, teachers need a thorough conceptual understanding of a group’s progress to provide it with adequate support. While the importance of the teacher during CL is stressed, it is also acknowledged that monitoring multiple groups at the same time, and deciding on the appropriate type of support at any given moment without disrupting the collaborative process, is a demanding task for the teacher (Kaendler et al. 2015; Van Leeuwen and Janssen 2019). A possible way to aid teachers in this task is to provide them with information about their collaborating students. Besides serving as input for direct, automated support, automatically collected data about collaborating students can also be used as input for teacher tools that aim to support students indirectly by informing the teacher of their students’ activities (Rummel 2018). The assumption is that by enhancing teachers’ understanding or diagnosis of the collaborative activities of their students, such tools enable teachers to provide more adequate support and thereby enhance the effectiveness of the collaboration. These tools are thus assumed to aid the teacher in orchestrating student collaboration, in the sense of coordinating the learning situation by managing and guiding the activities within the collaborating groups (Prieto et al. 2011). In the present paper, we focus on what we henceforth term orchestration tools: tools developed for teachers that take data about collaborating students as input and provide analyses or visualizations of these data to the teacher, for the benefit of more effective teacher guidance of student collaboration.

Orchestration tools thus build upon learning analytics, an area of research summarized as the measurement, collection, analysis, and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs (Lang et al. 2017). Recent reviews of the field of learning analytics (Papamitsiou and Economides 2014), of monitoring tools (Rodríguez-Triana et al. 2017), and of learning analytics dashboards (visual displays that provide information for students or teachers; see Schwendimann et al. 2017) all show that the teacher is increasingly being targeted as a potential user for whom analytics about students could have large benefits. Sergis and Sampson (2017) reviewed tools that support teacher inquiry in the classroom, that is, tools that allow teachers to analyze the effectiveness of their own teaching, including analytics about learners that help the teacher reflect on and improve their own practice. These reviews all point to initial findings that show the positive role learning analytics can play for teachers. On the other hand, the reviews also show that the types of analytics that are collected and displayed vary widely, and that most empirical studies evaluating the effects of such tools are exploratory in nature. Drawing firm conclusions therefore remains difficult.

Relevant to the present paper, these reviews do not specifically focus on the context of (CS)CL and the hypothesized affordances or challenges of teacher orchestration tools in that context. Sharples (2013) describes the unique nature of the teacher’s task during (CS)CL and how this task might be further complicated by the addition of orchestration tools: not only the interaction between the teacher and the tool is important, but also the interactions between the teacher and the students, between the students and the CSCL platform, and among the students themselves. As Sharples (2013) notes, careful consideration must be given to what function the orchestration tool fulfills and what this means for the teacher’s role in the classroom.

Examples of studies that examined teacher orchestration tools for supporting (CS)CL reveal considerable diversity in the types of research carried out. First, the studies differ with respect to the function that the orchestration tools fulfill. Some systems provide analytics about collaborating students to the teacher and leave all interpretation of the information up to the teacher (i.e., mirroring; for example, Looi and Song 2013). Other tools go a step further and provide alerts (e.g., Casamayor et al. 2009) or even advice (e.g., Berland et al. 2015) about how the teacher could act in a certain collaborative situation. Second, studies that evaluate orchestration tools differ in terms of their design, ranging from exploratory to experimental, and in their sample sizes.

There is currently no review of the research available that would allow us to draw even preliminary conclusions about what function an orchestration tool should fulfill, or about how strong the evidence for such claims is. Such a review could not only summarize the state of the research, but also serve as a means to identify which aspects deserve more attention in future research, and would thus be valuable input to the discussions surrounding teacher orchestration tools.

In the present paper, we aim to take a first step toward filling this gap by providing an overview of studies that describe and evaluate teacher orchestration tools in the context of student collaboration, henceforth abbreviated as TOSC tools. That is, we focus on studies that not only present an orchestration tool, but also investigate teachers’ use of the tool and the resulting influence on the teacher, the effect on the collaborating students, or both.

The remainder of this paper is structured as follows. In Sect. 2 (Method), we outline how we identified the papers included in this review and what information we extracted from each paper. In Sect. 3 (Results), we present and discuss a summary of the characteristics of the included studies. In Sect. 4 (Discussion), we outline directions for future research based on the current state of this area of research.

2 Method

The review was conducted in two steps: (1) finding relevant studies by means of a search query and a snowballing technique, and (2) extracting information from the included studies. Each step is explained in more detail below.

2.1 Step 1: Find relevant studies

As outlined above, this review concerns a very specific area of research. We therefore applied a set of strict criteria that studies had to meet to be included in the review. To summarize, we were interested in studies with the following characteristics:

  • Context: Studies had to be conducted in the context of synchronous collaborative learning, because in these contexts teachers can have a direct impact on the interaction between students. Studies dealing with asynchronous collaboration, such as discussion forums, were therefore outside the scope of this review.

  • Technology: Studies had to involve a TOSC tool, that is, an orchestration tool that displays some form of analytics to the teacher to enable them to support collaborating students.

  • Focus: Studies needed to include an evaluation of the TOSC tool, that is, a report of a situation in which teachers actually made use of the tool. Note that this means that studies that implemented the tool for teacher use but reported on student outcome measures instead of teacher measures were also included.

2.1.1 Search query

In accordance with these criteria, a search query was composed for the search engine Web of Science (see Appendix). The query consisted of three parts: one part with keywords for collaborative learning, one part with keywords concerning teacher presence, and a final part with keywords for orchestration tools. Whereas the keywords for the first two parts were quite straightforward, this was not the case for the keywords concerning orchestration tools. In this area of research, many different terminologies are employed due to the area’s interdisciplinary nature. For example, besides the more obvious terms “learning analytics”, “dashboards”, and “orchestration tool”, other terms in use include “smart classroom” (Mercier 2016) and “artificial intelligence techniques” (Slotta et al. 2013). To avoid the excess of search results we would obtain by including such broad keywords, we chose a very limited set of keywords; to make sure we would nevertheless find as many relevant papers as possible, we followed this initial, conservative search with an elaborate snowballing technique. The search query was entered into Web of Science in February 2019 and yielded 113 articles that could potentially be included in the review.
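
For illustration, a query of the three-part form described above might look roughly as follows as a topic search in Web of Science. Note that the keywords shown here are indicative only, not the query we actually used; the exact query is given in the Appendix.

  TS=("collaborative learning" OR "computer-supported collaborative learning" OR CSCL)
  AND TS=(teacher* OR instructor*)
  AND TS=("learning analytics" OR dashboard* OR "orchestration tool")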

After we read the abstracts and method sections of these papers, 9 remained that adhered to the inclusion criteria. The resulting set of papers is thus quite limited, largely because very few studies performed any kind of evaluation of the TOSC tool. Many of the search results described a TOSC tool in terms of its design and underlying analyses, but did not report the subsequent step of teachers actually using it. The review by Schwendimann et al. (2017) shows a similar pattern: none of the 21 studies they identified that targeted teachers as users of analytics dashboards included an evaluation. Similarly, a large number of our initial Web of Science search results concerned studies that analyzed student behavior in some way and indicated that the results might be useful for teachers, but did not implement or evaluate the use of the analytics for that purpose (e.g., Cukurova et al. 2018).

2.1.2 Snowballing

After we obtained these 9 studies, we used a snowballing technique to uncover other relevant studies. In particular, we used three sources as starting points. First, we checked the reference lists of the reviews by Sergis and Sampson (2017) and Schwendimann et al. (2017), which are closest in focus to our aims, for potentially relevant studies. Second, we checked the archives of the ijCSCL and AIED journals for potentially relevant articles because, again, these journals are closest in focus to our purposes. Third, we checked the reference lists of the 9 studies obtained from the Web of Science search. Snowballing led to the identification of 16 additional papers that were included in the review. Finally, we included the paper by Swidan et al. in the current special issue, leading to a total of 26 included studies.

2.2 Step 2: Extracting information from the included studies

Once the final list of studies was compiled, we extracted the following information from each included study:

  • Sample size in terms of the number of participating teachers.

  • Design of the study (e.g., descriptive or experimental).

  • The function that the TOSC tool fulfilled. The tools were classified as mirroring, alerting, or advising. By mirroring, we mean systems that provide information but do not aid in its interpretation. By alerting, we mean systems that in some way alert the teacher to important events during collaboration. By advising, we mean systems that advise the teacher about the status of the current situation or about possible ways to act to support students.

  • The type of analytics that the TOSC tool displayed. The analytics were broadly categorized as cognitive or social. By cognitive analytics, we mean indicators related to the task content, such as the number of tasks solved. By social analytics, we mean indicators related to the collaboration between students, such as graphs that display each member’s contributions to the task.

  • The actors from whom data were collected. We coded whether studies measured the influence of the orchestration tool on the teacher, on the collaborating students, or on both.

This information is summarized in Table 1 (see the Results section). Furthermore, we carefully read the Results and Discussion sections of each included article and synthesized the most important themes and discussion points the authors described concerning the usefulness of the TOSC tools and their influence on teachers’ practice.

Table 1 Included studies and their characteristics

3 Results

Table 1 displays the full list of included studies and the information that we extracted from each study. Below, we discuss each coded aspect.

3.1 Design and number of participants

The majority of the included studies (14 out of 26) had a descriptive or exploratory design. These studies either investigated qualitatively how teachers employed the TOSC tool or, if quantitative measures were taken, did not test the results statistically. They all had samples of 1 to 3 teachers, which, for example, allowed for in-depth investigation of teachers’ behavior and choices while they supported students partly through the TOSC tool (e.g., Schwarz and Asterhan 2011). In the studies with an experimental or quasi-experimental design, the number of participating teachers was also generally quite low. Examples of experimental designs include comparing teachers’ behavior or perceptions with and without a TOSC tool (e.g., Casamayor et al. 2009; Swidan et al., this issue) and comparing teachers’ interaction with the TOSC tool when supporting a small versus a larger number of collaborating groups (Chounta and Avouris 2016). Thus, with a few exceptions (e.g., Van Leeuwen et al. 2015a), the overall picture is that the conducted studies generally had small samples and that the employed methodology often entailed qualitative investigation or within-subjects comparisons.

3.2 Function of orchestration tool

The majority of the reported TOSC tools fulfilled a mirroring function: information was made available to the teacher, but further interpretation was left to the teacher. In some cases, we found combinations of mirroring and alerting functions (e.g., Schwarz et al. 2018), in which teachers could peruse the provided information and were also alerted that something might be going wrong within one of the collaborating groups. We found two studies (Berland et al. 2015; Duque et al. 2015) that evaluated an advising orchestration tool. In both cases, the tool advised the teacher on which students to pair into dyads to make their collaboration more effective.

3.3 Type of analytics and examples of displayed information

In the included studies, there was considerable variation in the types of information the TOSC tool provided to the teacher. Some tools focused on cognitive aspects of collaboration (6), others on social aspects (5), and quite a number provided both types of information (14). Frequently encountered types of cognitive information include the topics that groups are working on or discussing and the correctness of the answers that groups submit. Frequently encountered types of social information are the participation rate of each group member and the types of interactions or contributions that students engage in.

3.4 Measures and results of studies

The influence of the TOSC tools was measured in a number of ways, concerning both the teacher and the collaborating students. Most studies focused on the influence of the TOSC tool on the teacher using it (24), for example, in terms of teachers’ satisfaction with the tool (e.g., Berland et al. 2015), how teachers interacted with the tool (e.g., Voyiatzaki and Avouris 2014), or how teachers’ support of groups changed as a result of having a TOSC tool available, either in terms of awareness and diagnosis of the groups’ state (e.g., Swidan et al., this issue) or in terms of interventions (e.g., Van Leeuwen et al. 2014). Measures at the student level were encountered less frequently (12), but also showed variation. Examples include student perceptions of the collaboration (e.g., Duque et al. 2015), students’ progress on tasks (e.g., Casamayor et al. 2009), the quality of collaboration (Marcos-Garcia et al. 2015), and students’ skills as a result of collaboration (e.g., programming skills; Berland et al. 2015). The studies that focused on students all showed a positive influence of the teacher using a TOSC tool, for example, increased student programming skills in the case of Berland et al. (2015), where the teacher employed a tool that advised how to pair students.

The studies focusing on the teacher yielded a more complicated picture. On the one hand, a large part of the studies showed positive results: teachers evaluated the use of a TOSC tool in their classroom positively because it provided them with more information about their students, enhanced their diagnosis of the situation, and formed input for their further decision making (e.g., Van Leeuwen et al. 2015b). Voyiatzaki and Avouris (2014) describe in detail how teachers used the TOSC tool as a source of information in different situations, including detecting groups that may need support, finding out whether a problem is specific to one group or occurs in multiple groups, and finding out more about a group once a problem has been signaled. Also, some studies found that the TOSC tool influenced teachers’ behavior, for example, in terms of providing more support to the collaborating groups (Van Leeuwen et al. 2015a) or adjusting the runtime of activities in the classroom (Martinez-Maldonado et al. 2015b).

On the other hand, some studies showed that a TOSC tool can also hinder teachers instead of assisting them. For example, Swidan et al. (this issue) report on an alerting TOSC tool and found that teachers sometimes felt interrupted in their practice and considered their own experience a more valuable source to act on. These authors thus hypothesize that teaching experience might moderate how teachers make use of the TOSC tool and whether they find it a valuable addition to their practice; in particular, more experienced teachers might find TOSC tools disruptive to their routine. Other studies also point to teacher and contextual factors that influence how teachers use the TOSC tool, such as teachers’ beliefs about what constitutes effective collaboration (Van Leeuwen et al. 2014) and the number of groups a teacher has to monitor (Chounta and Avouris 2016).

4 Discussion

In the present paper, we reviewed studies that evaluated teachers’ use of orchestration tools in the context of student collaboration (TOSC tools). Based on our findings about the current state of research in this area, we draw a number of conclusions and derive related recommendations for future research.

First and foremost, it must be noted that this review included only a relatively small number of papers (26 studies), which often addressed exploratory research questions or used very small-scale within-subjects experiments. This finding is in line with other reviews in the area of learning analytics, for example, Schwendimann et al. (2017), who noted that the available research is largely exploratory in nature. Furthermore, although the underlying assumption of TOSC tools is that supporting the teacher can increase the effectiveness of student collaboration, few studies so far actually focus on the student level. Perhaps the research field is too young to move to this question, and the specific ways in which teachers use TOSC tools need to be addressed first. In general, though, it can be stated that there is not yet a large body of research, and that more rigorous work is needed to allow for firmer conclusions.

A second major finding is that TOSC tools vary widely in their function and in the type of information they display. On the one hand, this finding seems positive, because it means that teachers can be provided with different types of support and different types of information. For example, specific types of collaboration tasks or specific group characteristics (such as familiarity between group members) might determine which information is useful for the teacher at that moment. As Kaendler et al. (2015) and Van Leeuwen and Janssen (2019) point out, teachers need to support student collaboration in both the cognitive and the social domain. In that regard, the TOSC tools discussed in this review seem able to provide teachers with a variety of types of information that could be adapted to the specific situation. On the other hand, several studies pointed to the possible detrimental effects a TOSC tool can have if it is not tailored to a teacher’s needs. Based on this review, we would therefore argue for a teacher-centered approach to the implementation and evaluation of TOSC tools. Concerning implementation, it is important to determine beforehand what type of support a teacher most needs and which data about students are most relevant to display, and to select a TOSC tool accordingly. For example, Wiedmann et al. (this issue) propose an instrument to measure teachers’ monitoring skills, on the basis of which it could be decided whether mirroring, alerting, or advising is most appropriate.

Concerning evaluation, it is important that more research is carried out on the relation between teacher characteristics, such as teaching experience and pedagogical beliefs, and how teachers interact with the TOSC tool. The studies included in this review often provided very little description of the characteristics of the teachers in their samples, although several authors do discuss the potential relevance of such characteristics. Looi and Song (2013), for example, discuss the role of teachers’ pedagogical qualities in terms of their pedagogical and content knowledge. In that sense, the existing learning sciences literature could inform future research into orchestration tools. For example, Heitink et al. (2016) outline the skills required of teachers in the wider context of formative assessment, and stress the importance of factors such as teachers’ data literacy, teachers’ ability to provide adequate feedback, and the culture of the school in which the teacher is employed. Investigating the role of such factors in the context of TOSC tools could advance the field by providing a theoretical basis to build on.

Our last suggestion for future research is to vary and evaluate the characteristics of TOSC tools more systematically in controlled experiments, and to build on each other’s work more explicitly. As it stands, the research is quite varied in terms of the design and function of the TOSC tools under investigation. The aspects we coded in this review can be combined into specific TOSC tool designs that could subsequently be evaluated, for example, by crossing the three functions we outlined (mirroring, alerting, advising) with the types of information provided by the tool (cognitive, social, or both). It would be a good step forward if researchers employed this terminology to situate their work and to contribute to our understanding of which tools are effective and which are not (cf. Rummel 2018).

To conclude, our review leads to four important recommendations for future research. The first recommendation is to invest in larger-scale as well as more rigorous and controlled studies. The second is to also take into account the influence of TOSC tools on collaborating students as a result of their teacher using such tools. The third is to focus on the relation between teacher characteristics and teachers’ interaction with TOSC tools, to provide insight into what type of TOSC tool best serves the needs of individual teachers. The fourth is to build more systematically on each other’s work by varying and evaluating the dimensions of TOSC tools. The categorization we provide here, in terms of a tool’s function and the type of information it displays, could serve as a shared vocabulary to inform future research.