1 Introduction

Video-based learning is becoming more prominent in today's higher education and digital learning environments as a result of newer online teaching models such as flipped classrooms, Massive Open Online Courses (MOOCs), and Small Private Online Courses (SPOCs) (Chatti et al., 2016). Watching videos is regarded as the main source of content in these courses (Chen et al., 2017). Given their increasing popularity and importance, most Learning Management Systems (LMS) (e.g., Moodle, Open edX) and external tools (e.g., H5P) provide utilities for embedding interactive videos in courses and for tracing students' interactions with them (e.g., play, pause, seek) (Giannakos et al., 2015). This interaction data can serve as a valuable resource for understanding the processes involved in student learning (Khalil, 2018) and for intervening effectively in the course (e.g., video redesign, timely feedback, drop-out prediction). The data generated from learner-video interactions is frequently known as "video analytics" and can be framed under the broader umbrella of Learning Analytics. As such, this data can be combined with other course data to help understand students' behaviour and engagement within the course, thus offering new ways of supporting the teaching process (Mirriahi & Vigentini, 2017).

Previous studies have explored the use of video analytics to identify and explain learning patterns (Hu et al., 2020; Kim et al., 2014), at-risk students (Sinha et al., 2014), learning outcomes (Lang et al., 2020) and self-regulation strategies (Baker et al., 2021). Video analytics can also benefit the instructional design of courses (Wachtler et al., 2016). For example, Guo et al. (2014) identified a correlation between student attention span and the length of course videos.

However, video analytics should be informed by the course context and the learning design to better comprehend students' performance and behaviours (Gašević et al., 2016). For example, analytics of videos that provide optional or additional content should not be interpreted in the same way as analytics of videos covering the main content of the course. Similarly, while watching the videos before the lectures is highly relevant in flipped classrooms, the self-paced nature of most MOOCs makes temporality a less relevant factor for understanding students' success in the course. Course contextualization is therefore an important requirement for understanding students' behaviour through video analytics. To the best of our knowledge, previous studies have reported on the use of video analytics in one single context (e.g., MOOCs, SPOCs). However, there is a lack of studies examining whether different learning contexts are likely to generate the same or different video-related student behaviour.

Consequently, this study aims to identify commonalities and differences in video-related student behaviour in three different learning contexts whose primary source of content is videos: an online university course, a SPOC and a MOOC. The novelty of this research lies in uncovering the video analytics applied in the different learning settings and their implications for monitoring students' behaviours. We deem that this knowledge will permit a better connection among the video characteristics, the course learning objectives and the instructors' needs. Therefore, the underlying research question driving this study is: What are the commonalities and differences in student video engagement across three different learning contexts? According to Fredricks et al. (2004), behavioural engagement can be defined as the observable behaviours that represent the student's interaction and participation within the course. Transferring this definition to video engagement, behavioural engagement refers to the student's interaction and participation with course videos (e.g., the number of times a video is watched, the number of times a video is forwarded, etc.).

The remainder of the paper is structured as follows. Section 2 provides an overview of previous studies using video analytics to understand students' behaviour. Next, Sect. 3 introduces the methodological approach of this study. Section 4 describes the main findings obtained in the different contexts of the study. Then, Sect. 5 discusses the results, including the similarities and differences in video analytics, and reports the main limitations of this work. Finally, some conclusions are outlined in Sect. 6.

2 Related Work

Previous studies have explored the use and impact of video analytics on students' video-related behaviour. This section summarises these studies, organised by learning context.

2.1 SPOCs Related Work

SPOCs are online courses limited to a small group of students who typically have similar backgrounds. Unlike university courses, their duration is usually much shorter, and they can complement the contents of other university courses.

Given this context, Belarbi et al. (2019) proposed a method to profile SPOC students according to their video-related behaviour and personal interests. To that end, the authors relied on students' clickstream data with the course videos (e.g., play, pause, move forward) and on machine learning techniques. According to the authors, this type of classification permits the recommendation of videos that better fit the personal interests and individual needs of the learners. In the same context, Ding and Zhao (2020) explored the relation between learners' emotions and video engagement within a SPOC. The results showed that emotions such as boredom, excitement, annoyance and enjoyment are significant predictors of video engagement in the SPOC under study. Accordingly, the authors could predict students' behaviour with course videos based on their emotions. These two studies show the potential of using students' video clickstream data to categorise them. However, one might ask whether the same classifications would be obtained in a different learning context (e.g., a MOOC or a university course).

2.2 University Courses Related Work

Giannakos et al. (2015) proposed a video learning analytics approach to be used in online courses and provided a preliminary evaluation with university students. The authors analysed the relation between students' video navigation and their learning performance. The findings showed a significant relationship between the number of repeated views of videos and the students' performance (i.e., course scores). Some years later, the emergency remote teaching caused by the COVID-19 pandemic switched the modality of many university courses, which in some cases relied on videos as the main form of content delivery. In this context, Baker et al. (2021) explored students' engagement and perception regarding the video lectures in an undergraduate university course. The authors performed a pattern analysis exploring the number of students viewing the videos, their timing and the video repetitions. Results revealed that students tend to revisit the videos, especially at certain points that were considered critical by the instructors. The authors interpreted this as an indicator of students' self-regulation development.

2.3 MOOCs Related Work

Compared with the other learning contexts, MOOCs are open online courses that gather thousands of students with different learning interests who self-enrol in the courses. It is worth highlighting that most works relating video analytics and student behaviour concern MOOCs. This subsection summarises the most significant ones.

Sinha et al. (2014) analysed sequences of students' video clicks, such as play, pause, seek forward, and seek backward, and created an algorithm to forecast student dropout in a MOOC. Results showed that students generating a higher volume of video interactions are more likely to have a retention rate up to 37% higher than the rest of the class. Similarly, Atapattu and Falkner (2017) explored MOOC students' video interaction patterns (e.g., pause, seek video events, skip interval, speed change, show closed captions/transcripts) and their influence on students' tasks (e.g., syntactic simplicity of text). The preliminary findings revealed various correlations between students' video engagement and some features of students' discourse. Furthermore, Lang et al. (2020) explored the relationship between video play speed and students' performance in a MOOC. The authors found that the students who accelerated the videos were more likely to achieve higher scores, attempt more quizzes and finally receive a certificate. Other similar studies are those reported by Kim et al. (2014), Hu et al. (2020) and Mubarak et al. (2021). All these studies correlate or predict students' video-related behaviour with other student variables (e.g., student retention, scores and activity). Nevertheless, no further references are made regarding the transferability of these results to other learning contexts.

2.4 Limitations of Previous Studies and Significance of Current Research

In summary, previous studies analysed video analytics in different ways, probably driven by different research designs and/or by the different analytics provided by each learning platform. Yet, although there are a number of studies in the literature analysing students' video-based behaviour in different contexts, e.g., MOOCs (e.g., Lang et al., 2020), SPOCs (e.g., Ding & Zhao, 2020) and university courses (e.g., Giannakos et al., 2015), to the best of our knowledge, they do not focus on the similarities and differences of students' behaviour across the three presented learning contexts. By knowing the similarities and differences, we could better understand whether the results observed in previous studies are likely to transfer to other contexts. Although the structure and content of educational data are not consistent across different platforms (Gershon et al., 2021; Macfadyen & Dawson, 2012), in this study we attempt to homogenise the students' behaviours captured by the video analytics provided by each learning platform. The next section presents the methodology followed in this study, including a description of the different learning contexts analysed.

3 Research Background and Methodology

This section presents the methodology followed in this work: the Case Study (Yin, 1992). A case study "investigates a contemporary phenomenon within its real-life context when the boundaries between phenomenon and context are not clearly evident" (Yin, 1992, p. 123). Previous research in both the social sciences (Noor, 2008) and computer science (Dodig-Crnkovic, 2002) found the Case Study approach an appropriate research method to deliver results from different contexts in a seamless and integrated way. This methodology was selected for the following reasons (Yin, 1992):

  • Our research question is exploratory in nature, addressing the specific theme of video and learning analytics;

  • We have little control over the events under study; that is, there is no connection or relation among the three case studies; and

  • The focus is on authentic settings of online learning, within which video analytics can provide insights into students' behaviour.

The study involves three learning settings, each with a particular learning context and platform. Figure 1 depicts the features of each case study (i.e., learning platform, learning context, data collection and analysis methods) as well as the differences between the three contexts and platforms.

Fig. 1 Features of the case studies applied in this study

In order to understand student video-related behaviour, we considered the different variables (e.g., clickstream data) provided by the learning platforms hosting the different learning models (see Fig. 1).

The approach followed for both data collection and data analysis in the three case studies is mainly quantitative. The data collection is based on clickstream data capturing the events that the learners perform in the learning environments. Details on the clickstream data of the SPOC, the university course, and the MOOC are available in Table 3, Table 6, and Table 9, respectively. The data analysis methods are descriptive across the three cases, complemented by unsupervised clustering in the SPOC and university course cases, and by sequence analysis in the SPOC and MOOC cases.

3.1 Context of the Three Learning Settings

3.1.1 SPOC Setting

3.1.1.1 Course Background

The first case study describes a self-paced SPOC about Women's Health (Sexual and Reproductive Health and Rights) offered by Oslo Metropolitan University in Norway. The course is tied to the United Nations sustainability goals, including health, gender equality and poverty eradication, and it is offered at the Master's level. In addition to Norwegian students, the course was open to international students coming from Africa (particularly Ghana) and the Middle East (particularly Palestine) through a partnership with the University of Ghana and Birzeit University.

The course ran over 8 weeks from October to November 2021, with a total of 86 enrolled students.

3.1.1.2 Platform Background

The course was delivered via Open edX, a free and open-source learning platform rooted in the leading MOOC platform, edX. Open edX allows instructors to upload videos, textbooks, assignments, quizzes and surveys, and to create discussion forums, and it can be used by universities, individuals, schools, and government organisations to deliver their content to students. Video clickstream data, including detailed interactions, was collected in the SPOC case study via the Open edX Learning Analytics Tool (OXALIC), developed by researchers from the University of Bergen in Norway (Khalil & Belokrys, 2020). OXALIC provides useful representations of students' data (e.g., dashboards) to the multiple stakeholders involved in the process of improving the learning experience for students using learning analytics.

3.1.2 University Course Setting

3.1.2.1 Course Background

The second case study focused on an undergraduate course about algorithm design and development. The course was taught online during the COVID-19 pandemic. It was compulsory for first-year undergraduate students in the Computer Education and Instructional Technology department at a Turkish university. In total, 55 freshman students enrolled in the course.

3.1.2.2 Platform Background

The course was delivered online through the Moodle LMS. The synchronous weekly lectures were conducted via Zoom and recorded to be shared with students afterwards. The video recordings (or lecture captures) were uploaded to Moodle soon after the lectures. In total, there were 11 lecture captures with an average length of 80 min. The videos were lengthy since they were captures of whole lecture sessions. Since the interactions with videos embedded in a Moodle course page are not automatically traced, a custom video player was developed by the researcher. This player, in addition to tracing basic video events (such as play and pause), was designed to create a new log entry (called a signal) every 5 s while a video is being watched. These signal logs helped determine the active play time of each session.
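
To illustrate how such signal logs can be processed, the sketch below approximates the active play time per user and video by counting signal events, each representing roughly 5 s of playback. This is a minimal sketch: the file name and the columns (user_id, video_id, timestamp, event) are assumptions for illustration, not the actual log format of the custom player.

```python
import pandas as pd

# Hypothetical log format: one row per event with columns
# user_id, video_id, timestamp, event ("play", "pause", "signal", ...).
logs = pd.read_csv("video_logs.csv", parse_dates=["timestamp"])

# Each "signal" row marks ~5 s of active playback, so active play time
# can be approximated by counting signal events and multiplying by 5.
signals = logs[logs["event"] == "signal"]
active_play = (
    signals.groupby(["user_id", "video_id"])
    .size()
    .mul(5)  # seconds of active play time
    .rename("active_play_seconds")
    .reset_index()
)
print(active_play.head())
```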

3.1.3 MOOC Setting

3.1.3.1 Course Background

The third case study involves an 8-week instructor-led MOOC offered by a Spanish university on the Canvas Network platform in 2018. The topic of the course was English–Spanish translation. The course was divided into 7 weekly modules. The modules included content pages (where videos were embedded), discussion forums, and individual and collaborative activities (see Fig. 2). The course had 866 student registrations.

Fig. 2 MOOC outline including the number and position of videos per module

The course comprised 21 videos, as presented in Fig. 2. All course modules (except the first and the last) followed a similar structure:

  • One introduction video where the main concepts of the module are presented;

  • One theoretical video where the concepts addressed in the module are explained; and,

  • One summary video wrapping up the module with recommended readings and activities.

Additionally, those modules with collaborative activities (Module 4 and Module 6) included one extra video to better describe the task. Further information about the course videos is presented in Table 1.

Table 1 Statistical summary of MOOC videos according to their type
3.1.3.2 Platform Background

As stated above, the course under study was delivered on Canvas Network, a MOOC platform offering a variety of open online courses about professional development and academic inquiry. Canvas Network also hosts courses created by professors at community colleges and high school institutions. The platform supports both instructor-led and self-paced courses, with the possibility of video lectures, pdf readings, resources like wikis, discussion forums, and options for individual, peer and group collaborative activities.

4 Results

4.1 SPOC Case Study

4.1.1 Video Data and Analysis

The SPOC comprised 29 learning videos in total, whose average length is close to 15 min (SD = 09:46 min), as presented in Table 2. The total number of video sessions is 439 (Avg = 5.10, SD = 10.31 per student). Thanks to the affordances of OXALIC, students' interactions with the videos were captured, as described in Table 3.

Table 2 SPOC general video-interaction statistics
Table 3 OXALIC SPOC video-interaction metrics

For each session, several metrics were computed based on students' interactions with the videos, including video loaded, played, seeked, paused, stopped, and speeded events. The total number of video events in the studied course was 8652. For a detailed insight into the events per video, see Appendix 1.

Even though the number of learning videos in the studied course is relatively high (i.e., 29 videos), the number of video sessions is surprisingly low: on average, there were 5.1 video sessions per student. According to Table 2, the loaded and played events have the highest counts, whereas the speeded events were the least frequent. On average, only 100 video events were recorded per student.

4.1.1.1 Correlation and Sequence Analysis

In order to investigate the relationship between engagement in video activities and engagement in the rest of the course activities (e.g., self-assessment quizzes), we carried out a Pearson correlation analysis between the total number of video interactions and the total number of course activities performed by each student (see Fig. 3a). These activities capture student engagement within the course as a whole, comprising forum activities, logging in and out, and solving the assignments of the course. Prior to the correlation analysis, we validated that the relationship between the two sets of data is linear. Findings show a strong positive correlation, r(70) = 0.90, p < 0.001. For a more detailed view, we took another step and measured the correlation between play video interactions (a highly frequent video event) and the total number of activities in the course (see Fig. 3b). Again, we checked that the relationship between the two sets of data is linear. The result indicated a significant positive linear correlation between both factors, r(70) = 0.794, p < 0.001.
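
A minimal sketch of this analysis, assuming the per-student totals have already been aggregated into a table with hypothetical columns video_events and course_events, could look as follows.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical per-student totals: one row per student.
df = pd.read_csv("student_totals.csv")  # columns: video_events, course_events

# Inspect linearity first (e.g., with a scatter plot), then compute
# Pearson's r; the degrees of freedom reported are n - 2.
r, p = pearsonr(df["video_events"], df["course_events"])
print(f"r({len(df) - 2}) = {r:.2f}, p = {p:.3g}")
```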

Fig. 3 Pearson correlations in the SPOC case study. a (left): total video interactions vs. total course interactions. b (right): play video events vs. total course interactions

To further elaborate on the video engagement behaviour in this course, we analysed the sequences of video events, considering the events occurring one step before and one step after students engaged with the course videos. Figure 4 depicts a general overview of the video event sequences of the course for all students. Arrow thickness in the figure represents the frequency of occurrences: the thicker the line, the greater the number of occurrences. The overview shows that video events centre on three combinations. First, video and course navigation, with a proportion of around 80% of the overall chain of steps; course navigation in edX refers to a set of actions including hyperlink clicks and course subsection clicks. Second, video and platform interface clicks outside the course screen, with a proportion of 16.4%; the platform interface (i.e., edx.ui in the figure) refers to actions such as clicking on settings, progress, and the front panel. Third, video events and assignment page events (denoted as problem in the figure below), with less than 3% of the overall proportion. Table 4 shows detailed information on the top video sequences in the course.
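
The sketch below shows one way to compute such one-step transitions as bigram counts over each student's time-ordered event stream. The input format (columns user_id, timestamp, event_type) is an assumption for illustration.

```python
from collections import Counter

import pandas as pd

# Hypothetical event stream with columns user_id, timestamp, event_type
# (e.g., "video", "navigation", "edx.ui", "problem").
events = pd.read_csv("event_stream.csv", parse_dates=["timestamp"])

transitions = Counter()
for _, student_events in events.groupby("user_id"):
    seq = student_events.sort_values("timestamp")["event_type"].tolist()
    # Count one-step transitions (bigrams) into or out of video events.
    for prev, curr in zip(seq, seq[1:]):
        if "video" in (prev, curr):
            transitions[(prev, curr)] += 1

for (prev, curr), n in transitions.most_common(10):
    print(f"{prev} -> {curr}: {n}")
```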

Fig. 4 Overview of the video event sequences of the SPOC course

Table 4 Top video event sequences

To delve deeper into video events, we further analysed students' navigation steps before and after the specific video events (i.e., loading, playing, pausing, seeking, and speeding of the videos), as shown in Fig. 5 and Table 5. Our analysis shows that problem solving was a central event, as students exchanged moves mainly with problem-solving events. Aside from the heavy movement between videos and navigation as well as the platform interface, students moved back and forth considerably between videos and self-assessment quizzes (denoted as 'problem' in Fig. 5). Events such as 'bookmarking a module' or visits to the 'discussion forum' of the course were recorded, but these were rare.

Fig. 5 Advanced view of the video event sequences of the SPOC course

Table 5 Detailed video event sequences, excluding navigation and user interface

4.1.2 Outlook of the SPOC Case Study

In the first case study, the focus was on analysing video engagement in a course offered in a blended mode, including 29 medium-length videos with an average length of 15 min. According to the results, we noticed an interesting finding: the more video events a course generates, the more student engagement can be anticipated. While this is a result we cannot generalise to every course, our findings implicitly align with several studies that agree on video lectures being the main vehicle for increasing student active learning and participation online (Guo et al., 2014; Khalil, 2021). Although video events do not necessarily mean that students watched large parts of the videos, our subsequent analysis showed that the 'play' event is a relevant correlation factor between video engagement and general course engagement. We had anticipated this, since playing a video is a principal function. While the correlations are statistically significant, a qualitative check is required to further explore this assumption.

The analysis of learning behaviour through sequences revealed that much of the course engagement appears to be linked to auditing the course material. We observed that navigating, interacting with the platform interface, and loading videos are the most frequent interactions recorded in the log files. One might expect students to shift focus back and forth between the course modules and the self-assessment quizzes; nevertheless, students' click behaviour can be difficult to translate into learning behaviours such as cognitive engagement (i.e., the focused effort students devote to what is being taught).

4.2 University Course Case Study

4.2.1 Video Data and Analysis

Before delving into students' video activities, user activities were segmented into video watching sessions. A session starts when a new video is loaded and is considered to end when the current video is re-loaded or a different video is loaded. In total, 284,000 video events were processed, and 1245 video sessions were extracted. For each session, several metrics were computed based on students' interactions with videos. The description of these metrics is provided in Table 6.
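
A minimal sketch of this segmentation, assuming a time-ordered log with hypothetical columns user_id, timestamp, event, and video_id, is shown below: every video load event opens a new session, which covers both re-loads of the current video and loads of a different one.

```python
import pandas as pd

# Hypothetical log: user_id, timestamp, event, video_id.
logs = pd.read_csv("video_logs.csv", parse_dates=["timestamp"])
logs = logs.sort_values(["user_id", "timestamp"])

# A new session starts at every video "load" event; all subsequent
# events belong to that session until the next load.
logs["session_id"] = (logs["event"] == "load").groupby(logs["user_id"]).cumsum()

n_sessions = logs.groupby("user_id")["session_id"].nunique().sum()
print(f"{len(logs)} events segmented into {n_sessions} sessions")
```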

Table 6 University course video-interaction metrics

Cluster analysis was used to reveal the prominent interaction patterns in the students' video sessions. In the cluster analysis, the activities of the students on the video timeline were taken into account (the last metric in Table 6). At this stage, since the lengths of the videos are not standardised, a standardisation process was first carried out: the videos were divided into 100 equal parts and the number of activities that the students performed in each part was calculated. In the 1245 video watching sessions, every part of a video was coded as 1 or 0 based on whether students had any activity in that part, which resulted in 100 binary features. Then, using these features, cluster analysis was performed with the k-Means method to identify the emerging clusters of sessions with similar video watching behaviour. The k-Means clustering method aims to partition the data into the predetermined number of similar groups. While Euclidean distances are taken into account when determining similar groups, the Silhouette method is used to determine the optimal number of clusters.
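
The following sketch illustrates this procedure with scikit-learn: each session's activity positions (as fractions of the video length) are binarised into 100 bins, and the number of clusters is chosen by maximising the silhouette coefficient. The sessions_positions input is a hypothetical placeholder for the real session data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def binarise(positions, n_bins=100):
    """Code a session as 100 binary features: 1 if the session had any
    activity in that percent of the video timeline, else 0."""
    row = np.zeros(n_bins, dtype=int)
    for p in positions:  # positions are fractions in [0, 1)
        row[min(int(p * n_bins), n_bins - 1)] = 1
    return row

# sessions_positions: hypothetical list (one entry per session) of the
# relative timeline positions where the session had activity.
rng = np.random.default_rng(0)
sessions_positions = [rng.random(rng.integers(1, 30)) for _ in range(1245)]
X = np.array([binarise(pos) for pos in sessions_positions])

# Choose k by maximising the silhouette coefficient (Euclidean distance).
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)

clusters = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
print(f"best k = {best_k}, cluster sizes = {np.bincount(clusters)}")
```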

4.2.1.1 Descriptive Statistics About Videos

In Table 7, some descriptive data about the videos are provided in terms of length in minutes, total number of interaction events (e.g., play, pause), and number of users who interacted at least once with the video. The videos are listed based on the order of the weekly modules to which they belong. According to the table, the total number of events and of interacting users was higher for the videos of earlier weeks. In other words, as the course progressed, there was a decline both in the users accessing the videos and in the resulting interaction with the videos. The Loops video was an exception to this finding, since it led to a higher number of interactions although it belonged to the seventh module.

Table 7 The list of the videos and basic statistics
4.2.1.2 Session Data Statistics

For each video-interaction metric, several descriptive statistics were computed, as provided in Table 8. Also, the box plot for each video-interaction metric is provided in Fig. 6. A glance at the table reveals high standard deviations for the metric values. The box plot matrix indicates that these high standard deviations are generally caused by some outlier sessions where student behaviour was extreme. For example, while the average session duration is 45.7 min (SD = 143.6), around ten sessions lasted more than 1000 min, as observed in the box plot. Similarly, although the average number of play/pause events is around 7, there are many sessions where students played/paused videos more than 50 times. Thus, there are some sessions with exceptional video behaviour, different from the majority of the sessions.

Table 8 Descriptive statistics on video-interaction metrics
Fig. 6 Box plots for video-interaction metrics

According to the table, the average session duration was 45.7 min (SD = 143.6). 75% of the sessions lasted 40 min or less, while 50% of them lasted 10 min or less. The high standard deviation (143.6) was caused by outliers, which can be observed in the box plot. In such sessions, students paused a video and returned to watching it at a later time (e.g., the next day). For example, the maximum duration value of 2365 min (around 40 h) is an example of such a case.

The maximum and total percentage of the videos watched provide further insight into student activities during the sessions. On average, students watched 24.6% (SD = 26.4) of the videos, and in 75% of the sessions, students watched only 36% (or less) of the entire video. On the other hand, the maximum point reached, in terms of percentage, is relatively high. On average, the latest time point viewed in the timeline of a video was 65.6%. In 50% and 75% of the sessions, the latest time point reached was 80% and 97%, respectively. As seen in the corresponding box plot in Fig. 6, the values of the maximum percentage metric are more evenly distributed and no outliers were detected. Combined with the preceding result, this shows that although students skipped most parts of the videos in many sessions, they tended to check some later parts of the videos.

Among the four student interactions (i.e., play, pause, backward, and forward), forward was the most common event, with an average value of 25.4 (SD = 50.4) and a maximum value of 519. In 75% of the sessions, students forwarded a video 20 times or less. The standard deviation of 50.4 indicates a high variability in students' forwarding behaviour. This result provides additional evidence of students' behaviour of skipping most of the video (i.e., pressing forward) and focusing on the parts close to the end. Moreover, play and pause events were infrequent, with mean values of 7.7 and 7.5 (SD = 13.1), respectively. 75% of the sessions contain a maximum of 8 play or pause events. The high standard deviation (13.1) may indicate the presence of sessions with many play and pause events, reaching up to 127 and 126 in a single session. Last, backward was the least frequent event, with a mean value of 5.6 (SD = 13.1).

4.2.1.3 Clustering of the Sessions

The analysis yielded four clusters, as depicted in Fig. 7. During the sessions gathered under Cluster 1 (n = 190), students seemed to skip almost the entire first half of the video, then start watching it around the midpoint until almost the end. In most sessions of Cluster 1, the last 10% of the videos was watched less.

Fig. 7 Session-based clustering (C denotes cluster)

Compared to Cluster 1, Cluster 2 (n = 137) contained sessions where students watched most parts of the videos. Although the viewing rates decreased from the midpoint until the end of the videos, this cluster differs from Cluster 1 in that the first 70% of the videos was mostly watched. In the sessions falling under Cluster 3 (n = 753), the videos were watched very little. Last, in the sessions of Cluster 4 (n = 165), there was a high and increasing watching trend during the first 30% of the videos, which then dropped suddenly. This is the exact opposite of the pattern observed in the sessions of Cluster 1, where students were more active during the last 30% of the videos. Further statistics for each cluster can be found in Appendix 2.

4.2.2 Outlook of the University Course Case Study

In the second case study, the focus was on video engagement with lengthy lecture-capture videos, which are recordings of entire live lectures. According to the results, in most sessions students tended to watch some segments of the videos or to open a video just to search for specific content (as suggested by the high rates of forwarding and the small portions watched on average). Students were also inclined to focus on the second part of the videos (mostly toward the end), which can be partially explained by the fact that the instructor of the course made some announcements at the end of each live lecture.

In relation to the emerging clusters of sessions, students seemed to watch mostly the first half in some sessions, and mostly the second half in others. This may imply that students tended to watch a complete video in two parts, which is not a surprising attitude considering the length of the videos. There was still a considerable number of sessions where students watched the entire video at once, suggesting the existence of some persistent students. However, in the majority of sessions, students viewed only a very small portion of the videos. That is, in many cases, students might have quickly visited videos to check something without the intention of watching the entire video (i.e., knowledge catch).

4.3 MOOC Case Study

4.3.1 Video Data and Analysis

The integrated video system of Canvas Network does not record interactions with videos, such as the number of times that a video was paused or the percentage of the video watched. Instead, it records page views and the time spent on each page. Therefore, considering that videos are the main content of video pages (some of which also include a textual description), students' video engagement was calculated as the number of times that a video page was visited, the time spent on the video pages, and the number of students who watched each video. In order to diminish the noise caused by students' accidental visits to video pages, we excluded the visits to video pages that lasted less than 20 s. Table 9 describes the clickstream data used to analyse video engagement.
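
A minimal sketch of these engagement metrics, assuming a page-view log with hypothetical columns user_id, page_id, page_type, and duration_seconds, could be:

```python
import pandas as pd

# Hypothetical page-view log exported from the platform.
views = pd.read_csv("page_views.csv")

# Keep only visits to video pages lasting at least 20 s, filtering out
# accidental visits.
video_views = views[
    (views["page_type"] == "video") & (views["duration_seconds"] >= 20)
]

# Per video page: number of visits, unique students, and time spent.
engagement = video_views.groupby("page_id").agg(
    n_views=("user_id", "size"),
    n_unique_students=("user_id", "nunique"),
    total_time_seconds=("duration_seconds", "sum"),
)
print(engagement.sort_values("n_views", ascending=False))
```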

Table 9 MOOC video-interaction metrics
4.3.1.1 Video Engagement

Figure 8 shows the number of times that video pages were viewed (blue) and the number of unique students viewing them (red). The results show that the number of views decreases during the first weeks of the course and then remains stable, following a trend similar to the change in the number of active MOOC students. Moreover, the median number of visits for the first three theory video pages was 2 (i.e., students visited the video pages twice), while the median for the last three theory video pages was 1. This result suggests that student video engagement also decreased throughout the course (as did the number of active students). Additionally, in the last modules, the theory videos were the ones watched by the most unique students (as compared with other video types), suggesting that several students were not interested in watching videos that were not relevant for passing the course.

Fig. 8 Evolution of video page visits throughout the course

As another indicator of video engagement, the maximum time that videos were watched by unique students was computed (see Fig. 9). Results revealed that for most theory videos, at least half of the students stayed on the page for as long as the video duration (except for modules 2 and 6, the longest theory videos of the course). This result suggests that the length of the videos (12:04 min and 5:15 min, respectively) is an important factor for video engagement in MOOCs. In this analysis we can also see that the summary videos were the least interesting videos for students (see N for the "summary videos" in Fig. 9), and that although the task description videos were watched by many students, most of them were not watched completely. This can probably be explained by the fact that these pages contained the textual description of the collaborative tasks, and students preferred to read it instead of watching the video. Further research would be needed to understand the students' actual reasons.

Fig. 9 Boxplots of the maximum time that video pages were watched by unique students. The red line shows the length of the corresponding video

4.3.1.2 Video Sequence

In order to better understand students' video-related behaviour, we further explored the pages visited by students before visiting the video pages. It is important to consider that in this instructor-led course, modules were opened weekly; therefore, students could not proceed entirely at their own pace. The results show that students mostly followed the learning design path configured by the teacher (see red colour in Fig. 10). The Modules page, from which students can view the overview of the course (yellow colour), was the second most frequent option, and jumping from one video to another non-consecutive video in the learning design (green colour) was the third.
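
The sketch below shows one way to label the page visited immediately before each video page as following the learning design path, coming from the course overview, or jumping from another video. The navigation log and the design_order mapping are hypothetical stand-ins for the Canvas Network data.

```python
from collections import Counter

import pandas as pd

# Hypothetical navigation log: user_id, timestamp, page_id, page_type.
nav = pd.read_csv("navigation.csv", parse_dates=["timestamp"])
nav = nav.sort_values(["user_id", "timestamp"])

# Hypothetical mapping: each page's position in the teacher's design.
design_order = pd.read_csv("design.csv").set_index("page_id")["position"]

def categorise(prev, curr):
    """Label the page visited right before a video page."""
    if prev["page_type"] == "modules_overview":
        return "course overview"
    if design_order.get(prev["page_id"]) == design_order[curr["page_id"]] - 1:
        return "learning design path"
    if prev["page_type"] == "video":
        return "jump from another video"
    return "other"

counts = Counter()
for _, g in nav.groupby("user_id"):
    rows = g.to_dict("records")
    for prev, curr in zip(rows, rows[1:]):
        if curr["page_type"] == "video":
            counts[categorise(prev, curr)] += 1
print(counts.most_common())
```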

Fig. 10 Most visited pages before watching course videos. Red: previous course activity as configured in the learning design. Yellow: course overview page listing all the activities. Green: video pages

4.3.2 Outlook of the MOOC Case Study

In the third case study, we explored students' video engagement in the context of a MOOC in terms of visits to the video pages, time spent on the video pages, and the number of students who watched each video. The evidence gathered revealed that video engagement decreased during the course enactment, a fact consistent with the decrease in the general engagement of the students. Indeed, previous studies (Aldowah et al., 2020; Er et al., 2019; Gregori et al., 2018) criticised the low engagement of MOOC students as the course advances. Additionally, we found that video engagement is related to the content and the length of the video itself. Concretely, the examined MOOC consisted of four different video types (see Table 1), and the theoretical videos attracted the most attention. Similarly, the longer videos, such as the ones related to task descriptions, were not completely watched. Nevertheless, further research is needed to draw conclusions about the extent to which video content and video duration affected student engagement.

5 Discussion on the Three Case Studies

This section discusses the commonalities and differences among the three cases (as stated in the research question) and presents the implications for future research.

First, despite the fact that the three courses were delivered on disparate learning platforms (each providing different analytics), meaningful and similar insights regarding video-related behaviour were obtained. Whereas in the SPOC case the Open edX platform already has the integrated capability of recording student interactions with videos, in the university course case study the researcher implemented a custom video player plug-in to log the interaction data, which was otherwise impossible with the default player. Although such interaction data were invaluable, the MOOC case showed that alternative analysis approaches can still be effective in deriving useful insights into students' video engagement. However, the incompatibility of learning platforms' affordances for tracing video interaction data remains a barrier to conducting comprehensive video analytics with fine-grained time-stamped data. At this point, one important implication for such platforms is to add a feature for tracing video interaction data based on standards, and possibly to provide a (dashboard) interface to help practitioners understand and extract actionable insights from such data. In this regard, some techniques (such as automatically adding a log entry every 5 s while a video is being watched) can be implemented to capture student behaviour with higher accuracy.

Among the similarities between the three case studies, we can highlight that the content type and the length of the videos were two critical factors affecting student video-related behaviour. In all cases, the number of students watching the videos showed a decreasing trend throughout the course. However, while this decrease was slight for the theory videos in the university case (dropping from 46 to 27 students, 41.30%), in the MOOC it was more severe (dropping from 312 to 109 students, 65.06%), even though the videos in the university case were much longer (on average, 83.2 min versus 6.04 min per video, respectively). Nevertheless, such a decline in student engagement in MOOCs should be interpreted within the massive context of these learning settings, which are characterised by progressively lower levels of student interest during the course enactment (Gregori et al., 2018).

Another similarity observed was that students' viewing behaviour varied more in longer videos than in short videos, and this required a different analysis approach, thus highlighting the influence of learning design when interpreting video analytics results. For example, in the university case, cluster analysis was powerful in identifying groups of sessions with distinct student behaviour (such as sessions with initial engagement in videos versus sessions with later engagement). Such analysis is unlikely to yield meaningful clusters of student behaviour in short videos. Moreover, the results indicate that the video content itself can affect students' video engagement. For instance, in the MOOC case, the findings show a relation between the video content and the number of times that students watched particular course videos. Specifically, students watched the videos related to the course theory more frequently than the others, omitting other types of videos offered by the course instructor, e.g., videos structuring the weekly modules and their learning objectives. This finding may indicate that the primary aim of the students was to pass the course and that they thus focused on the videos offering the content knowledge required for the assignments. Additionally, the results revealed that the two longest theoretical videos were the only ones not fully watched by most of the students. Another interesting finding concerns the threshold at which students stopped watching videos in the different cases. For the MOOC case, this threshold was near 5–6 min; for the SPOC, about 10 min; while in the university case the median session duration was 10 min. Longer videos appear to negatively impact learners' viewing behaviour in terms of the number of logged events, as previously discussed in Kim et al. (2014).

The pedagogical intentions behind videos (and therefore the way they are integrated into the courses) are important for determining how to analyse video-related behaviour. For example, in both the SPOC and MOOC cases, the courses were highly structured, guiding students to follow a specific learning path. In these courses, short videos were placed at specific locations as essential course elements to introduce or explain concepts, with some prior or posterior learning activities. Most students followed the learning path configured by the teacher, i.e., students performed the activities, including the visualisation of videos, in the order that the teacher designed. Differently, in the university course context, long lecture-capture videos were uploaded to the LMS after the class, in case students wanted to refer to them later for reviewing the concepts. That is, students could choose to watch these videos freely at any time, and watching them was not necessarily important for all students (especially for those who also participated in the live lectures). Therefore, checking what students did before/after videos was not meaningful in the university case study, while this analysis brought critical insights in the MOOC case, where each video had a specific purpose in the course and was placed carefully before or after certain content or activities.

Regarding the main differences, we observed the existence of different video types according to the learning context of each case. While many MOOCs frequently incorporate types of videos other than the theoretical ones, university courses do not. This difference confirms the importance of teacher input for video analysis, indicating the contents of the video and its relation to the other course components (e.g., the answers to this quiz are explained in this theoretical video). In the same vein, previous literature discussed the key role that course instructors should play in the design, selection and/or sense-making of learning analytics features (Wiley et al., 2020). In video analytics, instructor insights may help to better understand student behaviours in the context of the course particularities and objectives.

With respect to practical implications, we found that carrying out video analytics at scale is complex. Our empirical attempt to synchronise efforts between the three different contexts to analyse at scale was hindered by different data ecosystems. Data transferability regulations and the lack of systemic approaches obscure the successful scaling of video analytics. We corroborate this finding with Ferguson et al. (2014), who stated that "Transferability is a key factor…analytic and predictive models need to be reliable and valid" (p. 121) to scale learning analytics. Educational platforms are rapidly changing, thereby altering data and making it more disparate, heterogeneous, and complex to enable scalable analytics (Rahmani et al., 2021). Grounded in our experience from this study, we argue that policy agreements between universities are necessary to transform the trajectory of analytics into fruitful scalable analytics.

An implication we would like to raise from this study is the difficulty of defining "video sessions". As Kovanović et al. (2015) discussed, sessions depend on timeout intervals. While web-based learning management systems, including MOOC and SPOC platforms, employ advanced technologies to capture student interactions within the platforms, learning sessions, including video sessions, are rarely well defined (Chitraa et al., 2010; Kovanović et al., 2015). Our practical inference is the need for video session identification in order to understand the time spent watching learning videos. This is helpful for learning analytics applications to create and further develop personalised interventions and to extend the understanding of learning behaviour.

Video watching strategies may vary depending on the task or the time. For example, students are expected to watch a video slowly and interactively while doing an assignment or preparing for an exam, whereas they are expected to watch it faster while searching for specific information. For this reason, the analysis in the university case was performed at the video session level instead of the student level. In future studies, the situations in which students exhibit these watching behaviours, as well as the relationship of video watching behaviour with students' individual characteristics and academic performance, can be investigated. Such an analysis can help determine students' learning approaches (Akçapinar et al., 2020) or self-regulated learning skills (Fan et al., 2022), and provide important data-driven evidence for the determination of instructional design principles related to video-based learning (Kim et al., 2014).

Learning analytics dashboards are widely used tools for delivering learning analytics interventions to students. Communicating the results of video analytics through dashboards can provide students with feedback on their learning processes and enables teachers to see insights such as the completion rates of videos and how they are watched. In this way, teachers may have the chance to identify early the students who are likely to fail the course, or to receive feedback on their video designs.

5.1 Theoretical and Practical Implications

Building on the context above, we envision that the theoretical and practical implications of this analysis extend in two directions:

  1. The consideration of the learning design in the design and interpretation of the video analytics; and

  2. The reflection of video-related aspects when designing courses based on videos as the main content material.

Our analysis indicated that various learning design aspects may impact students' video engagement differently. For instance, the evidence gathered indicated that in the university course, some video lessons showed high engagement at the beginning of the video and others at the final moments of the video. Additionally, the MOOC case revealed that varying levels of engagement occurred on different types of videos (e.g., content or summary videos). Furthermore, as stated above, the students' objective while watching the video (e.g., refreshing knowledge, being informed about a concrete aspect) can result in different engagement behaviours. Therefore, we deem that considering the learning design (e.g., learning goals, learning topic, context, and the course sequence in relation to the videos) and the students' objectives when selecting or fine-tuning the video analytics could better explain video-related behaviours. A participatory approach, actively involving the course teacher in defining the requirements or the technical solutions of the video analytics, could support the consideration of the learning design and thus the generation of more informative insights on students' engagement. Indeed, the course teachers are the ones who can indicate the key videos of the learning process (given their type and topic) and their relation to the rest of the course, and/or the crucial video checkpoints (within the same video) that could permit a deeper interpretation of students' video engagement.

With respect to the video sequences, an implication is the possibility of gaining a good understanding of how learners in online environments may make use of video prompts and interventions, such as self-regulated learning prompts and motivational incentives. Some studies have explored how video sequences hold that potential in MOOCs (Wong et al., 2019) and in campus courses delivered through an LMS (Zhou & Bhat, 2021).

Additionally, the three cases indicated the key role of the content type, the length, and the aim of the video itself in students' engagement. For example, longer videos showed different engagement in the SPOC and the university course cases. Likewise, the different types of videos (in terms of contents) resulted in different student behaviours. These aspects should be considered a priori by the course teacher or the instructional designer during the design of the video-based course itself, in order to result in more meaningful learning. We deem that a set of guidelines or a framework could support a better integration of the videos in association with their context (e.g., MOOC, SPOC, university course) and their aim.

5.2 Limitations of the Current Study

This work has some limitations. With respect to the context, we acknowledge that the analysed courses are divergent in terms of learning model, content, video volume, video length, and language across the three cases. Moreover, working across contexts from different countries has most likely increased the variability in learning characteristics, skills, and other differences rooted in each culture, which may have some effect on the results. Although this diversity introduces a considerable level of complexity in conducting joint work, it also offers rich and invaluable opportunities for further collaboration between the three contexts.

With respect to video analytics, as commonly known in data and learning analytics, collecting data traces is rather constrained to the log files residing in the examined platform (Khalil et al., 2022). That is, we do not report video analytics for students who, for instance, download the learning videos to their local machines or watch them on external websites (e.g., YouTube). In addition, we acknowledge that we lack the exact time students spent watching the learning videos; this is very difficult to measure in the three systems used (Open edX, Canvas Network, and Moodle). Another limitation we want to highlight is the absence of analyses correlating videos with completion/certification rates. Only the third case study (the MOOC) had access to such data; we therefore excluded the synthesis of retention to keep a consistent workflow.

Further limitations are associated with the methodology that guided the current study. The emphasis on each single case led us to a deep understanding of the video analytics in three different authentic scenarios (Yin, 1992) through the employment of multiple types of data sources and the thick description of the context (Guba, 1981). Nevertheless, we acknowledge that the presented study cannot support generalisable results. Indeed, given the interpretative approach followed, we are interested in achieving transferability, or else 'naturalistic' generalisation, rather than 'scientific' generalisation (Stake, 1978, p. 6). That is, our intention is for readers to gain insights by reflecting on the presented findings and on the extent to which these findings can serve other cases related to video analytics in MOOCs, SPOCs, or university settings. Apart from the methodological constraints, we encountered data constraints as well. Specifically, data collection concerned each country separately, due to GDPR restrictions and the absence of a valid consent agreement to exchange data across countries inside and outside the borders of Europe.

6 Conclusions and Future Directions

The proliferation of videos in today's digital learning environments calls for further investigation and synthesis of an increasing variety and volume of video data. The current paper employs various video analytics techniques and explores students' video interactions and engagement in three different case studies. The analysis offered for each case study contributed to a better understanding of student behaviours occurring in the context of MOOCs, SPOCs and university courses. Concretely, according to the evidence gathered, the first case study, which took place in a SPOC, revealed a strong correlation between video engagement and general course interactions. The second case study, which took place in a university context with online learning, revealed distinct behavioural patterns in watching videos. Finally, the last case study, which regarded a MOOC, highlighted the relation of video engagement with the video content and the video length.

As a future direction, our aim is to build upon the findings of this study by emphasising the significance of expanding video analytics at scale. This can be achieved by enabling learning analytics, specifically for videos, across multiple learning platforms, through facilitating data log communication and presenting the information to the stakeholders. Enabling video analytics at scale will allow researchers to develop comprehensive analytics engines that work across multiple learning platforms, and will also help teachers and students take advantage of actionable suggestions that could improve their current situation.