In 2012, Greenberg & Zanetis stated: “Video appears poised to be a major contributor to the shift in the educational landscape, acting as a powerful agent that adds value and enhances the quality of the learning experience” (p.4). Some 3 years later, Woolfit (2015, p.2) confirmed that videos had indeed exploded onto the Higher Education scene, and in their annual statement Kaltura (2015) state: “video is permeating our educational institutions, transforming the way we teach, learn, study, communicate, and work”. Siemens et al. (2015) refer to the consequences of this as a ‘thinning of classroom walls’, where learners are now able to access a wealth of material from a range of technologies, in particular videos. The prominence of educational videos in higher education is therefore at unprecedented levels; however, the impact they are having is not fully known. Stigler et al. (2015, p.15) argue that videos are yet to have an impact on learning and teaching, whereas Hansch et al. (2015, p.1) and Hoogerheide et al. (2016, p.22) acknowledge that the use of video in learning is now taken for granted despite “a relative lack of evidence as to video’s effectiveness for learning”. This study therefore intends to measure the impact of a quiz in a video within a higher education context. For the purpose of this study, video is defined as digitally recorded presentation slides with an audio voice-over.

Literature Review

The use of a quiz question within an educational video appears to be intuitive for a number of reasons. Merkt et al. (2011, p.700) have highlighted that quiz questions increase learners’ engagement and motivation to learn. Szpunar et al. (2014, p.163) reported that students showed more mind wandering and made fewer notes when they watched a video with no quiz present. In addition, Delen et al. (2014, p.314) concluded that quiz questions supported self-regulation when watching educational videos, while Cummins et al. (2016, p.57) reported that a quiz question allows the learner to receive immediate feedback while watching an educational video, which in turn frees up time for more focused face-to-face teaching sessions.

On the other hand, it could be argued from the Cognitive Theory of Multimedia Learning (CTML; Mayer 2009) that a quiz question can be a distraction if it is not linked to the learning goal, and thus should be omitted in order to avoid cognitive processing overload. However, to date, very little research has directly evaluated the impact, in terms of student grades, of a quiz question in an educational video (Cummins et al. 2016, p.59).

One study (n = 223) by Shelton et al. (2016) concluded that students’ self-reported perceptions were rated significantly higher for enhanced videos (with quiz questions embedded) than for common videos (no quiz questions) on the four themes of their research: student engagement, scaffolding learning, learning gains and student accountability. Post video quiz scores were significantly higher with the enhanced videos compared to the common videos. Students’ perceptions indicated that they were more engaged as they did not know when a quiz question would occur, and that a quiz question highlighted to the students the content that the learning designers considered important. Finally, students commented that the quiz questions meant they felt they had to watch the whole video, which in turn could have an impact on pedagogical integrity.

However, there were concerns expressed by the students. It was reported that it was not necessary to watch the videos to succeed on the post video quizzes, thus providing a negative perspective for pedagogical integrity. Consistent with Mayer (2009), some students commented that the embedded quiz questions were a distraction, which caused loss of interest or caused the big message of the video to be missed. In addition, it was reported that the embedded quiz questions led to anxiety for some students, which prevented any learning gains from taking place.


Method for Data Collection

During February/March 2017 data was collected from two modules (POD2014, n = 23 and MKT1022, n = 102) from different faculties at UoN. POD2014 is a second year module in Podiatry, and MKT1022 is a first year module in Marketing. All available students agreed to take part and both modules undertook the same initial process for collecting data. The researcher, in collaboration with each module leader, amended existing videos used in the previous academic year. In one video, quiz questions were embedded throughout the video (video 3), and in another, quiz questions were added at the end (video 2). The scores of the quiz questions within each video were recorded, but they were not the focus of this study and are not used in the analysis. A further video was used in both modules that was not amended from the previous academic year, hence there were no quiz questions present in this video (video 1). One week after each video was made available on the virtual learning environment (VLE), students were invited to take part in a non-assessed multiple choice quiz on the content of that video. The test questions were written by the module leader, to ensure reliability in terms of terminology used, and were also marked by the respective module leaders to ensure consistency and validity of scores awarded. The scores were recorded and will be referred to as video 1 test, video 2 test, and video 3 test for each module in the following sections.

Due to the different requirements of each module leader, the subsequent data collection methods differed between the modules. The data collection method for each module is therefore described separately below.

Method for Data Collection for POD2014

On completion of all three tests, students on module POD2014 were invited to take part in a hard copy questionnaire which investigated the students’ perceptions of each of the videos. The questionnaire had a mixture of closed and open ended questions. A chronological plan for data collection for POD2014 can be seen in Fig. 1.

Fig. 1 Method for data collection for POD2014

Method for Data Collection for MKT1022

Students from module MKT1022 did not complete a questionnaire, but they did complete an additional assessed multiple choice quiz, to measure the retention of knowledge on the content of all three videos, 1 week after video 3 test. This was an online test and scores were captured on the institution’s VLE (Fig. 2).

Fig. 2 Method for data collection for MKT1022

Method for Data Analysis

All quantitative data collected was entered into IBM SPSS Statistics (v.22) for analysis. In advance of collecting data, an a priori sample size calculation was performed in G*Power (v.3.1) for each test. Sample sizes were thus determined in order to detect a significant difference, if one existed, for effect sizes as classified by Cohen (1992, p.157). For all sample size calculations the significance level was α = 0.05 and the power (1 − β) was 0.8. It was found that every test had sufficient sample size, and thus power, to detect at least a large effect size.
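To illustrate what an a priori calculation of this kind involves, the following is a minimal sketch using a normal approximation for a simple two-group comparison of means. It is not the G*Power procedure used in this study, whose repeated measures designs require different formulas; the function name `n_per_group` is hypothetical, and the effect sizes 0.2, 0.5 and 0.8 are Cohen's conventional small, medium and large values for a standardised mean difference.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate per-group sample size for a two-sided, two-group
    comparison of means, using the normal approximation.

    effect_size is a standardised mean difference (Cohen's d).
    This is an illustrative assumption, not the G*Power algorithm.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)          # quantile corresponding to 1 - beta
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Conventional small, medium and large effect sizes
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Because the normal approximation ignores the extra uncertainty in the t distribution, it tends to slightly underestimate the sample size that an exact calculation would give, so it is best read as a rough lower bound.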

Limitations to Study

This research has only just commenced the second cycle of an action research process. It has completed the following stages: planning, action, observing, reflection, and planning again; however, the study would have benefitted from a further cycle of action, observing and reflection. Due to the availability of students, no follow-up interviews were able to take place. Although the data collected for this study follows the design and was considered to be appropriate for this project, there is limited use of qualitative data. It is therefore advised that any future research is designed to incorporate student interviews in order to allow for a deeper understanding and additional triangulation.


Descriptive statistics are presented for each module separately and then inferential statistics will follow. Since the two modules have a common design element, results will be reported on both modules together in this section. However, the additional analysis for each module (assessed test for MKT1022, and questionnaire for POD2014) will be reported separately.

Out of the 102 students on module MKT1022, only 32 (31.4%) watched all three videos and took the seminar multiple choice tests; therefore only these students are used for comparisons of seminar test scores. The means and standard deviations for seminar scores (see Table 1) show, for both modules, that there was an improvement in scores from video 1 to video 2, and again from video 2 to video 3. These results are further investigated in upcoming sections to determine what generalisable conclusions can be made.

Table 1 Descriptive statistics for seminar tests broken down by module and video type

Differences in Video Tests for MKT1022

In order to evaluate whether the differences found for MKT1022 could be generalised, a one-way repeated measures ANOVA was conducted to determine whether there were statistically significant differences in seminar scores following each video format. The data was normally distributed, as assessed by boxplots, histograms, and skewness and kurtosis values. The analysis therefore proceeded under parametric assumptions. The assumption of sphericity was met, as assessed by Mauchly’s test of sphericity, χ2(2) = 0.866, p = 0.116. There was a significant difference between the three video multiple choice test results, F(2, 62) = 63.849, p < 0.0005, partial η2 = 0.673. Post hoc analysis with a Bonferroni adjustment revealed that scores increased statistically significantly from video 1 to video 2, with an increase of 7.188% (95% CI, 2.1 to 12.3, p < 0.005), and from video 1 to video 3, with an increase of 22.813% (95% CI, 16.8 to 28.9, p < 0.0005). There was also a significant increase from video 2 to video 3, with a difference of 15.625% (95% CI, 11.2 to 20.0, p < 0.0005).

It can be concluded from this data that a quiz question in a video resulted in students scoring significantly higher on a subsequent video test compared to no quiz at all. Moreover, a quiz embedded throughout the video generated significantly better video test scores compared to a quiz at the end of a video. Thus it could be argued that the format for video 3 is better for students retaining that information in the short term.
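The partial η2 values reported alongside each F statistic in this paper can be recovered from the F value and its degrees of freedom, since partial η2 = (F × df effect) / (F × df effect + df error). The sketch below is illustrative only; the function name `partial_eta_squared` is hypothetical and is not part of the SPSS output used in the study.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom.

    Follows from F = (SS_effect / df_effect) / (SS_error / df_error) and
    partial eta squared = SS_effect / (SS_effect + SS_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# One-way repeated measures ANOVA for MKT1022: F(2, 62) = 63.849
print(round(partial_eta_squared(63.849, 2, 62), 3))
```

For F(2, 62) = 63.849 this rounds to the reported value of 0.673, which offers a quick consistency check on reported ANOVA results.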

Differences in Video Tests for POD2014

In order to test whether there were statistically significant differences in video test scores based on when the video was watched, a two-way mixed ANOVA was run. The within-subjects element of the design was the test score on each video test and the between-subjects element was when the student watched the video. Due to the relatively small number of students in this part of the study, the time period when the students watched the video was collapsed into two categories: within a few days of the video becoming available, and on the day or night before the video test.

Interestingly, all video tests resulted in higher scores for the group who watched the day before compared to the group who watched within a few days of the video becoming available (see Table 2).

Table 2 Descriptive statistics for POD2014 module broken down by when the videos were watched

Analysis of the data showed that there were no outliers, as assessed by the boxplot. The data was normally distributed, as assessed by Shapiro-Wilk’s test of normality (p > 0.05). There was homogeneity of variances (p > 0.05) and covariances (p > 0.05), as assessed by Levene’s test of homogeneity of variances and Box’s M test, respectively. The assumption of sphericity was met, as assessed by Mauchly’s test of sphericity, χ2(2) = 0.918, p = 0.425. There was no statistically significant interaction between the time watched and video test score, F(2, 42) = 0.249, p = 0.781, partial η2 = 0.012. Since there was no significant interaction, main effects were examined for further clarity. The main effect of video tests showed a statistically significant difference between the different video tests, F(2, 42) = 11.465, p < 0.0005, partial η2 = 0.353. The main effect of the time before the videos were watched showed that there was no statistically significant difference in video test scores, F(1, 21) = 0.732, p = 0.402, partial η2 = 0.034, meaning that over the three videos combined, video test scores were not dependent upon when the students watched the videos (see Fig. 3).

Fig. 3 Test scores broken down by when students watch videos

The main effects of video test scores do not distinguish between when the videos were watched and therefore simply evaluate the differences in video test scores in isolation. Overall, the mean scores for video tests were 63.8% for video 1, 74.2% for video 2, and 78.5% for video 3. There were statistically significant differences in video test scores, and pairwise comparisons confirmed that those significant differences were between video 1 and video 3, difference 14.7% (95% CI, 4.8 to 16.1%, p < 0.0005), and video 1 and video 2, difference 10.5% (95% CI, 4.8 to 16.1%, p < 0.001). Although the video test score for video 3 was larger than for video 2, this was not a significant increase, difference 4.3% (95% CI, −2.5 to 11.0%, p > 0.05). Again, as per the results for MKT1022, having a quiz in a video resulted in significantly higher video test scores than not having a quiz. On this occasion, however, although embedded quiz questions scored higher than quiz questions at the end of the video, the difference was not significant, albeit with a smaller sample size.

Differences in Assessed Test for those Students Watching (and Not Watching) each Video and Taking the Video Tests for MKT1022

Due to a number of students either not watching the videos or not taking the video multiple choice quizzes, and the fact that all students took the assessed test, it was decided to investigate the assessed test scores comparing those students who had watched the videos (n = 32) and those students who had not (n = 70). The mean assessed score for those who had watched the videos was 50.2% (standard deviation 8.8), and the mean assessed score of those students who had not watched the videos was 43.2% (standard deviation 10.8). A Shapiro-Wilk test was carried out to test for normality and indicated that the data was approximately normally distributed (p > 0.05). Therefore an independent t-test was run to investigate whether there were differences between the two groups on assessed scores.

There was homogeneity of variance, as assessed by Levene’s test for equality of variance (p = 0.187), and a difference of 7.0% in scores corresponded to a statistically significant difference (t(95) = 3.173, p = 0.002, d = 0.685). Therefore it can be concluded that watching the videos has led to an increased score in the assessed test compared to not watching the videos. While this result is not surprising, it is interesting to note that the majority of students (68.6%) did not watch all the videos even though they were directed to do so.
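The effect size d reported for this comparison can be reproduced from the summary statistics alone. The sketch below assumes that d is Cohen's d computed with a pooled standard deviation and that the group sizes are the 32 and 70 students stated in the text; the function names are hypothetical and the calculation is an illustration, not the SPSS procedure used in the study.

```python
from math import sqrt


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> float:
    """Cohen's d: the mean difference in units of the pooled standard deviation."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Watched the videos: mean 50.2, SD 8.8, assumed n = 32
# Did not watch:      mean 43.2, SD 10.8, assumed n = 70
d = cohens_d(50.2, 8.8, 32, 43.2, 10.8, 70)
print(round(d, 3))
```

Under these assumptions the result rounds to the reported d = 0.685, which falls between Cohen's conventional medium (0.5) and large (0.8) thresholds.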

This leads the research on to the final analysis for this module, and probably the most important for this data set. In order to investigate whether there was a statistical interaction of watching the videos, with performance in the assessed test, a two way mixed ANOVA was run. The within element was the assessed test scores for each content and the between element was whether the students watched the videos. The means and standard deviations for these students can be found in Table 3 and the corresponding box and whisker plot is displayed in Fig. 4.

Table 3 Descriptive statistics for MKT1022 for assessed test broken down by if students watched videos and completed seminar test
Fig. 4 Distributions broken down by if students watched videos and completed MCQ seminar test

When running the two-way mixed ANOVA it was found that there were no outliers, as assessed by the boxplot. The data was normally distributed, as assessed by Shapiro-Wilk’s test of normality (p > 0.05). There was homogeneity of variances (p > 0.05) and covariances (p > 0.05), as assessed by Levene’s test of homogeneity of variances and Box’s M test, respectively. Mauchly’s test of sphericity indicated that the assumption of sphericity was met for the two-way interaction, χ2(2) = 0.983, p = 0.440. There was a statistically significant interaction between watching the videos and specific content assessed performance, F(2, 188) = 2.707, p < 0.05, partial η2 = 0.028. The statistical interaction between the assessed results for the material in each video and whether the student had watched the video is further highlighted in the profile plot (Fig. 5).

Fig. 5 Test scores broken down by if students watched videos and completed MCQ seminar test

Whereas there is not much difference in the assessed scores across the three video formats for those students who did not watch the videos, there appears to be a considerable difference for those students who did watch each of the three videos. To further examine these differences in scores, simple main effects were investigated. This examines the students who had watched the videos and those who had not separately, to determine whether there were differences in scores between the content areas within each group. Hence a separate one-way repeated measures ANOVA was run for those who had watched the videos and those who had not.

First of all, investigating those students who had not watched the videos found, as expected, that there were no significant differences in assessed scores broken down by the three content sections, F(2, 126) = 0.222, p = 0.801, partial η2 = 0.004. However for the group who did watch the videos, post-hoc analysis identified statistically significant differences in assessed scores for content from video 1 and content from video 3, with a difference of 9.06% (95% CI, 0.19 to 17.94%, p < 0.05). Although there was an improvement in assessed scores for content from video 3 compared to video 2, this was not quite considered to be a significant increase, with a difference of 6.25% (95% CI, −1.12 to 13.62%, p = 0.094).
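The link between the reported confidence intervals and the significance decisions can be made explicit: a two-sided 95% confidence interval for a mean difference that excludes zero corresponds to significance at the 5% level, provided the interval and the p value are computed on the same adjusted basis. The short sketch below applies this check to the two intervals reported above; the function name `ci_excludes_zero` is hypothetical.

```python
def ci_excludes_zero(lower: float, upper: float) -> bool:
    """Return True when a confidence interval lies entirely above or below zero."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower > 0 or upper < 0


# Content from video 3 vs video 1: 95% CI 0.19 to 17.94
print(ci_excludes_zero(0.19, 17.94))
# Content from video 3 vs video 2: 95% CI -1.12 to 13.62
print(ci_excludes_zero(-1.12, 13.62))
```

The first interval lies wholly above zero, in line with the significant difference reported between video 1 and video 3 content, whereas the second spans zero, in line with the non-significant result (p = 0.094) for video 3 compared to video 2.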

These results provide salient findings which will be expanded in the discussion section. First, performance on the assessed content was significantly better when the students had watched the videos. Moreover, among those students who watched all the videos, the assessed score for content from video 3 was statistically significantly higher than the assessed score for content from video 1, and higher (though not significantly) than that for video 2. Furthermore, the increase in assessed scores between those students who had watched and those who did not watch the videos was greatest for video 3 (quiz questions embedded). This again implies that quiz questions embedded throughout are better for helping students to retain information.

Results from Questionnaire

In addition to exploring the impact of video test scores, and identifying the viewing behaviours of students watching the video, the questionnaire used on module POD2014 generated data on students’ perceptions of the videos. Students were requested to rate each individual video for usefulness and quality on a one to five scale (five being the best score). The data did not show any significant differences between the videos, and the means and standard deviations are reported in Table 4. The results for both usefulness and quality of learning were virtually identical for each video; as far as the students are concerned, and contrary to their test results, they did not detect a difference in any of the videos in terms of usefulness or quality of learning. However it must be mentioned that the scores for both usefulness and quality of learning were very high (the lowest score given was a three out of five). Hence, it is possible that students’ scores reflect the content provided and not necessarily the different formats as intended.

Table 4 Descriptive statistics for POD2014 on usefulness and quality of learning for each video type

In addition to the quantitative results, qualitative data was obtained via the questionnaire for this module. The main area of interest was to find out what the students thought about having a quiz in the video. Overall, out of the 23 students, 15 students made only positive comments, one student made both positive and negative comments, and seven students did not comment. Following a thematic analysis, the positive comments were categorised into three main themes, in decreasing order of popularity: i) helps understanding and knowledge; ii) increases attention and engagement; iii) provides immediate feedback. The only negative comment was regarding the questions being a distraction from the content delivered, but this was only provided by one person.

Examples provided by students to illustrate the three themes are provided below. Most comments received were regarding the questions supporting understanding and knowledge. Comments such as: “the questions provide me a measurement of my own understanding”, “they check what I understand”, and “the questions are useful to test my knowledge” were received. The second theme, in terms of frequency of comments, was regarding attention and engagement. Students provided the following comments that contributed to this theme: “questions in the video makes me pay attention”, “questions help me interact with the video”, and “quizzes make sure you stay engaged”. The final theme established was connected to feedback. Comments received were: “I liked to know if my answers were correct” and “it was helpful to get the answers immediately”. Although this has been classified as a separate theme as the comments directly relate to feedback, it is recognised that this could also be a sub-theme of knowledge and understanding, as feedback indirectly relates to understanding.


Impact of Videos on Video Test Scores

The results from this study have unveiled some very interesting findings. It has been shown that scores in both modules were higher in video test 3 compared to video test 2, and higher in video test 2 compared to video test 1. This result is consistent with research carried out on this topic (Delen et al. 2014, p.318; Merkt and Schwan 2014, p.431; Shelton et al. 2016, p.64). Therefore, given the results found in this study and in similar research, it is reasonable to conclude that the quiz questions embedded in a video have positively impacted on video test scores.

On the contrary, one possible explanation that could be put forward to counter the effectiveness of a quiz question in a video could be that the quiz questions in the video closely resembled those in the video tests. Thus, it could be argued that lecturers (subconsciously or not) could be prompting students to the questions in the video tests. If this was the case, students might simply be remembering the answers to the questions and thus learning had not taken place from the videos. However, following informal discussions with the module leaders who designed the questions in this instance this scenario is considered unlikely. Moving forward, it is recommended in any further research, the questions are analysed to ensure this issue can be fully dismissed.

Impact of Videos on Assessed Scores

Similar results were also found when evaluating the assessed scores from module MKT1022. Scores were significantly higher for content taken from video 3 compared to video 1, video 2 compared to video 1, and higher (not significantly) for content taken from video 3 compared to video 2. Thus, it is reasonable to again conclude that quiz questions in the videos have positively impacted on results. However, whereas the seminar tests were all taken within the same time period of the videos becoming available (1 week), the assessed test was taken after varying time periods of each video release. The summative assessment was taken 1 week following video test 3, two weeks following video test 2, and 3 weeks following video test 1. Thus, it could be argued that students did better in the assessment for content from video 3, due to the fact that the content was delivered more recently and was fresher in the students’ minds. Any further research should ideally be a balanced design to negate the discrepancy between the time period of each video and the subsequent assessment.

However, these claims only have validity when investigating the differences between scores for those students who watched the videos. For those students who did not watch the videos, the effect of time has been removed. Furthermore the results for this group showed that there were no significant differences in the assessed scores for content from each of the three videos. This therefore indicates that questions in the assessed test arising from each of the videos were, as designed, equal in difficulty level. As a consequence, since the assessed results for the students who watched the videos were significantly different in the three content areas, this has to be accredited to the differences in the video format as opposed to the amount of time between the videos.

This argument is further supported by the fact that the majority of the videos in this study were watched the day before the video tests. However, there was no significant difference in results between those students who watched the day before compared to those students who watched earlier in the week. The time when the video was watched did not impact on results, thus giving more weight to the premise that differences in scores are due to the format of the video rather than being a result of when the video was watched.

Another interesting finding emerged when the results of the assessed test for the students who watched the videos and those who did not were compared. As expected, the students who watched the videos outperformed the students who did not watch the videos. This could be due to the positive impact of the videos or simply that more conscientious students are more likely to engage with the resources provided. Regardless of the reason, this result reassuringly suggests that learning designers are not wasting their time by producing educational videos. As a side issue, and beyond the limits of this study, it will be worthwhile to pursue further research on resources that will motivate and engage all students, although the discussion on viewing behaviour will be expanded upon in the next section.

Students’ Perceptions of Videos - Advantages

Qualitative comments made by students in this study on the benefits of a quiz question mainly concentrated on understanding and knowledge, engagement, and feedback. Clearly there are some cross overs in terms of these themes, however the comments derived from this study resonate closely with the scaffolding learning and learning gains themes identified by Shelton et al. (2016, p.468). Comments such as ‘helps monitor understanding’ and ‘retain information’ were frequently recorded in this study, which reinforced the scaffolding and learning gains themes. The similarities between these studies therefore provide the researcher with greater confidence to conclude that embedded quiz questions positively impact on the understanding and ability of the student to retain information.

In this study some students perceived an added benefit of having immediate feedback on their response to the question. In the videos with questions, responses were identified as correct or incorrect. However, research carried out by Johnson & Priest (2014, p.460) found that feedback is significant in the learning process and furthermore that explanatory feedback is considerably more effective than simple corrective feedback (as per this study). Therefore it is suggested that the videos could be further improved by providing explanations with the feedback, and this could lead to students’ perceptions of the videos with quiz questions being further improved.

Students’ Perceptions of Videos - Disadvantages

With regard to the disadvantages of a quiz question within a video, this study again found similar results to Shelton et al. (2016, p.470). Firstly, and most importantly, it must be recognised in both studies that the majority of students did not comment on any perceived disadvantages. The only negative comment made in this study related to the possibility of a quiz question distracting students. Again, the findings of this study are given more credibility as Shelton et al. (2016, p.471) also reported similar comments. These results are further supported when principles of multimedia design (Mayer 2009) are taken into account. The consequence of including a quiz question embedded in a video might be that students lose interest, that they focus on the question rather than the content of the video, or that the quiz question might provoke anxiety, leading to confusion.

Although learning designers should be mindful of the disadvantages discussed, it should be reiterated that the vast majority of students did not highlight any disadvantages of a quiz question in an educational video. Therefore it is suggested that the disadvantages do not outweigh the benefits of a quiz question, and attempts should be made to address the disadvantages rather than removing quiz questions from videos.


Finally, it is noteworthy that the students’ qualitative comments regarding their perceptions of a quiz in a video did not match their ratings for each of the three videos. The ratings were very similar, suggesting that students did not differentiate between the usefulness and quality of the three video formats; however, the qualitative comments provided show overwhelming support for the videos with quizzes embedded. Ideally, follow-up interviews would have clarified this discrepancy, and these will be included in further research.

One possible explanation of the discrepancy could be the impact of the lecturer on student responses. One criticism of action research as a methodology is a conflict of role of the ‘insider’ researcher (Dover 2008). In this instance it is possible that the lecturer who administered the questionnaire influenced the students’ responses in the questionnaire. The same lecturer was the person who created the videos and marked the video tests; therefore this may have inadvertently resulted in the high ratings for all videos. However, this possible threat to the validity of student feedback by no means diminishes the impact that quiz questions have had on both video test scores and assessed test scores.

Recommendations and Future Research

The credibility of action research is measured according to whether the actions arising from it solve problems (Hannay et al. 2003, p.123). Although it is acknowledged that generalisations are limited from this research, given the results of this study combined with the limited research undertaken in this area the following recommendations are made:

  1. If content is provided to students in the form of an educational video, then it is encouraged that the video has quiz questions embedded throughout as standard.

  2. Distractions should be kept to a minimum. This can be achieved by following the principles of multimedia learning and ensuring that questions are a continuation of the video and not simply an add-on (Mayer 2009).

  3. Any responses (correct or incorrect) to questions should be immediately followed by explanatory feedback.

  4. Full training should be provided on how a student can navigate throughout the video. This training becomes more relevant for the novice learner (Gajos et al. 2014).

This study has benefitted from a full cycle of the action research process (plan, act, observe, reflect) and has now moved on to the plan stage of the second cycle. It has therefore developed a plan of action for further research. It would be valuable to investigate whether future research provides similar results for different levels of study (levels 4, 5, 6 and 7). Moreover, further research specifically regarding the short-term and long-term impact of embedded quiz questions within a video is recommended. Any future research should also include analysis of the actual questions used in the quiz and how they impact results. For example, can learning take place at higher levels of Bloom’s taxonomy to incorporate evaluating and synthesising questions (Anderson 2014, p.31)? It is also recommended to research whether there is an optimum number of questions to be asked in a video, or whether there is a point at which an additional question will have no impact or even a negative impact. Finally, any future research should be designed to incorporate an emphasis on generating more qualitative data, as this will go beyond what is happening and address why it is happening.


This study has shown that adding questions to a video will improve the recall of knowledge of students on a subsequent test. Moreover, the most effective position of those questions for short term recall is when embedded throughout the video and not grouped together at the end. In terms of viewing behaviour, the time period during which the video was watched did not impact on test scores. The evidence provided in this study together with the results of other studies reviewed, has given the researcher the confidence to develop and pursue this topic further. Although it is recognised that there will not be a magic formula found, it is strongly believed that there will be recommended guidelines unearthed. It is acknowledged that further studies are required to fully comprehend the complexities of a quiz question in an educational video. That said, it is the conclusion of this study, based on the evidence generated, that quiz questions embedded throughout a video are the most effective format for producing an educational video.