Descriptive statistics are presented for each module separately, followed by inferential statistics. Since the two modules share a common design element, results for both modules are reported together in this section. However, the additional analysis for each module (the assessed test for MKT1022 and the questionnaire for POD2014) is reported separately.
Out of the 102 students on module MKT1022, only 32 (31.4%) watched all three videos and took the seminar multiple-choice tests; therefore only these students are used for comparisons of seminar test scores. The means and standard deviations for seminar scores (see Table 1) show, for both modules, an improvement in scores from video 1 to video 2, and again from video 2 to video 3. These results are investigated further in the following sections to determine what generalisable conclusions can be drawn.
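For illustration, summaries of the kind reported in Table 1 could be reproduced with a few lines of Python. This is a minimal sketch assuming long-format data; the column names and scores are invented stand-ins, not the study's files.

```python
import pandas as pd

# Made-up long-format data: one row per student per video test.
# Column names are illustrative, not the study's actual variables.
scores = pd.DataFrame({
    "module": ["MKT1022"] * 6,
    "student": [1, 1, 1, 2, 2, 2],
    "video": ["v1", "v2", "v3", "v1", "v2", "v3"],
    "score": [55, 60, 75, 65, 70, 85],  # percentage marks (invented)
})

# Mean and standard deviation per module and video format, as in Table 1
print(scores.groupby(["module", "video"])["score"]
            .agg(["mean", "std", "count"]))
```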
Table 1 Descriptive statistics for seminar tests broken down by module and video type
Differences in Video Tests for MKT1022
In order to evaluate whether the differences found for MKT1022 could be generalised, a one-way repeated measures ANOVA was conducted to determine whether there were statistically significant differences in seminar scores following each video format. The data was normally distributed, as assessed by boxplots, histograms, and skewness and kurtosis values; therefore the analysis proceeded under parametric assumptions. The assumption of sphericity was met, as assessed by Mauchly’s test of sphericity, χ²(2) = 0.866, p = 0.116. There was a statistically significant difference between the three video multiple-choice test results, F(2, 62) = 63.849, p < 0.0005, partial η² = 0.673. Post hoc analysis with a Bonferroni adjustment revealed that scores increased statistically significantly from video 1 to video 2, by 7.188% (95% CI, 2.1 to 12.3%, p < 0.005), and from video 1 to video 3, by 22.813% (95% CI, 16.8 to 28.9%, p < 0.0005). There was also a significant increase from video 2 to video 3, with a difference of 15.625% (95% CI, 11.2 to 20.0%, p < 0.0005).
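The pipeline described above (Mauchly’s test, one-way repeated measures ANOVA, Bonferroni-adjusted post hoc comparisons) could be reproduced along the following lines. This is a sketch using the pingouin library; the simulated data frame, its column names and its score distributions are invented stand-ins, not the study's data.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Simulated stand-in for the 32 complete cases (long format: one row
# per student per video test); means and spreads are invented.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "student": np.repeat(np.arange(32), 3),
    "video": np.tile(["v1", "v2", "v3"], 32),
    "score": rng.normal([60.0, 67.0, 83.0], 10.0, size=(32, 3)).ravel(),
})

# Mauchly's test of sphericity for the within-subjects factor
print(pg.sphericity(df, dv="score", within="video", subject="student"))

# One-way repeated measures ANOVA, reporting partial eta-squared
print(pg.rm_anova(df, dv="score", within="video", subject="student",
                  effsize="np2", detailed=True))

# Post hoc pairwise comparisons with a Bonferroni adjustment
print(pg.pairwise_tests(df, dv="score", within="video",
                        subject="student", padjust="bonf"))
```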
It can be concluded from this data that a quiz question in a video resulted in students scoring significantly higher on a subsequent video test compared to no quiz at all. Moreover, a quiz embedded throughout the video generated significantly better video test scores than a quiz at the end of the video. Thus it could be argued that the format of video 3 is better for students retaining that information in the short term.
Differences in Video Tests for POD2014
In order to test whether there were statistically significant differences in video test scores based on when the video was watched, a two-way mixed ANOVA was run. The within-subjects factor was the score on each video test and the between-subjects factor was when the student watched the video. Due to the relatively small number of students in this part of the study, the time period when students watched the video was collapsed into two categories: within a few days of the video becoming available, and on the day or night before the video test.
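A mixed design of this kind specifies one within-subjects and one between-subjects factor. The sketch below uses pingouin's mixed_anova; the simulated data, group labels and column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Simulated stand-in (23 students, three video tests each); 'when'
# codes early watching vs the day before. Names are assumed.
rng = np.random.default_rng(2)
n = 23
df = pd.DataFrame({
    "student": np.repeat(np.arange(n), 3),
    "when": np.repeat(rng.choice(["early", "day_before"], n), 3),
    "video": np.tile(["v1", "v2", "v3"], n),
    "score": rng.normal([64.0, 74.0, 78.0], 12.0, size=(n, 3)).ravel(),
})

# Two-way mixed ANOVA: within = video test, between = when watched.
# Output includes rows for each main effect and the interaction.
aov = pg.mixed_anova(df, dv="score", within="video", between="when",
                     subject="student")
print(aov[["Source", "F", "p-unc", "np2"]])
```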
Interestingly, all video tests produced higher scores for the group who watched the day before than for the group who watched within a few days of the video becoming available (see Table 2).
Table 2 Descriptive statistics for POD2014 module broken down by when the videos were watched
Analysis of the data showed that there were no outliers, as assessed by boxplot. The data was normally distributed, as assessed by Shapiro-Wilk’s test of normality (p > 0.05). There was homogeneity of variances (p > 0.05) and covariances (p > 0.05), as assessed by Levene’s test of homogeneity of variances and Box’s M test, respectively. The assumption of sphericity was met, as assessed by Mauchly’s test of sphericity, χ²(2) = 0.918, p = 0.425. There was no statistically significant interaction between the time watched and video test score, F(2, 42) = 0.249, p = 0.781, partial η² = 0.012. Since there was no significant interaction, the main effects were examined for further clarity. The main effect of video test showed a statistically significant difference between the video tests, F(2, 42) = 11.465, p < 0.0005, partial η² = 0.353. The main effect of when the videos were watched showed no statistically significant difference in video test scores, F(1, 21) = 0.732, p = 0.402, partial η² = 0.034, meaning that over the three videos combined, video test scores did not depend on when the students watched the videos (see Fig. 3).
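The assumption checks listed above could be scripted as follows; a sketch assuming simulated wide-format data, with scipy for the Shapiro-Wilk and Levene tests and pingouin for Box's M and Mauchly's test. All names and values are invented for illustration.

```python
import numpy as np
import pandas as pd
import pingouin as pg
from scipy import stats

# Simulated wide-format stand-in: one row per student, one column per
# video test, plus the between-subjects grouping. Not the study's data.
rng = np.random.default_rng(3)
n = 23
wide = pd.DataFrame(rng.normal([64.0, 74.0, 78.0], 12.0, size=(n, 3)),
                    columns=["v1", "v2", "v3"])
wide["when"] = rng.choice(["early", "day_before"], n)

# Shapiro-Wilk normality within each group for each video test
for g, sub in wide.groupby("when"):
    for v in ["v1", "v2", "v3"]:
        print(g, v, stats.shapiro(sub[v]))

# Levene's test for homogeneity of variances at each video test
groups = [sub for _, sub in wide.groupby("when")]
for v in ["v1", "v2", "v3"]:
    print(v, stats.levene(*[g[v] for g in groups]))

# Box's M test for homogeneity of covariance matrices
print(pg.box_m(wide, dvs=["v1", "v2", "v3"], group="when"))

# Mauchly's test of sphericity on the repeated measures
long = wide.reset_index().melt(id_vars=["index", "when"],
                               value_vars=["v1", "v2", "v3"],
                               var_name="video", value_name="score")
print(pg.sphericity(long, dv="score", within="video", subject="index"))
```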
The main effects for video test scores do not distinguish between when the videos were watched and therefore evaluate the differences in video test scores in isolation. Overall, the mean scores were 63.8% for video 1, 74.2% for video 2, and 78.5% for video 3. There were statistically significant differences in video test scores, and pairwise comparisons confirmed that the significant differences were between video 1 and video 3, difference 14.7% (95% CI, 4.8 to 16.1%, p < 0.0005), and between video 1 and video 2, difference 10.5% (95% CI, 4.8 to 16.1%, p < 0.001). Although the video test score for video 3 was higher than for video 2, this was not a significant increase, difference 4.3% (95% CI, −2.5 to 11.0%, p > 0.05). Again, as per the results for MKT1022, having a quiz in a video resulted in significantly higher video test scores than not having a quiz. On this occasion, however, although the embedded quiz questions again produced higher mean scores, the position of the quiz questions made no statistically significant difference compared with quiz questions at the end of the video, albeit with a smaller sample size.
Differences in Assessed Test Scores for those Students Watching (and Not Watching) each Video and Taking the Video Tests for MKT1022
Because a number of students either did not watch the videos or did not take the video multiple-choice quizzes, while all students took the assessed test, it was decided to compare the assessed test scores of those students who had watched the videos (n = 32) with those who had not (n = 70). The mean assessed score for those who had watched the videos was 50.2% (standard deviation 8.8), and for those who had not it was 43.2% (standard deviation 10.8). A Shapiro-Wilk test indicated that the data was approximately normally distributed (p > 0.05). Therefore an independent-samples t-test was run to investigate whether the two groups differed in assessed scores.
There was homogeneity of variance, as assessed by Levene’s test for equality of variances (p = 0.187), and the difference of 7.0% in scores was statistically significant (t(95) = 3.173, p = 0.002, d = 0.685). Therefore it can be concluded that watching the videos led to a higher score on the assessed test compared to not watching the videos. While this result is not surprising, it is interesting to note that the majority of students (68.6%) did not watch all the videos even though they were directed to do so.
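The comparison reported here (Shapiro-Wilk, Levene's test, an equal-variances t-test and Cohen's d) can be sketched as follows. The score vectors are simulated stand-ins generated from the reported group means and standard deviations, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated stand-ins for the two groups' assessed-test scores
# (the study reports n=32, mean 50.2, SD 8.8 vs n=70, mean 43.2, SD 10.8)
watched = rng.normal(50.2, 8.8, 32)
not_watched = rng.normal(43.2, 10.8, 70)

# Normality and homogeneity-of-variance checks
print(stats.shapiro(watched), stats.shapiro(not_watched))
print(stats.levene(watched, not_watched))

# Independent-samples t-test, pooled variances (as Levene's test allowed)
t, p = stats.ttest_ind(watched, not_watched, equal_var=True)

# Cohen's d using the pooled standard deviation
n1, n2 = len(watched), len(not_watched)
sp = np.sqrt(((n1 - 1) * watched.var(ddof=1) +
              (n2 - 1) * not_watched.var(ddof=1)) / (n1 + n2 - 2))
d = (watched.mean() - not_watched.mean()) / sp
print(f"t={t:.3f}, p={p:.4f}, d={d:.3f}")
```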
This leads to the final analysis for this module, and probably the most important for this data set. To investigate whether there was a statistical interaction between watching the videos and performance in the assessed test, a two-way mixed ANOVA was run. The within-subjects factor was the assessed test score for each content area and the between-subjects factor was whether the students had watched the videos. The means and standard deviations for these students can be found in Table 3, and the corresponding box-and-whisker plot is displayed in Fig. 4.
Table 3 Descriptive statistics for MKT1022 assessed test broken down by whether students watched the videos and completed the seminar tests
When running the two-way mixed ANOVA it was found that there were no outliers, as assessed by boxplot. The data was normally distributed, as assessed by Shapiro-Wilk’s test of normality (p > 0.05). There was homogeneity of variances (p > 0.05) and covariances (p > 0.05), as assessed by Levene’s test of homogeneity of variances and Box’s M test, respectively. Mauchly’s test of sphericity indicated that the assumption of sphericity was met for the two-way interaction, χ²(2) = 0.983, p = 0.440. There was a statistically significant interaction between watching the videos and assessed performance on the specific content, F(2, 188) = 2.707, p < 0.05, partial η² = 0.028. This interaction between the assessed results for the material in each video and whether the student had watched the video is further highlighted in the profile plot in Fig. 5.
Whereas there is little difference in the assessed scores, broken down by the three video formats, for those students who did not watch the videos, there appears to be a considerable difference for those students who did watch each of the three videos. To examine these differences further, simple main effects were investigated: the students who had watched the videos and those who had not were analysed separately to determine whether scores differed across the content areas. Hence a separate one-way repeated measures ANOVA was run for those who had watched the videos and for those who had not.
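Simple main effects of this form are typically obtained by splitting the data by group and running the repeated measures ANOVA within each subset. The sketch below follows that pattern with invented data, group sizes matching the study (32 watched, 70 did not) and assumed column names.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Simulated stand-in: assessed scores for three content areas, split by
# whether the student watched the videos. Means are invented.
rng = np.random.default_rng(4)
rows = []
for group, n, means in [("watched", 32, (46.0, 49.0, 55.0)),
                        ("not_watched", 70, (43.0, 43.0, 43.0))]:
    scores = rng.normal(means, 10.0, size=(n, 3))
    for i in range(n):
        for j, content in enumerate(["c1", "c2", "c3"]):
            rows.append({"student": f"{group}{i}", "group": group,
                         "content": content, "score": scores[i, j]})
df = pd.DataFrame(rows)

# Simple main effects: one-way repeated measures ANOVA within each group,
# followed by Bonferroni-adjusted pairwise comparisons
for group, sub in df.groupby("group"):
    print(group)
    print(pg.rm_anova(sub, dv="score", within="content", subject="student"))
    print(pg.pairwise_tests(sub, dv="score", within="content",
                            subject="student", padjust="bonf"))
```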
First, for those students who had not watched the videos, as expected there were no significant differences in assessed scores across the three content sections, F(2, 126) = 0.222, p = 0.801, partial η² = 0.004. However, for the group who did watch the videos, post hoc analysis identified statistically significant differences in assessed scores between content from video 1 and content from video 3, with a difference of 9.06% (95% CI, 0.19 to 17.94%, p < 0.05). Although there was an improvement in assessed scores for content from video 3 compared with video 2, this increase did not reach statistical significance, with a difference of 6.25% (95% CI, −1.12 to 13.62%, p = 0.094).
These results provide salient findings which will be expanded on in the discussion section. First, performance on assessed content significantly improved when students had watched the videos. Moreover, among those students who watched all the videos, the assessed score arising from content in video 3 was statistically significantly higher than that from content in video 1, and higher (though not significantly so) than that from video 2. Furthermore, the increase in assessed scores between those students who watched and those who did not watch the videos was greatest for video 3 (quiz questions embedded). This again implies that embedding quiz questions throughout a video is better for helping students retain information.
Results from Questionnaire
In addition to exploring the impact on video test scores and identifying students’ viewing behaviours, the questionnaire used on module POD2014 generated data on students’ perceptions of the videos. Students were asked to rate each individual video for usefulness and quality of learning on a one-to-five scale (five being the best score). The data did not show any significant differences between the videos; the means and standard deviations are reported in Table 4. The results for both usefulness and quality of learning were virtually identical for each video: as far as the students were concerned, and in contrast to their test results, they did not detect a difference between the videos in terms of usefulness or quality of learning. However, it must be noted that the ratings for both usefulness and quality of learning were very high (the lowest score given was a three out of five). Hence, it is possible that students’ ratings reflect the content provided and not necessarily the different formats as intended.
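The section does not name the test used to compare the ratings; one plausible nonparametric choice for repeated ordinal (one-to-five) ratings is Friedman's test, sketched below with invented placeholder ratings purely for illustration.

```python
from scipy import stats

# Invented ratings: one list per video, one entry per student.
# A Friedman test compares the three repeated ordinal ratings.
v1 = [4, 5, 4, 3, 5, 4, 4]
v2 = [4, 4, 5, 4, 5, 3, 4]
v3 = [5, 4, 4, 4, 5, 4, 3]
stat, p = stats.friedmanchisquare(v1, v2, v3)
print(f"chi2={stat:.3f}, p={p:.3f}")
```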
Table 4 Descriptive statistics for POD2014 on usefulness and quality of learning for each video type
In addition to the quantitative results, qualitative data was obtained via the questionnaire for this module. The main area of interest was what the students thought about having a quiz in the video. Overall, out of the 23 students, 15 made only positive comments, one made both positive and negative comments, and seven did not comment. Following a thematic analysis, the positive comments were categorised into three main themes, in decreasing order of popularity: i) helps understanding and knowledge; ii) increases attention and engagement; iii) provides immediate feedback. The only negative comment, made by a single student, was that the questions were a distraction from the content delivered.
Examples provided by students to illustrate the three themes are given below. Most comments received concerned the questions supporting understanding and knowledge, such as: “the questions provide me a measurement of my own understanding”, “they check what I understand”, and “the questions are useful to test my knowledge”. The second theme in terms of frequency of comments concerned attention and engagement. Students provided the following comments contributing to this theme: “questions in the video makes me pay attention”, “questions help me interact with the video”, and “quizzes make sure you stay engaged”. The final theme established was connected to feedback. Comments received were: “I liked to know if my answers were correct” and “it was helpful to get the answers immediately”. Although this has been classified as a separate theme, as the comments directly relate to feedback, it is recognised that it could also be a sub-theme of knowledge and understanding, since feedback indirectly relates to understanding.