Flipped Classrooms: a Review of Key Ideas and Recommendations for Practice
Flipped classrooms refer to the practice of assigning lectures outside of class and devoting class time to a variety of learning activities. In this review, we discuss the range of approaches to the flipped classroom and focus on activities frequently used in these settings. Amongst these, we examine both out-of-class activities (e.g., video lectures) and in-class activities (e.g., quizzes, student discussions). We argue that the value of these activities reflects the particular cognitive processes engaged by the activity regardless of whether the setting is the traditional (lecture-based) classroom or the flipped classroom. Future work should continue to examine the influence of individual activities on student learning and behaviors, particularly when objective measures of learning, such as quizzes and exams, are held constant.
Keywords: Flipped classroom, Active learning, Instruction, Review
What are “flipped classrooms”? Although there is no single model (Tucker 2012), the flipped classroom is characterized by course structure: instructional content (e.g., prerecorded class lectures) is assigned as homework before coming to class. In-class time is then spent working on problems, advancing concepts, and engaging in collaborative learning (Findlay-Thompson and Mombourquette 2014). Removing the instructional content from in-class time allows the instructor more time for one-on-one engagement with individual students (Roehl et al. 2013), but perhaps equally important, the flipped classroom model is student-centered (McLaughlin et al. 2014). That is, students are responsible for watching lectures on their own and coming to class prepared for in-class activities and discussion. Little direct evidence currently exists regarding student learning outcomes or academic performance in a flipped versus traditional (lecture-based) classroom (for reviews, see Bishop and Verleger 2013; O’Flaherty and Phillips 2015). Of the 28 higher-education studies reviewed by O’Flaherty and Phillips (2015), only 11—or fewer than half—reported measures of student performance, and amongst these, only five reported comparisons to a traditional classroom with comparable exams, assessments, or exam questions that overlapped between the two formats (Hung 2015; Mason et al. 2013; McLaughlin et al. 2014; Missildine et al. 2013; Pierce and Fox 2012). Rather, outcomes frequently reflected students’ perceptions of their learning (e.g., Butt 2014; Critz and Knight 2013; Davies et al. 2013; Ferreri and O’Connor 2013; Forsey et al. 2013; Gilboy et al. 2015; Hoffman 2014; Jamaludin and Osman 2014; Kim et al. 2014; Lage et al. 2000; McLaughlin et al. 2013; 2014; Schlairet et al. 2015; Strayer 2012; Wilson 2013; Yeung and O’Malley 2014; Young et al. 2014).
Relying on students’ evaluations assumes that they can accurately assess learning. However, in sharp contrast to this assumption, research in metacognition (i.e., “knowing about knowing”; Flavell 1979; Nelson 1996) suggests that students are often unable to assess their own learning or identify strategies that enhance learning. For example, Kornell and Bjork (2008) had participants study paintings by a series of artists, which were either blocked by artist at study, or interleaved (i.e., “shuffled”) by artist, such that no two paintings by the same artist were presented in sequence. On a subsequent test, participants who studied on an interleaved schedule were nearly two times more likely to correctly identify the artists among a new set of paintings than participants who studied on a massed schedule. However, when asked which format resulted in better learning, 78 % of participants believed that massing was as good as—or better than—interleaved learning. Such findings are legion (e.g., Benjamin et al. 1998; Hartwig and Dunlosky 2012; Koriat et al. 2004; Kornell and Bjork 2007, 2008, 2009; Kornell et al. 2011; McCabe 2011; Morehead et al. 2015; Rhodes and Castel 2008, 2009; Yan et al. 2014) and indicate that students’ perceptions of learning are not tantamount to objective measures of learning performance. Accordingly, any evaluation and review of flipped classrooms should ideally be guided by objective measures of learning.
Yet another obstacle to evaluating the efficacy of flipped classrooms lies in the vast differences in instructor implementation of the classroom “flip.” For example, instructors might utilize a “partial” flip, in which only a portion of lectures reflect a flipped classroom approach (Seery 2015). Additionally, the flipped classroom may include a large array of out-of-class activities beyond lectures, including readings, homework, and supplemental videos¹. Finally, in-class activities vary widely, including activities such as role-play, debates, quizzes, and group presentations, amongst others (see O’Flaherty and Phillips 2015). Given this variety of approaches, flipped classrooms should be evaluated with regard to the individual strategies used in creating the classroom flip, both for in-class and out-of-class activities. Although several prior reviews report findings (e.g., student perceptions, educational outcomes; Bishop and Verleger 2013; Giannakos et al. 2014; O’Flaherty and Phillips 2015), in this review, we seek to examine the value of the methods used in flipped classrooms. That is, the present review seeks to evaluate the quality of individual activities and practices (e.g., recorded lectures, quizzes) frequently utilized in flipped classrooms in terms of the cognitive processes engaged.
In order to synthesize a diverse array of activities, we first describe one example of a flipped classroom (McLaughlin et al. 2014) and then discuss each of the activities in light of the processes engaged. Specifically, McLaughlin et al. (2014) redesigned a graduate-level health professions course into a flipped classroom format. In-class lectures were replaced with recorded videos, to be viewed prior to coming to class. In-class time involved various combinations of active learning techniques: audience response, open questions, individual or paired quizzes, pair and share activities, and student presentations and discussion. We discuss out-of-class lectures first and then address each of these activities in turn.
One of the most common means of moving instruction outside the classroom in a flipped classroom format has been to require students to watch prerecorded video lectures or screencasts prior to attending class (Abeysekera and Dawson 2015)². Because the lecture is such a large portion of a class (even within the flipped classroom), it seems reasonable to examine whether prerecorded lectures have any impact, deleterious or positive, on learning. For ten in-class lectures in a physiology course, El Sayad and El Raouf (2013) had nursing students alternate between viewing a video-based format or a narrated PowerPoint format, with learning measured through quizzes and exams. Overall, the percentages of students who passed or failed these tests (failing defined as earning 60 % or less) did not differ across lecture formats for any of the assessments. Similarly, Ellis and Mathis (1985) had introductory sociology students watch either televised or in-person lectures for an entire class semester and observed similar test performance regardless of lecture format. As well, Conway et al. (2010) reduced in-class lecture time by 14 % to give students time to complete additional out-of-class assignments for which they were responsible, with no effect upon student grades. Thus, it does not appear that video lectures, in and of themselves, either add to or detract from student learning.
Further inspection of the literature indicates that the format of imparting instructional content also fails to substantially influence learning. For example, Stephenson et al. (2008) compared learning amongst bioscience majors exposed to each of three lecture formats (traditional, virtual, e-lecture) for different topics in a human genetics course. For the traditional lecture format, students viewed an in-class lecturer giving PowerPoint presentations with printed notes as supplements. In the virtual lecture format, students navigated through an interactive multimedia online lecture (primarily text-based), which organized each of the three topics into subtopics and provided interactive audio and visual explanations in addition to self-assessment questions. Finally, in the e-lecture format, students viewed narrated PowerPoint lectures with the ability to stop, pause, fast-forward, or rewind the lectures at any time. Learning was examined using a test assessing factual recall, comprehension, analysis skills, application skills, and evaluation skills. Overall, test performance was similar regardless of lecture format.
Zhang et al. (2006) examined management information systems students’ performance on the topic of internet search engines. Participants were exposed to either a standard lecture, an interactive video lecture (i.e., the ability to pause, fast-forward, etc.), a noninteractive video lecture, or a lecture with text subtitles rather than sound. Interactive video lectures resulted in the highest test performance, with similar performance between all other conditions. Of note, this study was conducted on only one lecture, suggesting some caution in generalizing across an entire course.
Collectively, the extant data suggest that video lectures themselves do not affect learning. Such data are consistent with the theory that the medium is a carrier of content and unlikely to affect learning itself (Clark 1983, 1994). Thus, any advantage of providing lectures outside the classroom should come from releasing class time for active learning.
The primary motive for flipping a classroom is to provide additional time for in-class activities, including active learning (Haak et al. 2011). The particular methods of active learning used vary in their utility (Prince 2004), although several methods appear to consistently enhance learning (see e.g., Freeman et al. 2014, for a review). Indeed, the success of any method of active learning will be a function of the processes engaged by that method, and we consider methods used in the flipped classroom according to that rubric. McLaughlin et al. (2014) had students engage in a series of in-class activities during each class session: audience response and open questions, pair-and-share activities, student presentations, discussion, and individual or paired quizzes. We consider the processes that each of these activities engages in turn.
Audience Response, Open Questions, and Quizzes
McLaughlin et al. (2014) had students respond to clicker questions (i.e., questions administered via audience response systems, commonly termed “clickers”; Caldwell 2007) regarding out-of-class content (video lectures and readings). Based on students’ responses, the instructor would then provide feedback and answer questions from students regarding the lectures, readings, or related content. Additionally, at the end of each class, students completed brief quizzes on that day’s material, either individually or in pairs. Other flipped classroom instructors also report utilizing clicker questions or quizzes, either as a means of ensuring that students completed the out-of-class assignments (Hung 2015; Wilson 2013) or to gather real-time feedback regarding student understanding of content (Ferreri and O’Connor 2013).
Clickers appear to either improve or fail to harm exam scores, relative to equivalent time listening to class lectures or participating in class discussions (for a review, see Caldwell 2007). For example, Wenz et al. (2014) examined the effect of clicker questions on two learning modules, each taking 2 weeks to complete, in a preclinical dentistry class. The class was divided into two groups and, for the first module, one group of participants discussed questions on that module, while a second group responded to those questions via clickers. At the end of this module, participants completed a multiple-choice exam. For the second module, participants switched conditions and again completed an exam at the end of the module. Overall, test performance was highest (and fewer students failed the exams) following clicker questions.
One potential explanation for an advantage of employing clicker questions is that it encourages students to engage in retrieval. Such retrieval enhances memory for the retrieved information, compared to rereading that material, a finding known as the testing effect (for reviews, see Roediger 2008; Rowland 2014). Indeed, Rowland’s meta-analysis found that testing material resulted in higher average performance (i.e., in terms of proportion of items correctly recalled, recognized, etc.) than simply rereading that same information (g = .50).
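For readers less familiar with effect-size metrics, the following is a minimal sketch of how a standardized mean difference of this kind can be computed. It assumes that g denotes Hedges' g (a pooled-standard-deviation difference with a small-sample correction); the function name and all input values are hypothetical and are not taken from Rowland (2014).

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (A minus B) with small-sample correction."""
    # Pooled variance across the two groups.
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    # Uncorrected standardized difference (Cohen's d).
    d = (mean_a - mean_b) / math.sqrt(pooled_var)
    # Common approximation to the small-sample correction factor.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return d * correction


# Hypothetical data: proportion correct after testing (A) vs. rereading (B).
g = hedges_g(mean_a=0.70, sd_a=0.20, n_a=30,
             mean_b=0.60, sd_b=0.20, n_b=30)
print(f"g = {g:.2f}")
```

With these made-up inputs, the tested group scores half a pooled standard deviation above the rereading group before correction, and the correction shrinks that value only slightly.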
Do clickers enhance memory more than other means of testing information? Lasry (2008) examined this question in two different sections of a mechanics course: one using clickers and clicker questions and the other presenting the same clicker questions via flashcards. Final exam performance was similar for material tested via clickers or flashcards, arguing against any additional benefit accruing from clickers. Thus, although clickers provide a practical means of administering tests to large groups, the act of retrieving information should benefit learning regardless of the format. Indeed, the benefits of testing are not restricted to in-class activities. For example, quizzes on out-of-class material (e.g., Flynn 2015; Wilson 2013) can also enhance learning. Further, if instructors test students on out-of-class material prior to class, they have the additional opportunity to use feedback from these tests to tailor the content addressed in-class (e.g., Hurtubise et al. 2015).
McLaughlin et al. (2014) used three different types of “pair-and-share” activities in which students worked with each other (in groups of two) prior to sharing their work with the class. In rapid pair-and-share activities, students were given an in-class discussion question and were paired together to discuss and later present their ideas to the class and instructor, who provided feedback. In reflective pair-and-share activities, students had 2–3 days to answer discussion questions in brief essays, from which the instructor selected groups to present their essays for in-class discussion. In the proactive pair-and-share activities, students paired together and took turns preparing their own discussion questions and hosting class discussion on that topic. Similarly, other work on flipped classrooms has reported a variety of pair-and-share activities, including paired problem-solving (Love et al. 2014), and predict-observe-explain (Flynn 2015), in which learners are given a research hypothesis, subsequently predicting and observing the results, followed by explaining or discussing any discrepancies.
Group discussions appear to benefit learning. Over the course of a semester, Smith et al. (2009) had students participate in five clicker questions per class, during which they were encouraged to discuss questions with their classmates³. Benefits of this group discussion were assessed throughout the semester via 16 separate pairs of clicker questions, in which each question of a pair required the respondent to apply the same underlying concept or principle for solution. After answering the first question of a pair individually (selecting one of approximately four responses), students were permitted to discuss the question with nearby classmates before reanswering the first question, followed by the second question of a pair. Out of those students who correctly answered the first question after group discussion, 77 % answered the second, individually-answered question correctly, indicating that group discussion helped them learn the underlying concepts. Even when students did not answer the first question correctly (either initially or after group discussion), they were more likely than chance to correctly answer a second question applying the same principle after discussion (44 vs. ∼25 %). Thus, group discussion benefitted student performance as well as their conceptual understanding of the applied principles. Using a similar design, in which students answered the first question individually, Smith et al. (2011) allowed students to either engage in group discussion⁴, receive instructor explanation, or receive a combination of group discussion and instructor explanation. Students then individually answered a second question, which utilized the same underlying concept or principle. Performance on the second question was best for students who received the combination of group discussion and instructor explanation, regardless of performance on the first question of the pair.
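The chance baseline of roughly 25 % follows from questions having approximately four response options. As a rough sketch of how an observed rate can be compared against such a baseline, the code below computes an exact one-sided binomial tail probability. The number of responses (100) is a hypothetical value chosen for illustration, since the sample size for this comparison is not given here, and the function name is likewise an assumption.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_chance):
    """P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


n_options = 4                # approximately four responses per question
p_chance = 1 / n_options     # guessing baseline of 0.25
trials = 100                 # hypothetical number of responses
successes = 44               # 44 % correct, matching the reported rate

p_value = binomial_upper_tail(successes, trials, p_chance)
print(f"chance = {p_chance:.2f}, P(X >= {successes}) = {p_value:.2g}")
```

Under these assumed numbers, a 44 % success rate would be very unlikely to arise from guessing alone; with a much smaller sample the same rate would be less conclusive.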
Do the benefits of group activities or discussion depend on factors such as the size of the group? In order to address this question, Alexopoulou and Driver (1996) had high school students discuss physics questions in groups of two or four individuals. Students answered questions individually at first (pretest) and then participated in group discussion about these questions. Two to 3 weeks later, students individually reanswered the same questions (post-test). Students in larger groups demonstrated greater learning gains than students in smaller groups and were less likely to regress from their pretest performance. Alexopoulou and Driver suggest that groups of four students were less constrained in their interactions than groups of two students and were also more willing to discuss conflicting perspectives. Such findings argue that larger groups may enhance the value of group discussion (e.g., by furthering discussion, resolving group conflict, bringing prior experiences to bear, etc.; for similar suggestions, see Shimazoe and Aldrich 2010). However, these conclusions are based only on a single study; further work should examine the reliability of this finding and determine the point where adding individuals to a group results in diminishing returns (e.g., additional individuals offer little benefit or are detrimental). Indeed, group dynamics (e.g., influence of individuals) may also change as group size increases (Fay et al. 2000). Manipulating only group size (5 vs. 10 members), Fay et al. (2000) found that students in smaller group discussions were equally influenced by members they interacted with, whereas students in larger group discussions were primarily influenced by dominant group speakers. Although the issue of group size is not yet resolved, Michaelsen and Sweet (2008a, b, 2011) recommend groups of five to seven members when groups must address challenging intellectual tasks.
McLaughlin et al. (2014) had groups of four to five students prepare a summary of class readings and create presentation materials. These materials were used to lead in-class discussion, during which the students were responsible for answering other students’ questions. Indeed, student presentations are commonly reported as being used in the flipped classroom format (Hung 2015; Kim et al. 2014; Mason et al. 2013; McLaughlin et al. 2013; Schlairet et al. 2015).
In a recent review, Carberry and Ohland (2012) discuss how teaching can benefit one’s own learning (i.e., learning-by-teaching). The teaching process (comprising preparation, presentation, and student assessment) is frequently presumed to benefit the teacher’s learning of the presented content, in addition to improving related skills, such as communication. In one example of these learning benefits, Nestojko et al. (2014) examined test performance amongst students who had studied text passages either with the expectation of teaching another student or with the expectation of a test. However, no participants actually engaged in teaching, and all participants took the test. Overall, test performance was highest amongst learners who had expected to teach. Actual teaching may also confer an additional benefit to learning (e.g., Ross and Di Vesta 1976). For example, Fiorella and Mayer (2014) had participants study information on the Doppler Effect, either with the expectation of a subsequent test or the expectation of teaching that material. Compared to participants who expected a test, those who expected to teach performed better on both immediate and delayed comprehension tests, with this effect being strongest for those who both expected to teach and actually did teach. Overall, these results indicate that presenting (or the expectation of presenting) provides benefits to the student above and beyond either listening to a class lecture or studying in anticipation of a test.
These data are reminiscent of the finding that generation, or actively producing information that is being learned, results in better memory for that information than if the information is simply provided to the learner (for reviews, see Bjork et al. 2007; Mulligan and Lozito 2004). That is, students presenting must generate the material for the presentation, whereas students listening to the presentation might benefit from the additional review provided by the presentation itself but would not receive the benefits of generating that material. Such findings suggest that the benefit of student presentations would be enhanced to the degree that the student generates, or creates, his or her own content for the presentation (see also Foos et al. 1994). That is, if the student connects related class concepts or introduces relevant outside readings into his or her presentation, the resulting benefit to learning may be greater than if the presentation were a simple summary of the class material.
Active Learning Interacting with the Flipped Classroom
Thus far, we have reviewed various components of the flipped classroom in terms of the cognitive processes (e.g., generation) they engage (e.g., Jensen et al. 2015). However, it must be noted that these processes may interact when multiple activities are paired together. For example, although both spacing and variation individually benefit learning, their combination becomes counterproductive if spaced variations fail to be connected with the original learning (Appleton-Knapp et al. 2005), or with the nature of the material (e.g., McDaniel et al. 1986). Accordingly, the benefits of any individual technique employed in a flipped classroom may be altered by other methods in use. One intriguing possibility is that flipped classrooms may be an ideal venue for combining multiple methods of active learning. For example, compared to classes that integrated some active learning (e.g., clickers, online homework, and demonstrations), Flynn (2015) integrated a wide variety of active learning techniques and reported a benefit for flipped classrooms (i.e., reduced withdrawal and failure rates, increased exam scores). Future work in this area should continue to examine the degree to which individual active learning strategies aid student performance in the flipped versus traditional classroom.
Summary and Conclusions
Overall, we considered several instructional strategies that have been implemented in a flipped classroom, evaluating their potential benefits. To summarize, video instruction in and of itself does not appear responsible for changes in learning performance, but may provide additional time for in-class activities that enhance learning performance due to active learning (Prince 2004). Other learning activities common in the flipped classroom (e.g., quizzes or clicker questions, pair-and-share activities, student presentations and discussion) differ both in their effectiveness and in the conditions necessary for enhancing learning performance. Perhaps most important, the benefit of testing students (either through standard quizzes or clicker questions) is not contingent upon the performance of others. Of all the active learning methods discussed, testing is the simplest to isolate and identify as an effective learning strategy (but see Carpenter et al. 2015, for some limits to this based on student knowledge). As such, we note that the benefits of testing are robust and likely to enhance performance regardless of how it is carried out—something difficult to say about many techniques. Clickers may present an opportunity to incorporate frequent testing in a manner that students find more enjoyable than traditional exams or quizzes (Birjandi and Alemi 2010).
We do not recommend that other forms of active learning be ignored (and certainly, several other techniques might be viable). For example, collaborative learning (Kirschner et al. 2009), cooperative learning (Johnson et al. 1998), and problem-based learning (Dochy et al. 2003) are broadly used group learning activities with varying levels of support (Prince 2004). Benefits of group learning depend upon a variety of factors, such as task complexity, prior expertise, and individuals’ contributions to the group (Gadgil et al. 2012; Kirschner et al. 2009; Meade et al. 2009) and are worthy of further exploration in the flipped classroom. As well as enhancing individual learning of course content, group activities can provide additional benefits such as developing leadership skills, enhancing the ability to work in teams, and building social support (Johnson et al. 1998). Many of these factors are important not only for retention in academic settings but for achieving success once in the workforce.
Comprehensive research on flipped classrooms is still in a nascent stage. Indeed, the literature remains characterized by few studies that compare the flipped vs. traditional classroom when objective measures of learning (e.g., the quizzes and exams used) are held constant. Accordingly, we strongly recommend that this practice guide future research. Prior work has discussed a wide variety of factors involved in the classroom flip (e.g., preclass activities, student satisfaction, teacher preparation), yet little work has examined the effectiveness of individual practices utilized. In this review, we emphasize the importance of the cognitive processes these practices engage and recommend that both researchers and instructors seek to integrate empirically supported activities. However, there remain a number of unanswered or infrequently examined questions. Amongst these, do student approaches to learning change when enrolled in a flipped vs. traditional classroom? When learning is student-centered, students are responsible for much of their own learning. However, students learn the most when studying items that are neither too difficult nor too easy (Metcalfe 2002). When left to their own devices, do students become better at identifying information at the appropriate difficulty level? Is student growth (e.g., Inaba et al. 2003) enhanced? Do certain active learning techniques lend themselves better to the flipped classroom format than others?
Given that much of the literature reflects environments that are not controlled (e.g., students can engage in learning outside of the classroom), future research would benefit from continued examination of the individual components of a flipped classroom. For example, although the practice of “flipping” lectures out of class does not appear to harm learning performance, does it change certain student behaviors, such as increasing the likelihood that students will fail to read the textbook? Are students less likely to drop out of the course? Do students engage in more preparation before class? As noted by Pascarella and Terenzini (2005), “evidence strongly suggests that… multiple forces operate in multiple settings to influence student learning and change” (p. 629). That insight succinctly summarizes the research on the individual components of a flipped classroom approach.
¹ Although a variety of work suggests strategies to increase reading compliance (e.g., Dobson 2008; Wilson 2013), little is known about the efficacy of assigning readings when learners do not know what information they should extract.
² In order to ensure that students complete out-of-class activities such as readings or video lectures, some instructors have successfully implemented low-stakes quizzes (e.g., 10 % or less of the overall course grade) on the out-of-class materials (Flynn 2015; Jensen et al. 2015; Wilson 2013). However, it remains an open question (a) whether out-of-class assignments are regularly completed in flipped classrooms and (b) whether the rate of adherence differs between flipped and traditional classroom formats.
³ Although group sizes differed, a post-semester self-report study indicated an average of three participants per group.
⁴ Group size was not specified.