Systematic Reviews on Flipped Learning in Various Education Contexts

This chapter shares the author's experiences of and reflections on conducting systematic reviews of flipped classroom research. The author first discusses the rationale for conducting systematic reviews and how the reviews contribute to the flipped learning field. After that, he highlights some possible strategies, regarding literature search, article selection, and research synthesis, to improve the quality of systematic reviews.


Introduction
In recent years, numerous studies about the flipped (or inverted) classroom approach have been published (Karabulut-Ilgu et al. 2018). In a typical flipped classroom, students learn course materials before class by watching instructional videos (Bishop and Verleger 2013). Class time is then freed up for more interactive learning activities, such as group discussions (O'Flaherty and Phillips 2015). In contrast to a traditional lecture-based learning environment, students in flipped classrooms can pause or replay the instructor's presentation in video lectures without feeling embarrassed. These functions enable them to gain a better understanding of course materials before moving on to new topics (Abeysekera and Dawson 2015). Moreover, instructors are no longer occupied by direct lecturing and can thus better reach every student inside the classroom. For example, Bergmann and Sams (2008) provide one-to-one assistance and small group tutoring during their class meetings. The growth in research on flipped classrooms is reflected in the increasing number of literature review studies. Many of these are systematic reviews (e.g., Betihavas et al. 2016; Chen et al. 2017; Karabulut-Ilgu et al. 2018; Lundin et al. 2018; O'Flaherty and Phillips 2015; Ramnanan and Pound 2017). One would expect that if the scope of review has remained unchanged, contemporary reviews would include and analyze more research articles than the earlier reviews. Moreover, because flipped classroom practice is becoming more innovative (e.g., gamified flipped classroom), recent reviews should provide new insights into future research and practice. However, this is not always the case.
With this in mind, this chapter highlights possible strategies to improve the quality of systematic reviews. The chapter is based on my experiences of and reflections on systematic reviews of flipped classroom research in various contexts (Table 1). It begins by presenting the rationale for conducting systematic reviews. The chapter then discusses how systematic reviews contribute to the flipped learning field. In contrast to several existing reviews, it then shares my reflections on practical aspects of systematic reviews, including literature search, article selection, and research synthesis. The chapter concludes with a summary.

Rationale for Conducting Systematic Reviews
To avoid repeating previous research efforts, researchers should first understand the current state of the literature by either examining existing reviews or conducting their own systematic review. Phrases such as "little research has been done" and "there is a lack of research" are extensively used to justify a newly written article. However, I sometimes doubt the grounds for these claims. There is no longer a lack of research in the field of flipped learning. In mathematics education alone, for example, 61 peer-reviewed empirical studies were published between 2012 and 2016. Karabulut-Ilgu et al. (2018) found 62 empirical research articles on flipped engineering education as of May 2015. Through a systematic review of the literature, a more comprehensive picture of current research can be revealed. In fact, before conducting my studies of flipped learning in secondary schools, I carried out a systematic review in the context of K-12 education. At the time of writing (October 2016), only 15 empirical studies existed. We therefore knew little (at that time) about the effect of flipped learning on K-12 students' achievement. With so few studies published, the systematic review provided a justification for our planned studies (see Lo et al. 2018 for a review) and those of other researchers (e.g., Tseng et al. 2018) to examine the use of the flipped classroom approach in K-12 contexts. In addition to understanding the current state of the literature, systematic reviews help identify research gaps. In flipped mathematics education, for example, Naccarato and Karakok (2015) hypothesized that instructors "used videos for the delivery of procedural knowledge and left conceptual ideas for face-to-face interactions" (p. 973). However, researchers have not reached a consensus on course planning using the lens of procedural and conceptual knowledge.
While Talbert (2014) found that students were able to acquire both procedural and conceptual knowledge by watching instructional videos, Kennedy et al. (2015) discovered that flipping conceptual content might impair student achievement. More importantly, we found in our systematic review that very few studies evaluated the effect of flipping specific types of materials, such as procedural and conceptual problems. To flip or not to flip the conceptual knowledge? That is a key question for future studies of flipped mathematics learning.

Contribution of Systematic Reviews
A systematic review should not be merely a summary of existing studies. Instead, the review should contribute to the body of knowledge. Researchers must determine the purpose of their systematic review and ensure the significance of their work. This section illustrates several possible goals of research synthesis. Table 2 shows that in our systematic reviews, we aimed to achieve two main goals: (1) To inform future flipped classroom practice, and (2) to compare the overall effect of flipped learning to traditional lecture-based learning. First, the overarching goal of some of our systematic reviews was to inform future flipped classroom practice. Using the findings of the reviewed studies, we have developed a 5E flipped classroom model for history education (Lo 2017), made 10 recommendations for flipping K-12 education, and established a set of design principles for flipped mathematics classrooms. Taking the design principles for flipped mathematics classrooms as an example, our Principle 4 suggested that short videos could be used to enable effective multimedia learning. This principle was based on the problem (reported in the literature) that students tend to disengage when watching long videos.
To avoid making similar mistakes, we recommended that each video be limited to six minutes and that all video segments combined run no more than 20-25 minutes. With this principle applied, Chen and Chen (2018) confirmed that the assigned workload was bearable for the students in their flipped research methodology course.
Second, the goal of our systematic reviews was to examine the effect of flipped learning versus traditional learning on student achievement. These reviews focus on flipped mathematics education, health professions, and engineering education (Lo and Hew 2019). Researchers have conducted several systematic reviews of flipped learning in the health professions (Ramnanan and Pound 2017) and engineering education (Karabulut-Ilgu et al. 2018). Ramnanan and Pound (2017) reported that medical students were generally satisfied with flipped learning and preferred this instructional approach to traditional lecture-based learning. However, strong satisfaction with learning does not necessarily mean improved achievement. Examining student learning outcomes, Karabulut-Ilgu et al. (2018) classified their flipped-traditional comparison studies into five categories: (1) More effective, (2) more effective and/or no difference, (3) no difference, (4) less effective, and (5) less effective and/or no difference. As in Chen et al. (2017), they presented the effect size of each flipped-traditional comparison study. However, as Karabulut-Ilgu et al. (2018) acknowledged, no definitive conclusion can be made without a meta-analysis of student achievement in flipped classrooms. We therefore attempted to examine the overall effect of flipped learning on student achievement through systematic reviews of the empirical research. The findings enhance our understanding of this instructional approach. Using a meta-analytic approach, we found a small but significant difference in effect in favor of flipped learning over traditional learning in all three contexts (i.e., mathematics education, health professions, and engineering education). Most importantly, our moderator analyses provided quantitative support for a brief review and/or formative assessment of pre-class materials at the start of face-to-face lessons.
The effect of flipped learning was further promoted when instructors provided such an assessment (for mathematics education and health professions) and/or review (for engineering education) in their flipped classrooms. These findings not only extend our understanding of flipped learning, but also inform future practice of flipped classrooms (e.g., offering a quiz on pre-class materials at the start of face-to-face lessons).
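To make the idea of an "overall effect" concrete, the following is a minimal sketch of how standardized mean differences from flipped-traditional comparisons can be pooled. It is not the procedure used in the reviews discussed above: the study numbers are hypothetical, the function names are invented for illustration, and a simple fixed-effect (inverse-variance) model is assumed for brevity, whereas published meta-analyses often use random-effects models.

```python
import math

def hedges_g(mean_f, sd_f, n_f, mean_t, sd_t, n_t):
    """Return (g, variance) for a flipped (f) vs. traditional (t) comparison."""
    pooled_sd = math.sqrt(
        ((n_f - 1) * sd_f**2 + (n_t - 1) * sd_t**2) / (n_f + n_t - 2)
    )
    d = (mean_f - mean_t) / pooled_sd            # Cohen's d
    j = 1 - 3 / (4 * (n_f + n_t) - 9)            # small-sample correction factor
    var_d = (n_f + n_t) / (n_f * n_t) + d**2 / (2 * (n_f + n_t))
    return j * d, j**2 * var_d

def pool_fixed_effect(effects):
    """Inverse-variance weighted mean of (g, variance) pairs and its standard error."""
    weights = [1 / var for _, var in effects]
    pooled = sum(w * g for w, (g, _) in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))

# Hypothetical studies (NOT real data): mean, SD, n for flipped, then traditional.
hypothetical_studies = [
    (78.0, 10.0, 40, 74.0, 11.0, 42),
    (65.0, 12.0, 30, 64.0, 12.5, 28),
    (82.0, 8.0, 55, 78.5, 9.0, 50),
]

effects = [hedges_g(*study) for study in hypothetical_studies]
pooled_g, se = pool_fixed_effect(effects)
ci = (pooled_g - 1.96 * se, pooled_g + 1.96 * se)
print(f"pooled g = {pooled_g:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

A moderator analysis of the kind mentioned above would, in this sketch, amount to pooling subgroups separately (e.g., studies with and without a quiz at the start of class) and comparing the pooled estimates.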

Reflections on Some Practical Issues of Conducting Systematic Reviews
The following sections cover some practical aspects of systematic reviews of flipped classroom research, including literature search, article selection, and research synthesis.

Literature Search
Abeysekera and Dawson (2015) shared their experiences of searching for articles on flipped classrooms. They performed their search using the term "flipped classroom" in the ERIC database. In June 2013, they found only two peer-reviewed articles on flipped learning. Although not much research had been published at that time, this scarcity of search results prompted us to reflect on (1) the design of the search string and (2) the choice of databases when conducting a systematic review.

The Design of Search String
The search term "flipped classroom" is very specific in that it cannot include other terms used to describe this instructional approach, such as flipped learning, flipping classrooms, and inverted classrooms. From my observation, some authors use even more flexible wording. For example, Talbert (2014) entitled his article "Inverting the Linear Algebra Classroom" (p. 361). If certain keywords are not included in their title, abstract, and keywords, their articles might not be retrieved through a narrow database search.
Although it is the authors' responsibility to use well-recognized keywords, researchers producing systematic reviews should make every effort to retrieve as many relevant studies as possible. To this end, we used the asterisk as a wild card to capture different verb forms of "flip" (i.e., flip, flipping, and flipped) and "invert" (i.e., invert, inverting, and inverted). The asterisk also allowed the inclusion of both singular and plural forms of nouns (e.g., class and classes, classroom and classrooms). Furthermore, Boolean operators (i.e., AND and OR) were applied to separate each search term to increase the flexibility of our search strings. In this way, we were able to include some complicated expressions used in flipped classroom research, such as "Flipping the Statistics Classroom" (Kuiper et al. 2015, p. 655). Table 3 shows the search strings that we used in the systematic reviews of flipped history education (Lo 2017), K-12 education, and mathematics education.
Our search strings comprised two parts: (1) The instructional approach, and (2) the context. In the first part, "(flip* OR invert*) AND (class* OR learn*)" allowed us to capture different combinations of terms about flipped learning. In the second part, we used various search terms to specify the research contexts (e.g., K12 OR K-12 OR primary OR elementary OR secondary OR "high school" OR "middle school") or subject areas (e.g., math* OR algebra OR trigonometry OR geometry OR calculus OR statistics) that we wanted. As a result, we were able to reach research items that had seldom been downloaded and cited.
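The two-part structure described above can be sketched in a few lines of code. This is an illustration only: the helper names are hypothetical, and joining the approach part and the context part with AND is an assumption, since the exact strings in Table 3 are not reproduced here.

```python
def or_group(terms):
    """Join alternative terms with OR inside parentheses, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_search_string(approach_groups, context_terms):
    """Combine the instructional-approach groups and the context group with AND (assumed)."""
    groups = [or_group(g) for g in approach_groups] + [or_group(context_terms)]
    return " AND ".join(groups)

# Part 1: the instructional approach, as quoted in the text.
approach = [["flip*", "invert*"], ["class*", "learn*"]]

# Part 2: the context terms, as quoted in the text.
math_context = ["math*", "algebra", "trigonometry", "geometry", "calculus", "statistics"]
k12_context = ["K12", "K-12", "primary", "elementary", "secondary",
               "high school", "middle school"]

print(build_search_string(approach, math_context))
print(build_search_string(approach, k12_context))
```

Keeping the term lists separate from the string-building logic makes it easy to reuse the same approach part across contexts and to document exactly which terms were searched.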
However, upon completion of the systematic reviews in Table 3, we realized that researchers might use other terms to describe the flipped classroom approach, such as "flipped instruction" (He et al. 2016, p. 61). Therefore, we further included "instruction*" and "course*" in our search strings. Table 4 shows the improved search strings that we used in the systematic reviews of flipped health professions and engineering education (Lo and Hew 2019).
As a side note about the design of search strings, one researcher emailed me about our systematic review of flipped mathematics education. He told me that our review had missed his article, an experimental study of flipped mathematics learning. After careful checking, I found that his study fulfilled all inclusion criteria for our systematic review. However, I could not find any variations of "mathematics" or other possible identifiers of subject areas (e.g., algebra, calculus, and statistics) in his title, abstract, and keywords. That is why we were unable to retrieve his article through database searching using our search string. At this point, I still believe that the context part of our search string of flipped mathematics education (i.e., math* OR algebra OR trigonometry OR geometry OR calculus OR statistics) is broad enough to capture the flipped classroom research conducted in mathematics education. However, this search string cannot capture studies that do not describe their subject domain at all. Without this information, other readers would have no idea about where the work is situated within the broader field of flipped learning if they only scan the title, abstract, and keywords. Most importantly, this valuable piece of work cannot be retrieved in a database search. Other snowballing strategies, such as tracking the reference lists of reviewed studies (see Lo 2017; Wohlin 2014 for a review), should be applied to find these articles in future systematic reviews.
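The retrieval behavior described in this section can be approximated with a short simulation. The sketch below is only an approximation under stated assumptions: the trailing asterisk is treated as "any word beginning with this stem", matching is run on titles only, and the placement of "instruction*" and "course*" in the second group is assumed. Real databases implement truncation and field searching in their own ways. The last two titles are hypothetical.

```python
import re

def term_to_regex(term):
    """Translate a search term (optionally ending in *) into a whole-word regex."""
    if term.endswith("*"):
        return r"\b" + re.escape(term[:-1]) + r"\w*"
    return r"\b" + re.escape(term) + r"\b"

def matches(record_text, and_groups):
    """True if every OR-group has at least one term present in the text."""
    return all(
        any(re.search(term_to_regex(t), record_text, re.IGNORECASE) for t in group)
        for group in and_groups
    )

math_context = ["math*", "algebra", "trigonometry", "geometry", "calculus", "statistics"]
original = [["flip*", "invert*"], ["class*", "learn*"], math_context]
# Assumed form of the improved string, applied here to the mathematics context.
improved = [["flip*", "invert*"],
            ["class*", "learn*", "instruction*", "course*"], math_context]

titles = [
    "Inverting the Linear Algebra Classroom",                  # Talbert (2014)
    "Flipping the Statistics Classroom",                       # Kuiper et al. (2015)
    "Effects of flipped instruction on calculus achievement",  # hypothetical title
    "An experimental study of the flipped classroom approach", # hypothetical, no subject term
]

for title in titles:
    print(f"{title!r}: original={matches(title, original)}, "
          f"improved={matches(title, improved)}")
```

In this simulation, the record without any subject identifier is missed by both strings, which is the situation that reference-list tracking is meant to address.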

The Choice of Databases
In our systematic reviews, we performed our literature search across databases, such as Academic Search Complete, TOC Premier, and ERIC. For the systematic review of flipped health professions, we further used databases of medical education, including PubMed, PsycINFO, CINAHL Plus, and British Nursing Index. In my experience, there are relatively few documents about flipped learning in the ERIC database. For example, Fig. 1 shows that we obtained 1611 peer-reviewed journal articles (though not all articles were related to flipped learning) in Academic Search Complete using our search string of health professions, but only 14 in ERIC. This situation was similar to the systematic review of flipped engineering education by Karabulut-Ilgu et al. (2018), in which only two documents were found in ERIC. Therefore, flipped classroom research reviewers should not restrict their searches to this database.
Fig. 1 The search outcome of flipped classroom research across databases in health professions (Hew and Lo 2018, p. 4)
Apart from the aforementioned databases, other researchers (e.g., Lundin et al. 2018; O'Flaherty and Phillips 2015; Ramnanan and Pound 2017) have used the following databases in their systematic reviews of flipped learning: Cochrane Library, EMBASE, Joanna Briggs Institute, Scopus, and Web of Science. In future systematic reviews, relevant databases need to be consulted. Researchers can follow existing reviews in their research field or consult librarians for advice on which databases to use.

Article Selection
After obtaining the search outcomes, we selected articles based on our inclusion and exclusion criteria. Other existing systematic reviews also develop criteria for article selection. However, they have a few constraints (Table 5) that reviewers may disagree with and that could significantly limit the number of studies included. As a result, the representativeness and generalizability of the reviews could be impaired. Researchers should thus provide strong rationales for their inclusion and exclusion criteria for article selection. Take the recent systematic review by Lundin et al. (2018) as an example. They reviewed the most-cited publications on flipped learning and only included publications that were cited at least 15 times in the Scopus database. With such a constraint, 493 out of 530 documents were excluded in the early stage of their review. Only 31 articles were ultimately included in their synthesis. This particular criterion could block the inclusion of recently published articles because it takes time to accumulate citations. The majority of the articles that they included were published in 2012 (n = 6), 2013 (n = 16), and 2014 (n = 5), with only a scattering of articles from 2000 (n = 1), 2008 (n = 1), and 2015 (n = 2). No documents after 2016 were included in their systematic review. The authors argued that citation frequency is "an indicator of which texts are widely used in this emerging field of research" (p. 4). However, further justification may help highlight the value of examining this particular set of documents instead of a more comprehensive one. They would also need to provide a strong rationale for their 15+ citation threshold (as opposed to 10+ or other possibilities). In our systematic reviews, we also added a controversial criterion for article selection, the definition of the flipped classroom approach.
In my own conceptualization, "Inverting the classroom means that events that have traditionally taken place inside the classroom now take place outside the classroom and vice versa" (Lage et al. 2000, p. 32). What traditionally takes place inside the classroom is instructor lecturing. Therefore, I agree with the definition of Bishop and Verleger (2013) that instructional videos (or other forms of multimedia materials) must be provided for students' class preparation. For me, the use of pre-class videos is a necessary element for flipped learning, although it is not the whole story. Merely asking students to read text-based materials on their own before class is not a method of flipping. As one student of Wu et al. (2017) said, "Sometimes I couldn't get the meanings by reading alone. But the instructional videos helped me understand the overall meaning" (p. 150). Using instructional videos, instructors of flipped classrooms still deliver lectures and explain concepts for their students (Bishop and Verleger 2013). Most importantly, this instructional medium can "closely mimic what students in a traditional setting would experience" (Love et al. 2015, p. 749).
However, a number of researchers have challenged the definition provided by Bishop and Verleger (2013). For example, He et al. (2016) asserted that "qualifying instructional medium is unnecessary and unjustified" (p. 61). During the peer-review process, reviewers have also questioned our systematic reviews and disagreed with the use of this definition. In response to the reviewers' concern, we added a section discussing our rationale for using the definition by Bishop and Verleger (2013). We also acknowledged that our systematic review "focused specifically on a set of flipped classroom studies in which pre-class instructional videos were provided prior to face-to-face class meetings" (Lo et al. 2017, p. 50). Without a doubt, if instructors insist on "flipping" their courses using pre-class text-based materials only, they will not find our review very useful. Therefore, in addition to explaining the criteria for article selection, future systematic reviews should detail their review scope and acknowledge the limitations of reviewing only a particular set of articles.

Research Synthesis
The difficulty of the research synthesis is related to the number of studies to be analyzed. My research synthesis of flipped history education (Lo 2017) was not difficult. In this systematic review, I found only five empirical studies at the time of writing (June 2016). I first extracted the data on learning activities, learning outcomes, benefits, and challenges reported in the reviewed studies. These data were then organized and presented in a logical sequence (e.g., from pre-class to in-class). Similarly, Betihavas et al. (2016) reviewed and identified themes from only five empirical studies of flipped nursing education. They focused on study characteristics, academic performance outcomes, student satisfaction, and challenges in implementing flipped classrooms. With a limited number of studies, Betihavas et al. (2016) were able to discuss the findings of each reviewed study in detail.
In contrast, synthesizing the findings of a large number of studies is challenging and time-consuming. In our systematic review of flipped mathematics education, we included and analyzed 61 empirical studies. We read through all of the texts, focusing particularly on the results/findings and discussion sections. One of our research objectives was to understand how the flipped classroom approach benefits student learning, and the challenges of flipping mathematics courses. Codes were assigned to pieces of data (i.e., the benefits and challenges reported in the reviewed studies). Thanks to previous efforts in flipped classroom research, we were able to adopt the frameworks by Kuiper et al. (2015) and Betihavas et al. (2016) as our initial analytic frameworks for benefits and challenges, respectively. Despite the large amount of data to be analyzed, these established frameworks made our research synthesis easier.
Taking the challenges of implementing flipped classrooms as an example, Betihavas et al. (2016) defined three kinds of challenges in their systematic review of flipped nursing education, namely (1) student-related challenges, (2) faculty challenges, and (3) operational challenges. This framework basically covered every aspect involved in implementing a flipped classroom. We therefore adopted this framework as our initial analytic framework for flipped mathematics education. With these three kinds of challenges defined as the major themes, all of the identified challenges were then organized into sub-themes (Table 6).
Furthermore, we quantified our thematic analysis by counting the number of studies that contributed to a theme. In this way, our findings could be more specific. Most importantly, such an analysis provided a foundation to develop our design principles to address these challenges. For example, the most-reported student-related challenge was students' unfamiliarity with flipped learning. Therefore, our Principle 1 was to manage their transition to the flipped classroom. We recommended that instructors introduce students to (1) the rationale for flipped learning, (2) the potential benefits and challenges of this instructional approach, (3) the logistics of their flipped course, and (4) the tasks that students need to do.
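Counting the number of studies that contribute to a theme can be sketched as follows. The coded data and study labels are hypothetical and serve only to illustrate the counting rule assumed here: a study counts once per sub-theme, however many passages in it received that code.

```python
from collections import Counter

# Hypothetical coding results: study ID -> list of (theme, sub-theme) codes.
coded_studies = {
    "Study A": [
        ("Student-related", "Unfamiliarity with flipped learning"),
        ("Faculty", "Significant start-up effort"),
    ],
    "Study B": [
        ("Student-related", "Unfamiliarity with flipped learning"),
        ("Student-related", "Unfamiliarity with flipped learning"),  # coded twice
        ("Operational", "Students' lacking IT resources"),
    ],
    "Study C": [
        ("Faculty", "Significant start-up effort"),
    ],
}

def count_studies_per_subtheme(coded):
    """Count how many studies contribute to each (theme, sub-theme) pair."""
    counts = Counter()
    for codes in coded.values():
        counts.update(set(codes))  # de-duplicate so each study counts once
    return counts

counts = count_studies_per_subtheme(coded_studies)
for (theme, sub_theme), n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{theme}: {sub_theme} (n = {n})")
```

Sorting by count puts the most-reported challenge first, which is the ordering a reviewer would use when deciding which design principles to prioritize.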

Table 6 Thematic analysis of the challenges of flipped mathematics education (Lo et al. 2017, p. 61)
Theme: Sub-themes (Count)
Student-related challenges
• Unfamiliarity with flipped learning (n = 26)
• Unpreparedness for pre-class learning tasks (n = 14)
• Unable to ask questions during out-of-class learning (n = 13)
• Unable to understand video content (n = 11)
• Increased workload (n = 9)
• Disengaged from watching videos (n = 3)
Faculty challenges
• Significant start-up effort (n = 21)
• Not accustomed to flipping (n = 10)
• Ineffectiveness of using others' videos (n = 4)
Operational challenges
• Instructors' lacking IT skills (n = 3)
• Students' lacking IT resources (n = 3)

Summary
This chapter shared some experiences of conducting systematic reviews of flipped classroom research. Table 7 recaps the recommendations for future systematic reviews. First, researchers can understand the current state of the literature and identify research gaps by conducting systematic reviews. Systematic reviews can inform future practice or examine the overall effect of instructional strategies. This chapter discussed several practical aspects of systematic reviews such as literature search, article selection, and research synthesis. To identify relevant documents, researchers should design more flexible search strings using the asterisk and Boolean operators. Moreover, relevant databases should be consulted in the literature search. Researchers should also provide strong rationales for inclusion and exclusion criteria for article selection. Meanwhile, they should acknowledge any possible limitations of their review scope. For the research synthesis, researchers can adopt established frameworks as initial analytic frameworks. Finally, the thematic analysis can be quantified by counting the number of studies that contribute to a theme. Taking these recommendations into account, the quality of future systematic reviews can be improved.