To date, nine published studies have analyzed student learning outcomes when OER are substituted for traditional textbooks in higher education settings. In this section I review these studies and synthesize their overall results.
Lovett et al. (2008) measured the results of implementing an online OER component of Carnegie Mellon University’s Open Learning Initiative (OLI). In fall 2005 and spring 2006, researchers invited students who had registered for an introductory statistics class at Carnegie Mellon to participate in an experimental online version of the course that utilized OER. Volunteers for the experimental version of the course were randomly assigned to either treatment or control conditions; those who did not volunteer also became part of the control group, which was taught face-to-face and used a commercial textbook.
In the fall of 2005 there were 20 students in the treatment group and 200 in the control group; in the spring of 2006 there were 24 students in the treatment group and an unspecified number in the control group. Researchers compared the test scores (three midterms and one final exam) of students in the experimental and control versions of the course for each of these two semesters and found no statistically significant differences.
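To make the nature of such comparisons concrete, the sketch below shows one standard way to test for a difference in mean exam scores between two independent groups, a Welch two-sample t-test. This is an illustration only: the scores are synthetic placeholders, and Lovett et al. (2008) do not specify that this exact test was used.

```python
from scipy import stats

# Hypothetical final-exam scores; NOT data from Lovett et al. (2008).
treatment_scores = [82, 75, 90, 68, 88, 79, 85, 73, 91, 77]
control_scores = [80, 78, 84, 70, 86, 74, 83, 76, 89, 72]

# Welch's t-test does not assume equal variances in the two groups.
t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")  # p >= 0.05 -> no significant difference
```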
In a follow-up experiment reported in the same study, students in the spring of 2007 were given the opportunity to opt into a blended learning environment in which those who utilized OER in combination with face-to-face instruction would complete the course materials in half the time taken by those in the traditional version of the course. In this instance, the treatment and control groups (22 and 42 students, respectively) were drawn only from those who volunteered to participate in the accelerated version of the course. The authors stated that “as in the two previous studies, in-class exams showed no significant difference between the traditional and online groups.…[however] students in OLI-Statistics learned 15 weeks’ worth of material as well or better than traditional students in a mere 8 weeks” (pp. 10, 12). Five months after the semester ended (seven months after the end for the treatment students), a follow-up test was given to determine how much of the material had been retained. No significant difference was found between the two groups.
In addition to comparing student exam scores, researchers examined student understanding of basic statistical concepts as measured by the national exam known as the “Comprehensive Assessment of Outcomes in a first Statistics course” (CAOS). Research subjects in the spring of 2007 took this test at the beginning and end of the semester in order to measure the change in their statistical understanding. Students in the blended version of the course improved their scores by an average of 18%; those in the control group improved their scores by an average of 3%, a statistically significant difference. This study is notable both for being the first published article to examine comparative learning outcomes when OER replace traditional learning materials and for its method of selecting participants. The approach used in the spring of 2007, when treatment and control groups were randomly assigned from the same pool of volunteers, represents an important attempt at randomization that has unfortunately rarely been replicated in OER studies. At the same time, it should be noted that the sample sizes were relatively small and that there was a confound between the method in which students were taught and the use of OER.
Bowen et al. (2012) can be seen as an extension of the study just discussed. They compared the use of a traditional textbook in a face-to-face introductory statistics class with the use of OER created by Carnegie Mellon University’s Open Learning Initiative taught in a blended format, extending the previous study to six different undergraduate institutions. As in the spring 2007 semester reported by Lovett et al. (2008), Bowen et al. (2012) contacted students at or before the beginning of each semester to ask for volunteers to participate in their study. Treatment and control groups were randomly selected from those who volunteered, and researchers determined that across multiple characteristics the two groups were essentially the same.
In order to establish some benchmarks for comparison, both groups took the same standardized test of statistical literacy (CAOS) at the beginning and end of the semester, as well as a final examination. In total, 605 students took the OER version of the course, while 2439 took the traditional version. Researchers found that students who utilized OER performed slightly better in terms of passing the course as well as on CAOS and final exam scores; however, these differences were marginal and not statistically significant.
Bowen et al. (2012) is the largest study of OER efficacy that both utilized randomization and provided rigorous statistical comparisons of multiple learning measures. A weakness of this study in terms of its connection with OER is that those who utilized the OER also received a different form of instruction (blended learning as opposed to face-to-face); the difference in instructional method may therefore have confounded any influence of the open materials. Nevertheless, it is important to note that the use of free OER did not lead to lower course outcomes in this study (Bowen et al. 2014 model how their 2012 results could impact the costs of receiving an education).
A third study (Hilton and Laman 2012) focused on an introductory psychology course taught at Houston Community College (HCC). In 2011, in order to help students save money on textbooks, HCC’s Psychology department selected an open textbook as one of the textbooks that faculty members could choose to adopt. The digital version was available for free, and digital supplements produced by faculty were also freely available to HCC students.
In the fall of 2011, seven full-time professors taught twenty-three sections using the open textbook as the primary learning resource; their results were compared with those from classes taught using commercial textbooks in the spring of 2011. Results were provided for 740 students, split roughly evenly between treatment and control conditions. Researchers used three metrics to gauge student success in the course: GPA, withdrawal rates, and departmental final exam scores. They attempted to control for a teacher effect by comparing those measures across the sections of two instructors, each of whom taught one set of students using a traditional textbook in the spring of 2011 and another using the open textbook in the fall of 2011.
Their overall results showed that students in the treatment group had a higher class GPA, a lower withdrawal rate, and higher scores on the departmental final exam. The same pattern held when comparing only students who had been taught by the same teacher. While these results may appear to demonstrate learning improvements, the study had significant methodological limitations. The population of individuals who take an introductory psychology course in the spring may differ from the population that takes the same course in the fall, and no attempt was made to contextualize this potential difference by providing information about fall and spring semesters in previous years. In addition, the course learning outcomes and the final exam were changed during the study period; while there is no indication that the altered test was harder or easier than previous tests, this is a significant weakness. Moreover, no analysis was performed to determine whether the results were statistically significant.
A fourth study (Feldstein et al. 2012) took place at Virginia State University (VSU). In the spring of 2010 the School of Business at VSU began implementing a new core curriculum. Faculty members were concerned because an internal survey indicated that only 47% of students purchased textbooks for their courses, largely because of affordability concerns. Consequently, they adopted open textbooks in many of the new core curriculum courses. Across the fall of 2010 and spring of 2011, 1393 students took courses utilizing OER, and their results were compared with those of 2176 students in courses not utilizing OER.
These researchers found that students in courses that used OER tended to receive better grades and had lower failure and withdrawal rates than their counterparts in courses that did not use open textbooks. While these results were statistically significant, the two sets of courses were not the same; the comparison was therefore between different courses, a confound that could easily mask or inflate any effect of the OER. In other words, while this study establishes that students using OER can obtain successful results, the researchers were comparing apples to oranges, which substantially weakens the conclusions that can be drawn.
In the fifth study, Pawlyshyn et al. (2013) reported on the adoption of OER at Mercy College. In the fall of 2012, 695 students utilized OER in Mercy’s basic math course, and their pass rates were compared with those of the fall of 2011, in which no OER were utilized. They found that when the open materials were integrated, student learning appeared to increase: pass rates in the math courses rose from 63.6% in fall 2011 (when traditional learning materials were employed) to 68.9% in fall 2012 (when all courses were taught with OER). More dramatic results were obtained when comparing the spring 2011 pass rate of 48.4% (no OER utilized) with the spring 2013 pass rate of 60.2% (all classes utilized OER). These results, however, must be tempered by the fact that no statement of statistical significance was included. Perhaps a more important limitation is that the decision to flip classroom instruction came simultaneously with the new curriculum, introducing a significant confound into the research design. Mercy’s supplemental use of explanatory videos and its new pedagogical model, rather than the OER, may be responsible for the change in student performance.
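For illustration, the sketch below shows the kind of significance test the study omits: a two-proportion z-test on the fall pass rates. The fall 2012 enrollment (695) is reported in the study; the fall 2011 enrollment below is a hypothetical placeholder, since it is not reported.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

n_2012 = 695   # treatment enrollment reported in the study (OER, fall 2012)
n_2011 = 700   # hypothetical control enrollment (not reported for fall 2011)

# Numbers of passing students implied by the reported pass rates.
passes = np.array([round(0.689 * n_2012), round(0.636 * n_2011)])
enrolled = np.array([n_2012, n_2011])

z_stat, p_value = proportions_ztest(passes, enrolled)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```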
In addition to the change in the math curriculum, Mercy College also adopted reading-focused OER components in some sections of a course on Critical Inquiry, a course with a large emphasis on reading skills. In the fall of 2011, 600 students took versions of the course that used OER, while an unspecified number of students enrolled in other sections did not use the OER. In the critical reading section of the post-course assessment, students who utilized OER scored an average of 5.73, compared with 4.99 for those in the control group (the highest possible score was 8). In the spring of 2013, students enrolled in OER versions of the Critical Inquiry course again performed better than their peers: on a post-course assessment with a maximum score of 20, students in the OER sections scored an average of 12.44, versus 11.34 in the control sections. As with the math results, no statement of statistical significance was included, and no efforts were made to control for potential differences in students or teachers. Another weakness of this aspect of the study is that significant professional development accompanied the deployment of the OER. It is conceivable that the professional development, or the collaboration among teachers, led to the improved results rather than the OER itself. If that were the case, then what might be most notable about the OER adoption is its use as a catalyst for deeper pedagogical change and professional growth.
A sixth study (Hilton et al. 2013) took place at Scottsdale Community College (SCC), a community college in Arizona. A survey of 966 SCC mathematics students showed that slightly less than half of these students (451) used some combination of loans, grants, and tuition waivers to pay for the cost of their education. Mathematics faculty members were concerned that the difficulties of paying for college might be preventing some students from purchasing textbooks and determined that OER could help students access learning materials at a much lower price.
In the fall of 2012, OER were used in five different math courses taken by 1400 students. Each of these courses had used the same departmental final exam for multiple years; researchers measured student scores on the final exam in order to compare student learning between 2010 and 2011 (when there were no OER in place) and 2012 (when all classes used OER). Issues with the initial placement tests meant that only four of the courses could be appropriately compared. Researchers found that while there were minor fluctuations in final exam scores and completion rates across the four courses and three years, these differences were not statistically significant. Like many of the studies discussed in this section, this study did not attempt to control for teacher or student differences, owing to the manner in which the adoption took place. While it is understandable that the math department wished to change all of its course materials simultaneously, it would have provided a better experimental context had only a portion of students and teachers been selected for an implementation of OER.
The seventh study (Allen et al. 2015) took place at the University of California, Davis. The researchers wanted to test the efficacy of an OER called ChemWiki in a general chemistry class. Unlike some of the studies previously discussed, the researchers attempted to approximate an experimental design that would control for the teacher effect by comparing the results of students in two sections taught by the same instructor at back-to-back hours. One of these sections was an experimental class of 478 students who used ChemWiki as their primary learning resource; the other was a control class of 448 students that used a commercial textbook. To minimize confounds, the same teaching assistants worked with both sections and common grading rubrics were utilized. Moreover, the researchers utilized a pretest to account for any prior knowledge differences between the two groups.
Students in both sections took identical midterm and final exams. Researchers found no significant differences between the overall results of the two groups. They also examined performance on individual exam items and observed no significant differences. Comparisons between beginning-of-semester pretests and final exam scores likewise showed no significant differences in individual learning gains; this pre/post analysis was an important control for initial differences between the two groups.
Researchers also administered student surveys in order to determine whether students in one section spent more time doing course assignments than those in the other section. They found that students in both sections spent approximately the same amount of time preparing for class. Finally, they administered the chemistry survey known as “Colorado Learning Attitudes about Science Survey” (CLASS) in order to discern whether student attitudes towards chemistry varied by treatment condition. Again, there was no significant difference.
The eighth study (Robinson 2015) examined OER adoption at seven different institutions of higher education that were part of an open education initiative named the Kaleidoscope Open Course Initiative (KOCI). Robinson focused on the pilot adoption of OER at these schools in seven different courses (Writing, Reading, Psychology, Business, Geography, Biology, and Algebra). In the 2012–2013 academic year, 3254 students across the seven institutions enrolled in experimental versions of these courses that utilized OER, and 10,819 enrolled in equivalent versions that utilized traditional textbooks. To approximate randomization, Robinson used propensity score matching on several key variables to minimize the differences between the two groups. After propensity score matching was completed, 4314 students remained, with 2157 in each of the two conditions.
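For readers unfamiliar with the technique, the sketch below illustrates 1:1 nearest-neighbor propensity score matching in its generic form. The covariate and column names are hypothetical, and Robinson’s actual variables and matching algorithm may differ.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def match_one_to_one(df: pd.DataFrame, covariates: list) -> pd.DataFrame:
    """Pair each treated (OER) student with the control student whose
    estimated propensity score is closest (matching with replacement)."""
    # Step 1: estimate each student's probability of being in an OER section.
    model = LogisticRegression(max_iter=1000)
    model.fit(df[covariates], df["used_oer"])
    df = df.assign(pscore=model.predict_proba(df[covariates])[:, 1])

    treated = df[df["used_oer"] == 1]
    control = df[df["used_oer"] == 0]

    # Step 2: find the nearest control score for every treated student.
    nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
    _, idx = nn.kneighbors(treated[["pscore"]])
    return pd.concat([treated, control.iloc[idx.ravel()]])

# Hypothetical usage, with illustrative covariate names:
# matched = match_one_to_one(students, ["prior_gpa", "age", "pell_eligible"])
```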
Robinson examined differences in final course grade, the percentage of students who completed the course with a grade of C- or better, and the number of credit hours taken, which was examined in order to explore whether lower textbook costs were correlated with students taking more courses. Robinson found that in five of the courses there were no statistically significant differences between the two groups in terms of final grades or completion rates. However, students in the Business course who used OER performed significantly worse, receiving on average almost a full grade lower than their peers. Those who took the OER version of the Psychology course also showed poorer results, on average receiving a final grade a half-grade lower (e.g., a B instead of a B+). Students in these two courses were also significantly less likely to pass the course with a C- or better.
In contrast, students who took the biology course that used OER were significantly more likely to complete the course, although there were no statistically significant differences between groups in overall course grades. Across all classes there was a small but statistically significant difference between the two groups in the number of credits taken, with students in OER versions of the courses taking on average 0.25 credits more than their counterparts in the control group. This study is notable among higher education OER efficacy studies for its rigorous use of propensity score matching to control for potentially important confounding variables.
In the ninth study, Fischer et al. (2015) performed follow-up research on the institutions participating in KOCI, focusing on OER implementation in the fall of 2013 and spring of 2014. Their original sample consisted of 16,727 students (11,818 control and 4909 treatment). From this sample, there were 15 courses that enrolled students in both treatment (n = 1087) and control (n = 9264) sections (the remaining students enrolled in courses with either all treatment or all control sections and were therefore excluded). While this represents a large sample size, students in treatment conditions were compared only with control students taking the same course; for example, students enrolled in a section of Biology 111 that used OER were compared only with students in Biology 111 sections that used commercial textbooks, not with students enrolled in a different course. Thus, when spread across 15 classes, there were too few treatment students to perform propensity score matching for the grade and completion analyses.
The researchers found that in two of the 15 classes, students in the treatment group were significantly more likely to complete the course (there were no differences in the remaining 13). In five of the treatment classes, students were significantly more likely to receive a C- or better; in nine of the classes there were no significant differences; and in one class, control students were more likely to receive a C- or better. Similarly, in terms of overall course grade, students in four of the treatment classes received higher grades, ten of the classes showed no significant differences, and students in one control class received higher grades than the corresponding treatment class.
Researchers utilized propensity score matching before examining the number of credits students took in each of the semesters, as this matching could be done across the different courses. Drawing on their original sample of 16,727 students, the researchers matched 4147 treatment subjects with 4147 controls. There was a statistically significant difference in enrollment intensity between the groups: students who enrolled in courses utilizing OER in fall 2013 took on average two credit hours more than those in the control group, even after controlling for demographic covariates. ANCOVA was then used to control for differences in fall enrollment and to estimate differences in winter enrollment. Again, there was a significant difference between the groups, with treatment subjects enrolling in approximately 1.5 more credits than controls.
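The sketch below illustrates the general form of such an ANCOVA, modeling winter credit hours as a function of treatment status while adjusting for fall credit hours. The column names are illustrative and are not taken from Fischer et al. (2015).

```python
import pandas as pd
import statsmodels.formula.api as smf

# df is assumed to hold one row per matched student, with the columns
# used_oer (0/1), fall_credits, and winter_credits (names are illustrative).
def winter_enrollment_ancova(df: pd.DataFrame):
    # The coefficient on used_oer estimates the treatment difference in
    # winter credits, adjusted for fall enrollment.
    model = smf.ols("winter_credits ~ used_oer + fall_credits", data=df).fit()
    return model.summary()
```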
This study is unique in its large sample size and its rigorous analysis of the number of credits taken by students. In some ways, its strength is also a weakness: because of the large number of contexts, the variety of OER utilized, the number of teachers involved, and so forth, it is difficult to pinpoint OER as the main driver of change. For example, it is possible that the teachers at the college that taught psychology using open resources were simply more proficient than those at the college where traditional textbooks were used. A host of other variables, such as student awareness of OER and the manner in which the classes were taught, were not analyzed in this study; these could have overwhelmed any influence of OER. Moreover, the authors neglect to provide an effect size, limiting the ability to determine the magnitude of the difference between the control and treatment courses. At the same time, one would expect that if using OER significantly impacted learning (for good or ill), that finding would be visible in the results. The lack of difference between the groups indicates that substituting OER for traditional resources was not a large factor in influencing learning outcomes.
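For context, one widely used effect size for a two-group comparison is Cohen’s d, the standardized mean difference (offered here as a general illustration, not a reanalysis of these data):

d = \frac{\bar{x}_T - \bar{x}_C}{s_p}, \qquad s_p = \sqrt{\frac{(n_T - 1)\,s_T^2 + (n_C - 1)\,s_C^2}{n_T + n_C - 2}},

where \bar{x}, s^2, and n denote the mean, variance, and size of the treatment (T) and control (C) groups. Reporting such a measure would have allowed readers to judge the practical magnitude of any differences independently of sample size.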
Table 1 summarizes the results of the nine published research studies that compare student learning outcomes in higher education based on whether students used OER or traditional textbooks.
Table 1 Summary of OER Efficacy Studies