Evaluating the Effectiveness of Instructional Methods
The previous chapters in this book present a variety of insights for the instructional design of high-stakes learning environments. These insights are based on randomised controlled experiments comparing different instructional formats for learners with varying degrees of prior experience with the content to be learned, as well as on other types of carefully designed studies. Moreover, efforts across fields have resulted in a variety of instruments for measuring cognitive load or, to some extent, even separate types of cognitive load. Some of these measurements have been used successfully in research in, for instance, emergency medicine settings. However, to bring instructional design research to the next level, a critical revision of common methodological and statistical practices for evaluating the effectiveness of different instructional methods is needed.

In this chapter, suboptimal practices that occur across the board in instructional design research are discussed, and more viable alternatives are provided. Although a variety of factors may constrain the sample sizes of our studies and the variables measured in them, we should make an effort to go beyond small samples and beyond single measurements whenever we can. Further, we should adopt alternatives to the traditional statistical significance testing approach that has long dominated research in education, psychology and other fields. Finally, we should adjust our approach to evaluating the reliability of our measurements, and we should consider an important recent development in peer-review and reporting practice: the registered report.
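To make the point about alternatives to significance testing concrete, the sketch below illustrates one such alternative discussed later in the chapter: comparing candidate models with an information criterion (here, Akaike's AIC) rather than deciding on a single p-value cut-off. This is a minimal sketch, not the chapter's prescribed analysis; the data are simulated and the variable names (score, load, condition) are hypothetical. It assumes the Python packages numpy, pandas and statsmodels are available.

```python
# Minimal sketch: comparing two candidate models of post-test performance
# with the Akaike Information Criterion (AIC) instead of a p-value cut-off.
# Data are simulated; variable names (score, load, condition) are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 120  # simulated learners; real studies should also aim beyond small samples
data = pd.DataFrame({
    "condition": rng.integers(0, 2, n),   # 0 = worked examples, 1 = problems
    "load": rng.normal(5, 1.5, n),        # self-reported cognitive load rating
})
data["score"] = (60 + 5 * data["condition"]
                 - 2 * data["load"] + rng.normal(0, 8, n))

# Model 0: performance depends on cognitive load only.
m0 = smf.ols("score ~ load", data=data).fit()
# Model 1: performance depends on load and instructional condition.
m1 = smf.ols("score ~ load + condition", data=data).fit()

# Lower AIC indicates a better trade-off between model fit and complexity,
# framing the question as "which model is better supported?" rather than
# "is the effect significant?".
print(f"AIC model 0: {m0.aic:.1f}")
print(f"AIC model 1: {m1.aic:.1f}")
print("Preferred:", "model 1" if m1.aic < m0.aic else "model 0")
```

Note that the comparison yields a relative statement about the candidate models considered, not a binary verdict about a null hypothesis, which is precisely the shift in framing the chapter advocates.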
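Similarly, for the point about reliability, the sketch below contrasts Cronbach's alpha with a one-factor estimate of McDonald's omega, which recent psychometric work argues is often the more informative coefficient when item loadings differ. This is a minimal sketch under simplifying assumptions: the five-item "scale" is simulated, and the omega computation assumes the third-party factor_analyzer package.

```python
# Minimal sketch: Cronbach's alpha vs. a one-factor McDonald's omega for a
# hypothetical five-item cognitive load scale. Data are simulated; the omega
# computation assumes the third-party factor_analyzer package is installed.
import numpy as np
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(1)
n_obs, n_items = 200, 5
trait = rng.normal(0, 1, n_obs)                  # common factor
loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.4])   # deliberately unequal loadings
items = trait[:, None] * loadings + rng.normal(0, 1, (n_obs, n_items))

# Cronbach's alpha from the item covariance matrix:
# alpha = k/(k-1) * (1 - sum of item variances / variance of the sum).
cov = np.cov(items, rowvar=False)
k = n_items
alpha = (k / (k - 1)) * (1 - np.trace(cov) / cov.sum())

# McDonald's omega (total) from a one-factor model:
# omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of uniquenesses).
fa = FactorAnalyzer(n_factors=1, rotation=None)
fa.fit(items)
lam = fa.loadings_.ravel()
omega = lam.sum() ** 2 / (lam.sum() ** 2 + fa.get_uniquenesses().sum())

print(f"Cronbach's alpha: {alpha:.2f}")
print(f"McDonald's omega: {omega:.2f}")
```

With unequal loadings, as simulated here, alpha's assumption of equally representative items does not hold, and omega gives a more faithful picture of how reliably the scale reflects the underlying construct.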