Abstract
Technology continues to change the way we live. Given its centrality to our lives, it is not surprising that its use in educational assessment has been increasing and has become an important focus of educational assessment research in recent years. While the initial motivation for computer-delivered assessments was gains in assessment efficiency, this chapter demonstrates that computer delivery can also enhance validity and reliability through the capture of process data. When an assessment is computer-delivered, every interaction of the test-taker with the environment can be recorded as process data. Such data hold much promise for providing previously inaccessible insights not just into whether a student solved a task, but into how they did so. Moreover, it is these processes, which underpin twenty-first century skills, that are likely to be amenable to direct targeting in teaching and learning. However, collecting large amounts of information without a plan for its analysis and use is unlikely to lead to useful outcomes. Through item response theory analysis of process data collected in the Digital Reading Assessment included in the 2012 cycle of the Programme for International Student Assessment (PISA), this chapter illustrates how process data describing the way a student navigates the problem space can be used to improve validity and reliability. By fitting alternative item response models to these data, it is shown that process data can improve measurement when a clear connection is made between the data and theories of developing competence in the domain of interest.
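To make the idea of recoding process data concrete, the sketch below shows how a stream of logged interactions for a single item might be converted into an ordered polytomous score of the kind a partial credit model could be fitted to. This is a minimal illustration only: the LogEvent schema and the "click_link" and "submit_response" action names are assumptions for the sketch, not the actual PISA logging format. Test-takers who answer correctly receive the highest score, while incorrect responders are separated by whether their navigation reached the page containing the required information.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class LogEvent:
    """One recorded test-taker interaction (hypothetical schema)."""
    action: str       # e.g. "click_link" or "submit_response"
    target: str       # a page identifier, or the submitted response value
    timestamp: float  # seconds since the item was presented

def polytomous_score(events: List[LogEvent],
                     target_page: str,
                     correct_response: str) -> int:
    """Recode a log stream into an ordered score:
    2 = correct response;
    1 = incorrect, but the target page was visited;
    0 = incorrect, and the target page was never visited."""
    visited = any(e.action == "click_link" and e.target == target_page
                  for e in events)
    response: Optional[str] = next(
        (e.target for e in reversed(events) if e.action == "submit_response"),
        None)
    if response == correct_response:
        return 2
    return 1 if visited else 0

# A test-taker who navigates to the right page but answers incorrectly
# scores 1 rather than 0, preserving evidence about their process.
events = [LogEvent("click_link", "page_fees", 2.1),
          LogEvent("submit_response", "B", 40.7)]
print(polytomous_score(events, target_page="page_fees",
                       correct_response="C"))  # -> 1
```

Scored this way, the navigation indicator contributes measurement information even when the final response is wrong, which is the mechanism by which process data can improve reliability.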
Notes
1. This explicit acknowledgement that digital reading has two components raises the possibility that digital reading should be viewed as a multidimensional, rather than a unidimensional, construct. Detailed comment on this issue is beyond the scope of this chapter, but a series of analyses suggested that the unidimensional model used in the analyses reported here was appropriate.
2. This material has been publicly released.
3. This unit can be viewed at the website cbasq.acer.edu.au.
4. Defining navigation so that exploration in a previous item counts towards whether the target page has been visited potentially introduces dependency issues (one way of operationalising this definition is sketched after these notes). Detailed exploration of this form of dependency (and potentially others) is beyond the scope of this chapter, but a series of tests of the level of dependency between items suggested that it was not an issue in the current work.
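The sketch below shows one way of operationalising the cumulative navigation definition discussed in note 4. The data layout (item ids mapped to lists of (action, page) pairs) and the "click_link" action name are assumptions for illustration, not the actual PISA log format: an item's target page is marked as visited if it was reached during that item or during any earlier item in the unit.

```python
def cumulative_visit_indicators(events_by_item, item_order, target_pages):
    """For each item (taken in presentation order), record whether its
    target page was reached during that item or any earlier item in the
    unit, mirroring the cumulative definition of navigation in note 4.

    events_by_item: item id -> list of (action, page) tuples
    item_order:     item ids in presentation order
    target_pages:   item id -> id of that item's target page
    """
    pages_seen = set()
    indicators = {}
    for item_id in item_order:
        pages_seen.update(page
                          for action, page in events_by_item.get(item_id, [])
                          if action == "click_link")
        indicators[item_id] = target_pages[item_id] in pages_seen
    return indicators

# Exploration during item1 counts towards item2's indicator.
log = {"item1": [("click_link", "page_a"), ("click_link", "page_b")],
       "item2": [("click_link", "page_c")]}
print(cumulative_visit_indicators(log, ["item1", "item2"],
                                  {"item1": "page_a", "item2": "page_b"}))
# -> {'item1': True, 'item2': True}
```

Because the indicator for a later item can depend on behaviour in an earlier one, this definition is exactly where the dependency concern of note 4 arises, which is why the level of between-item dependency was tested.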
Copyright information
© 2018 Springer International Publishing AG
Cite this chapter
Ramalingam, D., Adams, R. J. (2018). How Can the Use of Data from Computer-Delivered Assessments Improve the Measurement of Twenty-First Century Skills? In: Care, E., Griffin, P., Wilson, M. (eds) Assessment and Teaching of 21st Century Skills. Educational Assessment in an Information Age. Springer, Cham. https://doi.org/10.1007/978-3-319-65368-6_13
DOI: https://doi.org/10.1007/978-3-319-65368-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65366-2
Online ISBN: 978-3-319-65368-6
eBook Packages: Education (R0)