
The Psychometric Evaluation of a Summative Multimedia-Based Performance Assessment

Part of the Communications in Computer and Information Science book series (CCIS, volume 571)


This article presents a case study on the design, development, and evaluation of a multimedia-based performance assessment (MBPA) for measuring the skills of confined space guards. A confined space guard (CSG) supervises operations carried out in a confined space (e.g., a tank or silo). Currently, individuals who want to become certified CSGs in the Netherlands must complete a one-day training program and pass both a knowledge-based multiple-choice (MC) test and a practice-based performance-based assessment (PBA). Our goal is to use the MBPA to measure the skills that are currently assessed through the PBA. We first discuss the design and development of the MBPA. We then present an empirical study used to assess the quality of the measurement instrument. A representative sample of 55 CSG students who had just completed the one-day training program first took the MC test and then, depending on the condition to which they were assigned, the PBA or the MBPA. We report the psychometric properties of the MBPA. Furthermore, using correlations and regression analysis, we make an empirical comparison between students' scores on the PBA and the MBPA. The results show that students' scores on the PBA and the MBPA are significantly correlated and that students' MBPA score is a good predictor of their PBA score. In the discussion, we provide implications and directions for future research and practice in the field of MBPA.


  • Performance-based assessment
  • Multimedia-based performance assessment
  • Psychometric evaluation
  • Design and development

Author information

Corresponding author

Correspondence to Sebastiaan De Klerk.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

De Klerk, S., Veldkamp, B.P., Eggen, T. (2015). The Psychometric Evaluation of a Summative Multimedia-Based Performance Assessment. In: Ras, E., Joosten-ten Brinke, D. (eds) Computer Assisted Assessment. Research into E-Assessment. TEA 2015. Communications in Computer and Information Science, vol 571. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27703-5

  • Online ISBN: 978-3-319-27704-2

  • eBook Packages: Computer Science, Computer Science (R0)