Three Things Game Designers Need to Know About Assessment

  • Robert J. Mislevy
  • John T. Behrens
  • Kristen E. DiCerbo
  • Dennis C. Frezzo
  • Patti West


Designing game-based assessments requires coordinating the work of people from communities with little overlap, such as subject matter experts, game designers, software engineers, assessment specialists, and psychometricians. This chapter discusses three things that game designers should know about assessment so that this work converges on a common goal: (1) Assessment design is compatible with game design because both build on the same principles of learning. (2) Assessment is not really about numbers; it is about the structure of reasoning. (3) The key constraints of assessment design and game design need to be addressed, even if in rudimentary form, from the very beginning of the design process. The assessment design framework called “evidence-centered design” is introduced to complement game design principles, so that designers can address assessment criteria such as reliability and validity jointly with game criteria such as engagement and interactivity. The ideas are illustrated with examples from the Packet Tracer simulation environment and the Aspire game, both used in the Cisco Networking Academies for learning and assessing computer network engineering.


Keywords: Work Product; Task Model; Game Design; Computerized Adaptive Test; Student Model



The work reported here was supported in part by a research contract from Cisco Systems, Inc., to the University of Maryland, College Park, and the Center for Advanced Technology in Schools (CATS), PR/Award Number R305C080015, as administered by the Institute of Education Sciences, U.S. Department of Education. The findings and opinions expressed in this report are those of the authors and do not necessarily reflect the positions or policies of Cisco, the CATS, the National Center for Education Research (NCER), the Institute of Education Sciences (IES), or the U.S. Department of Education.



Copyright information

© Springer Science+Business Media New York 2012

Authors and Affiliations

  • Robert J. Mislevy (1)
  • John T. Behrens (2)
  • Kristen E. DiCerbo (2)
  • Dennis C. Frezzo (3)
  • Patti West (4)

  1. Educational Testing Service, Princeton, USA
  2. Center for Digital Experience and Analytics, Pearson, Austin, USA
  3. Instructional Research and Technology, Cisco Systems, San Francisco, USA
  4. Cisco Networking Academy, Ocala, USA
