Abstract
Because competencies are multidimensional and are developed in group settings, their assessment in professional education goes beyond the methodological difficulties typical of any assessment of behaviour, personality or capability. Competencies are complex constructs for which it is challenging to develop reliable and valid assessments, because classical assumptions are violated: unidimensionality, meaning that the components of an assessment should reflect only one underlying dimension, and the independence of the individuals assessed.
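As a minimal illustration of the unidimensionality assumption, one common screening heuristic is to check whether the first eigenvalue of the inter-item correlation matrix dominates the rest. The sketch below uses hypothetical simulated data; the function name and the 0.3 noise level are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def first_eigenvalue_share(scores):
    """Share of total variance carried by the largest eigenvalue of the
    inter-item correlation matrix. A dominant first eigenvalue is
    consistent with (though it does not prove) unidimensionality."""
    corr = np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)
    # The trace of a correlation matrix equals the number of items,
    # so dividing the largest eigenvalue by k gives its variance share.
    return np.linalg.eigvalsh(corr).max() / corr.shape[0]

# Hypothetical data: four items that are noisy readings of ONE latent trait.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 1))
items = latent + 0.3 * rng.normal(size=(300, 4))
share = first_eigenvalue_share(items)  # close to 1 for unidimensional data
```

For four mutually independent items the share would hover near 1/4; a value far above that is what makes a single-dimension interpretation defensible.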
The present paper discusses these methodological challenges and presents ways to deal with them, based on examples from studies of competence-based professional education at the secondary and post-secondary level. Classical test theory provides a useful tool for examining the reliability of competence assessments and the quality of an assessment as a whole (rather than of single items). But since quality assurance in competence assessments is difficult, it is important to draw on a range of other methodological approaches, including generalizability theory (GT) and item response theory (IRT).
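The classical-test-theory reliability check mentioned above is most often operationalized as Cronbach's alpha, computed from the item variances and the variance of the total score. A minimal sketch, using a small hypothetical 5-person, 4-item assessment (the data are invented for illustration, not drawn from the studies discussed):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_persons, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)        # per-item sample variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical dichotomously scored (0/1) assessment data.
data = [[1, 1, 1, 0],
        [1, 0, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 0]]
alpha = cronbach_alpha(data)
print(round(alpha, 3))  # → 0.741
```

Note that alpha characterizes the assessment as a whole, which is exactly the CTT perspective described in the abstract; it says nothing about which facets (raters, tasks, occasions) drive the error, which is what GT and IRT add.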
Over the past decade, recognition of these methodological challenges has heightened sensitivity to subject-specific and methodological measurement problems, not least because of public debates about the outcomes of international large-scale assessments of student achievement, which have been scrutinized both in the media and in academia. Collaboration between methodological and subject-matter experts, together with technological progress, has strengthened the quality of competence assessments, as is again demonstrated with examples from professional education.
© 2017 Springer International Publishing Switzerland
Cite this chapter
Blömeke, S. (2017). Assuring Quality in Competence Assessments: The Value Added of Applying Different Assessment Approaches to Professional Education. In: Mulder, M. (eds) Competence-based Vocational and Professional Education. Technical and Vocational Education and Training: Issues, Concerns and Prospects, vol 23. Springer, Cham. https://doi.org/10.1007/978-3-319-41713-4_29
Print ISBN: 978-3-319-41711-0
Online ISBN: 978-3-319-41713-4