Measuring Postsecondary Achievement: Lessons from Large-Scale Assessments in the K-12 Sector

Original Article, Higher Education Policy

Interest in using large-scale standardized assessments in the postsecondary sector has been growing rapidly in recent years. However, our experience is still limited, and there is a serious dearth of research investigating the characteristics and effects of testing in the postsecondary sector. We have far more extensive experience with large-scale testing in the K-12 sector, particularly in the USA. In this paper, I discuss a number of important issues that have arisen in K-12 testing and explore their implications for testing in the postsecondary sector. These include mistaking the part for the whole, overstating comparability, adding functions to extant tests without sufficient justification or validation, Campbell’s Law, and unwarranted causal inference. All of these issues are relevant to assessment in the postsecondary sector, and some are more severe in that sector than in K-12 education. I end with recommendations for productive and appropriate uses of assessments in this sector.


References

  • Allen, J. (2018) Personal communication, May 18.

  • American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. (2014) Standards for educational and psychological testing (2014 edition), Washington, DC: Authors.

  • American Statistical Association. (2014) ASA statement on using value-added models for educational assessment, Author. https://www.amstat.org/asa/files/pdfs/POL-ASAVAM-Statement.pdf. Accessed 11 Apr 2019.

  • Astin, A.W. and Antonio, A.L. (2012) Assessment for excellence: the philosophy and practice of assessment and evaluation in higher education, Lanham, MD: Rowman & Littlefield.

  • Breakspear, S. (2012) The Policy Impact of PISA: An Exploration of the Normative Effects of International Benchmarking in School System Performance, Paris: OECD Publishing (OECD Education Working Papers No. 71). http://dx.doi.org/10.1787/5k9fdfqffr28-en.

  • Campbell, D.T. (1976) ‘Assessing the Impact of Planned Social Change,’ Occasional paper #8, in G.M. Lyons (ed.) Social Research and Public Policies, Hanover, NH: Dartmouth College.


  • Castellano, K.E. and Ho, A.D. (2013) A practitioner’s guide to growth models, Washington, DC: Council of Chief State School Officers.


  • Cizek, G.J. (2016) ‘Validating test score meaning and defending test score use: different aims, different methods’, Assessment in Education: Principles, Policy and Practice 23(2): 212–225.


  • Coates, H. and Mahat, M. (2014) ‘Advancing student learning outcomes’ in H. Coates (ed.) Higher education learning outcomes assessment: international perspectives, Frankfurt am Main: Peter Lang, pp. 15–32.


  • Corcoran, S.P., Jennings, J.L. and Beveridge, A.A. (2012) Teacher effectiveness on high- and low-stakes tests, New York University, working paper. Retrieved from https://www.nyu.edu/projects/corcoran/papers/Corcoran_Jennings_Houston_Teacher_Effects.pdf. Accessed 11 Apr 2019.

  • Council for Aid to Education. (2013) Performance assessment: CLA+ overview, New York: Author. Retrieved from https://2014.accreditation.ncsu.edu/pages/3.5/3.5.1/CLA.pdf. Accessed 11 Apr 2019.

  • Hamilton, L.S., Nussbaum, E.M. and Snow, R.E. (1997) ‘Interview procedures for validating science assessments’, Applied Measurement in Education 10(2): 181–200.


  • Ho, A.D. (2007) ‘Discrepancies between score trends from NAEP and state tests: a scale-invariant perspective’, Educational Measurement: Issues and Practice 26(4): 11–20.


  • Holcombe, R., Jennings, J. and Koretz, D. (2013) ‘The roots of score inflation: an examination of opportunities in two states’ tests’, in G. Sunderman (ed.) Charting reform, achieving equity in a diverse nation, Greenwich, CT: Information Age Publishing, pp. 163–189. http://dash.harvard.edu/handle/1/10880587. Accessed 11 Apr 2019.

  • Hoover, H.D., Dunbar, S.B., Frisbie, D.A., Oberley, K.R., Ordman, V.L., Naylor, R.J., Bray, G.B., Lewis, J.C., Qualls, A.L., Mengeling, M.A. and Shannon, G.P. (2003) The Iowa tests: guide to research and development, Forms A and B, Itasca, IL: Riverside Publishing.

  • Jacob, B.A. (2005) ‘Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago public schools’, Journal of Public Economics 89(5–6): 761–796.


  • Judd, T. and Keith, B. (2012) ‘Student learning outcomes at the program and institutional levels’, in C. Secolsky and D.B. Denison (eds.) Handbook on measurement, assessment, and evaluation in higher education, New York: Routledge, pp. 31–46.

  • Kane, M.T. (2006) ‘Validation’, in R.L. Brennan (ed.) Educational measurement (4th ed.), Westport, CT: American Council on Education/Praeger, pp. 17–64.


  • Kane, M.T. (2016) ‘Explicating validity’, Assessment in Education: Principles, Policy and Practice 23(2): 198–211.


  • Klein, S.P., Hamilton, L.S., McCaffrey, D.F. and Stecher, B.M. (2000) What do test scores in Texas tell us? Santa Monica, CA: RAND (Issue Paper IP-202).

  • Klieme, E. (2016) TIMSS 2015 and PISA 2015: How Are They Related at the Country Level? DIPF working paper, Frankfurt, Germany: Deutsches Institut für Internationale Pädagogische Forschung.

  • Koretz, D. (2008) Measuring up: what educational testing really tells us, Cambridge, MA: Harvard University Press.


  • Koretz, D. (2016) ‘Making the term “validity” useful’, Assessment in Education: Principles, Policy and Practice 23(2): 290–292.


  • Koretz, D. (2017) The testing charade: pretending to make schools better, Chicago: University of Chicago Press.


  • Koretz, D. and Barron, S.I. (1998) The validity of gains on the Kentucky Instructional Results Information System (KIRIS), Santa Monica, CA: RAND (MR-1014-ED).

  • Koretz, D. and Hamilton, L.S. (2006) ‘Testing for accountability in K-12’, in R.L. Brennan (ed.) Educational measurement (4th ed.), Westport, CT: American Council on Education/Praeger, pp. 531–578.


  • Koretz, D., Linn, R.L., Dunbar, S.B. and Shepard, L.A. (1991) ‘The effects of high-stakes testing: preliminary evidence about generalization across tests,’ in R.L. Linn (chair), The effects of high stakes testing, symposium presented at the annual meetings of the American Educational Research Association and the National Council on Measurement in Education, Chicago, April. http://dash.harvard.edu/handle/1/10880553. Accessed 11 Apr 2019.

  • Lindquist, E.F. (1951) ‘Preliminary considerations in objective test construction’, in E.F. Lindquist (ed.) Educational measurement, Washington, DC: American Council on Education, pp. 119–184.


  • Lockwood, J.R., McCaffrey, D.F., Hamilton, L.S., Stecher, B., Le, V. and Martinez, J.F. (2007) ‘The sensitivity of value-added teacher effect estimates to different mathematics achievement measures’, Journal of Educational Measurement 44(1): 47–67.


  • Massachusetts Department of Elementary and Secondary Education. (2018) 2019 Next-generation MCAS test information for Grade 10 Mathematics, Malden, MA: Author (revised September 7). Retrieved from http://www.doe.mass.edu/mcas/tdd/math.html?section=nextgen. Accessed 11 Apr 2019.

  • McCaffrey, D.F., Lockwood, J.R., Koretz, D.M. and Hamilton, L.S. (2003) Evaluating value-added models for teacher accountability, Santa Monica, CA: RAND (MG-158-EDU). Retrieved from http://www.rand.org/pubs/monographs/MG158.html. Accessed 11 Apr 2019.

  • Messick, S. (1989) ‘Validity’, in R. Linn (ed.) Educational measurement (3rd ed.), Washington, DC: American Council on Education, pp. 13–100.


  • Moore, K., Coates, H. and Croucher, G. (2014) ‘Understanding and improving higher education productivity’, in E. Hazelkorn, H. Coates and A.C. McCormick (eds.) Research handbook on quality, performance and accountability in higher education, Cheltenham, UK: Edward Elgar, pp. 161–177.

  • Mullis, I.V.S., Martin, M.O. and Foy, P. (2008) TIMSS 2007 International Mathematics Report, Newton, MA: TIMSS & PIRLS International Study Center, Boston College.

  • Reardon, S.F. (2011) ‘The widening academic achievement gap between the rich and the poor: new evidence and possible explanations’, in R. Murnane and G. Duncan (eds.) Whither opportunity? Rising inequality and the uncertain life chances of low-income children, New York: Russell Sage Foundation, pp. 91–116.


  • Rothstein, R. (2008) Holding accountability to account: How scholarship and experience in other fields inform exploration of performance incentives in education, Nashville: National Center on Performance Incentives, Vanderbilt Peabody College. Retrieved from http://www.epi.org/files/2014/holding-accountability-to-account.pdf. Accessed 11 Apr 2019.

  • Rubinstein, J. (2000) Cracking the MCAS Grade 10 Math, New York: Princeton Review Publishing.


  • Secolsky, C. and Denison, D.B. (eds.) (2012) Handbook on measurement, assessment, and evaluation in higher education, New York: Routledge.

  • Shavelson, R.J. (2010) Measuring college learning responsibly, Stanford, CA: Stanford University Press.


  • Tremblay, K., Lalancette, D. and Roseveare, D. (2012) AHELO feasibility study report, Volume 1, Paris: OECD.


  • U.S. Department of Education. (2006) A test of leadership: charting the future of U.S. higher education, Washington, DC: Author.

  • Waldow, F. (2009) ‘What PISA did and did not do: Germany after the “PISA-Shock”’, European Educational Research Journal 8(3): 476–483. Published online 1 January. http://dx.doi.org/10.2304/eerj.2009.8.3.476.

  • Williams, R. (2014) ‘Comparing and benchmarking higher education systems’, in E. Hazelkorn, H. Coates and A.C. McCormick (eds.) Research handbook on quality, performance and accountability in higher education, Cheltenham, UK: Edward Elgar, pp. 178–188.


  • Wu, M. (2009) A Critical Comparison of the Contents of PISA and TIMSS Mathematics Assessments, unpublished working paper, University of Melbourne. Retrieved from https://edsurveys.rti.org/PISA/documents/WuA_Critical_Comparison_of_the_Contents_of_PISA_and_TIMSS_psg_WU_06.1.pdf. Accessed 11 Apr 2019.

  • Yamada, R. (2014) ‘Comparative analysis of learning outcomes assessment policy contexts’, in H. Coates (ed.) Higher education learning outcomes assessment: international perspectives, Frankfurt am Main: Peter Lang, pp. 33–48.



Author information


Correspondence to Daniel Koretz.


Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Koretz, D. Measuring Postsecondary Achievement: Lessons from Large-Scale Assessments in the K-12 Sector. High Educ Policy 32, 513–536 (2019). https://doi.org/10.1057/s41307-019-00142-4
