Towards a framework for the validation of early childhood assessment systems

Educational Assessment, Evaluation and Accountability

Abstract

American early childhood education is in the midst of drastic change. In recent years, states have begun the process of overhauling early childhood education systems in response to federal grant competitions, bringing an increased focus on assessment and accountability for early learning programs. The assessment of young children is fraught with challenges; psychometricians and educational researchers must work together with the early childhood community to develop these instruments. The purpose of this paper is to present a conceptual framework for the validation of such instrumentation and examine its implications for early childhood educators. We formulate a validity argument for early childhood assessments, providing a pivotal link between validity theory and early education practice. Recommendations for the assessment field are also considered.

Acknowledgments

This research was supported in part by a contract from the Connecticut State Department of Education.

Author information

Corresponding author

Correspondence to Jessica Goldstein.

About this article

Cite this article

Goldstein, J., Flake, J.K. Towards a framework for the validation of early childhood assessment systems. Educ Asse Eval Acc 28, 273–293 (2016). https://doi.org/10.1007/s11092-015-9231-8

  • DOI: https://doi.org/10.1007/s11092-015-9231-8
