Abstract
This paper estimates the effects of a comprehensive school reform program on high-stakes test scores in Amsterdam. The program implements a systematic, performance-based way of working within weakly performing primary schools and integrates measures such as staff coaching, teacher evaluations, teacher training, and the use of new instruction methods. Difference-in-differences estimates show substantial negative effects on test scores for pupils in their final year of primary school: the program decreased test scores by 0.17 standard deviations in the first 4 years after its introduction. A potential explanation for this finding is that the intensive and rigorous approach created an unstable work climate with increased teacher replacement.
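The difference-in-differences design mentioned above can be illustrated with a minimal sketch. The data below are simulated, not the paper's data; the variable names and the built-in effect of −0.17 standard deviations are purely illustrative, chosen to mirror the headline estimate.

```python
import numpy as np

# Hypothetical illustration of the difference-in-differences (DiD) design:
#   score = a + b*Amsterdam + c*Post + d*(Amsterdam x Post) + e,
# where d is the treatment effect of interest.
rng = np.random.default_rng(0)
n = 4000
amsterdam = rng.integers(0, 2, n)   # treated-city indicator
post = rng.integers(0, 2, n)        # post-introduction indicator
true_effect = -0.17                 # assumed effect, in standard deviations
score = (0.05 * amsterdam + 0.10 * post
         + true_effect * amsterdam * post
         + rng.normal(0.0, 1.0, n))

# OLS via least squares: constant, group dummy, period dummy, interaction
X = np.column_stack([np.ones(n), amsterdam, post, amsterdam * post])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
print(f"DiD estimate of the treatment effect: {beta[3]:.3f}")
```

The interaction coefficient `beta[3]` recovers the built-in effect up to sampling noise; the paper's actual specification additionally includes year dummies and pupil background controls.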
Notes
Over 800 variations of CSR models have been implemented in more than 5000 schools in the US in the past decades (Rowan et al. 2004).
The expert team largely consists of former inspectors of the Dutch Inspectorate of Education.
The teacher evaluation system (TES) is called ‘Kijkwijzer’. Van der Steeg and Gerritsen (2013) find that high teacher quality scores on this TES are associated with better pupil test scores. This suggests that the TES measures teacher practices that are important for the educational performance of pupils.
In exceptional cases where educational goals have not yet been achieved, the program can be extended by an additional third year.
These courses are mainly targeted at school-leaders and supportive school personnel.
With an average primary school size of 220 pupils, the average yearly ASIP investment per pupil is around 680 Euros. This is more than 10% of the per-pupil government funding of around 5000 Euros.
This program was called ‘Omdat elk kind telt in Zuidoost’.
This action program is called ‘Beter Presteren’ and focuses on additional school time, professionalisation of schools and parental involvement.
An exception is made for pupils in special categories, such as foreign students who have been in the Netherlands only for a short time and students who are expected to be assigned to secondary education types with special care (see also Table 3).
The total number of primary schools classified as ‘weak’ or ‘very weak’ on 1 January 2008 equals 751, of which 38 are in Amsterdam. We do not observe the schools that never participated in the CITO test.
A five point increase in CITO test score corresponds roughly to a one level higher secondary education type.
The median CITO score equals 534.
All primary schools that are classified as weak or very weak are subject to the nationwide interventions of the Dutch Inspectorate of Education. This may explain the improvement in test scores of weakly performing schools over time.
In the case of a so-called ‘Ashenfelter dip’, one would expect schools in Amsterdam to improve after the introduction of the program because of mean reversion. One might then incorrectly conclude that this improvement was caused by the program.
The subsidy factor equals 1.2 if the highest completed education level is primary education for at least one of the parents and lower secondary education for the other. The subsidy factor equals 0.3 if lower secondary education is the highest completed education level for both parents (or for the parent who is responsible for daily care).
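The weighting rule in this note can be sketched as a small lookup function. The function name and category labels are hypothetical, and cases not covered by the note default to 0.0 in this sketch.

```python
def subsidy_factor(parent1_edu, parent2_edu=None):
    """Illustrative sketch of the pupil subsidy-factor rule described in the
    note. Education levels: "primary" < "lower_secondary" < "higher".
    Combinations not specified in the note return 0.0 here."""
    if parent2_edu is None:  # single parent responsible for daily care
        return 0.3 if parent1_edu == "lower_secondary" else 0.0
    levels = {parent1_edu, parent2_edu}
    # 1.2: at least one parent completed at most primary education and the
    # other at most lower secondary education
    if "primary" in levels and levels <= {"primary", "lower_secondary"}:
        return 1.2
    # 0.3: lower secondary is the highest completed level for both parents
    if levels == {"lower_secondary"}:
        return 0.3
    return 0.0
```

For example, `subsidy_factor("primary", "lower_secondary")` returns 1.2, while `subsidy_factor("lower_secondary", "lower_secondary")` returns 0.3.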
This type of education is called ‘Leerwegondersteunend Onderwijs’ (LWOO). Pupils in this category generally suffer from learning delays, low IQ, and/or social or emotional problems.
Estimated treatment effects in models without the year dummies are very similar.
For example, the increase in the share of non-disadvantaged pupils with a subsidy factor of 0 between 2008 and 2012 is larger in Amsterdam than that in other cities (see Table 3). Since non-disadvantaged pupils are more likely to perform well on the CITO test, inclusion of this control variable decreases the estimated treatment effect.
In models (3) and (4) the estimated effects are both larger (in absolute value) than the total sample estimates, while models (5) and (6) both yield smaller estimates. This can be explained by the fact that we leave out pupils with missing values on socioeconomic status or home language.
Excluding the schools in Rotterdam in the G4 sample yields an estimated effect of −0.167***(0.061) on the total CITO test score.
The sample contains 614 schools and 2 time periods. Not all 614 schools are present in both the before and the after period: there are 561 schools in the before period and 584 schools in the after period.
Since the CITO test scores are important for judging educational quality, schools may have an incentive for shaping the testing pool. Previous studies have shown that schools can respond strategically to the implementation of accountability policies by excluding weak students from the test (e.g. Jacob 2005).
The estimated effect for the interaction term is 0.001 (0.029). The included pupil background characteristics are gender, age, age squared, and a categorical socioeconomic status variable that distinguishes six categories based on parental education level and ethnic origin. The total sample includes 18,887 pupils in grade eight divided over 676 different schools.
Restricting the sample to only those that were classified as weak or very weak on 1 January 2008 leaves us with only 6 schools in Amsterdam, of which 3 participated in the ASIP. A similar analysis on this subsample yields an estimated effect for the interaction term of 0.005 (0.006).
Please note that this analysis is not fully informative on the magnitude of the potential bias caused by grade retention. After all, it does not exclude those pupils in Amsterdam who were not retained after the introduction of the policy but who would have been retained in its absence. In addition, the lower estimated effect may well be explained by the exclusion of weakly performing pupils, for whom the impact of the policy on test scores is likely to be more detrimental (see Table 6).
References
Angrist, J. D., Imbens, G., & Rubin, D. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434), 444–455.
Ashenfelter, O. (1978). Estimating the effect of training programs on earnings. Review of Economics and Statistics, 60, 47–50.
Ashenfelter, O., & Card, D. (1985). Using the longitudinal structure of earnings to estimate the effect of training programs. Review of Economics and Statistics, 67(4), 648–660.
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How much should we trust differences-in-differences estimates? Quarterly Journal of Economics, 119, 249–275.
Bifulco, R., Duncombe, W., & Yinger, J. (2005). Does whole-school reform boost student performance? The case of New York City. Journal of Policy Analysis and Management, 24, 47–72.
Bloom, H. (1984). Accounting for no-shows in experimental evaluation designs. Evaluation Review, 8, 225–246.
Blundell, R., Duncan, A., & Meghir, C. (1998). Estimating labor supply responses using tax reforms. Econometrica, 66(4), 827–861.
Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2003). Comprehensive school reform and achievement: A meta-analysis. Review of Educational Research, 73(2), 125–230.
Borman, G. D., Hewes, G. M., Overman, L. T., & Brown, S. (2004). Comprehensive school reform and achievement: A meta-analysis. In C. T. Cross (Ed.), Putting the pieces together: Lessons from comprehensive school reform research (pp. 53–108). Washington, DC: The National Clearinghouse for Comprehensive School Reform.
Card, D., & Krueger, A. (1994). Minimum wages and employment: A case study of the fast-food industry in New Jersey and Pennsylvania. American Economic Review, 84(4), 772–793.
Cook, T. D., Habib, E., Phillips, M., Settersten, R., Shagle, S. C., & Degirmencioglu, S. M. (1999). Comer’s School Development Program in Prince George’s County, Maryland: A theory-based evaluation. American Educational Research Journal, 36(3), 543–597.
Cook, T. D., Hunt, H. D., & Murphy, R. E. (1998). Comer’s School Development Program in Chicago: A theory-based evaluation. WP-98-24. Evanston, IL: Institute for Policy Research, Northwestern University.
Driessen, G., Mulder, L., Ledoux, G., Roeleveld, J., & van der Veen, I. (2009). Cohortonderzoek COOL5-18. Technisch rapport basisonderwijs, eerste meting 2007/08. Nijmegen: ITS / Amsterdam: SCO-Kohnstamm Instituut.
Driessen, G., Mulder, L., & Roeleveld, J. (2012). Cohortonderzoek COOL5-18. Technisch rapport basisonderwijs, tweede meting 2010/11. Nijmegen: ITS / Amsterdam: SCO-Kohnstamm Instituut.
Feng, L., Figlio, D., & Sass, T. (2010). School accountability and teacher mobility. NBER Working Paper No. 16070.
Figlio, D., & Loeb, S. (2011). School Accountability. In E. Hanushek, S. Machin, & L. Woessmann (Eds.), Handbook of the economics of education (Vol. 3, pp. 383–421). The Netherlands: Elsevier.
Gross, B., Brooker, T. K., & Goldhaber, D. (2009). Boosting student achievement: The effect of comprehensive school reform on student achievement. Educational Evaluation and Policy Analysis, 31(2), 111–126.
Herman, R., Aladjem, D., McMahon, P., Masem, E., Mulligan, I., O’Malley, A. S., et al. (1999). An educator’s guide to schoolwide reform. Arlington, VA: American Institutes for Research.
Imbens, G. W., & Angrist, J. D. (1994). Identification and estimation of local average treatment effects. Econometrica, 62(2), 467–475.
Inspectorate of Education. (2008). Regionale analyse; Een analyse van het Amsterdamse basisonderwijs. Utrecht.
Inspectorate of Education. (2009). De Staat van het Onderwijs, Onderwijsverslag 2007/2008. Utrecht.
Jacob, B. A. (2005). Accountability, incentives and behavior: The impact of high-stakes testing in the Chicago Public Schools. Journal of Public Economics, 89, 761–796.
Luna, C., & Turner, C. L. (2001). The impact of the MCAS: Teachers talk about high-stakes testing. English Journal, 91(1), 79–87.
May, H., & Supovitz, J. A. (2006). Capturing the cumulative effects of school reform: An 11-year study of the impacts of America’s Choice on student achievement. Educational Evaluation and Policy Analysis, 28(3), 231–257.
Millsap, M. A., Chase, A., Obiedallah, D., & Perez-Smith, A. (2001). Evaluation of the Comer School Development Program in Detroit, 1994–1999: Methods and results. Washington, DC.
Moulton, B. (1986). Random group effects and the precision of regression estimates. Journal of Econometrics, 32(3), 385–397.
Municipality of Amsterdam. (2009). Kwaliteitsaanpak Basisonderwijs Amsterdam; programmaplan 2009–2014. Amsterdam.
Rowan, B., Barnes, C., & Camburn, E. (2004). Benefiting from comprehensive school reform: A review of research on CSR implementation. In C. T. Cross (Ed.), Putting the pieces together: Lessons from comprehensive school reform research (pp. 1–52). Washington, DC: National Clearinghouse for Comprehensive School Reform.
Schwartz, A. E., Stiefel, L. S., & Kim, D. Y. (2004). The impact of school reform on student performance: Evidence from the New York Network for School Renewal Project. Journal of Human Resources, 39(2), 500–522.
Slavin, R. E., & Fashola, O. S. (1998). Show me the evidence!. Thousand Oaks, CA: Corwin Press.
Traub, J. (1999). Better by design? A consumer’s guide to schoolwide reform. Washington, DC: Thomas B. Fordham Foundation.
Tyack, D., & Cuban, L. (1995). Tinkering toward utopia: A century of public school reform. Cambridge, MA: Harvard University Press.
U.S. Department of Education. (2004). Implementation and early outcomes of the Comprehensive School Reform Demonstration (CSRD) Program (No. 2004-15). Jessup, MD: Policy and Program Studies Service, U.S. Department of Education.
U.S. Department of Education. (2006). Comprehensive school reform program: Funding status. Washington, DC: Author. Retrieved July 6, 2006, from http://www.ed.gov/programs/compreform/funding.html.
Van der Steeg, M., & Gerritsen, S. (2013). Teacher evaluations and pupil achievement; Evidence from classroom observations. CPB Discussion Paper 230.
Acknowledgements
The authors would like to thank Chris van Klaveren, Dinand Webbink, Bas ter Weel, Debby Lanser, Ted Reininga, Jacqueline Visser and seminar participants at the Ministry of Education and CPB The Hague for their valuable comments. The authors also thank the Dutch Inspectorate of Education, the CITO organisation, and the municipality of Amsterdam for supplying the data used in this paper.
Cite this article
van Elk, R., Kok, S. The Impact of a Comprehensive School Reform Policy for Weak Schools on Educational Achievement; Results of the First 4 years. De Economist 164, 445–476 (2016). https://doi.org/10.1007/s10645-016-9281-4