Abstract
School accountability schemes require measures of school performance, and these measures are in practice often based on pupil test scores. It is well known that insufficiently correcting these test scores for pupil characteristics may provide incentives for pupil selection. Building further on results from the theory of fair allocation, we show that the trade-off between reward and pupil selection is not only a matter of sufficient information. A school accountability scheme that rewards school performance will create incentives for pupil selection, even under perfect information, unless the educational production function satisfies an (unrealistic) separability assumption. We propose different compromise solutions and discuss the resulting incentives in theory. The empirical relevance of our analysis (i.e., the rejection of the separability assumption and the magnitude of the incentives in the different compromise solutions) is illustrated with Flemish data. The traditional value-added model turns out to be an acceptable compromise.
Notes
 1.
We will focus on school financing schemes, but the question is equally relevant for the design of report cards. It could also be relevant for designing differentiated vouchers (Epple and Romano 2008).
 2.
See Lefranc et al. (2009) for an alternative approach in which luck is not classified as either “responsibility” or “compensation”, but treated as a separate category.
 3.
We do not impose any restrictions on the function f. It is natural to define the variables a and b in such a way that they have a positive monotonic effect on school output, but this is not needed for our results.
 4.
Note that for the axiom to be meaningful, it is not necessary that increases in a have a positive effect on school output. If they do not, decreases in a will be rewarded.
 5.
Moreover, Meyer (1997) claims that the empirical relevance of his observation is limited because “the assumption that slopes do not vary across schools is often a very reasonable assumption.” In the next section, we falsify this claim with Flemish data.
 6.
The math tests consist of between 40 and 80 questions (depending on the grade). The score distributions are well behaved, showing no floor and only limited ceiling effects.
 7.
Note that Dutch is the official language in Flanders.
 8.
To limit the reduction in sample size, we add a dummy ‘missing’ in case of missing pupil level data (except for initial test scores). We will not report the corresponding estimates which are, as expected, never significant.
 9.
The coefficient on a dummy “present in the first period” is 0.036 (S.E. 0.056) and hence is not significantly different from zero. The dummy “present in the second period” gets a highly significant estimated coefficient of 0.291 (S.E. 0.048). A \(\chi ^{2}\)-test for joint significance of the two dummies yields \(\chi ^{2}(2)=39.67\) (\(p=0.00\)).
 10.
We report the estimates of the selection equation for each period in the appendix. The differences between both periods are striking, but often in line with expectations. For example, initial math scores do not matter for being present in grade 1 (the first year of formal education), but matter for still being present in grade 2.
 11.
The effect of being in a Catholic school cannot be estimated with a full set of school dummies included. We therefore omit the school dummies in model (f).
 12.
We again report the estimates of the sorting equation for each period in the appendix. The religion (dummies) of fathers and mothers are (jointly) significant in both periods.
 13.
Including a quadratic term for class size reveals that class size has a negative effect up to a class size of (slightly more than) 23 pupils, covering almost 80 % of all pupils in the sample. The estimated coefficients are \(-0.0394622\) for class size and 0.0008517 for squared class size, but both together are not significant (\(\chi ^{2}(2)=0.33\) with prob \(>\chi ^{2}\) equal to 0.85).
 14.
Note that they are normalized so that the coefficient for “girl” gets the value \(-1\). Absolute numbers are therefore not directly comparable to the estimates in Table 5, but the ratios between the coefficients are.
 15.
To avoid confusion, we stress that the subscript b in the estimated slope vector \(\widehat{\beta }_{b,j}\) indicates that it is a slope vector for the background variables. Still, these background slopes are at the school level and therefore classified as administration variables.
 16.
The separability test statistic is \({\chi }^{2}(220)=558.38\) with prob \(>{\chi }^{2}\) equal to 0.000.
 17.
Remember that if we define an average without a subscript, this is the average over the whole population, i.e. over all schools.
 18.
The reported slopes for average initial test scores include the peer effect. For parental education we include both the education of the mother and of the father.
 19.
We do not report the results for the per-capita scheme, because per-capita subsidies obviously do not respond to the simulated changes.
 20.
Again, the subsidy changes from decreasing, rather than increasing, the average background characteristics are exactly the same, up to a minus sign.
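The figures quoted in notes 9 and 13 can be checked by hand. The sketch below is only an illustration under two stated assumptions: that the class-size coefficient in note 13 is negative (as the reported negative effect for small classes implies), and that the reported \(\chi ^{2}(2)\) statistics follow a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(−x/2). The function names are hypothetical and do not come from the paper.

```python
import math


def quadratic_turning_point(linear_coef, squared_coef):
    """Class size c at which linear_coef * c + squared_coef * c**2 is minimal.

    Setting the derivative linear_coef + 2 * squared_coef * c to zero
    gives c = -linear_coef / (2 * squared_coef).
    """
    return -linear_coef / (2.0 * squared_coef)


def chi2_df2_pvalue(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    With 2 degrees of freedom the distribution is exponential with mean 2,
    so the tail probability has the closed form exp(-statistic / 2).
    """
    return math.exp(-statistic / 2.0)


# Note 13: coefficients on class size (sign assumed negative) and its square.
turning_point = quadratic_turning_point(-0.0394622, 0.0008517)

# Note 13: joint test of the two class-size terms.
p_class_size = chi2_df2_pvalue(0.33)

# Note 9: joint test of the two presence dummies.
p_presence = chi2_df2_pvalue(39.67)

print(round(turning_point, 2), round(p_class_size, 2), p_presence)
```

Under these assumptions the turning point lies at roughly 23.2 pupils and the class-size test has a p-value of about 0.85, in line with note 13, while the p-value for the presence dummies in note 9 is far below 0.01.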
References
Arcidiacono P, Koedel C (2014) Race and college success: evidence from Missouri. Am Econ J Appl Econ 6(3):20–57
Barlevy G, Neal D (2012) Pay for percentile. Am Econ Rev 102(5):1805–1831
Burgess S, Propper C, Slater H, Wilson D (2005) Who wins and who loses from school accountability? The distribution of educational gain in English secondary schools. CMPO, Bristol
Burgess S, Propper C, Wilson D (2007) The impact of school choice in England. Policy Stud 28(2):129–143
Cawley J, Heckman J, Vytlacil E (1999) On policies to reward the value added by educators. Rev Econ Stat 81(4):720–727
Chiang H (2009) How accountability pressure on failing schools affects student achievement. J Public Econ 93:1045–1057
Epple D, Romano R (2008) Educational vouchers and cream skimming. Int Econ Rev 49(4):1395–1435
Figlio D, Rouse C (2006) Do accountability and voucher threats improve low-performing schools? J Public Econ 90:239–255
Figlio D, Winicki J (2005) Food for thought: the effects of school accountability plans on school nutrition. J Public Econ 89:381–394
Fleurbaey M (2008) Fairness, responsibility and welfare. Oxford University Press, Oxford
Hanushek E (2006) School resources, chapter 14. In: Hanushek E, Welch F (eds) Handbook of the economics of education. Elsevier, Amsterdam
Hanushek E, Raymond M (2003) Lessons about the design of state accountability systems. In: Peterson P, West M (eds) No child left behind? The politics and practice of accountability. Brookings, Washington
Hanushek E, Raymond M (2004) The effect of school accountability systems on the level and distribution of student achievement. J Eur Econ Assoc 2(2/3):406–415
Hanushek E, Raymond M (2005) Does school accountability lead to improved student performance? J Policy Anal Manage 24(2):297–327
Jacob B (2005) Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago public schools. J Public Econ 89:761–796
Kane T, Staiger D (2002) The promise and pitfalls of using imprecise school accountability measures. J Econ Perspect 16(4):91–114
Ladd H, Walsh R (2002) Implementing value-added measures of school effectiveness: getting the incentives right. Econ Edu Rev 21:1–17
Lefranc A, Pistolesi N, Trannoy A (2009) Equality of opportunity and luck: definitions and testable conditions, with an application to income in France. J Public Econ 93:1189–1207
Meyer R (1997) Value-added indicators of school performance: a primer. Econ Edu Rev 16(3):283–301
Mundlak Y (1978) On the pooling of time series and cross section data. Econometrica 46:69–85
Neal D (2008) Designing incentive systems for schools. In: Springer M (ed) Performance incentives: their growing impact on American K-12 education. Brookings, Washington
Reback R (2008) Teaching to the rating: school accountability and the distribution of student achievement. J Public Econ 92:1394–1415
Schokkaert E, Dhaene G, Van de Voorde C (1998) Risk adjustment and the trade-off between efficiency and risk selection: an application of the theory of fair compensation. Health Econ 7:465–480
Schokkaert E, Van de Voorde C (2004) Risk selection and the specification of the conventional risk adjustment formula. J Health Econ 23:1237–1259
Schokkaert E, Van de Voorde C (2009) Direct versus indirect standardization in risk adjustment. J Health Econ 28:361–374
Semykina A, Wooldridge J (2010) Estimating panel data models in the presence of endogeneity and selection. J Econom 157:375–380
Taylor J, Ngoc Nguyen A (2006) An analysis of the value added by secondary schools in England: is the value added indicator of any value? Oxford Bull Econ Stat 68(2):203–224
Verbeek M, Nijman T (1992) Testing for selectivity bias in panel data models. Int Econ Rev 33:681–703
West M, Peterson P (2006) The efficacy of choice threats within school accountability systems: results from legislatively induced experiments. Econ J 116:46–62
Wooldridge J (1995) Selection corrections for panel data models under conditional mean independence assumptions. J Econom 68:115–132
Wössmann L (2003) Schooling resources, educational institutions and student performance: the international evidence. Oxford Bull Econ Stat 65(2):117–170
Additional information
We would like to thank Ides Nicaise and Jan Van Damme for their permission to use the SiBO-data, Frederik Maes and Peter Helsen for their valuable help with the data, and the editor, two anonymous referees, Dolors Berga, Geert Dhaene, Carmen Herrero, Iñigo Iturbe-Ormaetxe, Dirk Van de gaer, Frank Vandenbroucke, Carine Van de Voorde, and seminar participants in Alicante, Leuven, Louvain-la-Neuve, Oxford, and Rome for useful comments.
Appendix
Proof of proposition 2
A subsidy scheme can satisfy incentives for good administration and no incentives for pupil selection if and only if there exist functions \(g:\mathbb {R}\times B\rightarrow \mathbb {R}\) and \(h:A\rightarrow \mathbb {R}\), with g strictly increasing in its first argument, such that \( f(a,b)=g(h(a),b)\), for all \(x=(a,b)\) in X.
If the separability condition holds, it is possible to define a subsidy scheme s such that each school subsidy \(s_{j}\) is a strictly increasing function of \(h\left( a_{j}\right) \) only. Such a scheme satisfies both axioms. We now show the converse.
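Consider a subsidy scheme that satisfies incentives for good administration and no incentives for pupil selection. We show that, for arbitrary administrations \(a,a^{\prime }\in A\) and backgrounds \( b,b^{\prime }\in B\), we have
\[ f\left( a,b\right) \ge f\left( a^{\prime },b\right) \iff f\left( a,b^{\prime }\right) \ge f\left( a^{\prime },b^{\prime }\right) . \qquad (19) \]
This would indeed allow us to properly define functions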

1.
\(h:A\rightarrow \mathbb {R}\) with \(h(a)\ge h(a^{\prime })\) if \(f\left( a,b\right) \ge f\left( a^{\prime },b\right) \) for some \(b\in B\), and

2.
\(g:\mathbb {R}\times B\rightarrow \mathbb {R}\) with \(g(h(a),b)=f(a,b)\) for all \(x=(a,b)\),
with g strictly increasing in its first argument.
We proceed by contradiction. Suppose Eq. (19) does not hold, e.g., both \(f\left( a,b\right) \ge f\left( a^{\prime },b\right) \) and \(f\left( a,b^{\prime }\right) <f\left( a^{\prime },b^{\prime }\right) \) are true for some \(a,a^{\prime }\in A\) and \(b,b^{\prime }\in B\). (It is easy to verify the other direction using the same logic.) We can use these \(a,a^{\prime }\in A\) and \(b,b^{\prime }\in B\) to construct four states— \((a,b),\, (a^{\prime },b),\, (a,b^{\prime })\), and \((a^{\prime },b^{\prime })\)—for some school (tacitly assuming that school information remains constant for all other schools). We suppress subscripts and use f(a, b) and (with slight abuse of notation) s(a, b) to refer to the output and the subsidy of the school under consideration. Applying incentives for good administration twice, we must have
Applying no incentives for pupil selection twice, we obtain
and, subtracting both equations, we get:
Equations (20) and (21) are incompatible, a contradiction.
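The condition at the heart of the proof is that the ranking of two administrations by output must not depend on the pupil background at which they are compared. As a minimal sketch (not part of the paper), this can be checked on a finite grid of administrations and backgrounds. The function name and the two toy production functions are hypothetical, and passing the check on a grid is only necessary, not sufficient, for separability on the whole domain.

```python
from itertools import product


def ranking_is_background_free(f, admins, backgrounds):
    """Check the ranking condition from the proof on a finite grid.

    Returns True if, for every pair of administrations (a, a2), the
    comparison f(a, b) >= f(a2, b) gives the same answer at every
    background b in the grid.
    """
    for a, a2 in product(admins, repeat=2):
        verdicts = {f(a, b) >= f(a2, b) for b in backgrounds}
        if len(verdicts) > 1:  # ranking flips for some pair of backgrounds
            return False
    return True


def additive(a, b):
    """Toy separable production function: administration and background add up."""
    return 2.0 * a + 3.0 * b


def school_specific_slope(a, b):
    """Toy non-separable function: the administration is a (level, slope) pair,
    so the return to background differs across schools."""
    level, slope = a
    return level + slope * b


separable_ok = ranking_is_background_free(additive, [0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
slopes_ok = ranking_is_background_free(
    school_specific_slope, [(1.0, 0.5), (0.0, 1.0)], [0.0, 4.0]
)
print(separable_ok, slopes_ok)
```

In the second toy example the first school does better at background 0 (1 versus 0) but worse at background 4 (3 versus 4), which is exactly the kind of ranking reversal used in the contradiction argument.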
A derivation of the empirical subsidy schemes
The per-capita and uncorrected output schemes are straightforward. We discuss the reference administration, reference background, and value-added schemes. A subsidy scheme is defined as
with the slope defined by (17) for each scheme. We focus here on the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\).
We start from the empirical model
The RA models use a reference administration, say \(\widetilde{a}=(\widetilde{ z}_{a},\widetilde{v},\widetilde{\beta }_{b})\), to define the hypothetical output as
The average hypothetical output is equal to
and the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\) is indeed equal to
Starting from the same empirical model, the RB models replace \(\overline{z} _{b,j}\) by a reference background \(\widetilde{b}=\widetilde{z}_{b}\) to get
The OLS estimate for \(\widehat{v}_{j}\) is
and we can rewrite the hypothetical output as
The average is given by
and the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\) indeed becomes
Finally, for the value-added (VA) model we have
with the OLS estimate of \(v_{j}^{VA}\) in (18) given by
Plugging in the OLS estimate, corrected output becomes
Averaging the corrected output, we get
and the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\) indeed reduces to
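As a rough illustration of the idea behind a value-added correction, the sketch below estimates school effects in a generic common-slope model, y = v_j + β·z + error, using the within (fixed-effects) OLS estimator with a single background variable. This is a textbook estimator chosen for illustration; it is an assumption and not a reproduction of Eq. (18), which is not shown here. The function name, the record layout and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def value_added_effects(records):
    """Within-estimator sketch for y = v_school + beta * z + error.

    records: iterable of (school, z, y) tuples, one per pupil.
    Returns (beta, effects), where effects maps each school to
    v_hat = mean(y) - beta * mean(z) computed over that school's pupils.
    """
    by_school = defaultdict(list)
    for school, z, y in records:
        by_school[school].append((z, y))

    numerator = 0.0
    denominator = 0.0
    school_means = {}
    for school, rows in by_school.items():
        z_bar = mean(z for z, _ in rows)
        y_bar = mean(y for _, y in rows)
        school_means[school] = (z_bar, y_bar)
        for z, y in rows:
            # Deviations from school means remove the school effect.
            numerator += (z - z_bar) * (y - y_bar)
            denominator += (z - z_bar) ** 2

    beta = numerator / denominator
    effects = {
        school: y_bar - beta * z_bar
        for school, (z_bar, y_bar) in school_means.items()
    }
    return beta, effects


# Toy data generated without noise from beta = 2, v_A = 1, v_B = 3.
toy = [
    ("A", 0.0, 1.0), ("A", 1.0, 3.0), ("A", 2.0, 5.0),
    ("B", 1.0, 5.0), ("B", 2.0, 7.0), ("B", 3.0, 9.0),
]
beta, effects = value_added_effects(toy)
average_effect = mean(effects.values())
relative_effects = {school: v - average_effect for school, v in effects.items()}
print(beta, effects, relative_effects)
```

In this toy setting a subsidy that depends only on the gap between a school's estimated effect and the average effect would not change if a school enrolled pupils with a different z, because the common slope absorbs the background variable. That property is lost once slopes differ across schools.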
First-step probit estimates for Table 6

First-step probit estimates for the sample selection model (model e). Dependent variable: present in the period.

| Variable | Period 1: Coeff. | Period 1: \(p>\vert t\vert\) | Period 2: Coeff. | Period 2: \(p>\vert t\vert\) |
| --- | --- | --- | --- | --- |
| \(\hbox{Math}_{0}\) | 0.02 | 0.39 | 0.30 | 0.00 |
| Girl | 0.04 | 0.40 | 0.01 | 0.86 |
| m_dutch | 0.20 | 0.07 | \(-0.39\) | 0.00 |
| f_dutch | 0.07 | 0.49 | \(-0.01\) | 0.96 |
| m_edu_sec | \(-0.00\) | 1.00 | 0.16 | 0.05 |
| m_edu_high | \(-0.01\) | 0.90 | 0.20 | 0.03 |
| m_edu_uni | \(-0.09\) | 0.47 | 0.30 | 0.03 |
| f_edu_sec | \(-0.05\) | 0.53 | 0.07 | 0.42 |
| f_edu_high | \(-0.06\) | 0.57 | 0.14 | 0.18 |
| f_edu_uni | \(-0.23\) | 0.04 | \(-0.04\) | 0.74 |
| Duo | \(-0.65\) | 0.00 | 0.45 | 0.00 |
| Peer | 0.15 | 0.01 | 0.34 | 0.00 |
| Time_math | \(-0.12\) | 0.00 | 0.29 | 0.00 |
| Experience | \(-0.02\) | 0.00 | 0.01 | 0.00 |
| Class_size | 0.00 | 0.58 | \(-0.01\) | 0.41 |
| Pseudo-\(R^{2}\) | 0.12 | | 0.14 | |
| No. of observations | 3578 | | 3578 | |
First-step probit estimates for the sorting model (model f). Dependent variable: in a Catholic school.

| Variable | Period 1: Coeff. | Period 1: \(p>\vert t\vert\) | Period 2: Coeff. | Period 2: \(p>\vert t\vert\) |
| --- | --- | --- | --- | --- |
| \(\hbox{Math}_{0}\) | 0.03 | 0.48 | 0.02 | 0.55 |
| Girl | 0.18 | 0.00 | 0.20 | 0.00 |
| m_dutch | \(-0.06\) | 0.62 | \(-0.04\) | 0.73 |
| f_dutch | 0.03 | 0.80 | 0.00 | 0.97 |
| m_edu_sec | 0.06 | 0.49 | 0.04 | 0.63 |
| m_edu_high | 0.25 | 0.02 | 0.23 | 0.03 |
| m_edu_uni | 0.22 | 0.12 | 0.15 | 0.30 |
| f_edu_sec | 0.07 | 0.43 | 0.08 | 0.37 |
| f_edu_high | 0.14 | 0.18 | 0.14 | 0.17 |
| f_edu_uni | 0.16 | 0.21 | 0.21 | 0.11 |
| Duo | \(-0.08\) | 0.39 | 0.78 | 0.00 |
| Peer | 1.12 | 0.00 | 1.11 | 0.00 |
| Time_math | \(-0.16\) | 0.00 | \(-0.20\) | 0.00 |
| Experience | \(-0.01\) | 0.00 | \(-0.02\) | 0.00 |
| Class_size | 0.03 | 0.00 | 0.02 | 0.01 |
| m_freemason | 0.08 | 0.68 | 0.05 | 0.79 |
| m_christian | 0.42 | 0.02 | 0.38 | 0.03 |
| m_jewish | 0.23 | 0.67 | 0.46 | 0.40 |
| m_islam | 0.59 | 0.06 | 0.60 | 0.06 |
| m_atheist | 0.03 | 0.89 | 0.02 | 0.92 |
| m_other | 0.05 | 0.86 | 0.06 | 0.80 |
| f_freemason | \(-0.38\) | 0.04 | \(-0.35\) | 0.06 |
| f_christian | \(-0.02\) | 0.92 | 0.02 | 0.90 |
| f_jewish | \(-1.37\) | 0.01 | \(-1.41\) | 0.01 |
| f_islam | \(-0.45\) | 0.01 | \(-0.45\) | 0.13 |
| f_atheist | \(-0.07\) | 0.74 | \(-0.04\) | 0.87 |
| f_other | \(-0.67\) | 0.01 | \(-0.74\) | 0.00 |
| Pseudo-\(R^{2}\) | 0.25 | | 0.29 | |
| No. of observations | 2747 | | 2747 | |
Cite this article
Ooghe, E., Schokkaert, E. School accountability: can we reward schools and avoid pupil selection? Soc Choice Welf 46, 359–387 (2016). https://doi.org/10.1007/s00355-015-0917-0
Keywords
 Compromise Solution
 Catholic School
 Subsidy Scheme
 School Accountability
 Selection Incentive