
School accountability: can we reward schools and avoid pupil selection?


School accountability schemes require measures of school performance, and these measures are in practice often based on pupil test scores. It is well-known that insufficiently correcting these test scores for pupil characteristics may provide incentives for pupil selection. Building further on results from the theory of fair allocation, we show that the trade-off between reward and pupil selection is not only a matter of sufficient information. A school accountability scheme that rewards school performance will create incentives for pupil selection, even under perfect information, unless the educational production function satisfies an (unrealistic) separability assumption. We propose different compromise solutions and discuss the resulting incentives in theory. The empirical relevance of our analysis—i.e., the rejection of the separability assumption and the magnitude of the incentives in the different compromise solutions—is illustrated with Flemish data. The traditional value-added model turns out to be an acceptable compromise.


Fig. 1
Fig. 2


  1.

    We will focus on school financing schemes, but the question is equally relevant for the design of report cards. It could also be relevant for designing differentiated vouchers (Epple and Romano 2008).

  2.

    See Lefranc et al. (2009) for an alternative approach in which luck is not classified as either “responsibility” or “compensation”, but treated as a separate category.

  3.

    We do not impose any restrictions on the function f. It is natural to define the variables a and b in such a way that they have a positive monotonic effect on school output, but this is not needed for our results.

  4.

    Note that for the axiom to be meaningful, it is not necessary that increases in a have a positive effect on school output. If they do not, decreases in a will be rewarded instead.

  5.

    Moreover, Meyer (1997) claims that the empirical relevance of his observation is limited because “the assumption that slopes do not vary across schools is often a very reasonable assumption.” In the next section, we falsify this claim with Flemish data.

  6.

    The math tests consist of between 40 and 80 questions (depending on the grade). The score distributions are well-behaved, showing no floor and only limited ceiling effects.

  7.

    Note that Dutch is the official language in Flanders.

  8.

    To limit the reduction in sample size, we add a dummy ‘missing’ in case of missing pupil-level data (except for initial test scores). We do not report the corresponding estimates, which are, as expected, never significant.

  9.

    The coefficient on a dummy “present in the first period” is 0.036 (S.E. 0.056) and hence is not significantly different from zero. The dummy “present in the second period” gets a highly significant estimated coefficient of 0.291 (S.E. 0.048). A \(\chi ^{2}\)-test for joint significance of the two dummies yields \(\chi ^{2}(2)=39.67\) (\(p=0.00\)).
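As an arithmetic aside (our own check, not part of the paper's estimation), the z-ratios behind these significance statements, and the tail probability of the reported \(\chi ^{2}(2)\) statistic, can be reproduced with standard-library Python; the coefficients and standard errors are those quoted above.

```python
import math

# Coefficients and standard errors quoted in the footnote.
coef_p1, se_p1 = 0.036, 0.056   # dummy "present in the first period"
coef_p2, se_p2 = 0.291, 0.048   # dummy "present in the second period"

z1 = coef_p1 / se_p1  # ~0.64: well below 1.96, not significant
z2 = coef_p2 / se_p2  # ~6.06: highly significant

# For a chi-square with 2 degrees of freedom the survival function
# has the closed form P(X > x) = exp(-x/2).
p_joint = math.exp(-39.67 / 2)  # tiny, consistent with the reported p = 0.00

print(round(z1, 2), round(z2, 2), p_joint)
```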

  10.

    We report the estimates of the selection equation for each period in the appendix. The differences between both periods are striking, but often in line with expectations. For example, initial math scores do not matter for being present in grade 1 (the first year of formal education), but matter for still being present in grade 2.

  11.

    The effect of being in a catholic school cannot be estimated with a full set of school dummies included. We therefore omit the school dummies in model (f).

  12.

    We again report the estimates of the sorting equation for each period in the appendix. The religion (dummies) of fathers and mothers are (jointly) significant in both periods.

  13.

    Including a quadratic term for class size reveals that class size has a negative effect up to a class size of slightly more than 23 pupils, covering almost 80% of all pupils in the sample. The estimated coefficients are \(-0.0394622\) for class size and 0.0008517 for squared class size, but the two terms are jointly insignificant (\(\chi ^{2}(2)=0.33\) with prob \(>\chi ^{2}\) equal to 0.85).
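The turning point quoted in this footnote follows from setting the derivative of the fitted quadratic to zero; a quick check using the reported coefficients (our own computation):

```python
# Fitted effect of class size c: effect(c) = -0.0394622*c + 0.0008517*c**2.
# The derivative -0.0394622 + 2*0.0008517*c is zero at the turning point.
b1, b2 = -0.0394622, 0.0008517
turning_point = -b1 / (2 * b2)
print(round(turning_point, 2))  # slightly more than 23 pupils, as stated
```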

  14.

    Note that they are normalized so that the coefficient for “girl” equals \(-1\). Absolute numbers are therefore not directly comparable to the estimates in Table 5, but the ratios between the coefficients are.

  15.

    To avoid confusion, we stress that the subscript b in the estimated slope vector \(\widehat{\beta }_{b,j}\) indicates that it is a slope vector for the background variables. Still, these background slopes are at the school level and therefore classified as administration variables.

  16.

    The separability test statistic is \({\chi }^{2}(220)=558.38\) with prob \(>{\chi }^{2}\) equal to 0.000.

  17.

    Remember that if we define an average without a subscript, this is the average over the whole population, i.e. over all schools.

  18.

    The reported slopes for average initial test scores include the peer effect. For parental education we include both the mother's and the father's education.

  19.

    We do not report the results for the per-capita scheme, because per-capita subsidies obviously do not respond to the simulated changes.

  20.

    Again, the subsidy changes from decreasing, rather than increasing, the average background characteristics are exactly the same, up to a minus sign.


References

  1. Arcidiacono P, Koedel C (2014) Race and college success: evidence from Missouri. Am Econ J Appl Econ 6(3):20–57
  2. Barlevy G, Neal D (2012) Pay for percentile. Am Econ Rev 102(5):1805–1831
  3. Burgess S, Propper C, Slater H, Wilson D (2005) Who wins and who loses from school accountability? The distribution of educational gain in English secondary schools. CMPO, Bristol
  4. Burgess S, Propper C, Wilson D (2007) The impact of school choice in England. Policy Stud 28(2):129–143
  5. Cawley J, Heckman J, Vytlacil E (1999) On policies to reward the value added by educators. Rev Econ Stat 81(4):720–727
  6. Chiang H (2009) How accountability pressure on failing schools affects student achievement. J Public Econ 93:1045–1057
  7. Epple D, Romano R (2008) Educational vouchers and cream skimming. Int Econ Rev 49(4):1395–1435
  8. Figlio D, Rouse C (2006) Do accountability and voucher threats improve low-performing schools? J Public Econ 90:239–255
  9. Figlio D, Winicki J (2005) Food for thought: the effects of school accountability plans on school nutrition. J Public Econ 89:381–394
  10. Fleurbaey M (2008) Fairness, responsibility and welfare. Oxford University Press, Oxford
  11. Hanushek E (2006) School resources, chapter 14. In: Hanushek E, Welch F (eds) Handbook of the economics of education. Elsevier, Amsterdam
  12. Hanushek E, Raymond M (2003) Lessons about the design of state accountability systems. In: Peterson P, West M (eds) No child left behind? The politics and practice of accountability. Brookings, Washington
  13. Hanushek E, Raymond M (2004) The effect of school accountability systems on the level and distribution of student achievement. J Eur Econ Assoc 2(2/3):406–415
  14. Hanushek E, Raymond M (2005) Does school accountability lead to improved student performance? J Policy Anal Manage 24(2):297–327
  15. Jacob B (2005) Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago public schools. J Public Econ 89:761–796
  16. Kane T, Staiger D (2002) The promise and pitfalls of using imprecise school accountability measures. J Econ Perspect 16(4):91–114
  17. Ladd H, Walsh R (2002) Implementing value-added measures of school effectiveness: getting the incentives right. Econ Edu Rev 21:1–17
  18. Lefranc A, Pistolesi N, Trannoy A (2009) Equality of opportunity and luck: definitions and testable conditions, with an application to income in France. J Public Econ 93:1189–1207
  19. Meyer R (1997) Value-added indicators of school performance: a primer. Econ Edu Rev 16(3):283–301
  20. Mundlak Y (1978) On the pooling of time series and cross section data. Econometrica 46:69–85
  21. Neal D (2008) Designing incentive systems for schools. In: Springer M (ed) Performance incentives: their growing impact on American K-12 education. Brookings, Washington
  22. Reback R (2008) Teaching to the rating: school accountability and the distribution of student achievement. J Public Econ 92:1394–1415
  23. Schokkaert E, Dhaene G, Van de Voorde C (1998) Risk adjustment and the trade-off between efficiency and risk selection: an application of the theory of fair compensation. Health Econ 7:465–480
  24. Schokkaert E, Van de Voorde C (2004) Risk selection and the specification of the conventional risk adjustment formula. J Health Econ 23:1237–1259
  25. Schokkaert E, Van de Voorde C (2009) Direct versus indirect standardization in risk adjustment. J Health Econ 28:361–374
  26. Semykina A, Wooldridge J (2010) Estimating panel data models in the presence of endogeneity and selection. J Econom 157:375–380
  27. Taylor J, Ngoc Nguyen A (2006) An analysis of the value added by secondary schools in England: is the value added indicator of any value? Oxford Bull Econ Stat 68(2):203–224
  28. Verbeek M, Nijman T (1992) Testing for selectivity bias in panel data models. Int Econ Rev 33:681–703
  29. West M, Peterson P (2006) The efficacy of choice threats within school accountability systems: results from legislatively induced experiments. Econ J 116:46–62
  30. Wooldridge J (1995) Selection corrections for panel data models under conditional mean independence assumptions. J Econom 68:115–132
  31. Wössmann L (2003) Schooling resources, educational institutions and student performance: the international evidence. Oxford Bull Econ Stat 65(2):117–170


Author information



Corresponding author

Correspondence to Erwin Ooghe.

Additional information

We would like to thank Ides Nicaise and Jan Van Damme for their permission to use the SiBO-data, Frederik Maes and Peter Helsen for their valuable help with the data, and the editor, two anonymous referees, Dolors Berga, Geert Dhaene, Carmen Herrero, Iñigo Iturbe-Ormaetxe, Dirk Van de gaer, Frank Vandenbroucke, Carine Van de Voorde, and seminar participants in Alicante, Leuven, Louvain-la-Neuve, Oxford, and Rome for useful comments.



Proof of Proposition 2

A subsidy scheme can satisfy incentives for good administration and no incentives for pupil selection if and only if there exist functions \(g:\mathbb {R}\times B\rightarrow \mathbb {R}\) and \(h:A\rightarrow \mathbb {R}\), with g strictly increasing in its first argument, such that \( f(a,b)=g(h(a),b)\), for all \(x=(a,b)\) in X.

If the separability condition holds, we can define a subsidy scheme s in which each school subsidy \(s_{j}\) is a strictly increasing function of \(h\left( a_{j}\right) \) only. Such a scheme satisfies both axioms. We now prove the converse.

Consider a subsidy scheme that satisfies incentives for good administration and no incentives for pupil selection. We show that, for arbitrary administrations \(a,a^{\prime }\in A\) and backgrounds \( b,b^{\prime }\in B\), we have

$$\begin{aligned} f(a,b)\ge f(a^{\prime },b)\Leftrightarrow f(a,b^{\prime })\ge f(a^{\prime },b^{\prime }). \end{aligned}$$

This equivalence would indeed allow us to properly define functions

  1.

    \(h:A\rightarrow \mathbb {R}\) with \(h(a)\ge h(a^{\prime })\) if \(f\left( a,b\right) \ge f\left( a^{\prime },b\right) \) for some \(b\in B\), and

  2.

    \(g:\mathbb {R}\times B\rightarrow \mathbb {R}\) with \(g(h(a),b)=f(a,b)\) for all \(x=(a,b)\),

with g strictly increasing in its first argument.

We proceed by contradiction. Suppose Eq. (19) does not hold, i.e., for some \(a,a^{\prime }\in A\) and \(b,b^{\prime }\in B\) both \(f\left( a,b\right) \ge f\left( a^{\prime },b\right) \) and \(f\left( a,b^{\prime }\right) <f\left( a^{\prime },b^{\prime }\right) \) hold. (The other direction follows from the same logic.) We can use these \(a,a^{\prime }\in A\) and \(b,b^{\prime }\in B\) to construct four states \((a,b)\), \((a^{\prime },b)\), \((a,b^{\prime })\), and \((a^{\prime },b^{\prime })\) for some school, tacitly assuming that the information of all other schools remains constant. We suppress subscripts and use \(f(a,b)\) and, with slight abuse of notation, \(s(a,b)\) to refer to the output and the subsidy of the school under consideration. Applying incentives for good administration twice, we must have

$$\begin{aligned} s(a,b)-s(a^{\prime },b)\ge 0 \quad \text {and}\quad s(a,b^{\prime })-s(a^{\prime },b^{\prime })<0. \end{aligned}$$

Applying no incentives for pupil selection twice, we obtain

$$\begin{aligned} s(a,b)=s(a,b^{\prime })\quad \text {and}\quad s(a^{\prime },b)=s(a^{\prime },b^{\prime }), \end{aligned}$$

and, subtracting both equations, we get:

$$\begin{aligned} s(a,b)-s(a^{\prime },b)=s(a,b^{\prime })-s(a^{\prime },b^{\prime }). \end{aligned}$$

Equations (20) and (21) are incompatible, yielding the desired contradiction.
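The separability condition of Proposition 2 can also be illustrated numerically. The sketch below is our own toy construction (the functions f1 and f2 and the grids A and B are invented for illustration): it checks whether the ranking of administrations induced by a candidate production function is the same for every background, which is exactly the property the proof exploits.

```python
from itertools import combinations

def separable_ranking(f, A, B):
    """Check that the ordering of administrations induced by f is the same
    for every background b (necessary for a representation f = g(h(a), b))."""
    for a1, a2 in combinations(A, 2):
        signs = {f(a1, b) >= f(a2, b) for b in B}
        if len(signs) > 1:   # ranking of a1 vs a2 flips with b
            return False
    return True

A = [0.0, 0.5, 1.0]   # toy administrations
B = [1.0, 2.0, 3.0]   # toy backgrounds

# f1 = g(h(a), b) with h(a) = 2a and g(h, b) = b*h + b**2:
# separable, so the ranking over a never flips.
f1 = lambda a, b: b * (2 * a) + b ** 2
# f2 has an interaction term that reverses the ranking for large b.
f2 = lambda a, b: a * (2 - b)

print(separable_ranking(f1, A, B))  # True
print(separable_ranking(f2, A, B))  # False
```

The non-separable f2 is precisely the situation in the proof: which administration looks better depends on the pupil background, so no subsidy scheme can reward administration without also reacting to backgrounds.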

A derivation of the empirical subsidy schemes

The per-capita and uncorrected output schemes are straightforward, so we discuss only the reference-administration (RA), reference-background (RB), and value-added (VA) schemes. A subsidy scheme is defined as

$$\begin{aligned} s_{j}=1+\mathrm {slope}\times \left( \widetilde{y}_{j}-\overline{\widetilde{y}}\right) , \end{aligned}$$

with the slope defined by (17) for each scheme. We focus here on the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\).

We start from the empirical model

$$\begin{aligned} \overline{y}_{j}=\widehat{\beta }_{a}^{\prime }\overline{z}_{a,j}+\widehat{v}_{j}+\widehat{\beta }_{b,j}^{\prime }\overline{z}_{b,j}. \end{aligned}$$

The RA models use a reference administration, say \(\widetilde{a}=(\widetilde{z}_{a},\widetilde{v},\widetilde{\beta }_{b})\), to define the hypothetical output as

$$\begin{aligned} \widetilde{y}_{j}=\overline{y}_{j}-f\left( \widetilde{a},b_{j}\right) = \overline{y}_{j}-\left( \beta _{a}^{\prime }\widetilde{z}_{a}+\widetilde{v}+ \widetilde{\beta }_{b}^{\prime }\overline{z}_{b,j}\right) . \end{aligned}$$

The average hypothetical output is equal to

$$\begin{aligned} \overline{\widetilde{y}}=\overline{y}-\left( \beta _{a}^{\prime }\widetilde{z}_{a}+ \widetilde{v}+\widetilde{\beta }_{b}^{\prime }\overline{z}_{b}\right) , \end{aligned}$$

and the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\) is indeed equal to

$$\begin{aligned} (\overline{y}_{j}-\overline{y})-\widetilde{\beta }_{b}^{\prime }(\overline{z} _{b,j}-\overline{z}_{b}). \end{aligned}$$

Starting from the same empirical model, the RB models replace \(\overline{z} _{b,j}\) by a reference background \(\widetilde{b}=\widetilde{z}_{b}\) to get

$$\begin{aligned} \widetilde{y}_{j}=f(a_{j},\widetilde{b})=\widehat{\beta }_{a}^{\prime } \overline{z}_{a,j}+\widehat{v}_{j}+\widehat{\beta }_{b,j}^{\prime } \widetilde{z}_{b}. \end{aligned}$$

The OLS estimate for \(\widehat{v}_{j}\) is

$$\begin{aligned} \widehat{v}_{j}=\overline{y}_{j}-\widehat{\beta }_{a}^{\prime }\overline{z} _{a,j}-\widehat{\beta }_{b,j}^{\prime }\overline{z}_{b,j}, \end{aligned}$$

and we can rewrite the hypothetical output as

$$\begin{aligned} \widetilde{y}_{j}=\overline{y}_{j}-\widehat{\beta }_{b,j}^{\prime }( \overline{z}_{b,j}-\widetilde{z}_{b}). \end{aligned}$$

The average is given by

$$\begin{aligned} \overline{\widetilde{y}}=\overline{y}-\overline{\widehat{\beta }_{b,j}^{\prime }(\overline{z}_{b,j}-\widetilde{z}_{b})}, \end{aligned}$$

and the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\) indeed becomes

$$\begin{aligned} (\overline{y}_{j}-\overline{y})-\widehat{\beta }_{b,j}^{\prime }(\overline{z} _{b,j}-\widetilde{z}_{b})+\overline{\widehat{\beta }_{b,j}^{\prime }( \overline{z}_{b,j}-\widetilde{z}_{b})}. \end{aligned}$$

Finally, for the value-added (VA) model we have

$$\begin{aligned} \widetilde{y}_{j}=\widehat{\beta }_{a}^{VA\prime }\overline{z}_{a,j}+ \widehat{v}_{j}^{VA}, \end{aligned}$$

with the OLS estimate of \(v_{j}^{VA}\) in (18) given by

$$\begin{aligned} \widehat{v}_{j}^{VA}=\overline{y}_{j}-\widehat{\beta }_{a}^{VA\prime } \overline{z}_{a,j}-\widehat{\beta }_{b}^{VA\prime }\overline{z}_{b,j}. \end{aligned}$$

Plugging in the OLS estimate, corrected output becomes

$$\begin{aligned} \widetilde{y}_{j}=\overline{y}_{j}-\widehat{\beta }_{b}^{VA\prime }\overline{ z}_{b,j}. \end{aligned}$$

Averaging the corrected output, we get

$$\begin{aligned} \overline{\widetilde{y}}=\overline{y}-\widehat{\beta }_{b}^{VA\prime } \overline{z}_{b}, \end{aligned}$$

and the difference \(\widetilde{y}_{j}-\overline{\widetilde{y}}\) indeed reduces to

$$\begin{aligned} (\overline{y}_{j}-\overline{y})-\widehat{\beta }_{b}^{VA\prime }(\overline{z} _{b,j}-\overline{z}_{b}). \end{aligned}$$
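The value-added derivation above translates directly into code. The sketch below uses invented toy numbers (ybar, zbar_b, beta_b are ours, and slope stands in for expression (17), which is not reproduced here); it computes the corrected outputs \(\widetilde{y}_{j}\) and the resulting subsidies, which average to one by construction.

```python
# Toy VA subsidy computation, following the derivation above:
#   y_tilde_j = ybar_j - beta_b' * zbar_{b,j}
#   s_j       = 1 + slope * (y_tilde_j - mean(y_tilde))
# All numbers below are illustrative, not the paper's estimates.

ybar = [0.30, -0.10, 0.15]                       # average school outputs
zbar_b = [[0.2, 1.0], [-0.3, 0.4], [0.1, 0.7]]   # school-average backgrounds
beta_b = [0.5, 0.1]                              # VA background slopes
slope = 0.8                                      # stands in for expression (17)

# Corrected outputs: strip the background contribution from raw output.
y_tilde = [y - sum(bk * zk for bk, zk in zip(beta_b, z))
           for y, z in zip(ybar, zbar_b)]
mean_y_tilde = sum(y_tilde) / len(y_tilde)

# Subsidies deviate from 1 in proportion to corrected-output deviations.
subsidies = [1 + slope * (y - mean_y_tilde) for y in y_tilde]
print([round(s, 3) for s in subsidies])  # per-school subsidies, averaging 1
```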

First-step probit estimates for Table 6

First-step probit estimates for the sample selection model (model e)

Present               In period 1         In period 2
                      Coeff.    p>|t|     Coeff.    p>|t|
Math_0                 0.02     0.39       0.30     0.00
Girl                   0.04     0.40       0.01     0.86
m_dutch                0.20     0.07      -0.39     0.00
f_dutch                0.07     0.49      -0.01     0.96
m_edu_sec             -0.00     1.00       0.16     0.05
m_edu_high            -0.01     0.90       0.20     0.03
m_edu_uni             -0.09     0.47       0.30     0.03
f_edu_sec             -0.05     0.53       0.07     0.42
f_edu_high            -0.06     0.57       0.14     0.18
f_edu_uni             -0.23     0.04      -0.04     0.74
Duo                   -0.65     0.00       0.45     0.00
Peer                   0.15     0.01       0.34     0.00
Time_math             -0.12     0.00       0.29     0.00
Experience            -0.02     0.00       0.01     0.00
Class_size             0.00     0.58      -0.01     0.41
Pseudo-R^2                      0.12                0.14
No. of observations             3578                3578

Constant and regional dummies included, but not reported.

First-step probit estimates for the sorting model (model f)

In a catholic school   In period 1         In period 2
                       Coeff.    p>|t|     Coeff.    p>|t|
Math_0                  0.03     0.48       0.02     0.55
Girl                    0.18     0.00       0.20     0.00
m_dutch                -0.06     0.62      -0.04     0.73
f_dutch                 0.03     0.80       0.00     0.97
m_edu_sec               0.06     0.49       0.04     0.63
m_edu_high              0.25     0.02       0.23     0.03
m_edu_uni               0.22     0.12       0.15     0.30
f_edu_sec               0.07     0.43       0.08     0.37
f_edu_high              0.14     0.18       0.14     0.17
f_edu_uni               0.16     0.21       0.21     0.11
Duo                    -0.08     0.39       0.78     0.00
Peer                    1.12     0.00       1.11     0.00
Time_math              -0.16     0.00      -0.20     0.00
Experience             -0.01     0.00      -0.02     0.00
Class_size              0.03     0.00       0.02     0.01
m_freemason             0.08     0.68       0.05     0.79
m_christian             0.42     0.02       0.38     0.03
m_jewish                0.23     0.67       0.46     0.40
m_islam                 0.59     0.06       0.60     0.06
m_atheist               0.03     0.89       0.02     0.92
m_other                 0.05     0.86       0.06     0.80
f_freemason            -0.38     0.04      -0.35     0.06
f_christian            -0.02     0.92       0.02     0.90
f_jewish               -1.37     0.01      -1.41     0.01
f_islam                -0.45     0.01      -0.45     0.13
f_atheist              -0.07     0.74      -0.04     0.87
f_other                -0.67     0.01      -0.74     0.00
Pseudo-R^2                       0.25                0.29
No. of observations              2747                2747

Constant and regional dummies included, but not reported.


Cite this article

Ooghe, E., Schokkaert, E. School accountability: can we reward schools and avoid pupil selection? Soc Choice Welf 46, 359–387 (2016).



Keywords: Compromise solution · Catholic school · Subsidy scheme · School accountability · Selection incentive