A growing literature has appeared over the last two decades exploring whether the way in which publicly funded private schools are managed (a highly autonomous mode) is more effective than that applied in public schools (where decisions are highly centralized) in promoting students' educational skills. Our paper contributes to this literature by providing new evidence from the Spanish experience. To this end, we use the Spanish assessment "Evaluación de Diagnóstico," a yearly national standardized test given to students in the fourth grade and administered by the regional educational authorities. In particular, our data correspond to the assessment conducted in the Spanish region of Aragón in 2010. Our methodological strategy consists of the sequential application of two methods: propensity score matching and hierarchical linear models. Additionally, we test the sensitivity of our estimates to unobserved heterogeneity. Our results point to a slight advantage of the private management model of schools in promoting students' scientific abilities and foreign language (English) skills.
For the sake of simplicity, hereinafter we refer to publicly funded, privately run schools simply as private schools, except when it is necessary to differentiate them from completely private independent schools. Namely, in this paper a private school is understood as a type of publicly funded but self-governed school. Similarly, we refer to state-funded and state-run schools as public schools. The main difference between them lies in the way they are administered, private schools being much more autonomous in process and personnel decisions (deciding on the purchase of supplies and on budget allocations within schools, hiring and rewarding teachers, choosing textbooks, instructional methods, and the like).
This bias has its origin in the fact that attendance at a school, whether private or public, is not random but instead is conditioned by characteristics of the family background, which in turn are extremely important in the determination of educational outcomes (the family socioeconomic level, for example).
In this sense, our paper meets Davies’ claim (2013, p. 880): “As debates over school choice become increasingly transnational, we need studies from a variety of settings to build a stockpile of international knowledge about school sectors and student achievement.”
The average score of each competence across all schools is 500 and the standard deviation 100, as established by the General Report on Diagnostic Evaluation in Aragón 2010: "the evaluation of each competence in Aragón as a whole is established at the level of the average scores transformed into a reference value which has been fixed at 500, with a standard deviation of 100." Here, the approach of the Spanish Diagnostic Evaluation is similar to that of the OECD PISA evaluations. In Table 1, the average score differs from 500 because the sample excludes completely private independent schools and schools situated in municipalities in which there is no choice between public and private schools.
The key point is that these characteristics are also chief determinants of educational outcomes. This circumstance underlies the self-selection bias problem threatening our estimates. Selection bias and/or endogeneity is widespread in educational research and is the main methodological problem encountered when trying to evaluate the effect of private schools on the academic performance of children (Lefebvre et al. 2011). It is inherent in all impact evaluations based on non-experimental data (such as the 2010 Aragón ED).
The most common estimands in non-experimental studies are the "average effect of the treatment on the treated" (ATT), which is the effect for those in the treatment group, and the "average treatment effect" (ATE), which is the effect on all individuals (treatment and control). Our interest is in the expected effect on the outcome if individuals in the population were randomly assigned to treatment, which is exactly what the ATE captures (Austin 2011). This parameter allows us to know what the performance of Spanish students would be if they attended a self-governing private school.
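The distinction between the two estimands can be made concrete with simulated potential outcomes, where both are computable exactly. The sketch below is purely illustrative (the data-generating process and numbers are ours, not the paper's): when units that select into treatment also have larger effects, the ATT exceeds the ATE.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)                 # a confounder (e.g. family background)
y0 = 500 + 100 * x + rng.normal(scale=50, size=n)  # outcome if untreated
tau = 10 + 5 * x                       # heterogeneous treatment effect
y1 = y0 + tau                          # outcome if treated
# selection on x: higher x means a higher probability of treatment
t = rng.random(n) < 1 / (1 + np.exp(-x))

ate = np.mean(y1 - y0)                 # effect over the whole population
att = np.mean(y1[t] - y0[t])           # effect over the treated only
print(ate, att)
```

Because high-x individuals both select into treatment and have larger effects, the ATT comes out above the ATE here.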
As Imbens (2004, p. 11) states when referring to the combination of methods to estimate the ATE: "The motivation for these combinations is that although in principle any one of these methods can remove all of the bias associated with the covariates, combining two may lead to more robust inference. For example, matching leads to consistent estimators for average treatment effects under weak conditions, so matching and regression can combine some of the desirable variance properties of regression with the consistency of matching."
The assumption of selection on observables requires that conditional on the observed variables, the assignment to treatment is random.
We are assuming homogeneity in response across observed covariates. Lehrer and Kordas (2013) demonstrate that when the treatment effects vary in an unsystematic manner with the true propensity score, there are gains from using a matching algorithm based on propensity scores estimated via binary regression quantiles.
In the estimation of the propensity score, only those variables that could affect both the choice of a private school and the students' academic performance were included (no consideration is given either to variables which can potentially contribute to explaining differences in educational outcomes but do not influence the choice of school, such as study habits, or to variables which could be determinants of that choice but do not influence the educational skills cited, such as the distance to the school). In addition, only those variables which are potential predictors of educational outcomes and which occur prior to the choice of school (or were stable between the time of the choice of school and the time of the outcome assessment) were included as explanatory variables in equation 1 (Caliendo and Kopeinig 2008). All the observables are listed in Table 1, and case-wise deletion was used to handle missing data.
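As a hedged illustration of this step, the following numpy-only sketch estimates propensity scores by a logit of treatment on pre-treatment covariates, fitted by Newton-Raphson; the covariates and coefficients are invented for the example and do not correspond to the variables in Table 1.

```python
import numpy as np

def fit_logit(X, t, n_iter=25):
    """Logistic regression via Newton-Raphson; returns coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])      # add an intercept
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X1 @ beta))
        grad = X1.T @ (t - p)                        # score vector
        hess = (X1 * (p * (1 - p))[:, None]).T @ X1  # information matrix
        beta += np.linalg.solve(hess, grad)
    return beta

def propensity_scores(X, t):
    """Estimated P(treatment = 1 | covariates)."""
    beta = fit_logit(X, t)
    X1 = np.column_stack([np.ones(len(X)), X])
    return 1 / (1 + np.exp(-X1 @ beta))

# toy example: two covariates jointly driving school choice
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 2))
t = (rng.random(2000) < 1 / (1 + np.exp(-(0.5 + X[:, 0] - X[:, 1])))).astype(float)
ps = propensity_scores(X, t)
print(ps.min(), ps.max())
```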
The first of these (NNM) matches each treated individual with that non-treated individual having the most similar propensity score value. This is to say, in nearest neighbour matching, Stata selects the control(s) nearest to each treated observation for comparison. KM constructs matches using all the individuals in the potential control sample in such a way that it gathers more information from those who are closer matches and less from distant observations. In so doing, KM uses comparatively more information than other matching algorithms (Guo and Fraser 2010, chapter 7).
We applied coarsened exact matching as a robustness technique, obtaining worse results in terms of the similarity between the treatment and control groups generated. Results are available upon request.
Results supplied by the different matching estimation methods led to similar conclusions. They are not supplied here but are available from the authors upon request.
Other results testing the matching quality are shown in the Appendix. In particular, Table 5 shows the differences in the average values of propensity scores and covariates for the whole sample and the matched sample. The last two rows of this table show the median absolute standardized bias (Rosenbaum and Rubin 1985) before and after matching. As can be inferred, KM has reduced covariate imbalance on all variables. Figure 2 shows graphically the pre- and post-matching bias for each of the variables included in the estimation of the propensity score. Figure 3 depicts the distribution of these same variables by type of school for the complete sample (figures on the left) and the matched sample (figures on the right).
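The standardized bias statistic behind these diagnostics can be sketched as below; the convention of using the unmatched-sample variances in the denominator follows Rosenbaum and Rubin (1985), and the toy data are ours, not the paper's:

```python
import numpy as np

def standardized_bias(x_t, x_c, x_t_full=None, x_c_full=None):
    """100 * (mean difference) / pooled std. dev.; by convention the
    denominator uses the variances from the unmatched sample."""
    x_t_full = x_t if x_t_full is None else x_t_full
    x_c_full = x_c if x_c_full is None else x_c_full
    denom = np.sqrt((x_t_full.var(ddof=1) + x_c_full.var(ddof=1)) / 2)
    return 100 * (x_t.mean() - x_c.mean()) / denom

rng = np.random.default_rng(3)
x_t = rng.normal(0.5, 1, 1000)            # treated covariate values
x_c = rng.normal(0.0, 1, 1000)            # unmatched control values
before = standardized_bias(x_t, x_c)
# pretend matching recovered controls with the treated distribution
x_c_matched = rng.normal(0.5, 1, 1000)
after = standardized_bias(x_t, x_c_matched, x_t, x_c)
print(before, after)   # large imbalance before, near zero after
```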
This command calculates the ATE as a weighted average of the ATT (average effect on the treated) and the ATU (average effect on the untreated). This is a very common definition of the ATE in the literature (see, for instance, Böckerman et al. 2013; Gangl 2014). An alternative way to calculate the ATE is by weighting observations by the inverse of the estimated propensity scores (Hirano et al. 2003). In order to check the robustness of the ATE, we also calculated it by applying this last method, i.e., using the propensities as sampling weights. For this, we used Stata's teffects module. Results are similar to those shown in Table 3 and are available upon request.
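A minimal sketch of the inverse-propensity-weighting alternative, using normalized (Hájek) weights on simulated data with a known effect of 10 (the data-generating process is illustrative, not the paper's):

```python
import numpy as np

def ate_ipw(y, t, ps):
    """ATE with normalized inverse-propensity weights (Hirano et al. 2003)."""
    w1 = t / ps                     # weights for the treated
    w0 = (1 - t) / (1 - ps)         # weights for the untreated
    return (w1 * y).sum() / w1.sum() - (w0 * y).sum() / w0.sum()

rng = np.random.default_rng(4)
n = 20_000
ps = rng.uniform(0.2, 0.8, n)       # known propensities, bounded away from 0/1
t = (rng.random(n) < ps).astype(int)
y = 500 + 100 * ps + 10 * t + rng.normal(scale=20, size=n)

naive = y[t == 1].mean() - y[t == 0].mean()   # confounded comparison
ipw = ate_ipw(y, t, ps)
print(naive, ipw)   # the weighted estimate removes the selection bias
```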
For a mathematical demonstration, see DiPrete and Gangl (2004).
In addition, we calculated the Hodges-Lehmann point estimates and their confidence intervals, obtaining the same critical values.
Additionally, we test our estimation with another sensitivity analysis proposed by Ichino et al. (2008). This consists of calculating the ATE under different possible scenarios of deviation from the conditional independence assumption (CIA). To do so, the authors impose values on the parameters that characterize the distribution of an unobservable U in order to simulate its capacity to generate bias, and then recalculate the parameter of interest including the influence of the simulated unobserved variable. Results are available upon request. This approach has been widely used in the literature (Binder and Coad 2013; Caliendo and Künn 2015, among others). Other types of sensitivity analysis have been proposed in the literature; for example, Altonji et al. (2005) applied a similar idea to the Heckman selection model.
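A stylized version of this idea, not Ichino et al.'s exact algorithm, can be sketched as follows: impose probabilities for a binary confounder U given treatment and an outcome indicator, simulate U repeatedly, and track how the estimate moves once U is controlled for (here by simple regression adjustment rather than re-matching; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
t = rng.integers(0, 2, n)                         # treatment indicator
y = 500 + 10 * t + rng.normal(scale=50, size=n)   # outcome, true effect 10
hi = (y > np.median(y)).astype(int)               # outcome-above-median flag

# imposed parameters P(U = 1 | T = t, hi = j), chosen so that U is
# positively related to both treatment and high outcomes
p = {(1, 1): 0.6, (1, 0): 0.5, (0, 1): 0.4, (0, 0): 0.3}
probs = np.array([p[(ti, ji)] for ti, ji in zip(t, hi)])

def effect_controlling_for(u):
    """OLS coefficient on t from a regression of y on t and u."""
    X = np.column_stack([np.ones(n), t, u])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

naive = y[t == 1].mean() - y[t == 0].mean()
draws = [effect_controlling_for((rng.random(n) < probs).astype(float))
         for _ in range(50)]
adjusted = float(np.mean(draws))
print(naive, adjusted)   # controlling for the simulated U lowers the estimate
```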
HLM are similar to OLS concerning the way in which they weigh the observations (see Yitzhaki 1996 for a discussion of OLS weights). Both weigh the observations differently from PSM. We are grateful to an anonymous referee for making this point. In any case, our purpose with the HLM is not to compare the ATE that it supplies with that obtained via PSM.
Multilevel models, such as HLM, are built on Moulton's (1990) work on clustering. The insight provided by Moulton's work was that when individuals within the aggregated level are clustered, so that they are in fact more similar to one another than to individuals belonging to another cluster, the OLS assumption that observations are independent and identically distributed is violated. For this reason, estimation by OLS can produce a downward bias in the estimated standard errors, leading the analyst to conclude that aggregate-level effects are statistically significant when they are in fact not. Multilevel models have the benefit of allowing for partial pooling of coefficients towards the completely pooled OLS estimate, which according to Gelman (2006) can be a more effective estimation strategy. Simulations using a dataset with students clustered within classrooms and classrooms within schools suggest that modelling the clustering of the data with a multilevel method is a better approach than clustering the standard errors of the OLS estimate (Cheah 2009).
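Moulton's point can be reproduced in a small simulation: with a school-level regressor and within-school correlated errors, the naive iid standard error is far too small relative to a cluster-robust (sandwich) one. All numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
G, m = 50, 30                                  # 50 schools, 30 pupils each
school = np.repeat(np.arange(G), m)
x = np.repeat(rng.normal(size=G), m)           # school-level regressor
# error = shared school component + idiosyncratic pupil component
e = np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)
y = 2 * x + e

X = np.column_stack([np.ones(G * m), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# naive OLS standard error under the iid assumption
se_naive = float(np.sqrt(resid @ resid / (G * m - 2) * XtX_inv[1, 1]))

# cluster-robust sandwich: sum of per-school score outer products
meat = sum(np.outer(X[school == g].T @ resid[school == g],
                    X[school == g].T @ resid[school == g]) for g in range(G))
se_cluster = float(np.sqrt((XtX_inv @ meat @ XtX_inv)[1, 1]))
print(se_naive, se_cluster)   # cluster-robust SE should be much larger
```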
Bryk and Raudenbush (1988) recommend the use of this type of general model when analysing the effects of schools on educational outcomes. There exist multiple applications of this methodology to the educational context. Among these are Willms (2006), Somers et al. (2004) and Mancebón et al. (2012), the last of these being applied to Spanish data from PISA 2006.
Prior to estimating the HLM, we evaluated the appropriateness of applying it to our data. For this, we calculated the intra-class correlation (ICC) values of the null model of science and foreign language (English) performance (the two dependent variables of the regression). If the ICC were zero, a hierarchical model would not be necessary, since in that case the total variance of the scores would not be explained by the differences existing between students attending different classes or schools. Results of these calculations for an HLM at two and three levels are offered in the Appendix (Tables 6 and 7). These results (which show that the class level explains a small percentage of the variance of the results in foreign language (English), but a higher percentage of the results in science) lead us to apply a two-level model for achievement in foreign language (English) and a three-level model for science. At any rate, results for a three-level model for English and a two-level model for science lead to the same conclusions and are available upon request.
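The preliminary ICC check can be sketched as follows, using a simple one-way ANOVA variance-components estimator on balanced toy data rather than the full maximum-likelihood null model (data and variance parameters are illustrative):

```python
import numpy as np

def icc(scores, groups):
    """Between-group share of total variance (one-way ANOVA estimator)."""
    groups = np.asarray(groups)
    ids = np.unique(groups)
    n, k = len(scores), len(ids)
    grand = scores.mean()
    ss_b = sum(len(scores[groups == g]) * (scores[groups == g].mean() - grand) ** 2
               for g in ids)
    ss_w = sum(((scores[groups == g] - scores[groups == g].mean()) ** 2).sum()
               for g in ids)
    ms_b, ms_w = ss_b / (k - 1), ss_w / (n - k)
    m = n / k                                   # group size (balanced case)
    var_b = max((ms_b - ms_w) / m, 0.0)         # between-group variance component
    return var_b / (var_b + ms_w)

# toy data: 40 schools x 25 pupils; school effect sd 50, pupil sd 87,
# so the true ICC is 50^2 / (50^2 + 87^2), about 0.25
rng = np.random.default_rng(7)
groups = np.repeat(np.arange(40), 25)
scores = 500 + np.repeat(rng.normal(0, 50, 40), 25) + rng.normal(0, 87, 1000)
icc_hat = icc(scores, groups)
print(icc_hat)
```

An ICC near zero would argue against the multilevel specification; a sizeable value, as here, supports it.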
Allen R, Vignoles A (2015) Can school competition improve standards? The case of faith schools in England. Empir Econ 50(3):959–973
Altonji J, Elder T, Taber C (2005) Selection on observed and unobserved variables: assessing the effectiveness of Catholic schools. J Polit Econ 113(1):151–184
Altonji J, Elder T, Taber C (2008) Using selection on observed variables to assess bias from unobservables when evaluating Swan–Ganz catheterization. Am Econ Rev 98(2):345–350
Austin P (2011) An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar Behav Res 46(3):399–424
Böckerman P, Bryson A, Ilmakunnas P (2013) Does high involvement management lead to higher pay? J R Stat Soc Ser A Stat Soc 176(4):861–885
Bernal J (2005) Parental choice, social class and market forces: the consequences of privatization of public services in education. J Educ Policy 20(6):779–792
Bettinger E (2011) Educational vouchers in international contexts. In: Hanushek E, Machin S, Woessmann L (eds) Handbook of the economics of education, vol 4. Elsevier Science & Technology, North-Holland, pp 551–572
Binder M, Coad A (2013) Life satisfaction and self-employment: a matching approach. Small Bus Econ 40(4):1009–1033
Bradley S, Migali G, Taylor J (2013) Funding, school specialisation and test scores: an evaluation of the specialist schools policy using matching models. J Hum Cap 7(1):76–106
Bryk A, Raudenbush S (1988) Toward a more appropriate conceptualization of research on school effects: a three-level hierarchical linear model. Am J Educ 97(1):65–108
Caliendo M, Künn S (2015) Getting back into the labor market: the effects of start-up subsidies for unemployed females. J Popul Econ 28(4):1005–1043
Caliendo M, Kopeinig S (2008) Some practical guidance for the implementation of propensity score matching. J Econ Surv 22(1):31–72
Cheah B (2009) Clustering standard errors for modeling multilevel data. Technical report, Columbia University, New York
Chowa G, Masa R, Wretman C, Ansong D (2013) The impact of household possessions on youth’s academic achievement in the Ghana Youthsave experiment: a propensity score analysis. Econ Educ Rev 33:69–81
Chudgar A, Quin E (2012) Relationship between private schooling and achievement: results from rural and urban India. Econ Educ Rev 31(4):376–390
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Erlbaum Associates, Lawrence
Coleman J, Hoffer T, Kilgore S (1982) Secondary school achievement. Public, catholic and private schools compared. Basic Books, Inc. Publishers, New York
Crespo E, Santín D (2014) Does school ownership matter? An unbiased efficiency comparison for regions of Spain. J Prod Anal 41(1):153–172
Davies S (2013) Are there Catholic school effects in Ontario, Canada? Eur Sociol Rev 29(4):871–883
DiPrete T, Gangl M (2004) Assessing bias in the estimation of causal effects: Rosenbaum bounds on matching estimators and instrumental variables estimation with imperfect instruments. WZB discussion paper SP I 2004-101, Wissenschaftszentrum Berlin für Sozialforschung
Doncel L, Sainz J, Sanz I (2012) An estimation of the advantage of charter over public schools. Kyklos 65(4):442–463
Epple D, Romano R, Zimmer R (2016) Charter schools: a survey of research on their characteristics and effectiveness. In: Hanushek E, Machin S, Woessmann L (eds) Handbook of the economics of education, vol 5. Elsevier Science & Technology, North-Holland, pp 139–208
Escardibul JO, Villarroya A (2009) The inequalities in school choice in Spain in accordance to PISA data. J Educ Policy 24(6):673–695
Gangl M (2014) Matching estimators for treatment effects. In: Best H, Wolf C (eds) The SAGE handbook of regression analysis and causal inference. SAGE Publications Ltd, London, pp 251–276
Gelman A (2006) Multilevel (hierarchical) modeling: what it can and cannot do. Technometrics 48(3):432–435
Green C, Navarro-Paniagua M, Ximénez de Embún D, Mancebón M (2014) School choice and student wellbeing. Econ Educ Rev 38:139–150
Gronberg T, Jansen D (2001) Navigating newly chartered waters. An analysis of charter school performance. Texas Public Policy Foundation, Austin
Guo S, Fraser M (2010) Propensity score analysis. Statistical methods and applications. SAGE publications Ltd., London
Hanushek E, Woessmann L (2014) Institutional structures of the education system and student achievement: a review of cross-country economic research. In: Strietholt R, Bos W, Gustafsson JE, Rosen M (eds) Educational policy evaluation through international comparative assessments. Waxmann, Munster
Hanushek E, Kain J, Rivkin S, Branch F (2007) Charter school quality and parental decision making with school choice. J Public Econ 91(5–6):823–848
Herron M (1999) Postestimation uncertainty in limited dependent variable models. Polit Anal 8(1):83–98
Hirano K, Imbens G, Ridder G (2003) Efficient estimation of average treatment effects using the estimated propensity score. Econometrica 71(4):1161–1189
Ichino A, Mealli F, Nannicini T (2008) From temporary help jobs to permanent employment: what can we learn from matching estimators and their sensitivity? J Appl Econ 23(3):305–327
Imbens GW (2004) Nonparametric estimation of average treatment effects under exogeneity: a review. Rev Econ Stat 86(1):4–29
Kim Y (2011) Catholic schools or school quality? The effects of Catholic schools on labor market outcomes. Econ Educ Rev 30(3):546–558
Lee BK, Lessler J, Stuart EA (2010) Improving propensity score weighting using machine learning. Stat Med 29(3):337–346
Lee M, Lee S (2009) Sensitivity analysis of job-training effects on reemployment for Korean women. Empir Econ 36(1):81–107
Lefebvre P, Merrigan P, Verstraete M (2011) Public subsidies to private schools do make a difference for achievement in mathematics: longitudinal evidence from Canada. Econ Educ Rev 30(1):79–98
Lehrer S, Kordas G (2013) Matching using semiparametric propensity scores. Empir Econ 44(1):13–45
LODE (1985) Organic Law 8/1985, 3 July, Regulating education. Official Spanish State Bulletin 159
Mancebón M, Ximénez-de-Embún DP (2014) Equality of school choice: a study applied to the Spanish region of Aragón. Educ Econ 22(1):90–111
Mancebón M, Calero J, Choi A, Ximénez-de-Embún DP (2012) The efficiency of public and publicly-subsidized high schools in Spain. Evidence from PISA-2006. J Oper Res Soc 63(11):1516–1533
McCaffrey D, Ridgeway G, Morral A (2004) Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol Methods 9(44):403–425
Moulton B (1990) An illustration of a pitfall in estimating the effects of aggregate variables on micro units. Rev Econ Stat 72(2):334–338
Murnane R, Willett J (2011) Methods matter. Oxford University Press, New York
Peel M (2014) Addressing unobserved endogeneity bias in accounting studies: control and sensitivity methods by variable type. Account Bus Res 44(5):545–571
Rehm P (2005) Citizen support for the welfare state: determinants of preferences for income redistribution. WZB markets and political economy working paper SP II 2, Wissenschaftszentrum Berlin für Sozialforschung
Rosenbaum P (2002) Observational studies. Springer, New York
Rosenbaum P, Rubin D (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55
Rosenbaum P, Rubin D (1985) The bias due to incomplete matching. Biometrics 41(1):103–116
Rubin D, Thomas N (2000) Combining propensity score matching with additional adjustments for prognostic covariates. J Am Stat Assoc 95(450):573–585
Setoguchi S, Schneeweiss S, Brookhart MA, Glynn RJ, Cook EF (2008) Evaluating uses of data mining techniques in propensity score estimation: a simulation study. Pharmacoepidemiol Drug Saf 17(6):546–555
Smith J, Todd P (2005) Does matching overcome LaLonde’s critique of non-experimental estimators? J Econ 125(1–2):305–353
Somers M, McEwan P, Willms J (2004) How effective are private schools in Latin America? Comp Educ Rev 48(1):48–69
Stuart E (2010) Matching methods for causal inference: a review and a look forward. Stat Sci 25(1):1–21
Spanish Ministry of Education (2013) Spanish Education Statistics. Madrid. https://www.mecd.gob.es
Urquiola M (2016) Competition among schools. Traditional public and private schools. In: Hanushek E, Machin S, Woessmann L (eds) Handbook of the economics of education, vol 5. Elsevier Science & Technology, North-Holland, pp 209–237
Willms J (2006) Learning divides: ten policy questions about the performance and equity of schools and schooling systems. UIS working paper 5. UNESCO Institute for Statistics, Montreal
Yitzhaki S (1996) On using linear regressions in welfare economics. J Bus Econ Stat 14(4):478–486
Zimmer R, Gill B, Booker K, Lavertu S, Witte J (2012) Examining charter student achievement effects across seven states. Econ Educ Rev 31(2):213–224
The authors are grateful for the financial support received from the Spanish Government, Ministry of Economics and Competitiveness (Project EDU2013-42480-R). Mauro Mediavilla and Domingo P. Ximénez-de-Embún also acknowledge the support from Fundación Ramón Areces. We thank the editor, two anonymous referees and the associate editor for their helpful comments.
Mancebón, M.J., Ximénez-de-Embún, D.P., Mediavilla, M. et al. Does the educational management model matter? New evidence from a quasiexperimental approach. Empir Econ 56, 107–135 (2019). https://doi.org/10.1007/s00181-017-1351-1
- School choice
- Propensity score matching
- Hierarchical linear models
- Unobservable variables bias
- Science and Foreign Language (English) skills
- Primary schools