1 Introduction

This paper studies the location decisions of the privately provided and voucher-funded schoolsFootnote 1 that emerged in the wake of the Swedish independent school reform of 1992. In particular, it tests what characteristics of the local school market were correlated with independent primary levelFootnote 2 school entry in 1992–2000, with respect to aspects such as the family background of the local student population, the local political majority, and the quality and density of the existing neighbourhood schools. The 1992 reform introduced practically free entry, including for for-profit companies, and thus introduced strong market incentives into the education sector. As it gave rise to the entry of a large number of independent schools, it provides an excellent opportunity to study the location decisions of private providers in a mixed market setting.

This paper makes two main contributions: (i) It provides descriptive evidence on the geographical pattern of a dramatic independent school expansion, by analysing what local characteristics were correlated with a higher likelihood for independent school location.Footnote 3 (ii) It carefully assesses the sensitivity of the results to the modifiable areal unit problem (MAUP),Footnote 4 i.e. it investigates if the regression estimates are sensitive to the type of spatial aggregation employed. This is done by making use of highly detailed geographical information to construct a set of alternative definitions of local school markets, and by investigating if the estimated parameters are sensitive to gradual changes in the local market definition, thus providing an informal test of the MAUP.Footnote 5

The MAUP is, I would argue, in general not sufficiently addressed in the literature outside the domain of economic geography. For example, previous studies on school locations in general tend to make use of the spatial units that are readily available, such as school districts, zip codes, or census tracts in the US education literature, or SAMS areas or municipalities in the Swedish context, without much discussion of whether or not this is the appropriate level of aggregation.Footnote 6 With the increased accessibility of geo-coded data, as well as sufficiently powerful software,Footnote 7 there is scope to take this issue more seriously, for example by using tailor-made local units, and this applies to all studies using spatially aggregated data.

The main results of this study suggest that the likelihood of independent school entry was correlated with the local student population density as well as with the local student family background. In particular, the probability of independent school entry was higher in locations where students came from a highly educated family background and lower in locations where a large share of students had at least one Swedish-born parent. There was also some indication of a lower likelihood of independent school entry in municipalities with a left-wing political majority, although this result was not robust to changes in the outcome time period.

The above-mentioned results were robust to the various alternative and flexible definitions of local school markets. For other variables, such as the local income dispersion, average GPA, and voucher level, the definition of the local market, however, had a substantial impact on the results. This underlines the importance of accounting for the MAUP in studies using spatially aggregated variables.

The remaining sections of this paper are organized as follows: Sect. 2 gives a brief overview of the Swedish independent school reform, Sect. 3 provides information on the data variables, and Sect. 4 describes the MAUP and the spatial aggregation measures. Section 5 presents the results, and Sect. 6 concludes.

2 The Swedish independent school reform

For a more detailed overview of the independent school reform, see the working paper version of this paper.

Prior to the independent school reform of 1992, the vast majority of Swedish children were educated in a municipal school. Schooling could also take place in an independent school, but these were limited to schools that were either of alternative pedagogical profile, boarding schools, or schools for foreign nationals.

In the fall of 1991, a tight parliamentary election brought a right-wing coalition government to power, and by July the following year it had implemented an extensive reform of the independent school system. The reform introduced a voucher-based funding system for the independent schools and abolished the restriction of independent school status to the types listed above. The vouchers were to be paid out by the students’ home municipalities, at a level basically on par with the funding provided to the municipality-operated schools.Footnote 9

For the period under study in this paper, the application process for start-up independent schools was handled by The Swedish National Agency for Education. According to the regulation, approval should be granted if the provider was deemed competent to provide education according to the goals and (since 1997) the value system of the Swedish education system, and had a credible economic plan.

The 1992 reform meant a significant improvement of the conditions for the independent schools and led to an immediate rise in the number of independent schools, as illustrated in Fig. 1.Footnote 10

Fig. 1 Number of independent and public schools offering grades 1–3 in 1988–2000. Note that the number of public schools has been divided by ten

The rapid expansion of the independent schools is also visible in the maps of Fig. 2, which show the locations of all grade 1–3 independent schools in the last pre-reform year of 1991, as well as in 2000. The maps also show the population density among school-age children in 1991. As they indicate, independent schools opened in many parts of the country but tended to cluster in the more densely populated areas.

Fig. 2 Maps of the locations of independent schools offering grades 1–3 in 1991 and 2000, and of the population density among 7–9-year-olds in 1991

3 Data variables of the empirical model

Before we move on to the issue of the spatial definition of local entities, this section describes the data variables of the analysis of the location choices of the independent schools that entered the education market after the 1992 independent school reform.Footnote 11 All right-hand side regression variables will be measured in 1991,Footnote 12,Footnote 13 the year before the reform, and the empirical model will test how well they predict the subsequent location choices made by schools opening up in 1992–2000.Footnote 14

3.1 Right-hand side regression variables

The aim is to include variables that capture a broad range of location-specific characteristics that potentially influence the location choices of independent school providers.Footnote 15 To this end, I first construct a set of variables for students’ family background, namely: (i) the share of students with at least one parent with post-secondary education; (ii) the share of students with at least one Swedish-born parent; and (iii) the average and standard deviation of the disposable household income, all measured among the local population aged 7–9. These variables are motivated on the grounds that school providers may prefer to locate near students with a certain socio-economic profile in the hope of attracting them to the school. On the one hand, a provider with a social mission to bring good education to disadvantaged areas may opt for neighbourhoods where students in general come from a low-education family background. On the other hand, another provider may prefer students from a high-education or high-income background, in order to earn a reputation as an elite school, or if such students are perceived as being associated with lower costs. These student background variables may also be related to the demand side, if independent schooling is more popular among students of a certain family background. It shall be underlined that the results of the empirical model will not enable us to distinguish between these potential underlying channels (several of them may be at play simultaneously) but will rather show the aggregate picture.

Second, the model will include variables for the local student population density, measured as the number of students aged 7–9 since this corresponds to the lower primary school age,Footnote 16 and for the local school density, defined as the number of public and independent schools, respectively, in the local market.Footnote 17 These variables are included as they are likely to capture the local demand for an entering school: more students mean a higher demand in general, and even more so if the number of existing schools in the neighbourhood is low. The local demand may furthermore be higher if the quality of the existing schools is low. In order to incorporate this notion in the model, I add as a proxy variable for education quality the grade point average (GPA) among local students at age 16, which is the earliest age at which any educational attainment information is provided in the national registers for the studied period. This is naturally a crude measure of lower primary school quality, as it also reflects the quality of education in later grades, but it is the information that is available for the period under study.Footnote 18 GPA is furthermore a limited indicator of quality if we consider that schools may differ in their grading standards, but it is on the other hand an indicator that is visible to both families and school providers.

Third, I add a political variable in the form of an indicator for left-wing local political majority, defined as the Social Democrats and the Left Party jointly having at least 50% of the seats of the municipal council. The council decides on issues such as school choice policy, public transport, and construction permits, and can thus impact the day-to-day operation of the schools. The left-wing parties have from the start been more sceptical of independent schools than the right-wing parties, and the hypothesis is thus that independent schools may seek to avoid left-wing municipalities.

Finally, the model includes variables for local costs for facilities and wages, which are the main cost items for schools, as well as a measure of the per-student voucher a school can expect to receive from the municipalities. Due to a lack of more precise information,Footnote 19 these are approximated using the following variables: The expected costs for facilities will be measured using the per-student cost for school premises in the municipality-operated schools, and labour market region dummy variables will be added to the regression in a robustness analysis to account for the fact that teacher wages may vary regionally.Footnote 20 The expected per-student vouchers are measured as the per-student expenditure in the municipalities’ own schools. This is motivated by the fact that the vouchers shall, according to the regulation, approximately follow the resources provided to the municipal schools. While these measures are included to give a more comprehensive model, it shall be underlined that they are likely measured with error, and their estimates shall thus be interpreted with extra caution.

3.2 Outcome variable

The outcome variable of the regression model is measured as a binary indicator of independent school entry. In the baseline specification it takes the value one if at least one independent school opened in a local school market at any point in time between 1992, the first year of the independent school reform, and year 2000.Footnote 21,Footnote 22 This means that I will study the location decisions of the independent school start-ups during the first 9 years of the independent school reform. Results for alternative time periods are reported in a sensitivity analysis in Sect. 5.3.

The binary definition of the outcome variable does not take into account the intensity of the outcome variable, i.e. whether one or more independent schools choose a specific location. As will become clear in the following section, I will for most of the regression analysis define the geographical units of the analysis based on very small geographical entities, such that there will rarely be more than one independent school per unit. However, for the regressions using the larger SAMS and municipalities as spatial units, I will complement the binary outcome with a continuous outcome variable in the form of the number of start-up independent schools.

4 Spatial definitions of potential school locations and school market characteristics

In contrast to most of the previous literature on this subject, this study will not rely on pre-existing administrative geographical units (school districts, zip codes, etc.). The reason is that such areas, which were not generated for the particular research question at hand, may not constitute suitable measures of local characteristics in the context of schools’ location choices. The relevant unit of measurement may additionally vary across the regression variables. This section first provides a brief overview of the modifiable areal unit problem in relation to school locations, and then turns to the spatial definitions that are used in the analysis.

Before doing so, it shall be acknowledged that this is not the first school location paper to deviate from using (only) administrative neighbourhood measures.Footnote 23 Compared to the previous studies, however, this paper goes a step further by using a wider set of alternative measures of geographical school markets that are tailor-made for the analysis.

4.1 The modifiable areal unit problem in the context of school locations

As was mentioned in the introduction, the general message of the MAUP is that the level and shape of the spatial aggregation may affect the results that are obtained from the analysis. This may seem self-evident, but the MAUP is useful in defining and analysing the concept in more detail.Footnote 24 In particular, the problem is divided into two parts: the scale and zoning problems. The first relates to, as the name suggests, how the scale of spatial unit affects measurement, while the zoning problem has to do with its shape.

The problem can be illustrated in the context of school locations, for example by considering the US literature, which has often used school districts as the spatial unit of aggregation.Footnote 25 Relating to the MAUP, the root of the problem is that these districts were not created in order to be used in studies of school locations, but are rather local government units. This means that if we are lucky, they may incidentally coincide with what is the appropriate level of aggregation for school location decisions. But it may well be the case that they are too large—or too small—(the scale problem), or of the wrong shape (the zoning problem), to be useful. It could for example be that a school that chooses to locate near the border of a school district does so with the aim to serve students in parts of all districts near that border, instead of serving only the students in the district of location. If so, using the school district as unit of analysis means measuring the relevant variables with error. Or, it could be the case that one (large) school district comprises a number of (smaller) potential school locations, in which case the level of aggregation is too high and may mask underlying patterns.Footnote 26 The issue is furthermore complicated by the fact that what is the relevant unit of measurement may vary across the regression variables, such that different levels of aggregation are relevant for different variables.
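The border example can be illustrated with a small sketch. Everything in it is hypothetical: the coordinates, the two districts "A" and "B" separated by a border at x = 0, and the 3 km catchment radius are assumptions chosen only to show how a district-based pool and a distance-based pool can differ for the same school.

```python
import math

# Hypothetical students: home coordinates in km and district of residence.
students = [
    {"home": (-2.0, 0.0), "district": "A"},
    {"home": (-0.5, 0.0), "district": "A"},
    {"home": (0.5, 0.0), "district": "B"},
    {"home": (1.0, 0.5), "district": "B"},
    {"home": (-6.0, 0.0), "district": "A"},
]

# Hypothetical entrant located just inside district A, next to the border.
school = {"location": (-0.1, 0.0), "district": "A"}


def district_pool(school, students):
    """Students living in the same administrative district as the school."""
    return [s for s in students if s["district"] == school["district"]]


def radius_pool(school, students, radius_km):
    """Students living within radius_km of the school, ignoring borders."""
    sx, sy = school["location"]
    return [
        s for s in students
        if math.hypot(s["home"][0] - sx, s["home"][1] - sy) <= radius_km
    ]


by_district = district_pool(school, students)
by_distance = radius_pool(school, students, 3.0)
across_border = [s for s in by_distance if s["district"] != school["district"]]
```

In this toy setting the district-based pool includes a student living almost 6 km away and misses two nearby students on the other side of the border, which is the kind of measurement error described above.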

These concerns are naturally not limited to school districts, but potentially apply to any spatial measure that was not tailor-made for the particular analysis at hand. In the Swedish context, one commonly used local spatial unit is the Small Area Market Statistics (SAMS), generated by Statistics Sweden.Footnote 27 It is also relatively common to base regional analyses on the municipalities, i.e. the lowest level of local government.Footnote 28 Both of these measures, however, suffer from drawbacks that can be related to the MAUP. As noted by Amcoff (2012), differences in the underlying local data that were used to construct the SAMS units in the 1990s have led to large, and apparently arbitrary, size differences in the SAMS areas in different parts of the country. A striking example is that the SAMS areas are significantly smaller in central Gothenburg (Sweden’s second largest city) than in central Stockholm (the largest city).Footnote 29 The municipalities are also very different in size, ranging from the smallest municipality, Bjurholm, with a population of 2450, to the largest, Stockholm, with over 960,000 inhabitants.Footnote 30 While this does not rule out that SAMS or municipalities are sometimes the appropriate spatial measures, it seems plausible that for many research problems these sizeable differences in scale could be problematic.

4.2 Spatial definition of the outcome variable

In an effort to deal with the MAUP, and starting out with the question of how to best measure the school location variable, I note that in principle, every “spot on the map”—every coordinate point—constitutes a potential location for an entering independent school. That said, letting each coordinate point constitute a separate location unit is not a feasible option for the regression analysis, as it would result in low statistical power due to too many (too small) location spots.Footnote 31 I therefore aggregate the coordinate points to a larger grid, consisting of 1 km × 1 km squares.Footnote 32 The choice of exactly 1 km × 1 km sized grid cells is arbitrary, and as an alternative, I also provide results from using a smaller 0.5 km × 0.5 km sized grid, thus changing the scale of the spatial units in the terminology of the MAUP.

The generated grid cells constitute the locations that can be chosen by the entering independent school. The outcome variable of the regression analysis will thus be defined as a dummy variable which takes the value one if at least one independent school has started up in the grid cell during the period of study, and zero otherwise.Footnote 33 Grid cells located in low-populated areas, defined as having fewer than 30 students residing within a 3 km radius, will, however, be excluded from the regression sample, as they are unlikely to be considered by entering schools. The resulting regression sample of grid cells for the 1 km × 1 km specification is shown in the left-hand side map in Fig. 3. As indicated by the middle and right-hand side maps (which are copied from Fig. 2), the regression sample grid cells cover the more populated areas of Sweden and the vast majority of the actual independent school location choices.
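The construction of the grid-based outcome can be sketched as follows. This is a minimal illustration and not the code used for the analysis: it assumes planar coordinates in kilometres, assumes that the 3 km radius is measured from the grid cell midpoint, and uses hypothetical function names and toy data throughout.

```python
import math

CELL_KM = 1.0       # side length of a grid cell (baseline 1 km x 1 km)
RADIUS_KM = 3.0     # radius for the low-population rule (assumed from midpoint)
MIN_STUDENTS = 30   # cells with fewer students nearby are excluded


def cell_of(x_km, y_km, cell_km=CELL_KM):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(x_km / cell_km), math.floor(y_km / cell_km))


def midpoint(cell, cell_km=CELL_KM):
    """Return the coordinates of the midpoint of a grid cell."""
    col, row = cell
    return ((col + 0.5) * cell_km, (row + 0.5) * cell_km)


def students_within(point, students, radius_km):
    """Count students whose home lies within radius_km of the point."""
    px, py = point
    return sum(
        1 for sx, sy in students
        if math.hypot(sx - px, sy - py) <= radius_km
    )


def build_outcome(cells, students, new_schools):
    """Map each sufficiently populated cell to 1 if any school entered it."""
    entered = {cell_of(x, y) for x, y in new_schools}
    outcome = {}
    for cell in cells:
        if students_within(midpoint(cell), students, RADIUS_KM) < MIN_STUDENTS:
            continue  # low-populated cell: dropped from the regression sample
        outcome[cell] = 1 if cell in entered else 0
    return outcome


# Hypothetical toy data: 40 students clustered near the origin,
# and two start-up schools that both fall in cell (0, 0).
students = [(0.2 + 0.01 * i, 0.3) for i in range(40)]
new_schools = [(0.7, 0.4), (0.9, 0.1)]
cells = [(0, 0), (1, 0), (10, 10)]
outcome = build_outcome(cells, students, new_schools)
```

Because the outcome is binary, the two schools in cell (0, 0) count as a single entry, and the remote cell (10, 10) is dropped rather than coded as zero.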

Fig. 3 Maps of the regression sample grid cells for the 1 km × 1 km grid, the locations of independent schools offering grades 1–3 in 2000, and the population density of 7–9-year-olds in 1991

In addition to these grid-based specifications, I will, as previously commented, also provide results from using SAMS and municipality as unit of analysis for the sake of comparison.

4.3 Spatial definition of the explanatory variables

For the explanatory regression variables, the spatial aggregation will vary depending on what is suitable for the specific variable, and depending on the restrictions posed by the data at hand.

First, the regression variables for student background, i.e. parental education level, immigrant background, disposable income and GPA, and for the voucher level,Footnote 34 shall be generated such that they reflect the characteristics of the pool of students that an entering school can be expected to attract if it chooses a certain location. Taking this into account, I define the following four alternative spatial measures:

  (i) Students residing within a 3-km radius from a grid cell midpoint.

  (ii) Students residing within a 1.5-km radius from a grid cell midpoint.

  (iii) The 50 students residing nearest the grid cell midpoint, and

  (iv) The 100 students residing nearest the grid cell midpoint.Footnote 35

The first two measures are based on the notion that primary school students are likely to prefer schools located near their home, and use fixed cut-offs to measure proximity.Footnote 36 The precise cut-offs are chosen ad hoc. Whereas I lack data on the school of attendance for students of primary school age, the median distance to the school of attendance for students in the final grade of compulsory school, who are typically 16 years of age, was 1.6 km in 2000, and the average distance was 7.2 km. This suggests that, even among students much older than the 7–9-year-olds that are of relevance to this study, most attend a school in the near vicinity. This link is likely to be even stronger for younger students.

Alternatives (iii) and (iv) instead take into account the fact that what is viewed as an acceptable travel distance is likely to differ across regions, for example depending on access to public transport. They thus assume that it is the students residing nearest the school that are likely to be more interested in the school, without explicitly taking into account the travel distance.Footnote 37
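The difference between the radius-based and the nearest-student measures can be sketched in a few lines. The example is illustrative only: the student records, the field names, and the use of straight-line distance are assumptions, and the toy sample uses the 3 nearest students in place of the 50 or 100 used in the analysis.

```python
import math


def distance(a, b):
    """Straight-line distance in km between two coordinate pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def pool_within_radius(midpoint, students, radius_km):
    """Measures (i) and (ii): students within a fixed radius of the midpoint."""
    return [s for s in students if distance(midpoint, s["home"]) <= radius_km]


def pool_nearest(midpoint, students, k):
    """Measures (iii) and (iv): the k students residing nearest the midpoint."""
    ranked = sorted(students, key=lambda s: distance(midpoint, s["home"]))
    return ranked[:k]


def share_high_educ(pool):
    """Share of students in the pool with a post-secondary educated parent."""
    if not pool:
        return None  # undefined when no students fall in the pool
    return sum(s["high_educ_parent"] for s in pool) / len(pool)


# Hypothetical students along a line east of the grid cell midpoint.
students = [
    {"home": (0.5, 0.0), "high_educ_parent": 1},
    {"home": (1.0, 0.0), "high_educ_parent": 1},
    {"home": (2.0, 0.0), "high_educ_parent": 0},
    {"home": (2.5, 0.0), "high_educ_parent": 0},
    {"home": (5.0, 0.0), "high_educ_parent": 1},
]
mid = (0.0, 0.0)

share_3km = share_high_educ(pool_within_radius(mid, students, 3.0))
share_1_5km = share_high_educ(pool_within_radius(mid, students, 1.5))
share_3_nearest = share_high_educ(pool_nearest(mid, students, 3))
```

The same location gets a different value of the explanatory variable under each definition, which is the variation that the alternative specifications exploit.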

Second, the variable for population density will be measured using the same cut-off values as in the two first student-based measures above (note that the nearest-student type alternatives (iii) and (iv) are not useful for this variable), since this variable is relevant for the entering school to the extent that it affects the number of students who are potentially interested in attending it. That is, population density is in the regression analysis defined as the number of age 7–9 individuals residing within 3 km from a grid cell midpoint, or alternatively, within 1.5 km from a grid cell midpoint.

Third, the variables for school density will be measured using double the cut-off distances for the population density measure, namely 6 km and 3 km from a grid cell midpoint, respectively. This is based on the notion that if schools are assumed to attract students within a given radius (3 km and 1.5 km above), then they are expected to compete with schools within twice that distance.
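The doubling rule can be made explicit with a short geometric argument. This is a sketch under the assumption, not stated in this form in the text, that catchment areas are discs of equal radius \(r\) around each school; the symbols \(g\), \(s\), \(p\), and \(d(\cdot,\cdot)\) are introduced here for illustration only. An entrant at \(g\) and an existing school at \(s\) can share a potential student residing at \(p\) only if

$$d(g,p) \le r \quad \text{and} \quad d(s,p) \le r \;\Longrightarrow\; d(g,s) \le d(g,p) + d(p,s) \le 2r,$$

by the triangle inequality. Conversely, if \(d(g,s) \le 2r\), the midpoint of the segment between \(g\) and \(s\) lies within \(r\) of both schools, so the two discs overlap. With \(r = 3\) km this gives the 6 km cut-off, and with \(r = 1.5\) km the 3 km cut-off.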

For the variable municipal political majority, finally, the relevant geographical area is naturally the municipality, so each observation—whether based on grid cell, SAMS, or municipality—will for this variable be assigned the value of political majority corresponding to its municipality. Furthermore, the proxy variable for costs for school premises is also measured at the level of the municipality, although in this case due to lack of more detailed information.

In addition to the above tailor-made spatial measures, I will also show results for two commonly used measures of local areas in Sweden: SAMS and municipalities. In those regressions, both the explanatory and the outcome variables are measured at the SAMS/municipality level.

The analysis will thus be carried out on a set of alternative regression variables, based on slightly different spatial assumptions. The aim is, as stated above, to evaluate if the results change when the different measures are used, thereby indicating if the MAUP is present in the current setting.

Table 1 summarizes descriptive statistics for the alternative variables based on the 1 km × 1 km grid cells, and for the SAMS and municipality level measures. (Descriptive statistics for the variables when using 0.5 km × 0.5 km grid cells are available in the working paper version of the article, see Edmark 2018.) The table uses the following abbreviations for the variable names: dS2000 denotes the binary outcome variable defined based on the period 1992–2000, and NrS2000 denotes the continuous outcome measure (the number of start-up schools) for the same period. GPA denotes the grade point average, High educ parent denotes the variable for parental education level, Fam disp inc is the family disposable income measure, and Std Fam disp inc gives the standard deviation of the same measure. Sw parent denotes the variable for Swedish-born parent, School-age pop is the density of the school-age population, and Voucher proxy denotes the proxy variable for the voucher level. Left denotes the local political majority variable, and Costs premises gives the proxy variable for costs for premises. Finally, Mun sch density and Indep sch density denote the density variables for municipal and independent schools, respectively.

Table 1 Descriptive statistics for all generated regression variables for the 1 km × 1 km grid, and for the municipality and SAMS-level specifications

5 Regression model and results

5.1 Regression model and estimation issues

The likelihood that an independent school chooses location \(g\) in municipality \(m\), \(P\left( y_{gm} \right)\), is modelled as a logistic function of the matrix of local characteristics \(\boldsymbol{X}_{gm}\), which implies the following regression equationFootnote 38:

$$P\left( y_{gm} \mid \boldsymbol{X}_{gm} \right) = \frac{\exp \left( \alpha + \boldsymbol{X}_{gm}^{\prime} \boldsymbol{\beta} + \varepsilon_{gm} \right)}{1 + \exp \left( \alpha + \boldsymbol{X}_{gm}^{\prime} \boldsymbol{\beta} + \varepsilon_{gm} \right)}.$$
(1)

As previously mentioned, the outcome variable is based on independent school start-ups in 1992–2000, and the explanatory variables are measured in 1991 in order to avoid endogeneity bias. The downside of measuring local characteristics in 1991, however, is that they will be imperfect measures for the local characteristics in later years, and likely more so the further we move in time from the starting year. The correlation between the variables over time is, however, quite strong, which suggests that this is not a major issue.Footnote 39,Footnote 40

5.2 Main results

Table 2 shows the regression results for the outcome period 1992–2000 for the set of alternative local school market specifications. The results in Table 2 are presented in the form of elasticities, in order to facilitate comparison between the specifications. Note that all columns show the results of logit estimations for the binary outcome variable for independent school locations, except for columns (6) and (8) which show the results from Poisson regression models using the continuous outcome variables measuring the number of independent schools opening up in SAMS (column 6) and municipalities (column 8). These estimations are shown as a complement to the binary specifications for the SAMS and municipality regressions, and are in particular relevant for the municipalities, which often have several independent schools.
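For reference, the standard expression for an elasticity in a logit model can be written out as follows. This is a sketch that assumes the elasticities are computed observation by observation and then averaged, which the text does not spell out; it also leaves out the error term of Eq. (1), and \(\Lambda\), \(\eta_{k,gm}\), \(x_{k,gm}\), and \(N\) are notation introduced here. With \(P_{gm} = \Lambda(\alpha + \boldsymbol{X}_{gm}^{\prime}\boldsymbol{\beta})\) and \(\Lambda(z) = \exp(z)/(1+\exp(z))\),

$$\frac{\partial P_{gm}}{\partial x_{k,gm}} = \beta_{k} P_{gm}\left(1 - P_{gm}\right), \qquad \eta_{k,gm} = \frac{\partial P_{gm}}{\partial x_{k,gm}} \cdot \frac{x_{k,gm}}{P_{gm}} = \beta_{k}\, x_{k,gm}\left(1 - P_{gm}\right),$$

$$\bar{\eta}_{k} = \frac{1}{N}\sum_{g,m} \beta_{k}\, x_{k,gm}\left(1 - P_{gm}\right).$$

For a Poisson model with conditional mean \(\exp(\alpha + \boldsymbol{X}_{gm}^{\prime}\boldsymbol{\beta})\), the corresponding elasticity is \(\beta_{k}\, x_{k,gm}\). In both cases the elasticity is unit-free, so estimates from specifications with differently scaled variables can be compared directly.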

Table 2 Average marginal effects: elasticities, outcome period 1992–2000. Population density ≤ 3 km, school density ≤ 6 km. 1 km × 1 km grid

The results in Table 2 suggest that the probability for independent school entry is higher in locations where a larger share of the student population has high-educated parents, and lower where a larger share of the student population has Swedish-born parents. It is furthermore positively correlated with the local student population density and is negatively correlated with a left-wing municipal political majority. In terms of estimate sizes, the estimated elasticities are the largest for the share of students with Swedish-born parents: a 1% increase in this variable is estimated to be correlated with a 2–4% lower probability of independent school location, depending on the specification. The second largest elasticity, 0.7–1.8%, is estimated for the share of students with high-educated parents.

For the variable left-wing political majority, it is more intuitive to express the estimate size in terms of a 0–1 change than in terms of the elasticities given in the table. Expressed in this manner, the results suggest that locations in municipalities with a left-wing political majority are on average 0.6–1% less likely than others to experience independent school entry.Footnote 41
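The 0–1 change for a dummy variable in a logit model can be computed as the difference between two predicted probabilities. The sketch below shows the mechanics only: the coefficient values and linear indices are hypothetical, chosen for illustration, and are not the estimates behind Table 2.

```python
import math


def logistic(z):
    """Logistic function, as in Eq. (1) without the error term."""
    return 1.0 / (1.0 + math.exp(-z))


def discrete_change(alpha, beta_left, xb_other):
    """P(entry | left = 1) - P(entry | left = 0), other terms held fixed."""
    with_left = logistic(alpha + xb_other + beta_left)
    without_left = logistic(alpha + xb_other)
    return with_left - without_left


def average_discrete_change(alpha, beta_left, xb_others):
    """Average the 0-1 change over observations (one index per location)."""
    changes = [discrete_change(alpha, beta_left, xb) for xb in xb_others]
    return sum(changes) / len(changes)


# Hypothetical values: intercept, coefficient on the left-wing dummy,
# and the contribution of all other regressors for three locations.
alpha = -4.0
beta_left = -0.5
xb_others = [0.0, 0.5, 1.0]
avg_change = average_discrete_change(alpha, beta_left, xb_others)
```

A negative coefficient on the dummy gives a negative change at every location, but its size depends on the baseline probability, which is why the change is averaged over the sample rather than read off the coefficient.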

The above results are consistent for all of the alternative school market definitions, except for a statistically insignificant estimate on population density in the municipality level regression in column (8). Some of the other estimated elasticities, however, differ markedly depending on the level of spatial aggregation. For example, the elasticities in columns (1)–(3) of Table 2 suggest that a 1% higher standard deviation in household income is correlated with an approximately 0.1% increase in the likelihood that an independent school opens up. However, this relation turns insignificant when measured among the 100 nearest students in column (4), and it remains insignificant, and sometimes even changes sign, in the SAMS and municipality level specifications in columns (5)–(8).

In this case, which of the specifications shall we trust? As described in Sect. 4.3, the alternative grid-based spatial measures were derived from slightly different assumptions about students’ willingness to travel to school, and the resulting impact on competition between schools. However, our knowledge of students’ actual willingness to travel to school is limited, and so we cannot determine, based on the results, which specification is the more correct one. What we can say, however, from the MAUP, is that using too large a scale in general tends to reduce the variance in the data, by smoothing out local extreme values. This in turn makes it harder to identify underlying associations that occur at a smaller scale. It is possible that the above-mentioned insignificant estimates for the standard deviation in household income in columns (4)–(8), i.e. the specifications using the larger of the “nearest students” measures, SAMS and municipalities, are a result of using too large a scale.
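The smoothing effect of a larger scale can be shown with a toy calculation. The numbers below are hypothetical: eight fine-scale values of some local share are averaged pairwise into four larger units, and the variance across units is compared before and after.

```python
import statistics


def aggregate(values, block):
    """Average consecutive groups of `block` fine-scale values."""
    return [
        statistics.mean(values[i:i + block])
        for i in range(0, len(values), block)
    ]


# Hypothetical fine-scale values, e.g. a local share measured per small cell.
fine = [0.1, 0.9, 0.2, 0.6, 0.4, 0.8, 0.3, 0.7]

# The same data after merging neighbouring cells two by two.
coarse = aggregate(fine, 2)

var_fine = statistics.pvariance(fine)
var_coarse = statistics.pvariance(coarse)
```

With equally sized blocks, the variance across the aggregated units cannot exceed the variance across the original cells, since the within-block variation is averaged away. This is the mechanism by which too coarse a scale can hide associations that operate locally.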

The local student grade point average (GPA) is furthermore estimated to be strongly and statistically significantly correlated with the likelihood of independent school location for the specification using a 3-km cut-off in column (1) and the SAMS-based specifications in columns (5)–(6), and even more so when using the municipality level in columns (7)–(8). The elasticity is, however, small and statistically insignificant for the other specifications in columns (2)–(4). The estimate for the proxy variable for costs for premises also varies a lot across the specifications. While this variable is always measured at the municipality level, the alternative specifications still differ in the level at which school entry is measured. In this case, the SAMS-level specifications in columns (5)–(6) stand out by yielding statistically significant and positive elasticities, i.e. of the opposite sign to what we expected theoretically. The estimates are, however, insignificant and of much smaller magnitude, and sometimes of negative sign, for the other specifications. I deem it likely that the large and positive estimates for the SAMS-level specifications are the result of some omitted variable which correlates with the independent school entry variable at the SAMS level. In general, the variation in the estimates for GPA and costs for premises across the specifications is in itself an indication of the MAUP: it exemplifies how the level of aggregation can impact the estimated results. However, just as for the estimate on the standard deviation in household income above, we cannot deduce from the results which one of the specifications is more correct.

5.3 Additional robustness tests

In addition to the above estimations, the independent school location choice was estimated after making the following alterations to the baseline specification of Table 2Footnote 42:

  • Labour market region dummies were added to the regressions, in order to account for the fact that teacher wage levels may differ between regions. For the grid cell-based specifications, this had little impact on the overall results. The most striking differences were that the estimated positive elasticity for the standard deviation of household income came out as statistically significant in all grid cell-based specifications, and the positive estimate for the elasticity for the average disposable income level was statistically significantly different from zero in all specifications. The dummy variable for left-wing political majority was no longer statistically significant. This is, however, not surprising given that the labour market regions are likely to capture much of the municipality-level variation. The SAMS-level estimates were overall more sensitive to the inclusion of labour market region dummies than the grid cell-based estimationsFootnote 43: the elasticities for the share of Swedish-born parents and the cost for premises were no longer statistically significantly different from zero. In addition, a negative and statistically significant elasticity was estimated for the average disposable income.

  • Estimating the regression for alternative outcome years (1992–1995 and 1992–2005, respectively), and estimating the regression for outcome years 1996–2000 with the explanatory variables measured in 1995. The overall pattern of results was similar to the baseline across specifications. The main difference was that the estimate for the left-wing majority in the municipal council was not always statistically significantly different from zero.

  • Using a smaller (500 m × 500 m) grid (instead of the baseline 1 km × 1 km size grid) to measure potential location points. This yielded results that were very similar to the baseline specification.

  • Defining population density as the number of students residing within 1.5 km of the grid cell (instead of the 3 km in the baseline specification) and measuring school density within 3 km (instead of 6 km). Most estimates were unaltered by this; however, the standard deviation of income, which was statistically significant in most specifications in the baseline case, was now insignificant in most of the specifications. Municipality school density, which was previously insignificant, was by contrast now positive and statistically significant in all four specifications.

  • Adding dummy variables for municipality type, based on the categories defined by the Swedish Association of Local Authorities and Regions,Footnote 44 and estimating the regression only for the municipalities within Stockholm County. The results were overall very similar to the baseline specification. The most prominent difference was that the estimated elasticities for the average and standard deviation in local disposable income were more often statistically significantly different from zero than in the baseline specifications.
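To make the alternative density definitions in the list above concrete, the following is a minimal sketch of how a radius-based student count could be computed for a single grid cell under two cut-offs. It is an illustration under stated assumptions, not the procedure used in this paper: the coordinates are hypothetical and expressed in kilometres on a planar grid, straight-line (Euclidean) distance is assumed, and the function name students_within is made up for the example.

```python
import math


def students_within(cell_xy, students_xy, radius_km):
    """Count students whose residence lies within radius_km of a grid cell.

    Assumes planar coordinates in kilometres and straight-line distance.
    """
    cx, cy = cell_xy
    return sum(
        1 for (x, y) in students_xy
        if math.hypot(x - cx, y - cy) <= radius_km
    )


# Hypothetical student residences (km) and one hypothetical grid cell centre.
students = [(0.5, 0.5), (1.0, 1.0), (2.0, 0.0), (2.5, 1.5), (4.0, 4.0)]
cell = (0.0, 0.0)

density_3km = students_within(cell, students, 3.0)     # baseline-style cut-off
density_1_5km = students_within(cell, students, 1.5)   # smaller cut-off

print(density_3km, density_1_5km)
```

In this toy example, halving the radius changes the count for the same cell, which shows how the choice of cut-off alters the explanatory variable even when the underlying population is unchanged.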

6 Concluding discussion

The overall results of this study suggest that the likelihood for independent school entry during the studied period was higher in locations where a larger share of school-age children had highly educated parents and where the local population density among school-age children was higher. The likelihood was furthermore lower in locations with a larger share of school-age children with Swedish-born parents. These results were stable across the alternative spatial specifications and the alternative regression models that were estimated.

These results are well in line with previous results for the USA and Sweden. Many of the US studies suggest that the likelihood for charter or private school location is positively correlated with the local adult education level and with a higher level of dispersion in terms of ethnicity or higher shares of students with foreign background. The previous Swedish reports, Angelov and Edmark (2016) and Holmlund et al. (2014), also suggest a higher likelihood for independent school entry in locations with a larger share of students from highly educated or foreign family backgrounds, and in more densely populated areas.

The results also showed that a left-wing political majority in the local council was negatively correlated with independent school entry in the baseline specification, although this correlation was not always statistically significant when alternative outcome periods were used.

The independent schools furthermore tended to locate in areas with a higher income dispersion, although this result was not stable across all spatial specifications. Other variables were even more sensitive to variations in the spatial aggregation measures. This held, for example, for the average GPA among local students, for the level of local household income, and for the proxy variables for local voucher levels and estimated costs for facilities. Although the analysis was not a formal test of which spatial aggregation was the best, the results indicate that the spatial unit of analysis can have a substantial impact on the estimated results, thus highlighting the potential importance of the MAUP.