The purpose of this paper is to estimate families’ willingness to pay for different school characteristics in two contiguous provinces of northern Spain, and to examine how these estimates differ when bilingualism is incorporated into the educational system. Specifically, we investigate the educational choices made by families in the Basque Country and Cantabria. In the former, Spanish coexists with Basque, a regional minority language, whereas Spanish is the only official language in Cantabria. Each educational system is tailored to its region’s bilingualism, or lack thereof.

The different educational systems in these geographically similar regions of Spain provide an opportunity to examine whether the presence of a second official language, as well as the local government's policy to increase its number of speakers, has an impact on the priorities of parents and their family budgets when selecting a school for their children.

Government policies on school choice and the educational system can strongly affect both how much of their household budget parents allocate to schooling and how families are sorted across schools and classrooms, potentially leading to higher (or lower) levels of socioeconomic segregation (Baum-Snow & Lutz, 2011; Burgess & Briggs, 2010; Elacqua, 2012; Kim, 2018; Nechyba, 2000).

In Spain, which has a federal government and where several official (minority) languages coexist with Spanish, regional authorities have a great deal of leeway in terms of both the mechanism of school choice and the organisation of the educational system, as long as they meet certain requirements set by the central government. Regions with a second official (minority) language, such as the Basque Country, have placed great emphasis on teaching Basque and using it as a language of instruction in order to increase the number of Basque speakers.

The goal of this paper is to determine whether the Basque Country’s school system and policies regarding bilingualism have had an effect on parental school budgets, and whether the school characteristics for which parents are most willing to pay differ when bilingualism is involved. More specifically, the goal is to assess parental preferences for bilingualism in school selection in two scenarios: when the bilingualism offered by the school involves a local minority language and when it involves an international foreign language such as English. This is studied through two discrete choice experiments (DCEs) in two specific areas: the metropolitan area of Bilbao, in the Basque Country, and the metropolitan area of Santander, in Cantabria.

This study builds on the case study presented by Vega-Bayo and Mariel (2019), which analysed parental preferences regarding school choice in the metropolitan area of Bilbao, by conducting a second DCE on school choice in a nearby area, the metropolitan area of Santander. By comparing the results obtained from the two DCEs using a latent class model (LCM), we are able to see how family preferences in school choice vary between two contiguous regions in Spain, and, more specifically, how much variation there is in the willingness to pay for specific school characteristics.

Conducting a DCE for school choice in Spain offers several advantages over working with revealed preferences. Firstly, it allows us to show individuals combinations that do not exist in the market, which they might prefer over the existing market options. Secondly, we can use a DCE to obtain several answers (observations) from the same individual or parent. This can improve the precision of the estimates of parental preferences because it increases the number of observations. Working with revealed or observational preferences is currently not feasible in the Basque Country due to legal constraints and a general lack of data availability. Such an analysis would require families’ addresses in order to calculate the distance from each school to the family home. Moreover, the fees charged by government-dependent private schools are not fees or tuition per se, but are disguised as monthly “donations”, and there is no official data on these donations because such fees are actually illegal (Arellano & Zamarro, 2007; Consumers’ & Users’ Organisation – OCU, 2012; Government of Spain, 1985).

The main results show that the existence of a second official language strongly influences parental preferences for school choice because the language of instruction drives the choice for most parents in the metropolitan area of Bilbao. In contrast, families in the other area, which does not have a second official language, show preference patterns more generally reported in the literature, with families with a higher level of education segregating into government-dependent private schools.

The following section of this paper, devoted to a “Literature Review”, is followed by a section that describes the case studies and the differences between the two. The “Sampling and Data: Descriptive Statistics” section continues by explaining the sampling procedure and the descriptive statistics of the gathered data. The “Methodology” section details the estimation method used for the analysis. The section “Results and Discussion” comments on the results obtained, and the last section concludes.

Literature Review

The literature on the economic impact of bilingualism shows that, besides an increase in earnings (Cappellari & Di Paolo, 2018; Saiz & Zoido, 2005), bilinguals may also benefit from enhanced attention to detail, cross-cultural awareness, and social communication abilities. Though the estimated increase in earnings is relatively small (1.4% per year of bilingual education in Catalonia according to Cappellari & Di Paolo, 2018, and between 2 and 3% for the US according to Saiz & Zoido, 2005), studies have shown that there is also a link between bilingualism and cognitive ability. Bilinguals have been shown to have better executive function as they age than their monolingual counterparts (Bialystok & Craik, 2010). Luk et al. (2011) also observe better white matter connectivity in adult bilinguals, which they believe to be one of the underlying mechanisms of the increased executive function. More recently, Costa and Sebastián-Gallés (2014) demonstrate that bilinguals also have higher neural processing demands that result in heightened brain activity, though they caution that this might only happen if a regular, varied, and socially beneficial linguistic input occurs in both languages. Furthermore, business owners and other unilingual workers benefit from bilingualism as well (Canadian Heritage, 2016).

According to Slavkov (2017), a child’s likelihood of becoming bilingual is directly related to the language of instruction. He found that parents usually emphasize the minority language in communication with their children, whereas the children tend to use the majority language. In the Bilbao area of the Basque Country, only 8.5% of the population speaks Basque at home, while 62% of students are enrolled in the All Basque language model, and another 21% in the Bilingual (Basque-Spanish) model (Eustat – Basque Statistics Institute, 2016, 2020).

Ryan (2016) indicates that few families in the area choosing between public or government-subsidized schools have the option of picking the language of instruction for their children. The key to “blending in, belonging” for Anglophone children in the area was French education, even if it limited future access to English schooling.

Potter and Hayden (2004) discovered that parents in Argentina care about their children learning a second language (English), but the mechanism appears to be different from the Canadian case analysed by Slavkov (2017) who states that parents in Ontario, where English and French are both official languages, care about their children learning a minority language and express that concern. These two situations resemble those in our study areas, Basque Country and Cantabria.

California has a large Latino population. However, the use of languages of instruction other than English was banned in 1998. According to Monzó (2005), although Latino parents valued bilingual education for both cultural and educational reasons, they could not get the information they needed to make informed school decisions. As a result, they predominantly chose English as the only language of instruction, despite those preferences.

Finland is another unique situation, where both Finnish and Swedish are official languages and everyone can choose their language. Lojander-Visapää (2008) observed that even though only 5% of the population speaks Swedish, the number of bilingual individuals increased as Finnish-Swedish groups recruited children with Finnish-Swedish parents into Swedish schools. The number of students in Finnish-Swedish schools is growing, and bilingual families see the benefits of the system. However, most Finns only study Swedish because it is required, and English is their first foreign language (Sjöholm, 2004).

The literature emphasizes the need to consider a second (minority) official language when formulating educational policy and organizing the educational system (Cabau, 2014; Huguet et al., 2000; Priven, 2008). There is also evidence that dual language education helps reduce segregation (Kotok & DeMatthews, 2018). A second (minority) language in the educational system is thus a critical issue in areas where multiple official languages coexist.

The issues described by Potter and Hayden (2004) and Slavkov (2017) are particularly acute in the two regions of Spain that are the focus of this study: the Basque Country and Cantabria. The former has two official languages, Spanish and Basque, with Spanish dominating. Although Spanish is the only official language in the latter, some parents may still worry about bilingualism (Spanish and English in this case), especially since English is a worldwide language.

Case Studies

A DCE is a tool widely used for economic valuation that is based on stated preferences. In the case of school choice, using a DCE that includes monthly cost as a school attribute enables us to estimate how much parents are willing to pay, in euros, for a particular school characteristic when they choose a school for their children. That is, DCEs—when applied to school choice—consider parents’ preferences over differences in the characteristics of schools, such as distance from home, religious orientation, whether it is recommended by family or friends, cost, and many other factors. Parents consider different (hypothetical) alternatives, among which they have to choose a single option. Their hypothetical choices can then be used to estimate their willingness to pay for each of the school’s characteristics.
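As a minimal illustration of this last step, the sketch below computes a marginal willingness to pay under the standard assumption that utility is linear in cost, so that the WTP for an attribute is minus the ratio of its coefficient to the cost coefficient. The function name and the coefficient values are hypothetical and are not estimates from either DCE.

```python
def willingness_to_pay(beta_attribute: float, beta_cost: float) -> float:
    """Marginal WTP in euros per month: -beta_attribute / beta_cost."""
    if beta_cost == 0:
        raise ValueError("the cost coefficient must be non-zero")
    return -beta_attribute / beta_cost


# Hypothetical coefficients for illustration only (not results from the study).
beta_bilingual = 0.9   # utility gain from a bilingual model vs. the base level
beta_cost = -0.015     # utility change per euro of monthly cost

wtp_bilingual = willingness_to_pay(beta_bilingual, beta_cost)
print(round(wtp_bilingual, 2))
```

With these illustrative values, the implied willingness to pay for the attribute is 60 € per month.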

Although they are only 100 km apart, the school choice situation is completely different in the two metropolitan areas examined in this study, due to the bilingual nature of the Basque Country. Under the new territorial organisation that took shape after the Spanish constitution of 1978 (still in effect) was signed, the different Spanish regions (autonomous communities) each organised their own educational systems (Dávila Balsera, 2003).

In the Basque Country, the government regulated the presence of Basque in the school system in an attempt to normalise and increase its use, which was in decline (Basque Government, 1982) due to Franco’s dictatorship. The educational system was organised by introducing language models that differed in the language of instruction: All Spanish, All Basque, or Bilingual (Basque and Spanish). Nowadays, two additional language models coexist in the area: Trilingual (Basque, Spanish, and English) and International Schools (the language of instruction in independent private schools, e.g., French, German, or English).

Table 1 shows the percentage of primary schools in Bizkaia (the province where the first case study is conducted) that offer each of the three main language models. Note that the same school can offer more than one language model option. Public schools clearly favour the All Basque option, while government-dependent schools have a more varied offering when it comes to the language of instruction. Families preferring something other than the All Basque language model will therefore be more likely to choose a non-public school option if they can afford it.

Table 1 Percentage of primary schools in Bizkaia that offer each of the main language models

Both public schools and government-dependent private schools are subsidised by the government. However, public schools have higher spending per student than government-dependent private schools. The families of students in these schools make up (part of) the difference via monthly “donations” (de la Rica et al., 2020a; Basque Government, 1987). Overall, the Basque educational system is very egalitarian, and there are few differences in learning outcomes and quality between schools; there is a “no child left behind” policy (de la Rica et al., 2020b; Basque Government, 2002; ISEI-IVEI, 2016). This means that the Basque system does not conform to the typical situation in which schools teaching in a minority language are considered less prestigious than those that do not. In fact, although there are no official school rankings (due to egalitarianism), the unofficial ranking of Spanish schools carried out by the national newspaper El Mundo consistently ranks five or six schools in Bizkaia as among the best in Spain, and they all teach in Basque (El Mundo, 2020).

Cantabria also organised its own educational system after the creation of autonomous communities and the decentralisation that took place after 1978. Unlike in the Basque Country, however, since there was only one official language, the system did not create different language models until, officially, 2013 (Government of Cantabria, 2013). Both public schools and government-dependent private schools around Santander now offer a bilingual (Spanish–English) model. As in the Basque Country, both types of schools are subsidised by the government, although public schools have a higher expenditure per student (Government of Cantabria, 2020).

Due to the idiosyncrasies of each region, the DCEs carried out are completely separate, and each one has its own design. Although the set of school alternatives varies between DCEs, they both share the same school attributes—that is, the characteristics taken into account by parents when choosing a school. The set of alternatives, attributes, and the levels of both DCEs were all chosen after carrying out qualitative discussion focus groups. We gathered a group of more than 20 people for each of the DCEs, which included all the agents relevant to school choice: parents, teachers, principals, and administrators. The goal of the focus groups was to gather opinions concerning the school alternatives specific to the area, as well as the school attributes considered significant by parents when choosing a school for their children. The participants of the focus groups anonymously rated the importance of several characteristics, on a scale from one to ten. The decisions about alternatives, attributes, and levels were ultimately made according to the answers gathered in the focus groups.

The final DCE designs applied in the two metropolitan areas are as follows. Parents in the Basque Country were presented with three different alternatives: a public school, a government-dependent private school, and an independent private school. One of the most important attributes considered in the Basque Country is the language of instruction, which can be Spanish, Basque or both—or even, additionally, English. All these regional idiosyncrasies are considered in the Basque Country case study, and they are presented in detail in Vega-Bayo and Mariel (2019). In the DCE carried out in the metropolitan area of Santander, parents were asked to choose between the only two alternatives available in this metropolitan area: a public school or a government-dependent private school. Each of these two alternatives has different characteristics (attribute levels). Table 2 shows the attributes and levels considered in both DCEs.

Table 2 Attributes and levels of the DCE

Although the attributes are the same, their levels differ to reflect the particularities of each region. There are three types of schools in the Bilbao area (public, government-dependent private or charter schools, and independent private schools). The cost of the government-dependent schools is also higher on average in Bilbao than in Santander. Distances are also greater, which is reflected in the chosen levels for the “distance to home” attribute. Given the existence of a second official language (Basque) and the fact that the only schools that offer an international language model are the independent private schools, the levels of the language of instruction attribute also differ. There is also a larger presence of immigrants in Basque schools.

Even though all the attributes are shown for each of the different alternatives, their levels depend on the alternative, and some levels are specific to an alternative. For example, the cost is always 0 € for the public school alternative, whereas it varies between 20 and 80 € per month for a government-dependent private school in Santander, and between 50 and 200 € per month in Bilbao. Public schools can only be secular. The presence of immigrants varies between 0 and 30% for public schools in Santander and between 0 and 60% in Bilbao, but only 0% and 10% are presented as possibilities for the government-dependent private school alternative.

Respondents were asked to choose between a public and government-dependent private school (and, in Bilbao, between these two and an independent private school) in a series of 8 different hypothetical choice situations, in which the attribute levels were randomly changed based on a D-efficient experimental design, generated in Ngene for a multinomial logit (ChoiceMetrics, 2012).
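To make the efficiency criterion concrete, the following sketch evaluates the D-error of a candidate design under a multinomial logit with given prior parameters; a D-efficient design is one for which this value is small. It only illustrates the criterion and does not reproduce the search algorithm implemented in Ngene. The function name, the toy design, and the zero priors are assumptions made for the example.

```python
import numpy as np


def mnl_d_error(design: np.ndarray, beta: np.ndarray) -> float:
    """D-error of a design under a multinomial logit.

    design: (S, J, K) attribute levels for S choice sets, J alternatives, K attributes.
    beta:   (K,) prior parameter values.
    """
    n_params = design.shape[2]
    info = np.zeros((n_params, n_params))
    for x in design:                      # x has shape (J, K)
        v = x @ beta
        p = np.exp(v - v.max())
        p /= p.sum()                      # logit choice probabilities
        x_centered = x - p @ x            # deviations from the probability-weighted mean
        info += (x_centered * p[:, None]).T @ x_centered
    avc = np.linalg.inv(info)             # asymptotic variance-covariance matrix
    return float(np.linalg.det(avc) ** (1.0 / n_params))


# Toy design: 2 choice sets, 2 alternatives, 2 attributes (hypothetical levels).
toy_design = np.array([[[1.0, 0.0], [0.0, 1.0]],
                       [[1.0, 1.0], [0.0, 0.0]]])
priors = np.zeros(2)

d_error_small = mnl_d_error(toy_design, priors)
# Repeating the choice sets doubles the information and halves the D-error.
d_error_large = mnl_d_error(np.concatenate([toy_design, toy_design]), priors)
print(d_error_small, d_error_large)
```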

The second part of both DCEs gathered socio-demographic information on the respondents (parents) and their spouses, including: gender, language spoken at home, educational level, employment status, age (only for the metropolitan area of Santander), and family income bracket (only for the metropolitan area of Bilbao).

Sampling and Data: Descriptive Statistics

The two DCEs were carried out separately. The DCE for the metropolitan area of Bilbao was carried out between October 2015 and January 2016 with 300 families who had children between 3 and 8 years old. Parents were unsupervised while responding; they completed the questionnaire at home and sent in their response using a pre-stamped envelope. In this case, both parents filled in some parts of the questionnaire separately, since the focus of this DCE was on a comparison between the mothers’ and fathers’ preferences. The original article (Vega-Bayo & Mariel, 2019) shows in Table 3 that the responding sample appeared to be representative of the target population despite the fact that the attrition rate was quite high, probably due to the unsupervised data collection via pre-stamped envelope.

Table 3 Descriptive statistics for socio-demographic characteristics

Table 3 shows the descriptive statistics for the socio-economic characteristics of the families in the two samples. Note that for the dummy-coded variables, the average represents the proportion of the sample with that characteristic. The descriptive statistics for the socio-demographic characteristics of respondents in the Bilbao subsample indicate that 50% of the respondents were female, as the respondents were both members of the same couple. This is why there is information about the educational levels, employment and ages of both parents (Vega-Bayo & Mariel, 2019).

The DCE for the metropolitan area of Santander was carried out between October and December 2018 with approximately 400 families who had children between the ages of three and eight. The parents were selected using simple random sampling in 12 different schools (public and government-dependent, 6 of each type) which represent the educational offerings of this area. In this case, only one member of a parental couple completed the questionnaire. Responses were gathered in face-to-face interviews conducted by a professional interviewer right outside the sampled schools, using a combination of paper and digital formats. The majority of the parents responded (309 out of 400), probably due to the importance of this issue for them. Furthermore, in Spain, when parents pick up their children from school, there is an established custom of staying in the school playground for a while; the children play while the parents talk to each other. This habit probably contributed to parents’ not minding having to answer the questionnaire in this format, leading to a relatively high response rate.

Table 3 shows that 64% of the respondents in the Santander subsample are female and 36% are male. This overrepresentation of women was caused by the location used for data collection. As women are usually more involved in the daily routines of their children, more women responded to the questionnaires than men.

The average age of the respondents was 38, and 85% of the families in the sample spoke only Spanish at home, whereas the remaining 15% spoke at least one other language. At least one of the parents has a university degree in half the couples in the sample. Lastly, 93% of the respondents or their spouses were working.

Unfortunately, there are no official statistics for the variables in Table 3 that are directly comparable to ours, but the Spanish National Statistics Institute does have the following data available, which show that our sample can be considered representative of our target population: 43.5% of the adult population in Cantabria has a post-secondary degree, and the employment rate for those 25–54 years old was 78% (Spanish National Statistics Institute – INE, 2018a, 2018b). In Cantabria, in 2012, the average age at which women had children was 32 (Spanish National Statistics Institute – INE, 2012).

Methodology

Following the stream of literature that analyses school choice using a discrete choice modelling approach (Asadullah, 2018; Sakaue, 2018), we use the latent class model (Greene & Hensher, 2003) to analyse the responses obtained from the two DCEs. This model is based on random utility theory (McFadden, 1974) and allows individuals to be sorted into classes that can be characterised by socio-demographic variables. It therefore also allows for the heterogeneity of individual preferences. In the random utility model framework, the utility \({U}_{njt}\) that respondent n obtains from alternative j in each choice situation t can be expressed as

$${U}_{njt}={V}_{njt}+{\varepsilon }_{njt}$$

for a total of \(n = 1, 2, \dots , N\) decision makers, \(j = 1, 2, \dots , J\) alternatives and \(t = 1, 2, \dots ,T\) choice occasions. In the literature, the representative utility \({V}_{njt}\) is usually set to be a linear combination of observable explanatory variables (attributes) and parameters \(\beta\). The error term \({\varepsilon }_{njt}\) is assumed to be Gumbel (0,1) distributed. In the latent class model, we assume that preferences differ among individuals, who can be sorted into C classes. The representative utility corresponding to respondent n in class \(c\) from alternative j in each choice situation t is thus

$$\begin{aligned} V_{njtc} = \,& ASC_{jc} + \beta_{1c} Bilingual_{njt} \\ & + \beta_{2c} SchoolingThroughSecondary_{njt} + \beta_{3c} Recommended_{njt} \\ & + \beta_{4c} ExtensiveExtracurricular_{njt} + \beta_{5c} Immigrants_{njt} \\ & + \beta_{6c} Religious_{njt} + \beta_{7c} Distance_{njt} + \beta_{8c} Cost_{njt} \\ \end{aligned}$$
(1)

for Santander, and

$$\begin{aligned} V_{njtc} = \,& ASC_{jc} + \beta_{1ac} AllSpanish_{njt} + \beta_{1bc} Bilingual_{njt} \\ & + \beta_{1cc} Trilingual_{njt} + \beta_{1dc} InternationalSchool_{njt} \\ & + \beta_{2c} SchoolingThroughSecondary_{njt} + \beta_{3c} Recommended_{njt} \\ & + \beta_{4c} ExtensiveExtracurricular_{njt} + \beta_{5c} Immigrants_{njt} \\ & + \beta_{6c} Religious_{njt} + \beta_{7c} Distance_{njt} + \beta_{8c} Cost_{njt} \\ \end{aligned}$$
(2)

for Bilbao, where \({ASC}_{jc}\) is the alternative-specific constant for alternative j in class c, which captures the mean effect of unobserved factors in the error terms for each of the alternatives (public, government-dependent and private schools). For the sake of model identification, one of these constants in each class must be set to zero. In our case, the alternative-specific constants corresponding to public schools have been set to zero in the two data sets.

The explanatory variables of Eqs. (1) and (2) are presented in Table 2. The variables \(AllSpanish\), \(Bilingual\), \(Trilingual\) and \(InternationalSchool\) are dummy coded variables representing one of the levels of the language of instruction that are presented in the first row of Table 2 (‘All Spanish’, ‘Bilingual’, ‘Trilingual’ and ‘International school’). The variables \(SchoolingThroughSecondary\), \(Recommended\), \(ExtensiveExtracurricular\) and \(Religious\) are also dummy coded variables presented in the second, third, fourth and sixth rows of Table 2 respectively. Finally, \(Immigrants\) represents the percentage of immigrants, \(Distance\) is the distance of the school from home and \({Cost}_{njt}\) is the hypothetical cost that parents would have to pay for their child to attend the school if they chose the specific option, measured in € per month (Table 2). Therefore, the parameter \({\beta }_{8c}\) captures the effect on utility of \({Cost}_{njt}\) and \({\beta }_{rc}\) (r = 1, 2, …, R) the effects of the remaining school characteristics, where r indexes the R non-cost school attributes considered. All parameters vary depending on the latent class, as denoted by \(c = 1, 2, \dots , C\). We can shorten Eqs. (1) and (2) as follows,

$${V}_{njtc}= {ASC}_{jc}+{{\Sigma }_{r=1}^{R} \beta }_{rc}{Attrib}_{njtr}+{\beta }_{8c}{Cost}_{njt}.$$
(3)
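The compact form in Eq. (3) can be sketched numerically as follows. This is a minimal illustration for a single choice situation, assuming two alternatives, two non-cost attributes and two classes; the function name, array layout, and all numerical values are hypothetical and are not estimates from the paper.

```python
import numpy as np


def representative_utility(asc, beta_attrib, beta_cost, attrib, cost):
    """Class-specific representative utilities for one choice situation.

    asc:         (J, C) alternative-specific constants per class
    beta_attrib: (R, C) non-cost coefficients per class
    beta_cost:   (C,)   cost coefficient per class
    attrib:      (J, R) non-cost attribute levels of each alternative
    cost:        (J,)   monthly cost of each alternative
    Returns V with shape (J, C).
    """
    return asc + attrib @ beta_attrib + np.outer(cost, beta_cost)


# Hypothetical example: public school (row 0, ASC fixed to zero, cost 0)
# versus a government-dependent private school (row 1).
asc = np.array([[0.0, 0.0],
                [0.3, -0.2]])
attrib = np.array([[0.0, 1.0],
                   [1.0, 2.0]])
beta_attrib = np.array([[0.8, 0.2],
                        [-0.4, -0.1]])
cost = np.array([0.0, 50.0])
beta_cost = np.array([-0.01, -0.03])

V = representative_utility(asc, beta_attrib, beta_cost, attrib, cost)
print(V)
```

Each column of the result holds the utilities of the alternatives as evaluated by one latent class, so the same school can be attractive in one class and unattractive in the other.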

Let us now denote the alternative chosen by decision maker n in choice situation t by \({i}_{nt}\), so that \({P}_{n{i}_{nt}t}\) represents the logit probability of the observed choice for individual n in choice situation t. The conditional probability of individual n choosing alternative \({i}_{nt}\) in their t-th choice situation for a specific class c is defined as

$$P_{n i_{nt} t} = \frac{\exp \left( ASC_{i_{nt} c} + \sum_{r = 1}^{R} \beta_{rc} Attrib_{n i_{nt} t r} + \beta_{8c} Cost_{n i_{nt} t} \right)}{\sum_{j = 1}^{J} \exp \left( ASC_{jc} + \sum_{r = 1}^{R} \beta_{rc} Attrib_{njtr} + \beta_{8c} Cost_{njt} \right)}.$$

If the probability that respondent n belongs to class c is denoted by \({\pi }_{nc}\), the unconditional probability of a sequence of choices \(({i}_{n1},{i}_{n2},\dots ,{i}_{nT})\) can be derived by taking the expectation over all C classes as follows.

$$\Pr \left( \left( i_{n1} ,i_{n2} , \ldots ,i_{nT} \right) \mid Cost, Attrib \right) = \sum_{c = 1}^{C} \pi_{nc} \prod_{t = 1}^{T} \frac{\exp \left( ASC_{i_{nt} c} + \sum_{r = 1}^{R} \beta_{rc} Attrib_{n i_{nt} t r} + \beta_{8c} Cost_{n i_{nt} t} \right)}{\sum_{j = 1}^{J} \exp \left( ASC_{jc} + \sum_{r = 1}^{R} \beta_{rc} Attrib_{njtr} + \beta_{8c} Cost_{njt} \right)}$$

Then, the log-likelihood function of the observed choices that is maximised in the estimation procedure is

$$L= {\sum }_{n=1}^{N}\mathrm{ln}(Pr\left(({i}_{n1},{i}_{n2},\dots ,{i}_{nT})|Cost, Attrib\right))$$
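The log-likelihood above can be evaluated numerically as in the following sketch, which takes the class-specific representative utilities and the class allocation probabilities as given. It is an illustration under assumed array shapes rather than the estimation code used for the paper; the function name and the randomly generated inputs are hypothetical.

```python
import numpy as np


def lcm_log_likelihood(v, chosen, class_probs):
    """Log-likelihood of a latent class logit with repeated choices.

    v:           (N, T, J, C) representative utilities
    chosen:      (N, T) index of the chosen alternative in each situation
    class_probs: (N, C) class allocation probabilities (rows sum to one)
    """
    n_resp, n_sit = chosen.shape
    v = v - v.max(axis=2, keepdims=True)              # numerical stability
    exp_v = np.exp(v)
    probs = exp_v / exp_v.sum(axis=2, keepdims=True)  # logit probabilities per class
    n_idx = np.arange(n_resp)[:, None]
    t_idx = np.arange(n_sit)[None, :]
    p_chosen = probs[n_idx, t_idx, chosen, :]         # (N, T, C)
    seq_prob = p_chosen.prod(axis=1)                  # product over choice situations
    mixture = (class_probs * seq_prob).sum(axis=1)    # expectation over classes
    return float(np.log(mixture).sum())


# Hypothetical inputs: 5 respondents, 8 choice situations, 2 alternatives, 2 classes.
rng = np.random.default_rng(0)
N, T, J, C = 5, 8, 2, 2
v = rng.normal(size=(N, T, J, C))
chosen = rng.integers(0, J, size=(N, T))
class_probs = np.full((N, C), 0.5)

log_l = lcm_log_likelihood(v, chosen, class_probs)
print(log_l)
```

Note that the product over choice situations is taken within each class before mixing, which is what allows the model to account for the panel structure of the responses.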

The class allocation probabilities \({\pi }_{nc}\) are usually modelled using a logit structure, where the utility of a class is a function of the socio-demographic variables \({SD}_{n}\) and parameters \({\lambda }_{c}\), in addition to a constant \({\mu }_{0c}\), for class c. The allocation probability \({\pi }_{nc}\) for class c and for individual n can therefore be written as follows:

$$\pi_{nc} = \frac{\exp \left( \mu_{0c} + SD^{\prime}_{n} \lambda_{c} \right)}{\sum_{l = 1}^{C} \exp \left( \mu_{0l} + SD^{\prime}_{n} \lambda_{l} \right)},$$
(4)

where \({\mu }_{0c}\) and \({\lambda }_{c}\) are the parameters to be estimated. If a parameter \(\lambda\) corresponding to a specific socio-demographic variable is positive, this means that an increase in this variable increases the individual-specific class c probability. On the other hand, if it is negative, an increase in the value of its corresponding socio-demographic variable decreases the individual-specific class c probability. Note that in the LCM, individuals are not assigned to a specific class; only the probability of belonging to each class is defined, by Eq. (4).
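A short numerical sketch of the class allocation function may help to fix ideas. It assumes two classes and a single dummy covariate, with the parameters of the first class normalised to zero for identification; this normalisation, the function name, and the parameter values are assumptions made for the example and are not taken from the estimated models.

```python
import numpy as np


def class_allocation_probabilities(mu0, lam, sd):
    """Logit class allocation probabilities.

    mu0: (C,)   class-specific constants
    lam: (K, C) coefficients of the K socio-demographic variables per class
    sd:  (N, K) socio-demographic variables of the N respondents
    Returns pi with shape (N, C); each row sums to one.
    """
    score = mu0 + sd @ lam
    score = score - score.max(axis=1, keepdims=True)  # numerical stability
    exp_score = np.exp(score)
    return exp_score / exp_score.sum(axis=1, keepdims=True)


# Hypothetical parameters; class 1 is the reference class (all zeros).
mu0 = np.array([0.0, -0.5])
lam = np.array([[0.0, 1.0]])       # one covariate, e.g. a university-degree dummy
sd = np.array([[0.0],              # respondent without the characteristic
               [1.0]])             # respondent with the characteristic

pi = class_allocation_probabilities(mu0, lam, sd)
print(pi)
```

Because the coefficient for class 2 is positive in this example, the second respondent has a higher probability of belonging to class 2 than the first, which is the interpretation of the sign of \(\lambda\) described above.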

Prior to model estimation, the number of classes must be determined. The (consistent) Akaike Information Criteria (CAIC and AIC), Bayesian Information Criterion (BIC), and log likelihood values (log L) for two and three classes are presented in Table 4, as is typically done in the literature (Swait, 2007).

Table 4 AIC, BIC and CAIC criteria on the number of classes for the LCM

As the bolded numbers in Table 4 indicate, the AIC and AIC3 criteria favour three classes, while the CAIC and BIC favour two classes. Some authors argue that AIC overestimates the number of classes (Celeux & Soromenho, 1996), while others show that BIC favours a small number of classes, especially in small samples (McLachlan & Peel, 2000). Furthermore, Scarpa and Thiene (2005) and Hynes et al. (2008) indicate that when determining the number of classes, the statistical criteria and the significance of the parameter estimates must be balanced against the researcher’s own assessment of the model’s suitability. Therefore, the LCMs in this paper are estimated using two classes.
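For reference, the criteria discussed here can be computed from the log-likelihood, the number of estimated parameters and the sample size using their usual definitions, as in the sketch below. The log-likelihood values and parameter counts are hypothetical and do not correspond to Table 4; treating the number of respondents as the sample size is also an assumption of the example.

```python
import math


def information_criteria(log_l: float, k: int, n: int) -> dict:
    """Standard information criteria (lower values are preferred).

    log_l: maximised log-likelihood
    k:     number of estimated parameters
    n:     sample size
    """
    return {
        "AIC": -2 * log_l + 2 * k,
        "AIC3": -2 * log_l + 3 * k,
        "BIC": -2 * log_l + k * math.log(n),
        "CAIC": -2 * log_l + k * (math.log(n) + 1),
    }


# Hypothetical two-class and three-class models fitted to 309 respondents.
two_classes = information_criteria(log_l=-2000.0, k=21, n=309)
three_classes = information_criteria(log_l=-1980.0, k=32, n=309)

for name in two_classes:
    preferred = 2 if two_classes[name] < three_classes[name] else 3
    print(name, round(two_classes[name], 1), round(three_classes[name], 1), preferred)
```

In this illustrative case the criteria disagree in the same way as described in the text: those with a lighter penalty per parameter prefer the larger model, while those whose penalty grows with the sample size prefer the smaller one.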

Results and Discussion

Table 5 presents the coefficients of the two-class LCM estimation for both DCEs, obtained by the maximum-likelihood estimation method of the software package Apollo (Hess & Palma, 2019) using the two samples described in section “Sampling and Data: Descriptive Statistics”. The robust standard errors are computed using the ‘sandwich’ estimator (Huber, 1967), which is defined as \(S={(-H)}^{-1}B{(-H)}^{-1}\), where \(H\) is the Hessian matrix, i.e. the matrix of second derivatives of the log-likelihood function with respect to the model parameters to be estimated, and \(B\) is the Berndt–Hall–Hall–Hausman matrix (Berndt et al., 1974), defined as the matrix whose \((\ell , h)\)th entry is \({B}_{\ell h} ={\sum }_{n=1}^{N}\partial {L}_{\ell n}\,\partial {L}_{hn},\) where \(\partial {L}_{\ell n}\) and \(\partial {L}_{hn}\) are the first derivatives with respect to model parameters \(\ell\) and \(h\), respectively, of the contribution to the log-likelihood function from observation \(n\).
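The sandwich formula can be written out in a few lines, as in the following sketch. It assumes that the Hessian and the per-observation first derivatives (scores) are already available from the optimiser; the function name and the small numerical example are hypothetical, and the code is not the routine used internally by Apollo.

```python
import numpy as np


def sandwich_covariance(hessian: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Robust covariance S = (-H)^{-1} B (-H)^{-1}.

    hessian: (K, K) second derivatives of the log-likelihood at the optimum
    scores:  (N, K) first derivatives of each observation's log-likelihood contribution
    """
    bread = np.linalg.inv(-hessian)
    bhhh = scores.T @ scores          # B: sum over observations of outer products
    return bread @ bhhh @ bread


# Hypothetical example with two parameters and two observations.
scores = np.array([[1.0, 0.0],
                   [0.0, 2.0]])
hessian = -np.array([[1.0, 0.0],
                     [0.0, 4.0]])

S = sandwich_covariance(hessian, scores)
robust_se = np.sqrt(np.diag(S))
print(robust_se)
```

In this toy case \(B\) equals \(-H\), so the sandwich collapses to the classical covariance \({(-H)}^{-1}\); the two differ when the model is misspecified.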

Table 5 Results of the LCM estimations for the two areas

The signs of the estimated coefficients are, in general, the expected ones. Firstly, the effect of the language of instruction depends on the location and class. All parents in the metropolitan area of Santander, in both classes, obtain a positive utility from a bilingual (Spanish and English) language model, as opposed to a school that simply teaches in Spanish.

As we have already mentioned, the language of instruction is a very important attribute in the metropolitan area of Bilbao, and its effect is closely related to the language spoken at home and the particularities of the Basque region. First of all, none of the parents wanted the All Spanish language model as the language of instruction (compared to the All Basque language model, which is the omitted or base category for families in Bilbao). This is probably due to the fact that a certain level of Basque is nowadays required for many jobs in the area, especially in the public sector, and this level is unlikely to be achieved if students attend a school teaching in Spanish only (Urrutia & Urrutia, 2021). Parents in the second class seem to favour bilingual (Spanish and Basque), trilingual (Spanish, Basque and English) or even international schools.

The next characteristic, schooling through secondary school, appears to matter only to parents in Santander, whose utility increases when a school offers coverage from 2 to 18 years of age. As expected, preferences for being recommended by family and/or friends are positive and significant in both areas and in all classes.

Having extensive extracurricular activities, however, has a varying effect on a parent’s probability of choosing a specific school. Preferences for extensive extracurricular activities in the first class in both metropolitan areas are positive, whereas there is no statistically significant effect in either of the second classes.

The presence of immigrants appears to have a negative effect on utility in all classes except for the first one in the metropolitan area of Santander. The effect of the school’s religious orientation, however, is more varied. Families in the second class in Bilbao obtain a disutility from having a school with a religious orientation (versus a secular one). Parents in the first class in Bilbao and both classes in the metropolitan area of Santander, however, do not seem to care either way: they do not have a strong preference for religious or secular schools.

Distance also appears to have the expected effect: it is negative and significant for three out of four classes in the two models. Only preferences regarding distance in the first class in the metropolitan area of Bilbao do not seem to be clearly defined. Lastly, and also as expected, an increase in cost reduces the utility of individuals in all the classes.

The results show that the mother having a university degree increases the individual-specific class 2 probability in Santander, whereas being an older parent has no statistically significant effect. In Bilbao, having a high income increases the individual-specific class 2 probability, while speaking something other than Spanish at home reduces it.

Using the LCM estimates in Table 5, we obtain Fig. 1, which plots the willingness to pay (WTP) for the school characteristics analysed in the two areas. The values of WTP for individual \(n\) and attribute \(r=2,3,\dots ,R\) are computed as a weighted average of the ratios of non-cost to cost coefficients in each class (3), weighted by the individual class probabilities given by the allocation function (4):

$${WTP}_{nr}={\pi }_{n1}\frac{{\beta }_{r1}}{{\beta }_{11}}+{\pi }_{n2}\frac{{\beta }_{r2}}{{\beta }_{12}}.$$
(5)

Fig. 1 WTP obtained from the LCM estimation

To make the panels in Fig. 1 easily comparable, only common attributes in the two DCEs are presented. The WTP for the remaining attributes are presented in Table 6. These represent a monetary value (in euros): how much parents are willing to pay for each of the characteristics, taking into account the individual probability of belonging to each of the classes. As seen in Fig. 1, as well as in Table 6, the distributions of the values of the WTP in the two analysed areas are very different.
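The probability-weighted WTP of Eq. (5) can be sketched in a few lines of Python. The function name `wtp_lcm` and all numerical values are hypothetical and chosen only for illustration; they are not the estimates in Table 5. The sketch follows Eq. (5) as written, so the sign of the result depends on the sign of the cost coefficient; any sign convention applied before reporting is not assumed here.

```python
import numpy as np


def wtp_lcm(class_probs, beta_attr, beta_cost):
    """Individual WTP for one attribute in a latent class model.

    class_probs: (N, Q) array of class probabilities pi_nq (rows sum to 1).
    beta_attr:   (Q,) array with the attribute coefficient in each class.
    beta_cost:   (Q,) array with the cost coefficient in each class.
    Returns an (N,) array: sum over q of pi_nq * beta_rq / beta_1q.
    """
    class_probs = np.asarray(class_probs, dtype=float)
    ratios = np.asarray(beta_attr, dtype=float) / np.asarray(beta_cost, dtype=float)
    return class_probs @ ratios


# Hypothetical two-class example with two individuals.
probs = np.array([[0.25, 0.75],   # individual 1: mostly class 2
                  [1.00, 0.00]])  # individual 2: class 1 with certainty
beta_r = np.array([2.0, 1.0])     # attribute coefficient by class
beta_c = np.array([-0.5, -0.5])   # cost coefficient by class

wtp = wtp_lcm(probs, beta_r, beta_c)
median_wtp = np.median(wtp)       # summary statistic of the kind reported per attribute
```

Because the class probabilities differ across individuals, the computation yields a distribution of WTP values rather than a single number, which is why medians and other descriptive statistics are reported.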

Table 6 WTP descriptive statistics for the metropolitan areas of Santander and Bilbao LCMs

In both cases, the school characteristic with the highest WTP in absolute value is the language model. The median WTP of parents in the metropolitan area of Santander is around 77 €/month for a Spanish–English bilingual school compared to an All Spanish school. Families in the metropolitan area of Bilbao do not care for the All Spanish language model school with respect to the All Basque option. They would actually have to be highly compensated, at around 256 €/month, to accept the All Spanish language model. For a Spanish–Basque bilingual school, the median amount families are willing to pay is 135 €/month.

As reflected in Table 6, most parents in the metropolitan area of Bilbao are not willing to pay for schooling through secondary school, but families in the metropolitan area of Santander do care about this school characteristic and are willing to pay around 38 €/month. This might be due to the fact that most schools in the Basque Country currently offer this option, and parents therefore take it for granted. Virtually all government-dependent private schools offer this in the Basque Country, which accounts for more than half of the student population. Basque public schools that do not offer schooling through secondary school are typically separated into primary and secondary schools that are next to each other or close by. Students who enter a public primary school typically continue on, by default, to the nearby secondary school.

The median WTP for a school being recommended by family and/or friends is just below 43 €/month in Santander, and around 54 €/month in Bilbao. The median amount parents are willing to pay for extensive extracurricular activities in the metropolitan area of Santander is 20 €/month, while in the metropolitan area of Bilbao it is almost 58 €/month.

The median value for a decrease of 10 pp in the percentage of immigrants is approximately 1 €/month in Santander and 5 €/month in Bilbao. As shown in Fig. 1, the variation in the WTP for this attribute is very small in both DCEs. The percentage of immigrants in 2020 was 9.5% in Cantabria and 10.9% in the Basque Country. In Cantabria, around 52% of these immigrants were from Latin America, and 9% from Africa. In the Basque Country, those figures are 50% and 21% respectively (Spanish National Statistics Institute – INE, 2021a). The effect found (a negative willingness to pay) is not entirely unexpected, and the lower level of compensation (almost zero) required in the metropolitan area of Santander is probably due to the fact that the percentage of immigrants is lower. According to social barometers such as the one conducted by Ikuspegi (2021), there is a positive trend in relation to the presence of false and negative stereotypes towards immigrants in the Basque Country. However, the Basque Country was the Spanish region with the highest level of immigration-related school segregation in 2018 (Ferrer & Gortazar, 2021; Murillo et al., 2017), i.e. Basque parents choose schools with a lower presence of immigrants. Although the general Basque population has a non-negative perception of immigrants, as captured by the barometer, it appears that when it comes to choosing a school for their children, Basque parents do not necessarily behave in a way that reflects this view, whether consciously or unconsciously.

On the other hand, parents prefer non-religious schools on average, both in Santander and Bilbao. The median value that parents expect to be compensated for a religious school in Bilbao is 114 €/month, whereas in Santander this is around 56 €/month. Lastly, the median value of the compensation expected by families in the metropolitan area of Santander for each additional 10 km of distance from home is 57 €/month, whereas in the metropolitan area of Bilbao, where schools are more spread out, the median amount parents expect to be compensated is only around 22 €/month for each additional 10 km.

In general, and for all characteristics except for distance and 2–18 schooling, there are larger absolute values of median WTP in Bilbao than Santander. This is because schools and the cost of living are more expensive in the Basque Country (Costa et al., 2015), although, as mentioned, the school characteristic for which parents are willing to pay the most is, in both cases, the language model. It is worth noting that there is a difference in degree relative to the second-most significant characteristic between the two areas. In Bilbao, median families are willing to pay 135 and 192 €/month for bilingual and trilingual models respectively, and they would have to be compensated 256 €/month to accept the All Spanish language model. These numbers are two to three times higher than the second-highest WTP (recommended). In Santander, however, the WTP for the bilingual model is around 50% larger than the second-highest WTP (including for the recommended characteristic). This appears to imply that families who live in a bilingual area, such as the Basque Country, devote a larger portion of their budget to school choice in order to ensure that their children learn the language(s) they will need in the future.

In this regard, the northern part of Spain is an interesting case study of how two contiguous Spanish provinces, with similar cultures and habits, but which differ in their official languages, also differ in parental school choice preferences and willingness to pay for different school characteristics. The existence of a second (minority) official language in the Basque Country affects both the educational system and the labour market. The results obtained from the two DCEs highlight the fact that parental school choice is highly dependent on the area: parents in each of these two areas have different preferences and behave differently when choosing a school for their children.

Parents in Bilbao’s metropolitan area are used to longer distances, and the existing bus system covers practically the entire network. As they do not care as much about longer distances, they do not have to be compensated as much as parents in the area of Santander.

The second difference that stands out is the consideration of religious versus secular schools. This is reflected in the fact that the disparity between the compensation required to accept a religious school in the areas of Santander and Bilbao is extremely large; it is, in fact, the largest of the disparities. This therefore seems to suggest that this disparity is not simply due to the fact that the cost of living in the metropolitan area of Bilbao is higher, but that something else is going on. On average, the metropolitan area of Santander is more religious than the Basque Country. Most of the government-dependent private schools in the metropolitan area of Santander are religious, whereas this is not the case in Bilbao. There is a strong tradition of secular Basque government-dependent private schools (known as “ikastolak”). These were created as a safeguard against the Franco-era oppression that the Basque language was forced to suffer.

Most important, however, is the vast difference in terms of valuing the language of instruction. In the Bilbao area, parents who speak Basque at home are mostly concerned with having Basque as the sole language of instruction. Spanish-speaking families, on the other hand, prefer bilingual or trilingual models, though they still place a very high value on their children learning the Basque language. This is probably related to government policies, which have sought to increase the number of Basque speakers in recent years, as well as to the current labour market situation. There is a high unemployment rate among youth, which was 23% for those younger than 25 years old in the second quarter of 2019 and reached an all-time high of 50% during the recession of 2014 (Spanish National Statistics Institute – INE, 2021b).

Public sector jobs account for approximately 17% of the jobs in the labour market of the Basque Country, and knowledge of Basque at an advanced level is mandatory for the vast majority of these positions nowadays, as it is an official language. According to the last available data reported by the Spanish National Statistics Institute – INE (2020), the annual average earnings per worker in private firms were 25,976.52 € in 2018, whereas workers in public firms earned 39,353.23 €, that is, 51% more. Even so, many private sector companies in the area also positively value a knowledge of Basque when hiring candidates (Elorriaga et al., 2019).
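For reference, the public-sector earnings premium follows directly from the two INE figures quoted above:

$$\frac{39{,}353.23-25{,}976.52}{25{,}976.52}=\frac{13{,}376.71}{25{,}976.52}\approx 0.515,$$

that is, public-sector workers earned a little over 51% more than private-sector workers in 2018.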

The labour market in Cantabria has an even higher percentage of public sector jobs (19.6% according to the Spanish National Statistics Institute – INE, 2021b) and they also pay more than jobs in private firms, in this case 48% more (Spanish National Statistics Institute – INE, 2020). However, since there is only one official language in Cantabria, there is no second-language barrier to accessing public sector jobs in this province, unlike in the Basque Country.

In this regard, although there is, to our knowledge, no evidence in the literature of a wage premium per se for knowledge of Basque, Cappellari and Di Paolo (2018) find positive returns of Catalonian bilingualism on earnings of approximately 1.4% per year of bilingual education (or 20% baseline returns), using data on the 1983 reform that established Catalan as a language of instruction in primary and secondary schools. This reform was similar to the previously mentioned one carried out in the Basque Country in the same year. However, Aspachs-Bracons et al. (2008) use these same reforms to measure the effect of bilingualism on civic outcomes and individual identity for both Catalonia and the Basque Country, and they find an effect only for Catalonia. They believe this is due to the fact that in the Basque Country, parents were free to choose the language of instruction in spite of the reform, whereas in Catalonia the reform made bilingualism compulsory.

Knowledge of Basque is not an advantage outside the region since its use outside the area is negligible, and it does not closely resemble any other language (Michelena & Rijk, 2013; Totoricaguena, 2008). However, around 85% of young college graduates in the Basque Country remained in the area 3 years after graduation in 2018, and this number has been stable between 85 and 93% since 2009 (Basque Employment Service – Lanbide, 2019). Decisions about the language of instruction made by Basque parents are therefore probably driven by the requisites of the labour market in that area. Moreover, and as Slavkov (2017) suggests for the case of Canada, the majority language (Spanish) is learned de facto in the Basque Country, even if the chosen language of instruction is Basque only. The same is not true for minority languages, however: Basque (or French in the case of Canada) is not learnt de facto if the instructional language chosen is not the minority language. Parents might therefore choose Basque as a language of instruction in order for their children to become bilingual individuals, with all the advantages this entails. This result would also align with previous evidence for immigrant parents making this choice (Levels & Dronkers, 2008; Rangvid, 2010; Smith et al., 2019).

In the metropolitan area of Santander, such an issue is not as salient because there is only one official language. Parents can still choose between All Spanish or bilingual (Spanish and English) instruction, however, and they prefer, on average, the bilingual option, although parents with a university degree (more likely to belong to the second class in our LCM) are willing to pay slightly more for it. This result is in line with Potter and Hayden (2004).

Of course, the main difference between our findings in the Basque Country—and Slavkov’s (2017) in Canada—and those for the metropolitan area of Santander and Potter and Hayden (2004), is that in the former case, there are two official languages, one of which is a minority language. In the latter, there is a single official language, but parents still want their children to learn a second language (English) in order to benefit from bilingualism. The mechanisms behind school choice decisions and parental preferences are therefore somewhat different.

The case of the Basque Country seems to be situated between the opposing cases of California (Monzó, 2005) and Finland (Lojander-Visapää, 2008), both of which have a dominant and a minority language. In California, Latino parents have seen their choice to educate their children in a bilingual system stripped from them. In Finland, the number of Finnish-Swedish families that educate their children in both languages has grown thanks to supportive legislation that nurtures both the dominant and minority languages. In the Basque Country, however, the public school system leans towards nurturing the minority language in an attempt to preserve it.

Conclusions

Although the school characteristics which families in both the Basque Country and Cantabria care about appear to overlap, the relative weight or willingness to pay for each characteristic differs between these areas. The language of instruction still drives the choice of most parents in the metropolitan area of Bilbao, due to the fact that this area possesses two official languages; whereas parents in the metropolitan area of Santander place similar importance on the other school characteristics, while remaining concerned that their children learn a second language.

The main conclusion, therefore, is that the existence of a second official language has a strong effect on parental preferences for school choice. This results in a different allocation of the family budget when it comes to school choice in a bilingual area. Families in the metropolitan area of Santander, without a second official language, show similar preference patterns to those already reported in the literature (Burgess et al., 2015; Goldring & Phillips, 2008; Hanushek et al., 2007; Schneider et al., 2006). Families with a higher level of education segregate into government-dependent private schools, and families with a lower level of education prefer public schools.

School choice preferences and a family’s willingness to pay appear very dependent on local conditions, and as such, it might be worthwhile to promote educational policies specifically tailored to local conditions as opposed to taking a more centralised approach. Accounting for the existence of more than one official language when designing an educational system appears necessary in order for educational policies to have the desired effect.

Like most other empirical studies, our analysis has some limitations, related mainly to the moderate size of the sample and the use of stated preference data. In general, the objective of most DCE surveys in any field is to provide monetary measures that are as close as possible to the “true” unknown values of the target population. As stated in the Introduction, collecting revealed preference data on school choice is not possible due to legal constraints, which is why a direct and simple test of the validity of a DCE survey, comparing the values obtained by a stated preference approach with those obtained by a revealed preference approach, is not available. Our approach should therefore be validated through more indirect indicators similar to those presented by Bishop and Boyle (2019), who define three different aspects of validity: content validity, construct validity and criterion validity.

Future studies could perform similar analyses for other metropolitan areas of the analysed contiguous provinces (Basque Country and Cantabria) to check the robustness of our main results. Moreover, there are other Spanish provinces with two official languages that could be analysed similarly. In addition, there are many different regions around the world with more than one official language. Carrying out additional studies on this issue in other countries and settings could shed light on whether the increase in the budget allocation toward school choice and changes in the relative importance of school characteristics due to the existence of more than one official language are also observed in other areas, as well as whether this is related to labour market conditions.

In relation to this, our modelling approach could be extended in future studies by the inclusion of some additional questions related to general or specific attitudinal indicators and perceptions related to education. These indicators could enrich the applied methodology in many different ways, including the use of hybrid choice models (McFadden, 1986) or the control function approach (Guevara & Ben-Akiva, 2012).