1 Introduction

In recent decades, hedonic methods have become an important tool in economics. Hedonic models characterize the pricing of differentiated goods, viewed as bundles of characteristics or attributes, and the demand and supply of those goods (attributes) under different assumptions about preferences and technology (Heckman et al. 2004). They allow for a systematic economic analysis of the demand for and supply of the quality of a good’s attributes. That is, they evaluate the impact of improving the attributes of a good, or of the amenities offered by an environmental improvement. For example, in the case of a home purchase, the idea is that the consumer buys environmental quality through the house. The consumer’s utility or satisfaction depends not only on the consumption of market goods (C), but also on the consumption of nonmarket goods (X). Although the consumer pays a single price for the house, he is in fact paying for all the individual attributes of the house. Cornerstone works on the hedonic method include Rosen’s seminal paper (1974), Graves et al. (1988), Sattinger (1993), Boyle et al. (1999), and Palmquist and Smith (2001). The method has been applied to explain the behavior of many different markets, such as housing, labor, paintings, and classical music.

One of the most important objectives of hedonic models is to calculate the willingness to pay (WTP) to avoid or reduce disamenities. For example, if researchers want to know how much a citizen is willing to pay to reduce air pollution, they may use a hedonic approach. The main goal is to obtain an indicator of how much citizens value a reduction in air pollution. However, the estimate obtained from the traditional approach is biased. The reason, first mentioned in Bayer et al. (2009), is that people cannot move freely from one city to another, as the traditional hedonic approach assumes; they face a mobility or migration constraint. That is, an individual living in a polluted city may prefer to move to a city with better air quality, but in reality this person faces mobility or migration costs and may stay in the polluted city for personal, family, or economic reasons. This finding is important: eliminating the bias allows policy makers to make better decisions based on the public’s willingness to pay. This is especially important in developing countries, such as Mexico, where the government faces a more stringent budget constraint and must choose where to allocate its scarce resources.

Mexico faces many major problems, such as corruption, poverty, illiteracy, and pollution. Regarding pollution, the capital of Mexico is one of the most polluted cities in the world. In fact, Forbes (2008) ranks Mexico City fifth among the world’s dirtiest cities. This is the result of industrial and automobile emissions, which degrade air quality by raising the levels of sulfur dioxide, nitrogen oxides, carbon monoxide, fine particulate matter, and organic compounds such as benzene. Nitrogen oxides and volatile organic compounds cause air pollution problems in stagnant air, as the reaction between them forms ozone and other oxidants. Ozone and particulate matter are the most serious pollutants in both developing and developed countries. In Mexico City, ozone levels fail to meet World Health Organization standards at least 300 days of the year. These levels of pollution cause health problems for residents, and the resulting negative externalities affect the economy too. In Mexico, the government provides health services for almost all people. As a consequence, the government must provide medicines and health care to residents who become sick due to pollution. There is also a productivity loss, since sick workers cannot be at their jobs. If emissions of pollutants are reduced, however, people are more likely to be healthy and the government can save money. Of course, the government must first spend money to reduce pollution, but in the end the benefits (from decreased health care costs and increased productivity) would be greater than the costs.

Air pollution is a major problem, and one of its main impacts is on health. There are not many studies in this area in Mexico, and those that exist focus on Mexico City. However, these papers give an idea of the severity of the problem. Loomis et al. (1999) conducted a time series study of infant mortality in the southwestern part of Mexico City in relation to high levels of fine particles. They found a positive relationship between concentrations of PM2.5 and infant deaths. Holguín et al. (2003) found that ambient levels of PM2.5 and ozone can reduce the high-frequency component of heart rate variability in elderly subjects living in Mexico City. Hence, particulate matter has major impacts on health, and reducing it should be a priority in any country. It is therefore important to measure the benefits people derive from a decrease in air pollution in Mexico. The main goal of this paper is to calculate the WTP to reduce air pollution, which indicates how Mexicans value their health.

Air pollution is also a problem in other large Mexican cities, such as Guadalajara, Monterrey, Tijuana, Ciudad Juarez, and Puebla. Therefore, it is important to evaluate the benefits of reducing air pollution for the whole country. The main objective of this paper is to calculate the WTP, in dollars, to reduce air pollution in Mexico. To obtain this figure, we use a hedonic wage and property approach that allows us to eliminate the bias that arises when mobility is not free. Households cannot move freely from one state to another, since they have reasons to stay where they are, such as their jobs and the neighborhood where they live.

In Mexico, only a few papers evaluate what people are willing to pay to avoid air pollution (Footnote 1). This is understandable, since for many years Mexico had neither an inventory of emissions nor good databases with which to calculate that figure by hedonic methods. Researchers addressed this problem with a different approach, contingent valuation (CV), but applied it only to the Mexico City Metropolitan Area. Hammitt and Ibarraran (2006) used a CV method to estimate the value of reducing health risks by improving air quality in that area. They collected data through an in-person survey and followed Viscusi’s (1993) approach to calculate the value per statistical life, which is the total amount that the inhabitants would be willing to pay to prevent one unidentified random fatality in the next year.

Another way researchers have inferred the WTP for the Mexico City Metropolitan Area is by using estimates from other countries (World Bank 2002). The World Bank estimated the levels of emissions for the period 2000–2010 and the benefits from reducing the concentrations of PM10 and ozone under different scenarios. The study analyzed a wide range of health benefits of reducing air pollution, such as reduced cost of illness, reduced losses in productivity, and the WTP to reduce acute and chronic exposure. A major problem that the study faced was the estimation of the WTP, since there was no information about this estimate for Mexico. Hence, the authors decided to use the WTP obtained in the US to predict that figure for Mexico, using the following equation:

$$ {\text{WTP}}_{\text{Mexico}} = {\text{WTP}}_{{{\text{Country}}\,{\text{A}}}} \left[ {{\text{Income}}_{\text{Mexico}} /{\text{Income}}_{{{\text{Country}}\,{\text{A}}}} } \right]^{\varepsilon } $$

where ε represents the income elasticity of WTP, that is, the percentage change in WTP corresponding to a 1 % change in income, and Country A is the US. Since income in the US is higher than in Mexico, and assuming that ε is one, the WTP in Mexico must be lower than the WTP in the US.
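The transfer rule above is easy to check numerically. The following minimal sketch applies it to purely hypothetical figures; the function name `transfer_wtp` and all numbers are illustrative assumptions, not values taken from the World Bank study.

```python
def transfer_wtp(wtp_source, income_target, income_source, elasticity=1.0):
    """Scale a source-country WTP by the income ratio raised to the elasticity."""
    return wtp_source * (income_target / income_source) ** elasticity


# Hypothetical inputs, chosen only to illustrate the mechanics.
wtp_us = 100.0        # WTP in Country A (the US)
income_us = 40000.0   # income in Country A
income_mx = 10000.0   # income in Mexico

# With a unit elasticity, WTP scales one-for-one with the income ratio.
wtp_mx_unit = transfer_wtp(wtp_us, income_mx, income_us, elasticity=1.0)
# A lower elasticity narrows the gap between the two countries.
wtp_mx_half = transfer_wtp(wtp_us, income_mx, income_us, elasticity=0.5)

print(wtp_mx_unit, wtp_mx_half)
```

With any positive elasticity the transferred WTP is below the source value whenever the target income is lower, which is the point made in the text.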

In the studies described above, the researchers focused only on Mexico City. This paper is more ambitious: the WTP obtained here is for the whole country. Moreover, the value is calculated in a way that addresses the bias caused by mobility costs. The goal of this paper is therefore to obtain the marginal WTP to reduce air pollution in Mexico. This is the first paper that obtains this figure for a developing country, and it could encourage research in this area in other countries too.

The hedonic method has been used widely on many important issues where it is hard to determine the value of a good. Goods that are not traded in an explicit market must be valued indirectly, and the hedonic method is one way to do so. There is no market for reductions in air pollution, so the hedonic method is needed to obtain the WTP. However, since migration is costly, the traditional hedonic approach must be modified to incorporate these costs, as Bayer et al. (2009) did in their paper.

This paper is based on the study by Bayer et al. (2009). They incorporate migration or mobility costs into the hedonic approach using a “residential sorting model”. The traditional hedonic method does not take this constraint into account, so its estimates are biased, as stated before. They address this problem with a two-stage model. In the first stage, they use a discrete choice model to obtain the probability that a person chooses a given location to live in, conditional on the migration costs, the income the individual could earn in each location, and the quality of life in each location (the location fixed effects). In the second stage, they regress these location fixed effects on air pollution concentrations to recover the WTP for air quality in metropolitan areas throughout the US. Their estimates are much larger than the comparable estimates from the conventional hedonic model. This implies that estimates from the conventional approach are biased and that mobility costs are important. This paper follows their approach to calculate the marginal WTP to reduce air pollution in Mexico while avoiding this bias, because there are migration costs in Mexico too. Tables 1 and 2 relate birth location to current residence and show that the majority of household heads in Mexico stay not only in their birth state but also in their birth region. Hence, they prefer not to migrate, since there is a tradeoff between their location (family or personal reasons) and the rents and wages in other places.

Table 1 State-wise percent by residence location in 2000 (Census Data)
Table 2 Region-wise percent by residence location in 2000 (Census Data)

This paper is organized as follows: Sect. 2 describes the methodology; Sect. 3 presents the data used in this paper; Sect. 4 defines the econometric specification; Sect. 5 discusses the main results and Sect. 6 gives the conclusion.

2 Methodology

This paper seeks to estimate the WTP to reduce air pollution in Mexico using a residential sorting model. This model, based on Bayer et al. (2009), is described below in detail. The main result of Bayer et al. is that, for the US, a WTP that incorporates mobility costs is almost four times greater than one derived from the traditional hedonic technique.

This residential sorting model is a structural model based on a discrete choice framework. In contrast, the traditional hedonic model is a reduced-form model. Bayer et al. showed that both models yield the same results when there are no migration costs (Footnote 2). However, when there are mobility costs, the results differ, and the bias is given by those costs.

Following Bayer et al. (2009), this paper assumes the following utility function:

$$ U_{i,j} = C_{i}^{{\beta_{C} }} H_{i}^{{\beta_{H} }} X_{j}^{{\beta_{X} }} e^{{M_{ij} + \xi_{j} + \eta_{ij} }} $$

where \( U_{i,j} \) is the utility that household head (individual) i obtains from living in state j; \( C_{i} \) is the quantity of the numeraire good consumed by i; \( H_{i} \) is the quantity of housing characteristics consumed by i; \( X_{j} \) is local air quality (measured by PM10 concentrations; Footnote 3); \( M_{ij} \) is the disutility of migrating from i’s birthplace to his current residence j; \( \xi_{j} \) captures the unobservable factors at location j; and \( \eta_{ij} \) is the individual idiosyncratic error.

Individuals are assumed to be rational and therefore to maximize their utility, so they solve the following problem:

$$ {\text{Max}} U_{ij} = C_{i}^{{\beta_{C} }} H_{i}^{{\beta_{H} }} X_{j}^{{\beta_{X} }} e^{{M_{ij} + \xi_{j} + \eta_{ij} }}\quad s.t.\,C + \rho_{j} H = I_{ij} $$

where the price of the numeraire is 1; \( \rho_{j} \) is the price of housing services in location j; and \( I_{ij} \) is individual i’s income in location j. Substituting the budget constraint into the utility function, individuals maximize \( U_{i,j} = (I_{ij} - \rho_{j} H_{i} )^{{\beta_{C} }} H_{i}^{{\beta_{H} }} X_{j}^{{\beta_{X} }} e^{{M_{ij} + \xi_{j} + \eta_{ij} }} \). Taking the first-order condition with respect to \( H_{i} \), we obtain \( H_{i} = \left( {\frac{{\beta_{H} }}{{\beta_{H} + \beta_{C} }}} \right)\left( {\frac{{I_{ij} }}{{\rho_{j} }}} \right) \). Substituting this expression for \( H_{i} \) back into the utility function gives, up to a multiplicative constant, the following indirect utility function:

$$ V_{i,j} = I_{ij}^{{\beta_{C} + \beta_{H} }} e^{{M_{ij} + \xi_{j} + \eta_{ij} + \beta_{X} \ln X_{j} - \beta_{H} \ln \rho_{j} }} $$
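For readers who want the intermediate steps, the derivation can be sketched as follows under the Cobb-Douglas form assumed above. Writing \( \beta_{I} = \beta_{C} + \beta_{H} \) and taking logs of the objective, the first-order condition for \( H_{i} \) and the implied optimal quantities are

$$ \begin{aligned} - \frac{{\beta_{C} \rho_{j} }}{{I_{ij} - \rho_{j} H_{i} }} + \frac{{\beta_{H} }}{{H_{i} }} & = 0 \\ H_{i}^{*} = \frac{{\beta_{H} }}{{\beta_{I} }}\frac{{I_{ij} }}{{\rho_{j} }},\quad C_{i}^{*} & = \frac{{\beta_{C} }}{{\beta_{I} }}I_{ij} \\ \end{aligned} $$

Substituting these quantities back into the log of the utility function gives

$$ \ln V_{i,j} = \beta_{I} \ln I_{ij} - \beta_{H} \ln \rho_{j} + \beta_{X} \ln X_{j} + M_{ij} + \xi_{j} + \eta_{ij} + \kappa ,\quad \kappa = \beta_{C} \ln \frac{{\beta_{C} }}{{\beta_{I} }} + \beta_{H} \ln \frac{{\beta_{H} }}{{\beta_{I} }} $$

The constant κ (a symbol introduced here only for this sketch) is the same for every location, so it does not affect location choices and can be dropped. The housing price enters with a negative sign, as in the definition of \( \theta_{j} \).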

To find the marginal WTP (MWTP) for the amenity \( X_{j} \), we take the partial derivatives of the indirect utility function with respect to X and I and use them to construct the marginal rate of substitution between X and I:

$$ {\text{MWTP}} = \frac{{\frac{{\partial V_{ij} }}{{\partial X_{j} }}}}{{\frac{{\partial V_{ij} }}{{\partial I_{ij} }}}} = \frac{{\left( {1/X_{j} } \right)\beta_{X} I_{ij}^{{\beta_{I} }} e^{{M_{ij} + \xi_{j} + \eta_{ij} + \beta_{X} \ln X_{j} - \beta_{H} \ln \rho_{j} }} }}{{\beta_{I} I_{ij}^{{\beta_{I} - 1}} e^{{M_{ij} + \xi_{j} + \eta_{ij} + \beta_{X} \ln X_{j} - \beta_{H} \ln \rho_{j} }} }} = \frac{{\beta_{X} I_{ij} }}{{\beta_{I} X_{j} }} $$

where \( \beta_{I} = \beta_{H} + \beta_{C}. \)
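As a quick numerical illustration of the MWTP formula, the sketch below evaluates it for hypothetical parameter values; the function name `marginal_wtp` and every number are assumptions made only for this example. Because X is measured by PM10 concentrations, a negative \( \beta_{X} \) produces a negative value, and its absolute value can be read as the amount a household would pay for a one-unit reduction in PM10.

```python
def marginal_wtp(beta_x, beta_i, income, x_level):
    """MWTP = (beta_X / beta_I) * (I / X), the closed form derived above."""
    return (beta_x / beta_i) * (income / x_level)


# Hypothetical values, not estimates from the paper.
beta_x = -0.05    # utility weight on (log) PM10
beta_i = 0.5      # beta_C + beta_H
income = 10000.0  # household income
pm10 = 50.0       # PM10 level

mwtp = marginal_wtp(beta_x, beta_i, income, pm10)
wtp_for_one_unit_reduction = -mwtp
print(mwtp, wtp_for_one_unit_reduction)
```

The formula shows that, for given parameters, MWTP rises with income and falls with the level of X.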

We observe the income that the household head earns in his current residence, but not the income he would earn in any other location. Hence, we have to estimate the income this person would obtain in every other place. To do so, we separate income into a predicted mean and an idiosyncratic error term: \( I_{ij} = \hat{I}_{ij} + v_{ij} \). For the housing variable, we do not need a similar decomposition, since the calculations are more precise and the error is approximately zero. In fact, we carried out this exercise, and the housing variable and its predicted mean value were essentially the same.

Substituting this decomposition into the indirect utility function and taking logs, we obtain the following:

$$ \ln V_{ij} = \beta_{I} \,\ln \,\hat{I}_{ij} + M_{ij} + \theta_{j} + v_{ij}^{1} $$
$$ {\text{where}}\,\theta_{j} = \beta_{X} lnX_{j} - \beta_{H} ln\rho_{j} + \xi_{j}$$
$${\text{and}}\,v_{ij}^{1} = \beta_{I} lnv_{ij} + \eta_{ij}$$

\( \theta_{j} \) collects the utility-relevant attributes of location j (the location fixed effect), or the “quality of life” in that state, and \( v_{ij}^{1} \) is an error term.

An individual may decide to live in a polluted city because housing there is cheaper than in cities with better air quality and because this individual does not care about pollution at all. Hence, the self-selection problem that Chay and Greenstone (2005) mention may arise. That is, household heads with a lower valuation of air quality could locate in areas with worse air quality, which would bias the estimates of the MWTP. We can avoid this problem by modeling the probability that household head i sorts into any location j, based on the utility in Eq. (1). The location that individual i chooses depends on the income this person could earn in each place, the migration cost faced by the household head, and the quality of life in that location. We want to obtain the probability that individual i settles in location j; hence, assuming that the idiosyncratic location preferences \( v_{ij}^{1} \) are independently and identically distributed (iid) Type 1 Extreme Value, we have a Logit specification with the following closed form:

$$ P\left[ {\ln \tilde{V}_{ij} \ge \ln \tilde{V}_{il} ,\forall l \ne j} \right] = \frac{{e^{{\sigma \left( {\ln \hat{I}_{ij} + \tilde{M}_{ij} + \tilde{\theta }_{j} } \right)}} }}{{\mathop \sum \nolimits_{q} e^{{\sigma \left( {\ln \hat{I}_{iq} + \tilde{M}_{iq} + \tilde{\theta }_{q} } \right)}} }} $$

where we divide Eq. (1) by \( \beta_{I} \), so that the tildes denote rescaled terms, for example \( \tilde{\theta } = \theta /\beta_{I} \), and \( \sigma = 1/\beta_{I } \) is a logit scaling parameter.
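The closed-form choice probability can be sketched in a few lines. The example below computes it for one individual facing three locations; the function name `choice_probabilities` and all input values are hypothetical and serve only to show the mechanics of the formula.

```python
import math


def choice_probabilities(log_income, migration_cost, theta, sigma):
    """Logit probabilities over locations for a single individual."""
    v = [sigma * (li + m + t)
         for li, m, t in zip(log_income, migration_cost, theta)]
    v_max = max(v)  # subtract the maximum to avoid overflow in exp()
    exp_v = [math.exp(x - v_max) for x in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


# Hypothetical inputs: location 0 is the birth state (no migration cost).
log_income = [9.0, 9.2, 9.1]        # predicted log income in each location
migration_cost = [0.0, -3.0, -3.5]  # rescaled migration disutility
theta = [0.0, 0.4, 0.1]             # rescaled location fixed effects
sigma = 2.0                         # logit scaling parameter

probs = choice_probabilities(log_income, migration_cost, theta, sigma)
print(probs)
```

In this illustration the birth state receives by far the highest probability even though the other locations offer higher income, because the migration cost outweighs the income gain.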

The big advantage of the Logit specification is its closed form, which makes it easy to estimate. Its drawbacks are that it imposes independence of irrelevant alternatives (IIA) and cannot represent random taste variation. Equation (3) defines the first-stage estimation of the residential sorting model, which is carried out by maximum likelihood (ML). In the same equation, \( \tilde{M}_{ij} \) is the migration cost function, which is defined later in Sect. 4.

In many settings, choice probabilities that exhibit IIA are an accurate representation of reality. Luce (1959) considered IIA to be a property of appropriately specified choice probabilities. In fact, he derived the Logit model directly from the assumption that choice probabilities exhibit IIA (Train 2009). The Logit model therefore exhibits IIA by construction; in this paper, due to the nature of the model, we do not have to worry about it.
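The IIA property can be illustrated with a small numerical check. In the sketch below, with hypothetical utilities for three locations labeled A, B, and C, the ratio of the Logit probabilities of A and B is the same whether or not C is in the choice set. The helper `logit_probs` is an illustrative name, not part of the estimation code.

```python
import math


def logit_probs(utilities):
    """Logit probabilities from a dict mapping alternative -> utility."""
    v_max = max(utilities.values())
    exp_v = {k: math.exp(v - v_max) for k, v in utilities.items()}
    total = sum(exp_v.values())
    return {k: e / total for k, e in exp_v.items()}


# Hypothetical systematic utilities.
full = logit_probs({"A": 1.0, "B": 0.5, "C": 2.0})
reduced = logit_probs({"A": 1.0, "B": 0.5})  # alternative C removed

# Under IIA the odds of A versus B do not depend on C.
ratio_full = full["A"] / full["B"]
ratio_reduced = reduced["A"] / reduced["B"]
print(ratio_full, ratio_reduced)
```

Both ratios equal the exponential of the utility difference between A and B, which is why adding or removing a third alternative leaves them unchanged.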

Using ML, we obtain the estimates of \( \tilde{\theta } \) and use them in the second stage. This second stage is defined by Eq. (2): we regress these “state area utilities” on local air pollution emissions and other local amenities. However, the second stage can face two econometric problems. First, the price of housing services may be correlated with the unobserved local characteristics. Moving the housing services term to the left-hand side avoids this problem, that is

$$ \widetilde{{\theta_{j} }} + \tilde{\beta }_{H} \ln \rho_{j} = \tilde{\beta }_{X} \ln X_{j} + \tilde{\xi }_{j} $$

The above equation eliminates this correlation. This is possible because the estimate of the share of income spent on housing (\( \tilde{\beta }_{H} \)) is very close to the value obtained in our data. Therefore, we can substitute the value given by our data and do not have to estimate it.

Second, amenity levels may be correlated with local unobservable attributes in the same region. Even though local emissions (correlated with local economic activity) are the key determinants of local air quality, pollution also comes from distant sources. Emissions from locations outside the one we are analyzing are likely to be uncorrelated with local economic activity. Therefore, to avoid this problem of endogeneity, we construct a new variable that is not related to the unobservable term. This new variable measures the exposure of a resident of a specific state to PM10 emissions, given only the emissions of PM10 outside that state. This variable will give us a “Lower Bound” to the MWTP in Mexico.

3 Data

The data used in this research come from several Mexican sources. For the first stage, the discrete choice model, we use the Mexican Census 2000 to estimate the Logit model. We draw a random sample of 80,000 observations from the Census. The Census contains important information about the demographic characteristics of household heads, such as gender, age, marital status, level of education, total income earned from employment, and migration status (comparing the current location of the household head with his birth location).

For the second stage, we use the National Survey of Household Income and Expenditure (ENIGH) for 2004, 2005, and 2006 (58,275 observations). This survey provides information on the income and expenditure of household heads in Mexico and on the characteristics of the house in which the household lives. The key variables for this stage relate to the characteristics of the house and of the location or state. The following variables are used to obtain the housing index: number of rooms, number of bedrooms, dwelling with kitchen (whether or not the house has a kitchen), dwelling with plumbing facilities, dwelling owned (whether the house is rented or owned), age of the dwelling in years, and dwelling with electricity. Information about local amenities for this stage was obtained from the Instituto Nacional de Estadística y Geografía (INEGI), the Consejo Nacional de Población (CONAPO), and the Programa de las Naciones Unidas para el Desarrollo (PNUD).

In this model, the household head is assumed to be the decision maker, and household heads over 35 years old are excluded to make sure that location decisions are driven by current local attributes (Footnote 4). We impose this restriction because households with heads aged 35 or younger are more mobile than the rest of the population. The decision makers live in any of the 32 states and in any of the 2,445 municipalities that comprise Mexico.

Another important source is the Instituto Nacional de Ecología y Cambio Climático (INECC), an agency similar to the Environmental Protection Agency (EPA) in the US. One of its main objectives is to improve air quality in Mexico. Mexico did not have a National Inventory of Emissions (NIE) until 2006, when INECC completed the first NIE. This inventory measures emissions by state and municipality for NOx (nitrogen oxides), SOx (sulfur oxides), COV (volatile organic compounds), CO (carbon monoxide), PM10 (particulate matter 10), PM2.5 (particulate matter 2.5), and NH3 (ammonia). This paper uses the emissions of PM10 to calculate the marginal WTP to avoid pollution in Mexico. Since emissions are reported as aggregates, we calculate PM10 emissions per unit of area, using the size of each state.

3.1 Air quality measures

In this paper, PM10 emission levels indicate the level of air pollution. Inhaling PM10 causes major problems in humans and animals, since the particles settle in the bronchi and lungs. They can cause asthma, lung cancer, cardiovascular harm, and a higher probability of dying at a young age. Therefore, reducing emissions of this particulate matter is beneficial for the whole population of Mexico, and it is important to know those benefits.

The emissions of PM10 are taken from the NIE 1999, and we use the total emissions produced by all sources for the 32 states and all the municipalities. However, since states differ enormously in size, we divide the emissions by each state’s area. This is one of the covariates that we use in the second stage of this paper.

A major issue in this model is that, in the second stage, state PM10 may be correlated with the unobservable term in location j. If that is the case, a problem of endogeneity appears and the estimated WTP will be biased. Therefore, we construct a new variable that avoids this problem and can be used as a covariate in this stage. This new variable is based on the exposure that an individual in a specific state faces to PM10 emissions originating outside that state. For example, if you live in Nuevo Leon, other things equal, the emissions produced in other states will affect you. Abstracting from the emissions inside the state, we use only the exposure an individual in that state faces from the neighboring states’ emissions. Moreover, we have municipality-level information on PM10 emissions for each state, so we can use it to construct this new variable. A state has many municipalities, and each of them is affected by municipalities that belong to other states. Suppose we want to calculate an individual’s exposure to PM10 in Nuevo Leon: first, we compute the exposure of each municipality in Nuevo Leon to the emissions of all the municipalities that are close to it but are not located in Nuevo Leon. We define the following equation:

$$ {\text{Exposure municipality }}j = \sum\limits_{i = 1}^{k} {\left( {\frac{1}{{D_{ij} }}} \right)} {\text{PM}}10_{i} $$

where \( D_{ij} \) is the distance between the center of municipality j and the center of municipality i, which is close to j but is not in the same state as municipality j. \( {\text{PM}}10_{i} \) is obtained from the NIE for each of the k neighboring municipalities (Footnote 5).
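The inverse-distance weighting in this equation can be sketched as follows. The function name `municipality_exposure` and the distances and emissions are hypothetical; units are left unspecified because only the mechanics matter here.

```python
def municipality_exposure(neighbors):
    """Sum of out-of-state PM10 emissions, each weighted by 1 / distance.

    neighbors: list of (distance, pm10_emissions) pairs, one pair per
    neighboring municipality located in another state.
    """
    return sum(pm10 / distance for distance, pm10 in neighbors)


# Two hypothetical out-of-state neighbors of a receptor municipality j.
neighbors_of_j = [
    (10.0, 100.0),  # close neighbor with moderate emissions
    (50.0, 200.0),  # distant neighbor with higher emissions
]

exposure_j = municipality_exposure(neighbors_of_j)
print(exposure_j)
```

In this example the closer municipality contributes more to exposure than the distant one, even though it emits half as much.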

Once we calculate the exposure in all the municipalities for Nuevo Leon, we have to calculate the exposure to PM10 in that state. We define the following equation:

$$ {\text{Exposure state }}1 = \sum\limits_{c = 1}^{m} {\left( {\frac{{{\text{Population municipality}}_{c} }}{{{\text{Population State}} \,1}}} \right)} {\text{Exposure municipality}}_{c} $$

The above equation tells us that the exposure in state 1, for example Nuevo Leon, depends on the exposure of each municipality in that state and on how large the municipality’s population is relative to that of the whole state. It is important to emphasize that we are assuming that the exposure in Nuevo Leon depends on the emissions of the neighboring states and not on its own emissions. This tells us how emissions outside the state affect the residents of that state. We also expect that the closer the municipalities are, the more exposed individuals will be to PM10. In the calculation of this variable we use municipality information, but the whole analysis is based on state information.
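The population-weighted aggregation to the state level can be sketched in the same spirit. The function name `state_exposure` and the populations and exposure values below are hypothetical.

```python
def state_exposure(municipalities):
    """Population-weighted average of municipality exposures in one state.

    municipalities: list of (population, exposure) pairs.
    """
    state_population = sum(pop for pop, _ in municipalities)
    return sum(pop / state_population * expo for pop, expo in municipalities)


# A hypothetical state with two municipalities.
municipalities = [
    (1000, 14.0),  # small municipality with high out-of-state exposure
    (3000, 2.0),   # large municipality with low out-of-state exposure
]

exposure_state = state_exposure(municipalities)
print(exposure_state)
```

Because the weights are population shares, the state figure is pulled toward the exposure of the more populous municipality.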

In the empirical analysis, we use four scenarios for this new variable. In the first, using the map, we locate the closest municipalities that are not in the same state as the municipality of interest. We then calculate the above equations and the new variable. However, since these calculations are based on the map without any specific distance limit, in some cases the neighboring municipality is likely to be, in fact, far from the receptor municipality. To address this problem, we also use the squared distance in Eq. (5), which becomes:

$$ {\text{Exposure municipality }}j = \sum\limits_{i = 1}^{k} {\left( {\frac{1}{{D_{ij}^{2} }}} \right)} {\text{PM}}10_{i} $$

The above equation is used in the second scenario. It is the only change we make, since Eq. (6), which aggregates the municipality exposures to the state level, does not change.

The third scenario is based on a specific distance from the center of each municipality to the center of the receptor municipality j. In this case, a neighboring municipality affects the individuals in municipality j only if the distance is at most 80 km (\( D \le 80 \)). We then calculate the exposure per state using Eqs. (5) and (6). Finally, the last scenario is based on Eqs. (6) and (7). The outcomes of the last two scenarios should be essentially the same, since both use a specific distance between the municipalities. We can therefore conduct a sensitivity analysis using the different scenarios and find the range in which the WTP to reduce air pollution in Mexico lies.
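The four scenarios differ only in the distance weight and in whether a cutoff is applied, so they can be sketched with one function. The code below is a simplification under stated assumptions: it uses a single hypothetical list of candidate neighbors for all scenarios, whereas the text selects neighbors from the map in the first two scenarios and by distance in the last two. The names `exposure`, `power`, and `max_distance_km` are illustrative.

```python
def exposure(neighbors, power=1, max_distance_km=None):
    """Distance-weighted PM10 exposure of one receptor municipality.

    neighbors: list of (distance_km, pm10_emissions) pairs.
    power: 1 for inverse distance, 2 for inverse squared distance.
    max_distance_km: if given, neighbors farther away are ignored.
    """
    total = 0.0
    for distance_km, pm10 in neighbors:
        if max_distance_km is not None and distance_km > max_distance_km:
            continue
        total += pm10 / distance_km ** power
    return total


# Hypothetical out-of-state neighbors; the last one is far away.
neighbors = [(10.0, 100.0), (50.0, 200.0), (200.0, 400.0)]

scenarios = {
    "1: map neighbors, 1/D": exposure(neighbors),
    "2: map neighbors, 1/D^2": exposure(neighbors, power=2),
    "3: within 80 km, 1/D": exposure(neighbors, max_distance_km=80.0),
    "4: within 80 km, 1/D^2": exposure(neighbors, power=2, max_distance_km=80.0),
}
for name, value in scenarios.items():
    print(name, value)
```

Comparing the resulting values across scenarios is one simple way to see how sensitive the exposure variable is to the treatment of distant municipalities.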

4 Econometric specification

There are several intermediate steps to reach the main result of this research. First, we estimate housing prices and incomes in each state. Second, we use these estimates and the migration cost function in the Logit specification to estimate the state fixed effects, or “quality of life”, in each state. The last step is to regress those fixed effects on local attributes; the WTP is obtained from this regression.

Hence, we have to calculate the housing prices first using data given by the ENIGH. These prices can be obtained from data on observed rent or house values and housing characteristics. The following functional form is used:

$$ \ln\, P_{i,j} = \ln \rho_{j} + \lambda_{j} \varOmega_{i} + h_{i}^{'} \phi + \varepsilon_{i,j}^{H} $$

where \( P_{i,j} \) is a measure of the house rent paid by individual i in location j; \( \varOmega_{i} \) is a dummy variable for house ownership (\( \varOmega_{i} \) = 1 if the house is owned and 0 if it is rented); \( \rho_{j} \) represents the price of housing services in each location; \( h_{i}^{{\prime }} \) represents the attributes of the house; and \( \varepsilon_{i,j}^{H} \) is the error term. The estimates of the ρ’s are used in Eq. (4), and they measure the “price of housing services” in a specific state (Footnote 6).
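A minimal sketch of how the state-level housing prices could be recovered from such a regression is given below. It uses synthetic, noise-free data for two hypothetical states and a single house attribute (rooms), and it estimates the coefficients by ordinary least squares with a small pure-Python solver. The helper `ols`, the data, and the "true" parameter values are all assumptions made for illustration; they are not the authors' code or estimates.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))]
         for a in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic parameters (assumed): log housing price and ownership
# coefficient for each of two states, plus one attribute coefficient.
true_ln_rho = [5.0, 5.5]
true_lambda = [0.3, 0.4]
true_phi = 0.1

# Each house: (state index, owned dummy, number of rooms).
houses = [(0, 0, 2), (0, 1, 3), (0, 0, 4), (0, 1, 2),
          (1, 0, 3), (1, 1, 4), (1, 0, 2), (1, 1, 5)]

X, y = [], []
for state, owned, rooms in houses:
    state_dummies = [1.0 if state == s else 0.0 for s in range(2)]
    owned_by_state = [owned * d for d in state_dummies]
    X.append(state_dummies + owned_by_state + [float(rooms)])
    y.append(true_ln_rho[state] + true_lambda[state] * owned + true_phi * rooms)

coefs = ols(X, y)
ln_rho_hat = coefs[:2]  # state dummies play the role of ln(rho_j)
print(ln_rho_hat)
```

The state dummies replace a common intercept, so their coefficients can be read directly as the estimated log price of housing services in each state.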

As was stated before, for the first stage of the residential sorting model we need to predict the income that a household head could earn in any state. The equation to estimate this structure is the following:

$$ \begin{aligned} & \ln {\text{INCTOT}}_{i,j} = \alpha_{0,j} + \alpha_{{{\text{SINGLE}},j}} {\text{SINGLE}}_{i} + \alpha_{{{\text{MALE}},j}} {\text{MALE}}_{i} + \alpha_{{{\text{AGE,}}j}} {\text{AGE}}_{i} + \alpha_{{{\text{AGE}}2,j}} {\text{AGE}}2_{i} + \\ & \alpha_{{{\text{JH}},j}} {\text{JH}}_{i} + \alpha_{HS,j} {\text{HS}}_{i} + \alpha_{{{\text{COLLEGE}},j}} {\text{COLLEGE}}_{i} + \alpha_{{{\text{UNIVERSITY}},j}} {\text{UNIVERSITY}}_{i} + \\ & \alpha_{{{\text{HIGHED}},j}} {\text{HIGHED}}_{i} + \alpha_{P1,j} P\left( {R_{B} ,R_{D} |{\text{ED}}} \right) + \alpha_{P2,j} \{ P\left( {R_{B} ,R_{D} |{\text{ED}}} \right)\}^{2} + \varepsilon_{i,j}^{1} \\ \end{aligned} $$

where \( {\text{INCTOT}}_{i,j} \) is the income from employment that household head i obtains in location j; the other variables are demographic characteristics of that household head such as age, education, and marital status. The last two terms before the error term are defined as follows:

$$ \begin{aligned} & P\left( {R_{B} , R_{D} |{\text{ED}}} \right) = {\text{JH}}_{i} P\left( {R_{B} ,R_{D} |{\text{JH}}} \right) + {\text{HS}}_{i} P\left( {R_{B} ,R_{D} |{\text{HS}}} \right) + {\text{COLLEGE}}_{i} P\left( {R_{B} ,R_{D} |{\text{COLLEGE}}} \right) + \\ & {\text{UNIVERSITY}}_{i} P\left( {R_{B} ,R_{D} |{\text{UNIVERSITY}}} \right) + {\text{HIGHED}}_{i} P\left( {R_{B} ,R_{D} | {\text{HIGHED}}} \right) \\ \end{aligned} $$

This measures the observed percentage of individuals with a given education level, born in region \( R_{B} \), who are found to be living in region \( R_{D} \). The idea behind these terms is to control for individuals who migrate from one region to another because of their level of education. Equation (9) is estimated using the Census, and the estimates are then used to predict the income each individual would earn in any state. These predictions are introduced in Eq. (3).
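One way to read the term \( P(R_{B}, R_{D}|{\text{ED}}) \) is as a simple cell share: among individuals with a given education level who were born in region \( R_{B} \), the fraction observed living in region \( R_{D} \). The sketch below computes these shares from a handful of hypothetical records; the function name `migration_shares` and the region labels are assumptions, and the denominator reflects our reading of the definition above.

```python
from collections import Counter


def migration_shares(records):
    """Share living in each destination region, by education and birth region.

    records: list of (education, birth_region, current_region) tuples.
    Returns a dict keyed by (education, birth_region, current_region).
    """
    origin_counts = Counter((ed, rb) for ed, rb, _ in records)
    cell_counts = Counter(records)
    return {(ed, rb, rd): n / origin_counts[(ed, rb)]
            for (ed, rb, rd), n in cell_counts.items()}


# Hypothetical individuals.
records = [
    ("HS", "North", "North"),
    ("HS", "North", "North"),
    ("HS", "North", "Center"),
    ("UNIVERSITY", "North", "Center"),
]

shares = migration_shares(records)
print(shares[("HS", "North", "North")])
```

For each individual, the share that corresponds to his own education level, birth region, and candidate destination region would then enter the income equation, together with its square.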

Finally, the migration variable is calculated from data describing the household’s state of birth and the household’s location in 2000. It is a dummy variable equal to 1 if the household head has migrated from his birthplace to his current location and 0 otherwise. We use a migration cost matrix with some flexibility, where \( {\text{dummystate}}_{i,j} = 1 \) if location j is outside i’s birth state (=0 otherwise); \( {\text{dummyregionec}}_{i,j} = 1 \) if location j is outside i’s birth region (=0 otherwise) (Footnote 7); and \( {\text{dummymacroreg}}_{i,j} = 1 \) if location j is outside i’s macro-region (=0 otherwise) (Footnote 8).

The above structure is represented by the following migration cost:

$$ \tilde{M}_{ij} = \tilde{\mu }_{S} d_{ij}^{S} + \tilde{\mu }_{R} d_{ij}^{R} + \tilde{\mu }_{\text{MR}} d_{ij}^{\text{MR}} $$
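The nested structure of the three dummies and the migration cost above can be sketched as follows. The state, region, and macro-region labels are hypothetical placeholders (the actual groupings are defined in the footnotes), and the coefficient values passed to the function are arbitrary; the sketch only illustrates how the indicator variables enter the cost.

```python
# Hypothetical lookup tables: four made-up states grouped into regions
# and macro-regions. These are NOT the groupings used in the paper.
REGION_OF_STATE = {"A": "r1", "B": "r1", "C": "r2", "D": "r3"}
MACRO_OF_REGION = {"r1": "m1", "r2": "m1", "r3": "m2"}

def migration_dummies(birth_state, dest_state):
    """Return (d_S, d_R, d_MR) for a move from birth_state to dest_state."""
    birth_region = REGION_OF_STATE[birth_state]
    dest_region = REGION_OF_STATE[dest_state]
    d_state = int(dest_state != birth_state)
    d_region = int(dest_region != birth_region)
    d_macro = int(MACRO_OF_REGION[dest_region] != MACRO_OF_REGION[birth_region])
    return d_state, d_region, d_macro

def migration_cost(birth_state, dest_state, mu_s, mu_r, mu_mr):
    """Migration cost: mu_S * d_S + mu_R * d_R + mu_MR * d_MR."""
    d_state, d_region, d_macro = migration_dummies(birth_state, dest_state)
    return mu_s * d_state + mu_r * d_region + mu_mr * d_macro
```

Because a move out of the macro-region is also a move out of the region and the state, the three dummies switch on cumulatively, so the total cost of a long-distance move is the sum of all three coefficients.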

Equation (11) is also substituted into Eq. (3), which allows us to estimate the first-stage parameters: {\( \tilde{\mu }_{S} ,\,\tilde{\mu }_{R} ,\,\tilde{\mu }_{\text{MR}} ,\sigma ,\tilde{\theta } \)}. In the second stage, the thetas estimated in the first stage are regressed on local air pollution emissions and other local amenities. The estimating equation in this stage is therefore

$$ \widetilde{{\theta_{j} }} + 0.20\, {\ln }\rho_{j} = \tilde{\beta }_{\text{PM}} {\text{ln\, PM}}_{j} + \tilde{\beta }_{Z} Z_{j} + \tilde{\xi }_{j} $$

The value of 0.20 corresponds to the share of income spent on housing in the ENIGH sample, and the results are robust to other choices of this parameter. Since a higher value of PM10 translates into worse air quality, we expect \( \tilde{\beta }_{\text{PM}} < 0 \) if household heads are willing to pay for better air quality. To avoid the endogeneity problem, we use as a covariate the exposure to PM10 per state described in Sect. 3.1 instead of PM10 emissions per area. The explanatory variables in \( Z_{j} \) are crime per capita, employment rate, government expenditure per capita, population, life expectancy, rankings of art, and number of firms in location j.
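The construction of the second-stage dependent variable and the regression on log PM10 can be illustrated with a deliberately simplified sketch. It assumes a single regressor plus an intercept and omits the amenity vector \( Z_{j} \); the function names and the numbers are made up for illustration and are not the estimates reported in the tables.

```python
import math

HOUSING_SHARE = 0.20  # share of income spent on housing, as stated in the text

def adjusted_theta(theta, rho):
    """Dependent variable of the second stage: theta_j + 0.20 * ln(rho_j)."""
    return [t + HOUSING_SHARE * math.log(r) for t, r in zip(theta, rho)]

def ols_slope_intercept(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

# Made-up state-level inputs: fixed effects, housing price index, PM10 measure.
theta = [0.40, 0.10, -0.20, -0.30]
rho = [1.10, 1.00, 0.95, 0.90]
pm10 = [5.0, 12.0, 30.0, 45.0]

y = adjusted_theta(theta, rho)
beta_pm, intercept = ols_slope_intercept([math.log(p) for p in pm10], y)
```

In this log-log form the slope on ln PM plays the role of \( \tilde{\beta }_{\text{PM}} \), which is why it is later read as an elasticity; a full replication would add the columns of \( Z_{j} \) and use a multiple-regression routine.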

5 Results

5.1 Housing price and income regressions

Tables 3 and 4 describe the key variables used in the analysis and their means and standard deviations. Table 3 tells us that the average age in this sample is 40, almost 89% of the household heads are male, 5% are single, and 12% graduated from university. As shown in the third table, 3.5% of the houses do not have a kitchen, 4.8% of the houses do not have plumbing facilities, and 1.5% of the houses do not have access to electricity.

Table 3 Data summary
Table 4 Data summary

Table 5 reports the results of the housing index regressions. The results are intuitive and as expected: the price of a house is higher if it has more rooms, more bedrooms, and more housing services. That is, a house with no kitchen, no plumbing facilities, or no electricity, for example, has a lower price than a house with those services. Almost all the estimates are statistically significant at the common levels of significance.

Table 5 Housing services index parameters

Table 6 shows that men earn more than women, that more education is associated with higher earnings for household heads, and that there is no statistical evidence that single individuals earn less than the excluded groups (married, separated, and divorced). As expected, income increases with age, but at a decreasing rate. All the estimates are statistically significant at the usual levels.

Table 6 Income regression

5.2 Results from the residential sorting model

Table 7 is based on McFadden’s choice model, in which individual i chooses where to live among all states subject to income and migration costs. Table 8 summarizes the results presented in Table 7. As shown in Table 8, the estimates are statistically significant and have the expected signs. There is a major utility cost (−4.63) associated with leaving one’s birth state. The costs continue to rise with leaving one’s birth region (−6.56879) and macro-region (−7.65025), but at a decreasing rate. Finally, the estimate of the scaling parameter σ is 1.36; expressed as the income parameter, the estimate is 0.7301. Therefore, the results show that there is a migration cost, or disutility, of leaving the birth state and settling in another state, as Tables 1 and 2 suggested. Also, people have a higher utility in major states; that is, they prefer to stay there since the quality of life is better than in other states.
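For readers unfamiliar with the conditional logit, the choice probabilities and the log-likelihood that such a model maximizes can be sketched as below. Equation (3) is not reproduced here, so the deterministic utilities are treated as given inputs; the function names are illustrative, and this is a generic sketch of the logit form rather than the estimation code behind Tables 7 and 8.

```python
import math

def choice_probabilities(utilities):
    """Conditional logit probabilities over locations for one household.

    utilities: deterministic utility of each candidate location.
    """
    v_max = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - v_max) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(utilities_by_household, chosen_locations):
    """Sum over households of the log probability of the observed choice."""
    return sum(
        math.log(choice_probabilities(utilities)[chosen])
        for utilities, chosen in zip(utilities_by_household, chosen_locations)
    )
```

A large negative migration cost lowers the utility of every location outside the birth state, which shifts probability mass toward staying; this is the mechanism through which the estimated costs explain the low observed mobility.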

Table 7 Conditional logit
Table 8 First-stage maximum likelihood parameter estimates

The estimates of the state fixed effects are used as the dependent variables in the second-stage estimation given by Eq. (12). Tables 9, 10, and 11 report the results for all the scenarios discussed in Sect. 3. In all these cases, the share of income spent on housing is 0.20, obtained from the ENIGH. However, all the results are robust to different values of this share.

Table 9 Results from second-stage regressions
Table 10 Results from second-stage regressions
Table 11 Results from second-stage regressions

Table 9 shows the results using emissions of PM10 per area; with the state data, there is a negative relationship between the “quality of life” and this variable. Also, a state has a better quality of life if it is more populated and if fewer firms are established there. States with higher government expenditure per capita have a lower quality of life. This could be plausible if government expenditure per capita does not translate into a benefit for the state, for example because money is lost to corruption rather than spent on improving the state. However, this last result is counterintuitive and not robust across the other scenarios, and further work is needed to explain it.

Table 10 presents the results for the cases of exposure without a specific distance. As shown in column 2, the coefficient on exposure to PM10 per state is not statistically significant. In this case, a more populated state has a better quality of life, as do states with a higher life expectancy and states with fewer firms. As pointed out before, the insignificant exposure coefficient is plausible because of possible measurement error. Since we use only the map to choose the closest neighboring municipalities, in some cases the neighboring municipality may be too far from the receptor municipality. If a municipality is too far away, it does not affect the receptor municipality at all, and its contribution to exposure should be zero. To address this problem, we use the square of the distance; the results are in columns 3 and 4. In this case, the estimate for exposure to PM10 per state becomes statistically significant. The estimated coefficient represents the elasticity of WTP with respect to air pollution exposure and is equal to −0.216. The value from the first case, using emissions of PM10 per area (−0.590), is more than twice as large in magnitude. However, the only conclusion we draw here is that the first case is biased, and the correct value is expected to be close to −0.21. Hence, this value is the minimum (in magnitude) that the elasticity of WTP can take. The other conclusions about population, established firms, and life expectancy remain the same.

The last scenarios are presented in Table 11. The estimated elasticity of WTP again has the expected sign and ranges from −0.215 to −0.199 across the two cases. Since these cases use a specific distance (D < 80 km) between the municipality in state m and the municipalities in other states, the two results should be essentially the same; that is, there is no measurement error. The other results are the same as before. States with more people are significantly more appealing. States with higher life expectancy attract more residents, and states with a bigger local economy (measured by the number of established firms) are less attractive places to live. All these results are robust. Regarding the latter result, a higher level of economic activity imposes negative externalities on some states through air pollution, so the quality of life of people living there could be negatively affected.

The last step in this research is to calculate the marginal WTP. To do so, we multiply the elasticity of WTP by income and divide by air pollution emissions. Table 12 reports the resulting estimates of marginal WTP for air quality. These figures represent the median household’s willingness to pay for a 1 Mg/year reduction in ambient PM10 emissions. We use the median values of household income ($25,716.00 pesos) and PM10 emissions in our sample as the measures of income (I) and air pollution (X)Footnote 9. However, we want to compare the values of WTP obtained in the different scenarios. Therefore, we normalize these results and multiply them by one standard deviation of the corresponding pollution measure. The standard deviations are computed from exposure per municipality, since we use municipality information to calculate the exposure per stateFootnote 10. The final figures are presented in Table 12.
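The arithmetic of this step can be sketched as follows. The median income is the figure quoted above; the pollution level used in the example is a hypothetical placeholder, since the sample median of PM10 is not stated in this section, and the result is reported as a magnitude (the elasticity is negative, so the willingness to pay for a reduction is positive). The normalization by one standard deviation is not reproduced.

```python
MEDIAN_INCOME_PESOS = 25716.00  # median household income, from the text

def marginal_wtp(elasticity, income, pollution):
    """Magnitude of the marginal WTP: |elasticity| * I / X."""
    return abs(elasticity) * income / pollution

# Hypothetical pollution level X, for illustration only.
example_pollution = 10.0
example_mwtp = marginal_wtp(-0.216, MEDIAN_INCOME_PESOS, example_pollution)
```

Because income and pollution enter as a ratio, the marginal WTP scales one-for-one with the elasticity, so differences in the elasticity across scenarios carry over proportionally before the standard-deviation scaling is applied.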

Table 12 Estimated marginal WTP for air quality

The estimated MWTP for air quality ranges from $443.66 to $2,682.92 Mexican pesos. The first case has an MWTP of $1,818.27 and is biased because of the endogeneity problem. In the other cases, this bias is eliminated. As shown in the table, the results for the cases with a specific distance are very close, as they should be. Finally, as stated before, it would be possible to use the exposure variable as an instrument and calculate the marginal WTP while avoiding this bias. However, this instrument is not strictly a good one, and it is better to use only the results with the exposure variables. Therefore, we expect the MWTP to be greater than $443.66 and lower than $2,682.92.

6 Conclusions

This paper uses a residential sorting model to avoid the bias that arises in the traditional hedonic approach when calculating the MWTP to reduce air pollution. The main goal of the paper is to calculate the MWTP for a reduction in air pollution, measured by emissions of PM10. Since we face an endogeneity problem, we construct a new variable to measure the tradeoff between a better quality of life and a lower exposure to PM10 emissions. Our estimates imply that the household head in Mexico would pay $443.66 to $2,682.92 (in constant 2000 Mexican pesos), or 46.90–283.61 (in 2000 dollars), for a one-unit reduction in PM10 emissions. This value is important: if we aggregate individual WTP and compare this figure to the cost of pollution mitigation, we can state that the benefits of reducing air pollution are higher than its costs. Therefore, a public policy that helps to improve air quality in Mexico would be important and beneficial for all MexicansFootnote 11.

Since we obtain a lower bound for the WTP when using the exposure variables, we can state that the minimum value of the WTP to reduce air pollution in Mexico is $443.66, and the figure combining emissions inside and outside the state is expected to be higher than this value. These results therefore indicate that Mexicans do care about air pollution and that there are benefits to reducing this disamenity. Hence, policy makers in Mexico should address this major problem and devote resources to reducing PM10 emissions in order to improve the quality of life of Mexicans.

This lower-bound value of the WTP to reduce air pollution in Mexico is lower than the WTP obtained by Bayer et al. (2009), who report a value of 149–185 dollars for this reduction. Given the lower income per capita in Mexico, a lower value is to be expected, and that is what we find. However, this is only preliminary evidence, and more research on this topic is needed before this assertion can be confirmed.