1 Introduction

There is now compelling evidence that preferences for some environmental goods follow spatial patterns. One reason for this is that there are differences in the spatial configuration of these goods—so that preferences adapt to individuals’ home environments (Nielsen et al. 2007) and the availability to them of substitutes (Munro and Hanley 1999). Another line of reasoning concerns residential sorting: people’s preferences for environmental goods partly determine where they choose to live, so that measures of preferences tend to be correlated with measures of environmental quality or with distance to environmental amenities (Timmins and Murdock 2007; Timmins and Schlenker 2009; Baerenklau et al. 2010). Recent developments in Geographical Information Systems (GIS) allow the researcher to obtain rich datasets containing detailed information about the spatial configuration of environmental goods, and to investigate spatial patterns in stated and revealed preferences for such environmental goods. In this paper we use GIS data related to forest characteristics in Poland as variables explaining the variation in the publics’ Willingness To Pay (WTP) for changes in forest attributes resulting from the implementation of a new country-wide forest management and protection program. Individual-specific WTP values are derived from a Discrete Choice Experiment (DCE) study. Spatial regression methods are used to investigate the relationships between stated WTP for changes in national forest management, and the characteristics of forests where people live. This shows that WTP is higher the closer people live to their nearest forest, but WTP is also higher the scarcer forests are in the area where they live (that is, the lower is the fraction of forest cover to total land area). We also investigate spatial patterns in latent classes which represent heterogeneity in preferences, finding that such latent clusters of preferences in space do indeed exist. Identifying such clusters, we argue, is useful for policy-making and environmental management.

The literature devoted to spatial dependencies in economic values of the natural environment includes revealed preference studies where GIS data are used to identify characteristics of environmental goods. For example, Schläpfer and Hanley (2003) use this approach to investigate preferences for landscape protection programs, and Termansen et al. (2008) investigate recreational values of forests in Denmark using site choice data for 52 forest sites whose characteristics (forest area, share of coniferous forests and distance to the site) are described using GIS data. Baerenklau (2010) incorporates latent class modelling into a travel cost count data approach to deal with un-observed heterogeneity in the population of potential back-country hikers. Our approach differs in making use of a stated preference method which can estimate a wider set of values than those related to recreational use alone. We also derive maps of spatial clusters in WTP values for forest attributes. Spatially-referenced environmental data has also been used extensively in stated-preference studies. For example, Jørgensen et al. (2013) use GIS data on distance to the Odense River as an explanatory variable in contingent valuation study. GIS data is also used in the development of benefits transfer models based on stated preference data (e.g., Bateman et al. 2011). More relevant for this paper are studies where GIS data has been used to investigate the spatial distribution of WTP derived from DCE studies. Examples of this approach include non-parametric analysis of respondents’ WTP for choice attributes. Campbell et al. (2008) applied Moran’s I statistic and confirmed significant global spatial clustering of the WTP estimates and that these estimates exhibited positive spatial autocorrelation even over relatively large spatial areas. In a follow-up paper, Campbell et al. (2009) find further evidence of an intrinsic spatial influence on WTP. They used Kriging methods as a means of benefit transfer to illustrate spatial variation and regional disparities in WTP. As a further example, Johnston et al. (2011) use the Getis–Ord statistic to locate welfare hot-spots.

Parametric analysis of estimated WTP was applied by Abildtrup et al. (2013), Broch et al. (2013) and Yao et al. (2014). In both of these studies of forest values, random effect panel models were used with socio-demographic and spatial data as explanatory variables to explain variation in individual WTP values for changes in forest management. In Abildtrup et al. (2013), spatial heterogeneity in WTP for forest attributes (tree species, facilities, hiking paths and access to water) was modeled using binary variables equal to one if a forest with corresponding attributes was within 10 km radius from an individuals’ home. Choice data came from an internet sample of residents in Lorraine, in Northern France, and concerned enhancements in the recreational resources of local forests, Similarly, Yao et al. (2014) used data on forest size and distance from respondent’s homes to capture spatial dependencies in WTP for enhancement of biodiversity in New Zealand forests. Finally Broch et al. (2013) included spatial variables as covariates directly into a DCE model.

The main objective of our paper is to identify spatial determinants and spatial clustering of WTP for forest attributes in Poland. We argue that spatial variability has implications for the design and acceptability of forest management policies. To do this we firstly estimate a Mixed Logit (MXL) model using the public’s DCE responses to derive individual WTP values, and then use these individual-specific WTP estimates as dependent variables in a GIS-based spatial regression. This approach very much follows that of Abildtrup et al. (2013) and Yao et al. (2014). However, our paper extends the existing literature of spatial heterogeneity of preferences in three ways. Firstly, our GIS dataset contains much more detailed information about forest characteristics than any of the earlier approaches. Secondly, for the second stage of the analysis, spatial econometric models were applied to recognize the inherent spatial autocorrelation of WTP. Thirdly, we apply a spatial latent class analysis to identify clusters of respondents with similar preferences for forest conservation. This proves to be a useful exercise in revealing spatial patterns of values.

2 Methodology

The analysis conducted in this paper is divided into three stages. First, a Mixed Logit (MXL) model is estimated which allows the estimation of individual-specific WTP using sample-level WTP distribution parameters, respondents’ choices and Bayes theorem. In the second stage these WTP scores are modeled in a spatial econometric framework using GIS data as explanatory variables. In the third stage, spatial latent class models are estimated, and results are mapped.

2.1 First Stage: The MXL Model

Stated preferences methods have several advantages over revealed preferences methods, but the most important in the context of estimating values for forest attributes is that the experimental design process avoids some of the multicollinearity problems with respect to forest attributes which result from using travel cost approaches to study the same problem (Hanley et al. 2003).

In this paper we apply the mixed logit model (MXL). It is a very flexible model which allows for heterogeneity in respondents’ preferences. The model is defined as follows. A respondent n’s utility from choosing alternative i in the j-th choice task is given by:

$$\begin{aligned} U_{ijn} =V_{ijn} +\varepsilon _{ijn} ={\varvec{\upbeta }}_{n}^{\prime }{} \mathbf{X}_{ijn}+\varepsilon _{ijn} \end{aligned}$$

where the error term \(\varepsilon _{ijn}\) is assumed to be i.i.d with a Gumbel distribution. \(\beta _{n}\) is an individual set of parameters which are assumed to come from some known distribution which depends on some unknown parameters \(\theta \) to be estimated (usually means, and the covariance matrix). In most cases the researcher is interested in the estimation of the distribution of WTP, therefore it is useful to re-parametrize Eq. (1) to:

$$\begin{aligned} U_{ijn}= & {} -\beta _n^{\mathrm{cost}}\left( -\frac{{{\varvec{\upbeta }}}_{n}^{\mathrm{non-cost}}}{\beta _{n}^{\mathrm{cost}}}^{\prime }{} \mathbf{X}_{ijn}^{\mathrm{non-cost}} -X_{ijn}^{\mathrm{cost}}\right) +\varepsilon _{ijn} =-\beta _{n}^{\mathrm{cost}}\left( {{{\varvec{\upalpha }}'}_{n}{} \mathbf{X}_{ijn}^{\mathrm{non-cost}}-X_{ijn}^{\mathrm{cost}}} \right) \nonumber \\&+\,\varepsilon _{ijn}, \end{aligned}$$

where \({\varvec{\upalpha }}_{n}\) is vector of WTPs for every non-monetary attribute defined as \(-{\varvec{\upbeta }}_{n}^{\mathrm{non-cost}}/\beta _{n}^{\mathrm{cost}}\). This specification is often referred to as a WTP-space model, in which rather than assuming some distribution for the \({\varvec{\upbeta }}_{n}\) vector, the distribution of vector \(({\varvec{\upalpha }}_{n}, \beta _{n}^{\mathrm{cost}})\) is assumed and the associated parameters estimated. In this case the likelihood of the n’th respondent’s choices \(({\mathbf{y}}_{n})\) is given by:Footnote 1

$$\begin{aligned} L(\mathbf{y}_{n}|\mathbf{X}_{n}, \theta )=\iint \limits _{{\varvec{\upalpha }}_{n},\beta _{n}^{\mathrm{cost}}} {\prod _{j}{\sum \limits _{i}{y_{ijn}\frac{\exp (V_{ijn})}{\sum \limits _{k}{\exp (V_{kjn})}} f\left( {{\varvec{\upalpha }}_{n}, \beta _{n}^{\mathrm{cost}}|\theta }\right) d{{\varvec{\upalpha }}}_{n} d\beta _{n}^{\mathrm{cost}}}}}. \end{aligned}$$

The likelihood function is integrated over random parameters as they are not directly observed in the data. This integral is usually approximated using simulation methods.

As random parameters are not observed one cannot obtain them directly, but it is possible to derive their conditional expected values using Bayes Theorem, resulting in:

$$\begin{aligned} E\left( {{\varvec{\upalpha }}_{n}|\mathbf{y}_{n}, \mathbf{X}_{n}, \theta } \right) =\int {{\varvec{\upalpha }}_{n}\frac{p\left( {\mathbf{y}_n |\mathbf{X}_{n}, \theta , {\varvec{\upalpha }}_{n}, \beta _{n}^{\mathrm{cost}}} \right) f\left( {\varvec{\upalpha }}_{n}, \beta _{n}^{\mathrm{cost}}|\theta \right) }{p\left( \mathbf{y}_{n}|\mathbf{X}_{n},\theta \right) }d\left( {{\varvec{\upalpha }}}_{n},\beta _{n}^{\mathrm{cost}}\right) }, \end{aligned}$$

which can also be simulated using numerical methods as well. These so called ‘WTP scores’ will be used in the second stage of our analysis. These conditional parameter estimates are strictly same-choice-specific parameters, or the mean of the parameters of the subpopulation of individuals who, when faced with the same choice task, made the same choices. This is an important distinction since it is impossible to establish, for each individual, their unique set of estimates; instead, a mean estimate for the subpopulation who made the same set of choices in the panel was derived (e.g., see Hensher et al. 2006; Hess 2010).

Following Campbell et al. (2009), we also provide a visual representation of spatial distribution of WTP-scores using regression Kriging. Specifically, we estimated linear regression models with the dependent variable being a WTP-score and an independent variable being a 2-D smooth spline function of latitude and longitude. Using splines allowed us to capture nonlinear spatial dependencies. Having estimates of this model we then extrapolate results onto the whole map using coordinates of centers of \(10\times 10~\hbox {km}\) squares. Calculations were done using the R mgcv package (Wood and Wood 2007).

2.2 Second Stage: The Spatial Regression Model

In this stage the WTP scores calculated in stage 1 are explained using GIS data on forest characteristics and survey data on respondents’ socio-demographic variables. The most common approach currently applied in the literature (e.g., Campbell 2007; Yao et al. 2014) is the random effects panel regression in which all WTP scores are modeled simultaneously. This approach, however, forces some unrealistic assumptions on the modelling process, such as that all variables influence every WTP score in the same way. A possible solution is to interact the variables in the model with dummy variables for every WTP score (Campbell 2007), but this may result in too many parameters to estimate. For these reasons we estimated a separate model for each WTP score.

To accommodate spatial dependence a spatial lag model was applied.Footnote 2 This decision was reached after also evaluating the spatial error model. However, since there was little empirical guidance to tell which modelling framework was more appropriate we justified our decision on our belief that there is potential for diffusion and WTP feedbacks effects; and that this effect could be more profound than accounting for the variables omitted from the systematic component which could induce spatial correlation in the errors of the model. Besides, in non-market valuation settings we feel that there are many variables that are impossible to measure in order to fully explain spatial clustering. Therefore, we did not feel we were in a position where we understood every variable which should be included and were able to properly specify these in the systematic component of our WTP model. If this had been the case, we would have been more convinced that the spatial error model would be better suited to account for any residual correlation.

Under the spatial lag model each WTP is assumed to be of the following form:

$$\begin{aligned} \textit{WTP}=\tau c+\rho {\mathbf{W}}'{\mathbf{WTP}}+{{\varvec{\upgamma }}}'{\mathbf{Z}}+e. \end{aligned}$$

In this specification, c is a constant, \(\mathbf{Z}\) is a matrix of GIS and socio-demographic explanatory variables and \({\varvec{\upgamma }}\) is a vector of parameters to be estimated (which can be different for every WTP score). The error term e is assumed to follow a normal distribution with zero mean and standard deviation \(\sigma \). Additionally, a spatial lag term is included which addresses the problem of possible spatial autocorrelation between observations which is not explained by the covariates. \(\mathbf{W}\) is a matrix of spatial weights and \(\rho \) is a scalar parameter which corresponds to the magnitude of this spatial correlation—high positive values of \(\rho \) can be interpreted as showing that the WTP values of respondents who live close to each other are more similar than the covariates would predict. The inverse square function of distance was chosen for the spatial weights.Footnote 3 This allows us to capture global effects, as it has non-zero values even for very far away regions (although the influence of very distant neighbors will be infinitesimal), but gives much higher weights for very close regions, thus allowing us to capture so-called hot-spot effects. Thought was also given to alternative ways to define the neighbor criterion (e.g., based on contiguity and distance) but we found that that this quite often led to quite arbitrary neighborhood definitions and other problems with ‘point’ data.

We admit that our spatial lag model does not get around the potential loss of efficiency that stems from the fact that the dependent variable (i.e., WTP) is conditional on the estimated parameters from the MXL model. Added to this, we only use the means of the conditional distributions. The standard deviations of the conditional distributions give a measure of the uncertainty in the respondents’ WTPs. Given that these differ across respondents, uncertainty varies across respondents, potentially resulting in further inefficiencies and perhaps inconsistent standard errors. Undoubtedly, it would be better to include the GIS and socio-demographic explanatory variables of the spatial lag model directly in the MXL model. However, including these variables and simultaneously accounting for any spatial autocorrelation would entail substantial complexity. The model would become quite convoluted and is likely to make hypothesis testing less accessible, given the proliferation of parameters needed to interact the spatial variables. Our two-stage approach has the appealing feature that the models are more tractable and easier to manage. This approach also resembles that used in earlier applications (e.g., Campbell 2007; Yao et al. 2014) and, importantly, should not preclude us from drawing policy inferences on the influence that the GIS and other variables on WTP (which is our overall objective).

2.3 Spatial Latent Class Analysis

Finally, to provide an additional insight by visualizing the clustering of WTPs, we estimated a latent class logit (LC) model (Baerenklau 2010; Garrod et al. 2012), applied Bayes theorem to predict each individual’s probability of participating in each latent class, and lastly extrapolated these probability scores using regression Kriging.

Formally, individual i’s utility (in WTP-space) for the LC model can be written down in a similar way to the MXL model (2):

$$\begin{aligned} U_{ijn}^{c}=-\beta _{c}^{\mathrm{cost}} \left( {{\varvec{\upalpha }}}_{c}^{\prime } \mathbf{X}_{ijn}^{\mathrm{non-cost}} -X_{ijn}^{\mathrm{cost}}\right) +\varepsilon _{ijn} =V_{ijn}^c +\varepsilon _{ijn}. \end{aligned}$$

The only difference is that now utility is also indexed by the class to which a respondent belongs \((c\in \{1,\ldots ,C\})\) and parameters are not individual-specific but rather class-specific. As in reality we have no way of knowing for sure which class an individual belongs to, this leads to a formula on the likelihood of class membership of the following form:

$$\begin{aligned} L\left( y_n |\mathbf{X}_{n}, \theta \right) =\sum \limits _c {\pi _c L_c (y_n |X_n ,\theta _{c})} \end{aligned}$$

where \(L_c (y_n |X_{n}, \theta _{c})\) is a likelihood conditional on being in class c:

$$\begin{aligned} L_{c}(y_n|X_n,\theta _{c})=\prod _{j}{\sum \limits _{i} {y_{ijn} \frac{\exp \left( V_{ijn}^{c}\right) }{\sum \limits _{k}{\exp \left( V_{kjn}^{c}\right) }}}}, \end{aligned}$$

and \(\pi _{c}\) is a set of parameters representing probability of being in given class to be estimated with restriction that \(\sum \nolimits _{c}{\pi _{c}}=1\). After estimation of the model, individual n’s probability score of being in class c can be estimated with Bayes’ formula:

$$\begin{aligned} p_{cn}=\frac{L_{c}(y_{n}|X_{n}, \theta _{c})\pi _{c}}{L(y_{n}|X_{n},\theta )}. \end{aligned}$$

Therefore it is a likelihood conditioned on being in class c, multiplied by the probability of being in this class, divided by an unconditional likelihood.

As a final step, we extrapolated the conditional probability scores. For extrapolation we once again used the regression Kriging method, but this time we used hierarchical logit regressions to assure that extrapolated class memberships probabilities add to one. Under this framework a series of logit models is estimated with dependent variables being \(Q_{ic}=\mathbf{I}_{\{P_{ic}=\mathop {\max }\limits _{j}(P_{ij})\}}\) and a 2-D smooth spline function of latitude and longitude as an independent variable. For the first class we estimated an unconditional membership probability. For each of the other classes we estimated additional binary logit models conditional on respondents not being members of previous class(es). These results were extrapolated over the whole map using fitted probabilities resulting from these models. The unconditional probabilities of belonging to every class can then be calculated using unconditional probability estimated for the first class and conditional probabilities for all other classes.

3 Data

3.1 Discrete Choice Experiment

The dataset used in this study was investigated in Czajkowski et al. (2014a) and Czajkowski et al. (2014b). The original survey was conducted in 2010 on a representative sampleFootnote 4 of 1001 Polish adults. The main objective of the survey was to find the most important biodiversity and recreation attributes of Polish forests for the general public. After intensive qualitative studies, three attributes were selected to describe the environmental management options for national forests, namely: (1) passive protection of the most ecologically valuable forests in Poland,Footnote 5 (2) reducing the amount of litter (garbage, rubbish) in forests through tougher law enforcement and by increasing forest cleaning services and (3) increasing the level of recreational infrastructure, such as way-marking of trails. In all cases, respondents were asked whether they would support a particular policy change for the environmental management of all publicly-owned forests in Poland, which currently comprise 82 % of all forests in Poland.

In Poland about 3 % of all 90,000 \(\hbox {km}^{2}\) of forests are perceived as the most ecologically valuable. Currently only about half of these forests are properly protected which led to the following levels of the first attribute being chosen:

figure a

For the second attribute, the amount of litter in forests was used, since this is a significant problem in Poland. It is obvious that littering can decrease the recreational value of forests as well as non-use values. For this attitude the following levels were chosen:

figure b

Pre-testing showed that availability of the appropriate infrastructure for tourists is important for a forest’s recreational value. This infrastructure may include parking spaces, paths and trails for tourists, picnic sites etc. Levels for this attribute were chosen as:

figure c

The last attribute was the annual cost of these changes in the form of an increase in annual income taxes to pay for a national program of enhanced forest management. The levels we used were: 0, 10, 25, 50 and 100 PLN.Footnote 6

Every respondent completed 26 choice tasksFootnote 7 with 4 alternatives. In every choice task a Status Quo alternative with no changes in each attribute and a zero additional tax cost was included. The experimental design was optimized for Bayesian (median) d-efficiency of the MXL model (Sándor and Wedel 2001; Ferrini and Scarpa 2007; Bliemer et al. 2008; Scarpa and Rose 2008). To obtain initial estimates (priors) and verify the qualitative properties of the questionnaire itself, we conducted a pilot study on a sample of 50 respondents. An example choice card is provided as Fig. 1. For more details about design and survey see Czajkowski et al. (2014a).

Fig. 1
figure 1

An example choice card used in the DCE study (translation)

3.2 GIS Data

Information on forest characteristics used in this study was obtained from two different sources which we will now describe. Firstly, the CORINE Land Cover (CLC) dataset was used. This project is coordinated by the European Environment Agency with the objective of collecting high resolution data for the whole continent.Footnote 8 CLC databases contain area data for objects with a minimum area of 5 ha and a width of more than 100 meters. The second source of information we used was the Polish Information System of State Forests which has been used in Poland for the management of State Forests since 1995. This tool contains very precise data about the characteristics of forests in Poland.

The data from these sources was aggregated to \(10\times 10~\hbox {km}\) squares.Footnote 9 In total, 3307 such squares cover the area of Poland. Figure 2 presents a map with a distribution of DCE study respondents. The GIS data were associated with particular respondents using their ZIP-codes identifying their places of residence. For every respondent, the explanatory variables were calculated as weighted averages of forest characteristics in the \(10\times 10~\hbox {km}\) area common with respondents’ ZIP area code. The GIS variables used in this study are described in Table 1.

Fig. 2
figure 2

Respondents and forest area spatial distribution. As some respondents reported the same ZIP-codes we jittered all of them with uniform random variable on [−5,5] \(\times \) [−5,5] \(\hbox {km}^ {2}\). This also allow us to compute spatial weights matrix

Table 1 GIS variables used to characterize the locations in which respondents’ lived

Maps presenting the distribution of the environmental characteristics of forests used in this paper are presented in Appendix 1 in Supplementary Materials. In addition in Table 2. we present basic characteristics of variables considered in our model.

Table 2 Characteristics of variables used in spatial lag model

4 Results

4.1 Discrete Choice Model

The MXL model was estimated in Matlab.Footnote 10 In order to make sure the maximum simulated log-likelihood approach avoids masking identification problems (Chiou and Walker 2007), difficulties for the optimizer to cross local optima and reach the global optimum, and bias or unacceptable uncertainty levels associated with the log-likelihood function value, parameter estimates and their standard errors we used 10,000 scrambled Sobol draws, as suggested by Czajkowski and Budziński (2015). The model was estimated in WTP space. For every non-monetary attribute, two dummy variables were included to allow for varying marginal utilities associated with their improvements. As a result NAT, TRA and INF represent partial or substantial improvement in passive protection of most ecologically valuable forests, the amount of litter, and recreation infrastructure respectively. In addition, the status quo dummy (SQ) was included in the utility function as an alternative specific constant for the “no new improvements” option. The cost parameter was assumed to have a negative log-normal distributionFootnote 11, while other parameters were assumed normally distributed. Full correlation of parameters was allowed. The results are presented in Table 3.

Table 3 Mixed Logit results of the discrete choice experiment dealing with changes in management of Polish forests (model in WTP-space, results in EUR per year)

All of the MXL parameters are highly significant. The Status Quo coefficient is negative, which means that on average individuals derive utility from improvements in forest ecosystem management. However, this effect varies considerably in the sample as indicated by a relatively high standard deviation estimate. The WTP for all non-monetary attributes have positive means and vary significantly. Consistent with economic theory, higher levels of attributes have higher means. The highest mean WTP is associated with reduction in the amount of litter in forests—respondents were willing to pay about 12 EUR annually for 50 % reduction of litter in forests and about 18 EUR for a 90 % reduction. This is about 4 EUR more than average WTP for passive protection of all the most ecologically valuable forests and 9 EUR more than for 100 % increase in the appropriate tourist infrastructure. Appendix 2 in Supplementary Materials presents the correlation matrix of WTP’s estimated by the MXL model and correlation matrix of predicted WTP scores. Note that there is an almost perfect correlation between attribute levels of each attribute—we return to this observation when interpreting the results of spatial regression models.

4.2 Spatial Heterogeneity of Respondents’ Preferences

To investigate the influence of various characteristics of the location in which respondents live (grid cell characteristics) on their preferences, individual WTP scores were predicted using Bayes theorem.Footnote 12 Using Moran I statistics we firstly analyzed whether these scores are spatially autocorrelated. As presented in Table 4. in all cases we have to reject null hypothesis of no spatial autocorrelation.

Table 4 Moran’s I statistics for WTP scores

To facilitate interpretation, we continue the analysis of WTP scores with extrapolating them to the map of Poland using the regression Kriging method. The distribution of WTP scores associated with \(NAT_{1}\) and SQ are presented in Figs. 3 and 4.Footnote 13 It is clear, from analyzed figures, that presented WTP scores exhibit strong spatial correlation. In addition, one can observe relatively stronger preferences for all improvements in the eastern, western and parts of central Poland, while they are the least appreciated in the south-east (the Bieszczady region), the north-east (the Mazury region) and north-west (the Pomorze region). Respondents from different regions also differ with respect to how important each attribute is (how much they would be WTP for them), although these dissimilarities are less stark. There seems to be no evident pattern of preferences associated with the vicinity of big cities—WTP near some of them is relatively high (Cracow, Wroclaw, Gdansk), while for others WTP is low (Warsaw, Poznan). The same conclusion can be made for population density—the regions with high population (south and central Poland), as well as with low population, contain parts with both high and low WTP values. Two other thing should be noted as well—spatial distribution of WTP for SQ attribute is approximately reverse to what we observe for \(NAT_{1}\) and there is a clear similarity of respondents’ spatial preferences for ‘partial’ and ‘substantial’ improvements of each attribute (see Appendix 3 in Supplementary Materials).

Fig. 3
figure 3

Spatial distribution of respondents’ WTP scores associated with the \(\textit{NAT}_{1}\) attribute

Fig. 4
figure 4

Spatial distribution of respondents’ WTP scores associated with the SQ attribute

The results of the Kriging regression presented here are relatively informal and included for illustrative purposes only. For a more formal, structural analysis we apply a simple OLS regression for WTP scores for each of the attribute levels. In the next step we used Lagrange Multiplier (LM) test for missing spatial lag of explanatory variable. As presented in Table 5. for all attributes we have to reject null hypothesis of no spatial lag. This means that the variables do not correctly capture the spatial dependencies in WTP scores, and that more advanced spatial models should be applied.

In order to discover what drives these spatial differences in respondents’ WTP we use a spatial lag model. In addition to local forest characteristics, we include socio-demographic variables which are presented in Table 2. Insignificant variables (p value > 10 %) were then excluded from the model to increase efficiency of the other coefficients’ estimates. Models were estimated with Maximum Likelihood method using the R statistical software and spdep package (Bivand 2005). Results are provided in Table 6.

In all 6 models the spatial lag parameter \(\rho \) is positive and highly significant—indicating that respondents are expected to have higher WTP values if, on average, their neighbors also have high WTP values. The results of the models estimated for different levels of the same attribute are similar. This is not surprising as their correlation is quite large (cf. Appendix 2). We interpret and discuss these results in Sect. 5.

Table 5 LM statistics for missing spatial lag of explanatory variable in OLS regressions
Table 6 Spatial lag models results for the effects of land use patterns on WTP for different forest conservation attributes

4.3 Latent Classes and Spatial Clustering

It is quite likely that people with similar stated preferences are found in spatial clusters (as already shown in the mapped WTP distributions). This is for the two reasons noted in the introduction: amenities may attract specific kinds of people (specific in terms of their tastes) to locate in areas which score relatively highly in certain environmental features such as the area, proximity or type of woodland; or living in an area with a given environmental character results in individuals’ preferences evolving to a set which shows a preference for the environmental features where they live. Of course, spatial patterns in socio-economic variables such as incomes also exist, and to the extent that these socio-economic variables drive WTP, this will also result in a clustering of WTP values.

Fig. 5
figure 5

Extrapolated probability of class membership for the 2-class and 3-class LC models a The 2-class LC model, B The 3-class LC model

Figure 5 shows the spatial clusters from the latent class analysis for 2- and 3- class models, respectively. Model results are given in Table 7.

As can be seen from the model with 2 classes, about 75 % of respondents are likely to belong to the first class. This class is more pro-environmental in the sense that their WTP for changes in forest management are much higher than WTP indicated by the utility function parameters for the second latent class, whilst the SQ parameter is negative. In the LC model with 3 classes, about 66 % of respondents is likely to belong to the third class, which is even more pro-environmental than Class 1 in the previous model. WTP’s in classes 1 and 2 are much lower and also the SQ coefficients are highly positive. Especially in class 1 WTP for not making any change is higher than the sum of WTPs for making substantial improvements in all three attributes together. Class 1 can therefore be interpreted as respondents who in every situation would prefer not to make any improvements to national forest management, while class 2 respondents would derive some utility from changes in forest management, but this would not be substantial.

Firstly, we present Moran’s I statistics for calculated probability scores in Table 8. In all cases, the p values are very low allowing us to conclude that there is a very significant spatial dependence.

Figure 5 illustrates the spatial distribution of extrapolated probability scores for both LC models. The left panel (A) depicts class probabilities for the 2-class LC model.Footnote 14 This is thus a spatial representation of clustering in WTP values for forest enhancements. It appears that individuals who have low probability scores (and therefore are more likely to belong to class 2, and so have lower values for the changes in forest management) are mostly located in the northeast (near the border with Russia) and also found in the south of Poland, near the borders with Slovakia and the Czech Republic. Other regions with lower probabilities include patches of central regions, particularly between some of the big cities. We note, however, that there were not many observations in our sample which were located there (cf. Fig. 1), it is therefore possible that this is an artefact of extrapolating low probability scores in the bigger cities mentioned. The highest probability scores are placed near west and east boarders and also in the central Poland. These are the regions which have relatively fewer forests. This result thus supports our findings reported earlier in this paper.

Table 7 Results of 2- and 3- class latent class models estimated in WTP-space
Table 8 Moran’s I statistics for class probability scores

In the right panel (B) of Fig. 5 we report similar maps for case of the LC model with 3 classes. The conclusions from the distribution of the probability of belonging to class 3 (the most pro-environmental respondents, lower-right panel) are very similar to those derived from the model with 2 classes for the most pro-forest-conservation group. The regions of high probability are significantly smaller because the probabilities are around 10 percentage points lower than for the case with 2-class model. The main insight of the LC with 3 classes is that one can now see that individuals from class one (who do not want any changes) are mostly concentrated in the northern part of Poland (near the border with Russia) and in the south (near the borders with Slovakia and the Czech Republic). People with lower WTP are also more likely to live near big cities (albeit the effect is less evident). Lastly, class 2 probability seems to be the highest around the city of Poznań in central Poland, and also (albeit to a lesser extent) in the south (near the border with the Czech Republic) and north-east (near the border with Russia).

5 Discussion and Conclusions

The spatial patterns of WTP for environmental improvements provide important information for improving the economic efficiency of land management. Combined with information on the spatial distribution of costs, such WTP data enables the better targeting of environmental management, since those areas with higher net benefits can become the focus for enhancements, given limited budgets for funding such improvements. In this paper, we combine detailed spatial data on forest characteristics and land use patterns with estimates of individual WTP (obtained from mixed logit models) for changes in forest management, using spatial statistical methods to deal with problems of autocorrelation. We also use latent class modelling, and show that there are spatial clusters of people with similar preferences for environmental improvements in different parts of Poland. As far as the authors are aware, this paper is the first to combine these modelling approaches, and thus the paper will be of general interest to those concerned with relating spatial patterns of land use and land management with environmental valuation estimates.

The most consistent results to emerge from the spatial lag models are that (i) the further away from a forest one lives, the less one is willing to pay for improvements to national environmental forest management; and (ii) the more forests there are in the \(10 \times 10~\hbox {km}\) square where one lives, the lower is willingness to pay for enhancing the national forest estate. Result (i) is a type of distance decay function found in many papers in the literature (e.g., Jørgensen et al. 2013). Result (ii) initially looks counter-intuitive, but also makes sense. Recall that the discrete choice experiment asks respondents to bid for increases in the quality of environmental management of forests nationally. The more of a good that is (locally) available to the respondent, the lower his or her marginal WTP for increases in that good. That is exactly what we find here. Another explanation might be that two of the improvements included in the choice experiment (littering and recreational facilities) might lead to increases in visitor pressures in an area where forests are currently more abundant. This increased congestion and traffic etc. might be a source of dis-utility to local people, who are more likely to experience such effects the greater is the area of forest in the grid square where they live.

Looking at the type of forest growing in the gird square where respondents live, the specification shows that having a predominance of conifer species always lowers WTP, whilst having a predominance of deciduous trees in your local forests increases WTP for improvements to national forest environmental management. This echoes early hedonic price analysis of the impacts of forest characteristics on property values in the UK (Garrod and Willis 1992).

The presence of forests older than 120 years in the area where the respondent lives has a positive and highly significant effect on their WTP for conserving natural forests and removing litter, and for improving forest infrastructure. Forests as old as this constitute about 5 % of all forests in Poland and are considered the most ecologically valuable. The coefficients of this variable are also much higher (in absolute values) than the coefficients of the other forest-related area characteristics. It thus seems that individuals who live near old-growth forests have substantially different preferences with respect to the desirability of enhancements in national forest management and protection practices.Footnote 15

Finally, we note that GIS variables indicating the area of built-up land and the area of forests with more than 6 tree species did not turn out to be significant. As for socio-demographic variables considered, we found that age had a highly significant and negative effect on WTP while income has a positive influence on WTP. We found that respondents’ with higher education levels were WTP less for the improvements in the infrastructure than the remainder of the sample, which may reflect the way in which variations in educational attainment are correlated with how forests are used for recreation. Additionally, as expected, frequent users of forests for recreational purposes are WTP more for improvements in their management.

What our data cannot address is the question of why there are spatial clusters of preferences for forest conservation. One possible reason is the equilibrium sorting view of individuals choosing where to live based on their preferences and the amenities of an area (Timmins and Schlenker 2009). An alternative view is that it is the environmental characteristics of an area which shape the preferences of those who live there. Given these two possible explanations for observed spatial variation in preferences for forest management, it would be wrong to infer any kind of causality from our models.