1 Introduction

Knowing how biological dimensions and travel distance of birding places enter into the utility function of birders and how birders differ in characteristics, means, and objectives, may make planning and management of natural areas suitable for birdwatching more effective, facilitate the organization of birding activities, and promote targeted, sustainable ecotourism (Bennett et al. 2017; Czajkowski et al. 2014; Edwards et al. 2011; Haefele et al. 2019; Kolstoe and Cameron 2017; Loomis et al. 2018; Mattsson et al. 2018; Myers et al. 2010; Steven et al. 2015; Vas 2017). Empirical evidence reveals that birders, like recreationists enjoying other nature-based activities, are less worried about sites’ infrastructure and care more about biodiversity and habitat quality (e.g., Guimarães et al. 2014; Steven et al. 2015), even if differences arise depending on the birder’s level of specialization (Hvenegaard 2002). Many studies indicate that the level of specialization substantially influences the variability among groups of birders, not only in terms of the desired setting attributes but also in terms of awareness, knowledge, conservation attitudes, information used to determine site destination decisions, and behavioral attitudes and motivations (Cole and Scott 1999; Eubanks et al. 2004; Lessard et al. 2018; Maple et al. 2010; Miller et al. 2014; Scott and Thigpen 2003; Shipley et al. 2019). The level of specialization also affects the travel intention of birders, and the values they assign either to the entire recreation experience or the marginal values of destination attributes (De Salvo et al. 2020a; Lee et al. 2010).

In this paper, we use “distance-based” discrete choice experiments (DCEs) to investigate whether and how quantitative and qualitative biodiversity aspects are important in birding site selection. Literature offers several applications of DCEs to identify the multidimensional facets of birdwatching (Carson and Czajkowski 2014; Guimarães et al. 2014; Hanley et al. 2010; Lee et al. 2010; Naidoo and Adamowicz 2005; Roberts et al. 2017; Steven et al. 2017; Veríssimo et al. 2009). DCEs indeed deliver information on the relative importance of attributes characterizing birding places, allow estimates of the direction and size of marginal changes of significant attributes and relative trade-off ratios, and indicate which sites offer the best birdwatching experience. Moreover, DCEs provide ways to profile and segment birders in classes according to their socioeconomic characteristics and preferences.

Our application differs in at least five aspects from previous DCEs conducted in the area of birdwatching. First, we intercept a sample composed only of specialized birders. Second, we focus on the biological attributes of sites. Third, we test whether there is a significant correlation among birders’ preferences for bioecological site attributes and whether it is possible to segment specialized birders into classes in which preferences assume the same pattern, identifying the socioeconomic and attitudinal factors that explain differences in preferences for site attributes among classes. Fourth, we estimate the marginal value of birding site attributes in terms of willingness to travel (WTT). WTT is generally elicited in contingent behavior (or contingent activity) studies (Heyes and Heyes 1999; Whitehead et al. 2013). It is also used as a proxy of the cost or price attribute in some DCEs to minimize protest-motivated reactions (Heyes and Heyes 1999; Whitehead and Wicker 2018), although, in the subsequent econometric analysis, it is converted to money to obtain welfare measures in monetary terms (Hanley et al. 2002; Kerr and Abell 2014; Sælen and Ericson 2013; Unbehaun et al. 2008). In this study, we use and maintain WTT in all stages as a nonmonetary proxy of willingness to pay (WTP) to avoid potential bias in WTP estimates related to the different ways of transforming travel distances into travel costs (Chae et al. 2012; Heyes and Heyes 1999; Pascoe et al. 2014). Regardless, WTT is a valid welfare measure that can be, for instance, directly used to define the natural site users’ catchment area. Finally, we employ econometric models to explore birders’ preference heterogeneity, also considering whether birders’ preferences for one attribute are related to preferences for another attribute, and testing whether significant scale heterogeneity exists across birders (Fiebig et al. 2010; Keane and Wasi 2013).
Scale heterogeneity is now an important issue in the DCE literature (e.g., Burke et al. 2010; Czajkowski et al. 2016; Hess and Train 2017; Revelt and Train 2000; Scarpa et al. 2008). However, to the best of our knowledge, it has not been investigated in the analysis of birders’ choice behavior. Scale heterogeneity addresses factors not explicitly included in the model that can affect choices differently across individuals. It implies correlation among the coefficients of included variables; this source of correlation can be confounded with other forms of correlation (Hess and Train 2017). We test forms of correlation by assuming that birders’ preferences have either continuous or discrete distributions.

2 Materials and methods

2.1 Design of DCEs and data description

In DCEs, respondents express their preferences by selecting the preferred option from a discrete set of hypothetical alternatives. Each alternative is described by a finite number of characteristics or attributes, each of which takes different levels. The choice task involves selecting from two or more alternatives that differ in attribute levels. Respondents are generally asked to complete multiple choice tasks. Thus, in DCEs, respondents do not provide a direct estimate of their preferences; they provide only indirect information from which it is possible to infer the value placed on each attribute or alternative (Adamowicz and Deshazo 2006; Hensher et al. 2015; Louviere et al. 2000; Hoyos 2010).

In this study, the attributes and levels used to define the options included in choice sets (or choice tasks) were determined through personal interviews, focus groups, and a pilot survey. We identified two bioecological qualitative attributes (the probability of observing a new species and the probability of observing a rare species) and one biological quantitative attribute (the number of observable bird species during one trip) (see Table 1). As previously mentioned, we selected the distance of the site from home as a proxy for the cost attribute required for the calculation of welfare estimates.

Table 1 Attributes and levels

Choice sets were generated using a D-efficient fractional design (Street and Burgess 2004). Combinations among attributes and levels were obtained using NGENE 1.2 (ChoiceMetrics 2018). Respondents were grouped into six blocks. Each choice set included two alternatives and an opt-out option. Each alternative was described in textual terms. The choice task was repeated four times. An example of the choice card used in the choice task is depicted in Fig. 1.

Fig. 1 Example of choice card

Data were collected through an online survey of birders living in Sicily (Italy). Birdwatchers were contacted through a mailing list provided by EBN Italy, the biggest specialized birdwatchers’ community in Italy. A structured questionnaire was sent in 2019 to all Sicilian EBN members (N = 178). After three months, we had collected 103 complete and usable questionnaires (response rate: 58%). As the response rate was below the generally accepted rule of thumb of 80%, we verified whether the realized sample was affected by sampling error or non-response bias. To conduct tests on the difference from the target population, we used available information on relevant characteristics of Sicilian specialized birders, coming from the same survey, a previous survey (De Salvo et al. 2020a, b), personal knowledge of birders, and follow-up direct contacts. Individual t tests on means related to relevant demographics (age, gender, education level) revealed no statistically significant differences between respondents and non-respondents. Through a Chi-squared test, we likewise found no significant difference in terms of the birder’s level of specialization. These tests support the use of our estimates, without any correction, for generalization and aggregation purposes. Moreover, according to the criterion of the minimum sample size, 103 useful respondents guarantee 10% precision and 95% probability of the hypothesis that the true proportion that a generic alternative is chosen equals 35%. Observed probabilities for each alternative were 25.71% for the opt-out option, 42.86% for Site A, and 31.43% for Site B. Consequently, a mean value of 35% can be deemed acceptable (Hensher et al. 2015; Louviere et al. 2000).

Table 2 reports summary statistics for the main variables of the sample. Mean values of variables related to birdwatching confirm the high specialization of sampled birders. The average experience in birdwatching is approximately 20 years, and 87% of the respondents could identify more than 40 bird species (48% more than 100 species).

Table 2 Summary statistics

2.2 Econometric analysis

We estimated several models able to allow for scale heterogeneity and other forms of observed and unobserved preference heterogeneity, and correlation among attributes in the context of repeated choices by respondents. All models were based on the standard framework of the random utility model (McFadden 1974), according to which the utility (Unjt) to person n from choosing alternative j in the t choice occasion is the sum of a deterministic part (Vnjt), that accounts for attributes that are observable by the researcher, and a stochastic or idiosyncratic error (εnjt) that captures unobservable characteristics influencing respondents’ choices.

Attribute utility weights were assumed to be continuously distributed or discrete (finite) among classes. When weights had a continuous distribution, we used the generalized multinomial (GMNL) model proposed by Fiebig et al. (2010) because it allows for random heterogeneity, including correlation induced by the presence of a significant random scale heterogeneity. The GMNL model was specified as:

$${U}_{njt}=\left[{\sigma }_{n}\beta +\gamma {\eta }_{n}+\left(1-\gamma \right){\sigma }_{n}{\eta }_{n}\right]{\mathrm{x}}_{njt}+{\varepsilon }_{njt},$$
(1)

where \({\sigma }_{n}\) is a person-specific parameter that accounts for scale heterogeneity, \(\beta\) is the vector of the mean attribute utility weights in the population, \({\eta }_{n}\) is the vector of person n-specific deviations from the mean \(\beta\), \({\text{x}}_{njt}\) is the vector of attributes, \(\gamma\) is a parameter that governs how the variance of residual taste heterogeneity varies with scale, and \({\varepsilon }_{njt}\) is the idiosyncratic error, which follows an i.i.d. Gumbel distribution.

The scale parameter (\({\sigma }_{n}\)) reflects the fact that the variance of the error term is not constant, but varies among respondents. For some individuals, the scale of the idiosyncratic term is greater than for others. These individuals are, in the real world, more affected in their choices by factors that are not explicitly included in the model. Conversely, the variance of the error term is lower for individuals whose preferences are well captured by the variables included in the utility function. The scale parameter is indexed on the person n to account for its variability at the individual level. Thus, it was necessary to specify an a priori distribution. The scale parameter is a “scale” factor, so it should be positive; for this reason, a lognormal distribution, LN(1, \({\tau }^{2}\)), was assumed, with a mean normalized to 1 for identification purposes and standard deviation equal to \(\tau\). The latter is the key parameter that captures scale heterogeneity. As the parameter \(\tau\) increases, the degree of scale heterogeneity and the correlation among utility coefficients rise. \(\beta\) and \({\eta }_{n}\) are instead the parameters that determine random attributes; that is, they are the parameters related to the attribute utility weights, which were assumed to be continuously distributed. The former (\(\beta\)) is the vector of the means. The latter (\({\eta }_{n}\)) represents the variability. Several distributions can be assumed for random attributes (e.g., uniform, triangular, normal, lognormal). Here, we assumed a normal random parameter distribution for all the attributes. This hypothesis implies that \({\eta }_{n}\) follows a multivariate normal distribution, MVN(0, \(\Sigma\)), where \(\Sigma\) is the variance–covariance matrix. Parameter \(\gamma\) controls how the variance of the residual taste heterogeneity (i.e., \({\eta }_{n}\)) varies with the scale parameter (i.e., \({\sigma }_{n}\)). Fiebig et al. (2010) suggested two GMNL specifications, corresponding to the particular cases that arise when \(\gamma\) equals 0 or 1:

$$\text{GMNL-I}:\quad {U}_{njt}=\left({\sigma }_{n}\beta +{\eta }_{n}\right){\mathrm{x}}_{njt}+{\varepsilon }_{njt}\quad \text{when } \gamma =1,$$
(2)
$$\text{GMNL-II}:\quad {U}_{njt}={\sigma }_{n}\left(\beta +{\eta }_{n}\right){\mathrm{x}}_{njt}+{\varepsilon }_{njt}\quad \text{when } \gamma =0.$$
(3)
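The way Eq. (1) nests Eqs. (2) and (3) can be checked numerically. The following is a minimal sketch in Python, not the estimation code used in the study; the function name `gmnl_coefficients` and all numeric values are hypothetical and chosen only for illustration.

```python
def gmnl_coefficients(beta, eta, sigma, gamma):
    """Person-specific utility weights implied by Eq. (1):
    sigma*beta + gamma*eta + (1 - gamma)*sigma*eta, element by element."""
    return [sigma * b + gamma * e + (1.0 - gamma) * sigma * e
            for b, e in zip(beta, eta)]


# Hypothetical values for illustration only (not estimates from the paper).
beta = [0.6, 0.9, -0.02]   # mean utility weights
eta = [0.1, -0.2, 0.005]   # one person's deviations from the means
sigma = 1.5                # that person's scale factor

gmnl_1 = gmnl_coefficients(beta, eta, sigma, gamma=1.0)  # Eq. (2): sigma*beta + eta
gmnl_2 = gmnl_coefficients(beta, eta, sigma, gamma=0.0)  # Eq. (3): sigma*(beta + eta)
```

With \(\gamma = 1\) only the means are rescaled by \({\sigma }_{n}\), whereas with \(\gamma = 0\) the deviations are rescaled as well; intermediate values of \(\gamma\) mix the two cases.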

In the GMNL model, individual choice probabilities are simulated through D draws—\({\left\{{\eta }^{d}\right\}}_{d=1,\dots D}\)—from the multivariate normal distribution MVN \((0,\Sigma\)) by averaging simple logit expression over these draws:

$$L\left(j\left|{\mathrm{x}}_{njt}\right.,{\sigma }^{d},{\eta }^{d}\right)=\frac{1}{D}\sum_{d=1}^{D}\frac{\exp\left[\left({\sigma }^{d}\beta +\gamma {\eta }^{d}+\left(1-\gamma \right){\sigma }^{d}{\eta }^{d}\right){\mathrm{x}}_{njt}\right]}{\sum_{k=1}^{J}\exp\left[\left({\sigma }^{d}\beta +\gamma {\eta }^{d}+\left(1-\gamma \right){\sigma }^{d}{\eta }^{d}\right){\mathrm{x}}_{nkt}\right]}.$$
(4)

When each respondent faces multiple choice tasks, the simulated probability of observing for person n the choice sequence \({\left\{{y}_{njt}\right\}}_{t=1}^{T}\), where \({y}_{njt}\) equals 1 if person n chooses alternative j on choice occasion t and 0 otherwise, is the average over draws of the product of standard logit formulas for the chosen alternatives:

$$L\left({y}_{n1}, {y}_{n2},\dots ,{y}_{nT }\right)=\frac{1}{D}\sum_{d=1}^{D}\prod_{t=1}^{T}\prod_{j=1}^{J}{\left\{\frac{\exp\left[\left({\sigma }^{d}\beta +\gamma {\eta }^{d}+\left(1-\gamma \right){\sigma }^{d}{\eta }^{d}\right){\mathrm{x}}_{njt}\right]}{\sum_{k=1}^{J}\exp\left[\left({\sigma }^{d}\beta +\gamma {\eta }^{d}+\left(1-\gamma \right){\sigma }^{d}{\eta }^{d}\right){\mathrm{x}}_{nkt}\right]}\right\}}^{{y}_{njt}}.$$
(5)
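The simulation behind Eqs. (4) and (5) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the Nlogit routine used in the study: it uses pseudo-random rather than shuffled draws, assumes a diagonal \(\Sigma\) (uncorrelated deviations), and draws the scale as \(\exp(-{\tau }^{2}/2+\tau e)\) with \(e\) standard normal, which is one way to obtain a lognormal scale with mean 1. The function names, attribute layout, and numeric values are hypothetical.

```python
import math
import random


def logit_probs(coefs, alternatives):
    """Standard logit probabilities for one choice task.
    alternatives: list of attribute vectors, one per alternative."""
    utilities = [sum(c * x for c, x in zip(coefs, alt)) for alt in alternatives]
    top = max(utilities)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]


def simulated_sequence_prob(beta, sd, tau, gamma, tasks, chosen,
                            n_draws=500, seed=1):
    """Average over draws of the product of the logit probabilities of the
    chosen alternatives. Assumption: uncorrelated normal deviations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Lognormal scale with mean 1 (assumed parameterization).
        sigma = math.exp(-0.5 * tau ** 2 + tau * rng.gauss(0.0, 1.0))
        eta = [s * rng.gauss(0.0, 1.0) for s in sd]
        coefs = [sigma * b + gamma * e + (1.0 - gamma) * sigma * e
                 for b, e in zip(beta, eta)]
        prob = 1.0
        for alts, j in zip(tasks, chosen):
            prob *= logit_probs(coefs, alts)[j]
        total += prob
    return total / n_draws


# Hypothetical attributes: [high prob. new species, high prob. rare species, km].
# The last alternative in each task is the opt-out (all zeros).
beta = [0.6, 0.9, -0.02]
tasks = [
    [[1, 0, 30], [0, 1, 60], [0, 0, 0]],
    [[0, 1, 20], [1, 1, 80], [0, 0, 0]],
]
chosen = [0, 1]  # index of the alternative chosen in each task

# No taste or scale heterogeneity: collapses to a product of MNL probabilities.
p_mnl = simulated_sequence_prob(beta, [0.0, 0.0, 0.0], 0.0, 1.0,
                                tasks, chosen, n_draws=10)
# Taste and scale heterogeneity with an intermediate gamma.
p_gmnl = simulated_sequence_prob(beta, [0.3, 0.4, 0.01], 0.5, 0.5,
                                 tasks, chosen)
```

Setting all standard deviations and \(\tau\) to zero reproduces the plain MNL sequence probability, which is a convenient sanity check on any implementation.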

All parameters were estimated using the maximum likelihood estimator. Table 3 details values assigned to key parameters of models largely used in DCE studies. All these alternative models could be interpreted as constrained GMNL models. Models in Table 3 were estimated through Nlogit 6.0. Simulations were based on 1000 shuffled draws (Hess et al. 2003).

Table 3 Estimated models and values assumed by key parameters

When the weights of the arguments of the utility function have a discrete distribution and, consequently, the overall population is segmented into unobserved groups, the econometric analysis was based on a standard latent class (LC) model, in which only latent preference classes are considered, and on a scale-adjusted LC (SALC) model, in which each individual belongs to a latent preference class and an unobserved scale class. In SALC models, preferences vary among classes but are homogeneous within each class; the same holds for the scale parameter, which is constant within each scale class but varies among scale classes. The SALC model allows for scale heterogeneity within preference classes but, similarly to the GMNL model, it cannot disentangle scale from taste heterogeneity; moreover, the class-specific scale parameter captures all forms of within-class correlation in the real world (Hess and Train 2017).

Following Magidson and Vermunt (2007), the probability that the individual n chooses the ith alternative in the tth choice situation, given that they fall in the qth latent preference class (among the Q taste classes) and the dth unobserved scale factor class (among the D scale classes) is:

$$P\left(nit|q,d\right)=\frac{exp\left({\lambda }_{d}{\beta }_{q}^{*{{\prime}}}{x}_{nt,i}\right)}{\sum_{j}exp\left({\lambda }_{d}{\beta }_{q}^{*{{\prime}}}{x}_{nt,j}\right)},$$
(6)

where \({\lambda }_{d}\) is the scale parameter for scale class d and \({\beta }_{q}^{*}\) is the vector of taste parameters for preference class q. The prior probabilities for preference class q and for scale class d are, respectively:

$${H}_{\left.nq\right|d}=\frac{\exp\left({\theta }_{q}^{^{\prime}}{z}_{n}\right)}{\sum_{q}\exp\left({\theta }_{q}^{^{\prime}}{z}_{n}\right)},$$
(7)
$${G}_{\left.n\right|d}=\frac{\exp\left({\gamma }_{d}^{^{\prime}}{z}_{n}\right)}{\sum_{d}\exp\left({\gamma }_{d}^{^{\prime}}{z}_{n}\right)},$$
(8)

where \({z}_{n}\) is the vector of k socioeconomic and attitudinal characteristics for individual n (including a constant), and \(\theta\) and \(\gamma\) are the respective parameters of such variables in Eqs. (7) and (8). All parameters were estimated through maximization of the log-likelihood function:

$$\ln L=\sum_{n=1}^{N}\ln {P}_{n}=\sum_{n=1}^{N}\ln\left[\sum_{d=1}^{D}\sum_{q=1}^{Q}{G}_{n\left|d\right.}{H}_{nq\left|d\right.}\prod_{t=1}^{T}\frac{\exp\left({\lambda }_{d}{\beta }_{q}^{*{{\prime}}}{x}_{nt,i}\right)}{\sum_{j}\exp\left({\lambda }_{d}{\beta }_{q}^{*{{\prime}}}{x}_{nt,j}\right)}\right].$$
(9)

The posterior probability that an individual falls in a specific latent preference and scale class was then estimated using the Bayes rule. For identification, one scale parameter was normalized to unity so that the other scale parameters are expressed as ratios relative to the reference one. The optimal number of classes was not automatically determined by the model itself, but was selected through appropriate information criteria (Scarpa and Thiene 2005).
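The class-conditional probability, the mixture likelihood, and the Bayes-rule posterior described above can be illustrated for a single respondent. The sketch below is a hedged Python illustration, not the Latent Gold procedure: the function names are hypothetical, the membership logits stand in for the products of membership parameters and \({z}_{n}\) and are passed in precomputed, and all numbers are invented.

```python
import math


def softmax(values):
    """Logit-type probabilities from a list of linear indices."""
    top = max(values)
    expv = [math.exp(v - top) for v in values]
    total = sum(expv)
    return [e / total for e in expv]


def salc_posteriors(betas, lambdas, taste_logits, scale_logits, tasks, chosen):
    """Return (likelihood, posterior) for one respondent.
    posterior[(q, d)] is the Bayes-rule probability of belonging to
    taste class q and scale class d, given the observed choices."""
    H = softmax(taste_logits)  # prior taste-class probabilities
    G = softmax(scale_logits)  # prior scale-class probabilities
    joint = {}
    for q, beta in enumerate(betas):
        for d, lam in enumerate(lambdas):
            seq = 1.0
            for alts, i in zip(tasks, chosen):
                utils = [lam * sum(b * x for b, x in zip(beta, alt))
                         for alt in alts]
                seq *= softmax(utils)[i]  # probability of the chosen alternative
            joint[(q, d)] = G[d] * H[q] * seq
    likelihood = sum(joint.values())  # respondent's contribution before the log
    posterior = {k: v / likelihood for k, v in joint.items()}
    return likelihood, posterior


# Hypothetical inputs: attributes are [new species, rare species, km].
tasks = [
    [[1, 0, 30], [0, 1, 60], [0, 0, 0]],
    [[0, 1, 20], [1, 1, 80], [0, 0, 0]],
]
chosen = [1, 1]
betas = [[0.8, 1.0, 0.0], [0.5, 0.4, -0.03]]  # two taste classes
lambdas = [1.0, 0.4]                          # first scale class is the reference
likelihood, posterior = salc_posteriors(betas, lambdas,
                                        [0.0, -1.0], [0.0, -0.5],
                                        tasks, chosen)
```

Summing the logarithm of `likelihood` over respondents gives the quantity maximized in estimation; the posterior probabilities are what allow each respondent to be assigned to the most likely taste and scale class.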

LC models were estimated using Latent Gold 5.1. The best model specification was selected by comparing the Bayesian information criterion (BIC) among models, which assumed a variable number of taste and scale classes. Variables included among covariates account for the birder’s profile, specialization level, and behavior in birding.

In the post-estimation analysis, we assessed marginal willingness to travel (MWTT). In the case of discrete taste parameters, MWTT was equal to the ratio between the relevant attribute’s coefficient and the coefficient of the distance. Confidence intervals for MWTTs were obtained through the Delta method (Hole 2007). In the case of the continuous distribution assumption, given that all attributes (including the distance) were assumed to be normally distributed, the distribution of MWTT was the ratio of two normal distributions. Thus, MWTT estimates were inferred at the population level using the estimated vector of means and variance–covariance matrix by taking the ratio of a large number of draws (10,000 Halton draws) from each distribution (Rischatsch 2009). For model specifications based on a continuous distribution for parameters, MWTT estimates were reported in terms of mean, median, and first and third quartiles.
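Both routes to MWTT can be sketched in Python. The code is a minimal illustration under explicit assumptions and does not reproduce the study's computations: MWTT is taken as minus the ratio of the attribute coefficient to the distance coefficient (a sign convention assumed here so that a desirable attribute yields positive kilometers), the Delta-method standard error is the usual first-order approximation for a ratio, and the simulation uses pseudo-random draws with a single correlation `rho` rather than Halton draws from the full covariance matrix. All function names and numbers are hypothetical.

```python
import math
import random
import statistics


def mwtt_point(beta_attr, beta_dist):
    """Point MWTT in km (assumed sign convention: minus the coefficient ratio)."""
    return -beta_attr / beta_dist


def mwtt_delta_se(beta_attr, beta_dist, var_attr, var_dist, cov):
    """First-order (Delta method) standard error of -beta_attr / beta_dist."""
    g_attr = -1.0 / beta_dist            # derivative w.r.t. beta_attr
    g_dist = beta_attr / beta_dist ** 2  # derivative w.r.t. beta_dist
    var = (g_attr ** 2 * var_attr + g_dist ** 2 * var_dist
           + 2.0 * g_attr * g_dist * cov)
    return math.sqrt(var)


def mwtt_simulated(mean_attr, sd_attr, mean_dist, sd_dist, rho,
                   n_draws=10000, seed=1):
    """Quartiles of the ratio of two correlated normal coefficients."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        b_attr = mean_attr + sd_attr * z1
        b_dist = mean_dist + sd_dist * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        ratios.append(-b_attr / b_dist)
    q1, median, q3 = statistics.quantiles(ratios, n=4)
    return q1, median, q3


# Hypothetical coefficients (not the estimates reported in the tables).
point = mwtt_point(0.9, -0.02)
se = mwtt_delta_se(0.9, -0.02, var_attr=0.04, var_dist=0.0, cov=0.0)
ci_95 = (point - 1.96 * se, point + 1.96 * se)
q1, median, q3 = mwtt_simulated(0.9, 0.5, -0.02, 0.01, rho=0.3)
```

Because the ratio of two normal variables has heavy tails, the mean of the simulated ratios can be unstable, which is one reason to report the median and quartiles alongside it.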

Finally, we used the best model specification based on continuous distribution (i) to simulate kernel density estimates for significantly correlated attributes (De Salvo et al. 2020b; Duong 2020; Scarpa and Thiene 2005), and (ii) to assess the probabilistic demand for bioecological attributes in terms of kilometers, through derived patterns of covariation across taste parameters. We simulated changes in kilometers caused by variation in the probability of observing a rare and a new species, and in bird species numerosity at the site. To execute these simulations, we used the R programming language and extracted 10,000 Halton draws from the population distributions.

3 Results and discussion

3.1 Continuous mixtures models

Table 4 reports estimates for models based on a continuous distribution for the attribute utility weights. The MNL model was estimated to obtain an initial insight into the data. All mean coefficients were significant at the p < 0.01 level or better. As expected, the coefficient for the distance was negative, confirming that an increase in travel distance implies, on average, a decrease in the site’s utility. Conversely, the relation between the probability of observing a new (or a rare) species and the probability that the site is chosen for birdwatching purposes was positive. Estimated signs for these attributes are coherent with the literature (Baral et al. 2007; Becker et al. 2009; Booth et al. 2011; Dissanayake and Ando 2014; Guimarães et al. 2014; Steven et al. 2017). Coefficient estimates suggest that the utility improvement is on average higher if there is a high probability of observing a rare species rather than a new species (0.90 vs 0.62). Further, the higher the likely number of species, the higher the probability that the site will be chosen (Becker et al. 2009; Dissanayake and Ando 2014). The significance of both dummy variables used to represent this quantitative biodiversity indicator confirms a non-linearity in the relation between the probability of selecting a site and the numerosity of species.

Table 4 Coefficient estimates for continuous mixture models

The inclusion of an individual-specific scale parameter improves model performance. This means, as suggested by Hess and Train (2017), that some forms of correlation among utility coefficients exist, and these forms are captured by the scale parameter. The S-MNL shows a higher log-likelihood, a lower Akaike Information Criterion (AIC), and a lower Bayesian Information Criterion (BIC) than does the MNL (from − 323.61 to − 312.54, from 657.20 to 637.10, and from 660.1417 to 640.6032, respectively). The τ parameter, which represents the standard deviation of the scale parameter, is significant (p < 0.01), whereas the mean scale parameter, sigma(i), is not significant.

Assuming randomness of both qualitative and quantitative bioecological site attributes and correlation among attributes (see RP-MNL model results), model performance improves on the basic MNL specification, in terms of log-likelihood, AIC, and BIC (from − 323.61 to − 276.89, from 657.20 to 593.80, and from 660.1417 to 605.4952, respectively). The RP-MNL with full correlation among site attributes allows for all sources of correlation, including scale heterogeneity. However, the several forms of heterogeneity cannot be empirically distinguished (Hess and Train 2017). In the RP-MNL specification, all random attributes showed a significant standard deviation. This means that preferences for qualitative and quantitative biodiversity dimensions are heterogeneous. Heterogeneity of birders’ preferences was thus confirmed even for a well-defined segment of advanced or specialized users (Hvenegaard 2002; Steven et al. 2017).

Among the GMNL models, the specification that best fits the data is the GMNL in which the γ parameter is not constrained. This specification shows the best performance in terms of log-likelihood (− 270.526), AIC (585.100), and BIC (597.9323). In this specification, random attributes all showed a highly significant standard deviation (p < 0.01). The τ parameter was also significant (p < 0.01) and captured any variation among utility coefficients in the real world that was not explicitly treated in other ways by the model (e.g., taste heterogeneity and correlation among attributes). These results, as observed by Keane and Wasi (2013), confirm that the GMNL with an unconstrained γ parameter is highly suitable to capture not only “extreme” or lexicographic behavior, in which choice is largely based on a single attribute, but also “random” behavior, which occurs when choice is influenced only slightly by observed attributes.

Table 5 reports the Cholesky matrix estimated for the (unconstrained) GMNL model specification. The diagonal values of this matrix represent the true level of variance for each random parameter once the cross-correlated parameter terms have been unconfounded; unobserved heterogeneity (including scale heterogeneity) is isolated. The statistical significance of diagonal Cholesky elements for the variables related to the probability of observing a new species, the probability of observing a rare species, and a medium numerosity of bird species at the site provides evidence of preference heterogeneity, even after allowing cross-correlations across attribute parameters. Examination of the off-diagonal elements of the Cholesky matrix revealed several statistically significant estimates. This implies significant cross-correlations among the random parameter estimates that otherwise could have been inappropriately confused within standard deviation estimates of each random parameter without Cholesky matrix decomposition and evaluation. Evaluation of the correlation terms revealed that the probability of observing a rare species is negatively correlated with the probability of observing a new species (ρ = − 0.88). Further, the probability of observing a rare species is negatively correlated to both a medium (ρ = − 0.57) and high (ρ = − 0.63) numerosity of bird species at the site. A medium numerosity of bird species at the site is positively and strongly correlated to a high numerosity of bird species at the site (ρ = 0.95), while distance is positively correlated to the probability of observing a rare species. However, this relation is weak (ρ = 0.27).
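The step from a Cholesky matrix to correlation coefficients can be made explicit with a short sketch. The Python code below assumes the standard relation in which the covariance matrix equals the lower-triangular Cholesky factor multiplied by its transpose; the function name is hypothetical and the 2 × 2 factor is invented for illustration (it does not contain the values in Table 5), built so that the implied correlation is − 0.88.

```python
import math


def cholesky_to_correlation(chol):
    """Correlation matrix implied by a lower-triangular Cholesky factor L,
    using Sigma = L L' and rho_ij = Sigma_ij / sqrt(Sigma_ii * Sigma_jj)."""
    k = len(chol)
    sigma = [[sum(chol[i][m] * chol[j][m] for m in range(k)) for j in range(k)]
             for i in range(k)]
    sd = [math.sqrt(sigma[i][i]) for i in range(k)]
    return [[sigma[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]


# Hypothetical 2 x 2 factor (not taken from Table 5).
chol = [
    [1.0, 0.0],
    [-0.88, math.sqrt(1.0 - 0.88 ** 2)],
]
corr = cholesky_to_correlation(chol)
```

In this construction a significant off-diagonal Cholesky element translates directly into a non-zero off-diagonal entry of the covariance matrix, which is why the off-diagonal estimates are read as evidence of cross-correlation among random parameters.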

Table 5 Cholesky matrix estimated for the GMNL specification

3.2 Discrete mixtures models

Table 6 reports estimates of log-likelihood and the BIC for models based on a discrete mixture of taste parameters and a variable number of latent and scale classes. Estimates reveal the best model is that which assumes two scale classes and two choice classes.

Table 6 Log-likelihood (LL) and Bayesian Information Criterion (BIC) for discrete mixture models

Table 7 shows the estimates for the LC and SALC models, displayed on the left and right of the table, respectively. The latter model (scale-adjusted LC model) shows the best fit, as previously stated, but also shows greater capacity in terms of choice class segmentation and an increase in the significance of the profile variables used to identify class membership. In this model, it is possible to identify two latent taste classes: “specific bird-lookers” (Class 1) and “quali-quantitative features addicted” (Class 2). Members of Class 1 are mainly interested in qualitative aspects of biodiversity and do not care about distance. If the number of species at the site increases from low to high, their utility increases significantly, even if the magnitude of this effect is limited (0.49). Conversely, members of Class 2 are attracted to all bioecological site attributes and are sensitive to distance. Their utility strongly depends on the magnitude of species numerosity and, to a lesser extent, on the probability of observing a new or a rare species.

Table 7 Coefficient estimates for discrete mixture models

Among the variables included to infer class membership, in the LC model, only education level and advanced ability to identify bird species were statistically significant (p < 0.10). In the SALC specification, years of experience in birdwatching and the average number of visits in the last three years were also significant (the latter variable, p < 0.05). The SALC model, compared with the LC model, thus improves the significance of parameters. According to the results, “specific bird-lookers” (Class 1) are on average less educated, involved in birdwatching for more years, and less skilled in identifying a high number of bird species (> 100). They declared a higher average number of visits in the last three years than those in Class 2. Our results are consistent with the previous empirical literature on birdwatching, and confirm the role played by activity participation, skills, and commitment in identifying segments with specific behavioural patterns (Curtin and Wilkes 2005; Kim et al. 2010; Scott et al. 2005).

In regard to class size, 75% of the sample fell in Class 1 and the remainder in Class 2; 49% of individuals in the first LC showed the same scale parameter and were grouped in the first scale class. The remaining individuals (26%) were included in scale Class 2. Similarly, for the second LC, more individuals fell into the first scale class (16% vs 8%).

Steven et al. (2017) identified two segments of birders: “quantity-driven birders” and “special-bird seekers”. The latter group assigns lower importance to diversity and endemic species site attributes than does the former, but shows more consistent preferences for threatened species. However, this segmentation arises when investigating birders with a highly variable level of specialization. In our study, we focus on specialized birders, and the existence of “quantity-driven” and “special-bird seekers” groups seems to be confirmed.

3.3 Post estimation of marginal WTT

Table 8 shows summary statistics of the marginal WTT for models based on the hypothesis of continuous, normally distributed random parameters (RPL, GMNL-I, GMNL-II, and GMNL). Similarly, Table 9 reports statistics for models that instead assume fixed parameters (MNL and S-MNL) or discrete randomly distributed parameters across classes of users (LC and SALC). Values reported in Table 8 suggest that, independently of the model and of the hypothesis on scale heterogeneity, MWTT distributions are asymmetric. As previously highlighted, the model that showed the best statistical performance among continuous coefficient distributions was the GMNL. In this case, MWTT values indicated that specialized birders are willing to travel, at the median, 45 km to visit a site where the probability of observing a new species is high, and 49 km for sites with a high probability of observing a rare species. The marginal WTT equaled 138 or 109 km to visit a site with a medium or high numerosity of bird species, respectively, instead of low numerosity.

Table 8 Marginal WTT (in km) in the hypothesis of continuous random parameters
Table 9 Marginal WTT (in km) in the hypothesis of fixed or discrete random parameters

In the SALC model (see Table 9), marginal WTT for LC 2 exhibited higher values compared with the full sample, with the sole exception of the attribute related to a high probability of observing a rare species. As previously stated, LC 2 captures the preferences of the smaller segment of birders, who are concerned with both qualitative and quantitative bioecological site attributes and who are, on average, more educated, with fewer years of experience, and less addicted to birdwatching in terms of the number of visits, but more skilled in identifying bird species. Marginal WTT for them was 99, 36, 205, and 177 km, respectively, to visit a site with a high probability of observing a new species, a high probability of observing a rare species, a medium number of bird species, or a high number of bird species. For advanced birders and birders more attracted to both qualitative and quantitative bioecological site attributes, estimates of marginal values of WTT to reach sites with particular bioecological attributes were consistent with De Salvo et al. (2020a). Such estimates indicate that most of our sample prefers to travel short distances (in general within 100 km from home) when the aim of the visit is viewing a specific (vagrant) bird (Callaghan et al. 2018).

Figure 2 shows the iso-quantile plots of bivariate kernel densities of coefficients for a high probability of observing a rare and a new species. The iso-quantile contours highlight the previously stated negative correlation between these bioecological indicators. Given the strong correlation (− 0.88), the curves are close to each other, concentric, and depict the same trend.

Fig. 2 Iso-quantile plots of bivariate kernel densities of individual coefficients for a high probability of observing a rare and a new species. Axes report the individual estimates of beta coefficients

Figures 3, 4 and 5 display the estimated choice probability functions with respect to distance for selected birdwatching sites. Taking as reference a site with a high probability of observing a rare species and a medium number of bird species, Fig. 3 shows that the choice probability (demand) for sites with a low probability of observing a new species decreased rapidly as the distance increased. Demand for sites with a high probability of observing a new species was less sensitive to increases in distance; however, at equal distance, it yielded lower probabilities than the former curve when the distance was below approximately 28 km.

Fig. 3 Choice probability functions for sites with a different probability of observing a new species

Fig. 4 Choice probability functions for sites with a different probability of observing a rare species

Fig. 5 Choice probability functions for sites with a different numerosity of bird species

Figure 4 indicates that a similar phenomenon arises when we take as reference a site with a high probability of observing a new species and a high number of bird species. In this case too, the two curves, relative to a low and a high probability of observing a rare species, respectively, show the same behavior, but they intersect at a greater distance (30 km) than the intersection point in Fig. 3. Finally, Fig. 5 displays the changes caused by distance increases in the choice probability for sites with a high probability of observing both rare and new species, when the number of bird species at the site is medium or high. As the distance increased, the predicted probability dropped rapidly, independently of the site’s numerosity of bird species.
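The crossing of two choice probability curves can be sketched with a simple binary logit against a reference alternative. This is an illustration under stated assumptions, not the estimated model: each site profile is given its own intercept and distance slope, the reference utility is fixed at zero, and all numerical values and function names (`choice_probability`, `crossing_distance`) are hypothetical.

```python
import math

def choice_probability(intercept, dist_slope, distance_km, ref_utility=0.0):
    """Binary logit probability of choosing the site over the reference."""
    v = intercept + dist_slope * distance_km
    return 1.0 / (1.0 + math.exp(ref_utility - v))

def crossing_distance(intercept_a, slope_a, intercept_b, slope_b):
    """Distance at which two linear-in-distance utilities are equal."""
    return (intercept_b - intercept_a) / (slope_a - slope_b)

# Hypothetical profiles: "low" starts higher but declines faster with distance;
# "high" starts lower but is less sensitive to distance.
low = {"intercept": 1.0, "dist_slope": -0.06}
high = {"intercept": 0.4, "dist_slope": -0.03}

d_star = crossing_distance(low["intercept"], low["dist_slope"],
                           high["intercept"], high["dist_slope"])

for d in (10, 40):
    p_low = choice_probability(low["intercept"], low["dist_slope"], d)
    p_high = choice_probability(high["intercept"], high["dist_slope"], d)
    print(f"d={d} km: P(low)={p_low:.3f}, P(high)={p_high:.3f}")
print(f"curves cross at about {d_star:.1f} km")
```

Below the crossing distance the steeper curve gives the higher probability, and beyond it the flatter curve does, which mirrors the qualitative pattern described for Figs. 3 and 4.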

A comparison of statistical performance between the best continuous and the best discrete coefficient distribution models suggests that GMNL outperformed SALC, given that the former has a lower BIC value. This result is consistent with Keane and Wasi (2013). However, from a practical point of view, both models are useful because they provide different, complementary insights. The GMNL model, although it does not allow us to infer the sources of heterogeneity, makes it possible to assess marginal welfare measures for the whole sample, as well as other post-estimation results (e.g., correlation among attributes, changes in demand due to distance increases), once unobserved heterogeneity (including scale heterogeneity) is isolated. The SALC model, instead, gives an intuitive understanding of the sources of heterogeneity across classes. As already highlighted, we found only two latent classes, probably because our sample included only advanced birders. The SALC results suggest that distance affects birders’ preferences for the site in only one latent class, here named “quali-quantitative features addicted”. Thus, the SALC model is useful to demonstrate the existence of this sub-segment of advanced birders and to derive, even if only for this class, significant marginal WTT estimates.
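For reference, the comparison above relies on the Bayesian information criterion, whose standard definition is given below. Here $\hat{L}$ is the maximized likelihood, $k$ the number of estimated parameters, and $N$ the number of observations; whether $N$ counts respondents or choice observations is not stated in this excerpt and is left open.

```latex
\mathrm{BIC} = -2 \ln \hat{L} + k \ln N
```

A lower BIC indicates a better trade-off between fit and parsimony, so a model with more parameters is preferred only if its gain in log-likelihood outweighs the $k \ln N$ penalty.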

4 Conclusions

Like previous studies, we find that specialized birders have significant preferences for natural areas offering appropriate birding opportunities, especially in terms of observing rare and unusual bird species (Callaghan et al. 2018; Steven et al. 2017). This result indicates that both qualitative and quantitative biodiversity matter in birding site selection, even if preferences are extremely heterogeneous and well defined in specific classes of advanced birders. In particular, our study shows that specialized birders are interested in visiting places characterized by a high probability of observing rare and new species and by numerous bird species, even if the latter attribute does not have a linear effect on birders’ utility.

Although the suitability of a natural site to attract specialized birders depends on qualitative and quantitative biodiversity levels, our findings indicate that this capability is marginally low, in terms of distance, for qualitative aspects, and higher for increases in quantitative attributes, such as species numerosity. When the probability of observing a new or a rare species changes from low to high, the site’s catchment area increases by approximately 45–50 km. The radius could rise to approximately 130 km if the number of species moves from low to medium. Such effects appear more relevant for one segment of specialized birders: the “quali-quantitative features addicted”. We find that this class comprises specialized birders with high levels of education, who have been involved in birdwatching for fewer years and are less addicted to birdwatching in terms of the number of visits, but are more skilled in identifying a high number of bird species. In general, we observe that variables related to the multidimensional concept of recreation specialization, as well as to individual characteristics, act as segmentation drivers (Kim et al. 2010).

Findings also reveal that a significant, strong, and negative correlation exists between the probability of observing a rare and a new species. This correlation indicates the presence of a segment of specialized birders—“bird seekers”—which includes birders interested in specific species, rather than in rare species and species never observed before (Steven et al. 2017).

Further, our study demonstrates how the choice probability (demand) of specialized birders varies according to changes in the natural site profile. Demand for sites hosting a rare or unusual species is highly sensitive to variation caused by an increase in the probability of observing such bird species. If this probability changes from low to high, specialized birders are willing to travel greater distances. The same sensitivity is not observed if the change concerns the abundance of bird species.

To conclude, we believe that our analysis could be usefully employed in the management of birdwatching sites to predict changes in conservation actions, enlarge catchment areas, design customizable birdwatching tours tailored to target species, design marketing strategies aimed at enhancing the image of sites by associating them, for instance, with flagship bird species that are appealing to advanced birders, and trigger greater demand from specialized users, who show, on average, a higher WTT for birdwatching.