1 Introduction

Annual global electricity generation from wind farms has grown steadily over the past decades and is expected to continue to grow. For example, the International Energy Agency predicts that annual wind capacity additions will reach 210 GW in 2030, up from 95 GW in 2021 (IEA 2022, 293). Obvious reasons include the low CO\(_{2}\) emissions compared to fossil fuel-based generation, low operating costs, and, from the countries' perspective, reduced dependency on energy imports. While these advantages can be perceived as valuable for society in general, wind farms are simultaneously sources of various negative externalities (Zerrahn 2017; Sovacool et al. 2021). Residents in close proximity may be particularly affected by noise, shadow flicker, and visual deterioration of the landscape, which may result in negative welfare changes (Mattmann et al. 2016; Onakpoya et al. 2015; Liebich et al. 2021). In this context, many hedonic pricing studies investigate whether these externalities translate into price variations for properties proximate to wind farms. Notably, the empirical evidence is ambiguous, with many studies failing to find a significant (negative) effect of the presence of wind turbines on property values (Parsons and Heintzelman 2022; Dorrell and Lee 2020; Brinkley and Leach 2019). Furthermore, studies that do find a significant effect show a large variance in its magnitude, mostly between \(-10\%\) and \(10\%\), but with some estimates well beyond this range. Given this ambiguity, it is of great interest for researchers, decision-makers, and property owners to understand whether and under which conditions wind turbines significantly affect local property values.

I address this open question in the first comprehensive meta-analysis on this topic, combining 720 estimates from 25 hedonic pricing studies that estimate price-distance relationships between wind turbines and residential properties. Bayesian Model Averaging (BMA) techniques are used to identify sources of heterogeneity attributable to the data characteristics of the primary studies and their ability to control for confounding factors. In addition, I apply a series of novel tests for publication selection bias that allow the calculation of a bias-corrected mean effect size.

Some authors contributing to this literature have already speculated about reasons for the significant differences across studies in terms of reported effect sizes. In particular, Hoen and Atkinson-Palombo (2016) argue that insignificant findings may reflect underpowered studies with access to only small samples of truly affected properties, i.e., few observations close to wind turbines. Accordingly, for a small effect size, the respective study would be unable to identify such an effect even if it exists. There is also suggestive evidence that site conditions such as local opposition (Heintzelman et al. 2017), degree of visibility (Sunak and Madlener 2016) or the number of turbines in proximity (Dröes and Koster 2016) can partly explain the studies' differing findings. In excellent reviews, Parsons and Heintzelman (2022) and Möllney (2022) show that the econometric specification (e.g., avoidance of omitted variables bias, treatment of endogeneity problems) and data characteristics (e.g., distance of treated properties, type of analysed property prices) differ across studies, potentially leading to contrasting results. Similarly, Möllney (2022) acknowledges the difficulties in comparing the empirical findings due to the diversity of metrics employed to measure wind turbine impacts. Parsons and Heintzelman (2022) also conduct what they call a “mini meta-analysis” as part of their qualitative review. That is, they calculate the simple mean effect size for different wind-turbine-to-property distances using 18 observations. They document a \(-4.5\%\) devaluation for houses at 1 km of distance, with the effect fading out after 4.5 km. However, this finding should be interpreted with caution since they (i) do not correct for publication bias, (ii) do not include other control variables, (iii) do not give more weight to more precise estimates, and (iv) use only a subset of the available estimates. I address all of these issues in the methodological setup of this meta-analysis, thus allowing a systematic investigation of the existence of price effects of wind turbines on property values at the aggregate level. Building upon these lines of argumentation, I contribute to this literature (i) with a systematic assessment of the causes of effect size heterogeneity and (ii) by calculating an average effect size corrected for publication and misspecification bias.

The results do not confirm a substantial effect of the presence of wind turbines on residential property values. In fact, after correcting for publication and misspecification bias, the resulting effect size is \(-0.68\%\) for properties 1.89 miles away, falling to zero beyond 2.8 miles of distance. The simple average of reported effect sizes would instead have indicated a decrease in property values of \(2.14\%\). Selective under-reporting of significant positive findings is responsible for a 22% overestimation of the effect (based on the preferred publication bias correction method, RoBMA-PSMA, suggested by Bartoš et al. (2023)). Important study characteristics for avoiding misspecification bias include, e.g., the accuracy of the distance calculation, the use of spatial controls, and reliance on a difference-in-difference estimation design.

The remainder of the paper is structured as follows. Section 2 summarizes the data collection process and gives an overview of the resulting sample. Section 3 assesses the severity of publication selection bias. Section 4 explores the heterogeneity-explaining factors relying on BMA and presents the results. Section 5 discusses the findings and concludes. The study selection process and coding decisions are explained in detail in the appendix. Finally, model diagnostics for BMA and results for alternative specifications are also reported in the appendix.

2 Data

The strategy to identify relevant studies followed the current guidelines for meta-analyses in economics (Havránek et al. 2020). I used a predefined search query with placeholders for “property values”, “wind turbines” and “hedonic pricing” to find relevant studies. I complemented the list with forward and backward searches using citations and reference lists of already identified studies, respectively (see Table 4 for a list of search terms). For consideration, studies must use the hedonic pricing method to estimate a price-distance relationship for residential properties and wind turbines. Accordingly, I refrain from combining estimates with different (i) underlying welfare measures (e.g., contingent valuation and hedonic pricing studies), (ii) wind turbine impact metrics (e.g., distance and view), or (iii) property types (e.g., residential properties and farmlands). In the appendix, I outline in more detail how these study selection criteria ensure a consistent set of effect size estimates that can reasonably be meta-analysed. There, I also summarize the study selection process (Fig. 6) and list the included studies (Table 5) with their details (Table 6) as well as the excluded studies by reason for exclusion (Table 7). The resulting final dataset consists of 720 observations from 25 studies. The search was conducted in December 2021 and documented using the reference management software Citavi. The data and code are available via https://doi.org/10.17605/OSF.IO/UB37W.

All included studies rely on a semi-logarithmic functional form to elicit the price-distance relationship such that the reported distance coefficients represent semi-elasticities. Accordingly, these semi-elasticities serve as the dependent variable. They can be interpreted as the percentage change of residential property values at the average distance of a treated house.
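For concreteness, the functional form and the implied interpretation can be stated as follows (a stylized illustration of the semi-logarithmic model, not the exact specification of any particular primary study):

$$\ln P_{i}=\alpha +\delta \, WT_{i}+X_{i}'\gamma +\varepsilon _{i}, \qquad \%\Delta P=100\,\bigl (e^{\delta }-1\bigr )\approx 100\,\delta \text { for small } \delta ,$$

where \(P_{i}\) denotes the property price, \(WT_{i}\) the wind turbine variable (e.g., a proximity dummy or distance measure), and \(X_{i}\) the vector of hedonic controls. A reported coefficient of \(\delta =-0.0215\), for instance, corresponds to \(100\,(e^{-0.0215}-1)\approx -2.13\%\).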

Fig. 1 Histogram of effect size estimates after removing three outliers exceeding \(+100\%\) and \(-50\%\), respectively. The solid vertical line depicts the sample mean

Figure 1 shows the distribution of estimates. Most estimates are negative but small in magnitude, with a mean of \(-2.15\%\) and a median of \(-1.67\%\). Still, the range of estimates is substantial, with a minimum and maximum of \(-66\%\) and \(+109\%\), respectively. Given the large variation, I exclude three outliers exceeding \(+100\%\) and \(-50\%\), which lie at the extreme ends of the effect size distribution, well beyond all other observations. The main results are not affected by using different outlier criteria or no outlier criterion at all; see Sect. 4.4. This variation is also reflected in Fig. 2, in which the effect size estimates of each study are summarized in boxplots, ordered by the studies' publication year. Clearly, the estimates vary considerably, both across and within studies. Additionally, there seems to be a trend toward more negative findings and narrower confidence intervals over time. This might reflect, e.g., better data availability or more refined methodological choices in more recent studies (Hoen and Atkinson-Palombo 2016; Parsons and Heintzelman 2022).

Fig. 2 Boxplots of effect size estimates for every primary study after removing three outliers exceeding \(+100\%\) and \(-50\%\), respectively. Studies are sorted in ascending order by publication year. The boxes denote the inter-quartile range (P75–P25), with the mean shown as a solid vertical line. Whiskers extend up to 1.5 times the IQR from the P25 and P75, if applicable. Dots reflect outlying observations within a study. The solid vertical line depicts the sample mean

3 Publication Bias

Publication selection bias commonly describes the distortion of a field of literature due to journals' selection of studies or researchers' selection of findings based on statistical precision, magnitude, direction of effect, or any combination thereof (Stanley and Doucouliagos 2012). Accordingly, if researchers or editors expect the presence of wind turbines to either reduce or not affect property values, significantly positive estimates may not be selected for publication, leading to a distorted picture of the literature. Figure 3 shows a funnel plot where the effect size (horizontal axis) is set in relation to its precision \(=\frac{1}{SE}\) (vertical axis). With no publication bias present, the most precise estimates at the top should mirror the genuine magnitude of the effect, and estimates should disperse symmetrically around this true mean as precision decreases. Here, the most precise observations have a corresponding effect size close to zero. There is a tendency for relatively more negative results to be found among more imprecise estimates, i.e., moving down the inverted funnel. This is a first indication that significant positive findings of wind turbine effects on property values may be under-represented due to publication bias.
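For illustration, such a funnel plot can be reproduced with a few lines of base R; the data frame `meta` and its columns `effect` and `se` are hypothetical stand-ins for the actual meta-dataset (here filled with simulated values):

```r
# Funnel plot sketch: effect size (x-axis) against precision 1/SE (y-axis).
set.seed(1)
meta <- data.frame(se = runif(300, 0.3, 6))          # simulated stand-in
meta$effect <- rnorm(300, mean = -2, sd = meta$se)   # for the actual data

plot(meta$effect, 1 / meta$se,
     xlab = "Effect size (%)", ylab = "Precision (1/SE)",
     pch = 19, col = rgb(0, 0, 0, 0.4))
abline(v = weighted.mean(meta$effect, 1 / meta$se^2), lty = 2)  # WLS mean
```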

Fig. 3 Funnel plot relating effect size estimates to their reported precision (\(\frac{1}{SE}\)). Without publication bias, the plot should take the shape of an inverted funnel. The most precise estimates are omitted for ease of exposition but included in all calculations

Next to visual tools, many formal methods exist to detect and correct publication bias. Simulation studies show that there is no single best method; instead, the performance of these methods depends on the magnitude of the effect size, the level of heterogeneity in the literature, and the severity and type of publication bias (Alinaghi and Reed 2018; Hong and Reed 2021). Here, I employ the recently developed RoBMA-PSMA approach, implemented in the R package RoBMA, which combines the competing techniques (Bartoš et al. 2023). This method is beneficial for two main reasons. First, it aims for objectivity by testing 36 competing models simultaneously, weighting the results by their fit to the data using BMA. Second, it captures uncertainty about the publication selection mechanism by including models that assume reporting based on p-values (so-called selection models) and those that assume selection based on significance and effect magnitude. Bartoš et al. (2023) show that their approach outperforms most conventional techniques but also advise complementing their method with other concepts. Accordingly, I also employ other techniques shown to perform well under conditions frequently met in empirical economic settings, i.e., multiple estimates per study, omitted variables bias, or a continuous dependent variable (Alinaghi and Reed 2018; Bom and Rachinger 2019). This encompasses the Endogenous Kink method introduced by Bom and Rachinger (2019), the selection model advocated by Andrews and Kasy (2019), and the p-uniform* method by van Aert and van Assen (2023). Finally, I also consider methods that use only a subset of the available observations to calculate a bias-corrected mean effect size (and are hence not covered by the RoBMA-PSMA framework). This includes the stem-based method introduced by Furukawa (2019), which builds on the funnel plot logic, and the “Top Ten” approach, which uses only the ten percent most precise observations (Stanley et al. 2010).
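As a sketch of the workflow (not the paper's exact estimation script), such an ensemble can be fitted with the RoBMA package, which additionally requires a working JAGS installation; the column names again refer to the hypothetical `meta` data frame from above:

```r
# Model-averaged publication bias correction with the RoBMA package
# (Bartoš et al. 2023). In recent package versions, the default ensemble
# corresponds to RoBMA-PSMA, combining selection models with models that
# relate effect sizes to their standard errors.
library(RoBMA)

# The semi-elasticities are passed as generic effect sizes with their SEs.
fit <- RoBMA(d = meta$effect, se = meta$se, seed = 42)
summary(fit)  # model-averaged mean effect and publication bias components
```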

The results of the tests for publication bias are summarized in Table 1. Regardless of the chosen method, the corrected effect size is considerably smaller (in absolute terms) than the unweighted mean (OLS estimate) (\(-0.06\) to \(-1.76\) vs. \(-2.14\)), confirming publication bias. This is also in line with the visual impression gained from the funnel plot depicted in Fig. 3. Stanley and Doucouliagos (2015) show that in the presence of publication bias, the unrestricted weighted least squares (WLS) estimator, i.e., using inverse-variance weights, gives a more realistic representation of the true unconditional mean; I report it for completeness. Indeed, at \(-0.38\), this simple estimator is within the range of bias-corrected means. With RoBMA-PSMA, the corrected mean effect size is \(-1.67\). This corresponds to an absolute reduction in effect size magnitude of about 22% compared to the unweighted mean. Turning to the other methods, the p-uniform* estimate is of similar magnitude (\(-1.76\)), while the Andrews and Kasy (2019) and the Endogenous Kink methods correct more strongly (\(-0.29\) and \(-0.16\)). The remaining methods that use only small subsets of the most precise observations similarly induce a strong correction of the mean effect size (“Top Ten”: \(-1.03\); Furukawa: \(-0.06\)). This is no surprise, considering that the most precise estimates are clustered around zero in this case.
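For reference, the unweighted mean and the unrestricted WLS estimator reduce to two intercept-only regressions (a minimal sketch using the hypothetical `meta` columns from above):

```r
# Unweighted mean (OLS) vs. the unrestricted WLS estimator of Stanley
# and Doucouliagos (2015) with inverse-variance weights.
ols <- lm(effect ~ 1, data = meta)
wls <- lm(effect ~ 1, data = meta, weights = 1 / se^2)
coef(ols)  # analogue of the -2.14 unweighted mean in Table 1
coef(wls)  # pulled toward the precise estimates (cf. -0.38 in Table 1)
```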

Table 1 Mean effect size without and with correction for publication bias

This section confirms that publication bias is present in the hedonic literature on wind turbines and proximate property values. The unweighted mean of \(-2.14\%\) is inflated, and the bias-corrected effect size is likely around \(-1.67\%\).

In economic terms, the range of estimates translates to a slight average decrease in property values if a wind turbine is present. None of these estimates, however, takes into account the large differences in reported effect size magnitude within and across studies. This could be problematic for two main reasons. First, other sources of bias linked to measurement error or misspecification in the primary studies could systematically influence the magnitude of the reported effect. This could lead to an over- or underestimated mean effect size if only publication bias is corrected. Second, if study design choices are systematically related to the likelihood of publication or the precision of the estimates, the supposed presence of publication bias could, in fact, mirror true heterogeneity. Accordingly, I explore the drivers of the observed heterogeneity in the next section.

4 Heterogeneity

The goals of this section are threefold: first, to identify the elements of study design responsible for the observed effect size heterogeneity; second, to establish whether publication bias can still be confirmed once other controls for study differences are in place; and third, to calculate the expected effect of wind turbines on property values corrected for misspecification and publication bias.

4.1 Moderator Selection

A rich set of 42 moderators was coded to identify relevant dimensions in which the selected studies differ. The selection of moderators is based on recommendations in the general meta-analytic literature (Stanley and Doucouliagos 2012; Havránek et al. 2020), existing reviews (Parsons and Heintzelman 2022; Möllney 2022; Brinkley and Leach 2019) and study differences becoming apparent during the data-coding process. The moderators are grouped into four categories reflecting the primary studies' (1) data characteristics, (2) control variables, (3) specification of the wind turbine impact, and (4) publication-related information. I summarize the variables in Table 2.

4.1.1 Data Characteristics

The primary studies included in the analysis examine very different samples of properties and wind turbines to estimate the wind turbine effect. Accordingly, I define ten variables controlling for data characteristics. For an accurate calculation of the distance between wind turbine(s) and properties, exact coordinates are required. Two variables control whether coarser data, e.g., wind farm or postcode centroids, lead to significantly different reported effect sizes (Wind coordinates, Property coordinates). Similarly, I include a variable reflecting the reported mean distance of treated properties from the wind turbine site to test whether the price-distance relationship exhibits a distance decay (Distance). Turning to the sampled properties, some studies have access to actual sales transaction data, while others rely on asking or assessed prices. Likewise, while most studies analyse effects on already built properties, some focus on residential land. The dummies Sales and Res. land reflect these differences. Next, I control for the sample size and the time span of the sampling period (Sample size, Sample duration). Finally, three moderators reflect the general context in which the data were sampled: a dummy controls whether the study was conducted in the USA or elsewhere (USA), and two variables reflect the share of wind energy (Share wind) and the share of renewables (Share renewables) in the respective country's electricity mix at the midpoint of the study's sampling period.

4.1.2 Control Variables

Although the hedonic pricing framework does not define an exact set of variables to be included in the analysis, the omission of relevant control variables may lead to misspecified models and, in turn, to biases in the reported effect sizes (Wooldridge 2010; Phaneuf and Requate 2016). Consequently, the dummy variables Structure var, Neighbourhood var, Access and Demoecon capture whether studies control for characteristics of the sampled houses (e.g., age, number of rooms), the neighbourhood (e.g., road noise, prison presence), access options (e.g., distance to the central business district or highway), or socio-economic factors (e.g., income levels, crime rate, population density), respectively. Similarly, the estimated wind turbine effect may differ for studies that control for other proximate (dis-)amenities (e.g., distance to a park, beach, industrial facility or landfill). The dummies Oth amen and Oth disamen reflect this possibility. Furthermore, some studies can address the endogeneity problem using a difference-in-difference design (Kuminoff et al. 2010; Bishop et al. 2020; Greenstone and Gayer 2009), i.e., taking advantage of information on prices before and after treatment (temporal variation) as well as on prices of proximate and distant properties (spatial variation). Studies of this type can account for pre-existing price differentials and frequently interpret the estimated coefficient as a causal price change due to the presence of proximate wind turbines. Accordingly, I include the dummy DID to differentiate DID studies from those with a standard hedonic pricing framework. Moreover, there is a consensus in the hedonic pricing literature that time-invariant unobservable spatial effects should be reflected in the econometric specification for proper identification (Parsons and Heintzelman 2022). Accordingly, I control whether the omission of spatial controls results in changes in reported effect sizes using the dummy variable Spatial. Finally, the moderator OLS distinguishes studies using ordinary least squares from those with other estimation approaches (e.g., instrumental variables or maximum likelihood).

4.1.3 Specification of Wind Turbine Impact

The chosen econometric specification of the wind turbine impact may influence the effect size. Accordingly, I distinguish specifications with one treatment zone from those with several (Dist binary). Similarly, I test whether considering additional wind-turbine-impact variables next to the distance variable, such as the number of turbines or the degree of visibility, leads to differences in the estimated effect (Other wind). Furthermore, three dummies test the influence of differences in the definition of the treatment period. First, I distinguish whether the estimated effect size refers to the announcement or the construction date of a wind turbine (Announcement effect). Second, some studies consider both the announcement and the construction date in their econometric specification, while others define treatment for only one point in time (AE and CE). Finally, in a few cases, authors test whether the effect size adjusts over time, mirroring a habituation effect (Adjustment control).

4.1.4 Publication

Taking the presence of publication bias into account, I include three variables to test whether it can still be confirmed once the controls for heterogeneity introduced above are in place. These are the standard error of the reported coefficient (SE), a dummy signalling its significance (Sig), and a variable reflecting peer-review status in a scientific journal (Reviewed). Including the standard error formalizes the funnel plot logic on which the methods by Furukawa (2019) and Bom and Rachinger (2019) are based. The significance dummy mirrors selection based on p-values (Andrews and Kasy 2019; van Aert and van Assen 2023). The peer-review control captures two aspects. First, it should reflect particularities of the studies linked to their quality that are not captured by the variables introduced above. Second, since the peer-review process critically examines, e.g., the methodological choices, the econometric specification, and other decisions made by the respective authors, this process may have induced changes in the reported results. Thus, this dummy moderator measures the magnitude of another potential facet of publication selection. Next, I include the publication year to account for time-trend effects (Year publish). Finally, in some cases, the standard error was not reported. If other precision measures were available instead (e.g., t-value or p-value), they were converted accordingly (see the detailed explanation of the standardization of the precision measure in the appendix and Fig. 7). In other cases, however, only information on the significance level of an estimate was given. For these estimates, I set the precision at a conservative level. The dummy Precision set is included to assess the influence of this coding decision.
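The conversion logic can be summarized in two small helper functions (hypothetical names; the exact rules applied in the paper are documented in the appendix):

```r
# Recover the standard error from alternative precision measures: from a
# t-value, SE = |b / t|; from a two-sided p-value, invert the t (or, with
# df = Inf, the normal) distribution first.
se_from_t <- function(b, t) abs(b / t)
se_from_p <- function(b, p, df = Inf) abs(b / qt(1 - p / 2, df))

se_from_t(b = -2.5, t = -1.96)  # approx. 1.276
se_from_p(b = -2.5, p = 0.05)   # approx. 1.276, consistent with the above
```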

These moderators reflect the most prominent aspects of the included studies and, thus, account for key differences in data and methodology. Additional moderators, e.g., the level of urbanity in the area of the sampled properties, the degree of turbine visibility, or the presence of local opposition, were initially considered but dismissed. These variables were inconsistently defined and could, therefore, not be used in a comprehensive comparison across studies. Additionally, I tested the added value of finer-grained categorical moderators instead of dummies (e.g., using different significance levels instead of a significance dummy). If there were no changes to the results, I opted for model parsimony using a dummy specification (see the code available via https://doi.org/10.17605/OSF.IO/UB37W for details). Finally, for another subset of moderator candidates (i.e., wind turbine and additional sample size characteristics), information was missing for several studies. I include these variables in Table 2 for completeness (marked with an asterisk). However, to conserve sample size, I conduct separate analyses with this type of moderator, see Sect. 4.4.

Table 2 Definition and summary statistics of variables

4.2 Estimation

The choice of the correct meta-analytic model is a context-specific and data-driven issue, which involves (at least) decisions on treating publication bias as well as meta-model and moderator selection. Publication bias has clearly been confirmed, and I subsequently defined several relevant variables that can be considered to address this issue in a regression framework. Regarding model selection, I adopt a standard multivariate meta-regression model in the baseline specification. That is

$$\begin{aligned} wind~effect_{i}=\beta _{0}+\sum _{k=1}^{K}\beta _{k}\,M_{k,i}+\upsilon _{i}+\epsilon _{i}, \qquad i=1,\ldots ,N \end{aligned}$$
(1)

where each reported effect size \(i\) is related to a set of \(K\) study characteristics used as moderators \(M_{k,i}\), with \(\upsilon _{i}\) reflecting unobserved heterogeneity assumed to follow \(\upsilon _{i}\sim N (0,\tau ^{2})\) and \(\epsilon _{i} \sim N (0,{\widehat{\sigma }}^{2}_{i})\) representing the error term. Equation 1 is estimated using WLS with random-effects weights, i.e., \(w_{i}=\frac{1}{\widehat{\sigma ^{2}_{i}} + \tau ^{2}}=\frac{1}{\widehat{SE^{2}_{i}} + \tau ^{2}}\), where \(SE_{i}\) is the standard error corresponding to reported effect size \(i\). This gives greater weight to more precise observations in the dataset and addresses heteroscedasticity in the error term, which naturally occurs when combining estimates with different variances from several studies. I assess the effect of choosing this baseline specification on the results using a series of robustness checks, including different assumptions on the error term, weighting schemes, and data dependency. The main results are unaffected by the model selection, as discussed in Sect. 4.4.
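One common way to estimate Eq. 1 with exactly these random-effects weights is the rma() routine of the metafor package; the sketch below uses the hypothetical `meta` data frame from above with two illustrative moderators (not the paper's estimation code):

```r
# Random-effects meta-regression: WLS with weights 1/(SE_i^2 + tau^2),
# where tau^2 is estimated from the data (here via REML).
library(metafor)

set.seed(2)
meta$distance <- runif(nrow(meta), 0, 5)   # illustrative moderators
meta$did      <- rbinom(nrow(meta), 1, 0.5)

re_fit <- rma(yi = effect, sei = se, mods = ~ distance + did,
              method = "REML", data = meta)
summary(re_fit)  # tau^2 estimate and moderator coefficients
```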

Finally, while all moderators introduced above may systematically influence the effect size, using all of them jointly in a single regression may obscure true data patterns, since some moderators will prove collinear in explaining wind turbine effects on property values. Accordingly, I use BMA to address model ambiguity. This technique does not require selecting one particular specification for the meta-regression. BMA has become a frequently used tool for addressing model uncertainty in economics in general and in meta-analysis in particular (Havranek et al. 2015; Matousek et al. 2022); for a recent overview, see Steel (2020). The following brief description covers the basics of this technique and introduces the terms needed for inference in the subsequent analysis.

In BMA, all possible combinations of moderators are estimated in individual regressions, and a weighted average of these models is constructed. The weights correspond to the posterior model probabilities. This measure reflects the fit of an individual specification conditional on the data and model parsimony, analogous to adjusted \(R^{2}\) in a frequentist setting. The importance of each moderator is measured in terms of the posterior inclusion probability (PIP), i.e., the sum of the posterior model probabilities of all regressions that include the specific variable. This corresponds to statistical significance in frequentist econometrics (Steel 2020). Here, with 34 potential moderators with full observations, \(2^{34}\) models could be estimated. To reduce computational complexity, I follow common practice and rely on the Markov Chain Monte Carlo algorithm included in the bms package for R (Zeugner and Feldkircher 2015), which considers only the most promising models, i.e., those with the highest posterior model probabilities.

BMA requires the selection of priors for the model space and the regression coefficients (the so-called g-prior). Without sufficient knowledge about the coefficients' magnitudes, I choose the popular unit information prior as the g-prior in the baseline specification, which performs well in simulations (Eicher et al. 2011). This prior assumes a zero mean for all coefficients and carries about the same information content as one observation. For the model space, I use the uniform model prior (Eicher et al. 2011), which gives each model the same prior probability. I assess the sensitivity of the results to other common choices for the prior structure in robustness checks; see Sect. 4.4.
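A minimal sketch of the corresponding bms call follows, assembling a data frame whose first column is the effect size and whose remaining columns are the moderators (in the paper's weighted setting, the variables would be transformed beforehand to reflect the random-effects weights; the toy columns continue the illustration from above):

```r
# BMA over the moderator space (Zeugner and Feldkircher 2015) with the
# unit information g-prior and the uniform model prior, as in the baseline.
library(BMS)

bma_data <- data.frame(effect = meta$effect,
                       distance = meta$distance, did = meta$did,
                       se = meta$se)
bma_fit <- bms(bma_data, g = "UIP", mprior = "uniform",
               mcmc = "bd", burn = 1e5, iter = 2e5)
coef(bma_fit)   # PIP, posterior mean and SD for each moderator
image(bma_fit)  # inclusion plot analogous to Fig. 4
```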

4.3 Results

Figure 4 illustrates the results of BMA. The columns represent individual regression models sorted by their posterior model probability, starting with the best model on the left. The vertical axis lists the variables in descending order of posterior inclusion probability, indicating importance. A blank cell indicates that the variable is not included in the respective model. A blue cell implies that the corresponding regression coefficient is positive, while a red cell signals a negative coefficient. The best model in terms of posterior probability on the left includes twelve of the 34 variables used in the analysis. These variables are also the only ones with a posterior inclusion probability above 0.5. This threshold signals a non-negligible effect of these variables on the effect size in the classification of Kass and Raftery (1995). According to this rule of thumb, a moderator has a weak, positive, strong, or decisive impact on the effect size if its PIP lies between 0.5 and 0.75, 0.75 and 0.95, 0.95 and 0.99, or 0.99 and 1, respectively. All other variables do not systematically influence the magnitude of the estimated effect.

Fig. 4 Model inclusion probability of moderators. The response variable is the estimated price-distance coefficient relating wind turbines and property values. The columns represent individual models sorted by posterior model probability. The variables are depicted on the vertical axis, ordered by posterior inclusion probability in descending order. A blue (red) cell indicates the inclusion of the variable in the model and that the estimated sign is positive (negative). A blank cell indicates that the variable is not included in the model. The uniform model prior and the unit information prior (Eicher et al. 2011) are used for the model space and the coefficients, respectively. Corresponding numerical results are presented in Table 3

Table 3 gives the corresponding numerical results of BMA. The left panel reports the posterior mean, posterior standard deviation, and posterior inclusion probability of each explanatory variable's regression coefficient. The right panel shows the results of a frequentist WLS check including the twelve variables with a posterior inclusion probability of 0.5 or higher. The estimated coefficients in both panels have the same signs and similar magnitudes, as well as the same statistical importance (posterior inclusion probability in the BMA setting and its frequentist equivalent, the p-value). Accordingly, the results of the frequentist check are consistent with the baseline BMA. In the following, I present the results by moderator category.

4.3.1 Data Characteristics

The type and quality of the analysed data are central determinants of the estimated effect size. In particular, using exact wind turbine coordinates for the distance calculation seems essential to estimating the price-distance relationship reliably. Relying instead on, e.g., wind farm centroids induces an imprecise estimation of this relationship, reflected in the large coefficient of this variable. Moreover, the type of analysed property price is an important dimension of effect size variance. Using actual sales data instead of asking prices or assessed values is associated with significantly more positive findings of about 3.26 percentage points. One possible interpretation of this finding is that residents who decide to offer their property include a price discount in their asking price, mirroring the (subjectively perceived) lower value due to wind turbine presence. Similarly, assessors seem to devalue properties with a proximate wind turbine on average. Apparently, both types of property prices (asking and assessed) lead to inflated negative estimates of the effect of wind turbines compared to actual sales prices. The type of analysed property also affects the effect size. Investigating the price effect on undeveloped residential land instead of residential buildings leads to more negative estimates on average (\(-9.54\) percentage points). This may be due to the increased visibility of wind turbines when no structure has been built yet. Note, however, that only one study (Sunak and Madlener 2017), contributing 15 observations, relies on residential land values. Accordingly, I caution against generalizing this finding, even though it remains robust to the inclusion of study-level fixed effects. In line with expectations, I find that the values of properties located at greater distances from a wind turbine are less affected, i.e., for each additional mile of distance from a wind turbine, property values increase by 0.73 percentage points, ceteris paribus. Additionally, estimates for countries with a comparatively higher share of wind power (or renewable energies in general) in their electricity mix at the time of the respective study document more negative effects on property values on average. One possible explanation might be that with an increased share, wind turbines are built closer to residential areas since more remote sites have already been used. Finally, neither the sampling duration, the sample size, nor the USA dummy influences the reported effect size systematically.

Table 3 Bayesian model averaging results

4.3.2 Control Variables

Using adequate control variables proves essential to disentangling the wind turbine effect from other price-influencing factors. In particular, studies lacking sufficient data to control for unobserved local price differentials using, e.g., spatial fixed effects or a repeat-sales approach (Spatial) report more negative effect sizes (about eleven percentage points). Additionally, studies unable to control for pre-existing price differentials via a difference-in-difference design (DID) generally report more negative effect sizes, as expected (about \(-2.43\) percentage points). Similarly, studies accounting for socio-economic factors like income levels or population density (Demoecon) document less adverse wind turbine effects. Additionally, if other amenities like a park or beach are present, the estimated wind turbine effect is more negative on average (Oth amen). Apparently, having a living-quality-enhancing element in close vicinity worsens the effect of wind turbines. Other moderators reflecting the inclusion of control variables for house-structure characteristics, neighbourhood aspects, infrastructure access options, or the presence of other disamenities do not systematically influence the wind turbine effect. Similarly, neither estimation approaches other than OLS nor the choice of approach to control for time trends affects the reported wind turbine impact systematically.

4.3.3 Specification of Wind Turbine Impact

Studies differ to a great extent in the way the wind turbine impact is specified. However, the reported effect size is largely unaffected by design choices in this dimension. The only exception is the dummy Announcement effect, which shows that choosing the announcement date of a wind turbine as treatment results in more negative estimates compared to using the construction date. Using one treatment zone (Dist binary), i.e., a binary distance specification, does not lead to significantly different findings compared to categorical specifications with multiple treatment zones. In the same vein, controlling for announcement and construction effects simultaneously (AE and CE) or for a potential habituation effect (Adjustment control) does not change the effect size systematically. Additionally, the standard approach of pooling observations from different sites to increase sample size (Pooled) does not alter the reported estimates. Surprisingly, studies using other wind turbine controls next to the distance variable (e.g., view, number of turbines) do not report estimates smaller in absolute terms, i.e., more positive price-distance effect sizes, as would be expected if the other wind turbine controls absorbed some of the effect. Apparently, other moderators prove more important in explaining the effect size variance.

4.3.4 Publication

The combined findings for three variables show the existence and type of publication bias. First, I find that significant estimates of wind turbine effects (Sig) are more negative on average (\(-2.46\) percentage points). Second, the standard error (SE) is not a systematic factor in explaining the heterogeneity, i.e., there is no evidence of selection based on the magnitude of the effect size relative to its standard error. Finally, studies published in peer-reviewed journals do not differ systematically in terms of effect size magnitude from unpublished manuscripts. Linking these findings, I conclude that publication bias is still confirmed with heterogeneity-explaining factors in place. The type of publication bias is a one-sided selection that disfavours significantly positive estimates of the impact of wind turbines on property values. The publication year does not affect the effect size. Reassuringly, the coding decision to set the precision level at specific values in cases where this metric is reported imprecisely (see again Fig. 7 for details) does not affect the estimated mean effect size.

4.4 Robustness Checks

The robustness of the results is assessed from several perspectives. First, I change the BMA prior settings. This includes (i) substituting the uniform model prior with the dilution prior, which accounts for collinear moderators in each particular model (George 2010), and (ii) combining the benchmark g-prior with the beta-binomial model prior (Fernandez et al. 2001; Ley and Steel 2009), which implies equal prior probability for each model size. I compare the results in terms of variable importance in Fig. 5. I conclude that the results are largely insensitive to the selection of priors, with the exceptions mentioned above for the dummies Announcement effect and Oth amen, which lose importance under the alternative priors. The corresponding numerical results are summarized in Table 8. They confirm the main findings.

Fig. 5 Posterior inclusion probabilities with changed priors. UIP and uniform = baseline setting used in Table 3 with priors following Eicher et al. (2011). UIP and Dilution = uniform model prior exchanged for the dilution prior (George 2010). BRIC and Random = benchmark g-prior for the coefficients (Fernandez et al. 2001) combined with the beta-binomial model prior (Ley and Steel 2009)

Next, I run a series of frequentist robustness checks based on the set of moderators with a PIP \(>0.5\) identified by the baseline BMA summarized in Table 3. I document the findings in the appendix. First, several moderators with missing observations that were not considered in the main specification are added in separate regressions. These moderators reflect wind turbine characteristics (average turbine height, installed capacity, number of turbines, number of wind farms) as well as additional aspects of the property sample used in the respective primary study (number of treated properties, number of properties at the construction stage of the wind turbine(s) to which the coefficient refers, number of properties within one mile of a turbine, number of properties within one mile of a turbine after turbine construction). Both types of variables are suspected to influence the effect of wind turbines on property values; e.g., multiple turbines are presumably related to more negative effects due to the increased likelihood of visual impacts or noise (Jensen et al. 2014; Jensen et al. 2018). At the aggregate level, however, these factors do not translate into economically significant changes in the reported effect size; see Table 9. Only one of the additionally included moderators is statistically significant, i.e., a larger number of treated properties analysed by the primary studies is associated with more negative findings. Most of the other moderators are robust, with only minor changes that can be attributed to the reduced number of observations in these subsample regressions (717 in the baseline specification vs. 393 if the number of treated properties is included as a moderator).

In a second set of specifications, I investigate the effect of different outlier definitions; see Table 10. These range from no outlier criterion to the omission of 89 observations under the inter-quartile-range criterion following Tukey (1977). The results are robust to changes in this dimension. Finally, in Table 11, I document the sensitivity of the results to modifications of the meta-analytic model. This includes (i) changes to the estimation (using heteroscedasticity-robust standard errors and standard errors clustered at the study level), (ii) adding study-level fixed effects, and (iii) changing the weighting scheme to inverse-variance weights (WLS-FE) or using no weights (OLS) (reported for completeness). While the main findings are confirmed, some points are worth noting. Altering the assumptions for calculating the error term does not change the results. Using OLS reduces the explanatory power as expected (\(R^{2}\) drops from 0.669 to 0.218), and several variables lose significance. This again demonstrates the need to rely on WLS estimation in meta-regression analysis. The choice of weights (fixed or random effects), by contrast, is less critical, with only minor changes documented in the WLS-FE framework due to the unbalanced weighting scheme.
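Study-level clustering, for example, can be implemented on top of the WLS fit with the sandwich and lmtest packages; the sketch below reuses the illustrative objects from the previous sections together with a hypothetical study identifier `study_id`:

```r
# Frequentist check with standard errors clustered at the study level.
library(sandwich)
library(lmtest)

set.seed(3)
meta$study_id <- sample(1:25, nrow(meta), replace = TRUE)  # illustrative
tau2 <- re_fit$tau2  # random-effects variance from the metafor fit above

wls_fit <- lm(effect ~ distance + did, data = meta,
              weights = 1 / (se^2 + tau2))
coeftest(wls_fit, vcov = vcovCL(wls_fit, cluster = meta$study_id))
```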

Finally, when study-level fixed effects are added, the dummies Sales and Oth amen lose significance. This is probably due to the fact that only a few studies do not use sales data and that Oth amen was already found to be less robust in other BMA settings. The change in sign and significance of Share renewables is due to the high share of renewables in Sweden of about \(60\%\) during the study period of Westlund and Wilhelmsson (2021, 2022), significantly higher than the average of \(19.45\%\) in this meta-dataset. Omitting these observations results in coefficients similar to those in the baseline specification, but with a loss of significance for Share renewables. I therefore caution against generalizing the findings related to this variable.

4.5 Implied Effect Size

As the final step of the analysis, I compute the wind turbine impact on residential property values conditional on the absence of publication and misspecification bias. To this end, I calculate the average effect size that can be expected for a hypothetical study following “best practice” regarding methodology and data quality. Specifically, I use the results from the baseline BMA analysis and compute fitted values of the effect size at specific values of the variables with PIP \(>0.5\). While certain aspects of study design are arguably preferable to others, any best-practice specification remains subjective by design. To increase plausibility, I follow a conservative calculation approach: when there is good reason to prefer a particular type of study design, I use the preferred value (e.g., I use 1 for the dummy variable corresponding to the DID design); otherwise, I use the respective sample mean to reflect my indifference.

To permit an accurate distance calculation, I prefer exact wind turbine coordinates. Similarly, I consider estimates from actual sales prices to reflect wind turbine impacts more realistically. For the distance of treated properties as well as Share wind and Share renewables, I use the respective sample means (1.89 miles, \(19.45\%\), and \(2.29\%\)). Similarly, I multiply the Res. land coefficient by 0.5 to reflect my indifference. In terms of control variables, I prefer rich data sets that allow controlling for socio-economic factors (Demoecon) and unobserved spatial factors (Spatial), as well as the application of a DID design (DID), all of which may lead to misspecification bias if not accounted for. The presence of other amenities, by contrast, is not a study-quality dimension, so I remain agnostic regarding this moderator. Finally, to correct for publication bias, I set the significance dummy to 0.
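With a BMS fit as sketched in Sect. 4.2, this calculation amounts to evaluating the model-averaged regression at the chosen moderator vector; the sketch below uses the illustrative toy columns from above, not the paper's full moderator set:

```r
# "Best practice" fitted value: start from the sample means, then impose
# the preferred design choices (in the paper: exact coordinates, sales
# data, DID, spatial and socio-economic controls, Sig = 0, Res. land at
# 0.5). Remaining moderators stay at their sample means.
x_star <- colMeans(bma_data[, -1])  # column 1 holds the effect size
x_star["did"] <- 1                  # e.g., prefer the DID design
x_star["se"]  <- 0                  # publication-bias-free benchmark
predict(bma_fit, newdata = x_star)  # conditional effect size (cf. -0.68%)
```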

The calculated conditional effect size for this best-practice specification is \(-0.68\%\). This is considerably smaller than the unconditional, only publication-bias-corrected mean effect size of \(-1.67\%\) identified with RoBMA-PSMA in Table 1, underlining the importance of correcting for misspecification bias to obtain a realistic effect size estimate. The conditional effect size is by definition sensitive to changes in the best-practice specification. For example, setting the distance to a hypothetical value of 0 changes the effect size to \(-2.04\%\). For a distance of one mile, a reduction in property values of \(1.31\%\) can be expected. The effect becomes zero at a distance of 2.8 miles (i.e., 4.5 km). This corresponds to a cut-off point often used in primary studies, beyond which no effect of wind turbines is suspected, and is identical to the result of the “mini meta-analysis” by Parsons and Heintzelman (2022). The effect of closer distances on the conditional effect size is also reflected in the subsample regressions shown in Table 12, which only include observations with a maximum distance of two miles and one mile, respectively. While the main results remain largely unaffected, the conditional effect size is more negative (but insignificant) at a distance of one mile.
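Holding the remaining best-practice values fixed, these numbers trace out the linear distance profile implied by the Distance coefficient of 0.73 percentage points per mile from Table 3 (the small gap to the reported \(-0.68\%\) at 1.89 miles reflects coefficient rounding):

$$\widehat{effect}(d)=-2.04+0.73\,d, \qquad \widehat{effect}(1)=-1.31\%, \quad \widehat{effect}(1.89)\approx -0.66\%, \quad \widehat{effect}(2.8)\approx 0.$$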

5 Discussion and Conclusion

This meta-study is the first to systematically assess the hedonic literature on the price-distance relationship between wind turbines and property values. It addresses the considerable ambiguity in the empirical findings on the existence and magnitude of this relationship. Combining 720 observations from 25 studies using BMA and novel publication bias correction methods, I identify the most essential moderators explaining the observed heterogeneity and calculate an average effect size for this relationship.

Three main conclusions emerge from this study. First, selective under-reporting of significant positive findings is responsible for overestimating the effect size by about 22% in absolute terms (correcting the unconditional mean effect size from \(-2.14\%\) to \(-1.67\%\)). This is in line with the ubiquitous publication bias documented in large parts of the economics literature (Bartoš et al. 2022). Second, next to selective reporting, various data characteristics (e.g., the accuracy of the distance calculation, the distance of treated properties, the type of property price data) as well as the ability to control for confounding factors, i.e., using a DID framework to account for pre-existing price differentials and including appropriate controls for socio-economic and unobservable spatial factors, explain the considerable variation in empirical findings. Third, conditional on the absence of misspecification and publication bias, the effect size is \(-0.68\%\) for properties 1.89 miles away when calculating a (subjective) best-practice average. For distances greater than 2.8 miles, there is no evidence of a wind turbine effect. These results are robust to changes in the BMA setup, the meta-model specification, and the outlier treatment.

These findings can inform future research and policymakers in at least three ways. First, future hedonic pricing studies on this subject should rely on a proper identification strategy using a DID design and a rich dataset with sufficient control variables. This ensures that pre-existing price differentials and other confounding factors are not wrongly attributed to the presence of wind turbines. Since the effect is small, studies should also rely on many observations, especially in close vicinity to wind turbines, to have enough statistical power to detect an effect if it exists in the respective setting. In addition, recent methodological advances that reflect the staggered nature of the treatment, i.e., the fact that observations from different wind farms with different corresponding construction dates are pooled, should be adopted (de Chaisemartin and D’Haultfoeuille 2022, 2020; Steigerwald et al. 2021).

Second, future meta-analyses on this topic could construct alternative effect size variables from other impact measures used in this literature (e.g., view, continuous distance, or number of turbines within a certain distance) that could not be used in this study to ensure comparability. This would help to better understand what drives the occasionally documented negative effects of wind turbines on property values. In addition, future meta-analyses focusing on other energy generation facilities (e.g., nuclear power plants or solar farms) could help to place the results of this study in the general context of the effects of disamenities on property values. Compared to the few existing meta-analyses that consider other types of disamenities and focus on price-distance relationships, the effects of wind turbines are in the lower range of estimates. Schütt (2021), for example, documents that property values increase on average by between \(1.5\%\) and \(2.9\%\) per mile of increased distance from waste sites. Lipscomb et al. (2013) report an increase in value of \(6.1\%\) per mile of increased distance for properties close to contaminated water bodies.

Finally, for policymakers, the aggregated evidence from the literature could indicate the appropriateness of financial compensation for homeowners with properties very close to wind turbines. Although the effect size is small on average, payments could be appropriate to acknowledge the “local cost, global benefit” (Frondel et al. 2019) nature of the localised externalities from wind turbines. In addition, the calculated effect sizes for different distances can now be used by policymakers to make informed decisions about distance rules for wind turbines and residential areas that are appropriate to the individual context. As the announcement of future wind turbines appears to have a more negative effect than the actual construction, the results could also be used for early communication with local residents during the wind turbine siting process to reduce concerns. Such early communication is also consistent with the results of a related meta-analysis on the non-market valuation of wind energy externalities, which documents that visual deterioration in particular leads to welfare losses (Mattmann et al. 2016). Given the expected continued growth in global electricity generation from wind farms (IEA 2022), it is clearly in the general interest to increase public acceptance of wind turbines.