Multivariate Landscape Analysis of Honey Bee Winter Mortality in Wallonia, Belgium

The European honey bee species (Apis mellifera L.) is under increasing pressure from anthropogenic and other stressors. Winter mortality of entire colonies is generally attributed to biological, environmental, and management conditions. The rates of winter mortality can vary extremely from place to place. A landscape approach is used here to examine the dependency between spatially distributed winter mortality rates, environmental and biological conditions, and apiary management. The analysis was applied to data for the region of Wallonia in Belgium with winter mortality rates obtained from the European project EPILOBEE. Potential explanatory variables were spatially allocated based on GIS analysis, and subjected to binomial linear regression to identify the most predominant variables related to bee winter mortality. The results point to infestation with Varroa, the number of frost days, the potential flying hours, the connectivity of the natural landscape, and the use of plant protection products as most dominant causes for the region of Wallonia. The outcomes of this study will help focus beekeeping and environmental management to improve bee health and the effectiveness of apiary practices. The approach surpasses application to the problem of bee mortality and could be used to compare and rank the causes of other environmental problems by their significance, particularly when these are interdependent and spatially differentiated.


Introduction
In addition to honey production, honey bees such as Apis mellifera L. in Europe are essential to support biodiversity as they are the most important pollinators of multiple plant species. As much as 80% of all pollination is attributed to honey bee activity [1]. Worldwide increased (winter) mortality rates are observed among honey bees as well as solitary bees [2][3][4][5]. In the past, high winter mortality rates were reported for France for the winter of 1999-2000 [2,6], the USA [7,8], and more recently Belgium [4] and China [9]. Winter mortality of complete colonies is generally attributed to a complex combination of biological and environmental conditions, and poor apiary management. Reported causes of winter mortality include the level of infestation with the Varroa mite [10][11][12], the connectivity of the natural landscape [13], the use of plant protection products [14][15][16][17][18][19][20], and beekeeper experience and practices [5,21].
Winter mortality rates are a key indicator for the weakness of bee colonies [2,4,6,[21][22][23] with mortality rates of 10% being considered acceptable [3]. In Belgium the reported average bee mortality rates for the winters 2008-2009 and 2009-2010 were well above this threshold with 18% and 26% respectively [21]. More recently, a feasibility study was carried out by some of the authors for the region of Flanders to examine the significance of potential causes for bee winter mortality by means of linear regression [22]. The methodology was considered useful although the number of apiaries was insufficient to draw definitive conclusions on the main causes of bee winter mortality. In the follow-up study called BeeHappy-Wallonia, the analysis shifted to the region of Wallonia [24], applying binomial instead of linear regression. This was considered more appropriate in view of the binomial nature of the sampled mortality rates for the colonies. Compared with the region of Flanders, the southern region of Wallonia is less urbanised with a higher proportion of agricultural and natural land use. This can be expected to affect the winter mortality rates, as honey bees are affected by the quality of the landscape and food availability.
The analysis presented in this manuscript is related to the study for the region of Wallonia and was organised in three steps. First, spatially differentiated data for the winter mortality rates and potential causes, here referred to as explanatory variables, were identified. All data were collected and spatially allocated on a 1-ha raster grid, using GIS analysis. Next, the winter mortality rates and raw data for the explanatory variables were subjected to an iterative binomial regression analysis with a stepwise model design including non-linearities and interactions between terms, using the Akaike information criterion for model selection [25]. Finally, the resulting regression models were evaluated on the presence of spatial autocorrelation and overdispersion. The data sampling, regression modelling, and results are discussed in the following sections.

Data Sampling of Winter Mortality Rates
The winter mortality rates were obtained for the region of Wallonia from the EU-funded EPILOBEE project [3]. Belgium was one of the 17 EU member states participating in the project, with the Federal Agency for the Safety of the Food Chain (FASFC) as responsible administration. The main aim was to get an overview of honey bee colony losses on a harmonised basis in each of the participating EU countries, by collecting data for the two consecutive winters of 2012-2013 and 2013-2014. The surveillance protocol was based on common guidelines for honey bee health provided by the European Reference Laboratory ANSES [26]. In Belgium on average 15 apiaries were selected in each of the 10 provinces out of the total number of 3000 apiaries registered by the FASFC in 2012 [22]. The number of registered apiaries in 2012 represents less than half of the total number of apiaries in Belgium. The data were collected with a uniform spatial distribution over Belgium (Fig. 1). For Wallonia a total of 300 colonies were sampled in 74 apiaries for the winter 2012-2013 and 307 colonies in 73 apiaries for the winter 2013-2014 [3], with a sample size between 1 and 6 colonies. The average size of these samples was considered sufficient to derive reliable estimates for the mortality rate at the apiary level. This analysis at the apiary instead of the colony level is inspired by the level of accuracy for the description of the environmental conditions. Furthermore, the data on Varroa infestation could not systematically be linked to the mortality rates at the colony level due to limitations in the quality and quantity of the data.
The winter mortality rates for Belgium ( Fig. 1) do not show a clear visible spatial pattern. Nevertheless, the average winter mortality rates were lower for the region of Wallonia, with 31% for the first winter and 9% for the second winter, based on the sample size per apiary and ratio of raw counts [22].
A complete specification of all variables, data sources, and sampling protocols used in this study is found in Table 1. The available land use data allowed assessment of spatially differentiated environmental conditions that could have a negative impact on the foraging activities, such as the fragmentation of the open landscape. For some of the explanatory variables such as the meteorological variables, data were only available as point data for a limited number of locations. Spatial kriging interpolation using a spherical semivariogram was used here to translate data for individual weather stations into a map covering the region, ensuring data were available for all apiary locations with mortality rates.
The geospatial interpolation of the meteorological variables requires further clarification. The variable Flying Hours (Fig. 2) was obtained for each weather station by multiplying the number of sunshine hours per day with the proportion of the day between sunrise and sunset with a temperature above 10°C [22]. Next, geospatial kriging interpolation [35] was used to project the results on a 1-ha grid.
An identical procedure was used to derive the geospatial distributions for the variables Frost Days and Brood Season Temperature [22]. The number of (critical) frost days is taken into account because, after a period of higher temperatures at the end of the winter, bees will increase the distance between each other in the hive. This results in less protection against a sudden cold period. The variable Brood Season Temperature refers to the winter bees which are needed for the colony survival. If the temperature remains too high during a prolonged summer, the transition to winter bees occurs too late. The average temperature of the period September-October is considered to be representative for the brood season. A distinct spatial distribution pattern could be observed for Frost Days and Brood Season Temperature as well as the other explanatory variables (Fig. 3).
A honey bee typically forages over a distance of up to 1 km, with a maximum of 3 km around the beehive [46]. This behaviour can be represented by a circular, normalised distance decay mask around the beehive. For every explanatory variable, a distance decay function (Fig. 4), adopted from Hagler [47], was used to multiply within the foraging area the data with the distance weights for each pixel, followed by summation to obtain an aggregated value for the apiary. An exception was made for Food Availability and Plant Protection Products, which were averaged without distance decay mask because honey bees adapt their foraging pattern to the direction of food-rich locations [22,47].
The data for some apiaries were excluded from the analyses. Apiaries without data on the infestation with Varroa, known to be an important cause of bee mortality, were excluded from the analysis to avoid data inconsistency. Apiaries with colonies that were transported during the flying season were excluded as well because the apiary location did not correspond to the actual exposure of the bees. Hence, only 136 out of the 147 available apiaries were taken into account [24]. The datasets generated during this study are available on reasonable request.

Data Grouping
An important step in the analysis is to examine the presence of collinearity and to group the explanatory variables based on a significant positive or negative correlation, or functional similarity. The cross correlations of the explanatory variables are shown in Fig. 5.
The following conclusions were drawn from the correlation analysis: & a significant positive correlation is observed between most of the herbicides, insecticides, and fungicides. This can be attributed to the homogeneous distribution of these products over the agricultural parcels; & a significant positive correlation is observed between Urbanisation, NO 2 and PM 2.5 , Telecom Tower Density, and glyphosate. The latter correlation can be explained by the intensive application of herbicides in urbanised areas; & strong negative correlations are observed of Food Availability, and to a lesser extent Landscape Connectivity with the herbicides, insecticides, and fungicides; & NO 2 and PM 2.5 are positively correlated with Brood Season Temperature and negatively with Food Availability and Landscape Connectivity.
The observed correlations between the different insecticides, herbicides, and fungicides point to a need for further grouping prior to the regression analysis. Correlations exist between some of the variables related to environmental conditions and apiary management, but these variables are difficult to group while excluding them from the analysis is not desirable. Therefore, two different datasets were created for the analysis:  the plant protection products: fungicides (captan, mancozeb, propiconazole, tebuconazole, and thiram), herbicides (glyphosate), and insecticides (abamectin, alphacypermethrin, beta-cyfluthrin, cypermethrin, dimethoate, imidacloprid, spinosad, and thiamethoxam). & Both datasets were subjected to the regression modelling.

Regression Modelling
The bee winter mortality was analysed by means of logistic regression, using the logit canonical function where Y is the winter mortality as a fraction of the total number of colonies in the apiary, x i are predictors based on linear or nonlinear functions of the explanatory variables and products of these functions, and β i the regression coefficients. Binomial regression is preferable over ordinary linear regression in case data are based on trials from samples of a limited sample size, in this case the number of dead colonies (between 0 and the maximum of 6 observed colonies per apiary). This sample size (number of colonies) in each apiary was taken into account in the binomial regression.
The adjusted Akaike information criterion or AICc [25,47] is a useful, relative measure of the quality of statistical models for a given set of data: Table 1 Definition and sampling protocol used in this study for the explanatory variables [24] Beekeeper Experience total number of years with practice of the apiarist in apiculture [3] Beekeeper Practice index in the range 0-3 [3], combining binary scores for beekeeper organisation membership, apiculture qualification (professional or not), and the coordination/cooperation with other beekeepers Brood Season Temperature kriging interpolation [35] [36]. The flowering season was obtained from data on bee crops [37] and plant phenology [38]. Four attractiveness classes were used: no attractiveness (0); no attractiveness, unless there are flowering weeds present (0.2); attractive at the time that bloom on the field takes place (0.8); unconditional attractiveness (1.0) Frost Days kriging interpolation [35] of data of the Belgian Meteorological Institute a , summing the number of frost days at 34 stations after a period of 5 consecutive days with a maximum temperature equal to or larger than 10°C in the months of February and March, version 2012 and 2013 Landscape Connectivity based on data from the Environment Outlook for Wallonia 2014 [39] which uses the land use map for Wallonia [40] and the Jaeger index [41] to measure the fragmentation of natural habitats. A low Jaeger index is obtained for highly fragmented areas, corresponding to a low Landscape Connectivity NO 2 yearly average concentration of NO 2 in μg/m 3 obtained from modelled air quality data from the Life ATMOSYS project b , version 2012 and 2013 Plant Protection Products fourteen active substances, selected from a long list of insecticides, fungicides, and herbicides, based on the toxicity towards bees, application in the field, knowledge on acute bee mortality incidents, and synergetic effects: the insecticides abamectin, alpha-cypermethrin, beta-cyfluthrin, cypermethrin, dimethoate, imidacloprid, spinosad, thiamethoxam; the fungicides captan, mancozeb, propiconazole, tebuconazole, thiram; and the herbicide glyphosate. The average use of these products for 2012-2013 [42,43] was assessed using monitoring data on agricultural use, sales figures [42], and expert judgement for seven combined land use/crop categories: outdoor fruit, outdoor vegetables, maize, pastures, oilseed rape, ornamental outdoor plant cultivation, and other green. The spatial allocation was based on the agricultural parcel maps of 2012 and 2013 [44] and the land use map for 2007 [40]. Crop categories were obtained by aggregating crop types from the agricultural parcel map, excluding indoor cultivation not accessible for honey bees. The use of plant protection products was assumed to be homogeneously distributed over all corresponding parcels of the specific crop category Urbanisation proportion of built-up area (i.e. excluding nature and agricultural land use) within a 1.5-km radius based on the land use map for the region of Wallonia [40] Varroa Infestation number of mites per 100 bees, measured by application of the alcohol wash method to the samples for each colony [3]. Aggregation to the level of the apiary based on the maximum infestation over the colonies in the apiary (worst-case assessment) a https://www.meteo.be; b http://www.life-atmosys.be where k is the number of independents, N is the sample size, and L is the value of the likelihood function for the model. Models with a lower AICc value are preferred.
The general procedure for the regression analysis consisted of three distinct steps: 1. Application of four different functions (linear, squared, cubic, and natural logarithm) to the variables, followed by mean centering.

2.
Stepwise binomial regression analysis of the meancentred predictors, adding one regressor at a time, starting from the intercept model. The selection of the regressor is based on the maximal reduction of the AICc value. This procedure is executed twice: first considering predictors as main effects only, and a second time allowing for two-way interaction effects too. The stepwise model selection proceeds by adding regressors one by one to the model until the AICc index reaches a minimum and additional expanding the model leads to an increase due to the penalty effect and lack of improvement of the model fit.
The final model is validated by checking for overdispersion, using a dispersion index, defined by the ratio of the residual sum of squares (SSE) and number of degrees of freedom. Overdispersion is reflected by a value significantly larger than 1.
Spatial autocorrelation can be detected with the Global Moran Index (GMI) [48]: where N is the total number of spatial locations with residual x, μ is the spatial mean of the residuals, w ij is the weight assigned to the distance pair (x i , x j ), and W the sum of all weights. The GMI falls in the range (− 1, 1). A value significantly different from 0 indicates spatial autocorrelation and would necessitate the statistical model to take this into account. Different weight models can be used for the GMI. For this study, an inverse distance model with a maximum range was applied to ensure at least one apiary was located within the range of the distance function. This maximum range was 47 km, well beyond the average foraging distance of a honey bee [46]. Application to the observed mortality rates resulted in a value of −0.055 for the region of Wallonia, confirming the absence of spatial autocorrelation. This ensures that no spatial autocorrelation terms should be taken into account in the regression model.
Finally, McFadden's pseudo coefficient of determination [49], a common metric for binomial regression modelling, was used to verify the model fit with the actual mortality data: where L i is the likelihood for the actual model and L 0 the likelihood for the intercept model. Table 2 gives an overview of the key regression metrics for the different regression models, based on the ungrouped (dataset A) and grouped (dataset B) data, comparing models without and with interactions included. Comparison based on the AICc index, pseudo R 2 , and dispersion shows that the regression models including interactions have a better score, without overdispersion and spatial autocorrelation of the residuals. Figure 6 shows the stepwise reduction of the AICc index for the ungrouped dataset A and grouped dataset B with and without interactions included. Table 3 shows the predictor metrics for the grouped dataset B with interactions between the predictors. This regression model is proposed as the best model, retaining seven of the original seventeen explanatory variables. The grouping of the plant protection products into functional groups (Fig. 5) effectively reduced the impact of spatial autocorrelation while preserving a good model fit ( Table 2).

Model Evaluation
The chosen model based on the grouped dataset uses 7 explanatory variables: Varroa Infestation, Frost Days, Flying Hours, Fungicides, Beekeeper Practice, Landscape Connectivity, and Insecticides. The variables are ranked based on the order in which they are added to the regression model. The explanatory variable Varroa Infestation is clearly the most dominant predictor in the regression model because it is added as the first predictor and as the third predictor by means of its squared transformation.
Comparing the four different regression models, the following observations are potentially relevant for apiary and environmental management:       (Table 2) and should be included. In addition, the pseudo R 2 increases from 0.39 to 0.57 for the grouped dataset (Table 2); & Grouping variables facilitates the interpretation of results but does not affect the ranking (order of being added to the regression model) of the important predictors.
By itself the AICc-based selection of predictors cannot rule out overdispersion. However, the regression analyses result in models with a reasonable model fit and dispersion index close to one ( Table 2). This demonstrates the absence of overdispersion in the models.

Discussion and Conclusion
The European honey bee species (Apis mellifera L.) plays a key role for plant pollination in Europe, indirectly sustaining the production of food crops. The worldwide increase of bee winter mortality is a growing concern for the apiary sector, environmental protection agencies, agriculture, and society. Cross comparison with the environmental conditions and apiary management points to a large number of potential causes of bee winter mortality, the complex interaction of which is not yet fully understood. Laboratory experiments, model simulations, and colony-level sampling of apiaries are not able to explain the combined impact of all factors scientifically. Furthermore, spatially explicit analyses of the combined impact of natural and maninduced causes of honey bee winter mortality are still rare due to limitations in data availability and differences in the sampling protocols used for bee health.
Together, the EPILOBEE data and high-resolution environmental and landscape data for the region of Wallonia provided a unique opportunity to analyse the causes of winter mortality.
Step-by-step construction of a generalised regression model combined with the adjusted Akaike information criterion proved very useful to identify and rank the main causes of bee winter mortality to focus apiary management. The analyses indicate the need to include both non-linearities and interactions in the regression modelling. Testing after each step shows that no overdispersion was detected when adding the individual terms to the regression model. For the grouped dataset, seven out of seventeen variables were retained: Varroa Infestation dominates as primary cause of bee winter mortality, followed in order of significance (adding to the model) by Frost Days, Flying Hours, Fungicides, Beekeeper Practice, Landscape Connectivity, and Insecticides.
The current study exploits the available data for the 136 observations on winter mortality in the region of Wallonia to the extent possible. Awaiting improvements in the quantity and quality of data, the approach followed can be useful for gaining a first understanding of the relative importance of environmental conditions, biological aspects, and apiary practices for bee winter mortality.
Varroa Infestation was the most dominant predictor in the regression models. Future data sampling campaigns of high quality and spatial resolution could help identify the main causes of the more and more frequent Varroa outbreaks. This information would be extremely useful for improving apiary management and systematic Varroa treatment. A recommended short-term strategy is to focus on this predominant variable while harmonising the monitoring of infection rates and improving the exchange of information and expertise between apiarists. The analyses with the raw data point to a need for further analysis of the role of the insecticide abamectin and the fungicide captan as potential harmful substances, as well as a potential importance of Flying Hours, Beekeeper Practice, and Landscape Connectivity as relevant manageable factors, and interactions of these variables. The current approach could help organise sampling programs in a more efficient manner, by focusing on the key indicators for each stressor group (meteorological conditions, environmental conditions, diseases, and apiary management).
Data on the use of plant protection products were available with limited spatial detail for the region of Wallonia. It would be worthwhile to have data available on the regional differences to improve the quality of the explanatory variable. Improvements should also include the extension of the monitoring to additional winters and seasons, accounting for the time dependency of climate, nutritional and other conditions, and colony-level sampling of mortality rates.
This study does not yet exploit the existing scientific understanding of the biology of the honey bee or known dependencies between specific variables. A hybrid approach, combining statistical analysis with scientific expertise or land use change models [50], could increase the predictive power and efficiency of the analysis. Finally, it would be interesting to apply the sampling strategy and geostatistical analysis to other social-environmental problems by using this spatially explicit approach. For example, it could be worthwhile to examine the dependency between public health and the combination of exposure to air pollution and use of plant protection products.
Open Access This article is distributed under the terms of the Creative Comm ons Attribution 4.0 International License (http:// creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.