# Modelling the population fluctuation of winter moth and mottled umber moth in central and northern Germany

## Abstract

### Background

Winter moth (*Operophtera brumata*) and mottled umber moth (*Erannis defoliaria*) are forest Lepidoptera species characterized by periodic high abundance in a 7–11 year cycle. During outbreak years they cause severe defoliation in many forest stands in Europe. In order to better understand the spatio-temporal dynamics and elucidate possible influences of weather, stand and site conditions, a generalized additive mixed model was developed. The investigated data base was derived from glue band catch monitoring stands of both species in Central and North Germany. From the glue bands only female moth individuals are counted and a hazard code is calculated. The model can be employed to predict the exceedance of a warning threshold of this hazard code which indicates a potential severe defoliation of oak stands by winter moth and mottled umber in the coming spring.

### Results

The developed model accounts for specific temporal structured effects for three large ecoregions and random effects at stand level. During variable selection the negative model effect of pest control and the positive model effects of mean daily minimum temperature in adult stage and precipitation in early pupal stage were identified.

### Conclusion

The developed model can be used for short-term predictions of potential defoliation risk in Central and North Germany. These predictions are sensitive to weather conditions and the population dynamics. However, a future extension of the data base comprising further outbreak years would allow for deeper investigation of the temporal and regional patterns of the cyclic dynamics and their causal influences on abundance of winter moth and mottled umber.

## Keywords

- *Operophtera brumata*
- *Erannis defoliaria*
- Generalized additive mixed model
- Weather effect
- Insect pest outbreaks

## Abbreviations

- AIC: Akaike’s information criterion
- AUC: Area under curve
- BIC: Bayesian information criterion
- cp: Cutpoint
- GAMM: Generalized additive mixed model
- hc: Hazard code
- MU: Mottled umber moth (*Erannis defoliaria*)
- MUWM: Mottled umber moth and winter moth
- NW-FVA: Northwest German Forest Research Institute
- prec_pupa: Mean daily precipitation from 1st June to 14th October
- ROC: Receiver operating characteristics
- tmin_imago: Mean daily minimum temperature from 15th October to 14th December
- WM: Winter moth (*Operophtera brumata*)

## Introduction

The vitality of oak on many European forest sites is repeatedly threatened by outbreaks of herbivorous insects. These insect infestations play a significant role in the decline of individual trees and sometimes even of complete stands of oak. Delb (2012) sees damage to oak by defoliating insects in spring as the trigger for this process. The oak’s ability to regenerate after defoliation is clearly limited when the crown shows more than 40% damage (Kätzel et al. 2006). Moreover, when defoliation takes place in several successive years the crown becomes increasingly and permanently damaged and the oak’s vitality diminishes (Fischer 1999; Petercord 2015). In combination with extreme weather conditions and other pests following moth outbreaks, for example oak mildew (e.g. *Erysiphe alphitoides*) and metallic wood-boring beetles (*Agrilus* spp.), this can result in a complete dieback of the tree (Hartmann and Blank 1998; Meshkova 2000; Thomas et al. 2002; Kätzel et al. 2006; Thomas 2008; Bressem and Steen 2012).

In this study, data on mottled umber moth (*Erannis defoliaria*) and winter moth (*Operophtera brumata*) (hereafter MUWM) from North and Central Germany were analysed. These related species usually occur together (Altenkirch 1981) and periodically defoliate entire oak stands in spring. In this way they weaken their host and act as a predisposing factor for severe damage, such as oak decline (Manion 2003). MUWM are therefore important and frequently monitored herbivorous insects in Germany. As both species have a similar biology, their combined population density is estimated using glue band catches. The number of female MUWM caught is used to calculate the hazard code. When this code exceeds the warning threshold of one, potentially severe defoliation of the host tree is expected in the following spring (Altenkirch 1966; Altenkirch 1981). The hazard code therefore also signals to the forest owner that pest control might be necessary in the coming spring.

The abundance of MUWM shows a periodic dynamic that is spatially synchronized over large areas. This is characteristic of many forest insects (Liebhold and Kamata 2000). These population cycles are assumed to be caused by delayed density-dependent factors, especially specialised parasitoids (Berryman 1996; Ruohomäki et al. 2000; Klemola et al. 2010) and host plant defences (Klemola et al. 2004; Tenow et al. 2013). Spatial synchronization is the result of dispersal (either of the defoliating Lepidoptera or of their enemies) or trophic interaction with populations of other species, e.g. parasitoids, that are mobile or spatially synchronous (Ydenberg 1987; Ims and Steen 1990; Liebhold et al. 2004).

Furthermore, geographical variation in habitat quality might influence the synchrony of population dynamics (Peltonen et al. 2002). Regional stochasticity (Moran 1953), such as synchronous weather variability, might also lead to spatial synchronization (“Moran effect”).

There are several studies that deal with the identification of the potential influential factors on the population dynamics pattern of MUWM in Europe, many of them distinguishing weather effects as driving components. Jepsen et al. (2009) discovered a coherent pattern of spatial synchrony of the start of the vegetation period and defoliation by geometrid moths in boreal birch forests in northern Fennoscandia. Thus, they assume that weather effects in spring, which influence the phenology of both host plant and geometrid moth, might act as “Moran effects”. Likewise, van Asch and Visser (2007) suppose that years with high synchrony between bud burst and egg hatch of winter moth might lead to a mass outbreak because synchrony is crucial for the moth development and has several consequences on its fitness. Topp and Kirsten (1991) discovered that the fecundity of female winter moths is temperature dependent and, therefore, assume that coincidence between imago eclosion and optimal temperatures might be a precondition for mass outbreaks. In Fennoscandia, important regulating influences on population density of the related autumn moth (*Epirrita autumnata*) appear to be egg mortality caused by minimum winter temperatures, followed by parasitism and, finally, the varying food plant quality (Virtanen and Neuvonen 1999).

The aims of this study were:

1. Analysing the influence of weather, stand and site conditions on the warning threshold exceedance of MUWM.
2. Quantifying significant model effects to allow for short-term predictions of threshold exceedance as part of an early-warning system and to optimize monitoring intensity.

## Materials and methods

### Study species and study area

The life cycle of the univoltine geometrid species mottled umber (*Erannis defoliaria*) and winter moth (*Operophtera brumata*) in Europe is characterized by an overwintering egg stage close to the buds of the host tree, hatching in spring in synchrony with the host plant’s budburst, a long pupation stage in the soil from summer to autumn, and emergence of the adults from the ground in autumn, usually after the first frost nights, with highest activity from evening hours until midnight (Schwenke 1978). In Central Europe, MUWM densities peak periodically, approximately every 7–11 years (Myers 1988; Tenow and Nilssen 1990; Delb 2012). In outbreak years, the polyphagous larvae might entirely defoliate broadleaves and conifers, with a preference for oak (Connell and Steyer 2007).

*hc*, which is a common measure in forest protection management conducted by the Northwest German Forest Research Institute (NW-FVA). As mottled umber moths need approx. 2.5 times as much food as winter moths (Schwenke 1978), the *hc* is calculated as follows:

*hc* was recorded (540 of 1711 *hc* without original counts). Therefore, the use of the original counts would have resulted in a loss of data, especially in the length and amount of time series (see Table 7 and Fig. 7 in Appendix). An analysis of the available original counts behind the hazard codes revealed that the population dynamics of winter moth are similar to those of mottled umber (see Figs. 8 and 9 in Appendix). However, there is a slight time lag between the density peaks of winter moth and those of mottled umber and, therefore, the correlation between both densities is weak (Pearson correlation coefficient of 0.35). This hardly affects the target value *hc*, as the data are clearly dominated by winter moth counts. The female winter moth densities already account for 81% of the hazard codes and 73% of the threshold exceedances. For this reason, the *hc* was used as a dependent variable.

The exact assignment of the glue band stands to the ecoregions resulted from the model development process described in the section on model development, but is presented here for a better understanding and description of the data. The borders of the ecoregions are largely identical with those of the federal states, which might not only result from climatic differences but potentially also from different systems of data collection in the federal states.

### Weather, stand and site data

Regionalized weather data with a resolution of 1 day (Köhler et al. 2015) were used to generate variables that potentially affect the development of MUWM. For this purpose, daily mean and minimum temperatures and daily precipitation sums were aggregated for certain fixed periods of the year that cover the different developmental stages of the moth. In addition, the number of frost days per period for adult and egg stages was deduced from days with daily mean and daily minimum temperatures below 0 °C respectively. As synchronization between budburst and larval hatch is essential for the moth development (van Asch and Visser 2007), the start of the vegetation period for pedunculate oak (*Quercus robur*) was calculated on the basis of daily mean temperatures using the R package *vegper* (Nuske 2015), which implements the algorithm described by Menzel (1997). High temperatures in April might lead to an earlier egg hatch and an asynchrony with budburst, as Selås (2000) observed for *Tortrix viridana*, whose first instar larvae also require bursting buds. Therefore, variables describing temperature in April were calculated as well. Late frost in spring that might kill the moth larvae is indicated by a variable that contains the last day with frost in spring. Overall, 46 weather parameters were calculated and tested for their effects on the dynamics pattern (see Table 6 in Appendix).
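
The aggregation of daily weather values into stage-specific variables can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the data layout (a dictionary mapping each date to a tuple of daily minimum temperature and precipitation), the function names and the toy values are all assumptions made for this example.

```python
from datetime import date, timedelta
from statistics import mean


def period_days(start: date, end: date):
    """Yield every calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def aggregate_period(daily, start: date, end: date):
    """Aggregate daily records over one developmental-stage window.

    `daily` maps a date to a (tmin_celsius, precip_mm) tuple; this layout
    is hypothetical. Days without a record are skipped.
    """
    records = [daily[d] for d in period_days(start, end) if d in daily]
    if not records:
        raise ValueError("no weather records inside the requested period")
    tmins = [tmin for tmin, _ in records]
    precs = [prec for _, prec in records]
    return {
        "tmin_mean": mean(tmins),   # mean daily minimum temperature
        "prec_mean": mean(precs),   # mean daily precipitation sum
        "frost_days": sum(1 for t in tmins if t < 0.0),  # days with tmin < 0 degrees C
    }


# Toy records (invented values) falling inside the adult-stage window
daily = {
    date(2010, 10, 15): (4.0, 0.0),
    date(2010, 10, 16): (-1.0, 2.0),
    date(2010, 10, 17): (1.0, 4.0),
    date(2010, 10, 18): (-2.0, 0.0),
}
adult_stage = aggregate_period(daily, date(2010, 10, 15), date(2010, 12, 14))
print(adult_stage)
```

The same function can be reused for any other fixed period, for example the pupal stage, by changing the start and end dates.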

Various parameters describing the monitoring stands

Parameter | Unit | Min | Max | Description |
---|---|---|---|---|
age | years | 20 | 239 | Stand age of oak |
oak_perc | % | 2 | 100 | Oak proportion based on the basal area |
stocking_degree | - | 0.25 | 1.54 | Ratio of absolute stand density to a reference level from yield table |
dgm25 | m a.s.l. | 4.1 | 452.9 | Altitude |
budburst | day of year | 107 | 139 | Day of oak’s budburst (Menzel 1997) |

Furthermore, the binary parameter *pesticide* describes whether or not a pest control was carried out on the glue band stand in spring, before data collection. In total, 65 pest control events were registered.

### Model development

For model development, MUWM abundance, represented by the hazard code *hc*, was transformed into a binary variable \( hc\_cat = I_{\{hc \ge 1\}} \) with a threshold value of 1. Note that different definitions of the threshold value lead to varying frequencies of damage occurrences and therefore to different models (Hanewinkel et al. 2004). Using the threshold value of 1, 285 observations were defined as events and 1426 as non-events of threshold exceedance.
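
As a small illustration of this binarisation, the following sketch applies the indicator function to a handful of hazard codes. The values are invented for the example and the function name `to_hc_cat` is hypothetical; the sketch only shows how the choice of threshold changes the number of events.

```python
def to_hc_cat(hc: float, threshold: float = 1.0) -> int:
    """Indicator function I{hc >= threshold}: 1 for an event, 0 otherwise."""
    return 1 if hc >= threshold else 0


# Made-up hazard codes for six glue band stands (not data from the study)
hazard_codes = [0.0, 0.3, 0.9, 1.0, 1.3, 2.7]

hc_cat = [to_hc_cat(hc) for hc in hazard_codes]
events = sum(hc_cat)
non_events = len(hc_cat) - events

# A lower threshold classifies more observations as events
events_at_08 = sum(to_hc_cat(hc, threshold=0.8) for hc in hazard_codes)

print(hc_cat, events, non_events, events_at_08)
```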

The transformation into a binary response results in a loss of information. However, regression models for count data could not be used for approx. 30% of the *hc*s due to the loss of the original count and tree girth information. Using the *hc* as response directly would be an alternative. However, in modelling count data it is rather uncommon to employ densities, such as *hc*, rather than using the original discrete counts and, if necessary, an offset term to account for differing sample units. In our case, differing sample units are present since the tree girths vary. The main reason for modelling the original counts is the possibility of applying distribution functions for count data, such as zero-inflated Poisson or negative binomial, which can handle positively (right-) skewed and zero-inflated data, like that in our present study. When modelling densities, for instance *hc*, the model approaches required, such as generalized additive models for location, scale and shape (*gamlss*), are more complicated, since distribution functions for positively skewed and zero-inflated continuous data are needed. Therefore, we used a simple binary regression as a first approach, but more sophisticated modelling will be a topic of our future research.

The effects of potential covariates on *hc_cat* were investigated using generalized additive mixed regression models (GAMM), assuming a Bernoulli distribution with logistic link-function. Hence, model predictions are probabilities for a threshold exceedance of the hazard code.

The developed model should be able to describe the cyclic variability of MUWM abundance with its large-scale spatial pattern (Fig. 2), i.e. spatio-temporal autocorrelation. Factors that are spatially correlated (e.g. climatic and geological conditions) were considered via three ecoregions, in order to eliminate any confounder of the causal effects. There might be additional small-scale spatio-temporal variation on the stand level. However, neither modelling of stand-specific population dynamics, nor complex spatio-temporal structures (Augustin et al. 2009) was possible, since only a small subsample of the database originated from longer per-stand time series (Fig. 3) and the sample stands were spatially rather unequally distributed. Therefore, a quite simple spatio-temporal component was introduced into the model, whereby the temporal patterns are described flexibly by a smooth function *f*_{ecoregion}(*j*) of year *j*, varying with the three-level categorical variable *ecoregion*. Since time-effects are mean-centered, main effects for *ecoregion* (*β*_{0ecoregion}) were additionally included.

The database comprises longitudinal repeated measurements for a subsample of monitoring stands. In order to fulfil the assumption of independently distributed residuals, stand level random effects *b*_{i} were introduced to account for within-stand correlation (Zuur et al. 2009). Ignoring this clustered data structure would lead to under-estimated standard errors of model effects (Pinheiro and Bates 2006). Moreover, the implementation of random effects made it possible to describe the simplest possible pattern of random between-stand variability, by assuming stand specific levels of threshold exceedance probability, but still allowing for a common cyclic pattern within an ecoregion.

\( hc\_cat_{ij} \sim \mathrm{Bernoulli}(\pi_{ij}) \), conditional probability \( \pi_{ij} = P(hc\_cat_{ij} = 1 \mid j, \mathrm{ecoregion}_i) \) for hazard code class *hc_cat* in monitoring stand *i* in year *j* being equal to one, inverse logistic-link function *h*, temporally structured linear predictors with main regional effects \( \beta_{0\,\mathrm{ecoregion},p} \), \( p = A, B, C \), three ecoregion-specific one-dimensional smoothing functions (penalized thin plate regression splines) \( f_{\mathrm{ecoregion},p}(j) \), \( p = A, B, C \), to describe specific cyclic population dynamics for the three ecoregions, and random intercepts \( {b}_i\sim N\ \left(0,{\sigma}_b^2\right) \) that are assumed to be normally and identically distributed with mean 0 and variance \( {\sigma}_b^2 \).

\( I_{\{\mathrm{condition}\}} \) are indicator functions, equal to 1 if the condition holds and 0 otherwise.

\( f_1, \dots, f_n \) (mean-centered penalized thin plate regression splines) for *n* continuous predictor variables \( x_{1ij}, \dots, x_{nij} \) in monitoring stand *i* in year *j* and \( {\beta_0}_{{\mathrm{ecoregion}}_i} \) and \( {f}_{{\mathrm{ecoregion}}_i}(j) \) as described in Eq. 2.

Overall, this model formulation allows for an assessment of small-scale spatial variation via random effects, as well as large-scale spatio-temporal population dynamics patterns and fixed effects of causal covariates via smooth non-linear model terms.
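
Read together, the symbol definitions in this section suggest the following structure for the basic model and its extension by covariates. This is a reconstruction from the definitions in the text, not a verbatim copy of the published equations. In particular, the way the ecoregion terms are expanded with indicator functions and the additive form of the covariate effects are assumptions.

```latex
\begin{align*}
hc\_cat_{ij} &\sim \mathrm{Bernoulli}(\pi_{ij}), \qquad
  \pi_{ij} = P\left(hc\_cat_{ij} = 1 \mid j, \mathrm{ecoregion}_i\right) \\
% basic model: regional main effect + regional time trend + stand random intercept
\pi_{ij} &= h\left( {\beta_0}_{\mathrm{ecoregion}_i} + f_{\mathrm{ecoregion}_i}(j) + b_i \right),
  \qquad b_i \sim N\left(0, \sigma_b^2\right) \\
% ecoregion terms written with indicator functions
{\beta_0}_{\mathrm{ecoregion}_i} + f_{\mathrm{ecoregion}_i}(j)
  &= \sum_{p \in \{A, B, C\}} I_{\{\mathrm{ecoregion}_i = p\}}
     \left( \beta_{0\,\mathrm{ecoregion},p} + f_{\mathrm{ecoregion},p}(j) \right) \\
% inverse logistic link
h(\eta) &= \frac{\exp(\eta)}{1 + \exp(\eta)} \\
% extension by n continuous covariates with smooth effects
\pi_{ij} &= h\left( {\beta_0}_{\mathrm{ecoregion}_i} + f_{\mathrm{ecoregion}_i}(j)
  + \sum_{k=1}^{n} f_k\left(x_{kij}\right) + b_i \right)
\end{align*}
```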

### Variable selection, confusion matrix and ROC-curve

The basic model was extended by testing potential causal predictors via a non-automatic procedure. In this stepwise forward variable selection, the BIC was used as the selection criterion instead of the less strict AIC. After each variable selection step, the confusion matrix with the model’s sensitivity (proportion of correctly predicted events, i.e. 1-values) and specificity (proportion of correctly predicted non-events, i.e. 0-values), as well as the ROC (receiver operating characteristics) curve with AUC (area under curve) values, served as additional instruments to evaluate the model’s predictive ability. An AUC value of 0.5 represents a random prediction. Hosmer and Lemeshow (2000) give a rule of thumb for the AUC that values between 0.7 and 0.8 represent an acceptable, and values greater than 0.8 an excellent discrimination.
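
The metrics named above can be computed from observed classes and predicted probabilities as in the following sketch. It is a plain-Python illustration with invented toy data, not the *pROC*-based calculation used in the study. Two choices are assumptions: a probability equal to the cutpoint is classified as an event, and the AUC is computed via its rank interpretation, with ties counted as one half.

```python
def confusion_metrics(y_true, y_prob, cutpoint):
    """Sensitivity, specificity and accuracy for a given probability cutpoint."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= cutpoint else 0  # assumption: ties count as events
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),  # share of events predicted as events
        "specificity": tn / (tn + fp),  # share of non-events predicted as non-events
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def auc(y_true, y_prob):
    """AUC as the probability that a random event gets a higher predicted
    probability than a random non-event (ties count one half)."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for p in events:
        for q in non_events:
            if p > q:
                score += 1.0
            elif p == q:
                score += 0.5
    return score / (len(events) * len(non_events))


# Invented toy data: three events and three non-events
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.5, 0.2, 0.1]

metrics = confusion_metrics(y_true, y_prob, cutpoint=0.433)
area = auc(y_true, y_prob)
print(metrics, area)
```

The pairwise AUC loop is quadratic in the number of observations, which is acceptable for a sketch but slower than rank-based implementations on large data sets.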

The logistic regression model does not predict the binary class itself but the probability of an event. Hence, the confusion matrix is calculated by transforming the probability back into the binary response by setting a certain cutpoint. In the present study, following the approach of Overbeck and Schmidt (2012) for their bark beetle infestation model, the optimal cutpoint was defined as the one that minimizes the sum of the false negative and false positive prediction rates.
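
A cutpoint search of this kind can be sketched as below. The candidate set (all distinct predicted probabilities), the tie rule (a probability equal to the cutpoint counts as an event) and the toy data are assumptions made for this illustration; the source does not state how candidates were enumerated.

```python
def error_rates(y_true, y_prob, cutpoint):
    """Return (false negative rate, false positive rate) for one cutpoint."""
    n_events = sum(1 for y in y_true if y == 1)
    n_non_events = len(y_true) - n_events
    false_neg = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < cutpoint)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= cutpoint)
    return false_neg / n_events, false_pos / n_non_events


def optimal_cutpoint(y_true, y_prob):
    """Cutpoint minimising the sum of false negative and false positive rates.

    Candidates are the distinct predicted probabilities; if several
    candidates tie, the smallest one is returned.
    """
    candidates = sorted(set(y_prob))
    return min(candidates, key=lambda c: sum(error_rates(y_true, y_prob, c)))


# Invented toy data: four events and four non-events
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.3, 0.5, 0.4, 0.2, 0.1]

best = optimal_cutpoint(y_true, y_prob)
fnr, fpr = error_rates(y_true, y_prob, best)
print(best, fnr, fpr)
```

Minimising the sum of both error rates is equivalent to maximising sensitivity plus specificity, so the criterion weights events and non-events equally regardless of how frequent each class is.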

One characteristic has to be taken into consideration when calculating the confusion matrix for a mixed effect model (Eq. 3). In order to separate the amount of variance explained by the causal covariates and by the random effects respectively, the confusion matrix and AUC of the model were calculated using predictions with and without random effects. The comparison of both metrics allows an additional evaluation of the model improvement when a further causal covariate is added to the model. In this context, a model with a higher variance partition explained by fixed causal effects is considered to be better, since the potential for a generalisation of the model predictions is higher.
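
The difference between the two kinds of prediction can be illustrated with a short sketch. The numbers are invented: `fixed_part` stands for the sum of all fixed model terms on the logit scale and `random_intercept` for the stand effect; both names are hypothetical.

```python
import math


def inv_logit(eta: float) -> float:
    """Inverse logistic link h, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def predict_probability(fixed_part: float, random_intercept: float = 0.0) -> float:
    """Probability of threshold exceedance on the response scale.

    Leaving random_intercept at 0 gives the fixed-effects-only prediction.
    """
    return inv_logit(fixed_part + random_intercept)


# Invented values on the logit scale
fixed_part = -0.4        # ecoregion, time trend and covariate effects combined
random_intercept = 0.9   # stand with an above-average exceedance level

p_fixed = predict_probability(fixed_part)
p_full = predict_probability(fixed_part, random_intercept)
print(round(p_fixed, 3), round(p_full, 3))
```

In this toy case the stand would be classified as a non-event from fixed effects alone and as an event once its random intercept is included, if a cutpoint of 0.433 were applied.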

### Model validation

In order to validate the model, a leave-one-out cross validation of single measurement occasions was carried out. The purpose of the validation was to assess the prediction error for the next monitoring in stands where at least two previous assessments are available. For this purpose, only those 777 measurement occasions that have at least two prior measurements were considered as validation data sets, since a robust estimate for the stand random effect should be guaranteed. For each validation occasion, all chronologically later records from the same stand were excluded from model calibration. The 777 validation data records originate from 194 stands.
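
The selection of validation occasions and calibration data described above might be organised as in this sketch. It assumes one record per stand and year, identified by a hypothetical `(stand_id, year)` tuple, and uses invented records. Records from other stands are kept in the calibration set regardless of year, which is one reading of the description.

```python
from collections import defaultdict


def validation_folds(records):
    """Build (validation record, calibration records) pairs.

    A record qualifies for validation if its stand has at least two earlier
    measurements. Its calibration set drops the record itself and all later
    records from the same stand; other stands are kept in full.
    """
    years_by_stand = defaultdict(list)
    for stand, year in records:
        years_by_stand[stand].append(year)

    folds = []
    for stand, year in records:
        prior = [y for y in years_by_stand[stand] if y < year]
        if len(prior) < 2:
            continue  # stand random effect could not be estimated robustly
        calibration = [(s, y) for s, y in records if s != stand or y < year]
        folds.append(((stand, year), calibration))
    return folds


# Invented records: stand A with four years, stand B with two years
records = [("A", 2001), ("A", 2002), ("A", 2003), ("A", 2004),
           ("B", 2002), ("B", 2003)]

folds = validation_folds(records)
for held_out, calibration in folds:
    print(held_out, len(calibration))
```

Each fold would then require one model refit on its calibration set, followed by a prediction for the held-out record.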

For interpretation of the results, the predictions of the validation procedure were compared to the predictions of the same 777 measurement occasions based on the original model fit. For this purpose, the different metrics from the confusion matrix and the ROC curve were calculated on the basis of the full models (i.e. using fixed and random effects), applying the optimal cutpoint 0.433. Additionally, the mean prediction error (difference between the binary observation and the predictions) was computed.
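
The prediction error summary can be sketched as follows. The sign convention (observation minus prediction) and the percentile method are assumptions for this illustration, and the data are invented.

```python
from statistics import mean, quantiles


def prediction_error_summary(y_true, y_prob):
    """Mean, 5% and 95% percentile of the error (observation - prediction)."""
    errors = [y - p for y, p in zip(y_true, y_prob)]
    # n=20 yields cut points at 5%, 10%, ..., 95%
    cuts = quantiles(errors, n=20, method="inclusive")
    return {"p05": cuts[0], "mean": mean(errors), "p95": cuts[-1]}


# Invented observations and predicted probabilities
y_true = [1, 1, 0, 0]
y_prob = [0.8, 0.4, 0.3, 0.1]

summary = prediction_error_summary(y_true, y_prob)
print(summary)
```

With this convention, a positive error means that an observed exceedance received a low predicted probability, and a negative error means that a non-event received a high one.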

All calculations were carried out in the R environment (R Core Team 2015). The generalized additive mixed models were developed with the R package *mgcv* (Wood 2011). The R package *pROC* (Robin et al. 2011) was used to calculate the AUC.

## Results

### Model development

with \( \pi_{ij} \), *h*, \( {\beta_0}_{{\mathrm{ecoregion}}_i}+{f}_{{\mathrm{ecoregion}}_i}(j) \) and \( b_i \) defined as in Eq. 2, \( pesticide_{ij} \): dummy-coded binary variable indicating the use of pesticide in the prior spring in stand *i* and year *j*, \( \beta_1 \): regression coefficient, \( prec\_pupa_{ij} \): mean daily precipitation sum during the early pupal stage (01.06.–31.08.), and \( tmin\_imago_{ij} \): mean daily minimum temperature in adult stage (15.10.–14.12.) of monitoring stand *i* in year *j*, \( f_1 \), \( f_2 \): one-dimensional smoothing functions (penalized thin plate regression splines). The data of *prec_pupa* range from 9 to 44 (mm × 10), those of *tmin_imago* from −13 to 68 (°C × 10).

Various logistic regression models fitted during the stepwise selection process

Model | BIC |
---|---|
4.1 | 2352 |
4.2 | 2338 |
4.3 | 2007 |
4.4 | 1962 |
4.5 | 1954 |
4.6 | 1781 |
4.7 | 1756 |

*j* was added as a temporally structured effect describing the cyclic population dynamics (Eq. 4.2). Both the additional implementation of the main effects of the three ecoregions (Eq. 4.3) and the temporal effects that are specific for each ecoregion (Eq. 4.4) further improved the model significantly (the BIC decreases from 2338 to 2007 and then to 1962). The temporally structured effects (Eq. 4.4) vary between ecoregions but also show similarities, for example low abundance levels around the year 2000 (Fig. 4). The broad confidence intervals of the temporal model effect in the “Northwest German Lowlands” (Fig. 4 left) indicate less reliable predictions than in the other two ecoregions.

The binary variable *pesticide* was implemented prior to the further variable selection process (Eq. 4.5) because pest controls are intended to reduce the abundance of MUWM considerably. This covariate showed a significant negative effect in the case of a pest control application. Moreover, the BIC of the model decreased, hence the variable *pesticide* was selected as a predictor. The resulting model served as a new basic model for the model selection process described in the section on variable selection.

Two further covariates were selected in this process, both describing the weather condition in certain moth developmental stages. None of the other potential predictor variables that led to a further decrease in BIC showed a significant and plausible model effect.

*prec_pupa*. This covariate shows a significant positive and non-linear model effect, i.e. higher humidity in this period results in a higher probability of threshold exceedance (Fig. 5 left). The effect gradient is especially strong at values between 25 and 30.

The mean daily minimum temperature during adult stage (15.10.–14.12.), *tmin_imago*, was selected as second causal covariate. This predictor shows a positive, nearly linear effect: higher minimum temperatures in this stage lead to higher probabilities of threshold exceedances (Fig. 5 right). Because of the linear tendency of the model effect, the model was refitted assuming a strictly linear effect of *tmin_imago*. However, this simplification led to a considerably higher BIC (1818 instead of 1781) and was not employed. In order to check the robustness of the effect pattern of the selected covariates, the model was refitted for various warning thresholds (0.6, 0.8, 1.2, 1.4). This analysis revealed that the effects have little sensitivity to the chosen threshold, as they showed a pattern similar to that in Fig. 5.

Model statistics of the stepwise selection process, calculated from predictions employing the full models including fixed and random effects, as well as predictions using fixed effects only (right part of the table). For comparison, the optimal cutpoints that were derived for the full models were also employed for the fixed effects predictions

Model | Cutpoint | Sensitivity (fixed + random) | Specificity (fixed + random) | Accuracy (fixed + random) | AUC (fixed + random) | Sensitivity (fixed only) | Specificity (fixed only) | Accuracy (fixed only) | AUC (fixed only) |
---|---|---|---|---|---|---|---|---|---|
4.4 | 0.439 | 0.568 | 0.973 | 0.906 | 0.930 | 0.316 | 0.954 | 0.848 | 0.831 |
4.5 | 0.422 | 0.604 | 0.967 | 0.906 | 0.930 | 0.316 | 0.957 | 0.850 | 0.836 |
4.6 | 0.427 | 0.596 | 0.966 | 0.904 | 0.915 | 0.467 | 0.927 | 0.850 | 0.850 |
4.7 | 0.433 | 0.586 | 0.966 | 0.903 | 0.924 | 0.446 | 0.945 | 0.862 | 0.860 |

### Sensitivity analysis

The influence of the single model parameters on the model predictions was illustrated with a sensitivity analysis. For this purpose, predictions over time were calculated without random effects and the variable to be tested was varied, while all other model predictors were set constant (ceteris paribus). Three scenarios were set up, in which the covariates were fixed at values that approximately represent optimal, mean and poor conditions in the developmental moth stage concerned (dotted lines in Fig. 5). In order to avoid extrapolation, only combinations that are represented in the data were used. For the same reason, the covariate *pesticide* was set to zero for all scenarios, since the data set contains only few records with pest control.

When comparing regional predictions, the “Northwest German Lowlands” show the highest probabilities for each weather scenario. Under optimal conditions the probabilities remain above the optimum cutpoint of 0.433 for 9 years. In the other two ecoregions probabilities of above 0.433 are only predicted for a maximum of five successive years. Under mean conditions the predicted probabilities in the “Northwest German Lowlands” reach peaks of above 0.433 twice during the investigated period, only once in the “Central German High- and Lowlands” and never in the “East German Lowlands”. The predicted probabilities under unfavourable weather conditions are at a level distinctly below the optimal cutpoint in all three ecoregions.

### Model validation

Different statistics derived from model validation compared with the metrics from the original model fit calculated for the 777 records of the validation runs. Metrics of the confusion matrix were calculated by applying a cutpoint of 0.433

Source | Sensitivity | Specificity | Accuracy | AUC | Prediction error: (5% percentile) mean (95% percentile) |
---|---|---|---|---|---|
validation | 0.534 | 0.955 | 0.907 | 0.871 | (−0.394) −0.016 (0.596) |
model fit | 0.659 | 0.964 | 0.929 | 0.926 | (−0.344) −0.017 (0.471) |

## Discussion

The model gives satisfying predictions of the probability of threshold exceedance with only a few predictors. This is convenient, as it requires little information. On the other hand, the model does not consider the effects of other predictors, such as those influencing the crucial synchrony between egg hatch and bud burst (van Asch and Visser 2007; Jepsen et al. 2009). For example, cold winters might lead to a better synchrony between egg hatch and budburst (Connell 2014). Moreover, cold and wet weather around hatching time could kill the first instar larvae (Schwenke 1978). Sometimes one extreme weather phenomenon, e.g. heavy rainfall that washes off the larvae (Habermann et al. 2007) has a strong regulating impact, but is difficult to describe by the aggregated parameters. Using extreme weather values per development stage, for example maximum daily rainfall or minimum daily temperature per stage, did not improve the model performance. But it must be taken into consideration that extreme rainfall events are often very local and therefore difficult to describe by regionalized weather data.

### Model structure

The model allows analyses of the joint regional and temporal population dynamics of MUWM because it describes these patterns by explicit effects. However, due to limitations of the data base, the spatial component could only be specified in a very simple way via three ecoregions. In contrast, the temporal dynamic is flexibly modelled via smooth effects of the monitoring year specified for each of the ecoregions (temporally structured effect). Unstructured spatial effects are modelled via random effects on stand level. This is a very simple way to allow for stand-specific differences in the cyclic dynamics, since only the level but not the temporal pattern is allowed to vary between stands within one ecoregion. Due to these limitations, and even though two time-varying and stand-specific weather covariates are integrated into the final model, the residuals show some temporal autocorrelation on stand level. A check of temporal autocorrelation within stands revealed that especially the measurements of two successive years are positively autocorrelated. Presumably, only the implementation of a temporally structured effect for each monitoring stand would eliminate all temporal autocorrelation. This would, however, require more time-series data than were available. A complex large-scale spatio-temporal effect (Augustin et al. 2009) was not modelled since the spatial coverage for most observational years is rather weak. Moreover, part of the data originates from seriously affected stands and from time periods of higher moth abundance, as the inventory is a legal prerequisite for any pest control application. This causes a positive bias (overestimation), in space and time, of the overall abundance level, which has to be considered in the interpretation of results. However, as the explanatory variables are collected independently of this sampling scheme, the effects of predictors and the patterns of population dynamics can be estimated without bias and with greater accuracy per observation (King and Zeng 2001) than under a fully random sampling design.

The rough description of spatial and temporal population dynamics prevents the model from predicting escalating probabilities after years of latency, even when weather conditions are favourable. In the sensitivity analysis, even the strongest increase of predicted probabilities under optimum conditions (year 2001 to 2003, Fig. 6 upper) takes 2 years. This is consistent with the observation of Myers (1988) that the population dynamics of cyclic species are not characterized by sudden increments but show a gradual increase over several years. However, the description of temporally structured effects for each ecoregion requires sufficient data to calculate the smooth time trends \( {f}_{{\mathrm{ecoregion}}_i}(j) \). The large confidence intervals of the temporally structured effect for the “Northwest German Lowlands” (Fig. 4 left) indicate that the predictions are highly uncertain, which might result in implausible predictions, especially under extreme weather conditions. In this context, the high predicted probabilities, which lie above the cutpoint under optimal weather conditions for “Northwest German Lowlands” for nine successive years (Fig. 6) in the sensitivity analysis, seem implausible. Abundance generally decreases after a few outbreak years as a result of increasing intraspecific competition (Hunter 1998), parasites and other antagonists (Schwenke 1978; Roland 1994). There were no such data available during the modelling process. Hence, more data, especially for this ecoregion, are likely to improve the model performance.

### Selected variables

During variable selection, the effect of pest control and two weather variables were chosen as predictors. The binary predictor *pesticide* is very important since pest control reduces the probability for threshold exceedances in the following autumn significantly. One selected weather parameter is the mean daily minimum temperature during adult stage *tmin_imago,* with a positive model effect: higher temperatures during this period lead to higher probabilities of threshold exceedance (Fig. 5 right). This effect can be explained, since the imagines become less active with lower temperatures and immobile at frost, even though they can survive temperatures down to − 20 °C (Schwenke 1978, p. 226). According to Schwenke (1978, p. 226 and 228), the optimum temperature for moth activity lies around 5 °C to 10 °C. Hence, fewer moths crawl up and get caught by the glue bands when temperatures are low during emergence period. Topp and Kirsten (1991) assume that the coincidence between moth emergence and the optimum temperature of around 10 °C is a precondition for mass outbreaks. As the imagines are most active from the evening until approximately midnight (Schwenke 1978, p. 226), this time frame is better represented by the daily minimum temperature *tmin_imago* than the daily mean temperature. A covariate that describes the temperature between, for example, 6 p.m. and midnight during adult stage, might even have a stronger effect. However, such hourly temperature values were not at our disposal.

The second selected weather parameter, *prec_pupa*, describes the precipitation in the early pupal stage. This covariate’s effect indicates that higher moisture in this period has a positive influence on the moths’ development (Fig. 5 left). According to Schwenke (1978, p. 228), the main reasons for pupal mortality are dehydration, flooding and predators. As flooding can be largely excluded at the monitoring stands, mortality by dehydration offers a plausible explanation for the described covariate’s effect. The influence of predators and epizootic diseases could not be investigated in this study, but they can reduce the population of MUWM considerably, especially in the pupal stage (Dempster 1983; Berryman 1996; Hunter 1998).

In addition to the fixed effects, random effects are predicted at stand level to account for the clustered data structure and the non-systematic data sampling. They contribute considerably to the model performance, which becomes clear when the predictions are calculated without random effects (Table 3). In this case, the AUC is significantly lower, even though it still indicates an excellent discrimination. When predicting without random effects, the model’s AUC increases with every further implemented predictor because higher proportions of the variance are explained through fixed model effects. Accordingly, the variance of the random effects decreases (the standard deviation of the *b*_{i} decreases from 1.0 in model 4.4 to 0.85 in model 4.7). The effect of each added predictor thus accounts for a part of the variance that was previously covered by the random effects.

Stand and soil parameters were not identified as covariates, which might be due to the quality and quantity of the available data. Some monitoring stands lacked forest inventory and soil data. Moreover, the stand inventory data are assessed at 10-year intervals, so changes between years are not captured as they are in the yearly glue band data. Additionally, the inventory methods vary between the considered federal states, so the comparability of the data is problematic. Possible effects of stand and site might also be masked by the high impact of the dynamic weather conditions.

### Model performance and confusion matrix

The accuracy of the predictions is very high. However, the accuracy deduced from the confusion matrix is a problematic metric when the class distribution of the binary response is skewed. Lawrence et al. (1998) stated that the model “may always predict the most common class and still provide relatively high performance”. Hence, other statistics, such as sensitivity and specificity, are more meaningful. In the present data set, the number of non-events (class 0) is very high: there are five times as many non-events as events of threshold exceedance (class 1). It is, therefore, challenging to achieve a high sensitivity, i.e. a high proportion of correctly classified events. With a sensitivity of 0.586 in the original model fit, the events of threshold exceedance are predicted significantly worse than the non-events (specificity of 0.966, Table 4). In the validation, the sensitivity decreases to 0.53. The lower prediction performance in comparison to the original model fit can be explained by the reduced database used in the 777 model calibrations. Since the validation measurement occasion and all chronologically younger records of the respective stand were excluded in the validation runs, the predictions are less accurate than in the original model fit. However, the model is able to achieve a prediction accuracy of 90% in the cross validation, and the mean prediction error of the validation runs is close to zero.
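The following minimal Python sketch illustrates why accuracy alone is misleading for a skewed binary response. The counts are hypothetical (100 events and 500 non-events, chosen only to mirror the roughly 1:5 class ratio described above) and are not taken from the study data; the second set of counts is merely constructed so that sensitivity and specificity come out close to the reported values.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of events correctly classified
        "specificity": tn / (tn + fp),   # share of non-events correctly classified
    }

# Hypothetical counts (assumption): 100 events, 500 non-events.
# A trivial classifier that always predicts the majority class 0:
baseline = confusion_metrics(tp=0, fn=100, tn=500, fp=0)

# A classifier with sensitivity and specificity close to the reported values:
model_like = confusion_metrics(tp=59, fn=41, tn=483, fp=17)

for name, m in [("always class 0", baseline), ("model-like", model_like)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this illustration the trivial classifier already reaches an accuracy above 0.8 without detecting a single event, which is why sensitivity and specificity are reported separately.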

### Conclusions

The developed model is able to provide a general estimate of the large-scale hazard situation. Hence, it could be used for forest protection planning as part of an early warning system. As soon as the values for *prec_pupa* are available (by the end of August), predictions can be calculated, assuming different scenarios for temperature in the succeeding adult stage, e.g. values for minimum, mean and optimum conditions in accordance with the sensitivity analysis (chapter 3.2). These estimates of threshold exceedance could help to choose an appropriate monitoring intensity for the coming autumn. However, the choice of the most appropriate cutpoint depends on the model application. In forest management, one has to judge the opportunity costs, which means comparing the consequential costs of undetected threshold exceedances with the costs of “false alarms” that result in unnecessary work input (Overbeck and Schmidt 2012).
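The trade-off between undetected exceedances and false alarms can be sketched as a simple cost comparison across candidate cutpoints. The Python example below is only an illustration under stated assumptions: the predicted probabilities, observed classes and both cost values are hypothetical, and the function names `total_cost` and `best_cutpoint` are introduced here and are not part of the study.

```python
def total_cost(cutpoint, probs, labels, cost_missed, cost_false_alarm):
    """Sum the costs of missed events and false alarms at a given cutpoint."""
    cost = 0.0
    for p, y in zip(probs, labels):
        alarm = p >= cutpoint
        if y == 1 and not alarm:
            cost += cost_missed        # undetected threshold exceedance
        elif y == 0 and alarm:
            cost += cost_false_alarm   # unnecessary work input
    return cost

def best_cutpoint(probs, labels, cost_missed, cost_false_alarm, candidates):
    """Return the candidate cutpoint with the lowest total cost (first one on ties)."""
    return min(
        candidates,
        key=lambda c: total_cost(c, probs, labels, cost_missed, cost_false_alarm),
    )

# Hypothetical predicted probabilities and observed classes (assumption).
probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.80]
labels = [0, 0, 0, 1, 0, 0, 1, 1]
candidates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Missed exceedances assumed five times as costly as false alarms, and vice versa.
cut_if_misses_costly = best_cutpoint(probs, labels, 5.0, 1.0, candidates)
cut_if_alarms_costly = best_cutpoint(probs, labels, 1.0, 5.0, candidates)
print(cut_if_misses_costly, cut_if_alarms_costly)
```

With these made-up numbers, a lower cutpoint is preferred when missed exceedances are the more expensive error and a higher one when false alarms dominate, which is the sense in which the appropriate cutpoint depends on the application.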

The influences on the MUWM populations are very complex, and many aspects, such as predators or parasitism, were not considered explicitly in the modelling, even though they may have a strong regulating impact (Dempster 1983; Berryman 1996; Hunter 1998). Some meteorological variables, such as wind speed, sunshine and relative humidity, might provide further explanations. Lagged effects of environmental factors could also be analysed more thoroughly in a subsequent study with the help of distributed lag non-linear models (DLNM, Gasparrini 2011). Furthermore, Tenow et al. (2013) describe outbreak waves of winter moths that travel across Europe, so considering the Europe-wide spatio-temporal dynamics of outbreaks might give valuable explanations of regional outbreak patterns. The model, however, gives first indications of existing influences. Future inventory data, including further population cycles, would improve the model and allow for the investigation of additional covariates and the verification of the identified effects. In this context, time series data, in particular, are of great value for model improvements. With the help of such an extended data base, winter moth and mottled umber should be investigated separately to clarify whether their sensitivity to the identified parameters is the same.

It has to be taken into consideration that the exceedance of the warning threshold is only an indicator of potential severe defoliation in the coming spring; not every threshold exceedance necessarily leads to defoliation. This still depends on the number of eggs laid, the further development of the moths and the synchronisation between egg hatch and budburst. Hence, the link between hazard code and defoliation requires additional investigation. Consequently, the model also provides a basis for further analyses and model development.

## Notes

### Acknowledgements

We would like to extend our special gratitude to the Brandenburg State Forestry Center of Excellence (LFE Eberswalde) and the Ministry for Agriculture and Environment in Schwerin for the monitoring data and all further information from Brandenburg and Mecklenburg-Western Pomerania that was required for this study. Furthermore, we are indebted to the federal state forest services of Hesse, Lower Saxony and Saxony-Anhalt (HessenForst, Niedersächsische Landesforsten, Landeszentrum Wald and Landesforstbetrieb Saxony-Anhalt) for collecting the monitoring data with the help of glue bands and for providing stand and soil data. Additionally, we are very grateful for all information about the monitoring stands that was provided by the non-federal forest owners in Hesse. We also thank our colleagues from the Department of Forest Protection and the Department of Environmental Control for the monitoring and climate data and for helpful discussions. We thank two anonymous reviewers for their valuable comments on the manuscript.

### Funding

This study was conducted as part of DSS-RiskMan (FKZ: 28WB401501), a project funded by the “Waldklimafonds”, supported by the Federal Ministry of Food and Agriculture and the Federal Ministry of the Environment, Nature Conservation and Nuclear Safety.

### Availability of data and materials

The data for this study were obtained under different data policies and can be made available only with the consent of the respective data owners. Data requests can be sent to the corresponding author.

### Legal statement

All research work reported in this study was performed in accordance with all relevant legislation and guidelines.

### Authors’ contributions

The provided data were prepared by AH and RB. AH and MS designed the model construction. AH conducted the model development and wrote the majority of the manuscript. RB guided the project and co-wrote the manuscript. MS conceived the study and co-wrote the manuscript. All authors read and approved the final manuscript.

### Ethics approval and consent to participate

Not applicable.

### Competing interests

The authors declare that they have no competing interests.

## References

- Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19:716–723
- Altenkirch W (1966) Zur Verwendung von Leimringen bei der Abundanz-Bestimmung von Frostspannern. Zugleich ein kritischer Beitrag zur forstlichen Frostspanner-Prognose. Z Angew Zool 53:34
- Altenkirch W (1981) Zur Frostspanner-Situation in Niedersachsen im Herbst 1981. Forst- Holzwirt 20:504–506
- Augustin NH, Musio M, von Wilpert K, Kublin E, Wood SN, Schumacher M (2009) Modelling spatiotemporal forest health monitoring data. J Am Stat Assoc 104:899–911
- Berryman AA (1996) What causes population cycles of forest Lepidoptera? Trends Ecol Evol 11:28–32
- Böhner J, Antonić O (2009) Chapter 8 land-surface parameters specific to topo-climatology. Dev Soil Sci 33:195–226
- Bressem U, Steen A (2012) Eichensterben – Erkrankungsschub 2011. AFZ- Wald 67:24–27
- Connell J (2014) Frostspanner2015 - BFW. In: http://bfw.ac.at/rz/bfwcms.web_print?dok=10006. Accessed 9 Jun 2015
- Connell J, Steyer G (2007) Raupenfallen-Untersuchung 2006: Artenspektrum von Schmetterlingen an Laubbäumen. Forstsch Aktuell 38:12–17
- Conrad O, Bechtel B, Bock M, Dietrich H, Fischer E, Gerlitz L, Wehberg J, Wichmann V, Böhner J (2015) System for automated geoscientific analyses (SAGA) v. 2.1.4. Geosci Model Dev 8:1991–2007. https://doi.org/10.5194/gmd-8-1991-2015
- Delb H (2012) In: FVA-Einblick (ed) Eichenschädlinge im Klimawandel in Südwestdeutschland. https://www.waldwissen.net/waldwirtschaft/schaden/krankheiten/fva_eichensterben_klimawandel/index_DE. Accessed 9 Jun 2015
- Dempster JP (1983) The natural control of populations of butterflies and moths. Biol Rev 58:461–481. https://doi.org/10.1111/j.1469-185X.1983.tb00396.x
- Fischer R (1999) Folgen von Insektenfraß für den Gesundheitszustand der Eichen. AFZ- Wald 7:355–356
- Gasparrini A (2011) Distributed lag linear and non-linear models in R: the package dlnm. J Stat Softw 43:1
- Habermann M, Hurling R, Krüger F, Bressem U (2007) Waldschutzsituation 2006 in Niedersachsen und Hessen. AFZ- Wald 7:356–361
- Hanewinkel M, Zhou W, Schill C (2004) A neural network approach to identify forest stands susceptible to wind damage. For Ecol Manag 196:227–243. https://doi.org/10.1016/j.foreco.2004.02.056
- Hartmann G, Blank R (1998) Aktuelles Eichensterben in Niedersachsen - Ursachen und Gegenmaßnahmen. Forst Holz 53:733–735
- Hosmer DW, Lemeshow S (2000) Applied logistic regression, 2nd edn. Wiley, New York
- Hunter MD (1998) Interactions between *Operophtera brumata* and *Tortrix viridana* on oak: new evidence from time-series analysis. Ecol Entomol 23:168–173. https://doi.org/10.1046/j.1365-2311.1998.00124.x
- Ims RA, Steen H (1990) Geographical synchrony in microtine population cycles: a theoretical evaluation of the role of nomadic avian predators. Oikos 57:381–387. https://doi.org/10.2307/3565968
- Jepsen JU, Hagen SB, Karlsen S-R, Ims RA (2009) Phase-dependent outbreak dynamics of geometrid moth linked to host plant phenology. Proc R Soc Lond B Biol Sci 276:4119–4128. https://doi.org/10.1098/rspb.2009.1148
- Kätzel R, Löffler S, Möller K, Heydeck P, Kallweit R (2006) Das Eichensterben als Komplexkrankheit. Eberswalder Forstl Schriftenreihe 25:94–100. https://forst.brandenburg.de/cms/media.php/lbm1.a.4595.de/b25eiche.pdf
- King G, Zeng L (2001) Logistic regression in rare events data. Polit Anal 9:137–163
- Klemola N, Andersson T, Ruohomäki K, Klemola T (2010) Experimental test of parasitism hypothesis for population cycles of a forest lepidopteran. Ecology 91:2506–2513
- Klemola T, Ruohomäki K, Andersson T, Neuvonen S (2004) Reduction in size and fecundity of the autumnal moth, *Epirrita autumnata*, in the increase phase of a population cycle. Oecologia 141:47–56
- Koenig WD (2002) Global patterns of environmental synchrony and the Moran effect. Ecography 25:283–288. https://doi.org/10.1034/j.1600-0587.2002.250304.x
- Köhler M, Ahrends B, Meesenburg H (2015) Wie gut ist einfach? Evaluierung verschiedener Verfahren zur Regionalisierung täglicher Wetterdaten. Poster zum “Tag der Hydrologie 2015” in Bonn 19.–20.03.2015
- Lawrence S, Burns I, Back A, Tsoi AC, Giles CL (1998) Neural network classification and prior class probabilities. In: Orr G, Müller K-R, Caruana R (eds) Tricks of the Trade, Lecture Notes in Computer Science State-of-the-Art Surveys. Springer Verlag Berlin-Heidelberg, pp 299–314. https://ro.uow.edu.au/eispapers/271/
- Liebhold A, Kamata N (2000) Introduction: are population cycles and spatial synchrony a universal characteristic of forest insect populations? Popul Ecol 42:205–209. https://doi.org/10.1007/PL00011999
- Liebhold A, Koenig WD, Bjørnstad ON (2004) Spatial synchrony in population dynamics. Annu Rev Ecol Evol Syst 35:467–490. https://doi.org/10.1146/annurev.ecolsys.34.011802.132516
- Manion PD (2003) Evolution of concepts in forest pathology. Phytopathology 93:1052–1055
- Menzel A (1997) Phänologie von Waldbäumen unter sich ändernden Klimabedingungen - Auswertung der Beobachtungen in den Internationalen Phänologischen Gärten und Möglichkeiten der Modellierung von Phänodaten. Forstl Forschungsberichte Münch 164:147
- Meshkova V (2000) The impact of insect-defoliators to the oak decline in Ukraine. Instytut Badawczy Leśnictwa (Forest Research Institute), pp 225–229
- Moran P (1953) The statistical analysis of the Canadian Lynx cycle. Aust J Zool 1:291–298. https://doi.org/10.1071/ZO9530291
- Myers J (1988) Can a general hypothesis explain population cycles of forest Lepidoptera? Adv Ecol Res 18:179–242. https://doi.org/10.1016/S0065-2504(08)60181-6
- Nuske R (2015) Determine Vegetation Periods of Forest Trees. In: R-Package Version 021. http://computerfoerster.de/r-pkgs. Accessed 2 Jan 2017
- Overbeck M, Schmidt M (2012) Modelling infestation risk of Norway spruce by *Ips typographus* (L.) in the lower Saxon Harz Mountains (Germany). For Ecol Manag 266:115–125. https://doi.org/10.1016/j.foreco.2011.11.011
- Peltonen M, Liebhold AM, Bjørnstad ON, Williams DW (2002) Spatial synchrony in forest insect outbreaks: roles of regional stochasticity and dispersal. Ecology 83:3120–3129
- Petercord R (2015) Rolle der Eichenfrassgesellschaft beim Eichensterben. AFZ- Wald 70:17–19
- Pinheiro J, Bates D (2006) Mixed-effects models in S and S-PLUS. Springer Science & Business Media
- R Core Team (2015) R: A language and environment for statistical computing [Internet]. Vienna, Austria: R Foundation for Statistical Computing. Available: http:www.r-proj.org. Accessed 20 January 2015
- Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, Müller M (2011) pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77. https://doi.org/10.1186/1471-2105-12-77
- Roland J (1994) After the decline: what maintains low winter moth density after successful biological control? J Anim Ecol 63:392–398. https://doi.org/10.2307/5556
- Ruohomäki K, Tanhuanpää M, Ayres MP, Kaitaniemi P, Tammaru T, Haukioja E (2000) Causes of cyclicity of *Epirrita autumnata* (Lepidoptera, Geometridae): grandiose theory and tedious practice. Popul Ecol 42:211–223. https://doi.org/10.1007/PL00012000
- Schmidt W, Stüber V, Ullrich T, Paar U, Evers J, Dammann K, Hövelmann T, Schmidt M (2015) Synopse der Hauptmerkmale der forstlichen Standortskartierungsverfahren der Nordwestdeutschen Bundesländer. Universitätsdrucke Göttingen, Göttingen
- Schwenke W (1978) Die Forstschädlinge Europas - Schmetterlinge. Paul Parey
- Selås V (2000) Is there a higher risk for herbivore outbreaks after cold mast years? An analysis of two plant/herbivore series from southern Norway. Ecography 23:651–658. https://doi.org/10.1111/j.1600-0587.2000.tb00308.x
- Tenow O, Nilssen A (1990) Egg cold hardiness and topoclimatic limitations to outbreaks of *Epirrita autumnata* in northern Fennoscandia. J Appl Ecol 27:723–734. https://doi.org/10.2307/2404314
- Tenow O, Nilssen AC, Bylund H, Pettersson R, Battisti A, Bohn U, Caroulle F, Ciornei C, Csóka G, Delb H (2013) Geometrid outbreak waves travel across Europe. J Anim Ecol 82:84–95
- Thomas FM (2008) Recent advances in cause-effect research on oak decline in Europe. CAB Rev Perspect Agric Vet Sci Nutr Nat Resour. https://doi.org/10.1079/PAVSNNR20083037
- Thomas FM, Blank R, Hartmann G (2002) Abiotic and biotic factors and their interactions as causes of oak decline in Central Europe. For Pathol 32:277–307
- Topp W, Kirsten K (1991) Synchronisation of pre-imaginal development and reproductive success in the winter moth, *Operophtera brumata* L. J Appl Entomol 111:137–146. https://doi.org/10.1111/j.1439-0418.1991.tb00304.x
- van Asch M, Visser ME (2007) Phenology of forest caterpillars and their host trees: the importance of synchrony. Annu Rev Entomol 52:37–55
- Virtanen T, Neuvonen S (1999) Performance of moth larvae on birch in relation to altitude, climate, host quality and parasitoids. Oecologia 120:92–101. https://doi.org/10.1007/s004420050837
- Wood SN (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc B 73:3–36
- Ydenberg RC (1987) Nomadic predators and geographical synchrony in microtine population cycles. Oikos 50:270–272. https://doi.org/10.2307/3566014
- Zevenbergen LW, Thorne CR (1987a) Quantitative analysis of land surface topography. Earth Surf Process Landf 12:47–56
- Zevenbergen LW, Thorne CR (1987b) Quantitative analysis of land surface topography. Earth Surf Process Landf 12:47–56. https://doi.org/10.1002/esp.32901
- Zuur AF, Ieno EN, Walker NJ, Saveliev AA, Smith GM (2009) Mixed effects models and extensions in ecology with R. Statistics for biology and health. Springer, New York

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.