Introduction

European beech (Fagus sylvatica L.) is the most abundant broadleaf tree species in central Europe. Due to its high shade tolerance and very broad ecological amplitude, beech dominates the potential natural vegetation on many sites and plays a key role as a late successional species (Ellenberg and Leuschner 2010).

Natural regeneration can play an important role in the process of converting pure coniferous stands to mixed stands by increasing the proportion of broadleaf trees (Felton et al. 2016), contributing to the restoration of forests. The regeneration phase is the best opportunity to influence tree species composition and forest ecosystem structures (Fischer et al. 2016; Löf et al. 2018). With natural regeneration playing an expanding role in silviculture in central Europe, the monitoring of the success of regeneration in stands is an important issue for forest management. However, predicting natural regeneration in mixed stands is difficult (Löf et al. 2018).

To better understand and predict successful natural regeneration of beech, a sound knowledge of the key ecological processes is required. Successful regeneration encompasses numerous ecological processes such as flowering, seed production, seed dispersal, storage, germination, seedling development and the successful establishment of juvenile trees (Harper 2010; Wagner et al. 2010; Fischer et al. 2016).

The process of flowering and the total number of seeds depends on the trees’ physiological age and the onset of maturity (Owens and Blake 1985). Within stands, beech begins seed production between the ages of 50 and 80 (Brown 1953). Wagner (1999) established a relationship between seed production and tree diameter. Beech masts irregularly, at intervals of three to ten years (Övergaard et al. 2007). The distance to the parent tree is relevant to seed dispersal by means of barochory and to the resulting seed density (Sagnard et al. 2007). While barochory commonly results in beech nut dispersal up to a radius of 20 m (Wagner 1999; Kutter and Gratzer 2006; Millerón et al. 2013), zoochorus dispersal is dependent on the suitability of the stands as a habitat for seed-dispersing animals (Löf et al. 2018). Kunstler et al. (2007) found beech regeneration in pine stands up to 3000 m from the nearest mature beech stand. In temperate latitudes, plants usually bridge the unfavourable growth conditions of the winter months by means of dormancy (Runkle 1989). This is usually associated with the storage of diaspores on the forest floor. There, the diaspores are susceptible to harmful fungi, predation and unfavourable environmental conditions (Jensen 1985; Madsen 1995).

After successful germination, establishment and early growth, beech seedling development is dependent upon light, soil moisture, nutrient supply and browsing intensity (Minotta and Pinzauti 1996). Madsen and Larsen (1997) identified light availability to be the most important factor influencing sapling density. Beech seedlings can establish under a wide range of light conditions (Emborg 1998). However, the survival rate and growth of beech seedlings increase with increasing irradiation (Madsen 1994; Minotta and Pinzauti 1996). The overstorey (Paluch et al. 2019), the shrub layer and the herbaceous vegetation all reduce the resources available to beech seedlings. Ground vegetation affects seedling growth especially severely when grasses dominate the field layer (Coll et al. 2003, 2004; Lin et al. 2014). In addition to interactions prompted by ground vegetation and the overstorey, there are effects of competition between the regeneration of various tree species. Dobrovolny (2016) demonstrated an increase in the Norway spruce (Picea abies L.) density with increasing irradiation and a superior height growth in comparison with beech from a light intensity of 20% or more. Furthermore, seedling density can be reduced by browsing. Beech seedlings can be browsed by birds, rodents, deer and other herbivores (Madsen 1995; Olesen and Madsen 2008).

The aim of this study was to develop a statistical model to predict natural regeneration of beech. A statistical analysis will show important variables influencing natural regeneration. Using the variables, the occurrence of natural regeneration can be predicted. This is of great interest for regeneration ecology as well as for practical forestry, as nature-oriented forest management favours natural regeneration in terms of cost reduction and site adapted regeneration, especially as a means to facilitate adaptation to climate change (Zerbe 2002; Dobrowolska 2006; Finkeldey and Hattemer 2010; Milad et al. 2013; Vrška et al. 2016; Kaliszewski 2017; Polley et al. 2018). There are several approaches for regeneration modelling, from which two groups can be distinguished: statistical models and mechanistic (process) models. While mechanistic models predict regeneration as a system of biological processes (Nathan and Muller-Landau 2000), statistical models allow for an accurate reproduction of observed patterns. Within statistical modelling, regeneration can be modelled on different spatial scales. While the first phases of the regeneration process are often investigated by modelling the dispersal curve through maximum likelihood methods (Ribbens et al. 1994; Clark et al. 1999) and point process statistics (Wild et al. 2014) on a small scale, regeneration establishment models use wide-scale routine inventory data. Regeneration establishment models are models that deal exclusively with the establishment phase of regeneration. The probability of the occurrence and the density of natural regeneration of beech can be analysed and predicted using regeneration establishment models.

Studies on regeneration establishment models for beech are rare (Klopčič et al. 2015; Kolo et al. 2017). Many studies have tended to focus on the ingrowth of beech, including recruitment over a certain threshold value, e.g. dbh > 10 or 12 cm (Klopcic et al. 2012; Manso et al. 2019; Zell et al. 2019). For the purposes of this paper, established regeneration was defined as beech trees with a height greater than 20 cm and a diameter at breast height (dbh) smaller than 7 cm. In addition to different spatial scales, established regeneration can also be analysed on different time scales. Established regeneration can be quantified either as a regeneration density over a measurement threshold at a defined moment in time or as a regeneration rate, i.e. the number of trees reaching a threshold value over a certain period of time (Lexerod and Eid 2005). The former approach is preferable for silvicultural planning, because a certain regeneration density is important for management seeking to achieve a high quality stand in the future.

A special feature of forest inventory data based on grid samples with permanent plots is the small experimental area of the sample plot. Most investigations include only plot information when modelling regeneration density (Li et al. 2011; Zhang et al. 2012; Shen and Nelson 2018). As a consequence, spatial patterns of seed availability resulting from the plot environment are considered in few studies (Shive et al. 2018; Monteiro-Henriques and Fernandes 2018). In addition to the crucial issue of small plot size, inventory datasets contain spatial autocorrelation due to abiotic or biotic processes and spatial structures in unobserved explanatory variables (Bachl et al. 2019). Few regeneration establishment models have considered spatial correlation (Dobrowski et al. 2015).

To the best of our knowledge, there have been no approaches to the modelling of beech regeneration taking spatial dependence into account. We applied a new approach to a statistical establishment model using the example of natural beech regeneration in the lowland forests of Saxony, Germany. The objective of this study was to improve understanding of what drives the absence or occurrence of beech regeneration and, once regeneration is present, what drives its density, by testing a wide collection of explanatory variables. Six hypotheses were derived from the knowledge presented above: (1) based on beech nut dispersal, it was hypothesised that the probability of the occurrence of regeneration and the density increase with an increase in the plot-specific basal area of beech and that there is a decrease with increasing distance between the plot and the nearest mature beech stand. (2) It was expected that the probability of beech regeneration increases with beech diameter, as seed production is related to tree diameter. (3) Due to the reduced availability of resources such as water and light as a function of the overstorey, it was assumed that beech regeneration density is affected by the total basal area of the overstorey of the plot. (4) In addition to the properties of the overstorey, it was hypothesised that due to interspecific competition in the understorey, beech regeneration density will decrease with an increasing density of spruce regeneration. (5) Beech regeneration density is negatively affected by herbivory, as a result of increased mortality. (6) Furthermore, it was assumed that the occurrence and density of beech regeneration are affected by the soil characteristics of the site. An excess of water is expected to affect the occurrence of beech regeneration negatively.

Materials and methods

Study area

The study was conducted in state forest located in northern Saxony, Germany. The region is characterised by Pleistocene lowlands and their transitions to loess uplands. The region encompasses eight ‘growth districts’ (forstliche Wuchsgebiete, Gauer and Aldinger 2005) (Table 1). Figure 1 visualises the geographic range of this study area. Scots pine is the predominant tree species in both mixed forests and pure stands (Table 1). Climate records (1981–2010) indicate mean annual precipitation of between 534 and 796 mm, and the elevation ranges from 0 to 600 m above sea level (Gauer and Kroiher 2012).

Table 1 Distribution of permanent sampling points (PSP) in the growth areas and characterisation of the growth areas in terms of tree species composition, altitude, average annual temperature (Temp.) and mean annual precipitation (Precip.) (Gauer and Kroiher 2012)
Fig. 1
figure 1

Location of the study region in northern Saxony, Germany, and the permanent sampling plot inventory. Dots represent the coordinates of the sample plots. Representation of the triangulation with 7752 vertices used for the SPDE basis function representation of the Matérn Gaussian field. Section representing the southern part of the study area

Forest inventory and analysis of data

The data used in the study were generated in the context of a forest inventory and were provided by the Saxony state forestry service (Sachsenforst). The forest inventory surveyed 8725 permanent sample plots on a grid square with a 200 m side length (see also Fig. 1). Each plot consisted of three concentric circles with a 6 and 12 m radius. In the 6 m circle, all trees with a dbh ≥ 7 cm and < 30 cm were measured, while in the largest circle only trees with a dbh ≥ 30 cm were recorded. In each plot, the location of all trees above the measurement threshold relative to the plot centre was recorded, based on distance and azimuth. In addition, tree species, dbh, age and several height parameters were measured and recorded.

For each plot, a subplot with a 2 m radius (≈ 12.57 m2), located 5 m north of the plot centre, was established. Within each subplot, regeneration of height ≥ 20 cm and a dbh < 7 cm was measured. The inventory design distinguished between tree height/dbh classes for the regeneration. The first size class B0 included regeneration of height ≥ 20 cm and < 50 cm. The second size class B1 included regeneration of height between 50 cm and < 130 cm, whereas the third size class B2 consisted of trees of height ≥ 130 cm. The number of individuals was recorded separately according to tree species, size class, browsing damage and whether protected from browsing damage. A distinction was also made between natural and artificial regeneration. The maximum number of seedlings recorded was 15; that is, a theoretical distribution of numbers was truncated at 15.

We used the number of naturally regenerated beech seedlings in the plot as the dependent variable for every size class, because it was to be expected that regeneration would be influenced differently by variables within regeneration development classes (Žemaitis et al. 2019). The occurrence and density of beech regeneration were assumed to be affected by various variables (see Table 2 and hypotheses).

Table 2 Summary statistics of the median values and ranges in parentheses for the explanatory variables in the model fit for plots without beech regeneration, beech regeneration size class B0, beech regeneration size class B1 and beech regeneration size class B2

All computations were done using R version 3.4.1 (R Core Team 2018). The following explanatory variables characterised the overstorey of the plot: quadratic mean diameter of beech in the overstorey (1.3 m, Dq), stand basal area (BA) and species-specific basal area are determined on the basis of each plot. The total basal area for the plot and the species-specific basal area were scaled up to hectare level. The understorey was described by explanatory variables. To test the influence of established spruce regeneration on beech regeneration, the density in the plot of spruce of size class B1 and size class B2 was used (Spr_no_B1, Spr_no_B2).

In addition to the plot-specific explanatory variables, the following variables from the plot surroundings were included. In order to estimate seed availability across the study area, the shortest distance from each sampling point to the nearest potential seed source (mature beech stand or stand with admixed beech trees older than 80 years) D80 was determined using the gDistance function in R’s rgeos package (Bivand and Rundel 2019). Another proxy variable to estimate seed availability was buffer. Within three different neighbourhoods (BUF100, BUF250 and BUF500), the proportion of the area occupied by beech stands (> 80 years) was estimated using the gBuffer function in R’s rgeos package (Bivand and Rundel 2019). For both variables, beech stands > 80 years were chosen in order to examine the established regeneration, which may have established several decades ago (B2). The age criterion ensures that the surrounding beech stands had already produced seed at some point in the past. The location of the beech stands was recorded in the context of the stand inventory and provided as a spatial polygons data frame.

Russell et al. (2017) showed that it is essential to consider the impact of browsing when modelling regeneration. The influence of browsing was characterised by two variables. First, browsing percentage (BP) was defined as the ratio between the number of browsed and the total number of rowan (Sorbus aucuparia L.) within size classes B0 and B1. The ratio of browsed rowan seedlings was used as the species is widely distributed in the study area and preferentially browsed by deer (Motta 2003; Götmark et al. 2005). The size classes were chosen based on the size of the game species. Rowan regeneration of size classes B0 and B1 occurred in 2100 of the 8725 examined plots. Inventory plots without rowan regeneration were set as not available (NA). Based on the 2100 plots, the browsing impact was interpolated for all plots using inverse distance weighting in R’s gstat package (Gräler et al. 2016). Second, a factor variable characterising the browsing intensity in a plot was defined. The study distinguished between non-browsed, simply browsed and multiply browsed rowan, depending on the frequency of browsing on a single plant. The highest browsing intensity reported was assigned to the plot as a factor. If there was no rowan regeneration in the plot, this was assigned to the category absence of rowan. A binary variable was used to indicate whether a plot was fenced in to prevent damage caused by game.

Elevation, aspect, the nutrient content of the soil and its moisture level were provided by Sachsenforst as site condition variables. The factor variable was divided into four factor levels according to the moisture level. A distinction was made between well-drained, temporarily waterlogged and groundwater-affected sites. Where no level of water storage capacity was provided for the site, the fourth factor level ‘not available’ was used. In order to analyse the influence of humus layer on beech regeneration, the nitrogen and carbon content and the total thickness of the humus layer were provided. These data were derived from a regionalised soil inventory. All variables describing climate conditions were retrieved from GIS databases provided by Gauer and Kroiher (2012).

Data pre-treatments

In order to avoid errors, the data were tested for their suitability in terms of model building (Zuur et al. 2018). Data exploration was carried out following the protocol described in Zuur et al. (2010). Outliers, which were checked with Cleveland dotplots, were not found. In order to select the correct distribution function for modelling, the data were examined for zero inflation. Frequency plots were built for this purpose, to show the number of observations with a certain count of beech seedlings. In addition, a variance inflation factor (vif) was calculated for every covariate in order to detect multicollinearity. Collinearity is defined as the existence of correlation between covariates (Zuur et al. 2010). Here, the function corvif was used (Zuur et al. 2009). The threshold value was set at 3 and covariates with higher vifs were dropped. Subsequent to variable selection, we searched for relationships between the explanatory variables and the response variable by deploying multipanel scatterplots. Finally, the data were examined for dependency structures. Due to a small proportion of seed-bearing beech in Saxony (see also Table 1) and its limited seed dispersal and patchy establishment, local density is likely to be autocorrelated (Flores et al. 2009). This kind of clustering of regeneration has the potential to lead to zero inflation, especially when a regular inventory grid is used. Spatial correlation can also result from unobserved explanatory variables (Bachl et al. 2019).

Estimating seedling density

Due to the nature of the regeneration process, regeneration data often follow a zero-inflated and overdispersed count distribution (Zhang et al. 2012; Shen and Nelson 2018). The observed count of beech seedlings in the Saxonian forest inventory also revealed a very high proportion of zeros (Fig. 2). Different methodological approaches have been used for regeneration and recruitment modelling, in order to account for this special feature of the data. In previous studies, density was often modelled in a two-step approach. First, the probability of occurrence is modelled based on a set of covariates. Second, another equation estimates the density of regeneration based on the same or a different set of covariates (Vanclay 1992; Sterba et al. 1997; Qin 1998; Klopcic et al. 2012; Yang and Huang 2015). Recent papers used a one-step modelling approach by combining two separate estimation processes into a joint distribution of probabilities (Shive et al. 2018; Shen and Nelson 2018; Monteiro-Henriques and Fernandes 2018; Zell et al. 2019). To take into account zero inflation and overdispersion, they made use of zero-inflated distribution models. Zero-inflated Poisson (ZIP), zero-altered Poisson (ZAP), zero-inflated negative binomial (ZINB) and zero-altered negative binomial distributions (ZANB) are distributions commonly used to include excess zeros in the modelling process (Fortin and DeBlois 2007; Li et al. 2011; Zhang et al. 2012). The major advantage of zero-altered models is that both processes (occurrence and density) can be investigated, facilitating ecological understanding (Cunningham and Lindenmayer 2005).

Following the example of Zuur et al. (2018), the model was fitted in a Bayesian framework using integrated nested Laplace approximation (INLA). In general, Bayesian inference for complex spatial models can be implemented with simulation methods such as the Markov chain Monte Carlo method (MCMC). Compared to MCMC, INLA is advantageous in terms of faster computation due to deterministic and accurate approximation by integrals in contrast to stochastic simulations (Krainski et al. 2019). The package R-INLA (Lindgren and Rue 2015) in R (R Core Team 2018) provides the INLA method to solve complex models with spatial random effects. For zero-inflated models, there is currently no approach within the package R-INLA to model occurrence depending on the covariates (Zuur et al. 2018). Zero-altered negative binomial generalised linear models (ZANB) were used in this study for the first time to model both probability of occurrence {0,1} and density of beech seedlings (BDi) (\(i=1,\dots ,n\)) at \(n\) observations in the plots, as a function of covariates. The ZANB model simultaneously estimates the parameters of a zero-truncated negative binomial distribution conditional on a Bernoulli trial. The former applies for the count data, whereas the latter deals with absence occurrence. The variable \({1}_{\{{BD}_{i}>0\}}\), which indicates the occurrence (value \(1\)) or absence (value \(0\)) of seedlings, is \(1\) with probability \({\pi }_{i}\),

$${\text{Pr}}\left( {BD_{i} = 0} \right) = 1 - {\uppi }_{{\text{i}}} , \quad 0 \le {\pi } \le 1$$

and conditionally, provided seedlings are present, \({BD}_{i}\) has the truncated negative binomial distribution given by

$${\text{Pr}}(BD_{i} = x{\mid }BD_{i} > 0) \propto \frac{k + x - 1}{x} \cdots \frac{k}{1}\left( {\frac{{\mu_{i} }}{{k + \mu_{i} }}} \right)^{x} \quad {\text{for}}\;x = 1,2, \ldots$$

The parameters \({\uppi }_{\mathrm{i}}\) and \({\mu }_{i}\) in both parts are expanded as a function of different covariates. Except for the explicitly hierarchical dependencies, all variables are assumed to be independent.

First, a base or ‘initial model’ taking no account of spatial dependence was used. This model used only covariates as predictors of occurrence and density. To examine whether patterns of spatial correlation could be used as surrogates for unobserved explanatory variables, we added a spatial random structure to the base model, i.e. a ‘complex model’. We then compared the complex model to the initial model to determine whether the spatial component could contribute to explaining the variation in beech regeneration occurrence and density. In addition, a model consisting of only a spatial component was tested, i.e. ‘spatial random model’. This procedure was also applied to a unimodal Poisson, negative binomial and zero-altered Poisson model. Preliminary investigations showed that the complex ZANB model exhibited the best fit to the data (shown in “Optimal random structure” section). The following formula describes the full ZANB model used to test the variables:

$$BD_{i} \sim {\text{ ZANB}}\left( {\mu_{i} ,\pi_{i} ,k} \right)$$
$$E\left( {BD_{i} } \right) = \frac{{\pi_{i} }}{{1 - P_{0} }} \times \mu_{i} \quad {\text{where}}\; P_{0} = \left( {\frac{k}{{\mu_{i} + k}}} \right)^{k}$$
$${\text{var}} \left( {BD_{i} } \right) = \frac{{\pi_{i} }}{{1 - P_{0} }} \times \left( {\mu_{i}^{2} + \mu_{i} + \frac{{\mu_{i}^{2} }}{k}} \right) - \left( {\frac{{\pi_{i} }}{{1 - P_{0} }} \times \mu_{i} } \right)^{2}$$
$$\mathrm{log}\left({\mu }_{i}\right)= {\beta }_{0}+ {{x}_{1i}\beta }_{1}+ .. . {{x}_{mi}\beta }_{m}+ {u}_{i}$$
$$u={\left({u}_{i}\right)}_{i =1,\dots ,n} \sim N(0,{\sum }_{u})$$
$$logit\left({\pi }_{i}\right)= {\gamma }_{0}+ {z}_{1i}{\gamma }_{1}+ .. . + {z}_{ni}{\gamma }_{n}{+ v}_{i}$$
$${v=(v}_{i}{)}_{i =1,\dots ,n} \sim N(0,{\sum }_{v})$$

The term πi is the estimated probability of beech regeneration in the plot, i.e. occurrence, and is modelled via the Bernoulli part of the model. The logit transformation is defined as \(logit\left({\pi }_{i}\right)=\mathrm{log}(\frac{{\pi }_{i}}{1-{\pi }_{i}})\). The γs correspond to the n coefficients of the binary model, which are multiplied with each of the n selected explanatory variables zi. A zero-truncated negative binomial part of the model is used for the estimation of nonzero densities µi, where µi is log transformed. The βs correspond to the m coefficients of the count model, which are multiplied with each of the m selected explanatory variables xi. For the parameter estimation of the fixed effects, the default prior distributions of R-INLA, i.e. normal distributions, were used. \({\beta }_{0},\dots ,{\beta }_{m}\) and \({\gamma }_{0},\dots ,{\gamma }_{n}\sim N\left(0,1000\right)\)

In addition to the fixed effects, the zero-truncated negative binomial model estimated a hyperparameter k called size, which takes overdispersion into account. The default prior distributions of R-INLA were used for the parameter estimation of k, i.e. log gamma prior distribution (Table A1). Weakly informative prior distributions were selected for the coefficients as a priori knowledge was not available.

The terms ui and vi are random intercepts assumed to be spatially correlated with mean 0 and covariance matrix ∑ in the counts and binary parts of the model. It was assumed that u and v exhibit Markovian behaviour and follow a Gaussian Markov random field (GMRF). To quantify the associated covariance matrices of u and v, Matérn covariance functions were used and numerically approximated using SPDE (continuous domain stochastic partial differential equation). The SPDE approach provided the Matérn covariance parameter, i.e. hyperparameters θ1 = log(τ) and θ2 = log(κ). The hyperparameter κ quantified the range of the effect as the distance at which the spatial dependency diminishes. The hyperparameter τ is a measure of the strength of the spatial random effects (Lindgren and Rue 2015). R-INLA uses a normal distribution for the priors of θ1 and θ2 (Table A1).

A dense grid of non-overlapping triangles over the sampling area was needed to solve the SPDE. Therefore, the finite element approach was applied. The boundary of the study area and the locations of the plots were used to build the mesh of non-overlapping triangles using the function inla.mesh.2d(). max.edge and cutoff were used as arguments within this function in order to control the shape of the triangles. The maximum edge argument specified the triangles’ maximum allowed edge length for the inner and outer part of the mesh and was set to 700 (Zuur et al. 2017). The cut-off argument ensured that there was a minimum allowed distance between two points. This argument was set to 250. As the coordinate projection was set up in a UTM projection, these values were in metres. In total, there were 7752 vertices covering the study area. Figure 1 shows part of the mesh that was used to approximate the spatial field. Spatial prediction at a specific location within in the GMRF was straightforward because SPDE provided the approximation of the entire spatial process. It was just a matter of including the locations where predictions were required as observations of missing values in the model (Krainski et al. 2019). The works of Lindgren and Rue (2015), Zuur et al. (2017) and Krainski et al. (2019) are recommended for further information on INLA and the SPDE approaches.

As the influence of the explaining variables was expected to vary between the different size classes of beech regeneration (see Table 2), the models were individually fitted for each size class. All 16 chosen variables were tested as predictors in both the count and in the binary part of the model.

The model output could be differentiated in fixed effects and hyperparameters. The posterior mean and the quantiles were presented to summarise the posterior marginal distribution. Analogous to frequentist statistics, a parameter is considered not significant if the 95% credible interval includes 0 when taking the 2.5% and 97.5% quantiles of the posterior distribution. Non-significant variables were removed from the fixed part of the model. The posterior mean indicates the influence of the parameter.

The Watanabe–Akaike information criterion (WAIC) was applied for model selection. WAIC consists of two terms representing the quality of fit and the model complexity. For further details see Gelman et al. (2014). The model with the smallest WAIC was chosen. The leave-one-out cross-validation served as a further factor for model selection, used to quantify the prediction accuracy of the model. This factor could also be used to validate the model. R-INLA computed the so-called conditional predictive ordinates (CPO). It was defined as the posterior probability of an omitted observation given all the other data for the model fit. Large values indicate a good fit of the model to the observed data. The CPO values are presented in dotplots, and the mean logarithm CPO (LCPO) was calculated as a measure summarising the values. Lower values of the LCPO indicate a better fit of the model. The model accuracy of the occurrence part of the model was evaluated using receiver operating characteristic (ROC) curves. The ROC curve is generated by plotting the rate of true positive results against the rate of false positive results at different threshold settings. Values at 0.5 indicate a fit that is no better than random. Values of 1 indicate a perfect fit.

Results

Characteristics of beech regeneration

As expected, a high number of the inventory plots did not reveal any occurrence of beech natural regeneration (Fig. 2). Approximately 91.5% of all plots contained no beech regeneration in the B0 size class. On average, 3.9 individuals of size class B0 were found in the 744 colonised plots (0.31 ind. m−2). Approximately 92.3% and 93.1% of all plots contained no beech regeneration of size class B1 or B2, respectively. Beech regeneration of size class B1 was found in 676 plots, and the mean regeneration count per colonised plot was around 2.9 (0.23 ind. m−2). Only 601 plots possessed beech regeneration of size class B2. The mean beech count in plots with beech regeneration of size class B2 was around 3.0 (0.24 ind. m−2). Analyses revealed that natural beech regeneration included an excessive number of non-occurrence (zeros) and demonstrated a large variation around the mean. A higher count of beech regeneration is observed less frequently than a low count (Fig. 2). In the dataset, a regeneration count of 15 constituted an exception. The smallest size class (B0) exhibited a higher density in the colonised plots than B1 and B2 (Fig. 2).

Fig. 2
figure 2

Histogram with a Y-axis break for the beech regeneration count on the plot for size classes B0, B1 and B2. The proportion of plots without oak regeneration, the mean (\(\overline{x })\) and the variance (σ2) are given

Optimal random structure

Spatial correlation was expected to occur as the data were collected on a systematic grid. A comparative analysis revealed that those models with covariates only, and models with a spatial term only, exhibited higher WAIC values than the full model, suggesting a poorer fit for the former. This result indicated that the spatial relationship between the plots contained valuable information that is not captured by environmental covariates alone. As a result, the final ZANB model contained spatial correlation in the count part as well as in the binary part of the model. For the sake of a simplified presentation, the two classes of effects (fixed effects and spatial random effects) for the occurrence and density of regeneration are described separately in the following. However, it should be mentioned that the ZANB model fits both components simultaneously. All effects presented separately are conditional on the rest of the model.

Occurrence model of beech regeneration

Many of the covariate data elements originate from the same systematic grid as the regeneration data and other covariates were interpolated from wider grids. Therefore, we expected some collinearity between the different covariates. According to the vif, we found multicollinearity among the buffer queries of different neighbourhoods and between distances to nearest potential seed sources at different ages selected. Only uncorrelated covariates were used for the following analysis.

Among the three different size classes, different variables had an impact on regeneration occurrence (Table 3). Plotting probabilities of the occurrence of beech regeneration B0 against different Dq at different humus thicknesses revealed a positive influence of Dq and a slightly positive influence of humus layer (Fig. 3). Based on an estimated probability greater than the threshold value of 0.5, Fig. 3 indicates that beech regeneration was likely to occur from a Dq of 41 cm and higher. The probability of occurrence increased with increasing Dq. Distance to the nearest beech stand had a significant negative impact on the occurrence of beech regeneration. The probability of the occurrence of beech regeneration decreased with increasing distance to the nearest mature beech stand (> 80 years). In addition, beech regeneration was found to be more likely as the proportion of the area occupied by beech within the 500 m radius increased. Figure 3 highlights the effect of the distance to the nearest mature beech stand on different sites. The probability of beech regeneration decreased drastically when the distance to the nearest beech stand became greater than 1000 m. Beech regeneration was very unlikely at yet greater distances, as indicated by an estimated probability of 0.34. Where distance was equal, the probability of the occurrence of beech regeneration on well-drained sites was greater than on groundwater-affected and temporarily waterlogged sites. Figure 3 also highlights the importance for the occurrence of beech regeneration the influence of spruce regeneration density and spruce basal area. Where there are five spruce seedlings of size class B2 or more on a plot, beech regeneration of size class B0 can no longer be expected to occur. This corresponds to a spruce regeneration density of 4000 per hectare. The probability of beech regeneration occurrence became very low where spruce basal area exceeded 30 m2/ha.

Table 3 Fixed effects of the occurrence/absence part (Bernoulli part) of the ZANB models for size classes B0, B1, B2
Fig. 3
figure 3

Predicted probability of the occurrence of natural regeneration of beech depending on Dq and humus thickness (left), distance to the nearest mature beech stand on different sites (middle) and spruce basal area with varying numbers of spruce seedlings B2 (right) for different size classes (all continuous variables are set to their mean; all factors are set to their most frequent factor level). The confidence intervals were not plotted for reasons of clarity, as they are quite large

Dq, distance to the nearest mature beech stand, basal area of spruce, humus thickness and excess water showed similar effects on the probability of the occurrence of beech of size class B1 as was the case for size class B0 (Table 3). While the occurrence of beech regeneration of size class B0 was very likely for a Dq > 41 cm, size class B1 occurred more frequently from Dq > 55 cm, indicated by a probability ≥ 0.5 (Fig. 3). The probability of occurrence decreased with increasing distance to the nearest mature beech stand. Regeneration was unlikely on well-drained sites from distances ≥ 300 m. Regeneration was also unlikely on groundwater-affected sites, even in the vicinity of mature beech stands, indicated by an estimated probability < 0.5. The covariates pine basal area and spruce regeneration density did not show any effect on the occurrence of beech of size class B1.

In contrast to size classes B0 and B1, the occurrence of size class B2 was not dependent on spruce and pine basal area. However, Dq, distance to the nearest mature beech stand, humus thickness and excess water showed similar effects on the probability of occurrence (Table 3). Beech regeneration of size class B2 occurred from Dq ≥ 70 cm (see Fig. 3). Furthermore, occurrence was only likely in the vicinity of mature beech stands. The model predicted no beech regeneration of size class B2 on groundwater-affected sites, independent of the distance to the nearest mature beech stands, with an estimated probability of 0.2.

Count model of beech regeneration

Stand variables significantly influenced the extent of beech regeneration (Table 4), whereas no site variables exhibited an effect on regeneration density. Among the stand variables, beech basal area and Dq had a positive effect on the density of size class B0. Figure 4 shows the effect of beech basal area with a variable Dq on regeneration density. The quadratic term of beech basal area was found to have a negative effect on regeneration density of size class B0. Regeneration density rose sharply with increasing beech basal area, especially for plots > 20 m2/ha. After reaching the maximum, the regeneration density of size class B0 decreases as the beech basal area increases. In addition, beech regeneration density is higher with higher Dq. A high density of beech regeneration of size class B0 was more likely in plots with non-browsed or simply browsed rowan. Plots with the factor multiply browsed rowan had a significantly lower beech regeneration density than those with only low or no browsing of rowan. The density of beech regeneration decreased with increasing basal areas of pine and spruce. Spruce basal area had a significant negative influence on density. The model predicted a low beech regeneration density for a spruce basal area of 30 m2/ha (Fig. 4).

Table 4 Fixed effects of the count part (zero-truncated) of the ZANB models for size classes B0, B1, B2
Fig. 4
figure 4

Predicted density of natural regeneration of beech depending on beech basal area with varying quadratic mean beech dbh (left), beech basal area with browsing intensity and percent in rowan (middle), pine basal area with different spruce basal area (middle) and total basal area for different size classes (all continuous variables are set to their mean; all factors are set to their most frequent factor level). The confidence intervals were not plotted for reasons of clarity, as they are quite large

Beech basal area, Dq and pine basal area showed similar effects on the density of beech of size class B1 as was the case for size class B0. However, the model did not reveal any effect of spruce basal area on the regeneration density of size class B1. Whereas the factor browsing intensity showed an impact on the regeneration density of size class B0, the browsing percent had a significant negative effect for size class B1. Figure 4 highlights the effect of Dq and browsing percent on the regeneration density of size class B1. Beech regeneration density was found to be higher for plots with a low browsing percent in rowan. The regeneration density of size class B1 was lower than that of size class B0. The maximum predicted regeneration density was 11 seedlings per plot, corresponding to a density of 8750 seedlings per hectare (Fig. 6).

Beech basal area and Dq had similar effects on regeneration density for size class B2 as for size classes B0 and B1. In contrast to the effects they had on size classes B0 and B1, spruce and pine basal area had no significant effect on size class B2. The total basal area in a plot did influence density negatively, however (Fig. 4). Similar to the effects it had on size class B1, browsing percent in rowan had a negative impact on regeneration density. The predicted density for size class B2 was even lower than for size classes B0 and B1 at a medium beech basal area (Fig. 4).

Spatially smooth random effect

In addition to the fixed effects, the model included spatial random effects. The mesh used to fit the data is presented in Fig. 1. The spatial random field was defined by the hyperparameters θ1 and θ2. Table A2 (Appendix) shows the values of the hyperparameters that defined the spatial range and variance of the random field. In order to visualise the spatial random effects, an example area within the study area was selected. It is a representative large closed forest area used to explain the spatial random effects, which were estimated.

Figure 5 shows the example area of the spatial random field for the binary part of the model, Fig. 6 for the zero-truncated count part. The spatial random fields for the Bernoulli part of the models indicate that there was an obvious spatial pattern of unexplained variability in the example area where the probability of beech occurrence could not be precisely predicted by the fixed effects alone. The probability of the occurrence of beech regeneration of size class B0 was higher to the south of the example area in reality than was modelled by the fixed effects. The probability of occurrence was higher especially at the edges of fragmented forest areas, both in the example area (Fig. 5) and in other parts of the study area. For some parts of the north-west of the study area, the probability of regeneration occurrence was also higher than predicted by the fixed effects. Furthermore, there were parts with a high probability of an absence of regeneration. In the case of the example area in Fig. 5, the probability of occurrence was higher in the northern part. In the eastern part of the example area, the probability of occurrence was lower than predicted by the fixed effects. Spatial random effects for the occurrence of beech regeneration for size classes B1 and B2 showed similar but more distinct effects than for size class B0.

Fig. 5
figure 5

Example area from within the study region with observed occurrence of beech regeneration (left), spatial prediction of the occurrence of beech regeneration (middle) and spatial random field for the binary part of the ZANB model (right) for the three size classes. Areas shaded dark blue have a lower probability of occurrence of regeneration than that predicted by the fixed effects. Yellow to red areas have a higher probability

Fig. 6
figure 6

Example area from within the study region with observed density of beech regeneration (left), spatial prediction of beech regeneration density (middle) and spatial random field for the zero-truncated count part of the ZANB model for the three size classes and part of the investigation area. Areas shaded dark blue have a lower probability of high regeneration density than predicted by the fixed effects. Yellow to red areas have a higher probability. (Color figure online)

The precision of the prediction of occurrence achieved by the different models is shown both in Fig. 5 and in the ROC curves (Figure A1, Figure A2 and Figure A3). These figures also show how the model prediction is improved by the random effects, as a comparison is presented including and excluding random effects.

The spatial random fields for the zero-truncated count part of the models is shown (Fig. 6), using the same example area referred to above. The figure reveals a pattern of unexplained variability in parts of the example area where beech regeneration density was not predicted precisely by the fixed effects alone. The comparison of the random fields indicated that the random effects were smaller for size class B0 than for B1 and B2. There were some areas for which density was higher than predicted by fixed effects. For the B2 size class model, the uis were even close to 1.6, corresponding to a fourfold increase in density due to spatial correlation. Most of these areas were at the edges of the forest complexes and close to villages and the city of Dresden. The observed regeneration density was high in these locations. However, there were some areas exhibiting a density 60% lower than predicted by the fixed effects. Most of these areas were inside the forest areas (Fig. 6). The precision of the overall prediction provided by the different models can be seen both in Fig. 6 and in the CPO values (Figure A4, Figure A5 and Figure A6). The comparison of observed densities with predicted densities, root mean square error (RMSE) and bias estimate in the form of the mean prediction error (E), as well as the CPO values of the leave-one-out cross-validation indicated good prediction accuracy (Table A3).

Discussion

Methodological discussion

In this study, we successfully created complex ZANB spatial models with various covariates stemming from both the plot and the surrounding stand. In contrast to many earlier studies that followed a frequentist approach (Zhang et al. 2012; Klopcic et al. 2012; Yang and Huang 2015; Shen and Nelson 2018; Monteiro-Henriques and Fernandes 2018; Zell et al. 2019), a Bayesian approach was applied. Although maximum likelihood methods and Bayesian methods with weakly informative priors provide the same results for fixed effect models (Rue et al. 2009; Zuur et al. 2018), the great advantage of using Bayesian methods in R-INLA is the user-friendly integration of spatial random effects. The definition of complex variance structures (e.g. Matérn covariance function) is straightforward. The MCMC algorithm (Rathbun and Fei 2006; Flores et al. 2009; Entezari et al. 2019) and the INLA algorithm can be used to estimate the spatial random effects of geostatistical models. However, the computational efficiency of INLA means it exhibits much higher computation speed than MCMC approaches (Rue et al. 2009; Lindgren and Rue 2015). In addition, R-INLA offers a simple way to provide criteria for model comparison and prediction accuracy (Rue et al. 2009).

The ZANB model provides a useful framework for modelling regeneration data with an excess of zeros. It enabled us to examine both processes relevant to occurrence and processes that influence density, and therefore improve our understanding of the regeneration ecology of beech (Welsh et al. 1996; Cunningham and Lindenmayer 2005). The two-step computation described in chapter 2.4. is how Zuur et al. (2018) suggested applying the function inla(), which cannot directly handle unequal probabilities \({\pi }_{i}\). This was done in one pass by applying the function to a stacked response vector consisting of all \({1}_{\{{BD}_{i}>0\}}\), \(i=1,...,n\), and the nonzero values \({BD}_{i}\), and approximating the truncated negative binomial distribution for the second part with a zero-altered negative binomial distribution where the probability of zero values is fixed at a number very close to \(0\). However, not all probability distributions are currently available for model building in R-INLA. There is no possibility within R-INLA to calculate a zero-inflated negative binomial model with covariates in the occurrence part of the model (ZINB) (Zuur et al. 2018).

The response variable was formed by the occurrence and density of naturally regenerated beech seedlings above 20 cm in height and below 7 cm in dbh. This choice involved several restrictions. Smaller seedlings of below 20 cm height were not modelled as they were not recorded in the Saxony state forest inventory. Therefore, it may be deemed uncertain whether our model is able to represent the first stages of regeneration establishment. However, a clear advantage of a height threshold of 20 cm is that the most critical phase in the life cycle of the plant, namely germination and first establishment, has already been overcome, and thus, the information about these plants has relevance for management. In contrast, during germination, beech seedlings can be damaged by drought, frost or biotic pathogens (Jensen 1985; Minotta and Pinzauti 1996; Coll et al. 2003), making information about germinants and very small seedlings less valuable for management.

Our models for beech regeneration were based on forest inventory data. Inventory data are suitable for regeneration modelling as the data is representative for the sampled area and a large number of plots covers wide gradients of covariates. In addition, the model can be improved after the next inventory with follow-up data. A major disadvantage of inventory data is the small experimental plot size. As a consequence, the dispersal of beech cannot be depicted by the plot itself. The distance to the nearest mature beech stand is an estimate of seed availability. It is a biased estimator of the effective dispersal distance providing underestimations of the dispersal distance, as has been shown by parentage analysis (Oddou-Muratorio et al. 2011; Millerón et al. 2013). However, dispersal can be overestimated if not all mature beeches have been recorded. The accuracy of the detection of beech stands in the stand inventory is crucial. Further information on this topic can be found in Axer and Wagner (2020).

Inventory data used in regeneration modelling usually have a nested structure and are grouped within geographic locations. In addition, regeneration is a rather uncertain process, as it is affected by successful flowering, masting, storage, germination and establishment. The regeneration data exhibited a high degree of spatial autocorrelation, which should be considered when modelling natural regeneration. Few satisfactory results have been obtained for regeneration establishment models and their predictions in the past (Schweiger and Sterba 1997; Wikberg 2004; Klopčič et al. 2015). Miina et al. (2006) recommended including random effects in the modelling approach. It could be shown that taking random effects (Li et al. 2011; Welch et al. 2016; Shive et al. 2018; Zell et al. 2019) or spatially correlated residuals into account (Rathbun and Fei 2006; Flores et al. 2009) increases the model adjustment and improves prediction. Compared to plot-specific or stand-specific random effects, modelling fixed effects with spatially continuous random fields provides more interpretable results of the random effects and provides insights into the spatial correlation of regeneration data. The applied method has also been employed successfully in ecology research (Beguin et al. 2012; Leach et al. 2016; Dutra Silva et al. 2017; Sadykova et al. 2017), climate research (Cameletti et al. 2013) and medical research (Musenge et al. 2013; Arab 2015; Blangiardo et al. 2016). This suggests a theoretical suitability for regeneration establishment models also, as spatially correlated count data are also present here.

Spatial prediction for a given location is straightforward as the model provides the approximation of the entire Gaussian Markov random field (Lindgren and Rue 2015). The variability is correctly assigned to the predictions and enables predictions at unsampled locations (Beguin et al. 2012). The prediction in INLA can be performed by including locations in the model with missing response values. In this way, maps for the occurrence and density of beech regeneration can be generated easily (Fuglstad and Beguin 2018; Krainski et al. 2019). This is an important planning tool for forest management (Figs. 5, 6). It supports decision-making on how to utilise natural beech regeneration, e.g. for forest conversion.

Established regeneration in inventory plots, that is, trees at least 3–5 years old, may have undergone varying ecological conditions from the time of germination and establishment until the day of inventory. Due to the time lag between early establishment and the date of the inventory, the effect intensities of the explanatory variables employed may not be as strong as in studies with a coordinated examination design, e.g. true experiments. The variables used for our model are all ecologically interpretable. Other studies have used, for example, forest ownership as an explanatory variable for regeneration establishment (Kolo et al. 2017), which has no apparent ecological meaning, or its relevance is obscured. In the following, the fixed effects will be discussed in an ecological context and related to the regeneration cycle of beech (Wagner et al. 2010; Fischer et al. 2016).

Variables influencing the occurrence of beech regeneration

The occurrence of beech regeneration is influenced by both stand and site characteristics. Due to the short mean seed dispersal distances of beech (Wagner 1999; Millerón et al. 2013), the occurrence of mature beech in a plot is the primary driver of the occurrence of beech regeneration. As mean dbh offers reliable evidence of fructification (Wagner 1999), the finding that the quadratic mean dbh of beech influences regeneration occurrence fits. Kolo et al. (2017) arrived at a similar result. They found a positive influence of the mean tree height and the mean age of beeches in the overstorey, which are both closely related to the dbh. A sequence over time can be observed in the explanatory variable as higher levels of beech regeneration are predicted at a larger mean beech dbh. For quadratic mean diameters > 41 cm, the probability of beech regeneration of size class B0 is higher than 0.5 (Fig. 3). In the case of beech regeneration of size class B2, regeneration occurred several years ago and during this time the mature beech trees could grow from 40 to 70 cm dbh.

However, beech regeneration was also found in plots lacking mature beech trees. Due to the small plot sample size, it was not possible to depict seed distribution completely. Spatial patterns of seed availability resulting from the plot environment have only been taken into account in a few studies to date (Shive et al. 2018; Monteiro-Henriques and Fernandes 2018). The model indicated that distance to seed source is another major driver (and limitation) of beech regeneration patterns. This finding was also observed in other studies (Kunstler et al. 2007; Irmscher 2009; Poljanec et al. 2010; Millerón et al. 2013). Poljanec et al. (2010) found that an increase in the distance to the nearest beech stand of 1000 m reduced beech ingrowth (> 10 cm dbh) by 46%. Our results also confirmed that the probability of the occurrence of regeneration of size class B0 drops to below 0.5 at a distance of 900 m (Fig. 3). When modelling established regeneration data, the effective dispersal distance is used. The distances estimated using our model are, therefore, distances for the effective dispersal. These are not to be equated with distance estimates that are based on the primary dispersal, i.e. barochorus seed dispersal (Wagner 1999; Kutter and Gratzer 2006; Sagnard et al. 2007). In the case of effective seed dispersal, the germination and survival of regeneration has to be taken into account. Millerón et al. (2013) showed, for example, that the effective dispersal of beech exceeds the primary dispersal. According to fructification of beech, the age selection of stands ≥ 80 years for distance determination showed the best fit. The predicted probability of the occurrence of beech of size classes B1 and B2 was smaller than for B0 at the same distance. The different probabilities of occurrence for the different size classes (Fig. 3) contradicted the coefficients, which were estimated to be the same by the model for all size classes (Table 3). The different probabilities in Fig. 3 can be explained by the different total density between the size classes. The threshold of 1 regenerated beech {0;1} was reached due to the lower plant density for B1 and B2 at shorter distances. The lower plant density in the larger size classes was a consequence of density-dependent and density-independent mortality with increasing growth (Akashi 1997; Collet and Le Moguedec 2007).

There are two mechanisms for seed dispersal of beech given the two variables in question, namely occurrence of beech in the plot overstorey and distance to the nearest beech stand in the plot environment. While close-range seed dispersal is determined by barochory, zoochorus seed dispersal becomes decisive at distances ≥ 20 m. Compared to the density of seed dispersed barochorusly, the density of seeds dispersed by zoochory is much lower. Millerón et al. (2013) emphasised that zoochorus dispersal takes place on two scales. While small mammals, especially mice, are only able to transport beech nuts over small to medium distances of up to approximately 30 m (Jensen 1985; Jensen and Nielsen 1986; Den Ouden et al. 2005), birds can move seeds over longer distances (Nilsson 1985). Observed distant dispersal events were probably due to activities of the jay (Garrulus glandarius L.) (Turcek 1961; Nilsson 1985). The nuthatch (Sitta europaea L.) has smaller territories than the jay and usually deposits seed near the seed source and rarely at distances greater than 40 m. Nilsson (1985) observed jays transporting beech nuts about 1 km from the seed source. Gómez (2003) noted an average distance of 250 m from the seed source for holm oak acorns (Quercus ilex L.). Jays hide acorns and probably beech nuts in places that are suitable for germination and establishment (Bossema 1979). Although only a small proportion of the beech mast is hidden by the jay, these seeds seem to have a relatively high survival rate (Nilsson 1985). However, Perea et al. (2011) emphasised that for the dispersal of beechnuts by jays, the occurrence of other preferred species, e.g. oaks, and the frequency of non-masting years should be considered. Nilsson (1985) stated that jays only hide beech nuts when there are no acorns in a particular year. In the case of American beech (Fagus grandifolia EHRH.), dispersal distances of up to 4 km from the mother tree have been recorded for the blue jay (Cyanocitta cristata L.) (Johnson and Adkisson 1985). The results in relation to the influence of the distance to nearest mature beech stand (D80 in Table 3) may reflect the behaviour of the jay.

A high basal area of spruce in a plot reduces the probability of the occurrence of regeneration of size classes B0 and B1. Beech regeneration in spruce stands is limited by the high stand density, a lack of mother trees and a large distance to neighbouring beech stands. Poljanec et al. (2010) showed that beech ingrowth was not possible in stands featuring a spruce share of 70%.

The negative impact of spruce regeneration density on beech regeneration was an indication of interspecific competition. Existing spruce regeneration of size class B2 prevented establishment of beech seedlings. In contrast, spruce regeneration of size class B2 revealed no effect on beech regeneration of size classes B1 and B2. Investigations by Unkrig (1997), Dobrovolny (2016) and Kühne and Bartsch (2003) examined the influence of irradiation on the composition of beech–spruce regeneration. A higher regeneration density of spruce was associated with higher irradiation due to canopy gaps. Therefore, a higher amount of irradiation can be assumed to promote the establishment of spruce regeneration and, hence, prevent the establishment of beech regeneration.

The effect of site suggests that regeneration is not to be expected on sites that are not favourable for beech. This finding is consistent with the findings of Poljanec et al. (2010). They found a higher probability of beech occurrence on sites with a higher proportion of beech in the potential natural vegetation, which considers soil conditions as well. The absence of beech on hydromorphic sites is an empirical phenomenon that has been observed repeatedly (Ellenberg and Leuschner 2010; Aertsen et al. 2012). For the development of an intensive, deep-reaching root system, beech requires loose, well-aerated soil (Asche et al. 1995). In pseudogley soils, it is almost exclusively the fine roots that penetrate the compacted horizons, along shrinkage cracks (Kreutzer 1961). Schmull and Thomas (2000) were able to demonstrate morphological reactions of beech seedlings to waterlogging. Seedlings cannot generate roots below the water table. In addition to morphological reactions, physiological reactions that indicate increased stress in plants can be shown. This finding is consistent with our results. The negative influence of hydromorphic sites increases with increasing size class (Fig. 3).

Variables influencing beech regeneration density

Taking the ecological characteristics of beech into account, beech regeneration density is highly dependent on the dispersal of beech nuts (Sagnard et al. 2007; Millerón et al. 2013). Most beech nuts fall within the vicinity of the seed tree. According to earlier studies (Poljanec et al. 2010; Klopcic et al. 2012; Ramage and Mangana 2017; Žemaitis et al. 2019), the basal area of beech in a stand significantly influences the density of beech regeneration. A high density of beech regeneration is more likely when the beech basal area in a plot is high. A larger proportion of beech in the plot leads to more beech nuts and, consequently, to more beech regeneration. As beech nuts are highly attractive to seed predators, the rate of predation of beech nuts by small mammals in stands with a low degree of beech admixture could be higher (Kelly and Sork 2002; Yasaka et al. 2003; Kon et al. 2005). The lower beech regeneration density for admixed beech trees can, therefore, result from reduced seed availability and effective predation. In addition to beech basal area, Dq is positively related to beech regeneration density because seed production is related to diameter (Wagner 1999). Furthermore, beech is a late successional species, able to regenerate and establish under a wide range of light conditions (Emborg 1998). This was evidenced by the fact that the size classes B0 and B1 were not dependent on the total basal area. However, there was an optimum relationship between beech basal area and the density of seedlings of size class B0. Different studies have highlighted the weak relationship between regeneration density and light availability (Madsen and Larsen 1997; Szwagrzyk et al. 2001; Petritan et al. 2007; Ammer et al. 2008). Ammer et al. (2008) revealed that the importance of light availability increases with seedling size. All of these results are consistent with the natural regeneration methods commonly used for beech. A shelter provided by mature trees combined with a long regeneration period leads to a high regeneration density (Nyland et al. 2006; Peña et al. 2010).

Although the distance to the nearest mature beech stand had a positive influence on the occurrence of beech regeneration, it did not have any effect on beech regeneration density. Compared to barochorus seed density, the density of zoochorously dispersed seeds was significantly lower. Various investigations of long-distance dispersal of beech seed (Kunstler et al. 2007; Irmscher 2009; Dobrovolny and Tesař 2010; Mirschel et al. 2011) revealed only very low regeneration densities at higher dispersal distances. Therefore, the distance to the nearest mature beech stand influences only occurrence {0;1}, but not density {1,…15}. Kunstler et al. (2007) found an average density of 52 beech seedlings per hectare in pine stands. Dobrovolny and Tesař (2010) observed a density of 65 plants per hectare at a distance of 200 m from a seed source, while Irmscher (2009) detected 5 beech seedlings per hectare at 240 m from the nearest seed source. According to our sampling design, a count of 1 beech seedling per plot corresponded to a density of 795 per hectare. As can be seen in Axer and Wagner (2020), higher densities are almost completely absent at greater distances.

Several studies have shown the impact of browsing on regeneration (Ammer 1996; Motta 2003; Boulanger et al. 2009). The results of our study demonstrated that browsing intensity and browsing percentage in rowan had no effect on the occurrence of beech regeneration in this study area, but there was an influence on density. The browsing intensity in rowan was taken as an indication of a certain concentration of game (Chevrier et al. 2012). In contrast to Kamler et al. (2010), who found no correlation between deer density and browsing intensity in rowan, we were able to demonstrate a spatial variation in browsing intensity in rowan, indicating a concentration of game. The overall game density would appear to be decisive for these contrasting findings. Our results reinforced the findings of Klopcic et al. (2012) and Russell et al. (2017). Both of these studies found no impact of browsing on occurrence, but on density. Compared to other tree species, beech is less severely affected by browsing. This makes an absence of beech regeneration due to browsing unlikely (Motta 2003; Götmark et al. 2005; Boulanger et al. 2009). However, density is indirectly influenced by browsing, as the mortality rate of regeneration increases and other plants not preferentially browsed can grow better. Surprisingly, plots with the factor simply browsed rowan exhibited a higher density of beech regeneration of size class B0 than plots with non-browsed rowan. This finding was consistent with the insights obtained by Kupferschmid (2018), who found that browsing of a certain species also depends on the occurrence of other tree species. The occurrence of rowan can lead to a decrease in browsing of beech. In line with our results, Mirschel et al. (2011) also found a positive correlation between browsing intensity and beech regeneration.

As has been demonstrated, browsing affects regeneration density. According to Kolo et al. (2017), browsing is a major source of error in predicting the probability of regeneration. Therefore, they recommended that browsing intensity be considered in future studies.

Random effects

Many ecological models take into account spatial autocorrelation in data that is due to biotic and abiotic processes (limited seed dispersal, clumping of safe sites) as well as spatial structure in unobserved explanatory variables (Flores et al. 2009; Bachl et al. 2019). However, the variation of the mean of the random field expresses the variation remaining after taking into account the effects of the covariates (Dutra Silva et al. 2017), i.e. the fixed effects. Our results indicate that the spatial relationship between inventory plots contains information that is not included in the observed covariates, and the inclusion of the spatial random effects improves the overall prediction (Appendix Figure A1, Figure A2, Figure A3). Compared to size classes B0 and B1, B2 exhibited more variation, which cannot be explained by the fixed effects. The long time lag between establishment and inventory in the case of B2 may have contributed to the larger share of variance that is absorbed by the random effects in the model of B2 (Figs. 5, 6). Beech regeneration requires almost 20–40 years from germination to reach a dbh of 7 cm (Hessenmöller et al. 2012). During this time, the environmental conditions that affect growth or mortality can change significantly. Potential seed trees can be harvested. The lack of information on environmental conditions during the establishment phase may lead to a greater proportion of the variability being captured by the random effects. The random effects give insights into unobserved processes. The distribution of spatial random effects can be used as a surrogate to derive processes from spatial patterns (McIntire and Fajardo 2009).

In the north-western part of the example area, the occurrence of beech regeneration was higher than expected based on the fixed effects, while it was lower in the eastern part, as shown in Fig. 5. The pattern suggests that a large-scale process was responsible for this, potentially influenced by the site. Although there was no significant effect of nutrient content on regeneration occurrence, another study found a correlation between site productivity and beech recruitment (Klopcic et al. 2012). Other site variables may have a significant impact and should be tested. In addition to site variables, the suitability of the habitat for seed-dispersing animals and the occurrence of other masting tree species could also affect the probability of the occurrence of beech regeneration and cause the pattern of random effects (Perea et al. 2011; Löf et al. 2018).

A higher regeneration density than that predicted by the fixed effects was found especially at the edges of the forest areas near the city of Dresden. The patterns appeared on a small scale. These areas are visited for recreational use and game may avoid these areas as a result. Mathisen et al. (2018) found that proximity to forest roads affects the intensity of browsing of oak as red deer (Cervus elaphus L.) and roe deer (Capreolus capreolus L.) avoid these areas. A lower local game density can, therefore, lead to a higher regeneration density in the vicinity of cities. Spatial random effects also exhibited a lower density than predicted by the fixed effects in the interior of the forest (Fig. 6). This corresponded to findings of Mathisen et al. (2018) who observed higher browsing intensity in the inner reaches of the forest. In addition to the effect of visitors, real seed sources may not have been included in the data. A comparison of stand inventory databases and remote sensing databases indicated that many individual seed source trees are not recorded in stand databases (Axer and Wagner 2020). Remote sensing data were not available for the whole study area. The small-scale composition of the ground vegetation may also have influenced the regeneration density. The most relevant task for future research is to disentangle all that is pooled together in random effects. Additional variables should be included in the forest inventory for this purpose.

Conclusions

The natural regeneration and establishment of beech can support the conversion of pure coniferous stands into mixed stands, a task which is relevant to forest management throughout central Europe. The spatial ZANB model, implemented in R-INLA, provides the opportunity to analyse regeneration data. Factors influencing the occurrence and density of beech regeneration can be examined separately. This improves our ecological understanding of the establishment of beech regeneration. At the same time, the study provides information on which additional variables could also have an influence on beech regeneration. The model developed in this study emphasises the variability in regeneration across the study area. The probability of regeneration establishing is dependent on site and overstorey characteristics. However, the most limiting factor for regeneration is seed availability. Due to the very limited dispersal of beech seeds, the position of old beech trees is one of the main drivers of a succession of beech trees into surrounding spruce and pine stands. Knowledge of the location of mother trees and of the distance to the nearest mature beech stand is, therefore, an important source of information for management. Based on this knowledge, natural regeneration can be estimated spatially. Using these predictions, spatial decisions on forest conversion activities can be planned at forest management level. Areas in which artificial regeneration is necessary can be identified. At the same time, areas in which there is no need of planting can be identified. The evaluation of the regeneration process, therefore, leads directly to greater control over the effort that must be invested. The model developed in this study can be replicated for other forest districts for which sample points are available. To a large extent, the effects determining seedling occurrence and identified by the model are in alignment with our silvicultural knowledge. The model is also relevant to practitioners, because the variables used in the model were reported in the context of a forest inventory.