Using forest gap models and experimental data to explore long-term effects of tree diversity on the productivity of mixed planted forests

In this exploratory study, we show how combining the strength of tree diversity experiment with the long-term perspective offered by forest gap models allows testing the mixture yielding behavior across a full rotation period. Our results on a SW France example illustrate how mixing maritime pine with birch may produce an overyielding (i.e., a positive net biodiversity effect). Understanding the link between tree diversity and stand productivity is a key issue at a time when new forest management methods are investigated to improve carbon sequestration and climate change mitigation. Well-controlled tree diversity experiments have been set up over the last decades, but they are still too young to yield relevant results from a long-term perspective. Alternatively, forest gap models appear as appropriate tools to study the link between diversity and productivity as they can simulate mixed forest growth over an entire forestry cycle. We aimed at testing whether a forest gap model could first reproduce the results from a tree diversity experiment, using its plantation design as input, and then predict the species mixing effect on productivity and biomass in the long term. Here, we used data from different forest experimental networks to calibrate the gap model ForCEEPS for young pine (Pinus pinaster) and birch (Betula pendula) stands. Then, we used the refined model to compare the productivity of pure and mixed pine and birch stands over a 50-year cycle. The mixing effect was tested for two plantation designs, i.e., species substitution and species addition, and at two tree densities. Regarding the comparison with the experiment ORPHEE (thus on the short term), the model well reproduced the species interactions observed in the mixed stands. Simulations showed an overyielding (i.e., a positive net biodiversity effect) in pine-birch mixtures in all cases and during the full rotation period. A transgressive overyielding was detected in mixtures resulting from birch addition to pine stands at low density. These results were mainly due to a positive mixing effect on pine growth being larger than the negative effect on birch growth. Although this study remains explorative, calibrating gap models with data from monospecific stands and validating with data from the manipulative tree diversity experiment (ORPHEE) offers a powerful tool for further investigation of the productivity of forest mixtures. Improving our understanding of how abiotic and biotic factors, including diversity, influence the functioning of forest ecosystems should help to reconsider new forest managements optimizing ecosystem services.


Introduction
The effect of species diversity on ecosystem properties and functions is a key research topic in ecology Cardinale et al. 2012; van der Plas 2019). Both theoretical seminal works (Tilman et al. 1997;Loreau 1998) and experimental biodiversity-ecosystem functioning (BEF) studies, mostly carried out with artificial grasslands (Hector et al. 1999;Isbell et al. 2011), brought strong evidence for a positive impact of species diversity (especially species richness) on ecosystem functioning (especially productivity) (Hooper et al. 2005). Investigating experimentally the role of tree species diversity on forest functioning is much more challenging because of the large stature, longevity, and relatively slow growth of trees (as compared with herbaceous plants). This is the reason why this question has been mostly explored through observational studies. They relied on analyses of forest inventory data at national (Vila et al. 2007;Paquette and Messier 2011;Potter and Woodall 2014;Toigo et al. 2015), continental (Vila et al. 2013) or global scale (Liang et al. 2016), or on stand-level observations usually comparing "triplets" of stands (i.e., monospecific stands of species A and B, and mixture of A and B, e.g., Pretzsch et al. (2013)). They generally showed a positive effect of tree species richness on stand productivity (Zhang et al. 2012), although not always (Jacob et al. 2010;Potter and Woodall 2014). More recently, modelling-based studies confirmed such positive effect of species mixing on productivity (Morin et al. 2011;Perot and Picard 2012;Forrester and Tang 2016;Forrester et al. 2017). However, diversity effects on productivity were mainly estimated through the concept of "overyielding" M > P À Á , which means that the productivity of the mixed-species stand (M) is greater than the average productivity in component monocul- . It would be more advantageous if the mixtures could also exhibit "transgressive overyielding" (M > P 1, where P 1 is the most productive monoculture). In this case, there is a positive difference between the productivity of mixed-species stands (M) and the productivity of the best component monoculture (P 1 ) . In addition, similar to what was found in experimental grasslands, the effect of diversity-and particularly tree species richness-on forest productivity seems to strongly depend on species identity and site conditions, notably climate (Forrester 2014;Toigo et al. 2015;Jactel et al. 2018). Mixed forests are likely to better cope with climate change-i.e., maintaining a minimum level of functioningthan monospecific forests (Zhang et al. 2012), and with more stable functioning under changing conditions (Jucker et al. 2014) or practices. Furthermore, it has been shown that mixed forest stands are often more resistant to biotic disturbances like pathogen attacks or insect herbivory (Jactel and Brockerhoff 2007;Damien et al. 2016;Kambach et al. 2016;Jactel et al. 2017). Therefore, even in regions with currently high productive monospecific forests, it appears crucial to investigate whether mixed forests may be considered relevant alternatives, in order to maintain the provisioning of goods and services, and particularly in the context of climate change (Mori et al. 2017;. In spite of the technical difficulties listed above, several manipulative tree diversity experiments have been planted in the last 15 years Verheyen et al. 2016;Paquette et al. 2018) as they offer the unique opportunity to decipher the underlying mechanisms driving the effect of tree diversity on forest functioning (Scherer-Lorenzen et al. 2007;Grossman et al. 2018). Such long-term experiments are much less affected by confounding factors (climatic and soil conditions, management) than studies based on temporary plots (triplets) or large-scale inventory data to compare the productivity of monospecific and mixed stands. One should also mention that there is a number of experiments testing additive and substitutive effects in two species forest mixtures established in the previous century (see, e.g., Kelty and Cameron 1995;Mason and Connolly 2014). Nevertheless, even if some of these experiments have already yielded results showing how diversity can promote productivity in tree mixtures (e.g., Salisbury and Potvin 2015;Williams et al. 2017;Huang et al. 2018), most of them are still too young to bring relevant insights and allow generalization (Forrester 2014; Van de Peer et al. 2017). Here, we propose to combine data originating from such an experimental plantation with a modelling approach to investigate the possible long-term effect of tree diversity on forest plantations properties and functioning, i.e., on mature stands. More precisely, we have carried out simulations to compare the effect of species substitution and species addition on forest productivity. Species substitution-the pattern most often tested in BEF studiesconsists in replacing a proportion of trees from a focal species by trees from another species, thus keeping total tree density constant, but reducing species-specific density. In contrast, species addition refers to the introduction of trees from a companion species in a monoculture of a focal species, thus increasing both diversity and total tree density while keeping constant species-specific density. Our study is thus explorative, but it is clearly of both theoretical and applied interest to compare the effect on stand productivity of mixing tree species with a substitution or an addition approach, as this can help identify the role of tree density in mixed forest productivity while offering different options to forest managers (Pretzsch and Schütze 2016;Coll et al. 2018).
The mechanisms by which biotic interactions (excepted trophic ones) may promote ecosystem functioning could be resource partitioning (in space, time, chemical form), or abiotic facilitation (e.g., through resource enrichment or stress buffering as for example nitrogen fixing) (Barry et al. 2019). Regarding resource partitioning, complementarity processes may be related to (i) niche differentiation (e.g., vertical stratification (Morin et al. 2011), shade tolerance, phenology, root system) and (ii) plasticity adaptation, related to differences in species response to mixing (e.g., plasticity in crown shapes (Pretzsch 2014)). The present study focused on niche differentiation. By doing so, we do not deny the existence of plasticity adaptation, but we aimed at testing whether only diversity effects may explain overyielding in mixed stands.
Recent developments have shown how forest gap models represent an appealing complement of experimental or observational studies for testing diversity effects on the functioning of forest ecosystems (Morin et al. 2011Bohn and Huth 2017;Morin et al. 2018), especially because these models can take into account both biotic (mostly competition) and abiotic (e.g., climate) factors, from tree to stand levels. Relying on empirical relationships, physiological knowledge, and first principles from ecological theory, forest gap models represent a middle course between purely empirical (Skovsgaard and Vanclay 2008 or « simple » yield tables) and ecophysiologically orientated models (Dufrêne et al. 2005). Although less precise in their predictions, they are notably more easily parameterized than the latter ones. This feature usually allows gap models to achieve a strong generality in their results. Gap models (Bugmann 2001) simulate community dynamics by considering establishment, growth, and death of trees in a small patch of land. This modelling approach can be used to carry out virtual experiments in which tree community composition and forest productivity are emergent properties, based on environmental filtering and competition in the long-term (Chauvet et al. 2017). It allows testing diversity effect in mature tree communities through the simulation of forest dynamics in the long term, thus avoiding the bias due to transient effects (Duffy 2009). Furthermore, although forest models have already been used to predict the effects of tree species mixture on forest productivity (e.g., Morin et al. 2011;Forrester and Tang 2016), there were very few examples where the effect of tree species mixing had been compared between simulations and empirical data (Pretzsch et al. 2015a, b). However, in the context of exploring mixing effects on forest functioning in the long term, the strength of this approach should be interpreted with the relevant caution, especially in the light of the processes embedded in the model.
In this exploratory study, we aimed at extending recent developments in gap modelling to tree diversity experiments, using one of them located in SW France as an example of mixing trees design and as data supplier. More precisely, we wanted to test the potential of the combination of BEF experimental results with a forest gap model to simulate the longterm effects of mixing species on forest productivity.
The Landes de Gascogne forest in Southwest France is the largest planted forest area in Europe, with a surface close to 900,000 ha (IGN 2016). The main cultivated species is the maritime pine (Pinus pinaster, hereafter pine), a native species present in more than 75% of the total wood production forest area (IGN 2016(IGN , 2018. Most of the pine forests are intensively managed as an even-aged plantation with rotation lengths from 35 to 65 years. They are considered highly productive by European standards with increments of 10 m 3 /ha/yr or even more (Verkerk et al. 2015) and provide high volumes of fiber and round wood material to the national forestry sector (Agreste 2017). Yet, the possible sensitivity of such monoculture tree forests to global change ) has called for alternative forest management. In the context of Southwest France, an interesting option is to associate the maritime pine with the silver birch (Betula pendula, hereafter birch) a native species able to grow in the poor sandy soils of the region. Birch trees are of particular interest because they may preserve pine trees from pest insect damage, especially from defoliation by the pine processionary moth if they are planted as broadleaved hedgerows around pine stands (Dulaurent et al. 2012;Jactel 2011) or as intimate mixtures (Castagneyrol et al. 2014;Damien et al. 2016), but with efficiency at young ages. To experiment how tree species richness drives various ecosystem processes and functions over a rotation, the ORPHEE experiment, which belongs to the worldwide Tree Diversity Network (http://www.treedivnet.ugent. be/), has been planted in 2008 in a site of the Landes de Gascogne (Castagneyrol et al. 2013). This experiment spans species richness levels from 1 to 5 native trees species including mixed pine-birch plantations.
In this study, we tested whether mixing pines and birches can promote stand productivity in the long term, i.e., along a 50-year forest cycle, by using simulation runs with the forest gap model FORCEEPS, at two different tree densities. To calibrate the model, we used experimental data from monospecific pine and birch stands as well as literature, while validating the model on mixed stands.
In particular, we tried to answer the following research questions: -Can diversity effects emerge in simulated pine-birch mixed stands in the long term (ca. 50 years), exploring both the effect of species substitution and species addition on stand productivity?
-Can pine-birch mixed stands outperform the productivity of pure pine stands (transgressive overyielding) in the simulations?
Therefore, beyond the specific results arising from these explorative simulations, we aimed at showing that calibrating the FORCEEPS forest gap model with data collected on experimental plots might bring new insights into the long term functioning of mixed planted species.

Methods
We aimed at testing the long-term effect of mixing birch trees with pine trees on plot productivity in simulated plantations with a gap model, considering the effect of species addition and species substitution. We first calibrated the model for P. pinaster and B. pendula, and then we ran four kinds of simulations: monocultures of each species and mixtures of the two species, using two different stand densities, i.e., according to a substitutive or an additive design (Morin et al. 2020).

The model
We used the forest gap model FORCEEPS (Forest Community Ecology and Ecosystem Processes, http://capsis.cirad.fr/ capsis/help_en/forceeps), developed on the Capsis modelling platform (Dufour-Kowalski et al. 2012), to carry out virtual experiments testing the effect of tree species diversity on forest structure and functioning. FORCEEPS is a forest dynamics model based on the same basic principles as the FORCLIM model (Bugmann 1996;Didion et al. 2009). As a gap model (Botkin et al. 1972;Bugmann 2001), FORCEEPS relies on a few ecological assumptions to simulate the establishment, growth, and death of trees in independent small patches of land, considering both abiotic and biotic constraints. The model is not spatially explicit at the patch level, and forest properties at a larger scale are derived by aggregating the properties simulated at the patch level (Bugmann 2001).
Generally, gap models are cohort-based models in which all trees of the same species and of the same age behave in the same way except for mortality events that are stochastic at tree scale. The original rationale for this choice was to save computation time. Here, as we focused on plantations and thus even-aged populations of trees, a cohort-based approach would have been overly simplistic, neglecting variations in individual tree behavior in the simulations. FORCEEPS is individual-basel and differs from many gap models in that it allows seedling dimensions to vary at the plantation time (especially in terms of seedlings height).
In an initial version of FORCEEPS, trees are established as saplings, with a DBH of 1.5 cm (diameter at breast height). In this study, however, we switched off this establishment phase because we focused on plantations, in which all trees are planted and thus artificially established. Instead, we started the simulation from the data inventory (exact patch composition taking seedling mortality into account, but also variability in seedling heights) carried out at the beginning of the ORPHEE experiment.
Growth (i.e., stem diameter increment at breast height) was modelled in two steps. First, optimal growth was calculated for each tree using an empirical equation (Moore 1989;Bugmann 1996). Second, actual tree growth was calculated by modifying the optimum rate according to a set of environmental factors limiting growth. Specifically, considering the annual growth of tree j, a maximum diameter increment I D f was first calculated, depending on the size of the tree (i.e., its dbh D j ) and on a species-specific maximum growth rate g, as shown in Eq. (1): where H j is the height of the tree, H max is the maximum height reachable by the species in optimal abiotic conditions, and b and c are species-specific constants. H j directly depends on D j and a species-specific allometric parameter s as follows (Bugmann 1996): with a = 1.37 m (i.e., breast height). Then at each time step, it was reduced by growth limiting factors: light and soil nitrogen availability, growing season temperature (degree-days sum), and soil moisture. To calculate weather-dependent factors, observed mean monthly temperatures and monthly precipitation sums were used. Ultimately, stem wood biomass and foliage weight were estimated by using allometric equations and then summed to calculate the total aboveground biomass.
The growth reduction of a tree due to nitrogen availability, growing season temperature, and soil moisture is speciesspecific and determined by site conditions; thus, these factors influenced tree growth differently depending on species characteristics in the model. The light growth factor differed from the others as it depends on the identity and the size of neighboring trees in a patch allowing inter-and intra-specific competition for light. The amount of light available to a tree was reduced by the leaf area of all trees, in the same patch, of height greater than or equal to the height of the tree considered, following a coefficient calculated from the Beer-Lambert law, as commonly done in gap models (Bugmann 2001). Leaf area was estimated from diameters and functional characteristics of these neighboring trees. It is noteworthy that while belowground competition for water and nutrient was not explicitly integrated into the model, soil nitrogen availability and soil moisture could impact leaf area through a modification of the diameters of neighboring trees; thus, site conditions in turn could modulate light competition. Therefore, as the the distribution of trees was not spatially-explicit inside the patch, the interaction for light depended on vertical stratification. Furthermore, the leaf area of a tree decreased when shaded, following Didion et al. (2009), although this was not translated explicitly into a change in crown size as the whole foliage was supposed to be concentrated at the top of the tree.
Tree mortality depended on two components: background mortality (purely stochastic) and growth-related mortality. The former depended on species maximum longevity and was stochastic, whereas the latter was an integral proxy for stress conditions, i.e., tree vigor. It is noteworthy that since competition could affect individual tree growth, it also had an indirect effect on simulated mortality rates via growth-related mortality (Bugmann 1996). Although the parameters used are inherited from ForClim (Didion et al. 2009) (see Supplementary material), i.e., according to theoretical assumptions, it has been recently shown that more realistic parametrization may be extremely difficult to achieve so far (Hülsmann et al. 2018;Vanoni et al. 2019), especially with the kind of data available in our specific case.
Site characteristics included soil nitrogen content, maximum water holding capacity, and climate conditions. It is noteworthy that in FORCEEPS, climate variables consist of time-series (monthly temperature and precipitation sums) that can be either simulated or observed, which facilitates the calibration and validation in new sites. For a more detailed description of the model, see Supplementary material, and see Table 1 for a detailed description of species parameters.

Study sites
Before proceeding with the simulations for P. pinaster and B. pendula, we had to calibrate and validate FORCEEPS for these two species in the conditions of the ORPHEE experiment. To do so, we used experimental data from ORPHEE and data from long-term experimental plots from the surroundings and, whenever necessary, information from the scientific literature.
In order to calibrate in particularly the allometry parameter for ages from 5 to 20 years old, we used data from two older experimental plots to obtain information about dimensions of adult trees for both species (DBH and total height): the ISLANDES network (Bernier et al. 2016) and the GIS COOP -Castillonville trial (Seynave et al. 2018). ORPHEE experiment and Castillonville trial were located next to each other, while the ISLANDES network is 100 km southwest of Bordeaux . However, it is subject to similar environmental conditions, in particular sandy podzolic soils (Bernier et al. 2016). Climate data, mean monthly temperatures, and monthly precipitation sums came from the meteorological station of Cestas Pierroton (1997-2015; https:// www6.bordeaux-aquitaine.inra.fr/ue-pierroton/Meteo).

Tree data for calibration
The ORPHEE experiment is composed of eight blocks established in 2008 on a clear-cut of a former maritime pine stand. There are 32 plots in every block corresponding to the 31 possible combinations of 1-5 species, with an additional replicate of the combination of the five species (European birch: Betula pendula Roth., Maritime pine: Pinus pinaster Ait., and three oak species: Quercus robur L., Q. pyrenaica Willd., and Q. ilex L.). Tree species mixtures were established according to a substitutive design, keeping overall tree density equal across plots. Within plots (20 ;× ;20 m), 100 individual trees (10 ;× ;10) from different species were planted at 2-m intervals, in a regular alternate pattern (for more details on the ORPHEE experiment, see Castagneyrol et al. (2013)).
We compiled data from the ORPHEE experiment for the two target species, between 2012 and 2015, i.e., years for which both DBH and height of trees were available (before 2012, the trees were too small to measure DBH). In each plot, only the height of core trees at the center of the plot was measured to avoid edge effects (total n = 36), which reduces the number of sampled trees (see Supplementary Table 4). Among these core trees, seven trees per plot were selected and measured for both DBH and total height. As shown in Damien et al. (2016), at the time of the survey, oaks were 5 to 8 years old and on average less than 1 m tall, i.e., much smaller than pines and birches and could be confounded with understorey vegetation. As a consequence, the effect of oaks can be neglected in this study, so that mixtures of pines with one of the oaks or mixtures of birch with one of the oaks were considered as "low density" pine or birch monocultures, resulting in a gradient of pine/birch density, from 2500 stems/ha in real monocultures (100 trees in 400-m 2 plots) and pine-birch two species mixtures to 1250 stems/ha (50 pines or 50 birches in mixture with one oak species in 400-m 2 plots).
On these selected monoculture plots (4 plots per block and per species, thus 224 potential measured trees per species and per year), data were available for 878 and 874 tree-year couples for birch and pine trees respectively, for which DBH and height had been measured, for the calibration of the allometry parameter s (Eq. (2)). However, as the DBH-height allometry should also be inferred with adult trees, i.e., at least up to 20 years old, and thus we also used data from two older monospecific experimental plots to obtain DBH and height data for adult trees: the ISLANDES network for 18-year-old birch trees (n = 82) and the GIS COOP -Castillonville trial for 18-year-old pine trees (n = 224) (Supplementary Table 4).

Species calibration
Our objective was to calibrate species parameters for the ORPHEE conditions. A tree species was characterized by nine parameters in FORCEEPS (Table 1 and Supplementary material), reflecting species intrinsic characteristics (e.g., foliage type, maximum height, shade tolerance), species responses to abiotic conditions (e.g., drought tolerance, soil nitrogen requirements), growth-related traits (e.g., maximum growth rate), and regeneration traits (e.g., shade tolerance of seedlings, thermal requirements for regeneration, or tolerance to browsing). We did not consider the four regeneration parameters because we dealt with plantations in this study. We used data from several sources (Table 1). For parameters describing species' response to abiotic conditions (i.e., effect of the growing season temperature on tree growth (DDmin); drought tolerance (DrTol); and soil nitrogen requirement (NReq)), we relied on literature (Ellenberg and Mueller-Dombois 1966;Botkin et al. 1972;Niinemets and Valladares 2006). For parameters describing species intrinsic characteristics (i.e., foliage type (Ft); maximum age (Amax); maximum height (H max ); and shade tolerance (ShTol)), we used literature (Ellenberg and Mueller-Dombois 1966;Rameau et al. 1989;Niinemets and Valladares 2006) and data from the French National Forest Inventory (IGN 2016(IGN , 2018. Finally, FORCEEPS required two growth-related traits (i.e., maximum growth rate, g (Eq. (2)); and allometric parameter s (Eq. (1))). The parameter g corresponded to the growth rate of the species in optimal environmental conditions (Eq. (1)). It was difficult to directly extract such information from observational data, as trees were usually constrained in their growth (e.g., by environmental conditions or competition). Therefore, we also relied on literature data on other pine species, especially P. sylvestris (Table 1)-from FORCLIM and GREFOS models (Didion et al. 2009;Fyllas et al. 2010). For birch, g was primarily calibrated using data from Swiss alpine forests in Switzerland (Didion et al. 2009) (see Table 1). To calibrate the parameter s relating diameter and height (Eq. (2)), we used ORPHEE data and the additional data on adult trees (ISLANDES and GIS Coop) from monospecific plots and the nls function (from the package stats in the R software (R-Core-Team 2018)) that determines the weighted leastsquares estimates of the parameters. Note that we chose to focus only on pines and birches growing in monocultures to calibrate the DBH-height allometry parameter in the model, i.e., independently from any interspecific interaction effect but by taking into account the effect of tree density. This rationale relied on the fact that the diversity effect on tree growth only arises through fixed plant responses in ForCEEPS (Barry et al. 2019). Therefore, considering trees growing in either monospecific or mixed stands at the calibration step may reduce our ability to detect whether diversity effects emerge from the simulations by comparing simulated monospecific and mixed stands.

Validation of the model's outputs
To validate the growth increments predicted by FORCEEPS with the selected parameters for each species, data from ORPHEE only allowed short-term comparison (up to 8 years for tree height and 3 years for DBH according to the data that were available). As our long-term simulations mostly depend on diameter increments, we thus first chose to use the DBH data to test growth patterns at the individual level. We compared the observed annual tree diameter increment in the ORPHEE experiment with the annual increments in diameter simulated by FORCEEPS for the same trees. We carried out this comparison for each species and for the 3 years for which these data were available (i.e., 2013, 2014, and 2015). We selected the trees growing in the most competitive environment in ORPHEE plots, i.e., at the highest density (thus originally without any of the three oak species), either in monospecific or mixed stands. Therefore, we used 56 trees from each species in monospecific stands (i.e., 7 trees in one monoculture plot in 8 blocks) and in mixed stands (n = 56 for each species measured each year, i.e., 7 trees for each species in one mixed plot in 8 blocks) for validating growth patterns. However, we used tree height to test for growth patterns at the stand level, because doing so allowed comparing species dynamics (assessed through tree height) at that scale with more years than DBH. Furthermore, as trees are not spatialized in FORCEEPS, stand-level comparisons can only rely on mean values calculated across all trees in the plot (or all trees of the same species) but not individually, because the behavior of precisely identified trees cannot be well reproduced without bias. We have compared the mean height of trees in mixed stands and for both densities between observations in ORPHEE and simulations with FORCEEPS across the first 8 years, to test for the ability of the model to simulate mixed stands (n = 288 for each species at low density, 12 trees per target species in three mixed plots (pine + birch + one oak) in 8 blocks and n = 144 for each species at high density, 18 trees per species in one mixed plot in 8 blocks; Table 4). Note that there were more trees for the low densities because there were more plots at this density because of the oaks (see above).

Long-term simulations
After short-term validations, we simulated forest dynamics during 50 years, i.e., the mean duration of a forestry rotation (before clear-cut) of a maritime pine plantation in the Landes de Gascogne, in three types of stands: monoculture of pines, monoculture of birches, and pine-birch mixtures. Each simulation was run on 100 independent 400-m 2 patches. Initial conditions mimicked plantations, i.e., with small trees with a DBH between 0.5 and 1.5 cm, randomly assigned (with an age of 3 years old).
We considered three contrasting tree densities in the monoculture simulations: 375 stems/ha (i.e., 15 trees per plot), 750 stems/ha (i.e., 30 trees per plot), and 1500 stems/ha (i.e., 60 trees per plot). Mixtures were run with two tree densities (1500 and 750 stems/ha); each species therefore contributing 50% of total tree density (thus 750 and 375 stems/ha respectively, for each species). Such design allowed testing the effect of species substitution and species addition, by comparing mixed stands with pine monospecific stands at similar pine density (species addition) or similar total tree density (species substitution) (Fig. 1). Each year of the simulations, we recorded tree density in the plot, average height of trees, plot productivity (as basal area and biomass increments), and total plot biomass, which were all calculated across the 100 independent patches.

Analyses of long-term simulation outcomes
First, we estimated the transgressive yielding (TY) of pinebirch mixtures by comparing the total accumulated biomass and the mean annual productivity between pine-birch mixtures and the mean productivity of the monoculture of the most productive species (here, the mean of pine monocultures) at the same total tree densities (1500 and 750 stems/ ha) following the substitution design, i.e., calculating M 1500 -P 1500 and M 750 -P 750 (as shown in Fig. 1).
Second, we compared the total accumulated biomass and the mean annual productivity between the pine-birch mixtures and pine monocultures at constant density of pine trees, in order to quantify the effect of species addition (here birch) at the plot level. We thus calculated M 1500 -P 750 and M 750 -P 375 (Fig. 1).
Third, we compared the total accumulated biomass and the mean annual productivity of pine trees between the pine-birch mixtures and pine monocultures at the same constant density of pine trees, by thus calculating P mix 1500 -P 750 and P mix 750 -P 375 . Such calculation allows quantifying the effect of species addition specifically on pine trees (Fig. 1).
Finally, we quantified the net biodiversity effect (NBE) in simulated mixtures as the difference between the simulated productivity of two-species forests and their expected productivity based on the simulated productivity of component monospecific forests (Loreau and Hector 2001)-under the null hypothesis that there is no effect of species interactions on ecosystem functioning. In the present case, the NBE was calculated according to the original relative abundance (in terms of tree density), i.e., [productivity of both species in mixed stand] − 0.5 × [productivity of each species in its pure stand]. We highlight that simulations outputs were aggregated at the forest scale (not patch), and we were thus only able to make a simple calculation of NBE and TY. However, we could compare how TY and NBE change over time in the simulation.

Calibration of allometry between diameter at breast height (DBH) and height at the individual tree level
The best values for the s parameter according to the optimization algorithm were 103 and 62 for birch and pine respectively (Table 1). Tree heights simulated according to Eq. (2) with these best values for s were accurately predicted (r 2 = 0.85 and r 2 = 0.78 for birch and pine respectively, Fig. 2). Although using data from several sources (ORPHEE, GIS COOP -Castillonville trial, ISLANDES network) may lead to a slight overestimation of the height of the smallest pines (i.e., mostly in the first years) and to an underestimation for the largest pines, the accuracy of our predictions for diameter-height relationships was strong enough to be used in FORCEEPS simulations.
In mixed stands, FORCEEPS also simulated diameter increments for P. pinaster with a strong accuracy for 2013, 2014, and 2015 (Fig. 3d). The predictive power between trees was even stronger for monospecific stands, with r 2 values of 0.99 (slope = 1.04), 0.89 (slope = 0.93), and 0.97 (slope = 1.01) for 2013, 2014, and 2015 respectively (n = 56 for each year). For B. pendula, FORCEEPS slightly overestimated diameters (Fig. 3c) 1 Schematic representation of the simulation design and the comparisons made to assess the effect of species substitution and species addition between mixed stands (mixtures) and monocultures at the different densities tested. M 1500 and M 750 (mixed stand at 1500 and 750 stems/ha density respectively); P 1500 , P 750 , and P 375 (pine monoculture at 1500, 750, and 375 stems/ha density respectively); B 1500 , B 750 , and B 375 (birch monoculture at 1500, 750, and 375 stems/ha density respectively); P mix 1500 and P mix 750 (pine trees in mixture at 1500 and 750 stems/ha density respectively with a pine density of 750 and 375 respectively); B mix 1500 and B mix 750 (birch trees in mixture at 1500 and 750 stems/ha density respectively with a birch density of 750 and 375 respectively)

Validation of simulated height tree growth in pine-birch mixtures with FORCEEPS
During the first years of simulation, birch trees were taller than pine trees (Supplementary Fig. 8). Pine trees became taller than birch trees after 6 years for two densities ( Supplementary Fig. 9). Pine trees were slightly taller in mixtures than in monocultures, while the opposite pattern was observed for birch trees. Such findings are consistent with what was measured in the ORPHEE experiment in mixed stands, with pine trees overcoming birches 6 years after the start of the experiment (Fig. 4), thus the same year as in the simulations. Student t tests were not significant for each species and each year, except for pines at the last year for which the differences in mean tree height between simulations and observations were marginally significant (P ;< ;0.1).

Long-term simulation of tree growth in pine-birch mixtures with FORCEEPS
With these results on diameter and height, we were confident enough to run simulations until 50 years old with FORCEEPS and test for diversity effects. The summary of results is shown in Table 2. Tree densities progressively declined with time for all combinations (i.e., diversity × density) and all simulations converged to densities comprised between 150 and 400 trees per ha at 50 years (Fig. 5). As expected, the decline in tree density was faster at the highest density. However, it is noteworthy that the density of pine trees after 50 years was similar in mixtures with 1500 trees (at the start of simulation) and in monocultures (regardless of the density in monoculture).

Effect of species substitution on transgressive yielding (M 1500 -P 1500 and M 750 -P 750 )
The pine-birch mixture with a substitutive design showed lower productivity and accumulated less biomass than pine monoculture at the same low tree density, i.e., negative transgressive yielding or transgressive underyielding.

Effect of species addition at the stand level (M 1500 -P 750 and M 750 -P 375 )
For the mixture with the highest tree density, there was hardly any difference between the mixture (M 1500 ) and the pine monoculture at 750 tress/ha (P 750 ) in terms of productivity and biomass (Table 3). The mixture with the lowest density (M 750 ) showed a 10% increase in productivity and a 9% increase in accumulated biomass in comparison with the pine monoculture at 375 trees/ha (P 375 ). Thus, adding birch trees with pine trees increased productivity at lower densities and did not significantly reduce stand productivity at high density according to our simulations ( Supplementary Fig. 10).
3.7 Effect of species addition on pine trees (P mix 1500 -P 750 and P mix 750 -P 375 ) Consistent with the pattern described just above, pine trees in the mixture were barely affected by their mixing with added birches at high density (P mix 1500 vs. P 750 ), while they experienced an increase in productivity (Supplementary Fig. 10) and biomass accumulation at the lowest density (P mix 750 vs. P 375 ). (2) with the fitted value for the parameter s (see Table 1). Black circle: ORPHEE data; blue circles: ISLANDES data (only birch trees); green circles: GIS COOP -Castillonville data (only pine trees)

Net biodiversity effect (substitutive design)
We found a strong overyielding in our simulations, i.e., a positive net biodiversity effect (positive NBE) at the stand level, regardless of tree initial density under the substitutive design (Table 3; Fig. 6). Overyielding was much stronger at the high (1500 stems/ha) than at the low density (750 stems/ ha), all along the simulation time. However, overyielding peaked earlier for the 1500 stems/ha density (after ca. 23 years of simulation) than for 750 stems/ha density (after ca. 35 years), and then decreased. The overyielding was mostly due to a positive effect of mixture on the productivity of pine trees, while birch trees experienced a relative decrease in productivity ( Fig. 6 and Supplementary Fig. 10).

Discussion
This study aimed at showing how a forest gap model combined with data from BEF experiments can be used to simulate the long-term effects of mixing species on forest productivity. To illustrate this purpose, we thus proposed to focus on SW French forests, after having validated the effects at the first stages with experimental data of the ORPHEE BEF experiment. Although the short-term validation was satisfying and allows specific discussion of the simulations, the results shown here remain illustrative and should not be taken as not reliable predictions. The main point of this exploratory study is that the combination of BEF experimental results with a model like FORCEEPS might yield interesting projections and help in proposing hypotheses about the role of diversity effects in forest functioning, including interactions with climate change impacts. More specifically, this study is among the first to compare mixing effects simulated with a forest growth model with BEF empirical data (i.e., ORPHEE data) (Pretzsch et al. 2015a). Following the validation of the predictions of FORCEEPS on tree growth or B. pendula and P. pinaster with data from the ORPHEE tree diversity experiment (Figs. 3 and 4), our simulations showed that mixing P. pinaster with B. pendula has the Fig. 3 Predicted with FORCEEPS vs. observed diameter at breast height (DBH) for birch (a, c) and pine (b, d), in 2013 (open circles, with associated regression line in dashed grey), 2014 (grey circles, with associated regression line in plain grey), and 2015 (black circles, with associated regression line in dashed black), using diameter data of the ORPHEE experiment, in monospecific stands (a and b: total density 2500stems/ ha) and mixed stands (c and d: total density 2500 stems/ha). The 1:1 line is represented by the red plain line. The number of observed trees each year was 56 for each species potential to enhance overall forest stand productivity as well as pine productivity in the long term. The observed overyielding in mixed stands resulted from the higher productivity of the pine species during the entire forestry cycle (50 years). However, we found that mixing pine and birch would affect biomass accumulation differently according to the stand initial tree density. Weak patterns of response were found at the highest density, while stronger effects emerged at lower density (negative effect for species substitution but a positive effect for species addition). While these findings should be taken with caution regarding the limits of the study (see below), these results nevertheless suggest that birch addition to pine may be considered an interesting alternative to pure maritime pine plantation, not only in terms of increased diversity and associated ecosystem services, but also because pine-birch mixtures can eventually enhance wood production as compared with pine monocultures (through the higher pine relative yield, but also through the additional-although small -wood production of birches).

Accuracy and reliability of the model's predictions
Our simulations reproduced the experimental results with interesting accuracy. First, FORCEEPS showed an accurate ability to reproduce tree growth in either mixed and monospecific stands, although it slightly overestimated the growth of birch trees in pure stands. Second, short-term simulations showed that the age at which pine trees became taller than birches was identical as what has been observed in experimental data (Damien et al. 2016), although one should remind that the simulations were carried out across 100 patches-thus, not exactly at the same spatial scale than ORPHEE. Allometry between tree height and diameter successfully predicted tree growth in height at least in the young stages at ORPHEE, although parameters were set with data from different sources (ISLANDES and GIS COOP data) and with older ages. However, the good fit between observed and simulated data may not guarantee that predictions in the long term will be necessarily reliable.
It is also noticeable that the growth of trees in mixed stands (as well as the relative growth of each species) was well reproduced by the model (Fig. 4), while the parameterization of the s and g (Table 1)-i.e., the two parameters that have been specifically calibrated for this study-was done using only trees from monospecific stands (from ORPHEE, ISLANDES, and GIS COOP monospecific plots), thus without taking interspecific interactions into account. There was therefore no circularity in the design of our calibration/validation exercise. For the above reasons, although the uncertainty of these results remains too large to build any precise recommendation in terms of forest management, we are confident that the approach proposed and the trends shown here are strong and robust enough to be discussed. Fig. 4 Mean tree height in mixed stands observed in the ORPHEE experiment (right panels) and simulated by FORCEEPS (left panels), at the high (upper panels: 1500 or 1250 stems/ha) and low (lower panels: 750 or 833 stems/ ha) density. For each year, each density, and each species, the difference was not significant between observations and simulations, except for pines at the last year

Ecological insights from the modelling study
This study illustrates how a general gap model, only considering the competition for one resource explicitly (i.e., light), was able to accurately depict experimental data. There have been already many attempts to use forest models to predict the effects of tree species mixture on forest productivity (see Pretzsch et al. (2015a for review) in either naturally assembled composition (Forrester et al. 2017;Morin et al. 2011) or plantations (Battaglia et al. 2015;Sapijanskas et al. 2014). However, this study is the first one, to our knowledge, that shows how forest gap models can complement data from BEF experiments, which open promising perspectives for longterm predictions.
In nature, overyielding in mixed stands compared with pure stands can emerge from several mechanisms (Forrester and Bauhus 2016), like allometric plasticity (Dieler and Pretzsch 2013;Pretzsch 2014), niche complementarity in resource acquisition for either above (Jucker et al. 2015) and belowground processes (Richards et al. 2010) or decrease in the competition (Grossiord et al. 2014). In our simulations, trees could react in response to climate and soil conditions, but competition for light was the only modelled competitive interaction. Although we only considered competition for a single resource, we observed an overyielding pattern, as found in a former study with a similar model (Morin et al. 2011). This result is consistent with the fact that competition for light is often the main competitive process between trees in temperate forests (Silvertown 2004). However, further model developments should tackle more complex competition kernels, notably including competition for water and nutrients, to depict forest response to tree diversity more accurately.
In Morin et al. (2011), the positive effect of tree species richness on forest productivity was mostly due to interspecific complementarity in light capture and to larger occupancy of the vertical space in more diverse forests, both leading to stronger biomass turnover in mixed forests than in monospecific forests. The results of the present study might be also explained by the same processes. Yet, we only focused in the present study on a two-species mixture. In the model, the competitive effect of birch on pine growth ceases as soon as pines grew higher than birches. The positive effect of mixture on pine growth may then be simply related to the fact that pine trees quickly outcompeted birch trees, which reduced intraspecific competition between pines enough to promote their growth. The opposite was true for birch trees. Although cases of mutual overyielding have also been found (Pretzsch et al. 2015b), this antagonistic effect of mixing on associated species is a common pattern in two-species stands where one species often benefits from the mixture while the other performs worse than in monoculture (del Río et al. 2014;Toigo et al. 2015). In our case, the two species are both shadeintolerant (Niinemets and Valladares 2006), which is consistent with the observed antagonistic effect, but it is remarkable that the simulations still show a significant overyielding, with the overyielding of pine being higher than the underyielding of birch.

Transgressive yielding in mixed stands
One of the most remarkable results of our simulations is that species addition (birch added to pine, increasing overall tree density) did not result in increased competition but actually was at least neutral (at high density) or even had a positive effect (at low density) on stand productivity, thus resulting in transgressive overyielding. From a forest yield perspective, one can also notice that at the end of the simulation pine trees were slightly taller in the mixture than in the monoculture ( Table 2; Supplementary Fig. 8). Thus, these results from simulations suggest that mixing maritime pines with birches using an additive design may allow harvesting pine trees with the same biomass than in pine monocultures but with additional wood produced by the birches. This further suggests that wood harvest may be even maximized by cutting birch trees before they start declining, i.e., after ca. 25 years (Fig. 6). Furthermore, even if birches are outcompeted by pines, adding birch trees to maritime pine stands is likely to increase other ecosystem services than timber production, such as habitat provision (Gamfeldt et al. 2013) or resistance to pathogens (Jactel et al. 2017) or herbivores (see below) (Castagneyrol et al. 2019;Damien et al. 2016).

Limits of the approach and future directions
Several factors should temper the conclusions taken from our simulations. The first limits are directly related to the structure of the model. For instance, the competition for nutrients and water was not taken into account in the simulations, which may either decrease but also increase the observed overyielding (Loreau 1998). Moreover, the calibration and validation of the model carried out in the present studies are site specific, which may limit any extrapolation to other set of conditions. The fact that the model relies on a fixed DBHheight relationship (determined by the parameter s) necessarily decreases the precision of the model's predictions. For instance, DBH-height relationships are known to change with tree age and tree density, as it could be seen here for pine trees where the experimental stands in older Castillonville trials were thinned since they were planted (Fig. 2). However, the model has been designed to favor generality over precision, and we did not have the data to achieve such level of detail. Furthermore, a large sensitivity analysis of the ForCLIM model (i.e., with a very similar structure than the model used here) has shown that the allometry parameter s had a very limited effect on the model's results when compared with the effect of other parameters (such as g or H max or differences in shade tolerance) (Huber et al. 2018). It is also likely that the occurrence of mortality events should be better simulated to achieve more precise predictions. Improving the modelling of tree mortality is currently the main challenge in forest modelling, especially to better assess climate change impacts on forest functioning (Hülsmann et al. 2018;Vanoni et al. 2019).
Furthermore, it is noticeable that in this study we calibrated tree growth using only data from monospecific stands of each species. By doing so, mixture effects necessarily emerged from the niche differentiation of the two species, without considering possible plastic adjustments of species traits (Pretzsch 2014), particularly in terms of crown architecture or rooting segregation. This means that the mixture effects might have been underestimated in comparison with what can happen in real stands. Adjustments in crown size (or more precisely foliage area) may actually emerge in the simulations, due to species complementarity in light capture and/or response to environmental fluctuations (e.g., climate). However, the simulations of the present study did not produce such an effect, as the total foliage area of mixed stands was actually similar than expected from the monospecific stands (following the NBE approach). Therefore, these additional results show that the NBE pattern observed (Fig. 6) is not caused by an increase in total foliage area in simulated mixed stands relatively to monospecific stands but should thus rely on vertical stratification and a reduction in the competition (consistent with the decrease in foliage area).
With the FORCEEPS gap model, we aimed at exploring how the interactions between pines and birches might change stand level properties in the long term, especially growth, biomass, and mean tree height. However, it is important to notice that such results are not predictions, as the model was designed to be more generic than precise (Levins 1966). We thus did not aim at predicting what should be monoculture and mixture productivity in the study area but only the level differences in yield. As recently reviewed by Pretzsch et al. (2015a), there is a wide range of modelling approaches to predict stand growth, from detailed physiology based on empirical models. Although less detailed and predictive that some ecophysiological models, forest gap models can be satisfyingly accurate while being more easily parameterized and prove to be a promising approach for testing diversity effects on the functioning of forest ecosystems (Morin et al. 2011Bohn and Huth 2017). FORCEEPS in particular is a model that has notably few parameters like most "classic" gap models (Bugmann 2001), and because they can integrate biotic and abiotic factors, such models have shown to be applicable over a large range of species and conditions. We are aware of the limited predictive power of the FORCEEPS model, in particular because aiming for generality can only be achieved at the expense of precision, as it is the case for other forest models that have been used to test diversity effects on Table 3 Comparisons of mean productivity (t ha −1 yr −1 ) and final total biomass (t ha −1 ) at the stand level between mixed stands and pine monocultures and between pines in the mixed stands and pine monocultures, according to plot density, and net biodiversity effect (NBE) calculated for each density. The calculation description uses the terminology presented in Fig. 1 Beyond these limitations, the calibration of FORCEEPS on observed data from monospecific plots opens several perspectives. First, as the model can easily use climatic-series, we will be able to test how mixtures may react under climate change relatively to pine monocultures. It will be for example of great interest to compare resistance and resilience to severe drought events in pure and mixed stands. However, this will necessitate improving the mortality part of the model, which is currently a major challenge on the agenda of forest dynamics modellers. Second, as a management module has been developed in FORCEEPS, one may test the effect of thinning mixtures in order to explore how this could strengthen overyielding. One could also try to identify at which age birch trees should be harvested in order to optimize wood production in mixtures with pines (for instance around 20 years, as suggested by results from Fig. 5). Third, our model provides the opportunity to test how tree diversity may affect the sensitivity of trees to herbivory in the long term. Actually, in Southwest France, pine stands are periodically subject to outbreaks of processionary moths (Thaumetopoea pityocampa (Li et al. 2015)) that reduces their growth . Studies made in the same ORPHEE experiment have recently shown that the presence of B. pendula could protect young pine stands from damage caused by T. pityocampa defoliation (Castagneyrol et al. 2013(Castagneyrol et al. , 2014(Castagneyrol et al. , 2019Damien et al. 2016).
Still, what is the persistence of such associational resistance remains to be evaluated and long-run simulations with FORCEEPS could help address this question. More generally, this kind of model appears as a very relevant tool to forecast how forest ecosystems may react to change in species composition, silvicultural treatments, and climate change, either as individual drivers or simultaneously.

Conclusion
In this study, we have shown that even in the case of very productive monospecific forests, like P. pinaster in SW France, promoting mixtures especially through species addition (i.e., complementing pines at low density with additional birches) may achieve the same yield. Although its limits forbid any conclusion regarding operational transfer in this specific case, this study illustrates how calibrating forest models with data from monospecific plots offers a powerful tool to investigate overyielding in mixture stands, even only considering the niche partitioning hypothesis.
Acknowledgments The GIS COOP network benefits from the financial support of the French Ministry of Agriculture and Forest since 1994 and from the project Pinaster (2015-2019): Conseil régional de Nouvelle-Aquitaine (16004034), DRAAF Aquitaine (ADV15R072000012), FEDER (FEDER-FSE2014-2020 Axe 1). We thank the Forest experim e n t a l F a c i l i t y ( U E F P -h t t p s : / / d o i . o r g / 1 0 . 1 5 4 5 4 / 1 . 5483264699193726E12) and especially Bernard Issenhuth for the maintenance of the ORPHEE experiment, Laurent Severin for the Castillonville trial (from GIS COOP network), and Jean-Luc Denou Fig. 6 Temporal dynamics of the net biodiversity effect (NBE, in t ha −1 yr −1 ) and relative yield of each species, for total stand (i.e., all trees, dotted line), and for pine (black line) and birch (dashed line) trees separately, at the two tested densities (from ISLANDES network) and their colleagues for all available tree measurements. We warmly thank Harald Bugmann (ETH Zürich) for getting inspiration from the model FORCLIM to further developing the model FORCEEPS and for numerous discussions about gap models. We also thank anonymous reviewers and the associate editor for the helpful comments on previous versions of the manuscript. The authors are also grateful to the CAPSIS modelling platform (http://www7.inra.fr/capsis/).
Data availability The simulated data that support the findings of the study are available in the Figshare repository, https://doi.org/10.6084/m9. figshare.12006291.v2. The calibration data for Betula pendula and Pinus pinaster from ISLANDES and GIS COOP networks respectively are available in the DATA INRAE repository (https://doi.org/10.15454/ QMZJKU). Restrictions can apply on ORPHEE data but can be however available from Céline Meredieu with permission of the ORPHEE team.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.