1 Introduction

Mast seeding, also called masting, is the variable, intermittent production of large seed crops, a reproductive strategy typical of many wind-pollinated species. Masting events have cascading effects on overall ecosystem functioning. For instance, the associated resource pulses affect the population dynamics of seed consumers such as rodents (Elkinton et al. 1996; Zwolak et al. 2016), roe deer and wild boar (Bisi et al. 2016; Canu et al. 2015; Cutini et al. 2013; Jackson 1980), brown bear (Ciucci et al. 2014; Tattoni et al. 2015), many bird species (Degange et al. 1989; Hannon et al. 1987; Szymkowiak and Kuczyński 2015; Czeszczewik and Walankiewicz 2016; Soler et al. 2017; Tattoni et al. 2019; Fležar et al. 2019; Szymkowiak and Thompson 2019), and insects (Bogdziewicz et al. 2018). Some forest tree seeds are also used as food for humans or animals (Tattoni et al. 2017); hence, their availability can provide an income in some rural areas or add recreational value to forests (Riccioli et al. 2018). As masting determines seedling establishment and recruitment, it also plays a key role in forest management (Ascoli et al. 2015; Cutini et al. 2015; Chianucci et al. 2019a, b). Understanding masting is therefore crucial to improve knowledge of population dynamics, assess present and future ecosystem resilience, and design adaptive forest management strategies (Cutini et al. 2013; Wagner et al. 2010).

Currently, there are some issues in studying mast seeding. Firstly, the definition of a year with mast seeding (mast year) is controversial, and there are different methods to classify a mast year (Cutini et al. 2013; LaMontagne and Boutin 2009). Secondly, masting can be described using different qualitative (e.g., Nussbaumer et al. 2018) or quantitative (e.g., Cutini et al. 2013; Koenig and Knops 2014) measurements and methods. While categorical data allow information to be harmonized for wider-scale comparisons (Bajocco et al. 2020; Nussbaumer et al. 2018; Vacchiano et al. 2017), they can limit the understanding of the ‘real’ seed availability: the same species may exhibit a mast year, but with significantly different average seed production depending on stand age, structure, and management (Cutini et al. 2015). Therefore, studies based on quantitative data of seed production (e.g., number of seeds, seed biomass) can avoid the ambiguity of determining a mast year. However, the available methods for quantitatively measuring seed production are time-consuming and expensive, which limits larger-scale application of quantitative approaches. As direct tree measurements are also impractical due to the height and density of forest canopies, ground-based methods have been frequently used to assess seed production (Perry et al. 1999).

Currently, two main ground methods have been used for quantifying seed production: litter traps and visual counting of seeds on the trees. The litter trap method (LT) has been used in the majority of studies on forest seed production: examples of sampled species include beech (Bajocco et al. 2020; Chianucci et al. 2019a, b; Cutini et al. 2013), chestnut (Bisi et al. 2016; Cutini 2000, 2001), many oak species (Bogdziewicz et al. 2018; Chianucci et al. 2019a), but also conifers (Mencuccini et al. 1995). The LT is considered the most accurate method (Gea-Izquierdo et al. 2006; Perry et al. 1999), but it is limited by the cost and the time needed both to collect data (which requires frequent, generally biweekly, sampling) and to process the litter in the laboratory (which requires separating litterfall by species and by main components: leaf, woody, and reproductive parts). As an alternative to LT, indirect methods have also been tested, which are often based on visual counting of seeds while still on the trees. These surveys are faster and cheaper than LT, but they are limited by the subjectivity of the measurements, which are also not replicable, and by the difficulty of applying them to tall trees, particularly those with small seed size or in situations of high tree density and canopy closure (Perry et al. 1999). In addition, visual surveys are generally unable to yield a quantitative estimate of seed production.

Recently, a method based on counting the number of seeds after they have fallen on the ground was proposed by Touzot et al. (2018). The method can be considered a floor-level variant of LT, but it has the further advantage of reducing the time and cost of field and laboratory work; it is therefore more flexible and allows larger-scale field deployment compared with (fixed) litter traps. However, the method was tested only in oak forests, whose relatively large seeds are easily detectable on the ground. Therefore, more experiments are needed to sample tree seeds of different shape and size.

In this study, we tested two indirect ground-based methods to yield quantitative measurements (number) of seeds, which were compared against reference LT measurements. The first method was based on counting seeds on the ground in quadrats of known area, soon after seed fall (hereafter “ground quadrats”, abbreviated GQ); the method was similar to the ground plot method proposed by Touzot et al. (2018). The second was an image-based ground counting method (hereafter “image quadrats”, abbreviated IQ). Downward images of the forest floor were collected in the same quadrats used for GQ, soon after GQ counting, and then inspected to count the number of seeds. Image-based approaches for seed counting have already been proposed for agricultural crops (e.g., Mussadiq et al. 2015; Tańska et al. 2018). However, the existing solutions retrieved seeds against an artificial, homogeneous background. Unlike monoculture crops, the forest floor can contain litter layers, dead fallen leaves, coarse woody debris, bare ground, and rocks, which create a complex background against which to detect seeds. In addition, the period of seed fall often coincides with leaf fall, and thus, fallen leaves can obscure seeds on the ground, hindering image-based counting at the forest floor. Hence, the image analysis of tree seeds on the ground must deal with the higher heterogeneity and complexity of the forest floor.

The trial was performed on three of the most widespread broadleaved forest tree species in Italy. The ground quadrat estimates were first calibrated against benchmark values obtained by litter traps from a network of permanent plots (Chianucci et al. 2019a). Further ground quadrat measurements were then performed and compared with image quadrats, which were analyzed to test the reliability of the image-based counting of seeds. Our specific questions were:

     1. Are quadrat seed counts comparable with LT?

     2. Are IQ seed counts comparable with GQ?

     3. Is IQ robust in terms of user sensitivity?

2 Material and methods

2.1 Study area

The study was performed in broadleaved forests sampled from six stations in central-northern Italy (Fig. 1). Broadleaved forests in Italy represent around 76% of the national forest surface (Tabacchi et al. 2007). We sampled pure forest stands of the three most widespread broadleaved tree species in Italy: beech (Fagus sylvatica L.; 9.9% of the national forest surface; Tabacchi et al. 2007), Turkey oak (Quercus cerris L.; 9.6%), and chestnut (Castanea sativa Mill.; 7.5%). The distribution of the studied forests allowed us to sample a very diverse range of environmental conditions, which is representative of the natural conditions of the sampled forest tree species.

Fig. 1

Study area. Location of the six stations where seed production of temperate broadleaved tree species was sampled with quadrats and seed traps. Green areas are forest coverage in Italy according to CORINE Land Cover level IV (CLC2006_CLC2000_V2018_20 seamless 100 m raster)

2.2 Data collection

Field sampling was carried out in the stations during the period 16 October (day of year (doy) 289)–28 November 2019 (doy 332).

In two stations (AR03; AR04), concurrent measures of seed production were obtained with the three tested methods (LT, GQ, IQ). Nine ground quadrat and nine litter trap measurements were taken for comparison between the three methods in pure forest stands where litter traps were installed, with quadrats placed close to the litter traps; measurements were repeated 2–3 times in the sampled plots (Table 1). Quadrat counting and litterfall collection were performed on the same day for comparison between these methods (Tattoni and Chianucci 2020).

Table 1 Number of seed count measurements collected for pairwise comparison between ground seed counting methods

Additional quadrat measurements were collected in other pure stands of chestnut (AR01, 30 quadrats), beech (TN01, 52 quadrats), and oak (VT01 and AR02, 42 quadrats) by randomly placing quadrats within the stands, with measurements repeated 1–2 times depending on the species (Table 1). Seeds were removed from the traps/quadrats at the end of each sampling session. Table 1 lists the number of measurements collected for pairwise comparison between the seed counting methods.

2.3 Litter trap sampling

Litterfall was collected in nine 0.25-m2 traps systematically distributed inside each permanent plot of AR03 and AR04 (Table 1). Litterfall was collected on the same days as the quadrat counts in these plots, and then separated in the laboratory. The number of seeds per trap from LT was then determined and used as a benchmark for comparison with GQ and IQ.

2.4 Ground quadrat sampling

A quadrat of 0.25 m2 (the same size and shape as the litter traps) was used to count the number of seeds on the ground (Fig. 2). For comparison with LT, the quadrat was placed north of the trap, within 1 m from it. The number of seeds was then manually counted in the field within each quadrat by a single observer (see Fig. 2).
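Because quadrats and traps share the same 0.25 m2 area, raw counts can be compared directly; converting them to seeds per square metre is only needed when relating them to literature values expressed per unit area. A minimal sketch of that conversion follows. The function name and the example count are illustrative assumptions, not part of the study protocol.

```python
# Quadrat and trap area stated in the text (0.25 m2).
QUADRAT_AREA_M2 = 0.25


def seeds_per_m2(count, area_m2=QUADRAT_AREA_M2):
    """Scale a raw seed count from one quadrat or trap to seeds per m2."""
    if area_m2 <= 0:
        raise ValueError("area_m2 must be positive")
    return count / area_m2


# Hypothetical count of 10 seeds in one quadrat.
print(seeds_per_m2(10))  # 40.0
```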

Fig. 2

Examples of seed counting methods used to assess seed production in temperate broadleaved tree species. A ground quadrat (0.25 m2) placed in beech a, Turkey oak b, chestnut c plots. A downward looking camera to collect quadrat images close to a litter trap d

2.5 Image-based quadrat sampling

We used a digital single lens reflex camera (Nikon D90; Sendai Nikon Corp., Otawara, Tochigi, Japan) equipped with a Nikkor 18–200-mm lens locked to 18 mm to collect images of the ground quadrats, which were acquired on the same collection dates as LT (when available) and GQ. The camera was oriented perpendicular to the ground using a leveled tripod with a central column inclinable at 90° zenith angle (Vanguard Alta Pro 263AT, Vanguard World, China). An image centered on each sampled ground quadrat was acquired (see Fig. 2) immediately after each GQ counting. Images were then downloaded, and the seeds were manually counted on screen using ImageJ software Version 1.51 (Rasband 2018; Schneider et al. 2012). The number of seeds estimated from images was compared with those obtained from LT (when available) and GQ. Four images were discarded because of low quality. A total of 219 images of the three species were analyzed (Table 1).

To assess the robustness of the IQ method in terms of its sensitivity to user subjectivity, three users further counted the number of seeds from three image sets, each consisting of 50 randomly selected images for each sampled species from the 219 images acquired. The user experience with the method ranged from beginners with no prior knowledge of seed counting to experts well trained with the IQ method. The results were compared in terms of the number of seeds, their standard deviations, and species-specific differences.
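The random selection of 50 images per species can be made reproducible by fixing a random seed, so that every user counts exactly the same subset. The sketch below is one possible way to draw such subsets. The image identifiers, the number of available images per species, and the function name are hypothetical, since the per-species image counts are not restated here.

```python
import random


def draw_image_subsets(image_ids_by_species, n_per_species=50, seed=0):
    """Draw a reproducible random subset of image IDs for each species."""
    rng = random.Random(seed)  # fixed seed: same subsets on every run
    subsets = {}
    for species, ids in image_ids_by_species.items():
        ids = list(ids)
        if len(ids) < n_per_species:
            raise ValueError(f"not enough images for {species}")
        # Sampling without replacement, then sorting for a stable listing.
        subsets[species] = sorted(rng.sample(ids, n_per_species))
    return subsets


# Hypothetical image identifiers (70 per species, for illustration only).
example_ids = {
    species: [f"{species}_{i:03d}" for i in range(70)]
    for species in ("beech", "turkey_oak", "chestnut")
}
subsets = draw_image_subsets(example_ids)
print({species: len(ids) for species, ids in subsets.items()})
```

All users would then be given the same `subsets`, so that differences in counts reflect the observer rather than the images.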

Table 2 summarizes the procedures of the three methods, including a comparison in terms of materials, costs, and number of operators needed. The cost estimates should be interpreted with care: most of the laboratory equipment may already be available in most research centers, as may the cameras, which can be used for multiple projects. Reflex cameras may be replaced by smartphones with good hardware in some cases, but this also needs testing. In Italy, the placement of an LT in the forest requires a formal authorization, but other countries may have different rules.

Table 2 Comparison of the three methods in terms of equipment, procedures, costs, and operators needed. Costs are tentative estimates; they may vary considerably according to the availability of cameras and laboratory equipment in research units. Travel expenses are calculated for a sampling site located within 100 km, and a day in the field is about 6 working hours

2.6 Statistical analyses

Firstly, we checked the difference in seed counts between the three methods using the Wilcoxon paired test (Zuur et al. 2010). We then compared seed counts using quadrats (GQ, IQ) against reference measurements using LT by fitting regression models. The influence of species on the number of seeds estimated by the counting methods was assessed using ANCOVA. Given that seed count data are usually non-normally distributed and often characterized by many zeros (Touzot et al. 2018; O'Hara and Kotze 2010), we adopted a zero-inflated modeling approach. Several models were tested: zero inflated Poisson (ZIP), zero inflated negative binomial (ZINB), zero altered Poisson (ZAP), and zero altered negative binomial (ZANB), as named by Zuur et al. (2009a). The last two models, also known as hurdle models, are similar to the zero-inflated ones, but more flexible in how they model zeros. We used odds ratios (OR) to interpret the model outputs because the OR quantifies the strength of the association between two events and supports meaningful ecological interpretation (Keating and Cherry 2004; Rita and Komonen 2008). To consider the influence of seasonality on the fitting, we included the month of sampling as a covariate in the regression. We then selected the best models using the Akaike information criterion (AIC) and log likelihood.
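The difference between zero-inflated and zero-altered (hurdle) models lies in where the zeros come from. The sketch below writes out the two probability mass functions for the Poisson case (ZIP and ZAP) and the AIC formula, using only the Python standard library. It is a didactic illustration under assumed parameter values and hypothetical counts, not the fitting procedure used in the study (the analysis was run in R, and the selected models used negative binomial rather than Poisson counts).

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing k events with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero inflated Poisson: a zero arises either from the extra-zero
    process (probability pi) or from the Poisson count process."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zap_pmf(k, lam, p_zero):
    """Zero altered (hurdle) Poisson: all zeros come from a binary part;
    positive counts follow a Poisson truncated at zero."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical seed counts and parameter values, for illustration only.
counts = [0, 0, 0, 1, 2, 0, 5, 3, 0, 7]
ll_zip = sum(math.log(zip_pmf(k, 3.0, 0.4)) for k in counts)
ll_zap = sum(math.log(zap_pmf(k, 3.0, 0.5)) for k in counts)
print(aic(ll_zip, 2), aic(ll_zap, 2))
```

In practice the parameters would be estimated by maximum likelihood and made to depend on covariates such as species and month; the fixed values above only serve to show how the two likelihoods are built.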

We also compared quadrat counts between GQ and IQ by fitting a linear regression to evaluate whether IQ counts are consistent with GQ measurements. Finally, we compared IQ counts made by different users, to evaluate the sensitivity of the method to the observer. ANOVA was used to assess the differences in seed counts between users, independently for each species. The statistical analysis was performed with R and the RStudio interface (R Development Core Team 2011; R Studio 2015).

3 Results

Count data for all species and methods showed a non-normal, skewed distribution with many zeros and some outliers (especially for beech nuts), as illustrated in the box plots of Fig. 3a. The number of seeds per trap measured from LT ranged between 0 and 42 in chestnut (average number ± standard deviation 7.11 ± 11.01), between 0 and 8 in Turkey oak (2.23 ± 2.48), and between 0 and 7 in beech (1.00 ± 1.51).

Fig. 3

a Boxplot showing the number of seeds collected in temperate broadleaved forest with the three methods: IQ (image quadrats), GQ (ground quadrats), and LT (litter traps). b Boxplot showing the number of seeds per species and month of collection. The y-axis of both plots was cut at 50 to improve readability; 27 records of beech counts over 50 are not shown

The relatively high standard deviations were due to a high number of records without seeds: the proportion of zeros was 34.4% for IQ, 26.8% for GQ, and 44.4% for LT. The three species showed different seasonality in seed fall (Fig. 3b): beech nuts fell mainly from mid to late November, while acorns and chestnuts fell mainly in October.

The Wilcoxon paired test showed that seed counts in GQ and LT were not significantly different (p > 0.05), while the same test yielded a significant difference in medians for the pairs LT-IQ (p < 0.05) and GQ-IQ (p < 0.05). We also applied ANCOVA to the data and found a significant effect of species on the number of seeds estimated by the different methods (p < 0.05).
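For readers unfamiliar with the paired Wilcoxon (signed-rank) test, the sketch below shows how the statistic is built from paired counts: zero differences are dropped, absolute differences are ranked (ties receive average ranks), and the ranks of positive and negative differences are summed. The p-value uses a plain normal approximation without tie or continuity correction, which is a simplifying assumption of this sketch; statistical packages apply exact or corrected versions. The paired counts are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W+, W-, two-sided p) for paired samples x and y.

    Simplified: normal approximation, no tie or continuity correction.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_two_sided


# Hypothetical paired counts (ground quadrat vs image of the same quadrat).
gq = [5, 8, 0, 12, 3, 7, 9, 4]
iq = [4, 6, 0, 10, 3, 5, 8, 2]
print(wilcoxon_signed_rank(gq, iq))
```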

3.1 Are ground quadrat and image quadrat seed counts comparable with litter trap counts?

3.1.1 GQ VS LT

To predict the number of seeds in the traps (LT) from ground quadrats (GQ) or image quadrats (IQ), we fitted different models using species and the month of collection as covariates. The best model between GQ and LT was a zero altered negative binomial (ZANB) model, in which the count part fitted a negative binomial distribution relating the seeds counted in the traps to the seeds counted in quadrats, including the interaction with species (GQ × species), while the zero part used species and month as covariates (more details about model selection are reported in Appendix Table 3). The odds ratios show that, in this model, the baseline odds of having a positive count vs zero are 1.19. The odds of detecting some seeds increase significantly in November and decrease for Turkey oak and beech (0.4 and 0.18, respectively). For the positive count part, if there is any seed (positive counts), the average count is 2.27 seeds, but for beech and Turkey oak the average count decreased, compared with the baseline chestnut, by 0.19 and 0.88, respectively (Appendix Table 5). The plot of observed and predicted values (Fig. 4) of the count model shows the different fits of the model according to species and month.
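Odds are easier to read once converted to probabilities with p = odds / (1 + odds). The sketch below applies this conversion to the values reported above, assuming (our reading, not stated explicitly in the text) that 1.19 is the odds for the reference levels, i.e. chestnut outside November, and that 0.4 and 0.18 are odds ratios that multiply this baseline.

```python
def odds_to_probability(odds):
    """Convert odds of an event into its probability."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# Values reported in the text for the zero part of the GQ vs LT model.
baseline_odds = 1.19  # assumed to refer to the chestnut baseline
odds_ratios = {"Turkey oak": 0.4, "beech": 0.18}  # assumed multiplicative

print("chestnut", round(odds_to_probability(baseline_odds), 3))
for species, odds_ratio in odds_ratios.items():
    species_odds = baseline_odds * odds_ratio
    print(species, round(odds_to_probability(species_odds), 3))
```

Under these assumptions, the probability of a positive count is a little above one half for the baseline and drops markedly for Turkey oak and beech, in line with the ordering described in the text.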

Fig. 4

Observed and predicted values of the number of seeds in litter traps (LT) according to the best ZANB model (formula: LT~GQ * species|species + month). a Count part: the estimated seed production in LT from seeds counted in ground quadrats (0.25 m2) is different for the three sampled species. b Zero model: the probability of having a positive seed count (instead of a zero) is higher in November (month 11) for all species and it is different across species

3.1.2 IQ VS LT

The same approach was used to find the best model predicting the relationship between IQ and LT. The best fitting model was a ZINB model, in which the count part fitted a negative binomial distribution relating the seeds counted in the traps to the seeds counted in images, including the interaction with species, while the zero part used species and month as covariates (more details of model selection and model coefficients are available in the Appendix section, Appendix Tables 3 and 4). The count part was the same as in the previous model (GQ VS LT), with the same combination of covariates and similar coefficients. The zero part had the same combination of covariates (species and month) but fitted a zero inflated negative binomial distribution instead of the zero altered negative binomial. The odds ratio for the count part was comparable with that of the above case. The baseline odds of having a positive count vs zero are 1.19; they increased in November and decreased in the case of Turkey oak and beech (0.18 and 0.04, respectively). The positive count model has chestnut as the baseline, with an average count of 2.19, which decreased for beech and Turkey oak compared with chestnut by 0.19 and 0.88, respectively. See the plots of Fig. 5 and Appendix Table 8.

Fig. 5

Observed and predicted values of the number of seeds in litter traps (LT) according to the best ZINB model (formula: LT~IQ * species|species + month). a Count part: the estimated seed production in LT from seeds counted in images of ground quadrats (0.25 m2) is different for the three species of temperate broadleaved trees. b Zero model: the probability of having a positive seed count (instead of a zero) is higher in November (month 11) for all species and it is different across species

3.2 Are image quadrats counts comparable with ground quadrats?

The whole dataset was used to assess the relationship between the seeds counted in the images (IQ) and the seeds counted in the field (GQ). Given the nature of these counts, we expected a linear relationship between IQ and GQ, so we tested several linear models with different covariates and interactions. The best model (lowest AIC) was the simplest one, a linear regression between IQ and GQ (Fig. 6, R squared = 0.96, intercept = − 0.5654, slope = 0.8634); more details about the tests and model selection can be found in Appendix Table 9.
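The reported coefficients can be used to translate a ground count into the image count expected for the same quadrat. The sketch below first shows an ordinary least squares fit written in plain Python (intercept, slope, and R squared), then applies the coefficients given above. The data used to exercise the fitting function are hypothetical; only the intercept and slope constants come from the text.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Coefficients reported in the text for IQ as a function of GQ.
INTERCEPT = -0.5654
SLOPE = 0.8634


def expected_iq(gq_count):
    """Image count expected for a given ground count (reported model)."""
    return INTERCEPT + SLOPE * gq_count


# Hypothetical paired counts, only to exercise fit_line.
gq = [0, 2, 5, 10, 20, 40]
iq = [0, 1, 4, 8, 17, 34]
print(fit_line(gq, iq))
print(expected_iq(10))
```

A slope below 1 together with a negative intercept means that, on average, fewer seeds are counted on an image than in the field for the same quadrat.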

Fig. 6

Observed values and regression line relating seeds counted in images to seeds counted in quadrats (0.25 m2) for three temperate broadleaved tree species (formula: IQ = − 0.57 + 0.86 GQ, adjusted R-squared: 0.96)

3.3 Is the image quadrat method robust in terms of user sensitivity?

Three different operators counted the seeds in the same 50 randomly picked images per species, for a total of 150 images. The mean counts ranged between 12.5 and 13.6 in chestnut, between 5.5 and 5.8 in Turkey oak, and between 13.1 and 15.7 in beech. ANOVA detected no significant differences among users (p = 0.9), indicating that counting seeds on images is a reliable method that is insensitive to users and their previous experience in counting seeds.
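The user comparison amounts to a one-way ANOVA with users as groups, run separately for each species. The sketch below computes the F statistic and its degrees of freedom in plain Python; the p-value would then be read from the F distribution (for example with a statistics package), which is left out to keep the example dependency-free. The counts are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical seed counts by three users on the same five images.
user_a = [12, 0, 7, 30, 5]
user_b = [13, 0, 6, 31, 5]
user_c = [12, 1, 7, 29, 4]
print(one_way_anova_f([user_a, user_b, user_c]))
```

An F value close to zero, as obtained when users agree closely, corresponds to a large p-value such as the one reported above.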

4 Discussion

The main result of the study is that quadrat counting, both IQ and GQ, can be used to quantify the number of seeds as compared with litter traps (LT). In fact, the two best models had an almost identical fit for the count part, in terms of coefficients, their significance, and overlapping predicted values (Figs. 4a and 5a). The models correctly predicted a higher number of chestnut seeds compared with Turkey oak and beech. Chestnut and oak coefficients were significant in both models, while they were not significant for beech. Despite fitting slightly different distributions (zero altered for GQ and zero inflated for IQ), the probability of zero counts is similar across models (Figs. 4b and 5b). The comparison also indicated a different performance of the method according to the sampled species. The results are in accordance with the findings of Touzot et al. (2018) for sessile oak (Quercus petraea) acorns. Unlike the previous study, in which four 0.25 m2 quadrats were used for comparison with a 20-m2 trap, we also demonstrated that a single quadrat of the same size as a single litter trap gives a reliable estimate; this reduces the sampling effort per trap while allowing the number of quadrat samples across a site to be increased. This is an advantage per se of the GQ/IQ method, as establishing traps and monitoring seeds using LT is costly and time consuming, which limits their usage to the plots where they are permanently installed, subject to prior authorization.

Sampling in November increased the probability of finding seeds for all three species considered in this study. In addition, late autumn is the time of the year when the herbaceous understorey vegetation is at its minimum, which facilitates ground sampling with IQ and GQ. In the case of a thick leaf litter layer, the operator can move the camera or remove some of the leaves to take a better image.

With reference to beech, results indicated a lower number of seeds and a higher probability of not finding any, compared with the other species. We attributed this outcome to the combined effect of the relatively small size of beech nuts and the low fruiting observed in 2019 (about three-fourths of the quadrats had zero seeds), which likely complicated the detection of the few, small seeds in quadrats. While Touzot et al. (2018) observed that no seeds in the quadrats correspond to very low fruiting levels, we speculate that the agreement between GQ and LT would increase in high fruiting (mast) years in beech, when production may reach more than 150 nuts/m2 (Burschel et al. 1964) and up to over 300 nuts/m2 (Schmidt 2006).

An additional improvement of the GQ method in this species would be collecting the litter in the quadrat and then separating it in the laboratory, to reduce the probability of missing counts of small seeds in the field. This procedure is likely to help counting small-sized seeds of other trees, such as ashes and hornbeams (Czeszczewik et al. 2020).

A zero-inflated modeling approach was necessary to account for the high number of zeros in the data set, which is typical of seed fall, particularly in low fruiting years (Calama et al. 2011; Touzot et al. 2018). We recommend further investigation, on a multiannual time-scale, to evaluate the model fitting of seed counts in higher fruiting (particularly mast) years.

With reference to quadrat counting, we demonstrated that IQ counts closely track GQ counts (R squared = 0.96). This outcome further extends the applicability of quadrat sampling, since collecting images is faster than ground quadrat counting, allowing a higher number of samples to be collected. In addition, IQ has further advantages compared with GQ. Firstly, images are permanent records, which can be inspected to check data quality (Chianucci et al. 2019b). Secondly, IQ is robust and provides similar estimates of seed numbers, irrespective of the users and their previous knowledge of seed counting. Thirdly, images can be re-analyzed to test different methods of counting seeds from images. Along this line, image classification methods could be developed and compared against manual counting from seed images. At the moment, available image classification tools such as object-based image analysis (OBIA) or convolutional neural networks (CNN) offer promising solutions. However, these methods have been tested only in laboratory conditions and for agricultural crops, not for forest seeds against a natural, spectrally complex background. Both solutions require some training of the algorithms, for classification (OBIA) or for detection and recognition of a search image of the seeds (CNN).

A potential limitation of IQ is that seed fall typically occurs simultaneously with leaf fall; fallen leaves on the ground can therefore hide seeds and cause IQ to underestimate seed numbers compared with LT and GQ. Although IQ and GQ counts were strongly correlated, the Wilcoxon test detected a significant difference between them and the regression slope was below 1, which is consistent with some underestimation by IQ. This issue further supports the need to set a proper timing for quadrat counting, possibly increasing the frequency of sampling.

To sum up, quadrat counting represents a simple, robust, cheap, and flexible method to estimate tree seed number in the field. From a practical viewpoint, considering the long-term perspective of seed (and masting) dynamics, we suggest calibrating quadrat counts with measurements obtained from LT in calibration plots (when available); this would also allow relating seed number to either continuous (seed mass) or categorical (low, medium, high fruiting, or other scales) measurements of seed production in reference plots. Both GQ and IQ could then be used more routinely for indirect measurements and monitoring of seed production over larger areas.

As we focused on purely methodological differences, we did not apply a probabilistic sampling scheme, as would be recommended for, e.g., estimating seed production at the tree, stand, or site level. Therefore, future investigations are required to assess the sampling effort (scheme and number of samples) required to obtain statistically representative estimates of seed production at different sampling domains. For the aim of this study, we sampled pure stands, but we believe that the method can also be applied in mixed forests, as the seeds of each species are recognizable with a high degree of accuracy by a trained operator.

5 Conclusions

Field estimates of seed production are essential for a wide range of ecological studies. We demonstrated that ground quadrats are fast, simple, and reliable tools to assess seed production in broadleaved forest stands, being comparable with the more time-consuming litter trap method. Among the quadrat methods, image quadrats hold great potential for future research and applications, including the development of seed monitoring systems based on continuous camera acquisition, as already exists in other monitoring applications such as tree phenology (Chianucci 2020; Richardson et al. 2018). Along this line, downward looking cameras could be installed to capture images of the forest floor for counting the number of seeds. This option would further reduce (or avoid) the effort required for field surveys, while allowing an assessment of seed fall patterns at a finer, daily temporal scale.