Introduction

Pecan (Carya illinoinensis (Wangenh.) C. Koch) is native to North America and is a nut crop species of considerable economic importance. In the U.S.A. in 2021, 197,891 ha were cultivated to pecan, producing 1.1 × 10⁸ kg valued at US$5.5 × 10⁸ (NASS 2022). The species is deciduous, monoecious, heterodichogamous, and wind-pollinated (Sparks 2005). Pecan’s native range extends from the northern U.S.A. in Illinois, Iowa and Ohio in the Midwest, throughout the Mississippi River watershed and the rivers of eastern and central west Texas and Louisiana, and south to Oaxaca in southern Mexico (Wood et al. 1998). The range spans 26° of latitude. Due to the extent of its natural range, pecan has adapted to a wide range of climates, from mild to harsh winters and from very humid to semiarid conditions, which suggests great genetic diversity within the species (Sparks 2005).

A provenance collection of pecans representing the species’ native range is located at the USDA − ARS Southeastern Fruit and Tree Nut Research Station (SEFTNRS) in Georgia (Wood et al. 1998). Only a few horticulturally important traits have been characterized in the pecan provenance collection, including budbreak, leaflet tilt and droop angle, leaf fall in the middle of December (Wood et al. 1998; Rüter et al. 1999), and scab disease susceptibility (Bock et al. 2016, 2020). An important trait is canopy structure and associated characteristics. An understanding of natural variation of canopy development is important for improvement of crops (Bagley 1980), and characteristics of a tree canopy are used to describe the interaction between vegetation and its environment (Welles and Norman 1991). In forestry, canopy cover can determine the microhabitat within the forest, as it plays a role in determining the nature of vegetation and wildlife habitat (Jennings et al. 1999). In pecan orchards, canopy characteristics may be related to genotype, pest and disease effects, or other aspects of tree health. Furthermore, canopy structure is the main factor that controls quantity and quality of light and its distribution on a temporal and spatial scale (Jennings et al. 1999), and this directly impacts nut quality (Potter et al. 2012). With respect to the pecan provenance collection, data on canopy characteristics can also provide insight into the genetic control and environmental cues involved in the timing and duration of leaf fall in autumn, and canopy development in spring.

In pecan, foliage retention late in the season is important for nutrient and carbohydrate reserves, and subsequent floral development and fruit set the following year (Westwood 1978; Worley 1979). Nutrients are transported and photosynthate generated in leaves, thus the period during which they are retained is important to the tree (Kim and Wetzstein 2005). Genotypes and management practices that promote late-season leaf retention should prevent nutrient losses associated with premature defoliation. Indeed, genotypes with different natural leaf fall dates have been observed in the provenance collection (Wood et al. 1998). Previous observations indicated the most northern provenances lost more leaflets earlier in the season compared to southern provenances. For example, genotypes from Livingston (MO, U.S.A.) had shed 99% of leaflets by 15 December compared with only 49% for genotypes from Ixmiquilpan in Hidalgo, Mexico (MX) (Wood et al. 1998). But how the progress of natural leaf fall affects canopy density in different genotypes has not been quantified. Methods to quantify canopy density rapidly and effectively in pecan will be useful for comparing canopy duration, loss, and timing. The information can be important for assessing germplasm for foliar retention characteristics, a critical aspect of ensuring cultivars are appropriate for the latitude or climate in which they are being grown.

Visual methods are widely used to estimate areas affected by a particular variable for different purposes in biological sciences. These include assessing disease severity (Bock et al. 2022), weed cover (Andújar et al. 2010; Damgaard 2014), forest canopy or vegetation cover (Abdollahnejad et al. 2017), and foliage density (Frampton et al. 2001). Although remote sensing technologies (e.g., terrestrial laser scanners, also known as light detection and ranging (LiDAR) systems) can be used to obtain various measures of canopy foliage, including leaf area index and leaf area density, they are relatively costly, time consuming to develop and analyze, and require technically skilled individuals (Adams et al. 2011; Li et al. 2017; Yang et al. 2019). Visual methods are rapid and accurate, have been used to assess foliage density to monitor forest health for various reasons, and are also required as a good ‘check’ for remote sensing technologies (Meads 1976; Sutton 1981; Innes 1988, 1993; Leutert 1988; Landsberg 1989; Hunter et al. 1991; Payton et al. 1997; Frampton et al. 2001; U.S. Forest Service 2022). To maximize rater accuracy, Frampton et al. (2001) developed a standard set of two-dimensional silhouettes, representing a range of foliage densities, to aid raters in estimating canopy foliage density. The U.S. Forest Service uses a crown density foliage transparency scale to aid assessments of canopy ratings (U.S. Forest Service 2022). On the rating aid, the white areas represent sunlight visible through the crown and the black areas represent the portions of the crown that block sunlight. The rater (the person doing the rating) uses these to guide decisions when estimating the percentage of canopy foliage versus the background light. The principle is similar to the standard area diagrams used for disease severity estimation (Del Ponte et al. 2017, 2022).

Canopy density has also been measured using image analysis (Chan et al. 1986; Englund et al. 2000; Goodenough and Goodenough 2012). This approach has been applied to measure the porosity of windbreaks (Stredova et al. 2012), although more sophisticated GIS-based approaches are also used (An et al. 2022). In some respects, the analyzed image measurements may be considered actual or true values (“gold standards”) compared to visual estimates, as has been done with estimates of disease severity on leaves (Yadav et al. 2013; Del Ponte et al. 2017). However, measurements by image analysis will be slightly biased, as repeat measurements of the same specimens are not identical (Martin and Rybicki 1998; Melo et al. 2020). But visual estimates vary far more than image analyzed measurements in accuracy and reliability, which can impact the outcome of a hypothesis test. For example, in an estimation of weed cover, rater estimates were shown to be accurate but not reliable, particularly in the low to middle range (Andújar et al. 2010). In plant disease severity assessment, visual estimates can be accurate and reliable, but this is highly dependent on the rater (Nutter et al. 1993; Yadav et al. 2013) and, in some cases, on the use of standard area diagrams to aid the raters’ estimates (Del Ponte et al. 2022). Instruction and training are also critical (Bardsley and Ngugi 2013). However, to the best of our knowledge, visual estimates of pecan canopy density have not been subject to the same investigations of accuracy comparing them to measured or assumed true (actual or gold standard) values.

The objectives of this study were to: (1) determine the ability of visual raters to estimate pecan canopy density as a measure of leaf fall both accurately and reliably; (2) explore the impact of different rater data on the outcome of an analysis, and assess whether type II errors might result from differences in ability among raters; and (3) determine the relationship between mid-autumn (approximately mid leaf fall) canopy density of pecan trees from different provenances and the source latitude of those provenances in North America.

Materials and methods

Orchard and trees

This study was conducted at the pecan provenance collection at the USDA − ARS Southeastern Fruit and Tree Nut Research Station (SEFTNRS) in Byron, Georgia (32°39′54"N, 83°44′31"W). The site has Faceville sandy loam soils (FoA; fine, kaolinitic, thermic Typic Kandiudult soil), an annual precipitation of 118 cm, and 240 frost-free days/year, at an elevation of 156 m a.s.l. The provenance collection has been previously described by Grauke et al. (1989), Wood et al. (1998) and Rüter et al. (1999). Seeds were collected as far north as Illinois and Missouri in the U.S.A. to as far south as Oaxaca, Mexico, representing the native range of the species. The provenance collection includes progeny from five maternal trees collected in 1987 from 19 naturally seeded locations, or provenances (Rüter et al. 1999). Within each provenance, nuts were collected from trees 50 m to 10 km apart to represent genetic diversity. A “family” was defined as “nuts collected from each tree within a provenance”; family members are thus half-sibs or full sibs, as the paternal parent is unknown but conceivably may be the same pollen parent. Originally the collection consisted of 923 trees, with the number of progeny ranging from 2 to 18 per maternal tree and most maternal trees (68 of 90) represented by eight or more progeny. Trees were initially planted at 10.5 m within and between rows. Heights were approximately 17 m. In 2007, the orchard was transplanted from the original location to a second one at USDA − ARS − SEFTNRS as described by Bock et al. (2016). The orchard was not managed beyond weed control and nutrient application as recommended for pecan production in Georgia (Wells et al. 2021).

Image analysis of canopy foliage density

A subset of 76 trees from the 19 provenances was included in the study (Table 1). Four trees were selected randomly from each of the provenances and represented some of the different families within each provenance. The experiment was carried out on 16 November 2018 during the mid-late leaf fall period. Weather conditions were fair, with clear skies, no wind, and a temperature of 16.6 °C. A digital image of the canopy of each tree was captured at ground level using a digital camera (Nikon D7000, Japan) aimed upwards to capture the canopy against the sky, with the photographer standing as far back from the tree as spacing allowed (~ 10.5 m). APS Assess 2.0 image analysis software (Lamari 2002) was used to measure the true value for leaf shed by applying the CIELAB color space (L*a*b*) model, where L* is the lightness, a* the green − red axis, and b* the blue − yellow axis for chromatic colors. The L*a*b* model separated the foliage from the background to obtain percentage foliage area from total area pixels and canopy area pixels (Fig. 1).
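
The segmentation principle can be illustrated with a minimal sketch. It is not the APS Assess 2.0 procedure, whose internal thresholds are not described here. The sketch assumes 8-bit sRGB pixels, uses the standard sRGB to CIELAB conversion (D65 white point), and classifies a pixel as foliage when b* exceeds a hypothetical threshold, because clear sky lies toward the blue (negative) end of the b* axis. The function names and the threshold value are illustrative assumptions.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (D65 reference white)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 white point

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def foliage_percent(canopy_pixels, b_threshold=5.0):
    """Percentage of canopy-area pixels classified as foliage.

    b_threshold is a hypothetical cut-off on the b* (blue-yellow) axis;
    it would need tuning for real images.
    """
    if not canopy_pixels:
        raise ValueError("no pixels supplied")
    foliage = sum(1 for p in canopy_pixels if srgb_to_lab(*p)[2] > b_threshold)
    return 100.0 * foliage / len(canopy_pixels)


# Toy example: three sky-blue pixels and one olive-green foliage pixel
pixels = [(135, 206, 235)] * 3 + [(60, 80, 30)]
print(foliage_percent(pixels))
```

In practice the pixel list would be restricted to the canopy area of the photograph, so that the result is foliage pixels as a percentage of canopy area pixels.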

Table 1 The 19 provenances of pecan in the study and their latitudes; locations are listed by latitude from north to south. Four trees from each provenance were included
Fig. 1 Tree canopy from Jaumave county in the state of Tamaulipas, a Mexican provenance (MX − J) (A) and an image of the same canopy with the leaf area segmented out and highlighted in red (44.5%) using the image analysis software, APS Assess 2.0 (B)

Rater assessment of canopy density, instructions, and training

Four raters visually assessed the canopies for density, each of whom had some prior experience of disease severity assessment using a percentage scale. Their experience ranged from a few days to 15 years. All raters made their assessments on the same days and at the same times. Foliage was assessed on a percentage scale (0 to 100%). Despite their prior experience of area estimation, raters were instructed on how to use the percentage scale, and the concepts of leaf/leaflet fall and canopy density were described. Further training was provided by showing the raters example trees with a range of canopy densities from zero to 100%.

Data analysis

The image analysis values were presumed to be the actual or true values, or gold standards, and the visual estimates of the 76 trees by the four raters were compared to the measured image analysis value. Lin (1989) concordance correlation (LCC) (Nita et al. 2003) analysis was used to evaluate the degree to which the estimates fell on the line of concordance (45°, where slope = 1, intercept = 0). There is perfect concordance between the estimates and the measured values when the LCC statistics of systematic scale bias, υ = 1, location (constant) bias, μ = 0, overall bias or accuracy, sometimes called the bias correction factor or generalized bias and an overall measure of how far a line of best fit is from the 45 degree line of concordance, Cb = 1, precision (Pearson’s correlation coefficient), r = 1, and agreement, which is Lin (1989)’s concordance correlation coefficient, LCCC, ρc = 1. Deviation from these values indicates bias, loss of precision and loss of agreement. Analyses of LCC were performed in MS Excel.
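
The LCC statistics can be written out explicitly. Following Lin (1989), with Y the rater estimates and X the image analysis measurements, υ = σY/σX, μ = (mean Y − mean X)/√(σY σX), Cb = 2/(υ + 1/υ + μ²), and ρc = r × Cb. The sketch below is an illustration in Python rather than the MS Excel workbook used in the study; the function name is hypothetical, and the choice of 1/n moments and of the estimate as the numerator of υ are assumptions.

```python
import math


def lin_concordance(estimates, measured):
    """Lin's concordance statistics for paired estimates and measured values."""
    n = len(estimates)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    mean_y = sum(estimates) / n
    mean_x = sum(measured) / n
    # Population (1/n) moments, an assumption following Lin (1989)
    var_y = sum((y - mean_y) ** 2 for y in estimates) / n
    var_x = sum((x - mean_x) ** 2 for x in measured) / n
    if var_y == 0 or var_x == 0:
        raise ValueError("both samples must vary")
    cov = sum((y - mean_y) * (x - mean_x) for y, x in zip(estimates, measured)) / n
    sd_y, sd_x = math.sqrt(var_y), math.sqrt(var_x)

    scale_bias = sd_y / sd_x                                     # upsilon, 1 = none
    location_bias = (mean_y - mean_x) / math.sqrt(sd_y * sd_x)   # mu, 0 = none
    r = cov / (sd_y * sd_x)                                      # precision
    c_b = 2.0 / (scale_bias + 1.0 / scale_bias + location_bias ** 2)  # accuracy
    return {
        "upsilon": scale_bias,
        "mu": location_bias,
        "Cb": c_b,
        "r": r,
        "rho_c": r * c_b,  # agreement
    }


# Made-up example: a rater who is perfectly precise but always 5 points high
stats = lin_concordance([5, 15, 25, 35], [0, 10, 20, 30])
print(stats)
```

In this made-up example r = 1 and υ = 1, but the constant overestimate lowers Cb, and therefore ρc, to about 0.909, which shows how a rater can be precise without being in full agreement with the measured values.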

Absolute error (the visual estimate of canopy density minus the image analysis measurement of canopy density) was calculated for all estimates by the raters. Relative error of each estimate was also calculated (absolute error ÷ image analysis measurement of canopy density × 100).
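
As a brief worked sketch of the two error measures, the functions below (hypothetical names) apply the definitions above. The example values are one rater mean and the corresponding measured mean reported in the Results for provenance MX − C, used here only as arithmetic. Returning `None` when the measured value is zero is an assumption; the treatment of zero measurements in the study is not stated.

```python
def absolute_error(estimate, measured):
    """Visual estimate minus image analysis measurement (percentage points)."""
    return estimate - measured


def relative_error(estimate, measured):
    """Absolute error as a percentage of the measured value.

    Undefined when the measured value is 0; None is returned in that case
    (an assumption made for this sketch).
    """
    if measured == 0:
        return None
    return absolute_error(estimate, measured) / measured * 100.0


print(absolute_error(68.8, 41.3))   # about 27.5 percentage points
print(relative_error(68.8, 41.3))   # about 66.6%
print(relative_error(2.0, 0.0))     # None: no relative error at 0% measured
```

Because the measured value is the divisor, a fixed overestimate of a few percentage points translates into a very large relative error when the canopy is nearly bare.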

Remaining analyses were performed using SAS V9.4 (SAS Institute, Cary, NC, USA). The inter-rater reliability of estimates of canopy density was measured using the coefficient of determination (R²) for each pairwise combination of raters with linear regression analysis. The coefficient of determination reflects the proportion of variation explained by the relationship and indicates how closely one measurement predicts the other.
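
The pairwise comparison can be sketched as follows. This is an illustration in Python, not the SAS procedure used in the study; the rater labels and values are made up. For simple linear regression with one predictor, R² equals the squared Pearson correlation, which is what the sketch computes.

```python
from itertools import combinations


def r_squared(x, y):
    """R² of the simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both samples must vary")
    return sxy ** 2 / (sxx * syy)


def pairwise_r_squared(ratings):
    """R² for every pair of raters; `ratings` maps rater label -> estimates."""
    return {
        (a, b): r_squared(ratings[a], ratings[b])
        for a, b in combinations(sorted(ratings), 2)
    }


# Made-up estimates (%) for four trees by three hypothetical raters
ratings = {
    "rater_1": [10, 20, 30, 40],
    "rater_2": [12, 22, 32, 42],
    "rater_3": [10, 25, 28, 45],
}
for pair, value in pairwise_r_squared(ratings).items():
    print(pair, round(value, 3))
```

Note that R² measures how closely two raters track one another, not whether they agree in absolute terms: in the made-up data, rater_2 is always two points higher than rater_1 yet the pair has R² = 1.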

To determine the impact of raters on the outcome of the hypothesis that there are no differences among canopy measurements or estimates from the different provenances (H0), canopy density measurements and estimates were analyzed using an ANOVA. A generalized linear mixed model (GLIMMIX) explored the effects of rater and family, and the rater × family interaction. Slices of main effects of rater and family were taken, and the simple main effects of the variables analyzed. Means separation was by Tukey’s HSD (α = 0.05). Because the data deviated from normality and exhibited heterogeneity of variance, they were \(\arcsin\left(\sqrt{\%\,\mathrm{area}/100}\right)\) transformed prior to analysis. The results were back transformed as \(\sin^{2}(\text{transformed area})\). Box plots of image analysis measurements and rater estimates were prepared for each of the provenances to compare the medians, the means, the 25th and 75th percentiles, the minimum and maximum values of the variable within the lower and upper fences (1.5 times the interquartile range), and outliers.
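
The transformation and its inverse can be sketched in a few lines. The function names are hypothetical, and multiplying the back-transformed value by 100 to return to the percentage scale is an assumption of this sketch.

```python
import math


def arcsine_sqrt(percent_area):
    """Arcsine square-root transform of a percentage (0 to 100), in radians."""
    if not 0.0 <= percent_area <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(percent_area / 100.0))


def back_transform(transformed):
    """Inverse transform, sin²(value), rescaled to a percentage."""
    return math.sin(transformed) ** 2 * 100.0


values = [0.0, 11.9, 41.3, 100.0]
transformed = [arcsine_sqrt(v) for v in values]
print([round(t, 4) for t in transformed])
print([round(back_transform(t), 1) for t in transformed])
```

The transform stretches the scale near 0% and 100%, where percentage data are most compressed, which is why it is commonly applied before ANOVA. A mean computed on the transformed scale and then back transformed will generally differ from the arithmetic mean of the raw percentages.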

The relationship between canopy density and provenance latitude was explored using linear regression analysis. F and P values were used to ascertain model fit. Standard errors of the intercept and slope were calculated, and the R² used to determine proportion of variation explained by the relationship. The coefficient of variation (CV) for the regression was calculated as the ratio of the root mean squared error (RMSE) to the mean of the dependent variable. The CV is a unitless measure used to compare variation from one data series to another, even if the means are different; in this case, the CV could provide insight into the variability in estimates of canopy density associated with individual raters.
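
The regression summary statistics described above can be sketched as follows. This is an illustration in Python, not the SAS analysis; the function name and data are made up. The sketch assumes the RMSE is computed with n − 2 residual degrees of freedom and that the CV is expressed as a percentage (100 × RMSE ÷ mean of the dependent variable).

```python
import math


def regress(x, y):
    """Ordinary least squares fit of y on x with R², RMSE and CV (%)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with at least three pairs")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must vary")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    rmse = math.sqrt(sse / (n - 2))  # residual df = n - 2 (assumption)
    return {
        "slope": slope,
        "intercept": intercept,
        "r2": 1.0 - sse / sst if sst > 0 else float("nan"),
        "rmse": rmse,
        "cv_percent": 100.0 * rmse / mean_y if mean_y != 0 else float("nan"),
    }


# Made-up data, not values from the study
fit = regress([1, 2, 3, 4], [2, 4, 5, 4])
print({k: round(v, 3) for k, v in fit.items()})
```

Because the CV divides by the mean of the dependent variable, data sets with a low overall mean, such as canopy densities late in leaf fall, can produce a large CV even when the regression itself is highly significant.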

Results

Because of the extensive genetic diversity of the collection, some trees of Mexican origin showed dense canopies whereas trees from more northern provenances had less canopy foliage due to leaf fall. During November, in the late fall/early winter period, there was a range in canopy density, which was reflected in the proportion of samples with different measurements of canopy density based on image analysis (Fig. 2). The overall mean canopy density was lower for the image analysis measurements (11.9%) than for the visual estimates by the four raters (15.8 to 18.4%) (Table 2). Similarly, the standard deviation and variance of the samples differed, being least for the image analysis measurements compared to those by the four raters.

Fig. 2 Frequency (%) of different pecan canopy densities as measured using image analysis from canopies of 76 trees from 19 provenances

Table 2 Overall mean, standard deviation and variance of measurements and estimates of canopy density of pecan trees from different provenances by four raters

Accuracy compared to image analysis

All four raters showed a tendency to overestimate canopy density compared to the measured values (Fig. 3). The characteristics of the agreement for each rater with the measured values were also similar. Thus, all raters demonstrated scale bias (υ = 1.313 to 1.528) and constant bias (μ = 0.204 to 0.319). Both measures of bias indicated a tendency to overestimate, particularly at higher canopy densities. However, the generalized measure of bias, or accuracy (Cb), indicated a relatively small loss in overall accuracy (0.876 to 0.944) among the raters. Precision of the estimates was consistently high (r = 0.969 to 0.975). Overall, the LCCC demonstrated reasonably good agreement with the measured values for each of the raters (ρc = 0.849 to 0.915).

Fig. 3 Lin’s concordance correlation measurements of four different raters estimating canopy density of pecan trees from different families and provenances compared to measured values using image analysis. Note: υ = systematic bias, μ = location bias, Cb = overall bias or accuracy, sometimes called the bias correction factor or generalized bias, r = precision (Pearson’s correlation coefficient), and ρc = agreement (Lin’s concordance correlation coefficient)

The absolute error was small when the measured values were < 10% but increased with increasing canopy density measurements (Fig. 4a). The absolute error also showed the tendency of raters to overestimate in most cases. However, the relative error of estimates was much greater at low canopy densities (often > 200%) compared to canopy densities > 30% (Fig. 4b).

Fig. 4 Absolute (A) and relative (B) error of four different rater estimates of canopy density of pecan trees from different families and provenances compared to measured values using image analysis

Inter-rater reliability

Each of the pairwise combinations of raters demonstrated good inter-rater reliability (R² = 0.910 to 0.953) (Table 3). This indicates that the four raters in the study used similar characteristics in estimating the canopy density, as most of the variability was explained by the estimates among raters.

Table 3 Inter-rater reliability of estimates of canopy density of pecan trees from different provenances

Comparison of treatments based on measured values and rater estimates

The GLIMMIX analysis showed that there were significant effects of provenance and rater but there was no interaction (Table 4), indicating that differences were consistent across provenances for each rater. This was further explored with an analysis of simple main effect by rater and provenance. There were no significant differences among the measurements or estimates of canopy density for any pecan provenance, with the exception of provenances MX − C and MX − S, where rater 4 estimated 68.8% compared to the measured value of 41.3%, and where rater 2 estimated 64.6% compared to the measured value of 39.5%, respectively. For provenances MX − C and MX − S, other raters were not different to the measured value or rater 4 (data not shown).

Table 4 Type III fixed effects of pecan provenance canopy foliage densities as measured by image analysis and estimated by four raters

The analysis of simple main effects by rater showed that the provenance effect was always significant, whether based on the measurements by image analysis (the presumed actual values) or on estimates by raters (Table 5). In addition, the magnitude of the individual mean estimates, although most often greater for rater estimates, had minimal impact on the ranking of provenance canopy densities. In fact, the ranking of provenances was most often similar to that based on the measured values by image analysis. Nevertheless, the number of groupings based on Tukey’s post hoc test varied between the measurements and estimates: six groupings for image analysis and raters 2 and 4; seven groupings for rater 1; and five groupings for rater 3. MO − L, the northernmost provenance, had the lowest canopy density as measured by image analysis and as estimated by all raters, but grouped with several other provenances. MX − C, MX − S, MX − O, MX − I and MX − J, the southernmost provenances, were grouped together regarding canopy density when measured by image analysis, and by raters 2 and 4. MX − O was not grouped with the other MX provenances based on estimates by raters 1 and 3. There were similar inconsistencies with the provenances from TX. But invariably, differences in groupings and ranking were small, with any particular provenance no more than four ranks different among raters, and most often still grouped the same relative to other provenances.

Table 5 Results of a generalized linear mixed model analysis of Rater × Provenance (sliced by rater) exploring the effect of measurement or rater estimate of canopy foliage density of pecan trees from different provenances by image analysis and four raters

The box plots showed that, although the overall patterns of canopy foliage density were very similar for the measured values and the rater values for the different provenances, the variance of rater estimates tended to be greater for the canopies with more foliage (Fig. 5a − e). The means, medians, 25th and 75th percentiles, and the minimum and maximum values of the provenances with low canopy density (< 20%) from rater estimates tended to be similar to those of the image analyzed set.

Fig. 5 Box plots of measured canopy density (A) and estimates of canopy density by rater 1 (B), rater 2 (C), rater 3 (D) and rater 4 (E) for pecan trees in each of 19 provenances. Solid horizontal lines represent the median, black dots the mean, upper and lower limits of the box represent the 75th and 25th percentiles of the data, respectively, and vertical lines from the box indicate minimum and maximum values of the variable within the lower and upper fences (1.5 times the interquartile range)

Relationship between fall canopy density and latitude

There was a consistent negative relationship between canopy density and latitude, regardless of whether image analyzed measurements or rater estimates were used (Fig. 6a − e), and in all cases, the linear regression model P values were < 0.0001. The results show that the higher the source latitude of the provenance, the lower the canopy density in autumn. The relationship was moderate, regardless of whether the measured values or rater estimates of canopy foliage density were used. Thus, the source provenance latitude explained approximately 0.61 of the variation using image analysis, and 0.55 to 0.64 when based on the rater estimates of canopy density. The CV indicated similar variability, whether the data were image analyzed (87.4) or estimated by a rater (82.7 to 92.6). Interestingly, individual trees within a provenance tended to show similar canopy density characteristics, i.e., the within provenance range was much smaller than the between provenance range when comparing the more northern versus southern provenances.

Fig. 6 Relationship between foliage density as a function of leaf fall on November 16, 2018, and latitude for image analyzed canopy density and rater estimates of pecan trees from different provenances; linear regression statistics are presented in each chart

Discussion

The results show that visual assessment of pecan canopy density, due to late season leaf fall for comparing pecan genotypes, provides accurate and reliable estimates. The four raters, each of whom could be considered experienced in area estimation using the percentage scale and who received instruction and training, estimated canopy densities with good agreement to image analysis values (ρc = 0.849 to 0.915). Bias was evident with a tendency for all four raters to overestimate, which is not unusual. Although many rater estimation characteristics exist (Nutter and Schultz 1995; Bock et al. 2009), several studies of plant disease severity estimations have found that raters most often tend to overestimate disease severity, particularly at low severity (Bock et al. 2010, 2022). Interestingly, in the current study overestimation was observed when canopy density was > 20%. However, the sample size of raters was small, and a larger sample would be needed to characterize general rater tendencies to over or underestimate. The results did show that estimates among raters were reliable – the interrater reliability was high, which corroborates results from previous studies of canopy densities with other species (Frampton et al. 2001). As noted, the rater estimates were more variable, particularly at canopy densities > 20% when compared to the assumed true values from the image analyzed data set.

In this study, no assessment aids were used to guide the rater estimates of canopy density. However, assessment aids have been developed and used to guide rater estimates of canopy densities (Wang et al. 1992; Ganey and Block 1994; Frampton et al. 2001; U.S. Forest Service 2022). The principle is that the rater can use these to focus on an appropriate canopy density estimate and interpolate between the two diagrams that bracket the sample being assessed. A similar aid, known as a standard area diagram, is widely used in plant disease severity estimation to improve the accuracy and reliability of disease severity estimates (Del Ponte et al. 2017, 2022). The advantages of using such aids to increase accuracy and reliability of estimates of canopy density in pecan have not been tested, but testing would be worthwhile, as such aids may maximize the accuracy of estimates for rating genotypes that may be used in breeding programs.

The material used in this study was a subset of four trees from each of the 19 provenances in the pecan collection, representing genotypes from a wide latitudinal range, and the results clearly demonstrate the ability of raters to differentiate canopy densities among the genotypes from different provenances. The results are in general agreement with observations of leaf fall by Wood et al. (1998). However, leaflet counts per leaf are labor intensive and time-consuming for a large collection, where just a few days between measurements might result in dramatic changes in leaflet numbers. The genotypes from the more northerly latitudes, e.g., MO − L = 0% canopy density by image analysis, had the lowest density on 16 November. Conversely, provenances from more southerly latitudes, e.g., MX − I = 47.3% canopy density by image analysis, still retained most foliage and had the highest canopy densities. Within a provenance, canopy densities were similar. The trees came from different families within provenances (if from the same mother tree, they were half sibs) (Wood et al. 1998). These are preliminary results that support the value of visual estimates of canopy density in pecans. This study was carried out on a small subset of the trees in the collection. Based on these results, it will be useful to visually estimate canopy density for the whole pecan provenance collection to fully characterize the timing and duration of leaf retention as the season ends and leaf fall progresses. This will provide valuable information for investigating the genetic control of leaf fall, by combining the phenotypic data with genotyping-by-sequencing data from each tree in the collection and conducting a genome-wide association study.

Other methods exist that can be used to measure canopy density. In some cases, they are used to monitor not just characteristics of different genotypes, but also tree and ecosystem health. For example, canopy density affects microclimate and soil conditions, which can play important roles in the entirety of the forest ecosystem (Abdollahnejad et al. 2017). Methods such as multispectral imagery and LiDAR have been used to acquire information on canopy density, species, and health attributes of canopies, but unfortunately, these technologies have limitations and require expensive equipment, highly skilled and trained operators, and sophisticated analysis (Leckie et al. 2003; Adams et al. 2011; Li et al. 2017; Yang et al. 2019). Studies using LiDAR have supplemented multispectral imagery to obtain more precise data for individual crown analysis (Leckie et al. 2003). Remote sensing provides tree height estimations, which are consistently underestimated using LiDAR alone, but the complementarity of the two technologies provides high resolution data on forest inventories. Although many systems are used to analyze facets of tree crowns and canopy density, measures or estimates of canopy density can be made visually, and a low-tech approach offers some advantages. Although cover density is a different metric to canopy density, in some instances the use of remotely sensed images is less accurate than eye-level photographs when measuring cover density (Leslie et al. 2010). Studies by Jiang et al. (2017) have shown that associations between remote sensing imaging and eye-level photography exist, but the reliability of the association decreases as canopy cover increases. It was concluded that eye-level photographs and visits to the site are indeed critical tools for evaluating canopy density. In contrast, some studies of landscape-scale canopy cover indicate that variance and bias of ocular estimates were higher compared to more labor-intense techniques such as line intersect sampling (Korhonen et al. 2006). However, measurements or estimates of canopy cover at a landscape scale may be subject to different factors compared to those of canopy density. The estimates of pecan canopy density in this study did show some bias, but the overall agreement was robust, and the analyses indicated that the image analyzed and rater data were similar.

It should be noted that this is a small study and further research is needed. In fact, all 867 trees in the collection need to be assessed for foliage loss during the autumns over multiple years. However, the preliminary results indicate that visual estimates are both accurate and reliable for assessing canopy density in pecan trees during leaf fall. The same approach might be used to assess canopy density after bud break. Instructing and training raters is an important aspect to ensure accuracy and reliability of canopy density assessments. Future studies should explore the value of using standard area diagrams of pecan canopy density to improve accuracy and reliability of rater estimates (Frampton et al. 2001; U.S. Forest Service 2022). Development and validation of a standard area diagram set to aid in estimates of canopy density, particularly in relation to leaf loss, will be useful for rating genotypes of pecan, identifying the genetics involved, and helping guide subsequent breeding selection. Other, more sophisticated approaches including multispectral imaging and LiDAR could be explored in the future.