A new model for predicting the total tree height for stems cut-to-length by harvesters in Pinus radiata plantations

Shan, Chenxi; Bi, Huiquan; Watt, Duncan; Li, Yun; Strandgard, Martin; Ghaffariyan, Mohammad Reza

doi:10.1007/s11676-019-01078-6

A new model for predicting the total tree height for stems cut-to-length by harvesters in Pinus radiata plantations

Original Paper
Open access
Published: 16 December 2019

Volume 32, pages 21–41, (2021)
Cite this article

Download PDF

You have full access to this open access article

Journal of Forestry Research Aims and scope Submit manuscript

A new model for predicting the total tree height for stems cut-to-length by harvesters in Pinus radiata plantations

Download PDF

Chenxi Shan¹,
Huiquan Bi^2,3,
Duncan Watt⁴,
Yun Li¹,
Martin Strandgard⁵ &
…
Mohammad Reza Ghaffariyan⁵

2063 Accesses
9 Citations
Explore all metrics

Abstract

A new model for predicting the total tree height for harvested stems from cut-to-length (CTL) harvester data was constructed for Pinus radiata (D.Don) following a conceptual analysis of relative stem profiles, comparisons of candidate models forms and extensive selections of predictor variables. Stem profiles of more than 3000 trees in a taper data set were each processed 6 times through simulated log cutting to generate the data required for this purpose. The CTL simulations not only mimicked but also covered the full range of cutting patterns of nearly 0.45 × 10⁶ stems harvested during both thinning and harvesting operations. The single-equation model was estimated through the multiple-equation generalized method of moments estimator to obtain efficient and consistent parameter estimates in the presence of error correlation and heteroscedasticity that were inherent to the systematic structure of the data. The predictive performances of our new model in its linear and nonlinear form were evaluated through a leave-one-tree-out cross validation process and compared against that of the only such existing model. The evaluations and comparisons were made through benchmarking statistics both globally over the entire data space and locally within specific subdivisions of the data space. These statistics indicated that the nonlinear form of our model was the best and its linear form ranked second. The prediction accuracy of our nonlinear model improved when the total log length represented more than 20% of the total tree height. The poorer performance of the existing model was partly attributed to the high degree of multicollinearity among its predictor variables, which led to highly variable and unstable parameter estimates. Our new model will facilitate and widen the utilization of harvester data far beyond the current limited use for monitoring and reporting log productions in P. radiata plantations. It will also facilitate the estimation of bark thickness and help make harvester data a potential source of taper data to reduce the intensity and cost of the conventional destructive taper sampling in the field. Although developed for P. radiata, the mathematical form of our new model will be applicable to other tree species for which CTL harvester data are routinely captured during thinning and harvesting operations.

Characterizing prediction errors of a new tree height model for cut-to-length Pinus radiata stems through the Burr Type XII distribution

Article 30 June 2023

Predicting the upper stem diameters and volume of a tropical dominant tree species

Article Open access 04 February 2022

Predicting knottiness of Scots pine stems for quality bucking

Article Open access 20 November 2019

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Over the last 40 years, cut-to-length (CTL) harvesters have been increasingly adopted and widely utilised to improve log-harvesting productivity in natural and plantation forests worldwide (e.g., Huyler and LeDoux 1999; Murphy 2003; Gerasimov et al. 2012, 2013; Strandgard et al. 2013; Olivera et al. 2016; Williams and Ackerman 2016; Lu et al. 2018). The widespread utilization has been largely driven by great technological advances in the mechanical design of harvesters and in the harvester head measurement and optimization systems over the same period (Heinimann 2007; Nordfjell et al. 2010; Uusitalo 2010; Malinen et al. 2016). Now modern harvesters equipped with a GPS receiver and a computerized harvester head have become a major source of “big data” for forest management as they constantly capture, accrue and provide a daily flow of spatially explicit and time-stamped data on log production and assortment as well as detailed diameter measurements of harvested stems of individual trees over large operational areas (Sellén 2016; Uusitalo 2017; Lu et al. 2018; Müller et al. 2019; Rossit et al. 2019). Harvester data can provide the total log length but not the total height of individual trees because the top crown section beyond the last cut does not pass through the harvester head and therefore its length cannot be measured. The lack of tree height data represents a stumbling block in the full integration of spatially explicit harvester data with conventional inventory data, remote sensing imagery and LiDAR data for the development of harvester-based inventory systems (e.g., Stendahl and Dahlin 2002; Murphy et al. 2006; Holopainen et al. 2010), for predicting attributes of individual trees, stands and forests (e.g., Rasinmäki and Melkas 2005; Holmgren et al. 2012; Söderberg 2015; Saukkola et al. 2019) and for estimating product recovery (e.g., Peuhkurinen et al. 2008; Barth and Holmgren 2013; Barth et al. 2015; Caccamo et al. 2018; Hauglin et al. 2018). Full integration will provide forest managers with a more detailed view of standing trees over an entire forest area and also allow them to predict spatially the volume and value of a certain product or a product mix that the forest can yield for optimising value recovery in management planning (Hauglin et al. 2018; Lu et al. 2018). Without full data integration, maximum value extraction from harvester data cannot be attained, preventing the transformation of big data into valuable data for forest management (see Müller et al. 2019).

For the most effective use of harvester data, a necessary first step is to estimate the total height of each harvested tree that was bucked, measured and recorded by the harvester head. This necessity has been well recognised, and several attempts to do so have been made in the estimation of logging residue biomass and in the integration of harvester data with remote sensing data in forest inventory for predicting tree and stand attributes and estimating product recoveries (Varjo 1995; Kiljunen 2002; Maltamo et al. 2010; Möller et al. 2011; Söderberg 2015; Siipilehto et al. 2016; Hauglin et al. 2018; Lu et al. 2018). So far, total tree height has been calculated by estimating the length of the unprocessed top section of individual trees using the only such existing model, that of Varjo (1995), or through an iterative search algorithm using a taper equation as demonstrated well by Lu et al. (2018) and also briefly alluded to by Hauglin et al. (2018). Another ad hoc method was described by Kiljunen (2002), but it cannot be readily applied because of its requirement for diameter measurements at specific heights that are not routinely captured and stored in a harvester database.

Although proved useful in reconstructing individual trees from harvester data (e.g., Palander et al. 2009; Vesa and Palander 2010; Siipilehto et al. 2016), the only such existing model has a rather restricted range of application, only to stems with the small end diameter underbark of the top log between 5 and 10 cm and over a minimum log length of 1.5 m as outlined by Varjo (1995) for the 3 coniferous and broadleaf species in his study. The iterative search algorithm demonstrated by Lu et al. (2018) requires a taper equation and 2 extreme quantile curves that define the search range of total tree height at any given diameter at breast height overbark (DBH) for the species at interest. In comparison to a single predictive equation, it requires a large amount of auxiliary height and diameter data, is somewhat cumbersome to implement in a computer program and also takes longer to reach the optimal estimate. Even after the optimum is reached, it may still be less accurate than the prediction from a single equation as shown by Lu et al. (2018). This approach also relies on extensive bucking simulation through a large taper data set for accuracy testing. Without doing so, the accuracy of tree height estimation cannot be ascertained as in the case of Hauglin et al. (2018).

The weaknesses of these 2 methods highlight the need for an improved and more efficient approach to predict total tree height of individual stems cut-to-length by harvesters. The recent work of Lu et al. (2018) with Pinus radiata represented an essential first step towards satisfying this need. Building upon this work, this paper presents a new model that overcomes the weaknesses of the 2 methods that are currently in use, also using plantation trees of P. radiata as an example.

Materials and methods

Notation

${\text{DBH}}$ :: Diameter at breast height overbark (cm)
${\text{DOB}}$ :: Diameter overbark (cm)
$H$ :: Total tree height from ground level to tip of the tree (m)
$H_{\text{s}}$ :: Average stump height of 0.15 m
$H_{\text{b}}$ :: Defined breast height of 1.3 m above ground level
$L$ :: Total log length, i.e., sum of lengths of logs and waste sections of a stem (m)
$L_{\text{top}}$ :: Length of the unprocessed tree top section from the last cutting point to the tip (m)
${\text{LED}}$ :: Large end diameter overbark of a log (cm)
${\text{SED}}$ :: Small end diameter overbark of a log (cm)
${\text{SEDTL}}$ :: SED of the top log (cm), the smallest SED of a cut stem
$d = ({\text{SEDTL}}/{\text{DBH}})$ :: Relative diameter that takes any value between 0 and 1
$h = \left( {L + H_{\text{s}} - H_{\text{b}} } \right)/\left( {H - {\text{H}}_{\text{b}} } \right)$ :: Relative height above breast height
$B = (1 - d)$ :: The base function
$K$ :: Variable exponent that is a function of $L$, ${\text{DBH}}$ and ${\text{SEDTL}}$
$R_{\text{l}} = (L/H)$ :: Total log length ratio
$T = ({\text{DBH}} - {\text{SEDTL}})/L$ :: Average taper over total log length

A schematic diagram of the notation was drawn to aid understanding by readers (Fig. 1).

Data

For the purpose of this study, it would be ideal if the total tree height and stump height of a large enough sample of harvested trees were measured and recorded in addition to log lengths and diameters in the field during harvesting as Lang et al. (2010) did for a small number of sample trees. However, measuring a large number of sample trees across different stand types, site, and age classes is costly and time-consuming, particularly when working with harvesters in field operations amid safety concerns. To overcome the lack of such ideal data, this study followed the intuitive approach of Lu et al. (2018) in combining taper and harvester data through simulated log-cutting to generate a data set that not only included stump height and total tree height, but also mimicked the cutting patterns of harvesters in operational thinning and harvesting.

Taper data

The taper data set contained 3251 trees sampled from P. radiata stands across the State of New South Wales (NSW) in Australia over 30 years. These taper trees were sampled from both thinned and unthinned stands over wide ranges of age, site quality, and stand conditions. The lowest measurements of both overbark and underbark diameter were taken at between 0.1 and 0.3 m above ground, then another measurement was usually taken at 0.7 m before reaching the defined breast height of 1.3 m. Measurement intervals above breast height varied between 1.5 and 3 m depending on the height of the sample trees. This data set was previously used by Bi and Long (2001), Bi et al. (2012) and Zhang et al. (2015) for developing taper equations, height–diameter functions and conversion factors for DBH measured at different breast height. It was also used by Lu et al. (2018) in combination with harvester data in log-cutting simulations. These publications provided a detailed textual and graphical description of this data set, so it was not repeated here. For the log-cutting simulation in this study, 152 trees were excluded, including 66 small trees with DBH less than 10 cm that fell below the minimum size for log-cutting and 86 trees that had underbark but not overbark taper measurements. The remaining 3099 trees in the taper data set had a range of 10–79.1 cm for DBH and 6.4–44.5 m for total tree height. For each tree, a complete stem profile from ground to the tip was constructed numerically by interpolating through its observed values of DOB using piecewise cubic Hermite interpolating polynomials (PCHIP) with an even interval of 0.1 m.

Harvester data

The harvester data came from the Tumut Management Area of the Snowy Region, Forestry Corporation of NSW (FCNSW). Covering the foothills of the Snowy Mountains, this management area has about 90,000 ha of P. radiata plantations, representing approximately 46% of FCNSW’s total P. radiata estate, and produces over 1 × 10⁶ cubic meters of sawlogs and more than 0.6 × 10⁶ cubic meters of pulp wood annually. These plantations were established with an initial stocking of 1000–1100 trees ha⁻¹ across all site classes under the stand density management regime adopted for P. radiata in NSW since the 1980s (Horne and Robinson 1988). At the present regime, either one or two thinnings are prescribed for more productive sites. The first thinning aims to bring the stocking down to 450–550 trees ha⁻¹ at around 14 years of age, and the second thinning further reduces the stocking down to 200–300 trees ha⁻¹ at around 23 years of age. For poorer sites, no thinning is carried out before the final harvest generally at the age of 30–35 years.

The harvester data contained a total of 0.502 × 10⁶ stems/trees that were harvested and processed by 20 CTL harvesters in routine thinning and clear-felling operations at 54 sites over a period of 3 years from 2012 to 2014. Each stem was cut into one or more logs that were numbered sequentially from the butt log to the top log, making up a total of 1.808 × 10⁶ logs including approximately 0.240 × 10⁶ waste sections in the data set. For each stem, stump height and stump diameter, i.e. LED of the butt log, were not available. For each log, the length, overbark volume and SED were recorded together with a product description that contained customer identification and product type. For logs without a specific customer and product type, production description was recorded as “random”. The cutting patterns were driven by the sawlog and pulpwood specifications of the local softwood industry consisting of one large 0.7 million tonne pulp and paper mill producing high quality kraft paper for both domestic and international markets and six large and small timber and wood products companies near the town of Tumut in southern NSW.

Data screening and interactive exploratory data analysis

With the aid of the statistical software R 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria) and GGobi 2.1.10, an open source visualization program for exploring high-dimensional data (see Swayne et al. 2006; Cook et al. 2007), multiple rounds of data screening and interactive exploratory data analysis were carried out through a systematic set of procedures to detect inconsistencies in data recording, erroneous or illogical records in the harvester data. Some obvious errors were corrected wherever possible, others errors or anomalies that could not be corrected were removed from the data set. To facilitate this process, DBH was calculated for each stem from the length and SED of its butt log using a system of conversion equations together with linear interpolation as in Lu et al. (2018). In addition, the merchantable height up to 5 cm DOB of trees in the taper data set was analysed in relation to their DBH to obtain an extreme nonlinear conditional quantile curve in the form of the 3-parameter Chapman–Richards function, one of the most commonly used functions for tree height and diameter models (see Huang et al. 1992; Huang 1999; Bi et al. 2000, 2012):

$${\text{MH}}_{0.995} = 1.3 + 44.35\left( {1 - e^{{ - 0.0432{\text{DBH}}}} } \right)^{1.1395} ,$$

(1)

where ${\text{MH}}_{0.995}$ represents the 0.995th nonlinear conditional quantile of merchantable height in meters for a given DBH. The 3 parameters in Eq. 1 were estimated using the R package for quantile regression, quantreg (Koenker 2017, 2018). This extreme quantile curve depicted the relationship between the maximum attainable merchantable height and DBH for P. radiata in NSW (Fig. 2). Through this equation, a maximum attainable total log length ($L_{\text{max }}$) for any given DBH was defined as ${\text{MH}}_{q = 0.995}$ minus 0.15 m, the average stump height. This relationship between $L_{{{ \hbox{max} } }}$ and DBH was then used as an upper frontier in the detection of anomalous stems.

After the completion of data screening and filtering, there were 0.448 × 10⁶ stems and 1.581 × 10⁶ logs remaining in the final data set. The DBH of these stems ranged from 10 to 82 cm with an average of 33 cm and their total log length $L$ varied within the range of 2.2–40.8 m with an average of 16.6 m (Fig. 2). The minimum $L$ of 2.2 m represented cases where only one short log was cut from a stem. The SED of the 1.581 × 10⁶ logs varied from 3 to 80 cm, where values larger than 60 cm were found only in 0.3% of the logs. The individual log length ranged from 2.2 to 26.4 m, but lengths longer than 6.4 and 12 m represented less than 0.50% and less than 0.01% of the logs, respectively (Fig. 3). Because the logs were numbered sequentially from the butt log up, the number of logs decreased as log number (i.e., the relative position of logs) increased. Concomitantly, the SED and length of logs varied both within and across log numbers, with the distribution of log length peaking around a number of specified log lengths (Fig. 4). This final data set was used for deriving cutting patterns for the simulated bucking of the stem profiles of taper trees.

Cut-to-length (CTL) simulations and data generation

Because of its larger volume, the harvester data showed an extremely wide spread in the value of ${\text{SEDTL}}$ at any given DBH, which meant that there could be multiple alternative cutting patterns for a single stem, resulting in many ${\text{SEDTL}}$ values and corresponding total log lengths. To have a more comprehensive representation of such variation in the cutting simulations, ${\text{SEDTL}}$ was plotted against DBH for the 0.448 × 10⁶ stems, and the data cloud was visually inspected through the interactive and dynamic graphical display of GGobi. Then 6 nonparametric conditional quantile curves at $\uptau$ = 0.01, 0.25, 0.50, 0.75, 0.90 and 0.99 were drawn through the data cloud using the QUANTREG procedure in SAS 9.4 (SAS Institute, Cary, NC, USA) to help discern the patterns of conditional variation in ${\text{SEDTL}}$ as DBH increased. Based on the patterns of the 6 nonparametric quantile curves, a parametric model was specified to derive the corresponding parametric conditional quantile curves:

$${\text{SEDTL}}_{\tau } = a_{\tau } + b_{\tau } {\text{DBH}}^{{c_{\tau } }} ,$$

(2)

where ${\text{SEDTL}}_{\tau }$ is the $\tau$th conditional quantile for a given DBH, $a_{\tau }$, $b_{\tau }$ and $c_{\tau }$ are the corresponding quantile-dependent parameters. Necessary constraints were placed upon parameter $a_{\tau }$ during quantile regression to prevent the fitted quantile curves from crossing each other in the close neighbourhood of the minimum DBH of 10 cm. In addition, parameter $c_{\tau }$ was set to 1 for the 90th and 99th quantiles as the relationship between ${\text{SEDTL}}_{\tau }$ and DBH became linear nearing the boundary of the data cloud (Fig. 5). For a taper tree with a given DBH in the log cutting simulations, these 6 parametric quantile curves provided 6 consecutive and increasingly large initial $\text{SEDTL}$ values that not only well covered the range of ${\text{SEDTL}}$ variations contained in the harvester data, but also led to 6 systematic and interrelated cutting patterns with 6 different total log lengths ranging from the longest to the shortest that the tree could possibly yield.

The log cutting simulation for each of the 3099 taper trees looped through 6 values of ${\text{SEDTL}}_{\tau }$ starting from $\tau = 0.01$ and ending at $\tau = 0.99$. Within each loop, the simulation went through the following steps: (1) obtaining a value of ${\text{SEDTL}}_{\tau }$ from its DBH and the $\tau$ th conditional quantile curve as shown in Fig. 5; (2) deriving the corresponding total log length $L$ by searching through its stem profile from an average stump height of 0.15 m upwards to the height where the DOB was equal to ${\text{SEDTL}}_{\tau }$; (3) from among the 0.448 × 10⁶ stems contained in the harvester data set selecting the stem that was most similar to the taper tree by using a similarity index SI as described below; (4) using the cutting pattern of the most similar stem to locate the sequential cross-cut points on the taper tree’s stem profile with some adjustment if necessary; (5) taking the DOB and height at each cutting point on the numerically interpolated stem profile; (6) exporting the values of stump height, log number, log length, LED and SED values, and $L_{\text{top}}$, the length of the tree top section from the last cutting point to the tip to a storage data set. The computer program previously written by Lu et al. (2018) in C# was modified to carry out all steps of the CTL simulations. The final data set contained 6 different but correlated values of ${\text{SEDTL}}$, $L$ and $L_{{{\text{top}} }}$ for each taper tree with the same DBH, total tree height $H$ and stump height $H_{{{\text{s}} }}$.

The similarity index (SI) in step 3 was calculated as the distance between the candidate stem and the subject taper tree in a 3-dimentional space as follows:

$${\text{SI}} = \sqrt {\left( {{\text{DBH}}_{\text{H}} - {\text{DBH}}_{\text{T}} } \right)^{2} + \left( {L_{\text{H}} - L_{\text{T}} } \right)^{2} + \left( {{\text{SEDTL}}_{\text{H}} - {\text{SEDTL}}_{\text{T}} } \right)^{2} } ,$$

(3)

where variables with the subscript $H$ and $T$ denote the attributes of candidate stem from the harvester data and that of the subject taper tree, respectively. The value of SI is zero when the candidate stem and the subject taper tree are identical in the 3 dimensions. The SI values for the 3099 × 6 most similar stems selected in the log cutting simulations had an L-shaped distribution over a range from 0 to 7.9, with a median of 0.25, a mean of 0.40 and an upper quartile of 0.47. About 91% of the SI values were between 0 and 1 (Fig. 6).

Model derivation

In deriving a model specification for predicting total tree height $H$, L was considered as a fraction of $H$ as shown in the “Notation” section. In doing so, any given height along a tree stem above breast height was converted to relative height $h$ and thus h varied from 0 at 1.3 m to 1 at the tip of the tree. At the same time, the DOB at a relative height $h$ was converted to relative diameter $d$, which varied from 1 at 1.3 m to 0 at the tip of the tree. Considering the relative stem profile of the part of the stem above breast height and taking reference of the idea and approach of Bi (2000) in the construction of the trigonometric variable-form taper model, h was expressed as a power function of $d$ and other variables:

$$h = B^{K} ,$$

(4)

where the base $B$ is a monotonic function of relative diameter d, varying from 0 to 1 as d changes from 1 to 0, and $K$ is a variable exponent. Nine candidate base functions were constructed, including both linear and trigonometric functions, to approximate the geometrical shape of the relative stem profile through Eq. 4. Because there was no clear geometrical form, biological reason or statistical theory to specify a particular equation form for the variable exponent, $K$, an empirical function for $K$ had to be derived through exploratory model building. Variables including DBH, L, d, average stem taper $T$, and their various transformations and combinations were taken as candidate predictors for K in a linear function. For each of the 9 candidate base functions, Eq. 4 was linearized by taking logarithmic transformation of both sides and then ln $h$ was regressed against all candidate predictors, each multiplied by ln B, but without the intercept term. The best model with either 4 or 5 predictors for $K$ was determined through stepwise regression in the PROC REG procedure of SAS 9.4 with MAXR (maximum R² improvement) as the method of variable selection.

The selected models for all 9 candidate base functions were then further compared with each other in terms of the goodness-of-fit statistics and also subjectively evaluated in terms of their simplicity, elegance, interpretability and applicability. Following this exploratory model building process, the linear model for lnh was specified as follows:

$$\ln h = \left( {a_{1} + a_{2} L + a_{3} \sqrt T + a_{4} \sqrt {d^{3} } + a_{5} \sqrt {{\text{DBH}}^{3} } } \right){ \ln }(1 - d),$$

(5)

where a₁ to a₅ are parameters. This linear model was back-transformed from natural logarithm to derive the nonlinear model form for h:

$$h = (1 - d)^{{a_{1} + a_{2} L + a_{3} \sqrt T + a_{4} \sqrt {d^{3} } + a_{5} \sqrt {{\text{DBH}}^{3} } }}$$

(6)

Because $h = \left( {L + H_{\text{s}} - H_{\text{b}} } \right)/\left( {H - H_{\text{b}} } \right)$ as shown in the “Notation” section, representing the height relative to the total tree height $H$ minus breast height $H_{\text{b}}$, a model form for predicting $H$ directly was obtained from Eq. 6 as follows:

$$H = \frac{{L + H_{\text{s}} - H_{\text{b}} }}{{(1 - d)^{{a_{1} + a_{2} L + a_{3} \sqrt T + a_{4} \sqrt {d^{3} } + a_{5} \sqrt {{\text{DBH}}^{3} } }} }} + H_{b}$$

(7)

Although the deterministic structures of Eqs. 5–7 are mathematically equivalent, as statistical models for predicting total tree height, they differ. The differences in their dependent variables and error structures (either multiplicative or additive) could potentially have some impact on the prediction accuracy for $H$.

Parameter estimation of linear and nonlinear models

Equations 5 and 7 could be estimated through linear and nonlinear least squares regression with the implicit assumption that the error term was an independent and identically distributed random variable. However, this assumption was unrealistic as revealed by the diagnostic analyses of residuals from the regressions. First, the CTL simulations processed each taper tree 6 times using 6 values of $SEDTL_{\tau }$ obtained from the 6 conditional quantile curves at $\tau = 0.01$, 0.25, 0.50, 0.75, 0.90 and 0.99 (Eq. 2, Fig. 5) and generated 6 sets of values of all variables appearing in Eqs. 5 and 7. Consequently, there was an inherent correlation among the residuals from the same tree. Second, the 6 values of $L$ for each taper tree represented different proportions of its total tree height $H$. As $L$ decreased and ${\text{SEDTL}}_{\tau }$ increased with $\tau ,$ the magnitude of residual variation became increasingly larger, providing a clear indication of the presence of residual heteroskedasticity, particularly for the nonlinear model.

Although the least squares estimators remain unbiased in the presence of residual correlation and heteroskedasticity, they are no longer efficient and their estimates of parameter variances are biased (Gujarati and Porter 2010; Greene 2012). To take both residual correlation and heteroskedasticity into consideration, Eqs. 5 and 7 were not estimated as single equations, but each as a system of 6 equations. For the sake of parsimony, the model specification for estimating Eq. 7 only was given below:

$$\begin{array}{*{20}c} {\begin{array}{*{20}c} {H_{1} = f(X, \theta ) + \varepsilon_{1} } \\ {H_{2} = f(X, \theta ) + \varepsilon_{2} } \\ \end{array} } \\ \vdots \\ {H_{6} = f(X, \theta ) + \varepsilon_{6,} } \\ \end{array}$$

(8)

where $H_{1} \;{\text{to}}\;H_{6}$ represent the same $H$ as in Eq. 7, but corresponding to data generated from the 6 cutting patterns that were based on the 6 conditional quantile curves at $\tau = 0.01$, 0.25, 0.50, 0.75, 0.90 and 0.99 (Fig. 5), $f(X,\theta )$ stands for the nonlinear function in Eq. 7 with $X$ representing a vector of its predictor variables and $\theta$ a vector of its 5 parameters, $\varepsilon_{1}$ to $\varepsilon_{6}$ are the corresponding error terms. The 6 error terms can be expressed in the matrix algebra notation as follows:

$$\varvec{\varepsilon}= \left[ {\varepsilon_{1} ,\varepsilon_{2} ,\varepsilon_{3} ,\varepsilon_{4} ,\varepsilon_{5} ,\varepsilon_{6} } \right]^{\prime }$$

(9)

The properties of $\varvec{\varepsilon}$ are

$$E(\varvec{\varepsilon}) = 0.$$

(10)

and

$${\text{Cov}}(\varvec{\varepsilon}) = E(\varvec{\varepsilon \varepsilon }^{\prime } ) = \left[ {\begin{array}{*{20}l} {\sigma_{11} } \hfill & {\sigma_{12} } \hfill & {\sigma_{13} } \hfill & {\sigma_{14} } \hfill & {\sigma_{15} } \hfill & {\sigma_{16} } \hfill \\ {\sigma_{21} } \hfill & {\sigma_{22} } \hfill & {\sigma_{23} } \hfill & {\sigma_{24} } \hfill & {\sigma_{25} } \hfill & {\sigma_{26} } \hfill \\ {\sigma_{31} } \hfill & {\sigma_{32} } \hfill & {\sigma_{33} } \hfill & {\sigma_{34} } \hfill & {\sigma_{35} } \hfill & {\sigma_{36} } \hfill \\ {\sigma_{41} } \hfill & {\sigma_{42} } \hfill & {\sigma_{43} } \hfill & {\sigma_{44} } \hfill & {\sigma_{45} } \hfill & {\sigma_{46} } \hfill \\ {\sigma_{51} } \hfill & {\sigma_{52} } \hfill & {\sigma_{53} } \hfill & {\sigma_{54} } \hfill & {\sigma_{55} } \hfill & {\sigma_{56} } \hfill \\ {\sigma_{61} } \hfill & {\sigma_{62} } \hfill & {\sigma_{63} } \hfill & {\sigma_{64} } \hfill & {\sigma_{65} } \hfill & {\sigma_{66} } \hfill \\ \end{array} } \right] \otimes I_{N} = \mathop \sum \nolimits \otimes I_{N} ,$$

(11)

where $E(\varvec{\varepsilon})$ and ${\text{Cov(}}\varvec{\varepsilon})$ denote the expectation and covariance of $\varvec{\varepsilon}$, $\sigma_{ii}$ represents the variance of $\varepsilon_{i}$ ($i$ = 1,…, 6), $\sigma_{ij}$ represents the covariance between the error term of the $i$th and the $j$th equation, $(j = 1, \ldots ,6), \otimes$ denotes the Kronecker product, $N = 3099,$ denotes the number of observations and $I_{N}$ is an identity matrix of order $N$. The system of Eq. 8 was fitted to the data generated from the CTL simulations using the generalised method of moments (GMM) through the MODEL procedure of SAS/ETS with the Marquart method of minimization. The multiple-equation GMM estimator produces efficient parameter estimates under heteroscedastic conditions without any specification of the nature of the heteroscedasticity (Hayashi 2000; Greene 2012). This approach turned the residual correlation in the single-equation regression (7) into the cross-equation error correlation in the system of Eq. 8, which was then characterized by the variance and covariance matrix $\mathop \sum \nolimits$ in Eq. 11 and taken into account in parameter estimation. At the same time, it also avoided the difficulty and complexity in deriving an accurate residual skedastic function conditional upon DBH, $L$, ${\text{SEDTL}}$ across the 6 log cutting patterns to serve as a weighting function within the frame work of generalized least squares estimation of a single equation. As the same parameter vector $\theta$ was shared across the 6 system equations, this approach effectively estimated a single equation model in Eq. 7.

Evaluating and comparing prediction accuracy

To evaluate and compare their predictive performances of Eqs. 5 and 7, a leave-one-tree-out cross-validation approach was adopted to obtain prediction errors from and for trees that were independent of the model building process. In doing so, the system of 6 equations specified for each single equation model was fitted 3099 times. Each time, 6 observations from a single tree were left out from the fitting process. Then the parameter estimates based on data from the remaining 3098 trees were used to compute the corresponding 6 prediction errors for the left-out tree as follows:

$$\hat{\varepsilon }_{i} = H - \hat{H}_{i} ,\quad i = 1, \ldots , 6,$$

(12)

where $\hat{H}_{i}$ represents the predicted value of $H$ for the $i$th system equation (i.e., $i$th cutting pattern) and $\hat{\varepsilon }_{i}$ is its prediction error (see Eq. 8). Upon completion, the repeated fitting and testing process generated 3099 × 6 prediction errors for each model for the evaluation and comparison of prediction accuracy. This leave-one-tree-out approach was based on the leave-one-out cross-validation method for model selection that was originally introduced by Stone (1974), Allen (1974) and Geisser (1975), but modified for the correlation structure of the data similarly to the cross-validation strategies reviewed by Roberts et al. (2017) for structured ecological data.

The linear and nonlinear models in Eqs. 5 and 7 were further compared with the model of Varjo (1995) in its original linear form:

$$\ln L_{\text{top}} = a_{1} + a_{2} L + a_{3} \ln L + a_{4} \ln {\text{DBH}} + a_{5} {\text{SEDTL}} + a_{6} \ln {\text{SEDTL}}$$

(13)

and also in its nonlinear form after back-transformation from log scale:

$$H = H_{\text{S}} + L + L^{{a_{3} }} {\text{DBH}}^{{a_{4} }} {\text{SEDTL}}^{{a_{6} }} e^{{a_{1} + a_{2} L + a_{5} {\text{SEDTL}}}}$$

(14)

Parameters of these 2 single-equation models were also estimated through the system of equations approach as described above. Then prediction errors for each model were generated through the repeated model fitting and testing process following the leave-one-tree-out cross validation approach as previously described.

The prediction accuracy of our model was evaluated and compared with that of Varjo’s in both their linear and nonlinear forms graphically and through benchmarking statistics. Scatter plots of the observed values of $H$ against their predicted values with a line of unity slope passing through the origin were evaluated together with prediction error distributions. The benchmarking statistics included the mean error of prediction (MEP), the relative mean error of prediction (RMEP), the mean absolute error of prediction (MAEP), the mean squared error of prediction (MSEP), and the prediction coefficient of determination ($R_{\text{p}}^{2}$). These statistics have been commonly used in evaluating the predictive performance of forest models as they assess the size, direction and dispersion of the prediction error as reviewed by Huang et al. (2003). In particular, the MSEP is the measure of prediction accuracy commonly used in the statistical literature since it incorporates both the variance of prediction error and the bias of prediction (Wackerly et al. 1996). In addition to these statistics, skewness and kurtosis (calculated as the excess kurtosis, which is 3 less than the standardized fourth central moment) were also obtained for prediction error distributions.

These benchmarking statistics were calculated both globally over the entire data space and locally over specific subspaces of data. A natural example of such subspace division lay in the data generated by the 6 cutting patterns and represented by the 6 system equations. For the $i$th system equation, the 4 benchmarking statistics were calculated as follows:

$${\text{MEP}}_{i} = \mathop \sum \limits_{j}^{n} \hat{\varepsilon }_{i} /n$$

(15)

$${\text{RMEP}}_{i} = 100\left[ {\mathop \sum \limits_{j}^{n} \left( {\hat{\varepsilon }_{i} /H} \right)} \right]/n$$

(16)

$${\text{MAEP}}_{i} = \mathop \sum \limits_{j}^{n} \left| {\hat{\varepsilon }_{i} } \right|/n$$

(17)

$${\text{MSEP}}_{i} = \mathop \sum \limits_{j}^{n} \hat{\varepsilon }_{i}^{2} /n$$

(18)

$$R_{{{\text{p}}i}}^{2} = 1 - \left[ { \mathop \sum \limits_{j}^{n} \hat{\varepsilon }_{i}^{2} /\mathop \sum \limits_{j}^{n} \left( {H_{j} - \bar{H}} \right)^{2} } \right],$$

(19)

where $i = 1, \ldots , 6$, $n = 3099,$ i.e., the total number of taper trees, $j$ indicates the $j$th tree, $H_{j}$ is its total tree height and $\bar{H}$ is the average height of all $n$ trees. As the 6 cutting patterns represented by the 6 system equations did not have an equal frequency of occurrence, an overall weighted average mean squared error of prediction (WMSEP) was calculated as follows:

$${\text{WMSEP}} = \mathop \sum \limits_{i = 1}^{6} w_{i} {\text{MSEP}}_{i} ,$$

(20)

where $w_{i} =$ 0.130, 0.245, 0.250, 0.200, 0.120, 0.055 for $i = 1, \ldots , 6$. These weights were the approximate proportions that the 6 quantiles, $\tau = 0.01$, 0.25, 0.50, 0.75, 0.90 and 0.99, represented over the range from 0 to 1, which were determined by evenly partitioning the distance between any 2 adjacent quantiles and allocating one-half to each quantile. Although analogous to the coefficient of determination R², which ranges from 0 to 1, $R_{\text{p}}^{2}$ can range from − ∞ to 1 (Nash and Sutcliffe 1970). When $R_{\text{p}}^{2} < 0$, the observed mean is a better predictor than the model as the variance of prediction error is larger than the variance of the observed data.

The 6 cutting patterns in the log cutting simulations also generated, as intended, a wide range of $d$ values from 0.07 to 0.97, covering 90% of its theoretically defined interval between 0 and 1. The range of $d$ was therefore divided into 10 even intervals with a width of 0.1, except for the first and last interval, for a further local evaluation and comparison of prediction accuracy. In addition, 3 other subdivisions of data were assessed for the same purpose. First, the height and diameter data of 3099 trees were divided into 6 size classes according to their DBH and $H$. The 0.50th nonlinear height–diameter quantile curve in the form of the Chapman–Richards equation was obtained by using the R package, quantreg. This median curve divided the 3099 trees into 2 halves, i.e., relative taller and shorter trees for a given DBH (Fig. 7). Then the trees in each half were further divided into 3 DBH classes using 30 and 50 cm as points of division, resulting in 6 size classes in total. Second, a ratio of total log length ratio ($R_{l}$) was calculated by dividing $L$ with H for each tree through the 3099 × 6 cutting simulations. Then the entire range of $R_{l}$ was divided into 9 even intervals with a width of 0.1. Third, the data were divided into 9 groups according to the total number of logs cut from each stem which ranged from 1 to 9 or more.

Results

The variance and covariance matrix $\mathop \sum \nolimits$ as exemplified in Eq. 11 and used in the multiple-equation GMM parameter estimation was

$$\begin{aligned} & \left[ {\begin{array}{*{20}l} {0.0012} \hfill & {0.0014} \hfill & {0.0014} \hfill & {0.0014} \hfill & {0.0015} \hfill & {0.0019} \hfill \\ {0.0014} \hfill & {0.0033} \hfill & {0.0037} \hfill & {0.0042} \hfill & {0.0047} \hfill & {0.0051} \hfill \\ {0.0014} \hfill & {0.0037} \hfill & {0.0051} \hfill & {0.0064} \hfill & {0.0075} \hfill & {0.0077} \hfill \\ {0.0014} \hfill & {0.0042} \hfill & {0.0064} \hfill & {0.0125} \hfill & {0.0161} \hfill & {0.0157} \hfill \\ {0.0015} \hfill & {0.0047} \hfill & {0.0075} \hfill & {0.0161} \hfill & {0.0267} \hfill & {0.0265} \hfill \\ {0.0019} \hfill & {0.0051} \hfill & {0.0077} \hfill & {0.0157} \hfill & {0.0265} \hfill & {0.0367} \hfill \\ \end{array} } \right]\;{\text{and}} \\ & \left[ {\begin{array}{*{20}l} {0.41} \hfill & {0.54} \hfill & {0.57} \hfill & {0.61 } \hfill & {0.61} \hfill & {0.68} \hfill \\ {0.54} \hfill & {1.44} \hfill & {1.65} \hfill & {1.93} \hfill & {2.13} \hfill & {2.07} \hfill \\ {0.57} \hfill & {1.65} \hfill & {2.52} \hfill & {3.17} \hfill & {3.69} \hfill & {3.50} \hfill \\ {0.61} \hfill & {1.93} \hfill & {3.17} \hfill & {6.10} \hfill & {7.66} \hfill & {6.97} \hfill \\ {0.61} \hfill & {2.13} \hfill & {3.69} \hfill & {7.66} \hfill & {12.32} \hfill & {11.62} \hfill \\ {0.68} \hfill & {2.07} \hfill & {3.50} \hfill & {6.97} \hfill & {11.62} \hfill & {15.98} \hfill \\ \end{array} } \right] \\ \end{aligned}$$

for our model in its linear and nonlinear form respectively, and

$$\begin{aligned} & \left[ {\begin{array}{*{20}l} {0.0748} \hfill & {0.0429} \hfill & {0.0340} \hfill & {0.0300} \hfill & {0.0295} \hfill & {0.0249} \hfill \\ {0.0429} \hfill & {0.0581} \hfill & {0.0519} \hfill & {0.0414} \hfill & {0.0333} \hfill & {0.0299} \hfill \\ {0.0340} \hfill & {0.0519} \hfill & {0.0534} \hfill & {0.0450} \hfill & {0.0376} \hfill & {0.0341} \hfill \\ {0.0300} \hfill & {0.0414} \hfill & {0.0450} \hfill & {0.0507} \hfill & {0.0488} \hfill & {0.0441} \hfill \\ {0.0295} \hfill & {0.0333} \hfill & {0.0376} \hfill & {0.0488} \hfill & {0.0567} \hfill & {0.0511} \hfill \\ {0.0249} \hfill & {0.0299} \hfill & {0.0341} \hfill & {0.0441} \hfill & {0.0511} \hfill & {0.0493} \hfill \\ \end{array} } \right]\;{\text{and}} \\ & \left[ {\begin{array}{*{20}l} {0.92} \hfill & {0.85} \hfill & {0.61} \hfill & {0.75} \hfill & {1.34} \hfill & {1.41} \hfill \\ {0.85} \hfill & {1.85} \hfill & {2.16} \hfill & {2.77} \hfill & {3.47} \hfill & {3.55} \hfill \\ {0.61} \hfill & {2.16} \hfill & {3.41} \hfill & {4.48} \hfill & {5.21} \hfill & {5.31} \hfill \\ {0.75} \hfill & {2.77} \hfill & {4.48} \hfill & {7.26} \hfill & {8.80} \hfill & {9.21} \hfill \\ {1.34} \hfill & {3.47} \hfill & {5.21} \hfill & {8.80} \hfill & {12.32} \hfill & {13.11} \hfill \\ {1.41} \hfill & {3.55} \hfill & {5.31} \hfill & {9.21} \hfill & {13.11} \hfill & {15.05} \hfill \\ \end{array} } \right] \\ \end{aligned}$$

for Varjo’s model in its linear and nonlinear form. The diagonal and off-diagonal elements of these matrices showed that error heteroskedasticity and correlation across the 6 cutting patterns were stronger for the nonlinear than the linear models. The values of all elements of $\mathop \sum \nolimits$ for our nonlinear model in Eq. 7 were smaller than that for Varjo’s nonlinear model in Eq. 14, except for $\sigma_{66}$. As expected, all elements of the matrices for the 2 linear models were much smaller than the corresponding elements of the matrices for the 2 nonlinear models due to logarithmic transformation. Although the dependent variable of our new model in its linear form in Eq. 5 was $\ln h$ (log-transformed relative height), some degree of heteroskedasticity was still evident as shown by the trend along the diagonal elements of the matrix. The parameters of our new model estimated in its nonlinear form in Eq. 7 differed slightly from that estimated in its linear form in Eq. 5. In comparison, such comparative differences were much larger for Varjo’s model (Table 1). The linear and nonlinear estimates of 3 parameters, $a_{2} ,$ $a_{3}$ and $a_{5}$ in Eqs. 13 and 14 even had opposite signs. For both models, the parameters estimated through their nonlinear forms had appreciably smaller standard errors. The $R^{2}$ values, calculated according to Eq. 19 but using the observed and estimated values of H of all trees involved in model fitting, were slightly higher for our model in both linear and nonlinear forms than for Varjo’s.

Table 1 Parameter estimates and their standard errors in parentheses for the linear and nonlinear forms of our new model (Eqs. 5, 7) and Varjo’s model (Eqs. 13, 14) that are coded as NML, NMN, VML and VMN respectively in the first column of the table

Full size table

The scatter plots of observed and predicted values of $H$ and the corresponding distributions of prediction error generated from the leave-one-tree-out cross validation process showed little overall bias in the prediction of $H$ for the 2 models in their linear as well as nonlinear forms (Fig. 8). The smallest MEP of 0.06 m was observed for our model in its nonlinear form, as compared with the MEP of 0.08 m for Varjo’s model also in its nonlinear form. The MEP of our model in its linear form was also smaller than that of Varjo’s model in its linear form. Our model in its nonlinear form also had the smallest value of MSEP of 6.47, as compared with its linear form and also with Varjo’s model in both forms. For both models, the nonlinear form had smaller values of MEP and MSEP than the linear form. The prediction error distributions had little skewness but pronounced kurtosis for the 2 models in both forms, but they were more leptokurtic for our model than Varjo’s (Fig. 8). These results were based on prediction errors for all 6 cutting patterns derived from using the 6 quantile curves that related SEDTL to DBH (Fig. 5). It was apparent from Fig. 8 that prediction errors for the cutting pattern derived from the 6th quantile curve at $\tau$ = 0.99 had a much wider spread than all other cutting patterns. When the practically rare cases represented by this cutting pattern were excluded, MSEP became much smaller and $R_{\text{p}}^{2}$ higher for the 2 models in both forms (Fig. 8). Our model in its nonlinear form still had the smallest MEP and MSEP among the 2 models and their different forms.

When prediction accuracy was evaluated across the 6 cutting patterns individually, our model in its nonlinear form had smaller values of MSEP than its linear form with the exception of the 6th cutting pattern. However, the differences were small, representing a reduction mostly less than 2.6%. The weighted average of the 6 MSEP values showed even a smaller reduction of 1.2% (Table 2). Varjo’s model in its nonlinear form also had smaller values of MSEP than its linear form except for the first cutting pattern as its MSEP became twice as larger, pointing to a marked deterioration in predictive performance. No matter in what form our model was estimated, it had smaller MSEP values than Varjo’s model in both forms across the first 4 cutting patterns, with the first cutting pattern having the largest reduction of more than 50%. For the last 2 cutting patterns, the MSEP values of our model were either similar to or slightly larger than that of Varjo’s. The weighted average MSEP of our model represented about 20% reduction of that of Varjo’s model in its linear form and about 10–11% reduction in its nonlinear form (Table 2). For each model, the values of ${\text{MSEP}}_{1} - {\text{MSEP}}_{6}$ in Table 2 corresponded almost exactly to the error variances $\sigma_{11} - \sigma_{66}$, i.e., the diagonal elements of its variance and covariance matrix. In addition to MSEP, the other 4 benchmarking statistics also indicated that our model outperformed Varjo’s in both linear and nonlinear forms. In the nonlinear form in particular, our model had smaller values of MEP, RMEP and MAEP higher values of $R_{\text{p}}^{2}$ across the 6 cutting patterns except for the last (Table 3).

Table 2 MSEP for each of the 6 cutting patterns represented by the system of 6 equations (8) and the weighted average (WMSEP) calculated according to Eq. 20 for the linear and nonlinear forms of our new model (Eqs. 5, 7) and Varjo’s model (Eqs. 13, 14) that are coded as NML, NMN, VML and VMN, respectively, in the first column of the table

Full size table

Table 3 Values of MEP, RMEP, MAEP and ${\text{R}}_{\text{p}}^{2}$ for the nonlinear form of our new model (NMN) in Eq. 7 and that of Varjo’s model (VMN) in Eq. 14 across the 6 cutting patterns derived from the $\tau$ th conditional quantile curves in Fig. 5

Full size table

As the nonlinear form outperformed the linear form for both models, the comparative performances of the 2 models in further local evaluations of prediction accuracy were only reported for their nonlinear forms for the sake of parsimony. Across the 10 intervals of $d$, our model had smaller values of MEP, MAEP and MSEP and higher $R_{\text{p}}^{2}$ values than Varjo’s over the first 7 intervals where $d \le 0.80$ (Table 4). In the interval $0.80 < d \le 0.90$, the predictive performances of the 2 models were about the same. When $0.90 < d \le 0.95$, our model did not perform as well as Varjo’s. For $d$ greater than 0.95, both models did not perform well as indicated by their large values of MEP, MAEP and MSEP as well as the small values of $R_{\text{p}}^{2}$ (Table 4). Over the 6 tree size classes delineated in Fig. 7, our model had smaller values of MEP, MAEP and MSEP and higher $R_{\text{p}}^{2}$ values except for size class III where the comparative performances were reversed (Fig. 9). For both models, the values of MEP were positive over the first 3 size classes and negative over the last 3 size classes. This systematic pattern of bias represented a slight underestimation of total tree height for relatively slender trees and a slight overestimation for relatively squatter trees. However, the bias was relatively small, mostly within ± 1.5 m and within ± 5% of observed tree height on average. As with results in Fig. 8, values of MEP, MAEP and MSEP became much smaller and $R_{\text{p}}^{2}$ was higher when these benchmarking statistics were based on prediction errors for the first 5 cutting patterns excluding the practically rare cases represented by the 6th cutting pattern (Fig. 9).

Table 4 Benchmarking statistics of prediction accuracy based on prediction errors for all 6 cutting patterns: MEP, MAEP, MSEP and $R_{\text{p}}^{2}$ across the 10 relative diameter classes for the nonlinear form of our new model (NMN) in Eq. 7 and that of Varjo’s model (VMN) in Eq. 14 presented in the first and second row in each row group, respectively

Full size table

Across the 9 intervals of total log length ratio $R_{l}$, our model generally outperformed Varjo’s as indicated by the benchmarking statistics also based on prediction errors for the first 5 cutting patterns (Table 5). When $R_{\text{l}} \le 0.2,$ i.e., the total log length was less than or equal to 20% of total tree height, the values of MEP, MAEP and MSEP were much larger than that for all other intervals for both models and their $R_{\text{p}}^{2}$ was negative. In this first interval, our model did not perform as well as Varjo’s. When $0.20 < R_{\text{l}} \le 0.30$ and $0.30 < R_{\text{l}} \le 0.40$, our model had a similar or slightly better performance. For the remaining 6 $R_{\text{l}}$ intervals, our model clearly outperformed Varjo’s based on the values of MAEP and MSEP. For both models, the bias was small and practically negligible as their MEP values were well within ± 0.8 m. Among the 9 groups of the number of logs cut from a stem, MEP varied between − 0.12 and 0.13 m for our model and between − 0.51 and 0.85 for Varjo’s model (Table 6). The values of MAEP and MSEP of our model were much smaller, and the values of $R_{\text{p}}^{2}$ were higher for our model than Varjo’s. The comparative advantage of our model increased as the number of logs cut from a stem increased.

Table 5 Benchmarking statistics of prediction accuracy based on prediction errors for the first 5 cutting patterns: MEP, MAEP, MSEP and $R_{\text{p}}^{2}$ across the 10 total log length ratio ($R_{\text{l}}$) classes for the nonlinear form of our new model (NMN) in Eq. 7 and that of Varjo’s model (VMN) in Eq. 14 presented in the first and second row in each row group, respectively

Full size table

Table 6 Benchmarking statistics of prediction accuracy based on prediction errors for the first 5 cutting patterns: MEP, MAEP, MSEP and $R_{\text{p}}^{2}$ across the 9 sequential log number categories for the nonlinear form of our new model (NMN) in Eq. 7 and that of Varjo’s model (VMN) in Eq. 14 presented in the first and second row in each row group, respectively

Full size table

Discussion

To our knowledge, this study represents the first rigorous attempt at developing a model for predicting total tree height from CTL harvester data. Based on the global benchmarking statistics in Table 2, the nonlinear form of our model in Eq. 7 ranked first and its linear form in Eq. 5 second, while the nonlinear form of Varjo’s model in Eq. 14 ranked third and its linear form in Eq. 13 last. The superior performance of our nonlinear model was also reflected by the values of the diagonal elements, $\sigma_{11} - \sigma_{66}$, of the variance and covariance matrices in the results section. Our nonlinear model also proved to be far superior than the iterative search algorithm using a taper equation, an existing ad hoc approach demonstrated by Lu et al. (2018) and used by Hauglin et al. (2018). The detailed comparative results formed part of the first author’s postgraduate work but were not presented in the present paper for the sake of parsimony and because the iterative search algorithm was already found by Lu et al. (2018) to be inferior to Varjo’s model in its original linear form in Eq. 14, the worst performing among the 4 equations reported in this study.

Although the best performer globally, our nonlinear model in Eq. 7 did not perform as well as Varjo’s nonlinear model in Eq. 14 in certain local areas of the data space. When $d > 0.90$, our model had larger MSEP and lower $R_{\text{p}}^{2}$ than Varjo’s (Table 4). Where total log length represented less than or equal to 20% of the total tree height, Eq. 7 did not perform as well as the nonlinear form of Varjo’s model. However, the cases with total log length ratio $R_{\text{l}} \le 0.20$ only accounted for about 2% of the total number of simulated log cuttings (Table 5). Among the 0.448 × 10⁶ stems contained in the screened and filtered harvester data set, such cases accounted for a small proportion of 4.9% approximately. This approximation was based on the values of $R_{1}$ calculated for individual stems using their predicted total tree height from our nonlinear model in Eq. 7. A posterior exploratory analysis showed that stems with $R_{\text{l}} \le 0.20$ were trees with DBH from 10.2 to 81.8 cm, with a mean of 41.0 cm, an upper and lower quartile of 33.8 and 47.5 cm, and a 2.5th and 97.5th percentile of 23.1 and 63.6 cm, respectively. Besides stems with $R_{\text{l}} \le 0.20$, another subspace of data where Eq. 7 did not perform as well as the nonlinear form of Varjo’s model was the larger slenderer trees with DBH greater than 50 cm and $H$ above the median height–diameter curve as delineated in Fig. 7. These trees had a height and diameter ratio (HDR, i.e., $H/{\text{DBH}}$) ranging from 0.52 to 0.83 with an average of 0.68. Among the 0.448 million stems, such slender trees represented a small proportion of 3.5% and had HDR (i.e., calculated as predicted $H$ over DBH) from 0.56 to 1.13, with an average of 0.70.

The poorer performance of our model for $R_{\text{l}} \le 0.20$ and $d \ge 0.90$ reflected a structural weakness of our model. As defined in the “Notation” section, $d$ takes any value between 0 and 1. When the value of ${\text{SEDTL}}$ approaches DBH, $d \to 1$ and the base function of $(1 - d)$ tends to 0. A very small positive number with a positive exponent as formulated in Eq. 7 would result in a relatively large predicted value of $H$. Conversely, a positive exponent would lead to a relatively small predicted value of $H$. As a result, the variance of prediction error became much larger as shown by the large values of MSEP in Tables 4 and 5. Based on the benchmarking statistics for the local evaluations of prediction accuracy, the applicable range of our nonlinear model for P. radiata stems processed and recorded by CTL harvesters should be where values of $R_{\text{l}} > 0.20$ and $d < 0.95$. For stems outside of this applicable range, total tree height could still be estimated but indirectly based on model estimates for trees within the applicable range and also within a user-defined neighborhood or local area. Using the estimated $H$ of these trees and their spatial co-ordinates recorded in the harvester data, stem-specific and spatially varying geographically weighted linear or nonlinear height–diameter equations could be derived for more accurate total tree height estimation for such stems following the approach of Zhang et al. (2003) and Caccamo et al. (2018).

The approach of estimating the parameters of a single-equation model through the multiple-equation GMM estimator was adopted in this study specifically to obtain efficient and consistent parameter estimates in the presence of error correlation and heteroscedasticity that were inherent to the systematic structure of data generated by the CTL simulations. This approach proved to be better than all single-equation methods that were evaluated during model derivation and parameter estimation. Although the comparative results were not reported here, these single-equation methods for the log-transformed linear models included (1) least squares regression (LSQ), (2) repeated sampling and fitting through LSQ, each time using data from only 1 of the 6 cutting patterns randomly selected from each tree to avoid error autocorrelation, and (3) LSQ with discrete as well as continuous AR1 (first-order autoregressive) errors. For the nonlinear models, these single-equation methods included (1) nonlinear least squares regression (NLSQ), (2) weighted nonlinear least squares regression (WNLSQ) to overcome heteroscedasticity, (3) repeated sampling and fitting through WNLSQ to reduce heteroscedasticity and at the same time to avoid error autocorrelation, and (4) WNLSQ with both discrete and continuous AR1 errors. In addition, the generalized estimating equations (GEE) implemented in the SAS macro %NLMIX as described by Vonesh (2012) was also attempted, but the estimation was not successful because of difficulties in achieving convergence. All these single-equation methods were compared with the multiple-equation GMM estimator and evaluated through the leave one-tree-out cross-validation process and the benchmarking statistics as described previously. The multiple-equation GMM estimator provided not only efficient and consistent parameter estimates in the presence of error correlation and heteroscedasticity in the structured data, but also the best predictive performance.

The repeated sampling and fitting also exposed a structural weakness of Varjo’s model due to the high degree of multicollinearity among its predictor variables as indicated by the collinearity diagnostics during parameter estimation of its linear as well as nonlinear form. The variance inflation factor was either close to the commonly used benchmark of 10 or much greater than 10 for the predictor variables in the LSQ estimation of a single equation, while the condition index was greater than 110 and 220 in the linear and nonlinear multiple-equation GMM estimations, well above the commonly recognized threshold of 30 (Belsley 1991; Galmacci 1996; Alin 2010; Friendly and Kwan 2009). The high degree of multicollinearity led to highly variable and unstable parameter estimates. As a result, there tended to be either 1 or 2 estimated parameters that were not significantly different from zero as found during repeated sampling and fitting. A small change in the data could result in relatively large changes in parameter estimates, even to extent of switching signs. Therefore, it was not surprising to see (1) the linear and nonlinear estimates of some parameters having opposite signs as shown in Table 1 and (2) the comparatively large difference in the predictive performance between the linear and nonlinear forms of Varjo’s model (Table 2). Even for the same linear form, the model fitted by Lu et al. (2018) to data generated from a single cutting pattern based on a much smaller harvester data set had parameters of different signs to that estimated in this study. Although detailed comparative results were not reported here for the sake of parsimony, its predictive performance became much poorer when tested over the much larger data space in this study, reflecting the potential impact of multicollinearity on the predictive ability of the model beyond its original training data space where the nature and degree of multicollinearity differed. Such effects of multicollinearity in linear and nonlinear regression models have long been recognised (Belsley 1984, 1991; Galmacci 1996; Alin 2010; Erkoç et al. 2010). In comparison, our new model in both linear and nonlinear forms did not suffer from the same problem, the condition index was 28 and 23 in the linear and nonlinear multiple-equation GMM estimations, respectively, all below the benchmark value of 30. This relatively weak multicollinearity among predictor variables of our model would certainly contribute to its superior predictive performance.

Interest has been growing in making a greater use of harvester data in forest management and planning among both researchers and managers over the last 10 years (Möller et al. 2011; Olivera and Visser 2016; Roth 2016). Most recently, harvester data analytics has been identified to be essential for the successful transformation of big forestry data into valuable data for management and envisaged to be an integral and indispensable part in the generation of a virtual forest for subsequent digitization of the wood supply chain within the internet of trees and services in the future (Müller et al. 2019). As shown by Möller et al. (2011), accurate estimation of total tree height for harvested stems represents a necessary basic step in harvester data analytics. Our new model for predicting total tree height will facilitate and widen the utilization of harvester data far beyond the current limited use in the management of radiata pine plantations, i.e., mostly for log production monitoring and reporting. It will enable the full integration of harvester data with conventional inventory data, remote sensing imagery and LiDAR data for the development of harvester-based inventory systems, for the prediction of attributes of individual trees, stands and forests, and for estimating product recovery and residue biomass in radiata pine plantations. It will also facilitate (1) the screening and exploratory analysis of harvester data, (2) calibration and estimation of bark thickness, (3) mapping of site index, (4) development site-specific height–diameter curves, and (5) post-thinning assessment of diameter and height distributions of retained versus removed stems. In addition, accurately estimated total tree height will make harvester data a potential source of taper data to supplement and possibly reduce the intensity and cost of the conventional destructive taper sampling in the field. This list is far from exhaustive. Many other applications of our model are expected to emerge as harvester data analytics become increasingly refined and sophisticated with further development. Although our new model was developed for radiata pine, its mathematical form will be applicable to other tree species in plantations as well as natural forests where CTL harvester data are routinely captured during thinning and harvesting operations.

References

Alin A (2010) Multicollinearity. Wiley Interdiscip Rev Comput Stat 2(3):370–374
Article Google Scholar
Allen DM (1974) The relationship between variable selection and data agumentation and a method for prediction. Technometrics 16(1):125–127
Article Google Scholar
Barth A, Holmgren J (2013) Stem taper estimates based on airborne laser scanning and cut-to-length harvester measurements for preharvest planning. Int J For Eng 24(3):161–169
Google Scholar
Barth A, Möller JJ, Wilhelmsson L, Arlinger J, Hedberg R, Söderman U (2015) A Swedish case study on the prediction of detailed product recovery from individual stem profiles based on airborne laser scanning. Ann Sci 72(1):47–56
Article Google Scholar
Belsley DA (1984) Collinearity and forecasting. J Forecast 3(2):183–196
Article Google Scholar
Belsley DA (1991) Conditioning diagnostics: collinearity and weak data in regression. Wiley Series in Probability, New York, p 396
Google Scholar
Bi H (2000) Trigonometric variable-form taper equation for Australian eucalypts. For Sci 46(3):397–409
Google Scholar
Bi H, Long Y (2001) Flexible taper equation for site-specific management of Pinus radiata in New South Wales, Australia. For Ecol Manag 148(1):79–91
Article Google Scholar
Bi H, Jurskis V, O’Gara J (2000) Improving height prediction of regrowth eucalypts by incorporating the mean size of site trees in a modified Chapman–Richards equation. Aust For 63(4):257–266
Article Google Scholar
Bi H, Fox JC, Li Y, Lei Y, Pang Y (2012) Evaluation of nonlinear equations for predicting diameter from tree height. Can J For Res 42(4):789–806
Article Google Scholar
Caccamo G, Iqbal IA, Osborn J, Bi H, Arkley K, Melville G, Aurik D, Stone C (2018) Comparing yield estimates derived from LiDAR and aerial photogrammetric point-cloud data with cut-to-length harvester data in a Pinus radiata plantation in Tasmania. Aust For 81(3):131–141
Article Google Scholar
Cook D, Swayne DF, Buja A (2007) Interactive and dynamic graphics for data analysis: with R and GGobi. Springer, New York, p 188
Book Google Scholar
Erkoç A, Tez M, Akay KU (2010) On multicollinearity in nonlinear regression models. Selçuk J Appl Math, Special Issue: 65–72
Friendly M, Kwan E (2009) Where’s Waldo? Visualizing collinearity diagnostics. Am Stat 63(1):56–65
Article Google Scholar
Galmacci G (1996) Collinearity detection in linear regression models. Comput Econ 9(3):215–227
Article Google Scholar
Geisser S (1975) The predictive sample reuse method with applications. J Am Stat Assoc 70(350):320–328
Article Google Scholar
Gerasimov Y, Seliverstov A, Syunev V (2012) Industrial round-wood damage and operational efficiency losses associated with the maintenance of a single-grip harvester head model: a case study in Russia. Forests 3(4):864–880
Article Google Scholar
Gerasimov Y, Sokolov A, Syunev V (2013) Development trends and future prospects of cut-to-length machinery. Adv Mater Res 705:468–473
Article Google Scholar
Greene WH (2012) Econometric analysis, 7th edn. Prentice Hall, Boston, p 1232
Google Scholar
Gujarati DN, Porter DC (2010) Essentials of econometrics, 4th edn. Irwin/McGraw-Hill, Boston, p 554
Google Scholar
Hauglin M, Hansen E, Sørngård E, Næsset E, Gobakken T (2018) Utilizing accurately positioned harvester data: modelling forest volume with airborne laser scanning. Can J For Res 48(999):1–10
Google Scholar
Hayashi F (2000) Econometrics. Princeton University Press, Princeton NJ, p 712
Google Scholar
Heinimann HR (2007) Forest operations engineering and management—the ways behind and ahead of a scientific discipline. Croat J For Eng 28(1):107–121
Google Scholar
Holmgren J, Barth A, Larsson H, Olsson H (2012) Prediction of stem attributes by combining airborne laser scanning and measurements from harvesters. Silva Fenn 46(2):227–239
Article Google Scholar
Holopainen M, Vastaranta M, Rasinmäki J, Kalliovirta J, Mäkinen A, Haapanen R, Melkas T, Yu X, Hyyppä J (2010) Uncertainty in timber assortment estimates predicted from forest inventory data. Eur J For Res 129(6):1131–1142
Article Google Scholar
Horne R, Robinson GL (1988) Development of basal area thinning prescriptions and predictive yield models for Pinus Radiata plantations in New South Wales, 1962–1988. Forestry Commission of New South Wales, Sydney, p 37
Google Scholar
Huang SM (1999) Ecoregion-based individual tree height-diameter models for lodgepole pine in Alberta. West J Appl For 14(4):186–193
Article Google Scholar
Huang SM, Titus SJ, Wiens DP (1992) Comparison of non-linear height-diameter functions for major Alberta tree species. Can J For Res 22(9):1297–1304
Article Google Scholar
Huang SM, Yang Y, Wang Y (2003) A critical look at procedures for validating growth and yield models. In: Amaro A, Reed D, Soares P (eds) Modelling forest systems. CABI Publishing, Oxford, pp 271–293
Google Scholar
Huyler NK, LeDoux CB (1999) Performance of a cut-to-length harvester in a single-tree and group selection cut. USDA Forestry Service, Northeastern Research Station, Research Paper NE-711, p 6
Kiljunen N (2002) Estimating dry mass of logging residues from final cuttings using a harvester data management system. Int J For Eng 13(1):17–25
Google Scholar
Koenker R (2017) Quantile regression: 40 years on. Annu Rev Econ 9(1):155–176. https://doi.org/10.1146/annurev-economics-063016103651
Article Google Scholar
Koenker R (2018) quantreg: quantile regression. R package version 5.38. https://cran.r-project.org/package=quantreg. Accessed 15 Nov 2018
Lang AH, Baker SA, Greene WD, Murphy GE (2010) Individual stem value recovery of modified and conventional tree-length systems in the southeastern United States. Int J For Eng 21(1):7–11
Google Scholar
Lu K, Bi H, Watt D, Strandgard M, Li Y (2018) Reconstructing the size of individual trees using log data from cut-to-length harvesters in Pinus radiata plantations: a case study in NSW, Australia. J For Res 29(1):13–33
Article Google Scholar
Malinen J, Laitila J, Väätäinen K, Viitamäki K (2016) Variation in age, annual usage and resale price of cut-to-length machinery in different regions of Europe. Int J For Eng 27(2):95–102
Google Scholar
Maltamo M, Bollandsås OM, Vauhkonen J, Breidenbach J, Gobakken T, Næsset E (2010) Comparing different methods for prediction of mean crown height in Norway spruce stands using airborne laser scanner data. Forestry 83(3):257–268
Article Google Scholar
Möller JJ, Arlinger J, Hannrup B, Larsson W, Barth A (2011) Harvester data as a base for management of forest operations and feedback to forest owners. In: Ackerman P, Ham H and Gleasure E (eds) Proceedings of 4th forest engineering conference: innovation in forest engineering—adapting to structural change. Stellenbosch University, White River, South Africa, 5–7 April 2011, pp 31–35
Müller F, Jaeger D, Hanewinkel M (2019) Digitization in wood supply—a review on howIndustry 4.0 will change the forest value chain. Comput Electron Agric 162:206–218
Article Google Scholar
Murphy G (2003) Procedures for scanning radiata pine stem dimensions and quality on mechanised processors. Int J For Eng 14(2):11–21
Google Scholar
Murphy G, Wilson I, Barr B (2006) Developing methods for pre-harvest inventories which use a harvester as the sampling tool. Aust For 69(1):9–15
Article Google Scholar
Nash JE, Sutcliffe JV (1970) River flow forecasting through conceptual models part I—A discussion of principles. J Hydrol 10(3):282–290
Article Google Scholar
Nordfjell T, Björheden R, Thor M, Wästerlund I (2010) Changes in technical performance, mechanical availability and prices of machines used in forest operations in Sweden from 1985 to 2010. Scand J For Res 25(4):382–389
Article Google Scholar
Olivera A, Visser R (2016) Development of forest-yield maps generated from Global Navigation Satellite System (GNSS)-enabled harvester StanForD files: preliminary concepts. N Z J For Sci 46(1):3
Article Google Scholar
Olivera A, Visser R, Acuna M, Morgenroth J (2016) Automatic GNSS-enabled harvester data collection as a tool to evaluate factors affecting harvester productivity in a Eucalyptus spp. harvesting operation in Uruguay. Int J For Eng 27(1):15–28
Google Scholar
Palander T, Vesa L, Tokola T, Pihlaja P, Ovaskainen H (2009) Modelling the stump biomass of stands for energy production using a harvester data management system. Biosyst Eng 102(1):69–74
Article Google Scholar
Peuhkurinen J, Maltamo M, Malinen J (2008) Estimating species-specific diameter distributions and saw log recoveries of boreal forests from airborne laser scanning data and aerial photographs: a distribution-based approach. Silva Fenn 42(4):625–641
Article Google Scholar
Rasinmäki J, Melkas T (2005) A method for estimating tree composition and volume using harvester data. Scand J For Res 20(1):85–95
Article Google Scholar
Roberts DR, Bahn V, Ciuti S, Boyce MS, Elith J, Guillera-Arroita G, Hauenstein S, Lahoz-Monfort JJ, Schröder B, Thuiller W, Warton DI (2017) Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography 40(8):913–929
Article Google Scholar
Rossit DA, Olivera A, Céspedes VV, Broz D (2019) A Big Data approach to forestry harvesting productivity. Comput Electron Agric 161:29–52
Article Google Scholar
Roth G (2016) StanForD as a data source for forest management: a forest stand reconciliation implementation case study. M.Sc. thesis, University of Canterbury, New Zealand, p 55
Saukkola A, Melkas T, Riekki K, Sirparanta S, Peuhkurinen J, Holopainen M, Hyyppä J, Vastaranta M (2019) Predicting forest inventory attributes using airborne laser scanning, aerial imagery, and harvester data. Remote Sens 11(7):797
Article Google Scholar
Sellén D (2016) Big Data analytics for the forest industry: a proof-of-concept built on cloud technologies. M.Sc. thesis, Mid Sweden University, Ostersund, Sweden, p 80
Siipilehto J, Lindeman H, Vastaranta M, Yu X, Uusitalo J (2016) Reliability of the predicted stand structure for clear-cut stands using optional methods: airborne laser scanning-based methods, smartphone-based forest inventory application Trestima and pre-harvest measurement tool EMO. Silva Fenn 50(3), 1568. https://doi.org/10.14214/sf.1568
Article Google Scholar
Söderberg J (2015) A method for using harvester data in airborne laser prediction of forest variables in mature coniferous stands. M.Sc. thesis, Swedish University of Agricultural Science, Uppsala, Sweden, p 31
Stendahl J, Dahlin B (2002) Possibilities for harvester-based forest inventory in thinnings. Scand J For Res 17(6):548–555
Article Google Scholar
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc B 36(2):111–133
Google Scholar
Strandgard M, Walsh D, Acuna M (2013) Estimating harvester productivity in Pinus radiata plantations using StanForD stem files. Scand J For Res 28(1):73–80
Article Google Scholar
Swayne D, Cook D, Buja A, Lang D, Wickham H, Lawrence M (2006) GGobi Manual. http://www.ggobi.org/docs/manual.pdf. Accessed 5 Jan 2018
Uusitalo J (2010) Introduction to forest operations and technology. JVP Forest Systems Oy, Hämeenlinna, p 287
Google Scholar
Uusitalo J (2017) Big data is transforming forestry. www.luke.fi/en/big-data-transforming-forestry. Accessed 1 Mar 2018
Varjo J (1995) Latvan hukkaosan pituusmallit männylle, kuuselle ja koivulle metsurimittausta varten. In: Verkasalo E (ed) Puutavaran mittauksen kehittämistutkimuksia 1989–93, Finnish Forest Research Institute, Research Papers 558, pp 21–23 (in Finnish). https://jukuri.luke.fi/handle/10024/521187
Vesa L, Palander T (2010) Modeling stump biomass of stands using harvester measurements for adaptive energy wood procurement systems. Energy 35(9):3717–3721
Article Google Scholar
Vonesh EF (2012) Generalized linear and nonlinear models for correlated data: theory and applications using SAS. SAS Institute, Cary
Google Scholar
Wackerly DD, Mendenhall W, Scheaffer RL (1996) Mathematical statistics with applications. Duxbury Press, Belmont, p 798
Google Scholar
Williams C, Ackerman P (2016) Cost-productivity analysis of South African pine sawtimber mechanised cut-to-length harvesting. South For J For Sci 78(4):267–274
Article Google Scholar
Zhang L, Bi H, Cheng P, Davie CJ (2003) Modelling spatial variations in tree diameter—height relationships. For Ecol Manag 189:317–329
Article Google Scholar
Zhang YH, Li Y, Bi H (2015) Converting diameter measurements of Pinus radiata taken at different breast heights. Aust For 78(1):1–5
Article Google Scholar

Download references

Acknowledgements

We thank Mr. Mike Sutton of the Forestry Corporation of NSW for providing the data license to Beijing Forestry University for this collaborative work. We are indebted to the past and present forestry staff of the Forestry Commission of NSW, State Forests of NSW, Forests NSW, and the Forestry Corporation of NSW who collected the data in the field. Dr. Glen Murphy provided helpful comments on the manuscript.

Author information

Authors and Affiliations

School of Information Science and Technology, Beijing Forestry University, Beijing, 100083, People’s Republic of China
Chenxi Shan & Yun Li
Forest Science, NSW Department of Primary Industries – Forestry, Level 12, 10 Valentine Ave, Parramatta, NSW, 2150, Australia
Huiquan Bi
School of Ecosystem and Forest Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
Huiquan Bi
Snowy Region, Forestry Corporation of NSW, 76 Capper Street, Tumut, NSW, 2720, Australia
Duncan Watt
Australian Forest Operations Research Alliance (AFORA), University of the Sunshine Coast, Sippy Downs, QLD, 4556, Australia
Martin Strandgard & Mohammad Reza Ghaffariyan

Authors

Chenxi Shan
View author publications
You can also search for this author in PubMed Google Scholar
Huiquan Bi
View author publications
You can also search for this author in PubMed Google Scholar
Duncan Watt
View author publications
You can also search for this author in PubMed Google Scholar
Yun Li
View author publications
You can also search for this author in PubMed Google Scholar
Martin Strandgard
View author publications
You can also search for this author in PubMed Google Scholar
Mohammad Reza Ghaffariyan
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yun Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Project funding: This study was partly funded by Forest and Wood Products Australia Limited (FWPA) through project PNC465-1718: Advanced real-time measurements at harvest to increase value recovery and also supported by Beijing Forestry University through the special fund for characteristic development under the program of Building World-class University and Disciplines.

The online version is available at http://www.springerlink.com

Corresponding editor: Yu Lei.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Shan, C., Bi, H., Watt, D. et al. A new model for predicting the total tree height for stems cut-to-length by harvesters in Pinus radiata plantations. J. For. Res. 32, 21–41 (2021). https://doi.org/10.1007/s11676-019-01078-6

Download citation

Received: 06 May 2019
Accepted: 27 August 2019
Published: 16 December 2019
Issue Date: February 2021
DOI: https://doi.org/10.1007/s11676-019-01078-6

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A new model for predicting the total tree height for stems cut-to-length by harvesters in Pinus radiata plantations

Abstract

Similar content being viewed by others

Characterizing prediction errors of a new tree height model for cut-to-length Pinus radiata stems through the Burr Type XII distribution

Predicting the upper stem diameters and volume of a tropical dominant tree species

Predicting knottiness of Scots pine stems for quality bucking

Introduction

Materials and methods

Notation

Data

Taper data

Harvester data

Data screening and interactive exploratory data analysis

Cut-to-length (CTL) simulations and data generation

Model derivation

Parameter estimation of linear and nonlinear models

Evaluating and comparing prediction accuracy

Results

Discussion

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation