Introduction

Tree height (h) and diameter (d) are important variables that are frequently measured in forest inventories for the determination of volume, biomass, and basal area (Gomez-Garcia et al. 2014; West 2015), and used for forest stand structure analysis (Ogana and Gorgoso-Varela 2020). They provide information on the competitive status of a tree within a stand (West 2015) and their ratio is used as a stability index, i.e., tree slenderness coefficient (Sharma and Parton 2007; Zhang et al. 2020). Equally important, height and diameter measurements are used for assessing site productivity (West 2015). In fact, height-diameter allometry is regarded as the fundamental component of forest growth and yield models (Gomez-Garcia et al. 2014; Bravo et al. 2019).

The ease with which diameter and height are measured varies, with the former being easier and cheaper to measure (Ferraz-Filho et al. 2018). Measurement of tree height, on the other hand, is costly, often difficult and time-consuming (Özçelik et al. 2018; Ciceu et al. 2020; Magnussen et al. 2020), especially in complex forest ecosystems with closed canopies (Larjavaara and Muller-Landau 2013), and as such, foresters find it more acceptable to estimate this variable (Temesgen et al. 2014). To do this, a few heights are measured and an appropriate height-diameter (h–d) function is then used to estimate the heights of the other trees for which diameters have been measured (Kalbi et al. 2018). Modelling tree height-diameter relationships in even-aged, single-layer and monospecific or conspecific stands is straightforward and less variable compared with complex tropical mixed forest ecosystems, which are characterised by multiple species, multiple layers, and indeterminate age composition (Temesgen et al. 2014).

A good example of a complex forest ecosystem is the tropical rain forest biome, regarded as one of the world’s major vegetation types and the most diverse terrestrial ecosystem (Turner 2001). It serves as habitat for more fauna and flora species than other biomes (Turner 2001). Studies have shown that in Nigerian rainforests there are more than 4600 identified plant species (Sarumi et al. 1996) and a majority are locally endemic (Richards 1996). Turner (2001) also suggested that some tropical rain forests may have over 100 tree species with ≥ 10 cm diameter at breast height (1.3 m aboveground) on one hectare. Thus, the complexity of species and structural composition within a small area makes it difficult to develop models for estimating some dendrometric variables, e.g., tree height (Akindele and LeMay 2006; Bravo et al. 2019).

However, attempts have been made to develop height-diameter (h–d) models for tropical forest ecosystems using different approaches. For example, Fang and Bailey (1998) developed h–d models for all species combined in a tropical forest in Hainan, China. Feldpausch et al. (2011) developed regional h–d allometry models for tropical forest ecosystems using the ordinary least squares (OLS) technique. A similar approach was used by Ogana (2019) to fit h–d models in tropical mixed forests in Nigeria. However, procedures that do not take species-specific variability into consideration may not give precise predictions of height (Temesgen et al. 2014). A frequently used alternative involves identifying the major tree species, arranging them into groups if there are many, and using OLS or mixed-effect modelling techniques to develop models for the groups. Temesgen et al. (2014) used this methodology to develop h–d relationships for major tree species in tropical forests in Northeast China. Kearsley et al. (2017) also used a similar procedure for tropical forests in the Congo basin. This approach seems appropriate and logical; however, when the objective is to estimate the aboveground biomass of a tropical mixed forest, the selection of major species may be irrelevant. Since tropical biomass equations like those developed by Chave et al. (2014) and Fayolle et al. (2018) require tree height as one of the input variables, it is important to develop h–d models that account for the complex nature of tropical forest ecosystems. In Nigeria, Chenge (2021) classified all the sampled species in Omo biosphere into groups and fitted both OLS and non-linear mixed-effect (NLME) models to the group data.

A more recent approach that could be used to address the problem of modelling h–d relationships in a complex forest ecosystem is artificial neural networks (ANNs). ANNs are a subset of artificial intelligence (AI) whose functionality mimics that of the human brain (Strobl and Forte 2007). ANNs have been consistently used in forestry with significant success for modelling tree height (Özçelik et al. 2013; Vieira et al. 2018; Bayat et al. 2020; Ercanli 2020a; Hamidi et al. 2021), tree taper (Nunes and Görgene 2016), site productivity (Aertsen et al. 2010), tree biomass and volume (Miguel et al. 2016; Özçelik et al. 2017), basal area increment (Ashraf et al. 2013) and mortality and regeneration (Hamidi et al. 2021). These researchers reported reasonable predictions of tree dendrometric variables with ANNs compared with ordinary least squares and mixed-effect models. However, most of the studies have been limited to conspecific stands or stands with a few tree species. In addition to ANNs, the deep learning algorithm (DLA) is another form of AI that has been recently introduced. DLA models are multi-layered ANNs with at least three hidden layers and hundreds to thousands of neurons (Ercanli 2020a). Their structure is more complex than that of conventional ANNs and more closely resembles the human brain. Recent studies by Ercanli (2020a, b) showed that the DLA predicted tree height in an even-aged pine stand better than ANNs, mixed-effect and ordinary least squares models.

The application of DLA models in the complex tropical forests of Africa, including Nigeria, has apparently not been documented. Yet accurate prediction of dendrometric variables such as total tree height is necessary for quantifying the aboveground biomass (AGB) of the region. When tree heights are accurately estimated for complex tropical forests, minimal errors will be introduced into the estimation of AGB. Therefore, the objectives of this study were to: (1) develop DLA models for a tropical rain forest of Nigeria; (2) compare the predictions from DLA with h–d models developed with classical methods; and (3) evaluate the models based on aboveground biomass estimations.

Materials and methods

Data

The data used for this study were collected in Cross River State, Nigeria, during a REDD+ research project funded by the African Forest Forum (AFF) in collaboration with the Swiss Agency for Development and Cooperation (SDC). Additional inventory data from research in the Ekuri Forest Reserve in the same state were also included. The data comprise the diameter and total height of 1,736 individual trees representing 116 species, measured in 52 sample plots of 0.25 ha each. The number of individual trees (n) per species ranged from 1 to 378; only 12 species had n ≥ 30. Because of the large number of species, most of them represented by few trees, it was not possible to develop species-specific height functions. Therefore, a cluster analysis was carried out.

K-means clustering (MacQueen 1967) was used to classify the species into groups based on height-diameter ratios; this ensures high intra-class and low inter-class similarities. The Hartigan-Wong algorithm (Hartigan and Wong 1979 cited in Kassambara 2017) was used. The algorithm minimizes the total intra-cluster variation, defined as the sum of squared Euclidean distances between the height-diameter ratio of each species and the corresponding cluster mean:

$$TWSS = \sum_{k = 1}^{K} W\left( C_{k} \right) = \sum_{k = 1}^{K} \sum_{x_{i} \in C_{k}} \left( x_{i} - \mu_{k} \right)^{2}$$
(1)

where TWSS is the total within-cluster sum of squares, K is the number of clusters, \(W\left({C}_{k}\right)\) is the within-cluster sum of squares of cluster \({C}_{k}\), \({C}_{k}\) is the individual cluster (group), \({x}_{i}\) represents the height-diameter ratio of a species belonging to the cluster \({C}_{k}\), and \({\mu }_{k}\) is the mean value of the height-diameter ratios assigned to the cluster \({C}_{k}\). The cluster (Maechler et al. 2019) and factoextra (Kassambara and Mundt 2020) packages, both implemented in R (R Core Team 2020), were used in the analysis. The 116 tree species were classified into three groups: group 1 had 68 species, and groups 2 and 3 had 25 and 23 species, respectively (see Appendix Tables S1, S2 and S3).
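As an illustration of Eq. (1), the following minimal Python sketch clusters scalar height-diameter ratios and reports the TWSS. It is not the procedure used in this study: it uses the simpler Lloyd iteration rather than the Hartigan-Wong algorithm, the quantile-based starting centres are an assumption made here for reproducibility, and the function name `kmeans_1d` and the example ratios are hypothetical.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd-style k-means on scalar values (e.g. species h-d ratios).

    Returns (centres, labels, twss), where twss is the total within-cluster
    sum of squares of Eq. (1). Assumes k <= number of distinct values.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialisation: one starting centre per quantile band.
    centres = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: (x - centres[c]) ** 2) for x in values
        ]
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(values, new_labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    twss = sum((x - centres[lab]) ** 2 for x, lab in zip(values, labels))
    return centres, labels, twss


# Hypothetical h-d ratios for nine species, forming three obvious groups.
ratios = [0.5, 0.55, 0.6, 1.0, 1.05, 1.1, 1.8, 1.9, 2.0]
centres, labels, twss = kmeans_1d(ratios, k=3)
print(labels, round(twss, 4))
```

Because k-means results depend on the starting centres, a production analysis would normally use several random starts, as R's `kmeans` allows.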

Descriptive statistics of the tree variables: diameter (d in cm), total tree height (h in m) and height-diameter ratio (h–d r); computed stand variables: quadratic mean diameter (Dg, cm), basal area per ha (G, m2 ha–1), basal area per ha of larger trees (BAL, m2 ha–1) and number of trees per ha (N, trees ha–1); computed diversity indices: dominance, evenness, Simpson and Shannon indices of the data by species-group are shown in Table 1. The species-group data were randomly split into training (85%) and validation (15%) sets. Diameter histograms (pooled data) and scatter plots by species-group are presented in Fig. 1a and b, respectively.

Table 1 Descriptive statistics of the species group data from tropical rain forests of Nigeria
Fig. 1 a Diameter histograms (pooled data) and b scatterplots by species group (SG)

Modelling the height-diameter (h–d) relationships

Two sets of h–d models were developed for each species-group (SG) data from tropical rain forest ecosystems of Nigeria: those based on classical methods, i.e., nonlinear least squares (NLS) and nonlinear mixed-effects (NLME), and those based on artificial intelligence (AI), i.e., the deep learning algorithm (DLA).

Models based on classical methods: NLS and NLME

Several nonlinear single-predictor height-diameter functions have been used to describe tree height and diameter relationships in both even-aged and uneven-aged stands (Mehtätalo et al. 2015; Corral-Rivas et al. 2019; Bronisz and Mehtätalo 2020; Ciceu et al. 2020; Ercanli 2020a; Ogana et al. 2020; Xie et al. 2020), and in complex natural forests (Feldpausch et al. 2011; Temesgen et al. 2014; Kearsley et al. 2017; Ogana 2019; Chenge 2021). To select the base model for the complex tropical forests, 18 single-predictor h–d models were initially evaluated. The models include: Curtis (1967), Meyer (1940), Chapman-Richards (Richards 1959), Michailoff (1943), Michaelis-Menten (Michaelis and Menten 1913), Korf (Lundqvist 1957), Näslund (1937), Power (Stoffels and van Soest 1953), modified power (Ogana and Gorgoso-Varela 2020), Prodan (Strand 1959), Gompertz (1825), Logistic (Pearl and Reed 1920), Ratkowsky (1990), Schnute (1981), Wykoff (Wykoff et al. 1982), modified Hossfeld IV (Ogana et al. 2020), Weibull (Yang et al. 1978), and Burkhart (Burkhart and Strub 1974). Nonlinear least squares (NLS) was used to fit the models in R (R Core Team 2020), and the fitted models were evaluated and ranked based on five indices. Preliminary results showed that the Meyer model had the minimum rank sum (see Appendix Table S4). Thus, this model was selected and expanded.

The Meyer model (Eq. 2) was expanded with the inclusion of stand variables and biodiversity indices. Stand variables (Dg, G, BAL and N) and biodiversity indices (dominance, evenness, Shannon and Simpson) in Table 1 were all evaluated first. However, only the inclusion of the quadratic mean diameter (Dg) and number of trees per ha (N) in a linear combination as replacement for the asymptotic parameter \({b}_{0}\) improved the models significantly. The generalised model is expressed as Eq. (3):

$$E\left( h \right) = 1.3 + b_{0} \left( {1 - e^{{ - b_{1} d}} } \right)$$
(2)
$$E\left( h \right) = 1.3 + \left( {a_{0} + a_{1} Dg + a_{2} N} \right)\left( {1 - e^{{ - b_{1} d}} } \right)$$
(3)

where E(h) and d represent expected total tree height (m) and diameter at breast height (cm), respectively; Dg is quadratic mean diameter (cm), N is number of trees per ha (trees ha−1), and \({a}_{0}\), \({a}_{1}\), \({a}_{2}\), \({b}_{1}\) are model parameters. Equations (2) and (3) were both fitted with NLS and NLME to the individual species-group data. NLS has only fixed-effects parameters, which explain the trend in tree height common to the overall stand (Ercanli 2020a). In contrast, NLME has both fixed-effects and random-effects parameters. The fixed-effects parameters play a similar role to those of NLS; the random-effects parameters explain the variation in h–d relationships across the plots.
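To make Eqs. (2) and (3) concrete, the sketch below evaluates both forms in Python. The function names and the parameter values in the usage lines are hypothetical placeholders, not the estimates reported in Tables 2 to 4.

```python
import math


def meyer_height(d, b0, b1):
    """Eq. (2): expected height (m) from diameter at breast height d (cm)."""
    return 1.3 + b0 * (1.0 - math.exp(-b1 * d))


def generalised_meyer_height(d, dg, n, a0, a1, a2, b1):
    """Eq. (3): the asymptote b0 is replaced by a0 + a1*Dg + a2*N."""
    asymptote = a0 + a1 * dg + a2 * n
    return 1.3 + asymptote * (1.0 - math.exp(-b1 * d))


# Placeholder parameters, chosen only to show the shape of the curve.
for d in (10, 30, 60, 120):
    print(d, round(meyer_height(d, b0=35.0, b1=0.03), 2))
```

The predicted height starts at 1.3 m when d = 0 and approaches 1.3 + b0 (or 1.3 + a0 + a1·Dg + a2·N in the generalised form) as d grows, which is why the stand variables act on the asymptote.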

The NLME model is represented in the general equation (Pinheiro and Bates 2013) as:

$$h_{ij} = f\left( {V_{ij} ;\phi_{i} } \right) + \varepsilon_{ij} ,\quad i = 1, \ldots ,m;\; j = 1, \ldots ,n_{i}$$
(4)
$$\phi_{i} = {\rm A}_{i} \lambda + B_{i} b_{i}$$
(5)

where m represents the number of plots (plot was the only grouping factor used in this study); \({n}_{i}\) represents the number of observations in the ith plot; \({h}_{ij}\) is the height of tree j on plot i; \({V}_{ij}\) is a covariate vector; f represents the nonlinear models [Eqs. (2) and (3)]; \({\phi }_{i}\) is the r × 1 parameter vector, where r is the number of model parameters; λ is the p × 1 vector of fixed parameters (p is the number of fixed parameters); \({b}_{i}\) is the q × 1 vector of random parameters (q is the number of random parameters) (Corral-Rivas et al. 2019); and \({\rm A}_{i}\) (r × p) and \({B}_{i}\) (r × q) are the design matrices for the fixed and random effects, respectively, for plot i (Corral-Rivas et al. 2019). The plot effects are presumed to have a common multivariate normal distribution with zero mean and variance–covariance matrix var(\({b}_{i}\)) = D for all values of i (Mehtätalo et al. 2015). The \({\varepsilon }_{ij}\) represents the random error with zero mean and constant variance var(\({\varepsilon }_{ij}\)) = \({\sigma }^{2}\). A power-type variance function was used to account for heteroscedasticity in the residuals: var(\({\varepsilon }_{ij}\)) = \({\sigma }^{2}{d}_{ij}^{2\delta }\), where \(\delta\) is the power parameter to be estimated. The parameters of the NLME models were estimated by maximum likelihood using the ‘nlme’ function in R (R Core Team 2020).

Deep learning algorithm (DLA)

The deep learning algorithm (DLA) is a multi-layer artificial neural network (ANN) with at least three hidden layers and hundreds to thousands of neurons, and gives a better representation of complex systems such as tropical forest ecosystems (Ercanli 2020a). The DLA requires sophisticated graphical processing units; thus, this study utilised the h2o.deeplearning function of the h2o package (LeDell et al. 2020) implemented in R (R Core Team 2020) to train the models. The h2o.deeplearning function implements multi-layer feedforward neural networks and provides supervised training procedures to predict an output variable from one or more input variables. In training the DLA models, diameter at breast height (d, cm), quadratic mean diameter (Dg, cm) and number of trees (N, trees per ha) were used as input variables, while tree height (h, m) was the output variable. These input variables were the same independent variables used for the classical methods (NLS and NLME). The DLA was trained for each species-group.

Several factors influence the convergence of the DLA, e.g., the number of hidden layers, the number of neurons in the hidden layers, the activation function, the distribution type, epochs, epsilon and rho. The adaptive learning rate algorithm called ADADELTA (Zeiler 2012 cited in Ercanli 2020b) was used to ensure fast convergence of the DLA. ADADELTA combines momentum training and learning rate annealing. The rho parameter explains the rate of ADADELTA, while epsilon describes the strength of the learning rate during the training. Default values of 0.999 and \(1 \times 10^{-8}\) for rho and epsilon, respectively, were used to train the DLA models. A default value of 1000 was also used for the epochs. A similar value was used in Ercanli (2020a, 2020b). The Gaussian distribution was selected from among the distributions available in the h2o.deeplearning function (e.g., Bernoulli, Huber, Poisson, Multinomial, and Laplace) as the training distribution because it is a continuous distribution. The number of hidden layers evaluated in this study ranged from 3 to 10; more than 10 hidden layers were not considered because too complex a network makes it difficult to achieve convergence. For each hidden layer, 10 to 100 neurons were evaluated in increments of 10. Of the three activation functions of the h2o.deeplearning function, the rectifier function was the most suitable for the data set. The activation functions describe the nonlinear trends in the tropical data set (Ercanli 2020b). The best DLA models were selected for each species-group.
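The size of the architecture search described above can be checked with a short enumeration. This Python sketch assumes that every hidden layer in a candidate network has the same number of neurons, which is consistent with descriptions such as "100 neurons in six hidden layers" but is not stated explicitly; the function name is hypothetical, and each list mirrors the form of the `hidden` argument of h2o.deeplearning.

```python
from itertools import product


def candidate_architectures(min_layers=3, max_layers=10,
                            min_neurons=10, max_neurons=100, step=10):
    """List hidden-layer layouts: 3 to 10 layers, 10 to 100 neurons per layer.

    Assumption: all hidden layers in one candidate share the same width.
    """
    layer_counts = range(min_layers, max_layers + 1)
    neuron_counts = range(min_neurons, max_neurons + 1, step)
    return [[neurons] * layers
            for layers, neurons in product(layer_counts, neuron_counts)]


grid = candidate_architectures()
print(len(grid))   # 8 layer counts x 10 neuron counts = 80 candidates
print(grid[0])     # [10, 10, 10]
```

Under this assumption, 80 networks are trained per species-group and the one with the best evaluation statistics is retained.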

Model evaluation and equivalence test

The quality of model predictions was evaluated by comparing the root mean square error (RMSE), mean relative error (MRE), mean absolute percentage error (MAPE), critical error (Ecrit) and Bayesian information criterion (BIC). The smaller the RMSE, MRE, MAPE, Ecrit and BIC statistics, the better the model.

$$RMSE = \sqrt {\frac{\sum\nolimits_{i = 1}^{n} \left( h_{i} - \hat{h}_{i} \right)^{2} }{n - p}}$$
(6)
$$MRE = \frac{1}{n}\sum\limits_{i = 1}^{n} \frac{\hat{h}_{i} - h_{i} }{h_{i} }$$
(7)
$$MAPE = \left( \frac{1}{n}\sum\limits_{i = 1}^{n} \frac{\left| h_{i} - \hat{h}_{i} \right|}{h_{i} } \right) \times 100$$
(8)
$$E_{crit} = \frac{\sqrt {\tau^{2} \sum\nolimits_{i = 1}^{n} \left( h_{i} - \hat{h}_{i} \right)^{2} / \chi_{crit}^{2} } }{\overline{h}_{i} }$$
(9)
$$BIC = n\ln \left( \frac{RSS}{n} \right) + p\ln n$$
(10)

where RSS is the residual sum of squares; n is the number of observations; p is the number of parameters; \({\overline{h} }_{i}\) is the average tree height; \({h}_{i}\) is the observed tree height; \({\widehat{h}}_{i}\) is the height predicted by the model; \(\tau\) is the standard normal deviate (≈ 1.96 at probability level of \(\alpha\) = 0.05) and \({\chi }_{crit}^{2}\) was obtained for \(\alpha\) = 0.05. In addition, relative rank (Poudel and Cao 2013) was used to determine the relative location of each model based on the evaluation statistics. It is expressed as:

$$R_{i} = 1 + \frac{{\left( {m - 1} \right)\left( {S_{i} - S_{min} } \right)}}{{S_{max} - S_{min} }}$$
(11)

where Ri is relative rank of model i (i = 1, 2, …, m); m is the number of models evaluated, Si the evaluation statistic value of model i; Smax and Smin are the maximum and minimum values, respectively, of Si. Relative rank is a real number with 1 as the best. For each model, the relative ranks were summed across the five statistics (RMSE, MRE, MAPE, Ecrit and BIC). Thus, the relative rank sum was used to identify the best model for estimating tree height in complex tropical rain forest ecosystems.
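The evaluation statistics of Eqs. (6) to (10) and the relative rank of Eq. (11) can be sketched in Python as follows. The function names are hypothetical; the critical chi-square value is passed in by the caller because its degrees of freedom are not restated here; and returning a rank of 1 for all models when every statistic is identical is an assumption made to avoid division by zero.

```python
import math


def _rss(obs, pred):
    """Residual sum of squares between observed and predicted heights."""
    return sum((h - hp) ** 2 for h, hp in zip(obs, pred))


def rmse(obs, pred, p):
    """Eq. (6): p is the number of model parameters."""
    return math.sqrt(_rss(obs, pred) / (len(obs) - p))


def mre(obs, pred):
    """Eq. (7): mean of (predicted - observed) / observed."""
    return sum((hp - h) / h for h, hp in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Eq. (8): mean absolute percentage error."""
    return 100.0 * sum(abs(h - hp) / h for h, hp in zip(obs, pred)) / len(obs)


def e_crit(obs, pred, chi2_crit, tau=1.96):
    """Eq. (9): chi2_crit is supplied by the caller (assumption)."""
    mean_h = sum(obs) / len(obs)
    return math.sqrt(tau ** 2 * _rss(obs, pred) / chi2_crit) / mean_h


def bic(obs, pred, p):
    """Eq. (10)."""
    n = len(obs)
    return n * math.log(_rss(obs, pred) / n) + p * math.log(n)


def relative_ranks(stats):
    """Eq. (11): map statistics onto the interval [1, m], 1 being the best."""
    m, s_min, s_max = len(stats), min(stats), max(stats)
    if s_max == s_min:
        return [1.0] * m
    return [1.0 + (m - 1) * (s - s_min) / (s_max - s_min) for s in stats]


# Hypothetical RMSE values for three competing models.
print(relative_ranks([2.0, 3.0, 4.0]))
```

Unlike ordinal ranks, the relative rank preserves the spacing between models: a model whose statistic is close to the best receives a rank close to 1.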

The equivalence test of Robinson et al. (2005) was used to further assess height prediction by the classical methods (NLS and NLME) and by DLA using the validation dataset (15% of the data). In this test, the size of the region of dissimilarity between the observed and predicted tree heights is an important factor in deciding on the acceptability of the model or method. The test begins with the null hypothesis (H0) that the observed and predicted values differ significantly. Thus, rejection of H0 implies acceptance of the prediction of tree heights by the model.

The equivalence test was performed by fitting a linear regression between the observed (X) and predicted (Y; predictions by NLS, NLME and DLA) heights and testing the intercept (\({b}_{0}\)) and slope (\({b}_{1}\)) of this relationship (Ercanli 2020b). Confidence intervals (CIs) for \({b}_{0}\) and \({b}_{1}\) were calculated using two one-sided tests (TOST) (Robinson et al. 2005). TOST tests the equivalence of the slope (\({b}_{1}\)) to \(1\pm 10\%\) and of the intercept (\({b}_{0}\)) to \(\overline{y }\pm 10\%\) (Ercanli 2020b). We used the nonparametric bootstrap technique described by Robinson et al. (2005) to obtain the CIs for the parameters. The number of bootstrap replicates was set at 1000, as recommended and recently used by Ercanli (2020b). The equivalence test procedures for the observed (X) and predicted (Y) heights were carried out using the “equivalence” package (Robinson 2016) implemented in R (R Core Team 2020).
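The slope part of this test can be illustrated with a simplified percentile bootstrap in Python. This is only a sketch of the idea, not the procedure implemented in the “equivalence” package: it resamples tree pairs, refits an ordinary least-squares slope each time, and declares equivalence when the interval between the 5th and 95th percentiles (two one-sided tests at α = 0.05) lies inside 1 ± 10%. The function names, the fixed seed and the handling of degenerate resamples are assumptions.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_interval(obs, pred, n_boot=1000, alpha=0.05, seed=1):
    """Percentile interval for the slope of predicted on observed heights."""
    rng = random.Random(seed)
    n = len(obs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [obs[i] for i in idx]
        if max(xs) == min(xs):
            continue  # skip a resample in which the slope is undefined
        slopes.append(ols_slope(xs, [pred[i] for i in idx]))
    slopes.sort()
    lower = slopes[int(round(alpha * n_boot))]
    upper = slopes[int(round((1 - alpha) * n_boot)) - 1]
    return lower, upper


def slope_is_equivalent(obs, pred, tolerance=0.10, **kwargs):
    """True if the bootstrap slope interval lies within 1 +/- tolerance."""
    lower, upper = bootstrap_slope_interval(obs, pred, **kwargs)
    return (1 - tolerance) <= lower and upper <= (1 + tolerance)


# Hypothetical heights: predictions that shrink towards the mean fail the test.
observed = [float(h) for h in range(5, 41)]
shrunk = [22.5 + 0.5 * (h - 22.5) for h in observed]
print(slope_is_equivalent(observed, shrunk))
```

A model that compresses the range of heights, overpredicting short trees and underpredicting tall ones, can pass the intercept check while failing the slope check, which is the pattern the test is designed to expose.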

Aboveground biomass estimations

A useful application of h–d models is the estimation of aboveground biomass (AGB). Different studies have shown that allometric models for estimating AGB perform better when information on tree height is incorporated (Chave et al. 2014; Popkin 2015; Kearsley et al. 2017; Fayolle et al. 2018). Thus, both observed and predicted tree heights by DLA and classical methods were used to estimate the AGB of the forests. The generalised pantropical AGB model (equation [12]) (Chave et al. 2014) was used.

$$AGB_{est} = 0.0673 \times \left( {\rho d^{2} h} \right)^{0.976}$$
(12)

where \(AGB_{est}\) represents estimated aboveground biomass (kg); d is diameter at breast height (cm); h is tree height (m) and ρ is wood density (g cm–3). Wood density for each species was extracted from the global wood density database (Chave et al. 2009; Zanne et al. 2009). For unidentified species, an average of 0.5 g cm−3 was used. A similar average was used by Ogana and Ogana (2019) in the same region. Reyes et al. (1992) also used 0.5 g cm−3 for the wood density of tropical African species. The global wood density database and the AGB model (equation [12]) have been implemented in the BIOMASS package (Rejou-Mechain et al. 2017) and were accessed with the “wdData” and “computeAGB” functions of that package in R. Note that the AGB values reported in this study are in megagrams (Mg), the conventional unit of AGB (Chave et al. 2014).

The observed AGB was calculated by substituting the density, and the measured diameters and heights into Eq. (12). The predicted AGB was obtained from the density, measured diameters and predicted heights by the classical methods (NLS and NLME) and DLA. Root mean square error (RMSE), critical error (Ecrit) and mean relative error (MRE) were used to assess the adequacy of the models for estimating AGB. A plot of relative error (i.e., predicted AGB minus observed AGB, divided by the observed AGB, in %) was also used to illustrate the bias in predicted AGB.
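Equation (12) and the relative error can be written as a short Python sketch. The function names are hypothetical, the example tree is invented for illustration, and the division by 1000 simply converts the kilograms given by Eq. (12) to megagrams.

```python
def agb_mg(d_cm, h_m, wood_density):
    """Eq. (12): aboveground biomass of one tree, converted from kg to Mg."""
    agb_kg = 0.0673 * (wood_density * d_cm ** 2 * h_m) ** 0.976
    return agb_kg / 1000.0


def relative_error_percent(agb_pred, agb_obs):
    """Relative error (%): (predicted AGB - observed AGB) / observed AGB."""
    return 100.0 * (agb_pred - agb_obs) / agb_obs


# Hypothetical tree: d = 30 cm, measured h = 25 m, predicted h = 27 m,
# wood density 0.5 g cm-3 (the default used for unidentified species).
observed_agb = agb_mg(30.0, 25.0, 0.5)
predicted_agb = agb_mg(30.0, 27.0, 0.5)
print(round(relative_error_percent(predicted_agb, observed_agb), 2))
```

Because d and ρ are identical in both calls, the relative error depends only on the ratio of predicted to observed height raised to the power 0.976, so a height error passes almost proportionally into the AGB estimate.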

Results

Height-diameter (h–d) models

The estimated parameters of Eqs. (2) and (3) fitted with NLS for the species groups (i.e., SG1, SG2 and SG3) are presented in Tables 2, 3 and 4. The tables also give the parameter estimates and variance components of the fitted nonlinear mixed-effect (NLME) models, expressed as Eqs. (13) and (14), and the best of the DLA models. In the SG1 data, the parameters of the models fitted by NLS and NLME had low standard errors and were significantly different from zero (p < 0.05), except for Eq. (14), where \({a}_{0}\) was not significant (Table 2). Similarly, in the SG2 data, parameters \({a}_{1}\) and \({a}_{2}\) were not significant in Eqs. (13) and (14) (Table 3). However, all parameters in the models were significant for the SG3 data set.

$$h = 1.3 + \left( {b_{0} + u_{j} } \right)\left( {1 - e^{{ - b_{1} d}} } \right)$$
(13)
$$h = 1.3 + \left( {\left[ {a_{0} + u_{j} } \right] + a_{1} Dg + \left[ {a_{2} + u_{j} } \right]N} \right)\left( {1 - e^{{ - b_{1} d}} } \right)$$
(14)
Table 2 Species group 1: Information on parameters of models, root mean square error (RMSE), mean relative error (MRE), mean absolute percentage error (MAPE), critical error (Ecrit), Bayesian information criterion (BIC) and relative rank sum (∑R)
Table 3 Species group 2: Information on parameters of models, root mean square error (RMSE), mean relative error (MRE), mean absolute percentage error (MAPE), critical error (Ecrit), Bayesian information criterion (BIC) and relative rank sum (∑R)
Table 4 Species group 3: Information on parameters of models, root mean square error (RMSE), mean relative error (MRE), mean absolute percentage error (MAPE), critical error (Ecrit), Bayesian information criterion (BIC) and relative rank sum (∑R)

The evaluation statistics (RMSE, MRE, MAPE, Ecrit and BIC) showed that the DLA models outperformed the models fitted by NLS and NLME for the three species groups (Tables 2, 3 and 4). The DLA models had the smallest statistics and the lowest relative ranks (i.e., 1.00) across the five indices for the species groups. The optimal numbers of hidden layers and neurons for the DLA models were: 100 neurons in six hidden layers for SG1, 100 neurons in nine hidden layers for SG2, and 100 neurons in seven hidden layers for SG3. In these DLA models, the input variables were diameter, quadratic mean diameter and number of trees per ha. Thus, based on the relative rank sum, the models ranked from best to worst as: DLA models > NLME models > NLS models.

The relationships between the observed (x-axis) and predicted (y-axis) tree heights for the best three models, compared with the 1:1 line for each species-group, are shown in Fig. 2. As seen in the graph, the DLA models (100 neurons in six hidden layers for SG1, 100 neurons in nine hidden layers for SG2 and 100 neurons in seven hidden layers for SG3) produced a tighter cluster of measured and predicted values along the main diagonal (i.e., the 1:1 line) than those of NLS and NLME. Furthermore, the graphs of residuals against predicted tree heights did not show any meaningful heteroscedasticity across the three species groups (Fig. 3).

Fig. 2 Relationship between observed (x-axis) and predicted (y-axis) tree height by the three best models for each species group (SG1, SG2 and SG3)

Fig. 3 Residual plots of the three best models for each species group (SG1, SG2 and SG3)

The results from the equivalence test using the validation data showed that, for all models developed by NLS, NLME and DLA, the null hypothesis (H0) of dissimilarity for the intercept (\({b}_{0}\)) was rejected, as the bootstrap intercept (\({b}_{0}\)) lies inside the equivalence region (\(\overline{y }\pm 10\%\)) (Table 5). The null hypothesis of dissimilarity for the slope (\({b}_{1}\)) was rejected only for the DLA models (100 neurons in six hidden layers for SG1, 100 neurons in nine hidden layers for SG2 and 100 neurons in seven hidden layers for SG3), for which the bootstrap slope (\({b}_{1}\)) lies within the equivalence region \(1\pm 10\%\). For the NLS and NLME models, the null hypothesis for the slope (\({b}_{1}\)) was not rejected in any of the three species groups. Since rejection of H0 implies acceptance of the prediction of tree heights, the DLA models were selected for the tropical rain forest ecosystems.

Aboveground biomass estimation

Aboveground biomass (Mg) estimations using tree heights predicted by the NLS, NLME and DLA models were assessed by the root mean square error (RMSE), the mean relative error (MRE) and the critical error (Ecrit) (Table 6). The results show that substituting tree heights predicted by DLA into the AGB model, Eq. (12), yielded the smallest RMSE (0.1931 Mg), MRE (0.0353) and critical error (0.4511 Mg) values. This represents a reduction of more than 30% in the indices relative to NLS and NLME. The graph of relative error (%) also shows that minimal error was introduced into the estimation of AGB using heights predicted by DLA compared with those of the NLS and NLME models (Fig. 4). The DLA produced a nearly perfectly smooth spline regression with little tendency toward overestimation or underestimation of aboveground biomass, whilst those of NLS and NLME were more irregular.

Table 5 Equivalence tests of the models (equations [2], [3], [13], [14] and DLA) using the 15% validation data set from the species-group (SG1, SG2 and SG3)
Table 6 The root mean square error (RMSE), mean relative error (MRE) and critical error (Ecrit) of models based on aboveground biomass (AGB, Mg) estimation
Fig. 4 Relative error (%) in the predicted AGB from the five h–d models; background and black lines represent data-point density and a spline regression data point, respectively

Discussion

This research developed models for predicting tree heights in the complex rain forest ecosystems of Nigeria using classical methods (nonlinear least squares and nonlinear mixed-effects) and a robust AI technique, i.e., a deep learning algorithm (DLA), with a view to improving aboveground biomass estimations. The DLA models produced the smallest evaluation statistics and, as such, were more suitable for predicting tree heights in complex tropical rain forests. A similar observation was reported by Ercanli (2020a), who applied the DLA technique to predict tree heights of even-aged pure Anatolian Crimean pine in Turkey. The author found the DLA model with 100 neurons in 9 hidden layers to be the best for predicting tree heights compared with nonlinear regression and nonlinear mixed-effect models. Similarly, Ercanli (2020b) observed that a DLA model with 100 neurons in 8 hidden layers produced the best height predictions in even-aged pure Turkish pine. In the case of the complex tropical rain forest ecosystems, DLA with 100 neurons in six hidden layers was the most accurate for predicting tree heights in SG1, which contains more than 60 tree species. For SG2 (25 tree species) and SG3 (23 tree species), 100 neurons in nine hidden layers and 100 neurons in seven hidden layers, respectively, produced the best predictions of tree height.

The DLA models trained for the tropical rain forests resulted in more than 20% and 50% reductions in the RMSE and BIC values, respectively, relative to the NLS and NLME models across the species groups. As a rule of thumb, two models are considered similar only if ΔBIC ≤ 2 (Gorgoso-Varela et al. 2019). In addition, Temesgen et al. (2014) noted that the extension of a model is only necessary if the difference in RMSE is > 5%. Besides the evaluation statistics, only for the DLA models was the null hypothesis (H0) of dissimilarity for both the intercept (\({b}_{0}\)) and slope (\({b}_{1}\)) parameters rejected. The performance of the DLA models in predicting tree heights could be attributed to the complex network of neurons with different numbers of hidden layers. The DLA models are multi-layered ANNs with at least 3 hidden layers and hundreds to thousands of neurons (Ercanli 2020a). This is the first attempt to apply DLA techniques to model height-diameter relationships in complex tropical rain forests. Although Hamidi et al. (2021) used two ANNs, i.e., multilayer perceptron (MLP) and radial basis function (RBF), to model the height-diameter relationship and other dendrometric variables in the complex Hyrcanian forests of northern Iran, those forests contain fewer species than tropical rain forests. Moreover, the MLP and RBF have simpler network structures than the DLA models. Ercanli (2020a) also reported better performance with DLA models compared with ANNs in pure pine stands.

Bayat et al. (2020) used ANNs and an adaptive neuro-fuzzy inference system (ANFIS) to provide better estimates of tree heights in uneven-aged, mixed stands in Iran compared with regression analysis. A similar observation was reported by Vieira et al. (2018) for eucalyptus species. Özçelik et al. (2013) also showed that the use of ANNs improved height prediction of Crimean juniper. The ANN model resulted in a 20% reduction in RMSE compared with 13% by NLME. In addition, they noted that using ANNs is more advantageous than NLME because no height measurements are required for its application. In contrast, prior information is needed for mixed-effect model calibrations. Saudi et al. (2016) also asserted that random parameters in NLME may not be applicable for most prediction purposes unless calibration data are readily available. Data availability remains a limiting factor in complex tropical rain forests.

One important limitation of artificial intelligence is model transferability to other users (Hamidi et al. 2021). To ensure efficient transferability, the R syntax files of the DLA models are provided for the three species groups as downloadable links via Google Drive (SG1: https://drive.google.com/file/d/1faIwy3ndBBCm39GNpxxKG2wXY_UqiT0E/view?usp=sharing; SG2: https://drive.google.com/file/d/13p9yW36_73M6U0PY42cxqWwFKNd5MwOU/view?usp=sharing; SG3: https://drive.google.com/file/d/1-bgIOsP8o25_HL-d6m2GpxNZ5tNMKwh5/view?usp=sharing). A step-by-step guide for loading the R syntax files of the DLA models in R for tree height prediction can be found in the appendix of Ercanli (2020b). This ensures accessibility so that forest practitioners can use the predicted heights to estimate other dendrometric variables such as tree biomass and volume.

Estimation of the aboveground biomass of forest ecosystems is relevant, especially in the context of climate change. Accurate tree height predictions are required to improve AGB estimation (Kearsley et al. 2017). Using tree heights predicted by DLA in AGB equations resulted in a 30% reduction in the root mean square error, mean relative error and critical error. This implies that the amount of error introduced into the estimation of aboveground biomass is small. In contrast, errors produced by NLS and NLME in predicting tree heights of complex tropical rain forests are carried over into AGB estimations. Because tree diameters and wood density are fixed variables, i.e., the same for DLA, NLS and NLME, tree heights are the only source of variability. Several studies (Chave et al. 2014; Popkin 2015; Kearsley et al. 2017; Fayolle et al. 2018) have supported the use of local height-diameter models in generalised pan-tropical AGB models to minimise error in biomass estimations. Kearsley et al. (2017) quantified the size of the error from using heights predicted by pan-tropical height-diameter relationships for aboveground biomass estimation in the central Congo Basin. They reported a significant overestimation of tree heights, which resulted in a significant overestimation of AGB.
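The way a height error carries over into AGB can be sketched numerically. The example below assumes the pan-tropical equation of Chave et al. (2014), AGB = 0.0673 (ρD²H)^0.976, with wood density ρ in g cm⁻³, diameter D in cm, height H in m and AGB in kg; this section does not state which AGB equation was applied, so the choice of equation is an assumption made for illustration, and the function names and tree values are hypothetical.

```python
def agb_chave2014(wood_density, dbh_cm, height_m):
    """AGB (kg) from the pan-tropical equation of Chave et al. (2014).

    Assumption: this equation is used for illustration only; the study
    may have applied a different AGB model.
    """
    return 0.0673 * (wood_density * dbh_cm ** 2 * height_m) ** 0.976


def relative_agb_error(wood_density, dbh_cm, height_obs, height_pred):
    """Relative AGB error caused only by replacing observed with predicted height."""
    agb_obs = agb_chave2014(wood_density, dbh_cm, height_obs)
    agb_pred = agb_chave2014(wood_density, dbh_cm, height_pred)
    return (agb_pred - agb_obs) / agb_obs


if __name__ == "__main__":
    # Hypothetical tree: wood density 0.6 g/cm^3, dbh 30 cm, true height 20 m,
    # height overestimated by 10% (22 m).
    error = relative_agb_error(0.6, 30.0, 20.0, 22.0)
    print(f"Relative AGB error: {100 * error:.1f}%")
```

Under this equation, diameter and wood density cancel out of the relative error, which reduces to (H_pred/H_obs)^0.976 − 1. A 10% overestimate of height therefore translates into an AGB overestimate of just under 10% for any tree, consistent with height being the only source of variability when diameter and wood density are held fixed.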

Besides the estimation of aboveground biomass, tree height predictions by DLA models could be applied to quantify the volumes of important timber species of the region. Volume equations developed for these species in the tropical rain forest by Akindele and LeMay (2006) require tree height as an input variable. Heights predicted by the DLA models will improve the accuracy of estimated tree volumes, which could be scaled up to stand level.

Conclusions

The complexity of tropical rain forest ecosystems requires innovative techniques to improve the prediction of important dendrometric variables such as tree heights for aboveground biomass estimation. This study has shown the relevance of artificial intelligence (e.g., deep learning algorithm [DLA]) in addressing the problem of modelling tree height in complex tropical rain forest ecosystems. The DLA models outperformed the classical modelling techniques (nonlinear least squares and nonlinear mixed-effects) in predicting tree heights in these ecosystems and consequently minimised the amount of error in aboveground biomass estimation. The input variables for the DLA models were diameter at breast height, quadratic mean diameter and number of trees per ha. To facilitate the application of the DLA models by other users, links are provided from which the models can be downloaded and reused for tree height prediction.