1 Introduction

Mangroves are forests found along tropical and subtropical coastlines between 30° south and 30° north of the equator (FAO 2007). In Africa, mangroves occur on both the western and eastern coasts. On the eastern coast of Africa, 14 mangrove species grow naturally, and 10 of these are found in Tanzania. Avicennia marina (Forssk.) Vierh, Sonneratia alba J. Smith, and Rhizophora mucronata Lam. are the three most dominant mangrove species in Tanzania (MNRT 1991; Luoga et al. 2004; Nshare et al. 2007).

Mangroves provide a range of goods and services of biological and economic importance. In addition, mangroves store large amounts of carbon per unit area (Donato et al. 2011; Murray et al. 2011) and are therefore also important for climate change mitigation (UNEP 2014). Although mangroves in many countries are legally protected, for example in Tanzania, Kenya, and South Africa (FAO 2007), mangroves suffer from deforestation and forest degradation (Wang et al. 2003).

A climate change mitigation strategy under the United Nations Framework Convention on Climate Change (UNFCCC), aiming at Reducing Emissions from Deforestation and Forest Degradation (REDD+), offers an opportunity for conservation and management of mangroves. Successful implementation of REDD+ relies on the capabilities of participating countries to routinely and reliably monitor changes in carbon stocks and associated greenhouse gas emissions through the establishment of a Monitoring, Reporting and Verification (MRV) system (Hewson et al. 2013). In line with this, Tanzania has, under the National Forestry Resources Monitoring and Assessment (NAFORMA) program, established a national grid of permanent sample plots, which will be monitored for biomass and carbon over time (URT 2010). For Tanzania to be able to report carbon stocks at tier 2 or 3 (IPCC 2003), the development of country-specific biomass models is therefore imperative.

Biomass models, based on allometric theory, relating easily measurable tree variables such as diameter at breast height (dbh) and total tree height (ht) to biomass, are considered to be the most efficient tools for tree level biomass prediction (Brown 1997; IPCC 2007; Chave et al. 2014). The tree variables used as model input are obtained through forest inventories (Husch et al. 2003; URT 2010). Development of biomass models requires destructive sampling of trees. Above- and belowground fresh weights of the trees are measured in the field, and dry weights are subsequently determined by using the dry to fresh weight ratio (DF ratio) derived from oven-dried subsamples. Aboveground biomass usually refers to stem, branch, and foliage, while belowground biomass refers to all live roots down to 2 mm in diameter (IPCC 2006).

Many models for prediction of both above- and belowground biomass of mangrove forests have been developed previously. A review by Komiyama et al. (2008) identified 13 species-specific and two common (i.e., multi-species) models for prediction of aboveground biomass of mangroves, while nine species-specific models and one common model were identified for belowground biomass. Additional studies on mangroves that developed models for prediction of biomass not present in this review also exist (e.g., Kairo et al. 2009; Kauffman and Donato 2012; Sitoe et al. 2014). With the exception of the models developed by Kairo et al. (2009) in Kenya and Sitoe et al. (2014) in Mozambique, most of the models have been developed for mangroves in Asia. The relatively few existing models for belowground biomass may be associated with the labor-intensive nature of sampling belowground biomass for mangrove tree species (Njana et al. 2015).

No biomass models have been developed for mangroves of Tanzania, yet numerous models have been developed based on data from other regions and some from neighboring countries in Africa. If these models are applied to quantify biomass of mangroves in Tanzania, they would be used beyond their spatial validity. Since mangrove trees may respond differently to different environmental conditions, this could also result in morphological and architectural differences between trees originating from different sites. Furthermore, it is also important that models are used within valid ranges in terms of species and tree size (dbh and ht). Models calibrated on data from other regions are more likely to violate these requirements. For example, the aboveground model by Chave et al. (2005) is based on mangrove data from a limited geographical area (French Guiana and Guadeloupe); thus, the model does not represent mangroves found in Africa and it does not include any dominant species found in Africa. Similarly, the aboveground biomass models from mangroves in Kenya and Mozambique are both based on data from one site, and they have limited sample sizes (e.g., n = 5, Kairo et al. 2009; n = 31 for six species, Sitoe et al. 2014) and tree size ranges (dbh up to 42 cm, Sitoe et al. 2014). Trees with dbh > 40 cm are likely to be found in Tanzania (e.g., Mattia 1997). Therefore, if such models are applied in Tanzania, they are likely to provide biased estimates since the tree sizes are beyond the size range of the model data.

Even though models should in principle not be used outside their geographical area and tree size ranges, this is sometimes necessary due to lack of local models. However, if no suitable data exist for testing, the user remains unaware of the nature of the prediction errors. Thus, model tests on real data are preferable, but this is of course seldom possible since suitable data would mostly be collected for calibrating local models, which renders the use of the alien model unnecessary. However, Njana et al. (2015) tested selected existing belowground biomass models on relevant data from Tanzania, both common (Komiyama et al. 2005) and species-specific (Tamai et al. 1986; Comley and McGuinness 2005; Kairo et al. 2009). The results revealed large prediction errors for both the common (26–63 %) and species-specific (55–63 %) models. These results support the development of new biomass models for Tanzanian mangrove forests.

The main objective of this study was therefore to develop tree biomass prediction models for the dominant mangrove species in Tanzania. Specifically, the study aimed to (1) provide basic information on the distribution of biomass between tree components and the root-to-shoot ratio, (2) develop both common and species-specific models for above- and belowground biomass, (3) develop models for aboveground biomass components (stem, branch, leaf, and twig), and (4) assess the predictive accuracy of the existing models and of those developed here in predicting the aboveground biomass of mangroves. A mixed modelling approach was applied.

2 Materials and methods

2.1 Study area

In Tanzania, mangroves grow naturally along the coastline between the borders with Kenya in the north and Mozambique in the south. Mangroves cover about 158,100 ha of Tanzania (MNRT 2015) and include 10 different species, namely A. marina, Bruguiera gymnorhiza, Ceriops tagal (Perr.) C. B. Rob., Heritiera littoralis Dryand., Lumnitzera racemosa Willd., Pemphis acidula J.R. & G. Forst., R. mucronata, S. alba, Xylocarpus granatum Koen., and Xylocarpus moluccensis (Lamk.) Roem. These species are also found in Kenya and Mozambique (Tamooh et al. 2008; Fatoyinbo et al. 2008; Mohamed et al. 2009). All mangroves in Tanzania are declared as forest reserves and managed by the Tanzania Forest Service Agency under the Ministry of Natural Resources and Tourism (URT 2002). The study was carried out at four sites: Pangani, Bagamoyo, Rufiji, and Lindi-Mtwara (Table 1), covering the northern, middle, and southern parts of the coastal belt of Tanzania.

Table 1 Site, location, dominant soil type, temperature, and precipitation for the study sites

2.2 Tree sampling and measurement procedures

Site conditions in mangrove forests usually vary perpendicular to the shorelines of the sea/rivers. To cover as much variation as possible, we established nested sample plots of 2- and 10-m radii along 37 transects running from the shorelines across the entire extension of the mangrove vegetation. For each transect, the first plot was located close to the shoreline, while the remaining plots were located at distances varying from 150 to 250 m depending on the total extension of the mangroves. For some transects, it was not possible to establish all plots because of impenetrable mangrove vegetation or inaccessibility due to rivers/streams. Therefore, the number of plots sampled within transects varied from one to four. In total, we measured 120 plots: 15 plots each in Pangani and Lindi-Mtwara and 45 plots each in Bagamoyo and Rufiji (Njana et al. 2015).

Within 2-m radius of each plot, we measured dbh for all trees with dbh ≥ 1 cm and total tree height ≥ 2 m, while within 10-m radius, we measured dbh for all trees with dbh ≥ 5 cm. For A. marina and S. alba trees, dbh was measured at 1.3 m above soil surface, while for R. mucronata trees, dbh was measured at 0.3 m above the highest stilt root.

In each plot, one tree was selected subjectively for destructive sampling, while ensuring an adequate representation of all three species across sites and of the diameter ranges found in the sample plots. In total, 120 trees were sampled for aboveground biomass (40 for each of the three species), and among these, 30 were sampled for belowground biomass (10 for each of the three species). Among the sites, 15 trees (five for each species) were sampled in each of Pangani and Lindi-Mtwara, while 45 trees (15 for each species) were sampled in each of Bagamoyo and Rufiji (Njana et al. 2015). One S. alba sample tree had hollow and sandy sections, and since our focus was to develop models predicting biomass of healthy mangrove trees, this tree was excluded during modelling.

Before the sample trees were cut, we measured dbh and basal diameter (bd, diameter 15 cm above ground level for A. marina and S. alba or immediately above the highest stilt root for R. mucronata) using a diameter tape. We also measured ht, crown diameter (crd), and bole height (bht, height from ground level to first branch) (Fig. 1). Total tree height and bole height were determined using a Suunto hypsometer. Crown length (crl) was determined as the difference between ht and bht.

Fig. 1 Schematic sketch of A. marina and S. alba trees (upper panel) and R. mucronata trees (lower panel) showing different tree components and variables. Note: AGB = total aboveground biomass, BGB = total belowground biomass

Three R. mucronata sample trees were multi-stemmed. For these trees, the diameters of the individual stems (\( \mathrm{dbh}_i \)) were combined into a surrogate for dbh, \( \mathrm{dbh}=\sqrt{\sum_i \mathrm{dbh}_i^2} \) (e.g., Zhou et al. 2007), while the heights of the individual stems were used to determine a basal area-weighted mean height that served as a surrogate for ht. Table 2 summarizes statistics for plot (i.e., for trees with dbh ≥ 5 cm) and sample tree variables.
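The two surrogates for multi-stemmed trees can be illustrated with a minimal sketch. The function names and the three-stemmed example tree are hypothetical and are not study data; the sketch assumes that each stem's basal area is computed as π/4 × dbh², which is the usual definition but is not spelled out in the text.

```python
import math

def surrogate_dbh(stem_dbhs_cm):
    """Quadratic sum of stem diameters: sqrt(sum(dbh_i^2))."""
    return math.sqrt(sum(d ** 2 for d in stem_dbhs_cm))

def basal_area_weighted_height(stem_dbhs_cm, stem_heights_m):
    """Mean stem height weighted by each stem's basal area (pi/4 * dbh_i^2)."""
    basal_areas = [math.pi / 4.0 * d ** 2 for d in stem_dbhs_cm]
    weighted_sum = sum(ba * h for ba, h in zip(basal_areas, stem_heights_m))
    return weighted_sum / sum(basal_areas)

# Hypothetical three-stemmed tree (illustrative values only)
stem_dbhs = [3.0, 4.0, 12.0]      # cm
stem_heights = [5.0, 6.0, 9.0]    # m

dbh = surrogate_dbh(stem_dbhs)
ht = basal_area_weighted_height(stem_dbhs, stem_heights)
print(f"surrogate dbh = {dbh:.1f} cm, surrogate ht = {ht:.2f} m")
```

A useful property of the quadratic sum is that the surrogate dbh has the same basal area as all individual stems combined, so the largest stems dominate both surrogates.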

Table 2 Statistical summary of plot variables, sample tree variables, and tree biomass for different tree components

Using a chainsaw, trees were cut 15 cm above ground level for A. marina and S. alba, while R. mucronata trees were cut immediately above the highest stilt root (URT 2010) (see Fig. 1). After felling, the aboveground part of each tree with dbh ≥ 15 cm was partitioned into (i) stem, (ii) branch (≥5 cm), and (iii) twig and leaf, and among these, 10 trees for each of the three species were further partitioned into twig and leaf. Stems and branches were cross-cut into billets and their fresh weights determined using a spring balance measuring weight to the nearest 0.1 kg. Fresh weights of small trees (<5 cm) were determined using a digital balance. Sawdust from the chainsaw was not included in the fresh weight. For the large trees partitioned into twig and leaf, fresh weights were determined separately for each component. For all other trees, the aggregate fresh weights, i.e., twig plus leaf, were recorded.

For determining belowground biomass of A. marina and S. alba trees, we first excavated the root crown and then selected two main cable roots from the root crown and two side cable roots from each of the two main cable roots, including their respective pneumatophores, for full excavation. The root selection included one small and one large main cable root and one small and one large side cable root, so as to cover as wide a range of root sizes as possible. Fresh weights as well as root basal diameters of all excavated roots were determined. These measurements were later used to develop side and main cable root regression models, which were applied to predict biomass of roots not excavated (for details on excavation and biomass determination procedures, see Njana et al. 2015). For R. mucronata, fresh weights were determined through harvesting of all aboveground stilt roots followed by complete excavation of all belowground stilt roots. Finally, the total belowground fresh weight was determined by summing the weights of the root crown and the above- and belowground stilt roots.

For each tree, three subsamples were extracted from the stem, two from the branches, and two from the twigs. The weight of the subsamples for the aboveground tree components ranged from 0.05 to 4.5 kg. All subsamples were extracted at random locations except for the stem subsamples, which were extracted at 0, 40, and 70 % of the total tree height. The fresh weight of all subsamples was determined immediately after extraction using a digital balance (to the nearest 0.01 g). This was followed by labelling and packing for further measurements in laboratory. In total, the numbers of stem, branch, and twig subsamples were 119, 50, and 72, respectively, for A. marina; 118, 39, and 68, respectively, for S. alba; and 117, 46, and 72, respectively, for R. mucronata. The numbers of root crown and root subsamples were 10 and 19, respectively, both for A. marina and S. alba. For R. mucronata, 7 subsamples were extracted from root crown, 17 from aboveground stilt roots, and 19 from belowground stilt roots (see Njana et al. 2015).

2.3 Laboratory procedures and dry weight determination

In the laboratory, subsamples were oven-dried to constant weight at 105 °C and their dry weight determined using a digital balance. The DF ratio of each subsample (unitless) was determined as oven dry weight (kg) per fresh weight (kg). Exploratory analysis of covariance (ANCOVA) with dbh as a covariate revealed that the DF ratio varied significantly between aboveground tree components and with tree dbh (p < 0.05). In general, the DF ratio varied from 0.28 to 0.66 for A. marina, 0.22 to 0.69 for S. alba, and 0.33 to 0.71 for R. mucronata. Since only 10 trees for each of the three species among the larger trees (dbh ≥ 15 cm) were partitioned into twig and leaf, we first computed a species-specific twig to leaf ratio from these 10 observations per species. This ratio was then used to partition the aggregate twig and leaf component into twig and leaf for the trees not partitioned to that level. Then, total tree aboveground biomass was calculated as the sum of the products of tree- and component-specific fresh weights and DF ratios:

$$ {\mathrm{AGB}}_h={\displaystyle \sum_{i_s=1}^{n_s}\left({\mathrm{FW}}_{h{i}_s}\times {\mathrm{DF}}_{h_s}\right)}+{\displaystyle \sum_{i_b=1}^{n_b}\left({\mathrm{FW}}_{h{i}_b}\times {\mathrm{DF}}_{h_b}\right)}+{\displaystyle \sum_{i_t=1}^{n_t}\left({\mathrm{FW}}_{h{i}_t}\times {\mathrm{DF}}_{h_t}\right)}+{\displaystyle \sum_{i_l=1}^{n_l}\left({\mathrm{FW}}_{h{i}_l}\times {\mathrm{DF}}_{h_l}\right)} $$

where \( {\mathrm{AGB}}_h \) = observed total tree aboveground dry weight (kg) of the hth tree, n = total number of billets/twig bundles/leaf weights for a given aboveground tree component, s = stem, b = branch, t = twig, l = leaf, i = ith subsection, \( {\mathrm{FW}}_{h{i}_s} \), \( {\mathrm{FW}}_{h{i}_b} \), \( {\mathrm{FW}}_{h{i}_t} \), and \( {\mathrm{FW}}_{h{i}_l} \) are stem, branch, twig, and leaf fresh weights (kg), respectively, and \( {\mathrm{DF}}_{h_s} \), \( {\mathrm{DF}}_{h_b} \), \( {\mathrm{DF}}_{h_t} \), and \( {\mathrm{DF}}_{h_l} \) are stem, branch, twig, and leaf DF ratios, respectively.
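The aboveground biomass calculation defined above can be sketched as follows. The dictionary layout, function names, fresh weights, and DF ratios are all hypothetical and chosen only so that the arithmetic is easy to follow; they are not measurements from this study.

```python
def component_dry_weight(fresh_weights_kg, df_ratio):
    """Sum of (fresh weight x DF ratio) over all subsections of one component."""
    return sum(fw * df_ratio for fw in fresh_weights_kg)

def aboveground_biomass(fresh_by_component, df_by_component):
    """AGB_h: dry weight summed over stem, branch, twig, and leaf for one tree."""
    return sum(
        component_dry_weight(fresh_by_component[c], df_by_component[c])
        for c in fresh_by_component
    )

# Hypothetical tree: fresh weights (kg) per billet / bundle and one
# tree- and component-specific DF ratio per component (illustrative only)
fresh = {
    "stem": [20.0, 15.0, 5.0],
    "branch": [10.0, 10.0],
    "twig": [4.0],
    "leaf": [2.0],
}
df = {"stem": 0.5, "branch": 0.5, "twig": 0.5, "leaf": 0.25}

agb = aboveground_biomass(fresh, df)
print(f"AGB = {agb:.1f} kg")  # 20 + 10 + 2 + 0.5 = 32.5 kg
```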

Belowground dry weight determination for A. marina and S. alba involved conversion of the fresh weight of excavated root components using species-, tree-, and component-specific DF ratios. From the dry weight data of the excavated sample roots, regression models were developed and used to predict the dry weights of unexcavated roots (for details, see Njana et al. 2015). Therefore, total root dry weights comprised excavated and unexcavated (i.e., predicted) root dry weights. Total tree belowground dry weight, i.e., belowground biomass, was derived as the sum of root and root crown dry weight. For R. mucronata, tree belowground dry weight was obtained by converting total tree fresh weight to dry weight using tree-specific DF ratios, because for this species, tree belowground fresh weight was not separated into root components. A statistical summary of sample tree dry weights is presented in Table 2.

2.4 Model specification

Model specification involves selection of a functional form as well as selection of predictor variables. Initially, we tested various functional forms, of which the power function performed best. Power functions have been widely used to model biomass of mangrove trees (e.g., Tamai et al. 1986; Komiyama et al. 2005; Kairo et al. 2009; Ray et al. 2011; Patil et al. 2014). In this study, two variants of power functions with an additive error term (\( {\varepsilon}_i \)) were considered (model forms 1 and 2):

$$ {B}_i={\beta}_0\times {\left({\mathrm{dbh}}_i\right)}^{\beta_1}+{\varepsilon}_i $$
(Model form 1)
$$ {B}_i={\beta}_0\times {\left({\mathrm{dbh}}_i\right)}^{\beta_1}{\left({\mathrm{ht}}_i\right)}^{\beta_2}+{\varepsilon}_i $$
(Model form 2)

where i represents the ith observation, and \( B_i \) represents aboveground biomass, leaf biomass, twig biomass, branch biomass, or stem biomass. Model form 1 represents biomass as a function of \( {\mathrm{dbh}}_i \), while model form 2 represents biomass as a function of both \( {\mathrm{dbh}}_i \) and \( {\mathrm{ht}}_i \); the betas (β) are model parameters.
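As a minimal sketch, the deterministic parts of model forms 1 and 2 can be written as two small functions. The parameter values below are hypothetical and chosen for easy arithmetic; they are not the estimates reported in Table 4.

```python
def model_form_1(dbh, b0, b1):
    """Model form 1: B = b0 * dbh^b1 (error term omitted)."""
    return b0 * dbh ** b1

def model_form_2(dbh, ht, b0, b1, b2):
    """Model form 2: B = b0 * dbh^b1 * ht^b2 (error term omitted)."""
    return b0 * dbh ** b1 * ht ** b2

# Hypothetical parameters and tree (illustrative only)
b0, b1, b2 = 0.25, 2.0, 0.5
dbh_cm, ht_m = 10.0, 9.0

biomass_1 = model_form_1(dbh_cm, b0, b1)          # 0.25 * 100 = 25.0
biomass_2 = model_form_2(dbh_cm, ht_m, b0, b1, b2)  # 0.25 * 100 * 3 = 75.0
print(biomass_1, biomass_2)
```

The two calls share parameter values only for illustration; in practice each model form is fitted separately, so its parameters would differ.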

Diameter at breast height (\( {\mathrm{dbh}}_i \)) is highly correlated with biomass (\( B_i \)). However, \( {\mathrm{ht}}_i \) is also highly correlated with biomass and could be a useful variable in biomass models because trees reach their maximum height at an earlier stage than their maximum diameter. This means that models depending on dbh only may overpredict the biomass of large trees, because the biomass increase per unit increase in diameter is reduced when trees approach maximum height. Thus, \( {\mathrm{ht}}_i \) represents additional information not reflected by \( {\mathrm{dbh}}_i \) (e.g., Chave et al. 2005).

2.5 Nonlinear mixed effects (NLME) modelling

2.5.1 Nonlinear mixed effects models

Three important assumptions for regression modelling are normality, homoscedasticity (i.e., constant residual variance, which is violated if, for example, residual variance increases as a function of dbh), and independence of residuals. Results and conclusions based on regression analysis are only reliable if these assumptions are met (Ritz and Streibig 2008; Zuur et al. 2009). For biological data, however, such assumptions may be difficult to meet. Non-normal residuals, for example, may be due to outliers, while lack of independence of residuals may occur due to the structure of the data itself (Zuur et al. 2009). Non-normal and heteroscedastic residuals may be dealt with by transformation (Ritz and Streibig 2008; Zuur et al. 2009), although this changes the original scale and introduces bias (O’Hara and Kotze 2010; Packard 2009).

NLME modelling is one way to confront challenges encountered in conventional regression approaches since it relaxes regression assumptions and takes into account the complex nature of biological data (Pinheiro and Bates 2000; Zuur et al. 2009). Within the mixed effects model framework, parameters may also be allowed to vary by grouping variable(s) (i.e., random variable(s)) (Ritz and Streibig 2008). NLME models may generally be expressed as follows (Lindstrom and Bates 1990; Vonesh and Chinchilli 1997; Pinheiro and Bates 1998):

$$ {y}_{ij}=f\left({x}_{ij},{\lambda}_j;\beta, {\alpha}_j\right)+{\varepsilon}_{ij} $$

where i = ith observation, j = jth random-effect variable, \( {y}_{ij} \) = response variable for observation i and random-effect variable j, \( {x}_{ij} \) = predictor variable for observation i and random-effect variable j, \( {\lambda}_j \) = random-effect variable for j, β = fixed effects parameters, \( {\alpha}_j \) = random effects parameters, and \( {\varepsilon}_{ij} \) = error term, which is assumed normally distributed with a mean of zero.

Our data originated from four different sites and comprised three different species, where one tree was destructively sampled from each sample plot spatially distributed along transects. Since our data structure is hierarchical and the biomass–dbh relationship is nonlinear (Fig. 2), tree biomass was modelled using the NLME modelling approach in order to preserve the original scale.

Fig. 2 Above- and belowground tree biomass over dbh across species and sites. Symbols black up-pointing triangle, gray up-pointing triangle, and white up-pointing triangle, respectively, represent A. marina, S. alba, and R. mucronata tree species, while black circle, white circle, gray plus sign, and black plus sign, respectively, represent trees from Pangani, Bagamoyo, Rufiji, and Lindi-Mtwara. Note: AGB = total aboveground biomass, BGB = total belowground biomass

Biomass models based on mixed effects modelling frameworks have also been developed previously (e.g., Moore 2010; Li et al. 2011; Xu et al. 2014). Mixed effects modelling provides a statistical capability where fixed (i.e., population average) and random effects (i.e., group specific) parameters may be estimated simultaneously (West et al. 2007). Under the mixed effects modelling framework, models including only fixed effects parameters may therefore be regarded as common or multi-species models, while those including random effects may be regarded as species-specific models.

2.5.2 Modelling procedures

Model development was carried out in R version 3.1.2 (R Core Team 2014) using the nlme function in the nlme package (Pinheiro et al. 2015). To decide which parameters to treat solely as fixed effects and which to treat as both fixed and random effects, we initially tested each parameter as both a fixed and a random effects parameter against the prospective random effects variables. Prospective random effects variables included species (j) and site (k). The influence of the random effects variable(s), individually or in combination, on a given parameter was evaluated using the Akaike information criterion (AIC). Accordingly, \( {\beta}_0 \) (model forms 1 and 2) and \( {\beta}_2 \) (model form 2) were treated solely as fixed effects parameters, while \( {\beta}_1 \) was treated as both a fixed and a random effects parameter. Model forms 1 and 2 were then re-specified to include a random effects parameter (\( {\alpha}_{jk} \)) (model forms 3 and 4):

$$ {B}_{ijk}={\beta}_0\times {\left({\mathrm{dbh}}_{ijk}\right)}^{\beta_1+{\alpha}_{jk}}+{\varepsilon}_{ijk} $$
(Model form 3)
$$ {B}_{ijk}={\beta}_0\times {\left({\mathrm{dbh}}_{ijk}\right)}^{\beta_1+{\alpha}_{jk}}{\left({\mathrm{ht}}_{ijk}\right)}^{\beta_2}+{\varepsilon}_{ijk} $$
(Model form 4)

Site did not result in significant random effects on \( {\beta}_1 \), so the corresponding parameter estimates are not reported. Three sets of biomass models were developed: (i) aboveground biomass models, (ii) belowground biomass models, and (iii) aboveground tree component (leaf, twig, branch, and stem) biomass models. Both model forms 3 and 4 were fitted for total aboveground biomass, while only model form 3 was fitted for belowground biomass and aboveground tree component biomass. Model form 4 was not considered for belowground and aboveground tree components due to the limited number of observations (Harrell 2001; Roxburgh et al. 2015).
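The way a random effect on the exponent turns a common model into species-specific models can be sketched as below. This is only an illustration of model forms 3 and 4, not a fitting routine: the function name, parameter values, and per-species deviations are hypothetical, and the species labels are placeholders rather than the study species.

```python
def predict_biomass(dbh, b0, b1, alpha=0.0, ht=None, b2=0.0):
    """Deterministic part of model forms 3 and 4.

    alpha is the random-effect deviation added to the dbh exponent;
    alpha = 0 gives the common (fixed effects) prediction. If ht is
    given, the height term of model form 4 is included.
    """
    prediction = b0 * dbh ** (b1 + alpha)
    if ht is not None:
        prediction *= ht ** b2
    return prediction

# Hypothetical fixed effects and species deviations (illustrative only)
b0, b1 = 0.25, 2.0
species_alpha = {"species_a": -0.1, "species_b": 0.1}

common = predict_biomass(10.0, b0, b1)
specific = {
    name: predict_biomass(10.0, b0, b1, alpha=a)
    for name, a in species_alpha.items()
}
print(common, specific)
```

Because the deviation acts on the exponent, the gap between species-specific and common predictions widens as dbh increases.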

During exploratory data analysis, we observed that the residual variances (\( {\sigma}^2\left({\varepsilon}_{ijk}\right) \)) were heteroscedastic. Consequently, residual variances were modelled as a function of dbh using the varPower function in R (Pinheiro and Bates 2000; Ritz and Streibig 2008; Zuur et al. 2009):

$$ {\sigma}^2\left({\varepsilon}_{ijk}\right)={\sigma}^2\times {\left|{\mathrm{dbh}}_{ijk}\right|}^{2\phi } $$

where ϕ = variance function coefficient. We initially also tested other functions in R (varExp, varIdent, varConstPower, and varComb). However, the varPower function appeared to be the best.
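The power variance function can be illustrated with a short sketch that evaluates the formula directly. It is a plain re-expression of the equation above, not a call to the R varPower function, and the values of σ and ϕ are hypothetical.

```python
import math

def residual_variance(dbh, sigma, phi):
    """Power variance function: sigma^2 * |dbh|^(2 * phi)."""
    return sigma ** 2 * abs(dbh) ** (2.0 * phi)

def standardized_residual(residual, dbh, sigma, phi):
    """Residual divided by its modelled standard deviation."""
    return residual / math.sqrt(residual_variance(dbh, sigma, phi))

# Hypothetical values (illustrative only)
sigma, phi = 0.5, 1.0

var_small = residual_variance(4.0, sigma, phi)   # 0.25 * 16 = 4.0
var_large = residual_variance(8.0, sigma, phi)   # 0.25 * 64 = 16.0
print(var_small, var_large)

# With phi = 0 the function reduces to a constant variance sigma^2
print(residual_variance(8.0, sigma, 0.0))
```

Dividing each residual by its modelled standard deviation puts residuals of small and large trees on a comparable scale, which is how the variance function addresses heteroscedasticity without transforming the response.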

The effects of the variance function were evaluated using AIC. The variance function is implicitly part of the mixed effects model but is not explicitly stated; therefore, the variance functions are not reported in the results (Smith et al. 2014; de Miguel et al. 2014). Since one tree was sampled from each plot and the distance between plots ranged from 150 to 250 m, observations between plots were considered spatially independent; thus, no correlation structure was assumed.

During the tests for random effects and variance functions, model parameters were estimated using maximum likelihood (ML), while restricted ML (REML) was used for the final models (Lindstrom and Bates 1990; Pinheiro and Bates 2000). The models were evaluated using root mean squared error (RMSE (%)) and mean prediction error (MPE (%)) (Chai and Draxler 2014; Walther and Moore 2005) as measures of goodness of fit, while model selection was done using AIC:

$$ \mathrm{RMSE}\;\left(\%\right)=\left(\frac{\sqrt{{\displaystyle \sum \left({e_{ijk}}^2\right)/n}}}{{\mathrm{MB}}_{\mathrm{obs}}}\right)\times 100 $$
$$ \mathrm{M}\mathrm{P}\mathrm{E}\;\left(\%\right)=\left(\frac{{\displaystyle \sum \left({e}_{ijk}\right)/n}}{{\mathrm{MB}}_{\mathrm{obs}}}\right)\times 100 $$
$$ \mathrm{A}\mathrm{I}\mathrm{C}=n\times \left( \ln \left(\frac{{\displaystyle \sum \left({e_{ijk}}^2\right)}}{n}\right)\right)+2\times \left(p+1\right)+C $$

where \( {e}_{ijk} \) = residuals, i.e., difference between predicted and observed tree biomass (kg), n = sample size, \( {\mathrm{MB}}_{\mathrm{obs}} \) = mean observed tree biomass (kg), ln = natural logarithm, p = number of parameters, and C = constant.
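The three criteria can be computed as in the following sketch. The residual is taken here as predicted minus observed, so that a positive MPE means overprediction; this sign convention is an assumption, since the text only states "difference between predicted and observed". The function name and the data are hypothetical.

```python
import math

def evaluate_model(observed, predicted, n_params, constant=0.0):
    """Return (RMSE %, MPE %, AIC) following the formulas above.

    Assumption: residual = predicted - observed.
    """
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n
    sum_sq = sum(e ** 2 for e in residuals)
    rmse_pct = math.sqrt(sum_sq / n) / mean_observed * 100.0
    mpe_pct = (sum(residuals) / n) / mean_observed * 100.0
    aic = n * math.log(sum_sq / n) + 2.0 * (n_params + 1) + constant
    return rmse_pct, mpe_pct, aic

# Hypothetical observed and predicted biomass (kg), illustrative only
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]

rmse_pct, mpe_pct, aic = evaluate_model(observed, predicted, n_params=2)
print(f"RMSE = {rmse_pct:.1f} %, MPE = {mpe_pct:.1f} %, AIC = {aic:.2f}")
```

Since the constant C is the same for all models evaluated on the same data, it cancels when AIC values are compared and can be left at zero for ranking.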

RMSE (%) represents a measure of accuracy and MPE (%) a measure of bias. A model with a lower RMSE (%) than the reference model was considered more accurate than the reference model, and vice versa. Similarly, MPE (%) values significantly different from zero implied biased aboveground biomass predictions, i.e., under- or overpredictions; otherwise, they implied unbiased aboveground biomass predictions. The commonly used model selection criterion \( R^2 \) was not considered since its use has been criticized (e.g., Johnson and Omland 2004; Sileshi 2014).

2.6 Evaluation of predictive accuracy of existing biomass models

Based on a literature review, relevant existing aboveground biomass models were selected and tested on our data to determine their predictive accuracy. The selected models ensured representation of various regions and included four common and eight species-specific biomass models (Table 3). RMSE (%), MPE (%), and AIC served as model evaluation criteria. After computation of these criteria, the existing models were ranked in descending order based on AIC. The existing models were ranked without stratification by model type or by predictor variables included, since AIC as a model selection criterion is capable of detecting such differences (Burnham et al. 2011).

Table 3 Existing aboveground biomass mangrove models selected for evaluation of prediction accuracy

3 Results

3.1 Distribution of biomass into different tree parts

The three mangrove species considered in this study stored between 49 % (R. mucronata) and 72 % (S. alba) of aboveground biomass in the stem, while the rest, in descending order, was stored in branch, twig, and leaf (Fig. 3). On average, about 41 % of the total tree biomass was stored in the root system (Fig. 4). Figures 3 and 4 show that S. alba had relatively higher stem biomass and higher root biomass compared to the other species. The root-to-shoot ratios for A. marina, S. alba, and R. mucronata were 0.38, 1.29, and 0.62, respectively, with an overall mean of 0.70. Generally, the root-to-shoot ratio showed a decreasing trend from lower to higher dbh classes.
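The reported root-to-shoot ratios can be related to the share of total biomass stored belowground through the identity BGB/(AGB + BGB) = r/(1 + r), where r = BGB/AGB. The sketch below applies this identity to the ratios quoted above; the function name is hypothetical. The identity is exact for a single tree or for a ratio of totals, and only approximate when ratios are averaged over trees.

```python
def belowground_fraction(root_to_shoot):
    """Share of total tree biomass belowground, given r = BGB / AGB."""
    return root_to_shoot / (1.0 + root_to_shoot)

# Root-to-shoot ratios as reported in the text
ratios = {
    "A. marina": 0.38,
    "S. alba": 1.29,
    "R. mucronata": 0.62,
    "overall": 0.70,
}

for name, r in ratios.items():
    print(f"{name}: {100.0 * belowground_fraction(r):.0f} % belowground")
```

For the overall ratio of 0.70, this gives about 41 %, which agrees with the average belowground share stated in the text.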

Fig. 3 Distribution of biomass between aboveground tree components. Am = A. marina (n = 23), Sa = S. alba (n = 17), and Rm = R. mucronata (n = 21)

Fig. 4 Distribution between above- and belowground biomass. Am = A. marina (n = 10), Sa = S. alba (n = 10), and Rm = R. mucronata (n = 10). Note: AGB = total aboveground biomass, BGB = total belowground biomass

3.2 Biomass models

All parameter estimates for the above- and belowground biomass models were statistically significant (Table 4). For the aboveground biomass fixed effects models (FE1, FE2), inclusion of ht as a predictor variable was important since RMSE decreased from 42.6 to 38.4 %, which is equivalent to a relative decline of about 10 %. Based on AIC as the model selection criterion, the fixed effects model FE2 is better than model FE1. For the aboveground biomass random effects models, inclusion of ht resulted in lower RMSE (%) and MPE (%) values for A. marina (models RE1 and RE4) and S. alba (models RE2 and RE5), while mixed results were observed for R. mucronata (models RE3 and RE6).

Table 4 Above- and belowground biomass models

The evaluation of the aboveground biomass models (Table 5) showed that inclusion of ht as predictor variable (model form 4) generally improved predictive accuracy, i.e., provided lower MPE values. The results also showed that the random effects models with ht as a predictor variable were more accurate than the fixed effects models.

Table 5 Mean prediction errors (MPE (%)) of the aboveground biomass models over site, dbh class, and ht class

For the belowground biomass models (Table 4), the goodness of fit statistics, i.e., RMSE (%) and MPE (%), improved when using random effects for A. marina (model RE7) and R. mucronata (model RE9) compared to the fixed effects model (model FE3), while the opposite was observed for S. alba (model RE8).

The β 0 parameter estimates of the aboveground tree components biomass models were statistically non-significant (p > 0.05) except for the stem biomass model (Table 6). All other parameter estimates were statistically significant (p < 0.05). MPEs were slightly lower than 10 % for the leaf, twig, and branch biomass models, while MPE was slightly higher than 10 % for the stem model. The stem biomass model had lower RMSE (%) values compared to all the other component models.

Table 6 Aboveground tree component biomass models

Using a paired t test, comparisons of observed total tree aboveground biomass with total tree aboveground biomass predicted by the tree component common/fixed effects models showed that the prediction errors were non-significant for A. marina (n = 23, MPE = −6.5 %, p > 0.05) and S. alba (n = 17, MPE = 3.9 %, p > 0.05), while they were significant for R. mucronata (n = 21, MPE = 16.0 %, p < 0.05).
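The test statistic behind such a paired comparison can be sketched as follows. Only the t statistic and its degrees of freedom are computed; obtaining the p value requires the t distribution (available, for example, through `scipy.stats.ttest_rel`). The function name and the data are hypothetical, and the differences are taken as predicted minus observed, which is an assumed sign convention.

```python
import math

def paired_t_statistic(observed, predicted):
    """Return (t, degrees of freedom) for paired observed/predicted values."""
    diffs = [p - o for o, p in zip(observed, predicted)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator)
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_value = mean_diff / math.sqrt(var_diff / n)
    return t_value, n - 1

# Hypothetical observed and predicted biomass (kg), illustrative only
observed_agb = [10.0, 20.0, 30.0, 40.0]
predicted_agb = [12.0, 21.0, 33.0, 42.0]

t_value, dof = paired_t_statistic(observed_agb, predicted_agb)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```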

3.3 Evaluation of predictive accuracy of existing aboveground biomass models

The predictive accuracy of the existing aboveground biomass models was evaluated by testing them on our data (Table 7). Judged by AIC, the common model developed by Chave et al. (2005) was the best for prediction of aboveground biomass for A. marina and S. alba, while the common model developed by Komiyama et al. (2005) was the best for R. mucronata (Table 7). Except for the model developed by Chave et al. (2005) applied to S. alba and R. mucronata, MPE (%) values for all tested models were significantly (p < 0.05) different from zero.

Table 7 Predictive accuracy of existing aboveground biomass models and models developed in this study

When ranking the models developed in this study based on AIC, the common (fixed effects) model was the best for prediction of aboveground biomass for A. marina, while the species-specific (random effects) models were the best for the other two species (Table 7). The MPE (%) values of all the species-specific (random effects) models were low and non-significant. For the common (fixed effects) models, the MPE (%) values were low for A. marina and relatively high for S. alba and R. mucronata. However, only the MPE (%) value of the common model for S. alba was significantly different from zero.
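Ranking candidate models by AIC on a common data set can be illustrated with a minimal sketch. It assumes the least-squares form of AIC for normally distributed errors, n ln(RSS/n) + 2k, which may differ by a constant from a likelihood-based AIC reported by model-fitting software. The model names, parameter counts, and values are hypothetical.

```python
import math


def aic_least_squares(observed, predicted, n_params):
    """AIC under an assumed Gaussian error model: n * ln(RSS / n) + 2k."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical observed biomass and predictions from two candidate models.
observed = [100.0, 200.0, 300.0, 400.0]
candidates = {
    "model_a": {"predicted": [110.0, 190.0, 310.0, 390.0], "n_params": 2},
    "model_b": {"predicted": [80.0, 220.0, 280.0, 420.0], "n_params": 3},
}

aic_values = {
    name: aic_least_squares(observed, spec["predicted"], spec["n_params"])
    for name, spec in candidates.items()
}
# Lower AIC indicates the better-supported model.
ranking = sorted(aic_values, key=aic_values.get)
print(ranking)
```

AIC values are only comparable between models evaluated on the same observations, which is why each species is ranked separately in Table 7.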

The models developed by Kairo et al. (2009) and Sitoe et al. (2014), both from eastern Africa, were the poorest performing models as demonstrated by very high RMSE (%) and MPE (%) values (Table 7). These models were also characterized by remarkable disagreement between observed and predicted biomass values (Fig. 5).

Fig. 5 Observed and predicted aboveground biomass for existing models and for models from this study. Note: Dashed gray lines represent the 1:1 relationship between observed and predicted values. AGB = total aboveground biomass, BGB = total belowground biomass

4 Discussion

The distributions of the aboveground biomass components were quite similar for A. marina and R. mucronata, while for S. alba, the proportion of stem biomass was higher than for the two other species (Fig. 3). On average, 41 % of tree biomass was stored belowground, and S. alba stored the largest proportion belowground (Fig. 4). The relatively large proportion of belowground biomass for S. alba, as compared to A. marina, is probably due mainly to the large pneumatophores of this species (Njana et al. 2015). This is also in line with the high root-to-shoot ratio for this species: with a ratio of 1.29, S. alba has more belowground than aboveground biomass. The variation in biomass distribution among tree components between species, together with the declining trend in root-to-shoot ratios over dbh classes, demonstrates the strategies of trees as they grow; at early stages, more biomass is allocated belowground for anchorage and stabilization in the soft substrate, while at later stages, more of the biomass is allocated to the aboveground part in support of physiological processes for growth. The distribution of biomass observed in this study is not unique to mangroves, as similar observations have also been reported for miombo woodlands (e.g., Mugasha et al. 2013).
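The link between a root-to-shoot ratio and the belowground share of total biomass can be made explicit with a short sketch. It assumes the usual definition of the ratio as belowground biomass divided by aboveground biomass; the function name is hypothetical.

```python
def belowground_fraction(root_to_shoot):
    """Share of total tree biomass stored belowground.

    With R = BGB / AGB, the belowground share is BGB / (AGB + BGB) = R / (1 + R).
    """
    return root_to_shoot / (1.0 + root_to_shoot)


# Root-to-shoot ratio reported for S. alba in the text above.
share = belowground_fraction(1.29)
print(round(100.0 * share, 1))
```

Any ratio above 1 corresponds to a belowground share above 50 %, which is the sense in which S. alba holds more biomass belowground than aboveground.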

Our study presents above- and belowground biomass models based on data from three dominant mangrove species in Tanzania, i.e., A. marina, S. alba, and R. mucronata. No similar models have previously been developed in the country, and only a few models have been developed in Africa or are based on data from Africa. The existing biomass models from Africa (Kairo et al. 2009; Sitoe et al. 2014) are based on limited sample sizes and data from only one site. Our biomass models, however, are based on data from a range of sites along the coastline of Tanzania, covering a size range beyond that of the data used in developing the existing aboveground biomass models both in Africa (e.g., Kairo et al. 2009; Sitoe et al. 2014) and beyond (e.g., Comley and McGuinness 2005; Chave et al. 2005; Komiyama et al. 2005; Kuei 2008; Patil et al. 2014). Similarly, our belowground biomass models are based on data covering a size range beyond those reported in existing studies (e.g., Comley and McGuinness 2005; Komiyama et al. 2005; Kairo et al. 2009). In addition, our belowground biomass models are based on data generated using comprehensive procedures for quantifying tree belowground biomass involving root sampling (A. marina and S. alba) and complete root excavation (R. mucronata) (Njana et al. 2015).

Our models are based on a nonlinear mixed-modelling approach. Ordinary nonlinear regression is commonly used to develop biomass models. Such models, however, may violate the regression assumptions of homoscedasticity and independence of residuals, which are difficult to meet for biological data. Sampling for biomass model development often results in hierarchical data. Chave et al. (2005) and Komiyama et al. (2005), for example, developed common biomass models using data originating from more than one site; such data form a hierarchical structure stratified by site and species. Observations originating from the same species and/or site are likely to be correlated and therefore lack independence. A model based on non-independent observations is characterized by autocorrelated errors and therefore violates the key assumption of independence in regression (Ritz and Streibig 2008). Ignoring lack of independence tends to give imprecise parameter estimates (ibid.). The mixed effects modelling that we applied in this study, comprising both fixed and random effects, is a useful statistical tool for modelling hierarchical data (Ritz and Streibig 2008; Zuur et al. 2009).

Our study showed that the aboveground biomass models improved when random effects modelling was applied and when ht was considered as an additional predictor variable (Tables 4 and 5). In model development, it is important that models are properly specified and that the structure of the data is taken into account. Our study illustrates that the common models including ht generally performed well across study sites, species, dbh classes, and ht classes, as shown by lower MPE (%) values, and that the corresponding random effects/species-specific models further improved predictive accuracy (Table 5). This supports the role of random effects in accounting for otherwise unexplained sources of variation, which is only possible within the mixed modelling framework. In line with our results, Chave et al. (2005) reported that the inclusion of ht into a common mangrove biomass model reduced the standard error of aboveground biomass from 19.5 to 12.5 % for mangrove trees, while other authors reported that random effects improved the predictive power of biomass models for non-mangrove trees (e.g., Fu et al. 2014; Xu et al. 2014).
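The distinction between a common (fixed effects) prediction and a species-specific (random effects) prediction can be sketched as follows. The power-law form B = β0 × dbh^β1 × ht^β2 is assumed here for illustration, and every parameter value and species deviation below is hypothetical; none is an estimate from this study.

```python
# Hypothetical fixed-effect parameters of the assumed model
# B = b0 * dbh**b1 * ht**b2 (dbh in cm, ht in m, B in kg).
FIXED = {"b0": 0.25, "b1": 2.0, "b2": 0.5}

# Hypothetical species-level random deviations added to the fixed effects.
RANDOM = {
    "A. marina": {"b0": 0.05, "b1": 0.0, "b2": 0.0},
    "S. alba": {"b0": -0.05, "b1": 0.0, "b2": 0.0},
    "R. mucronata": {"b0": 0.0, "b1": 0.1, "b2": 0.0},
}


def predict_biomass(dbh_cm, ht_m, species=None):
    """Predict tree biomass from dbh and ht.

    species=None gives the common (fixed effects only) prediction.
    A species name adds that species' random deviations; an unknown
    species raises KeyError rather than silently using the common model.
    """
    params = dict(FIXED)
    if species is not None:
        for name, deviation in RANDOM[species].items():
            params[name] += deviation
    return params["b0"] * dbh_cm ** params["b1"] * ht_m ** params["b2"]


common = predict_biomass(10.0, 4.0)
species_specific = predict_biomass(10.0, 4.0, species="A. marina")
print(common, round(species_specific, 2))
```

In this sketch the species-specific prediction differs from the common one only through the added deviations, which mirrors how random effects capture between-species variation that the fixed effects leave unexplained.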

Although models including ht perform better, trees are often not measured for ht in forest inventories for reasons such as cost. In such cases, users are obliged either to use models including dbh as the only predictor variable or to first estimate ht using relevant models and subsequently apply biomass models based on both dbh and ht as predictor variables. However, ht prediction models for mangroves are lacking in Tanzania and the rest of Africa.

Basic density (BD) is another predictor variable which could potentially have improved predictive accuracy, particularly for the common biomass model (Komiyama et al. 2005; Chave et al. 2005). However, our models did not include BD as an additional predictor variable for two reasons. Firstly, BD may vary between species, between tree components within a species, and with tree size. BD determined through comprehensive sampling may therefore improve the predictive accuracy of biomass models. However, since BD is never determined in forest inventories and no BD prediction models exist for mangrove species, such biomass models, although more accurate, would have limited application. Secondly, for common biomass models, BD serves as a species-distinguishing factor, whereby species mean BD values may be used as opposed to species- and tree-specific BD values. The mixed modelling approach used in this study is itself robust in distinguishing species.

The tests of existing models on our data generally showed large and significant underpredictions for aboveground biomass (Table 7). The underpredictions were as large as 90 % for some of the models (Kairo et al. 2009; Sitoe et al. 2014). Generally, predicted and observed biomass agreed quite well for small tree sizes, while the underpredictions increased with tree size (Fig. 5). Similar tests on belowground biomass for mangroves in Tanzania (Njana et al. 2015) showed prediction errors (underprediction) as high as 60 % when models by Komiyama et al. (2005) (common model), Comley and McGuinness (2005) (species-specific model), and Kairo et al. (2009) (species-specific model) were applied. Plausible explanations for the observed prediction errors could be the application of the models beyond data ranges (size), geographical locations, and differences in forest structure and architecture. For the belowground biomass, an additional explanation could also be inadequate excavation procedures applied when some of these models were developed (see Njana et al. 2015). Any application of the already existing above- and belowground biomass models to mangroves of Tanzania is therefore not recommended.

In the modelling, we applied species as random effects, which resulted in improved predictive accuracy of both the above- and the belowground biomass models, except for belowground biomass for S. alba, where the models did not fit the data well (Table 4). This may be due to higher variances of BGB for this species (see Fig. 2). The contribution of random effects to improved predictive accuracy suggests that the biomass allometry varies by species. Therefore, the random effects/species-specific models should be applied since they are superior to the fixed effects/common models. For S. alba, however, the fixed effects model is recommended for belowground biomass. Since both the above- and the belowground biomass models performed fairly well across sites, the models may be applied across sites in Tanzania. However, the use of the models beyond the species considered in this study is not recommended.

Aboveground tree component biomass estimates derived using models may be essential in describing forest structure (e.g., Camacho et al. 2011), determining forest productivity (e.g., Cox and Allen 1999; Kairo et al. 2008), and understanding ecosystem functions through quantification of carbon stocks and sequestration (e.g., Chen et al. 2012; Pandey and Pandey 2013) which are potentially relevant for climate change mitigation strategies. For example, the leaf biomass estimated from relevant models may provide useful information on nutrient cycling while the above- and belowground biomass models may be applied to generate tier 3 carbon stock estimates for carbon monitoring, reporting, and verification in REDD+ programs. The models may also be applied to the NAFORMA data for basic scientific ecological studies and for management decision-making. Since biomass estimates are essential for both ecological and management applications, the models (total AGB, BGB, and tree component models) from this study are expected to provide ecologists with the needed information and to support management of mangroves in Tanzania and elsewhere as deemed relevant. The aboveground tree component biomass models that we developed generally gave low prediction errors (<10 %) (Table 6). In addition, estimates based on tree components were additive (in agreement with the direct tree aboveground estimates). Therefore, we recommend the use of the developed aboveground tree component common models in deriving aboveground component-specific biomass estimates for utilization and ecological purposes, and the individual estimates may safely be added up.

5 Conclusions

The biomass models reported in this study are based on comprehensive data and a comprehensive modelling approach. The above- and belowground biomass models improved when random effects were considered. Therefore, random effects/species-specific models are generally recommended. For estimation of belowground biomass for S. alba, however, the fixed effects/common model is recommended. Based on our results, we discourage species-specific or site-specific model development for data entailing more than one species or site; instead, we encourage the use of a mixed modelling approach, which is robust for such data sets. The aboveground tree component biomass models may also be applied since they yield unbiased and additive estimates. Based on goodness of fit statistics, both the above- and the belowground biomass models developed in this study are the best available and provide an important tool for accurate estimation of biomass and carbon stocks stored in mangrove forests in Tanzania for both management and ecological applications. Our models should be used within the range of data from which they were developed, and their use outside this data range should be done with caution.