On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision

Echavarría-Heras, Héctor; Leal-Ramírez, Cecilia; Villa-Diharce, Enrique; Cazarez-Castro, Nohe

doi:10.1186/s12976-018-0076-y

On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision

Research
Open access
Published: 06 March 2018

Volume 15, article number 4, (2018)
Cite this article

Download PDF

You have full access to this open access article

Theoretical Biology and Medical Modelling Aims and scope

On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision

Download PDF

Héctor Echavarría-Heras¹,
Cecilia Leal-Ramírez¹,
Enrique Villa-Diharce² &
…
Nohe Cazarez-Castro³

2629 Accesses
3 Citations
Explore all metrics

Abstract

Background

The effects of current anthropogenic influences on eelgrass (Zostera marina) meadows are noticeable. Eelgrass ecological services grant important benefits for mankind. Preservation of eelgrass meadows include several transplantation methods. Evaluation of establishing success relies on the estimation of standing stock and productivity. Average leaf biomass in shoots is a fundamental component of standing stock. Existing methods of leaf biomass measurement are destructive and time consuming. These assessments could alter shoot density in developing transplants. Allometric methods offer convenient indirect assessments of individual leaf biomass. Aggregation of single leaf projections produce surrogates for average leaf biomass in shoots. Involved parameters are time invariant, then derived proxies yield simplified nondestructive approximations. On spite of time invariance local factors induce relative variability of parameter estimates. This influences accuracy of surrogates. And factors like analysis method, sample size and data quality also impact precision. Besides, scaling projections are sensitive to parameter fluctuation. Thus the suitability of the addressed allometric approximations requires clarification.

Results

The considered proxies produced accurate indirect assessments of observed values. Only parameter estimates fitted from raw data using nonlinear regression, produced robust approximations. Data quality influenced sensitivity and sample size for an optimal precision.

Conclusions

Allometric surrogates of average leaf biomass in eelgrass shoots offer convenient nondestructive assessments. But analysis method and sample size can influence accuracy in a direct manner. Standardized routines for data quality are crucial on granting cost-effectiveness of the method.

Improved allometric proxies for eelgrass conservation

Article 30 July 2018

The effect of parameter variability in the allometric projection of leaf growth rates for eelgrass (Zostera marina L.) II: the importance of data quality control procedures in bias reduction

Article Open access 01 December 2015

A Takagi–Sugeno-Kang Fuzzy Model Formalization of Eelgrass Leaf Biomass Allometry with Application to the Estimation of Average Biomass of Leaves in Shoots: Comparing the Reproducibility Strength of the Present Fuzzy and Related Crisp Proxies

Background

Based on published data for 17 ecosystem services in 16 biomes, Costanza et al. [1] estimated the value of ecosystem services at the level of the whole biosphere. They found a lower bound in the range of US$16–54 trillion (1012) per year, with an average of US$33 trillion per year. Marine systems produced near 63% of the annual value. Almost half derived from coastal ecosystems. Approximately 25% of this share related to algal beds and seagrasses. This contribution to human welfare is deem relevant. Thus maintaining the health of marine ecosystems is subject of scientific concerns. Currently, increasing anthropogenic pressures pose important threats. And global climate change could also threaten future viability of seagrass meadows [2, 3]. For instance, water quality and other local stressors promote unprecedented meadow loss [4,5,6]. This reduces mitigation of wave action [7] and filtration [8]. Diminishes food and shelter for a myriad of organisms [9,10,11]. Weakens nutrient cycling [12, 13], erosion abatement and shoreline stabilization [14,15,16]. Moderates support for detrital food web foundation [17]. And inhibits carbon sequestration [18, 19].

Eelgrass (Zostera marina L.) is a dominant along the coasts of both the North Pacific and North Atlantic [20]. This species supports communities known as among the richest and most diverse in sea life [21]. Contribution of organic materials for food webs in shallow environments [22] is noticeable. Indeed, eelgrass produced up to 64% of the whole primary production of an estuarine system [23]. Current deleterious effects of anthropogenic influences on eelgrass prompted special restoration strategies. Among remediation efforts replanting plays an important role [24,25,26,27]. Transplant success amounts to reinstatement of ecological functions of natural populations. Evaluation relies on monitoring standing stock and productivity of transplanted plants. Then comparing with assessments of a reference population, which usually settle nearby [28].

Combined biomass of leaves in shoots is an important component of standing stock. Assessments rely on the estimation of the biomass of individual leaves. This requires shoot removal followed by dry weight measurement procedures in the laboratory. Elimination of shoots could infringe damage to natural eelgrass populations [29]. And reduced shoot density makes these effects even more perceptible for transplanted plots. Allometric methods make it possible simplified-indirect estimations of eelgrass productivity and standing stock. Echavarría-Heras et al. [30] considered an allometric representation for eelgrass leaf biomass and related length. Agreeing with Solana et al. [31], the involved parameters are invariant within a given geographical region. Estimates and leaf length measurements grant nondestructive approximations of observed leaf biomass values. This way, leaf length measurements grant nondestructive approximations for observed leaf biomass values. Leaf growth rates estimation relies on successive measurements of leaf biomass. Then the allometric model in [30] entails nondestructive assessments of eelgrass productivity. But, invariance does not impede local factors to imply variability of parameter estimates. Besides, local influences other factors could explain numerical differences in parameter estimates. There are methodological influences that may render biased parameter estimates. Analysis method, sample size, and data quality can influence scaling results (e.g. [32, 33]). And, since scaling relationships are particularly sensitive to parametric uncertainties, Echavarría-Heras et al. [30] concluded that the actual precision of derived allometric surrogates requires clarification.

Here we deal with allometric surrogates for average leaf biomass in eelgrass shoots. These derive from the model w(t) = βa(t)^α for leaf biomass w(t) and area a(t) measured at time t, and α and β parameters. Leaf area is more informative of eelgrass leaf biomass than corresponding length. Thus, the present scaling endures a boost in precision of parameter estimates by the model in [30]. This could increase the accuracy of derived surrogates for leaf biomass in shoots. Besides, eelgrass leaf area and length admit an isometric representation [34, 35]. Then, the time invariance found by Solana-Arellano et al. [31] also holds for parameter estimates of the present scaling. This by the way imbeds a nondestructive advantage to the present shoot-biomass substitutes. But, agreeing with Echavarría-Heras et al. [30], we must examine influences on precision of estimates for suitability of projections. Since, such an analysis was not produced before, we took here the try of filling that gap. Achieving the related goals, required the assemblage of an extensive data set. It comprises coupled measurements of eelgrass leaf biomasses and related areas. This is further called “raw data set”. A data cleaning procedure adapted from Echavarría-Heras et al. [30] removed inconsistent leaf biomass replicates from the raw data. Thereby forming what we call a “processed data set”. Differences in reproducibility strength allowed to assess data quality effects in precision. A similar procedure evaluated sample size effects. And a sensitivity analysis evaluated robustness of the projection method. This supports consistent, cost-effective allometric projections of observed values from raw data. But, this depends on nonlinear regression as an analysis method. Besides, sample size must be optimal. Data quality as expected improved reproducibility strength of the allometric projection method. But, this factor was more relevant in optimizing sample size. A detailed explanation of used procedures appears in the methods section. The results section is not only devoted to the presentation of our findings. It also examines the relative strengths of factors influencing the precision of proxies. A Discussion section emphasizes on the gains and the limitations of the method. Appendix 1 deals with the model selection problem. Appendix 2 is about data processing methodology. Appendix 3 presents the procedure for sensitivity assessment.

Methods

The symbol w_m(t) will stand for average leaf dry weight of shoots collected at sampling date t. An the average of these values over all sampling dates symbolized through 〈w_m(t)〉. Formal representations of these variables appear in Appendix 3 (cf. Eq. (15) and Eq. (16)). The symbols w(t) and a(t) will one to one stand for biomass and area of an individual leaf collected at a sampling time t. We assume that these variables are linked through the scaling relationship

$$ w(t)=\beta a{(t)}^{\alpha }. $$

(1)

The present raw data come from a coastal lagoon located in San Quintin Bay, México [30]. This comprises 10,412 leaves and measured lengths [mm], widths [mm] and dry weights (g). The product of length times width provided estimations of leaf area [mm²] [36]. In what follows the symbol n_ra stands for number of leaves in raw data. Processed data results by applying direct and statistical data cleaning techniques. The direct hinges on the consistency of allometric models for eelgrass leaf biomass. Leaf length or area are allometric descriptors of eelgrass leaf biomass [30, 31, 34]. A model selection exploration corroborated a power function like trend assumption for the present data. Details appear in Appendix 1. Severe deviations, from the mean response curve, are inconsistent and must be removed. This took care of sets containing less than ten leaf dry weight replicates. The statistical procedure worked on sets with a larger number of replicates. It centers on properties of the median of a group of data. This is immune to sample size and also a robust estimator of scale. The adapted Median Absolute Deviation (MAD) data cleaning procedure [37] appears in Appendix 2. Processing data resulted in a number of n_qa = 6094 pairs of leaf dry weights and areas.

Parameter estimates $ \widehat{\alpha} $ and $ \widehat{\beta} $ and leaf area values yield allometric proxies $ {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) $ for w_m(t). (cf. Eq. (19)). The symbol $ \left\langle {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right)\right\rangle $ (cf. Eq. (20)) stands for the pertinent average over sampling dates. We use Lin’s Concordance Correlation Coefficient (CCC) [38] as an evaluation of reproducibility. This meant as the extent to which two connected variables fall on a line through the origin and with a slope of one. We represent this statistic by means of the symbol $ \widehat{\rho} $. Agreement defined as poor whenever $ \widehat{\rho}<0.90 $, moderate for $ 0.90\le \hat{\rho}<0.95 $, good for $ 0.95\le \hat{\rho}\le 0.99 $ or excellent for $ \widehat{\rho} $>0.99 [39]. Values of $ \widehat{\rho} $ gave an evaluation of the strength of the $ {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) $ devise to reproduce observed values.

In getting parameter estimates $ \widehat{\alpha} $ and $ \widehat{\beta} $ we relied on two procedures. The traditional analysis method of allometry and nonlinear regression. Assessing analysis method effects on reproducibility strength of $ {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) $ depended on testing differences in $ \widehat{\rho} $. The traditional approach involves a linear regression equation (cf. Eq. (4)). This obtained through logarithmic transformation of response and descriptor in Eq. (1). The nonlinear regression analysis method relied on maximum likelihood [40, 41]. This approach fitted the model of Eq. (1) in a direct way in the original arithmetical scale. The nonlinear fit allowed the consideration of homoscedasticity or heteroscedasticity (cf. Eqs. (5) and (6)). All the required fittings for both raw and processed data depended on the use of the R software.

We also fitted the model of Eq. (1) to samples of different sizes taken out from primary and processed data sets. Each sample of size k; with 100 ≤ k ≤ n_ra produced estimates $ \widehat{\alpha}(k) $ for α and $ \widehat{\beta}(k) $ for β, and resulting $ {w}_m\left(\widehat{\alpha}(k),\widehat{\beta}(k),t\right) $ projections. The symbol $ \widehat{\rho}(k) $ denotes the value of $ \widehat{\rho} $ for a sample of size k. Differences in $ \widehat{\rho}(k) $ allow exploring sample size influences in reproducibility.

Deviations ∆α_q and ∆β_r convey fluctuating values α_q = α + ∆α_q and β_r = β + ∆β_r for the parameters α and β one to one. The modulus of the vector of parametric changes (∆α_q, ∆β_r) defines a tolerance range θ(q, r). And the value of θ(q, r) determined by the standard errors of parameter estimates denoted by mean of θ_ste. A fixed value of θ(q, r) leads to four possible characterizations of the pair (∆α_q, ∆β_r). Each one associates to a trajectory w_m(α_q, β_r, t) shifting from a reference one w_m(α, β, t). The symbol δw_mθ(α_q, β_r, t) (cf. Eq. (42)) denotes deviations between reference and average of shifting trajectories at sampling dates. And the average of δw_mθ(α_q, β_r, t) values taken over all sampling dates denoted through 〈δw_mθ(α_q, β_r, t)〉 (cf. Eq. (43)). The absolute value of the ratio of 〈δw_mθ(α_q, β_r, t)〉 to 〈w_m(α, β, t)〉 defines a relative deviation index ϑ(θ). It measures sensitivity of 〈w_m(α, β, t)〉 to fluctuations of tolerance θ(q, r) on α and β. Appendix 3 presents detailed formulae.

Results

Figure 1 shows the variation of leaf dry weight and area observed in the raw data. Smallest and largest leaf areas were 2 mm² and 7868 mm² respectively. Associated dry weights were 1 × 10⁻⁵ g and 0.1588 g one to one. The time average of mean leaf dry weight in shoots was 〈w_m(t) 〉 = 0.01461g (cf. Eq. (16)). Each leaf area measurement associate to several replicates of leaf biomass. Number of replicates increased from a single association up to a largest value of 84. Dispersion masks a power function like trend. Contents of Appendix 1 corroborate this at formal level. And exploration of dispersion reveals severe deviations from the inherent power function-like trend. Inconsistencies are more visible for leaves with areas under 350 mm² and also for those over 5000 mm². This hints about the relevance of data quality.

Figure 2 exhibits the spreading of leaf dry weight after data quality control. About 40% of the replicates in the raw data were eliminated. A power function like trend appears more depicted than that showing on Fig. 1. But dispersion still shows significant deviations around the expected power function like trend. This suggests lack of standardized routines for data gathering. In this work the length times width proxy [36] approximated leaf area. Errors in estimation of area of older damaged leaves could explain uneven replicates. Faulty equipment, or incorrect recording could explicate inconsistencies for small leaves.

Table 1 gives estimates$ \widehat{\alpha} $ and $ \widehat{\beta} $for the parameters α and β and corresponding standard errors. Assuming heteroscedasticity in the model of Eqs. (5) and (6) did not affect estimates. Thus, presentation of results of nonlinear regression refers to the homoscedastic case of the model of Eqs. (5) and (6). Figure 3 displays mean response curves fitted using raw data. Figure 4 shows those associated to quality controlled data. Results for the log-linear transformation method included correction for bias of retransformation [42]. The smearing estimate of bias of Duan [43] provided the form of the correction factor.

Table 1 Parameter estimates $ \widehat{\alpha} $ and $ \widehat{\beta} $ associated standard errors ($ ste\left(\widehat{\alpha}\right), ste\Big(\widehat{\beta} $)) found by fitting the model of Eq. (1). Nonlinear regression estimates associate to the homoscedastic case of the model of Eqs. (5) and (6) (see Appendix 1). Values of $ \widehat{\rho} $ give an evaluation of reproducibility strength of the proxy of Eq. (1)

Full size table

For raw data the log-linear transformation method produced $ \widehat{\rho}=0.8910 $, entailing poor reproducibility. This explains a biased distribution of replicates around the mean response curve (Fig. 3a). Meanwhile, estimates acquired by nonlinear regression from raw data conveyed adequate reproducibility ($ \widehat{\rho}=0.9307\Big) $. This explains a displayed fair distribution of projected leaf biomass values (Fig. 3b).

Estimates via log-linear transformation for processed data seemed enhance reproducibility ($ \widehat{\rho}=0.9777 $ ̂=0.9455). But, Fig. 4a, reveals a bulk of inconsistent replicates for leaves areas under 5000 mm². Notice that this subset of replicates distributes almost evenly around the mean response curve. Yet replicate spread for areas beyond 5000 mm² shows significant bias (Fig. 4a). Meanwhile, nonlinear regression and processed data associate to $ \widehat{\rho}=0.9777 $. This corresponded to good reproducibility strength. Indeed, spread of replicates around the mean response is fair (Fig. 4b).

Estimates acquired from raw data via the traditional analysis method of allometry returned a value of $ \widehat{\rho}=0.9285 $ for $ {w}_m\left(\hat{\alpha},\hat{\beta},t\right) $ projections (Table 2). This seems to correspond to moderate reproducibility. Yet, corresponding rms = 0.01265 was largest among analysis method- data set combinations (Table 2). Figure 3a shows a relative wider bias in spread around the mean response curve for larger leaves. This explains resulting inconsistencies in reproducibility of $ {w}_m\left(\widehat{\alpha},\widehat{\beta},t\right) $ projections shown in Fig. 5a. Display reveals biased $ {w}_m\left(\hat{\alpha},\hat{\beta},t\right) $ projections for near 50% of sampling dates. This, led to a smallest value of 0.8436 for the ratio of projected to observed averages.

Table 2 Reproducibility results for w_m(α, β, t). Entries include, Lin’s concordance correlation coefficients $ \left(\hat{\rho}\right) $, root mean square deviations (rms) and ratios of 〈w_m(α, β, t)〉 to 〈w_m(t)〉 averages

Full size table

Instead, nonlinear regression and raw data produced a value of $ \widehat{\rho}=0.9915. $ And root mean squared deviation attained a value of rms = 0.00460 (Table 2). This suggest a remarkable reproducibility strength for $ {w}_m\left(\hat{\alpha},\hat{\beta},t\right) $ projections (Table 2). Correspondence between projected and observed values, shown in Fig. 5b. corroborates high agreement. Moreover, the ratio of projected to observed leaf dry weight averages attained an outstanding value of 0.9773 (Table 2).

Processed data and log-linear transformation analysis produced $ \hat{\rho}=0.9489 $ for w_m(α, β, t) projections. This figure is bigger than corresponding to raw data for this method. Nevertheless, lines in Fig. 6a show that this result does not correspond to a real gain in reproducibility. Besides data quality could not significantly reduce calculated root mean squared deviation (Table 2). Also, a value of 0.8588 for a ratio of projected to observed leaf dry weight averages is still low for suitable agreement (Table 2). Thus, regardless data quality, log-linear analysis failed to produce consistent w_m(α, β, t) projections of w_m(t) averages.

In turn, w_m(α, β, t) projections made by nonlinear regression and processed data yield the highest value of $ \hat{\rho}=0.9976 $. (Table 2). And also the smallest root mean squared deviation among analysis method–data set combinations (Table 2). As shown by Fig. 6b this corresponds to a fairly good reproducibility strength. Additionally, data quality and nonlinear regression led to an outstanding value of 0.9975 for the ratio of projected 〈w_m(α, β, t)〉 to observed 〈w_m(t)〉 averages.

Results exhibit that log-linear transformations failed to produce consistent projections for observed w_m(t) averages. In contraposition, nonlinear regression entailed parameter estimates of noteworthy reliability. Thus studying sample size effects on reproducibility addressed only this analysis method. Figure 7 exhibits variation of CCC values depending on sample size k. This is expressed as a percentage of the extent of data set ( n_ra = 10412 for raw) or (n_qa = 6094 for processed). For raw data, a sample sized k = 0.20n_ra endures reasonable reproducibility $ \left(\widehat{\rho}(k)=0.9889\right) $. But samples beyond this threshold would not induce a significant change in reproducibility. Meanwhile, for the quality controlled data, the sample size threshold for excellent reproducibility was k = 0.10n_rq. This leads to a high reproducibility strength of $ \widehat{\rho}(k)=0.9929 $. Thus, a sample 10% the size of processed data set produced excellent reproducibility. Again sample sizes beyond this threshold failed to increase $ \widehat{\rho}(k) $ values.

In Appendix 3 we consider variations α_q = α + ∆α_q and β_r = β + ∆β_r. We found that shifting trajectories, w_m(α_q, β_r, t) overestimated reference projections w_m(α, β, t) whenever ∆α_q > 0 and ∆β_r > 0. In correspondence underestimation of w_m(α, β, t) occurs for −α < ∆α_q < 0 and −β < ∆β_r < 0. Fig. 8 explains that for ∆α_q ∙ ∆β_r > 0, shifting trajectories overestimate (see red lines in panel a)) or underestimate the reference one (see blue lines in panel a)). We can also make certain that relatively smaller deviations between w_m(α_q, β_r, t) and w_m(α, β, t), values occur for the case, ∆α_q ∙ ∆β_r < 0, (see red and blue lines in panel b)).

The simulation code of Eqs. (39) through (44) explored the sensitivity of the w_m(α, β, t) projection method, to numerical variation of parameters α and β. Available parameter estimates yield reference values for α and β (Table 2). Again, since nonlinear regression associates to highest reproducibility strength, for easier presentation, we only explain results using this analysis method.

The variation of the absolute deviation index for raw data is shown in Fig. 9. A range 0 ≤ θ(q, r) ≤ θ_ste, places the relative ϑ(θ) deviation index within the domain 0≤ ϑ(θ) ≤ 0.0205 (Table 3). Therefore, for a bound of θ(q, r) set by the standard errors of estimates largest absolute deviation between w_m(α_q, β_r, t) and w_m(t) amounts to about 2% of 〈w_m(t) 〉. Moreover, a range 0 ≤ θ(q, r) ≤ 2θ_ste produces 0 ≤ ϑ(θ) ≤ 0.031. This leads to an equivalent 3% of 〈w_m(t) 〉. Figure 10 displays the dynamics of ϑ(θ) depending on θ(q, r) for processed data. We have that θ(q, r) varying in a range of 0 ≤ θ(q, r) ≤ θ_ste implies 0 ≤ ϑ(θ) ≤ 0.003 (Table 3). This time largest absolute deviation was only 0.03% of 〈w_m(t)〉. Comparing with results for raw data, we ascertained remarkable gain in precision of w_m(α, β, t) projections. This exploration highlights on importance of data quality control as a procedure leading the consistent w _m(α, β, t) projections. But, results show that the projection method is robust even when parameter estimates are obtained from raw data.

Table 3 Sensitivity of the w_m(α, β, t) projections to changes in estimates of the parameters α and β. Included are calculated θ_ste values. This gives θ(q, r) as determined by the standard errors of estimates. We also present corresponding values of the relative deviation index ϑ(θ_ste)

Full size table

Discussion

Results of Solana-Arellano et al. [31] explain invariance of the allometric parameters α and β in Eq. (1). This suggest w_m(α, β, t) proxies as possible nondestructive estimations of the average leaf dry weight in eelgrass shoots. These assessments are essential for monitoring the efficiency of transplanted eelgrass plots, fundamental in remediation aims. The present examination shows that the w_m(α, β, t) proxies could in fact offer reliable and cost-effective assessments. This on condition that practitioners take in to account our guidelines. For instance, our results typify the extent on which accuracy of estimates of the parameters α and β influences the predictive power of the w_m(α, β, t) projections. And, our findings clarify that there are methodological factors affecting the accuracy of estimates. Related influences could spread significant bias in approximations supported by the w_m(α, β, t) device. Indeed, analysis method turned into a main factor inducing bias in parameter estimates of the model of Eq. (1). Moreover, only parameter estimates acquired by nonlinear regression yield consistency of the model of Eq. (1) (Table 1 and lines in Fig. 3b and Fig. 4b). And, only these estimates upheld conclusive predictive power of the w_m(α, β, t) proxies (Table 2, as well as, lines in Fig. 5b and Fig. 6b). Our results also show that data quality could not improve the performance of w_m(α, β, t) projections acquired via log-linear transformations. Without doubt, parameter estimates acquired from processed data by this method still led to significant bias in w_m(α, β, t) projections (Fig. 6a). Meanwhile, data processing improved reproducibility of projections built for raw data using nonlinear regression (Table 2 and lines in Fig. 5b and Fig. 6b). Besides, relevance of data quality was also evident for optimizing sample size. Indeed, while for raw data, a sample of approximately 2000 leaves shows reasonable reproducibility, for the quality controlled data this threshold drops to near 1000. However, samples sized beyond these thresholds would not induce a noteworthy gain in reproducibility. This result on its own, ties to efficiency of the w_m(α, β, t) projection method. Undoubtedly routines for leaf dry weight assessment are tedious and time consuming. So, reducing size of data set for parameter estimation increases cost-effectiveness of the w_m(α, β, t) projection method.

Nonlinear regression estimation also showed advantages in sensitivity over the log-linear analysis counterpart. Estimates from raw data led to a largest absolute deviation between w_m(α, β, t) and w_m(t) values amounting only 3% of the average of w_m(t) over sampling dates. And, for processed data, the fluctuation range for equivalent sensitivity widened to 2.8 times the range for standard errors of estimates. But, on spite of data quality relevance, sensitivity results for raw data reveal that the w_m(α, β, t) projection method is robust relative to expected fluctuations in parameter estimates.

Our results show that both the accuracy and cost-effectiveness of projections can be enhanced by the addition of data quality control procedures. However, including data processing can become a weakness for the w_m(α, β, t) projection method. Indeed, data cleaning procedures convey niceties that relate in a fundamental way to detection and rejection of inconsistent replicates. Also, compromising about which particular rejection edge should work, is hard to determine. Thus, the use of any data processing will endure a doubt, that the examiner selects an arrangement producing the most probable results [37]. In that order of ideas, when attempting to enhance the reproducibility power of w_m(α, β, t) projections it is desirable to avoid depending in any form of data processing. For that aim, prior to data assembly, we must bear in mind standardized routines yielding accurate measurements for w(t) and a(t). This will favor direct identification of the model of Eq. (1) in a consistent way. It is of a fundamental importance to be aware, that errors in leaf dry weight or area assessment differentiate in terms of leaf size. Certainly, leaves produced anew normally present a complete and undamaged span. But, they normally yield very reduced dry weight values. Therefore, we can expect estimation errors imputable to the precision of the analytical scale for individual leaf dry weight assessments. To this, we may add errors in the reading and/or recording of the scale output. These issues could explain a perceptible accumulation of inconsistent replicates for leaves with areas between 2 mm² and 350 mm² (Fig. 1). And, even after data cleaning procedures, leaf dry weight replicate spread for leaves bigger than 2000 mm² shows significant residual variability (Fig. 2). Likewise, as far as, bigger and older leaves is concerned, there are issues on dry weight estimation errors. These seem to relate to damage caused by exposure to environmental factors. The fact that we estimated leaf area by means of the product of related length and width could explain these effects. For complete undamaged eelgrass leaves, the use of a leaf times width proxy grants simplified and accurate estimations of leaf area [36].

But, this approach could deliver inaccurate estimations for long and damaged leaves. Actually, bigger leaves remain exposed during significant periods of time to environmental influences such as drag forces or herbivory. This could remove large amounts of leaf tissue while leaving length unaffected. Thus, causing overestimation of true leaf area when using a width and length product estimation. At the same time, lost portions of a leaf will produce a smaller dry weight than expected for an overestimated area. These effects will bring dry weight replicates that deviate from the power function–like trend associated to the model of Eq. (1). Estimation bias for the dry weights of smaller and longer leaves could explain the anomalous proliferation of inconsistencies (around 41 % ) found while applying the proposed data cleaning procedure to the present raw data. These effects will propagate uncertainty of parameter estimates of the model of Eq. (1), influencing accuracy of the w_m(α, β, t) projections. Hence, for the sake of consistent reproducibility of observed values via w_m(α, β, t) projections, we need to be aware of these effects. And as elaborated, a good starting point for granting consistency, is appropriate gathering of primary data for the identification of the model of Eq. (1). This will make subsequent data cleaning procedures unnecessary.

Conclusion

This research show that precise estimates of allometric parameters in Eq. (1) grant accurate non-destructive projections of the average leaf dry weight in eelgrass shoots. These projections offer efficient appraisal of eelgrass restoration projects, thereby contributing to the conservation of this important seagrass species. Our findings support views in Hui and Jackson [32], Packard and Birchard [33] and Packard et al. [44], on the relevance of analysis method in scaling studies. Indeed, we found that for assuring suitability of the w_m(α, β, t) proxies, the use of nonlinear regression is crucial. On spite of claims that the use of the traditional log-linear analysis method is a must in allometric examination [45], exploration of spread of present crude data reveals curvature [46]. This explaining failure of the traditional analysis method to produce unbiased results for the present data. Besides proxies supported by nonlinear regression and raw data, are robust.

Data cleaning could only marginally enhance the accuracy of proxies produced by nonlinear regression and raw data. But results underline a relevant influence of data quality in setting optimal sample size for a suitable precision of parameter estimates. Nevertheless, the use of data cleaning procedures leads to intricacies. They in a fundamental way relate to choosing thresholds for rejection of detected inconsistencies, which are often regarded as subjective. Thus, instead of using later data cleaning, data gathering should seek for suitability. Special care must be taken when processing bigger and older leaves. These are often damaged or even trimmed so that their dry weights do not conform to a true weight to area relationship. Irregularities in raw data may also associate to biased estimation of leaf length or width. Moreover, in a lesser way faulty equipment for leaf dry weight assessment, rounding off, or even incorrect data recording could as well contribute.

Taking advantage of a time invariance of the parameters in Eq. (1) the w_m(α, β, t) device could offer to the eelgrass conservation practitioner highly consistent and truly nondestructive assessments of the average value of leaf dry weight in shoots. But the explained guidelines on analysis method, sample size and data appropriateness are mandatory for cost-effectiveness. Moreover, the present results suggest that the use of the w_m(α, β, t) method could be extended to other seagrasses species with similar leaf architecture to eelgrass.

References

Costanza R, d’Arge R, de Groot R, Farber S, Garsso M, Hannon B, et al. The value of the world’s ecosystem services and natural capital. Nature. 1997;387:253–60.
Article CAS Google Scholar
Short FT, Neckles HA. The effects of global climate change on seagrasses. Aquatic Bot. 1999;63:169–96.
Article Google Scholar
Grech A, Chartrand-Miller K, Erftemeijer P, Fonseca M, McKenzie L, Rasheed M, et al. A comparison of threats, vulnerabilities and management approaches in global seagrass bioregions. Environ Res Lett. 2012;7:024006.
Article Google Scholar
Unworth KFR. Seagrass meadows in a globally changing environment. Marine Poll Bull. 2014;83(2):383–6.
Article Google Scholar
Flindt MR, Rasmussen EK, Valdemarsen T, Erichsen A, Kaas H, Canal-Vergés P. Using a GIS-tool to evaluate potential eelgrass reestablishment in estuaries. Ecol Model. 2016;338:122–34.
Article Google Scholar
Hughes BB, Eby R, Dyke EV, Tinker MT, Marks CI, Johnson KS, et al. Recovery of a top predator mediates negative eutrophic effects on seagrass. Proc Natl Acad Sci. 2013;110:15313–8.
Article CAS PubMed PubMed Central Google Scholar
van Katwijk MM, Thorhaug A, Marbà N, Orth RJ, Duarte CM, Kendrick GA, et al. Global analysis of seagrass restoration: the importance of large-scale planting. J Appl Ecol. 2016;53:567–78.
Article Google Scholar
Short FT, Kostenb S, Morganc PA, Malone S, Moore GE. Impact of climate change on submerged and emergent wetland plants. Aquat Bot. 2016;135:3–17.
Article Google Scholar
Holmquist JG, Powell GVN, Sogard SM. Decapod and stomatopod assemblages on a system of seagrass-covered mud banks in Florida bay. Mar Biol. 1989;100:473–83.
Article Google Scholar
Montague CL, Ley JA. A possible effect of salinity fluctuation on abundance of benthic vegetation and associated fauna in northeastern Florida bay. Estuar Coast Shelf Sci. 1993;16:703–17.
CAS Google Scholar
Plummer ML, Harvey CJ, Anderson LE, Guerry AD, Ruckelshaus MH. The role of eelgrass in marine community interactions and ecosystem services: results from ecosystem-scale food web models. Ecosystems. 2013;16(2):237–51.
Article Google Scholar
Blackburn TH, Nedwell DB, Weibe WJ. Active mineral cycling in a Jamaican seagrass sediment. Mar Ecol Prog Ser. 1994;110:233–9.
Article CAS Google Scholar
Park SR, Li WT, Kim SH, Kim JW, Lee KS. A comparison of methods for estimating the productivity of Zostera marina. J Ecol Field Biol. 2010;33(1):59–65.
Google Scholar
Terrados J, Borum J. Why are seagrasses important? goods and services provided by seagrass meadows. In: Borum J, Duarte CM, Krause-Jensen D, Greve TM, editors. European seagrasses: an introduction to monitoring and management: EU project Monitoring and Managing of European Seagrasses (M&MS); 2004. p. 8–10. http://www.seagrasses.org/handbook/european_seagrasses_high.pdf.
Newell IER, Koch EW. Modeling seagrass density and distribution in response to changes in turbidity stemming from bivalve filtration and seagrass sediment stabilization. Estuar Coast Shelf Sci. 2004;27(5):793–806.
Google Scholar
Christianen MJA, van Belzen J, Herman PMJ, van Katwijk MM, Lamers LPM, van Leent PJM, et al. Low-canopy seagrass beds still provide important coastal protection services. PLoS One. 2013;8(5):e62413.
Article CAS PubMed PubMed Central Google Scholar
Liu X, Zhou Y, Yang H, Ru S. Eelgrass detritus as a food source for the sea cucumber Apostichopus Japonicus Selenka (Echinodermata: Holothuroidea) in coastal waters of North China: an experimental study in flow-through systems. PLoS One. 2013;8(3):e58293.
Article CAS PubMed PubMed Central Google Scholar
Kennedy H, Beggins J, Duarte CM, Fourqurean JW, Holmer M, Marbà N, et al. Seagrass sediments as a global carbon sink: isotopic constraints. Global Biogeome Cycles. 2010;24(4):GB4026.
Google Scholar
Duarte CM, Sintes T, Marba N. Assessing the CO2 capture potential of seagrass restoration projects. J Appl Ecol. 2013;50:1341–9.
Article CAS Google Scholar
Short FT, Coles RG, Pergent-Martini C. Global seagrass distribution. In: Short FT, Coles RG, editors. Global seagrass research methods. Amsterdam: Elsevier Science B.V; 2001. p. 6.
Google Scholar
Phillips RC. In: Odum HT, Copeland BJ, Mc Mahan EA, editors. Temperate grass flats in coastal ecological Systems of the United States. 2nd ed. Washington DC: Conservation Foundation; 1974. p. 244–99.
Google Scholar
Jacobs RPWM. Distribution and aspects of the production and biomass of eelgrass, Zostera marina L. at Roscoff, France. Aquat Bot. 1979;7:151–72.
Article Google Scholar
Williams RB. Nutrient level and phytoplankton productive in the estuary. In: Chabreck RA, editor. Proceedings of the coastal marsh and estuary management symposium. Baton Rouge: Louisiana State University; 1973. p. 59.
Google Scholar
Novak AB, Plaisted HK, Hays CG, Hughes RA. Limited effects of source population identity and number on seagrass transplant performance. PeerJ. 2017;5:e2972.
Article PubMed PubMed Central Google Scholar
Matheson FE, Reed J, Dos Santos VM, Mackay G, Cummings VJ. Seagrass rehabilitation: successful transplants and evaluation of methods at different spatial scales. N Z J Mar Freshwater Res. 2017;51(1):96–109.
Article Google Scholar
Fishman JR, Orth RJ, Marion S, Bieri J. A comparative test of mechanized and manual transplanting of eelgrass, Zostera marina, in Chesapeake Bay. Restor Ecol. 2004;12:214–9.
Article Google Scholar
Olsen JL, Coyer JA, Chesney B. Numerous mitigation transplants of the eelgrass Zostera marina in southern California shuffle genetic diversity and may promote hybridization with Zostera pacifica. Biol Conserv. 2014;176:133–43.
Article Google Scholar
Li WT, Kim JH, Park JI, Lee KS. Assessing establishment success of Zostera marina transplants through measurements of shoot morphology and growth. Estuar Coast Shelf Sci. 2010;88(3):377–84.
Article Google Scholar
Hauxwell J, Cebrián J, Valiela I. Eelgrass Zostera marina loss in temperate estuaries: relationship to land-derived nitrogen loads and effect of light limitation imposed by algae. Mar Ecol Prog Ser. 2003;247:59–73.
Article CAS Google Scholar
Echavarría-Heras HA, Leal-Ramírez C, Villa-Diharce E, Cazarez-Castro NR. The effect of parameter variability in the allometric projection of leaf growth rates for eelgrass (Zostera Marina L.) II: the importance of data quality control procedures in bias reduction. Theor Biol Med Model. 2015;12:30.
Article PubMed PubMed Central Google Scholar
Solana-Arellano ME, Echavarría-Heras HA, Leal-Ramírez C, Kun-Seop L. The effect of parameter variability in the allometric projection of leaf growth rates for eelgrass (Zostera marina L). Lat Am J Aquat Res. 2014;42(5):1099–108.
Article Google Scholar
Hui D, Jackson RB. Uncertainty in allometric exponent estimation: a case study in scaling metabolic rate with body mass. J Theor Biol. 2007;249:168–77.
Article PubMed Google Scholar
Packard GC, Birchard GF. Traditional allometric analysis fails to provide a valid predictive model for mammalian metabolic rates. J Exp Biol. 2008;211:3581–7.
Article PubMed Google Scholar
Echavarría-Heras HA, Lee KS, Solana-Arellano ME, Franco-Vizcaino E. Formal analysis and evaluation of allometric methods for estimating above-ground biomass of eelgrass. Ann Appl Biol. 2011;159(3):503–15.
Article Google Scholar
Echavarría-Heras HA, Leal-Ramírez C, Villa-Diharce E, Castillo O. Using the value of Lin's concordance correlation coefficient as a criterion for efficient estimation of areas of leaves of eelgrass from noisy digital images. Source code Biol Med 2014; 9(1):29.
Leal-Ramírez C, Echavarría-Heras HA, Castillo O. Exploring the suitability of a genetic algorithm as tool for boosting efficiency in Monte Carlo estimation of leaf area of eelgrass. In: P Melin, O Castillo, J Kacprzyk, editores. Design of intelligent systems based on fuzzy logic, neural networks and nature-inspired optimization. Studies in computational intelligence. England: Springer. 2015;601:291–303.
Leys C, Ley C, Klein O, Bernard P, Licata L. Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median. J Exp Soc Psychol. 2013;49(4):764–6.
Article Google Scholar
Lin LI. A concordance correlation coefficient to evaluate reproducibility. Biometrics. 1989;45:255–68.
Article CAS PubMed Google Scholar
McBride GB. A proposal for strength-of-agreement criteria for Lin’s concordance correlation coefficient. NIWA client report: HAM2005–062. Hamilton: National Institute of Water & Atmospheric Research; 2005.
Google Scholar
Paiwatin Y. Statistical Modeling and Inference Using Likelihood. In: In all Likelihood. UK: Oxford University Press; 2013.
Casella G, Berger RL. Statistical inference. 2nd ed. UK: Duxbury Press; 2001.
Newman MC. Regression analysis of log-transformed data: statistical bias and its correction. Environ Toxicol. 1993;12:1129–33.
Article CAS Google Scholar
Duan N. Smearing estimate: a nonparametric retransformation method. J Am Stat Assoc. 1983;78:605–10.
Article Google Scholar
Packard GC, Boardman TJ, Birchard GF. Allometric equations for predicting body mass of dinosaurs: a comment on Cawley and Janacek. J Zool. 2010;282:221–2.
Article Google Scholar
Lai J, Yang B, Lin D, Kerkhoff AJ, Ma K. The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression? PLoS One. 2013;8(10):e77007.
Article CAS PubMed PubMed Central Google Scholar
Packard GC. Misconceptions about logarithmic transformation and the traditional allometric method. Zoology. 2017;123:115–20.
Article PubMed Google Scholar
Huber PJ. Robust statistics. New York: Wiley; 1981.
Book Google Scholar
Miller J. Reaction time analysis with outlier exclusion: bias varies with sample size. Q J Exp Psychol. 1991;43(4):907–12.
Article CAS Google Scholar

Download references

Acknowledgements

We thank Francisco Javier Ponce Isguerra for art work.

Funding

All funding for the present research were obtained from institutional grants

Availability of data and materials

Data and simulation codes will be available from corresponding author upon request.

Author information

Authors and Affiliations

Centro de Investigación Científica y de Estudios Superiores de Ensenada, Carretera Ensenada-Tijuana No. 3918, Zona Playitas, Código Postal 22860, Apdo. Postal 360, Ensenada, B.C., Mexico
Héctor Echavarría-Heras & Cecilia Leal-Ramírez
Centro de Investigación en Matemáticas, A.C. Jalisco s/n, Mineral Valenciana, Código Postal 36240, Guanajuato Gto, Mexico
Enrique Villa-Diharce
Instituto Tecnológico de Tijuana, Calzada Tecnológico S/N, Fracc. Tomas Aquino, Código Postal 22414, Tijuana, Baja California, Mexico
Nohe Cazarez-Castro

Authors

Héctor Echavarría-Heras
View author publications
You can also search for this author in PubMed Google Scholar
Cecilia Leal-Ramírez
View author publications
You can also search for this author in PubMed Google Scholar
Enrique Villa-Diharce
View author publications
You can also search for this author in PubMed Google Scholar
Nohe Cazarez-Castro
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

HEH conceived, designed performed and supervised the whole research. CLR performed formal derivations simulation and numerical tasks. EVD performed the statistical analysis and interpretation of data. NCC revised the manuscript critically both at statistical and formal levels. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Héctor Echavarría-Heras.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interest.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1 Model selection

This Appendix covers the selection problem of Eq. (1) as a model for the variation of eelgrass leaf dry weight in terms of related area. On first sight the spread in Fig. 1 may suggest that instead of clinging to the power function model of Eq. (1) we should explore fitting an alternative linear model. There are two possible ways to achieve a linear model fit for the present data. One by fitting an isometric model. This follows by setting α = 1 in Eq. (1) to yield a linear model with w-intercept through origin, that is,

$$ w(t)\kern0.5em =\kern0.5em \beta a(t) $$

(2)

A second choice is considering a standard linear equation with intercept different from zero, that is,

$$ w(t)=\beta a(t)+\delta, $$

(3)

where β is the slope and δ the intercept. But, by looking at the spread in Fig. 1, we realize that this model turns out to be inconsistent. This because for a vanishing leaf area we expect a vanishing dry weight. In any event, we fit the model of Eq. (3) as a device to analyze the suitability of the model of Eq. (2).

Fitting of the model of Eq. (1) by means the traditional analysis method of allometry

For this fit we used raw data. Appling a logarithmic transformation on both sides of the Eq. (1), since we have available data pairs (w_i, a_i), we get

$$ \mathrm{In}\left({w}_i\right)=\mathrm{In}\kern0.28em \left(\beta \right)+\alpha \kern0.28em \mathrm{In}\kern0.28em \left({a}_i\right)+{\varepsilon}_i,i=1,2,\dots, n, $$

(4)

where the additive errors ε₁, ε₂, …, ε_n are independent and identically distributed, with distribution N(0, σ²). Fitting the model of Eq. (4) to data produced entries in Tables 4, 5 and 6.

Table 4 Residual statistics for the fitting of model of Eq. (4)

Full size table

Table 5 Fitting results for the model of Eq. (4)

Full size table

Table 6 Fitting tests for the model of Eq. (4)

Full size table

Residual and normal probability plots for the fit of the model of Eq. (4) appear in Fig. 11. We notice that a value α = 1.0 is not included in the confidence interval for the parameter α. This impairs Eq. (2) as a consistent model for the present raw data.

Fitting of the model of Eq. (1) via non-linear regression

For the aim of fitting the model of Eq. (1) to raw data using nonlinear regression, we use the equation

$$ {w}_i=\beta {a_i}^{\alpha }+{\varepsilon}_i $$

(5)

the error term ε_i has a normal distribution with mean μ zero and standard deviation σ_i. This chosen to contemplate heteroscedasticity or homoscedasticity. Moreover, σ_i depends on the predictor variable a_i through the model

$$ {\sigma}_i={\gamma}_0+{\gamma}_1{a}_i, $$

(6)

where γ₀ and γ₁ are parameters to be estimated. The case γ₁ different from zero associates to heteroscedasticity. Whenever γ₁ vanishes allows the consideration of homoscedasticity.

The response variable w _i resulting from Eq. (5) is normally distributed, with mean μ_i = βa_i^α and standard deviation σ_i given by Eq. (6). Fitting this regression model to raw data relied on a likelihood approach, i.e., acquiring estimates for the parameters β, α, γ₀ and γ₁, which yield the largest value of the log likelihood function [40, 41]

$$ l\left(\beta, \alpha, {\gamma}_0,{\gamma}_1\right)=-\frac{n}{2}\log \left(2\pi \right)-{\sum}_1^n\log \left({\sigma}_i\right)-\frac{1}{2}{\sum}_1^n{\left(\frac{w_i-{\mu}_i}{\sigma_i}\right)}^2, $$

(7)

with μ_i = βa_i^α and σ_i given by Eq. (6).

Estimates $ \widehat{\beta} $and $ \widehat{\alpha} $ are very similar for the homoscedastic and heteroscedastic cases of Eq. (5). Notice that a value α = 1.0 is not included in the confidence interval for the parameter α. Thus, this fit also excludes the model of Eq. (2) (isometric case).

Confidence intervals in Table 7 show a wide overlap region. It follows that parameter estimates do not differ in a significant way. This is also supported by the estimated CCC value ($ \widehat{\rho}=0.9307 $) that was the same for the heteroscedastic and homoscedastic cases. Likewise, plotting the response curves w_i = βa_i^α on a dispersion diagram, reveals that curves estimated considering heteroscedasticity or homoscedasticity coincide (Fig. 12).

Table 7 Maximum likelihood estimates for parameters β and α in Eq. (5). This fit considers heteroscedasticity (γ₁ ≠ 0) or homoscedasticity (γ₁ = 0)

Full size table

Figure 13 shows the scatter diagram of dry weight residuals against leaf area. It also displays the region bounded by lines for 95% confidence intervals (blue lines). Figure 14 displays raw data and response curve (median) for the heteroscedastic model. It includes curves for region comprising 95% of leaf dry weight values (blue lines).

Thus, predictive strength from the homoscedastic or heteroscedastic regression models are similar. This because assumed distributions of the error term, in both models are symmetric. What makes a difference is the form capturing the uncertainty pattern.

Fitting of the linear model of Eq. (3)

Since, raw data composes of pairs (w_i, a_i), in order to test the consistency of the model of Eq. (3), we consider the regression equation

$$ {w}_i=\beta\;{a}_i+\delta +{\varepsilon}_i, $$

(8)

where the additive errors ε₁, ε₂, …, ε_n are independent and identically distributed N(0, σ²). Fitting regression Eq. (8) to raw data produced the results in Table (8),Table (9) and Table (10).

Table 8 Residual statistics for the fit of Eq. (8)

Full size table

Table 9 Fitting results for the model of Eq. (8)

Full size table

Table 10 Fitting test statistics for the model of Eq. (8)

Full size table

Residual and normal probability plot for this fit appear in Fig. 15. Looking at the confidence intervals in Table 7, we observe that a value α = 1 is not included. This implies rejection for the model of Eq. (2) (isometric case). Also, the intercept in Eq. (8) is negative, which is inconsistent. This is inconsistent with observations. Thus, we must reject the model of Eq. (3).

In summary, fitting results for the model of Eq. (1) excludes the value α = 1. Now, for α = 1, the model of Eq. (1) takes the form of a straight line through the origin (isometric model of Eq. (2)). In the other hand, for the fit of the linear model of Eq. (3) the intercept is negative. This means that this model could predict negative values of leaf dry weight. Thus, the model of Eq. (1) is the only one supporting a good selection criterion.

Appendix 2 Data processing

This Appendix explains the Median Absolute Deviation (MAD) criterion for data cleaning. A first step is detecting the group of leaf dry weight replicates that associate to a given leaf area. The symbol G(a) stands for this group, and the number of replicates it contains denoted by means of the symbol n(G). Formally, G(a) = {w_i(a) | 1 ≤ i ≤ n(G) }. The median of G(a) is denoted by means of MED(G(a)), that is,

$$ \mathrm{MED}\left(G(a)\right)=\mathrm{MED}\left\{{w}_1(a),\dots .,{w}_{n(G)}(a)\right\}. $$

(9)

And, for 1 ≤ i ≤ n(G) the absolute deviation of replicate w_i(a) from the median MED(G(a)) represented by means of δ_i(a), that is,

$$ {\delta}_i(a)=\left|{w}_i(a)-\mathrm{MED}\left(G(a)\right)\right|. $$

(10)

Similarly, the median of the set of absolute deviations is signified by the symbol MED(δG(a)), that is,

$$ \mathrm{MED}\left(\delta G(a)\right)=\mathrm{MED}\left\{{\delta}_1(a),\dots .,{\delta}_{n(G)}(a)\right\}. $$

(11)

We use the character MAD(G(a)) to denote median absolute deviation of a group G(a). After Huber [47] and recalling that eelgrass leaf dry weight values are log-normally distributed, this is calculated through,

$$ \mathrm{MAD}\left(G(a)\right)=b\mathrm{MED}\left\{\ {\delta}_1(a),\dots .,{\delta}_{n(G)}(a)\right\}, $$

(12)

where b = 1/Q(0.75), being Q(0.75) the 0.75 quantile of the lognormal distribution.

The removal of inconsistent replicates in a group G(a) follows the decision criterion

$$ \mathrm{MED}\left(G(a)\right)-T\cdot \mathrm{MAD}\left(G(a)\right)<{w}_i(a)<\mathrm{MED}\left(G(a)\right)+T\cdot \mathrm{MAD}\left(G(a)\right), $$

(13)

where T is the rejection threshold. Following Miller [48], we set T = 3. In what follows, the symbol n_qa stands for the size of the resulting quality controlled data set.

Appendix 3 Sensitivity of w _m(α, β, t)

This Appendix presents a study of the influences of parameter fluctuations in the precision of the w_m(α, β, t) projections. The examination is performed on the basis of both analytical and simulation approaches. Presentation requires a formal description of variables of interest. For that aim we label a generic eelgrass shoot by using a subscript s. Correspondingly, the characters nl(s) and w_s(t) shall stand for the associated number of leaves and their combined dry weight. Similarly, ns(t) denotes the number of shoots collected at a sampling date t. And, the number of sampling date symbolized by means of n(t).

Observed dry weight in a shoot s is represented by mean of

$$ {w}_s(t)={\sum}_{nl(s)}w(t), $$

(14)

where ∑_nl(s) indicates summation of the leaves that the shoot s, holds. Meanwhile, the matching average leaf dry weight of shoots denoted by the symbol w_m(t) is given by

$$ {w}_m(t)=\frac{\sum_{ns(t)}{w}_s(t)}{ns(t)\kern0.5em }, $$

(15)

where ∑_ns(t) indicates summation of the ns(t) shoots collected at a time t.

The character 〈w_m(t)〉 stands for the average of w_m(t) values over the number n(t) of sampling times. It is given by

$$ \left\langle {w}_m(t)\right\rangle =\frac{\sum_{n(t)\kern0.75em }{w}_m(t)}{n(t)}, $$

(16)

where $ \sum \limits_{n(t)} $ indicates summation over the number of sampling times.

We assumed that w(t) and a(t) are linked through the scaling expression of Eq. (1). Then, we can obtain an allometric proxy for w_s(t). The symbol w_s(α, β, t) represents this surrogate and it is given by

$$ {w}_s\left(\alpha, \beta, t\right)={\sum}_{nl(s)}\beta a{(t)}^{\alpha }. $$

(17)

Moreover,

$$ {w}_s(t)={w}_s\left(\alpha, \beta, t\right)+{\epsilon}_s\left(\alpha, \beta, t\right), $$

(18)

that is, w_s(α, β, t) is an approximation to the true value of w_s(t), being ϵ_s(α, β, t) the error of estimation. Similarly, Eq. (17) produces an allometric proxy for w_m(t), the average leaf dry weight in shoots at a time t. This surrogate, represented here by means of w_m(α, β, t), is given by

$$ {w}_m\left(\alpha, \beta, t\right)=\frac{\sum_{ns(t)\kern0.5em }\ {w}_s\left(\alpha, \beta, t\right)}{ns(t)\kern0.5em }. $$

(19)

In turn, the average of w_m(α, β, t) over the number n(t) of sampling times, denoted through 〈w_m(α, β, t) 〉, is calculated by means of

$$ \left\langle {w}_m\left(\alpha, \beta, t\right)\ \right\rangle =\frac{\sum_{n(t)}\ {w}_m\left(\alpha, \beta, t\right)}{n(t)\ }. $$

(20)

Similarly,

$$ {w}_m(t)={w}_m\left(\alpha, \beta, t\right)+{\epsilon}_m\left(\alpha, \beta, t\right), $$

(21)

where ϵ_m(α, β, t) stands for the resulting approximating error. Moreover, the root mean squared deviation between w_m(t) and w_m(α, β, t) is denoted by means of rms(α, β) and given by

$$ rms\left(\alpha, \beta \right)=\sqrt{\sum_{n(t)}{\epsilon}_m{\left(\alpha, \beta, t\right)}^2/n(t)}, $$

(22)

again ∑_n(t) indicates summation over the sampling dates n(t).

We assume changes in parameters α and β given by factors of their standard errors. Then, a factor q associates to a shift of the value of the parameter α. This deviation, denoted by means of the symbol α_q, is given by

$$ {\alpha}_q=\alpha +\varDelta {\alpha}_q, $$

(23)

where

$$ {\Delta \alpha}_q=q\cdot ste\left(\alpha \right), $$

(24)

with ste (α) symbolizing the standard error of α and q satisfying.

$$ \mid q\mid \le {q}_{\max .} $$

(25)

Similarly, a factor r, such that.

$$ \left|r\right|\le {r}_{\mathit{\max},} $$

(26)

associates to a fluctuation in β denoted by means of Δβ_r and expressed by way of

$$ \varDelta {\beta}_r=r\cdot ste\left(\beta \right), $$

(27)

being ste(β) the standard error of β. This returns a variation of the parameter β, denoted via β_r and represented in the form

$$ {\beta}_r=\beta +\varDelta {\beta}_r. $$

(28)

Qualitative exploration of sensitivity of w _m(α, β, t)

A qualitative study is intended to set domains where reference estimations w_m(α, β, t) are underestimated or overestimated by changing values w_m(α_q, β_r, t).For that aim, we firstly set reasonable local variation ranges for the numerical deviations α_q and β_r. Fittings of Eq. (1) to eelgrass leaf dry weight and area data demonstrate that the inequality 0 < β < α holds [30]. As stated by Eqs. (24) and (27), letting |q| < ste(α)/α along with | r| < ste(β)/β set domains |∆α_q| ≤ α and |∆β_r| ≤ β. This enables suitable domain to pursue a qualitative study of the effects of parametric changes in w_s(α, β, t) (cf. Eq. (17)). For that aim, let’s consider deviations δw_s(α_q, β_r, t) defined through

$$ \delta {w}_s\left({\alpha}_q,{\beta}_r,t\right)={w}_s\left({\alpha}_q,{\beta}_r,t\right)-{w}_s\left(\alpha, \beta, t\right). $$

(29)

Then,

$$ \delta {w}_s\left({\alpha}_q,{\beta}_r,t\right)={\sum}_{nl(s)}\left(\left(\beta +\varDelta {\beta}_r\right)a{(t)}^{\alpha +\varDelta {\alpha}_q}-\beta a{(t)}^{\alpha}\right), $$

(30)

and factoring the term βa(t)^α

$$ \delta {w}_s\left({\alpha}_q,{\beta}_r,t\right)={\sum}_{nl(s)}\beta a{(t)}^{\alpha }{\mu}_a\left(\varDelta {\alpha}_q\kern0.5em \varDelta {\beta}_r,a(t)\right), $$

(31)

where

$$ {\mu}_a\left(\varDelta {\alpha}_q\kern0.5em \varDelta {\beta}_r,a(t)\right)=\frac{\pi \left(\varDelta {\alpha}_q\right)}{\ \varphi \left(\varDelta {\beta}_r\right)}-1 $$

(32)

with

$$ \varphi \left(\varDelta {\beta}_r\right)=\frac{\beta }{\beta +\varDelta {\beta}_r} $$

(33)

and

$$ \pi \left(\varDelta {\alpha}_q\right)=a{(t)}^{\varDelta {\alpha}_q}. $$

(34)

Now, since a(t) is positive, for fixed values of the parameters α and β in the domain |∆α_q| < α, the factor π(∆α_q) remains positive and increasing. Respectively, for ∆β_r varying in the interval |∆β_r| < β, the factor φ(∆β_r) is positive and decreasing. Furthermore, for ∆α_q = ∆β_r= 0, we have π(∆α_q) = φ(∆β_r) = 1 (Fig. 16).

From Eq. (31) the proxies w_s(α, β, t) will be overestimated by shifting projections w_s(α_q, β_r, t) whenever the inequality

$$ {\mu}_a\left(\varDelta {\alpha}_q,\varDelta {\beta}_r,a(t)\right)\kern0.5em >0 $$

(35)

is satisfied. And from Eq. (32) this last inequality leads to

$$ \pi \left(\varDelta {\alpha}_q\right)>\varphi \left(\varDelta {\beta}_r\right). $$

(36)

But, for −α ≤ ∆α_q ≤ 0 the factor π(∆α_q) is continuous and increases monotonically from a value π(−α) > 0, until it reaches a value of π(0) = 1, while the factor φ(∆β_r) takes positive and arbitrarily large values whenever ∆β_r approaches –β, and decreases monotonically towards a value of φ(0) = 1. Thus, the continuity of φ(∆β_r) and π(∆α_q) implies the order relationship π(∆α_q) < φ(∆β_r), in the domain−α ≤ ∆α_q ≤ 0 and –β < ∆β_r ≤ 0. In the other hand in the domain 0 ≤ ∆α_q ≤ α and 0 ≤ ∆β_r ≤ β, both π(∆α_q) and φ(∆β_r) steer away from one being π(∆α_q) increasing and φ(∆β_r) decreasing. Therefore, by continuity the statement of inequality (36) will hold only in the domain ∆α_q > 0 and ∆β_r > 0.

Similarly, reference values w_s(α, β, t) will be underestimated by fluctuating ones w_s(α_q, β_r, t) whenever, the statement

$$ {\mu}_a\left(\varDelta {\alpha}_q,\varDelta {\beta}_r,a(t)\right)<0 $$

(37)

holds, or equivalently whenever

$$ \pi \left(\varDelta {\alpha}_q\right)<\varphi \left(\varDelta {\beta}_r\right), $$

(38)

which as it has been discussed above, only holds whenever the increments ∆α_q and ∆β_r vary in.

the domain −β < ∆β_r < 0 and −α < ∆α_q < 0.

Since, the w_s(α, β, t) projections are always positive, the w_m(α, β, t) reference projections will be overestimated or underestimated by changing w_m(α_q, β_r, t) values in the same domains of variation of ∆α_q and and ∆β_r where the w_s(α, β, t) projections are overestimated or underestimated by w_s(α_q, β_r, t) values.

Exploration of sensitivity of w _m(α, β, t) by simulation methods

In order to perform a simulation study of sensitivity, we considered a combined range of variability for α and β given by the modulus of the vector of parametric changes (∆α_q, ∆β_r). This is denoted by means of the symbol θ(q, r) and calculated through

$$ \theta \left(q,r\right)=\sqrt{\ \varDelta {\alpha_q}^2+\varDelta {\beta_r}^2.} $$

(39)

Equivalently, using Eqs. (24) and (27) yield

$$ \theta \left(q,r\right)=\sqrt{{\left(q\cdot ste\left(\alpha \right)\right)}^2+{\left(r\cdot ste\left(\beta \right)\right)}^2.} $$

(40)

And, setting q = 1 and r = 1 give (∆α_q, ∆β_r) = (ste(α), ste(β)). This produces vectors of shifting parametric values with modulus determined by the standard errors of estimates. Denoting by means of the symbol θ_ste the associated range of variability we have

$$ {\theta}_{ste}=\sqrt{ste{\left(\alpha \right)}^2+ ste{\left(\beta \right)}^2}. $$

(41)

By virtue of Eq. (39), every fixed value of θ(q, r) returns different pairs (∆α_q, ∆β_r) of increments, each one giving rise to a couple of shifting parameter values (α_q, β_r). Besides, for each ordered pair of changing parameters, leaf data in shoots and Eq. (19) return a shifting trajectory, that is denoted trough the symbol w_m(α_q, β_r, t). The average of the values taken by the fluctuating trajectories at a sampling date t is denoted by 〈w_m(α_q, β_r, t)| θ〉. The procedure estimates the deviations between the observed w_m(t) and the average 〈w_m(α_q, β_r, t)| θ〉 trajectory for each time t. This is denoted by δw_mθ(α_q, β_r, t) and calculated by way of

$$ \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)={w}_m(t)-\left\langle {w}_{m\kern0.5em }\left({\alpha}_q,{\beta}_r,t\right)|\theta \right\rangle . $$

(42)

The average of the δw_mθ(α_q, β_r, t) deviation values over all sampling times n(t), represented thorough the symbol 〈δw_mθ(α_q, β_r, t)〉, is given by

$$ \left\langle \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)\right\rangle =\frac{\sum_{n(t)}\ \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)\kern0.5em }{n(t)}, $$

(43)

where, $ \sum \limits_{n(t)} $ indicates summation over the involved n(t) sampling dates. These statistics along with, the average of w_m(t) values over the number n(t) of sampling times are used to produce a relative deviation index represented by means of the symbol ϑ(θ) and given by

$$ \vartheta \left(\theta\ \right)=\frac{\left|\left\langle \delta {w}_{m\theta}\left({\alpha}_q,{\beta}_r,t\right)\right\rangle \right|\kern0.5em }{\left\langle {w}_m(t)\right\rangle\ }. $$

(44)

The value of ϑ(θ) stands a measure of the sensitivity of the reference trajectory w_m(α, β, t) to a change of tolerance θ(q, r) on the values of α and β. Moreover, ϑ(θ) assess in what percentage of 〈w_m(t)〉 the absolute deviation between w_m(α_q, β_r, t) and w_m(t) amounts.

Application to present data

The results of Appendix 1 show that there are not differences in parameter estimates of the nonlinear regression model of Eqs. (5) and (6) assuming heteroscedasticity or homoscedasticity. Therefore, in order to explore the sensitivity of w_m(α, β, t) by means of the simulation code of Eqs. (39)–(44), we take as a reference parameter estimates acquired from the raw data using the homoscedastic case. Parameter estimates were $ \widehat{\alpha}=1.104 $ with $ ste\left(\widehat{\alpha}\ \right)=5.101\times {10}^{-3} $ for α and of $ \widehat{\beta}=8.71\times {10}^{-6} $ with $ ste\left(\widehat{\beta}\right)=3.53\times {10}^{-7} $ for β (Table 1). Then, letting q = 1 and r = 1 in Eqs. (24) and (27) allows to consider a combined range of parametric change set by the standard errors of estimates. Eq. (41) returned θ_ste = 5.101 × 10⁻³. Afterwards, for every fixed value of θ(q, r) we determined all ordered pairs (∆α_q, ∆β_r) complying with condition set by Eq. (39). This returned fluctuating values α_q for α and β_r for β, given one by one by Eqs. (23) and (28). Observed leaf area data and shifting parameter values, shaped the associated changing w_m(α_q, β_r, t) trajectories. Moreover, while 0 ≤ θ(q, r) ≤ θ_ste, the values of the relative deviation index ϑ(τ) satisfied 0≤ ϑ(θ) ≤ 0.0205 (Fig. 9). Thus, for a parametric variation range set by the standard errors of estimates the maximum value of the absolute deviation ϑ(θ) between w_m(α_q, β_r, t) and w_m(t) amounts to approximately 2% of the average of w_m(t) taken over all sampling dates (〈w_m(t) 〉 cf. Eq. (16)). Moreover, for θ(q, r) in a range 0 ≤ θ ≤ 2θ_ste we have 0 ≤ ϑ(θ) ≤ 0.031. This leads to a maximum absolute deviation between w_m(α_q, β_r, t) and w _m(t) of around 3% of 〈w_m(t) 〉. This shows the robustness of the w_m(α, β, t) projection method.

In turn, fitting the model of Eq. (1) to the quality controlled data using nonlinear regression, produced estimates, $ \widehat{\alpha}=1.132 $ with $ ste\left(\widehat{\alpha}\ \right)=3.954\times {10}^{-3} $ for α and of $ \widehat{\beta}=6.956\times {10}^{-6} $ with$ ste\left(\hat{\beta}\right)=2.202\times {10}^{-7} $ for β (Table 1). Letting p = 1 and q = 1 in Eq. (40) gives θ_ste = 3.954 × 10⁻³, therefore, 0 ≤ θ(q, r) ≤ θ_ste implies 0 ≤ ϑ(θ) ≤ 0.003 (Fig. 10). Hence, the maximum absolute-relative deviation between w_m(α_q, β_r, t) and w_m(t) amounts to only 0.03% of the 〈w_m(t)〉 average. On addition, a range 0 ≤ θ ≤ 2.8θ_ste leads to 0 ≤ ϑ(θ) ≤ 0.03006. This leads to a maximum absolute deviation between w_m(α_q, β_r, t) and w_m(t) of around 3% of 〈w_m(t)〉. Comparing with results for raw data, we can ascertain that data quality induces a relative gain in the precision of the w_m(α, β, t) projections.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Echavarría-Heras, H., Leal-Ramírez, C., Villa-Diharce, E. et al. On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision. Theor Biol Med Model 15, 4 (2018). https://doi.org/10.1186/s12976-018-0076-y

Download citation

Received: 11 September 2017
Accepted: 22 January 2018
Published: 06 March 2018
DOI: https://doi.org/10.1186/s12976-018-0076-y

On the suitability of an allometric proxy for nondestructive estimation of average leaf dry weight in eelgrass shoots I: sensitivity analysis and examination of the influences of data quality, analysis method, and sample size on precision

Abstract

Background

Results

Conclusions

Similar content being viewed by others

Improved allometric proxies for eelgrass conservation

The effect of parameter variability in the allometric projection of leaf growth rates for eelgrass (Zostera marina L.) II: the importance of data quality control procedures in bias reduction

A Takagi–Sugeno-Kang Fuzzy Model Formalization of Eelgrass Leaf Biomass Allometry with Application to the Estimation of Average Biomass of Leaves in Shoots: Comparing the Reproducibility Strength of the Present Fuzzy and Related Crisp Proxies

Background

Methods

Results

Discussion

Conclusion

References

Acknowledgements

Funding

Availability of data and materials

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Publisher’s Note

Appendices

Appendix 1

Model selection

Fitting of the model of Eq. (1) by means the traditional analysis method of allometry

Fitting of the model of Eq. (1) via non-linear regression

Fitting of the linear model of Eq. (3)

Appendix 2

Data processing

Appendix 3

Sensitivity of w m (α, β, t)

Qualitative exploration of sensitivity of w m (α, β, t)

Exploration of sensitivity of w m (α, β, t) by simulation methods

Application to present data

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

Sensitivity of w _m(α, β, t)

Qualitative exploration of sensitivity of w _m(α, β, t)

Exploration of sensitivity of w _m(α, β, t) by simulation methods