Performance of alternative spatial models in empirical Douglas-fir and simulated datasets

  • Eduardo Pablo Cappa
  • Facundo Muñoz
  • Leopoldo Sanchez
Research Paper

Abstract

Key message

Based on an empirical dataset from the French Douglas-fir breeding program, we showed that the bidimensional autoregressive and the two-dimensional P-spline regression spatial models clearly outperformed the classical block model in both goodness of fit and predictive ability. In contrast, the differences between the two spatial models were relatively small. In general, results from simulated data agreed well with those from empirical data.

Context

Environmental (and/or non-environmental) global and local spatial trends can lead to biases in the estimation of genetic parameters and the prediction of individual additive genetic effects.

Aims

The goal of the present research is to compare the performances of the classical a priori block design (block) and two different a posteriori spatial models: a bidimensional first-order autoregressive process (AR) and a bidimensional P-spline regression (splines).
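As a rough illustration of what a bidimensional first-order autoregressive structure looks like, the sketch below builds a separable correlation matrix for plots on a regular row-by-column grid. It assumes the common form in which the correlation between two plots is the product of a row term and a column term, obtained as the Kronecker product of two AR(1) correlation matrices. The abstract does not state the exact parameterization used in the study, so the function names (`ar1_corr`, `kron`, `separable_ar1_corr`, `plot_index`) and the values of `rho_row` and `rho_col` are hypothetical.

```python
def ar1_corr(n, rho):
    """n x n AR(1) correlation matrix: entry (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two matrices stored as lists of lists."""
    return [[x * y for x in a_row for y in b_row] for a_row in a for b_row in b]


def separable_ar1_corr(n_rows, n_cols, rho_row, rho_col):
    """Correlation among all plots of an n_rows x n_cols grid (assumed separable)."""
    return kron(ar1_corr(n_rows, rho_row), ar1_corr(n_cols, rho_col))


def plot_index(row, col, n_cols):
    """Position of plot (row, col) when plots are ordered row by row."""
    return row * n_cols + col


if __name__ == "__main__":
    n_rows, n_cols = 3, 4  # illustrative grid size, not taken from the trials
    R = separable_ar1_corr(n_rows, n_cols, rho_row=0.8, rho_col=0.5)
    a = plot_index(0, 0, n_cols)
    b = plot_index(2, 3, n_cols)
    # Plots two rows and three columns apart: rho_row**2 * rho_col**3
    print(R[a][b])
```

Under this assumed structure, correlation decays geometrically with distance along each axis, which is how such a model can absorb local patchiness that fixed blocks miss.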

Methods

Data from eight trials of the French Douglas-fir breeding program were analyzed using the block, AR, and splines models, and data from 8640 simulated datasets corresponding to 180 different scenarios were also analyzed using the two a posteriori spatial models. For each real and simulated dataset, we compared the fitted models using several performance metrics.

Results

There is a substantial gain in accuracy and precision in switching from the classical a priori block design to either of the two alternative a posteriori spatial methodologies. However, the differences between AR and splines were relatively small. Simulations, covering a larger though oversimplified hypothetical setting, seemed to support the previous empirical findings. Both spatial approaches yielded unbiased estimates of the variance components when the fitted model matched the model used to generate the simulated data.

Conclusion

In practice, both spatial models (i.e., AR and splines) suitably capture spatial variation, and it is usually safe to use either of them. The final choice could be driven solely by operational reasons.

Keywords

Global and local spatial trends; Forest genetics trials; Autoregressive residual; Two-dimensional P-splines

Notes

Acknowledgments

The authors sincerely acknowledge Jean-Charles Bastien for his help in identifying trials and accessing data. Thanks go to the staff of INRA experimental units (UE GBFOR, INRA Val de Loire) who have established, maintained, and assessed the field trials.

Funding

E. P. Cappa, F. Muñoz, and L. Sánchez received funding from the European Union’s Seventh Framework Program for research, technological development, and demonstration under grant agreement no. 284181 (“Trees4Future”). F. Muñoz is partially funded by research grant MTM2016-77501-P from the Spanish Ministry of Economy and Competitiveness.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

ESM 1: 13595_2019_836_MOESM1_ESM.docx (DOCX, 1174 kB)

Copyright information

© INRA and Springer-Verlag France SAS, part of Springer Nature 2019

Authors and Affiliations

  1. Bosques Cultivados, Centro de Investigación en Recursos Naturales, Instituto Nacional de Tecnología Agropecuaria (INTA), Instituto de Recursos Biológicos, Buenos Aires, Argentina
  2. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Buenos Aires, Argentina
  3. UMR BioForA, INRA, Ardon, France