Bayesian measurement error correction in structured additive distributional regression with an application to the analysis of sensor data on soil–plant variability

  • Alessio Pollice
  • Giovanna Jona Lasinio
  • Roberta Rossi
  • Mariana Amato
  • Thomas Kneib
  • Stefan Lang
Original Paper


The flexibility of the Bayesian approach in accounting for covariates with measurement error is combined with semiparametric regression models. We consider a class of continuous, discrete and mixed univariate response distributions in which potentially all parameters depend on a structured additive predictor. Markov chain Monte Carlo enables a modular and numerically efficient implementation of Bayesian measurement error correction based on the imputation of unobserved error-free covariate values. We allow for very general measurement errors, including correlated replicates with heterogeneous variances. The proposal is first assessed in a simulation study and then applied to the assessment of a soil–plant relationship that is crucial for implementing efficient agricultural management practices. Observations on multi-depth soil information and forage ground cover for a seven-hectare alfalfa stand in southern Italy were obtained using sensors with very fine spatial resolution. Estimating a functional relation between ground cover and soil with these data involves addressing issues linked to spatial and temporal misalignment and to the large data size. We propose a preliminary spatial aggregation on a lattice covering the field, followed by analysis with a structured additive distributional regression model that accounts for measurement error in the soil covariate. The results are interpreted and discussed in connection with possible alfalfa management strategies.
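To illustrate why a covariate measured with error needs correction at all, the following minimal sketch simulates a linear relation, observes the covariate only through two noisy replicates, and compares the naive slope with a simple method-of-moments (reliability ratio) correction. This is not the MCMC imputation scheme described above: the linear model, the independent equal-variance errors, the simulated values and all function names are assumptions made purely for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=5000, beta=2.0, sigma_u=1.0, seed=1):
    """Simulate y = beta * x + noise and two error-prone replicates of x.

    Assumption: replicate errors are independent with equal variance,
    a much simpler setting than correlated, heteroscedastic replicates.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]             # error-free covariate (unobserved)
    y = [beta * xi + rng.gauss(0, 0.5) for xi in x]     # response
    w1 = [xi + rng.gauss(0, sigma_u) for xi in x]       # replicate 1
    w2 = [xi + rng.gauss(0, sigma_u) for xi in x]       # replicate 2
    return y, w1, w2


def corrected_slope(y, w1, w2):
    """Return (naive slope, reliability-corrected slope)."""
    wbar = [(a + b) / 2 for a, b in zip(w1, w2)]
    naive = ols_slope(wbar, y)                          # attenuated towards zero
    # Var(w1 - w2) = 2 * sigma_u^2, so the error variance of the
    # replicate mean (sigma_u^2 / 2) is estimated by Var(w1 - w2) / 4.
    diff = [a - b for a, b in zip(w1, w2)]
    var_err_mean = statistics.variance(diff) / 4
    var_wbar = statistics.variance(wbar)
    reliability = (var_wbar - var_err_mean) / var_wbar  # share of variance due to true x
    return naive, naive / reliability


if __name__ == "__main__":
    y, w1, w2 = simulate()
    naive, corrected = corrected_slope(y, w1, w2)
    print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

In this toy setting the naive slope is pulled towards zero by roughly the reliability ratio, while the corrected slope lies close to the true value used in the simulation. A Bayesian treatment would instead treat the error-free covariate values as unknowns and impute them within the sampler.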


Structured additive distributional regression; Agricultural management; Bayesian semiparametric regression; Measurement error



Alessio Pollice and Giovanna Jona Lasinio were partially supported by the PRIN2015 project “Environmental processes and human activities: capturing their interactions via statistical methods (EPHASTAT)” funded by MIUR - Italian Ministry of University and Research.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Dipartimento di Economia e Finanza, Università degli Studi di Bari Aldo Moro, Bari, Italy
  2. Sapienza Università di Roma, Roma, Italy
  3. Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria, Muro Lucano, Italy
  4. Università della Basilicata, Potenza, Italy
  5. Georg August Universität Göttingen, Göttingen, Germany
  6. Universität Innsbruck, Innsbruck, Austria
