Abstract
This paper addresses the problem of producing estimates for geographical domains of a variable with a spatial pattern when information on the locations of the population units is incomplete. The spatial distribution of the study variable and its possible relations with other covariates are modeled by a geoadditive regression. Using such a model to produce model-based estimates for geographical domains requires all population units to be referenced at point locations; typically, however, the spatial coordinates are known only for the sampled units. An approach to handle the lack of geographical information for non-sampled units is suggested: a distribution is imposed on the spatial locations inside each domain. This is achieved through a hierarchical Bayesian formulation of the geoadditive model in which a prior distribution on the spatial coordinates is defined. The performance of the proposed imputation approach is evaluated through several Markov chain Monte Carlo experiments implemented under different scenarios.
Notes
Tuscany is partitioned into ten provinces: Massa-Carrara, Lucca, Pistoia, Prato, Firenze, Livorno, Pisa, Siena, Arezzo and Grosseto. Because Prato and Pistoia are small in area, we treat them as a single area in our analysis.
The meteo-hydrological data recorded by the Tuscan monitoring network are available for download from the Regional Hydrologic Service (www.sir.toscana.it).
A catchment basin is an extent of land where surface water from rain and melting snow or ice converges to a single point at a lower elevation, usually the exit of the basin, where the waters join another waterbody, such as a river, lake, reservoir, estuary, wetland, sea, or ocean. Tuscany is composed of twelve catchment basins, five of which belong mainly or completely to the Tuscan territory: Toscana Nord, Serchio, Toscana Costa, Ombrone and Arno (divided into Arno Superiore, Arno Medio and Arno Inferiore). Because the Toscana Nord and Serchio basins are small, we treat them as a single area in our analysis; conversely, because the Arno basin is large, we consider its three sub-areas separately.
Cite this article
Bocci, C., Rocco, E. Estimates for geographical domains through geoadditive models in presence of incomplete geographical information. Stat Methods Appl 23, 283–305 (2014). https://doi.org/10.1007/s10260-014-0256-9