Abstract
While statistical information on socio-economic activities is widely available, the data are often collected or released only at a relatively aggregated level. In these aggregated forms, the data are useful for broad-scale assessments, although we often need to disaggregate the source data in order to provide more localized estimates, and in order to analyze correlations against geophysical variables. Spatial disaggregation techniques can be used in this context, to transform data from a set of source zones into a set of target zones, with different geometry and with a higher general level of spatial resolution. Still, few previous studies in the area have attempted to leverage state-of-the-art spatial disaggregation procedures in the context of socio-economic variables, instead focusing on applications related to population modeling. In this article, we report on experiments with a hybrid spatial disaggregation technique that combines state-of-the-art regression analysis procedures with the classic methods of dasymetric mapping and pycnophylactic interpolation. The hybrid procedure was used together with population density, land coverage, nighttime satellite imagery, and OpenStreetMap road density, as ancillary data to disaggregate different types of socio-economic indicators to a high-resolution grid. Our test specifically leveraged data relative to the Portuguese territory, resulting in the production of raster datasets with a resolution of 30 arc-seconds per cell. The article discusses the spatial disaggregation methodology and the quality of the obtained results under different experimental conditions.
This is a preview of subscription content,
to check access.







Similar content being viewed by others
Notes
References
Andersen, R.: Modern Methods for Robust Regression. No. 152 in Quantitative Applications in the Social Sciences. Sage Publications, Thousand Oaks (2008)
Antoni, J.P., Vuidel, G., Aupet, J.B., Aube, J.: Generating a located synthetic population: a prerequisite to agent-based urban modelling. In: Proceedings of the European Colloquium of Quantitative and Theoretical Geography (2011)
Antoni, J.P., Vuidel, G., Klein, O.: Generating a located synthetic population of individuals, households, and dwellings. Working Paper Series, Luxembourg Institute of Socio-Economic Research (2017)
Antoniou, V., Fonte, C.C., See, L., Estima, J., Arsanjani, J.J., Lupia, F., Minghini, M., Foody, G., Fritz, S.: Investigating the feasibility of geo-tagged photographs as sources of land cover input data. ISPRS Int. J. GeoInf. 5(5), 64 (2016)
Bivand, R.S., Pebesma, E., Gmez-Rubio, V.: Applied Spatial Data Analysis with R. Springer, Berlin (2012)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Briggs, D.J., Gulliver, J., Fecht, D., Vienneau, D.M.: Dasymetric modelling of small-area population distribution using land cover and light emissions data. Remote Sens. Environ. 108(4), 451–466 (2007)
Chai, T., Draxler, R.R.: Root mean square error (RMSE) or mean absolute error (MAE)?—Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 7(3), 1247–1250 (2014)
Chambers, R., Tzavidis, N.: M-quantile models for small area estimation. Biometrika 93(2), 255–268 (2006)
Chandra, H., Salvati, N., Chambers, R., Tzavidis, N.: Small area estimation under spatial nonstationarity. Comput. Stat. Data Anal. 56(10), 2875–2888 (2012)
Cunha, E., Martins, B.: Using one-class classifiers and multiple kernel learning for defining imprecise geographic regions. Int. J. Geogr. Inf. Sci. 28(11), 2220–2241 (2014)
Deville, P., Linard, C., Martin, S., Gilbert, M., Stevens, F.R., Gaughan, A.E., Blondel, V.D., Tatem, A.J.: Dynamic population mapping using mobile phone data. Proc. Natl. Acad. Sci. 111(45), 15888–15893 (2014)
Doll, C.N.H., Muller, J.P., Elvidge, C.: Night-time imagery as a tool for global mapping of socio-economic parameters and greenhouse gas emissions. Ambio 29(3), 157–162 (2000)
Douglass, R., Meyer, D., Ram, M., Rideout, D., Song, D.: High resolution population estimates from telecommunications data. Euro. Phys. J. Data Sci. 4(1), 4 (2015)
Eicher, C.L., Brewer, C.A.: Dasymetric mapping and areal interpolation: Implementation and evaluation. Cartogr. Geogr. Inf. Sci. 28(2), 125–138 (2001)
Elvidge, C., Erwin, E., Baugh, K., Ziskin, D., Tuttle, B., Ghosh, T., Sutton, P.: Overview of dmsp nightime lights and future possibilities. In: Proceedings of the Joint Urban Remote Sensing Event, pp. 1–5 (2009)
Elvidge, C.D., Baugh, K.E., Kihn, E.A., Kroehl, H.W., Davis, E.R., Davis, C.: Relation between satellite observed visible to near infrared emissions, population, and energy consumption. Int. J. Remote Sens. 18(6), 1373–1379 (1997)
Fisher, P.F., Langford, M.: Modelling the errors in areal interpolation between zonal systems by monte carlo simulation. Environ. Plann. A 27(2), 211–224 (1995)
Fotheringham, A.S., Brunsdon, C., Charlton, M.E.: Geographically Weighted Regression : The Analysis of Spatially Varying Relationships. Wiley, Hoboken (2002)
Gallego, F.J.: A population density grid of the European Union. Popul. Environ. 31(6), 460–473 (2010)
García-Palomares, J.C., Gutiérrez, J., Mínguez, C.: Identification of tourist hot spots based on social networks: a comparative analysis of european metropolises using photo-sharing services and GIS. Appl. Geogr. 63(1), 408–417 (2015)
Giri, C.P.: Remote Sensing of Land Use and Land Cover: Principles and Applications. CRC Press, Boca Raton (2012)
Giusti, C., Tzavidis, N., Pratesi, M., Salvati, N.: Resistance to outliers of M-quantile and robust random effects small area models. Commun. Stat. Simul. Comput. 43(3), 549–568 (2014)
Goerlich, F.J., Cantarino, I.: A population density grid for spain. Int. J. Geogr. Inf. Sci. 27(12), 2247–2263 (2013)
Goodchild, M.F., Anselin, L., Deichmann, U.: A framework for the areal interpolation of socioeconomic data. Environ. Plan. A 25(3), 383–397 (1993)
Goodchild, M.F., Lam, N.S.N.: Areal interpolation: a variant of the traditional spatial problem. Department of Geography, University of Western Ontario London, Canada (1980)
Gregory, I.N.: The accuracy of areal interpolation techniques: standardising 19th and 20th century census data to allow long-term comparisons. Comput. Environ. Urban Syst. 26(4), 293–314 (2002)
Gupta, M.R., Chen, Y.: Theory and use of the EM algorithm. Found. Trends Signal Process. 4(3), 223–296 (2010)
Harris, P., Brunsdon, C., Fotheringham, A.S.: Links, comparisons and extensions of the geographically weighted regression model when used as a spatial predictor. Stoch. Environ. Res. Risk Assess. 25(2), 123–138 (2011)
Harris, P., Fotheringham, A., Crespo, R., Charlton, M.: The use of geographically weighted regression for spatial prediction: an evaluation of models using simulated data sets. Math. Geosci. 42(6), 657–680 (2010)
Hastie, T.J., Tibshirani, R.J.: Generalized Additive Models. Chapman & Hall, Boca Raton (1990)
Hawelka, B., Sitko, I., Beinat, E., Sobolevsky, S., Kazakopoulos, P., Ratti, C.: Geo-located Twitter as proxy for global mobility patterns. Cartogr. Geogr. Inf. Sci. 41(3), 260–271 (2014)
Hawley, K., Moellering, H.: A comparative analysis of areal interpolation methods. Cartogr. Geogr. Inf. Sci. 32(4), 411–423 (2005)
Heymann Y., S.C.C.G., Bossard, M.: CORINE land cover technical guide. Technical Report EUR12585, Office for Official Publications of the European Communities (1994)
Kádár, B.: Measuring tourist activities in cities using geotagged photography. Tour. Geogr. 16(1), 88–104 (2014)
Kádár, B., Gede, M.: Where do tourists go? Visualizing and analysing the spatial distribution of geotagged photography. Cartogr. Int. J. Geogr. Inf. Geovis. 48(2), 78–88 (2013)
Kim, G., Barros, A.P.: Downscaling of remotely sensed soil moisture with a modified fractal interpolation method using contraction mapping and ancillary data. Remote Sens. Environ. 83(3), 400–413 (2002)
Kim, H., Yao, X.: Pycnophylactic interpolation revisited: integration with the dasymetric-mapping method. Int. J. Remote Sens. 31(21), 5657–5671 (2010)
Kuhn, M.: Building predictive models in R using the caret package. J. Stat. Softw. 28(5), 1–26 (2008)
Langford, M.: Rapid facilitation of dasymetric-based population interpolation by means of raster pixel maps. Comput. Environ. Urban Syst. 31(1), 19–32 (2007)
Li, D., Zhao, X., Li, X.: Remote sensing of human beings a perspective from nighttime light. Geospat. Inf. Sci. 19(1), 69–79 (2016)
Lin, J., Cromley, R., Zhang, C.: Using geographically weighted regression to solve the areal interpolation problem. Ann. GIS 17(1), 1–14 (2011)
Lin, J., Cromley, R.G.: Evaluating geo-located Twitter data as a control layer for areal interpolation of population. Appl. Geogr. 58(1), 41–47 (2015)
Longley, P.A., Adnan, M., Lansley, G.: The geotemporal demographics of Twitter usage. Environ. Plan. A 47(2), 465–484 (2015)
Malone, B.P., McBratney, A.B., Minasny, B., Wheeler, I.: A general method for downscaling Earth resource information. Comput. Geosci. 41(1), 119–125 (2012)
Nagle, N.N., Buttenfield, B.P., Leyk, S., Spielman, S.: Dasymetric modeling and uncertainty. Ann. Assoc. Am. Geogr. 104(1), 80–95 (2014)
Nordhaus, W.D.: Alternative Approaches to Spatial Rescaling. Technical Report. Yale University, New Haven (2003)
Nordhaus, W.D.: Geography and macroeconomics: new data and new findings. Proc. Natl. Acad. Sci. 103(10), 3510–3517 (2006)
Patel, N.N., Stevens, F.R., Huang, Z., Gaughan, A.E., Elyazar, I., Tatem, A.J.: Improving large area population mapping using geotweet densities. Trans. GIS 21(2), 317–331 (2016)
Paul, M.J., Dredze, M., Broniatowski, D.: Twitter improves influenza forecasting. PLoS Curr. 6(1), 18 (2014)
Quinlan, R.J.: Learning with continuous classes. In: Proceedings of the Australian Joint Conference On Artificial Intelligence, pp. 343–348 (1992)
Reibel, M., Bufalino, M.E.: Street-weighted interpolation techniques for demographic count estimation in incompatible zone systems. Environ. Plan. A 37(1), 127–139 (2005)
Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, Hoboken (2005)
Salvati, N., Tzavidis, N., Pratesi, M., Chambers, R.: Small area estimation via M-quantile geographically weighted regression. Test 21(1), 1–28 (2012)
Schmid, T., Münnich, R.T.: Spatial robust small area estimation. Stat. Pap. 55(3), 653–670 (2014)
Sémécurbe, F., Tannier, C., Roux, S.G.: Spatial distribution of human population in france: Exploring the modifiable areal unit problem using multifractal analysis. Geogr. Anal. 48(3), 292–313 (2016)
Batista e Silva, F., Gallego, J., Lavalle, C.: A high-resolution population grid map for europe. J. Maps 9(1), 16–28 (2013)
Stevens, F.R., Gaughan, A.E., Linard, C., Tatem, A.J.: Disaggregating census data for population mapping using random forests with remotely-sensed and ancillary data. PLoS ONE 10(2), 1–22 (2015)
Tobler, W.: A computer movie simulating urban growth in the detroit region. Econ. Geogr. 46(2), 234–240 (1970)
Tobler, W.: Smooth pycnophylactic interpolation for geographical regions. J. Am. Stat. Assoc. 74(367), 519–530 (1979)
Tobler, W., Deichmann, U., Gottsegen, J., Maloy, K.: The Global Demography Project. Technical Report 95-6, National Center for Geographic Information and Analysis, Santa Barbara (1995)
Vega, K.V.A.: Aplicacin de la Interpolacin Fractal en Downscaling de Imgenes Satelitales NOAA-AVHRR de Temperatura de Superficie en Terrenos de Topografia Compleja. Ph.D. thesis, Universidad de Chile (2012)
Whitworth, A., Carter, E., Ballas, D., Moon, G.: Estimating uncertainty in spatial microsimulation approaches to small area estimation: a new approach to solving an old problem. Comput. Environ. Urban Syst. 63, 50–57 (2016)
Willmott, C.J., Matsuura, K.: Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 30(1), 79–82 (2005)
Wu, Ss, Qiu, X., Wang, L.: Population estimation methods in GIS and remote sensing: a review. GISci. Remote Sens. 42(1), 80–96 (2005)
Xu, G., Xu, X., Liu, M., Sun, A.Y., Wang, K.: Spatial downscaling of TRMM precipitation product using a combined multifractal and regression approach: demonstration for South China. Water 7(6), 3083–3102 (2015)
Acknowledgements
This research was partially supported through Fundação para a Ciência e Tecnologia (FCT), through project grants with references PTDC/EEI-SCR/1743/2014 (Saturn) and EXPL/EEI-ESS/0427/2013 (KD-LBSN), as well as through the INESC-ID multi-annual funding from the PIDDAC programme (UID/CEC/50021/2013).
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
Rights and permissions
About this article
Cite this article
Monteiro, J., Martins, B. & Pires, J.M. A hybrid approach for the spatial disaggregation of socio-economic indicators. Int J Data Sci Anal 5, 189–211 (2018). https://doi.org/10.1007/s41060-017-0080-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s41060-017-0080-z