A hybrid approach for the spatial disaggregation of socio-economic indicators

  • João Monteiro
  • Bruno Martins
  • João M. Pires
Regular Paper


While statistical information on socio-economic activities is widely available, the data are often collected or released only at a relatively aggregated level. In aggregated form, the data are useful for broad-scale assessments, but they often need to be disaggregated in order to provide more localized estimates and to analyze correlations against geophysical variables. Spatial disaggregation techniques can be used in this context to transform data from a set of source zones into a set of target zones with different geometry and a generally higher spatial resolution. Still, few previous studies have attempted to leverage state-of-the-art spatial disaggregation procedures for socio-economic variables, focusing instead on applications related to population modeling. In this article, we report on experiments with a hybrid spatial disaggregation technique that combines state-of-the-art regression analysis procedures with the classic methods of dasymetric mapping and pycnophylactic interpolation. The hybrid procedure used population density, land cover, nighttime satellite imagery, and OpenStreetMap road density as ancillary data to disaggregate different types of socio-economic indicators to a high-resolution grid. Our tests specifically leveraged data for the Portuguese territory, resulting in raster datasets with a resolution of 30 arc-seconds per cell. The article discusses the spatial disaggregation methodology and the quality of the results obtained under different experimental conditions.
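As a rough illustration of the dasymetric idea mentioned above, the following minimal sketch redistributes each source zone's total over its grid cells in proportion to an ancillary weight layer, so that every zone total is preserved (the mass-preserving property that pycnophylactic interpolation also enforces). It is not the hybrid procedure evaluated in the article: the function name, the array layout, and the uniform fallback for zones with zero total weight are assumptions made for this example only.

```python
import numpy as np

def dasymetric_disaggregate(zone_ids, zone_totals, weights):
    """Spread each source zone's total over its cells, proportionally to weights.

    zone_ids    : 2-D integer array giving the source-zone label of each cell.
    zone_totals : dict mapping a zone label to the aggregated value of that zone.
    weights     : 2-D non-negative array of ancillary weights (hypothetical
                  example: population density resampled to the target grid).
    """
    zone_ids = np.asarray(zone_ids)
    weights = np.asarray(weights, dtype=float)
    out = np.zeros_like(weights)
    for zone, total in zone_totals.items():
        mask = zone_ids == zone
        n_cells = mask.sum()
        if n_cells == 0:
            continue  # this zone has no cells on the grid
        w_sum = weights[mask].sum()
        if w_sum > 0:
            # Proportional allocation: cell share = cell weight / zone weight.
            out[mask] = total * weights[mask] / w_sum
        else:
            # Assumed fallback: split evenly when no ancillary signal exists.
            out[mask] = total / n_cells
    return out

# Toy 2x2 grid with two source zones (labels 0 and 1).
zone_ids = np.array([[0, 0],
                     [1, 1]])
weights = np.array([[1.0, 3.0],
                    [0.0, 0.0]])
grid = dasymetric_disaggregate(zone_ids, {0: 100.0, 1: 50.0}, weights)
```

In this toy case, zone 0 is split 25/75 according to the weights, while zone 1, which has no ancillary signal, is split evenly. In a regression-based variant, the weight layer would instead come from model predictions at the cell level.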


Spatial analysis; Downscaling; Geographic information systems; Regression-based spatial disaggregation; Socio-economic indicators



This research was partially supported by Fundação para a Ciência e Tecnologia (FCT), through project grants with references PTDC/EEI-SCR/1743/2014 (Saturn) and EXPL/EEI-ESS/0427/2013 (KD-LBSN), as well as by the INESC-ID multi-annual funding from the PIDDAC programme (UID/CEC/50021/2013).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2017

Authors and Affiliations

  1. Universidade de Lisboa, IST/INESC-ID, Porto Salvo, Portugal
  2. Universidade NOVA de Lisboa, DI, FCT/NOVA LINCS, Caparica, Portugal
