Abstract
Census data provide detailed information about population characteristics, but only at a coarse spatial resolution. Fine-grained, high-resolution maps of population counts are nevertheless increasingly needed to characterize population dynamics and to assess the consequences of climate shocks, natural disasters, infrastructure investments, development policies, and similar events. Disaggregating census data is a complex machine learning task, and multiple solutions have been proposed in past research. In this paper, we propose to view the problem through the aggregate learning paradigm, in which the output value is not known for individual training points but only for aggregates of points (in this context, for regions of pixels for which a census count is available). We demonstrate with a very simple and interpretable model that this method is on par with the state of the art, and even outperforms it on some metrics.
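The aggregate learning setting described above can be illustrated with a small sketch. This is not the model evaluated in the paper: it assumes a purely linear per-pixel model, synthetic data, and hypothetical names (`X`, `region_of_pixel`, `census`, `w_true`) chosen only for illustration. Because a linear model commutes with summation, fitting it on region totals reduces to ordinary least squares on region-summed features.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_features, n_regions = 600, 3, 40

# Hypothetical ancillary features per pixel (e.g. land cover, night lights).
X = rng.uniform(0.0, 1.0, size=(n_pixels, n_features))

# Each pixel belongs to exactly one census region (15 pixels per region here).
region_of_pixel = np.arange(n_pixels) % n_regions

# Synthetic ground truth: per-pixel population is hidden from the learner,
# only the per-region totals (the "census") are observed.
w_true = np.array([5.0, 20.0, 0.5])
pixel_population = X @ w_true
census = np.bincount(region_of_pixel, weights=pixel_population,
                     minlength=n_regions)

# Aggregate learning with a linear model: the sum over a region of (x_i . w)
# equals (sum of x_i) . w, so we regress census totals on summed features.
X_agg = np.zeros((n_regions, n_features))
np.add.at(X_agg, region_of_pixel, X)
w_hat, *_ = np.linalg.lstsq(X_agg, census, rcond=None)

# Disaggregation: predict per pixel, then rescale inside each region so that
# the pixel values sum exactly to the observed census count.
pixel_pred = X @ w_hat
region_pred = np.bincount(region_of_pixel, weights=pixel_pred,
                          minlength=n_regions)
pixel_disagg = pixel_pred * (census / region_pred)[region_of_pixel]
```

The learned weights stay interpretable (one coefficient per ancillary feature), and the final rescaling step is one common way to guarantee that the fine-grained map is consistent with the census totals. A non-linear model would need an iterative optimizer instead of the closed-form least-squares step used here.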
Acknowledgments
Computational resources have been provided by the supercomputing facilities of the Université catholique de Louvain (CISM/UCL) and the Consortium des Équipements de Calcul Intensif en Fédération Wallonie Bruxelles (CÉCI) funded by the Fonds de la Recherche Scientifique de Belgique (F.R.S.-FNRS) under convention 2.5020.11. We would like to thank Pavel Demin and the CP3 group, who shared part of their reserved resources with us. The second and third authors acknowledge financial support from the ARC convention on “New approaches to understanding and modeling global migration trends” (convention 18/23-091).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Derval, G., Docquier, F., Schaus, P. (2020). An Aggregate Learning Approach for Interpretable Semi-supervised Population Prediction and Disaggregation Using Ancillary Data. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Lecture Notes in Computer Science, vol 11908. Springer, Cham. https://doi.org/10.1007/978-3-030-46133-1_40
DOI: https://doi.org/10.1007/978-3-030-46133-1_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46132-4
Online ISBN: 978-3-030-46133-1
eBook Packages: Computer Science, Computer Science (R0)