Natural Hazards, Volume 81, Issue 3, pp 1929–1956

A Bayesian machine learning model for estimating building occupancy from open source data

  • Robert Stewart
  • Marie Urban
  • Samantha Duchscherer
  • Jason Kaufman
  • April Morton
  • Gautam Thakur
  • Jesse Piburn
  • Jessica Moehl
Original Paper

Abstract

Understanding building occupancy is critical to a wide array of applications, including natural hazards loss analysis, green building technologies, and population distribution modeling. Because directly monitoring buildings is expensive, scientists also rely on a wide and disparate array of ancillary and open source information, including subject matter expertise, survey data, and remote sensing information. These data are fused using data harmonization methods, a loose collection of formal and informal techniques for combining data into viable content for building occupancy estimation. In this paper, we add to the current state of the art by introducing the population data tables (PDT), a Bayesian model and informatics system for systematically arranging data and harmonization techniques into a consistent, transparent, knowledge learning framework that carries the uncertainty emerging from data, expert judgment, and model parameterization through to the final estimate. PDT aims to estimate ambient occupancy in units of people/1000 ft² for a number of building types at the national and sub-national level, with the goal of providing global coverage. We present the PDT model, situate the work within the larger community, and report on the progress of this multi-year project.

Keywords

Population, Building occupancy, Bayesian, Uncertainty, Open source, Elicitation

Notes

Acknowledgments

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Robert Stewart (1)
  • Marie Urban (1)
  • Samantha Duchscherer (2)
  • Jason Kaufman (2)
  • April Morton (2)
  • Gautam Thakur (1)
  • Jesse Piburn (1)
  • Jessica Moehl (2)

  1. Oak Ridge National Laboratory, Oak Ridge, USA
  2. Oak Ridge Associated Universities, Oak Ridge, USA
