The Potential for Big Data to Improve Neighborhood-Level Census Data

Spielman, Seth E.

doi:10.1007/978-3-319-40902-3_6

Part of the book series: Springer Geography (SPRINGERGEOGR)

Abstract

The promise of “big data” for those who study cities is that it offers new ways of understanding urban environments and processes. Big data exists within broader national data economies, and these economies have changed in ways that are both poorly understood by the average data consumer and of significant consequence for the application of data to urban problems. For example, the quality of high-resolution demographic and economic data from the United States Census Bureau has declined since 2010 by some key measures. For some policy-relevant variables, like the number of children under 5 in poverty, the estimates are almost unusable. Of the 56,204 census tracts for which a childhood poverty estimate was available in the 2007–2011 American Community Survey (ACS), 40,941 (72.8% of tracts) had a margin of error greater than the estimate. For example, the ACS indicates that Census Tract 196 in Brooklyn, NY has 169 children under 5 in poverty ±174 children, suggesting that somewhere between 0 and 343 children in the area live in poverty. While big data is exciting and novel, basic questions about American cities are all but unanswerable in the current data economy. Here we highlight the potential for data fusion strategies, which combine novel forms of big data with traditional federal surveys, to develop usable data that allows effective understanding of intra-urban demographic and economic patterns. This paper outlines the methods used to construct neighborhood-level census data and suggests key points of technical intervention where “big” data might be used to improve the quality of neighborhood-level statistics.
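The arithmetic behind the figures quoted in the abstract can be reproduced in a few lines. The following is a minimal sketch, not the chapter's own code: the function names are hypothetical, and the coefficient of variation assumes that the published ACS margin of error corresponds to a 90% confidence level (z = 1.645).

```python
# Figures quoted in the abstract (2007-2011 ACS, children under 5 in poverty).
TRACTS_WITH_ESTIMATE = 56_204
TRACTS_MOE_EXCEEDS_ESTIMATE = 40_941


def moe_interval(estimate, moe):
    """Range implied by estimate +/- moe; a count cannot fall below zero."""
    return max(0, estimate - moe), estimate + moe


def coefficient_of_variation(estimate, moe, z=1.645):
    """Standard error divided by the estimate.

    Assumption: the margin of error is a 90% interval, so SE = MOE / 1.645.
    """
    return (moe / z) / estimate


share = TRACTS_MOE_EXCEEDS_ESTIMATE / TRACTS_WITH_ESTIMATE
low, high = moe_interval(169, 174)       # Census Tract 196, Brooklyn, NY
cv = coefficient_of_variation(169, 174)

print(f"Tracts where the margin of error exceeds the estimate: {share:.1%}")
print(f"Implied range for Tract 196: {low} to {high} children")
print(f"Coefficient of variation for Tract 196: {cv:.2f}")
```

Under these assumptions the standard error of the Tract 196 estimate is well over half the size of the estimate itself, which is one way to express why such figures are described as almost unusable.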

Notes

1. Defining big data is difficult; most existing definitions include some number of V’s (see Laney 2001). All are satisfactory for our purposes here. We use the term to distinguish between census/survey data, which we see as “designed” measurement instruments, and big data, which we see as “accidental” measurement instruments.
2. We use the terms “fine” and “high” resolution to refer to census tract or smaller geographies; these data are commonly conceived of as “neighborhood-scale” data. We conceive of resolution in the spatial sense: higher/finer resolution means a smaller census tabulation unit. However, the geographic scale of high-resolution census units is a function of population density.
3. The Census Bureau is generally not estimating the “average” value; it is estimating the “total” value of coins in the jar. Repeatedly grabbing five coins and computing the average will, over many samples, give a very precise estimate of the average value, but it will give no information on the total value. To get the total value, you need a good estimate of the average AND a good estimate of the total number of coins in the jar. The loss of contemporaneous population controls caused by decoupling the ACS from the Decennial enumeration means that the Census Bureau does not have information about the number of coins in the jar. This is discussed in more detail later.
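The coin-jar argument in Note 3 can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a description of ACS estimation: the jar contents, the sample size of five, the number of draws, and the 20% error in the assumed coin count are all hypothetical values chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical jar: 1,000 coins with values in cents.
jar = [rng.choice([1, 5, 10, 25]) for _ in range(1_000)]
true_mean = statistics.fmean(jar)
true_total = sum(jar)

# Repeatedly grab five coins and average the sample means.
sample_means = [statistics.fmean(rng.sample(jar, 5)) for _ in range(20_000)]
est_mean = statistics.fmean(sample_means)


def estimate_total(mean_value, coin_count):
    """A total is the average value times the number of coins."""
    return mean_value * coin_count


# With an accurate coin count (a good "population control") the total is fine.
good_total = estimate_total(est_mean, len(jar))
# With a coin count that is 20% too high, the total inherits that error,
# however precisely the average was estimated.
bad_total = estimate_total(est_mean, int(len(jar) * 1.2))

print(f"true mean {true_mean:.2f}, estimated mean {est_mean:.2f}")
print(f"true total {true_total}, total with accurate count {good_total:.0f}")
print(f"total with miscounted jar {bad_total:.0f}")
```

The estimated average converges on the true average, but the estimated total is only as good as the coin count supplied to it, which is the role that contemporaneous population controls play in the note's analogy.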

References

Alexander CH (2002) Still rolling: Leslie Kish’s “rolling samples” and the American Community Survey. Surv Methodol 28(1):35–42
Anderson MJ, Citro C, Salvo JJ (2011) Encyclopedia of the US Census: from the Constitution to the American Community Survey. CQ Press, Washington, DC
Fay RE, Train GF (1995) Aspects of survey and model-based postcensal estimation of income and poverty characteristics for states and counties. In: Proceedings of the Government Statistics Section, American Statistical Association, pp 154–159
Fuller WA (2011) Sampling statistics. Wiley, Hoboken, NJ
Greenfield A (2013) Against the smart city (The city is here for you to use). Kindle edition, 152 pages
National Research Council (2013) Nonresponse in social science surveys: a research agenda. In: Tourangeau R, Plewes TJ (eds) Panel on a research agenda for the future of social science data collection, Committee on National Statistics. Division of Behavioral and Social Sciences and Education. The National Academies Press, Washington, DC
Kish L (1990) Rolling samples and censuses. Surv Methodol 16(1):63–79
Kish L (2002) Combining multipopulation statistics. J Stat Plan Inference 102(1):109–118
Kitchin (2014) Big Data & Society 1(1):2053951714528481. doi:10.1177/2053951714528481
Little RJ (2012) Calibrated Bayes: an alternative inferential paradigm for official statistics. J Off Stat 28(3):309–372
MacDonald H (2006) The American Community Survey: warmer (more current), but fuzzier (less precise) than the decennial census. J Am Plan Assoc 72(4):491–503
Navarro F (2012) An introduction to ACS statistical methods and lessons learned. Measuring people in place conference, Boulder, CO. http://www.colorado.edu/ibs/cupc/workshops/measuring_people_in_place/themes/theme1/asiala.pdf. Accessed 30 Dec 2012
Porter AT, Holan SH, Wikle CK, Cressie N (2014) Spatial Fay-Herriot models for small area estimation with functional covariates. Spat Stat 10:27–42
Rao JNK (2003) Small area estimation, vol 327. Wiley-Interscience, New York
Särndal C-E (1992) Model assisted survey sampling. Springer Science & Business Media, New York
Spielman SE, Folch DC (2015) Reducing uncertainty in the American Community Survey through data-driven regionalization. PLoS One 10(2):e0115626
Starsinic M (2005) American Community Survey: improving reliability for small area estimates. In: Proceedings of the 2005 Joint Statistical Meetings on CD-ROM, pp 3592–3599
Starsinic M, Tersine A (2007) Analysis of variance estimates from American Community Survey multiyear estimates. In: Proceedings of the section on survey research methods. American Statistical Association, Alexandria, VA, pp 3011–3017
U.S. Census Bureau (2009a) Design and methodology. American Community Survey. U.S. Government Printing Office, Washington, DC

Author information

Authors and Affiliations

University of Colorado, Boulder, CO, USA
Seth E. Spielman

Corresponding author

Correspondence to Seth E. Spielman.

Editor information

Editors and Affiliations

Urban Studies and Urban Big Data Centre, University of Glasgow, Glasgow, United Kingdom
Piyushimita (Vonu) Thakuriah
Department of Urban Planning and Policy, University of Illinois at Chicago, Chicago, Illinois, USA
Nebiyou Tilahun
Department of Urban Planning and Policy, University of Illinois at Chicago, Chicago, Illinois, USA
Moira Zellner

About this chapter

Cite this chapter

Spielman, S.E. (2017). The Potential for Big Data to Improve Neighborhood-Level Census Data. In: Thakuriah, P., Tilahun, N., Zellner, M. (eds) Seeing Cities Through Big Data. Springer Geography. Springer, Cham. https://doi.org/10.1007/978-3-319-40902-3_6

DOI: https://doi.org/10.1007/978-3-319-40902-3_6
Published: 08 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40900-9
Online ISBN: 978-3-319-40902-3
eBook Packages: Earth and Environmental Science, Earth and Environmental Science (R0)