Abstract
Lead poisoning causes serious health problems, and the effects are more severe the younger the victim. The US government and society have worked to prevent lead poisoning, especially since the 1970s; nevertheless, lead exposure remains prevalent. Analyses of lead poisoning frequently use georeferenced blood lead level data. Like other types of data, these spatial data may contain uncertainties, such as location error and attribute measurement error, which can propagate to analysis results. This paper employs simulation experiments to investigate how selected uncertainties affect regression analyses of blood lead level data in Syracuse, New York. In these simulations, location error, attribute measurement error, and a combination of the two are embedded into the original data, and the resulting data are then aggregated into census block group and census tract polygons. The aggregated data are analyzed with regression techniques, and the regression coefficients and their standard errors from the error-added simulations are compared with those from the original data. To account for spatial autocorrelation, the eigenvector spatial filtering method and spatial autoregressive specifications are used with linear and generalized linear models. Our findings confirm that location error has a greater impact on these differences than attribute measurement error does, and show that the combined error leads to the greatest deviations. The location error simulations show that smaller administrative units are more strongly affected by location error and, interestingly, that coefficients and standard errors deviate more from their true values for a variable with a low level of spatial autocorrelation. These results imply that uncertainty, especially location error, has a considerable impact on the reliability of spatial analysis results for public health data, and that the level of spatial autocorrelation in a variable also affects modeling results.
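The simulation design summarized in the abstract can be illustrated with a small synthetic sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: square grid cells stand in for census block groups and tracts, the data are randomly generated rather than Syracuse blood lead levels, both error types are modeled as Gaussian noise, and plain ordinary least squares replaces the eigenvector spatial filtering and spatial autoregressive specifications. The function names (`cell_index`, `aggregate_mean`, `ols`, `simulate`) and all parameter values are hypothetical.

```python
import numpy as np

def cell_index(xy, n_cells):
    """Map points in the unit square to square grid cells.
    The grid is a hypothetical stand-in for census polygons."""
    ij = np.clip(np.floor(xy * n_cells).astype(int), 0, n_cells - 1)
    return ij[:, 0] * n_cells + ij[:, 1]

def aggregate_mean(cell, values, n_total):
    """Mean of point values per cell; empty cells become NaN."""
    counts = np.bincount(cell, minlength=n_total)
    sums = np.bincount(cell, weights=values, minlength=n_total)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts

def ols(x, y):
    """OLS coefficients and standard errors for y = b0 + b1 * x."""
    ok = ~np.isnan(y)
    X = np.column_stack([np.ones(ok.sum()), x[ok]])
    beta, *_ = np.linalg.lstsq(X, y[ok], rcond=None)
    resid = y[ok] - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

def simulate(n_cells=10, n_points=5000, loc_sd=0.0, attr_sd=0.0, seed=0):
    """Embed location and/or attribute error, aggregate, then regress."""
    rng = np.random.default_rng(seed)
    # Synthetic areal covariate with no spatial autocorrelation (assumption).
    x_cell = rng.normal(size=n_cells ** 2)
    xy = rng.uniform(size=(n_points, 2))          # true point locations
    true_cell = cell_index(xy, n_cells)
    # Synthetic point-level response tied to the covariate of the true cell.
    level = 3.0 + 1.0 * x_cell[true_cell] + rng.normal(scale=0.5, size=n_points)
    # Location error: Gaussian displacement of the recorded coordinates.
    xy_obs = xy + rng.normal(scale=loc_sd, size=xy.shape)
    # Attribute measurement error: Gaussian noise on the recorded value.
    level_obs = level + rng.normal(scale=attr_sd, size=n_points)
    y_cell = aggregate_mean(cell_index(xy_obs, n_cells), level_obs, n_cells ** 2)
    return ols(x_cell, y_cell)

if __name__ == "__main__":
    scenarios = {
        "original": {},
        "location error": {"loc_sd": 0.1},
        "attribute error": {"attr_sd": 2.0},
        "combined error": {"loc_sd": 0.1, "attr_sd": 2.0},
    }
    for name, kwargs in scenarios.items():
        beta, se = simulate(**kwargs)
        print(f"{name:16s} slope={beta[1]:.3f} se={se[1]:.3f}")
```

Because the same seed is used in every scenario, the underlying points and true values are identical and only the injected error differs, which mirrors the comparison of error-added results against the original results. Changing `n_cells` gives a rough analogue of switching between smaller and larger aggregation units.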
Acknowledgements
This research was supported by the National Institutes of Health, Grant 1R01HD076020-01A1; any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the National Institutes of Health.
Cite this article
Lee, M., Chun, Y. & Griffith, D.A. Error propagation in spatial modeling of public health data: a simulation approach using pediatric blood lead level data for Syracuse, New York. Environ Geochem Health 40, 667–681 (2018). https://doi.org/10.1007/s10653-017-0014-7
DOI: https://doi.org/10.1007/s10653-017-0014-7