Canadian Journal of Public Health, Volume 107, Issue 4–5, pp e424–e430

Misclassification errors from postal code-based geocoding to assign census geography in Nova Scotia, Canada

  • Mikiko Terashima
  • George Kephart
Quantitative Research

Abstract

OBJECTIVES: Postal codes are often the only geographic identifiers available in Canadian health data sources. To support geographic analyses, they are routinely geocoded to census geography so that records can be linked to ecological data. Despite the common use of this method, the extent of the resulting geographic misclassification errors is poorly understood. We estimated misclassification errors in the geocoding of postal codes to assign census geography in Nova Scotia, Canada.

METHODS: We compared postal code-geocoded locations of buildings in Nova Scotia with their actual locations, examining differences in counts and match rates at two census administrative area levels: dissemination areas (DAs) and census subdivisions (CSDs). Actual locations came from provincial government data recording the latitude and longitude of buildings. Variation in misclassification by rurality, using Statistics Canada’s classification, was also assessed.
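As a rough illustration of the two measures named above, the following Python sketch computes a per-area count difference and match rate from paired area assignments. The abstract does not give the exact formulas, so the definitions here are assumptions: the count difference is the percent difference between the number of buildings assigned to an area by postal code geocoding and the number actually located there, and the match rate is the share of buildings actually in an area that the geocode places in that same area. The function name, area labels, and toy data are hypothetical.

```python
from collections import Counter


def da_misclassification(records):
    """Summarize misclassification per area.

    records: iterable of (actual_area, geocoded_area) pairs, one per building.
    Returns {area: (pct_count_diff, match_rate)} for every area that has at
    least one building actually located in it. Both values are percentages.
    """
    records = list(records)
    actual = Counter(a for a, _ in records)        # buildings truly in each area
    geocoded = Counter(g for _, g in records)      # buildings assigned by postal code
    matched = Counter(a for a, g in records if a == g)  # assigned to the right area

    summary = {}
    for area, n_actual in actual.items():
        # Assumed definition: signed percent difference relative to the actual count.
        pct_diff = 100.0 * (geocoded[area] - n_actual) / n_actual
        # Assumed definition: percent of actual buildings geocoded to the same area.
        match_rate = 100.0 * matched[area] / n_actual
        summary[area] = (pct_diff, match_rate)
    return summary


# Hypothetical toy data: four buildings truly in DA1, two truly in DA2.
buildings = [
    ("DA1", "DA1"), ("DA1", "DA1"), ("DA1", "DA2"), ("DA1", "DA2"),
    ("DA2", "DA2"), ("DA2", "DA2"),
]
result = da_misclassification(buildings)
for area, (pct_diff, match_rate) in sorted(result.items()):
    print(f"{area}: count difference {pct_diff:+.1f}%, match rate {match_rate:.1f}%")
```

In this toy case DA2 has a perfect match rate yet its count is inflated by buildings misassigned from DA1, which suggests why the two measures are worth reporting separately.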

RESULTS: Outside two urban areas (Halifax Metro and Sydney), which had <10% differences in counts, many DAs had >30% differences. Match rates showed similar patterns, with the vast majority of non-urban DAs having <40% match rates. Even in major urban areas, 10% of DAs had large misclassification errors. Misclassification errors at the CSD level were still too great to estimate counts or rates without further area aggregation.

CONCLUSION: Routine use of postal code geocoding should be replaced with geocoding of location information using additional identifiers such as civic addresses or latitude and longitude. If data holders did this in-house before providing data to researchers, the accuracy and capacity of geographic analysis would be enhanced while protecting confidentiality.

Key words

Geocoding; postal code; data linkage; small-area analysis; population health

Résumé

OBJECTIFS: Les codes postaux sont souvent les seuls identifiants géographiques disponibles dans de nombreuses sources de données sanitaires au Canada. Afin de procéder à des analyses géographiques, les codes postaux sont habituellement géocodés à la géographie du recensement pour être reliés aux données écologiques. Bien que ce soit une méthode couramment utilisée, on connaît mal l’étendue des erreurs de classification géographique. Nous avons estimé les erreurs de classification dans le géocodage des codes postaux pour fins d’association à la géographie du recensement en Nouvelle-Écosse, au Canada.

MÉTHODE: Nous avons examiné les écarts entre les numérations et les taux d’appariement d’emplacements géocodés selon le code postal et d’emplacements réels de bâtiments en Nouvelle-Écosse à deux niveaux de régions administratives du recensement: les aires de diffusion (AD) et les subdivisions de recensement (SDR). Les emplacements réels ont été déterminés selon les données recueillies par le gouvernement provincial indiquant la latitude et la longitude réelles des bâtiments. Nous avons aussi évalué la variation des erreurs de classification par ruralité à l’aide de la classification de Statistique Canada.

RÉSULTATS: Sauf dans deux agglomérations urbaines (Sydney et la région métropolitaine de Halifax) où il y avait <10 % d’écarts dans les numérations, beaucoup d’AD affichaient des écarts >30 %. Les tendances étaient semblables pour les taux d’appariement: la très grande majorité des AD non urbaines affichaient des taux d’appariement <40 %. Même dans les grandes agglomérations urbaines, 10 % des AD comportaient d’importantes erreurs de classification. Les erreurs de classification à l’échelle des SDR étaient encore trop importantes pour estimer les numérations ou les taux sans un regroupement plus poussé des zones.

CONCLUSION: L’utilisation habituelle du géocodage par code postal devrait être remplacée par le géocodage de l’information de localisation à l’aide d’identifiants supplémentaires, comme les adresses de voirie ou la latitude et la longitude. Si les détenteurs de données faisaient cela à l’interne avant de fournir leurs données aux chercheurs, l’exactitude et la capacité des analyses géographiques seraient rehaussées, et la confidentialité des données serait protégée.

Mots clés

géocodage; code postal; couplage de données; analyse de données régionales; santé des populations


Copyright information

© The Canadian Public Health Association 2016

Authors and Affiliations

  1. School of Planning, Department of Community Health and Epidemiology, Healthy Populations Institute, Dalhousie University, Halifax, Canada
  2. Department of Community Health and Epidemiology, Dalhousie University, Halifax, Canada
