Quality evaluation of postal address datasets measuring their autocorrelation

GeoJournal

Abstract

Many spatial applications related to land and titles, such as land use management, registration, and utility and health services, use postal addresses as their main or supplementary georeferencing method. Evaluating the quality of postal address datasets is important when controlling changes caused by manipulations (such as additions or updates), when comparing datasets, and when merging them. Merging is one of the main strategies of developing countries like Iran for forming a unified addressing structure and database. Unlike the formal methods of postal address qualification, which are based on address matching and are costly and time consuming, the method proposed in this paper evaluates postal address quality without requiring preprocessing such as standardization or ancillary data such as streets and their addressing schemes. The proposed method measures the autocorrelation of the content of a postal address dataset, where a higher level of autocorrelation indicates more standardization and less spatial sparsity of the addresses. The method processes the adjacency graph formed by measuring the Damerau–Levenshtein distance between the records of a postal address dataset. An evaluation of 5 statistics for 4 postal address datasets of Tehran, Iran, shows that two of them could be used: the cumulative frequency of values and the maximum size of the components (sub-graphs) in the adjacency graph. Both statistics show stable S-shaped patterns, and the threshold at the first extremum of their second derivative represents the desired quality of a postal address dataset. The results show that the measured threshold of a postal address dataset corresponds with the topological structure of the streets that cover its addresses. The method can also define the characteristics of a standard address structure for one or more postal address datasets: the results suggest 5 components for the standard address of the evaluated datasets, which is the same as the number of components defined in the Iranian national structure of postal addresses.
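The pipeline summarized in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation. The abstract does not say which Damerau–Levenshtein variant, thresholds, or extremum test the paper uses, so the choices here are assumptions: the restricted "optimal string alignment" variant, an edge between two records whenever their distance is at most a given threshold, and a simple discrete second difference. The function names `osa_distance`, `max_component_size`, and `first_second_derivative_extremum` are hypothetical.

```python
from itertools import combinations


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                # transposition of two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def max_component_size(records, threshold):
    """Largest connected component when records within `threshold` are linked.

    Assumption: an edge joins two records if their distance <= threshold.
    Every pair is compared, so the cost grows quadratically with dataset size.
    """
    parent = list(range(len(records)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(records)), 2):
        if osa_distance(records[i], records[j]) <= threshold:
            parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(records)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


def first_second_derivative_extremum(values):
    """Index in `values` of the first local extremum of the discrete
    second difference, or None if there is none."""
    second = [
        values[k + 1] - 2 * values[k] + values[k - 1]
        for k in range(1, len(values) - 1)
    ]
    for k in range(1, len(second) - 1):
        is_max = second[k] > second[k - 1] and second[k] >= second[k + 1]
        is_min = second[k] < second[k - 1] and second[k] <= second[k + 1]
        if is_max or is_min:
            return k + 1  # second[k] is centred on values[k + 1]
    return None
```

Under these assumptions, one would compute `max_component_size(records, t)` for increasing thresholds `t`, collect the results into a curve, and pass that curve to `first_second_derivative_extremum` to locate a candidate quality threshold. Real address datasets would likely need a blocking or indexing step to avoid the all-pairs comparison.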



Corresponding author

Correspondence to Hani Rezayan.


Cite this article

Rezayan, H., Sadidi, J. & Hosseini, V. Quality evaluation of postal address datasets measuring their autocorrelation. GeoJournal 84, 1617–1625 (2019). https://doi.org/10.1007/s10708-018-9940-x


  • DOI: https://doi.org/10.1007/s10708-018-9940-x
