QualESTIM: Interactive Quality Assessment of Socioeconomic Data Using Outlier Detection

Bridging the Geographic Information Sciences

Abstract

This paper presents QualESTIM, a platform for exploring socioeconomic statistical data (also called indicators). QualESTIM integrates various outlier detection methods that make it possible to evaluate the logical consistency of a dataset and, ultimately, its quality. Without recourse to ‘ground truth’ of any kind, data values are compared to various spatiotemporal distributions given by statistical models. However, an outlier is not necessarily an error: experts should always interpret the outlying value. We therefore argue that such a quality assessment process has to be interactive, and that the metadata associated with the data should be made available in order to refine the analysis. Dedicated to the detection of outliers and their visualization by an expert, the platform is connected to a database that contains both the data and their metadata, structured according to an ISO 19115 profile. A case study illustrates the value of this approach.
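
As a rough illustration of flagging values for expert review rather than rejecting them, the sketch below applies Tukey's boxplot rule to a single indicator. This is not QualESTIM's implementation: the abstract does not say which detection methods are used, and the function names, the fence multiplier `k`, and the unemployment figures are all hypothetical.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(indicator, k=1.5):
    """Map each territorial unit to True when its value lies outside the fences.

    A flag marks a candidate for expert interpretation; it is not a verdict
    that the value is an error.
    """
    lower, upper = tukey_fences(list(indicator.values()), k)
    return {unit: not (lower <= value <= upper)
            for unit, value in indicator.items()}


# Invented unemployment rates (%) for seven hypothetical territorial units.
rates = {"A": 7.1, "B": 6.8, "C": 7.4, "D": 6.9, "E": 7.0, "F": 19.5, "G": 7.2}
flags = flag_outliers(rates)
candidates = sorted(unit for unit, flagged in flags.items() if flagged)
print(candidates)
```

Keeping the result as a per-unit flag, instead of filtering the data, leaves the decision to the expert, who can consult the metadata before treating a flagged value as an error.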

References

  • Beard, M. K., Buttenfield, B. P. and Clapham S. B. (1991). NCGIA Research Initiative 7: Visualization of Spatial Data Quality. NCGIA Technical Paper 91-26.

  • Bivand, R. S., Pebesma, E. J., and Gómez-Rubio, V. (2008) Applied Spatial Data Analysis with R, XIV, 378 p., Springer.

  • Brunsdon C, Fotheringham S, Charlton M (2007). “Geographically Weighted Discriminant Analysis.” Geographical Analysis, 39(4), pp. 376–396.

  • Caussinus H, Ruiz A (1990). “Interesting Projections of Multidimensional Data by Means of Generalized Principal Components Analysis.” In COMPSTAT90, pp. 121–126. Physica-Verlag, Heidelberg, Germany.

  • Chrisman, N. R., (1984) The role of quality information in the long-term functioning of a geographic information system. Cartographica, 21, pp. 79-87.

  • Chrisman, N. R., (1991) The error component in spatial data. In Longley, P. A. & Goodchild, M. F. & Maguire, D. J. & Rhind, D. W., editors, Geographic Information Systems and Science, pp. 165-174. Longman Scientific and Technical.

  • Clarke, D.G., and Clark, D.M., (1995), Lineage. In Guptill S.C. & Morrison J.L., editors, Elements of spatial data quality, pp. 13–30. Oxford, Elsevier.

  • Cheng, T., and Li, Z., (2006) A multi-scale approach for spatial-temporal outlier detection, Transactions in GIS, 10(2), pp. 253-263.

  • Daniel F., Casati F., Palpanas, T., Chayka O., and Cinzia C., (2008) Enabling Better Decisions through Quality-aware Reports. In: International Conference on Information Quality (ICIQ).

  • Dean P., and Sundgren B., (1996) Quality Aspects of a Modern Database Service. In: Proc. of the 8th Int. Conf. on Scientific and Statistical Database Management, SSDBM’96, pp. 156-161.

  • Gotway, C., and Young, L., (2002) “Combining incompatible spatial data”, Journal of the American Statistical Association, 97(458), pp. 632-648.

  • Grasland, C., and Gensel, J., (2010) ESPON 2013 Database, Final Report, December 2010.

  • Grubbs, F. E., (1969) Procedures for detecting outlying observations in samples. Technometrics (11), pp. 1–21.

  • Harris, P. and Charlton, M., (2010) “Spatial analysis for quality control, phase 1: The identification of logical input errors and statistical outliers”, The ESPON Monitoring Committee, Tech. Rep., Esch-sur-Alzette, Luxembourg.

  • International Organization for Standardisation. Technical Committee 211, (2002) Geographic Information - Quality principles - ISO 19113.

  • International Organization for Standardisation. Technical Committee 211, (2003) Geographic Information - Quality evaluation procedures - ISO 19114.

  • International Organization for Standardisation. Technical Committee 211, (2003) Geographic Information -- Metadata - ISO 19115.

  • International Organization for Standardisation. Technical Committee 211, (2006) Geographic Information – Data quality measures - ISO 19138.

  • International Organization for Standardisation. Technical Committee 211, (2011) Geographic Information -- Data quality - ISO 19157.

  • Kubik, K., Lyons, K., Merchant, D. (1988) Photogrammetric work without blunders. Photogrammetric Engineering and Remote Sensing 54: 51-4.

  • Monmonier, M., (1989), Geographic brushing: enhancing exploratory analysis of the scatterplot matrix. Geographical Analysis, 21, pp. 81–84.

  • Plumejeaud, C., Gensel, J., and Villanova-Oliver, M., (2010) Opérationnalisation d’un profil ISO 19115 pour des métadonnées socio-économiques, INFORSID Marseille, May 25-28.

  • Plumejeaud C., Mathian H., Gensel J., and Grasland C., (2011), Spatio-temporal analysis of territorial changes from a multi-scale perspective, International Journal of Geographical Information Science, 25(11), pp. 1597-1612.

  • Rousseeuw, P. and Leroy, A., (1996) Robust Regression and Outlier Detection. John Wiley & Sons, 3rd edition.

  • Shneiderman, B., (1996), “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations”, Proceedings of the 1996 IEEE Symposium on Visual Languages, pp. 336-344, Washington, DC, USA.

  • Servigne, S., Lesage, N. and Libourel, T. (2010) Quality Components, Standards, and Metadata, in Fundamentals of Spatial Data Quality (eds R. Devillers and R. Jeansoulin), 2010, ISTE, London, UK.

  • Tukey, J., (1977), Exploratory data analysis, Addison Wesley Longman Publishing Co., Inc., 688 p.

  • UN/ECE. (1995) Guidelines for the Modelling of Statistical Data and Metadata. Technical report, UN/ECE, New York, Geneva.

  • Wand, Y., and Wang, R.Y. (1996) Anchoring Data Quality Dimensions in Ontological Foundations. In: Communications of the ACM, pp. 86–95.

Acknowledgements

The research presented in this paper was supported by the ESPON 2013 Database project of the European Spatial Planning and Observation Network for Territorial Cohesion. We would like to thank Claude Grasland for his advice, as well as Martin Charlton and Paul Harris, who provided the implementation of the outlier detection methods in R. We also thank the reviewers for their comments, which helped improve the paper.

Author information

Corresponding author

Correspondence to Christine Plumejeaud.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Plumejeaud, C., Villanova-Oliver, M. (2012). QualESTIM: Interactive Quality Assessment of Socioeconomic Data Using Outlier Detection. In: Gensel, J., Josselin, D., Vandenbroucke, D. (eds) Bridging the Geographic Information Sciences. Lecture Notes in Geoinformation and Cartography. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29063-3_8
