Using SPARQL and SPIN for Data Quality Management on the Semantic Web

  • Conference paper
Business Information Systems (BIS 2010)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 47)

Abstract

Data quality is a key factor in the performance of information systems, in particular with regard to (1) the number of exceptions in the execution of business processes and (2) the quality of decisions based on the output of the respective information system. Recently, the Semantic Web and Linked Data activities have started to provide substantial data resources that may be used for real business operations. Hence, it will soon be critical to manage the quality of such data. Unfortunately, a wide range of data quality problems can be observed in Semantic Web data. In this paper, we (1) evaluate how the state of the art in data quality research fits the characteristics of the Web of Data, (2) describe how the SPARQL query language and the SPARQL Inferencing Notation (SPIN) can be used to identify data quality problems in Semantic Web data automatically, entirely within the Semantic Web technology stack, and (3) evaluate our approach.
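
To make the idea of query-based detection of data quality problems more concrete, the following is a minimal sketch and not the implementation from the paper. It assumes a tiny in-memory list of triples and checks for two problem types, missing values and out-of-range values. The `ex:` names, the sample data, the helper functions, and the SPARQL text are all hypothetical and chosen only for illustration; the SPARQL string is never executed and merely indicates what a comparable query might look like.

```python
# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:offer1", "rdf:type", "ex:Offer"),
    ("ex:offer1", "ex:price", "19.99"),
    ("ex:offer2", "rdf:type", "ex:Offer"),   # no ex:price: missing value
    ("ex:offer3", "rdf:type", "ex:Offer"),
    ("ex:offer3", "ex:price", "-5.00"),      # negative price: out of range
]

# Illustrative only (assumed, not taken from the paper): a SPARQL query that
# expresses roughly the same check as find_missing(..., "ex:Offer", "ex:price").
MISSING_PRICE_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?s WHERE {
  ?s a ex:Offer .
  OPTIONAL { ?s ex:price ?price }
  FILTER (!bound(?price))
}
"""


def instances_of(triples, cls):
    """Return the sorted subjects that are typed as the given class."""
    return sorted({s for s, p, o in triples if p == "rdf:type" and o == cls})


def find_missing(triples, cls, prop):
    """Return instances of cls that have no value for prop."""
    subjects_with_prop = {s for s, p, _ in triples if p == prop}
    return [s for s in instances_of(triples, cls) if s not in subjects_with_prop]


def find_below(triples, cls, prop, lower_bound):
    """Return instances of cls with a numeric prop value below lower_bound."""
    members = set(instances_of(triples, cls))
    return sorted(
        {s for s, p, o in triples
         if s in members and p == prop and float(o) < lower_bound}
    )


if __name__ == "__main__":
    print("missing price:", find_missing(TRIPLES, "ex:Offer", "ex:price"))
    print("negative price:", find_below(TRIPLES, "ex:Offer", "ex:price", 0.0))
```

In a real Semantic Web setting, such checks would be phrased as queries against an RDF store rather than as Python loops; the sketch only shows the shape of the rule (a class, a property, and a condition that flags violating instances).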



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fürber, C., Hepp, M. (2010). Using SPARQL and SPIN for Data Quality Management on the Semantic Web. In: Abramowicz, W., Tolksdorf, R. (eds) Business Information Systems. BIS 2010. Lecture Notes in Business Information Processing, vol 47. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12814-1_4

  • DOI: https://doi.org/10.1007/978-3-642-12814-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12813-4

  • Online ISBN: 978-3-642-12814-1

  • eBook Packages: Computer Science, Computer Science (R0)
