Publishing and Consuming Provenance Metadata on the Web of Linked Data

  • Olaf Hartig
  • Jun Zhao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6378)


The World Wide Web is evolving into a Web of Data: a huge, globally distributed dataspace that contains a rich body of machine-processable information from a virtually unbounded set of providers, covering a wide range of topics. However, due to the openness of the Web, little is known about who created the data and how. The fact that a large amount of the data on the Web is derived by replication, query processing, modification, or merging raises concerns about information quality. Poor-quality data may propagate quickly and contaminate the Web of Data. Provenance information about who created and published the data, and how, provides the means for quality assessment. This paper takes a first step towards creating a quality-aware Web of Data: we present approaches to integrate provenance information into the Web of Data, and we illustrate how this information can be consumed. In particular, we introduce a vocabulary to describe the provenance of Web data as metadata, and we discuss possibilities for making such provenance metadata accessible as part of the Web of Data. Furthermore, we describe how this metadata can be queried and consumed to identify outdated information.


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Olaf Hartig (1)
  • Jun Zhao (2)

  1. Humboldt-Universität zu Berlin, Germany
  2. University of Oxford, UK
