Data Provenance: Some Basic Issues

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1974)

Abstract

The ease with which one can copy and transform data on the Web has made it increasingly difficult to determine the origins of a piece of data. We use the term data provenance to refer to the process of tracing and recording the origins of data and its movement between databases. Provenance is now an acute issue in scientific databases, where it is central to the validation of data. In this paper we discuss some of the technical issues that have emerged in an initial exploration of the topic.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Buneman, P., Khanna, S., Tan, WC. (2000). Data Provenance: Some Basic Issues. In: Kapoor, S., Prasad, S. (eds) FST TCS 2000: Foundations of Software Technology and Theoretical Computer Science. FSTTCS 2000. Lecture Notes in Computer Science, vol 1974. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44450-5_6

  • DOI: https://doi.org/10.1007/3-540-44450-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41413-1

  • Online ISBN: 978-3-540-44450-3

  • eBook Packages: Springer Book Archive
