A Core Calculus for Provenance

  • Umut A. Acar
  • Amal Ahmed
  • James Cheney
  • Roly Perera
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7215)


Provenance is an increasing concern due to the revolution in sharing and processing scientific data on the Web and in other computer systems. It is proposed that many computer systems will need to become provenance-aware in order to provide satisfactory accountability, reproducibility, and trust for scientific or other high-value data. To date, there is not a consensus concerning appropriate formal models or security properties for provenance. In previous work, we introduced a formal framework for provenance security and proposed formal definitions of properties called disclosure and obfuscation

This paper develops a core calculus for provenance in programming languages. Whereas previous models of provenance have focused on special-purpose languages such as workflows and database queries, we consider a higher-order, functional language with sums, products, and recursive types and functions. We explore the ramifications of using traces based on operational derivations for the purpose of comparing other forms of provenance.We design a rich class of provenance views over traces. Finally, we prove relationships among provenance views and develop some solutions to the disclosure and obfuscation problems.


Execution Trace Functional Language Dynamic Semantic Primitive Operation Provenance Information 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. 1.
    Acar, U.A., Blelloch, G.E., Harper, R.: Adaptive functional programming. ACM Trans. Program. Lang. Syst. 28(6), 990–1034 (2006)CrossRefGoogle Scholar
  2. 2.
    Bohannon, A., Foster, J.N., Pierce, B.C., Pilkiewicz, A., Schmitt, A.: Boomerang: resourceful lenses for string data. In: POPL, pp. 407–419. ACM, New York (2008)CrossRefGoogle Scholar
  3. 3.
    Bose, R., Frew, J.: Lineage retrieval for scientific data processing: a survey. ACM Comput. Surv. 37(1), 1–28 (2005)CrossRefGoogle Scholar
  4. 4.
    Buneman, P., Cheney, J., Tan, W.-C., Vansummeren, S.: Curated databases. In: PODS, pp. 1–12 (2008)Google Scholar
  5. 5.
    Buneman, P., Cheney, J., Vansummeren, S.: On the expressiveness of implicit provenance in query and update languages. ACM Transactions on Database Systems 33(4), 28 (2008)CrossRefGoogle Scholar
  6. 6.
    Buneman, P., Khanna, S., Tan, W.-C.: Why and Where: A Characterization of Data Provenance. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, pp. 316–330. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  7. 7.
    Carey, S., Rogow, G.: UAL shares fall as old story surfaces online. Wall Street Journal (September 2008),
  8. 8.
    Cheney, J.: A formal framework for provenance security. In: CSF, pp. 281–293. IEEE (2011)Google Scholar
  9. 9.
    Cheney, J., Ahmed, A., Acar, U.A.: Provenance as dependency analysis. Mathematical Structures in Computer Science 21(6), 1301–1337 (2011)zbMATHCrossRefGoogle Scholar
  10. 10.
    Cheney, J., Chiticariu, L., Tan, W.C.: Provenance in databases: Why, how, and where. Foundations and Trends in Databases 1(4), 379–474 (2009)CrossRefGoogle Scholar
  11. 11.
    Cheney, J., Chong, S., Foster, N., Seltzer, M., Vansummeren, S.: Provenance: A future history. In: OOPSLA Companion (Onward! 2009), pp. 957–964 (2009)Google Scholar
  12. 12.
    Chong, S.: Towards semantics for provenance security. In: Workshop on the Theory and Practice of Provenance (2009), Informal online proceedings:
  13. 13.
    Cirillo, A., Jagadeesan, R., Pitcher, C., Riely, J.: Tapido: Trust and Authorization Via Provenance and Integrity in Distributed Objects (Extended Abstract). In: Gairing, M. (ed.) ESOP 2008. LNCS, vol. 4960, pp. 208–223. Springer, Heidelberg (2008)CrossRefGoogle Scholar
  14. 14.
    Davidson, S.B., Freire, J.: Provenance and scientific workflows: challenges and opportunities. In: SIGMOD, New York, NY, USA, pp. 1345–1350 (2008)Google Scholar
  15. 15.
    Davidson, S.B., Khanna, S., Milo, T., Panigrahi, D., Roy, S.: Provenance views for module privacy. In: PODS, pp. 175–186 (2011)Google Scholar
  16. 16.
    Dimoulas, C., Findler, R.B., Flanagan, C., Felleisen, M.: Correct blame for contracts: no more scapegoating. In: POPL, pp. 215–226. ACM, New York (2011)Google Scholar
  17. 17.
    Foster, J.N., Green, T.J., Tannen, V.: Annotated XML: queries and provenance. In: PODS, pp. 271–280 (2008)Google Scholar
  18. 18.
    Green, T.J., Karvounarakis, G., Tannen, V.: Provenance semirings. In: PODS, pp. 31–40 (2007)Google Scholar
  19. 19.
    Guts, N., Fournet, C., Zappa Nardelli, F.: Reliable Evidence: Auditability by Typing. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 168–183. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  20. 20.
    Hidders, J., Kwasnikowska, N., Sroka, J., Tyszkiewicz, J., Van den Bussche, J.: A Formal Model of Dataflow Repositories. In: Cohen-Boulakia, S., Tannen, V. (eds.) DILS 2007. LNCS (LNBI), vol. 4544, pp. 105–121. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  21. 21.
    Jia, L., Vaughan, J.A., Mazurak, K., Zhao, J., Zarko, L., Schorr, J., Zdancewic, S.: Aura: a programming language for authorization and audit. In: ICFP, New York, NY, USA, pp. 27–38 (2008)Google Scholar
  22. 22.
    Moreau, L.: The foundations for provenance on the web. Foundations and Trends in Web Science 2(2-3) (2010)Google Scholar
  23. 23.
    Moreau, L., et al.: The open provenance model core specification (v1.1). Future Generation Computer Systems 27(6), 743–756 (2010)CrossRefGoogle Scholar
  24. 24.
    Simmhan, Y., Plale, B., Gannon, D.: A survey of data provenance in e-science. SIGMOD Record 34(3), 31–36 (2005)CrossRefGoogle Scholar
  25. 25.
    Swamy, N., Chen, J., Fournet, C., Strub, P.-Y., Bhargavan, K., Yang, J.: Secure distributed programming with value-dependent types. In: ICFP, pp. 266–278 (2011)Google Scholar
  26. 26.
    Swamy, N., Corcoran, B.J., Hicks, M.: Fable: A language for enforcing user-defined security policies. In: IEEE Symposium on Security and Privacy, pp. 369–383 (2008)Google Scholar
  27. 27.
    Varghese, S.: UK government gets bitten by Microsoft Word. Sydney Morning Herald (July 2003),

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Umut A. Acar
    • 1
  • Amal Ahmed
    • 2
  • James Cheney
    • 3
  • Roly Perera
    • 1
  1. 1.Max Planck Institute for Software SystemsGermany
  2. 2.Indiana UniversityUSA
  3. 3.University of EdinburghUK

Personalised recommendations