Advertisement

Toward a Theory of Self-explaining Computation

  • James Cheney
  • Umut A. Acar
  • Roly Perera
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8000)

Abstract

Provenance techniques aim to increase the reliability of human judgments about data by making its origin and derivation process explicit. Originally motivated by the needs of scientific databases and scientific computation, provenance has also become a major issue for business and government data on the Web. However, so far provenance has been studied only in relatively restrictive settings: typically, for data stored in databases or scientific workflow systems, and processed by query or workflow languages of limited expressiveness. Long-term provenance solutions require an understanding of provenance in other settings, particularly the general-purpose programming or scripting languages that are used to glue different components such as databases, Web services and workflows together. Moreover, what is required is not only an account of mechanisms for recording provenance, but also a theory of what it means for provenance information to explain or justify a computation. In this paper, we begin to outline a such a theory of self-explaining computation. We introduce a model of provenance for a simple imperative language based on operational derivations and explore its properties.

Keywords

Operational Semantic Derivation Tree Denotational Semantic Primitive Operation Data Provenance 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Abadi, M., Banerjee, A., Heintze, N., Riecke, J.G.: A core calculus of dependency. In: POPL, pp. 147–160. ACM Press (1999)Google Scholar
  2. 2.
    Abadi, M., Lampson, B., Lévy, J.-J.: Analysis and caching of dependencies. In: ICFP, pp. 83–91. ACM Press (1996)Google Scholar
  3. 3.
    Acar, U.A., Ahmed, A., Cheney, J., Perera, R.: A core calculus for provenance. In: Degano, P., Guttman, J.D. (eds.) POST 2012. LNCS, vol. 7215, pp. 410–429. Springer, Heidelberg (2012)Google Scholar
  4. 4.
    Acar, U.A., Blelloch, G.E., Blume, M., Harper, R., Tangwongsan, K.: An experimental analysis of self-adjusting computation. ACM Trans. Prog. Lang. Sys. 32(1), 3:1–3:53 (2009)CrossRefGoogle Scholar
  5. 5.
    Acar, U.A., Blelloch, G.E., Harper, R.: Adaptive functional programming. ACM Trans. Program. Lang. Syst. 28(6), 990–1034 (2006)CrossRefGoogle Scholar
  6. 6.
    Acar, U.A., Buneman, P., Cheney, J., Kwasnikowska, N., Van den Bussche, J., Vansummeren, S.: A graph model of data and workflow provenance. In: TAPP (2010), http://www.usenix.org/event/tapp10
  7. 7.
    Bowers, S., McPhillips, T.M., Riddle, S., Anand, M.K., Ludäescher, B.: Kepler/pPOD: Scientific workflow and provenance support for assembling the tree of life. In: Freire, et al. (eds.) [26], pp. 70–77.Google Scholar
  8. 8.
    Buneman, P., Cheney, J., Vansummeren, S.: On the expressiveness of implicit provenance in query and update languages. ACM Transactions on Database Systems 33(4), 28 (2008)CrossRefGoogle Scholar
  9. 9.
    Buneman, P., Khanna, S., Tan, W.-C.: Why and where: A characterization of data provenance. In: Van den Bussche, J., Vianu, V. (eds.) ICDT 2001. LNCS, vol. 1973, pp. 316–330. Springer, Heidelberg (2000)CrossRefGoogle Scholar
  10. 10.
    Buneman, P., Libkin, L., Suciu, D., Tannen, V., Wong, L.: Comprehension syntax. SIGMOD Record 23(1), 87–96 (1994)CrossRefGoogle Scholar
  11. 11.
    Carey, S., Rogow, G.: UAL shares fall as old story surfaces online. Wall Street Journal (September 2008), http://online.wsj.com/article/SB122088673738010213.html
  12. 12.
    Cheney, J.: Program slicing and data provenance. IEEE Data Engineering Bulletin, 22–28 (December 2007) Invited paperGoogle Scholar
  13. 13.
    Cheney, J.: Causality and the semantics of provenance. In: Proceedings of the 2010 Workshop on Developments in Computational Models (2010)Google Scholar
  14. 14.
    Cheney, J.: A formal framework for provenance security. In: CSF, pp. 281–293. IEEE (2011)Google Scholar
  15. 15.
    Cheney, J., Ahmed, A., Acar, U.A.: Provenance as dependency analysis. Mathematical Structures in Computer Science 21(6), 1301–1337 (2011)MathSciNetCrossRefzbMATHGoogle Scholar
  16. 16.
    Cheney, J., Chiticariu, L., Tan, W.: Provenance in databases: Why, how, and where. Foundations and Trends in Databases 1(4), 379–474 (2009)CrossRefGoogle Scholar
  17. 17.
    Cheney, J., Chong, S., Foster, N., Seltzer, M., Vansummeren, S.: Provenance: A future history. In: OOPSLA Companion (Onward! 2009), pp. 957–964 (2009)Google Scholar
  18. 18.
    Cheney, J., Lindley, S., Wadler, P.: A practical theory of language-integrated query. In: ICFP (to appear, 2013)Google Scholar
  19. 19.
    Chong, S.: Towards semantics for provenance security. In: Workshop on the Theory and Practice of Provenance (2009), Informal online proceedings, http://www.usenix.org/events/tapp09/
  20. 20.
    Cui, Y., Widom, J., Wiener, J.L.: Tracing the lineage of view data in a warehousing environment. ACM Trans. Database Syst. 25(2), 179–227 (2000)CrossRefGoogle Scholar
  21. 21.
    Dezani-Ciancaglini, M., Horne, R., Sassone, V.: Tracing where and who provenance in linked data: A calculus. Theor. Comput. Sci. 464, 113–129 (2012)MathSciNetCrossRefzbMATHGoogle Scholar
  22. 22.
    Dourish, P.: Accounting for System Behaviour: Representation, Reflection and Resourceful Action. In: Computers and Design in Context, pp. 145–170. MIT Press (1997)Google Scholar
  23. 23.
    Foster, I., Vockler, J., Wilde, M., Zhao, Y.: Chimera: A virtual data system for representing, querying, and automating data derivation. In: SSDBM, pp. 1–10 (July 2002)Google Scholar
  24. 24.
    Foster, J.N., Green, T.J., Tannen, V.: Annotated XML: queries and provenance. In: PODS, pp. 271–280 (2008)Google Scholar
  25. 25.
    Foster, J.N., Greenwald, M.B., Moore, J.T., Pierce, B.C., Schmitt, A.: Combinators for bidirectional tree transformations: A linguistic approach to the view-update problem. ACM Trans. Program. Lang. Syst. 29(3), 17 (2007)CrossRefGoogle Scholar
  26. 26.
    Freire, J., Koop, D., Moreau, L. (eds.): IPAW 2008. LNCS, vol. 5272. Springer, Heidelberg (2008)Google Scholar
  27. 27.
    Gil, Y., Cheney, J., Groth, P., Hartig, O., Miles, S., Moreau, L., da Silva, P.P., Coppens, S., Garijo, D., Gomez, J.M., Missier, P., Myers, J., Sahoo, S., Zhao, J.: Provenance XG final report (December 2010), http://www.w3.org/2005/Incubator/prov/XGR-prov-20101214/
  28. 28.
    Good, D.I.: The foundations of computer security: we need some (1986), http://www.ieee-security.org/CSFWweb/goodessay.html
  29. 29.
    Green, T.J., Karvounarakis, G., Tannen, V.: Provenance semirings. In: PODS, pp. 31–40. ACM (2007)Google Scholar
  30. 30.
    Greenwald, G., MacAskill, E.: NSA Prism program taps in to user data of Apple, Google and others. The Guardian (June 2013), http://www.guardian.co.uk/world/2013/jun/06/us-tech-giants-nsa-data
  31. 31.
    Groth, P., Gil, Y., Cheney, J., Miles, S.: Requirements for provenance on the web. International Journal of Digital Curation 7(1), 39–56 (2012)CrossRefGoogle Scholar
  32. 32.
    Groth, P., Miles, S., Munroe, S.: Principles of high quality documentation for provenance: A philosophical discussion. In: Moreau, L., Foster, I. (eds.) IPAW 2006. LNCS, vol. 4145, pp. 278–286. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  33. 33.
    Kahn, G.: Natural semantics. In: Brandenburg, F.J., Wirsing, M., Vidal-Naquet, G. (eds.) STACS 1987. LNCS, vol. 247, pp. 22–39. Springer, Heidelberg (1987)CrossRefGoogle Scholar
  34. 34.
    Kammar, O., Lindley, S., Oury, N.: Handlers in action. In: ICFP (to appear, 2013)Google Scholar
  35. 35.
    Lamport, L.: State the problem before describing the solution. SIGSOFT Softw. Eng. Notes 3, 26–26 (1978)CrossRefGoogle Scholar
  36. 36.
    Levy, P.B.: Call-By-Push-Value: A Functional/Imperative Synthesis. Semantic Structures in Computation, vol. 2. Springer (2004)Google Scholar
  37. 37.
    Lynch, C.A.: When documents deceive: trust and provenance as new factors for information retrieval in a tangled web. J. Am. Soc. Inf. Sci. Technol. 52(1), 12–17 (2001)CrossRefGoogle Scholar
  38. 38.
    McGuinness, D.L., Pinheiro da Silva, P.: Explaining answers from the semantic web: the inference web approach. Web Semant. 1, 397–413 (2004)CrossRefGoogle Scholar
  39. 39.
    Miles, S., Groth, P., Branco, M., Moreau, L.: The requirements of using provenance in e-science experiments. Journal of Grid Computing 5, 1–25 (2007), doi:10.1007/s10723-006-9055-3CrossRefGoogle Scholar
  40. 40.
    Miles, S., Groth, P.T., Munroe, S., Jiang, S., Assandri, T., Moreau, L.: Extracting causal graphs from an open provenance data model. Concurrency and Computation: Practice and Experience 20(5), 577–586 (2008)CrossRefGoogle Scholar
  41. 41.
    Miles, S., Wong, S.C., Fang, W., Groth, P.T., Zauner, K.-P., Moreau, L.: Provenance-based validation of e-science experiments. J. Web Sem. 5(1), 28–38 (2007)CrossRefGoogle Scholar
  42. 42.
    Miller, G.: A scientist’s nightmare: Software problem leads to five retractions. Science 314(5807), 1856–1857 (2006)CrossRefGoogle Scholar
  43. 43.
    Missier, P., Belhajjame, K., Zhao, J., Roos, M., Goble, C.A.: Data lineage model for Taverna workflows with lightweight annotation requirements. In: Freire, et al. (eds.), pp. 17–30Google Scholar
  44. 44.
    Moreau, L.: The foundations for provenance on the web. Foundations and Trends in Web Science 2(2-3), 99–241 (2010)MathSciNetCrossRefGoogle Scholar
  45. 45.
    Moreau, L.: Provenance-based reproducibility in the semantic web. J. Web Sem. 9(2), 202–221 (2011)CrossRefGoogle Scholar
  46. 46.
    Moreau, L., Clifford, B., Freire, J., Futrelle, J., Gil, Y., Groth, P.T., Kwasnikowska, N., Miles, S., Missier, P., Myers, J., Plale, B., Simmhan, Y., Stephan, E.G., den Bussche, J.V.: The open provenance model core specification (v1.1). Future Generation Comp. Syst. 27(6), 743–756 (2011)CrossRefGoogle Scholar
  47. 47.
    Moreau, L., Missier, P. (eds.): PROV-DM: The PROV data model. W3C Recommendation (April 2013), http://www.w3.org/TR/2013/REC-prov-dm-20130430/
  48. 48.
    Muniswamy-Reddy, K.-K., Holland, D.A., Braun, U., Seltzer, M.: Provenance-aware storage systems. In: USENIX Annual Technical Conference, pp. 43–56. USENIX (June 2006)Google Scholar
  49. 49.
    Perera, R., Acar, U.A., Cheney, J., Levy, P.B.: Functional programs that explain their work. In: ICFP, pp. 365–376. ACM (2012)Google Scholar
  50. 50.
    Plotkin, G.D.: A structural approach to operational semantics. J. Log. Algebr. Program. 60-61, 17–139 (2004)MathSciNetCrossRefGoogle Scholar
  51. 51.
    Sabelfeld, A., Myers, A.: Language-based information-flow security. IEEE Journal on Selected Areas in Communications 21(1), 5–19 (2003)CrossRefGoogle Scholar
  52. 52.
    Souilah, I., Francalanza, A., Sassone, V.: A formal model of provenance in distributed systems. In: Workshop on the Theory and Practice of Provenance (2009)Google Scholar
  53. 53.
    Stoy, J.: Denotational Semantics: The Scott-Strachey Approach to Programming Language Semantics. MIT Press (1981)Google Scholar
  54. 54.
    Tip, F.: A survey of program slicing techniques. J. Prog. Lang. 3(3) (1995)Google Scholar
  55. 55.
    Varghese, S.: UK government gets bitten by Microsoft Word. Sydney Morning Herald (July 2003), http://www.smh.com.au/articles/2003/07/02/1056825430340.html
  56. 56.
    Wang, Y.R., Madnick, S.E.: A polygen model for heterogeneous database systems: The source tagging perspective. In: VLDB, pp. 519–538 (1990)Google Scholar
  57. 57.
    Winskel, G.: The Formal Semantics of Programming Languages: An Introduction. MIT Press (1993)Google Scholar
  58. 58.
    Winskel, G.: Events, causality and symmetry. Comput. J. 54(1), 42–57 (2011)CrossRefGoogle Scholar
  59. 59.
    Woodruff, A., Stonebraker, M.: Supporting fine-grained data lineage in a database visualization environment. In: ICDE, pp. 91–102 (1997)Google Scholar
  60. 60.
    Zhao, Y., Wilde, M., Foster, I.: Applying the virtual data provenance model. In: Moreau, L., Foster, I. (eds.) IPAW 2006. LNCS, vol. 4145, pp. 148–161. Springer, Heidelberg (2006)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • James Cheney
    • 1
  • Umut A. Acar
    • 2
    • 3
  • Roly Perera
    • 1
  1. 1.University of EdinburghUK
  2. 2.Carnegie Mellon UniversityUSA
  3. 3.INRIA-RocquencourtFrance

Personalised recommendations