The Full Provenance Stack: Five Layers for Complete and Meaningful Provenance

  • Ryan K. L. Ko
  • Thye Way Phua
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10658)


This paper distils three decades of provenance research and proposes a layered framework, the Full Provenance Stack, for describing provenance completely and meaningfully – within and across machines. The provenance layers aim to proliferate layer protocols and approaches for appropriate levels of data provenance detail, and to empower cross-platform features – enabling capabilities to identify, detect, respond and recover across all cyber security, digital forensics, and data privacy scenarios.


Provenance, Data lineage, Logging, Cyber security, Digital forensics, Data privacy



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Cyber Security Lab – Department of Computer Science, University of Waikato, Hamilton, New Zealand
