Formal Policy-Based Provenance Audit

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9836)


Data processing within large organisations is often complex, which impedes both the traceability of data and the compliance of processing with usage policies. The provenance of data, i.e. the chronology of its ownership, custody, or location, provides the information needed to restore traceability. However, to be of practical use, provenance records should be designed to be expressive enough for a posteriori analysis, e.g. the verification of their compliance with usage policies. Additionally, they ought to be combined with systematic reasoning about their correctness. In this paper, we introduce a formal framework for policy-based provenance audit. We show how it can be used to demonstrate correctness, consistency, and compliance of provenance records with machine-readable usage policies. We also analyse the suitability of our framework for the special case of privacy protection. A formalised perspective on provenance is also useful in this area, but it must be integrated into a larger accountability process involving data protection authorities to be effective. The practical applicability of our approach is demonstrated using a provenance record involving medical data and corresponding privacy policies aimed at personal data protection.
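To make the idea of auditing provenance records against machine-readable usage policies more concrete, the following is a minimal sketch in Python. It is not the formal framework of the paper: the names `ProvenanceEvent`, `UsagePolicy`, `is_consistent`, and `violations`, as well as the medical example data, are hypothetical and chosen only for illustration. Consistency is reduced here to chronological ordering, and compliance to a check of agents and purposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceEvent:
    """One entry of a provenance record (hypothetical structure)."""
    timestamp: int
    agent: str
    action: str
    data_item: str
    purpose: str


@dataclass(frozen=True)
class UsagePolicy:
    """A machine-readable usage policy for one data item (hypothetical)."""
    data_item: str
    allowed_agents: frozenset
    allowed_purposes: frozenset


def is_consistent(record):
    """Assumed notion of consistency: events are in chronological order."""
    return all(a.timestamp <= b.timestamp for a, b in zip(record, record[1:]))


def violations(record, policies):
    """Return (event, reason) pairs for every event that breaks a policy."""
    by_item = {p.data_item: p for p in policies}
    found = []
    for event in record:
        policy = by_item.get(event.data_item)
        if policy is None:
            found.append((event, "no policy covers this data item"))
        elif event.agent not in policy.allowed_agents:
            found.append((event, "agent not authorised"))
        elif event.purpose not in policy.allowed_purposes:
            found.append((event, "purpose not permitted"))
    return found


# Illustrative data: a blood test that may only be used for treatment.
policy = UsagePolicy(
    data_item="blood_test",
    allowed_agents=frozenset({"hospital", "lab"}),
    allowed_purposes=frozenset({"treatment"}),
)
record = [
    ProvenanceEvent(1, "hospital", "collect", "blood_test", "treatment"),
    ProvenanceEvent(2, "lab", "analyse", "blood_test", "treatment"),
    ProvenanceEvent(3, "insurer", "read", "blood_test", "marketing"),
]

audit = violations(record, [policy])
for event, reason in audit:
    print(f"t={event.timestamp} {event.agent} {event.action}: {reason}")
```

In this sketch the audit is purely a posteriori: the record is inspected after the fact, and only the third event is reported, because the insurer is not among the authorised agents.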


Keywords: Privacy policy, Personal data, Policy language, Data subject, Data controller



This work has been co-funded by the DFG as part of project “Long-Term Secure Archiving” within the CRC 1119 CROSSING. In addition, it has received funding from the European Union’s Horizon 2020 research and innovation program under Grant Agreement No 644962. The authors thank Fanny Coudert for insights about purpose ontologies.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. TU Darmstadt, Darmstadt, Germany
