Data provenance raises problems of its own when provenance-related functionality must be handled safely. If a data set has to be modified because of provenance-related requirements, e.g. to remove the data from a given user or source, the change affects not only the data itself but also all related models and aggregated information obtained from the data. The problem is especially acute when the data are protected using a privacy method (e.g. a masking method), since modifications to the data and the model can leak information originally protected by the privacy method. To be able to evaluate privacy-related problems in data provenance, we introduce the notion of integral privacy and compare it with the well-known definition of differential privacy.
Keywords: Decision Tree, Data Privacy, Voronoi Tessellation, Differential Privacy, Disclosure Risk
Partial support by the Spanish MINECO (project TIN2014-55243-P) and Catalan AGAUR (2014-SGR-691) is acknowledged.