Abstract
The increasing pace of data collection has heightened awareness of privacy risks, resulting in new data privacy regulations such as the General Data Protection Regulation (GDPR). Such regulations are an important step, but automatic compliance checking is challenging. In this work, we present a new paradigm, Data Capsule, for automatic compliance checking of data privacy regulations in heterogeneous data processing infrastructures. Our key insight is to pair a data subject’s data with a policy governing how the data is processed. Specified in our formal policy language, PrivPolicy, the policy is created and provided by the data subject alongside the data, and remains associated with the data throughout the data processing life cycle (e.g., data transformation by data processing systems, aggregation of multiple data subjects’ data). We introduce a solution for static enforcement of privacy policies based on the concept of residual policies, and present a novel algorithm based on abstract interpretation for deriving residual policies in PrivPolicy. Our solution ensures compliance automatically and is designed for deployment alongside existing infrastructure. We also design and develop PrivGuard, a reference data capsule manager that implements all the functionality of the Data Capsule paradigm.
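The pairing of data with a policy, and the notion of a residual policy, can be illustrated with a toy sketch. This is not PrivPolicy and not the abstract-interpretation algorithm, neither of which is specified in the abstract. It assumes, purely for illustration, that a policy is a flat set of requirement names and that a program discharges a requirement by performing the matching operation. The names `Capsule`, `residual`, `join`, and the requirement strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capsule:
    """A data subject's data paired with the policy that governs it."""
    data: dict
    policy: frozenset  # assumed form: requirement names still to be satisfied


def residual(policy, operations):
    """Return the requirements NOT discharged by the given operations."""
    return frozenset(policy) - frozenset(operations)


def join(policy_a, policy_b):
    """Combining two subjects' data must honor both of their policies."""
    return frozenset(policy_a) | frozenset(policy_b)


# Hypothetical subjects and requirements, for illustration only.
alice = Capsule({"age": 34}, frozenset({"aggregate", "redact_name"}))
bob = Capsule({"age": 41}, frozenset({"aggregate"}))

# Aggregating both subjects' data: the combined policy applies.
combined = join(alice.policy, bob.policy)

# A program that only aggregates leaves a residual policy behind.
after_aggregation = residual(combined, {"aggregate"})

# The output is compliant only once the residual policy is empty.
after_redaction = residual(after_aggregation, {"redact_name"})
is_compliant = not after_redaction
```

In this simplified model, the residual policy travels with the derived data, so a later processing step can discharge the remaining requirements.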
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, L. et al. (2019). Data Capsule: A New Paradigm for Automatic Compliance with Data Privacy Regulations. In: Gadepally, V., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2019, Poly 2019. Lecture Notes in Computer Science, vol 11721. Springer, Cham. https://doi.org/10.1007/978-3-030-33752-0_1
DOI: https://doi.org/10.1007/978-3-030-33752-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33751-3
Online ISBN: 978-3-030-33752-0
eBook Packages: Computer Science (R0)