Data Capsule: A New Paradigm for Automatic Compliance with Data Privacy Regulations

  • Conference paper
Heterogeneous Data Management, Polystores, and Analytics for Healthcare (DMAH 2019, Poly 2019)

Abstract

The increasing pace of data collection has led to growing awareness of privacy risks, resulting in new data privacy regulations such as the General Data Protection Regulation (GDPR). Such regulations are an important step, but automatic compliance checking is challenging. In this work, we present a new paradigm, Data Capsule, for automatic compliance checking of data privacy regulations in heterogeneous data processing infrastructures. Our key insight is to pair a data subject’s data with a policy governing how the data is processed. Specified in our formal policy language, PrivPolicy, the policy is created and provided by the data subject alongside the data, and remains associated with the data throughout the life-cycle of data processing (e.g., data transformation by data processing systems, aggregation of multiple data subjects’ data). We introduce a solution for static enforcement of privacy policies based on the concept of residual policies, and present a novel algorithm based on abstract interpretation for deriving residual policies in PrivPolicy. Our solution ensures compliance automatically, and is designed for deployment alongside existing infrastructure. We also design and develop PrivGuard, a reference data capsule manager that implements all the functionalities of the Data Capsule paradigm.
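As a rough illustration of the pairing and residual-policy ideas described in the abstract, the following is a minimal sketch under strong assumptions. It is not PrivPolicy syntax and not the paper's abstract-interpretation algorithm: a policy is modeled here as a plain set of requirement labels that must all be met, and the static analysis step is stubbed out by a hand-supplied set of satisfied requirements. The names `DataCapsule`, `residual_policy`, `transform`, `combine`, and the labels `"AGGREGATE"` and `"ROLE:researcher"` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCapsule:
    """Hypothetical pairing of a data subject's records with a policy."""
    data: tuple
    policy: frozenset  # requirement labels that must all be met (assumed model)


def residual_policy(policy, satisfied):
    """Return the requirements that the analyzed program has not yet met."""
    return frozenset(policy) - frozenset(satisfied)


def transform(capsule, fn, satisfied):
    """Apply fn to the data and attach the residual policy to the result.

    `satisfied` stands in for the output of a static analysis of fn;
    here it is supplied by hand (an assumption of this sketch).
    """
    return DataCapsule(
        data=tuple(fn(capsule.data)),
        policy=residual_policy(capsule.policy, satisfied),
    )


def combine(a, b):
    """Merge two subjects' capsules; the result must honor both policies."""
    return DataCapsule(data=a.data + b.data, policy=a.policy | b.policy)


# Two data subjects provide data together with their own policies.
alice = DataCapsule(data=(34, 36),
                    policy=frozenset({"AGGREGATE", "ROLE:researcher"}))
bob = DataCapsule(data=(29,), policy=frozenset({"AGGREGATE"}))

merged = combine(alice, bob)

# Assume the analysis concluded that this program only computes a mean,
# which meets the hypothetical "AGGREGATE" requirement.
result = transform(merged, lambda xs: [sum(xs) / len(xs)],
                   satisfied={"AGGREGATE"})

print(result.data)    # the aggregated value
print(result.policy)  # requirements still to be met before release
```

In this sketch the policy travels with the data through combination and transformation, and whatever the program has not satisfied stays attached to the output as the residual policy.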

Author information

Corresponding author

Correspondence to Lun Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, L. et al. (2019). Data Capsule: A New Paradigm for Automatic Compliance with Data Privacy Regulations. In: Gadepally, V., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2019, Poly 2019. Lecture Notes in Computer Science, vol 11721. Springer, Cham. https://doi.org/10.1007/978-3-030-33752-0_1

  • DOI: https://doi.org/10.1007/978-3-030-33752-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33751-3

  • Online ISBN: 978-3-030-33752-0

  • eBook Packages: Computer Science, Computer Science (R0)
