PDGuard: an architecture for the control and secure processing of personal data

Abstract

Online personal data are rarely, if ever, effectively controlled by the users they concern. Worse, as demonstrated by the numerous leaks reported each week, the organizations that store and process them fail to adequately safeguard the required confidentiality. In this paper, we propose PDGuard, a framework that defines prototypes and demonstrates an architecture and an implementation that address both problems. In the context of PDGuard, personal data are always stored encrypted as opaque objects. They can only be processed through the PDGuard application programming interface (API), under data- and action-specific authorizations supplied online by third-party agents. Through these agents, end-users can easily and reliably authorize and audit how organizations use their personal data. A static verifier can be employed to identify accidental API misuses. Following a security-by-design approach, PDGuard changes the problem of personal data management from the apparently intractable problem of supervising processes, operations, personnel, and a large software stack to that of auditing the applications that use the framework for compliance. We demonstrate the framework’s applicability through a reference implementation, by building a PDGuard-based e-shop, and by integrating PDGuard into The Guardian newspaper’s website identity application.

Notes

  1. https://profile.theguardian.com/signin.

  2. https://github.com/guardian/frontend/tree/master/identity.

  3. https://github.com/guardian/frontend/tree/master/identity/app/controllers.

  4. https://github.com/guardian/frontend/tree/master/identity/app/idapiclient.

  5. https://github.com/guardian/frontend/tree/master/identity/app/controllers/editprofile.

Acknowledgements

We would like to thank Amit Levy, Panos Louridas, Thodoris Mavrikis, Theofilos Petsios, and George Argyros for their insightful comments.

Funding

This work has received funding from the EU’s Horizon 2020 research and innovation programme under Grant Agreement No 825328 and from the Research Centre of the Athens University of Economics and Business, under the Original Scientific Publications framework 2019 (Project ER-3074-01).

Author information

Corresponding author

Correspondence to Dimitris Mitropoulos.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Availability

The source code of our framework is available as open-source software at https://github.com/AUEB-BALab/PDGuard.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Partial work by Dimitris Mitropoulos was done while at the Department of Computer Science of Columbia University in the City of New York.

A Appendix

In the following, we describe the user interface provided by our escrow agent reference implementation. Through this interface, data subjects can set or edit authorization rules, and monitor the actions performed on their data.

Data subjects can easily render their data inaccessible or set new allowable actions. Figure 10 illustrates a pop-up window that data subjects see when they attempt to define such rules for a specific data controller (“The Guardian” in this example). Notably, PDGuard’s data type hierarchy allows data subjects to set one rule for multiple types of data via grouping.

A data subject can view which data controllers perform actions on which data types. For instance, in Fig. 11, Alice observes that “The Guardian” uses three different data types, namely her given name, her surname, and her address. By clicking on the magnifier image, Alice can view all the related uses or updates that the data controller may perform on her data and the corresponding validity period. For example, in Fig. 12 we see that “The Guardian” can use Alice’s surname for analytics and reporting from February 26, 2017 to March 8, 2019.
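The rules described above combine a data controller, a data type (or a group of types), an allowable use, and a validity period. The following is a minimal Python sketch of how such rules could be evaluated; it is not PDGuard's actual API. The names `Rule`, `covers`, and `is_allowed`, the dotted data-type paths, and the `personal.name` group are all hypothetical and chosen only for illustration. Only the controller, the two uses, and the validity dates are taken from the example in the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Rule:
    controller: str   # data controller the rule applies to
    data_type: str    # dotted path; a prefix acts as a group (hypothetical encoding)
    use: str          # allowable use, e.g. "analytics"
    valid_from: date
    valid_to: date


def covers(rule_type: str, requested_type: str) -> bool:
    """A rule on a group covers the group itself and every type beneath it."""
    return requested_type == rule_type or requested_type.startswith(rule_type + ".")


def is_allowed(rules, controller, data_type, use, on) -> bool:
    """Deny by default; allow only if a matching rule is valid on the given day."""
    return any(
        r.controller == controller
        and r.use == use
        and covers(r.data_type, data_type)
        and r.valid_from <= on <= r.valid_to
        for r in rules
    )


# One rule per use, set on a hypothetical "personal.name" group so that it
# applies to both the given name and the surname.
rules = [
    Rule("The Guardian", "personal.name", "analytics", date(2017, 2, 26), date(2019, 3, 8)),
    Rule("The Guardian", "personal.name", "reporting", date(2017, 2, 26), date(2019, 3, 8)),
]
```

With these rules, a request to use `personal.name.surname` for analytics during the validity period is allowed, while a request for `personal.address`, or any request after the expiration date, is denied because no rule matches.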

Data subjects can monitor all the actions that the various applications perform on their personal data through the authorization logs that the escrow agent provides. Figure 13 illustrates all the actions that were performed on the personal data of Alice by The Guardian’s “frontend” application for a specific period of time. The logs also include the interaction purpose, the date, and the time that the action took place. Finally, the data subject can check whether the action was permitted or not.
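An authorization log of this kind can be pictured as a list of records that is filtered by application and time window. The sketch below is an assumption-laden illustration in Python, not the escrow agent's real data model: the `LogEntry` fields, the `entries_between` helper, and the three sample entries are invented for the example and do not reproduce the contents of Fig. 13.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    when: datetime      # date and time of the request
    application: str    # requesting application, e.g. "frontend"
    data_type: str
    action: str         # e.g. "read" or "update"
    purpose: str        # interaction purpose
    permitted: bool     # decision taken by the escrow agent


def entries_between(log, application, start, end):
    """Return one application's entries within [start, end], oldest first."""
    selected = [
        e for e in log
        if e.application == application and start <= e.when <= end
    ]
    return sorted(selected, key=lambda e: e.when)


# Made-up sample entries, for illustration only.
log = [
    LogEntry(datetime(2017, 3, 2, 12, 0), "frontend", "surname", "read", "analytics", True),
    LogEntry(datetime(2017, 2, 7, 9, 0), "frontend", "address", "read", "advertising", False),
    LogEntry(datetime(2017, 2, 3, 10, 15), "frontend", "given_name", "update", "data subject request", True),
]

february = entries_between(
    log, "frontend", datetime(2017, 2, 1), datetime(2017, 2, 28, 23, 59, 59)
)
denied = [e for e in february if not e.permitted]
```

Keeping the decision (`permitted`) in each record is what lets a data subject see not only what was accessed but also which requests were refused.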

Fig. 10

Setting rules. An example of the pop-up window that data subjects see when they attempt to define authorization rules for a specific data controller. Here, Alice specifies which of her data will be physically published in widely available material by “The Guardian”. She also specifies an expiration date for this rule

Fig. 11

Data types and related data controllers. Data subjects can observe which data types are stored by the various data controllers. In this case, Alice can see that “The Guardian” stores her given name, her surname, and her address. By pressing the magnifier image, Alice can see all the related uses or updates that the data controller may perform on her data and the corresponding validity periods (see also Fig. 12)

Fig. 12

Overview of allowable uses. Alice views the list of the allowable uses that can be performed on her surname. Currently, the corresponding data controller can use Alice’s surname for analytics and reporting. Both rules are valid until 2019. Note that Alice can revoke or edit them

Fig. 13

Authorization logs. Alice monitors the different actions that “The Guardian” performed on her personal data between 2017-02-01 and 2017-02-28. Specifically, the “frontend” application sent five requests to the escrow agent. One call concerned the update of Alice’s given name. This update came from Alice herself. The other requests involved calls for different data types. Note that the escrow agent granted access only to her given name and surname

Cite this article

Mitropoulos, D., Sotiropoulos, T., Koutsovasilis, N. et al. PDGuard: an architecture for the control and secure processing of personal data. Int. J. Inf. Secur. 19, 479–498 (2020). https://doi.org/10.1007/s10207-019-00468-5

Keywords

  • Personal data
  • Software architecture
  • Encrypted data
  • Auditing