Fifty Shades of Personal Data – Partial Re-identification and GDPR

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13279)

Abstract

This paper examines data re-identification as an economic game in which the attacker is assumed to be rational, i.e. to perform attacks for gain. To evaluate the expected gain, we need to assess the attack success probability, which in turn depends on the level of re-identification. In the context of GDPR, the possibility of various levels of re-identification is a grey area: it is neither explicitly prohibited nor endorsed. We argue that the risk-based approach of GDPR would benefit from greater clarity in this regard. We present an explicit, yet general, attacker model that does not fit well into the current treatment of GDPR, and give it a high-level game-theoretic analysis.
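The rational-attacker assumption in the abstract can be illustrated with a minimal sketch. It is not the paper's model: the names `Attack`, `expected_outcome` and `is_rational_to_attack`, and the simple formula "success probability times gain, minus cost", are assumptions made here purely for illustration. The abstract only states that the expected gain depends on the success probability, which in turn depends on the level of re-identification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attack:
    """Hypothetical parameters of one re-identification attack (illustrative only)."""
    cost: float                 # attacker's expenses for mounting the attack
    gain: float                 # attacker's payoff if the attack succeeds
    success_probability: float  # assumed to depend on the re-identification level


def expected_outcome(attack: Attack) -> float:
    """Expected profit under the assumed model: p * gain - cost."""
    if not 0.0 <= attack.success_probability <= 1.0:
        raise ValueError("success_probability must lie in [0, 1]")
    return attack.success_probability * attack.gain - attack.cost


def is_rational_to_attack(attack: Attack) -> bool:
    """A rational attacker attacks only when the expected profit is positive."""
    return expected_outcome(attack) > 0.0
```

Under these assumptions, a partial re-identification that raises the success probability from 0.05 to 0.5 can turn an unprofitable attack (cost 100, gain 1000) into a profitable one, which is why the level of re-identification matters for a risk-based assessment.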

Keywords

  • Data re-identification
  • Privacy attacks
  • Cost-benefit analysis
  • GDPR

Acknowledgements

The author is grateful to Triin Siil, Tiina Ilus, Tanel Mällo and Kati Sein for fruitful discussions. The paper has been supported by the Estonian Research Council under the grant number PRG920.

Author information

Corresponding author

Correspondence to Jan Willemson.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Willemson, J. (2022). Fifty Shades of Personal Data – Partial Re-identification and GDPR. In: Gryszczyńska, A., Polański, P., Gruschka, N., Rannenberg, K., Adamczyk, M. (eds) Privacy Technologies and Policy. APF 2022. Lecture Notes in Computer Science, vol 13279. Springer, Cham. https://doi.org/10.1007/978-3-031-07315-1_6

  • DOI: https://doi.org/10.1007/978-3-031-07315-1_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-07314-4

  • Online ISBN: 978-3-031-07315-1

  • eBook Packages: Computer Science, Computer Science (R0)