The Right to Be Forgotten: Towards Machine Learning on Perturbed Knowledge Bases

  • Bernd Malle
  • Peter Kieseberg
  • Edgar Weippl
  • Andreas Holzinger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9817)

Abstract

Today’s increasingly complex information infrastructures form the basis of data-driven industries, which are rapidly becoming the 21st century’s economic backbone. The sensitivity of those infrastructures to disturbances in their knowledge bases is therefore of crucial interest to companies, organizations, customers and regulatory bodies. This holds true for the direct provisioning of such information in critical applications such as clinical settings or the energy industry, and also for the additional insights, predictions and personalized services enabled by the automatic processing of those data. In light of the new EU data protection regulations applying from 2018 onwards, which give customers the right to have their data deleted on request, information processing bodies will have to react to these changing legal (and therefore economic) conditions. Their choices include a redesign of their data infrastructure as well as preventive actions such as anonymizing databases by default. Insights into the effects of perturbed or anonymized knowledge bases on the quality of machine learning results are therefore a crucial basis for successfully facing those future challenges. In this paper we introduce a series of experiments in which we applied four different classifiers to an established dataset as well as to several distorted versions of it, and we present our initial results.
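
The experimental idea described above (train a classifier on an original dataset and on distorted versions of it, then compare result quality) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy records, the single quasi-identifier (age), the bucket-based generalization and the nearest-class-mean classifier are hypothetical stand-ins, not the dataset, anonymization algorithm or classifiers used in the paper.

```python
from collections import Counter

# Toy records (age, label); purely illustrative, not the paper's dataset.
TRAIN = [(23, 0), (27, 0), (31, 0), (36, 0), (38, 1), (44, 1), (52, 1), (58, 1)]
TEST = [(25, 0), (33, 0), (37, 0), (41, 1), (49, 1), (55, 1)]


def generalize(age, width):
    """Replace an exact age by the midpoint of its `width`-year bucket."""
    return (age // width) * width + width / 2


def is_k_anonymous(values, k):
    """True if every distinct quasi-identifier value occurs at least k times."""
    return all(count >= k for count in Counter(values).values())


def fit_nearest_mean(records):
    """Return {label: mean age}, the model of a nearest-class-mean classifier."""
    sums, counts = Counter(), Counter()
    for age, label in records:
        sums[label] += age
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict(means, age):
    """Predict the label whose class mean is closest to `age`."""
    return min(means, key=lambda label: abs(age - means[label]))


def accuracy_after_generalization(width):
    """Train on generalized ages, then score on the untouched test records."""
    perturbed = [(generalize(age, width), label) for age, label in TRAIN]
    means = fit_nearest_mean(perturbed)
    hits = sum(predict(means, age) == label for age, label in TEST)
    return hits / len(TEST)


# Wider buckets give stronger anonymity but a coarser training signal.
for width in (1, 20, 40):
    ages = [generalize(age, width) for age, _ in TRAIN]
    print(width, is_k_anonymous(ages, 3), accuracy_after_generalization(width))
```

On this toy data, widening the buckets makes the training ages 3-anonymous, and the widest bucket setting scores lower on the test records than the narrowest one. This only shows the kind of trade-off the experiments examine and says nothing about the magnitude of the effect on real data.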

Keywords

Machine learning, Knowledge bases, Right to be forgotten, Perturbation, Anonymization, k-anonymity, SaNGreeA, Information loss, Structural loss, Cost weighing vector, Interactive machine learning

Copyright information

© IFIP International Federation for Information Processing 2016

Authors and Affiliations

  • Bernd Malle (1, 2)
  • Peter Kieseberg (1, 2)
  • Edgar Weippl (2)
  • Andreas Holzinger (1)

  1. Holzinger Group HCI-KDD, Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Graz, Austria
  2. SBA Research gGmbH, Vienna, Austria
