Abstract
Open data is a vast resource waiting to be utilized. To publish data in an open environment, privacy preservation is a must. Different solutions are applied to privacy protection, and data anonymization is one of the most prominent. This paper proposes a framework that flexibly applies different anonymization models and works with an arbitrary open data management platform. An experimental setup was implemented with a heterogeneous service architecture using two datasets that vary in data volume and number of dimensions. The measured results show that the proposed method produces high-quality anonymized data in an acceptable time across different anonymization techniques and settings.
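As an illustration of what "flexibly applying different anonymization models" can mean in practice, the following is a minimal Python sketch, not the authors' implementation. It assumes that each privacy model can be expressed as a predicate over one equivalence class (the records sharing the same quasi-identifier values). All names (`k_anonymity`, `distinct_l_diversity`, `satisfies`, the sample records) are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, str]
# Assumption: a privacy model is a predicate over one equivalence class.
PrivacyModel = Callable[[List[Record]], bool]


def k_anonymity(k: int) -> PrivacyModel:
    """Each equivalence class must contain at least k records."""
    return lambda group: len(group) >= k


def distinct_l_diversity(l: int, sensitive: str) -> PrivacyModel:
    """Each equivalence class must hold at least l distinct sensitive values."""
    return lambda group: len({r[sensitive] for r in group}) >= l


def satisfies(records: Sequence[Record],
              quasi_identifiers: Sequence[str],
              models: Sequence[PrivacyModel]) -> bool:
    """Group records by quasi-identifier values and check every model on every group."""
    groups: Dict[tuple, List[Record]] = {}
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        groups.setdefault(key, []).append(r)
    return all(model(group) for group in groups.values() for model in models)


# Hypothetical, already generalized sample data.
records = [
    {"age": "20-29", "zip": "700**", "disease": "flu"},
    {"age": "20-29", "zip": "700**", "disease": "asthma"},
    {"age": "30-39", "zip": "701**", "disease": "flu"},
    {"age": "30-39", "zip": "701**", "disease": "flu"},
]

ok_k2 = satisfies(records, ["age", "zip"], [k_anonymity(2)])
ok_k2_l2 = satisfies(records, ["age", "zip"],
                     [k_anonymity(2), distinct_l_diversity(2, "disease")])
```

Because the models are plain callables, switching from k-anonymity alone to k-anonymity combined with l-diversity only changes the list passed to `satisfies`. In the sample data the second combination fails, since one equivalence class contains a single sensitive value.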
Acknowledgements
This work is supported by a project with the Department of Science and Technology, Ho Chi Minh City, Vietnam (contract with HCMUT No. 08/2018/HĐQKHCN, dated 16/11/2018). We also thank all members of AC Lab and D-STAR Lab for their valuable support and comments during the preparation of this paper.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Le, T.H., Dang, T.K. (2020). An Elastic Anonymization Framework for Open Data. In: Dang, T.K., Küng, J., Takizawa, M., Chung, T.M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2020. Communications in Computer and Information Science, vol 1306. Springer, Singapore. https://doi.org/10.1007/978-981-33-4370-2_8
DOI: https://doi.org/10.1007/978-981-33-4370-2_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-33-4369-6
Online ISBN: 978-981-33-4370-2
eBook Packages: Computer Science, Computer Science (R0)