Journal of Intelligent Information Systems, Volume 53, Issue 3, pp 547–562

Safe disassociation of set-valued datasets

  • Nancy Awad
  • Bechara Al Bouna
  • Jean-Francois Couchot
  • Laurent Philippe


Disassociation is a bucketization-based anonymization technique that divides a set-valued dataset into several clusters to hide the link between individuals and their complete set of items. It increases the utility of the anonymized dataset, but it also raises privacy concerns, in particular when items are tightly coupled, which gives rise to what is called the cover problem. In this paper, we present safe disassociation, a technique that relies on partial suppression to overcome this privacy breach when disassociating set-valued datasets. Safe disassociation allows the k^m-anonymity privacy constraint to be extended to a bucketized dataset and copes with the cover problem. We describe an algorithm that achieves safe disassociation and report a set of experiments that demonstrate its efficiency.
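
To make the km-anonymity constraint (commonly written k^m-anonymity) more concrete: it is usually stated as requiring that any combination of at most m items occurring in the data occurs in at least k records. The following is a minimal Python sketch of that check on a single cluster of records. It is not the safe disassociation algorithm of the paper; the function name `is_km_anonymous` and the toy records are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def is_km_anonymous(records, k, m):
    """Check the usual k^m-anonymity condition on a list of item sets.

    Returns True if every itemset of size <= m that occurs in at least
    one record occurs in at least k records. (Hypothetical helper, not
    taken from the paper.)
    """
    support = Counter()
    for record in records:
        items = sorted(set(record))  # canonical order so itemsets compare equal
        for size in range(1, m + 1):
            for itemset in combinations(items, size):
                support[itemset] += 1
    return all(count >= k for count in support.values())


# Toy cluster: item "c" appears in only one record.
records = [
    {"a", "b"},
    {"a", "b"},
    {"a", "c"},
]

print(is_km_anonymous(records, k=2, m=2))      # item "c" is too rare
print(is_km_anonymous(records[:2], k=2, m=2))  # every itemset appears twice
```

The sketch enumerates all itemsets up to size m, which grows quickly with record length; it is meant only to illustrate the definition on small inputs.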


Keywords: Disassociation; Cover problem; Data privacy; Set-valued; Privacy preserving



This work is funded by the InMobiles company and the Labex ACTION program (contract ANR-11-LABX-01-01). Computations have been performed on the supercomputer facilities of the Mésocentre de calcul de Franche-Comté. Special thanks to Ms. Sara Barakat for her contribution to identifying the cover problem.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. TICKET Laboratory, Antonine University, Hadat-Baabda, Lebanon
  2. FEMTO-ST Institute, UMR 6174 CNRS, Université of Bourgogne Franche-Comté, France
