Semi-Homogenous Generalization: Improving Homogenous Generalization for Privacy Preservation in Cloud Computing

Abstract

Data security is one of the leading concerns and primary challenges for cloud computing, and it grows more serious as cloud computing develops. However, existing privacy-preserving data sharing techniques either fail to prevent privacy leakage or incur a large amount of information loss. In this paper, we propose a novel technique, termed the linking-based anonymity model, which achieves K-anonymity with quasi-identifier groups (QI-groups) of size less than K. In addition, semi-homogenous generalization is introduced to defend against the attack incurred by homogenous generalization. To implement the linking-based anonymization model, we propose a simple yet efficient heuristic local recoding method. Extensive experiments on real datasets show that our approach significantly improves utility compared with state-of-the-art methods.
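
For context, the classical homogenous notion of K-anonymity requires every QI-group in the released table to contain at least K records. The following is a minimal Python sketch of that baseline check only; it does not implement the linking-based model or the local recoding method proposed in the paper. The table contents, the attribute names, and the function names `qi_group_sizes` and `is_k_anonymous` are hypothetical and chosen for illustration.

```python
from collections import Counter


def qi_group_sizes(rows, qi_attrs):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(row[attr] for attr in qi_attrs) for row in rows)


def is_k_anonymous(rows, qi_attrs, k):
    """Classical check: every QI-group must contain at least k records."""
    return all(size >= k for size in qi_group_sizes(rows, qi_attrs).values())


# Hypothetical generalized table: age and zip are the quasi-identifiers,
# disease is the sensitive attribute.
rows = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "asthma"},
    {"age": "30-39", "zip": "148**", "disease": "cold"},
]

if __name__ == "__main__":
    print(qi_group_sizes(rows, ["age", "zip"]))
    print(is_k_anonymous(rows, ["age", "zip"], 2))
    print(is_k_anonymous(rows, ["age", "zip"], 3))
```

In this toy table the smallest QI-group holds two records, so the baseline check passes for K = 2 and fails for K = 3. Under the classical definition, reaching K = 3 would require further generalization of the quasi-identifiers, which is the kind of information loss the abstract aims to reduce by allowing QI-groups smaller than K.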

Author information

Correspondence to Xian-Mang He.

About this article

Cite this article

He, X., Wang, X.S., Li, D. et al. Semi-Homogenous Generalization: Improving Homogenous Generalization for Privacy Preservation in Cloud Computing. J. Comput. Sci. Technol. 31, 1124–1135 (2016). https://doi.org/10.1007/s11390-016-1687-6

Keywords

  • privacy preservation
  • cloud computing
  • linking-based anonymization
  • semi-homogenous
  • homogenous generalization