Abstract
When generalization algorithms are known to the public, an adversary can obtain a more precise estimation of the secret table than what can be deduced from the disclosed generalization result. Therefore, whether a generalization algorithm can satisfy a privacy property should be judged based on such an estimation. In this paper, we show that the computation of the estimation is inherently a recursive process that exhibits a high complexity when generalization algorithms take a straightforward inclusive strategy. To facilitate the design of more efficient generalization algorithms, we suggest an alternative exclusive strategy, which adopts a seemingly drastic approach to eliminate the need for recursion. Surprisingly, the data utility of the two strategies are actually not comparable and the exclusive strategy can provide better data utility in certain cases.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Dobra, A., Feinberg, S.E.: Bounding entries in multi-way contingency tables given a set of marginal totals. In: Foundations of Statistical Inference: Proceedings of the Shoresh Conference 2000, Springer, Heidelberg (2003)
Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: l-diversity: Privacy beyond k-anonymity. In: Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE 2006) (2006)
Slavkovic, A., Feinberg, S.E.: Bounds for cell entries in two-way tables given conditional relative frequencies. Privacy in Statistical Databases (2004)
Dobkin, D.P., Jones, A.K., Lipton, R.J.: Secure databases: Protection against user influence. ACM TODS 4(1), 76–96 (1979)
Du, Y., Xia, T., Tao, Y., Zhang, D., Zhu, F.: On multidimensional k-anonymity with local recoding generalization
Chin, F.: Security problems on inference control for sum, max, and min queries. J.ACM 33(3), 451–464 (1986)
Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: k-anonymity: Algorithms and hardness. Technical report, Stanford University (2004)
Miklau, G., Suciu, D.: A formal analysis of information disclosure in data exchange. In: SIGMOD (2004)
Duncan, G.T., Feinberg, S.E.: Obtaining information while preserving privacy: A markov perturbation method for tabular data. In: Joint Statistical Meetings, Anaheim, CA (1997)
Fellegi, I.P.: On the question of statistical confidentiality. Journal of the American Statistical Association 67(337), 7–18 (1993)
Byun, J., Bertino, E.: Micro-views, or on how to protect privacy while enhancing data usability: concepts and challenges. SIGMOD Record 35(1), 9–13 (2006)
Kleinberg, J., Papadimitriou, C., Raghavan, P.: Auditing boolean attributes. In: PODS (2000)
Schlorer, J.: Identification and retrieval of personal records from a statistical bank. Methods Info. Med. (1975)
Kenthapadi, K., Mishra, N., Nissim, K.: Simulatable auditing. In: PODS (2005)
LeFevre, K., DeWitt, D., Ramakrishnan, R.: Incognito: Efficient fulldomain k-anonymity. In: SIGMOD (2005)
Cox, L.H.: Solving confidentiality protection problems in tabulations using network optimization: A network model for cell suppression in the u.s. economic censuses. In: Proceedings of the Internatinal Seminar on Statistical Confidentiality (1982)
Cox, L.H.: New results in disclosure avoidance for tabulations. In: International Statistical Institute Proceedings (1987)
Cox, L.H.: Suppression, methodology and statistical disclosure control. J. of the American Statistical Association (1995)
Li, N., Li, T., Venkatasubramanian, S.: t-closeness: Privacy beyond k-anonymity and l-diversity. In: ICDE (2007)
Sweeney, L.: k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)
Meyerson, A., Williams, R.: On the complexity of optimal k-anonymity. In: ACM PODS (2004)
Adam, N.R., Wortmann, J.C.: Security-control methods for statistical databases: A comparative study. ACM Comput. Surv. 21(4), 515–556 (1989)
Diaconis, P., Sturmfels, B.: Algebraic algorithms for sampling from conditional distributions. Annals of Statistics (1998)
Samarati, P.: Protecting respondents identities in microdata release. In: IEEE TKDE, pp. 1010–1027 (2001)
Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report, CMU, SRI (1998)
Bayardo, R.J., Agrawal, R.: Data privacy through optimal k-anonymization. In: ICDE (2005)
Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H.: Toward privacy in public databases. In: Theory of Cryptography Conference (2005)
Dalenius, T., Reiss, S.: Data swapping: A technique for disclosure control. Journal of Statistical Planning and Inference 6, 73–85 (1982)
Xiao, X., Tao, Y.: Personalized privacy preservation. In: SIGMOD (2006)
Zhang, L., Jajodia, S., Brodsky, A.: Information disclosure under realistic assumptions: Privacy versus optimality. In: ACM Conference on Computer and Communications Security (CCS) (2007)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 IFIP International Federation for Information Processing
About this paper
Cite this paper
Zhang, L., Wang, L., Jajodia, S., Brodsky, A. (2008). Exclusive Strategy for Generalization Algorithms in Micro-data Disclosure. In: Atluri, V. (eds) Data and Applications Security XXII. DBSec 2008. Lecture Notes in Computer Science, vol 5094. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70567-3_15
Download citation
DOI: https://doi.org/10.1007/978-3-540-70567-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70566-6
Online ISBN: 978-3-540-70567-3
eBook Packages: Computer ScienceComputer Science (R0)