Disclosure Analysis and Control in Statistical Databases

  • Yingjiu Li
  • Haibing Lu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5283)

Abstract

Disclosure analysis and control are critical to protect sensitive information in statistical databases when some statistical moments are released. A generic question in disclosure analysis is whether a data snooper can deduce any sensitive information from available statistical moments. To address this question, we consider various types of possible disclosure based on the exact bounds that a snooper can infer about any protected moments from available statistical moments. We focus on protecting static moments in two-dimensional tables and obtain the following results. For each type of disclosure, we reveal the distribution patterns of protected moments that are subject to disclosure. Based on the disclosure patterns, we design efficient algorithms to discover all protected moments that are subject to disclosure. Also based on the disclosure patterns, we propose efficient algorithms to eliminate all possible disclosures by combining a minimum number of available moments. We also discuss the difficulties of executing disclosure analysis and control in high-dimensional tables.
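As a rough illustration of what "exact bounds" can mean for a two-dimensional table, the sketch below covers only the simplest special case and is not the authors' algorithm. It assumes that every cell is a nonnegative count, that only the row and column totals are released, and that every interior cell is protected. Under those assumptions the tightest interval a snooper can infer for a cell is given by the classical Fréchet bounds. The function names `cell_bounds` and `exactly_disclosed` are hypothetical and chosen for this example.

```python
from typing import List, Tuple

Bounds = Tuple[int, int]


def cell_bounds(row_totals: List[int], col_totals: List[int]) -> List[List[Bounds]]:
    """Return (lower, upper) Frechet bounds for every cell of a 2-D table.

    Assumption: cells are nonnegative and only the marginal totals are known.
    """
    grand_total = sum(row_totals)
    if grand_total != sum(col_totals):
        raise ValueError("row and column totals must sum to the same grand total")
    return [
        [(max(0, r + c - grand_total), min(r, c)) for c in col_totals]
        for r in row_totals
    ]


def exactly_disclosed(bounds: List[List[Bounds]]) -> List[Tuple[int, int]]:
    """List the (row, column) positions whose lower and upper bounds coincide."""
    return [
        (i, j)
        for i, row in enumerate(bounds)
        for j, (low, high) in enumerate(row)
        if low == high
    ]


if __name__ == "__main__":
    # Hypothetical marginals: no cell is pinned down exactly.
    print(cell_bounds([5, 2], [6, 1]))
    # Hypothetical marginals with an empty row: every cell is determined.
    print(exactly_disclosed(cell_bounds([5, 0], [3, 2])))
```

A cell whose lower and upper bounds coincide is fully determined by the released totals. Whether a narrow but non-degenerate interval also counts as a disclosure depends on the disclosure type under consideration, which this sketch does not model.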

Keywords

Statistical Database; Exact Bound; Privacy Preserve; Inference Control; Disclosure Control
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yingjiu Li (1)
  • Haibing Lu (2)
  1. Singapore Management University, Singapore
  2. Rutgers University, Newark