Disclosure Analysis for Two-Way Contingency Tables

  • Haibing Lu
  • Yingjiu Li
  • Xintao Wu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4302)

Abstract

Disclosure analysis in two-way contingency tables is important in categorical data analysis. It concerns whether a data snooper can infer any protected cell values, which contain privacy-sensitive information, from the available marginal totals (i.e., row sums and column sums) of a two-way contingency table. Previous research has addressed this problem from various perspectives. However, systematic definitions of the disclosure of cell values are lacking, and no previous study has focused on the distribution of the cells that are subject to the various types of disclosure. In this paper, we define four types of possible disclosure based on the exact upper bound and/or the exact lower bound of each cell, both of which can be computed from the marginal totals. For each type of disclosure, we discover the distribution pattern of the cells subject to it. These distribution patterns allow us to speed up the search for all cells subject to disclosure.
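As a rough illustration of the bounds mentioned in the abstract, the sketch below computes the classical Fréchet bounds for each cell of a two-way table of non-negative counts from its row and column sums: the lower bound is max(0, r_i + c_j - N) and the upper bound is min(r_i, c_j), where N is the grand total. The abstract does not state the paper's four disclosure types, so they are not reproduced here. The function names `frechet_bounds` and `determined_cells` are hypothetical, and flagging cells whose two bounds coincide is only an assumed example of one way a cell value could be inferred, not the authors' definition.

```python
from typing import List, Tuple


def frechet_bounds(row_sums: List[int], col_sums: List[int]) -> List[List[Tuple[int, int]]]:
    """Return (lower, upper) bounds for every cell of a two-way table.

    Assumes all cell values are non-negative integers and that only the
    marginal totals are known.
    """
    total = sum(row_sums)
    if total != sum(col_sums):
        raise ValueError("row sums and column sums must have the same grand total")
    return [
        [(max(0, r + c - total), min(r, c)) for c in col_sums]
        for r in row_sums
    ]


def determined_cells(row_sums: List[int], col_sums: List[int]) -> List[Tuple[int, int, int]]:
    """List (row, column, value) for cells whose lower and upper bounds coincide.

    Illustrative criterion only: such a cell is fully determined by the marginals.
    """
    bounds = frechet_bounds(row_sums, col_sums)
    return [
        (i, j, low)
        for i, row in enumerate(bounds)
        for j, (low, high) in enumerate(row)
        if low == high
    ]


if __name__ == "__main__":
    # Hypothetical marginal totals, chosen only for illustration.
    for row in frechet_bounds([3, 7], [2, 8]):
        print(row)
    print(determined_cells([0, 5], [2, 3]))
```

A scan like this visits every cell, so it costs time proportional to the number of cells; the distribution patterns described in the abstract are aimed at making that search faster.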

Keywords

Contingency Table, Statistical Database, Protected Cell, Exact Bound, Marginal Total

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Haibing Lu (1)
  • Yingjiu Li (1)
  • Xintao Wu (2)
  1. Singapore Management University, Singapore
  2. University of North Carolina at Charlotte, Charlotte, USA
