Privacy Disclosure Analysis and Control for 2D Contingency Tables Containing Inaccurate Data

  • Bing Liang
  • Kevin Chiew
  • Yingjiu Li
  • Yanjiang Yang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6344)

Abstract

Two-dimensional (2D) contingency tables are used in many aspects of daily life. In practice, errors may be introduced when such a table is generated or edited, so the data it contains may be inaccurate. Even so, a knowledgeable snooper who has acquired the error distributions may still be able to infer some private information from a released table. This paper investigates how to estimate the probability of privacy disclosure for contingency tables containing inaccurate data, based on Fréchet bounds, and proposes two optimization solutions for controlling privacy disclosure so as to preserve private information. The estimation method and the optimization solutions also apply to error-free tables, which can be regarded as the special case in which there are no errors. The effectiveness of the solutions is verified by rigorous experiments.
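To make the role of Fréchet bounds concrete, the following is a minimal sketch of the classical bounds for an error-free 2D table whose row and column totals are released. It is not the paper's method: the error model, the disclosure probability estimate, and the two optimization solutions are not reproduced here, and the function names `frechet_bounds` and `fully_disclosed_cells` are hypothetical names chosen for illustration. For a cell with row total r, column total c, and grand total n, the bounds are max(0, r + c - n) ≤ cell ≤ min(r, c).

```python
from typing import List, Tuple

Bounds = Tuple[int, int]


def frechet_bounds(row_totals: List[int], col_totals: List[int]) -> List[List[Bounds]]:
    """Return (lower, upper) Fréchet bounds for every cell of a 2D table.

    Assumes an error-free table of non-negative counts whose marginals
    are known exactly (an illustrative simplification).
    """
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must share the same grand total")
    return [
        [(max(0, r + c - n), min(r, c)) for c in col_totals]
        for r in row_totals
    ]


def fully_disclosed_cells(row_totals: List[int], col_totals: List[int]) -> List[Tuple[int, int]]:
    """List (row, column) indices of cells whose bounds coincide.

    Such a cell can be inferred exactly from the marginals alone.
    """
    bounds = frechet_bounds(row_totals, col_totals)
    return [
        (i, j)
        for i, row in enumerate(bounds)
        for j, (lower, upper) in enumerate(row)
        if lower == upper
    ]


if __name__ == "__main__":
    # A 2x2 table with row totals 3 and 7 and column totals 4 and 6.
    print(frechet_bounds([3, 7], [4, 6]))
    # → [[(0, 3), (0, 3)], [(1, 4), (3, 6)]]
    print(fully_disclosed_cells([3, 7], [4, 6]))
    # → []
```

In this sketch, a cell whose lower and upper bounds are equal is disclosed exactly, while a narrow gap between the bounds indicates a cell that a snooper could estimate closely from the marginals.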

Keywords

contingency table, privacy disclosure, Fréchet bounds

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Bing Liang (1)
  • Kevin Chiew (1)
  • Yingjiu Li (1)
  • Yanjiang Yang (2)
  1. School of Information Systems, Singapore Management University, Singapore
  2. Institute for Infocomm Research, Singapore
