Differential Privacy and the Risk-Utility Tradeoff for Multi-dimensional Contingency Tables

  • Stephen E. Fienberg
  • Alessandro Rinaldo
  • Xiaolin Yang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6344)

Abstract

The methodology of differential privacy provides a strong definition of privacy that, in some settings, also allows informative statistics to be extracted from databases by means of a mechanism that adds double-exponential (Laplace) noise. A recent paper extends this approach to the release of a specified set of margins from a multi-way contingency table. Privacy protection in such settings implicitly focuses on small cell counts that might allow for the identification of units that are unique in the database. We explore how well the mechanism works in a series of examples, and the extent to which the proposed differential-privacy mechanism allows for sensible inferences from the released data.
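
To make the idea of noise addition on margins concrete, here is a minimal Python sketch. It is not the mechanism evaluated in the paper: it simply adds independent Laplace noise to every cell of each requested margin. The function names (`margin`, `laplace`, `noisy_margins`), the sparse dictionary layout of the table, and the sensitivity bound of 2 per margin (assuming neighbouring databases differ by replacing one record) are all assumptions made for illustration.

```python
import random
from itertools import product


def margin(table, axes):
    """Sum a sparse table {cell tuple: count} down to the given axes."""
    out = {}
    for cell, count in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0) + count
    return out


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_margins(table, shape, margins, epsilon, rng=None):
    """Release each margin with independent Laplace noise on every cell.

    Assumption: replacing one record moves one unit between two cells of
    each margin, so the L1 sensitivity of k margins together is 2 * k.
    """
    rng = rng or random.Random()
    scale = 2.0 * len(margins) / epsilon
    released = {}
    for axes in margins:
        exact = margin(table, axes)
        released[axes] = {}
        # Perturb every margin cell, including cells whose true count is zero.
        for key in product(*(range(shape[a]) for a in axes)):
            released[axes][key] = exact.get(key, 0) + laplace(scale, rng)
    return released


# Hypothetical 2 x 2 x 2 table with one small cell count.
example_table = {(0, 0, 0): 5, (0, 1, 1): 1, (1, 0, 1): 3}
example_release = noisy_margins(
    example_table, (2, 2, 2), [(0, 1), (1, 2)], epsilon=1.0
)
```

Noisy margins produced this way can be negative, non-integer, and mutually inconsistent, which is one reason to ask whether sensible inferences can still be drawn from the released data.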

Keywords

Contingency Table, Maximum Likelihood Estimator, Privacy Protection, Differential Privacy, Total Variation Distance

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Stephen E. Fienberg (1, 2, 3)
  • Alessandro Rinaldo (1, 2)
  • Xiaolin Yang (1)
  1. Department of Statistics, Carnegie Mellon University, Pittsburgh, USA
  2. Machine Learning Department, Carnegie Mellon University, Pittsburgh, USA
  3. CyLab and i-Lab, Carnegie Mellon University, Pittsburgh, USA
