Adaptive Differentially Private Histogram of Low-Dimensional Data

  • Chengfang Fang
  • Ee-Chien Chang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7384)

Abstract

We want to publish low-dimensional points, for example 2D spatial points, in a differentially private manner. Most existing mechanisms publish noisy frequency counts of points in a fixed, predefined partition. Arguably, histograms with adaptive partitions, for example V-optimal and equi-depth histograms, which have smaller bin-widths in denser regions, would provide more statistical information. However, as adaptive partitions leak significant information about the dataset, it is not clear how differentially private partitions can be published accurately. In this paper, we propose a simple method based on the observation that the sensitivity of publishing the sorted sequence of a dataset is independent of the size of the dataset. Together with isotonic regression, the dataset can be reconstructed with high accuracy. One advantage of the proposed method is its simplicity, in the sense that there are only a few parameters to be determined. Furthermore, the parameters can be estimated solely from the privacy requirement ε and the total number of points, and hence do not leak information about the data. Although the parameters are chosen to minimize the earth mover’s distance between the published data and the original data, empirical studies show that the proposed method also achieves high accuracy with respect to other measures, for example range queries and order statistics.
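The core idea in the abstract (add noise to the sorted sequence, then restore monotonicity with isotonic regression) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation. It assumes points lie in a known interval [lo, hi], that neighbouring datasets differ by replacing one point (so the L1 sensitivity of the sorted sequence is at most hi - lo, whatever the dataset size), and it omits the parameter choices the abstract mentions and any mapping from 2D to 1D. The function names `laplace`, `isotonic_regression`, and `private_sorted_release` are hypothetical.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def isotonic_regression(y):
    # Pool Adjacent Violators: least-squares non-decreasing fit to y.
    blocks = []  # each block is [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def private_sorted_release(points, epsilon, lo=0.0, hi=1.0, seed=None):
    # Assumption: every point lies in [lo, hi] and one point may be replaced
    # between neighbouring datasets, so the sorted sequence changes by at
    # most (hi - lo) in L1 norm, independent of len(points).
    if any(x < lo or x > hi for x in points):
        raise ValueError("all points must lie in [lo, hi]")
    rng = random.Random(seed)
    scale = (hi - lo) / epsilon
    noisy = [x + laplace(scale, rng) for x in sorted(points)]
    # Post-processing only: enforce sortedness, then clamp to the domain.
    fitted = isotonic_regression(noisy)
    return [min(hi, max(lo, v)) for v in fitted]


if __name__ == "__main__":
    data = [random.Random(1).random() for _ in range(1000)]
    released = private_sorted_release(data, epsilon=1.0, seed=7)
    print(released[:3], released[-3:])
```

Because the isotonic fit and the clamping operate only on the already-noised sequence, they are post-processing and do not consume additional privacy budget in this sketch.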

Keywords

Range Query, Generalization Error, Twitter User, Privacy Requirement, Hilbert Curve
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Chengfang Fang (1)
  • Ee-Chien Chang (1)
  1. School of Computing, National University of Singapore, Singapore
