Adaptive Differentially Private Histogram of Low-Dimensional Data

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7384)

Abstract

We want to publish low-dimensional points, for example 2D spatial points, in a differentially private manner. Most existing mechanisms publish noisy frequency counts of points in a fixed, predefined partition. Arguably, histograms with adaptive partitions, for example V-optimal and equi-depth histograms, which have smaller bin widths in denser regions, would provide more statistical information. However, because adaptive partitions leak significant information about the dataset, it is not clear how differentially private partitions can be published accurately. In this paper, we propose a simple method based on the observation that the sensitivity of publishing the sorted sequence of a dataset is independent of the size of the dataset. Together with isotonic regression, the dataset can be reconstructed with high accuracy. One advantage of the proposed method is its simplicity, in the sense that there are only a few parameters to be determined. Furthermore, the parameters can be estimated solely from the privacy requirement ε and the total number of points, and hence do not leak information about the data. Although the parameters are chosen to minimize the earth mover’s distance between the published data and the original data, empirical studies show that the proposed method also achieves high accuracy with respect to other measures, for example range queries and order statistics.
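The core observation can be illustrated with a minimal one-dimensional sketch. It is not the authors' full method (the abstract mentions parameters that this sketch does not model); it only shows the idea of sorting, adding Laplace noise, and restoring order with isotonic regression. The sketch assumes points lie in a known interval `[lo, hi]` and that neighboring datasets differ by replacing one point, in which case the L1 sensitivity of the sorted sequence is at most `hi - lo` regardless of the number of points. All function names (`laplace`, `isotonic_regression`, `private_sorted_release`, `emd_1d`) are hypothetical and introduced here for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def isotonic_regression(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    blocks = []  # each block is [sum, count]
    for v in y:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean
        # (means are compared by cross-multiplication; counts are positive).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def private_sorted_release(points, epsilon, lo=0.0, hi=1.0, seed=None):
    """Sort, add Laplace noise, then restore monotonicity (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: replace-one neighbors and values in [lo, hi], so the
    # sorted sequence has L1 sensitivity at most (hi - lo).
    scale = (hi - lo) / epsilon
    noisy = [x + laplace(scale, rng) for x in sorted(points)]
    # Post-processing only: isotonic fit, then clamp to the known domain.
    fitted = isotonic_regression(noisy)
    return [min(hi, max(lo, v)) for v in fitted]


def emd_1d(a, b):
    """Earth mover's distance between two equal-size 1D point sets
    with uniform weights: mean absolute difference of sorted values."""
    sa, sb = sorted(a), sorted(b)
    return sum(abs(x - y) for x, y in zip(sa, sb)) / len(sa)


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.betavariate(2, 5) for _ in range(1000)]
    released = private_sorted_release(data, epsilon=1.0, seed=2)
    print("EMD between original and released:", emd_1d(data, released))
```

Because the isotonic fit and the clamping operate only on the noisy output, they are post-processing and do not affect the privacy guarantee of the noisy sorted sequence under the stated assumptions.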

Keywords

  • Range Query
  • Generalization Error
  • Twitter User
  • Privacy Requirement
  • Hilbert Curve

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fang, C., Chang, EC. (2012). Adaptive Differentially Private Histogram of Low-Dimensional Data. In: Fischer-Hübner, S., Wright, M. (eds) Privacy Enhancing Technologies. PETS 2012. Lecture Notes in Computer Science, vol 7384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31680-7_9

  • DOI: https://doi.org/10.1007/978-3-642-31680-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31679-1

  • Online ISBN: 978-3-642-31680-7

  • eBook Packages: Computer Science, Computer Science (R0)