Abstract
We want to publish low-dimensional points, for example 2D spatial points, in a differentially private manner. Most existing mechanisms publish noisy frequency counts of points in a fixed, predefined partition. Arguably, histograms with adaptive partitions, for example V-optimal and equi-depth histograms, which have smaller bin widths in denser regions, would provide more statistical information. However, because adaptive partitions leak significant information about the dataset, it is not clear how they can be published accurately under differential privacy. In this paper, we propose a simple method based on the observation that the sensitivity of publishing the sorted sequence of a dataset is independent of the size of the dataset. Together with isotonic regression, the dataset can be reconstructed with high accuracy. One advantage of the proposed method is its simplicity, in the sense that there are only a few parameters to be determined. Furthermore, the parameters can be estimated solely from the privacy requirement ε and the total number of points, and hence do not leak information about the data. Although the parameters are chosen to minimize the earth mover’s distance between the published data and the original data, empirical studies show that the proposed method also achieves high accuracy with respect to other measures, for example range queries and order statistics.
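The core idea of the abstract (publish a noisy sorted sequence, then restore the ordering with isotonic regression) can be illustrated with a minimal one-dimensional sketch. It is not the paper's full mechanism, and the abstract does not describe how the parameters are chosen. Several things here are assumptions made for illustration: the points lie in [0, 1], neighboring datasets differ by replacing one point (so the L1 sensitivity of the sorted vector is at most the domain width, 1), Laplace noise is used, and the function names `isotonic_regression` and `publish_sorted` are hypothetical.

```python
import random


def isotonic_regression(values):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    blocks = []  # each block is [sum, count] of the values pooled into it
    for v in values:
        blocks.append([v, 1])
        # Pool the last two blocks while their means violate the ordering.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def publish_sorted(points, epsilon, rng):
    """Noisy sorted sequence of points in [0, 1], made monotone again.

    Assumption: replacing one point changes the sorted vector by at most 1
    in L1 norm, regardless of how many points there are, so the Laplace
    scale is 1 / epsilon.
    """
    scale = 1.0 / epsilon
    noisy = [
        # The difference of two exponentials with mean `scale` is Laplace(0, scale).
        x + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        for x in sorted(points)
    ]
    fitted = isotonic_regression(noisy)
    # Clamping to the domain is post-processing and keeps the order.
    return [min(1.0, max(0.0, v)) for v in fitted]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.random() for _ in range(1000)]
    released = publish_sorted(data, epsilon=1.0, rng=rng)
    print(released[:5], released[-5:])
```

The isotonic regression and the clamping operate only on the noisy output, so they do not consume additional privacy budget in this sketch.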
Keywords
- Range Query
- Generalization Error
- Twitter User
- Privacy Requirement
- Hilbert Curve
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fang, C., Chang, EC. (2012). Adaptive Differentially Private Histogram of Low-Dimensional Data. In: Fischer-Hübner, S., Wright, M. (eds) Privacy Enhancing Technologies. PETS 2012. Lecture Notes in Computer Science, vol 7384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31680-7_9
DOI: https://doi.org/10.1007/978-3-642-31680-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31679-1
Online ISBN: 978-3-642-31680-7
eBook Packages: Computer Science, Computer Science (R0)
