Differentially Private High-Dimensional Data Publication via Markov Network

  • Fengqiong Wei
  • Wei Zhang
  • Yunfang Chen
  • Jingwen Zhao
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 254)

Abstract

Differentially private data publication has recently received considerable attention. Publishing high-dimensional data under differential privacy, however, remains challenging because of complex attribute relationships, high computational complexity, and data sparsity. We therefore propose PrivMN, a novel method for publishing high-dimensional data with a differential privacy guarantee. We first use a Markov network to represent the mutual relationships between attributes, which addresses the problem that the direction of a relationship between variables often cannot be determined in practice. We then use approximate inference to compute the joint distribution of the high-dimensional data under differential privacy, which avoids the computational and space complexity of exact inference. Extensive experiments on real datasets demonstrate that our solution publishes high-dimensional synthetic datasets more efficiently under the guarantee of differential privacy.
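
The abstract does not spell out how PrivMN perturbs the statistics it feeds into the Markov network, so the following is only an illustrative sketch of one common building block: estimating the joint distribution of a single attribute pair with the standard Laplace mechanism. The function names, the toy data, and the choice of sensitivity 1 (neighbouring datasets differ by adding or removing one record) are assumptions made for this example, not details taken from the paper.

```python
import random
from collections import Counter
from itertools import product


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_pairwise_marginal(records, i, j, domain_i, domain_j, epsilon, rng):
    """Hypothetical helper: epsilon-DP estimate of P(attribute i, attribute j)."""
    # Contingency table for the attribute pair.
    counts = Counter((r[i], r[j]) for r in records)
    cells = list(product(domain_i, domain_j))
    # Assumption: adding or removing one record changes one cell by 1,
    # so the L1 sensitivity is 1 and the noise scale is 1 / epsilon.
    noisy = {
        c: max(0.0, counts[c] + laplace_noise(1.0 / epsilon, rng))
        for c in cells
    }
    # Post-processing (clipping and renormalising) does not affect privacy.
    total = sum(noisy.values())
    if total == 0.0:
        return {c: 1.0 / len(cells) for c in cells}
    return {c: v / total for c, v in noisy.items()}


# Toy usage on synthetic binary records (not data from the paper).
rng = random.Random(0)
records = [tuple(rng.randint(0, 1) for _ in range(3)) for _ in range(1000)]
marginal = noisy_pairwise_marginal(
    records, 0, 1, [0, 1], [0, 1], epsilon=1.0, rng=rng
)
```

In a Markov-network setting, noisy marginals of this kind would typically be computed for small groups of related attributes and then combined by inference. How PrivMN selects those groups and splits the privacy budget is not stated in the abstract.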

Keywords

Differential privacy; High-dimensional data; Data publication; Markov network

Notes

Acknowledgments

The authors thank the anonymous reviewers for their constructive comments and suggestions. This work was supported by the National Natural Science Foundation of China under grants 61272422 and 61672297.

Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018

Authors and Affiliations

  • Fengqiong Wei (1)
  • Wei Zhang (1)
  • Yunfang Chen (1)
  • Jingwen Zhao (1)
  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China