Differentially private counting of users’ spatial regions

Regular Paper

Abstract

Mining of spatial data is an enabling technology for mobile services, Internet-connected cars and the Internet of Things. But the very distinctiveness of spatial data that drives utility can come at the cost of user privacy. Past work on differentially private release has focused on points and trajectories. In this work, we continue the tradition of privacy-preserving spatial analytics, focusing not on point or path data but on planar spatial regions. Such data represent the area of a user’s most frequent visitation, such as “around home and nearby shops”. Specifically, we consider the differentially private release of data structures that support range queries for counting users’ spatial regions. Counting planar regions leads to unique challenges not faced in existing work. A user’s spatial region that straddles multiple data structure cells can lead to duplicate counting at query time. We provably avoid this pitfall by leveraging the Euler characteristic, for the first time in combination with differential privacy. To address the increased sensitivity of range queries to spatial region data, we calibrate privacy-preserving noise using bounded user region size and a constrained inference that uses robust least absolute deviations. Our novel constrained inference reduces noise and promotes covertness by (privately) imposing consistency. We provide a full end-to-end theoretical analysis of both differential privacy and high-probability utility for our approach using concentration bounds. A comprehensive experimental study on several real-world datasets establishes practical validity.
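
The duplicate-counting pitfall and its Euler-characteristic remedy can be illustrated with a minimal, non-private sketch. It is not the paper's data structure: it assumes axis-aligned rectangular regions snapped to grid cells, the function names `build_euler_histogram` and `count_regions` are hypothetical, and the noise calibration and constrained inference described above are omitted. The histogram keeps a count for every grid face (cell), interior edge and interior vertex, and a range query returns faces minus edges plus vertices, which contributes exactly 1 for each rectangle that intersects the query.

```python
import numpy as np


def build_euler_histogram(regions, nx, ny):
    """Count, for each face, edge and vertex of an nx-by-ny grid, the regions covering it.

    Each region is a rectangle (x0, y0, x1, y1) of inclusive cell indices.
    Doubled coordinates are used: an even index is a cell, an odd index is
    the boundary between two neighbouring cells.
    """
    hist = np.zeros((2 * nx - 1, 2 * ny - 1), dtype=np.int64)
    for x0, y0, x1, y1 in regions:
        # A rectangle covers its cells plus the edges and vertices strictly inside it.
        hist[2 * x0 : 2 * x1 + 1, 2 * y0 : 2 * y1 + 1] += 1
    return hist


def count_regions(hist, qx0, qy0, qx1, qy1):
    """Number of regions intersecting the query rectangle (inclusive cell indices)."""
    window = hist[2 * qx0 : 2 * qx1 + 1, 2 * qy0 : 2 * qy1 + 1]
    i, j = np.indices(window.shape)
    # Faces (even, even) and vertices (odd, odd) add; edges (mixed parity) subtract.
    signs = np.where((i + j) % 2 == 0, 1, -1)
    return int((signs * window).sum())


def naive_cell_sum(hist, qx0, qy0, qx1, qy1):
    """Sum of per-cell counts only; over-counts regions that span several cells."""
    window = hist[2 * qx0 : 2 * qx1 + 1, 2 * qy0 : 2 * qy1 + 1]
    return int(window[::2, ::2].sum())


regions = [(0, 0, 1, 1), (1, 1, 2, 2), (3, 3, 3, 3)]
hist = build_euler_histogram(regions, nx=4, ny=4)
print(count_regions(hist, 0, 0, 2, 2))   # 2 regions intersect the query
print(naive_cell_sum(hist, 0, 0, 2, 2))  # 8, because each 2x2 region is counted four times
```

In this toy setting the signed sum is exact because the intersection of a rectangular region with a rectangular query is itself a rectangle, whose faces, edges and vertices satisfy F - E + V = 1.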

Keywords

Differential privacy, Euler histograms, Location privacy, Spatial regions

Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  1. School of Computing and Information Systems, University of Melbourne, Parkville, Australia
  2. Data61, CSIRO, Canberra, Australia
