Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently

  • Conference paper
Combinatorial Algorithms (IWOCA 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9843)

Abstract

The probability that two spatial objects establish some kind of mutual connection often depends on their proximity. To formalize this concept, we define the notion of a probabilistic neighborhood: Let P be a set of n points in \(\mathbb {R}^d\), \(q \in \mathbb {R}^d\) a query point, \({\text {dist}}\) a distance metric, and \(f : \mathbb {R}^+ \rightarrow [0,1]\) a monotonically decreasing function. Then, the probabilistic neighborhood \(N(q,f)\) of q with respect to f is a random subset of P, and each point \(p \in P\) belongs to \(N(q,f)\) with probability \(f({\text {dist}}(p,q))\). Possible applications include query sampling and the simulation of probabilistic spreading phenomena, as well as other scenarios where the probability of a connection between two entities decreases with their distance. We present a fast, sublinear-time query algorithm to sample probabilistic neighborhoods from planar point sets. For certain distributions of planar P, we prove that our algorithm answers a query in \(O((|N(q,f)| + \sqrt{n})\log n)\) time with high probability. In experiments, this yields a speedup over pairwise distance probing of at least one order of magnitude, even for rather small data sets with \(n=10^5\) and also for other point distributions not covered by the theoretical results.
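For orientation, the pairwise distance probing baseline mentioned in the abstract can be sketched directly from the definition of the probabilistic neighborhood. This is a minimal illustration and not the authors' implementation: the function name `sample_neighborhood_naive`, the choice of the Euclidean metric, and the example function `f` are assumptions made for the sketch.

```python
import math
import random


def sample_neighborhood_naive(points, q, f, rng=random):
    """Sample N(q, f) by probing every point once (linear time in n).

    points: iterable of coordinate tuples (the set P)
    q:      query point as a coordinate tuple
    f:      monotonically decreasing function mapping a distance to [0, 1]
    rng:    object with a random() method returning a float in [0, 1)
    """
    neighborhood = []
    for p in points:
        # Assumption: dist is the Euclidean metric.
        d = math.dist(p, q)
        # Keep p independently with probability f(dist(p, q)).
        if rng.random() < f(d):
            neighborhood.append(p)
    return neighborhood


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical planar data set: uniform points in the unit square.
    P = [(rng.random(), rng.random()) for _ in range(1000)]
    # Hypothetical connection probability, decaying with distance.
    f = lambda d: math.exp(-10.0 * d)
    sample = sample_neighborhood_naive(P, (0.5, 0.5), f, rng)
    print(len(sample), "of", len(P), "points sampled")
```

Every query touches all n points here, which is the cost that the sublinear-time query algorithm of the paper is designed to avoid.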


Notes

  1. We say “with high probability” (whp) when referring to a probability \(\ge 1- 1/n\) for sufficiently large n.

  2. The probability density in the polar model depends only on radii r and R as well as a growth parameter \(\alpha \) and is given by \(g(r) := \alpha \frac{\sinh (\alpha r)}{\cosh (\alpha R)-1} \).
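As an illustration of the density in Note 2, radii can be drawn by inverse transform sampling. Assuming r ranges over \([0, R]\), integrating g gives the cumulative distribution \(G(r) = \frac{\cosh (\alpha r)-1}{\cosh (\alpha R)-1}\), which can be inverted in closed form. The sketch below rests on that assumption; the function names `radial_cdf`, `inverse_radial_cdf`, and `sample_radius` are hypothetical and not taken from the paper.

```python
import math
import random


def radial_cdf(r, alpha, R):
    """G(r): integral of g from 0 to r, assuming 0 <= r <= R."""
    return (math.cosh(alpha * r) - 1.0) / (math.cosh(alpha * R) - 1.0)


def inverse_radial_cdf(u, alpha, R):
    """Solve G(r) = u for r, with u in [0, 1]."""
    return math.acosh(1.0 + u * (math.cosh(alpha * R) - 1.0)) / alpha


def sample_radius(alpha, R, rng=random):
    """Draw one radius with density g via inverse transform sampling."""
    return inverse_radial_cdf(rng.random(), alpha, R)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical parameter values chosen only for the demonstration.
    radii = [sample_radius(alpha=1.0, R=5.0, rng=rng) for _ in range(5)]
    print(radii)
```

Because \(\sinh\) grows exponentially, most of the probability mass lies close to R, so sampled radii cluster near the boundary of the disk.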

Acknowledgements

This work is partially supported by the German Research Foundation (DFG) grant ME 3619/3-1 within the Priority Programme 1736 Algorithms for Big Data. The authors thank Mark Ortmann for helpful discussions.

Author information

Corresponding author

Correspondence to Moritz von Looz.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

von Looz, M., Meyerhenke, H. (2016). Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently. In: Mäkinen, V., Puglisi, S., Salmela, L. (eds) Combinatorial Algorithms. IWOCA 2016. Lecture Notes in Computer Science, vol 9843. Springer, Cham. https://doi.org/10.1007/978-3-319-44543-4_35

  • DOI: https://doi.org/10.1007/978-3-319-44543-4_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44542-7

  • Online ISBN: 978-3-319-44543-4

  • eBook Packages: Computer Science, Computer Science (R0)
