# Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently

• Conference paper
Combinatorial Algorithms (IWOCA 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9843)


## Abstract

The probability that two spatial objects establish some kind of mutual connection often depends on their proximity. To formalize this concept, we define the notion of a probabilistic neighborhood: Let P be a set of n points in $$\mathbb {R}^d$$, $$q \in \mathbb {R}^d$$ a query point, $${\text {dist}}$$ a distance metric, and $$f : \mathbb {R}^+ \rightarrow [0,1]$$ a monotonically decreasing function. Then, the probabilistic neighborhood $$N(q,f)$$ of q with respect to f is a random subset of P and each point $$p \in P$$ belongs to $$N(q,f)$$ with probability $$f({\text {dist}}(p,q))$$. Possible applications include query sampling and the simulation of probabilistic spreading phenomena, as well as other scenarios where the probability of a connection between two entities decreases with their distance. We present a fast, sublinear-time query algorithm to sample probabilistic neighborhoods from planar point sets. For certain distributions of planar P, we prove that our algorithm answers a query in $$O((|N(q,f)| + \sqrt{n})\log n)$$ time with high probability. In experiments this yields a speedup over pairwise distance probing of at least one order of magnitude, even for rather small data sets with $$n=10^5$$ and also for other point distributions not covered by the theoretical results.
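To make the definition concrete, the following minimal sketch samples a probabilistic neighborhood by pairwise distance probing, i.e. the linear-time baseline mentioned above. It is not the sublinear-time algorithm of the paper. The function name `sample_neighborhood_naive`, the Euclidean metric, and the example function `f(d) = exp(-d)` are illustrative assumptions and do not come from the paper.

```python
import math
import random


def sample_neighborhood_naive(points, q, f, rng=random):
    """Sample N(q, f) by probing every point once (O(n) per query).

    points: iterable of coordinate tuples (the set P)
    q:      query point with the same dimension as the points
    f:      monotonically decreasing function mapping a distance to [0, 1]
    rng:    object with a random() method returning floats in [0, 1)
    """
    neighborhood = []
    for p in points:
        d = math.dist(p, q)  # Euclidean distance (assumed metric), Python 3.8+
        # Keep p independently with probability f(dist(p, q)).
        if rng.random() < f(d):
            neighborhood.append(p)
    return neighborhood


if __name__ == "__main__":
    # Hypothetical example data and an assumed decay function.
    pts = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    print(sample_neighborhood_naive(pts, (0.0, 0.0), lambda d: math.exp(-d)))
```

Because every point is inspected, the running time is linear in n regardless of how small the sampled neighborhood turns out to be, which is the cost the query algorithm of the paper is designed to avoid.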


## Notes

1. We say “with high probability” (whp) when referring to a probability $$\ge 1- 1/n$$ for sufficiently large n.

2. The probability density in the polar model depends only on radii r and R as well as a growth parameter $$\alpha$$ and is given by $$g(r) := \alpha \frac{\sinh (\alpha r)}{\cosh (\alpha R)-1}$$.
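As an illustration of the density in note 2, the sketch below draws a radius by inverse transform sampling. It assumes that r ranges over [0, R], so that integrating g gives the cumulative distribution (cosh(αr) − 1)/(cosh(αR) − 1), which can be inverted in closed form. The function names are hypothetical, and inversion is a standard technique that is not necessarily how the authors generated their point sets.

```python
import math
import random


def radial_density(r, R, alpha):
    """Density g(r) = alpha * sinh(alpha * r) / (cosh(alpha * R) - 1)."""
    return alpha * math.sinh(alpha * r) / (math.cosh(alpha * R) - 1.0)


def radial_cdf(r, R, alpha):
    """Integral of g from 0 to r, assuming the support is [0, R]."""
    return (math.cosh(alpha * r) - 1.0) / (math.cosh(alpha * R) - 1.0)


def sample_radius(R, alpha, rng=random):
    """Draw one radius by inverting the CDF at a uniform random number."""
    u = rng.random()  # uniform in [0, 1)
    # Solve radial_cdf(r) = u for r.
    return math.acosh(1.0 + u * (math.cosh(alpha * R) - 1.0)) / alpha


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    print([round(sample_radius(5.0, 1.0), 3) for _ in range(5)])
```

For larger values of α the sampled radii concentrate more strongly near R, since sinh(αr) grows roughly exponentially in r.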


## Acknowledgements

This work is partially supported by German Research Foundation (DFG) grant ME 3619/3-1 within the Priority Programme 1736 Algorithms for Big Data. The authors thank Mark Ortmann for helpful discussions.

## Author information

### Corresponding author

Correspondence to Moritz von Looz.

## Rights and permissions

© 2016 Springer International Publishing Switzerland

### Cite this paper

von Looz, M., Meyerhenke, H. (2016). Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently. In: Mäkinen, V., Puglisi, S., Salmela, L. (eds) Combinatorial Algorithms. IWOCA 2016. Lecture Notes in Computer Science, vol 9843. Springer, Cham. https://doi.org/10.1007/978-3-319-44543-4_35

• DOI: https://doi.org/10.1007/978-3-319-44543-4_35

• Publisher Name: Springer, Cham

• Print ISBN: 978-3-319-44542-7

• Online ISBN: 978-3-319-44543-4

• eBook Packages: Computer Science, Computer Science (R0)