WADS 2019: Algorithms and Data Structures pp 28-42

# Efficient Nearest-Neighbor Query and Clustering of Planar Curves

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11646)

## Abstract

We study two fundamental problems dealing with curves in the plane, namely, the nearest-neighbor problem and the center problem. Let $$\mathcal {C}$$ be a set of n polygonal curves, each of size m. In the nearest-neighbor problem, the goal is to construct a compact data structure over $$\mathcal {C}$$, such that, given a query curve Q, one can efficiently find the curve in $$\mathcal {C}$$ closest to Q. In the center problem, the goal is to find a curve Q, such that the maximum distance between Q and the curves in $$\mathcal {C}$$ is minimized. We use the well-known discrete Fréchet distance function, both under $$L_\infty$$ and under $$L_2$$, to measure the distance between two curves.
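To make the two definitions above concrete, here is a minimal sketch of the discrete Fréchet distance and of a brute-force nearest-neighbor query. It assumes curves are given as non-empty lists of `(x, y)` tuples; the function names are illustrative and do not come from the paper. The distance routine is the textbook quadratic-time dynamic program, and the query simply scans all of $$\mathcal {C}$$, so it is only a baseline and not the compact data structure the paper constructs.

```python
from math import hypot


def dist_linf(p, q):
    """L_infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def dist_l2(p, q):
    """Euclidean (L_2) distance between two points in the plane."""
    return hypot(p[0] - q[0], p[1] - q[1])


def discrete_frechet(P, Q, dist=dist_l2):
    """Discrete Frechet distance via the standard O(|P||Q|) dynamic program.

    Only the previous row of the table is kept, so memory is O(|Q|).
    """
    if not P or not Q:
        raise ValueError("curves must be non-empty")
    prev = []
    for i, p in enumerate(P):
        cur = []
        for j, q in enumerate(Q):
            d = dist(p, q)
            if i == 0 and j == 0:
                best = d
            elif i == 0:
                best = max(cur[j - 1], d)
            elif j == 0:
                best = max(prev[0], d)
            else:
                # Cheapest way to reach (i, j): advance on P, on Q, or on both.
                best = max(min(prev[j], prev[j - 1], cur[j - 1]), d)
            cur.append(best)
        prev = cur
    return prev[-1]


def nearest_curve(curves, Q, dist=dist_l2):
    """Brute-force nearest neighbor: index of the curve closest to Q."""
    return min(range(len(curves)),
               key=lambda k: discrete_frechet(curves[k], Q, dist))
```

Passing `dist=dist_linf` switches both functions to the $$L_\infty$$ variant. The linear scan costs one quadratic-time distance computation per input curve, which is the kind of query cost a dedicated data structure is meant to avoid.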

For the nearest-neighbor problem, despite discouraging previous results, we identify two important cases for which it is possible to obtain practical bounds, even when m and n are large. In these cases, either Q is a line segment or $$\mathcal {C}$$ consists of line segments, and the bounds on the size of the data structure and query time are nearly linear in the size of the input and query curve, respectively. The returned answer is either exact under $$L_\infty$$, or approximated to within a factor of $$1+\varepsilon$$ under $$L_2$$. We also consider the variants in which the location of the input curves is only fixed up to translation, and obtain similar bounds, under $$L_\infty$$.

As for the center problem, we study the case where the center is a line segment, i.e., we seek the line segment that represents the given set as well as possible. We present near-linear time exact algorithms under $$L_\infty$$, even when the location of the input curves is only fixed up to translation. Under $$L_2$$, we present a roughly $$O(n^2m^3)$$-time exact algorithm.
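The objective of the segment-center problem can be sketched as follows. When one curve is a segment with endpoints `a` and `b`, any coupling matches a prefix of the other curve to `a` and the remaining suffix to `b`, so the discrete Fréchet distance reduces to a minimum over split points and can be evaluated in linear time with prefix and suffix maxima. The code below only evaluates the cost of a candidate segment; it does not search for the optimal one and is not the paper's algorithm. Function names and the tuple-based curve representation are assumptions made for illustration.

```python
from math import hypot


def dist_linf(p, q):
    """L_infinity distance between two points in the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def dist_l2(p, q):
    """Euclidean (L_2) distance between two points in the plane."""
    return hypot(p[0] - q[0], p[1] - q[1])


def segment_to_curve(a, b, C, dist=dist_l2):
    """Discrete Frechet distance between the 2-vertex curve (a, b) and C.

    Runs in O(len(C)) time: a is matched to C[0..i], b to C[i+1..].
    """
    m = len(C)
    if m == 0:
        raise ValueError("curve must be non-empty")
    if m == 1:
        # The single vertex must be matched to both endpoints.
        return max(dist(a, C[0]), dist(b, C[0]))
    # suffix[i] = largest distance from b to any vertex in C[i:].
    suffix = [0.0] * (m + 1)
    for i in range(m - 1, -1, -1):
        suffix[i] = max(suffix[i + 1], dist(b, C[i]))
    best = float("inf")
    prefix = 0.0  # largest distance from a to any vertex in C[0..i]
    for i in range(m - 1):
        prefix = max(prefix, dist(a, C[i]))
        best = min(best, max(prefix, suffix[i + 1]))
    return best


def center_cost(a, b, curves, dist=dist_l2):
    """Objective of the segment-center problem for the candidate (a, b)."""
    return max(segment_to_curve(a, b, C, dist) for C in curves)
```

For example, `center_cost((0, 0), (2, 0), curves, dist_linf)` gives the largest $$L_\infty$$ discrete Fréchet distance from the segment to any curve in `curves`; the center problem asks for the endpoints that minimize this value.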

## Keywords

Polygonal curves, Nearest-neighbor queries, Clustering, Fréchet distance, Data structures, (Approximation) algorithms
