Skip to main content
Log in

Approximating the Geometric Edit Distance

  • Published:
Algorithmica Aims and scope Submit manuscript

Abstract

Edit distance is a measurement of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis, bioinformatics, etc., need to be solved in a geometric space. Therefore, the geometric edit distance (GED) has been studied. In this paper, we describe the first strictly sublinear approximate near-linear time algorithm for computing the GED of two point sequences in constant dimensional Euclidean space. Specifically, we present a randomized \(O(n\log ^2n)\) time \(O(\sqrt{n})\)-approximation algorithm. Then, we generalize our result to give a randomized \(\alpha \)-approximation algorithm for any \(\alpha \in [\sqrt{\log n}, \sqrt{n/\log n}]\), running in time \(O(n^2/\alpha ^2 \log n)\). Both algorithms are Monte Carlo and return approximately optimal solutions with high probability.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

Notes

  1. The (discrete) Fréchet and DTW distances are defined similarly to GED; however, they use one-to-many correspondences instead of one-to-one matchings, and they disallow the use of gap points. As in GED, DTW aims to minimize the sum of distances between corresponding points, while discrete Fréchet distance aims to minimize the maximum distance over corresponding points.

  2. We say an event occurs with high probability if it occurs with probability at least \(1-\frac{1}{n^c}\) for some constant \(c>0\).

  3. This variant of the string edit distance is really closer to the shortest common supersequence length of the strings rather than the traditional Levenshtein distance, but we stick with “edit distance” for simplicity.

References

  1. Abboud, A., Backurs, A., Williams, V.V.: Tight hardness results for LCS and other sequence similarity measures. In: Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pp. 59–78 (2015)

  2. Agarwal, P.K., Fox, K., Pan, J., Ying, R.: Approximating dynamic time warping and edit distance for a pair of point sequences. In: Proceedings of the 32nd International Symposium on Computational Geometry, pp. 6:1–6:16 (2016)

  3. Andoni, A., Krauthgamer, R., Onak, K.: Polylogarithmic approximation for edit distance and the asymmetric query complexity. In: Proceedings of the IEEE 51st Annual Symposium on Foundations of Computer Science, pp. 377–386 (2010)

  4. Andoni, A., Nosatzki, N.S.: Edit distance in near-linear time: it’s a constant factor. In: Proceedings of the IEEE 61st Annual Symposium on Foundations of Computer Science (2020, to appear)

  5. Backurs, A., Indyk, P.: Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In: Proceedings of the 47th Annual ACM Symposium on Theory of Computing, pp. 51–58 (2015)

  6. Bringmann, K.: Why walking the dog takes time: Fréchet distance has no strongly subquadratic algorithms unless SETH fails. In: Proceedings of the IEEE 55th Annual Symposium on Foundations of Computer Science, pp. 661–670 (2014)

  7. Bringmann, K., Künnemann, M.: Quadratic conditional lower bounds for string problems and dynamic time warping. In: Proceedings of the IEEE 56th Annual Symposium on Foundations of Computer Science, pp. 79–97 (2015)

  8. Bringmann, K., Mulzer, W.: Approximability of the discrete Fréchet distance. JoCG 7(2), 46–76 (2016)

    MATH  Google Scholar 

  9. Chakraborty, D., Das, D., Goldenberg, E., Koucky, M., Saks, M.: Approximating edit distance within constant factor in truly sub-quadratic time. In: Proceedings of the 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pp. 979–990. IEEE (2018)

  10. Chan, T.M., Rahmati, Z.: An improved approximation algorithm for the discrete Fréchet distance. Inf. Process. Lett. 138, 72–74 (2018)

    Article  Google Scholar 

  11. Chen, L., Ng, R.: On the marriage of Lp-norms and edit distance. In: Proceedings of the 30th International Conference on Very Large Databases, pp. 792–803 (2004)

  12. Chen, L., Özsu, M.T., Oria, V.: Robust and fast similarity search for moving object trajectories. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pp. 491–502 (2005)

  13. Fox, K., Li, X.: Approximating the geometric edit distance. In: Proceedings of the 30th International Symposium on Algorithms and Computation, pp. 26:1–26:16 (2019)

  14. Gold, O., Sharir, M.: Dynamic time warping and geometric edit distance: breaking the quadratic barrier. ACM Trans. Algorithms 14(4), 50 (2018)

    Article  MathSciNet  Google Scholar 

  15. Har-Peled, S.: Geometric approximation algorithms, chap. 11, Random Partition via Shifting. American Mathematical Society (2011)

  16. Impagliazzo, R., Paturi, R.: On the complexity of k-SAT. J. Comp. Sys. Sci. 62(2), 367–375 (2001)

    Article  MathSciNet  Google Scholar 

  17. Kuszmaul, W.: Dynamic time warping in strongly subquadratic time: Algorithms for the low-distance regime and approximate evaluation. In: Proceedings of the 46th International Colloquium on Automata, Languages and Programming (2019)

  18. Landau, G.M., Myers, E.W., Schmidt, J.P.: Incremental string comparison. SIAM J. Comput. 27(2), 557–582 (1998)

    Article  MathSciNet  Google Scholar 

  19. Marteau, P.F.: Time warp edit distance with stiffness adjustment for time series matching. IEEE Trans. Pattern Anal. Mach. Intell. 31(2), 306–318 (2009)

    Article  Google Scholar 

  20. Masek, W.J., Paterson, M.S.: A faster algorithm computing string edit distances. J. Comput. Syst. Sci. 20(1), 18–31 (1980)

    Article  MathSciNet  Google Scholar 

  21. Sankararaman, S., Agarwal, P.K., Mølhave, T., Pan, J., Boedihardjo, A.P.: Model-driven matching and segmentation of trajectories. In: Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 234–243 (2013)

  22. Stojmirovic, A., Yu, Y.K.: Geometric aspects of biological sequence comparison. J. Comput. Biol. 16(4), 579–611 (2009)

    Article  MathSciNet  Google Scholar 

  23. Ukkonen, E.: Algorithms for approximate string matching. Inf. Control 64(1–3), 100–118 (1985)

    Article  MathSciNet  Google Scholar 

  24. Wagner, R.A., Fischer, M.J.: The string-to-string correction problem. J. ACM 21(1), 168–173 (1974)

    Article  MathSciNet  Google Scholar 

  25. Wang, X., Mueen, A., Ding, H., Trajcevski, G., Scheuermann, P., Keogh, E.: Experimental comparison of representation methods and distance measures for time series data. Data Min. Knowl. Disc. 26(2), 275–309 (2013)

    Article  MathSciNet  Google Scholar 

Download references

Acknowledgements

The authors would like to thank Anne Driemel and Benjamin Raichel for helpful discussions.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Kyle Fox.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this work appeared in the Proceedings of the 30th International Symposium on Algorithms and Computation [13]. Most of the work was done while the second author was a student at the University of Texas at Dallas.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Fox, K., Li, X. Approximating the Geometric Edit Distance. Algorithmica 84, 2395–2413 (2022). https://doi.org/10.1007/s00453-022-00966-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00453-022-00966-4

Keywords

Navigation