Pattern Matching Under \(\textrm{DTW}\) Distance

Part of the Lecture Notes in Computer Science book series (LNCS,volume 13617)


In this work, we consider the problem of pattern matching under the dynamic time warping (\(\textrm{DTW}\)) distance, motivated by potential applications in the analysis of biological data produced by third-generation sequencing. To measure the \(\textrm{DTW}\) distance between two strings, one must “warp” them, that is, double some letters in the strings to obtain two equal-length strings, and then sum the distances between the letters in the corresponding positions. When the distances between letters are integers, we show that for a pattern P with m runs and a text T with n runs:

  1. There is an \(\mathcal {O}(m+n)\)-time algorithm that computes all locations where the \(\textrm{DTW}\) distance from P to T is at most 1;

  2. There is an \(\mathcal {O}(kmn)\)-time algorithm that computes all locations where the \(\textrm{DTW}\) distance from P to T is at most k.

As a corollary of the second result, we also derive an approximation algorithm for general metrics on the alphabet.
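
The definition of the \(\textrm{DTW}\) distance and of the matching problem can be made concrete with the textbook quadratic dynamic program. The sketch below is only a baseline for intuition and is not the run-length-based algorithm of this work; it works on plain (uncompressed) strings. The names `dtw`, `dtw_match_ends` and `unit_dist` are illustrative assumptions, as is the convention of reporting a location by the 1-based end position of a matching substring.

```python
from math import inf


def unit_dist(a, b):
    """Hypothetical integer letter distance: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1


def dtw(x, y, dist):
    """DTW distance between two non-empty strings x and y."""
    n, m = len(x), len(y)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either both strings advance, or one letter is doubled.
            best = min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
            D[i][j] = best + dist(x[i - 1], y[j - 1])
    return D[n][m]


def dtw_match_ends(pattern, text, dist, k):
    """1-based end positions j such that some substring of text ending
    at j is at DTW distance at most k from pattern."""
    n = len(text)
    prev = [0] * (n + 1)  # row 0: a match may start at any text position
    for i in range(1, len(pattern) + 1):
        cur = [inf] * (n + 1)  # column 0: non-empty prefix vs. empty text
        for j in range(1, n + 1):
            best = min(prev[j - 1], prev[j], cur[j - 1])
            cur[j] = best + dist(pattern[i - 1], text[j - 1])
        prev = cur
    return [j for j in range(1, n + 1) if prev[j] <= k]
```

For instance, `dtw("aab", "abb", unit_dist)` is 0, because both strings warp to `aabb`. The matching routine runs in time proportional to the product of the string lengths, which is the cost the run-based bounds above are designed to improve on when the strings have few runs.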


  • Dynamic time warping distance
  • Pattern matching
  • Small-distance regime
  • Approximation algorithms

This work was partially funded by the grants ANR-20-CE48-0001, ANR-19-CE45-0008 SeqDigger and ANR-19-CE48-0016 from the French National Research Agency.

  • DOI: 10.1007/978-3-031-20643-6_23
  • Chapter length: 16 pages


  1. The preprocessing time \(\mathcal {O}(|\varSigma |^2 \log L)\) that is required to embed \(\mu \) into a well-separated metric is not accounted for in the runtime of the algorithm.


  1. Abboud, A., Backurs, A., Williams, V.V.: Tight hardness results for LCS and other sequence similarity measures. In: FOCS 2015, pp. 59–78. IEEE Computer Society (2015).

  2. Amarasinghe, S.L., Su, S., Dong, X., Zappia, L., Ritchie, M.E., Gouil, Q.: Opportunities and challenges in long-read sequencing data analysis. Genome Biol. 21(1), 1–16 (2020)

  3. Bansal, N., Buchbinder, N., Madry, A., Naor, J.: A polylogarithmic-competitive algorithm for the k-server problem. In: FOCS 2011, pp. 267–276 (2011).

  4. Braverman, V., Charikar, M., Kuszmaul, W., Woodruff, D.P., Yang, L.F.: The one-way communication complexity of dynamic time warping distance. In: SoCG 2019. LIPIcs, vol. 129, pp. 16:1–16:15 (2019).

  5. Bringmann, K., Künnemann, M.: Quadratic conditional lower bounds for string problems and dynamic time warping. In: FOCS 2015, pp. 79–97 (2015).

  6. Chen, J.Q., Wu, Y., Yang, H., Bergelson, J., Kreitman, M., Tian, D.: Variation in the ratio of nucleotide substitution and indel rates across genomes in mammals and bacteria. Mol. Biol. Evol. 26(7), 1523–1531 (2009).

  7. Driemel, A., Silvestri, F.: Locality-sensitive hashing of curves. In: SoCG 2017. LIPIcs, vol. 77, pp. 37:1–37:16 (2017).

  8. Dupont, M., Marteau, P.-F.: Coarse-DTW for sparse time series alignment. In: Douzal-Chouakria, A., Vilar, J.A., Marteau, P.-F. (eds.) AALTD 2015. LNCS (LNAI), vol. 9785, pp. 157–172. Springer, Cham (2016).

  9. Emiris, I.Z., Psarros, I.: Products of euclidean metrics and applications to proximity questions among curves. In: SoCG 2018. LIPIcs, vol. 99, pp. 37:1–37:13 (2018).

  10. Fakcharoenphol, J., Rao, S., Talwar, K.: A tight bound on approximating arbitrary metrics by tree metrics. In: STOC 2003, pp. 448–455 (2003).

  11. Froese, V., Jain, B.J., Rymar, M., Weller, M.: Fast exact dynamic time warping on run-length encoded time series. CoRR abs/1903.03003 (2019)

  12. Gold, O., Sharir, M.: Dynamic time warping and geometric edit distance: breaking the quadratic barrier. ACM Trans. Algorithms 14(4), 50:1–50:17 (2018).

  13. Gonzalez-Garay, M.L.: Introduction to isoform sequencing using pacific biosciences technology (Iso-Seq). In: Wu, J. (ed.) Transcriptomics and Gene Regulation. TRBIO, vol. 9, pp. 141–160. Springer, Dordrecht (2016).

  14. Huang, Y.T., Liu, P.Y., Shih, P.W.: Homopolish: a method for the removal of systematic errors in nanopore sequencing by homologous polishing. Genome Biol. 22(1), 95 (2021).

  15. Hwang, Y., Gelfand, S.B.: Sparse dynamic time warping. In: Perner, P. (ed.) MLDM 2017. LNCS (LNAI), vol. 10358, pp. 163–175. Springer, Cham (2017).

  16. Hwang, Y., Gelfand, S.B.: Binary sparse dynamic time warping. In: MLDM 2019, pp. 748–759. ibai Publishing (2019)

  17. Kuszmaul, W.: Dynamic time warping in strongly subquadratic time: algorithms for the low-distance regime and approximate evaluation. In: ICALP 2019. LIPIcs, vol. 132, pp. 80:1–80:15 (2019).

  18. Kuszmaul, W.: Dynamic time warping in strongly subquadratic time: algorithms for the low-distance regime and approximate evaluation. CoRR abs/1904.09690 (2019).

  19. Kuszmaul, W.: Binary dynamic time warping in linear time. CoRR abs/2101.01108 (2021)

  20. Landau, G.M., Myers, E.W., Schmidt, J.P.: Incremental string comparison. SIAM J. Comput. 27(2), 557–582 (1998).

  21. Landau, G.M., Vishkin, U.: Fast string matching with k differences. J. Comput. Syst. Sci. 37(1), 63–78 (1988).

  22. Li, H.: Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34(18), 3094–3100 (2018).

  23. Mahmoud, M., Gobet, N., Cruz-Dávalos, D.I., Mounier, N., Dessimoz, C., Sedlazeck, F.J.: Structural variant calling: the long and the short of it. Genome Biol. 20(1), 1–14 (2019).

  24. Mueen, A., Chavoshi, N., Abu-El-Rub, N., Hamooni, H., Minnich, A.: AWarp: fast warping distance for sparse time series. In: ICDM 2016, pp. 350–359. IEEE (2016)

  25. Nishi, A., Nakashima, Y., Inenaga, S., Bannai, H., Takeda, M.: Towards efficient interactive computation of dynamic time warping distance. In: Boucher, C., Thankachan, S.V. (eds.) SPIRE 2020. LNCS, vol. 12303, pp. 27–41. Springer, Cham (2020).

  26. Sakai, Y., Inenaga, S.: A reduction of the dynamic time warping distance to the longest increasing subsequence length. In: ISAAC 2020. LIPIcs, vol. 181, pp. 6:1–6:16 (2020).

  27. Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Sig. Process. 26(1), 43–49 (1978)

Corresponding author

Correspondence to Garance Gourdel.

Appendix A

Lemma 2

Consider a block \(B = D[i_p\mathinner {.\,.}j_p, i_t \mathinner {.\,.}j_t]\) and a cell \((a,b)\) in it. If \(i_p \le a < j_p\), then \(D[a,b] \le D[a+1,b]\), and if \(i_t \le b < j_t\), then \(D[a,b] \le D[a,b+1]\).


Let us first give an equivalent statement of the lemma: if \((a,b)\) and \((a+1,b)\) are in the same block, then \(D[a,b] \le D[a+1,b]\), and if \((a,b)\) and \((a,b+1)\) are in the same block, then \(D[a,b] \le D[a,b+1]\).

We show the lemma by induction on \(a+b\). The base cases of the induction are the cells such that \(a = 0\) or \(b = 0\), and for them the statement holds by the definition of D. Consider now a cell \((a,b)\), where \(a,b \ge 1\). Assume that the induction hypothesis holds for all cells \((x,y)\) such that \(x+y < a+b\). By Eq. 1, we have the following, where d denotes the distance between the letters of the two runs that define the block:

$$\begin{aligned}&D[a, b] = \min \{ D[a-1, b-1], D[a-1, b], D[a, b-1]\} +d\\&D[a+1, b] = \min \{ D[a, b-1], D[a, b], D[a+1, b-1]\} + d\\&D[a, b+1] = \min \{ D[a-1, b], D[a-1, b+1], D[a, b]\} + d\\ \end{aligned}$$

Assume that \((a,b)\) and \((a+1,b)\) are in the same block. We have \(D[a,b] \le D[a, b-1]+d\) and trivially \(D[a,b] \le D[a,b] + d\). By the induction hypothesis, \(D[a,b-1] \le D[a+1,b-1]\) (the cells \((a,b-1)\) and \((a+1,b-1)\) must belong to the same block). Therefore,

$$\begin{aligned} D[a+1,b]&= \min \{ D[a, b-1], D[a, b], D[a+1, b-1]\} + d \\&= \min \{ D[a, b-1] + d, D[a, b] + d, D[a+1, b-1] + d\} \\&\ge \min \{D[a,b], D[a,b], D[a,b-1]+d\} \\&\ge \min \{D[a,b], D[a,b], D[a,b]\} = D[a,b]. \end{aligned}$$

Assume now that \((a,b)\) and \((a,b+1)\) are in the same block. We have \(D[a,b] \le D[a-1, b]+d\). Furthermore, as \((a-1,b)\) and \((a-1,b+1)\) are in the same block, we have \(D[a-1,b] \le D[a-1,b+1]\) by the induction hypothesis. Therefore,

$$\begin{aligned} D[a,b+1]&= \min \{ D[a-1, b], D[a-1, b+1], D[a, b]\} + d\\&= \min \{ D[a-1, b] + d, D[a-1, b+1] + d, D[a, b] + d\}\\&\ge \min \{D[a-1,b]+d, D[a-1,b]+d, D[a,b]\}\\&\ge \min \{D[a,b], D[a,b], D[a,b]\} = D[a,b]. \end{aligned}$$

This concludes the proof of the lemma.    \(\square \)
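
The monotonicity stated in Lemma 2 can also be checked empirically on small inputs. The sketch below fills the table cell by cell and tests both inequalities for every pair of adjacent cells in a common block. It rests on assumptions not spelled out in this appendix: the boundary values \(D[0,b] = 0\) and \(D[a,0] = \infty \) for \(a \ge 1\) (the usual convention when a match may start anywhere in the text), and a hypothetical integer letter distance `abs_dist`. All function names are illustrative.

```python
import random
from math import inf


def abs_dist(a, b):
    """Hypothetical integer metric on letters (difference of code points)."""
    return abs(ord(a) - ord(b))


def matching_table(pattern, text, dist):
    """Table D with assumed boundary D[0][b] = 0 and D[a][0] = inf (a >= 1)."""
    m, n = len(pattern), len(text)
    D = [[0] * (n + 1)] + [[inf] * (n + 1) for _ in range(m)]
    for a in range(1, m + 1):
        for b in range(1, n + 1):
            d = dist(pattern[a - 1], text[b - 1])
            D[a][b] = min(D[a - 1][b - 1], D[a - 1][b], D[a][b - 1]) + d
    return D


def monotone_within_blocks(pattern, text, dist):
    """True if D is non-decreasing downwards and rightwards inside each block."""
    D = matching_table(pattern, text, dist)
    m, n = len(pattern), len(text)
    for a in range(1, m + 1):
        for b in range(1, n + 1):
            # Rows a and a+1 share a block iff the two pattern letters
            # are equal, i.e. they belong to the same run.
            if a < m and pattern[a - 1] == pattern[a] and D[a][b] > D[a + 1][b]:
                return False
            # Same test for columns b and b+1 and the text runs.
            if b < n and text[b - 1] == text[b] and D[a][b] > D[a][b + 1]:
                return False
    return True


def check_random(trials, seed=0):
    """Run the check on random strings over a three-letter alphabet."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = "".join(rng.choice("abc") for _ in range(rng.randint(1, 8)))
        t = "".join(rng.choice("abc") for _ in range(rng.randint(1, 12)))
        if not monotone_within_blocks(p, t, abs_dist):
            return False
    return True
```

Such a check is no substitute for the proof; it only illustrates that the inequalities concern cells of one block and say nothing about cells separated by a run boundary.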

Appendix B

Theorem 2

Suppose we are given run-length encodings of a pattern P with m runs and of a text T with n runs over an alphabet \(\varSigma \). Assume that the \(\textrm{DTW}\) distance is specified by a metric \(\mu \) on \(\varSigma \), and suppose that the ratio between the largest and the smallest non-zero distances between the letters of \(\varSigma \) is at most exponential in \(L = \max \{|P|,|T|\}\). Then, for any \(0< \varepsilon < 1\), there is an \(\mathcal {O}(L^{1-\varepsilon } \cdot mn \log ^3 L)\)-time algorithm that computes an \(\mathcal {O}(L^{\varepsilon })\)-approximation of the smallest \(\textrm{DTW}\) distance between P and a substring of T correctly with high probability (see Footnote 1).


Any metric \(\mu \) can be embedded in \(\mathcal {O}(\sigma ^2)\) time into a well-separated tree metric \(\mu _\tau \) of depth \(\mathcal {O}(\log \sigma )\) with expected distortion \(\mathcal {O}(\log \sigma )\) (see [10] and [3, Theorem 2.4]). Furthermore, the ratio between the smallest distance and the largest distance grows at most polynomially. Formally, for any two letters \(a, b\) we have \(\mu (a,b) \le \mu _\tau (a,b)\) and \(\mathbb {E}(\mu _\tau (a,b)) \le \mathcal {O}(\log \sigma ) \cdot \mu (a,b)\). Therefore, we have:

$$\begin{aligned} \textrm{DTW}_{\mu }(X,Y)&\le \textrm{DTW}_{\mu _\tau }(X,Y) \end{aligned} \tag{4}$$
$$\begin{aligned} \mathbb {E}(\textrm{DTW}_{\mu _\tau }(X,Y))&\le \mathcal {O}(\log \sigma ) \cdot \textrm{DTW}_\mu (X,Y) \end{aligned} \tag{5}$$

Let \(\delta = \min _{S-\text { substr. of }T} \textrm{DTW}_\mu (P,S)\) and \(\delta _\tau = \min _{S-\text { substr. of }T} \textrm{DTW}_{\mu _\tau } (P,S)\). Assume that \(\delta \) is realised on a substring X, and \(\delta _\tau \) on a substring \(X_\tau \). By Eq. 4, we then obtain:

$$\delta = \textrm{DTW}_\mu (P,X) \le \textrm{DTW}_\mu (P,X_\tau ) \le \delta _\tau $$

And Eq. 5 gives the following:

$$\mathbb {E}(\delta _\tau ) \le \mathbb {E}(\textrm{DTW}_{\mu _\tau } (P,X)) \le \mathcal {O}(\log \sigma ) \cdot \textrm{DTW}_\mu (P,X) = \mathcal {O}(\log \sigma ) \cdot \delta $$

We apply the embedding \(\log L\) times independently to obtain well-separated tree metrics \(\mu _\tau ^i\), \(i = 1, 2, \ldots , \log L\). From above and by Chernoff bounds,

$$\min _i \min _{S-\text { substr. of }T} \textrm{DTW}_{\mu _\tau ^i}(P,S)$$

gives an \(\mathcal {O}(\log \sigma ) = \mathcal {O}(\log L)\) approximation of \(\delta \) with high probability and can be computed in time \(\mathcal {O}(L^{1-\varepsilon } \cdot mn \log ^3 L)\) by Lemma 6, concluding the proof of the theorem.    \(\square \)
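
The probability amplification used in this last step can be spelled out as follows. This is a sketch under assumptions not stated in the text: c denotes the hidden constant in the \(\mathcal {O}(\log \sigma )\) distortion bound, \(\delta _\tau ^i\) denotes the smallest \(\textrm{DTW}\) distance between P and a substring of T under \(\mu _\tau ^i\), logarithms are taken base 2, and \(\delta > 0\) (if \(\delta = 0\), the expectation bound forces \(\delta _\tau ^i = 0\) with probability 1). The argument uses only Markov's inequality and the independence of the embeddings, which suffices here in place of a general Chernoff bound.

$$\begin{aligned} \Pr \left[ \delta _\tau ^i> 2c \log \sigma \cdot \delta \right]&\le \frac{\mathbb {E}(\delta _\tau ^i)}{2c \log \sigma \cdot \delta } \le \frac{1}{2}, \\ \Pr \left[ \min _i \delta _\tau ^i > 2c \log \sigma \cdot \delta \right]&\le \left( \frac{1}{2}\right) ^{\log L} = \frac{1}{L}. \end{aligned}$$

Since \(\delta \le \delta _\tau ^i\) holds for every i, the minimum over the \(\log L\) independent embeddings lies between \(\delta \) and \(2c \log \sigma \cdot \delta \) with probability at least \(1 - 1/L\).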

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gourdel, G., Driemel, A., Peterlongo, P., Starikovskaya, T. (2022). Pattern Matching Under \(\textrm{DTW}\) Distance. In: Arroyuelo, D., Poblete, B. (eds) String Processing and Information Retrieval. SPIRE 2022. Lecture Notes in Computer Science, vol 13617. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20642-9

  • Online ISBN: 978-3-031-20643-6

  • eBook Packages: Computer Science, Computer Science (R0)