Improving the efficiency of traditional DTW accelerators

Tavenard, Romain; Amsaleg, Laurent

doi:10.1007/s10115-013-0698-7

Regular Paper
Published: 17 October 2013

Volume 42, pages 215–243, (2015)
Knowledge and Information Systems

Romain Tavenard¹ &
Laurent Amsaleg²

Abstract

Dynamic time warping (DTW) is the most popular approach for evaluating the similarity of time series, but its computation is costly. Therefore, simple functions lower bounding DTW distances have been designed, accelerating searches by quickly pruning sequences that could not possibly be best matches. The tighter the bounds, the more they prune and the better the performance. Designing new functions that are even tighter is difficult because their computation is likely to become complex, canceling the benefits of their pruning. It is possible, however, to design simple functions with a higher pruning power by relaxing the no false dismissal assumption, resulting in approximate lower bound functions. This paper describes how very popular approaches accelerating DTW such as $\text {LB}\_\text {Keogh}{}$ and $\text {LB}\_\text {PAA}{}$ can be made more efficient via approximations. The accuracy of approximations can be tuned, ranging from no false dismissal to potential losses when aggressively set for great response time savings. At very large scale, indexing time series is mandatory. This paper also describes how approximate lower bound functions can be used with iSAX. Furthermore, it shows that a $k$-means-based quantization step for iSAX gives significant performance gains.

Notes

The $\mathcal{A}\_{}$ prefix stands for approximate.
Note that this paper relies on the DTW formulation used by Sakoe and Chiba [14], under which the Manhattan distance is an upper bound on DTW. Under the formulation used by Keogh and Ratanamahatana [8], DTW is instead upper-bounded by the Euclidean distance.
There exists exactly one such node.
isax_approximate_search is in fact the method defined in Shieh and Keogh [15] and used in the next section describing $i\text {SAX}\_\text {Approx}{}$ indexing.

References

1. Aach J, Church G (2001) Aligning gene expression time series with time warping algorithms. Bioinformatics 17(6):495
2. Camerra A, Palpanas T, Shieh J, Keogh EJ (2010) iSAX 2.0: indexing and mining one billion time series. In: Proceedings of the IEEE international conference on data mining
3. Chu S, Keogh E, Hart D, Pazzani M et al (2002) Iterative deepening dynamic time warping for time series. In: Proceedings of the SIAM international conference on data mining
4. Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. In: Proceedings of the ACM SIGMOD conference on management of data
5. Gavrila D, Davis L (1995) Towards 3-d model-based tracking and recognition of human movement: a multi-view approach. In: International workshop on automatic face-and gesture-recognition, pp 272–277
6. Itakura F (1975) Minimum prediction residual principle applied to speech recognition. IEEE Trans Acoust Speech Signal Process 23(1):67–72
7. Kashyap S, Karras P (2011) Scalable knn search on vertically stored time series. In: Proceedings of the ACM SIGKDD conference on knowledge discovery and data mining, ACM, pp 1334–1342
8. Keogh E, Ratanamahatana C (2005) Exact indexing of dynamic time warping. Knowl Inform Syst 7(3):358–386
9. Keogh E, Xi X, Wei L, Ratanamahatana CA (2006) The UCR time series classification/clustering homepage. www.cs.ucr.edu/~eamonn/time_series_data/
10. Lin J, Keogh E, Wei L, Lonardi S (2007) Experiencing SAX: a novel symbolic representation of time series. Data Mining Knowl Discov 15(2):107–144
11. Munich M, Perona P (1999) Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. In: Proceedings of the IEEE international conference on computer vision, vol 1, pp 108–115
12. Nistér D, Stewénius H (2006) Scalable recognition with a vocabulary tree. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2161–2168
13. Paulevé L, Jégou H, Amsaleg L (2010) Locality sensitive hashing: a comparison of hash function types and querying mechanisms. Pattern Recogn Lett 31(11):1348–1358
14. Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26:43–49
15. Shieh J, Keogh E (2008) iSAX: indexing and mining terabyte sized time series. In: Proceedings of the ACM SIGKDD conference on knowledge discovery and data mining
16. Sivic J, Zisserman A (2003) Video google: a text retrieval approach to object matching in videos. In: Proceedings of the IEEE international conference on computer vision, pp 1470–1477
17. Tavenard R, Jégou H, Amsaleg L (2011) Balancing clusters to reduce response time variability in large scale image search. In: Proceedings of the IEEE workshop on content-based multimedia indexing
18. Vlachos M, Hadjieleftheriou M, Gunopulos D, Keogh EJ (2003) Indexing multi-dimensional time-series with support for multiple distance measures. In: Proceedings of the ACM SIGKDD conference on knowledge discovery and data mining, pp 216–225
19. Yi B, Jagadish HV, Faloutsos C (1998) Efficient retrieval of similar time sequences under time warping. In: Proceedings of the IEEE international conference on data engineering

Author information

Authors and Affiliations

Idiap Research Institute, Rue Marconi 19, 1920, Martigny, Switzerland
Romain Tavenard
IRISA/CNRS, Campus universitaire de Beaulieu, 35042, Rennes cedex, France
Laurent Amsaleg

Corresponding author

Correspondence to Romain Tavenard.

Additional information

This work was conducted while Romain Tavenard was pursuing his Ph.D. at INRIA, Rennes, with a scholarship from Université de Rennes 1.

Appendices

1.1 Proofs and mathematical definitions for upper bounds

We prove here that UB_Keogh upper-bounds DTW when the latter is restricted to a Sakoe–Chiba band. We also introduce upper bounds related to $\text {LB}\_\text {PAA}{}$ and $i\text {SAX}\_\text {MinDist}{}$, for which we omit the proofs as they follow exactly the same principles.

Definition 1

Let UB_Keogh be:

$$\begin{aligned} \text {UB}\_\text {Keogh}{}(Q,C) = \sum _{i=1}^{n} {\left\{ \begin{array}{ll} (c_i - L_i) &{} \text { if } c_i > U_i\\ (U_i - c_i) &{} \text { if } c_i < L_i\\ \max (U_i - c_i, c_i - L_i) &{} \text { otherwise} \end{array}\right. } \end{aligned}$$

(11)
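As an illustration, Eq. (11) can be sketched in Python. The envelope definition used here ($U_i$ and $L_i$ as the maximum and minimum of the query over a window of half-width $r$ around position $i$) is an assumption, since the envelope is defined outside this appendix; the function names `envelope` and `ub_keogh` and the sample values are hypothetical.

```python
import numpy as np

def envelope(q, r):
    """Assumed envelope: max/min of q over positions i-r..i+r (clipped at the ends)."""
    n = len(q)
    U = np.array([q[max(0, i - r):i + r + 1].max() for i in range(n)])
    L = np.array([q[max(0, i - r):i + r + 1].min() for i in range(n)])
    return U, L

def ub_keogh(q, c, r):
    """Sum of the three-case terms of Eq. (11)."""
    U, L = envelope(q, r)
    above = c - L                       # case c_i > U_i
    below = U - c                       # case c_i < L_i
    inside = np.maximum(U - c, c - L)   # otherwise
    terms = np.where(c > U, above, np.where(c < L, below, inside))
    return float(terms.sum())

q = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
c = np.array([0.5, 0.5, 3.0, -1.0, 0.0])
print(ub_keogh(q, c, r=1))
```

In all three cases the term equals $\max (U_i - c_i, c_i - L_i)$, which is the form used in the proofs that follow.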

Lemma 1

For any two sequences $Q$ and $C$ of length $n$, the following inequality holds:

$$\begin{aligned} L_1(Q,C) \ge \text {DTW}{}(Q,C) \end{aligned}$$

where the considered DTW is constrained to a Sakoe–Chiba band of width $r$.

Proof

Let $Q$ and $C$ be two sequences of length $n$. The Manhattan distance $L_1$ corresponds to the alignment that follows the diagonal path. Hence, this distance is the cost of one of the possible paths considered by the DTW algorithm and is therefore greater than or equal to the cost of the minimal path, that is, the value returned by DTW, which concludes the proof of Lemma 1. $\square $
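Lemma 1 can be checked numerically with a minimal sketch. It assumes a DTW that sums the absolute differences $|q_i - c_j|$ along the warping path with unit step weights, which is one reading of the formulation referred to in the Notes; `dtw_band` and `l1` are hypothetical names.

```python
import numpy as np

def dtw_band(q, c, r):
    """DTW restricted to |i - j| <= r, with |q_i - c_j| summed along the path."""
    n = len(q)
    D = np.full((n + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(n, i + r) + 1):
            cost = abs(q[i - 1] - c[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, n])

def l1(q, c):
    """Manhattan distance, i.e. the cost of the diagonal path."""
    return float(np.sum(np.abs(q - c)))

rng = np.random.default_rng(0)
q = rng.normal(size=50)
c = rng.normal(size=50)
for r in (0, 2, 10):
    print(r, dtw_band(q, c, r), l1(q, c))
```

With $r = 0$ only the diagonal path is admissible, so both values coincide; widening the band can only lower the DTW value.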

Proposition 1

For any two sequences $Q$ and $C$ of length $n$, the following inequality holds:

$$\begin{aligned} \text {UB}\_\text {Keogh}{}(Q,C) \ge \text {DTW}{}(Q,C) \end{aligned}$$

where the considered DTW is constrained to a Sakoe–Chiba band of width $r$.

Proof

Each term in the sum defining UB_Keogh corresponds to exactly one term in the computation of the $L_1$ distance. For UB_Keogh, the $i$th term is the distance between the $i$th point of the candidate sequence and its furthest corresponding point in the envelope of the query, while for $L_1$, the $i$th term is the distance between the $i$th point of the candidate sequence and one of its possible corresponding points in the envelope of the query, namely $q_i$. The latter distance is therefore no larger than the former, and the following inequality follows from Lemma 1:

$$\begin{aligned} \text {UB}\_\text {Keogh}{}(Q,C) \ge L_1(Q,C) \ge \text {DTW}{}(Q,C). \end{aligned}$$

$\square $
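The term-by-term comparison used in this proof can be written out explicitly. It assumes that the envelope brackets the query at every position, that is $L_i \le q_i \le U_i$, which holds when $U_i$ and $L_i$ are the maximum and minimum of $Q$ over a window containing position $i$:

$$\begin{aligned} L_i \le q_i \le U_i&\;\Longrightarrow \; q_i - c_i \le U_i - c_i \quad \text {and}\quad c_i - q_i \le c_i - L_i \\&\;\Longrightarrow \; |q_i - c_i| \le \max (U_i - c_i, c_i - L_i), \\ L_1(Q,C) = \sum _{i=1}^{n} |q_i - c_i|&\le \sum _{i=1}^{n} \max (U_i - c_i, c_i - L_i) = \text {UB}\_\text {Keogh}{}(Q,C). \end{aligned}$$

The last equality uses the fact that each of the three cases in Definition 1 equals $\max (U_i - c_i, c_i - L_i)$.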

Definition 2

Let us define UB_PAA as:

$$\begin{aligned} \text {UB}\_\text {PAA}{}(Q,C) = \frac{n}{N} \cdot \left( \sum _{i=1}^{N}\max (\hat{U}_i - \bar{c_i}, \bar{c_i} - \hat{L}_i)\right) + \frac{n}{N} \cdot \left( \max (C)-\min (C)\right) . \end{aligned}$$

(12)

Lemma 2

For any two sequences $Q$ and $C$ of length $n$, the following inequality holds:

$$\begin{aligned} \text {UB}\_\text {PAA}{}(Q,C) \ge \text {UB}\_\text {Keogh}{}(Q,C). \end{aligned}$$

Proposition 2

For any two sequences $Q$ and $C$ of length $n$, the following inequality holds:

$$\begin{aligned} \text {UB}\_\text {PAA}{}(Q,C) \ge \text {DTW}{}(Q,C) \end{aligned}$$

where the considered DTW is constrained to a Sakoe–Chiba band of width $r$.

Proof

By Proposition 1, proving Lemma 2 is sufficient to prove that Proposition 2 holds.

Let $Q$ and $C$ be two sequences of length $n$ and $W$ be the minimum cost path used for DTW computation with a Sakoe–Chiba band of width $r$ between $Q$ and $C$. To prove:

$$\begin{aligned} \sum _{i=1}^{n}\max (U_i - c_i, c_i - L_i) \le \frac{n}{N} \cdot {} \left( \sum _{i=1}^{N}\max (\hat{U}_i - \bar{c_i}, \bar{c_i} - \hat{L}_i) + \max (C)-\min (C)\right) \end{aligned}$$

it is sufficient to prove that, for all $i \in \{ 1,\ldots ,N \}$:

$$\begin{aligned} \sum _{j=\frac{n}{N}(i-1)+1}^{\frac{n}{N}i}\max (U_j - c_j, c_j - L_j) \le \frac{n}{N} \cdot \left( \max (\hat{U}_i - \bar{c_i}, \bar{c_i} - \hat{L}_i) + \max (C)-\min (C)\right) \end{aligned}$$

Let $i \in \{ 1,\ldots ,N \}$. If we denote, for all $j \in \{ \frac{n}{N}(i-1)+1,\ldots ,\frac{n}{N}i \}$, $c_j = \bar{c_i} + \Delta c_j$, we get:

$$\begin{aligned}&\sum _{j=\frac{n}{N}(i-1)+1}^{\frac{n}{N}i}\max (U_j - c_j, c_j - L_j)\\&\quad \qquad = \sum _{j=\frac{n}{N}(i-1)+1}^{\frac{n}{N}i}\max (U_j - (\bar{c_i} + \Delta c_j), \bar{c_i} + \Delta c_j - L_j) \\&\quad \qquad \le \sum _{j=\frac{n}{N}(i-1)+1}^{\frac{n}{N}i}\max (\hat{U}_i - (\bar{c_i} + \Delta c_j), \bar{c_i} + \Delta c_j - \hat{L}_i) \\&\quad \qquad \le \sum _{j=\frac{n}{N}(i-1)+1}^{\frac{n}{N}i}\left( \max (\hat{U}_i - \bar{c_i}, \bar{c_i} - \hat{L}_i) + |\Delta c_j|\right) \\&\quad \qquad \le \frac{n}{N}\left( \max (\hat{U}_i - \bar{c_i}, \bar{c_i} - \hat{L}_i) + \max (C) - \min (C)\right) \end{aligned}$$

which concludes the proof. $\square $
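The per-segment chain of inequalities above can be checked numerically. The sketch below makes several assumptions that are not stated in this appendix: $N$ divides $n$, $\hat{U}_i$ and $\hat{L}_i$ are the maximum of $U_j$ and the minimum of $L_j$ over segment $i$, $\bar{c_i}$ is the mean of $C$ over segment $i$, and the envelope is the windowed max/min of the query. The names `envelope` and `segment_bounds` are hypothetical.

```python
import numpy as np

def envelope(q, r):
    """Assumed envelope: max/min of q over positions i-r..i+r (clipped at the ends)."""
    n = len(q)
    U = np.array([q[max(0, i - r):i + r + 1].max() for i in range(n)])
    L = np.array([q[max(0, i - r):i + r + 1].min() for i in range(n)])
    return U, L

def segment_bounds(q, c, r, N):
    """Left- and right-hand sides of the per-segment inequality, one value per segment."""
    n = len(c)
    w = n // N                               # segment length n/N (N must divide n)
    U, L = envelope(q, r)
    U_hat = U.reshape(N, w).max(axis=1)      # assumed definition of U-hat
    L_hat = L.reshape(N, w).min(axis=1)      # assumed definition of L-hat
    c_bar = c.reshape(N, w).mean(axis=1)     # segment means of the candidate
    lhs = np.maximum(U - c, c - L).reshape(N, w).sum(axis=1)
    rhs = w * (np.maximum(U_hat - c_bar, c_bar - L_hat) + c.max() - c.min())
    return lhs, rhs

rng = np.random.default_rng(1)
q = rng.normal(size=64)
c = rng.normal(size=64)
lhs, rhs = segment_bounds(q, c, r=3, N=8)
print(np.all(lhs <= rhs))
```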

Definition 3

Let us define iSAX_MaxDist as:

$$\begin{aligned} { i}\text {SAX}\_\text {MaxDist}{}(Q,R) = \sqrt{\frac{n}{N} \sum _{i=1}^{N} X_i} \end{aligned}$$

(13)

where

$$\begin{aligned} \forall i \le N, X_i = {\left\{ \begin{array}{ll} (\bar{q_i} - B_i)^2 &{} \text { if } \bar{q_i} > H_i\\ (H_i - \bar{q_i})^2 &{} \text { if } \bar{q_i} < B_i\\ \max (H_i - \bar{q_i}, \bar{q_i} - B_i)^2 &{} \text { otherwise} \end{array}\right. }. \end{aligned}$$

(14)
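Equations (13) and (14) can be sketched as follows. The sketch assumes that $B_i$ and $H_i$ are the lower and upper breakpoints of the region covered by the $i$th symbol of $R$ and that $\bar{q_i}$ is the $i$th PAA coefficient of the query; these symbols are defined outside this appendix. It also expects finite breakpoints. The function name `isax_max_dist` and the sample values are hypothetical.

```python
import numpy as np

def isax_max_dist(q_paa, B, H, n):
    """Eq. (13), with X_i computed by the three cases of Eq. (14)."""
    q_paa, B, H = (np.asarray(a, dtype=float) for a in (q_paa, B, H))
    N = len(q_paa)
    X = np.where(
        q_paa > H, (q_paa - B) ** 2,                   # query above the region
        np.where(
            q_paa < B, (H - q_paa) ** 2,               # query below the region
            np.maximum(H - q_paa, q_paa - B) ** 2,     # query inside the region
        ),
    )
    return float(np.sqrt(n / N * X.sum()))

q_paa = [0.0, 2.0, -1.0]   # PAA coefficients of the query (N = 3)
B = [-1.0, 0.0, 0.0]       # assumed lower breakpoints
H = [1.0, 1.0, 1.0]        # assumed upper breakpoints
print(isax_max_dist(q_paa, B, H, n=12))
```

Here the three terms are $1$, $4$ and $4$, so the result is $\sqrt{4 \cdot 9} = 6$.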

1.2 Balancing $k$-means

When $k=2$, balancing $k$-means does not require the iterative process proposed in Tavenard et al. [17]. The elevation $h$ to apply to the centroid of the more populated cluster, so that both clusters end up with equal populations, can be derived in closed form.

Let us assume, without loss of generality, that $k$-means produced two centroids $\mathbf {C_1}$ and $\mathbf {C_2}$ and that cluster $C_1$ is more populated than $C_2$. Using the notations introduced in Fig. 9, the intersection between the line $(\mathbf {C_1}, \mathbf {C_2})$ and the boundary between classes $C_1$ and $C_2$ is $\mathbf {C_0}$, the midpoint of the line segment $[\mathbf {C_1}, \mathbf {C_2}]$. We aim at evaluating the elevation $h$ such that this point moves to $\mathbf {C'_0}$, the median of the projected data points. If one builds a new boundary that is parallel to the original one and passes through $\mathbf {C'_0}$, both clusters will be equally populated. After solving the related system of equations, one gets:

$$\begin{aligned} h = \sqrt{2(x_2-x_1) \left( \frac{x_1+x_2}{2}-x'_0\right) }, \end{aligned}$$

(15)

where $x_1$ and $x_2$ are known from the $k$-means and $x'_0$ is the median of projected data points.
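A minimal sketch of this closed-form balancing is given below. It reads the elevation as adding $h^2$ to the squared distance between a point and $\mathbf {C_1}$, and takes $x_1$, $x_2$ and $x'_0$ as coordinates along the line $(\mathbf {C_1}, \mathbf {C_2})$; both readings are assumptions, as Fig. 9 is not reproduced here. Under them, the boundary $(x-x_1)^2 + h^2 = (x-x_2)^2$ sits at $x = \frac{x_1+x_2}{2} - \frac{h^2}{2(x_2-x_1)}$, which equals $x'_0$ for the value of $h$ in Eq. (15). The function name and the synthetic data are hypothetical.

```python
import numpy as np

def balanced_two_means_assign(X, c1, c2):
    """Assign points to two clusters after elevating c1 (assumed more populated)."""
    u = (c2 - c1) / np.linalg.norm(c2 - c1)   # unit vector along the line (C1, C2)
    x = X @ u                                  # projected data points
    x1, x2 = c1 @ u, c2 @ u                    # projected centroids, x2 > x1
    x0_prime = np.median(x)                    # target position of the boundary
    h2 = 2.0 * (x2 - x1) * ((x1 + x2) / 2.0 - x0_prime)   # h^2 from Eq. (15)
    h = np.sqrt(max(h2, 0.0))
    d1 = np.sum((X - c1) ** 2, axis=1) + h2    # elevated squared distance to C1
    d2 = np.sum((X - c2) ** 2, axis=1)
    labels = np.where(d1 < d2, 0, 1)
    return h, labels

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, size=(300, 2)),
               rng.normal([4.0, 0.0], 1.0, size=(100, 2))])
c1, c2 = X[:300].mean(axis=0), X[300:].mean(axis=0)   # stand-ins for k-means centroids
h, labels = balanced_two_means_assign(X, c1, c2)
print(h, np.bincount(labels, minlength=2))
```

In this example the two centroids are simply the means of the two generated groups rather than the output of an actual $k$-means run, which is enough to show the boundary moving to the median.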

About this article

Cite this article

Tavenard, R., Amsaleg, L. Improving the efficiency of traditional DTW accelerators. Knowl Inf Syst 42, 215–243 (2015). https://doi.org/10.1007/s10115-013-0698-7

Received: 22 February 2012
Revised: 26 April 2013
Accepted: 14 September 2013
Published: 17 October 2013
Issue Date: January 2015
DOI: https://doi.org/10.1007/s10115-013-0698-7
