Sufficient conditions for the existence of a sample mean of time series under dynamic time warping

Jain, Brijnesh; Schultz, David

doi:10.1007/s10472-019-09682-2

Published: 10 January 2020

Volume 88, pages 313–346, (2020)

Annals of Mathematics and Artificial Intelligence

Abstract

Time series averaging is an important subroutine for several time series data mining tasks. The most successful approaches formulate the problem of time series averaging as an optimization problem based on the dynamic time warping (DTW) distance. The existence of an optimal solution, called a sample mean, has been an open problem for more than four decades. Its existence is a necessary prerequisite for formulating exact algorithms, deriving complexity results, and studying statistical consistency. In this article, we propose sufficient conditions for the existence of a sample mean. A key result for deriving the proposed sufficient conditions is the Reduction Theorem, which provides an upper bound for the minimum length of a sample mean.

Acknowledgements

B. Jain was funded by the DFG Sachbeihilfe JA 2109/4-2.

Author information

Authors and Affiliations

Distributed Artificial Intelligence Laboratory, TU Berlin, Berlin, Germany
Brijnesh Jain & David Schultz

Corresponding author

Correspondence to Brijnesh Jain.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A Proof of Example 9

Proof

It follows from the Reduction Theorem that it is sufficient to consider candidate solutions of length one and two. Thus, it is sufficient to consider the restricted Fréchet functions F₁ and F₂. In addition, it is sufficient to assume that all warping paths are compact (see Example 25). Then each set $\mathcal {P}_{m,n}$ with m,n ∈ {1,2} consists of exactly one warping path. Suppose that x = (x₁), y = (y₁,y₂) and z = (z₁,z₂). Then the squared DTW distances are of the form

$$ \begin{array}{@{}rcl@{}} \delta(x, z)^{2} &=& d(x_{1}, z_{1}) + d(x_{1}, z_{2}) \\ \delta(y, z)^{2} &=& d(y_{1}, z_{1}) + d(y_{2}, z_{2}). \end{array} $$

We proceed with considering a slightly modified setting using the local distance function $d^{\prime }(a, a^{\prime }) = (a-a^{\prime })^{2}$ for all $a, a^{\prime } \in \mathcal {A}$. Under this setting, we denote the DTW distance by $\delta ^{\prime }$ and the restricted Fréchet functions by $F^{\prime }_{1}$ and $F^{\prime }_{2}$. The function $F^{\prime }_{1}(x)$ at time series x = (x₁) is of the form

$$ \begin{array}{@{}rcl@{}} F^{\prime}_{1}(x) = \underbrace{(x_{1} - 1)^{2} + (x_{1} - 1)^{2}}_{= \delta^{\prime}(x, x^{(1)})^{2}} + \underbrace{(x_{1} - 1)^{2} + (x_{1} + 1)^{2}}_{= \delta^{\prime}(x, x^{(2)})^{2}}. \end{array} $$

The function $F^{\prime }_{1}$ is convex and differentiable with respect to x₁. Taking the gradient, setting it to zero and solving yields the unique solution x₁ = 0.5. Thus, z = (0.5) is the restricted sample mean of $\mathcal {X}$ on $\mathcal {T}_{1}$ with Fréchet variation $F^{\prime }_{1}(z) = 3$.
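The stationarity step can be spelled out as follows. This is only a worked restatement of the calculation described above, using the expression for $F^{\prime }_{1}$ as given:

$$ \begin{array}{@{}rcl@{}} \frac{d F^{\prime}_{1}}{d x_{1}}(x) &=& 6 (x_{1} - 1) + 2 (x_{1} + 1) = 8 x_{1} - 4 = 0 \quad \Longleftrightarrow \quad x_{1} = \frac{1}{2}, \\ F^{\prime}_{1}\big((\tfrac{1}{2})\big) &=& 3 \cdot \left(\tfrac{1}{2}\right)^{2} + \left(\tfrac{3}{2}\right)^{2} = \tfrac{3}{4} + \tfrac{9}{4} = 3. \end{array} $$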

For a given x = (x₁,x₂), a similar calculation with

$$ \begin{array}{@{}rcl@{}} F^{\prime}_{2}(x) = (x_{1} - 1)^{2} + (x_{2} - 1)^{2} + (x_{1} - 1)^{2} + (x_{2} + 1)^{2} \end{array} $$

gives z = (1,0) as the unique restricted sample mean on $\mathcal {T}_{2}$ with Fréchet variation $F^{\prime }_{2}(z) = 2$. By combining both results, we conclude that z = (1,0) is an unrestricted sample mean of $\mathcal {X}$ with total variation $F^{\prime *} = 2$.
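The two restricted minima under the modified local distance $d^{\prime }$ can also be checked numerically. The following is a minimal sketch, not the authors' code. It assumes the sample consists of the two time series (1, 1) and (1, −1), which is read off from the terms of $F^{\prime }_{1}$ and $F^{\prime }_{2}$ above because Example 9 itself is not reproduced here. The function names `sq_dtw`, `frechet` and `d_prime` are hypothetical, and the DTW distance is computed with the standard dynamic program over all warping paths rather than over compact paths only.

```python
import itertools
import math


def sq_dtw(x, y, d):
    """Squared DTW distance: minimum summed local cost over all warping paths."""
    m, n = len(x), len(y)
    # D[i][j] holds the cheapest cost of aligning x[:i] with y[:j].
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            step = min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
            D[i][j] = d(x[i - 1], y[j - 1]) + step
    return D[m][n]


def frechet(z, sample, d):
    """Fréchet function: sum of squared DTW distances from z to the sample."""
    return sum(sq_dtw(z, x, d) for x in sample)


def d_prime(a, b):
    """Modified local distance d'(a, a') = (a - a')^2 from the proof."""
    return (a - b) ** 2


# Assumed sample, inferred from the terms of F'_1 and F'_2 (not given here).
SAMPLE = [(1.0, 1.0), (1.0, -1.0)]

# Values at the minimizers stated in the proof.
f1_at_half = frechet((0.5,), SAMPLE, d_prime)
f2_at_one_zero = frechet((1.0, 0.0), SAMPLE, d_prime)

# Coarse grid searches as a sanity check (not a proof of optimality).
grid1 = [i / 100 for i in range(-200, 201)]
best1 = min(grid1, key=lambda a: frechet((a,), SAMPLE, d_prime))

grid2 = [i / 10 for i in range(-20, 21)]
best2 = min(itertools.product(grid2, repeat=2),
            key=lambda z: frechet(z, SAMPLE, d_prime))

print(f1_at_half, best1)
print(f2_at_one_zero, best2)
```

Under these assumptions the script reports the value 3 at x₁ = 0.5 for length one and the value 2 at (1, 0) for length two, in agreement with the restricted sample means stated above. The grid search only confirms the minimizers among the grid points; uniqueness follows from the convexity argument in the proof.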

Next, we assume the original local distance function d on $\mathcal {A}$ as defined in Example 9. Again, we first consider time series x = (x₁) of length one. Then we have $F_{1}(x) = F^{\prime }_{1}(x)$ if x₁ ≠ 0 and F₁(x) = 4 if x₁ = 0. Thus, z = (0.5) is the restricted sample mean of $\mathcal {X}$ on $\mathcal {T}_{1}$ with Fréchet variation F₁(z) = 3.

Now, we consider time series of length two. Let x_ε = (1,ε) for some $\varepsilon \in \mathbb {R}$. Then

$$ \begin{array}{@{}rcl@{}} \lim_{\varepsilon \to 0} F(x_{\varepsilon}) = 2 \end{array} \tag{5} $$

but F(x₀) = 4. Suppose there is a restricted sample mean z = (z₁,z₂) on $\mathcal {T}_{2}$. It follows from (5) that F(z) ≤ 2. If at least one element of z is zero, we have F(z) ≥ 4. This contradicts our assumption that z is a sample mean. Thus, the elements of z are both non-zero. Then we have $F_{2}(z) = F^{\prime }_{2}(z)$. Recall that the unique minimizer of $F^{\prime }_{2}$ has a zero element, so z is not a minimizer of $F^{\prime }_{2}$ and hence $F^{\prime }_{2}(z) > 2$. This yields the contradiction $2 < F^{\prime }_{2}(z) = F_{2}(z) \leq 2$. Consequently, the function F₂ has no minimizer. Thus, the unrestricted Fréchet function F never attains its infimum 2 and therefore $\mathcal {X}$ has no sample mean. □
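The limit (5) used in the proof can be checked directly. As a short worked step, relying only on the fact stated above that F₂ and $F^{\prime }_{2}$ coincide on time series whose elements are both non-zero, we obtain for every ε ≠ 0

$$ \begin{array}{@{}rcl@{}} F(x_{\varepsilon}) = F^{\prime}_{2}(x_{\varepsilon}) = (1 - 1)^{2} + (\varepsilon - 1)^{2} + (1 - 1)^{2} + (\varepsilon + 1)^{2} = 2 + 2 \varepsilon^{2} \longrightarrow 2 \quad (\varepsilon \to 0). \end{array} $$

In particular, F(x_ε) > 2 for every ε ≠ 0, so the value 2 is approached along this sequence but not attained by it.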

About this article

Cite this article

Jain, B., Schultz, D. Sufficient conditions for the existence of a sample mean of time series under dynamic time warping. Ann Math Artif Intell 88, 313–346 (2020). https://doi.org/10.1007/s10472-019-09682-2

Issue Date: April 2020
DOI: https://doi.org/10.1007/s10472-019-09682-2
