## Abstract

Averaging time series under dynamic time warping is an important tool for improving nearest-neighbor classifiers and formulating centroid-based clustering. The most promising approach poses time series averaging as the problem of minimizing a Fréchet function. Minimizing the Fréchet function is NP-hard and has so far been tackled by several heuristics and inexact strategies. Our contributions are as follows. We first discuss some inaccuracies in the literature on exact mean computation in dynamic time warping spaces. We then propose an exponential-time dynamic program for computing a global minimum of the Fréchet function. The proposed algorithm is useful for benchmarking and evaluating known heuristics. In addition, we present an exact polynomial-time algorithm for the special case of binary time series. Using the exponential-time dynamic program, we empirically study properties of a mean, such as its uniqueness and length, which are of interest for devising better heuristics. Experimental evaluations indicate substantial deficits of state-of-the-art heuristics in terms of their output quality.
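To make the objective concrete, the following is a minimal sketch of evaluating a Fréchet function for a candidate mean. It assumes univariate time series, squared differences as local cost, and equal weights for all sample series; the function names `dtw_cost` and `frechet` are illustrative and are not taken from the article's source code. The sketch only evaluates the objective for a given candidate; it does not minimize it.

```python
import math

def dtw_cost(x, y):
    """Minimum total squared difference over all warping paths between x and y."""
    n, m = len(x), len(y)
    # D[i][j] holds the cost of the best warping path aligning x[:i] with y[:j].
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def frechet(z, sample):
    """Average warping cost between candidate z and every series in the sample
    (assumption: equal weights and squared local costs)."""
    return sum(dtw_cost(z, x) for x in sample) / len(sample)
```

For instance, under these assumptions the candidate `[0, 1]` attains value zero on the sample `[[0, 0, 1], [0, 1, 1]]`, because both series can be warped onto it without any cost.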

## Notes

Source code available at http://www.akt.tu-berlin.de/menue/software/.

“Appendix A” describes performance profiles in more detail.

## References

Aach J, Church GM (2001) Aligning gene expression time series with time warping algorithms. Bioinformatics 17(6):495–508

Abdulla WH, Chow D, Sin G (2003) Cross-words reference template for DTW-based speech recognition systems. In: Proceedings of the IEEE conference on convergent technologies for the Asia-Pacific region (TENCON 2003), vol 4. IEEE, pp 1576–1579

Aghabozorgi S, Shirkhorshidi AS, Wah TY (2015) Time-series clustering—a decade review. Inf Syst 53:16–38

Alon J, Athitsos V, Yuan Q, Sclaroff S (2009) A unified framework for gesture recognition and spatiotemporal gesture segmentation. IEEE Trans Pattern Anal Mach Intell 31(9):1685–1699

Blondel M (2017) Python implementation of soft-DTW. https://github.com/mblondel/soft-dtw

Bonizzoni P, Della Vedova G (2001) The complexity of multiple sequence alignment with SP-score that is a metric. Theor Comput Sci 259(1–2):63–79

Bringmann K, Künnemann M (2015) Quadratic conditional lower bounds for string problems and dynamic time warping. In: Proceedings of the 56th annual IEEE symposium on foundations of computer science (FOCS ’15). IEEE, pp 79–97

Bulteau L, Froese V, Niedermeier R (2018) Tight hardness results for consensus problems on circular strings and time series. CoRR arXiv:1804.02854

Chen Y, Keogh E, Hu B, Begum N, Bagnall A, Mueen A, Batista G (2015) The UCR time series classification archive. www.cs.ucr.edu/~eamonn/time_series_data/

Clarke FH (1990) Optimization and nonsmooth analysis. SIAM, Philadelphia

Cuturi M, Blondel M (2017) Soft-DTW: a differentiable loss function for time-series. In: Proceedings of the 34th international conference on machine learning (ICML ’17), PMLR, proceedings of machine learning research, vol 70, pp 894–903

Cuturi M, Vert JP, Birkenes O, Matsui T (2007) A kernel for time series based on global alignments. In: Acoustics, speech and signal processing, 2007. ICASSP 2007. IEEE international conference on, vol 2. IEEE, pp 2–413

Dolan E, Moré J (2002) Benchmarking optimization software with performance profiles. Math Program 91(2):201–213

Fahiman F, Bezdek JC, Erfani SM, Leckie C, Palaniswami M (2017) Fuzzy c-Shape: a new algorithm for clustering finite time series waveforms. In: Proceedings of the 2017 IEEE international conference on fuzzy systems (FUZZ-IEEE). IEEE, pp 1–8

Fréchet M (1948) Les éléments aléatoires de nature quelconque dans un espace distancié. Ann Inst Henri Poincaré 10:215–310

Fu T (2011) A review on time series data mining. Eng Appl Artif Intell 24(1):164–181

Gold O, Sharir M (2018) Dynamic time warping and geometric edit distance: breaking the quadratic barrier. ACM Trans Algorithms 14(4):50:1–50:17

Gupta L, Molfese DL, Tammana R, Simos PG (1996) Nonlinear alignment and averaging for estimating the evoked potential. IEEE Trans Biomed Eng 43(4):348–356

Gusfield D (1997) Algorithms on strings, trees and sequences. Cambridge University Press, Cambridge

Hautamaki V, Nykanen P, Franti P (2008) Time-series clustering by approximate prototypes. In: Proceedings of the 19th international conference on pattern recognition (ICPR ’08). IEEE, pp 1–4

Huang B, Kinsner W (2002) ECG frame classification using dynamic time warping. In: Canadian conference on electrical and computer engineering (CCECE), vol 2. IEEE, pp 1105–1110

Jain B, Schultz D (2017) Optimal warping paths are unique for almost every pair of time series. CoRR arXiv:1705.05681

Jain B, Schultz D (2018) On the existence of a sample mean in dynamic time warping spaces. CoRR arXiv:1610.04460v3

Keogh EJ, Pazzani MJ (2001) Derivative dynamic time warping. In: Proceedings of the 2001 SIAM international conference on data mining. SIAM, Philadelphia, pp 1–11

Lummis R (1973) Speaker verification by computer using speech intensity for temporal registration. IEEE Trans Audio Electroacoust 21(2):80–89

Marteau PF (2018) Times series averaging and denoising from a probabilistic perspective on time-elastic kernels. Int J Appl Math Comput Sci. arXiv:1611.09194

Moon TK (1996) The expectation-maximization algorithm. IEEE Signal Process Mag 13(6):47–60

Morel M, Achard C, Kulpa R, Dubuisson S (2018) Time-series averaging using constrained dynamic time warping with tolerance. Pattern Recogn 74:77–89

Myers CS, Rabiner LR (1981) A comparative study of several dynamic time-warping algorithms for connected-word recognition. Bell Syst Tech J 60(7):1389–1409

Nicolas F, Rivals E (2005) Hardness results for the center and median string problems under the weighted and unweighted edit distances. J Discrete Algorithms 3(2):390–415

Niennattrakul V, Ratanamahatana CA (2007) Inaccuracies of shape averaging method using dynamic time warping for time series data. In: International conference on computational science. Springer, Berlin, pp 513–520

Niennattrakul V, Ratanamahatana CA (2009) Shape averaging under time warping. In: Shi Y, van Albada GD, Dongarra J, Sloot P (eds) Proceedings of the 6th international conference on electrical engineering/electronics, computer, telecommunications and information technology (ECTI-CON ’09), vol 02, pp 626–629

Oates T, Firoiu L, Cohen PR (1999) Clustering time series with hidden Markov models and dynamic time warping. In: Proceedings of the IJCAI’99 workshop on neural, symbolic and reinforcement learning methods for sequence learning, pp 17–21

Paparrizos J, Gravano L (2017) Fast and accurate time-series clustering. ACM Trans Database Syst 42(2):8:1–8:49

Petitjean F, Gançarski P (2012) Summarizing a set of time series by averaging: from Steiner sequence to compact multiple alignment. Theor Comput Sci 414(1):76–91

Petitjean F, Ketterlin A, Gançarski P (2011) A global averaging method for dynamic time warping, with applications to clustering. Pattern Recogn 44(3):678–693

Petitjean F, Forestier G, Webb GI, Nicholson AE, Chen Y, Keogh E (2014) Dynamic time warping averaging of time series allows faster and more accurate classification. In: Proceedings of the 2014 IEEE international conference on data mining (ICDM’14). IEEE, pp 470–479

Petitjean F, Forestier G, Webb GI, Nicholson AE, Chen Y, Keogh E (2016) Faster and more accurate classification of time series by exploiting a novel dynamic time warping averaging algorithm. Knowl Inf Syst 47(1):1–26

Rabiner L, Wilpon J (1979) Considerations in applying clustering techniques to speaker-independent word recognition. J Acoust Soc Am 66(3):663–673

Reyes M, Dominguez G, Escalera S (2011) Feature weighting in dynamic timewarping for gesture recognition in depth data. In: Proceedings of the 2011 IEEE international conference on computer vision workshops (ICCV Workshops). IEEE, pp 1182–1188

Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26(1):43–49

Sankoff D, Kruskal J (1999) Time warps, string edit and macromolecules: the theory and practice of sequence comparison. CSLI Publications, Stanford

Schultz D, Jain BJ (2016) Sample mean algorithms for averaging in dynamic time warping spaces. https://doi.org/10.5281/zenodo.216233

Schultz D, Jain B (2018) Nonsmooth analysis and subgradient methods for averaging in dynamic time warping spaces. Pattern Recogn 74(Supplement C):340–358

Soheily-Khah S, Douzal-Chouakria A, Gaussier E (2015) Progressive and iterative approaches for time series averaging. In: Proceedings of the 1st international conference on advanced analytics and learning on temporal data, vol 1425, pp 111–117

Soheily-Khah S, Douzal-Chouakria A, Gaussier E (2016) Generalized \(k\)-means-based clustering for temporal data under weighted and kernel time warp. Pattern Recogn Lett 75:63–69

Sun T, Liu H, Yu H, Chen CP (2017) Degree-pruning dynamic programming approaches to central time series minimizing dynamic time warping distance. IEEE Trans Cybern 47(7):1719–1729

Zhao J, Itti L (2018) shapeDTW. Pattern Recogn 74(C):171–184

## Acknowledgements

This work was supported by the Deutsche Forschungsgemeinschaft under Grants JA 2109/4-1 and NI 369/13-2, and by a Feodor Lynen return fellowship of the Alexander von Humboldt Foundation. The work on the theoretical part of this paper started at the research retreat of the Algorithmics and Computational Complexity group, TU Berlin, held at Boiensdorf, Baltic Sea, April 2017, with MB, TF, VF, and RN participating. We also thank the authors of the UCR Time Series Classification Archive for providing the data sets which we used in our experiments.

## Additional information

Responsible editor: Eamonn Keogh.

### Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A short version of this article appeared in the *Proceedings of the 2018 SIAM International Conference on Data Mining* (*SDM* ’18), pp. 540–548. SIAM, 2018. This article contains all proofs in full detail. Also, the dynamic program is improved to find an arbitrary weighted mean and new experimental results are included.

## Appendices

### A Performance profiles

To compare the performance of the mean algorithms, we used a slight variation of the performance profiles proposed by Dolan and Moré (2002). A performance profile is a cumulative distribution function for a performance metric. Here, the chosen performance metric is the error percentage from the exact solution.

To define a performance profile, we assume that \({\mathbb {A}}\) is a set of mean algorithms and \({\mathbb {S}}\) is a set of samples, each of which consists of *k* time series. For each sample \({\mathcal {S}} \in {\mathbb {S}}\) and each mean algorithm \(A \in {\mathbb {A}}\), we define \(E_{A,{\mathcal {S}}}\) as the error percentage obtained by applying algorithm *A* to sample \({\mathcal {S}}\). The performance profile of algorithm *A* over all samples \({\mathcal {S}} \in {\mathbb {S}}\) is the empirical cumulative distribution function defined by

$$P_A(\tau ) = \frac{1}{|{\mathbb {S}}|} \left| \left\{ {\mathcal {S}} \in {\mathbb {S}} \,:\, E_{A,{\mathcal {S}}} \le \tau \right\} \right| $$

for all \(\tau \ge 0\). Thus, \(P_A(\tau )\) is the estimated probability that the error percentage of algorithm *A* is at most \(\tau \). The value \(P_A(0)\) is the estimated probability that algorithm *A* finds an exact solution.
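As an illustration, a performance profile can be evaluated directly from a list of error percentages. The sketch below assumes that the error percentages of one algorithm over all samples have already been computed; the function name `performance_profile` and the example values are hypothetical and not taken from the experiments.

```python
def performance_profile(errors, tau):
    """Fraction of samples whose error percentage is at most tau.

    errors: error percentages of one algorithm, one entry per sample
            (assumed to be precomputed against the exact solution).
    """
    if not errors:
        raise ValueError("need at least one sample")
    return sum(1 for e in errors if e <= tau) / len(errors)

# Hypothetical error percentages of one algorithm on four samples.
example_errors = [0.0, 0.0, 1.5, 4.0]
exact_rate = performance_profile(example_errors, 0.0)   # share of exact solutions
within_two = performance_profile(example_errors, 2.0)   # share within 2 percent
```

In this made-up example, two of the four samples are solved exactly, so the profile at zero is 0.5, and three of the four errors are at most 2, so the profile at 2 is 0.75.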

### B Detailed results

## About this article

### Cite this article

Brill, M., Fluschnik, T., Froese, V. *et al.* Exact mean computation in dynamic time warping spaces. *Data Min Knowl Disc* **33**, 252–291 (2019). https://doi.org/10.1007/s10618-018-0604-8

DOI: https://doi.org/10.1007/s10618-018-0604-8