1 Introduction

Nowadays, searching and comparing computer-generated time series, which have exact time cycles and take a finite number of value levels, is a trivial problem, and attention is focused mainly on optimizing the search speed. A non-trivial task arises when comparing or searching signals of different lengths that are not strictly defined and are distorted in time and amplitude. Typical examples are measurements of human body functions (ECG, EEG) or of natural elements (precipitation, flow rates in riverbeds), whose signals are not generated with any accurate timing. Comparing such sequences is therefore difficult, and almost impossible with standard similarity (distance) functions [2], such as Euclidean distance [3], cosine measure [8], Mean Estimate Error [16], etc. Examples of such signals are presented in Fig. 1. The problem of the standard similarity (distance) functions lies in the sequential comparison of opposite elements in both sequences (i.e. elements with identical indices). Fortunately, this shortcoming can be eliminated by the Dynamic Time Warping algorithm, which is able to perceive similarity through the eyes of a domain expert, in contrast with a strict sequential comparison. However, the commonly used definition of benevolence cannot be applied to DTW modifications that were created for solving specific tasks (e.g. searching the longest common subsequence).

The main goal of this paper is to eliminate the weaknesses of the commonly used approach and to propose a new flexible mechanism for the definition of benevolence that is applicable to modifications of the original DTW. The paper is organized as follows: First, the DTW algorithm for comparing two distorted sequences and several of its modifications are described in Sect. 2. In Sect. 3, the commonly used approaches to the definition of benevolence are introduced, followed by a proposal of a new Flexible Global Constraint. Finally, the effect of the algorithm's settings is visualized and the proposed solution is discussed.

2 Dynamic Time Warping

Dynamic Time Warping (DTW) is a technique for finding the optimal matching of two warped sequences using pre-defined rules [11]. Essentially, it is a nonlinear mapping of particular elements that matches them in the most appropriate way. The output of such a DTW mapping of the sequences from Fig. 1 can be seen in Fig. 2. This approach was first used for the comparison of two voice patterns during automatic recognition of voice commands [13]. Since then, it has been widely used in many domains, e.g. for efficient satellite image analysis [12], in the analysis of student behavioral patterns [17] or in protein fold recognition [9]. As correctly noted in [5], a common problem of many DTW applications is that DTW is too computationally expensive. In order to speed up the algorithm, several lower bounding methods [4] and parallelization techniques [14, 15] were created. Moreover, DTW has been modified many times, either for solving specific tasks (e.g. searching the longest common subsequence [7]) or for better algorithm behavior (e.g. Derivative Dynamic Time Warping [6]). Since the proposed approach is also an extension of this algorithm, the original DTW algorithm is described in more detail below.

Fig. 1. Standard metrics comparison

Fig. 2. DTW comparison

Fig. 3. DTW cost matrices

Formally, the main goal of the DTW method is to compare two time-dependent sequences x and y, where \(x=(x_1,x_2,\ldots ,x_n)\) and \(y=(y_1,y_2,\ldots ,y_m)\), and to find an optimal mapping of their elements. To compare particular elements of the sequences \(x_i,y_j \in \mathbb {R}\), it is necessary to define a local cost measure \(c:\mathbb {R} \times \mathbb {R} \rightarrow \mathbb {R}_{\ge 0}\), where \(c(x_i,y_j)\) is small if \(x_i\) and \(y_j\) are similar to each other, and large otherwise. Computing the local cost measure for each pair of elements of the sequences x and y results in the cost matrix \(C \in \mathbb {R}^{n\times m}\) defined by \(C(i,j)=c(x_i,y_j)\) (see Fig. 3(a)).
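The construction of the cost matrix can be sketched in a few lines. This is only an illustrative sketch: the text does not fix a particular local cost measure, so the absolute difference \(c(a,b)=|a-b|\) and the function name `cost_matrix` are assumptions made for the example. Indices in the code are zero-based, whereas the text uses one-based indices.

```python
import numpy as np

def cost_matrix(x, y):
    """Return C with C[i, j] = c(x_i, y_j).

    Assumption: the local cost measure is the absolute difference
    c(a, b) = |a - b|; any non-negative measure could be used instead.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Broadcasting builds the full n x m matrix of pairwise costs.
    return np.abs(x[:, None] - y[None, :])

C = cost_matrix([1.0, 2.0, 3.0], [1.0, 1.0, 2.0, 3.0])
print(C.shape)
```

Low values of `C` correspond to the dark valleys in the figures, high values to the light areas.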

The goal is then to find an alignment between x and y with a minimal overall cost. Such an optimal alignment leads through the black valleys of the cost matrix C and avoids the white areas with a high cost, as demonstrated in Fig. 3(b). The alignment (called the warping path) \(p=(p_1,\ldots ,p_q)\) is a sequence of q pairs (warping path points) \(p_k=(p_{kx},p_{ky}) \in \{1,\ldots ,n\} \times \{1,\ldots ,m\}\). Each such pair \((i,j)\) indicates an alignment between the ith element of the sequence x and the jth element of the sequence y.

Retrieval of optimal path \(p^*\) by evaluating all possible warping paths between sequences x and y leads to an exponential computational complexity. Fortunately, there exists a better way with \(O(n\cdot m)\) complexity based on dynamic programming. It involves the use of an accumulated cost matrix \(D\in \mathbb {R}^{n\times m}\) described in [11].
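For orientation, the accumulated cost matrix is commonly defined by the recurrence below. This is the standard textbook form with unit step sizes; it is assumed here to correspond to the definition in [11], which remains the authoritative source.

$$\begin{aligned} D(i,1)&= \sum _{k=1}^{i} c(x_k,y_1), \quad i \in \{1,\ldots ,n\}, \\ D(1,j)&= \sum _{k=1}^{j} c(x_1,y_k), \quad j \in \{1,\ldots ,m\}, \\ D(i,j)&= c(x_i,y_j) + \min \{D(i-1,j-1),\, D(i-1,j),\, D(i,j-1)\}, \quad 1< i \le n,\ 1 < j \le m. \end{aligned}$$

Each entry is computed once from at most three neighbours, which gives the \(O(n\cdot m)\) complexity.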

The accumulated cost matrix computed for the cost matrix from Fig. 3(a) can be seen in Fig. 4(a). It is evident that the accumulation highlights only a single black valley. The optimal path \(p^*=(p_1,\ldots ,p_q)\) is then computed in reverse order, starting with \(p_q=(n,m)\) and finishing in \(p_1=(1,1)\). An example of a warping path found in this way can be seen in Fig. 4(b).
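The dynamic programming step and the reverse-order path retrieval can be sketched as follows. This is a minimal sketch under stated assumptions, not the reference implementation from [11]: the local cost is assumed to be the absolute difference, only the three unit steps (diagonal, vertical, horizontal) are allowed, and the function name `dtw` is hypothetical. Indices are zero-based, so the path runs from `(0, 0)` to `(n - 1, m - 1)`.

```python
import numpy as np

def dtw(x, y):
    """Return the DTW cost and one optimal warping path (zero-based indices)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    C = np.abs(x[:, None] - y[None, :])  # assumed local cost |x_i - y_j|

    # Accumulated cost matrix filled row by row.
    D = np.full((n, m), np.inf)
    D[0, 0] = C[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                D[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                D[i - 1, j] if i > 0 else np.inf,
                D[i, j - 1] if j > 0 else np.inf,
            )
            D[i, j] = C[i, j] + best

    # Backtracking from the last cell towards the first one.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((D[i - 1, j - 1], (i - 1, j - 1)))
        if i > 0:
            candidates.append((D[i - 1, j], (i - 1, j)))
        if j > 0:
            candidates.append((D[i, j - 1], (i, j - 1)))
        _, (i, j) = min(candidates, key=lambda t: t[0])
        path.append((i, j))
    path.reverse()
    return D[n - 1, m - 1], path

cost, path = dtw([1, 2, 3], [1, 1, 2, 3])
print(cost, path)
```

In this small example the first element of x is aligned with the first two elements of y, which is exactly the kind of warping that an index-by-index comparison cannot express.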

Fig. 4. DTW accumulated cost matrices

The final DTW cost can be understood as a quantified effort for the alignment of the two sequences (see Eq. 1).

$$\begin{aligned} DTW(x,y) = \sum _{k=1}^q{c( x_{p_{kx}}, y_{p_{ky}} )} = D(n,m) \end{aligned}$$
(1)

2.1 Subsequence DTW

In some cases, it is not necessary to compare or align the whole sequences. A usual goal is to find an optimal alignment of a sample (a relatively short time series) within a signal database (a very long time series), i.e. to find the best occurrence(s) of a sample (query) in the database. With a slight modification [11], DTW is able to search for such queries in a much longer sequence. The basic idea is not to penalize the omission in the alignment between x and y that appears at the beginning and at the end of the sequence y. Suppose we have two sequences \(x=(x_1,x_2,\ldots ,x_n )\) of the length \(n\in \mathbb {N}\) and \(y=(y_1,y_2,\ldots ,y_m)\) of a much larger length \(m\in \mathbb {N}\). The goal is to find a subsequence \(y_{a:b}=(y_a,y_{a+1},\ldots ,y_b)\), where \(1 \le a \le b \le m\), that minimizes the DTW cost to x over all possible subsequences of y. An example of such a search for the best subsequence alignment can be seen in Fig. 5. Both constructed matrices, including the found warping path, are shown in Fig. 6.
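The modification can be sketched as follows, under the usual formulation of subsequence DTW: the first row of the accumulated cost matrix is not accumulated (the alignment may start anywhere in y), and the end index b is the position of the minimum in the last row. The absolute-difference local cost and the function name `subsequence_dtw` are assumptions for this example; indices are zero-based.

```python
import numpy as np

def subsequence_dtw(x, y):
    """Return (cost, a, b) so that y[a..b] is the best match of x (zero-based, inclusive)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    C = np.abs(x[:, None] - y[None, :])  # assumed local cost |x_i - y_j|

    D = np.empty((n, m))
    D[0, :] = C[0, :]                    # free start: skipping a prefix of y costs nothing
    D[1:, 0] = np.cumsum(C[:, 0])[1:]    # first column is accumulated as usual
    for i in range(1, n):
        for j in range(1, m):
            D[i, j] = C[i, j] + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])

    b = int(np.argmin(D[n - 1, :]))      # free end: best cell in the last row

    # Backtrack until the first row is reached; the column reached there is a.
    i, j = n - 1, b
    while i > 0:
        if j == 0:
            i -= 1
        else:
            steps = [
                (D[i - 1, j - 1], i - 1, j - 1),
                (D[i - 1, j], i - 1, j),
                (D[i, j - 1], i, j - 1),
            ]
            _, i, j = min(steps, key=lambda s: s[0])
    return D[n - 1, b], j, b

cost, a, b = subsequence_dtw([1, 2, 3], [5, 5, 1, 2, 3, 5, 5])
print(cost, a, b)
```

Only a single best occurrence is returned, which illustrates the limitation of the basic subsequence DTW when a pattern is repeated.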

Fig. 5. Found DTW subsequence

Fig. 6. Cost matrix and accumulated cost matrix for searching subsequence

Although DTW has its own modification for searching subsequences, it works well only when searching for an exact pattern in a signal database. In real situations, however, exact patterns are rarely available, because they are surrounded by additional values or even repeated several times in a sequence (see Fig. 7). The basic DTW is not able to handle these situations: it fails or returns only a single occurrence of the pattern. To deal with such situations, several DTW modifications were created; they are described in detail, for example, in [7] or [10].

Fig. 7. Basic DTW subsequence inaccuracies

Fig. 8. Approach for searching the warping path

Fig. 9. Cost matrix with found warping paths

Fig. 10. Found common subsequences

The biggest difference of these modifications lies in the approach to searching the warping path. In simple terms, the algorithm neither searches the warping path from the upper right corner to the bottom left one (as in the classical DTW, Fig. 8(a)), nor connects the opposite sides of the matrix (as in the subsequence DTW, Fig. 8(b)). The main idea is to find warping paths that are as long as possible, lead from any element to any other one, and run parallel to the diagonal, as outlined in Fig. 8(c). An example of common subsequences found in this way can be seen in Fig. 10. The corresponding warping paths are visualized in the cost matrix in Fig. 9.

3 Flexible Global Constraints

In practical applications [1, 18–20], the construction of a warping path has to be controlled. The reason is a possibly uncontrolled high number of warpings, i.e. alignments of a single element to a high number of elements in the opposite sequence [11]. In this manner, dissimilar sequences can obtain a low DTW cost and be evaluated as similar. This situation is demonstrated on the sequences in Fig. 11 and on the corresponding cost matrix in Fig. 12.

Generally, this can be fixed by defining a global constraint region \(R \subseteq D\). This region determines the elements of the cost matrix that can be used for searching the warping path. In the original paper about DTW [11], two global constraints for the warping path are mentioned: the Itakura parallelogram (Fig. 13(a)) and the Sakoe-Chiba band (Fig. 13(b)).

Fig. 11. Mapping of dissimilar sequences

Fig. 12. Cost matrix of dissimilar sequences

Fig. 13. DTW global constraint regions

However, for the purpose of searching subsequences and for other DTW modifications, the Itakura parallelogram is inappropriate, because it was designed to limit warpings at the start and end of the classical DTW warping path, where the first and last warping points are exactly known. The Sakoe-Chiba band is preferable. The warping path respecting this band for the sequences from Fig. 11 is shown in Fig. 14.

Fig. 14. Examples of applied Sakoe-Chiba bands

However, one may ask what width of the band to choose. The width essentially defines the maximal number of warpings in a found sequence. For this reason, it is almost impossible to define a universal number applicable to both shorter and longer sequences: allowing five warpings on a path comparing sequences of length ten means something very different from allowing five warpings for sequences of length one hundred. In the example above, the results look satisfactory, but this band was also designed for searching the warping path through whole sequences. This shortcoming is evident in the following example:

Let us have two sequences \(x=(x_1,x_2,\ldots ,x_n )\) and \(y=(y_1,y_2,\ldots ,y_{2n} )\), where y is created by stretching x to double length (i.e. \(\forall {i\in \{1,\ldots ,2n\}}: y_i=x_{\lceil i/2 \rceil }\)). The matrix stretches in one dimension and the line of minima slightly bends (see Fig. 15(a)). This causes some warpings, but it is still acceptable. With the standard Sakoe-Chiba band, the warping path cannot follow the trajectory of the minima and has to continue in a straight direction, as shown in Fig. 15(b).
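The effect can be checked numerically with a small sketch. It assumes that stretching means repeating each element of x twice, that indices are zero-based, and that the band is tested as a fixed limit on the offset \(|i-j|\); the sample values in `x` and the helper name `stretch_twice` are made up for illustration.

```python
import numpy as np

def stretch_twice(x):
    """Stretch x to double length by repeating every element twice."""
    return np.repeat(np.asarray(x, dtype=float), 2)

x = np.array([0.0, 1.0, 3.0, 2.0, 5.0, 4.0])  # arbitrary sample values
y = stretch_twice(x)

# Zero-cost alignment: element j of y is matched to element j // 2 of x.
ideal_path = [(j // 2, j) for j in range(len(y))]

# The offset from the main diagonal grows steadily along this path.
offsets = [abs(i - j) for i, j in ideal_path]
max_offset = max(offsets)
print(offsets)
print(max_offset)
```

The offset of the zero-cost path grows up to the length of x, so a diagonal band of fixed width smaller than that cannot contain the whole path, no matter how regular the stretching is.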

Fig. 15. Cost matrices for stretched sequence

A more elegant solution is to allow the band to bend and to provide the warping path with reasonable freedom. For this purpose, we designed a flexible band with configurable bending. The band is based on the Sakoe-Chiba band, but it changes its position and shape according to the previously constructed warping path. In contrast, the center of the original Sakoe-Chiba band lies exactly on the diagonal of the cost matrix.

Proposed Modifications to Sakoe-Chiba Band. In our modification, the center of the band varies and passes through one of the previous points of the currently constructed warping path, called the control point. The control point is always located at a fixed distance from the currently processed point. This distance is called the control point distance and is defined as a number of warping path points preceding the currently processed point. The center of the constructed band always moves to the newly established control point.

Formally, suppose we have a currently constructed warping path \(p=(p_1,\ldots ,p_q)\) consisting of a sequence of q path points \(p_k=(p_{kx},p_{ky}) \in \{1,\ldots ,n\} \times \{1,\ldots ,m\},\) \(p_1=(n,m)\). Each such pair \((p_{kx},p_{ky})\) indicates an alignment between the \(p_{kx}\)-th element of the sequence x and the \(p_{ky}\)-th element of the sequence y. The path point \((p_{kx},p_{ky})\) lies in the Sakoe-Chiba band of a width w if \(|p_{kx}-p_{ky}| < w\). With the flexible band of the width w and with a control point distance d, the path point \((p_{kx},p_{ky})\) lies in the band if \(|(p_{kx} - p_{(k-d)x}) - (p_{ky} - p_{(k-d)y})| < w\). The distance d of the control point from the end of the warping path defines the rigidity of the band.
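Both membership conditions can be sketched as simple predicates. The function names and the sample path are hypothetical, and one detail is an assumption that the text does not specify: when fewer than d points have been constructed so far, the sketch falls back to the first path point as the control point.

```python
def in_sakoe_chiba_band(point, w):
    """Fixed band: the point must stay within w of the main diagonal."""
    px, py = point
    return abs(px - py) < w

def in_flexible_band(path, candidate, w, d):
    """Flexible band: offset is measured relative to the control point p_(k-d).

    `path` holds the already constructed points p_1..p_(k-1) and
    `candidate` is the point p_k that is being tested.
    """
    if not path:
        return True  # assumption: the first point is always accepted
    # Assumption: with fewer than d points, use the first point as control point.
    cx, cy = path[-d] if len(path) >= d else path[0]
    px, py = candidate
    return abs((px - cx) - (py - cy)) < w

# A path built in reverse order along a line of minima with slope 2
# (as for a sequence stretched to double length).
path = [(6, 12), (6, 11), (5, 10), (5, 9), (4, 8)]
candidate = (4, 7)

print(in_sakoe_chiba_band(candidate, w=2))        # fixed band
print(in_flexible_band(path, candidate, w=2, d=2))  # short control point distance
print(in_flexible_band(path, candidate, w=2, d=4))  # long control point distance
```

In this example the candidate point is rejected by the fixed band, accepted by the flexible band with d = 2, and rejected again with d = 4, which matches the statement that a larger control point distance makes the band more rigid.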

Figure 16 demonstrates how an increasing control point distance d makes the band more rigid and reduces its ability to bend. A shorter distance makes the band more flexible, whereas a longer distance makes it inflexible. This is especially evident in Fig. 16(d) (with \(d=4\)), where the band became too rigid to follow the black valleys.

The effect of the predefined rigidity can also be quantified by the resulting DTW Cost defined in Eq. 1. With the original Sakoe-Chiba band (see Fig. 13), the resulting \(DTW~Cost = 3.6433\). On the other hand, with the proposed flexible constraint (control point distance d) and appropriately adjusted benevolence, the sequences can be evaluated as almost equal (\(DTW~Cost = 0.0182\)). Table 1 illustrates how the resulting DTW Cost reflects the adjusted amount of benevolence (various control point distances d). Setting the control point distance correctly requires some domain knowledge: the domain expert has to define the benevolence for the evaluation.

Fig. 16. Various distances of control point

Table 1. DTW costs for various control point distances

4 Conclusion

The Dynamic Time Warping algorithm has become a widely used technique for comparing two sequences and evaluating their mutual similarity. Its many modifications, created for solving specific tasks, subsequently required additional adjustments of particular steps of the algorithm. A typical example is the DTW approach for searching the longest common subsequence. In this type of modification, none of the commonly used constraints for the construction of the warping path can be used. Therefore, the main goal of this paper was to provide a solution for such situations and to propose a new flexible mechanism for the definition of a constraint applicable to modifications of the original DTW. The proposed solution consists of a new flexible constraint based on the original Sakoe-Chiba band. The constraint enables control over the process of warping path construction and generally offers more flexibility and predictable behaviour. Moreover, its behaviour (i.e. the rigidity of the band) is defined by a single number, which does not depend on the length of the processed sequences. The use of the proposed solution is not limited to searching common subsequences; it can be utilized in all DTW modifications whose warping paths do not have exactly defined beginnings and ends.