Robust online-surveillance of trend-coherence in multivariate data streams: the similar trend monitoring (STM) procedure

Abstract

When several data streams are observed simultaneously, it is often of great interest to monitor the coherences between all pairs of streams. We propose a new technique called Similar Trend Monitoring (STM) for this task: The current slopes of all univariate streams are estimated and compared pairwise at each time point. The STM statistic is the standardized slope difference, so that decisions about coherence can be made by means of the six-sigma-rule, for instance. The STM meets the high demands that come along with the online monitoring of multivariate data streams: it is fast to compute, robust against outliers, applicable when observations are missing, and does not require stationarity of the processes. We investigate the distribution and the performance of the STM and demonstrate its capabilities considering blood pressure time series from intensive care patient monitoring as an example.
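The slope comparison described in the abstract can be illustrated with a minimal sketch. It assumes that window slopes are estimated by repeated median regression (Siegel 1982), which the appendix and the reference list suggest but the abstract does not state. The function names `repeated_median_slope` and `slope_difference` are hypothetical, and the standardization that turns the raw slope difference into the STM statistic is not reproduced here.

```python
import numpy as np

def repeated_median_slope(y):
    """Repeated median slope of an equally spaced window y (needs len(y) >= 2)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    dy = y[None, :] - y[:, None]      # dy[i, j] = y_j - y_i
    dt = t[None, :] - t[:, None]      # dt[i, j] = t_j - t_i
    np.fill_diagonal(dt, np.nan)      # exclude the pairs with i == j
    pair_slopes = dy / dt             # NaN on the diagonal
    inner = np.nanmedian(pair_slopes, axis=1)  # median over j != i for each i
    return np.median(inner)           # median over i

def slope_difference(window_a, window_b):
    """Raw (unstandardized) difference of the two window slopes."""
    return repeated_median_slope(window_a) - repeated_median_slope(window_b)
```

A single gross outlier in an otherwise linear window leaves the estimated slope unchanged, which is the robustness property the abstract refers to.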

References

  1. Aue, A., Hörmann, S., Horváth, L., Reimherr, M.: Break detection in the covariance structure of multivariate time series models. Ann. Stat. 37(6B), 4046–4087 (2009)

  2. Beringer, J., Hüllermeier, E.: Online clustering of parallel data streams. Data Knowl. Eng. 58, 180–204 (2006)

  3. Bernholt, T., Fried, R.: Computing the update of the repeated median regression line in linear time. Inf. Process. Lett. 88(3), 111–117 (2003)

  4. Bodnar, O., Schmid, W.: Surveillance of the mean behavior of multivariate time series. Stat. Neerl. 61(4), 383–406 (2007)

  5. Borowski, M., Fried, R.: Online signal extraction by robust regression in moving windows with data-adaptive width selection. Stat. Comput. (2013)

  6. Borowski, M., Schettlinger, K., Gather, U.: Multivariate real time signal processing by a robust adaptive regression filter. Commun. Stat. Simul. 38(2), 426–440 (2009)

  7. Bulut, A., Singh, A.: A unified framework for monitoring data streams in real time. In: Proceedings of the 21st International Conference on Data Engineering (ICDE 2005) (2005)

  8. Busse, D.: Robuste Echtzeit-Überwachung der Abhängigkeiten multivariater nichtstationärer Zeitreihen. Diploma thesis, Faculty of Statistics, TU Dortmund University (in German) (2012)

  9. Chan, L., Zhang, J.: Cumulative sum control charts for the covariance matrix. Stat. Sin. 11, 767–790 (2001)

  10. Cole, R., Shasha, D., Zhao, X.: Fast window correlations over uncooperative time series. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 743–749. KDD ’05 (2005)

  11. Dai, B., Huang, J., Yeh, M., Chen, M.: Adaptive clustering for multiple evolving streams. IEEE Trans. Knowl. Data Eng. 18(9), 1166–1180 (2006)

  12. Davies, P., Gather, U.: Breakdown and groups. Ann. Stat. 33(3), 977–1035 (2005)

  13. Davies, P., Fried, R., Gather, U.: Robust signal extraction for on-line monitoring data. J. Stat. Plan. Inference 122, 65–78 (2004)

  14. Fried, R.: Robust filtering of time series with trends. J. Nonparametr. Stat. 16(3–4), 313–328 (2004)

  15. Fried, R., Schettlinger, K., Borowski, M.: robfilter: Robust Time Series Filters. R package version 4.0 (2012). http://CRAN.R-project.org/package=robfilter

  16. Gather, U., Schettlinger, K., Fried, R.: Online signal extraction by robust linear regression. Comput. Stat. 21(1), 33–51 (2006)

  17. Gelper, S., Schettlinger, K., Croux, C., Gather, U.: Robust online scale estimation in time series: A model-free approach. J. Stat. Plan. Inference 139, 335–349 (2009)

  18. Gnanadesikan, R., Kettenring, J.: Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81–124 (1972)

  19. Idé, T., Papadimitriou, S., Vlachos, M.: Computing correlation anomaly scores using stochastic nearest neighbors. In: Proceedings of the 7th IEEE International Conference on Data Mining 2007. pp. 523–528 (2007)

  20. Jiang, T., Feng, Y., Zhang, B., Cao, Z., Fu, G., Shi, J.: Monitoring correlative financial data streams by local pattern similarity. J. Zhejiang Univ. Sci. A 10(7), 937–951 (2009)

  21. Lanius, V., Gather, U.: Robust online signal extraction from multivariate time series. Comput. Stat. Data Anal. 54, 966–975 (2010)

  22. Liu, X., Ferhatosmanoglu, H.: Efficient k-NN search on streaming data series. In: Hadzilacos, T., Manolopoulos, Y., Roddick, J., Theodoridis, Y. (eds.) Advances in Spatial and Temporal Databases, pp. 83–101. Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (2003)

  23. Maronna, R., Zamar, R.: Robust estimates of location and dispersion for high-dimensional data sets. Technometrics 44(4), 307–317 (2002)

  24. Papadimitriou, S., Sun, J., Yu, P.: Local correlation tracking in time series. In: Proceedings of the Sixth International Conference on Data Mining (ICDM’06), pp. 456–465 (2006)

  25. Rodrigues, P., Gama, J., Pedroso, J.: Hierarchical clustering of time-series data streams. IEEE Trans. Knowl. Data Eng. 20(5), 615–627 (2008)

  26. Rousseeuw, P., Croux, C.: Alternatives to the median absolute deviation. J. Am. Stat. Assoc. 88(424), 1273–1283 (1993)

  27. Rousseeuw, P., Hubert, M.: Regression-free and robust estimation of scale for bivariate data. Comput. Stat. Data Anal. 21, 67–85 (1996)

  28. Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)

  29. Siegel, A.: Robust regression using repeated medians. Biometrika 69(1), 242–244 (1982)

  30. Śliwa, P., Schmid, W.: Monitoring the cross-covariances of a multivariate time series. Metrika 61, 89–115 (2005a)

  31. Śliwa, P., Schmid, W.: Surveillance of the covariance matrix of multivariate nonlinear time series. Statistics 39(3), 221–246 (2005b)

  32. Wied, D., Galeano, P.: Monitoring correlation change in a sequence of random variables. J. Stat. Plan. Inference 143(1), 186–196 (2013)

  33. Yang, J.: Dynamic clustering of evolving streams with a single pass. In: Proceedings of the 19th International Conference on Data Engineering 2003, pp. 695–697 (2003)

  34. Yeh, M., Dai, B., Chen, M.: Clustering over multiple evolving streams by events and correlations. IEEE Trans. Knowl. Data Eng. 19(10), 1349–1362 (2007)

  35. Zhu, Y., Shasha, D.: Statstream: Statistical monitoring of thousands of data streams in real time. In: Proceedings of the 28th international conference on Very Large Data Bases, pp. 358–369 (2002)

Author information

Corresponding author

Correspondence to Matthias Borowski.

Appendices

Appendix 1: The Orthogonalized Gnanadesikan-Kettenring covariance estimator based on \(\mathrm{Q^\mathrm{adj}}\)

We use the Orthogonalized Gnanadesikan-Kettenring covariance estimator (OGK, Maronna and Zamar 2002; Gnanadesikan and Kettenring 1972) to estimate the local error cross-covariance matrix \(\varvec{\varSigma }_t\), where \(\mathrm{Cov}[\varepsilon _t(i),\varepsilon _t(j)] = \sigma _t(i,j)\) and \(\mathrm{Corr}[\varepsilon _t(i),\varepsilon _t(j)] = \sigma _t(i,j) / [\sigma _t(i) \cdot \sigma _t(j)]\). Let \(\varvec{X}\in \mathbb {R}^{n \times K}\) denote a sample matrix with rows \(\varvec{x}_i = (x_{i1},\ldots ,x_{iK})\) and columns \(X_k = (x_{1k},\ldots ,x_{nk})'\), where \(i=1,\ldots ,n\) and \(k=1,\ldots ,K\). Given a univariate scale estimator \(\hat{\sigma }()\), the covariance estimate \(\mathrm{OGK}(\varvec{X})\) is computed as follows:

  1. Determine \(\varvec{D} = \mathrm{diag}(\hat{\sigma }(X_1),\ldots ,\hat{\sigma }(X_K))\) and set \(\varvec{Y}= \varvec{X}\varvec{D}^{-1}\).

  2. Estimate the correlation matrix \(\varvec{R}\) of \(\varvec{X}\) by setting, for \(i \ne j\),

    $$\begin{aligned} R_{ij} = \left( \hat{\sigma }(Y_i + Y_j)^2 - \hat{\sigma }(Y_i - Y_j)^2 \right) \bigg /4, \end{aligned}$$

    where \(R_{ii}=1\), and \(Y_i\) denotes the \(i\)-th column of \(\varvec{Y}\).

  3. Conduct an eigenvalue decomposition \(\varvec{R} = \varvec{E}\varvec{\varLambda }\varvec{E}'\), where \(\varvec{\varLambda } = \mathrm{diag}(\lambda _1,\ldots ,\lambda _K)\) contains the ordered eigenvalues and \(\varvec{E}\) the corresponding eigenvectors of \(\varvec{R}\).

  4. Obtain the OGK covariance estimate by \(\mathrm{OGK}(\varvec{X}) = \varvec{A}\varvec{\varGamma }\varvec{A}'\) with \(\varvec{A}=\varvec{DE}\) and \(\varvec{\varGamma } = \mathrm{diag}(\hat{\sigma }(Z_1)^2,\ldots ,\hat{\sigma }(Z_K)^2)\), where \(Z_i\) denotes the \(i\)-th column of \(\varvec{Z}= \varvec{X}(\varvec{A}')^{-1}\).
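The four steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `ogk` and `mad_scale` are hypothetical, and the median absolute deviation is used only as a convenient stand-in for the univariate scale estimator \(\hat{\sigma }()\) (the paper uses \(\mathrm{Q^\mathrm{adj}}\)).

```python
import numpy as np

def mad_scale(x):
    """Stand-in scale estimator: MAD with the usual normal consistency factor."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def ogk(X, scale=mad_scale):
    """OGK covariance estimate of an (n x K) sample matrix X, steps 1 to 4."""
    X = np.asarray(X, dtype=float)
    K = X.shape[1]
    # Step 1: D = diag(scale of each column), Y = X D^{-1}
    d = np.array([scale(X[:, k]) for k in range(K)])
    Y = X / d
    # Step 2: pairwise robust correlations, R_ii = 1
    R = np.eye(K)
    for i in range(K):
        for j in range(i + 1, K):
            r = (scale(Y[:, i] + Y[:, j]) ** 2
                 - scale(Y[:, i] - Y[:, j]) ** 2) / 4.0
            R[i, j] = R[j, i] = r
    # Step 3: eigen decomposition R = E Lambda E'
    _, E = np.linalg.eigh(R)
    # Step 4: A = D E, Z = X (A')^{-1}, Gamma = diag(scale(Z_k)^2)
    A = d[:, None] * E
    Z = X @ np.linalg.inv(A.T)
    gamma = np.array([scale(Z[:, k]) ** 2 for k in range(K)])
    return (A * gamma) @ A.T  # A Gamma A'
```

Passing the scale estimator as an argument keeps the sketch agnostic about the choice between \(Q_n\), \(\mathrm{Q^\mathrm{adj}}\) or any other univariate scale estimator.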

To estimate the error covariance matrix \(\varvec{\varSigma }_t\), the obvious approach is to apply this algorithm to a matrix \(\varvec{X}\) of residuals, for instance the residuals of RM regression lines fitted to univariate time windows. Lanius and Gather (2010) follow this approach and suggest applying the \(Q_n\) scale estimator (Rousseeuw and Croux 1993) to RM residuals. However, we base the OGK not on the \(Q_n\) but on the \(\mathrm{Q^\mathrm{adj}}\) scale estimator. One reason is that the \(\mathrm{Q^\mathrm{adj}}\) estimator is model-free, i.e. it does not assume a certain shape of the signal. Therefore, in contrast to the OGK-\({Q_n}\), the OGK-\(\mathrm{Q^\mathrm{adj}}\) delivers good results even when the signal is not linear (Busse 2012, Chapter 4.2.2). Furthermore, the \(\mathrm{Q^\mathrm{adj}}\) estimates \(\hat{\sigma }_t(k) = \hat{\sigma }(X_k)\), \(k=1,\ldots ,K\), are delivered by the SCARM anyway, so that no additional computing time is needed to determine the matrix \(\varvec{D}\) in step 1 of the OGK algorithm.

The OGK-\(\mathrm{Q^\mathrm{adj}}\) estimates the error covariance directly from the data, i.e. the matrix \(\varvec{X}\) in the OGK-\(\mathrm{Q^\mathrm{adj}}\)-algorithm contains the \(n\) most recent observation vectors. (In contrast, the matrix \(\varvec{X}\) in the OGK-\(Q_n\)-algorithm contains residuals.) However, we have to decide how many observations are used for the OGK-\(\mathrm{Q^\mathrm{adj}}\) estimation. The STM uses the SCARM to obtain an individual window width \(n_t(k)\) for each of the \(K\) univariate components of the \(K\)-variate time series. That is, we do not have a sample matrix \(\varvec{X}\in \mathbb {R}^{n \times K}\) but \(K\) sample vectors \(X_k\) of different lengths \(n_t(k)\). We therefore estimate \(\sigma _t(i,j)\) from the matrix \(\varvec{X}^* = (X^*_i,X^*_j) \in \mathbb {R}^{n \times 2}\), where \(X^*_i\) and \(X^*_j\) are the most recent \(n = \max \ \{ n_t(i), n_t(j) \}\) observations of components \(i\) and \(j\). Choosing \(n\) as the minimum of \(n_t(i)\) and \(n_t(j)\) might seem the more intuitive approach at first glance, since both \(X^*_i\) and \(X^*_j\) would then have a linear structure. However, this would mean discarding observations, and linearity is not required for the \(\mathrm{Q^\mathrm{adj}}\) estimator.
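The window selection for a pair of components can be sketched as follows. The function name `recent_pair_window` and its arguments are hypothetical: `stream_i` and `stream_j` stand for the observation histories of the two components, and `n_i`, `n_j` for the window widths \(n_t(i)\) and \(n_t(j)\), which are assumed to be positive integers supplied by a width-selection step such as the SCARM.

```python
import numpy as np

def recent_pair_window(stream_i, stream_j, n_i, n_j):
    """Return the (n x 2) matrix X* of the n = max(n_i, n_j) latest observations."""
    n = max(n_i, n_j)
    if n < 1:
        raise ValueError("window widths must be positive")
    if min(len(stream_i), len(stream_j)) < n:
        raise ValueError("not enough observations for the requested window")
    x_i = np.asarray(stream_i, dtype=float)[-n:]  # latest n values of component i
    x_j = np.asarray(stream_j, dtype=float)[-n:]  # latest n values of component j
    return np.column_stack([x_i, x_j])
```

The resulting two-column matrix is what a bivariate covariance estimator would then be applied to, giving one estimate of \(\sigma _t(i,j)\) per pair of components.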

Note that the OGK is not affine equivariant, but only approximately so, see Maronna and Zamar (2002). Furthermore, given a sample matrix \(\varvec{X}=(X_1,X_2) \in \mathbb {R}^{n \times 2}\), the OGK is regression invariant if the same trend is added to both univariate samples \(X_1\) and \(X_2\) and a regression invariant scale estimator \(\hat{\sigma }()\) is used:

Lemma 1

Let \(\varvec{X}=(X_1,X_2) \in \mathbb {R}^{n \times 2}\) and \(\varvec{X}^* = (X_1^*,X_2^*) \in \mathbb {R}^{n \times 2}\), where \(X_k = (x_{ik})_{i=1,\ldots ,n}'\) and \(X_k^* = (x_{ik}^*)_{i=1,\ldots ,n}'\), \(k=1,2\), with \(x_{ik}^* = x_{ik} + i \cdot a + b\), \(a,b \in \mathbb {R}\), \(i=1,\ldots ,n\). If the OGK uses a regression invariant scale estimator \(\hat{\sigma }()\), then \(\mathrm{OGK}(\varvec{X}) = \mathrm{OGK}(\varvec{X}^*)\).

Proof

In the following, \(\varvec{D}, \varvec{Y}, \varvec{R}, \varvec{E}, \varvec{A}, \varvec{Z}, \varvec{\varGamma }\) denote the matrices of the OGK-algorithm from above. Obviously,

$$\begin{aligned} \varvec{D}= \mathrm{diag}\left( \hat{\sigma }(X_1), \hat{\sigma }(X_2) \right) = \mathrm{diag}\left( \hat{\sigma }(X_1^*), \hat{\sigma }(X_2^*) \right) =: \varvec{D}^*, \end{aligned}$$

since \(\hat{\sigma }()\) is regression invariant. Let \(\varvec{Y}^* = \varvec{X}^* (\varvec{D}^*)^{-1} = \varvec{X}^* \varvec{D}^{-1}\) with \(\varvec{Y}^* = (Y_1^*, Y_2^*) \in \mathbb {R}^{n \times 2}\), and let \(\varvec{R}^* = (R^*_{ij})_{i,j=1,2}\) with \(R^*_{ii}=1\) and \(R_{12}^* = R_{21}^* = \left( \hat{\sigma }(Y_1^* + Y_2^*)^2 - \hat{\sigma }(Y_1^* - Y_2^*)^2 \right) / 4\). It is not difficult to show that

$$\begin{aligned} \hat{\sigma }(Y_1^* + Y_2^*) = \hat{\sigma }(Y_1 + Y_2) ~\hbox {and}~ \hat{\sigma }(Y_1^* - Y_2^*) = \hat{\sigma }(Y_1 - Y_2), \end{aligned}$$

since \(Y_1^* + Y_2^*\) and \(Y_1^* - Y_2^*\) differ from \(Y_1 + Y_2\) and \(Y_1 - Y_2\), respectively, only by a linear trend, and \(\hat{\sigma }()\) is regression invariant. Hence, \(R_{12} = R^*_{12}\) and also \(R_{21} = R^*_{21}\), so that \(\varvec{R}= \varvec{R}^*\). Thus, \(\varvec{E}= \varvec{E}^*\), where \(\varvec{E}^*\) denotes the matrix \(\varvec{E}\) computed on \(\varvec{X}^*\) instead of \(\varvec{X}\), and therefore \(\varvec{A}= \varvec{D}\varvec{E}= \varvec{D}^* \varvec{E}^* =: \varvec{A}^*\). Furthermore, \(\varvec{Z}= \varvec{X}(\varvec{A}')^{-1}\), see the OGK-algorithm, where \(\varvec{Z}= (Z_1,Z_2) \in \mathbb {R}^{n \times 2}\). Let \(\varvec{Z}^* := \varvec{X}^* ((\varvec{A}^*)')^{-1} = \varvec{X}^* (\varvec{A}')^{-1}\) with \(\varvec{Z}^* = (Z_1^*,Z_2^*) \in \mathbb {R}^{n \times 2}\). It is not difficult to show that

$$\begin{aligned} \hat{\sigma }(Z_1^*) = \hat{\sigma }(Z_1) ~\hbox {and}~ \hat{\sigma }(Z_2^*) = \hat{\sigma }(Z_2), \end{aligned}$$

because of the regression invariance of \(\hat{\sigma }()\). Hence \(\varvec{\varGamma }^* := \mathrm{diag}\left( \hat{\sigma }(Z_1^*)^2, \hat{\sigma }(Z_2^*)^2 \right) = \varvec{\varGamma }\) and, thus,

$$\begin{aligned} \hbox {OGK}(\varvec{X}) = \varvec{A}\varvec{\varGamma }\varvec{A}' = \varvec{A}^* \varvec{\varGamma }^* (\varvec{A}^*)' = \hbox {OGK}(\varvec{X}^*). \end{aligned}$$
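Lemma 1 can be checked numerically with a small sketch. Everything named here is an assumption for illustration: `triangle_scale` is a simple trend-invariant scale estimator based on the heights of triangles formed by adjacent triples of observations (in the spirit of, but not identical to, \(\mathrm{Q^\mathrm{adj}}\)), `mad_scale` serves as a counterexample that is not regression invariant, and `ogk` restates the four steps of the algorithm.

```python
import numpy as np

def triangle_scale(x):
    """Median absolute height of adjacent-triple triangles (invariant to linear trends)."""
    x = np.asarray(x, dtype=float)
    heights = x[1:-1] - 0.5 * (x[:-2] + x[2:])
    return np.median(np.abs(heights))

def mad_scale(x):
    """Median absolute deviation; not invariant to adding a trend."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def ogk(X, scale):
    """OGK estimate for an (n x 2) matrix X with a given scale estimator."""
    d = np.array([scale(X[:, 0]), scale(X[:, 1])])
    Y = X / d
    r = (scale(Y[:, 0] + Y[:, 1]) ** 2 - scale(Y[:, 0] - Y[:, 1]) ** 2) / 4.0
    R = np.array([[1.0, r], [r, 1.0]])
    _, E = np.linalg.eigh(R)
    A = d[:, None] * E
    Z = X @ np.linalg.inv(A.T)
    gamma = np.array([scale(Z[:, 0]) ** 2, scale(Z[:, 1]) ** 2])
    return (A * gamma) @ A.T

rng = np.random.default_rng(0)
n = 60
X = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.6], [0.0, 0.8]])
trend = 0.7 * np.arange(1, n + 1) - 3.0   # i * a + b with a = 0.7, b = -3
X_star = X + trend[:, None]               # same trend added to both columns

same_with_invariant_scale = np.allclose(
    ogk(X, triangle_scale), ogk(X_star, triangle_scale))
same_with_mad = np.allclose(ogk(X, mad_scale), ogk(X_star, mad_scale))
```

With the trend-invariant scale the two estimates agree up to floating-point error, whereas with the MAD the added trend inflates the estimate, which illustrates why the lemma requires a regression invariant \(\hat{\sigma }()\).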

Appendix 2: Distribution of the STM statistic: further simulations

Here we investigate the distribution of the STM statistic given AR(1) time series from (16) with innovations \(e_t\) following

  1. (i)

    a \(t\)-distribution with three degrees of freedom (heavy-tailed errors),

  2. (ii)

    a Weibull distribution with shape parameter two and scale parameter one (skewed errors),

  3. (iii)

    a \(0.95N(0,1) + 0.05N(5,1)\)-distribution (contaminated normal errors).

For each error type (i)–(iii), we consider each combination of \(\varphi (1)\) and \(\varphi (2)\) with \(\varphi (1) \le \varphi (2)\) and \(\varphi (1),\varphi (2) \in \{-0.9,-0.6,\ldots ,0.9\}\), and compute 100,000 STM statistics in each case. Histograms of the STM statistics can be found on the website mentioned in Sect. 2.5. The empirical quantiles computed from these statistics are listed in Tables 2, 3 and 4. The quantiles in Table 3 are quite similar to those in Table 1. The quantiles in Tables 2 and 4 are larger than the corresponding standard normal quantiles in almost all cases, in particular if the autocorrelations are strong. Furthermore, the quantiles in Tables 2 and 4 are larger than the quantiles in Table 1 if \(\varphi (1)\) or \(\varphi (2)\) is equal to \(0.6\) or \(0.9\). Evidently, extreme observations in the data stream combined with strong positive autocorrelations increase the probability of large absolute values of the STM statistic.

Table 2 Empirical \((1-\gamma )\)-quantiles of 100,000 STM statistics computed from two AR(1) time series with autocorrelation parameters \(\varphi (k)\), \(k=1,2\), and innovations from a \(t_3\)-distribution

Table 3 Empirical \((1-\gamma )\)-quantiles of 100,000 STM statistics computed from two AR(1) time series with autocorrelation parameters \(\varphi (k)\), \(k=1,2\), and innovations from a Weibull distribution

Table 4 Empirical \((1-\gamma )\)-quantiles of 100,000 STM statistics computed from two AR(1) time series with autocorrelation parameters \(\varphi (k)\), \(k=1,2\), and innovations from a contaminated normal distribution
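The simulation design can be sketched as follows. Model (16) is not reproduced in this appendix, so the sketch assumes the standard AR(1) recursion \(x_t = \varphi x_{t-1} + e_t\) with uncentered innovations. The function names, the burn-in length and the seed handling are hypothetical, and the computation of the STM statistic itself is not shown.

```python
import numpy as np

def innovations(kind, size, rng):
    """Draw innovations e_t for the error types (i) to (iii)."""
    if kind == "t3":            # (i) t-distribution, three degrees of freedom
        return rng.standard_t(3, size=size)
    if kind == "weibull":       # (ii) Weibull, shape two and scale one
        return rng.weibull(2.0, size=size)
    if kind == "contaminated":  # (iii) 0.95 N(0,1) + 0.05 N(5,1)
        shifted = rng.random(size) < 0.05
        return rng.normal(loc=np.where(shifted, 5.0, 0.0), scale=1.0)
    raise ValueError(f"unknown innovation type: {kind}")

def simulate_ar1(phi, n, kind, rng, burn_in=200):
    """Simulate n observations of x_t = phi * x_{t-1} + e_t after a burn-in."""
    e = innovations(kind, n + burn_in, rng)
    x = np.empty(n + burn_in)
    x[0] = e[0]
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + e[t]
    return x[burn_in:]

# All combinations with phi(1) <= phi(2) on the grid {-0.9, -0.6, ..., 0.9}
phis = [-0.9, -0.6, -0.3, 0.0, 0.3, 0.6, 0.9]
pairs = [(p1, p2) for p1 in phis for p2 in phis if p1 <= p2]
```

For each entry of `pairs` and each error type, two series generated by `simulate_ar1` would then be fed to the monitoring procedure to collect the statistics summarized in Tables 2, 3 and 4.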

About this article

Cite this article

Borowski, M., Busse, D. & Fried, R. Robust online-surveillance of trend-coherence in multivariate data streams: the similar trend monitoring (STM) procedure. Stat Comput 25, 913–928 (2015). https://doi.org/10.1007/s11222-014-9462-4

Keywords

  • Multivariate data streams
  • Coherence
  • Trend estimation
  • Online
  • Robustness