A hybrid segmentation method for multivariate time series based on the dynamic factor model

Sun, Zhubin; Liu, Xiaodong; Wang, Lizhu

doi:10.1007/s00477-016-1323-6

Original Paper
Published: 04 October 2016

Volume 31, pages 1291–1304, (2017)
Stochastic Environmental Research and Risk Assessment

Zhubin Sun¹,², Xiaodong Liu² & Lizhu Wang³

Abstract

Many ready-made methods exist for segmenting univariate time series, but far fewer are available for multivariate time series. The common practice is to extend univariate segmentation methods to the multivariate case. This paper takes the opposite route: it reduces the multivariate time series to a univariate common factor sequence so that univariate segmentation methods apply directly. First, a dynamic factor model extracts a common factor sequence from the multivariate time series as a composite index. Then three typical search methods (binary segmentation, segment neighborhoods and pruned exact linear time) are applied to the common factor sequence to detect change points, and the resulting segmentation is taken as the final segmentation of the multivariate time series. Case studies show the applicability and robustness of the proposed approach for hydrometeorological time series segmentation.
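As a rough illustration of the second stage, the sketch below applies binary segmentation to a synthetic univariate sequence standing in for the common factor. The squared-error cost, the penalty value, the synthetic data and the function names are assumptions made for this example; it is not the authors' implementation.

```python
import numpy as np

def sse(y):
    """Sum of squared deviations from the segment mean (assumed cost function)."""
    return float(np.sum((y - y.mean()) ** 2)) if len(y) else 0.0

def binary_segmentation(y, penalty, min_size=2, offset=0):
    """Recursively split y wherever a split lowers the cost by more than `penalty`.

    Returns the sorted indices at which a new segment starts.
    """
    n = len(y)
    if n < 2 * min_size:
        return []
    base = sse(y)
    best_gain, best_k = 0.0, None
    for k in range(min_size, n - min_size + 1):
        gain = base - sse(y[:k]) - sse(y[k:])
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None or best_gain <= penalty:
        return []
    left = binary_segmentation(y[:best_k], penalty, min_size, offset)
    right = binary_segmentation(y[best_k:], penalty, min_size, offset + best_k)
    return left + [offset + best_k] + right

# Synthetic, noise-free "common factor" with mean shifts at t = 50 and t = 100.
factor = np.concatenate([np.zeros(50), np.full(50, 5.0), np.full(50, -3.0)])
change_points = binary_segmentation(factor, penalty=10.0)
print(change_points)
```

Binary segmentation is greedy, so on noisy data it may miss or misplace change points that the exact searches (segment neighborhoods, pruned exact linear time) would find; the penalty controls how many segments are returned.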

References

Abonyi J, Feil B, Nemeth S, Arva P (2003) Fuzzy clustering based segmentation of time-series. In: Advances in intelligent data analysis V, Springer, pp 275–285
Abonyi J, Feil B, Nemeth S, Arva P (2005) Modified Gath-Geva clustering for fuzzy segmentation of multivariate time-series. Fuzzy Sets Syst 149(1):39–56
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Aksoy H, Gedikli A, Unal NE, Kehagias A (2008) Fast segmentation algorithms for long hydrometeorological time series. Hydrol Process 22(23):4600–4608
Albertson DG, Pinkel D (2003) Genomic microarrays in human genetic disease and cancer. Hum Mol Genet 12(suppl 2):R145–R152
Auger IE, Lawrence CE (1989) Algorithms for the optimal identification of segment neighborhoods. Bull Math Biol 51(1):39–54
Bai J, Wang P (2015) Identification and Bayesian estimation of dynamic factor models. J Bus Econ Stat 33(2):221–240
Bellman RE, Dreyfus SE (2015) Applied dynamic programming. Princeton University Press, Princeton
Choi I (2012) Efficient estimation of factor models. Econ Theory 28(2):274–308
Dickey DA, Fuller WA (1981) Likelihood ratio statistics for autoregressive time series with a unit root. Econometrica 49(4):1057–1072
Doz C, Giannone D, Reichlin L (2012) A quasi-maximum likelihood approach for large, approximate dynamic factor models. Rev Econ Stat 94(4):1014–1024
Durbin J, Koopman SJ (2012) Time series analysis by state space methods. Oxford University Press, Oxford
Edwards AW, Cavalli-Sforza LL (1965) A method for cluster analysis. Biometrics 21(2):362–375
Engle R, Watson M (1981) A one-factor multivariate time series model of metropolitan wage rates. J Am Stat Assoc 76(376):774–781
Forni M, Reichlin L (2005) The generalized dynamic factor model: one-sided estimation and forecasting. J Am Stat Assoc 100(471):830–840
Garcia-Papani F, Uribe-Opazo MA, Leiva V, Aykroyd RG (2016) Birnbaum–Saunders spatial modelling and diagnostics applied to agricultural engineering data. Stoch Environ Res Risk Assess pp 1–20
Gedikli A, Aksoy H, Unal NE (2008) Segmentation algorithm for long time series analysis. Stoch Environ Res Risk Assess 22(3):291–302
Gedikli A, Aksoy H, Unal NE, Kehagias A (2010) Modified dynamic programming approach for offline segmentation of long hydrometeorological time series. Stoch Environ Res Risk Assess 24(5):547–557
Guo H, Liu X, Song L (2015) Dynamic programming approach for segmentation of multivariate time series. Stoch Environ Res Risk Assess 29(1):265–273
Hannan EJ, Quinn BG (1979) The determination of the order of an autoregression. J R Stat Soc 41(2):190–195
Hinkley DV (1970) Inference about the change-point in a sequence of random variables. Biometrika 57(1):1–17
Holmes EE, Ward EJ, Wills K (2012) MARSS: multivariate autoregressive state-space models for analyzing time-series data. R J 4(1):11–19
Hubert P (2000) The segmentation procedure as a tool for discrete modeling of hydrometeorological regimes. Stoch Environ Res Risk Assess 14(4):297–304
Hubert P, Carbonnel JP, Chaouche A (1989) Segmentation des séries hydrométéorologiques: application à des séries de précipitations et de débits de l’Afrique de l’Ouest. J Hydrol 110(3):349–367
Inclan C, Tiao GC (1994) Use of cumulative sums of squares for retrospective detection of changes of variance. J Am Stat Assoc 89(427):913–923
Kalman RE (1960) A new approach to linear filtering and prediction problems. J Fluids Eng 82(1):35–45
Kawahara Y, Sugiyama M (2012) Sequential change-point detection based on direct density-ratio estimation. Stat Anal Data Min 5(2):114–127
Kehagias A (2004) A hidden Markov model segmentation procedure for hydrological and environmental time series. Stoch Environ Res Risk Assess 18(2):117–130
Kehagias A, Fortin V (2006) Time series segmentation with shifting means hidden Markov models. Nonlinear Process Geophys 13(3):339–352
Kehagias A, Nidelkou E, Petridis V (2006) A dynamic programming segmentation procedure for hydrological and environmental time series. Stoch Environ Res Risk Assess 20(1):77–94
Killick R, Eckley I (2014) Changepoint: an R package for changepoint analysis. J Stat Softw 58(3):1–19
Killick R, Fearnhead P, Eckley I (2012) Optimal detection of changepoints with a linear computational cost. J Am Stat Assoc 107(500):1590–1598
Koopman SJ, Shephard N, Doornik JA (1999) Statistical algorithms for models in state space using SsfPack 2.2. Econom J 2(1):107–160
Mariano RS, Murasawa Y (2003) A new coincident index of business cycles based on monthly and quarterly series. J Appl Econ 18(4):427–443
Mariano RS, Murasawa Y (2010) A coincident index, common factors, and monthly real GDP. Oxf Bull Econ Stat 72(1):27–46
Matteson DS, James NA (2014) A nonparametric approach for multiple change point analysis of multivariate data. J Am Stat Assoc 109(505):334–345
Molinari N, Daures JP, Durand JF (2001) Regression splines for threshold selection in survival data analysis. Stat Med 20(2):237–247
Muggeo VM (2003) Estimating regression models with unknown break-points. Stat Med 22(19):3055–3071
Muggeo VM, Adelfio G (2010) Efficient change point detection for genomic sequences of continuous measurements. Bioinformatics 27(2):161–166
Pfaff B (2008) VAR, SVAR and SVEC models: implementation within R package vars. J Stat Softw 27(4):1–32
Ramsey JB, Lampart C (1998) The decomposition of economic relationships by time scale using wavelets: expenditure and income. Stud Nonlinear Dyn Econom 3(1):1–22
Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W et al (2006) Global variation in copy number in the human genome. Nature 444(7118):444–454
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Seong B, Ahn SK, Zadrozny PA (2013) Estimation of vector error correction models with mixed-frequency data. J Time Ser Anal 34(2):194–205
Shumway RH, Stoffer DS (2010) Time series analysis and its applications: with R examples. Springer, New York
Stock JH, Watson MW (1988) A probability model of the coincident economic indicators. Technical report, National Bureau of Economic Research
Stock JH, Watson MW (2011) Dynamic factor models. Oxf Handb Econ Forecast 1:35–59
Wang N, Liu X, Yin J (2012) Improved Gath–Geva clustering for fuzzy segmentation of hydrometeorological time series. Stoch Environ Res Risk Assess 26(1):139–155

Acknowledgments

This work is supported by the Natural Science Foundation of China under Grants 61673082 and 61533005.

Author information

Authors and Affiliations

School of Mathematical Sciences, Dalian University of Technology, Dalian, 116024, China
Zhubin Sun
School of Control Science and Engineering, Dalian University of Technology, Dalian, 116024, China
Zhubin Sun & Xiaodong Liu
School of Mathematics and System Science, Shenyang Normal University, Shenyang, 110034, China
Lizhu Wang

Corresponding author

Correspondence to Xiaodong Liu.

Appendices

Appendix 1

For all $t$, let

$$s_{t}=\left( \begin{array}{c} f_{t} \\ f_{t-1} \\ \vdots \\ f_{t-p+1} \\ \end{array} \right) _{p\times 1},$$

(18)

so Eqs. (7)–(8) can be transformed into the state space form:

$$\begin{aligned} s_{t}=\, & {} Bs_{t-1}+ D\xi _{t}, \end{aligned}$$

(19)

$$\begin{aligned} x_{t} =\, & {} Gs_{t} + v_{t}, \end{aligned}$$

(20)

where

$$\begin{aligned} B= & {} \left( \begin{array}{ccccc} {\phi }_{1}, &{} {\phi }_{2}, &{} {\ldots }, &{} {\phi }_{p-1}, &{} {\phi }_{p} \\ 1, &{} 0, &{} {\ldots }, &{} 0, &{} 0 \\ 0, &{} 1, &{} {\ldots }, &{} 0, &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 0, &{} 0, &{} {\ldots }, &{} 1, &{} 0 \\ \end{array} \right) _{p\times p}, \end{aligned}$$

(21)

$$\begin{aligned} D=\, & {} \left[ \left( \begin{array}{cccccccc} 1, &{} 0, {\ldots}, &{} 0 \\ \end{array} \right) _{1\times p}\right] ', \end{aligned}$$

(22)

$$\begin{aligned} \xi _{t}=\, & {} w_{t}, \end{aligned}$$

(23)

$$\begin{aligned} G=\, & {} \left( \begin{array}{cccc} \Lambda _{n\times 1}, &{} {\mathbf{0}}_{n\times (p-1)} \\ \end{array} \right). \end{aligned}$$

(24)

Define

$$\begin{aligned} A=\, & {} \left( \begin{array}{cccc} 1, &{} 0, {\ldots }, &{}0 \\ \end{array} \right) _{1\times p}, \end{aligned}$$

(25)

$$\begin{aligned} B^{*}=\, & {} \left( \begin{array}{cccc} \phi _{1}, &{} \phi _{2}, {\ldots }, &{}\phi _{p} \\ \end{array} \right) _{1\times p}, \end{aligned}$$

(26)

so the following expression holds:

$$As_{t}=f_{t}=\sum _{i=1}^{p}\phi _{i}f_{t-i}+w_{t}=B^{*}s_{t-1}+w_{t}.$$

(27)
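The construction in Eqs. (18)–(27) can be checked numerically. The sketch below builds $B$, $D$, $G$, $A$ and $B^{*}$ with NumPy for hypothetical values of $\phi$ and $\Lambda$ (both invented for illustration, with $p=3$ and $n=2$) and confirms that one step of Eq. (19) reproduces Eq. (27).

```python
import numpy as np

def companion(phi):
    """p x p companion matrix B of Eq. (21) for AR coefficients phi_1..phi_p."""
    p = len(phi)
    B = np.zeros((p, p))
    B[0, :] = phi
    B[1:, :-1] = np.eye(p - 1)
    return B

phi = np.array([0.5, -0.2, 0.1])   # hypothetical AR(3) coefficients
Lam = np.array([1.0, 0.8])         # hypothetical factor loadings, n = 2
p, n = len(phi), len(Lam)

B = companion(phi)
D = np.zeros((p, 1))
D[0, 0] = 1.0                                              # Eq. (22)
G = np.hstack([Lam.reshape(n, 1), np.zeros((n, p - 1))])   # Eq. (24)
A = D.T                                                    # Eq. (25)
B_star = phi.reshape(1, p)                                 # Eq. (26)

s_prev = np.array([0.3, -1.0, 2.0])   # (f_{t-1}, f_{t-2}, f_{t-3})
w_t = 0.7                             # one draw of the factor innovation
s_t = B @ s_prev + D[:, 0] * w_t      # Eq. (19)

lhs = (A @ s_t).item()                   # A s_t = f_t
rhs = (B_star @ s_prev).item() + w_t     # B* s_{t-1} + w_t
print(lhs, rhs)
```

The remaining rows of $B$ only shift the lagged factors down by one position, which is why $A$ and $B^{*}$ are enough to write the factor dynamics in the likelihood.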

Assuming $w_{t}\sim \mathrm {N}(0, \Sigma _{w})$, $v_{t}\sim \mathrm {N}(0, \Sigma _{v})$ and that the initial state vector $f_{0}$ is distributed $\mathrm {N}(\delta ,\Omega )$, we can express the complete-data log-likelihood function, up to an additive constant, as

$$\begin{aligned} \log (L(\Theta )) = {}& -\frac{1}{2}\log (|\Omega |)-\frac{1}{2}(f_{0}-\delta )'\Omega ^{-1}(f_{0}-\delta )\\ & -\frac{T}{2}\log (|\Sigma _{w}|)-\frac{1}{2}\sum _{t=1}^{T}(As_{t}-B^{*}s_{t-1})'\Sigma _{w}^{-1}(As_{t}-B^{*}s_{t-1})\\ & -\frac{T}{2}\log (|\Sigma _{v}|)-\frac{1}{2}\sum _{t=1}^{T}(x_{t}-\Lambda As_{t})'\Sigma _{v}^{-1}(x_{t}-\Lambda As_{t}), \end{aligned}$$

(28)

where $\Theta =(\mathrm {vec}(\Lambda )',\mathrm {vec}(B^{*})',\mathrm {vech}(\Sigma _{w})',\mathrm {vech}(\Sigma _{v})')$ is the vector containing all the unknown parameters and $\mathrm {vec}(\cdot )$ denotes the vectorization of a matrix column-wise from left to right, and $\mathrm {vech}(\cdot )$ denotes the vectorization of the lower triangular part of a matrix column-wise from left to right. Let

$$X_{T}=:\{x_{t},1\le t\le T\},$$

(29)

and define $Q(\Theta )$ as the expectation of ${\log (L(\Theta ))}$ conditional on $X_{T}$, namely,

$$Q(\Theta )=\mathrm {E}(\log (L(\Theta ))|X_{T}).$$

(30)

Setting the partial derivatives of Eq. (30) with respect to the unknown parameters to zero yields the update formulas:

$$\begin{aligned} \Sigma _{w}=\, & {} \frac{1}{T}\left( AM_{00}A'-B^{*}M_{10}A'\right) , \end{aligned}$$

(31)

$$\begin{aligned} \Sigma _{v}=\, & {} \frac{1}{T}\sum _{t=1}^{T}(x_{t}x_{t}'-\Lambda As_{t}^{T}x_{t}'), \end{aligned}$$

(32)

$$\begin{aligned} B^{*}=\, & {} AM_{01}M_{11}^{-1}, \end{aligned}$$

(33)

$$\begin{aligned} \Lambda \;=\, & {} \left( \sum _{t=1}^{T}x_{t}(As_{t}^{T})'\right) (AM_{00}A')^{-1}, \end{aligned}$$

(34)

where

$$M_{jk}=\sum _{t=1}^{T}\mathrm {E}(s_{t-j}s_{t-k}'|X_{T})=\sum _{t=1}^{T}(P^{T}_{t-j,t-k}+s^{T}_{t-j}(s^{T}_{t-k})'),\; j,k=0,1.$$

(35)

In addition, define

$$s_{r}^{t}:=\mathrm {E}(s_{r}|X_{t})$$

(36)

as the conditional expectation based on $X_{t}$. The conditional variance and covariance based on $X_{t}$ are respectively denoted by

$$P_{r}^{t}:=\mathrm {cov}(s_{r},s_{r}|X_{t})$$

(37)

and

$$P_{r,u}^{t}:=\mathrm {cov}(s_{r},s_{u}|X_{t}),$$

(38)

which can be estimated by the updating and smoothing equations of the Kalman filter (Durbin and Koopman 2012). The EM estimation procedure is performed in the following steps (Shumway and Stoffer 2010; Seong et al. 2013):

(1) Choose the initial values $\Theta ^{0},\delta$ and $\Omega$. In general, the initial values of $\Sigma _{w}$ and $\Sigma _{v}$ are set to identity matrices and the initial values of $B^{*}$ and $\Lambda$ are set to zero matrices of the appropriate dimensions. Moreover, we set $\delta =0$ and $\Omega =\kappa I$, where $I$ is an identity matrix and $\kappa$ is 1 for a stationary process and a large value such as $10^6$ for a nonstationary process. On iteration $j$, for $j = 1,2,\ldots$:
(2) Compute the negative log-likelihood ${-\log (L_{X}(\Theta ^{j-1}))}$.
(3) Perform the E-step of the EM algorithm. Obtain the smoothed values $s^{T}_{t},P^{T}_{t},P^{T}_{t,t-1}$ for $t=1,\ldots ,T$ by the Kalman filter and smoother based on $\Theta ^{j-1}$, and then calculate $M_{00}$, $M_{01}$, $M_{10}$ and $M_{11}$ according to Eq. (35).
(4) Perform the M-step of the EM algorithm. Update the estimates $\Theta ^{j}$ according to Eqs. (31)–(34).
(5) Repeat Steps (2)–(4) until the likelihood values converge.
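To make Step (4) concrete, the following is a minimal sketch of the M-step, assuming the smoothed moments from Step (3) are already available. It uses the standard closed-form updates of this model class (Shumway and Stoffer 2010): $B^{*}=AM_{01}M_{11}^{-1}$, $\Sigma_{w}=(AM_{00}A'-B^{*}M_{10}A')/T$, $\Lambda=\sum_{t}x_{t}(As_{t}^{T})'(AM_{00}A')^{-1}$ and $\Sigma_{v}=\sum_{t}(x_{t}x_{t}'-\Lambda As_{t}^{T}x_{t}')/T$. The function name `m_step` and the array layouts are assumptions of this example.

```python
import numpy as np

def m_step(x, s_sm, P_sm, P_lag):
    """One M-step from smoothed moments (hypothetical array layout).

    x     : (T, n)      observations x_1..x_T
    s_sm  : (T+1, p)    smoothed states s_0^T..s_T^T
    P_sm  : (T+1, p, p) smoothed covariances P_0^T..P_T^T
    P_lag : (T, p, p)   lag-one covariances, P_lag[t-1] = P_{t,t-1}^T
    """
    T, n = x.shape
    p = s_sm.shape[1]
    A = np.zeros((1, p))
    A[0, 0] = 1.0
    cur, prev = s_sm[1:], s_sm[:-1]
    # Moment matrices of Eq. (35)
    M00 = P_sm[1:].sum(axis=0) + cur.T @ cur
    M11 = P_sm[:-1].sum(axis=0) + prev.T @ prev
    M01 = P_lag.sum(axis=0) + cur.T @ prev
    M10 = M01.T
    B_star = A @ M01 @ np.linalg.inv(M11)               # 1 x p
    Sigma_w = (A @ M00 @ A.T - B_star @ M10 @ A.T) / T  # 1 x 1
    f_sm = cur @ A.T                                    # (T, 1), smoothed factor
    S_xf = x.T @ f_sm                                   # sum_t x_t (A s_t^T)'
    Lam = S_xf @ np.linalg.inv(A @ M00 @ A.T)           # n x 1
    Sigma_v = (x.T @ x - Lam @ S_xf.T) / T              # n x n
    return B_star, Sigma_w, Lam, Sigma_v

# Noise-free toy check with p = 1: f_t = 0.5 f_{t-1}, x_t = (2, -1)' f_t.
T = 20
f = 0.5 ** np.arange(T + 1)
x = f[1:, None] * np.array([[2.0, -1.0]])
B_star, Sigma_w, Lam, Sigma_v = m_step(
    x, f[:, None], np.zeros((T + 1, 1, 1)), np.zeros((T, 1, 1)))
print(B_star, Lam.ravel())
```

In a full EM loop this function would be called once per iteration, with the Kalman filter and smoother rerun under the updated parameters before the next call.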

Appendix 2

For all $t$, let

$$s_{t}=\left( \begin{array}{c} f_{t} \\ f_{t-1} \\ \vdots \\ f_{t-p+1} \\ v_{t} \\ v_{t-1} \\ \vdots \\ v_{t-q+1} \\ \end{array} \right) _{(p+nq)\times 1} ,$$

(39)

so Eqs. (7)–(9) can be transformed into the state space form:

$$\begin{aligned} s_{t}=\, & {} Bs_{t-1}+ D\xi _{t}, \end{aligned}$$

(40)

$$\begin{aligned} x_{t}=\, & {} Gs_{t}, \end{aligned}$$

(41)

where

$$B=\left( \begin{array}{cc} B_{1}, &{} {\mathbf{0}}_{p\times nq} \\ {\mathbf{0}}_{nq\times p}, &{} B_{2} \\ \end{array} \right) ,$$

(42)

with

$$\begin{aligned} B_{1}=\left( \begin{array}{ccccc} {\phi }_{1}, &{} {\phi }_{2}, &{} {\ldots }, &{} {\phi }_{p-1}, &{} {\phi }_{p} \\ 1, &{} 0, &{} {\ldots }, &{} 0, &{} 0 \\ 0, &{} 1, &{} {\ldots }, &{} 0, &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 0, &{} 0, &{} {\ldots }, &{} 1, &{} 0 \\ \end{array} \right) _{p\times p}, \end{aligned}$$

(43)

$$\begin{aligned} B_{2}=\left( \begin{array}{ccccc} {\Psi }_{1}, &{} {\Psi }_{2}, &{} {\ldots }, &{} {\Psi }_{q-1}, &{} {\Psi }_{q} \\ I_{n\times n}, &{} \mathbf {0}_{n\times n}, &{} {\ldots }, &{} \mathbf {0}_{n\times n}, &{} \mathbf {0}_{n\times n} \\ \mathbf {0}_{n\times n}, &{} I_{n\times n}, &{} {\ldots }, &{} \mathbf {0}_{n\times n}, &{} \mathbf {0}_{n\times n} \\ \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ \mathbf {0}_{n\times n}, &{} \mathbf {0}_{n\times n}, &{} {\ldots }, &{} I_{n\times n}, &{} \mathbf {0}_{n\times n} \\ \end{array} \right) _{nq \times nq}, \end{aligned}$$

(44)

$$\begin{aligned} D=\left[ \begin{array}{c} \left( \begin{array}{cccccccc} 1, &{} 0, &{} {\ldots }, &{} 0, &{} \mathbf {0}_{1\times n}, &{} \mathbf {0}_{1\times n}, &{} {\ldots }, &{} \mathbf {0}_{1\times n} \\ \mathbf {0}_{n\times 1}, &{} \mathbf {0}_{n\times 1}, &{} {\ldots }, &{} \mathbf {0}_{n\times 1}, &{} I_{n\times n}, &{} \mathbf {0}_{n\times n}, &{} {\ldots }, &{} \mathbf {0}_{n\times n} \\ \end{array} \right) _{(1+n)\times (p+nq)} \\ \end{array} \right] ', \end{aligned}$$

(45)

$$\xi _{t}=\left( \begin{array}{c} w_{t} \\ e_{t} \\ \end{array} \right) _{(1+n)\times 1},$$

(46)

$$\begin{aligned} G=\left( \begin{array}{cccc} \Lambda , &{} \mathbf {0}_{n\times (p-1)}, &{} I_{n\times n}, &{} \mathbf {0}_{n\times n(q-1)} \\ \end{array} \right) , \end{aligned}$$

(47)

in which $I_{i\times j}$ and ${\mathbf{0}}_{i\times j}$ stand for an i-by-j identity matrix and an i-by-j zero matrix respectively.
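The block structure of Eqs. (39)–(47) can be assembled as in the sketch below. It assumes, as the form of $B_{2}$ suggests, that the idiosyncratic term $v_{t}$ follows a vector autoregression of order $q$ with coefficient matrices $\Psi_{1},\ldots,\Psi_{q}$; the helper names and the numerical values are hypothetical.

```python
import numpy as np

def companion(blocks):
    """Block companion matrix from a list of q square (m x m) coefficient blocks."""
    q, m = len(blocks), blocks[0].shape[0]
    C = np.zeros((q * m, q * m))
    C[:m, :] = np.hstack(blocks)
    C[m:, :-m] = np.eye((q - 1) * m)
    return C

def build_state_space(phi, Psi, Lam):
    """Return B (Eq. 42), D (Eq. 45) and G (Eq. 47)."""
    p, q, n = len(phi), len(Psi), len(Lam)
    B1 = companion([np.array([[c]]) for c in phi])   # Eq. (43)
    B2 = companion(Psi)                              # Eq. (44)
    B = np.block([[B1, np.zeros((p, n * q))],
                  [np.zeros((n * q, p)), B2]])
    D = np.zeros((p + n * q, 1 + n))
    D[0, 0] = 1.0                  # w_t enters the equation for f_t
    D[p:p + n, 1:] = np.eye(n)     # e_t enters the equation for v_t
    G = np.hstack([Lam.reshape(n, 1), np.zeros((n, p - 1)),
                   np.eye(n), np.zeros((n, n * (q - 1)))])
    return B, D, G

phi = np.array([0.5, 0.2])                    # hypothetical, p = 2
Psi = [0.3 * np.eye(2), 0.1 * np.eye(2)]      # hypothetical, q = 2, n = 2
Lam = np.array([1.0, 2.0])                    # hypothetical loadings
B, D, G = build_state_space(phi, Psi, Lam)

# State ordered as (f_t, f_{t-1}, v_t, v_{t-1}); G s_t should equal Lam f_t + v_t.
s = np.array([1.5, 0.0, 0.1, -0.2, 9.0, 9.0])
print(G @ s)
```

Because $v_{t}$ is carried inside the state vector, the measurement equation (41) has no separate noise term, which is what allows the likelihood to be evaluated by the prediction error decomposition.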

In this case, let $\eta _{t}=x_t-\hat{x}_t$ denote the innovations, where $\hat{x}_t=x_{t|t-1}$ is the one-step-ahead prediction of $x_t$, and let $F_t$ denote their covariance matrix. The Kalman filter allows the computation of the Gaussian log-likelihood function via the prediction error decomposition (Engle and Watson 1981; Koopman et al. 1999). Assuming

$$x_{t}|x_{1},{\ldots },x_{t-1}\sim \mathrm {N}\left( x_{t|t-1},F_{t}\right) ,$$

(48)

the log-likelihood function is given by

$$\begin{aligned} \log (L\left( \Theta \right) )= & {} \log (p\left( x_1,\ldots ,x_T;\Theta \right) )=\sum _{t=1}^{T}\log (p\left( x_t|x_1,\ldots ,x_{t-1};\Theta \right) ) \\= & {} \sum _{t=1}^{T}-\left( \frac{n}{2}\log (2\pi )+\frac{1}{2}\log (\det (F_{t}))+\frac{1}{2}\eta _{t}'F_{t}^{-1}\eta _{t}\right) =\sum _{t=1}^{T}L_{t} \\= & {} -\frac{nT}{2}\log (2\pi )-\frac{1}{2}\sum _{t=1}^{T}\left[ \log (\det (F_{t}))+\eta _{t}'F_{t}^{-1}\eta _{t}\right] , \end{aligned}$$

(49)
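The prediction error decomposition can be sketched as a single pass of the Kalman filter that accumulates $L_{t}$. The function below is a minimal illustration for a generic linear Gaussian state space model; the name `kalman_loglik`, the argument layout and the optional measurement noise covariance `R` (zero in the form of Eq. (41)) are assumptions of this example.

```python
import numpy as np

def kalman_loglik(x, B, G, Q, R, s0, P0):
    """Gaussian log-likelihood via the prediction error decomposition.

    Model: s_t = B s_{t-1} + noise with covariance Q,
           x_t = G s_t     + noise with covariance R.
    x is (T, n); s0 and P0 are the mean and covariance of the initial state.
    """
    T, n = x.shape
    s, P = s0.copy(), P0.copy()
    loglik = 0.0
    for t in range(T):
        # One-step-ahead prediction of the state and its covariance
        s_pred = B @ s
        P_pred = B @ P @ B.T + Q
        # Innovation eta_t and its covariance F_t
        eta = x[t] - G @ s_pred
        F = G @ P_pred @ G.T + R
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (n * np.log(2 * np.pi) + logdet
                          + eta @ np.linalg.solve(F, eta))
        # Measurement update
        K = P_pred @ G.T @ np.linalg.inv(F)
        s = s_pred + K @ eta
        P = P_pred - K @ G @ P_pred
    return loglik

# Toy check: with B = 0, G = 1, Q = 1, R = 0 the observations are i.i.d. N(0, 1).
x = np.array([[0.5], [-1.0], [2.0]])
ll = kalman_loglik(x, np.zeros((1, 1)), np.eye(1), np.eye(1),
                   np.zeros((1, 1)), np.zeros(1), np.eye(1))
expected = -0.5 * (3 * np.log(2 * np.pi) + np.sum(x ** 2))
print(ll, expected)
```

For the model of Eqs. (40)–(41), `Q` would be $D\,\mathrm{cov}(\xi_{t})\,D'$ and `R` a zero matrix.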

where $\Theta$ is the vector of parameters of the statistical model represented in the state space form. The log-likelihood is maximized by the iterative procedure in Eq. (50), which obtains new estimates $\Theta ^{k+1}$ from the estimates of the $k$-th iteration using $H^{k}$, the information matrix evaluated at $\Theta ^{k}$, and a scalar step length $\alpha ^{k}$:

$$\Theta ^{k+1}=\Theta ^{k}+\alpha ^{k}\left( H^{k}\right) ^{-1}\frac{\partial L}{\partial \Theta }|_{\Theta ^{k}}.$$

(50)
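A single update of Eq. (50) amounts to solving one linear system. The sketch below shows the iteration on a deliberately simple toy model (i.i.d. $\mathrm{N}(\mu,1)$ data with unknown $\mu$), which is an assumption chosen only because its score and information are available in closed form; it is not the DFM likelihood.

```python
import numpy as np

def scoring_step(theta, score, info, alpha=1.0):
    """One update of Eq. (50): theta + alpha * H^{-1} * (dL/dtheta)."""
    return theta + alpha * np.linalg.solve(info, score)

# Toy model (illustrative assumption): x_i ~ N(mu, 1) with unknown mean mu.
x = np.array([1.0, 2.0, 4.0, 5.0])
theta = np.array([0.0])
for _ in range(20):
    score = np.array([np.sum(x - theta[0])])   # derivative of log-likelihood
    info = np.array([[float(len(x))]])         # information matrix for mu
    new_theta = scoring_step(theta, score, info)
    if np.max(np.abs(new_theta - theta)) < 1e-10:
        break
    theta = new_theta
print(theta)
```

For the DFM the score and information matrix are not available in such a simple form, which is why the derivatives of $L_{t}$ are worked out next.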

For an invertible symmetric matrix $B$ whose entries depend on a scalar $x$, the following identities hold:

$$\begin{aligned} \frac{\partial |B|}{\partial x}=\, & {} |B|\mathrm {tr}\left( B^{-1}\frac{\partial B}{\partial x}\right) , \end{aligned}$$

(51)

$$\begin{aligned} \frac{\partial B^{-1}}{\partial x}= & {} -B^{-1}\frac{\partial B}{\partial x}B^{-1}. \end{aligned}$$

(52)
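Both identities are easy to confirm numerically with central finite differences. In the sketch below the matrices `B0` and `B1`, which define the hypothetical family $B(x)=B_{0}+xB_{1}$, are arbitrary choices made for illustration.

```python
import numpy as np

B0 = np.array([[2.0, 0.5], [0.5, 1.0]])
B1 = np.array([[1.0, 0.2], [0.2, 3.0]])   # dB/dx for B(x) = B0 + x * B1

def B_of(x):
    return B0 + x * B1

x0, h = 0.3, 1e-6
B = B_of(x0)
B_inv = np.linalg.inv(B)

# Eq. (51): d|B|/dx = |B| tr(B^{-1} dB/dx)
num_det = (np.linalg.det(B_of(x0 + h)) - np.linalg.det(B_of(x0 - h))) / (2 * h)
ana_det = np.linalg.det(B) * np.trace(B_inv @ B1)

# Eq. (52): dB^{-1}/dx = -B^{-1} (dB/dx) B^{-1}
num_inv = (np.linalg.inv(B_of(x0 + h)) - np.linalg.inv(B_of(x0 - h))) / (2 * h)
ana_inv = -B_inv @ B1 @ B_inv

print(num_det, ana_det)
print(np.max(np.abs(num_inv - ana_inv)))
```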

Differentiating $L_{t}$ in Eq. (49) with respect to the parameter $\Theta _{i}$ and applying Eqs. (51) and (52), we get the following expressions (Engle and Watson 1981):

$$\begin{aligned} \frac{\partial L_{t}}{\partial \Theta _{i}}= & {} -\frac{1}{2}\mathrm {tr}\left( F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}\right) -\left( \frac{\partial \eta _{t}}{\partial \Theta _{i}}\right) 'F_{t}^{-1}\eta _{t} + \frac{1}{2}\eta _{t}'F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}F_{t}^{-1}\eta _{t} \end{aligned}$$

(53)

$$\begin{aligned} \qquad = & {} -\frac{1}{2}\mathrm {tr}\left[ F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}\left( I-F_{t}^{-1}\eta _{t}\eta _{t}'\right) \right] -\left( \frac{\partial \eta _{t}}{\partial \Theta _{i}}\right) 'F_{t}^{-1}\eta _{t} \end{aligned}$$

(54)

$$\qquad = L_{1_{t}}+L_{2_{t}}.$$

(55)

To get the second derivative matrix of the log-likelihood, first calculate

$$\begin{aligned} \frac{\partial L_{1_t}}{\partial \Theta _{j}}&= -\frac{1}{2}\mathrm {tr}\left\{ \left[ \partial \left( F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}\right) /\partial \Theta _{j}\right] \left( I-F_{t}^{-1}\eta _{t}\eta _{t}'\right) \right\} \\&\quad -\frac{1}{2}\mathrm {tr}\left[ F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{j}}F_{t}^{-1}\eta _{t}\eta _{t}'\right] \\&\quad +\frac{1}{2}\mathrm {tr}\left\{ F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}F_{t}^{-1}\left[ \frac{\partial \eta _{t}}{\partial \Theta _{j}}\eta _{t}'+\eta _{t}\left( \frac{\partial \eta _{t}}{\partial \Theta _{j}}\right) '\right] \right\} , \end{aligned}$$

(56)

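and the only random quantities in this expression are $\eta _{t}$ and its derivatives. Because $E(\eta _{t}\eta _{t}')=F_{t}$, the factor $I-F_{t}^{-1}\eta _{t}\eta _{t}'$ has zero expectation, and because $\partial \eta _{t}/\partial \Theta _{j}$ depends only on past observations, it is uncorrelated with $\eta _{t}$. Hence, taking the expected value of Eq. (56), we have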

$$E\left( \frac{\partial L_{1_{t}}}{\partial \Theta _{j}}\right) =-\frac{1}{2}\mathrm {tr}\left( F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{j}}\right) .$$

(57)

Similarly, differentiate $L_{2_{t}}$ with respect to $\Theta _{j}$ to obtain

$$\frac{\partial L_{2_{t}}}{\partial \Theta _{j}}=-\left( \frac{\partial ^{2}\eta _{t}}{\partial \Theta _{i}\partial \Theta _{j}}\right) 'F_{t}^{-1}\eta _{t} -\left( \frac{\partial \eta _{t}}{\partial \Theta _{i}}\right) '\frac{\partial F_{t}^{-1}}{\partial \Theta _{j}}\eta _{t} -\left( \frac{\partial \eta _{t}}{\partial \Theta _{i}}\right) 'F_{t}^{-1}\frac{\partial \eta _{t}}{\partial \Theta _{j}}.$$

(58)

Taking expected values of Eq. (58), the terms that are linear in $\eta _{t}$ vanish and we obtain

$$E\left( \frac{\partial L_{2_{t}}}{\partial \Theta _{j}}\right) =-\left( \frac{\partial \eta _{t}}{\partial \Theta _{i}}\right) 'F_{t}^{-1}\frac{\partial \eta _{t}}{\partial \Theta _{j}}.$$

(59)

The $ij$-th element of the information matrix is the negative of the sum of Eqs. (57) and (59), summed over all time periods. Thus

$$H_{ij}=\frac{1}{2}\sum _{t}\mathrm {tr}\left[ F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{i}}F_{t}^{-1}\frac{\partial F_{t}}{\partial \Theta _{j}}\right] +\sum _{t}\left( \frac{\partial \eta _{t}}{\partial \Theta _{i}}\right) 'F_{t}^{-1}\frac{\partial \eta _{t}}{\partial \Theta _{j}}.$$

(60)
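Given the innovation covariances and the parameter derivatives for every time step, the information matrix can be accumulated as in the sketch below. It takes the negative of Eqs. (57) and (59), so the trace term carries the factor $1/2$ from Eq. (57). The function name and the array layout are assumptions of this example, and the derivatives themselves are taken as given (in practice they would come from differentiating the Kalman filter recursions analytically or numerically).

```python
import numpy as np

def information_matrix(F, dF, deta):
    """Accumulate H_ij over time.

    F    : (T, n, n)     innovation covariances F_t
    dF   : (T, k, n, n)  dF[t, i] = dF_t / dTheta_i
    deta : (T, k, n)     deta[t, i] = d eta_t / dTheta_i
    """
    T, k = dF.shape[0], dF.shape[1]
    H = np.zeros((k, k))
    for t in range(T):
        F_inv = np.linalg.inv(F[t])
        FdF = [F_inv @ dF[t, i] for i in range(k)]   # F_t^{-1} dF_t/dTheta_i
        for i in range(k):
            for j in range(k):
                H[i, j] += (0.5 * np.trace(FdF[i] @ FdF[j])
                            + deta[t, i] @ F_inv @ deta[t, j])
    return H

# Scalar toy values (hypothetical): F = 2, dF/dTheta = 1, d eta/dTheta = 3.
H = information_matrix(np.array([[[2.0]]]),
                       np.array([[[[1.0]]]]),
                       np.array([[[3.0]]]))
print(H)
```

By construction the result is symmetric, so only the upper triangle needs to be computed when the number of parameters is large.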

Eq. (60) requires $\eta _{t}$ and its covariance $F_{t}$, which are computed numerically by the recursions of the Kalman filter (Durbin and Koopman 2012). In turn, the updated estimate of $\Theta$ is fed back into the equations of the Kalman filter. Iterating these two steps until convergence yields the estimate of the DFM.

About this article

Cite this article

Sun, Z., Liu, X. & Wang, L. A hybrid segmentation method for multivariate time series based on the dynamic factor model. Stoch Environ Res Risk Assess 31, 1291–1304 (2017). https://doi.org/10.1007/s00477-016-1323-6

Published: 04 October 2016
Issue Date: August 2017
DOI: https://doi.org/10.1007/s00477-016-1323-6
