Simultaneous Multiple Change Points Estimation in Generalized Linear Models

Contemporary Experimental Design, Multivariate Analysis and Data Mining

Abstract

In this paper, the problem of multiple change points estimation is considered for generalized linear models, in which both the number of change points and their locations are unknown. The proposed method first partitions the data sequence into segments to construct a new design matrix, then converts the multiple change points estimation problem into a variable selection problem, and finally applies a regularized model selection technique to obtain the regression coefficient estimates. The consistency of the estimator is established whether or not a change point exists, in a setting where the number of coefficients can diverge as the sample size goes to infinity. An algorithm is provided to estimate the multiple change points. Simulation studies are conducted for the logistic and log-linear models. A real data application is also presented.
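The first two steps can be illustrated with a short sketch. It assumes a construction that the abstract does not spell out: the n observations are cut into q segments of length m, with the first segment absorbing any remainder, and block k of the new design matrix carries x_t for every t from the start of segment k onward, so that the k-th coefficient block acts as the jump in the regression coefficients at that segment. The names `segment_starts`, `segmented_design`, and `m` are hypothetical.

```python
import numpy as np

def segment_starts(n, m):
    """0-based start index of each segment; the first segment absorbs the remainder."""
    q = n // m                      # number of segments (requires n >= m)
    first_len = n - (q - 1) * m     # length of the first, possibly longer, segment
    return np.concatenate(([0], first_len + m * np.arange(q - 1)))

def segmented_design(X, m):
    """Build an n x (p*q) matrix whose k-th block holds x_t for t >= start of segment k."""
    n, p = X.shape
    starts = segment_starts(n, m)
    Z = np.zeros((n, p * len(starts)))
    for k, s in enumerate(starts):
        # Rows before segment k stay zero, so block k only affects later observations.
        Z[s:, k * p:(k + 1) * p] = X[s:]
    return Z, starts
```

Under this layout, a coefficient block that a penalized fit estimates as nonzero flags its segment as a candidate location of a change point, which is how the estimation problem turns into variable selection.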

References

  1. Antoch, J., Gregoire, G., Jarušková, D.: Detection of structural changes in generalized linear models. Stat. Probab. Lett. 69, 315–332 (2004)
  2. Csörgö, M., Horváth, L.: Limit Theorems in Change-point Analysis. Wiley, New York (1997)
  3. Davis, R.A., Lee, T.C.M., Rodriguez-Yam, G.A.: Structural break estimation for nonstationary time series models. J. Am. Stat. Assoc. 101, 223–239 (2006)
  4. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
  5. Fan, J., Feng, Y., Saldana, D.F., Samworth, R., Wu, Y.: SIS: Sure Independence Screening (2010)
  6. Fan, J., Peng, H.: Nonconcave penalized likelihood with a diverging number of parameters. Ann. Stat. 32, 928–961 (2004)
  7. Fanaee-T, H., Gama, J.: Event labeling combining ensemble detectors and background knowledge. Prog. Artif. Intell. 2, 113–127 (2014)
  8. Hušková, M., Meintanis, S.G.: Change point analysis based on empirical characteristic functions. Metrika 63, 145–168 (2006)
  9. Jiang, D., Huang, J.: Majorization minimization by coordinate descent for concave penalized generalized linear models. Stat. Comput. 24, 871–883 (2014)
  10. Jin, B., Shi, X., Wu, Y.: A novel and fast methodology for simultaneous multiple structural break estimation and variable selection for nonstationary time series models. Stat. Comput. 23, 221–231 (2013)
  11. Jin, B., Wu, Y., Shi, X.: Consistent two-stage multiple changepoint detection in linear models. Can. J. Stat. 44, 161–179 (2016)
  12. Lu, Q., Wang, X.L.: An extended cumulative logit model for detecting a shift in frequencies of sky-cloudiness conditions. J. Geophys. Res. 117, D16210 (2012). https://doi.org/10.1029/2012JD017893
  13. Matteson, D.S., James, N.A.: A nonparametric approach for multiple change point analysis of multivariate data. J. Am. Stat. Assoc. 109, 334–345 (2014)
  14. Page, E.S.: Continuous inspection schemes. Biometrika 41, 100–115 (1954)
  15. Page, E.S.: A test for a change in a parameter occurring at an unknown point. Biometrika 42, 523–527 (1955)
  16. Robbins, M.W., Lund, R.B., Gallagher, C.M., Lu, Q.: Changepoints in the North Atlantic tropical cyclone record. J. Am. Stat. Assoc. 106, 89–99 (2011)
  17. Tan, C., Shi, X., Sun, X., Wu, Y.: On nonparametric change point estimator based on empirical characteristic functions. Sci. China Math. 59, 2463–2484 (2016)
  18. Zhang, C.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38, 894–942 (2010)

Acknowledgements

The work was supported by the Natural Sciences and Engineering Research Council of Canada [RGPIN-2017-05720].

Author information

Corresponding author

Correspondence to Yuehua Wu.

Appendices

Appendix A: A Single Change Point Detection and Estimation in GLM

Consider the following model

$$\begin{aligned} g(\mu _t) = \left\{ \begin{array}{ll} {x}_t^T {\beta }, &{} t = 1, 2, \ldots , l,\\ {x}_t^T {\beta }^{*}, &{} t = l+1, l+2, \ldots , n. \end{array} \right. \end{aligned}$$

Test \(H_0:\ l = n\) against \(H_1:\ l < n\).

The test statistic proposed in Antoch et al. [1] is summarized as follows. The maximum likelihood estimator \(\widehat{{\beta }}\) of \({\beta }\), fitted to all n observations, is defined as the solution of the following system of equations: \(\sum _{t=1}^n(y_t-g^{-1}({x}_t^T {\beta }))x_{tj}=0,\ j = 1, 2, \ldots , p\). Then \(\widehat{\mu }_t=b'({x}_t^T\widehat{{\beta }})\) and \(\widehat{\sigma }_t^2 = a(\phi )b''({x}_t^T\widehat{{\beta }})\), where \(\phi \) is assumed to be known. Let \(\widehat{S}(\tilde{l}) = \sum _{t=1}^{\tilde{l}} (y_t - \widehat{\mu }_t){x}_t\), \(\widehat{F}(\tilde{l}) = \sum _{t=1}^{\tilde{l}} \widehat{\sigma }_t^2{x}_t{x}_t^T\), \(\widehat{F}(n) = \sum _{t=1}^{n} \widehat{\sigma }_t^2{x}_t{x}_t^T\), and \(\widehat{D}(\tilde{l}) = \widehat{F}(\tilde{l}) - \widehat{F}(\tilde{l}) \widehat{F}(n)^{-1}\widehat{F}(\tilde{l})^T\). Assume that there exists \(k_0\) such that \(\widehat{D}(\tilde{l})\) is positive definite for all \(k_0< \tilde{l} < n-k_0\). The test statistic is then \(T = \max _{k_0< \tilde{l} < n-k_0} \widehat{S}(\tilde{l})^T\widehat{D}(\tilde{l})^{-1}\widehat{S}(\tilde{l})\). Antoch et al. [1] also showed that, under \(H_0\), the limiting distribution of the test statistic satisfies

$$P\left( T \le 2\log \log n + (p+1)\log \log \log n + 2t - 2\log \Gamma \left( \frac{p+1}{2}\right) \right) \rightarrow \exp \{-2e^{-t}\}.$$

The asymptotic critical value for the test statistic at a given significance level can be obtained from this limiting distribution.
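As a rough illustration of how such a critical value could be computed, the sketch below solves \(\exp \{-2e^{-t}\} = 1-\alpha \) for t and plugs the result into the bound inside the probability above. The function name `asymptotic_critical_value` and the argument `alpha` are hypothetical, and the formula is taken as stated in this appendix rather than re-derived.

```python
import math

def asymptotic_critical_value(n, p, alpha):
    """Value c with P(T <= c) -> 1 - alpha under H0, using the limit stated above."""
    if n <= math.e:
        raise ValueError("n must exceed e so that log log log n is defined")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Solve exp(-2 * exp(-t)) = 1 - alpha for t.
    t = -math.log(-0.5 * math.log(1.0 - alpha))
    loglog_n = math.log(math.log(n))
    return (2.0 * loglog_n
            + (p + 1) * math.log(loglog_n)
            + 2.0 * t
            - 2.0 * math.lgamma((p + 1) / 2.0))
```

One would reject \(H_0\) at level \(\alpha \) when the observed T exceeds the returned value; because the convergence is asymptotic, the actual level in finite samples may differ.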

In the case that \(H_0\) is rejected, the estimate of \(l\) is given by

$$\widehat{l} = \mathop {\arg \max }_{k_0< \tilde{l} < n-k_0} \widehat{S}(\tilde{l})^T\widehat{D}(\tilde{l})^{-1}\widehat{S}(\tilde{l}).$$
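The quantities in this appendix can be traced in a small numerical sketch. It assumes a Poisson log-linear model with canonical link, so that \(a(\phi ) = 1\) and \(b'(\eta ) = b''(\eta ) = e^{\eta }\); the function names, the Newton-Raphson fit, the trimming value `k0`, and the simulated data are all illustrative choices, not part of the original text.

```python
import numpy as np

def fit_poisson_mle(X, y, n_iter=100, tol=1e-10):
    """Solve sum_t (y_t - exp(x_t' beta)) x_t = 0 by Newton-Raphson."""
    # Least-squares fit to log(y + 0.5) as a starting value.
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def max_score_statistic(X, y, k0):
    """Return (T, l_hat): the maximum of S(l)' D(l)^{-1} S(l) over k0 < l < n - k0."""
    n, p = X.shape
    beta_hat = fit_poisson_mle(X, y)          # fit under H0, using all n observations
    mu_hat = np.exp(X @ beta_hat)
    sigma2_hat = mu_hat                       # a(phi) b''(eta) = exp(eta) for Poisson
    # S[l-1] and F[l-1] hold the partial sums over t = 1, ..., l.
    S = np.cumsum((y - mu_hat)[:, None] * X, axis=0)
    F = np.cumsum(sigma2_hat[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
    F_n_inv = np.linalg.inv(F[-1])
    best_T, best_l = -np.inf, None
    for l in range(k0 + 1, n - k0):
        F_l = F[l - 1]
        D_l = F_l - F_l @ F_n_inv @ F_l.T
        stat = S[l - 1] @ np.linalg.solve(D_l, S[l - 1])
        if stat > best_T:
            best_T, best_l = stat, l
    return best_T, best_l

# Simulated example: the intercept shifts after the first 200 of 400 observations.
rng = np.random.default_rng(0)
n, l_true = 400, 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_before, beta_after = np.array([0.5, 0.3]), np.array([1.5, 0.3])
eta = np.where(np.arange(n) < l_true, X @ beta_before, X @ beta_after)
y = rng.poisson(np.exp(eta))
T, l_hat = max_score_statistic(X, y, k0=20)
```

Here `l_hat` counts the observations assigned to the first regime, matching the role of \(l\) in the model above. Storing all partial sums of \(\widehat{F}\) costs memory of order \(np^2\), which is fine for a sketch but may need a running update for long series.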

Appendix B: Proof of Theorem 22.1

Consider a ball \(\Vert {\gamma }_n - {\gamma }_n^0\Vert \le M\sqrt{q_n/n}\) for some finite M.

$$\begin{aligned}&Q({\gamma }_n)= \mathscr {L}_1({\gamma }_n) - n \sum _{j=1}^{p q_n} p_{\lambda _n} (|\gamma _{nj}|)\\= & {} \sum _{t = 1}^n \left( \frac{y_{nt}({z}_{nt}^T{\gamma }_n)-b({z}_{nt}^T {\gamma }_n)}{a(\phi )} + c(y_{nt}, \phi )\right) - n \sum _{j=1}^{p q_n} p_{\lambda _n} (|\gamma _{nj}|) \\= & {} \mathscr {L}({\gamma }_n) - n \sum _{j=1}^{p q_n} p_{\lambda _n} (|\gamma _{nj}|) + \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}} \frac{y_{nt}({w}_{nt}^T {\gamma }_n)}{a(\phi )}\\&- \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}}\frac{b({z}_{nt}^T {\gamma }_n) - b({z}_{nt}^T {\gamma }_n - {w}_{nt}^T {\gamma }_n)}{a(\phi )}, \end{aligned}$$

where \({w}_{nt} = {0}\) for \(t \notin \{n - (q_n-n_i+1)m +1, \ldots , l_{n,i}\}\).

First, we consider \(\Vert {\gamma }_n - {\gamma }_n^0\Vert = M\sqrt{q_n/n}\).

$$\begin{aligned}&Q({\gamma }_n) - Q({\gamma }_n^0) \\= & {} (\mathscr {L}({\gamma }_n) - \mathscr {L}({\gamma }_n^0)) - n \sum _{j \in \mathscr {A}_n} (p_{\lambda _n} (|\gamma _{nj}|) - p_{\lambda _n} (|\gamma _{nj}^0|)) - n \sum _{j \in \mathscr {A}_n^c} (p_{\lambda _n} (|\gamma _{nj}|) - p_{\lambda _n} (|\gamma _{nj}^0|))\\&+ \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}} \frac{y_{nt}({w}_{nt}^T ({\gamma }_n- {\gamma }_n^0)) }{a(\phi )} - \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}}\frac{b({z}_{nt}^T {\gamma }_n) - b({z}_{nt}^T {\gamma }_n^0)}{a(\phi )}\\&+ \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}}\frac{b({z}_{nt}^T {\gamma }_n - {w}_{nt}^T {\gamma }_n) - b({z}_{nt}^T {\gamma }_n^0 - {w}_{nt}^T {\gamma }_n^0) }{a(\phi )}. \end{aligned}$$

As \(p_{\lambda _n}(0) = 0\) and \(p_{\lambda _n} (|\gamma _{nj}|) \ge 0\), we have

$$\begin{aligned}&Q({\gamma }_n) - Q({\gamma }_n^0) \\\le & {} [\mathscr {L}({\gamma }_n) - \mathscr {L}({\gamma }_n^0)]- n \sum _{j \in \mathscr {A}_n}[p'_{\lambda _n} (|\gamma _{nj}^0|)\text {sign}(\gamma _{nj}^0)(\gamma _{nj} - \gamma _{nj}^0) \\+ & {} p''_{\lambda _n} (|\gamma _{nj}^0|)(\gamma _{nj} - \gamma _{nj}^0)^2(1+o_P(1))] \\+ & {} \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}} a(\phi )^{-1} [y_{nt}({w}_{nt}^T ({\gamma }_n- {\gamma }_n^0)) - \frac{\partial b({z}_{nt}^T {\gamma }_n^*)}{\partial {\gamma }_n}{z}_{nt}^T ({\gamma }_n- {\gamma }_n^0) \\+ & {} \frac{\partial b({z}_{nt}^T {\gamma }_n^* - {w}_{nt}^T {\gamma }_n^*)}{\partial {\gamma }_n}({z}_{nt}^T -{w}_{nt}^T)({\gamma }_n- {\gamma }_n^0)]\\= & {} A_1 + A_2 + A_3, \end{aligned}$$

where \(\Vert {\gamma }_n^* - {\gamma }_n^0\Vert \le M\sqrt{q_n/n}\).

By the Taylor expansion and Assumption 4, \(A_1 = \mathscr {L}({\gamma }_n) - \mathscr {L}({\gamma }_n^0) = -M^2O_P(q_n).\) By Assumption 2, \(p'_{\lambda _n} (|\gamma ^0_{nj}|) = p''_{\lambda _n} (|\gamma ^0_{nj}|)= 0\) for \(j \in \mathscr {A}_n\) and large n, so \(|A_2| = o_P(\sqrt{q_n})\). By Assumption 5, \(|A_3| = O_P(\sqrt{nq_n}) (M\sqrt{q_n/n}) = M\,O_P(q_n)\). Because \(A_1\) is quadratic in M while \(A_3\) is only linear in M, the first term dominates the other terms once M is sufficiently large. Since \(A_1\) is negative, for any \(\varepsilon > 0\) there exists a large constant M such that
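The role of M can be made concrete with a rough worked bound. This is only a sketch: the constants \(c, C > 0\) are hypothetical and stand in for the \(O_P\) terms on an event of probability at least \(1-\varepsilon \), on which \(A_1 \le -cM^2q_n\) and \(|A_3| \le CMq_n\) (the single factor of M in the latter coming from the radius \(M\sqrt{q_n/n}\)). On that event,

$$A_1 + A_2 + A_3 \le -Mq_n(cM - C) + o_P(\sqrt{q_n}),$$

and the right-hand side is negative for large n as soon as \(M > C/c\).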

$$P\left\{ \sup _{\Vert {\gamma }_n - {\gamma }_n^0\Vert = M\sqrt{q_n/n}} Q({\gamma }_n) < Q({\gamma }_n^0)\right\} \ge 1 - \varepsilon .$$

This implies that, with probability at least \(1-\varepsilon \), there exists a local maximum in the ball \(\{{\gamma }_n:\Vert {\gamma }_n - {\gamma }_n^0\Vert \le M\sqrt{q_n/n}\}\). Hence, there exists a local maximizer \(\widehat{{\gamma }}_n\) such that \(\Vert \widehat{{\gamma }}_n - {\gamma }_n^0\Vert = O_P(\sqrt{q_n/n})\).

Next, consider \(j \in \mathscr {A}_n^c\). By the standard Taylor expansion of the function \({\partial \mathscr {L} ({\gamma }_n) }/{\partial \gamma _{nj}}\) at \({\gamma }_n^0\), we obtain

$$\begin{aligned}&\frac{\partial Q({\gamma }_n) }{\partial \gamma _{nj}}\\= & {} \frac{\partial \mathscr {L} ({\gamma }_n^0) }{\partial \gamma _{nj}} + \sum _{j' =1}^{pq_n}(\gamma _{nj'}-\gamma _{nj'}^0)\frac{\partial ^2 \mathscr {L} ({\gamma }_n^0) }{\partial \gamma _{nj}\partial \gamma _{nj'} }(1 + O_P(1))- n p'_{\lambda _n} (|\gamma _{nj}|)\text {sign}(\gamma _{nj})\\+ & {} O_P(\sqrt{nq_n})\\= & {} O_P(\sqrt{nq_n}) + O_P(\sqrt{nq_n}) - n p'_{\lambda _n} (|\gamma _{nj}|)\text {sign}(\gamma _{nj}) + O_P(\sqrt{nq_n})\\= & {} n\lambda _n \left[ O_P\left( \frac{\sqrt{q_n/n}}{\lambda _n}\right) - \lambda _n^{-1} p'_{\lambda _n} (|\gamma _{nj}|)\text {sign}(\gamma _{nj})\right] \end{aligned}$$

by Assumption 1. Since \({\sqrt{q_n/n}}/{\lambda _n} \rightarrow 0\) by Assumption 22.1, this entails that the sign of \({\partial Q({\gamma }_n) }/{\partial \gamma _{nj}}\) is determined by the sign of \(\gamma _{nj}\) inside the neighborhood of \({\gamma }_n^0\) with radius \(M\sqrt{q_n/n}\) by Assumption 3. That is, \({\partial Q({\gamma }_n) }/{\partial \gamma _{nj}} > 0\) for \(\gamma _{nj} < 0\) and \({\partial Q({\gamma }_n) }/{\partial \gamma _{nj}} < 0\) for \(\gamma _{nj} > 0\). Therefore, for any local maximizer \(\widehat{{\gamma }}_n\) inside this ball, \(\widehat{{\gamma }}_{n\mathscr {A}_n^c} = 0\) with probability tending to one. This completes the proof.
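The proof uses two properties of the penalty derivative: it vanishes at the large true coefficients, and it stays of order \(\lambda _n\) near zero. The sketch below checks both for the SCAD penalty of Fan and Li [4], chosen here only as an example of a concave penalty from the reference list; the text above does not state which penalty the assumptions are tailored to. The function name and the default \(a = 3.7\) are illustrative.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """Derivative p'_lambda(|theta|) of the SCAD penalty, evaluated elementwise."""
    theta = np.abs(np.asarray(theta, dtype=float))
    # Beyond lam the derivative decays linearly and is exactly zero from a * lam onward.
    tail = np.maximum(a * lam - theta, 0.0) / (a - 1.0)
    return np.where(theta <= lam, lam, tail)

lam = 0.5
near_zero = scad_derivative(1e-6, lam) / lam    # ratio p'/lambda close to the origin
far_away = scad_derivative(a_big := 5.0, lam)   # value at a coefficient well above a * lam
```

With this penalty the ratio near the origin equals one, which is the kind of positive lower bound the sign argument relies on, and the derivative is exactly zero for coefficients larger than \(a\lambda \), which matches the vanishing of \(p'_{\lambda _n}\) at the true nonzero coefficients.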

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Sun, X., Wu, Y. (2020). Simultaneous Multiple Change Points Estimation in Generalized Linear Models. In: Fan, J., Pan, J. (eds) Contemporary Experimental Design, Multivariate Analysis and Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-030-46161-4_22
