Simultaneous Multiple Change Points Estimation in Generalized Linear Models

Sun, Xiaoying; Wu, Yuehua

doi:10.1007/978-3-030-46161-4_22


Abstract

This paper considers multiple change-point estimation in generalized linear models when both the number of change points and their locations are unknown. The proposed method first partitions the data sequence into segments to construct a new design matrix, then converts the multiple change-point estimation problem into a variable selection problem, and finally applies a regularized model selection technique to estimate the regression coefficients. Consistency of the estimator is established whether or not a change point exists, in a setting where the number of coefficients may diverge as the sample size goes to infinity. An algorithm is provided to estimate the multiple change points. Simulation studies are conducted for the logistic and log-linear models. A real data application is also presented.
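The abstract does not spell out how the segment-based design matrix is built, so the following is only a minimal sketch under stated assumptions: the $n$ observations are split into consecutive, nearly equal-length segments, and the coefficients are reparametrized as increments between adjacent segments, so that a nonzero increment block points to a segment where the regression coefficients change. The function name `build_segment_design` and this particular block layout are illustrative choices, not taken from the chapter.

```python
import numpy as np

def build_segment_design(X, n_segments):
    """Expand an (n, p) design X into an (n, p * n_segments) block design.

    Assumption (not from the chapter): block k holds x_t for every t from the
    start of segment k onward, so the coefficient of block k is the increment
    beta^(k) - beta^(k-1) between adjacent segments.
    """
    n, p = X.shape
    # Segment boundaries for consecutive, nearly equal-length segments.
    bounds = np.linspace(0, n, n_segments + 1).astype(int)
    Z = np.zeros((n, p * n_segments))
    for k in range(n_segments):
        start = bounds[k]
        # Rows from the start of segment k to the end carry x_t in block k.
        Z[start:, k * p:(k + 1) * p] = X[start:]
    return Z, bounds
```

With this layout, a penalized GLM fit on `Z` that sets an entire increment block to zero indicates no change in that segment, which is how a change-point problem can be read as a variable selection problem.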


References

Antoch, J., Gregoire, G., Jarušková, D.: Detection of structural changes in generalized linear models. Stat. Probab. Lett. 69, 315–332 (2004)
Csörgö, M., Horváth, L.: Limit Theorems in Change-point Analysis. Wiley, New York (1997)
Davis, R.A., Lee, T.C.M., Rodriguez-Yam, G.A.: Structural break estimation for nonstationary time series models. J. Am. Stat. Assoc. 101, 223–239 (2006)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Fan, J., Feng, Y., Saldana, D.F., Samworth, R., Wu, Y.: SIS: Sure Independence Screening (2010)
Fan, J., Peng, H.: Nonconcave penalized likelihood with a diverging number of parameters. Ann. Stat. 32, 928–961 (2004)
Fanaee-T, H., Gama, J.: Event labeling combining ensemble detectors and background knowledge. Prog. Artif. Intell. 2, 113–127 (2014)
Hušková, M., Meintanis, S.G.: Change point analysis based on empirical characteristic functions. Metrika 63, 145–168 (2006)
Jiang, D., Huang, J.: Majorization minimization by coordinate descent for concave penalized generalized linear models. Stat. Comput. 24, 871–883 (2014)
Jin, B., Shi, X., Wu, Y.: A novel and fast methodology for simultaneous multiple structural break estimation and variable selection for nonstationary time series models. Stat. Comput. 23, 221–231 (2013)
Jin, B., Wu, Y., Shi, X.: Consistent two-stage multiple changepoint detection in linear models. Can. J. Stat. 44, 161–179 (2016)
Lu, Q., Wang, X.L.: An extended cumulative logit model for detecting a shift in frequencies of sky-cloudiness conditions. J. Geophys. Res. 117, D16210 (2012). https://doi.org/10.1029/2012JD017893
Matteson, D.S., James, N.A.: A nonparametric approach for multiple change point analysis of multivariate data. J. Am. Stat. Assoc. 109, 334–345 (2014)
Page, E.S.: Continuous inspection schemes. Biometrika 41, 100–115 (1954)
Page, E.S.: A test for a change in a parameter occurring at an unknown point. Biometrika 42, 523–527 (1955)
Robbins, M.W., Lund, R.B., Gallagher, C.M., Lu, Q.: Changepoints in the North Atlantic tropical cyclone record. J. Am. Stat. Assoc. 106, 89–99 (2011)
Tan, C., Shi, X., Sun, X., Wu, Y.: On nonparametric change point estimator based on empirical characteristic functions. Sci. China Math. 59, 2463–2484 (2016)
Zhang, C.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38, 894–942 (2010)


Acknowledgements

The work was supported by the Natural Sciences and Engineering Research Council of Canada [RGPIN-2017-05720].

Author information

Authors and Affiliations

York University, Toronto, Canada
Xiaoying Sun & Yuehua Wu


Corresponding author

Correspondence to Yuehua Wu.

Editor information

Editors and Affiliations

Department of Financial Engineering, Princeton University, Princeton, NJ, USA
Jianqing Fan
Department of Mathematics, The University of Manchester, Manchester, UK
Jianxin Pan

Appendices

Appendix A: A Single Change Point Detection and Estimation in GLM

Consider the following model

$$\begin{aligned} g(\mu _t) = \left\{ \begin{array}{ll} {x}_t^T {\beta }, \ \ \ \ \ &{} t = \hbox {1, 2,} \ldots , l ,\\ {x}_t^T {\beta }^{*}, \ \ \ \ \ &{} t = l+\hbox {1}, l+\hbox {2}, \ldots , n.\\ \end{array} \right. \end{aligned}$$

Test $H_0:\ l = n$ and $H_1:\ l < n$.

The test statistic proposed in Antoch et al. [1] is summarized as follows. The maximum likelihood estimator $\widehat{{\beta }}$ of ${\beta }$ under $H_0$ is defined as the solution of the following system of equations: $\sum _{t=1}^n(y_t-g^{-1}({x}_t^T {\beta }))x_{tj}=0,\ j = 1, 2, \ldots , p$. Then $\widehat{\mu }_t=b'({x}_t^T\widehat{{\beta }})$ and $\widehat{\sigma }_t^2 = a(\phi )b''({x}_t^T\widehat{{\beta }})$, where $\phi $ is assumed to be known. Let $\widehat{S}(\tilde{l}) = \sum _{t=1}^{\tilde{l}} (y_t - \widehat{\mu }_t)^T{x}_t$, $\widehat{F}(\tilde{l}) = \sum _{t=1}^{\tilde{l}} \widehat{\sigma }_t^2{x}_t{x}_t^T$, $\widehat{F}(n) = \sum _{t=1}^{n} \widehat{\sigma }_t^2{x}_t{x}_t^T$, and $\widehat{D}(\tilde{l}) = \widehat{F}(\tilde{l}) - \widehat{F}(\tilde{l}) \widehat{F}(n)^{-1}\widehat{F}(\tilde{l})^T$. Assume that there exists $k_0$ such that $\widehat{D}(\tilde{l})$ is positive definite for all $k_0< \tilde{l} < n-k_0$. The resulting test statistic is $T = \max _{k_0< \tilde{l} < n-k_0} \widehat{S}(\tilde{l})^T\widehat{D}(\tilde{l})^{-1}\widehat{S}(\tilde{l})$. Antoch et al. [1] also showed that, under $H_0$, the limiting distribution of the test statistic is

$$P\left( T \le 2\log \log n + (p+1)\log \log \log n + 2t - 2\log \Gamma \left( \frac{p+1}{2}\right) \right) \rightarrow \exp \{-2e^{-t}\}.$$

The asymptotic critical value for the test statistic at a given significance level can be obtained from this limiting distribution.
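As an illustration of how such a critical value could be computed, the sketch below solves $\exp \{-2e^{-t}\} = 1-\alpha $ for $t$ and plugs it into the threshold $2\log \log n + (p+1)\log \log \log n + 2t - 2\log \Gamma ((p+1)/2)$ from the display above. The function name `asymptotic_critical_value` and its argument names are hypothetical, chosen here for illustration only.

```python
import math

def asymptotic_critical_value(n, p, alpha):
    """Threshold c such that P(T <= c) is approximately 1 - alpha under H_0.

    n: sample size, p: number of regression coefficients,
    alpha: significance level in (0, 1).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n <= math.e:
        raise ValueError("n must exceed e so that log log log n is defined")
    # exp(-2 e^{-t}) = 1 - alpha  <=>  t = -log(-log(1 - alpha) / 2)
    t = -math.log(-0.5 * math.log(1.0 - alpha))
    loglog_n = math.log(math.log(n))
    return (2.0 * loglog_n
            + (p + 1) * math.log(loglog_n)
            + 2.0 * t
            - 2.0 * math.lgamma((p + 1) / 2.0))
```

Under this sketch, $H_0$ would be rejected at level $\alpha $ when the observed $T$ exceeds `asymptotic_critical_value(n, p, alpha)`.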

If $H_0$ is rejected, the estimate of $l$ is given by

$$\widehat{l} ={\arg \max }_{k_0< \tilde{l} < n-k_0} \widehat{S}(\tilde{l})^T\widehat{D}(\tilde{l})^{-1}\widehat{S}(\tilde{l}).$$
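The scan over candidate locations can be sketched in a few lines for the logistic model, where the link is canonical and $a(\phi ) = 1$, so that $\widehat{\sigma }_t^2 = \widehat{\mu }_t(1-\widehat{\mu }_t)$. This is a minimal sketch, not the authors' implementation: the helper names `fit_logistic_mle` and `single_change_point_scan`, the plain Newton-Raphson fit, and the choice of `k0` are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic_mle(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson solution of sum_t (y_t - mu_t) x_t = 0 for the logit link."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        weights = mu * (1.0 - mu)
        score = X.T @ (y - mu)
        info = X.T @ (X * weights[:, None])
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def single_change_point_scan(X, y, k0):
    """Return (T, l_hat): the maximum of S(l)^T D(l)^{-1} S(l) and its location."""
    n, _ = X.shape
    beta_hat = fit_logistic_mle(X, y)            # fitted under H_0 (no change)
    mu_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))
    sigma2_hat = mu_hat * (1.0 - mu_hat)         # b''(x_t^T beta_hat), a(phi) = 1
    # S[l-1] holds S_hat(l); F[l-1] holds F_hat(l).
    S = np.cumsum((y - mu_hat)[:, None] * X, axis=0)
    F = np.cumsum(sigma2_hat[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
    F_n_inv = np.linalg.inv(F[-1])
    stats = np.full(n, -np.inf)
    for l in range(k0 + 1, n - k0):              # candidates with k0 < l < n - k0
        F_l = F[l - 1]
        D_l = F_l - F_l @ F_n_inv @ F_l
        s_l = S[l - 1]
        stats[l - 1] = s_l @ np.linalg.solve(D_l, s_l)
    l_hat = int(np.argmax(stats)) + 1
    return stats[l_hat - 1], l_hat
```

The returned `T` would be compared with the asymptotic critical value, and `l_hat` used as the change-point estimate only when the test rejects. The sketch assumes `k0` is large enough for every `D_l` in the scan to be positive definite, as required in the text.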

Appendix B: Proof of Theorem 22.1

Consider a ball $\Vert {\gamma }_n - {\gamma }_n^0\Vert \le M\sqrt{q_n/n}$ for some finite M.

$$\begin{aligned}&Q({\gamma }_n)= \mathscr {L}_1({\gamma }_n) - n \sum _{j=1}^{p q_n} p_{\lambda _n} (|\gamma _{nj}|)\\= & {} \sum _{t = 1}^n \left( \frac{y_{nt}({z}_{nt}^T{\gamma }_n)-b({z}_{nt}^T {\gamma }_n)}{a(\phi )} + c(y_{nt}, \phi )\right) - n \sum _{j=1}^{p q_n} p_{\lambda _n} (|\gamma _{nj}|) \\= & {} \mathscr {L}({\gamma }_n) - n \sum _{j=1}^{p q_n} p_{\lambda _n} (|\gamma _{nj}|) + \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}} \frac{y_{nt}({w}_{nt}^T {\gamma }_n)}{a(\phi )}\\&- \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}}\frac{b({z}_{nt}^T {\gamma }_n) - b({z}_{nt}^T {\gamma }_n - {w}_{nt}^T {\gamma }_n)}{a(\phi )} \end{aligned}$$

where ${w}_{nt} = {0}$ for $t \notin \{n - (q_n-n_i+1)m +1, \ldots , l_{n,i}\}$.

First, we consider $\Vert {\gamma }_n - {\gamma }_n^0\Vert = M\sqrt{q_n/n}$.

$$\begin{aligned}&Q({\gamma }_n) - Q({\gamma }_n^0) \\= & {} (\mathscr {L}({\gamma }_n) - \mathscr {L}({\gamma }_n^0)) - n \sum _{j \in \mathscr {A}_n} (p_{\lambda _n} (|\gamma _{nj}|) - p_{\lambda _n} (|\gamma _{nj}^0|)) - n \sum _{j \in \mathscr {A}_n^c} (p_{\lambda _n} (|\gamma _{nj}|) - p_{\lambda _n} (|\gamma _{nj}^0|))\\&+ \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}} \frac{y_{nt}({w}_{nt}^T ({\gamma }_n- {\gamma }_n^0)) }{a(\phi )} - \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}}\frac{b({z}_{nt}^T {\gamma }_n) - b({z}_{nt}^T {\gamma }_n^0)}{a(\phi )}\\&+ \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}}\frac{b({z}_{nt}^T {\gamma }_n - {w}_{nt}^T {\gamma }_n) - b({z}_{nt}^T {\gamma }_n^0 - {w}_{nt}^T {\gamma }_n^0) }{a(\phi )}. \end{aligned}$$

As $p_{\lambda _n}(0) = 0$ and $p_{\lambda _n} (|\gamma _{nj}|) \ge 0$, we have

$$\begin{aligned}&Q({\gamma }_n) - Q({\gamma }_n^0) \\\le & {} [\mathscr {L}({\gamma }_n) - \mathscr {L}({\gamma }_n^0)]- n \sum _{j \in \mathscr {A}_n}[p'_{\lambda _n} (|\gamma _{nj}^0|)\text {sign}(\gamma _{nj}^0)(\gamma _{nj} - \gamma _{nj}^0) \\+ & {} p''_{\lambda _n} (|\gamma _{nj}^0|)(\gamma _{nj} - \gamma _{nj}^0)^2(1+o_P(1))] \\+ & {} \sum _{i=1}^{s} \sum _{t = \varsigma _i}^{l_{n,i}} a(\phi )^{-1} [y_{nt}({w}_{nt}^T ({\gamma }_n- {\gamma }_n^0)) - \frac{\partial b({z}_{nt}^T {\gamma }_n^*)}{\partial {\gamma }_n}{z}_{nt}^T ({\gamma }_n- {\gamma }_n^0) \\+ & {} \frac{\partial b({z}_{nt}^T {\gamma }_n^* - {w}_{nt}^T {\gamma }_n^*)}{\partial {\gamma }_n}({z}_{nt}^T -{w}_{nt}^T)({\gamma }_n- {\gamma }_n^0)]\\= & {} A_1 + A_2 + A_3, \end{aligned}$$

where $\Vert {\gamma }_n^* - {\gamma }_n^0\Vert \le M\sqrt{q_n/n}$.

By the Taylor expansion and Assumption 4, $A_1 = \mathscr {L}({\gamma }_n) - \mathscr {L}({\gamma }_n^0) = -M^2O_P(q_n).$ By Assumption 2, $p'_{\lambda _n} (|\gamma ^0_{nj}|) = p''_{\lambda _n} (|\gamma ^0_{nj}|)= 0$ for $j \in \mathscr {A}_n$ and large $n$. Then $|A_2| = o_P(\sqrt{q_n})$. By Assumption 5, $|A_3| = O_P(\sqrt{nq_n}) (M\sqrt{q_n/n}) = M\,O_P(q_n)$. Since $A_1$ is of order $M^2 q_n$ while $A_3$ is of order $M q_n$, choosing $M$ sufficiently large makes the first term dominate the other terms. Since $A_1$ is negative, for $\varepsilon > 0$, there exists a large constant $M$ such that

$$P\left\{ \sup _{\Vert {\gamma }_n - {\gamma }_n^0\Vert = M\sqrt{q_n/n}} Q({\gamma }_n) < Q({\gamma }_n^0)\right\} \ge 1 - \varepsilon .$$

This implies that, with probability at least $1-\varepsilon $, there exists a local maximizer in the ball $\{{\gamma }_n:\Vert {\gamma }_n - {\gamma }_n^0\Vert \le M\sqrt{q_n/n}\}$. Hence, there exists a local maximizer $\widehat{{\gamma }}_n$ such that $\Vert \widehat{{\gamma }}_n - {\gamma }_n^0\Vert = O_P(\sqrt{q_n/n})$.

Next, for $j \in \mathscr {A}_n^c$, a standard Taylor expansion of the function ${\partial \mathscr {L} ({\gamma }_n) }/{\partial \gamma _{nj}}$ at ${\gamma }_n^0$ gives

$$\begin{aligned}&\frac{\partial Q({\gamma }_n) }{\partial \gamma _{nj}}\\= & {} \frac{\partial \mathscr {L} ({\gamma }_n^0) }{\partial \gamma _{nj}} + \sum _{j' =1}^{pq_n}(\gamma _{nj'}-\gamma _{nj'}^0)\frac{\partial ^2 \mathscr {L} ({\gamma }_n^0) }{\partial \gamma _{nj}\,\partial \gamma _{nj'}}(1 + o_P(1))- n p'_{\lambda _n} (|\gamma _{nj}|)\text {sign}(\gamma _{nj})\\+ & {} O_P(\sqrt{nq_n})\\= & {} O_P(\sqrt{nq_n}) + O_P(\sqrt{nq_n}) - n p'_{\lambda _n} (|\gamma _{nj}|)\text {sign}(\gamma _{nj}) + O_P(\sqrt{nq_n})\\= & {} n\lambda _n \left[ O_P\left( \frac{\sqrt{q_n/n}}{\lambda _n}\right) - \lambda _n^{-1} p'_{\lambda _n} (|\gamma _{nj}|)\text {sign}(\gamma _{nj})\right] \end{aligned}$$

by Assumption 1. Since ${\sqrt{q_n/n}}/{\lambda _n} \rightarrow 0$ by Assumption 22.1, this entails that the sign of ${\partial Q({\gamma }_n) }/{\partial \gamma _{nj}}$ is determined by the sign of $\gamma _{nj}$ inside the neighborhood of ${\gamma }_n^0$ with radius $M\sqrt{q_n/n}$ by Assumption 3. That is, ${\partial Q({\gamma }_n) }/{\partial \gamma _{nj}} > 0$ for $\gamma _{nj} < 0$ and ${\partial Q({\gamma }_n) }/{\partial \gamma _{nj}} < 0$ for $\gamma _{nj} > 0$. Therefore, for any local maximizer $\widehat{{\gamma }}_n$ inside this ball, $\widehat{{\gamma }}_{n\mathscr {A}_n^c} = 0$ with probability tending to one. This completes the proof.
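The two penalty properties used in the proof, namely that $p'_{\lambda _n}$ vanishes at the large true coefficients and that $\lambda _n^{-1}p'_{\lambda _n}$ stays positive near zero, can be checked numerically for a concrete penalty. The sketch below uses the SCAD penalty of Fan and Li (2001), which appears in the reference list; whether the chapter's assumptions are stated for SCAD specifically is not visible in this excerpt, so the choice of penalty, the function name `scad_derivative`, and the default $a = 3.7$ should be read as assumptions.

```python
def scad_derivative(theta, lam, a=3.7):
    """First derivative p'_lam(|theta|) of the SCAD penalty (requires a > 2).

    p'_lam(t) = lam                          for 0 <= t <= lam
              = max(a*lam - t, 0) / (a - 1)  for t > lam
    """
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1.0)


# Near zero the scaled derivative equals 1, so the penalty term can dominate
# the O_P(sqrt(q_n / n) / lam) term and push small coefficients to exactly zero.
small = scad_derivative(0.0, 0.5) / 0.5
# Beyond a * lam the derivative is zero, so large true coefficients are not shrunk.
large = scad_derivative(2.0, 0.5)
```

Here `small` is the value of $\lambda ^{-1}p'_{\lambda }(0+)$ and `large` is the derivative at a coefficient well above $a\lambda $, mirroring the roles the penalty plays in the two halves of the proof.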

About this chapter

Cite this chapter

Sun, X., Wu, Y. (2020). Simultaneous Multiple Change Points Estimation in Generalized Linear Models. In: Fan, J., Pan, J. (eds) Contemporary Experimental Design, Multivariate Analysis and Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-030-46161-4_22

DOI: https://doi.org/10.1007/978-3-030-46161-4_22
Published: 23 May 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46160-7
Online ISBN: 978-3-030-46161-4
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)
