Coherent forecasting for stationary time series of discrete data

Maiti, Raju; Biswas, Atanu

doi:10.1007/s10182-014-0243-3

Original Paper
Published: 28 January 2015

Volume 99, pages 337–365, (2015)
AStA Advances in Statistical Analysis

Raju Maiti¹ &
Atanu Biswas¹


Abstract

This article considers coherent forecasting for discrete-valued stationary time series. For count time series, coherent forecasting methods such as median forecasting and mode forecasting are used to obtain $h$-step ahead coherent forecasts. Few existing works, however, address categorical time series. Here, we consider the case of a finite number of categories under several possible models, namely Pegram’s operator-based ARMA($p$,$q$) model, the mixture transition distribution model and the logistic regression model, and study their $h$-step ahead coherent forecasting. Some theoretical results are derived along with some numerical examples. To facilitate comparison among the three models, we use some forecasting measures. The procedure is illustrated using a real-life categorical data set, namely the infant sleep status data.

References

Agresti, A.: Categorical data analysis. Wiley, New Jersey (2002)
Al-Osh, M., Alzaid, A.A.: First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 8(3), 261–275 (1987)
Alzaid, A., Al-Osh, M.: An integer-valued pth-order autoregressive structure (INAR(p)) process. J. Appl. Probab. 27, 314–324 (1990)
Atkinson, K.: An introduction to numerical analysis. Wiley, New Delhi (2008)
Berchtold, A., Raftery, A.E.: The mixture transition distribution model for high-order Markov chains and non-Gaussian time series. Stat. Sci. 17(3), 328–356 (2002)
Biswas, A., Guha, A.: Time series analysis of categorical data using auto-mutual information. J. Stat. Plan. Inference 139(9), 3076–3087 (2009)
Biswas, A., Song, P.X.-K.: Discrete-valued ARMA processes. Stat. Prob. Lett. 79(17), 1884–1889 (2009)
Brockwell, P.J., Davis, R.A.: Time series: theory and methods. Springer, Berlin (2002)
Bu, R., McCabe, B.: Model selection, estimation and forecasting in INAR(p) models: a likelihood-based Markov chain approach. Int. J. Forecast. 24(1), 151–162 (2008)
Carruth, J., Tygert, M., Ward, R.: A comparison of the discrete Kolmogorov–Smirnov statistic and the Euclidean distance (2012). arXiv:1206.6367
Fokianos, K., Kedem, B.: Regression theory for categorical time series. Stat. Sci. 18(3), 357–376 (2003)
Freeland, R.K., McCabe, B.P.: Forecasting discrete valued low count time series. Int. J. Forecast. 20(3), 427–434 (2004)
Jacobs, P.A., Lewis, P.A.: Discrete time series generated by mixtures III: Autoregressive processes (DAR(p)). Technical report, Naval Postgraduate School, Monterey (1978c)
Jacobs, P.A., Lewis, P.A.: Discrete time series generated by mixtures. I: correlational and runs properties. J. Royal Stat. Soc. Ser. B (Methodological) 40(1), 94–105 (1978a)
Jacobs, P.A., Lewis, P.A.: Discrete time series generated by mixtures II: asymptotic properties. J. Royal Stat. Soc. Ser. B (Methodological) 40(2), 222–228 (1978b)
Jacobs, P.A., Lewis, P.A.: Stationary discrete autoregressive–moving average time series generated by mixtures. J. Time Ser. Anal. 4(1), 19–36 (1983)
Jung, R.C., Tremayne, A.R.: Coherent forecasting in integer time series models. Int. J. Forecast. 22(2), 223–238 (2006)
McKenzie, E.: Discrete variate time series. Handbook of statistics 21, 573–606 (2003)
McKenzie, E.: Some simple models for discrete variate time series. J. Am. Water Resour. Assoc. 21(4), 645–650 (1985)
McKenzie, E.: Some ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20(4), 822–835 (1988)
Pegram, G.: An autoregressive model for multilag Markov chains. J. Appl. Probab. 17(2), 350–362 (1980)
Raftery, A.E.: A model for high-order Markov chains. J. Royal Stat. Soc. Ser. B (Methodological) 47(3), 528–539 (1985)
Silva, N., Pereira, I., Silva, M.E.: Forecasting in INAR(1) model. REVSTAT Stat. J. 7(1), 119–134 (2009)
Stoffer, D.S., Scher, M.S., Richardson, G.A., Day, N.L., Coble, P.A.: A Walsh–Fourier analysis of the effects of moderate maternal alcohol consumption on neonatal sleep-state cycling. J. Am. Stat. Assoc. 83(404), 954–963 (1988)
Stoffer, D.S., Tyler, D.E., Wendt, D.A.: The spectral envelope and its applications. Stat. Sci. 15(3), 224–253 (2000)
Weiß, C.H., Göb, R.: Measuring serial dependence in categorical time series. Adv. Stat. Anal. 92(1), 71–89 (2008)
Weiß, C.H.: Empirical measures of signed serial dependence in categorical time series. J. Stat. Comput. Simul. 81(4), 411–429 (2011)
Weiß, C.H.: Serial dependence of NDARMA process. Comput. Stat. Data Anal. 68(1), 213–238 (2013)


Acknowledgments

The authors wish to thank the three anonymous referees and the associate editor for their careful reading and constructive suggestions which led to this improved version of the paper.

Author information

Authors and Affiliations

Applied Statistics Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata, 700 108, India
Raju Maiti & Atanu Biswas

Corresponding author

Correspondence to Atanu Biswas.

Appendix

1.1 Appendix A: Proof of Theorem 2

From the model (3.2), the 1-step ahead conditional distribution is given by

$$\begin{aligned} \begin{array}{lcl} p_1(i|i_1,\ldots ,i_p)&{}=&{}P(Y_{n+1}=C_{i}|Y_{n}=C_{i_1},\ldots ,Y_{n-p+1}=C_{i_p})\\ &{}=&{}\eta _{11}I(i_1=i)+\cdots +\eta _{1p}I(i_p=i)+(1-\eta _{11}-\cdots -\eta _{1p})p_i, \end{array} \end{aligned}$$

with $\eta _{1l}=\phi _{l}$, $l=1,\ldots ,p$. Then the two-step ahead conditional distribution is given by

$$\begin{aligned} p_2(i|i_1,\ldots ,i_p)&= P(Y_{n+2}=C_{i}|Y_{n}=C_{i_1},\ldots ,Y_{n-p+1}=C_{i_p})\\&= \displaystyle \sum _{j=0}^{k}P(Y_{n+2}=C_{i}|Y_{n+1}=C_{j},Y_{n}=C_{i_1},\ldots ,Y_{n-p+2}=C_{i_{p-1}})\\&\times P(Y_{n+1}=C_{j}|Y_{n}=C_{i_1},\ldots ,Y_{n-p+1}=C_{i_{p}})\\&= \displaystyle \sum _{j=0}^{k}\left\{ \eta _{11}I(j=i)+\cdots +\eta _{1p}I(i_{p-1}=i)+(1-\eta _{11}-\cdots -\eta _{1p})p_i\right\} \\&\times \left\{ \phi _1I(i_1=j)+\cdots +\phi _pI(i_p=j)+(1-\phi _1-\cdots -\phi _p)p_j\right\} \\&= \eta _{21}I(i_1=i)+\cdots +\eta _{2p}I(i_p=i)+(1-\eta _{21}-\cdots -\eta _{2p})p_i \end{aligned}$$

where $\varvec{\eta }_2=\varvec{\Phi }\varvec{\phi }$, so the result holds for $h=2$. Suppose it holds for $h-1$, that is, $\varvec{\eta }_{h-1}=\varvec{\Phi }^{h-2}\varvec{\phi }$. Applying the same conditioning argument to $p_h$ and using the induction hypothesis, it is straightforward to show that $\varvec{\eta }_{h}=\varvec{\Phi }^{h-1}\varvec{\phi }$ and hence that the $h$-step ahead conditional distribution is given by (3.5).
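The conditioning step used above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes categories are labelled $0,\ldots ,k$, that the weights $\phi _1,\ldots ,\phi _p$ are non-negative with sum below one, and it obtains the $h$-step ahead distribution by brute-force recursion on the one-step formula rather than through the closed form (3.5). The names `make_forecaster`, `one_step` and `forecast` are hypothetical.

```python
from functools import lru_cache


def make_forecaster(phi, p_marg):
    """Return a function forecast(h, state) for a Pegram-type AR(p) model.

    phi    : weights (phi_1, ..., phi_p), assumed non-negative with sum < 1
    p_marg : marginal probabilities (p_0, ..., p_k), assumed to sum to 1
    state  : (i_1, ..., i_p), most recent observed category first
    """
    phi = tuple(phi)
    p_marg = tuple(p_marg)
    n_cat = len(p_marg)
    resid = 1.0 - sum(phi)

    def one_step(i, state):
        # p_1(i | i_1..i_p) = sum_l phi_l I(i_l = i) + (1 - sum_l phi_l) p_i
        return sum(w for w, s in zip(phi, state) if s == i) + resid * p_marg[i]

    @lru_cache(maxsize=None)
    def forecast(h, state):
        if h == 1:
            return tuple(one_step(i, state) for i in range(n_cat))
        out = [0.0] * n_cat
        for j in range(n_cat):
            w = one_step(j, state)                        # P(Y_{n+1} = C_j | state)
            nxt = forecast(h - 1, (j,) + state[:-1])      # shift j into the state
            for i in range(n_cat):
                out[i] += w * nxt[i]
        return tuple(out)

    return forecast


# Illustrative (made-up) parameters: p = 2 lags, three categories.
forecast = make_forecaster(phi=(0.5, 0.2), p_marg=(0.2, 0.3, 0.5))
dist = forecast(2, (0, 1))
mode_forecast = max(range(len(dist)), key=dist.__getitem__)  # coherent (mode) forecast
```

Taking the mode of `dist` gives a forecast that is itself one of the categories, which is the sense in which it is coherent. The recursion visits every reachable state, so it is practical only for a small number of categories and lags.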

1.2 Appendix B: Proof of Theorem 3

To prove Theorem 3, it is enough to show that $\displaystyle \lim _{h\rightarrow \infty }\eta _{hi}=0$ for all $i$. To show this we use the result that, for any $n \times n$ matrix $A$ with eigenvalues $\lambda _1, \lambda _2,\ldots ,\lambda _s$, $\displaystyle \lim _{k \rightarrow \infty }A^k=0$ if the spectral radius $\rho (A)<1$, where $\rho (A)=\max \{\left| \lambda _1\right| , \left| \lambda _2\right| ,\ldots ,\left| \lambda _s\right| \}$ (see Atkinson 2008). An outline of the proof follows.

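From the Jordan normal form theorem, for any $n \times n$ matrix $A$, there exist a non-singular matrix $V$ and a block diagonal matrix $J$ such that

$$\begin{aligned} A=VJV^{-1} \end{aligned}$$

for

$$\begin{aligned} J=\begin{pmatrix} J_{m_1}(\lambda _1) &\quad 0 &\quad \cdots &\quad 0 \\ 0 &\quad J_{m_2}(\lambda _2) &\quad \cdots &\quad 0 \\ \vdots &\quad \vdots &\quad \ddots &\quad \vdots \\ 0 &\quad 0 &\quad \cdots &\quad J_{m_s}(\lambda _s) \end{pmatrix}, \end{aligned}$$

where the $m_{i}\times m_{i}$ matrix $J_{m_{i}}(\lambda _{i})$ is the Jordan block

$$\begin{aligned} J_{m_i}(\lambda _i)=\begin{pmatrix} \lambda _i &\quad 1 &\quad 0 &\quad \cdots &\quad 0 \\ 0 &\quad \lambda _i &\quad 1 &\quad \cdots &\quad 0 \\ \vdots &\quad \vdots &\quad \ddots &\quad \ddots &\quad \vdots \\ 0 &\quad 0 &\quad \cdots &\quad \lambda _i &\quad 1 \\ 0 &\quad 0 &\quad \cdots &\quad 0 &\quad \lambda _i \end{pmatrix}. \end{aligned}$$

Now

$$\begin{aligned} A^k=VJ^kV^{-1} \end{aligned}$$

and, since $J$ is block diagonal,

$$\begin{aligned} J^k=\begin{pmatrix} J_{m_1}^k(\lambda _1) &\quad 0 &\quad \cdots &\quad 0 \\ 0 &\quad J_{m_2}^k(\lambda _2) &\quad \cdots &\quad 0 \\ \vdots &\quad \vdots &\quad \ddots &\quad \vdots \\ 0 &\quad 0 &\quad \cdots &\quad J_{m_s}^k(\lambda _s) \end{pmatrix}. \end{aligned}$$

Now a standard result on the $k$th power of an $m \times m$ Jordan block states that, for $k \ge m $,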

$$\begin{aligned} J_{m}^k(\lambda )=\begin{pmatrix} \lambda ^k &{}\quad {{k} \atopwithdelims (){1}} \lambda ^{k-1} &{}\quad {{k}\atopwithdelims (){2}} \lambda ^{k-2} &{}\quad \cdots &{}\quad {{k}\atopwithdelims (){m-1}} \lambda ^{k-m+1} \\ 0 &{}\quad \lambda ^k &{}\quad {{k}\atopwithdelims (){1}} \lambda ^{k-1} &{}\quad \cdots &{}\quad {{k}\atopwithdelims (){m-2}} \lambda ^{k-m+2}\\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots &{}\quad \lambda ^k \end{pmatrix}. \end{aligned}$$

Since $\rho (A)<1$, i.e., $|\lambda _i|<1$ for all $i$, and since $\displaystyle \lim _{k \rightarrow \infty }{k \atopwithdelims ()i}\lambda ^{k-i}=0$ for each fixed $i$ whenever $|\lambda |<1$, we have $\displaystyle \lim _{k \rightarrow \infty } J_m^k(\lambda )=0$ for every Jordan block. This implies that $\displaystyle \lim _{k \rightarrow \infty }J^k=0$. Therefore,

$$\begin{aligned} \displaystyle \lim _{k \rightarrow \infty }A^k=\displaystyle \lim _{k \rightarrow \infty }VJ^kV^{-1}=V(\displaystyle \lim _{k \rightarrow \infty }J^k)V^{-1}=0. \end{aligned}$$

Note that the eigenvalues of $\varvec{\Phi }$ are $\phi _1, \ldots ,\phi _p$ all of which lie between 0 and 1, and hence $\displaystyle \lim _{h \rightarrow \infty }\varvec{\Phi }^h=0$. Consequently $\displaystyle \lim _{h \rightarrow \infty } \varvec{\eta }_h=\displaystyle \lim _{h \rightarrow \infty }\varvec{\Phi }^{h-1}\varvec{\phi } =(\displaystyle \lim _{h \rightarrow \infty }\varvec{\Phi }^{h-1})\varvec{\phi }=0$.
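The Jordan block argument can be illustrated numerically. The sketch below is only an illustration under assumed values ($\lambda =0.9$, $m=3$, chosen arbitrarily): it compares repeated multiplication of a single Jordan block with the closed-form entries ${k \atopwithdelims ()j}\lambda ^{k-j}$ and tracks how the largest entry shrinks as $k$ grows. All function names are hypothetical, and plain Python lists are used instead of a linear algebra library.

```python
from math import comb


def jordan_block(lam, m):
    """m x m Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1.0 if j == i + 1 else 0.0 for j in range(m)]
            for i in range(m)]


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, k):
    """k-th power by repeated multiplication (simple, not the fastest)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, a)
    return result


def closed_form_power(lam, m, k):
    """Entry (i, j) with j >= i is C(k, j-i) * lam^(k-(j-i)); used here for k >= m."""
    return [[comb(k, j - i) * lam ** (k - (j - i)) if j >= i else 0.0
             for j in range(m)] for i in range(m)]


def max_abs(a):
    return max(abs(x) for row in a for x in row)


lam, m = 0.9, 3                      # assumed illustrative values, |lam| < 1
block = jordan_block(lam, m)
sizes = {k: max_abs(mat_pow(block, k)) for k in (10, 100, 400)}
```

The off-diagonal entries first grow, because the binomial factor is polynomial in $k$, before the geometric factor $\lambda ^{k-j}$ takes over; `sizes` records this decay for the chosen values of $k$.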

1.3 Appendix C: Pegram’s MA(2) model

Here for $h=1$,

$$\begin{aligned}&P(Y_{n+1}=C_i|Y_{n}=C_j,Y_{n-1}=C_k)\\&\quad =\frac{\displaystyle \sum \nolimits _{r=0}^2\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _r\theta _s\theta _tP(\epsilon _{n+1-r}=C_i,\epsilon _{n-s}=C_j,\epsilon _{n-1-t}=C_k)}{\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _s\theta _tP(\epsilon _{n-s}=C_j,\epsilon _{n-1-t}=C_k)}, \end{aligned}$$

where

$$\begin{aligned} P(\epsilon _{n+1-r}=C_{i},\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})&= p_ip_jp_kI(r-1\ne s\ne t+1) \\&\quad + p_ip_jI(j = k)I(r-1\ne s=t+1)\\&\quad + p_ip_jI(i = k)I(r-1 =t+1 \ne s)\\&\quad + p_ip_kI(i = j)I(r-1 =s\ne t+1)\\&\quad + p_iI(i=j=k)I(r-1=s=t\!+\!1) \end{aligned}$$

and

$$\begin{aligned} P(\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})=p_{j}p_{k} I(s\ne t+1) + p_jI(j = k)I(s=t+1). \end{aligned}$$

Similarly for $h=2$,

$$\begin{aligned}&P(Y_{n+2}=C_{i}|Y_{n},\ldots ,Y_{1})\\&\quad = P(Y_{n+2}=C_{i}|Y_{n}=C_{j},Y_{n-1}=C_{k})\\&\quad =\frac{P(Y_{n+2}=C_{i},Y_{n}=C_{j},Y_{n-1}=C_{k})}{P(Y_{n}=C_{j},Y_{n-1}=C_{k})}\\&\quad =\frac{\displaystyle \sum \nolimits _{r=0}^2\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _r\theta _s\theta _tP(\epsilon _{n+2-r}=C_{i},\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})}{\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _s\theta _tP(\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})} \end{aligned}$$

where

$$\begin{aligned} P(\epsilon _{n+2-r}=C_{i},\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})&= p_ip_jp_kI(r-2\ne s\ne t+1) \\&\quad + p_ip_jI(j = k)I(r-2\ne s=t+1)\\&\quad + p_ip_jI(i = k)I(r-2 =t+1 \ne s)\\&\quad + p_ip_kI(i = j)I(r-2 =s\ne t+1) \\&\quad + p_iI(i\!=\!j\!=\!k)I(r\!-\!2=s=t\!+\!1), \end{aligned}$$

and

$$\begin{aligned} P(\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})=p_jp_kI(s\ne t+1) + p_jI(j = k)I(s=t+1). \end{aligned}$$

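And for $h>2,\; P(Y_{n+h}=C_{i}| Y_{n}, Y_{n-1}, \ldots )=p_{i}$, because $Y_{n+h}$ then depends only on $\epsilon _{n+h},\epsilon _{n+h-1},\epsilon _{n+h-2}$, none of which enters $Y_{n}, Y_{n-1}, \ldots $.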
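The ratios above can be evaluated directly for small examples. The following is a minimal sketch under stated assumptions: categories are labelled $0,\ldots ,k$, the weights $\theta _0,\theta _1,\theta _2$ sum to one, and all $p_i$ are positive. Instead of writing out the five indicator cases, the hypothetical helper `joint_eps` uses an equivalent rule: innovations at distinct time points are independent, and the same innovation cannot take two different categories. The function names and the parameter values are illustrative only.

```python
from itertools import product


def joint_eps(pairs, p_marg):
    """P(eps_{t1} = C_{c1}, eps_{t2} = C_{c2}, ...) for i.i.d. innovations.

    pairs : list of (time index, category); repeated time indices are allowed.
    """
    seen = {}
    for t, c in pairs:
        if seen.setdefault(t, c) != c:
            return 0.0          # one innovation cannot equal two categories
    prob = 1.0
    for c in seen.values():
        prob *= p_marg[c]       # one factor per distinct time index
    return prob


def ma2_forecast(h, j, k, theta, p_marg):
    """P(Y_{n+h} = C_i | Y_n = C_j, Y_{n-1} = C_k) for every category i.

    Time indices are measured relative to n, so eps_{n-s} has index -s.
    """
    lags = range(len(theta))
    denom = sum(theta[s] * theta[t] * joint_eps([(-s, j), (-1 - t, k)], p_marg)
                for s, t in product(lags, lags))
    dist = []
    for i in range(len(p_marg)):
        num = sum(theta[r] * theta[s] * theta[t]
                  * joint_eps([(h - r, i), (-s, j), (-1 - t, k)], p_marg)
                  for r, s, t in product(lags, lags, lags))
        dist.append(num / denom)
    return dist


# Illustrative (made-up) parameters: three categories, MA(2) weights.
theta = (0.5, 0.3, 0.2)
p_marg = (0.2, 0.3, 0.5)
dist_h1 = ma2_forecast(1, 0, 0, theta, p_marg)
dist_h3 = ma2_forecast(3, 0, 0, theta, p_marg)
```

With `h = 3` no innovation is shared between the forecast and the conditioning values, so `dist_h3` reduces to the marginal probabilities, in line with the statement for $h>2$.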

About this article

Cite this article

Maiti, R., Biswas, A. Coherent forecasting for stationary time series of discrete data. AStA Adv Stat Anal 99, 337–365 (2015). https://doi.org/10.1007/s10182-014-0243-3


Received: 15 August 2013
Accepted: 24 December 2014
Published: 28 January 2015
Issue Date: July 2015
DOI: https://doi.org/10.1007/s10182-014-0243-3
