Coherent forecasting for stationary time series of discrete data

  • Original Paper
  • Published in AStA Advances in Statistical Analysis

Abstract

Coherent forecasting for discrete-valued stationary time series is considered in this article. In the context of count time series, coherent forecasting methods such as median forecasting and mode forecasting are used to obtain the \(h\)-step ahead forecast. In the context of categorical time series, however, little existing work is available. Here, we consider the case of a finite number of categories under several possible models, namely Pegram’s operator-based ARMA(\(p\),\(q\)) model, the mixture transition distribution model and the logistic regression model, and study their \(h\)-step ahead coherent forecasting. Some theoretical results are derived, along with numerical examples. To facilitate comparison among the three models, we use several forecasting measures. The procedure is illustrated using a real-life categorical data set, namely the infant sleep status data.


References

  • Agresti, A.: Categorical data analysis. Wiley, New Jersey (2002)

  • Al-Osh, M., Alzaid, A.A.: First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 8(3), 261–275 (1987)

  • Alzaid, A., Al-Osh, M.: An integer-valued pth-order autoregressive structure (INAR(p)) process. J. Appl. Probab. 27, 314–324 (1990)

  • Atkinson, K.: An introduction to numerical analysis. Wiley, New Delhi (2008)

  • Berchtold, A., Raftery, A.E.: The mixture transition distribution model for high-order Markov chains and non-Gaussian time series. Stat. Sci. 17(3), 328–356 (2002)

  • Biswas, A., Guha, A.: Time series analysis of categorical data using auto-mutual information. J. Stat. Plan. Inference 139(9), 3076–3087 (2009)

  • Biswas, A., Song, P.X.-K.: Discrete-valued ARMA processes. Stat. Probab. Lett. 79(17), 1884–1889 (2009)

  • Brockwell, P.J., Davis, R.A.: Time series: theory and methods. Springer, Berlin (2002)

  • Bu, R., McCabe, B.: Model selection, estimation and forecasting in INAR(p) models: a likelihood-based Markov chain approach. Int. J. Forecast. 24(1), 151–162 (2008)

  • Carruth, J., Tygert, M., Ward, R.: A comparison of the discrete Kolmogorov–Smirnov statistic and the Euclidean distance (2012). arXiv:1206.6367

  • Fokianos, K., Kedem, B.: Regression theory for categorical time series. Stat. Sci. 18(3), 357–376 (2003)

  • Freeland, R.K., McCabe, B.P.: Forecasting discrete valued low count time series. Int. J. Forecast. 20(3), 427–434 (2004)

  • Jacobs, P.A., Lewis, P.A.: Discrete time series generated by mixtures. I: Correlational and runs properties. J. Royal Stat. Soc. Ser. B (Methodological) 40(1), 94–105 (1978a)

  • Jacobs, P.A., Lewis, P.A.: Discrete time series generated by mixtures. II: Asymptotic properties. J. Royal Stat. Soc. Ser. B (Methodological) 40(2), 222–228 (1978b)

  • Jacobs, P.A., Lewis, P.A.: Discrete time series generated by mixtures. III: Autoregressive processes (DAR(p)). Technical report, Naval Postgraduate School, Monterey (1978c)

  • Jacobs, P.A., Lewis, P.A.: Stationary discrete autoregressive–moving average time series generated by mixtures. J. Time Ser. Anal. 4(1), 19–36 (1983)

  • Jung, R.C., Tremayne, A.R.: Coherent forecasting in integer time series models. Int. J. Forecast. 22(2), 223–238 (2006)

  • McKenzie, E.: Some simple models for discrete variate time series. J. Am. Water Resour. Assoc. 21(4), 645–650 (1985)

  • McKenzie, E.: Some ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20(4), 822–835 (1988)

  • McKenzie, E.: Discrete variate time series. Handbook of Statistics 21, 573–606 (2003)

  • Pegram, G.: An autoregressive model for multilag Markov chains. J. Appl. Probab. 17(2), 350–362 (1980)

  • Raftery, A.E.: A model for high-order Markov chains. J. Royal Stat. Soc. Ser. B (Methodological) 47(3), 528–539 (1985)

  • Silva, N., Pereira, I., Silva, M.E.: Forecasting in INAR(1) model. REVSTAT Stat. J. 7(1), 119–134 (2009)

  • Stoffer, D.S., Scher, M.S., Richardson, G.A., Day, N.L., Coble, P.A.: A Walsh–Fourier analysis of the effects of moderate maternal alcohol consumption on neonatal sleep-state cycling. J. Am. Stat. Assoc. 83(404), 954–963 (1988)

  • Stoffer, D.S., Tyler, D.E., Wendt, D.A.: The spectral envelope and its applications. Stat. Sci. 15(3), 224–253 (2000)

  • Weiß, C.H., Göb, R.: Measuring serial dependence in categorical time series. Adv. Stat. Anal. 92(1), 71–89 (2008)

  • Weiß, C.H.: Empirical measures of signed serial dependence in categorical time series. J. Stat. Comput. Simul. 81(4), 411–429 (2011)

  • Weiß, C.H.: Serial dependence of NDARMA process. Comput. Stat. Data Anal. 68(1), 213–238 (2013)


Acknowledgments

The authors wish to thank the three anonymous referees and the associate editor for their careful reading and constructive suggestions which led to this improved version of the paper.

Author information

Corresponding author

Correspondence to Atanu Biswas.

Appendix

1.1 Appendix A: Proof of Theorem 2

From the model (3.2), the 1-step ahead conditional distribution is given by

$$\begin{aligned} \begin{array}{lcl} p_1(i|i_1,\ldots ,i_p)&{}=&{}P(Y_{n+1}=C_{i}|Y_{n}=C_{i_1},\ldots ,Y_{n-p+1}=C_{i_p})\\ &{}=&{}\eta _{11}I(i_1=i)+\cdots +\eta _{1p}I(i_p=i)+(1-\eta _{11}-\cdots -\eta _{1p})p_i, \end{array} \end{aligned}$$

with \(\eta _{1l}=\phi _{l}\), \(l=1,\ldots ,p\). Then the two-step ahead conditional distribution is given by

$$\begin{aligned} p_2(i|i_1,\ldots ,i_p)&= P(Y_{n+2}=C_{i}|Y_{n}=C_{i_1},\ldots ,Y_{n-p+1}=C_{i_p})\\&= \displaystyle \sum _{j=0}^{k}P(Y_{n+2}=C_{i}|Y_{n+1}=C_{j},Y_{n}=C_{i_1},\ldots ,Y_{n-p+2}=C_{i_{p-1}})\\&\times P(Y_{n+1}=C_{j}|Y_{n}=C_{i_1},\ldots ,Y_{n-p+1}=C_{i_{p}})\\&= \displaystyle \sum _{j=0}^{k}\left\{ \eta _{11}I(j=i)+\cdots +\eta _{1p}I(i_{p-1}=i)+(1-\eta _{11}-\cdots -\eta _{1p})p_i\right\} \\&\times \left\{ \phi _1I(i_1=j)+\cdots +\phi _pI(i_p=j)+(1-\phi _1-\cdots -\phi _p)p_j\right\} \\&= \eta _{21}I(i_1=i)+\cdots +\eta _{2p}I(i_p=i)+(1-\eta _{21}-\cdots -\eta _{2p})p_i \end{aligned}$$

where \(\varvec{\eta _2}=\varvec{\Phi }\varvec{\phi }\). Thus the result holds for \(h=2\). Assume it holds for \(h-1\), that is, \(\varvec{\eta _{h-1}}=\varvec{\Phi }^{h-2}\varvec{\phi }\). Then, by induction, it is straightforward to show that the \(h\)-step ahead conditional distribution is given by (3.5).
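The Chapman–Kolmogorov step used in this proof can be checked numerically. The sketch below is a hypothetical minimal implementation (the function names and the state-propagation scheme are ours, not the paper's): it computes the \(h\)-step ahead conditional distribution of an MTD-type model of order \(p\) by propagating the joint law of the last \(p\) values, exactly the recursion displayed above. For large \(h\) the distribution approaches the marginal \((p_0,\ldots,p_k)\), consistent with Theorem 3.

```python
import numpy as np

def mtd_step_dist(state, phi, p):
    """One-step conditional distribution P(Y_{n+1}=C_i | last p values).

    state: tuple (i_1, ..., i_p) of category indices, most recent first.
    phi:   mixture weights (phi_1, ..., phi_p), nonnegative with sum < 1.
    p:     marginal distribution (p_0, ..., p_k) over the k+1 categories.
    """
    rem = 1.0 - sum(phi)
    dist = rem * np.asarray(p, dtype=float)   # (1 - phi_1 - ... - phi_p) p_i
    for l, i_l in enumerate(state):
        dist[i_l] += phi[l]                   # phi_l I(i_l = i) terms
    return dist

def h_step_dist(state, phi, p, h):
    """h-step ahead distribution via the Chapman-Kolmogorov recursion:
    propagate the joint law of the p-history, summing over intermediate
    categories as in the h=2 derivation of the proof."""
    k1, pp = len(p), len(phi)
    joint = {tuple(state): 1.0}               # point mass on observed history
    dist = None
    for _ in range(h):
        dist = np.zeros(k1)
        new_joint = {}
        for st, w in joint.items():
            step = mtd_step_dist(st, phi, p)
            for i in range(k1):
                dist[i] += w * step[i]
                nst = (i,) + st[:pp - 1]      # shift the history window
                new_joint[nst] = new_joint.get(nst, 0.0) + w * step[i]
        joint = new_joint
    return dist
```

The coherent (mode) forecast is then `int(np.argmax(h_step_dist(...)))`; a median forecast can be read off the cumulative sums of the same vector when the categories are ordinal.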

1.2 Appendix B: Proof of Theorem 3

To prove Theorem 3, it suffices to show that \(\displaystyle \lim _{h\rightarrow \infty }\eta _{hi}=0\) for all \(i\). To this end we use the following result: for any \(n \times n\) matrix \(A\) with eigenvalues \(\lambda _1, \lambda _2,\ldots ,\lambda _s\), \(\displaystyle \lim _{k \rightarrow \infty }A^k=0\) if the spectral radius \(\rho (A)=\max \{\left| \lambda _1\right| , \left| \lambda _2\right| ,\ldots ,\left| \lambda _s\right| \}\) satisfies \(\rho (A)<1\) (see Atkinson 2008). An outline of the proof follows.

From the Jordan normal form theorem, for any \(n \times n\) matrix \(A\), there exist a non-singular matrix \(V\) and a block diagonal matrix \(J\) such that

$$\begin{aligned} A=VJV^{-1} \end{aligned}$$

for

$$\begin{aligned} J=\begin{pmatrix} J_{m_{1}}(\lambda _{1}) &{}\quad &{}\quad \\ &{}\quad \ddots &{}\quad \\ &{}\quad &{}\quad J_{m_{s}}(\lambda _{s}) \end{pmatrix}, \end{aligned}$$

where the \(m_{i}\times m_{i}\) Jordan block \(J_{m_{i}}(\lambda _{i})\) is

$$\begin{aligned} J_{m_{i}}(\lambda _{i})=\begin{pmatrix} \lambda _{i} &{}\quad 1 &{}\quad &{}\quad \\ &{}\quad \lambda _{i} &{}\quad \ddots &{}\quad \\ &{}\quad &{}\quad \ddots &{}\quad 1 \\ &{}\quad &{}\quad &{}\quad \lambda _{i} \end{pmatrix}. \end{aligned}$$

Now

$$\begin{aligned} A^k=VJ^kV^{-1} \end{aligned}$$

and, since \(J\) is block diagonal,

$$\begin{aligned} J^k=\begin{pmatrix} J_{m_{1}}^{k}(\lambda _{1}) &{}\quad &{}\quad \\ &{}\quad \ddots &{}\quad \\ &{}\quad &{}\quad J_{m_{s}}^{k}(\lambda _{s}) \end{pmatrix}. \end{aligned}$$

Now a standard result on the \(k\)th power of an \(m \times m\) Jordan block states that, for \(k \ge m\),

$$\begin{aligned} J_{m}^k(\lambda )=\begin{pmatrix} \lambda ^k &{}\quad {{k} \atopwithdelims (){1}} \lambda ^{k-1} &{}\quad {{k}\atopwithdelims (){2}} \lambda ^{k-2} &{}\quad \cdots &{}\quad {{k}\atopwithdelims (){m-1}} \lambda ^{k-m+1} \\ 0 &{}\quad \lambda ^k &{}\quad {{k}\atopwithdelims (){1}} \lambda ^{k-1} &{}\quad \cdots &{}\quad {{k}\atopwithdelims (){m-2}} \lambda ^{k-m+2}\\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \cdots &{}\quad \lambda ^k \end{pmatrix}. \end{aligned}$$

Since \(\rho (A)<1\), we have \(|\lambda _i|<1\) for all \(i\); moreover \(\displaystyle \lim _{k \rightarrow \infty }{k \atopwithdelims ()i}\lambda ^{k-i}=0\) whenever \(|\lambda |<1\), and hence \(\displaystyle \lim _{k \rightarrow \infty } J_m^k(\lambda )=0\). This implies that \(\displaystyle \lim _{k \rightarrow \infty }J^k=0\). Therefore,

$$\begin{aligned} \displaystyle \lim _{k \rightarrow \infty }A^k=\displaystyle \lim _{k \rightarrow \infty }VJ^kV^{-1}=V(\displaystyle \lim _{k \rightarrow \infty }J^k)V^{-1}=0. \end{aligned}$$

Note that the eigenvalues of \(\varvec{\Phi }\) are \(\phi _1, \ldots ,\phi _p\), all of which lie between 0 and 1, so that \(\rho (\varvec{\Phi })<1\) and hence \(\displaystyle \lim _{h \rightarrow \infty }\varvec{\Phi }^h=0\). Consequently, \(\displaystyle \lim _{h \rightarrow \infty } \varvec{\eta }_h=\displaystyle \lim _{h \rightarrow \infty }\varvec{\Phi }^{h-1}\varvec{\phi } =(\displaystyle \lim _{h \rightarrow \infty }\varvec{\Phi }^{h-1})\varvec{\phi }=0\).
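The lemma underlying this proof can be illustrated numerically: even for a defective matrix (a single Jordan block), the powers vanish once the spectral radius is below one, despite the binomial factors \({k \atopwithdelims ()i}\lambda ^{k-i}\) in the off-diagonal entries. A minimal numpy sketch (the \(3\times 3\) example matrix is our choice, not from the paper):

```python
import numpy as np

# A single 3x3 Jordan block J_3(0.9): defective, with the one
# eigenvalue lambda = 0.9 repeated three times.
lam = 0.9
A = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])

rho = float(max(abs(np.linalg.eigvals(A))))   # spectral radius = 0.9 < 1

# Entries of A^k are C(k, i) * lam^(k - i); the geometric decay of
# lam^k dominates the polynomial growth of C(k, i), so A^k -> 0.
Ak = np.linalg.matrix_power(A, 500)
largest = float(np.max(np.abs(Ak)))           # essentially zero
```

The same computation with \(\lambda = 1\) would leave the binomial terms unchecked, which is why the strict inequality \(\rho(A) < 1\) is needed.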

1.3 Appendix C: Pegram’s MA(2) model

Here for \(h=1\),

$$\begin{aligned}&P(Y_{n+1}=C_i|Y_{n}=C_j,Y_{n-1}=C_k)\\&\quad =\frac{\displaystyle \sum \nolimits _{r=0}^2\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _r\theta _s\theta _tP(\epsilon _{n+1-r}=C_i,\epsilon _{n-s}=C_j,\epsilon _{n-1-t}=C_k)}{\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _s\theta _tP(\epsilon _{n-s}=C_j,\epsilon _{n-1-t}=C_k)}, \end{aligned}$$

where

$$\begin{aligned} P(\epsilon _{n+1-r}=C_{i},\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})&= p_ip_jp_kI(r-1\ne s\ne t+1) \\&\quad + p_ip_jI(j = k)I(r-1\ne s=t+1)\\&\quad + p_ip_jI(i = k)I(r-1 =t+1 \ne s)\\&\quad + p_ip_kI(i = j)I(r-1 =s\ne t+1)\\&\quad + p_iI(i=j=k)I(r-1=s=t\!+\!1) \end{aligned}$$

and

$$\begin{aligned} P(\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})=p_{j}p_{k} I(s\ne t+1) + p_jI(j = k)I(s=t+1). \end{aligned}$$

Similarly for \(h=2\),

$$\begin{aligned}&P(Y_{n+2}=C_{i}|Y_{n},\ldots ,Y_{1})\\&\quad = P(Y_{n+2}=C_{i}|Y_{n}=C_{j},Y_{n-1}=C_{k})\\&\quad =\frac{P(Y_{n+2}=C_{i},Y_{n}=C_{j},Y_{n-1}=C_{k})}{P(Y_{n}=C_{j},Y_{n-1}=C_{k})}\\&\quad =\frac{\displaystyle \sum \nolimits _{r=0}^2\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _r\theta _s\theta _tP(\epsilon _{n+2-r}=C_{i},\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})}{\displaystyle \sum \nolimits _{s=0}^2\displaystyle \sum \nolimits _{t=0}^2\theta _s\theta _tP(\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})} \end{aligned}$$

where

$$\begin{aligned} P(\epsilon _{n+2-r}=C_{i},\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})&= p_ip_jp_kI(r-2\ne s\ne t+1) \\&\quad + p_ip_jI(j = k)I(r-2\ne s=t+1)\\&\quad + p_ip_jI(i = k)I(r-2 =t+1 \ne s)\\&\quad + p_ip_kI(i = j)I(r-2 =s\ne t+1) \\&\quad + p_iI(i\!=\!j\!=\!k)I(r\!-\!2=s=t\!+\!1), \end{aligned}$$

and

$$\begin{aligned} P(\epsilon _{n-s}=C_{j},\epsilon _{n-1-t}=C_{k})=p_jp_kI(s\ne t+1) + p_jI(j = k)I(s=t+1). \end{aligned}$$

And for \(h>2,\; P(Y_{n+h}=C_{i}| Y_{n}, Y_{n-1}, \ldots )=p_{i}\).
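The conditional distributions above can be evaluated directly. The sketch below is a hypothetical implementation (the function names are ours): since the \(\epsilon_t\) are i.i.d. with marginal distribution \(p\), a joint probability of several \(\epsilon\)'s reduces to the rule "epsilons sharing a time index must share a category, and distinct times multiply"; the five indicator cases in the displays are exactly this rule enumerated for three epsilons.

```python
from itertools import product

def joint_eps(times, cats, p):
    """Joint probability that the i.i.d. innovations at the given time
    indices take the given categories: equal times force equal
    categories; distinct times contribute independent factors p[c]."""
    seen = {}
    for t, c in zip(times, cats):
        if t in seen and seen[t] != c:
            return 0.0        # same epsilon asked to take two categories
        seen[t] = c
    out = 1.0
    for c in seen.values():
        out *= p[c]
    return out

def pegram_ma2_one_step(i, j, k, theta, p, n=0):
    """1-step conditional P(Y_{n+1}=C_i | Y_n=C_j, Y_{n-1}=C_k) for
    Pegram's MA(2), with mixing weights theta = (theta_0, theta_1,
    theta_2) summing to one and marginal p over the categories."""
    num = 0.0
    for r, s, t in product(range(3), repeat=3):
        num += theta[r] * theta[s] * theta[t] * \
            joint_eps((n + 1 - r, n - s, n - 1 - t), (i, j, k), p)
    den = 0.0
    for s, t in product(range(3), repeat=2):
        den += theta[s] * theta[t] * \
            joint_eps((n - s, n - 1 - t), (j, k), p)
    return num / den
```

Summing the output over \(i\) returns one, which is a quick consistency check of the five-case decomposition; the \(h=2\) display is obtained by shifting the first time index from \(n+1-r\) to \(n+2-r\).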

Cite this article

Maiti, R., Biswas, A. Coherent forecasting for stationary time series of discrete data. AStA Adv Stat Anal 99, 337–365 (2015). https://doi.org/10.1007/s10182-014-0243-3
