A mixed integer linear program to compress transition probability matrices in Markov chain bootstrapping

Cerqueti, Roy; Falbo, Paolo; Pelizzari, Cristian; Ricca, Federica; Scozzari, Andrea

doi:10.1007/s10479-016-2181-9

Original - OR Modeling/Case Study
Published: 18 April 2016

Volume 248, pages 163–187, (2017)

Annals of Operations Research


Abstract

Bootstrapping time series is one of the most widely used tools for studying the statistical properties of an evolving phenomenon. An important class of bootstrapping methods is based on the assumption that the sampled phenomenon evolves according to a Markov chain. This assumption does not apply when the process takes values in a continuous set, as frequently happens with time series related to economic and financial phenomena. In this paper we apply Markov chain theory to the bootstrapping of continuous-valued processes, starting from a suitable discretization of the support that provides the state space of a Markov chain of order $k \ge 1$. Even for small $k$, the number of rows of the transition probability matrix is generally too large and, in many practical cases, the matrix may incorporate much more information than is really required to replicate the phenomenon satisfactorily. The paper studies the problem of compressing the transition probability matrix while preserving the “law” characterising the process that generates the observed time series, in order to obtain bootstrapped series that maintain the typical features of the observed time series. For this purpose, we formulate a partitioning problem on the set of rows of such a matrix and propose a mixed integer linear program specifically tailored to this problem. We also provide an empirical analysis by applying our model to the time series of Spanish and German electricity prices, and we show that, in these medium-size real-life instances, the bootstrapped time series reproduce the typical features of the observed ones.
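To make the object being compressed concrete, the following minimal sketch estimates the rows of an order-$k$ transition probability matrix from an already discretized series, using relative frequencies. The function name `transition_matrix`, the toy series, and the frequency estimator are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter, defaultdict


def transition_matrix(states, k):
    """Estimate order-k transition probabilities by relative frequencies.

    Returns a dict mapping each observed k-tuple of past states (a "row")
    to a dict {next state: estimated probability}.
    """
    counts = defaultdict(Counter)
    for t in range(k, len(states)):
        history = tuple(states[t - k:t])  # the k most recent states
        counts[history][states[t]] += 1
    return {
        history: {s: c / sum(nxt.values()) for s, c in nxt.items()}
        for history, nxt in counts.items()
    }


# Toy discretized series over the states 0, 1, 2 (made-up data).
series = [0, 1, 1, 2, 0, 1, 2, 2, 0, 1, 1, 2]
P = transition_matrix(series, k=2)

for history, row in sorted(P.items()):
    print(history, row)
```

With $n$ states and order $k$ there can be up to $n^{k}$ rows, which is why even a small $k$ may produce a matrix that is large relative to the length of the observed series.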

Notes

The size of the solution space is reduced from $B(n^{k})$ to $[B(n)]^{k}$, where $B(n)$ is the $n$-th Bell number, i.e., the number of partitions of the set of $n$ states of a Markov chain of order $k$.
For the sake of simplicity, we shall not introduce a specific notation for the estimates of the transition probabilities here.
The matrices are available from the authors upon request.
For both Spain and Germany the last observed 2-state is excluded from the computation of the cardinality of ${\mathcal {O}}_{2}$.
In this case no aggregation at all is performed on the rows of the transition probability matrix, which therefore remains the original one.
The matrices are available from the authors upon request.
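The reduction from $B(n^{k})$ to $[B(n)]^{k}$ mentioned in the first note can be checked numerically. The sketch below computes Bell numbers with the Bell triangle; the values $n = 12$ and $k = 2$ are illustrative choices (12 is the number of initial intervals reported in Appendix 2), and the function name `bell` is an assumption of this example.

```python
def bell(n):
    """Return the n-th Bell number using the Bell triangle (exact integers)."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to its left neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


n, k = 12, 2  # illustrative values, not prescribed by the note
unrestricted = bell(n ** k)   # partitions of all n**k rows
restricted = bell(n) ** k     # product of k partitions of the n states

print("digits in B(n^k):  ", len(str(unrestricted)))
print("digits in [B(n)]^k:", len(str(restricted)))
```

Python integers have arbitrary precision, so both quantities are computed exactly and can be compared directly.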


Acknowledgments

The fourth and fifth authors gratefully acknowledge partial support from the Spanish Ministry of Science and Technology through grant number MTM2013-46962-C2-1-P.

Author information

Authors and Affiliations

Università degli Studi di Macerata, Macerata, Italy
Roy Cerqueti
Università degli Studi di Brescia, Brescia, Italy
Paolo Falbo & Cristian Pelizzari
Sapienza Università di Roma, Rome, Italy
Federica Ricca
Università degli Studi Niccolò Cusano - Telematica Roma, Rome, Italy
Andrea Scozzari


Corresponding author

Correspondence to Andrea Scozzari.

Appendices

Appendix 1: Trend and weekly seasonality removal

The estimation of the exponential trend and weekly seasonality is based on the following model:

$$\begin{aligned} e_{t}^{(c)}=\exp (rt+\eta_{1}{\mathbb{I}}_{1}(t)+\eta_{2}{\mathbb{I}}_{2}(t)+\eta_{3}{\mathbb{I}}_{3}(t)+\eta_{4}{\mathbb{I}}_{4}(t)+\eta_{5}{\mathbb{I}}_{5}(t)+\eta_{6}{\mathbb{I}}_{6}(t)+\eta_{7}{\mathbb{I}}_{7}(t)+\varepsilon_{t})\text{,} \end{aligned}\tag{14}$$

where $e_{t}^{(c)}$ are the raw original prices, ${\mathbb{I}}_{j}(t)$ is the dummy variable indicating whether $t$ is the $j$th day of the week, $r$ is the growth rate, $\eta_{j}$ is the coefficient of the dummy variable ${\mathbb{I}}_{j}(t)$, with $j=1,\dots,7$, and $\varepsilon_{t}$ are the errors. Taking the natural logarithm of both sides of formula (14) gives:

$$\begin{aligned} z_{t}=rt+\eta _{1}{\mathbb {I}}_{1}(t)+\eta _{2}{\mathbb {I}}_{2}(t)+\eta _{3} {\mathbb {I}}_{3}(t)+\eta _{4}{\mathbb {I}}_{4}(t)+\eta _{5}{\mathbb {I}}_{5}(t)+\eta _{6}{\mathbb {I}}_{6}(t)+\eta _{7}{\mathbb {I}}_{7}(t)+\varepsilon _{t}\text {,} \end{aligned}$$

where $z_{t}=\ln e_{t}^{(c)}$.

For estimation purposes, we assume that the usual linear regression hypotheses on the errors $\varepsilon_{t}$ hold. We obtain the OLS estimates of $r$ and $\eta_{j}$, $j=1,\dots,7$, all of which are significant at the $5\,\%$ level (see Table 4).

Table 4 Coefficient estimates of an exponential regression model of trend and weekly seasonality applied to the series of electricity prices of Spain and Germany
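The log-linear regression above can be sketched with ordinary least squares in NumPy. The data here are synthetic, and the parameter values (`true_r`, `true_eta`), the noise level, and the convention that $t=1$ falls on the first day of the week are assumptions made only for illustration; they are not the estimates of Table 4.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 700
t = np.arange(1, T + 1)
day = (t - 1) % 7  # day-of-week index 0..6 (assumed alignment of t = 1)

# Synthetic log-prices z_t = r t + eta_{day(t)} + eps_t (made-up parameters).
true_r = 0.0005
true_eta = np.array([3.0, 3.1, 3.1, 3.1, 3.05, 2.9, 2.8])
z = true_r * t + true_eta[day] + rng.normal(0.0, 0.05, T)

# Design matrix: the time index plus seven day-of-week dummies.
# There is no separate intercept, so the dummies are not collinear
# with a constant column.
X = np.column_stack([t, np.eye(7)[day]])

coef, *_ = np.linalg.lstsq(X, z, rcond=None)
r_hat, eta_hat = coef[0], coef[1:]

print("estimated growth rate:", r_hat)
print("estimated day effects:", eta_hat)
```

Significance testing, as reported in the text, would additionally require standard errors, which `numpy.linalg.lstsq` does not return; a statistics package would be the usual choice for that step.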


To remove the exponential trend and weekly seasonality from the original series, we define the series of prices $e(T)=(e_{1},\dots,e_{t},\dots,e_{T})$, where:

$$\begin{aligned} e_{t} &= \exp [z_{t}-(\hat{r}t+\hat{\eta}_{1}{\mathbb{I}}_{1}(t)+\hat{\eta}_{2}{\mathbb{I}}_{2}(t)+\hat{\eta}_{3}{\mathbb{I}}_{3}(t)+\hat{\eta}_{4}{\mathbb{I}}_{4}(t)+\hat{\eta}_{5}{\mathbb{I}}_{5}(t)+\hat{\eta}_{6}{\mathbb{I}}_{6}(t)\\ &\quad +\hat{\eta}_{7}{\mathbb{I}}_{7}(t))]\text{, }t=1,\dots,T\text{.} \end{aligned}$$

The series $e(T)$ is the input of the bootstrapping method, while the output is the bootstrapped series $x(\ell)=(x_{1},\ldots,x_{\ell})$. To re-introduce the exponential trend and weekly seasonality in $x(\ell)$, we multiply each point $x_{j}$ by $\exp (\hat{r}j+\hat{\eta}_{1}{\mathbb{I}}_{1}(j)+\hat{\eta}_{2}{\mathbb{I}}_{2}(j)+\hat{\eta}_{3}{\mathbb{I}}_{3}(j)+\hat{\eta}_{4}{\mathbb{I}}_{4}(j)+\hat{\eta}_{5}{\mathbb{I}}_{5}(j)+\hat{\eta}_{6}{\mathbb{I}}_{6}(j)+\hat{\eta}_{7}{\mathbb{I}}_{7}(j))$, $j=1,\dots,\ell$.
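The removal and re-introduction steps are inverse operations, which the following sketch makes explicit. The helper names (`fitted`, `remove_components`, `restore_components`), the coefficient values, and the convention that index 1 falls on the first day of the week are hypothetical and serve only to illustrate the two formulas.

```python
import numpy as np


def fitted(idx, r_hat, eta_hat):
    """Fitted log-trend plus weekly effect at 1-based time indices idx."""
    idx = np.asarray(idx)
    return r_hat * idx + eta_hat[(idx - 1) % 7]


def remove_components(raw_prices, r_hat, eta_hat):
    """e_t = exp(z_t - fitted_t), with z_t the log of the raw price."""
    t = np.arange(1, len(raw_prices) + 1)
    return np.exp(np.log(raw_prices) - fitted(t, r_hat, eta_hat))


def restore_components(x, r_hat, eta_hat):
    """Multiply each bootstrapped point x_j by exp(fitted_j)."""
    j = np.arange(1, len(x) + 1)
    return np.asarray(x) * np.exp(fitted(j, r_hat, eta_hat))


# Hypothetical coefficient estimates and raw prices (two weeks of data).
r_hat = 0.001
eta_hat = np.array([3.0, 3.1, 3.1, 3.1, 3.05, 2.9, 2.8])
raw = np.array([22.0, 25.1, 24.3, 26.0, 23.7, 19.5, 17.9,
                22.8, 25.9, 24.0, 26.4, 24.1, 20.2, 18.3])

e = remove_components(raw, r_hat, eta_hat)
back = restore_components(e, r_hat, eta_hat)
print(np.allclose(back, raw))
```

In the procedure described above the restoring step is applied to the bootstrapped series $x(\ell)$ rather than to $e(T)$ itself; the round trip on $e(T)$ is shown only to confirm that the two transformations are consistent.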

Appendix 2: Initial states, or intervals

Table 5 reports the 12 intervals of the initial partition of the support $[\alpha ,\beta ]$ of the series of Spain and Germany after removal of exponential trend and weekly seasonality.

Table 5 Elements of the initial partition of the support of the exponentially detrended and deseasonalized series of electricity prices of Spain and Germany
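As a hedged illustration of how an initial partition of the support $[\alpha,\beta]$ turns a continuous series into state labels, the sketch below uses 12 equal-width intervals. Equal widths, the function name `discretize`, and the sample values are assumptions of this example; the actual interval endpoints are those reported in Table 5.

```python
import numpy as np


def discretize(series, n_intervals=12):
    """Map each value to the index (0-based) of the interval containing it.

    The support [alpha, beta] is taken as the observed range and is split
    into equal-width intervals (an assumption of this sketch).
    """
    series = np.asarray(series, dtype=float)
    alpha, beta = series.min(), series.max()
    edges = np.linspace(alpha, beta, n_intervals + 1)
    # Only interior edges are passed, so labels run from 0 to
    # n_intervals - 1 and the maximum beta falls in the last interval.
    labels = np.digitize(series, edges[1:-1])
    return labels, edges


# Made-up detrended and deseasonalized prices.
prices = np.array([0.82, 1.05, 0.97, 1.31, 1.12, 0.64, 0.91, 1.48, 1.02, 0.77])
states, edges = discretize(prices)
print(states)
```

The resulting labels are the kind of input from which the state space of the Markov chain is built.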



About this article

Cite this article

Cerqueti, R., Falbo, P., Pelizzari, C. et al. A mixed integer linear program to compress transition probability matrices in Markov chain bootstrapping. Ann Oper Res 248, 163–187 (2017). https://doi.org/10.1007/s10479-016-2181-9

Issue Date: January 2017
