Abstract
Bootstrapping time series is one of the most widely recognized tools for studying the statistical properties of an evolving phenomenon. An important class of bootstrapping methods is based on the assumption that the sampled phenomenon evolves according to a Markov chain. This assumption does not apply when the process takes values in a continuous set, as frequently happens with time series related to economic and financial phenomena. In this paper we apply Markov chain theory to bootstrapping continuous-valued processes, starting from a suitable discretization of the support that provides the state space of a Markov chain of order \(k \ge 1\). Even for small \(k\), the number of rows of the transition probability matrix is generally too large and, in many practical cases, it may incorporate much more information than is really required to replicate the phenomenon satisfactorily. The paper studies the problem of compressing the transition probability matrix while preserving the “law” characterising the process that generates the observed time series, so as to obtain bootstrapped series that maintain the typical features of the observed time series. For this purpose, we formulate a partitioning problem on the set of rows of such a matrix and propose a mixed integer linear program specifically tailored to this problem. We also provide an empirical analysis by applying our model to the time series of Spanish and German electricity prices, and we show that, in these medium-size real-life instances, the bootstrapped time series reproduce the typical features of the observed ones.
Notes
The size of the solution space is reduced from \(B\left( n^{k}\right) \) to \([B\left( n\right) ]^{k}\), where \(B\left( n\right) \) is the n-th Bell number, i.e., the number of partitions of the set of n states of a Markov chain of order k.
For the sake of simplicity, we shall not introduce a specific notation for the estimates of the transition probabilities here.
The matrices are available from the authors upon request.
For both Spain and Germany the last observed 2-state is excluded from the computation of the cardinality of \({\mathcal {O}}_{2}\).
Actually, in this case no aggregation at all is performed on the rows of the transition probability matrix, which therefore coincides with the original one.
The matrices are available from the authors upon request.
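To give a sense of scale for the reduction from \(B\left( n^{k}\right) \) to \([B\left( n\right) ]^{k}\) mentioned in the first note, the following minimal sketch computes Bell numbers with the Bell triangle and compares the two counts. The function name `bell_numbers` and the toy values \(n=3\), \(k=2\) are illustrative assumptions, not taken from the paper.

```python
def bell_numbers(m):
    """Return the list [B(0), B(1), ..., B(m)] computed with the Bell triangle."""
    row = [1]      # first row of the Bell triangle
    bells = [1]    # B(0) = 1
    for _ in range(m):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry to its left and the one above-left.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


# Toy example (assumed values): n = 3 states, order k = 2.
n, k = 3, 2
bells = bell_numbers(n ** k)
unrestricted = bells[n ** k]   # B(n^k): partitions of all n^k rows
restricted = bells[n] ** k     # [B(n)]^k: reduced solution space
print(unrestricted, restricted)
```

Even in this tiny case the unrestricted count is larger by roughly three orders of magnitude, and the gap widens very quickly as \(n\) and \(k\) grow.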
Acknowledgments
The fourth and fifth authors gratefully acknowledge the partial support received from the Spanish Ministry of Science and Technology through grant number MTM2013-46962-C2-1-P.
Appendices
Appendix 1: Trend and weekly seasonality removal
The estimation of the exponential trend and weekly seasonality is based on the following model:
\[ e_{t}^{(c)} = e^{\,r t+\sum _{j=1}^{7}\eta _{j}{\mathbb {I}}_{j}(t)+\varepsilon _{t}}, \qquad (14) \]
where \(e_{t}^{(c)}\) are the raw original prices, \({\mathbb {I}}_{j}(t)\) is the dummy variable signalling whether \(t\) is the \(j\)th day of the week, with \( j=1,\dots ,7\), \(r\) is the growth rate, \(\eta _{j}\) is the coefficient of dummy variable \({\mathbb {I}}_{j}(t)\), with \(j=1,\dots ,7\), and \( \varepsilon _{t}\) are the errors. If we take the natural logarithm on both sides of formula (14), we obtain the following formula:
\[ z_{t} = r t+\sum _{j=1}^{7}\eta _{j}{\mathbb {I}}_{j}(t)+\varepsilon _{t}, \]
where \(z_{t}=\ln e_{t}^{(c)}\).
For estimation purposes, we assume that the usual hypotheses of linear regression on the errors \(\varepsilon _{t}\) hold. We obtain the OLS estimates of r and \(\eta _{j}\), \(j=1,\dots ,7\), and they are significant at a level of \(5\,\%\) (see Table 4).
To remove the exponential trend and weekly seasonality from our original series, we define the series of prices \( e(T)=(e_{1},\dots ,e_t, \dots , e_{T})\), where:
\[ e_{t} = e_{t}^{(c)}\, e^{-\left( \hat{r}t+\sum _{j=1}^{7}\hat{\eta }_{j}{\mathbb {I}}_{j}(t)\right) }, \qquad t=1,\dots ,T. \]
The series \(e(T)\) is an input of the bootstrapping method, while the output is the bootstrapped series \(x(\ell )=(x_{1},\ldots ,x_{\ell })\). To re-introduce the exponential trend and weekly seasonality in \(x(\ell )\), we multiply each point \(x_{j}\) by \(e^{\hat{r}j+\sum _{i=1}^{7}\hat{\eta }_{i}{\mathbb {I}}_{i}(j)}\), \(j=1,\dots ,\ell \).
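The removal and re-introduction steps of this appendix can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: it uses NumPy's least-squares solver for the OLS fit of the log-linear model, the function names are hypothetical, and it assumes that observation \(t\) falls on weekday \(((t-1) \bmod 7)+1\), which the paper does not state.

```python
import numpy as np


def fit_trend_and_seasonality(raw_prices):
    """OLS fit of ln(price_t) = r*t + sum_j eta_j * I_j(t) + error (no intercept)."""
    raw_prices = np.asarray(raw_prices, dtype=float)
    t = np.arange(1, raw_prices.size + 1)
    # Assumption: day t is the ((t - 1) mod 7 + 1)-th day of the week.
    dummies = np.eye(7)[(t - 1) % 7]
    design = np.column_stack([t, dummies])
    coef, *_ = np.linalg.lstsq(design, np.log(raw_prices), rcond=None)
    return coef[0], coef[1:]          # (r_hat, eta_hat)


def trend_factor(t, r_hat, eta_hat):
    """exp(r_hat * t + eta_hat of the weekday of t), evaluated elementwise."""
    t = np.asarray(t)
    return np.exp(r_hat * t + np.asarray(eta_hat)[(t - 1) % 7])


def detrend(raw_prices, r_hat, eta_hat):
    """Divide the raw prices by the fitted trend and weekly seasonality."""
    raw_prices = np.asarray(raw_prices, dtype=float)
    t = np.arange(1, raw_prices.size + 1)
    return raw_prices / trend_factor(t, r_hat, eta_hat)


def retrend(bootstrapped, r_hat, eta_hat):
    """Multiply each bootstrapped point x_j by the fitted factor at index j."""
    bootstrapped = np.asarray(bootstrapped, dtype=float)
    j = np.arange(1, bootstrapped.size + 1)
    return bootstrapped * trend_factor(j, r_hat, eta_hat)
```

A typical use would be to call `fit_trend_and_seasonality` on the raw prices, pass the output of `detrend` to the bootstrapping method, and apply `retrend` to each bootstrapped series.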
Appendix 2: Initial states, or intervals
Table 5 reports the 12 intervals of the initial partition of the support \([\alpha ,\beta ]\) of the series of Spain and Germany after removal of exponential trend and weekly seasonality.
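As a rough illustration of how such an initial partition turns a continuous-valued series into the states of a Markov chain of order \(k\), the sketch below maps each observation to the index of its interval and estimates the rows of the transition probability matrix by empirical frequencies. The cut points are hypothetical placeholders (the values of Table 5 are not reproduced here), the function names are assumptions, and the half-open interval convention is a choice made only for this example.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def to_states(series, cut_points):
    """Map each value to the index of its interval.

    cut_points are the sorted interior boundaries of the partition;
    a value equal to a boundary is assigned to the interval on its right.
    """
    return [bisect_right(cut_points, value) for value in series]


def transition_probabilities(states, k):
    """Empirical transition probabilities of an order-k chain.

    Returns a dict mapping each observed k-state (a tuple of k states)
    to a dict {next_state: relative frequency}.
    """
    counts = defaultdict(Counter)
    for i in range(k, len(states)):
        history = tuple(states[i - k:i])
        counts[history][states[i]] += 1
    return {
        history: {s: c / sum(nxt.values()) for s, c in nxt.items()}
        for history, nxt in counts.items()
    }


# Toy data with two hypothetical cut points, giving three intervals.
series = [0.5, 1.5, 2.5, 0.5, 1.5, 0.2]
states = to_states(series, cut_points=[1.0, 2.0])
rows = transition_probabilities(states, k=1)
```

Only the \(k\)-states actually observed in the series appear as rows, so the last \(k\)-state of the series has no row unless it also occurs earlier.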
Cerqueti, R., Falbo, P., Pelizzari, C. et al. A mixed integer linear program to compress transition probability matrices in Markov chain bootstrapping. Ann Oper Res 248, 163–187 (2017). https://doi.org/10.1007/s10479-016-2181-9
DOI: https://doi.org/10.1007/s10479-016-2181-9