Abstract
In many regions, monthly (or bimonthly) rainfall data can be considered deterministic, while daily rainfall data may be treated as random. As a result, deterministic models may not fit the daily data adequately because of their strong stochastic nature, while stochastic models may not reliably fit daily rainfall time series because of their deterministic nature at the large (i.e. coarse) scale. Although there are different approaches for simulating daily rainfall, mixing deterministic and stochastic models (to represent both deterministic and stochastic properties) has not hitherto been proposed. This study attempts to simulate daily rainfall data by combining the discrete wavelet transformation with a hidden Markov model. We use a deterministic model to obtain the large-scale data and a stochastic model to simulate the wavelet tree coefficients. The simulated daily rainfall is obtained by inverse transformation. We then compare the accumulated simulated and accumulated observed data from the Chao Phraya Basin in Thailand. Because of the stochastic nature at the small scale, the simulated daily rainfall deviates from the observed data in a point-to-point comparison. However, the accumulated simulated data do show some level of agreement with the observed data.
References
Abarbanel HDI (1996) Analysis of observed chaotic data. Springer, New York
Aksoy H, Bayazit M (2000) A model for daily flows of intermittent streams. Hydrol Processes 14:1725–1744
Aksoy H, Unal NE (2007) Discussion of ‘Comparison of two nonparametric alternatives for stochastic generation of monthly rainfall’ by R. Srikanthan et al. ASCE J Hydrol Eng 12:699–702
Bayazit M, Aksoy H (2001) Using wavelets for data generation. J Appl Stat 28:157–166
Bayazit M, Onoz B, Aksoy H (2001) Nonparametric streamflow simulation by wavelet or Fourier analysis. Hydrol Sci J 46:623–634
Chapman TG (1994) Stochastic models of daily rainfall. In: Proceedings of water down under 94, vol. 3. National Conference Publications, Institution of Engineers, Canberra, pp 7–12
Chipman HA, Kolaczyk ED, McCulloch RE (1997) Adaptive Bayesian wavelet shrinkage. J Am Stat Assoc 92:1413–1421
Chow CK, Liu CN (1968) Approximating discrete probability distributions with dependence trees. IEEE Trans Inform Theory 14:462–467
Coulibaly P, Anctil F, Bobee B (2000) Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J Hydrol 230:244–257
Crouse MS, Nowak RD, Baraniuk RG (1998) Wavelet-based statistical signal processing using hidden Markov models. IEEE Trans Signal Process 46:886–902
Daubechies I (1992) Ten lectures on wavelets. SIAM, New York
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1–38
Haar A (1910) Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen 69:331–371
Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton
Jayawardena AW, Li WK, Xu P (2002) Neighbourhood selection for local modelling and prediction of hydrological time series. J Hydrol 258:40–57
Labat D, Ababou R, Mangin A (1999) Wavelet analysis in Karstic hydrology. Part 2: rainfall–runoff cross-wavelet analysis. Comptes Rendus de l’Academie des Sciences Series IIA Earth Planet Sci 329:881–887
Mallat S (1997) A wavelet tour of signal processing. Academic Press
McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley, New York
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77:257–286
Ronen O, Rohlicek JR, Ostendorf M (1995) Parameter estimation of dependence tree models using the EM algorithm. IEEE Signal Proc Lett 2:157–159
Saco P, Kumar P (2000) Coherent modes in multiscale variability of streamflow over the United States. Water Resour Res 36:1049–1068
Salas JD (1993) Analysis and modelling of hydrologic time series. In: Maidment DR (ed) Handbook of Hydrology, vol 19. McGraw-Hill, New York, pp 1–72
Sharma A, Lall U (1999) A nonparametric approach for daily rainfall simulation. Math Comput Simul 48:361–371
Smith LC, Turcotte DL, Isacks BL (1998) Stream flow characterization and feature detection using a discrete wavelet transform. Hydrol Processes 12:233–249
Sobol’ IM (1994) A primer for the Monte Carlo method. CRC Press
Unal NE, Aksoy H, Akar T (2004) Annual and monthly rainfall data generation schemes. Stoch Environ Res Risk Assess 18:245–257
Wilks DS (1998) Multisite generalization of a daily stochastic precipitation generation model. J Hydrol 210:178–191
Wong CS, Li WK (2000) On a mixture autoregressive model. J R Stat Soc B 62:91–115
Acknowledgements
The initial part of the work presented in this paper was carried out while the first author was in the Department of Civil Engineering of the University of Hong Kong and the second author was visiting. The third author would like to thank the Area of Excellence Scheme under the University Grants Committee of the Hong Kong Special Administrative Region, China (Project No. AoE/P-04/2004) for partial support.
Appendices
Appendix A
The algorithm provided below is used to estimate the parameters in a mixture of Gaussian distributions. It is a special case of the EM algorithm developed by Dempster, Laird and Rubin (1977). See also the book by Hamilton (1994).
Let \( v_{N,1}^{t}(i) \) be an arbitrary initial guess of \( v_{N,1}(i) \) under the probability rule \( \sum\nolimits_{i = 1}^{M} v_{N,1}^{t}(i) = 1 \) (here, the superscript t is an iteration counter). For i = 1, 2,…,M, we let
and
Then, a new estimate of the probability \( v_{N,1}(i) \) is obtained as:
for i = 1, 2,…,M. The iteration stops when the convergence criterion
(ε = 1.0E−6 in this study) is satisfied. Thus the estimate of the probability \( v_{N,1}(i) \) can be obtained.
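The displayed update formulas of this appendix are not reproduced here, so the following is only a minimal sketch of the textbook EM update for mixture weights, not necessarily the exact formulas of the paper. It assumes zero-mean Gaussian components with fixed standard deviations (the form of the density used in Eq. (39)) and that only the weights are re-estimated. The names `gaussian_pdf`, `estimate_mixture_weights`, `samples` and `sigmas`, and the toy data, are hypothetical.

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean normal density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def estimate_mixture_weights(samples, sigmas, tol=1.0e-6, max_iter=10_000):
    """EM for the weights of a Gaussian mixture whose components are fixed.

    Assumption: no guard against numerical underflow; every sample must have
    non-zero density under at least one component.
    """
    m = len(sigmas)
    weights = [1.0 / m] * m  # initial guess, sums to 1
    for _ in range(max_iter):
        resp_sums = [0.0] * m
        for x in samples:
            # E-step: posterior probability of each component for this sample.
            joint = [weights[i] * gaussian_pdf(x, sigmas[i]) for i in range(m)]
            total = sum(joint)
            for i in range(m):
                resp_sums[i] += joint[i] / total
        # M-step: the new weight is the average posterior probability.
        new_weights = [s / len(samples) for s in resp_sums]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:  # same kind of stopping rule as in the text
            break
    return weights


# Hypothetical data concentrated near zero, with a narrow and a wide component.
toy_samples = [0.01 * k for k in range(-20, 21)]
toy_weights = estimate_mixture_weights(toy_samples, sigmas=[0.1, 5.0])
```

Because the weights are renormalized averages of posterior probabilities, they sum to one after every iteration, so the probability rule stated above is preserved automatically.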
Appendix B
The EM algorithm for estimating the transition probabilities closely follows Crouse et al. (1998). The EM algorithm for dependence tree models first appeared in Chow and Liu (1968). Ronen et al. (1995) extended this work to the case in which some components of the tree are unobserved. However, their EM steps were derived only for discrete-valued random variables. Crouse et al. (1998) generalized the algorithm so that it can be applied to hidden Markov tree models, which have both discrete-valued and continuous-valued nodes. Here we slightly modify the algorithm of Crouse et al. (1998) so that some of the parameters (the means and variances) are held fixed. The steps of the EM algorithm for this model are as follows:
2.1 Initialization
Assign arbitrary initial values to the transition probabilities \( T_{k,n}^{t}(i,j) \) and the hidden state probabilities \( v_{k,n}^{y,t}(i) \) for the different wavelet trees (here, the superscript t is an iteration counter), under the probability rules \( \sum\nolimits_{i = 1}^{M} v_{k,n}^{y,t}(i) = 1 \) and \( \sum\nolimits_{i = 1}^{M} T_{k,n}^{t}(i,j) = 1 \) for any choice of j (since for any fixed j, the state variable \( S_{k,n}^{y} \) takes a value from {1,…,M}).
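One simple way to obtain starting values that satisfy both probability rules is to normalize random positive numbers. The sketch below is illustrative only: the function names, the use of Python's `random` module and the nested-list layout `T[i][j]` for \( T(i,j) \) are assumptions, not part of the original algorithm.

```python
import random


def random_probability_vector(m, rng):
    """Return m positive numbers that sum to 1."""
    raw = [rng.random() + 1.0e-3 for _ in range(m)]  # small offset avoids exact zeros
    total = sum(raw)
    return [r / total for r in raw]


def random_transition_matrix(m, rng):
    """Return T with T[i][j] = T(i, j) and sum over i of T[i][j] equal to 1 for every j."""
    columns = [random_probability_vector(m, rng) for _ in range(m)]
    return [[columns[j][i] for j in range(m)] for i in range(m)]


rng = random.Random(0)              # fixed seed so that runs are repeatable
v0 = random_probability_vector(3, rng)
T0 = random_transition_matrix(3, rng)
```

Note that the normalization runs over the first index i for each fixed j, so `T0` is column-stochastic rather than row-stochastic.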
2.2 Expectation step (E-step)
For each wavelet tree, i.e. for y = 1,…,Y, we apply the “upward-downward” algorithm:
A. The upward algorithm
1. Initialization: assign the values of β at the “leaves” of the wavelet trees. For i = 1, 2,…,M and \( n = 2^{N-2} + 1,\ldots,2^{N-1} \), let
$$ \beta_{1,n}^{y}(i) = \frac{1}{\sqrt{2\pi}\,\sigma_{i}}\exp\left( -\frac{(D_{1,n}^{y})^{2}}{2\sigma_{i}^{2}} \right). $$ (39)
2. Step upward: calculate all the values of β by the following formulas. For k = 2,…,N, \( n = 2^{N-k} + 1,\ldots,2^{N-k+1} \), l = 0, 1 and i = 1,…,M,
$$ \phi_{k-1,2n-l}^{y}(i) = \sum\limits_{j = 1}^{M} T_{k-1,2n-l}^{t}(j,i)\,\beta_{k-1,2n-l}^{y}(j), $$ (40)
$$ \beta_{k,n}^{y}(i) = \frac{1}{\sqrt{2\pi}\,\sigma_{i}}\exp\left( -\frac{(D_{k,n}^{y})^{2}}{2\sigma_{i}^{2}} \right)\prod\limits_{l = 0}^{1} \phi_{k-1,2n-l}^{y}(i), $$ (41)
and let
$$ \varphi_{k-1,2n-l}^{y}(i) = \frac{\beta_{k,n}^{y}(i)}{\phi_{k-1,2n-l}^{y}(i)}. $$ (42)
B. The downward algorithm
1. Initialization: to get the values of α at the “root” of the wavelet trees, we set for i = 1, 2,…,M
$$ \alpha_{N,1}^{y}(i) = v_{N,1}^{y,t}(i). $$ (43)
2. Step downward: obtain the remaining values of α. For k = N,…,2, \( n = 2^{N-k} + 1,\ldots,2^{N-k+1} \), l = 0, 1 and i = 1, 2,…,M,
$$ \alpha_{k-1,2n-l}^{y}(i) = \sum\limits_{j = 1}^{M} T_{k-1,2n-l}^{t}(i,j)\,\alpha_{k,n}^{y}(j)\,\varphi_{k-1,2n-l}^{y}(j). $$ (44)
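The upward and downward recursions of Eqs. (39)–(44) can be sketched for a single tree as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: nodes are stored in heap order (node 0 is the root and the children of node p are 2p + 1 and 2p + 2) instead of the (k, n) indexing used above, `trans[c][i][j]` holds \( T(i,j) \) for child node c, and all function names and numerical values are hypothetical.

```python
import math


def gaussian_pdf(x, sigma):
    """Zero-mean normal density, the form used in Eqs. (39) and (41)."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def upward_pass(coeffs, trans, sigmas):
    """Eqs. (39)-(42) on a heap-ordered complete binary tree."""
    n_nodes, m = len(coeffs), len(sigmas)
    beta = [None] * n_nodes
    phi = [None] * n_nodes     # Eq. (40), stored per child node
    varphi = [None] * n_nodes  # Eq. (42), stored per child node
    for p in range(n_nodes - 1, -1, -1):  # children are visited before parents
        children = [c for c in (2 * p + 1, 2 * p + 2) if c < n_nodes]
        for c in children:
            phi[c] = [sum(trans[c][j][i] * beta[c][j] for j in range(m))
                      for i in range(m)]
        # Eq. (39) at a leaf (empty product equals 1), Eq. (41) otherwise.
        beta[p] = [gaussian_pdf(coeffs[p], sigmas[i])
                   * math.prod(phi[c][i] for c in children)
                   for i in range(m)]
        for c in children:
            varphi[c] = [beta[p][i] / phi[c][i] for i in range(m)]
    return beta, phi, varphi


def downward_pass(root_prior, trans, varphi):
    """Eqs. (43)-(44): propagate alpha from the root to the leaves."""
    n_nodes, m = len(varphi), len(root_prior)
    alpha = [None] * n_nodes
    alpha[0] = list(root_prior)  # Eq. (43)
    for c in range(1, n_nodes):  # parents are visited before children
        p = (c - 1) // 2
        alpha[c] = [sum(trans[c][i][j] * alpha[p][j] * varphi[c][j] for j in range(m))
                    for i in range(m)]
    return alpha


# Hypothetical toy input: one 7-node tree with M = 2 hidden states.
coeffs = [0.3, -1.2, 0.8, 0.1, -0.4, 2.0, -0.05]
sigmas = [0.5, 2.0]
T = [[0.9, 0.2], [0.1, 0.8]]  # each column sums to 1
trans = [None] + [T] * 6      # the root has no incoming transition
beta, phi, varphi = upward_pass(coeffs, trans, sigmas)
alpha = downward_pass([0.6, 0.4], trans, varphi)
likelihoods = [sum(a * b for a, b in zip(alpha[n], beta[n])) for n in range(7)]
```

A useful sanity check follows directly from Eqs. (40), (42) and (44): the sum over i of α(i)β(i) takes the same value at every node of a tree, so all entries of `likelihoods` should agree up to rounding error.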
2.3 Maximization step (M-step)
A. Update the state probabilities in the wavelet trees
For k = 1,…,N, \( n = 2^{N-k-1} + 1,\ldots,2^{N-k} \) and i = 1,…,M, the new iterate is
$$ v_{k,n}^{y,t+1}(i) = \frac{\alpha_{k,n}^{y}(i)\,\beta_{k,n}^{y}(i)}{\sum\limits_{j = 1}^{M} \alpha_{k,n}^{y}(j)\,\beta_{k,n}^{y}(j)}. $$ (45)
B. Renew the transition probabilities
For k = 2,…,N, \( n = 2^{N-k} + 1,\ldots,2^{N-k+1} \), l = 0, 1 and i, j = 1,…,M, the following simplified equation (to facilitate computations) is used:
$$ T_{k-1,2n-l}^{t+1}(i,j) = \frac{1}{Y}\sum\limits_{y = 1}^{Y} \frac{T_{k-1,2n-l}^{t}(i,j)\,\beta_{k-1,2n-l}^{y}(i)}{\phi_{k-1,2n-l}^{y}(j)}. $$ (46)
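The two updates of Eqs. (45) and (46) can be written for a single node as in the sketch below. The function names, the nested-list layout `trans_c[i][j]` for \( T(i,j) \) and the toy numbers are assumptions made for illustration; `betas_c[y]` and `phis_c[y]` stand for the β and ϕ vectors of the same node in tree y.

```python
def update_state_probs(alpha, beta):
    """Eq. (45): normalized product of alpha and beta at one node of one tree."""
    joint = [a * b for a, b in zip(alpha, beta)]
    total = sum(joint)
    return [x / total for x in joint]


def update_transition(trans_c, betas_c, phis_c):
    """Eq. (46): average over the trees of T(i, j) * beta(i) / phi(j)."""
    m = len(trans_c)
    n_trees = len(betas_c)
    return [[sum(trans_c[i][j] * betas_c[y][i] / phis_c[y][j] for y in range(n_trees))
             / n_trees
             for j in range(m)]
            for i in range(m)]


# Hypothetical values for one child node observed in Y = 2 trees.
trans_c = [[0.9, 0.2], [0.1, 0.8]]
betas_c = [[1.0, 2.0], [0.5, 0.5]]
# phi from Eq. (40): phi(j) = sum over i of T(i, j) * beta(i).
phis_c = [[sum(trans_c[i][j] * b[i] for i in range(2)) for j in range(2)]
          for b in betas_c]
new_trans = update_transition(trans_c, betas_c, phis_c)
new_probs = update_state_probs([0.6, 0.4], [1.0, 3.0])
```

Because ϕ(j) in Eq. (40) is exactly the sum over i of T(i, j)β(i), each column of the updated matrix again sums to one, so the probability rule from the initialization step is preserved without an explicit renormalization.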
2.4 Convergence checking
For k = 1,…,N − 1, \( n = 2^{N-k-1} + 1,\ldots,2^{N-k} \) and 1 ≤ i, j ≤ M, set
$$ \varepsilon_{1} = \max \left( \left| T_{k,n}^{t+1}(i,j) - T_{k,n}^{t}(i,j) \right| \right). $$ (47)
For y = 1,…,Y, k = 1,…,N − 1, \( n = 2^{N-k-1} + 1,\ldots,2^{N-k} \) and 1 ≤ i ≤ M, set
$$ \varepsilon_{2} = \max \left( \left| v_{k,n}^{y,t+1}(i) - v_{k,n}^{y,t}(i) \right| \right). $$ (48)
Set \( \varepsilon = \max(\varepsilon_{1}, \varepsilon_{2}) \).
If ε < 1.0E−6 (the convergence criterion), we stop the algorithm and set all
$$ T_{k,n}(i,j) = T_{k,n}^{t+1}(i,j). $$ (49)
Otherwise, we set \( v_{k,n}^{y,t}(i) = v_{k,n}^{y,t+1}(i) \) and \( T_{k,n}^{t}(i,j) = T_{k,n}^{t+1}(i,j) \), and repeat the E-step and M-step until the convergence criterion is met.
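The stopping rule of Eqs. (47)–(49) amounts to iterating until the largest absolute change in any parameter falls below the tolerance. The sketch below shows that control flow only. The callback `em_step`, which stands for one E-step followed by one M-step, and the helper names are hypothetical, and the toy step function is a stand-in that merely contracts towards 0.5 rather than a real EM update.

```python
def max_abs_change(new, old):
    """Largest absolute elementwise difference between two equally shaped nested lists."""
    if isinstance(new, (int, float)):
        return abs(new - old)
    return max((max_abs_change(a, b) for a, b in zip(new, old)), default=0.0)


def run_until_converged(em_step, trans, state_probs, tol=1.0e-6, max_iter=1000):
    """Repeat em_step until epsilon = max(eps1, eps2) drops below tol."""
    for iteration in range(1, max_iter + 1):
        new_trans, new_probs = em_step(trans, state_probs)
        eps1 = max_abs_change(new_trans, trans)        # Eq. (47)
        eps2 = max_abs_change(new_probs, state_probs)  # Eq. (48)
        trans, state_probs = new_trans, new_probs
        if max(eps1, eps2) < tol:
            return trans, state_probs, iteration       # Eq. (49)
    raise RuntimeError("EM did not converge within max_iter iterations")


def toy_step(trans, state_probs):
    """Stand-in for one EM iteration: halve the distance of every entry to 0.5."""
    return ([[(x + 0.5) / 2.0 for x in row] for row in trans],
            [(x + 0.5) / 2.0 for x in state_probs])


final_trans, final_probs, n_iter = run_until_converged(
    toy_step, [[0.9, 0.2], [0.1, 0.8]], [0.7, 0.3])
```

The iteration cap is an addition for safety and is not part of the procedure described in the text.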
Cite this article
Jayawardena, A.W., Xu, P.C. & Li, W.K. Rainfall data simulation by hidden Markov model and discrete wavelet transformation. Stoch Environ Res Risk Assess 23, 863–877 (2009). https://doi.org/10.1007/s00477-008-0264-0
DOI: https://doi.org/10.1007/s00477-008-0264-0