
Rainfall data simulation by hidden Markov model and discrete wavelet transformation

  • Original Paper
  • Published in Stochastic Environmental Research and Risk Assessment

Abstract

In many regions, monthly (or bimonthly) rainfall data can be considered deterministic, while daily rainfall data may be treated as random. As a result, deterministic models may not fit daily data well because of their strong stochastic nature, while stochastic models may not reliably fit daily rainfall time series because of the deterministic nature at the large (i.e. coarse) scale. Although there are different approaches for simulating daily rainfall, mixing deterministic and stochastic models (towards a possible representation of both deterministic and stochastic properties) has not hitherto been proposed. An attempt is made in this study to simulate daily rainfall data by combining the discrete wavelet transformation and a hidden Markov model. We use a deterministic model to obtain the large-scale data and a stochastic model to simulate the wavelet tree coefficients; the simulated daily rainfall is then obtained by inverse transformation. We compare the accumulated simulated and accumulated observed data from the Chao Phraya Basin in Thailand. Because of the stochastic nature at the small scale, the simulated daily rainfall shows deviations from the observed data in a point-to-point comparison. However, the accumulated simulated data show some level of agreement with the observed data.



References

  • Abarbanel HDI (1996) Analysis of observed chaotic data. Springer, New York

  • Aksoy H, Bayazit M (2000) A model for daily flows of intermittent streams. Hydrol Processes 14:1725–1744

  • Aksoy H, Unal NE (2007) Discussion of ‘Comparison of two nonparametric alternatives for stochastic generation of monthly rainfall’ by R. Srikanthan et al. ASCE J Hydrol Eng 12:699–702

  • Bayazit M, Aksoy H (2001) Using wavelets for data generation. J Appl Stat 28:157–166

  • Bayazit M, Onoz B, Aksoy H (2001) Nonparametric streamflow simulation by wavelet or Fourier analysis. Hydrol Sci J 46:623–634

  • Chapman TG (1994) Stochastic models of daily rainfall. In: Proceedings of water down under 94, vol 3. National Conference Publications, Institution of Engineers, Canberra, pp 7–12

  • Chipman HA, Kolaczyk ED, McCulloch RE (1997) Adaptive Bayesian wavelet shrinkage. J Am Stat Assoc 92:1413–1421

  • Chow CK, Liu CN (1968) Approximating discrete probability distributions with dependence trees. IEEE Trans Inform Theory 14:462–467

  • Coulibaly P, Anctil F, Bobee B (2000) Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J Hydrol 230:244–257

  • Crouse MS, Nowak RD, Baraniuk RG (1998) Wavelet-based statistical signal processing using hidden Markov models. IEEE Trans Signal Process 46:886–902

  • Daubechies I (1992) Ten lectures on wavelets. SIAM, New York

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1–38

  • Haar A (1910) Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen 69:331–371

  • Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton

  • Jayawardena AW, Li WK, Xu P (2002) Neighbourhood selection for local modelling and prediction of hydrological time series. J Hydrol 258:40–57

  • Labat D, Ababou R, Mangin A (1999) Wavelet analysis in Karstic hydrology. Part 2: rainfall–runoff cross-wavelet analysis. Comptes Rendus de l’Academie des Sciences Series IIA Earth Planet Sci 329:881–887

  • Mallat S (1997) A wavelet tour of signal processing. Academic Press

  • McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley, New York

  • Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE 77:257–286

  • Ronen O, Rohlicek JR, Ostendorf M (1995) Parameter estimation of dependence tree models using the EM algorithm. IEEE Signal Proc Lett 2:157–159

  • Saco P, Kumar P (2000) Coherent modes in multiscale variability of streamflow over the United States. Water Resour Res 36:1049–1068

  • Salas JD (1993) Analysis and modelling of hydrologic time series. In: Maidment DR (ed) Handbook of hydrology, vol 19. McGraw-Hill, New York, pp 1–72

  • Sharma A, Lall U (1999) A nonparametric approach for daily rainfall simulation. Math Comput Simul 48:361–371

  • Smith LC, Turcotte DL, Isacks BL (1998) Stream flow characterization and feature detection using a discrete wavelet transform. Hydrol Processes 12:233–249

  • Sobol’ IM (1994) A primer for the Monte Carlo method. CRC Press

  • Unal NE, Aksoy H, Akar T (2004) Annual and monthly rainfall data generation schemes. Stoch Environ Res Risk Assess 18:245–257

  • Wilks DS (1998) Multisite generalization of a daily stochastic precipitation generation model. J Hydrol 210:178–191

  • Wong CS, Li WK (2000) On a mixture autoregressive model. J R Stat Soc B 62:91–115


Acknowledgements

The initial part of the work presented in this paper was carried out while the first author was in the Department of Civil Engineering of the University of Hong Kong and the second author was visiting. The third author would like to thank the Area of Excellence Scheme under the University Grants Committee of the Hong Kong Special Administrative Region, China (Project No. AoE/P-04/2004) for partial support.

Author information

Corresponding author

Correspondence to A. W. Jayawardena.

Appendices

Appendix A

The algorithm given below is used to estimate the parameters in a mixture of Gaussian distributions. It is a special case of the EM algorithm developed by Dempster, Laird and Rubin (1977); see also Hamilton (1994).

Let \( v_{N,1}^{t} (i) \) be an arbitrary initial guess of \( v_{N,1}(i) \), subject to the probability rule \( \sum\nolimits_{i = 1}^{M} {v_{N,1}^{t} (i)} = 1 \) (the superscript t is an iteration counter). For i = 1, 2,…,M, we let

$$ \alpha_{N,1} (i) = v_{N,1}^{t} (i) $$
(35)

and

$$ \beta_{N,1} (i) = \frac{1}{{\sqrt {2\pi } \sigma_{i} }}\exp \left( { - \frac{{D_{N,1}^{2} }}{{2\sigma_{i}^{2} }}} \right). $$
(36)

Then, a new estimate of the probability \( v_{N,1}(i) \) is obtained as:

$$ v_{N,1}^{t + 1} (i) = \frac{{\alpha_{N,1} (i)\beta_{N,1} (i)}}{{\sum\limits_{j = 1}^{M} {\alpha_{N,1} (j)\beta_{N,1} (j)} }} $$
(37)

for i = 1, 2,…,M. The iteration stops when the convergence criterion

$$ \sum\limits_{i = 1}^{M} {\left| {v_{N,1}^{t + 1} (i) - v_{N,1}^{t} (i)} \right|} < \varepsilon $$
(38)

(ε = 1.0E − 6 in this study) is satisfied. This yields the estimate of the probability \( v_{N,1}(i) \).
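For concreteness, the iteration of Eqs. (35)–(38) can be sketched in a few lines of Python. This is an illustrative sketch, not code from the paper; the function name, argument names and the toy inputs are our own, and the mixture standard deviations σ_i are taken as fixed, as in the model.

```python
import numpy as np

def estimate_root_probs(d, sigmas, v0, eps=1e-6, max_iter=1000):
    """Iterate Eqs. (35)-(38) to estimate the root-state probabilities
    v_{N,1}(i).  d is the root wavelet coefficient D_{N,1}; sigmas are
    the fixed mixture standard deviations; v0 is the initial guess."""
    v = np.asarray(v0, dtype=float)
    v = v / v.sum()                                   # enforce sum_i v(i) = 1
    # beta_{N,1}(i): Gaussian likelihood of d under component i, Eq. (36)
    beta = np.exp(-d**2 / (2.0 * sigmas**2)) / (np.sqrt(2.0 * np.pi) * sigmas)
    for _ in range(max_iter):
        alpha = v                                     # Eq. (35)
        v_new = alpha * beta / np.sum(alpha * beta)   # Eq. (37)
        if np.abs(v_new - v).sum() < eps:             # Eq. (38)
            return v_new
        v = v_new
    return v

# toy usage: two components with fixed standard deviations
v = estimate_root_probs(d=0.5, sigmas=np.array([1.0, 2.0]),
                        v0=np.array([0.5, 0.5]))
```

Note that with a single coefficient at the root, repeated application of Eq. (37) concentrates the weight on the component whose likelihood \( \beta_{N,1}(i) \) is largest, since the ratio of any two weights is multiplied by the fixed factor β(i)/β(j) at every iteration.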

Appendix B

The EM algorithm for estimating the transition probabilities closely follows Crouse et al. (1998). The EM algorithm for dependence tree models first appeared in Chow and Liu (1968). Ronen et al. (1995) extended this work to allow some components of the tree to be unobserved; however, their EM steps were derived only for discrete-valued random variables. Crouse et al. (1998) generalized the algorithm to hidden Markov tree models, which contain both discrete- and continuous-valued nodes. We slightly modify the algorithm presented by Crouse et al. so that some of the parameters (the means and variances) are held fixed. The steps of the EM algorithm for this model are as follows:

2.1 Initialization

Given arbitrary initial assignments of the transition probabilities \( T_{k,n}^{t}(i,j) \) and hidden state probabilities \( v_{k,n}^{y,t}(i) \) for the different wavelet trees (the superscript t is an iteration counter):

$$T_{k,n}^{t}(i,j): \text{for}\;1 \le i,\;j \le M,\; k=1,\ldots, N-1,\;\text{and}\;n=2^{N-k-1}+1,\ldots,2^{N-k}.$$
$$v_{k,n}^{y,t}(i): \text{for}\;1 \le i \le M,\; k=1,\ldots, N,\;n=2^{N-k-1}+1,\ldots,2^{N-k}\;\text{and}\;y=1,\ldots,Y.$$

under the probability rules: \( \sum\nolimits_{i = 1}^{M} {v_{k,n}^{y,t} (i) = 1} \) and \( \sum\nolimits_{i = 1}^{M} {T_{k,n}^{t} (i,j)} = 1 \) for any choice of j (since for any fixed j, the state variable \( S_{k,n}^{y} \) takes a value from {1,…,M}).
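Any tables satisfying these two rules are admissible starting points. As an illustration (not from the paper; the uniform-random choice and the function name are ours), one only needs each column to be a probability vector:

```python
import numpy as np

def random_stochastic(M, rng):
    """Draw an M x M table T(i, j) whose columns each sum to 1,
    matching the rule sum_i T(i, j) = 1 for every fixed j."""
    T = rng.random((M, M))
    return T / T.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
T0 = random_stochastic(3, rng)          # an initial transition table
v0 = random_stochastic(3, rng)[:, 0]    # an initial state-probability vector
```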

2.2 Expectation step (E-step)

For each wavelet tree, i.e. for y = 1,…,Y, we apply the “upward-downward” algorithm:

A. The upward algorithm

    1. Initialization: assign the values of β at the “leaves” of the wavelet trees. For i = 1, 2,…,M and \( n = 2^{N - 2} + 1, \ldots ,2^{N - 1} \), let

      $$ \beta_{1,n}^{y} (i) = \frac{1}{{\sqrt {2\pi } \sigma_{i} }}\exp \left( { - \frac{{(D_{1,n}^{y} )^{2} }}{{2\sigma_{i}^{2} }}} \right) . $$
      (39)
    2. Step upward: calculate all the values of β by the following formulas. For k = 2,…,N, \( n = 2^{N - k - 1} + 1, \ldots ,2^{N - k} \), l = 0, 1 and i = 1,…,M,

      $$ \phi_{k - 1,2n - l}^{y} (i) = \sum\limits_{j = 1}^{M} {T_{k - 1,2n - l}^{t} (j,i)\beta_{k - 1,2n - l}^{y} (j)} , $$
      (40)
      $$ \beta_{k,n}^{y} (i) = \frac{1}{{\sqrt {2\pi } \sigma_{i} }}\exp \left( { - \frac{{(D_{k,n}^{y} )^{2} }}{{2\sigma_{i}^{2} }}} \right)\prod\limits_{l = 0}^{1} {\phi_{k - 1,2n - l}^{y} (i)} ,\,{\text{and}}\,{\text{let}} $$
      (41)
      $$ \varphi_{k - 1,2n - l}^{y} (i) = \frac{{\beta_{k,n}^{y} (i)}}{{\phi_{k - 1,2n - l}^{y} (i)}}. $$
      (42)
B. The downward algorithm

    1. Initialization: to obtain the values of α at the “root” of the wavelet trees, we set, for i = 1, 2,…,M,

      $$ \alpha_{N,1}^{y} (i) = v_{N,1}^{y,t} (i). $$
      (43)
    2. Step downward: obtain the remaining values of α. For k = N,…,2, \( n = 2^{N - k - 1} + 1, \ldots ,2^{N - k} \), l = 0, 1 and i = 1, 2,…,M,

      $$ \alpha_{k - 1,2n - l}^{y} (i) = \sum\limits_{j = 1}^{M} {T_{k - 1,2n - l}^{t} (i,j)\alpha_{k,n}^{y} (j)\varphi_{k - 1,2n - l}^{y} (j)} . $$
      (44)

2.3 Maximization step (M-step)

A. Update the state probabilities in the wavelet trees

    For k = 1,…,N, \( n = 2^{N - k - 1} + 1, \ldots ,2^{N - k} \) and i = 1,…,M, the new iterate is

    $$ v_{k,n}^{y,t + 1} (i) = \frac{{\alpha_{k,n}^{y} (i)\beta_{k,n}^{y} (i)}}{{\sum\limits_{j = 1}^{M} {\alpha_{k,n}^{y} (j)\beta_{k,n}^{y} (j)} }}. $$
    (45)
B. Renew the transition probabilities

    For k = 2,…,N, \( n = 2^{N - k - 1} + 1, \ldots ,2^{N - k} \), l = 0, 1 and i = 1,…,M, the following simplified equation (to facilitate computation) is used:

    $$ T_{k - 1,2n - l}^{t + 1} (i,j) = \frac{1}{Y}\sum\limits_{y = 1}^{Y} {\frac{{T_{k - 1,2n - l}^{t} (i,j)\beta_{k - 1,2n - l}^{y} (i)}}{{\phi_{k - 1,2n - l}^{y} (j)}}} . $$
    (46)

2.4 Convergence checking

  • For k = 1,…,N − 1, \( n = 2^{N - k - 1} + 1, \ldots ,2^{N - k} \) and 1 ≤ i, j ≤ M, set

    $$ \varepsilon_{1} = \max \left(\left|T_{k,n}^{t + 1} (i,j) - T_{k,n}^{t} (i,j)\right|\right). $$
    (47)
  • For y = 1,…,Y, k = 1,…,N − 1, \( n = 2^{N - k - 1} + 1, \ldots ,2^{N - k} \) and 1 ≤ i ≤ M, set

    $$ \varepsilon_{2} = \max \left(\left|v_{k,n}^{y,t + 1} (i) - v_{k,n}^{y,t} (i)\right|\right). $$
    (48)
  • Set ε = max (ε 1, ε 2).

    If ε < 1.0E − 6 (convergence criterion), then we STOP the algorithm and set all

    $$ T_{k,n} (i,j) = T_{k,n}^{t + 1} (i,j). $$
    (49)

Otherwise, we set \( v_{k,n}^{y,t} (i) = v_{k,n}^{y,t + 1} (i) \) and \( T_{k,n}^{t} (i,j) = T_{k,n}^{t + 1} (i,j), \) and repeat the EM steps until the convergence criterion is met.
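The upward-downward E-step and the M-step above can be sketched in Python for a single wavelet tree (Y = 1). This is an illustrative adaptation rather than the authors' code: for simplicity we index a complete binary tree as n = 1,…,2^(N−k) at scale k, with the children of (k, n) being (k − 1, 2n − 1) and (k − 1, 2n), instead of the paper's coefficient numbering; all function and variable names are ours, and the means are fixed at zero with fixed standard deviations, as in the model.

```python
import numpy as np

def nodes(N, k):
    # node indices at scale k in a complete binary tree
    # (root: k = N, n = 1; children of (k, n): (k-1, 2n-1) and (k-1, 2n))
    return range(1, 2**(N - k) + 1)

def gauss(d, sigmas):
    # per-state Gaussian likelihoods of a coefficient d, cf. Eq. (39)
    return np.exp(-d**2 / (2 * sigmas**2)) / (np.sqrt(2 * np.pi) * sigmas)

def e_step(D, T, v_root, sigmas, N):
    """Upward-downward pass, Eqs. (39)-(44), for one wavelet tree.
    D maps (k, n) -> coefficient; T maps a child node (k, n) to its
    M x M table T(i, j) with unit column sums; v_root is v_{N,1}."""
    M = len(sigmas)
    beta, phi, varphi, alpha = {}, {}, {}, {}
    for n in nodes(N, 1):                       # leaves, Eq. (39)
        beta[1, n] = gauss(D[1, n], sigmas)
    for k in range(2, N + 1):                   # upward, Eqs. (40)-(42)
        for n in nodes(N, k):
            prod = np.ones(M)
            for l in (0, 1):
                c = (k - 1, 2 * n - l)
                phi[c] = T[c].T @ beta[c]       # sum_j T(j, i) beta(j)
                prod *= phi[c]
            beta[k, n] = gauss(D[k, n], sigmas) * prod
            for l in (0, 1):
                c = (k - 1, 2 * n - l)
                varphi[c] = beta[k, n] / phi[c]
    alpha[N, 1] = v_root                        # root, Eq. (43)
    for k in range(N, 1, -1):                   # downward, Eq. (44)
        for n in nodes(N, k):
            for l in (0, 1):
                c = (k - 1, 2 * n - l)
                alpha[c] = T[c] @ (alpha[k, n] * varphi[c])
    return alpha, beta, phi

def m_step(alpha, beta, phi, T):
    """State-probability and transition updates, Eqs. (45)-(46), Y = 1."""
    v = {kn: a * beta[kn] / np.sum(a * beta[kn]) for kn, a in alpha.items()}
    T_new = {c: T[c] * np.outer(beta[c], 1.0 / phi[c]) for c in T}
    return v, T_new
```

A useful sanity check on the simplified update, Eq. (46): each renewed transition table keeps unit column sums, because \( \sum\nolimits_{i} T(i,j)\beta(i) = \phi(j) \) by Eq. (40).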

About this article

Cite this article

Jayawardena, A.W., Xu, P.C. & Li, W.K. Rainfall data simulation by hidden Markov model and discrete wavelet transformation. Stoch Environ Res Risk Assess 23, 863–877 (2009). https://doi.org/10.1007/s00477-008-0264-0
