A hidden Markov approach to the analysis of space–time environmental data with linear and circular components

  • Original Paper
Stochastic Environmental Research and Risk Assessment

Abstract

The analysis of bivariate space–time series with linear and circular components is complicated by (1) multiple correlations, across time, across space and between variables, (2) the different supports on which the variables are observed, the real line and the circle, and (3) the periodic nature of circular data. We describe a multivariate hidden Markov model that accommodates these features of the data within a single framework. The model integrates a circular von Mises Markov field and a Gaussian Markov field, with parameters that evolve in time according to a latent (hidden) Markov chain. It describes the data by means of a finite number of time-varying latent regimes, associated with easily interpretable components of large-scale and small-scale spatial variation. It can be estimated by a computationally feasible expectation–maximization algorithm. In a case study of sea currents in the Northern Adriatic Sea, it provides a parsimonious representation of the sea surface in terms of alternating environmental states.

References

  • Bertotti L, Cavalieri L (2009) Wind and wave predictions in the Adriatic Sea. J Mar Syst 78:S227–S234

  • Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3–4):561–575

  • Bulla J, Lagona F, Maruotti A, Picone M (2012) A multivariate hidden Markov model for the identification of sea regimes from incomplete skewed and circular time series. J Agric Biol Environ Stat 17(4):544–567. doi:10.1007/s13253-012-0110-1

  • Cappé O, Moulines E, Rydén T (2005) Inference in hidden Markov models. Springer, Berlin

  • Carnicero JA, Ausín MC, Wiper MP (2013) Non-parametric copulas for circular–linear and circular–circular data: an application to wind directions. Stoch Environ Res Risk Assess 27(8):1991–2002. doi:10.1007/s00477-013-0733-y

  • Cosoli S, Mazzoldi A, Gacic M (2010) Validation of surface current measurements in the northern Adriatic Sea from high frequency radars. J Atmos Ocean Technol 27:908–919

  • Cosoli S, Gacic M, Mazzoldi A (2012) Surface current variability and wind influence in the northeastern Adriatic Sea as observed from high-frequency (HF) radar measurements. Cont Shelf Res 33:1–13

  • Faltinsen O (1990) Sea loads on ships and offshore structures. Cambridge University Press, Cambridge

  • Fisher N, Lee A (1983) A correlation coefficient for circular data. Biometrika 70(2):327–332

  • Fisher N, Lee A (1992) Regression models for an angular response. Biometrics 48:665–677

  • Gaetan C, Guyon X (2010) Spatial statistics and modelling. Springer, Berlin

  • García-Portugués E, Barros AM, Crujeiras RM, González-Manteiga W, Pereira J (2013a) A test for directional-linear independence, with applications to wildfire orientation and size. Stoch Environ Res Risk Assess 1–15. doi:10.1007/s00477-013-0819-6

  • García-Portugués E, Crujeiras R, González-Manteiga W (2013b) Exploring wind direction and SO2 concentration by circular–linear density estimation. Stoch Environ Res Risk Assess 27(5):1055–1067

  • Griffith DA, Lagona F (1998) On the quality of likelihood-based estimators in spatial autoregressive models when the data dependence structure is misspecified. J Stat Plan Inference 69(1):153–174

  • Huang G, Wing-Keung Law A, Huang Z (2011) Wave-induced drift of small floating objects in regular waves. Ocean Eng 38:712–718

  • Ingrassia S, Rocci R (2011) Degeneracy of the EM algorithm for the MLE of multivariate Gaussian mixtures and dynamic constraints. Comput Stat Data Anal 55:1715–1725

  • Jin KR, Ji ZG (2004) Case study: modeling of sediment transport and wind-wave impact in Lake Okeechobee. J Hydraul Eng 130:1055–1067

  • Jona Lasinio G, Lagona F (2002) Selection of the neighborhood structure for space–time Markov random field models. Stat Methods Appl 11(3):293–311

  • Jona Lasinio G, Gelfand A, Jona-Lasinio M (2012) Spatial analysis of wave direction data using wrapped Gaussian processes. Ann Appl Stat 6:1478–1498

  • Kato S, Shimizu K (2008) Dependent models for observations which include angular ones. J Stat Plan Inference 138(11):3538–3549, special issue in Honor of Junjiro Ogawa (1915–2000): design of experiments, multivariate analysis and statistical inference

  • Lagona F (2001) Parametric restrictions in random fields for binary space–time data series. Metron-Int J Stat 59(1–2):72–96

  • Lagona F (2002) Adjacency selection in Markov random fields for high spatial resolution hyperspectral data. J Geogr Syst 4(1):53–68

  • Lagona F, Picone M (2013) Maximum likelihood estimation of bivariate circular hidden Markov models from incomplete data. J Stat Comput Simul 83:1223–1237

  • Lagona F, Maruotti A, Picone M (2011) A non-homogeneous hidden Markov model for the analysis of multi-pollutant exceedances data. In: Dymarsky P (ed) Hidden Markov models, theory and applications, InTech, Chap 10, pp 207–222

  • Lee A (2010) Circular data. Wiley Interdiscip Rev Comput Stat 2(4):477–486

  • Mardia K, Voss J (2011) Some fundamental properties of multivariate von Mises distributions. arXiv:1109.6042v1

  • Mardia KV, Hughes G, Taylor CC, Singh H (2008) A multivariate von Mises distribution with applications to bioinformatics. Can J Stat 36(1):99–109

  • McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York

  • Mihanovic H, Cosoli S, Vilibic I, Ivankovic D, Dadic V, Gacic M (2011) Surface current patterns in the northern Adriatic extracted from high frequency radar data using self organizing map analysis. J Geophys Res 116(C08):033

  • Modlin D, Fuentes M, Reich B (2012) Circular conditional autoregressive modeling of vector fields. Environmetrics 23(1):46–53. doi:10.1002/env.1133

  • Pleskachevsky A, Eppel D, Kapitza H (2009) Interaction of waves, currents and tides, and wave-energy impact on the beach area of Sylt Island. Ocean Dyn 59:451–461

  • Poulain P, Kourafalou M, Cushman-Roisin B (2001) Northern Adriatic Sea. In: Cushman-Roisin B et al (eds) Physical oceanography of the Adriatic Sea. Kluwer Academic, Dordrecht, pp 143–165

  • Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman & Hall, London

  • Singh H, Hnizdo V, Demchuk E (2002) Probabilistic model for two dependent circular variables. Biometrika 89(3):719–723. doi:10.1093/biomet/89.3.719

  • Visser I, Raijmakers M, Molenaar P (2000) Confidence intervals for hidden Markov model parameters. Br J Math Stat Psychol 53:317–327

  • Visser I, Raijmakers MEJ, Molenaar PCM (2002) Fitting hidden Markov models to psychological data. Sci Program 10:185–199

  • Wang F (2013) Space and space–time modeling of directional data. PhD thesis, Dept Statistical Sciences, Duke University

  • Wang F, Gelfand AE (2013) Directional data analysis under the general projected normal distribution. Stat Methodol 10(1):113–127

  • Wang F, Gelfand AE, Jona Lasinio G (2014) Joint spatio-temporal analysis of a linear and a directional variable: space-time modeling of wave heights and wave directions in the Adriatic Sea. Stat Sin. doi:10.5705/ss.2013.204w

  • Wu C (1983) On the convergence properties of the EM algorithm. Ann Stat 11:95–103

  • Zucchini W, Guttorp P (1991) A hidden Markov model for space–time precipitation. Water Resour Res 27:1917–1923

Author information

Corresponding author

Correspondence to Francesco Lagona.

Appendix

For \(n>2\), the normalizing constant of the multivariate von Mises density \(f(\varvec{x}_t; \varvec{\theta }^{\mathrm{circ}}_k)\) is unknown. However, the conditional log-likelihood \(l_k(\varvec{\theta }^{\mathrm{circ}}\mid \varvec{x}_t)=\log f(\varvec{x}_t; \varvec{\theta }^{\mathrm{circ}}_k)\) is well approximated about its maximum point by the pseudo-loglikelihood

$$\begin{aligned} \mathrm{pl}_{k}(\varvec{\theta }^{\mathrm{circ}}\mid \varvec{x}_t)=\sum _{i=1}^{n}\log f_{\mathrm{vm}}(x_{it};\nu _{ik},\kappa _{ik}), \end{aligned}$$

obtained by taking the logarithm of the product of the conditional von Mises distributions (Mardia et al. 2008). We take advantage of this result to compute the posterior probabilities of the Markov chain

$$\begin{aligned} \hat{p}_{tk}(\hat{\varvec{\theta }}_s)&={\mathbb {E}}(\xi _{tk}\mid \varvec{z},\hat{\varvec{\theta }}_s) \\ \hat{p}_{t-1,t,hk}(\hat{\varvec{\theta }}_s)&={\mathbb {E}}(\xi _{t-1,h}\xi _{tk}\mid \varvec{z},\hat{\varvec{\theta }}_s) \end{aligned}$$
(14)

from the estimate \(\hat{\varvec{\theta }}_s\) provided by the \(s\)th step of the EM algorithm. This task is generally referred to as HMM smoothing. It is typically solved by expressing the posterior probabilities in terms of suitably normalized functions that can be computed recursively, which avoids both impractical summations over the state space of the latent Markov chain and numerical under- and overflows. In the literature, this approach is known as the Forward-Backward (FB) recursion, and it can be implemented in a number of different ways (Cappé et al. 2005, ch. 3). We describe below the FB recursion used in this paper.
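
Before turning to the recursion, the pseudo-loglikelihood introduced at the start of this appendix can be illustrated with a minimal Python sketch. It assumes that the conditional mean directions \(\nu _{ik}\) and concentrations \(\kappa _{ik}\) have already been computed for each site and are passed in as arrays; how they depend on neighbouring sites is not reproduced here. The function names `von_mises_logpdf` and `pseudo_loglik` are hypothetical and not taken from the paper.

```python
import numpy as np

def von_mises_logpdf(x, nu, kappa):
    """Log-density of a von Mises law with mean direction nu and concentration kappa."""
    # log f_vm(x; nu, kappa) = kappa * cos(x - nu) - log(2 * pi * I0(kappa)),
    # where I0 is the modified Bessel function of the first kind, order 0.
    return kappa * np.cos(x - nu) - np.log(2.0 * np.pi * np.i0(kappa))

def pseudo_loglik(x, nu, kappa):
    """Sum over the n sites of the conditional von Mises log-densities."""
    return float(np.sum(von_mises_logpdf(x, nu, kappa)))

# Illustrative (made-up) values for n = 3 sites under one latent state.
x = np.array([0.3, 1.2, 5.9])        # observed angles in radians
nu = np.array([0.5, 1.0, 6.0])       # assumed conditional mean directions
kappa = np.array([2.0, 1.5, 3.0])    # assumed conditional concentrations
pl = pseudo_loglik(x, nu, kappa)
```

Because each term uses only a univariate von Mises density, the unknown multivariate normalizing constant never has to be evaluated.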

Let

$$\begin{aligned} L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_s,\hat{\varvec{\theta }}^{\mathrm{lin}}_s)= N(\varvec{y}_t;\varvec{\mu }_k,\sigma ^2_k(\varvec{I}-\rho _k\varvec{I})^{-1})\prod _{i=1}^{n} f_{\mathrm{vm}}(x_{it};\nu _{iks},\kappa _{iks}), \end{aligned}$$

be the conditional contribution of the bivariate spatial series \(\varvec{z}_{t}\) to the likelihood function, under state \(k\). In addition, let

$$\begin{aligned} L_{0:t}(\hat{\varvec{\theta }}_s)=\sum _{\varvec{\xi }_0}\ldots \sum _{\varvec{\xi }_t} p(\varvec{\xi }_{0:t};\hat{\varvec{\pi }}_s) \prod _{\tau =0}^{t}\prod _{k=1}^{K}\left( L_{\tau k}(\hat{\varvec{\theta }}^{\mathrm{circ}}_s,\hat{\varvec{\theta }}^{\mathrm{lin}}_s)\right) ^{\xi _{\tau k}} \end{aligned}$$

be the contribution of the profiles observed at times \(0,\ldots ,t\) to the likelihood, and let \(\varvec{z}_{0:t}\) be the space–time series observed up to time \(t\). We run a forward and a backward iteration.
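
To see why a recursion is needed, the definition of \(L_{0:t}\) can be evaluated literally by enumerating every state path, which costs \(K^{t+1}\) terms. The sketch below does this for a toy example. It assumes a homogeneous chain with initial probabilities `pi0`, a transition matrix `P` with `P[h, k]` playing the role of \(\hat{\pi }_{hks}\), and a precomputed array `L` with `L[t, k]` playing the role of \(L_{tk}\); these names and the numerical values are illustrative assumptions.

```python
import itertools
import numpy as np

def brute_force_likelihood(pi0, P, L):
    """Sum p(path) * prod_tau L[tau, state_tau] over all K**(t+1) state paths."""
    n_times, K = L.shape
    total = 0.0
    for path in itertools.product(range(K), repeat=n_times):
        # Probability of the path under the Markov chain, times the
        # state-specific contributions selected by the path.
        prob = pi0[path[0]] * L[0, path[0]]
        for tau in range(1, n_times):
            prob *= P[path[tau - 1], path[tau]] * L[tau, path[tau]]
        total += prob
    return total

# Toy example with K = 2 states and times 0, 1, 2 (made-up numbers).
pi0 = np.array([0.6, 0.4])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
L = np.array([[0.50, 0.10],
              [0.40, 0.30],
              [0.05, 0.70]])
value = brute_force_likelihood(pi0, P, L)
```

The enumeration is only feasible for very short series, which motivates the recursive computation.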

During the forward iteration, we exploit the output of the \(s\)th step of the EM algorithm to compute the probabilities \(\psi ^{(t)}(k)=P(\xi _{tk}=1|\varvec{z}_{0:t-1})\), the likelihood ratios \(c_t=\frac{L_{0:t}(\hat{\varvec{\theta }}_s)}{L_{0:t-1}(\hat{\varvec{\theta }}_s)}\) and the forward probabilities \(\bar{\alpha }_t(k)=P(\xi _{tk}=1|\varvec{z}_{0:t})\), as follows.

Forward recursion:

  • initialization:

    $$\begin{aligned} \psi ^{(0)}(k)&=\hat{\pi }_{ks} \\ c_0&=\sum _{k=1}^{K}\psi ^{(0)}(k)L_{0k}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s}) \\ \bar{\alpha }_0(k)&=\frac{\psi ^{(0)}(k)L_{0k}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})}{c_0} \end{aligned}$$

  • for \(t=1, \ldots , T\)

    $$\begin{aligned} \psi ^{(t)}(k)&=\sum _{h=1}^{K}\bar{\alpha }_{t-1}(h)\hat{\pi }_{hks} \\ c_t&=\sum _{k=1}^{K}\psi ^{(t)}(k)L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s}) \\ \bar{\alpha }_t(k)&=\frac{\psi ^{(t)}(k)L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})}{c_t} \end{aligned}$$

At the end of the forward recursion, we store the values \(c_0, \ldots , c_T\) and \(\bar{\alpha }_0(k), \ldots , \bar{\alpha }_T(k)\). The sequence \(c_0, \ldots , c_T\) can be used to compute the value taken by the log-likelihood at the \(s\)th step of the EM algorithm, as follows:

$$\begin{aligned} \log L(\hat{\varvec{\theta }}_s)=\sum _{t=0}^{T}\log c_t. \end{aligned}$$
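
The forward recursion and the log-likelihood formula can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the state-specific contributions \(L_{tk}\) have already been evaluated and stored in an array `L` with `L[t, k]`, that `pi0[k]` holds the initial probabilities, and that `P[h, k]` holds the transition probabilities. The function name `forward` and the toy numbers are hypothetical.

```python
import numpy as np

def forward(pi0, P, L):
    """Scaled forward pass returning alpha_bar (n_times x K) and c (n_times)."""
    n_times, K = L.shape
    alpha = np.zeros((n_times, K))
    c = np.zeros(n_times)
    psi = pi0                          # psi^(0)(k) = initial probability of state k
    for t in range(n_times):
        if t > 0:
            psi = alpha[t - 1] @ P     # psi^(t)(k) = sum_h alpha_{t-1}(h) * pi_hk
        weighted = psi * L[t]          # psi^(t)(k) * L_tk
        c[t] = weighted.sum()          # ratio L_{0:t} / L_{0:t-1}
        alpha[t] = weighted / c[t]     # P(state k at time t | z_{0:t})
    return alpha, c

# Toy example with K = 2 states and times 0, 1, 2 (made-up numbers).
pi0 = np.array([0.6, 0.4])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
L = np.array([[0.50, 0.10],
              [0.40, 0.30],
              [0.05, 0.70]])
alpha, c = forward(pi0, P, L)
loglik = np.log(c).sum()               # log-likelihood as the sum of log c_t
```

Normalizing by `c[t]` at every step keeps the stored quantities on the probability scale, so only the logarithms of the `c[t]` are accumulated.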

We then run a backward recursion, by computing the ratios \(\bar{\varphi }_t(k)=\frac{f(\varvec{z}_{t+1:T}|\xi _{tk}=1)}{\prod _{l=t}^{T}c_l}\), as follows.

Backward recursion:

  • initialization: \(\bar{\varphi }_T(k)=\frac{1}{c_T}\)

  • for \(t= T-1, T-2, \ldots , 0\)

    $$\begin{aligned} \bar{\varphi }_t(k)=\frac{\sum _{h=1}^{K}\hat{\pi }_{khs}L_{t+1,h}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})\bar{\varphi }_{t+1}(h)}{c_t}. \end{aligned}$$

At the end of the backward recursion, we store the values of \(\bar{\varphi }_0(k), \ldots , \bar{\varphi }_T(k)\) and compute the posterior univariate state probabilities as

$$\begin{aligned} \hat{p}_{tk}(\hat{\varvec{\theta }}_s)=\frac{\bar{\alpha }_t(k)\bar{\varphi }_{t}(k)}{\sum _{h=1}^{K}\bar{\alpha }_t(h)\bar{\varphi }_{t}(h)}. \end{aligned}$$

The bivariate posterior probabilities can instead be computed as

$$\begin{aligned} \hat{p}_{t-1,t,hk}(\hat{\varvec{\theta }}_s)=\bar{\alpha }_{t-1}(h)\,\hat{\pi }_{hks}\,L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})\,\bar{\varphi }_{t}(k). \end{aligned}$$
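
The complete smoothing step can be sketched in Python under the same assumptions as before: `L[t, k]` holds precomputed values of \(L_{tk}\), `pi0[k]` the initial probabilities and `P[h, k]` the transition probabilities. The sketch takes the backward step to be \(\bar{\varphi }_t(k)=\sum _h \hat{\pi }_{kh}L_{t+1,h}\bar{\varphi }_{t+1}(h)/c_t\) and the pairwise posterior for times \(t-1\) and \(t\) to be \(\bar{\alpha }_{t-1}(h)\hat{\pi }_{hk}L_{tk}\bar{\varphi }_t(k)\); this indexing, the function name `forward_backward` and the toy numbers are assumptions made for the illustration.

```python
import numpy as np

def forward_backward(pi0, P, L):
    """Return univariate posteriors, pairwise posteriors and the ratios c_t."""
    n_times, K = L.shape
    # Forward pass: scaled filtering probabilities and likelihood ratios.
    alpha = np.zeros((n_times, K))
    c = np.zeros(n_times)
    psi = pi0
    for t in range(n_times):
        if t > 0:
            psi = alpha[t - 1] @ P
        weighted = psi * L[t]
        c[t] = weighted.sum()
        alpha[t] = weighted / c[t]
    # Backward pass: phi_T(k) = 1 / c_T, then recurse down to t = 0.
    phi = np.zeros((n_times, K))
    phi[-1] = 1.0 / c[-1]
    for t in range(n_times - 2, -1, -1):
        # phi_t(k) = sum_h P[k, h] * L[t+1, h] * phi_{t+1}(h) / c_t
        phi[t] = (P @ (L[t + 1] * phi[t + 1])) / c[t]
    # Univariate posteriors: normalize alpha_t(k) * phi_t(k) over k.
    uni = alpha * phi
    uni /= uni.sum(axis=1, keepdims=True)
    # Pairwise posteriors: bi[t-1, h, k] = alpha_{t-1}(h) P[h, k] L[t, k] phi_t(k).
    bi = alpha[:-1, :, None] * P[None, :, :] * (L[1:] * phi[1:])[:, None, :]
    return uni, bi, c

# Toy example with K = 2 states and times 0, 1, 2 (made-up numbers).
pi0 = np.array([0.6, 0.4])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
L = np.array([[0.50, 0.10],
              [0.40, 0.30],
              [0.05, 0.70]])
uni, bi, c = forward_backward(pi0, P, L)
```

A useful sanity check on any implementation is that each pairwise table sums to one and that its row and column sums reproduce the univariate posteriors at times \(t-1\) and \(t\).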

About this article

Cite this article

Lagona, F., Picone, M., Maruotti, A. et al. A hidden Markov approach to the analysis of space–time environmental data with linear and circular components. Stoch Environ Res Risk Assess 29, 397–409 (2015). https://doi.org/10.1007/s00477-014-0919-y

  • DOI: https://doi.org/10.1007/s00477-014-0919-y
