A hidden Markov approach to the analysis of space–time environmental data with linear and circular components

Lagona, Francesco; Picone, Marco; Maruotti, Antonello; Cosoli, Simone

doi:10.1007/s00477-014-0919-y

Original Paper
Published: 28 June 2014

Volume 29, pages 397–409, (2015)
Stochastic Environmental Research and Risk Assessment

Francesco Lagona¹,
Marco Picone²,
Antonello Maruotti¹,³ &
…
Simone Cosoli⁴

489 Accesses
21 Citations

Abstract

The analysis of bivariate space–time series with linear and circular components is complicated by (1) multiple correlations, across time, across space and between variables, (2) the different supports on which the variables are observed, the real line and the circle, and (3) the periodic nature of circular data. We describe a multivariate hidden Markov model that accommodates these features of the data within a single framework. The model integrates a circular von Mises Markov field and a Gaussian Markov field, with parameters that evolve in time according to a latent (hidden) Markov chain. It describes the data by means of a finite number of time-varying latent regimes, associated with easily interpretable components of large-scale and small-scale spatial variation. It can be estimated by a computationally feasible expectation–maximization algorithm. In a case study of sea currents in the Northern Adriatic Sea, it provides a parsimonious representation of the sea surface in terms of alternating environmental states.

References

Bertotti L, Cavalieri L (2009) Wind and wave predictions in the Adriatic Sea. J Mar Syst 78:S227–S234
Biernacki C, Celeux G, Govaert G (2003) Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput Stat Data Anal 41(3–4):561–575
Bulla J, Lagona F, Maruotti A, Picone M (2012) A multivariate hidden Markov model for the identification of sea regimes from incomplete skewed and circular time series. J Agric Biol Environ Stat 17(4):544–567. doi:10.1007/s13253-012-0110-1
Cappé O, Moulines E, Rydén T (2005) Inference in hidden Markov models. Springer, Berlin
Carnicero JA, Ausín MC, Wiper MP (2013) Non-parametric copulas for circular–linear and circular–circular data: an application to wind directions. Stoch Environ Res Risk Assess 27(8):1991–2002. doi:10.1007/s00477-013-0733-y
Cosoli S, Mazzoldi A, Gacic M (2010) Validation of surface current measurements in the northern Adriatic Sea from high frequency radars. J Atmos Ocean Technol 27:908–919
Cosoli S, Gacic M, Mazzoldi A (2012) Surface current variability and wind influence in the northeastern Adriatic Sea as observed from high-frequency (HF) radar measurements. Cont Shelf Res 33:1–13
Faltinsen O (1990) Sea loads on ships and offshore structures. Cambridge University Press, Cambridge
Fisher N, Lee A (1983) A correlation coefficient for circular data. Biometrika 70(2):327–332
Fisher N, Lee A (1992) Regression models for an angular response. Biometrics 48:665–677
Gaetan C, Guyon X (2010) Spatial statistics and modelling. Springer, Berlin
García-Portugués E, Barros AM, Crujeiras RM, González-Manteiga W, Pereira J (2013a) A test for directional-linear independence, with applications to wildfire orientation and size. Stoch Environ Res Risk Assess 1–15. doi:10.1007/s00477-013-0819-6
García-Portugués E, Crujeiras R, González-Manteiga W (2013b) Exploring wind direction and SO₂ concentration by circular–linear density estimation. Stoch Environ Res Risk Assess 27(5):1055–1067
Griffith DA, Lagona F (1998) On the quality of likelihood-based estimators in spatial autoregressive models when the data dependence structure is misspecified. J Stat Plan Inference 69(1):153–174
Huang G, Wing-Keung Law A, Huang Z (2011) Wave-induced drift of small floating objects in regular waves. Ocean Eng 38:712–718
Ingrassia S, Rocci R (2011) Degeneracy of the EM algorithm for the MLE of multivariate Gaussian mixtures and dynamic constraints. Comput Stat Data Anal 55:1715–1725
Jin KR, Ji ZG (2004) Case study: modeling of sediment transport and wind-wave impact in Lake Okeechobee. J Hydraul Eng 130:1055–1067
Jona Lasinio G, Lagona F (2002) Selection of the neighborhood structure for space–time Markov random field models. Stat Methods Appl 11(3):293–311
Jona Lasinio G, Gelfand A, Jona-Lasinio M (2012) Spatial analysis of wave direction data using wrapped Gaussian processes. Ann Appl Stat 6:1478–1498
Kato S, Shimizu K (2008) Dependent models for observations which include angular ones. J Stat Plan Inference 138(11):3538–3549, special issue in honor of Junjiro Ogawa (1915–2000): design of experiments, multivariate analysis and statistical inference
Lagona F (2001) Parametric restrictions in random fields for binary space–time data series. Metron-Int J Stat 59(1–2):72–96
Lagona F (2002) Adjacency selection in Markov random fields for high spatial resolution hyperspectral data. J Geogr Syst 4(1):53–68
Lagona F, Picone M (2013) Maximum likelihood estimation of bivariate circular hidden Markov models from incomplete data. J Stat Comput Simul 83:1223–1237
Lagona F, Maruotti A, Picone M (2011) A non-homogeneous hidden Markov model for the analysis of multi-pollutant exceedances data. In: Dymarsky P (ed) Hidden Markov models, theory and applications, InTech, Chap 10, pp 207–222
Lee A (2010) Circular data. Wiley Interdiscip Rev Comput Stat 2(4):477–486
Mardia K, Voss J (2011) Some fundamental properties of multivariate von Mises distributions. arXiv:1109.6042v1
Mardia KV, Hughes G, Taylor CC, Singh H (2008) A multivariate von Mises distribution with applications to bioinformatics. Can J Stat 36(1):99–109
McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York
Mihanovic H, Cosoli S, Vilibic I, Ivankovic D, Dadic V, Gacic M (2011) Surface current patterns in the northern Adriatic extracted from high frequency radar data using self organizing map analysis. J Geophys Res 116(C08):033
Modlin D, Fuentes M, Reich B (2012) Circular conditional autoregressive modeling of vector fields. Environmetrics 23(1):46–53. doi:10.1002/env.1133
Pleskachevsky A, Eppel D, Kapitza H (2009) Interaction of waves, currents and tides, and wave-energy impact on the beach area of Sylt Island. Ocean Dyn 59:451–461
Poulain P, Kourafalou M, Cushman-Roisin B (2001) Northern Adriatic Sea. In: Cushman-Roisin B et al (eds) Physical oceanography of the Adriatic Sea. Kluwer Academic, Dordrecht, pp 143–165
Rue H, Held L (2005) Gaussian Markov random fields: theory and applications. Chapman & Hall, London
Singh H, Hnizdo V, Demchuk E (2002) Probabilistic model for two dependent circular variables. Biometrika 89(3):719–723. doi:10.1093/biomet/89.3.719
Visser I, Raijmakers M, Molenaar P (2000) Confidence intervals for hidden Markov model parameters. Br J Math Stat Psychol 53:317–327
Visser I, Raijmakers MEJ, Molenaar PCM (2002) Fitting hidden Markov models to psychological data. Sci Program 10:185–199
Wang F (2013) Space and space–time modeling of directional data. PhD thesis, Dept Statistical Sciences, Duke University
Wang F, Gelfand AE (2013) Directional data analysis under the general projected normal distribution. Stat Methodol 10(1):113–127
Wang F, Gelfand AE, Jona Lasinio G (2014) Joint spatio-temporal analysis of a linear and a directional variable: space-time modeling of wave heights and wave directions in the Adriatic Sea. Stat Sin. doi:10.5705/ss.2013.204w
Wu C (1983) On the convergence properties of the EM algorithm. Ann Stat 11:95–103
Zucchini W, Guttorp P (1991) A hidden Markov model for space–time precipitation. Water Resour Res 27:1917–1923

Author information

Authors and Affiliations

DIPES, University Roma Tre, Via G. Chiabrera 199, Rome, Italy
Francesco Lagona & Antonello Maruotti
Istituto Superiore per la Protezione e la Ricerca Ambientale (ISPRA), Rome, Italy
Marco Picone
University of Southampton, Southampton, UK
Antonello Maruotti
Istituto Nazionale di Oceanografia e di Geofisica Sperimentale, Sgonico, Italy
Simone Cosoli

Corresponding author

Correspondence to Francesco Lagona.

Appendix

For $n>2$, the normalizing constant of the multivariate von Mises density $f(\varvec{x}_t; \varvec{\theta }^{\mathrm{circ}}_k)$ is unknown. However, the conditional log-likelihood $l_k(\varvec{\theta }^{\mathrm{circ}}\mid \varvec{x}_t)=\log f(\varvec{x}_t; \varvec{\theta }^{\mathrm{circ}}_k)$ is well approximated about its maximum point by the pseudo-loglikelihood

$$\begin{aligned} \mathrm{pl}_{k}(\varvec{\theta }^{\mathrm{circ}}\mid \varvec{x}_t)=\sum _{i=1}^{n}\log f_{\mathrm{vm}}(x_{it};\nu _{ik},\kappa _{ik}), \end{aligned}$$

obtained by taking the logarithm of the product of the conditional von Mises distributions (Mardia et al. 2008). We take advantage of this result to compute the posterior probabilities of the Markov chain
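The pseudo-loglikelihood is just a sum of univariate von Mises log-densities, so it can be evaluated without the unknown normalizing constant. The following is a minimal sketch, assuming the conditional means $\nu_{ik}$ and concentrations $\kappa_{ik}$ have already been computed and are passed in as arrays; the function names and arguments are hypothetical and do not come from the paper.

```python
import numpy as np

def von_mises_logpdf(x, nu, kappa):
    """Log-density of a von Mises distribution with mean nu and concentration kappa >= 0."""
    # log f_vm(x) = kappa * cos(x - nu) - log(2 * pi * I0(kappa)),
    # where I0 is the modified Bessel function of the first kind, order 0.
    # Note: np.i0 overflows for very large kappa; this sketch does not guard against that.
    return kappa * np.cos(x - nu) - np.log(2.0 * np.pi * np.i0(kappa))

def pseudo_loglik(x_t, nu_k, kappa_k):
    """Pseudo-loglikelihood pl_k: sum over the n sites of conditional von Mises log-densities.

    x_t, nu_k, kappa_k are length-n arrays (angles in radians); how nu_k and kappa_k
    depend on neighbouring sites is assumed to be handled elsewhere.
    """
    x_t, nu_k, kappa_k = (np.asarray(a, dtype=float) for a in (x_t, nu_k, kappa_k))
    return float(np.sum(von_mises_logpdf(x_t, nu_k, kappa_k)))

# Illustrative call with made-up values for n = 3 sites.
example = pseudo_loglik([0.1, 2.0, -1.0], [0.0, 1.8, -0.9], [1.5, 2.0, 0.5])
```

With all concentrations set to zero each term reduces to the uniform density on the circle, which gives a quick sanity check of the implementation.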

$$\begin{aligned} \hat{p}_{tk}(\hat{\varvec{\theta }}_s)&={\mathbb {E}}(\xi _{tk}\mid \varvec{z},\hat{\varvec{\theta }}_s)\\ \hat{p}_{t-1,t,hk}(\hat{\varvec{\theta }}_s)&={\mathbb {E}}(\xi _{t-1,h}\xi _{tk}\mid \varvec{z},\hat{\varvec{\theta }}_s) \end{aligned} \tag{14}$$

from the estimate $\hat{\varvec{\theta }}_s$ provided by the $s$th step of the EM algorithm. This task is generally referred to as HMM smoothing. It is typically solved by expressing the posterior probabilities in terms of suitably normalized functions that can be computed recursively, which avoids both impractical summations over the state space of the latent Markov chain and numerical underflow and overflow. In the literature, this approach is known as the forward-backward (FB) recursion, and it can be implemented in a number of different ways (Cappé et al. 2005, ch. 3). We describe below the FB recursion used in this paper.

Let

$$\begin{aligned} L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_s,\hat{\varvec{\theta }}^{\mathrm{lin}}_s)= N(\varvec{y}_t;\varvec{\mu }_k,\sigma ^2_k(\varvec{I}-\rho _k\varvec{I})^{-1})\prod _{i=1}^{n} f_{\mathrm{vm}}(x_{it};\nu _{iks},\kappa _{iks}), \end{aligned}$$

be the conditional contribution of the bivariate spatial series $\varvec{z}_{t}$ to the likelihood function, under state $k$. In addition, let

$$\begin{aligned} L_{0:t}(\hat{\varvec{\theta }}_s)=\sum _{\varvec{\xi }_0}\ldots \sum _{\varvec{\xi }_t} p(\varvec{\xi }_{0:t};\hat{\varvec{\pi }}_s) \prod _{\tau =0}^{t}\prod _{k=1}^{K}\left( L_{\tau k}(\hat{\varvec{\theta }}^{\mathrm{circ}}_s,\hat{\varvec{\theta }}^{\mathrm{lin}}_s)\right) ^{\xi _{\tau k}} \end{aligned}$$

be the contribution of the first $t$ profiles to the likelihood and let $\varvec{z}_{0:t}$ be the space–time series, observed up to time $t$. We run a forward and a backward iteration.

During the forward iteration, we exploit the output of the $s$th step of the EM algorithm to compute the probabilities $\psi ^{(t)}(k)=P(\xi _{tk}=1|\varvec{z}_{0:t-1})$, the likelihood ratios $c_t=\frac{L_{0:t}(\hat{\varvec{\theta }}_s)}{L_{0:t-1}(\hat{\varvec{\theta }}_s)}$ and the forward probabilities $\bar{\alpha }_t(k)=P(\xi _{tk}=1|\varvec{z}_{0:t})$, as follows.

Forward recursion:

initialization:
$$\begin{aligned} \psi ^{(0)}(k)&=\hat{\pi }_{ks}\\ c_0&=\sum _{k=1}^{K}\psi ^{(0)}(k)L_{0k}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})\\ \bar{\alpha }_0(k)&=\frac{\psi ^{(0)}(k)L_{0k}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})}{c_0} \end{aligned}$$
for $t=1, \ldots , T$:
$$\begin{aligned} \psi ^{(t)}(k)&=\sum _{h=1}^{K}\bar{\alpha }_{t-1}(h)\hat{\pi }_{hks}\\ c_t&=\sum _{k=1}^{K}\psi ^{(t)}(k)L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})\\ \bar{\alpha }_t(k)&=\frac{\psi ^{(t)}(k)L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})}{c_t} \end{aligned}$$
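The forward recursion can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code: it assumes the state-conditional contributions $L_{tk}$ have already been evaluated and stored in a hypothetical array `L`, and the names `forward`, `pi0` and `P` are introduced here only for the example.

```python
import numpy as np

def forward(L, pi0, P):
    """Scaled forward recursion of a K-state hidden Markov model.

    L   : (T+1, K) array, L[t, k] = conditional contribution L_tk of time t under state k
    pi0 : (K,) initial state probabilities
    P   : (K, K) transition matrix, P[h, k] = Pr(state k at t | state h at t-1)
    Returns psi (one-step predictions), c (likelihood ratios), alpha (filtered probabilities).
    """
    L = np.asarray(L, dtype=float)
    pi0 = np.asarray(pi0, dtype=float)
    P = np.asarray(P, dtype=float)
    n_times, K = L.shape
    psi = np.empty((n_times, K))
    alpha = np.empty((n_times, K))
    c = np.empty(n_times)
    for t in range(n_times):
        # psi^(t)(k): initial distribution at t = 0, otherwise sum_h alpha_{t-1}(h) * pi_hk
        psi[t] = pi0 if t == 0 else alpha[t - 1] @ P
        joint = psi[t] * L[t]
        c[t] = joint.sum()        # c_t = L_{0:t} / L_{0:t-1}
        alpha[t] = joint / c[t]   # alpha_t(k) = Pr(state k at t | z_{0:t})
    return psi, c, alpha

# Toy example with K = 2 states and T + 1 = 4 time points (made-up numbers).
L_toy = np.array([[0.5, 0.1], [0.4, 0.3], [0.05, 0.6], [0.2, 0.7]])
psi, c, alpha = forward(L_toy, [0.6, 0.4], [[0.9, 0.1], [0.2, 0.8]])
```

Because each `alpha[t]` is renormalized by `c[t]`, the quantities stay of order one however long the series is, which is the point of the scaled version of the recursion.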

At the end of the forward recursion, we store the values $c_0, \ldots , c_T$ and $\bar{\alpha }_0(k), \ldots , \bar{\alpha }_T(k)$. The sequence $c_0, \ldots , c_T$ can be exploited to compute the value taken by the log-likelihood at the $s$th step of the EM algorithm, as follows:

$$\begin{aligned} \log L(\hat{\varvec{\theta }}_s)=\sum _{t=0}^{T}\log c_t. \end{aligned}$$
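The identity follows from a telescoping product of the likelihood ratios over $t=0,\ldots ,T$. A short derivation, assuming the convention $L_{0:-1}(\hat{\varvec{\theta }}_s)=1$ (so that $c_0=L_{0:0}(\hat{\varvec{\theta }}_s)$, which matches the initialization of the forward recursion):

$$\begin{aligned} \prod _{t=0}^{T}c_t=L_{0:0}(\hat{\varvec{\theta }}_s)\prod _{t=1}^{T}\frac{L_{0:t}(\hat{\varvec{\theta }}_s)}{L_{0:t-1}(\hat{\varvec{\theta }}_s)}=L_{0:T}(\hat{\varvec{\theta }}_s), \qquad \text{hence}\qquad \log L_{0:T}(\hat{\varvec{\theta }}_s)=\sum _{t=0}^{T}\log c_t. \end{aligned}$$

Summing the logarithms of the $c_t$, rather than multiplying the $c_t$ themselves, keeps the computation stable for long series, since the product would quickly underflow.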

We then run a backward recursion, by computing the ratios $\bar{\varphi }_t(k)=\frac{f(\varvec{z}_{t+1:T}|\xi _{tk}=1)}{\prod _{l=t}^{T}c_l}$, as follows.

Backward recursion:

initialization: $\bar{\varphi }_T(k)=\frac{1}{c_T}$
for $t= T-1, T-2, \ldots , 0$:
$$\begin{aligned} \bar{\varphi }_t(k)=\frac{\sum _{h=1}^{K}\hat{\pi }_{khs}L_{t+1,h}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})\bar{\varphi }_{t+1}(h)}{c_t}. \end{aligned}$$
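A minimal NumPy sketch of the backward recursion is given below. It assumes that, inside the sum over the next state $h$, the conditional contribution at time $t+1$ is the one evaluated under state $h$, as the definition of $\bar{\varphi }_t(k)$ requires. The helper `scaling_constants` simply reruns the forward pass to obtain $c_0,\ldots ,c_T$; all function and variable names are hypothetical.

```python
import numpy as np

def scaling_constants(L, pi0, P):
    """Forward pass that returns only the likelihood ratios c_0, ..., c_T."""
    L, pi0, P = (np.asarray(a, dtype=float) for a in (L, pi0, P))
    c, alpha_prev = [], None
    for t in range(L.shape[0]):
        psi = pi0 if t == 0 else alpha_prev @ P
        joint = psi * L[t]
        c.append(joint.sum())
        alpha_prev = joint / c[-1]
    return np.array(c)

def backward(L, P, c):
    """Scaled backward recursion.

    Returns phi with phi[t, k] = f(z_{t+1:T} | state k at t) / prod_{l=t}^{T} c_l.
    """
    L, P = np.asarray(L, dtype=float), np.asarray(P, dtype=float)
    n_times, K = L.shape
    phi = np.empty((n_times, K))
    phi[-1] = 1.0 / c[-1]                       # initialization at t = T
    for t in range(n_times - 2, -1, -1):
        # sum_h pi_kh * L_{t+1,h} * phi_{t+1}(h), rescaled by c_t
        phi[t] = P @ (L[t + 1] * phi[t + 1]) / c[t]
    return phi

# Toy example with K = 2 states and 4 time points (made-up numbers).
L_toy = np.array([[0.5, 0.1], [0.4, 0.3], [0.05, 0.6], [0.2, 0.7]])
P_toy = np.array([[0.9, 0.1], [0.2, 0.8]])
c_toy = scaling_constants(L_toy, [0.6, 0.4], P_toy)
phi_toy = backward(L_toy, P_toy, c_toy)
```

Reusing the forward constants $c_t$ as the backward scaling factors is what makes the later products of forward and backward quantities directly interpretable as posterior probabilities.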

At the end of the backward recursion, we store the values of $\bar{\varphi }_0(k) \ldots \bar{\varphi }_T(k)$ and compute the posterior univariate state probabilities as

$$\begin{aligned} \hat{p}_{tk}(\hat{\varvec{\theta }}_s)=\frac{\bar{\alpha }_t(k)\bar{\varphi }_{t}(k)}{\sum _{j=1}^{K}\bar{\alpha }_t(j)\bar{\varphi }_{t}(j)}. \end{aligned}$$

The bivariate posterior probabilities can be instead computed as

$$\begin{aligned} \hat{p}_{t-1,t,hk}(\hat{\varvec{\theta }}_s)=\bar{\alpha }_{t-1}(h)\hat{\pi }_{hks}L_{tk}(\hat{\varvec{\theta }}^{\mathrm{circ}}_{s},\hat{\varvec{\theta }}^{\mathrm{lin}}_{s})\bar{\varphi }_{t}(k). \end{aligned}$$
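Putting the two recursions together gives the posterior probabilities needed in the E-step. The sketch below is illustrative only: it assumes precomputed contributions $L_{tk}$ in a hypothetical array `L`, and it computes the bivariate probabilities with the indexing $\bar{\alpha }_{t-1}(h)\,\hat{\pi }_{hk}\,L_{tk}\,\bar{\varphi }_{t}(k)$, which is the combination that sums to one over $(h,k)$ under the definitions of $\bar{\alpha }$, $\bar{\varphi }$ and $c_t$ given above.

```python
import numpy as np

def smooth(L, pi0, P):
    """Forward-backward smoothing for a K-state hidden Markov model.

    L[t, k] is the conditional contribution L_tk; pi0 and P are the initial
    distribution and the transition matrix (P[h, k] = Pr(k at t | h at t-1)).
    Returns uni[t, k]   = Pr(state k at t | z) and
            biv[t-1, h, k] = Pr(state h at t-1, state k at t | z), t = 1..T.
    """
    L, pi0, P = (np.asarray(a, dtype=float) for a in (L, pi0, P))
    n_times, K = L.shape
    alpha = np.empty((n_times, K))
    phi = np.empty((n_times, K))
    c = np.empty(n_times)
    for t in range(n_times):                      # forward pass
        psi = pi0 if t == 0 else alpha[t - 1] @ P
        joint = psi * L[t]
        c[t] = joint.sum()
        alpha[t] = joint / c[t]
    phi[-1] = 1.0 / c[-1]
    for t in range(n_times - 2, -1, -1):          # backward pass
        phi[t] = P @ (L[t + 1] * phi[t + 1]) / c[t]
    uni = alpha * phi
    uni /= uni.sum(axis=1, keepdims=True)         # univariate posteriors
    # bivariate posteriors: alpha_{t-1}(h) * pi_hk * L_tk * phi_t(k)
    biv = alpha[:-1, :, None] * P[None, :, :] * (L[1:] * phi[1:])[:, None, :]
    return uni, biv

# Toy example with K = 2 states and 4 time points (made-up numbers).
L_toy = np.array([[0.5, 0.1], [0.4, 0.3], [0.05, 0.6], [0.2, 0.7]])
uni, biv = smooth(L_toy, [0.6, 0.4], [[0.9, 0.1], [0.2, 0.8]])
```

A useful consistency check is that summing `biv` over its first state index recovers `uni[1:]`, and summing over its second recovers `uni[:-1]`.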

About this article

Cite this article

Lagona, F., Picone, M., Maruotti, A. et al. A hidden Markov approach to the analysis of space–time environmental data with linear and circular components. Stoch Environ Res Risk Assess 29, 397–409 (2015). https://doi.org/10.1007/s00477-014-0919-y

Issue Date: February 2015