Abstract
The investigation of species distributions in rivers involves data which are inherently sequential and unlikely to be fully independent. To take these characteristics into account, we develop a Bayesian hierarchical model for mapping the distribution of freshwater pearl mussels in the River Dee (Scotland). At the top of the hierarchy the likelihood is used to describe the sequence of sites in which mussels were observed or not. Given that false observations can occur, and that “not observed” can mean either that the species was absent or that it was present but not detected, a Markov prior is introduced at the second level of the hierarchy to represent the sequence of sites in which mussels are estimated to occur. The Markov prior makes it possible to model the spatial dependency between neighbouring sites. A third level in the hierarchy is given by the representation of the transition probabilities of the Markov chain in terms of site-specific explanatory variables, through a logistic regression. The selection of the explanatory variables which influence the Markov process is performed by means of a simulation-based procedure, in the complex case of association between covariates. Four features were found to be associated with a reduced chance of finding a local mussel population: tributaries, bridges, dredging, and waste water treatment works. These results complement the results of a previous study, providing new evidence for the causes of the deterioration of a highly threatened species.
References
Bauer G (1988) Threats to the freshwater pearl mussel Margaritifera margaritifera L. in Central Europe. Biol Conserv 45:239–253
Besag J, York J, Mollié A (1991) Bayesian image restoration with two applications in spatial statistics (with discussion). Ann Inst Stat Math 43:1–59
Chen M-H, Shao Q-M, Ibrahim JG (2000) Monte Carlo methods in Bayesian computation. Springer, New York
Chib S (1996) Calculating posterior distributions and modal estimates in Markov mixture models. J Econ 75:79–97
Cooksley SL (2007) Dee catchment management plan. Dee Catchment Partnership, Aberdeen
Cooksley SL, Brewer MJ, Donnelly D, Spezia L, Tree A (2012) Impacts of riverine infrastructure on the freshwater pearl mussel Margaritifera margaritifera in the River Dee, Scotland. Aquat Conserv Mar Freshw Ecosyst 22:318–330
Dellaportas P, Forster JJ, Ntzoufras I (2000) Bayesian variable selection using the Gibbs sampler. In: Dey DK, Ghosh SK, Mallick BK (eds) Generalized linear models: a Bayesian perspective. Marcel Dekker, New York, pp 273–286
Denison DGT, Mallick BK, Smith AFM (1998) Automatic Bayesian curve fitting. J R Stat Soc Ser B 60:333–350
George EI, McCulloch RE (1993) Variable selection via Gibbs sampling. J Am Stat Assoc 88:881–889
Gilks WR, Roberts GO (1996) Strategies for improving MCMC. In: Gilks WR, Richardson S, Spiegelhalter DJ (eds) Markov chain Monte Carlo in practice. Chapman & Hall, London, pp 89–114
Green PJ (2000) A primer on Markov chain Monte Carlo. In: Barndorff-Nielsen OE, Cox DR, Klüppelberg C (eds) Complex stochastic systems. Chapman & Hall/CRC, Boca Raton, pp 1–62
Green PJ, Richardson S (2002) Hidden Markov models and disease mapping. J Am Stat Assoc 97:1055–1076
Hastie LC (1999) Conservation and ecology of the freshwater pearl mussel, Margaritifera margaritifera (L.). Ph.D. thesis, University of Aberdeen
Heikkinen J, Högmander H (1994) Fully Bayesian approach to image restoration with an application in biogeography. Appl Stat 43:569–582
Hughes JP, Guttorp P (1994) A class of stochastic models for relating synoptic atmospheric patterns to local hydrologic phenomenon. Water Resour Res 30:1535–1546
Hughes JP, Guttorp P, Charles SP (1999) A non-homogeneous hidden Markov model for precipitation occurrence. Appl Stat 48:15–30
Kim C-J (1994) Dynamic linear models with Markov-switching. J Econ 60:1–22
Kuo L, Mallick B (1998) Variable selection for regression models. Sankhya Ser B 60:65–81
Langan SJ, Cooksley SL, Young M, Stutter MI, Scougall F, Dalziel A, Feeney I (2007) The management and conservation of the freshwater pearl mussel Margaritifera margaritifera L. in Scottish catchments designated as special areas of conservation or sites of special scientific interest. Macaulay Land Use Research Institute commissioned report to Scottish Natural Heritage, Aberdeen
Linnaeus C (1758) Systema naturae per regna tria naturae, secundum classes, ordines, genera, species, cum characteribus, differentiis, synonymis, locis. Tomus I. Editio decima, reformata. Stockholm
Liu JS, Wong WH, Kong A (1994) Covariance structure of the Gibbs sampler with applications to the comparisons of the estimators and augmentation scheme. Biometrika 81:27–40
MacDonald IL, Zucchini W (1997) Hidden Markov and other models for discrete-valued time series. Chapman & Hall, London
McEwen L (2000) The geomorphological character of the River Dee. Scottish Natural Heritage, commissioned report
McEwen L (2005) The geomorphological character of the River Dee: Phase II. Scottish Natural Heritage, commissioned report
Nott DJ, Green PJ (2004) Bayesian variable selection and the Swendsen-Wang algorithm. J Comp Graph Stat 13:1–17
Paroli R, Spezia L (2008) Bayesian variable selection in Markov mixture models. Commun Stat Simul Comput 37:25–47
Qian W, Titterington DM (1991) Estimation of parameters in hidden Markov models. Philos Trans R Soc Lond Ser A 337:407–428
Scottish Natural Heritage (2005) River Dee condition monitoring form. Scottish Natural Heritage, Inverness
Spezia L (2009) Reversible jump and the label switching problem in hidden Markov models. J Stat Plan Inference 139:2305–2315
Young MR, Hastie LC, Cooksley SL (2003) Monitoring the Freshwater Pearl Mussel, Margaritifera margaritifera. Conserving Natura 2000, Rivers Monitoring Series No. 2, English Nature
Young MR, Cosgrove PJ, Hastie LC (2001) The extent of, and causes for, the decline of a highly threatened naiad: Margaritifera margaritifera. In: Bauer G, Wächtler K (eds) Ecology and evolutionary biology of the freshwater mussels Unionoidea. Springer, Berlin, pp 337–354
Zhang Y, Brady M, Smith S (2001) Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans Med Imaging 20:45–57
Ziuganov V, Zotin A, Nezlin L, Tretiakov V (1994) The freshwater pearl mussels and their relationships with salmonid fish. Russian Federal Research Institute of Fisheries and Oceanography, Moscow
Zucchini W, Guttorp P (1991) A hidden Markov model for space-time precipitation. Water Resour Res 27:1917–1923
Acknowledgments
This work was funded by the Scottish Government’s Rural and Environment Science and Analytical Services Division and by Scottish Natural Heritage. Survey data were provided under licence from Scottish Natural Heritage (freshwater pearl mussel data were collected by Peter Cosgrove, Lee Hastie and Jackie Farquhar; hydromorphological data were collected by Lindsey McEwen—both surveys commissioned by Scottish Natural Heritage). Comments from Chris Glasbey improved the quality of the first version of the paper. The authors are also thankful to the Editor, the Associate Editor and two referees, whose comments improved the final paper.
Additional information
Handling Editor: Pierre Dutilleul.
Appendix
The basic MCMC sampler used for variable selection is described here. Given the draws \(\theta ^{(h-1)}\), \(\alpha ^{(h-1)}\), \(\eta ^{(h-1)}\), \(x^{T(h-1)}\), \(y^{*(h-1)}\), generated at the \((h-1)\)-th iteration under the constraint \(\theta _{0}^{(h-1)}<\theta _{1}^{(h-1)}\), the steps of the generic \(h\)-th iteration are as follows.
[Step 1] The sequence of the hidden states \(x^{T(h)}\) is generated by the forward filtering-backward sampling (ff-bs) algorithm (Chib 1996). It is so called because the filtered probabilities of the hidden states are first computed going forwards; the conditional probabilities of the hidden states are then computed going backwards, and the states are sampled from the full conditional
$$\begin{aligned} p\left( x^{T}\mid y^{T}\right) =p\left( x_{T}\mid y^{T}\right) \prod _{t=1}^{T-1}p\left( x_{t}\mid x_{t+1},y^{t}\right) , \end{aligned}$$
suppressing the conditioning on \(\theta , \alpha , \eta \), and \(z\).
Let \(\xi _{t+1\mid t}\) be the bidimensional column vector whose generic entry is \(p\left( x_{t+1}=i\mid y^{t}\right) \); \(\xi _{t\mid t}\) be the bidimensional column vector whose generic entry is \(p\left( x_{t}=i\mid y^{t}\right) \); \(\xi _{t}\) be the bidimensional column vector whose generic entry is \(p\left( x_{t}=i\mid x_{t+1},y^{t}\right) \), for any \(i\in S_{X}\). The iterative scheme of the ff-bs algorithm is as follows.
(1.1) Place
$$\begin{aligned} \xi _{1\mid 0}^{(h)}=\delta ^{(h)\prime }, \end{aligned}$$
where \(\delta ^{(h)}\) is the initial distribution of the Markov chain.
(1.2) Compute
$$\begin{aligned} \xi _{t\mid t}^{(h)}=\frac{\xi _{t\mid t-1}^{(h)}\odot F_{t}^{(h-1)}}{ 1_{(2)}^{\prime }\left( \xi _{t\mid t-1}^{(h)}\odot F_{t}^{(h-1)}\right) } \qquad \hbox {and}\qquad \xi _{t+1\mid t}^{(h)}=\left[ \Gamma ^{t(h-1)}\right] ^{\prime }\ \xi _{t\mid t}^{(h)}, \end{aligned}$$
for any \(t=1,\ldots ,T-1,\) where \(F_{t}^{(h-1)}=\left( p\left( y_{t}\mid x_{t}=0,\theta ^{(h-1)}\right) ;p\left( y_{t}\mid x_{t}=1,\theta ^{(h-1)}\right) \right) ^{\prime }\), \(1_{(2)}\) is the bidimensional column vector of ones, and \(\odot \) denotes the Hadamard product.
(1.3) Compute
$$\begin{aligned} \xi _{T\mid T}^{(h)}=\frac{\xi _{T\mid T-1}^{(h)}\odot F_{T}^{(h-1)}}{ 1_{(2)}^{\prime }\left( \xi _{T\mid T-1}^{(h)}\odot F_{T}^{(h-1)}\right) }. \end{aligned}$$
(1.4) Generate \(x_{T}^{(h)}\) from \(\xi _{T\mid T}^{(h)}\).
(1.5) Compute
$$\begin{aligned} \xi _{t}^{(h)}=\frac{\xi _{t\mid t}^{(h)}\odot \Gamma _{\bullet x_{t+1}^{(h)}}^{t(h-1)}}{1_{(2)}^{\prime }\left( \xi _{t\mid t}^{(h)}\odot \Gamma _{\bullet x_{t+1}^{(h)}}^{t(h-1)}\right) }, \end{aligned}$$
and generate \(x_{t}^{(h)}\) from \(\xi _{t}^{(h)}\), for any \(t=T-1,\ldots ,1\). Vector \(\Gamma _{\bullet x_{t+1}^{(h)}}^{t(h-1)}\) represents the column of \(\Gamma ^{t(h-1)}\) corresponding to the state generated previously.
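The ff-bs recursion of Step 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `ffbs`, the argument layout, and the use of a Bernoulli emission \(p(y_{t}=1\mid x_{t}=i)=\theta _{i}\) (borrowed from Step 4) are assumptions, and missing observations are not handled here.

```python
import random


def ffbs(y, delta, gammas, theta, rng):
    """Draw one path x_1..x_T of a two-state hidden chain (illustrative sketch).

    y      : list of 0/1 observations, length T
    delta  : initial distribution [p(x_1=0), p(x_1=1)]
    gammas : T-1 transition matrices, gammas[t][i][j] = p(x_{t+1}=j | x_t=i)
    theta  : [theta_0, theta_1]; assumed emission p(y_t=1 | x_t=i) = theta_i
    """
    T = len(y)
    pred = list(delta)            # (1.1) xi_{1|0}
    filtered = []
    for t in range(T):
        # (1.2)-(1.3): Hadamard product with F_t, then normalisation
        F = [theta[i] if y[t] == 1 else 1.0 - theta[i] for i in range(2)]
        num = [pred[i] * F[i] for i in range(2)]
        total = num[0] + num[1]
        filt = [v / total for v in num]
        filtered.append(filt)
        if t < T - 1:
            G = gammas[t]
            # one-step prediction: transpose of Gamma times xi_{t|t}
            pred = [G[0][j] * filt[0] + G[1][j] * filt[1] for j in range(2)]

    x = [0] * T
    # (1.4) draw the last state from xi_{T|T}
    x[T - 1] = 1 if rng.random() < filtered[T - 1][1] else 0
    # (1.5) backward sampling using the column of Gamma for the drawn x_{t+1}
    for t in range(T - 2, -1, -1):
        col = [gammas[t][i][x[t + 1]] for i in range(2)]
        num = [filtered[t][i] * col[i] for i in range(2)]
        p_one = num[1] / (num[0] + num[1])
        x[t] = 1 if rng.random() < p_one else 0
    return x, filtered
```

A call such as `ffbs(y, delta, gammas, theta, random.Random(0))` returns the sampled path together with the filtered probabilities, which makes it easy to check that each filtered vector sums to one.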
[Step 2] The logit transformations of the parameters \(\theta _{0}^{(h)}\) and \(\theta _{1}^{(h)}\) are independently generated from the random walk \(\hbox {logit}\left( \theta _{i}^{(h)}\right) =\hbox {logit}\left( \theta _{i}^{(h-1)}\right) +U_{\Theta }\) (\(i\in S_{X}\)), where \(U_{\Theta }\) is Gaussian noise with zero mean and constant variance \(\sigma _{\Theta }^{2}\). Then the pair of parameters \(\left( \theta _{0}^{(h)};\theta _{1}^{(h)}\right) \) is permuted if \(\theta _{0}^{(h)}>\theta _{1}^{(h)}\), in order to respect the prior constraint \(\theta _{0}<\theta _{1}\) (Green and Richardson 2002).
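The vector \(\theta ^{(h)}\) is accepted with the Metropolis-Hastings probability, that is, the minimum between 1 and the ratio of the full conditional densities of \(\theta ^{(h)}\) and \(\theta ^{(h-1)}\) multiplied by the proposal ratio \(q\left( \theta ^{(h-1)}\mid \theta ^{(h)}\right) / q\left( \theta ^{(h)}\mid \theta ^{(h-1)}\right) \). The proposal ratio reduces to the ratio of the Jacobians of the logit transformation of \(\theta _{0}\) and \(\theta _{1}\), i.e. \(J_{\theta ^{(h-1)}}/J_{\theta ^{(h)}}\), because the Gaussian proposal densities cancel out, due to their symmetry:
$$\begin{aligned} \frac{q\left( \theta ^{(h-1)}\mid \theta ^{(h)}\right) }{q\left( \theta ^{(h)}\mid \theta ^{(h-1)}\right) }=\frac{J_{\theta ^{(h-1)}}}{J_{\theta ^{(h)}}}=\prod _{i\in S_{X}}\frac{\theta _{i}^{(h)}\left( 1-\theta _{i}^{(h)}\right) }{\theta _{i}^{(h-1)}\left( 1-\theta _{i}^{(h-1)}\right) }. \end{aligned}$$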
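Step 2 can be illustrated with the following Python sketch. It relies on assumptions that are not taken from the paper: the target is taken to be a Bernoulli likelihood \(p(y_{t}=1\mid x_{t}=i)=\theta _{i}\) combined with a flat prior on the ordered pair \(\theta _{0}<\theta _{1}\), and all function names are hypothetical. The Jacobian term uses \(J_{\theta }=\prod _{i}1/\left[ \theta _{i}(1-\theta _{i})\right] \), the derivative of the logit transformation.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def expit(v):
    return 1.0 / (1.0 + math.exp(-v))


def log_lik(theta, y, x):
    # Assumed Bernoulli log-likelihood of the observations given the states
    return sum(math.log(theta[s] if obs == 1 else 1.0 - theta[s])
               for obs, s in zip(y, x))


def log_jacobian(theta):
    # log J_theta, with J_theta = prod_i 1 / (theta_i * (1 - theta_i))
    return -sum(math.log(t * (1.0 - t)) for t in theta)


def update_theta(theta_old, y, x, sigma, rng):
    """One random-walk update of (theta_0, theta_1) on the logit scale."""
    prop = [expit(logit(t) + rng.gauss(0.0, sigma)) for t in theta_old]
    prop.sort()  # permute the pair if theta_0 > theta_1
    # log target ratio (flat prior assumed) plus log of J_old / J_new
    log_ratio = (log_lik(prop, y, x) - log_lik(theta_old, y, x)
                 + log_jacobian(theta_old) - log_jacobian(prop))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return prop, True
    return theta_old, False
```

Working on the log scale avoids underflow when the sequence of sites is long; with an informative prior, its log-density ratio would simply be added to `log_ratio`.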
[Step 3] Each coefficient \(\alpha _{i,j,k}^{(h)}\) (\(i,j\in S_{X}, i\ne j, k=0,1,\ldots ,K\)) is independently generated from the random walk \(\alpha _{i,j,k}^{(h)}=\alpha _{i,j,k}^{(h-1)}+U_{A}\), where \(U_{A}\) is Gaussian noise with zero mean and constant variance \(\sigma _{A}^{2}\). Then, each parameter \(\eta _{k(i)}^{(h)}\) (\(i\in S_{X}\), \(k=1,\ldots ,K\)) is independently generated from a Bernoulli distribution of parameter \(a_{k(i)}^{(h)}/ \left( a_{k(i)}^{(h)}+b_{k(i)}^{(h)}\right) \). Here \(a_{k(i)}^{(h)}\) is the product of the transition probabilities \(\gamma _{x_{t-1},x_{t}}^{(h)}\), for any \(t=2,\ldots ,T\), when \(x_{t-1}^{(h)}=i\) and \(x_{t}^{(h)}=j\) (\(i\ne j\)), replacing \(\eta _{k(i)}\) in any \(\gamma _{x_{t-1},x_{t}}^{(h)}\) by 1, whereas \(b_{k(i)}^{(h)}\) is the same product with \(\eta _{k(i)}\) replaced by 0 in any \(\gamma _{x_{t-1},x_{t}}^{(h)}\). In the corresponding expressions, \(H_{i}^{[k;1]}\) is the diagonal matrix \(H_{i}\) in which \(\eta _{k(i)}^{(h)}\) is replaced by 1, i.e. \(H_{i}^{[k;1]}=\hbox {diag}\left( 1,\eta _{1(i)}^{(h)},\ldots ,\eta _{k-1(i)}^{(h)},1,\eta _{k+1(i)}^{(h-1)},\ldots ,\eta _{K(i)}^{(h-1)}\right) \), and \(H_{i}^{[k;0]}\) is the diagonal matrix \(H_{i}\) in which \(\eta _{k(i)}^{(h)}\) is replaced by 0, i.e. \(H_{i}^{[k;0]}=\hbox {diag}\left( 1,\eta _{1(i)}^{(h)},\ldots ,\eta _{k-1(i)}^{(h)},0,\eta _{k+1(i)}^{(h-1)},\ldots ,\eta _{K(i)}^{(h-1)}\right) \).
Finally, each pair of vectors \(\left( \alpha _{i,j}^{(h)};\eta _{i}^{(h)}\right) , i,j\in S_{X}\) (\(i\ne j\)), is accepted in block with the Metropolis-Hastings probability, which simplifies by the factorization of the proposal density, by the independence of \(\alpha _{i,j}^{(h)}\) from \(\eta _{i}^{(h-1)}\), and by cancelling the ratio \(q\left( \alpha _{i,j}^{(h-1)}\left| \alpha _{i,j}^{(h)};z;\delta \right. \right) / q\left( \alpha _{i,j}^{(h)}\left| \alpha _{i,j}^{(h-1)};z;\delta \right. \right) \), owing to the symmetry of the proposal distribution.
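The Bernoulli proposal for a single indicator \(\eta _{k(i)}\) in Step 3 can be sketched as below. The sketch rests on a loudly stated assumption: the off-diagonal transition probability at site \(t\) is taken to be the logistic function of \(z_{t}^{\prime }H_{i}\alpha _{i,j}\), which is a plausible reading of the logistic regression mentioned in the abstract but is not confirmed by the text of this appendix. The function names and the argument layout are hypothetical, and the block acceptance step is not shown.

```python
import math
import random


def log_expit(v):
    # Numerically stable log of the logistic function
    if v >= 0:
        return -math.log1p(math.exp(-v))
    return v - math.log1p(math.exp(v))


def log_product(eta, alpha, z_rows):
    """Log of the product of ASSUMED logistic transition probabilities.

    eta    : diagonal of H_i, i.e. [1, eta_1, ..., eta_K]
    alpha  : coefficients [alpha_0, ..., alpha_K] for the pair (i, j)
    z_rows : covariate vectors [1, z_1, ..., z_K] at the sites t where
             x_{t-1} = i and x_t = j
    """
    total = 0.0
    for z in z_rows:
        lin = sum(zk * ek * ak for zk, ek, ak in zip(z, eta, alpha))
        total += log_expit(lin)
    return total


def draw_eta_k(k, eta, alpha, z_rows, rng):
    """Draw eta_k from Bernoulli(a / (a + b)); also return that probability."""
    eta_one = list(eta)
    eta_one[k] = 1            # H_i^{[k;1]}
    eta_zero = list(eta)
    eta_zero[k] = 0           # H_i^{[k;0]}
    log_a = log_product(eta_one, alpha, z_rows)
    log_b = log_product(eta_zero, alpha, z_rows)
    diff = log_b - log_a
    # a / (a + b) computed on the log scale; guard against overflow
    p_one = 0.0 if diff > 700.0 else 1.0 / (1.0 + math.exp(diff))
    return (1 if rng.random() < p_one else 0), p_one
```

Computing \(a\) and \(b\) as log-products keeps the ratio stable when many transitions enter the product; when the coefficient of covariate \(k\) is zero, the two products coincide and the indicator is drawn with probability one half.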
[Step 4] Every missing observation \(y_{t}^{*}\), given the hidden state \(x_{t}^{(h)}=i\) (\(i\in S_{X}\)), is generated from the Bernoulli distribution \(\mathcal B e\left( \theta _{i}^{(h)}\right) \). This concludes the \(h\)-th iteration.
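Step 4 amounts to one Bernoulli draw per missing site. A minimal Python sketch follows, assuming that missing observations are coded as `None` (a convention chosen here for illustration, not taken from the paper) and using the hypothetical name `impute_missing`.

```python
import random


def impute_missing(y, x, theta, rng):
    """Fill each missing y_t (coded as None) with a Bernoulli(theta[x_t]) draw."""
    return [
        (1 if rng.random() < theta[state] else 0) if obs is None else obs
        for obs, state in zip(y, x)
    ]
```

Observed values are passed through unchanged, so the function can be applied to the full sequence at every iteration.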
Cite this article
Spezia, L., Cooksley, S.L., Brewer, M.J. et al. Mapping species distributions in one dimension by non-homogeneous hidden Markov models: the case of freshwater pearl mussels in the River Dee. Environ Ecol Stat 21, 487–505 (2014). https://doi.org/10.1007/s10651-013-0265-0
DOI: https://doi.org/10.1007/s10651-013-0265-0