Dynamic and robust Bayesian graphical models

Original Paper, published in Statistics and Computing

Abstract

Gaussian graphical models are widely used to study the conditional dependence among random variables. By encoding conditional dependence as an undirected graph, Gaussian graphical models provide interpretable representations and insightful visualizations of the relationships among variables. However, time series data present additional challenges: the graphs can evolve over time, with changes occurring at unknown time points, and the data often exhibit heavy tails. To address these challenges, we propose dynamic and robust Bayesian graphical models that employ hidden Markov models (HMMs) to introduce dynamics in the graph and heavy-tailed multivariate t-distributions for model robustness. The HMM latent states are linked both temporally and hierarchically for greater information sharing across time and between states. The proposed methods are computationally efficient and achieve excellent graph estimation on simulated data, with substantial improvements over non-robust graphical models. We demonstrate the proposed approach on human hand gesture tracking data, where it discovers edges and dynamics with clear practical interpretations.


References

  • Cheok, M.J., Omar, Z., Jaward, M.H.: A review of hand gesture and sign language recognition techniques. Int. J. Mach. Learn. Cybern. 10(1), 131–153 (2019)

  • Cremaschi, A., Argiento, R., Shoemaker, K., Peterson, C., Vannucci, M.: Hierarchical normalized completely random measures for robust graphical modeling. Bayesian Anal. 14(4), 1271 (2019)

  • Danaher, P., Wang, P., Witten, D.M.: The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. Ser. B Stat. Methodol. 76(2), 373 (2014)

  • Fan, J., Feng, Y., Wu, Y.: Network exploration via the adaptive Lasso and SCAD penalties. Ann. Appl. Stat. 3(2), 521 (2009)

  • Finegold, M., Drton, M.: Robust graphical modeling of gene networks using classical and alternative \(t\)-distributions. Ann. Appl. Stat. 5(2A), 1057–1080 (2011)

  • Finegold, M., Drton, M., et al.: Robust Bayesian graphical modeling using Dirichlet \( t \)-distributions. Bayesian Anal. 9(3), 521–550 (2014)

  • Friedman, J., Hastie, T., Tibshirani, R.: Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 432–441 (2008)

  • Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis. Chapman and Hall/CRC, London (1995)

  • Geweke, J.: Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments, vol. 196. Federal Reserve Bank of Minneapolis, Research Department, Minneapolis (1991)

  • Gibberd, A.J., Nelson, J.D.: Regularized estimation of piecewise constant Gaussian graphical models: the group-fused graphical lasso. J. Comput. Graph. Stat. 26(3), 623–634 (2017)

  • Gottard, A., Pacillo, S.: Robust concentration graph model selection. Comput. Stat. Data Anal. 54(12), 3070–3079 (2010)

  • Han, T.X., Ning, H., Huang, T.S.: Efficient nonparametric belief propagation with application to articulated body tracking. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 1, pp. 214–221. IEEE (2006)

  • Ishwaran, H., James, L.F.: Gibbs sampling methods for stick-breaking priors. J. Am. Stat. Assoc. 96(453), 161–173 (2001)

  • Jasra, A., Holmes, C.C., Stephens, D.A.: Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Stat. Sci. 20(1), 50–67 (2005)

  • Kendon, A.: Gesture: Visible Action as Utterance. Cambridge University Press, Cambridge (2004)

  • Kim, D., Song, J., Kim, D.: Simultaneous gesture segmentation and recognition based on forward spotting accumulative HMMs. Pattern Recognit. 40(11), 3012–3026 (2007)

  • Kolar, M., Song, L., Ahmed, A., Xing, E.P., et al.: Estimating time-varying networks. Ann. Appl. Stat. 4(1), 94–123 (2010)

  • Kotz, S., Nadarajah, S.: Multivariate \(t\)-Distributions and Their Applications. Cambridge University Press, Cambridge (2004)

  • Lenkoski, A., Dobra, A.: Computational aspects related to inference in Gaussian graphical models with the G-Wishart prior. J. Comput. Graph. Stat. 20(1), 140–157 (2011)

  • Liu, X., Daniels, M.J.: A new algorithm for simulating a correlation matrix based on parameter expansion and reparameterization. J. Comput. Graph. Stat. 15(4), 897–914 (2006)

  • Liu, Y., Wichura, M.J., Drton, M.: Rejection sampling for an extended Gamma distribution. Submitted (2012)

  • Lun, R., Zhao, W.: A survey of applications and human motion recognition with microsoft kinect. Int. J. Pattern Recognit. Artif. Intell. 29(05), 1555008 (2015)

  • Madeo, R.C., Lima, C.A., Peres, S.M.: Gesture unit segmentation using support vector machines: segmenting gestures from rest positions. In: Proceedings of the 28th Annual ACM Symposium on Applied Computing, pp. 46–52 (2013)

  • Meinshausen, N., Bühlmann, P., et al.: High-dimensional graphs and variable selection with the lasso. Ann. Stat. 34(3), 1436–1462 (2006)

  • Mitra, S., Acharya, T.: Gesture recognition: a survey. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 37(3), 311–324 (2007)

  • Miyamura, M., Kano, Y.: Robust Gaussian graphical modeling. J. Multivar. Anal. 97(7), 1525–1550 (2006)

  • Neal, R.M.: Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9(2), 249–265 (2000)

  • Ni, Y., Baladandayuthapani, V., Vannucci, M., Stingo, F.C.: Bayesian graphical models for modern biological applications. In: Statistical Methods and Applications, pp. 1–29 (2021)

  • Osborne, N., Peterson, C.B., Vannucci, M.: Latent network estimation and variable selection for compositional data via variational EM. J. Comput. Graph. Stat. 31(1), 163–175 (2022)

  • Peng, J., Wang, P., Zhou, N., Zhu, J.: Partial correlation estimation by joint sparse regression models. J. Am. Stat. Assoc. 104(486), 735–746 (2009)

  • Peterson, C.B., Osborne, N., Stingo, F.C., Bourgeat, P., Doecke, J.D., Vannucci, M.: Bayesian modeling of multiple structural connectivity networks during the progression of Alzheimer’s disease. Biometrics 76(4), 1120–1132 (2020)

  • Pinheiro, J.C., Liu, C., Wu, Y.N.: Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate \(t\) distribution. J. Comput. Graph. Stat. 10(2), 249–276 (2001)

  • Rodríguez, C.E., Walker, S.G.: Label switching in Bayesian mixture models: deterministic relabeling strategies. J. Comput. Graph. Stat. 23(1), 25–45 (2014)

  • Roverato, A.: Hyper inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand. J. Stat. 29(3), 391–411 (2002)

  • Scott, S.L.: Bayesian methods for hidden Markov models: recursive computing in the 21st century. J. Am. Stat. Assoc. 97(457), 337–351 (2002)

  • Shaddox, E., Stingo, F.C., Peterson, C.B., Jacobson, S., Cruickshank-Quinn, C., Kechris, K., Bowler, R., Vannucci, M.: A Bayesian approach for learning gene networks underlying disease severity in COPD. Stat. Biosci. 10(1), 59–85 (2018)

  • Sigal, L., Isard, M., Haussecker, H., Black, M.J.: Loose-limbed people: estimating 3d human pose and motion using non-parametric belief propagation. Int. J. Comput. Vis. 98(1), 15–48 (2012)

  • Song, Y., Goncalves, L., Perona, P.: Unsupervised learning of human motion. IEEE Trans. Pattern Anal. Mach. Intell. 25(7), 814–827 (2003)

  • Sperrin, M., Jaki, T., Wit, E.: Probabilistic relabelling strategies for the label switching problem in Bayesian mixture models. Stat. Comput. 20(3), 357–366 (2010)

  • Sun, H., Li, H.: Robust Gaussian graphical modeling via \(\ell _1\) penalization. Biometrics 68(4), 1197–1206 (2012)

  • Tan, L.S., Jasra, A., De Iorio, M., Ebbels, T.M.: Bayesian inference for multiple Gaussian graphical models with application to metabolic association networks. Ann. Appl. Stat. 11(4), 2222–2251 (2017)

  • Vinciotti, V., Hashem, H.: Robust methods for inferring sparse network structures. Comput. Stat. Data Anal. 67, 84–94 (2013)

  • Wang, H.: Bayesian graphical lasso models and efficient posterior computation. Bayesian Anal. 7(4), 867–886 (2012)

  • Wang, H.: Scaling it up: stochastic search structure learning in graphical models. Bayesian Anal. 10(2), 351–377 (2015)

  • Warnick, R., Guindani, M., Erhardt, E., Allen, E., Calhoun, V., Vannucci, M.: A Bayesian approach for estimating dynamic functional network connectivity in fMRI data. J. Am. Stat. Assoc. 113(521), 134–151 (2018)

  • Xu, R., Wu, J., Yue, X., Li, Y.: Online structural change-point detection of high-dimensional streaming data via dynamic sparse subspace learning. In: Technometrics, pp. 1–14 (2022)

  • Yang, E., Lozano, A.C.: Robust Gaussian graphical modeling with the trimmed graphical lasso. In: Advances in Neural Information Processing Systems, pp. 2602–2610 (2015)

  • Yang, J., Peng, J.: Estimating time-varying graphical models. J. Comput. Graph. Stat. 29(1), 191–202 (2020)

  • Zhang, C., Yan, H., Lee, S., Shi, J.: Dynamic multivariate functional data modeling via sparse subspace learning. Technometrics 63(3), 370–383 (2021)

Author information

Correspondence to Marina Vannucci.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial competing interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 MCMC for the dynamic classical-t graphical model

  • Update \({\varvec{\tau }}\): For \(t=1,2,\ldots ,T\), sample \(\tau _t\) from

    $$\begin{aligned} \text {Gamma}\left( \frac{\nu +p}{2}, \frac{\nu + {\varvec{y}}_t' {\varvec{\Omega }}_{s_t} {\varvec{y}}_t }{2}\right) , \end{aligned}$$

    where the second parameter is the rate.

  • Update \({{\varvec{\Omega }}}\) and \({{\varvec{G}}}\): For each \(s \in \{1,...,S\}\), let \(n_s\) denote the number of time points in state s. Compute \({\varvec{\Sigma }}_s = \sum _{t: s_t=s} \tau _t {\varvec{y}}_t {\varvec{y}}_t'\) and perform the following updates:

    1.

      Following Peterson et al. (2020), for \(s \in \{1,\ldots ,S\}\), sample \({{\varvec{\Omega }}}_{s}\) one column and row at a time. Here we demonstrate how to sample the last row and last column in the last state. First, partition \({\varvec{\Omega }}_S\) into

      $$\begin{aligned} {\varvec{\Omega }}_S = \begin{pmatrix} {\varvec{\Omega }}_{S,11} & {\varvec{\omega }}_{S,12} \\ {\varvec{\omega }}_{S,21} & \omega _{S,pp} \end{pmatrix} \end{aligned}$$

      where \(\omega _{S,pp}\) is the (pp)th element of the matrix, \({\varvec{\omega }}_{S,12}\) is the last column except \(\omega _{S,pp}\) and \({\varvec{\omega }}_{S,21}\) is the last row except \(\omega _{S,pp}\). Similarly, partition \({\varvec{\Theta }}_{ij}\), \(i,j=1,\ldots ,p\), and \({\varvec{\Sigma }}_S\) into

      $$\begin{aligned} {\varvec{\Theta }}_{ij} = \begin{pmatrix} {\varvec{\Theta }}_{ij,11} & {\varvec{\theta }}_{ij,12} \\ {\varvec{\theta }}_{ij,21} & \theta _{ij,SS} \end{pmatrix} , \quad {\varvec{\Sigma }}_S = \begin{pmatrix} {\varvec{\Sigma }}_{S,11} & {\varvec{\sigma }}_{S,12} \\ {\varvec{\sigma }}_{S,21} & \sigma _{S,pp} \end{pmatrix} . \end{aligned}$$

      Next, perform a one-to-one transformation of the last column of \({\varvec{\Omega }}_S\) as \(({\varvec{a}} , b) = ({\varvec{\omega }}_{S,12}, \omega _{S,pp} - {\varvec{\omega }}_{S,21} {\varvec{\Omega }}_{S,11}^{-1} {\varvec{\omega }}_{S,12} )\). It can be shown that the conditional posterior of \(({\varvec{a}}, b)\) is

      $$\begin{aligned} p({\varvec{a}}, b \mid {\varvec{\Omega }}_{S,11}, {\varvec{\Omega }}_1,\ldots ,{\varvec{\Omega }}_{S-1}, {\varvec{G}}_1,\ldots ,{\varvec{G}}_S, {\varvec{\Phi }}, {\varvec{s}}, {\varvec{Y}}) = \text {Normal}({\varvec{Q}}^{-1} {\varvec{l}}, {\varvec{Q}}^{-1}) \, \text {Gamma}\left( \dfrac{n_S}{2}+1, \dfrac{\sigma _{S,pp}+\uplambda }{2}\right) \end{aligned}$$
      (14)

      where \({\varvec{l}} = {\varvec{D}}^{-1}{\varvec{m}} - {\varvec{\sigma }}_{S,12}\), \({\varvec{Q}} = {\varvec{D}}^{-1} + (\sigma _{S,pp}+\uplambda ) {\varvec{\Omega }}_{S,11}^{-1}\), and the second parameter of the gamma distribution is the rate. Here \({\varvec{D}}\) is the covariance matrix of \({\varvec{\omega }}_{S,12}\), whose off-diagonal elements are 0 and whose jth diagonal element is \(\theta _{jp,SS} - {\varvec{\theta }}_{jp,21} {\varvec{\Theta }}_{jp,11}^{-1} {\varvec{\theta }}_{jp,12}\), \(j=1,\ldots ,p-1\); \({\varvec{m}}\) is the mean of \({\varvec{\omega }}_{S,12}\), whose jth element is \({\varvec{\theta }}_{jp,21} {\varvec{\Theta }}_{jp,11}^{-1} (\omega _{1,jp}, \omega _{2,jp},\ldots ,\omega _{S-1,jp})'\), \(j=1,\ldots ,p-1\). The positive definiteness of \({\varvec{\Omega }}_S\) is automatically guaranteed by Sylvester's criterion: assume the sampled \({\varvec{\Omega }}_S\) at the nth iteration, denoted \({\varvec{\Omega }}_S^{(n)}\), is positive definite. After updating its last column and row at the \((n+1)\)th iteration, the leading \(p-1\) principal minors of \({\varvec{\Omega }}_S^{(n+1)}\) remain positive, and the pth-order principal minor of \({\varvec{\Omega }}_S^{(n+1)}\) equals \(b \cdot |{\varvec{\Omega }}_{S, 11}^{(n)}|\). Since b is Gamma-distributed and hence positive, the pth leading principal minor is positive and \({\varvec{\Omega }}_S^{(n+1)}\) is positive definite.

    2.

      Sample \({\varvec{G}}_s\), \(s=1,\ldots ,S\) from the conditional Bernoulli distributions

      $$\begin{aligned} p(g_{s,ij} = 1 \mid {\varvec{\Omega }}_1,\ldots ,{\varvec{\Omega }}_S, {\varvec{\Phi }}, {\varvec{s}}, {\varvec{Y}}) = \dfrac{N(\omega _{s,ij}|0,v_1^2)\pi }{N(\omega _{s,ij}|0,v_1^2)\pi +N(\omega _{s,ij}|0,v_0^2)(1-\pi )}. \end{aligned}$$
  • Update \({\varvec{\Phi }}\): This is an MH-step based on the PX-RPMH algorithm of Liu and Daniels (2006). Compute the diagonal matrix V where \(V_{ss} =(\sum _{i=1}^{p-1} \sum _{j=i+1}^p \omega _{s,ij})^{-\frac{1}{2}}\), \(s = 1,...,S\), and transform \({\varvec{\Phi }}\) to \({\varvec{\Psi }}= V {\varvec{\Phi }}V\). Define \({\varvec{\epsilon }}_{ij} = V {\varvec{\omega }}_{ij}\), \(i,j=1,\ldots ,p\). Given the candidate prior \(p({\varvec{\Phi }}) \propto |{\varvec{\Phi }}|^{-\frac{S+1}{2}} {\varvec{1}}_{{\varvec{\Phi }}\in \mathcal {R^{S}}}\), the posterior of \({\varvec{\Psi }}\) is an Inverse-Wishart distribution with degrees of freedom \(p(p-1)/2\) and scale matrix \(\sum _{i=1}^{p-1} \sum _{j=i+1}^p \text {diag}({\varvec{v}}_{ij})^{-1} {\varvec{\epsilon }}_{ij} {\varvec{\epsilon }}_{ij}' \text {diag}({\varvec{v}}_{ij})^{-1}\). Sample \({\varvec{\Psi }}\) and set \({\varvec{\Phi }}^* = \text {diag}({\varvec{\Psi }})^{-\frac{1}{2}} {\varvec{\Psi }}\text {diag}({\varvec{\Psi }})^{-\frac{1}{2}} \). Accept the proposed \({\varvec{\Phi }}^*\) with probability

    $$\begin{aligned} \min \left( 1, \exp \left[ \dfrac{S+1}{2} \left( \log |{\varvec{\Phi }}^*| - \log |{\varvec{\Phi }}| \right) \right] \right) . \end{aligned}$$
  • Update \(\textbf{P}\) and \(\textbf{s}\): This is a Metropolis-Hastings step following Scott (2002). First sample the state transition probabilities for each state s from a \(\text {Dirichlet}({\varvec{\alpha }}_s)\) distribution where \(\alpha _{rs}= (\# r \rightarrow s) +1\), \(r = 1,...,S\), is the number of state transitions from r to s plus 1. Let \(\pi _{new}\) denote the stationary distribution given the transition probabilities. Accept the resulting transition matrix and stationary distribution with probability

    $$\begin{aligned} \min \left\{ 1, \frac{\pi _{new}(\text {state at the first time point})}{\pi _{old}(\text {state at the first time point})} \right\} . \end{aligned}$$

    Next, perform a Gibbs step using Forward Propagate Backward sampling. Obtain the forward transition probabilities for each state at each time point as \({\varvec{Q}}_{t}=[q_{trs}]\) by

    $$\begin{aligned} q_{trs}\propto \pi _{t-1}(r|...) p_{rs} N( \sqrt{\tau _t} {\varvec{y}}_t |{\varvec{0}},{\varvec{\Omega }}_s^{-1}), \end{aligned}$$

    where \(q_{trs}\) is the transition probability from r at time \(t-1\) to s at time t; compute \(\pi _{t}(r|...) = \sum _{s = 1}^S q_{tsr}\). Sample \(s_{T}\) from \(\pi _{T}\) and then recursively sample \(s_{t}\) proportional to the \(s_{t+1}\)-th column of \({\varvec{Q}}_{t+1}\) until \(s_{1}\) is sampled.
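Putting the steps above together, the \({\varvec{\tau }}\) update and the forward-backward update of \(\textbf{s}\) can be sketched in code. This is a minimal Python/NumPy sketch under simplifying assumptions: the Metropolis-Hastings correction involving the stationary distribution at the first time point is omitted, an implicit uniform initial state distribution is used instead, and the function names `update_tau` and `ffbs` are ours, not the authors'.

```python
import numpy as np

def update_tau(Y, s, Omega_list, nu, rng):
    """Gamma draw for the classical-t scale variables:
    tau_t ~ Gamma((nu + p)/2, rate = (nu + y_t' Omega_{s_t} y_t)/2)."""
    T, p = Y.shape
    tau = np.empty(T)
    for t in range(T):
        rate = 0.5 * (nu + Y[t] @ Omega_list[s[t]] @ Y[t])
        tau[t] = rng.gamma(0.5 * (nu + p), 1.0 / rate)  # NumPy uses scale = 1/rate
    return tau

def ffbs(Y, tau, P, Omega_list, rng):
    """Forward-filter backward-sample the hidden states s_1..s_T.
    Emission model: sqrt(tau_t) y_t ~ N(0, Omega_{s_t}^{-1})."""
    T, p = Y.shape
    S = P.shape[0]
    # log-likelihood of sqrt(tau_t) y_t under each state's precision matrix
    loglik = np.empty((T, S))
    for s in range(S):
        _, logdet = np.linalg.slogdet(Omega_list[s])
        for t in range(T):
            x = np.sqrt(tau[t]) * Y[t]
            loglik[t, s] = 0.5 * (logdet - x @ Omega_list[s] @ x)
    pi = np.empty((T, S))        # filtered state probabilities pi_t
    Q = np.zeros((T, S, S))      # Q[t, r, s]: prob. of state r at t-1 and s at t
    lik0 = np.exp(loglik[0] - loglik[0].max())
    pi[0] = lik0 / lik0.sum()    # implicit uniform initial distribution
    for t in range(1, T):
        Q[t] = pi[t - 1][:, None] * P * np.exp(loglik[t] - loglik[t].max())[None, :]
        Q[t] /= Q[t].sum()
        pi[t] = Q[t].sum(axis=0)
    # backward sampling: s_T from pi_T, then s_t from the s_{t+1}-th column of Q_{t+1}
    s_draw = np.empty(T, dtype=int)
    s_draw[-1] = rng.choice(S, p=pi[-1])
    for t in range(T - 2, -1, -1):
        w = Q[t + 1][:, s_draw[t + 1]]
        s_draw[t] = rng.choice(S, p=w / w.sum())
    return s_draw
```

In the full sampler these draws would be interleaved with the \({\varvec{\Omega }}\), \({\varvec{G}}\), and \({\varvec{\Phi }}\) updates; the max-shift inside the exponentials keeps the forward pass numerically stable for strongly separated states.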

1.2 MCMC for the dynamic Dirichlet-t graphical model

  • Update \({{\varvec{\Omega }}}\) and \({{\varvec{G}}}\): Same as the classical-t model with \({\varvec{\Sigma }}_s = \sum _{t: s_t=s} \text {diag}(\sqrt{{\varvec{\tau }}_t}) {\varvec{y}}_t {\varvec{y}}_t'\text {diag}(\sqrt{{\varvec{\tau }}_t})\) for \(s \in \{1,\ldots ,S\}\).

  • Update \({{\varvec{\Phi }}}\): Same as the classical-t model.

  • Update \(\textbf{P}\) and \(\textbf{s}\): Same as the classical-t model with \(q_{trs} \propto \pi _{t-1}(r|...) p_{rs} N( \text {diag}(\sqrt{{\varvec{\tau }}_t}){\varvec{y}}_t |{\varvec{0}},{\varvec{\Omega }}_s^{-1} )\).

  • Update \({\varvec{\tau }}\): We use the truncated stick-breaking process to approximate the Dirichlet process (Ishwaran and James 2001). In simulation studies, we found that fixing the number of possible clusters reduces the MCMC processing time by up to 90%, as we avoid frequent sampling of the probability that \(\tau _{tj}\) is in a new cluster, which requires evaluating the pdf of noncentral t-distributions. Given the pre-specified level of truncation K, let \({\varvec{Z}}\) be a \(T \times p\) matrix of cluster indicators and let \(n_{tk}\) be the size of cluster k at time t, \(k = 1, \ldots , K\), \(t = 1,...,T\). \({\varvec{\eta }}_t = (\eta _{t1},\ldots ,\eta _{t K})\) denotes the unique cluster values of \({\varvec{\tau }}_t\). Define the cluster weights as \(\{w_{tk}\}_{t \in \{1,\ldots ,T\}, k \in \{1,\ldots ,K\} }\) and perform the following steps:

    1.

      For each \(t=1,\ldots ,T\) and \(j=1,\ldots ,p\), draw \(z_{tj}\) and update \(x_{tj} = y_{tj} \sqrt{\eta _{t, z_{tj}}}\). The probability that \(y_{tj}\) belongs to cluster k, \(k=1,\ldots ,K\), is

      $$\begin{aligned} p(z_{tj}= k | {\varvec{z}}_{t,-j }, {\varvec{\eta }}_t, {\varvec{y}}_t, {\varvec{\Omega }}_{s_t}, {\varvec{w}}_t) \propto w_{tk} \text {N}\left( y_{tj} | \frac{\mu _c}{\sqrt{\eta _{tk}}}, \frac{\sigma _c^2}{\eta _{tk}}\right) , \end{aligned}$$

      where N\((x|\mu ,\sigma ^2)\) is the Normal(\(\mu ,\sigma ^2\)) pdf evaluated at x. Here \(\mu _c\) and \(\sigma _c^2\) are the conditional mean and variance of \(x_{tj}|{\varvec{x}}_{t,-j}\), where \({\varvec{x}}_{t,-j} = \{x_{ti}:i\ne j, i \in \{1,\ldots , p\} \}\) and \({\varvec{x}}_t \sim MVN({\varvec{0}}, {\varvec{\Omega }}_{s_t}^{-1})\)

    2.

      For each \(t=1,\ldots ,T\) and \(k=1,\ldots ,K-1\), sample the stick-breaking weight \(v_{tk}\) from Beta\((1+n_{tk}, \alpha + \sum _{k'= k+1}^{K} n_{t k'})\). Compute \(w_{tk} = v_{tk} \prod _{k'<k} (1-v_{tk'})\).

    3.

      For each \(t=1,\ldots ,T\) and each \(k=1,\ldots ,K\), draw \(\eta _{tk}\) from distribution

      $$\begin{aligned} f(\eta _{tk}| {\varvec{\eta }}_{t,-k}, {\varvec{z}}_t, {\varvec{y}}_t, {\varvec{\Omega }}_{s_t}) \propto \eta _{tk}^{a-1} \text {exp}\{ -b \eta _{tk} -c \sqrt{\eta _{tk}} \} \end{aligned}$$
      (15)

      with \(a = \frac{\nu +n_k}{2}\), \(b=\frac{1}{2}[\nu + \text {tr}({\varvec{\Omega }}_{(k)(k)} {\varvec{y}}_{(k)} {\varvec{y}}_{(k)}' )]\), \(c = \text {tr}( {\varvec{\Omega }}_{(k)(-k)} {\varvec{x}}_{(-k)} {\varvec{y}}_{(k)}' )\), \((k) = \{j: z_j=k, j \in \{1,\ldots ,p\}\}\) and \((-k) = \{j: z_j\ne k, j \in \{ 1,\ldots ,p\}\}\) (t and \(s_t\) are dropped here for simplicity). In order to sample \(\eta _{tk}\), we do a variable transformation \(\tau ^* = b \eta _{tk}\) and get

      $$\begin{aligned} f(\tau ^*| \ldots ) \propto \tau ^{*(a^*-1)} \text {exp}\{ -\tau ^* -2 c^*\sqrt{\tau ^*} \} \end{aligned}$$

      with \(a^*=a\) and \(c^*=\frac{c}{2\sqrt{b}} \). Next, use the rejection algorithm from Liu et al. (2012) to sample \(\tau ^*\):

      (a)

        When \(c^*/\sqrt{a^*}\le -0.7\), use Algorithm 2 from Liu et al. (2012). Compute \(m=\frac{2 a^*-1}{c^* + \sqrt{(c^*)^2 + 4 a^* -2}}\). Repeatedly generate \(X \sim N (m, \frac{1}{2} )\) and \(U\sim \) Uniform(0, 1) until \(X>0\) and \(m^{2a^*-1}U< X^{2a^*-1}\text {exp}\{ -2(m+c^*)(X-m)\}\). Accept \(X^2\).

      (b)

        When \(-0.7<c^*/\sqrt{a^*}< 0\), use the first part of Algorithm 1 from Liu et al. (2012). Compute \(m=\frac{4a^*}{( \sqrt{(c^*)^2+4a^*}-c^*)^2}\). Repeatedly generate \(X\sim \text {Gamma}(a^*, m)\) (m is the rate parameter) and \(U\sim \text {Uniform}(0, 1)\) until \(U< \exp \{a^*+X(m-1)-2\sqrt{X}c^* - a^*/m\}\). Accept X.

      (c)

        When \(0\le c^*/\sqrt{a^*}< 0.7\), use the second part of Algorithm 1 from Liu et al. (2012). Repeatedly generate \(X\sim \)Gamma\((a^*,1)\) and \(U\sim \)Uniform(0, 1) until \(U< \text {exp}(-2 c^* \sqrt{X})\). Accept X.

      (d)

        When \(0.7\le c^*/\sqrt{a^*}\), use Algorithm 3 from Liu et al. (2012). Compute \(m= c^*+\sqrt{(c^*)^2+4a^*}\). Repeatedly generate \(X\sim \)Gamma\((2a^*,m)\) and \(U\sim \)Uniform(0, 1) until \(U<\text {exp}\{ - (X-\frac{m}{2}+c^*)^2\}\). Accept \(X^2\).

      (e)

        \(\eta _{tk} = \tau ^* / b\).

      (f)

        Update \(x_{tj} = \sqrt{\eta _{tk}}\, y_{tj}\) for each j in cluster k.

  • Update \({\varvec{\alpha }}\): Sample \(\alpha \) from

    $$\begin{aligned} \alpha | \ldots \sim \text {Gamma}\left( a_{\alpha } + T (K - 1),\; b_{\alpha } - \sum _{t=1}^T \sum _{k=1}^{K - 1} \log ( 1 - v_{tk}) \right) . \end{aligned}$$
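The rejection samplers in cases (a)-(d) above can be collected into a single routine. The following is a Python/NumPy sketch of the Liu et al. (2012) samplers for the extended gamma density \(f(\tau ^*) \propto \tau ^{*(a^*-1)} \exp \{-\tau ^* -2 c^*\sqrt{\tau ^*}\}\); the function name `sample_ext_gamma` is ours. NumPy's gamma generator is parameterized by scale, so a rate m becomes scale 1/m.

```python
import numpy as np

def sample_ext_gamma(a, c, rng):
    """One draw from f(t) ∝ t^{a-1} exp(-t - 2 c sqrt(t)), t > 0,
    via the case-by-case rejection samplers of Liu et al. (2012)."""
    ratio = c / np.sqrt(a)
    if ratio <= -0.7:
        # (a) Algorithm 2: normal proposal on the sqrt scale (requires a > 1/2)
        m = (2 * a - 1) / (c + np.sqrt(c**2 + 4 * a - 2))
        while True:
            x = rng.normal(m, np.sqrt(0.5))
            u = rng.uniform()
            if x > 0 and m**(2 * a - 1) * u < x**(2 * a - 1) * np.exp(-2 * (m + c) * (x - m)):
                return x**2
    elif ratio < 0:
        # (b) Algorithm 1, first part: gamma proposal with rate m
        m = 4 * a / (np.sqrt(c**2 + 4 * a) - c)**2
        while True:
            x = rng.gamma(a, 1.0 / m)
            if rng.uniform() < np.exp(a + x * (m - 1) - 2 * np.sqrt(x) * c - a / m):
                return x
    elif ratio < 0.7:
        # (c) Algorithm 1, second part: Gamma(a, 1) proposal thinned by exp(-2 c sqrt(x))
        while True:
            x = rng.gamma(a, 1.0)
            if rng.uniform() < np.exp(-2 * c * np.sqrt(x)):
                return x
    else:
        # (d) Algorithm 3: gamma proposal on the sqrt scale
        m = c + np.sqrt(c**2 + 4 * a)
        while True:
            x = rng.gamma(2 * a, 1.0 / m)
            if rng.uniform() < np.exp(-(x - m / 2 + c)**2):
                return x**2
```

In the \(\eta _{tk}\) update one would call this with \(a = a^*\), \(c = c^*\) and then set \(\eta _{tk} = \tau ^*/b\) as in step (e).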

1.3 Parameter selection for the fused graphical lasso

In the simulation study, we evaluate the performance of the fused graphical lasso (fLasso; Danaher et al. 2014) by treating the true hidden states as known. The fLasso contains two tuning parameters: \(\uplambda _1\) controls the graphical lasso penalty (i.e., graph sparsity) and \(\uplambda _2\) controls the fused lasso penalty (i.e., graph similarity).

To estimate a single graph, we select both \(\uplambda _1\) and \(\uplambda _2\) from a grid of values based on AIC, as done in Danaher et al. (2014). Specifically, for each value of \(\uplambda _2\) on a grid, we define the grid of \(\uplambda _1\) such that the graph ranges from empty to full. After fitting the fLasso for each \((\uplambda _1, \uplambda _2)\) pair, we return the estimated graph for which the AIC is smallest.

To compute an ROC curve for fLasso, we again vary \(\uplambda _2\) on a grid, but now compute the AUC (across values of \(\uplambda _1\) ranging from the empty graph to the full graph) and select the value of \(\uplambda _2\) for which the AUC is largest.
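The AIC criterion of Danaher et al. (2014), \(\text {AIC} = \sum _k \left[ n_k \,\text {tr}({\varvec{S}}_k \hat{{\varvec{\Theta }}}_k) - n_k \log \det \hat{{\varvec{\Theta }}}_k + 2 E_k \right] \) with \(E_k\) the number of estimated edges in class k, makes the grid search easy to sketch. Below is a Python/NumPy sketch in which the solver `fit` is a placeholder for an actual fused graphical lasso implementation (e.g. a wrapper around the R package JGL); `aic_fused` and `select_lambdas` are hypothetical names, not part of any package.

```python
import numpy as np

def aic_fused(Theta_list, S_list, n_list, tol=1e-8):
    """AIC of Danaher et al. (2014): sum_k n_k[tr(S_k Theta_k) - logdet Theta_k] + 2 E_k,
    where E_k counts the nonzero upper-triangular entries of Theta_k."""
    aic = 0.0
    for Theta, S, n in zip(Theta_list, S_list, n_list):
        _, logdet = np.linalg.slogdet(Theta)
        E = np.count_nonzero(np.abs(Theta[np.triu_indices_from(Theta, k=1)]) > tol)
        aic += n * (np.trace(S @ Theta) - logdet) + 2 * E
    return aic

def select_lambdas(fit, S_list, n_list, lam1_grid, lam2_grid):
    """Grid search over (lambda1, lambda2). `fit(S_list, n_list, lam1, lam2)` is a
    user-supplied solver returning one precision matrix estimate per state."""
    best = (np.inf, None, None, None)
    for lam2 in lam2_grid:
        for lam1 in lam1_grid:
            Theta_list = fit(S_list, n_list, lam1, lam2)
            score = aic_fused(Theta_list, S_list, n_list)
            if score < best[0]:
                best = (score, lam1, lam2, Theta_list)
    return best
```

Returning the fitted graphs along with the selected pair avoids refitting at the chosen values; for the ROC variant one would instead sweep \(\uplambda _1\) at each \(\uplambda _2\) and score the resulting paths by AUC.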

1.4 Simulation study results when \(p = 10\)

 

       Classical-t           Slightly contaminated   Highly contaminated
       TPR   FPR   MCC       TPR   FPR   MCC         TPR   FPR   MCC
dGM    0.58  0.21  0.36      0.56  0.22  0.35        0.38  0.29  0.09
dCT    0.98  0.01  0.97      0.81  0.10  0.71        0.68  0.16  0.52
dDT    0.98  0.01  0.97      0.97  0.01  0.96        0.97  0.02  0.95

Performance comparisons for hidden state estimation over the three hidden states, for the three simulation scenarios, \(p = 10\)

 

         Classical-t               Slightly contaminated     Highly contaminated
         TPR   FPR   MCC   FL      TPR   FPR   MCC   FL      TPR   FPR   MCC   FL
fLasso   0.99  0.87  0.20  0.35    0.97  0.72  0.31  0.45    0.90  0.63  0.26  0.65
dGM      0.45  0.02  0.54  0.38    0.42  0.02  0.51  0.38    0.31  0.03  0.39  0.51
dCT      0.92  0.01  0.93  0.06    0.67  0.03  0.70  0.42    0.58  0.05  0.60  0.37
dDT      0.92  0.01  0.93  0.09    0.91  0.02  0.91  0.25    0.85  0.01  0.86  0.18

Performance comparisons for graph estimation, \(p = 10\)

1.5 List of variables in the hand gesture data set

Code   Variable name
1      Vectorial velocity of left hand (x coordinate)
2      Vectorial velocity of left hand (y coordinate)
3      Vectorial velocity of left hand (z coordinate)
4      Vectorial velocity of right hand (x coordinate)
5      Vectorial velocity of right hand (y coordinate)
6      Vectorial velocity of right hand (z coordinate)
7      Vectorial velocity of left wrist (x coordinate)
8      Vectorial velocity of left wrist (y coordinate)
9      Vectorial velocity of left wrist (z coordinate)
10     Vectorial velocity of right wrist (x coordinate)
11     Vectorial velocity of right wrist (y coordinate)
12     Vectorial velocity of right wrist (z coordinate)
13     Vectorial acceleration of left hand (x coordinate)
14     Vectorial acceleration of left hand (y coordinate)
15     Vectorial acceleration of left hand (z coordinate)
16     Vectorial acceleration of right hand (x coordinate)
17     Vectorial acceleration of right hand (y coordinate)
18     Vectorial acceleration of right hand (z coordinate)
19     Vectorial acceleration of left wrist (x coordinate)
20     Vectorial acceleration of left wrist (y coordinate)
21     Vectorial acceleration of left wrist (z coordinate)
22     Vectorial acceleration of right wrist (x coordinate)
23     Vectorial acceleration of right wrist (y coordinate)
24     Vectorial acceleration of right wrist (z coordinate)
25     Scalar velocity of left hand
26     Scalar velocity of right hand
27     Scalar velocity of left wrist
28     Scalar velocity of right wrist
29     Scalar acceleration of left hand
30     Scalar acceleration of right hand
31     Scalar acceleration of left wrist
32     Scalar acceleration of right wrist

Variables and their codes

1.6 Results from the fused graphical lasso model on the hand gesture data set

In the application study, we fit the fused graphical lasso model given the two states generated by the proposed dynamic Dirichlet-t graphical model. The optimal graphs chosen by AIC (Danaher et al. 2014) are full graphs. To reach a desired level of sparsity, we adjust the two fLasso tuning parameters until the estimates have a similar number of edges to the Dirichlet-t model. The estimated graphs are plotted below. The graphs in the two states include all edges describing the positive correlation of velocity between the left hand and the left wrist, the positive correlation of acceleration between the left hand and the left wrist, and the analogous positive correlations between the right hand and the right wrist. However, many expected edges are not detected. Thus, the fLasso results, even with the states supplied, are unsatisfactory.

Estimated partial correlations in each hidden state from the fused graphical lasso model. Blue lines indicate positive correlations and red lines indicate negative correlations; line thickness represents the correlation strength; a dotted line indicates that the edge is unique to the state

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, C., Kowal, D.R. & Vannucci, M. Dynamic and robust Bayesian graphical models. Stat Comput 32, 105 (2022). https://doi.org/10.1007/s11222-022-10177-0
