A new flexible Bayesian hypothesis test for multivariate data

  • Original Paper
  • Published in Statistics and Computing

Abstract

We propose a Bayesian hypothesis testing procedure for comparing the multivariate distributions of several treatment groups against a control group. The test is derived from a flexible model for the group distributions based on a random binary vector such that, if its jth element equals one, the jth treatment group is merged with the control group. The flexibility of the group distributions comes from a dependent Dirichlet process, while the prior distribution on the latent vector ensures a multiplicity correction for the testing procedure. We explore the posterior consistency of the Bayes factor and provide a Monte Carlo simulation study comparing the performance of our procedure with state-of-the-art alternatives. Our results show that the proposed method outperforms the competing approaches. Finally, we apply our proposal to two classical experiments: the first studies the effects of tuberculosis vaccines on multiple health outcomes in rabbits, and the second analyzes the effects of two drugs on weight gain in rats. In both applications, we find relevant differences between the control group and at least one treatment group.


References


Acknowledgements

The first author was supported by CONICYT PFCHA/DOCTORADO BECAS CHILE/2020-21201742. The second author was supported by Fondecyt Grant 1220229 and ANID–Millennium Science Initiative Program–NCN17_059. The third author was partially supported by Fondecyt Grant 11190018 and UKRI Medical Research Council Grant MC_UU_00002/5.

Author information

Correspondence to Iván Gutiérrez.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Useful properties of the Bayesian Normal model

For future reference, we review here some useful properties of the conjugate Normal model. Let \(\varvec{y}_{1:m}:= (\varvec{y}_1, \ldots , \varvec{y}_{m})\), where \((\varvec{y}_i \,|\, \varvec{\mu }, \varvec{\Sigma }) {\mathop {\sim }\limits ^{iid}} \text {N}_D(\varvec{\mu }, \varvec{\Sigma })\) and \((\varvec{\mu }, \varvec{\Sigma }) \sim \text {NIW}_D(\varvec{u}_0,r_0, \nu _0, \varvec{S}_0)\). Then, \(\big ((\varvec{\mu }, \varvec{\Sigma }) \,|\, \varvec{y}_{1:m}\big ) \sim \text {NIW}_D(\varvec{u}_m, r_m, \nu _m, \varvec{S}_m)\), where \((\varvec{u}_m, r_m, \nu _m, \varvec{S}_m)\) can be computed recursively as

$$\begin{aligned} \begin{aligned} \nu _m&= \nu _{m-1} + 1, \\ r_m&= r_{m-1} + 1, \\ \varvec{u}_m&= (r_{m-1} \varvec{u}_{m-1} + \varvec{y}_m) / r_m, \\ \varvec{S}_m&= \varvec{S}_{m-1} + (\varvec{y}_m - \varvec{u}_m)(\varvec{y}_m - \varvec{u}_m)' \frac{r_m}{r_{m-1}}. \end{aligned} \end{aligned}$$
(6)

Moreover, the marginal likelihood and the one-step-ahead predictive density can be written as

$$\begin{aligned} p(\varvec{y}_{1:m}) = \frac{1}{\pi ^{mD/2}} \frac{r_0^{D/2}}{r_m^{D/2}} \frac{ \,|\, \varvec{S}_0 \,|\, ^{\nu _0 / 2}}{ \,|\, \varvec{S}_m \,|\, ^{\nu _m / 2}} \frac{\Gamma _D(\nu _m / 2)}{\Gamma _D(\nu _0 / 2)} \end{aligned}$$
(7)

and

$$\begin{aligned}&p(\varvec{y}_{m+1} \,|\, \varvec{y}_{1:m}) = \nonumber \\&\quad \frac{1}{\pi ^{D/2}} \frac{r_m^{D/2}}{r_{m+1}^{D/2}} \frac{ \,|\, \varvec{S}_m \,|\, ^{\nu _m / 2}}{ \,|\, \varvec{S}_{m+1} \,|\, ^{\nu _{m+1} / 2}} \frac{\Gamma _D(\nu _{m+1} / 2)}{\Gamma _D(\nu _m / 2)}, \end{aligned}$$
(8)

respectively, where \(\Gamma _D(\cdot )\) is the D-variate Gamma function (Bernardo and Smith 1994). In practice, we almost never compute \(\varvec{S}_m\) directly; instead, we maintain its Cholesky decomposition, \(\varvec{S}_m = \varvec{P}_m' \varvec{P}_m\). Specifically, we compute \(\varvec{P}_0\) from scratch and then obtain \(\varvec{P}_1, \ldots , \varvec{P}_m\) through a series of rank-1 updates (Golub and Van Loan 2013, Section 6.5.4).
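To make these updates concrete, the following Julia sketch (ours, not the MANOVABNPTest.jl implementation; the type `NIWState` and all function names are our own) maintains \((\varvec{u}_m, r_m, \nu _m)\) together with the Cholesky factor of \(\varvec{S}_m\), refreshing the factor via `LinearAlgebra.lowrankupdate!`, and evaluates the logarithm of the marginal likelihood (7):

```julia
using LinearAlgebra, SpecialFunctions

# Cached state of the conjugate Normal model after m observations:
# the NIW hyperparameters (u_m, r_m, ν_m) and the Cholesky factor of S_m.
mutable struct NIWState
    u::Vector{Float64}
    r::Float64
    ν::Float64
    S::Cholesky{Float64,Matrix{Float64}}
end

NIWState(u0::AbstractVector, r0::Real, ν0::Real, S0::AbstractMatrix) =
    NIWState(Vector{Float64}(u0), Float64(r0), Float64(ν0),
             cholesky(Symmetric(Matrix{Float64}(S0))))

# Recursion (6): absorb one observation y, refreshing the Cholesky
# factor with a rank-1 update instead of refactorizing from scratch.
function absorb!(st::NIWState, y::AbstractVector)
    rold = st.r
    st.ν += 1
    st.r += 1
    st.u = (rold .* st.u .+ y) ./ st.r
    lowrankupdate!(st.S, (y .- st.u) .* sqrt(st.r / rold))
    return st
end

# log Γ_D(a), the D-variate log-gamma function.
logΓD(D, a) = D * (D - 1) / 4 * log(π) + sum(loggamma(a + (1 - d) / 2) for d in 1:D)

# Logarithm of the marginal likelihood (7), given the prior and posterior states.
function logml(prior::NIWState, post::NIWState, m::Integer)
    D = length(post.u)
    -m * D / 2 * log(π) + D / 2 * (log(prior.r) - log(post.r)) +
        prior.ν / 2 * logdet(prior.S) - post.ν / 2 * logdet(post.S) +
        logΓD(D, post.ν / 2) - logΓD(D, prior.ν / 2)
end
```

For instance, starting from `st = NIWState(zeros(D), 1.0, D + 2.0, Matrix{Float64}(I, D, D))` and calling `absorb!(st, y)` once per observation reproduces recursion (6); keeping a copy of the prior state then lets `logml` evaluate (7).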

Appendix B: Posterior simulation algorithm

Appendix B.1: Updating the DP concentration parameter (Step 1, Algorithm 1)

We update \(\alpha \) using the algorithm of Escobar and West (1995), which proceeds as follows:

  1. Draw \(\phi \,|\, \alpha \sim \text {Beta}(\alpha + 1, N)\).

  2. Compute \(n_{k} = \# \{s_i: i \in {\mathcal {N}}\}\), the number of distinct cluster labels.

  3. Compute \(\psi \) from

     $$\begin{aligned} \psi / (1 - \psi ) = (a_0 + n_{k} - 1) / \{N (b_0 - \log \phi )\}. \end{aligned}$$

  4. Draw \(\chi \sim \text {Bernoulli}(\psi )\).

  5. Draw

     $$\begin{aligned} \alpha \sim {\left\{ \begin{array}{ll} \text {Gamma}(a_0 + n_{k}, b_0 - \log \phi ) &{} \text {if }\chi = 1, \\ \text {Gamma}(a_0 + n_{k} - 1, b_0 - \log \phi ) &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
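In Julia, using Distributions.jl, these five steps admit the following direct transcription (a sketch; the function name and signature are ours):

```julia
using Distributions

# One Escobar–West (1995) refresh of the DP concentration parameter α,
# given N observations, nk distinct cluster labels, and a Gamma(a0, b0)
# prior on α, with b0 a rate parameter.
function update_alpha(α, N, nk, a0, b0)
    ϕ = rand(Beta(α + 1, N))                    # step 1
    odds = (a0 + nk - 1) / (N * (b0 - log(ϕ)))  # steps 2–3
    ψ = odds / (1 + odds)
    χ = rand(Bernoulli(ψ))                      # step 4
    a = χ ? a0 + nk : a0 + nk - 1               # step 5
    return rand(Gamma(a, 1 / (b0 - log(ϕ))))   # Gamma(shape, scale)
end
```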

Appendix B.2: Updating the cluster labels (Step 2, Algorithm 1)

It is not hard to see that, conditionally on \((\alpha , \varvec{\gamma })\), model (3) reduces to a Dirichlet process mixture model (Lo 1984). Let \({\mathcal {I}}_{-i}:= {\mathcal {N}} \backslash \{i\}\) and \(\varvec{s}_{-i}:= (s_\ell : \ell \in {\mathcal {I}}_{-i})\). Then, there is a well-known, closed-form expression for \(p(s_i \,|\, \varvec{s}_{-i}, \varvec{z}, \varvec{y},\alpha ) = p(s_i \,|\, \varvec{s}_{-i}, \varvec{x}, \varvec{y},\alpha , \varvec{\gamma })\), namely

$$\begin{aligned}&P(s_i = k \,|\, \varvec{s}_{-i}, \varvec{z}, \varvec{y},\alpha ) \\&\propto {\left\{ \begin{array}{ll} \alpha p(\varvec{y}_i \,|\, \varvec{z}, \varvec{y}_{-i}, \varvec{s}_{-i}, s_i = k), &{} \text {if }k = k^*, \\ n_{-ik} p(\varvec{y}_i \,|\, \varvec{z}, \varvec{y}_{-i}, \varvec{s}_{-i}, s_i = k), &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$

where \(k^*:= 1 + \max (\varvec{s}_{-i})\), \(\varvec{y}_{-i}:= (\varvec{y}_\ell : \ell \in {\mathcal {I}}_{-i})\) and \(n_{-ik} = \#\{\ell \in {\mathcal {I}}_{-i}: s_\ell = k\}\) (Neal 2000). Now, let \({\mathcal {I}}_{-ik} = \{\ell \in {\mathcal {I}}_{-i}: z_\ell = z_i, s_\ell = k\}\) and \(\varvec{y}_{-ik} = (\varvec{y}_\ell : \ell \in {\mathcal {I}}_{-ik})\). Then, we can rewrite \(p(\varvec{y}_i \,|\, \varvec{z}, \varvec{y}_{-i}, \varvec{s}_{-i}, s_i = k)\) as \(p(\varvec{y}_i \,|\, z_i, \varvec{y}_{-ik}, s_i = k)\), which coincides with the predictive likelihood of an out-of-sample observation \(\varvec{y}_i\), given a sample \(\varvec{y}_{-ik}\), under the conjugate Normal model described in the previous appendix. Hence, \(p(\varvec{y}_i \,|\, \varvec{z}, \varvec{y}_{-i}, \varvec{s}_{-i}, s_i = k)\) can be computed efficiently using Eqs. (7) and (8).

In practice, we rarely need to compute the quantity \(p(\varvec{y}_i \,|\, \varvec{y}_{-ik}, s_i = k, z_i = j)\) from scratch. Instead, the statistics used in its computation, denoted by \((\varvec{u}_m, r_m, \nu _m, \varvec{S}_m)\) in (6), are cached as \(\varvec{T}_{jk}\); each time \(s_i\) changes its value, say from \(k_1\) to \(k_2\), only \(\varvec{T}_{jk_1}\) and \(\varvec{T}_{jk_2}\) are updated using recursion (6), as sketched below.
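The removal step is the inverse of recursion (6); a sketch (ours), reusing the `NIWState` type and `absorb!` from the sketch in Appendix A together with `LinearAlgebra.lowrankdowndate!`:

```julia
using LinearAlgebra

# Inverse of recursion (6): remove one observation y from a cached state.
# The downdate vector uses the current u_m and r_m, mirroring absorb!.
function remove!(st::NIWState, y::AbstractVector)
    lowrankdowndate!(st.S, (y .- st.u) .* sqrt(st.r / (st.r - 1)))
    rnew = st.r - 1
    st.u = (st.r .* st.u .- y) ./ rnew
    st.r = rnew
    st.ν -= 1
    return st
end

# Moving s_i from cluster k1 to k2 (within merged group j) touches only
# two cached states, e.g. (T is an illustrative cache container):
#   remove!(T[j, k1], y_i); absorb!(T[j, k2], y_i)
```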

Appendix B.3: Updating the hypothesis vector (Step 3, Algorithm 1)

Let \(g_j:= P(\Vert \varvec{\gamma }\Vert _1 = j)\), \(j \in {\mathcal {J}}\). Then, under Womack’s prior,

$$\begin{aligned} g_j&= \zeta _0 \sum _{l=1}^{J - j} g_{j + l} \left( {\begin{array}{c}j + l\\ j\end{array}}\right) , \qquad j \in \{0, \ldots , J - 1\}, \\ g_0&= \zeta _0 / (1 + \zeta _0), \end{aligned}$$

where \(\zeta _0\) is a hyperparameter (Womack et al. 2015). This is a non-singular linear system with \(J + 1\) equations, so \(\varvec{g} = (g_j: j \in {\mathcal {J}})\) is always well defined. Once \(\varvec{g}\) is computed, the cost of computing \(\pi _0(\varvec{\gamma })\) is negligible, because Womack's prior implies that \(\pi _0(\varvec{\gamma }) = g_{\Vert \varvec{\gamma }\Vert _1} \big / \left( {\begin{array}{c}J\\ \Vert \varvec{\gamma }\Vert _1\end{array}}\right) \).
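One way to solve this system (a sketch of ours, not necessarily how MANOVABNPTest.jl proceeds) is to run the homogeneous recursion backwards, which determines \(\varvec{g}\) up to scale, and then rescale so that \(g_0 = \zeta _0 / (1 + \zeta _0)\):

```julia
# Solve the Womack system for g = (g_0, ..., g_J); the backward
# recursion fixes h up to scale, and g_0 = ζ0 / (1 + ζ0) pins the scale.
function womack_g(J::Integer, ζ0::Real)
    h = zeros(J + 1)       # h[j + 1] stores h_j
    h[J + 1] = 1.0         # free scale: set h_J = 1
    for j in (J - 1):-1:0
        h[j + 1] = ζ0 * sum(h[j + l + 1] * binomial(j + l, j) for l in 1:(J - j))
    end
    return h .* ((ζ0 / (1 + ζ0)) / h[1])
end

# Prior mass π0(γ): the mass g_{‖γ‖₁} spread uniformly over the
# binomial(J, ‖γ‖₁) hypothesis vectors with that number of ones.
prior_mass(γ::AbstractVector{Bool}, g::AbstractVector) =
    g[sum(γ) + 1] / binomial(length(γ), sum(γ))
```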

Now, let \({\mathcal {I}}_{\varvec{\gamma } jk}:= \{i \in {\mathcal {N}}: \gamma _{x_i}x_i = j, s_i = k\}\) and \(\varvec{y}_{\varvec{\gamma } jk}:= (\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\gamma } jk})\). Then,

$$\begin{aligned} \pi _1(\varvec{\gamma })&\equiv p(\varvec{\gamma } \,|\, \alpha , \varvec{s}, \varvec{x}, \varvec{y}) \\&\propto p(\varvec{\gamma } \,|\, \varvec{s}) p(\varvec{y} \,|\, \alpha , \varvec{x}, \varvec{s}, \varvec{\gamma }) = \pi _0(\varvec{\gamma }) p(\varvec{y} \,|\, \varvec{s}, \varvec{z}) \\&= \pi _0(\varvec{\gamma }) \prod _{j \in {\mathcal {J}}} \prod _{k \in {\mathbb {N}}} p(\varvec{y}_{\varvec{\gamma } jk} \,|\, {\mathcal {I}}_{\varvec{\gamma } jk}), \end{aligned}$$

under the convention that \(p(\varvec{y}_{\varvec{\gamma } jk} \,|\, {\mathcal {I}}_{\varvec{\gamma } jk}) = 1\) if \({\mathcal {I}}_{\varvec{\gamma } jk} = \emptyset \). However, \(p(\varvec{y}_{\varvec{\gamma } jk} \,|\, {\mathcal {I}}_{\varvec{\gamma } jk})\) is numerically equal to the marginal likelihood of a sample \(\varvec{y}_{\varvec{\gamma } jk}\) under the model described in Appendix A. Hence, \(\pi _1(\varvec{\gamma })\) can be computed efficiently (up to a proportionality constant) using Eq. (7), and thus we can draw \(\varvec{\gamma } \sim \pi _1\) using any variant of the MH algorithm. In particular, if we use \(W(\varvec{\beta } \,|\, \varvec{\gamma }) \propto I(\Vert \varvec{\beta } - \varvec{\gamma }\Vert _1 = 1)\) as the proposal distribution, the acceptance ratio becomes

$$\begin{aligned} a&= \min \left( 1, \frac{\pi _1(\varvec{\beta })}{\pi _1(\varvec{\gamma })}\right) . \end{aligned}$$
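Since this proposal is symmetric, a single MH step reduces to flipping one coordinate of \(\varvec{\gamma }\); a sketch (ours, where `logπ1` stands for any routine returning \(\log \pi _1\) up to an additive constant):

```julia
# One Metropolis–Hastings step for the hypothesis vector γ under the
# single-flip proposal W(β | γ) ∝ I(‖β − γ‖₁ = 1).
function flip_step!(γ::BitVector, logπ1::Function)
    β = copy(γ)
    j = rand(1:length(γ))
    β[j] = !β[j]                          # propose flipping one coordinate
    if log(rand()) < logπ1(β) - logπ1(γ)  # accept w.p. min(1, π1(β)/π1(γ))
        copyto!(γ, β)
    end
    return γ
end
```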

Appendix C: Proof of Theorem 1

Let \(\varvec{\beta }\) and \(\varvec{\gamma }\) be two hypotheses such that \(\varvec{\beta } \succ \varvec{\gamma }\). Without loss of generality, assume that \(\varvec{\beta } = \varvec{e}_1\) and \(\varvec{\gamma } = \varvec{0}_J\), where \(\varvec{e}_1\) is the first column of \(\varvec{I}_J\).

(a) To prove this claim, let us start by noting that \(\bar{\nu }_{\varvec{\gamma } jk} / n_{\varvec{\gamma } jk} \rightarrow 1\), \(\bar{r}_{\varvec{\gamma } jk} / n_{\varvec{\gamma } jk} \rightarrow 1\), and \(\bar{\varvec{S}}_{\varvec{\gamma } 0k} / n_{\varvec{\gamma } 0k} {\mathop {\rightarrow }\limits ^{p}} \varvec{\Sigma }_{0k}\) for all active groups. In addition, if the true hypothesis is \(\varvec{\gamma }\), then \(\bar{\varvec{S}}_{\varvec{\beta } jk} / n_{\varvec{\beta } jk} {\mathop {\rightarrow }\limits ^{p}} \varvec{\Sigma }_{0k}\), \(j = 0, 1\). Hence,

$$\begin{aligned}&p(\{\varvec{y}_i : i \in {\mathcal {I}}_{\varvec{\gamma } jk}\} \,|\, {\mathcal {I}}_{\varvec{\gamma } jk}) \\&{\mathop {\sim }\limits ^{\centerdot }} \pi ^{- D n_{\varvec{\gamma } jk} / 2} n_{\varvec{\gamma } jk}^{-D / 2 - D \bar{\nu }_{\varvec{\gamma } jk} / 2} \\&\qquad \,|\, \varvec{\Sigma }_{0k} \,|\, ^{- \bar{\nu }_{\varvec{\gamma } jk} / 2} {\textstyle \prod _{d=1}^D} \Gamma (\kappa _{\varvec{\gamma } jkd}), \end{aligned}$$

where \(\kappa _{\varvec{\gamma } jkd}:= (\bar{\nu }_{\varvec{\gamma } jk} + 1 - d) / 2\) and \(a_n {\mathop {\sim }\limits ^{\centerdot }} b_n \Leftrightarrow a_n = O_p(b_n)\), i.e., \(a_n / b_n\) is stochastically bounded. This, in turn, implies that

$$\begin{aligned} A_1&\equiv p(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\beta } 0k}\} \,|\, {\mathcal {I}}_{\varvec{\beta } 0k})\\&\qquad p(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\beta } 1k}\} \,|\, {\mathcal {I}}_{\varvec{\beta } 1k}) \\&\div p(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\gamma } 0k}\} \,|\, {\mathcal {I}}_{\varvec{\gamma } 0k})\\&{\mathop {\sim }\limits ^{\centerdot }} \underbrace{\frac{ n_{\varvec{\gamma } 0k}^{D / 2 + D \bar{\nu }_{\varvec{\gamma } 0k} / 2} }{ n_{\varvec{\beta } 0k}^{D / 2 + D \bar{\nu }_{\varvec{\beta } 0k} / 2} n_{\varvec{\beta } 1k}^{D / 2 + D \bar{\nu }_{\varvec{\beta } 1k} / 2} }}_{A_2} \\&\quad \times \prod _{d=1}^D \underbrace{\frac{ \Gamma (\kappa _{\varvec{\beta } 0kd}) \Gamma (\kappa _{\varvec{\beta } 1kd}) }{ \Gamma (\kappa _{\varvec{\gamma } 0kd}) }}_{A_{3d}}, \end{aligned}$$

where the terms involving \(\pi \) and \(\varvec{\Sigma }_{0k}\) disappear because the exponents in the numerator and the denominator cancel out, except for a term that does not depend on the sample size and thus becomes irrelevant in the subsequent computations.

Now, let us assume that \(n_{\varvec{\beta } jk} / n_{\varvec{\gamma } 0k} {\mathop {\rightarrow }\limits ^{p}} \chi _j > 0\), \(j = 0, 1\). This must be the case since the sample is assumed independent and identically distributed. Then,

$$\begin{aligned} A_2&{\mathop {\sim }\limits ^{\centerdot }} \frac{ n_{\varvec{\gamma } 0k}^{D / 2 + D \bar{\nu }_{\varvec{\gamma } 0k} / 2} }{ (\chi _0 n_{\varvec{\gamma } 0k})^{D / 2 + D \bar{\nu }_{\varvec{\beta } 0k} / 2} (\chi _1 n_{\varvec{\gamma } 0k})^{D / 2 + D \bar{\nu }_{\varvec{\beta } 1k} / 2} } \\&{\mathop {\sim }\limits ^{\centerdot }} \frac{ n_{\varvec{\gamma } 0k}^{D (\bar{\nu }_{\varvec{\gamma } 0k} - \bar{\nu }_{\varvec{\beta } 0k} - \bar{\nu }_{\varvec{\beta } 1k} - 1) / 2} }{ \chi _0^{D / 2 + D \bar{\nu }_{\varvec{\beta } 0k} / 2} \chi _1^{D / 2 + D \bar{\nu }_{\varvec{\beta } 1k} / 2} } \\&{\mathop {\sim }\limits ^{\centerdot }} \frac{ n_{\varvec{\gamma } 0k}^{- D (\nu _0 + 1) / 2} }{ \chi _0^{D / 2 + D \bar{\nu }_{\varvec{\beta } 0k} / 2} \chi _1^{D / 2 + D \bar{\nu }_{\varvec{\beta } 1k} / 2} } \\&{\mathop {\sim }\limits ^{\centerdot }} \frac{ n_{\varvec{\gamma } 0k}^{- D (\nu _0 + 1) / 2} }{ \chi _0^{D n_{\varvec{\beta } 0k} / 2} \chi _1^{D n_{\varvec{\beta } 1k} / 2} }, \end{aligned}$$

where most of the terms involving the sample size disappear because \(\bar{\nu }_{\varvec{\gamma } 0k} - \bar{\nu }_{\varvec{\beta } 0k} - \bar{\nu }_{\varvec{\beta } 1k} = - \nu _0\). On the other hand, we can rewrite \(A_{3d}\) as a Beta function times a compensating term:

$$\begin{aligned} A_{3d}&= B(\kappa _{\varvec{\beta } 0kd}, \kappa _{\varvec{\beta } 1kd}) \prod _{u = \kappa _{\varvec{\gamma } 0kd}}^{\kappa _{\varvec{\beta } 0kd} + \kappa _{\varvec{\beta } 1kd} - 1} u \\&{\mathop {\sim }\limits ^{\centerdot }} B(\kappa _{\varvec{\beta } 0kd}, \kappa _{\varvec{\beta } 1kd}) \kappa _{\varvec{\gamma } 0kd}^{(\nu _0 + 1 - d) / 2} \\&{\mathop {\sim }\limits ^{\centerdot }} B(\kappa _{\varvec{\beta } 0kd}, \kappa _{\varvec{\beta } 1kd}) n_{\varvec{\gamma } 0k}^{(\nu _0 + 1 - d) / 2}, \end{aligned}$$

where the last step holds because \(\kappa _{\varvec{\gamma } 0kd} / n_{\varvec{\gamma } 0k}\) converges to 1/2. Hence, using Stirling's approximation, we have

$$\begin{aligned} A_{3d}&{\mathop {\sim }\limits ^{\centerdot }} \frac{ \kappa _{\varvec{\beta } 0kd}^{\kappa _{\varvec{\beta } 0kd} - 1/2} \kappa _{\varvec{\beta } 1kd}^{\kappa _{\varvec{\beta } 1kd} - 1/2} n_{\varvec{\gamma } 0k}^{(\nu _0 + 1 - d) / 2} }{ (\kappa _{\varvec{\beta } 0kd} + \kappa _{\varvec{\beta } 1kd})^{\kappa _{\varvec{\beta } 0kd} + \kappa _{\varvec{\beta } 1kd} - 1/2} }, \end{aligned}$$

but we know that

$$\begin{aligned} \kappa _{\varvec{\gamma } jkd} = \tfrac{1}{2}(\bar{\nu }_{\varvec{\gamma } jk} + 1 - d) = \tfrac{1}{2}(n_{\varvec{\gamma } jk} + \nu _0 + 1 - d). \end{aligned}$$

Therefore,

$$\begin{aligned} A_{3d}&{\mathop {\sim }\limits ^{\centerdot }} (n_{\varvec{\beta } 0k} / 2 + \nu _0 / 2)^{(n_{\varvec{\beta } 0k} + \nu _0 + d - 2) / 2} \\&\quad \,\, (n_{\varvec{\beta } 1k} / 2 + \nu _0 / 2)^{(n_{\varvec{\beta } 1k} + \nu _0 + d - 2) / 2} \\&\quad \,\, (n_{\varvec{\gamma } 0k} / 2 + \nu _0)^{- (n_{\varvec{\gamma } 0k} + 2\nu _0 + 2d - 3) / 2} \\&\quad \,\,\, n_{\varvec{\gamma } 0k}^{(\nu _0 + 1 - d) / 2} \\&{\mathop {\sim }\limits ^{\centerdot }} n_{\varvec{\beta } 0k}^{(n_{\varvec{\beta } 0k} + \nu _0 + d - 2) / 2} n_{\varvec{\beta } 1k}^{(n_{\varvec{\beta } 1k} + \nu _0 + d - 2) / 2} \\&\quad \,\, n_{\varvec{\gamma } 0k}^{-(\nu _0 + 3d - 4) / 2 - n_{\varvec{\gamma } 0k} / 2} \\&{\mathop {\sim }\limits ^{\centerdot }} (\chi _0 n_{\varvec{\gamma } 0k})^{(n_{\varvec{\beta } 0k} + \nu _0 + d - 2) / 2} \\&\quad \,\,\, (\chi _1 n_{\varvec{\gamma } 0k})^{(n_{\varvec{\beta } 1k} + \nu _0 + d - 2) / 2} \\&\quad \,\,\, n_{\varvec{\gamma } 0k}^{-(\nu _0 + 3d - 4) / 2 - n_{\varvec{\gamma } 0k} / 2} \\&{\mathop {\sim }\limits ^{\centerdot }} \chi _0^{n_{\varvec{\beta } 0k} / 2} \chi _1^{n_{\varvec{\beta } 1k} / 2} n_{\varvec{\gamma } 0k}^{(\nu _0 - d) / 2}. \end{aligned}$$

Hence,

$$\begin{aligned} A_1&{\mathop {\sim }\limits ^{\centerdot }} A_2 \prod _{d=1}^D A_{3d} \\&{\mathop {\sim }\limits ^{\centerdot }} \frac{ n_{\varvec{\gamma } 0k}^{- D (\nu _0 + 1) / 2} }{ \chi _0^{D n_{\varvec{\beta } 0k} / 2} \chi _1^{D n_{\varvec{\beta } 1k} / 2} } \prod _{d=1}^D \chi _0^{n_{\varvec{\beta } 0k} / 2} \chi _1^{n_{\varvec{\beta } 1k} / 2} n_{\varvec{\gamma } 0k}^{(\nu _0 - d) / 2} \\&{\mathop {\sim }\limits ^{\centerdot }} n_{\varvec{\gamma } 0k}^{ - D (\nu _0 + 1) / 2 + D \nu _0 / 2 - D (D + 1) / 4 } \\&{\mathop {\sim }\limits ^{\centerdot }} n_{\varvec{\gamma } 0k}^{- D (D + 3) / 4}, \end{aligned}$$

which converges to zero, since the exponent \(- D (\nu _0 + 1) / 2 + D \nu _0 / 2 - D (D + 1) / 4 = - D / 2 - D (D + 1) / 4 = - D (D + 3) / 4\) is strictly negative.

(b) First, note that

$$\begin{aligned} b_k&\equiv p(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\beta } 0k}\} \,|\, {\mathcal {I}}_{\varvec{\beta } 0k}) \\&\qquad \,\, p(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\beta } 1k}\} \,|\, {\mathcal {I}}_{\varvec{\beta } 1k}) \, \\&\quad \div \,\, p(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\gamma } 0k}\} \,|\, {\mathcal {I}}_{\varvec{\gamma } 0k})\\&= \left( \frac{ \bar{r}_{\varvec{\gamma } 0k} }{ \bar{r}_{\varvec{\beta } 0k} \bar{r}_{\varvec{\beta } 1k} } \right) ^{D/2} \frac{ \Gamma _D(\bar{\nu }_{\varvec{\beta } 0k} / 2) \Gamma _D(\bar{\nu }_{\varvec{\beta } 1k} / 2) }{ \Gamma _D(\bar{\nu }_{\varvec{\gamma } 0k} / 2) }\\&\qquad \frac{ O_p(1) \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} \,|\, ^{\bar{\nu }_{\varvec{\gamma } 0k} / 2} }{ \,|\, \bar{\varvec{S}}_{\varvec{\beta } 0k} \,|\, ^{\bar{\nu }_{\varvec{\beta } 0k} / 2} \,|\, \bar{\varvec{S}}_{\varvec{\beta } 1k} \,|\, ^{\bar{\nu }_{\varvec{\beta } 1k} / 2} } \\&= O_p(1) \underbrace{ \left( \frac{ \bar{r}_{\varvec{\gamma } 0k} }{ \bar{r}_{\varvec{\beta } 0k} \bar{r}_{\varvec{\beta } 1k} } \right) ^{D/2} }_{b_{k1}}\\&\quad \, \underbrace{ \frac{ \Gamma _D(\bar{\nu }_{\varvec{\beta } 0k} / 2) \Gamma _D(\bar{\nu }_{\varvec{\beta } 1k} / 2) }{ \Gamma _D(\bar{\nu }_{\varvec{\gamma } 0k} / 2) } }_{b_{k2}}\\&\quad \, \underbrace{ \left( \frac{ \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} \,|\, }{ \,|\, \bar{\varvec{S}}_{\varvec{\beta } 0k} \,|\, \,|\, \bar{\varvec{S}}_{\varvec{\beta } 1k} \,|\, } \right) ^{D / 2} }_{b_{k3}} \underbrace{ \prod _{j=0}^1 \frac{ \,|\, \bar{\varvec{S}}_{\varvec{\gamma } jk} \,|\, ^{n_{\varvec{\beta } jk} / 2} }{ \,|\, \bar{\varvec{S}}_{\varvec{\beta } jk} \,|\, ^{n_{\varvec{\beta } jk} / 2} } }_{b_{k4}}. \end{aligned}$$

From the first part of the theorem, we already know that \(b_{k1} = O_p(n_{\varvec{\gamma } 0k}^{q_1})\) and \(b_{k2} = O_p(n_{\varvec{\gamma } 0k}^{q_2})\) for some \(q_1, q_2 < \infty \). In addition,

$$\begin{aligned}&\left( \frac{ \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} \,|\, }{ \,|\, \bar{\varvec{S}}_{\varvec{\beta } 0k} \,|\, \,|\, \bar{\varvec{S}}_{\varvec{\beta } 1k} \,|\, } \right) ^{D / 2} \\&\quad = \left( \frac{ O_p(1) n_{\varvec{\gamma } 0k}^D \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} / n_{\varvec{\gamma } 0k} \,|\, }{ n_{\varvec{\beta } 0k}^D \,|\, \bar{\varvec{S}}_{\varvec{\beta } 0k} / n_{\varvec{\beta } 0k} \,|\, n_{\varvec{\beta } 1k}^D \,|\, \bar{\varvec{S}}_{\varvec{\beta } 1k} / n_{\varvec{\beta } 1k} \,|\, } \right) ^{D / 2}\\&\quad = \left( \frac{ O_p(1) n_{\varvec{\gamma } 0k}^D \,|\, \varvec{\Sigma }_{0k} \,|\, }{ n_{\varvec{\beta } 0k}^D \,|\, \varvec{\Sigma }_{0k} \,|\, n_{\varvec{\beta } 1k}^D \,|\, \varvec{\Sigma }_{1k} \,|\, } \right) ^{D /2}\\&\quad = O_p(1) n_{\varvec{\gamma } 0k}^{-D^2 / 2}, \end{aligned}$$

where the second equality is due to the fact that \(\bar{\varvec{S}}_{\varvec{\gamma } 0k} / n_{\varvec{\gamma } 0k}\), \(\bar{\varvec{S}}_{\varvec{\beta } 0k} / n_{\varvec{\beta } 0k}\), and \(\bar{\varvec{S}}_{\varvec{\beta } 1k} / n_{\varvec{\beta } 1k}\) converge in probability to (positive definite) finite matrices. Hence, \(b_{k1} b_{k2} b_{k3} = O_p(n_{\varvec{\gamma } 0k}^q)\) for some \(q < \infty \). In the next paragraphs, we will prove that \(b_{k4} = O_p((1 + u)^{n_{\varvec{\gamma } 0k}})\) for some \(u > 0\), ensuring that \(b_k\) diverges to \(\infty \) in probability, no matter the value of the aforementioned q.

First, recall a basic property of the conjugate Normal model: if \(\varvec{a}_i \,|\, \varvec{\mu }, \varvec{\Sigma } {\mathop {\sim }\limits ^{iid}} \text {N}_D(\varvec{\mu }, \varvec{\Sigma })\) and \((\varvec{\mu }, \varvec{\Sigma }) \sim \text {NIW}(\bar{\varvec{u}}_0, \bar{r}_0, \bar{\nu }_0, \bar{\varvec{S}}_0)\), then \((\varvec{\mu }, \varvec{\Sigma }) \,|\, \varvec{a}_1, \ldots , \varvec{a}_m \sim \text {NIW}(\bar{\varvec{u}}_m, \bar{r}_m, \bar{\nu }_m, \bar{\varvec{S}}_m)\), where the posterior hyperparameters can be computed recursively as (Bouchard-Côté et al. 2017)

$$\begin{aligned} \bar{r}_m&= \bar{r}_{m-1} + 1,\\ \bar{\nu }_m&= \bar{\nu }_{m-1} + 1,\\ \bar{\varvec{u}}_m&= \frac{\bar{r}_{m-1} \bar{\varvec{u}}_{m-1} + \varvec{a}_m}{\bar{r}_m},\\ \bar{\varvec{S}}_m&= \bar{\varvec{S}}_{m-1} + (\varvec{a}_m - \bar{\varvec{u}}_m)(\varvec{a}_m - \bar{\varvec{u}}_m)' \bar{r}_m / \bar{r}_{m-1}. \end{aligned}$$

Now, for any fixed \(j \in \{0, 1\}\), replace m with \(n_{\varvec{\gamma } 0k}\), and replace the \(\varvec{a}_i\)s with the elements of \(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\beta } jk}\}\) followed by the elements of \(\{\varvec{y}_i: i \in {\mathcal {I}}_{\varvec{\beta } (1-j)k}\}\). Then, it is not hard to see that \(\bar{\varvec{S}}_{n_{\varvec{\beta } jk}} = \bar{\varvec{S}}_{\varvec{\beta }jk}\) and \(\bar{\varvec{S}}_{n_{\varvec{\gamma } 0k}} = \bar{\varvec{S}}_{\varvec{\gamma }0k}\). Hence, \(\bar{\varvec{S}}_{\varvec{\gamma }0k}\) can be obtained from any \(\bar{\varvec{S}}_{\varvec{\beta }jk}\) after a finite sequence of rank-1 updates.

Next, recall the well-known matrix determinant lemma (Harville 2008, Theorem 18.1.1). Given an invertible matrix \(\varvec{A}\) and two vectors \(\varvec{u}\) and \(\varvec{v}\), this lemma states that \( \,|\, \varvec{A} + \varvec{u}\varvec{v}' \,|\, = \,|\, \varvec{A} \,|\, (1 + \varvec{v}'\varvec{A}^{-1}\varvec{u})\). Applying this lemma \(n_{\varvec{\gamma }0k} - n_{\varvec{\beta }jk}\) consecutive times to \(\bar{\varvec{S}}_{\varvec{\beta }jk}\) (once per rank-1 update), we have that

$$\begin{aligned} \frac{ \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} \,|\, }{ \,|\, \bar{\varvec{S}}_{\varvec{\beta } jk} \,|\, }&= \prod _{q=n_{\varvec{\beta }jk}+1}^{n_{\varvec{\gamma }0k}} (1 + (\varvec{a}_q - \bar{\varvec{u}}_{q})'\bar{\varvec{S}}_{q-1}^{-1}(\varvec{a}_q - \bar{\varvec{u}}_{q}))\\&> 1 + \sum \nolimits _{q=n_{\varvec{\beta }jk}+1}^{n_{\varvec{\gamma }0k}} (\varvec{a}_q - \bar{\varvec{u}}_{q})'\bar{\varvec{S}}_{q-1}^{-1}(\varvec{a}_q - \bar{\varvec{u}}_{q}), \end{aligned}$$

since the matrices \(\bar{\varvec{S}}_q\) are all positive definite. Now, each term in the sum is \(O_p(n_{\varvec{\gamma } 0k}^{-1})\), but there are \(n_{\varvec{\gamma } 0k} - n_{\varvec{\beta } jk} = O_p(n_{\varvec{\gamma } 0k})\) terms, so the whole sum is \(O_p(1)\) and strictly positive. This is an important point, because it means that \( \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} \,|\, > \,|\, \bar{\varvec{S}}_{\varvec{\beta } jk} \,|\, (1 + O_p(1))\). Hence,

$$\begin{aligned} \frac{ \,|\, \bar{\varvec{S}}_{\varvec{\gamma } 0k} \,|\, ^{n_{\varvec{\beta } jk} / 2} }{ \,|\, \bar{\varvec{S}}_{\varvec{\beta } jk} \,|\, ^{n_{\varvec{\beta } jk} / 2} } > (1 + O_p(1))^{n_{\varvec{\beta } jk} / 2}, \qquad j = 0, 1, \end{aligned}$$

so that

$$\begin{aligned} b_{k4} > (1 + O_p(1))^{n_{\varvec{\beta } 0k} / 2} (1 + O_p(1))^{n_{\varvec{\beta } 1k} / 2} = (1 + O_p(1))^{n_{\varvec{\gamma } 0k} / 2}, \end{aligned}$$

where the \(O_p(1)\) terms are positive. Then \(b_{k4}\) diverges geometrically, dominating the polynomially decaying terms, and thus \(b_k\) also diverges, as desired.
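As a quick numerical sanity check of the determinant lemma used in this proof (illustrative only; the matrix and vectors below are arbitrary):

```julia
using LinearAlgebra

# Matrix determinant lemma: |A + u v'| = |A| (1 + v' A⁻¹ u).
A = [4.0 1.0; 1.0 3.0]
u = [0.5, -1.0]
v = [2.0, 0.25]
@assert det(A + u * v') ≈ det(A) * (1 + v' * (A \ u))
```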

Appendix D: Software

We developed a Julia package, available at https://github.com/igutierrezm/MANOVABNPTest.jl, that is relatively easy to call from R (thanks to the R package JuliaConnectoR). Details about its installation, as well as a minimal reproducible example, are available at the repository’s README. A more elaborate example (specifically, an R script reproducing Fig. 2) is available at https://raw.githubusercontent.com/igutierrezm/MANOVABNPTest.jl/master/extras/elaborate-example.R.
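For instance (assuming the package is installed directly from its GitHub URL; the repository's README remains the authoritative reference), a Julia session could start with:

```julia
using Pkg
Pkg.add(url = "https://github.com/igutierrezm/MANOVABNPTest.jl")  # install from GitHub
using MANOVABNPTest  # assumed module name, following the Package.jl convention
```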

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gutiérrez, I., Gutiérrez, L. & Alvares, D. A new flexible Bayesian hypothesis test for multivariate data. Stat Comput 33, 50 (2023). https://doi.org/10.1007/s11222-023-10214-6

