
Optimal equivalence testing in exponential families

  • Regular Article
  • Published in Statistical Papers

Abstract

We develop a uniformly most powerful unbiased (UMPU) two-sample equivalence test for a difference of canonical parameters in exponential families. This development involves a non-unique reparametrization, which we address via a novel characterization of all reparametrizations of interest in terms of a matrix group. Our procedure also involves an intractable conditional distribution, which we reproduce to a high degree of accuracy using saddlepoint approximations, and we show that the resulting saddlepoint-based procedure is invariant under the choice of reparametrization. Our real-data example considers the mean-to-variance ratio for normally distributed data. We compare our result with six competing equivalence testing procedures for the mean-to-variance ratio; only our UMPU method finds evidence of equivalence, which is the expected result. We also perform a Monte Carlo simulation study which shows that our UMPU method outperforms all competing methods, exhibiting an empirical significance level that is not statistically significantly different from the nominal 5% level across all simulation settings.


References

  • Berger JO (1985) Statistical decision theory and Bayesian analysis. Springer, Berlin


  • Berger RL (1982) Multiparameter hypothesis testing and acceptance sampling. Technometrics 24:295–300


  • Box GEP, Tiao GC (1973) Bayesian inference in statistical analysis. Wiley, New York


  • Butler RW (2007) Saddlepoint approximations with applications. Cambridge University Press, New York


  • Casella G, Berger RL (2002) Statistical inference, 2nd edn. Duxbury Press, Pacific Grove


  • Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer, New York


  • Kraft H, Vetter H (1994) Twenty-four-hour blood pressure profiles in patients with mild-to-moderate hypertension: moxonidine versus captopril. J Cardiovasc Pharmacol 24(Suppl 1):S29–S33


  • Lehmann EL, Romano J (2005) Testing statistical hypotheses, 3rd edn. Wiley, New York


  • Lotti G, Gianrossi R (1993) Moxonidine vs. captopril in mild-to-moderate hypertension: a double-blind study of effectiveness and tolerance. Fortschr Med 111(27):429–432


  • Lugannani R, Rice SO (1980) Saddlepoint approximations for the distribution of sums of independent random variables. Adv Appl Prob 12:475–490


  • Paige RL, Trindade AA (2008) Practical small sample inference for single lag subset autoregressive models. J Stat Plan Inference 138:1934–1949


  • Paige RL, Trindade AA, Fernando PH (2009) Saddlepoint-based bootstrap inference for quadratic estimating equations. Scand J Stat 36:98–111


  • Romano JP (2005) Optimal testing of equivalence hypotheses. Ann Stat 33:1036–1047


  • Tukey JW (1991) The philosophy of multiple comparisons. Stat Sci 6:100–116


  • Wellek S (2010) Testing statistical hypotheses of equivalence and noninferiority, 2nd edn. Chapman & Hall/CRC, Boca Raton


Author information

Corresponding author

Correspondence to Robert L. Paige.

Ethics declarations

Conflicts of interest

The authors have no conflicts of interest and no disclosures, declarations, or acknowledgements to make.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

1.1 Proof of Theorem 1

As discussed in Sect. 2, a transformation from the matrix group M equates to the following changes in the canonical parameters and sufficient statistics:

$$\begin{aligned} {\tilde{\gamma }}=A^{T}\gamma =\left[ \begin{array}{c} \psi \\ \psi a^{T}+B^{T}\lambda \end{array} \right] \\ {\tilde{Z}}=A^{-1}Z=\left[ \begin{array} {l} U-aB^{-1}V\\ B^{-1}V \end{array} \right] =\left[ \begin{array}{c} {\tilde{U}}\\ {\tilde{V}} \end{array} \right] . \end{aligned}$$

The associated likelihood function is

$$\begin{aligned} {\mathcal {L}}\left( {\tilde{\gamma }}\right) =\exp \left\{ {\tilde{\gamma }} ^{T}{\tilde{Z}}+c\left( A^{-T}{\tilde{\gamma }}\right) \right\} \text {.} \end{aligned}$$
(13)
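Because any A in M fixes the first coordinate and satisfies \({\tilde{\gamma }}^{T}{\tilde{Z}}=\gamma ^{T}Z\), the likelihood value is unchanged by the reparametrization. The following sketch checks this numerically for a randomly generated, purely hypothetical group element A; the dimension, block entries, and draws of \(\gamma \) and Z are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4
# A hypothetical element of the matrix group M: unit first column,
# arbitrary row block a, invertible lower-right block B.
a = rng.standard_normal((1, p - 1))
B = rng.standard_normal((p - 1, p - 1)) + 3 * np.eye(p - 1)
A = np.block([[np.ones((1, 1)), a],
              [np.zeros((p - 1, 1)), B]])

gamma = rng.standard_normal(p)   # (psi, lambda)
Z = rng.standard_normal(p)       # (U, V)

gamma_t = A.T @ gamma            # reparametrized canonical parameter
Z_t = np.linalg.solve(A, Z)      # transformed sufficient statistic

# psi is preserved and the canonical inner product is invariant,
# so the likelihood (13) takes the same value in both parametrizations.
assert np.isclose(gamma_t[0], gamma[0])
assert np.isclose(gamma_t @ Z_t, gamma @ Z)
```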

The optimal UMPU test for \(\psi \) depends upon the following conditional distribution:

$$\begin{aligned} f\left( {\tilde{u}}|{\tilde{v}};\psi \right) \end{aligned}$$

and we show that

$$\begin{aligned} f\left( {\tilde{u}}|{\tilde{v}};\psi \right) =f\left( u|v;\psi \right) . \end{aligned}$$

First note that for a regular exponential family (see for instance Butler 2007, Sec. 5.1.2) we have that

$$\begin{aligned} f\left( u|v;\psi \right)&=\exp \left\{ \psi u-c\left( \psi |v\right) -d\left( z\right) \right\} =\frac{\exp \left( \psi u-d\left( z\right) \right) }{\exp \left\{ c\left( \psi |v\right) \right\} } \end{aligned}$$
(14)
$$\begin{aligned}&=\frac{\exp \left( \psi u-d\left( z\right) \right) }{\int _{\left\{ u:z\in S\right\} }\exp \left( \psi u-d\left( z\right) \right) \mathtt {d}F\left( u\right) } \end{aligned}$$
(15)

where S is the joint support of z and the Riemann-Stieltjes integral in the denominator represents a sum in the discrete case and a surface integral in the continuous case. In a similar fashion, we have that

$$\begin{aligned} f\left( {\tilde{u}}|{\tilde{v}};\psi \right) =\frac{\exp \left( \psi \tilde{u}-d\left( {\tilde{z}}\right) \right) }{\int _{\left\{ {\tilde{u}}:\tilde{z}\in {\tilde{S}}\right\} }\exp \left( \psi {\tilde{u}}-d\left( {\tilde{z}}\right) \right) \mathtt {d}F\left( {\tilde{u}}\right) } \end{aligned}$$
(16)

where \({\tilde{S}}\) is the joint support of \({\tilde{z}}.\)

Recall that

$$\begin{aligned} {\tilde{z}}=A^{-1}z \end{aligned}$$

so that

$$\begin{aligned} {\tilde{u}}=A_{\left( 1\right) }^{-1}z \end{aligned}$$

where \(A_{\left( 1\right) }^{-1}\) denotes the first row of \(A^{-1}\).

Then the right-hand side of (16) can be written as

$$\begin{aligned} \frac{\exp \left( \psi A_{\left( 1\right) }^{-1}z-d\left( A^{-1}z\right) \right) }{\int _{\left\{ A_{\left( 1\right) }^{-1}z\text { }:\text { } A^{-1}z\in A^{-1}S\right\} }\exp \left( \psi A_{\left( 1\right) } ^{-1}z-d\left( A^{-1}z\right) \right) \mathtt {d}F\left( A_{\left( 1\right) }^{-1}z\right) }. \end{aligned}$$

Now consider the change of variable for this integral in which we replace z with Az, to get the equivalent expression

$$\begin{aligned} \frac{\exp \left( \psi A_{\left( 1\right) }^{-1}Az-d\left( A^{-1}Az\right) \right) }{\int _{\left\{ A_{\left( 1\right) }^{-1}Az\text { }:\text { } A^{-1}Az\in A^{-1}AS\right\} }\exp \left( \psi A_{\left( 1\right) } ^{-1}Az-d\left( A^{-1}Az\right) \right) \mathtt {d}F\left( A_{\left( 1\right) }^{-1}Az\right) }. \end{aligned}$$
(17)

But note that

$$\begin{aligned} A_{\left( 1\right) }^{-1}A=\left[ \begin{array}{cc} 1&0_{1\times \left( p-1\right) } \end{array} \right] \end{aligned}$$

so that (17) reduces to the right-hand side of (15) and hence \(f\left( {\tilde{u}}|{\tilde{v}};\psi \right) =f\left( u|v;\psi \right) \).
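The identity \(A_{\left( 1\right) }^{-1}A=\left[ \begin{array}{cc} 1&0_{1\times \left( p-1\right) } \end{array} \right] \), which closes the proof, can also be checked numerically. The A below is a hypothetical member of the matrix group M, built from its defining block structure:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 5
# Hypothetical group element: unit first column, row block a, invertible B.
a = rng.standard_normal((1, p - 1))
B = rng.standard_normal((p - 1, p - 1)) + 4 * np.eye(p - 1)
A = np.block([[np.ones((1, 1)), a],
              [np.zeros((p - 1, 1)), B]])

A_inv_row1 = np.linalg.inv(A)[0]   # first row of A^{-1}
e1 = np.zeros(p)
e1[0] = 1.0

# Key identity: the first row of A^{-1} times A is [1, 0, ..., 0].
assert np.allclose(A_inv_row1 @ A, e1)
```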

1.2 Proof of Theorem 2

Consider the log-likelihood function for likelihood (13) under a transformation A in the matrix group M:

$$\begin{aligned} \ell \left( {\tilde{\gamma }}\right) ={\tilde{\gamma }}^{T}{\tilde{Z}}+c\left( A^{-T}{\tilde{\gamma }}\right) . \end{aligned}$$
(18)

The log-likelihood for the primary reparametrization in (5) is

$$\begin{aligned} \ell \left( \gamma \right) =\gamma ^{T}Z+c\left( \gamma \right) \text {.} \end{aligned}$$

The score equations for this log-likelihood are

$$\begin{aligned} Z+\nabla c\left( \gamma \right) =0 \end{aligned}$$

where \(\nabla \) is the gradient operator, which collects the first partial derivatives of \(c\left( \cdot \right) \) with respect to each element of \(\gamma \). The MLE \({\hat{\gamma }}\) is the solution to the above score equations, which means that

$$\begin{aligned} Z+\nabla c\left( {\hat{\gamma }}\right) =0 \end{aligned}$$

and

$$\begin{aligned} \widehat{{\tilde{\gamma }}}=A^{T}{\hat{\gamma }} \end{aligned}$$

by the invariance of MLEs under reparametrization. A similar argument shows that the conditional MLEs transform in the same way, that is,

$$\begin{aligned} \widehat{{\tilde{\lambda }}}\left( \psi \right) =A_{\left( p-1\right) } ^{T}\left[ \begin{array}{c} \psi \\ {\hat{\lambda }}\left( \psi \right) \end{array} \right] \end{aligned}$$

where \(A_{\left( p-1\right) }^{T}\) denotes the last \(p-1\) rows of \(A^{T}\).
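This equivariance of the conditional (profile) MLE can be illustrated numerically. The sketch below uses a toy strictly concave log-likelihood with \(c\left( \gamma \right) =-\left\| \gamma \right\| ^{2}/2\) as a stand-in for the paper's exponential-family model (so that the profile maximizer is available in closed form); the group element A and all draws are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
p = 4
# Hypothetical group element of M.
a = rng.standard_normal((1, p - 1))
B = rng.standard_normal((p - 1, p - 1)) + 3 * np.eye(p - 1)
A = np.block([[np.ones((1, 1)), a],
              [np.zeros((p - 1, 1)), B]])
A_inv = np.linalg.inv(A)

Z = rng.standard_normal(p)        # (U, V)
Z_t = A_inv @ Z
psi = 0.7                         # fixed interest parameter

# Toy model: ell(gamma) = gamma^T Z - ||gamma||^2 / 2, so the profile
# maximizer in the original parametrization is lambda_hat(psi) = V.
lam_hat = Z[1:]

# Transformed model: ell~(g) = g^T Z_t + c(A^{-T} g). With psi fixed,
# stationarity in the lambda~ block reads Z_t[1:] = (A^{-1} A^{-T} g)[1:].
G = A_inv @ A_inv.T
lam_t_hat = np.linalg.solve(G[1:, 1:], Z_t[1:] - G[1:, 0] * psi)

# Claimed equivariance: lambda~_hat(psi) equals the last p-1 rows of A^T
# applied to (psi, lambda_hat(psi)).
expected = A.T[1:, :] @ np.concatenate(([psi], lam_hat))
assert np.allclose(lam_t_hat, expected)
```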

Next we consider the likelihood quantities in Skovgaard's CDF approximation (6) and show that they are invariant under the reparametrizations, induced by transformations in the matrix group M, that preserve the interest parameter \(\psi \).

First we consider the likelihood ratio

$$\begin{aligned} \frac{{\mathcal {L}}\left( \psi ,{\hat{\lambda }}\left( \psi \right) \right) }{{\mathcal {L}}\left( {\hat{\psi }},{\hat{\lambda }}\right) } \end{aligned}$$

which appears in the expression for the w parameter in (6). We first show that the maximized likelihood is invariant under reparametrizations in M. To see this, note that

$$\begin{aligned} {\mathcal {L}}\left( \widehat{{\tilde{\gamma }}}\right)&=\exp \left\{ \widehat{{\tilde{\gamma }}}^{T}{\tilde{Z}}+c\left( A^{-T}\widehat{{\tilde{\gamma }} }\right) \right\} =\exp \left\{ \left( A^{T}{\hat{\gamma }}\right) ^{T} {\tilde{Z}}+c\left( A^{-T}A^{T}{\hat{\gamma }}\right) \right\} \\&=\exp \left\{ {\hat{\gamma }}^{T}Z+c\left( {\hat{\gamma }}\right) \right\} ={\mathcal {L}}\left( {\hat{\gamma }}\right) \text {.} \end{aligned}$$

The profile likelihood has a similar invariance property:

$$\begin{aligned} {\mathcal {L}}\left( \psi ,\widehat{{\tilde{\lambda }}}\left( \psi \right) \right)&=\exp \left\{ \widehat{{\tilde{\gamma }}_{p}}^{T}{\tilde{Z}}+c\left( A^{-T}\widehat{{\tilde{\gamma }}_{p}}\right) \right\} =\exp \left\{ \left( A^{T}{\hat{\gamma }}_{p}\right) ^{T}{\tilde{Z}}+c\left( A^{-T}A^{T}{\hat{\gamma }}_{p}\right) \right\} \\&=\exp \left\{ {\hat{\gamma }}_{p}^{T}Z+c\left( {\hat{\gamma }}_{p}\right) \right\} ={\mathcal {L}}\left( \psi ,{\hat{\lambda }}\left( \psi \right) \right) \end{aligned}$$

where \(\widehat{{\tilde{\gamma }}_{p}}\) and \({\hat{\gamma }}_{p}\) are augmented MLE vectors defined as

$$\begin{aligned} \widehat{{\tilde{\gamma }}_{p}}=\left[ \begin{array}{c} \psi \\ \widehat{{\tilde{\lambda }}}\left( \psi \right) \end{array} \right] \text { and }{\hat{\gamma }}_{p}=\left[ \begin{array}{c} \psi \\ {\hat{\lambda }}\left( \psi \right) \end{array} \right] \text {.} \end{aligned}$$

Next consider the invariance of the ratio of determinants of the full and partial Fisher information matrices:

$$\begin{aligned} \frac{\left| j\left( {\hat{\psi }},{\hat{\lambda }}\right) \right| }{\left| j_{\mathbf {\lambda \lambda }}\left( \psi ,{\hat{\lambda }}\left( \psi \right) \right) \right| } \end{aligned}$$

which appears in the expression for the u parameter in (6).

Using results from (Harville 1997, Sec. 15.7), the derivative of \(\ell \left( {\tilde{\gamma }}\right) \) w.r.t. \({\tilde{\gamma }}\) can be expressed in the following way:

$$\begin{aligned} \frac{\partial \ell \left( {\tilde{\gamma }}\right) }{\partial {\tilde{\gamma }} }=A^{-1}\left[ Z+\nabla c\left( A^{-T}{\tilde{\gamma }}\right) \right] \end{aligned}$$

so that the Hessian matrix for \(\ell \left( {\tilde{\gamma }}\right) \) is given as

$$\begin{aligned} \frac{\partial ^{2}\ell \left( {\tilde{\gamma }}\right) }{\partial \tilde{\gamma }\partial {\tilde{\gamma }}^{T}}=A^{-1}\left[ Hc\left( A^{-T}\tilde{\gamma }\right) \right] A^{-T} \end{aligned}$$

where \(Hc\left( \gamma \right) \) is the Hessian matrix of \(c\left( \gamma \right) \); since the term \(\gamma ^{T}Z\) in \(\ell \left( \gamma \right) \) is linear in \(\gamma \), this also equals the Hessian of the log-likelihood:

$$\begin{aligned} Hc\left( \gamma \right) =\frac{\partial ^{2}c\left( \gamma \right) }{\partial \gamma \partial \gamma ^{T}}=\frac{\partial ^{2}\ell \left( \gamma \right) }{\partial \gamma \partial \gamma ^{T}}. \end{aligned}$$

Therefore

$$\begin{aligned} \frac{\partial ^{2}\ell \left( {\tilde{\gamma }}\right) }{\partial \tilde{\gamma }\partial {\tilde{\gamma }}^{T}}=A^{-1}\frac{\partial ^{2}\ell \left( \gamma \right) }{\partial \gamma \partial \gamma ^{T}}A^{-T} \end{aligned}$$

and

$$\begin{aligned} j\left( \psi ,{\tilde{\lambda }}\right) =A^{-1}j\left( \psi ,\lambda \right) A^{-T}. \end{aligned}$$
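The chain-rule identity for the Hessian, and hence the information transformation \(j\left( \psi ,{\tilde{\lambda }}\right) =A^{-1}j\left( \psi ,\lambda \right) A^{-T}\), can be verified by finite differences. The sketch below uses a toy quadratic stand-in for \(c\left( \cdot \right) \) with a known Hessian, and a hypothetical group element A:

```python
import numpy as np

rng = np.random.default_rng(4)
p = 3
# Hypothetical group element of M.
a = rng.standard_normal((1, p - 1))
B = rng.standard_normal((p - 1, p - 1)) + 2 * np.eye(p - 1)
A = np.block([[np.ones((1, 1)), a],
              [np.zeros((p - 1, 1)), B]])
A_inv = np.linalg.inv(A)

# Toy smooth stand-in for c(.), chosen so Hc(gamma) = -Q exactly.
M0 = rng.standard_normal((p, p))
Q = M0 @ M0.T + p * np.eye(p)
c = lambda g: -0.5 * g @ Q @ g

Z = rng.standard_normal(p)
ell_t = lambda g: g @ (A_inv @ Z) + c(A_inv.T @ g)   # ell(gamma~)

# Central finite-difference Hessian of ell(gamma~) at an arbitrary point
# (exact here, since ell_t is quadratic).
g0 = rng.standard_normal(p)
eps = 1e-3
E = np.eye(p)
H_fd = np.array([[(ell_t(g0 + eps*E[i] + eps*E[j])
                   - ell_t(g0 + eps*E[i] - eps*E[j])
                   - ell_t(g0 - eps*E[i] + eps*E[j])
                   + ell_t(g0 - eps*E[i] - eps*E[j])) / (4 * eps**2)
                  for j in range(p)] for i in range(p)])

# Chain rule: the Hessian of ell(gamma~) equals A^{-1} Hc A^{-T}.
assert np.allclose(H_fd, A_inv @ (-Q) @ A_inv.T, atol=1e-6)
```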

To obtain the partial information matrix for \({\tilde{\lambda }}\), we first partition the full information matrix for \(\gamma \) in the following way:

$$\begin{aligned} j\left( \psi ,\lambda \right) =\left[ \begin{array}{cc} j_{\psi \psi }\left( \psi ,\lambda \right) &{} j_{\psi \lambda }\left( \psi ,\lambda \right) \\ j_{\lambda \psi }\left( \psi ,\lambda \right) &{} j_{\lambda \lambda }\left( \psi ,\lambda \right) \end{array} \right] . \end{aligned}$$

Note that

$$\begin{aligned} j\left( \psi ,{\tilde{\lambda }}\right)&=A^{-1}j\left( \psi ,\lambda \right) A^{-T}\\&=\left[ \begin{array}{cc} 1 &{} -aB^{-1}\\ 0_{\left( p-1\right) \times 1} &{} B^{-1} \end{array} \right] \left[ \begin{array}{cc} j_{\psi \psi }\left( \psi ,\lambda \right) &{} j_{\psi \lambda }\left( \psi ,\lambda \right) \\ j_{\lambda \psi }\left( \psi ,\lambda \right) &{} j_{\lambda \lambda }\left( \psi ,\lambda \right) \end{array} \right] \left[ \begin{array}{cc} 1 &{} 0_{1\times \left( p-1\right) }\\ -aB^{-T} &{} B^{-T} \end{array} \right] \\&=\left[ \begin{array}{cc} j_{\psi \psi }\left( \psi ,{\tilde{\lambda }}\right) &{} j_{\psi {\tilde{\lambda }} }\left( \psi ,{\tilde{\lambda }}\right) \\ j_{{\tilde{\lambda }}\psi }\left( \psi ,{\tilde{\lambda }}\right) &{} j_{\tilde{\lambda }{\tilde{\lambda }}}\left( \psi ,{\tilde{\lambda }}\right) \end{array} \right] \end{aligned}$$

so that

$$\begin{aligned} j_{{\tilde{\lambda }}{\tilde{\lambda }}}\left( \psi ,{\tilde{\lambda }}\right) =B^{-1}j_{\lambda \lambda }\left( \psi ,\lambda \right) B^{-T}. \end{aligned}$$

Consider now the following ratio of determinants of the full and partial Fisher information matrices under a secondary reparametrization:

$$\begin{aligned} \frac{\left| j\left( \psi ,{\tilde{\lambda }}\right) \right| }{\left| j_{{\tilde{\lambda }}{\tilde{\lambda }}}\left( \psi ,{\tilde{\lambda }}\right) \right| }=\frac{\left| A^{-1}j\left( \psi ,\lambda \right) A^{-T}\right| }{\left| B^{-1}j_{\lambda \lambda }\left( \psi ,\lambda \right) B^{-T}\right| }=\frac{\left| j\left( \psi ,\lambda \right) \right| }{\left| j_{\lambda \lambda }\left( \psi ,\lambda \right) \right| }, \end{aligned}$$

where the final equality follows since \(\left| A\right| =\left| B\right| \) for every A in the matrix group M.
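Both the block transformation of the partial information matrix and this determinant-ratio invariance admit a quick numerical check, using a hypothetical symmetric positive definite matrix in place of a true information matrix and a hypothetical group element A:

```python
import numpy as np

rng = np.random.default_rng(5)
p = 4
# Hypothetical group element of M.
a = rng.standard_normal((1, p - 1))
B = rng.standard_normal((p - 1, p - 1)) + 3 * np.eye(p - 1)
A = np.block([[np.ones((1, 1)), a],
              [np.zeros((p - 1, 1)), B]])
A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

M0 = rng.standard_normal((p, p))
j = M0 @ M0.T + p * np.eye(p)     # hypothetical SPD full information matrix
j_t = A_inv @ j @ A_inv.T         # information under the reparametrization

# Lower-right block transforms as B^{-1} j_{lambda lambda} B^{-T} ...
assert np.allclose(j_t[1:, 1:], B_inv @ j[1:, 1:] @ B_inv.T)
# ... and the ratio |j| / |j_{lambda lambda}| is invariant.
assert np.isclose(np.linalg.det(j) / np.linalg.det(j[1:, 1:]),
                  np.linalg.det(j_t) / np.linalg.det(j_t[1:, 1:]))
```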

This result has a number of ramifications, including the invariance of the approximate asymptotic conditional variance of \({\hat{\psi }}\) given \(V=v\), which is given in (Butler 2007, Sec. 5.4.5) as

$$\begin{aligned} j_{\psi \psi \cdot \lambda }^{-1}&=\frac{\left| j_{\lambda \lambda }\left( \psi ,{\hat{\lambda }}\left( \psi \right) \right) \right| }{\left| j\left( \psi ,{\hat{\lambda }}\left( \psi \right) \right) \right| }\\&=\left. \left[ j_{\psi \psi }\left( \psi ,\lambda \right) -j_{\psi \lambda }\left( \psi ,\lambda \right) j_{\lambda \lambda }^{-1}\left( \psi ,\lambda \right) j_{\lambda \psi }\left( \psi ,\lambda \right) \right] ^{-1}\right| _{\psi ,{\hat{\lambda }}\left( \psi \right) }. \end{aligned}$$

This asymptotic conditional variance is used to define the conditionally studentized statistic for \(\psi \) in Sect. 5.1.
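The equality between the determinant ratio and the bracketed expression rests on the standard Schur-complement determinant identity \(\left| j\right| =\left| j_{\lambda \lambda }\right| \left( j_{\psi \psi }-j_{\psi \lambda }j_{\lambda \lambda }^{-1}j_{\lambda \psi }\right) \) for scalar \(\psi \); a numerical sketch with a hypothetical positive definite matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
p = 4
M0 = rng.standard_normal((p, p))
j = M0 @ M0.T + p * np.eye(p)     # hypothetical SPD information matrix

# |j_{ll}| / |j| equals the reciprocal of the Schur complement
# j_pp - j_pl j_{ll}^{-1} j_lp when the interest parameter is scalar.
lhs = np.linalg.det(j[1:, 1:]) / np.linalg.det(j)
schur = j[0, 0] - j[0, 1:] @ np.linalg.solve(j[1:, 1:], j[1:, 0])
assert np.isclose(lhs, 1.0 / schur)
```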

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhao, R., Paige, R.L. Optimal equivalence testing in exponential families. Stat Papers 64, 1507–1525 (2023). https://doi.org/10.1007/s00362-022-01346-4
