A Characterization of Jeffreys’ Prior with Its Implications to Likelihood Inference

Yanagimoto, Takemi; Ohnishi, Toshio

doi:10.1007/978-981-15-9663-6_6

Takemi Yanagimoto⁴ &
Toshio Ohnishi⁵

Part of the book series: SpringerBriefs in Statistics ((JSSRES))

245 Accesses

Abstract

A characterization of Jeffreys’ prior for a parameter of a distribution in the exponential family is given by the asymptotic equivalence of the posterior mean of the canonical parameter to the maximum likelihood estimator. A promising role of the posterior mean is discussed because of its optimality property. Further, methods for improving estimators are explored, when neither the posterior mean nor the maximum likelihood estimator performs favorably. The possible advantages of conjugate analysis based on a suitably chosen prior are examined.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Aitchison J (1975) Goodness of prediction fit. Biometrika 62:547–554
Article MathSciNet Google Scholar
Amari S, Nagaoka H (2007) Methods of information geometry. Am Math Soc, Rhode Island
Book Google Scholar
Berger JO, Bernardo JM (1992) Ordered group reference priors with application to the multinomial problem. Biometrika 79:25–37
Article MathSciNet Google Scholar
Bernardo JM (1979) Reference posterior distributions for Bayesian inference. J Roy Statist Soc B 41:113–147
MathSciNet MATH Google Scholar
Corcuera JM, Giummole F (1999) A generalized Bayes rule for prediction. Scand J Statist 26:265–279
Article MathSciNet Google Scholar
Cox DR, Reid N (1987) Parameter orthogonality and approximate conditional inference (with discussion). J Roy Statist Soc B 49:1–39
MathSciNet MATH Google Scholar
Diaconis P, Ylvisaker D (1979) Conjugate priors for exponential families. Ann Statist 7:269–281
Article MathSciNet Google Scholar
Fisher NL (1995) Statistical Analysis of circular data. Cambridge University Press, Cambridge
Google Scholar
Ghosh M, Liu R (2011) Moment matching priors. Sankhya A 73:185–201
Article Google Scholar
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge
MATH Google Scholar
James W, Stein C (1961) Estimation with quadratic loss. Proc Fourth Berkeley Symp Math Statist Prob 1:361–380
MathSciNet MATH Google Scholar
Jeffreys H (1961) Theory of probability, 3rd edn. Oxford Univ Press, Oxford
MATH Google Scholar
Lehmann EL (1959) Testing statistical hypotheses. Wiley, New York
MATH Google Scholar
Lindsey JK (1996) Parametric statistical inference. Clarendon Press, Oxford
MATH Google Scholar
Neyman J, Scott EL (1948) Consistent estimates based on partially consistent observations. Econometrica 16:1–32
Article MathSciNet Google Scholar
Robert CP (2001) The Bayesian choice, 2nd edn. Springer, New York
Google Scholar
Sakumura T and Yanagimoto T (2019) Posterior mean of the canonical parameter in the von-Mises distribution (in Japanese). Read at Japan Joint Statist Meet Abstract 69
Google Scholar
Spiegelhalter DJ, Best NG, Carlin BP, van der Lind A (2002) Bayesian measures of model complexity and fit (with discussions). J R Statist Soc B 64:583–639
Article Google Scholar
Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2014) The deviance information criterion: 12 years on. J R Statist Soc B 76:485–493
Article MathSciNet Google Scholar
Tierney L, Kass RE, Kadane JB (1989) Fully exponential Laplace approximations to expectations and variances of nonpositive functions. J Am Statist Assoc 84:710–716
Article MathSciNet Google Scholar
Yanagimoto T, Ohnishi T (2009) Bayesian prediction of a density function in terms of $e$-mixture. J Statist Plann Inf 139:3064–3075
Article MathSciNet Google Scholar
Yanagimoto T, Ohnishi T (2011) Saddlepoint condition on a predictor to reconfirm the need for the assumption of a prior distribution. J Statist Plann Inf 41:1990–2000
Article MathSciNet Google Scholar

Download references

Acknowledgements

The authors express their thanks to a reviewer and the editors for their comments on points to be clarified.

Author information

Authors and Affiliations

Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo, 190-8562, Japan
Takemi Yanagimoto
Faculty of Economics, Kyushu University, 744 Motooka Nishi-ku, Fukuoka, 819-0395, Japan
Toshio Ohnishi

Authors

Takemi Yanagimoto
View author publications
You can also search for this author in PubMed Google Scholar
Toshio Ohnishi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Takemi Yanagimoto .

Editor information

Editors and Affiliations

School of Economics, Kanazawa University, Kanazawa, Ishikawa, Japan
Nobuaki Hoshino
The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan
Shuhei Mano
The Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan
Takaaki Shimura

Appendices

Appendix A. Proof of Theorem 6.1

Before presenting the proof, we clarify the notation that is more rigorous than that in the text. Write a density in the exponential family as

$$ p ({\varvec{x}}|\theta ) = \prod \exp \{ \theta \cdot t_i - M (\theta ) \} a (x_i), $$

where $\mathop {\textstyle {\sum }}t_i \in \mathcal {X} \subset \mathbb {R}^p$ is the sufficient statistic, and $\theta \in \Theta \subset \mathbb {R}^p$ with $\theta =(\theta _1,\ldots ,\theta _p)$ is the canonical parameter.

For a given $\theta _0$, the corresponding mean parameter is written as $\mu _0 = \nabla M (\theta _0)$. The Kullback–Leibler divergence is expressed as

$$\begin{aligned} \text{ D }(\theta _0, \theta ) \,=\, M (\theta ) + N (\mu _0) - \mu _0 \cdot \theta , \end{aligned}$$

and the Fisher information matrix is written as

$$\begin{aligned} I (\theta ) \,=\, \left\{ M_{ij} (\theta ) \right\} _{1 \le i, j \le p}, \end{aligned}$$

where $M_{ij} (\theta ) = \partial ^2 M (\theta )/\partial \theta _i \partial \theta _j$. For notational convenience, the partial derivative of a function of a vector variable with respect to its components is denoted by the corresponding suffixes.

We begin the proof with presenting an expression of the posterior mean as

$$\begin{aligned} \text{ E }\bigl \{ \theta \,;\, \pi (\theta ; \theta _0, n) \bigr \} = \frac{\int _\Theta \theta b (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta }{\int _\Theta b (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta }. \end{aligned}$$

Thus, we may evaluate

$$\begin{aligned} \int _\Theta g (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta \end{aligned}$$

(6.16)

for cases in which $g (\theta )$ is $ \theta _i b (\theta )$ and $b (\theta )$.

When $\theta \approx \theta _0$, the following formal approximation is possible:

$$\begin{aligned} \text{ D }(\theta _0, \theta )&\approx \frac{1}{2} (\theta - \theta _0)^T I (\theta _0) (\theta - \theta _0) \\&\qquad + \frac{1}{3!} \sum _{j_1, j_2, j_3} M_{j_1 j_2 j_3} (\theta _0) (\theta _{j_1} - \theta _{0 j_1}) (\theta _{j_2} - \theta _{0 j_2}) (\theta _{j_3} - \theta _{0 j_3}) \\&\qquad + \frac{1}{4!} \sum _{j_1, j_2, j_3, j_4} M_{j_1 j_2 j_3 j_4} (\theta _0) (\theta _{j_1} - \theta _{0 j_1}) (\theta _{j_2} - \theta _{0 j_2}) (\theta _{j_3} - \theta _{0 j_3}) (\theta _{j4} - \theta _{0 j_4}), \end{aligned}$$

where $\theta _{0i}$ denotes the i-th component of $\theta _0$.

Since $I (\theta _0)$ is assumed to be positive definite, the a-th power can be defined for $a=1/2$ and $-1/2$. Both matrices are positive definite, and one is the inverse matrix of the other. We consider here the following parameter transformation of $\theta $ to z as

$$\begin{aligned} z \,=\, \sqrt{n} I^{1/2} (\theta _0) (\theta - \theta _0). \end{aligned}$$

The Jacobian of this transformation is

$$\begin{aligned} \frac{1}{n^{p/2} \sqrt{\det I (\theta _0)}}. \end{aligned}$$

Then the asymptotic expansion of the Kullback–Leibler divergence up to the order O(1/n) is given by

$$\begin{aligned} n \text{ D }(\theta _0, \theta ) \approx \frac{| z |^2}{2} + \frac{1}{\sqrt{n}} \sum _{j_1, j_2, j_3} a_{j_1 j_2 j_3}^{(1)} z_{j_1} z_{j_2} z_{j_3} + \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} z_{j_1} z_{j_2} z_{j_3} z_{j_4}, \end{aligned}$$

where $a_{j_1 j_2 j_3}^{(1)}$ and $a_{j_1 j_2 j_3 j_4}^{(2)}$ are defined as

$$\begin{aligned} a_{j_1 j_2 j_3}^{(1)} := \frac{1}{3!} \sum _{j_4, j_5, j_6} M_{j_4 j_5 j_6} (\theta _0) I^{-1/2}_{j_4 j_1} (\theta _0) I^{-1/2}_{j_5 j_2} (\theta _0) I^{-1/2}_{j_6 j_3} (\theta _0); \end{aligned}$$

(6.17)

$$\begin{aligned} a_{j_1 j_2 j_3 j_4}^{(2)} := \frac{1}{4!} \sum _{j_5, j_6, j_7, j_8} M_{j_5 j_6 j_7 j_8} (\theta _0) I^{-1/2}_{j_5 j_1} (\theta _0) I^{-1/2}_{j_6 j_2} (\theta _0) I^{-1/2}_{j_7 j_3} (\theta _0) I^{-1/2}_{j_8 j_4} (\theta _0). \end{aligned}$$

(6.18)

Note that $a_{j_1 j_2 j_3}^{(1)}$ and $a_{j_1 j_2 j_3 j_4}^{(2)}$ remain unchanged under the permutation of the suffixes, since $M (\theta )$ is assumed to be of $C^4$ class. To evaluate the integral in (6.16), we evaluate the asymptotic expansion of $\exp \{ -n \text{ D }(\theta _0, \theta ) \}$. Writing the density of the standard p-dimensional normal as $\phi (z)$, we can give the asymptotic expansion up to the order O(1/n) as

$$\begin{aligned}&\exp \{ -n \text{ D }(\theta _0, \theta ) \} \\&= (2 \pi )^{p/2} \phi (z) \exp \left( - \frac{1}{\sqrt{n}} \sum _{j_1, j_2, j_3} a_{j_1 j_2 j_3}^{(1)} z_{j_1} z_{j_2} z_{j_3} - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} \right) \\&\approx (2 \pi )^{p/2} \phi (z) \Biggl ( 1 - \frac{1}{\sqrt{n}} \sum _{j_1, j_2, j_3} a_{j_1 j_2 j_3}^{(1)} z_{j_1} z_{j_2} z_{j_3} - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} \\&\qquad \qquad \qquad \qquad \qquad \qquad + \frac{1}{2 n} \sum _{j_1, j_2, j_3, j_4, j_5, j_6} a_{j_1 j_2 j_3}^{(1)} a_{j_4 j_5 j_6}^{(1)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} z_{j_5} z_{j_6} \Biggr ). \end{aligned}$$

In a sequel, we regard the domain of $\theta $ as $\mathbb {R}^p$.

Next, we calculate the asymptotic expansion of $g (\theta )$ up to the order O(1/n) by

$$\begin{aligned} g (\theta )&\approx g (\theta _0) + \frac{1}{\sqrt{n}} \sum _{j_1, j_2} g_{j_2} (\theta _0) I^{-1/2}_{j_2 j_1} (\theta _0) z_{j_1} \\&\qquad \qquad + \frac{1}{2 n} \sum _{j_1, j_2, j_3, j_4} g_{j_3, j_4} (\theta _0) I^{-1/2}_{j_3 j_1} (\theta _0) I^{-1/2}_{j_4 j_2} (\theta _0) z_{j_1} z_{j_2} \\&= g (\theta _0) \left( 1 + \frac{1}{\sqrt{n}} \sum _{j_1} c_{j_1}^{(1)} z_{j_1} + \frac{1}{n} \sum _{j_1, j_2} c_{j_1 j_2}^{(2)} z_{j_1} z_{j_2} \right) , \end{aligned}$$

where $c_{j_1}^{(1)}$ and $c_{j_1 j_2}^{(2)}$ denote, respectively,

$$\begin{aligned} c_{j_1}^{(1)} := \sum _{j_2} \frac{g_{j_2} (\theta _0)}{g (\theta _0)} I^{-1/2}_{j_2 j_1} (\theta _0) \end{aligned}$$

(6.19)

and

$$\begin{aligned} c_{j_1 j_2}^{(2)} := \frac{1}{2} \sum _{j_3, j_4} \frac{g_{j_3 j_4} (\theta _0)}{g (\theta _0)} I^{-1/2}_{j_3 j_1} (\theta _0) I^{-1/2}_{j_4 j_2} (\theta _0). \end{aligned}$$

(6.20)

Note that these coefficients also remain unchanged under the permutation of the suffixes, as are $a_{j_1 j_2 j_3}^{(1)}$ and $a_{j_1 j_2 j_3 j_4}^{(2)}$ in (6.17) and (6.18).

Combining these asymptotic expansions, we obtain that of the integrant in (6.16) as follows:

$$\begin{aligned}&g (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} \\&\approx (2 \pi )^{p/2} g (\theta _0) \phi (z) \left( 1 + \frac{1}{\sqrt{n}} \sum _{j_1} c_{j_1}^{(1)} z_{j_1} + \frac{1}{n} \sum _{j_1, j_2} c_{j_1 j_2}^{(2)} z_{j_1} z_{j_2} \right) \\&\qquad \times \Biggl ( 1 - \frac{1}{\sqrt{n}} \sum _{j_1, j_2, j_3} a_{j_1 j_2 j_3}^{(1)} z_{j_1} z_{j_2} z_{j_3} - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} \\&\qquad \qquad \qquad \qquad \qquad \qquad + \frac{1}{2 n} \sum _{j_1, j_2, j_3, j_4, j_5, j_6} a_{j_1 j_2 j_3}^{(1)} a_{j_4 j_5 j_6}^{(1)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} z_{j_5} z_{j_6} \Biggr ). \end{aligned}$$

Since this approximated integrant contains $\phi (z)$, we may discard the odd order terms of the polynomial of z to give

$$\begin{aligned} (2 \pi )^{p/2} g (\theta _0) \phi (z) \\&\times \Biggl \{ 1 - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} \\&\qquad \qquad + \frac{1}{2 n} \sum _{j_1, j_2, j_3, j_4, j_5, j_6} a_{j_1 j_2 j_3}^{(1)} a_{j_4 j_5 j_6}^{(1)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} z_{j_5} z_{j_6} \\&\qquad \qquad - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} c_{j_1}^{(1)} a_{j_2 j_3 j_4}^{(1)} z_{j_1} z_{j_2} z_{j_3} z_{j_4} + \frac{1}{n} \sum _{j_1, j_2} c_{j_1 j_2}^{(2)} z_{j_1} z_{j_2} \Biggr \}. \end{aligned}$$

Let $Z = (Z_1, \ldots , Z_p)^T$ be a random variable having the density $\phi (z)$. Then the second moment is written as $E \{ Z_i Z_j \} = \delta _{ij}$. To evaluate the fourth moment, set $\gamma _{ijkl} := E \{ Z_i Z_j Z_k Z_l\}$. It follows that $\gamma _{iiii} = 3$ for every i, and that $\gamma _{iikk} = 1 $ for (i, j, k, l) such that two pairs take different integers. The sixth moment remains in the asymptotic expansion of the integral of (6.16), but disappears in the asymptotic expansion of the posterior mean.

The asymptotic expansion of the integral of (6.16) up to the order O(1/n) is expressed as

$$\begin{aligned}&\int _\Theta g (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta \\&\approx \left( \frac{2 \pi }{n} \right) ^{p/2} \frac{g (\theta _0)}{\sqrt{\det I (\theta _0)}} \\&\qquad \times \Biggl \{ 1 - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} \gamma _{j_1 j_2 j_3 j_4} + \frac{1}{2 n} \sum _{j_1, j_2, j_3, j_4, j_5, j_6} a_{j_1 j_2 j_3}^{(1)} a_{j_4 j_5 j_6}^{(1)} \gamma _{j_1 j_2 j_3 j_4 j_5 j_6} \\&\qquad \qquad \qquad \qquad \qquad - \frac{1}{n} \sum _{j_1, j_2, j_3, j_4} c_{j_1}^{(1)} a_{j_2 j_3 j_4}^{(1)} \gamma _{j_1 j_2 j_3 j_4} + \frac{1}{n} \sum _{j_1, j_2} c_{j_1 j_2}^{(2)} \delta _{j_1 j_2} \Biggr \}. \end{aligned}$$

Set the sum of the second and third terms as A, that is,

$$\begin{aligned} A \,=\, - \sum _{j_1, j_2, j_3, j_4} a_{j_1 j_2 j_3 j_4}^{(2)} \gamma _{j_1 j_2 j_3 j_4} \,+\, \frac{1}{2} \sum _{j_1, j_2, j_3, j_4, j_5, j_6} a_{j_1 j_2 j_3}^{(1)} a_{j_4 j_5 j_6}^{(1)} \gamma _{j_1 j_2 j_3 j_4 j_5 j_6}. \end{aligned}$$

Note that A is independent of $b(\theta )$. Next, we simplify the fourth term, by applying the properties of $a_{j_1 j_2 j_3}^{(1)}$, as follows:

$$\begin{aligned}&\sum _{j_1, j_2, j_3, j_4} c_{j_1}^{(1)} a_{j_2 j_3 j_4}^{(1)} \gamma _{j_1 j_2 j_3 j_4} \\&= 3 \sum _{j_1} c_{j_1}^{(1)} a_{j_1 j_1 j_1}^{(1)} + \sum _{j_1 \ne j_2} c_{j_1}^{(1)} a_{j_1 j_2 j_2}^{(1)} + \sum _{j_1 \ne j_2} c_{j_1}^{(1)} a_{j_2 j_1 j_2}^{(1)} + \sum _{j_1 \ne j_2} c_{j_1}^{(1)} a_{j_2 j_2 j_1}^{(1)} \\&= 3 \sum _{j_1} c_{j_1}^{(1)} a_{j_1 j_1 j_1}^{(1)} + 3 \sum _{j_1 \ne j_2} c_{j_1}^{(1)} a_{j_1 j_2 j_2}^{(1)} \\&= 3 \sum _{j_1, j_2} c_{j_1}^{(1)} a_{j_1 j_2 j_2}^{(1)}. \end{aligned}$$

The fifth term is rewritten as

$$\begin{aligned} \sum _{j_1, j_2} c_{j_1 j_2}^{(2)} \delta _{j_1 j_2} = \sum _{j_1} c_{j_1 j_1}^{(2)}. \end{aligned}$$

Therefore, the asymptotic expansion of the integral is of the form

$$\begin{aligned}&\int _\Theta g (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta \\&\approx \left( \frac{2 \pi }{n} \right) ^{p/2} \frac{g (\theta _0)}{\sqrt{\det I (\theta _0)}} \Biggl \{ 1 + \frac{1}{n} \left( A - 3 \sum _{j_1, j_2} c_{j_1}^{(1)} a_{j_1 j_2 j_2}^{(1)} + \sum _{j_1} c_{j_1 j_1}^{(2)} \right) \Biggr \}. \end{aligned}$$

Thus, both the numerator and the denominator of the posterior mean are expressed in similar forms as

$$\begin{aligned} \int _\Theta \theta _i b (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta&\approx \left( \frac{2 \pi }{n} \right) ^{p/2} \frac{\theta _{0i} g (\theta _0)}{\sqrt{\det I (\theta _0)}} \left( 1 + \frac{d_{Ni}}{n} \right) , \\ \int _\Theta b (\theta ) \exp \{ - n \text{ D }(\theta _0, \theta ) \} d \theta&\approx \left( \frac{2 \pi }{n} \right) ^{p/2} \frac{g (\theta _0)}{\sqrt{\det I (\theta _0)}} \left( 1 + \frac{d_D}{n} \right) . \end{aligned}$$

Consequently, we obtain the asymptotic expansion of the posterior mean as

$$\begin{aligned} \text{ E }\bigl \{ \theta _i \,;\, \pi (\theta ; \theta _0, n) \bigr \} \approx \theta _{0i} \left( 1 + \frac{d_{Ni} - d_D}{n} \right) . \end{aligned}$$

Next, we prove the necessity and suppose that $d_{Ni} = d_D$ for every i. Since the coefficients $a_{j_1 j_2 j_3}^{(1)}$ and $a_{j_1 j_2 j_3 j_4}^{(2)}$ are independent of $b(\theta )$, the difference $d_{Ni} = d_D$ is independent of A for every i. Thus, the difference depends on $c_{j_1}^{(1)}$ and $c_{j_1 j_1}^{(2)}$. To evaluate the difference, we decompose it into two terms $ F_1 + F_2$ such that $F_1$ is a function of $c_{j_1}^{(1)}$ and $F_2$ is a function of $c_{j_1 j_1}^{(2)}$.

Since $M(\theta )$ is assumed to be of $C^4$ class, $\text{ I }(\theta )$ is of $C^2$ class. Using the equality

$$\begin{aligned} \frac{1}{\theta _i b (\theta )} \frac{\partial \{ \theta _i b (\theta ) \}}{\partial \theta _{j_2}} - \frac{b_{j_2} (\theta )}{b (\theta )}&= \frac{\delta _{j_2 i}}{\theta _i}, \end{aligned}$$

we find that the coefficient of $c_{j_1}^{(1)}$ in the difference $d_{Ni} = d_D$ is written as

$$\begin{aligned} \sum _{j_3} \frac{\delta _{j_3 i}}{\theta _{0i}} I^{-1/2}_{j_3 j_1} (\theta _0) = \frac{I^{-1/2}_{i j_1} (\theta _0)}{\theta _{0i}}. \end{aligned}$$

This implies that $F_1$ can be expressed as follows:

$$\begin{aligned} F_1&= -3 \sum _{j_1 j_2} \frac{I^{-1/2}_{i j_1} (\theta _0)}{\theta _{0i}} a_{j_1 j_2 j_2}^{(1)} \\&= -3 \sum _{j_1 j_2} \frac{I^{-1/2}_{i j_1} (\theta _0)}{\theta _{0i}} \frac{1}{3!} \sum _{j_4, j_5, j_6} M_{j_4 j_5 j_6} (\theta _0) I^{-1/2}_{j_4 j_1} (\theta _0) I^{-1/2}_{j_5 j_2} (\theta _0) I^{-1/2}_{j_6 j_2} (\theta _0) \\&= -\frac{1}{2 \theta _{0i}} \sum _{j_1, j_2, j_4, j_5, j_6} I^{-1/2}_{i j_1} (\theta _0) I^{-1/2}_{j_4 j_1} (\theta _0) I^{-1/2}_{j_5 j_2} (\theta _0) I^{-1/2}_{j_6 j_2} (\theta _0) M_{j_4 j_5 j_6} (\theta _0). \end{aligned}$$

Since $I^{-1/2} (\theta _0)$ is symmetric, it follows that

$$\begin{aligned} \sum _{j_1} I^{-1/2}_{i j_1} (\theta _0) I^{-1/2}_{j_4 j_1} (\theta _0) = \sum _{j_1} I^{-1/2}_{i j_1} (\theta _0) I^{-1/2}_{j_1 j_4} (\theta _0) = I^{-1}_{i j_4} (\theta _0) \end{aligned}$$

and also that

$$\begin{aligned} \sum _{j_2} I^{-1/2}_{j_5 j_2} (\theta _0) I^{-1/2}_{j_6 j_2} (\theta _0)&= I^{-1}_{j_5 j_6} (\theta _0). \end{aligned}$$

Consequently, the former term is given by

$$\begin{aligned} F_1 \,=\, -\frac{1}{2 \theta _{0i}} \sum _{j_4, j_5, j_6} I^{-1}_{i j_4} (\theta _0) I^{-1}_{j_5 j_6} (\theta _0) M_{j_4 j_5 j_6} (\theta _0). \end{aligned}$$

Next, we evaluate the latter term $F_2$. It holds that

$$\begin{aligned} \frac{1}{\theta _i b (\theta )} \frac{\partial ^2 \{ \theta _i b (\theta ) \}}{\partial \theta _{j_3} \partial _{j4}} \,-\, \frac{b_{j_3 j_4} (\theta )}{b (\theta )}&= \frac{\delta _{j_4 i} b_{j_3} (\theta ) + \delta _{j_3 i} b_{j_4} (\theta )}{\theta _i b (\theta )}. \end{aligned}$$

Hence, the coefficient of $c_{j_1 j_1}^{(2)}$ in the difference $d_{Ni} = d_D$ is written as

$$\begin{aligned}&\frac{1}{2} \sum _{j_3, j_4} \frac{\delta _{j_4 i} b_{j_3} (\theta _0) + \delta _{j_3 i} b_{j_4} (\theta _0)}{\theta _{0i} b (\theta )} I^{-1/2}_{j_3 j_1} (\theta _0) I^{-1/2}_{j_4 j_1} (\theta _0) \\&= \frac{I^{-1/2}_{i j_1} (\theta _0)}{2 \theta _{0i}} \sum _{j_3} \frac{b_{j_3} (\theta _0)}{b (\theta )} I^{-1/2}_{j_3 j_1} (\theta _0) + \frac{I^{-1/2}_{i j_1} (\theta _0)}{2 \theta _{0i}} \sum _{j_4} \frac{b_{j_4} (\theta _0)}{b (\theta )} I^{-1/2}_{j_4 j_1} (\theta _0) \\&= \frac{I^{-1/2}_{i j_1} (\theta _0)}{\theta _{0i}} \sum _{j_3} \frac{b_{j_3} (\theta _0)}{b (\theta )} I^{-1/2}_{j_3 j_1} (\theta _0). \end{aligned}$$

Thus, the latter term $F_2$ is given by

$$\begin{aligned} F_2 = \frac{1}{\theta _{0i}} \sum _{j_1, j_3} I^{-1/2}_{i j_1} (\theta _0) \frac{b_{j_3} (\theta _0)}{b (\theta )} I^{-1/2}_{j_3 j_1} (\theta _0) = \frac{1}{\theta _{0i}} \sum _{j_3} I^{-1}_{i j_3} (\theta _0) \frac{b_{j_3} (\theta _0)}{b (\theta )}. \end{aligned}$$

Combining these results, we obtain that

$$\begin{aligned} d_{Ni} - d_D \,=\, -\frac{1}{2 \theta _{0i}} \sum _{j_4, j_5, j_6} I^{-1}_{i j_4} (\theta _0) I^{-1}_{j_5 j_6} (\theta _0) M_{j_4 j_5 j_6} (\theta _0) + \frac{1}{\theta _{0i}} \sum _{j_3} I^{-1}_{i j_3} (\theta _0) \frac{b_{j_3} (\theta _0)}{b (\theta )}. \end{aligned}$$

(6.21)

Using the equality

$$ M_{j_3 j_5 j_6} (\theta _0) = \frac{\partial I_{j_5 j_6} (\theta _0)}{\partial \theta _{0 j_3}} $$

and replacing the index $j_4$ in the summation of the difference (6.21) by $j_3$, we can rewrite the difference as follows:

$$\begin{aligned} d_{Ni} - d_D&= \frac{1}{\theta _{0i}} \sum _{j_3} I^{-1}_{i j_3} (\theta _0) \left\{ \frac{b_{j_3} (\theta _0)}{b (\theta )} -\frac{1}{2} \sum _{j_5, j_6} I^{-1}_{j_5 j_6} (\theta _0) \frac{\partial I_{j_5 j_6} (\theta _0)}{\partial \theta _{0 j_3}} \right\} . \end{aligned}$$

Applying the differentiation formula for the determinant of a differentiable and invertible matrix A(t), $d \{A (t)\}/dt = \det A (t)\, \text{ tr }\{ A^{-1} d\{ A(t)\}/dt\}$, we can express the right-hand side in terms of the derivative of the determinant of the matrix $I(\theta _0)$:

$$\begin{aligned} \sum _{j_5, j_6} I^{-1}_{j_5 j_6} (\theta _0) \frac{\partial I_{j_5 j_6} (\theta _0)}{\partial \theta _{0 j_3}} = \frac{\partial \;\;\; }{\partial \theta _{0 j_3}} \log \det I (\theta _0). \end{aligned}$$

This implies that

$$\begin{aligned} d_{Ni} - d_D \,=\, \frac{1}{\theta _{0i}} \sum _{j_3} I^{-1}_{i j_3} (\theta _0) \frac{\partial \;\;\; }{\partial \theta _{0 j_3}} \left\{ \log b (\theta ) -\frac{1}{2} \log \det I (\theta _0) \right\} . \end{aligned}$$

The condition for this equality to hold for every i is expressed as

$$ \nabla \log \frac{b (\theta )}{\sqrt{ \det I (\theta )}} \,=\, 0. $$

This completes the proof.

Appendix B. Proof of Corollary 6.1

To apply Theorem 6.1, set $s=\nabla N(t)$, and define the function $f_n (s)$ as

$$ f_n (s) = \frac{\mathop {\textstyle {\int }}\theta \exp \{ -n D (s, \theta ) \} \pi _J (\theta ) d \theta }{\mathop {\textstyle {\int }}\exp \{ -n D (s, \theta ) \} \pi _J (\theta ) d \theta }. $$

Theorem 6.1 yields that, for an arbitrary fixed s,

$$\begin{aligned} f_n (s) = s + a_n (s) / n^2, \end{aligned}$$

(6.22)

where the coefficient $a_n (s)$ is continuous and is of the order O(1).

Write the MLE of $\theta $ for a sample of size n, ${\varvec{x}}_n$, as $\hat{\theta }_{ML}({\varvec{x}}_n) $. Since the sample density is assumed to be in the exponential family, $\hat{\theta }_{ML}({\varvec{x}}_n)$ can be expressed as $\nabla N(\bar{t})$, which is written as $s_n$. Then, the posterior mean $\hat{\theta }({\varvec{x}}_n)$ is expressed as $\hat{\theta }({\varvec{x}}_n)=f_n (s_n)$. The assumption that the sampling density is in the regular exponential family shows that a true parameter $\theta _T$ is in the interior of $\Theta $; the law of large numbers implies that for an arbitrary small positive value $\epsilon $, the probability of the subspace of samples $\mathcal{{X}}_{\epsilon }(n) = \{{\varvec{x}}_n|\, |s_n -\theta _T| \le \epsilon \}$ is greater than $1-\epsilon $. From the continuity of the coefficient $a_n(s_n)$ in (6.22), it follows that $a_n(s_n)$ is bounded for ${\varvec{x}}_n \in \mathcal{{X}}_{\epsilon }(n)$. Thus, it holds that

$$ f_n (s_n) = s_n + c_n / n^2, $$

where $c_n$ is a finite value. Combining these results, we find that

$$ \hat{\theta }({\varvec{x}}_n) = \hat{\theta }_{ML}({\varvec{x}}_n) + O_P ( 1 / n^2 ).$$

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Yanagimoto, T., Ohnishi, T. (2020). A Characterization of Jeffreys’ Prior with Its Implications to Likelihood Inference. In: Hoshino, N., Mano, S., Shimura, T. (eds) Pioneering Works on Distribution Theory. SpringerBriefs in Statistics(). Springer, Singapore. https://doi.org/10.1007/978-981-15-9663-6_6

Download citation

DOI: https://doi.org/10.1007/978-981-15-9663-6_6
Published: 05 January 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-9662-9
Online ISBN: 978-981-15-9663-6
eBook Packages: Mathematics and StatisticsMathematics and Statistics (R0)

Publish with us

Policies and ethics