# Improved Cramér–Rao Type Integral Inequalities or Bayesian Cramér–Rao Bounds

## Abstract

New lower bounds on the mean square error of estimators of a random parameter are obtained as applications of an improved Cauchy–Schwarz inequality due to Walker (Stat Probab Lett 122:86–90, 2017).

## Keywords

Bayesian Cramér–Rao bound · Cauchy–Schwarz inequality · Cramér–Rao type integral inequality · Walker's inequality

## 1 Introduction

The Cramér–Rao lower bound for the variance of an unbiased estimator of a parameter is well known for its use in the statistical literature. There has been a large amount of work to obtain Cramér–Rao type integral inequalities leading to lower bounds for the risks associated with Bayesian estimators. Earlier results in this direction are due to Schutzenberger (1957) and Gart (1959). Other works in this direction in the statistical literature are due to Borovkov and Sakhanenko (1980), Targhetta (1984, 1988, 1990), Shemyakin (1987), Bobrovsky et al. (1987), Brown and Gajek (1990), Prakasa Rao (1992), Ghosh (1993) and Gill and Levit (1995). In the engineering literature, this problem is considered under the subject "random parameter estimation". Significant results in this area in the engineering literature are due to van Trees (1968), Ziv and Zakai (1969), Chazan et al. (1975), Miller and Chang (1978), Weinstein and Weiss (1985), Weiss and Weinstein (1985), Brown and Liu (1993) among others. Prakasa Rao (1991) gives a comprehensive survey of results obtained in this area till about 1990. Related results on Cramér–Rao type integral inequalities were obtained in Prakasa Rao (1996, 2000, 2001). In a voluminous work, van Trees and Bell (2007) give a survey of Bayesian bounds for parameter estimation and nonlinear filtering/tracking and edited a volume containing selected papers dealing with Bayesian Cramér–Rao bounds, global Bayesian bounds, hybrid Bayesian bounds, constrained Cramér–Rao bounds and their applications to nonlinear dynamic systems.

It is well known that either the Cramér–Rao inequality giving a lower bound for the quadratic risk of an estimator or the Bayesian versions of the Cramér–Rao inequality obtained by several authors are all consequences or applications of the Cauchy–Schwarz inequality for suitable functions of the observations and the parameter. In a recent paper, Walker (2017) obtained an improved Cauchy–Schwarz inequality. Our aim in this short note is to obtain some Bayesian Cramér–Rao bounds as applications of the improved version of Cauchy–Schwarz inequality. Walker (2017) obtained a generalized Cramér–Rao inequality as an application of the improved Cauchy–Schwarz inequality.

## 2 Main Results

Walker (2017) obtained an improved version of the Cauchy–Schwarz inequality which implies the following probabilistic version.

#### Theorem 2.1

If *X* and *Y* are random variables defined on a probability space \((\Omega , \mathcal{F},P)\) with finite second moments, then

$$\begin{aligned} (E[XY])^2 \le E(X^2)E(Y^2)-\left( |E(X)|\sqrt{E(Y^2)}-|E(Y)|\sqrt{E(X^2)}\right) ^2. \end{aligned}$$(2.1)

Suppose *Y* is a random variable with mean zero and variance 1 and *X* is a random variable with mean \(\mu \) and finite variance \(\sigma ^2.\) Then the Cauchy–Schwarz inequality implies

$$\begin{aligned} (E[XY])^2 \le E(X^2)E(Y^2)= \mu ^2+\sigma ^2, \end{aligned}$$(2.2)

whereas Theorem 2.1 implies

$$\begin{aligned} (E[XY])^2 \le E(X^2)-(E(X))^2= \sigma ^2. \end{aligned}$$(2.3)

Suppose *Y* has mean zero but positive variance and *X* is another random variable with finite variance. Then it follows from Theorem 2.1 that

$$\begin{aligned} (E[XY])^2 \le E(Y^2)[E(X^2)-(E(X))^2]= E(Y^2)\,\mathrm{Var}(X). \end{aligned}$$(2.4)

#### Corollary 2.1

If *X* and *Y* are random variables defined on a probability space \((\Omega , \mathcal{F},P)\) with finite second moments and if \(E(Y)=0,\) then

$$\begin{aligned} E(X^2) \ge (E(X))^2 + \frac{(E[XY])^2}{E(Y^2)}. \end{aligned}$$(2.5)

Let *Z* be a random variable defined on a probability space \((\Omega , \mathcal{F}, P_\theta )\) where \(\theta \in \Theta \subset R.\) Suppose that the parameter \(\theta \) has a prior density \(\lambda (\theta )\) with respect to the Lebesgue measure on *R* and that \(f(z,\theta )\) is the probability density function of the random variable *Z* given the parameter \(\theta .\) Then the joint density of the random vector \((Z,\theta )\) is

$$\begin{aligned} g(z,\theta )= f(z,\theta ) \lambda (\theta ). \end{aligned}$$(2.6)

Let us consider a function \(\psi (z,\theta )\) such that

$$\begin{aligned} E_\theta [\psi (Z,\theta )|Z]=0, \end{aligned}$$(2.7)

where \(E_\theta (\psi (Z,\theta )|Z)\) denotes the expectation of the random variable \(\psi (Z,\theta )\) with respect to the posterior distribution of \(\theta \) given *Z*, and let \(E(\psi (Z,\theta ))\) denote the expectation of \(\psi (Z,\theta )\) with respect to the joint distribution of the random vector \((Z, \theta ).\) Then, for any random variable \(\ell (Z),\) a function of *Z* with \(E[|\ell (Z)|]< \infty \) and finite second moment, applying Corollary 2.1 with \(X= \theta -\ell (Z)\) and \(Y= \psi (Z,\theta ),\) we obtain that

$$\begin{aligned} E([\theta -\ell (Z)]^2) \ge (E[\theta -\ell (Z)])^2 +\frac{(E[(\theta -\ell (Z))\psi (Z,\theta )])^2}{E([\psi (Z,\theta )]^2)}. \end{aligned}$$(2.8)

Note that \(E[\ell (Z)\psi (Z,\theta )]= E[\ell (Z)E_\theta (\psi (Z,\theta )|Z)]=0\) by (2.7), so the numerator in (2.8) also equals \((E[\theta \psi (Z,\theta )])^2.\)

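The gap between the classical Cauchy–Schwarz bound and Walker's improvement can be checked numerically. The following sketch is our own illustration, not part of the paper: it computes both bounds exactly for a small discrete joint distribution of \((X,Y)\).

```python
import math

def cs_bound(EX2, EY2):
    # Classical Cauchy-Schwarz bound on (E[XY])^2.
    return EX2 * EY2

def walker_bound(EX, EY, EX2, EY2):
    # Walker's (2017) self-improvement of the Cauchy-Schwarz inequality:
    # (E[XY])^2 <= E(X^2)E(Y^2) - (|E X| sqrt(E(Y^2)) - |E Y| sqrt(E(X^2)))^2.
    return EX2 * EY2 - (abs(EX) * math.sqrt(EY2) - abs(EY) * math.sqrt(EX2)) ** 2

# A small discrete joint distribution for (X, Y): triples (x, y, probability).
pts = [(1.0, 1.0, 0.3), (2.0, -1.0, 0.4), (3.0, 1.0, 0.3)]
EX  = sum(p * x for x, y, p in pts)          # = 2.0
EY  = sum(p * y for x, y, p in pts)          # = 0.2
EX2 = sum(p * x * x for x, y, p in pts)      # = 4.6
EY2 = sum(p * y * y for x, y, p in pts)      # = 1.0
EXY = sum(p * x * y for x, y, p in pts)      # = 0.4

# (E[XY])^2 respects Walker's bound, which is strictly below Cauchy-Schwarz here.
assert EXY ** 2 <= walker_bound(EX, EY, EX2, EY2) < cs_bound(EX2, EY2)
```

Whenever both \(E(X)\) and \(E(Y)\) are nonzero (or exactly one of them is, as here), the subtracted square is positive and Walker's bound is strictly tighter.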
**Special Cases**

- (i) Suppose we choose \(\psi (Z,\theta )= \theta -E(\theta |Z).\) It is obvious that \(E[\psi (Z,\theta )|Z]=0.\) Applying the inequality (2.8), we get that
  $$\begin{aligned} E([\theta -\ell (Z)]^2)&\ge (E[\theta -\ell (Z)])^2+ \frac{(E[\theta \psi (Z,\theta )])^2 }{E([\psi (Z,\theta )]^2)}\\ &= (E[\theta -\ell (Z)])^2 + \frac{(E[\theta (\theta -E(\theta |Z))])^2 }{E([\theta -E(\theta |Z)]^2)}\\ &= (E[\theta -\ell (Z)])^2 + \frac{(E[(\theta -E(\theta |Z))(\theta -E(\theta |Z))])^2 }{E([\theta -E(\theta |Z)]^2)}\\ &= (E[\theta -\ell (Z)])^2+ E([\theta -E(\theta |Z)]^2), \end{aligned}$$
  where the third equality holds since \(E[E(\theta |Z)(\theta -E(\theta |Z))]=0.\)
- (ii) Let \(\pi (\theta |z)\) denote the posterior density function of the parameter \(\theta \) given the observation *z*. Let \(I(\theta )\) denote the Fisher information in the observation *Z* given the parameter \(\theta ,\)
  $$\begin{aligned} I(\theta )= E\left[ \left( \frac{\partial \log f(Z,\theta )}{\partial \theta }\right) ^2\Big |\theta \right] , \end{aligned}$$
  and let
  $$\begin{aligned} I(\lambda )= E\left[ \left( \frac{\partial \log \lambda (\theta )}{\partial \theta }\right) ^2\right] . \end{aligned}$$
  Suppose we choose
  $$\begin{aligned} \psi (z,\theta )= \frac{\partial \log (\pi (\theta |z))}{\partial \theta }. \end{aligned}$$
  Observe that \(E[\psi (Z,\theta )|Z]=0.\) Since \(\pi (\theta |z)\) is proportional to \(f(z,\theta )\lambda (\theta )\) as a function of \(\theta ,\) we have \(\psi (z,\theta )= \frac{\partial \log f(z,\theta )}{\partial \theta }+\frac{\partial \log \lambda (\theta )}{\partial \theta },\) and the cross term has zero expectation, so it is easy to check that \(E([\psi (Z,\theta )]^2)= E(I(\theta ))+I(\lambda ).\) Applying the inequality given by Corollary 2.1, we get that
  $$\begin{aligned} E([\theta -\ell (Z)]^2)&\ge (E[\theta -\ell (Z)])^2 +\frac{(E[(\theta -\ell (Z))\psi (Z,\theta )])^2}{E([\psi (Z,\theta )]^2)}\nonumber \\ &= (E[\theta -\ell (Z)])^2+ \frac{(E[\theta \psi (Z,\theta )])^2 }{E([\psi (Z,\theta )]^2)}\nonumber \\ &= (E[\theta -\ell (Z)])^2+ \frac{(E[\theta \psi (Z,\theta )])^2 }{E(I(\theta ))+I(\lambda )}. \end{aligned}$$(2.9)
- (iii) We will now obtain an improved version of the van Trees inequality [cf. van Trees (1968), Gill and Levit (1995)]. Let
  $$\begin{aligned} \psi (z,\theta )= \frac{\partial \log (f(z,\theta )\lambda (\theta ))}{\partial \theta }. \end{aligned}$$
  Assuming that the prior density \(\lambda (\theta )\) converges to zero as \(\theta \) tends to the boundary of the set \(\Theta ,\) it follows that
  $$\begin{aligned} \int _{\Theta }\frac{d[f(z,\theta ) \lambda (\theta )]}{d\theta }\, d\theta = [f(z,\theta )\lambda (\theta )]_{\partial \Theta }=0 \end{aligned}$$(2.10)
  and
  $$\begin{aligned} \int _{\Theta } \theta \frac{d[f(z,\theta ) \lambda (\theta )]}{d\theta }\, d\theta = [\theta f(z,\theta )\lambda (\theta )]_{\partial \Theta } -\int _\Theta f(z,\theta ) \lambda (\theta )\,d\theta = -\int _\Theta f(z,\theta )\lambda (\theta )\,d\theta . \end{aligned}$$
  Observe that \(E[\psi (Z,\theta )|Z]=0\) by (2.10). Using the above equations, it follows that
  $$\begin{aligned} E[(\theta -\ell (Z))\psi (Z,\theta )]= \int _{-\infty }^{\infty }\int _\Theta (\theta -\ell (z))\frac{d[f(z,\theta ) \lambda (\theta )]}{d\theta }\,d\theta \, dz = -\int _{-\infty }^{\infty }\int _\Theta f(z,\theta )\lambda (\theta )\,d\theta \, dz= -1. \end{aligned}$$
  Applying Corollary 2.1, we get that
  $$\begin{aligned} E([\theta -\ell (Z)]^2)&\ge (E[\theta -\ell (Z)])^2 +\frac{(E[(\theta -\ell (Z)) \psi (Z,\theta )])^2 }{E([\psi (Z,\theta )]^2)}\\ &= (E[\theta -\ell (Z)])^2+ \frac{1}{E(I(\theta ))+I(\lambda )}, \end{aligned}$$
  which improves the classical van Trees inequality \(E([\theta -\ell (Z)]^2)\ge [E(I(\theta ))+I(\lambda )]^{-1}\) by the nonnegative term \((E[\theta -\ell (Z)])^2.\)
- (iv) Define the likelihood ratio given by
  $$\begin{aligned} L(z;\theta _1,\theta _2)= \frac{g(z,\theta _1)}{g(z,\theta _2)}. \end{aligned}$$
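The improved van Trees bound with the denominator \(E(I(\theta ))+I(\lambda )\) can be verified with exact moment computations in the conjugate Gaussian model. The sketch below is our own illustration under assumed choices (prior \(\theta \sim N(0,\tau ^2)\), observation \(Z|\theta \sim N(\theta ,\sigma ^2)\), linear estimators \(\ell (Z)=cZ+d\)), not a computation taken from the paper.

```python
# Exact moment check of the improved van Trees bound in a conjugate Gaussian
# model: theta ~ N(0, tau^2), Z | theta ~ N(theta, sigma^2), l(Z) = c*Z + d.
tau2, sigma2 = 2.0, 1.0
EI_theta = 1.0 / sigma2          # E(I(theta)) for the N(theta, sigma^2) model
I_lambda = 1.0 / tau2            # prior information I(lambda) for N(0, tau^2)
v = 1.0 / (EI_theta + I_lambda)  # classical van Trees bound = posterior variance

def mse(c, d):
    # E[(theta - c*Z - d)^2] with Z = theta + eps, eps ~ N(0, sigma^2):
    # expands exactly to (1-c)^2 tau^2 + c^2 sigma^2 + d^2.
    return (1 - c) ** 2 * tau2 + c ** 2 * sigma2 + d ** 2

def bias(c, d):
    # E[theta - c*Z - d] = -d, since theta and eps both have mean zero.
    return -d

# Improved van Trees bound: MSE >= bias^2 + 1/(E(I(theta)) + I(lambda)).
for c in [0.0, 0.25, 0.5, tau2 / (tau2 + sigma2), 1.0]:
    for d in [-0.5, 0.0, 0.7]:
        assert mse(c, d) + 1e-12 >= bias(c, d) ** 2 + v

# The posterior mean, c = tau2/(tau2 + sigma2) and d = 0, is unbiased in this
# model and attains the bound exactly.
c_star = tau2 / (tau2 + sigma2)
assert abs(mse(c_star, 0.0) - v) < 1e-12
```

For biased linear estimators the extra term \((E[\theta -\ell (Z)])^2=d^2\) is exactly what separates the improved bound from the classical one, so the improvement is strict whenever the Bayes risk of \(\ell (Z)\) has a nonzero mean error.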

#### Remark

In a similar fashion, it is possible to improve other lower bounds for the risk of Bayesian estimators using Corollary 2.1, as an application of the improved Cauchy–Schwarz inequality due to Walker (2017), and also to obtain similar Bayesian bounds for functions of a parameter. Note that the bounds obtained by using Walker's inequality are sharper than those derived using the Cauchy–Schwarz inequality, as illustrated by Eqs. (2.2) and (2.3). Sudheesh and Dewan (2016) obtained a Bayesian lower bound in the Gaussian case as an application of the generalized moment identity derived by them. The class of lower bounds derived above is sharper than those derived in Weiss and Weinstein (1985). As can be seen from the computations in special case (iii) above, the lower bound obtained here for the risk of the random parameter \(\theta \) is tighter than the lower bounds obtained earlier in the literature.

## References

- Bobrovsky BZ, Mayer-Wolf E, Zakai M (1987) Some classes of global Cramér–Rao bounds. Ann Stat 15:1421–1438
- Borovkov AA, Sakhanenko AI (1980) On estimates for the average quadratic risk. Probab Math Stat 1:185–195 (in Russian)
- Brown LD, Gajek L (1990) Information inequalities for the Bayes risk. Ann Stat 18:1578–1594
- Brown LD, Liu RC (1993) Bounds on the Bayes and minimax risk for signal parameter estimation. IEEE Trans Inf Theory 39:1386–1394
- Chazan D, Ziv J, Zakai M (1975) Improved lower bounds on signal parameter estimation. IEEE Trans Inf Theory 21:90–93
- Gart JJ (1959) An extension of the Cramér–Rao inequality. Ann Math Stat 30:367–380
- Ghosh M (1993) Cramér–Rao bounds for posterior variances. Stat Probab Lett 17:173–178
- Gill RD, Levit BY (1995) Applications of the van Trees inequality: a Bayesian Cramér–Rao bound. Bernoulli 1:59–79
- Miller R, Chang C (1978) A modified Cramér–Rao bound and its applications. IEEE Trans Inf Theory 24:398–400
- Prakasa Rao BLS (1991) On Cramér–Rao type integral inequalities. Calcutta Stat Assoc Bull 40:183–205. Reprinted in: van Trees HL, Bell KL (eds) Bayesian bounds for parameter estimation and nonlinear filtering/tracking. IEEE Press, Wiley, New York, pp 900–922
- Prakasa Rao BLS (1992) Cramér–Rao type integral inequalities for functions of multidimensional parameter. Sankhya Ser A 54:53–73
- Prakasa Rao BLS (1996) Remarks on Cramér–Rao type integral inequalities for randomly censored data. In: Koul HL, Deshpande JV (eds) Analysis of censored data. IMS Lecture Notes No. 27. Institute of Mathematical Statistics, pp 160–176
- Prakasa Rao BLS (2000) Cramér–Rao type integral inequalities in Banach spaces. In: Basu AK, Ghosh JK, Sen PK, Sinha BK (eds) Perspectives in statistical sciences. Oxford University Press, New Delhi, pp 245–260
- Prakasa Rao BLS (2001) Cramér–Rao type integral inequalities for general loss functions. TEST 10:105–120
- Schutzenberger MP (1957) A generalization of the Fréchet–Cramér inequality to the case of Bayes estimation. Bull Am Math Soc 63:142
- Shemyakin AE (1987) Rao–Cramér type integral inequalities for estimates of a vector parameter. Theory Probab Appl 32:426–434
- Sudheesh K, Dewan I (2016) On generalized moment identity and its application: a unified approach. Statistics 50:1149–1160
- Targhetta M (1984) On Bayesian analogues to Bhattacharya's lower bounds. Arab Gulf J Sci Res 2:583–590
- Targhetta M (1988) On the attainment of a lower bound for the Bayes risk in estimating a parametric function. Statistics 19:233–239
- Targhetta M (1990) A note on the mixing problem and the Schutzenberger inequality. Metrika 37:155–161
- van Trees HL (1968) Detection, estimation and modulation theory, part 1. Wiley, New York
- van Trees HL, Bell KL (2007) Bayesian bounds for parameter estimation and nonlinear filtering/tracking. IEEE Press, Wiley, New York
- Walker SG (2017) A self-improvement to the Cauchy–Schwarz inequality. Stat Probab Lett 122:86–90
- Weinstein E, Weiss A (1985) Lower bounds on the mean square estimation error. Proc IEEE 73:1433–1434
- Weiss A, Weinstein E (1985) A lower bound on the mean square error in random parameter estimation. IEEE Trans Inf Theory 31:680–682
- Ziv J, Zakai M (1969) Some lower bounds on signal parameter estimation. IEEE Trans Inf Theory 15:386–391