Bayesian vs. Frequentist Shrinkage in Multivariate Normal Problems
This chapter is dedicated to the comparison of Bayes and frequentist estimators of the mean θ of a multivariate normal distribution in high dimensions. For dimension k ≥ 3, the James-Stein estimator specified in (2.15) (and its more general form to be specified below) is usually the frequentist estimator of choice. The estimator is known to improve uniformly upon the sample mean vector X̅ as an estimator of θ when k ≥ 3, and while it is also known that it is not itself admissible, extant alternative estimators with smaller risk functions are known to offer only very slight improvement. For this and other reasons, the James-Stein estimator is widely used among estimators that exploit the notion of shrinkage. In the results described in this chapter, I will use the form of the James-Stein estimator that shrinks X̅ toward a (possibly nonzero) distinguished point. This places the James-Stein estimator and the Bayes estimator of θ with respect to a standard conjugate prior distribution in comparable frameworks, since the latter also shrinks X̅ toward a distinguished point. It is interesting to note that the James-Stein estimator has a certain Bayesian flavor that goes beyond the empirical Bayes character highlighted in the writings of Efron and Morris (1973, etc.): the act of shrinking toward a particular parameter vector suggests that the statistician using this estimator is exercising some form of introspection in determining a good “prior guess” at θ. The Bayesian of course goes further, specifying, a priori, the weight he wishes to place on the prior guess. What results in the latter case is an alternative form of shrinkage, one that leads to a linear combination of X̅ and the prior guess, with weights determined by the prior distribution rather than by the observed data.
Since X̅ is a sufficient statistic for the mean of a multivariate normal distribution with known variance-covariance matrix S, I will henceforth, without loss of generality, take the sample size n to be 1.
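The contrast between the two forms of shrinkage can be illustrated with a minimal sketch. It rests on assumptions not stated above: the covariance matrix is taken to be the identity (a special case of the known-covariance setting with n = 1), the conjugate prior is taken to be θ ~ N(μ, τ²I), and the James-Stein estimator is written in its common textbook form, μ + (1 − (k − 2)/‖X − μ‖²)(X − μ), which may differ in detail from the form given in (2.15). The function names and the parameter tau2 are hypothetical, chosen only for illustration.

```python
import random


def james_stein(x, mu):
    """James-Stein estimate shrinking x toward mu.

    Assumes X ~ N(theta, I) with dimension k >= 3 (identity covariance
    is an assumption of this sketch, not of the chapter).
    """
    k = len(x)
    if k < 3:
        raise ValueError("James-Stein shrinkage requires k >= 3")
    dist2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    # The weight on the observation depends on the data through dist2.
    weight = 1.0 - (k - 2) / dist2
    return [mi + weight * (xi - mi) for xi, mi in zip(x, mu)]


def conjugate_bayes(x, mu, tau2):
    """Posterior mean under the assumed prior theta ~ N(mu, tau2 * I)."""
    # The weight on the observation is fixed by the prior variance alone.
    weight = tau2 / (1.0 + tau2)
    return [mi + weight * (xi - mi) for xi, mi in zip(x, mu)]


def squared_error(estimate, theta):
    """Sum of squared errors between an estimate and the true mean."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))


def simulate(theta, mu, tau2, reps=2000, seed=0):
    """Monte Carlo average squared-error loss of X, James-Stein, and Bayes."""
    rng = random.Random(seed)
    totals = {"X": 0.0, "JS": 0.0, "Bayes": 0.0}
    for _ in range(reps):
        x = [rng.gauss(t, 1.0) for t in theta]
        totals["X"] += squared_error(x, theta)
        totals["JS"] += squared_error(james_stein(x, mu), theta)
        totals["Bayes"] += squared_error(conjugate_bayes(x, mu, tau2), theta)
    return {name: total / reps for name, total in totals.items()}
```

Calling, for example, simulate([0.0] * 10, [0.0] * 10, tau2=1.0) estimates the three risks when the prior guess coincides with the true mean; moving μ away from θ in repeated calls shows how the Bayes estimator's fixed weight and the James-Stein estimator's data-dependent weight respond differently to a poor prior guess.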
Keywords: Prior Distribution, Frequentist Estimator, Multivariate Normal Distribution, Prior Variance, Diagonal Covariance Matrix