
Density Prediction and the Stein Phenomenon


The Stein phenomenon is a path-breaking discovery in mathematical statistics of the last century. A large number of researchers followed in Stein’s footsteps and developed a wide variety of minimax shrinkage point estimators of a multivariate normal mean vector, each dominating the sample mean. More recently, the problem has resurfaced, this time in the form of minimax shrinkage predictive density estimation, illustrating the Stein phenomenon once again. In this review paper, we discuss parallel developments for normal and Poisson distributions under the Kullback-Leibler and more general divergence losses.
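As a rough illustration of the point-estimation side of the Stein phenomenon mentioned above, the following sketch compares the Monte Carlo squared-error risk of the usual estimator with that of the classical James-Stein shrinkage estimator. It assumes a single observation X ~ N_p(θ, I) with known identity covariance and p ≥ 3; the function names, the choice of θ, and the number of replications are hypothetical and are not taken from the paper. It does not cover the predictive density results that the review discusses.

```python
import numpy as np


def james_stein(x):
    """Classical James-Stein estimate of theta from X ~ N_p(theta, I), p >= 3.

    Shrinks each observation vector toward the origin by the factor
    1 - (p - 2) / ||x||^2 (no positive-part truncation is applied).
    """
    p = x.shape[-1]
    norm_sq = np.sum(x ** 2, axis=-1, keepdims=True)
    return (1.0 - (p - 2) / norm_sq) * x


def simulate_risks(theta, n_rep=20000, seed=0):
    """Monte Carlo estimates of squared-error risk for X itself and James-Stein."""
    rng = np.random.default_rng(seed)
    p = theta.shape[0]
    # Each row is one independent draw of X ~ N_p(theta, I).
    x = theta + rng.standard_normal((n_rep, p))
    risk_mle = np.mean(np.sum((x - theta) ** 2, axis=1))
    risk_js = np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1))
    return risk_mle, risk_js


if __name__ == "__main__":
    theta = np.full(10, 0.5)  # hypothetical true mean vector, p = 10
    risk_mle, risk_js = simulate_risks(theta)
    print(f"risk of X:           {risk_mle:.3f}")
    print(f"risk of James-Stein: {risk_js:.3f}")
```

Under these assumptions the estimated risk of X should be close to p, while the James-Stein risk should be smaller. The gap narrows as the norm of θ grows.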





The authors are grateful to the Associate Editor and referees for their valuable comments and helpful suggestions. Research of the second author was supported in part by Grant-in-Aid for Scientific Research (18K11188 and 15H01943) from Japan Society for the Promotion of Science.

Author information



Corresponding author

Correspondence to Malay Ghosh.


About this article


Cite this article

Ghosh, M., Kubokawa, T. & Datta, G.S. Density Prediction and the Stein Phenomenon. Sankhya A 82, 330–352 (2020).



Keywords

  • Divergence loss
  • Dominance property
  • Empirical Bayes
  • Hellinger-Bhattacharyya divergence
  • Kullback-Leibler divergence
  • Minimaxity
  • Normal distribution
  • Poisson distribution
  • Risk function
  • Shrinkage estimator
  • Simultaneous estimation
  • Superharmonic

AMS (2000) subject classification

  • Primary 62C20
  • Secondary 62F10