Theory of Estimation

Part of the Die Grundlehren der mathematischen Wissenschaften book series (GL, volume 202)


We first sketch briefly the problem treated in this chapter. In the previous chapter we dealt with the question of how one can acquire more precise information on the value of an unknown parameter on the basis of a sample. Although one tries to construct confidence sets which are “as small as possible”, one cannot be guided in such a construction by the idea of “exactly” determining the parameter. To work out this idea is the goal of the theory of estimation. If \( (R,S) \) is a sample space and Γ a set of parameters of a class of probability measures P_Γ over \( (R,S) \), then one seeks a map h of R into Γ such that h(χ) for a sample χ∈R is “approximately” equal to the true parameter value. We are primarily concerned with the case in which Γ is a subset of R1, or in which we have to estimate a mapping d from Γ into R1. For the sake of simplified formulation we agree that Γ will always be a non-empty set of parameters of a class of probability measures and d a map from Γ into R1, unless something else is specifically said. Further conditions can also be imposed on Γ as well as on d.
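The setup above can be illustrated with a small simulation sketch (not part of the original text; the distribution, parameter value, and estimator are chosen purely for illustration). Here the parameter set Γ is a subset of R1, the sample is a point of R = R^n, and the estimator h maps a sample to an approximate parameter value:

```python
import random
import statistics

def h(sample):
    """An estimator h: R^n -> Gamma. As an illustrative choice we take
    the sample mean, used to estimate the unknown mean gamma of a
    normal distribution with known variance 1."""
    return statistics.mean(sample)

# In a simulation the "true" parameter value is known to us,
# so we can check how close h(sample) comes to it.
random.seed(0)
gamma_true = 2.5  # hypothetical true parameter value
sample = [random.gauss(gamma_true, 1.0) for _ in range(10_000)]

estimate = h(sample)
# For this estimator, h(sample) should be close to gamma_true
# when the sample size is large.
print(abs(estimate - gamma_true) < 0.05)
```

The choice of the sample mean here is only one possible h; the theory developed in this chapter is concerned with what "close" should mean and which maps h deserve to be called good estimates.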


Keywords: Probability Measure, Unbiased Estimate, Sample Space, Asymptotic Variance, Conditional Density




  1. This concept was given its first clear treatment in F. N. David and J. Neyman, Statist. Res. Mem. Univ. London 2, 105–116 (1938).
  2. See also Theorem 1 in H. Teicher, Ann. Math. Statist. 34, 1265–1269 (1963).
  3. Due to A. N. Kolmogorov, Izv. Akad. Nauk SSSR Ser. Mat. 14, 303–326 (1950).
  4. This result is essentially due to C. R. Rao, Sankhya 12, 27–42 (1952).
  5. See L. Schmetterer, Ann. Math. Statist. 31, 1154–1163 (1960) and Publ. Math. Inst. Hungar. Acad. Sci. Ser. A 6, 295–300 (1961). See also A. Kramli, Studia Sci. Math. Hung. 2, 159–161 (1967).
  6. See p. 273.
  7. See L. Schmetterer, Mitteilungsbl. Math. Statist. 9, 147–152 (1957).
  8. See E. W. Barankin, Ann. Math. Statist. 20, 477–501 (1949) and L. Schmetterer, loc. cit. 5.
  9. R. R. Bahadur, Sankhya 18, 211–224 (1957).
  10. R. R. Bahadur, loc. cit. 9 and L. Schmetterer, loc. cit. 5.
  11. D. Blackwell, Ann. Math. Statist. 18, 105–110 (1947). See also A. N. Kolmogorov, loc. cit. 3.
  12. For this terminology see III, p. 206.
  13. Here and in the following lines we have occasionally suppressed the reference to γ.
  14. E. L. Lehmann and H. Scheffé, Sankhya 10, 305–340 (1950).
  15. More precisely, we are speaking here of absolute moments. Thus, we consider \( E(|h - d(\gamma ){|^p};\gamma ) \).
  16. See also p. 477.
  17. See L. Schmetterer, Abh. Deutsch. Akad. Wiss. Berlin, Kl. Math. Phys. Tech. 1964, Nr. 4, 117–120 and J. Roy and I. M. Chakravarti, Ann. Math. Statist. 31, 392–398 (1960).
  18. From the sampling theory standpoint the case in which all components of γ appear in xγ(i) is trivial.
  19. The definition of the cj, 1⩽j⩽N, on the elements of R that are different from xγ(i) is of no consequence.
  20. We naturally assume that this set is non-empty.
  21. Theorem 1.1 referred, however, to the totality of all unbiased estimates and not just to the linear estimates. The theorem holds trivially for linear estimates h if one limits the class of estimates of zero to the linear ones.
  22. See M. Fréchet, Rev. Inst. Internat. Statist. 11, 182–205 (1943), C. R. Rao, Bull. Calcutta Math. Soc. 37, 81–91 (1945) and H. Cramér, Skand. Aktuarietidskr. 29, 85–94 (1946).
  23. One can replace this assumption by the requirement that f(x,γ)=0 for each x in a set independent of γ.
  24. This result is of course related to I, Theorem 17.1.
  25. R. R. Bahadur, loc. cit. 9.
  26. R. R. Bahadur, loc. cit. 9 and L. Schmetterer, loc. cit. 5.
  27. If Γ ⊆ R1 and d(γ) = γ for all γ∈Γ, then we also say that hn is consistent for γ∈Γ.
  28. See also III, p. 240ff.
  29. We do not exclude the possibility that N(ε) depends on γ.
  30. L. Le Cam and L. Schwartz, Ann. Math. Statist. 31, 140–150 (1960). See also J. L. Doob, Colloques internationaux du Centre National de la Recherche Scientifique, no. 13, 23–27, Centre National de la Recherche Scientifique, Paris 1949.
  31. R. A. Fisher, Messenger of Math. 41, 150–160 (1912).
  32. For the following one can replace the assumption f(x,γ)≠0 for all x∈R1 by f(x,γ)≠0 for all x∈R1 up to a μ-null set without difficulty. Moreover, R1 can also be replaced by an arbitrary Borel set M not depending on γ, i.e., it is sufficient to require f(x,γ)≠0 for each γ∈Γ and all x∈M and f(x,γ)=0 for each γ∈Γ and all x∈R1−M.
  33. One can again allow exceptional sets of μ-measure zero here.
  34. N(δ,ε) may also depend on γ but we suppress this. This will pertain to analogous statements in the following proof.
  35. “0” is here the k-dimensional zero vector.
  36. This proof is somewhat related to that of H. Cramér, loc. cit. I, 58, 500ff. For the case considered here of a multi-dimensional parameter, K. C. Chanda, Biometrika 41, 56–61 (1954) has carried out Cramér’s proof in detail.
  37. We have modified here and in the sequel an argument of H. Hornich, Monatsh. Math. 54, 130–134 (1950).
  38. See V. S. Huzurbazar, Ann. Eugenics 14, 185–200 (1948).
  39. A more precise formulation can be given with the help of the statement of Theorem 3.8.
  40. See L. Le Cam and Ch. Kraft, Ann. Math. Statist. 27, 1174–1177 (1956) and also R. R. Bahadur, Sankhya 20, 207–210 (1958).
  41. See p. 192. One can for example choose the set of all k-tuples in C with rational components. Note that only the existence of a dense set in C is used in the proof and not the compactness of C.
  42. Cf. H. Richter, Math. Ann. 150, 85–90 and 440–441 (1963), M. Sion, Trans. Amer. Math. Soc. 96, 237–246 (1960).
  43. A. Wald, Ann. Math. Statist. 20, 595–601 (1949), J. Wolfowitz, Ann. Math. Statist. 20, 601–602 (1949). See also J. L. Doob, Trans. Amer. Math. Soc. 36, 759–775 (1934).
  44. Γ can in principle be any open set.
  45. It is enough to require that the mapping be continuous for all x up to a μ-null set.
  46. For each γ∈Γ there is a sufficiently small ρ such that Kρ(γ)⊂Γ.
  47. We do not exclude the possibility that n(ε, δ) depends on γ.
  48. One sees immediately that these assumptions can be generalized: Γ can be an arbitrary open set and Γ✶ an open, bounded subset of Γ with closure belonging to Γ. The closure is the smallest closed set containing Γ✶, i.e., the intersection of all closed subsets of Rk containing Γ✶.
  49. See, e.g., J. Pfanzagl, Metrika 14, 249–272 (1969).
  50. A careful analysis of the assumptions used to prove asymptotic normality of ML estimates is given by L. Le Cam, Ann. Math. Statist. 41, 802–828 (1970). See also P. J. Huber, Proc. 5th Berkeley Sympos. Math. Stat. Probab., Vol. I, 221–233 (1967).
  51. All these statements are related to probabilities w.r.t. γ.
  52. 0 represents here the k-dimensional null vector.
  53. Actually, we have suppressed a step here since (I + Cn(n))^{-1} is only defined with probability arbitrarily close to 1 for sufficiently large n.
  54. See 48 for remarks on this formulation.
  55. This follows from the additional continuity assumption and (3.3). Thus, γ→|E(A(ξγ);γ)| possesses a positive lower bound in Γ0.
  56. L. Le Cam, Univ. California Publ. Statist. 1, 277–329 (1953) and loc. cit. 50.
  57. For the notation see Theorem 3.6.
  58. Cf. L. Le Cam, loc. cit. 56. See also P. J. Bickel and J. A. Yahav, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 11, 257–276 (1969) and the references given there.
  59. See I, p. 59. The map γ→f(x,γ) is thus defined up to a null set (which does not depend on x).
  60. Possibly with the exception of L-null sets.
  61. We use the notation of Theorem 3.1 although the context is somewhat different.
  62. Kρ(γ) has the same meaning as in Theorem 3.6.
  63. We allow the possibility that the expectation on the left is +∞.
  64. For the meaning of Pγ0,∞ see the definition on p. 314.
  65. See IV, p. 265.
  66. L. Le Cam, loc. cit. 56.
  67. For the notation see p. 296. Also cf. 61.
  68. B. van der Waerden, Ber. Verh. Sächs. Akad. Leipzig, Math.-Phys. Kl. 87, 353–364 (1935).
  69. See appendix, p. 480ff.
  70. The theory of Bayes estimates has been carefully studied recently. We only mention: M. H. de Groot and M. M. Rao, Ann. Math. Statist. 34, 598–611 (1963) and L. Schwartz, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4, 10–26 (1965).
  71. L. Le Cam, loc. cit. 56.
  72. Consistent, Asymptotically Normal.
  73. See, for example, R. A. Fisher, Philos. Trans. Roy. Soc. London Ser. A 222, 309–368 (1922) and R. A. Fisher, Proc. Cambridge Philos. Soc. 22, 700–725 (1925).
  74. L. Le Cam, loc. cit. 56.
  75. See in connection with the entire question C. R. Rao, J. Roy. Statist. Soc. Ser. B 24, 46–72 (1962); Proc. Fourth Berkeley Sympos. Math. Statist. and Prob., Vol. I, pp. 531–545, Univ. California Press, Berkeley, Calif. (1960); Sankhya 24, Ser. A, 73–101 (1962); as well as Sankhya 25, Ser. A, 189–206 (1963).
  76. J. Neyman, Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, pp. 239–273 (1949), University of California Press, Berkeley and Los Angeles.
  77. R. R. Bahadur, Sankhya 22, 229–253 (1960).
  78. See D. Basu, Sankhya 17, 193–196 (1956) as well as E. L. Lehmann, Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability 1949, pp. 451–457, University of California Press, Berkeley and Los Angeles, where, in a somewhat different connection, a similar example is considered.
  79. See L. Schmetterer, Research Papers in Statistics (Neyman Festschrift), 301–317, John Wiley, New York 1966.

Copyright information

© Springer-Verlag Berlin · Heidelberg 1974

Authors and Affiliations

  1. University of Vienna, Austria