Introduction to Mathematical Statistics, pp. 268–353

# Theory of Estimation


## Abstract

We first superficially sketch the problem we
will treat in this chapter. In the previous chapter we dealt with the
question of how one can acquire more precise information on the value
of an unknown parameter on the basis of a sample. Although one tries
to construct confidence sets which are “as small as possible”, one cannot
be guided in such a construction by the idea of “exactly” determining
the parameter. To work out this concept is the goal of the theory
of estimation. If \((R, S)\) is a sample space and *Γ* a set of parameters
of a class of probability measures \(P_\Gamma\) over \((R, S)\), then one seeks a map
*h* of *R* into *Γ* such that *h*(x) for a sample x ∈ R is “approximately” equal
to the true parameter value. We are primarily concerned with the case
in which *Γ* is a subset of \(R_1\) or where we have to estimate a mapping *d*
from *Γ* into \(R_1\). For the sake of a simplified formulation we agree that:
*Γ will always be a non-empty set of parameters of a class of probability
measures and d a map from Γ into* \(R_1\), *unless something else is specifically
said. Further conditions can also be imposed on Γ as well as on d.*
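To make the setting just described concrete, here is a minimal sketch (not part of the original text, and entirely illustrative in its names and parameters): the sample space is \(R = R^n\), the parameter set Γ is a subset of \(R_1\) (here the mean of a normal distribution), and the map h is the sample mean, which for growing sample size approaches the true parameter value.

```python
import random
import statistics

def estimate_mean(sample):
    """Estimator h: maps a sample x in R^n to a point of the
    parameter set Gamma (here a subset of the real line R_1)."""
    return statistics.fmean(sample)

# Illustrative simulation: the true parameter is gamma = 2.0; draw
# samples of increasing size n and observe h(x) approach gamma.
random.seed(0)
gamma = 2.0
for n in (10, 100, 10000):
    sample = [random.gauss(gamma, 1.0) for _ in range(n)]
    print(n, estimate_mean(sample))
```

The sample mean is of course only one choice of h; the chapter's concern is precisely which such maps deserve to be called good estimates.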

## Keywords

Probability Measure · Unbiased Estimate · Sample Space · Asymptotic Variance · Conditional Density

## References

- 1. This concept was given its first clear treatment in F. N. David and J. Neyman, Statist. Res. Mem. Univ. London 2, 105–116 (1938).
- 2. See also Theorem 1 in H. Teicher, Ann. Math. Statist. 34, 1265–1269 (1963).
- 3. Due to A. N. Kolmogorov, Izv. Akad. Nauk SSSR Ser. Mat. 14, 303–326 (1950).
- 4. This result is essentially due to C. R. Rao, Sankhya 12, 27–42 (1952).
- 5. See L. Schmetterer, Ann. Math. Statist. 31, 1154–1163 (1960) and Publ. Math. Inst. Hungar. Acad. Sci. Ser. A 6, 295–300 (1961). See also A. Kramli, Studia Sci. Math. Hung. 2, 159–161 (1967).
- 6. See p. 273.
- 7. See L. Schmetterer, Mitteilungsbl. Math. Statist. 9, 147–152 (1957).
- 8. See E. W. Barankin, Ann. Math. Statist. 20, 477–501 (1949) and L. Schmetterer, loc. cit. 5.
- 9. R. R. Bahadur, Sankhya 18, 211–224 (1957).
- 10. R. R. Bahadur, loc. cit. 9 and L. Schmetterer, loc. cit. 5.
- 11. D. Blackwell, Ann. Math. Statist. 18, 105–110 (1947). See also A. N. Kolmogorov, loc. cit. 3.
- 12. For this terminology see III, p. 206.
- 13. Here and in the following lines we have occasionally suppressed the reference to γ.
- 14. E. L. Lehmann and H. Scheffe, Sankhya 10, 305–340 (1950).
- 15. More precisely, we are speaking here of absolute moments. Thus, we consider \( E(|h - d(\gamma)|^p; \gamma) \).
- 16. See also p. 477.
- 17. See L. Schmetterer, Abh. Deutsch. Akad. Wiss. Berlin, Kl. Math. Phys. Tech. 1964, Nr. 4, 117–120 and J. Roy and I. M. Chakravarti, Ann. Math. Statist. 31, 392–398 (1960).
- 18. From the sampling theory standpoint the case in which all components of γ appear in \(x_\gamma^{(i)}\) is trivial.
- 19. The definition of the \(c_j\), 1 ⩽ j ⩽ N, on the elements of R that are different from \(x_\gamma^{(i)}\) is of no consequence.
- 20. We naturally assume that this set is non-empty.
- 21. Theorem 1.1 referred, however, to the totality of all unbiased estimates and not just to the linear estimates. The theorem holds trivially for linear estimates h if one limits the class of estimates of zero to the linear ones.
- 22. See M. Frechet, Rev. Inst. Internat. Statist. 11, 182–205 (1943), C. R. Rao, Bull. Calcutta Math. Soc. 37, 81–91 (1945) and H. Cramer, Skand. Aktuarietidskr. 29, 85–94 (1946).
- 23. One can replace this assumption by the requirement that f(x, γ) = 0, for each γ, on a set independent of γ.
- 24. This result is of course related to I, Theorem 17.1.
- 25. R. R. Bahadur, loc. cit. 9.
- 26.
- 27. If Γ ⊆ \(R_1\) and d(γ) = γ for all γ ∈ Γ, then we also say that \(h_n\) is consistent for γ ∈ Γ.
- 28. See also III, p. 240ff.
- 29. We do not exclude the possibility that N(ε) depends on γ.
- 30. L. Le Cam and L. Schwartz, Ann. Math. Statist. 31, 140–150 (1960). See also J. L. Doob, Colloques internationaux du Centre National de la Recherche Scientifique, no. 13, 23–27, Centre National de la Recherche Scientifique, Paris 1949.
- 31. R. A. Fisher, Messenger of Math. 41, 150–160 (1912).
- 32. For the following one can replace the assumption f(x, γ) ≠ 0 for all x ∈ \(R_1\) by f(x, γ) ≠ 0 for all x ∈ \(R_1\) up to a μ-null set without difficulty. Moreover, \(R_1\) can also be replaced by an arbitrary Borel set M not depending on γ, i.e., it is sufficient to require f(x, γ) ≠ 0 for each γ ∈ Γ and all x ∈ M and f(x, γ) = 0 for each γ ∈ Γ and all x ∈ \(R_1\) − M.
- 33. One can again allow exceptional sets of μ-measure zero here.
- 34. N(δ, ε) may also depend on γ but we suppress this. This will pertain to analogous statements in the following proof.
- 35. “0” is here the *k*-dimensional zero vector.
- 36. This proof is somewhat related to that of H. Cramer, loc. cit. I, 58, 500ff. For the case considered here of a multi-dimensional parameter, K. C. Chanda, Biometrika 41, 56–61 (1954) has carried out Cramer’s proof in detail.
- 37. We have modified here and in the sequel an argument of H. Hornich, Monatsh. Math. 54, 130–134 (1950).
- 38. See V. S. Huzurbazar, Ann. Eugenics 14, 185–200 (1948).
- 39. A more precise formulation can be given with the help of the statement of Theorem 3.8.
- 40. See L. Le Cam and Ch. Kraft, Ann. Math. Statist. 27, 1174–1177 (1956) and also R. R. Bahadur, Sankhya 20, 207–210 (1958).
- 41. See p. 192. One can for example choose the set of all k-tuples in C with rational components. Note that only the existence of a dense set in C is used in the proof and not the compactness of C.
- 42. Cf. H. Richter, Math. Ann. 150, 85–90 and 440–441 (1963), M. Sion, Trans. Amer. Math. Soc. 96, 237–246 (1960).
- 43. A. Wald, Ann. Math. Statist. 20, 595–601 (1949), J. Wolfowitz, Ann. Math. Statist. 20, 601–602 (1949). See also J. L. Doob, Trans. Amer. Math. Soc. 36, 759–775 (1934).
- 44. Γ can in principle be any open set.
- 45. It is enough to require that the mapping be continuous for all x up to a μ-null set.
- 46. For each γ ∈ Γ there is a sufficiently small ρ such that \(K_\rho(\gamma) \subset \Gamma\).
- 47. We do not exclude the possibility that n(ε, δ) depends on γ.
- 48. One sees immediately that these assumptions can be generalized: Γ can be an arbitrary open set and Γ✶ an open, bounded subset of Γ with closure belonging to Γ. The closure is the smallest closed set containing Γ✶, i.e., the intersection of all closed subsets of \(R_k\) containing Γ✶.
- 49. See, f.i., J. Pfanzagl, Metrika 14, 249–272 (1969).
- 50. A careful analysis of the assumptions used to prove asymptotic normality of ML estimates is given by L. Le Cam, Ann. Math. Statist. 41, 802–828 (1970). See also P. J. Huber, Proc. 5th Berkeley Sympos. Math. Stat. Probab., Vol. I, 221–233 (1967).
- 51. All these statements are related to probabilities w.r.t. γ.
- 52. 0 represents here the *k*-dimensional null vector.
- 53. Actually, we have suppressed a step here since \((I + C_n(\xi_{(n)}))^{-1}\) is only defined with probability arbitrarily close to 1 for sufficiently large n.
- 54. See 48 for remarks on this formulation.
- 55. This follows from the additional continuity assumption and (3.3). Thus, γ → |E(A(ξ_γ); γ)| possesses a positive lower bound in \(\Gamma_0\).
- 56. L. Le Cam, Univ. California Publ. Statist. 1, 277–329 (1953) and loc. cit. 50.
- 57. For the notation see Theorem 3.6.
- 58. Cf. L. Le Cam, loc. cit. 56. See also P. J. Bickel and J. A. Yahav, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 11, 257–276 (1969) and the references given there.
- 59. See I, p. 59. The map γ → f(x, γ) is thus defined up to a null set (which does not depend on x).
- 60. Possibly with the exception of *L*-null sets.
- 61. We use the notation of Theorem 3.1 although the context is somewhat different.
- 62. \(K_\rho(\gamma)\) has the same meaning as in Theorem 3.6.
- 63. We allow the possibility that the expectation on the left is +∞.
- 64.
- 65. See IV, p. 265.
- 66. L. Le Cam, loc. cit. 56.
- 67. For the notation see p. 296. Also cf. 61.
- 68. B. van der Waerden, Ber. Verh. sachs. Akad. Leipzig, Math.-Phys. Kl. 87, 353–364 (1935).
- 69. See appendix, p. 480ff.
- 70. The theory of Bayes estimates has been carefully studied recently. We only mention: M. H. de Groot and M. M. Rao, Ann. Math. Statist. 34, 598–611 (1963) and L. Schwartz, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 4, 10–26 (1965).
- 71. L. Le Cam, loc. cit. 56.
- 72. Consistent, Asymptotically Normal.
- 73. See, for example, R. A. Fisher, Philos. Trans. Roy. Soc. London Ser. A 222, 309–368 (1922) and R. A. Fisher, Proc. Cambridge Philos. Soc. 22, 700–725 (1925).
- 74. L. Le Cam, loc. cit. 56.
- 75. See in connection with the entire question C. R. Rao, J. Roy. Statist. Soc. Ser. B 24, 46–72 (1962); Proc. Fourth Berkeley Sympos. Math. Statist. and Prob., Vol. I, pp. 531–545, Univ. California Press, Berkeley, Calif. (1960); and Sankhya 24, Ser. A, 73–101 (1962), as well as Sankhya 25, Ser. A, 189–206 (1963).
- 76. J. Neyman, Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, pp. 239–273 (1949), University of California Press, Berkeley and Los Angeles.
- 77. R. R. Bahadur, Sankhya 22, 229–253 (1960).
- 78. See D. Basu, Sankhya 17, 193–196 (1956) as well as E. L. Lehmann, Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability 1949, pp. 451–457, University of California Press, Berkeley and Los Angeles, where, in a somewhat different connection, a similar example is considered.
- 79. See L. Schmetterer, Research Papers in Statistics (Neyman Festschrift, 301–317), John Wiley, New York 1966.