Introduction to Mathematical Statistics pp 25-121

# Introduction to Probability Theory

## Abstract

We have already mentioned in the introduction that the axioms of mathematical probability^{1} are to be chosen so that they reflect empirical situations when given an appropriate interpretation. We have seen that mass phenomena can be characterized, in a certain sense, by the empirical probabilities of the events occurring. It is thus desirable to choose the notion of mathematical probability in such a way that the theorems of the mathematical theory yield empirically verifiable facts when the mathematical probability is replaced by the empirical one. We then speak briefly of the frequency interpretation of the mathematical theory. The simplest calculation rules of empirical probability are expressed by 1. and 2. (p. 23). These serve as a model for the axioms of mathematical probability. In this chapter, we will discuss the most important facts of probability theory. However, we should point out at once that our program is not a complete construction of the theory. Since the main emphasis in this book is on the application of probability in mathematical statistics, many of the important theorems in this chapter will be given without proof.

## Keywords

Characteristic Function, Probability Theory, Conditional Probability, Pairwise Disjoint, Independent Random Variable

## References

- 1. We give here a summary of the most important texts on probability theory by means of which the reader can fill in any gaps left here and deepen his knowledge. Bauer, Heinz: Wahrscheinlichkeitstheorie und Grundzüge der Maßtheorie.
- 2. All sets here belong to S even when this is not explicitly stated.
- 3. Th. Bayes, Philos. Trans. Roy. Soc. 53, 376–398 (1763) and 54, 298–310 (1764).
- 4. See p. 5.
- 5. Frequently, *P*(−∞ < ξ < *x*) is defined to be the distribution function of ξ, e.g. by Kolmogorov, l.c. Intro.^{6}
- 6. This is precisely the case when *F* is absolutely continuous, i.e., to each ε > 0 there corresponds a δ > 0 such that for each finite or countably infinite set of pairwise disjoint intervals (x_{i}, y_{i}), Σ (*F*(y_{i}) − *F*(x_{i})) < ε if Σ (y_{i} − x_{i}) < δ.
- 7. Properly, it should be referred to as “a” density; however, when no misunderstanding is likely, here and in similar cases, we apply the definite article.
- 8. Briefly, we usually write R.-N.-density.
- 9. In this case, trivial changes in notation have to be introduced.
- 10. Again this is precisely the case when *F* is absolutely continuous. *f* is the Radon-Nikodym density relative to *n*-dimensional Lebesgue measure.
- 11.
- 12.
- 13. This definition can easily be extended to infinitely many random variables. Cf. the remark following (2.1).
- 14. A better terminology would be marginal distribution of (ξ_{1},...,ξ_{n}) relative to (ξ_{1},...,ξ_{k}), but the expression employed here has established itself in the literature.
- 15. See also Theorem 17.7.
- 16. We will also refer to \( (\sigma_{ij})_{i,j=1,\dots,n} \) as the covariance matrix of P_{ξ}.
- 17. Obviously, each moment of odd order *E*[(ξ − *a*)^{2n+1}], *n* ⩾ 0, of a distribution which is symmetric with respect to *a* vanishes whenever it exists.
- 18. We also say: all versions of P(A ∣ G) differ from each other only on P-null sets.
- 19. For this and related problems, see D. H. Blackwell, Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1954–1955, Vol. II, pp. 1–6, University of California Press, Berkeley and Los Angeles (1956), and D. H. Blackwell and C. Ryll-Nardzewski, Ann. Math. Statist. 34, 223–225 (1963).
- 20. In place of P_{R1}(A_{y} ∣ ξ)(z) we also write P_{R1}(A_{y} ∣ ξ = z). See p. 57 and p. 60.
- 21. For more general investigations see M. Jiřina, Czechosl. Math. J. 4 (79), 372–380 (1954) and Czechosl. Math. J. 9 (84), 445–451 (1959).
- 22. This concept is equivalent to that of the conditional probability given a σ-algebra, as is easy to see.
- 23. This condition can be dispensed with. See E. L. Lehmann, l.c. Not.^{11}, 37–38.
- 24. See II. 12.
- 25. Of course, one cannot manage with the form of Theorem 6.1 given here, but requires a generalization of this result to set functions which are not necessarily non-negative. However, this generalization can easily be obtained from Theorem 6.1.
- 26. More precisely: If ξ and η are r.v.’s and E(ξ^{2}) and E(η^{2}) exist, then (in the notation of 20) \( (E(|\xi\eta| \,|\,))^2 \le E(\xi^2 \,|\,)\, E(\eta^2 \,|\,) \) P-a.e.
- 27. Thus, in somewhat more general formulation, Theorem 21.1 states that if ξ is a r.v. and E(ξ^{2}) exists, then \( E(\xi \,|\,) \) is the orthogonal projection of ξ onto the set of S-measurable functions.
- 28. This theorem is due to P. Lévy: P. Lévy, Calcul des Probabilités, Gauthier-Villars et Cie., Paris, 1925, 166 ff.
- 29. P. Lévy, l.c.^{28}, 195 ff.
- 30. Moreover, it can always be assumed that this function is right-continuous.
- 31. From Lemma 23.2 and Lemma 23.1 one can easily infer Theorem 23.4.
- 32.
- 33. Another example is given e.g. on p. 81.
- 34. Stieltjes, T. J., Nouv. Ann. Math., ser. 3, 9, 479–480 (1890).
- 35. Not all the coefficients in these linear combinations should be zero.
- 36. F. R. Helmert, Zeitschrift für Math. und Physik 21, 192–219 (1876). K. Pearson, Philos. Mag. 50, Ser. 5, 157–175 (1900).
- 37. “Student”, Biometrika 6, 1–25 (1908) (Student is a pseudonym for W. S. Gosset). R. A. Fisher, Biometrika 10, 507–521 (1915).
- 38. This distribution is also named for Snedecor.
- 39. R. A. Fisher, Metron 1, 1–32 (1921).
- 40. K. Pearson, Philos. Trans. Roy. Soc. London, Ser. A 185, 71–110 (1894).
- 41. We will no longer state the intervals over which the densities vanish. The constant C is always to be chosen in such a way that (6.3) holds in each case.
- 42. We can also show this without any calculations: Let p_{1} > p.
- 43. This formula already appears in A. Meyer, Vorlesungen über Wahrscheinlichkeitsrechnung, B. G. Teubner, Leipzig 1879.
- 44. For a thorough treatment of limit theorems see B. V. Gnedenko and A. Kolmogoroff, Limit Distributions for Sums of Independent Random Variables, Cambridge, Mass., 1954.
- 45. If r − l > M′ − M, then \( \binom{M' - M}{r - l} = 0 \).
- 46. A. J. Hincin, C. R. Acad. Sci., Paris 189, 477–479 (1929).
- 47. A systematic treatment of the properties of this notion and of related concepts can be found in E. Lukacs, Stochastic Convergence, Math. Monographs, D. C. Heath, Lexington, Mass. 1968.
- 48. This short proof is due to Borges. Theorem 38.3 as well as Theorem 38.4 remain correct if the stochastic convergence to a real number (or to a k-tuple of real numbers) in the assumption and claim is replaced by stochastic convergence to a r.v. See K. Krickeberg, l.c. Not.^{7}. We will make no use of this fact here.
- 49. See loc. cit.^{44} and P. Lévy, Théorie de l’addition des variables aléatoires, Gauthier-Villars, 2nd Ed., Paris 1954.
- 50. P. Lévy, loc. cit.^{28}, 233 ff. It is not hard to show that the convergence of the sequence (F_{n}) is even uniform in −∞ < x < ∞.
- 51. That is: For every ε > 0 there exists δ > 0 such that \( |\phi_j''(0) - \phi_j''(t)| < \varepsilon \) if \( |t| < \delta \), uniformly for j = 1, 2, ....
- 52. We refer to the fundamental paper of C. G. Esseen, Acta Math. 77, 1–125 (1944) and to generalizations by E. Hlawka, Monatsh. Math. 55, 105–137 (1951).
- 53. A good outline of this problem area is in Ju. V. Linnik, Proc. Fourth Berkeley Sympos. Math. Statist. and Prob., Vol. II, pp. 289–306, Univ. California Press, Berkeley, Calif. (1960).
- 54. See B. L. van der Waerden, Nieuw Arch. Wisk. 18, 40–45 (1936).
- 55. In a quite analogous way one shows that the multinomial distribution (see 37) can be approximated by a (k − 1)-dimensional normal distribution. We mention for the sake of completeness that one can arrive at a multi-dimensional Poisson distribution by means of another passage to the limit. (See p. 92.) More precise and general results of this type can be found in M. Fisz, Studia Math. 14, 272–275 (1954).
- 56. There are many subtle investigations of this question. We mention only W. Feller, Ann. Math. Statist. 16, 319–329 (1945) and Ann. Math. Statist. 21, 301 (1950). See also Ibragimov, I. A. and Linnik, Ju. V., l.c. 52.
- 57. This is an extension of Formula (39.11). See, for example, Lösch-Schoblik, Die Fakultät und verwandte Funktionen, Teubner, Leipzig 1951, 30.
- 58. For this and the following theorems, see H. Cramér, Mathematical Methods of Statistics, Princeton Univ. Press, Princeton 1946. See also E. Lukacs, loc. cit. 47.
- 59. F possesses only countably many discontinuities (see p. 33), whence follows the possibility of such a choice for infinitely many ε > 0 with ε → 0.
- 60. K. Krickeberg, Metrika 10, 179–181 (1966), has pointed out that one can prove the following theorem.
- 61. A detailed presentation is J. A. Shohat and J. D. Tamarkin, The Problem of Moments (Mathematical Surveys, Vol. I), Amer. Math. Soc., New York, 1943 and 1950.
- 62. H. Hamburger, Math. Z. 4, 186–222 (1919); Math. Ann. 81, 31–45, 235–319 (1920); 82, 120–164, 168–187 (1921).
- 63. F. Hausdorff, Math. Z. 9, 74–109 (1921). Also see S. Karlin and L. S. Shapley, Geometry of Moment Spaces, Mem. Amer. Math. Soc. No. 12, Providence 1953.