Introduction to Probability Theory

Schmetterer, Leopold

doi:10.1007/978-3-642-65542-5_3

Leopold Schmetterer²

Part of the book series: Die Grundlehren der mathematischen Wissenschaften (GL, volume 202)

Abstract

We have already mentioned in the introduction that the axioms of mathematical probability₁ are to be so chosen that they reflect empirical situations when given an appropriate interpretation. We have seen that a characterization of mass phenomena can be given in a certain sense by the empirical probabilities of the events occurring. It is thus desirable to choose the notion of mathematical probability in such a way that the theorems of the mathematical theory yield empirically verifiable facts if the mathematical probability is replaced by the empirical. We then speak briefly of the frequency interpretation of the mathematical theory. The simplest calculation rules of empirical probability are expressed by 1. and 2. (p. 23). These serve as a model for the axioms of mathematical probability. In this chapter, we will discuss the most important facts of probability theory. However, we should point out at once that our program is not a complete construction of the theory. Since the main emphasis in this book is on the application of probability in mathematical statistics, many of the important theorems in this chapter will be given without proof.
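
The frequency interpretation described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: rules 1. and 2. from p. 23 are not reproduced on this page, so the code assumes they are the usual ones (a relative frequency lies between 0 and 1 and equals 1 for the sure event, and relative frequencies add over mutually exclusive events). The die experiment, the events `A` and `B`, and the helper `relative_frequency` are hypothetical choices made for illustration.

```python
import random


def relative_frequency(outcomes, event):
    """Share of trials whose outcome lies in `event` (a set of outcomes)."""
    return sum(1 for x in outcomes if x in event) / len(outcomes)


rng = random.Random(0)  # fixed seed so the run is reproducible
rolls = [rng.randint(1, 6) for _ in range(10_000)]  # simulated fair die

A = {1, 2}  # hypothetical event: the roll is 1 or 2
B = {6}     # hypothetical event, mutually exclusive with A
SURE = {1, 2, 3, 4, 5, 6}  # the sure event

h_A = relative_frequency(rolls, A)
h_B = relative_frequency(rolls, B)
h_union = relative_frequency(rolls, A | B)
h_sure = relative_frequency(rolls, SURE)

# Assumed rule 1: frequencies lie in [0, 1]; the sure event has frequency 1.
print(0.0 <= h_A <= 1.0, h_sure == 1.0)
# Assumed rule 2: frequencies add over mutually exclusive events.
print(abs(h_union - (h_A + h_B)) < 1e-12)
```

Both rules hold exactly for any finite run of trials, which is why they are natural models for axioms. The closeness of `h_A` to 1/3 is, by contrast, only an empirical regularity of long runs.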

References

We give here a summary of the most important texts on probability theory by means of which the reader can fill in any gaps left here and deepen his knowledge. Bauer, Heinz: Wahrscheinlichkeitstheorie und Grundzüge der Maßtheorie.
All sets here belong to S even when this is not explicitly stated.
Th. Bayes, Philos. Trans. Roy. Soc. 53, 376–398 (1763) and 54, 298–310 (1764).
See p. 5.
Frequently, P(-∞< ξ <x) is defined to be the distribution function of ξ, e.g. by Kolmogorov, l.c. Intro.⁶
This is precisely the case when F is absolutely continuous, i.e., to each ε>0 there corresponds a δ>0 such that for each finite or countably infinite set of pairwise disjoint intervals (x_i,y_i), \( \sum_i (F(y_i)-F(x_i)) < \varepsilon \) if \( \sum_i (y_i-x_i) < \delta \).
Properly, it should be referred to as “a” density; however, when no misunderstanding is likely—here and in similar cases—we apply the definite article.
Briefly, we usually write R.-N.-density.
In this case, trivial changes in notation have to be introduced.
Again this is precisely the case when F is absolutely continuous. f is the Radon-Nikodym density relative to n-dimensional Lebesgue measure.
Naturally, L(B) denotes the Lebesgue measure of B.
It is convenient to agree also to write g(ξ) or g(ξ₁,...,ξ_n)
This definition can easily be extended to infinitely many random variables. Cf. the remark following (2.1).
A better terminology would be marginal distribution of (ξ₁,...,ξ_n) relative to (ξ₁,...,ξ_k), but the expression employed here has established itself in the literature.
See also Theorem 17.7.
We will also refer to \( (\sigma_{ij})_{i,j=1,\dots,n} \) as the covariance matrix of P_ξ.
Obviously, each moment of odd order E[(ξ-a)²ⁿ⁺¹],n⩾0, of a distribution which is symmetric with respect to a vanishes whenever it exists.
We also say: all versions of P(A ∣ G) differ from each other only on P-null sets.
See for this and related problems D. H. Blackwell, Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1954–1955 Vol. II, pp. 1–6, University of California Press, Berkeley and Los Angeles (1956) and D.H. Blackwell and C. Ryll-Nardzewski, Ann. Math. Statist. 34, 223–225 (1963).
In place of \( P_{R_1}(A_y \mid \xi)(z) \) we also write \( P_{R_1}(A_y \mid \xi = z) \). See p. 57 and p. 60.
For more general investigations see M. Jiřina, Czechosl. Math. J. 4 (79), 372–380 (1954) and Czechosl. Math. J. 9 (84), 445–451 (1959).
This concept is equivalent to that of the conditional probability given a σ-algebra, as is easy to see.
This condition can be dispensed with. See E.L. Lehmann, l.c. Not.¹¹, 37–38.
See II. 12.
Of course, one cannot manage with the form of Theorem 6.1 given here, but requires a generalization of this result to set functions which are not necessarily non-negative. However, this generalization can easily be obtained from Theorem 6.1.
More precisely: If ξ and η are r.v.’s and E(ξ²) and E(η²) exist, then (in the notation of 20) \( (E(|\xi\eta| \mid S))^2 \le E(\xi^2 \mid S)\,E(\eta^2 \mid S) \) P-a.e.
Thus, in somewhat more general formulation, Theorem 21.1 states that if ξ is a r.v. and E(ξ²) exists, then \( E(\xi \mid S) \) is the orthogonal projection of ξ onto the set of S-measurable functions.
This theorem is due to P. Lévy: P. Lévy, Calcul des Probabilités, Gauthier-Villars et Cie., Paris, 1925, 166 ff.
P. Lévy, l.c.²⁸, 195 ff.
Moreover, it can always be assumed that this function is right-continuous.
From Lemma 23.2 and Lemma 23.1 one can easily infer Theorem 23.4.
These exist for all i, j by Theorem 17.3.
Another example is given e.g. on p. 81.
Stieltjes, T.J. Nouv. Ann. Math., ser. 3, 9, 479–480 (1890).
Not all the coefficients in these linear combinations should be zero.
F. R. Helmert, Zeitschrift für Math. und Physik 21, 192–219 (1876). K. Pearson, Philos. Mag. 50, Ser. 5, 157–175 (1900).
“Student”, Biometrika 6, 1–25 (1908), (Student is a pseudonym for W.S. Gosset). R. A. Fisher, Biometrika 10, 507–521 (1915).
This distribution is also named for Snedecor.
R.A. Fisher, Metron 1, 1–32 (1921).
K. Pearson, Philos. Trans. Roy. Soc. London, Ser. A 185, 71–110 (1894).
We will no longer state the intervals over which the densities vanish. The constant C is always to be chosen in such a way that (6.3) holds in each case.
We can also show this without any calculations: Let p₁ >p.
This formula already appears in A. Meyer, Vorlesungen über Wahrscheinlichkeitsrechnung, B.G. Teubner, Leipzig 1879.
For a thorough treatment of limit theorems see B. V. Gnedenko and A. Kolmogoroff, Limit Distributions for Sums of Independent Random Variables, Cambridge, Mass., 1954.
If r-l>M¹-M, then \( \binom{M^1 - M}{r - l} = 0 \).
A. J. Hincin, C. R. Acad. Sci., Paris 189, 477–479 (1929).
A systematic treatment of the properties of this notion and of related concepts can be found in E. Lukacs, Stochastic Convergence, Math. Monographs, D. C. Heath, Lexington, Mass. 1968.
This short proof is due to Borges. Theorem 38.3 as well as Theorem 38.4 remain correct if the stochastic convergence to a real number (or to a k-tuple of real numbers) in the assumption and claim is replaced by stochastic convergence to a r.v. See K. Krickeberg, l.c. Not.⁷. We will make no use of this fact here.
See loc. cit.⁴⁴ and P. Lévy, Théorie de l’addition des variables aléatoires, Gauthier-Villars, 2nd ed., Paris 1954.
P. Lévy, loc. cit.²⁸, 233 ff. It is not hard to show that the convergence of the sequence (F_n) is even uniform in −∞ < x < ∞.
That is: For every ε>0 there exists δ>0 such that \( |\phi_j''(0) - \phi_j''(t)| < \varepsilon \) if \( |t| < \delta \), uniformly for j = 1, 2, ....
We refer to the fundamental paper of C.G. Esseen, Acta Math. 77, 1–125 (1944) and to generalizations by E. Hlawka, Monatsh. Math. 55, 105–137 (1951).
A good outline of this problem area is in Ju. V. Linnik, Proc. Fourth Berkeley Sympos. Math. Statist. and Prob., Vol. II, pp. 289–306. Univ. California Press, Berkeley, Calif. (1960).
See B.L. van der Waerden, Nieuw Arch. Wisk. 18, 40–45 (1936).
In a quite analogous way one shows that the multinomial distribution (see 37) can be approximated by a (k − 1)-dimensional normal distribution. We mention for the sake of completeness that one can arrive at a multi-dimensional Poisson distribution by means of another passage to the limit. (See p. 92.) More precise and general results of this type can be found in M. Fisz, Studia Math. 14, 272–275 (1954).
There are many subtle investigations of this question. We mention only W. Feller, Ann. Math. Statist. 16, 319–329 (1945) and Ann. Math. Statist. 21, 301 (1950). See also Ibragimov, I. A. and Linnik, Ju. V., l.c. 52.
This is an extension of Formula (39.11). See, for example, Lösch-Schoblik, Die Fakultät und verwandte Funktionen, Teubner, Leipzig 1951, 30.
For this and the following theorems, see H. Cramér, Mathematical Methods of Statistics, Princeton Univ. Press, Princeton 1946. See also E. Lukacs loc. cit. 47.
F possesses only countably many discontinuities (see p. 33), whence follows the possibility of such a choice for infinitely many ε>0 with ε→0.
K. Krickeberg, Metrika 10, 179–181 (1966), has pointed out that one can prove the following theorem.
A detailed presentation is J.A. Shohat and J.D. Tamarkin, The Problem of Moments (Mathematical Surveys, Vol. I), Amer. Math. Soc., New York: 1943 and 1950.
H. Hamburger, Math. Z. 4, 186–222 (1919), Math. Ann. 81, 31–45, 235–319 (1920); 82, 120–164, 168–187 (1921).
F. Hausdorff, Math. Z. 9, 74–109 (1921). Also see S. Karlin and L.S. Shapley, Geometry of Moment Spaces, Mem. Amer. Math. Soc. No. 12, Providence 1953.

Author information

Authors and Affiliations

University of Vienna, Austria
Leopold Schmetterer (Professor of Statistics and Mathematics)

Cite this chapter

Schmetterer, L. (1974). Introduction to Probability Theory. In: Introduction to Mathematical Statistics. Die Grundlehren der mathematischen Wissenschaften, vol 202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-65542-5_3

DOI: https://doi.org/10.1007/978-3-642-65542-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-65544-9
Online ISBN: 978-3-642-65542-5
eBook Packages: Springer Book Archive