
Abstract

Probability theory is a branch of measure theory. In measure theory and probability theory, we consider set functions whose values are non-negative real numbers; these values are called the measure and the probability, respectively, of the corresponding set.


Notes

  1. Here, if the condition ‘\(0 \le a \le b<\infty \)’ is replaced with ‘\(0 \le a< b<\infty \)’, \(\mathcal {F}_2\) is not an algebra because the null set is not an element of \(\mathcal {F}_2\).

  2. This result is from (1.5.11).

  3. Among those useful sets is the set \(\varUpsilon _T = \{ \text{all periodic binary sequences} \}\) considered in Example 2.1.15.

  4. Here, ‘fair’ means ‘head and tail are equally likely to occur’.

  5. Because the probability measure \(\mathsf {P}\) is a set function, \(\mathsf {P}(\{k\})\) and \(\mathsf {P}( \{{ head} \} )\), for instance, are the exact expressions. Nonetheless, the expressions \(\mathsf {P}(k)\), \( \mathsf {P}\{k\}\), \(\mathsf {P}( { head} )\), and \(\mathsf {P}\{ { head} \}\) are also used.

  6. The Vitali set \(\mathbb {V}_0\) discussed in Definition 2.A.12 is a subset of the set of real numbers. Denote the rational numbers in \((-1, 1)\) by \(\left\{ \alpha _i \right\} _{i=1}^{\infty }\) and consider the translation operation \(T_t (x) = x+t\). Then, the events \(\left\{ T_{\alpha _i}\mathbb {V}_0 \right\} _{i=1}^{\infty }\) produce a contradiction.

  7. Among these properties, (2.3.2) is called the Boole inequality. The Boole inequality (2.3.2) can also be written as two formulas, similarly to Axioms 3 and 4 in Definition 2.2.9.

  8. This has also been described in Definition 1.1.24.

  9. This formula is called the Bonferroni inequality.

  10. Note that the number is sometimes replaced by another quantity, such as area, volume, or length, as shown in Example 2.3.6.

  11. Here, ‘fair’ means ‘1, 2, 3, 4, 5, and 6 are equally likely to occur’.

  12. Here, \(\frac{13}{80} = 0.1625\), \(\frac{8}{13} \approx 0.6154\), and \(\frac{2}{13} \approx 0.1538\).

  13. Here, \((-r)(-r-1)\cdots (-r-x+1) = (-1)^x r (r+1)\cdots (r+x-1)\). Equation (2.5.15) can also be obtained based on Table 1.4.

  14. Notations \(U[a, b]\), \(U[a, b)\), \(U(a, b]\), and \(U(a, b)\) are all used interchangeably.

  15. Note that \(\lim \limits _{n \rightarrow \infty }A_n = {\overset{\infty }{\underset{n=1}{\cup }}} A_n\) and \(\lim \limits _{n \rightarrow \infty }A_n = {\overset{\infty }{\underset{n=1}{\cap }}} A_n\) when \(A_1 \subseteq A_2 \subseteq \cdots \) and \(A_1 \supseteq A_2 \supseteq \cdots \), respectively, as discussed in (1.5.8) and (1.5.9).

  16. The Cantor set discussed in Example 1.1.46 is one such example.

  17. The axiom of choice can be expressed as “For any set A, there exists a choice function \(f: 2^A \rightarrow A\) such that \(f(B) \in B\) for every non-empty set \(B \subseteq A\).” The axiom of choice can be phrased in various ways, and the one in Definition 2.A.12 is based on “If we assume a partition \(\mathbb {P}_S\) of S composed only of non-empty sets, then there exists a set B for which the intersection with any set in \(\mathbb {P}_S\) is a singleton set.”

  18. For any real number x, the measure of \(A=\{a\}\) is the same as that of \(A+x=\{a+x\}\).

  19. When the suit of spades is declared the royal suit, the ace of diamonds, not the ace of spades, becomes the mighty.

References

  • N. Balakrishnan, Handbook of the Logistic Distribution (Marcel Dekker, New York, 1992)
  • P.J. Bickel, K.A. Doksum, Mathematical Statistics (Holden-Day, San Francisco, 1977)
  • H.A. David, H.N. Nagaraja, Order Statistics, 3rd edn. (Wiley, New York, 2003)
  • R.M. Gray, L.D. Davisson, An Introduction to Statistical Signal Processing (Cambridge University Press, Cambridge, 2010)
  • A. Gut, An Intermediate Course in Probability (Springer, New York, 1995)
  • C.W. Helstrom, Probability and Stochastic Processes for Engineers, 2nd edn. (Prentice-Hall, Englewood Cliffs, 1991)
  • S. Kim, Mathematical Statistics (in Korean) (Freedom Academy, Paju, 2010)
  • A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edn. (Prentice Hall, New York, 2008)
  • M. Loeve, Probability Theory, 4th edn. (Springer, New York, 1977)
  • E. Lukacs, Characteristic Functions, 2nd edn. (Griffin, London, 1970)
  • T.M. Mills, Problems in Probability (World Scientific, Singapore, 2001)
  • M.M. Rao, Measure Theory and Integration, 2nd edn. (Marcel Dekker, New York, 2004)
  • V.K. Rohatgi, A.K.Md.E. Saleh, An Introduction to Probability and Statistics, 2nd edn. (Wiley, New York, 2001)
  • J.P. Romano, A.F. Siegel, Counterexamples in Probability and Statistics (Chapman and Hall, New York, 1986)
  • S.M. Ross, A First Course in Probability (Macmillan, New York, 1976)
  • S.M. Ross, Stochastic Processes, 2nd edn. (Wiley, New York, 1996)
  • A.N. Shiryaev, Probability, 2nd edn. (Springer, New York, 1996)
  • A.A. Sveshnikov (ed.), Problems in Probability Theory, Mathematical Statistics and Theory of Random Functions (Dover, New York, 1968)
  • J.B. Thomas, Introduction to Probability (Springer, New York, 1986)
  • P. Weirich, Conditional probabilities and probabilities given knowledge of a condition. Philos. Sci. 50(1), 82–95 (1983)
  • C.K. Wong, A note on mutually independent events. Am. Stat. 26(2), 27–28 (1972)


Author information

Correspondence to Iickho Song.

Appendices

Appendix 2.1 Continuity of Probability

Theorem 2.A.1

For a monotonic sequence \(\left\{ B_n \right\} _{n=1}^{\infty }\) of events, the probability of the limit event is equal to the limit of the probabilities of the events in the sequence. In other words,

$$\begin{aligned} \mathsf {P}\left( \lim \limits _{n\rightarrow \infty }B_n \right) \ = \ \lim \limits _{n\rightarrow \infty } \mathsf {P}\left( B_n \right) \end{aligned}$$
(2.A.1)

holds true.  

Proof

First, when \(\left\{ B_n \right\} _{n=1}^{\infty }\) is a non-decreasing sequence, recollect that \(\underset{i=1}{\overset{n}{\cup }}B_i = B_{n}\) and \(\underset{i=1}{\overset{\infty }{\cup }} B_i = \lim \limits _{n\rightarrow \infty } B_n\). Consider a sequence \(\left\{ F_i\right\} _{i=1}^{\infty }\) such that \(F_1 = B_1\) and \(F_n = B_n - \underset{i=1}{\overset{n-1}{\cup }}B_i = B_n\cap B_{n-1}^c\) for \(n =2, 3, \ldots \). Then, \(\left\{ F_n \right\} _{n=1}^{\infty }\) are all mutually exclusive, \(\underset{i=1}{\overset{n}{\cup }} F_i = \underset{i=1}{\overset{n}{\cup }} B_i \) for any natural number n, and \(\underset{i=1}{\overset{\infty }{\cup }} F_i = \underset{i=1}{\overset{\infty }{\cup }} B_i = \lim \limits _{n\rightarrow \infty } B_n\). Therefore, \( \mathsf {P}\left( \lim \limits _{n\rightarrow \infty } B_n\right) = \mathsf {P}\left( \underset{i=1}{\overset{\infty }{\cup }} B_i\right) = \mathsf {P}\left( \underset{i=1}{\overset{\infty }{\cup }} F_i\right) = \sum \limits _{i=1}^{\infty } \mathsf {P}\left( F_i \right) = \lim \limits _{n \rightarrow \infty } \sum \limits _{i=1}^n \mathsf {P}\left( F_i \right) = \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \underset{i=1}{\overset{n}{\cup }} F_i\right) = \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \underset{i=1}{\overset{n}{\cup }} B_i\right) \), i.e.,

$$\begin{aligned} \mathsf {P}\left( \lim \limits _{n\rightarrow \infty } B_n\right)= & {} \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( B_n \right) \end{aligned}$$
(2.A.2)

recollecting (2.2.15), Axiom 4 of probability, and \(\underset{i=1}{\overset{n}{\cup }}B_i = B_{n}\).

Next, when \(\left\{ B_n \right\} _{n=1}^{\infty }\) is a non-increasing sequence, \(\left\{ B_n^c \right\} _{n=1}^{\infty }\) is a non-decreasing sequence, and thus, we have

$$\begin{aligned} \mathsf {P}\left( \lim \limits _{n\rightarrow \infty } B_n^c\right)= & {} \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( B_n^c \right) \end{aligned}$$
(2.A.3)

from (2.A.2). Noting that \(\lim \limits _{n\rightarrow \infty } B_n^c = \underset{i=1}{\overset{\infty }{\cup }} B_i^c\) because \(\left\{ B_n^c \right\} _{n=1}^{\infty }\) is a non-decreasing sequence and that \(\underset{i=1}{\overset{\infty }{\cap }} B_i = \lim \limits _{n\rightarrow \infty } B_n\) because \(\left\{ B_n \right\} _{n=1}^{\infty }\) is a non-increasing sequence, we have \(\lim \limits _{n\rightarrow \infty } B_n^c = \underset{i=1}{\overset{\infty }{\cup }} B_i^c= \left( \underset{i=1}{\overset{\infty }{\cap }} B_i\right) ^c= \left( \lim \limits _{n\rightarrow \infty } B_n\right) ^c\). Thus the left-hand side of (2.A.3) can be written as \( \mathsf {P}\left( \lim \limits _{n\rightarrow \infty } B_n^c\right) = 1- \mathsf {P}\left( \lim \limits _{n\rightarrow \infty } B_n\right) \). Meanwhile, the right-hand side of (2.A.3) can easily be written as \(\lim \limits _{n \rightarrow \infty } \mathsf {P}\left( B_n^c \right) = \lim \limits _{n \rightarrow \infty } \left\{ 1- \mathsf {P}\left( B_n \right) \right\} =1- \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( B_n \right) \). Then, (2.A.3) yields (2.A.1). \(\spadesuit \)

The results of Theorem 2.A.1 that

$$\begin{aligned} \underset{n \rightarrow \infty }{\lim } \mathsf {P}\left( B_n \right) \ = \ \mathsf {P}\left( \underset{i=1}{\overset{\infty }{\cup }} B_i\right) \end{aligned}$$
(2.A.4)

for a non-decreasing sequence \(\left\{ B_n \right\} _{n=1}^{\infty }\) and that

$$\begin{aligned} \underset{n \rightarrow \infty }{\lim } \mathsf {P}\left( B_n \right) \ = \ \mathsf {P}\left( \underset{i=1}{\overset{\infty }{\cap }} B_i\right) \end{aligned}$$
(2.A.5)

for a non-increasing sequence \(\left\{ B_n \right\} _{n=1}^{\infty }\) are called the continuity from below and above of probability, respectively.
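As a concrete illustration (our own numeric sketch, not from the text), the continuity properties (2.A.4) and (2.A.5) can be checked for the uniform length measure on [0, 1], taking \(B_n = [0, 1-1/n]\) for continuity from below and \(C_n = [0, 1/n]\) for continuity from above:

```python
from fractions import Fraction

# Continuity from below: the events B_n = [0, 1 - 1/n] are non-decreasing
# with union [0, 1), so P(B_n) = 1 - 1/n should increase to P(lim B_n) = 1.
def p_below(n):
    return 1 - Fraction(1, n)

probs = [p_below(n) for n in range(1, 10001)]
assert probs == sorted(probs)                  # P(B_n) is non-decreasing
assert 1 - probs[-1] == Fraction(1, 10000)     # P(B_n) -> 1 = P(lim B_n)

# Continuity from above: C_n = [0, 1/n] is non-increasing with
# intersection {0}, so P(C_n) = 1/n should decrease to P({0}) = 0.
tails = [Fraction(1, n) for n in range(1, 10001)]
assert tails == sorted(tails, reverse=True)    # P(C_n) is non-increasing
assert tails[-1] == Fraction(1, 10000)         # P(C_n) -> 0 = P(lim C_n)
```

Exact rational arithmetic is used so the monotonicity checks are free of floating-point artifacts.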

Theorem 2.A.1 deals with monotonic, i.e., non-decreasing and non-increasing, sequences. The same result holds true more generally as we can see in the following theorem:

Theorem 2.A.2

When the limit event \(\lim \limits _{n\rightarrow \infty } B_n\) of a sequence \(\left\{ B_n \right\} _{n=1}^{\infty }\) exists, the probability of the limit event is equal to the limit of the probabilities of the events in the sequence. In other words,

$$\begin{aligned} \mathsf {P}\left( \lim \limits _{n\rightarrow \infty }B_n \right) \ = \ \lim \limits _{n\rightarrow \infty } \mathsf {P}\left( B_n \right) \end{aligned}$$
(2.A.6)

holds true.  

Proof

First, recollect that the largest and smallest limit points of a sequence \(\left\{ a_n \right\} _{n=1}^{\infty }\) of real numbers are denoted by \(\overline{a_n}\) and \(\underline{a_n}\), respectively. When \(\underline{a_n} =\overline{a_n}\), this common value is called the limit of the sequence and denoted by \(\underset{n \rightarrow \infty }{\lim }{a_n}\). Now, noting that \(\left\{ \overset{\infty }{\underset{k=n}{\cup }} B_k \right\} _{n=1}^{\infty }\) is a non-increasing sequence, we have \( \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup } B_n \right) = \mathsf {P}\left( \underset{n=1}{\overset{\infty }{\cap }} \underset{k=n}{\overset{\infty }{\cup }} B_k \right) \), i.e.,

$$\begin{aligned} \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup }~B_n \right)= & {} \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \overset{\infty }{\underset{k=n}{\cup }} B_k \right) \end{aligned}$$
(2.A.7)

from (1.5.9), (1.5.17), and (2.A.1). In the meantime, we have

$$\begin{aligned} \overline{ \mathsf {P}\left( B_n \right) } \ \le \ \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \overset{\infty }{\underset{k=n}{\cup }} B_k \right) \end{aligned}$$
(2.A.8)

because \( \mathsf {P}\left( B_n \right) \le \mathsf {P}\left( \overset{\infty }{\underset{k=n}{\cup }} B_k \right) \) from \(B_n \subseteq \overset{\infty }{\underset{k=n}{\cup }} B_k\). From (2.A.7) and (2.A.8), we get

$$\begin{aligned} \overline{ \mathsf {P}\left( B_n \right) } \ \le \ \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup } ~B_n \right) . \end{aligned}$$
(2.A.9)

Similarly, we get

$$\begin{aligned} \mathsf {P}\left( \underset{n \rightarrow \infty }{\liminf } ~B_n \right)= & {} \mathsf {P}\left( \underset{n=1}{\overset{\infty }{\cup }} \underset{k=n}{\overset{\infty }{\cap }} B_k \right) \nonumber \\= & {} \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \underset{k=n}{\overset{\infty }{\cap }} B_k \right) \nonumber \\\le & {} \underline{ \mathsf {P}\left( B_n \right) } \end{aligned}$$
(2.A.10)

for the non-decreasing sequence \(\left\{ \overset{\infty }{\underset{k=n}{\cap }} B_k \right\} _{n=1}^{\infty }\). The last line

$$\begin{aligned} \underset{n \rightarrow \infty }{\lim } \mathsf {P}\left( \underset{k=n}{\overset{\infty }{\cap }} B_k \right) \ \le \ \underline{ \mathsf {P}\left( B_n \right) } \end{aligned}$$
(2.A.11)

of (2.A.10) is due to \( \mathsf {P}\left( \underset{k=n}{\overset{\infty }{\cap }} B_k \right) \le \mathsf {P}\left( B_n \right) \) from \(\underset{k=n}{\overset{\infty }{\cap }} B_k \subseteq B_n\). Now, (2.A.9) and (2.A.10) produce

$$\begin{aligned} \overline{ \mathsf {P}\left( B_n \right) }\le & {} \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup } ~B_n \right) \nonumber \\= & {} \mathsf {P}\left( \lim \limits _{n \rightarrow \infty } B_n \right) \nonumber \\= & {} \mathsf {P}\left( \underset{n \rightarrow \infty }{\liminf } ~B_n \right) \nonumber \\\le & {} \underline{ \mathsf {P}\left( B_n \right) } \end{aligned}$$
(2.A.12)

if \(\underset{n \rightarrow \infty }{\lim } B_n\) exists. We get the desired result

$$\begin{aligned} \mathsf {P}\left( \lim \limits _{n \rightarrow \infty } B_n \right)= & {} \overline{ \mathsf {P}\left( B_n \right) }\nonumber \\= & {} \underline{ \mathsf {P}\left( B_n \right) } \nonumber \\= & {} \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( B_n \right) \end{aligned}$$
(2.A.13)

by combining (2.A.12) and \(\overline{ \mathsf {P}\left( B_n \right) } \ge \underline{ \mathsf {P}\left( B_n \right) }\). \(\spadesuit \)

Definition 2.A.1

(continuity of probability) For the limit \( \lim \limits _{n \rightarrow \infty } B_n\) of a sequence \(\left\{ B_n \right\} _{n=1}^{\infty }\) of events, \(\left\{ \mathsf {P}\left( B_n \right) \right\} _{n=1}^{\infty }\) converges to \( \mathsf {P}\left( \lim \limits _{n \rightarrow \infty } B_n \right) \) as shown in (2.A.1) and (2.A.6). The relation

$$\begin{aligned} \mathsf {P}\left( \lim \limits _{n \rightarrow \infty } B_n \right) \ = \ \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( B_n \right) \end{aligned}$$
(2.A.14)

is called the continuity of probability.

In other words, the probability of the limit of a sequence of events is equal to the limit of the sequence of the probabilities of the events.

Appendix 2.2 Borel-Cantelli Lemma

Let us discuss the Borel-Cantelli lemma, which deals with the probability of the upper bound (limit superior) of a sequence of events.

Theorem 2.A.3

(Rohatgi and Saleh 2001) When  the sum of the probabilities \(\left\{ \mathsf {P}\left( B_n \right) \right\} _{n=1}^{\infty }\) of a sequence \(\left\{ B_n\right\} _{n=1}^{\infty } \) of events is finite, i.e., when \(\sum \limits _{n=1}^{\infty } \mathsf {P}\left( B_n \right) < \infty \), the probability \( \mathsf {P}\left( \overline{B_n} \right) \) of the upper bound of \(\left\{ B_n\right\} _{n=1}^{\infty } \) is 0.

Proof

First, because the sum \(\sum \limits _{k=1}^{\infty } \mathsf {P}\left( B_k \right) \) is finite, the tail sum \(\sum \limits _{k=n}^{\infty } \mathsf {P}\left( B_k \right) = \sum \limits _{k=1}^{\infty } \mathsf {P}\left( B_k \right) - \sum \limits _{k=1}^{n-1} \mathsf {P}\left( B_k \right) \) vanishes as \(n \rightarrow \infty \), i.e., we get

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \sum \limits _{k=n}^{\infty } \mathsf {P}\left( B_k \right) \ = \ 0 . \end{aligned}$$
(2.A.15)

Now using (2.A.7) and the Boole inequality (2.3.2), we get \( \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup } B_n\right) = \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \underset{k=n}{\overset{\infty }{\cup }} B_k \right) \le \lim \limits _{n \rightarrow \infty } \sum \limits _{k=n}^{\infty } \mathsf {P}\left( B_k \right) \), i.e.,

$$\begin{aligned} \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup }~ B_n\right)= & {} 0 \end{aligned}$$
(2.A.16)

from (2.A.15). \(\spadesuit \)

Theorem 2.A.4

When  \(\left\{ B_n\right\} _{n=1}^{\infty }\) is a sequence of independent events and the sum \(\sum \limits _{n=1}^{\infty } \mathsf {P}\left( B_n \right) \) is infinite, i.e., \(\sum \limits _{n=1}^{\infty } \mathsf {P}\left( B_n \right) \rightarrow \infty \), the probability \( \mathsf {P}\left( \overline{B_n} \right) \) of the upper bound of \(\left\{ B_n\right\} _{n=1}^{\infty }\) is 1.

Proof

First, note that \( \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup } B_n\right) = \lim \limits _{n \rightarrow \infty } \mathsf {P}\left( \underset{i=n}{\overset{\infty }{\cup }} B_i \right) \), i.e.,

$$\begin{aligned} \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup } ~B_n\right)= & {} \lim \limits _{n \rightarrow \infty } \left\{ 1- \mathsf {P}\left( \underset{i=n}{\overset{\infty }{\cap }}B_i^c\right) \right\} \end{aligned}$$
(2.A.17)

as in the proof of Theorem 2.A.3. Next, if \(\sum \limits _{k=1}^{\infty } \mathsf {P}\left( B_k \right) \rightarrow \infty \), then \(\sum \limits _{k=n}^{\infty } \mathsf {P}\left( B_k \right) \rightarrow \infty \) because \(\sum \limits _{k=1}^{n-1} \mathsf {P}\left( B_k \right) \le n-1\) for any natural number n and \(\sum \limits _{k=1}^{\infty } \mathsf {P}\left( B_k \right) = \sum \limits _{k=1}^{n-1} \mathsf {P}\left( B_k \right) + \sum \limits _{k=n}^{\infty } \mathsf {P}\left( B_k \right) \). Therefore, recollecting that \(\left\{ B_i\right\} _{i=1}^{\infty }\) are independent of each other, and thus \(\left\{ B_i^c \right\} _{i=1}^{\infty }\) are also independent of each other, we get \( \mathsf {P}\left( \underset{i=n}{\overset{\infty }{\cap }} B_i^c\right) = \prod \limits _{i=n}^{\infty } \mathsf {P}\left( B_i^c \right) = \prod \limits _{i=n}^{\infty } \left\{ 1- \mathsf {P}\left( B_i \right) \right\} \). Finally, noting that \(1-x \le e^{-x}\) for \(x \ge 0\), we get

$$\begin{aligned} \mathsf {P}\left( \underset{i=n}{\overset{\infty }{\cap }} B_i^c\right)\le & {} \prod _{i=n}^{\infty } \exp \left\{ - \mathsf {P}\left( B_i \right) \right\} \nonumber \\= & {} \exp \left\{ -\sum _{i=n}^{\infty } \mathsf {P}\left( B_i \right) \right\} \nonumber \\= & {} 0 , \end{aligned}$$
(2.A.18)

which proves the theorem when used in (2.A.17). \(\spadesuit \)
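The key step \(\prod \limits _{i=n}^{\infty } \left\{ 1- \mathsf {P}\left( B_i \right) \right\} = 0\) can be made concrete for the choice \(\mathsf {P}\left( B_i \right) = 1/i\), an illustrative example of ours rather than one from the text: the partial products telescope to \(\prod _{i=2}^{N}(1-1/i) = 1/N\), which indeed vanishes as N grows. A minimal Python check:

```python
from fractions import Fraction

# For P(B_i) = 1/i (so the sum of probabilities diverges), the partial
# products prod_{i=2}^{N} (1 - 1/i) telescope to 1/N and hence vanish,
# matching the bound prod (1 - P(B_i)) <= exp(-sum P(B_i)) -> 0.
def partial_product(N):
    prod = Fraction(1)
    for i in range(2, N + 1):
        prod *= 1 - Fraction(1, i)
    return prod

assert partial_product(10) == Fraction(1, 10)
assert partial_product(1000) == Fraction(1, 1000)
```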

When \(\left\{ B_n\right\} _{n=1}^{\infty } \) is a sequence of independent events, the probability \( \mathsf {P}\left( \overline{B_n} \right) \) of the upper bound event of \(\left\{ B_n\right\} _{n=1}^{\infty } \) is either 0 or 1 from the Borel-Cantelli lemmas. The Borel-Cantelli lemmas will be employed when we discuss the strong law of large numbers in Sect. 6.2.2.2.

Example 2.A.1

Assume \( \mathsf {P}\left( X_n = 0\right) =\frac{1}{n^2}=1- \mathsf {P}\left( X_n=1\right) \) for a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) of independent random variables, and let \(B_n=\left\{ X_n=0\right\} \). Then, from Theorem 2.A.3, we have \( \mathsf {P}\left( \underset{n \rightarrow \infty }{\limsup }~ B_n\right) = \mathsf {P}\left( \text{ i.o. } B_n\right) = 0\) because \(\sum \limits _{n=1}^{\infty } \mathsf {P}\left( B_n \right) = \frac{\pi ^2}{6} < \infty \). Therefore, almost surely, \(X_n = 0\) occurs for only finitely many values of n. In other words, \(\lim \limits _{n\rightarrow \infty } X_n = 1\) almost surely. \(\diamondsuit \)

Example 2.A.2

Assume \( \mathsf {P}\left( X_n = 0\right) =\frac{1}{n}=1- \mathsf {P}\left( X_n=1\right) \) for a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) of independent random variables, and let \(B_n=\left\{ X_n=0\right\} \). Then, from Theorem 2.A.4, we have \( \mathsf {P}\left( \text{ i.o. } B_n\right) =1\) or, equivalently, \(X_n=0\) infinitely often almost surely because \(\sum \limits _{n=1}^{\infty } \mathsf {P}\left( B_n \right) = \infty \). On the other hand, \(B_n^c\) also occurs infinitely often almost surely because \(\sum \limits _{n=1}^\infty \mathsf {P}\left( B_n^c \right) = \infty \). In other words, almost surely \(X_n\) is 0 infinitely many times and, at the same time, 1 infinitely many times. Consequently, the probability that \(\lim \limits _{n\rightarrow \infty } X_n\) does not exist, i.e., that \(X_n\) does not converge, is 1. \(\diamondsuit \)
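A Monte Carlo sketch of Examples 2.A.1 and 2.A.2 (illustrative only; the sample sizes and thresholds below are our own choices) contrasts the two tail behaviors: with \(\mathsf {P}\left( X_n = 0\right) = 1/n^2\), zeros beyond a fixed index are rare, while with \(\mathsf {P}\left( X_n = 0\right) = 1/n\) they occur in almost every trial:

```python
import random

random.seed(0)

def count_late_zeros(p, n_max=500, start=11, trials=2000):
    """Fraction of trials in which X_n = 0 for at least one n in [start, n_max]."""
    hits = 0
    for _ in range(trials):
        if any(random.random() < p(n) for n in range(start, n_max + 1)):
            hits += 1
    return hits / trials

# Example 2.A.1: P(X_n = 0) = 1/n**2 -- the tail sum beyond n = 10 is
# about 0.09, so only a small fraction of trials show a late zero.
frac_sq = count_late_zeros(lambda n: 1 / n**2)

# Example 2.A.2: P(X_n = 0) = 1/n -- here 1 - prod(1 - 1/n) over
# n = 11..500 equals 0.98, so almost every trial shows a late zero.
frac_lin = count_late_zeros(lambda n: 1 / n)

assert frac_sq < 0.3
assert frac_lin > 0.8
```

A finite simulation can only suggest, not prove, the almost-sure statements, but the gap between the two fractions mirrors the dichotomy of the two lemmas.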

Appendix 2.3 Measures and Lebesgue Integrals

The notions of length, area, volume, and weight that we encounter in our daily lives are examples of measures. The length of a rod, the area of a house, the volume of a ball, and the weight of a package assign numbers to objects. They also assign numbers to groups of objects.

A measure is a set function assigning a number to a set. Nonetheless, not all set functions are measures. A measure should satisfy some conditions. For example, if we consider the measure of weight, the weight of a bottle filled with water is the sum of the weight of the bottle and that of the water. In other words, the measure of the union of sets is equal to the sum of the measures of the sets for mutually exclusive sets.

Definition 2.A.2

(measure) A non-negative additive set function \(\mu \) whose domain is a \(\sigma \)-algebra is called a measure.

Here, an additive function is a function such that the value of the function for a countable union of sets is the same as the sum of the values of the function for the sets when the sets are mutually exclusive. In other words, a function \(\mu \) satisfying

$$\begin{aligned} \mu \left( \overset{\infty }{\underset{i=1}{\cup }} A_i \right) \ = \ \sum \limits _{i=1}^{\infty } \mu \left( A_i \right) \end{aligned}$$
(2.A.19)

for countable mutually exclusive sets \(\left\{ A_i \right\} _{i=1}^{\infty }\) in a \(\sigma \)-algebra is called an additive function.

Example 2.A.3

Consider a finite set \(\varOmega \) and the collection \(\mathcal {F} = 2^\varOmega \). Then, the number \(\mu (A)\) of elements of \(A \in \mathcal {F}\) is a measure. \(\diamondsuit \)

Theorem 2.A.5

For a measure \(\mu \) on a \(\sigma \)-algebra \(\mathcal {F}\), let \(\left\{ A_n \in \mathcal {F} \right\} _{n=1}^{\infty }\) and \(A_1 \subseteq A_2 \subseteq \cdots \). Then, \(A={\overset{\infty }{\underset{n=1}{\cup }}} A_n \in \mathcal {F}\) and \(\lim \limits _{n \rightarrow \infty } \mu \left( A_n \right) =\mu (A)\) (see Footnote 15).

Proof

First, because \(\mathcal {F}\) is a \(\sigma \)-algebra, \(A ={\overset{\infty }{\underset{n=1}{\cup }}} A_n\) is an element of \(\mathcal {F}\). Next, let \(B_1=A_1\) and \(B_n=A_n -A_{n-1}\) for \(n=2,3,\ldots \). Then, \(\left\{ B_n \right\} _{n=1}^{\infty }\) are mutually exclusive, \(A_n={\overset{n}{\underset{i=1}{\cup }}} B_i\), and \(A={\overset{\infty }{\underset{n=1}{\cup }}} B_n\). Thus, \(\lim \limits _{n\rightarrow \infty }\mu \left( A_n \right) = \lim \limits _{n\rightarrow \infty }\mu \left( {\overset{n}{\underset{i=1}{\cup }}}B_i\right) =\lim \limits _{n\rightarrow \infty } \sum \limits _{i=1}^n \mu \left( B_i \right) = \sum \limits _{i=1}^{\infty } \mu \left( B_i \right) \) and \(\mu (A)=\mu \left( {\overset{\infty }{\underset{i=1}{\cup }}} B_i\right) =\sum \limits _{i=1}^{\infty } \mu \left( B_i \right) \) from (2.A.19) and, consequently, \(\lim \limits _{n \rightarrow \infty } \mu \left( A_n \right) =\mu (A)\). \(\spadesuit \)

When \(\varOmega \) is a countable abstract space, a measure can be defined by first choosing arbitrarily a non-negative number \(\mu _{\omega }\) for each \(\omega \in \varOmega \) and then letting \(\mu (A)=\sum \limits _{\omega \in A} \mu _{\omega }\) for a subset A of \(\varOmega \).

Example 2.A.4

For an abstract space \(\varOmega =\{3, 4, 5\}\), let \(\mu _{\omega } = 5 - \omega \) for \(\omega \in \varOmega \). Then, \(\mu (A) =\sum \limits _{\omega \in A} \mu _{\omega }\) is a measure. We have \(\mu (\{ 3\}) = 2\), \(\mu (\{ 4\}) = 1\), \(\mu (\{ 5\}) = 0\), \(\mu (\{ 3, 4\}) = \mu (\{ 3\}) + \mu (\{ 4\})= 3\), \(\mu (\{ 3, 5\}) = \mu (\{ 3\}) + \mu (\{ 5\})= 2\), \(\mu (\{ 4, 5\}) = \mu (\{ 4\}) + \mu (\{ 5\})= 1\), and \(\mu (\{ 3, 4, 5\}) = \mu (\{ 3\}) + \mu (\{ 4\}) + \mu (\{ 5\}) = \mu (\{ 3, 4\})+ \mu (\{ 5\}) = \mu (\{ 4, 5\}) + \mu (\{ 3\}) = \mu (\{ 3, 5\}) + \mu (\{ 4\}) = 3\). \(\diamondsuit \)
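Example 2.A.4 can be verified mechanically; the following Python sketch (a direct transcription of the example, with the additivity checks our own) confirms the stated values and the finite additivity of \(\mu \):

```python
from itertools import combinations

# Example 2.A.4: Omega = {3, 4, 5} with weights mu_omega = 5 - omega,
# and mu(A) = sum of the weights of the elements of A.
weights = {w: 5 - w for w in (3, 4, 5)}

def mu(A):
    return sum(weights[w] for w in A)

assert (mu({3}), mu({4}), mu({5})) == (2, 1, 0)
assert mu({3, 4, 5}) == 3

# Finite additivity: for disjoint singletons, mu of the union is the sum.
for a, b in combinations(weights, 2):
    assert mu({a, b}) == mu({a}) + mu({b})
```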

To consider measures on uncountable sets, we introduce the notion of elementary sets, based on which the measure is defined and then extended to general sets.

Definition 2.A.3

(rectangle) In the p-dimensional Euclidean space \(\mathbb {R}^p\), a set of the form \(\left\{ \boldsymbol{x}= \left( x_1, x_2, \ldots , x_p \right) : \, a_i \le x_i \le b_i, i=1, 2, \ldots , p \right\} \) with \( -\infty < a_i \le b_i \le \infty \) is called a rectangle, an interval, or a box.

In Definition 2.A.3, \(a_i \le x_i \le b_i\) can be replaced with \(a_i< x_i < b_i\) or \(a_i < x_i \le b_i\). Note that the null set is also regarded as an interval.

Definition 2.A.4

(elementary set) A set is called an elementary set if it can be expressed as the union of a finite number of intervals.

Example 2.A.5

Examples of an elementary set and a non-elementary set are shown in Fig. 2.18. \(\diamondsuit \)

Definition 2.A.5

(outer measure; covering) Let \(\mu \) be an additive, non-negative, and finite set function defined on the collection of all elementary sets. A collection \(\left\{ A_i \right\} _{i=1}^{\infty }\) of elementary sets such that \(E \subseteq {\overset{\infty }{\underset{i=1}{\cup }}} A_i\) for \(E \subseteq \mathbb {R}^p\) is called a covering of E, and the lower bound

$$\begin{aligned} \mu ^{*} (E) \ =\ {\inf } \sum \limits _{i=1}^{\infty } \mu \left( A_i \right) \end{aligned}$$
(2.A.20)

of \(\sum \limits _{i=1}^{\infty } \mu \left( A_i \right) \) over all the coverings of E is called the outer measure of E.

In general, we have

$$\begin{aligned} \mu ^{*} (E) \ \le \ \sum \limits _{i=1}^{\infty } \mu ^{*} \left( B_i \right) \end{aligned}$$
(2.A.21)

when \(E = {\overset{\infty }{\underset{i=1}{\cup }}} B_i\), and

$$\begin{aligned} \mu ^{*} (E) \ = \ \mu (E) \end{aligned}$$
(2.A.22)

when E is an elementary set.

Fig. 2.18 Examples of an elementary set (1) and a non-elementary set (2) in two-dimensional space

Example 2.A.6

Assume the sets shown in Fig. 2.18. Let the measure of the two-dimensional interval

$$\begin{aligned} A_{a,b,c,d} \ = \ \left\{ \left( x_1, x_2 \right) : \, a \le x_1 \le b ,\, c \le x_2 \le d \right\} \end{aligned}$$
(2.A.23)

be \(\mu (A_{a,b,c,d}) = (b-a)(d-c)\). Let the set in Fig. 2.18 (1) be \(B_1\). Then, we have \(B_1 \subseteq A_{0,2,0,2}\), \(B_1 \subseteq A_{0,2,0,1} \cup A_{1,2, 0, 2}\), and \(B_1 \subseteq A_{0,2,0,1} \cup A_{1, 2, 1,2}\), among which the covering with the smallest measure is \(\left\{ A_{0,2,0,1}, A_{1, 2, 1,2} \right\} \). Thus, the outer measure of \(B_1\) is \(\mu ^{*} \left( B_1 \right) = 2+1 = 3\). Similarly, let the set in Fig. 2.18 (2) be \(B_2\). Then, we have \(B_2 \subseteq A_{0,2,0,2}\), \(B_2 \subseteq A_{0,2,0,1} \cup A_{0, 2, 1, 2}\), \(B_2 \subseteq A_{0,2,0,1} \cup A_{1, 2, 1, 2}\), \(\ldots \), among which the covering with the smallest measure is \(\left\{ A_{\frac{2(i-1)}{n}, 2,\frac{2(i-1)}{n},\frac{2i}{n}} \right\} _{i=1}^{n}\) as \(n \rightarrow \infty \). Thus, the outer measure of \(B_2\) is \(\mu ^{*} \left( B_2 \right) = 4 {\underset{n \rightarrow \infty }{\lim }} \sum \limits _{i=1}^{n}\left( 1- \frac{i-1}{n}\right) \frac{1}{n} = 4\int _{0}^{1}(1-x)dx = 2\). \(\diamondsuit \)

Definition 2.A.6

(finitely \(\mu \)-measurable set; \(\mu \)-measurable set) A set A is called finitely \(\mu \)-measurable if there exists a sequence \(\left\{ A_n \right\} _{n=1}^{\infty }\) of elementary sets such that

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \mu ^{*} \left( A_n\triangle A \right) \ = \ 0 . \end{aligned}$$
(2.A.24)

A set is called \(\mu \)-measurable if it can be obtained from a countable union of finitely \(\mu \)-measurable sets.

The collections of all finitely \(\mu \)-measurable sets and \(\mu \)-measurable sets are denoted by \(\mathcal {M}_F (\mu )\) and \(\mathcal {M} (\mu )\), respectively.

Theorem 2.A.6

The collection \(\mathcal {M} (\mu )\) is a \(\sigma \)-algebra, and the outer measure \(\mu ^{*}\) is an additive set function on \(\mathcal {M} (\mu )\).

Proof

Instead of a rigorous proof, we will simply discuss a brief outline. Assume two sequences \(\left\{ A_i \right\} _{i=1}^{\infty }\) and \(\left\{ B_i \right\} _{i=1}^{\infty }\) of elementary sets converging to A and B, respectively, when A and B are elements of \(\mathcal {M}_F (\mu )\). If we let \(d(A,B) =\mu ^{*} (A\triangle B)\), then we can show that \(\mathcal {M}_F (\mu )\) is an algebra by showing that \(A \cup B\) and \(A \cap B\) are included in \(\mathcal {M}_F (\mu )\) based on \(d\left( A_i \cup A_j, B_i \cup B_j \right) \le d\left( A_i,B_i \right) +d\left( A_j,B_j \right) \), \(d\left( A_i \cap A_j, B_i \cap B_j \right) \le d\left( A_i,B_i \right) +d\left( A_j,B_j \right) \), and \(\left| \mu ^{*} (A) - \mu ^{*} (B) \right| \le d(A,B)\). Moreover, we can show that \(\mu ^{*}\) is finitely additive on \(\mathcal {M}_F (\mu )\) based on \(\mu ^{*} (A) + \mu ^{*} (B) =\mu ^{*} (A \cup B) + \mu ^{*} (A \cap B)\) and \(\mu ^{*} (A \cap B)=0\) when \(A \cap B=\emptyset \).

Now, when \(A \in \mathcal {M} (\mu )\), we can express \(A = {\overset{\infty }{\underset{n=1}{\cup }}} A_n'\) for \(A_n' \in \mathcal {M}_F (\mu )\), and we then have \(A = {\overset{\infty }{\underset{n=1}{\cup }}} A_n\) by letting \(A_1=A_1'\) and \(A_n={\overset{n}{\underset{i=1}{\cup }}} A_i' - {\overset{n-1}{\underset{i=1}{\cup }}} A_i'\) for \(n =2, 3, \ldots \). Based on this, we can show that \(\mu ^{*}\) is additive on \(\mathcal {M} (\mu )\) by showing that \(\mu ^{*} (A)=\sum \limits _{i=1}^{\infty } \mu ^{*} \left( A_i \right) \). Finally, we can show that \(\mathcal {M} (\mu )\) is a \(\sigma \)-algebra based on the fact that any countable set operations on the sets in \(\mathcal {M} (\mu )\) can be obtained from a countable union of \(\mathcal {M}_F (\mu )\). \(\spadesuit \)

Based on Theorem 2.A.6, we can use \(\mu ^{*}\), instead of \(\mu \), as the measure when we deal with \(\mu \)-measurable sets. In essence, we have first defined \(\mu \) for elementary sets and then extended \(\mu \) into \(\mu ^{*}\), an additive set function on the \(\sigma \)-algebra \(\mathcal {M} (\mu )\).

Definition 2.A.7

(Lebesgue measure) The Lebesgue measure in the Euclidean space \(\mathbb {R}^p\) is defined as

$$\begin{aligned} \mu (A) \ = \ \sum \limits _{i=1}^n m \left( I_i \right) \end{aligned}$$
(2.A.25)

for \(A = \overset{n}{\underset{i=1}{\cup }}I_i\), where \(\left\{ I_i \right\} _{i=1}^{n}\) are non-overlapping intervals and

$$\begin{aligned} m(I) \ = \ \prod \limits _{k=1}^p \left( b_k - a_k \right) \end{aligned}$$
(2.A.26)

with \(I=\left\{ \boldsymbol{x}= \left( x_1, x_2, \ldots , x_p \right) : \, a_k \le x_k \le b_k, \, k=1, 2, \ldots , p \right\} \) an interval in \(\mathbb {R}^p\).

Definition 2.A.7 is based on the fact that any elementary set can be obtained as a union of non-overlapping intervals \(\left\{ I_i \right\} _{i=1}^{n}\). An open set can be obtained as a countable union of open intervals and is a \(\mu \)-measurable set. Similarly, a closed set is the complement of an open set and is also a \(\mu \)-measurable set because \(\mathcal {M} (\mu )\) is a \(\sigma \)-algebra. As discussed in Definition 2.2.7, the collection of all Borel sets is a \(\sigma \)-algebra and is called the Borel \(\sigma \)-algebra or Borel field. In addition, a \(\mu \)-measurable set can always be expressed as the union of a Borel set and a set of measure 0 disjoint from that Borel set. Under the Lebesgue measure, all countable sets and some uncountable setsFootnote 16 are of measure 0.

Example 2.A.7

In the one-dimensional space, the Lebesgue measure of an interval \([a, b]\) is the length \(\mu \left( [a, b]\right) =b-a\) of the interval. The Lebesgue measure of the set Q of rational numbers is \(\mu (Q)=0\). \(\diamondsuit \)

Definition 2.A.8

(measure space; measurable space) In a metric space X, if there exist a \(\sigma \)-algebra \(\mathcal {M}\) of measurable sets composed of subsets of X and a non-negative additive set function \(\mu \), then X is called a measure space. Here, if \(X \in \mathcal {M}\), then \((X,\mathcal {M},\mu )\) is called a measurable space.  

Example 2.A.8

In the space \(X=\mathbb {R}^p\), we have the Lebesgue measure and the collection \(\mathcal {M}\) of all sets measurable by the Lebesgue measure. Then, it is easy to see that X is a measure space. \(\diamondsuit \)

Example 2.A.9

In the space \(X=\mathbb {J}_{+}\), let the number of elements in a set be the measure \(\mu \) of the set and let the collection of all subsets of X be \(\mathcal {M}\). Then, \((X,\mathcal {M},\mu )\) is a measurable space. \(\diamondsuit \)

Definition 2.A.9

(measurable function) A real function f defined on a measurable space is called a measurable function when the set \(\{x: \, f(x)>a\}\) is a measurable set for every real number a.

Example 2.A.10

Continuous functions in \(\mathbb {R}^p\) are all measurable functions. \(\diamondsuit \)

Example 2.A.11

If f is a measurable function, so is |f|. If f and g are both measurable functions, then \(\max (f,g)\) and \(\min (f,g)\) are measurable functions. \(\diamondsuit \)

Example 2.A.12

If \(\left\{ f_n \right\} _{n=1}^{\infty }\) is a sequence of measurable functions, then \(\sup \limits _{n} f_n (x)\) and \(\limsup \limits _{n \rightarrow \infty } f_n (x)\) are measurable functions. \(\diamondsuit \)

Definition 2.A.10

(simple function) When the range of a function on a measurable space is finite, the function is called a simple function.

Example 2.A.13

When the range of a simple function f is \(\left\{ c_1, c_2, \ldots , c_n \right\} \), we have \(f(x) =\sum \limits _{i=1}^n c_i K_{B_i}(x)\), where \(B_i= \left\{ x: \, f(x)=c_i \right\} \) and

$$\begin{aligned} K_E (x) \ = \ \left\{ \begin{array}{ll} 1, &{} x \in E, \\ 0, &{} x \notin E \end{array} \right. \end{aligned}$$
(2.A.27)

is the indicator function of E. \(\diamondsuit \)

Theorem 2.A.7

For any real function f defined on a measurable space, there exists a sequence \(\left\{ f_n \right\} _{n=1}^{\infty }\) of simple functions such that \(\lim \limits _{n \rightarrow \infty } f_n (x) = f(x)\). If f is a measurable function, then \(\left\{ f_n \right\} _{n=1}^{\infty }\) can be chosen as a sequence of measurable functions, and if \(f \ge 0\), then \(\left\{ f_n \right\} _{n=1}^{\infty }\) can be chosen to increase monotonically.

Proof

When \(f \ge 0\), let \(B_{n,i}= \left\{ x : \frac{i-1}{2^{n}} \le f(x) < \frac{i}{2^{n}} \right\} \) and \(F_n=\{x: \, f(x) \ge n\}\), and then choose

$$\begin{aligned} f_n (x) \ = \ \sum \limits _{i=1}^{n2^{n}} \frac{i-1}{2^{n}} K_{B_{n,i}}(x) + nK_{F_n}(x) . \end{aligned}$$
(2.A.28)

More generally, by letting \(f^{+}=\max (f,0)\) and \(f^{-}=-\min (f,0)\) so that \(f=f^+ - f^-\), we can prove the theorem easily. \(\spadesuit \)
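The construction (2.A.28) can be sketched directly in code. Below is a minimal illustration, assuming \(f \ge 0\) and using the half-open slices \(\left[ \frac{i-1}{2^n}, \frac{i}{2^n} \right) \) of the range; the function names are hypothetical.

```python
def simple_approx(f, n):
    """Simple function f_n of (2.A.28) for a non-negative f, with the half-open
    slices B_{n,i} = {x: (i-1)/2^n <= f(x) < i/2^n}: f_n(x) = (i-1)/2^n on
    B_{n,i} for i <= n*2^n, and f_n(x) = n on F_n = {x: f(x) >= n}."""
    def fn(x):
        v = f(x)
        if v >= n:
            return float(n)             # x lies in F_n
        return int(v * 2**n) / 2**n     # int() acts as floor since v >= 0
    return fn

g = lambda x: x * x                     # a non-negative test function
for n in (2, 4, 8):
    print(n, simple_approx(g, n)(0.7))  # increases toward g(0.7) = 0.49
```

Each refinement halves the slice height, so on the set where \(f < n\) the approximation error is below \(2^{-n}\) and the sequence increases monotonically.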

Definition 2.A.11

(Lebesgue integral) In a measurable space \((X,\mathcal {M},\mu )\), let \(s(x) =\sum \limits _{i=1}^n c_i K_{B_i}(x)\) be a measurable simple function, where \(c_i>0\) and \(x \in X\). In addition, let \(E \in \mathcal {M}\) and \(I_E (s) =\sum \limits _{i=1}^n c_i \mu \left( E \cap B_i \right) \). Then, for a non-negative and measurable function f,

$$\begin{aligned} \int _{E} f d\mu \ = \ \sup I_E (s) \end{aligned}$$
(2.A.29)

is called the Lebesgue integral.

In Definition 2.A.11, the supremum is taken over all measurable simple functions s such that \(0 \le s \le f\). When the function f also takes negative values, the Lebesgue integral can be defined as

$$\begin{aligned} \int _{E} f d\mu \ = \ \int _{E} f^{+} d\mu - \int _{E} f^{-} d\mu \end{aligned}$$
(2.A.30)

if at least one of \(\int _{E} f^{+} d\mu \) and \(\int _{E} f^{-} d\mu \) is finite, where \(f^{+}=\max (f,0)\) and \(f^{-}=-\min (f,0)\). Note that \(f =f^+ - f^-\) and that \(f^{+}\) and \(f^{-}\) are measurable functions. If both \(\int _{E} f^{+} d\mu \) and \(\int _{E} f^{-} d\mu \) are finite, then \(\int _{E} f d\mu \) is finite and the function f is called Lebesgue integrable on E for \(\mu \), which is expressed as \(f \in \mathcal {L} (\mu )\) on E.

The Riemann integral partitions the domain of integration into arbitrarily small intervals and sums the products of a function value in each interval and the length of the interval. The Lebesgue integral, on the other hand, partitions the range of the function into arbitrarily small intervals and sums the products of a function value and the measure of the set in the domain corresponding to each such interval in the range. The Lebesgue integral exists not only for all Riemann integrable functions but also for many other functions, while the Riemann integral exists only when the function is, roughly speaking, continuous almost everywhere.
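The contrast between the two constructions can be illustrated numerically: the first sum below partitions the domain, while the second partitions the range and weights each level by an estimate of the measure of its preimage. This is only a sketch under simplifying assumptions (a bounded, non-negative f on [a, b], with the preimage measure estimated by sampling), not a general implementation.

```python
import random

def riemann_sum(f, a, b, n):
    """Partition the domain [a, b] into n intervals and sum
    f(midpoint) * (interval length)."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * h for i in range(n))

def lebesgue_sum(f, a, b, n, samples=50000, seed=0):
    """Partition the *range* of f into slices of height 2**-n and sum
    (slice level) * (measure of the preimage of the slice); the preimage
    measure is estimated here by uniform sampling of [a, b]."""
    random.seed(seed)
    pts = [a + (b - a) * random.random() for _ in range(samples)]
    h = 2.0**-n
    top = max(f(x) for x in pts)        # assumes f bounded and non-negative
    total, level = 0.0, 0.0
    while level < top:
        meas = sum(1 for x in pts
                   if level <= f(x) < level + h) / samples * (b - a)
        total += level * meas
        level += h
    return total

g = lambda x: x * x
print(riemann_sum(g, 0.0, 1.0, 1000))   # close to 1/3
print(lebesgue_sum(g, 0.0, 1.0, 6))     # close to 1/3 as well
```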

Some of the properties of the Lebesgue integral are as follows:

  1. (1)

    If a function f is measurable on E and bounded and \(\mu (E)\) is finite, then \(f \in \mathcal {L} (\mu )\) on E.

  2. (2)

    If the measure \(\mu (E)\) is finite and \(a \le f \le b\), then \(a\mu (E) \le \int _{E} f d\mu \le b \mu (E)\).

  3. (3)

    If \(f, g \in \mathcal {L} (\mu )\) on the set E and \(f(x) \le g(x)\) for \(x \in E\), then \(\int _{E} f d\mu \le \int _{E} g d\mu \).

  4. (4)

    If \(f \in \mathcal {L} (\mu )\) on the set E and c is a finite constant, then \(\int _{E} cf d\mu = c\int _{E} f d\mu \) and \(cf \in \mathcal {L} (\mu )\).

  5. (5)

    If \(f \in \mathcal {L} (\mu )\) on the set E, then \(|f| \in \mathcal {L} (\mu )\) and \(\left| \int _{E} f d\mu \right| \le \int _{E} |f| d\mu \).

  6. (6)

    If a function f is measurable on the set E and \(\mu (E)=0\), then \(\int _{E} f d\mu =0\).

  7. (7)

    If a function f is Lebesgue integrable on X and \(\phi (A) = \int _{A} f d\mu \) on \(A \in \mathcal {M}\), then \(\phi \) is additive on \(\mathcal {M}\).

  8. (8)

    Let \(A \in \mathcal {M}\), \(B \subseteq A\), and \(\mu (A-B)=0\). Then, \(\int _{A} f d\mu = \int _{B} f d\mu \).

  9. (9)

    Consider a sequence \(\left\{ f_n \right\} _{n=1}^{\infty }\) of measurable functions such that \(\lim \limits _{n \rightarrow \infty } f_n (x) = f(x)\) for \(E \in \mathcal {M}\) and \(x \in E\). If there exists a function \(g \in \mathcal {L} (\mu )\) such that \(\left| f_n (x) \right| \le g(x)\), then \(\lim \limits _{n \rightarrow \infty } \int _{E} f_n d\mu = \int _{E} f d\mu \).

  10. (10)

    If a function f is Riemann integrable on \([a, b]\), then f is Lebesgue integrable, and the Lebesgue integral with the Lebesgue measure is the same as the Riemann integral.

2.1.4 Appendix 2.4 Non-measurable Sets

Consider the open unit interval \(J = (0, 1)\) and the set \(\mathbb {Q}\) of rational numbers in the real space \(\mathbb {R}\), along with the translation operator \(T_t: \mathbb {R} \rightarrow \mathbb {R}\) such that \(T_t (x) = x+t\) for \(x \in \mathbb {R}\). Define the countable set \(\varGamma _t = T_t \mathbb {Q}\), i.e.,

$$\begin{aligned} \varGamma _t \ = \ \{t+q: \, q \in \mathbb {Q}\}. \end{aligned}$$
(2.A.31)

For example, we have \(\varGamma _{5445} = \{q+5445: \, q \in \mathbb {Q}\} = \mathbb {Q}\) and \(\varGamma _{\pi }= \{q+ \pi : \, q \in \mathbb {Q}\}\) when \(t=5445\) and \(t= \pi \), respectively.

It is clear that

$$\begin{aligned} \varGamma _t \cap J \ \ne \ \emptyset \end{aligned}$$
(2.A.32)

because we can always find a rational number q such that \(0< t+q < 1\) for any real number t. We have \(\varGamma _t = \{t+ q: \, q \in \mathbb {Q}\}= \{s+ (t-s) +q: \, q \in \mathbb {Q}\} = \left\{ s+ q^{\prime }: \, q^{\prime } \in \mathbb {Q} \right\} = \varGamma _s\) and \(\varGamma _t \cap \varGamma _s = \emptyset \) when \(t-s\) is a rational number and an irrational number, respectively. Based on this observation, consider the collection

$$\begin{aligned} \mathbb {K} \ = \ \left\{ \varGamma _t: \, t \in \mathbb {R}, \text{ distinct } \varGamma _t \text{ only } \right\} \end{aligned}$$
(2.A.33)

of sets (Rao 2004). Then, we have the following facts:

  1. (1)

    The collection \(\mathbb {K}\) is a partition of \(\mathbb {R}\).

  2. (2)

    Only one set in \(\mathbb {K}\), namely \(\mathbb {Q}\) itself, corresponds to a rational value of t.

  3. (3)

    There exist uncountably many sets in \(\mathbb {K}\).

  4. (4)

    For two distinct sets \(\varGamma _t\) and \(\varGamma _s\) in \(\mathbb {K}\), the number \(t-s\) is not a rational number.

Definition 2.A.12

(Vitali set) Based on the axiom of choiceFootnote 17 and (2.A.32), we can obtain an uncountable set

$$\begin{aligned} \mathbb {V}_0 \ = \ \left\{ x: \, x \in \varGamma _t \cap J , \ \varGamma _t \in \mathbb {K} \right\} , \end{aligned}$$
(2.A.34)

where x represents a number in the interval (0, 1) and an element of \(\varGamma _t \in \mathbb {K}\). The set \(\mathbb {V}_0 \) is called the Vitali set.

Note that the points in the Vitali set \(\mathbb {V}_0 \) are all in the interval (0, 1) and are in one-to-one correspondence with the sets in \(\mathbb {K}\). Denoting an enumeration of all the rational numbers in the interval \((-1, 1)\) by \(\left\{ \alpha _i \right\} _{i=1}^{\infty }\), we get the following theorem:

Theorem 2.A.8

For the Vitali set \(\mathbb {V}_0 \),

$$\begin{aligned} (0, 1) \ \subseteq \ {\overset{\infty }{\underset{i=1}{\cup }}} T_{\alpha _i}\mathbb {V}_0 \ \subseteq \ (-1, 2) \end{aligned}$$
(2.A.35)

holds true.

Proof

First, \(-1< \alpha _i + x < 2\) because \(-1< \alpha _i < 1\) and any point x in \(\mathbb {V}_0\) satisfies \(0< x < 1\). In other words, \(T_{\alpha _i} x \in (-1, 2)\), and therefore

$$\begin{aligned} {\overset{\infty }{\underset{i=1}{\cup }}} T_{\alpha _i}\mathbb {V}_0 \ \subseteq \ (-1, 2) . \end{aligned}$$
(2.A.36)

Next, for any point x in (0, 1), \(x \in \varGamma _t\) for an appropriately chosen t, as we have observed in (2.A.32). Then, we have \(\varGamma _t= \varGamma _x\) and \(x \in \varGamma _t = \varGamma _x\) because \(x-t\) is a rational number. Now, denote by y a point in \(\varGamma _x \cap \mathbb {V}_0\), which exists because \(\varGamma _x \cap \mathbb {V}_0 \ne \emptyset \) by the construction of \(\mathbb {V}_0\). We then have \(y=x+q\) for some \(q \in \mathbb {Q}\), i.e., \(y-x \in \mathbb {Q}\). Here, \(y-x\) is a rational number in \((-1, 1)\) because \(0< x, y < 1\) and, consequently, we can put \(y-x = \alpha _i\): in other words, \(y=x+\alpha _i= T_{\alpha _i}x \in T_{\alpha _i}\mathbb {V}_0\). Thus, we have

$$\begin{aligned} (0,1) \ \subseteq \ {\overset{\infty }{\underset{i=1}{\cup }}} T_{\alpha _i}\mathbb {V}_0 . \end{aligned}$$
(2.A.37)

Subsequently, we get (2.A.35) from (2.A.36) and (2.A.37). \(\spadesuit \)

Theorem 2.A.9

The sets \(\left\{ T_{\alpha _i}\mathbb {V}_0 \right\} _{i=1}^{\infty }\) are all mutually exclusive: in other words,

$$\begin{aligned} \left( T_{\alpha _i}\mathbb {V}_0 \right) \cap \left( T_{\alpha _j}\mathbb {V}_0 \right) \ =\ \emptyset \end{aligned}$$
(2.A.38)

for \(i \ne j\).

Proof

We prove the theorem by contradiction. When \(i \ne j\) or, equivalently, when \(\alpha _i \ne \alpha _j\), assume that \(\left( T_{\alpha _i}\mathbb {V}_0 \right) \cap \left( T_{\alpha _j}\mathbb {V}_0 \right) \) is not a null set. Letting one element of the intersection be y, we have \(y=x+ \alpha _i = x^{\prime }+\alpha _j\) for \(x, x^{\prime } \in \mathbb {V}_0\). It is clear that \(\varGamma _x = \varGamma _{x^{\prime }}\) because \(x-x^{\prime } = \alpha _j -\alpha _i\) is a rational number. Thus, \(x=x^{\prime }\) from the definition of \(\mathbb {K}\), and therefore \(\alpha _i = \alpha _j\): this is contradictory to \(\alpha _i \ne \alpha _j\). Consequently, \(\left( T_{\alpha _i}\mathbb {V}_0 \right) \cap \left( T_{\alpha _j}\mathbb {V}_0 \right) = \emptyset \). \(\spadesuit \)

Theorem 2.A.10

No set in \(\left\{ T_{\alpha _i}\mathbb {V}_0 \right\} _{i=1}^{\infty }\) is Lebesgue measurable: in other words, \(T_{\alpha _i} \mathbb {V}_0 \notin \mathcal {M} (\mu )\) for any i.

Proof

We prove the theorem by contradiction. Assume that the sets \(\left\{ T_{\alpha _i}\mathbb {V}_0 \right\} _{i=1}^{\infty }\) are measurable. Then, from the translation invarianceFootnote 18 of a measure, they have the same measure. Denoting the Lebesgue measure of \(T_{\alpha _i}\mathbb {V}_0\) by \(\mu \left( T_{\alpha _i}\mathbb {V}_0 \right) =\beta \), we have

$$\begin{aligned} \mu ((0, 1)) \ \le \ \mu \left( {\overset{\infty }{\underset{i=1}{\cup }}} T_{\alpha _i}\mathbb {V}_0 \right) \ \le \ \mu ((-1, 2)) \end{aligned}$$
(2.A.39)

from (2.A.35). Here, \(\mu ((0, 1)) = 1\) and \(\mu ((-1, 2)) = 3\). In addition, we have \(\mu \left( {\overset{\infty }{\underset{i=1}{\cup }}} T_{\alpha _i}\mathbb {V}_0 \right) = \sum \limits _{i=1}^{\infty } \mu \left( T_{\alpha _i}\mathbb {V}_0 \right) \), i.e.,

$$\begin{aligned} \mu \left( {\overset{\infty }{\underset{i=1}{\cup }}} T_{\alpha _i}\mathbb {V}_0 \right) \ = \ \sum \limits _{i=1}^{\infty } \beta \end{aligned}$$
(2.A.40)

because \(\left\{ T_{\alpha _i}\mathbb {V}_0 \right\} _{i=1}^{\infty }\) is a collection of mutually exclusive sets as we have observed in (2.A.38). Combining (2.A.39) and (2.A.40) leads us to

$$\begin{aligned} 1 \ \le \ \sum \limits _{i=1}^{\infty } \beta \ \le \ 3 , \end{aligned}$$
(2.A.41)

which can be satisfied neither with \(\beta =0\) nor with \(\beta \ne 0\). Consequently, no set in \(\left\{ T_{\alpha _i}\mathbb {V}_0 \right\} _{i=1}^{\infty }\), including \(\mathbb {V}_0\), is Lebesgue measurable. \(\spadesuit \)

Exercises

Exercise 2.1

Obtain the algebra generated by the collection \(\mathcal {C} =\{\{a\},\ \{b\}\}\) of subsets of the set \(S=\{a,b,c,d\}\).

Exercise 2.2

Obtain the \(\sigma \)-algebra generated by the collection \(\mathcal {C} =\{\{a\},\ \{b\}\}\) of subsets of the set \(S=\{a,b,c,d\}\).

Exercise 2.3

Obtain the sample space S in the following random experiments:

  1. (1)

    An experiment measuring the lifetime of a battery.

  2. (2)

    An experiment in which an integer n is selected in the interval [0, 2] and then an integer m is selected in the interval [0, n].

  3. (3)

    An experiment of checking the color of, and the number written on, a ball selected randomly from a box containing two red, one green, and two blue balls denoted by \(1, 2, \ldots , 5\), respectively.

Exercise 2.4

When \(\mathsf {P}(A)=\mathsf {P}(B)=\mathsf {P}(AB)\), obtain \( \mathsf {P}\left( AB^c + BA^c \right) \).

Exercise 2.5

Consider rolling a fair die. For \(A=\{1\}\), \(B=\{2, 4\} \), and \(C=\{1, 3, 5, 6\} \), obtain \( \mathsf {P}(A\cup B)\), \( \mathsf {P}(A\cup C)\), and \( \mathsf {P}(A\cup B\cup C)\).

Exercise 2.6

Consider the events \(A= ( - \infty , r ]\) and \(B=( - \infty , s]\) with \(r \le s\) in the sample space of real numbers.

  1. (1)

    Express \(C= ( r, s ]\) in terms of \(A\) and \(B\).

  2. (2)

    Show that \(B= A\cup C\) and \(A\cap C= \emptyset \).

Exercise 2.7

When ten distinct red and ten distinct black balls are randomly arranged into a single line, find the probability that red and black balls are placed in an alternating fashion.

Exercise 2.8

Consider two branches between two nodes in a circuit. One of the two branches is a resistor and the other is a series connection of two resistors. Obtain the probability that the two nodes are disconnected assuming that the probability for a resistor to be disconnected is p and disconnection in a resistor is not influenced by the status of other resistors.

Exercise 2.9

Show that \(A^c\) and \(B\) are independent of each other and that \(A^c\) and \(B^c\) are independent of each other when \(A\) and \(B\) are independent of each other.

Exercise 2.10

Assume the sample space \(S = \{ 1, 2, 3\}\) and event space \(\mathcal {F}=2^S\). Show that no two events, except S and \(\emptyset \), are independent of each other for any probability measure such that \(\mathsf {P}(1)>0\), \(\mathsf {P}(2)>0\), and \(\mathsf {P}(3)>0\).

Exercise 2.11

For two events \(A\) and \(B\), show the following:

  1. (1)

    If \(\mathsf {P}(A)=0\), then \(\mathsf {P}(AB)=0\).

  2. (2)

    If \(\mathsf {P}(A)=1\), then \(\mathsf {P}(AB)=\mathsf {P}(B)\).

Exercise 2.12

Among 100 lottery tickets sold each week, one is a winning ticket. When a ticket costs 10 euros and we have 500 euros, does buying 50 tickets in one week bring us a higher probability of getting the winning ticket than buying one ticket over 50 weeks?

Exercise 2.13

In rolling a fair die twice, find the probability that the sum of the two outcomes is 7 when we have 3 from the first rolling.

Exercise 2.14

When a pair of fair dice are rolled once, find \(\mathsf {P}(a-2b <0)\), where a and b are the face values of the two dice with \(a \ge b\).

Exercise 2.15

When we choose subsets \(A\), \(B\), and \(C\) from \(D = \{1, 2, \ldots , k \}\) randomly, find the probability that \(C\cap ( A- B)^c = \emptyset \).

Exercise 2.16

Denote the four vertices of a regular tetrahedron by A, B, C, and D. In each movement from one vertex to another, the probability of arriving at another vertex is \(\frac{1}{3}\) for each of the three vertices. Find the probabilities \(p_{n,A}\) and \(p_{n,B}\) that we arrive at A and B, respectively, after n movements starting from A. Obtain the values of \(p_{10,A}\) and \(p_{10,B}\) when \(n=10\).
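One way to check an answer to Exercise 2.16 numerically is the recursion \(p_{n,A} = \frac{1}{3}\left( 1 - p_{n-1,A}\right) \): to be at A after n movements, we must be at one of the other three vertices after \(n-1\) movements and then move to A. A minimal sketch (the function name is hypothetical):

```python
def tetra_probs(n):
    """Probabilities (p_A, p_B) of being at vertices A and B after n moves,
    starting from A.  Recursion: p_A(n) = (1 - p_A(n-1)) / 3 with p_A(0) = 1;
    by symmetry, B, C, and D share the remaining probability equally."""
    pA = 1.0
    for _ in range(n):
        pA = (1.0 - pA) / 3.0
    return pA, (1.0 - pA) / 3.0

pA, pB = tetra_probs(10)
print(pA, pB)   # pA matches the closed form 1/4 + (3/4)(-1/3)**10
```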

Exercise 2.17

A box contains N balls each marked with a number 1, 2, \(\ldots \), and N, respectively. Each of N students with identification (ID) numbers 1, 2, \(\ldots \), and N, respectively, chooses a ball randomly from the box. If the number marked on the ball and the ID number of the student are the same, then it is called a match.

  1. (1)

    Find the probability of no match.

  2. (2)

    Using conditional probability, obtain the probability in (1) again.

  3. (3)

    Find the probability of k matches.

Exercise 2.18

In the interval [0, 1] on a line of real numbers, two points are chosen randomly. Find the probability that the distance between the two points is shorter than \( \frac{1}{2} \).
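A Monte Carlo estimate is a quick sanity check for an answer to Exercise 2.18; the helper name, seed, and sample size below are arbitrary choices.

```python
import random

def prob_close(n_trials=200000, d=0.5):
    """Estimate P(|U1 - U2| < d) for two independent uniform points in [0, 1]
    by simulation; geometrically the exact value is 1 - (1 - d)**2."""
    random.seed(1)
    hits = sum(1 for _ in range(n_trials)
               if abs(random.random() - random.random()) < d)
    return hits / n_trials

print(prob_close())   # close to 3/4 for d = 1/2
```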

Exercise 2.19

Consider the probability space composed of the sample space \(S = \{\text{ all } \text{ pairs } (k, m) \text{ of } \text{ natural } \text{ numbers }\}\) and probability measure

$$\begin{aligned} \mathsf {P}((k,m)) \ = \ \alpha (1-p)^{k+m-2} , \end{aligned}$$
(2.E.1)

where \(\alpha \) is a constant and \(0< p < 1\).

  1. (1)

    Determine the constant \(\alpha \). Then, obtain the probability \( \mathsf {P}((k,m): \, k \ge m)\).

  2. (2)

    Obtain the probability \( \mathsf {P}((k,m): \, k + m=r)\) as a function of \(r \in \left\{ 2, 3, \ldots \right\} \). Confirm that the result is a probability measure.

  3. (3)

    Obtain the probability \( \mathsf {P}((k,m): \, k \text{ is } \text{ an } \text{ odd } \text{ number})\).

Exercise 2.20

Obtain \( \mathsf {P}\left( A\cap B\right) \), \( \mathsf {P}\left( \left. A\right| B\right) \), and \( \mathsf {P}\left( \left. B\right| A\right) \) when \( \mathsf {P}\left( A\right) =0.7\), \( \mathsf {P}\left( B\right) =0.5\), and \( \mathsf {P}\left( \left[ A\cup B\right] ^c \right) =0.1\).

Exercise 2.21

Three people shoot at a target. Let the event of a hit by the i-th person be \(A_i\) for \(i=1, 2, 3\) and assume the three events are independent of each other. When \( \mathsf {P}\left( A_1 \right) =0.7\), \( \mathsf {P}\left( A_2 \right) =0.9\), and \( \mathsf {P}\left( A_3 \right) =0.8\), find the probability that only two people will hit the target.

Exercise 2.22

In testing circuit elements, let \(A= \{{\text {defective element}}\}\) and \(B= \{{\text {element identified as defective}}\}\), and \( \mathsf {P}\left( \left. B\right| A\right) = p\), \( \mathsf {P}\left. \left( B^c \right| A^c \right) = q\), \( \mathsf {P}\left( A\right) = r\), and \(\mathsf {P}(B) = s\). Because the test is not perfect, two types of errors could occur: a false negative, ‘a defective element is identified to be fine’; or a false positive, ‘a functional element is identified to be defective’. Assume that the production and testing of the elements can be adjusted such that the parameters p, q, r, and s are very close to 0 or 1.

  1. (1)

    For each of the four parameters, explain whether it is more desirable to make it closer to 0 or 1.

  2. (2)

    Describe the meaning of the conditional probabilities \( \mathsf {P}\left( \left. B^c \right| A\right) \) and \( \mathsf {P}\left( B\left| A^c \right. \right) \).

  3. (3)

    Describe the meaning of the conditional probabilities \( \mathsf {P}\left( \left. A^c \right| B\right) \) and \( \mathsf {P}\left( \left. A\right| B^c \right) \).

  4. (4)

    Given the values of the parameters p, q, r, and s, obtain the probabilities in (2) and (3).

  5. (5)

    Obtain the sample space of this experiment.

Exercise 2.23

For three events \(A\), \(B\), and \(C\), show the following results without using Venn diagrams:

  1. (1)

    \( \mathsf {P}(A\cup B) = \mathsf {P}(A) + \mathsf {P}(B) - \mathsf {P}(AB)\).

  2. (2)

    \( \mathsf {P}(A\cup B\cup C) = \mathsf {P}(A) + \mathsf {P}(B) + \mathsf {P}(C) + \mathsf {P}(ABC) - \mathsf {P}(AB) - \mathsf {P}(AC) - \mathsf {P}(BC)\).

  3. (3)

    Union upper bound, i.e.,

    $$\begin{aligned} \mathsf {P}\left( \overset{n}{\underset{i=1}{\cup }} A_i \right) \ \le \ \sum ^n_{i=1} \mathsf {P}\left( A_i \right) . \end{aligned}$$
    (2.E.2)

Exercise 2.24

For the sample space S and events E, F, and \(\left\{ B_i \right\} _{i=1}^{\infty }\), show that the conditional probability satisfies the axioms of probability as follows:

  1. (1)

    \(0\le \mathsf {P}(E|F)\le 1.\)

  2. (2)

    \(\mathsf {P}(S|F)=1.\)

  3. (3)

    \( \mathsf {P}\left( \left. \underset{i=1}{\overset{\infty }{\cup }}B_i \right| F \right) = \sum \limits _{i=1}^{\infty } \mathsf {P}\left( \left. B_i \right| F \right) \) when the events \(\left\{ B_i \right\} _{i=1}^{\infty }\) are mutually exclusive.

Exercise 2.25

Assume an event B and a collection \(\left\{ A_i \right\} _{i=1}^{n}\) of events, where \(\left\{ A_i \right\} _{i=1}^{n}\) is a partition of the sample space S.

  1. (1)

    Explain whether or not \(A_i\) and \(A_j\) for \(i \ne j\) are independent of each other.

  2. (2)

    Obtain a partition of B using \(\left\{ A_i \right\} _{i=1}^{n}\).

  3. (3)

    Show the total probability theorem \( \mathsf {P}(B) = \sum \limits _{i=1}^n \mathsf {P}\left( B \left| A_i \right. \right) \mathsf {P}\left( A_i \right) \).

  4. (4)

    Show the Bayes’ theorem \( \mathsf {P}\left( \left. A_k \right| B \right) = \frac{ \mathsf {P}\left( B \left| A_k \right. \right) \mathsf {P}\left( A_k \right) }{\sum \limits _{i=1}^{n} \mathsf {P}\left( B \left| A_i \right. \right) \mathsf {P}\left( A_i \right) }\).

Exercise 2.26

Box 1 contains two red and three green balls and Box 2 contains one red and four green balls. Obtain the probability of selecting a red ball when a ball is selected from a randomly chosen box.
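Exercise 2.26 is a direct application of the total probability theorem, \( \mathsf {P}(R) = \sum _i \mathsf {P}\left( R \left| B_i \right. \right) \mathsf {P}\left( B_i \right) \), which can be checked numerically; the helper below is illustrative.

```python
def prob_red(boxes, box_probs):
    """Total probability theorem: P(R) = sum_i P(R | Box i) P(Box i).
    `boxes` lists (red, green) counts per box."""
    return sum(p * r / (r + g) for (r, g), p in zip(boxes, box_probs))

# Box 1: two red, three green; Box 2: one red, four green; boxes equally likely
print(prob_red([(2, 3), (1, 4)], [0.5, 0.5]))   # 0.5 * 2/5 + 0.5 * 1/5 = 0.3
```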

Exercise 2.27

Box i contains i red and \((n-i)\) green balls for \(i=1, 2, \ldots , n\). Choosing Box i with probability \(\frac{2i}{n(n+1)}\), obtain the probability of selecting a red ball when a ball is selected from the box chosen.

Exercise 2.28

A group of people elects one person via rock-paper-scissors. If exactly one person wins, then that person is chosen; otherwise, the rock-paper-scissors is repeated. Assume that rock, paper, and scissors each have probability \(\frac{1}{3}\) for every person, independently of the other people. Obtain the probability \(p_{n,k}\) that n people will elect one person in k trials.

Exercise 2.29

In an election, Candidates \({A}\) and \({B}\) will get n and m votes, respectively. When \(n>m\), find the probability that Candidate \({A}\) will always have more counts than Candidate \({B}\) during the ballot-counting.

Exercise 2.30

A type O cell is cultured at time 0. After one hour, the cell will become

$$\begin{aligned} \begin{array}{ll} \text{ two } \text{ type } \text{ O } \text{ cells },&{} \text{ probability } \text{= } \frac{1}{4},\\ \text{ one } \text{ type } \text{ O } \text{ cell, } \text{ one } \text{ type } \text{ M } \text{ cell },&{} \text{ probability } \text{= } \frac{2}{3},\\ \text{ two } \text{ type } \text{ M } \text{ cells },&{} \text{ probability } \text{= } \frac{1}{12} . \end{array} \end{aligned}$$
(2.E.3)

A new type O cell behaves like the first type O cell and a type M cell will disappear in one hour, where a change is not influenced by any other change. Find the probability \(\beta _0\) that no type M cell will appear until \(n+ \frac{1}{2} \) hours from the starting time.

Exercise 2.31

Find the probability of the event \(A\) that 5 or 6 appears k times when a fair die is rolled n times.

Fig. 2.19

A binary communication channel

Exercise 2.32

Consider a communication channel for signals of binary digits (bits) 0 and 1. Due to the influence of noise, two types of errors can occur as shown in Fig. 2.19: specifically, 0 and 1 can be identified to be 1 and 0, respectively. Let the transmitted and received bits be X and Y, respectively. Assume a priori probability of \( \mathsf {P}\left( X=1\right) =p\) for 1 and \( \mathsf {P}\left( X=0\right) =1-p\) for 0, and the effect of noise on a bit is not influenced by that on other bits. Denote the probability that the received bit is i when the transmitted bit is i by \( \mathsf {P}\left( \left. Y=i \right| X=i \right) =p_{ii}\) for \(i=0,1\).

  1. (1)

    Obtain the probabilities \(p_{10}= \mathsf {P}( Y=0 | X=1 )\) and \(p_{01}= \mathsf {P}( Y=1 | X=0 )\) that an error occurs when bits 1 and 0 are transmitted, respectively.

  2. (2)

    Obtain the probability that an error occurs.

  3. (3)

    Obtain the probabilities \(\mathsf {P}( Y=1 )\) and \(\mathsf {P}( Y=0 )\) that the received bit is identified to be 1 and 0, respectively.

  4. (4)

    Obtain all a posteriori probabilities \( \mathsf {P}( X = j |Y = k ) \) for \(j=0,1\) and \(k=0, 1\).

  5. (5)

    When \(p=0.5\), obtain \( \mathsf {P}( X=1 | Y=0 )\), \( \mathsf {P}( X=1 |Y=1 )\), \(\mathsf {P}( Y=1 )\), and \(\mathsf {P}( Y=0 )\) for a symmetric channel with \(p_{00}=p_{11}\).
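The a posteriori probabilities in part (4) follow from Bayes' theorem; a small numerical sketch (the function name and data layout are illustrative choices):

```python
def channel_posteriors(p, p00, p11):
    """Posterior probabilities P(X = j | Y = k) for the binary channel,
    via Bayes' theorem; p is the a priori probability P(X = 1)."""
    pY = {1: p * p11 + (1 - p) * (1 - p00),       # P(Y = 1)
          0: p * (1 - p11) + (1 - p) * p00}       # P(Y = 0)
    likelihood = {(1, 1): p11, (1, 0): 1 - p11,   # keys (j, k): P(Y = k | X = j)
                  (0, 0): p00, (0, 1): 1 - p00}
    prior = {1: p, 0: 1 - p}
    return {(j, k): likelihood[(j, k)] * prior[j] / pY[k]
            for j in (0, 1) for k in (0, 1)}

post = channel_posteriors(0.5, 0.9, 0.9)   # symmetric channel, p = 0.5
print(post[(1, 1)], post[(1, 0)])          # close to 0.9 and 0.1
```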

Exercise 2.33

Assume a pile of n integrated circuits (ICs), among which m are defective ones. When an IC is chosen randomly from the pile, the probability that the IC is defective is \(\alpha _1 = \frac{m}{n}\) as shown in Example 2.3.8.

  1. (1)

    Assume we pick one IC and then one more IC without replacing the first one back to the pile. Obtain the probabilities \(\alpha _{1,1}\), \(\alpha _{0,1}\), \(\alpha _{1,0}\), and \(\alpha _{0,0}\) that both are defective, the first one is not defective and the second one is defective, the first one is defective and the second one is not defective, and neither the first nor the second one is defective, respectively.

  2. (2)

    Now assume we pick one IC and then one more IC after replacing the first one back to the pile. Obtain the probabilities \(\alpha _{1,1}\), \(\alpha _{0,1}\), \(\alpha _{1,0}\), and \(\alpha _{0,0}\) again.

  3. (3)

    Assume we pick two ICs randomly from the pile. Obtain the probabilities \(\beta _{0}\), \(\beta _{1}\), and \(\beta _{2}\) that neither is defective, one is defective and the other is not defective, and both are defective, respectively.

Exercise 2.34

Box 1 contains two old and three new erasers and Box 2 contains one old and six new erasers. We perform the experiment “choose one box randomly and pick an eraser at random” twice, during which we discard the first eraser picked.

  1. (1)

    Obtain the probabilities \(P_2\), \(P_1\), and \(P_0\) that both erasers are old, one is old and the other is new, and both erasers are new, respectively.

  2. (2)

    When both erasers are old, obtain the probability \(P_{3}\) that one is from Box 1 and the other is from Box 2.

Exercise 2.35

The probability for a couple to have k children is \(\alpha p^k\) with \(0<p<1\).

  1. (1)

    Each child has brown eyes with probability b, independently of the other children. Obtain the probability that the couple has r children with brown eyes.

  2. (2)

    Assuming that each child is a girl or a boy with probability \( \frac{1}{2} \) each, obtain the probability that the couple has r boys.

  3. (3)

    Assuming that each child is a girl or a boy with probability \( \frac{1}{2} \) each, obtain the probability that the couple has at least two boys when the couple has at least one boy.

Exercise 2.36

For the pmf \(p(x)= \, _{r+x-1}\text{ C}_{r-1} \alpha ^r (1-\alpha )^x\), \(x \in \mathbb {J}_0\) introduced in (2.5.14), show

$$\begin{aligned} \lim \limits _{r \rightarrow \infty } p( x) \ = \ \frac{\lambda ^x}{x!}e^{-\lambda } , \end{aligned}$$
(2.E.4)

which implies \(\lim \limits _{r \rightarrow \infty } { NB} \left( r,\frac{r}{r+\lambda }\right) = P (\lambda )\).
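The limit (2.E.4) can be checked numerically by evaluating the negative binomial pmf at a large r; the parameter values and tolerance below are arbitrary choices.

```python
from math import comb, exp, factorial

def nb_pmf(r, alpha, x):
    """Negative binomial pmf p(x) = C(r+x-1, r-1) alpha^r (1-alpha)^x of (2.5.14)."""
    return comb(r + x - 1, r - 1) * alpha**r * (1 - alpha)**x

def poisson_pmf(lam, x):
    """Poisson pmf lambda^x e^{-lambda} / x!."""
    return lam**x / factorial(x) * exp(-lam)

lam, r = 2.0, 10000
for x in range(5):
    print(x, nb_pmf(r, r / (r + lam), x), poisson_pmf(lam, x))  # columns agree closely
```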

Exercise 2.37

A person plans to buy a car of price N units. The person has k units and wishes to earn the remaining from a game. In the game, the person wins and loses 1 unit when the outcome is a head and a tail, respectively, from a toss of a coin with probability p for a head and \(q=1-p\) for a tail. Assuming \(0<k<N\) and the person continues the game until the person earns enough for the car or loses all the money, find the probability that the person loses all the money. This problem is called the gambler’s ruin problem.
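For experimenting with Exercise 2.37, the well-known closed form of the ruin probability can be coded directly; deriving it is the point of the exercise, so the formula is stated here without proof.

```python
def ruin_probability(k, N, p):
    """Gambler's ruin: starting with k units and playing until reaching N or 0,
    with win probability p per toss.  With rho = q/p:
    P(ruin) = (rho**k - rho**N) / (1 - rho**N) if p != 1/2, and 1 - k/N if p = 1/2."""
    q = 1.0 - p
    if abs(p - q) < 1e-12:
        return 1.0 - k / N
    rho = q / p
    return (rho**k - rho**N) / (1.0 - rho**N)

print(ruin_probability(5, 10, 0.5))   # 0.5 for a fair coin, halfway to the goal
print(ruin_probability(5, 10, 0.6))   # smaller: a favorable game ruins less often
```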

Exercise 2.38

A large box contains a large number of bundles, each with 25 tulip bulbs. The bundles are of types \(R_5\) and \(R_{15}\) in proportions \(\frac{3}{4}\) and \(\frac{1}{4}\), respectively. A type \(R_5\) bundle contains five red and twenty white bulbs, and a type \(R_{15}\) bundle contains fifteen red and ten white bulbs. A bulb, chosen randomly from a bundle selected at random from the box, is planted.

  1. (1)

    Obtain the probability \(p_{1}\) that a red tulip blossoms.

  2. (2)

    Obtain the probability \(p_2\) that a white tulip blossoms.

  3. (3)

    When a red tulip blossoms, obtain the conditional probability that the bulb is from a type \(R_{15}\) bundle.

Exercise 2.39

For a probability space with the sample space \(\varOmega = \mathbb {J}_0 = \{ 0, 1, \ldots \}\) and pmf

$$\begin{aligned} p(x) \ = \ \left\{ \begin{array}{llll} 5c^2+c, &{} x=0; &{} \quad 3-13c, &{} x=1; \\ c, &{} x=2; &{} \quad 0, &{} \text{ otherwise };\end{array} \right. \end{aligned}$$
(2.E.5)

determine the constant c.
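Since the pmf must sum to one, (2.E.5) gives the quadratic \(5c^2 - 11c + 2 = 0\); a short check that also enforces \(0 \le p(x) \le 1\) for each value:

```python
import math

# 5c^2 + c + (3 - 13c) + c = 1  =>  5c^2 - 11c + 2 = 0
a, b, d = 5, -11, 2
roots = [(-b + s * math.sqrt(b * b - 4 * a * d)) / (2 * a) for s in (1, -1)]
# keep only roots for which every probability lies in [0, 1]
valid = [c for c in roots
         if all(0 <= p <= 1 for p in (5 * c * c + c, 3 - 13 * c, c))]
print(valid)
```

Only one root survives the non-negativity constraint.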

Exercise 2.40

Show that

$$\begin{aligned} \frac{x}{1+x^2}~\phi (x) \,< \, Q(x) \, < \, \frac{1}{2} \exp \left( -\frac{x^2}{2}\right) \end{aligned}$$
(2.E.6)

for \(x >0\), where \(\phi (x)\) denotes the standard normal pdf, i.e., (2.5.27) with \(m=0\) and \(\sigma ^2=1\), and  

$$\begin{aligned} Q(x) \ = \ \frac{1}{\sqrt{2\pi } } \int ^{\infty }_{x} \exp \left( -\frac{t^2}{2}\right) ~dt . \end{aligned}$$
(2.E.7)
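The bounds in (2.E.6) can be spot-checked numerically: the tail probability (2.E.7) equals \(\frac{1}{2}\,\mathrm{erfc}\!\left(x/\sqrt{2}\right)\), which the standard library provides. The grid of test points below is an arbitrary choice:

```python
import math

def Q(x):
    # Gaussian tail probability: Q(x) = erfc(x / sqrt(2)) / 2
    return 0.5 * math.erfc(x / math.sqrt(2))

def phi(x):
    # standard normal pdf
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

for x in [0.1, 0.5, 1.0, 2.0, 4.0, 8.0]:
    assert x / (1 + x * x) * phi(x) < Q(x) < 0.5 * math.exp(-x * x / 2)
print("bounds in (2.E.6) hold on the test grid")
```

A numerical check on a grid is of course not a proof, but it catches sign or constant errors quickly.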

Exercise 2.41

Balls with colors \(C_1\), \(C_2\), \(\ldots \), \(C_n\) are contained in k boxes. Let the probability of choosing Box \(B_j\) be \( \mathsf {P}\left( B_j \right) = b_j\) and that of choosing a ball with color \(C_i\) from Box \(B_j\) be \( \mathsf {P}\left. \left( C_i \right| B_j \right) = c_{ij}\), where \(\left\{ \sum \limits _{i=1}^{n} c_{ij} = 1 \right\} _{j=1}^{k}\) and \(\sum \limits _{j=1}^{k} b_j = 1\). A box is chosen first and then a ball is chosen from the box.

  1. Show that, if \(\left\{ c_{i1} = c_{i2} = \cdots = c_{ik} \right\} _{i=1}^{n}\), the color of the ball chosen is independent of the choice of a box, i.e., \(\left\{ \left\{ \mathsf {P}\left( C_i B_j \right) = \mathsf {P}\left( C_i \right) \mathsf {P}\left( B_j \right) \right\} _{i=1}^{n} \right\} _{j=1}^{k}\), for any values of \(\left\{ b_j \right\} _{j=1}^{k}\).

  2. When \(n=2\), \(k=3\), \(b_1 = b_3 = \frac{1}{4}\), and \(b_2= \frac{1}{2} \), express the condition for \( \mathsf {P}\left( C_1 B_1 \right) = \mathsf {P}\left( C_1 \right) \mathsf {P}\left( B_1 \right) \) to hold true in terms of \(\left\{ c_{11}, c_{12}, c_{13}\right\} \).

Exercise 2.42

Boxes 1, 2, and 3 contain four red and five green balls, one red and one green ball, and one red and two green balls, respectively. Assume that the probabilities of the event \(B_i\) of choosing Box i are \( \mathsf {P}\left( B_1 \right) = \mathsf {P}\left( B_3 \right) = \frac{1}{4}\) and \( \mathsf {P}\left( B_2 \right) = \frac{1}{2} \). After a box is selected, a ball is chosen randomly from the box. Denote the events that the ball is red and green by R and G, respectively.

  1. Are the events \(B_1\) and R independent of each other? Are the events \(B_1\) and G independent of each other?

  2. Are the events \(B_2\) and R independent of each other? Are the events \(B_3\) and G independent of each other?
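The four independence questions reduce to comparing \(\mathsf {P}\left( B_i \right) \mathsf {P}\left( \cdot \mid B_i \right)\) with the product of marginals; a sketch in exact rationals (results printed at run time):

```python
from fractions import Fraction

P_B = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}
# box contents (red, green): (4,5), (1,1), (1,2)
P_R_given_B = {1: Fraction(4, 9), 2: Fraction(1, 2), 3: Fraction(1, 3)}

P_R = sum(P_B[i] * P_R_given_B[i] for i in P_B)   # total probability of red
P_G = 1 - P_R

indep = {
    ("B1", "R"): P_B[1] * P_R_given_B[1] == P_B[1] * P_R,
    ("B1", "G"): P_B[1] * (1 - P_R_given_B[1]) == P_B[1] * P_G,
    ("B2", "R"): P_B[2] * P_R_given_B[2] == P_B[2] * P_R,
    ("B3", "G"): P_B[3] * (1 - P_R_given_B[3]) == P_B[3] * P_G,
}
print(P_R, indep)
```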

Exercise 2.43

For the sample space \(\varOmega = \{ 1,2,3,4 \}\) with \(\left\{ \mathsf {P}(i) = \frac{1}{4} \right\} _{i=1}^{4}\), consider \(A_1 = \{ 1,3,4 \}\), \(A_2 = \{ 2,3,4 \}\), and \(A_3 = \{ 3 \}\). Are the three events \(A_1\), \(A_2\), and \(A_3\) independent of each other?
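With only four equally likely outcomes, the pairwise and triple product conditions can be checked by direct enumeration:

```python
from fractions import Fraction
from itertools import combinations

omega = {1, 2, 3, 4}
P = lambda E: Fraction(len(E), len(omega))   # equally likely outcomes
A = {1: {1, 3, 4}, 2: {2, 3, 4}, 3: {3}}

pairwise = all(P(A[i] & A[j]) == P(A[i]) * P(A[j])
               for i, j in combinations(A, 2))
triple = P(A[1] & A[2] & A[3]) == P(A[1]) * P(A[2]) * P(A[3])
print(pairwise, triple)
```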

Exercise 2.44

Consider two consecutive experiments with possible outcomes \(A\) and \(B\) for the first experiment and \(C\) and \(D\) for the second experiment. When \( \mathsf {P}\left( AC\right) = \frac{1}{3} \), \( \mathsf {P}\left( AD\right) = \frac{1}{6} \), \( \mathsf {P}\left( BC\right) = \frac{1}{6} \), and \( \mathsf {P}\left( BD\right) = \frac{1}{3} \), are \(A\) and \(C\) independent of each other?
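The marginals follow by summing the joint probabilities; a one-line independence check in exact arithmetic:

```python
from fractions import Fraction

P = {"AC": Fraction(1, 3), "AD": Fraction(1, 6),
     "BC": Fraction(1, 6), "BD": Fraction(1, 3)}

P_A = P["AC"] + P["AD"]   # marginal of the first experiment
P_C = P["AC"] + P["BC"]   # marginal of the second experiment
print(P_A, P_C, P["AC"] == P_A * P_C)
```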

Exercise 2.45

Two people make an appointment to meet between 10 and 11 o’clock. Find the probability that they can meet assuming that each person arrives at the meeting place between 10 and 11 o’clock independently and waits only up to 10 minutes.
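A Monte Carlo sketch of the setup (arrival times in minutes over the 60-minute window; the sample size and seed are arbitrary choices), whose estimate can be compared with the geometric-probability answer:

```python
import random

random.seed(0)
n = 200_000
# each person arrives uniformly in [0, 60]; they meet if the gap <= 10
hits = sum(abs(random.uniform(0, 60) - random.uniform(0, 60)) <= 10
           for _ in range(n))
estimate = hits / n
print(estimate)
```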

Exercise 2.46

Consider two children. Assume each child is equally likely to be a girl or a boy. Find the probability \(p_{1}\) that both are boys when the elder is a boy and the probability \(p_2\) that both are boys when at least one is a boy.
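Because the sample space has only four equally likely outcomes, both conditional probabilities can be computed by enumeration:

```python
from fractions import Fraction
from itertools import product

space = list(product("GB", repeat=2))   # (elder, younger), equally likely
P = lambda E: Fraction(len(E), len(space))

both = [s for s in space if s == ("B", "B")]
elder_boy = [s for s in space if s[0] == "B"]
at_least_one = [s for s in space if "B" in s]

p1 = P(both) / P(elder_boy)      # conditioned on the elder being a boy
p2 = P(both) / P(at_least_one)   # conditioned on at least one boy
print(p1, p2)
```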

Exercise 2.47

There are three red and two green balls in Box 1, and four red and three green balls in Box 2. A ball is randomly chosen from Box 1 and put into Box 2. Then, a ball is picked from Box 2. Find the probability that the ball picked from Box 2 is red.
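Conditioning on the color of the transferred ball gives a two-term total-probability sum; in exact rationals:

```python
from fractions import Fraction

# Box 1: 3 red, 2 green; Box 2: 4 red, 3 green
p_move_red = Fraction(3, 5)
p_move_green = Fraction(2, 5)
# after the transfer, Box 2 holds 8 balls
p_red = p_move_red * Fraction(5, 8) + p_move_green * Fraction(4, 8)
print(p_red)
```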

Table 2.2 Some probabilities in the game mighty

Exercise 2.48

Three people \(A\), \(B\), and \(C\) toss a coin each. The person whose outcome is different from those of the other two wins. If the three outcomes are the same, then the toss is repeated.

  1. Show that the game is fair, i.e., that the probability of winning is the same for each of the three people.

  2. Find the probabilities that \(B\) wins exactly eight times and at least eight times when the coins are tossed ten times, not counting rounds with no winner.
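Taking part (1) for granted (each player wins a decisive round with probability \(\frac{1}{3}\)), part (2) is a binomial computation that can be checked exactly:

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 3)   # probability that B wins a decisive round
exactly_8 = comb(10, 8) * p**8 * (1 - p)**2
at_least_8 = sum(comb(10, k) * p**k * (1 - p)**(10 - k)
                 for k in range(8, 11))
print(exactly_8, at_least_8)
```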

Exercise 2.49

A game called mighty can be played by three, four, or five players. When it is played with five players, 53 cards are used by adding one joker to a deck of 52 cards. Among the 53 cards, the ace of spades is called the mighty, except when the suit of spades is declared the royal suit. In the play, ten cards are distributed to each of the five players \(\left\{ G_i \right\} _{i=1}^5\) and the remaining three cards are left on the table, face down. Assume that what Player \(G_1\) murmurs is always true and consider the two cases (A) Player \(G_1\) murmurs “Oh! I do not have the joker.” and (B) Player \(G_1\) murmurs “Oh! I have neither the mighty nor the joker.” For convenience, regard the three cards on the table as Player \(G_6\). Obtain the following probabilities and thereby confirm Table 2.2:

  1. Player \(G_i\) has the joker.

  2. Player \(G_i\) has the mighty.

  3. Player \(G_i\) has either the mighty or the joker.

  4. Player \(G_i\) has at least one of the mighty and the joker.

  5. Player \(G_i\) has both the mighty and the joker.

Exercise 2.50

In a group of 30 men and 20 women, \(40\%\) of men and \(60\%\) of women play piano. When a person in the group plays piano, find the probability that the person is a man.
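A quick check of the Bayes computation (counts weighted by the playing rates; the answer is printed at run time):

```python
from fractions import Fraction

men, women = 30, 20
p_piano_man, p_piano_woman = Fraction(2, 5), Fraction(3, 5)   # 40%, 60%

# P(man | piano) by Bayes' rule over the two groups
p_man_given_piano = (men * p_piano_man) / (men * p_piano_man
                                           + women * p_piano_woman)
print(p_man_given_piano)
```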

Exercise 2.51

The probabilities that an automobile passing through a toll gate is a car, a truck, and a bus are 0.5, 0.3, and 0.2, respectively. Find the probability that 30 cars, 15 trucks, and 5 buses have passed when 50 automobiles have passed the toll gate.
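This is a multinomial probability, \(\frac{50!}{30!\,15!\,5!}\,(0.5)^{30}(0.3)^{15}(0.2)^{5}\), which can be evaluated with iterated binomial coefficients:

```python
from math import comb

# multinomial coefficient 50! / (30! 15! 5!) via iterated binomials
coef = comb(50, 30) * comb(20, 15)
p = coef * 0.5**30 * 0.3**15 * 0.2**5
print(p)
```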


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG


Cite this chapter

Song, I., Park, S.R., Yoon, S. (2022). Fundamentals of Probability. In: Probability and Random Variables: Theory and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-97679-8_2
