Fundamentals of Probability and Stochastic Processes with Applications to Communications, pp 51–71
Probability Theory
Abstract
This chapter defines the central concepts and terms used in probability theory, including the random experiment, space, event, elementary event, combined experiment, Cartesian product, etc. This chapter presents the axiomatic formulation of probability theory based on three axioms and shows how set operations are used in probability theory. This chapter also discusses conditional probability, the total probability theorem, Bayes' theorem, and the independence of random events. Examples of a reliability problem and a communications signal detection problem are discussed.
3.1 Random Experiments
A random experiment consists of executing a certain action or procedure under controlled conditions and taking an observation or a measurement on the outcome produced by the action. The experiment is called random because the outcome is unpredictable.
3.1.1 Space Ω
Each execution of a random experiment produces one outcome. A single execution of a random experiment that produces an outcome is called a trial. For example, in a die-throwing experiment, a trial produces exactly one of the six possible outcomes in Ω.
3.1.2 Event
In the dictionary, the word event is defined as outcome. We have already encountered the word outcome while discussing a random experiment. Therefore, we need a clear understanding of the difference between the two words, event and outcome, before we proceed.
An event is the result of an experiment that is of interest or concern. To take an example, suppose that, in a die-throwing game, you would win $10 if your die throw shows an outcome less than four. Here, the word “outcome” is a specific showing of the die face. The event of your interest is “winning $10.” Your die throw, a trial, would produce one outcome. If that outcome is either 1, 2, or 3, you would win the prize: the event of “winning $10” would occur if the outcome of the trial is 1, 2, or 3. Among all possible outcomes of the die-throwing experiment, that is, Ω = {1, 2, 3, 4, 5, 6}, there are three specific outcomes, 1, 2, and 3, that would make the event happen. These three numbers are members of a subset of Ω, A = {1, 2, 3}. The event “winning $10” is represented by the subset A.
So, “event” is defined as follows: an event is a subset of the space Ω consisting of the elements that make the event happen. To form the subset defining an event, consider each element of Ω and determine whether it would make the event happen; if yes, the element is included in the subset.
The event consisting of all possible elements of an experiment, that is, Ω, is called a certain event, and the event which has no elements, that is, the empty set ∅, an impossible event.
An event consisting of a single element is called an elementary event, for example, in a die-throwing experiment, {1}, {2}, {3}, {4}, {5}, and {6}, and in a coin-tossing experiment, {heads} and {tails}. A key distinction to make here is that an element written by itself as “1,” “2,” etc. is an outcome, whereas a single outcome shown in braces as in {1} is an elementary event.
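The distinction between the space, an event as a subset, and an elementary event can be sketched in a few lines of code; the names below are illustrative, not from the text.

```python
# Space, event, and elementary event for the die-throwing experiment.
omega = {1, 2, 3, 4, 5, 6}          # space: all possible outcomes
A = {o for o in omega if o < 4}     # event "winning $10" = {1, 2, 3}

def occurs(event, outcome):
    """An event occurs when the outcome of a trial is one of its elements."""
    return outcome in event

print(A)                    # the subset of omega representing the event
print(occurs(A, 2))         # True: outcome 2 makes the event happen
print({1} <= omega)         # an elementary event is a one-element subset
```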
3.1.3 Combined Experiments
Whether an event occurs or not is determined by the single outcome of a trial of an experiment. If an event under consideration involves the outcomes of multiple trials of a single experiment or a single or multiple trials of multiple experiments, a new experiment may be defined by combining the original experiments. This new experiment may be called a combined experiment in which a new space is defined as the set of all possible combinations of the outcomes of the individual trials of the original experiments. With this definition, the single outcome produced by a trial of this combined experiment is a unique sequence of the individual outcomes of the original experiments, and the event of the combined experiment is determined by this single outcome of a trial of the combined experiment.
For example, suppose that the event under consideration is determined by the sequence of the outcomes of n trials of a single experiment, e.g., throwing a die n times. A combined experiment may be defined by defining a single trial of the experiment as a sequence of n trials of the original experiment. The space of the combined experiment consists of all possible ordered sequences, that is, n-tuples, of the elements of the original space.
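As a sketch of this construction, the combined space for n = 2 die throws can be built as the set of all ordered pairs of elements of the original space:

```python
from itertools import product

# The space of a combined experiment of n die throws is the set of all
# ordered n-tuples of elements of the original space (here n = 2).
omega = {1, 2, 3, 4, 5, 6}
combined_space = set(product(omega, repeat=2))

print(len(combined_space))          # 6**2 = 36 ordered pairs
print((3, 5) in combined_space)     # True: one outcome of the combined trial
```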
3.1.4 Probabilities and Statistics
In probability analysis, one begins by assigning probabilities to the elementary events or, if the event under consideration is based on events whose probabilities have already been determined, by using those known probabilities. For example, in a die-throwing experiment, first, the probabilities of the elementary events of the six sides, that is, 1/6 for each side, are assigned. Without these initial assignments, one cannot proceed to address more complex problems associated with the outcomes of a die-throwing experiment. The only other option is to try the experiment many times and count the frequencies of the outcomes of interest. For example, in a die-throwing experiment, to determine the probability of an odd number, a die must be thrown many times and the frequencies of odd numbers counted. Even then, a question remains as to how many times the die must be thrown before the probability can be determined.
This dilemma can be avoided by taking one's a priori judgment about the probabilities of the elementary events. The axiomatic approach discussed in the next section allows a probability analysis to begin with one's a priori assignment of the probabilities of the elementary events.
Statistics deals with analyzing the frequencies of the outcomes. Therefore, statistics can provide one with the basis of making a priori judgments, for example, on the probabilities of elementary events.
3.2 Axiomatic Formulation of Probability Theory
The axiomatic formulation of probability theory was introduced by the Russian mathematician Kolmogoroff in 1933. In this approach, all possible outcomes of an experiment form a space Ω. Events are defined by the subsets of the space Ω. Probabilities are determined for the events. Probabilities are “assigned” to the elementary events in Ω as the starting point of the probability analysis. The events and the probabilities must obey a set of axioms presented below.
Given two events A and B in Ω, the probabilities of A and B are denoted by P(A) and P(B). P(A) and P(B) are real numbers, referred to as probability measures, and must obey the following rules:
Axiom I
P(A) ≥ 0 (3.1)
Axiom II
P(Ω) = 1 (3.2)
Axiom III
If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B) (3.3)
Axiom I states that the probability measure assigned to an event is nonnegative. Axiom II states that the probability measure assigned to a certain event is 1. Finally, Axiom III states that, if two events A and B are mutually exclusive with the probabilities P(A) and P(B), respectively, the probability that either A or B or both would occur, that is, the probability of the event A ∪ B, is the sum of the two probabilities P(A) and P(B).
Example 3.2.1
To illustrate Axiom III , consider that A represents the event that Tom will attend a conference in Philadelphia tomorrow at 9 AM and B the event that Tom will travel to Boston tomorrow at 9 AM. Assume that P(A) = 0.1 and P(B) = 0.2. Clearly, A and B are mutually exclusive because Tom cannot be at two places at the same time. Then, the probability that Tom will either attend the conference in Philadelphia or travel to Boston is the sum of the two probabilities, 0.3.
While Axioms I and II give the rules for assigning probability measures, Axiom III gives the rule for deriving the probability measure for a complex event A ∪ B from the probabilities of A and B. In the axiomatic approach, these three axioms are all one needs to formulate a probability problem. The above three axioms together with the employment of set theory are sufficient for developing probability theory.
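A minimal sketch of this formulation, assuming the uniform a priori assignment P({i}) = 1/6 for the die (the event names below are my own), checks all three axioms directly:

```python
from fractions import Fraction

# A finite probability measure built from the assignment P({i}) = 1/6.
omega = {1, 2, 3, 4, 5, 6}
p_elem = {o: Fraction(1, 6) for o in omega}   # a priori assignment

def P(event):
    """Probability of an event (a subset of omega) as the sum of the
    probabilities of its elementary events."""
    return sum(p_elem[o] for o in event)

A, B = {1, 3}, {6}                    # two mutually exclusive events
print(P(A) >= 0 and P(B) >= 0)        # Axiom I: nonnegative measures
print(P(omega) == 1)                  # Axiom II: certain event has measure 1
print(P(A | B) == P(A) + P(B))        # Axiom III: additivity for disjoint events
```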
A generalization of Axiom III is given by the theorem below.
Theorem 3.2.1
If A_1, A_2, …, A_n are mutually exclusive events, then P(A_1 ∪ A_2 ∪ … ∪ A_n) = P(A_1) + P(A_2) + ⋯ + P(A_n).
Proof
By Axiom III, P(A_1 ∪ A_2) = P(A_1) + P(A_2). Since A_1 ∪ A_2 and A_3 are mutually exclusive, applying Axiom III again gives P(A_1 ∪ A_2 ∪ A_3) = P(A_1) + P(A_2) + P(A_3). Continuing this process for n mutually exclusive sets A_1, A_2, …, A_n, we prove the theorem by mathematical induction.
Q.E.D.
Example 3.2.2
In this example, we illustrate the significance of the three axioms in formulating a probability problem using a die-throwing experiment. Suppose that you would win $10 if the number of dots shown after throwing the die is less than four and would win a trip to Philadelphia if it is more than four. What is the probability that you would win $10, a trip to Philadelphia, or both? We will formulate and solve this problem using the three axioms.
For this problem, the space is Ω = {1, 2, 3, 4, 5, 6}. In the axiomatic approach, the formulation of a probability problem starts with the assignment of the probability measures for the basic events, whether they are elementary events or events for which a priori probabilistic information is known. For this problem, we will assign 1/6 to each of the six possible outcomes: P({i}) = 1/6, i = 1, …, 6. In the absence of any a priori information about the six elementary events, 1/6 is a reasonable assignment and satisfies Axiom I. If a priori information, e.g., past experimental data, is available about the die used in the experiment, different probabilities may be assigned. In any event, the key point here is that the formulation starts with the assignment of the probabilities.
Let A = {1, 2, 3} represent the event of winning $10 and B = {5, 6} the event of winning the trip. The event that you will win $10 or a trip to Philadelphia or both is then represented by the union A ∪ B, and we need to determine P(A ∪ B).
The event represented by the union of two events A and B would occur if A or B or both would occur. For mutually exclusive A and B, the probability that both A and B would occur is zero; for the current problem, winning both is impossible because A and B are mutually exclusive. By Axiom III, P(A ∪ B) = P(A) + P(B) = 3/6 + 2/6 = 5/6.
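Example 3.2.2 can be verified numerically; the sketch below assumes equally likely outcomes and takes A (fewer than four dots) and B (more than four dots) as in the example:

```python
from fractions import Fraction

# Example 3.2.2: P(A ∪ B) for mutually exclusive events, equally likely outcomes.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(omega))

A = {1, 2, 3}    # win $10: fewer than four dots
B = {5, 6}       # win a trip: more than four dots

print(A & B == set())        # mutually exclusive, so Axiom III applies
print(P(A | B))              # 5/6
print(P(A) + P(B))           # also 5/6, confirming P(A ∪ B) = P(A) + P(B)
```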
In addition to Theorem 3.2.1, it is convenient to establish several key theorems up front that follow from the three probability axioms. These theorems are discussed below.
Theorem 3.2.2
P(∅) = 0 (3.5)
Equation (3.5) states that the probability of an impossible event is 0.
Two observations are made regarding this theorem. First, unlike the probability measure 1 of a certain event , which is “assigned” by Axiom II , the probability measure of 0 can be “derived” from the axioms. Second, one might wonder why this is not included as an axiom. Since this measure can be derived from the above axioms, it would be superfluous to include this as an axiom. As can be seen from the proof given below, this theorem could have been included as Axiom II, and the current Axiom II could be derived instead. In any event, it is not necessary to include both.
Proof
The events Ω and ∅ are mutually exclusive, and Ω ∪ ∅ = Ω. By Axiom III, P(Ω) + P(∅) = P(Ω ∪ ∅) = P(Ω), and therefore P(∅) = 0.
Q.E.D.
Theorem 3.2.3
Proof
Q.E.D.
Theorem 3.2.4
For any event A, P(A) ≤ 1.
Proof
Q.E.D.
This theorem shows that, if the three axioms are followed, the probability measure derived for any arbitrary event cannot be greater than 1. Once again, including this statement as an axiom would be superfluous.
Theorem 3.2.5
Proof
Q.E.D.
In summary, a probability problem may be formulated and solved by the following steps:
1. Define the experiment and the probability space Ω.
2. Assign the probabilities of the elementary events.
3. Define the event.
4. Determine the probability of the event.
The following example is a simple probability problem that can be solved without elaborate formulation. However, we will deliberately go through the above steps to illustrate the axiomatic approach to probability formulation.
Example 3.2.3
In a die-throwing game, a number less than 5 wins. Find the probability of winning the game.
Solution
1. The experiment is a single throw of a die, and the space is Ω = {1, 2, 3, 4, 5, 6}.
2. Assign P({i}) = 1/6, i = 1, …, 6.
3. The winning event is A = {1, 2, 3, 4}, the outcomes less than 5.
4. Find P(A). By Theorem 3.2.1, P(A) = P({1}) + P({2}) + P({3}) + P({4}) = 4/6 = 2/3.
3.3 Conditional Probability
Given two events A and B with nonzero probabilities, consider the two ratios P(A ∩ B)/P(B) and P(A ∩ B)/P(A). In the first ratio, consider that, given B, that is, for a fixed B, A is varied; that is, the ratio is a function of A. Similarly, the second ratio is a function of B for a fixed A. For the time being, let us denote these two quantities by R[A given B] and R[B given A], respectively. We now show that these two quantities are also probability measures in Ω satisfying Axioms I, II, and III. We show this using the first ratio, R[A given B], as A varies with B fixed.
First, the ratio R[A given B] satisfies Axiom I given by (3.1): since P(A ∩ B) ≥ 0 and P(B) > 0, R[A given B] = P(A ∩ B)/P(B) ≥ 0.
Q.E.D.
Next, the ratio R[A given B] satisfies Axiom II given by (3.2): R[Ω given B] = P(Ω ∩ B)/P(B) = P(B)/P(B) = 1.
Q.E.D.
Finally, the ratio R[A given B] satisfies Axiom III given by (3.3): for mutually exclusive events A_1 and A_2, the events A_1 ∩ B and A_2 ∩ B are also mutually exclusive, so R[A_1 ∪ A_2 given B] = P((A_1 ∩ B) ∪ (A_2 ∩ B))/P(B) = [P(A_1 ∩ B) + P(A_2 ∩ B)]/P(B) = R[A_1 given B] + R[A_2 given B].
Q.E.D.
3.3.1 Definition of the Conditional Probability
These ratios are the conditional probabilities, denoted by P(A|B) = P(A ∩ B)/P(B) and P(B|A) = P(A ∩ B)/P(A). The first conditional probability can be interpreted as the probability of event A given that event B has occurred, and, similarly, the second conditional probability as the probability of event B given that event A has occurred.
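A minimal sketch of the conditional probability P(A|B) = P(A ∩ B)/P(B) in the die-throwing experiment, with an event pair of my own choosing:

```python
from fractions import Fraction

# Conditional probability via the ratio P(A ∩ B) / P(B), equally likely outcomes.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(omega))

def P_given(A, B):
    """Conditional probability of A given B."""
    return P(A & B) / P(B)

A = {2, 4, 6}        # outcome is even
B = {1, 2, 3, 4}     # outcome is less than five
print(P_given(A, B))     # P({2, 4}) / P({1, 2, 3, 4}) = (2/6)/(4/6) = 1/2
```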
3.3.2 Total Probability Theorem
Theorem 3.3.1
Let B_1, B_2, …, B_n be a partition of Ω, that is, mutually exclusive events with B_1 ∪ B_2 ∪ … ∪ B_n = Ω. Then, for any event A,
P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + ⋯ + P(A|B_n)P(B_n).
The last equation is referred to as the total probability theorem.
Proof
Since A = A ∩ Ω = A ∩ (B_1 ∪ B_2 ∪ … ∪ B_n) = (A ∩ B_1) ∪ (A ∩ B_2) ∪ … ∪ (A ∩ B_n), and the events A ∩ B_i are mutually exclusive, Theorem 3.2.1 and the definition of the conditional probability give P(A) = P(A ∩ B_1) + ⋯ + P(A ∩ B_n) = P(A|B_1)P(B_1) + ⋯ + P(A|B_n)P(B_n).
Q.E.D.
3.3.3 Bayes’ Theorem
Theorem 3.3.2
Let B_1, B_2, …, B_n be a partition of Ω and A an event with P(A) > 0. Then
P(B_i|A) = P(A|B_i)P(B_i) / [P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + ⋯ + P(A|B_n)P(B_n)].
This theorem is referred to as Bayes' theorem and is used to determine the probability that a given event A is due to the subset B_i of the partition. For example, given that a product is found to be defective, denoted by event A, the theorem can be used to calculate the probability that the defective product is from supplier B_i, when the defect data for each supplier, P(A|B_i), is available.
Proof
By the definition of the conditional probability, P(B_i|A) = P(A ∩ B_i)/P(A) = P(A|B_i)P(B_i)/P(A). Substituting the total probability theorem for P(A) completes the proof.
Q.E.D.
Example 3.3.1
A reliability problem. A component is randomly selected from a batch of 10,000 pieces supplied by five different factories. The following table shows the factory data of failure statistics of the component and the number of pieces supplied by the factories. Suppose that the randomly selected component has just failed. What is the probability that the failed component is from Factory A?
Factory   #Supplied   Probability of failure
A         1000        P(fail|A) = 1.3 × 10^−6
B         3000        P(fail|B) = 1.2 × 10^−6
C         3000        P(fail|C) = 1.1 × 10^−6
D         2000        P(fail|D) = 1.4 × 10^−6
E         1000        P(fail|E) = 1.5 × 10^−6
From the number of components supplied by each factory given above, we have
P(A) = 1000/10,000 = 0.1, P(B) = 3000/10,000 = 0.3, P(C) = 3000/10,000 = 0.3, P(D) = 2000/10,000 = 0.2, P(E) = 1000/10,000 = 0.1.
By Bayes' theorem,
P(A|fail) = P(fail|A)P(A) / [P(fail|A)P(A) + P(fail|B)P(B) + P(fail|C)P(C) + P(fail|D)P(D) + P(fail|E)P(E)] = (0.13 × 10^−6) / (1.25 × 10^−6) = 0.104.
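The reliability calculation of Example 3.3.1 can be sketched directly in code, with the priors taken from the supply counts and the failure probabilities from the table:

```python
# Example 3.3.1: probability that the failed component came from Factory A.
priors = {"A": 0.1, "B": 0.3, "C": 0.3, "D": 0.2, "E": 0.1}     # P(factory)
p_fail = {"A": 1.3e-6, "B": 1.2e-6, "C": 1.1e-6,
          "D": 1.4e-6, "E": 1.5e-6}                             # P(fail|factory)

# Total probability theorem: P(fail) = sum of P(fail|factory) P(factory)
p_total = sum(p_fail[f] * priors[f] for f in priors)

# Bayes' theorem: P(A|fail) = P(fail|A) P(A) / P(fail)
posterior_A = p_fail["A"] * priors["A"] / p_total
print(round(posterior_A, 3))     # 0.104
```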
Example 3.3.2
A communications signal detection problem. A total of 4000 characters have been received from four different sources as follows. The probabilities of character “a” from the four sources are given. Out of the total 4000 characters received, a randomly selected character is found to be “a.” What is the probability that this character came from Source A?
Source   #Characters sent   Probability of “a”
A        500                P(a|A) = 0.1
B        1000               P(a|B) = 0.2
C        2000               P(a|C) = 0.3
D        500                P(a|D) = 0.4
From the number of characters sent by each source, P(A) = 500/4000 = 0.125, P(B) = 0.25, P(C) = 0.5, and P(D) = 0.125. By Bayes' theorem,
P(A|a) = P(a|A)P(A) / [P(a|A)P(A) + P(a|B)P(B) + P(a|C)P(C) + P(a|D)P(D)] = 0.0125/0.2625 ≈ 0.048.
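The same computation applies to the signal detection problem; the sketch below wraps it in a small reusable helper (the helper name is mine, not from the text):

```python
# Example 3.3.2: probability that the received character "a" came from Source A.
def posterior(priors, likelihoods, target):
    """P(target | observation) by Bayes' theorem over a full partition of sources."""
    total = sum(likelihoods[s] * priors[s] for s in priors)   # total probability
    return likelihoods[target] * priors[target] / total

priors = {"A": 500/4000, "B": 1000/4000, "C": 2000/4000, "D": 500/4000}
p_a = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}   # P("a" | source)

print(round(posterior(priors, p_a, "A"), 3))     # 0.048
```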
3.3.4 Independence of Events
Definition of Independence
Two events A and B are defined to be independent if P(A ∩ B) = P(A)P(B). This definition of independence is consistent with the definition of the conditional probability: if A and B are independent, then P(A|B) = P(A ∩ B)/P(B) = P(A)P(B)/P(B) = P(A), that is, the occurrence of B does not change the probability of A.
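The independence condition P(A ∩ B) = P(A)P(B) can be checked for a concrete pair of events in the die-throwing experiment; the event pair below is my own example:

```python
from fractions import Fraction

# Checking independence for two events of the die-throwing experiment.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    return Fraction(len(event), len(omega))

A = {2, 4, 6}        # even outcome,          P(A) = 1/2
B = {1, 2, 3, 4}     # outcome under five,   P(B) = 2/3

# A ∩ B = {2, 4}, so P(A ∩ B) = 1/3 = (1/2)(2/3): A and B are independent.
print(P(A & B) == P(A) * P(B))   # True
```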
3.4 Cartesian Product
This section defines a special type of set called the Cartesian product. To illustrate the Cartesian product, consider the following example:
Example 3.4.1
We will return to this example to calculate the probability of the event A after further discussion of the combined experiment.
Assume that the two individual experiments with spaces Ω_1 and Ω_2, respectively, are independent, that is, an outcome from Ω_1 has no effect on the outcome from Ω_2. Under this condition, the two events E × Ω_2 and Ω_1 × F are independent.
Note that {(heads, 1)}, {(heads, 2)}, and {(heads, 3)} are elementary events in the combined experiment space Ω.
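The combined space as a Cartesian product can be sketched as follows; the example pairs a coin toss with a die throw, where the exact second space used in the text's example is an assumption here:

```python
from itertools import product

# Cartesian product of a coin-tossing space and a die-throwing space.
omega1 = {"heads", "tails"}
omega2 = {1, 2, 3, 4, 5, 6}
omega = set(product(omega1, omega2))    # Omega = Omega_1 × Omega_2

print(len(omega))                       # 2 * 6 = 12 ordered pairs
print(("heads", 1) in omega)            # {("heads", 1)} is an elementary event
```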