
Joint Probability Distributions and Their Applications

Modern Mathematical Statistics with Applications

Part of the book series: Springer Texts in Statistics ((STS))

Abstract

In Chapters 3 and 4, we developed probability models for a single random variable. Many problems in probability and statistics lead to models involving several random variables simultaneously.


Notes

  1. 1.

    This property of independent rvs can also be written as \( \sigma_{1}^{2} + \sigma_{2}^{2} = \sigma_{{X_{1} + X_{2} }}^{2} \). In part because the formula has the format \( a^{2} + b^{2} = c^{2} \), statisticians sometimes call this property the Pythagorean Theorem.

Author information

Corresponding author

Correspondence to Jay L. Devore.

Supplementary Exercises: (122–150)

  1. 122.

    Suppose the amount of rainfall in one region during a particular month has an exponential distribution with mean value 3 in., the amount of rainfall in a second region during that same month has an exponential distribution with mean value 2 in., and the two amounts are independent of each other. What is the probability that the second region gets more rainfall during this month than does the first region?
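A simulation can serve as a sanity check on the integral this exercise asks for. The sketch below is a minimal Monte Carlo estimate using only the Python standard library; the function name, the number of trials, and the seed are illustrative choices and not part of the exercise.

```python
import random


def estimate_second_exceeds_first(mean_first=3.0, mean_second=2.0,
                                  trials=200_000, seed=0):
    """Monte Carlo estimate of P(second region's rainfall > first region's)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # random.expovariate takes the rate (1 / mean), not the mean.
        first = rng.expovariate(1.0 / mean_first)
        second = rng.expovariate(1.0 / mean_second)
        if second > first:
            hits += 1
    return hits / trials


estimate = estimate_second_exceeds_first()
print(f"Estimated P(second > first): {estimate:.3f}")
```

The estimate should agree with the exact answer to about two decimal places; a larger discrepancy would suggest an error in the limits of integration.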

  2. 123.

    Two messages are to be sent. The time (min) necessary to send each message has an exponential distribution with parameter λ = 1, and the two times are independent of each other. It costs $2 per minute to send the first message and $1 per minute to send the second. Obtain the density function of the total cost of sending the two messages. [Hint: First obtain the cumulative distribution function of the total cost, which involves integrating the joint pdf.]

  3. 124.

    A restaurant serves three fixed-price dinners costing $25, $35, and $50. For a randomly selected couple dining at this restaurant, let X = the cost of the man’s dinner and Y = the cost of the woman’s dinner. The joint pmf of X and Y is given in the following table:

    p(x, y)     y = 25    y = 35    y = 50
    x = 25        .05       .05       .10
    x = 35        .05       .10       .35
    x = 50        0         .20       .10

  4. a.

    Compute the marginal pmfs of X and Y.

  5. b.

    What is the probability that the man’s and the woman’s dinner cost at most $35 each?

  6. c.

    Are X and Y independent? Justify your answer.

  7. d.

    What is the expected total cost of the dinner for the two people?

  8. e.

    Suppose that when a couple opens fortune cookies at the conclusion of the meal, they find the message “You will receive as a refund the difference between the cost of the more expensive and the less expensive meal that you have chosen.” How much does the restaurant expect to refund?
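Parts (a) through (e) are all finite sums over the nine cells of the joint pmf, so the hand calculations can be checked mechanically. The following is a minimal sketch in plain Python; the names `joint_pmf`, `marginal`, and `expectation` are illustrative and do not come from the text.

```python
# Joint pmf from the table: keys are (x, y) = (man's cost, woman's cost).
joint_pmf = {
    (25, 25): 0.05, (25, 35): 0.05, (25, 50): 0.10,
    (35, 25): 0.05, (35, 35): 0.10, (35, 50): 0.35,
    (50, 25): 0.00, (50, 35): 0.20, (50, 50): 0.10,
}


def marginal(pmf, axis):
    """Sum the joint pmf over the other variable (axis 0 -> X, axis 1 -> Y)."""
    out = {}
    for pair, prob in pmf.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + prob
    return out


def expectation(pmf, func):
    """E[func(X, Y)] as a probability-weighted sum over all (x, y) pairs."""
    return sum(func(x, y) * prob for (x, y), prob in pmf.items())


p_x = marginal(joint_pmf, 0)
p_y = marginal(joint_pmf, 1)
prob_both_at_most_35 = sum(
    prob for (x, y), prob in joint_pmf.items() if x <= 35 and y <= 35
)
# Independence requires p(x, y) = pX(x) * pY(y) in every cell.
independent = all(
    abs(prob - p_x[x] * p_y[y]) < 1e-12 for (x, y), prob in joint_pmf.items()
)
expected_total = expectation(joint_pmf, lambda x, y: x + y)
expected_refund = expectation(joint_pmf, lambda x, y: abs(x - y))

print(p_x, p_y, prob_both_at_most_35, independent)
print(expected_total, expected_refund)
```

A single cell where the joint probability differs from the product of the marginals is enough to rule out independence, which is why the check uses `all` over the cells.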

  9. 125.

    A health-food store stocks two different brands of a type of grain. Let X = the amount (lb) of brand A on hand and Y = the amount of brand B on hand. Suppose the joint pdf of X and Y is

    $$ f(x,y) = kxy\quad \, x \ge 0,\;y \ge 0,\;20 \le x + y \le 30 $$
    1. a.

      Draw the region of positive density and determine the value of k.

    2. b.

      Are X and Y independent? Answer by first deriving the marginal pdf of each variable.

    3. c.

      Compute P(X + Y ≤ 25).

    4. d.

      What is the expected total amount of this grain on hand?

    5. e.

      Compute Cov(X, Y) and Corr(X, Y).

    6. f.

      What is the variance of the total amount of grain on hand?

  10. 126.

    Let X1, X2, …, Xn be random variables denoting n independent bids for an item that is for sale. Suppose each Xi is uniformly distributed on the interval [100, 200]. If the seller sells to the highest bidder, how much can he expect to earn on the sale? [Hint: Let \( Y = \max (X_{1} ,X_{2} , \ldots ,X_{n} ) \). Find \( F_{Y}(y) \) by using the results of Section 5.7 or else by noting that Y ≤ y iff each Xi is ≤ y. Then obtain the pdf and E(Y).]
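Once a formula for E(Y) has been derived, it can be compared against a simulated average of the highest bid. The sketch below assumes nothing beyond the exercise's uniform model; the function name, the choice n = 4, the trial count, and the seed are illustrative.

```python
import random


def simulate_expected_max_bid(n, low=100.0, high=200.0,
                              trials=50_000, seed=0):
    """Average of the highest of n independent uniform bids over many trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.uniform(low, high) for _ in range(n))
    return total / trials


estimate_n4 = simulate_expected_max_bid(4)
print(f"Simulated E(max bid) for n = 4: {estimate_n4:.2f}")
```

Running this for several values of n shows the expected sale price creeping toward 200 as the number of bidders grows, which a correct closed form should reproduce.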

  11. 127.

    Suppose a randomly chosen individual’s verbal score X and quantitative score Y on a nationally administered aptitude examination have joint pdf

    $$ f(x,y) = \frac{2}{5}(2x + 3y)\quad 0 \le x \le 1, 0 \le y \le 1 $$

    You are asked to provide a prediction t of the individual’s total score X + Y. The error of prediction is the mean squared error E[(X + Y − t)²]. What value of t minimizes the error of prediction?

  12. 128.

    Let X1 and X2 be quantitative and verbal scores on one aptitude exam, and let Y1 and Y2 be corresponding scores on another exam. If Cov(X1, Y1) = 5, Cov(X1, Y2) = 1, Cov(X2, Y1) = 2, and Cov(X2, Y2) = 8, what is the covariance between the two total scores X1 + X2 and Y1 + Y2?

  13. 129.

    Let Z1 and Z2 be independent standard normal rvs and let

    $$ U = Z_{ 1} \quad V = \rho \cdot Z_{1} + \sqrt {1 - \rho^{2} } \cdot Z_{2} $$
    1. a.

      By definition, U has mean 0 and standard deviation 1. Show that the same is true for V.

    2. b.

      Use the properties of covariance to show that Cov(U, V) = ρ.

    3. c.

      Show that Corr(U, V) = ρ.

  14. 130.

    You are driving on a highway at speed X1. Cars entering this highway after you travel at speeds X2, X3, …. Suppose these Xi’s are independent and identically distributed with pdf f(x) and cdf F(x). Unfortunately there is no way for a faster car to pass a slower one—it will catch up to the slower one and then travel at the same speed. For example, if X1 = 52.3, X2 = 37.5, and X3 = 42.8, then no car will catch up to yours, but the third car will catch up to the second. Let N = the number of cars that ultimately travel at your speed (in your “cohort”), including your own car. Possible values of N are 1, 2, 3, …. Show that the pmf of N is p(n) = 1/[n(n + 1)], and then determine the expected number of cars in your cohort. [Hint: N = 3 requires that X1 < X2, X1 < X3, X4 < X1.]
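The claimed pmf can be checked empirically before attempting the proof. The sketch below simulates the cohort size under two assumptions that are not in the exercise: speeds are drawn from a uniform distribution on (0, 1), which is harmless because only the ordering of the speeds matters for a continuous distribution, and each simulated stream is truncated at a hypothetical `max_cars` so the loop always ends.

```python
import random
from collections import Counter


def simulate_cohort_size(rng, max_cars=10_000):
    """Return N for one simulated stream of cars, truncated at max_cars."""
    my_speed = rng.random()  # X1
    cohort = 1
    for _ in range(max_cars - 1):
        if rng.random() < my_speed:
            # A slower car never catches up, and it blocks every later car.
            break
        cohort += 1  # a faster car catches up and joins the cohort
    return cohort


def empirical_pmf(trials=100_000, seed=0):
    """Relative frequencies of N = 1, 2, 3, 4 over many simulated streams."""
    rng = random.Random(seed)
    counts = Counter(simulate_cohort_size(rng) for _ in range(trials))
    return {n: counts[n] / trials for n in (1, 2, 3, 4)}


pmf_estimate = empirical_pmf()
for n, freq in pmf_estimate.items():
    print(f"n={n}: simulated {freq:.4f}, claimed {1 / (n * (n + 1)):.4f}")
```

The sample mean of N is deliberately not reported: it is sensitive to the truncation point, which is itself a hint about the answer to the second part of the exercise.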

  15. 131.

    Suppose the number of children born to an individual has pmf p(x). A Galton–Watson branching process unfolds as follows: At time t = 0, the population consists of a single individual. Just prior to time t = 1, this individual gives birth to X1 individuals according to the pmf p(x), so there are X1 individuals in the first generation. Just prior to time t = 2, each of these X1 individuals gives birth independently of the others according to the pmf p(x), resulting in X2 individuals in the second generation (e.g., if X1 = 3, then X2 = Y1 + Y2 + Y3, where Yi is the number of progeny of the ith individual in the first generation). This process then continues to yield a third generation of size X3, and so on.

    1. a.

      If X1 = 3, Y1 = 4, Y2 = 0, Y3 = 1, draw a tree diagram with two generations of branches to represent this situation.

    2. b.

      Let A be the event that the process ultimately becomes extinct (one way for A to occur would be to have X1 = 3 with none of these three first-generation individuals having any progeny) and let p* = P(A). Argue that p* satisfies the equation

      $$ p^{*} = \sum\limits_{x} {(p^{*})^{x} \cdot p(x)} $$

      That is, p* = ψ(p*) where ψ(s) is the probability generating function introduced in Exercise 166 from Chapter 3. [Hint: \( A = \bigcup\nolimits_{x} {(A \cap \{ X_{1} = x\} )} \), so the Law of Total Probability can be applied. Now given that X1 = 3, A will occur if and only if each of the three separate branching processes starting from the first generation ultimately becomes extinct; what is the probability of this happening?]

    3. c.

      Verify that one solution to the equation in (b) is p* = 1. It can be shown that this equation has just one other solution, and that the probability of ultimate extinction is in fact the smaller of the two roots. If p(0) = .3, p(1) = .5, and p(2) = .2, what is p*? Is this consistent with the value of μ, the expected number of progeny from a single individual? What happens if p(0) = .2, p(1) = .5, and p(2) = .3?
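The smaller root in part (c) can also be located numerically. The sketch below iterates s ← ψ(s) starting from s = 0, a standard fixed-point scheme whose iterates increase to the smallest nonnegative solution of s = ψ(s); it is offered as a check on the algebra, not as the method the text intends. The function names and the iteration count are illustrative.

```python
def pgf(pmf, s):
    """Probability generating function psi(s) = sum over x of s**x * p(x)."""
    return sum(prob * s ** x for x, prob in pmf.items())


def mean_offspring(pmf):
    """Expected number of progeny from a single individual."""
    return sum(x * prob for x, prob in pmf.items())


def extinction_probability(pmf, iterations=2_000):
    """Iterate s <- psi(s) from s = 0 to approach the smallest fixed point."""
    s = 0.0
    for _ in range(iterations):
        s = pgf(pmf, s)
    return s


first_case = {0: 0.3, 1: 0.5, 2: 0.2}
second_case = {0: 0.2, 1: 0.5, 2: 0.3}

for label, pmf in (("first", first_case), ("second", second_case)):
    print(label, mean_offspring(pmf), extinction_probability(pmf))
```

Comparing each printed extinction probability with the corresponding mean number of progeny addresses the consistency question posed in part (c).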

  16. 132.

    Let f(x) and g(y) be pdfs with corresponding cdfs F(x) and G(y), respectively. With c denoting a numerical constant satisfying |c| ≤ 1, consider

    $$ f(x,y) = f(x)g(y)\{ 1 + c[2F(x) - 1][2G(y) - 1]\} $$
    1. a.

      Show that f(x, y) satisfies the conditions necessary to specify a joint pdf for two continuous rvs.

    2. b.

      What is the marginal pdf of the first variable X? Of the second variable Y?

    3. c.

      For what values of c are X and Y independent?

    4. d.

      If f(x) and g(y) are normal pdfs, is the joint distribution of X and Y bivariate normal?

  17. 133.

    The joint cumulative distribution function of two random variables X and Y, denoted by F(x, y), is defined by

    $$ F(x,y) = P[(X \le x) \cap (Y \le y)]\quad - \infty < x < \infty ,\quad - \infty < y < \infty $$
    1. a.

      Suppose that X and Y are both continuous variables. Once the joint cdf is available, explain how it can be used to determine the probability \( P[(X,\;Y)\; \in \;A] \), where A is the rectangular region \( \{ (x,\;y){:}\,a\; \le \;x \le \;b,\;c\; \le \;y\; \le \;d\} \).

    2. b.

      Suppose the only possible values of X and Y are 0, 1, 2, … and consider the values a = 5, b = 10, c = 2, and d = 6 for the rectangle specified in (a). Describe how you would use the joint cdf to calculate the probability that the pair (X, Y) falls in the rectangle. More generally, how can the rectangular probability be calculated from the joint cdf if a, b, c, and d are all integers?

    3. c.

      Determine the joint cdf for the scenario of Example 5.1. [Hint: First determine F(x, y) for x = 100, 250 and y = 0, 100, and 200. Then describe the joint cdf for various other (x, y) pairs.]

    4. d.

      Determine the joint cdf for the scenario of Example 5.3 and use it to calculate the probability that X and Y are both between .25 and .75. [Hint: For 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, \( F(x,y) = \int_{0}^{x} {\int_{0}^{y} {f(u,v)dvdu} } \).]

    5. e.

      Determine the joint cdf for the scenario of Example 5.5. [Hint: Proceed as in (d), but be careful about the order of integration and consider separately (x, y) points that lie inside the triangular region of positive density and then points that lie outside this region.]

  18. 134.

    A circular sampling region with radius X is chosen by a biologist, where X has an exponential distribution with mean value 10 ft. Plants of a certain type occur in this region according to a (spatial) Poisson process with “rate” .5 plant per square foot. Let Y denote the number of plants in the region.

    1. a.

      Find \( E\left( {Y|X = x} \right) \) and \( V\left( {Y|X = x} \right) \).

    2. b.

      Use part (a) to find E(Y).

    3. c.

      Use part (a) to find V(Y).

  19. 135.

    The number of individuals arriving at a post office to mail packages during a certain period is a Poisson random variable X with mean value 20. Independently of the others, any particular customer will mail either 1, 2, 3, or 4 packages with probabilities .4, .3, .2, and .1, respectively. Let Y denote the total number of packages mailed during this time period.

    1. a.

      Find \( E\left( {Y|X = x} \right) \) and \( V\left( {Y|X = x} \right) \).

    2. b.

      Use part (a) to find E(Y).

    3. c.

      Use part (a) to find V(Y).

  20. 136.

    Consider a sealed-bid auction in which each of the n bidders has his/her valuation (assessment of inherent worth) of the item being auctioned. The valuation of any particular bidder is not known to the other bidders. Suppose these valuations constitute a random sample \( X_{1} , \ldots ,X_{n} \) from a distribution with cdf F(x), with corresponding order statistics \( Y_{1} \le Y_{2} \le \cdots \le Y_{n} \). The rent of the winning bidder is the difference between the winner’s valuation and the price. The article “Mean Sample Spacings, Sample Size and Variability in an Auction-Theoretic Framework” (Oper. Res. Lett. 2004: 103–108) argues that the rent is just \( Y_{n} - Y_{n - 1} \) (do you see why?).

    1. a.

      Suppose that the valuation distribution is uniform on [0, 100]. What is the expected rent when there are n = 10 bidders?

    2. b.

      Referring back to (a), what happens when there are 11 bidders? More generally, what is the relationship between the expected rent for n bidders and for n + 1 bidders? Is this intuitive? [Note: The cited article presents a counterexample.]

  21. 137.

    Suppose two identical components are connected in parallel, so the system continues to function as long as at least one of the components does so. The two lifetimes are independent of each other, each having an exponential distribution with mean 1000 h. Let W denote system lifetime. Obtain the moment generating function of W, and use it to calculate the expected lifetime.

  22. 138.

    Sandstone is mined from two different quarries. Let X = the amount mined (in tons) from the first quarry each day and Y = the amount mined (in tons) from the second quarry each day. The variables X and Y are independent, with µX = 12, σX = 4, µY = 10, σY = 3.

    1. a.

      Find the mean and standard deviation of the variable X + Y, the total amount of sandstone mined in a day.

    2. b.

      Find the mean and standard deviation of the variable X − Y, the difference in the mines’ performances in a day.

    3. c.

      The manager of the first quarry sells sandstone at $25/ton, while the manager of the second quarry sells sandstone at $28/ton. Find the mean and standard deviation for the combined amount of money the quarries generate in a day.

    4. d.

      Assuming X and Y are both normally distributed, find the probability that the quarries generate more than $750 revenue in a day.
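All four parts rest on the same two rules for a linear combination of independent variables: means combine linearly, and variances combine with squared coefficients. The sketch below applies those rules with the standard library's `statistics.NormalDist` for part (d); the helper name `linear_combination_moments` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def linear_combination_moments(coeffs, means, sds):
    """Mean and sd of sum(a_i * X_i) for independent X_i."""
    mean = sum(a * m for a, m in zip(coeffs, means))
    sd = sqrt(sum((a * s) ** 2 for a, s in zip(coeffs, sds)))
    return mean, sd


means = (12.0, 10.0)  # mu_X, mu_Y in tons
sds = (4.0, 3.0)      # sigma_X, sigma_Y in tons

total_mean, total_sd = linear_combination_moments((1, 1), means, sds)
diff_mean, diff_sd = linear_combination_moments((1, -1), means, sds)
rev_mean, rev_sd = linear_combination_moments((25, 28), means, sds)

# Part (d): a linear combination of independent normal rvs is normal.
prob_revenue_above_750 = 1 - NormalDist(rev_mean, rev_sd).cdf(750)

print(total_mean, total_sd, diff_mean, diff_sd)
print(rev_mean, round(rev_sd, 2), round(prob_revenue_above_750, 4))
```

Note that the sum and the difference have the same standard deviation, because the coefficient −1 is squared in the variance.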

  23. 139.

    In cost estimation, the total cost of a project is the sum of component task costs. Each of these costs is a random variable with a probability distribution. It is customary to obtain information about the total cost distribution by adding together characteristics of the individual component cost distributions—this is called the “roll-up” procedure. Since E(X1 + ⋯ + Xn) = E(X1) + ⋯ + E(Xn), the roll-up procedure is valid for mean cost. Suppose that there are two component tasks and that X1 and X2 are independent, normally distributed random variables. Is the roll-up procedure valid for the 75th percentile? That is, is the 75th percentile of the distribution of X1 + X2 the same as the sum of the 75th percentiles of the two individual distributions? If not, what is the relationship between the percentile of the sum and the sum of percentiles? For what percentiles is the roll-up procedure valid in this case?

  24. 140.

    Random sums. If X1, X2, …, Xn are independent rvs, each with the same mean value μ and variance σ², then the methods of Section 5.3 show that E(X1 + ··· + Xn) = nμ and V(X1 + X2 + ··· + Xn) = nσ². In some applications, the number of Xi’s under consideration is not a fixed number n but instead a rv N. For example, let N be the number of components of a certain type brought into a repair shop on a particular day and let Xi represent the repair time for the ith component. Then the total repair time is TN = X1 + X2 + ··· + XN, the sum of a random number of rvs.

    1. a.

      Suppose that N is independent of the Xi’s. Use the Law of Total Expectation to obtain an expression for E(TN) in terms of μ and E(N).

    2. b.

      Use the Law of Total Variance to obtain an expression for V(TN) in terms of μ, σ², E(N), and V(N).

    3. c.

      Customers submit orders for stock purchases at a certain online site according to a Poisson process with a rate of 3 per hour. The amount purchased by any particular customer (in thousands of dollars) has an exponential distribution with mean 30, and purchase amounts are independent of the number of customers. What is the expected total amount ($) purchased during a particular 4-h period, and what is the standard deviation of this total amount?
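For part (c), the arithmetic can be organized as below. The sketch assumes the standard random-sum results that parts (a) and (b) lead to, namely E(TN) = E(N)·μ and V(TN) = E(N)·σ² + V(N)·μ², so it should be read only after those derivations have been attempted. The function name is illustrative, and amounts are kept in thousands of dollars as in the exercise.

```python
from math import sqrt


def random_sum_moments(mu, sigma2, mean_n, var_n):
    """Mean and variance of X1 + ... + XN when N is independent of the Xi."""
    mean_total = mean_n * mu
    var_total = mean_n * sigma2 + var_n * mu ** 2
    return mean_total, var_total


rate_per_hour = 3.0
hours = 4.0
mean_n = var_n = rate_per_hour * hours  # Poisson: mean equals variance
mu = 30.0                               # exponential mean, thousands of $
sigma2 = mu ** 2                        # exponential: variance = mean squared

mean_total, var_total = random_sum_moments(mu, sigma2, mean_n, var_n)
sd_total = sqrt(var_total)
print(f"E(total) = {mean_total:.0f} thousand $, SD = {sd_total:.1f} thousand $")
```

Multiplying both results by 1000 converts them to the dollar units requested in the exercise.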

  25. 141.

    The mean weight of luggage checked by a randomly selected tourist-class passenger flying between two cities on a certain airline is 40 lb, and the standard deviation is 10 lb. The mean and standard deviation for a business-class passenger are 30 lb and 6 lb, respectively.

    1. a.

      If there are 12 business-class passengers and 50 tourist-class passengers on a particular flight, what are the expected value of total luggage weight and the standard deviation of total luggage weight?

    2. b.

      If individual luggage weights are independent, normally distributed rvs, what is the probability that total luggage weight is at most 2500 lb?

  26. 142.

    The amount of soft drink that Ann consumes on any given day is independent of consumption on any other day and is normally distributed with μ = 13 oz and σ = 2 oz. If she currently has two six-packs of 16-oz bottles, what is the probability that she still has some soft drink left at the end of 2 weeks (14 days)? Why should we worry about the validity of the independence assumption here?

  27. 143.

    A student has a class that is supposed to end at 9:00 a.m. and another that is supposed to begin at 9:10 a.m. Suppose the actual ending time of the 9 a.m. class is a normally distributed rv X1 with mean 9:02 and standard deviation 1.5 min and that the starting time of the next class is also a normally distributed rv X2 with mean 9:10 and standard deviation 1 min. Suppose also that the time necessary to get from one classroom to the other is a normally distributed rv X3 with mean 6 min and standard deviation 1 min. Assuming independence of X1, X2, and X3, what is the probability that the student makes it to the second class before the lecture starts? Why should we worry about the reasonableness of the independence assumption here?

  28. 144.

    This exercise provides an alternative approach to establishing the properties of correlation.

    1. a.

      Use the general formula for the variance of a linear combination to write an expression for V(aX + Y). Then let a = σY/σX, and show that ρ ≥ −1. [Hint: Variance is always ≥ 0, and Cov(X, Y) = σX · σY · ρ.]

    2. b.

      By considering V(aX − Y), conclude that ρ ≤ 1.

    3. c.

      Use the fact that V(W) = 0 only if W is a constant to show that ρ = 1 only if Y = aX + b.

  29. 145.

    A rock specimen from a particular area is randomly selected and weighed two different times. Let W denote the actual weight and X1 and X2 the two measured weights. Then X1 = W + E1 and X2 = W + E2, where E1 and E2 are the two measurement errors. Suppose that the Ei’s are independent of each other and of W and that \( V\left( {E_{1} } \right) = V\left( {E_{2} } \right) = \sigma_{E}^{2} \).

    1. a.

      Express ρ, the correlation coefficient between the two measured weights X1 and X2, in terms of \( \sigma_{W}^{2} \), the variance of actual weight, and \( \sigma_{X}^{2} \), the variance of measured weight.

    2. b.

      Compute ρ when σW = 1 kg and σE = .01 kg.

  30. 146.

    Let A denote the percentage of one constituent in a randomly selected rock specimen, and let B denote the percentage of a second constituent in that same specimen. Suppose D and E are measurement errors in determining the values of A and B so that measured values are X = A + D and Y = B + E, respectively. Assume that measurement errors are independent of each other and of actual values.

    1. a.

      Show that

      $$ {\text{Corr}}(X,Y) = {\text{Corr}}(A,B) \cdot \sqrt {{\text{Corr}}(X_{1} ,X_{2} )} \cdot \sqrt {{\text{Corr}}(Y_{1} ,Y_{2} )} $$

      where X1 and X2 are replicate measurements on the value of A, and Y1 and Y2 are defined analogously with respect to B. What effect does the presence of measurement error have on the correlation?

    2. b.

      What is the maximum value of Corr(X, Y) when Corr(X1, X2) = .8100, Corr(Y1, Y2) = .9025? Is this disturbing?

  31. 147.

    Let X1, …, Xn be independent rvs with mean values μ1, …, μn and variances \( \sigma_{1}^{2} \), …, \( \sigma_{n}^{2} \). Consider a function h(x1, …, xn), and use it to define a new random variable Y = h(X1, …, Xn). Under rather general conditions on the h function, if the σi’s are all small relative to the corresponding μi’s, it can be shown that E(Y) ≈ h(μ1, …, μn) and

    $$ V(Y) \approx \left( {\frac{\partial h}{{\partial x_{1} }}} \right)^{2} \cdot \sigma_{1}^{2} + \cdots + \left( {\frac{\partial h}{{\partial x_{n} }}} \right)^{2} \cdot \sigma_{n}^{2} $$

    where each partial derivative is evaluated at (x1, …, xn) = (μ1, …, μn). Suppose three resistors with resistances X1, X2, X3 are connected in parallel across a battery with voltage X4. Then by Ohm’s law, the current is

    $$ Y = X_{4} \left( {\frac{1}{{X_{1} }} + \frac{1}{{X_{2} }} + \frac{1}{{X_{3} }}} \right) $$

    Let μ1 = 10 Ω, σ1 = 1.0 Ω, μ2 = 15 Ω, σ2 = 1.0 Ω, μ3 = 20 Ω, σ3 = 1.5 Ω, μ4 = 120 V, σ4 = 4.0 V. Calculate the approximate expected value and standard deviation of the current (suggested by “Random Samplings,” CHEMTECH 1984: 696–697).
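The first-order approximation above needs only h and its partial derivatives at the means, so it is straightforward to evaluate numerically. The sketch below approximates each partial derivative with a central difference rather than differentiating by hand; the function names and the step size are illustrative choices.

```python
from math import sqrt


def current(x1, x2, x3, x4):
    """Current through three parallel resistors x1, x2, x3 at voltage x4."""
    return x4 * (1 / x1 + 1 / x2 + 1 / x3)


def partial(func, point, i, step=1e-5):
    """Central-difference estimate of the i-th partial derivative at point."""
    up, down = list(point), list(point)
    up[i] += step
    down[i] -= step
    return (func(*up) - func(*down)) / (2 * step)


def first_order_moments(func, means, sds):
    """E(Y) ~ h(mu) and sd(Y) from the sum of (dh/dx_i)^2 * sigma_i^2."""
    approx_mean = func(*means)
    approx_var = sum(
        (partial(func, means, i) * sds[i]) ** 2 for i in range(len(means))
    )
    return approx_mean, sqrt(approx_var)


means = (10.0, 15.0, 20.0, 120.0)  # ohms, ohms, ohms, volts
sds = (1.0, 1.0, 1.5, 4.0)

approx_mean, approx_sd = first_order_moments(current, means, sds)
print(f"E(Y) ~ {approx_mean:.2f} A, sd(Y) ~ {approx_sd:.3f} A")
```

Printing the individual terms of the variance sum shows which component contributes most to the uncertainty in the current.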

  32. 148.

    A more accurate approximation to E[h(X1, …, Xn)] in the previous exercise is

    $$ \begin{aligned} E\left[ {h\left( {X_{1} , \ldots ,X_{n} } \right)} \right] & \approx h(\mu _{1} , \ldots ,\mu _{n} ) + \frac{1}{2}\sigma _{1}^{2} \left( {\frac{{\partial ^{2} h}}{{\partial x_{1}^{2} }}} \right) \\ & \quad + \cdots + \frac{1}{2}\sigma _{n}^{2} \left( {\frac{{\partial ^{2} h}}{{\partial x_{n}^{2} }}} \right) \\ \end{aligned} $$

    Compute this for Y = h(X1, X2, X3, X4) given in the previous exercise, and compare it to the leading term h(μ1, …, μn).
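The second-order correction can be evaluated the same way as the first-order variance, with second partial derivatives estimated by central differences at the means. The sketch below is self-contained and repeats the current function from the previous exercise; the function names and the step size are illustrative.

```python
def current(x1, x2, x3, x4):
    """Current through three parallel resistors x1, x2, x3 at voltage x4."""
    return x4 * (1 / x1 + 1 / x2 + 1 / x3)


def second_partial(func, point, i, step=1e-3):
    """Central-difference estimate of the i-th pure second partial."""
    up, down = list(point), list(point)
    up[i] += step
    down[i] -= step
    return (func(*up) - 2 * func(*point) + func(*down)) / step ** 2


def second_order_mean(func, means, sds):
    """h(mu) plus half the sum of sigma_i^2 times the i-th second partial."""
    leading = func(*means)
    correction = 0.5 * sum(
        sds[i] ** 2 * second_partial(func, means, i) for i in range(len(means))
    )
    return leading, correction


means = (10.0, 15.0, 20.0, 120.0)
sds = (1.0, 1.0, 1.5, 4.0)

leading, correction = second_order_mean(current, means, sds)
print(f"leading term {leading:.3f}, correction {correction:.4f}")
```

The voltage term contributes nothing to the correction because the current is linear in X4.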

  33. 149.

    The following example is based on “Conditional Moments and Independence” (The American Statistician 2008: 219). Consider the following joint pdf of two rvs X and Y:

    $$ f(x,y) = \frac{e^{-[(\ln x)^{2} + (\ln y)^{2}]/2}}{2\pi xy}\,[1 + \sin (2\pi \ln x)\sin (2\pi \ln y)]\quad {\text{for}}\;x > 0,\;y > 0 $$
    1. a.

      Show that the marginal distribution of each rv is lognormal. [Hint: When obtaining the marginal pdf of X, make the change of variable u = ln (y).]

    2. b.

      Obtain the conditional pdf of Y given that X = x. Then show that for every positive integer n, \( E(Y^{n} \mid X = x) = E(Y^{n}) \). [Hint: Make the change of variable ln(y) = u + n in the second integrand.]

    3. c.

      Redo (b) with X and Y interchanged.

    4. d.

      The results of (b) and (c) suggest intuitively that X and Y are independent rvs. Are they in fact independent?

  34. 150.

    Let Y0 denote the initial price of a particular security and Yn denote the price at the end of n additional weeks for n = 1, 2, 3, …. Assume that the successive price ratios Y1/Y0, Y2/Y1, Y3/Y2, … are independent of one another and that each ratio has a lognormal distribution with µ = .4 and σ = .8 (the assumptions of independence and lognormality are common in such scenarios).

    1. a.

      Calculate the probability that the security price will increase over the course of a week.

    2. b.

      Calculate the probability that the security price will be higher at the end of the next week, be lower the week after that, and then be higher again at the end of the following week. [Hint: What does “higher” say about the ratio \( Y_{i+1}/Y_{i} \)?]

    3. c.

      Calculate the probability that the security price will have increased by at least 20% over the course of a five-week period. [Hint: Consider the ratio \( Y_{5}/Y_{0} \), and write this in terms of successive ratios \( Y_{i+1}/Y_{i} \).]
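Each part reduces to a normal probability once logarithms are taken, since the log of a product of independent lognormal ratios is a sum of independent normals. The sketch below assumes, as is conventional, that µ and σ are the mean and standard deviation of the natural log of each weekly ratio; the variable names are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

MU, SIGMA = 0.4, 0.8  # assumed parameters of ln(weekly price ratio)

# (a) The price rises in a week iff the ratio exceeds 1, i.e. its log exceeds 0.
log_ratio = NormalDist(MU, SIGMA)
p_up = 1 - log_ratio.cdf(0.0)

# (b) Up, then down, then up: the three weekly ratios are independent.
p_up_down_up = p_up * (1 - p_up) * p_up

# (c) ln(Y5/Y0) is a sum of five independent normal log-ratios.
weeks = 5
five_week_log_ratio = NormalDist(weeks * MU, SIGMA * sqrt(weeks))
p_gain_20 = 1 - five_week_log_ratio.cdf(log(1.2))

print(round(p_up, 4), round(p_up_down_up, 4), round(p_gain_20, 4))
```

The standard deviation for five weeks scales with the square root of the number of weeks, not with the number of weeks itself, because variances (not standard deviations) add for independent terms.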

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Devore, J.L., Berk, K.N., Carlton, M.A. (2021). Joint Probability Distributions and Their Applications. In: Modern Mathematical Statistics with Applications. Springer Texts in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-55156-8_5
