Random Variables, Densities, and Cumulative Distribution Functions

Abstract

It is natural for the outcomes of many experiments in the real world to be measured in terms of real numbers. For example, measuring the height and weight of individuals, observing the market price and quantity demanded of a commodity, measuring the yield of a new variety of wheat, or measuring the miles per gallon achievable by a new hybrid automobile all result in real-valued outcomes. The sample spaces associated with these types of experiments are subsets of the real line or, if multiple values are needed to characterize the outcome of the experiment, subsets of n-dimensional real space, \( {\mathbb{R}^n} \).

Notes

  1.

    Notice that the algebraic specification faithfully represents the positive values of f(x) in the preceding table of values, and defines f(x) to equal 0 \( \forall x \notin \{2, 3, \ldots, 12\} \). Thus, the domain of f is the entire real line. The reason for extending the domain of f from R(X) to \( \mathbb{R} \) will be discussed shortly. Note that assignments of probabilities to events as \( P_X(A) = \sum_{x \in A} f(x) \) are unaffected by this domain extension.

  2.

    See F.S. Woods (1954) Advanced Calculus, Boston: Ginn and Co., p. 141. Regarding continuity of f(x), note that f(x) is continuous at a point \( d \in D(f) \) if, \( \forall \varepsilon > 0 \), \( \exists \) a number \( \delta(\varepsilon) > 0 \) such that if \( |x - d| < \delta(\varepsilon) \), then \( |f(x) - f(d)| < \varepsilon \). The function f is continuous if it is continuous at every point in its domain. Heuristically, a function will be continuous if there are no breaks in the graph of y = f(x). Put another way, if the graph of y = f(x) can be completely drawn without ever lifting a pencil from the graph paper, then f is a continuous function.

  3.

    It can be shown that Borel sets are representable as the union of a collection of disjoint intervals, some of which may be single points. The collective area in question can then be defined as the sum of the areas lying above the various intervals and below the graph of f.

  4.

    Note that in the discrete case, it is conceptually possible to define a random variable that has an outcome that occurs with zero probability. For example, if \( f(y) = I_{[0,1]}(y) \) is the density function of the continuous random variable Y, then \( X = I_{[0,1)}(Y) \) is a discrete random variable that takes the value 1 with probability 1 and the value 0 with probability zero. Such random variables have little use in applications, and for simplicity, we suppress this possibility in making the range of the random variable synonymous with its support.

  5.

    In a more advanced treatment of the subject, we could resort to more general integration methods, in which case a single integral could once again be used to define \( P_X \). On Stieltjes integration, see R.G. Bartle (1976) The Elements of Real Analysis, 2nd ed., New York: John Wiley, and Section 3.2 of the next chapter.

  6.

    There are still other types of random variables besides those we have examined, but they are rarely utilized in applied work. See Y.S. Chow and H. Teicher (1978) Probability Theory, New York: Springer-Verlag, pp. 247–248.

  7.

    Alternative shorthand notations often used in the literature are, respectively, {X ≤ b} and P(X ≤ b). Our notation establishes a distinction between the function X and a value of the function x.

  8.

    For those readers whose recollection of the limit concept from calculus courses is not clear, it suffices here to appeal to intuition and interpret the limit of F(b) as “the real number to which F(b) becomes and remains infinitesimally close as b increases without bound (or as b decreases without bound).” We will examine the limit concept in more detail in Chapter 5.

  9.

    \( \lim_{b \to d^{-}} \) indicates that we are examining the limit as b approaches d from below (also called a left-hand limit). \( \lim_{b \to d^{+}} \) would indicate the limit as b approaches d from above (also called a right-hand limit). For now, it will suffice for the reader to appeal to intuition and interpret \( \lim_{b \to d^{-}} F(b) \) as “the real number to which F(b) becomes and remains infinitesimally close as b increases and becomes infinitesimally close to d.”

  10.

    A strictly increasing function has \( F(x_i) < F(x_j) \) when \( x_i < x_j \).

  11.

    This can be rigorously justified by the fact that under the conditions stated: (1) the (improper) Riemann integral is equivalent to a Lebesgue integral; (2) the largest set of points for which f(x) can be discontinuous and still have the integral \( \int_{-\infty}^{b} f(x) \, dx \) defined \( \forall b \) has “measure zero”; and (3) the values of the integrals are unaffected by changing the values of the integrand on a set of points having “measure zero.” This result applies to multivariate integrals as well. See C.W. Burrill, 1972, Measure, Integration, and Probability, New York: McGraw-Hill, pp. 106–109, for further details.

  12.

    The differentiation is accomplished by applying Lemma 2.1 twice: once to the integral \( \int_c^d \left[ \int_a^b f(x_1, x_2) \, dx_1 \right] dx_2 \), differentiating with respect to d to yield \( \int_a^b f(x_1, d) \, dx_1 \), and then differentiating the latter integral with respect to b to obtain f(b, d). In summary, \( \left( \partial^2 / \partial b \, \partial d \right) \int_c^d \int_a^b f(x_1, x_2) \, dx_1 \, dx_2 = f(b, d) \).

  13.

    In the discrete m-dimensional case, the PDF can be defined as \( f(\mathbf{x}) = F(\mathbf{x}) + \lim_{\delta \to 0^{+}} \left( \sum_{i=1}^{m} (-1)^{i} \sum_{\mathbf{v} \in S_i} F(\mathbf{x} - \delta \mathbf{v}) \right) \), where \( S_i \) is the set of all of the different (m × 1) vectors that can be constructed using i 1’s and m − i 0’s.

  14.

    The reader is reminded that we are suppressing the technical requirement that for every Borel set of y values, the associated collection of w values in S must constitute an event in S for the function Y to be called a random variable. As we have remarked previously, this technical difficulty does not cause a problem in applied work.

  15.

    R. Courant and F. John, Introduction to Calculus and Analysis, New York, John Wiley-Interscience, 1965, p. 143.

  16.

    In applying the mean value theorem to the numerator, we treat \( f(x_1, x_2) \) as a function of the single variable \( x_2 \), fixing the value of \( x_1 \) for each application.

  17.

    In the continuous case, it is also presumed that f and \( f_{m+1,\ldots,n} \) are continuous in \( (x_{m+1}, \ldots, x_n) \) within some neighborhood of points around the point where the conditional density is evaluated in order to justify the conditional density definition via a limiting argument analogous to the bivariate case. Motivation for the conditional density expression when conditioning on an elementary event in the continuous case can then be provided by extending the mean-value theorem argument used in the bivariate case. See R.G. Bartle, Real Analysis, p. 429 for a statement of the general mean value theorem for integrals.

  18.

    Technically, the factorization need not hold at points of discontinuity for the joint density function of a continuous random variable. However, if the random variables are independent, there will always exist a density function for which the factorization can be formed. This has to do with the fundamental non-uniqueness of PDFs in the continuous case, which can be redefined arbitrarily at a countable number of isolated points without affecting the assignment of any probabilities of events through integration. There are few practical benefits of this non-uniqueness, and we suppress this technical anomaly here.

  19.

    Any points of discontinuity can be ignored in the definitions of the probability integrals without affecting the probability assignments.

  20.

    We will henceforth suppress constant reference to the fact that factorization might not hold for some points of discontinuity in the continuous case; it will be tacitly understood that results we derive based on the factorization of \( f(x_1, x_2) \) may be violated at some isolated points. For example, for the case at hand, marginal and conditional densities may not be equal at some isolated points. Assignments of probability will be unaffected by this technical anomaly.

  21.

    The same technical proviso regarding points of discontinuity in the case of continuous random variables holds as in the bivariate case. See Footnote 18.

  22.

    E.J. Gumbel (1958) Distributions à plusieurs variables dont les marges sont données, C.R. Acad. Sci., Paris, 246, pp. 2717–2720.

  23.

    This must be an approximation – why?
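
The domain extension described in Note 1 can be illustrated with a small Python sketch. The specific density used here, that of the sum of two fair dice, is an assumption chosen only because its support matches the set {2, 3, ..., 12} named in the note; the chapter's own table is not reproduced in this preview. The names SUPPORT, f, and prob are hypothetical.

```python
from fractions import Fraction

# Support named in Note 1: the integers 2 through 12.
SUPPORT = frozenset(range(2, 13))


def f(x):
    """Assumed PDF (sum of two fair dice), defined on the whole real line.

    Returns the positive table value on the support and 0 everywhere else,
    which is the domain extension from R(X) to the real line.
    """
    if x in SUPPORT:
        return Fraction(6 - abs(int(x) - 7), 36)
    return Fraction(0)


def prob(event):
    """P_X(A) computed as the sum of f(x) over the points x in A."""
    return sum((f(x) for x in event), Fraction(0))


if __name__ == "__main__":
    inside = {2, 3, 4}
    padded = inside | {2.5, -1, 100}  # extra points lie off the support
    print(prob(inside), prob(padded))
```

Adding points outside the support to an event leaves its probability unchanged, which is the sense in which the extension is harmless.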
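
The identity summarized in Note 12 can be checked numerically. The sketch below assumes a hypothetical density f(x1, x2) = x1 + x2 on the unit square with lower limits a = c = 0, for which the double integral has the closed form F(b, d) = (b²d + bd²)/2. The mixed partial derivative is approximated by a central finite difference rather than computed symbolically, and the function names are illustrative.

```python
def f(x1, x2):
    """Assumed joint density on the unit square (illustrative only)."""
    return x1 + x2


def F(b, d):
    """Closed form of the integral of f over [0, b] x [0, d]."""
    return (b ** 2 * d + b * d ** 2) / 2


def mixed_partial(G, b, d, h=1e-3):
    """Central finite-difference approximation of d^2 G / (db dd) at (b, d)."""
    return (
        G(b + h, d + h)
        - G(b + h, d - h)
        - G(b - h, d + h)
        + G(b - h, d - h)
    ) / (4 * h * h)


if __name__ == "__main__":
    b, d = 0.3, 0.7
    print(mixed_partial(F, b, d), f(b, d))
```

Up to floating-point rounding, the two printed values agree, consistent with differentiating first with respect to d and then with respect to b.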
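
The formula in Note 13 for recovering a discrete PDF from its CDF can be sketched as follows. The joint CDF used here, for two independent fair dice, is an assumption made for illustration. Because that CDF is a step function that jumps only at integers, the limit as δ approaches 0 from above is replaced by the fixed value δ = 1/2, which gives the same result for any 0 < δ < 1. All function names are hypothetical.

```python
import math
from fractions import Fraction
from itertools import combinations


def cdf(x):
    """Assumed joint CDF of two independent fair dice, evaluated at x."""
    result = Fraction(1)
    for t in x:
        k = min(max(math.floor(t), 0), 6)  # number of faces <= t
        result *= Fraction(k, 6)
    return result


def zero_one_vectors(m, i):
    """Yield every length-m vector with exactly i ones and m - i zeros (S_i)."""
    for ones in combinations(range(m), i):
        yield tuple(1 if j in ones else 0 for j in range(m))


def pdf_from_cdf(F, x, delta=Fraction(1, 2)):
    """f(x) = F(x) + sum_i (-1)^i sum_{v in S_i} F(x - delta * v)."""
    m = len(x)
    total = F(x)
    for i in range(1, m + 1):
        for v in zero_one_vectors(m, i):
            shifted = tuple(xj - delta * vj for xj, vj in zip(x, v))
            total += (-1) ** i * F(shifted)
    return total


if __name__ == "__main__":
    print(pdf_from_cdf(cdf, (3, 4)))  # a point on the support
    print(pdf_from_cdf(cdf, (7, 3)))  # a point off the support
```

For m = 2 the sum reduces to the familiar inclusion-exclusion expression F(x1, x2) − F(x1 − δ, x2) − F(x1, x2 − δ) + F(x1 − δ, x2 − δ).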

Copyright information

© 2013 Springer Science+Business Media New York

Cite this chapter

Mittelhammer, R.C. (2013). Random Variables, Densities, and Cumulative Distribution Functions. In: Mathematical Statistics for Economics and Business. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5022-1_2