Topics: Characteristic Functions, Proof of Central Limit Theorem, Adaptive CSMA

4.1 Characteristic Functions

Before explaining the proof of the CLT, we first introduce characteristic functions.

Definition 4.1 (Characteristic Function)

The characteristic function of a random variable X is defined as

$$\displaystyle \begin{aligned} \phi_X(u) = E(e^{iuX}), u \in \Re. \end{aligned}$$

In this expression, \(i := \sqrt {-1}\). ◇

Note that

$$\displaystyle \begin{aligned} \phi_X(u) = \int_{- \infty}^\infty e^{iux}f_X(x) dx, \end{aligned}$$

so that \(\phi_X(u)\) is the Fourier transform of \(f_X(x)\). As such, the characteristic function determines the pdf uniquely.

As an important example, we have the following result.

Theorem 4.1 (Characteristic Function of \(\mathcal {N}(0, 1)\))

Let \(X =_D \mathcal {N}(0, 1)\) . Then,

$$\displaystyle \begin{aligned} \phi_X(u) = e^{ - \frac{u^2}{2}}. \end{aligned} $$
(4.1)

\({\blacksquare }\)

Proof

One has

$$\displaystyle \begin{aligned} \phi_X(u) = \int_{- \infty}^\infty e^{iux} \frac{1}{\sqrt{2\pi}} e^{- \frac{x^2}{2}} dx, \end{aligned}$$

so that

$$\displaystyle \begin{aligned} \frac{d}{du} \phi_X(u) &= \int_{- \infty}^\infty i xe^{iux} \frac{1}{\sqrt{2\pi}} e^{- \frac{x^2}{2}} dx = - \int_{- \infty}^\infty i e^{iux} \frac{1}{\sqrt{2\pi}} de^{- \frac{x^2}{2}} \\ &= \int_{- \infty}^\infty i \frac{1}{\sqrt{2\pi}} e^{- \frac{x^2}{2}} de^{iux} = - u \int_{- \infty}^\infty e^{iux} \frac{1}{\sqrt{2\pi}} e^{- \frac{x^2}{2}} dx \\ &= - u \phi_X(u). \end{aligned} $$

(The third equation follows by integration by parts.) Thus,

$$\displaystyle \begin{aligned} \frac{d}{du} \log(\phi_X(u)) = - u = - \frac{d}{du} \frac{u^2}{2}, \end{aligned}$$

which implies that

$$\displaystyle \begin{aligned} \phi_X(u) = A e^{- \frac{u^2}{2}}. \end{aligned}$$

Since \(\phi_X(0) = E(e^{i0X}) = 1\), we see that \(A = 1\), and this proves the result (4.1). □
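Here is a quick numerical sanity check of Theorem 4.1 (an illustration, not part of the proof): estimate \(E(e^{iuX})\) by Monte Carlo and compare it with \(e^{-u^2/2}\). The sample size and random seed below are arbitrary choices.

```python
# Monte Carlo check of Theorem 4.1: empirical characteristic function
# of N(0, 1) samples versus exp(-u^2/2).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(1_000_000)          # samples of X =_D N(0, 1)

for u in [0.5, 1.0, 2.0]:
    phi_hat = np.mean(np.exp(1j * u * X))   # empirical E(exp(iuX))
    print(u, phi_hat.real, np.exp(-u**2 / 2))
```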

We are now ready to prove the CLT.

4.2 Proof of CLT (Sketch)

The standard technique for analyzing sums of independent random variables is to compute their characteristic function. Let the X(m) be i.i.d. with mean μ and variance \(\sigma^2\), and define

$$\displaystyle \begin{aligned} Y(n) = \frac{X(1) + \cdots + X(n) - n \mu}{\sigma \sqrt{n}}, n \geq 1. \end{aligned}$$

We have

$$\displaystyle \begin{aligned} \phi_{Y(n)} (u) &= E\left(e^{iu Y(n)}\right) = E\left(\varPi_{m=1}^n \exp\left\{\frac{iu(X(m) - \mu)}{\sigma \sqrt{n}} \right\}\right) \\ & = \left[E\left(\exp\left\{\frac{iu(X(1) - \mu)}{\sigma \sqrt{n}}\right\}\right)\right]^n \\ &= \left[E\left(1 + \frac{iu(X(1) - \mu)}{\sigma \sqrt{n}} - \frac{u^2(X(1) - \mu)^2}{2 \sigma^2n} + o(1/n)\right)\right]^n \\ &= \left[1 - u^2/(2n) + o(1/n)\right]^n \rightarrow \exp\left\{- u^2/2\right\}, \mbox{ as } n \rightarrow \infty. \end{aligned} $$

The third equality holds because the X(m) are i.i.d. and the fourth one follows from the Taylor expansion of the exponential:

$$\displaystyle \begin{aligned} e^a \approx 1 + a + \frac{1}{2}a^2. \end{aligned}$$

Thus, the characteristic function of Y(n) converges to that of a \(\mathcal {N}(0, 1)\) random variable. This suggests that the inverse Fourier transform, i.e., the density of Y(n), converges to that of a \(\mathcal {N}(0, 1)\) random variable. This last step can be shown formally, but we will not do it here. \({\square }\)
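The convergence is easy to illustrate by simulation. The sketch below builds Y(n) from i.i.d. Exp(1) random variables (so μ = σ = 1; the choice of Exp(1), the value n = 200, and the number of trials are arbitrary) and compares a few quantiles of Y(n) with those of \(\mathcal{N}(0, 1)\).

```python
# CLT illustration: quantiles of Y(n) built from Exp(1) samples
# versus the N(0, 1) quantiles.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, trials = 200, 100_000
X = rng.exponential(scale=1.0, size=(trials, n))    # X(1), ..., X(n) per trial
Y = (X.sum(axis=1) - n * 1.0) / (1.0 * np.sqrt(n))  # Y(n), with mu = sigma = 1

for q in [0.1, 0.5, 0.9]:
    print(q, np.quantile(Y, q), norm.ppf(q))        # empirical vs N(0, 1)
```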

4.3 Moments of \(\mathcal {N}(0, 1)\)

We can use the characteristic function of a \(\mathcal {N}(0, 1)\) random variable X to calculate its moments. This is how. First we note that, by using the Taylor expansion of the exponential,

$$\displaystyle \begin{aligned} \phi_X(u) & = E\left(e^{iuX}\right) = E\left(\sum_{n=0}^\infty \frac{1}{n!} (iuX)^n \right) \\ &= \sum_{n=0}^\infty \frac{1}{n!} (iu)^n E(X^n). \end{aligned} $$

Second, again using the expansion of the exponential,

$$\displaystyle \begin{aligned} \phi_X(u) = e^{- u^2/2} = \sum_{m = 0}^\infty \frac{1}{m!} \left(- \frac{u^2}{2}\right)^m. \end{aligned}$$

Third, we match the coefficients of \(u^{2m}\) in these two expressions and find that

$$\displaystyle \begin{aligned} \frac{1}{(2m)!} i^{2m} E\left(X^{2m}\right) = \frac{1}{m!} \left(- \frac{1}{2}\right)^m. \end{aligned}$$

Since \(i^{2m} = (-1)^m\), this gives

$$\displaystyle \begin{aligned} E\left(X^{2m}\right) = \frac{(2m)!}{m! 2^m}. \end{aligned} $$
(4.2)

For instance,

$$\displaystyle \begin{aligned} E(X^2) = \frac{2!}{1!2^1} = 1, E\left(X^4\right) = \frac{4!}{2! 2^2} = 3. \end{aligned}$$

Finally, we note that the coefficients of odd powers of u must be zero, so that

$$\displaystyle \begin{aligned} E\left(X^{2m + 1}\right) = 0, \mbox{ for } m = 0, 1, 2, \ldots. \end{aligned}$$

(This should be obvious from the symmetry of \(f_X(x)\).) In particular,

$$\displaystyle \begin{aligned} \mbox{var}(X) = E\left(X^2\right) - E(X)^2 = 1. \end{aligned}$$
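Formula (4.2) is easy to check by simulation; the sample size below is an arbitrary choice.

```python
# Monte Carlo check of (4.2): E(X^(2m)) versus (2m)!/(m! 2^m) for X ~ N(0, 1).
import numpy as np
from math import factorial

rng = np.random.default_rng(2)
X = rng.standard_normal(2_000_000)

for m in [1, 2, 3]:
    estimate = np.mean(X ** (2 * m))
    exact = factorial(2 * m) / (factorial(m) * 2 ** m)   # formula (4.2)
    print(2 * m, estimate, exact)   # e.g., E(X^4) should be close to 3
```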

4.4 Sum of Squares of 2 i.i.d. \(\mathcal {N}(0, 1)\)

Let X, Y  be two i.i.d. \(\mathcal {N}(0, 1)\) random variables. The claim is that

$$\displaystyle \begin{aligned} Z = X^2 + Y^2 =_D Exp(1/2). \end{aligned}$$

Let θ be the angle of the vector (X, Y) and \(R^2 = X^2 + Y^2\). Thus (see Fig. 4.1)

$$\displaystyle \begin{aligned} dx dy = r dr d \theta. \end{aligned}$$
Fig. 4.1 Under the change of variables \(x = r \cos(\theta)\) and \(y = r \sin(\theta)\), one has \(dx\,dy = r\,dr\,d\theta\). That is, \([r, r + dr] \times [\theta, \theta + d\theta]\) covers an area \(r\,dr\,d\theta\) in the (x, y) plane

Note that \(E(Z) = E(X^2) + E(Y^2) = 2\), so that if Z is exponentially distributed, its rate must be 1∕2. Let us prove that it is exponential. One has

$$\displaystyle \begin{aligned} f_{X, Y}(x, y)dxdy &= f_{X, Y}(x, y) r dr d\theta = \frac{1}{2 \pi} \exp\left\{- \frac{x^2 + y^2}{2} \right\}r dr d\theta \\ &= \frac{1}{2\pi} \exp\left\{- \frac{r^2}{2}\right\} r dr d \theta = \frac{d \theta}{2 \pi} \times \exp\left\{- \frac{r^2}{2} \right\} r dr \\ & =: f_\theta (\theta ) d \theta \times f_R (r) dr, \end{aligned} $$

where

$$\displaystyle \begin{aligned} f_\theta (\theta) = \frac{1}{2\pi}1\{0 < \theta < 2\pi\} \mbox{ and } f_R(r) = r \exp\left\{- \frac{r^2}{2} \right\} 1\{r \geq 0\}. \end{aligned}$$

Thus, the angle θ of (X, Y) and the norm \(R = \sqrt {X^2 + Y^2}\) are independent and have the indicated distributions. But then, if \(V = R^2 =: g(R)\), the change of variables formula gives, for v ≥ 0 and \(r = \sqrt{v}\),

$$\displaystyle \begin{aligned} f_V(v) = \frac{1}{|g'(r)|} f_R(r) = \frac{1}{2r} r \exp\left\{- \frac{r^2}{2} \right\} = \frac{1}{2} \exp\left\{- \frac{v}{2} \right\}, \end{aligned}$$

which shows that the angle θ and \(V = X^2 + Y^2\) are independent, the former being uniformly distributed in [0, 2π] and the latter being exponentially distributed with mean 2.
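A quick simulation illustrates the claim: the empirical survival function of \(Z = X^2 + Y^2\) should be close to \(e^{-z/2}\). The sample size below is arbitrary.

```python
# Check that X^2 + Y^2 is Exp(1/2) (mean 2) for X, Y i.i.d. N(0, 1).
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal(1_000_000)
Y = rng.standard_normal(1_000_000)
Z = X**2 + Y**2

print("mean of Z:", Z.mean())                    # should be close to 2
for z in [1.0, 2.0, 4.0]:
    print(z, np.mean(Z > z), np.exp(-z / 2))     # P(Z > z) vs exp(-z/2)
```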

4.5 Two Applications of Characteristic Functions

We have used characteristic functions to prove the CLT. Here are two other cute applications.

4.5.1 Poisson as a Limit of Binomial

A Poisson random variable X with mean λ can be viewed as a limit of a B(n, λ∕n) random variable \(X_n\) as n →∞. To see this, note that

$$\displaystyle \begin{aligned} E(\exp\{iu X_n\}) = E(\exp\{iu(Z_n(1) + \cdots + Z_n(n))\}), \end{aligned}$$

where the random variables \(\{Z_n(1), \ldots, Z_n(n)\}\) are i.i.d. Bernoulli with mean λ∕n. Hence,

$$\displaystyle \begin{aligned} E(\exp\{iu X_n\}) = \left[E(\exp\{iu Z_n(1)\})\right]^n = \left[1 + \frac{\lambda}{n} (e^{iu} - 1)\right]^n. \end{aligned}$$

For the second identity, we use the fact that if \(Z =_D B(p)\), i.e., Z is Bernoulli with parameter p, then

$$\displaystyle \begin{aligned} E(\exp\{iuZ\}) = (1 - p)e^0 + p e^{iu} = 1 + p(e^{iu} - 1). \end{aligned}$$

Also, since

$$\displaystyle \begin{aligned} P(X = m) = \frac{\lambda^m}{m!} e^{-\lambda} \mbox{ and } e^a = \sum_{m=0}^\infty \frac{a^m}{m!}, \end{aligned}$$

we find that

$$\displaystyle \begin{aligned} E(\exp\{iuX\}) = \sum_{m=0}^\infty \frac{\lambda^m}{m!} \exp\{- \lambda\} e^{ium} = \exp\{\lambda (e^{iu} - 1)\}. \end{aligned}$$

The result then follows from the fact that

$$\displaystyle \begin{aligned} \left(1 + \frac{a}{n}\right)^n \to e^a, \mbox{ as } n \to \infty. \end{aligned}$$
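One can also visualize this limit numerically by comparing the B(n, λ∕n) pmf with the Poisson(λ) pmf; the value λ = 3 and the range of m below are arbitrary choices.

```python
# Poisson limit of the binomial: the maximum pmf gap over m = 0, ..., 9
# shrinks as n grows.
from scipy.stats import binom, poisson

lam = 3.0
for n in [10, 100, 1000]:
    gaps = [abs(binom.pmf(m, n, lam / n) - poisson.pmf(m, lam)) for m in range(10)]
    print(n, max(gaps))
```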

4.5.2 Exponential as Limit of Geometric

An exponential random variable can be viewed as a limit of scaled geometric random variables. Let \(X =_D Exp(\lambda)\) and \(X_n =_D G(\lambda/n)\). Then

$$\displaystyle \begin{aligned} \frac{1}{n} X_n \to X, \mbox{ in distribution}. \end{aligned}$$

To see this, recall that

$$\displaystyle \begin{aligned} f_X(x) = \lambda e^{- \lambda x} 1\{x \geq 0\}. \end{aligned}$$

Also,

$$\displaystyle \begin{aligned} \int_0^\infty e^{- \beta x} dx = \frac{1}{\beta} \end{aligned}$$

if the real part of β is positive.

Hence,

$$\displaystyle \begin{aligned} E\left(e^{iuX}\right) = \int_0^\infty e^{iux} \lambda e^{- \lambda x} dx = \frac{\lambda}{\lambda - iu}. \end{aligned}$$

Moreover, since

$$\displaystyle \begin{aligned} P(X_n = m) = (1 - p)^m p, m \geq 0, \end{aligned}$$

we find that, with \(p = \frac {\lambda }{n}\),

$$\displaystyle \begin{aligned} E\left(\exp\left\{iu \frac{1}{n} X_n \right\}\right) & = \sum_{m=0}^\infty (1 - p)^m p \exp\{ium/n\} \\ & = p \sum_{m=0}^\infty [(1 - p)\exp\{iu/n \}]^m = \frac{p}{1 - (1 - p)\exp\{iu/n\}} \\ & = \frac{\lambda/n}{1 - (1 - \lambda/n) \exp\{iu/n\}} \\ & = \frac{\lambda}{n( 1 - (1 - \lambda/n)(1 + iu/n + o(1/n)))} \\ & = \frac{\lambda}{\lambda - iu + o(1)}, \end{aligned} $$

where o(1) denotes a quantity that goes to 0 as n →∞. This proves the result.
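Here is a small simulation sketch of this limit. Note that numpy's geometric generator counts the number of trials (values 1, 2, …), so we subtract 1 to match the convention P(X_n = m) = (1 − p)^m p for m ≥ 0; the values of λ, n, and the sample size are arbitrary choices.

```python
# Exponential limit of the scaled geometric: tail probabilities of
# (1/n) X_n versus exp(-lambda * x).
import numpy as np

rng = np.random.default_rng(4)
lam, n, trials = 2.0, 1000, 1_000_000
Xn = rng.geometric(lam / n, size=trials) - 1     # geometric on {0, 1, 2, ...}
scaled = Xn / n                                   # (1/n) X_n

for x in [0.5, 1.0, 2.0]:
    print(x, np.mean(scaled > x), np.exp(-lam * x))   # vs P(Exp(lambda) > x)
```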

4.6 Error Function

In the calculation of confidence intervals, one uses estimates of

$$\displaystyle \begin{aligned} Q(x) := P(X > x) \mbox{ where } X =_D \mathcal{N}(0, 1). \end{aligned}$$

The function Q(x) is called the error function. With Python or the appropriate smartphone app, you can get the value of Q(x). Nevertheless, the following bounds (see Fig. 4.2) may be useful.

Fig. 4.2 The error function Q(x) and its bounds

Theorem 4.2 (Bounds on Error Function)

One has

$$\displaystyle \begin{aligned} \frac{x}{1 + x^2} \frac{1}{\sqrt{2\pi}} \exp\left\{- \frac{x^2}{2} \right\} \leq Q(x) \leq \frac{1}{x \sqrt{2 \pi}} \exp\left\{- \frac{x^2}{2} \right\}, \forall x > 0. \end{aligned}$$

\({\blacksquare }\)

Proof

Here is a derivation of the upper bound. For x > 0, one has

$$\displaystyle \begin{aligned} Q(x) &= \int_{x}^\infty f_X(y)dy = \int_x^\infty \frac{1}{\sqrt{2 \pi}} e^{- \frac{y^2}{2}} dy = \frac{1}{\sqrt{2 \pi}} \int_x^\infty \frac{y}{y} e^{- \frac{y^2}{2}} dy \\ & \leq \frac{1}{\sqrt{2 \pi}} \int_x^\infty \frac{y}{x} e^{- \frac{y^2}{2}} dy = \frac{1}{x \sqrt{2 \pi}} \int_x^\infty y e^{- \frac{y^2}{2}} dy \\ &= - \frac{1}{x \sqrt{2 \pi}} \int_x^\infty de^{- \frac{y^2}{2}} = \frac{1}{x \sqrt{2 \pi}} e^{- \frac{x^2}{2}}. \end{aligned} $$

For the lower bound, one uses the following calculation, again with x > 0:

$$\displaystyle \begin{aligned} \left(1 + \frac{1}{x^2}\right) \int_x^\infty e^{- \frac{y^2}{2}} dy & \geq \int_{x}^\infty \left(1 + \frac{1}{y^2}\right) e^{- \frac{y^2}{2}}dy \\ &= - \int_{x}^\infty d\left( \frac{1}{y} e^{- \frac{y^2}{2}}\right) = \frac{1}{x} e^{- \frac{x^2}{2}}. \end{aligned} $$

Dividing both sides by \(1 + 1/x^2 = (1 + x^2)/x^2\) and multiplying by \(1/\sqrt{2\pi}\) yields the lower bound. □
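The bounds of Theorem 4.2 are easy to check numerically; the sketch below uses scipy's survival function norm.sf for Q(x) and evaluates both bounds at a few values of x.

```python
# Q(x) = P(N(0,1) > x) sandwiched between the bounds of Theorem 4.2.
import numpy as np
from scipy.stats import norm

for x in [0.5, 1.0, 2.0, 3.0]:
    q = norm.sf(x)                                            # Q(x)
    upper = np.exp(-x**2 / 2) / (x * np.sqrt(2 * np.pi))
    lower = (x / (1 + x**2)) * np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
    print(x, lower, q, upper)       # lower <= Q(x) <= upper
```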

4.7 Adaptive Multiple Access

In Sect. 3.4, we explained a randomized multiple access scheme. In this scheme, there are N active stations and each station attempts to transmit with probability 1∕N in each time slot. This scheme results in a success rate of about 1∕e ≈ 37%. However, it requires that each station knows how many other stations are active.

To make the scheme adaptive to the number of active devices, say that the devices adjust the probability p(n) with which they transmit at time n as follows, where X(n) denotes the number of devices that transmit in slot n:

$$\displaystyle \begin{aligned} p(n+1) = \left\{ \begin{array}{l l} p(n), & \mbox{ if } X(n) = 1; \\ ap(n), & \mbox{ if } X(n) > 1; \\ \min\{bp(n), 1\}, & \mbox{ if } X(n) = 0. \end{array} \right. \end{aligned}$$

In these update rules, a and b are constants with a ∈ (0, 1) and b > 1. The idea is to increase p(n) if no device transmitted and to decrease it after a collision. This scheme is due to Hajek and Van Loon (1982) (Fig. 4.3).

Fig. 4.3 Bruce Hajek

Figure 4.4 shows the evolution over time of the success rate \(T_n\). Here,

$$\displaystyle \begin{aligned} T_n = \frac{1}{n} \sum_{m=0}^{n-1} 1\{X(m) = 1\}. \end{aligned}$$

The figure uses a = 0.8 and b = 1.2. We see that the throughput approaches the optimal value for N = 40 and for N = 100. Thus, the scheme adapts automatically to the number of active devices.

Fig. 4.4 Throughput of the adaptive multiple access scheme
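Here is a minimal simulation sketch of the adaptive scheme with a fixed set of N backlogged stations (no arrivals or departures, unlike Problem 4.2). The values a = 0.8 and b = 1.2 follow the text; the function name, the initial probability p0, the number of slots, and the seed are arbitrary choices.

```python
# Sketch of the adaptive scheme of Sect. 4.7 with N always-backlogged stations.
import numpy as np

def adaptive_throughput(N=40, a=0.8, b=1.2, p0=0.1, slots=100_000, seed=0):
    rng = np.random.default_rng(seed)
    p = p0
    successes = 0
    for _ in range(slots):
        k = int((rng.random(N) < p).sum())   # X(n): number of transmitting stations
        if k == 1:
            successes += 1                   # success: keep p unchanged
        elif k > 1:
            p = a * p                        # collision: decrease p
        else:
            p = min(b * p, 1.0)              # idle slot: increase p
    return successes / slots

print(adaptive_throughput(N=40))     # approaches roughly 1/e ~ 0.37
print(adaptive_throughput(N=100))
```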

4.8 Summary

  • Characteristic Function;

  • Proof of CLT;

  • Moments of Gaussian;

  • Sum of Squares of Gaussians;

  • Poisson as limit of Binomial;

  • Exponential as limit of Geometric;

  • Adaptive Multiple Access Protocol.

4.8.1 Key Equations and Formulas

Characteristic Function | \(\phi_X(u) = E(\exp\{iuX\})\) | D. 4.1

For \(\mathcal {N}(0, 1)\) | \(\phi_X(u) = \exp\{- u^2/2\}\) | T. 4.1

Moments of \(\mathcal {N}(0, 1)\) | \(E(X^{2m}) = (2m)!/(m!\,2^m)\) | (4.2)

Error Function \(P(\mathcal {N}(0, 1) > x)\) | Bounds | T. 4.2

4.9 References

The CLT is a classical result, see Bertsekas and Tsitsiklis (2008), Grimmett and Stirzaker (2001) or Billingsley (2012).

4.10 Problems

Problem 4.1

Let X be a \(\mathcal {N}(0, 1)\) random variable. You will recall that \(E(X^2) = 1\) and \(E(X^4) = 3\).

  (a) Use Chebyshev's inequality to get a bound on P(|X| > 2);

  (b) Use the inequality that involves the fourth moment of X to bound P(|X| > 2). Do you get a better bound?

  (c) Compare with what you know about the \(\mathcal {N}(0, 1)\) random variable.

Problem 4.2

Write a Python simulation of Hajek’s random multiple access scheme. There are 20 stations. An arrival occurs at each station with probability λ∕20 at each time slot. The stations update their transmission probability as explained in the text. Plot the total backlog in all the stations as a function of time.

Problem 4.3

Consider a multiple access scheme where the N stations independently transmit short reservation packets, each of duration one time unit, with probability p. If the reservation packets collide or no station transmits a reservation packet, the stations try again. Once a reservation is successful, the station whose reservation succeeded transmits a packet during K time units. After that transmission, the process repeats. Calculate the maximum fraction of time that the channel can be used for transmitting packets. Note: This scheme is called Reservation Aloha.

Problem 4.4

Let X be a random variable with mean zero and variance 1. Show that \(E(X^4) \geq 1\).

Hint

Use the fact that \(E((X^2 - 1)^2) \geq 0\).

Problem 4.5

Let X, Y  be two random variables. Show that

$$\displaystyle \begin{aligned} (E(XY))^2 \leq E(X^2)E(Y^2). \end{aligned}$$

This is the Cauchy–Schwarz inequality.

Hint

Use \(E((\lambda X - Y)^2) \geq 0\) with \(\lambda = E(XY)/E(X^2)\).