1 Introduction

Consider a sequence of independent identically distributed (i.i.d.) positive integer valued random variables \(\{X_{n}\}_{n=1}^{\infty }\). Denote the corresponding sequence of upper records by \(\{X^{(n)}\}_{n=1}^{\infty }\). Specifically, the first random variable in the sequence is identified as the first record, and the second record is the first subsequent \(X_{n}\) which exceeds \(X_{1}\). It is well known that the record value sequence corresponding to a sequence of geometric random variables has a simple distributional structure. If we define the record spacings sequence \(\{S_{n}\}_{n=1}^{\infty }\) by \(S_{1}=X_{1}=X^{(1)}\) and, for n > 1, \(S_{n}=X^{(n)}-X^{(n-1)}\), then in the geometric case these spacings are independent random variables. Geometric characterizations based on the independence of the record spacings are well known. In the present paper we will consider a simple relationship between the distribution of the first two records and the distribution of the first two \(X_{n}\)’s. Two related conjectured characterizations are described. In addition, parallel results are discussed in the case of weak records.

2 The Conjectured Characterizations

Consider a sequence of i.i.d. positive integer valued random variables \( \{X_{n}\}_{n=1}^{\infty }\) with corresponding upper record sequence \(\{X^{(n)}\}_{n=1}^{\infty }\). If the \(X_{i}\)’s have a common geometric distribution, then because the record spacings are themselves i.i.d. geometric with the same success probability, it follows that

$$ X^{(1)}+X^{(2)} \overset{d}{=}X_{1}+2X_{2}. $$
(2.1)

After formulating this unusual relationship between the two sequences, \( \{X_{n}\}_{n=1}^{\infty }\) and \(\{X^{(n)}\}_{n=1}^{\infty }\), it becomes plausible that this is a characteristic property of the geometric distribution. Two conjectures were considered.

Conjecture 1.

Suppose that \(X^{(1)}+X^{(2)} \overset {d}{=}X_{1}+2X_{2}\). Then \(p_{x}=P(X=x)=p(1-p)^{x-1}\) for each x = 1, 2, ..., for some p ∈ (0, 1).

Conjecture 2.

Suppose that, for some positive integer m > 2, \({\sum }_{i=1}^{m} X^{(i)} \overset {d}{=}{\sum }_{i=1}^{m} iX_{i}\). Then \(p_{x}=P(X=x)=p(1-p)^{x-1}\) for each x = 1, 2, ..., for some p ∈ (0, 1).

Both conjectures are judged to be plausible. Conjecture 2 would appear to be more difficult to resolve. In the next section we will provide a proof of Conjecture 1 under no regularity conditions. A proof of Conjecture 2 remains elusive.

3 Proof of Conjecture 1

Throughout this section we will employ the usual convention, when convenient, of denoting 1 − p by q and denoting \(1-p_{x}\) by \(q_{x}\).

Theorem 1.

If \( \{X_{n}\}_{n=1}^{\infty }\) are i.i.d. positive integer valued random variables with common discrete density function \(f(x)=p_{x}\), x = 1, 2, ..., where \(p_{x}>0\) ∀x so that a record value sequence is well-defined, and if \(X^{(1)}+X^{(2)} \overset {d}{=}X_{1}+2X_{2}\), then \(p_{x}=p(1-p)^{x-1}\), x = 1, 2, ..., for some p ∈ (0, 1).

Proof.

First note that the set of possible values of \(X_{1}+2X_{2}\) and of \(X^{(1)}+X^{(2)}\) is the set {3, 4, 5, ...}.

Necessity

It is well-known that if the \(X_{i}\)’s are i.i.d. with a common geometric(p) distribution, then the record spacings \(X^{(m)}-X^{(m-1)}\) are also i.i.d. with a common geometric(p) distribution. Since we can write \(X^{(1)}+X^{(2)}=(X^{(2)}-X^{(1)})+2X^{(1)}\), the result follows.
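The necessity claim can be checked numerically. The following Python sketch computes both probability mass functions exactly with rational arithmetic. It assumes the standard conditional form \(P(X^{(2)}=i \mid X^{(1)}=j)=p_{i}/P(X>j)\) for i > j, which for a geometric(p) variable gives \(P(X>j)=q^{j}\). The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

def geometric_pmf(p, x):
    """P(X = x) = p (1 - p)^(x - 1) for x = 1, 2, ..."""
    return p * (1 - p) ** (x - 1)

def pmf_weighted_sum(p, s):
    """P(X_1 + 2 X_2 = s): sum over X_2 = j with X_1 = s - 2j >= 1."""
    return sum(
        (geometric_pmf(p, j) * geometric_pmf(p, s - 2 * j)
         for j in range(1, (s - 1) // 2 + 1)),
        Fraction(0),
    )

def pmf_record_sum(p, s):
    """P(X^(1) + X^(2) = s): sum over X^(1) = j with X^(2) = s - j > j."""
    q = 1 - p
    return sum(
        (geometric_pmf(p, j) * geometric_pmf(p, s - j) / q ** j
         for j in range(1, (s - 1) // 2 + 1)),
        Fraction(0),
    )

# Compare the two mass functions on an initial stretch of the support.
p = Fraction(1, 3)
for s in range(3, 30):
    assert pmf_weighted_sum(p, s) == pmf_record_sum(p, s)
```

Exact fractions are used so that equality can be tested without rounding error; only finitely many support points are compared, so this is a sanity check rather than a proof.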

Sufficiency

As in the statement of the theorem we have \(P(X=x)=p_{x}\), x = 1, 2, ....

Assuming that \(X^{(1)}+X^{(2)} \overset {d}{=}X_{1}+2X_{2}\), we wish to prove that \(p_{x}=pq^{x-1}\), x = 1, 2, .... First note that

$$ P(X_{1}+2X_{2}=3)= p_{1}p_{1}, $$

while

$$ P(X^{(1)}+X^{(2)}=3)= p_{1}\frac{p_{2}}{q_{1}}. $$

Equating these expressions we may conclude that \(p_{2}=p_{1}q_{1}\). For simplicity of notation we will denote \(p_{1}\) by p. Thus far we have shown that \(p_{1}=p=pq^{1-1}\) and \(p_{2}=pq=pq^{2-1}\). We now argue inductively. Suppose that for some positive even integer 2k, we have \(p_{j}=pq^{j-1}\) for every j ≤ 2k, and hence \(P(X>j)=q^{j}\) for every j ≤ 2k. We claim that in such a case, because of Eq. 2.1, we will also have \(p_{2k+1}=pq^{(2k+1)-1}\). To see this, consider

$$ \begin{array}{@{}rcl@{}} P(X_{1}+2X_{2}=2k+2)&=&\sum\limits_{j=1}^{k}P(X_{2}=j,X_{1}=2k+2-2j) \\ &=&\sum\limits_{j=1}^{k} pq^{j-1}\,pq^{2k+1-2j} \\ &=&p^{2}\sum\limits_{j=1}^{k} q^{2k-j}, \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} P(X^{(1)}+X^{(2)}=2k+2)&=&\sum\limits_{j=1}^{k}P(X^{(1)}=j,X^{(2)}=2k+2-j) \\ &=&\sum\limits_{j=2}^{k} pq^{j-1}\frac{pq^{2k+1-j}}{q^{j}}+p\frac{p_{2k+1}}{q} \\ &=&p^{2}\sum\limits_{j=2}^{k} q^{2k-j}+pp_{2k+1}/q. \end{array} $$

Since (2.1) holds, we may conclude that

$$ pp_{2k+1}/q=p^{2}q^{2k-1}, $$

which implies that \(p_{2k+1}=pq^{2k}=pq^{(2k+1)-1}\), as claimed.

A similar argument will show that if for some positive odd integer 2k − 1, we have \(p_{j}=pq^{j-1}\) for every j ≤ 2k − 1, then because of Eq. 2.1, we will also have \(p_{2k}=pq^{2k-1}\). For this, it is necessary to equate \(P(X_{1}+2X_{2}=2k+1)\) and \(P(X^{(1)}+X^{(2)}=2k+1)\).

It then follows by induction that \(p_{x}=pq^{x-1}\) for every x = 1, 2, ..., i.e., that X has a geometric(p) distribution.
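The induction can be mirrored by a short computation. The Python sketch below starts from an arbitrary \(p_{1}\) and solves the equations \(P(X_{1}+2X_{2}=s)=P(X^{(1)}+X^{(2)}=s)\) successively for s = 3, 4, ...; the equation at s contains \(p_{s-1}\) only in the term with \(X^{(1)}=1\), so it determines \(p_{s-1}\) from the earlier values. The function name `forced_pmf` and the helper `tail` are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

def forced_pmf(p1, n_terms):
    """Return [None, p_1, ..., p_{n_terms}] forced by matching the two
    mass functions at s = 3, 4, ...; index 0 is an unused placeholder."""
    pmf = [None, p1]

    def tail(j):
        # P(X > j), computed from the probabilities found so far.
        return 1 - sum(pmf[1:j + 1])

    s = 3
    while len(pmf) <= n_terms:
        top = (s - 1) // 2  # largest j with s - 2j >= 1, and with s - j > j
        # P(X_1 + 2 X_2 = s): only indices up to s - 2 appear.
        lhs = sum(pmf[j] * pmf[s - 2 * j] for j in range(1, top + 1))
        # Record-side terms with X^(1) = j >= 2: again indices up to s - 2.
        known = sum(pmf[j] * pmf[s - j] / tail(j) for j in range(2, top + 1))
        # Remaining record-side term is p_1 * p_{s-1} / P(X > 1).
        pmf.append((lhs - known) * tail(1) / pmf[1])
        s += 1
    return pmf

p = Fraction(2, 7)
pmf = forced_pmf(p, 12)
# Every forced value agrees with the geometric form p q^(x-1).
assert all(pmf[x] == p * (1 - p) ** (x - 1) for x in range(1, 13))
```

The computation uses exact fractions, so agreement with \(pq^{x-1}\) is checked as an identity for the terms computed, in line with Theorem 1.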

4 Discussion Regarding Conjecture 2

The proof of Theorem 1 was less transparent than was expected. Although Conjecture 2 is eminently plausible, the book-keeping necessary to prove the result appears to be daunting and the conjecture remains open. However, if we consider the case in which m = 3, the following observations suggest that the conjecture is unlikely to be true.

The possible values of \(X_{1}+2X_{2}+3X_{3}\) and of \(X^{(1)}+X^{(2)}+X^{(3)}\) are {6, 7, 8, ...}. If we assume that \(P(X_{1}+2X_{2}+3X_{3}=6)=P(X^{(1)}+X^{(2)}+X^{(3)}=6)\), this implies that

$${p_{1}^{3}}=p_{1}\frac{p_{2}}{1-p_{1}}\frac{p_{3}}{1-p_{1}-p_{2}},$$

from which we obtain

$$p_{3}=\frac{{p_{1}^{2}}(1-p_{1})(1-p_{1}-p_{2})}{p_{2}}.$$

Thus \(p_{1}\) and \(p_{2}\) appear to be unconstrained, except that their sum must be less than 1.

If we consider other possible values, i.e., consider equalities of the form

$$P(X_{1}+2X_{2}+3X_{3}=y)=P(X^{(1)}+X^{(2)}+X^{(3)}=y),$$

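then each new value of y will result in an expression for one further probability, namely \(p_{y-3}\), in terms of \(p_{1}, p_{2}, \ldots , p_{y-4}\) (for y = 6 this is the expression for \(p_{3}\) given above). However, no obvious constraints on \(p_{1}\) or \(p_{2}\) appear to arise.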

Of course, if \(p_{2}=p_{1}(1-p_{1})\) then subsequent \(p_{j}\)’s appear to be of the geometric form (i.e., \(p_{j}=p_{1}(1-p_{1})^{j-1}\)). However, other choices for \(p_{2}\) would seem to lead to non-geometric solutions.
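These observations can be explored with a small computation. The Python sketch below takes \(p_{1}\) and \(p_{2}\) as free inputs and solves the m = 3 equations successively for y = 6, 7, ...; it assumes the record probabilities factor as \(p_{r_{1}}\,\frac{p_{r_{2}}}{P(X>r_{1})}\,\frac{p_{r_{3}}}{P(X>r_{2})}\) for \(r_{1}<r_{2}<r_{3}\), as in the y = 6 case above. The name `forced_pmf_m3` is hypothetical. The sketch does not check whether the resulting values stay positive or sum to one, which is exactly the question left open.

```python
from fractions import Fraction

def forced_pmf_m3(p1, p2, n_terms):
    """Return [None, p_1, ..., p_{n_terms}] obtained by matching
    P(X_1 + 2X_2 + 3X_3 = y) and P(X^(1) + X^(2) + X^(3) = y), y = 6, 7, ..."""
    pmf = [None, p1, p2]

    def tail(j):
        # P(X > j), computed from the probabilities found so far.
        return 1 - sum(pmf[1:j + 1])

    y = 6
    while len(pmf) <= n_terms:
        # Left side: triples (a, b, c) with a + 2b + 3c = y, where a = y - 2b - 3c.
        lhs = sum(pmf[y - 2 * b - 3 * c] * pmf[b] * pmf[c]
                  for c in range(1, y) for b in range(1, y)
                  if y - 2 * b - 3 * c >= 1)
        # Right side without the records (1, 2, y - 3), which carry the unknown.
        known = sum(pmf[r1] * pmf[r2] / tail(r1) * pmf[y - r1 - r2] / tail(r2)
                    for r1 in range(1, y) for r2 in range(r1 + 1, y)
                    if y - r1 - r2 > r2 and (r1, r2) != (1, 2))
        # Solve p_1 * p_2 / P(X > 1) * p_{y-3} / P(X > 2) = lhs - known.
        pmf.append((lhs - known) * tail(1) * tail(2) / (pmf[1] * pmf[2]))
        y += 1
    return pmf

half = Fraction(1, 2)
# Geometric start p_2 = p_1 (1 - p_1): the forced values stay geometric.
geo = forced_pmf_m3(half, half * (1 - half), 8)
assert all(geo[x] == half ** x for x in range(1, 9))
# A different p_2 still yields a value for p_3, here 1/16 rather than 1/8.
assert forced_pmf_m3(half, Fraction(1, 3), 3)[3] == Fraction(1, 16)
```

Running the recursion further for a non-geometric \(p_{2}\) gives candidate values only; deciding whether they form a valid density requires a separate argument.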

Cases in which m > 3 exhibit similar problems and, in fact, would appear to admit an even wider variety of non-geometric solutions. It appears that only in the case m = 2 is a characterization possible.

Remark 1.

We have carefully avoided stating that non-geometric solutions will exist in cases in which m > 2, because we have been unable to determine explicitly a convergent non-geometric sequence that satisfies the condition \({\sum }_{i=1}^{m}X^{(i)} \overset {d}{=} {\sum }_{i=1}^{m} iX_{i}.\)

5 An Analogous Weak Record Result

When we turn to investigate record phenomena for sequences of i.i.d. non-negative integer valued random variables, the concept of weak records plays the role usually played by records. An observation in the sequence \(\{X_{i}\}_{i=1}^{\infty }\) is a weak record if it exceeds or equals all the preceding \(X_{i}\)’s in the sequence. In this setting geometric random variables with possible values {0, 1, 2, ...} play a role analogous to that played by positive geometric variables in record value discussions. In this section we will add asterisks to non-negative integer random variables and corresponding weak records to distinguish them from the positive random variables and ordinary records discussed in the previous sections.

We thus will consider a sequence \(\{X^{*}_{i}\}_{i=1}^{\infty }\) of non-negative random variables with a corresponding weak record sequence denoted by \(\{X^{*(i)}\}_{i=1}^{\infty }\) (an introduction to weak records can be found in Arnold et al. (1998)). We will say that a non-negative integer valued random variable \(X^{*}\) has a geometric distribution if its discrete density is of the form \(P(X^{*}=k)=p(1-p)^{k}\), k = 0, 1, 2, ..., and we write \(X^{*} \sim geo^{*}(p)\). Parallel to the result for positive geometric variables, it is well-known that the weak record spacings corresponding to \(geo^{*}(p)\) variables are themselves i.i.d. with a common \(geo^{*}(p)\) distribution. It is consequently plausible that the following result, analogous to Theorem 1, might be true (this was suggested by a referee). The proof is a close parallel to the proof for ordinary (i.e., positive) geometric variables.

Theorem 2.

If \( \{X^{*}_{n}\}_{n=1}^{\infty }\) are i.i.d. non-negative integer valued random variables with common discrete density function \(f(x)=p_{x}\), x = 0, 1, 2, ..., where \(p_{x}>0\) ∀x so that a weak record value sequence is well-defined, and if \(X^{*(1)}+X^{*(2)} \overset {d}{=}X^{*}_{1}+2X^{*}_{2}\), then \(p_{x}=p(1-p)^{x}\), x = 0, 1, 2, ..., for some p ∈ (0, 1).

Proof.

First note that the set of possible values of \(X^{*}_{1}+2X^{*}_{2}\) and of \(X^{*(1)}+X^{*(2)}\) is the set {0, 1, 2, ...}.

Necessity

We use the fact that if the \(X^{*}_{i}\)’s are i.i.d. with a common geometric(p) distribution, then the weak record spacings \(X^{*(m)}-X^{*(m-1)}\) are also i.i.d. with a common geometric(p) distribution. Since we can write \(X^{*(1)}+X^{*(2)}=(X^{*(2)}-X^{*(1)})+2X^{*(1)}\), the result follows.

Sufficiency

As in the statement of the theorem we have \(P(X=x)=p_{x}\), x = 0, 1, 2, ...; however, it will be convenient to denote \(p_{0}\) by p ∈ (0, 1).

For convenience we define \(V=X_{1}^{*}+2X_{2}^{*}\) and \(W=X^{*(1)}+X^{*(2)}\). Under the assumption that \(V\overset {d}{=}W\) we wish to prove that \(p_{k}=p(1-p)^{k}\), k = 0, 1, 2, ..., where \(p=P(X^{*}_{1}=0)\). Elementary computations yield the following expressions for the discrete densities of V and W, in which we use the notation \(q_{j}=P(X_{1}^{*} \geq j)\).

$$ \text{For}~ k ~\text{odd, } P(V=k)=\sum\limits_{j=0}^{(k-1)/2}p_{j}p_{k-2j}, $$
(5.1)
$$ \text{For}~ k ~\text{odd, } P(W=k)=\sum\limits_{j=0}^{(k-1)/2}p_{j}p_{k-j}/q_{j}, $$
(5.2)
$$ \text{For}~ k~ \text{even, } P(V=k)=\sum\limits_{j=0}^{k/2}p_{j}p_{k-2j}, $$
(5.3)
$$ \text{For}~ k ~\text{even, } P(W=k)=\sum\limits_{j=0}^{k/2}p_{j}p_{k-j}/q_{j}. $$
(5.4)

Since \(V\overset {d}{=}W\), we can equate (5.3) and (5.4) when k = 2 and conclude that \(p_{1}=p(1-p)\) (the equalities for k = 0 and k = 1 hold trivially and give no information). Next consider an arbitrary k > 2 and assume that, for j < k − 1, it has been verified that \(p_{j}=p(1-p)^{j}\) and \(q_{j}=(1-p)^{j}\). Then by equating (5.1) and (5.2), if k is odd, or by equating (5.3) and (5.4), if k is even, we may conclude that \(p_{k-1}=p(1-p)^{k-1}\). We may thus, by induction, conclude that \(P(X^{*}_{1}=k)=p_{k}=p(1-p)^{k}, \ \ k=0,1,2,\ldots\), i.e., that \(X^{*}_{1} \sim geo^{*}(p)\).
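The weak record induction can be traced in the same computational way. In the Python sketch below, the j = 0 terms of P(V = k) and P(W = k) both equal \(p_{0}p_{k}\) and cancel, the remaining sums run over 1 ≤ j ≤ k/2 (a weak record pair with sum k requires k − j ≥ j), and the j = 1 term of P(W = k) is the only one containing \(p_{k-1}\). The names `forced_weak_pmf` and `upper` are illustrative assumptions.

```python
from fractions import Fraction

def forced_weak_pmf(p0, n_terms):
    """Return [p_0, ..., p_{n_terms - 1}] forced by P(V = k) = P(W = k)."""
    # Matching at k = 2 gives p_1 * p_0 = p_1^2 / q_1, i.e. p_1 = p_0 (1 - p_0).
    pmf = [p0, p0 * (1 - p0)]

    def upper(j):
        # q_j = P(X >= j), computed from the probabilities found so far.
        return 1 - sum(pmf[:j])

    k = 3
    while len(pmf) < n_terms:
        m = k // 2
        # Terms of P(V = k) with j >= 1: indices at most k - 2.
        v_side = sum(pmf[j] * pmf[k - 2 * j] for j in range(1, m + 1))
        # Terms of P(W = k) with j >= 2: indices at most k - 2.
        w_known = sum(pmf[j] * pmf[k - j] / upper(j) for j in range(2, m + 1))
        # Remaining term of P(W = k) is p_1 * p_{k-1} / q_1.
        pmf.append((v_side - w_known) * upper(1) / pmf[1])
        k += 1
    return pmf

p = Fraction(1, 4)
pmf = forced_weak_pmf(p, 10)
# Every forced value agrees with the geo*(p) form p (1 - p)^k.
assert all(pmf[k] == p * (1 - p) ** k for k in range(10))
```

As in the positive case, exact rational arithmetic makes the comparison with \(p(1-p)^{k}\) an identity check for the terms computed.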

6 Closing Observations

Conjecture 2 continues to be tantalizing. Our arguments in Section 4 strongly suggest that it will not prove to be true. One might try to use simulations to compare the distributions of \(X_{1}+2X_{2}\) and of \(X^{(1)}+X^{(2)}\) using a particular non-geometric distribution for the \(X_{i}\)’s. However, it is highly unlikely that any well-known choice for the distribution of the \(X_{i}\)’s will result in the desired equi-distribution of the two statistics. We believe that the best hope for resolving the problem lies in identifying a convergent non-geometric discrete density as outlined at the end of Section 4.