Iterated full information secretary problem

We consider a full information best-choice problem where an administrator who has only one on-line choice in m consecutive searches has to choose the best candidate in one of them.


Introduction and notation
In the full information best-choice problem [see Gilbert and Mosteller (1966)] we deal with a discrete time stochastic process $(X_1, \dots, X_n)$, where $X_1, \dots, X_n$ are i.i.d. random variables with a known continuous cumulative distribution function $F$. We observe the elements of $(X_1, \dots, X_n)$ one by one, and our goal is to choose on-line the largest element of $X_1, \dots, X_n$, which is not known a priori. Stopping the process at a given moment means choosing the object observed at this moment on the basis of the observations made so far. The best-choice problem consists of finding a stopping strategy that maximizes the probability $P[X_\tau = \max\{X_1, \dots, X_n\}]$ over all stopping times $\tau \le n$ [see Gnedin (1996)].

Numbers $d_k$, called decision numbers, are implicitly defined by the following equalities: $d_0 = 0$ and, for $k = 1, 2, \dots$,

$$\sum_{j=1}^{k} \frac{d_k^{-j} - 1}{j} = 1.$$

The optimal stopping time is given by the following formula:

$$\tau^*_n = \min\{t : X_t = \max\{X_1, \dots, X_t\} \text{ and } F(X_t) \ge d_{n-t}\}$$

if the set under the minimum is nonempty; otherwise $\tau^*_n = n$. It is known [see Samuels (1991)] that the sequence $(d_k)$ is increasing in $k$ and $\lim_{k \to \infty} k(1 - d_k) = c$, where $c$ satisfies $\int_0^c x^{-1}(e^x - 1)\,dx = 1$.

The maximal probability (attained by the optimal stopping time) $v_n = P[X_{\tau^*_n} = \max\{X_1, \dots, X_n\}]$ does not depend on $F$, is strictly decreasing in $n$, and

$$\lim_{n \to \infty} v_n = e^{-c} + (e^c - c - 1)\int_1^\infty x^{-1} e^{-cx}\,dx \approx 0.5802 \qquad (1)$$

[see Samuels (1991)].

In this paper we consider a modification of the classical full information best-choice problem. Namely, we consider $m$ consecutive classical full information searches. Our aim is to choose the largest element in one of them, given that we have only one choice. Our goal is to find a strategy that maximizes the probability of achieving this aim.
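The classical threshold rule recalled above can be sketched numerically. The following Python sketch assumes the Gilbert and Mosteller form of the decision-number equation, $\sum_{j=1}^{k}(d_k^{-j}-1)/j = 1$, and a standard uniform $F$; the function names `decision_number` and `estimate_win_probability` are hypothetical and are introduced only for this illustration.

```python
import random

def decision_number(k, tol=1e-12):
    """Solve sum_{j=1..k} (d**-j - 1)/j = 1 for d by bisection (d_0 = 0)."""
    if k == 0:
        return 0.0

    def lhs(d):
        return sum((d ** -j - 1.0) / j for j in range(1, k + 1))

    # lhs is decreasing in d, lhs(0.5) >= 1 and lhs(1) = 0,
    # so the root lies in [0.5, 1).
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_win_probability(n, trials=20000, seed=0):
    """Monte Carlo estimate of v_n for uniform observations."""
    rng = random.Random(seed)
    d = [decision_number(k) for k in range(n)]  # d[k] is used when k items remain
    wins = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        best = max(xs)
        running = -1.0
        chosen = xs[-1]  # tau = n when no record passes its threshold
        for t, x in enumerate(xs, start=1):
            if x > running:          # relative record at time t
                running = x
                if x >= d[n - t]:    # passes the decision number d_{n-t}
                    chosen = x
                    break
        wins += chosen == best
    return wins / trials

if __name__ == "__main__":
    print([round(decision_number(k), 4) for k in range(1, 6)])
    print(estimate_win_probability(2))
```

For $n = 2$ the rule reduces to accepting $X_1$ when $X_1 \ge 1/2$, which gives a success probability of exactly $3/4$; the simulation should land close to that value.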
The problem considered here is related to real life situations where a contest may be repeated several times, but the procedure ends as soon as a choice is made in one of the contests. Usually selectors know how many times they can repeat the contest and, intuitively, the more contests lie ahead, the more selective they can be.
The solution of the no information version for the repeated contest problem was presented in Kuchta and Morayne (2014a).
Here is a formal description of the problem considered.
Let $n_1, \dots, n_m \in \mathbb{N}$ and let $X^{(m)}, X^{(m-1)}, \dots, X^{(1)}$ be a sequence of $m$ consecutive searches: for $i = 0, \dots, m-1$,

$$X^{(m-i)} = \Bigl(X_{\sum_{k=0}^{i-1} n_{m-k} + 1}, \dots, X_{\sum_{k=0}^{i} n_{m-k}}\Bigr),$$

where the elements of $X^{(m-i)}$ are independent random variables with a known continuous cumulative distribution function $F_{m-i}$. (The inverse numbering simplifies some technicalities in the proof and is adjusted to the recursion we will use.) Since the largest measurement in a sample remains the largest under all monotonic transformations of its variable, we lose no generality by assuming that $F_{m-i}$ is the standard uniform distribution for all searches.

Let $Y^{(m)}$ denote the whole sequence of $m$ searches and $\mathrm{Max}\,Y^{(m)}$ the set of the largest elements of the individual searches. The search is on-line: at time $t = \sum_{k=0}^{i-1} n_{m-k} + j$ the selector has seen the whole sequences $X^{(m)}, X^{(m-1)}, \dots, X^{(m-i+1)}$ and the first $j$ values of the search $X^{(m-i)}$. The goal of the selector is to stop the search at a time $t$ maximizing the probability that $X_t \in \mathrm{Max}\,Y^{(m)}$. Formally, for $t \ge 1$, let $\mathcal{F}_t$ be the $\sigma$-algebra generated by $X_1, \dots, X_t$, $\mathcal{F}_t = \sigma(X_1, \dots, X_t)$. Our aim is to find a stopping time $\tau_m$ with respect to the filtration $(\mathcal{F}_t)$ maximizing the probability $P[X_{\tau_m} \in \mathrm{Max}\,Y^{(m)}]$.
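The inverse numbering of the searches can be illustrated with a small indexing sketch. It assumes that the search sizes are listed in the order in which the searches are played, $[n_m, n_{m-1}, \dots, n_1]$, and that global indices start at 1; the helper name `search_slices` is hypothetical.

```python
def search_slices(sizes_in_play_order):
    """Map each search label (m, m-1, ..., 1) to its global index range.

    sizes_in_play_order = [n_m, n_{m-1}, ..., n_1]; the returned ranges
    are 1-based and inclusive, matching X_1, X_2, ... in the text.
    """
    m = len(sizes_in_play_order)
    slices = {}
    start = 1
    for i, n in enumerate(sizes_in_play_order):
        label = m - i                      # the search X^(m-i)
        slices[label] = (start, start + n - 1)
        start += n
    return slices

if __name__ == "__main__":
    # Three searches of sizes n_3 = 3, n_2 = 2, n_1 = 4.
    print(search_slices([3, 2, 4]))
```

With sizes $n_3 = 3$, $n_2 = 2$, $n_1 = 4$, the first search $X^{(3)}$ occupies indices 1 to 3, the search $X^{(2)}$ indices 4 to 5, and the last search $X^{(1)}$ indices 6 to 9.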

Optimal stopping time
Let us recall the Monotone Case Theorem [see Chow et al. (1971)], which is often a very useful tool when looking for an optimal stopping time.
We apply this theorem to determine an optimal stopping time for $m$ searches, i.e. for $Y^{(m)}$.
Let $\gamma_{m-1}$ be the probability of success using an optimal stopping time for $Y^{(m-1)}$.
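For $1 \le k \le n_m$ we define the sequence of multiple search decision numbers $\hat d_k$ in the following way: $\hat d_k$ is the solution $x \in (0, 1]$ of the equation

$$\sum_{j=1}^{k} \frac{x^{-j} - 1}{j} = 1 - \gamma_{m-1}, \qquad (2)$$

and we set $\hat d_0 = 0$. Notice that the numbers $\hat d_k$ are to be used only in the first search $X^{(m)}$. For $m = 1$, $\hat d_k = d_k$.

Now let us define the following stopping times $\tau_m$ for $Y^{(m)}$. For $m = 1$:

$$\tau_1 = \min\{t : X_t = \max\{X_1, \dots, X_t\} \text{ and } X_t \ge d_{n_1 - t}\}$$

if the set under the minimum is nonempty, otherwise $\tau_1 = n_1$, and, for $m > 1$:

$$\tau_m = \begin{cases} \min\{t \le n_m : X_t = \max\{X_1, \dots, X_t\} \text{ and } X_t \ge \hat d_{n_m - t}\} & \text{if the set under the minimum is nonempty,} \\ n_m + \tau_{m-1} & \text{otherwise.} \end{cases}$$

In the first case of the definition above we choose from the first search $X^{(m)} = (X_1, \dots, X_{n_m})$. In the second case we choose from among the remaining $m - 1$ searches $Y^{(m-1)}$, so the recursion is used.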

Theorem 2.2
The stopping time $\tau_m$ is optimal for $Y^{(m)}$. When using $\tau_m$, the probability of success $\gamma_m$ is given by formula (3), where we set $\gamma_0 = 0$.
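A small simulation can make the recursive rule concrete. The sketch below rests on two assumptions: that the threshold used when $k$ observations remain in the current search solves $\sum_{j=1}^{k}(x^{-j}-1)/j = 1 - \gamma$, where $\gamma$ is the optimal success probability of the searches still ahead, and that these values of $\gamma$ are supplied by the caller. The names `threshold` and `estimate_success` and the argument `gammas_after` are hypothetical.

```python
import random

def threshold(k, gamma_prev, tol=1e-12):
    """Solve sum_{j=1..k} (x**-j - 1)/j = 1 - gamma_prev by bisection."""
    if k == 0:
        return 0.0  # a record at the last position of a search always wins
    target = 1.0 - gamma_prev

    def exceeds(x):
        total, power = 0.0, 1.0
        for j in range(1, k + 1):
            power /= x
            total += (power - 1.0) / j
            if total > target:
                return True
        return False

    lo, hi = 0.5, 1.0  # the root is at least 1/2 because target <= 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if exceeds(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_success(sizes, gammas_after, trials=40000, seed=1):
    """sizes = [n_m, ..., n_1]; gammas_after[i] is the (assumed known)
    success probability of optimal play in the searches after search i."""
    rng = random.Random(seed)
    cuts = [[threshold(n - t, g) for t in range(1, n + 1)]
            for n, g in zip(sizes, gammas_after)]
    wins = 0
    for _ in range(trials):
        for n, cut in zip(sizes, cuts):
            xs = [rng.random() for _ in range(n)]
            best, running, stopped = max(xs), -1.0, False
            for t, x in enumerate(xs):       # t is 0-based here
                if x > running:              # relative record
                    running = x
                    if x >= cut[t]:          # passes its threshold: stop
                        wins += x == best
                        stopped = True
                        break
            if stopped:
                break
    return wins / trials

if __name__ == "__main__":
    print(estimate_success([2, 2], [0.75, 0.0]))
```

For two searches of two observations each, a direct calculation gives $\gamma_1 = 3/4$, a first-search threshold of $4/5$, and $\gamma_2 = 0.18 + 0.48 + 0.32 \cdot 0.75 = 0.9$, so the simulated frequency should be close to $0.9$.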
Proof. In the proof we use recursion with respect to $m$. If $\tau_{m-1}$ is an optimal stopping time for $Y^{(m-1)}$, then, when looking for an optimal stopping time $\tau_m$ for $m$ searches, i.e. for $Y^{(m)}$, the only stopping times that should be considered are the times of relative records in $X^{(m)}$ and the optimal stopping time $\tau_{m-1}$ in the remaining $m - 1$ searches $Y^{(m-1)}$. Namely, let $\rho_1 = 1$ and, for $2 \le i \le \sum_{j=1}^{m} n_j$, if $\rho_{i-1} \le n_m$, then Let us notice that in our notation the probability of stopping on a maximal element is equal to Thus, our aim is maximizing $E(Z_\tau)$ over all $(\mathcal{F}_{\rho_i})_i$-stopping times $\tau$. By Theorems 2.3, 4.1 and Proposition 5.2 of Kuchta and Morayne (2014b) the process $Z$ satisfies the hypothesis of the Monotone Case Theorem.
Suppose we have seen the $t$-th element $x$ from the first search and this element is maximal so far. Thus $t = \rho_j$ for some $j$. There are still $n_m - t$ elements to come in $X^{(m)}$. The probability that the next $n_m - t$ elements are not bigger than $x$ is equal to $x^{n_m - t}$, and this is the probability of winning if we stop now. The probability of winning at the time of the next relative record in $X^{(m)}$ is equal to

$$\sum_{i=1}^{n_m - t} \binom{n_m - t}{i} x^{n_m - t - i} (1 - x)^i \, \frac{1}{i},$$

where the $i$-th summand is the probability that exactly $i$ of the remaining $n_m - t$ elements are larger than $x$ and the maximum of those $i$ elements appears first. Choosing the times when they come corresponds to the factor $\binom{n_m - t}{i}$, the probability that exactly these elements are bigger than $x$ is equal to $x^{n_m - t - i}(1 - x)^i$, and the probability that the largest element of this group comes before the other ones is equal to $\frac{1}{i}$.

If there is no relative record after $x$ until the time $n_m$, i.e., within the first search $X^{(m)}$, we use the optimal strategy for the remaining part, which consists of the $m - 1$ searches $X^{(m-1)}, \dots, X^{(1)}$. The probability that this happens and that we succeed is equal to $x^{n_m - t}\gamma_{m-1}$. Thus, by the Monotone Case Theorem, we decide to stop at the $t$-th moment if it is the first moment of a relative record at which

$$x^{n_m - t} \ge \sum_{i=1}^{n_m - t} \binom{n_m - t}{i} x^{n_m - t - i} (1 - x)^i \, \frac{1}{i} + x^{n_m - t}\gamma_{m-1}.$$

Since, for every $k \ge 1$,

$$\sum_{i=1}^{k} \binom{k}{i} x^{k - i} (1 - x)^i \, \frac{1}{i} = x^{k} \sum_{j=1}^{k} \frac{x^{-j} - 1}{j},$$

the inequality above is equivalent to

$$\sum_{j=1}^{n_m - t} \frac{x^{-j} - 1}{j} \le 1 - \gamma_{m-1}, \qquad (5)$$

where $0 < x < 1$ and $1 \le t \le n_m - 1$. The left-hand side of (5) is decreasing in $x$. Thus the smallest $x$ satisfying (5) is equal to the solution of the equation

$$\sum_{j=1}^{n_m - t} \frac{x^{-j} - 1}{j} = 1 - \gamma_{m-1},$$

that is, to $\hat d_{n_m - t}$. Hence $\tau_m$ is an optimal stopping time.
The probability that we choose from $X^{(m)}$ and we are successful is equal to $\sum_{t=1}^{n_m} p_t$, where $p_t$ is the probability that we stop at time $t$ and the choice is successful, i.e. $X_t = \max\{X_1, \dots, X_{n_m}\}$. Thus Let $1 \le t \le n_m - 1$. The probability that no element among the first $t$ is chosen and that the absolute largest is $X_{t+1}$ is equal to (see the explanation below) where the first integral is the probability that the $i$-th element is below $\hat d_{n_m - i}$ and it is the biggest among the first $t$ elements, the second integral is the probability that the $i$-th element is below $\hat d_{n_m - i}$ and it is the absolute maximum, and the factor $\frac{1}{n_m - t}$ is the probability that the best element among the remaining $n_m - t$ elements is exactly the $(t + 1)$-th one.

The probability that $X_{t+1}$ is the largest in $X^{(m)}$ but does not pass the threshold is equal to Note that if the last event (whose probability is given by (9)) happens, then so does the previous one (whose probability is given by (8)), because the thresholds $\hat d_{n_m - i}$ are decreasing in $i$. Thus by (8) and (9), for $1 \le t \le n_m - 1$, We do not stop at the first search $X^{(m)}$ if and only if, for every $1 \le t \le n_m - 1$, $X_t < \hat d_{n_m - t}$ whenever $X_t = \max\{X_1, \dots, X_t\}$. Thus the probability that using $\tau_m$ we do not stop at the first search $X^{(m)}$ is equal to The above equality, (7), (10) and (6) yield (3).

Asymptotics
In this section we examine the asymptotic behavior of the probability of success $\gamma_m$ and the multiple search decision numbers $\hat d_k$ as $n_i \to \infty$ for every $i \in \{1, \dots, m\}$. Let us define recursively the following sequence: $r_0 = 0$, and for $i \ge 1$: where $c_i$ satisfies the following equation ($i \ge 1$): or, equivalently, Let $n^*_i = \min\{n_1, \dots, n_i\}$ for $i = 1, \dots, m$ and let, for $1 \le k \le n_m - 1$,

Theorem 3.1. $\gamma_m \to r_m$ as $n^*_m \to \infty$.

Proof. We prove this theorem by induction with respect to $m$.
For $m = 1$ we have only one search and Of course, this is the asymptotic solution (1) of the classical full information best-choice problem. Let $m \ge 2$ and assume that lim Note that $\alpha_k$ is, in fact, a function of $m$ variables: $k, n_{m-1}, \dots, n_1$; $\alpha_k = \alpha_k(n_{m-1}, \dots, n_1)$.
Claim 1. $\alpha_k \to c_m$ as $(k, n^*_m) \to (\infty, \infty)$ and $k \le n_m - 1$.

Proof of Claim 1. For $1 \le k \le n_m - 1$, (2) and (13) yield Since the function $\frac{1}{x}\bigl(\bigl(1 + \frac{x}{k}\bigr)^k - 1\bigr)$ is increasing in $k$ for $x > 0$ and, for $k = 1$, which implies $L = U$. Hence the limit from the statement of the claim exists. In view of the definition of $c_m$ we also obtain $\lim_{(k, n^*_m) \to (\infty, \infty)} \alpha_k = c_m$.

Further we follow the method used by Samuels (1982) [see also Samuels (1991)]. Consider the first search $X^{(m)}$. Let $M_t = \max\{X_1, \dots, X_t\}$, $1 \le t \le n_m$, and let $M_0 = 0$. Let $\sigma_{n_m}$ be the arrival time of the largest element in $X^{(m)}$ and $\hat\sigma_{n_m}$ be the arrival time of the largest element in $X^{(m)}$ before the time $\sigma_{n_m}$, i.e.
Because $\hat d_{n_m - t}$ is decreasing and $M_t$ is increasing in $t$ for $1 \le t \le n_m$, the probability that we choose from $X^{(m)}$ and we are successful is equal to and the probability that we do not choose from $X^{(m)}$ is equal to

Claim 2. and

Proof of Claim 2. We change variables: Then $n_m - \sigma_{n_m} = n_m(1 - T_{n_m})$ and $n_m - \hat\sigma_{n_m} = n_m(1 - T_{n_m}\hat T_{n_m}) + \hat T_{n_m}$. Thus, applying (13), Let us define the following events (depending on $n_m$): where $K = n_m(1 - T_{n_m}\hat T_{n_m}) + \hat T_{n_m}$.
The conditional probability , Now we integrate this probability multiplied by the exponential density of $S$ and we obtain the conditional probability for given $T = t$ and $\hat T = \hat t$: In the next step we integrate this expression over the unit square: It is easy to see that Making the following change of variables in the first integral of (17) Interchanging the order of integration we obtain By (12) we obtain (15). The conditional probability of $\{S \ge c_m / (1 - t)\}$ for $T = t$ and $\hat T = \hat t$ is equal to Analogously, This completes the proof of the claim.
By (14), (15) and (16) and the induction hypothesis, we get This completes the proof.
The following proposition describes the asymptotic behavior of the sequences $(c_m)$ and $(r_m)$ when $m \to \infty$.

Proof. The function $g(y) = e^{-y} + (e^{y} - y - 1)\int_y^\infty e^{-x} x^{-1}\,dx$ is decreasing in $y$ (the derivative of this function is negative, i.e. $\frac{dg}{dy} = (e^{y} - 1)\bigl(\int_y^\infty e^{-x} x^{-1}\,dx - e^{-y} y^{-1}\bigr) < 0$). By (11), (12) and $r_0 = 0$, $c_1 \approx 0.804$. It is easy to see that the sequence $(r_m)$ is increasing and the sequence $(c_m)$ is decreasing in $m$, and that both sequences are bounded by 0 and 1. Thus, both sequences are convergent. Let $\beta = \lim_{m \to \infty} c_m$. By (11) and (12) the sequence $(c_m)$ satisfies the following recurrence Thus, $\beta$ is the solution of the following equation It is easy to check that the only $\beta$ satisfying this equation is $\beta = 0$. By (18), it is now easy to check that $\lim_{m \to \infty} r_m = 1$. This completes the proof.
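The value $c_1 \approx 0.804$ and the classical limit can be reproduced with a few lines of code. The sketch below assumes that $c_1$ solves $\int_0^{c} x^{-1}(e^{x} - 1)\,dx = 1$ and that the limiting success probability of a single search is $e^{-c} + (e^{c} - c - 1)\int_1^\infty x^{-1}e^{-cx}\,dx$ (the formulas attributed to Samuels). Both integrals are evaluated through their standard power series; the function names are hypothetical.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329

def power_series(c, sign=1.0, terms=60):
    """sum_{n>=1} (sign*c)^n / (n * n!)."""
    total, term = 0.0, 1.0
    for n in range(1, terms + 1):
        term *= sign * c / n        # term is now (sign*c)^n / n!
        total += term / n
    return total

def integral_exp_minus_one(c):
    """Integral of (e^x - 1)/x over [0, c]."""
    return power_series(c, 1.0)

def exponential_integral(c):
    """E_1(c), i.e. the integral of e^{-cx}/x over [1, infinity), for c > 0."""
    return -EULER_GAMMA - log(c) - power_series(c, -1.0)

def solve_c(target=1.0, tol=1e-12):
    """Solve integral_exp_minus_one(c) = target by bisection on [0, 2]."""
    lo, hi = 0.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integral_exp_minus_one(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def limiting_success(c):
    """e^{-c} + (e^c - c - 1) * E_1(c)."""
    return exp(-c) + (exp(c) - c - 1.0) * exponential_integral(c)

if __name__ == "__main__":
    c1 = solve_c()
    print(c1, limiting_success(c1))
```

The same solver could be pointed at a different right-hand side through the `target` argument, but which right-hand side is appropriate for $m > 1$ depends on equations (11) and (12), which are not restated here.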
Approximations of the first ten elements of the sequences $(r_m)$ and $(c_m)$ are given in Table 1; see also Fig. 1. For comparison, the first column gives the corresponding probability of success $(a_m)$ for the iterated no information version (the classical secretary problem) [see also Kuchta and Morayne (2014a)].

The following proposition describes the asymptotic behavior of the decision numbers $\hat d_k$ when $k \to \infty$ and $n^*_m \to \infty$.