1 Introduction

The heuristics literature in cognitive psychology is concerned with two different but related issues. First, are heuristics predictive of human behavior? Second, how does heuristic performance compare to the optimal solutions of the decision-making tasks? When testing models in the controlled environment of the laboratory, the literature typically focuses on the former question, predictiveness: how well the predictions of a model match systematic and replicable patterns of behavior observed in the raw data. In contrast, our manuscript focuses on the second question, namely, the relative performance of heuristics (simple rule-of-thumb models) vis-à-vis an optimal benchmark.

One viewpoint emphasizes the performance shortcomings of descriptive heuristics, arguing that they often lead to biases and sub-optimal behavior (e.g., Kahneman, 2003a, 2003b, 2011; Tversky & Kahneman, 1974). An alternative viewpoint contends that heuristics can be fast and frugal, exhibiting excellent performance and even outperforming normative models in environments of irreducible uncertainty arising from nature (Gigerenzer et al., 1999, 2011; Hertwig et al., 2013; Todd et al., 2012) or imperfect knowledge of opponents' strategic behavior and payoffs in games (Spiliopoulos & Hertwig, 2020). This latter strand of the literature emphasizes that heuristics perform best when their processes match relevant characteristics of the environment, thereby exploiting them efficiently in the spirit of Simon's 'scissors' metaphor, particularly in the face of uncertainty—see Hertwig et al. (2019) for examples of this across a wide range of different decision-making tasks.

Ultimately, moving away from a binary antagonistic stance regarding heuristic performance, research should be directed at demarcating the boundary between environments where heuristics perform well and environments where they may be inappropriate. Where does this boundary lie? Are heuristics restricted to performing well only in relatively simple environments or do they excel in exactly the opposite type, i.e., complex environments? In choosing a new domain of complex problems to test heuristics, one restriction is that some problems are intractable within reasonable time and computational limits (e.g., NP-hard problems such as the Traveling Salesman problem). Consequently, the optimal solutions are unknown, rendering relative performance an undefinable property.

We have opted to study the taxing, yet numerically solvable, domain of sequential search problems, sometimes dubbed optimal stopping problems (Ferguson, 2002), where the decision maker (DM) interviews the choice alternatives (items, applicants) sequentially. There is a very large body of theoretical research on the standard secretary problem, which originated about 60 years ago (Chow et al., 1964; Lindley, 1961). We narrow down this domain to a class of no-information sequential search problems (Ferguson, 2002) in which the DM has no prior knowledge about the probability distribution of the choice alternatives and, therefore, cannot acquire information from experience about its parameter values. Rather, in deciding whether to accept or reject a given item, the DM is only informed of the relative rank of the present item in comparison to all the items preceding it in the sequence. The optimal stopping rules for these problems may, in principle, be numerically calculated by dynamic programming (see, e.g., Chow et al., 1964; Gilbert & Mosteller, 1966). As Simon (1990) noted, optimal strategies provide important insights into the nature of the decision-making environment under investigation, but the mathematical methods for discovering them often use a formal language that is alien to most DMs and at times incomprehensible. Consequently, optimal strategies are inaccessible to most humans and organizations, both in the laboratory and in practice, because of their formal language, their computational costs, and the constraints imposed by users' cognitive abilities. This raises the question of whether there exist simple heuristics that are both accessible to DMs and capable of excellent performance in this domain.

Our paper presents three contributions. First, we investigate the performance of heuristics in a non-competitive decision-making domain, consisting of four problem variants (presented in Sect. 2), that is considerably richer and more complex than the domains already studied. One set consists of three heuristics proposed by Seale and Rapoport (1997, 2000) and further investigated by Stein et al. (2003); experiments have found that subjects use these three heuristics. Two of these heuristics perform extremely well in variants of sequential search problems for which the optimal decision rule consists of a single threshold. However, they perform poorly in problem variants whose optimal decision rules call for multiple thresholds (rather than a single one). Second, in response to this finding, we investigate for the first time the performance of a heuristic originally proposed only for a specific sequential search problem (the expected rank minimization problem; Krieger & Samuel-Cahn, 2009) in three other variants with multiple thresholds—we call this the Progressive Stopping (PS) heuristic (Sect. 3). We establish that this heuristic achieves more than 95% of the optimal performance across all these variants and for sequences with different numbers of items (Sect. 4). Third, we investigate a competitive secretary problem in which multiple employers compete with one another to hire the best job applicant (Sect. 5). We then construct an optimal solution (a subgame-perfect Nash equilibrium) for this game following (and correcting) the proof of Karlin and Lei (2015). We propose a new heuristic related to the PS heuristic—the Inverse Progressive Stopping heuristic (Sect. 5.2)—and show that it exhibits remarkably high performance for the whole set of employers. Readers conversant with the literature on sequential search problems and/or heuristics may skip the following Sects. 1.1 and 1.2, respectively, and continue from Sect. 2.

1.1 A brief literature review of sequential search problems

The class of no-information sequential search decision problems has been studied theoretically in applied probability and operations research (see literature reviews by Ferguson, 1989, 2002; Freeman, 1983; Samuels, 1991) and experimentally in the disciplines of psychology and behavioral economics (see, e.g., Corbin, 1980; Bearden & Rapoport, 2005; Lee, 2006; Mak et al., 2019; Palley & Kremer, 2014; Seale, 1996; Seale & Rapoport, 1997, 2000). Non-distributional models of sequential search can be divided into two streams depending on the objective of the DM conducting the search. In one stream (e.g., Lindley, 1961), commonly referred to as the probability maximization problem (PMP), or the best choice secretary problem, the DM's objective is to maximize the probability of selecting the best item. This may appear very restrictive; however, in the business world, the long-run returns to many investments are increasingly well described by an all-or-nothing distribution, e.g., due to first-mover advantage or a superior technology dominating the industry. Or, at the very least, there are strongly increasing returns to choosing higher relative ranks; consider a sigmoidal relationship, which approximates an all-or-nothing objective function.

Alternatively, the second stream of research (Chow et al., 1964), called the expected rank minimization problem (ERMP), relaxes the all-or-nothing assumption: the DM's objective is to minimize the expected absolute rank of the item selected. Under this objective, the payoffs are either the same for all ranks (equal weight) or decrease monotonically in the absolute rank of the selected item; the smaller the rank, the higher the payoff. The two streams differ in their assumptions about the DM's objective, and so do the methods for computing their optimal solutions.

At first sight, the assumption of knowledge of relative ranks—and ignorance of the distribution of items—may seem overly restrictive, particularly to economists. Traditionally, the economics literature deals with distributional models, which allow for normative solutions derived from applications of Bayesian updating, e.g., work on incomplete information in game theory. By contrast, research in Operations Research and Applied Statistics often consists of non-distributional models, where the utilities of items and their associated probability distribution are unknown—see Bearden and Rapoport (2005) for a more detailed comparison of distributional and non-distributional models of search. Consequently, neither the computation of utilities and probabilities nor complex Bayesian updating after the presentation and inspection of each item is required. There are three major problems that severely restrict the applicability of distributional models to sequential search in the field. First, full-information solutions are very sensitive to the right extreme tail of the distribution of item valuations, leading to non-robustness if the distribution is not known perfectly. In the wild, distributions are almost never presented by description to a decision-maker; at best, the distribution can be learned approximately through observation. However, the tails of distributions, consisting of rare events, are virtually impossible to learn with any reasonable degree of precision even with a large number of observations. Second, Knightian uncertainty is considerable in large worlds; therefore, attaching cardinal valuations to items even before the sequential search commences is difficult. Third, and most importantly, distributional models are restricted to a single attribute, whereas non-distributional models of sequential search (e.g., searching sequentially for a date; searching sequentially for an apartment after moving to a new location; attempting to choose the best k, k > 1, proposals which are evaluated sequentially) do not have this restriction.

We argue that the majority of important managerial decisions are based on incomplete information due to the considerable state of flux in economic conditions, the unpredictability of innovation, etc., leading to limited managerial control over organizational processes (March & Simon, 1993). A relevant example is a venture capital firm deciding which start-up to invest in while sifting through candidates whose valuations, and the likelihoods of the future states affecting those valuations, are highly uncertain. In short, we believe that non-distributional models are more frequently applicable in practice and more realistically capture the nature of sequential searches than distributional models. Non-distributional models preserve the important characteristics of search problems: they allow for the sequential search of multi-attribute items while bypassing the problem of integrating the attributes into a single value, and they still admit a rich set of variants in terms of alternative objective functions, probabilistic knowledge of the number of items, and other assumptions—this will become apparent in Sect. 2, where we describe the different variants that we investigate.

1.2 A brief literature review of heuristics

One viewpoint of the descriptive heuristics literature argues that heuristics often lead to biases and sub-optimal behavior (e.g., Kahneman, 2003a, 2003b, 2011; Tversky & Kahneman, 1974) as they are subject to an effort-performance tradeoff (Payne et al., 1988). While the reduction in effort may outweigh the loss in performance for some applications, the contention is that heuristics necessarily suffer a considerable degradation in performance. This viewpoint has been challenged by the fast and frugal heuristics literature, which argues that heuristics can achieve excellent performance and even outperform normative models in environments of irreducible uncertainty (Gigerenzer et al., 1999). Consequently, an effort-performance tradeoff is not a given; quite the opposite, less can be more. Earlier work investigated fast and frugal heuristics in considerably simpler decision-making tasks with a focus on tasks of inference (Gigerenzer et al., 1999, 2011; Todd et al., 2012), individual decisions under risk and uncertainty (Hertwig et al., 2019; Payne et al., 1988; Thorngate, 1980) and social environments (Hertwig et al., 2013). In parallel research, economists have turned their attention to the axiomatization of heuristics (and, more generally, principles of bounded rationality) from the psychology literature (Manzini & Mariotti, 2007, 2012a, 2012b, 2014; see also Mandler et al., 2012).

More recent work has extended the domain of inquiry from individual to strategic decision making. Theoretical work by Spiliopoulos and Hertwig (2020) shows that bounded-rational heuristics are more robust than sophisticated decision rules—including the normative Nash equilibrium—to both strategic and payoff uncertainty in one-shot strategic interactions. A burgeoning management literature concludes that managers often employ heuristics (Bingham & Eisenhardt, 2011) and that these can be effective decision-making tools in the face of uncertainty (Artinger et al., 2015). For example, Åstebro and Elhedhli (2006) show that simple heuristics can be more effective than complex regression models in predicting the success of risky ventures. That is, even in complex real-world managerial domains, heuristics are not necessarily a second-best solution, particularly in large-world environments where uncertainty and noise reign supreme.

2 The set of optimal stopping problems

It is generally agreed that the first statement of the PMP appeared in a 1960 column of Scientific American (Gardner, 1960). Lindley (1961) seems to be the first to have solved the PMP in a scientific publication; his work has been extended by many others (e.g., Freeman, 1983). As stated above, the problem is to establish a stopping rule that determines whether to choose item i based only on its relative rank Ri, i.e., its rank among items 1 through i (i = 1, 2, …, n). The PMP is stated in terms of the assumptions that underlie the sequential search for the best applicant:

Assumption 1 (number of applicants). The number of applicants for employment, n, is finite and known.

Assumption 2 (no. of positions). A single position is available.

Assumption 3 (random order of arrival). The applicants are interviewed one at a time in a random order, where the n! orderings are equally likely.

Assumption 4 (no ties). The decision maker (DM) can rank order all the n applicants in terms of their absolute rank, Ai, from best (Ai = 1) to worst (Ai = n) with no ties.

Assumption 5 (decision rule). On each stage i of the search, the DM is only informed of the relative rank, Ri, of applicant i. Based on this information, the DM either accepts the applicant for the position, thereby ending the search, or rejects the applicant and interviews the next one. If no selection is made prior to the nth applicant, then the last applicant must be selected.

Assumption 6 (no recall). Once rejected, an applicant may not be recalled.

Assumption 7 (no refusal). An offer of selection is accepted with certainty by the applicant.

Assumption 8 (payoff function). The DM’s objective is to maximize his/her expected payoff: 1, if the best applicant (Ai = 1) is selected, and 0, otherwise (Ai ≠ 1).

A note on notation and terminology. The absolute and relative ranks of applicant i depend on n. Because the value of n in sequential search experiments is fixed, and the results do not necessarily generalize to other values of n, the superscript n is suppressed. In the rest of this section, we use the terms candidate for an item with relative rank Ri = 1 and applicant for all relative ranks.
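
To make the informational structure of Assumptions 3–5 concrete, consider the following minimal Python sketch (the function and variable names are ours, purely for illustration). It draws a random arrival order and computes the sequence of relative ranks, which is all the DM ever observes:

```python
import random

def relative_ranks(abs_ranks):
    """R_i = 1 + the number of earlier applicants with a better (smaller) absolute rank."""
    return [1 + sum(a < abs_ranks[i] for a in abs_ranks[:i])
            for i in range(len(abs_ranks))]

n = 20
abs_ranks = random.sample(range(1, n + 1), n)  # random order of arrival (Assumption 3)
rel_ranks = relative_ranks(abs_ranks)          # the only information shown to the DM
# applicant i (1-based) is a "candidate" whenever rel_ranks[i - 1] == 1
```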

Most of the assumptions stated above have been relaxed in one way or another thereby giving rise to multiple variants of the best choice problem—see early review papers by Gilbert and Mosteller (1966) and Freeman (1983), and subsequently Chun (1998, 2000), Bearden et al. (2005) and Bearden et al. (2006). For example, Presman and Sonin (1972) and Bruss and Samuels (1987) replace Assumption 1 by:

Assumption 1’. The DM is only informed of the probability distribution of the number of items, n.

We consider four typical variants of this standard specification. In the first variant, called PMP-UN (Bruss & Samuels, 1987; Presman & Sonin, 1972), the DM only knows the probability distribution of n. The second variant (Gilbert & Mosteller, 1966; Woryna, 2017; Yeo & Yeo, 1994), called PMP-k, assumes that the DM's objective is to maximize his/her expected payoff: 1, if any of the k best applicants is selected (Ai = 1, …, Ai = k), and 0, otherwise. We examine the two most-researched cases, namely, k = 2 and k = 5. The standard PMP is essentially the special case where k = 1. In the third variant, called ERMP (Chow et al., 1964), the DM's objective is to choose an item that minimizes the expected value of the absolute rank of the item selected (Assumption 8'). In the fourth variant (Gilbert & Mosteller, 1966, Problem 2b), which we term problem PMP-DC, at each stage i of the search, the DM is informed of the relative rank Ri of applicant i. She is allowed two choices (rather than a single one), governed by thresholds r1 and r2 (r1 < r2); if either choice selects the overall best applicant, the search stops with a win (success). The first choice is used on the first candidate appearing at or after item r1. If it fails, the second choice is used on the first candidate appearing at or after item r2. A detailed comparison of the assumptions of these non-competitive PMP problems can be found in Table 1. Appendix 1 documents the optimal solutions for all these variants.

Table 1 The set of individual optimal stopping problems and their assumptions

We investigate one final variant, the strategic PMP-COMP (for competitive), where multiple employers are competing to hire the best out of n applicants—this will be presented in detail later in the text. To the best of our knowledge, this is the first time that heuristic decision rules are examined in the context of competitive secretary problems. This is an important extension, as arguably, many real-world optimal stopping problems exhibit such game theoretic characteristics arising from the competition between employers pursuing a limited number of applicants.

3 The progressive stopping heuristic

The computation of the optimal decision rule for problems with multiple thresholds, such as the ERMP, becomes tedious, costly, or highly time consuming (particularly if n is very large). This led Krieger and Samuel-Cahn (2009; henceforth KS-C) to consider simpler rules: “We consider finding simple stopping rules that perform well in minimizing the sum of the expected value of the absolute ranks of the items selected, when one or more items are desired” (2009, p. 1042). For the case where a single item is to be selected, KS-C proposed the following heuristic rule for the ERMP, which we henceforth refer to as the progressive stopping (PS) heuristic:

$$t_{n} \left( c \right) = \inf \left\{ {i:R_{i} \le c\delta } \right\}, \delta = \frac{i}{n + 1 - i},$$
(1)

where tn(c) is the threshold stopping rule and c ≥ 1 is a constant. This rule stops and chooses the first applicant whose relative rank Ri satisfies the above constraint, i.e., meets the threshold tn(c), and it guarantees that some applicant is always chosen, as Pr(tn(c) ≤ n) = 1. This rule performed very well in the ERMP that KS-C focused on (see Table 13), especially for very large n, for which the optimal solution is even more computationally intensive: it chooses an expected rank smaller than 4 and achieves 98.5% of the optimal performance for n > 100,000. KS-C write: “Hence, when the number of items becomes large, the case where it is hard to implement dynamic programming, is when the simple rule performs almost as well as the optimal rule” (2009, p. 1053). The performance of this simple rule is striking. We conjecture, and subsequently test, the hypothesis that the PS heuristic performs well in other variants of the secretary problem whose optimal solutions call for multiple decision thresholds.
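
Equation (1) translates directly into an online stopping rule that is easy to simulate. The sketch below (ours; the function names and the trial count are illustrative) applies the PS heuristic to a random sequence and estimates the ERMP objective, the expected absolute rank of the chosen item:

```python
import random

def ps_stop(rel_ranks, c, n):
    """PS heuristic (Eq. 1): stop at the first stage i (1-based) whose relative
    rank satisfies R_i <= c * i / (n + 1 - i). For c >= 1 the rule always
    stops by i = n, since the threshold there is c * n >= R_n."""
    for i, r in enumerate(rel_ranks, start=1):
        if r <= c * i / (n + 1 - i):
            return i
    return n  # unreachable for c >= 1; kept as a safeguard

def simulated_expected_rank(n=100, c=2.0, trials=100_000):
    """Monte Carlo estimate of the ERMP objective under the PS rule."""
    total = 0
    for _ in range(trials):
        seq = random.sample(range(1, n + 1), n)  # absolute ranks in arrival order
        rel = [1 + sum(a < seq[i] for a in seq[:i]) for i in range(n)]
        total += seq[ps_stop(rel, c, n) - 1]     # absolute rank of the chosen item
    return total / trials
```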

How likely is it that this heuristic—which was originally proposed as a computational device, not as a heuristic descriptive of human behavior—may be discovered and adopted by decision makers? While the exact quantitative optimal solution is difficult to deduce without formal mathematical training, we posit that qualitative aspects of the optimal solution may be accessible to inexperienced DMs; they may first attempt to deduce basic qualitative features of an optimal solution to a problem and then construct a heuristic out of basic building blocks that satisfies these qualitative requirements. Alternatively, they may learn inductively what works as long as the heuristic is relatively simple. Consider the experimental economics literature on eureka or epiphany moments, where after repeated exposure participants suddenly gain an insight into complicated games, with the effect observable both in choices and response times (Dufwenberg et al., 2010; McKinney & Huyck, 2013; Schotter & Trevino, 2020).

What are the building blocks and insights related to optimal stopping problems? In searching for insights, it is often easier to first consider the border cases of a problem; this can be construed as a type of decrease-and-conquer approach (Levitin, 2012). For these problems, the border cases are stopping the search immediately and choosing the first candidate, or continuing the search until the last applicant. It is immediately apparent that in the first case the probability of success is quite low, as no knowledge has been accumulated through search with which to compare and improve on the first candidate; it is, in essence, equivalent to a random uninformed choice. In the latter case, since not choosing an applicant leads to a loss, the probability of choosing the last applicant, conditional on the search continuing that long, should be 1. These two observations imply a strict acceptance criterion at the beginning of the search and no restriction at its end. Alternatively, the DM may realize that there are two opposing forces at work, which imply the same qualitative conclusions. The longer the search, the more likely it is to come across the desired candidates; but continuing the search risks losing the most desirable candidates by passing up the opportunity to hire.

What is a simple way to construct a heuristic satisfying this insight, and what building blocks are required? Tracking the ratio of the number of applicants already interviewed to the number remaining to be interviewed satisfies these requirements in a frugal and intuitive manner. Specifically, the PS heuristic considers the ratio of two integers, combined in the search depth \(\delta\): the number of the present stage (how far the search has progressed) divided by the number of stages (plus 1) that remain in the search: \(\frac{i}{n + 1 - i}\). This ratio is multiplied by the scaling parameter c, which is the only parameter of the heuristic. Because Ri is an integer assuming the values 1, 2, …, n, the heuristic rule consists of a sequence of n thresholds and has the same form as the optimal rule for multiple-threshold problems. As the search progresses (i increases), the numerator in the ratio \(\frac{i}{n + 1 - i}\) increases and the denominator decreases, leading to an increase in the search depth variable \(\delta\) and a gradual relaxation of the criterion for stopping the search. Figure 1 plots the stopping threshold on Ri as i increases for various values of c. Higher values of c lead to a quicker relaxation of the stopping criterion. The curves are convex, i.e., they increase at an increasing rate with i, as delaying choice quickly increases the probability of passing up the best prospects, and not choosing an item before all n items are observed necessarily leads to failure. This conforms with the qualitative aspects of the solutions at the border cases described above. Note that larger values of n reveal the same general patterns but on a different scale. The PS heuristic may thus also be viewed as a satisficing/aspiration rule in the spirit of Simon (1957, p. 263):

“(a) When performance falls short of the level of aspiration, search behavior (particularly search for new alternatives of action) is induced. (b) At the same time, the level of aspiration begins to adjust itself downward until goals reach levels that are practically attainable.”

Fig. 1 Threshold values \(t_{n} \left( c \right)\) of the PS heuristic for n = 20 and select values of c
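
Because Ri is an integer, the stage-i cutoff implied by Eq. (1) is simply \(\lfloor c\delta \rfloor\). The following short sketch (ours) reproduces threshold sequences of the kind plotted in Fig. 1:

```python
import math

def ps_thresholds(n, c):
    """Largest relative rank accepted at stage i under Eq. (1):
    floor(c * i / (n + 1 - i)), for i = 1, ..., n."""
    return [math.floor(c * i / (n + 1 - i)) for i in range(1, n + 1)]

for c in (1.0, 2.0, 4.0):               # illustrative values of c, as in Fig. 1
    print(c, ps_thresholds(20, c))      # zero early in the search, rising convexly
```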

The PS heuristic has all the properties of a fast and frugal heuristic as it ignores a large proportion of information that could be gleaned from extensive search, and has obvious search, stopping, and decision components, as described above.

4 Results for non-competitive optimal stopping problems

The heuristic performance is measured by an approximation score, called the a-score, which is defined below in terms of the probability of achieving the objective, called the probability of success:

$$a{\text{-score}} = \frac{{{\text{Pr}}\left( {{\text{success}}\;{\text{achieved}}\;{\text{by}}\;{\text{the}}\;{\text{heuristic}}\;{\text{rule}}} \right)}}{{{\text{Pr}}\left( {{\text{success}}\;{\text{achieved}}\;{\text{by}}\;{\text{the}}\;{\text{optimal}}\;{\text{solution}}} \right)}}.$$

Before proceeding with an in-depth analysis of the PS heuristic in multiple-threshold sequential search problems, we first summarize our findings on existing heuristics for the PMP, PMP-UN, and PMP-2 problems. The cutoff heuristic, whereby the DM rejects the first r − 1 applicants and then selects the first candidate thereafter, achieves perfect performance (a-score = 1) at the optimal value r* in the standard PMP and PMP-UN. However, it (and two other heuristics) suffers a significant degradation in performance in the PMP-2, which calls for two thresholds rather than one. This failure leads us to consider the PS heuristic as an alternative for the class of multiple-threshold search problems. A more detailed discussion of these heuristics and their performance in these problems can be found in Appendix E of the Supplemental Online Material.
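
For the standard PMP, the win probability of the cutoff heuristic has a well-known closed form, which makes the a-score exact rather than simulated. A minimal sketch (ours):

```python
def cutoff_win_prob(r, n):
    """Exact Pr(best applicant chosen) when the first r - 1 applicants are
    rejected and the first candidate thereafter is selected."""
    if r == 1:
        return 1.0 / n
    return (r - 1) / n * sum(1.0 / (i - 1) for i in range(r, n + 1))

n = 50
probs = {r: cutoff_win_prob(r, n) for r in range(1, n + 1)}
r_star = max(probs, key=probs.get)        # optimal cutoff, close to n / e
print(probs[r_star])                      # optimal win probability (-> 1/e as n grows)
print(probs[n // 2] / probs[r_star])      # a-score of a mis-tuned cutoff at r = n / 2
```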

4.1 The progressive stopping heuristic in multiple-threshold search problems

4.1.1 PMP-DC, PMP-5, PMP-2 and ERMP

In assessing the PS heuristic’s performance, we searched for the best value of the single parameter (c), i.e., the one that maximizes the probability of success. The a-score for problem PMP-5 was computed by simulation, but the a-scores for problems PMP-2 and PMP-DC were computed directly from Eqs. (4) and (6) in Appendix 1, respectively, as these two equations apply to any values of the thresholds r1 and r2 and not only for the optimal values r1* and r2*.
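
To illustrate the simulation approach used for PMP-5, the sketch below (ours; the grid and trial counts are arbitrary) estimates the PS heuristic's success probability for a given c and scans a one-dimensional grid for the best value:

```python
import random

def pmp_k_success(c, n, k, trials=50_000):
    """Monte Carlo Pr(one of the k best applicants is chosen) under the PS rule (Eq. 1)."""
    wins = 0
    for _ in range(trials):
        seq = random.sample(range(1, n + 1), n)   # absolute ranks in arrival order
        chosen = seq[-1]                          # the last applicant is taken if no earlier stop
        for i in range(1, n + 1):
            r = 1 + sum(a < seq[i - 1] for a in seq[:i - 1])
            if r <= c * i / (n + 1 - i):
                chosen = seq[i - 1]
                break
        wins += chosen <= k
    return wins / trials

grid = [1.0 + 0.1 * j for j in range(21)]         # candidate c values in [1.0, 3.0]
c_star = max(grid, key=lambda c: pmp_k_success(c, n=50, k=5))
```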

The results for problems PMP-DC, PMP-2, PMP-5 and ERMP for n = 20, 50, and 100 are summarized in Table 2—more details, such as the optimal c values, threshold values, and the probability of winning, can be found in Tables 9, 10, 11, 12 and 13 in Appendix 2. A key finding of the present paper is that the PS heuristic performs extremely well across these variants. In PMP-DC, the PS heuristic emulates the optimal solution across all n (achieving an a-score = 1) for the appropriate values of c. In PMP-2 and PMP-5, while not achieving perfection, the PS heuristic achieves exceptionally high a-scores ranging from 0.978 to 0.999. In the ERMP, the heuristic achieves a-scores ranging from the near-perfect 0.995 for n = 20 to 0.963 for n = 100 and, as KS-C showed, still performs well for higher values of n: 0.963 for n = 100, 0.981 for n = 1000, 0.984 for n = 10,000 and 0.985 for n > 100,000. In some of these cases, the PS heuristic improves in performance as n increases; that is, as the complexity of the optimal solution increases, this heuristic becomes even more effective—problem complexity is tamed by the simple decision process embodied in the PS heuristic.

Table 2 The approximation scores of the PS heuristic in multiple threshold problems

The PS heuristic outperforms other approximate solutions that have been suggested for these problems. Dietz et al. (2011) propose two policies approximating the optimal solution to PMP-k, one of which consists of a single threshold and the other of two thresholds. The former has two free parameters (the relative ranking cutoff to be applied after a position threshold), whereas the latter has four parameters (two rankings and two positions). In this sense, they are considerably more complex than the PS heuristic, which has a single free parameter. Furthermore, deriving the optimal parameters requires maximization over complex functional forms involving obscure—to the uninitiated—combinatorics (Dietz et al., 2011, Eq. 2, p. 160, Eq. 6, p. 164). From their Table 2 (p. 167), we calculate the approximation score for n = 100 and k = 5 of the single-level policy as 0.894 and of the double-level policy as 0.967. Recall from Table 2 that the relevant a-score of the PS heuristic is 0.985; it is superior to the single- and double-level policies both in performance and parsimony.

We are not aware of such close approximations of the optimal solutions by the same simple, single-parameter heuristic to such a wide range of complex decision-making problems requiring formal mathematical techniques to solve, which for most people are intractable and incomprehensible. In the next section, we propose a variant of the PS heuristic and show that the impressive performance of this class of heuristics is not limited to individual or non-competitive secretary problems as it also applies to competitive problems.

How robust is the PS heuristic to different values of c and n? Figure 2 presents the approximation scores associated with different values of these two variables for each of the four problems. A striking result is that, for all problems, there is very little variation in the approximation score with respect to the number of items n conditional on the value of c. For PMP-DC, the a-score is quite insensitive, or robust, to values of c, as the curve is quite flat over a large range of values. This is not the case for the other three problems, where the curves reveal greater sensitivity to c. Let us examine the range of c values for which the a-score is greater than 0.95. For the sake of exposition, we report here the ranges for n = 50, as there is little variation across n. The ranges for problems PMP-DC, ERMP, PMP-2 and PMP-5 are [2.0–6.2], [1.7–2.7], [1.1–2.4] and [1.2–2.6], respectively. Note that c values between 2.0 and 2.4 guarantee a-scores greater than 0.95 for all four problems. Consequently, the PS heuristic generalizes admirably: a decision-maker who has learned appropriate c values from experience in one of these problems can transfer this knowledge to a different type of problem and immediately achieve excellent performance. Furthermore, due to the discreteness of the possible (integer) threshold stopping values, there exist ranges of c values that produce the same optimal thresholds and perfect a-scores in problem PMP-DC. In conclusion, the PS heuristic exhibits considerable robustness across different problems and their features—a valuable trait for a fast and frugal heuristic.

Fig. 2 Robustness of the PS heuristic to values of c and n

5 The competitive secretary problem: PMP-COMP

Immorlica et al. (2006) and subsequently Karlin and Lei (2015) consider a new variant of the PMP in which the sequential search for the best applicant is conducted in a competitive setting by multiple DMs. This opens a new horizon for future applications of the secretary problem as a general model of sequential search. As a motivating example, consider the case of k (k ≥ 2) academic departments sending their faculty representatives, one per department, to a national academic conference with instructions to interview and subsequently hire a single job applicant for a junior position in their department. Exactly n job applicants attending the conference are ranked from 1 (best) to n (worst) with no ties (the same ranking for all the k employers) and arrive at the interview, one at a time, in a random order. The interviewers (hereafter called employers) are also ranked from 1 (best) to k (worst) with no ties in terms of the “quality” of their job offers (e.g., starting salary, teaching load, academic prestige, location of the school, or some combination of the above). Employers are instructed by their institutions to hire the best applicant (Karlin & Lei, 2015).

The assumptions underlying the sequential search for the best applicant in the competitive variant of the PMP problem (hereafter called PMP-COMP) are stated below.

Assumption 1 (no. of applicants). The number of applicants, n, is finite and commonly known.

Assumption 2 (no. of employers). The number of employers, k, is finite and commonly known.

Assumption 3 (priorities). (a) The n applicants are ordered in terms of their absolute ranks Ai from best (Ai = 1) to worst (Ai = n) with no ties. (b) The employers are ranked by the applicants from best (1) to worst (k) with no ties. The employers’ ordering is publicly known whereas the applicants’ ordering is not revealed to the employers.

Assumption 4 (interview). The n applicants are interviewed independently by the k employers, one at a time, in a random order (all n! orderings are equally likely).

Assumption 5 (applicant’s decision rule). If applicant i receives multiple job offers, then she accepts the offer from the highest ranked employer among those making the offer. Refusing an offer is not an option, and the terms of the job offer are not negotiable.

Assumption 6 (employer’s decision rule). At each stage i of the search, at the end of the interview, all k employers are informed of the relative rank, Ri, of applicant i (i = 1, 2, … n). Based on this information, any employer may either accept applicant i (i.e., make her a job offer) or reject her. The k (binary) decisions are made independently and are irrevocable. Once an employer hires an applicant, he may submit no further offers. Note that an employer may make more than one offer, as prior offers may be rejected by applicants accepting offers from higher-ranked employers.

Assumption 7 (no recall). Once rejected by all the k employers, the applicant may not be recalled.

Assumption 8 (objective). The objective of each employer is to maximize the probability of hiring the best applicant (payoff = 1 if Ai = 1 and 0, otherwise).

5.1 The optimal solution

Using backwards induction to compute employer j’s best response, Karlin and Lei (K&L) constructed the subgame-perfect Nash equilibrium solution to the PMP-COMP problem. The solution also yields the probability of success for each of the k employers. The equilibrium solution has the form of a multi-threshold strategy (t1, t2, … tk), where for each value of n and each value of j between 1 and k, there is a unique integer tj, called the optimal threshold value, such that employer j accepts applicant i if (1) applicant i is a candidate (Ri = 1) and (2) i > tj. Thus, the first t1 applicants are rejected by all the employers; in the interval (t1 + 1, t2) an applicant is accepted by employer j = 1 if she is a candidate; in the interval (t2 + 1, t3) employer j = 1 (if he is still active in the game) and employer j = 2 simultaneously submit two job offers to applicant i if she is a candidate; and so on. To remain consistent with the K&L notation, the thresholds are defined by the number of applicants to reject, whereas in the previous sections the thresholds, r, were defined as the applicant with which an employer should start submitting offers, i.e., rejecting the first r − 1 applicants. In Appendix 3 we provide counterexamples for which the algorithm in K&L fails to solve for the true optimal thresholds, as those are reported in Table 1 of K&L.
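
To make this equilibrium structure concrete, the following sketch (ours; names are illustrative) simulates one play of PMP-COMP when every employer follows a single-threshold rule of this form, with thresholds expressed, as in K&L, as the number of applicants to reject:

```python
import random

def play_pmp_comp(thresholds, n):
    """One play of PMP-COMP. thresholds[j] is the number of applicants that
    employer j (0-based; 0 = highest ranked) rejects outright. Returns the
    employer who hires the best applicant, or None if no one does."""
    seq = random.sample(range(1, n + 1), n)       # absolute ranks in arrival order
    active = list(range(len(thresholds)))         # employers who have not yet hired
    for i in range(1, n + 1):
        r = 1 + sum(a < seq[i - 1] for a in seq[:i - 1])
        if r != 1:
            continue                              # offers go only to candidates (R_i = 1)
        offers = [j for j in active if i > thresholds[j]]
        if offers:
            winner = min(offers)                  # Assumption 5: the highest-ranked employer prevails
            active.remove(winner)
            if seq[i - 1] == 1:                   # did this hire land the best applicant?
                return winner
    return None
```

Tallying the returned winner over many plays yields each employer's win probability under any threshold profile, which is how the heuristic profiles below can be evaluated.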

The structure of the equilibrium proposed by K&L is correct; only the specific values of the thresholds (t1, t2, … tk) are problematic. We constructed an algorithm that resolves this inconsistency and returns the true optimal thresholds for all n—the reasoning for our proof is identical to that given in K&L with a single modification to their Eq. 3 (2015, p. 946). K&L define the optimal risk, \(R_{k} \left( i \right)\), as that belonging to a set of rules ignoring the first i applicants. Consequently, their Eq. 3 should read \(R_{k} \left( {i - 1} \right)\) instead of \(R_{k} \left( i \right)\). In Appendix 3, we present the pseudo-code for the algorithm that calculates the subgame-perfect Nash equilibrium. Table 3 reports optimal threshold values using our algorithm for the parameter values \(n = \left\{ {10, 20, 50, 100, 1,000, 10,000} \right\}\) with k = 10. We obtain thresholds identical to those of K&L for the case \(n \to \infty\), which we approximated with n = 100,000—see Table 4 for the results as \(n \to \infty\) for a large number of employers, k. For j = 1, 2, 3, and 4—the top four employers—K&L report the optimal threshold values \(e^{-1} \approx 0.368\) (the same as the asymptotic threshold value in the basic PMP problem), \(e^{-3/2} \approx 0.223\), \(e^{-47/24} \approx 0.141\), and \(e^{-2761/1152} \approx 0.091\), respectively. In the limit, the thresholds (as proportions of n) are equal to the limiting probabilities of success (Matsui & Ano, 2016).

Table 3 Optimal threshold values for k employers competing with one another for hiring the best applicant: n = 10, 50, 100, 1,000 and 10,000, and k = 10
Table 4 Optimal threshold values (as a proportion of n) for employers competing with one another for hiring the best applicant as n → ∞

The optimal solution is computed recursively, starting from the highest-ranked employer, who behaves as if there were no competition whatsoever; that is, his solution is identical to that of the standard best choice problem. Once this is fixed, the employer in the second rank best responds to the behavior of the top-ranked employer but disregards the behavior of all lower-ranked employers. Therefore, the solution for each employer j is independent of all lower-ranked employers, so that the results for \(k < 10\) are simply the threshold values in Table 3 truncated at k. Similarly, if the optimal threshold is zero for the jth employer, then it must also be zero for all employers ranked below j.

Before moving on, we present an intriguing finding. Consider the extension of the PMP problem in which the DM can make r choices (r < n), which can be found in Sect. 2c of the paper by Gilbert and Mosteller (1966). Recall that earlier we examined the PMP-DC problem, for which r = 2. The numerical solution for the generalized r-choice problem appears in the fifth column of Table 4 of Gilbert and Mosteller (the column titled “P(win)” for the solution with r Starting Numbers). Note that when n → ∞, the solution to the PMP-COMP is identical to the r-choice problem solution. This finding can be obtained from our Table 4 by simply adding the probabilities of winning for all employers up to rank j—that is, calculating the probability of any employer hiring the best applicant. Consequently, the subgame-perfect equilibrium solution for the competitive secretary problem PMP-COMP constructed by K&L, where the strategy of each of the competitors has a single threshold, is identical to the strategy of the single player in the r-choice problem, which calls for using multiple threshold values in the sequential search for selecting the best applicant.

5.2 The heuristic solution to problem PMP-COMP

How can DMs intuit important insights into this competitive problem? Recall that the optimal solution is structured in terms of multiple threshold values, similar to the problems examined above where we applied the PS heuristic, with one important difference: whereas the PS heuristic produces thresholds that relax over the course of the search, the PMP-COMP optimal solution requires thresholds that are decreasing in the (numerical) rank of the employer j. For lower-ranked employers to have a chance of hiring, they must—on average—make offers earlier than higher-ranked employers to avoid directly competing with them (recall that multiple offers are always resolved in favor of the highest-ranked employer). The insight that thresholds decrease in the (numerical) rank of the employer provides the necessary qualitative characteristics of a solution.

We propose the Inverse Progressive Stopping (IPS) heuristic (Eq. 2), inspired by the original PS heuristic; like the PS heuristic, it depends only on information easily accessible to the employer, i.e., the number of applicants already interviewed and the number remaining to be searched:

$$t_{j} \left( c \right) = \left\{ {\begin{array}{*{20}c} {\sup \left\{ {i: R_{j} \le \frac{1}{c\delta }} \right\}} & \quad { {\text{if}} \;\; \exists \;\; i \;\; {\text{s.t.}} \;\; R_{j} \le \frac{1}{c\delta }} \\ 0 & \quad {{\text{otherwise}}} \\ \end{array} } \right.,\quad \delta = \frac{i}{n + 1 - i} \quad {\text{for}} \;\;c > 0.$$
(2)

In contrast to the original PS heuristic, the right-hand side fraction is now inverted so that it is decreasing in the search depth \(\delta\) (and, by extension, i) instead of increasing, and we seek the supremum of i for which the inequality holds rather than the infimum. Also, note that the left-hand side of the inequality is now the relative rank of employer j. Finally, as the indexing of applicants i has a lower bound of 1, we added a second case that allows for thresholds of zero, i.e., ignoring no applicants and choosing the first candidate that appears, as is often the case in the optimal solution. If no i satisfies the first case, the threshold is set to zero.
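
Because both Rj and i are integers, Eq. (2) admits a closed form: rearranging \(j \le \frac{1}{c\delta} = \frac{n + 1 - i}{ci}\) gives \(i \le \frac{n + 1}{cj + 1}\), so the supremum is the integer part of the right-hand side (and zero when it falls below 1, matching the second case of Eq. (2)). A sketch (ours) of this computation:

```python
import math

def ips_thresholds(n, k, c):
    """IPS heuristic (Eq. 2): t_j = sup{ i : j <= (n + 1 - i) / (c * i) }.
    Rearranging gives t_j = floor((n + 1) / (c * j + 1)); a value of 0
    corresponds to the second case of Eq. (2) (ignore no applicants)."""
    return [math.floor((n + 1) / (c * j + 1)) for j in range(1, k + 1)]

print(ips_thresholds(100, 10, 2.0))  # thresholds decrease with the employer's rank j
```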

In deriving the optimal value for the scaling parameter c, we chose a criterion more relevant to the multiple players in the competitive secretary problem than to the individual DM in the non-competitive problems. Let \(p_{j}^{*}\) be the optimal probability of winning for employer \(j\), and \(p_{j}^{\prime}\) be the probability of winning according to the heuristic. The latter assumes that all employers are using the heuristic with the same value of c. The optimal value of c minimizes the mean absolute deviation (D) between the optimal and heuristic win probabilities: \(D = \frac{1}{k}\sum\nolimits_{j = 1}^{k} {|p_{j}^{*} - p_{j}^{\prime} |}\). The optimization was performed via a two-stage grid search to ensure a global minimum, with the first stage in increments of 0.1 for \(0.1 \le c \le 10\) and the second in finer increments of 0.05 for \(1 \le c \le 5\).
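
A sketch (ours) of the deviation criterion and the two-stage grid search follows; the score function below is a runnable placeholder for the simulated D(c), which in practice would be estimated by having all employers play the IPS heuristic with the candidate c:

```python
def deviation(p_opt, p_heur):
    """Mean absolute deviation D = (1/k) * sum_j |p*_j - p'_j|."""
    return sum(abs(o - h) for o, h in zip(p_opt, p_heur)) / len(p_opt)

# Placeholder score so the sketch runs as written; substitute a simulation of
# D(c) (e.g., built on play_pmp_comp above) for real use.
score = lambda c: (c - 2.1) ** 2
coarse = [0.1 * j for j in range(1, 101)]      # first stage: 0.1 <= c <= 10, step 0.1
fine = [1.0 + 0.05 * j for j in range(81)]     # second stage: 1 <= c <= 5, step 0.05
c_star = min(fine + [min(coarse, key=score)], key=score)
```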

As can be seen in Table 5 (the associated optimal c values and corresponding thresholds can be found in Table D.1 of the online supplemental material), the performance of the IPS heuristic is quite impressive, as D varies between 0.0035 and 0.006 for different n values. That is, measuring this as the percentage of games an employer is expected to win, the difference between the optimal and heuristic solutions amounts to less than a 0.6-percentage-point change on average in the win rate. Furthermore, the lowest deviation score D (0.0035) is observed for n = 1000, which would be the most computationally expensive case for determining the optimal solution. For n = 10, both the optimal and heuristic solutions exhibit a total win probability of 1 (the best applicant will surely be hired by one of the employers). For n other than 10, using the heuristic instead of the optimal solution leads to a small decrease in the total win probability, corresponding to a 4–5-percentage-point increase in the chances that the best applicant will not be hired at all. However, note that the deviation criterion D used to choose the optimal c does not specifically maximize the total probability that any employer wins. Directly optimizing for the latter leads to a significant reduction in the gap between the heuristic and the optimal solution—see Table D.2 in Appendix D of the online supplemental material.

Table 5 Optimal and IPS heuristic threshold values for problem PMP-COMP

Another significant observation is that the optimal c values across different values of n all lie in a narrow range from 1.65 to 2.65 (see also Table 12 in Appendix 2 for problem ERMP)—recall that this overlaps significantly with the range of c values [2.0–2.4] that guaranteed high performance in the non-competitive problems in Sect. 4. Furthermore, as is evident from Fig. 3, which shows how the deviation score D varies by c and n, the deviation score is quite insensitive around the optimal value of the scaling factor, c ≈ 2. We conclude that the heuristic exhibits excellent performance with similar c values across varying orders of magnitude of n and that it is quite robust to deviations of c from the optimal value. That is, the same heuristic with an identical c value could be used by any employer from 1 to k for virtually any range of n with very small performance degradation in terms of the probability of winning. Even without prior experience, it is likely that DMs will still attain high performance, which can be fine-tuned with experience. Note that c is not some opaque parameter embedded in a complex functional form, making it difficult to interpret and adjust; rather, it is a multiplicative scaling factor that adapts the rate of change in the threshold values as the search progresses.

Fig. 3 Deviation score, D, for various values of c and n in the PMP-COMP

6 Discussion

More than 60 years ago, Simon (1957, 1959, 1982) suggested that while most people strive to make rational choices, their decisions are often subject to cognitive limitations. People are constrained, he posited, by the amount of information they have at their disposal, the amount of time they need for deliberation before making their decisions, and their previous experience. Inspired in part by Simon, a large body of research on rules of thumb, or heuristics, has been growing exponentially in psychology and behavioral economics (e.g., Gigerenzer et al., 1999, 2011). This literature concludes that cognitive constraints need not be detrimental to decision-making performance when heuristics are attuned to environmental characteristics, i.e., heuristics may be ecologically rational. In general, behavioral models are often judged solely on their predictiveness, that is, how well they match behavioral data from experiments. The implicit assumption is that heuristics should be expected to be inferior in performance to the normative solutions, and that heuristics are therefore valuable only as descriptive models. Consequently, less attention has been directed to a model's relative performance: how close the performance of the heuristic comes to the optimal solution. If, however, the simplicity of a rule of thumb and its accuracy are not mutually exclusive, then the relative performance of the rule is a critical construct for validating heuristics not only as descriptive, but also as prescriptive models.

Our results support previous claims in the heuristics literature that there exist simple heuristics offering straightforward solutions to astonishingly complex decision-making tasks. A synthesis of our findings reveals that existing descriptive heuristics, such as the cutoff and successive non-candidate heuristics (Seale & Rapoport, 1997), should be employed by decision-makers whenever the optimal solution of the problem requires a single threshold. However, the PS and the inverse PS heuristics should be used in multiple-threshold problems, as they achieved near-optimal performance across five variants of sequential search problems, including a more complex competitive problem where the search for the best job applicant is conducted sequentially by a group of k interviewers. As in Spiliopoulos and Hertwig (2020), this result pushes back against philosophical arguments (e.g., Sterelny, 2003) that heuristics, while adequate for individual decision-making tasks against nature, would not perform well in strategic environments against other humans.

Armed with a toolbox of such heuristics, a DM may achieve near-optimal solutions with minimal computational demands across a wide domain of problems with varying assumptions, including different objective functions. Our proposed heuristics are “simple” in the sense that they are restricted to a single parameter, and they require only the counting of items and elementary mathematical operations such as maximization, minimization, summation, and division. Furthermore, they do not compute probabilities and utilities, their cost of implementation is negligible, and they require a minimum of processing time. Notably, both heuristics are built around a simple and intuitive ratio that DMs can track in real time—namely, the ratio of the number of applicants already interviewed to the number of those remaining in the sequence. This contrasts sharply with the computations required for the optimal solutions (which involve recursive dynamic programming) and which suffer from the curse of dimensionality. Furthermore, we have shown that within each type of problem, the optimal value of the only free parameter of these heuristics, c, is relatively invariant to the number of items, n. In the terminology of Gigerenzer et al. (1999), the PS and IPS heuristics can be classified as “fast and frugal” in terms of the information processing, storage space, and mental computation they require.

The performance of these heuristics should also be weighed against the prolonged time and considerable effort required to derive the optimal solutions to these variants of the secretary problem. Experimental studies in which the sequential search problems that we have selected are iterated in time to allow for learning might provide important additional information. In prior studies, Seale and Rapoport (1997) report no evidence of learning, whereas Goldstein et al. (2020) do; note that in the latter study, but not in the former, the distribution of applicants' ability could be learned with experience. While the experimental evidence on learning in repeated secretary games is meager and mixed, one of the goals of the present paper is to further stimulate this line of research for a broader class of problems and heuristics. We have shown that a set of simple heuristics with excellent performance exists. Now, significant efforts must be directed toward understanding how decision makers facing optimal sequential search problems arrive at and construct specific heuristics, how successful they are in doing so, and how they learn with repeated exposure to improve upon or adjust these heuristics.