Many critical problems require building a consensus among stakeholders: A group of friends at a restaurant must agree on what appetizers to share. Nations at a climate summit must arrive at a consensus on emission goals. Scheduling meetings is a particularly simple problem of this kind: participants planning a meeting must agree upon a time to meet. Here, we study the combinatorics and statistics of the scheduling problem. We posit that this analysis can be fruitfully extended to more difficult and pressing problems that require consensus building such as the ones mentioned above.

When all responses to a scheduling poll such as Doodle are collated, it usually turns out that there is no time that works for everyone. In this paper, we develop formulae that help estimate the likelihood that a poll will succeed if all participants make a good-faith attempt to attend the meeting.

The paper is organized as follows. We begin with an analysis of the scheduling problem. In the conclusion, we discuss the generalization to broader consensus building problems and the relationship of our work to prior work on consensus building in the statistics [1] and statistical physics [2] literature.

1 Basic model of scheduling polls

We suppose that there are \(\ell \) time slots available for the meeting and m respondents to the poll. Each respondent is assumed to have prior immovable commitments that conflict with r of the time slots on the poll, so that each respondent can attend \(g = \ell - r\) time slots. The poll is arranged in the usual fashion: there are \(\ell \) columns, one for each time slot, and m rows, one for each respondent. A given respondent can fill the \(\ell \) columns of the poll in \(C(\ell ,r)\) different ways, where \(C(\ell ,r)\) denotes the binomial coefficient, and the total number of polls that can result is

$$\begin{aligned} N = [C(\ell ,r)]^m. \end{aligned}$$
(1)

We assume that all the different ways to fill the poll are equally likely.

We define failure as a poll in which there is no time slot that works for all respondents. Our objective is to calculate \(\pi _0,\) the probability of failure. Evidently, if \(m r < \ell \), the probability of failure is zero: the respondents have only \(m r\) conflicts between them, so no matter how they distribute their responses they cannot block all \(\ell \) time slots. When \(m r = \ell \), failure becomes possible, but only if every time slot is blocked by exactly one respondent. In this case, the number of distinct polls that fail is the number of ways to divide the \(\ell \) slots among the m respondents, \(n_0 = \ell !/[(\ell /m)!]^m\), and \(\pi _0 = n_0 / N\), where N is given by Eq. (1) with \(r \rightarrow \ell /m\). For \(m r > \ell \), calculating the number of polls that fail is a more complicated problem of combinatorics. We find

$$\begin{aligned} \pi _0 = \sum _{j=0}^{\ell - r} \frac{ (-1)^j \ell !}{j! (\ell - j)!} \left[ \frac{ (\ell - r)! (\ell - j)! }{ \ell ! ( \ell - r - j )! } \right] ^m. \end{aligned}$$
(2)

The derivation of this formula is a bit lengthy and is relegated to Appendix A.
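Eq. (2) can be checked numerically on small polls. The following is a minimal sketch, not the authors' code: the function names are hypothetical, exact rational arithmetic is used to avoid cancellation errors in the alternating sum, and the brute-force routine simply enumerates all \([C(\ell ,r)]^m\) polls counted in Eq. (1), so it is only practical for very small \(\ell \) and m.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def failure_probability(l: int, m: int, r: int) -> Fraction:
    """Evaluate Eq. (2): probability that no slot works for all m respondents."""
    g = l - r
    total = Fraction(0)
    for j in range(g + 1):
        # Chance that j given slots are all free for one respondent:
        # C(l - j, r) / C(l, r), which equals the bracket in Eq. (2).
        per_respondent = Fraction(comb(l - j, r), comb(l, r))
        total += (-1) ** j * comb(l, j) * per_respondent ** m
    return total


def failure_probability_brute_force(l: int, m: int, r: int) -> Fraction:
    """Enumerate every poll of Eq. (1) and count those with no viable slot."""
    conflict_sets = [frozenset(c) for c in combinations(range(l), r)]
    failures = 0
    for poll in product(conflict_sets, repeat=m):
        blocked = frozenset().union(*poll)
        if len(blocked) == l:  # every slot conflicts with at least one person
            failures += 1
    return Fraction(failures, len(conflict_sets) ** m)


if __name__ == "__main__":
    print(failure_probability(5, 2, 2))  # m*r < l, prints 0
    print(failure_probability(4, 2, 2))  # m*r = l, prints 1/6
    print(failure_probability(5, 3, 2) == failure_probability_brute_force(5, 3, 2))
```

The first two calls reproduce the limiting cases discussed above: the sum vanishes identically when \(m r < \ell \), and for \(m r = \ell \) it equals \(n_0/N\) (here \(6/36\)).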

As a striking application of this formula, suppose that the participants’ schedules are the same each week, a common circumstance in an academic workplace. Then, under reasonable assumptions, we find that it is risky to try to arrange a meeting of more than four participants! Because the schedule repeats from week to week, the maximum number of available slots is \(\ell = 40\). If we assume that each participant is free for only half of the available slots (\(g = 20\)), then the probability of failure is \(\pi _0 = 7.5\%\) for \(m = 4\), but it rises precipitously to \(\pi _0 = 28\%\) for \(m = 5\).

Fig. 1

Histogram of probabilities \(\pi _i\) that a poll will yield exactly i viable meeting times. In the plot \(i = 1, \ldots , 7\). The histogram on the left (red) is for \(\ell =15, m=9, r = 4.\) The histogram on the right (blue) is for \(\ell =20, m = 9, r = 4.\) The failure probability is 0.34 for the first case and 0.027 for the second case, and the shape of the histogram for the two cases is also different. When the probability of failure is high, \(\pi _i\) decreases monotonically; when it is low, \(\pi _i\) rises to a peak corresponding to the most likely number of viable meetings.

It is also of interest to know \(\pi _i,\) the probability that the poll will yield exactly i viable times, where \(i = 1, 2, \ldots , g\). This is shown in Fig. 1, which is easily generated using Eq. (A13) in Appendix A.
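Eq. (A13) is not reproduced in this section, so the sketch below should not be read as that formula. It is one way the histogram could be generated, assuming the standard inclusion-exclusion identity for the probability that exactly i of a set of events occur, applied to the same subset probabilities that appear in Eq. (2). The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def viable_slot_distribution(l: int, m: int, r: int) -> list[Fraction]:
    """Return [pi_0, ..., pi_g] for the basic model (exact rationals)."""
    g = l - r
    # s[j]: sum over all j-slot subsets of the probability that every slot
    # in the subset is free for all m respondents (the j-th term of Eq. (2)
    # without its sign).
    s = [comb(l, j) * Fraction(comb(l - j, r), comb(l, r)) ** m
         for j in range(g + 1)]
    # Probability that exactly i slots are viable, by inclusion-exclusion.
    return [sum((-1) ** (j - i) * comb(j, i) * s[j] for j in range(i, g + 1))
            for i in range(g + 1)]


if __name__ == "__main__":
    for l, m, r in [(15, 9, 4), (20, 9, 4)]:  # parameters quoted for Fig. 1
        dist = viable_slot_distribution(l, m, r)
        print(l, m, r, [round(float(p), 4) for p in dist[:8]])
```

Two consistency checks follow from the construction: the entries sum to one, and the mean number of viable slots is \(\ell (g/\ell )^m\).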

The basic model of scheduling analyzed here belongs to the class of mathematical problems called urn models. However, since scheduling polls are new, this particular model has not, to our knowledge, been analyzed in the literature [3].

2 Simplified model

We now describe a modification of the model that yields simpler expressions but is qualitatively and (under appropriate circumstances) quantitatively the same as the original model. In the simplified model, each respondent fills each entry by an independent toss of a biased coin that comes up “yes” with probability \(p = g/\ell \) and “no” with probability \(q = r/\ell \). In this model, the number of yes entries made by each respondent is equal to g on average, but it fluctuates from realization to realization. Evidently, in this case, the probability that a given column will be viable is \(\mathcal{P} = p^m\) and the probability that the poll fails, i.e., no column is viable, is

$$\begin{aligned} \pi _0 = ( 1 - p^m )^\ell . \end{aligned}$$
(3)

Revisiting the two examples in Fig. 1, the failure probabilities for the simplified model are 0.38 and 0.055, respectively, in qualitative agreement with the earlier numbers. The relationship between the two models is similar to that between the microcanonical and canonical ensembles in statistical physics. It is an empirical question which one is a better model for real polls: there is no a priori reason to prefer one over the other.
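The comparison between the two models is straightforward to script. The sketch below evaluates Eq. (2) and Eq. (3) side by side for the Fig. 1 parameters; the function names are hypothetical, and Eq. (2) is summed with exact rationals before conversion to floating point.

```python
from fractions import Fraction
from math import comb


def exact_failure_probability(l: int, m: int, r: int) -> float:
    """Basic model, Eq. (2): each respondent has exactly r conflicts."""
    total = Fraction(0)
    for j in range(l - r + 1):
        ratio = Fraction(comb(l - j, r), comb(l, r))
        total += (-1) ** j * comb(l, j) * ratio ** m
    return float(total)


def simplified_failure_probability(l: int, m: int, p: float) -> float:
    """Simplified model, Eq. (3): each entry is 'yes' with probability p."""
    column_viable = p ** m  # all m respondents say yes to a given slot
    return (1.0 - column_viable) ** l


if __name__ == "__main__":
    for l, m, r in [(15, 9, 4), (20, 9, 4)]:  # parameters quoted for Fig. 1
        p = (l - r) / l
        print(f"l={l} m={m} r={r}: "
              f"exact={exact_failure_probability(l, m, r):.3f} "
              f"simplified={simplified_failure_probability(l, m, p):.3f}")
```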

A big advantage of the simplified model is that its behavior can be understood easily. For example, it is easy to see from Eq. (3) that \(\pi _0\rightarrow 0\) if \(\ell \rightarrow \infty \) with p and m fixed (unless \(p = 0\), i.e., respondents reject every possible meeting time). In other words, a sufficiently large poll becomes too big to fail. Conversely, if the number of respondents \(m\rightarrow \infty \) with p and \(\ell \) fixed, then \(\pi _0\rightarrow 1,\) i.e., the poll will always fail (except if \(p=1\)). If m and \(\ell \) are both very large and \(p < 1,\) then

$$\begin{aligned} \pi _0 \rightarrow \exp [- \ell p^m] \end{aligned}$$
(4)

which tells us that we need \(\ell \sim (1/p)^m\) or larger if the poll is to have a reasonable chance of succeeding, i.e., \(\ell \) must grow exponentially with m. If \(\ell = K(1/p_c)^m\), where \(K > 0\), then for large m

$$\begin{aligned} \pi _0 \rightarrow \exp [- K (p/p_c)^m] \end{aligned}$$
(5)

which has a sharp transition for \(m\rightarrow \infty \) from \(\pi _0 = 1\) for \(p < p_c\) to \(\pi _0 = 0\) for \(p > p_c,\) and \(\pi _0 = \exp (-K)\) for \(p = p_c\) as shown in Fig. 2.
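The sharpening of the transition can be seen by evaluating Eq. (3) with \(\ell = K(1/p_c)^m\) for increasing m and comparing it with the limiting form of Eq. (5). This is a minimal sketch under stated assumptions: the function names and default arguments are hypothetical, \(\ell \) is treated as a real number, and the power is computed in log space so that very large \(\ell \) does not lose precision.

```python
import math


def failure_probability(p: float, m: int, p_c: float = 0.5, K: float = 1.0) -> float:
    """Eq. (3) with l = K * (1 / p_c)**m time slots."""
    l = K * (1.0 / p_c) ** m
    if p >= 1.0:
        return 0.0  # every slot is viable, so the poll cannot fail
    # (1 - p^m)^l computed as exp(l * log(1 - p^m)) for numerical stability.
    return math.exp(l * math.log1p(-(p ** m)))


def large_m_limit(p: float, m: int, p_c: float = 0.5, K: float = 1.0) -> float:
    """Eq. (5): exp[-K (p / p_c)^m]."""
    return math.exp(-K * (p / p_c) ** m)


if __name__ == "__main__":
    for m in (3, 10, 20):  # the values of m quoted for Fig. 2
        row = [f"{failure_probability(p, m):.4f}" for p in (0.4, 0.5, 0.6)]
        print(f"m={m:2d}  pi_0 at p=0.4, 0.5, 0.6:", row)
```

For \(p_c = 1/2\) and \(K = 1\), the value at \(p = p_c\) approaches \(\exp (-1)\) as m grows, while the values on either side move toward 1 and 0.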

Fig. 2

Phase Transition. The probability for the poll to fail, \(\pi _0\), plotted as a function of p, the fraction of time slots that each respondent is available. The plot is shown for \(m= 3\), \(m = 10\) and \(m = 20\) where m is the number of respondents. We assume that \(\ell = 2^m\), i.e., \(p_c = 1/2\) and \(K = 1\). Note that the transition in \(\pi _0\) sharpens to an abrupt discontinuity in the thermodynamic limit \(m \rightarrow \infty \) showing that the poll failure probability undergoes a first-order phase transition as p varies past \(p_c\).

As with the basic model discussed in the previous section, it is possible to obtain an expression for \(\pi _i,\) the probability of ending up with i viable columns: \(\pi _i = C(\ell , i) \mathcal{P}^i (1 - \mathcal{P})^{\ell -i}\) with Eq. (3) being the case when \(i=0.\)
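Because the columns are independent in the simplified model, the full distribution is binomial and can be tabulated directly. The sketch below is illustrative only; the function name is hypothetical and the parameters are those quoted for Fig. 1, with \(p = g/\ell \).

```python
from math import comb


def viable_column_distribution(l: int, m: int, p: float) -> list[float]:
    """Return [pi_0, ..., pi_l] for the simplified model."""
    column_viable = p ** m  # probability that a single column is viable
    return [comb(l, i) * column_viable ** i * (1.0 - column_viable) ** (l - i)
            for i in range(l + 1)]


if __name__ == "__main__":
    for l, m, r in [(15, 9, 4), (20, 9, 4)]:
        dist = viable_column_distribution(l, m, (l - r) / l)
        most_likely = dist.index(max(dist))
        print(f"l={l}: pi_0={dist[0]:.3f}, most likely number of viable slots={most_likely}")
```

With these parameters the distribution for \(\ell = 15\) is largest at \(i = 0\), whereas for \(\ell = 20\) it peaks at \(i = 2\), mirroring the qualitative behavior described for the basic model in the caption of Fig. 1.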

Returning to our original model, one might hope that—as for the microcanonical and canonical ensembles of statistical physics—it will reduce to the simplified model when \(\ell \) is large, even for finite m. In Appendix B, we have constructed a complex integral representation of the probability of failure \(\pi _0\) (as well as \(\pi _1, \ldots , \pi _g\)) for the original model. Through a saddle-point analysis, we show that \(\pi _0 \rightarrow \exp [-\ell f(m, r/\ell )]\) when \(\ell \rightarrow \infty .\) Although the exponential decay of \(\pi _0\) with \(\ell \) is similar to the simplified model, rather surprisingly we find that the decay constant \(f(m, r/\ell )\) is only equal to \(-\ln (1-p^m)\) when \(m\rightarrow \infty .\) Thus, for the two models to agree, it is not sufficient that \(\ell \rightarrow \infty \); we also require \(m\rightarrow \infty .\)

Since the two models agree asymptotically in the limit that \(\ell \rightarrow \infty \) and \(m \rightarrow \infty \), the exponential scaling and first-order phase transition are exactly the same for both models.

3 Discussion

The finding that the size of a poll \(\ell \) must grow exponentially with the number of respondents m raises the question of whether scheduling a meeting is an exponentially hard problem in the sense of algorithmic complexity. The brute force method of conducting an exhaustive poll to schedule a meeting is indeed exponentially hard, but there may be a cleverer way to search the space of possible meeting times that scales only polynomially in the number of respondents. We leave this as an open question.

Several simple generalizations of our model of scheduling polls are possible. For example, one could imagine that there are two populations of respondents with different numbers of conflicting commitments. However, there may be diminishing returns to such generalizations. Real polls fail due to complex dynamic phenomena (e.g., a respondent stalls the poll, while the schedules of the early responders shift and fill up) and it may be desirable to incorporate these processes in the model instead.

Within our model, the only fail-safe strategy for a given number of respondents m, each with r conflicting commitments, is to increase the number \(\ell \) of available time slots so that \(\ell > m r\). This may not always be practical. Indeed, as we argued above, in the common circumstance that the schedule is the same from week to week it may, under reasonable assumptions, be difficult to schedule a meeting of more than four individuals.

We conclude by framing some other consensus building problems in a similar way to the scheduling problem. A college department chair who wishes to make fair teaching assignments might create a poll in which the columns correspond to the courses that need to be taught and the rows to faculty members available to teach them. Success in this case requires not only that there is someone willing to teach every course, but also that the teaching assignments can be made in such a way that each instructor is assigned their proper teaching load. As a less academic problem with a similar structure, consider the group of friends choosing appetizers to share at a restaurant. The columns would correspond to the different appetizers on the menu and the rows to the names of the diners. A high bar for success would be to only consider appetizers that make the yes list for all the diners; in practice, it may be necessary to set a lower bar.

For the problem of creating climate agreements, the rows of the model are the nation states that might sign the agreement and the columns correspond to different versions of the agreement ranging from the most stringent to more watered down variants. Each box of the model would be characterized by two numbers: the probability that the corresponding nation state would sign onto that variant of the agreement and a score corresponding to the climate impact of the nation state signing onto that particular variant. Within the climate field, even the process of writing the summary for policy makers of an IPCC Report [4] is a large problem in consensus building: every sentence (the columns) has to be approved by all the stakeholders (the rows).

We see that these are all closely linked problems, but in each case, the statistical and combinatorial analysis required is somewhat different. Apart from mathematical analysis, it may be useful to analyze these problems from the perspective of behavioral economics [5]. For example, one could ask whether a better outcome is achieved by offering the full menu of choices from the outset or by repeating the poll, offering options that correspond to less favorable outcomes only on later iterations. It is relevant to note here that there is a class of prior studies on consensus building [1, 2] that is concerned with modeling how stakeholders influence each other, leading to an evolution of their opinions. Here, by contrast, we are interested not in time evolution but in the complex combinatorics that arise from having a multiplicity of both stakeholders and choices.