Proof of Theorem 1 in Sect. 4.2
Theorem 1 invites us to consider two general strategies:
- strategy I: always immediately publish a passed step,
- strategy II: never publish anything before the end of the game.
The theorem is proved by induction on the number of remaining temporal intervals, for scientist \(A\). To start with, suppose there is just one temporal interval left. There are two cases: either \(A\) is ahead of \(B\), or the opposite (equal positions for \(A\) and \(B\) can be treated as a special case of either). Start with the first case, and define \(a\) and \(b\) such that: \(B\) is \(b\) steps ahead of the last published step (which might be step 0), and \(A\) is \(a\) steps ahead of \(B\).
Firstly, suppose \(B\) adopts strategy I. Let’s compare \(A\)’s reward according to the strategy she adopts. Suppose she adopts strategy I: she publishes \(b+a\) steps, while \(B\) publishes \(b\) steps, so \(A\) immediately gets the reward \((a+b/2)v\) (the \(b\) common steps are shared, and the \(a\) further steps are hers alone). Then, \(A\) and \(B\) start on the same level for the last temporal interval. \(A\)’s final reward is given by Table 1, according to the research outcome of the final temporal interval.
Table 1 \(A\)’s reward during the last temporal interval, with strategy I
Suppose now \(A\) adopts strategy II: she doesn’t publish now (while \(B\) publishes her \(b\) steps) and waits until the end of the game. \(A\)’s final reward is given by Table 2. Cell by cell, the reward in Table 1 is larger than the corresponding reward in Table 2. This shows that, when \(A\) is in front of \(B\) and \(B\) adopts strategy I, strategy I is better than strategy II for \(A\).
Table 2 \(A\)’s reward during the last temporal interval, with strategy II
Secondly, suppose \(B\) adopts strategy II. Suppose \(A\) adopts strategy I. She first publishes \(b+a\) steps, while \(B\) publishes nothing. So, her reward after the last temporal interval is given by \(bv/2\) plus the cells of Table 1. Suppose now \(A\) adopts strategy II. Then her reward after the last temporal interval is given by \(bv/2\) plus the cells of Table 2. So, here again, strategy I is better than strategy II for \(A\).
Consider now the second case: \(B\) is ahead of \(A\). Define \(a\) and \(b\) such that \(A\) is \(a\) steps ahead of the last published step, and \(B\) is \(b\) steps ahead of \(A\). First, suppose that \(B\) adopts strategy I. If \(A\) adopts strategy I too, her final reward is given by Table 3. If she adopts strategy II, it is given by Table 3 minus \(av/2\). So, strategy I is better for \(A\).
Table 3 Expected rewards for \(A\) and \(B\)
Suppose now that \(B\) adopts strategy II. If \(A\) adopts strategy I, her final reward is given by Table 4. If she adopts strategy II, it is given by Table 3 minus \(av/2\). So, strategy I is better for \(A\) in this case too. So in any case, strategy I is better than strategy II for \(A\), and this establishes the base case of the induction proof.
Table 4 \(A\)’s reward, according to the outcome of the first temporal interval
Suppose now it has been proved for some \(m<n\) that, when \(m\) temporal intervals remain, \(A\) should publish every step she has passed. Consider \(A\) when there remain \(m+1\) temporal intervals. By hypothesis, \(A\) should publish at the next temporal interval (when only \(m\) temporal intervals remain). If \(B\) adopts strategy I, then she will publish at the next temporal interval, so \(A\) is in the case already proved, where \(m=1\). If \(B\) adopts strategy II, \(B\) doesn’t publish before the end, so it makes no difference whether \(A\) publishes right now or only at the next temporal interval. Hence, strategy I is still better than strategy II for \(A\), and Theorem 1 has been proved by induction.
Note that the conclusion that \(A\) should always publish has not only been shown on average, but also for every possible outcome of research — for every comparison of the corresponding cells in the various tables. So, the theorem doesn’t depend on the hypothesis that the scientist is risk-neutral.
Proof of Theorem 2 in Sect. 4.2
It is useful to distinguish between two kinds of steps: the steps both \(A\) and \(B\) have passed but not yet published (call them the “common steps”), and the steps \(A\) is the only one to have passed, if any (her “solitary steps”). I am going to show that solitary steps should be published. Then, taking into account the possibility of common steps cannot make publication a worse strategy than it is if there are only solitary steps (because nothing can be gained in not publishing them, in particular because the competitor’s strategy doesn’t depend on what is actually published, and a potential reward can be lost). For a result aiming at proving the superiority of publication, this is just fine. Strategies I and II are defined in the same way as in the proof of Theorem 1. Again, proving Theorem 2 amounts to showing that a scientist should prefer strategy I to strategy II.
As the chain is composed of \(l\) steps, a total reward \(lv\) will finally be distributed to the scientists. Because of their symmetric role, both scientists can expect a reward \(lv/2\) if they adopt the same strategy. Table 5 shows the expected rewards that I will argue for. Going from Table 5 to Theorem 2 (\(A\) should adopt strategy I) will not be difficult. Whatever strategy \(B\) adopts, \(A\) can expect a better reward if she adopts strategy I. And the same can be said for \(B\). So, both should choose it, and this pair of strategies is a Nash equilibrium.
Table 5 Expected rewards for \(A\) and \(B\) (respectively), according to the strategies they adopt
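The step from Table 5 to the Nash equilibrium can be checked mechanically. The sketch below is only an illustration: the table encodes what the text states (a total reward \(lv\), and \(lv/2\) each when both scientists adopt the same strategy), while the size `eps` of the advantage that strategy I has over strategy II is a hypothetical placeholder, as are the function names.

```python
from itertools import product

STRATEGIES = ("I", "II")


def payoff_table(l, v, eps):
    """Hypothetical Table 5: (A's reward, B's reward) for each strategy pair.

    Only the ordering of the rewards comes from the text; eps > 0 is an
    assumed advantage for the scientist who publishes against one who does not.
    """
    half = l * v / 2
    return {
        ("I", "I"): (half, half),
        ("II", "II"): (half, half),
        ("I", "II"): (half + eps, half - eps),
        ("II", "I"): (half - eps, half + eps),
    }


def is_nash(table, a, b):
    """True if neither scientist gains by deviating unilaterally from (a, b)."""
    a_ok = all(table[(a, b)][0] >= table[(alt, b)][0] for alt in STRATEGIES)
    b_ok = all(table[(a, b)][1] >= table[(a, alt)][1] for alt in STRATEGIES)
    return a_ok and b_ok


def nash_equilibria(table):
    """List every pure-strategy Nash equilibrium of the 2x2 table."""
    return [(a, b) for a, b in product(STRATEGIES, STRATEGIES)
            if is_nash(table, a, b)]
```

For any positive `eps`, the only pure-strategy equilibrium this table admits is the pair in which both scientists adopt strategy I, which is the content of Theorem 2.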
Table 5 could clearly be derived from the following proposition \(P(k)\), if it were true for any \(k\):
Proposition \(P(k)\): If \(B\) has adopted strategy I, if \(A\) has some solitary steps, and if there remain \(k\) steps before the end of the chain (i.e. \(A\) is at the position \(l-k\)), then \(A\)’s expected reward is greater if she publishes her solitary steps.
So, let’s prove \(P(k)\) by induction. In the initial case, \(k=1\), and \(A\) is at step \(l-1\). She has some (say \(a\)) solitary steps. Suppose \(A\) publishes her \(a\) steps now. She immediately gets an \(av\) reward, and both scientists are at step \(l-1\). Given the symmetry between \(A\) and \(B\), \(A\)’s expected reward for the last step is \(v/2\). Summing, her expected reward in this case is
$$\begin{aligned} E = av + \frac{v}{2}. \end{aligned}$$
(1)
Suppose now \(A\) doesn’t publish her \(a\) steps now. I am going to build a table (Table 6) for \(A\)’s expected reward in the long run (called \(E_{6}\)), according to the results after this next temporal interval. Note that it is \(A\)’s expected reward from this specific situation until the end of the game, and not only for the next temporal interval. Here is how Table 6 is filled in. If \(A\) succeeds, the game is over, and the rewards can be easily computed. If \(A\) doesn’t succeed, the game is not over, so computing a reward isn’t straightforward. If both \(A\) and \(B\) fail, then they are still at the same positions. So, \(A\)’s expected reward in this case is \(E_{6}\) itself. In case \(A\) fails and \(B\) succeeds, \(A\) is in position \(l-1\), and \(B\) in \(l-a\), one step higher than before. Let’s call \(E_{6}^{\prime }\) the expected reward in this situation. In any case \(A\) cannot expect to publish more than \(a\) steps, and perhaps she will have to share some of these rewards, so \(E^{\prime }_{6}< av\) on average. This gives Table 6. To get the expected reward \(E_{6}\), each cell of this table is weighted by the probability of its outcome. Some easy computation shows that \(E_{6} < E\). In words: if \(A\) has some solitary steps and if she is at the position \(l-1\), her expected reward is larger if she publishes them now than if she doesn’t. This proves \(P(1)\).
Table 6 Expected reward \(E_{6}\) for \(A\) after one temporal interval, with \(A\) starting \(a\) positions ahead of \(B\), who will publish after this temporal interval
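The "easy computation" can be sketched as follows. The entries used here are assumptions, since the cells of Table 6 are not reproduced in this text: \(p\) denotes the probability (with \(0<p<1\)) that a scientist passes a step during one temporal interval, independently for \(A\) and \(B\); if \(A\) succeeds alone she is assumed to collect \((a+1)v\); if both succeed she is assumed to share one step with \(B\) and to collect \((a+\frac{1}{2})v\). Under these assumptions, and using \(E_{6}^{\prime } < av\):

$$\begin{aligned} E_{6}&= p(1-p)(a+1)v + p^2 \Big ( a+\frac{1}{2} \Big ) v + p(1-p) E_{6}^{\prime } + (1-p)^2 E_{6}, \\ (2-p) E_{6}&< (1-p)(a+1)v + p \Big ( a+\frac{1}{2} \Big ) v + (1-p)\, av = (2-p) \Big ( av + \frac{v}{2} \Big ). \end{aligned}$$

The second line is obtained by moving \((1-p)^2 E_{6}\) to the left-hand side, using \(1-(1-p)^2 = p(2-p)\), and dividing by \(p\). Dividing by \(2-p\) then gives \(E_{6} < av + v/2 = E\).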
Suppose now that \(P(k)\) has been proved for some \(k\). In the case \(k+1\), \(A\) is at step \(l-k-1\). By \(P(k)\), she knows that she will publish her solitary steps, if she has some, when she is on \(l-k\), i.e. on the next step. Both \(A\) and \(B\) would publish if they reached step \(l-k\) (\(B\) would because she always publishes): we are tempted to say that this step plays the role of the end of a shorter chain, that we are in a situation where \(P(1)\) applies, in order to argue that \(A\) should publish now at \(l-k\). This reasoning implicitly assumes that \(A\) should maximize her reward when someone arrives at \(l-k\) (if it is to play the role of the end of the chain). Actually, the real goal for \(A\) is to maximize her reward when either she or \(B\) arrives at step \(l\), the actual end of the chain. Fortunately, the only way to maximize the reward at \(l\) is to maximize it at \(l-k\) first: someone will publish at \(l-k\), the two scientists will be together at this step, so there is no way for \(A\) to keep any sort of better non-published lead for the real end of the chain. So, the argument of \(P(1)\) can be used, and \(P(k+1)\) is proved. As was argued before, this proves Table 5, and Theorem 2.
Proof of Theorem 3 in Sect. 4.3.1
As before, it is clear that \(A\) should publish a common step, and also that she cannot know in practice which steps are solitary or common. Strategies I and II are defined in the same way as in the proof of Theorem 1.
Theorem 3 is equivalent to stating that strategy II gives a larger average reward than strategy I. As the total reward distributed among scientists is constant at \(v_1 + v_2\), it is equivalent to stating Table 7.
Table 7 \(A\)’s expected reward, in case \(A\) is at step 1 and \(B\) at step 0
Let’s consider the case of the bottom left cell. Let’s call \(E_{4}\) the average reward that \(A\) can expect after the first temporal interval. Table 8 is filled in: if \(A\) and \(B\) succeed, only \(B\) publishes because \(A\) follows strategy II; then, they are in a symmetrical position for the last step: \(A\)’s expected reward is \(v_2/2\). If \(A\) fails and \(B\) succeeds, only \(B\) publishes too, and \(A\)’s expected reward is \(v_2/2\). If both scientists fail, they remain in the same position, and the reward is still \(E_{4}\). If \(A\) succeeds and \(B\) fails, \(A\)’s reward is called \(E_{6}\) and is computed below.
\(E_{4}\) is computed by weighting Table 8 with Table 9:
$$\begin{aligned} E_{4} = p^2 \frac{v_2}{2} + p(1-p)\frac{v_2}{2} +p(1-p) E_{6} + (1-p)^2 E_{4} \end{aligned}$$
(2)
The evaluation of \(E_{6}\) is made by filling in a similar Table 10.
Table 8 Expected rewards for \(A\) and \(B\), according to their strategies
Table 9 \(A\)’s expected reward, according to the outcome of the first temporal interval
Table 10 Expected reward \(E_{6}\) for \(A\), in case \(A\) is at step 1 and \(B\) at step 0
Weighting Table 10 with Table 9, one computes
$$\begin{aligned} E_{6} = p^2 \Big ( \frac{v_1}{2} + v_2 \Big ) + p(1-p) \Big (v_1 + v_2 + \frac{v_2}{2} \Big ) + (1-p)^2 E_{6}. \end{aligned}$$
(3)
Solving Eq. 3 for \(E_{6}\) and substituting the result into Eq. 2 gives:
$$\begin{aligned} E_{4} = \frac{v_2}{2(2-p)} + \frac{1-p}{2(2-p)^2}\big [ (2-p) v_1 + (3-p) v_2 \big ] \end{aligned}$$
(4)
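The intermediate steps behind Eq. 4 can be spelled out as follows. This is a sketch that uses only Eqs. 2 and 3 and the identity \(1-(1-p)^2 = p(2-p)\), assuming \(0<p\le 1\) so that dividing by \(p(2-p)\) is allowed:

$$\begin{aligned} E_{6}&= \frac{p \big ( \frac{v_1}{2} + v_2 \big ) + (1-p) \big ( v_1 + \frac{3 v_2}{2} \big )}{2-p} = \frac{(2-p) v_1 + (3-p) v_2}{2(2-p)}, \\ E_{4}&= \frac{\frac{v_2}{2} + (1-p) E_{6}}{2-p}. \end{aligned}$$

Inserting the first line into the second yields Eq. 4.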
The condition expressed in the bottom left cell of Table 7 is that \(E_{4} > \frac{v_1+v_2}{2}\). With Eq. 4, it is equivalent to
$$\begin{aligned} v_2 > \frac{2-p}{1-p} v_1. \end{aligned}$$
(5)
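Equations 2 to 5 can be cross-checked numerically. The following sketch solves the two fixed-point equations (Eqs. 2 and 3) with exact rational arithmetic, compares the result with the closed form of Eq. 4, and tests the threshold of Eq. 5. The function names are illustrative choices, not part of the original proof, and \(0<p<1\) is assumed.

```python
from fractions import Fraction


def expected_rewards(p, v1, v2):
    """Solve Eqs. 2 and 3 for (E_4, E_6) with exact rational arithmetic."""
    p, v1, v2 = Fraction(p), Fraction(v1), Fraction(v2)
    q = 1 - p
    # Eq. 3: E6 = p^2 (v1/2 + v2) + p q (v1 + v2 + v2/2) + q^2 E6
    e6 = (p * p * (v1 / 2 + v2) + p * q * (v1 + v2 + v2 / 2)) / (1 - q * q)
    # Eq. 2: E4 = p^2 v2/2 + p q v2/2 + p q E6 + q^2 E4
    e4 = (p * p * v2 / 2 + p * q * v2 / 2 + p * q * e6) / (1 - q * q)
    return e4, e6


def closed_form_e4(p, v1, v2):
    """Closed form of E_4 as written in Eq. 4."""
    p, v1, v2 = Fraction(p), Fraction(v1), Fraction(v2)
    return (v2 / (2 * (2 - p))
            + (1 - p) / (2 * (2 - p) ** 2) * ((2 - p) * v1 + (3 - p) * v2))


def withholding_pays(p, v1, v2):
    """Condition of the bottom left cell of Table 7: E_4 > (v1 + v2) / 2."""
    e4, _ = expected_rewards(p, v1, v2)
    return e4 > (Fraction(v1) + Fraction(v2)) / 2


def threshold(p, v1):
    """Right-hand side of Eq. 5: the value v2 must exceed."""
    p, v1 = Fraction(p), Fraction(v1)
    return (2 - p) / (1 - p) * v1
```

For instance, with \(p = 1/2\) and \(v_1 = 1\), `threshold` returns 3: at \(v_2 = 3\) the two strategies give \(A\) the same expected reward, and withholding only pays for larger \(v_2\).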