How to Eat Your Entropy and Have it Too: Optimal Recovery Strategies for Compromised RNGs

Algorithmica

Abstract

Random number generators (RNGs) play a crucial role in many cryptographic schemes and protocols, but their security proof usually assumes that their internal state is initialized with truly random seeds and remains secret at all times. However, in many practical situations these are unrealistic assumptions: The seed is often gathered after a reset/reboot from low entropy external events such as the timing of manual key presses, and the state can be compromised at unknown points in time via side channels or penetration attacks. The usual remedy (used by all the major operating systems, including Windows, Linux, FreeBSD, MacOS, iOS, etc.) is to periodically replenish the internal state through an auxiliary input with additional randomness harvested from the environment. However, recovering from such attacks in a provably correct and computationally optimal way had remained an unsolved challenge so far.

In this paper we formalize the problem of designing an efficient recovery mechanism from state compromise, by considering it as an online optimization problem. If we knew the timing of the last compromise and the amount of entropy gathered since then, we could stop producing any outputs until the state becomes truly random again. However, our challenge is to recover within a time proportional to this optimal solution even in the hardest (and most realistic) case in which (a) we know nothing about the timing of the last state compromise, and the amount of new entropy injected since then into the state, and (b) any premature production of outputs leads to the total loss of all the added entropy used by the RNG, since the attacker can use brute force to enumerate all the possible low-entropy states. In other words, the challenge is to develop recovery mechanisms which are guaranteed to save the day as quickly as possible after a compromise we are not even aware of. The dilemma that we face is that any entropy used prematurely will be lost, and any entropy which is kept unused will delay the recovery.

After developing our formal definitional framework for RNGs with inputs, we show how to construct a nearly optimal RNG which is secure in our model. Our technique is inspired by the design of the Fortuna RNG (which is a heuristic RNG construction that is currently used by Windows and comes without any formal analysis), but we non-trivially adapt it to our much stronger adversarial setting. Along the way, our formal treatment of Fortuna enables us to improve its entropy efficiency by almost a factor of two, and to show that our improved construction is essentially tight, by proving a rigorous lower bound on the possible efficiency of any recovery mechanism in our very general model of the problem.

Notes

  1. Since conditional min-entropy is defined in the worst-case manner, the value \(\gamma _j\) in the bound below should not be viewed as a random variable, but rather as an arbitrary fixing of this random variable.

  2. Intuitively, “fresh” refers to the new entropy in the system since the last state compromise.

  3. Intuitively, this game captures security against an attacker that can cause a machine to reboot.

  4. The intuition for the competitive ratio \(r= \alpha \cdot \beta \) (which will be explicit in Sect. 6) comes from the case when the sequence sampler \(\mathcal {E}\) is restricted to constant sequences \(w_i = w\). In that case, \(r\) bounds the ratio between the time taken by \(\mathcal {SC}\) to win and the time taken to receive a total weight of one.

  5. This does not correspond to the bit b chosen by \(\mathcal {A}'\) in the simulation.

  6. We analyze their construction against constant sequences much more carefully in Sect. 6.1.

  7. [5] contains a detailed discussion of the subtleties here and the justification for such an assumption.

  8. We note that when the sequence sampler \(\mathcal {E}\) must be constant, \((t, q, w_\mathrm {max}, \alpha , \beta , \varepsilon )\)-security is equivalent to \((t, q, w_\mathrm {max}, \alpha ', \beta ', \varepsilon )\)-security if \(\alpha \cdot \beta = \alpha ' \cdot \beta '\).

  9. There is an attack: Let \(w= 1/(2^i+1)\) and start Fortuna’s counter so that pool \(i+1\) is emptied after \(2^i\cdot \log _2 q\) steps. Clearly, \(\mathcal {SC}_\mathcal {F}\) takes \((2^i+2^{i+1})\cdot \log _2 q = 3 \cdot 2^i \cdot \log _2 q\) total steps to finish, achieving a competitive ratio arbitrarily close to \(3 \log _2 q\).

  10. This follows from the analysis of our own scheduler in Appendix 2.

  11. To compare with our previous numbers from Sect. 5, recall that we had \(\beta = 4\). Therefore, we note that the above scheduler achieves such security in four times the amount of time that it takes to receive about 750 bytes to 1.2 kilobytes of entropy. These are the proper numbers to compare, though they make less sense in the constant-rate case.

  12. Note that, while this assumption is quite strong, we do not impose a fixed order on the \(\mathsf {set}\text {-}\mathsf {refresh}\) calls or assume constant entropy from \(\mathcal {D}\text {-}\mathsf {refresh}\) calls as [9] do. Indeed, the original Fortuna construction is clearly not secure in our extended model even with a constant entropy assumption.

  13. Technically, we replace \(\mathcal {E}_i\) with \(\mathcal {E}_i'\), which outputs a sequence of length \(i \cdot r\).

References

  1. Barak, B., Halevi, S.: A model and architecture for pseudo-random generation with applications to /dev/random. In: Proceedings of the 12th ACM Conference on Computer and Communications Security, CCS ’05, ACM, pp. 203–212. New York, NY, USA (2005)

  2. Barker, E., Kelsey, J.: Recommendation for Random Number Generation Using Deterministic Random Bit Generators. NIST Special Publication, Oakland (2012)

  3. Bellare, M., Rogaway, P.: The security of triple encryption and a framework for code-based game-playing proofs. In: Vaudenay S. (ed.) Advances in Cryptology—EUROCRYPT. Lecture Notes in Computer Science, vol. 4004, pp. 409–426. Springer, Berlin, Heidelberg (2006)

  4. CVE-2008-0166. Common vulnerabilities and exposures (2008)

  5. Dodis, Y., Pointcheval, D., Ruhault, S., Vergniaud, D., Wichs, D.: Security analysis of pseudo-random number generators with input: /dev/random is not robust. In: Proceedings of the 2013 ACM SIGSAC Conference on Computer Communications Security, CCS ’13, ACM, pp. 647–658. New York, NY, USA (2013)

  6. Dorrendorf, L., Gutterman, Z., Pinkas, B.: Cryptanalysis of the random number generator of the windows operating system. IACR Cryptol. ePrint Arch. 2007, 419 (2007)

  7. Eastlake, D., Schiller, J., Crocker, S.: Randomness Requirements for Security (2005). http://www.rfc-editor.org/rfc/rfc4086.txt

  8. Ferguson, N.: Private communication (2013)

  9. Ferguson, N., Schneier, B.: Practical Cryptography, 1st edn. Wiley, New York (2003)

  10. Gutterman, Z., Pinkas, B., Reinman, T.: Analysis of the linux random number generator. In: Proceedings of the 2006 IEEE Symposium on Security and Privacy. SP ’06, IEEE Computer Society, pp. 371–385. Washington, DC, USA (2006)

  11. Heninger, N., Durumeric, Z., Wustrow, E., Halderman, J.A.: Mining your Ps and Qs: detection of widespread weak keys in network devices. In: Proceedings of the 21st USENIX Security Symposium (2012)

  12. Kelsey, J., Schneier, B., Ferguson, N.: Yarrow-160: notes on the design and analysis of the yarrow cryptographic pseudorandom number generator. In: Sixth Annual Workshop on Selected Areas in Cryptography, pp. 13–33. Springer (1999)

  13. Kelsey, J., Schneier, B., Wagner, D., Hall, C.: Cryptanalytic attacks on pseudorandom number generators. In: Vaudenay S. (ed.) Fast Software Encryption. Lecture Notes in Computer Science, vol. 1372, pp. 168–188. Springer, Berlin, Heidelberg (1998)

  14. Lacharme, P., Röck, A., Strubel, V., Videau, M.: The linux pseudorandom number generator revisited. IACR Cryptol. ePrint Arch. 2012, 251 (2012)

  15. Lenstra, A.K., Hughes, J.P., Augier, M., Bos, J.W., Kleinjung, T., Wachter, C.: Public keys. In: Advances in cryptology–CRYPTO 2012. Lecture Notes in Computer Science, vol. 7417, pp. 626–642. Springer, Heidelberg (2012)

  16. Nguyen, P.Q., Shparlinski, I.E.: The insecurity of the digital signature algorithm with partially known nonces. J. Cryptol. 15(3), 151–176 (2002)

  17. Sahai, A., Vadhan, S.P.: A complete problem for statistical zero knowledge. J. ACM 50(2), 196–249 (2003)

  18. Schindler, W., Killmann, W.: Evaluation Criteria for True (Physical) Random Number Generators Used in Cryptographic Applications. In: Kaliski, B.S., Koç, Ç.K., Paar, C. (eds.) Cryptographic Hardware and Embedded Systems - CHES 2002: 4th International Workshop, Redwood Shores, CA, USA, August 13–15, 2002, Revised Papers, pp. 431–449. Springer, Berlin, Heidelberg (2003)

  19. Wikipedia. /dev/random. http://en.wikipedia.org/wiki//dev/random (2004). Accessed 09 Feb 2014

Author information

Corresponding author

Correspondence to Noah Stephens-Davidowitz.

Appendices

Appendix 1: Proof of Theorem 1

We prove the two bounds in Theorem 1 as two separate propositions. Note that the first lower bound applies even when adversaries are restricted to just constant sequences.

Proposition 1

For \(q\ge 3\), let \(\mathcal {SC}\) be a \((t, q, w_\mathrm {max}, \alpha , \beta , \varepsilon )\)-secure scheduler against constant-rate adversaries running in time \(t_\mathcal {SC}\). Then, either \(t = O(q \cdot (t_\mathcal {SC}+ \log q))\), \(\varepsilon \ge 1/(q-1/w_\mathrm {max}+ 1)\), or

$$\begin{aligned} r> \log _eq - \log _e(1/w_\mathrm {max})- \log _e\log _eq - 1 \; , \end{aligned}$$

where \(r= \alpha \cdot \beta \) is the competitive ratio.

Proposition 2

Suppose that \(\mathcal {SC}\) is a \((t, q, w_\mathrm {max}, \alpha , \beta , \varepsilon )\)-secure scheduler running in time \(t_\mathcal {SC}\). Then, either \(t = O(q (t_\mathcal {SC}+ \log q))\), \( r^2 > w_\mathrm {max}^2 q\), \(\varepsilon \ge 1/e\), or

$$\begin{aligned} \alpha > \frac{w_\mathrm {max}}{w_\mathrm {max}+ 1} \cdot \frac{\log _e(1/\varepsilon )-1}{\log _e\log _e(1/\varepsilon )+1} \; , \end{aligned}$$

where \(r= \alpha \cdot \beta \).

It should be clear that Theorem 1 follows immediately from the two propositions.
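To get a feel for the two bounds, the following sketch simply evaluates their right-hand sides numerically. The function names are hypothetical and chosen only for this illustration, and the parameter checks mirror the hypotheses \(q \ge 3\) and \(\varepsilon < 1/e\) stated in the propositions.

```python
import math

def prop1_ratio_lower_bound(q: int, w_max: float) -> float:
    """Right-hand side of Proposition 1: ln q - ln(1/w_max) - ln ln q - 1."""
    if q < 3:
        raise ValueError("Proposition 1 assumes q >= 3")
    return math.log(q) - math.log(1.0 / w_max) - math.log(math.log(q)) - 1.0

def prop2_alpha_lower_bound(w_max: float, eps: float) -> float:
    """Right-hand side of Proposition 2 (meaningful only for eps < 1/e)."""
    if not 0.0 < eps < 1.0 / math.e:
        raise ValueError("Proposition 2 assumes 0 < eps < 1/e")
    log_inv = math.log(1.0 / eps)
    return (w_max / (w_max + 1.0)) * (log_inv - 1.0) / (math.log(log_inv) + 1.0)

if __name__ == "__main__":
    # Arbitrary sample parameters, not taken from the text.
    print(prop1_ratio_lower_bound(2**32, 1.0))
    print(prop2_alpha_lower_bound(1.0, 2.0**-64))
```

Both functions only restate the formulas; they say nothing about whether the other alternatives in each proposition (the bounds on \(t\), \(\varepsilon \), or \(r^2\)) apply.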

1.1 Proof of Proposition  1

The main step in the proof of Proposition 1 is the following lemma:

Lemma 1

For any \(q\ge 3\) let \(\mathcal {E}_i\) be the constant sequence sampler that simply outputs the sequence \((1/i, \ldots , 1/i)\) for \(i=1/w_\mathrm {max},\ldots , q\). Then, for any keyless scheduler \(\mathcal {SC}\) with P pools, there exists an i and an adversary \(\mathcal {A}\) such that \(\mathcal {E}_i\) and \(\mathcal {A}\) win \(\mathsf {SGAME}(P, q, w_\mathrm {max}, r)\) for any \(r\le \log _eq - \log _e(1/w_\mathrm {max})- \log _e\log _eq - 1\).

Furthermore, there exists a single adversary \(\mathcal {A}'\) that, given any keyless scheduler \(\mathcal {SC}\), i, and \(r\), can output the \(\tau \) that allows \(\mathcal {E}_i\) to win \(\mathsf {SGAME}(P, q, w_\mathrm {max}, r)\) against \(\mathcal {SC}\) (or outputs FAIL if none exists) in time \(O(q \cdot (\log q + t_\mathcal {SC}))\), where \(t_\mathcal {SC}\) is the run-time of the scheduler.

Proof

We assume without loss of generality that \(1/w_\mathrm {max}\) is an integer.

Fix any keyless scheduler \(\mathcal {SC}\) and start state \(\tau _0\). Given the corresponding sequence \((\mathsf {in}_j, \mathsf {out}_j)_{j=1}^q\), we define the sequence of “leave times” \(b_1, \ldots , b_q \in \mathbb {N}\cup \{\infty \}\) as \(b_j = \min \{ T \ge j : \mathsf {out}_T = \mathsf {in}_j \}\) (where we adopt the convention that \(\min \varnothing = \infty \)). Intuitively, at time T, we imagine the scheduler selecting a pool \(\mathsf {in}_T\) in which to “throw a ball”, and a pool \(\mathsf {out}_T\) to empty afterwards. The leave time \(b_j\) is the time at which the ball that was “thrown” at time j will “leave the game”.
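The definition of leave times can be made concrete with a short sketch. It assumes the scheduler's choices are given as two Python lists (a representation chosen here for illustration), with `None` standing for a step at which no pool is emptied, and it computes every \(b_j\) in a single backward pass.

```python
import math

def leave_times(ins, outs):
    """ins[j-1] and outs[j-1] hold in_j and out_j (the text indexes from 1).

    Returns a list b with b[j-1] = min{T >= j : out_T = in_j},
    or math.inf when the ball thrown at time j never leaves.
    """
    q = len(ins)
    b = [math.inf] * q
    next_empty = {}  # pool -> earliest emptying time at or after the current T
    for T in range(q, 0, -1):
        next_empty[outs[T - 1]] = T  # pool out_T is emptied at time T
        b[T - 1] = next_empty.get(ins[T - 1], math.inf)
    return b

if __name__ == "__main__":
    print(leave_times([0, 1, 0, 1], [0, 0, 1, 0]))
```

Scanning backwards keeps the cost linear in q, since each pool's next emptying time is updated as soon as an earlier one is seen.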

Let \(\tau _T\) be the state of \(\mathcal {SC}\) after T steps, and let \(\mathcal {A}_T \) be the adversary that sets the state of \(\mathcal {SC}\) to \(\tau _T\). Note that \(\mathcal {SC}\) wins \(\mathsf {SGAME}(P, q, w_\mathrm {max}, r)\) against \(\mathcal {E}_i, \mathcal {A}_T\) if and only if there is some set of i balls \(J \subseteq [T+1, T+i\cdot r]\) with \(b_j = b_{j'} \le T+i\cdot r\) for all \(j, j' \in J\).
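The winning condition just stated translates directly into a check over leave times. The sketch below assumes that the leave times are already available as a list and that \(r\) is an integer, so that \(T + i\cdot r\) is a valid time index; both are simplifications made for illustration.

```python
from collections import Counter

def scheduler_wins(b, T, i, r):
    """b[j-1] is the leave time b_j of the ball thrown at time j.

    Returns True iff at least i balls thrown in [T+1, T+i*r] share a
    single leave time that is itself at most T+i*r.
    """
    end = T + i * r
    last = min(end, len(b))
    counts = Counter(
        b[j - 1] for j in range(T + 1, last + 1) if b[j - 1] <= end
    )
    return any(c >= i for c in counts.values())

if __name__ == "__main__":
    inf = float("inf")
    print(scheduler_wins([2, 2, 4, 4, inf, inf], T=0, i=2, r=1))
```

Balls that share a leave time were necessarily thrown into the same pool, which is why counting equal leave times is enough.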

We proceed by “marking balls”. We first consider \(\lfloor \frac{w_\mathrm {max}\cdot q}{r}\rfloor \) non-overlapping intervals of length \( r/w_\mathrm {max}\) in \(\{1,\ldots , q\}\). By hypothesis, there must be at least \(1/w_\mathrm {max}\) balls in each of these intervals that leave at the same time in the same interval. We mark all such balls, marking at least \( \frac{q}{r} - 1/w_\mathrm {max}\) distinct balls in total. Now, consider \(\lfloor \frac{w_\mathrm {max}\cdot q}{2r}\rfloor \) non-overlapping intervals of length \(2 r/w_\mathrm {max}\). In each such interval, there must be at least \(2/w_\mathrm {max}\) balls whose leave time is the same and in the interval. We mark these balls. Previously no more than \(1/w_\mathrm {max}\) balls that we’d marked had the same leave time, so we must have marked at least \(1/w_\mathrm {max}\) new balls in each interval. Therefore, we’ve now marked at least \(\frac{q}{r} + \frac{q}{2r} - 2/w_\mathrm {max}\) distinct balls, and no set of more than \(2/w_\mathrm {max}\) balls have the same leave time.

Proceeding by induction, suppose that after \(j < \lfloor \frac{w_\mathrm {max}\cdot q}{r}\rfloor \) steps, we have marked at least \(\sum _{k=1}^j \frac{q}{k\cdot r} - j/w_\mathrm {max}\) distinct balls, and no set of more than \(j/w_\mathrm {max}\) marked balls have the same leave time. We consider \(\lfloor \frac{w_\mathrm {max}\cdot q}{(j+1)\cdot r}\rfloor \) non-overlapping intervals of length \((j+1)\cdot r/w_\mathrm {max}\) and note that in each such interval there must be \((j+1)/w_\mathrm {max}\) balls with the same leave time. So, we mark these and note that we must have marked an additional \( \frac{q}{(j+1)\cdot r} - 1/w_\mathrm {max}\) new balls and that no set of more than \((j+1)/w_\mathrm {max}\) marked balls have the same leave time.

It follows that this procedure will mark at least \(\sum _{k=1}^{\lfloor w_\mathrm {max}\cdot q/r\rfloor } \frac{q}{k\cdot r} - q/r\) balls. Recalling that the nth harmonic number satisfies \(H_n = \sum _{k=1}^n1/k > \log _e(n+1)\), it follows that we’ve marked at least \(\frac{q}{r}\cdot (\log _eq - \log _er - \log _e(1/w_\mathrm {max})-1)\) distinct balls in this way. But, there are only q balls total. It follows that \(r > \log _eq - \log _e(1/w_\mathrm {max})- \log _e\log _eq - 1\).
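The counting step can be spelled out as follows. This is only an expanded restatement of the argument, writing \(n = \lfloor w_\mathrm {max}\cdot q/r\rfloor \) as shorthand and using \(n + 1 > w_\mathrm {max}\cdot q/r\):

$$\begin{aligned} \sum _{k=1}^{n} \frac{q}{k\cdot r} - \frac{q}{r}&= \frac{q}{r}\cdot (H_n - 1)\\&> \frac{q}{r}\cdot (\log _e(n+1) - 1)\\&> \frac{q}{r}\cdot (\log _eq - \log _er - \log _e(1/w_\mathrm {max}) - 1) \; . \end{aligned}$$

Since at most q balls can be marked, dividing by \(q/r\) gives \(r > \log _eq - \log _er - \log _e(1/w_\mathrm {max}) - 1\). If \(r \le \log _eq\), then \(\log _er \le \log _e\log _eq\) and the stated bound follows. If \(r > \log _eq\), the stated bound holds trivially, assuming \(w_\mathrm {max}\le 1\) (which is consistent with \(1/w_\mathrm {max}\) being an integer) and recalling \(q \ge 3\).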

It remains to construct an \(\mathcal {A}'\) that finds the winning \(\tau \) in \(O(q\cdot (t_\mathcal {SC}+ \log q))\) time given \(\mathcal {SC}\), i, and \(r\). \(\mathcal {A}'\) first computes \((\tau _j)_{j=0}^{q-1}\) in time \(O(q \cdot t_\mathcal {SC})\). Now, as above, \(\mathcal {A}'\) divides \(\{ 1,\ldots , q\} \) into \(\lfloor \frac{q}{i\cdot r}\rfloor \) disjoint intervals of length \(i\cdot r\). For each such interval \([T + 1, T + i\cdot r]\), \(\mathcal {A}'\) simply simulates \(\mathsf {SGAME}(P, i\cdot r, r)\) against \(\mathcal {E}_i\) starting at \(\tau _T\) (see Note 13). \(\mathcal {A}'\) returns \(\tau _T\) if it wins the simulation. If no \(\tau _T\) wins, \(\mathcal {A}'\) outputs FAIL. This takes time \(O(q \log q)\). (The \(\log q\) overhead is incurred because \(\mathcal {A}'\) needs to write numbers that could be as large as q.)

The result follows. \(\square \)

From this, Proposition 1 follows easily.

Proof of Proposition 1

Fix \(\mathcal {SC}\).

Let \(\mathcal {E}\) be the sequence sampler that selects \(i \mathop {\leftarrow }\limits ^{\$}\{1/w_\mathrm {max}, \ldots , q \}\) and then behaves as the constant sequence sampler \(\mathcal {E}_i\) from Lemma 1. Let \(\mathcal {A}\) be the adversary that behaves as follows: On input \(\mathsf {skey}\), \(\mathcal {A}\) produces the keyless scheduler \(\mathcal {SC}_\mathsf {skey}\) such that \(\mathcal {SC}_\mathsf {skey}(\sigma ) = \mathcal {SC}(\mathsf {skey}, \sigma )\). \(\mathcal {A}\) then simulates \(\mathcal {A}'\) from the lemma, which outputs either some state \(\tau \) or FAIL. If \(\mathcal {A}'\) outputs \(\tau \), \(\mathcal {A}\) simply does the same. Otherwise, \(\mathcal {A}\) outputs an arbitrary state.

By Lemma 1, \(\mathcal {A}\) runs in time \(O(q \cdot (\log q + t_\mathcal {SC}))\), and if \(r\le \log _eq - \log _e(1/w_\mathrm {max})- \log _e\log _eq - 1\), then with probability at least \(1/(q-1/w_\mathrm {max}+1)\), this procedure produces an \(\mathcal {E}_i, \tau \) pair that wins \(\mathsf {SGAME}(P, q, w_\mathrm {max}, r)\) against \(\mathcal {SC}_\mathsf {skey}\). The result follows. \(\square \)

1.2 Proof of Proposition 2

Proof of Proposition 2

Suppose \(r^2 \le w_\mathrm {max}^2 q\). For simplicity, we will assume \(1/w_\mathrm {max}\) is an integer.

Our proof begins similarly to that of Lemma 1. In particular, we let \(\tau _0\) be any start state. Let \(B_1, \ldots , B_q\) be random variables over the choice of \(\mathsf {skey}\) corresponding to leave times, \(B_j = \min \{ T \ge j : \mathsf {out}_T = \mathsf {in}_j \}\). We again think of a ball with weight \(w_j\) thrown into pool \(\mathsf {in}_j\) at time j and leaving the game at time \(B_j\).

Intuitively, our approach will be to first show a pair of adversaries that win if balls take too long to leave. We’ll then show a pair of adversaries that win if balls leave too quickly.

In particular, let \(\mathcal {E}\) simply output a sequence of \(\alpha /w_\mathrm {max}\) maximum weights followed by 0s, \((w_\mathrm {max},\ldots , w_\mathrm {max}, 0, \ldots 0)\). For any \(\mathsf {skey}\) and any \(1 \le T \le q\), let \(\tau _T(\mathsf {skey})\) be the state that \(\mathcal {SC}\) with \(\mathsf {skey}\) reaches after T steps, starting at \(\tau _0\). Let \(\mathcal {A}_k\) be the adversary that simply outputs \( \tau _{k r/w_\mathrm {max}}(\mathsf {skey})\) on input \(\mathsf {skey}\). Note that in order for \(\mathcal {SC}\) to win \(\mathsf {SGAME}\) against \(\mathcal {E}, \mathcal {A}_k\), it is necessary but not sufficient for there to be some j with \(k r/w_\mathrm {max}< j \le k r/w_\mathrm {max}+ \alpha /w_\mathrm {max}\) and \(B_j \le (k+1) r/w_\mathrm {max}\). (Intuitively, there must be some ball that enters in the first \(\alpha /w_\mathrm {max}\) steps of the game against \(\mathcal {A}_{k }\) and leaves before time \(r/w_\mathrm {max}\).)

Now, let \(\mathcal {A}_k^*\) be an adversary that for \(0 \le k' < k\) selects \(j_{k'}\) uniformly at random with \(k' r/w_\mathrm {max}< j_{k'} \le k'r/w_\mathrm {max}+ \alpha /w_\mathrm {max}\). If \(B_{j_{k'}} > (k'+1)\cdot r/w_\mathrm {max}\), then \(\mathcal {A}_k^*\) simply behaves as \(\mathcal {A}_{k'}\). Otherwise, \(\mathcal {A}_k^*\) behaves as \(\mathcal {A}_k\). Let \(E_k\) be the event that \(B_{j_{k'}} \le (k'+1)\cdot r/w_\mathrm {max}\) for all \(k' < k\). Note that \(\mathcal {A}_k^*\) wins if \(E_{k}\) happens and \(B_{j} > (k+1) \cdot r/ w_\mathrm {max}\) for all j with \(k r/w_\mathrm {max}< j \le kr/w_\mathrm {max}+ \alpha /w_\mathrm {max}\). (To be clear, \(\mathcal {A}_k^*\) may win in other circumstances as well.) Therefore,

$$\begin{aligned} \varepsilon \ge \Pr [E_{k}]\cdot \Pr \Big [\forall j\ \mathrm { with }\ \frac{kr}{w_\mathrm {max}}< j \le \frac{kr+ \alpha }{w_\mathrm {max}} ,\ B_j > (k+1) \cdot \frac{r}{w_\mathrm {max}} \ \Big |\ E_{k} \Big ] \; . \end{aligned}$$

Rearranging, we have

$$\begin{aligned} \Pr [E_{k}] - \varepsilon&\le \Pr [E_{k}]\cdot \Pr \Big [\exists j\ \mathrm { with }\ \frac{kr}{w_\mathrm {max}}\!<\! j \!\le \! \frac{kr+ \alpha }{w_\mathrm {max}} ,\ B_j \!\le \! (k+1) \cdot \frac{r}{w_\mathrm {max}} \ \Big |\ E_{k} \Big ]\\&\le \Pr [E_{k}]\cdot \sum _{j = k \cdot r/w_\mathrm {max}}^{(k \cdot r+\alpha )/w_\mathrm {max}} \Pr \Big [ B_j \le (k+1) \cdot \frac{r}{w_\mathrm {max}} \ \Big |\ E_{k} \Big ]\\&= \frac{\alpha }{w_\mathrm {max}} \cdot \Pr [E_{k}]\cdot \Pr \Big [ B_{j_k} \le (k+1) \cdot \frac{r}{w_\mathrm {max}} \ \Big |\ E_{k} \Big ]\\&= \frac{\alpha }{w_\mathrm {max}} \cdot \Pr [E_{k+1}] \; , \end{aligned}$$

where \(j_k\) is chosen uniformly at random with \(k r/w_\mathrm {max}< j_{k} \le kr/w_\mathrm {max}+ \alpha /w_\mathrm {max}\). So, we have the recurrence relation \( \Pr [E_k] \ge (w_\mathrm {max}/\alpha ) \cdot (\Pr [E_{k-1}] - \varepsilon ) \), with \(\Pr [E_{0}] = 1\). It follows that

$$\begin{aligned} \Pr [E_k] \ge \Big ( \frac{w_\mathrm {max}}{\alpha }\Big )^{k} - \varepsilon \cdot \sum _{i=1}^{k} \Big (\frac{w_\mathrm {max}}{\alpha }\Big )^i > \Big ( \frac{w_\mathrm {max}}{\alpha }\Big )^{k} - \varepsilon \cdot \frac{w_\mathrm {max}}{\alpha - w_\mathrm {max}} \; . \end{aligned}$$
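For completeness, the unrolling of the recurrence can be checked by induction. Writing \(x = w_\mathrm {max}/\alpha \) and \(p_k = \Pr [E_k]\) (shorthand introduced only for this remark), the base case is \(p_0 = 1\), and the inductive step is

$$\begin{aligned} p_{k+1}&\ge x\cdot (p_k - \varepsilon )\\&\ge x\cdot \Big ( x^{k} - \varepsilon \cdot \sum _{i=1}^{k} x^i - \varepsilon \Big )\\&= x^{k+1} - \varepsilon \cdot \sum _{i=1}^{k+1} x^i \; . \end{aligned}$$

The strict inequality then comes from the geometric series, \(\sum _{i=1}^{k} x^i < x/(1-x) = w_\mathrm {max}/(\alpha - w_\mathrm {max})\), which assumes \(\alpha > w_\mathrm {max}\) so that \(x < 1\).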

Now, let \(\mathcal {E}^*\) be the sequence sampler that randomly selects \(j_k\) with \(k r/w_\mathrm {max}< j_{k} \le kr/w_\mathrm {max}+ \alpha /w_\mathrm {max}\) for all \(k < (w_\mathrm {max}+ 1)\cdot \alpha /w_\mathrm {max}\). \(\mathcal {E}^*\) then outputs the sequence \((w_i)\) where \(w_i = w_\mathrm {max}/(w_\mathrm {max}+ 1)\) if \(i = j_k\) for some k and \(w_i = 0 \) otherwise. Suppose the event \(E_{k^*}\) occurs where \(k^* = (w_\mathrm {max}+ 1)\cdot (\alpha -1)/w_\mathrm {max}+1\). Then, for all \(k\le k^*\), the \(j_k\)-th ball leaves before the \(j_{k+1}\)-st ball enters. In particular, \(\mathcal {E}^*, \mathcal {A}_0\) win \(\mathsf {SGAME}\). Therefore,

$$\begin{aligned} \varepsilon \ge \Pr [E_{k^*}] > \Big ( \frac{w_\mathrm {max}}{\alpha }\Big )^{k^*} - \varepsilon \cdot \frac{w_\mathrm {max}}{\alpha - w_\mathrm {max}} \; . \end{aligned}$$

It follows that

$$\begin{aligned} \alpha > \frac{w_\mathrm {max}}{w_\mathrm {max}+ 1} \cdot \frac{\log _e(1/\varepsilon )-1}{\log _e\log _e(1/\varepsilon )+1} \end{aligned}$$

provided that \(\varepsilon < 1/e\).

It is easy to see that \(\mathcal {A}_k^*\) and \(\mathcal {E}^*\) run in time \(O(q (t_\mathcal {SC}+ \log q))\), and the result follows. \(\square \)

Appendix 2: Construction of Constant-Rate Scheduler and Proof of Theorem 5

We first notice that Fortuna’s scheduler can be easily modified to use a different base. In particular, for any integer \(b\ge 2\), we define a keyless scheduler, \(\mathcal {SC}_b\). Roughly, \(\mathcal {SC}_b\) has \(P_b \approx \log _b q\) pools, numbered \(0, \ldots , P_b - 1\). The state \(\tau \in \{0,\ldots , q-1 \}\) will just be a counter. The pools are filled in turn, and pool i is emptied whenever the counter \(\tau \) is divisible by \( b^i \cdot P_b\) but not by \(b^{i+1} \cdot P_b\).
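The rough description can be sketched in a few lines. This is not the construction of Fig. 10, only the basic idea before the optimizations; the function name is hypothetical, and two details are assumptions made here: nothing is emptied at \(\tau = 0\), and the pool index is capped at the top pool \(P_b - 1\).

```python
def basic_scheduler_step(tau: int, b: int, num_pools: int):
    """One step of the rough base-b scheduler.

    Returns (in_pool, out_pool, next_tau); out_pool is None when no pool
    is emptied at this step.
    """
    in_pool = tau % num_pools  # pools are filled in turn
    out_pool = None
    if tau > 0 and tau % num_pools == 0:
        m = tau // num_pools
        i = 0
        # Find the largest i such that b^i * num_pools divides tau,
        # capped at the highest pool index (an assumption of this sketch).
        while m % b == 0 and i < num_pools - 1:
            m //= b
            i += 1
        out_pool = i
    return in_pool, out_pool, tau + 1

if __name__ == "__main__":
    tau = 0
    for _ in range(13):
        in_pool, out_pool, tau = basic_scheduler_step(tau, b=2, num_pools=3)
        print(tau - 1, in_pool, out_pool)
```

With \(b = 2\) and three pools, pool 0 is emptied at \(\tau = 3\), pool 1 at \(\tau = 6\), and pool 2 at \(\tau = 12\), so higher pools accumulate for exponentially longer before being used.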

Our actual construction will be slightly more involved than the above, but it is simply an optimized version of this basic idea. In particular, we make four changes:

  1. We account for \(w_\mathrm {max}\) by emptying pools when \(\tau \) is divisible by \(b^i \cdot P_b/w_\mathrm {max}\), instead of just \(b^i \cdot P_b\).

  2. We use slightly fewer than \(\log _b q\) pools, setting \(P_b = \log _b q - \log _b \log _b q - \log _b (1/w_\mathrm {max})\).

  3. We do not empty the 0th pool twice in a row. (While this never comes up when \(b=2\), it is an issue for \(b\ge 3\).)

  4. If pool j will next be emptied sooner than pool i and \(j > i\), we fill pool j instead of pool i. (This captures the idea of emptying multiple pools at once from Sect. 6.)

For simplicity, we assume that \(\log _b \log _b q\) and \(\log _b (1/w_\mathrm {max})\) are both integers, and we let \(P_b = \log _b q - \log _b \log _b q - \log _b (1/w_\mathrm {max}) \). Then, we define \(\mathcal {SC}_b\) as in Fig. 10.

Fig. 10 Our keyless scheduler construction

Theorem 5 shows that this scheme achieves a very good competitive ratio of \(r_b \approx b P_b\). In Appendix 1, we show a lower bound in the constant-rate case of \(r> \log _eq - \log _e\log _eq - \log _e(1/w_\mathrm {max}) -1 \) (or \(r > P_e-1\) in slightly abused notation), so this result is very close to optimal.
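The comparison between the approximate ratio \(b P_b\) and the lower bound \(P_e - 1\) can be evaluated numerically. The sketch below uses hypothetical function names and arbitrary sample parameters, and it treats \(P_b\) as a real number rather than insisting that the logarithms be integers.

```python
import math

def pools(q: float, b: float, w_max: float) -> float:
    """P_b = log_b q - log_b log_b q - log_b(1/w_max); b = e gives P_e."""
    def log_b(x: float) -> float:
        return math.log(x) / math.log(b)
    return log_b(q) - log_b(log_b(q)) - log_b(1.0 / w_max)

def compare(q: float, b: float, w_max: float):
    """Returns (b * P_b, P_e - 1): approximate ratio and lower bound."""
    approx_ratio = b * pools(q, b, w_max)
    lower_bound = pools(q, math.e, w_max) - 1.0
    return approx_ratio, lower_bound

if __name__ == "__main__":
    print(compare(2**16, 2, 0.25))
```

For these sample values \(P_2 = 16 - 4 - 2 = 10\), so the approximate ratio is 20, against a lower bound of roughly 6.3.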

Proof of Theorem 5

Note that \(\mathcal {E}\) must output a constant sequence, \((w, \ldots , w)\) with \(r_b/q \le w\le w_\mathrm {max}\). (If \(w< r_b/q\), then we win by default.) We assume without loss of generality that \(1/w\) is an integer.

We first handle the case when \(w> w_\mathrm {max}/b\). Note that no pool is emptied more than once every \(\frac{b}{w_\mathrm {max}} \cdot P_b\) steps and at least one pool is emptied every \(\frac{b-1}{w_\mathrm {max}} \cdot P_b\) steps. So, if \(w> w_\mathrm {max}/b\), \(\mathcal {SC}_b\) wins as soon as the first pool is emptied after \(\frac{1}{w} \cdot P_b\) steps, in time at most \((\frac{1}{w}+\frac{b-1}{w_\mathrm {max}})\cdot P_b\). It therefore achieves a competitive ratio of less than \((1+(b-1)\cdot \frac{w}{w_\mathrm {max}})\cdot P_b \le b P_b\).

Now, assume \(w\le w_\mathrm {max}/b\).

Let \(i\ge 1\) be such that \(\frac{b^{i+1}-1}{b-1} \ge w_\mathrm {max}/w> \frac{b^{i}-1}{b-1}\). Consider the first time a pool whose index is at least i is emptied. If it is full on this first emptying, then \(\mathcal {SC}_b\) wins, in time at most \(b^{i}\cdot P_b/w_\mathrm {max}\). Otherwise, let \(T^*\) be the first time such a pool is emptied. Then, \(\mathcal {SC}_b\) wins the next time a pool whose index is greater than i is emptied, at time \(T^* + b^i\cdot P_b/w_\mathrm {max}\). In both cases, \(\mathcal {SC}_b\) achieves a competitive ratio of at worst \(r_b =w\cdot (T^* + b^i\cdot P_b/w_\mathrm {max})\).

We wish to bound \(T^*\). Let j be such that \(b^{j+1}>w_\mathrm {max}\cdot T^*/P_b \ge b^{j}\). Then, at time \(T^*\) the pool that is emptied has weight at least

$$\begin{aligned} w\cdot \lfloor T^*/P_b\rfloor + \frac{w}{w_\mathrm {max}} \cdot \sum _{k=0}^j b^k>&w\cdot \Big ( \frac{T^*}{P_b} + \frac{1}{w_\mathrm {max}}\cdot \frac{b^{j+1}-1}{b-1} - 1 \Big )\\> & {} w\cdot \Big ( \frac{T^*}{P_b} + \frac{1}{w_\mathrm {max}}\cdot \frac{w_\mathrm {max}\cdot \frac{T^*}{P_b} - 1}{b-1} - 1 \Big )\\= & {} \frac{w\cdot T^*}{P_b} \cdot \frac{b}{b-1} - w\cdot \frac{1 + (b - 1)\cdot w_\mathrm {max}}{ (b-1)\cdot w_\mathrm {max}} \; . \end{aligned}$$

Note that the above weight is less than one by hypothesis. Applying this and rearranging,

$$\begin{aligned} w\cdot T^* < \frac{w+ (1+w)(b-1)\cdot w_\mathrm {max}}{b \cdot w_\mathrm {max}}\cdot P_b \; . \end{aligned}$$

Plugging in and recalling that \(w\le w_\mathrm {max}/b\) and \(w_\mathrm {max}/w> \frac{b^{i}-1}{b-1}\),

$$\begin{aligned} r_b&< \frac{w+ (1+w)(b-1)\cdot w_\mathrm {max}}{b \cdot w_\mathrm {max}}\cdot P_b + \frac{w}{w_\mathrm {max}} \cdot \Big ((b-1) \cdot \frac{w_\mathrm {max}}{w} + 1 \Big ) \cdot P_b\\&\le \Big (\frac{(1+w_\mathrm {max}/b)(b-1) }{b } + \frac{1}{b^2}\Big )\cdot P_b + \Big (b-1 + \frac{1}{b} \Big ) \cdot P_b\\&= \Big (b + \frac{w_\mathrm {max}}{b} + \frac{1-w_\mathrm {max}}{b^2} \Big )\cdot P_b \end{aligned}$$

The result follows. \(\square \)
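The two cases of the proof end in closed-form bounds that are easy to evaluate. The following sketch, with hypothetical function names, takes \(P_b\) as an input and returns the bound for the case \(w> w_\mathrm {max}/b\) and the final bound from the last display.

```python
def case1_ratio_bound(b: int, w: float, w_max: float, p_b: float) -> float:
    """Bound (1 + (b - 1) * w / w_max) * P_b for the case w > w_max / b."""
    if not (w_max / b < w <= w_max):
        raise ValueError("case 1 assumes w_max / b < w <= w_max")
    return (1.0 + (b - 1) * w / w_max) * p_b

def final_ratio_bound(b: int, w_max: float, p_b: float) -> float:
    """Bound (b + w_max / b + (1 - w_max) / b^2) * P_b from the last display."""
    return (b + w_max / b + (1.0 - w_max) / b**2) * p_b

if __name__ == "__main__":
    # Arbitrary sample values: b = 2, w_max = 1/4, P_b = 10.
    print(case1_ratio_bound(2, 0.25, 0.25, 10.0))
    print(final_ratio_bound(2, 0.25, 10.0))
```

For \(b = 2\), \(w_\mathrm {max}= 1/4\) and \(P_b = 10\), the first bound is at most \(b P_b = 20\) and the second evaluates to 23.125, so the second case dominates.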

Appendix 3: Recovering and Preserving Security

1.1 Recovering Security

We consider the following security game with an attacker \(\mathcal {A}\), a sampler \(\mathcal {D}\), and bounds \(q_\mathcal {D}, \gamma ^*\).

  • \(\mathcal {D}\) sends \(J \subset \{1,\ldots , q_\mathcal {D}\}\) to the challenger.

  • The challenger chooses a seed \(\mathsf {seed}\mathop {\leftarrow }\limits ^{\$}\mathsf {setup}\), and a bit \(b \mathop {\leftarrow }\limits ^{\$}\{0,1\}\) uniformly at random. It sets \(\sigma _0 :=0\). For \(k=1,\ldots ,q_\mathcal {D}\), the challenger computes

    $$\begin{aligned} (\sigma _{k}, I_{k},\gamma _{k},z_{k}) \leftarrow \mathcal {D}(\sigma _{k-1}). \end{aligned}$$
  • The attacker \(\mathcal {A}\) gets \(\mathsf {seed}\), J, and \(\gamma _1,\ldots ,\gamma _{q_\mathcal {D}}, z_1,\ldots , z_{q_\mathcal {D}}\). It gets access to an oracle \(\mathsf {get}\text {-}\mathsf {refresh}()\), which initially sets \(k:=0\) and on each invocation increments \(k:=k+1\) and outputs \(I_k\). At some point the attacker \(\mathcal {A}\) outputs a value \(S_0 \in \{0,1\}^n\), an integer d, and \(I_j^*\) for \(j \in J\) such that \(k+d \le q_\mathcal {D}\) and

    $$\begin{aligned} \sum _{\begin{array}{c} k < j \le k+d\\ j \notin J \end{array}} \gamma _j \ge \gamma ^* \; . \end{aligned}$$
  • For \(j=k+1,\ldots ,k+d\), the challenger computes

    $$\begin{aligned} S_j \leftarrow \left\{ \begin{array}{lr} \mathsf {refresh}(S_{j-1}, I_{j}) &{} : j \notin J\\ \mathsf {refresh}(S_{j-1},I_j^*) &{} : j \in J \end{array} \right. \; . \end{aligned}$$

    If \(b=0\) it sets \((S^*, R) \leftarrow \mathsf {next}(S_d)\) and if \(b=1\) it sets \((S^*,R) \leftarrow \{0,1\}^{n + \ell }\) uniformly at random. The challenger gives \(I_{k+d+1},\ldots ,I_{q_\mathcal {D}}\), and \((S^*,R)\) to \(\mathcal {A}\).

  • The attacker \(\mathcal {A}\) outputs a bit \(b^*\).

Definition 6

(Recovering Security) We say that a PRNG with input has \((t, q_\mathcal {D}, \gamma ^*, \varepsilon )\)-recovering security if for any attacker \(\mathcal {A}\) and legitimate sampler \(\mathcal {D}\), both running in time t, the advantage of \(\mathcal {A}\) in the above game with parameters \(q_\mathcal {D},\gamma ^*\) is at most \(\varepsilon \).
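The challenger's final phase in the game above can be sketched as code. Everything concrete here is an assumption made for illustration: `toy_refresh` and `toy_next` are hypothetical SHA-256 stand-ins rather than the paper's \(\mathsf {refresh}\) and \(\mathsf {next}\), the seed is omitted, and the state length is fixed at 32 bytes.

```python
import hashlib
import secrets

N_BYTES = 32  # toy state and output length; an assumption, not from the text

def toy_refresh(state: bytes, sample: bytes) -> bytes:
    """Hypothetical stand-in for refresh: hash the state with the new input."""
    return hashlib.sha256(b"refresh|" + state + sample).digest()

def toy_next(state: bytes):
    """Hypothetical stand-in for next: returns (new state S*, output R)."""
    return (hashlib.sha256(b"state|" + state).digest(),
            hashlib.sha256(b"output|" + state).digest())

def recovering_challenge(S0, samples, gammas, J, forged, k, d, gamma_star, b):
    """samples[j-1] = I_j and gammas[j-1] = gamma_j; forged[j] = I_j^* for j in J.

    Checks the attacker's constraints, refreshes the state with inputs
    k+1..k+d, and returns the pair (S*, R) shown to the attacker.
    """
    if k + d > len(samples):
        raise ValueError("need k + d <= q_D")
    fresh = sum(gammas[j - 1] for j in range(k + 1, k + d + 1) if j not in J)
    if fresh < gamma_star:
        raise ValueError("honest inputs carry less than gamma* entropy")
    S = S0
    for j in range(k + 1, k + d + 1):
        S = toy_refresh(S, forged[j] if j in J else samples[j - 1])
    if b == 0:
        return toy_next(S)  # real output
    return secrets.token_bytes(N_BYTES), secrets.token_bytes(N_BYTES)  # uniform
```

Note that only the entropy of inputs with \(j \notin J\) counts towards \(\gamma ^*\), since the attacker controls the remaining ones.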

1.2 Preserving Security

We define preserving security exactly as in [5]. Intuitively, it says that if the state \(S_0\) starts uniformly random and uncompromised and is then refreshed with arbitrary (adversarial) samples \(I_1,\ldots ,I_d\) resulting in some final state \(S_d\), then the output \((S^*,R) \leftarrow \mathsf {next}(S_d)\) looks indistinguishable from uniform.

  • The challenger chooses an initial state \(S_0 \leftarrow \{0,1\}^n\), a seed \(\mathsf {seed}\leftarrow \mathsf {setup}\), and a bit \(b \leftarrow \{0,1\}\) uniformly at random.

  • The attacker \(\mathcal {A}\) gets \(\mathsf {seed}\) and specifies an arbitrarily long sequence of values \(I_1,\ldots ,I_d\) with \(I_j \in \{0,1\}^n\) for all \(j \in [d]\).

  • The challenger sequentially computes

    $$\begin{aligned} S_j = \mathsf {refresh}(S_{j-1}, I_j, \mathsf {seed}) \end{aligned}$$

    for \(j=1,\ldots ,d\). If \(b=0\) the attacker is given \((S^*,R) = \mathsf {next}(S_d)\) and if \(b=1\) the attacker is given \((S^*,R) \leftarrow \{0,1\}^{n + \ell }\).

  • The attacker outputs a bit \(b^*\).

Definition 7

(Preserving Security) A PRNG with input has \((t,\varepsilon )\)-preserving security if for any attacker \(\mathcal {A}\) running in time t, the advantage of \(\mathcal {A}\) in the above game is at most \(\varepsilon \).
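One round of the preserving game can be sketched in the same spirit. The hash-based `toy_refresh` and `toy_next` are again hypothetical stand-ins, the 16-byte seed and 32-byte state are arbitrary choices, and the attacker is modelled as two callbacks whose names are invented for this illustration.

```python
import hashlib
import secrets

N_BYTES = 32  # toy state and output length; an assumption, not from the text

def toy_refresh(state: bytes, sample: bytes, seed: bytes) -> bytes:
    """Hypothetical stand-in for refresh(S, I, seed)."""
    return hashlib.sha256(seed + state + sample).digest()

def toy_next(state: bytes):
    """Hypothetical stand-in for next: returns (new state S*, output R)."""
    return (hashlib.sha256(b"state|" + state).digest(),
            hashlib.sha256(b"output|" + state).digest())

def preserving_game(choose_inputs, guess_bit) -> bool:
    """Runs one round and returns True iff the attacker's guess b* equals b.

    choose_inputs(seed) returns the attacker's samples I_1, ..., I_d.
    guess_bit(seed, S_star, R) returns the attacker's guess, 0 or 1.
    """
    S = secrets.token_bytes(N_BYTES)  # S_0 is uniform and hidden
    seed = secrets.token_bytes(16)
    b = secrets.randbelow(2)
    for sample in choose_inputs(seed):
        S = toy_refresh(S, sample, seed)
    if b == 0:
        S_star, R = toy_next(S)
    else:
        S_star, R = secrets.token_bytes(N_BYTES), secrets.token_bytes(N_BYTES)
    return guess_bit(seed, S_star, R) == b
```

Repeating the round many times and measuring how far the success rate departs from one half would give an empirical estimate of the advantage for a particular attacker, which is of course no substitute for the bound \(\varepsilon \) over all attackers.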

1.3 Modified Composition Theorem

With these modified definitions, [5]’s proof of their composition theorem immediately extends to handle semi-adaptive \(\mathsf {set}\text {-}\mathsf {refresh}\) queries.

Theorem 8

Assume that a PRNG with input has both \((t, \varepsilon _{p})\)-preserving security and \((t, q_\mathcal {D}, \gamma ^*, \varepsilon _{r})\)-recovering security as defined above. Then, it is \(((t',q_\mathcal {D}, q_R, q_S), \gamma ^*, q_R(\varepsilon _r + \varepsilon _p))\)-robust in the semi-adaptive \(\mathsf {set}\text {-}\mathsf {refresh}\) model where \(t' \approx t\).

Cite this article

Dodis, Y., Shamir, A., Stephens-Davidowitz, N. et al. How to Eat Your Entropy and Have it Too: Optimal Recovery Strategies for Compromised RNGs. Algorithmica 79, 1196–1232 (2017). https://doi.org/10.1007/s00453-016-0239-3

  • DOI: https://doi.org/10.1007/s00453-016-0239-3
