Provable Second Preimage Resistance Revisited
Abstract
Most cryptographic hash functions are iterated constructions, in which a mode of operation specifies how a compression function or a fixed permutation is applied. The Merkle-Damgård mode of operation is the simplest and most widely deployed, yet it suffers from generic second preimage attacks, even when the compression function is ideal.
In this paper we focus on provable security against second preimage attacks. Based on the study of several existing constructions, we describe simple properties of modes of operation and show that they are sufficient to allow some form of provable security, first in the random oracle model and then in the standard model. Our security proofs are extremely simple. We show for instance that the claims of the designers of Haifa regarding second preimage resistance are valid.
Lastly, we give arguments that proofs of second preimage resistance by a black-box reduction incur an unavoidable security loss.
Keywords
Hash function · Second preimage resistance · Security proof · Unavoidable security loss · Black-box reductions

1 Introduction
 1.
Collision adversaries up to about \(2^{n/2}\) queries
 2.
Preimage adversaries up to about \(2^{n}\) queries
 3.
Second-preimage adversaries up to about \(2^{n}\) queries
The Merkle-Damgård mode of operation therefore enjoys a form of provable security, since the whole hash function is not less secure than the compression function with respect to collision adversaries. This allows hash function designers to focus on designing collision-resistant compression functions, arguably an easier task than designing a full-blown hash function. A comparable result holds for preimage resistance, since a preimage on the full hash function would lead to a pseudo-preimage on the compression function.
The situation is however not as good for the remaining classical security notion, namely second preimage resistance. In fact, it turned out that the Merkle-Damgård iteration of a secure compression function is not as secure as the compression function itself: in 2005, Kelsey and Schneier described an attack [12] that finds a second preimage of an \(\ell \)-block message with \(2^{n}/\ell \) evaluations of the compression function, even if it is ideal (i.e., a public random function).
The existence of several generic attacks [10, 11, 12] demonstrated that there was definitely a problem with the Merkle-Damgård construction and motivated further research; new modes of operation have emerged. It also motivated hash function designers to provide proofs that their mode of operation is sound and does not suffer from generic attacks.
An elegant solution, both theoretically and practically appealing, is the wide-pipe hash proposed by Lucks in 2005 [13]. The underlying idea is simple: make the internal state twice as big as the output. This makes the construction provably resistant to second preimage attacks in the standard model, because a second preimage on the iteration yields either an \(n\)-bit second preimage or a \(2n\)-bit collision on the compression function. This construction is also very practical, and it is implemented by 4 out of the 5 SHA-3 finalists. However, the memory footprint of a wide-pipe construction is at least twice as big as that of Merkle-Damgård, so in cases where memory is restricted, it would be beneficial to have a “narrow-pipe” mode of operation.
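As an illustration, here is a minimal Python sketch of the wide-pipe idea (our own toy construction, not Lucks's exact specification): the internal state is \(2n\) bits wide and is truncated to \(n\) bits only at the very end, with SHA-512 standing in for an ideal \(2n\)-bit compression function.

```python
import hashlib

N = 32       # n = 256 bits: output size, in bytes
BLOCK = 64   # message block size, in bytes

def wide_pipe_hash(message: bytes) -> bytes:
    # simplified strengthened padding: a 1-bit, zeros, then a length block
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    padded += (8 * len(message)).to_bytes(BLOCK, "big")
    state = b"\x00" * (2 * N)               # 2n-bit internal state
    for i in range(0, len(padded), BLOCK):
        # SHA-512 plays the role of a 2n-bit compression function
        state = hashlib.sha512(state + padded[i:i + BLOCK]).digest()
    return state[:N]                        # truncate to n bits at the end
```

A second preimage on this toy function yields either an \(n\)-bit second preimage on the final truncation or a \(2n\)-bit collision in the inner state, which is the intuition behind the security proof.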
In this paper, we focus on narrow-pipe\(^{1}\) modes of operation, where several questions remain unanswered. For instance, the exact resistance of the Merkle-Damgård construction to generic second preimage attacks is in fact unknown. Existing attacks give an upper bound above the birthday paradox, and the fact that a second preimage is also a collision gives a birthday lower bound. The generic second preimage security of Merkle-Damgård is thus known to lie somewhere between \(2^{n/2}\) and \(2^{n}/\ell \) queries, for messages of \(\ell \) blocks.
Our Goal and Our Results. The objective of this paper is to describe very simple conditions that, when satisfied by a narrow-pipe mode of operation, are sufficient to provide some form of provable resistance against second preimage attacks beyond the birthday bound.
Provable security against second preimage attacks comes in several flavors. One possible setting to discuss the security of a mode of operation is the random oracle model, i.e., assuming that the compression function is a public random function. Proofs that no second preimage attack can exist under the assumption that the compression function is a random oracle show that the mode of operation is immune to generic attacks, i.e., attacks that target the mode of operation itself and thus work for any compression function. The second preimage attacks of Kelsey-Schneier and of Andreeva et al. [2] are generic attacks.
We show that a simple tweak to the Merkle-Damgård mode is sufficient to prevent all generic second preimage attacks. This modification, namely the inclusion of a round counter, is one of the distinctive features of Haifa. Biham and Dunkelman proposed Haifa in 2006 [8] as a collection of tweaks to the original Merkle-Damgård mode of operation; they claimed a security level of \(2^n\) against second preimage adversaries, without providing proofs. We thus show that their claim is valid.
The assumption that hash functions, or components thereof, are random is strong and unrealistic enough to make some uncomfortable, so we would like to get rid of it. Constructions of keyed hash functions provably achieving a form of second preimage resistance without relying on the existence of public random functions, but instead based on the hardness of a general assumption, have been known for quite a while [9, 15], under the name of Universal One-Way Hash Functions (UOWHFs). Later on, modes of operation of keyed hash functions that promote a form of second preimage resistance from the compression function to the whole construction have been designed [4, 17].
The security of the latter modes of operation is established by a black-box reduction, namely an algorithm that turns a successful attacker against the hash function into a (somewhat less) successful attacker against the compression function. Thus, the iteration remains secure, up to some level, as long as the compression functions are themselves secure.
Inspired by these constructions, we again isolate a specific property of modes of operation which is sufficient to provide this kind of “reductionist” security, without heavy assumptions on the compression function. This feature is, again, simple: given a bit string \(x\), it must be possible to forge a message \(M\) such that \(f(x)\) is evaluated while computing \(H^f(M)\). We then describe a “generic” reduction that relies solely on this property to show that a mode of operation promotes the second preimage resistance of the compression function. This proof is, to some extent, an abstraction of the security proofs of several existing schemes.
Lastly, we observe that in all these proofs of second preimage security by reduction there is always a security loss proportional to the size of hashed messages (i.e., security is guaranteed up to a level of \(2^n / \ell \) where \(\ell \) denotes the size of hashed messages). We give arguments hinting that this security loss is unavoidable, and is caused by the proof technique itself.
Organisation of the Paper. In Sect. 2 we recall the security notions we are concerned with. Then in Sect. 3 we introduce a generic narrow-pipe mode of operation, and we show that all the particular constructions that we consider are instances of this generic framework. In Sect. 4 we discuss the generic attacks that apply to the known provably second-preimage resistant constructions we consider, and we show how to make them immune to these attacks. Lastly, in Sect. 5 we show our main result, namely that the security loss in the security proofs is unavoidable.
2 Definitions
We recall the definitions of the usual second preimage notions. The Spr notion is folklore and applies to unkeyed hash functions, while the Sec and eSec security notions were defined in [16] and apply to families of hash functions indexed by a key.
 Spr

The adversary receives a (random) challenge \(M\) and has to find a second message \(M^{\prime }\) such that \(H(M) = H(M^{\prime })\) with \(M \ne M^{\prime }\). The advantage of the adversary is her success probability (taken over the random coins used by the adversary and the random choice of the challenge).
 Sec

The adversary receives a random challenge message and a random key, and she has to produce a colliding message for the given key. The advantage is the success probability of the adversary (over the random coins used by the adversary and the random choice of the challenge).
 eSec

The adversary chooses the challenge message. Then, she receives a random key and has to find a colliding message under this key. The advantage is the maximum taken over the choice of \(M\) by the adversary of her success probability (taken over the random coins used and the random choice of the key).
Historically, eSec-secure hash function families have been called Universal One-Way Hash Functions (UOWHFs). It must be noted that a Sec-adversary can be used to win the eSec security game (just generate the challenge message randomly first). Therefore, if \(H^{(\cdot )}\) is eSec-secure, then it is also Sec-secure.
Note that the size of the challenges plays an important role in the discussion of second preimage resistance. The known generic attacks are faster when the challenges become longer. For this reason, the second preimage security notions are often parametrized by the size of the challenges. When the challenge consists of an \(\ell \)-block message, the notions are denoted by \(\mathsf{Spr }[\ell ]\), \(\mathsf{Sec }[\ell ]\) and \(\mathsf{eSec }[\ell ]\).
We say that an adversary \((t, \varepsilon )\)-breaks a security notion if it terminates in time \(t\) and wins the game with probability \(\varepsilon \). Let us note a fact that will have some importance later on. In all these notions, the success probability is taken over the random coins of the adversary and over the choice of the challenge. This means that an adversary implementing an attack against “weak messages” or “weak keys” may succeed on a small fraction of the challenge space and fail systematically on non-weak messages, while still having a non-zero advantage. A consequence is that it is not possible to increase the success probability of adversaries against a single challenge by repeating them until they succeed.
How can we compare the efficiency of adversaries that have different running times and success probabilities? If an adversary \((t, \varepsilon )\)-breaks a security notion, then the expected number of repetitions of the experiment defining the notion before the adversary wins is \(1/\varepsilon \). This represents a total of \(t/\varepsilon \) time units, which is a meaningful scale: intuitively, it represents how long one has to wait before the adversary shows what she is capable of. We call the global complexity of an adversary the ratio between its time complexity and its success probability. As an example, notice that the global complexity of exhaustive search is \(2^n\) (for all second preimage notions).
Following the notation in use in the existing literature, we will denote by \(\mathcal {A}_H\) an adversary against an iterated hash function, and by \(\mathcal {A}_f\) an adversary against the corresponding compression function; which one is meant will be clear from the context.
3 Abstract Narrow-Pipe Modes of Operation
Because we would like to state results that are as generic as possible, we introduce a framework of abstract modes of operation, which encompasses all the narrow-pipe modes of operation known to the authors. This framework will enable us to show that our results hold for any mode of operation satisfying a minimum set of conditions.
We will consider that a narrow-pipe mode of operation \(H^{(\cdot )}\) is a circuit that takes as its input \(M\) (the full message), \(K\) (the key, if present), \(h_i\) (the current chaining value) and \(i\) (the block counter). This circuit is responsible for preparing the input to the compression function. The next chaining value \(h_{i+1}\) is the output of the compression function on the input prepared by the circuit. The output of the whole hash function is the output of the compression function on its last invocation. The circuit activates a special “last call” wire to indicate that the hashing process is terminated. We denote by \(\mathfrak {e}: \mathbb {N}\rightarrow \mathbb {N}\) the function that returns the number of calls to the compression function given the size of \(M\). We thus implicitly assume that the number of calls to the compression function does not depend on the input of the hash function (i.e., on \(M\) and \(K\)), but only on the size of \(M\). We are inclined to believe that this restriction is natural.
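This framework can be rendered as a short Python sketch (the names `Mode` and `iterate` are ours, purely for illustration): the mode is a callable playing the role of the circuit, and a generic driver loops until the “last call” flag is raised.

```python
from typing import Callable, Tuple

# The "circuit": given (M, K, h_i, i), prepare the compression-function
# input x_i and raise the "last call" flag on the final invocation.
Mode = Callable[[bytes, bytes, bytes, int], Tuple[bytes, bool]]

def iterate(mode: Mode, f: Callable[[bytes], bytes],
            M: bytes, K: bytes, iv: bytes) -> bytes:
    """Generic driver: the hash output is the output of the compression
    function on its last invocation."""
    h, i = iv, 0
    while True:
        x_i, last = mode(M, K, h, i)
        h = f(x_i)
        if last:
            return h
        i += 1
```

Note that the mode receives the full message \(M\) at each invocation, which is what allows checksum-based designs such as GOST to fit the framework.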
There are constructions that are apparently not narrow-pipe but still fit in this framework, such as the GOST hash function (the checksum can be computed in the last invocation, and does not need to be transmitted between invocations of the compression function). Note that this requires the full message \(M\) to be given to the mode of operation at each invocation.
Note that by choosing \(H^{(\cdot )}\) to be a circuit, we implicitly admit the existence of an upper bound on the size of the messages (if only because the block counter comes on a finite number of wires). In the sequel, by “mode of operation”, we implicitly mean “a narrow-pipe mode of operation that fits the above framework”. This does not seem to be a restriction, as we are not aware of any narrow-pipe construction using a single compression function that does not fit the above definition.
3.1 CollisionResistance Preserving Modes of Operation
While we tried to make the definition of a mode of operation as generic as possible, we are not interested in really bad modes of operation, such as non-collision-resistant constructions. In this section, we characterize a few properties that modes of operation should have in order not to be totally worthless.
We say that a mode of operation is strengthened if the binary encoding of the size of the processed message is contained in the input to the last invocation of the compression function. It is well known that the Merkle-Damgård mode of operation is strengthened, which is the key to establishing its collision-resistance preservation. However, in general, being strengthened is not sufficient to be collision-resistance preserving; some further technicalities are required.
We say that a mode of operation is message-injective if for all functions \(f\) and all keys \(K\), the function that maps the message \(M\) to the sequence of compression-function inputs \(\left( x_i\right) \) is injective. This means that hashing two different messages \(M\) and \(M^{\prime }\) cannot generate the same sequence of inputs \((x_i)\). This property is necessary for collision-resistance preservation: if \(H^{(\cdot )}\) is not message-injective, then there exist a function \(f\) and a key \(K\) under which two distinct messages \(M\) and \(M^{\prime }\) generate the same hash without causing a collision in the compression function.
We also say that a mode of operation is chaining-value-injective if for all \(f\) and all \(K\), there exists a (deterministic) function that maps \(x_i\) to \(h_{i-1}\). The combination of these three properties is sufficient to ensure collision-resistance preservation.
Lemma 1
A mode of operation \(H^{(\cdot )}\) that is simultaneously message-injective, chaining-value-injective and strengthened is collision-resistance preserving.
This lemma is just a restatement of the well-known result of Merkle and Damgård, but we include its proof, because it is a good warm-up, and because it will be useful later on.
Proof

Either \(|M| \ne |M^{\prime }|\). In this case, because \(H^{(\cdot )}\) is strengthened, the inputs to the last invocation of the compression function are not the same when hashing \(M\) and \(M^{\prime }\), and because \(M\) and \(M^{\prime }\) collide, we have found a collision on \(f\) (in its last invocation).

Or \(|M| = |M^{\prime }|\). Suppose that the compression function is invoked \(r = \mathfrak {e}(|M|)\) times in both cases. In this case, there are again two possibilities. Either \(x_r \ne x_r^{\prime }\), and we have a collision since \(h_r = h_r^{\prime }\), or \(x_r = x_r^{\prime }\). By chaining-value-injectivity, we have \(h_{r-1} = h^{\prime }_{r-1}\). The argument repeats: either we find a collision along the way, or we reach the conclusion that \(x_i = x_i^{\prime }\) for all \(i\), which is impossible by message-injectivity. \(\square \)
Because of this lemma, we call a mode \(H^{(\cdot )}\) “collisionresistance preserving” if it satisfies these three conditions.
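The backwards argument in this proof is effectively an algorithm. A small Python sketch (our own illustration, with a toy compression function in the usage below) extracts the colliding pair of compression-function inputs from the two input sequences in the equal-length case:

```python
def extract_collision(xs, xs_prime, f):
    """Given the equal-length sequences of compression-function inputs
    produced while hashing two colliding messages M != M', walk
    backwards as in the proof of Lemma 1 and return a pair (x_i, x_i')
    with x_i != x_i' and f(x_i) == f(x_i')."""
    assert len(xs) == len(xs_prime)
    for x, xp in zip(reversed(xs), reversed(xs_prime)):
        if x != xp:
            # the outputs still agree here: at the last position because
            # H(M) == H(M'), and earlier by chaining-value-injectivity
            assert f(x) == f(xp)
            return x, xp
    raise ValueError("message-injectivity rules out identical sequences")
```

For instance, with the toy compression function `f = lambda x: x[:1]`, the sequences `[b"aa", b"bb", b"cc"]` and `[b"aa", b"bd", b"cc"]` yield the colliding pair `(b"bb", b"bd")`.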
3.2 Some Particular Modes of Operation
Merkle-Damgård. The Merkle-Damgård mode of operation was independently suggested in 1989 by Merkle [14] and Damgård [6]. It is an unkeyed mode of operation, so the circuit \(H^{(\cdot )}\) just ignores the key input. In this mode, the input to the compression function is usually considered to be formed of two parts playing different roles: the chaining value input, on \(n\) bits, and the message block input, on \(m\) bits, the output of the function being \(n\) bits wide.
The padding is usually done by appending a single ‘1’ bit followed by as many ‘0’ bits as needed to complete an \(m\)-bit block that includes the length of \(M\) in bits (the well-known Merkle-Damgård strengthening). However, for the sake of simplicity, we will consider in the sequel a simplified padding scheme: the last block is padded with zeroes, and the message length in bits is included in an extra block.
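A minimal Python rendering of this simplified scheme may make the padding concrete (SHA-256 stands in for the compression function; the constants are our choices, not part of the paper):

```python
import hashlib

M_BYTES = 64   # message block: m = 512 bits
N_BYTES = 32   # chaining value / output: n = 256 bits

def f(h: bytes, block: bytes) -> bytes:
    # stand-in for a compression function {0,1}^n x {0,1}^m -> {0,1}^n
    return hashlib.sha256(h + block).digest()

def md_hash(M: bytes) -> bytes:
    """Simplified padding: zero-pad the last block, then append one
    extra block containing the message length in bits."""
    padded = M + b"\x00" * (-len(M) % M_BYTES)
    padded += (8 * len(M)).to_bytes(M_BYTES, "big")   # strengthening
    h = b"\x00" * N_BYTES                             # fixed IV
    for i in range(0, len(padded), M_BYTES):
        h = f(h, padded[i:i + M_BYTES])
    return h
```

The extra length block is what makes the scheme strengthened: without it, `b"abc"` and `b"abc\x00"` would produce identical padded inputs.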
HAIFA. The HAsh Iterative FrAmework (Haifa), introduced in 2006 by Biham and Dunkelman [8], is a Merkle-Damgård-like construction where a counter and a salt are added to the input of the compression function. In this paper, we consider a simplified version of Haifa (amongst other things, we disregard the salt). For our purposes, the definition we use is of course equivalent. In Haifa, the compression function \(f :\left\{ 0,1\right\} ^n \times \left\{ 0,1\right\} ^m \times \left\{ 0,1\right\} ^{64} \rightarrow \left\{ 0,1\right\} ^n\) takes three inputs: the chaining value, the message block, and the round counter (we arbitrarily limit the number of rounds to \(2^{64}\)). The designers of Haifa claimed that the round counter was sufficient to prevent all generic second preimage attacks.
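In the same toy setting as the Merkle-Damgård sketch, the Haifa tweak amounts to feeding a 64-bit round counter to every compression-function call. This is our simplified sketch, not the real HAIFA specification:

```python
import hashlib

M_BYTES, N_BYTES = 64, 32

def haifa_f(h: bytes, block: bytes, counter: int) -> bytes:
    # f: {0,1}^n x {0,1}^m x {0,1}^64 -> {0,1}^n, SHA-256 as stand-in
    return hashlib.sha256(h + block + counter.to_bytes(8, "big")).digest()

def haifa_hash(M: bytes) -> bytes:
    padded = M + b"\x00" * (-len(M) % M_BYTES)
    padded += (8 * len(M)).to_bytes(M_BYTES, "big")
    h = b"\x00" * N_BYTES
    for i in range(0, len(padded), M_BYTES):
        h = haifa_f(h, padded[i:i + M_BYTES], i // M_BYTES + 1)
    return h
```

Because the counter is part of every \(x_i\), inputs from different rounds can never coincide, which is exactly the “domain separation” property formalized in Sect. 4.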
Shoup’s UOWHF. Shoup’s Universal One-Way Hash Function works just like Merkle-Damgård by iterating an eSec-secure compression function family \(f: {\left\{ 0,1\right\} }^{k} \times {\left\{ 0,1\right\} }^{n} \times {\left\{ 0,1\right\} }^{m} \rightarrow {\left\{ 0,1\right\} }^{n}\) to obtain a (keyed) eSec-secure hash function (i.e., a UOWHF).
The scheme uses a set of masks \(\mu _0, \dots , \mu _{\kappa -1}\) (where \(2^{\kappa }-1\) is the length of the longest possible message), each of which is a random \(n\)-bit string. The key of the whole iterated function consists of the key \(k\) of the compression function and of these masks. The size of the key is therefore logarithmic in the maximal size of the messages that can be hashed. The order in which the masks are applied is defined by a specific sequence: in the \(i\)-th invocation of the compression function, the \(\nu _2(i)\)-th mask is used, where \(\nu _2(i)\) denotes the largest integer \(\nu \) such that \(2^\nu \) divides \(i\). As advertised before, this construction enjoys a form of provable second-preimage security in the standard model: it promotes the eSec security of the compression function to that of the whole hash function.
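The mask schedule \(\nu _2\) is easy to compute with a standard bit trick. The sketch below (our own illustration, with SHA-256 standing in for the keyed compression function) also shows how the masks enter the iteration:

```python
import hashlib

def nu2(i: int) -> int:
    """nu_2(i): exponent of the largest power of two dividing i >= 1."""
    return (i & -i).bit_length() - 1

def shoup_hash(masks, blocks, iv):
    """Sketch of Shoup's iteration: in the i-th invocation (1-based),
    the chaining value is XORed with mask number nu2(i) before the
    compression function is applied."""
    h = iv
    for i, block in enumerate(blocks, start=1):
        masked = bytes(a ^ b for a, b in zip(h, masks[nu2(i)]))
        h = hashlib.sha256(masked + block).digest()  # stand-in for f_k
    return h
```

For the first eight invocations the schedule reads \(0, 1, 0, 2, 0, 1, 0, 3\): mask 0 is used in every odd invocation, mask 1 every fourth, and so on.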
Theorem 1 [17]
Let \(H^{(\cdot )}\) denote Shoup’s mode of operation. If an adversary is able to break the \(\mathsf{eSec }[\ell ]\) notion of \(H^f\) with probability \(\varepsilon \) in time \(T\), then one can construct an adversary that breaks the eSec notion of \(f\) in time \(T + \mathcal {O}\left( \ell \right) \), with probability \(\varepsilon / \ell \).
The Backwards Chaining Mode. Andreeva and Preneel described in [3] the Backwards Chaining Mode (BCM), which promotes the second-preimage resistance of an unkeyed compression function to the Sec notion of the (keyed) full hash function. We will assume for the sake of simplicity that the message blocks and the chaining values have the same size. The iteration is keyed, and the key is formed by a triplet \((K_0, K_1, K_2)\) of \(n\)-bit strings (note that the size of the key is independent of the size of the messages).
This construction also enjoys a form of provable second-preimage security in the standard model. It promotes the Spr security of the compression function to the Sec security of the whole hash function.
Theorem 2 [3]
Let \(H^{(\cdot )}\) denote the BCM mode of operation. If an adversary is able to break the \(\mathsf{Sec }[\ell ]\) notion of \(H^f\) with probability \(\varepsilon \) in time \(T\), then one can construct an adversary that breaks the Spr notion of \(f\) in time \(T+\mathcal {O}\left( \ell \right) \), with probability \(\varepsilon / \ell \).
The Split Padding. Yasuda introduced the split padding in 2008 [18], as a minor but clever tweak to the Merkle-Damgård strengthening. For the sake of simplicity, we will assume that the message block is twice as big as the chaining value (i.e., it is \(2n\) bits wide). The tweak ensures that any message block going into the compression function contains at least \(n\) bits from the original message (which is not necessarily the case in the last block of the usual Merkle-Damgård padding scheme).
It promotes a kind of eSec security of the compression function to the Spr security of the (unkeyed) iteration. More precisely, the security notion required of the compression function is the following: the adversary chooses a chaining value \(h\) and the first \(n\) bits \(m_1\) of the message block, and is then challenged with the last \(n\) bits \(m_2\) of the message block. She has to find a new pair \((h^{\prime },m^{\prime }) \ne (h, m_1 \Vert m_2)\) such that \(f(h, m_1 \Vert m_2) = f(h^{\prime },m^{\prime })\). To some extent, this is the eSec security notion, but here the “key” of the compression function is the last \(n\) bits of the message block.
Theorem 3 [18]
Let \(H^{(\cdot )}\) denote the split padding mode of operation. If an adversary is able to break the \(\mathsf{Spr }[\ell ]\) notion of \(H^f\) with probability \(\varepsilon \) in time \(T\), then one can construct an adversary that breaks the eSec-like notion of \(f\) in time \(T+\mathcal {O}\left( \ell \right) \), with probability \(\varepsilon / \ell \).
4 How to Make Your Mode of Operation Resistant Against Second Preimage Attacks?
In this section, we describe two simple properties of modes of operation, and we show that these properties allow some kind of security result against second preimage adversaries.
4.1 Resistance Against Generic Attacks
Generic attacks are attacks against the modes of operation, i.e., attacks that do not exploit any property of the compression function, and that could therefore work regardless of its choice. Generic attacks can therefore break the hash function even if the compression function does not have any weakness, and they could work even if the compression function were a random oracle (a public, perfectly random function).
Symmetrically, an attack against a hash function where the compression is perfectly random is necessarily an attack against the mode of operation (since it is impossible to break a perfectly random function).
We will therefore follow the existing literature [1, 2, 7, 10, 11, 12] by assuming that the compression function is random. In the random oracle model, the relevant measure of the efficiency of an adversary is the number of queries sent to the random oracle, rather than time. Indeed, adversaries cannot obtain any kind of advantage by computation alone, without querying the random function. In this particular setting, we say that an adversary \((q,\varepsilon )\)-breaks a security notion if she sends at most \(q\) queries to the random oracle and wins with probability at least \(\varepsilon \).
We now show that a very simple criterion, directly inspired by Haifa, is sufficient to obtain an optimal level of provable resistance to generic second preimage attacks.
Definition 1
A mode of operation \(H^{(\cdot )}\) has domain separation if there exists a deterministic algorithm \({\mathbf {idxEx}}\) which, given an input \(x_i\) to the compression function produced when evaluating \(H^f(K,M)\), recovers \(i\), regardless of the choice of \(M\), \(K\) and \(f\).
Amongst all the modes of operation considered above, only Haifa has domain separation: the round counter is part of the input to the compression function. The following theorem shows that Haifa is optimally resistant to generic second preimage attacks, as was claimed by its designers.
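For Haifa, the index extractor is trivial: the counter is simply read back from a fixed position of the compression-function input. A sketch, with an input layout of our own choosing:

```python
def haifa_input(h: bytes, block: bytes, i: int) -> bytes:
    # compression-function input: chaining value, message block, and a
    # 64-bit round counter occupying the last 8 bytes
    return h + block + i.to_bytes(8, "big")

def idx_ex(x: bytes) -> int:
    """idxEx for Haifa: recover the round index i from the input alone,
    regardless of the message, the key and the compression function."""
    return int.from_bytes(x[-8:], "big")
```

The preimages of `idx_ex` partition the input space into the disjoint classes used in the proof of Theorem 4.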
Theorem 4

Let \(H^{(\cdot )}\) be a collision-resistance-preserving mode of operation with domain separation, and let \(f\) be a random oracle. Then any adversary that \((q, \varepsilon )\)-breaks the \(\mathsf{Spr }[\ell ]\) notion of \(H^f\) satisfies \(\varepsilon \le q/2^n\).
Proof
 1.
Either \(|M| \ne |M^{\prime }|\), and because \(H^{(\cdot )}\) is strengthened, the adversary has found a (second) preimage of \(H^f(M)\) for the compression function \(f\). Since \(f\) is a random oracle, each query has a probability \(2^{-n}\) of giving this preimage.
 2.
Or \(M\) and \(M^{\prime }\) have the same size. Because \(H^{(\cdot )}\) is strengthened, injective and extractable, we know (by looking at the proof of Lemma 1) that there exists a collision on \(f\) of the form: $$ f(x_i) = f(x_i^{\prime }) = h_i $$ It is important to notice that the same value of \(i\) occurs in the three members of this equation. The “index extractor” \({\mathbf {idxEx}}\) of the domain separation mechanism can be used to partition the possible inputs to \(f\) into disjoint classes (corresponding to the preimages of integers). In the collision above, \(x_i\) and \(x_i^{\prime }\) belong to the same, “\(i\)-th” class. When submitting a query \(x\) to \(f\), the adversary implicitly chooses the index \(i={\mathbf {idxEx}}(x)\) of the class to which \(x\) belongs. The collision above can only be found if \(f(x) = h_{{\mathbf {idxEx}}(x)}\), meaning that for each query, there is only one target value that ensures victory. Therefore, because \(f\) is a random oracle, each query hits its single target with probability \(2^{-n}\). \(\square \)
4.2 Resistance Against All Attacks
The assumption that the compression function is random is the crux of the proof of the previous result. While it is completely unrealistic, results proved under this assumption still say something meaningful: they show that the mode of operation itself does not exhibit obvious weaknesses, and that the adversaries have to look into the compression function to break the iteration.
Nevertheless, it would be more satisfying to drop this requirement. In that case, the adversary “knows” the source code of the compression function, so she does not need an external oracle interface to evaluate it. The relevant measure of her complexity is thus her running time. We say that an adversary \((t,\varepsilon )\)-breaks a hash function (or a compression function) if she runs in time at most \(t\) and succeeds with probability at least \(\varepsilon \).
To this end, we show that another simple criterion is enough to offer a non-trivial level of security. This criterion is directly inspired by the three constructions with provable security in the standard model discussed above.
Definition 2
Given a mode of operation \(H^{(\cdot )}\) and a compression function \(f\), let \(P(i,y)\) denote the set of pairs \((M,K)\) such that, when evaluating \(H^f(M,K)\), the \(i\)-th input to \(f\) is \(y\) (i.e., \(x_i = y\) in Algorithm 2).
We say that a mode of operation \(H^{(\cdot )}\) allows for embedding if \( P(i,y) \ne \emptyset \) for all \(i\) and \(y\), and if it is computationally easy to sample random elements of \(P(i,y)\).
Shoup’s UOWHF allows for embedding, yet proving it is not so easy. We refer the reader to [17] for the full details, but here is an intuitive version. Controlling the message block in the \(i\)-th iteration is easy, but controlling the chaining value is not so obvious. Clearly, the mask used in the \(i\)-th iteration must be chosen carefully, but the problem is that choosing it will also randomize the output of the previous iterations. The key idea is that between two arbitrary points of the iteration, there is always a mask that is used only once (the one with the greatest index). By choosing this particular mask after all the others, it is possible to control the chaining value at this particular point, regardless of the other masks. This yields a recursive procedure to control the chaining value between the first and the \(i\)-th iterations: the chaining value can be set to (say) zero in the iteration, before the \(i\)-th, where the mask with the greatest index occurs, independently of what happens afterward. Suppose that this mask occurs in iteration \(j\). Then we are left with the problem of controlling the chaining value between the \(j\)-th and the \(i\)-th iterations, a strictly smaller problem, to which the same technique can be applied recursively.
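The key fact (between any two positions, the mask with the greatest index is used only once) is a property of the 2-adic valuation \(\nu _2\): in any interval of consecutive integers, the maximal valuation is attained exactly once. This can be checked exhaustively on small intervals; the helper below is our own illustration:

```python
def nu2(i: int) -> int:
    # exponent of the largest power of two dividing i >= 1
    return (i & -i).bit_length() - 1

def greatest_mask_used_once(a: int, b: int) -> bool:
    """In invocations a..b (inclusive), the largest mask index nu2(i)
    occurs exactly once -- the fact underlying Shoup's embedding."""
    vals = [nu2(i) for i in range(a, b + 1)]
    return vals.count(max(vals)) == 1

# exhaustive check on all small intervals
assert all(greatest_mask_used_once(a, b)
           for a in range(1, 65) for b in range(a, 65))
```

The reason is simple: two distinct multiples of \(2^{\nu }\) enclose a multiple of \(2^{\nu +1}\), so if the maximum were attained twice in the interval, it would not be the maximum.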
The backwards chaining mode easily allows for embedding. To embed in the first block, just set \(K_0\) appropriately. To embed at any other index smaller than \(\ell -1\), just choose \(m_{i}\) and \(m_{i+1}\) with care. Finally, to embed at index \(\ell -1\) or \(\ell \), pick the message at random and choose \(K_1\) and \(K_2\) accordingly (the keys are necessary to embed in the last blocks because of the padding scheme). The split padding does not allow for this definition of embedding, but it allows one to embed \(n\) bits of message block into any compression-function input.
Theorem 5
Let \(H^{(\cdot )}\) be a mode of operation satisfying the hypotheses of Lemma 1 and that additionally allows for embedding.
If an adversary is able to break the \(\mathsf{Sec }[\ell ]\) notion of \(H^f\) with probability \(\varepsilon \) in time \(T\), then one can construct an adversary that breaks the Spr notion of \(f\) in time \(T+\mathcal {O}\left( \mathfrak {e}(\ell ) \right) \), with probability \(\varepsilon / \mathfrak {e}(\ell )\).
Proof
The proof works by exhibiting a reduction \(\mathcal {R}\) that turns an adversary \(\mathcal {A}_H\) against the iteration into an adversary against the compression function. The reduction \(\mathcal {R}\) is described by the pseudocode of Algorithm 2.
The reduction starts by forging a random message \(M\) that “embeds” the challenge \(x\) at a random position \(i\), and then sends it to the adversary \(\mathcal {A}_H\). If the adversary succeeds in producing a second preimage \(M^{\prime }\), then \(M\) and \(M^{\prime }\) collide. If the collision happens exactly at position \(i\), then a second preimage of the challenge \(x\) is readily found.
The running time of the reduction is clearly that of \(\mathcal {A}_H\) plus the time needed to hash both \(M\) and \(M^{\prime }\). Clearly, the size of \(M^{\prime }\) cannot be larger than the running time of \(\mathcal {A}_H\), so the running time of \(\mathcal {R}\) is essentially that of the adversary.
It remains to determine the success probability of the reduction. First of all, the adversary succeeds with probability \(\varepsilon \) on line 3. Note that the challenge fed to \(\mathcal {A}_H\) is uniformly random: the challenge \(x\) given to \(\mathcal {R}\) is supposed to be chosen uniformly at random, and \((M,K)\) is uniformly random amongst the possibilities that place the random block \(x\) at a random position \(i\).
Next, we show that when the adversary \(\mathcal {A}_H\) succeeds, the reduction itself succeeds with probability \(1/\mathfrak {e}(\ell )\). First, we claim that at the beginning of line 11, we have \(x_{\mathfrak {e}(\ell )-j} \ne x^{\prime }_{\mathfrak {e}(\ell ^{\prime })-j}\) and \(f\left( x_{\mathfrak {e}(\ell )-j} \right) = f\left( x^{\prime }_{\mathfrak {e}(\ell ^{\prime })-j} \right) \). The reasoning is exactly the same as in the proof of Lemma 1. This establishes, in passing, the correctness of the reduction.
Finally, we see that the reduction succeeds if and only if \(\mathfrak {e}(\ell )-j = i\). Because \(i\) has been chosen uniformly at random, this happens with probability \(1/\mathfrak {e}(\ell )\), regardless of the value of \(j\) (which is under the control of the adversary). \(\square \)
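Algorithm 2 is not reproduced in this excerpt; the following toy Python sketch mirrors the structure of the reduction for a plain Merkle-Damgård-style iteration with \(\mathfrak{e}(\ell) = \ell\). All names (`f`, `toy_adversary`, `reduction`) are ours, the compression function is deliberately weakened to 8 bits so that a brute-force adversary is feasible, and the challenge is simplified to a message block only.

```python
import hashlib
import random

def f(h, m):
    # Toy 8-bit compression function (hypothetical, for illustration only).
    return hashlib.sha256(bytes([h, m])).digest()[0]

IV = 0

def chain(blocks):
    """All chaining values: h_0 = IV, h_i = f(h_{i-1}, m_i)."""
    hs = [IV]
    for m in blocks:
        hs.append(f(hs[-1], m))
    return hs

def md_hash(blocks):
    return chain(blocks)[-1]

def toy_adversary(M):
    """Second-preimage adversary: brute-forces the last two blocks.
    Feasible only because f has an 8-bit output."""
    target = md_hash(M)
    h = chain(M)[-3]              # chaining value before the last two blocks
    for a in range(256):
        ha = f(h, a)
        for b in range(256):
            if [a, b] != M[-2:] and f(ha, b) == target:
                return M[:-2] + [a, b]
    return None

def reduction(x, ell=8):
    """Embed the challenge block x at a random position i, run the
    adversary, and succeed iff the collision connects exactly at i."""
    i = random.randrange(ell)
    M = [random.randrange(256) for _ in range(ell)]
    M[i] = x                      # embedding (message-block part only)
    M2 = toy_adversary(M)
    if M2 is None or len(M2) != len(M):
        return None
    hs, hs2 = chain(M), chain(M2)
    for j in range(ell):          # locate the connection point
        in1, in2 = (hs[j], M[j]), (hs2[j], M2[j])
        if in1 != in2 and f(*in1) == f(*in2):
            # A second preimage for f -- useful to R only when j == i.
            return (in1, in2) if j == i else None
    return None
```

As in the proof, the reduction only wins when the connection point happens to equal the randomly chosen index \(i\), so over repeated runs it succeeds with probability on the order of \(1/\ell\).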
Discussion. All the proofs of resistance considered above (Theorems 1, 2, 3 and 5) only provide a security level of \(2^n / \ell \). In some cases, this makes perfect sense, because a generic attack of this complexity is applicable. However, such generic attacks could be made impossible by including a counter in the mode of operation, and yet it seems impossible to provide better security proofs in the standard model.
It is then natural to ask whether these security proofs could be improved to reflect the effect of the patch on the security of the schemes. In other terms, we ask whether it is possible to prove, in the standard model, that the patched schemes resist second preimage attacks up to a level of roughly \(2^n\).
The last contribution of this paper is to show that this is in fact impossible with the “usual” proof technique.
5 Unavoidable Security Loss in Black-Box Reductions
The resistance of a mode of operation \(H^{(\cdot )}\) against second preimage attacks in the standard model is often announced by a theorem formulated similarly to the following “typical” result.
Theorem 6 (informal and typical)
There is a black-box reduction \(\mathcal {R}(\cdot , \cdot )\) such that \(\mathcal {R}(f, \mathcal {A}_H)\) is a second-preimage adversary against the compression function \(f\) that \((t+t^{\prime }, \alpha \cdot \varepsilon + \beta )\)-breaks \(f\), for all compression functions \(f\) and all second preimage adversaries \(\mathcal {A}_H\) that \((t,\varepsilon )\)-break \(H^f\).
The reduction is given black-box access to both the adversary and the compression function \(f\), and this is a way of formalizing that the reduction must work for any adversary and any compression function. For the sake of simplicity, in this paper we allow the reduction to issue only one query to the adversary. To some extent, this narrows our study a little, but all the reductions we are aware of (in [3, 17, 18]) fit into this category. Note also that the adversary \(\mathcal {A}_H\) may fail deterministically on a given challenge, so that it is pointless to rerun it again and again to increase its success probability.
Reductions are generally assumed to have to simulate the legitimate input challenge distribution the adversary is normally expecting. In our case, this means that the distribution of the challenges \(M,K\) must be indistinguishable from random. Note that if \(M,K\) were biased, then the adversary could detect that it is “being used”, and fail deterministically. In any case, when we mention the success probability \(\varepsilon \) of the adversary \(\mathcal {A}_H\), we assume that its input distribution is uniformly random.
Now, while our objective is to understand what happens when \(\mathcal {A}_H\) succeeds, it is easier to get a glimpse of what happens when \(\mathcal {A}_H\) fails. In this setting, the reduction is just a randomized Turing machine trying to break the second preimage resistance of an arbitrary black-box function, which cannot be done faster than exhaustive search. For instance, \(f\) could be a pseudo-random function with a randomly-chosen secret key. We could even use the algorithm shown in Fig. 2 to simulate a “truly” random function. In any case, it follows that \(\beta \le t^{\prime }/2^n\). The provable security level offered by a reduction is thus upper-bounded by \(\alpha \cdot 2^n\). We will thus say that a reduction is usable if \(\alpha > t^{\prime }/2^n\), as this implies that the reduction offers a provable security level better than that of exhaustive search (or equivalently, that the reduction actually makes use of the adversary).
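Fig. 2 is not included in this excerpt, but the kind of random function simulator it refers to is standard lazy sampling: answers are drawn on demand and cached for consistency. A minimal sketch (the class name and output length are illustrative, not the paper's):

```python
import os

class LazyRandomFunction:
    """Simulates a random function f: {0,1}* -> {0,1}^n by lazy sampling:
    each fresh input gets an independent uniform answer, which is cached
    so that repeated queries stay consistent."""

    def __init__(self, n_bytes=32):
        self.n_bytes = n_bytes
        self.table = {}

    def __call__(self, x):
        if x not in self.table:
            # First query on x: draw a fresh uniform answer and store it.
            self.table[x] = os.urandom(self.n_bytes)
        return self.table[x]
```

Because every new query returns an independent uniform value, each query inverts a given target with probability \(2^{-n}\), which is exactly the argument giving \(\beta \le t^{\prime }/2^n\).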
5.1 How Do Reductions Use the Adversary?
In the sequel, we will make the natural assumption that the adversary \(\mathcal {A}_H\) to which the reduction has access has a nonzero success probability. We will also restrict our attention to usable reductions. By doing so we rule out modes of operation for which no usable reduction is known (such as the Merkle-Damgård construction), but at the same time we rule out bogus modes that would otherwise be problematic.
 (i)
\(H^{(\cdot )}(K,M,i-1,h_{i-2}) \ne H^{(\cdot )}(K,M^{\prime },i-1,h_{i-2})\)
 (ii)
For all \(j\) such that \(i \le j \le \ell \), \( H^{(\cdot )}(K,M,j,h_{j-1}) = H^{(\cdot )}(K,M^{\prime },j,h_{j-1})\)
The Merkle-Damgård construction (and therefore Haifa) and the split-padding are easily seen to be suffix-clonable: it is sufficient to change the \(i\)th message block while leaving all the subsequent message blocks untouched. Shoup’s construction is also easily seen to be suffix-clonable: it suffices to leave \(K\) untouched and to modify the beginning of \(M\). Lastly, the BCM mode of operation is also suffix-clonable (again, it suffices to keep the right suffix of \(M\)).
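For the Merkle-Damgård case this is easy to see concretely: changing only the \(i\)th block makes the computations differ right after that block (in the spirit of condition (i)), while from any common chaining value onwards the identical suffix blocks yield identical steps (condition (ii)). A minimal sketch with a toy compression function (all names are ours):

```python
import hashlib

def f(h, m):
    # Toy compression function on byte strings (illustrative only).
    return hashlib.sha256(h + m).digest()

def partial_md(iv, blocks, upto):
    """Chaining value after processing blocks[0:upto], starting from iv."""
    h = iv
    for m in blocks[:upto]:
        h = f(h, m)
    return h

IV = b"\x00" * 32
M  = [b"A", b"B", b"C", b"D"]
M2 = [b"A", b"X", b"C", b"D"]   # suffix clone of M: only block 1 changed

# The two computations differ right after the modified block ...
assert partial_md(IV, M, 2) != partial_md(IV, M2, 2)

# ... but from any common chaining value, the suffix steps coincide,
# because the remaining blocks are identical.
h = partial_md(IV, M, 2)
for j in (2, 3):
    assert f(h, M[j]) == f(h, M2[j])
    h = f(h, M[j])
```

This is exactly what makes the simulated environment of Sect. 5 possible: once the two chains are forced to meet, they agree forever after.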
We will thus assume that our mode of operation \(H^{(\cdot )}\) is suffixclonable. Since it is provably secure, there exists a reduction \(\mathcal {R}\) with a reasonably high success probability. Our objective, and the main technical contribution of this section, is to show the following theorem:
Theorem 7
We always have \(\alpha \le 1/\ell + t^{\prime }/2^n\). It follows that the provable security level offered by \(\mathcal {R}\) cannot be higher than \(2^{n}/\ell +t^{\prime }\).
The remainder of this section is devoted to the proof of this result. The general idea of the proof is to build an environment around the reduction \(\mathcal {R}\) that simulates a legitimate “world” for \(\mathcal {R}\), but in which it is easy to see that \(\mathcal {R}\) has a low success probability. Because the security level offered by \(\mathcal {R}\) has to hold in all legitimate environments, it follows that \(\mathcal {R}\) cannot offer more in general than it does in the simulated world.
Connection Point. Before going any further, let us observe what happens when the adversary finds a second preimage. Let us denote by \(x_i\) and \(h_i\) (resp. \(x_i^{\prime }\) and \(h_i^{\prime }\)) the sequence of inputs and outputs of \(f\) while evaluating \(H^f(M)\) (resp. \(H^f(M^{\prime })\)). Since \(M\) and \(M^{\prime }\) collide, and because \(H\) satisfies the hypotheses of Lemma 1, a second preimage of one of the \(x_i\) input values can be readily obtained from \(M^{\prime }\). If we look closely at the proof of Lemma 1, we see that if \(|M| \ne |M^{\prime }|\), then we obtain a second preimage of \(f\) on the last invocation. Otherwise, there exists an index \(i\) such that \(f(x_i) = f(x_i^{\prime })\) and \(x_i \ne x_i^{\prime }\). In the sequel, we call this particular index \(i\) the “connection point”, and we note that at this particular index a second preimage of \(x_i\) for \(f\) is revealed, which we call “the second preimage at connection point”.
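For a plain Merkle-Damgård-style iteration, the connection point can be located mechanically by comparing the two sequences of compression-function inputs and outputs. A toy sketch (our names; the compression function is truncated to 8 bits so that a collision can actually be exhibited):

```python
import hashlib

def f(x):
    # Toy 8-bit compression function, so collisions actually occur.
    return hashlib.sha256(x).digest()[:1]

def md_sequences(iv, blocks):
    """Sequences of inputs x_i and outputs h_i of the iteration."""
    xs, hs, h = [], [], iv
    for m in blocks:
        x = h + m
        h = f(x)
        xs.append(x)
        hs.append(h)
    return xs, hs

def connection_point(iv, M, M2):
    """Smallest i with x_i != x'_i and f(x_i) == f(x'_i), or None."""
    xs, hs = md_sequences(iv, M)
    xs2, hs2 = md_sequences(iv, M2)
    for i in range(min(len(xs), len(xs2))):
        if xs[i] != xs2[i] and hs[i] == hs2[i]:
            return i
    return None

# Find two distinct first blocks colliding under f. The pigeonhole
# principle guarantees a collision: 257 distinct two-byte blocks,
# only 256 possible one-byte outputs.
IV = b"\x00"
seen, m1, m2 = {}, None, None
for b in range(257):
    blk = b.to_bytes(2, "big")
    y = f(IV + blk)
    if y in seen:
        m1, m2 = seen[y], blk
        break
    seen[y] = blk

M, M2 = [m1, b"tail"], [m2, b"tail"]
assert connection_point(IV, M, M2) == 0
```

Here the two messages diverge in the very first block and share the rest, so the second preimage at connection point is revealed at index 0.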
Next, we claim that in order to be usable, a reduction must embed the challenge \((x,k)\) into \((M,K)\). This justifies a posteriori our observation that the three schemes of interest all allow some form of embedding. To establish this result, we first show that a legitimate world with various interesting properties can be built around the reduction. When we argued that \(\beta \) was small, we used the (somewhat informal) argument that \(f\) could be implemented by a Random Function simulator, and that inverting such a function faster than exhaustive search is impossible. We now make this argument more formal, with the additional feature that we will be able to choose whether the adversary succeeds or fails, and where it connects.
 1.
Before \(\mathcal {R}\) sends its query \((M,K)\) to \(\mathcal {A}_H\), we simulate \(f\) by generating random answers and storing them (for consistency), “implementing” \(f\) with the random function simulator of Fig. 2.
 2.
When \(\mathcal {R}\) sends its query \((M,K)\) to \(\mathcal {A}_H\), we choose an integer \(i \in \{0, \dots , \ell \}\) (this will be the connection point), and we use the suffixclonability property of the mode of operation to generate a different message \(M^{\prime } \ne M\) satisfying the conditions of the definition of suffixclonability.
 3.
We evaluate \(H^f(M^{\prime })\) in a special way. On the first \(i-1\) iterations we use the random function simulator in place of \(f\). On the \(i\)th iteration we program \(f\) so that \(f(x^{\prime }_i) = h_i\), thus “connecting” \(M^{\prime }\) to \(M\) in iteration \(i\).
 4.
We return \(M^{\prime }\) as the answer of \(\mathcal {A}_H\) to the reduction, and keep simulating \(f\). The reduction will be able to check that \(H_K^f(M) = H_K^f(M^{\prime })\) by sending the appropriate queries to \(f\).
When running inside this environment, the view of the reduction is consistent and legitimate. In this environment, we are able to choose the connection point at will. For instance, we can make sure that the \(\clubsuit \) event never happens. In this case, the reduction, even though it knows a collision on \(f\), cannot find a second preimage on \(f\) faster than exhaustive search (because each new query to \(f\) returns an independent random answer, and thus each query yields a second preimage with probability \(2^{-n}\)).
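Steps 1–4 above can be sketched as a programmable, lazily-sampled function wrapped around the reduction. The class and function names are ours, and the mode is simplified to a plain Merkle-Damgård-style iteration over equal-length messages with identical suffixes:

```python
import os

class ProgrammableF:
    """Lazily-sampled random function that the environment may also program."""
    def __init__(self, n=32):
        self.n = n
        self.table = {}

    def __call__(self, x):
        if x not in self.table:
            self.table[x] = os.urandom(self.n)
        return self.table[x]

    def program(self, x, y):
        # Programming fails if x was already answered differently, since
        # that would make the reduction's view inconsistent.
        if x in self.table and self.table[x] != y:
            return False
        self.table[x] = y
        return True

def connect(fsim, iv, M, M2, i):
    """Steps 2-4: evaluate H^f(M), then program f(x'_i) = h_i so that
    the clone M2 (same length, identical suffix) collides with M."""
    hs = [iv]                      # chaining values of M (step 1)
    for m in M:
        hs.append(fsim(hs[-1] + m))
    h = iv                         # chain M2 honestly up to iteration i-1
    for m in M2[:i]:
        h = fsim(h + m)
    if not fsim.program(h + M2[i], hs[i + 1]):   # connect at iteration i
        return False
    h = hs[i + 1]                  # identical suffix => identical tail
    for m in M2[i + 1:]:
        h = fsim(h + m)
    return h == hs[-1]             # H^f(M2) == H^f(M)
```

Since the environment controls both the simulator and the programming step, it chooses the connection point \(i\) freely while every answer the reduction sees remains uniformly distributed.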
Footnotes
 1.
We call “narrowpipe” a construction where the internal state has the same length as the digest.
References
 1. Andreeva, E., Bouillaguet, C., Dunkelman, O., Kelsey, J.: Herding, second preimage and trojan message attacks beyond Merkle-Damgård. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) SAC 2009. LNCS, vol. 5867, pp. 393–414. Springer, Heidelberg (2009)
 2. Andreeva, E., Bouillaguet, C., Fouque, P.-A., Hoch, J.J., Kelsey, J., Shamir, A., Zimmer, S.: Second preimage attacks on dithered hash functions. In: Smart, N.P. (ed.) EUROCRYPT 2008. LNCS, vol. 4965, pp. 270–288. Springer, Heidelberg (2008)
 3. Andreeva, E., Preneel, B.: A three-property-secure hash function. In: Avanzi, R.M., Keliher, L., Sica, F. (eds.) SAC 2008. LNCS, vol. 5381, pp. 228–244. Springer, Heidelberg (2009)
 4. Bellare, M., Rogaway, P.: Collision-resistant hashing: towards making UOWHFs practical. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 470–484. Springer, Heidelberg (1997)
 5. Brassard, G. (ed.): CRYPTO 1989. LNCS, vol. 435. Springer, Heidelberg (1990)
 6. Damgård, I.B.: A design principle for hash functions. In: Brassard [5], pp. 416–427
 7. Dean, R.D.: Formal aspects of mobile code security. Ph.D. thesis, Princeton University, Jan 1999
 8. Biham, E., Dunkelman, O.: A framework for iterative hash functions – HAIFA. Presented at the second NIST hash workshop, 24–25 Aug 2006
 9. Impagliazzo, R., Naor, M.: Efficient cryptographic schemes provably as secure as subset sum. J. Cryptol. 9(4), 199–216 (1996)
 10. Joux, A.: Multicollisions in iterated hash functions. Application to cascaded constructions. In: Franklin, M. (ed.) CRYPTO 2004. LNCS, vol. 3152, pp. 306–316. Springer, Heidelberg (2004)
 11. Kelsey, J., Kohno, T.: Herding hash functions and the Nostradamus attack. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp. 183–200. Springer, Heidelberg (2006)
 12. Kelsey, J., Schneier, B.: Second preimages on n-bit hash functions for much less than \(2^{n}\) work. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 474–490. Springer, Heidelberg (2005)
 13. Lucks, S.: A failure-friendly design principle for hash functions. In: Roy, B. (ed.) ASIACRYPT 2005. LNCS, vol. 3788, pp. 474–494. Springer, Heidelberg (2005)
 14. Merkle, R.C.: One way hash functions and DES. In: Brassard [5], pp. 428–446
 15. Naor, M., Yung, M.: Universal one-way hash functions and their cryptographic applications. In: STOC, pp. 33–43. ACM (1989)
 16. Rogaway, P., Shrimpton, T.: Cryptographic hash-function basics: definitions, implications, and separations for preimage resistance, second-preimage resistance, and collision resistance. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 371–388. Springer, Heidelberg (2004)
 17. Shoup, V.: A composition theorem for universal one-way hash functions. In: Preneel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 445–452. Springer, Heidelberg (2000)
 18. Yasuda, K.: How to fill up Merkle-Damgård hash functions. In: Pieprzyk, J. (ed.) ASIACRYPT 2008. LNCS, vol. 5350, pp. 272–289. Springer, Heidelberg (2008)