On subset-resilient hash function families

In this paper, we analyze the security of subset-resilient hash function families, which were first proposed as a requirement of a hash-based signature scheme called HORS. Let ℋ be a family of functions mapping an element to a subset of size at most k. (r, k)-subset resilience guarantees that given a random function H from ℋ, it is hard to find an (r+1)-tuple (x, x_1, …, x_r) such that (1) H(x) is covered by the union of the H(x_i) and (2) x is not equal to any x_i.
Subset resilience and its variants are related to nearly all existing stateless hash-based signature schemes, but the power of this security notion has received little research attention. We present three results on subset resilience. First, we show a generic quantum attack against subset resilience whose time complexity is smaller than that of a direct application of Grover's search. Second, we show that subset-resilient hash function families imply the existence of distributional collision-resistant hash function families. Informally, distributional collision resistance is a relaxation of collision resistance which guarantees that it is hard to find a uniform collision for a hash function. This result yields a comparison among the power of subset resilience, collision resistance, and distributional collision resistance. Third, we prove a fully black-box separation of subset-resilient hash function families from one-way permutations.


Introduction
The digital signature scheme is an essential primitive in cryptography. Usually, a provably secure signature scheme is based on the hardness of mathematical problems. However, due to research on quantum computation, the hardness of these problems has become uncertain. For example, RSA and CDH can be solved by Shor's algorithm [37]. Even though several hard problems have not yet been broken by quantum computers, no one knows whether they will remain secure in the coming decades. Hash-based signature schemes (HBS) become an attractive choice in this setting: HBS is based on assumptions about hash functions rather than the hardness of mathematical problems.
When we analyze the security of HBS, it is necessary to specify the security notions required of the hash functions. Note that different HBS schemes require different security notions. For example, the security of Lamport's scheme [30] requires one-wayness, and Merkle's signature scheme [34] requires one-wayness and collision resistance. These signature schemes are stateful, meaning that the signer must maintain a dynamic state across multiple signing executions. Practical stateless HBS, in contrast, usually requires a particular security notion for hash function families called subset resilience.
Subset-resilient hashing (SRH) is a kind of hash function family, first proposed in [36] as a building block of HORS (Hash to Obtain Random Subset), a stateless few-time HBS. When discussing common security notions for hash function families, we usually focus on the behavior of a single function mapping one element of the domain to one element of the range. For example, a collision-resistant hash function family (CRH) H = {h : {0, 1}^{m(n)} → {0, 1}^{l(n)}} requires that for h ← H, it is hard to find distinct x_1, x_2 such that h(x_1) = h(x_2). Instead, subset resilience focuses on a hash function H mapping one element of the domain to a subset of size at most k. We call (x, x_1, x_2, …, x_r) an (r, k)-subset cover with regard to H if H(x) ⊆ ⋃_{i=1}^{r} H(x_i) and x ≠ x_i for every i = 1, …, r. An (r, k)-subset-resilient hash function family requires that for a randomly sampled H, it is hard to find an (r, k)-subset cover with regard to H.
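For concreteness, the subset-cover condition can be checked mechanically. A minimal sketch with a toy H built from truncated SHA-256 (an illustrative stand-in, not a subset-resilient construction; the tiny parameters make covers easy to find by exhaustive search):

```python
import hashlib

def make_H(k: int, b: int):
    """Toy H mapping a message to a set of at most k b-bit values,
    built from k 'short' truncated-SHA-256 hashes (illustrative only)."""
    def H(x: bytes) -> set:
        return {int.from_bytes(hashlib.sha256(bytes([i]) + x).digest(), "big")
                % (1 << b) for i in range(k)}
    return H

def is_subset_cover(H, x, xs) -> bool:
    """(x, x_1, ..., x_r) is an (r, k)-subset cover w.r.t. H iff
    H(x) lies inside the union of the H(x_i) and x differs from every x_i."""
    union = set().union(*(H(xi) for xi in xs))
    return H(x) <= union and all(x != xi for xi in xs)

H = make_H(k=2, b=4)                      # tiny parameters: covers abound
msgs = [bytes([i]) for i in range(256)]
table = {m: H(m) for m in msgs}           # precompute all images
found = next(((x, x1, x2)
              for x1 in msgs for x2 in msgs for x in msgs
              if x != x1 and x != x2 and table[x] <= table[x1] | table[x2]),
             None)
```

With 4-bit sub-hashes the union of two images covers a quarter of the range, so a brute-force search finds a (2, 2)-subset cover almost immediately; real parameters make this search infeasible, which is exactly what subset resilience demands.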
Indeed, the hash function H here can be considered a tuple of k "short" hash functions. Little is known about SRH. Aumasson and Endignoux [4] propose an upper bound on the generic security level of SRH, where the hash functions are treated as random oracles. Still, the power of SRH is unclear, which leads to several interesting questions. What is the generic security of SRH in the quantum world? What is the relation between SRH and other assumptions (such as CRH)? Can we construct a provably secure SRH from other fundamental primitives, such as one-way permutations?

Our results
In this paper, we attempt to answer the above questions on SRH.
- A generic quantum attack against SRH. First, we give a quantum algorithm that finds subset covers, yielding an upper bound on the quantum security of subset resilience. The algorithm is more efficient than directly applying Grover's search algorithm [20] to this problem.
- SRH ⇒ dCRH. Second, we prove that the existence of SRH implies the existence of (infinitely often secure) distributional collision-resistant hashing (dCRH), which is a weaker assumption than CRH. Informally, dCRH requires that it is hard to find a uniform collision of the hash function. Thus, the assumption that SRH exists is at least as strong as the assumption that dCRH exists. Note that this proof does not yield a black-box construction of SRH from dCRH.
- OWP ⇏ SRH. Third, we prove the impossibility of constructing an SRH from one-way permutations (OWP) in a fully black-box manner. Our proof uses Simon's separating oracle [38], which proves the fully black-box separation of dCRH from OWP. Although the second result shows that the existence of SRH implies the existence of dCRH, the third result does not follow trivially from it, because the second result does not yield a black-box construction.

Fig. 1 The relation among subset resilience and other assumptions (OWP, CRH, rSRH, dCRH; Theorems 8, 9, and 10). A → B means that the existence of A implies the existence of B; a crossed arrow means the impossibility of constructing B from A in the fully black-box manner.

To sum up, the relations among SRH and the other assumptions about hash functions are depicted in Fig. 1 (where rSRH will be introduced in the following).

Observations on subset cover finding problem
First, we reduce the subset cover finding problem to another one that is easier to analyze. Consider the following two problems: Problem 1 is the original subset cover finding problem, and Problem 2 is a harder variation. A solution to Problem 2 is immediately a solution to Problem 1. However, a solution to Problem 1 is possibly not a solution to Problem 2. We call a solution to Problem 2 an (r, k)-restricted subset cover with regard to (h_1, …, h_k). If it is hard to find an (r, k)-restricted subset cover with regard to a randomly sampled function tuple from a hash function family H, we say that H is an (r, k)-restricted subset-resilient hash function family, or (r, k)-rSRH for short. (r, k)-restricted subset resilience is a weaker notion than (r, k)-subset resilience.
Furthermore, we have some observations on these problems: Observation 1 For any k < k′, finding an (r, k)-subset cover of (h_1, …, h_k) is not harder than finding an (r, k′)-subset cover of (h_1, …, h_{k′}).
x_1, …, x_k such that H(x_1) = ⋯ = H(x_k). In the beginning, it picks t_0 random elements from X and lists them in X_1. Let Y_1 be the list of images of X_1. Next, define F_1(x) such that F_1(x) = 1 if and only if H(x) ∈ Y_1 and x ∉ X_1, and run Grover's algorithm on F_1. Grover's algorithm outputs a solution after O(√(|X|/t_0)) = O(√(N/t_0)) quantum queries to H, which yields a collision of H.
Then, the algorithm repeats Grover's algorithm t_2 times, obtaining t_2 collisions. After that, it lists the t_2 collisions in a list X_2 and lets Y_2 be the list of images of X_2. Similarly, define F_2(x) such that F_2(x) = 1 if and only if H(x) ∈ Y_2 and x ∉ X_2. By running Grover's algorithm, it obtains a 3-collision of H with O(√(N/t_2)) quantum queries to H. Again, it repeats Grover's algorithm t_3 times and lists the t_3 3-collisions, and so on. This algorithm can be modified to find k-restricted subset covers. Suppose the target functions h_1, …, h_k are all 2-to-1 functions. Given (h_1, …, h_k), we first pick t_0 random elements of X and list them in X_0. Let Y_1 be the list of images of X_0 with regard to h_1. Define F_1 such that F_1(x) = 1 if and only if h_1(x) ∈ Y_1 and x ∉ X_0. Run Grover's algorithm on F_1 t_1 times. Up to this point, our algorithm is the same as the original one.
Then, we list the solutions of Grover's algorithm on F_1 in X_1. Note that each element of X_1 collides with an element of X_0 with regard to h_1. List these colliding elements of X_0 in X_1′.
Next, we inject h_2 into our algorithm. Let Y_2 be the list of images of X_1 with regard to h_2. Define F_2 such that F_2(x) = 1 if and only if h_2(x) ∈ Y_2 and x ∉ X_1, and then run Grover's algorithm on F_2 t_2 times. Similarly, we list the solutions of Grover's algorithm in X_2 and list the corresponding elements of X_1 in X_2′.
Observe that X_2 and X_2′ contain collisions w.r.t. h_2. In addition, since X_2′ ⊆ X_1, the h_1-collisions of the elements of X_2′ are all recorded in X_1′. As a result, we obtain a list of 2-restricted subset covers w.r.t. (h_1, h_2).
Similarly, we can inject h_3, …, h_k into our algorithm. Eventually, we obtain a k-restricted subset cover with regard to (h_1, …, h_k). By choosing suitable t_0, …, t_k, the number of quantum queries to the h_i is optimized to approximately O(N^{(1/2)(1 − 1/(2^{k+1} − 1))}), which is the same as that of Liu and Zhandry's algorithm for finding (k+1)-collisions.
To eliminate the requirement of 2-to-1 functions, we allow h_1, …, h_k to be general functions with high compression. Formally, we require that the size of the domain be more than (k+1) times larger than that of the range (which is not a strong condition for hash functions). This condition guarantees that an element of the domain has a second preimage w.r.t. each h_i with high probability. In this case, the Grover searches succeed with high probability.
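Classically, the staged structure of this algorithm can be mirrored with exhaustive search standing in for Grover's algorithm. A toy sketch for k = 2, assuming a 2-restricted subset cover of (h_1, h_2) is a triple (x, x_1, x_2) with h_1(x) = h_1(x_1), h_2(x) = h_2(x_2), and x distinct from x_1 and x_2 (the truncated hashes are illustrative):

```python
import hashlib

def short_hash(i: int, x: int, bits: int = 8) -> int:
    """i-th toy hash function: truncated SHA-256 (illustrative only)."""
    d = hashlib.sha256(bytes([i]) + x.to_bytes(4, "big")).digest()
    return int.from_bytes(d, "big") % (1 << bits)

h1 = lambda x: short_hash(1, x)
h2 = lambda x: short_hash(2, x)

N = 1 << 12  # domain size; large enough that second preimages exist w.h.p.

# Stage 1: collect pairs (x, x1) colliding under h1
# (stand-in for the Grover stage that fills X_1 and X_1').
buckets = {}
stage1 = []                      # pairs (x, x1) with h1(x) == h1(x1), x != x1
for x in range(N):
    y = h1(x)
    if y in buckets:
        stage1.append((x, buckets[y]))
    else:
        buckets[y] = x

# Stage 2: among the h1-collisions, look for one whose first element also
# collides with a fresh x2 under h2 (stand-in for Grover's search on F_2).
cover = None
for x, x1 in stage1:
    target = h2(x)
    x2 = next((c for c in range(N) if c != x and h2(c) == target), None)
    if x2 is not None:
        cover = (x, x1, x2)
        break
```

With an 8-bit range and a 2^12-element domain, stage 1 yields thousands of h_1-collisions and almost every one extends to an h_2-collision, so a cover is found immediately; the quantum algorithm replaces both exhaustive searches with Grover iterations.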
Relation to dCRH Our proof is inspired by [28], which proves that the existence of multi-collision-resistant hashing (MCRH) implies the existence of (infinitely often secure) dCRH. In this paper, we prove a similar statement for k-rSRH (compressing 2n bits to n bits) instead of MCRH.
We start with the case k = 2. To prove that the existence of 2-rSRH implies the existence of dCRH, we assume towards a contradiction that no dCRH exists. This implies that there exists a probabilistic polynomial-time algorithm A that samples a uniform collision of h ← H with overwhelming probability over the choice of h. Here, a uniform collision (x_1, x_2) of h means that x_1 is uniformly distributed over the domain and x_2 is uniformly distributed over the set h^{-1}(h(x_1)). Next, to derive the contradiction, we need to construct a polynomial-time algorithm breaking 2-rSRH using A. We can obtain collisions with regard to h_1 and h_2 using A, but how do we obtain two collision pairs that share a common element? A naive idea is to run A(h_1) and A(h_2) respectively. However, since A outputs uniform collisions, a 2-restricted subset cover appears only with negligible probability.
Recall that A is probabilistic. We write A(h; r), where r is the randomness. If we run A(h_1) twice using randomness r_1 and r_2, we obtain (x_1, x_2) and (x_3, x_4), which are uniform collisions of h_1 with overwhelming probability. Can we choose particular r_1 and r_2 such that we can build a "bridge" between x_1 and x_3 with regard to h_2?
We fix the input function of A. Now A(h_1; ·) can be considered a deterministic function of r. Since we focus on x_1 and x_3 rather than x_2 and x_4, we denote by f(·) the first element of A(h_1; ·). If we want (x_1, x_3) to be a collision of h_2, the problem becomes finding r_1 and r_2 such that h_2(f(r_1)) = h_2(f(r_2)). Surprisingly, this is another collision-finding problem, with regard to h′(·) = h_2(f(·)). Therefore, our algorithm breaking 2-rSRH runs as follows. Given (h_1, h_2) and a dCRH breaker A, define f(r) as the first element of A(h_1; r) and h′(r) = h_2(f(r)); run A(h′) to obtain (r_1, r_2), and compute (x_1, x_2) ← A(h_1; r_1) and (x_3, x_4) ← A(h_1; r_2). Next, we need to show that the pairs (x_1, x_2) and (x_1, x_3) each consist of distinct elements. The proof of the first statement is not complicated. Since r_1 is the first element of the output of A(h′), r_1 is uniformly distributed over the domain. Thus, A(h_1; r_1) is run with a uniformly random coin, and (x_1, x_2) is distributed as a uniform collision of h_1. The main problem is the second statement. We prove it in two steps. Recall that f(r) is the first element of A(h_1; r), so x_1 = f(r_1) and x_3 = f(r_2). First, we observe that f is regular due to the definition of A; thus, the preimage sets f^{-1}(x) are disjoint and of the same size. Second, since (r_1, r_2) is a uniform collision of h′, we prove that r_1 and r_2 fall into the same set f^{-1}(x) for some x only with negligible probability. Therefore, f(r_1) = f(r_2) (which implies x_1 = x_3) holds only with negligible probability.
From the above statements, we can construct a machine breaking 2-rSRH with A. Then we extend the result to k-rSRH for general constant k by induction. That is, we construct a machine breaking (k+1)-rSRH from A together with a machine breaking k-rSRH, say Break-k-SR. By running Break-k-SR twice with different random coins, we obtain two k-restricted subset covers of (h_1, …, h_k). Then, we need to build a bridge between the first elements of the two subset covers. Similarly, let f(r) be the first element of Break-k-SR(h_1, …, h_k; r) and h′(r) = h_{k+1}(f(r)). By running A(h′), we obtain the two random coins that we want. Essentially, this yields a general relation between dCRH and k-rSRH for constant k.
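The composition at the heart of this reduction can be exercised on a toy scale. Everything below is an illustrative stand-in: a brute-force sampler plays the role of the assumed breaker A, and the hashes are truncated SHA-256 on a tiny domain.

```python
import hashlib

BITS = 4      # tiny ranges so brute-force collision sampling stays cheap
DOM = 64      # domain {0, ..., DOM-1} of the toy h1, h2

def th(i: int, x: int) -> int:
    """Toy hash h_i: truncated SHA-256 (illustrative stand-in)."""
    d = hashlib.sha256(bytes([i]) + int(x).to_bytes(8, "big")).digest()
    return int.from_bytes(d, "big") % (1 << BITS)

h1 = lambda x: th(1, x)
h2 = lambda x: th(2, x)

def A(h, dom: int, r: int):
    """Toy stand-in for the assumed dCRH breaker: on randomness r, output
    a collision (x1, x2) of h, with x1 determined by the low part of r and
    x2 an r-dependent element of h^{-1}(h(x1))."""
    a, b = r % dom, r // dom
    x1 = a
    t = h(x1)
    pre = [x for x in range(dom) if h(x) == t]   # h^{-1}(h(x1))
    return x1, pre[b % len(pre)]

def f(r: int) -> int:          # first element of A(h1; r)
    return A(h1, DOM, r)[0]

def h_prime(r: int) -> int:    # the composed function h'(r) = h2(f(r))
    return h2(f(r))

RDOM = DOM * DOM               # randomness space of A(h1; .)

# Run the breaker on h' to get a collision (r1, r2), then unpack:
r1, r2 = A(h_prime, RDOM, 12345)        # 12345: arbitrary randomness
x1, x2 = A(h1, DOM, r1)
x3, _ = A(h1, DOM, r2)
# By construction h1(x1) == h1(x2) and h2(x1) == h2(x3), so (x1, x2, x3)
# is a 2-restricted subset cover whenever x1, x2, x3 are pairwise distinct.
```

The point of the sketch is the composition: a collision of h′ under the sampler's randomness translates into two runs of A(h_1; ·) whose first elements collide under h_2, exactly the "bridge" described above.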
Separating from OWP Suppose there exists a fully black-box construction of k-rSRH from an OWP f, say C^f = (C_1^f, …, C_k^f). This means there exists a polynomial-time reduction R such that if there exists an adversary A that breaks the k-restricted subset resilience of C^f, then R inverts f with non-negligible probability, given access to f and A. Our result rules out the existence of such a polynomial-time R for any polynomial-time C^f.
Our proof uses Simon's separating oracle [38], consisting of two oracles Γ = (f, CoverFinder). Here f is a random permutation, and CoverFinder^f(C_1^f, …, C_k^f) is an oracle that outputs a k-restricted subset cover of (C_1^f, …, C_k^f) with high probability. (Note that CoverFinder may be exponential-time.) Thus, CoverFinder behaves as the adversary A breaking k-rSRH. We then prove that R^Γ cannot invert f. More formally, we prove that for random y, R^Γ(y) cannot output x such that f(x) = y with non-negligible probability.
If R is only given access to f rather than (f, CoverFinder), this statement is true, since a random permutation f_n is one-way with overwhelming probability [19]. However, CoverFinder may be helpful for inverting f. In our proof, we define a special event called y-hit. When R queries CoverFinder, if it obtains a preimage of y with regard to f from the output, we say that y-hit occurs. Intuitively, R can gain an advantage in inverting y from CoverFinder only if it triggers y-hit in its queries to CoverFinder. Nevertheless, since CoverFinder outputs a random subset cover for each query, y-hit occurs only with negligible probability in each query. This implies that R can hardly gain any advantage from its queries to CoverFinder, and thus it cannot invert f.
Our proof consists of four steps. First, we construct the separating oracle Γ and show that it can break any k-rSRH. Second, we prove that it is hard to compute f^{-1}(y) without triggering y-hit, even given access to Γ. Third, we prove that if there exists a polynomial-time adversary computing f^{-1}(y) while triggering y-hit, then there exists another polynomial-time adversary computing f^{-1}(y) without triggering y-hit (which contradicts the previous statement). The last two statements imply the one-wayness of f given access to Γ. Finally, we use the above statements to prove the separation result.

Motivations
Different hash functions have different security levels. For example, one hash function may be more resistant to collisions than another. However, none of them can avoid generic attacks, which work regardless of the structure of the hash function. For instance, the birthday attack is a generic attack against collision resistance in the classical setting. Grover's algorithm [20] and the BHT algorithm [12] are generic quantum attacks on one-wayness and collision resistance, respectively.
Generic attacks are essential for the security analysis of hash functions, as they provide an upper bound on the security level. In many cases, if the hash function is modeled as a random oracle, the complexity of an optimal generic attack represents the exact security level of the corresponding security notion. Since subset resilience is a security notion used in hash-based signature schemes, which are required to be post-quantum, it is essential to analyze the complexity of generic attacks on this notion. This motivates our first result, giving generic quantum attacks on subset resilience.
Usually, we prefer cryptographic schemes based on weaker assumptions. For instance, we prefer a scheme based on the hardness of CDH (the Computational Diffie-Hellman problem) over one based on the hardness of DDH (the Decisional Diffie-Hellman problem). Since solving CDH is at least as hard as solving DDH, the assumption that CDH is hard is more convincing. In hash-based cryptography, we have the same consideration. If P = NP, there are no one-way or collision-resistant hash functions at all. However, if we believe that one-way functions exist (this world is called Minicrypt in [26]), does it follow that CRH also exists? Actually, it does not. Simon [38] first proved the impossibility of constructing a CRH from one-way permutations in a fully black-box manner. Komargodski et al. [29] define four worlds of hashing-related primitives. Roughly speaking, in one of the worlds (called Hashomania), CRH exists, while in other worlds (called Unihash and Minihash), one-way functions exist but no CRH exists. We do not know which of these worlds our real world is, but we prefer schemes that remain secure in the weaker worlds.
This motivates our second result, analyzing the "power" of SRH. When we assume that SRH exists, we need to know the reliability of this assumption. In our work, we prove that the existence of SRH implies the existence of dCRH. That is, the assumption that SRH exists places us in a world where dCRH exists.
In addition, we compare the power of subset resilience with one-wayness in our third result. We rule out the possibility of constructing an SRH from one-way permutations in a fully black-box manner. Although it is not a positive result linking SRH to one-way permutations, it proves that SRH does not trivially exist in Minicrypt in a fully black-box manner. Consider the HORS signature scheme, a hash-based signature scheme whose CMA security is based on one-wayness and SRH. Our result implies that the CMA security of HORS cannot be reduced solely to one-wayness in a fully black-box manner.
We remark that the security of subset resilience does not directly affect the security of SPHINCS [8] and its variants [5,9], which are practical many-time stateless HBS. In SPHINCS, when signing a message m, the signer generates a pseudorandom value r and then signs the hash value of r||m. This allows the schemes to be based on a variant assumption called target subset resilience rather than subset resilience. (The situation is similar in the other variants of SPHINCS.) However, if the computation of r is corrupted (e.g., if r is computed by evaluating a pseudorandom function on the message m and the key of the pseudorandom function is leaked), the security degrades to being based on (restricted) subset resilience. Note that the leakage of the pseudorandom function key only affects security while the key pair is still in use; a leakage after its lifetime does not immediately enable a forgery.

Related work
Subset resilience and hash-based signatures Subset resilience is first proposed in [36] as one of the assumptions needed in a few-time hash-based signature scheme called HORS, whose security under chosen message attack is based on one-wayness and subset resilience. Then, Pieprzyk et al. [35] propose HORS++, a variant of HORS, and remove the requirement of subset resilience. Instead, they introduce a cover-free family [17], which captures a property similar to subset resilience. In other words, a cover-free family implies an information-theoretic version of SRH, meaning that the probability of finding a subset cover is exactly 0.
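To see why subset resilience matters here, recall the shape of HORS: the secret key is t random strings, the public key is their images under a one-way function, and a signature on m reveals the secret strings at the indices selected by H(m). The following simplified sketch (toy parameters and hash choices of our own, not the original instantiation) shows how a subset cover yields a forgery:

```python
import hashlib, os

T, K = 16, 4     # toy parameters: t secret values, k selected indices

def owf(x: bytes) -> bytes:
    """One-way function for the key pair (SHA-256 as an illustration)."""
    return hashlib.sha256(b"owf" + x).digest()

def H(m: bytes) -> set:
    """Map a message to a set of at most K indices in [T]."""
    d = hashlib.sha256(b"sel" + m).digest()
    return {d[i] % T for i in range(K)}

def keygen():
    sk = [os.urandom(16) for _ in range(T)]
    return sk, [owf(s) for s in sk]

def sign(sk, m: bytes) -> dict:
    return {i: sk[i] for i in H(m)}       # reveal sk at the selected indices

def verify(pk, m: bytes, sig: dict) -> bool:
    idx = H(m)
    return set(sig) == idx and all(owf(sig[i]) == pk[i] for i in idx)

sk, pk = keygen()
m1, m2 = b"msg-1", b"msg-2"
revealed = {**sign(sk, m1), **sign(sk, m2)}   # values leaked by two signatures

# A message x with H(x) ⊆ H(m1) ∪ H(m2), i.e. a subset cover, yields a
# forgery: with these toy parameters such an x is easy to find.
x = next((n.to_bytes(2, "big") for n in range(4096)
          if H(n.to_bytes(2, "big")) <= set(revealed)), None)
if x is not None:
    forged = {i: revealed[i] for i in H(x)}   # no secret key needed
```

With realistic parameters, finding such an x is exactly the subset cover finding problem, which is why the CMA security of HORS rests on subset resilience.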
In 2015, Bernstein et al. [8] propose SPHINCS, a practical stateless hash-based signature scheme. This scheme implements HORST, a simple variant of HORS, as a primitive in the structure. In the security analysis of SPHINCS, the security is reduced to subset resilience. However, SPHINCS essentially only requires a target version of subset resilience, which is a relaxation of subset resilience.
After that, Aumasson and Endignoux [4] first analyze the generic security of subset resilience in classical settings and then propose an attack on HORS called weak message attack. To avoid weak message attacks, they propose PORS, a variant of HORS. Based on that, the authors propose a variant of SPHINCS called Gravity-SPHINCS [5].
The state-of-the-art version of SPHINCS is SPHINCS+ [9]. In SPHINCS+, the few-time signature scheme is instantiated by FORS, another variant of HORST. SPHINCS+ is now an alternate candidate in the NIST post-quantum cryptography standardization process. The security of SPHINCS+ is based on interleaved target subset resilience (ITSR), a variant of target subset resilience designed specifically for SPHINCS+.
Although they seem similar, subset resilience and ITSR are quite different notions. First, ITSR introduces an "interleaf". It means that the output of the hash function also includes an index, and all the elements of a subset cover are required to collide on a single index. Second, ITSR is a target version of subset resilience as well. The hash computations introduce a randomizer that cannot be freely computed by the adversary.
In the SPHINCS+ paper, the authors propose a generic attack against ITSR. Although their attack can be immediately extended to our non-target case, our attack is more efficient, since an adversary is more powerful in the non-target case.
In addition, Yehia et al. [40] analyze the security of FORS in SPHINCS+ and propose a variant of FORS called DFORS. The motivation for DFORS also comes from considerations about the non-target case.
As far as we know, all existing practical stateless (few-time or many-time) hash-based signature schemes are related to subset resilience or its variants. On the other hand, none of the existing stateful hash-based signature schemes (including one-time signature schemes) are related to subset resilience, such as MSS [34], XMSS [13], XMSS^MT [24], XMSS-T [25], and LMS [31,33].

Collision Resistance and its variants Collision-resistant hashing (CRH) is first proposed in [15]. A "birthday attack" can find collisions with O(N^{1/2}) queries to the function, where N denotes the size of the range. In the quantum setting, Brassard et al. [12] present a quantum algorithm finding collisions of 2-to-1 functions with time complexity O(N^{1/3}) and quantum memory complexity O(N^{1/3}). This algorithm is proven optimal in terms of quantum time complexity [1,42]. In addition, Chailloux et al. [14] present a quantum algorithm finding collisions with time complexity O(2^{2n/5}) and quantum memory complexity O(n).

Distributional collision-resistant hashing (dCRH) is first proposed in [16]. Komargodski and Yogev [28] first analyze the power of dCRH and prove that a statistical zero-knowledge proof implies a dCRH. Then, Bitansky et al. [10] prove that a dCRH implies a constant-round statistically-hiding commitment scheme, and that a two-message statistically-hiding commitment scheme also implies a dCRH.
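The classical birthday attack mentioned above fits in a few lines (the truncated hash is illustrative): hash inputs until two distinct ones share a value, which takes about √N evaluations for an N-sized range.

```python
import hashlib

def h(x: int, bits: int = 20) -> int:
    """Toy hash with an N = 2^bits range (truncation is for illustration)."""
    d = hashlib.sha256(x.to_bytes(8, "big")).digest()
    return int.from_bytes(d, "big") % (1 << bits)

def birthday_attack(bits: int = 20):
    """Find a collision in ~sqrt(N) expected evaluations by storing
    every value seen so far in a table."""
    seen = {}            # hash value -> input that produced it
    x = 0
    while True:
        y = h(x, bits)
        if y in seen:
            return seen[y], x    # two distinct inputs, same hash value
        seen[y] = x
        x += 1

x1, x2 = birthday_attack()       # expected after ~2^10 evaluations here
```

The quantum algorithms cited above (BHT, Chailloux et al.) reduce this to O(N^{1/3}) and O(2^{2n/5}) respectively, at the cost of quantum queries and memory.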

Remark 1
We can follow the idea of dCRH to define a relaxation of k-rSRH that we call distributional k-rSRH. That is, given (h_1, …, h_k), it is hard to sample (x, x_1, …, x_k) such that x is uniformly distributed over the domain and each x_i is uniformly chosen from the set h_i^{-1}(h_i(x)). Indeed, our second and third results on k-rSRH also apply to distributional k-rSRH.

Multi-collision-resistant hashing (MCRH) is another variant of collision-resistant hashing.
A k-MCRH requires that it is hard to find k distinct elements of the domain whose images are all equal. Interestingly, MCRH has properties very similar to k-rSRH. Suzuki et al. [39] first analyze the time complexity of MCRH. Hosoyamada et al. [23] and Liu and Zhandry [32] propose two quantum algorithms for finding k-collisions, which can be modified into our quantum algorithms for finding (k−1)-restricted subset covers. Komargodski and Yogev [28] prove that the existence of MCRH implies the existence of dCRH. In recent years, there have been other results on MCRH [11,29].

Remark 2
Although MCRH and k-rSRH have very similar properties, we do not know any precise relation between them (neither an implication nor a separation). In our view, they are likely assumptions in "orthogonal" directions. A multi-collision is a "series" of collisions of a single function, while a restricted subset cover consists of "parallel" collisions of several functions. More generally, we can mix these ideas and define a harder problem, together with an assumption that finding such a solution is hard for some hash function families. However, we do not see any applications or motivations for this assumption, except that it is a more general form of collision resistance covering both multi-collisions and multiple functions.

Black-box Separation
The first black-box separation result is proposed in [27], which demonstrates the impossibility of constructing a key-agreement protocol from one-way permutations in a black-box manner. Simon [38] proves the fully black-box separation of CRH from one-way permutations. Indeed, Simon's proof also rules out the possibility of constructing a dCRH from one-way permutations.
In 2005, Haitner et al. [21] prove the fully black-box separation of constant-round statistically-hiding commitment schemes from one-way permutations. Furthermore, Berman et al. [7] and Komargodski et al. [29] independently construct constant-round statistically-hiding commitment schemes from MCRH in a fully black-box manner. This rules out the possibility of constructing an MCRH from one-way permutations in a fully black-box manner. The latter authors also prove the fully black-box separation of CRH from MCRH.

Organization
This paper is organized as follows: In Sect. 2, we introduce the preliminaries and give the definition of restricted subset-resilient hash function families (rSRH). In Sect. 3, we give a quantum algorithm breaking any k-rSRH. In Sect. 4, we prove that if k-rSRH exists, then distributional collision-resistant hash function families exist. In Sect. 5, we prove that it is infeasible to construct a provably secure k-rSRH from one-way permutations in a fully black-box manner. In Sect. 6, we extend the three results to general (r, k)-SRH. In Sect. 7, we give the conclusions and several open questions.

Preliminaries
Let X be a set or a distribution; x ← X means that x is uniformly sampled from X. Let M be a probabilistic algorithm; x ← M(·; r) means that x is the output of M with randomness r, and x ← M means that x is the output of M(·; r) where r is randomly chosen. For an event E relative to M, we use Pr_M[E] to denote the probability of E over the randomness of M.
For an integer n ∈ ℕ, we denote [n] = {1, …, n}. We say that a function ε : ℕ → [0, 1] is negligible if for every constant c > 0, ε(n) < n^{-c} for large enough n. The statistical distance of two variables X and Y over a domain D is defined as Δ(X, Y) = (1/2) Σ_{d∈D} |Pr[X = d] − Pr[Y = d]|.

Definition 1 (Efficient function family ensemble) A function family ensemble F = {F_n : D_n → R_n}_{n∈ℕ} is efficient if:
-F is samplable: there exists a probabilistic polynomial-time algorithm such that given 1 n , it outputs the description of a uniform element of F n .
-F can be efficiently computed: there exists a deterministic polynomial-time algorithm such that given x ∈ D n and f ∈ F n , it outputs f (x).
In particular, we say that a probabilistic polynomial-time algorithm Samp_k is a k-sampling algorithm of an efficient function family ensemble F if it outputs a tuple of (possibly non-uniform) functions (f_1, …, f_k) ∈ F_n^k. Obviously, one can construct a k-sampling algorithm of F by simply sampling k elements from F_n. However, there may exist other k-sampling algorithms such that the resulting function tuple has some special properties that we may be interested in. We give a simple example.
By running Samp twice, the pair will have this property only with probability 1/2. But if we sample the first function using randomness r and the second one using randomness r ⊕ 1, the resulting (f_1, f_2) will have this property with probability 1.
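As a hedged toy illustration of such a correlated sampling algorithm (the family and property below are invented for illustration, not necessarily the paper's example): take a two-member family indexed by one bit of randomness, and the property that the two sampled functions are distinct.

```python
import secrets

def make_f(bit: int):
    """Toy family with two members, indexed by one bit of randomness."""
    return (lambda x: x + 1) if bit else (lambda x: 2 * x)

def samp(r: int):
    """Samp: sample one function of the family using randomness r."""
    return make_f(r & 1), r & 1

def naive_2_sampler() -> bool:
    """Run Samp twice independently: the two functions are distinct
    only with probability 1/2."""
    (_, b1), (_, b2) = samp(secrets.randbits(1)), samp(secrets.randbits(1))
    return b1 != b2

def correlated_2_sampler() -> bool:
    """Sample with randomness r and r XOR 1: the two sampled functions
    are always distinct."""
    r = secrets.randbits(1)
    (_, b1), (_, b2) = samp(r), samp(r ^ 1)
    return b1 != b2
```

Here `correlated_2_sampler` returns True on every run, while `naive_2_sampler` does so only half of the time, which is the gap the correlated randomness closes.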

Security notions for hash functions
If an efficient function family ensemble is compressing, which means the size of the domain is larger than that of the range, we call it a hash function family. In practice, an ideal hash function family is expected to be one-way and collision-resistant.
} be an efficiently samplable function family ensemble. We say that F is a one-way function family (OWF) if for any probabilistic polynomial-time algorithm A, there exists a negligible function ε(·) such that Pr_{f←F_n, x←D_n}[f(A(f, f(x))) = f(x)] ≤ ε(n) for large enough n ∈ ℕ.
In particular, if F is a family ensemble of permutations, we say that F is a one-way permutation family (OWP). If F is a hash function family, we say that F is a preimage-resistant hash function family.
Let H_n be a hash function family and h ← H_n. We say that (x_1, x_2) is a collision of h if x_1 ≠ x_2 and h(x_1) = h(x_2). If there is no polynomial-time adversary that can find a collision for h ← H_n, we say that H is a collision-resistant hash function family (CRH).

Definition 3 (Collision-Resistant Hashing)
holds for large enough n ∈ ℕ. Distributional collision resistance is a relaxation of classical collision resistance. A distributional collision-resistant hash function family (dCRH) guarantees that no probabilistic polynomial-time adversary can output a uniform collision. We first define the distribution of uniform collisions.
But in the case that m > n (e.g., m ≥ 2n), x_1 = x_2 holds only with negligible probability.
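The distribution COL_h of uniform collisions can be sampled directly for a toy compressing function: pick x_1 uniformly, then pick x_2 uniformly among the preimages of h(x_1). The concrete h and parameters below are assumptions for illustration only:

```python
import random
from collections import defaultdict

def sample_col(h, domain, rng=random):
    """Sample from COL_h: x1 uniform over the domain, then x2 uniform over
    h^{-1}(h(x1)). Assumes the whole (toy-sized) domain can be enumerated."""
    preimages = defaultdict(list)
    for x in domain:
        preimages[h(x)].append(x)
    x1 = rng.choice(domain)
    x2 = rng.choice(preimages[h(x1)])
    return x1, x2

# Toy example: h compresses 8 bits to 4 bits (m = 2n with n = 4).
h = lambda x: x % 16
domain = list(range(256))
x1, x2 = sample_col(h, domain)
assert h(x1) == h(x2)
```

For this compressing h, each image has 16 preimages, so a sample from COL_h has x_1 ≠ x_2 with probability 15/16, illustrating the remark above.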
Let H = {H_n} be a hash function family. We say that H is distributional collision-resistant if for any probabilistic polynomial-time algorithm A and any two negligible functions δ(·) and ε(·), it does not hold that Δ(A(1^n, h), COL_h) ≤ δ(n) except with probability ε(n) over the choice of h ← H_n, for large enough n ∈ N.
We say that a dCRH is infinitely often secure if the above security holds for infinitely many n's rather than large enough n's.

Grover's algorithm and its applications
In this paper, we assume the reader has sufficient knowledge of basic quantum computation and omit formal descriptions. Let F : X → {0, 1} be a function or a database mapping an element of the set X to a bit. The database search problem with regard to F is to find an x ∈ X such that F(x) = 1. We suppose an adversary can only evaluate the image of x by querying F as an oracle, and we suppose F^{-1}(1) is non-empty (with |F^{-1}(1)| ≪ |X|). The adversary can solve this problem only after Ω(|X|) (classical) queries to F in the worst case.
Now we consider the case that the adversary is given quantum access to F : X → Y. That is, the adversary submits Σ_{x∈X, y∈Y} α_{x,y} |x, y⟩ to F and receives in return Σ_{x∈X, y∈Y} α_{x,y} |x, y ⊕ F(x)⟩. This is called the quantum query model. In this model, the time complexity of a quantum algorithm is measured by the number of quantum queries to F.
In the quantum query model, the adversary can solve a database search problem with a smaller number of queries [20].
Theorem 1 [20] Let F : X → {0, 1} be a function mapping an element of the set X to a bit. There is a quantum algorithm that outputs an x with F(x) = 1 using at most O(√(|X|/|F^{-1}(1)|)) quantum queries to F. In the following, we regard the above algorithm as a black box that can solve the database search problem with regard to any function/database F with at most O(√(|X|/|F^{-1}(1)|)) quantum queries to F. We call it Grover's algorithm.
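Theorem 1 can be checked on toy instances by simulating the statevector classically. The instance sizes below are assumptions for illustration; the iteration count follows the O(√(|X|/|F⁻¹(1)|)) bound:

```python
import numpy as np

def grover(n_qubits, marked):
    """Statevector simulation of Grover's algorithm over X = {0, ..., 2^n - 1}.

    F(x) = 1 iff x is in `marked`; the search succeeds after roughly
    (pi/4) * sqrt(|X| / |F^-1(1)|) iterations, matching Theorem 1."""
    N = 2 ** n_qubits
    state = np.full(N, 1 / np.sqrt(N))          # uniform superposition
    iters = int(np.floor(np.pi / 4 * np.sqrt(N / len(marked))))
    for _ in range(iters):
        for x in marked:                        # oracle: phase-flip marked states
            state[x] *= -1
        mean = state.mean()                     # diffusion: inversion about the mean
        state = 2 * mean - state
    return int(np.argmax(state ** 2))           # measure: most likely outcome

assert grover(6, {42}) == 42                    # finds the marked item among 64
```

Of course, the simulation takes time linear in |X|; only a quantum computer realizes the query savings.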
An important application of Grover's algorithm is finding collisions for hash functions [12,42]. If a function H : X → Y satisfies |X| = k|Y| and |H^{-1}(H(x))| = k for each x ∈ X, we say that H is a k-to-1 function. Given a 2-to-1 function H : X → Y where |X| = 2|Y| = 2N, a quantum algorithm can output a collision of H by the following steps: 1. Pick a list L_1 of t = N^{1/3} distinct elements of X, query H on each of them, and store the images in a list L_2. This step requires N^{1/3} queries to H. 2. If there exist i_1 and i_2 such that x_{i_1} ≠ x_{i_2} and y_{i_1} = y_{i_2}, output (x_{i_1}, x_{i_2}). 3. Let F : X → {0, 1} be a function defined as follows: F(x) = 1 if and only if H(x) ∈ L_2 and x ∉ L_1. Run Grover's algorithm on F. Note that |F^{-1}(1)| = t = N^{1/3}, and evaluating F requires a query to H. Grover's algorithm outputs an x with F(x) = 1 after at most O(√(2N/N^{1/3})) = O(N^{1/3}) queries; together with its partner in L_1, this x forms a collision. The above algorithm submits O(N^{1/3}) quantum queries to H in total and outputs a collision of H. There are many studies on quantum collision-finding algorithms [2,6,14,42].
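The structure of this collision finder can be mirrored classically; in the sketch below the Grover call in step 3 is replaced by a plain scan over the domain (so the quantum speedup itself is not illustrated), and the concrete 2-to-1 function is an assumption:

```python
import random

def bht_collision(H, domain, N):
    """Sketch of the O(N^(1/3))-query collision finder above for a 2-to-1 H
    with |range| = N. Step 3's Grover search is replaced by a classical scan."""
    t = max(2, round(N ** (1 / 3)))
    L1 = random.sample(domain, t)
    L2 = {H(x): x for x in L1}                  # steps 1-2: t queries to H
    for i, x1 in enumerate(L1):                 # a collision may lie inside L1
        for x2 in L1[i + 1:]:
            if H(x1) == H(x2):
                return x1, x2
    in_L1 = set(L1)
    for x in domain:                            # step 3: search on F
        if x not in in_L1 and H(x) in L2:       # F(x) = 1 iff H(x) in L2, x not in L1
            return L2[H(x)], x
    return None

# Toy 2-to-1 function: |Dom| = 2N = 512, H(x) = x mod N.
N = 256
H = lambda x: x % N
pair = bht_collision(H, list(range(2 * N)), N)
assert pair is not None and pair[0] != pair[1] and H(pair[0]) == H(pair[1])
```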
In addition, Hosoyamada et al. [23] and Liu and Zhandry [32] generalized this idea and constructed quantum algorithms finding k-multi-collisions, that is, k distinct elements that collide w.r.t. a hash function. For finding a k-multi-collision, the time complexity is respectively O(N^{(3^{k-1}-1)/(2·3^{k-1})}) and O(N^{(2^{k-1}-1)/(2^k-1)}) in these studies. For example, for finding 3-collisions, the time complexity is respectively O(N^{4/9}) and O(N^{3/7}).

Subset-resilient hash function families
Subset resilience was first proposed in [36] as an assumption needed for the EU-CMA (Existential Unforgeability under Chosen Message Attacks) security of HORS. Given a parameter n and a set T = {0, 1}^{l(n)}, suppose there is a function H mapping an element to a subset of T of size at most k. From now on, we let r and k be constant integers. Since we have introduced the definition of k-sampling algorithms, an (r, k)-subset cover can also be defined as follows.
holds for large enough n.
Next, we define a weaker assumption than subset resilience, which we call restricted subset resilience. Before introducing this assumption, we propose a definition called restricted subset cover, similar to subset cover. The difference is that for a restricted subset cover (x, x_1, ..., x_r), h_i(x) is required to be covered by the union of the h_i(x_j) for each i. This is a sufficient but not necessary condition for a subset cover. Let H = (h_1, ..., h_k) be a tuple of functions where h_i : X → Y for each i ∈ [k]. We say that (x, x_1, ..., x_r) is an (r, k)-restricted subset cover of H if x ≠ x_j for each j ∈ [r] and h_i(x) ∈ {h_i(x_1), ..., h_i(x_r)} for each i ∈ [k].

Definition 9 (Restricted Subset-Resilient Hash Function Families) Let
H = {H_n} be a hash function family and Samp_k be a k-sampling algorithm of H. We say that H is an (r, k)-restricted subset-resilient hash function family ((r, k)-rSRH) with regard to Samp_k if for any probabilistic polynomial-time algorithm A, there exists a negligible function ε such that the probability that A finds an (r, k)-restricted subset cover is at most ε(n) for large enough n.
In the following, we focus on (k, k)-restricted subset resilience in particular, and simply call it k-restricted subset resilience. In this situation, finding a (k, k)-restricted subset cover amounts to finding a tuple (x, x_1, ..., x_k) such that h_i(x) = h_i(x_i) and x ≠ x_i for each i ∈ [k]. Thus, (k, k)-restricted subset resilience can be redefined as follows:

Let H = {H_n} be an efficient family ensemble and Samp_k be a k-sampling algorithm of H. We say that H is a secure k-restricted subset-resilient hash function family (k-rSRH) with regard to Samp_k if for any probabilistic polynomial-time algorithm A, there exists a negligible function ε such that the probability that A finds a restricted subset cover of H is at most ε(n) for large enough n.
If (x, x_1, ..., x_k) is a (k, k)-restricted subset cover of H = (h_1, ..., h_k), we simply say that it is a restricted subset cover of H.

SRH and signature schemes
SRH and rSRH are helpful for constructing hash-based few-time signature schemes. For instance, the EU-CMA security of HORS [36] is based on SRH and one-way function families. In addition, HORS can be simply modified to be based on rSRH (instead of SRH) and one-wayness: instead of picking 2^m random strings in the key generation algorithm, it picks k groups of them, each of which contains 2^m random strings. Just like HORS, the public key includes k·2^m values. To sign the message m, it reveals the element indexed by h_i(m) in group i for each i. Compared to HORS, the sizes of the secret key and the public key become k times larger, and the running time of the key generation algorithm also becomes k times longer. However, it requires a weaker assumption, resisting the weak-message attack of [4]. One may consider the sizes of the keys to be extremely large and impractical. This is not a big issue, since the public keys and the secret keys can be compressed by introducing a Merkle tree and a pseudorandom function, respectively. However, the running time of the key generation algorithm cannot be reduced.
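A minimal sketch of this rSRH-based HORS variant follows. SHA-256 stands in for both the one-way function and the h_i, and the toy parameters are assumptions; the real scheme's compression via Merkle trees and PRFs is omitted:

```python
import hashlib
import os

M_BITS, K = 4, 8          # toy parameters: k groups of 2^m secrets each

def f(x: bytes) -> bytes:
    """One-way function, instantiated with SHA-256 (an assumption)."""
    return hashlib.sha256(b"owf" + x).digest()

def h(i: int, msg: bytes) -> int:
    """h_i(msg): index into group i, instantiated with SHA-256 (an assumption)."""
    d = hashlib.sha256(i.to_bytes(2, "big") + msg).digest()
    return d[0] % (1 << M_BITS)

def keygen():
    sk = [[os.urandom(32) for _ in range(1 << M_BITS)] for _ in range(K)]
    pk = [[f(s) for s in group] for group in sk]   # k * 2^m public values
    return sk, pk

def sign(sk, msg: bytes):
    return [sk[i][h(i, msg)] for i in range(K)]    # reveal one secret per group

def verify(pk, msg: bytes, sig) -> bool:
    return all(f(sig[i]) == pk[i][h(i, msg)] for i in range(K))

sk, pk = keygen()
sig = sign(sk, b"hello")
assert verify(pk, b"hello", sig)
```

A forger who sees a few signatures learns one secret per group per message; producing a forgery for a fresh message m' requires h_i(m') to point at already-revealed secrets for every i, which is exactly a restricted subset cover.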
We give the formal description and security proof of the scheme in Lemma 2 in Appendix A.

Remark 3
Essentially, the scheme in Lemma 2 is a very simplified version of FORS [9]. The differences are as follows. First, FORS introduces a Merkle tree structure to compress the public key. Second, it replaces the one-way functions with tweakable hash functions to decrease the security loss. Third, it introduces a randomizer on the message. That is, instead of computing h_i(m), it picks a randomizer r and computes h_i(r||m). Here r is computed by PRF(k, m), where PRF is a pseudorandom function and k is a secret key (in the randomized version of FORS, r is computed by PRF(k, rand, m) where rand is a random nonce). This step bases the security on a "target version" of restricted subset resilience, a weaker assumption. See details in [9]. Consider the case that the pseudorandom-function key k is leaked in the deterministic version of FORS. Then anyone can compute h_i(PRF(k, m)||m) for any message m, and the EU-CMA security of FORS degrades to the restricted subset resilience of H = (h_1, ..., h_k).
Apart from hash-based signature schemes, since SRH has similar properties to cover-free families (CFFs), it can also be used in other CFF-based primitives. For example, in [41] and [22], the authors respectively propose an aggregate signature scheme and a programmable hash function based on CFF. By replacing the CFF with an (r, k)-SRH in these primitives, we obtain an r-time aggregate signature scheme and an (r, 1)-programmable hash function. However, this leads to larger public keys and does not perform better than the original versions. Other practical applications of SRH besides hash-based signature schemes remain an open question.

Quantum algorithms breaking k-rSRH
This section gives a quantum algorithm for finding k-restricted subset covers, showing an upper bound of k-restricted subset resilience security for hash functions in the quantum world. This work is inspired by [32], which shows a quantum algorithm finding multi-collisions. Interestingly, the time and memory complexity for finding a k-restricted subset cover are roughly the same as those needed for finding a (k + 1)-collision.
In this section, we treat the target function as an oracle. We suppose that an algorithm can only obtain the target function value by querying this oracle. The time complexity is evaluated by the number of (quantum) queries to this oracle.
To "inject" the third function into the algorithm, we repeat the steps a few more times than before and slightly change some of them. We first pick N^{7/15} elements from the domain and list them in X_0. After evaluating their images under h_1 and running Grover's algorithm repeatedly, we find collisions for a subset of X_0 w.r.t. h_1. We remove from X_0 the elements whose collisions are not found. Next, we evaluate the images of the survivors w.r.t. h_2 and run Grover's algorithm repeatedly again. As a result, we obtain a subset whose collisions are found w.r.t. both h_1 and h_2. Then we do the same with h_3 and get the final result, which yields a 3-restricted subset cover of (h_1, h_2, h_3).
Our quantum algorithm for finding a 3-restricted subset cover of H = (h_1, h_2, h_3) proceeds as follows: 1. Let t_1 = N^{7/15} (instead of N^{3/7} as in the case of two functions). Pick a list X_0 = {x^{(1)}, ..., x^{(t_1)}} where each x^{(i)} is uniformly chosen from Dom.

2. For each s ∈ {1, 2, 3}, do the following loop:
(a) Evaluate y^{(i)} = h_s(x^{(i)}) for each x^{(i)} ∈ X_{s−1} and list the images in Y_s. Since X_{s−1} contains t_s elements, this step requires t_s queries to H.
(b) Define F_s : Dom → {0, 1} by F_s(x) = 1 if and only if h_s(x) ∈ Y_s and x ∉ X_{s−1}. Run Grover's algorithm on F_s repeatedly, t_{s+1} times, to find second-preimages w.r.t. h_s of elements in X_{s−1}.
(c) Match each output of Grover's algorithm with the element of X_{s−1} it collides with, and remove from X_{s−1} the elements whose second-preimages are not found. Let X_s be the list of the t_{s+1} surviving elements (keeping their recorded second-preimages alongside). This step requires t_{s+1} classical queries to H.
After the loop for h_3, every surviving x has a recorded second-preimage w.r.t. each of h_1, h_2 and h_3, which yields a 3-restricted subset cover of (h_1, h_2, h_3). In total, the algorithm requires O(N^{7/15}) queries to H. Generally, for any constant k, let t_s = N^{(2^{k−s+1}−1)/(2^{k+1}−1)} for each s ∈ {1, ..., k+1}, let H = (h_1, ..., h_k), and run the loop above for s = 1, ..., k.

Remark 4
In practice, (h_1, ..., h_k) is usually instantiated as a division of a long hash H : Dom → Rng^k. In this case, the hash values of x w.r.t. h_1, ..., h_k can be computed by a single hash query. Thus, if the memory is large enough, we can compute all of Y_1, ..., Y_k right after step 1 by t_1 hash computations and skip step 2(a) in the loops. This modification decreases the time cost (but the asymptotic complexity is unchanged).

Now we analyze the time complexity of the above algorithm. In loop s, step 2(a) requires t_s classical queries, step 2(b) requires t_{s+1} · O(√(N/t_s)) = O(N^{(2^k−1)/(2^{k+1}−1)}) quantum queries (each run of Grover's algorithm on F_s costs O(√(N/t_s)), since |F_s^{−1}(1)| is expected to be t_s), and step 2(c) requires t_{s+1} classical queries. Summing over the k loops, the number of quantum queries required by the algorithm is in total O(N^{(2^k−1)/(2^{k+1}−1)}), and the number of classical queries is in total O(t_1) = O(N^{(2^k−1)/(2^{k+1}−1)}). Since k is a constant, we conclude the following statement.

That is, for constant k, the above algorithm finds a k-restricted subset cover of a tuple of 2-to-1 functions with O(N^{(2^k−1)/(2^{k+1}−1)}) quantum queries to H.
Note that here the functions are required to be 2-to-1. Namely, this guarantees that any x ∈ Dom has a second-preimage w.r.t. each h_s, and thus |F_s^{−1}(1)| is exactly equal to |Y_s| = t_s. For a general function, there may exist some "bad" elements in Dom which have no second-preimage w.r.t. some h_i. If we unfortunately pick bad elements into X_{s−1}, |F_s^{−1}(1)| will be smaller than expected, making Grover's algorithm fail within the expected number of steps. In other words, for a general function, we need to ensure that in each loop, X_{s−1} always contains a sufficient number of "good" elements which have second-preimages w.r.t. each h_i.
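The loop structure above can be exercised classically on a toy instance. In the sketch below, Grover's search of step 2(b) is replaced by a plain scan over the domain (so no quantum speedup is shown), and the concrete functions, chosen 4-to-1 so every element has second-preimages, are assumptions:

```python
import random

def find_restricted_subset_cover(hs, domain, t1):
    """Keep a candidate list X and, for each h_s in turn, retain only the
    candidates for which a second-preimage w.r.t. h_s is found (step 2(b)'s
    Grover search is replaced by a classical scan)."""
    X = random.sample(domain, t1)
    partners = {x: [] for x in X}          # second-preimages recorded so far
    for h in hs:
        images = {h(x): x for x in X}      # step 2(a): evaluate h on X
        found = {}
        for z in domain:                   # step 2(b): search for second-preimages
            x = images.get(h(z))
            if x is not None and z != x and x not in found:
                found[x] = z
        X = [x for x in X if x in found]   # step 2(c): drop x without a partner
        partners = {x: partners[x] + [found[x]] for x in X}
    if not X:
        return None
    x = X[0]
    return x, partners[x]                  # (x, x_1, ..., x_k) with h_i(x) = h_i(x_i)

# Toy instance: three exactly-4-to-1 functions on a domain of size 128.
hs = [lambda x, i=i: (x * (2 * i + 3)) % 32 for i in range(3)]
res = find_restricted_subset_cover(hs, list(range(128)), 20)
assert res is not None
x, xs = res
assert all(hs[i](x) == hs[i](xs[i]) and x != xs[i] for i in range(3))
```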
Next, we try to eliminate the need for the 2-to-1 property. We require that the size of the domain be (k + 1) times larger than that of the range; in this case, a constant fraction of the x's have second-preimages w.r.t. each h_i.

Lemma 3 Let H = (h_1, ..., h_k) be a tuple of functions where h_i : Dom → Rng for each i and |Dom| = (k + 1)|Rng| = (k + 1)N. Then at least a 1/(k + 1) fraction of x ∈ Dom have a second-preimage w.r.t. every h_i.

Proof For each h_i, the elements of Dom without a second-preimage w.r.t. h_i are mapped to distinct images, so there are at most |Rng| = N of them. Taking the union over i ∈ [k], at most kN elements lack a second-preimage w.r.t. some h_i, so at least |Dom| − kN = N elements, i.e., a 1/(k + 1) fraction, have second-preimages w.r.t. all of h_1, ..., h_k. This completes the proof.
We slightly change some of the steps in our algorithm. Let c > 0 be a constant. In step 1, we pick (1 + c)kt_1 elements x^{(i)} from Dom (instead of t_1 of them in the previous version). In step 2(b) of loop s, we run Grover's algorithm on F_s (1 + c)kt_{s+1} times rather than t_{s+1} times.
Let S_H be the set of x that have a second-preimage w.r.t. each h_i (that is, S_H = Dom \ ∪_{i∈[k]} Dom_{h_i}, where Dom_{h_i} denotes the set of elements without a second-preimage w.r.t. h_i). Note that the elements of X_0 are uniformly chosen. Due to the Chernoff bound, at least a 1/((1 + c)k) fraction of the x's in X_0 fall in S_H with overwhelming probability.
which is negligible. Thus, Grover's algorithm successfully runs on F_1 in step 2(b) of loop 1.
When we run Grover's algorithm on F_1, it picks a random element from F_1^{−1}(1), which corresponds to a uniformly random element of X_0. Since X_0 is uniformly chosen in step 1, the distribution of each element in X_1 is also uniform. Note that the size of X_1 is (1 + c)kt_2. Again, due to the Chernoff bound, there are at least t_2 x's in X_1 that fall in S_H with overwhelming probability. This implies that |F_2^{−1}(1)| ≥ t_2 holds with overwhelming probability. Suppose it holds. Then Grover's algorithm in step 2(b) of loop 2 can be completed within the expected number of queries.
Similarly, |F_s^{−1}(1)| ≥ t_s holds with overwhelming probability for each s. Assuming it holds for every s, the algorithm goes through and outputs a k-restricted subset cover of H. Note that k and c are constants, and the number of quantum queries is (1 + c)k times larger than before. The total number of queries is still O(N^{(2^k−1)/(2^{k+1}−1)}) quantum queries to H.
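As a quick numeric sanity check of the flavor of Lemma 3, one can estimate the fraction of the domain lying in S_H. Here each h_i is a uniformly random function (an assumption; Lemma 3 concerns arbitrary functions, for which the bound is worse but still constant):

```python
import random

def good_fraction(k: int, N: int) -> float:
    """Estimate the fraction of x in S_H, i.e. of x having a second-preimage
    w.r.t. every h_i, when each h_i is a uniformly random function from a
    domain of size (k+1)N onto a range of size N."""
    dom = (k + 1) * N
    hs = [[random.randrange(N) for _ in range(dom)] for _ in range(k)]
    counts = []
    for h in hs:
        c = {}
        for y in h:
            c[y] = c.get(y, 0) + 1
        counts.append(c)
    good = sum(
        all(counts[i][hs[i][x]] >= 2 for i in range(k)) for x in range(dom)
    )
    return good / dom

frac = good_fraction(k=3, N=500)
assert frac > 1 / (3 + 1)   # comfortably above the 1/(k+1) worst-case bound
```

For random functions the good fraction is close to (1 − e^{−(k+1)})^k, far above the worst-case 1/(k + 1) guarantee.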

Remark 5
Indeed, this algorithm also works in the case that the ranges of the functions are not identical; in other words, the case that h_i : Dom → Rng_i, where |Dom| ≥ (k + 1)|Rng_i| = (k + 1)N but Rng_i may differ from Rng_j for i ≠ j.

A time-memory tradeoff
In the last subsection, we showed an algorithm finding k-restricted subset covers that requires O(N^{(2^k−1)/(2^{k+1}−1)}) quantum queries to the functions. However, we observe that the memory required by this algorithm is also O(N^{(2^k−1)/(2^{k+1}−1)}). Note that the memory cost mainly depends on t_1. We can flexibly adapt |X_0| = t_1 and the other t_s for s ∈ {2, ..., k + 1} to decrease the memory cost at the price of increasing the running time.
We redefine t_s = t_1^{(2^{k−s+1}−1)/(2^k−1)} for s ∈ {2, ..., k + 1}, where t_1 may now be chosen freely. In total, the time complexity becomes O(t_1 + √N · t_1^{−1/(2^{k+1}−2)}): as t_1 decreases, the algorithm requires less memory and more running time.
When t_1 becomes polynomial in log N, the running time becomes close to O(N^{1/2}), which is the time complexity of simply running Grover's algorithm.
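The tradeoff can be explored numerically, with all exponents measured in log base N, under the redefinition of t_s given above (my reading of the tradeoff; the closed forms in the comments follow the per-loop cost t_{s+1}·√(N/t_s) from the earlier analysis):

```python
def tradeoff(k: int, log_t1: float, log_N: float = 1.0) -> float:
    """Return the overall time exponent (base N) when memory is t_1 = N^log_t1
    and t_s = t_1^{(2^(k-s+1)-1)/(2^k-1)} for s = 2, ..., k+1."""
    D = 2 ** k - 1
    # exps[s-1] is the exponent of t_s; note exps[0] == log_t1 by construction.
    exps = [log_t1 * (2 ** (k - s + 1) - 1) / D for s in range(1, k + 2)]
    # quantum cost of loop s in the exponent: e(t_{s+1}) + (log_N - e(t_s)) / 2
    loop_costs = [exps[s] + (log_N - exps[s - 1]) / 2 for s in range(1, k + 1)]
    return max(log_t1, max(loop_costs))

k = 3
balanced = (2 ** k - 1) / (2 ** (k + 1) - 1)           # t_1 = N^(7/15) for k = 3
assert abs(tradeoff(k, balanced) - balanced) < 1e-9    # balanced point: time = memory
assert tradeoff(k, 0.2) > balanced                     # smaller memory => more time
```

At the balanced point every loop costs exactly N^{(2^k−1)/(2^{k+1}−1)}, matching the query count of the previous subsection.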

Constructing dCRH from k-rSRH
In this section, we show that the existence of k-rSRH implies the existence of dCRH. This work is inspired by [28], which shows a similar relation between dCRH and MCRH. Note that our construction is non-black-box. That is, we do not present an explicit construction of dCRH from k-rSRH, but only prove the existence of dCRH instead.

From 2-rSRH to dCRH
In this subsection, we prove a weaker statement: the existence of 2-rSRH implies the existence of dCRH.

Theorem 4 Assuming the existence of a secure 2-rSRH such that each of the functions compresses 2n bits to n bits, there exists an (infinitely often) secure dCRH.
Proof To prove this statement, we show the contrapositive: if infinitely-often secure dCRH does not exist, then there does not exist a secure 2-rSRH with regard to any 2-sampling algorithm. We assume that there exists a probabilistic polynomial-time algorithm A that breaks the distributional collision resistance of any hash function family. Then, we construct a polynomial-time algorithm BreakSR that breaks the 2-restricted subset resilience of any H with regard to any 2-sampling algorithm Samp.
Given H and Samp, let (D_1, D_2) be the distribution of the output of Samp(1^n). Note that D_1 and D_2 are two distributions over H_n. Due to our hypothesis, there exists a probabilistic polynomial-time algorithm A and two negligible functions δ and ε such that, for large enough n, A outputs a collision that is δ(n)-close to COL_h except with probability ε(n) over the choice of h. Let r be the randomness of A. Here, A is given the security parameter, a function h_1 sampled from D_1, and the randomness r; it outputs a collision that is statistically close to COL_{h_1} with all but negligible probability over the choice of h_1. Let (x_1, x_2) ← A(1^n, h_1; r) and denote by A_1 the deterministic algorithm with input (1^n, h_1, r) and output x_1.
We omit the security parameter 1^n in the following. Note that, fixing h_1 in the input of A_1, A_1 becomes a deterministic algorithm whose input is r and whose output is x ∈ {0, 1}^{2n}. Without loss of generality, suppose the length of the randomness of A is at most l_r(n) > 2n. For (h_1, h_2) ← Samp, we define a special function h_2 : {0, 1}^{l_r(n)} → {0, 1}^n as follows: on input r, it outputs h_2(A_1(h_1; r)), where the inner h_2 is the function sampled by Samp (below, which of the two functions is meant is clear from whether the argument is a random string r or an element of {0, 1}^{2n}).
It is not hard to observe that h_2 is samplable via Samp and is efficiently computable. Due to our hypothesis that no dCRH exists, there exists another probabilistic polynomial-time algorithm A' that can find uniform collisions for h_2. That is, there exist two negligible functions δ' and ε' such that A'(h_2) outputs a collision that is δ'(n)-close to COL_{h_2} except with probability ε'(n), for large enough n. Due to the existence of A and A', we can construct an algorithm BreakSR(1^n, h_1, h_2) that outputs a 2-restricted subset cover w.r.t. (h_1, h_2). Formally, we aim to prove that there exists a negligible function μ such that BreakSR succeeds with probability at least 1 − μ(n) for large enough n. Define the above experiment as Game 0. We show that the success probability in Game 0 is overwhelming in the following steps:
- Game 1 differs from Game 0 in the following way: BreakSR does not run (r_1, r_2) ← A'(h_2) in step 2. Instead, it directly picks (r_1, r_2) ← COL_{h_2}. Due to Eq. (20), the statistical distance of Game 0 and Game 1 is less than δ'(n) except with probability ε'(n) (over the choice of h_2), where the probability is taken over the choice of (h_1, h_2) and the randomness of BreakSR. Note that h_2(r_1) = h_2(r_2) means h_2(A_1(h_1; r_1)) = h_2(A_1(h_1; r_2)), and thus, writing x_3 := A_1(h_1; r_2), we have h_2(x_1) = h_2(x_3) with probability 1. Next, we prove that the other three events in Eq. (21) also occur with overwhelming probability.

Lemma 4
Proof In Game 1, r_1 is the first element of a sample from COL_{h_2}, which means that r_1 is uniform over {0, 1}^{l_r(n)}. Recall that (x_1, x_2) ← A(h_1; r_1). Thus, the probability in Eq. (24) is essentially taken over the choice of h_1 and r_1. Suppose A(h_1) and COL_{h_1} are δ(n)-close (which fails only with probability ε(n) over the choice of h_1, due to Eq. (18)). It remains to bound the probability that x_1 = x_2 for (x_1, x_2) ← COL_{h_1}: since x_2 is uniform over the preimage set of h_1(x_1), this occurs with probability inversely proportional to the size of that set. For any y ∈ {0, 1}^n, denote by X^{h_1}_y ⊆ {0, 1}^{2n} the set of x such that h_1(x) = y. For convenience, we write X_y instead of X^{h_1}_y in the following.
We say X_y is "large" if |X_y| ≥ 2^{n/2} and "small" otherwise. Since the X_y's are disjoint for different y, there are at most 2^n different X_y's. Thus, there exist at most 2^{3n/2} elements x such that X_{h_1(x)} is small (otherwise the number of small X_y's would be more than 2^n). As a result, the number of x's such that X_{h_1(x)} is large is more than 2^{2n} − 2^{3n/2}. From Eqs. (25) and (26), we complete the proof of Lemma 4.

Lemma 5
Proof Again, we assume that Δ(A(h_1), COL_{h_1}) ≤ δ(n) (which fails only with probability ε(n)). For any x ∈ {0, 1}^{2n}, denote by R_x ⊆ {0, 1}^{l_r(n)} the set of random coins making A output x as the first element. It follows that the mapping from r_1 to x_1 is regular except with probability δ(n).
Recall how (r_1, r_2) ← COL_{h_2} and x_1 are chosen. First, we uniformly choose r_1 from {0, 1}^{l_r(n)} and let x_1 = A_1(h_1; r_1). Then, r_2 is uniformly chosen from the set S_{x_1} of r that map to an x' with h_2(x') = h_2(x_1). Let X^{h_2}_y be the set of x such that h_2(x) = y; for convenience, we write X_{h_2(x_1)} instead of X^{h_2}_{h_2(x_1)} in the following. Fix r_1. Obviously, we have r_1 ∈ R_{x_1} ⊂ S_{x_1}. In addition, recall that x_3 = A_1(h_1; r_2). Thus, x_3 = x_1 holds if and only if r_2 also falls in R_{x_1}, which occurs with probability |R_{x_1}|/|S_{x_1}| (over the choice of r_2). Note that the mapping between r_1 and x_1 is regular and S_{x_1} contains |X_{h_2(x_1)}| of the sets R_x. The above bound is for a fixed r_1. Since r_1 is uniformly chosen, the distribution of x_1 is δ(n)-close to uniform, where the second inequality is due to Eq. (27). From Eqs. (34) and (35), we complete the proof of Lemma 5.

From Lemmas 4 and 5 we have
where the factor of ε(n) is not accumulated, since the proofs of the two lemmas begin with the same assumption that A(h_1) and COL_{h_1} are δ(n)-close.
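The pipeline of this proof can be exercised end-to-end on a toy instance, with brute-force samplers standing in for the dCRH breakers A and A'. All concrete functions, domains, and the randomness space below are assumptions for illustration:

```python
import random

def uniform_collision(h, domain, rng):
    """Toy stand-in for a dCRH breaker: sample from COL_h by brute force
    (x1 uniform, x2 uniform among x with h(x) = h(x1))."""
    x1 = rng.choice(domain)
    x2 = rng.choice([x for x in domain if h(x) == h(x1)])
    return x1, x2

def break_sr(h1, h2, domain, seed=0):
    """Sketch of BreakSR: find (x1, x2, x3) with h1(x1) = h1(x2) and
    h2(x1) = h2(x3). The special function maps randomness r to
    h2(A1(h1; r)), where A1 outputs the first element of A's collision."""
    def a1(r):                        # deterministic A_1 with randomness r
        x1, _ = uniform_collision(h1, domain, random.Random(r))
        return x1
    h2_special = lambda r: h2(a1(r))
    r_domain = list(range(4 * len(domain)))   # toy randomness space (assumption)
    r1, r2 = uniform_collision(h2_special, r_domain, random.Random(seed))
    x1, x2 = uniform_collision(h1, domain, random.Random(r1))
    x3 = a1(r2)
    return x1, x2, x3

h1 = lambda x: x % 8                  # toy compressing functions (assumptions)
h2 = lambda x: (x // 3) % 8
x1, x2, x3 = break_sr(h1, h2, list(range(64)))
assert h1(x1) == h1(x2) and h2(x1) == h2(x3)
```

Collisions of the special function over the randomness space translate into collisions of h2 over the real domain, exactly as in the reduction above.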

From general k-rSRH to dCRH
In the last subsection, we constructed an algorithm breaking the security of 2-rSRH from an algorithm breaking dCRH. Indeed, this construction can be extended to break the security of k-rSRH for any constant k > 2. This implies the relation between dCRH and general k-rSRH.
Our extension is overviewed as follows. First, we construct a machine BreakSR breaking 2-rSRH from A breaking dCRH, as presented in the last subsection. Next, we construct a machine Break-3-SR breaking 3-rSRH from BreakSR and A. Iteratively, we construct a machine Break-(s+1)-SR breaking (s+1)-rSRH from Break-s-SR and A for s = 2, ..., k − 1. Finally, we obtain Break-k-SR breaking k-rSRH, which proves our statement.
Formally, the recursive algorithm Break-(s+1)-SR(1^n, h_1, h_2, ..., h_{s+1}) breaking (s+1)-rSRH runs in the same recursive manner.

We rule out fully black-box constructions of k-rSRH from OWP. For each i ∈ [k], let X_{C_i} be the set of x that do not have a collision w.r.t. C_i. We observe that |X_{C_i}| < 2^{l(n)}, since otherwise the range size of C_i would have to be larger than 2^{l(n)}. Since x is randomly chosen from {0, 1}^n, the probability that x ∈ ∪_{i∈[k]} X_{C_i} is at most k · 2^{l(n)−n} < 1/2, where the second inequality holds since n > l(n) + log k. Next, we show that (x, x_1, ..., x_k) is a k-restricted subset cover w.r.t. (C_1, ..., C_k) with constant probability. Suppose x ∉ ∪_{i∈[k]} X_{C_i} (which holds with probability more than 1/2). Due to the strategy of CoverFinder, it holds that C_i(x) = C_i(x_i) for each i ∈ [k], but it is possible that x = x_i for some i. Note again that x is randomly chosen from {0, 1}^n and π_1 is a random permutation on {0, 1}^n. Since x ∉ X_{C_i}, there are at least two x' ∈ {0, 1}^n such that C_i(x') = C_i(x), and x_i is uniformly distributed among them. Thus, the probability that x = x_i is at most 1/2. Since k is constant, inequalities (41) and (42) imply that the probability that (x, x_1, ..., x_k) is a k-restricted subset cover is more than 1/2^{k+1}. This completes the proof.

From inverting to compressing
From this subsection on, we show that for every polynomial-time algorithm A given access to the oracle Γ, there exists a negligible function negl such that Pr_{y←{0,1}^n, f_n, CoverFinder, A}[A^Γ(y) = f_n^{−1}(y)] ≤ negl(n) for large enough n. Let A-win be the above event; we need to show that Pr[A-win] ≤ negl(n). In this subsection, we show a weaker statement: we define a special event called y-hit and prove that, without triggering y-hit, A-win occurs only with negligible probability.

Definition 12
In the process of running A^Γ(y), when A makes a query (C_1, ..., C_k) to CoverFinder and obtains (x, x_1, ..., x_k), we say that this query triggers the event y-hit if, in evaluating C_i^f(x) and C_i^f(x_i) for some i ∈ [k], it queries an x' to f_n such that y = f_n(x'). If there exists such a query to CoverFinder triggering y-hit, we simply say that y-hit occurs.

Lemma 7 For any polynomial-time adversary
Now we pick Y_f as follows: (1) pick the lexicographically smallest y* ∈ I_f; (2) run A(y*); (3) every time A makes an f_n-query x and obtains an image y = f_n(x), remove this y from I_f; (4) every time A makes a CoverFinder-query (C_1, ..., C_k) and obtains (x, x_1, ..., x_k), evaluate C_i(x) and C_i(x_i) for each i ∈ [k], and remove from I_f all the outputs of the f_n-queries made during the evaluation; (5) store this y* in a set Y_f; and (6) go to step (1).
Without loss of generality, we suppose that if x ← A(y), A has queried f n (x) in the execution. Thus, for each y * picked in step (1), it is removed from I f in step (3).

Lemma 8 Let q_A be the upper bound of the number of queries made by A and q_C be the upper bound of the number of f-queries required in evaluating a circuit C_i^f, and let q = max(q_A, q_C). It holds that |Y_f| ≥ |I_f|/(3kq^2).

Proof Suppose a query to CoverFinder(C_1, ..., C_k) is replied to by (x, x_1, ..., x_k). Note that evaluating all the C_i(x) and C_i(x_i) makes at most 2kq_C queries to f_n. For every y, A(y) makes at most q_A queries to f_n and also at most q_A queries to CoverFinder. When we pick Y_f, for each y ∈ Y_f, we remove at most q_A elements in step (3) and at most q_A · 2kq_C elements in step (4). Thus, in each loop, we remove at most q_A + 2kq_A q_C ≤ 3kq^2 elements from I_f and then add one element to Y_f. This implies the lemma.
Let X_f = f_n^{−1}(Y_f), and let Z_f be the partial truth table that stores all the mappings of f_n except those from X_f to Y_f. Next, we show that (X_f, Y_f, Z_f) can encode the whole truth table of f_n. We introduce a reconstruction algorithm that outputs the truth table Z of f_n taking (X_f, Y_f, Z_f) as input.
1. While Y_f ≠ ∅:
(a) Pick the lexicographically smallest y ∈ Y_f.
(b) Run A(y) as follows:
- When A outputs an x with f_n(x) = y, set Z(x) = y, remove y from Y_f, and go to step 1.
- When A queries x to f_n, answer Z_f(x).
- When A queries (C_1, ..., C_k) to CoverFinder, do as follows:
i. Obtain (w, π_1, ..., π_k) from the random tape of CoverFinder.
ii. Compute C_i^f(w) for each i ∈ [k]. When it queries x to f_n, answer Z_f(x).
iii. Simulate the remaining evaluations of CoverFinder in the same way, answering every f_n-query from Z_f.
The only problematic case is an f_n-query on some x ∈ X_f, whose image is missing from Z_f, i.e., a query x with f_n(x) ∈ Y_f. This event never happens. Assume it does: then A(y) queries an x to f_n with x ∈ X_f, and thus f_n(x) ∈ Y_f. Note that y ∈ Y_f ⊆ I_f. Due to the construction of Y_f, all the outputs of f_n-queries made during the execution of A(y) have been removed from I_f, so A(y) never queries an x to f_n such that f_n(x) ∈ Y_f. As discussed above, the reconstruction algorithm correctly replies to the queries from A(y) for any y ∈ Y_f, and thus identically rebuilds the real truth table Z. Let ε = 2^{−n/3}. We say f_n is "ε-good" if |I_f| ≥ ε2^n. This implies that for any ε-good f_n, the probability of A-win ∧ ¬y-hit is more than ε over the choice of y ← {0, 1}^n. Let S be the set of f_n such that f_n is ε-good. We reconsider the probability Pr_{f_n, y}[A-win ∧ ¬y-hit].
Since f_n is a random permutation, we consider the following cases: • Case 1: f_n is ε-good.
In this case, we have |I_f| ≥ ε2^n = 2^{2n/3}. Since |I_f| ≤ 3kq^2|Y_f|, we have |Y_f| ≥ 2^{2n/3}/(3kq^2). Note that f_n can be encoded by (X_f, Y_f, Z_f), and that q is a polynomial in n. Since |Y_f| ≥ 2^{2n/3}/(3kq^2), for large enough n this encoding is shorter than the information-theoretically required length for a uniformly random permutation, which can occur only with negligible probability over the choice of f_n. This completes the proof.

From y-hit to y-hit
In this subsection, we show that if there exists an adversary A^{f,CoverFinder} that can invert y while triggering y-hit, then we can construct another algorithm that inverts y without triggering y-hit.

Lemma 10
For any y ∈ {0, 1}^n and any permutation f_n, suppose there exist a polynomial p(n) and a polynomial-time algorithm A such that A inverts y with probability at least 1/p(n). We now prove inequality (55). Let q be the upper bound of the number of queries to CoverFinder made by A, and let C^{(1)}, ..., C^{(q)} be the queries to CoverFinder made by A(y) (each C^{(i)} is a tuple of k circuits (C_1^{(i)}, ..., C_k^{(i)})). There are at most q "chances" for B to terminate. We define two events as follows: • Jump_i: before querying CoverFinder(C^{(i)}), B chooses w* such that C_j^{(i)}(w*) hits y for some j ∈ [k], which leads to termination. • Fail_i: the query C^{(i)} to CoverFinder triggers y-hit.
We observe that the event B-win ∧ ¬y-hit occurs if and only if Jump_i happens for some i ∈ [q] and Fail_j never occurs for any j < i. In addition, we observe that for any f_n, Jump_i happens with half the probability that Fail_i happens. From equalities (56) and (57), the claimed bound follows. This completes the proof.

Main result
In this section, we give the separation result using the lemmas from the last three subsections.
for infinitely many n ∈ N, and thus Pr_{f_n, y, R_{CoverFinder}}[A-win] is non-negligible, contradicting the bound established above. That is, the separation holds for (k, k)-SRH. On the other hand, it is non-trivial to extend the result from (k, k)-SRH to (r, k)-SRH, since (k, k)-SRH is a stronger assumption than (r, k)-SRH for r < k. It is natural that, when we turn to (r, k)-SRH, there will be some additional constraints upon our results. Now we explain how our results on (k, k)-SRH generalize to (r, k)-SRH. We give a simple example of generalizing our first result to finding an (r, k)-restricted subset cover. Given quantum access to H = (h_1, ..., h_4) where h_i : X → Y, we aim to find a (2, 4)-restricted subset cover (x, x_1, x_2) for H; that is, for each i ∈ [4], h_i(x) ∈ {h_i(x_1), h_i(x_2)}. Define h_{1||2}(x) = h_1(x)||h_2(x) and h_{3||4}(x) = h_3(x)||h_4(x). We can run the algorithm in Sect. 3 on (h_{1||2}, h_{3||4}) and obtain (x, x_1, x_2) such that h_{1||2}(x) = h_{1||2}(x_1) and h_{3||4}(x) = h_{3||4}(x_2). Thus, it is a (2, 4)-restricted subset cover (and also a (2, 4)-subset cover) w.r.t. (h_1, ..., h_4).
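The concatenation trick can be demonstrated on a toy instance, with the quantum algorithm of Sect. 3 replaced by a brute-force search and the h_i instantiated with SHA-256 (both assumptions for illustration):

```python
import hashlib

def h(i: int, x: int) -> int:
    """Toy instantiation of h_i : X -> Y with |Y| = 16 (an assumption)."""
    return hashlib.sha256(bytes([i, x])).digest()[0] % 16

def concat_pair(i: int, j: int):
    """h_{i||j}(x) = h_i(x) || h_j(x), a single function with range Y^2."""
    return lambda x: (h(i, x), h(j, x))

h12, h34 = concat_pair(1, 2), concat_pair(3, 4)

# Brute-force stand-in for the algorithm of Sect. 3: find x, x1 with
# h12(x) = h12(x1) and x, x2 with h34(x) = h34(x2); then (x, x1, x2) is a
# (2,4)-restricted subset cover of (h_1, ..., h_4).
domain = range(256)
cover = None
for x in domain:
    c1 = [z for z in domain if z != x and h12(z) == h12(x)]
    c2 = [z for z in domain if z != x and h34(z) == h34(x)]
    if c1 and c2:
        cover = (x, c1[0], c2[0])
        break
assert cover is not None
x, x1, x2 = cover
assert all(h(i, x) in {h(i, x1), h(i, x2)} for i in (1, 2, 3, 4))
```

A collision of a concatenated function is simultaneously a collision of both of its components, which is why the (2, 2)-restricted cover of (h_{1||2}, h_{3||4}) lifts to a (2, 4)-restricted cover of (h_1, ..., h_4).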
Formally, we generalize our results to (r, k)-SRH (and also to (r, k)-rSRH) as follows. Let ω = k/r, and let H* = (h*_1, ..., h*_r) be the tuple of functions obtained by grouping and concatenating the h_i's, where h*_i : X → Y' and |Y'| = |Y|^ω = N^ω. Note that |X| ≥ (r + 1)|Y'| due to the conditions on each h_i. We can run the algorithm in Theorem 3 on H* and obtain (x, x_1, ..., x_r) as an r-restricted subset cover of H* with overwhelming probability. The output is an (r, k)-restricted subset cover of H (and also an (r, k)-subset cover). Due to Theorem 3, the number of required quantum queries to H is O(N^{ω(2^r−1)/(2^{r+1}−1)}).

Theorem 9 (Extended Theorem 5) For constant k ≥ 2 and r, denote ω = k/r. Assuming the existence of a secure (r, k)-SRH such that each of the functions compresses 2ωn bits to n bits, there exists an (infinitely often) secure dCRH.
Theorem 10 (Extended Theorem 7) For constant k ≥ 2 and r, let ω = k/r. There is no fully black-box construction from OWP of an (r, k)-SRH compressing n bits to l(n) < (n − log r)/ω. Proof Again, we consider the case that k = ωr. In the proof of Theorem 9, we show that an (r, k)-rSRH compressing n bits to l(n) bits immediately implies an r-rSRH compressing n bits to ωl(n) =: l'(n) bits. Note that l'(n) < n − log r. Due to Theorem 7, there is no fully black-box construction of such an r-rSRH from OWP. It follows that there is no fully black-box construction from OWP of an (r, k)-rSRH compressing n bits to l(n) bits.

Remark 6
Note that cover-free families (CFFs) are an information-theoretic version of SRH, implying the possibility of constructing a perfect SRH without any assumption (such as OWP). For instance, by implementing the CFF in [17], we can construct an (r, k)-SRH mapping {0, 1}^n to {0, 1}^{l(n)} where 2^{l(n)} ≤ 16r^2 n and k = 2^{l(n)}/4r. However, this does not contradict our results, since the parameter k in this instance is far from a constant. We stress that our Theorems 9 and 10 apply only under their respective constraints.

Conclusion and open questions
In this work, we present three results on the study of subset resilience. The first result is a generic quantum attack against subset resilience, which implies an upper bound on the security of subset resilience. The second result is the relation with dCRH: it implies that SRH is a stronger assumption than dCRH. The third result is the fully black-box separation from one-way permutations, which rules out the possibility of constructing SRH from one-way permutations in a fully black-box manner.
Note that there is a constraint condition in each statement. (For example, we only rule out the possibility of constructing an (r, k)-SRH from OWP in the case that l(n) < (n − log r)/ω, where ω = k/r.) Indeed, we do not know whether the bounds on the parameters and the complexity of the attacks are optimal, and we cannot give a counterexample when the parameters are out of these bounds. It remains an open question whether the results presented in this paper can be improved.
Target subset resilience is a weaker variant of subset resilience. It was first proposed as the security notion needed for the RMA security of HORS [36]. Although the CMA security of SPHINCS [8] and Gravity-SPHINCS [5] reduces to subset resilience, the reductions are non-tight, since finding a subset cover does not immediately yield a forgery. SPHINCS+ [9] closes this gap by introducing interleaved target subset resilience (ITSR), a variant of target subset resilience. It is thus also an interesting open question whether our results can be extended to the target versions of subset resilience.
There are still a number of open questions around SRH. First, we do not know how to construct a provably secure SRH from other assumptions, such as hard mathematical problems (we have only ruled out constructions from one-way permutations). Second, we do not know other practical applications of SRH, such as constructing commitment schemes or analyzing hash functions with particular structures. Third, we do not know whether a CRH can be constructed from SRH. These questions have been answered for other relaxations of CRH, such as multi-collision-resistant hash functions (MCRH).
Indeed, MCRH behaves very similarly to SRH with respect to our results, yet we cannot establish a precise relation between SRH and MCRH. This is another interesting open question around SRH.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

A.1: Definition and security models
Definition 13
A signature scheme Γ = (KeyGen, Sign, Ver) consists of three polynomial-time algorithms along with an associated message space M = {M_n} such that:
• The key generation algorithm KeyGen takes as input the security parameter 1^n. It outputs a pair of keys (pk, sk), where pk and sk are called the public key and the secret key, respectively.
• For security parameter n, the signing algorithm Sign takes as input a secret key sk and a message m ∈ M. It outputs a signature σ.
• For security parameter n, the verification algorithm Ver takes as input a public key pk, a message m ∈ M, and a signature σ. It outputs a bit b. If b = 1, we say σ is a valid signature of m.
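The three-algorithm interface of Definition 13 can be sketched in code. The toy instantiation below (pk = sk, σ = SHA-256(sk ∥ m)) is of course not a secure signature scheme and is not from the paper; it only exercises the contract that Ver(pk, m, Sign(sk, m)) = 1 for every message m:

```python
# Minimal sketch of the (KeyGen, Sign, Ver) interface from Definition 13.
# The instantiation is a deliberately insecure toy (pk = sk), used only to
# illustrate the syntax and the correctness requirement.
import hashlib
import os

def keygen(n: int):
    """KeyGen(1^n): output a key pair (pk, sk)."""
    sk = os.urandom(n // 8)
    return sk, sk            # toy: pk = sk (insecure, interface demo only)

def sign(sk: bytes, m: bytes) -> bytes:
    """Sign(sk, m): output a signature sigma."""
    return hashlib.sha256(sk + m).digest()

def ver(pk: bytes, m: bytes, sigma: bytes) -> int:
    """Ver(pk, m, sigma): output a bit b; b = 1 means sigma is valid."""
    return 1 if hashlib.sha256(pk + m).digest() == sigma else 0
```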
The standard security notion for a signature scheme is existential unforgeability under chosen message attack (EU-CMA). The definition is formalized as an experiment for an adversary A. In this experiment, the adversary is given the public key pk and access to the signing oracle Sign(sk, ·). A is allowed to query the signing oracle at most q times, and all queries are recorded in {M_i}_{i=1}^q. Finally, A is required to output (m*, σ*) such that σ* is a valid signature of m* and m* has not been queried.
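The experiment just described can be sketched as a procedure parameterized by the scheme and the adversary (a sketch under our own naming, not the paper's notation): the oracle records queries, enforces the budget q, and the adversary wins iff its forgery verifies on a fresh message.

```python
# Hedged sketch of the EU-CMA experiment: the adversary gets pk and at most
# q signing-oracle queries (recorded in `queried`); it wins iff it outputs
# (m_star, sigma_star) with Ver(pk, m_star, sigma_star) = 1 and m_star
# never queried. Function names are illustrative.
def eu_cma_experiment(keygen, sign, ver, adversary, n, q):
    pk, sk = keygen(n)
    queried = []

    def sign_oracle(m):
        if len(queried) >= q:
            raise RuntimeError("signing-oracle query budget exceeded")
        queried.append(m)
        return sign(sk, m)

    m_star, sigma_star = adversary(pk, sign_oracle)
    # Adversary wins iff the forgery verifies and is on a fresh message.
    return ver(pk, m_star, sigma_star) == 1 and m_star not in queried
```

Note that replaying an oracle answer does not count as a forgery: the freshness check `m_star not in queried` rejects it.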

A.2: Construction in Lemma 2
In the following, we present the signature scheme of Lemma 2 and prove its EU-CMA security based on restricted subset resilience and one-wayness. Let H be an (r, k)-rSRH mapping m(n)-bit strings to m′(n)-bit strings, and let Samp_k be a k-sampling algorithm. Let f :  Output σ = (s_{1,t_1}, …, s_{k,t_k}).
The correctness of Γ can be easily verified. Next, we prove the security.

Theorem 11
Let H be an (r, k)-rSRH, let f be a one-way function, and let r ≤ c·2^m for some constant c < 1. Then Γ is existentially unforgeable under r-time chosen message attacks.