De Finetti Theorems for Quantum Conditional Probability Distributions with Symmetry

The aim of device-independent quantum key distribution (DIQKD) is to study protocols that allow the generation of a secret shared key between two parties under minimal assumptions on the devices that produce the key. These devices are merely modeled as black boxes and mathematically described as conditional probability distributions. A major obstacle in the analysis of DIQKD protocols is the huge space of possible black box behaviors. De Finetti theorems can help to overcome this problem by reducing the analysis to black boxes that have an iid structure. Here we show two new de Finetti theorems that relate conditional probability distributions in the quantum set to de Finetti distributions (convex combinations of iid distributions), that are themselves in the quantum set. We also show how one of these de Finetti theorems can be used to enforce some restrictions onto the attacker of a DIQKD protocol. Finally we observe that some desirable strengthenings of this restriction, for instance to collective attacks only, are not straightforwardly possible.


Introduction
The aim of quantum key distribution is to establish a shared key between two parties, commonly called Alice and Bob, that is unknown to any third party, commonly called Eve. To achieve this goal, Alice and Bob can share an entangled quantum state and use the correlated outcomes of measurements on this state to generate a secure key pair via a postprocessing protocol. If Eve has tampered with the shared state, Alice and Bob either notice this and abort the protocol, or are able to generate a secure key pair anyway [1][2][3]. In device-independent quantum key distribution (DIQKD) we assume that Eve not only has control over the shared state, but is also able to manipulate the devices that Alice and Bob use to measure the state. As long as the devices are not manipulated in a way that sends information out of Alice's and Bob's laboratories through channels other than those controlled by Alice and Bob, there are still protocols that allow for the generation of a shared secret key [4][5][6][7][8].
In the device-independent context, the devices of Alice and Bob are treated as black boxes and modeled by a conditional probability distribution $P_{AB|XY}$. Alice and Bob can give inputs $x$ and $y$ respectively to the box, and receive outputs $a$ and $b$ with probability $P_{AB|XY}(ab|xy)$. We will often write $P(ab|xy)$ instead of $P_{AB|XY}(ab|xy)$ when the random variables A, B, X and Y are implicitly understood. In the device-dependent case the inputs correspond to the choice of measurement basis, and the outputs to the results of the measurement. We will denote the sets of possible inputs by $\mathcal{X}$ and $\mathcal{Y}$, and the sets of outputs by $\mathcal{A}$ and $\mathcal{B}$. If the inputs and outputs are strings of length n, i.e. they are of the form $\mathcal{X} = \hat{\mathcal{X}}^n$ (where $\hat{\mathcal{X}}$ denotes some set of possible single-round inputs) and analogously for $\mathcal{Y}$, $\mathcal{A}$ and $\mathcal{B}$, we call $P_{AB|XY}$ an n-round box. If $Q_{\hat{A}\hat{B}|\hat{X}\hat{Y}}$ is a box with inputs and outputs in $\hat{\mathcal{X}}, \hat{\mathcal{Y}}, \hat{\mathcal{A}}, \hat{\mathcal{B}}$, we denote by $Q^{\otimes n}_{AB|XY}$ the n-round iid box with
$$Q^{\otimes n}(ab|xy) = \prod_{i=1}^{n} Q(a_i b_i|x_i y_i).$$
Not all boxes $P_{AB|XY}$ describe processes that are physically possible if we assume that no information can leave the laboratories of Alice and Bob. All boxes must then be such that Bob gains no information about Alice's input from his output, and vice versa. We refer to boxes satisfying this constraint as non-signaling:
$$\forall a, x, y, y': \ \sum_b P(ab|xy) = \sum_b P(ab|xy'), \qquad \forall b, x, x', y: \ \sum_a P(ab|xy) = \sum_a P(ab|x'y).$$
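As an illustration of the definitions above, the following minimal Python sketch checks the non-signaling condition for a single-round box with binary inputs and outputs, and builds the n-round iid box as a product of single-round probabilities. The function names and the example boxes (a Popescu-Rohrlich-type box and a deliberately signaling box) are chosen here for illustration only and do not come from the paper.

```python
from itertools import product

BITS = (0, 1)

def pr_box(a, b, x, y):
    """Popescu-Rohrlich-type box: a XOR b = x AND y, otherwise uniform."""
    return 0.5 if (a ^ b) == (x & y) else 0.0

def signaling_box(a, b, x, y):
    """Illustrative signaling box: Bob's output copies Alice's input."""
    return 1.0 if (a == 0 and b == x) else 0.0

def is_non_signaling(box, tol=1e-12):
    """Check that each party's marginal is independent of the other's input."""
    for a, x in product(BITS, BITS):
        # Alice's marginal P(a|x) must not depend on Bob's input y.
        marg = [sum(box(a, b, x, y) for b in BITS) for y in BITS]
        if abs(marg[0] - marg[1]) > tol:
            return False
    for b, y in product(BITS, BITS):
        # Bob's marginal P(b|y) must not depend on Alice's input x.
        marg = [sum(box(a, b, x, y) for a in BITS) for x in BITS]
        if abs(marg[0] - marg[1]) > tol:
            return False
    return True

def iid_box(box, n):
    """n-round iid box: product of the single-round probabilities."""
    def q(a, b, x, y):
        p = 1.0
        for i in range(n):
            p *= box(a[i], b[i], x[i], y[i])
        return p
    return q
```

The check is a brute-force enumeration, so it is only practical for small alphabets; it is meant to make the constraint concrete rather than to be efficient.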
If we furthermore assume that the boxes are described by quantum theory, we can describe the distribution P AB|XY by some quantum state shared between Alice and Bob and some POVMs describing their measurements.
The set of quantum boxes is a proper subset of the set of non-signaling boxes. That the two sets are not identical is demonstrated by the Popescu-Rohrlich box [9]. When constructing DIQKD security proofs, the analysis would often be simplified if there were some form of reduction from general box behaviour to the iid case, as it is substantially easier to construct security proofs for the latter (as achieved in e.g. [5,6,10]). To find such a reduction, so-called de Finetti theorems may be a promising tool, as they have previously been used to achieve this goal in the case of device-dependent QKD [3,11]. De Finetti theorems allow us to relate the entries of an arbitrary permutation invariant box to the entries of a de Finetti box (a convex combination of iid boxes). De Finetti theorems were originally developed for random variables [12] and then extended to quantum states [3,13,14,15] and boxes [16,17]. For example, in [16] it was shown that for each set of single-round inputs $\hat{\mathcal{X}}$ and outputs $\hat{\mathcal{A}}$ there exists a de Finetti box $\tau_{A|X}$ such that for all permutation invariant boxes $P_{A|X}$ it holds that
$$\forall a \in \mathcal{A}, x \in \mathcal{X}: \quad P(a|x) \le (n+1)^{|\hat{\mathcal{X}}|(|\hat{\mathcal{A}}|-1)} \, \tau(a|x). \qquad (5)$$
(Here we treat the inputs and outputs of Alice and Bob as lumped together to a single input and output.) However, the de Finetti theorems for boxes derived in e.g. [16,17] have the drawback that the de Finetti boxes cannot be restricted to the quantum set even if the original permutation invariant boxes are quantum. This creates an obstacle for applications, because many existing DIQKD security proofs under the iid assumption exploit the properties of the quantum set [5,6,10]. This implies that such proofs cannot be combined with the de Finetti theorems in [16,17] to obtain security against non-iid attacks, as those de Finetti theorems involve boxes that are not in the quantum set. (It is true that one could aim to derive a security proof for all iid behaviours in the non-signaling rather than quantum set, then apply the de Finetti theorems of [16,17] to obtain security against non-iid attacks. However, this would give lower asymptotic keyrates and noise tolerance compared to security proofs against quantum attackers, because non-signaling behaviours yield a significantly larger class of possible attacks.) Ideally, we would like to find a de Finetti theorem that can extend the iid security proofs against quantum attackers in [5,6,10] to cover non-iid quantum attackers, while preserving the asymptotic keyrates and noise tolerance from those proofs, similar to the situation for device-dependent QKD [11]. While we do not fully achieve this goal in this work, we do obtain a de Finetti theorem that allows a partial reduction to the iid case (in a sense described in section 3), and we also highlight some concrete difficulties that may be faced when aiming for a full reduction.
Regarding other existing approaches for reductions to the iid case, we note that for DIQKD protocols that use only one-way communication for error correction [3], a proof technique known as the entropy accumulation theorem (EAT) [18] can be used to essentially reduce the analysis of non-iid (but time-ordered) boxes to the iid scenario [8]. Alternatively, the techniques in [19,20] can be used to obtain security proofs for such protocols even when the boxes accept all inputs in parallel, though the resulting asymptotic keyrates are lower than in the iid case. There are however protocols that do not use only one-way error correction (broadly referred to as advantage distillation protocols [3,10,21,22,23,24], such as the Cascade protocol [25] or the repetition-code protocol [3,21]), and these protocols do not admit a security proof via those approaches. The significance of these protocols in DIQKD is that under an iid assumption, it has been shown [10] that they can achieve higher noise tolerances than one-way error correction (i.e. they can achieve positive keyrates even when the keyrate given by one-way error correction is zero), analogous to results for device-dependent QKD [3,23,24]. However, for device-dependent QKD these improved noise tolerances can be lifted to the non-iid case using de Finetti arguments as mentioned above, whereas in DIQKD such an argument is currently missing; in fact, there are currently no security proofs for DIQKD advantage distillation protocols against non-iid attacks. Finding a way to resolve this would be useful in, for instance, tackling the foundational question of characterizing which nonlocal box behaviours can be used for DIQKD [26] (analogous to the question of bound information in device-dependent QKD [27]), since advantage distillation can have higher noise tolerances than one-way error correction.
Our main result in this work consists of two de Finetti theorems for Clauser-Horne-Shimony-Holt (CHSH) symmetric quantum boxes (see definition 4), such that the de Finetti box is quantum as well. We further show how the first de Finetti theorem could be used in the security proofs of DIQKD protocols, yielding a partial reduction to the iid case.
The rest of this paper is structured as follows: In section 2.1 we show the first de Finetti theorem (theorem 6). It is similar to eq. (5) and shows that the entries of a CHSH symmetric quantum box are upper bounded, up to a factor polynomial in n, by the entries of a fixed quantum de Finetti box. In section 2.2 we then show the second de Finetti theorem (theorem 11), which is closer to the original de Finetti theorems for random variables and quantum states. It states that the marginal of the first k rounds of a n-round CHSH symmetric quantum box is close to (and not just upper bounded by) a quantum de Finetti box. Our results in this section rely on the existence of appropriate threshold theorems (see e.g. theorem 7 below). A natural question is whether it is possible to derive them without using the threshold theorems; however, we show in appendix B that proving a de Finetti theorem of the first form is essentially equivalent to proving a threshold theorem, hence it would be a result of comparable difficulty.
In light of this, our results cannot currently be used as an alternative method to prove threshold theorems. However, our focus is more on the application of these results for DIQKD security proofs. Hence in section 3, we present an application of the first de Finetti theorem to bound the diamond distance between two channels acting on boxes. The diamond distance measures how well these two channels can be distinguished by an attacker. Since the security of a DIQKD protocol is related to the diamond distance between the protocol and an ideal channel [28,29], bounds on the diamond distance can be useful in DIQKD security proofs. We show that to prove security of a DIQKD protocol against arbitrary (so-called coherent [30]) quantum attacks it is sufficient to prove security against an adversary who holds an extension of a fixed quantum de Finetti box (theorem 15). However, this extension may not be quantum itself and can only be restricted to the non-signaling set.
In section 4 we show that the result from section 3 cannot be strengthened to restrict the attacker further to collective attacks [30] (attacks where the black box can be described by an iid quantum state and iid measurements for Alice and Bob, see definition 16). For this, we construct two channels that cannot be distinguished at all using boxes compatible with collective attacks, but can be distinguished if arbitrary quantum boxes are available (theorem 17). This shows that the theorem from section 3 cannot be immediately used to conclude security against coherent attacks from security against collective attacks.
Here $\|x\|_0$ denotes the number of non-zero entries of an n-bit string x.
If for an index i we have $a_i \oplus b_i = x_i y_i$ we say that the CHSH game is won in round i [31]. Thus, definition 4 basically states that a box is CHSH symmetric if its entries $P(ab|xy)$ depend only on the number of rounds in which the CHSH game was won. Our definition of CHSH symmetry differs slightly from the one in [16], where it is only required that $P(ab|xy)$ depends only on the string $a \oplus b \oplus xy$ (rather than only on its number of non-zero entries). Our definition agrees with the definition in [16] for permutation symmetric boxes; essentially, we have implicitly incorporated the constraint of permutation symmetry into definition 4 itself.
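The win-counting rule and the resulting symmetry condition can be made concrete with a small brute-force sketch. The code below is an illustration only: the function names and the two example boxes are hypothetical, and CHSH symmetry is read here as "the entry P(ab|xy) depends only on the number of won rounds", as described in the text.

```python
from itertools import product

def chsh_wins(a, b, x, y):
    """Number of rounds i with a_i XOR b_i == x_i AND y_i."""
    return sum((ai ^ bi) == (xi & yi) for ai, bi, xi, yi in zip(a, b, x, y))

def is_chsh_symmetric(box, n, tol=1e-12):
    """Brute-force check that box(a, b, x, y) depends only on the win count."""
    value_for_k = {}
    strings = list(product((0, 1), repeat=n))
    for a, b, x, y in product(strings, repeat=4):
        k = chsh_wins(a, b, x, y)
        p = box(a, b, x, y)
        # Remember the first value seen for this k and compare later ones to it.
        if abs(value_for_k.setdefault(k, p) - p) > tol:
            return False
    return True

def iid_win_box(n, w):
    """Hypothetical iid box: each round is won with probability w."""
    def box(a, b, x, y):
        k = chsh_wins(a, b, x, y)
        return (w / 2) ** k * ((1 - w) / 2) ** (n - k)
    return box

def constant_zero_box(a, b, x, y):
    """Deterministic box that always outputs all-zero strings."""
    return 1.0 if not any(a) and not any(b) else 0.0
```

The enumeration grows as 16^n, so this is only feasible for very small n; its purpose is to show what the definition demands, not to test realistic boxes.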
Of course an attacker can initially manipulate the boxes of Alice and Bob such that they do not possess CHSH symmetry. However, Alice and Bob can run the following procedure to enforce CHSH symmetry: First Alice chooses a random permutation π and transmits it to Bob over the authenticated channel, then Alice and Bob permute their inputs and outputs according to π. If we view the original box as a conditional probability distribution $P_{ABE|XYZ}$ for Alice, Bob and Eve, we can view π as additional knowledge E′ of Eve and describe the box after π has been applied by $\tilde{P}_{ABEE'|XYZ}$. Alice and Bob will not require π for the remainder of the protocol and can now discard it; therefore, in the rest of our discussion we do not include it in the marginal of the Alice-Bob boxes, whereas on Eve's component we will simply absorb E′ into E and no longer explicitly denote it. Then the marginal $\tilde{P}_{AB|XY}$ has permutation symmetry. To go from permutation symmetry to CHSH symmetry, Alice and Bob apply the depolarization protocol described in appendix A of [32] to each round. If the box in the honest implementation of the DIQKD protocol has CHSH symmetry, it is unchanged by this depolarization protocol. It is important to note that only the marginal box of Alice and Bob has CHSH symmetry after this protocol: from the perspective of Eve, who knows the permutation π and the random bits chosen in the depolarization protocol in [32], the box may not have CHSH symmetry. However, we highlight that in the case of device-dependent QKD, this did not prevent constructing a security proof via de Finetti arguments [11], and hence there still remains the possibility of a similar result for DIQKD.
It was shown in [16] that a de Finetti theorem holds for CHSH symmetric boxes:
Theorem 5 (Corollary 6 in [16]). For each number n of rounds there is an n-round CHSH symmetric de Finetti box $\tau_{AB|XY}$ such that for all CHSH symmetric boxes $P_{AB|XY}$ it holds that $P(ab|xy) \le (n+1)\,\tau(ab|xy)$.
Theorem 5 was derived for all CHSH symmetric boxes P AB|XY , even if they are not quantum. However, the de Finetti box τ AB|XY constructed in the theorem is also not quantum.
The main result we derive in this section is a de Finetti theorem for quantum CHSH symmetric boxes, hence resolving this issue:
Theorem 6. For each number n of rounds there is an n-round CHSH symmetric quantum de Finetti box $\tau_{AB|XY}$ such that for all CHSH symmetric quantum boxes $P_{AB|XY}$ it holds that $P(ab|xy) \le (n+1)^2\,\tau(ab|xy)$.
The maximal probability with which any single-round quantum box can win the CHSH game is $w = \frac{2+\sqrt{2}}{4}$ [33]. This value is called the quantum value of the CHSH game. To prove theorem 6 we need the following specialization of a theorem from [34] to the CHSH case. It says that the probability that the fraction of won CHSH games is larger than a certain threshold (namely the value of the CHSH game) is exponentially small in n. Such theorems are commonly referred to as threshold theorems.
Theorem 7 (Theorem 5 in [34]) where $\Pr_{P_{AB|XY},\mu^{\otimes n}}$ denotes the probability measure in which X and Y are sampled from $\mu^{\otimes n}$ and A and B are sampled using $P_{AB|XY}$, and $D(\cdot\|\cdot)$ denotes the relative entropy.
Note that the box $P_{AB|XY}$ in theorem 7 does not have to be CHSH symmetric. However, if $P_{AB|XY}$ is CHSH symmetric then we can describe it completely by n + 1 parameters $\{p_0, p_1, \ldots, p_n\}$, which we define as follows: for each $k \in \{0, \ldots, n\}$, take any $a, b, x, y \in \{0,1\}^n$ such that $k = \|a \oplus b \oplus xy \oplus 1\|_0$ (in other words, a, b, x, y win exactly k instances of the CHSH game). Then define
$$p_k = \binom{n}{k} 2^n P(ab|xy). \qquad (11)$$
By CHSH symmetry, all combinations of a, b, x, y with the same value of k have the same value of $P(ab|xy)$, so the expression (11) is indeed well-defined. The normalization factor $\binom{n}{k} 2^n$ is chosen to give these parameters a simple interpretation: namely, $p_k$ is in fact equal to the probability of winning exactly k CHSH games for the box $P(ab|xy)$ (regardless of the input distribution). To see this, notice that for fixed x, y and k there are exactly $\binom{n}{k} 2^n$ pairs a, b such that $k = \|a \oplus b \oplus xy \oplus 1\|_0$ [16]. Therefore for any probability measure µ on the n-round inputs x and y, we indeed have
$$\Pr\left[\|A \oplus B \oplus XY \oplus 1\|_0 = k\right] = \sum_{x,y} \mu(x,y) \sum_{\substack{a,b:\\ \|a \oplus b \oplus xy \oplus 1\|_0 = k}} P(ab|xy) = \sum_{x,y} \mu(x,y) \binom{n}{k} 2^n \frac{p_k}{\binom{n}{k} 2^n} = p_k,$$
where in the second equality we used that the summand does not depend on a and b, and that for fixed x and y there are $\binom{n}{k} 2^n$ possible a and b with $\|a \oplus b \oplus xy \oplus 1\|_0 = k$. Theorem 7 then implies
$$\sum_{l \ge k} p_l \le e^{-nD(k/n,\,1-k/n\,\|\,w,\,1-w)}.$$
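Two of the quantitative statements above can be checked numerically for small n. The sketch below (illustrative names, standard library only) counts by brute force the output pairs (a, b) that win exactly k rounds for fixed inputs, and evaluates the exponential bound $e^{-nD(k/n,1-k/n\|w,1-w)}$. As a sanity check it compares that bound with the binomial tail of an iid strategy that wins each round with probability w; that this particular tail obeys the bound is the standard Chernoff bound, which is only a special case and not the threshold theorem itself.

```python
from itertools import product
from math import comb, exp, log, sqrt

W = (2 + sqrt(2)) / 4  # quantum value of the CHSH game

def count_pairs(n, k, x, y):
    """Count output pairs (a, b) winning exactly k rounds for fixed inputs x, y."""
    count = 0
    for a in product((0, 1), repeat=n):
        for b in product((0, 1), repeat=n):
            wins = sum((ai ^ bi) == (xi & yi) for ai, bi, xi, yi in zip(a, b, x, y))
            count += (wins == k)
    return count

def binary_relative_entropy(p, q):
    """D(p, 1-p || q, 1-q) in nats, with the convention 0 log 0 = 0."""
    total = 0.0
    for pi, qi in ((p, q), (1 - p, 1 - q)):
        if pi > 0:
            total += pi * log(pi / qi)
    return total

def threshold_bound(n, k, w=W):
    """Exponential bound exp(-n D(k/n || w)) on winning at least k rounds."""
    return exp(-n * binary_relative_entropy(k / n, w))

def iid_tail(n, k, w=W):
    """Probability of at least k wins when each round is won independently w.p. w."""
    return sum(comb(n, l) * w**l * (1 - w) ** (n - l) for l in range(k, n + 1))
```

For example, `count_pairs(3, 2, (0, 1, 1), (1, 1, 0))` can be compared with `comb(3, 2) * 2**3`, and `iid_tail(20, 19)` with `threshold_bound(20, 19)`.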
We call a threshold theorem "perfect" if it is such that the bound is, up to a factor polynomial in n, equal to $\binom{n}{k} w^k (1-w)^{n-k}$, the probability to win exactly k games if a single game is won with probability w. "Imperfect" threshold theorems can be roughly described as giving bounds of the more general form $e^{-n\Delta(k/n)}$, where ∆ is some potentially "looser" way to quantify the distance from k/n to w [35]. Our first de Finetti theorem (theorem 6) and its generalization (theorem 22) both require a perfect threshold theorem. However, the proof of our second de Finetti theorem (theorem 11) still holds with an imperfect threshold theorem, though the resulting bound would be weaker.
To prove theorem 6 we need one further ingredient:
Lemma 8. Let $f : [a, b] \to \mathbb{R}_{\ge 0}$ be a concave function that attains its maximum at some $x^* \in [a, b]$. Then $\forall n \in \mathbb{N}$,
The proof is given in appendix A. Now we are ready to prove theorem 6:
Proof of theorem 6. Let $w = \frac{2+\sqrt{2}}{4}$ be the quantum value of the CHSH game and define
Now set
Then
Note that
and
Therefore, f is concave and its maximum on the interval [0, 1] occurs at p = α.
Since f is concave, we can apply lemma 8 to eq. (18) and get
Now we turn to the box $P_{AB|XY}$. Following the earlier notation, let $p_k$ denote the probability of winning exactly k CHSH games with this distribution. We observe that
• If α > w, then the threshold theorem (theorem 7) implies
• If α < 1 − w, we can use the threshold theorem to get a bound on the minimal number of won games, because the CHSH game has the property that winning exactly k games is just as hard as losing exactly k games (and thus winning n − k games). Hence we have
• If α ∈ [1 − w, w], we can rewrite the trivial bound $p_k \le 1$ in the form
Hence we can summarize the implications of the threshold theorem as
We can simplify the term in the supremum by inserting the definition of relative entropy:
Now recall that by eq. (11), $P(ab|xy)$ is related to $p_k$ by
$$P(ab|xy) = \frac{p_k}{\binom{n}{k} 2^n}.$$
It is a well known identity of the Beta function that
where for the last inequality we used lemma 8 and the fact that the maximum of f on [0, 1] is f(α). Inserting eq. (28) followed by eq. (25)-(26) into eq. (27) gives
Combining eq. (21) and eq. (29) yields
$$P(ab|xy) \le (n+1)^2 \, \tau(ab|xy),$$
as desired.
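The proof invokes an identity of the Beta function. Assuming the identity meant is the standard one, $\int_0^1 p^k (1-p)^{n-k}\,dp = \frac{1}{(n+1)\binom{n}{k}}$ (this is our reading, not a formula taken verbatim from the text), it can be verified exactly with rational arithmetic. The sketch below uses illustrative function names and also checks the elementary consequence that $\binom{n}{k}\alpha^k(1-\alpha)^{n-k} \le 1$ for α = k/n, since the left-hand side is a binomial probability.

```python
from fractions import Fraction
from math import comb

def beta_integral(n, k):
    """Exact integral of p^k (1-p)^(n-k) over [0, 1]."""
    m = n - k
    # Expand (1-p)^m binomially and integrate each monomial p^(k+j) exactly.
    return sum(Fraction((-1) ** j * comb(m, j), k + j + 1) for j in range(m + 1))

def closed_form(n, k):
    """Closed form 1 / ((n+1) * C(n, k)) of the same integral."""
    return Fraction(1, (n + 1) * comb(n, k))

def peak_probability(n, k):
    """Binomial probability C(n,k) a^k (1-a)^(n-k) at a = k/n, as an exact fraction."""
    alpha = Fraction(k, n)
    return comb(n, k) * alpha**k * (1 - alpha) ** (n - k)
```

Because everything is computed with `Fraction`, the comparison `beta_integral(n, k) == closed_form(n, k)` is exact rather than approximate.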
The arguments in the proof of theorem 6 are not specific to CHSH symmetry. In fact, in appendix B we show that we get such a de Finetti theorem whenever a threshold theorem analogous to theorem 7 holds.

The second de Finetti theorem
The de Finetti theorem discussed in the previous section is similar to the de Finetti theorems for boxes shown in [16]; they show that the entries of some given box are upper bounded by the entries of a de Finetti box. The original de Finetti theorems for random variables and quantum states are of a different flavor: They show that the marginal on the first k rounds of an arbitrary n-round permutation invariant state is close to a de Finetti state if k ≪ n. In this section we show a theorem of this type for CHSH symmetric boxes. We use the following distance measure on the space of boxes:
Definition 9. Let $P_{A|X}$ and $Q_{A|X}$ be two boxes with the same input set $\mathcal{X}$ and output set $\mathcal{A}$. Their distance is
$$\|P_{A|X} - Q_{A|X}\| = \max_{x \in \mathcal{X}} \sum_{a \in \mathcal{A}} |P(a|x) - Q(a|x)|.$$
This distance is just the ℓ1 distance of the probability distributions of a, maximized over the input x. To state the de Finetti theorem we need to introduce the notion of the marginal of an n-round box. In general, this marginal may not be well-defined without some kind of no-signaling condition across different rounds (since otherwise the output distribution of one round could potentially depend on the input in another round). However, it turns out that for CHSH symmetric boxes this is indeed well-defined, as we now show.
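The distance used here, described in the text as the ℓ1 distance between output distributions maximized over the input, is straightforward to compute for small boxes. The following sketch represents a box as a nested dictionary `box[x][a]`; this representation and the example boxes are assumptions made for illustration.

```python
def box_distance(P, Q):
    """l1 distance between output distributions, maximized over the input x."""
    return max(
        sum(abs(P[x][a] - Q[x][a]) for a in P[x])
        for x in P
    )

# Illustrative single-party boxes with binary input and output.
deterministic_on_zero = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.5, 1: 0.5}}
uniform = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}
flipped = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.5, 1: 0.5}}
```

With this normalization (no factor 1/2), two boxes that are perfectly distinguishable for some input have distance 2, as in the pair `deterministic_on_zero` and `flipped`.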

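Lemma 10. Let $P_{AB|XY}$ be an n-round CHSH symmetric quantum box and let 1 ≤ k ≤ n be an integer. Then the expression
$$P^k_{AB|XY}(a_1 \ldots a_k,\, b_1 \ldots b_k \,|\, x_1 \ldots x_k,\, y_1 \ldots y_k) = \sum_{\substack{a_{k+1} \ldots a_n \\ b_{k+1} \ldots b_n}} P_{AB|XY}(a_1 \ldots a_n,\, b_1 \ldots b_n \,|\, x_1 \ldots x_n,\, y_1 \ldots y_n) \qquad (31)$$
is independent of the choice of $x_{k+1} \ldots x_n$ and $y_{k+1} \ldots y_n$, and we shall refer to it as the marginal of the first k rounds. Furthermore, $P^k_{AB|XY}$ is a CHSH symmetric quantum box (of k rounds).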
Proof. We shall use the notation $a = (a_1 \ldots a_k)$ and $a' = (a_{k+1} \ldots a_n)$, and define b, b′, x, x′, y, y′ analogously. To see that $P^k_{AB|XY}$ is independent of the choice of x′ and y′ we calculate
$$P^k(ab|xy) = \sum_{a',b'} P(aa', bb'|xx', yy') = \sum_{a',b'} P(aa', b(b' \oplus x'y')|x0, y0) = \sum_{a',b'} P(aa', bb'|x0, y0),$$
where in the second equality we used the CHSH symmetry of $P_{AB|XY}$ and in the third equality we shifted the summation variable b′ by x′y′. To see the CHSH symmetry of $P^k_{AB|XY}$, note that by CHSH symmetry of $P_{AB|XY}$, $P(aa', bb'|x0, y0)$ only depends on $a \oplus b \oplus xy$ and $a' \oplus b'$. Hence $P^k(ab|xy)$ only depends on $a \oplus b \oplus xy$. Furthermore, the permutation invariance of $P_{AB|XY}$ immediately implies that $P^k_{AB|XY}$ is also permutation invariant, and hence we conclude that $P^k_{AB|XY}$ is CHSH symmetric. Finally, the fact that $P^k_{AB|XY}$ is a quantum box immediately follows from the fact that $P_{AB|XY}$ is quantum.
Now we can state the de Finetti theorem:
Theorem 11. Let $P_{AB|XY}$ be an n-round CHSH symmetric quantum box and let $P^k_{AB|XY}$ be the marginal of the first k rounds as defined in eq. (31). There is a k-round CHSH symmetric quantum de Finetti box $\tau_{AB|XY}$ such that
For the proof of theorem 11 we first note that the distance between two CHSH symmetric boxes is just the ℓ1 distance between the distributions of the wins and losses of the CHSH game, which are independent of the input into the box.

Lemma 12. Let $P_{AB|XY}$ and $Q_{AB|XY}$ be CHSH symmetric n-round boxes and let $W = A \oplus B \oplus XY \oplus 1 \in \{0,1\}^n$ be the random variable that indicates in which rounds the CHSH game was won. Let $P_W$ and $Q_W$ be the distributions of W. Then
$$\|P_{AB|XY} - Q_{AB|XY}\| = \|P_W - Q_W\|_1.$$
Proof. For all x, y
Another ingredient for the proof of theorem 11 is a bound on the ℓ1-distance between two binomial distributions:
Lemma 13. Let k ∈ N and p, q ∈ (0, 1). Denote by P = Binom(k, p) and Q = Binom(k, q) the binomial distributions with k trials and success probabilities p and q. Then
Proof. Denote by $P_0$ and $Q_0$ the Bernoulli distributions with success probability p and q respectively. We use Pinsker's inequality and the reverse Pinsker's inequality (Lemma 4.1 in [36]) to calculate
Now we are ready to prove the de Finetti theorem:
Proof of theorem 11. Denote by $p_N$ the probability that Alice and Bob win exactly N CHSH games on the box $P_{AB|XY}$.
By the de Finetti theorem for random variables [12] we have
where $P^k_W$ denotes the distribution of the first k bits of W. For p ∈ [0, 1] denote by $Q(p)_{\hat{A}\hat{B}|\hat{X}\hat{Y}}$ the single-round box with CHSH winning probability p. Then by lemma 12 and eq. (40)
The box $\sum_{N=0}^{n} p_N \, Q\!\left(\tfrac{N}{n}\right)^{\otimes k}$ is CHSH symmetric and de Finetti, but not quantum. The problems are the terms with N > nw and N < n(1 − w), where $w = \frac{2+\sqrt{2}}{4}$. We define a quantum CHSH symmetric de Finetti box as
We will show
The statement of theorem 11 then follows by combining this bound and the bound in eq. (41) using the triangle inequality. Let δ > 0. We split the sum in the definition of $\tau_{AB|XY}$ to obtain
where in the last line we used $\|P_{AB|XY} - Q_{AB|XY}\| \le 2$ for all normalized boxes $P_{AB|XY}$ and $Q_{AB|XY}$. We start by bounding the terms with N ∈ [wn, (w + δ)n] and N ∈ [(1 − w − δ)n, (1 − w)n] using lemma 13:
where we used 2
Now we turn to the terms with N > (w + δ)n and N < (1 − w − δ)n. By the threshold theorem for the CHSH game (theorem 7) it holds that
where in the last step we used Pinsker's inequality, i.e. $D(p, 1-p \,\|\, q, 1-q) \ge 2(p-q)^2$.
Putting together the bounds for N ∈ [wn, (w + δ)n] and N > (w + δ)n we find
Now we choose
and obtain
This completes the proof.
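The proof relies on controlling the ℓ1 distance between two binomial distributions via Pinsker's inequality. The sketch below does not reproduce the bound of lemma 13; it only illustrates, under standard facts, the Pinsker step $\|P - Q\|_1 \le \sqrt{2 D(P\|Q)}$ together with the identity $D(\mathrm{Binom}(k,p)\|\mathrm{Binom}(k,q)) = k\,D(p,1-p\|q,1-q)$. Function names and the numerical example are illustrative.

```python
from math import comb, log, sqrt

def binomial_pmf(k, p):
    """Probability mass function of Binom(k, p) as a list indexed by successes."""
    return [comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(k + 1)]

def l1_distance(P, Q):
    """l1 distance between two distributions given as equal-length lists."""
    return sum(abs(pj - qj) for pj, qj in zip(P, Q))

def bernoulli_relative_entropy(p, q):
    """D(p, 1-p || q, 1-q) in nats for p, q strictly between 0 and 1."""
    return p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))

def pinsker_bound(k, p, q):
    """Upper bound sqrt(2 k D(p||q)) on the l1 distance of Binom(k,p), Binom(k,q)."""
    return sqrt(2 * k * bernoulli_relative_entropy(p, q))

# Example: k = 10 rounds, winning probabilities 0.85 versus 0.80.
example_distance = l1_distance(binomial_pmf(10, 0.85), binomial_pmf(10, 0.80))
example_bound = pinsker_bound(10, 0.85, 0.80)
```

The bound grows like the square root of k times the gap |p − q|, which is the scaling that makes a small window δ around w useful in the proof.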
The choice of δ in eq. (52) is not optimal, it does not give the minimal possible error term in theorem 11. However, the improvement that can be achieved by choosing δ optimally does not change the O ln(n/k)k/n behavior. To see this, choose for some β > − ln(n/k)/2. Then the error term is given by The optimal β is such that A numerical optimization indicates that for ln(n/k) = 0 the minimum of C ′ is achieved at β = 0, so C ′ = 4. As ln(n/k) increases the minimum of C ′ decreases slowly, at ln(n/k) = 10 it is given by C ′ ≈ 2.03, and at ln(n/k) = 100 by C ′ ≈ 0.96. As ln(n/k) → ∞ it converges C ′ → 0, which can be seen by choosing β = (ln(n/k)) −1/4 . Regardless of the choice of β the error is always at least C ln(n/k)k/n.

Applications
In this section we show how our first de Finetti theorem (theorem 6) has applications in DIQKD security proofs, by first using it to derive a bound on channel distinguishability, then discussing its implications for security proofs. This result, and the proof of it, are analogous to theorem 25 in [16], except that we use theorem 6 as the de Finetti theorem, rather than the statement in eq. (5). We remark that the works [19,20] also used threshold theorems (of somewhat different forms) to obtain DIQKD security proofs. However, as discussed in the introduction, their proof techniques currently only apply to protocols using one-way error correction, and yield lower asymptotic keyrates compared to the iid case. In contrast, the results we derive here could be applied to all protocols having the appropriate symmetry properties. While they currently do not yield a full reduction to the iid case, our hope is that it would be possible to develop them further to obtain security proofs that are more generally applicable and yield higher asymptotic keyrates compared to [19,20], as was the case for de Finetti theorems in device-dependent QKD [11].

Bound on the diamond distance between channels
Here we consider channels on boxes of the following form: A channel $\mathcal{E}$ that acts on boxes of the form $P_{A|X}$ and outputs a random variable R as its result is described by a probability distribution $P^{\mathcal{E}}_X$ on $\mathcal{X}$, and a conditional probability distribution $P^{\mathcal{E}}_{R|AX}$ which determines the result R given A and X. When acting on $P_{A|X}$ the channel produces a distribution on R given by
$$P_R(r) = \sum_{x \in \mathcal{X}} \sum_{a \in \mathcal{A}} P^{\mathcal{E}}_X(x) \, P_{A|X}(a|x) \, P^{\mathcal{E}}_{R|AX}(r|a,x).$$
This definition is general enough to capture all protocols in a parallel DIQKD scenario, where all bits of the n-bit input X are entered at the same time into the box. It does not cover all protocols that are possible in a sequential DIQKD scenario [8], where some of the input bits are only given to the box after some output bits have been received. In such a sequential scenario it is in principle possible to construct channels where the input to the box in some round depends on the output of the box in previous rounds. If we consider boxes $P_{AE|XZ}$, where the additional E, Z interface is held by Eve, we can also apply the channel $\mathcal{E}$ only to the A, X interface to obtain a box with input Z and outputs R and E. We will denote this box by $(\mathcal{E} \otimes \mathrm{id})(P_{AE|XZ})_{RE|Z}$.
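The action of such a channel on a box can be sketched in a few lines. The code below assumes the natural reading of the description: sample x from the channel's input distribution, obtain a from the box, then sample r from the channel's conditional distribution, so that the output distribution is the sum over x and a of the product of the three probabilities. The dictionary representation and the example channel are hypothetical.

```python
def apply_channel(box, p_x, p_r_given_ax, results):
    """Distribution of R when a channel (p_x, p_r_given_ax) acts on box[x][a]."""
    dist = {r: 0.0 for r in results}
    for x, px in p_x.items():
        for a, pa in box[x].items():
            for r in results:
                # Weight: choose input x, box returns a, channel maps (a, x) to r.
                dist[r] += px * pa * p_r_given_ax(r, a, x)
    return dist

# Illustrative box whose output copies its input.
copy_box = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 1.0}}
uniform_input = {0: 0.5, 1: 0.5}

def xor_result(r, a, x):
    """Deterministic post-processing: R = A XOR X."""
    return 1.0 if r == (a ^ x) else 0.0

def output_result(r, a, x):
    """Deterministic post-processing: R = A."""
    return 1.0 if r == a else 0.0
```

For the copying box, `apply_channel(copy_box, uniform_input, xor_result, (0, 1))` puts all weight on r = 0, while `output_result` yields a uniform R.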
We define the distance between two channels $\mathcal{E}$ and $\mathcal{F}$ by how well Eve can distinguish the boxes $(\mathcal{E} \otimes \mathrm{id})(P_{AE|XZ})_{RE|Z}$ and $(\mathcal{F} \otimes \mathrm{id})(P_{AE|XZ})_{RE|Z}$ if she is also given access to R. Then she can choose her input Z dependent on R. This leads to the following definition [16,37]:
Definition 14. Let $\mathcal{E}$ and $\mathcal{F}$ be two channels acting on boxes of the form $P_{A|X}$. The distinguishability of $\mathcal{E}$ and $\mathcal{F}$ using the box $P_{AE|XZ}$ is given by
$$\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{AE|XZ})\| = \sum_r \max_z \sum_e \left| (\mathcal{E} \otimes \mathrm{id})(P_{AE|XZ})(r,e|z) - (\mathcal{F} \otimes \mathrm{id})(P_{AE|XZ})(r,e|z) \right|.$$
We define the diamond distance between the channels with respect to some set $\mathcal{P}$ of boxes to be the following:
$$\|\mathcal{E} - \mathcal{F}\|^{\mathcal{P}}_{\diamond} = \sup_{P_{AE|XZ} \in \mathcal{P}} \|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{AE|XZ})\|.$$
Similarly to the usual diamond distance between quantum channels, the above definition of diamond distance with respect to some set $\mathcal{P}$ is a measure of how distinguishable the channels are with respect to a distinguisher that can only use boxes from $\mathcal{P}$. Simple choices of $\mathcal{P}$ include for instance the sets of quantum or non-signaling boxes. For the following main theorem of this section we will however take $\mathcal{P}$ to be the set of quantum boxes $P_{ABE|XYZ}$ such that the marginal $P_{AB|XY}$ has CHSH symmetry, and denote the diamond distance with respect to this $\mathcal{P}$ as $\|\mathcal{E} - \mathcal{F}\|^{\mathrm{quantum,CHSH}}_{\diamond}$. Note that if the action of the channels $\mathcal{E}$, $\mathcal{F}$ can be described by Alice and Bob first performing the depolarizing procedure described above, this restriction causes no change in the diamond distance as compared to choosing $\mathcal{P}$ to be the entire set of quantum boxes $P_{ABE|XYZ}$.
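Theorem 15. Let $\mathcal{E}$ and $\mathcal{F}$ be two channels on n-round boxes of the form $P_{AB|XY}$, and let $\tau_{AB|XY}$ be the de Finetti box from theorem 6. Then
$$\|\mathcal{E} - \mathcal{F}\|^{\mathrm{quantum,CHSH}}_{\diamond} \le (n+1)^2 \sup_{\tau_{ABE|XYZ}} \|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(\tau_{ABE|XYZ})\|, \qquad (59)$$
where the supremum is taken over all non-signaling boxes $\tau_{ABE|XYZ}$ that have the marginal $\tau_{AB|XY}$.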
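Proof of theorem 15. Let $P_{ABE|XYZ}$ be a quantum box whose marginal $P_{AB|XY}$ has CHSH symmetry. Let $R_{AB|XY}$ be such that
$$\tau_{AB|XY} = (n+1)^{-2} P_{AB|XY} + \left(1 - (n+1)^{-2}\right) R_{AB|XY}. \qquad (60)$$
By theorem 6 all entries of $R_{AB|XY}$ are non-negative. Because the non-signaling condition is linear and $\tau_{AB|XY}$ and $P_{AB|XY}$ are non-signaling, $R_{AB|XY}$ is also non-signaling. Now we define an extension $\tau_{ABE|XYZ}$ of $\tau_{AB|XY}$ as follows: The box has one more possible outcome for Eve than the box $P_{ABE|XYZ}$. We will call this additional outcome e*. The box $\tau_{ABE|XYZ}$ then works as follows: With probability $(n+1)^{-2}$ the box acts just like $P_{ABE|XYZ}$, and with probability $1 - (n+1)^{-2}$ it always returns e* to Eve and acts like $R_{AB|XY}$ for Alice and Bob. Formally, this is given by
$$\tau_{ABE|XYZ}(abe|xyz) = \begin{cases} (n+1)^{-2} \, P_{ABE|XYZ}(abe|xyz) & \text{if } e \ne e^*, \\ \left(1 - (n+1)^{-2}\right) R_{AB|XY}(ab|xy) & \text{if } e = e^*. \end{cases}$$
Since $\tau_{ABE|XYZ}$ is a convex combination of two non-signaling boxes it is non-signaling itself. Furthermore, by eq. (60) it is an extension of $\tau_{AB|XY}$. Finally, it holds that
$$\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(\tau_{ABE|XYZ})\| \ge (n+1)^{-2} \, \|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{ABE|XYZ})\|.$$
Hence for all $P_{ABE|XYZ}$
$$\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{ABE|XYZ})\| \le (n+1)^2 \sup_{\tau_{ABE|XYZ}} \|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(\tau_{ABE|XYZ})\|.$$
Taking the supremum over all $P_{ABE|XYZ}$ with CHSH symmetric marginal $P_{AB|XY}$ yields the claim.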
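The key construction in this proof is the residual box R obtained by subtracting a scaled copy of P from τ. The sketch below illustrates it on toy single-party boxes: given an entrywise bound P ≤ c·τ (with c playing the role of (n+1)² in the theorem), it computes R such that τ = P/c + (1 − 1/c)·R and checks that R is a valid box. The dictionary representation, the function names and the toy numbers are assumptions for illustration.

```python
def residual_box(tau, P, c, tol=1e-12):
    """Return R with tau = P/c + (1 - 1/c) R, assuming P <= c * tau entrywise."""
    R = {}
    for x in tau:
        R[x] = {}
        for a in tau[x]:
            value = (tau[x][a] - P[x][a] / c) / (1 - 1 / c)
            if value < -tol:
                # The entrywise bound P <= c * tau is violated at (a, x).
                raise ValueError(f"bound violated at a={a}, x={x}")
            R[x][a] = max(value, 0.0)
    return R

def mixture(P, R, c):
    """Recombine P/c + (1 - 1/c) R to confirm that it reproduces tau."""
    return {x: {a: P[x][a] / c + (1 - 1 / c) * R[x][a] for a in P[x]} for x in P}

# Toy example with a single input: tau is uniform, P is deterministic, c = 2.
tau_toy = {0: {0: 0.5, 1: 0.5}}
P_toy = {0: {0: 1.0, 1: 0.0}}
R_toy = residual_box(tau_toy, P_toy, 2)
```

In the toy example R places all its weight on the outcome that P never produces, which is how the mixture recovers the uniform τ.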

Implications for DIQKD security proofs
Theorem 15 can be seen as a version of the postselection theorem for quantum channels [11]. It allows us to bound the distance between two channels by the distinguishability of the channels when Eve is restricted to extensions of a fixed de Finetti box. This could potentially be a useful tool in security proofs of DIQKD protocols, because a protocol can be defined to be secure if its diamond distance to an ideal protocol is small [28,29]. In particular, Theorem 15 implies that to prove security against coherent quantum attacks, it is sufficient to prove security for the case where the marginal of Alice and Bob is given by τ AB|XY , and Eve possesses a non-signaling extension of this box. This helps to simplify the task of a DIQKD security proof, because it means that it suffices to analyze (extensions of) the specific box τ AB|XY , which has the convenient property of being a convex combination of iid quantum boxes. However, there is a caveat: Although the box τ AB|XY is quantum, the extensions τ ABE|XY Z in the theorem statement here are allowed to be general non-signaling boxes. Furthermore, we will show in the next section that an adversary who has access to arbitrary non-signaling extensions of τ AB|XY can actually be strictly better at distinguishing channels than an adversary who has only access to collective attack boxes. Hence theorem 15 does not immediately yield security against coherent attacks from security against collective attacks -still, since it does allow a "partial" reduction to the latter (namely, allowing us to focus on extensions of a quantum de Finetti box τ AB|XY ), it may still simplify DIQKD security proofs.
We also remark that for our second de Finetti theorem (theorem 11), we currently do not have in mind an explicit application in DIQKD security proofs. Still, we presented it in this work in case it has applications in other contexts; for instance, it might be useful in proving properties that only depend on the box $P_{AB|XY}$ itself, rather than involving its extensions as in DIQKD security proofs. It is also more similar to the original de Finetti theorem for classical random variables, or the early versions for quantum states developed in e.g. [14].

Difficulties in bounding the diamond distance by restriction to collective attacks
Theorem 15 shows that to bound the diamond distance between two channels E and F it is sufficient to restrict the attacker to non-signaling extensions of a fixed de Finetti box. There are many desirable strengthenings of this result: For example, one could restrict the attacker only to quantum extensions of the de Finetti box. One could also further restrict the attacker to use only quantum extensions of iid boxes, instead of the fixed de Finetti box. Finally, one could also restrict the attacker to collective attack boxes (defined below), as would be desirable to conclude security against coherent attacks directly from security against collective attacks. In this section we will see that a theorem like theorem 15 does not hold for this strongest restriction; more precisely, we show that it is impossible for the bound (59) to hold if the supremum is instead restricted to collective attack boxes (which we define later below). It remains open whether such a theorem holds for one of the other strengthenings mentioned above, or whether a reduction to collective attacks in a somewhat different form is possible. (We note that the answers to these questions do not straightforwardly follow from existing no-go theorems on non-signaling privacy amplification [38,39], since in our result τ AB|XY is restricted to a convex combination of quantum distributions rather than non-signaling distributions.) We start by defining the boxes that an attacker is allowed to use in collective attacks. While there is potentially some flexibility in defining this, here we use a definition that essentially corresponds to the boxes considered in the security proofs of [5,10], up to a collective measurement on Eve's side-information: We remark on two aspects of the above definition. Firstly, note that we assume the Hilbert spaces of Alice and Bob can be split into n rounds, but assume no internal structure of Eve's Hilbert space. 
However, since the state $\rho_{AB}$ is iid and thus has an iid purification, the state $\rho_{ABE}$ is related to this iid purification by a local operation on Eve's system. Since we assume nothing about $G_{e,z}$ except that it is a valid POVM, we can absorb this local operation into $G_{e,z}$ and thus describe any collective attack box also with a state $\rho_{ABE}$ that is iid. Collective attack boxes can thus be seen as boxes that are essentially iid, up to Eve performing a local operation on her systems followed by a joint measurement. Secondly, the fact that the definition inherently incorporates this measurement means that Eve's system is forced to be a box rather than a genuine quantum state. However, for the purposes of computing the diamond distance this makes no difference (as long as arbitrary POVMs $G_{e,z}$ are allowed in the definition): the process of a distinguisher producing a guess for the channel can be described as a POVM on its systems, and the optimal such POVM essentially induces a valid choice of $G_{e,z}$ in the above definition.
A crucial observation on collective attack boxes is the following. Consider the box $P^{e,z}_{AB|XY}$, which describes the outcomes of Alice and Bob conditioned on Eve inputting z and obtaining outcome e. It is obtained by measuring a valid state $\rho^{e,z}_{AB}$ with the POVM elements $E_{a,x}$ and $F_{b,y}$ in each round. Because $\sum_a E_{a,x} = \sum_b F_{b,y} = \mathrm{id}$, we see that $P^{e,z}_{AB|XY}$ is non-signaling not only between Alice and Bob, but also between the individual rounds. This means, for example, that $\sum_{a_1} P^{e,z}_{AB|XY}(a_1 a_2 \ldots a_n b|xy)$ does not depend on $x_1$. The following main result of this section exploits this insight:

Theorem 17. For each n > 1 there exist two channels $\mathcal{E}$ and $\mathcal{F}$ acting on n-round boxes of the form $P_{AB|XY}$ such that $\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{ABE|XYZ})\| = 0$ for all collective attack boxes $P_{ABE|XYZ}$, but $\|\mathcal{E} - \mathcal{F}\|^{\mathrm{quantum,CHSH}}_{\diamond} \neq 0$.
Theorem 17 shows that a statement like theorem 15 cannot hold if we maximize only over collective attack boxes instead of all non-signaling extensions of the fixed de Finetti box (not even with, say, an exponential prefactor instead of $(n + 1)^2$). This shows that an attacker who has access to any non-signaling extension of the fixed de Finetti box is stronger than an attacker who only has access to collective attack boxes.
In the proof of theorem 17 we will use that all collective attack boxes are non-signaling between the rounds of Alice and Bob, and that the non-signaling condition is linear. The following lemma will be crucial. It states that for each linear subspace of the probability distributions on some set, there are two channels (which act on probability distributions, not yet on boxes) that cannot be distinguished by any probability distribution in the linear subspace:

Lemma 18. Let $\mathcal{X}$ be some finite set. We treat the unnormalized probability distributions on $\mathcal{X}$ as an orthant of an $|\mathcal{X}|$-dimensional real vector space. Let $\mathcal{P}$ be some linear subspace of this vector space, and let $Q = (Q(x))_{x \in \mathcal{X}} \notin \mathcal{P}$. Then there are two conditional probability distributions $P^{\mathcal{E}}_{R|X}$ and $P^{\mathcal{F}}_{R|X}$ such that the following holds: For a probability distribution P on $\mathcal{X}$, denote by $\mathcal{E}(P)$ and $\mathcal{F}(P)$ the distributions on $\mathcal{R}$ which are obtained by first sampling x using P and then sampling r using $P^{\mathcal{E}}_{R|X}$ and $P^{\mathcal{F}}_{R|X}$, respectively, i.e. $\mathcal{E}(P)(r) = \sum_x P(x) P^{\mathcal{E}}(r|x)$. Then $\mathcal{E}(P) = \mathcal{F}(P)$ for all $P \in \mathcal{P}$, and $\mathcal{E}(Q) \neq \mathcal{F}(Q)$.
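Proof. Since $\mathcal{P}$ is a linear subspace and $Q \notin \mathcal{P}$, there exists a vector $\Delta = (\Delta_x)_{x \in \mathcal{X}}$ with $|\Delta_x| \le 1$ for all x, $\Delta \cdot P = 0$ for all $P \in \mathcal{P}$, and $\Delta \cdot Q \neq 0$. Take $\mathcal{R} = \{0, 1\}$ and choose the two conditional distributions in terms of $\Delta$ such that, for any probability distribution P on $\mathcal{X}$, the difference between $\mathcal{E}(P)$ and $\mathcal{F}(P)$ is proportional to $\Delta \cdot P$. Hence for all $P \in \mathcal{P}$ we have $\|\mathcal{E}(P) - \mathcal{F}(P)\|_1 = 0$ (74), and $\|\mathcal{E}(Q) - \mathcal{F}(Q)\|_1 \neq 0$.

We will now construct two channels $\mathcal{E}$ and $\mathcal{F}$ that act on boxes $P_{A|X}$ (i.e. we consider only Alice) and that cannot be distinguished by any collective attack box, but that can be distinguished by a certain quantum box that is not a collective attack box. We will then see how to modify this construction to include Bob and to ensure that the box used to distinguish both channels has CHSH symmetry on the marginal of Alice and Bob.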
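The construction in the proof of lemma 18 can be illustrated with a small numerical sketch. The explicit form of the two conditional distributions used below, $P^{\mathcal{E}}(0|x) = (1 + \Delta_x)/2$ and $P^{\mathcal{F}}(0|x) = (1 - \Delta_x)/2$, is an assumption made for illustration and need not coincide with the choice made in the proof; the function names are hypothetical as well. The vector $\Delta$ is obtained by projecting Q onto the orthogonal complement of the subspace.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def separating_vector(basis, q):
    """Return Delta with Delta.b == 0 for all basis vectors b, Delta.q != 0, max |Delta_x| == 1."""
    ortho = []
    for b in basis:
        # Gram-Schmidt step: remove the components along the vectors found so far
        r = [Fraction(v) for v in b]
        for o in ortho:
            c = dot(r, o) / dot(o, o)
            r = [ri - c * oi for ri, oi in zip(r, o)]
        if any(r):
            ortho.append(r)
    # Delta is the part of q orthogonal to the subspace
    delta = [Fraction(v) for v in q]
    for o in ortho:
        c = dot(delta, o) / dot(o, o)
        delta = [di - c * oi for di, oi in zip(delta, o)]
    if not any(delta):
        raise ValueError("q lies in the subspace")
    scale = max(abs(d) for d in delta)
    return [d / scale for d in delta]


def channels(delta):
    # Assumed (illustrative) choice of the two conditional distributions on R = {0, 1}
    p_e = [((1 + d) / 2, (1 - d) / 2) for d in delta]
    p_f = [((1 - d) / 2, (1 + d) / 2) for d in delta]
    return p_e, p_f


def apply_channel(channel, p):
    # E(P)(r) = sum_x P(x) P_E(r|x)
    return [sum(px * row[r] for px, row in zip(p, channel)) for r in (0, 1)]


# Hypothetical example on a three-element set: the subspace requires P(1) == P(2)
basis = [(1, 0, 0), (0, 1, 1)]
q = (0, 1, 0)
delta = separating_vector(basis, q)
p_e, p_f = channels(delta)
```

With this choice the outputs of the two channels differ by a multiple of $\Delta \cdot P$, so they agree on every distribution in the subspace and differ on Q.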

Lemma 19. For each n > 1 there are two channels $\mathcal{E}$ and $\mathcal{F}$ that act on n-round boxes $P_{AE|XZ}$ such that $\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{AE|XZ})\| = 0$ for all collective attack boxes $P_{AE|XZ}$, but $\|\mathcal{E} - \mathcal{F}\|^{\mathrm{quantum,CHSH}}_{\diamond} \neq 0$.

Proof. We first construct the channels $\mathcal{E}$ and $\mathcal{F}$, then show that Eve cannot distinguish them if she is restricted to collective attack boxes, and finally show that there is a quantum box (naturally not a collective attack box) that can be used to distinguish both channels.

We construct the channels $\mathcal{E}$ and $\mathcal{F}$ as follows, depending on a parameter m > n/2. For both channels, Alice does the following steps:
1. She enters uniformly random inputs $x_1, \ldots, x_n$ into the inputs of her box.

She calculates
Now consider the linear subspace $\mathcal{P}$ of probability distributions on the pairs (w, t) given by the linear constraints (78), imposed for all w, t, t′, and take a $Q \notin \mathcal{P}$ (a specific Q will be constructed below). Alice constructs the channels $\mathcal{E}$ and $\mathcal{F}$ by applying the conditional probability distributions $P^{\mathcal{E}}_{R|WT}$ and $P^{\mathcal{F}}_{R|WT}$ from lemma 18 to her result (w, t) from step 3.

Now we prove that Eve cannot distinguish $\mathcal{E}$ and $\mathcal{F}$ if she uses a collective attack box $P_{AE|XZ}$. For this, we use that for all e and z the box $P^{e,z}_{A|X}$ is non-signaling between the rounds of Alice. In particular, the outputs of the rounds m + 1, ..., n cannot depend on the inputs in the rounds 1, ..., m, so W and T are independent when generated using $P^{e,z}_{A|X}$. The box $P^{e,z}_{A|X}$ therefore satisfies eq. (78), and hence $\|(\mathcal{E} - \mathcal{F})(P^{e,z}_{A|X})\| = 0$ by lemma 18. Since this holds for all e and z, we also have $\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(P_{AE|XZ})\| = 0$.
Finally we construct a box $Q_{A|X}$ that allows Eve to distinguish the channels $\mathcal{E}$ and $\mathcal{F}$ with a nonzero advantage over guessing. Notice that here Eve does not keep any system (neither quantum nor classical) for herself and can distinguish the channels only from their result R. $Q_{A|X}$ can then be an arbitrary conditional probability distribution. Take $Q_{A|X}$ such that the result is surely a = (1, 1, ..., 1) if $\sum_i x_i > n/2$ and surely a = (0, 0, ..., 0) otherwise. Then, if t > n/2, W is no longer independent of T, so eq. (78) does not hold.
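The difference between the two kinds of boxes in this proof can be checked by brute force for small n. The sketch below is only an illustration under stated assumptions: the statistics t (taken here as the sum of the inputs in rounds 1, ..., m) and w (taken here as the outputs of rounds m + 1, ..., n) are hypothetical stand-ins for the quantities Alice computes, and all function names are invented for this example. It compares the box $Q_{A|X}$ described above with a box that is non-signaling between rounds.

```python
from fractions import Fraction
from itertools import product


def conditional_w_given_t(box, n, m):
    """Return {t: {w: P(w|t)}} for uniformly random inputs x in {0,1}^n.

    box(x) must return a dict {output tuple a: probability}.
    """
    joint = {}
    for x in product((0, 1), repeat=n):
        t = sum(x[:m])          # assumption: t depends only on the inputs of rounds 1..m
        for a, p in box(x).items():
            w = a[m:]           # assumption: w depends only on the outputs of rounds m+1..n
            joint[(t, w)] = joint.get((t, w), 0) + Fraction(p) / 2**n
    by_t = {}
    for (t, w), p in joint.items():
        by_t.setdefault(t, {})[w] = p
    # normalise each row to obtain the conditional distribution P(w|t)
    return {t: {w: p / sum(row.values()) for w, p in row.items()}
            for t, row in by_t.items()}


def majority_box(x):
    # the box Q_{A|X} from the text: all ones if sum(x) > n/2, all zeros otherwise
    n = len(x)
    bit = 1 if sum(x) > n / 2 else 0
    return {(bit,) * n: 1}


def copy_box(x):
    # a box that is non-signaling between rounds: each output copies its own round's input
    return {tuple(x): 1}


def w_independent_of_t(cond):
    rows = list(cond.values())
    return all(row == rows[0] for row in rows)


# Smallest example with m > n/2
n, m = 3, 2
independent_for_copy = w_independent_of_t(conditional_w_given_t(copy_box, n, m))
independent_for_majority = w_independent_of_t(conditional_w_given_t(majority_box, n, m))
```

For the round-wise box the distribution of w is the same for every t, whereas for the majority box it changes with t, which is the kind of dependence the channels from lemma 18 can detect.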
Now we can adapt the statement of lemma 19 to prove theorem 17.
Proof of theorem 17. First we generalize the construction in lemma 19 to boxes for which Bob also has an input, i.e. boxes of the form $P_{AB|XY}$. For this, we take the channels $\mathcal{E}$ and $\mathcal{F}$ such that they act as in lemma 19 on Alice's inputs and outputs, give an arbitrary input Y to Bob, and ignore Bob's output B. Clearly, both channels still cannot be distinguished with collective attack boxes, but can be distinguished by a box $Q_{AB|XY}$ which acts like the box $Q_{A|X}$ from lemma 19 on A and X and arbitrarily on B and Y. However, $Q_{AB|XY}$ does not have CHSH symmetry. Using the depolarizing procedure in [32] we can construct a box $\tilde{Q}_{ABE|XY}$ such that the marginal $\tilde{Q}_{AB|XY}$ has CHSH symmetry and there is an output $e^*$ for Eve (corresponding to the case in which the depolarizing protocol does nothing) such that $\tilde{Q}^{e^*}_{AB|XY} = Q_{AB|XY}$. To distinguish $\mathcal{E}$ and $\mathcal{F}$ using $\tilde{Q}_{ABE|XY}$, Eve first checks her output e. If $e = e^*$ she distinguishes $\mathcal{E}$ and $\mathcal{F}$ as in lemma 19; otherwise she just guesses randomly. Because the probability that $e = e^*$ is nonzero, we have $\|(\mathcal{E} - \mathcal{F}) \otimes \mathrm{id}(\tilde{Q}_{ABE|XY})\| > 0$.
Several remarks are in order. Firstly, the diamond norm $\|\mathcal{E} - \mathcal{F}\|^{\mathrm{quantum,CHSH}}_{\diamond}$ between the channels $\mathcal{E}$ and $\mathcal{F}$ from theorem 17 is exponentially small in n, because Eve only tries to distinguish $\mathcal{E}$ and $\mathcal{F}$ in the case when the depolarizing protocol does nothing. This means that while a theorem in the same form as theorem 15 cannot hold for collective attack boxes, it is entirely possible that there is, for example, a theorem that yields a bound of the same form up to an additional additive term g(n), where g(n) → 0 as n → ∞. Such a result could be sufficient to allow security proof reductions to collective attacks.

Secondly, the results of this section relied only on the fact that collective attack boxes are non-signaling between the rounds. This property arose entirely from the fact that Alice's and Bob's measurements act on different Hilbert spaces in different rounds, and hence also holds more generally, i.e. even if the measurements in each round are different, or the states are entangled across rounds. This seems to suggest that in DIQKD, imposing an assumption that different rounds have different Hilbert spaces may already be a fairly strong restriction by itself, even if we allow many other non-iid behaviours across states in different rounds, such as classical correlations or even entanglement. (In fact, security proof reductions to the iid case under this assumption were indeed previously studied in [40,41], though the latter was restricted to one-way protocols.) Whether this assumption seems reasonable may depend on the protocol: for instance, it seems unsatisfactory if each honest party has to use a single device for all inputs/outputs (as in [8], which used the EAT to avoid this assumption for one-way protocols), but if each honest party has access to n devices that are "well isolated" from each other, it might be more plausible.
Thirdly, we remark that all sequential DIQKD protocols naturally fulfill a certain form of non-signaling constraints between the individual rounds of Alice and Bob: Alice and Bob's inputs in one round cannot influence the outputs in preceding rounds. Theorem 17 does not rule out that a result like theorem 15 exists for channels with such a sequential structure. However, preserving such a sequential structure for the purposes of a security proof appears rather incompatible with permutation symmetry, so exploiting such sequential structures might require different techniques from those used in this paper.

Conclusion
In this paper we proved two de Finetti theorems for quantum conditional probability distributions with CHSH symmetry. The advantage of these theorems over similar de Finetti theorems [16,17] is that the de Finetti boxes are in the quantum set. The first de Finetti theorem states that the entries of a CHSH symmetric box are upper bounded, up to a polynomial factor in n, by the entries of a fixed de Finetti box. This theorem is actually not restricted to boxes with CHSH symmetry but can be applied to arbitrary symmetries if a corresponding threshold theorem is available. The second de Finetti theorem states that the marginal of the first k rounds of an n round CHSH symmetric box is close to (and not just upper bounded by) a de Finetti box.
We further showed that the first de Finetti theorem can be used to obtain a bound on the diamond distance between two channels acting on boxes. Specifically, an attacker who tries to distinguish two channels $\mathcal{E}$ and $\mathcal{F}$ can be restricted to non-signaling extensions of a fixed quantum de Finetti box without decreasing the distinguishability between both channels by more than a polynomial factor. Because the security of DIQKD protocols is defined in terms of the distance between the channel given by the protocol and an ideal channel, this statement might be useful in security proofs. However, our theorem does not immediately allow one to conclude security against coherent attacks from security against collective attacks: a straightforward strengthening of it, which would bound the diamond distance between two channels by the distinguishability using collective attack boxes, does not hold. Based on some insights in our proof approach, we speculate that in DIQKD, assuming that boxes in different rounds act on different Hilbert spaces may already be a fairly strong constraint, even if we allow correlations or entanglement between states in different rounds.

Funding
This project was funded by the Swiss National Science Foundation via the National Center for Competence in Research for Quantum Science and Technology (QSIT), the Air Force Office of Scientific Research (AFOSR) via grant FA9550-19-1-0202, and the QuantERA project eDICT.

A Proof of lemma 8
Proof of lemma 8. The second inequality follows directly because f is non-negative and attains its maximum at $x^*$. The idea for the first inequality is to replace f by a piecewise linear function that equals 0 at a and b, and equals $f(x^*)$ at $x^*$. By concavity this piecewise linear function is nowhere larger than f itself. Writing this out explicitly: by concavity and non-negativity we have, for all $x \in [a, x^*]$, $f(x) \ge \frac{x - a}{x^* - a} f(x^*)$, and analogously $f(x) \ge \frac{b - x}{b - x^*} f(x^*)$ for all $x \in [x^*, b]$. Inserting these lower bounds yields the first inequality.
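A quick numerical sanity check of this argument is sketched below. It assumes, since the statement of lemma 8 is not repeated in this appendix, that the two inequalities have the form $\frac{b-a}{n+1} f(x^*)^n \le \int_a^b f(x)^n \, dx \le (b-a) f(x^*)^n$ for a concave, non-negative f; this form is what integrating the piecewise linear lower bound gives. The test function and all names are hypothetical.

```python
def integral_of_power(f, a, b, n, steps=20000):
    # midpoint rule for the integral of f(x)**n over [a, b]
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) ** n for i in range(steps)) * h


def example_f(x):
    # concave and non-negative on [0, 1], with maximum 1/4 at x* = 1/2
    return x * (1 - x)


def bounds_hold(f, a, b, f_max, max_n):
    """Check (b-a)/(n+1) * f_max**n <= integral of f**n <= (b-a) * f_max**n for n = 1..max_n."""
    for n in range(1, max_n + 1):
        integral = integral_of_power(f, a, b, n)
        lower = (b - a) / (n + 1) * f_max ** n
        upper = (b - a) * f_max ** n
        if not lower <= integral <= upper:
            return False
    return True


all_bounds_hold = bounds_hold(example_f, 0.0, 1.0, 0.25, 10)
```

The factor $1/(n+1)$ in the assumed lower bound comes from integrating the n-th power of a linear function, which is why the loss is only polynomial in n.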

B A de Finetti theorem for general symmetries
In this section we will show that for arbitrary games a threshold theorem such as theorem 7 can always be used to prove a de Finetti theorem similar to theorem 6. Conversely, we will also see that a de Finetti theorem implies a threshold theorem. This means that proving a de Finetti theorem for some symmetry is just as hard as proving a threshold theorem for the game associated with that symmetry.

B.1 Statement of the main theorem
Throughout this section we will only consider boxes with a single interface and with a single round input set $\hat{\mathcal{X}}$, a single round output set $\hat{\mathcal{A}}$, and corresponding n-round input and output sets $\mathcal{X} = \hat{\mathcal{X}}^n$ and $\mathcal{A} = \hat{\mathcal{A}}^n$. CHSH symmetric boxes can be described like this by treating the two parties Alice and Bob as one, so the input and output sets are $\hat{\mathcal{A}} = \hat{\mathcal{X}} = \{0, 1\}^2$. We consider the following generalization of CHSH symmetry:

Definition 20.
1. Let $d \in \mathbb{N}$ and let $w : \hat{\mathcal{A}} \times \hat{\mathcal{X}} \to \{1, \ldots, d\}$ be some function. We will call w the predicate function of the symmetry. For $a \in \mathcal{A}$ and $x \in \mathcal{X}$ we define $\mathrm{freq}_w(a, x) = (k_1/n, \ldots, k_d/n)$, where $k_j$ is the number of rounds i with $w(a_i, x_i) = j$.
2. We say an n-round box $P_{A|X}$ has w-symmetry if $P(a|x) = P(a'|x')$ whenever $\mathrm{freq}_w(a, x) = \mathrm{freq}_w(a', x')$, for all $a, a' \in \mathcal{A}$ and $x, x' \in \mathcal{X}$.

CHSH symmetry is an example of w-symmetry with $w((a, b), (x, y)) = 1$ if $a \oplus b = xy$ and $w((a, b), (x, y)) = 2$ if $a \oplus b \neq xy$. The definition of w-symmetry is an extension of the symmetries considered in [16] for permutation invariant boxes, where only certain predicate functions w were considered, namely those where for each pair x, x′ either the images of $w(\cdot, x)$ and $w(\cdot, x')$ are disjoint, or $w(\cdot, x)$ and $w(\cdot, x')$ are identical up to a permutation of the elements of $\hat{\mathcal{A}}$.
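The notions in definition 20 can be made concrete with a short sketch. It assumes that $\mathrm{freq}_w(a, x)$ is the vector of relative frequencies with which the predicate takes each of its d values; the function names and the example box (an iid box that wins each CHSH round with a hypothetical probability 3/4) are illustrative and not taken from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def chsh_predicate(ab, xy):
    # w((a, b), (x, y)) = 1 if a XOR b == x AND y (a win), 2 otherwise
    (a, b), (x, y) = ab, xy
    return 1 if (a ^ b) == (x & y) else 2


def freq_w(w, d, a, x):
    """Relative frequency of each predicate value 1..d over the n rounds of (a, x)."""
    counts = Counter(w(ai, xi) for ai, xi in zip(a, x))
    n = len(a)
    return tuple(Fraction(counts[j], n) for j in range(1, d + 1))


def has_w_symmetry(box, w, d, outputs, inputs, n):
    """Brute-force check that box(a, x) depends on (a, x) only through freq_w(a, x)."""
    seen = {}
    for a in product(outputs, repeat=n):
        for x in product(inputs, repeat=n):
            value = box(a, x)
            if seen.setdefault(freq_w(w, d, a, x), value) != value:
                return False
    return True


def iid_chsh_box(a, x, p_win=Fraction(3, 4)):
    # hypothetical iid box: each round wins with probability p_win, and the two
    # winning (or the two losing) output pairs are equally likely
    prob = Fraction(1)
    for ai, xi in zip(a, x):
        prob *= (p_win if chsh_predicate(ai, xi) == 1 else 1 - p_win) / 2
    return prob


def all_zero_box(a, x):
    # deterministic box that always outputs (0, 0); it is not w-symmetric
    return 1 if all(ai == (0, 0) for ai in a) else 0


pairs = list(product((0, 1), repeat=2))   # single round set {0,1}^2
example_freq = freq_w(chsh_predicate, 2,
                      [(0, 0), (0, 1), (1, 1)],
                      [(0, 0), (1, 1), (1, 1)])
```

In the example the first two rounds are wins and the third is a loss, so `example_freq` records two wins out of three rounds.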
Instead of the set of quantum single round CHSH boxes, we will in this section consider a general convex set $\mathcal{Q}$ of single round boxes $Q_{\hat{A}|\hat{X}}$. If we view $\mathcal{Q}$ as a convex subset of $\mathbb{R}^{|\hat{\mathcal{A}}||\hat{\mathcal{X}}|}$ we can consider its affine hull: the smallest affine superset of $\mathcal{Q}$. Throughout this section we will denote the dimension of the affine hull by d′. For the CHSH symmetric case, $\mathcal{Q}$ is the set of CHSH symmetric quantum boxes, and d′ = 1.
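In the CHSH symmetric case it was crucial that the expected number of wins in n rounds of the CHSH game is bounded from below and from above for every quantum box. For general symmetries, this role is played by the set $F_\mu$ of expected frequencies.

Theorem 22. There exists a de Finetti state $\tau_{A|X} \in \mathrm{conv}\{Q^{\otimes n}_{\hat{A}|\hat{X}} \mid Q_{\hat{A}|\hat{X}} \in \mathcal{Q}\}$, independent of w, such that the following hold:
1. Let $P_{A|X}$ be an n-round box, let µ be a probability measure on $\hat{\mathcal{X}}$, and take any $f \in \Delta_d$. Suppose there exists some C > 0 such that the threshold condition (88) holds. Then the probability of obtaining the frequencies f using $P_{A|X}$ is bounded, up to a polynomial factor in n, by the probability of obtaining f using $\tau_{A|X}$.
2. Let $P_{A|X}$ be an n-round box with w-symmetry and let each box $Q_{\hat{A}|\hat{X}} \in \mathcal{Q}$ have w-symmetry. Let C be such that for all $f \in \Delta_d$ there is a µ > 0 such that eq. (88) holds. Then the de Finetti bound (90) holds: the entries $P(a|x)$ are bounded, up to a polynomial factor in n, by the entries $\tau(a|x)$.
3. Let $P_{A|X}$ be an n-round box for which eq. (90) holds. Then $P_{A|X}$ satisfies a threshold theorem of the form (88), albeit with a larger prefactor.

Qualitatively, we can interpret the equations and statements in theorem 22 as follows:
• The condition (88) is a perfect threshold theorem, written in a form similar to eq. (25) (which was for the CHSH case). It states that the probability to obtain a frequency distribution f outside of the set $F_\mu$ of expected frequencies decays exponentially with n and with the distance from f to $F_\mu$, as measured by the relative entropy.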
• Part 1 of theorem 22 states that if we have such a threshold theorem, then the probability of obtaining the frequencies f using the box P A|X can be bounded by the probability of obtaining f using the de Finetti box τ A|X , up to a polynomial factor. We have chosen to state this part of the theorem separately because it does not require P A|X to be w-symmetric.
• Part 2 asserts that if we have the further condition that P A|X and all boxes in Q are w-symmetric, then we can get a de Finetti theorem analogous to theorem 6 in the CHSH case. We will derive part 2 from part 1 by expressing the entries of P A|X and τ A|X in terms of their respective probabilities of obtaining freq w (a, x) = f , which is possible by w-symmetry.
• Finally part 3 shows the other direction of the equivalence between a threshold theorem and a de Finetti theorem: A box P A|X satisfying the de-Finetti theorem statement (90) also satisfies a threshold theorem, albeit with a larger prefactor than in eq. (88).
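To give a feeling for the kind of decay that a threshold condition expresses, the following sketch checks the standard bound for a single iid coin: the probability of observing exactly k successes in n trials with success probability p is at most $2^{-n D(k/n \| p)}$, where D is the binary relative entropy. This covers only the iid case with d = 2 and is not the condition (88) itself, whose constant C and set $F_\mu$ are not modelled here; the function names and the values n = 50 and p = 0.85 are hypothetical.

```python
from math import comb, log2


def relative_entropy(f, p):
    # binary relative entropy D(f || p) in bits, with the convention 0 log 0 = 0
    total = 0.0
    for fi, pi in ((f, p), (1 - f, 1 - p)):
        if fi > 0:
            total += fi * log2(fi / pi)
    return total


def prob_frequency(n, k, p):
    # probability of exactly k successes in n iid trials with success probability p
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def satisfies_iid_threshold_bound(n, p, tolerance=1e-9):
    """Check Pr[k successes] <= 2^(-n D(k/n || p)) for all k (up to rounding error)."""
    return all(
        prob_frequency(n, k, p) <= 2 ** (-n * relative_entropy(k / n, p)) * (1 + tolerance)
        for k in range(n + 1)
    )


bound_holds = satisfies_iid_threshold_bound(50, 0.85)
```

Frequencies far from p have a large relative entropy and are therefore exponentially unlikely in n, which is the behaviour the first bullet above describes.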
To prove theorem 22, we first show some preparatory lemmas in sections B.2 and B.3, then combine them in section B.4. Some insight into the proof structure can be gained by writing a slightly different proof of theorem 6 (the CHSH case), in order to draw analogies to parts 1 and 2 of theorem 22 separately. This version of the proof proceeds as follows: first prove eq. (21) as before, giving a lower bound on $\tau(a|x)$. After that point, however, we reorder the proof. Namely, observe that combining (25), (26) and (28) gives the bound (92). Putting together (21) and (92) gives $p_k \le (n + 1)^2 \, 2^n \binom{n}{k} \, \tau(ab|xy)$. (93)

The symmetry condition has not been used up to this point. We now use it to relate $p_k$ to $P(ab|xy)$ via (11), which yields the desired inequality $P(ab|xy) \le (n + 1)^2 \tau(ab|xy)$.

The proof in the subsequent sections basically follows the same structure as the above version. First, eq. (21) is generalized to lemma 26 in section B.2. Next, eq. (92) is replaced by lemma 28 in section B.3, bounding the probability of obtaining some outcome frequency in terms of a supremum over iid boxes (the $2^n \binom{n}{k}$ factor in (92) counts different ways to achieve the specified frequency). These lemmas are combined to obtain part 1 of theorem 22, which is the generalization of (93). Finally, the symmetries are invoked to relate the box distribution to the probabilities of obtaining some outcome frequencies, analogous to (11), to yield part 2 of theorem 22.

B.2 Construction and properties of the de Finetti box
In this section we will construct the de Finetti box $\tau_{A|X}$ and show that $\tau(a|x)$ is at most polynomially smaller than $Q^{\otimes n}(a|x)$, for all $Q \in \mathcal{Q}$. Before that we need to prove some preparatory lemmas. The following lemma and proof are adopted from [42].

Lemma 23 (Matrix Determinant Lemma). Let $A \in \mathbb{R}^{n \times n}$ be an invertible matrix and $v \in \mathbb{R}^n$. Then $\det(A - vv^T) = (1 - v^T A^{-1} v) \det(A)$.
Proof. We calculate det(A − vv T ) as the determinant of a block matrix: is concave.
Proof. First assume that all $\alpha_i > 0$ and $\sum_i \alpha_i < 1$.
To show that f is concave we compute the Hessian matrix:
To show that f is concave it is sufficient to show that A is positive definite. Let $A^{(k)}$ be the upper left k × k block of A. By Sylvester's criterion, A is positive definite if $\det(A^{(k)}) > 0$ for all $k \in \{1, \ldots, n\}$.
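The two tools used in this proof, Sylvester's criterion and the matrix determinant lemma in its standard form $\det(A - vv^T) = (1 - v^T A^{-1} v)\det(A)$, can be checked on a small example. The sketch below uses a hypothetical diagonal matrix with a rank-one correction rather than the specific matrices appearing in the proof; all names are illustrative.

```python
from fractions import Fraction


def det(m):
    # Laplace expansion along the first row; adequate for small matrices
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def leading_minors(m):
    # determinants of the upper left k x k blocks, k = 1..n
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]


def is_positive_definite(m):
    # Sylvester's criterion for a symmetric matrix: all leading principal minors positive
    return all(minor > 0 for minor in leading_minors(m))


def diag_minus_rank_one(diag, v):
    # the matrix diag(diag) - v v^T
    n = len(diag)
    return [[(diag[i] if i == j else 0) - v[i] * v[j] for j in range(n)] for i in range(n)]


def det_via_lemma(diag, v):
    # matrix determinant lemma for A = diag(diag): det(A) * (1 - sum_i v_i^2 / diag_i)
    det_a = 1
    for d in diag:
        det_a *= d
    return det_a * (1 - sum(Fraction(vi * vi, di) for vi, di in zip(v, diag)))


m_pd = diag_minus_rank_one([4, 4, 4], [1, 1, 1])       # v^T A^{-1} v = 3/4 < 1
m_not_pd = diag_minus_rank_one([2, 3, 4], [1, 1, 1])   # v^T A^{-1} v = 13/12 > 1
```

In the first example the rank-one correction is small enough for the matrix to stay positive definite; in the second the determinant becomes negative, so Sylvester's criterion fails.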
. Then By lemma 23 we calculate where the last inequality follows from det(B (k) ) > 0 and k Now assume the general setting where also α i = 0 and i α i = 1 is allowed. For each i choose a sequence (α By continuity The following lemma is the generalization of lemma 8.

Lemma 25.
Let $C \subseteq \mathbb{R}^d$ be a bounded convex set, and denote by vol(C) the volume of C (under the Lebesgue measure). Then for any $n \in \mathbb{N}$ and any concave function f:

The proof idea is similar to that of lemma 8 (assuming for simplicity that f attains its supremum at some point $x^* \in C$): We will lower bound f by a function that is zero on the boundary of C, takes the value $f(x^*)$ at $x^*$, and is determined on the rest of C by "interpolating linearly" between the values at $x^*$ and the boundary of C. (Geometrically, the graph of this new function is basically the surface of a convex cone.)

Proof. Take any ϵ > 0. There exists some $x^* \in C$ such that $f(x^*) \ge \sup_{x \in C} f(x) - \epsilon$. We evaluate the integrals using spherical coordinates centered on this point: Let $S^{d-1} \subseteq \mathbb{R}^d$ be the (d − 1)-sphere, and let µ be the surface measure on $S^{d-1}$ with respect to the

Proof. We view $\mathcal{Q}$ as a bounded convex subset of $\mathbb{R}^{|\hat{\mathcal{A}}||\hat{\mathcal{X}}|}$. Because the affine hull of $\mathcal{Q}$ has dimension d′ there exists a bounded convex set $C \subseteq \mathbb{R}^{d'}$ and a bijective linear map $C \ni \phi \mapsto Q_\phi \in \mathcal{Q}$. Choose

Now fix $a \in \mathcal{A}$ and $x \in \mathcal{X}$, and for $a' \in \hat{\mathcal{A}}$ and $x' \in \hat{\mathcal{X}}$ let $f_{a'x'} = |\{i \mid a_i = a' \text{ and } x_i = x'\}|/n$ be the frequency of the pair (a′, x′) in (a, x). Then $\phi \mapsto Q^{\otimes n}_\phi(a|x)$ is a concave map by lemma 24 and by the linearity of $\phi \mapsto Q_\phi$. Hence by lemma 25

Now we prove the lemma for general d ≥ 2. For this, observe that $\binom{n}{k_1, \ldots, k_d} = \binom{n}{k_1} \binom{n - k_1}{k_2} \cdots \binom{n - k_1 - \ldots - k_{d-1}}{k_d}$. Applying eq. (117) to each binomial coefficient completes the proof by a telescoping product argument.
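The factorisation of a multinomial coefficient into binomial coefficients, which reduces the case of general d to the binary case, is a standard identity and can be verified directly. The sketch below uses hypothetical function names and only the Python standard library.

```python
from math import comb, factorial, prod


def multinomial(ks):
    # n! / (k_1! ... k_d!) with n = k_1 + ... + k_d
    return factorial(sum(ks)) // prod(factorial(k) for k in ks)


def multinomial_as_binomials(ks):
    # choose k_1 positions out of n, then k_2 out of the remaining n - k_1, and so on
    remaining = sum(ks)
    result = 1
    for k in ks:
        result *= comb(remaining, k)
        remaining -= k
    return result


example = (2, 3, 1)
direct = multinomial(example)
factored = multinomial_as_binomials(example)
```

Each binomial factor can then be bounded separately, and the remaining counts n, n − k_1, ... in successive factors are what make the product telescope.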

Lemma 28.
Let $P_{A|X}$ be an n-round box. If

B.4 Proof of theorem 22
Now we are ready to prove the general theorem 22. For this, we use lemma 26 to show that the entries of the de Finetti box are smaller than the corresponding entries of any iid box by at most a polynomial factor. Then we use lemma 28 to show that the threshold theorem implies that the probability of a frequency f under $P_{A|X}$ can be bounded, up to a polynomial factor, by the probability of f under some iid box (but possibly a different iid box for each f). Combining both lemmas yields the proof of part 1. Part 2 will follow from part 1 by using the definition of w-symmetry, and part 3 will follow directly from lemma 27.
Proof of theorem 22.
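1. This part follows directly by combining lemma 28 and lemma 26. Suppose
2. For this part, we have by hypothesis that $P_{A|X}$ and every box in $\mathcal{Q}$ have w-symmetry. Then $\tau_{A|X}$ also has w-symmetry. Take any $a \in \mathcal{A}$, $x \in \mathcal{X}$, and define $f = \mathrm{freq}_w(a, x)$. Then we have
$$\Pr_{P_{A|X}, \mu^{\otimes n}}[\mathrm{freq}_w(A, X) = f] = P(a|x) \sum_{\substack{a' \in \mathcal{A},\, x' \in \mathcal{X} \\ \mathrm{freq}_w(a', x') = f}} \mu^{\otimes n}(x')$$
and similarly
$$\Pr_{\tau_{A|X}, \mu^{\otimes n}}[\mathrm{freq}_w(A, X) = f] = \tau(a|x) \sum_{\substack{a' \in \mathcal{A},\, x' \in \mathcal{X} \\ \mathrm{freq}_w(a', x') = f}} \mu^{\otimes n}(x').$$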