One-Shot Randomized and Nonrandomized Partial Decoupling

We introduce a task that we call partial decoupling, in which a bipartite quantum state is transformed by a unitary operation on one of the two subsystems and then is subject to the action of a quantum channel. We assume that the subsystem is decomposed into a direct-sum-product form, which often appears in the context of quantum information theory. The unitary is chosen at random from the set of unitaries having a simple form under the decomposition. The goal of the task is to make the final state, for typical choices of the unitary, close to the averaged final state over the unitaries. We consider a one-shot scenario, and derive upper and lower bounds on the average distance between the two states. The bounds are represented simply in terms of smooth conditional entropies of quantum states involving the initial state, the channel and the decomposition. Thereby we provide generalizations of the one-shot decoupling theorem. The obtained result would lead to further development of the decoupling approaches in quantum information theory and fundamental physics.


I. INTRODUCTION
Decoupling refers to the fact that we may destroy correlation between two quantum systems by applying an operation on one of the two subsystems. It has played significant roles in the development of quantum Shannon theory for a decade, particularly in proving the quantum capacity theorem [1], unifying various quantum coding theorems [2], analyzing a multipartite quantum communication task [3,4] and in quantifying correlations in quantum states [5,6]. It has also been applied to various fields of physics, such as the black hole information paradox [7], quantum many-body systems [8] and quantum thermodynamics [9,10]. Dupuis et al. [11] provided one of the most general formulations of decoupling, which is often referred to as the decoupling theorem. The decoupling approach simplifies many problems of our interest, mostly due to the fact that any purification of a mixed quantum state is convertible to another reversibly [12].
All the above studies rely on the notion of random unitaries, i.e., unitaries drawn at random from the set of all unitaries acting on the system, which leads to full randomization over the whole Hilbert space. In various situations, however, full randomization is too strong a demand. In the context of communication theory, for example, full randomization leads to reliable transmission of quantum information, while we may be interested in sending classical information at the same time [13], for which full randomization is more than necessary. In the context of quantum many-body physics, the random process caused by the complexity of the dynamics is in general restricted by symmetry, and thus no randomization occurs among different values of conserved quantities. Hence, in order for the random-unitary-based method to fit into a broader context in quantum information theory and fundamental physics, it is desirable to generalize the previous studies, which use fully random unitaries, to those based on random unitaries that are not fully random but have a proper structure.
As the first step toward this goal, we consider a scenario in which the unitaries take a simple form under the following direct-sum-product (DSP) decomposition of the Hilbert space:

$$\mathcal{H} = \bigoplus_{j=1}^{J} \mathcal{H}_j^{l} \otimes \mathcal{H}_j^{r}. \qquad (1)$$

Here, the superscripts l and r stand for "left" and "right", respectively, and j is the index of the diagonal subspaces. This decomposition often appears in the context of quantum information theory, such as information-preserving structure [14,15], the Koashi-Imoto decomposition [16], data compression of quantum mixed-state sources [17], quantum Markov chains [18,19] and simultaneous transmission of classical and quantum information [13]. Also, quantum systems with symmetry are represented by Hilbert spaces decomposed into this form (see e.g. [20]), in which case j is the label of the irreducible representations of a compact group G, $\mathcal{H}_j^{l}$ is the representation space and $\mathcal{H}_j^{r}$ is the multiplicity space for each j.

In this paper, we introduce and analyze a task that we call partial decoupling. We consider a scenario in which a bipartite quantum state $\Psi$ on system AR is subject to a unitary operation U on A, followed by the action of a quantum channel (CP map) $\mathcal{T} : A \to E$. The unitary is assumed to be chosen at random, not from the set of all unitaries on A, but from the subset of unitaries that take a simple form under the DSP decomposition. Thus, partial decoupling is a generalization of the decoupling theorem [11] that incorporates the DSP decomposition. Along similar lines to [11], we analyze how close the final state $\mathcal{T}^{A\to E}(U^A \Psi^{AR} U^{\dagger A})$ is, on average over the unitaries, to the averaged final state $\mathcal{T}^{A\to E}(\Psi^{AR}_{\rm av})$, where $\Psi^{AR}_{\rm av} := \mathbb{E}_U[U^A \Psi^{AR} U^{\dagger A}]$.

The main result of this paper is a pair of upper and lower bounds on the average distance between the final state and the averaged one. The bounds are represented in terms of the smooth conditional entropies of quantum states involving the initial state, the channel and the decomposition.
For a particular case where J = 1 and dim H A l j = 1, the obtained formulae are equivalent to those given by the decoupling theorem [11].
The result in this paper is applicable for generalizing any problems within the scope of the decoupling theorem by incorporating the DSP structure. Some of the applications are investigated in our papers [21][22][23][24].
In Refs. [21][22][23], we investigate communication tasks between two parties in which the information to be transmitted has both classical and quantum components. In this case, the Hilbert space $\mathcal{H}_j^{l}$ in (1) is assumed to be a one-dimensional space $\mathbb{C}$, and $\mathcal{H}_j^{r}$ to be spaces with the same dimension for all j: multiplicity spaces $\{\mathcal{H}_j^{r}\}$ and should be in the form of $\bigoplus_{j=1}^{J} I_j^{l} \otimes U_j^{r}$, where $I_j^{l}$ is the identity on $\mathcal{H}_j^{l}$ and $U_j^{r}$ is a random unitary on $\mathcal{H}_j^{r}$. Hence, this case is also in the scope of partial decoupling with a DSP decomposition given by the symmetry. Similarly, all physical phenomena investigated based on decoupling [7][8][9][10][11] can be lifted by partial decoupling to the situation with symmetry. We expect that further significant implications for various topics will be obtained beyond these examples.
This paper is organized as follows. In Section II, we introduce notations and definitions. In Section III, we present formulations of the problem and the main results. Before we prove our main results, we discuss implementations of our protocols by quantum circuits in Section IV. Section V describes the structure of the proofs of the main results, and provides lemmas that will be used in the proofs. The detailed proofs of the main theorems are provided in Sections VI-VIII. Conclusions are given in Section IX. Some technical lemmas and proofs are provided in the Appendices.

II. PRELIMINARIES
We summarize notations and definitions that will be used throughout this paper. See also Appendix H for the list of notations.

A. Notations
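We denote the set of linear operators and that of Hermitian operators on a Hilbert space $\mathcal{H}$ by $\mathcal{L}(\mathcal{H})$ and ${\rm Her}(\mathcal{H})$, respectively. For positive semidefinite operators, density operators and sub-normalized density operators, we use the following notations, respectively:

$$\mathcal{P}(\mathcal{H}) := \{\rho \in {\rm Her}(\mathcal{H}) : \rho \geq 0\}, \quad \mathcal{S}_{=}(\mathcal{H}) := \{\rho \in \mathcal{P}(\mathcal{H}) : {\rm Tr}[\rho] = 1\}, \quad \mathcal{S}_{\leq}(\mathcal{H}) := \{\rho \in \mathcal{P}(\mathcal{H}) : {\rm Tr}[\rho] \leq 1\}.$$

A Hilbert space associated with a quantum system A is denoted by $\mathcal{H}^A$, and its dimension is denoted by $d_A$. A system composed of two subsystems A and B is denoted by AB. When M and N are linear operators on $\mathcal{H}^A$ and $\mathcal{H}^B$, respectively, we denote $M \otimes N$ as $M^A \otimes N^B$ for clarity. In the case of pure states, we often abbreviate $|\psi\rangle^A \otimes |\phi\rangle^B$ as $|\psi\rangle^A |\phi\rangle^B$. For $\rho^{AB} \in \mathcal{L}(\mathcal{H}^{AB})$, $\rho^A$ represents ${\rm Tr}_B[\rho^{AB}]$. We denote $|\psi\rangle\langle\psi|$ simply by $\psi$. The maximally entangled state between A and A', where $\mathcal{H}^A \cong \mathcal{H}^{A'}$, is denoted by $|\Phi\rangle^{AA'}$ or $\Phi^{AA'}$. The identity operator is denoted by I. We denote $(M^A \otimes I^B)|\psi\rangle^{AB}$ as $M^A |\psi\rangle^{AB}$, and $(M^A \otimes I^B)\rho^{AB}(M^A \otimes I^B)^\dagger$ as $M^A \rho^{AB} M^{A\dagger}$.

When $\mathcal{E}$ is a supermap from $\mathcal{L}(\mathcal{H}^A)$ to $\mathcal{L}(\mathcal{H}^B)$, we denote it by $\mathcal{E}^{A\to B}$. When A = B, we use $\mathcal{E}^A$ for short. We also denote $(\mathcal{E}^{A\to B} \otimes {\rm id}^C)(\rho^{AC})$ by $\mathcal{E}^{A\to B}(\rho^{AC})$. The set of linear completely-positive (CP) supermaps from A to B is denoted by ${\rm CP}(A \to B)$, and the subset of trace non-increasing (resp. trace preserving) ones by ${\rm CP}_{\leq}(A \to B)$ (resp. ${\rm CP}_{=}(A \to B)$). When a supermap is given by conjugation with a unitary $U^A$ or an isometry $W^{A\to B}$, we denote it by the corresponding calligraphic letter, such as $\mathcal{U}^A(X^A) := U^A X^A U^{A\dagger}$ and $\mathcal{W}^{A\to B}(X^A) := W^{A\to B} X^A W^{\dagger A\to B}$.

Let A be a quantum system such that the associated Hilbert space $\mathcal{H}^A$ is decomposed into the DSP form as

$$\mathcal{H}^A = \bigoplus_{j=1}^{J} \mathcal{H}_j^{A_l} \otimes \mathcal{H}_j^{A_r}.$$

For the dimension of each subspace, we introduce the following notation: $l_j := \dim \mathcal{H}_j^{A_l}$ and $r_j := \dim \mathcal{H}_j^{A_r}$. We denote by $\Pi_j^A$ the projection onto the subspace $\mathcal{H}_j^{A_l} \otimes \mathcal{H}_j^{A_r} \subseteq \mathcal{H}^A$ for each j. For any quantum system R and any $X \in \mathcal{L}(\mathcal{H}^A \otimes \mathcal{H}^R)$, we introduce the notation $X_{jk}^{AR} := \Pi_j^A X^{AR} \Pi_k^A$, which leads to $X^{AR} = \sum_{j,k=1}^{J} X_{jk}^{AR}$.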

B. Norms and Distances
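For a linear operator X, the trace norm is defined as $\|X\|_1 = {\rm Tr}[\sqrt{X^\dagger X}]$, and the Hilbert-Schmidt norm as $\|X\|_2 = \sqrt{{\rm Tr}[X^\dagger X]}$. The trace distance between two unnormalized states $\rho, \rho' \in \mathcal{P}(\mathcal{H})$ is defined by $\|\rho - \rho'\|_1$. For subnormalized states $\rho, \rho' \in \mathcal{S}_{\leq}(\mathcal{H})$, the generalized fidelity and the purified distance are defined by

$$\bar{F}(\rho, \rho') := \|\sqrt{\rho}\sqrt{\rho'}\|_1 + \sqrt{(1 - {\rm Tr}[\rho])(1 - {\rm Tr}[\rho'])}, \qquad P(\rho, \rho') := \sqrt{1 - \bar{F}(\rho, \rho')^2},$$

respectively [25]. The epsilon ball of a subnormalized state $\rho \in \mathcal{S}_{\leq}(\mathcal{H})$ is defined by $\mathcal{B}^{\epsilon}(\rho) := \{\rho' \in \mathcal{S}_{\leq}(\mathcal{H}) \mid P(\rho, \rho') \leq \epsilon\}$.

For a linear superoperator $\mathcal{E}^{A\to B}$, we define the DSP norm by

$$\|\mathcal{E}^{A\to B}\|_{\rm DSP} := \sup_{C, \xi} \|\mathcal{E}^{A\to B}(\xi^{AC})\|_1, \qquad (13)$$

where the supremum is taken over all finite dimensional quantum systems C and all subnormalized states $\xi \in \mathcal{S}_{\leq}(\mathcal{H}^{AC})$ such that the reduced state on A is decomposed in the form of

$$\xi^A = \bigoplus_{j=1}^{J} q_j \varrho_j^{A_l} \otimes \pi_j^{A_r}. \qquad (14)$$

Here, $\{q_j\}_{j=1}^{J}$ is a probability distribution, $\{\varrho_j\}_{j=1}^{J}$ is a set of subnormalized states on $\mathcal{H}_j^{A_l}$ and $\pi_j^{A_r}$ is the maximally mixed state on $\mathcal{H}_j^{A_r}$. The epsilon ball of linear CP maps with respect to the DSP norm is defined by

$$\mathcal{B}^{\epsilon}_{\rm DSP}(\mathcal{E}) := \{\mathcal{E}' \in {\rm CP}(A \to B) \mid \|\mathcal{E}' - \mathcal{E}\|_{\rm DSP} \leq \epsilon\}. \qquad (15)$$

For quantum systems V, W, a linear operator $X \in \mathcal{L}(\mathcal{H}^{VW})$ and a subnormalized state $\varsigma \in \mathcal{S}_{\leq}(\mathcal{H}^W)$, we introduce the following notation:

$$\|X^{VW}\|_{2, \varsigma^W} := \|(\varsigma^W)^{-1/4} X^{VW} (\varsigma^W)^{-1/4}\|_2. \qquad (16)$$

This includes the case where V is a trivial (one-dimensional) system, in which case $X^{VW} = X^W$. We omit the superscript W for $\varsigma$ when there is no fear of confusion.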
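As an illustration of the distance measures in this subsection, the following minimal sketch evaluates the generalized fidelity and the purified distance for the special case of commuting (diagonal) subnormalized states, where $\|\sqrt{\rho}\sqrt{\rho'}\|_1$ reduces to $\sum_i \sqrt{p_i q_i}$. It assumes the standard definitions of [25]; the function names and the example vectors are hypothetical and chosen only for illustration.

```python
import math

def generalized_fidelity(p, q):
    """Generalized fidelity of two diagonal subnormalized states.

    p and q are lists of eigenvalues in a common eigenbasis (assumption:
    the two states commute), each summing to at most 1.
    """
    overlap = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Correction term accounting for the missing trace of each state.
    deficit = math.sqrt(max(0.0, 1.0 - sum(p)) * max(0.0, 1.0 - sum(q)))
    return overlap + deficit

def purified_distance(p, q):
    """Purified distance sqrt(1 - F^2), clipped against rounding errors."""
    f = generalized_fidelity(p, q)
    return math.sqrt(max(0.0, 1.0 - f * f))

def trace_norm_of_difference(p, q):
    """Trace norm ||rho - rho'||_1 for diagonal states."""
    return sum(abs(a - b) for a, b in zip(p, q))

rho = [0.5, 0.3, 0.2]      # normalized
sigma = [0.4, 0.3, 0.2]    # subnormalized, trace 0.9
print(purified_distance(rho, sigma), trace_norm_of_difference(rho, sigma))
```

A state $\rho'$ belongs to the epsilon ball of $\rho$ exactly when `purified_distance` returns a value of at most $\epsilon$.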

C. One-shot entropies
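For any subnormalized state $\rho \in \mathcal{S}_{\leq}(\mathcal{H}^{AB})$ and normalized state $\varsigma \in \mathcal{S}_{=}(\mathcal{H}^B)$, define

$$H_{\min}(A|B)_{\rho|\varsigma} := \sup\{\lambda \in \mathbb{R} \mid 2^{-\lambda} I^A \otimes \varsigma^B \geq \rho^{AB}\}, \qquad (17)$$
$$H_{\max}(A|B)_{\rho|\varsigma} := \log \|\sqrt{\rho^{AB}}\sqrt{I^A \otimes \varsigma^B}\|_1^2, \qquad (18)$$
$$H_{2}(A|B)_{\rho|\varsigma} := -\log {\rm Tr}\left[\left((\varsigma^B)^{-1/4} \rho^{AB} (\varsigma^B)^{-1/4}\right)^2\right]. \qquad (19)$$

The conditional min-, max- and collision entropies (see e.g. [26]) are defined by

$$H_{\min}(A|B)_{\rho} := \sup_{\varsigma} H_{\min}(A|B)_{\rho|\varsigma}, \qquad (20)$$
$$H_{\max}(A|B)_{\rho} := \sup_{\varsigma} H_{\max}(A|B)_{\rho|\varsigma}, \qquad (21)$$
$$H_{2}(A|B)_{\rho} := \sup_{\varsigma} H_{2}(A|B)_{\rho|\varsigma}, \qquad (22)$$

respectively, where the suprema are taken over $\varsigma \in \mathcal{S}_{=}(\mathcal{H}^B)$. The smoothed versions are of key importance when we are interested in the one-shot scenario. We particularly use the smooth conditional min- and max-entropies:

$$H^{\epsilon}_{\min}(A|B)_{\rho} := \sup_{\rho' \in \mathcal{B}^{\epsilon}(\rho)} H_{\min}(A|B)_{\rho'}, \qquad H^{\epsilon}_{\max}(A|B)_{\rho} := \inf_{\rho' \in \mathcal{B}^{\epsilon}(\rho)} H_{\max}(A|B)_{\rho'}$$

for $\epsilon \geq 0$. Note that expressions (17)-(22) can be generalized to the case where $\rho \in \mathcal{P}(\mathcal{H})$.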
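To give a feeling for how the three unsmoothed entropies compare, the following minimal sketch evaluates them in the simplest special case, assuming a trivial conditioning system B and a diagonal state with eigenvalues $p_i$. Under the standard definitions (see e.g. [26]) they then reduce to $H_{\min} = -\log \max_i p_i$, $H_2 = -\log \sum_i p_i^2$ and $H_{\max} = 2\log \sum_i \sqrt{p_i}$. The function names are hypothetical, and the sketch does not cover the conditional or smoothed quantities used in the theorems.

```python
import math

def h_min(p):
    """Min-entropy of a distribution p (trivial conditioning system)."""
    return -math.log2(max(p))

def h_2(p):
    """Collision entropy of a distribution p."""
    return -math.log2(sum(x * x for x in p))

def h_max(p):
    """Max-entropy (Renyi order 1/2) of a distribution p."""
    return 2.0 * math.log2(sum(math.sqrt(x) for x in p))

p = [0.5, 0.25, 0.25]
print(h_min(p), h_2(p), h_max(p))
```

For any distribution the three values are ordered as $H_{\min} \leq H_2 \leq H_{\max}$, and they coincide for the uniform distribution.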
where Φ l j and Φ r j are fixed maximally entangled states on H A l j ⊗ H A l j and H Ar j ⊗ H A r j , respectively.

E. Random unitaries
Random unitaries play a crucial role in the analyses of one-shot decoupling. By using them, it can be shown that there exists at least one unitary that achieves the desired task. In particular, the Haar measure on the unitary group is often used. The Haar measure H on the unitary group is the unique unitarily invariant probability measure, often called the uniform distribution on the unitary group. When a random unitary U is chosen uniformly at random with respect to the Haar measure, it is referred to as a Haar random unitary and is denoted by U ∼ H.
The most important property of the Haar measure is the left- and right-unitary invariance: for a Haar random unitary U ∼ H and any unitary V, the random unitaries VU and UV are both distributed uniformly with respect to the Haar measure. This property combined with the Schur-Weyl duality enables us to explicitly study the averages of many functions on the unitary group over the Haar measure. In the following, the average of a function f(U) on the unitary group over the Haar measure is denoted by $\mathbb{E}_{U \sim H}[f(U)]$.

In this paper, however, we are interested in the case where the Hilbert space is decomposed into the DSP form $\mathcal{H}^A = \bigoplus_{j=1}^{J} \mathcal{H}_j^{A_l} \otimes \mathcal{H}_j^{A_r}$. For such a case, we use the product measure $H_\times := H_1 \times \cdots \times H_J$, where $H_j$ is the Haar measure on the unitary group on $\mathcal{H}_j^{A_r}$ for any j. Hence, when we write $U \sim H_\times$ below, it means that U is in the form of $\bigoplus_{j=1}^{J} I_j^{A_l} \otimes U_j^{A_r}$ and $U_j^{A_r} \sim H_j$.
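The structure of a random unitary in the DSP form can be made concrete with a small numerical sketch. The code below draws each block $U_j$ by applying Gram-Schmidt orthonormalization to complex Gaussian vectors, which is one standard way of sampling from the Haar measure, and then assembles $\bigoplus_j I_{l_j} \otimes U_j$. All function names and the example dimensions are hypothetical; this is a sketch for small dimensions, not an efficient or numerically hardened implementation.

```python
import random

def haar_unitary(n, rng):
    """Sample an n x n Haar-random unitary via Gram-Schmidt on Gaussian columns."""
    cols = []
    for _ in range(n):
        v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        for u in cols:
            c = sum(a.conjugate() * b for a, b in zip(u, v))  # inner product <u|v>
            v = [b - c * a for a, b in zip(u, v)]
        norm = sum(abs(b) ** 2 for b in v) ** 0.5
        cols.append([b / norm for b in v])
    # cols[k] is the k-th column; return the matrix in row-major form.
    return [[cols[k][i] for k in range(n)] for i in range(n)]

def identity(n):
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def direct_sum(blocks):
    n = sum(len(b) for b in blocks)
    out = [[0j] * n for _ in range(n)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                out[off + i][off + j] = x
        off += len(b)
    return out

def dsp_unitary(dims, rng):
    """dims is a list of pairs (l_j, r_j); returns the direct sum of I_{l_j} (x) U_j."""
    return direct_sum([kron(identity(l), haar_unitary(r, rng)) for l, r in dims])

rng = random.Random(0)
U = dsp_unitary([(2, 2), (1, 3)], rng)  # total dimension 2*2 + 1*3 = 7
```

By construction the matrix has no entries connecting different values of j, and within each block it acts as the identity on the left factor.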

III. MAIN RESULTS
We consider two scenarios in which a bipartite quantum state $\Psi^{AR}$ is transformed by a unitary operation on A and is then subject to the action of a quantum channel (linear CP map) $\mathcal{T}^{A\to E}$. The unitary is chosen uniformly at random from the set of unitaries that take a simple form under the DSP decomposition (1).
In the first scenario, which we call non-randomized partial decoupling, the unitaries completely randomize the space $\mathcal{H}_j^{A_r}$ for each j, while having no effect on j or the space $\mathcal{H}_j^{A_l}$. This scenario may find applications when complex quantum many-body systems are investigated based on the decoupling approach, in which case the DSP decomposition is, for instance, induced by the symmetry of the system. In the second scenario, which we refer to as randomized partial decoupling, we assume that $\dim \mathcal{H}_j^{A_l} = 1$ and that $\dim \mathcal{H}_j^{A_r}$ does not depend on j. The unitaries not only completely randomize the space $\mathcal{H}^{A_r}$, but also randomly permute j. This scenario may suit communication problems. For instance, one of the applications may be classical-quantum hybrid communication tasks, where the division into classical and quantum information leads to the DSP decomposition.
For both scenarios, our concern is how close the final state is, after the action of the unitary and the quantum channel, to the averaged final state over all unitaries. It should be noted that the averaged final state is in the form of a block-wise decoupled state in general. This is in contrast to the decoupling theorem, in which the averaged final state is a fully decoupled state.

A. Non-Randomized Partial Decoupling
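Let us consider the situation where U has the DSP form: $U := \bigoplus_{j=1}^{J} I_j^{A_l} \otimes U_j^{A_r}$. For any state $\Psi^{AR}$, the averaged state obtained after the action of the random unitary $U \sim H_\times$ is given by

$$\Psi^{AR}_{\rm av} := \mathbb{E}_{U \sim H_\times}[U^A \Psi^{AR} U^{\dagger A}] = \bigoplus_{j=1}^{J} \Psi_{jj}^{A_l R} \otimes \pi_j^{A_r}. \qquad (28)$$

Here, $\pi_j^{A_r}$ is the maximally mixed state on $\mathcal{H}_j^{A_r}$, and $\Psi_{jj}^{A_l R}$ is an unnormalized state on $\mathcal{H}_j^{A_l} \otimes \mathcal{H}^R$ defined by $\Psi_{jj}^{A_l R} := {\rm Tr}_{A_r}[\Pi_j^A \Psi^{AR} \Pi_j^A]$. Our interest is in the average distance between the state $\mathcal{T}^{A\to E}(U^A \Psi^{AR} U^{\dagger A})$ and the averaged state $\mathcal{T}^{A\to E}(\Psi^{AR}_{\rm av})$ over all $U \sim H_\times$.

For expressing the upper bound on the average distance, we introduce a quantum system $A^*$ represented by a Hilbert space $\mathcal{H}^{A^*} := \bigoplus_{j=1}^{J} \mathcal{H}_j^{A_r} \otimes \mathcal{H}_j^{\bar{A}_r}$ and a linear operator $F^{A\bar{A}\to A^*} : \mathcal{H}^A \otimes \mathcal{H}^{\bar{A}} \to \mathcal{H}^{A^*}$ defined by where $\mathcal{H}_j^{\bar{A}_l} \cong \mathcal{H}_j^{A_l}$, $\mathcal{H}_j^{\bar{A}_r} \cong \mathcal{H}_j^{A_r}$ and $\mathcal{H}^{\bar{A}} \cong \mathcal{H}^A$. The following is our first main theorem about the upper bound:

Theorem 1 [Main result 1: One-shot non-randomized partial decoupling] For any $\epsilon, \mu \geq 0$, any subnormalized state $\Psi^{AR} \in \mathcal{S}_{\leq}(\mathcal{H}^{AR})$ and any linear CP map $\mathcal{T}^{A\to E}$, it holds that Here, $H^{\epsilon,\mu}_{\min}(A^*|RE)_{\Lambda(\Psi,\mathcal{T})}$ is the smooth conditional min-entropy for an unnormalized state $\Lambda(\Psi, \mathcal{T})$, defined by $F(\Psi^{AR} \otimes \tau^{\bar{A}E})F^\dagger$ with $\tau^{AE} = \mathfrak{J}(\mathcal{T}^{A\to E})$ being the Choi-Jamiołkowski representation of $\mathcal{T}^{A\to E}$. It is explicitly given by

$$H^{\epsilon,\mu}_{\min}(A^*|RE)_{\Lambda(\Psi,\mathcal{T})} := \sup_{\Psi' \in \mathcal{B}^{\epsilon}(\Psi)} \; \sup_{\mathcal{T}' \in \mathcal{B}^{\mu}_{\rm DSP}(\mathcal{T})} H_{\min}(A^*|RE)_{\Lambda(\Psi',\mathcal{T}')},$$

where $\mathcal{B}^{\mu}_{\rm DSP}(\mathcal{T})$ is the $\mu$-neighbourhood of $\mathcal{T}$ defined by (15).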
In the literature of chaotic quantum many-body systems, it is often assumed that the dynamics is approximated well by a random unitary channel, which is sometimes called scrambling [7,29,30]. Despite the fact that a number of novel research topics have been opened based on the idea of scrambling, some of which are using the decoupling approach [7,9,10], symmetry of the physical systems has rarely been taken into account properly. When the system has symmetry, the associated Hilbert space is naturally decomposed into a DSP form as where j is the label of irreducible representations of a compact group of the symmetry, H A l j is the irreducible representation and H Ar j corresponds to the multiplicity for each j. Due to the conservation law, the scrambling dynamics in the system should be compatible with this decomposition and should be in the form of U A = J j=1 I A l j ⊗ U Ar j . Hence, Theorem 1 is applicable to the study of complex physics in chaotic quantum many-body systems with symmetry.
Theorem 1 reduces to a simpler form when the symmetry is abelian. In this case, all the irreducible representations are one-dimensional, i.e., $\dim \mathcal{H}_j^{A_l} = 1$. The averaged output state is explicitly calculated to be

$$\mathcal{T}^{A\to E}(\Psi^{AR}_{\rm av}) = \sum_{j=1}^{J} \mathcal{T}^{A\to E}(\pi_j^{A_r}) \otimes \Psi_{jj}^{R}.$$

The operator $F^{A\bar{A}\to A^*}$ in (31) reduces to a direct sum of operators that are proportional to projectors, and the operator $\Lambda(\Psi, \mathcal{T}) \in \mathcal{S}_{\leq}(\mathcal{H}^{A^*RE})$ in Theorem 1 reduces to

Theorem 1 implies that, if the smooth conditional min-entropy of the unnormalized state $\Lambda(\Psi, \mathcal{T})$ is sufficiently large, the final state $\mathcal{T}^{A\to E} \circ \mathcal{U}^A(\Psi^{AR})$ is close to $\mathcal{T}^{A\to E}(\Psi^{AR}_{\rm av})$.
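The block-wise structure of the averaged state can be illustrated numerically in the simplest setting. The sketch below assumes an abelian symmetry ($\dim \mathcal{H}_j^{A_l} = 1$) and a trivial reference system R, in which case the averaged state reduces to $\sum_j {\rm Tr}[\Pi_j \Psi]\, \pi_j$: coherences between different j are removed and each block is replaced by a maximally mixed state carrying the weight of that block. The function name and the example state are hypothetical.

```python
def block_averaged_state(rho, dims):
    """Averaged state for an abelian DSP decomposition with trivial R.

    rho  : density matrix as a list of rows, in a basis adapted to the blocks
    dims : list of block dimensions r_j, summing to len(rho)
    """
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    off = 0
    for r in dims:
        # Weight Tr[Pi_j rho] of the j-th block.
        weight = sum(rho[off + i][off + i] for i in range(r))
        # Replace the block by weight * (maximally mixed state on the block).
        for i in range(r):
            out[off + i][off + i] = weight / r
        off += r
    return out

# Example: pure state (|0> + |2>)/sqrt(2) with blocks of dimension 1 and 2.
amp = [2 ** -0.5, 0.0, 2 ** -0.5]
rho = [[a * b for b in amp] for a in amp]
avg = block_averaged_state(rho, [1, 2])
```

In this example the coherence between the first block and the second block is erased, and the weight of the second block is spread uniformly over its two basis states.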
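B. Randomized Partial Decoupling
Next, we consider the case where $\dim \mathcal{H}_j^{A_l} = 1$ and $\dim \mathcal{H}_j^{A_r} = r$ for all j. The Hilbert space $\mathcal{H}^A = \bigoplus_{j=1}^{J} \mathcal{H}_j^{A_r}$ is then isomorphic to a tensor product Hilbert space $\mathcal{H}^{A_c} \otimes \mathcal{H}^{A_r}$, i.e., $A \cong A_c A_r$. Here, $\mathcal{H}^{A_c}$ is a J-dimensional Hilbert space with a fixed orthonormal basis $\{|j\rangle\}_{j=1}^{J}$, and $\mathcal{H}^{A_r}$ is an r-dimensional Hilbert space. We consider a random unitary U on system A of the form

$$U := \sum_{j=1}^{J} |j\rangle\langle j|^{A_c} \otimes U_j^{A_r},$$

which we also denote by $U \sim H_\times$. In addition, let $\mathbb{P}$ be the permutation group on $\{1, \cdots, J\}$, and P be the uniform distribution on $\mathbb{P}$. We define a unitary $G_\sigma$ for any $\sigma \in \mathbb{P}$ by

$$G_\sigma := \sum_{j=1}^{J} |\sigma(j)\rangle\langle j|^{A_c} \otimes I^{A_r}.$$

We denote the supermap given by conjugation with $G_\sigma$ by the calligraphic letter, as $\mathcal{G}_\sigma(\cdot) = G_\sigma (\cdot) G_\sigma^\dagger$. For the initial state, we use the notion of classically coherent states, defined as follows:

Definition 2 (classically coherent states [31]) Let $K_1$ and $K_2$ be d-dimensional quantum systems with fixed orthonormal bases $\{|k_1\rangle\}_{k_1=1}^{d}$ and $\{|k_2\rangle\}_{k_2=1}^{d}$, respectively, and let W be a quantum system. An unnormalized state $\varrho \in \mathcal{P}(\mathcal{H}^{K_1 K_2 W})$ is said to be classically coherent in $K_1 K_2$ if it satisfies $\varrho\, |k\rangle^{K_1} |k'\rangle^{K_2} = 0$ for any $k \neq k'$, or equivalently, if $\varrho$ is in the form of

$$\varrho = \sum_{k,k'} |k\rangle\langle k'|^{K_1} \otimes |k\rangle\langle k'|^{K_2} \otimes \varrho_{kk'}^{W},$$

where $\varrho_{kk'} \in \mathcal{L}(\mathcal{H}^W)$ for each k and k'.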
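The defining condition of Definition 2 can be checked directly on a matrix. The following minimal sketch assumes a trivial system W and orders the product basis of $K_1 K_2$ as $|k_1\rangle|k_2\rangle \mapsto k_1 d + k_2$ (indices starting from 0); it tests whether every column associated with $k_1 \neq k_2$ vanishes. The function names and the two example states are hypothetical.

```python
def outer(v):
    """Projector |v><v| as a list of rows."""
    return [[a * b.conjugate() for b in v] for a in v]

def is_classically_coherent(rho, d, tol=1e-12):
    """Check rho |k>|k'> = 0 for all k != k' (trivial system W assumed)."""
    for k1 in range(d):
        for k2 in range(d):
            if k1 != k2:
                col = k1 * d + k2  # index of the basis vector |k1>|k2>
                if any(abs(rho[row][col]) > tol for row in range(d * d)):
                    return False
    return True

# (|00> + |11>)/sqrt(2) is supported on the "diagonal" subspace.
bell = outer([2 ** -0.5, 0.0, 0.0, 2 ** -0.5])
# |+>|+> has weight on |01> and |10>, hence is not classically coherent.
plus_plus = outer([0.5, 0.5, 0.5, 0.5])
print(is_classically_coherent(bell, 2), is_classically_coherent(plus_plus, 2))
```

Since the state is Hermitian, vanishing columns imply vanishing rows, so checking columns alone is sufficient.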
We now provide our second main result:

Theorem 3 [Main result 2: One-shot randomized partial decoupling] Let $\epsilon, \mu \geq 0$, $\Psi^{AR}$ be a subnormalized state that is classically coherent in $A_c R_c$, and $\mathcal{T}^{A\to E}$ be a linear CP map such that the Choi-Jamiołkowski representation $\tau^{AE} = \mathfrak{J}(\mathcal{T}^{A\to E})$ satisfies ${\rm Tr}[\tau] \leq 1$. It holds that The function $\alpha(J)$ is 0 for J = 1 and $\frac{1}{J-1}$ for $J \geq 2$, and $\beta(A_r)$ is 0 for $\dim \mathcal{H}^{A_r} = 1$ and 1 for $\dim \mathcal{H}^{A_r} \geq 2$. The exponents $H_I$ and $H_{II}$ are given by Here, $\mathcal{C}$ is the completely dephasing channel on $A_c$ with respect to the basis $\{|j\rangle\}_{j=1}^{J}$, and $\tau^{AB} = \mathfrak{J}(\mathcal{T}^{A\to B})$ is the Choi-Jamiołkowski representation of the complementary channel $\mathcal{T}^{A\to B}$ of $\mathcal{T}^{A\to E}$.
Note that, since the subnormalized state Ψ AR is classically coherent in A c R c , the averaged state Ψ AR av is explicitly given by |j j| Ac ⊗ π Ar ⊗ Ψ Rr jj ⊗ |j j| Rc .
Small error for one-shot randomized partial decoupling implies that the third party having the purifying system of the final state may recover both classical and quantum parts of correlation in Ψ AR . Thus, it will be applicable, e.g., for analyzing simultaneous transmission of classical and quantum information in the presence of quantum side information. In this context, H I in the above expression quantifies how well the total correlation in Ψ AR can be transmitted by the channel T A→B , whereas H II for only quantum part thereof (see [21][22][23]).

C. A Converse Bound
So far, we have presented achievability results for non-randomized and randomized partial decoupling. At this point, we do not know whether the obtained bounds are "sufficiently tight". To address this question, we prove a converse bound for partial decoupling. We assume the following two conditions for the converse:

Converse Condition 1: $\dim \mathcal{H}_j^{l} = 1$, $\dim \mathcal{H}_j^{r} = r$ $(j = 1, \cdots, J)$,

Converse Condition 2: $\Psi^{AR}$ is classically coherent in $A_c R_c$.

Throughout the paper, we refer to the conditions as CC1 and CC2, respectively. The two conditions are always satisfied in the case of randomized partial decoupling, but not necessarily in the case of the non-randomized one. Consequently, the converse bound we prove below is directly applicable to randomized partial decoupling, but is not applicable to non-randomized partial decoupling in general. The converse bound is stated in the following theorem.
Theorem 4 [Main result 3: Converse for partial decoupling] Suppose that CC1 and CC2 are satisfied. Let |Ψ ARD be a purification of a normalized state Ψ AR ∈ S = (H AR ), which is classically coherent in A c R c due to CC2, and T A→E be a trace preserving CP map with the complementary channel T A→B . Suppose that, for δ > 0, there exists a normalized state are normalized states on E, such that Then, for any υ ∈ [0, 1/2) and ι ∈ (0, 1], it holds that where C is the completely dephasing channel on A c , and the smoothing parameters λ and λ are defined by Note that, when a quantum channel T A→E achieves partial decoupling for a state Ψ AR within a small error, it follows from the decomposition of Ψ av (see (43)) that whereτ E j := T A→E (|j j| Ac ⊗π Ar ) = (H E ). This is in the same form as the assumption of Theorem 4.
Let us compare the direct part of randomized partial decoupling (Theorem 3) and the converse bound presented above. The first term in the R.H.S. of the achievability bound (41) is calculated to be On the other hand, the converse bound (45) yields where ψ AB := T A →B (Ψ AA p ), with |Ψ p AA being a purification of Ψ A and H A ∼ = H A . Note that there exists a linear isometry from A to RD that maps |Ψ p to |Ψ [12], and that the conditional max entropy is invariant under local isometry (see Lemma 21 below). A similar argument also applies to the second term in (41) and (46). Thus, when Ψ A is the maximally mixed state, in which case |Ψ p AA = |Φ AA and thus ψ = τ , the gap between the two bounds is only due to the difference in values of smoothing parameters and types of conditional entropies. By the fully quantum asymptotic equipartition property [32], this gap vanishes in the limit of infinitely many copies. From this viewpoint, we conclude that the achievability bound of randomized partial decoupling and the converse bound are sufficiently tight.

D. Reduction to The Existing Results
We briefly show that the existing results on one-shot decoupling [11] and dequantization [31] are obtained from Theorems 1, 3 and 4 as corollaries, up to changes in smoothing parameters. Thus, our results are indeed generalizations of these two tasks.
First, by letting J = 1 in Theorem 3, we obtain the achievability of one-shot decoupling: Corollary 5 [Achievability for one-shot decoupling (Theorem 3.1 in [11])] Let , µ ≥ 0, Ψ AR be a subnormalized state, and T A→E be a linear CP map such that the Choi-Jamiołkowski representation Note that the duality of the conditional min and max entropies ( [25]: see also Lemma 24 in Section being the Choi-Jamiolkowski representation of the complementary channel T A→B of T A→E . A similar bound is also obtained by letting J = 1 and dimH A l j = 1 in Theorem 1. A converse bound for one-shot decoupling is obtained by letting J = 1 in Theorem 4, and by using the duality of the conditional entropies, as follows:

Corollary 6 [Converse for one-shot decoupling (Theorem 4.1 in [11])] Consider a normalized state
where |Ψ p AA is a purification of Ψ A , H A ∼ = H A , and the smoothing parameter λ is defined by (47).
Next, we consider the opposite extreme for Theorem 3, i.e., we consider the case where dimH Ar = 1. This case yields the dequantizing theorem:

Corollary 7 [Achievability for dequantization (Theorem 3.1 in [31])] Let
A be a quantum system with a fixed basis {|j } d A j=1 , H R ∼ = H A and , µ ≥ 0. Consider a subnormalized state Ψ AR that is classically coherent in AR, and a linear CP map T A→E such that the Choi-Jamiołkowski representation where C is the completely dephasing channel on A with respect to the basis {|j } J j=1 , and τ AB = J(T A→B ) is the Choi-Jamiolkowski representation of the complementary channel T A→B of T A→E .
In the same extreme, Theorem 4 provides a converse bound for dequantization, which has not been known so far:

Corollary 8 [Converse for dequantization]
Consider the same setting as in Corollary 7, and assume that Ψ AR is normalized, and that T A→E is trace preserving. Let |Ψ ARD be a purification of Ψ AR . Suppose that, for δ > 0, there exists a normalized state Then, for any υ ∈ [0, 1/2) and ι ∈ (0, 1], it holds that where the smoothing parameter λ is defined by (47).

IV. IMPLEMENTING THE RANDOM UNITARY WITH THE DSP FORM
Before we proceed to the proofs, we briefly discuss how the random unitaries $U \sim H_\times$ that respect the DSP form can be implemented by quantum circuits. Since Haar random unitaries are in general hard to implement, unitary t-designs, which mimic the t-th statistical moments of the Haar measure on average [33][34][35], have been exploited in many cases. Since the decoupling method makes use of the second statistical moments of the Haar measure, we could use unitary 2-designs instead of the Haar measure for our tasks. A number of efficient implementations of unitary 2-designs have been discovered [33][34][35][36][37][38][39][40][41], and it has also been shown that decoupling can be achieved using unitaries less random than unitary 2-designs [42,43]. Here, however, we need unitary designs in a given DSP form, which we refer to as DSP unitary designs. Thus, we cannot directly use the existing constructions, which poses a new problem about efficient implementations of DSP unitary designs. Although this problem is beyond the scope of this paper, we briefly discuss possible directions toward a solution.
One possible way is to simply modify the constructions of unitary designs known so far. This could be done by regarding each Hilbert space $\mathcal{H}_j^{A_r}$, on which each random unitary $U_j^{A_r} \sim H_j$ acts, as the Hilbert space of "virtual" qubits. The complexity of the implementation, i.e., the number of quantum gates, is then determined by the complexity of the unitary that transforms the basis in each $\mathcal{H}_j^{A_r}$ into the standard basis of the virtual qubits. Another way is to use the implementation of designs on one qudit [44], where it was shown that alternate applications of random diagonal unitaries in two complementary bases achieve unitary designs. This implementation would be suited to quantum many-body systems because we can choose two natural bases, the position and momentum bases, and just repeat switching random potentials in those bases under the condition that the potentials satisfy the DSP form. Finally, when the symmetry-induced DSP form is our concern, unitary designs with symmetry may possibly be implementable by applying random quantum gates that respect the symmetry.
In any case, the implementations of DSP unitary designs, or the symmetric unitary designs, and their efficiency are left fully open. Further analyses are desired.

V. STRUCTURE OF THE PROOF
In the rest of the paper, we prove the three main theorems, Theorem 1, Theorem 3 and Theorem 4 in Section VI, Section VII and Section VIII, respectively. For the sake of clarity, we sketch the outline of the proofs in Subsection V A (see also Fig. 1). We then list useful lemmas in Subsection V B. See also Appendix H for the list of notations used in the proofs.

A. Key lemmas and the structure of the proofs
FIG. 1. Outline of our proofs. "PD" stands for partial decoupling. For the smoothed randomized partial decoupling and the converse bound, we first assume two conditions, WA 1 and WA 2, but will remove them later to complete the proof. The details of the conditions are given in the main text.

For the achievability statements (Theorem 1 and Theorem 3), the key technical lemma is the twisted twirling, which can be seen as a generalization of the twirling method often used in quantum information science. See Appendix A for the proof.
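For comparison with the twisted version stated next, it may help to recall the standard twirling identity for a Haar random unitary on a single d-dimensional space with $d \geq 2$ (this is the well-known untwisted result, recalled here as a reference point and not as the statement of Lemma 9). For any $X \in \mathcal{L}(\mathcal{H}^{AA'})$ with $\mathcal{H}^A \cong \mathcal{H}^{A'}$, identity operator $I$ and swap operator $\mathbb{F}$ on $\mathcal{H}^{AA'}$,

$$\mathbb{E}_{U \sim H}\left[(U \otimes U)\, X\, (U \otimes U)^\dagger\right] = \alpha\, I + \beta\, \mathbb{F}, \qquad \alpha = \frac{{\rm Tr}[X] - \frac{1}{d}{\rm Tr}[X\mathbb{F}]}{d^2 - 1}, \quad \beta = \frac{{\rm Tr}[X\mathbb{F}] - \frac{1}{d}{\rm Tr}[X]}{d^2 - 1}.$$

The coefficients follow from taking the trace of both sides, and the trace after multiplying by $\mathbb{F}$, which gives ${\rm Tr}[X] = \alpha d^2 + \beta d$ and ${\rm Tr}[X\mathbb{F}] = \alpha d + \beta d^2$.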

Lemma 9 (Twisted Twirling)
Let H Ar j be a r j -dimensional subspace of H Ar , and Π Ar j be the projector onto H Ar j ⊂ H Ar for each of j = 1, · · · , J. Let I ArA r be I Ar ⊗ I A r , and F ArA r ∈ L(H ArA r ) be the swap operator defined by a,b |a b| Ar ⊗ |b a| A r for any orthonormal basis {|a } in H Ar and H A r . In Then, it holds that, for j = k, Moreover, The twisted twirling enables us to show the following lemma (see Appendix B).

Lemma 10
For any ς ER ∈ S = (H ER ) and any X ∈ Her(H AR ) such that X A l R jj = 0, the following inequality holds for any possible permutation σ ∈ P: Here, A T l denotes the transposition of A l with respect to the Schmidt basis of the maximally entangled state |Φ l j A l A l in (26), and the norm in the R.H.S. is defined by (16).
Based on this lemma, we can prove the non-smoothed versions of Theorem 1 and Theorem 3 in Subsections VI A and VII A, respectively.
To complete the proofs of Theorem 1 and Theorem 3, smoothing the statements is needed, which is done in Subsections VI B and VII B based on the following lemma proven in Appendix C.

Lemma 11
Consider arbitrary unnormalized states Ψ AR ,Ψ AR ∈ P(H AR ) and arbitrary CP maps T ,T : and thatΨ The following inequality holds for any possible permutation σ ∈ P and for both Ψ * = Ψ av and Ψ * = C A (Ψ): The converse statements are proved independently in Section VIII.
When we prove the one-shot randomized decoupling theorem (Theorem 3) and the converse (Theorem 4), we first put the following two working assumptions: These assumptions are finally dropped in Subsections VII C and VIII C using the following lemma (see Appendix D for a proof).

Lemma 12
Let $\mathcal{T}^{A\to E}$ be a linear CP map that does not necessarily satisfy WA 1 and WA 2. By introducing a quantum system $E_c$ with dimension J, define an isometry $Y^{A_c \to A_c E_c} := \sum_j |jj\rangle^{A_c E_c} \langle j|^{A_c}$, and a linear map $\check{\mathcal{T}}^{A\to EE_c}$ by $\mathcal{T}^{A\to E} \circ \mathcal{Y}^{A_c \to A_c E_c}$. Then, $\check{\mathcal{T}}^{A\to EE_c}$ is a linear CP map and, for any $\Psi^{AR}$ that is classically coherent in $A_c R_c$, the following equalities hold:

B. List of useful lemmas
We here provide several useful lemmas, some of which are in common with those in the proof of the one-shot decoupling theorem [11]. Proofs of Lemmas 16-20 and 29-35 will be provided in Appendix E.

Lemma 17
Let $\{p_k\}_k$ be a normalized probability distribution, $\{\rho_k\}_k$ be a set of normalized states on AB, and $\{\hat{\rho}_k\}_k$ be that of subnormalized ones. For $\rho^{ABK} := \sum_k p_k \rho_k^{AB} \otimes |k\rangle\langle k|^K$ and $\hat{\rho}^{ABK} := \sum_k p_k \hat{\rho}_k^{AB} \otimes |k\rangle\langle k|^K$, the purified distance satisfies

Lemma 18
Let $\{p_k\}_k$ and $\{q_k\}_k$ be subnormalized probability distributions, and $\{\rho_k\}_k$ and $\{\varsigma_k\}_k$ be sets of normalized states on A. For $\rho^{AK} := \sum_k p_k \rho_k^A \otimes |k\rangle\langle k|^K$ and $\varsigma^{AK}$ :

Lemma 19
The DSP norm defined by (13) satisfies the triangle inequality, i.e., for any superoperators E and F from L(

Lemma 20
Let {Π j } j be a set of orthogonal projectors on H such that j Π j = I. For any ∈ P(H),

Lemma 26 (Lemma A.5 in [11])
For any state ρ ABK ∈ S = (H ABK ) in the form of where ρ k ∈ S = (H AB ), k|k = δ k,k and {p k } k is a normalized probability distribution, it holds that (It is straightforward to show that the above equalities also hold for ρ ABK ∈ S ≤ (H ABK ) and ρ k ∈ S ≤ (H AB ), by noting that

Lemma 27 (Lemma A.7 in [11]) For any state
where the notations are the same as in Lemma 26, and for any ≥ 0 it holds that (Note that, although Lemma A.7 in [11] assumes that ρ ABK 1 K 2 is normalized, the condition is not used in the proof thereof.)

Lemma 29
In the same setting as in Lemma 27, it holds that If ρ is also diagonal in K 1 K 2 (i.e., if ρ is in the form of (75)), there existsρ, satisfying the above conditions, that is diagonal in K 1 K 2 .

Lemma 31
Consider the same setting as in Lemma 26. For any where ε := k p k k .

Other Technical Lemmas
Lemma 32 Consider two linear operators X, Y : Let |Φ AA and |Φ BB be maximally entangled states between A and A , and B and B , respectively. Then, and the transposition is taken with respect to the Schmidt bases of |Φ AA and |Φ BB .

Lemma 33
If 2 is classically coherent in XY for a positive semidefinite operator ∈ P(H AXY ), so is .

Lemma 34
Let π be the maximally mixed state on system A, and let C be the completely dephasing operation on A with respect to a fixed basis {|i } d A i=1 . For any ρ ∈ P(H AB ), it holds that
Lemma 36 (Lemma 35 in [45]) Let c ∈ (0, ∞) be a constant, f : [0, c] → R be a monotonically nondecreasing function that satisfies f (c) < ∞, and {p k } k∈K be a probability distribution on a countable set K. Suppose k (k ∈ K) satisfies k ∈ [0, c], and k∈K p k k ≤ for a given ∈ (0, c 2 ]. Then we have

VI. PROOF OF THE NON-RANDOMIZED PARTIAL DECOUPLING (THEOREM 1)
We now prove the non-randomized partial decoupling theorem (Theorem 1). As sketched in Subsection V A, the proof proceeds in two steps: we show the non-smoothed version in Subsection VI A, and then smooth it in Subsection VI B.

A. Proof of The Non-Smoothed Non-randomized Partial Decoupling
The non-smoothed version of Theorem 1 is given by where Ψ AR av = J j=1 Ψ A l R jj ⊗ π Ar j . Note that, due to the definition of the conditional collision entropy (19), (22) and its relation to the conditional min-entropy (see Lemma 23), we have for a proper choice of ς ER ∈ S = (H ER ). In addition, it holds that We first show this relation. Let Π A * j be the projection onto a subspace H Ar j ⊗ HĀ r j ⊂ H A * for each j. Due to the definition of F AĀ→A * given by (31), it holds that Using the property of the Hilbert-Schmidt norm (Lemma 20), we have Using Eq. (87) and the explicit form of Λ(Ψ, T ), i.e. Λ(Ψ, T ) := F (Ψ AR ⊗ τĀ E )F † , each term in the sum is given by where the last line follows from Lemma 32. Thus, we obtain (86). From Eqs. (85) and (86), it suffices to prove that for any ς ER ∈ S = (H ER ). In the following, we denote the L.H.S. of Ineq. (90) by κ. Due to Lemma 14, for any ς ∈ S = (H ER ), we have Using this and Jensen's inequality, we obtain av,jj , we can apply Lemma 10 for X AR = Ψ AR − Ψ AR av and σ = id. This yields where the second line follows from the fact that Ψ A l ArR av,jk = 0 for j ≠ k. To calculate the first term in (93), note that and that Thus, we simply apply Lemma 34 to obtain for each j. Substituting this into (93), we arrive at Ineq. (90).

B. Proof of The Smoothed Non-Randomized Partial Decoupling
We now smoothen the conditional min-entropy to complete the proof of Theorem 1. To this end, fixΨ ∈ B (Ψ) andT ∈ B µ DSP (T ) so that Let |Ψ p,av AA be a purification of Ψ A av . Noting that Ψ av is decomposed in the form of (28), by properly choosing a DSP decomposition for A , it holds that where q j := TrΨ jj and j is a purification of Ψ A l jj /q j for each j.
In addition, let D A→E + and D A→E − be superoperators such that which yields T −T = D + − D − . Note that, in general, it does not necessarily imply that D + = T and D − =T . We now apply Lemma 11 for the case where σ = id. To obtain the explicit forms, we compute where we have used the properties of Ψ AA p,av , ∆ A E ± , and D A→E ± described above. The last line follows from the definition of the DSP norm. Furthermore, introducing a notationŪ(·) := E U ∼H × [ U(·)], we also have (see Lemma 11 for the definition and properties of δ AR ± ) where the fourth line follows from the definition of the DSP norm (13), and the seventh line from the triangle inequality for the DSP norm (Lemma 19). Applying the non-smoothed version of the non-randomized partial decoupling (Ineq. (84)) to a stateΨ and a CP mapT , we have All together, Ineq. (63) in Lemma 11 leads to which, together with (97), concludes the proof of Theorem 1.

VII. PROOF OF THE RANDOMIZED PARTIAL DECOUPLING (THEOREM 3)
We now prove Theorem 3. We first introduce the following two working assumptions (WA 1 and WA 2), which simplify the proof: in which T_jk is a linear supermap from L(H^{A_r}) to L(H^{E_r}) defined by T_jk(ζ) = T(|j⟩⟨k| ⊗ ζ) for each j, k.
We show the non-smoothed version in Subsection VII A and the smoothed version in Subsection VII B. The above assumptions are then dropped in Subsection VII C.

A. Proof of The Non-Smoothed Randomized Partial Decoupling under WA 1 and WA 2
Under the assumptions WA 1 and WA 2, the non-smoothed version of the randomized partial decoupling is given by Note that, as we will describe in Subsection VII C for general cases, the min entropies H min (A|E) τ and H min (A|E) C(τ ) are equal to the max entropies −H max (A|B) C(τ ) and −H max (A r |BA c ) C(τ ) , respectively, due to the duality of the conditional entropies for pure states (Lemma 24). The proof of this inequality will be divided into three steps.

Upper bound on the average trace norm
To prove Ineq. (106), we first introduce the following lemma that relates the average trace norm of an operator T A→E • G A σ • U A (X AR ) to the average Hilbert-Schmidt norm.

Lemma 37
Let X AR be an arbitrary Hermitian operator such that X AR = J j,k=1 |j k| Ac ⊗ X ArRr jk ⊗ |j k| Rc , and let ζ ∈ S = (H E ) and ξ ∈ S = (H R ) be arbitrary states that are decomposed as ζ E = j |j j| Ec ⊗ ζ Er j , ξ R = j |j j| Rc ⊗ ξ Rr j , respectively. Then it holds that where the norm in the R.H.S. is defined by (16).
It should be noted that Lemma 37 provides a stronger inequality than that obtained simply using Lemma 14.

Proof:
We exploit techniques developed in [31]. Recall that U is of the form Σ_{j=1}^J |j⟩⟨j|^{A_c} ⊗ U_j^{A_r}, and G_σ is defined by G_σ := Σ_{j=1}^J |σ(j)⟩⟨j|^{A_c} ⊗ I^{A_r} for any σ ∈ P. We define a subnormalized state γ_σ ∈ S_≤(H^{ER}) for each σ by γ_σ^{ER} := Σ_{j=1}^J |σ(j)⟩⟨σ(j)|^{E_c} ⊗ ζ_{σ(j)}^{E_r} ⊗ ξ_j^{R_r} ⊗ |j⟩⟨j|^{R_c}. Further, by letting P be a quantum system with an orthonormal basis {|σ⟩}_{σ∈P}, we define a subnormalized state γ ∈ S_≤(H^{PER}) by Using Lemma 14 and Jensen's inequality, we obtain In the last line, we used the following relation: which can be observed from the fact that, due to the decomposition of T^{A→E} from WA 2, Due to the fact that 1 |P|

Generalization of the dequantizing theorem
Our second step to prove the non-smoothed randomized partial decoupling is to generalize the non-smoothed version of the dequantizing theorem (Proposition 3.5 in [31]).

Lemma 38
In the same setting as in Theorem 3, it holds that where we have defined Ψ AR dp := C A (Ψ AR ) = J j=1 |j j| Ac ⊗ Ψ ArR jj .
Proof: Since Ψ AR and Ψ AR av are classically coherent in A c R c by assumption, we can apply Lemma 37 for X AR = Ψ AR − Ψ AR dp to obtain (115) Noting that Ψ AR jj − Ψ AR dp,jj = 0, we can also apply Lemma 10 under the assumption that A l is a one-dimensional system, r j = r and ς ER = ζ E ⊗ ξ R . Then, we obtain, for any σ ∈ P, where we have used d A = rJ in the last line. Taking the case of J = 1 into account, and noting that E_σ[g(σ)] = E_σ[g(σ^{−1})] for any function g, it follows that Here, we have used the definitions Ψ AR dp := C A (Ψ AR ) and τ AE dp := C A (τ AE ) in the sixth line, and Lemma 34 in the seventh line. Due to the relation between the conditional collision entropy and the conditional min-entropy (Lemma 23), it is further bounded from above by 2^{−H_min(A|R)_{Ψ|ξ} − H_min(A|E)_{τ|ζ}}.
Finally, we use the property of the conditional min-entropy (Lemma 28). There exist normalized states ξ and ζ in the form of such that H_min(A|R)_{Ψ|ξ} = H_min(A|R)_Ψ and H_min(A|E)_{τ|ζ} = H_min(A|E)_τ. Thus, we obtain which, together with Ineq. (115), completes the proof of Lemma 38.

Proof of The Non-Smoothed Randomized Partial Decoupling
We now prove the non-smoothed randomized partial decoupling, i.e., under the assumptions WA 1 and WA 2. Note that β(A r ) is 0 for dimH Ar = 1 and 1 for dimH Ar ≥ 2. By the triangle inequality, we have where we have used the fact that the unitary invariance of the Haar measure implies U A (Ψ AR av ) = Ψ AR av for any unitary U . The first term is bounded by simply using Lemma 38.
To bound the second term in (121), we use Lemma 37, leading to Since Ψ R dp,jj = Ψ R av,jj by definition, we can apply Lemma 10 for X AR = Ψ AR dp − Ψ AR av . Noting that Ψ ArR dp,jk = Ψ ArR av,jk = 0 for j ≠ k, this yields Thus, similarly to the derivation around Eq. (117), we obtain Substituting this into Ineq. (122), and noting that Ψ AR dp − Ψ AR av = 0 if dimH Ar = 1, we obtain an upper bound on the second term of the R.H.S. in Ineq. (121).

B. Proof of The Randomized Partial Decoupling under The Conditions WA 1 and WA 2
We now show, under the conditions WA 1 and WA 2, the randomized partial decoupling: The function α(J) is 0 for J = 1 and 1/(J−1) for J ≥ 2, and β(A_r) is 0 for dim H^{A_r} = 1 and 1 for dim H^{A_r} ≥ 2. The exponents H̃_I and H̃_II are given by Note that the duality of the conditional smooth entropies for pure states (Lemma 24) implies To prove the statement, we again start with the triangle inequality: Below, we derive upper bounds on the two terms in the R.H.S. separately.
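The two prefactors α(J) and β(A_r) described above are plain case distinctions. A minimal Python sketch follows; the function names `alpha` and `beta` and the use of `Fraction` for exact values are illustrative choices, not notation from the paper.

```python
from fractions import Fraction

def alpha(J: int) -> Fraction:
    """alpha(J): 0 for J = 1 and 1/(J - 1) for J >= 2, as stated in the text."""
    if J < 1:
        raise ValueError("J must be a positive integer")
    return Fraction(0) if J == 1 else Fraction(1, J - 1)

def beta(dim_Ar: int) -> int:
    """beta(A_r): 0 if dim H^{A_r} = 1 and 1 if dim H^{A_r} >= 2."""
    if dim_Ar < 1:
        raise ValueError("the dimension must be a positive integer")
    return 0 if dim_Ar == 1 else 1
```

Both functions vanish in the trivial cases (a single block, or a one-dimensional A_r), which is where the corresponding term of the bound drops out.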
For an upper bound on the first term, fixΨ ∈ B (Ψ) andτ ∈ B µ (τ ) so that we have and that Let D A→E + and D A→E − be superoperators such that which yields T −T = D + − D − . From Lemma 11, the CP mapT A→E having the Choi-Jamiołkowski stateτ AE satisfies Due to Lemma 38, the first term in the R.H.S. of the above inequality is bounded as Similarly to (101) and (102), using (128) and (129), it turns out that the second and the third terms are bounded from above by and respectively. Hence, we obtain In the same way, we also have Substituting these inequalities into Eq. (127), we obtain the desired result (Ineq. (125)).

C. Dropping Working Assumptions WA 1 and WA 2
We now drop the working assumptions WA 1 and WA 2, and show that Theorem 3 holds in general. For convenience, we restate the working assumptions here: in which T_jk is a linear supermap from L(H^{A_r}) to L(H^{E_r}) defined by T_jk(ζ) = T(|j⟩⟨k| ⊗ ζ) for each j, k. To drop these assumptions, we use Lemma 12. Using the linear isometry Y^{A_c→A_cE_c}, given by Y = Σ_j |jj⟩^{A_cE_c}⟨j|^{A_c}, we define a new CP map Ť^{A→EE_c} by T^{A→E} ∘ Y^{A_c→A_cE_c}. Lemma 12 states that Let τ̌^{AEE_c} be the Choi-Jamiołkowski state of Ť^{A→EE_c}, i.e., τ̌^{AEE_c} := J(Ť^{A→EE_c}). We denote by |τ⟩^{ABE} a purification of τ^{AE} such that the reduced state τ^{AB} is equal to J(T^{A→B}), where T^{A→B} is the complementary map of T^{A→E}. Then, it is clear that τ̌^{AEE_c} = Y(τ^{AE}), which implies that a purification |τ̌⟩^{ABEE_c} of τ̌^{AEE_c} is given by |τ̌⟩^{ABEE_c} = Y|τ⟩^{ABE}. It is also straightforward to verify that τ̌^{AB} = C(τ^{AB}).
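The role of the copying isometry Y = Σ_j |jj⟩⟨j| can be checked on a toy example. The sketch below uses plain Python lists and a hypothetical 3×3 operator `rho` on A_c alone (so the registers A_r, B and E of the paper are omitted). It verifies that Y†Y = I, and that tracing out the copy register dephases the input in the basis {|j⟩}, which is the mechanism behind the appearance of the dephasing map C above.

```python
def copy_isometry(J):
    """Matrix of Y = sum_j |jj><j| (shape J*J x J), as nested lists."""
    Y = [[0] * J for _ in range(J * J)]
    for j in range(J):
        Y[j * J + j][j] = 1  # |j>|j> has index j*J + j in the product basis
    return Y

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace_out_copy(sigma, J):
    """Partial trace over the second (copy) register of a J*J x J*J matrix."""
    return [[sum(sigma[a * J + e][b * J + e] for e in range(J))
             for b in range(J)] for a in range(J)]

J = 3
Y = copy_isometry(J)
gram = matmul(transpose(Y), Y)  # Y is real, so Y^dagger = Y^T
identity = [[int(i == j) for j in range(J)] for i in range(J)]

rho = [[2, 1, 0], [1, 3, 1], [0, 1, 1]]  # toy (unnormalized) operator, an assumption
sigma = matmul(matmul(Y, rho), transpose(Y))  # Y rho Y^dagger
reduced = trace_out_copy(sigma, J)
dephased = [[rho[i][j] if i == j else 0 for j in range(J)] for i in range(J)]
```

Here `gram` equals `identity` and `reduced` equals `dephased`: the off-diagonal entries of `rho` survive in Y ρ Y† but disappear once the copy is discarded.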
The new CP mapŤ A→EEc clearly satisfies WA 1 and WA 2. Hence, using Eq. (137) and achievability of the randomized partial decoupling under those assumptions (Ineq. (125)), we obtain Due to the duality of conditional smooth entropies (Lemma 24), we have Using the property of the conditional smooth entropy for classical-quantum states (Lemma 27), and noting thatτ AEEc is classically coherent in A c E c , we also have Substituting these into (138), and noting that Tr[τ ] = Tr[τ ] ≤ 1 by assumption, we obtain Theorem 3.

VIII. PROOF OF THE CONVERSE
We provide the proof of Theorem 4 under Converse Conditions 1 and 2, which are The proof proceeds along similar lines to the proof of the converse part of the one-shot decoupling theorem (see Section 4 in [11]). Suppose that there exists a normalized state Ω^{ER} := Σ_{j=1}^J ς_j^E ⊗ Ψ_jj^{R_r} ⊗ |j⟩⟨j|^{R_c}, where {ς_j}_{j=1}^J are normalized states on E, such that, for δ > 0, We separately prove that, in this case, the following inequalities hold for any υ ∈ [0, 1/2) and ι ∈ (0, 1]: Here, λ and λ are given by and x := √2 · ⁴√(24υ + 2δ). First, we prove these relations based on the working assumptions WA 1 and WA 2 in Subsections VIII A and VIII B. We complete the proof of Theorem 4 by dropping these assumptions in Subsection VIII C.
• V A→BE : A Stinespring dilation of T A→E .
• |θ BERD : A subnormalized pure state on BERD such that which is classically coherent in E c R c .
Note that the existence of |θ satisfying the above condition follows from Lemma 30 about the property of the conditional max-entropy for classically coherent states. From the definition of the conditional max-entropy, and from the definitions of θ and Θ, we have The proof of Ineq. (142) proceeds as follows. First, we prove that for any X ∈ P(H ER ), we can construct a subnormalized pure state |θ X BERD from θ and X such that Second, we prove that if X ER satisfies certain conditions, the θ X satisfies Third, we prove that for a proper choice of X ER satisfying the conditions for (149), Ineq. (148) implies Combining (147), (149) and (150), we arrive at (142). Before we start, we remark that the partial decoupling condition (141) is used in the proof of (149), particularly when we evaluate the smoothing parameter λ.

Proof of Ineq. (149)
Define a subnormalized probability distribution q k := k| Rc |θ 2 1 J k=1 , and normalized pure states |θ k ErRr by |θ k ErRr := q −1/2 k k| Ec k| Rc |θ for k such that q k > 0. Let ω ∈ S ≤ (H ER ) be a subnormalized state defined by where θ Er k and θ Rr k are reduced states of |θ k on E r and R r , respectively. Consider an arbitrary X ∈ P(H ER ) so that and As we prove in Appendix F, for any such X, the state |θ X is a subnormalized pure state, and the partial decoupling condition (141) implies where λ is defined by (144). Due to the definition of Θ and the invariance of min-entropy under local isometry (Lemma 21), we obtain

Proof of Ineq. (150)
We choose a proper X ER satisfying Conditions (156) and (157), and prove Ineq. (150) from (148). Define a normalized stateθ where J := |{k|1 ≤ k ≤ J, q k > 0}|, and X ER := J · I E ⊗θ R . Noting that θ is classically coherent in E c R c , it is straightforward to verify that Consequently, X ER satisfies Conditions (156) and (157). Using Ineq. (148), we have which implies, together from the definition of the conditional min-entropy and J ≤ J, that

B. Proof of Ineq. (143) under WA 1 and WA 2
We prove (143), that is, under the assumptions WA 1 and WA 2. To show this, we introduce the following notations: • |Ψ ARD : A purification of Ψ AR , in the same way as in the previous subsection.
• T A→E C : A trace preserving CP map defined by T A→E • θ ERD C : A subnormalized state on ERD such that H max (RD|E) θ C = H υ max (RD|E) Θ C and P (θ C , Θ C ) ≤ υ, which is classically coherent and diagonal in E c R c .
The assumptions WA 1 and WA 2 imply that Θ ERD C is classically coherent and diagonal in E c R c . Thus, the existence of θ C satisfying the above condition follows from Lemma 30. By definition, we have The proof of Ineq. (164) proceeds as follows. First, we introduce a quantum stateΨ ARD and a quantum channelT A→E To explicitly defineΨ ARD andT A→E C , observe that, since Θ C is a normalized state, we have Thus, due to Uhlmann's theorem, and noting that Θ RD C = Ψ RD , there exists a normalized pure state |Ψ ARD such that P (Ψ ARD ,Ψ ARD ) ≤ υ andΨ RD =θ RD C . It follows from the latter equality that there exists a trace preserving CP mapT A→E

Block-wise application of the converse inequality (142)
Define a normalized probability distribution {r k := k| Rc |Ψ 2 1 } J k=1 , and let |Ψ k ArRrD := r −1/2 k k| Ec k| Rc |Ψ for k such that r k > 0. SinceΨ is classically coherent in E c R c , theΨ k are normalized states. Define also a CP mapT Ar→E which is trace preserving due to the assumptions WA 1 and WA 2. We apply the converse inequality (142) forΨ k andT Ar→E C,k for each k, by letting J = 1. In particular, we choose υ = 0, in which case Ineq. (142) leads to The smoothing parameter λ_k is given by λ_k := 2√(ι + 4√(2δ_k)) + √(2√(2δ_k)) + 4√(2δ_k), δ k := T Ar→E A simple calculation yields

Calculation of Averaged Entropies
Using the fact thatθ C is classically coherent and diagonal in E c R c , it is straightforward to verify thatθ ERD whereλ := k r k λ k . Combining these all together with Eq. (165), we obtain As we prove in Appendix G, the partial decoupling condition (141) implies where λ(ι, x) := 2√(ι + 2x) + √x + 2x. A simple calculation then yields whose right-hand side is exactly λ given in (145). In addition, noting that Θ C is normalized, and by using the relation between the purified distance and the trace distance (Property 2 in Lemma 16), the last term in the R.H.S. of (173) is calculated to be Combining these all together, we arrive at

C. Dropping the Working Assumptions WA 1 and WA 2
We here show that the working assumptions WA 1 and WA 2 can be dropped. The proof is based on Lemma 12. Since the CP map Ť A→EEc , defined in Lemma 12, satisfies both conditions, it satisfies Ineq. (142), which is Let V A→BE be a Stinespring dilation of T A→E , and let Z Rc→RcEc be a linear isometry defined by Z := Σ_j |jj⟩^{R_cE_c}⟨j|^{R_c}. A purification |ϑ BRDEEc of Ť A→EEc (Ψ ARD ) is given by |ϑ BRDEEc = (V A→BE ⊗ Z Rc→RcEc )|Ψ ARD , and satisfies ϑ BRD = T A→B • C A (Ψ ARD ). Hence, due to the duality for the conditional smooth entropy (Lemma 24), it holds that Combining this with (178), we conclude The map Ť A→EEc also satisfies Ineq. (143): Similarly to (179) and (140), by using the property of the conditional max entropy for classical-quantum states (Lemma 29), we have which leads to This concludes the proof of Theorem 4 for any trace preserving CP map T A→E .
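The smoothing parameters in this section are all built from one function λ(ι, x). The sketch below evaluates it numerically under an assumed reading of the formulas, namely λ(ι, x) = 2√(ι + 2x) + √x + 2x, x = √2 · (24υ + 2δ)^{1/4}, and the block-wise λ_k = 2√(ι + 4√(2δ_k)) + √(2√(2δ_k)) + 4√(2δ_k). These expressions are reconstructions of the text, and the function names are hypothetical.

```python
import math

def lam(iota, x):
    """Assumed form of lambda(iota, x): 2*sqrt(iota + 2x) + sqrt(x) + 2x."""
    return 2 * math.sqrt(iota + 2 * x) + math.sqrt(x) + 2 * x

def x_param(upsilon, delta):
    """Assumed form of x: sqrt(2) times the fourth root of (24*upsilon + 2*delta)."""
    return math.sqrt(2) * (24 * upsilon + 2 * delta) ** 0.25

def lam_block(iota, delta_k):
    """Assumed block-wise parameter lambda_k for a single block with error delta_k."""
    s = math.sqrt(2 * delta_k)
    return 2 * math.sqrt(iota + 4 * s) + math.sqrt(2 * s) + 4 * s
```

Under this reading the block-wise parameter is the same function evaluated at x_k = 2√(2δ_k), that is, `lam_block(iota, d) == lam(iota, 2 * math.sqrt(2 * d))` up to rounding, and λ increases monotonically in both arguments.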

IX. CONCLUSION
In this paper, we have proposed and analyzed a task that we call partial decoupling. We have presented two different formulations of partial decoupling, and derived lower and upper bounds on how precisely partial decoupling can be achieved. The bounds are represented in terms of the smooth conditional entropies of quantum states involving the initial state, the channel and the decomposition of the Hilbert space. We have thereby provided a generalization of the decoupling theorem in the version of [11], by incorporating the direct-sum-product decomposition of the Hilbert space. Applications of our result to quantum communication tasks and to the black hole information paradox are provided in Refs. [21][22][23] and [24], respectively. A future direction is to apply the result to various scenarios that have been analyzed in terms of the decoupling theorem, such as relative thermalization [10] and area laws [8] in the foundations of statistical mechanics.
Then, it holds that, for j = k, Moreover, n ) † = 0 for i = j = k = l trivially follows from the fact that the random unitaries {U j } j are independent and that E U j ∼H j [U j ] = 0.
Let us consider the case where j = k and prove Eqs. (A2) and (A3). Note that any X ArB ∈ L(H ArB ) is decomposed into X ArB = p,q X Ar p ⊗ X B q , where X Ar p ∈ L(H Ar ) and X B q ∈ L(H B ). Using the fact that for any X Ar p ∈ L(H Ar ), which follows from the Schur-Weyl duality [46], we have Using this equality twice for j and k, we obtain Eq. (A2). It also leads to Eq. (A3) as follows: Here, we have used relations , and used Eq. (A2) in the last line.
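The first-moment identity invoked above is presumably the standard twirling formula E_{U∼H}[U X U†] = Tr[X] · I/r. As a small sanity check, the sketch below averages over the four Pauli matrices instead of the Haar measure on U(2). This substitution is an assumption made for illustration: the Pauli group is a unitary 1-design, so it reproduces the Haar first moment for r = 2, but it is not the ensemble used in the paper.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

PAULIS = [
    [[1, 0], [0, 1]],      # identity
    [[0, 1], [1, 0]],      # X
    [[0, -1j], [1j, 0]],   # Y
    [[1, 0], [0, -1]],     # Z
]

def pauli_twirl(M):
    """Average of P M P^dagger over the four Pauli matrices."""
    acc = [[0j, 0j], [0j, 0j]]
    for P in PAULIS:
        T = matmul(matmul(P, M), dagger(P))
        for i in range(2):
            for j in range(2):
                acc[i][j] += T[i][j] / len(PAULIS)
    return acc

M = [[1, 2 - 1j], [0.5j, 3]]  # arbitrary test operator with trace 4
twirled = pauli_twirl(M)      # should equal Tr[M] * I / 2
```

For this `M` the result is twice the identity, matching Tr[M]/r with Tr[M] = 4 and r = 2; the off-diagonal entries cancel exactly.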
We finally show Eq. (A4). Consider the operator E U j ∼H j (U Ar j ⊗ U A r j )|p q| Ar ⊗ |s t| A r (U Ar j ⊗ U A r j ) † . Since this commutes with V ⊗2 (∀V ∈ U(r j )), we obtain from the Schur-Weyl duality [46] that where α pqst and β pqst are determined by Note that the first equation is obtained by taking the trace of Eq. (A8), and the second is by calculating the expectation of F ArA r by both sides in Eq. (A8). Solving these equalities, we obtain Here, Π j is the projection onto a subspace H A l j ⊗ H Ar j ⊂ H A , and {|j } J j=1 is a fixed orthonormal basis of H Ac . The W is indeed an isometry, because (B4) where H Ac j ⊂ H Ac is a one-dimensional subspace spanned by |j for each j. Denoting the projection onto H A l j ⊂ H A l by Π A l j ∈ L(H A l ) and one onto H Ar j ⊂ H Ar by Π Ar j ∈ L(H Ar ), we also have and thus Let R be another quantum system represented by a finite dimensional Hilbert space H R . Any X AR ∈ L(H AR ) is decomposed by W A→AcA l Ar in the form of whereX A l ArR jk Conversely, any Y AcA l Ar ∈ L(H Ac ⊗ H A l ⊗ H Ar ) such that supp(Y AcA l Ar ) ⊂ img(W A→AcA l Ar ), is mapped to (W A→AcA l Ar ) † (Y AcA l Ar ) ∈ L(H A ). Note thatX jk is related to X jk defined by (10) as |j k| Ac ⊗X A l ArR jk = W A→AcA l Ar (X AR jk ). In the following, we denoteX A l ArR jk by X A l ArR jk for simplicity of notations.
Let A be a quantum system such that H A ∼ = H A . It is straightforward to verify that the fixed maximally entangled state |Φ defined by (26) is decomposed by W as where |Φ l j ∈ H A l j ⊗ H A l j and |Φ r j ∈ H Ar j ⊗ H A r j are fixed maximally entangled states of rank l j and r j , respectively.

Proof of Lemma 10
We now prove Lemma 10. The statement is given as follows: for any ς ER ∈ S = (H ER ) and any X ∈ Her(H AR ) such that X A l R jj = 0, the following inequality holds for any possible permutation σ ∈ P: Here, A T l denotes the transposition of A l with respect to the Schmidt basis of the fixed maximally entangled state used to define the Choi-Jamiołkowski representation τ AE of T A→E . Proof: Introducing a notation F RE,R E Thus, using the fact that for any function f , we have where we have defined Ξ AA RR . We first embed the operator Ξ AA RR σ into the space A c A l A r R and A c A l A r R . We introduce the following notations for the embedded map and the embedded operators: Using these notations, the operator Ξ AA RR σ is embedded to be Due to Lemma 9, the terms in the summation remain non-zero only in the following three cases: (i) J ≥ 2 and (j, k) = (m, n) (j ≠ k), (ii) J ≥ 2 and (j, k) = (n, m) (j ≠ k), and (iii) j = k = m = n.
In the following, we assume that J ≥ 2, and separately investigate the three cases using Lemma 9. Our concern is then Ξ σ,(i) , Ξ σ,(ii) and Ξ σ,(iii) such that In the case (i), from Lemma 9, we have . It follows that and consequently, from the condition for X, i.e. X A l R jj = 0, that Tr[(X AR ) ⊗2 Ξ AA RR σ,(i) ] = 0. Let us next consider the case (ii), where (j, k) = (n, m) (j ≠ k). This case yields . Denoting the A r part of Υ and T * byĀ r , we have where the fourth line follows from the Choi-Jamiołkowski correspondence (25) and the last line from the swap trick (Lemma 39). Hence we obtain Finally, we investigate the case (iii). Lemma 9 leads to where Similarly to (B20) and (B22), we have and Combining this with (B20), (B22) and (B25), we obtain

Noting that Tr
] is a Hermitian operator for each j, and by using the property of the Hilbert-Schmidt norm (see Lemma 13), the above equality leads to Combining this with (B24), we have Since Ξ σ = Ξ σ,(i) + Ξ σ,(ii) + Ξ σ,(iii) , we can thus obtain from these evaluations that for any ς ER ∈ S = (H ER ) and σ ∈ P. Combining this with Eq. (B13) concludes the proof.
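The swap trick (Lemma 39) used in case (ii) of the proof is assumed here to be the standard identity Tr[(X ⊗ Y) F] = Tr[XY], where F is the swap operator on two copies of the same space. The following sketch checks it on a pair of hypothetical 2×2 integer matrices; the helper names are illustrative only.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    d = len(B)
    n = len(A) * d
    return [[A[r // d][c // d] * B[r % d][c % d] for c in range(n)]
            for r in range(n)]

def swap_operator(d):
    """Swap operator F on C^d (x) C^d, acting as F|i>|j> = |j>|i>."""
    F = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            F[j * d + i][i * d + j] = 1
    return F

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

X = [[1, 2], [3, 4]]
Y = [[0, 5], [6, 7]]
lhs = trace(matmul(kron(X, Y), swap_operator(2)))  # Tr[(X (x) Y) F]
rhs = trace(matmul(X, Y))                          # Tr[X Y]
```

Both sides evaluate to 55 for these matrices. The identity is what converts traces over two copies of a system into the single-copy Hilbert-Schmidt quantities appearing in the bound.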

Appendix C: Proof of Lemma 11
We prove Lemma 11. We start with recalling the statement: Consider arbitrary unnormalized states Ψ AR ,Ψ AR ∈ P(H AR ) and arbitrary CP maps T ,T : A → E. Let D A→E + and D A→E − be arbitrary CP maps such that T −T = D + − D − . Let δ AR + and δ AR − be linear operators on H A ⊗ H R , such that and thatΨ The following inequality holds for any possible permutation σ ∈ P and for both Ψ * = Ψ av and Ψ * = C A (Ψ): Proof: By a recursive application of the triangle inequality, we have The expectation value of the first term is bounded as In the same way, the expectation value of the last term is bounded as For the second term, we have Similarly, the expectation value of the fourth term is bounded as Combining these all together, we obtain (C3).
To show Property 2, note that for any ρ, ς ∈ S_≤(H), we have (see Lemma 6 in [25]) where D̄ is the generalized trace distance defined by Noting that the second term in the above expression is no greater than the first term, we conclude the proof. For Property 3, define λ_φ := ⟨φ|φ⟩ and consider a normalized pure state |φ_n⟩ := λ_φ^{−1/2}|φ⟩. Due to the triangle inequality and the first statement of this lemma, we have which completes the proof.
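The argument for Property 2 can be illustrated numerically for commuting (diagonal) subnormalized states. The sketch assumes the standard definitions: generalized trace distance D̄(ρ, ς) = ½‖ρ − ς‖₁ + ½|Tr ρ − Tr ς|, generalized fidelity F̄ = ‖√ρ√ς‖₁ + √((1 − Tr ρ)(1 − Tr ς)), purified distance P = √(1 − F̄²), and the bounds D̄ ≤ P ≤ √(2D̄) from the cited lemma. The vectors `p` and `q` are hypothetical eigenvalue lists.

```python
import math

def generalized_trace_distance(p, q):
    """D-bar for diagonal states with eigenvalue lists p and q."""
    first = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
    second = 0.5 * abs(sum(p) - sum(q))
    return first + second

def purified_distance(p, q):
    """P = sqrt(1 - F^2), with the generalized fidelity for diagonal states."""
    defect = max(0.0, 1.0 - sum(p)) * max(0.0, 1.0 - sum(q))
    fidelity = sum(math.sqrt(a * b) for a, b in zip(p, q)) + math.sqrt(defect)
    return math.sqrt(max(0.0, 1.0 - fidelity * fidelity))

p = [0.5, 0.3, 0.1]   # subnormalized: trace 0.9
q = [0.4, 0.4, 0.2]   # normalized: trace 1.0
d_bar = generalized_trace_distance(p, q)
pur = purified_distance(p, q)
first_term = 0.5 * sum(abs(a - b) for a, b in zip(p, q))
second_term = 0.5 * abs(sum(p) - sum(q))
```

In this example the trace-difference term (0.05) is indeed no greater than the trace-norm term (0.15), which is the observation that closes the proof above.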

Proof of Lemma 17:
Since ρ ABK and ρ AB k are normalized, the purified distances are given by

Proof of Lemma 19:
Consider an arbitrary finite-dimensional quantum system C and any subnormalized state ξ on AC such that the reduced state on A takes the form of ξ A = J j=1 q j A l j ⊗ π Ar j . Due to the triangle inequality for the trace norm, it holds that By taking the supremum over all C and ξ in the first line, we obtain Lemma 19.
Proof of Lemma 20:
Due to the completeness of the set of projectors, it holds that ϱ = Σ_{j,k} Π_j ϱ Π_k. This yields Tr and completes the proof.
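Lemma 20 is presumably the block decomposition of the Hilbert-Schmidt norm, ‖X‖₂² = Σ_{j,k} ‖Π_j X Π_k‖₂², which follows from the completeness relation used in the proof. The sketch below checks this assumed form for a hypothetical 4×4 integer matrix and projectors onto coordinate blocks.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def projector(n, indices):
    """Diagonal projector onto the span of the basis vectors in `indices`."""
    return [[1 if (i == j and i in indices) else 0 for j in range(n)]
            for i in range(n)]

def hs_norm_sq(X):
    """Squared Hilbert-Schmidt norm: the sum of |x_ij|^2 over all entries."""
    return sum(abs(x) ** 2 for row in X for x in row)

n = 4
blocks = [[0], [1, 2], [3]]          # a complete set of orthogonal projectors
X = [[1, 2, 0, -1],
     [3, 1, 4, 0],
     [0, -2, 5, 1],
     [2, 0, 1, 3]]

total = hs_norm_sq(X)
block_sum = sum(
    hs_norm_sq(matmul(matmul(projector(n, bj), X), projector(n, bk)))
    for bj in blocks for bk in blocks
)
```

Here `total` and `block_sum` coincide, because the blocks Π_j X Π_k occupy disjoint sets of matrix entries.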

Proof of Lemma 29:
Let |ϕ k ABC be a purification of ρ AB k for each k. A purification of ρ ABK 1 K 2 is given by |ϕ ABCK 1 K 2 K 3 := k √ p k |ϕ k ABC |k K 1 |k K 2 |k K 3 . Due to the duality of the conditional entropies (Lemma 24), Lemma 27 and isometric invariance (Lemma 22), we have which completes the proof.
Proof of Lemma 31:
Let ρ̂_k^{AB} ∈ B^{ε_k}(ρ_k^{AB}) be such that H^{ε_k}_min(A|B)_{ρ_k} = H_min(A|B)_{ρ̂_k} for each k, and define a subnormalized state ρ̂^{ABK} := Σ_k p_k ρ̂_k^{AB} ⊗ |k⟩⟨k|^K. From Lemma 26, we have H_min(A|BK)_{ρ̂} = −log(Σ_k p_k · 2^{−H_min(A|B)_{ρ̂_k}}). Due to the property of the purified distance (Lemma 17), we also have ρ̂^{ABK} ∈ B^{√(2ε)}(ρ^{ABK}), where ε = Σ_k p_k ε_k. This completes the proof.
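The formula from Lemma 26 used in this proof, H_min(A|BK) = −log(Σ_k p_k · 2^{−H_min(A|B)_k}), is easy to evaluate numerically. The sketch below assumes that log denotes the base-2 logarithm; the function name `hmin_cq` and its inputs are hypothetical.

```python
import math

def hmin_cq(probs, hmins):
    """Conditional min-entropy of a classical-quantum mixture (in bits),
    given the weights p_k and the per-branch values H_min(A|B)_k."""
    if len(probs) != len(hmins):
        raise ValueError("probs and hmins must have the same length")
    return -math.log2(sum(p * 2.0 ** (-h) for p, h in zip(probs, hmins)))

equal_branches = hmin_cq([0.5, 0.5], [1.0, 1.0])   # all branches agree
mixed_branches = hmin_cq([0.5, 0.5], [0.0, 2.0])   # dominated by the worst branch
```

When all branches have the same min-entropy the mixture inherits it; otherwise the result lies between the smallest and largest branch values and is pulled towards the smallest one.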

Proof of Lemma 32:
Let {|i⟩}_{i=1}^{d_A} and {|j⟩}_{j=1}^{d_B} be the Schmidt bases of |Φ⟩^{AA'} and |Φ⟩^{BB'}, respectively, and suppose that X = Σ_{i,j} x_ij |j⟩⟨i| and Y = Σ_{i,j} y_ij |j⟩⟨i|. The statement follows by noting that Tr[X^T Y] = Σ_{i,j} x_ij y_ij.
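The closing identity Tr[X^T Y] = Σ_{i,j} x_ij y_ij can be confirmed directly for rectangular matrices. In the sketch below, `X` and `Y` are hypothetical 3×2 integer matrices (maps from a 2-dimensional A to a 3-dimensional B), with rows indexed by j and columns by i.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

X = [[1, 2], [3, 4], [5, 6]]
Y = [[7, 8], [9, 10], [11, 12]]

lhs = trace(matmul(transpose(X), Y))            # Tr[X^T Y]
rhs = sum(X[j][i] * Y[j][i]                     # entrywise sum of x_ij * y_ij
          for j in range(len(X)) for i in range(len(X[0])))
```

Both sides equal 217 here. Note that the transpose, not the conjugate transpose, appears, so no complex conjugation is applied to the entries of X.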

Proof of Lemma 33:
Suppose that ϱ² is classically coherent. For any x ≠ y, it holds that which implies ⟨x|^X ⟨y|^Y ϱ |x⟩^X |y⟩^Y = 0 and completes the proof.

Proof of Lemma 34:
The first inequality is proved as Similarly, we obtain the second one as which concludes the proof.
This completes the proof.

Figure 2: Relations among the subnormalized states Θ, θ, θ_X, Ω, ω and ω_X. Dashed arrows represent how the states are defined, and the dotted lines represent the distances between the states. The goal of the proof is to evaluate the distance between |Θ⟩ and |θ_X⟩, by expressing it in terms of distances between other states on the whole system BERD, as depicted on the left. To evaluate the distance between |θ⟩ and |ω⟩, we also consider those states on the subsystem ER, as depicted on the right.
Due to the operator monotonicity of the inverse function (see e.g. [47]), we have Consequently, Γ ER X is contractive, and thus |θ X and |ω X are indeed subnormalized states. Relations among these states are depicted in Figure 2.
The δ̄ can further be calculated as follows. By the triangle inequality, we have