Attainability and lower semi-continuity of the relative entropy of entanglement, and variations on the theme

The relative entropy of entanglement $E_R$ is defined as the distance of a multi-partite quantum state from the set of separable states as measured by the quantum relative entropy. We show that this optimisation is always achieved, i.e. any state admits a closest separable state, even in infinite dimensions; also, $E_R$ is everywhere lower semi-continuous. We use this to derive a dual variational expression for $E_R$ in terms of an external supremum instead of infimum. These results, which seem to have gone unnoticed so far, hold not only for the relative entropy of entanglement and its multi-partite generalisations, but also for many other similar resource quantifiers, such as the relative entropy of non-Gaussianity, of non-classicality, of Wigner negativity and, more generally, all relative entropy distances from the sets of states with non-negative $\lambda$-quasi-probability distribution. The crucial hypothesis underpinning all these applications is the weak*-closedness of the cone generated by free states, and for this reason the techniques we develop involve a bouquet of classical results from functional analysis. We complement our analysis by giving explicit and asymptotically tight continuity estimates for $E_R$ and closely related quantities in the presence of an energy constraint.


I. INTRODUCTION
In its early days, almost a century ago [1][2][3][4][5], quantum mechanics was mostly regarded as a bizarre physical theory whose exotic mathematics was needed to explain the behaviour of atomic spectra. As decades passed and physicists grew accustomed to the strangeness of the quantum world, they became interested not only in the question of what we can do for it, that is, of how to explain or interpret it, but also in the more operationally-oriented question of what it can do for us, namely, of what feats can be achieved by exploiting genuinely quantum effects. Far from descending from a purely practically-oriented mindset, this attitude reflects a general belief, originally stemming from information theory [6], that exploring the ultimate operational potential of a resource tells us something about its nature.
The latest incarnation of this philosophy is the formalism of quantum resource theories [7][8][9]. In this very general framework, one identifies two sets of objects that can be accessed at will in an inexpensive fashion: a set of quantum states ('free states'), and a set of quantum operations ('free operations'). The main theoretical concern in this context is how to characterise in an operationally meaningful way the resource content of an arbitrary state $\rho$ that is not free. In accordance with the above philosophy, one typically considers two complementary tasks: evaluate the number of 'golden units' of pure resource that can be drawn from $\rho$ by manipulating it via free operations, or, conversely, the number of golden units that are needed to prepare $\rho$ via free operations in the first place.
These two complementary ideas lead, in the asymptotic limit of many copies, to the notions of distillable resource and of resource cost [9]. Albeit somewhat opposite to each other, these two approaches allow one, under appropriate assumptions, to single out a particular resource quantifier as the unique function determining the rates at which resources can be inter-converted by means of free operations [10]. This is the regularised version of the relative entropy of resource, defined for an arbitrary state $\rho$ by $D_{\mathcal{F}}(\rho) := \inf_{\sigma\in\mathcal{F}} D(\rho\|\sigma)$, where $\mathcal{F}$ is the set of free states, and $D(\cdot\|\cdot)$ is Umegaki's relative entropy [11,12].
Historically, the first embodiment of the relative entropy of resource arose in the context of entanglement quantification for states of a composite finite-dimensional quantum system [13][14][15][16][17]. In this case, the role of free states is played by the set of separable (a.k.a. un-entangled) states, defined as classical mixtures of product states (describing un-correlated subsystems) [18,19]. The resulting entanglement monotone, called the relative entropy of entanglement, turned out to be a very successful entanglement measure. Its regularisation can be endowed with multiple operational interpretations, either in the context of entanglement manipulation [20][21][22] or in that of hypothesis testing [23]. More generally, it plays a fundamental role in the analysis of composite quantum systems and quantum channels [24].
For all these reasons, it is desirable to have a general method to calculate or estimate the relative entropy of resource. Since it is naturally defined as a minimisation, upper bounds are easily computed by making ansatzes as to what the closest free state may be. A systematic technique to construct lower bounds, instead, has been put forth by Berta, Fawzi, and Tomamichel [25, § 5], who found a dual variational formula involving an external maximisation instead of a minimisation. Their proof, however, rests on an application of Sion's theorem, and we argue that a simple generalisation to infinite-dimensional systems does not work. This is a serious problem, as many resource theories of interest are infinite-dimensional, and one could even make the case that almost all fundamental quantum systems, i.e. the quantum fields that form the basis of our most successful theoretical models, are intrinsically infinite-dimensional.
On a different note, it is useful to observe that although the relative entropy is not a metric, we can intuitively interpret the quantity $D_{\mathcal{F}}(\rho)$ as a distance of a state $\rho$ from the set $\mathcal{F}$. The natural question that arises in connection with this interpretation concerns the existence of a nearest free state $\sigma$ to a given state $\rho$; in other words, we are asking whether the infimum in the definition of $D_{\mathcal{F}}(\rho)$ is attained. This problem has a simple positive solution in the finite-dimensional setting, provided that the set $\mathcal{F}$ is closed, due to the trace norm compactness of the state space. In infinite dimensions, the lower semi-continuity of the relative entropy still allows one to prove the existence of the nearest free state $\sigma$ if the set $\mathcal{F}$ is trace norm compact, but this observation covers basically none of the physically interesting cases: typically, $\mathcal{F}$ is closed but not compact, and attainability of the infimum in the definition of $D_{\mathcal{F}}$ is a non-trivial open question. The experience with many other quantifiers defined in a similar way, through infima (and suprema), seems to suggest that this question may have, in general, a negative answer.
In this paper we will show that this is however not the case. Namely, we leverage the special properties of the quantum relative entropy to prove that for many practically important non-compact sets $\mathcal{F}$ of free states the infimum in the definition of $D_{\mathcal{F}}$ is in fact always achieved (uniquely for faithful states, if $\mathcal{F}$ is convex) and that the resulting function $D_{\mathcal{F}}$ is lower semi-continuous. Our first two main results establish a general condition on $\mathcal{F}$ for these conclusions to hold (Theorem 5) and an effective criterion to check whether such condition is met in almost all cases of practical interest (Theorem 7). Exploiting Theorem 5, for the physically interesting finite-entropy states we derive a dual variational formula for $D_{\mathcal{F}}(\rho)$ involving an external supremum instead of an infimum (Theorem 9). This formula, which is our final main result, generalises to infinite-dimensional resource theories the one discussed in [25] for the finite-dimensional case, and can be used to generate lower bounds to $D_{\mathcal{F}}$ in a systematic way, thus addressing the pressing problem discussed before.
We also study the continuity of $D_{\mathcal{F}}$ (Proposition 16) and how the closest free state to $\rho$ depends on $\rho$ itself (Proposition 18). The key assumption on $\mathcal{F}$ underpinning all these results is the closedness of the cone composed of all non-negative multiples of states in $\mathcal{F}$ with respect to the weak*-topology, i.e. the topology induced on the Banach space of trace class operators acting on a certain Hilbert space by its pre-dual, the space of compact operators.
On the technical side, the proofs of our general results, and especially of Theorem 7, constitute a systematic and somewhat gratifying application of many of the cardinal results of functional analysis (inter alia, the uniform boundedness principle, the Banach-Alaoglu theorem, and the Krein-Šmulian theorem) to the framework of quantum resource theories. Although some functional analytic tools have been applied to the study of entanglement in quantum field theories before [26], to the best of our knowledge it is the first time that this is done in such a systematic way, further extending these ideas to general quantum resources. For the proof of Theorem 9, which is technically rather involved, we employ an original strategy that gives as by-products generalisations of Petz's variational formulae [27] to the case of unfaithful states and of Lieb's three-matrix inequality [28, Theorem 7] to infinite dimensions.
The main advantage of our approach is its universality, which we demonstrate by applying it to a wide range of examples. First and foremost, we establish that the relative entropy of entanglement [13,14] is (a) always achieved, (b) lower semi-continuous, and (c) expressible as a dual maximisation for finite-entropy states (Corollary 20). The same conclusions (a), (b), and (c) hold true for the multi-partite generalisations of the relative entropy of entanglement, where the role of $\mathcal{F}$ is played by states that are separable according to a prescribed set of partitions (Corollary 23), and for the relative entropy distance to the set of states with a positive partial transpose (Corollary 25). Moving on to the continuous variable setting, we are able to prove (a), (b), and (c) for all quantifiers of the form $D_{\mathcal{F}}$, where $\mathcal{F}$ is composed of those multi-mode states with non-negative $\lambda$-quasi-probability distribution, $\lambda\in[-1,1]$ being a fixed parameter (Corollary 32). Of special physical interest are the case $\lambda = +1$, in which $\mathcal{F}$ comprises all so-called 'classical' states, i.e. all convex mixtures of coherent states, as well as the case $\lambda = 0$, in which $\mathcal{F}$ includes the states with non-negative Wigner function. The last application of our results is to the resource theory of non-Gaussianity. In this case, $\mathcal{F}$ is taken to be the non-convex set of Gaussian states, and $D_{\mathcal{F}}$ is known to have a closed-form expression: it can be computed for a state $\rho$ as the difference $S(\rho_{\mathrm{G}}) - S(\rho)$, where $S$ is the von Neumann entropy and $\rho_{\mathrm{G}}$ is the Gaussian state with the same first and second moments as those of $\rho$. Although this function is known to be generally discontinuous in $\rho$ [29], our results imply that it is at least lower semi-continuous (Corollary 35).
For the special cases of the relative entropy of entanglement, its multi-partite generalisations, the Rains bound, and regularisations thereof, we complement our findings by establishing quantitative continuity estimates that are valid under appropriate energy constraints (Propositions 36 and 39). Our bounds, which we prove by further refining the techniques in Refs. [30,31], turn out to be asymptotically tight in many physically interesting cases, and imply that the aforementioned quantifiers, whose importance for the study of entanglement theory can hardly be overestimated, are uniformly continuous on energy-bounded sets of states.
The rest of the paper is organised as follows. Section II contains basic notation and definitions, as well as a brief introduction to some functional analytic techniques that play an essential role in this article. In Section III we state and prove our main results in a completely general form (Theorems 5, 7, and 9). These tools are then applied in Section IV to several concrete examples of the relative entropy of resource, starting from the bipartite relative entropy of entanglement and ending with the relative entropy of non-Gaussianity. Section V is devoted to a quantitative continuity analysis of the relative entropy of resource under energy constraints.

A. Topologies for quantum systems
In what follows, $\mathcal{H}$ will denote an arbitrary (not necessarily finite-dimensional) separable Hilbert space. The Banach space of trace class operators on $\mathcal{H}$, endowed with the trace norm $\|T\|_1 := \operatorname{Tr}\sqrt{T^\dagger T}$, will be denoted with $\mathcal{T}(\mathcal{H})$. It can be thought of as the dual of the Banach space $\mathcal{K}(\mathcal{H})$ of compact operators on $\mathcal{H}$ equipped with the operator norm, in formula $\mathcal{T}(\mathcal{H}) = \mathcal{K}(\mathcal{H})^*$. In turn, the dual of $\mathcal{T}(\mathcal{H})$, and hence the bi-dual of $\mathcal{K}(\mathcal{H})$, is the Banach space of bounded operators on $\mathcal{H}$, denoted with $\mathcal{B}(\mathcal{H}) = \mathcal{T}(\mathcal{H})^* = \mathcal{K}(\mathcal{H})^{**}$ [32, Chapter VI]. Importantly, from the separability assumption for $\mathcal{H}$ it follows that both $\mathcal{K}(\mathcal{H})$ and $\mathcal{T}(\mathcal{H})$, but not $\mathcal{B}(\mathcal{H})$, are separable as Banach spaces. The cone of positive semi-definite trace class operators is defined by $\mathcal{T}_+(\mathcal{H}) := \{X\in\mathcal{T}(\mathcal{H}) : \langle\psi|X|\psi\rangle \ge 0\ \ \forall\, |\psi\rangle\in\mathcal{H}\}$. Quantum states on $\mathcal{H}$ are represented by density operators, i.e. positive semi-definite trace class operators with unit trace; they form the set $\mathcal{D}(\mathcal{H}) := \{X\in\mathcal{T}_+(\mathcal{H}) : \operatorname{Tr} X = 1\}$. States $\rho\in\mathcal{D}(\mathcal{H})$ for which $\rho > 0$ are said to be faithful. For an arbitrary set $\mathcal{S}\subseteq\mathcal{T}(\mathcal{H})$, we will denote the cone it generates with $\operatorname{cone}(\mathcal{S}) := \{\lambda X : \lambda\in[0,\infty),\ X\in\mathcal{S}\}$. For example, we have that $\mathcal{T}_+(\mathcal{H}) = \operatorname{cone}(\mathcal{D}(\mathcal{H}))$.
We will consider essentially two topologies on $\mathcal{T}(\mathcal{H})$. The first one is induced by the native norm of $\mathcal{T}(\mathcal{H})$: a sequence of operators $(T_n)_{n\in\mathbb{N}}$ in $\mathcal{T}(\mathcal{H})$ is said to converge with respect to the trace norm topology to some $T\in\mathcal{T}(\mathcal{H})$, denoted $T_n \xrightarrow[n\to\infty]{\mathrm{tn}} T$, if $\lim_{n\to\infty}\|T_n - T\|_1 = 0$. The second one is the weak*-topology induced on $\mathcal{T}(\mathcal{H})$ by its pre-dual $\mathcal{K}(\mathcal{H})$, which can be defined as the coarsest topology making all functions of the form $\mathcal{T}(\mathcal{H})\ni T\mapsto \operatorname{Tr} TK$ continuous, where $K\in\mathcal{K}(\mathcal{H})$ is an arbitrary compact operator. Convergence of a sequence $(T_n)_{n\in\mathbb{N}}$ to $T\in\mathcal{T}(\mathcal{H})$ with respect to the weak*-topology, denoted $T_n \xrightarrow[n\to\infty]{w^*} T$, is therefore equivalent to the condition that $\operatorname{Tr} T_n K \to \operatorname{Tr} TK$ as $n\to\infty$ for all $K\in\mathcal{K}(\mathcal{H})$. It is not difficult to see that any sequence of operators converging in the trace norm topology is also convergent (to the same limit) with respect to the weak*-topology. This is usually expressed by saying that the weak*-topology is coarser than the trace norm topology. An immediate consequence is that weak*-closed sets are also closed with respect to the trace norm topology.
The fact that these two topologies are genuinely different in infinite dimensions can be illustrated by showing that the two associated notions of convergence are different. For example, in a Hilbert space with orthonormal basis $\{|n\rangle\}_{n\in\mathbb{N}}$, the sequence of pure states $(|n\rangle\langle n|)_{n\in\mathbb{N}}$ does not converge at all with respect to the trace norm topology (it is not even a Cauchy sequence), but it tends to 0 with respect to the weak*-topology, i.e. $|n\rangle\langle n| \xrightarrow[n\to\infty]{w^*} 0$. Intuitively, the weak*-topology treats any component that 'escapes to infinity' as converging to 0, while the trace norm topology takes it into account nonetheless.
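The contrast between the two notions of convergence can be checked on a toy computation. The Python sketch below is illustrative only: the function names and the particular diagonal operator $K = \sum_j (j+1)^{-1}|j\rangle\langle j|$ are assumptions made here, not objects taken from the text, and a single test operator of course does not prove weak*-convergence.

```python
def pairing_with_compact(n):
    """Tr(|n><n| K) for the diagonal operator K = sum_j |j><j| / (j + 1).

    K is compact because its diagonal entries tend to 0 (hypothetical choice).
    """
    return 1.0 / (n + 1)


def pairing_with_identity(n):
    """Tr(|n><n| I) = 1: the identity is bounded but not compact."""
    return 1.0


def trace_norm_distance(n, m):
    """Trace norm of |n><n| - |m><m| for orthonormal |n>, |m>."""
    # For n != m the difference has eigenvalues +1 and -1, hence trace norm 2.
    return 0.0 if n == m else 2.0


indices = (1, 10, 100, 1000)
compact_pairings = [pairing_with_compact(n) for n in indices]
identity_pairings = [pairing_with_identity(n) for n in indices]
consecutive_distances = [trace_norm_distance(n, n + 1) for n in indices]
```

The pairings with the compact operator shrink towards 0, while consecutive elements of the sequence stay at trace norm distance 2, so the sequence cannot be Cauchy in trace norm. The pairing with the identity stays equal to 1, which shows that normalisation is invisible to compact test operators only.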
Curiously, these two topologies agree on the very special set of quantum states. Namely, given a sequence $(\rho_n)_{n\in\mathbb{N}}$ of density operators $\rho_n\in\mathcal{D}(\mathcal{H})$ and another state $\rho\in\mathcal{D}(\mathcal{H})$, we have that $\rho_n \xrightarrow[n\to\infty]{w^*} \rho$ if and only if $\rho_n \xrightarrow[n\to\infty]{\mathrm{tn}} \rho$ [..., Lemma 4.3]. In order for this surprising conclusion to hold, it is crucial that both $\rho_n$ and $\rho$ are normalised density operators. In the above example, the weak*-limit was 0, hence not a normalised density operator.
The cone of positive semi-definite trace class operators is well known to be closed with respect to the trace norm topology. It is an easy yet instructive exercise to verify that it is also weak*-closed, a fact that we will employ multiple times throughout the paper. To see this, it suffices to remember that $X\in\mathcal{T}(\mathcal{H})$ satisfies $X\ge 0$ if and only if $\operatorname{Tr} X\psi = \langle\psi|X|\psi\rangle \ge 0$ for all $|\psi\rangle\in\mathcal{H}$, where $\psi := |\psi\rangle\langle\psi|$. The claim then follows because each rank-one projector $\psi$ is a compact operator, so that $\mathcal{T}_+(\mathcal{H})$ is an intersection of weak*-closed sets.
Unlike the trace norm topology, which is induced by a metric (in the sense that a metric, namely the one given by the norm, determines the convergence of all nets), the weak*-topology is not 'metrisable', i.e. it is not induced by any metric [37, Proposition 2.6.12]. The reader may wonder why one should introduce the complicated weak*-topology alongside the more intuitive trace norm one. The fundamental reason to do so is that a general result in Banach space theory, the Banach-Alaoglu theorem [37, Theorem 2.6.18], guarantees that the dual unit ball is always compact in the weak*-topology. In the present context, this implies that the unit ball of $\mathcal{T}(\mathcal{H})$,
$$B_1 := \{T\in\mathcal{T}(\mathcal{H}) : \|T\|_1 \le 1\}, \tag{1}$$
is weak*-compact. The acute reader will correctly suspect that the achievability results in our work rest crucially on this compactness property.
Remark 1. An immediate consequence of the Banach-Alaoglu theorem is that, provided that $\mathcal{H}$ is separable (and hence both $\mathcal{K}(\mathcal{H})$ and $\mathcal{T}(\mathcal{H})$ are), the weak*-topology, albeit globally non-metrisable, is indeed metrisable on norm-bounded subsets [37, Corollary 2.6.20]. Since weak*-compact sets of $\mathcal{T}(\mathcal{H})$ are always norm-bounded [37, Corollary 2.6.9], and compactness and sequential compactness are equivalent on metrisable spaces, we deduce the following handy fact: if $\mathcal{H}$ is separable then any weak*-compact subset of $\mathcal{T}(\mathcal{H})$ is also sequentially weak*-compact.
Remark 2. Several notions of weak topologies are commonly employed in the von Neumann algebra approach to quantum information. The purpose of this remark is to clarify some possible issues deriving from confusion in the terminology. Given a von Neumann algebra $\mathcal{M}$ and its pre-dual $\mathcal{M}_*$ spanned by all positive normal functionals on $\mathcal{M}$ (also called normal states), the weak* topology on $\mathcal{M}_*$ in the von Neumann algebra sense, or vN-weak* topology for short, is induced by the semi-norms $\mathcal{M}_*\ni\varphi\mapsto|\varphi(x)|$, indexed by $x\in\mathcal{M}$. How does this topology compare to the weak*-topology we employ? To make a comparison, we look at the simplest case where $\mathcal{M} = \mathcal{B}(\mathcal{H})$ is the von Neumann algebra of all bounded operators on some Hilbert space $\mathcal{H}$, so that $\mathcal{M}_*\simeq\mathcal{T}(\mathcal{H})$ is essentially the space of trace class operators on $\mathcal{H}$. It is not difficult to see that the vN-weak* topology on $\mathcal{T}(\mathcal{H})$ is the coarsest topology that makes all functions of the form $\mathcal{T}(\mathcal{H})\ni T\mapsto\operatorname{Tr} TX$ continuous, where $X\in\mathcal{B}(\mathcal{H})$ is an arbitrary bounded operator. This is clearly a stronger topology than our weak*-topology, whose definition, despite the apparent similarity, only requires $X$ to vary over all compact operators. This is a key difference: the weak*-converging sequence $|n\rangle\langle n| \xrightarrow[n\to\infty]{w^*} 0$ we looked at before does not converge to 0 in the vN-weak* topology, as can be seen swiftly by picking $X = I$, for which $\operatorname{Tr}\,|n\rangle\langle n|\, I = 1$ for all $n$.

B. Relative entropy of resource
The von Neumann entropy, or simply the entropy, of an arbitrary positive semi-definite trace class operator $X\in\mathcal{T}_+(\mathcal{H})$ whose spectral decomposition reads $X = \sum_i x_i |e_i\rangle\langle e_i| \ge 0$ is defined by
$$S(X) := -\operatorname{Tr}[X\ln X] = \sum_i (-x_i\ln x_i), \tag{2}$$
where the sum on the right-hand side is well defined (although possibly infinite) because $x\mapsto -x\ln x$ is non-negative for all $x\in[0,1]$, and all but a finite number of eigenvalues of $X$ belong to the interval $[0,1]$, due to the fact that $X$ is of trace class. Given two positive semi-definite trace class operators $X, Y\in\mathcal{T}(\mathcal{H})$, $X, Y\ge 0$, with spectral decomposition $X = \sum_i x_i |e_i\rangle\langle e_i|$ and $Y = \sum_j y_j |f_j\rangle\langle f_j|$, $x_i, y_j > 0$, one defines their relative entropy by [11,38]
$$\begin{aligned} D(X\|Y) &:= \operatorname{Tr}\left[X(\ln X - \ln Y) + Y - X\right] \\ &:= \sum_{i,j} |\langle e_i|f_j\rangle|^2 \left(x_i\ln x_i - x_i\ln y_j + y_j - x_i\right). \end{aligned} \tag{3}$$
This is a function taking on values in the set of extended real numbers $\mathbb{R}\cup\{+\infty\}$, and we adopt the convention according to which $D(0\|0) = 0$ and $D(X\|0) = +\infty$ if $X\ne 0$. The expression on the first line of (3) is to be interpreted as specified in the second. Thanks to the convexity of the function $x\mapsto x\ln x$, each term of the series is non-negative; hence, the value of the series is well defined, albeit possibly infinite. Note that a necessary condition for $D(X\|Y) < \infty$ to hold is that the support of the first argument is contained in the support of the second, in formula $\operatorname{supp} X\subseteq\operatorname{supp} Y$. On a different note, it is useful to observe that whenever $Y\le I$ (an assumption that can be made without loss of generality) the expression (3) can be recast as [...] (4), where we convene that [...]. If moreover $S(X) < \infty$, then (4) can be further reduced to the more familiar expression
$$D(X\|Y) = -S(X) - \operatorname{Tr}[X\ln Y] + \operatorname{Tr}[Y - X],$$
which in this case is well defined because the only term that can possibly diverge on the right-hand side is the second, to be interpreted as the series $-\operatorname{Tr}[X\ln Y] := -\sum_{i,j} |\langle e_i|f_j\rangle|^2\, x_i\ln y_j$. A fundamental result establishes that the relative entropy is jointly convex, i.e. [39,40] (see also [41, Theorem 5.4])
$$D\Big(\sum\nolimits_i p_i X_i \,\Big\|\, \sum\nolimits_i p_i Y_i\Big) \le \sum\nolimits_i p_i\, D(X_i\|Y_i)$$
for all finite probability distributions $p_1,\ldots,p_k$ (satisfying $p_i\ge 0$ for all $i$ and $\sum_i p_i = 1$) and all collections of positive semi-definite trace class operators $X_i, Y_i\in\mathcal{T}_+(\mathcal{H})$. We now establish the special property of the quantum relative entropy which is key for our approach, namely, its weak* lower semi-continuity.
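The series form of the relative entropy, $D(X\|Y) = \sum_{i,j} |\langle e_i|f_j\rangle|^2 (x_i\ln x_i - x_i\ln y_j + y_j - x_i)$, can be evaluated directly from spectral data. The following Python sketch is a minimal illustration under stated assumptions: the function names are hypothetical, the two eigenbases are taken to be complete orthonormal bases (zero eigenvalues included), and the conventions $0\ln 0 = 0$ and $-x\ln 0 = +\infty$ for $x > 0$ are choices of this sketch.

```python
import math


def xlogx(x):
    """x ln x with the convention 0 ln 0 = 0."""
    return 0.0 if x == 0.0 else x * math.log(x)


def relative_entropy(x_vals, x_vecs, y_vals, y_vecs):
    """Sum over i, j of |<e_i|f_j>|^2 (x_i ln x_i - x_i ln y_j + y_j - x_i).

    x_vecs and y_vecs are complete orthonormal eigenbases, given as lists of
    vectors; a term with x_i > 0 = y_j and non-zero overlap gives +infinity.
    """
    total = 0.0
    for x, e in zip(x_vals, x_vecs):
        for y, f in zip(y_vals, y_vecs):
            overlap = abs(sum(a.conjugate() * b for a, b in zip(e, f))) ** 2
            if overlap < 1e-15:  # orthogonal eigenvectors do not contribute
                continue
            if y == 0.0:
                if x > 0.0:
                    return math.inf
                continue  # x = y = 0 contributes nothing
            total += overlap * (xlogx(x) - x * math.log(y) + y - x)
    return total


s = 1.0 / math.sqrt(2.0)
basis_z = [(1.0, 0.0), (0.0, 1.0)]   # eigenbasis of |0><0|
basis_x = [(s, s), (s, -s)]          # one possible eigenbasis of I/2

d_pure_vs_mixed = relative_entropy([1.0, 0.0], basis_z, [0.5, 0.5], basis_x)
d_same = relative_entropy([1.0, 0.0], basis_z, [1.0, 0.0], basis_z)
d_support = relative_entropy([0.5, 0.5], basis_z, [1.0, 0.0], basis_z)
```

In this example the first value equals $\ln 2$, the second vanishes, and the third is infinite because the support of the first argument is not contained in that of the second.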
Remark 3. The absence of a similar property for other quantum divergences prevents the immediate extension of our results to the corresponding generalised divergences of resource. Indeed, remember from Remark 2 that the weak*-topology employed here is not the one commonly encountered in the study of von Neumann algebras.
Lemma 4. The relative entropy is lower semi-continuous with respect to the product weak*-topology.
Proof. An elementary yet important fact that we shall use multiple times without further comments is the following: if $\{f_\alpha\}_\alpha$ is a (possibly infinite) family of lower semi-continuous real-valued functions $f_\alpha : \mathcal{X}\to\mathbb{R}$ on the topological space $\mathcal{X}$, then $F : \mathcal{X}\to\mathbb{R}$ defined by $F(x) := \sup_\alpha f_\alpha(x)$, referred to as the point-wise supremum of the family $\{f_\alpha\}_\alpha$, is also lower semi-continuous. Now, a well-known result of Lindblad [42, Lemmata 3 and 4] states that
$$D(X\|Y) = \sup_P D(PXP\,\|\,PYP),$$
where the supremum is over all finite-dimensional projectors $P$. By the above criterion, if we establish that $(X,Y)\mapsto D(PXP\,\|\,PYP)$ is lower semi-continuous for all $P$ then we are done. Write an arbitrary finite-dimensional projector $P$ as $P = \sum_{i=1}^N |\psi_i\rangle\langle\psi_i|$. The map $X\mapsto PXP = \sum_{i,j=1}^N \langle\psi_i|X|\psi_j\rangle\, |\psi_i\rangle\langle\psi_j|$, being a finite sum of weak*-continuous functions, is clearly continuous with respect to the weak*-topology on the input space. Since the output space is finite dimensional, all Hausdorff linear topologies are equivalent there. The function $(X,Y)\mapsto D(X\|Y)$ is lower semi-continuous whenever $X, Y$ are finite-dimensional positive semi-definite matrices: indeed, due to the operator monotonicity of the logarithm [43] it can be written as $D(X\|Y) = \sup_{c>0}\left\{-S(X) - \operatorname{Tr} X\ln(Y + cI) + \operatorname{Tr}[Y - X]\right\}$, where $I$ is the identity operator, $S$ is the von Neumann entropy (2), and the functions inside this latter supremum are continuous in $X, Y$. We deduce immediately that $\mathcal{T}_+(\mathcal{H})\times\mathcal{T}_+(\mathcal{H})\ni(X,Y)\mapsto D(PXP\,\|\,PYP)$ is lower semi-continuous with respect to the product weak*-topology. The same is then true of the fully-fledged relative entropy (7), thanks to the representation (8).
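The finite-dimensional formula $D(X\|Y) = \sup_{c>0}\{-S(X) - \operatorname{Tr} X\ln(Y + cI) + \operatorname{Tr}[Y - X]\}$ used in the proof can be probed numerically. The sketch below is a simplified illustration that assumes commuting (diagonal) $X$ and $Y$, so that all traces reduce to sums over diagonal entries; the function names and the sample vectors are hypothetical.

```python
import math


def regularised_value(x, y, c):
    """-S(X) - Tr X ln(Y + cI) + Tr[Y - X] for X = diag(x), Y = diag(y), c > 0."""
    minus_entropy = sum(a * math.log(a) for a in x if a > 0.0)
    cross_term = -sum(a * math.log(b + c) for a, b in zip(x, y))
    return minus_entropy + cross_term + sum(y) - sum(x)


def relative_entropy_diag(x, y):
    """D(X||Y) for commuting X = diag(x), Y = diag(y); +inf on a support violation."""
    total = sum(y) - sum(x)
    for a, b in zip(x, y):
        if a > 0.0:
            if b == 0.0:
                return math.inf
            total += a * math.log(a / b)
    return total


x = [0.7, 0.3, 0.0]
y = [0.2, 0.5, 0.3]
exact = relative_entropy_diag(x, y)
# Values of the regularised expression for decreasing c.
lower_bounds = [regularised_value(x, y, c) for c in (1.0, 0.1, 0.01, 1e-6)]

# Support violation: the regularised values grow without bound as c -> 0.
diverging = regularised_value([1.0, 0.0], [0.0, 1.0], 1e-9)
```

Each regularised value is a continuous function of the entries and lies below the exact relative entropy; the values increase as $c$ decreases and approach the exact one, which is the mechanism behind lower semi-continuity in finite dimensions.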
The framework of quantum resource theories [9] models all those situations where operational or experimental constraints limit the set of states of a quantum system one can access in practice. Let a quantum system with Hilbert space $\mathcal{H}$ be given, and denote with $\mathcal{F}\subseteq\mathcal{D}(\mathcal{H})$ the set of states that are accessible at no cost, hereafter called free states. To quantify the resource cost of an arbitrary state $\rho\in\mathcal{D}(\mathcal{H})$, one employs functions known as resource quantifiers. One of the simplest and most important such functions is the so-called relative entropy of resource, defined for $\rho\in\mathcal{D}(\mathcal{H})$ by
$$D_{\mathcal{F}}(\rho) := \inf_{\sigma\in\mathcal{F}} D(\rho\|\sigma). \tag{9}$$
Several specific examples of this construction are explored in detail in the forthcoming Section IV. Importantly, the above function $D_{\mathcal{F}}$ is a resource monotone, i.e. it is non-increasing under any (completely) positive trace preserving map that sends $\mathcal{F}$ into itself. As a matter of fact, it is perhaps the most important resource monotone, as its regularisation governs the inter-conversion rates between states under asymptotically resource non-generating operations [10].
As explained in the Introduction, we are concerned here with the general properties of the relative entropy of resource (9), for either finite- or infinite-dimensional systems. For instance, a pressing problem is how to compute $D_{\mathcal{F}}$ in practice. While (9) allows us to find upper bounds rather easily by simply making ansatzes for $\sigma$, it is not at all obvious how to calculate lower bounds systematically. In the finite-dimensional case, Berta, Fawzi, and Tomamichel [25] managed to tackle this problem by employing the variational expression for the relative entropy found by Petz [27]; plugging it into (9) and using Sion's theorem to exchange the infimum and supremum, one obtains a formula for $D_{\mathcal{F}}$ that involves an external supremum instead of an infimum (cf. (9)). Such a formula is an ideal tool to compute lower bounds on $D_{\mathcal{F}}$ in a systematic way, as well as to prove general properties of $D_{\mathcal{F}}$. And indeed, one of our main results provides an extension of it to infinite-dimensional resource theories (Theorem 9).
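To see how ansatzes for $\sigma$ yield upper bounds on $D_{\mathcal{F}}(\rho) = \inf_{\sigma\in\mathcal{F}} D(\rho\|\sigma)$, consider a purely classical toy model, which is an assumption of this sketch and not an example from the text: states are probability distributions of two bits, and the (non-convex) free set consists of product distributions. In this toy model the infimum is known in closed form: it equals the mutual information and is attained at the product of the marginals. All names below are hypothetical.

```python
import math


def kl(p, q):
    """Classical relative entropy sum_k p_k ln(p_k / q_k) of probability vectors."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0.0:
            if b == 0.0:
                return math.inf
            total += a * math.log(a / b)
    return total


def product(a, b):
    """Product distribution with P(A=0) = a, P(B=0) = b, ordered 00, 01, 10, 11."""
    return [a * b, a * (1 - b), (1 - a) * b, (1 - a) * (1 - b)]


p = [0.4, 0.1, 0.1, 0.4]            # correlated joint distribution (toy example)
grid = [k / 20 for k in range(21)]  # ansatz parameters, boundary included

# Every ansatz gives an upper bound on the toy D_F(p).
ansatz_bounds = [kl(p, product(a, b)) for a in grid for b in grid]
best_bound = min(ansatz_bounds)

# Closed form for this toy model: relative entropy to the product of the
# marginals, which are both uniform here (a = b = 1/2).
mutual_information = kl(p, product(0.5, 0.5))
```

Here the grid happens to contain the optimal ansatz, so the best upper bound coincides with the closed form. In general an ansatz search only bounds $D_{\mathcal{F}}$ from above, which is why a dual formula producing lower bounds is valuable.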
However, to arrive there we need to start from simpler questions. A particularly immediate one is whether the infimum in (9) is always achievable. In other words, does there always exist a closest free state? Is it unique? One could also wonder whether the resulting function $D_{\mathcal{F}}$ is lower semi-continuous, which is one of the strongest forms of regularity we can hope for in general, as in infinite dimensions useful (i.e. extensive, or, more formally, additive) resource monotones are typically everywhere discontinuous; so, for instance, is the von Neumann entropy [44].
The compactness (and convexity) of $\mathcal{F}$ with respect to the trace norm topology is a sufficient condition to ensure both the existence of a closest free state to any given state and also the lower semi-continuity of $D_{\mathcal{F}}$. If $\dim\mathcal{H} < \infty$, then this amounts to requiring that $\mathcal{F}$ is closed: under this hypothesis, which is typically met in many important cases, in the finite-dimensional case we can replace the infimum in (9) with a minimum. The problem is that the above compactness assumption is almost never met for infinite-dimensional resource theories: indeed, density operators themselves form a non-compact set! Just to name a few examples of interesting $\mathcal{F}$, separable states [18,19] or states with positive partial transpose [45] in a bipartite system, classical states [46][47][48], states with a non-negative Wigner function [49][50][51], or even Gaussian states [52] in continuous variable multi-mode systems all give rise to non-compact sets. For these cases, prior to our paper none of the above properties of $D_{\mathcal{F}}$ was known [53]. In Section IV we will see how to apply our main results (Theorems 5, 7, and 9) to establish the achievability and lower semi-continuity of $D_{\mathcal{F}}$ in all of these cases and in even greater generality, and to derive dual variational expressions for it.

A. The statements
Throughout this section we will present the statements of our main results. All proofs can be found in the forthcoming Section III B.
Our first main result establishes a sufficient condition that makes the relative entropy of resource (9) achieved, meaning that the infimum in (9) is in fact a minimum, and lower semi-continuous as a function of the state. As we will see in the next section, this condition covers virtually all quantum resource theories of practical interest.
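Theorem 5. Let $\mathcal{H}$ be a (possibly infinite-dimensional) separable Hilbert space, and let $\mathcal{F}\subseteq\mathcal{D}(\mathcal{H})$ be a (not necessarily convex) set of density operators on $\mathcal{H}$. If $\operatorname{cone}(\mathcal{F}) = \{\lambda\sigma : \lambda\in[0,\infty),\ \sigma\in\mathcal{F}\}$ is weak*-closed, then the relative entropy distance from $\mathcal{F}$, defined by (9), is:
(a) always achieved, meaning that for all $\rho$ there exists $\sigma\in\mathcal{F}$ such that $D_{\mathcal{F}}(\rho) = D(\rho\|\sigma)$; and
(b) lower semi-continuous in $\rho$ with respect to the trace norm topology, i.e. such that $\rho_n \xrightarrow[n\to\infty]{\mathrm{tn}} \rho$ implies that $D_{\mathcal{F}}(\rho)\le\liminf_{n\to\infty} D_{\mathcal{F}}(\rho_n)$ for all sequences of density operators $(\rho_n)_{n\in\mathbb{N}}$.
Moreover, for every state $\rho$ with $D_{\mathcal{F}}(\rho) < \infty$, the set $\Sigma_{\mathcal{F}}(\rho) := \{\sigma\in\mathcal{F} : D(\rho\|\sigma) = D_{\mathcal{F}}(\rho)\}$ of closest free states to $\rho$:
(c) is trace norm compact, and furthermore convex if $\mathcal{F}$ itself is convex; and
(d) contains a unique state if $\mathcal{F}$ is convex and $\rho$ is faithful (i.e. $\rho > 0$).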

Remark 6. The faithfulness condition in claim (d) of Theorem 5 cannot be omitted even in the case of a finite-dimensional space $\mathcal{H}$. This is confirmed by Example 21 in Section IV A.
The main difficulty in applying Theorem 5 lies in verifying the weak*-closedness of the cone generated by the set of free states $\mathcal{F}$. In fact, although it is often the case that the set $\mathcal{F}$ itself is trace norm closed, since the weak*-topology is coarser than the trace norm topology this fact cannot be used to deduce the sought property of $\operatorname{cone}(\mathcal{F})$. To make our life easier, we will now equip ourselves with a technical tool that turns out to cover almost all interesting cases. The following result ought to be thought of as instrumental to the application of Theorem 5.
Theorem 7. Let $\mathcal{H}$ be a (possibly infinite-dimensional) separable Hilbert space, and let $(M_n)_{n\in\mathbb{N}}$ be a sequence of compact operators on $\mathcal{H}$ that converges to the identity in the strong operator topology, i.e. such that $\big\|M_n|\psi\rangle - |\psi\rangle\big\| \to 0$ as $n\to\infty$ for all $|\psi\rangle\in\mathcal{H}$. Define $\mathcal{M}_n(\cdot) := M_n(\cdot)M_n^\dagger$. Consider a set of states $\mathcal{F}\subseteq\mathcal{D}(\mathcal{H})$, and assume that: (i) $\mathcal{F}$ is convex; (ii) $\mathcal{F}$ is trace norm closed; (iii) $\mathcal{M}_n$ preserves free states (up to normalisation), i.e. $\mathcal{M}_n(\operatorname{cone}(\mathcal{F}))\subseteq\operatorname{cone}(\mathcal{F})$ for all $n\in\mathbb{N}$.
Then cone(F) is weak*-closed, and in particular claims (a)-(b) of Theorem 5 hold true.
Remarkably, hypothesis (iii) in Theorem 7 cannot be removed. To see why, and to better illustrate the nature of the assumption of weak*-closedness in Theorem 5, we now present a simple yet instructive example.
[...] Then $\mathcal{F}_{H,E}$ is (i) convex, (ii) trace norm closed, but $\operatorname{cone}(\mathcal{F}_{H,E})$ is not weak*-closed. Claim (i) is obvious, while (ii) follows from the trace norm lower semi-continuity of $\rho\mapsto\operatorname{Tr}\rho H$, which is a point-wise supremum of continuous functions. To verify that $\operatorname{cone}(\mathcal{F}_{H,E})$ is not weak*-closed, consider the sequence of states $\sigma_n$ [...], and the right-hand side does not belong to $\operatorname{cone}(\mathcal{F}_{H,E})$.
Our final main result is a general variational expression for the relative entropy of resource $D_{\mathcal{F}}$ that is dual to (9), in that it features an external maximisation instead of a minimisation. Its proof relies crucially on Theorem 5.
Note. If the state $\sigma$ appearing in (12) fails to be faithful, the interpretation of its logarithm may pose some problems. In this case, we convene that [...], where $P : \mathcal{H}\to\operatorname{supp}(\sigma)$ is the orthogonal projector onto the support of $\sigma$.
The variational approach to the study of quantum relative entropy has a long and illustrious tradition [25,27,41,54]. As for the relative entropy of resource, variational expressions of the above dual kind have proved to be very useful in establishing general properties of $D_{\mathcal{F}}$ and related quantifiers, for instance super-additivity [25,48]. Furthermore, as mentioned before, they can be of immense help computationally, as they provide a valuable tool to generate lower bounds on $D_{\mathcal{F}}$ systematically.
As for the proof technique, the one in [25, Section V.A] rests on Sion's theorem and does not carry over to infinite dimensions, because the compactness hypothesis typically breaks down. In Section III B we show how to overcome this difficulty by making use of Theorem 5; as a by-product of our derivation, exploiting some new results on multivariate trace inequalities [55][56][57] we extend the celebrated Lieb's three-matrix inequality [28] to the infinite-dimensional case (Appendix B).

B. Proofs
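Proof of Theorem 5. Consider the set
$$\widetilde{\mathcal{F}} := \operatorname{cone}(\mathcal{F}) \cap B_1 , \tag{14}$$
where $B_1$, defined by (1), is the unit ball of the trace norm. Note that $B_1$ is weak*-compact thanks to the Banach-Alaoglu theorem [37, Theorem 2.6.18], while $\operatorname{cone}(\mathcal{F})$ is weak*-closed by hypothesis. Being the intersection of a weak*-compact and a weak*-closed set, $\widetilde{\mathcal{F}}$ is also weak*-compact.
Now, let us prove claim (a) for a fixed state $\rho \in \mathcal{D}(\mathcal{H})$. Clearly, we can assume without loss of generality that $D_{\mathcal{F}}(\rho) < \infty$, otherwise any state $\sigma \in \mathcal{F}$ achieves (9). For all $\lambda \in [0, 1]$ and $\rho, \sigma \in \mathcal{D}(\mathcal{H})$, we have that
$$D(\rho\|\lambda\sigma) = D(\rho\|\sigma) - \ln\lambda + \lambda - 1 \geq D(\rho\|\sigma) \tag{15}$$
thanks to the fact that $f(\lambda) := \lambda - 1 - \ln\lambda \geq 0$ for all $\lambda \in [0, 1]$, with the convention that $f(0) = +\infty$. We can then write
$$D_{\mathcal{F}}(\rho) = \inf_{\sigma\in\mathcal{F}} D(\rho\|\sigma) = \inf_{\sigma\in\mathcal{F},\, \lambda\in[0,1]} D(\rho\|\lambda\sigma) = \inf_{\eta\in\widetilde{\mathcal{F}}} D(\rho\|\eta) . \tag{16}$$
Since the function $\eta \mapsto D(\rho\|\eta)$ is lower semi-continuous with respect to the weak*-topology, it achieves its minimum on the weak*-compact set $\widetilde{\mathcal{F}}$. Let $\eta_0 \in \widetilde{\mathcal{F}}$ be such that $D_{\mathcal{F}}(\rho) = D(\rho\|\eta_0)$. It must be that $\eta_0 \neq 0$, otherwise $D_{\mathcal{F}}(\rho) = +\infty$, contradicting our working assumptions. Due to the fact that $\eta_0 \geq 0$, we deduce that in fact $\operatorname{Tr}\eta_0 > 0$. Setting $\sigma_0 := (\operatorname{Tr}\eta_0)^{-1}\eta_0 \in \mathcal{F}$, we obtain
$$D_{\mathcal{F}}(\rho) \leq D(\rho\|\sigma_0) = D(\rho\|\eta_0) + \ln\operatorname{Tr}\eta_0 - \operatorname{Tr}\eta_0 + 1 \leq D(\rho\|\eta_0) = D_{\mathcal{F}}(\rho) , \tag{17}$$
where the second inequality follows from the fact that $\ln\operatorname{Tr}\eta_0 - \operatorname{Tr}\eta_0 + 1 \leq 0$ as $\operatorname{Tr}\eta_0 \in (0, 1]$. Since all inequalities in (17) must be equalities, we infer immediately that in fact $\operatorname{Tr}\eta_0 = 1$, i.e. $\eta_0 = \sigma_0 \in \mathcal{F}$.
We now move on to claim (b). Let $(\rho_n)_{n\in\mathbb{N}}$ be a sequence such that $\rho_n \xrightarrow[n\to\infty]{tn} \rho$. Up to extracting a subsequence, we can assume without loss of generality that $(D_{\mathcal{F}}(\rho_n))_{n\in\mathbb{N}}$ converges (if $\lim_{n\to\infty} D_{\mathcal{F}}(\rho_n) = +\infty$ there is nothing to prove). Using claim (a), we now construct states $\sigma_n \in \mathcal{F}$ such that $D_{\mathcal{F}}(\rho_n) = D(\rho_n\|\sigma_n)$. Thanks to the fact that $\widetilde{\mathcal{F}}$ is weak*-compact and hence sequentially weak*-compact by Remark 1, we can extract a weak*-converging subsequence $\sigma_{n_k} \xrightarrow[k\to\infty]{w^*} \eta_* \in \widetilde{\mathcal{F}}$. Then
$$D_{\mathcal{F}}(\rho) \overset{\text{(i)}}{\leq} D(\rho\|\eta_*) \overset{\text{(ii)}}{\leq} \liminf_{k\to\infty} D(\rho_{n_k}\|\sigma_{n_k}) \overset{\text{(iii)}}{=} \lim_{n\to\infty} D_{\mathcal{F}}(\rho_n) , \tag{18}$$
where (i) holds due to (16), (ii) thanks to Lemma 4, and (iii) because the sequence $(D_{\mathcal{F}}(\rho_n))_{n\in\mathbb{N}}$ converges.
We now set out to prove claim (c). The convexity of $\Sigma_{\mathcal{F}}(\rho)$ follows immediately from that of $\mathcal{F}$, once we remember that the relative entropy is a jointly convex function and hence in particular convex in its second argument (cf. (6)). To prove the trace norm compactness, pick an arbitrary sequence $(\sigma_n)_{n\in\mathbb{N}}$ with $\sigma_n \in \Sigma_{\mathcal{F}}(\rho)$ for all $n$. Since $\sigma_n \in \widetilde{\mathcal{F}}$, where $\widetilde{\mathcal{F}}$ defined by (14) is weak*-compact and hence sequentially weak*-compact by Remark 1, we can extract a weak*-converging subsequence $(\sigma_{n_k})_{k\in\mathbb{N}}$, so that $\sigma_{n_k} \xrightarrow[k\to\infty]{w^*} \eta_* \in \widetilde{\mathcal{F}}$. Then
$$D_{\mathcal{F}}(\rho) \overset{\text{(i)}}{=} \liminf_{k\to\infty} D(\rho\|\sigma_{n_k}) \overset{\text{(ii)}}{\geq} D(\rho\|\eta_*) \overset{\text{(iii)}}{=} D(\rho\|\sigma_*) - \ln\operatorname{Tr}\eta_* + \operatorname{Tr}\eta_* - 1 \overset{\text{(iv)}}{\geq} D(\rho\|\sigma_*) , \tag{19}$$
where (i) is because $\sigma_n \in \Sigma_{\mathcal{F}}(\rho)$, (ii) holds thanks to the lower semi-continuity of the relative entropy (Lemma 4), and in (iii)-(iv) we introduced the state $\sigma_* := (\operatorname{Tr}\eta_*)^{-1}\eta_* \in \mathcal{F}$ and proceeded as in (17). Note that (ii) ensures that $\operatorname{Tr}\eta_* > 0$, otherwise we would have that $\eta_* = 0$ and hence $D(\rho\|\eta_*) = +\infty$. Since $\sigma_* \in \mathcal{F}$, the above inequality implies that $\sigma_* \in \Sigma_{\mathcal{F}}(\rho)$ and moreover $\operatorname{Tr}\eta_* = 1$, so that in fact $\eta_* = \sigma_* \in \Sigma_{\mathcal{F}}(\rho)$. To conclude, note that since the weak* and trace norm topologies coincide on the set of density operators (cf. the discussion in Section II A), from $\sigma_{n_k} \xrightarrow[k\to\infty]{w^*} \eta_* = \sigma_*$ we infer that $\sigma_{n_k} \xrightarrow[k\to\infty]{tn} \sigma_*$ with respect to the (much stronger) trace norm topology. Therefore, $\Sigma_{\mathcal{F}}(\rho)$ is trace norm compact.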
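As an illustrative sanity check of the rescaling step used in claim (a), the following minimal sketch verifies numerically that $D(\rho\|\lambda\sigma) = D(\rho\|\sigma) + f(\lambda)$ with $f(\lambda) = \lambda - 1 - \ln\lambda \geq 0$. It assumes the commuting (diagonal) special case, in which states reduce to probability vectors, and it assumes the convention $D(\rho\|\sigma) = \operatorname{Tr}\rho(\ln\rho - \ln\sigma) + \operatorname{Tr}\sigma - \operatorname{Tr}\rho$ for sub-normalised $\sigma$; the names `rel_ent`, `f`, `rho` and `sigma` are hypothetical and chosen only for this example.

```python
import math

def rel_ent(p, q):
    """Relative entropy of diagonal operators, with q possibly sub-normalised.

    Assumed convention: D(p||q) = sum_i p_i (ln p_i - ln q_i) + sum(q) - sum(p).
    """
    total = sum(q) - sum(p)
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue            # 0 ln 0 = 0
        if qi == 0.0:
            return math.inf     # support condition violated
        total += pi * (math.log(pi) - math.log(qi))
    return total

def f(lam):
    """f(lambda) = lambda - 1 - ln(lambda), with f(0) = +infinity."""
    return math.inf if lam == 0.0 else lam - 1.0 - math.log(lam)

rho = [0.5, 0.3, 0.2]      # illustrative state (diagonal)
sigma = [0.2, 0.2, 0.6]    # illustrative free state (diagonal)

for lam in (0.1, 0.5, 0.9, 1.0):
    lhs = rel_ent(rho, [lam * s for s in sigma])
    rhs = rel_ent(rho, sigma) + f(lam)
    assert abs(lhs - rhs) < 1e-12   # rescaling identity
    assert lhs >= rel_ent(rho, sigma)  # rescaling never helps
```

Since $f$ vanishes only at $\lambda = 1$, the sketch also makes it plausible why the minimiser over the sub-normalised set must in fact be normalised.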
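Finally, assume as in (d) that $\mathcal{F} \subseteq \mathcal{D}(\mathcal{H})$ is convex and that $\rho > 0$ is such that $D_{\mathcal{F}}(\rho) < \infty$. To show that the minimiser is unique, assume that there exist $\sigma_1, \sigma_2 \in \mathcal{F}$ such that $D(\rho\|\sigma_1) = D(\rho\|\sigma_2) = D_{\mathcal{F}}(\rho) < \infty$, and let us show that $\sigma_1 = \sigma_2$. Note that the states $\sigma_1$ and $\sigma_2$ are also faithful, because a necessary condition in order for $D(\rho\|\sigma)$ to be finite is that $\operatorname{supp}\rho \subseteq \operatorname{supp}\sigma$, where supp denotes the support, and in this case $\operatorname{supp}\rho = \mathcal{H}$. Set $\sigma_0 := \frac12(\sigma_1 + \sigma_2)$. Then the convexity of the relative entropy implies that
$$D_{\mathcal{F}}(\rho) \leq D(\rho\|\sigma_0) \leq \frac12 D(\rho\|\sigma_1) + \frac12 D(\rho\|\sigma_2) = D_{\mathcal{F}}(\rho) ,$$
and hence in particular
$$D(\rho\|\sigma_0) = \frac12 D(\rho\|\sigma_1) + \frac12 D(\rho\|\sigma_2) . \tag{20}$$
Since the relative entropy is strictly convex in each of the entries provided that they are faithful states [58,59], this can happen only if $\sigma_1 = \sigma_2$. To be more explicit, by a result of Petz [59, p. 130] we can infer from (20) that $\sigma_1^{it} = \sigma_2^{it}$ holds for all $t \in \mathbb{R}$. By applying the spectral theorem, it is easy to see that for given $|\psi\rangle \in \mathcal{H}$ and $\omega \in \mathcal{D}(\mathcal{H})$ we have that $|\langle\psi|\omega^{it}|\psi\rangle| = 1$ for all $t \in \mathbb{R}$ if and only if $|\psi\rangle$ is an eigenvector of $\omega$. Thus, each eigenvector of $\sigma_1$ must be an eigenvector of $\sigma_2$ as well, implying that $\sigma_1$ and $\sigma_2$ can be diagonalised simultaneously. Since for $p, q > 0$ it holds that $p^{it} = q^{it}$ for all $t \in \mathbb{R}$ if and only if $p = q$, it must be that $\sigma_1 = \sigma_2$, as claimed.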
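The mechanism behind the uniqueness claim can be illustrated numerically. The sketch below, which is only a sanity check and again assumes the commuting (diagonal) special case, shows that for a faithful $\rho$ and two distinct faithful $\sigma_1 \neq \sigma_2$ the midpoint has strictly smaller relative entropy than the average, so two distinct minimisers cannot coexist in a convex set. The names `rel_ent` and `midpoint_gap` and all numerical values are hypothetical.

```python
import math

def rel_ent(p, q):
    """D(p||q) = sum_i p_i ln(p_i / q_i) for strictly positive probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def midpoint_gap(rho, s1, s2):
    """Average of D(rho||s1), D(rho||s2) minus D(rho||midpoint); >= 0 by convexity."""
    mid = [(a + b) / 2.0 for a, b in zip(s1, s2)]
    return 0.5 * (rel_ent(rho, s1) + rel_ent(rho, s2)) - rel_ent(rho, mid)

rho = [0.5, 0.3, 0.2]       # faithful state (all entries positive)
sigma_1 = [0.6, 0.3, 0.1]
sigma_2 = [0.2, 0.3, 0.5]

gap = midpoint_gap(rho, sigma_1, sigma_2)
assert gap > 0.0                                       # strict for sigma_1 != sigma_2
assert abs(midpoint_gap(rho, sigma_1, sigma_1)) < 1e-15  # no gap when they coincide
```

In this classical case the strict inequality is just the strict concavity of the logarithm; the operator statement used in the proof requires the cited results of Petz.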
Remark 10. We note in passing that the conditions for equality in the joint convexity of the relative entropy (generalising (20)) have been studied thoroughly in the specialised literature. See e.g. [60,Theorem 8] or [61,Corollary 5.3].
Proof of Theorem 7. Let us break the argument into several elementary steps.
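(I) We start by showing that
$$\operatorname{cone}(\mathcal{F}) = \bigcap_{n\in\mathbb{N}} \mathcal{M}_n^{-1}\left(\operatorname{cone}(\mathcal{F})\right) . \tag{21}$$
(I.a) First, observe that since $\mathcal{M}_n(\operatorname{cone}(\mathcal{F})) \subseteq \operatorname{cone}(\mathcal{F})$ we have that
$$\operatorname{cone}(\mathcal{F}) \subseteq \mathcal{M}_n^{-1}\left(\mathcal{M}_n(\operatorname{cone}(\mathcal{F}))\right) \subseteq \mathcal{M}_n^{-1}\left(\operatorname{cone}(\mathcal{F})\right) . \tag{22}$$
This holds for all $n \in \mathbb{N}$, hence we infer that $\operatorname{cone}(\mathcal{F}) \subseteq \bigcap_{n\in\mathbb{N}} \mathcal{M}_n^{-1}(\operatorname{cone}(\mathcal{F}))$. (I.b) To show the converse inclusion, take a trace class operator $X \in \mathcal{T}(\mathcal{H})$, $X \neq 0$, such that $X \in \mathcal{M}_n^{-1}(\operatorname{cone}(\mathcal{F}))$ for all $n \in \mathbb{N}$. An important observation to make now is that the sequence of operator norms $(\|M_n\|_\infty)_{n\in\mathbb{N}}$ is bounded, i.e.
$$\sup_{n\in\mathbb{N}} \|M_n\|_\infty < \infty . \tag{23}$$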
This non-trivial fact follows from the uniform boundedness principle [37, Section 1.6.9] combined with the observation that sup n∈N M n |ψ < ∞ for all fixed |ψ ∈ H because M n |ψ − −− → n→∞ |ψ . An immediate consequence is that To see why this is the case, start by observing that the second identity follows from the first upon taking the trace. To prove the first, then, pick an ε > 0, and construct a finite-rank operator X such that we see that thanks to (23). Since ε > 0 is arbitrary, it must be that M n (X) − X 1 − −− → n→∞ 0, as claimed. This proves (24). Now, observe that M n (X) ∈ cone(F) is positive semi-definite for all n; since the cone of positive semi-definite operators is trace norm closed, we deduce that also X ≥ 0; since X = 0, it must be that Tr X > 0 and hence, by (24), we have that Tr M n (X) > 0 for sufficiently large n, too. For n large enough, define σ n . . = Mn(X) Tr Mn(X) and σ . . = X Tr X . Note that Remembering that σ n ∈ F because X ∈ M −1 n (cone(F)) and invoking the trace norm closedness of F, hypothesis (ii), we infer that σ ∈ F and therefore X ∈ cone(F), as claimed. This proves (21).
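The trace norm convergence $\|\mathcal{M}_n(X) - X\|_1 \to 0$ used in step (I.b) can be visualised in the simplest possible setting. The sketch below is an illustration under strong simplifying assumptions that are not part of the proof: $M_n$ is taken to be the projector onto the first $n$ basis vectors (so $\|M_n\|_\infty \leq 1$), $X$ is diagonal in the same basis, and the infinite-dimensional space is modelled by a large finite cut-off `DIM`. All names are hypothetical.

```python
import math

DIM = 2000  # finite cut-off standing in for an infinite-dimensional space (assumption)

# Diagonal trace class operator with eigenvalues proportional to 1/k^2.
x_diag = [6.0 / (math.pi ** 2 * (k + 1) ** 2) for k in range(DIM)]

def compression_error(x, n):
    """|| P_n X P_n - X ||_1 for diagonal X, with P_n the projector on the first n basis vectors."""
    return sum(abs(v) for v in x[n:])

errors = [compression_error(x_diag, n) for n in (10, 100, 1000)]

# The error is the tail of a convergent series, hence it decreases to zero.
assert errors[0] > errors[1] > errors[2]
```

For a general $X$ and general $M_n$ the argument in the text replaces this explicit tail estimate with a finite-rank approximation of $X$ combined with the uniform bound (23).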
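(II) We now argue that for all $n \in \mathbb{N}$ the map $\mathcal{M}_n$ is sequentially continuous with respect to the weak*-topology on the input space and the trace norm topology on the output. This means that for all sequences $(X_p)_{p\in\mathbb{N}}$ of trace class operators,
$$X_p \xrightarrow[p\to\infty]{w^*} X \quad \Longrightarrow \quad \mathcal{M}_n(X_p) \xrightarrow[p\to\infty]{tn} \mathcal{M}_n(X) . \tag{28}$$
To see why this is the case, notice first that
$$\|X\|_1 \leq \sup_{p\in\mathbb{N}} \|X_p\|_1 =: K < \infty . \tag{29}$$
The first inequality in (29) follows from the relation $\|X\|_1 \leq \liminf_{p\to\infty} \|X_p\|_1$, in turn a consequence of the general fact that the dual norm is lower semi-continuous with respect to the weak*-topology [37, Theorem 2.6.14].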
The second inequality, instead, is proved in a similar manner to (23). Namely: (i) the space of trace class operators is the dual to the Banach space of compact operators; and (ii) sup p∈N Tr X p S < ∞ for all compact S because Tr X p S − −− → p→∞ Tr XS, the uniform boundedness principle [37, Section 1.6.9] immediately implies (29). Now, for every ε > 0 consider a finite-rank ε-approximation of M n , i.e. an operator M n with rk M n < ∞ and Thus, from (30) we infer that Given that K < ∞ and M n ∞ < ∞ are fixed constants and that ε > 0 is arbitrary, we conclude that in fact M n (X p ) − M n (X) 1 − −− → p→∞ 0, finally proving (28). (III) The last preliminary observation we need to make is that cone(F) is trace norm closed thanks to hypothesis (ii).
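In fact, if for a sequence $(X_p)_{p\in\mathbb{N}}$ in $\operatorname{cone}(\mathcal{F})$ we have that $X_p \xrightarrow[p\to\infty]{tn} X \in \mathcal{T}(\mathcal{H})$, then either $X = 0$, hence there is nothing to prove, or else $\operatorname{Tr} X > 0$. In this latter case, since $\operatorname{Tr} X_p \xrightarrow[p\to\infty]{} \operatorname{Tr} X$ we have that $X_p / \operatorname{Tr} X_p \xrightarrow[p\to\infty]{tn} X / \operatorname{Tr} X$. Since the operators on the left-hand side (well defined for sufficiently large $p$) belong to $\mathcal{F}$ and this is trace norm closed, it must be that also $X/\operatorname{Tr} X \in \mathcal{F}$, i.e. $X \in \operatorname{cone}(\mathcal{F})$, as claimed.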
(IV) We now conclude the argument by proving that cone(F) is weak*-closed. Thanks to hypothesis (i), cone(F) is convex; also, we have seen in Section II A that the Banach space K(H) of compact operators on a separable Hilbert space H is itself separable. In this situation, an immediate corollary of the Krein-Šmulian theorem [ implies that also X ∈ cone(F). Using (21), it suffices to show that M n (X) ∈ cone(F) for all n ∈ N. This follows straightforwardly by combining the weak*to-trace-norm sequential continuity of M n , proved in point (II) above, and the trace norm closedness of cone(F), proved in (III).
We now set out to prove our final main result, Theorem 9. The key idea of the proof is contained in Lemma 13 below, which essentially tackles the simpler case where the relative entropy of resource is finite. Before presenting it, we will make sure that a result by Petz [27,Corollary 2] holds in a slightly more general sense than originally stated.
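Lemma 11. Let $\xi \in \mathcal{T}_+(\mathcal{H})$ be a positive semi-definite trace class operator, and let $X = X^\dagger \in \mathcal{B}(\mathcal{H})$ be bounded. Then
$$\ln \operatorname{Tr} e^{\ln\xi + X} = \sup_{\omega\in\mathcal{D}(\mathcal{H})} \left\{ \operatorname{Tr}\omega X - D(\omega\|\xi) \right\} + \operatorname{Tr}\xi - 1 , \tag{33}$$
where the left-hand side is interpreted as in (13) when $\xi$ is not strictly positive definite. In particular, for a fixed (but arbitrary) $X$ as above the function
$$\mathcal{T}_+(\mathcal{H}) \ni \xi \longmapsto \ln\operatorname{Tr} e^{\ln\xi + X} \tag{34}$$
is concave and monotonically non-decreasing.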
Proof of Lemma 11. The famous variational formula due to Petz [27], re-adapted to our notation, states that Tr ωX − ln Tr e ln ξ+X + Tr ξ − 1 (35) whenever the state ω ∈ D(H) and the operator ξ ∈ T + (H) are both faithful, i.e. ω, ξ > 0. These further assumptions, which will turn out to be superfluous, mean that we cannot use (35) directly in our proof. Fortunately, here we need only the 'easy' inequality contained in (35) Here, (i) is just (3) evaluated between two numbers, (ii) is the data processing inequality for the relative entropy [41, Proposition 5.1(ii)], and finally (iii) is Araki's identity [62, Theorem 3.10] (see also [41,Corollary 12.8] for how to remove the faithfulness assumption). Note that the equality in (iii) is trivially true (as +∞ = +∞) also when supp(ω) ⊆ supp(ξ) = supp e ln ξ+X . Now, from the above inequality we deduce that ln Tr e ln ξ+X ≥ Tr ωX − D (ω ξ) + Tr ξ − 1 .
This proves that the left-hand side of (33) is no smaller than the right-hand side. To establish the converse statement, one can observe that the inequality in (ii) of (36), and hence that in (37), is saturated for the choice ω = e ln ξ+X Tr e ln ξ+X , simply because D(ω c ω) = D(1 c) holds for all ω ∈ D(H) and c > 0. This concludes the proof of (33).
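The variational formula (33) and its maximiser can be checked numerically in the commuting case, where it reduces to the classical Gibbs variational principle. The following sketch is only an illustration: it assumes that $\xi$, $X$ and $\omega$ are simultaneously diagonal, it assumes the convention $D(\omega\|\xi) = \operatorname{Tr}\omega(\ln\omega - \ln\xi) + \operatorname{Tr}\xi - 1$ for sub-normalised $\xi$, and all function names and numbers are hypothetical.

```python
import math

def rel_ent(w, xi):
    """D(w||xi) for a probability vector w and a positive (sub-normalised) vector xi."""
    total = sum(xi) - 1.0
    for wi, ti in zip(w, xi):
        if wi > 0.0:
            total += wi * (math.log(wi) - math.log(ti))
    return total

def log_partition(xi, x):
    """ln Tr exp(ln xi + X) in the diagonal case."""
    return math.log(sum(t * math.exp(v) for t, v in zip(xi, x)))

def objective(w, xi, x):
    """Tr[w X] - D(w||xi) + Tr xi - 1, the quantity maximised over states w."""
    return sum(wi * v for wi, v in zip(w, x)) - rel_ent(w, xi) + sum(xi) - 1.0

def gibbs_state(xi, x):
    """Candidate maximiser w = exp(ln xi + X) / Tr exp(ln xi + X)."""
    weights = [t * math.exp(v) for t, v in zip(xi, x)]
    z = sum(weights)
    return [wt / z for wt in weights]

xi = [0.3, 0.1, 0.4]        # sub-normalised, Tr xi = 0.8
x = [0.5, -1.0, 2.0]        # diagonal of the bounded operator X

best = objective(gibbs_state(xi, x), xi, x)
assert abs(best - log_partition(xi, x)) < 1e-12      # the supremum is attained

for w in ([1.0, 0.0, 0.0], [0.2, 0.5, 0.3], [1 / 3, 1 / 3, 1 / 3]):
    assert objective(w, xi, x) <= log_partition(xi, x) + 1e-12
```

The check reproduces both halves of the argument: every state gives a lower bound on $\ln\operatorname{Tr}e^{\ln\xi+X}$, and the normalised exponential saturates it.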
We now prove that (34) is monotonically non-decreasing. This will follows from (33) once we show that for each ω ∈ D(H) the function ϕ ω : T + (H) → R ∪ {−∞} defined by ϕ ω (ξ) . . = Tr ξ − D(ω ξ) is non-decreasing. To prove this latter claim, pick ξ 1 ≤ ξ 2 ; we can take ξ 2 ≤ I without loss of generality, as otherwise we write ξ i = aξ i with a > 0 and ξ i ≤ I, and assuming that ϕ ω (ξ 1 ) ≤ ϕ ω (ξ 2 ) we deduce that When ξ 1 ≤ ξ 2 ≤ I, denoting with ω = i µ i |e i e i | a spectral decomposition of ω one can use (4) to write where the inequality comes from the operator monotonicity of the logarithm (in the sense of Schmüdgen [ Finally, the concavity of the function in (34) descends once again from (33) and from the generally valid fact that the supremum over a convex set of a jointly concave function (see (6)) is itself concave.
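Lemma 13. Let $\mathcal{C} \subseteq \mathcal{T}_+(\mathcal{H}) \cap B_1$ be a convex and weak*-compact subset of positive semi-definite trace class operators with trace at most 1. Let $\rho \in \mathcal{D}(\mathcal{H})$ be a state with finite entropy $S(\rho) < \infty$ and such that $D_{\mathcal{C}}(\rho) := \inf_{\xi\in\mathcal{C}} D(\rho\|\xi) < \infty$. Then
$$D_{\mathcal{C}}(\rho) = \sup_{X = X^\dagger \in \mathcal{B}(\mathcal{H})} \left\{ \operatorname{Tr}\rho X - \sup_{\xi\in\mathcal{C}} \left[ \ln\operatorname{Tr} e^{\ln\xi + X} - \operatorname{Tr}\xi + 1 \right] \right\} , \tag{40}$$
where again we use (13) to interpret the right-hand side when $\xi$ is not strictly positive definite.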
Proof. By the same reasoning we encountered in the proof of Theorem 5, since thanks to Lemma 4 the relative entropy is weak* lower semi-continuous in both arguments, and hence in particular with respect to the second, we can find 0 = ξ ∈ C such that D C (ρ) = D(ρ ξ). Now, consider an arbitrary ξ ∈ C and some λ ∈ [0, 1]. By convexity of C, we have that (1 − λ)ξ + λξ ∈ C, so that where the last equality follows from (5). Note also that the expression at the rightmost side is well defined and finite, because − Tr ρ ln ξ = D(ρ ξ) + S(ρ) + 1 − Tr ξ < ∞ and moreover thanks to the operator monotonicity of the logarithm.
We can now divide both sides of (41) by λ and take the limit λ → 0 + . To carry out this computation we use a well-known representation of the Fréchet differential of the operator logarithm reported (without proof) e.g. in [55,Lemma 3.4], obtaining that where and the integral on the right-hand side converges absolutely for X = ξ − ξ and X = ξ . Since we were not able to deduce from the existing literature a completely rigorous proof of (43) and (44) that works in the infinite-dimensional case as well, we provide one in Appendix A. We now leave (43) for a moment, and return to the definition of D C . By making the ansatz ω = ρ in 33, for every bounded X = X † ∈ B(H) and every ξ ∈ C we deduce that so that naturally Tr ρX − ln Tr e ln ξ +X + Tr ξ − 1 ≥ sup Tr ρX − ln Tr e ln ξ +X + Tr ξ − 1 = sup ln Tr e ln ξ +X − Tr ξ + 1 , where in the second line we have used the elementary fact that inf a∈A sup b∈B f (a, b) ≥ sup b∈B inf a∈A f (a, b) holds for an arbitrary function f : A × B → R on any product set A × B. This proves the first inequality needed to establish (40). 
As for the second, consider that ln Tr e ln ξ +X − Tr ξ + 1 ln Tr e ln ξ +ln(ρ+ 2 I)−ln(ξ+ I)+(Tr ξ−1)I − Tr ξ + 1 ln Tr e ln ξ +ln( Here, in (iv) we made the ansatz X = ln(ρ + 2 I) − ln(ξ + I) + (Tr ξ − 1)I, where later we take → 0 + ; (v) holds because denoting with ρ = i p i |e i e i | and ξ = j µ j |f j f j | the spectral decompositions of ρ and ξ, we have that by the dominated convergence theorem applied to each series (remember that S(ρ) < ∞ and also − Tr ρ ln ξ = D(ρ ξ) + S(ρ) + 1 − Tr ξ < ∞); (vi) follows from Lieb's three-operator inequality, proved in [28,Theorem 7] for the finite-dimensional case and extended to infinite dimensions in Lemma 43; in (vii) we noted that for a bounded where we used Tonelli's theorem to exchange the integrals; (viii) comes from observing that on the one hand Tr 1 ξ+ I ξ ≤ 1 Tr ξ , and on the other (50) in (ix) we leveraged (43); finally, in (x) we remembered that Tr ξ ∈ (0, 1] and Tr ξ ∈ [0, 1] (the case where Tr ξ = 0 and thus ξ = 0 is trivial and can be excluded) and observed that sup for all > 0 and a ∈ [0, 1]. This concludes the justification of (47) and thus the proof.
We are finally ready to present the proof of our last general result.
Proof of Theorem 9. Let ρ ∈ D(H) be a state with finite entropy S(ρ) < ∞. If D F (ρ) < ∞ then we can use directly Lemma 13 and complete the proof. Therefore, what we set out to do now is to devise an argument that tackles also the case where D F (ρ) = +∞. To this end, start by observing that the general inequality can be proved exactly as (46), whose derivation, in fact, does not rely on any further assumption. The problem is to establish the converse inequality. To this end, for an arbitrary parameter p ∈ (0, 1] let us construct the convex set where F = cone(F) ∩ B 1 is as in (14). Since F p is the sum of two weak*-compact sets, it is itself weak*-compact.
we see that ln Tr e ln ξ+X − Tr ξ + 1 = sup ln Tr e ln η+X − Tr η + 1 ln Tr e ln σ+X + ln λ − λ + 1 Here, (i) holds thanks to Lemma 13; in (ii) we exploited the concavity of the function ξ → ln Tr e ln ξ+X , as established by Lemma 11; and in (iii) we applied the Peierls-Bogoliubov inequality [64,Theorem 7]. Taking the limit p → 0 + we therefore deduce that Now, as in the proof of Lemma 13, by weak*-compactness for all p we can find some η p ∈ F such that D . By the same reason, when taking the limit p → 0 + we can assume without loss of generality that using the lower semi-continuity of the relative entropy, we see that Combining this with (56) yields that Together with (52), this proves (12).

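Remark 14. In the special case where $\mathcal{F} = \{\sigma\}$ contains a single state (and $S(\rho) < \infty$), Theorem 9 yields immediately the identity
$$D(\rho\|\sigma) = \sup_{X = X^\dagger \in \mathcal{B}(\mathcal{H})} \left\{ \operatorname{Tr}\rho X - \ln\operatorname{Tr} e^{\ln\sigma + X} \right\} . \tag{59}$$
This is naturally just Petz's variational formula [27], without any faithfulness assumption on either $\rho$ or $\sigma$. In this sense, Theorem 9 can also be seen as a generalisation of Petz's result.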
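Remark 15. One could wonder whether there may be a more direct approach to prove Theorem 9. Assuming that we have established (59), or equivalently (35), we could plug this into (16), obtaining
$$D_{\mathcal{F}}(\rho) = \inf_{\eta\in\widetilde{\mathcal{F}}}\ \sup_{X = X^\dagger \in \mathcal{B}(\mathcal{H})} \left\{ \operatorname{Tr}\rho X - \ln\operatorname{Tr} e^{\ln\eta + X} + \operatorname{Tr}\eta - 1 \right\} . \tag{60}$$
Now, knowledge of [25] would suggest that we try to exchange the supremum and infimum using Sion's theorem. Note that the weak*-topology makes $\widetilde{\mathcal{F}}$ compact, which is encouraging. Since the other conditions can be shown to be met, we need only ask ourselves whether $\widetilde{\mathcal{F}} \ni \eta \mapsto f_X(\eta) := -\ln\operatorname{Tr} e^{\ln\eta + X} + \operatorname{Tr}\eta - 1$ is lower semi-continuous for all fixed $X = X^\dagger \in \mathcal{B}(\mathcal{H})$. Unfortunately, that is not the case. To see why, it suffices to take $X = 0$ and compute $f_0(\eta) = -\ln\operatorname{Tr}\eta + \operatorname{Tr}\eta - 1$. Evaluating this on a sequence $(\eta_n)_{n\in\mathbb{N}}$ with constant trace $\operatorname{Tr}\eta_n \equiv 1$ but such that $\eta_n \xrightarrow[n\to\infty]{w^*} 0$ (such as the one constructed in Section II A) gives $f_0(\eta_n) = -\ln 1 + 1 - 1 = 0$ for all $n$, and hence $\lim_{n\to\infty} f_0(\eta_n) = 0 < \infty = f_0(0)$, which violates lower semi-continuity at $\eta = 0$.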
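The failure of lower semi-continuity described in Remark 15 can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the weak*-null sequence is $\eta_n = |n\rangle\langle n|$ in some orthonormal basis, and it tests the weak* convergence only against one diagonal compact operator $S$ with eigenvalues $1/(k+1)$; the names `f0`, `pairing` and `compact_diag` are hypothetical.

```python
import math

def f0(trace_eta):
    """f_0(eta) = -ln Tr eta + Tr eta - 1, with the convention f_0(0) = +infinity."""
    if trace_eta == 0.0:
        return math.inf
    return -math.log(trace_eta) + trace_eta - 1.0

def pairing(n, s_diag):
    """Tr[eta_n S] for eta_n = |n><n| and a diagonal operator S with eigenvalues s_diag(k)."""
    return s_diag(n)

def compact_diag(k):
    """Eigenvalues of a diagonal compact operator (they tend to zero)."""
    return 1.0 / (k + 1)

pairings = [pairing(n, compact_diag) for n in (1, 10, 100, 1000)]
values = [f0(1.0) for _ in pairings]   # Tr eta_n = 1 for every n

assert pairings[0] > pairings[-1]      # Tr[eta_n S] decreases towards 0
assert all(v == 0.0 for v in values)   # f_0 is identically 0 along the sequence
assert f0(0.0) == math.inf             # but jumps to +infinity at the weak* limit
```

The function value at the limit point exceeds the limit of the function values, which is exactly what lower semi-continuity forbids.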

C. On the continuity of the relative entropy of resource
The lower semi-continuity of D F implies the following sufficient conditions for its local continuity.
holds provided that one of the following conditions is valid: (a) ρ n = Φ n (ρ)/ Tr Φ n (ρ), where Φ n is a positive trace-non-increasing linear transformation of T(H) such that for each n, and Tr Φ n (ρ) − −− → n→∞ 1; (b) c n ρ n ≤ σ n for all n, where c n − −− → n→∞ 1 are real numbers, and (σ n ) n∈N is a sequence of states in D(H) such that Proof. Let us prove the claims one at a time.
(a) A previous result by one of us [65, Lemma 1] implies that (Tr Φ n (ρ)) D F (ρ n ) ≤ D F (ρ) for all n. So, in this case (61) follows directly from Theorem 5(b).
(b) For an arbitrary set F the function D F satisfies the inequality valid for any states ρ and σ in D(H) and any p ∈ (0, 1), where is the binary entropy. Inequality (64) Now, we argue that under the hypotheses in (b) we have that σ n tn − −− → n→∞ ρ. In fact, since σ n − ρ n = σ n − c n ρ n + (c n − 1) ρ n , using the triangle inequality and the fact that X 1 = Tr X for X ≥ 0 we arrive at σ n − ρ n 1 ≤ 2 |c n − 1|. The right-hand side tends to 0 by hypothesis; hence, so does the left-hand side. The condition c n ρ n ≤ σ n and inequality (64) also show that since σ n = (1 − c n ) σn−cnρn 1−cn + c n ρ n , we have that It follows that This relation and Theorem 5(b) imply (61).
for any sequence (ρ n ) n∈N converging to an arbitrary state ρ provided that c n ρ n ≤ ρ for all n, where (c n ) n∈N is a sequence of positive numbers tending to 1. 4 This holds, in particular, for the sequence (ρ n ) n∈N of finite-rank states obtained by truncation of the spectral decomposition of ρ. We deduce the following: Corollary 17. The function D F is completely determined by its values on the set of finite-rank states in D(H).
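The truncation argument behind Corollary 17 can be sketched numerically. In the minimal example below, which assumes a state that is diagonal with a geometric spectrum cut off at 60 levels (an assumption made only to keep the example finite), the truncated states $\rho_n$ satisfy $c_n\rho_n \leq \rho$ with $c_n = \operatorname{Tr} P_n\rho \to 1$, and their trace distance from $\rho$ equals $2(1 - c_n)$. The names `truncate` and `trace_norm_diff` are hypothetical.

```python
def truncate(spectrum, n):
    """Keep the n largest eigenvalues (spectrum assumed sorted) and renormalise.

    Returns (c_n, rho_n) with c_n = Tr[P_n rho] and rho_n = P_n rho P_n / c_n.
    """
    c_n = sum(spectrum[:n])
    rho_n = [p / c_n for p in spectrum[:n]] + [0.0] * (len(spectrum) - n)
    return c_n, rho_n

def trace_norm_diff(a, b):
    """|| a - b ||_1 for diagonal operators."""
    return sum(abs(x - y) for x, y in zip(a, b))

raw = [2.0 ** -(k + 1) for k in range(60)]   # geometric spectrum (illustrative)
total = sum(raw)
rho = [p / total for p in raw]

for n in (1, 5, 20, 40):
    c_n, rho_n = truncate(rho, n)
    # Operator inequality c_n rho_n <= rho, checked entrywise in the diagonal case.
    assert all(c_n * a <= b + 1e-15 for a, b in zip(rho_n, rho))
    # Trace norm convergence: || rho_n - rho ||_1 = 2 (1 - c_n).
    assert abs(trace_norm_diff(rho_n, rho) - 2.0 * (1.0 - c_n)) < 1e-12
```

Since $c_n \to 1$, the hypotheses on the sequence are met, which is why the values of $D_{\mathcal{F}}$ on finite-rank states already determine it everywhere.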

D. On the continuity of the minimiser(s)
Let F be a convex subset of D(H) such that cone(F) is weak*-closed. For any state ρ denote by Σ F (ρ) the minimiser set of ρ defined in Theorem 5, i.e. the set of all states σ ∈ F such that D F (ρ) = D (ρ σ). By Theorem 5, Σ F (ρ) is always nonempty, convex, trace norm compact, and consists of a single state if ρ is faithful.
Here we will push forward the investigation of the properties of the function ρ → Σ F (ρ) by considering its continuity in a neighbourhood of a faithful state. A variation on the argument used in the proof of Theorem 5 yields the following result.
Proposition 18. Let F ⊆ D(H) be convex and such that cone(F) is weak*-closed. Let (ρ n ) n∈N be any sequence of states in D(H) converging to a faithful state ρ in trace norm, i.e. ρ n Then any sequence (σ n ) n∈N such that σ n ∈ Σ F (ρ n ) for all n converges in trace norm to the unique state in Σ F (ρ).
Proof. We can assume without loss of generality that D F (ρ n ) < ∞ for all n ∈ N. Proceeding by contradiction, and calling σ the unique state in Σ F (ρ), we can also posit, up to selecting a subsequence of (σ n ) n∈N , that lim inf n→∞ σ n − σ 1 > 0. Since the set F = cone(F) ∩ B 1 defined by (14) is weak*-compact and therefore sequentially weak*-compact (see Remark 1), we can extract from the sequence (σ n ) n∈N a weak*-converging sub-sequence (σ n k ) k∈N , so that σ n k w * − −− → k→∞ η * ∈ F. In analogy with (19), we now have that where (i) is thanks to (69), (ii)-(iv) are justified as the corresponding inequalities in (19), and in (iii) we introduced the state σ * . . = (Tr η * ) −1 η * ∈ F. Note that Tr η * > 0 thanks to (ii). The above relation together with the faithfulness of ρ guarantees that σ * = σ is the unique state in Σ F (ρ), and moreover Tr η * = 1, so that in fact η * = σ * = σ. Since weak* and trace norm topology coincide on the set of density operators (see Section II A), from σ n k w * − −− → k→∞ η * we infer that σ n k tn − −− → k→∞ σ. Hence, lim inf n→∞ σ n − σ 1 = 0, and we have reached a contradiction. Remark 19. If the limit state ρ in Proposition 18 is not faithful, the arguments from the above proof show that the sequence (σ n ) n∈N is relatively compact with respect to the trace norm and all the limit points of this sequence are contained in Σ F (ρ).
Here, the closure is taken with respect to the trace norm topology. As it turns out, a state σ AB is separable if and only if it can be decomposed as σ AB = |ψ ψ| A ⊗ |φ φ| B dµ(ψ, φ) for some Borel probability measure µ defined on the product of the sets of local normalised pure states [19]. A state that is not separable is called entangled.
The relative entropy of entanglement is nothing but the relative entropy of resource associated with the set of free states $\mathcal{S}_{AB}$ defined by (71). In formula, it is defined by [13]
$$E_R(\rho_{AB}) := \inf_{\sigma_{AB}\in\mathcal{S}_{AB}} D(\rho_{AB}\|\sigma_{AB}) . \tag{72}$$
Its central importance in entanglement theory stems from the fact that its regularisation bounds from above the distillable entanglement and from below the entanglement cost [14][15][16][17].
To set the stage for the application of Theorem 5, we need to ask ourselves whether the cone generated by separable states is weak*-closed. This has been proved already in Ref. [66, Lemma 25]; an alternative and significantly more general proof resting on Theorem 7 will be presented below (cf. Corollary 22). A straightforward application of Theorems 5 and 9 then yields:
Corollary 20. For an arbitrary bipartite system with separable Hilbert space $\mathcal{H}_{AB}$, the relative entropy of entanglement is: (a) always achieved, meaning that for all $\rho_{AB} \in \mathcal{D}(\mathcal{H}_{AB})$ there exists a state $\sigma_{AB} \in \mathcal{S}_{AB}$ such that $E_R(\rho_{AB}) = D(\rho_{AB}\|\sigma_{AB})$; and (b) lower semi-continuous with respect to the trace norm topology.
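Moreover, for every state $\rho = \rho_{AB} \in \mathcal{D}(\mathcal{H}_{AB})$ with finite entropy $S(\rho) < \infty$, it holds that
$$E_R(\rho) = \sup_{X = X^\dagger \in \mathcal{B}(\mathcal{H}_{AB})} \left\{ \operatorname{Tr}\rho X - \sup_{\sigma\in\mathcal{S}_{AB}} \ln\operatorname{Tr} e^{\ln\sigma + X} \right\} . \tag{73}$$
The following example shows that the state $\sigma_{AB} \in \mathcal{S}_{AB}$ such that $E_R(\rho_{AB}) = D(\rho_{AB}\|\sigma_{AB})$ may not be unique, even in the finite-dimensional case.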
we see that Since $\sigma^1_{AB} \neq \sigma^2_{AB}$, the relative entropy of entanglement is not uniquely achieved in this case.

B. Relative entropy of multi-partite entanglement
The above Corollary 20 can be generalised to the multi-partite setting. To do so, let us start by fixing some terminology. We follow the particularly clear exposition of Szalay [67]. For a given positive integer $m$, representing the total number of parties, let $[m] := \{1, \ldots, m\}$. Of particular interest to us are the partitions of this set. A partition $\pi = (\pi(j))_j$ is a finite collection of non-empty sets $\pi(j) \subseteq [m]$ that do not intersect, i.e. $\pi(j) \cap \pi(j') = \emptyset$ for all $j \neq j'$, and together cover the whole $[m]$, i.e. $\bigcup_j \pi(j) = [m]$. We will denote the set of all partitions of $[m]$ with $P(m)$.
In the context of multi-partite entanglement, a partition π ∈ P (m) can represent the allowed quantum interactions between some systems A 1 , . . . , A m : two parties A , A can exchange quantum messages and thus establish entanglement if and only if and belong to the same element of the partition, i.e. , ∈ π(j) for some j. One can however imagine a setting where not one but several partitions are allowed in this sense. Let π = {π k } k ⊆ P (m) be the (non-empty) set of allowed partitions. We can then define the set of π-separable states by [67] S π A1...Am Here, A π k (j) is the system obtained by joining those A i such that i ∈ π k (j), and we used the shorthand notation The notions of multi-partite separability most commonly employed in the literature correspond to special choices of π in (76). Namely, we can pick π to be: • The set containing only the finest partition {{1}, . . . , {m}}: the states obtained in (76) are then called totally (or fully) separable.
• More generally, the set of partitions of [m] into at least k subsets: the corresponding states in (76) are usually referred to as k-separable [13,67,68] (of particular interest is the case of bi-separability).
• Taking a different angle, we can consider also the set of partitions of [m] involving subsets of at most k elements: the states obtained in (76) are then called k-producible [67,69].
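The three families of partitions listed above are easy to enumerate for small $m$. The following sketch generates all partitions of $[m]$ and filters them according to the two criteria mentioned in the list: at least $k$ blocks ($k$-separability) and blocks of at most $k$ elements ($k$-producibility). The function names are hypothetical and chosen only for this illustration.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks (lists)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Insert `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or let it form a new block on its own.
        yield [[first]] + partition

def k_separable_partitions(m, k):
    """Partitions of [m] into at least k subsets."""
    return [p for p in set_partitions(list(range(1, m + 1))) if len(p) >= k]

def k_producible_partitions(m, k):
    """Partitions of [m] whose subsets have at most k elements."""
    return [p for p in set_partitions(list(range(1, m + 1)))
            if max(len(block) for block in p) <= k]

# Three parties: 5 partitions in total, 4 of them with at least two blocks.
assert len(list(set_partitions([1, 2, 3]))) == 5
assert len(k_separable_partitions(3, 2)) == 4
# The finest partition is the only 1-producible one (total separability).
assert k_producible_partitions(3, 1) == [[[1], [2], [3]]]
```

For $m = 3$ and $k = 2$ the first filter returns the partitions relevant to bi-separability, while $k = m$ in the first filter or $k = 1$ in the second singles out the finest partition.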
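Let $\rho_{A_1\ldots A_m}$ be a state of an $m$-partite quantum system $A_1 \ldots A_m$. For a generic non-empty $\pi \subseteq P(m)$, we can define its relative entropy of $\pi$-entanglement by
$$E_{R,\pi}(\rho_{A_1\ldots A_m}) := \inf_{\sigma \in \mathcal{S}^{\pi}_{A_1\ldots A_m}} D(\rho_{A_1\ldots A_m}\|\sigma_{A_1\ldots A_m}) . \tag{77}$$
Special cases of the quantity in (77) have been considered by several authors in different contexts [21,70,71,72,73,74].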
To apply Theorems 5 and 9 to the relative entropy of π-entanglement we need to first extend the result of Ref. [66,Lemma 25] on the weak*-closedness of the cone of bipartite separable states. This can be done thanks to a swift application of Theorem 7.
Corollary 22. For a positive integer m, let π ⊆ P (m) be a non-empty subset of partitions of [m]. Then, for arbitrary separable Hilbert spaces H A1 , . . . , H Am , the cone generated by the set of π-separable states (76) is weak*-closed.
In particular, the cones generated by k-separable and k-producible states are weak*-closed, for all positive integers k ≤ m. A and M n (·) = M n (·)M † n . Note that M n is of finite rank and hence compact for all n. Also, using the fact that the coefficients of any multi-partite pure state with respect to the product basis m =1 |k A k1,...,km∈N form a square-summable sequence, one sees that (M n ) n∈N converges to the identity in the strong operator topology in the sense explained in the statement of Theorem 7. Since it is straightforward to verify that M n (cone(F)) ⊆ cone(F) for F = S π A1...Am , and given that this set is trace norm closed and convex by construction, we can apply Theorem 7 and conclude.
We are now ready to apply Theorems 5 and 9.
and extended by linearity and continuity to the whole $\mathcal{T}(\mathcal{H}_{AB})$. The set of PPT states on $AB$ is defined by
$$\mathrm{PPT}_{AB} := \left\{ \sigma_{AB} \in \mathcal{D}(\mathcal{H}_{AB}) : \ \sigma_{AB}^\Gamma \geq 0 \right\} . \tag{80}$$
PPT entangled states, i.e. PPT states that are not separable, have been the first (and, to date, only [75]) examples of entangled states that are undistillable, meaning that their distillable entanglement vanishes [76][77][78]. This is not to say that they are easy to generate; in fact, they have positive entanglement cost [79], become distillable if the set of protocols available is enlarged to include all 'PPT operations' [80], and can even have an almost maximal 'Schmidt number' [81][82][83].
The relative entropy of NPT entanglement of an arbitrary state $\rho_{AB} \in \mathcal{D}(\mathcal{H}_{AB})$ is simply the relative entropy of resource associated to the set (80), in formula [84,85]
$$E_{R,\mathrm{PPT}}(\rho_{AB}) := \inf_{\sigma_{AB}\in\mathrm{PPT}_{AB}} D(\rho_{AB}\|\sigma_{AB}) . \tag{81}$$
This quantity is a generally sharper upper bound to the distillable entanglement than the standard relative entropy of entanglement [84], and it has the advantage of being often easier to compute, even upon regularisation [85]. The cone generated by PPT states turns out to obey the hypotheses of Theorem 5.
Proof. This can be seen as a corollary to Theorem 7. However, it is even easier to give a direct proof of this fact. For an arbitrary such Y , by definition of Γ we see that also Y Γ has a finite expansion and is thus compact. We deduce straight away that the cone of operators X with X Γ ≥ 0 is weak*-closed. Taking the intersection with the cone of positive semi-definite trace class operators, which is also weak*-closed, we obtain precisely cone(PPT AB ), which is then weak*-closed as well.
Thanks to Lemma 24 we can immediately apply Theorems 5 and 9, which give:
Corollary 25. For an arbitrary bipartite system with separable Hilbert space, the relative entropy of NPT entanglement is: (a) always achieved, meaning that for all $\rho_{AB} \in \mathcal{D}(\mathcal{H}_{AB})$ there exists a state $\sigma_{AB} \in \mathrm{PPT}_{AB}$ such that $E_{R,\mathrm{PPT}}(\rho_{AB}) = D(\rho_{AB}\|\sigma_{AB})$; and (b) lower semi-continuous with respect to the trace norm topology.
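To make the PPT condition concrete, here is a minimal two-qubit sketch. It assumes that $\Gamma$ acts as the transposition on the second subsystem in the computational basis (the choice of subsystem does not affect positivity), and it certifies that the maximally entangled state is not PPT by exhibiting a vector with negative expectation value on its partial transpose. The helper names are hypothetical.

```python
import math

def partial_transpose(rho, d_a, d_b):
    """Partial transpose on B: <i k| X^Gamma |j l> = <i l| X |j k>."""
    dim = d_a * d_b
    out = [[0.0] * dim for _ in range(dim)]
    for i in range(d_a):
        for k in range(d_b):
            for j in range(d_a):
                for l in range(d_b):
                    out[i * d_b + k][j * d_b + l] = rho[i * d_b + l][j * d_b + k]
    return out

def expectation(vec, op):
    """<v| X |v> for a real vector v and a real matrix X."""
    n = len(vec)
    return sum(vec[a] * op[a][b] * vec[b] for a in range(n) for b in range(n))

s = 1.0 / math.sqrt(2.0)
phi_plus = [s, 0.0, 0.0, s]                              # (|00> + |11>)/sqrt(2)
bell = [[a * b for b in phi_plus] for a in phi_plus]     # projector onto phi_plus
singlet = [0.0, s, -s, 0.0]                              # (|01> - |10>)/sqrt(2)

bell_pt = partial_transpose(bell, 2, 2)

# A negative expectation value shows that the partial transpose is not
# positive semi-definite, so the maximally entangled state is not PPT.
assert expectation(singlet, bell_pt) < 0.0
```

A product state such as $|00\rangle\langle 00|$ is left unchanged by `partial_transpose`, in agreement with the fact that separable states are PPT.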
As it turns out, one can modify (81) to give an even better upper bound to the distillable entanglement, namely, the Rains bound. For an arbitrary bipartite state $\rho_{AB}$, this is defined by [84,86,87,88]
$$R(\rho_{AB}) := \inf_{\sigma_{AB} \in \mathcal{T}_+(\mathcal{H}_{AB}),\ \|\sigma_{AB}^\Gamma\|_1 \leq 1} \left\{ D(\rho_{AB}\|\sigma_{AB}) - \operatorname{Tr}\sigma_{AB} + 1 \right\} . \tag{83}$$
Note that in the right-hand side of (83) the operator $\sigma_{AB}$ need not be normalised, although it is implicitly assumed to be of trace class. One can anyway see that it must be at least sub-normalised, meaning that $\operatorname{Tr}\sigma_{AB} = \operatorname{Tr}\sigma_{AB}^\Gamma \leq \|\sigma_{AB}^\Gamma\|_1 \leq 1$. The Rains bound is not of the general form in (9). However, it is sufficiently similar that we can hope to employ analogous techniques to show that it is achieved and lower semi-continuous. In fact, the set of operators $\sigma_{AB}$ over which the optimisation in (83) runs can be shown to be weak*-compact. And yet, remarkably, we cannot deduce a statement similar to Theorem 5 in this case, because the function to be minimised in (83) fails to be lower semi-continuous, due to the negative trace term.

D. Relative entropy of non-classicality, Wigner non-positivity, and generalisations thereof
We now move on to resource theories specific to continuous variable systems. An $m$-mode continuous variable system [52,89,90,91] is just a finite collection of $m$ harmonic oscillators with canonical operators $x_1, p_1, \ldots, x_m, p_m$. These obey the canonical commutation relations $[x_j, x_k] = 0 = [p_j, p_k]$ and $[x_j, p_k] = i\delta_{jk} I$ on the underlying Hilbert space $\mathcal{H}_m := L^2(\mathbb{R}^m)$ of square-integrable complex-valued functions on $\mathbb{R}^m$. Defining the 'vector of operators' $r := (x_1, \ldots, x_m, p_1, \ldots, p_m)^\intercal$, we can rephrase them in the matrix form
$$[r, r^\intercal] = i\,\Omega , \qquad \Omega := \begin{pmatrix} 0 & I_m \\ -I_m & 0 \end{pmatrix} . \tag{84}$$
For a vector $\xi \in \mathbb{R}^{2m}$, the associated Weyl operator is the unitary defined by $W_\xi := e^{-i\xi^\intercal\Omega r}$, while the corresponding coherent state is [92]
$$|\xi\rangle := W_\xi |0\rangle , \tag{85}$$
with $|0\rangle$ being the vacuum state. The operators $W_\xi$ satisfy the important identity
$$W_\xi W_\eta = e^{-\frac{i}{2}\xi^\intercal\Omega\eta}\, W_{\xi+\eta} , \tag{86}$$
known as the Weyl form of the canonical commutation relations.
Taking the trace against $W_\xi$ yields a representation of any trace class operator $T \in \mathcal{T}(\mathcal{H}_m)$ as a 'characteristic function'. Here we are especially interested in a slight generalisation of this notion. For a parameter $\lambda \in [-1, 1]$, let us define the $\lambda$-ordered characteristic function $\chi_{T,\lambda}$ of an operator $T \in \mathcal{T}(\mathcal{H}_m)$ by
$$\chi_{T,\lambda}(\xi) := \operatorname{Tr}\left[T\, W_\xi\right] e^{-\frac{\lambda}{4}\|\xi\|^2} . \tag{87}$$
An important property of $\chi_{T,\lambda}$ is that it characterises the trace class operator $T$ completely: namely, $\chi_{T_1,\lambda}(\xi) = \chi_{T_2,\lambda}(\xi)$ for all $\xi$ if and only if $T_1 = T_2$ [89, p. 199].
If λ ≥ 0, the λ-ordered characteristic function is guaranteed to be square-integrable; however, it may or may not be such if λ < 0. In any case, if χ T,λ happens to be square-integrable then its Fourier transform gives the λ-quasi-probability distribution of T , in formula If T = ρ is a quantum state and the function W ρ,λ is well defined, it is not difficult to show that it is also realvalued. It is usually referred to as a quasi-probability distribution because in general it can take on negative values. In fact, the negativity of W ρ,λ (for λ < 1) is an unmistakable signature of quantumness. Accordingly, it turns out that substantial physical insight can be gained by considering the family of resource theories whose free states are those with a non-negative W ρ,λ . Defining this notion rigorously may appear problematic at first sight, because for a given ρ ∈ D(H m ) the object W ρ,λ may be ill-defined as a function -this is the case if χ ρ,λ in (87) is not square-integrable -and thus it may be unclear how to check that W ρ,λ ≥ 0. While this problem can be overcome by resorting to the theory of distributions, we prefer to take a simpler route here. All we need is the following notion.
Definition 26 [93, Chapter 5]. A function $f : \mathbb{R}^s \to \mathbb{C}$ is called positive definite if for all positive integers $n \in \mathbb{N}_+$ and all choices of $\xi_1, \ldots, \xi_n \in \mathbb{R}^s$, it holds that
$$\sum_{\mu,\nu=1}^{n} f(\xi_\mu - \xi_\nu)\, |\mu\rangle\langle\nu| \geq 0 , \tag{89}$$
i.e. if the $n \times n$ complex matrix on the left-hand side turns out to be positive semi-definite. Here, $|\mu\rangle$ denotes the $\mu$-th vector of the canonical basis of $\mathbb{C}^n$.
The connection between this notion and that of non-negativity of the λ-quasi-probability distribution is captured by Bochner's theorem [94, Theorem 1.8.9]: a function f : R s → C is the Fourier transform of a measure on R s if and only if it is continuous at 0 and positive definite. In this case, the measure is in fact a probability measure if and only if f (0) = 1.
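The notion of positive definiteness for functions can be probed numerically by building the matrix $[f(\xi_\mu - \xi_\nu)]_{\mu\nu}$ for a few points and evaluating its quadratic form. The sketch below is a one-dimensional toy illustration, restricted to real-valued even functions and real test vectors (for complex-valued $f$ one would need complex conjugates). It contrasts a Gaussian, which is the Fourier transform of a Gaussian measure, with a box function, which is not positive definite. All names and numerical choices are hypothetical.

```python
import math

def gram(f, points):
    """Matrix with entries f(xi_mu - xi_nu)."""
    return [[f(a - b) for b in points] for a in points]

def quad_form(matrix, c):
    """c^T M c for a real vector c."""
    n = len(c)
    return sum(c[i] * matrix[i][j] * c[j] for i in range(n) for j in range(n))

def gaussian(t):
    return math.exp(-t * t / 4.0)

def box(t):
    return 1.0 if abs(t) <= 1.0 else 0.0

points = [0.0, 1.0, 2.0]
witness = [1.0, -math.sqrt(2.0), 1.0]

# The Gaussian passes the test on this witness (and on every other vector).
assert quad_form(gram(gaussian, points), witness) > 0.0
# The box function fails: its matrix has a negative direction, so by Bochner's
# theorem it cannot be the Fourier transform of a (positive) measure.
assert quad_form(gram(box, points), witness) < 0.0
```

A single negative value of the quadratic form suffices to rule out positive definiteness, whereas confirming it requires control over all $n$ and all point configurations, which is what Bochner's theorem provides.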
We are now ready to define rigorously our object of interest.
where χ σ,λ is given by (87), and the notion of positive definiteness for functions is explained in Definition 26.

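Remark 28. An alternative way to rephrase (90) is as follows: $\sigma \in \mathcal{P}_{m,\lambda}$ if and only if
$$\operatorname{Tr}\Big[ \sigma \sum_{\mu,\nu=1}^n c_\mu^* c_\nu\, e^{-\frac{\lambda}{4} \|\xi_\mu - \xi_\nu\|^2}\, W_{\xi_\mu - \xi_\nu} \Big] \ge 0 \qquad (91)$$
for all positive integers $n$, all $\xi_1, \ldots, \xi_n \in \mathbb{R}^{2m}$, and all $c_1, \ldots, c_n \in \mathbb{C}$.
Remark 29. Thanks to the aforementioned Bochner's theorem, $\mathcal{P}_{m,\lambda}$ (respectively, $\mathrm{cone}(\mathcal{P}_{m,\lambda})$) is easily seen to contain precisely those quantum states (respectively, positive trace class operators) $\sigma$ for which $\chi_{\sigma,\lambda}$ is the Fourier transform of a probability measure (respectively, a measure) on $\mathbb{R}^{2m}$. In fact, due to the strong continuity of the map $\mathbb{R}^{2m} \ni \xi \mapsto W_\xi$, in turn a consequence of Stone's theorem [95, Theorem 10.15], the $\lambda$-ordered characteristic function is always continuous, and moreover it satisfies $\chi_{\sigma,\lambda}(0) = \operatorname{Tr} \sigma$, which equals 1 when $\sigma$ is a state.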
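Three special cases of particular physical significance, corresponding to as many values of $\lambda$, are as follows:
1. $\lambda = 1$: $W_{\rho,1}$ is called the Husimi function of $\rho$ [96]; it has the special representation $W_{\rho,1}(u) \propto \langle u|\rho|u\rangle$, where $|u\rangle$ is a coherent state (85). Hence, $W_{\rho,1} \ge 0$ for all quantum states, i.e. $\mathcal{P}_{m,1} = \mathcal{D}(\mathcal{H}_m)$.
2. $\lambda = 0$: $W_{\rho,0}$ is the Wigner function of $\rho$.
3. $\lambda = -1$: $W_{\rho,-1}$ is referred to as the Glauber-Sudarshan $P$-function of $\rho$ [104,105]. The set $\mathcal{P}_{m,-1}$ in this case can be described more compactly as the (trace norm) closed convex hull of the set of coherent states (85), in formula [46] $\mathcal{P}_{m,-1} = \mathrm{cl}_{\mathrm{tn}}\, \mathrm{conv} \left\{ |u\rangle\langle u| : u \in \mathbb{R}^{2m} \right\}$.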
In general, we define the relative entropy of $\lambda$-negativity of an arbitrary $m$-mode state $\rho \in \mathcal{D}(\mathcal{H}_m)$ as the relative entropy of resource associated to the set $\mathcal{P}_{m,\lambda}$, i.e. as $D_{\mathcal{P}_{m,\lambda}}(\rho) = \inf_{\sigma \in \mathcal{P}_{m,\lambda}} D(\rho\|\sigma)$. For the special case $\lambda = -1$, this quantity has been considered in [48], where it is employed to give upper bounds to transformation rates in the resource theory of non-classicality. In order to apply Theorems 5 and 9 to the relative entropy of $\lambda$-negativity, we must first prove the weak*-closedness of the cone generated by $\mathcal{P}_{m,\lambda}$. This has been established in Ref. [48, Lemma 38] for the case $\lambda = -1$, but all other cases (except for the trivial one $\lambda = 1$) have not been studied elsewhere, to the best of our knowledge. In order to apply Theorem 7 to the case at hand, we need a preliminary result. Let us fix some terminology first. Denote with $N$ the 'total photon number' Hamiltonian (95) on $\mathcal{H}_m = L^2(\mathbb{R}^m)$. This can be diagonalised as
$$N = \sum_{k_1, \ldots, k_m \in \mathbb{N}} (k_1 + \ldots + k_m)\, |k_1\rangle\langle k_1|_1 \otimes \ldots \otimes |k_m\rangle\langle k_m|_m ,$$
where $|k\rangle_j$ is the $k$-th 'Fock state' on the $j$-th mode. Accordingly, for $\eta \in [-1, 1]$ we will set
$$\eta^N := \sum_{k_1, \ldots, k_m \in \mathbb{N}} \eta^{k_1 + \ldots + k_m}\, |k_1\rangle\langle k_1|_1 \otimes \ldots \otimes |k_m\rangle\langle k_m|_m ,$$
with the convention that $0^0 = 1$.

Lemma 30. For arbitrary $\eta \in [-1, 1]$, $\lambda \in [-1, 1]$, and $\sigma \in \mathcal{P}_{m,\lambda}$, we have that $\eta^N \sigma\, \eta^N \in \mathrm{cone}(\mathcal{P}_{m,\lambda})$.

Proof. The case $\eta = 1$ is trivial, while the claim for $\eta = -1$ follows from the simple observation that, since $(-1)^N$ is the parity operator, conjugation by it maps $W_\xi$ to $W_{-\xi}$, so that $\chi_{(-1)^N \sigma (-1)^N,\, \lambda}(\xi) = \chi_{\sigma,\lambda}(-\xi)$, which is positive definite whenever $\chi_{\sigma,\lambda}$ is. For $\eta = 0$ we have that $0^N = |0\rangle\langle 0|$, where $|0\rangle = |0\rangle_1 \otimes \ldots \otimes |0\rangle_m$ is the multi-mode 'vacuum state'. Since $|0\rangle\langle 0| \in \mathcal{P}_{m,\lambda}$ for every $\lambda \in [-1, 1]$, also the case $\eta = 0$ is easily dealt with. From now on, we assume that $\eta \ne -1, 0, 1$. Employing the expression for the (0-ordered) characteristic function of a thermal state for the Hamiltonian $N$ and mean photon number $\frac{\eta}{1-\eta}$ (see e.g. [52, Eq. (4.48)-(4.49)]), one derives a representation of $\eta^N$ as a Gaussian-weighted integral of the operators $W_u$, where the integral is in the Bochner sense with respect to the operator norm. Using this twice and leveraging also the Weyl form of the canonical commutation relations (86), for an arbitrary $\xi \in \mathbb{R}^{2m}$ we obtain the identity (99). In the above calculations, we implicitly used the Fubini-Tonelli theorem, applicable because $\chi_\sigma$ is bounded in modulus by 1 and thanks to the absolute integrability of the Gaussians, and performed the change of variable $w := u + v$ and $z := \frac{u - v}{2}$. Now, assume that $\sigma \in \mathcal{P}_{m,\lambda}$. As per the above discussion, thanks to Bochner's theorem [94, Theorem 1.8.9] there exists a probability measure $\mu$ on $\mathbb{R}^{2m}$ such that $\chi_{\sigma,\lambda}$ is the Fourier transform of $\mu$. Plugging this representation into (99) and using once again the Fubini-Tonelli theorem to swap the integrals yields (101), where in the last line we introduced the change of variables (102). We now claim that from (101)-(102) it is quite clear that $\chi_{\eta^N \sigma \eta^N,\, \lambda}$ is the Fourier transform of a measure on $\mathbb{R}^{2m}$, which allows us to conclude the proof thanks to Remark 29. Indeed, $\chi_{\eta^N \sigma \eta^N,\, \lambda}$ is written as a point-wise product of a Gaussian and the Fourier transform of a measure, namely, $\mu$ re-scaled by another Gaussian factor.
Since the point-wise product of Fourier transforms is the Fourier transform of the convolution [110, Proposition 2.3.22 (11)], and a Gaussian is the Fourier transform of another Gaussian, we see that indeed $\chi_{\eta^N \sigma \eta^N,\, \lambda}$ is the Fourier transform of a measure, namely, that obtained by convolving a Gaussian re-scaled version of $\mu$ by another Gaussian.
We are now ready to deduce: Corollary 31. For all positive integers m, the cone cone(P m,λ ) generated by the set (90) of λ-positive states is weak*-closed.
Proof. For $n \in \mathbb{N}$, let us construct the compact operator $M_n := \left( \frac{n}{n+1} \right)^N$, where $N$ is the total photon number Hamiltonian (95). It is immediate to verify that $M_n$ converges to the identity in the strong operator topology, in the sense explained in the statement of Theorem 7. Also, $\mathcal{P}_{m,\lambda}$ is easily verified to be convex and trace norm closed; this latter fact is apparent once one notices that the operators $\sum_{\mu,\nu=1}^n c_\mu^* c_\nu\, e^{-\frac{\lambda}{4} \|\xi_\mu - \xi_\nu\|^2}\, W_{\xi_\mu - \xi_\nu}$ in (91) are finite linear combinations of unitary operators and hence bounded. The last missing condition needed to apply Theorem 7 with $\mathcal{F} = \mathcal{P}_{m,\lambda}$ is (iii), i.e. that conjugation by $\left( \frac{n}{n+1} \right)^N$ preserves $\mathrm{cone}(\mathcal{P}_{m,\lambda})$. This follows directly from Lemma 30.
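The mode of convergence of $M_n$ used in the proof can be illustrated numerically. The sketch below works with a single mode, where $M_n$ acts on the Fock state $|k\rangle$ as multiplication by $(n/(n+1))^k$. The test vector `psi` and the helper names are hypothetical choices of ours. The output is meant to show that $\|M_n \psi - \psi\| \to 0$ for a fixed vector, while the largest eigenvalue gap $1 - (n/(n+1))^k$ stays close to 1 once high Fock levels are included, so that the convergence is strong but not in operator norm.

```python
import math

def damping(n, k):
    """Eigenvalue of M_n = (n/(n+1))^N on a Fock state with total photon number k."""
    return (n / (n + 1)) ** k

def distance_to_identity(n, amplitudes):
    """Norm of M_n psi - psi for psi = sum_k amplitudes[k] |k> (single mode)."""
    return math.sqrt(
        sum(abs(a) ** 2 * (1 - damping(n, k)) ** 2 for k, a in enumerate(amplitudes))
    )

def worst_gap(n, levels):
    """Largest value of 1 - (n/(n+1))^k over the first `levels` Fock levels."""
    return max(1 - damping(n, k) for k in range(levels))

# hypothetical normalised test vector with amplitudes proportional to 1/(k+1)
raw = [1 / (k + 1) for k in range(50)]
norm = math.sqrt(sum(a * a for a in raw))
psi = [a / norm for a in raw]

errors = [distance_to_identity(n, psi) for n in (1, 10, 100, 1000)]
gap_at_large_level = worst_gap(1000, 100_000)
print(errors, gap_at_large_level)
```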
Thanks to the weak*-closedness of the cone generated by $\mathcal{P}_{m,\lambda}$, we are now in a position to apply Theorems 5 and 9, yielding the following.

E. Relative entropy of non-Gaussianity
In an $m$-mode continuous variable system, particularly simple yet experimentally relevant states are the so-called Gaussian states. A state $\sigma \in \mathcal{D}(\mathcal{H}_m)$ is said to be Gaussian if any of its $\lambda$-ordered characteristic functions $\chi_{\sigma,\lambda}$ is a Gaussian (and hence all are). Let us consider for instance the case $\lambda = 0$. Since $\chi_{\sigma,0}$ achieves its maximum modulus at 0 (see [111, Proposition 14] and [112, Lemma 10]), if it is a Gaussian it must be centred, i.e. of the form given in [52, Eq. (4.48)], where the real vector $s = \operatorname{Tr}[\sigma r] \in \mathbb{R}^{2m}$ and the $2m \times 2m$ real matrix $V = \operatorname{Tr}\left[ \sigma \{ r - s, (r - s)^\intercal \} \right]$ represent the first and second moments of the state, respectively. We will denote with $\mathcal{G}_m$ the set of $m$-mode Gaussian states. Unlike all other sets of free states considered so far, $\mathcal{G}_m$ is not convex.
The relative entropy of non-Gaussianity can be defined as the relative entropy of resource $D_{\mathcal{F}}$ corresponding to the choice of free states $\mathcal{F} = \mathcal{G}_m$, in formula [113,114]
$$\delta_R(\rho) := D_{\mathcal{G}_m}(\rho) = \inf_{\sigma \in \mathcal{G}_m} D(\rho\|\sigma) . \qquad (106)$$
In Ref. [114] it was shown that if $\rho$ has well-defined second moments, a condition that we equivalently rephrase by requiring that $\operatorname{Tr} \rho N < \infty$ for $N$ the total photon number Hamiltonian (95), then the optimisation in (106) is achieved at the Gaussian state $\sigma = \rho_G$ with the same first and second moments as those of $\rho$, hereafter called the Gaussification of $\rho$, i.e.
$$\delta_R(\rho) = D(\rho\|\rho_G) . \qquad (107)$$
To analyse this object more effectively, we now recall an alternative characterisation of $\mathcal{G}_m$ that makes this set easier to work with. To this end, let us introduce the unitary modelling a 50:50 beam splitter acting on a bipartite quantum system $AB$, where $A$, $B$ are composed of $m$ modes each. This can be defined as in (108), where $x_j^A, p_k^A$ are the canonical operators corresponding to system $A$, and analogously for $B$. The action of the beam splitter unitary on a tensor product of coherent states (85) can be expressed as in (109). Having established the notation, we now report a (marginally simplified) version of an interesting result by Cuesta [115].
where A, B stand for two m-mode systems, and U is the beam splitter unitary (108).
We are now ready to prove the following generalisation of the result in Ref. [

Proof. Since $\mathcal{G}_m$ is not convex, we cannot hope to apply Theorem 7. Hence, we have to proceed differently in this case. Consider a general net $(X_\alpha)_\alpha$ on $\mathrm{cone}(\mathcal{G}_m)$; a generic $X_\alpha$ can thus be written as $X_\alpha = \lambda_\alpha \sigma_\alpha$, where $\lambda_\alpha \ge 0$ and $\sigma_\alpha$ is a Gaussian state. Assume that $X_\alpha \xrightarrow[\alpha]{\,w^*\,} X \in \mathcal{T}(\mathcal{H})$, and let us show that $X \in \mathrm{cone}(\mathcal{G}_m)$ as well. For arbitrary $u, v, u', v' \in \mathbb{R}^{2m}$, denoting with $|u\rangle$, $|v\rangle$, etc. the coherent states (85) and with $|u, v\rangle = |u\rangle \otimes |v\rangle$ their tensor products, we can write a chain of identities, with steps labelled (i)-(v), showing that the matrix elements of $X \otimes X$ and of $U (X \otimes X) U^\dagger$ between $|u, v\rangle$ and $|u', v'\rangle$ coincide. Here, (i) and (iv) hold by the definition of weak* convergence, (ii) follows from the quantum Darmois-Skitovich theorem (Lemma 33), while (iii) and (v) descend from the identity (109). Since linear combinations of (tensor products of) coherent states are dense in the topology induced by the Hilbert space norm, from the above identity we deduce that in fact $X \otimes X = U (X \otimes X) U^\dagger$. Applying Lemma 33 once again concludes the proof.
Corollary 35. The relative entropy of non-Gaussianity (106) is: (a) always achieved: if it is finite at ρ, then it is achieved at its Gaussification σ = ρ G , and therefore (107) holds; and (b) lower semi-continuous with respect to the trace norm topology.
A notable aspect of the above result is claim (b), which implies in particular that the map $\rho \mapsto D(\rho\|\rho_G)$ in (112), where $\rho_G$ is the Gaussification of $\rho$, is lower semi-continuous, something that is not at all obvious a priori. We know from Ref. [29] that such a map can very well be discontinuous even on energy-bounded states, so that lower semi-continuity is really the strongest form of regularity that can reasonably be obeyed. The lower semi-continuity of the map (112) implies the lower semi-continuity of the map $\rho \mapsto S(\rho_G)$ on the same set. This property is quite surprising in view of the discontinuity of the map $\rho \mapsto \rho_G$ on the set of energy-bounded states, which follows from the aforementioned discontinuity of the map (112) (established by Ref. [29]) and the continuity of the entropy on these sets (along with the equality $\operatorname{Tr} \rho_G N = \operatorname{Tr} \rho N$).
Proof of Corollary 35. Thanks to Lemma 34, claim (b) follows immediately from Theorem 5. As for claim (a), let ρ be an arbitrary m-mode state. If δ R (ρ) = +∞ then there is nothing to prove. Otherwise, D(ρ σ) < ∞ for some Gaussian state σ ∈ G m . We will argue that in fact Tr ρN < ∞, so that ρ has well-defined (first and) second moments.
with $\omega_j > 0$, and $Z$ a normalising constant [52, Eq. (3.60)]. Given that $D(\rho\|\sigma) < \infty$, it must be that $V^\dagger \rho V = |0\rangle\langle 0| \otimes \rho'$, so that (113) holds. Now, let us invoke the variational expression for the measured relative entropy in Ref. [48, Lemma 20]. Upon taking a straightforward limit, this implies (114). Since both (113) as well as the last term on the rightmost side of (114) are finite, putting together (113) and (114) we deduce that $\operatorname{Tr} \rho' H_q < \infty$, hence $\operatorname{Tr} \rho' N < \infty$ and thus $\operatorname{Tr} \rho N < \infty$, i.e. $\rho$ has well-defined second moments. Applying the result of Ref. [114] then completes the proof of claim (a).
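As a worked instance of the fact that the infimum in (106) is achieved at the Gaussification, consider a single-mode Fock state $|n\rangle$. The sketch below rests on assumptions that we state explicitly: $|n\rangle$ has vanishing first moments and the same second moments as the thermal state with mean photon number $n$, so that its Gaussification is that thermal state; and, since $|n\rangle\langle n|$ is pure, $D(\rho\|\rho_G) = -\langle n| \ln \rho_G |n\rangle$. The function names are hypothetical. The code compares this value with $g(n) = (n+1)\ln(n+1) - n \ln n$, the entropy of the thermal state.

```python
import math

def g(x):
    """g(x) = (x+1) ln(x+1) - x ln x, extended by g(0) = 0."""
    return 0.0 if x == 0 else (x + 1) * math.log(x + 1) - x * math.log(x)

def thermal_population(nbar, k):
    """Fock-basis population of the single-mode thermal state with mean photon number nbar."""
    return nbar ** k / (nbar + 1) ** (k + 1)

def non_gaussianity_of_fock_state(n):
    """D(|n><n| || tau_n), with tau_n the thermal state of mean photon number n (assumed Gaussification)."""
    return -math.log(thermal_population(n, n))

for n in range(6):
    print(n, non_gaussianity_of_fock_state(n), g(n))
```

Under these assumptions the two columns agree, and for $n = 1$ the common value is $2 \ln 2$.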

V. TIGHT UNIFORM CONTINUITY BOUNDS FOR THE RELATIVE ENTROPY OF ENTANGLEMENT AND GENERALISATIONS THEREOF
Until now we have considered the problem of establishing the lower semi-continuity of the relative entropy of resource D F . At this point, the reader could wonder whether and under what conditions D F is fully-fledged continuous. Strictly speaking, this can only happen in finite dimension. In fact, as mentioned before, D F , like many entropic quantities, is often everywhere discontinuous in infinite dimensions [44]. A remarkable example of this behaviour is offered, for instance, by the relative entropy of entanglement [53], but a similar reasoning holds in other cases as well. And yet, such a highly discontinuous behaviour does not represent a problem physically, because it typically involves infinite-energy states. Throughout this section we will see how it is possible to restore a (slightly weaker) form of continuity for the relative entropy of entanglement and related quantities by looking only at the physically meaningful energy-constrained states [30,31,117,118].

A. Energy constraints
In order to model an energy constraint we introduce a Hamiltonian, i.e. a densely defined, positive semi-definite operator $H$ whose spectrum $\mathrm{spec}(H)$ is bounded from below. Since the ground state energy can be re-defined without affecting the physics, we will hereafter take $H$ to be grounded, that is, such that $\min \mathrm{spec}(H) = 0$. In this context, another important assumption is convexity: if $\mathcal{F} \subseteq \mathcal{D}(\mathcal{H})$ is a convex subset of states, then the function $D_{\mathcal{F}}$ defined in (9) is itself convex and satisfies inequality (64); informally, it is 'not too convex'. If in addition $D_{\mathcal{F}}$ does not increase too fast with respect to the energy, in the sense of a growth condition on $\sup_{\rho \in \mathcal{D}(\mathcal{H}),\ \operatorname{Tr} \rho H \le E} D_{\mathcal{F}}(\rho)$, then Ref. [118, Proposition 3] guarantees that $D_{\mathcal{F}}$ is uniformly continuous on the set $\{\rho \in \mathcal{D}(\mathcal{H}) : \operatorname{Tr} \rho H \le E\}$ of energy-constrained states for any $E > 0$. Moreover, it also gives an explicit (uniform) continuity bound for $D_{\mathcal{F}}$, i.e. an upper bound on the quantity $|D_{\mathcal{F}}(\rho) - D_{\mathcal{F}}(\sigma)|$ for all pairs of states $\rho, \sigma$ with $\operatorname{Tr} \rho H, \operatorname{Tr} \sigma H \le E$. Such a bound is faithful, meaning that it vanishes as $\|\rho - \sigma\|_1 \to 0$, but it is not accurate when $\rho$ and $\sigma$ are very close; indeed, in this regime it scales with $\sqrt{\|\rho - \sigma\|_1}$, which is generally not optimal. Significantly more accurate continuity bounds for the function $D_{\mathcal{F}}$ under an energy constraint can be obtained by using the methods proposed in Ref. [30,31]. To apply them, we shall also require that the function under examination be bounded by a multiple of the local entropy or the sum of local entropies. To fix ideas and establish more concrete statements, we will consider in Section V B two examples of functions $D_{\mathcal{F}}$, namely, the relative entropy of entanglement and the relative entropy of NPT entanglement, as well as another closely related quantity, the Rains bound.
In all these cases, the infinite-dimensional generalisations of the Alicki-Fannes-Winter method [117,119] yield explicit and asymptotically tight uniform continuity bounds under an energy constraint on one party of a bipartite system. In Section V C we will obtain uniform continuity bounds for the relative entropy of $\pi$-entanglement in an $m$-partite system, defined in (77), for any given set $\pi$ of partitions of $\{1, \ldots, m\}$ under different forms of energy constraints.
From now on, assume that $H$ is a grounded Hamiltonian on $\mathcal{H}$. The methods of Ref. [30,31] need one more regularity assumption on $H$: informally, this captures the fact that its energy levels should not become too dense as the energy grows; formally, the requirement is that
$$\lim_{\lambda \to 0^+} \left[ \operatorname{Tr} e^{-\lambda H} \right]^{\lambda} = 1 . \qquad (116)$$
Note that already the finiteness of the trace on the left-hand side -commonly referred to as the Gibbs hypothesis -guarantees that H has a purely discrete spectrum and that each eigenvalue has finite multiplicity. As established in Ref.
where $S$ is the von Neumann entropy (2), is finite and satisfies $F_H(E) = o(\sqrt{E})$ as $E \to +\infty$. Importantly, condition (116) holds for the Hamiltonians of many quantum systems of practical interest, including those made of finitely many harmonic oscillators [118,120]. The function $F_H$ defined in (117) encodes some key information on the physics of the system. By Ref. [121, Proposition 1] (see also Ref. [122, Proposition 11]) it is continuous, strictly increasing, and strictly concave. However, it often does not possess some other handy properties that turn out to be useful in computations. To enforce such properties, let us consider an auxiliary function $G : [0, +\infty) \to \mathbb{R}$ that should be thought of as a more 'regular' version of $F_H$ itself. We will assume that $G$ obeys conditions (i)-(v). The existence of a function $G$ with these properties is proved in [30, Proposition 1], where it is also shown that the minimal such function can be written explicitly for all $E \ge 0$. If the operator $H$ satisfies also the condition in Ref. [120, Theorem 3], then one can find a function $G$ satisfying all the above conditions (i)-(v) and moreover the additional condition (vi). For example, consider an $\ell$-mode quantum oscillator with frequencies $\omega_1, \ldots, \omega_\ell$. Denoting the annihilation and creation operators corresponding to the $k$-th mode with $a_k, a_k^\dagger$, respectively, the total Hamiltonian takes the form (120). The function $G_{\ell,\omega}$ in (121) satisfies all the above conditions (i)-(vi), as discussed in Ref. [30].
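For a single harmonic oscillator the objects discussed in this subsection can be evaluated in closed form, which gives a quick sanity check. The sketch below assumes the grounded Hamiltonian $H = \omega\, a^\dagger a$ (our illustrative choice, with hypothetical function names), for which $\operatorname{Tr} e^{-\lambda H} = (1 - e^{-\lambda \omega})^{-1}$ and the maximal entropy at energy $E$ is that of the thermal state, namely $g(E/\omega)$ with $g(x) = (x+1)\ln(x+1) - x \ln x$. It prints the quantity $[\operatorname{Tr} e^{-\lambda H}]^\lambda$ for decreasing $\lambda$ and the ratio of the maximal entropy to $\sqrt{E}$ for increasing $E$.

```python
import math

def gibbs_trace(lam, omega=1.0):
    """Tr exp(-lam * H) for H = omega * a^dagger a, summed as a geometric series."""
    return 1.0 / (-math.expm1(-lam * omega))

def gibbs_condition_value(lam, omega=1.0):
    """[Tr exp(-lam * H)]^lam, which should approach 1 as lam -> 0+."""
    return gibbs_trace(lam, omega) ** lam

def g(x):
    """g(x) = (x+1) ln(x+1) - x ln x, extended by g(0) = 0."""
    return 0.0 if x == 0 else (x + 1) * math.log(x + 1) - x * math.log(x)

def max_entropy(energy, omega=1.0):
    """Maximal entropy at mean energy `energy`: entropy of the thermal state with E / omega photons."""
    return g(energy / omega)

values = [gibbs_condition_value(lam) for lam in (1.0, 0.1, 0.01, 0.001, 1e-6)]
ratios = [max_entropy(E) / math.sqrt(E) for E in (1e2, 1e4, 1e6)]
print(values)
print(ratios)
```

In this example the first list decreases towards 1 and the second decreases towards 0, in line with the sub-square-root growth of the entropy discussed above.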

B. Uniform continuity on energy-constrained states: bipartite case
Throughout this section, we will establish tight uniform continuity bounds on the relative entropy of entanglement and some closely related quantities that are valid on energy-constrained states. The functions we will consider are $E_R$ (defined by (72)), $E_{R,\mathrm{PPT}}$ (defined by (81)), the Rains bound $R$ (defined by (83)), as well as the corresponding regularised quantities, constructed as $f^\infty(\rho) := \lim_{n \to \infty} \frac{1}{n} f\!\left(\rho^{\otimes n}\right)$ for $f = E_R, E_{R,\mathrm{PPT}}, R$, where the limit exists thanks to Fekete's lemma [123], because the un-regularised quantities are all sub-additive on tensor products.
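Fekete's lemma states that for a sub-additive sequence $(a_n)_n$, i.e. one with $a_{m+n} \le a_m + a_n$, the limit of $a_n / n$ exists and equals $\inf_n a_n / n$. The following sketch illustrates this on the toy sequence $a_n = n + \sqrt{n}$, which is a hypothetical stand-in chosen by us and is not one of the entanglement measures above; the helper names are likewise ours.

```python
import math

def a(n):
    """Hypothetical sub-additive sequence a_n = n + sqrt(n)."""
    return n + math.sqrt(n)

def is_subadditive(seq, max_index, tol=1e-12):
    """Check seq(m + n) <= seq(m) + seq(n) for all 1 <= m, n < max_index."""
    return all(
        seq(m + n) <= seq(m) + seq(n) + tol
        for m in range(1, max_index)
        for n in range(1, max_index)
    )

subadditive = is_subadditive(a, 60)
ratios = [a(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]
print(subadditive, ratios)
```

Here the ratios decrease monotonically towards their infimum 1, which is also the limit.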
The above functions $E_R$, $E_{R,\mathrm{PPT}}$, and $R$ are non-negative, convex, and they satisfy inequality (64). Moreover, for any bipartite state $\rho_{AB}$ on $\mathcal{H}_{AB}$ we have the upper bound (123) [14,124]. The continuity bounds (124) hold for any states $\rho$ and $\sigma$ in $\mathcal{D}(\mathcal{H}_{AB})$ such that $\frac{1}{2} \|\rho - \sigma\|_1 \le \varepsilon$, where $g(x) := (x + 1) \ln(x + 1) - x \ln x$. It is easy to see that all the continuity bounds in (124) are asymptotically tight for large $d$.
If both systems A and B are infinite dimensional, then asymptotically tight uniform continuity bounds for the functions E R , E R, PPT , R (as well as their regularisations) under the energy constraint on one of the systems A and B can be obtained by using Ref.
If in addition we assume that the operator $H$ satisfies the Gibbs hypothesis, i.e. $\operatorname{Tr} e^{-\lambda H} < \infty$ for all $\lambda > 0$, as well as the condition in [120, Theorem 3], then for each function $f$ the continuity bound in (125) is asymptotically tight for large $E$. This is true, in particular, if $H$ is the canonical Hamiltonian (120) of an $\ell$-mode quantum oscillator. In this case (125) holds with the choice (121) of $G$, yielding an explicit bound in which $T_* := \frac{1}{\varepsilon} \min\left\{ 1, \sqrt{E / E_0} \right\}$ (the parameters $E_0$ and $E_*$ are determined in (121) via the frequencies of the oscillator).
Remark 37. The right hand side of (125) tends to zero as ε → 0 for any given E > 0 due to the condition Proof of Proposition 36. The validity of inequality (125) 13 A continuity bound sup where the states ρ and σ in the last expression are states of the system A k B k (the k-th copy of AB), while ω k = ρ ⊗(k−1) ⊗ σ ⊗(n−k) is a state of the system The assumption Tr Hρ A , Tr Hσ A ≤ E and the inequality (123) imply finiteness of all the terms in (128). So, to prove inequality (125) for f = E ∞ R it suffices to show that where ∆ t (E, ε) denotes the right-hand side of (125), for each k and any t ∈ (0, T ]. This can be done thanks to the arguments from Ref. [30, proof of Theorem 1], which we summarise in the remaining part of the proof. To simplify the notation, we will rename the system A k B k as AB. For some d ≥ d 0 , let γ(d) . . = G −1 (ln d). Thanks to Ref. [30,Lemma 1], for any d > d 0 such that E ≤ γ(d) there exist states ρ , σ , α i , β i ∈ D(H AB ) (i = 1, 2) and numbers s, t ∈ (0, 1) such that: where s = s 1+s and t = t 1+t . The function E R is well defined on all the states ρ ⊗ ω k , σ ⊗ ω k , α i ⊗ ω k , β i ⊗ ω k (i = 1, 2), since their marginal states corresponding to the subsystems A 1 , . . . , A n have finite energy. So, by using the first relation in (130), the convexity of E R and inequality (64) it is easy to show that and with h 2 being the binary entropy (65). These inequalities imply that Assume that E R (α 2 ⊗ ω k ) ≥ E R (α 1 ⊗ ω k ). Then the subadditivity of E R implies while the monotonicity of E R under local partial traces shows that Ref. [117]). Hence, Since Tr Hα A i ≤ E/s 2 (i = 1, 2), it follows from inequality (123) together with property (iii) in Section V A (cf. (117)) that Inequalities (133), (135) and (136) imply that Similarly, by using the second relation in (130) and by noting that Tr Hβ A i ≤ E/t 2 (i = 1, 2), we obtain Since s, t ≤ y . . 
= E/γ(d) and the function E → G(E)/ √ E is non-increasing by property (v) in Section V A, for x = s, t we have where the last equality follows from the definition of γ(d). Now, thanks to the fact that rk ρ A ≤ d and rk σ A ≤ d, the supports of both ρ A and σ A are contained in some 2d-dimensional subspace of H A . By the triangle inequality we have So, the arguments from Ref. [117,proof of Corollary 8] imply that It follows from (137)-(141) and the monotonicity of the function g that We now conclude the proof of (129). If t ∈ (0, T ] then there is a natural number Since ln d * ≤ ln(d * − 1) + 1/(d * − 1) ≤ ln(d * − 1) + 1/d 0 , inequality (142) with d = d * and the monotonicity of the function g imply the claimed relation (129). Assume now that the operator H satisfies the condition of [120,Theorem 3] and that the function G satisfies property (vi) in Section V A. Note first that the condition in Ref. [30,Eq. (28)] holds whenever f is any of the This can be shown by using any purification γ(E) of the Gibbs state γ(E) . . = e −λH Tr e −λH of system A, where λ is determined by the equation E Tr e −λH = Tr He −λH [44], since Tr Hγ(E) = E and Thus, the asymptotical tightness of the continuity bound in (125) for f = E R , E R, PPT , R follows directly from the corresponding assertion of Ref. [30,Theorem 1]. The asymptotical tightness of the continuity bound in (125) for f = E ∞ R , E ∞ R, PPT R ∞ can be shown by repeating the arguments from the proof of the aforementioned assertion of Ref. [30,Theorem 1].
If $H$ is the Hamiltonian (120) of the $\ell$-mode quantum oscillator, then using the function $G_{\ell,\omega}$ in (121) in the role of $G$ allows one to write the right-hand side of (125) in an explicit form. We refer the reader to Ref. [30, Section 3.2] for details.
If $H$ is the Hamiltonian of a quantum system $A$, then the positive operator on $\mathcal{H}_A^{\otimes n}$ defined by the formula
$$H_n := H \otimes I \otimes \ldots \otimes I + \cdots + I \otimes \ldots \otimes I \otimes H , \qquad (145)$$
where $I$ is the unit operator on each of the factors $\mathcal{H}_{A_k}$, is the Hamiltonian of the system $A^n$ obtained by joining $n$ copies of $A$ [90]. The continuity bounds in (125) have the following consequences.
• the functions $E_R$, $E_R^\infty$, $E_{R,\mathrm{PPT}}$, $E_{R,\mathrm{PPT}}^\infty$, $R$, and $R^\infty$ are asymptotically continuous in the following sense [53]: if $(\rho_n)_{n \in \mathbb{N}}$ and $(\sigma_n)_{n \in \mathbb{N}}$ are sequences of states such that $\rho_n, \sigma_n \in \mathcal{D}(\mathcal{H}_{AB}^{\otimes n})$, $\operatorname{Tr} H_n \rho_n^{A^n}, \operatorname{Tr} H_n \sigma_n^{A^n} \le nE$ for all $n$, and $\lim_{n \to +\infty} \|\rho_n - \sigma_n\|_1 = 0$, where $H_n$ is the positive operator on $\mathcal{H}_A^{\otimes n}$ defined in (145) and $E > 0$ is a finite positive number, then $\lim_{n \to +\infty} \frac{1}{n} |f(\rho_n) - f(\sigma_n)| = 0$ for each of these functions $f$.

Proof. The first assertion directly follows from the continuity bounds in (125) (since the right-hand side of (125) vanishes as $\varepsilon \to 0$). To prove the second assertion, note that $F_{H_n}(E) = n F_H(E/n)$ for each $n$ [31, Lemma 2]. So, if $G : [0, +\infty) \to \mathbb{R}$ is any function satisfying conditions (i)-(v) in Section V A for the operator $H$, and $d_0$ is a positive integer such that $\ln d_0 > G(0)$, then the function $G_n(E) := n G(E/n)$ satisfies the same conditions for the operator $H_n$, and $d_n := d_0^n$ is a positive integer such that $\ln d_n > G_n(0)$. Using this it is easy to obtain from Proposition 36 the bound (148), where $f = E_R, E_R^\infty, E_{R,\mathrm{PPT}}, E_{R,\mathrm{PPT}}^\infty, R, R^\infty$, for any $t \in (0, T]$, $\varepsilon_n := \frac{1}{2} \|\rho_n - \sigma_n\|_1$, and $T := \min\left\{ 1, \sqrt{E / G^{-1}(\ln d_0)} \right\}$. Since the sequence $(\varepsilon_n)_{n \in \mathbb{N}}$ is vanishing by hypothesis and $G(E) = o(\sqrt{E})$ as $E \to +\infty$, the right-hand side of (148) tends to zero as $n \to +\infty$ for any fixed $t \in (0, T]$.
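The scaling relation $F_{H_n}(E) = n F_H(E/n)$ used in the proof can be checked by hand in the simplest case. The sketch below makes several assumptions of ours: $n = 2$ copies of a single-mode oscillator with unit frequency, for which the maximal entropy at energy $e$ is $g(e)$, and a search restricted to products of thermal states (the Gibbs state of $H_2$ is such a product, so the maximum is attained there). The function `best_split` is a hypothetical helper performing a grid search over the ways of sharing the total energy.

```python
import math

def g(x):
    """g(x) = (x+1) ln(x+1) - x ln x, extended by g(0) = 0."""
    return 0.0 if x == 0 else (x + 1) * math.log(x + 1) - x * math.log(x)

def best_split(total_energy, steps=1000):
    """Grid search for the maximum of g(e1) + g(total_energy - e1) over 0 <= e1 <= total_energy."""
    best_e1, best_value = 0.0, g(0.0) + g(total_energy)
    for i in range(steps + 1):
        e1 = total_energy * i / steps
        value = g(e1) + g(total_energy - e1)
        if value > best_value:
            best_e1, best_value = e1, value
    return best_e1, best_value

e1, value = best_split(3.0)
print(e1, value, 2 * g(3.0 / 2))
```

Because $g$ is strictly concave, the optimum is the equal split, and the optimal value coincides with $2\, g(E/2)$.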
C. Uniform continuity on energy-constrained states: multipartite case
In this subsection we obtain uniform continuity bounds for the relative entropy of $\pi$-entanglement of a state of an $m$-partite system $A_1 \ldots A_m$, defined in (77), for any given (non-empty) set $\pi \subseteq P(m)$ of partitions of $\{1, \ldots, m\}$ under the energy constraint imposed either on the whole system $A_1 \ldots A_m$ or on the subsystem $A_1 \ldots A_{m-1}$. Note first that the chain of inequalities (149) holds. The first inequality follows from the definitions of $E_{R,\pi}$ and $E_R$, the latter being the relative entropy distance from the set of fully separable states. The second inequality is proved in Ref. [125] in the finite-dimensional setting. Its validity in the general case is established in Appendix C. Since the above inequalities hold with arbitrary $m - 1$ subsystems of $A_1 \ldots A_m$, we also obtain (150). There is an important aspect in which $E_{R,\pi}$ differs from its bipartite counterpart, $E_R$. Namely, $E_R$ is sub-additive, in the sense that $E_R(\rho_{AB} \otimes \omega_{A'B'}) \le E_R(\rho_{AB}) + E_R(\omega_{A'B'})$ for all states $\rho_{AB}, \omega_{A'B'}$, where the bipartition on the left-hand side is $AA' : BB'$. No analogous inequality can be established for $E_{R,\pi}$ when $\pi$ contains more than one partition, because in that case the set of $\pi$-separable states is not closed under tensor products. For this reason, the limit in the regularisation $E_{R,\pi}^\infty(\rho_{AB}) := \lim_{n \to \infty} \frac{1}{n} E_{R,\pi}\!\left( \rho_{AB}^{\otimes n} \right)$ of the relative entropy of $\pi$-entanglement is only guaranteed to exist when $\pi$ is composed of one partition only (thanks to Fekete's lemma [123]). We will therefore consider the quantity $E_{R,\pi}^\infty$ only in this special case, which is however physically very relevant, as it includes e.g. the fully local scenario (corresponding to the finest partition $\pi = \{\{1\}, \ldots, \{m\}\}$).
By using the upper bound (149), the non-negativity of $E_{R,\pi}$, the result in Ref. [117, Lemma 7], and the arguments from Ref. [117, proof of Corollary 8], one can show that the continuity bound (151) holds for any states $\rho$ and $\sigma$ in $\mathcal{D}(\mathcal{H}_{A_1 \ldots A_m})$ such that $\frac{1}{2} \|\rho - \sigma\|_1 \le \varepsilon$, where as usual $g(x) = (x + 1) \ln(x + 1) - x \ln x$, provided that the systems $A_1, \ldots, A_{m-1}$ are finite dimensional. In (151), as per the above discussion, we can set either $E^*_{R,\pi} = E_{R,\pi}$ and leave $\pi$ arbitrary, or else, if $\pi$ is composed of one partition only, consider also the case where $E^*_{R,\pi} = E^\infty_{R,\pi}$. Assume that $A_1, \ldots, A_m$ are arbitrary infinite-dimensional quantum systems. If $H_1$

We will obtain continuity bounds for the function $E_{R,\pi}$ and its regularisation $E^\infty_{R,\pi}$ under two forms of energy constraint. They correspond to the cases $s = m - 1$ and $s = m$ in the following proposition.
If all the operators $H_1, \ldots, H_s$ are unitarily equivalent to an operator $H$ on $\mathcal{H}_A$ and $G : [0, +\infty) \to \mathbb{R}$ is any function satisfying conditions (i)-(v) in Section V A, then the continuity bound (156) holds for either $E^*_{R,\pi} = E_{R,\pi}$ and $\pi$ arbitrary, or $E^*_{R,\pi} = E^\infty_{R,\pi}$ and $\pi$ composed of one partition only. In particular, if $H$ is the canonical Hamiltonian (120) of an $\ell$-mode quantum oscillator, then (156) with the choice (121) of $G$ becomes the bound (157), again for either $E^*_{R,\pi} = E_{R,\pi}$ and $\pi$ arbitrary, or $E^*_{R,\pi} = E^\infty_{R,\pi}$ and $\pi$ composed of one partition only. Here, the parameters $E_0$ and $E_*$ are defined in (121) via the frequencies of the oscillator. Both continuity bounds in (157) are asymptotically tight for large $E$ if $m = 2$ and $s = 1, 2$.
Remark 40. The right-hand sides of (155) and (156) tend to zero as ε → 0 for any given E > 0, due to the condition Proof of Proposition 39. The upper bounds (149) and (150), the non-negativity and convexity of E R,π , together with the general inequality (64) show that for any non-empty set of partitions π the function E R,π belongs to the classes L m−1 m (1, 1) and L m m (1 − 1/m, 1) defined in Ref. [31]. So, in both cases s = m − 1, m, the continuity bounds (155) Since E ∞ R,π (ρ) ≤ E R,π (ρ) for any state ρ and F H [s] is non-decreasing, inequality (159) shows that the continuity bound (155) for E * R,π = E ∞ R,π holds trivially if ε ≥ 1/2. Hence, from now on we will assume that ε < 1/2. For a given positive integer u we have that [117] where This can be done by using the arguments from Ref. [118, proof of Theorem 1], as we explain now. Letρ andσ denote purifications of the states ρ and σ with the property that δ . . = 1 2 ρ −σ 1 = √ 2ε (such purifications exist thanks to the Fuchs-van de Graaf inequalities [126] and Uhlmann's theorem [127]).
where X ± denote the positive and negative part of the self-adjoint operator X. Since these are states over a system comprising A 1 . . . A m as well as a purifying ancilla, we can consider the reduced states on A 1 . . . A m , denoted by τ ± = (τ ± ) A1...Am . Since By applying the main trick from Ref. [118, proof of Theorem 1] to the statesρ v ,σ v and δ −1 (ρ v −σ v ) ± =τ ± ⊗ω v (instead ofρ,σ andτ ± ) and by using the convexity of E R,π and the validity of inequality (64) for this function we obtain Assume that E R,π (τ + ⊗ ω v ) ≥ E R,π (τ − ⊗ ω v ). By the subadditivity of E R,π we have E R,π (τ + ⊗ ω v ) ≤ E R,π (τ + ) + E R,π (ω v ), while the definition of E R,π and the monotonicity of the relative entropy imply that E R,π (τ − ⊗ ω v ) ≥ E R,π (ω v ) (cf. Ref. [117]). Hence, Inequalities (162), (163)) and (164) together imply (161). By the reasoning in Ref. [31,Remark 6], the continuity bounds (151) and (155) for E * R,π = E ∞ R,π allow us to obtain (156) for E * R,π = E ∞ R,π by using the arguments from Ref. [31, proof of Theorem 2] with f = E ∞ R,π . Assume now that H is the Hamiltonian (120) of the -mode quantum oscillator. In this case, we can take G to be the function G ,ω in (121); this allows us to write (156) in the explicit form (157). To prove the last claim, note that the function G ,ω satisfies condition (vi) in Section V A [30,Section 3.2]. Note also that if m = 2 the condition from the last claim in Ref. [31,Theorem 2] holds for the functions E R and E ∞ R in the cases s = 1 and s = 2. Indeed, in both cases the first relation in this condition is proved by using a product state with appropriate marginal energies, while the second relation is proved by using a pure state ρ in D(H A1A2 ) such that ρ A k is the Gibbs state γ(E) . . 
= e −λH Tr e −λH of system A k , k = 1, 2, where λ is determined by the equation E Tr e −λH = Tr He −λH [44], since Tr Hγ(E) = E and Thus, the asymptotic tightness of the continuity bound (156) for E * R,π = E R in both cases s = 1, 2 follows directly from the last claim in Ref. [ • the function E R,π is asymptotically continuous in the following sense [53]: if (ρ n ) n∈N and (σ n ) n∈N are any sequences of states such that ρ n , σ n ∈ D(H ⊗n A1...Am ), Tr H k,n σ A n k n ≤ nE, ∀ n, and lim n→+∞ ρ n − σ n 1 = 0 , (166) where A n k denotes n copies of A k , H k,n is the positive operator on H ⊗n A k defined in (145) with H = H k , and E > 0 is a finite positive number, then The above properties are also valid for the function E ∞ R,π if π composed of one partition only. Proof. The assertion about uniform continuity of the functions E R,π and E ∞ R,π follows directly from continuity bound (155) with s = m − 1 (speaking about E ∞ R,π we assume that π composed of one partition). To prove of the asymptotic continuity of the functions E R,π and E ∞ R,π note that F (H where ε n = 1 2 ρ n − σ n 1 . Since lim n→+∞ ε n = 0 by hypothesis and F H  [118,Lemma 1], the right-hand side of (168) tends to zero as n → +∞.

VI. CONCLUSIONS AND OUTLOOK
In this paper we established the surprising fact that the infimum defining the relative entropy of entanglement is always achieved, also in infinite-dimensional systems. This has been shown to be a consequence of a much more general result, stating that the relative entropy distance to a (convex) set of free states F, called the relative entropy of resource, is always achieved and moreover lower semi-continuous, provided that the cone generated by F is closed in the weak*-topology (Theorem 5). We employed this latter result to establish a dual variational formula by means of which the relative entropy of resource can be expressed as a maximisation instead of a minimisation (Theorem 9). In doing so, we generalised several results of classic matrix analysis, most notably Lieb's three-matrix inequality, to the infinite-dimensional case (Appendix B). The applications we envision for our dual formula are on the one hand computational, and on the other rest on the theoretical framework proposed in [25,48], where expressions of that kind are used to establish properties such as the super-additivity.
We further identified a general set of conditions implying the above topological property (Theorem 7), and showed how to apply them to a variety of quantum resource theories, namely, that of multi-partite entanglement (Section IV B), NPT entanglement (Section IV C), non-classicality, Wigner negativity and more generally λ-negativity (Section IV D), and finally non-Gaussianity (Section IV E). Interestingly, the topological condition we have pinpointed is obeyed in almost all cases of practical interest, and can thus be regarded as a natural regularity assumption to impose on arbitrary infinite-dimensional quantum resource theories. For example, one could imagine to employ it to generalise the results of Ref. [10], which rest on a key identity between the smoothed regularised (generalised) robustness and the regularised relative entropy, to infinite-dimensional resources. Also, it would be interesting to extend the methods in this paper to address other resource quantifiers involving optimisations over non-compact sets, or else channel resource quantifiers [128].
In the second part of our paper we focused our attention on the relative entropy of (NPT) entanglement, the Rains bound, regularisations thereof, and the corresponding multi-partite generalisations. We have established tight uniform continuity bounds for all those functions in the presence of an energy constraint. Conceptually, those bounds complement the general statement of lower semi-continuity, and prove that much stronger regularity properties can be obtained if one looks only at energy-bounded sets of states. We speculate that even tighter constraints could possibly be derived by leveraging techniques recently proposed by Becker, Datta, and Jabbour [? ].
We are now ready to use Beppo Levi's monotone convergence theorem (see e.g. [130, Theorem 11.1(ii)]), applicable thanks to (A1), which yields the absolute integrability of $f$ and the identities [...]

Junge and LaRacuente [56] (see also [57]) have recently established a version of the multivariate Golden-Thompson inequality from [55] that works in infinite dimensions as well. However, their result is expressed in a 'unitarily rotated' form that is not prima facie equivalent to the generalised Lieb's three-matrix inequality that we need here. Namely, Junge and LaRacuente prove that [...] (B1) for all trace class operators $0 < A \in \mathcal{T}(\mathcal{H})$ and finite collections $\{X_k\}_k$ of bounded self-adjoint operators $X_k = X_k^\dagger \in \mathcal{B}(\mathcal{H})$. Here, $\beta_0 : \mathbb{R} \to \mathbb{R}_+$, given by $\beta_0(t) := \frac{\pi}{2(\cosh(\pi t)+1)}$, is a fixed probability density function on $\mathbb{R}$. It is not obvious how to deduce an inequality of the form [28, Theorem 7] from (B1). The way to do so is detailed in [55, Appendix E] for the finite-dimensional case. The purpose of this appendix is to extend this derivation to the infinite-dimensional case as well.
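As a quick sanity check on the statement that $\beta_0$ is a probability density, the following sketch integrates it numerically using only the Python standard library. It is an illustration rather than part of the argument: the helper names `beta0` and `simpson`, the truncation window $[-20, 20]$ and the number of sub-intervals are choices made here, not taken from the paper.

```python
import math


def beta0(t: float) -> float:
    """Density beta_0(t) = pi / (2 (cosh(pi t) + 1))."""
    return math.pi / (2.0 * (math.cosh(math.pi * t) + 1.0))


def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # Odd interior nodes carry weight 4, even interior nodes weight 2.
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# beta_0 decays like exp(-pi |t|), so truncating the real line to [-T, T]
# with T = 20 discards a negligible amount of mass.
T = 20.0
mass = simpson(beta0, -T, T)
print(mass)  # expected to be very close to 1
```

The exact value follows from $\cosh(\pi t) + 1 = 2\cosh^2(\pi t/2)$, which gives the antiderivative $\frac{1}{2}\tanh(\pi t/2)$ and hence total mass $1$.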
where the left-hand side is interpreted as in (13) when $A$ is not strictly positive definite, and the integral on the right-hand side is absolutely convergent.
Proof. We start by arguing that it suffices to prove the claim when $A > 0$ is strictly positive definite. Indeed, assume that we have addressed that case; given some $A \geq 0$ that is not strictly positive definite, we can pick any trace class $\Delta > 0$ and for $\epsilon > 0$ define $A_\epsilon := A + \epsilon \Delta$. Clearly, $A_\epsilon > 0$ and also $A_\epsilon \geq A$, so we would obtain that
$$\operatorname{Tr} e^{\ln A + \ln B - \ln C} \leq \operatorname{Tr} e^{\ln A_\epsilon + \ln B - \ln C} \leq [...]$$
where the first inequality comes from the monotonicity of the function in (34), as established by Lemma 11.$^{15}$ Since the term proportional to $\epsilon$ in the last line of the above inequality is finite (this will follow from the $A > 0$ case of (B2)), taking the limit $\epsilon \to 0^+$ yields the general case of (B2).
Therefore, in what follows we assume that $A > 0$. Let $(P_n)_{n\in\mathbb{N}}$ be a sequence of orthogonal projectors $P_n : \mathcal{H} \to V_n \simeq \mathbb{C}^n$, where (a) for all $n$ the subspace $V_n$ is invariant under $A$ (for example, it may be the linear span of $n$ eigenvectors); and (b) $\Pi_n := P_n^\dagger P_n : \mathcal{H} \to \mathcal{H}$ converges strongly to the identity, which we write $\Pi_n \xrightarrow[n\to\infty]{s} I$. Set
$$\widetilde{C}_n := \delta I + \left( P_n (C - \delta I) P_n^\dagger \oplus 0 \right) = P_n C P_n^\dagger \oplus \delta\, Q_n Q_n^\dagger =: C_n \oplus \delta\, Q_n Q_n^\dagger ,$$
where the direct sum is with respect to the decomposition $\mathcal{H} = V_n \oplus V_n^\perp$, we denoted with $Q_n$ the orthogonal projector onto $V_n^\perp$, and we set $C_n := P_n C P_n^\dagger$. Then clearly $\widetilde{C}_n \geq \delta I$ for all $n$, and moreover $\widetilde{C}_n \xrightarrow[n\to\infty]{s} C$. Putting all together:
$$\operatorname{Tr} e^{\ln A + \ln B - \ln C} = \exp\, [...] \qquad \text{(B5)}$$
Here, (i) is an application of Junge and LaRacuente's result (B1) with $p = 2$, $k = 1, 2$, $X_1 = \frac{1}{2} \ln B$, and $X_2 = -\frac{1}{2} \ln C$; (ii) follows from the convexity of the exponential $\exp : \mathbb{R} \to \mathbb{R}$ (remember that $\beta_0$ is a probability density function); and in (iii) we leveraged the cyclicality of the trace.

The justification of (iv) is slightly more complex. Since $\widetilde{C}_n \xrightarrow[n\to\infty]{s} C$ and $\widetilde{C}_n, C \geq \delta I$, thanks to [131, Propositions 10.1.9 and 10.1.13(a)] we see that $(\widetilde{C}_n)^{-\frac{1 \pm it}{2}} \xrightarrow[n\to\infty]{s} C^{-\frac{1 \pm it}{2}}$; as we did in (24), this can be shown to imply that [...] in turn guaranteeing that [...] because $B$ is bounded. Now, given that [...] and moreover $\int_{-\infty}^{+\infty} dt\, |\beta_0(t)| = 1$, the identity (iv) follows from (B7) thanks to Lebesgue's dominated convergence theorem.

Continuing with the justification of the derivation in (B5): in (v) we decomposed the trace exploiting the fact that $V_n$ is invariant under the action of $A$, introducing the operators $B_n := P_n B P_n^\dagger$ and $A_n := P_n A P_n^\dagger$ on $V_n$; in (vi) we massaged the first term, which is the trace of an $n \times n$ matrix, by means of an identity from [55, Eq. [...]], valid for all $x, y > 0$ (this step is the same as in [55, Appendix E], where it is explained in more detail); (vii) descends from the chain of equalities
$$\begin{aligned}
\int_0^{+\infty} ds\, \operatorname{Tr} B\, \frac{1}{\widetilde{C}_n + sI}\, A\, \frac{1}{\widetilde{C}_n + sI}
&= \int_0^{+\infty} ds\, \operatorname{Tr} B_n\, \frac{1}{C_n + sI_n}\, A_n\, \frac{1}{C_n + sI_n} + \int_0^{+\infty} \frac{ds}{(\delta + s)^2}\, \operatorname{Tr} B\, (I - \Pi_n)\, A\, (I - \Pi_n) \\
&= \int_0^{+\infty} ds\, \operatorname{Tr} B_n\, \frac{1}{C_n + sI_n}\, A_n\, \frac{1}{C_n + sI_n} + \frac{1}{\delta}\, \operatorname{Tr} B\, (I - \Pi_n)\, A\, (I - \Pi_n) ,
\end{aligned}$$
where $I_n$ denotes the identity operator on $V_n$ (essentially, the identity matrix of size $n$), and the last step uses $\int_0^{+\infty} \frac{ds}{(\delta+s)^2} = \frac{1}{\delta}$; finally, (viii) can be deduced once again thanks to Lebesgue's dominated convergence theorem, because (a) due to $\widetilde{C}_n \geq \delta I$ we have [...] $\frac{1}{C + sI}\, A\, \frac{1}{C + sI}$, from which we infer that [...] precisely as in (B7). This concludes the proof of the first inequality in (B2). As for the second, we see as in (a) and (b) above that the integral in (B2) is upper bounded by $1/\delta$.
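The scalar identity invoked in step (vi) is not legible in this text. A plausible candidate, stated here as an assumption rather than as a quotation from [55], is
$$\int_{-\infty}^{+\infty} dt\, \beta_0(t)\, x^{-\frac{1+it}{2}}\, y^{-\frac{1-it}{2}} = \int_0^{+\infty} \frac{ds}{(x+s)(y+s)} \qquad (x, y > 0),$$
which would convert the $\beta_0$-average produced by (B1) into the resolvent integral appearing in (vii). The sketch below checks this candidate numerically with the Python standard library; the function names `lhs` and `rhs`, the truncation window and the step count are illustrative choices.

```python
import cmath
import math


def beta0(t: float) -> float:
    """Density beta_0(t) = pi / (2 (cosh(pi t) + 1))."""
    return math.pi / (2.0 * (math.cosh(math.pi * t) + 1.0))


def lhs(x: float, y: float, T: float = 20.0, n: int = 40000) -> complex:
    """Simpson approximation of int dt beta0(t) x^{-(1+it)/2} y^{-(1-it)/2}."""

    def integrand(t: float) -> complex:
        # x^{-(1+it)/2} y^{-(1-it)/2} = (xy)^{-1/2} exp(i t (ln y - ln x) / 2)
        phase = cmath.exp(0.5j * t * (math.log(y) - math.log(x)))
        return beta0(t) * phase / math.sqrt(x * y)

    h = 2.0 * T / n  # n is even, as Simpson's rule requires
    total = integrand(-T) + integrand(T)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(-T + k * h)
    return total * h / 3.0


def rhs(x: float, y: float) -> float:
    """Closed form of int_0^inf ds / ((x + s)(y + s))."""
    if x == y:
        return 1.0 / x
    return (math.log(x) - math.log(y)) / (x - y)


for x, y in [(1.0, 1.0), (2.0, 5.0), (0.3, 7.0)]:
    print(x, y, abs(lhs(x, y) - rhs(x, y)))  # differences should be tiny
```

The closed form on the right follows from a partial-fraction decomposition of the integrand, and for $x = y$ both sides reduce to $1/x$ because $\beta_0$ has unit mass.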
where $\{|\phi^1_{i_1}\rangle\}_{i_1}$ and $\{|\psi^1_{i_1}\rangle\}_{i_1}$ are orthogonal sets of unit vectors in $\mathcal{H}_{A_1}$ and $\mathcal{H}_{A_2 \ldots A_m}$, respectively, and $\{p^1_{i_1}\}_{i_1}$ is a probability distribution. Applying the Schmidt decomposition with respect to the bi-partition $A_2 : A_3 \ldots A_m$ to any of the vectors $|\psi^1_{i_1}\rangle$, we obtain [...] where $\{|\phi^2_{i_1 i_2}\rangle\}_{i_2}$ and $\{|\psi^2_{i_1 i_2}\rangle\}_{i_2}$ are orthogonal sets of unit vectors in $\mathcal{H}_{A_2}$ and $\mathcal{H}_{A_3 \ldots A_m}$, respectively, and $\{p^2_{i_1 i_2}\}_{i_2}$ is a probability distribution for any given $i_1$.
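For orientation, the two Schmidt steps just described presumably take the standard form sketched below. This is an assumption based on the usual Schmidt decomposition, not a transcription of the displayed equations; in particular, the symbol $|\Psi\rangle$ for the initial pure state on $A_1 \ldots A_m$ is introduced here only for illustration.

$$|\Psi\rangle = \sum_{i_1} \sqrt{p^1_{i_1}}\; |\phi^1_{i_1}\rangle \otimes |\psi^1_{i_1}\rangle , \qquad |\psi^1_{i_1}\rangle = \sum_{i_2} \sqrt{p^2_{i_1 i_2}}\; |\phi^2_{i_1 i_2}\rangle \otimes |\psi^2_{i_1 i_2}\rangle .$$

Substituting the second expansion into the first would give
$$|\Psi\rangle = \sum_{i_1, i_2} \sqrt{p^1_{i_1}\, p^2_{i_1 i_2}}\; |\phi^1_{i_1}\rangle \otimes |\phi^2_{i_1 i_2}\rangle \otimes |\psi^2_{i_1 i_2}\rangle ,$$
in which the local vectors on $A_2$ are orthonormal only for fixed $i_1$, so the overall expansion is in general not a multi-partite Schmidt form.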
By repeating this process we get