Disks globally maximize the entanglement entropy in $2+1$ dimensions

The entanglement entropy corresponding to a smooth region in general three-dimensional CFTs contains a constant universal term, $-F \subset S_{\text{EE}}$. For a disk region, $F|_{\rm disk}\equiv F_0$ coincides with the free energy on $\mathbb{S}^3$ and provides an RG-monotone for general theories. As opposed to the analogous quantity in four dimensions, the value of $F$ generally depends in a complicated (and non-local) way on the geometry of the region and the theory under consideration. For small geometric deformations of the disk in general CFTs as well as for arbitrary regions in holographic theories, it has been argued that $F$ is precisely minimized by disks. Here, we argue that $F$ is globally minimized by disks with respect to arbitrary regions and for general theories. The proof makes use of the strong subadditivity of entanglement entropy and the geometric fact that one can always place an osculating circle within a given smooth entangling region. For topologically non-trivial entangling regions with $n_B$ boundaries, the general bound can be improved to $F \geq n_B F_0$. In addition, we provide accurate approximations to $F$ valid for general CFTs in the case of elliptic regions for arbitrary values of the eccentricity which we check against lattice calculations for free fields. We also evaluate $F$ numerically for more general shapes in the so-called"Extensive Mutual Information model", verifying the general bound.


Introduction
In spite of its intrinsically divergent nature, the entanglement entropy (EE) of spatial regions in d-dimensional CFTs contains pieces which are well defined in the sense of being independent of the regularization utilized. The nature of such "universal" terms changes considerably depending on whether d is even or odd. In even dimensions, the prototypical universal contributions for smooth surfaces are coefficients of logarithmic terms and have an intrinsically UV origin. In particular, they are local in the entangling surface ∂A, meaning that they are given by integrals over such surface of various intrinsic and extrinsic curvatures. These appear weighted by certain theory-dependent constants which are proportional to the corresponding trace-anomaly coefficients [1][2][3][4]. In odd dimensions, there is no logarithmic term for smooth In each set, all the different regions are related to each other through conformal transformations (the same transformations are applied in each set), and therefore share the same F for a given CFT. Disks are naturally unchanged, but notice for instance how an ellipse (innermost figure in middle plot) can be deformed, without modifying F , into a disk with a small depression. regions 1 and the universal piece corresponds to the constant term, independent of the cutoff. As opposed to logarithmic terms, constant terms are not given by local integrals over the entangling surface. Rather, they generally depend in a complicated way on the theory under consideration as well as on the shape of the entangling region A. The simplest case corresponds to three-dimensional CFTs. For those, the EE of a smooth region takes the form where c 0 is a non-universal constant and δ is a UV regulator. On the other hand, F (A) is a universal, dimensionless, and generally non-local in nature piece. Importantly, F (A) is a conformal invariant quantity, so that F (A) = F (T (A)), where T is any conformal transformation -see Fig. 1. When A is a disk -namely, a region with ∂A = S 1 -the corresponding F , which henceforth we denote F 0 , equals the free energy of the corresponding theory on S 3 [6,7]. F 0 turns out to define an RG monotone for general quantum field theories, hence providing a version of the F -theorem in three dimensions [8].
In this paper we would like to ask the question of which region A minimizes the universal part of the entanglement entropy in three dimensions for a given CFT. Naturally, the obvious candidate corresponds to the disk region, i.e., For small geometric deviations of the disk, the above inequality follows from Mezei's formula [9] -see (2.2) below-which informs us that corrections to F 0 are given by a positivedefinite geometry-dependent coefficient weighted by the (also positive-definite) stress-tensor coefficient 2 C T . On general grounds, one can also argue that F tends to grow as regions incorporate sufficiently thin sectors, as in that case F includes contributions proportional to the ratio between the corresponding long and short dimensions -see e.g., [15,16]. As far as particular theories are concerned, the only model for which (1.2) has a definite positive resolution is holographic Einstein gravity. In that case, F holo can be written in terms of the Willmore energy [17][18][19] of a duplicated version of the corresponding Ryu-Takayanagi surface [20][21][22] and the global minimization of F holo by F holo 0 follows from simple geometric arguments.
An analogous problem has been studied for the universal (logarithmic) contribution to the EE in four-dimensions in [21] and [23]. In that case, however, the essentially local nature of the corresponding term makes it be fully determined in terms of two (theory-independent) local integrals over ∂A weighted by the trace-anomaly charges of each theory a and c. The global minimization of the EE universal term by the round sphere follows then rather straightforwardly for general CFTs in the case of arbitrary entangling surfaces of genus g = 0 and g = 1. On the other hand, as pointed out in [23], when a > c it can be shown that the EE universal coefficient is actually unbounded from below for sufficiently high genus. 3 Here, we will argue that disks globally maximize the EE for general CFTs in three dimensions. Before getting to the proof, we start with a brief summary of previous results and general remarks in Section 2. Then, in Section 3 we study in detail the case of elliptic regions. In particular, we provide analytic approximations of F for such regions for arbitrary values of the eccentricity for general CFTs. These we compare with the exact result obtained numerically for the so-called "Extensive Mutual Information" (EMI) model as well as with lattice results for free scalars and fermions finding good agreement for the latter and a small puzzle for the former. The lattice results are obtained using mutual information (MI) as a geometric regulator which requires the evaluation of the MI for two concentric regions and the subtraction of a purely geometric piece weighted by the coefficient controlling the EE of a thin strip region. In Section 4 we evaluate F for several families of more general entangling regions in the EMI model. This allows to illustrate to what extent F may vary in general as the geometric features of the entangling region non-perturbatively deviate from the most symmetric cases. Naturally, all results in these two sections are in agreement with (1.2), 2 It is worth pointing out that in the case of the Euclidean free energy, S 3 also provides a local maximum and that the leading correction to the free energy for slightly deformed spheres is similarly controlled by CT [10,11]. Beyond the perturbative regime, partial results for free fermions suggest that S 3 does not necessarily correspond to a global maximum [10]. However, the match between the EE and the Euclidean free energy does not go through beyond the round case, so we cannot extrapolate such results to the present context. For additional related results regarding the extremization of three-dimensional free energies in Lorentzian backgrounds, see [12][13][14]. 3 The argument follows from two facts: i) the EE universal coefficient can be written as a linear combination of the Willmore functional of ∂A plus (a/c − 1) times the Euler characteristic of ∂A -which is proportional to (1 − g), where g is the genus of the entangling surface; ii) for every genus g there exists at least a surface whose Willmore energy lies somewhere between the values 4π and 8π [24,25]. Then, whenever (a/c − 1) > 0, a growing g will make the Euler characteristic piece more and more negative, whereas the Willmore part for certain geometries will remain bounded between the aforementioned values independently of g.
as expected. In Section 5 we prove (1.2) for general theories. In order to do that, we first study the effect on the EE of considering geometric perturbations of the entangling region for which the derivatives of the perturbation go to zero with a different velocity than the perturbation itself and characterize the type of deformations which allow us to to separate the leading correction to F from the subleading ones. Denoting by γ(s) the curve describing the entangling surface ∂A and η(s) its normal vector at a point s, the relevant deformations γ λ (s) = γ(s) + δ λ (s)η(s) are such that their k-th derivative behaves as ||δ where || · || is the uniform norm. We then show that any functional allowing for a hierarchy of perturbative terms in δ λ in the sense just described which is Euclidean invariant, strong superadditive and constant for disks (which includes F as a particular case) is globally minimized for disks. Three facts play a crucial role in the proof: i) for a given entangling region which fully contains another one and such that both coincide within a given interval, functionals of that class always increase more for the outer region than for the inner one as we perturb both outwards at the point of osculation; ii) disks have a vanishing first-order correction to F ; iii) it is always possible to place an osculating disk within a given region. We conclude in Section 6 with some final comments. In particular, we show there that our results in Section 4 imply a stronger bound for topologically non-trivial regions -i.e., regions with holes and/or multiply connected subregions. In particular, we find F ≥ n B F 0 , where n B is the number of boundaries of the region. In Appendix A we explore how the MI regularization of the EE works in the case of the EMI model, for which we have full control of the calculations. In particular, we study how the practical limitations of the lattice involving the finiteness of the regions affect the results and extract some conclusions which we use in Section 3.

General observations and previous results
In this section we review what is known about the issue of whether or not F is minimized by disk regions. The evidence for general CFTs is essentially limited to small perturbations of the disk and to regions which contain very thin sectors. On the other hand, in the particular case of holographic theories dual to Einstein gravity, F can be written in terms of the so-called Willmore energy of a duplicated version of the corresponding Ryu-Takayanagi surface. This in turn implies (1.2) for holographic Einstein gravity.

Perturbative deformations of the disk for general CFTs
The question of whether or not F is minimized for a disk region for arbitrary CFTs has a positive answer when the analysis is restricted to figures which are small deformations of it. In particular, for a slightly deformed disk defined by the polar equation (where a (c) , a (s) are some numerical coefficients which determine the shape of the perturbation), the result for F at leading order in is given, for general CFTs, by the three-dimensional version of Mezei's formula [9,26] where C T is the coefficient which controls the flat-space stress-tensor two-point function charge. 4 In this expression, everything is determined by the geometry of the entangling region with the exception of C T , which is the only theory-dependent piece. This is a positivedefinite quantity for unitary CFTs and hence it is evident from (2.2) that (1.2) holds, since the geometric coefficient is also positive semi-definite in all cases.

Very thin regions for general CFTs
When one of the dimensions of the entangling region becomes very thin compared to the other, the universal contribution approaches the one corresponding to a strip -whose boundary is defined by two parallel straight lines of length L and separated a distance r L. In that situation, F tends to diverge as is a positive coefficient characteristic of the CFT under consideration. This quantity is known, for instance, for: holographic theories [16,28], free scalars and fermions [29], as well as for the EMI model we consider below [30]. As opposed to other charges appearing in the entanglement entropy of symmetric regions in various dimensions, k (3) does not have an alternative CFT interpretation beyond entanglement entropy. The behavior (2.3) shows that the conjectural properties (1.2) are clearly satisfied for regions which possess a dimension sufficiently thinner than the other. Besides, regions which may posses various features including some which approach the strip-like shape just described will tend to make F grow.

Holography and Willmore energy
For holographic theories dual to Einstein gravity, the entanglement entropy for a given boundary region can be obtained from the Ryu-Takayanagi prescription [16,[31][32][33][34]. The purely geometric nature of the formula has allowed for the evaluation of entanglement entropy for regions beyond the most symmetric instances. This has been done in the three-dimensional case e.g., in [22,35], using the "Surface Evolver" tool [36] -see also [37,38]. As far as F holo is concerned, it can be proven that it is indeed globally minimized by disk regions. The argument is as follows. Given a general smooth, orientable, closed surface, Σ in R 3 (not necessarily minimal), the so-called "Willmore energy" is defined as [17][18][19] where H ≡ (k 1 + k 2 )/2 is the mean curvature constructed from the principal curvatures k 1 and k 2 of Σ and dS the surface element. This quantity has a global lower bound which is saturated when Σ is a spherical surface. 5 In order to see why this is the case, one can modify the Willmore functional (2.4) subtracting the Gaussian curvature K = k 1 k 2 as The study of both versions of the functional is equivalent. This is because, from the Gauss-Bonnet theorem Σ KdS = 2πχ (Σ), where χ (Σ) is the Euler characteristic of the surface. Hence, for fixed genus,W is just a constant shift with respect to W. The interest of the modified functionalW is manifest when written in terms of the principal curvatures It immediately follows thatW (Σ) has a sign,W (Σ) ≥ 0, and also that the bound is exclusively saturated for a totally umbilical surface, k 1 = k 2 , which in the present context is just a fancy name for a sphere [18]. After this discussion, it is straightforward to check that the bound for the original functional (2.5) is found from (2.6) particularizing the Euler characteristic to the sphere case, χ S 2 = 2. Now, in [20][21][22], it was pointed out that the finite term in the three-dimensional holographic entanglement entropy can be written in terms of a Willmore energy as where L is the AdS radius and G is the Newton constant -see also [40][41][42][43]. In this expression, is a closed surface embedded in R 3 which consists of a duplicated Ryu-Takayanagi surface [44]: the surface Σ RT(A) is just Σ RT(A) reflected with respect to the AdS boundary and joined along the entangling surface ∂A. Then, as mentioned earlier, the Willmore energy of the spherical surface is minimal, saturating the bound (2.5). Hence, using the fact that the Ryu-Takayanagi surface for a disk region is an hemi-sphere, this is translated into a global bound for F holo ,

Elliptic entangling regions
Perhaps the simplest (non-perturbative) generalizations of the disk region which come to mind are ellipses. In spite of this, there do not seem to be too many results available in the 5 If the surfaces taken into account have greater genus, the bound is higher [39].
literature. In the holographic context, they have been studied numerically alongside more general regions in [35]. In the same context, some bounds for their corresponding F were obtained in [45]. In this section we start by considering the limit in which the elliptic regions are very squashed (eccentricity e → 1), which leads to a general approximation in terms of the strip coefficient k (3) valid for general theories. Then, using this approximation along with Mezei's formula for the opposite limit of almost round ellipses, we build a trial function which approximates F (e) for arbitrary values of the eccentricity for a general CFT as long as we know the values of the three coefficients F 0 , C T , k (3) . We present the resulting approximations for the EMI model, holographic Einstein gravity, free scalars and free fermions. For the last two, we perform lattice calculations of F (e) using MI as a regulator for e = √ 3/2 0.866, e = 2 √ 2/3 0.943, e = √ 15/4 0.968 and e = 2 √ 6/5 0.98 which we compare to the approximations. The fermion results agree reasonably well with the trial function, which also approximates the exact EMI model results with a discrepancy lower than a ∼ 5% for the whole range. On the other hand, the lattice produces results for the scalar which are considerably lower than the ones obtained from the approximation. While this appears to be an artifact of the lattice results (at least to some extent) we introduce an additional trial function which does a better job in approximating the scalar results. Along with the initial one, these two functions confidently bound the possible values of F (e) for a given CFT. In all cases, F (e) is a monotonically increasing function of the eccentricity.

General-CFT approximations for arbitrary eccentricity
Ellipses of semi-major and semi-minor axes a,b and eccentricity e ≡ 1 − b 2 /a 2 ∈ [0, 1) can be parametrized in polar and Cartesian coordinates respectively by where the "eccentric anomaly" t is related to the polar angle θ by tan(θ) = b a tan(t). As we vary e, we can interpolate between the regime which is very close to the disk (as e → 0) and the one which approaches the strip-like behavior described in subsection 2.2 (as e → 1). In the former case, one finds, from Mezei's formula In the opposite regime, we can take advantage of (2.3), namely, of the fact that very thin regions are controlled by the strip coefficient k (3) times the ratio of their long and thin dimensions. Using the equation for the width of the squashed ellipse, 2y = 2b 1 − x 2 /a 2 , we can then approximate F (e→1) as [45] F This is valid for general CFTs and, at least for some models, it in fact provides a very good approximation to the exact result for not-so-squashed ellipses -see comments below (4.28).
Similar formulas valid for general CFTs can be analogously obtained for other closed squashed entangling regions. Formulas (3.3) and (3.2) provide us with approximations to F (e) in those two regimes, for general CFTs provided we know the coefficients k (3) , F 0 and C T . Using them, we can try to do better and produce expressions which approximate F (e) for arbitrary values of the eccentricity.
Our strategy is to build an approximation from a linear combination of test functions with particular relative coefficients. The idea is to fix those coefficients in a way such that (3.3) and (3.2) hold when the full expression is expanded around e → 0 and e → 1 respectively. 6 In the former case, this implies setting to zero the O(e 1 , e 2 , e 3 , e 5 , e 6 , e 7 ) coefficients. Naturally, there are many possible candidate functions one may consider, but we find it convenient to choose as building blocks functions of the form ∼ E[e 2 ] k / , (3.8) 6 Let us mention that a similar strategy was used in [46] (see also [47]) in order to produce high-precision approximations to the entanglement entropy corner functions of general CFTs using the parameters CT and k (3) . In that case, the available (numerical) results for free fields and holography allowed the authors to verify the excellent agreement between the trial functions and the exact curves.
Although F (e) | tri 1,2 do not look particularly simple, note that the exact F (e) for a given CFT will in general be an extremely complicated function of e depending on many details of the theory. Our approximations above are completely explicit and depend on the CFT under consideration exclusively through the three constants k (3) , F 0 and C T . Anticipating results presented in Section 4, we can test the precision of our formulas against numerical calculations for the EMI model. For this, we know k (3) , F 0 and C T analytically -see (4.28), (4.12) and (4.15) below-and we can also compute F (e) exactly (up to numerical precision). The first function, F (e) | trial 1 produces a very good approximation to the exact EMI model results. As shown in the inset of the left plot in Fig. 2, the greatest discrepancy is lower than ∼ 5% and it is much smaller for most values of e. The case of F (e) | trial 2 is rather different. For that one, the discrepancy grows up to a maximum of ∼ 22% for e ∼ 0.96. In view of this, we expect F (e) | trial 1 to be a better approximation for general CFTs. However, we decided to keep F (e) | trial 2 as it does a much better job in approximating the free scalar results obtained in the lattice, as we explain below.
As anticipated, there are other CFTs for which we know the values of the three relevant coefficients and for which we can therefore compute F (e) | tri 1,2 . In particular, for a free scalar [27,29,48,49], a free fermion [27,29,48,49] and Einstein gravity holography [16,28] we have Figure 2: We plot the universal contribution to the entanglement entropy corresponding to an elliptic region, normalized by the disk result, F 0 , as a function of the eccentricity. In the left plot, the data points are (exact) numerical results obtained for the EMI model and the purple curves are the trial functions F (e) | trial 1,2 particularized to this model. The relative error in the approximations is shown in the inset. In the right plot, we present both trial-function curves for: a free scalar, a free fermion, the EMI model and holographic Einstein gravity. The dots correspond to lattice values for the scalar and the fermion. The inset is a zoom in of the region e > 0.85.
Note that the fermion results correspond to a Dirac field, so the values of the coefficients per degree of freedom would require dividing them by 2. Using these coefficients, in the right plot of Fig. 2 we present the corresponding F (e) | tri 1,2 curves, alongside the EMI one. As we can see, the fermion and EMI curves are very close to each other 8 and, for each model, both trial functions are also rather similar. The scalar and holography curves are the ones making F/F 0 greater and lower, respectively, for general values of e. This hierarchy of theories is similar to the one encountered for the entanglement entropy corner function a(θ) normalized by C T in [30]. In all cases, F (e) | tri 2 lies notably below F (e) | tri 1 in the whole range.
In the present case, we have also performed lattice calculations for the free scalar and the fermion corresponding to e = √ 3/2 0.866, e = 2 √ 2/3 0.943, e √ 15/4 0.968 and e = 2 √ 6/5 0.98. As explained in the following subsection, we obtain two values of F (e) for each eccentricity and model. They are both presented in Fig. 2. We find where we definedF (e) ≡ F (e) /F 0 here to avoid the clutter. Now, for the trial functions we findF We observe that the fermion results never differ from the first trial function prediction more than a ∼ 4% and in most cases the agreement is better. The agreement with the second trial function is much worse, and the discrepancies reach a maximum of ∼ 17%. In the case of the scalar, it is the second trial function the one which approximates better the lattice results, with disagreements lower than ∼ 5% in most cases, except for the last value, which differs by a ∼ 13%. On the other hand, the results are clearly off from the ones predicted by the first trial function: in some cases, the discrepancy grows up to ∼ 40%. We expect the results both for the scalar and the fermion to have an uncertainty which is probably not less than ∼ 5%. This does not explain however why we cannot fit the results for both theories with a single curve. In fact, we suspect that the lattice is considerably underestimating the actual scalar results. In particular, as we mention below, analogous calculations for the disk region yieldF ferm 0 | lattice 0.99 andF scal 0 | lattice 0.92. Namely, while the fermion result approaches the analytical answer very well, the scalar one is an 8% lower. In view of this and of the agreement for the EMI and free fermion results with the first trial function, a reasonable guess is that F (e) | tri 1 is actually a very good approximation to the exact curve for general CFTs, including the scalar, whereas F (e) | tri 2 is not so much. We say a bit more about this in the next subsection.

Free-field calculations in the lattice
As anticipated, in this subsection we compute F (e) /F 0 for elliptic regions in the lattice for free scalars and fermions. As argued in [51], trying to obtain F 0 from a direct calculation of the entanglement entropy in a square lattice does not produce reasonable results. 9 Rather, one obtains wildly oscillating answers as the ratio R/δ varies (here δ stands for the lattice spacing). The problem has to do with the fact that we cannot resolve the radius of the disk with a precision better than δ, which means that we cannot distinguish disks with radii R and R(1 + aδ), with a ∼ O(1). But such uncertainty will pollute F via the area-law piece 9 For satisfactory calculations in radial lattices see [52,53].
in the entanglement entropy (1.1) as −F → −F + 2πc 0 a. As it is clear from this, the issue cannot be resolved by making the disk radius larger in the lattice.
An alternative approach which does produce convergent results consists in using mutual information as a geometric regulator [51,[54][55][56]. Given a region A, we consider two concentric ones A − and A + . These are defined by considering a normal to ∂A at each point and moving a distance ε/2 inwards and outwards along such direction, respectively. ε can be chosen to be constant for all values of the parameter s defining ∂A or, alternatively, it can be a function of s if one decides that A + and A − should be dilated versions of A -e.g., if A is an ellipse, A − and A + would also be ellipses using this second method. In both cases, we are left with an inner annulus of width ε (constant or variable) -see Fig. 3.
The idea is then to consider the MI which, using the purity of the ground state can be rewritten as where A + ∪ A − is the annulus region. The three regions appearing in (3.20) are finite and are thus defined by a finite number of points in the lattice, so they are suitable for that setup. Now the idea is to consider annuli such that ε/R 1 while keeping ε/δ 1. In that limit, the MI behaves as This equality is true up to terms which diverge as ∼ 1/ε and ∼ 1/δ respectively in the corresponding limits. Hence, in order to extract F from the MI (3.20), one must deal with those terms first. As argued in [51], if we parametrize the boundary of A by the length parameter s, one finds for ε/R 1, where k (3) is the coefficient characterizing the EE of a thin strip region -see (2.3). In order for this to work, the region A must be chosen precisely half way between ∂A + and ∂A − . As mentioned earlier, ε(s) corresponds to the separation between ∂A + and ∂A − at a given point s, measured along the normal direction to ∂A at that point. Therefore, given some region A in the lattice, we can extract its F by computing I(A + , A − ) as in (3.20) and subtracting the first piece in the r.h.s. of (3.21), then dividing by −2. For fixed values of ε/R, where R is some characteristic size of A, the results will improve their accuracy as R/δ 1. In the case of disk regions, (3.21) simplifies considerably, and one gets where ε ≡ R 2 − R 1 and R ≡ (R 1 + R 2 )/2. In this case, both methods described above involve a constant ε and A − , A + are always disks.
For an elliptic region A of eccentricity e, on the other hand, we can either choose a constant ε or, alternatively, force A − and A + to be ellipses as well. In both cases, computing ε requires obtaining the line which intersects the boundary of the elliptic entangling region normally at the point determined by t -see Fig. 3. Assuming the ellipse is centered at the origin of coordinates and parametrized by [a cos(t), b sin(t)], this is given by A point on the normal can be parametrized as [a cos(t), y(a cos(t))] and we want to determine the quantity α that we need to add to both coordinates so that the new point lies a distance ε/2 from [a cos(t), b sin(t)]. Hence, we need to solve for α. Solving this equation and plugging back we find that the points r i,o (t) defining the shapes of the inner and outer curves read , y a cos(t) + (3.25) Here, (−1) i = −1, 1 for the inner and outer shapes, respectively. Finally, we can compute the entanglement entropy of the original elliptic region as where I(pseudoellipse i , pseudoellipse o ) is the mutual information of the "pseudoellipses" defined by the shapes r i (t) and r o (t) respectively. For the second method, on the other hand, we consider two concentric ellipses of the same eccentricity e parametrized by [a 1,2 cos(t), b 1,2 sin(t)]. Computing ε(t) requires now obtaining the intersections of the normal line to ∂A with the exterior and interior ellipses. These have The intersections occur at the pointsr 1,2 (t) -see Fig. 3-wherẽ and (x 2 ,ỹ 2 ) are given by the same expressions replacing (1 ↔ 2) in all labels. From this, ε(t) can be obtained as Putting the pieces together, we find that the EE universal term for an ellipse of eccentricity e with a boundary half way between two concentric ellipses of the same eccentricity can be obtained as where 1, 2 are labels referring to the two (inner and outer, respectively) concentric ellipses. Both (3.26) and (3.30) should produce equivalent approximations to F (e) for sufficiently large values of b/ε. However, as we discuss in Appendix A in the case of the EMI model -for which we can perform calculations for arbitrary values of b/ε-the prescriptions are not equally good in their precision for finite values of b/ε -see Fig. 10. In particular, we find that the pseudoellipses one produces more stable and accurate answers, specially for values of e close to 1. In the lattice, we have evaluated I(pseudoellipse i , pseudoellipse o ) as well as I(ellipse 2 , ellipse 1 ) numerically using (3.20) and then subtracted the purely geometric pieces proportional to k (3) in each case. Similarly to the EMI case, we observe that the first method produces greater and more stable results, so we only present those here.
We perform lattice calculations for free scalars and fermions. For the former, we consider a set of fields and conjugate momenta φ i , π j , i, j = 1, . . . , N labelled by their positions at the square lattice. These satisfy canonical commutation relations, [φ i , π j ] = iδ ij and [φ i , φ j ] = [π i , π j ] = 0. Then, given some Gaussian state ρ, the EE can be computed from the two-point correlators X ij ≡ tr(ρφ i φ j ) and P ij ≡ tr(ρπ i π j ) . In particular, one has [29,57] S EE (A) = tr [(C A + 1/2) log( where C A ≡ √ X A P A and (X A ) ij ≡ X ij , (P A ) ij = P ij (with i, j ∈ A) are the restrictions of the correlators to the sites belonging to region A.
The lattice Hamiltonian reads in this case where the lattice spacing is set to one. The relevant expressions for the correlators X (x 1 ,y 1 ),(x 2 ,y 2 ) and P (x 1 ,y 1 ),(x 2 ,y 2 ) corresponding to the vacuum state are given by [29] In the case of the free fermion, we consider fields ψ i , i = 1, . . . , N defined at the lattice points and satisfying canonical anticommutation relations, {ψ i , ψ † j } = δ ij . Given a Gaussian state ρ, we define the matrix of correlators D ij ≡ tr(ρψ i ψ † j ). Then, similarly to the scalars case, the EE for some region A can be obtained from the restriction of D ij to the corresponding lattice sites as [29] (3.35) The lattice Hamiltonian is given by 36) and the correlators in the vacuum state read [29] D (n,k),(j,l) = 1 2 δ n,j δ kl − π −π dx π −π dy sin(x)γ 0 γ 1 + sin(y)γ 0 γ 2 8π 2 sin 2 x + sin 2 y e i(x(n−j)+y(k−l)) . (3.37) As a warm up, we perform calculations for disk regions -see also [51], where this was previously done. In that case, the results for F 0 are known analytically both for the scalar and the fermion -see (3.14) and (3.15) above. In the lattice, we consider pairs of disks of different radii and the corresponding annuli and use (3.22) to extract our results. The results have a numerical uncertainty associated to the finiteness of both ε/R and δ/ε. This comes from the difficulty of achieving ε/R 1 while keeping δ/ε 1. Naturally, considering sufficiently large values of ε requires taking disks with large enough radii, which increases the computation time. In Fig. 4 we present various data points of F 0 (normalized by the analytic result) for both the scalar and the fermion as a function of b/ε. We take ε/δ = 6, 8, 10 and values of disk radius up to R/δ = 58. As we can see, the results oscillate considerably for both models. In order to extract a number for each of them, we proceed in two ways. First, we take the two values of F (e) corresponding to the largest R/δ considered for each of the three values of ε, and take the average of those six values. Then, we multiply them by 1 plus a small correction factor (∼ 0.05) which we obtain in Appendix A based on the EMI model and which should approximately account for the fact that we are considering not too large values of R/δ for the given values of ε/δ. Secondly, we perform fits {1/x, 1} to the series of data points for each ε and take the average of the three values. We find from these two methods, respectively, The results are in very good agreement with the expectations, although we observe that the scalar one obtained from the fits is still underestimating the actual result. Moving on to the ellipses, we perform calculations for eccentricities e = √ 3/2 0.866, e = 2 √ 2/3 0.943, e = √ 15/4 0.968 and e = 2 √ 6/5 0.98. These belong to the region in which the almost-round and very-thin approximations -(3.2) and (3.3) respectively-do not work so well. As mentioned earlier, the pseudoellipses method yields better results than the variable-ε one, so we use the former to perform our calculations. The results are shown in Fig. 4. Proceeding similarly to the disks, in order to extract approximations for large-b/δ results we make fits to the data for fixed values of ε and take the average. Alternatively, we also consider for each ε the values obtained for the two greatest values of b/δ, take the average of the six and multiply by the correction factor. The results appear in (3.17) above. The fermion data points look somewhat more harmonious than the scalar ones and the data series produce more similar predictions. On the other hand, as mentioned in the previous subsection, our candidate trial function F (e) | tri 1 -which yielded a very good agreement with the exact EMI results-produces results which also agree very well with the fermion ones in the lattice, but not so for the scalar. It is possible that the methods we are using are still insufficient to account for the underestimation of the results associated to the finite-R/δ limitations in the case of the free scalar. If that is case, the predictions of F scal (e) | tri 1 are probably more accurate than our lattice results. Alternatively, it may also be that the idea of accurately reproducing F (e) for any CFT with a single function completely determined by k 3 , C T and F 0 does not work so well and one rather requires a set of functions -e.g., F scal (e) | tri 1 , F scal (e) | tri 2 and the intermediate curves between them, as displayed in Fig. 2-to precisely account for all possible theories.
Regarding the fundamental question which is subject of study in the present paper, the evidence gathered from holography [35], the EMI model, the lattice results for free fields and the two limits (e → 0 and e → 1) for general theories makes it clear that F (e) is a monotonically increasing function of the eccentricity for a given CFT -and therefore F (e) ≥ F 0 for all e for general CFTs. As a final comment for this section, we also point out that the methods utilized here for computing EE using MI should be similarly applicable to other regions beyond ellipses. In that case, the constant-ε method appears to be the best choice.

More general shapes in the EMI model
In order to obtain explicit results for more complicated entangling regions, we consider now the so-called "Extensive mutual information model" (EMI) [55]. This follows from considering a general formula for the mutual information compatible with the known general axioms satisfied by this measure in a general QFT -see e.g., [50]-plus the additional requirement that it is an extensive function of its arguments, namely, I(A, B) + I(A, C) = I(A, BC) for any A, B, C. This model corresponds to a free fermion in d = 2 [58] but does not describe the mutual information of any actual theory (or limit of theories) for d ≥ 3, as recently shown [50]. Nonetheless, the fact that it satisfies all known properties expected for a valid MI -in particular, some highly non trivial, such as monotonicity: B ⊆ C ⇒ I(A, B) ≤ I(A, C)makes it a useful toy model which has shown to produce qualitatively and quantitatively reasonable results in various cases [5,30,51,59,60].
In the EMI model, the entanglement entropy of a region A in a time slice of 3-dimensional Minkowski spacetime is given by where the integrals are both along the entangling surface ∂A, n is a unit normal vector and κ (3) is a positive parameter. There are different ways to regulate the above integrals, which otherwise diverge as r 1 → r 2 . For our purposes, it will be convenient to introduce a UV regulator δ along an auxiliary extra dimension, so that we replace |r 1 − r 2 | 2 → |r 1 − r 2 | 2 + δ 2 with δ all the rest of scales. Now, in order to obtain a computationally useful formula for the universal constant term, F EMI (A), note from (1.1) that, on general grounds, i.e., the combination in the r.h.s. is equivalent to F (A) up to terms which vanish as the regulator is taken to zero. Alternatively, we can also subtract the area-law piece, fixing first the value of c 0 by computing the EE in a particular example. Then, we trivially have Observe that while c 0 is regulator-dependent, it does not change as long as we perform all our calculations with the same regulator. Using (4.2) we find Alternatively, we can use (4.3). In order to fix c 0 , we can evaluate the EE for a simple region like a strip, or a disk. By doing so, we find that c 0 = π for the regulator introduced above, and we can write 10 Now, in order to make progress, let us choose a particular set of coordinates. We parametrize our entangling surfaces as where f (θ i ) is a function of the polar angle, θ ∈ [0, 2π). Naturally, in the case of a disk, f (θ i ) = R, where R is just its radius. In general we have 10 One could try to extract a δ-independent expression for F EMI (A), e.g., by rewriting (4.4) as trading the derivatives with the integrals and finally taking the δ → 0 limit. Unfortunately, we have not been able to obtain anything too useful from this approach.
where we used the notation f i ≡ f (θ i ). As for the normal vectors, one finds Using these we get Plugging the above expressions back in (4.4) or (4.6), we obtain an explicit formula for the universal term, up to O(δ) terms. Let us first consider the simplest possible case, corresponding to a disk region. Then, we have f i = R, and the expression for F EMI simplifies drastically, (4.11) The integrals can be explicitly performed and the result reads (4.12) Alternatively, we can use (4.6). The first integral is trivially 2πR and, for the second, we can use the symmetry of the disk to fix r 2 = (R, 0) n 2 = (1, 0). Then, we have finding the same answer. Now, in order to consider more complicated figures, we will numerically integrate (4.4) and (4.6) for various values of δ 1 and obtain a linear fit to the resulting data -which is indeed approximately linear in δ. The values of F EMI are in each case obtained as the δ = 0 limits of the fits.

Slightly deformed disks
The next-to-simplest case corresponds to small deformations of the disk region like the ones considered in (2.1). In that case, we can compare our numerical results to Mezei's formula (2.2). In the case of the EMI model, the stress-tensor two-point function coefficient takes the value [50] C EMI This can be obtained by considering an entangling region with a straight corner which, for general CFTs, contains a universal term of the form [61,62] S EE | univ = −a(θ) log( /δ) , Figure 5: We plot the universal piece F EMI for small deformations of a disk region corresponding to (4.26) for = 2, 3, 4, 5, 6 as a function of the small parameter . The blue dots are obtained numerically integrating (4.4) in each case. The red lines correspond to the leading-order parabolic approximations produced by Mezei's formula (4.19).
where the function of the opening angle a(θ) satisfies [30] a(θ) = π 2 24 C T (θ − π) 2 + . . . for almost smooth corners. In the EMI case, an explicit calculation produces [55,63] a EMI (θ) = 2κ (3) [1 + (π − θ) cot θ] , (4.18) from which (4.15) easily follows. Combining this with (2.2) we have In order to test the validity of this formula, we start by inserting (2.1) in (4.4). Even though we have not succeeded in performing the integrals analytically in full generaly, we manage to do so in a case-by-case basis for individual values of . We start by expanding around = 0 inside the integrand and then take the limit δ → 0 of the O( ), O( 2 ) and O( 3 ) terms, which turn out to be finite. The resulting expressions can be integrated analytically, and we find that the O( ) and O( 3 ) pieces vanish, as expected on general grounds, and an exact match with (4.19) for the quadratic term. We can also use formula (4.19) to perform some checks on the numerical methods that we use below for arbitrary (non-perturbative) shapes. In order to extract F EMI from (4.4) or (4.5) for a given f (θ), we numerically integrate the corresponding expressions for various small values of δ and perform a linear fit of the resulting data points. The value of F EMI is then obtained as the y-intercept of the corresponding line. Doing this for a family of the form 20) for several values of we obtain results (essentially identical for both (4.4) and (4.5)) which are always compatible with Mezei's formula for small enough values of , as shown in Fig. 5.
There, we observe that as grows, the quadratic approximation becomes worse. This is not surprising: while the coefficient of the quadratic term grows like 3 , it is natural to expect higher powers of to appear in the quartic and higher-order pieces. Fitting the data points to expansions involving even powers of , we can extract the quadratic coefficient and compare with (4.19). The results arē As we can see, the numerics do an excellent job in reproducing the exact coefficients in all cases. A similar analysis in the case of the other family of basis functions, sin( θ)/ √ π, displays an analogous match with the exact formula for the quadratic piece. We are therefore confident that we can trust our numerical approach and proceed to study regions with non-perturbative shapes.

More general shapes
Let us move on and consider now more general shapes, which do not correspond to small perturbations of a disk. As anticipated earlier, once we specify the entangling surface via f (θ), we perform numerical integrations of (4.4) and (4.5) and extract F EMI from the δ → 0 limit. It is not obvious at first sight which regions will have a greater F . In particular, as mentioned earlier, shapes related by conformal transformations share the same F . We do know, nevertheless, that shapes including thin portions will tend to increase F -see (2.3)and that shapes which are similar enough to disks will have small F 's. Note though that disks deformed by relatively small bumps can modify F considerably, as shown in Fig. 1.
As our first family, we consider regions defined by the equation for various values of a, b. We plot the ones corresponding to a = 1/2, 1, . . . , 5/2, b = 1/2, 1 . . . , 5/2 in the upper row of Fig. 6. In the same figure, we show the results obtained for F EMI for half-integer values of b as a function of a. We observe that in all cases, the curves which includes ellipses of eccentricity e as particular cases for b = 1. We plot the results for F EMI for various values of b as a function of e in the right plot of the lower row of Fig. 6. Once again, we find that all shapes produce results which lie above the disk result and which tend to grow monotonically as the parameters make them become increasingly different from it. Using the ellipse results and the formulas obtained in Section 3, we can perform another check of the validity of the numerics beyond the perturbative level. In particular, we know that for sufficiently squashed ellipses, the result should approach (3.3) for general CFTs. In the case of the EMI model, the strip coefficient k (3) can be computed analytically, and the result reads k e=0.99 = 71.3κ (3) , which is already very close (∼ 0.8% off). The match improves as e grows. For instance, we have F EMI |   e=0.9999 = 697.9, which differ by ∼ 0.2% and ∼ 0.05% respectively. Naturally, there is a priori no limit in the complexity of the shapes we can probe. We have considered a variety of less symmetric regions and gathered some of the results for F EMI in Fig. 7. As we can see, the values can vary considerably from shape to shape in a way which is far from obvious by looking at the geometry of the figures.
As an approximate guiding rule, it turns out that we can establish an analogy with the quantity Similarly to the property we wish to test for F , this geometric quantity satisfies the isoperimetric inequality, Besides, for small deformations of a disk and for very thin strips it behaves, respectively, as which are rather similar to the general expressions for F in those regimes. Hence, it is natural to wonder about possible relations between the two quantities. Obviously, while R is a fixed number for a given A, F depends on the theory under consideration, so one can only expect the analogy to be approximate at best. Besides, as opposed to F , R is not conformally invariant. In the case of the EMI model, we find that in most cases we have tested, given two entangling regions A, B such that and viceversa. However, this is not a general rule, and we have found counterexamples. For instance, if A is an ellipse of eccentricity e = 0.97 and B is a region defined by f (θ) = 1 + 15/26 · sin(2θ) 2 , we have R A /R B 1.12514 whereas F EMI A /F EMI B 0.871263. In sum, we observe that the explicit evaluation of F in a concrete model produces results which are always larger than the disk one and which can vary very considerably as we change the entangling region, even without the need of considering shapes with very thin sectors.

General proof using strong subadditivity
In this section we provide a general proof for (1.2). Namely, we establish that F is globally minimized by disk regions for general theories. Before getting there we require some preliminary results regarding general inequalities satisfied by F which follow from the strong subadditivity of EE and the behavior of F under different kinds of geometric deformations for a given entangling region. These are included in the first two subsections. The proof is presented in subsection 5.3.

Strong superadditivity of F
The entanglement entropy of two entangling regions γ A and γ B satisfies the strong subadditivity property 11 In the case of three-dimensional QFTs, this implies for the universal term This follows from (5.1) because the perimeters cancel in the combination. We call this property of F "strong supperadditivity" (SSA), having the opposite sign to strong subadditivity because of the conventional sign of F in (1.1). The inequality (5.2) holds even if the intersection and union do not have smooth boundaries. If these boundaries have discontinuous first derivatives, the difference between the two sides of the inequality is in fact infinite. As mentioned earlier, here we will need only be concerned with smooth intersections and unions. For pure states, the entropy for a region γ is equal to the one of its complementγ, and this implies F (γ) = F (γ) .
where the difference γ A − γ B stands for the relative complement γ A ∩ γ B . We will consider regions lying in the plane t = 0. In this case the inequality (5.2) cannot be nontrivially saturated in QFT. This is a consequence of the Reeh-Schlieder property. The saturation of (5.2) is called the Markov property and only occurs in special situations such as regions with boundary in the null plane, or in CFTs for regions with boundaries in the null cone [64,65].

Geometric perturbations of F and 4-expandable functionals
In order to show that the property of SSA implies that F is globally minimized for circles in a CFT, a preliminary step concerns the behavior of the functional F under small perturbations. We consider only smooth boundaries for entangling regions and call S the space of curves in the plane which are boundaries of compact regions with smooth and finite boundary. Let s be a length parameter along γ, and call η(s) to the outward pointing normal unit vector. We call a perturbation of γ ∈ S to a set of curves γ ∈ S, ∈ [0, 0 ], given by where the smooth function δ (s) satisfies δ ≤ , and where we write the uniform norm h = max s |h(s)|. If the perturbation is given by for a fixed function h(s), such that the function with all its derivatives go to zero with with the same velocity, we should have a power expansion for F of the form The question we want to address is what happens for more general perturbations where the derivatives of δ (s) do not go to zero with the same velocity, or even do not go to zero at all. Physically, only ultraviolet entanglement can be sensitive to the derivatives of δ as the perturbation size goes to zero. Then, it is clear that the non-local part of the functional F , which relates the perturbation at a point with the shape of the curve or the perturbation far away, must still satisfy (5.7) with the first two terms going to zero as and 2 respectively. To understand local contributions due to short length entanglement we can forget about the global shape of γ. These local terms are included for example in the distributional nature of the kernel A γ 2 (s 1 , s 2 ) in the vicinity of coincidence points. Locally, scale invariance implies (for dimensional reasons) the form In momentum space the above term writes for large momentum dp |δ (p)| 2 p 3 , (5.9) which is precisely the form of the leading angular momentum term for deformations of disk regions -see eq. (2.2). The coefficient of this singular term must be independent of the precise form of γ and is proportional to C T [9]. For comparison, the perimeter functional, which is not scale invariant, has the milder leading local term dp |δ (p)| 2 p 2 . (5.10) This has dimensions of length sinceδ has length-square dimensions. Analogously, we can analyze the behavior of possible higher non-linear terms in the perturbation. In the scale-invariant case, in momentum space we have a leading behavior for large momentum given by dp |δ (p)| n p 2n−1 . (5.11) For the sake of the proof below, we are interested in understanding perturbations of the form δ (s) = λ a h(s/λ), where we have written = λ a , a > 0. For these, the derivatives go as δ (k) ∼ λ a−k . In momentum space,δ (p) ∼ λ a+1 , p ∼ λ −1 . The leading local term of n th order (5.11) goes as λ n(a−1) . On the other hand, the non-local pieces scale as λ (a+1)n . For these perturbations, in order to separate the first term (n = 1) of (5.7), from the second (n = 2), we need a > 3. 12 We will use a = 4 in the proof. For a = 4, the first term in (5.7) is order λ 5 , while the second (the leading local piece) is order λ 6 . The rest of the contributions start at order λ 7 . This motivates the following definition.

Definition:
We call a functional f : S → R "a-expandable" if for any perturbation δ λ (s) of any γ ∈ S, such that δ (k) λ ∼ λ a−k as λ → 0, the following expansion is valid where the second-order term is higher order in λ than the linear term, and the rest is higher order than the second term.
Naturally, F is an example of a 4-expandable functional.

F is globally minimized by disk regions
With this understanding we are in position to prove the minimality of F for disk regions. In order to highlight the geometric nature of the proof we state it for general functionals.
Theorem: If an Euclidean invariant 13 4-expandable functional F on S is strong superadditive and constant for disks, then F is globally minimized for disks.
Proof: First we deal with regions with non-trivial topology. Suppose γ has more than one connected components γ = γ 1 ∪ · · · ∪ γ n . From SSA This shows that multicomponent regions have larger F than certain single-component ones. Then suppose γ is a single-component region with holes, γ = γ 0 − (γ 1 ∪ · · · ∪ γ n ), where γ 0 , γ 1 · · · , γ n are single-component simply connected regions. From (5.4) it follows which shows that F (γ) is greater or equal than some single-component region without holes. Therefore, in order to prove the theorem, we can from now on restrict ourselves to singlecomponent simply connected regions. Consider then a region γ and another regionγ included in it,γ ⊆ γ. Assumeγ is osculating to γ, that is, both curves are tangent to each other and at the point of contactwhich we can set to the length parameter s = 0 for both curves-their curvatures are equal, γ (0) = γ (0). Since one curve is included inside the other, the third derivatives must also agreeγ (0) = γ (0), and the deviation between them will be fourth order |γ(s) − γ(s)| ∼ |s| 4 near |s| = 0. Around the point s = 0 we consider a family of deformationsγ λ (s) ofγ, with deformation functionδ λ (s), such that the deformation has support in s ∈ (−2λ, 2λ), and the curveγ λ (s) coincides with γ(s) for s ∈ (−λ, λ). Then, the deformation can be chosen such that δ (k) λ ∼ λ 4−k . With support inside the interval s ∈ (−λ, λ) whereγ λ and γ coincide, we can place a deformation of γ given by δ λ (s) = λ 4 g(s/λ), where g is a smooth function. We also impose g < 0 such the deformation is inwards -see Fig. 8 for a representation of the geometric setup just described. We also have δ (k) λ ∼ λ 4−k . From SSA we find then 14 Expanding for small λ, we have for the first-order deviation That is, to first order, f for the larger region increases more than for the smaller one going outwards at the point of osculation. For a disk region and an Euclidean invariant F , the coefficient A disk 1 (s) has to be independent of s. Then, taking a first order variation from a disk to a scaled disk, and considering that F is constant on disks, it follows that A disk 1 (s) = 0 . (5.17) Figure 9: We show the procedure outlined in the text for finding an osculating circle c(s * ) inside a given shape.
This will be useful when combined with (5.16). In order to use this result, we have first to discuss some geometric preliminaries. We will show that for an arbitrary curve γ we can always place an osculating circle inside (or outside) it.
To show this, call the length parameter s ∈ [0, L]. Take a point s 0 and a small circle internal to γ and tangent to γ at s 0 -see Fig. 9. By increasing the radius of this circle and keeping it tangent to γ at s 0 we arrive to a unique circle c(s 0 ) which is still included in γ and is tangent to γ at s 0 and at (at least) another point l(s 0 ). We can choose the length parameter such that s 0 > 0, l(s 0 ) < L, s 0 < l(s 0 ). We define the function l(s) for another l(s 0 ) > s > s 0 in a similar way, by taking the largest circle c(s) tangent at s and internal to γ, and where l(s) is the smallest length parameter (greater than s) among the points at which the circle c(s) is tangent to γ. It is clear that moving s from s 0 to larger values, l(s) can only decrease because γ([s, l(s)]) and a segment of the circle c(s) from the point s to l(s) defines a closed curve that divides the plane in two regions. Any c(s ) for s < s < l(s) is included in this region and will have l(s ) < l(s). Then, there is a smallest s * > s 0 such that s * = l(s * ). This indicates that c(s * ) is an internal osculating circle to γ. At this point the osculating circle and γ have a point of contact of degree four, the curvatures of γ and c agree, and the derivative of the curvature of γ vanishes. In fact s * is a local maximum of the curvature. If it where a minimum, γ would leave part of the circle outside.
We have shown that for any γ we can place an osculating circle inside it. It is not difficult to show, using the same ideas, that we can also find an external osculating circle to γ. A shorter proof follows by using conformal transformations. We can make a conformal inversion I centered at a point inside γ. As a result γ is mapped to a closed curveγ, and the exterior of γ is mapped to the interior ofγ. Again there will be a circlec osculating toγ internally. Considering that the osculating condition is conformally invariant, and that circumferences are mapped to circumferences, by inverting back to the original plane we obtain that there is a circumference c = I(c) osculating to γ, and this circumference is completely included in the exterior of γ. The corresponding circle either lies completely outside γ, in which case the point of contact is a local minimum of curvature for γ, with negative curvature, or the circle includes γ completely, in which case the contact point is a positive curvature local minimum. 15 Hence, we can find a point of local curvature maximum and another of local curvature minimum where we can place internal and external osculating circles. By direct application of the above calculation to the curve γ and these osculating circles we get the inequality (5.16) for the A 1 coefficients of both curves. From the equation (5.17) for circles, we find that for any γ there is at least one point of local curvature maximum where A 1 ≥ 0 and one point of local curvature minimum where A 1 ≤ 0. If |A 1 | > 0, the sign of A 1 is kept constant in a finite interval around the point, whereas for A 1 = 0 it can change sign there. We can use a perturbation of order , with all derivatives of the same order, to increase the size of the local curvature minimum or decrease the size of the local curvature maximum, by pushing the curve to the outside or the inside respectively. 16 If |A 1 | > 0 we can then use the expansion (5.12) for this perturbation and check that F always decreases. In the case A 1 = 0 the perturbation can be chosen such that the change in F vanishes to the order of the change in the curvature. For any γ, we have some curvature maxima {r + 1 , · · · , r + k } and minima {r − 1 , · · · , r − k }, and define Then we have shown that for any γ there is a γ with K(γ ) < K(γ) and F (γ ) ≤ F (γ). Hence, the function has to be non decreasing with q. In particular, the absolute minimum of F has to be achieved for q = 0, corresponding to a circle.
Therefore, the F term in the entanglement entropy of three-dimensional CFTs is globally minimized for disk regions. Except for the particular continuity property of the functional that we used, which is natural for conformal invariant functionals, we did not need to invoke conformal invariance but just Euclidean invariance, plus the requirement that the functional is constant for disks. The proof should go over to the Lorentzian case as well. In that case, however, all deformations of the circle on its light cone have the same F . Another comment is that having an a-expandable functional for a > 4 and not expandability for a = 4 (i.e., more singular functionals) is not enough for the proof, since we need to match the boundaries of a deformed osculating circle with the curve, which requires perturbations with a = 4. In principle, lower a-expandability (softer functionals) -such as the isoperimetric ratio R(A) defined in eq. (4.29), having 3-expandability-will be enough. Note however that R(A) is not strong superadditive and we cannot use the same proof as for the isoperimetric inequality. If we have 2-expandability instead, we could modify the proof above 15 The two cases depend on whether the center of inversion lies inside or outside the osculating circlec. 16 This can be done, for example, by replacing a small interval of the curve around the point of osculation by a segment of a circle tangent at two points at each side of it, and smearing up the points of contact to make it smooth. The same can be done if the minimum or maximum of curvature correspond to a segment of a circle rather than a point. taking just tangent circles instead of osculating ones. However, we always have both interior and exterior tangent circles to a given point, and that would imply a vanishing first order coefficient A 1 at any point. Then, the only possibility for a SSA functional to be 2-expandable is that it is constant under deformations.

Final comments
The main results of the paper appear summarized at the end of the introduction. Let us close with some final comments.
As mentioned in the introduction, the analogous four-dimensional problem was studied in [21] and [23]. The outcome of the study was in that case that the round sphere is the entangling surface which minimizes the logarithmic coefficient of the EE for general theories as long as g = 0, 1 but that, rather surprisingly, for theories with a > c the coefficient could be lowered arbitrarily by considering certain surfaces of increasing genus [23] -see the introduction for more comments. It is then natural to wonder what happens in threedimensions with the analogous question for entangling regions with a fixed number of holes. In this case, however, the answer turns out to be more trivial. For instance, for annuli regions defined by pairs of radii from the set [66]. This implies that F will always decrease as we make the inner radius smaller with respect to the outer one, being F 0 the limiting result as R 1 /R 3 → 0. In fact, for regions with non-trivial topology, our general bound (1.2) can be improved. Imagine we have a region with n disconnected subregions γ = ∪ n j γ j , each one of which has m j holes, γ j = γ j,0 − (γ j,1 ∪ · · · ∪ γ j,m j ). Then, using (5.13), (5.14) and (1.2) we have This follows from applying F ≥ F 0 repeatedly to each single-component simply connected piece. In the last equality we rewrote n+ n j m j as n B , which is the total number of boundaries of the region. In words, for a region with n B boundaries, F is bounded below not only by the disk result F 0 , but by n B times F 0 . Saturation of the inequality occurs only for purely topological theories, F topo (γ) = n B F topo 0 . It is also natural to wonder about the problem analogous to the one considered here for d = 4 + 1 and d = 5 + 1 theories. In the latter case, the geometric nature of the universal logarithmic term for smooth entangling regions [3,4] should make it amenable to an analysis similar to the one in [21,23] for the d = 3 + 1 case. Note however that in that case the sign of the different geometric contributions weighted by the trace-anomaly coefficients is not obvious -at least at first sight-so the situation is trickier. The d = 4 + 1 case is more challenging but methods similar to the ones considered here may be useful. Naturally, in all cases the expectation is that entangling regions bounded by round (hyper)spheres globally maximize the EE. This expectation is supported by partial evidence from holographic theories [21].

Acknowledgments
The work of P.B. and H.C. was supported by the Simons foundation through the It From Qubit Simons collaboration. H.C. was also supported by CONICET, CNEA, and Universidad Nacional de Cuyo, Argentina. The work of JM is funded by the Agencia Nacional de Investigación y Desarrollo (ANID) Scholarship No. 21190234 and by Pontificia Universidad Católica de Valparaíso. JM is also grateful to the QMAP faculty for their hospitality.

A Mutual information regularization of EE in the EMI model
In Section 4 we computed F numerically for many kinds of entangling regions in the EMI model. In order to do so, we introduced a small regulator δ along an auxiliary extra dimension, evaluated F EMI (A) as a function of δ and then extracted the δ → 0 limit. However, as we explained in Section 3 there is an alternative way of regularizing EE that makes use of the MI of concentric regions and which yields converging results in the lattice. In this appendix we explore this alternative method for the EMI model in the case of elliptic entangling regions for the two geometric setups explained in Section 3, namely: fixing a constant ε, and forcing the concentric regions to be ellipses. This will allow us to test to what extent we may expect both methods to differ in general when the size of the entangling regions cannot be considered to be extremely larger than the separation between the auxiliary regions -which is always the case in the lattice. We will also verify the convergence of both methods between themselves and with the δ method used in the main text.
In the three-dimensional EMI model, the mutual information between two regions A, B in a fixed time slice is given by where the normal vectors are defined outwards to each region, respectively. This formula is manifestly non-divergent, as r A and r B correspond now to different regions. For regions A, B parametrized by [x A,B (t), y A,B (t)] we have Applied to the setup required for regularizing the EE, as in (3.20), we need to evaluate the mutual information I EMI (A + , A − ), which involves integrals over ∂A + = ∂A + and ∂A − . Using the EE formula for the EMI model, (4.1), it is easy to see that in I EMI (A + , A − ) the contributions from S EMI EE (A + ) and S EMI EE (A − ) cancel with the terms in S EMI EE (A + ∪ A − ) which involve performing both integrals over ∂A + or both integrals over ∂A − . The only terms which survive are the ones which involve one integral over ∂A + and the other one over ∂A − , in agreement with (A.2). In the case in which we keep ε constant for all points of the ellipse boundary, the relevant formula is where the parametrization of the pseudoellipses appears in (3.25) and k EMI is defined in (4.28). On the other hand, if we force the auxiliary regions to be ellipses, the relevant formula reads where a = (a 1 + a 2 )/2, b = (b 1 + b 2 )/2 and the formula for ε(t) appears in (3.29).
As a first check, we have verified that both (A.3) and (A.4) produce results essentially identical to the ones obtained using the δ method for sufficiently large values of b/ε -e.g., b/ε ∼ 100 yields excellent results in all cases. On the other hand, practical limitations force us to consider smaller values of such a quotient when performing calculations in the lattice (b/ε ∼ 10). In Fig. 10 we have plotted the results achieved using both methods for values of b similar to the ones considered in our lattice calculations in Section 3. As we can see, in both cases the results tend to underestimate considerably the actual ones -this underestimation tends to be greater as the eccentricity grows. Importantly, we observe that the pseudoellipses method makes a better job in approximating the actual results than the variable-ε one. The difference between both methods becomes rather considerable for greater values of the eccentricity. For each eccentricity and each value of b, we can observe what is the factor we need to multiply the corresponding EMI result by in order to obtain the exact answer F EMI (e) . We use such factors in Section 3 to correct the lattice results obtained for the corresponding values of b/δ.