1 Introduction

Black holes are fundamental objects in Einstein’s general relativity. The spatial size of a black hole is usually characterized by its horizon; however, in classical theories the horizon cannot be directly observed either locally or from asymptotic infinity. A few recent arguments (e.g. see Refs. [1, 2]) suggest that quantum effects may render the horizon locally observable, but this topic remains controversial. There is another special surface, the “photon sphere”, where gravity is also so strong that photons are forced to travel in orbits [3,4,5]. Unlike the horizon, the photon sphere allows some photons to escape, which makes it observable. The photon sphere plays a key role in gravitational lensing [6, 7] and in the ringdown of waves around a black hole [8]. It is also related to the characteristic (quasinormal) resonances of black-hole spacetimes [5, 9,10,11]. For a Schwarzschild black hole of mass M, the radius of the photon sphere is 3M. The outermost photon sphere is unstable and casts a “shadow” for an observer at asymptotic infinity. Recently, the first picture of a black hole shadow was taken [12], which gave us a direct impression of the size and shape of the black hole.

Owing to their significance in astrophysical observations, it is important to study photon spheres and their shadows. Although the classical properties of the horizon have been well studied, photon spheres and shadows still lack extensive investigation. For a spherically symmetric black hole of mass M, Hod proved that for Einstein gravity coupled to matter satisfying the weak energy condition and the negative trace energy condition, the innermost photon sphere radius \(r_{\mathrm {ph,in}}\) and the total mass M satisfy [13]

$$\begin{aligned} r_{\mathrm {ph,in}}\le 3M. \end{aligned}$$
(1.1)

Using the same energy conditions, Ref. [14] proved a relationship between the innermost photon sphere and its shadow radius: \(r_{\mathrm {sh,in}}\ge \sqrt{3}r_{\mathrm {ph,in}}\). A lower bound \(r_{\mathrm {ph,in}}\ge 2M\) was also conjectured by Hod [15], but a counterexample was found in Ref. [14].

For observational purposes, it is more relevant to consider the outermost photon sphere. Hod’s proof does not apply to the outermost one when there are multiple photon spheres, which do exist in black holes satisfying the dominant energy condition [16]. Recently, a series of universal inequalities for the outermost photon sphere was proposed [17, 18]

$$\begin{aligned} 3r_{+}/2\le r_{\mathrm {ph,out}}\le r_{\mathrm {sh,out}}/\sqrt{3}\le 3M. \end{aligned}$$
(1.2)

Here \(r_+\) is the radius of the horizon. Refs. [17, 18] verified these inequalities for many different black holes. Their generalization to higher dimensions was discussed in [19].

In this paper, we first prove the inequalities (1.2) for static, spherically symmetric black holes in Einstein gravity, for matter fields satisfying a few simple requirements. We then consider more general static configurations and define the corresponding “photon sphere” and “outermost” photon sphere. We conjecture that the area of the outermost photon sphere \(A_{\mathrm {ph,out}}\), the corresponding shadow area \(A_{\mathrm {sh,out}}\) and the horizon area \(A_{{\mathcal {H}}}\) (if a horizon exists) also satisfy a series of universal inequalities, sandwiched within the Penrose inequality:

$$\begin{aligned} 9A_{{\mathcal {H}}}/4\le A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\le 36\pi M^2. \end{aligned}$$
(1.3)

Although we do not yet have a full proof of Eq. (1.3), we give several pieces of evidence to support it. We also show that, similar to the horizon, photon spheres in static spacetimes are conformally invariant structures.

2 Spherically symmetric case

We first present the full proof of (1.2) for spherically symmetric metrics in \((3+1)\) dimensions, which read

$$\begin{aligned} \text {d}s^2=-f(r)e^{-\chi (r)}\text {d}t^2+\frac{\text {d}r^2}{f(r)}+r^2\text {d}\Omega _2^2\,. \end{aligned}$$
(2.1)

Here \(f(r_+)=0\) and f(r) is positive when \(r>r_+\). Einstein’s equation reduces to the following three equations,

$$\begin{aligned} f'= & {} -8\pi r\rho +\frac{1-f}{r},\qquad \chi '=-\frac{8\pi r}{f}(\rho +p_r)\,, \end{aligned}$$
(2.2)
$$\begin{aligned} p_r'= & {} \frac{1}{2fr}[\mathcal{N} (\rho +p_r)+2fT-8fp_r]\,, \end{aligned}$$
(2.3)

where \({\mathcal {N}}:=3f-1-8\pi r^2p_r\), and \(p_r\), \(\rho \), T are radial pressure, energy density and the trace of stress tensor respectively. We require \(r^3p_r(r)\rightarrow 0\) and \(r^3\rho (r)\rightarrow 0\) when \(r\rightarrow \infty \). Photon spheres are determined by \(U'=0\), where

$$\begin{aligned} U(r):=f(r)e^{-\chi (r)}/r^2\,,\quad \hbox {and}\quad U'=-{\mathcal {N}} e^{-\chi }/r^3\,. \end{aligned}$$
(2.4)

Since U vanishes at the horizon, is positive outside it and tends to zero as \(r\rightarrow \infty \), it must have at least one extremum. Multiple extrema can also arise, in which case their number is odd, and the radii of photon spheres satisfy \({\mathcal {N}}=0\) [13]. Furthermore, we must have \({\mathcal {N}}>0\) and \(U'<0\) if \(r>r_{\mathrm {ph,out}}\).
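As a numerical illustration of Eq. (2.4), the following minimal sketch checks the identity \(U'=-{\mathcal {N}}e^{-\chi }/r^3\) by finite differences and locates the zero of \({\mathcal {N}}\) outside the horizon. It assumes a Reissner–Nordström metric as a test case (\(\chi =0\), \(f=1-2M/r+Q^2/r^2\), \(8\pi r^2p_r=-Q^2/r^2\)); the values M = 1, Q = 0.5 and all helper names are illustrative choices, not part of the derivation above.

```python
import math

M, Q = 1.0, 0.5  # illustrative mass and charge (assumed values)

def f(r):
    # Reissner-Nordstrom metric function, used here only as a test case
    return 1.0 - 2.0 * M / r + Q**2 / r**2

def U(r):
    # U = f * exp(-chi) / r^2 with chi = 0
    return f(r) / r**2

def N(r):
    # N = 3f - 1 - 8*pi*r^2*p_r, with 8*pi*r^2*p_r = -Q^2/r^2
    return 3.0 * f(r) - 1.0 + Q**2 / r**2

def dU(r, h=1e-5):
    # central finite difference for U'
    return (U(r + h) - U(r - h)) / (2.0 * h)

def bisect(g, a, b, tol=1e-12):
    # plain bisection; assumes g(a) and g(b) have opposite signs
    ga = g(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        gm = g(m)
        if (ga < 0) == (gm < 0):
            a, ga = m, gm
        else:
            b = m
    return 0.5 * (a + b)

r_plus = M + math.sqrt(M**2 - Q**2)          # outer horizon
r_ph = bisect(N, r_plus, 10.0 * M)           # photon sphere: N = 0
max_identity_error = max(abs(dU(r) + N(r) / r**3) for r in (2.0, 3.0, 5.0, 10.0))
print(r_ph, max_identity_error)
```

For this test case \({\mathcal {N}}=2-6M/r+4Q^2/r^2\), so the root found by bisection can be compared with the closed form \(\frac{1}{2}(3M+\sqrt{9M^2-8Q^2})\).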

2.1 Proof of upper bounds

Here we show that if both the weak and strong energy conditions are satisfied for \(r\ge r_{\mathrm {ph,out}}\), we have

$$\begin{aligned} r_{\mathrm {ph,out}}\le r_{\mathrm {sh,out}}/\sqrt{3}\le 3M\,. \end{aligned}$$
(2.5)

The radius of shadow \(r_{\mathrm {sh,out}}\) is related to the outermost photon sphere by \(r_{\mathrm {sh,out}}=1/\sqrt{U(r_{\mathrm {ph,out}})}\) [14].
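To make the relation \(r_{\mathrm {sh,out}}=1/\sqrt{U(r_{\mathrm {ph,out}})}\) concrete, the sketch below evaluates the horizon, photon-sphere and shadow radii for Reissner–Nordström black holes and checks the full chain (1.2) for a few charges. The RN family is used only as a familiar test case; the function names and the sampled charges are assumptions made for this illustration.

```python
import math

def rn_radii(M, Q):
    """Horizon, outermost photon-sphere and shadow radii of an RN black hole."""
    r_plus = M + math.sqrt(M**2 - Q**2)
    r_ph = 0.5 * (3.0 * M + math.sqrt(9.0 * M**2 - 8.0 * Q**2))
    f_ph = 1.0 - 2.0 * M / r_ph + Q**2 / r_ph**2
    r_sh = r_ph / math.sqrt(f_ph)  # r_sh = 1/sqrt(U(r_ph)) with U = f/r^2
    return r_plus, r_ph, r_sh

def chain_holds(M, Q, eps=1e-12):
    """Check 3 r_+/2 <= r_ph <= r_sh/sqrt(3) <= 3M up to rounding error eps."""
    r_plus, r_ph, r_sh = rn_radii(M, Q)
    return (1.5 * r_plus <= r_ph + eps
            and r_ph <= r_sh / math.sqrt(3.0) + eps
            and r_sh / math.sqrt(3.0) <= 3.0 * M + eps)

results = {Q: chain_holds(1.0, Q) for Q in (0.0, 0.3, 0.6, 0.9, 1.0)}
print(results)
```

The case Q = 0 is Schwarzschild, where every inequality in the chain is saturated; the small tolerance `eps` only absorbs floating-point rounding in that case.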

We introduce an auxiliary function \({\mathcal {W}}(r):=e^{-\chi (r)}[1+8\pi r^2p_r(r)]\), which satisfies the following relations

$$\begin{aligned} U(r_{\mathrm {ph,out}})={\mathcal {W}}(r_{\mathrm {ph,out}})/ (3r_{\mathrm {ph,out}}^2),~~(r^3U)'={\mathcal {W}}\,. \end{aligned}$$
(2.6)

The key is to show that \({\mathcal {W}}(r)\le 1\) when \(r\ge r_{\mathrm {ph,out}}\). Einstein’s equation and the null energy condition imply \(\chi \ge 0\), because \(\chi '\le 0\) by Eq. (2.2) and \(\chi \rightarrow 0\) as \(r\rightarrow \infty \). The weak energy condition tells us \(\rho \ge 0\). Using Eq. (2.2), we see \(\max f=1-8\pi r^2 \rho |_{f'=0}\le 1\). We now split the interval \([r_{\mathrm {ph,out}},\infty )\) into two families of subintervals: \(\{I_1^+, I_2^+,\ldots \}\) where \(p_r\ge 0\) and \(\{I_1^-,I_2^-,\ldots \}\) where \(p_r\le 0\). If \(r\in I_n^-\) (here \(n=1,2,\ldots \)), we see \({\mathcal {W}}(r)\le 1\), since \(e^{-\chi }\le 1\) and \(p_r\le 0\). If \(r\in I_n^+\), i.e., \(p_r\ge 0\), we find that the derivative of \({\mathcal {W}}(r)\) is

$$\begin{aligned} {\mathcal {W}}'(r)=4\pi r e^{-\chi }[(\rho +p_r)(1+8\pi r^2p_r+f)/f+4p_T]\,. \end{aligned}$$
(2.7)

Here \(p_T:={T^{\theta }}_{\theta }={T^{\phi }}_{\phi }\) is the transverse pressure. As \(p_r\ge 0\) and \(f\le 1\), we see that \({\mathcal {W}}'\ge 4\pi r e^{-\chi }[(\rho +p_r)(1+f)/f+4p_T]\ge 8\pi r e^{-\chi }(\rho +p_r+2p_T)\). Thus the strong energy condition ensures that \({\mathcal {W}}(r)\) is non-decreasing in the interval \(I_n^+\). The maximal value of \({\mathcal {W}}(r)\) on every interval \(I_n^+\) is therefore attained at its outer endpoint, where \(p_r=0\). Thus, we also have \({\mathcal {W}}(r)\le 1\) on the interval \(I_n^+\). We can now immediately see \(r_{\mathrm {sh,out}}\ge \sqrt{3}r_{\mathrm {ph,out}}\).

To prove the upper bound on the shadow, we introduce \(\widetilde{U}=(1-2M/r)/r^2\) for the Schwarzschild black hole of the same mass. It is clear that \((r^3\widetilde{{U}})'=1\ge {\mathcal {W}}= (r^3{U})'\). As \(r\rightarrow \infty \), asymptotic flatness requires the matter fields to decay at least as follows

$$\begin{aligned} \rho \sim {\mathcal {O}}(1/r^{3+\delta _1}),~~p_r\sim {\mathcal {O}} (1/r^{3+\delta _2}),~~\delta _1>0,~~\delta _2>0\,. \end{aligned}$$
(2.8)

Then we find

$$\begin{aligned} \chi ={\mathcal {O}}(r^{-(1+\min \{\delta _1,\delta _2\})}), ~~f=1-2M/r+{\mathcal {O}}(r^{-(1+\delta _1)})\,, \end{aligned}$$

which implies

$$\begin{aligned} U=\frac{fe^{-\chi }}{r^2}=r^{-2}-\frac{2M}{r^3} +{\mathcal {O}}(r^{-(3+\min \{\delta _1,\delta _2\})})\,. \end{aligned}$$
(2.9)

Then we have \([r^3\widetilde{{U}}(r)-r^3{U}(r)]|_{r\rightarrow \infty }=0\). Since \(r^3(\widetilde{{U}}-{U})\) is non-decreasing for \(r\ge r_{\mathrm {ph,out}}\) and vanishes as \(r\rightarrow \infty \), it cannot be positive, so \(\widetilde{{U}}(r)\le {U}(r)\) when \(r\ge r_{\mathrm {ph,out}}\). As the maximum of \({\widetilde{U}}\) in the interval \([r_{\mathrm {ph,out}},\infty )\) is \(1/(27M^2)\), we see the upper bound

$$\begin{aligned} 1/(27 M^2)=\max {\widetilde{U}}\le \max U=1/r_{\mathrm {sh,out}}^2. \end{aligned}$$
(2.10)

In addition, we have the following rigidity: for all spherically symmetric static spacetimes of mass M, \(r_{\mathrm {ph,out}}=3M\) or \(r_{\mathrm {sh,out}}=3\sqrt{3}M\) arises if and only if the exterior of the photon sphere is Schwarzschild.

If we only focus on the photon sphere, the requirement can be much relaxed and we only need the null energy condition outside the outermost photon sphere. Our main tool is a new mass function

$$\begin{aligned} M(r):={\mathfrak {m}}(r,\rho )+\frac{4\pi }{3} r^3p_r(r)=\frac{r}{3} (1-{\mathcal {N}}/2)\,. \end{aligned}$$
(2.11)

Here \({\mathfrak {m}}(r,\rho )\) is the Hawking–Geroch mass [20, 21], given by

$$\begin{aligned} {\mathfrak {m}}(r,\rho ):=\frac{r}{2}[1-f(r)]=r_+/2+4\pi \int _{r_+}^rx^2\rho (x)\text {d}x\,. \end{aligned}$$
(2.12)

Applying Eqs. (2.2) and (2.3) we find an identity

$$\begin{aligned} M'(r)=\frac{8\pi r^2}{3}(\rho +p_T)+\frac{2\pi r^2}{3f}(\rho +p_r){\mathcal {N}}. \end{aligned}$$
(2.13)

For \(r\ge r_{\mathrm {ph,out}}\), \({\mathcal {N}}\ge 0\), so the null energy condition ensures \(M'\ge 0\). This shows \(M(r_{\mathrm {ph,out}})\le M(\infty )=M\). At the photon sphere \({\mathcal {N}}=0\), so we have \(M(r_{\mathrm {ph,out}})=r_{\mathrm {ph,out}}/3\). Thus, we obtain \(r_{\mathrm {ph,out}}\le 3M\). Compared to Hod’s proof of (1.1), our condition is much weaker but the conclusion is stronger (as \(r_{\mathrm {ph,out}}\ge r_{\mathrm {ph,in}}\)). This result also implies a new positive mass theorem and entropy bound for static spherically symmetric black holes: if the null energy condition is satisfied outside the black hole, the mass M must be positive and the horizon radius must be smaller than 3M (and cannot saturate this bound). In fact, the condition can be even weaker: it suffices that there exists a photon sphere outside which the null energy condition is satisfied. This is remarkable since, even in the spherical case, previous proofs often require the weak energy condition.

Stronger inequalities may be obtained for particular matter content. For example, if the matter contains the standard Maxwell field with charge Q and all the other matter fields satisfy the null energy condition, then Eq. (2.13) implies \(M'\ge 2Q^2/(3r^2)\) and so \(M(r_{\mathrm {ph,out}})\le M-2Q^2/(3r_{\mathrm {ph,out}})\), which leads to a tighter bound in the charged case \(r_{\mathrm {ph,out}}\le \frac{3M}{2}(1+\sqrt{1-8Q^2/(9M^2)})\). Thus, among all spherically symmetric black holes of the same mass and charge, the Reissner–Nordström (RN) black hole has the largest photon sphere.
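The mass function (2.11) and the charged bound can be checked directly on the RN solution, where \({\mathcal {N}}=2-6M/r+4Q^2/r^2\) and hence \(M(r)=M-2Q^2/(3r)\). The sketch below is a hedged numerical check under that assumption; the sample values M = 1, Q = 0.8, the radial grid and the variable names are illustrative only.

```python
import math

M, Q = 1.0, 0.8  # illustrative values (assumed)

def f(r):
    return 1.0 - 2.0 * M / r + Q**2 / r**2

def N(r):
    # N = 3f - 1 - 8*pi*r^2*p_r with 8*pi*r^2*p_r = -Q^2/r^2 for the Maxwell field
    return 3.0 * f(r) - 1.0 + Q**2 / r**2

def mass_function(r):
    # Eq. (2.11): M(r) = (r/3) * (1 - N/2)
    return (r / 3.0) * (1.0 - 0.5 * N(r))

# Charged bound quoted in the text; RN is expected to saturate it
r_bound = 1.5 * M * (1.0 + math.sqrt(1.0 - 8.0 * Q**2 / (9.0 * M**2)))

radii = [r_bound * (1.0 + 0.25 * k) for k in range(200)]
values = [mass_function(r) for r in radii]

is_monotonic = all(a <= b for a, b in zip(values, values[1:]))
closed_form_error = max(abs(v - (M - 2.0 * Q**2 / (3.0 * r)))
                        for r, v in zip(radii, values))
saturation_error = abs(mass_function(r_bound) - r_bound / 3.0)
print(is_monotonic, closed_form_error, saturation_error)
```

The check `saturation_error` confirms that \(M(r)=r/3\) holds exactly at the radius given by the charged bound, which is the statement that RN saturates it.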

2.2 Proof of lower bound

It follows from Eq. (2.12) that \({\mathfrak {m}}(r,\rho )\ge r_+/2\) when \(\rho \ge 0\). We then see from Eq. (2.11) that the lower bound \(3r_+/2\le r_{\mathrm {ph,out}}\) holds if the weak energy condition holds and \(p_r\ge 0\) at \(r=r_{\mathrm {ph}}\). This is generally satisfied by astronomical black holes.

In theory, many important solutions such as the RN black holes have negative \(p_{r}\). In these cases, the weak energy condition alone is not enough to ensure \(3r_+/2\le r_{\mathrm {ph,out}}\). As an example, we consider \(\chi =0\) and

$$\begin{aligned} f(r)=1-\frac{r_+}{r}-\frac{6\rho _0r_+}{\sqrt{e}r}+\rho _0e^{-r/(2r_+)}(2+4r_+/r),\nonumber \\ \end{aligned}$$
(2.14)

for which \(\rho (r)=-p_r(r)=\frac{\rho _0}{8\pi rr_+}e^{-r/(2r_+)}\) with \(\rho _0>0\), satisfying the weak energy condition. After specifying \(\rho _0=1\), we find \(r_{\mathrm {ph,out}}/r_+\approx 1.417<3/2\). However, we find that the lower bound holds if \(p_r\) and \(\rho \) satisfy an additional condition: there is a function \(\Xi (r)\) such that,

$$\begin{aligned} \forall r>r_+,~~~[r^2\Xi (r)]'\ge 0~~\mathrm {and}~-\rho \le \Xi (r)\le p_r(r)\,.\nonumber \\ \end{aligned}$$
(2.15)

This requirement is weak in the sense that we only need the existence of such a function \(\Xi (r)\). This condition admits negative pressure. For example, in Einstein–Maxwell theory, \(\rho (r)=-p(r)\propto Q^2/r^4\) and we can take \(\Xi (r)=p(r)\). The proof is as follows.

\(r^2\Xi (r)\) is a non-decreasing function outside the horizon and the null energy condition implies \(\rho \ge -\Xi \) and \(p_r\ge \Xi \) and so \(M(r)\ge {\mathfrak {m}}(r,-\Xi )+\frac{4\pi r^3}{3}\Xi \). On the other hand, we find

$$\begin{aligned} {\mathfrak {m}}(r,-\Xi )= & {} r_+/2-4\pi \int _{r_+}^rx^2\Xi (x)\text {d}x\nonumber \\\ge & {} {r_+}/2-4\pi r^2\Xi (r)(r-r_+)\,, \end{aligned}$$
(2.16)

which gives us

$$\begin{aligned} M(r)+8\pi r^3\Xi (r)/3\ge [1+8\pi r^2\Xi (r)]\, r_+/2\,. \end{aligned}$$
(2.17)

Substituting \(r=r_{\mathrm {ph,out}}\) and \(M(r_{\mathrm {ph,out}})=r_{\mathrm {ph,out}}/3\) into the above, we obtain

$$\begin{aligned} \left( r_{\mathrm {ph,out}} - 3{r_+}/2\right) [1+8\pi r_{\mathrm {ph,out}}^2\Xi (r_{\mathrm {ph,out}})]\ge 0\,. \end{aligned}$$
(2.18)

As \(r_+\) is the outermost horizon, we have \(f'(r_+)\ge 0\). Eq. (2.2) implies \(1-8\pi r^2_+\rho (r_+)\ge 0\) and therefore

$$\begin{aligned} 1+8\pi r_+^2\Xi (r_+)\ge 0\Rightarrow 1+8\pi r^2\Xi (r)\ge 0\, \end{aligned}$$
(2.19)

if \(r>r_+\). We thus prove the lower bound. This proof also applies to the stronger statement \(r_{\mathrm {ph,in}}\ge 3r_+/2\).
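The counterexample (2.14) and the role of condition (2.15) can be reproduced numerically. The sketch below works in units with \(r_+=1\) (an assumption made for convenience), uses \(\rho _0=1\) as in the text, and finds the outermost zero of \({\mathcal {N}}\) by scanning inward from a large radius. It also evaluates \(r^2\Xi \) for \(\Xi =p_r\), which is the only choice allowed by \(-\rho \le \Xi \le p_r\) when \(\rho =-p_r\). The scan range, step count and helper names are illustrative.

```python
import math

RHO0 = 1.0    # value used in the text
R_PLUS = 1.0  # units with r_+ = 1 (assumption of this sketch)

def f(r):
    # Eq. (2.14) with chi = 0
    return (1.0 - R_PLUS / r - 6.0 * RHO0 * R_PLUS / (math.sqrt(math.e) * r)
            + RHO0 * math.exp(-r / (2.0 * R_PLUS)) * (2.0 + 4.0 * R_PLUS / r))

def eight_pi_r2_pr(r):
    # p_r = -rho = -rho0 * exp(-r/(2 r_+)) / (8 pi r r_+)
    return -RHO0 * r * math.exp(-r / (2.0 * R_PLUS)) / R_PLUS

def N(r):
    return 3.0 * f(r) - 1.0 - eight_pi_r2_pr(r)

def outermost_root(g, r_min, r_max, steps=20000, tol=1e-12):
    """Scan inward from r_max and bisect the first sign change of g."""
    dr = (r_max - r_min) / steps
    hi = r_max
    for i in range(1, steps + 1):
        lo = r_max - i * dr
        if g(lo) * g(hi) <= 0.0:
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if g(mid) * g(hi) <= 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        hi = lo
    raise ValueError("no sign change found")

r_ph_out = outermost_root(N, R_PLUS, 50.0 * R_PLUS)
ratio = r_ph_out / R_PLUS

def r2_xi(r):
    # r^2 * Xi for Xi = p_r
    return eight_pi_r2_pr(r) / (8.0 * math.pi)

# (2.15) needs r^2 * Xi to be non-decreasing; here it decreases near the horizon
xi_condition_fails = r2_xi(1.5 * R_PLUS) < r2_xi(1.0 * R_PLUS)
print(ratio, xi_condition_fails)
```

In this model \(r^2\Xi \propto -r\,e^{-r/(2r_+)}\) decreases for \(r<2r_+\), so condition (2.15) is violated, consistent with the lower bound failing.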

3 General static cases

3.1 Generalization of photon spheres

In this part we consider general static spacetimes, where it is more instructive to study the “photon sphere” in the spacetime rather than to focus only on its spatial projection. In a static spacetime, there is a timelike Killing vector \(\xi ^\mu =(\partial /\partial t)^\mu \) outside the horizon, and t is the time coordinate. The spacetime metric has the following 3+1 decomposition,

$$\begin{aligned} \text {d}s^2=-\phi ^2\text {d}t^2+h_{ab}\text {d}x^a\text {d}x^b\,. \end{aligned}$$
(3.1)

Here \(h_{ab}\) is the metric of the equal-t slice \(\Sigma _t\) and \(\phi ^2=\xi ^\mu \xi _\mu \). Motivated by Ref. [22], we call a connected timelike codimension-1 surface \(\Gamma =\{t\}\times {\mathcal {S}}\) a marginal transversely-trapping surface (MTTS) if \({\mathcal {S}}\) is a topological 2-sphere and any null geodesic that starts tangentially on \(\Gamma \) remains on \(\Gamma \). Let \(n^\mu \) be its outward unit normal vector and \(T^\mu \) be the tangent vector of a null geodesic. We see \(T^\mu n_\mu |_\Gamma =0\) and so \(T^\mu \nabla _\mu (n_\nu T^\nu )=0\). We thus have the condition for an MTTS [23]:

$$\begin{aligned} \forall ~\mathrm {null~tangent~vector }T^\mu , ~~K_{\mu \nu }T^\mu T^\nu =0\,. \end{aligned}$$
(3.2)

Here \(K_{\mu \nu }\) is the extrinsic curvature of the MTTS. An equal-t cross-section \({\mathcal {S}}\) of an MTTS is then a photon sphere.

We can give a more explicit condition for finding a photon sphere in a static spacetime. Assume that \({\mathfrak {R}}\) is the scalar curvature of \({\mathcal {S}}\), \(\Sigma _t\) is an equal-t slice and R is its scalar curvature, \(k_{\mu \nu }\) is the extrinsic curvature of \({\mathcal {S}}\) embedded in \(\Sigma _t\) and its trace is k, \(l^\mu \) is the unit normal vector of \(\Sigma _t\) and \((\partial /\partial t)^\mu =\phi l^\mu \), \({\hat{r}}^\mu \) is the outward unit normal vector of \({\mathcal {S}}\) embedded in \(\Sigma _t\). \((\gamma _{\mu \nu }, {\mathcal {D}}_\mu ) \) and \((h_{\mu \nu }, D_\mu )\) are the induced metrics and covariant derivatives of the photon sphere \({\mathcal {S}}\) and static slice \(\Sigma _t\), respectively. See Fig. 1 for a schematic explanation of this notation.

Fig. 1 Notation for the MTTS and photon sphere. For convenience, we immerse all the vectors/tensors into the 3+1 dimensional spacetime, so their indices become spacetime indices. For example, \(h_{\mu \nu }={e^a}_\mu {e^b}_\nu h_{ab}\), where \({e^a}_\mu \) is the pull-back map from spacetime to the equal-t surface

In static cases, one can find that \(n^\mu |_{{\mathcal {S}}}={\hat{r}}^\mu |_{{\mathcal {S}}}\) and so we have the decomposition \(K_{\mu \nu }=-l_{\mu }l_{\nu }\phi ^{-1}{\hat{r}}^\tau D_\tau \phi +k_{\mu \nu }\). Assuming that \(s^\mu \) is an arbitrary unit tangent vector field of \({\mathcal {S}}\), then \(T^\mu =l^{\mu }+s^\mu \) is a null vector tangent to the MTTS. The requirement (3.2) implies \(\phi ^{-1}{\hat{r}}^\mu D_\mu \phi -k_{\mu \nu }s^\mu s^\nu =0\). Since \(s^\mu \) is arbitrary, we find

$$\begin{aligned} k_{\mu \nu }=\gamma _{\mu \nu }\phi ^{-1}{\hat{r}}^\tau D_\tau \phi \,. \end{aligned}$$
(3.3)

Using the decomposition of the Einstein tensor

$$\begin{aligned} {\mathfrak {R}}=-16\pi p_r+2\phi ^{-1}{\mathcal {D}}^2\phi +2k\phi ^{-1}{\hat{r}}^\mu D_\mu \phi +k^2-k_{\mu \nu }k^{\mu \nu }\,, \end{aligned}$$
(3.4)

where \(p_r:=T_{\mu \nu }{\hat{r}}^\mu {\hat{r}}^\nu \) is the pressure on \({\mathcal {S}}\), we can obtain

$$\begin{aligned} \left. 3k^2/4-8\pi p_r+\phi ^{-1}{\mathcal {D}}^2\phi -{\mathfrak {R}}/2\right| _{{\mathcal {S}}}=0\,. \end{aligned}$$
(3.5)

It reduces to \({\mathcal {N}}=0\) in the spherical case.

Similar to the horizon, the photon sphere is also a conformally invariant structure. This can be understood from the fact that null geodesics are conformally invariant, or from the fact that Eq. (3.3) is invariant under the conformal transformation \(\{h_{\mu \nu }\rightarrow {\tilde{h}}_{\mu \nu }=\Omega ^2 h_{\mu \nu }, ~\phi \rightarrow {\tilde{\phi }}=\Omega \phi \}\). In particular, if we choose \(\Omega =\phi ^{-1}\), then \({\tilde{h}}_{\mu \nu }\) is just the “optical metric” [14] and the trace of the extrinsic curvature is \({\tilde{k}}|_{{\mathcal {S}}}=0\). Thus, the photon sphere \({\mathcal {S}}\) is a minimal surface in the “optical metric”. For a surface S, assume that \(\text {d}S\) is the surface element induced by the original metric \(h_{\mu \nu }\). Then the surface element under the optical metric reads \(\phi ^{-2}\text {d}S\). Thus, the photon sphere is a critical surface which locally minimizes the following functional

$$\begin{aligned} P:=\int _S\phi ^{-2}\text {d}S\,. \end{aligned}$$
(3.6)
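As a simple illustration of the functional (3.6), one can restrict P to round spheres in the Schwarzschild geometry, where \(\phi ^2=1-2M/r\) and \(P(r)=4\pi r^2/(1-2M/r)\). The sketch below minimizes this one-parameter family with a golden-section search; the restriction to round spheres, the search interval and the helper names are assumptions of this example, not a general variational treatment.

```python
import math

M = 1.0  # illustrative mass (assumed)

def P(r):
    # P = integral of phi^{-2} dS over a round sphere of areal radius r,
    # with phi^2 = 1 - 2M/r (Schwarzschild)
    return 4.0 * math.pi * r**2 / (1.0 - 2.0 * M / r)

def golden_min(g, a, b, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [a, b]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if g(c) < g(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

r_min = golden_min(P, 2.1 * M, 10.0 * M)
P_min = P(r_min)
print(r_min, P_min / 3.0, 36.0 * math.pi * M**2)
```

The minimum sits at r = 3M, and the minimal value \(108\pi M^2\) divided by 3 equals \(36\pi M^2\), which is the Schwarzschild saturation of the last inequality in (1.3).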

It should be pointed out that we generalized the photon sphere concept directly from the well-defined spherical case. However, whether such a thin-shell photon sphere exists in general remains to be investigated. It may be necessary to introduce some “weak photon spheres” by relaxing requirement (3.2) while keeping most of the essential properties. See e.g. Ref. [24] for an example. Nevertheless, we shall proceed with the assumption that it exists.

3.2 Outermost photon sphere and conjectures about its size

For general static spacetimes, the photon spheres may intersect each other and have many inequivalent homology classes. See the left panel of Fig. 2 for an example. The meaning of the “outermost” photon sphere needs to be clarified. We propose that a proper definition of the “outermost” photon sphere should satisfy the following four requirements: (1) it satisfies Eq. (3.3) piecewise and no tangential null geodesic can escape outside; (2) it is closed; (3) no part of any photon sphere lies outside it; and (4) \(\forall \) topological 2-sphere X outside the “outermost” photon sphere, we have

$$\begin{aligned} \frac{3}{4}k^2-8\pi p_r+\phi ^{-1}{\mathcal {D}}^2\phi -{\mathfrak {R}}/2|_X>0\,, \end{aligned}$$
(3.7)

where \(k, {\mathcal {D}}^2\) and \({\mathfrak {R}}\) are the trace of extrinsic curvature, Laplace operator and scalar curvature of X respectively, and \(p_r\) is the pressure normal to X. In the spherical case, Eq. (3.7) recovers the condition \({\mathcal {N}}>0\). Based on these considerations, we define the outermost photon sphere \({\mathcal {S}}_{\mathrm {out}}\) as the enveloping surface of outermost segments of all photon spheres, illustrated in the middle panel of Fig. 2.

Fig. 2 Left: the circles stand for different photon spheres, which may intersect each other. Middle: the red circle is the “outermost photon sphere”, which is just the enveloping surface of the outermost segments of the photon spheres. Right: the outermost photon sphere contains two connected branches \({\mathcal {S}}_{\mathrm {out}}^{(1)}\) and \({\mathcal {S}}_{\mathrm {out}}^{(2)}\), and \({\mathcal {S}}_{\mathrm {out}}={\mathcal {S}}_{\mathrm {out}}^{(1)} \cup {\mathcal {S}}_{\mathrm {out}}^{(2)}\)

The surface \({\mathcal {S}}_{\mathrm {out}}\) may be disconnected and contain several connected branches \({\mathcal {S}}_{\mathrm {out}}^{(i)}\), i.e., \({\mathcal {S}}_{\mathrm {out}}=\bigcup _{i}{\mathcal {S}}_{\mathrm {out}}^{(i)}\). See the right panel of Fig. 2 for an example.

We denote the area of \({\mathcal {S}}_{\mathrm {out}}^{(i)}\) to be \(A_{\mathrm {ph,out},i}\), i.e.

$$\begin{aligned} A_{\mathrm {ph,out},i}=\int _{{\mathcal {S}}_{\mathrm {out}}^{(i)}}\text {d}S\,. \end{aligned}$$
(3.8)

Each \({\mathcal {S}}_{\mathrm {out}}^{(i)}\) casts a shadow on the observer’s sky. In the spherical case, the shadow is a disk whose radius is independent of the viewing angle. In general cases, the shadow may have a complicated shape and depend on the viewing angle. It is more convenient to study the apparent area of the photon sphere measured at infinity, which is given by the following integral,

$$\begin{aligned} A_{\mathrm {sh,out},i}=\int _{{\mathcal {S}}_{\mathrm {out}}^{(i)}}\phi ^{-2}\text {d}S\,. \end{aligned}$$
(3.9)

Here \(\text {d}S\) is the surface element induced by the original metric \(h_{\mu \nu }\). We can use \(A_{\mathrm {sh,out},i}\) to characterize the size of the shadow. In the spherically symmetric case \(r_{\mathrm {sh,out}}=\sqrt{A_{\mathrm {sh,out},i}/(4\pi )}\). Let \(A_{H,i}\) be the area of the horizon inside \({\mathcal {S}}_{\mathrm {out}}^{(i)}\). The inequalities in (2.5) have a natural generalization:

$$\begin{aligned} 9A_{H,i}/4\le A_{\mathrm {ph,out},i}\le A_{\mathrm {sh,out},i}/3\le 36\pi M^2\,, \end{aligned}$$
(3.10)

We also conjecture a global version which involves the union of all the connected branches:

$$\begin{aligned} 9A_{{\mathcal {H}}}/4\le A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\le 36\pi M^2\,, \end{aligned}$$
(3.11)

Here \(A_{{\mathcal {H}}}=\sum _i A_{H,i}\) and the same for others.

Although we do not have a complete proof beyond the spherical case, we can already prove some parts in special situations. For example, \(9A_{{\mathcal {H}}}/4\le 36\pi M^2\) is simply the Penrose inequality and has been proven by several different methods [25]. In the following we offer proofs of \(9A_{{\mathcal {H}}}/4\le A_{\mathrm {sh,out}}/3\) and \(A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\le 36\pi M^2\) in special cases without spherical symmetry.

4 Proofs without spherical symmetry

As a first step towards the general case, we assume that the outermost horizon is connected, and we always assume the weak and strong energy conditions. We also assume at first that the outermost photon sphere is smooth. Finally, we will argue that the smoothness of the outermost photon sphere can be relaxed to piecewise smoothness. We will convert the inequalities into problems of finding a maximum or minimum, which can be solved by the variational method. In general, to check whether a variational problem gives a minimum or a maximum, we may need to compute the second-order variation, which is usually complicated. However, if the on-shell value of the variational problem has an explicit expression, there is a simpler way.

Consider a functional \({\mathcal {F}}:{\mathcal {X}}\mapsto {\mathbb {R}}\), where \({\mathcal {X}}\) is the space spanned by all arguments of \({\mathcal {F}}\). Mathematically, if \({\mathcal {F}}\) satisfies a few very general restrictions (see footnote 1) and is bounded from above/below, then \({\mathcal {F}}\) must have a maximum/minimum [32], and the maximum/minimum of \({\mathcal {F}}\) must be given by the maximal/minimal on-shell value of the corresponding variational problem. Thus, to find the maximum/minimum of \({\mathcal {F}}\), we only need to find all on-shell values and pick out the maximal/minimal one. When the on-shell value of the variational problem has an explicit expression, this is easy to perform.

4.1 Bondi–Sachs formalism

We will use the Bondi–Sachs formalism [26, 27] in this section, which offers a simple way to study the behavior of the metric in the non-spherical case. The Bondi–Sachs formalism foliates the spacetime by a series of null surfaces labeled by \(u=\)constant, and the general metric has the following form

$$\begin{aligned} \text {d}s^2= & {} -\frac{V}{r}e^{2\beta }\text {d}u^2-2e^{2\beta }\text {d}u\text {d}r\nonumber \\&+r^2h_{AB}(\text {d}x^A-U^A\text {d}u)(\text {d}x^B-U^B\text {d}u)\,. \end{aligned}$$
(4.1)

As the spacetime is asymptotically flat, we then fix the boundary conditions as follows

$$\begin{aligned}&\beta |_{r\rightarrow \infty }=0,\nonumber \\&\left. \frac{V}{r}\right| _{r\rightarrow \infty }=1,\quad \quad h_{AB}|_{r\rightarrow \infty }\text {d}x^A\text {d}x^B=\text {d}\Omega ^2\,, \end{aligned}$$
(4.2)

where \(\text {d}\Omega ^2\) is the metric of the unit sphere. The photon spheres are geometrical structures of a static black hole, so they are independent of the choice of coordinates. To give readers a clear picture, we used the static 3+1 decomposition above. To study the inequalities proposed in this paper, the Bondi–Sachs coordinate gauge is more convenient.

In the Bondi–Sachs gauge, we have three gauge freedoms. Firstly, we have the freedom to choose where the null hypersurface \(u=0\) is located. One convenient choice is to use the outgoing light rays of a photon sphere \({\mathcal {S}}\) to define the null hypersurface \(u=0\). See Fig. 3.

Fig. 3 Left: the outgoing null rays of the photon sphere form the null surface \(u=0\). A photon sphere \({\mathcal {S}}\) lies on an equal-t spacelike surface as well as on a null sheet. On the null sheet, the photon sphere is then defined by \(r=r_{\mathcal {S}}(x^A)\). Right: the null foliation of spacetime. Other null sheets are obtained by using the translation generated by the static Killing vector \(\xi ^\mu \). The u-coordinate is free and we can choose an arbitrary timelike curve which passes through these null sheets

The photon sphere \({\mathcal {S}}\) then lies on an equal-t surface as well as on the \(u=0\) null sheet. Thus, \({\mathcal {S}}\) is also defined by \(u=0\) and \(r=r_{\mathcal {S}}(x^A)\). After we obtain the first null sheet \(u=0\), all the other null sheets can be obtained by using the translation generated by the static Killing vector \(\xi ^\mu \). Every null sheet is labeled by a constant u. In fact, besides the photon sphere, any closed surface S in the equal-t submanifold and outside the horizon can be immersed into the null sheet formed by its outgoing light rays. Using this initial null sheet and the translation generated by \(\xi ^\mu \), we can always obtain a local null foliation in which S lies on one null sheet.

Secondly, we have the freedom to choose the u-coordinate. The tangent vector of the u-coordinate is \((\partial /\partial u)^\mu \). In the Bondi–Sachs gauge, the normal covector of the null sheets is \(\propto (\text {d}u)_\mu \), but the normal vector is not proportional to \((\partial /\partial u)^\mu \). In fact, the normal vector of an equal-u surface satisfies

$$\begin{aligned} g^{\mu \nu }(\text {d}u)_\nu \propto (\partial /\partial r)^\mu \,, \end{aligned}$$
(4.3)

which lies on the equal-u null sheets. This defines the coordinate r and we find

$$\begin{aligned} g_{rr}=g_{\mu \nu }(\partial /\partial r)^\mu (\partial /\partial r)^\nu =0\,, \end{aligned}$$

which matches the metric (4.1). The u-coordinate line is an arbitrary timelike curve that passes through those null sheets, e.g. see Fig. 3. We can choose \((\partial /\partial u)^\mu =\xi ^\mu \), i.e. the static Killing vector field. Then we have

$$\begin{aligned} \phi ^2=\xi ^\mu \xi _\mu =g_{uu}=Ve^{2\beta }/r-r^2h_{AB}U^AU^B\,. \end{aligned}$$
(4.4)

Here \(\phi \) is the same as in (3.1).

The third gauge freedom comes from the fact that the relationship (4.3) does not fix the choice of coordinate r uniquely. We can choose \({\tilde{r}}\) to replace r, where \({\tilde{r}}\) satisfies \((\partial /\partial {\tilde{r}})^\mu =\psi (\partial /\partial r)^\mu \) with an arbitrary nonzero scalar field \(\psi \). To eliminate this freedom, we can impose the gauge condition \(\partial _rh=0\). With this gauge choice, we see

$$\begin{aligned} \sqrt{h}\text {d}^2x=\sqrt{h}\text {d}^2x|_{r\rightarrow \infty }=\text {d}\Omega \,. \end{aligned}$$
(4.5)

Here \(\text {d}\Omega \) is the surface element of the unit sphere. The differential gauge condition \(\partial _rh=0\) leaves a freedom in choosing the initial condition of r. To fix this freedom, we can set the horizon radius \(r_h\) to be constant, which leads to a simple formula for the area of the horizon

$$\begin{aligned} A_{{\mathcal {H}}}=4\pi r_h^2\,. \end{aligned}$$
(4.6)

Alternatively, we can choose a photon sphere \({\mathcal {S}}\) to lie at constant \(r=r_{\mathrm {ph}}\); then the area of \({\mathcal {S}}\) has a simple expression,

$$\begin{aligned} A_{\mathrm {ph}}=4\pi r_{\mathrm {ph}}^2\,. \end{aligned}$$
(4.7)

In general, we cannot require that both the horizon and the photon sphere lie at constant r.

The coordinates \(\{x^A\}\) lie on the null sheets. We consider the static case, so we can choose the transverse directions \(x^A\) to be orthogonal to \(\xi ^\mu \), which means \(U^A=0\). Then the metric becomes

$$\begin{aligned} \text {d}s^2=-\frac{V}{r}e^{2\beta }\text {d}u^2-2e^{2\beta }\text {d}u\text {d}r+r^2h_{AB}\text {d}x^A\text {d}x^B\,. \end{aligned}$$
(4.8)

In Appendix A we offer a more detailed argument for Eq. (4.8). Strictly speaking, in some spacetimes such coordinates may only cover a finite neighborhood. Here we assume that they cover the whole region outside the horizon \({\mathcal {H}}\).

In the above coordinate gauge, Einstein’s equation gives [27]

$$\begin{aligned} \partial _r\beta =\frac{r}{16}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})+2\pi r T_{rr} \end{aligned}$$
(4.9)

and

$$\begin{aligned} \partial _r(e^{-2\beta }V)= & {} \frac{{^{(2)}R}}{2}-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2-8\pi r^2\rho \nonumber \\&-\frac{Vre^{-2\beta }}{8} h^{AC}h^{BD}(\partial _rh_{AB})(\partial _rh_{CD})\,. \end{aligned}$$
(4.10)

Here \({^{(2)}R}\) and \({\mathfrak {D}}_A\) are the scalar curvature and covariant derivative operator of \(h_{AB}\). \(T_{rr}:=T_{\mu \nu }(\partial /\partial r)^\mu (\partial /\partial r)^\nu \) is the null-null component of the energy momentum tensor \(T_{\mu \nu }\), and \(\rho :=T_{\mu \nu }\xi ^\mu \xi ^\nu /(\xi ^\mu \xi _\mu )\). The weak energy condition then ensures

$$\begin{aligned} T_{rr}\ge 0,~~\rho \ge 0\,. \end{aligned}$$
(4.11)

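Then Eq. (4.9) gives \(\partial _r\beta \ge 0\), which together with the boundary condition \(\beta |_{r\rightarrow \infty }=0\) in (4.2) implies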

$$\begin{aligned} \beta \le 0\,. \end{aligned}$$
(4.12)

In the following we will define some target functionals on surfaces S, which depend on the spacetime geometry and the choice of S. We always assume Einstein’s equation, so the spacetime geometry is determined by the distribution of matter. As a result, the functionals depend on the distribution of matter and the choice of S. It is necessary to know how many degrees of freedom the target functionals have. The energy momentum tensor has 10 components, 4 of which are constrained by the Bianchi identity (or the conservation law \(\nabla ^\mu T_{\mu \nu }=0\)). The requirement \(U^A=0\) also supplies 2 constraints according to Einstein’s equation. Thus, we have 4 bulk degrees of freedom. The ten components of the energy momentum tensor combined with the choice of surface S give 11 degrees of freedom at the surface S. The conditions \(U^A|_S=0\) and \(h|_S=h|_{r\rightarrow \infty }\) give 3 constraints on the surface S. Thus, we have 8 degrees of freedom at a surface S. Therefore, we can choose at most 4 bulk variables and at most 8 surface variables as the independent variables when we use the variational method. Choosing the independent variables suitably simplifies the variational problem.

4.2 Proof of \(9A_{{\mathcal {H}}}/4\le A_{\mathrm {sh,out}}/3\)

In this subsection, we use the variational method to prove that, among all black holes whose outmost horizon has area \(A_{{\mathcal {H}}}\), if a photon sphere exists, then the minimum value of \(A_{\mathrm {sh,out}}\) is \(27A_{{\mathcal {H}}}/4\). To do that, we introduce an auxiliary functional

$$\begin{aligned} {\mathcal {B}}:=A_{S}^{-2}\int _S\phi ^2\text {d}S\,. \end{aligned}$$
(4.13)

Here S is an arbitrary surface which encloses the outmost horizon \({\mathcal {H}}\) and lies on the equal-t surface, and \(A_{S}\) is the area of S. The Cauchy–Schwarz inequality shows that

$$\begin{aligned} \left( \int _S\phi ^2\text {d}S\right) \int _S\phi ^{-2}\text {d}S\ge \left( \int _S\phi \phi ^{-1}\text {d}S\right) ^2=A_{S}^2\, \end{aligned}$$

and so

$$\begin{aligned} \int _S\phi ^{-2}\text {d}S\ge 1/{\mathcal {B}}\,. \end{aligned}$$
(4.14)
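A discretized check of Eq. (4.14) can make the inequality concrete. The sketch below is only illustrative: the surface is replaced by a list of cells with made-up areas and made-up positive values of \(\phi \), and the helper name `surface_integrals` is hypothetical. It evaluates \({\mathcal {B}}\) from Eq. (4.13) and compares \(\int _S\phi ^{-2}\text {d}S\) with \(1/{\mathcal {B}}\).

```python
import math
import random


def surface_integrals(areas, phi):
    """Return (A_S, B, I) for a surface split into cells.

    areas[i] is the area dS of cell i and phi[i] > 0 is the value of phi there.
    B = A_S^{-2} * sum(phi^2 dS) as in Eq. (4.13), and I = sum(phi^{-2} dS).
    """
    A_S = sum(areas)
    int_phi2 = sum(p * p * a for p, a in zip(phi, areas))
    int_phi_m2 = sum(a / (p * p) for p, a in zip(phi, areas))
    B = int_phi2 / A_S**2
    return A_S, B, int_phi_m2


# Purely illustrative data: random cell areas and random positive phi.
rng = random.Random(0)
areas = [rng.uniform(0.5, 1.5) for _ in range(1000)]
phi = [rng.uniform(0.1, 1.0) for _ in range(1000)]

A_S, B, I = surface_integrals(areas, phi)
print(I >= 1.0 / B)  # discrete version of Eq. (4.14)
```

Equality holds when \(\phi \) is constant on S, which is the case realized by the spherically symmetric on-shell solution found later in this subsection.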

Then, for the outmost photon sphere \({\mathcal {S}}_{\mathrm {sh,out}}\), we have

$$\begin{aligned} A_{\mathrm {sh,out}}=\int _{{\mathcal {S}}_{\mathrm {sh,out}}}\phi ^{-2}\text {d}S\ge \frac{1}{{\mathcal {B}}|_{S={\mathcal {S}}_{\mathrm {sh,out}}}} \ge \frac{1}{\max {\mathcal {B}}}\,. \end{aligned}$$
(4.15)

In Appendix B we show that, if we fix the horizon area and the strong energy condition is satisfied, then \({\mathcal {B}}\) is bounded from above. Thus, the functional \({\mathcal {B}}\) has a maximum. To find this maximum, we need to find all extreme values and pick the largest one. Every surface S lying on the equal-t submanifold can be immersed into the null sheet \(u=0\) and parameterized by \(\{u=0,r=r_S(x^A)\}\) in the Bondi–Sachs coordinate gauge. In Appendix C, we show that if a surface S makes \({\mathcal {B}}\) extremal among all surfaces which lie on the equal-t surface, it also makes \({\mathcal {B}}\) extremal among all surfaces which lie on the \(u=0\) null sheet. Then we have the following relationship

$$\begin{aligned}&(\max {\mathcal {B}}\mathrm {~on~}\Sigma _t)=(\mathrm {maximal~extreme~value~of~} {\mathcal {B}}\mathrm {~on~}\Sigma _t)\nonumber \\&\quad =(\mathrm {one~of~extreme~values~of~}{\mathcal {B}}\mathrm {~on~null~sheet~}u=0)\nonumber \\&\quad \le (\mathrm {maximal~extreme~value~of~}{\mathcal {B}} \mathrm {~on~null~sheet~}u=0)\,. \end{aligned}$$
(4.16)

Let \({\mathcal {B}}_{m}\) be the maximal critical value of \({\mathcal {B}}\) on the null sheet \(u=0\);Footnote 2 then \(\max {\mathcal {B}}\) in Eq. (4.15) satisfies \(\max {\mathcal {B}}\le {\mathcal {B}}_{m}\). On the null sheet \(u=0\), if we can show \({\mathcal {B}}_{m}=4/(27A_{{\mathcal {H}}})\), then \(A_{\mathrm {sh,out}}\ge 27A_{{\mathcal {H}}}/4\) follows as a corollary. This is what we do in this subsection.

On the null sheet \(u=0\), the value of \( {\mathcal {B}}\) depends on the function \(r_S(x^A)\) as well as on the metric and the matter distribution. In the following we use the variational method to find the maximum of \( {\mathcal {B}}\) on the null sheet \(u=0\). As analyzed at the end of Sec. 4.1, we can choose at most 4 bulk variables and at most 8 surface variables as independent variables when we use the variational method. We do not specify the independent variables at first. Instead, we use Einstein’s equation to rewrite \( {\mathcal {B}}\) in a different form (see Eq. (4.26)) and only then specify the independent variables, which makes it easy to perform the variation and find the on-shell values.

In Appendix D we use a concrete function as an example to explain the main idea of the following steps. Though what we do below is more complicated than the example of Appendix D, the mathematical essence is the same. Referring to the metric (4.8), our target functional reads

$$\begin{aligned} {\mathcal {B}}=A_{S}^{-2}\int _S r^{-1}Ve^{2\beta }\text {d}S\,. \end{aligned}$$
(4.17)

We take the gauge that the horizon has constant radius \(r=r_h\). Then we find

$$\begin{aligned} Ve^{-2\beta }|_{r=r_0}= & {} \int _{r_h}^{r_0}\Bigg [{^{(2)}R}/2-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2- 8\pi r^2\rho \nonumber \\&-\frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\Bigg ]\text {d}r.\nonumber \\ \end{aligned}$$
(4.18)

In general \(Ve^{2\beta }\) depends on both \(r_0\) and \(x^A\). The induced metric on S reads \(\text {d}s_S^2=r_S^2h_{AB}\text {d}x^A\text {d}x^B\), so we find \(\text {d}S=r_S^2\text {d}\Omega \). Equations (4.17) and (4.18) then show

$$\begin{aligned} {\mathcal {B}}= & {} \frac{1}{A_S^2}\int \text {d}\Omega r_Se^{4\beta }\int _{r_h}^{r_S}\Bigg [{^{(2)}R}/2-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2 \nonumber \\&- 8\pi r^2\rho -\frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\Bigg ]\text {d}r\,,\nonumber \\ \end{aligned}$$
(4.19)

and the area of S is

$$\begin{aligned} A_{S}=A_{S}[r_S]:=\int r_S^2\text {d}\Omega \,. \end{aligned}$$
(4.20)

We define

$$\begin{aligned} \cos \Phi _S(x^A):=e^{4\beta }|_S\,, \end{aligned}$$
(4.21)

and

$$\begin{aligned} N_1(r,x^A)^2:= & {} 8\pi r^2\rho ,~~{N_2}(r,x^A)^2\nonumber \\:= & {} \frac{Vre^{-2\beta }}{8}h^{AC}h^{BD} (\partial _rh_{AB})(\partial _rh_{CD}) \end{aligned}$$
(4.22)

which take the constraints \(\beta \le 0, \rho \ge 0\) and \(V\ge 0\) into account. Then we find

$$\begin{aligned} {\mathcal {B}}= & {} \frac{1}{A_S^2}\int \text {d}\Omega r_S\cos \Phi _S\int _{r_h}^{r_S(x^A)}\nonumber \\&\times \left[ {^{(2)}R}/2-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2-N_1^2-N_2^2\right] \text {d}r\,. \end{aligned}$$
(4.23)

Note that \(r_S(x^A)\) is in general not constant. To remove the dependence of the inner integral in Eq. (4.23) on the upper limit \(r_S\), we introduce the step function \(\Theta (x)\) so that

$$\begin{aligned} {\mathcal {B}}= & {} \frac{1}{A_{S}[r_S]^2}\int \text {d}\Omega r_S\cos \Phi _S\int _{r_h}^{\infty }\text {d}r\Theta (r_S-r)\nonumber \\&\times \left[ {^{(2)}R}/2-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2-N_2^2-N_1^2\right] \nonumber \\= & {} \frac{1}{A_{S}[r_S]^2}\int _{r_h}^{\infty }\text {d}r\int \text {d}\Omega \cos \Phi _S r_S\Theta (r_S-r)\nonumber \\&\times \left[ {^{(2)}R}/2-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2-N_2^2-N_1^2\right] \,. \end{aligned}$$
(4.24)

As \(h_{AB}\) is the metric of a 2-dimensional space, we can always find a coordinate transformation \(x^A\rightarrow y^A\) such that

$$\begin{aligned} h_{AB}\text {d}x^A\text {d}x^B=e^{2\Psi (r,y^A)}\gamma _{AB}\text {d}y^A\text {d}y^B\,, \end{aligned}$$
(4.25)

where \(\gamma _{AB}\) is the standard metric of unit sphere. By using this conformal metric, we find that \({\mathcal {B}}\) becomes

$$\begin{aligned} {\mathcal {B}}= & {} \frac{1}{A_{S}[r_S]^2}\int _{r_h}^{\infty }\text {d}r\int r_S\cos \Phi _S\Theta (r_S-r)[1-{\hat{\mathfrak {D}}}^2\Psi \nonumber \\&-{\hat{\mathfrak {D}}}^2\beta -({\hat{\mathfrak {D}}}\beta )^2-e^{-2\Psi }N_1^2-e^{-2\Psi }N_2^2]\sqrt{\gamma }\text {d}^2y\,.\nonumber \\ \end{aligned}$$
(4.26)

Here \({\hat{\mathfrak {D}}}_A\) is the covariant derivative operator corresponding to \(\gamma _{AB}\).

From Eqs. (4.13) to (4.26), we have only made identity transformations using Einstein’s equation. In Eq. (4.26), the set \(\{\beta ,r_S,\Phi _S,N_1, {N_2}, \Psi \}\) contains 4 independent bulk variables and only 2 independent surface variables, so we can treat all of them as independent variables. These are the independent arguments of the functional \({\mathcal {B}}\).

We solve the variational problem in the following steps. The variation with respect to \(\Phi _S\) is easily computed from Eq. (4.23) and we obtain

$$\begin{aligned} \sin \Phi _S\int _{r_h}^{r_S}\left[ {^{(2)}R}/2-{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2-N_1^2-N_2^2\right] \text {d}r=0\,. \end{aligned}$$
(4.27)

Using Eq. (4.18) and the definitions of \(N_1\) and \({N_2}\), we find that Eq. (4.27) reduces to

$$\begin{aligned} V e^{-2\beta }|_{r=r_S}\sin \Phi _S=0\,. \end{aligned}$$
(4.28)

Since \(Ve^{-2\beta }>0\) outside the horizon, this equation shows that \(\Phi _S=0\). The variations with respect to \({N_2}\) and \(N_1\) read

$$\begin{aligned} e^{-2\Psi }r_S\Theta (r_S-r)N_1=e^{-2\Psi }r_S\Theta (r_S-r){N_2}=0\, \end{aligned}$$
(4.29)

and so \({N_2}=N_1=0\) when \(r_h<r<r_S\). Then by using Eq. (4.26), we find that the variation with respect to \(\Psi \) gives us

$$\begin{aligned} {\hat{\mathfrak {D}}}^2[\Theta (r_S-r)r_S]=0,~~r_h<r<r_S\,. \end{aligned}$$
(4.30)

This equation shows that \(\Theta (r_S-r)r_S\) must be independent of \(y^A\). This is because

$$\begin{aligned} \forall f(y^A),~~\int f{\hat{\mathfrak {D}}}^2f\text {d}\Omega =-\int (\partial f)^2\text {d}\Omega \,. \end{aligned}$$
(4.31)

Then we see \({\hat{\mathfrak {D}}}^2[\Theta (r_S-r)r_S]=0\Rightarrow \partial _A[\Theta (r_S-r)r_S]=0\). Thus Eq. (4.30) shows \(r_S\) to be constant. Then the variation with respect to \(\beta \) shows

$$\begin{aligned} {\hat{\mathfrak {D}}}^2\beta =0\,, \end{aligned}$$
(4.32)

which shows that \(\beta =\beta (r)\). Thus, the variation with respect to \(r_S\) shows

$$\begin{aligned} 4r_SA_{S}{\mathcal {B}}=\int _{r_h}^{\infty }[\Theta (r_S-r)+r_S\delta (r_S-r)]\text {d}r\,, \end{aligned}$$
(4.33)

where

$$\begin{aligned} A_S=4\pi r_S^2,~~{\mathcal {B}}=\frac{r_S-r_h}{4\pi r_S^3} \end{aligned}$$
(4.34)

Equation (4.33) shows \(r_S=3r_h/2\). Finally, the on-shell value of \({\mathcal {B}}\) on the null sheet \(u=0\) is

$$\begin{aligned} {\mathcal {B}}|_{\mathrm {on-shell}}=1/(27\pi r_h^2)=\frac{4}{27A_{{\mathcal {H}}}}\,. \end{aligned}$$
(4.35)
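The last step can be cross-checked numerically. The following sketch takes the spherically symmetric form of \({\mathcal {B}}\) in Eq. (4.34) at face value and maximizes it over \(r_S>r_h\) with a golden-section search. The function names and the sample value of `r_h` are illustrative assumptions, not part of the derivation.

```python
import math


def B_spherical(r_S, r_h):
    """Eq. (4.34): B for a round surface r = r_S outside a horizon at r = r_h."""
    return (r_S - r_h) / (4.0 * math.pi * r_S**3)


def golden_section_max(f, lo, hi, tol=1e-10):
    """Locate the maximizer of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


r_h = 2.0  # illustrative horizon radius
r_star = golden_section_max(lambda r: B_spherical(r, r_h), r_h, 10.0 * r_h)
A_H = 4.0 * math.pi * r_h**2

# Ratios to compare with r_S / r_h = 3/2 and B * A_H = 4/27.
print(r_star / r_h, B_spherical(r_star, r_h) * A_H)
```

The search is one-dimensional only because the earlier variations have already forced \(r_S\) to be constant; it does not replace the functional argument above.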

The value of \(\Psi \) cannot be determined by the variational method, so the solutions of the variational problem are not unique. However, the on-shell values of all solutions are the same. As Eq. (4.35) is the only on-shell value of the variational problem, we conclude that \(4/(27A_{{\mathcal {H}}})\) is the maximal extreme value of \({\mathcal {B}}\) on the null sheet \(u=0\). This proves \(9A_{{\mathcal {H}}}/4\le A_{\mathrm {sh,out}}/3\).

4.3 Proof of \(A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\)

In this subsection, we do not fix the horizon area but fix the functional \({\mathcal {B}}|_{S={\mathcal {S}}_{\mathrm {ph,out}}}={\mathcal {B}}_0\). In Appendix B, we show \(A_S<1/{\mathcal {B}}\) for any surface outside the horizon and lying on the equal-t submanifold. If we fix \({\mathcal {B}}|_{S={\mathcal {S}}_{\mathrm {ph,out}}}={\mathcal {B}}_0\), the areas of photon spheres are bounded from above, so we can use the variational method to find the maximum of \(A_{\mathrm {ph,out}}\). We will prove

$$\begin{aligned} \max A_{\mathrm {ph,out}}=\frac{1}{3{\mathcal {B}}_0}\,. \end{aligned}$$
(4.36)

If this is true, we then have

$$\begin{aligned} A_{\mathrm {ph,out}}\le \frac{1}{3{\mathcal {B}}_0}=\frac{1}{3{\mathcal {B}}|_{S={\mathcal {S}}_{\mathrm {ph,out}}}} \le \frac{1}{3}\int _{{\mathcal {S}}_{\mathrm {out}}}\phi ^{-2}\text {d}S=\frac{A_{\mathrm {sh,out}}}{3}\,. \end{aligned}$$
(4.37)

To prove Eq. (4.36), our tool is the variational method combined with Lagrange multipliers. The target functional now is

$$\begin{aligned} A_S=\int _S\text {d}S\,. \end{aligned}$$

According to the conclusion in Appendix C, this can be converted into the task of finding a critical surface among the surfaces lying on the null sheet \(u=0\). Similar to (4.16), we have the following relationship

$$\begin{aligned}&(\max A_S\mathrm {~on~}\Sigma _t)\\&\quad \le (\mathrm {maximal~extreme~value~of~} A_S\mathrm {~on~null~sheet~}u=0)\,. \end{aligned}$$

As we want to find the upper bound of \(A_S\) when S is a photon sphere, there are a few restrictions on the surface S.

Firstly, on the equal-t surface, we know that a photon sphere is a critical surface which makes the integral \(\int \phi ^{-2}\text {d}S\) extremal. Appendix C shows that a photon sphere should also be a critical surface which makes the integral \(\int \phi ^{-2}\text {d}S\) extremal on the null sheet \(u=0\). On the null sheet \(u=0\), we have \(\int \phi ^{-2}\text {d}S=\int r_S^3e^{-4\beta }/(Ve^{-2\beta })\text {d}\Omega \). The extremal condition means that S on the null sheet \(u=0\) should satisfy the following equation

$$\begin{aligned} 3r_S^{-1}Ve^{-2\beta }-4Ve^{-2\beta }\partial _r\beta -\partial _r(Ve^{-2\beta })=0\,. \end{aligned}$$
(4.38)
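As a sanity check of Eq. (4.38), one can insert Schwarzschild data. The sketch below assumes \(\beta =0\) and \(Ve^{-2\beta }=r-2M\) (the form the on-shell solution takes in Sect. 4.4 when \(\beta \) vanishes) and finds the root of the left-hand side of Eq. (4.38) by bisection. The helper names and the choice \(M=1\) are illustrative only.

```python
def photon_sphere_residual(r, M):
    """Left-hand side of Eq. (4.38), assuming beta = 0 and V = r - 2M."""
    K = r - 2.0 * M      # V e^{-2 beta}
    dK_dr = 1.0          # radial derivative of V e^{-2 beta}
    dbeta_dr = 0.0       # beta vanishes identically in this example
    return 3.0 * K / r - 4.0 * K * dbeta_dr - dK_dr


def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


M = 1.0  # illustrative mass
r_ph = bisect_root(lambda r: photon_sphere_residual(r, M), 2.0 * M, 10.0 * M)
print(r_ph / M)  # to be compared with the Schwarzschild photon sphere radius 3M
```

The root agrees with the value \(3M\) quoted in the Introduction for the Schwarzschild photon sphere.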

Secondly, as we now fix the value of the functional \({\mathcal {B}}={\mathcal {B}}_0\), we have the following constraint

$$\begin{aligned} {\mathcal {B}}=A_{S}^{-2}\int _S r^{-1}Ve^{2\beta }\text {d}S={\mathcal {B}}_0\,. \end{aligned}$$
(4.39)

which shows

$$\begin{aligned} \int r_SVe^{2\beta }\text {d}\Omega -{\mathcal {B}}_0A_S^2=0\,. \end{aligned}$$
(4.40)

Thus, we construct the following target functional on the null sheet \(u=0\) with two Lagrange multipliers

$$\begin{aligned} F= & {} A_S+\int [3r_S^{-1}Ve^{-2\beta }-4Ve^{-2\beta }\partial _r\beta \nonumber \\&-\partial _r(Ve^{-2\beta })]\lambda _1(x^A)\text {d}\Omega \nonumber \\&+\lambda _2\left( \int r_SVe^{2\beta }\text {d}\Omega -{\mathcal {B}}_0A_S^2\right) \,. \end{aligned}$$
(4.41)

Here \(\{\lambda _1(x^A), \lambda _2\}\) are two Lagrange multipliers and \(\lambda _2\) is constant. The critical values of \(A_{\mathrm {ph,out}}\) on the equal-t surface are given by the critical values of F on the null sheet \(u=0\).

To solve the variational problem, we separate functional F into two parts

$$\begin{aligned} F=F_{\mathrm {bd}}+F_{\mathrm {bulk}}\,, \end{aligned}$$

where

$$\begin{aligned} F_{\mathrm {bd}}:= & {} A_S-\lambda _2{\mathcal {B}}_0A_S^2-\int _S[\partial _r (Ve^{-2\beta })\nonumber \\&+4(\partial _r\beta )Ve^{-2\beta }]\lambda _1(x^A)\text {d}\Omega \end{aligned}$$
(4.42)

and

$$\begin{aligned} F_{\mathrm {bulk}}:=\int _S\text {d}\Omega [3r_S^{-1}\lambda _1(x^A) +e^{4\beta }\lambda _2]Ve^{-2\beta }\,. \end{aligned}$$
(4.43)

The \(\partial _r(Ve^{-2\beta })\) and \(\partial _r\beta \) in \(F_{\mathrm {bd}}\) are given by Eqs. (4.10) and (4.9), respectively. The \(Ve^{-2\beta }\) in \(F_{\mathrm {bulk}}\) is given by Eq. (4.18). Using Eqs. (4.9), (4.10) and (4.18) we find

$$\begin{aligned}&\partial _r(Ve^{-2\beta })|_{r=r_S}={^{(2)}R}/2-{\hat{\mathfrak {D}}}^2\beta _S\nonumber \\&\quad -({\hat{\mathfrak {D}}}\beta _S)^2-N_{1S}^2-N_{2S}^2 \end{aligned}$$
(4.44)

and

$$\begin{aligned} \left. (\partial _r\beta )Ve^{-2\beta }\right| _S=8N_{1S}^2+W_S^2,~~W_S^2:=\left. 2\pi r T_{rr}Ve^{-2\beta }\right| _S\,. \end{aligned}$$
(4.45)

Here \(\beta _S=\beta |_S\), \(N_{1S}^2\) and \(N_{2S}^2\) are defined by Eq. (4.22) but restricted on surface S. We use the lower index “S” to explicitly show that they are surface quantities. Then using the conformal transformation (4.25), we have

$$\begin{aligned} F_{\mathrm {bd}}= & {} A_S-\lambda _2{\mathcal {B}}_0A_S^2-\int \lambda _1 (y)\nonumber \\&\times [1-{\hat{\mathfrak {D}}}^2\Psi _S-{\hat{\mathfrak {D}}}^2\beta _S-({\hat{\mathfrak {D}}}\beta _S)^2\nonumber \\&-33e^{-2\Psi }N_{1S}^2-e^{-2\Psi }N_{2S}^2\nonumber \\&-4e^{-2\Psi }W_S^2] \sqrt{\gamma }\text {d}^2y\,. \end{aligned}$$
(4.46)

and Eq. (4.18) shows

$$\begin{aligned} F_{\mathrm {bulk}}= & {} \int \text {d}\Omega [3r_S^{-1}\lambda _1+e^{4\beta _S} \lambda _2]Ve^{-2\beta }\nonumber \\= & {} \int _{r_{h}}^{\infty }\text {d}r\int \Theta (r_S-r)[3r_S^{-1}\lambda _1 +e^{4\beta _S}\lambda _2]\nonumber \\&\times [1-{\hat{\mathfrak {D}}}^2\Psi -{\hat{\mathfrak {D}}}^2\beta \nonumber \\&-({\hat{\mathfrak {D}}}\beta )^2-e^{-2\Psi }N_1^2 -e^{-2\Psi }N_2^2]\sqrt{\gamma }\text {d}^2y\,. \end{aligned}$$
(4.47)

Functional \(F_{\mathrm {bd}}\) only involves quantities at the surface S, but functional \(F_{\mathrm {bulk}}\) involves quantities at surface S as well as in bulk region between S and \({\mathcal {H}}\). The functional F now depends on 6 surface variables \(\{N_{1S},N_{2S},\beta _S, \Psi _S, W_S, r_S\}\) and 4 bulk variables \(\{N_1, N_2, \beta , \Psi \}\). We can treat all these variables as independent variables.

The variables \(\Psi _S, W_S\), \(N_{1S}\) and \(N_{2S}\) appear only in the boundary part \(F_{\mathrm {bd}}\). Using Eq. (4.46), we find that the variations with respect to \(W_S, N_{1S}\) and \(N_{2S}\) show

$$\begin{aligned} N_{1S}=N_{2S}=W_S=0\,. \end{aligned}$$

Then the variation with respect to \(\Psi _S\) reads

$$\begin{aligned} {\hat{\mathfrak {D}}}^2\lambda _1(y^A)=0\,. \end{aligned}$$
(4.48)

This shows that \(\lambda _1(y^A)\) is a constant. The variation with respect to \(\beta _S\) involves both \(F_{\mathrm {bulk}}\) and \(F_{\mathrm {bd}}\). The result reads

$$\begin{aligned} \lambda _1{\hat{\mathfrak {D}}}^2\beta _S=-4\lambda _2e^{4\beta _S}(Ve^{-2\beta })_{r=r_S}\,. \end{aligned}$$
(4.49)

As \(\lambda _1\) is constant, we integrate Eq. (4.49) on S and find

$$\begin{aligned} 0=-4\lambda _2\int e^{4\beta _S}(Ve^{-2\beta })_{r=r_S}\sqrt{\gamma }\text {d}^2y\,. \end{aligned}$$
(4.50)

As \(V>0\), we find \(\lambda _2=0\), which implies \({\hat{\mathfrak {D}}}^2\beta _S=0\)Footnote 3 and so \(\beta _S\) is constant.

The variables \(\{N_1,N_2\}\) appear only in the bulk part \(F_{\mathrm {bulk}}\). The variations with respect to \(N_1\) and \(N_2\) show \(N_1=N_2=0\). Now we have

$$\begin{aligned} F_{\mathrm {bulk}}= & {} \lambda _1\int _{r_{h}}^{\infty }\text {d}r\int 3\Theta (r_S-r)r_S^{-1}\nonumber \\&\times [1-{\hat{\mathfrak {D}}}^2\Psi -{\hat{\mathfrak {D}}}^2\beta -({\hat{\mathfrak {D}}}\beta )^2] \sqrt{\gamma }\text {d}^2y\,. \end{aligned}$$
(4.51)

Then the variation with respect to \(\Psi \) shows \({\hat{\mathfrak {D}}}^2[\Theta (r_S-r)r_S^{-1}]=0\) and so \(r_S\) is constant. The variation with respect to \(\beta \) shows \({\hat{\mathfrak {D}}}^2\beta =0\) and so \(\beta =\beta (r)\) when \(r_h<r<r_S\). Thus, we have the following results at the surface S

$$\begin{aligned} Ve^{-2\beta }|_{\text {on-shell}}&=r_S-r_h,\nonumber \\ r_SVe^{2\beta } |_{\text {on-shell}}&=r_Se^{4\beta _S}(r_S-r_h),\nonumber \\ \partial _r(Ve^{-2\beta })|_{\text {on-shell}}&=\frac{1}{2}{^{(2)}R}|_S,~~\Psi _S=0\,. \end{aligned}$$
(4.52)

Note that \(\Psi _S=0\Rightarrow {^{(2)}R}|_S=2\). Substituting these into Eqs. (4.38) and (4.40), we have

$$\begin{aligned} r_h=\frac{2r_S}{3},~~4\pi r_S^2=\frac{e^{4\beta _S}}{3{\mathcal {B}}_0} \end{aligned}$$
(4.53)
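The two relations in Eq. (4.53) can be verified by direct substitution. The sketch below assumes the on-shell data of Eq. (4.52) on a round surface, together with \(\partial _r\beta |_S=0\) (which follows from Eq. (4.45) once \(N_{1S}=N_{2S}=W_S=0\)) and \(\partial _r(Ve^{-2\beta })|_S=1\). The function name and the sample values of \(r_S\) and \(\beta _S\) are hypothetical.

```python
import math


def on_shell_surface(r_S, beta_S):
    """Evaluate the on-shell quantities of Eqs. (4.52)-(4.53) on a round surface."""
    r_h = 2.0 * r_S / 3.0                      # first relation in Eq. (4.53)
    K = r_S - r_h                              # V e^{-2 beta} on S, Eq. (4.52)
    A_S = 4.0 * math.pi * r_S**2
    # Constraint (4.40): int r_S V e^{2 beta} dOmega = B_0 * A_S^2, solved for B_0.
    B_0 = 4.0 * math.pi * r_S * math.exp(4.0 * beta_S) * K / A_S**2
    # Eq. (4.38) with d(beta)/dr = 0 and d(V e^{-2 beta})/dr = 1 on S.
    residual = 3.0 * K / r_S - 1.0
    return A_S, B_0, residual


for beta_S in (0.0, -0.1, -0.5):
    A_S, B_0, residual = on_shell_surface(1.5, beta_S)
    # The product 3 * B_0 * A_S should equal exp(4 * beta_S) <= 1.
    print(beta_S, 3.0 * B_0 * A_S, residual)
```

Since \(\beta _S\le 0\), the product \(3{\mathcal {B}}_0A_S=e^{4\beta _S}\) never exceeds one, which is the content of the bound \(A_S\le 1/(3{\mathcal {B}}_0)\) with equality at \(\beta _S=0\).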

Thus, we find that \(F|_{\mathrm {on-shell}}=e^{4\beta _S}/(3{\mathcal {B}}_0)\). The on-shell values of target functional are not unique. We then have

$$\begin{aligned} \max F=\max \{e^{4\beta _S}/(3{\mathcal {B}}_0)|\forall \beta _S\le 0\} =\frac{1}{3{\mathcal {B}}_0}\,. \end{aligned}$$
(4.54)

This proves Eq. (4.36) and so we obtain \(A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\). Note that the above proof remains valid for every connected component of the photon sphere, i.e. \(A_{\mathrm {ph,out},i}\le A_{\mathrm {sh,out},i}/3\). Thus, we prove one part of our stronger conjecture (3.11).

4.4 Proof of \(A_{\mathrm {sh,out}}/3\le 36\pi M^2\)

The solution of Eq. (4.10) can be written as

$$\begin{aligned} Ve^{-2\beta }|_{r_0}= & {} r_0-2M-\int _{r_0}^{\infty }\Bigg [{^{(2)}R}/2-1 -{\mathfrak {D}}^2\beta -({\mathfrak {D}}\beta )^2\nonumber \\&- 8\pi r^2\rho \nonumber \\&-\frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\Bigg ]\text {d}r\,. \end{aligned}$$
(4.55)

This shows

$$\begin{aligned} M= & {} \frac{r_0}{2}-\frac{1}{8\pi }\int _{r=r_0}\text {d}\Omega \left\{ Ve^{-2\beta } -\int _{r_0}^{\infty }[({\mathfrak {D}}\beta )^2+8\pi r^2\rho \right. \nonumber \\&+\left. \left. \frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\right] \text {d}r\right\} \,. \end{aligned}$$
(4.56)

In the gauge that horizon has constant \(r=r_h\), we find

$$\begin{aligned} M= & {} \frac{r_h}{2}+\frac{1}{8\pi }\int _{r=r_h}\text {d}\Omega \Bigg \{\int _{r_h}^{\infty } \Bigg [({\mathfrak {D}}\beta )^2+8\pi r^2\rho \nonumber \\&+ \frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\Bigg ]\text {d}r\Bigg \}\ge \frac{r_h}{2}\,.\nonumber \\ \end{aligned}$$
(4.57)

This shows the Penrose inequality \(9A_{{\mathcal {H}}}/4\le 36\pi M^2\).
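Spelled out, and assuming that the horizon area in this gauge is \(A_{{\mathcal {H}}}=4\pi r_h^2\) (as used in Eq. (4.35)), the step from Eq. (4.57) to the Penrose inequality reads

$$\begin{aligned} r_h\le 2M~~\Rightarrow ~~A_{{\mathcal {H}}}=4\pi r_h^2\le 16\pi M^2~~\Rightarrow ~~\frac{9A_{{\mathcal {H}}}}{4}\le 36\pi M^2\,. \end{aligned}$$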

To show \(A_{\mathrm {sh,out}}/3\le 36\pi M^2\), we immerse the photon sphere into the null sheet \(u=0\) and take the gauge that the radius of outmost photon sphere \(r_{{\mathcal {S}}}\) is constant. The size of shadow reads

$$\begin{aligned} A_{\mathrm {sh,out}}=r^{3}_{{\mathcal {S}}}\int _{r=r_{{\mathcal {S}}}} \frac{\text {d}\Omega }{Ve^{2\beta }}\,. \end{aligned}$$
(4.58)

In general, the horizon radius \(r_h\) is no longer constant. Eq. (4.56) becomes

$$\begin{aligned} M= & {} \frac{r_{{\mathcal {S}}}}{2}-\frac{1}{8\pi }\int _{r=r_{{\mathcal {S}}}} \text {d}\Omega \nonumber \\&\times \Bigg \{Ve^{-2\beta }-\int _{r_{{\mathcal {S}}}}^{\infty } \Bigg [({\mathfrak {D}}\beta )^2+8\pi r^2\rho \nonumber \\&+ \frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\Bigg ]\text {d}r\Bigg \}\,. \end{aligned}$$
(4.59)

In the following we use the variational method to show that, among all black holes whose shadow size is \(A_{\mathrm {sh,out}}=12\pi a_0^2\), the minimal value of the mass is \(a_0/3\). The main idea is as follows.

We treat M as a functional. The positive mass theorem shows \(M\ge 0\), so the functional M is bounded from below, and its minimum value can be found by solving a variational problem. As \({\mathcal {S}}_{\mathrm {out}}\) is a photon sphere, the constraint (4.38) should also be satisfied. We construct the target functional

$$\begin{aligned} {\mathcal {M}}= & {} \frac{r_{{\mathcal {S}}}}{2}-\frac{1}{8\pi }\int _{r=r_{{\mathcal {S}}}} \text {d}\Omega \nonumber \\&\times \left\{ Ve^{-2\beta }-\int _{r_{{\mathcal {S}}}}^{\infty }[({\mathfrak {D}}\beta )^2 +8\pi r^2\rho \right. \nonumber \\&+\left. \left. \frac{Vre^{-2\beta }}{8}h^{AC}h^{BD}(\partial _rh_{AB}) (\partial _rh_{CD})\right] \text {d}r\right\} \nonumber \\&+\frac{\lambda _2}{4\pi } \left( r^{3}_{{\mathcal {S}}}\int _{r=r_{{\mathcal {S}}}}\frac{\text {d}\Omega }{Ve^{2\beta }} -12\pi a_0^2\right) \nonumber \\&+\int _{r=r_{{\mathcal {S}}}}\left[ 3r_{{\mathcal {S}}}^{-1}Ve^{-2\beta } -4Ve^{-2\beta }\partial _r\beta -\partial _r(Ve^{-2\beta })\right] \nonumber \\&\times \lambda _1(x^A)\text {d}\Omega \,. \end{aligned}$$
(4.60)

Here \(\lambda _1(x^A)\) and \(\lambda _2\) are two Lagrange multipliers and \(\lambda _2\) is constant. The multiplier \(\lambda _2\) ensures that the size of the shadow is fixed to be \(12\pi a_0^2\), and \(\lambda _1\) ensures that condition (4.38) is satisfied.

We use the variable transformations (4.22) and find

$$\begin{aligned} {\mathcal {M}}= & {} \frac{r_{{\mathcal {S}}}}{2}-\frac{1}{8\pi }\int \text {d}\Omega Ve^{-2\beta }\nonumber \\&+\frac{\lambda _2}{4\pi }\left( r^{3}_{{\mathcal {S}}} \int _{r=r_{{\mathcal {S}}}}\frac{\text {d}\Omega }{Ve^{2\beta }}-12\pi a_0^2\right) \nonumber \\&+\int _{r=r_{{\mathcal {S}}}}\left[ 3r_{{\mathcal {S}}}^{-1}Ve^{-2\beta } -4Ve^{-2\beta }\partial _r\beta -\partial _r(Ve^{-2\beta })\right] \nonumber \\&\times \lambda _1(x^A)\text {d}\Omega \nonumber \\&+\frac{1}{8\pi }\int _{r_{{\mathcal {S}}}}^{\infty }\text {d}r\int \text {d}\Omega [({\mathfrak {D}}\beta )^2+N_1^2+N_2^2]\,. \end{aligned}$$
(4.61)

The functional \({\mathcal {M}}\) contains a boundary part and a bulk part. In principle, we can follow steps similar to those of Sect. 4.3 to solve the variational problem. However, in this case, there is a simpler method.

We first choose \(\beta , N_1, N_2\) and \(K:=\sqrt{Ve^{-2\beta }}\) as four independent bulk variables. The bulk variations with respect to \(\beta , N_1\) and \(N_2\) only involve the bulk integral in the last line of Eq. (4.61). The result shows

$$\begin{aligned} r\in [r_{{\mathcal {S}}},\infty ),~~\beta =\beta (r),~~N_1=N_2=0 \Rightarrow \partial _rh_{AB}=0\,. \end{aligned}$$

Thus we see \(h_{AB}=h_{AB}|_{r=\infty }\). This shows \({^{(2)}R}={^{(2)}R}|_{r\rightarrow \infty }=2\). Then we find

$$\begin{aligned}Ve^{-2\beta }=r-2M\,.\end{aligned}$$

Thus, the on-shell geometry outside photon sphere \({\mathcal {S}}_{\mathrm {out}}\) becomes spherically symmetric. We then find that

$$\begin{aligned} \min M=\min {\mathcal {M}}|_{\mathrm {on-shell}}=\min {\mathcal {M}}|_{\mathrm {spherically ~symmetric}}\,. \end{aligned}$$
(4.62)

In the spherically symmetric case, we have shown that \(\min M=\sqrt{A_{\mathrm {sh,out}}/(108\pi )}=a_0/3\). Thus Eq. (4.62) shows \(\min M=a_0/3\). This completes the proof of \(A_{\mathrm {sh,out}}/3\le 36\pi M^2\) in the case without spherical symmetry. Note that the above proof also implies that \(A_{\mathrm {sh,out}}/3=36\pi M^2\) holds only if the geometry outside \({\mathcal {S}}_{\mathrm {out}}\) is Schwarzschild.
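For orientation, the Schwarzschild values that saturate the chain of inequalities can be tabulated directly. The sketch below assumes the standard Schwarzschild data \(r_h=2M\), \(r_{\mathrm {ph}}=3M\), \(\beta =0\) and \(V=r-2M\), and evaluates the shadow size from Eq. (4.58). The function name and the sample mass are illustrative.

```python
import math


def schwarzschild_sizes(M):
    """Return (A_H, A_ph, A_sh) for a Schwarzschild black hole of mass M."""
    r_h, r_ph = 2.0 * M, 3.0 * M
    A_H = 4.0 * math.pi * r_h**2
    A_ph = 4.0 * math.pi * r_ph**2
    # Eq. (4.58) with beta = 0 and V = r - 2M on the round sphere r = r_ph.
    A_sh = r_ph**3 * 4.0 * math.pi / (r_ph - 2.0 * M)
    return A_H, A_ph, A_sh


M = 1.7  # illustrative mass
A_H, A_ph, A_sh = schwarzschild_sizes(M)
a_0 = math.sqrt(A_sh / (12.0 * math.pi))  # shadow size written as 12 * pi * a_0^2

# All four numbers should coincide, and a_0 / 3 should reproduce M.
print(9.0 * A_H / 4.0, A_ph, A_sh / 3.0, 36.0 * math.pi * M**2)
print(a_0 / 3.0, M)
```

In this case every inequality proved in this section becomes an equality, consistent with the rigidity statement above.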

We now give a short summary of this section. We used the variational method to prove \(9A_{{\mathcal {H}}}/4\le A_{\mathrm {sh,out}}/3\) and \(A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\le 36\pi M^2\) under two assumptions: (1) the weak and strong energy conditions are satisfied, and (2) the outmost horizon and outmost photon sphere are connected and smooth. Our proof also implies a rigidity theorem: \(A_{\mathrm {ph,out}}=36\pi M^2\) only if the geometry outside the outmost photon sphere is Schwarzschild. Our proof of \(A_{\mathrm {ph,out}}\le A_{\mathrm {sh,out}}/3\) also holds when the outmost photon sphere is not connected. As a result, we obtain \(A_{\mathrm {ph,out},i}\le A_{\mathrm {sh,out},i}/3\) for every connected branch and so prove a part of our stronger conjecture (3.11). The lower bound \(9A_{{\mathcal {H}}}/4\le A_{\mathrm {ph,out}}\) in the spherically symmetric case involves a non-typical energy condition (2.15). It is not clear how to generalize it to the case without spherical symmetry. We leave the study of this inequality for the future.

Comment on smoothness: In the above proofs, we assumed that the outmost photon sphere \({\mathcal {S}}_{\mathrm {out}}\) is smooth. This requirement can be replaced by piecewise smoothness. The reason is that, in the variational method, the target functionals and constraints are defined by integrals. For an outmost photon sphere \({\mathcal {S}}_{\mathrm {out}}^{(i)}\), we can always find a smooth surface \({\mathcal {S}}_{\mathrm {out},\varepsilon }^{(i)}\) which satisfies: (1) \(\forall \varepsilon >0\), the maximal volume enclosed by \({\mathcal {S}}_{\mathrm {out}}^{(i)}\) and \({\mathcal {S}}_{\mathrm {out},\varepsilon }^{(i)}\) is smaller than \(\varepsilon M^3\), i.e.

$$\begin{aligned} \max \int _{V}\text {d}V<\varepsilon M^3\,, \end{aligned}$$
(4.63)

where V is an arbitrary spacelike 3-dimensional surface satisfying \(\partial V={\mathcal {S}}_{\mathrm {out}}^{(i)}\cup {\mathcal {S}}_{\mathrm {out},\varepsilon }^{(i)}\), (2) \(A_{\mathrm {ph,out},i}\) and \(A_{\mathrm {sh,out},i}\) are approximated to arbitrary accuracy

$$\begin{aligned}&\left| A_{\mathrm {ph,out},i}-\int _{{\mathcal {S}}_{\mathrm {out},\varepsilon }^{(i)}}\text {d}S\right|<\varepsilon M^2,\nonumber \\&\left| A_{\mathrm {sh,out},i}-\int _{{\mathcal {S}}_{\mathrm {out}, \varepsilon }^{(i)}}\phi ^{-2}\text {d}S\right| <\varepsilon M^2\,, \end{aligned}$$
(4.64)

and (3) the constraint (4.38) is violated by an arbitrarily small amount

$$\begin{aligned} \int _{{\mathcal {S}}_{\mathrm {out},\varepsilon }^{(i)}}\left| 3r^{-1} Ve^{-2\beta }-4Ve^{-2\beta }\partial _r\beta -\partial _r(Ve^{-2\beta })\right| \text {d}S<\varepsilon M^2\,. \end{aligned}$$
(4.65)

We can use the smooth surface \({\mathcal {S}}_{\mathrm {out},\varepsilon }^{(i)}\) to replace \({\mathcal {S}}_{\mathrm {out}}^{(i)}\) and obtain the same conclusion. For this reason, the conclusions of the above proofs remain valid if we assume only that the outmost photon sphere is piecewise smooth.

5 Conclusion

In this paper, we conjectured a series of universal inequalities about the size of a static black hole in Einstein gravity. We gave a complete proof in the spherically symmetric case. We studied the properties of photon spheres in general static spacetimes and proved that photon spheres are conformally invariant structures of the spacetime. Our results strongly suggest that black hole photon spheres may have rich physical content and mathematical structure.

Our conjecture gives a simple way to estimate the size of the horizon and the black hole mass. For the spherically symmetric case, though we assume the spacetime is static outside the horizon (if it exists), we in fact only need it to be static outside the photon sphere due to the Birkhoff theorem. We emphasize that the upper bound in (1.2) does not require the existence of a black hole. This has significance in astronomy. The Birkhoff theorem implies that the interior of the photon sphere may not contain a black hole. For example, a neutron star can also form a photon sphere and the corresponding shadow. However, our inequality (1.2) implies that if the radius of the photon sphere is larger than \(2.25M_{\odot }\) or the radius of the shadow is larger than \(3.89M_{\odot }\), then the interior of the photon sphere cannot be a neutron star. If we find a photon sphere or shadow of larger size, the interior must be a black hole or is going to form a black hole. In astrophysical observations, verifying whether or not a gravitational system has a photon sphere can give us useful information about the system. For example, if the spacetime contains a naked singularity rather than a black hole, we can classify naked singularities into two kinds based on whether or not the naked singularity is covered by a photon sphere [28]. For the case which contains a photon sphere, the gravitational lensing effects of the naked singularity are very similar to those of a Schwarzschild black hole; however, for the case without a photon sphere, the gravitational lensing effects are qualitatively different from those of a Schwarzschild black hole. Combined observations of the shadow and other gravitational lensing effects can in principle give us some physical information about the gravitational source, such as the charge/mass ratio, the distance to the gravitational source and so on [29,30,31]. Thus, in the future we can study how to read this useful physical information from observations of black hole shadows.