Transcending the ensemble: baby universes, spacetime wormholes, and the order and disorder of black hole information

In the 1980's, work by Coleman and by Giddings and Strominger linked the physics of spacetime wormholes to `baby universes' and an ensemble of theories. We revisit such ideas, using features associated with a negative cosmological constant and asymptotically AdS boundaries to strengthen the results, introduce a change in perspective, and connect with recent replica wormhole discussions of the Page curve. A key new feature is an emphasis on the role of null states. We explore this structure in detail in simple topological models of the bulk that allow us to compute the full spectrum of associated boundary theories. The dimension of the asymptotically AdS Hilbert space turns out to become a random variable $Z$, whose value can be less than the naive number $k$ of independent states in the theory. For $k>Z$, consistency arises from an exact degeneracy in the inner product defined by the gravitational path integral, so that many a priori independent states differ only by a null state. We argue that a similar property must hold in any consistent gravitational path integral. We also comment on other aspects of extrapolations to more complicated models, and on possible implications for the black hole information problem in the individual members of the above ensemble.


Introduction
The past year has seen several interesting developments in the study of black hole information. In particular, it has been well known for some time that the von Neumann entropy $S_{\rm rad}$ of emitted Hawking radiation as a function of time gives an important diagnostic of whether and to what degree information is preserved or lost in evaporating black holes [1]. Familiar effective field theory would give an entropy that increases monotonically throughout the evaporation, even though the black hole's Bekenstein-Hawking entropy $S_{\rm BH} = A/4G$ monotonically decreases to a value near zero. In contrast, a model in which the black hole is a standard quantum system with density of states $e^{S_{\rm BH}}$ coupled unitarily to the radiation field would, when the initial state is pure, require $S_{\rm rad} \le S_{\rm BH}$ at all times. As a result, in such models $S_{\rm rad}$ generally increases to a maximum, at which time it nearly equals $S_{\rm BH}$, and then decreases monotonically thereafter. The final phase with decreasing $S_{\rm rad}$ describes the return of information to the external universe from the black hole.
Despite many arguments suggesting that the latter so-called 'Page curve' should accurately approximate the result of black hole evaporation, for many years it was unclear how such a result could be obtained from a controlled gravitational calculation; see e.g. reviews in [2][3][4][5][6]. The plethora of proposals for new physics that might be associated with obtaining this Page curve (including [3,) were thus all properly viewed as speculative and contained at least some optimistic extrapolation or ad hoc ingredient. Recently, however, it was noted that the 'unitary' Page curve, including the turnover of $S_{\rm rad}$, could be obtained by combining ideas from holography with effective field theory [42,43], or equivalently with quantum field theory in curved space. In particular, under very general conditions [42,43] argued that one could obtain this result by computing the generalized entropy $S_{\rm gen} = A/4G + S_{\rm bulk}$ of an appropriate codimension-2 quantum extremal surface (QES), where the surface is chosen so that holography suggests this might represent $S_{\rm rad}$. Here $S_{\rm bulk}$ is the von Neumann entropy of bulk fields outside the codimension-2 QES. See also further explorations of this idea in [44][45][46][47].
Critically, [48,49] then pointed out that, at least in some contexts, this seemingly-hybrid recipe in fact follows from replica trick calculations of $S_{\rm rad}$ using the gravitational path integral (and in particular that this was implicit in earlier derivations of the quantum corrected Ryu-Takayanagi [50,51] and Hubeny-Rangamani-Takayanagi [52] entropy formulae [53,54]). While at this level the physical mechanisms behind such results remain somewhat mysterious, the derivation from the gravitational path integral nevertheless implies that the explicit addition of novel physics is not required. Indeed, it instead suggests that fundamental lessons might be revealed by carefully dissecting the relevant calculations and studying the path integral in more detail.
A starting point for such further investigation is the observation of [49] that the above replica trick results appear to be inconsistent with what one might normally call a single well-defined theory. In particular, rather than taking single well-defined values, partition-function-like quantities seem to have both a mean value and a non-zero variance. This feature is associated with the fact that dominant saddles in the replica computations involve connected bulk spacetimes with disconnected asymptotically AdS boundaries. Such geometries have been termed spacetime wormholes, or Euclidean wormholes when the geometry is Euclidean. This relation will be reviewed below, but is familiar from older discussions [55][56][57][58]. In particular, refs. [55][56][57] argued that spacetime wormholes require the gravitational Hilbert space to include spacetimes with compact Cauchy surfaces, and thus for which space at a moment of time has no asymptotically AdS boundary. This part of the gravitational Hilbert space was called the baby universe sector. Furthermore, it was argued that entanglement with this sector typically led the rest of the theory (here the asymptotically AdS sector) to act as if it were part of an ensemble of theories. However, a particular member of the ensemble could be chosen by selecting an appropriate baby universe state.
Our goal here is to combine the above ideas to better understand the ensembles associated with replica trick computations and to extract implications for particular members of such ensembles. We begin in section 2 by reviewing the connection between spacetime wormholes and ensemble-like properties, and by revisiting the baby universe ideas of [55][56][57]. In doing so, we incorporate features associated with a negative cosmological constant and asymptotically AdS boundaries. This both strengthens the results and allows a useful change in perspective. In particular, we avoid the use of 'third quantized perturbation theory' and emphasize that certain results follow exactly from any well-defined path integral. We also focus on the key role played by null states.
The output is a description of how (say, partition-function-like) quantities at asymptotically AdS boundaries have a spectrum of possible values determined by the gravitational path integral. Below, we focus on quantities $Z[\tilde J^*, J]$ that might be interpreted as computing the inner product of a state created by a source $J$ on the past half of the Euclidean AdS boundary with another state created by a source $\tilde J = (\tilde J^*)^*$ on the future half of a Euclidean AdS boundary, where * denotes CPT conjugation. However, the most general partition-function-like quantities allowed by our formalism include quantities that in a dual CFT would describe matrix elements of operators as well as e.g. $\mathrm{Tr}\,\rho^n$ for a wide variety of density matrices. The Rényi entropies of [44,49] are then functions of these quantities. In accord with the original works [55][56][57], our analysis will show that one may generally describe such quantities as being drawn from an ensemble of their possible values, with the particular ensemble specified by the choice of baby universe state.
After describing this framework in section 2, section 3 introduces some simple toy models in which the gravitational path integral can be performed exactly including the full sum over possible topologies. The toy models are topological and involve finite-dimensional Hilbert spaces. An interesting feature of the models is that the dimension of the asymptotically AdS Hilbert space becomes a random variable Z, whose value can be less than the naive number k of independent states in the theory. For k > Z, consistency turns out to arise from an exact degeneracy in the inner product defined by the gravitational path integral. This degeneracy means that many a priori independent states differ by a null state, and so should be regarded as linearly dependent in the gravitational Hilbert space. Section 4 relates this degeneracy to diffeomorphism invariance, black holes, and the Page curve, arguing in particular that the replica computations of [48,49] will imply a corresponding degeneracy in more general contexts. In section 5, we describe the approximation in which wormhole effects are small, analogous to the third quantised formalism of [57], and emphasise that the appearance of null states is associated with the failure of this approximation. We close with some summary and final discussion in section 6.
The gravitational path integral with spacetime wormholes

Path integrals and ensembles
We begin by describing a natural set of observables in any theory of gravity. For definiteness and convenience, we will assume locally AdS$_{d+1}$ asymptotics. This is the context in which we have the most control and the clearest interpretation in terms of possible CFT duals.
Our theory will be defined by the path integral over a set of fields (including a metric) denoted collectively by Φ, with action S[Φ]. Each boundary is associated with a set of admissible boundary conditions labelled by J, describing the behaviour of the fields Φ ∼ J near the given boundary. In particular, J includes a d-dimensional boundary metric on a boundary manifold M. We will focus on the case where the boundary metric has Euclidean signature, but Lorentzian or complex metrics are also allowed. We will generally take each M to be connected, and introduce disconnected boundary manifolds by specifying multiple such boundaries, each with its own J. However, there is no harm in letting M be disconnected, and the notation below remains consistent. For each field other than the metric, J typically includes a function on the d-dimensional boundary M specifying an appropriate boundary condition for that field; e.g., it will typically specify what in the AdS/CFT context is known as the "nonnormalisable part" of the field. In all cases, by S[Φ], we then mean the holographically renormalised action with boundary condition J. Now, the gravitational path integral with asymptotically AdS boundary conditions specified by J is usually interpreted as computing a partition function Z[J]. This is particularly familiar in the AdS/CFT context [59,60] where it gives the partition function of the dual CFT, but the identification of this quantity as a partition function in fact dates back to the first discussions of Euclidean approaches to black hole thermodynamics (see e.g. [61]).
Motivated by this interpretation, with an eye toward the ideas of [55][56][57], and following [62], we introduce the following notation for the path integral defined by an asymptotic boundary with n connected components, each with an associated $J_i$:
$$\langle Z[J_1]\cdots Z[J_n]\rangle := \int_{\Phi\sim J}\mathcal{D}\Phi\; e^{-S[\Phi]}. \qquad (2.1)$$
This equation defines the left hand side as the path integral over all configurations with n asymptotic boundaries with boundary conditions specified by $J_1, \ldots, J_n$. The notation is chosen to be suggestive of a particular interpretation to be described below.
The presence of spacetime wormholes in the path integral now leads to a phenomenon which is very puzzling from the standard AdS/CFT point of view [58,63] (see [55,56] for earlier discussions of the asymptotically flat analogue in which S-matrix elements play the role of our partition functions). The path integral (2.1) does not generally factorize over disconnected boundaries:
$$\langle Z[J_1]\, Z[J_2]\rangle \neq \langle Z[J_1]\rangle\, \langle Z[J_2]\rangle. \qquad (2.2)$$
The difference between right and left sides arises because the sum over topologies in the Euclidean path integral for $\langle Z[J_1] Z[J_2]\rangle$ not only yields terms of the form $T_1 T_2$ for any pair $T_1$, $T_2$ of terms associated separately with $\langle Z[J_1]\rangle$ and $\langle Z[J_2]\rangle$, but also contains additional contributions from terms in which the two boundaries lie in the same connected component of the bulk manifold; see figure 1. We use the term spacetime wormhole, or sometimes Euclidean wormhole, to refer to any such connection.
Note that spacetime wormholes are generally localized in both space and time, and thus differ qualitatively from spatial wormholes like the familiar Einstein-Rosen bridge that exist on every smooth Cauchy slice of the maximally extended Lorentz signature Schwarzschild spacetime. The two sides of (2.2) must thus differ unless the contributions with extra connections exactly cancel among themselves, or unless such contributions are excluded. The first option appears to require fine tuning, and the second the imposition of non-local constraints that undermine the presumed local nature of the theory. It is also difficult to see how one might usefully introduce such constraints without destroying other apparent successes of the Euclidean path integral, such as the description of the Hawking-Page transition for AdS black holes, which is associated with a change in the topology of the dominant Euclidean saddle. We therefore allow terms with extra connections, and at least for the moment assume that they lead to a non-zero difference between the two sides of (2.2). It follows that we cannot simply interpret $\langle Z[J_1]\rangle$, $\langle Z[J_2]\rangle$ as partition functions with product $\langle Z[J_1] Z[J_2]\rangle$.
From the bulk point of view, the extra connections appear to describe dynamical interactions between a priori independent asymptotic regions. This point of view is not naturally compatible with standard AdS/CFT, but it may instead be consistent to interpret $\langle Z[J_1] Z[J_2] \cdots\rangle$ as the expectation value of a product of partition functions in an ensemble of boundary dual theories. In this interpretation, the connected contributions would describe probabilistic correlations from the ensemble average rather than dynamical interactions. While these two interpretations may at first seem to be in tension, in analogous settings it was argued by [55][56][57] that they are in fact consistent.
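As a schematic illustration of how the two readings line up, the two-boundary amplitude can be split according to whether the two boundaries lie in different connected components of the bulk or in the same one. This is a sketch only: the subscript 'conn' and the ensemble average $\mathbb{E}$ are labels introduced here for illustration, the amplitudes are assumed to be normalised so that $\langle 1\rangle = 1$, and the $Z[J_i]$ are taken to be real.

```latex
% Assumptions (not from the text): <1> = 1, real Z[J_i],
% 'conn' = contributions in which both boundaries share one bulk component.
\begin{align}
\langle Z[J_1]\, Z[J_2]\rangle
  &= \langle Z[J_1]\rangle\,\langle Z[J_2]\rangle
   + \langle Z[J_1]\, Z[J_2]\rangle_{\rm conn}, \\
% Ensemble reading of the same quantity:
\langle Z[J_1]\, Z[J_2]\rangle_{\rm conn}
  &= \mathbb{E}\big[Z[J_1]\, Z[J_2]\big]
   - \mathbb{E}\big[Z[J_1]\big]\,\mathbb{E}\big[Z[J_2]\big]
   = \mathrm{Cov}\big(Z[J_1], Z[J_2]\big).
\end{align}
```

In this reading, a non-zero wormhole contribution expresses a statistical correlation between the values of the two partition functions across the ensemble, rather than an interaction between the two boundaries.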
The rest of section 2 will be dedicated to providing a version of this discussion that incorporates features associated with asymptotically AdS boundaries. We find that using these new features allows strengthened conclusions, and perhaps as a result we will take a slightly different perspective than that of [55][56][57].
Before turning to the detailed discussion in section 2.2, it is useful to provide a brief overview. As in [55][56][57], the connection between the above two interpretations is motivated by realizing that summing over arbitrary topologies in our path integrals, and in particular over manifolds with arbitrary numbers of connected components, means that generic terms in $\langle Z[J_1] Z[J_2] \cdots\rangle$ contain factors associated with compact spacetimes having no boundaries whatsoever. The idea that the Hilbert space of a theory can be identified by cutting open the path integral then suggests that we should also slice open such compact spacetimes. Doing so identifies a new sector not associated on this slice with any of the asymptotically AdS boundaries, but which is instead associated with spatially compact universes; see figure 2. We call this the baby universe sector following [55][56][57], where the name comes from the idea that one can in many cases [64][65][66][67] think of the closed universe as having been emitted by a (here asymptotically AdS) parent universe.

Figure 2: Slicing open a spacetime with a boundary and a handle (left) can give a disconnected geometry on the slice, including a closed 'baby universe' that has become detached from the parent asymptotically AdS universe. The baby universe does not intersect the asymptotically AdS boundary (red line) at the moment of time described by the indicated slice.

The discussion of baby universes is simplest in the context of Euclidean path integrals with boundary conditions J i given by Euclidean metrics, but our discussion does not exclude more general contexts. In particular, one can choose boundary conditions with Lorentzian pieces of the metric, using a Schwinger-Keldysh type formalism in which Euclidean sections of the metric are used to prepare states and Lorentzian sections give real time evolution. In such a case, it is useful to think of the gravitational path integral as involving complex metrics.
Such constructions allow us to describe quite general observables that might be associated with a putative dual CFT. Indeed, the set of observables we are using is also sufficient to describe coupling to an auxiliary quantum system, as is important in [43][44][45][46][47][48]. To do this, we can simply allow sources J to be operators in the auxiliary system, and then include a corresponding auxiliary path integral to compute the effects of such operators. We discuss this construction in more detail in section 4.
Note that the ability of Euclidean or complex universes to split and join as shown in figure 2 indicates that baby universes can affect the physics of universes with asymptotically AdS boundaries. In this context, it becomes clear that the definition of our path integral (2.1) includes an implicit choice of the initial and final state of closed baby universes. Most naturally, the path integral computes expectation values in the Hartle-Hawking no-boundary state [68], defined by the absence of additional boundaries besides those required by the Z[J] insertions. But this is not the only choice of baby universe state that we can describe with our gravitational path integral, and other choices will be associated with different ensembles. In particular, we will construct special 'α-states' of baby universes in which the factorisation property is restored, and no ensemble is required.
One further comment is in order before turning to the details. In the above discussion we have written our amplitudes as if the path integral gives some definite, finite value. However, in all but the very simplest contexts, gravitational path integrals have been defined only as asymptotic expansions (perhaps with nonperturbative contributions) in some small coupling. Both loop expansions and sums over nonperturbative sectors will typically fail to converge, and there may be no obvious, natural or unique way to define a finite result. The distinction between exact quantities with finite values of parameters and asymptotic expansions may well be important, and we will return to this issue in section 6. Nonetheless, for the remainder of this section we will treat the path integral in (2.1) as if it gives well-defined exact results.

The baby universe Hilbert space
As described above, one can obtain a natural Hilbert space interpretation by cutting open the path integral (2.1). In particular, we split each history over which we sum into a 'past' and 'future' that meet on some slice where we imagine summing over a complete set of intermediate states. There is a choice of how we cut, constrained by the way in which the asymptotic boundaries are labelled past or future. For now, we will choose to place each connected component of the boundary either entirely to the past or entirely to the future of our cut, so that our intermediate slice intersects no asymptotically AdS boundaries (generalizing in section 2.4). We thus identify the relevant Hilbert space as the space of closed universes in the theory. We call this the 'baby universe' Hilbert space $\mathcal{H}_{\rm BU}$ for the reasons described above.
One might hope to describe elements of the baby universe Hilbert space as wavefunctions of all possible spatial metrics (and field configurations on those metrics). A complication is that, as usual in a gravitational theory, diffeomorphism invariance forbids a notion of universal time that might be used to specify precisely where the past/future cut is to be made. Proceeding in this manner would thus require imposing the gravitational constraints (the Wheeler-DeWitt equation) on the resulting wavefunctions. This is made particularly challenging in the current context where spacetime wormholes are important, so that the associated splitting and joining of universes should modify these constraints [57].
However, we can bypass these difficulties entirely by using our asymptotic boundaries to define states in the baby universe Hilbert space. Given a set $\{J_1, \ldots, J_m\}$ of boundary conditions, there is a state
$$|Z[J_1]\cdots Z[J_m]\rangle \in \mathcal{H}_{\rm BU}. \qquad (2.3)$$
Here we emphasize that this is not just a state on a single universe, but that it instead represents a state of the full collection of an indefinite number of baby universes. States of the form (2.3) defined by different sources, or even with different numbers of sources m, are generally not mutually orthogonal in any useful sense. Note that the physical notion of inner product cannot simply be assumed to have any particular form, but is something we must compute from the theory. It must thus follow from an appropriate path integral. Now, some readers may be confused by the fact that in quantum field theory one typically uses first-quantized path integrals to compute Green's functions and not to compute inner products. However, as explained in e.g. [69], in defining the gravitational path integral one must make a choice (in some languages, associated with specifying the contour of integration) as to whether it fully imposes the gravitational constraints or instead defines a Green's function. We simply choose the former, and we take the correlators (2.1) to be computed with the same specifications. With this understanding, the path integral indeed computes the inner product, which is then given by
$$\langle Z[\tilde J_1]\cdots Z[\tilde J_n] \,|\, Z[J_1]\cdots Z[J_m]\rangle = \langle Z[\tilde J_1^*]\cdots Z[\tilde J_n^*]\; Z[J_1]\cdots Z[J_m]\rangle. \qquad (2.5)$$
Here the right hand side is just the amplitude defined in (2.1) with boundary conditions $Z[J]$ and $Z[\tilde J^*]$, and where * is the CPT conjugate operation on boundary conditions J. This operation should have the property that if we act with * on every boundary, the amplitude is complex conjugated:
$$\langle Z[J_1^*]\cdots Z[J_n^*]\rangle = \langle Z[J_1]\cdots Z[J_n]\rangle^*. \qquad (2.6)$$
This guarantees that the inner product (2.5) is Hermitian.
If we can interpret Z[J] as random variables with correlation functions $\langle Z[J_1]\cdots Z[J_n]\rangle$, then (2.5) reduces to a standard construction in probability theory, in which the covariance matrix of pairs of random variables defines an inner product. In particular, showing that the amplitudes follow from expectation values of a distribution with nonnegative probabilities would imply that our inner product is positive semi-definite. Note that the states (2.3) need not be normalised. In particular, the norm of the Hartle-Hawking state is given by what one might call the cosmological partition function $\mathfrak{Z}$, defined by the path integral over all spacetimes without boundary:
$$\mathfrak{Z} := \langle {\rm HH}|{\rm HH}\rangle = \langle 1\rangle = \int_{\text{no boundary}}\mathcal{D}\Phi\; e^{-S[\Phi]}. \qquad (2.7)$$
For most purposes, it would be sufficient to consider normalised amplitudes, where we divide by $\mathfrak{Z}$. This is equivalent to performing the path integral excluding closed components of spacetime which do not connect to any asymptotic boundary. We now have a space of states defined by (finite) linear combinations of the states (2.3) in correspondence with formal polynomials of 'partition functions' Z[J], and an inner product defined by extending (2.5) sesquilinearly. This is almost enough to construct a baby universe Hilbert space. The missing ingredient is a single property that we demand of our path integral (2.1), namely reflection positivity. This can be stated as the requirement that (2.5) defines a positive semidefinite inner product on finite linear combinations of states (2.3):
$$\langle\Psi|\Psi\rangle \ge 0 \quad \text{for all } |\Psi\rangle = \sum_{i=1}^{N} c_i\, |Z[J_{i,1}]\cdots Z[J_{i,m_i}]\rangle. \qquad (2.8)$$
This is clearly required if our gravitational path integral is to define a standard quantum theory, though it is cumbersome to verify directly for all states. While this can be done for the simple toy models studied in section 3, for more complicated systems it would be very useful to find properties that imply (2.8) but are easier to check.
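The link between an ensemble interpretation and positivity of the inner product can be made explicit in a short computation. The following is a sketch under stated assumptions rather than a step taken from the text: each $Z[J]$ is treated as a complex random variable satisfying $Z[J^*] = \overline{Z[J]}$, amplitudes are identified with expectation values $\mathbb{E}[\cdots]$ over a distribution with nonnegative probabilities, and the index notation $J_{i,a}$ is introduced here for illustration.

```latex
% Assumptions: Z[J^*] = \overline{Z[J]} and <...> = E[...] for a genuine
% probability distribution. Each state is mapped to a polynomial in the Z[J].
\begin{align}
|\Psi\rangle = \sum_{i} c_i\, |Z[J_{i,1}]\cdots Z[J_{i,m_i}]\rangle
  \quad &\longleftrightarrow \quad
  \Psi = \sum_i c_i \prod_{a=1}^{m_i} Z[J_{i,a}], \\
\langle\Psi|\Psi\rangle
  = \sum_{i,j} \bar c_i\, c_j\;
     \mathbb{E}\Big[\prod_{a=1}^{m_i} \overline{Z[J_{i,a}]}\;
                    \prod_{b=1}^{m_j} Z[J_{j,b}]\Big]
  &= \mathbb{E}\big[\,|\Psi|^2\,\big] \;\ge\; 0 .
\end{align}
```

The norm vanishes exactly when the polynomial $\Psi$ is zero with probability one, which is the sense in which a state can be null even though its formal expression is not identically zero.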
Assuming (2.8), we now define the baby universe Hilbert space $\mathcal{H}_{\rm BU}$ through a standard construction, as the completion of the space of linear combinations of states (2.3) with the inner product (2.5). Roughly speaking, states of $\mathcal{H}_{\rm BU}$ are infinite sums over states (2.3) with finite norm defined by (2.5) (see footnote 4). Importantly, however, infinite sums with different terms and coefficients may not give rise to distinct states in $\mathcal{H}_{\rm BU}$. Equivalently, some infinite sums may be identified with the zero state in $\mathcal{H}_{\rm BU}$; i.e., for appropriate coefficients $c_i$ one may find
$$\Big\| \sum_i c_i\, |Z[J_{i,1}]\cdots Z[J_{i,m_i}]\rangle \Big\|^2 = 0. \qquad (2.9)$$
Naively, the Hilbert space $\mathcal{H}_{\rm BU}$ may appear to consist of formal power series in the objects Z[J] with some convergence property. But it is in fact smaller since the construction divides out by the set of 'null states' (2.9). This may seem like a minor technical point. Of course, from one perspective the inner product defined by any gravitational path integral naturally leads to a large set of such null states due to the gravitational gauge symmetry. But we usually expect that symmetry to act trivially at the asymptotically AdS boundaries where our sources J are defined; i.e., natural sources J are invariant under familiar gravitational gauge symmetries. As a result, one might expect the null states to simply encode possible senses in which one may have accidentally introduced an overcomplete set of sources. However, one should expect the sum over topologies to modify the gravitational gauge invariance so that it no longer corresponds precisely to familiar diffeomorphisms. As illustrated in figure 3, one expects different slices of the same spacetime to describe gauge equivalent states. But including a sum over topologies means that two such slices may no longer be related by a diffeomorphism, and in fact that they need not even contain the same number of connected components for space at the given time. It will thus be important to compute the effects of this modified gauge symmetry rather than to assume that they take a familiar form. In particular, while one might naively expect the effect of such modifications to be small, we will find in sections 3 and 4 that in certain circumstances they lead to dramatic physical consequences.

Figure 3: In the presence of spacetime wormholes, different spatial slices of a spacetime may have different numbers of connected components. Here, on the slice $\Sigma_1$ we have two circular universes, but on $\Sigma_2$ we have only one. These may be thought of as different gauge choices for the same state.

Footnote 4: $\mathcal{H}_{\rm BU}$ is the set of equivalence classes of Cauchy sequences $\{|\Psi_i\rangle\}$, with two sequences identified when the norm of their difference tends to zero. Recall that a sequence is Cauchy when $\big\| |\Psi_i\rangle - |\Psi_j\rangle \big\|^2 \to 0$ as $i, j \to \infty$. The inner product between two such sequences is defined by the limit of the inner products of the terms, which exists and is the same for all members of the equivalence class. $\mathcal{H}_{\rm BU}$ is then a Hilbert space, so in particular is complete and the inner product is positive definite. It is separable as long as the set of possible sources J has a countable dense subset (assuming that amplitudes are continuous in J).
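A minimal illustration of how a null state can arise is the following. It is a purely hypothetical example constructed here, not one of the models of section 3: assume a single real source $J = J^*$ whose partition function $Z = Z[J]$ takes only the values 0 or 1 in every member of the ensemble, so that $Z^2 = Z$ with probability one.

```latex
% Hypothetical example: Z = Z[J] is real and takes only the values 0 or 1,
% so Z^2 = Z in every member of the ensemble. <...> denotes E[...].
\begin{align}
|N\rangle &= |Z[J]\, Z[J]\rangle - |Z[J]\rangle, \\
\langle N|N\rangle
  &= \langle Z^4\rangle - 2\,\langle Z^3\rangle + \langle Z^2\rangle
   = \mathbb{E}\big[(Z^2 - Z)^2\big] = 0 .
\end{align}
```

Under these assumptions the two formally distinct states $|Z[J]\,Z[J]\rangle$ and $|Z[J]\rangle$ represent the same vector in the quotient, so the Hilbert space is smaller than the space of formal polynomials in $Z[J]$.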
The above construction of $\mathcal{H}_{\rm BU}$ is very similar to the construction of the Hilbert space of a quantum field theory from its correlation functions in the Wightman [79] or Osterwalder-Schrader (see Theorem 3-7 of [80], [81]) reconstruction theorems. In this analogy, our objects Z[J] correspond to (smeared) local operators inserted in the Euclidean past, and the inner products between states with finitely many operator insertions are given by the (Euclidean) Wightman functions. The Hilbert space is again defined by the above completion construction.

Operators and α-eigenstates
Having constructed the baby universe Hilbert space $\mathcal{H}_{\rm BU}$, we now introduce a set of operators acting on it. Here we once again find asymptotic boundaries useful. In particular, we take any boundary Z[J] to define an operator $\widehat{Z[J]}$ on $\mathcal{H}_{\rm BU}$. The matrix elements of this operator are defined by a path integral over all configurations with boundaries specified by some initial and final states with an additional boundary Z[J].
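Since the labelling of boundaries as past, future, and in between does not affect the value of the path integral, the defining relation of the operator $\widehat{Z[J]}$ is
$$\langle Z[\tilde J_1]\cdots Z[\tilde J_n]|\,\widehat{Z[J]}\,|Z[J_1]\cdots Z[J_m]\rangle = \langle Z[\tilde J_1^*]\cdots Z[\tilde J_n^*]\; Z[J]\; Z[J_1]\cdots Z[J_m]\rangle. \qquad (2.10)$$
Since the span of the bra-vectors in (2.10) is dense in $\mathcal{H}_{\rm BU}$, we may write the action of such operators as
$$\widehat{Z[J]}\,|Z[J_1]\cdots Z[J_m]\rangle = |Z[J]\, Z[J_1]\cdots Z[J_m]\rangle, \qquad (2.11)$$
and thus
$$|Z[J_1]\cdots Z[J_m]\rangle = \widehat{Z[J_1]}\cdots\widehat{Z[J_m]}\,|{\rm HH}\rangle. \qquad (2.12)$$
By combining (2.5) and (2.11), we may identify our original path integral as computing correlators in $|{\rm HH}\rangle$ as advertised earlier:
$$\langle Z[J_1]\cdots Z[J_n]\rangle = \langle{\rm HH}|\,\widehat{Z[J_1]}\cdots\widehat{Z[J_n]}\,|{\rm HH}\rangle. \qquad (2.13)$$
We also see that the Hermitian conjugate of $\widehat{Z[J]}$ is given by taking the CPT conjugate of the source:
$$\widehat{Z[J]}^\dagger = \widehat{Z[J^*]}. \qquad (2.14)$$
Thus far, we have really defined the $\widehat{Z[J]}$ as operators on the baby universe pre-Hilbert space (before taking the quotient by null vectors (2.9)). To show that $\widehat{Z[J]}$ is well-defined on $\mathcal{H}_{\rm BU}$, we must show that it maps null states to null states. But this follows immediately from either (2.14) or (2.12). In particular, for any null state $|N\rangle$ and an arbitrary state $|\Psi\rangle$, we may define $|\Psi'\rangle = \widehat{Z[J^*]}\,|\Psi\rangle$ to write
$$\langle\Psi|\,\widehat{Z[J]}\,|N\rangle = \langle\Psi'|N\rangle = 0. \qquad (2.15)$$
The last equality follows from the fact that $|N\rangle$ is null, and since $|\Psi\rangle$ is arbitrary we see that $\widehat{Z[J]}\,|N\rangle$ is also null as desired.
Because the value of the path integral does not depend on any ordering of the boundaries, the operators $\widehat{Z[J]}$ mutually commute,
$$\big[\widehat{Z[J]}, \widehat{Z[J']}\big] = 0, \qquad (2.16)$$
and, since by (2.14) the adjoint of each such operator is again of the same form, they can be simultaneously diagonalised:
$$\widehat{Z[J]}\,|\alpha\rangle = Z_\alpha[J]\,|\alpha\rangle \quad \text{for all } J. \qquad (2.17)$$
Following [55], we call these α-eigenstates, or α-states for short. The spectrum $\{Z_\alpha[J]\}_\alpha$ of $\widehat{Z[J]}$ may be either discrete or continuous. In the latter case the $|\alpha\rangle$ are not normalisable states, but are instead delta function normalized. However, for simplicity we use notation in either case as if $|\alpha\rangle$ are normalisable eigenvectors, writing
$$\langle\alpha'|\alpha\rangle = \delta_{\alpha'\alpha}, \qquad (2.18)$$
leaving the appropriate modifications for continuous spectrum implicit. It turns out that the set $\{\widehat{Z[J]}\}$ for all possible J in fact defines a complete commuting set of operators on $\mathcal{H}_{\rm BU}$, as the state $|\alpha\rangle$ is determined up to a phase by its eigenvalues $Z_\alpha[J]$. To see this, note that we can determine all matrix elements of $|\alpha\rangle$ via
$$\langle Z[J_1]\cdots Z[J_m]|\alpha\rangle = Z_\alpha[J_1^*]\cdots Z_\alpha[J_m^*]\;\langle{\rm HH}|\alpha\rangle. \qquad (2.19)$$
This means that the α-states define a preferred orthonormal basis for $\mathcal{H}_{\rm BU}$; we can even fix phases by choosing $\langle{\rm HH}|\alpha\rangle > 0$.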
The above calculation of the matrix elements also shows that the Hartle-Hawking state has non-zero overlap with every α-state, $\langle{\rm HH}|\alpha\rangle \neq 0$. Otherwise $|\alpha\rangle$ has vanishing overlap with a dense set of states, and hence must be the zero state. If we define $p_\alpha$ by these overlaps according to
$$p_\alpha = \frac{|\langle{\rm HH}|\alpha\rangle|^2}{\langle{\rm HH}|{\rm HH}\rangle}, \qquad (2.20)$$
we find
$$p_\alpha > 0, \qquad \sum_\alpha p_\alpha = 1, \qquad (2.21)$$
where the second follows from completeness and orthonormality of the α basis. Now, by inserting complete sets of α-states, we can compute the general amplitude (2.1):
$$\langle Z[J_1]\cdots Z[J_n]\rangle = \mathfrak{Z}\,\sum_\alpha p_\alpha\; Z_\alpha[J_1]\cdots Z_\alpha[J_n]. \qquad (2.22)$$
The normalising factor $\mathfrak{Z} = \langle{\rm HH}|{\rm HH}\rangle$ is the norm of the Hartle-Hawking state (2.7). Equation (2.22), along with (2.21), tells us that a gravitational path integral (2.1) is quite generally compatible with an ensemble interpretation, exemplified by the matrix ensemble dual to JT gravity in [62], and analogous to the random couplings of [55,56]. Specifically, the parameters α label the various theories in the ensemble, the eigenvalues $Z_\alpha[J]$ give definite values for observables in the theory associated with the particular label α, and $p_\alpha$ gives the probability of selecting α from the ensemble. The states $|\alpha\rangle$ making up our preferred eigenbasis of $\mathcal{H}_{\rm BU}$ are in one-to-one correspondence with members of the ensemble. A less extreme example of α-states is provided by the 'eigenbranes' described in [82] in the context of JT gravity, which act to constrain the eigenvalues of $\widehat{Z[J]}$, thus partially diagonalizing these operators. Note that we arrived at a classical probability distribution because the relevant operators are mutually commuting (2.16). The only property required of the gravitational path integral (besides its existence) was reflection positivity, to guarantee nonnegative probabilities.
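The step of inserting a complete set of α-states can be written out explicitly. The following sketch assumes the ingredients described above in the form stated in the comments: the identification of amplitudes with Hartle-Hawking correlators, the eigenvalue property of the α-states, completeness of the α basis, and the normalisation $p_\alpha = |\langle{\rm HH}|\alpha\rangle|^2 / \langle{\rm HH}|{\rm HH}\rangle$ (written for a discrete spectrum).

```latex
% Assumptions: \widehat{Z[J]}|\alpha> = Z_\alpha[J]|\alpha>,
% \sum_\alpha |\alpha><\alpha| = 1 (discrete spectrum),
% p_\alpha := |<HH|\alpha>|^2 / <HH|HH>.
\begin{align}
\langle Z[J_1]\cdots Z[J_n]\rangle
 &= \langle{\rm HH}|\,\widehat{Z[J_1]}\cdots\widehat{Z[J_n]}\,|{\rm HH}\rangle \\
 &= \sum_\alpha \langle{\rm HH}|\,\widehat{Z[J_1]}\cdots\widehat{Z[J_n]}\,|\alpha\rangle\,
    \langle\alpha|{\rm HH}\rangle \\
 &= \sum_\alpha Z_\alpha[J_1]\cdots Z_\alpha[J_n]\;
    \big|\langle{\rm HH}|\alpha\rangle\big|^2 \\
 &= \langle{\rm HH}|{\rm HH}\rangle\, \sum_\alpha p_\alpha\;
    Z_\alpha[J_1]\cdots Z_\alpha[J_n] .
\end{align}
```

Setting $n = 0$ in the same chain of equalities reproduces the normalisation $\sum_\alpha p_\alpha = 1$.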
With our new Hilbert space point of view, it is now clear that the ensemble described above is not unique. Instead, through (2.13) it was associated with the implicit choice of the Hartle-Hawking state in H BU . While the Hartle-Hawking state is a particularly simple and natural choice, we are nevertheless free to select any state we like.
In particular, if the initial state of the baby universes is an α-state, this selects a single member of the ensemble so that amplitudes factorize:
$$\langle\alpha|\,\widehat{Z[J_1]}\cdots\widehat{Z[J_n]}\,|\alpha\rangle = Z_\alpha[J_1]\cdots Z_\alpha[J_n]. \qquad (2.23)$$
Any other state $|\Psi\rangle$ is a superposition of α-states, and describes an ensemble with probabilities $p_\alpha = |\langle\alpha|\Psi\rangle|^2$. Classical probabilities are sufficient to describe the ensemble, since relative phases between different α-states in the superposition are irrelevant for correlation functions of the commuting operators $\widehat{Z[J]}$. In other words, with respect to the algebra of the $\widehat{Z[J]}$, the α-states define superselection sectors. If the path integral (2.1) already defines factorising amplitudes, so that our theory of gravity has a single boundary dual, we have a trivial special case of the formalism described here. In that case, the operators $\widehat{Z[J]}$ are constants $Z[J]$, and the Hilbert space of closed universes $\mathcal{H}_{\rm BU}$ is one-dimensional, spanned by the Hartle-Hawking state, which is also the unique α-state. We discuss this possibility further in section 6.
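To see concretely how a general baby universe state fails to factorize while an α-state does not, one can compute the variance of a single partition function. This is a sketch under stated assumptions: $|\Psi\rangle = \sum_\alpha c_\alpha |\alpha\rangle$ is normalised, $p_\alpha = |c_\alpha|^2$, the spectrum is discrete, and $Z_\alpha[J^*] = \overline{Z_\alpha[J]}$.

```latex
% Assumptions: normalised |\Psi> = \sum_\alpha c_\alpha |\alpha>,
% p_\alpha = |c_\alpha|^2, and Z_\alpha[J^*] = \overline{Z_\alpha[J]}.
\begin{align}
\langle\Psi|\,\widehat{Z[J^*]}\,\widehat{Z[J]}\,|\Psi\rangle
 - \big|\langle\Psi|\,\widehat{Z[J]}\,|\Psi\rangle\big|^2
 &= \sum_\alpha p_\alpha\,\big|Z_\alpha[J]\big|^2
  - \Big|\sum_\alpha p_\alpha\, Z_\alpha[J]\Big|^2 \\
 &= \sum_\alpha p_\alpha\,
    \Big| Z_\alpha[J] - \sum_\beta p_\beta\, Z_\beta[J] \Big|^2 \;\ge\; 0 .
\end{align}
```

The right hand side vanishes only if $Z_\alpha[J]$ takes the same value for every α with $p_\alpha > 0$. Since an α-state is determined by its eigenvalues, requiring this for all sources J forces the distribution to be supported on a single α.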

More Hilbert spaces
The above discussion concerned the Hilbert space H BU of closed 'baby' universes. We constructed H BU by cutting amplitudes in such a way that any given asymptotic boundary lies completely on one side of the cut. We now generalize this construction to allow cuts that intersect one or more components of the asymptotic boundary, thus splitting such boundary components into two parts. This gives us many different Hilbert spaces depending on the boundary conditions at the intersection, and in particular on the choice of a (d − 1)-dimensional (perhaps oriented) spatial boundary geometry Σ. We thus call the resulting Hilbert space H Σ , leaving implicit the other sources J on Σ. Note that Σ can have any number of connected components, and if Σ is empty we find again the Hilbert space H Σ=∅ = H BU of closed baby universes described above.
The construction of H Σ proceeds much as for H BU , except that in addition to closed asymptotic boundary conditions denoted by Z[J] we also have objects ψ[J] defining boundary conditions on a piece M of an asymptotic boundary with ∂M = Σ. As before, the manifold M, and in particular its boundary Σ, is implicitly included in the sources J. For example, in the right panel of figure 2, M is the solid black semicircle forming the past asymptotically AdS boundary and Σ consists of the right and left endpoints. In a dual interpretation, ψ[J] would define a state on the CFT Hilbert space with spatial geometry Σ, as the wavefunction for a given CFT field configuration on Σ would be computed by a path integral on M with sources J.
As before, we may choose $M$ to be connected. Note that this does not imply that $\Sigma = \partial M$ is connected. When $\Sigma$ is not, it can be useful to write $\Sigma$ as the disjoint union $\Sigma = \Sigma_1 \sqcup \cdots \sqcup \Sigma_m$ of components $\Sigma_i$ (where the ordering of the components is meaningful, in case they have the same geometry). Generalizing (2.3), we then have states
$$|\psi[J_1]\cdots\psi[J_m]\rangle \in \mathcal{H}_{\Sigma_1\sqcup\cdots\sqcup\Sigma_m}, \tag{2.24}$$
where $\psi[J_i]$ is associated with component $\Sigma_i$ for any source $J$. While this notation is useful, it is also somewhat awkward if we take a given $\psi[J_i]$ to be associated with a connected $M_i$, whose boundary $\partial M_i = \Sigma_i$ may again be disconnected. As a result, one will sometimes need to use a number of distinct decompositions $\Sigma = \Sigma_1 \sqcup \cdots \sqcup \Sigma_m$ (perhaps with different values of $m$) for a given $\mathcal{H}_\Sigma$. The inner product on $\mathcal{H}_\Sigma$ generalizes (2.5) in a natural way if we note that a boundary condition $\psi[\tilde J_i]$ in the 'bra' (on some $\tilde M_i$ with $\partial\tilde M_i = \Sigma_i$) can be paired with a boundary condition $\psi[J_i]$ in the 'ket' (again on some $M_i$ with $\partial M_i = \Sigma_i$) to define a boundary condition $Z[\tilde J^*, J]$ associated with the closed boundary manifold $\tilde M_i^* M_i$, constructed by taking the manifold $\tilde M_i^*$ (formed from $\tilde M_i$ by reversing the orientation) and sewing $\tilde M_i^*$ to $M_i$ along $\Sigma_i$. In $Z[\tilde J^*, J]$, the symbol $*$ again denotes CPT conjugation of sources, and the sources on $\tilde M^* M$ are given locally by $\tilde J^*$ and $J$. One may also wish to restrict the allowed sources to vanish sufficiently quickly at $\Sigma_i$ so that the sources defined on $\tilde M_i^* M_i$ by such sewings are sufficiently smooth.
It is important that the above sewing is uniquely defined even when $\Sigma_i$ admits isometries. In particular, recall that the above discussion fixed a manifold $\Sigma \supseteq \Sigma_i$ from the beginning, and at no point was there a quotient by diffeomorphisms of $\Sigma$. The individual points of $\Sigma$ should thus be thought of as carrying definite labels, defining the unique sewing of $\tilde M$ to $M$. In particular, the notation in (2.24) is not invariant under reordering of the $\Sigma_i$.
We shall write the pairing as $Z[\tilde J^*, J] = (\psi[\tilde J], \psi[J])$. This notation is chosen to be suggestive of an inner product $(\cdot,\cdot)$ of states in the dual CFT Hilbert space. The distinguishability of points in $\Sigma$ is motivated either by a dual CFT perspective, or from familiar gravitational boundary conditions at asymptotically AdS boundaries. The extended inner product is then defined by using the above pairing and evaluating the resulting path integral as before:
$$\langle \psi[\tilde J_1]\cdots\psi[\tilde J_m]\,|\,\psi[J_1]\cdots\psi[J_m]\rangle = \big\langle (\psi[\tilde J_1],\psi[J_1])\cdots(\psi[\tilde J_m],\psi[J_m])\big\rangle. \tag{2.25}$$
We emphasize again that if $\Sigma$ contains identical connected components $\Sigma_1$, $\Sigma_2$, the components are treated as distinguished and canonically ordered. Thus in the notation of (2.24),
$$|\psi[J_1]\,\psi[J_2]\rangle \neq |\psi[J_2]\,\psi[J_1]\rangle. \tag{2.26}$$
While the norms of these states will agree, the inner product of these states with generic other kets will not (for example, so this pairing makes sense). This is a special case of the statement that states need not be invariant under symmetries of $\Sigma$. As in the discussion of $\mathcal{H}_{\rm BU}$, the structure above is properly described as a pre-Hilbert space. The actual Hilbert space $\mathcal{H}_\Sigma$ is then constructed as a completion, which includes a quotient with respect to the space of null vectors. This procedure succeeds when the path integral is appropriately reflection positive, by which we mean that the inner product it defines on the pre-Hilbert space is positive semi-definite. The inner product on the final $\mathcal{H}_\Sigma$ is then positive definite as desired. Note that reflection positivity on $\mathcal{H}_\Sigma$ is an additional requirement we impose on the path integral, not necessarily implied by reflection positivity on $\mathcal{H}_{\rm BU}$; this will prove to be relevant for the toy model discussed in section 3.
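As before, we have operators $\widehat{Z[J]}$ acting on the Hilbert spaces $\mathcal{H}_\Sigma$; in particular, they preserve the space of null states in the pre-Hilbert space for the same reason as before. Again, these operators mutually commute. But now we also have a plethora of new operators which can map between Hilbert spaces with different boundaries. In particular, if $\psi[J]$ is associated with $M$ having $\partial M = \Sigma$, then for any $\tilde\Sigma$ there is an operator
$$\widehat{\psi[J]} : \mathcal{H}_{\tilde\Sigma} \to \mathcal{H}_{\tilde\Sigma\sqcup\Sigma}, \tag{2.27}$$
which acts by adding $\psi[J]$ to the boundary conditions defining the state. The adjoint operator $\widehat{\psi[J]}^\dagger$ maps from $\mathcal{H}_{\tilde\Sigma\sqcup\Sigma}$ to $\mathcal{H}_{\tilde\Sigma}$ by taking the boundary conditions defined by the state on which it acts, and gluing them to boundary conditions of the CPT conjugate source $J^*$ along the manifold $\Sigma$.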
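Since the $\widehat{Z[J]}$ commute, it is again useful to diagonalize them using α-states. Thus the Hilbert space splits as
$$\mathcal{H}_\Sigma = \bigoplus_\alpha \mathcal{H}^\alpha_\Sigma. \tag{2.28}$$
One can explicitly build the spaces $\mathcal{H}^\alpha_\Sigma$ from the α-states of $\mathcal{H}_{\rm BU}$, as we may define
$$|\psi[J_1]\cdots\psi[J_m];\alpha\rangle = \widehat{\psi[J_1]}\cdots\widehat{\psi[J_m]}\,|\alpha\rangle, \tag{2.29}$$
and the states (2.29) are dense in $\mathcal{H}^\alpha_\Sigma$. In the special case $\Sigma = \emptyset$ corresponding to $\mathcal{H}_{\rm BU}$, each $\mathcal{H}^\alpha_\emptyset$ is one dimensional, consisting of multiples of $|\alpha\rangle$. It follows that all of our boundary operators leave α unchanged. For example, evaluating the analogue of (2.25) in α-states we have
$$\langle \psi[\tilde J];\alpha'\,|\,\psi[J];\alpha\rangle = \delta_{\alpha\alpha'}\,(\psi[\tilde J],\psi[J])_\alpha, \tag{2.30}$$
where $(\psi[\tilde J],\psi[J])_\alpha$ denotes the eigenvalue $Z_\alpha[\tilde J^*, J]$ of the closed-boundary operator obtained from the pairing. It also follows that $\widehat{Z[J]}$ commutes with $\widehat{\psi[J]}$. Finally, note that there is a natural map $\Upsilon$ from $\mathcal{H}_{\Sigma_1}\otimes\mathcal{H}_{\Sigma_2}$ into $\mathcal{H}_{\Sigma_1\sqcup\Sigma_2}$ defined by concatenation of sources:
$$\Upsilon : |\psi[J_1]\rangle\otimes|\psi[J_2]\rangle \mapsto |\psi[J_1]\,\psi[J_2]\rangle. \tag{2.31}$$
This map acts nicely within each α-sector, taking $\mathcal{H}^\alpha_{\Sigma_1}\otimes\mathcal{H}^\alpha_{\Sigma_2}$ into $c\,\mathcal{H}^\alpha_{\Sigma_1\sqcup\Sigma_2}$ for an α-dependent constant $c$, and more generally annihilating $\mathcal{H}^\alpha_{\Sigma_1}\otimes\mathcal{H}^{\alpha'}_{\Sigma_2}$ for $\alpha\neq\alpha'$. Here we have used the notation $c\mathcal{H}$ for non-negative real $c$ to denote a Hilbert space with inner product $c$ times that of $\mathcal{H}$. In particular, $c\mathcal{H} = \{0\}$ for $c = 0$. We will use $\Upsilon_\alpha$ to denote the restriction of $\Upsilon$ to diagonal tensor products of the form $\mathcal{H}^\alpha_{\Sigma_1}\otimes\mathcal{H}^\alpha_{\Sigma_2}$. It is natural to attempt to interpret $\mathcal{H}^\alpha_\Sigma$ as the Hilbert space of a dual CFT $\mathcal{C}_\alpha$ on $\Sigma$; this is the natural formulation of an isomorphism between bulk and boundary Hilbert spaces in the context of ensembles and baby universes. In this case, we would expect $\Upsilon_\alpha$ to be an isomorphism, since this property would certainly hold true in a local dual theory. But this is not always the case, as the map may not be surjective; we will discuss an explicit example in section 3.6. The failure of $\Upsilon_\alpha$ to be an isomorphism is a precise version of another potential 'factorisation problem' [83][84][85], which differs from the partition function factorisation problem discussed in the introduction and the start of this section. This new issue is naturally associated with spatial wormholes while (2.2) is related to spacetime wormholes.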
In particular, the factorization problem of [83][84][85] occurs when there are two-sided black hole states with a spatial wormhole (Einstein-Rosen bridge) which cannot be represented as superpositions of products of 'microstates' in the corresponding one-sided Hilbert spaces. For example, in a bulk theory with a standard Maxwell field but no charged particles, there are eternal charged black holes but no one-sided counterparts. An extreme version appears in pure JT gravity, which has a two-boundary Hilbert space but no single-sided Hilbert space. We expect that this feature is an artefact of simple toy models, and would be absent in more realistic theories.

Example: a very simple topological theory
This section further explores the structure described in section 2 in very simple theories of two-dimensional gravity. Indeed, the model described in section 3.1 is plausibly the simplest possible such theory. Our models are inspired by recent work studying spacetimes of nontrivial topology in JT gravity [62,86,87], along with the addition of 'end-of-the-world brane' dynamical boundaries [49]. We further simplify that class of models by removing any notion of a dynamical metric or dilaton, leaving a theory of topology alone. The resulting models are tractable enough to be solved exactly, and for many details to be made explicit. They thus give a surprisingly clean illustration of the ideas of section 2, and demonstrate the type of results to which such ideas can lead.
We begin by presenting the simplest model (without end-of-the-world branes) in section 3.1. This theory allows only one boundary condition $Z$, associated with a single operator $\widehat{Z}$ of the class described in section 2.3, with the path integral defined by a single bulk parameter $S_0$ determining the suppression of nontrivial topology, along with a (somewhat ad hoc) parameter $S_\partial$ associated with boundaries, whose preferred value $S_\partial = S_0$ will be determined later by a consistency analysis in section 3.7. We then evaluate its amplitudes in section 3.2 and construct the Hilbert space of closed universes $\mathcal{H}_{\rm BU}$ in section 3.3. The most interesting output of this model is that the spectrum of $\widehat{Z}$ turns out to be non-negative and discrete, and in fact takes non-negative integer values for $S_\partial = S_0$, compatible with an interpretation as the dimension of a dual Hilbert space. The model with end-of-the-world branes is then described in section 3.4, and its α-states are described in section 3.5. Here we find that, no matter how many species $k$ of end-of-the-world brane states we allow, for $S_\partial = S_0$ all α-states define an inner product on end-of-the-world brane states with rank equal to or less than the eigenvalue $Z_\alpha$ of $\widehat{Z}$, compatible with states in a dual Hilbert space of dimension $Z_\alpha$. This remarkable compression of the Hilbert space illustrates the importance of understanding the null states (2.9) in extracting the correct physics. It also shows in this model that results analogous to the Rényi entropy computations of [48,49] will hold not just for typical members of the ensemble defined by the Hartle-Hawking no-boundary state, but in fact for all allowed α-states.
We then return to the ad hoc parameter $S_\partial$ in section 3.7. First, we describe how different choices for this parameter modify the model. We find that for generic $S_\partial$ (and in particular $S_\partial = 0$) the end-of-the-world brane models fail to be reflection positive, and find the set of $S_\partial$ for which reflection positivity holds true. For values of $S_\partial$ satisfying reflection positivity for any number $k$ of end-of-the-world brane states, the spectrum of $\widehat{Z}$ is a subset of the non-negative integers and the rank of the end-of-the-world brane Hilbert space is bounded as above. In particular, the reflection positive models have all the properties required to interpret $Z_\alpha$ as the dimension of a Hilbert space which contains the end-of-the-world brane states.

A theory of topological surfaces
We now consider a theory of purely topological two-dimensional gravity in which spacetime is a two-dimensional manifold 7 (surface), but the only additional structure we introduce is an orientation. We thus have neither a spacetime metric nor the conformal or complex structure that would appear in the standard model of topological gravity [88]. The histories that can appear in a path integral are then the set of oriented topological surfaces with boundaries dictated by the relevant boundary conditions. This set is discrete and (for each connected component) is famously classified by genus and number of circular boundaries [89,90]. Since there is no possibility to add sources in this model, we simply use Z to denote the boundary condition on any circular boundary. 8 In this first model, the only boundaries are those fixed by boundary conditions. As described in section 2.4, such boundaries should be thought of as distinguishable even when their boundary conditions coincide. As a result, the space of allowed configurations is the set of oriented surfaces with labelled boundaries, and two such configurations are considered equivalent only when they are related by a diffeomorphism that preserves each boundary separately.
We therefore define our path integral as a sum over such diffeomorphism classes of surface $M$. Nevertheless, residual effects of diffeomorphism invariance can lead to a nontrivial measure $\mu(M)$ on this space. This can arise when a group $\Gamma(M)$ of residual gauge symmetries remains after gauge fixing diffeomorphisms. This naturally leads to symmetry factors in the measure, of the form $\mu(M) = \frac{1}{|\Gamma(M)|}$. One may therefore expect to write our path integral in the tentative form
$$\langle Z^n\rangle = \sum_{\substack{\text{surfaces } M \\ |\partial M| = n}} \mu(M)\, e^{-S[M]}, \tag{3.1}$$
where we sum over surfaces $M$ obeying the appropriate boundary conditions, up to diffeomorphisms acting trivially on the boundaries, weighted by an action $S[M]$. One would ideally like to derive the measure factor $\mu(M)$ from a more complete model. Here, we will be content to define the model with a well-motivated choice of measure that leads to natural results. Since boundaries are distinguishable, and since any two surfaces related by boundary-preserving diffeomorphisms are already considered equivalent, we will assume the trivial measure $\mu(M) = 1$ for any connected manifold. It then remains to discuss only contributions to $\mu(M)$ from boundary-preserving diffeomorphisms that interchange the connected components of $M$. These can act only on compact connected components (i.e., the ones that have no boundary). With this understanding, the detailed form of $\mu(M)$ turns out to have little effect on the physics of interest. It leads only to a change of the 'cosmological partition function' $\mathcal{Z}$, the sum over compact universes, which is an overall normalisation of amplitudes (though at the end of section 3.3 we will encounter a situation in which our choice of measure is physically important). Nevertheless, we regard diffeomorphisms that permute compact connected components (necessarily with the same genus $g$) as residual gauge symmetries, and divide by the number of such permutations in the measure.
This means that, if $M$ has $m_g$ connected components of genus $g$ with no boundary for each $g$, we have
$$\mu(M) = \prod_g \frac{1}{m_g!}. \tag{3.2}$$
Following the principles of effective field theory, we should now write down the most general action allowed by the degrees of freedom. Fortunately, with only the topological degrees of freedom available to us, there is a unique local such action $S(M) = -S_0\,\chi(M)$, proportional to the Euler characteristic $\chi$ of spacetime$^{9}$, with a unique free parameter $S_0$. This is the Einstein-Hilbert action in two dimensions, and is the topological term of the action in JT gravity.
Despite the apparent uniqueness of the action, we now introduce an additional term $-S_\partial\,|\partial M|$, where $|\partial M|$ denotes the number of circular boundaries of $M$. As forewarned in the introduction to this section, for the moment the extra parameter $S_\partial$ appears completely ad hoc. In particular, while this is an intrinsic function of asymptotic boundaries, it is not a local counterterm. Indeed, as stated above, we expect that the unique local theory of our form is given by setting $S_\partial = 0$. We discuss how this factor may arise in section 3.7 below, perhaps most simply by introducing a new local degree of freedom residing on boundaries. For now we simply note that the parameter effectively just rescales the definition of $Z$; i.e., it can be removed by introducing $\tilde Z = e^{S_\partial} Z$ and replacing each $Z$ in (3.1) by $\tilde Z$.
Since all values of S ∂ are related by this scaling, it suffices to discuss only a single value in detail, and then to use the above scaling to understand all other values. Until section 3.7, we will thus confine discussion to the particularly simple case S ∂ = S 0 . As an a posteriori justification, we will show in section 3.7 that the end-of-the world brane models fail to be reflection positive when S ∂ = 0, and S ∂ = S 0 is the most natural choice to cure this failure.
Our action is thus given by
$$S(M) = -S_0\big(\chi(M) + |\partial M|\big). \tag{3.4}$$
The practical simplification of choosing $S_\partial = S_0$ is that it precisely cancels boundary contributions to $\chi$ in the action. The amplitudes in our path integral thus take the form
$$\langle Z^n\rangle = \sum_{\substack{\text{surfaces } M \\ |\partial M| = n}} \mu(M)\, e^{S_0\,\tilde\chi(M)}, \tag{3.5}$$
which we have written in terms of a modified Euler characteristic $\tilde\chi$ that does not count boundaries and which is given simply by
$$\tilde\chi(M) = \chi(M) + |\partial M| = \sum_{\text{connected components}} (2 - 2g). \tag{3.6}$$
Here $g$ is the usual genus of each connected component, which counts handles. It will be useful below to sometimes use an alternate presentation of the sum (3.5). Instead of summing over surfaces with labelled boundaries, we can write $\langle Z^n\rangle$ as a sum over ordered lists $M_L$ of connected manifolds whose boundaries we choose not to label. The number of ways to label the boundaries is then accounted for by including a separate factor of the multinomial coefficient $\frac{n!}{\prod_i n_i!}$, where $n_i$ is the number of boundaries in the $i$th entry of the list $M_L$. As is well known, $\frac{n!}{\prod_i n_i!}$ gives precisely the number of ways to arrange $n$ boundaries into lists of subsets that have $n_i$ boundaries in the $i$th subset. For a list of length $m$, including a factor of $\frac{1}{m!}$ then accounts for the fact that the components are not ordered in the original sum (3.5), and also for the factor of $\mu(M)$ that arises when some items in the list both coincide and have no boundaries (so that exchanging these items neither generates a new term in (3.5) nor generates a new partition of the $n$ boundaries). Thus we may rewrite (3.5) as
$$\langle Z^n\rangle = \sum_{m=0}^{\infty} \frac{1}{m!} \sum_{\substack{M_L \text{ of length } m \\ \sum_i n_i = n}} \frac{n!}{\prod_i n_i!}\; e^{S_0\,\tilde\chi(M_L)}, \tag{3.7}$$
where $n$, $m$, and $n_i$ are as above.
Before computing the amplitudes (3.5), it is useful to comment further on the interpretation of $Z$ in terms of a putative dual $(0+1)$-dimensional quantum mechanics (which we will sometimes call a CFT in analogy with AdS/CFT). Each $Z$ would be naturally associated with the path integral of this quantum mechanics on the circle, which would describe the partition function $\operatorname{Tr} e^{-\beta H}$ for a circle of length $\beta$. But since we have no metric, there is no notion of boundary length $\beta$, and invariance under diffeomorphisms of the boundary implies a vanishing Hamiltonian $H = 0$. This means we have a topological quantum mechanics (a one-dimensional TQFT) where the only observable is the trace of the identity operator, which is the dimension of the Hilbert space:
$$Z = \operatorname{Tr}_{\mathcal{H}_{\rm CFT}} \mathbb{1} = \dim \mathcal{H}_{\rm CFT}. \tag{3.8}$$
A unitary dual quantum mechanics is therefore characterised by $Z$ taking a value in the natural numbers $\mathbb{N}$ (or perhaps by $Z$ being infinite). In the presence of spacetime wormholes connecting these boundaries, it would thus seem natural to find that $Z$ is a random variable taking nonnegative integer values. We will see below that this is precisely the case for our model.

Evaluating the amplitudes
We now solve for the amplitudes $\langle Z^n\rangle$ defined above. We begin by computing the no-boundary partition function $\mathcal{Z}$ as in equation (2.7). This is the case $n = 0$, given by the sum over arbitrary compact spacetimes without boundary. For this, we first compute the sum $\lambda$ over connected compact surfaces, which are classified by genus. The measure is trivial for a connected surface, i.e. $\mu(M) = 1$, so we have
$$\lambda = \sum_{g=0}^{\infty} e^{S_0(2-2g)} = \frac{e^{2S_0}}{1 - e^{-2S_0}}. \tag{3.9}$$
With our amplitudes defined by (3.5), and in particular excluding boundaries from the count in the Euler character, the value of $\lambda$ is always the amplitude for any connected component of spacetime (with fixed but arbitrary boundaries) after summing over connected topologies. This property determines all amplitudes of the model. In the usual way, one may write $\mathcal{Z}$ as the exponential of the sum $\lambda$ over connected surfaces. For this, it is important that we include symmetry factors in our definition of the measure $\mu(M)$. Indeed, the exponentiation is particularly explicit when using (3.7) with $n = n_i = 0$, in which lists of length $m$ contribute $\frac{1}{m!}$ times the $m$th power of the sum in (3.9). We thus find
$$\mathcal{Z} = \langle 1\rangle = e^{\lambda}. \tag{3.10}$$
In particular, in our model the path integral defined by the sum over topologies converges.
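As a quick numerical illustration of the genus sum, the following sketch compares a truncated sum over genus with the geometric-series closed form. It assumes that each connected surface of genus $g$ is weighted by $e^{S_0(2-2g)}$, as follows from the modified Euler characteristic described above; the function names and the cutoff `g_max` are hypothetical choices made for this example.

```python
import math


def lam_genus_sum(S0, g_max=200):
    """Truncated sum over genus of the assumed weight e^{S0 * (2 - 2g)}."""
    return sum(math.exp(S0 * (2 - 2 * g)) for g in range(g_max + 1))


def lam_closed_form(S0):
    """Geometric-series closed form e^{2 S0} / (1 - e^{-2 S0}), valid for S0 > 0."""
    return math.exp(2 * S0) / (1 - math.exp(-2 * S0))


def cosmological_partition_function(S0):
    """No-boundary amplitude, the exponential of the connected sum lambda."""
    return math.exp(lam_closed_form(S0))
```

For any real $S_0 > 0$ the ratio of successive terms is $e^{-2S_0} < 1$, so the truncated sum converges rapidly to the closed form.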
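We now introduce boundaries. To evaluate $\langle Z^n\rangle$, it is simplest to compute a generating function
$$\langle e^{uZ}\rangle = \sum_{n=0}^{\infty} \frac{u^n}{n!}\,\langle Z^n\rangle, \tag{3.11}$$
and to extract the amplitudes from a power series in the 'chemical potential' $u$. Again, we wish to write (3.11) as the exponential of a sum over connected geometries. This is precisely the usual combinatorics familiar from Feynman diagrams, but it can also be seen explicitly from (3.7), which gives
$$\langle e^{uZ}\rangle = \sum_{m=0}^{\infty} \frac{1}{m!} \sum_{M_L} \left(\prod_{i=1}^{m} \frac{u^{n_i}}{n_i!}\right) e^{S_0\,\tilde\chi(M_L)}, \tag{3.12}$$
where $m$ is the number of surfaces in the list $M_L$, and $n_i$ for $i = 1, \ldots, m$ is the number of boundaries of the $i$th surface in the list. Since $\tilde\chi$ for the disconnected surface $M_L$ is the sum of $\tilde\chi$ for the individual components, the disconnected pieces exponentiate,
$$\langle e^{uZ}\rangle = \exp\left(\sum_{n=0}^{\infty}\sum_{g=0}^{\infty} \frac{u^n}{n!}\, e^{S_0(2-2g)}\right). \tag{3.13}$$
Furthermore, since the factor $\frac{u^n}{n!}$ is determined entirely by $n$ while the factor $e^{S_0\tilde\chi(M)}$ depends only on the genus $g$, the double sum in (3.13) may be written as the product
$$\langle e^{uZ}\rangle = \exp\left[\left(\sum_{n=0}^{\infty}\frac{u^n}{n!}\right)\left(\sum_{g=0}^{\infty} e^{S_0(2-2g)}\right)\right] = \exp\left(\lambda e^{u}\right). \tag{3.14}$$
Here the last equality has used (3.9) to identify $\lambda$ with the sum over $g$. We can extract the correlators $\langle Z^n\rangle$ by expanding the generating function $\exp(\lambda e^u)$ in powers of $u$.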
We pause to note that there is a more direct way to compute the amplitudes $\langle Z^n\rangle$. Here we first divide by $\mathcal{Z}$ to remove contributions from closed manifolds and thus any mention of $\mu(M)$. What remains is then simply to count the relevant configurations remaining in (3.5). Such configurations are classified according to which of the $n$ boundaries lie in the same connected component of spacetime, and thus by a partition of the set $\{1, 2, \ldots, n\}$ labelling the boundaries. For each connected component of spacetime, it then remains only to sum over genus, giving a factor of $\lambda$ from (3.9). We may thus compute the amplitudes from a counting of partitions, graded by the number of subsets of $\{1, 2, \ldots, n\}$ that the partition defines:
$$\frac{\langle Z^n\rangle}{\mathcal{Z}} = \sum_{\text{partitions } \pi \text{ of } \{1,\ldots,n\}} \lambda^{|\pi|} = B_n(\lambda), \tag{3.15}$$
where $|\pi|$ is the number of subsets in the partition $\pi$. Here $B_n$ is known as the Bell polynomial of order $n$ (BellB[n,λ] in Mathematica; also called the Touchard polynomial). In agreement with our previous result, these polynomials are indeed known to have the generating function $\exp(\lambda(e^u - 1))$, as in (3.14) after dividing by $\mathcal{Z} = e^\lambda$.
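To illustrate the counting in detail, consider the example of the third moment $\langle Z^3\rangle$; i.e., the case $n = 3$. There are five distinct ways to divide the three boundaries into connected components: one configuration in which each boundary lies in its own component, three configurations in which two boundaries share a component (a cylinder after stripping off handles) and the third lies in a separate one, and one configuration in which all three boundaries lie in a single component. This gives
$$\frac{\langle Z^3\rangle}{\mathcal{Z}} = \lambda^3 + 3\lambda^2 + \lambda. \tag{3.16}$$
Since the boundaries are distinguishable, the three configurations with two connected components are counted separately, and there are no explicit symmetry factors in this counting.$^{10}$ The alternative counting used in (3.7) would instead list each topologically distinct term in (3.16) only once, but would accompany each term by the number $N_L$ of distinct ordered lists that one can construct from the connected components and the factor of $\frac{n!}{m!\,\prod_i n_i!}$ from (3.7). This gives the identical result
$$\frac{\langle Z^3\rangle}{\mathcal{Z}} = 1\cdot\frac{1}{3!}\,\frac{3!}{1!\,1!\,1!}\,\lambda^3 + 2\cdot\frac{1}{2!}\,\frac{3!}{2!\,1!}\,\lambda^2 + 1\cdot\frac{1}{1!}\,\frac{3!}{3!}\,\lambda, \tag{3.17}$$
where the first term has $\left(N_L, m!, \frac{n!}{\prod_i n_i!}\right) = \left(1, 3!, \frac{3!}{1!\,1!\,1!}\right)$ since the 3 components are all identical but have only one boundary each, the second term has $\left(N_L, m!, \frac{n!}{\prod_i n_i!}\right) = \left(2, 2!, \frac{3!}{2!\,1!}\right)$ since the two components are not homeomorphic but the cylinder has 2 boundaries, and the third term has $\left(N_L, m!, \frac{n!}{\prod_i n_i!}\right) = \left(1, 1, \frac{3!}{3!}\right)$ since all 3 boundaries lie in the single connected component.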
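The counting of configurations by set partitions can be checked by brute force. The sketch below enumerates all partitions of $n$ labelled boundaries and grades them by the number of blocks, with each block standing for one connected component of spacetime and contributing a factor $\lambda$. The helper names `set_partitions`, `moment_coefficients`, and `moment` are introduced here for illustration only.

```python
from collections import Counter


def set_partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def moment_coefficients(n):
    """Map m -> number of ways to split n labelled boundaries into m components."""
    labels = list(range(1, n + 1))
    return Counter(len(p) for p in set_partitions(labels))


def moment(n, lam):
    """Normalised amplitude <Z^n>/<1>, one factor of lam per connected component."""
    return sum(count * lam**m for m, count in moment_coefficients(n).items())
```

For $n = 3$ this gives the coefficients 1, 3, 1 for three, two, and one connected components, i.e. $\lambda^3 + 3\lambda^2 + \lambda$.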
We now interpret the amplitudes in terms of a probability distribution where $Z$ is regarded as a random variable. To do this, we divide the generating function $\langle e^{uZ}\rangle$ by the normalisation factor $\mathcal{Z}$ and write the result as the Taylor series for the exponential:
$$\frac{\langle e^{uZ}\rangle}{\mathcal{Z}} = \exp\left[\lambda(e^u - 1)\right] = e^{-\lambda}\sum_{d=0}^{\infty} \frac{\lambda^d}{d!}\, e^{ud}. \tag{3.18}$$
Extracting the coefficient of $\frac{u^n}{n!}$ from (3.18) gives
$$\frac{\langle Z^n\rangle}{\mathcal{Z}} = \sum_{d=0}^{\infty} d^n\, p_d(\lambda), \qquad p_d(\lambda) = e^{-\lambda}\frac{\lambda^d}{d!}, \tag{3.19}$$
showing that all moments can be generated from a single distribution for $Z$ with support on nonnegative integers $d$ having manifestly non-negative probabilities $\Pr(Z = d) = p_d(\lambda)$. We thus identify $Z$ as a Poisson random variable with mean $\lambda$. We may also read this off directly from (3.14) using the fact that $\exp[\lambda(e^u - 1)]$ is the moment generating function for a Poisson random variable. Alternatively, one can see this from the amplitudes (3.15) using the fact that $B_n(\lambda)$ is the $n$th moment of the Poisson distribution. The appearance of the Poisson distribution can be understood from the result that all connected components of spacetime contribute the same amplitude $\lambda$ after summing over genus, independent of the number of boundaries. This corresponds to the fact that the cumulants of the Poisson distribution (that is, the completely connected correlation functions) are all equal to $\lambda$. This is a surprising and remarkable result. As reviewed in section 5 below, a perturbative description of the theory following [57] (based on a Fock space labelled by number of baby universes and with wormholes treated as a small correction) would have led to the expectation that $Z$ should have a continuous distribution supported on all real numbers. Instead, from our exact nonperturbative solution we find that the support of $Z$ is discrete, and limited to nonnegative values.
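The identification of the normalised amplitudes with Poisson moments can be verified numerically. The following sketch computes the Bell (Touchard) polynomial from Stirling numbers of the second kind and compares it with the moments of a Poisson distribution of mean $\lambda$, summed directly over $d$. The function names and the truncation `d_max` are assumptions made for this example.

```python
import math


def stirling2_row(n):
    """Stirling numbers of the second kind S(n, k) for k = 0..n."""
    row = [1]  # S(0, 0) = 1
    for i in range(1, n + 1):
        new = [0] * (i + 1)
        for k in range(1, i + 1):
            above = row[k] if k < len(row) else 0
            # Recurrence S(i, k) = k * S(i-1, k) + S(i-1, k-1).
            new[k] = k * above + row[k - 1]
        row = new
    return row


def bell_polynomial(n, lam):
    """B_n(lam) = sum_k S(n, k) lam^k, counting partitions graded by block number."""
    return sum(s * lam**k for k, s in enumerate(stirling2_row(n)))


def poisson_moment(n, lam, d_max=200):
    """Sum over d of d^n * e^{-lam} lam^d / d!, truncated at d_max."""
    total, term = 0.0, math.exp(-lam)  # term holds e^{-lam} lam^d / d!
    for d in range(d_max + 1):
        total += term * d**n
        term *= lam / (d + 1)
    return total
```

For moderate $\lambda$ the two functions agree to floating-point accuracy for every $n$ tested, as expected if $Z$ is Poisson distributed with mean $\lambda$.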
Furthermore, for our choice S ∂ = S 0 (or more generally for S ∂ = S 0 + log n for any positive integer n), since Z takes nonnegative integer values d we find that the result is compatible with the interpretation (3.8) in terms of an ensemble of dual Hilbert spaces. Although at this stage this result appears to depend on fine tuning the parameter S ∂ , we will see in section 3.7 that full consistency (in particular full reflection positivity) of the model in fact favours precisely the relation S ∂ = S 0 + log n.
As a final comment, it is interesting that the relation (3.9) between the 'bare' parameter $e^{S_0}$ and the physically observable parameter $\lambda$ is not injective, but is instead two-to-one. This means that for a given value of $e^{S_0}$, there is a second value $e^{\tilde S_0}$ that gives rise to the same $\lambda$, and hence the same theory. Indeed, in terms of $x = e^{-2S_0}$ the genus sum gives $\lambda = 1/\big(x(1-x)\big)$, which is invariant under $x \to 1 - x$. In particular, we find
$$e^{-2\tilde S_0} = 1 - e^{-2S_0}. \tag{3.20}$$
This is a strong-weak self-duality of the model in the sense that the semiclassical limit of large $S_0$ suppresses connected topologies (and thus describes weakly coupled universes), but yields the same theory as a very small value of the dual $\tilde S_0$. At the self-dual value $e^{2S_0} = 2$ we have $\lambda = 4$, and smaller values of $\lambda$ correspond to complex couplings, with $e^{-2S_0} \in \frac{1}{2} + i\mathbb{R}$. From the point of view of the path integral in a semiclassical expansion it is surprising that such a complex coupling gives rise to reflection positive amplitudes, and hence to a unitary Hilbert space and positive probabilities.
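The two-to-one relation between the bare coupling and $\lambda$ is easy to explore numerically. The sketch below assumes the geometric-series form $\lambda = e^{2S_0}/(1 - e^{-2S_0})$ for the genus sum, so that in terms of $x = e^{-2S_0}$ one has $\lambda = 1/(x(1-x))$ and the dual coupling corresponds to $x \to 1 - x$. The function names are hypothetical.

```python
import math


def lam_from_S0(S0):
    """lambda as a function of the bare coupling, assuming the geometric genus sum."""
    return math.exp(2 * S0) / (1 - math.exp(-2 * S0))


def dual_S0(S0):
    """Dual coupling defined by e^{-2 S0_dual} = 1 - e^{-2 S0}, for real S0 > 0."""
    return -0.5 * math.log(1 - math.exp(-2 * S0))


def lam_from_x(x):
    """lambda in terms of x = e^{-2 S0}; x may be complex."""
    return 1 / (x * (1 - x))


# Self-dual point: x = 1/2, i.e. e^{2 S0} = 2.
S0_self_dual = 0.5 * math.log(2)

# A complex coupling on the line Re(x) = 1/2 still gives a real lambda below 4.
lam_complex = lam_from_x(0.5 + 0.7j)
```

On the line $x = \tfrac12 + iy$ one finds $\lambda = 1/(\tfrac14 + y^2)$, which is real and never exceeds 4, consistent with the self-dual value being the smallest $\lambda$ reachable with real $S_0$.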

The baby universe Hilbert space
We can now give a complete description of the Hilbert space of closed universes $\mathcal{H}_{\rm BU}$. Every state can be written as a linear combination of states $|Z^m\rangle$ created by inserting $m$ boundaries in the past, with inner product
$$\langle Z^m | Z^n\rangle = \langle Z^{m+n}\rangle = \sum_{d=0}^{\infty} \frac{\lambda^d}{d!}\, d^{\,m+n}. \tag{3.21}$$
A more general state $\sum_{n=0}^{\infty} c_n |Z^n\rangle$ can then be represented as $|f(Z)\rangle$, where $f$ is a function with Taylor coefficients $c_n$, which grow slowly enough for convergence. Demanding that the partial sums $\left\{\sum_{n=0}^{N} c_n |Z^n\rangle\right\}_N$ form a Cauchy sequence guarantees that $f$ defines an entire analytic function (see appendix A.1). Before considering the details of the inner product, we are thus led to the idea that $\mathcal{H}_{\rm BU}$ is a space of functions $f : \mathbb{R} \to \mathbb{C}$ (or perhaps $f : \mathbb{C} \to \mathbb{C}$), with argument $Z$.
We can read off the extension of the inner product to states $|f(Z)\rangle$ from the last line in (3.21):
$$\langle f(Z) | g(Z)\rangle = \sum_{d=0}^{\infty} \frac{\lambda^d}{d!}\, f(d)^*\, g(d). \tag{3.22}$$
This is (up to the normalisation factor $e^\lambda$) the covariance of random variables $f(Z)$, $g(Z)$ where $Z$ is Poisson distributed. But the salient feature of (3.22) is that it depends only on the values of $f$ and $g$ evaluated at non-negative integers (also known as the set $\mathbb{N}$ of natural numbers). In particular, we find that the state $|f(Z)\rangle$ has zero norm whenever the function $f$ vanishes on $\mathbb{N}$:
$$f(d) = 0 \;\; \forall\, d \in \mathbb{N} \quad\Longrightarrow\quad \langle f(Z)|f(Z)\rangle = 0. \tag{3.23}$$
To form the Hilbert space $\mathcal{H}_{\rm BU}$, we must quotient by such null states as in (2.9). For example, since $\sin(\pi Z)$ vanishes on $\mathbb{N}$ we have the otherwise surprising relation
$$|\sin(\pi Z)\rangle = \sum_{n=0}^{\infty} \frac{(-1)^n \pi^{2n+1}}{(2n+1)!}\, |Z^{2n+1}\rangle = 0. \tag{3.24}$$
More generally, for any $f$ we have $|\sin(\pi Z) f(Z)\rangle = 0$, so in some sense the space of null states is the same size as the total space before the quotient. Similarly, the Hartle-Hawking state can be represented by the constant function $f(Z) = 1$, or more generally by any function that has $f(d) = 1$ for all $d \in \mathbb{N}$ (for example, $|\mathrm{HH}\rangle = |e^{2\pi i j Z}\rangle$ for any integer $j$). To emphasise the impact of the quotient by null states, note that by adding vectors of the form $|Z^n \sin(\pi Z)\rangle$ we can change any finite number of coefficients $c_n$ (for $n \neq 0$) in the expansion of the state $\sum_{n=0}^{\infty} c_n |Z^n\rangle$ at will. As a result, the only physical information in any finite collection of coefficients $c_n$ is the overlap with the $Z = 0$ eigenstate (given by $c_0$).
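The statement that the inner product only sees the values of a function on the non-negative integers can be illustrated numerically. The sketch below assumes the Poisson-weighted form $\langle f(Z)|g(Z)\rangle = \sum_d \frac{\lambda^d}{d!} f(d)^* g(d)$ implied by the moments of $Z$, truncated at a hypothetical cutoff `d_max`; the value of `lam` and the helper names are illustrative.

```python
import math
import numpy as np


def inner_product(f, g, lam, d_max=100):
    """Truncated sum over d of lam^d/d! * conj(f(d)) * g(d)."""
    d = np.arange(d_max + 1)
    # Work with log-weights to avoid overflow in lam**d and d!.
    log_w = d * math.log(lam) - np.array([math.lgamma(k + 1) for k in d])
    return np.sum(np.exp(log_w) * np.conj(f(d)) * g(d))


lam = 3.0  # illustrative value


def hartle_hawking(z):
    """Constant function 1, representing the Hartle-Hawking state."""
    return np.ones_like(z, dtype=float)


def sine(z):
    """sin(pi Z), which vanishes on every non-negative integer."""
    return np.sin(np.pi * z)


def phase(z):
    """e^{2 pi i Z}, which equals 1 on every integer."""
    return np.exp(2j * np.pi * z)


hh_norm = inner_product(hartle_hawking, hartle_hawking, lam)    # close to e^lam
null_norm = inner_product(sine, sine, lam)                      # zero up to rounding
phase_overlap = inner_product(hartle_hawking, phase, lam)       # also close to e^lam
```

Up to floating-point rounding, `null_norm` vanishes, while the constant function and the phase function are indistinguishable, in line with the quotient by null states.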
These considerations reveal an enormous degeneracy in how states of $\mathcal{H}_{\rm BU}$ are represented as sums of $|Z^n\rangle$. We regard this degeneracy as a gauge equivalence. As described in section 2.2, this gauge symmetry is a natural modification of diffeomorphism invariance associated with allowing topology change in the functional integral. But the enormous power of this seemingly natural modification comes as a surprise. This indicates that the corrections to diffeomorphism invariance are not generic, but are instead highly correlated. As a result, the corrections conspire to enhance the impact of the gauge symmetry, and thus to produce the degeneracy observed above. Such conspiracies call out for a more fundamental explanation, and we will see in sections 3.7 and 4 below that at least some of these conspiracies are in fact implied by reflection positivity of our path integral.
In parallel with the treatment in section 2.3, we can now discuss the α-states of our model. These are the eigenstates $|Z = d\rangle$ of $\widehat{Z}$, labelled by $d \in \mathbb{N}$, and they must form a basis for $\mathcal{H}_{\rm BU}$. When expressing such a state as a sum of the states $|Z^n\rangle$, we may choose coefficients defining the Taylor series of any analytic function taking a non-zero value at $Z = d$ but vanishing at other natural numbers, since multiplication by $Z$ acts as multiplication by the constant $d$ on such a function. One of the infinitely many ways to represent such eigenstates is then
$$|Z = d\rangle = (-1)^d \sqrt{\frac{d!}{\lambda^d}}\; \left|\frac{\sin(\pi Z)}{\pi (Z - d)}\right\rangle, \tag{3.26}$$
where the coefficient is chosen to enforce the normalisation
$$\langle Z = d \,|\, Z = d'\rangle = \delta_{dd'}.$$
Finally, we discuss the spacetime interpretation of our operator $\widehat{Z}$ and its eigenstates $|Z = d\rangle$. From (3.22), note that projecting the states $|f(Z)\rangle$ onto the (here, one-dimensional) subspace where $\widehat{Z}$ takes the value $d$ is equivalent to restricting the sum on the right-hand side of (3.22) to the given eigenvalue $d$, or equivalently to terms of order $\lambda^d$. But due to (3.9) (and the fact that the analogous equations are identical for any fixed number $n > 0$ of boundaries on the connected surface), these give precisely the contributions in (3.5) that arise from spacetimes with $d$ connected components. We thus find that working in the eigenspace with eigenvalue $d$ is equivalent to restricting the sum over amplitudes to terms where the universe has precisely $d$ connected components$^{11}$.
In other words, the operator Z counts the number of connected components of spacetime! This is quite surprising, since this is not a quantity we would naturally associate with a Cauchy slice if we were to attempt to quantise by gauge fixing diffeomorphisms (unlike the number of connected components of space, which is a natural observable when universes cannot split and join, but is not gauge invariant when they can).
The α-states are designed to make amplitudes factorise (2.23), and it is interesting to note how our model achieves this. To work in an α-state $|Z = d\rangle$, we can impose the nonlocal constraint that spacetime has exactly $d$ connected components. This does not exclude wormhole configurations connecting multiple boundaries, but provides additional correlations between disconnected configurations of boundaries. It thus achieves factorisation in a surprising way, which may be instructive for less simple models. Note that our choice of symmetry factors on spacetimes without boundary, which otherwise only acts to renormalise $\mathcal{Z}$, is crucial for this simple description of α-state correlation functions.
Since $Z$ takes values in $\mathbb{N}$, $\mathcal{H}_{BU}$ has a natural representation as a harmonic oscillator Hilbert space in which $Z$ acts as a number operator. We can define the annihilation operator $a$ as acting to shift functions of $Z$, which leads to the relations (3.28). In this description, the Hartle-Hawking state is a coherent state. The distribution of the associated ensemble then follows from the well-known fact that the number operator follows a Poisson distribution in a coherent state.
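The coherent-state statement can be checked in a truncated number basis. This is a minimal sketch using the textbook normalisation of the oscillator (amplitudes $c_n = e^{-|\alpha|^2/2} \alpha^n/\sqrt{n!}$ and $(a c)_n = \sqrt{n+1}\, c_{n+1}$), which is an assumption and need not match the normalisation conventions behind (3.28); the function names are hypothetical.

```python
import math

def coherent_amplitudes(alpha, n_max):
    """Number-basis amplitudes c_n = exp(-|alpha|^2 / 2) alpha^n / sqrt(n!) of a coherent state."""
    norm = math.exp(-abs(alpha) ** 2 / 2)
    return [norm * alpha**n / math.sqrt(math.factorial(n)) for n in range(n_max + 1)]

def annihilate(c):
    """Apply the annihilation operator: (a c)_n = sqrt(n + 1) c_{n+1}."""
    return [math.sqrt(n + 1) * c[n + 1] for n in range(len(c) - 1)]

def poisson_pmf(n, lam):
    """Normalised Poisson probability of the value n with mean lam."""
    return math.exp(-lam) * lam**n / math.factorial(n)

lam = 2.5
alpha = math.sqrt(lam)  # coherent-state parameter with |alpha|^2 = lam
c = coherent_amplitudes(alpha, 40)
ac = annihilate(c)

# Eigenvalue equation a|alpha> = alpha|alpha>, checked component by component.
max_eigen_error = max(abs(ac[n] - alpha * c[n]) for n in range(len(ac)))
# Number distribution |c_n|^2 compared with the Poisson probabilities.
max_poisson_error = max(abs(abs(c[n]) ** 2 - poisson_pmf(n, lam)) for n in range(len(c)))
```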

End-of-the-world branes
We now extend the model described above by introducing dynamical boundaries, which (following [49]) we call end-of-the-world (EOW) branes. We choose to include an arbitrary number $k$ of species of EOW brane, so each of these boundaries is labelled by an index $i \in \{1, 2, \ldots, k\}$. Equivalently, we can place a topological quantum mechanics on the EOW branes, with zero Hamiltonian and a $k$-dimensional Hilbert space, so that $i$ labels an orthonormal basis of states in that Hilbert space. Apart from the species label, the only local data on an EOW brane is an orientation compatible with the spacetime it bounds. Introducing the EOW branes has two effects. Firstly, they can appear as closed boundaries in the sum over topologies, but this is largely unimportant, only acting to change the value of λ so that it is no longer given by (3.9). More importantly, the EOW branes allow us to impose a new class of possible boundary conditions. Namely, we can specify that we have a boundary condition which is an oriented interval labelled at its endpoints by EOW brane species $i$ and $j$. Since the interval is oriented, we may refer to it as having a past endpoint that creates an EOW brane of type $i$ and a future endpoint that destroys an EOW brane of type $j$. We refer to both past and future labels as EOW brane sources. In a putative 0+1 dual, the condition that a boundary creates an EOW brane with label $i$ corresponds to the preparation of a certain 0+1 dual state $|\psi_i\rangle$. We denote a boundary interval between EOW branes $i$ and $j$ by $(\psi_j, \psi_i)$ since the bulk path integral with this boundary condition should compute the inner product between these states.
Figure 4: A spacetime contributing to an amplitude $\langle (\psi_j, \psi_i)(\psi_i, \psi_j) Z \rangle$. The solid red lines indicate asymptotically AdS boundaries, and the dashed green lines are EOW brane boundaries. The spacetime has two boundary components, each with the topology of a circle. One (solid red circle at bottom) is a single circular asymptotically AdS boundary (a $Z$-boundary). The other is formed by a pair of asymptotically AdS segments connected by a pair of EOW brane segments to form a topological circle.
Since the boundaries carry an orientation, the notation distinguishes bra-vectors from ket-vectors so that $(\psi_j, \psi_i)^\dagger = (\psi_i, \psi_j)$; in general, these are CPT conjugate boundary conditions. This coincides with the general notation introduced in section 2.2.
Including the $\psi_i$, the most general amplitude can now be written
$$\left\langle Z^m\, (\psi_{j_1}, \psi_{i_1}) \cdots (\psi_{j_n}, \psi_{i_n}) \right\rangle .$$
The associated boundary conditions for the path integral require $m$ circular boundaries without EOW brane sources and $n$ additional interval boundary segments labelled appropriately with EOW brane species. Since the EOW branes are dynamical, the path integral is then computed by summing over all oriented surfaces whose circular boundaries are of the following three types: 1) circular EOW brane boundaries, each labelled by an arbitrary species independent of all boundary conditions, 2) $m$ circular boundaries without EOW brane labels as dictated by the number of $Z$'s in the amplitude, and 3) additional circular boundaries formed by partitioning into subsets the oriented intervals $(\psi_j, \psi_i)$ dictated by the boundary conditions and, for each subset, forming a circle by connecting the $(\psi_j, \psi_i)$ segments using oriented EOW brane segments whose species labels match the source labels at both endpoints. See figure 4 for an example.
We now know the set of amplitudes to compute and the corresponding configurations over which we are to sum. It remains only to specify the measure on the configurations. As before, the Euler characteristic is the unique local action without introducing additional degrees of freedom. However, we will again include a parameter S ∂ associated with each circular boundary. We use the same S ∂ for every circular boundary, no matter how it is formed from asymptotic pieces and EOW branes. Again, we will see in section 3.7 that this can be obtained by introducing additional local degrees of freedom which reside on both asymptotic and EOW brane boundaries, and integrating them out. While this no longer corresponds to a simple scaling of our operators, we will nonetheless once again focus on the case S ∂ = S 0 , resulting in an action which counts only genus and not the number of boundary components, and comment on the extension to other values in section 3.7.
It remains to specify the symmetry factors that will be the analog of $\mu(M)$ in (3.1). In doing so, it is useful to note that, since all asymptotic boundaries are treated as distinguishable, they will not contribute to symmetry factors. The only indistinguishable boundaries are those formed by circles involving EOW branes alone. Furthermore, such circles are completely independent of the boundary conditions. They thus enter all of our sums in precisely the same way as the genus $g$. The analogue of (3.5) for our new model is then (3.32), where we sum over diffeomorphism classes of surface $M$ with the boundary conditions specified on the left hand side. The measure $\mu$ is analogous to (3.2) but includes additional factors associated with counting end-of-the-world branes using Bose statistics.
We may now proceed to evaluate the above amplitudes. As a first step, we again define λ as the sum over connected surfaces with no asymptotic boundaries in analogy with (3.9). However, this sum must now allow for the possibility of circular EOW brane boundaries, each with $k$ possible species labels. Since EOW brane boundaries can be specified in precisely the same way for each genus, this simply multiplies the result (3.9) by an overall factor counting the number of possible such labelled boundaries. For a fixed number $n$ of EOW brane boundaries, including symmetry factors we count $k^n/n!$ ways to label the boundaries with $k$ species. Summing this factor over $n$ shows the new factor to be $e^k$, and we obtain the new value of λ given in (3.33).
As before, we can now compute all amplitudes through a generating function, where we sum over all configurations, with any number of asymptotic boundaries, and fugacities $u$ and $t_{ij}$ (with $i, j = 1, \ldots, k$) for the $Z$ and $(\psi_j, \psi_i)$ boundaries respectively. As we explain below, this yields
$$\left\langle \exp\Big(u Z + \sum_{i,j=1}^{k} t_{ij}\, (\psi_j, \psi_i)\Big) \right\rangle = \exp\left[\frac{\lambda\, e^{u}}{\det(I - t)}\right], \qquad (3.34)$$
where $t$ is the $k \times k$ matrix with entries $t_{ij}$, and $I$ the $k \times k$ identity matrix. Once again, we compute this result by writing it as the exponential of a sum over connected spacetimes, each weighted by a factor of λ from summing over genus and closed EOW branes. The connected contribution is a sum over all possible boundaries we could insert on a given connected spacetime (excepting circular EOW brane boundaries, which have already been absorbed into λ). This sum is itself given as the exponential of a sum over distinct types of boundaries:
$$\exp\Big(u + \sum_{n=1}^{\infty} \frac{1}{n}\, \mathrm{Tr}(t^n)\Big) = \frac{e^{u}}{\det(I - t)} .$$
The $u$ accounts for insertions of circle boundaries $Z$ as before. The $n$th term in the sum comes from boundary components consisting of $n$ intervals corresponding to some $(\psi_j, \psi_i)$, alternating with $n$ EOW branes. Summing over species of EOW branes results in the matrix product and trace, and the factor of $1/n$ avoids overcounting equivalent configurations where the $n$ component intervals are cyclically permuted.
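Two of the counting steps in this paragraph are simple series identities and can be checked numerically: the sum of $k^n/n!$ over the number of closed EOW brane boundaries gives $e^k$, and the exponential of $u + \sum_n \mathrm{Tr}(t^n)/n$ equals $e^u/\det(I - t)$ when $t$ is small enough for the series to converge. The sketch below does this for a $2 \times 2$ fugacity matrix; the numerical values of $t$ and $u$ and the function names are arbitrary illustrative choices.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def connected_boundary_sum(u, t, n_terms=200):
    """exp(u + sum_{n>=1} Tr(t^n) / n), truncated at n_terms (converges for small t)."""
    power = [row[:] for row in t]
    total = 0.0
    for n in range(1, n_terms + 1):
        total += trace(power) / n
        power = matmul(power, t)
    return math.exp(u + total)

def closed_form(u, t):
    """e^u / det(I - t) for a 2 x 2 fugacity matrix t."""
    det = (1 - t[0][0]) * (1 - t[1][1]) - t[0][1] * t[1][0]
    return math.exp(u) / det

def species_factor(k, n_terms=60):
    """Sum over n of k^n / n!: labellings of n indistinguishable closed EOW brane boundaries."""
    return sum(k**n / math.factorial(n) for n in range(n_terms))

t = [[0.10, 0.05], [-0.03, 0.20]]
u = 0.3
series_value = connected_boundary_sum(u, t)
exact_value = closed_form(u, t)
species_error = abs(species_factor(3) - math.exp(3))
```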
For an alternative route to this result where various factors are more explicit, we can present (3.32) as a sum over ordered lists of connected manifolds. This is readily obtained from (3.7) by recognizing that the circular EOW brane boundaries enter every sum on the same footing with the genus $g$. The result is (3.36), where the factor $k^{I_i}/I_i!$ for each connected manifold counts the number of ways (including symmetry factors) to assign EOW brane labels to $I_i$ indistinguishable circular boundaries and the factor $D!/\prod_i D_i!$ again counts partitions of the $D$ distinguishable boundaries into (labelled) subsets of size $D_i$. Finally, the factor $C(D)$ represents the number of ways to form $D$ distinguishable boundaries from the specified boundary conditions (together with interpolating EOW brane segments).
In comparing with (3.36), the relation to the exponential of (3.34) is clear from the factor of $1/L!$ in (3.36), the inclusion of factors of (3.34), and the defining property of generating functions. By this last feature, we mean the fact that the definition of the generating functions (3.34) converts the factors $C(D)\, D!/\prod_i D_i!$ counting the number of ways to match distinguishable boundaries to boundary conditions into the above-described weighted sum over all possible boundary conditions for each connected component.
We now interpret the amplitudes as describing an ensemble, for which (3.34) is the (unnormalised) generating function for moments of random variables Z and (ψ j , ψ i ). Let us first set t = 0 in order to consider the marginal distribution of Z. We then recover the old result (3.14) without EOW branes, so Z is again Poisson distributed, though with a new value of λ given by (3.33).
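We can now characterise the distribution of $(\psi_j, \psi_i)$ by conditioning on $Z = d$ for each fixed $d \in \mathbb{N}$. To find the corresponding conditional generating functions, we Taylor expand the exponential in (3.34) and write each term as an average over the Poisson distribution of $Z$. The term of order $\lambda^d$ then gives
$$\left\langle \exp\Big(\sum_{i,j=1}^{k} t_{ij}\, (\psi_j, \psi_i)\Big) \right\rangle_{Z=d} = \det(I - t)^{-d} . \qquad (3.37)$$
The result is the generating function for a standard complex Wishart distribution [91] with $d$ degrees of freedom.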
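The conditioning step is a Taylor expansion and can be sketched numerically. The snippet assumes that the full generating function has the form $\exp[\lambda e^u / \det(I - t)]$ and that the conditional generating function at $Z = d$ is $\det(I - t)^{-d}$; the scalar `x` stands in for $1/\det(I - t)$, and all names are hypothetical.

```python
import math

def full_generating_function(lam, u, x):
    """exp(lam * e^u * x), where x stands for 1 / det(I - t)."""
    return math.exp(lam * math.exp(u) * x)

def poisson_mixture(lam, u, x, d_max=120):
    """sum_d (lam^d / d!) e^{u d} x^d: unnormalised Poisson weights times the conditional x^d."""
    return sum(lam**d / math.factorial(d) * math.exp(u * d) * x**d for d in range(d_max + 1))

lam, u, x = 2.0, 0.1, 1.3
direct = full_generating_function(lam, u, x)
mixture = poisson_mixture(lam, u, x)
relative_error = abs(direct - mixture) / direct
```

Setting `x = 1.0` (that is, $t = 0$) reduces both expressions to the marginal generating function $\exp(\lambda e^u)$ of $Z$ alone.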
To make this more transparent, and to simultaneously explain this distribution to the uninitiated reader, we can rewrite the generating function by introducing $kd$ 'auxiliary' complex variables $\psi^a_i$, arranged in a $d \times k$ matrix. The index $i = 1, \ldots, k$ labels the EOW brane states, and we will interpret $a = 1, \ldots, d$ as labels for an orthonormal basis of the boundary Hilbert space $\mathcal{H}_{CFT}$ (which is $d$-dimensional based on our interpretation (3.8) of $Z$). The $\psi^a_i$ variables will be interpreted as the components of the EOW brane states $\psi_i$ in this orthonormal basis.
In terms of the $\psi^a_i$ variables, our Wishart generating function (3.37) can now be written as a Gaussian integral, (3.38). In the 0+1 dual interpretation, this means that the wavefunction of each EOW brane state is selected independently and uniformly at random from the unit sphere of a $d$-dimensional Hilbert space $\mathcal{H}_{CFT}$, and then multiplied by a random normalization so that its squared norm is drawn from an appropriate $\chi^2$-distribution. In particular, the number of linearly independent states, given by the rank of the matrix of inner products, is bounded by $Z$: with probability one we have
$$\mathrm{rank}\,(\psi_j, \psi_i) = \min\{k, Z\} . \qquad (3.41)$$
This is another surprising and remarkable result from such a simple model, since in the semiclassical limit (without the exponentially small effects of spacetime wormholes) the $k$ EOW brane states appear to be orthogonal, and we can choose $k$ to be as large as we like. As discussed below in section 5, even if we include Euclidean wormholes there is an expansion in $e^{-S_0}$ which for a finite number of amplitudes at any finite order gives no obvious sign that apparently distinct EOW brane states must in fact be linearly dependent. Nonetheless, in the complete solution after summing all nonperturbative effects, we find that the number of linearly independent states is truncated. As in [49], and as discussed further in section 4, this is a version of the semiclassical Page curve [1].
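The rank statement is easy to see in a small simulation. The sketch below assumes the Gaussian description above: each of the $k$ EOW brane states is a vector of $d$ independent standard complex Gaussian components, and the matrix of inner products is their Gram matrix. The seed, the sample sizes, the tolerance and the helper names are illustrative choices.

```python
import random

def sample_states(k, d, rng):
    """k vectors with independent standard complex Gaussian components in d dimensions."""
    s = 2 ** -0.5
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(d)] for _ in range(k)]

def gram_matrix(states):
    """Matrix of inner products M[j][i] = sum_a conj(psi_j^a) psi_i^a."""
    return [[sum(x.conjugate() * y for x, y in zip(p_j, p_i)) for p_i in states]
            for p_j in states]

def rank(matrix, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for col in range(n_cols):
        pivot = max(range(r, n_rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            factor = m[i][col] / m[r][col]
            for j in range(col, n_cols):
                m[i][j] -= factor * m[r][j]
        r += 1
        if r == n_rows:
            break
    return r

rng = random.Random(0)
cases = [(3, 5), (5, 3), (4, 4), (6, 2)]  # pairs (k, d)
ranks = [rank(gram_matrix(sample_states(k, d, rng))) for k, d in cases]
```

For $k > d$ the Gram matrix is a $k \times k$ matrix built from only $d$ independent directions, so its rank cannot exceed $d$ however the states are sampled.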
At first sight, this appears to require an enormous conspiracy in the nonperturbative contributions, which might lead one to suspect that it is an artefact of studying particularly simple models. We will show below that this is not the case, since it follows from a more primitive principle, namely reflection positivity of the path integral. For this, we must study the Hilbert space interpretation of the model with EOW branes.

Baby universe Hilbert space with EOW branes
We now incorporate the EOW branes into the baby universe Hilbert space. This enlarges the space relative to that of section 3.3 because, along with circular closed universes, we also have $k^2$ new types of universe whose spatial slice is an interval bounded by EOW branes, say with labels $i$ and $j$ (where the orientation defines a preferred order). On the other hand, the above-mentioned conspiracies will also imply the existence of new null states.
It is most straightforward to construct $\mathcal{H}_{BU}$ from the α-states. These are eigenstates of the $Z$ operator as before, but now are simultaneously eigenstates of the $k^2$ operators $(\psi_j, \psi_i)$ as well; note that Hermitian conjugation acts on these operators by swapping $i, j$. We label the corresponding eigenvalues by $Z_\alpha$ and $(\psi_j, \psi_i)_\alpha$, so we have
$$\hat{Z}\, |\alpha\rangle = Z_\alpha\, |\alpha\rangle , \qquad \widehat{(\psi_j, \psi_i)}\, |\alpha\rangle = (\psi_j, \psi_i)_\alpha\, |\alpha\rangle .$$
The set of α-states is determined by the allowed sets of eigenvalues, which is constrained by (3.41).
As in section 3.3, the eigenvalues $Z_\alpha$ of $Z$ are given by the nonnegative integers $d$. Indeed, we can still define states $|Z = d\rangle$ by any of the means discussed in that section, for example by (3.25). However, they are now not full α-states, since they are eigenstates only of $Z$ and not of $(\psi_j, \psi_i)$. Instead they are the projections of the Hartle-Hawking state onto the corresponding eigenspace of $Z$. We can generate the rest of this eigenspace by acting with the operators $(\psi_j, \psi_i)$ on $|Z = d\rangle$.
In each such eigenspace, we can now diagonalise the operators $(\psi_j, \psi_i)$. Their simultaneous eigenvalues correspond to Hermitian $k \times k$ positive semidefinite matrices of rank at most $d$ (though any rank other than $\min(d, k)$ has probability zero in any normalizable state). Writing $M^d_k$ for the space of such matrices, the baby universe Hilbert space therefore decomposes as a direct sum:
$$\mathcal{H}_{BU} = \bigoplus_{d \in \mathbb{N}} \mathcal{H}_{Z=d} , \qquad \mathcal{H}_{Z=d} = L^2(M^d_k) . \qquad (3.43)$$
With this description, the α-states are delta function wavefunctions living in the subspaces $\mathcal{H}_{Z=d}$, supported on some particular matrix $(\psi_j, \psi_i)_\alpha \in M^d_k$. In particular, we write their inner product as
$$\langle \alpha' | \alpha \rangle = \delta_{\alpha\alpha'} , \qquad (3.44)$$
where $\delta_{\alpha\alpha'}$ is the product of a Kronecker delta $\delta_{Z_\alpha Z_{\alpha'}}$ for the eigenvalue of $Z$ with an appropriate Dirac delta function on $M^d_k$ associated with the choice of $L^2$ measure in (3.43).
Finally, the wavefunction of the Hartle-Hawking state in this description is given in terms of $f_{Z_\alpha}$, the probability density function of the complex Wishart distribution with $Z_\alpha$ degrees of freedom with respect to the measure on our $L^2$ space; this is the overlap $\langle \alpha | Z = Z_\alpha \rangle = f_{Z_\alpha}$. For $Z_\alpha \geq k$, this density is given explicitly in (3.65).

Hilbert spaces with boundaries
Our discussion of Hilbert spaces is not yet complete. In particular, other Hilbert spaces of interest arise when we insert complete sets of states on Cauchy slices that intersect 'asymptotically AdS' boundaries. Here there are two types of boundary, distinguished by their orientation; we call them 'left' and 'right' boundaries of space. In a 0+1 dual, the two types of boundaries would correspond to CPT conjugate theories. In our model, the most general slice Σ of the asymptotically AdS boundaries will consist of n L left boundaries and n R right boundaries. We thus denote the associated Hilbert space H Σ from section 2.4 as H n L ,n R . Reversing the orientation of all boundaries gives the dual (Hermitian conjugate) Hilbert space, so H * n L ,n R = H n R ,n L . The simplest of these is H BU = H 0,0 , which we have already discussed. We will be primarily interested in the one-sided Hilbert space H 0,1 (related to H 1,0 by duality) and the two-sided space H 1,1 .
We begin by considering the single boundary Hilbert space $\mathcal{H}_{0,1}$, which is spanned by states of the form $|\psi_i; Z^m (\psi_{j_1}, \psi_{i_1}) \cdots (\psi_{j_n}, \psi_{i_n})\rangle$. Recall that the operator $\psi_i$ maps $\mathcal{H}_{BU}$ to $\mathcal{H}_{0,1}$ (or more generally $\mathcal{H}_{n_L,n_R} \to \mathcal{H}_{n_L,n_R+1}$). All of the above states can be produced by acting with the operator $\psi_i$ on a state of closed baby universes. In particular, we can span $\mathcal{H}_{0,1}$ by acting with one of the $k$ operators $\psi_i$ (for $i = 1, \ldots, k$) on α-states of $\mathcal{H}_{BU}$. The inner product on such states is
$$\langle \psi_j; \alpha' | \psi_i; \alpha \rangle = \delta_{\alpha\alpha'}\, (\psi_j, \psi_i)_\alpha ,$$
so in particular, the different α-sectors are orthogonal, and $\mathcal{H}_{0,1}$ admits a direct sum decomposition $\mathcal{H}_{0,1} = \bigoplus_\alpha \mathcal{H}^\alpha_{0,1}$ (where this is to be understood in the appropriate sense given that some of the parameters defining α are continuous). The inner product on each sector $\mathcal{H}^\alpha_{0,1}$ is simply given by the matrix of eigenvalues $(\psi_j, \psi_i)_\alpha$. On sectors with $Z_\alpha < k$, this is degenerate, and the dimension of $\mathcal{H}^\alpha_{0,1}$ is the rank of this matrix rather than $k$.
Next, we look at the two-boundary sector $\mathcal{H}_{1,1}$. In the same way, this Hilbert space can be populated by acting with boundary creating operators on states of $\mathcal{H}_{BU}$, for example on α-states. We have the same direct sum structure as before, $\mathcal{H}_{1,1} = \bigoplus_\alpha \mathcal{H}^\alpha_{1,1}$. States within each $\mathcal{H}^\alpha_{1,1}$ can be created by acting with separate EOW brane states on left and right boundaries using $\psi_j^* \psi_i$. But we now have an additional possibility where we introduce a single asymptotic boundary that connects left and right. In a general theory, one might call this the cylinder boundary (with topology Σ times an interval), and one might think of it as obtained by cutting in half a partition function on $\Sigma \times S^1$. By acting on $|\mathrm{HH}\rangle$, it thus creates a state that one expects to interpret as a 'thermofield double' in some CFT dual. In our case the cylinder degenerates to a line segment (since Σ is a point), which we can think of as half of a $Z$ circle. We will refer to this as the cylinder boundary condition, to the associated operator as the cylinder operator, and to the state it creates from $|\mathrm{HH}\rangle$ as the cylinder state. Acting instead on an α-state, we obtain the factorised states $|\psi_j^*, \psi_i; \alpha\rangle$ together with the cylinder state in the sector α, and the inner products of these states are given by (3.50).
From the first of these, we see that for fixed α the states $|\psi_j^*, \psi_i; \alpha\rangle$ span a subspace isomorphic to the tensor product of two single boundary subspaces, so this tensor product embeds naturally in $\mathcal{H}^\alpha_{1,1}$; i.e., $\mathcal{H}^\alpha_{0,1} \otimes \mathcal{H}^\alpha_{1,0} \subseteq \mathcal{H}^\alpha_{1,1}$. This inclusion could be an exact equality, but only if the new cylinder state in the sector α can be built from a linear combination of factorised states $|\psi_j^*, \psi_i; \alpha\rangle$. This suggests that we look for a linear combination $|\Delta\rangle$ of the cylinder state and the factorised states with zero norm. Such a vector would be projected out of the Hilbert space $\mathcal{H}_{1,1}$, giving $|\Delta\rangle = 0$ and providing an identity relating the cylinder state to a superposition of one-sided states.
Before computing the norm of our ansatz $|\Delta\rangle$, we first change to a more convenient basis diagonalising the EOW brane inner product in the α-state in question (with eigenvalues $(\psi_j, \psi_i)_\alpha$). Specifically, we pick linear combinations $\phi_a$ of the $\psi_i$ boundary conditions for which $(\phi_b, \phi_a)_\alpha = \delta_{ab}$, with the index $a = 1, \ldots, r$ running up to the rank of the matrix of inner products. In this basis, we take our candidate null state $|\Delta\rangle$ to be the cylinder state in the sector α minus the superposition $\sum_{a,b=1}^{r} c_{ab}\, |\phi_b^*, \phi_a; \alpha\rangle$ of factorised states, and compute its norm:
$$\langle \Delta | \Delta \rangle = Z_\alpha - 2 \sum_{a=1}^{r} \mathrm{Re}\, c_{aa} + \sum_{a,b=1}^{r} |c_{ab}|^2 = Z_\alpha - r .$$
In the last equality we have chosen the coefficients $c_{ab} = \delta_{ab}$, as this minimizes $\langle \Delta | \Delta \rangle$. The above calculation teaches us two things. Firstly, for the norm to be nonnegative we have an inequality which applies in all α states:
$$\text{Reflection positivity} \implies Z_\alpha \geq \mathrm{rank}\,(\psi_j, \psi_i)_\alpha . \qquad (3.53)$$
This explains our empirical result (3.41) that the rank of the EOW brane inner product is bounded by $Z_\alpha$, in terms of reflection positivity of the path integral. The same argument can be used in much more general models, and we repeat it with the inclusion of a conserved energy in section 4, where we also connect it with the Page curve [1]. Secondly, we find that if the inequality (3.53) is saturated, we have $|\Delta\rangle = 0$, and hence an identity expressing the cylinder state in the sector α as the sum $\sum_{a=1}^{r} |\phi_a^*, \phi_a; \alpha\rangle$ of factorised states. Since the 'factorized states' $|\psi_j^*, \psi_i; \alpha\rangle$ then span the two-sided Hilbert space $\mathcal{H}^\alpha_{1,1}$, we also find an equivalence between Hilbert spaces,
$$\mathcal{H}^\alpha_{1,1} = \mathcal{H}^\alpha_{0,1} \otimes \mathcal{H}^\alpha_{1,0} .$$
This factorization holds in our model for sectors with $Z_\alpha \leq k$; i.e., when there are enough EOW branes to populate a one-sided Hilbert space of dimension $Z_\alpha$.
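The minimisation over the coefficients $c_{ab}$ can be made concrete. The sketch below assumes the following inner products in a fixed α-sector, in the orthonormal $\phi$ basis: the cylinder state has norm $Z_\alpha$, its overlap with the factorised state built from $\phi_b^*$ and $\phi_a$ is $\delta_{ab}$, and the factorised states are orthonormal. Under these assumptions the norm of the candidate null state is $Z_\alpha - 2\sum_a \mathrm{Re}\, c_{aa} + \sum_{a,b} |c_{ab}|^2$. The function names are hypothetical.

```python
def delta_norm_sq(z_alpha, c):
    """Z_alpha - 2 sum_a Re c_aa + sum_ab |c_ab|^2 for an r x r coefficient matrix c."""
    r = len(c)
    cross = sum(c[a][a].real for a in range(r))
    quad = sum(abs(c[a][b]) ** 2 for a in range(r) for b in range(r))
    return z_alpha - 2 * cross + quad

def identity(r):
    """Coefficient matrix c_ab = delta_ab, the choice that minimises the norm."""
    return [[complex(a == b) for b in range(r)] for a in range(r)]

saturated = delta_norm_sq(3, identity(3))  # Z_alpha = r: the candidate state is null
generic = delta_norm_sq(5, identity(3))    # Z_alpha > r: strictly positive norm
violating = delta_norm_sq(2, identity(3))  # Z_alpha < r would give a negative norm

c = identity(3)
c[0][1] = 0.3 + 0.4j                       # moving away from c_ab = delta_ab
perturbed = delta_norm_sq(5, c)            # exceeds the minimum by |0.3 + 0.4i|^2
```

Completing the square gives $Z_\alpha - r + \sum_{a,b} |c_{ab} - \delta_{ab}|^2$, which makes both the minimiser and the bound $Z_\alpha \geq r$ manifest.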
To emphasise the importance of α-states in this argument, we examine how it fails in a more general (normalised) state $|\Psi\rangle \in \mathcal{H}_{BU}$, such as the Hartle-Hawking state. Specifically, let us choose linear combinations $\phi_a$ of EOW brane states $\psi_i$ to diagonalise the expectation value of the inner product in the state $|\Psi\rangle$; i.e., we take $\langle \Psi | (\phi_b, \phi_a) | \Psi \rangle = \delta_{ab}$, where $a, b = 1, \ldots, r$, with $r = \mathrm{rank}\, \langle \Psi | (\psi_j, \psi_i) | \Psi \rangle$. If we now compute the norm of the analogous candidate state (3.57), we find in (3.58) an extra term, coming from the overlaps $\langle \phi_b^*, \phi_b; \Psi | \phi_a^*, \phi_a; \Psi \rangle$. Here we have defined the variance of boundary condition $X$ as the connected amplitude for $XX^\dagger$. There are $k^2$ such terms, so they are collectively important when $k$ is of order λ or larger. As a result, (3.58) gives no meaningful bound relating the rank of the inner product $(\psi_j, \psi_i)$ to the partition function $Z$. Note that this is not really an issue of fluctuations in the particular parameter $Z_\alpha$, as the same discussion applies to the states $|Z = d\rangle$, which fix the eigenvalue of $Z$ but not those of $(\psi_j, \psi_i)$.
Returning to the issue of reflection positivity, we should also discuss the Hilbert spaces $\mathcal{H}_{n_L,n_R}$ associated with arbitrary numbers of left and right boundaries. But in our model all possible boundary conditions creating such states can be formed by combining the cylinder boundary with the above $\psi_i$. In superselection sectors with $Z_\alpha \leq k$, the above result then implies $\mathcal{H}_{n_L,n_R} = \mathcal{H}_{1,0}^{\otimes n_L} \otimes \mathcal{H}_{0,1}^{\otimes n_R}$ and the inner product on $\mathcal{H}_{n_L,n_R}$ is positive definite. In superselection sectors with $Z_\alpha > k$ the higher Hilbert spaces are not tensor products of the lower Hilbert spaces. But much as above, considering states similar to (3.57) again shows the inner product to be positive for $Z_\alpha > k = r$. We thus see by direct calculation that our path integral satisfies reflection positivity.

The boundary parameter S ∂
We now discuss the parameter S ∂ , contributing an action proportional to the number of boundaries. First we describe how changing S ∂ from its preferred value S ∂ = S 0 alters the physics, and thus in particular explain why this value is preferred. We then discuss how we might naturally incorporate such a parameter in the model.
Let us first consider the model without EOW branes, discussed in sections 3.1, 3.2 and 3.3. There the only effect of $S_\partial$ is to rescale the quantities and operators associated with the $Z$ boundaries. We thus find an ensemble interpretation in which $Z$ is $e^{S_\partial - S_0}$ times a Poisson random variable, so that the α-states are characterised by $Z$ eigenvalues $Z_\alpha \in e^{S_\partial - S_0}\, \mathbb{N}$. From the gravitational perspective, there is nothing wrong with this model for any positive value of $S_\partial$. In particular, reflection positivity is preserved for all Hilbert spaces. Complex values are excluded by reflection positivity on $\mathcal{H}_{1,1}$, which is spanned by the mutually orthogonal cylinder states of the different α-sectors, each with norm $Z_\alpha$. From the boundary perspective, there is a good dual interpretation only when $e^{S_\partial - S_0}$ is a nonnegative integer, so that $Z_\alpha$ takes nonnegative integer values which can be interpreted as the dimension of a dual Hilbert space. Nothing from the bulk perspective appears to prefer such values, so our choice $S_\partial = S_0$ appears to be rather artificial.
This changes once we introduce the EOW brane states. The bulk then provides a principled reason to prefer particular values of $S_\partial$, as the inner product on EOW brane states will otherwise fail to be positive semidefinite. To see this, we focus on a sector of $\mathcal{H}_{BU}$ with fixed $\hat{Z}$ eigenvalue $Z = d$, in which our EOW brane amplitudes are given by the generating function (3.37), reproduced below with fugacities $t_{mn}$ rescaled by a factor of $i$ for later convenience and with the matrix of EOW brane inner products encoded in a $k \times k$ Hermitian matrix $M$, $M_{mn} = (\psi_n, \psi_m)$:
$$\chi_{d,k}(t) = \left\langle \exp\Big(i \sum_{m,n=1}^{k} t_{mn} M_{mn}\Big) \right\rangle_{Z=d} = \det(I - i t)^{-d} . \qquad (3.62)$$
For $d \in \mathbb{N}$, by introducing $dk$ auxiliary Gaussian variables we showed in (3.38) that this gives a probability distribution for $M$, and hence a reflection positive inner product on $\mathcal{H}_{BU}$. This argument does not apply for $d \notin \mathbb{N}$, so we must find a different way to determine whether we have a positive semidefinite inner product.
If $M$ is to be interpreted as a random variable selected from some probability distribution, (3.62) defines $\chi_{d,k}$ as the characteristic function of the distribution. This is the Fourier transform of the probability density function $p_{d,k}$, which is in general a distribution on the space of $k \times k$ Hermitian matrices. It thus determines our inner product, which acts on a space of functions $f, g$ of $k \times k$ Hermitian matrices $M$:
$$\langle g | f \rangle = \int dM\; p_{d,k}(M)\, \overline{g(M)}\, f(M) .$$
The question is then for which values of $d$ this inner product is positive semidefinite. A succinct summary answering this question is contained in [92], to which we refer the reader for the results we now use. For $d > k$, the inverse Fourier transform of $\chi_{d,k}$ is a continuous function of $M$, taking non-zero values only on positive-definite matrices:
$$p_{d,k}(M) = \mathcal{N}_{d,k}\, (\det M)^{d-k}\, e^{-\mathrm{Tr}\, M} \quad \text{for positive-definite } M , \qquad (3.65)$$
where $\mathcal{N}_{d,k}$ is a normalisation factor. This is manifestly nonnegative and so defines a probability distribution. This result extends to $d > k - 1$, where the probability density diverges at the edge where $M$ becomes degenerate, but is still integrable. This is easiest to see from the density in terms of the eigenvalues of $M$; fixing $k - 1$ positive eigenvalues and taking the last $\lambda \to 0$, the density goes as $\lambda^{d-k}$. The important result for us is that this range $d > k - 1$, along with the smaller nonnegative integer values of $d$ already covered by (3.38), turns out to exhaust the values of $d$ for which the inner product on $\mathcal{H}_{BU}$ is positive semidefinite:
$$d \in \mathbb{N} \cup (k - 1, \infty) . \qquad (3.66)$$
We can intuit this from (3.65) by analytic continuation of the density in $d$. As $d$ approaches $k - 1$, the density goes to zero for any fixed positive definite matrix from the zero in the normalisation factor, $\mathcal{N}_{d=k-1,k} = 0$, but the probability density piles up near $\det M = 0$ and we end up with a probability density supported on the submanifold of singular matrices with rank $k - 1$. However, if we try to go further to $k - 2 < d < k - 1$, the probability density becomes negative. Even for values of $d < k - 1$ at which the probability density appears to be positive, the density is not integrable near $\det M = 0$.
On the other hand, since $\chi_{d,k}$ is analytic (so its Fourier transform decays exponentially) and $\chi_{d,k}(t = 0) = 1$, the integral of the distribution $p_{d,k}$ over all $M$ is well-defined and equal to unity. The resolution is that $p_{d,k}$ becomes a singular distribution which must be defined by a principal value prescription, and which is not positive definite on the singular submanifold $\det M = 0$. As a result, the inner product on $\mathcal{H}_{BU}$ can be positive definite only when all sectors with $d \notin \mathbb{N}$ have $d \geq k - 1$. For a given $S_\partial$, this requirement is most stringent for the smallest non-zero eigenvalue of $Z_\alpha$, namely $d = e^{S_\partial - S_0}$. We thus find that reflection positivity can hold only when either $S_\partial - S_0$ is the logarithm of a positive integer, or $S_\partial > S_0 + \log(k - 1)$.
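A simple witness of the loss of positivity can be read off from the moments for $k = 2$. The sketch below assumes the real-fugacity generating function $\det(I - t)^{-d}$ continued to real $d$, and extracts $\langle \det M \rangle = \langle M_{11} M_{22} \rangle - \langle M_{12} M_{21} \rangle$ from mixed second derivatives at $t = 0$, estimated by central finite differences (the step size and function names are illustrative choices). Any probability distribution with this generating function would have to be supported on positive semidefinite matrices, where $\det M \geq 0$, so a negative value of $\langle \det M \rangle$ is incompatible with a genuine probability distribution. The test is only a partial diagnostic: it detects the range $0 < d < 1$ for $k = 2$, which is the range $k - 2 < d < k - 1$ discussed above.

```python
def generating_function(t11, t12, t21, t22, d):
    """det(I - t)^(-d) for a 2 x 2 fugacity matrix t, continued to real d."""
    return ((1 - t11) * (1 - t22) - t12 * t21) ** (-d)

def mixed_second_derivative(f, i, j, h=1e-3):
    """Central finite-difference estimate of d^2 f / (dx_i dx_j) at the origin."""
    def shifted(si, sj):
        x = [0.0, 0.0, 0.0, 0.0]
        x[i] += si * h
        x[j] += sj * h
        return f(*x)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)

def mean_det(d):
    """<M11 M22> - <M12 M21> read off from the generating function (k = 2)."""
    f = lambda t11, t12, t21, t22: generating_function(t11, t12, t21, t22, d)
    return mixed_second_derivative(f, 0, 3) - mixed_second_derivative(f, 1, 2)

mean_det_half = mean_det(0.5)   # close to d (d - 1) = -0.25: negative
mean_det_three = mean_det(3.0)  # close to d (d - 1) = 6: no obstruction from this test
mean_det_one = mean_det(1.0)    # close to 0: consistent with rank-one matrices at d = 1
```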
We can use the arguments of the last section to slightly strengthen our restrictions on $S_\partial$ by considering positivity in Hilbert spaces with boundaries, and in particular in $\mathcal{H}_{1,1}$. The discussion leading to (3.53) shows that positivity in $\mathcal{H}_{1,1}$ requires $\mathrm{rank}\, M \leq d$ for the matrix of inner products $M$ in each sector $Z = d$. This is violated by the distribution (3.65) in the range $k - 1 < d < k$, since $M$ has probability density supported on matrices with full rank, $\mathrm{rank}\, M = k$. This gives us our final result:
$$\text{Reflection positivity} \implies e^{S_\partial - S_0} \in \mathbb{N} \ \text{ or } \ S_\partial > S_0 + \log k . \qquad (3.67)$$
For any non-zero number of EOW brane species, we find that a non-zero value of $S_\partial$ is required; the absence of a boundary action $S_\partial = 0$ does not lead to a reflection positive theory. The most natural choice is the minimal value $S_\partial = S_0$, which is the definition of the theory we used throughout the rest of this section.
The failure of models with $S_\partial = 0$ motivates us to explain the physics that might lead to an action counting the number of boundary components $|\partial M|$. This is nontrivial, because $|\partial M|$ is not a local action. For example, if we take a cylinder (with two boundaries), we can slice it in two along its length, and glue together the two edges of each piece so that we form two separate cylinders. The resulting manifold has four boundaries, so $|\partial M|$ is not preserved by this cut and paste.
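The final condition can be written as a small predicate. This is only a sketch of the necessary condition stated in (3.67), taking $S_\partial - S_0$ and the number $k \geq 1$ of EOW brane species as inputs; the function name and the floating-point tolerance used to decide whether $e^{S_\partial - S_0}$ is an integer are illustrative assumptions.

```python
import math

def satisfies_bound(delta_s, k, tol=1e-9):
    """Necessary condition (3.67): exp(delta_s) is a positive integer, or delta_s > log k.

    delta_s stands for S_boundary - S_0, and k >= 1 is the number of EOW brane species.
    """
    d_min = math.exp(delta_s)  # smallest non-zero eigenvalue of Z
    is_positive_integer = abs(d_min - round(d_min)) < tol and round(d_min) >= 1
    return is_positive_integer or delta_s > math.log(k)

checks = {
    "minimal choice S_boundary = S_0": satisfies_bound(0.0, 3),
    "integer value 2 with k = 5": satisfies_bound(math.log(2), 5),
    "non-integer 2.5 below k = 3": satisfies_bound(math.log(2.5), 3),
    "non-integer 3.5 above k = 3": satisfies_bound(math.log(3.5), 3),
    "no boundary action, S_0 = 10": satisfies_bound(-10.0, 2),
}
```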
However, we can achieve the same effect with a local action by introducing a new degree of freedom on each boundary. This should propagate along both asymptotic and EOW brane boundaries. Note that we regard this as part of the bulk dynamics that happens to be localised at the boundary, and not part of the dual 'CFT' dynamics. Most simply, this can be a topological quantum mechanics with Hilbert space H ∂ . In that case, each boundary provides a factor of dim H ∂ , and we can regard −S ∂ |∂M | = − log dim H ∂ as a nonlocal effective action from integrating out this dynamics. This gives a local definition of our theory, but only if e S 0 is an integer. This is not entirely satisfactory: besides the somewhat artificial restriction on S 0 , it seems that this degree of freedom should allow for additional boundary conditions that project onto a particular state of this boundary quantum mechanics, in which case we are again left with the theory S ∂ = 0.
A slightly different possibility is that some local bulk dynamics gives rise to a path integral localised at the boundary, but one which cannot be described by any quantum mechanics. This seems like a strange situation at first sight, but we note that precisely this phenomenon occurs for JT gravity. In that theory, a local bulk theory gives rise to a degree of freedom associated with asymptotic boundaries, described by the Schwarzian path integral [93,94]. The Schwarzian alone is not a consistent quantum mechanics, since the path integral on the circle cannot be interpreted as $\mathrm{Tr}\, e^{-\beta H}$ for any Hamiltonian $H$ [85,95]. This possibility arises from a quotient by residual gauge symmetries acting nontrivially on the boundary (in that case, an SL(2, R)). Nonetheless, the gravitational theory (for example, the Lorentzian theory on a spacetime lying between two boundaries) has a good Hilbert space interpretation. While we do not have a concrete proposal to make at this time, we speculate that some analogous dynamics (or an appropriate accounting of residual gauge freedom) could naturally give rise to a theory of topology which includes a boundary effective action $S_\partial$. In particular, we hope that our model might be obtained as a limit of a theory with more dynamics, and that this construction might offer insight into this possibility.

Spacetime 'D-branes'
We conclude the discussion of the model with some interpretative remarks on some of the results in terms of 'spacetime D-branes,' which we call SD-branes below. An SD-brane means an object on which spacetime can end, and as such is seen from spacetime as D-branes are seen from the worldsheet in string theory. In particular, they are not localised in spacetime in any way. This will be similar in spirit to the discussion of D-branes and 'eigenbranes' in [62,82], though the framework of the Hilbert space of baby universes provides a new interpretation. We will focus on the model without EOW branes.
To study the theory in the presence of an SD-brane, we should introduce a new type of boundary of spacetime, interpreted as spacetime ending on the SD-brane. We will assign a free (possibly complex) parameter $g$ to these boundaries, interpreted as a coupling to the SD-brane. To compute an amplitude in the presence of an SD-brane, we should allow for any number (including zero) of these additional boundaries; i.e., the spacetime is allowed to end many times on the same SD-brane. But for the purposes of computing amplitudes, each SD-brane boundary acts much the same as a $Z$ boundary, so we can account for them by inserting factors of $gZ$. To avoid overcounting different spacetimes connecting to the SD-brane, we must divide by factorials of the number of boundaries, treating the new boundaries as indistinguishable and introducing further symmetry factors where appropriate. We thus have the following recipe for computing the amplitude in the presence of an SD-brane with coupling $g$:
$$\langle Z^n\, \text{SD-brane}_g \rangle = \sum_{m=0}^{\infty} \frac{g^m}{m!} \langle Z^{n+m} \rangle = \langle Z^n e^{g\hat{Z}} \rangle. \qquad (3.68)$$
As before, the notation on the left-hand side indicates the boundary conditions for the path integral. But from the right-hand side we learn that the insertion of an SD-brane is equivalent to inserting the operator $e^{g\hat{Z}}$. In other words, the SD-brane is not a new object at all! Instead, a state $|\text{SD-brane}_g\rangle$ containing an SD-brane was already present in $\mathcal{H}_{BU}$ as a coherent state $|e^{gZ}\rangle$ of baby universes. We may thus identify the corresponding boundary conditions:
$$|\text{SD-brane}_g\rangle = |e^{gZ}\rangle, \qquad (3.69)$$
$$\frac{\langle \text{SD-brane}_g | e^{u\hat{Z}} | \text{SD-brane}_g \rangle}{\langle \text{SD-brane}_g | \text{SD-brane}_g \rangle} = \exp\left[\lambda\, e^{2\,\mathrm{Re}\, g} \left(e^u - 1\right)\right]. \qquad (3.70)$$
We here used the result $\langle e^{uZ} \rangle = \exp(\lambda e^u)$ of (3.14) in the Hartle-Hawking state, with a shifted value of $u$ due to the presence of the SD-brane. The result (3.70) tells us that amplitudes in the presence of an SD-brane are the same as amplitudes in the Hartle-Hawking state, but with a different value of the coupling $\lambda$. In fact, we can move between any positive real values of $\lambda$ by adding an appropriate SD-brane. This is a familiar situation from worldsheet string theory, where different values of an apparently free parameter (e.g. the coupling of the string to the Euler characteristic) turn out to describe different states of the same theory (e.g. coherent states of the dilaton).
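The claim that an SD-brane only shifts $\lambda$ can be checked numerically. The sketch below assumes the amplitudes $\langle f(Z) \rangle = \sum_d \frac{\lambda^d}{d!} f(d)$ implied by $\langle e^{uZ} \rangle = \exp(\lambda e^u)$, and compares the normalised expectation of $e^{uZ}$ in the state $|e^{gZ}\rangle$ with the Hartle-Hawking value at coupling $\lambda e^{2\,\mathrm{Re}\, g}$. The function names and the sample values of $\lambda$, $u$ and $g$ are illustrative choices, not taken from the text.

```python
import math

def amplitude_exp(x, lam, d_max=400):
    """Unnormalised amplitude <e^{xZ}> = sum_d lam^d e^{x d} / d!  (should equal exp(lam * e^x))."""
    # Each term is evaluated through its logarithm so that large d cannot overflow.
    return sum(
        math.exp(d * (math.log(lam) + x) - math.lgamma(d + 1))
        for d in range(d_max)
    )

def sd_brane_expectation(u, g, lam):
    """Normalised <e^{uZ}> in the state |e^{gZ}> for a complex coupling g."""
    # The bra contributes e^{conj(g) Z} and the ket e^{g Z}; since Z is Hermitian
    # with real spectrum, the phases from Im g cancel and only 2 Re g survives.
    shift = 2.0 * g.real
    return amplitude_exp(u + shift, lam) / amplitude_exp(shift, lam)

def hartle_hawking_expectation(u, lam):
    """Normalised <e^{uZ}> in the Hartle-Hawking state with coupling lam."""
    return math.exp(lam * (math.exp(u) - 1.0))

# Illustrative (hypothetical) parameter values.
lam, u, g = 3.0, 0.2, complex(0.4, 1.3)
lhs = sd_brane_expectation(u, g, lam)
rhs = hartle_hawking_expectation(u, lam * math.exp(2.0 * g.real))
print(lhs, rhs)
```

Under these assumptions the two printed numbers should agree to floating-point accuracy, and changing the imaginary part of `g` leaves the result untouched.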
We can also make use of these SD-branes in yet one more way by considering the effect of the imaginary part of the coupling, $\theta = \mathrm{Im}\, g$. This has no effect on the amplitude (3.70), and to see its relevance we must allow for a different kind of SD-brane state in which $g$ is not fixed but is instead in a superposition of different values of $\theta$. First, we note that the representation of the SD-brane as $e^{gZ}$ and the integer spectrum of $Z$ imply that $\theta$ should be understood to be periodic with period $2\pi$. A natural basis of states superposing different values of $\theta$ is thus defined by the Fourier transformed states,
$$\int_0^{2\pi} \frac{d\theta}{2\pi}\, e^{-i d\theta}\, |\text{SD-brane}_{g=i\theta}\rangle, \qquad d \in \mathbb{Z}, \qquad (3.71)$$
where for simplicity we will now focus on the case $g = i\theta$, or $\mathrm{Re}\, g = 0$. In particular, the above basis diagonalizes the inner product: using (3.14), the overlap of the states labelled by $d$ and $d'$ is
$$\delta_{dd'}\, \frac{\lambda^d}{d!} \qquad (d \ge 0). \qquad (3.72)$$
For $d < 0$, this inner product vanishes, indicating that the resulting state is null.
To understand these states better, we may use the representation (3.69) of the SD-brane states as an exponential to write them as
$$\int_0^{2\pi} \frac{d\theta}{2\pi}\, e^{-i d\theta}\, |e^{i\theta Z}\rangle. \qquad (3.73)$$
But this is precisely the expression we gave in (3.25) for the α-state $|Z = d\rangle$! Furthermore, it is now clear that taking $\mathrm{Re}\, g \neq 0$ simply rescales the resulting state $|Z = d\rangle$. This means that we can give a somewhat geometric description of a given α-sector by including a particular (Fourier transformed) SD-brane. This SD-brane is not a new fundamental object, but is built from a coherent state of interacting baby universes. The SD-brane description of α-states is at first sight rather different from the alternative geometric interpretation given in section 3.3, where the $Z = d$ sector arose after constraining the path integral to spacetimes with $d$ connected components. However, we see that the two are equivalent in the end. We expect a similar equivalence to arise in the model with EOW branes, and correspondingly in the JT gravity contexts of [62,82].
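As a rough numerical illustration of how superposing SD-branes over $\theta$ projects onto a definite value of $Z$, the sketch below evaluates the overlap of two Fourier transformed states on a uniform grid in $\theta$. It assumes only $\langle e^{i\varphi Z} \rangle = \exp(\lambda e^{i\varphi})$, which follows from (3.14); the function name, the grid size and the value $\lambda = 2$ are hypothetical choices.

```python
import cmath
import math

def fourier_overlap(d, d_prime, lam, n_theta=64):
    """Overlap of the Fourier transformed SD-brane states labelled by d_prime (bra) and d (ket).

    The theta integrals are replaced by sums over n_theta equally spaced points.
    For a periodic integrand this is exact up to aliasing from modes shifted by
    multiples of n_theta, which are suppressed by lam^m / m! at large m.
    """
    total = 0j
    for j in range(n_theta):
        theta = 2.0 * math.pi * j / n_theta
        for k in range(n_theta):
            theta_p = 2.0 * math.pi * k / n_theta
            # <e^{-i theta' Z} e^{i theta Z}> = exp(lam * e^{i (theta - theta')})
            amp = cmath.exp(lam * cmath.exp(1j * (theta - theta_p)))
            total += cmath.exp(1j * (d_prime * theta_p - d * theta)) * amp
    return total / n_theta**2

lam = 2.0  # hypothetical value
print(abs(fourier_overlap(3, 3, lam)))    # compare with lam**3 / 3!
print(abs(fourier_overlap(3, 5, lam)))    # different labels: should be negligible
print(abs(fourier_overlap(-1, -1, lam)))  # negative label: should be negligible (null state)
```

The diagonal entries reproduce the Poisson weights $\lambda^d / d!$ for $d \ge 0$, while negative $d$ gives a vanishing norm up to discretisation error.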

Entropy bounds and the Page curve
A remarkable property of our models above was the strong role played by null states, and in particular the bound (3.41) on the rank of the inner product in any α-sector with $Z_\alpha = d$. In section 3.6 we showed this bound to follow from an abstract argument involving the cylinder state in the Hilbert space $\mathcal{H}_{1,1}$ associated with a pair of disconnected boundaries. As the reader may already realize, it is straightforward to generalize this argument so as to apply to very general reflection positive gravitational path integrals. More realistic models will likely have an infinite number of states in any $\mathcal{H}_\Sigma$, so to obtain a meaningful bound on the number of states we must impose a constraint. We will achieve this here by bounding the entropy of mixed states in $\mathcal{H}_\Sigma$ with a given expected energy $E$.

Entropy bounds
We now state this form of the argument using the more general notation from section 2. The ideas are closely related to those in [96]. As before, we work in some definite (but arbitrary) α-sector of the given theory and also choose a spatial boundary manifold $\Sigma$; i.e., we consider a particular Hilbert space $\mathcal{H}^\alpha_\Sigma$ from section 2.4. One property we require of our theory is that there is a notion of time evolution, here in Euclidean time. This means that the allowed boundary conditions include Euclidean 'cylindrical' boundary manifolds $C_\beta = \Sigma \times I_\beta$ for intervals $I_\beta$ of arbitrary length $\beta > 0$. According to the general principles of section 2, this boundary condition describes an operator on $\mathcal{H}^\alpha_\Sigma$ that we may call $e^{-\beta H}$ and for which
$$e^{-\beta_1 H}\, e^{-\beta_2 H} = e^{-(\beta_1 + \beta_2) H}. \qquad (4.1)$$
The final property we require of our theory is that the CPT conjugation acting on boundary conditions acts trivially on $I_\beta$. When $e^{-\beta H}$ is trace-class, this condition ensures that states $\phi_a \in \mathcal{H}^\alpha_\Sigma$ define a Hermitian matrix $(\phi_b, e^{-\beta H} \phi_a)_\alpha$ which can be diagonalized to yield discrete eigenvalues with finite degeneracy. We will take this to be the case for now and return later to the possibility that $e^{-\beta H}$ might fail to be trace-class.
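The above semi-group property of $e^{-\beta H}$ then implies that the eigenvectors can be chosen to be independent of $\beta$. Together with Hermiticity, it also implies the relation $e^{-\beta H} = \left(e^{-\beta H/2}\right)^\dagger e^{-\beta H/2}$, so that the eigenvalues must be non-negative. Henceforth, we thus take $\phi_a$ to denote such an orthonormal eigenbasis of $\mathcal{H}^\alpha_\Sigma$ with eigenvalues $e^{-\beta E_a}$. The key fact is then that the boundary conditions $e^{-\beta H}$ must also define an operator on the baby universe Hilbert space $\mathcal{H}_{BU}$, which we can use to define cylinder states by acting on the α-states $|\alpha\rangle \in \mathcal{H}_{BU}$ in direct analogy with section 3.6:
$$|\beta/2; \alpha\rangle := \widehat{C_{\beta/2}}\, |\alpha\rangle. \qquad (4.2)$$

We will be interested in forming mixed states on $\mathcal{H}^\alpha_\Sigma$, which can be thought of as elements of the Hilbert space $(\mathcal{H}^\alpha_\Sigma)^* \otimes \mathcal{H}^\alpha_\Sigma$, spanned by products $\phi_b^* \otimes \phi_a$ of our eigenstates $\phi_a \in \mathcal{H}^\alpha_\Sigma$ and their CPT conjugates. This space of density matrices is isometrically embedded via states $|\phi_b^*, \phi_a; \alpha\rangle$ into the 'two-sided Hilbert space' $\mathcal{H}^\alpha_{\Sigma^* \Sigma}$ associated with two copies of our spatial boundary $\Sigma$. Since these latter states were built from orthonormal eigenstates of $e^{-\beta H}$ on $\mathcal{H}^\alpha_\Sigma$, the overlaps are given by
$$\langle \phi_b^*, \phi_a; \alpha | \phi_d^*, \phi_c; \alpha \rangle = \delta_{ac}\, \delta_{bd}, \qquad (4.3)$$
$$\langle \phi_b^*, \phi_a; \alpha | \beta/2; \alpha \rangle = e^{-\beta E_a/2}\, \delta_{ab}. \qquad (4.4)$$
The last overlap we require is the norm of the state $|\beta/2; \alpha\rangle$. This involves gluing two cylinders of length $\beta/2$ to create boundary conditions with a circle of length $\beta$: we have $\widehat{C_{\beta/2}}^\dagger\, \widehat{C_{\beta/2}} = Z(\beta)$, where the operator $Z(\beta)$ acting on $\mathcal{H}_{BU}$ is defined by boundary conditions $\Sigma \times S^1_\beta$, with a thermal circle $S^1_\beta$ of length $\beta$. The norm of our cylinder state is then given by
$$\langle \beta/2; \alpha | \beta/2; \alpha \rangle = Z_\alpha(\beta), \qquad (4.5)$$
where $Z_\alpha(\beta)$ is the eigenvalue of $Z(\beta)$ in the α-state, $Z(\beta) |\alpha\rangle = Z_\alpha(\beta) |\alpha\rangle$. We now introduce a state
$$|\Delta\rangle := |\beta/2; \alpha\rangle - \sum_a e^{-\beta E_a/2}\, |\phi_a^*, \phi_a; \alpha\rangle \qquad (4.6)$$
and impose that its norm is nonnegative,
$$\langle \Delta | \Delta \rangle = Z_\alpha(\beta) - \sum_a e^{-\beta E_a} \ge 0. \qquad (4.7)$$
As in section 3.6, it is important that this computation was performed in a fixed α-sector. While we arrived at (4.6) under the assumption that $e^{-\beta H}$ is trace class, a similar argument using approximate eigenvectors would in any case bound the trace of $e^{-\beta H}$ by $Z_\alpha(\beta)$.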
Thus the case where $e^{-\beta H}$ fails to be trace class cannot occur and we can use (4.6) and (4.7) as written.

We can use the inequality (4.7) to make some more direct statements about the spectrum of states in $\mathcal{H}^\alpha_\Sigma$. Firstly, we can use it to bound the number $N(E)$ of orthogonal states with bounded energy $E_a \le E$. In a thermodynamic limit, we would usually expect this to be dominated by states with energy close to the maximum, so $N(E)$ is controlled by the density of states at energy $E$. To bound this quantity, note that $\sum_a e^{-\beta E_a} \ge N(E)\, e^{-\beta E}$, by dropping all states with $E_a > E$ in the sum and bounding each remaining term from below by $e^{-\beta E}$. From the result (4.7) we can then say that $N(E) \le e^{\beta E} Z_\alpha(\beta)$ for any $\beta$. The sharpest bound is obtained by minimising over all $\beta$, finding
$$N(E) \le e^{S_\alpha(E)}, \qquad (4.8)$$
$$S_\alpha(E) := \inf_\beta \left[ \beta E + \log Z_\alpha(\beta) \right]. \qquad (4.9)$$
This quantity is nothing but the Legendre transform of $\log Z_\alpha(\beta)$, which is the usual way of obtaining the canonical entropy from a partition function.

In a semiclassical theory, and in the overwhelming majority of α-states, we expect $S_\alpha(E)$ to be approximately the Bekenstein-Hawking entropy of an appropriate black hole. This is because $Z_\alpha(\beta)$ is defined by the Gibbons-Hawking path integral with periodic Euclidean boundary conditions [61], computed semiclassically by the on-shell action of a classical Euclidean black hole. The associated entropy $S_\alpha(E)$, defined as the Legendre transform of $\log Z_\alpha(\beta)$, is then given by the Bekenstein-Hawking formula. This remains accurate in typical α-states (in the measure of the Hartle-Hawking ensemble) as long as the variance of the $Z(\beta)$ operator is small. This is the case if connected wormhole configurations between two asymptotic $Z(\beta)$ boundaries are suppressed.

The same quantity $S_\alpha(E)$ appears in a stronger bound, constraining the von Neumann entropy $S(\rho)$ of any mixed state $\rho$ on $\mathcal{H}^\alpha_\Sigma$. This constraint depends on the energy expectation value $E = \mathrm{Tr}(\rho H)$, where from our earlier considerations we can define $H$ on $\mathcal{H}^\alpha_\Sigma$ by matrix elements $(\phi_b, H \phi_a)_\alpha = E_a \delta_{ab}$.
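Specifically, we prove that
$$S(\rho) \le S_\alpha(E) \quad \text{for } \rho \text{ any density matrix on } \mathcal{H}^\alpha_\Sigma \text{ with } \mathrm{Tr}(\rho H) = E. \qquad (4.10)$$
It suffices to show this for the density matrix that maximises $S(\rho)$ subject to the energy constraint. This is simply a Gibbs state,
$$\rho_{\text{Gibbs}} = \frac{e^{-\beta H}}{Z_{\text{Gibbs}}(\beta)}, \qquad Z_{\text{Gibbs}}(\beta) = \mathrm{Tr}(e^{-\beta H}) = \sum_a e^{-\beta E_a}, \qquad (4.11)$$
where we choose $\beta$ to fix the desired energy,
$$E = -\frac{d}{d\beta} \log Z_{\text{Gibbs}}(\beta). \qquad (4.12)$$
Note that $Z_{\text{Gibbs}}$ is precisely the quantity we bounded in (4.7), with the inequality $Z_{\text{Gibbs}}(\beta) \le Z_\alpha(\beta)$. Now, we can compute the von Neumann entropy of $\rho_{\text{Gibbs}}$ as the Legendre transform of $\log Z_{\text{Gibbs}}$:
$$S(\rho_{\text{Gibbs}}(E)) = \inf_\beta \left[ \beta E + \log Z_{\text{Gibbs}}(\beta) \right] \le S_\alpha(E). \qquad (4.13)$$
The inequality follows because $S_\alpha(E)$ is defined in (4.9) by the same minimisation as used here to obtain $S(\rho_{\text{Gibbs}}(E))$, after replacing $Z_{\text{Gibbs}}(\beta)$ by the larger function $Z_\alpha(\beta)$. This demonstrates the claimed entropy bound (4.10).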
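The chain of inequalities can be illustrated on a toy spectrum. The sketch below is only a sanity check under loudly stated assumptions: the boundary energies $E_a$ are a made-up list, and $Z_\alpha(\beta)$ is modelled (hypothetically) as the partition function of a larger list, which guarantees $Z_{\text{Gibbs}}(\beta) \le Z_\alpha(\beta)$. The Legendre transform is evaluated by a ternary search, which relies on the convexity of $\beta E + \log Z(\beta)$ in $\beta$.

```python
import math

def log_z(beta, energies):
    """log of the partition function sum_a exp(-beta * E_a)."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def legendre_entropy(energy, energies, beta_max=60.0, iterations=200):
    """Return (min over 0 <= beta <= beta_max of beta*E + log Z(beta), minimising beta)."""
    objective = lambda b: b * energy + log_z(b, energies)
    lo, hi = 0.0, beta_max
    for _ in range(iterations):
        # Ternary search: valid because the objective is convex in beta.
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2
        else:
            lo = m1
    beta = 0.5 * (lo + hi)
    return objective(beta), beta

def gibbs_entropy(beta, energies):
    """Von Neumann entropy -sum_a p_a log p_a of the Gibbs state at inverse temperature beta."""
    z = sum(math.exp(-beta * e) for e in energies)
    probs = [math.exp(-beta * e) / z for e in energies]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical toy data: not taken from any model in the text.
boundary_spectrum = [0.0, 1.0, 1.0, 2.0, 3.0]      # the E_a entering Z_Gibbs
alpha_spectrum = boundary_spectrum + [2.5, 4.0]    # stand-in for a larger Z_alpha
E = 1.0

s_gibbs, beta_star = legendre_entropy(E, boundary_spectrum)
s_alpha, _ = legendre_entropy(E, alpha_spectrum)
n_states = sum(1 for e in boundary_spectrum if e <= E)

print(math.log(n_states), gibbs_entropy(beta_star, boundary_spectrum), s_gibbs, s_alpha)
```

In this toy setting the printed values are ordered as $\log N(E) \le S(\rho_{\text{Gibbs}}) \le S_\alpha(E)$, and the directly computed Gibbs entropy agrees with its Legendre-transform value.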

Consequences and interpretations
Our results (4.8) and (4.10) show that, for theories defined by reflection positive path integrals, the density of states in any $\mathcal{H}^\alpha_\Sigma$ is bounded by the entropy $S_\alpha(E)$ from (4.9), which generically we expect to be given by the Bekenstein-Hawking entropy of an appropriate black hole.
We interpret this result as a semiclassical Page curve. The class of mixed states $\rho$ on $\mathcal{H}^\alpha_\Sigma$ that we can prepare by asymptotic sources includes old black holes. For example, we can create pure state black holes by collapse, couple to an auxiliary 'bath' system into which the Hawking radiation escapes, and trace out the bath. In the usual semiclassical description, it seems that this process can produce states of a given energy with arbitrarily large entropy. This entropy comes from the large interior which grows with time (in particular linearly with time along a 'nice slice' [97]), and which can be populated with a growing number of naively distinct possible low energy states. Our result shows that in an α-sector of a reflection positive path integral, nonperturbative effects giving exponentially small overlaps between these states must conspire to produce surprising linear relations between them. Such relations must occur after the Page time so that the entropy of the black hole is bounded by the Bekenstein-Hawking entropy, as required by (4.10). If this inequality is (approximately) saturated, the entropy of the black hole (i.e. of the density matrix on $\mathcal{H}^\alpha_\Sigma$) and of the radiation will follow the Page curve.
We expect that in contexts where the naive number of states in $\mathcal{H}_\Sigma$ can be made arbitrarily large, one will find that the bound $S(\rho) \le S_\alpha(E)$ of (4.10) can be saturated, as in our model with large $k$. In particular, we expect this to hold for the old black holes in the discussion above. This requires saturation of the inequality in (4.7) for all $\beta$, and so $|\Delta\rangle$ becomes a null state. Note that $|\Delta\rangle = 0$ is equivalent to the statement that $Z_\alpha(\beta)$ is equal to the actual thermal partition function $\mathrm{Tr}\, e^{-\beta H}$ on $\mathcal{H}_{\Sigma,\alpha}$. The result that the function $Z_\alpha(\beta)$ can be written as a thermal trace is a strong constraint on the eigenvalues of $Z(\beta)$, which should be viewed as generalizing the result $Z_\alpha \in \mathbb{N}$ from our models in section 3.
In the case of saturation, the statement that $|\Delta\rangle$ is null leads to a gauge equivalence
$$|\beta/2; \alpha\rangle \sim \sum_a e^{-\beta E_a/2}\, |\phi_a^*, \phi_a; \alpha\rangle. \qquad (4.14)$$
Following [11], the cylinder state is naturally associated with a two-sided black hole with an Einstein-Rosen bridge joining the two boundaries. We see the familiar equivalence between this and a superposition of product states emerging as an example of our gauge equivalence.
To connect further with our desire to understand black hole evaporation, we recall from section 2 that for any state $\rho$ prepared with asymptotic sources, the Rényi (and von Neumann) entropies $S_n(\rho)$ of $\rho$ again define operators on $\mathcal{H}_{BU}$ and take definite values in α-sectors. These entropies are then subject to versions of the above bound in each α-sector, and as a result so are their expectation values $\langle S_n(\rho) \rangle$ in the Hartle-Hawking state. In the context of black holes, any such entropies will then reproduce an appropriate Page curve defined by the Bekenstein-Hawking entropy. In particular, the final result will then be much as in the recent discussions of replica wormholes [48,49], which in our language are indeed the most natural saddle points contributing to the average entropy $\langle S_n(\rho) \rangle$.$^{14}$ The argument above shows that similar results will then hold when one computes the full result of any reflection positive gravitational path integral. Further, it tells us that these bounds hold not just on average, but in every α-state. This puts additional constraints on higher moments of the entropy.
It is, however, important to note the precise sense in which the entropies $S_n(\rho)$ have just been defined. From our perspective, the basic quantities are the eigenvalues $S_{n,\alpha}(\rho)$ of $S_n(\rho)$ in the various α-states. These are entropies defined separately on each $\mathcal{H}_{\Sigma,\alpha}$. Working in the Hartle-Hawking state then computes the average $\langle S_n(\rho) \rangle$ of such entropies over the α-states in the Hartle-Hawking ensemble. In particular, while this $\langle S_n(\rho) \rangle$ is computed by replica wormholes (to a first approximation), it manifestly does not include entanglement with the baby universe sector. This is a physically useful notion of entropy as the α-sectors are superselected from the standpoint of asymptotic observers, and entanglement with superselection sectors is in principle unobservable. Nevertheless, if one wishes to consider the entropy of some density matrix on the full space $\mathcal{H}_\Sigma$ (and not just on a single α-sector) defined by some fixed set of sources, entanglement with baby universes will generally lead to much larger entropies that exceed the Bekenstein-Hawking entropy and thus do not reproduce the expected Page curve. In this more mathematical sense, Hawking was correct [98] that information is lost in black hole evaporation. This is all in direct parallel with the conclusions of [55,56,57,99] from long ago. We will also discuss such connections in more detail in a forthcoming companion paper.
On third-quantized perturbation theory

Formulating a wormhole perturbation theory
We have been interested above in contexts where spacetime wormholes provide the dominant effects. But in most circumstances spacetime wormholes are not the minimum action configurations. In such cases, it is natural to expect other configurations to dominate, and for the contributions of spacetime wormholes to be nonperturbatively suppressed by a factor of the form $e^{-S}$, where $S$, of order $G_N^{-1}$, is the action of a wormhole. This holds for computing simple amplitudes in our models of section 3, for which higher topologies are suppressed by factors of the large parameter $\lambda$. In such cases it is natural to use an approximation where different universes evolve independently at leading order, and where spacetime wormholes are included as perturbative interactions between universes. The resulting perturbation theory is the 'third quantised' formalism of [57]. This approximation was also emphasized in other contemporaneous literature on wormholes [55,56,99,100].
We now describe an analogous approximation in our framework. This will serve both to complete the connection to the above literature and to provide a better understanding of the interesting circumstances described above in which this approximation fails. Nevertheless, this section represents a distraction from the main line of inquiry presented here, and some readers may wish to skip directly to section 6.
The early works [55,56,57] focused on studying microscopic wormholes, with the intent of describing physics on distance scales much larger than the wormhole's characteristic size (say, the Planck scale). The relevant scale is the 'width' of the wormhole mouth, thought of as some length scale associated with the cross-sectional area. In contrast, the separation between the spacetime regions associated with the wormhole mouths can be much larger. In that context, it is most natural to describe the physics using the operators of the low energy effective field theory, studying the effect of integrating out the microscopic wormholes. In contrast, we have wormhole mouths which, as with replica wormholes, are determined by a classical or quantum extremal surface. As a result, our wormholes will typically have a size similar to some black hole horizon, which may be both macroscopic and large. For us it will thus be more natural to discuss CFT boundary operators $Z[J]$ in place of the low energy bulk fields. This captures much of the same physics, and is analogous to using an S-matrix description in place of an effective Lagrangian.$^{15}$ The effects on the bulk effective field theory that arise from integrating out macroscopic wormholes will be explored in section 6.
Suppose then that, for some theory and amplitude of interest, the contribution from topologies connecting many boundaries is suppressed relative to disconnected topologies. This holds for familiar simple amplitudes in theories of interest, including the model discussed in section 3 as well as JT gravity, though it does not hold for all amplitudes, as we will discuss below. In a case where it does, at zeroth order of approximation we may neglect the connected contributions, obtaining an amplitude that approximately factorizes:
$$\mathfrak{Z}^{-1} \langle Z[J_1] \cdots Z[J_n] \rangle \approx \prod_{i=1}^{n} \mathfrak{Z}^{-1} \langle Z[J_i] \rangle. \qquad (5.1)$$
Identifying an asymptotically AdS boundary $Z[J]$ with an operator $\widehat{Z[J]}$ acting on the baby universe Hilbert space $\mathcal{H}_{BU}$ as in (2.11), at this leading order of approximation we can simply replace $\widehat{Z[J]}$ with a multiple of the identity operator, $\mathfrak{Z}^{-1} \langle Z[J] \rangle$. In particular, at this level of approximation, acting with any $\widehat{Z[J]}$ on $|\mathrm{HH}\rangle$ yields another state proportional to $|\mathrm{HH}\rangle$, so the baby universe Hilbert space defined in section 2.2 collapses to a single dimension.
To incorporate nontrivial wormhole physics, we must go to next order in the approximation, allowing contributions to the path integral from spacetimes that connect either one or two asymptotic boundaries, but not more. The contributions from spacetimes with one asymptotic boundary are then analogous to quantum field theory tadpoles, while the two boundary contributions are analogous to quantum field theory propagators. In particular, the Hilbert space $\mathcal{H}_{BU}$ becomes nontrivial, and takes the form of a Fock space. To see this, we define 'single universe states' by subtracting the 'tadpole contributions' from one boundary states; i.e., one need only introduce the modified (tilded) states
$$|\widetilde{Z[J]}\rangle := |Z[J]\rangle - \mathfrak{Z}^{-1} \langle Z[J] \rangle\, |\mathrm{HH}\rangle, \qquad (5.2)$$
and similarly for states involving larger numbers of universes. Loosely speaking, the spacetime created by the operator $Z[J]$ is most likely to immediately cap off, failing to create a closed universe. It is natural to subtract this possibility, in which case we are most likely to create a single closed universe which can propagate to another asymptotic boundary, justifying the name of 'single universe state'. Going to higher orders in the approximation would require additional subtractions for this description to remain valid.

Footnote 15: In the language of [101], the effects of higher topology we study are more closely analogous to 'wormhole interactions', as opposed to the 'instanton interactions' arising from nearby wormhole mouths of primary interest in that work.
The resulting Fock space structure can be used to define baby universe creation and annihilation operators $a^\dagger_J$, $a_{J^*}$, where in particular we have
$$a_{J^*} |\mathrm{HH}\rangle = 0, \qquad (5.3)$$
$$a^\dagger_J |\mathrm{HH}\rangle = |\widetilde{Z[J]}\rangle, \qquad (5.4)$$
and the algebra
$$[a_{J_1}, a_{J_2}] = 0, \qquad [a_{J_1^*}, a^\dagger_{J_2}] = \mathfrak{Z}^{-1} \langle \widetilde{Z[J_1]} | \widetilde{Z[J_2]} \rangle. \qquad (5.5)$$
One can then write corrections to the boundary operators $\widehat{Z[J]}$ in terms of baby universe creation and annihilation operators:
$$\widehat{Z[J]} = \mathfrak{Z}^{-1} \langle Z[J] \rangle + a^\dagger_J + a_J + \cdots, \qquad (5.6)$$
where $\cdots$ indicates higher order terms. One is then tempted to think of the states $|\widetilde{Z[J]}\rangle$ as (approximations to) states of a single closed baby universe, with a wavefunction for the metric and other fields determined by the source $J$ (and by varying $J$ we would expect to obtain an overcomplete set of coherent states). We can diagonalise the inner product on the single-universe Hilbert space, taking linear combinations of $Z[J]$ for different $J$ to give operators $Z_i$ which are chosen to be Hermitian and give amplitudes satisfying
$$\mathfrak{Z}^{-1} \langle Z_i Z_j \rangle - \mathfrak{Z}^{-2} \langle Z_i \rangle \langle Z_j \rangle = \delta_{ij}. \qquad (5.7)$$
We can then write $Z_i = \mathfrak{Z}^{-1} \langle Z_i \rangle + a^\dagger_i + a_i + \cdots$, with a more conventional oscillator algebra $[a_i, a^\dagger_j] = \delta_{ij}$ labelled by an orthonormal basis of single-universe states. Repeated applications of $a^\dagger_i$ are then said to create more universes, which can interact through topologies connecting three or more boundaries; these could be incorporated as higher order terms in (5.6). As long as these higher topologies are suppressed, we can thus construct a useful perturbation theory, where the inner product in (5.5) gives the 'free propagator' for single universe states, with higher topologies contributing vertices.
In particular, as noted above, based on the validity of the free approximation $\mathcal{H}_{BU}$ appears to be well described by a bosonic Fock space built on the single-universe Hilbert space. The Hartle-Hawking state provides the oscillator ground state, and multi-universe states are built by acting with $a^\dagger_i$ operators. Alternatively, in the free approximation we can think of $\mathcal{H}_{BU}$ in terms of the wavefunction $\Psi(Z_i)$, a function of the real variables $Z_i$. The operator $\hat{Z}_i$ then acts as a position operator (or a free field operator in QFT, where the label $i$ could be momentum, for example), multiplying by $Z_i$. As the oscillator vacuum, the Hartle-Hawking state has a Gaussian wavefunction for each $Z_i$, shifted to be centred on $\mathfrak{Z}^{-1} \langle Z_i \rangle$.
It is now tempting to use this free Fock space description to describe the spectrum of $Z[J]$, and hence the dual ensemble and the α-states. We are led to expect that the spectrum of $\{Z_i\}$ has continuous support on the whole of $\mathbb{R}$, independently for every $i$. In the resulting ensemble the $Z_i$, and hence the $Z[J]$, are normally distributed at the first nontrivial order described above, with covariance matrix given by the single-universe inner product$^{16}$ in (5.5). At each higher order, corrections from interactions would then appear to contribute only small non-Gaussian corrections to the measure, which is the conclusion reached in [101], for example. However, in this respect, we have been misled by the free 'approximation' (5.6). It turns out to be invalid because, while perturbation theory is accurate in many circumstances, it is not applicable in α-states, as we will argue in a moment. The true, nonperturbative spectrum is smaller because the Fock space description of the Hilbert space is invalid once we take into account the null states (2.9) by which we must quotient to obtain $\mathcal{H}_{BU}$. Due to the null states, the 'universe number' which grades the Fock space is not a diffeomorphism invariant observable.
Before we describe the breakdown of third-quantised perturbation theory, we clarify that it is not necessarily signalled by the dominance of spacetime wormhole effects. It may happen that the most important contribution to an amplitude comes from a nontrivial topology, but higher topologies remain negligible. This occurs prominently in two recent examples. The first is the spectral form factor $\langle Z(\beta + it)\, Z(\beta - it) \rangle$ of JT gravity [62,86,102], for which the contribution from the disconnected topology decays in time, while the connected topology gives a contribution that is exponentially suppressed but growing. Eventually, the connected topology dominates, giving the 'ramp'. A second example is the $n$th Rényi entropy of an evaporating black hole after the Page time, which can be described as a sum of $n$-boundary amplitudes; the dominant configuration is a 'replica wormhole', a spacetime which connects the $n$ boundaries [48,49]. However, higher topologies continue to be suppressed in such cases, and a similar perturbation theory remains valid; it simply happens to be dominated by $n$-universe vertices, and so requires their inclusion.$^{17}$

Instead, we are interested in cases when the third quantised perturbation theory fails entirely, and many topologies must be considered at once. For example, this occurs when we compute amplitudes with a parametrically large number of boundary components, giving very large moments of $Z[J]$. Equivalently, we can describe these amplitudes as the overlaps of states with very large universe occupation number.$^{18}$ While any particular process of splitting and joining universes is suppressed, the total amplitude of such interactions is enhanced by combinatorial factors counting the number of processes with many possible universes (or joining many possible boundaries). This allows higher topologies to become important.
Crucially, this breakdown of perturbation theory applies to α-states and so is vitally important for understanding the spectrum of Z[J]. The approximation of weakly interacting baby universes is thus not a reliable guide to the details of the spectrum. In the free theory, the α-states are like position eigenstates in the harmonic oscillator. They thus have infinite expectation value for the number operator. As we reduce the uncertainty in the α parameters and create a baby universe wavefunction with a more narrow spread, the mean universe occupation number increases, and eventually becomes exponentially large. At that point, the above approximation is not self-consistent for studying such states.
In retrospect, it should not be surprising that perturbation theory is of limited use for determining the spectrum of observables. As a simple example of similar behavior, if we perturb around the minimum of a potential in quantum mechanics, we cannot at any finite order tell whether the configuration space is compact, and hence whether the momentum should be quantised.$^{19}$ The truncation of the spectrum of $Z[J]$ is invisible at any finite order in the third-quantised perturbation theory. Thus in that description it could be seen only via some nonperturbative effect, or in an exact solution if one turns out to be available. Our models of section 3 provide a simple example of the latter. Recall that, in terms of the usual bulk perturbation theory in $G_N$, the spacetime wormholes describing third-quantised interactions are already nonperturbative, so the relevant expansion parameter is of the form $e^{-S}$ for an action $S$ of order $G_N^{-1}$. From this point of view, the compression of the Hilbert space is then a doubly nonperturbative effect, contributing to simple amplitudes as $\exp(-c\, e^{S})$ for some (possibly imaginary) constant $c$.

Footnote 17: This perturbation theory is also useful for discussing the average entanglement spectrum close to the Page time [49], though it requires summation of a class of 'tree-level' diagrams involving vertices of all valences.

Footnote 18: This notion is well-defined only in the third quantised perturbation theory, but can nonetheless be used to diagnose whether that perturbation theory is self-consistent.

Footnote 19: We mentioned above the natural third quantization interpretation of $Z[J]$ as a position-like operator, but we could equally well have interpreted it as an analogue of free particle momentum.

Perturbation theory in the topological model
To give some insight into the validity of third quantised perturbation theory, we discuss its applicability in the context of the model of section 3. We will restrict our considerations to the model without EOW branes.
The small parameter that suppresses topology is $e^{-S_0}$, with $S_0$ multiplying the Euler characteristic. It is natural to organise the third quantised perturbation theory as an expansion in that parameter, with higher genus topologies appearing as loops. However, the details of such an expansion (particularly accounting for diffeomorphisms of connected surfaces) are not necessary for the point we wish to illustrate. To simplify the discussion, we thus instead assume that the full connected correlators (and thus any sums over connected surfaces with given boundaries) have already been computed exactly. These are all given by the same number $\lambda$, so our perturbation theory will be an expansion in inverse powers of $\lambda$. As noted in section 3, this expansion is organised by counting the number of connected components of spacetime.
Let us begin by noting a precise sense in which the free Gaussian approximation is appropriate at large $\lambda$. This follows from first observing that a sum of $N$ independent Poisson-distributed variables with parameter $\lambda/N$ is again Poisson distributed, with parameter $\lambda$. Taking $\lambda$ and $N$ large with fixed ratio then implies that we can apply the central limit theorem to the Poisson distribution as $\lambda \to \infty$. Specifically, we may define
$$X := \frac{Z - \lambda}{\sqrt{2\lambda}}, \qquad (5.8)$$
which has mean zero and variance $\frac{1}{2}$. This $X$ is just a new encoding of the boundary condition $Z$, with the shift by $\lambda$ acting to subtract the 'tadpole' and set $\langle X \rangle = 0$, and with an additional rescaling to fix the variance $\mathfrak{Z}^{-1} \langle X^2 \rangle = \frac{1}{2}$. The central limit theorem then implies that as $\lambda \to \infty$ the distribution of $X$ converges to a normal (Gaussian) distribution. In particular, at large $\lambda$ any amplitudes $\langle f(X) \rangle$ for bounded continuous functions $f$ (fixed independently of $\lambda$) approach those computed by integrating against a Gaussian. These are the vacuum amplitudes of a harmonic oscillator, with wavefunction $\propto e^{-x^2/2}$, so this defines the 'free' Gaussian approximation mentioned above.
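The approach to the Gaussian can be made concrete by computing low moments of $X = (Z - \lambda)/\sqrt{2\lambda}$ directly from the Poisson distribution. The sketch below (function name and the sample values of $\lambda$ are illustrative) sums the Poisson weights explicitly; for a Gaussian of variance $\frac{1}{2}$ the third moment vanishes and the fourth moment is $\frac{3}{4}$, while the standard Poisson central moments give exactly $\frac{1}{2\sqrt{2\lambda}}$ and $\frac{3}{4} + \frac{1}{4\lambda}$.

```python
import math

def moment_of_x(power, lam):
    """E[X^power] for X = (Z - lam)/sqrt(2 lam) with Z Poisson distributed with mean lam."""
    # Truncate the sum far in the upper tail, where the Poisson weight is negligible.
    d_max = int(lam + 40.0 * math.sqrt(lam) + 50)
    total = 0.0
    for d in range(d_max):
        # log of the Poisson probability e^{-lam} lam^d / d!
        log_p = -lam + d * math.log(lam) - math.lgamma(d + 1)
        x = (d - lam) / math.sqrt(2.0 * lam)
        total += math.exp(log_p) * x**power
    return total

for lam in (10.0, 100.0, 400.0):
    print(lam, moment_of_x(2, lam), moment_of_x(3, lam), moment_of_x(4, lam))
```

The variance stays at $\frac{1}{2}$ for every $\lambda$, while the non-Gaussian parts of the third and fourth moments decay like $\lambda^{-1/2}$ and $\lambda^{-1}$ respectively.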
We will return to the discussion of this wavefunction later. Before doing so, we use the large $\lambda$ expansion to study the moments $\mathfrak{Z}^{-1} \langle Z^n \rangle = B_n(\lambda)$ and note both when and how that expansion fails as we also take $n$ to be large. For fixed $n$, the leading order contribution at large $\lambda$ comes from completely disconnected spacetimes, giving $B_n(\lambda) \sim \lambda^n$. At the next order, we have spacetimes with $n - 1$ connected components, which requires one 'cylinder', a component joining two boundaries.$^{20}$ There are $\binom{n}{2} = \frac{n(n-1)}{2}$ choices of which boundaries to join, so we have
$$B_n(\lambda) = \lambda^n + \frac{n(n-1)}{2} \lambda^{n-1} + \cdots, \qquad \lambda \to \infty, \text{ fixed } n. \qquad (5.9)$$
At the next order, we have spacetimes with $n - 2$ components, which means either two cylinders, or a 'pair of pants' connecting a trio of boundaries to the same component of spacetime. We can continue in this way to any desired order $\lambda^{n-k}$ in the expansion by accounting for possible topologies with $n - k$ connected components.

Now, let us consider what happens when $n$ also becomes large. The first sign of trouble occurs when $n$ is of order $\sqrt{\lambda}$, when the second term in the above expansion is no longer smaller than the first. There are roughly $n^2/2$ ways to choose pairs of boundaries to join by a cylinder (neglecting the correction from choosing the same boundary twice), which is sufficiently large to overcome the suppression by $\lambda$. But this does not apply only for a single cylinder; terms with any number of cylinder components again contribute at the same (leading) order. In some sense our free approximation has failed.
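The counting by connected components can be cross-checked with exact integer arithmetic. The sketch below assumes the standard identification of the Poisson moments with the Touchard polynomials, $B_n(\lambda) = \sum_k S(n,k) \lambda^k$, where the Stirling number of the second kind $S(n,k)$ counts partitions of the $n$ boundaries into $k$ connected components. The helper names are illustrative.

```python
def stirling2_row(n):
    """Return [S(n, 0), ..., S(n, n)], the Stirling numbers of the second kind."""
    row = [1]  # S(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            # Boundary m either joins one of k existing components or starts a new one.
            new[k] = k * (row[k] if k < len(row) else 0) + row[k - 1]
        row = new
    return row

def touchard(n, lam):
    """B_n(lam) = sum_k S(n, k) lam^k, organised by the number k of connected components."""
    return sum(s * lam**k for k, s in enumerate(stirling2_row(n)))

def cylinder_to_disconnected_ratio(n, lam):
    """Ratio of the one-cylinder term to the fully disconnected term lam^n."""
    return n * (n - 1) / (2.0 * lam)

print(stirling2_row(6))                           # the entry at k = 5 is n(n-1)/2 = 15
print(touchard(3, 2))                             # lam^3 + 3 lam^2 + lam at lam = 2
print(cylinder_to_disconnected_ratio(100, 1e4))   # about 0.5 once n is of order sqrt(lam)
```

The coefficient of $\lambda^{n-1}$ is $S(n, n-1) = \binom{n}{2}$, reproducing the one-cylinder term, and the final ratio shows that this term stops being small once $n \sim \sqrt{\lambda}$.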
However, it turns out that the large $\lambda$ expansion remains useful because we can explicitly account for the sum over configurations with $k$ cylinder components. For $n$ of order $\sqrt{\lambda}$, we find
$$B_n(\lambda) \sim \sum_{k=0}^{\infty} \frac{1}{k!} \left( \frac{n^2}{2} \right)^k \lambda^{n-k} = \lambda^n\, e^{\frac{n^2}{2\lambda}}. \qquad (5.10)$$
In this regime, we can now systematically correct (5.10) in powers of $\lambda^{-1}$ as before. Such corrections can account for including higher topologies with more boundaries as well as compensating for the overcounting of cylinder configurations. From (5.10), we see that $\langle Z^n \rangle$ is dominated by contributions with roughly $\frac{n^2}{\lambda}$ cylinder components. This can be much greater than one and the analysis will remain applicable, though it should certainly remain much less than $n$, so we must have $n \ll \lambda$.
If this is the case, the correction from the cylinders is small in the sense that it is subleading to the $\lambda^n$ term when expressed as an expansion of $\log B_n(\lambda)$. Taking $n$ larger still, (5.10) remains accurate until $n$ is of order $\lambda^{2/3}$. At that point we find significant corrections from including any number of connected components having three boundaries each ('pairs of pants'), and also from certain aspects of the overcounting of configurations of multiple cylinders. In the latter context, the relevant configurations are those in which two cylinders end on the same boundary. We previously included these configurations for simplicity (and to obtain a definite power of $\lambda$), but since they are not allowed we must now compensate by subtracting off their contributions. Together, these two effects multiply (5.10) by an extra factor of $e^{-\frac{n^3}{3\lambda^2}}$. This pattern continues, with similar $e^{\# \frac{n^k}{\lambda^{k-1}}}$ corrections appearing whenever $n$ becomes of order $\lambda^{1 - \frac{1}{k}}$ for $k = 2, 3, 4, \ldots$. As discussed in appendix A.2, this structure is also apparent from a direct asymptotic expansion of $B_n(\lambda)$.
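The size of these corrections can be probed numerically. The sketch below evaluates $\log B_n(\lambda)$ as the logarithm of the $n$th Poisson moment, using a log-sum-exp to avoid overflow, and compares it with $n \log \lambda + \frac{n^2}{2\lambda}$ with and without the $-\frac{n^3}{3\lambda^2}$ term. The function name, the truncation of the sum, and the values $\lambda = 10^4$, $n = 464 \approx \lambda^{2/3}$ are illustrative assumptions.

```python
import math

def log_moment(n, lam):
    """log B_n(lam), with B_n(lam) = E[Z^n] for Z Poisson distributed with mean lam."""
    # Truncation chosen well beyond the dominant region of the sum (an assumption
    # that is adequate for n much smaller than lam).
    d_max = int(3 * lam + 10 * n)
    logs = [
        -lam + d * math.log(lam) - math.lgamma(d + 1) + n * math.log(d)
        for d in range(1, d_max)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

lam, n = 1.0e4, 464
exact = log_moment(n, lam)
with_cylinders = n * math.log(lam) + n**2 / (2.0 * lam)
with_cubic_term = with_cylinders - n**3 / (3.0 * lam**2)
print(exact - with_cylinders, exact - with_cubic_term)
```

With these values the cylinder-only estimate is off by roughly a third in the exponent, and including the cubic term reduces the discrepancy to the level of the remaining corrections of order $n/\lambda$ and $n^4/\lambda^3$.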
In summary, in the regime $\lambda \gg n$ the large $\lambda$ expansion remains a tractable way to compute the moments $\langle Z^n\rangle$ and is organized by types of contributing geometries. However, once $n$ is of order $\lambda$, this perturbation theory breaks down catastrophically, since there is no longer any suppression of connected topologies with many boundaries. This is the regime in which the novel effects of null states and gauge invariance become relevant, truncating the spectrum of $Z$ and making its discreteness apparent.
To explain this last statement in more detail, we first describe the state $|Z^n\rangle$ in the free approximation. We begin by translating to the harmonic oscillator position variable $X$ introduced in (5.8), writing $Z^n = \lambda^n\left(1+\sqrt{\tfrac{2}{\lambda}}\,X\right)^n$. Expanding $\log Z^n$ at large $\lambda$ (but any fixed $n$), this gives $\log Z^n = n\log\lambda + \sqrt{\tfrac{2}{\lambda}}\,nX + O(n\lambda^{-1})$. We may thus approximate $Z^n \sim \lambda^n \exp\left(\sqrt{\tfrac{2}{\lambda}}\,nX\right)$. For sufficiently small $n$ that the free approximation is applicable, we therefore have an approximate equivalence between the following states: Here the final equality uses (5.8), and the middle state lives in the harmonic oscillator Hilbert space of the free approximation. In particular, $|0\rangle$ is the (normalized) oscillator vacuum with wavefunction $\psi(X)\propto e^{-\frac{X^2}{2}}$. After applying the exponential operator, the resulting wavefunction is a shifted Gaussian, which is a coherent state of the harmonic oscillator with average occupation number (here, 'universe number') $\frac{n^2}{\lambda}$. From the above analysis, it follows that the free approximation is valid for universe numbers $N \ll \lambda$.
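The quoted universe number can be checked by direct quadrature. The sketch below assumes the relation $Z=\lambda\left(1+\sqrt{2/\lambda}\,X\right)$ (our reading of the expression above), so that the state is $\psi(X)\propto e^{cX-X^2/2}$ with $c=\sqrt{2/\lambda}\,n$, and it takes the occupation number to be $N=\frac{1}{2}(X^2+P^2-1)$ with $P=-i\,\partial_X$. The function name and grid parameters are hypothetical.

```python
import math


def universe_number(n, lam, half_width=12.0, steps=20001):
    """Mean occupation number of psi(X) ~ exp(c X - X^2 / 2), c = sqrt(2/lam) * n."""
    c = math.sqrt(2.0 / lam) * n
    h = 2 * half_width / (steps - 1)
    norm = x2 = p2 = 0.0
    for i in range(steps):
        x = c - half_width + i * h
        # |psi|^2 up to an X-independent constant; subtracting c^2 avoids overflow.
        w = math.exp(2 * c * x - x * x - c * c)
        norm += w
        x2 += w * x * x
        p2 += w * (c - x) ** 2  # |psi'|^2 = (c - X)^2 |psi|^2
    return 0.5 * (x2 / norm + p2 / norm - 1.0)
```

Under these assumptions the result should agree with $n^2/\lambda$, for instance `universe_number(100, 1.0e4)` should be close to 1.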
Now, a wavefunction of width $\Delta X$ in the $X$ variable has an occupation number that scales as $N \sim (\Delta X)^{-2}$ as the width goes to zero, where the leading contribution comes from writing the occupation number in terms of the harmonic oscillator Hamiltonian and focusing on the kinetic term. In terms of the width $\Delta Z$ in $Z$, this is $N \sim \lambda(\Delta Z)^{-2}$. But resolving the natural integer discreteness in the spectrum of $Z$ requires $\Delta Z \sim 1$, and hence $N$ of order $\lambda$. As a result, and as one might expect, the discreteness of the $Z$ spectrum is associated with the complete breakdown of third quantised perturbation theory.
We can also see directly that this regime is connected with the appearance of null states, and thus the appearance of new gauge equivalences. Perhaps the simplest equivalence is that between the Hartle-Hawking state and the exponential $e^{2\pi i Z}$. Note that any state $e^{\alpha Z}$ is described in the free approximation by a coherent state with average occupation number $N \sim |\alpha|^2\lambda$. But for $\alpha$ of order one (for example, for $\alpha = 2\pi i$) this is of order $\lambda$ and the free approximation fails.
All these phenomena occur when the state of baby universes has unsuppressed interactions with a given boundary. Roughly speaking, if we have a state of $\mathcal{H}_{\rm BU}$ containing $N$ closed universes and introduce a new boundary, the new boundary will connect to any given universe with amplitude $\lambda^{-1}$. Hence it will connect to some universe with amplitude $N/\lambda$. This effect becomes of leading order at $N$ of order $\lambda$, when the free description breaks down. We emphasise that this heuristic is appropriate for $N \ll \lambda$ when the free approximation can be used, but that $N$ itself becomes ill-defined once it becomes of order $\lambda$. At that point, null states appear and, furthermore, the null states are not preserved by any notion of universe number operator $\hat{N}$.

Discussion
As with many works motivated by the black hole information problem, various readers may wish to focus on either the technical aspects of the above results or, alternatively, on their further significance for quantum gravity. For this reason, we separate our discussion below into more technical remarks in section 6.1 and a broader consideration of implications in section 6.2.

Summary and future directions
We have seen that combining features of AdS asymptotics with the basic perspective of Coleman [55] and of Giddings and Strominger [56,57] from the late 1980's leads to a sharp structure in which states in a 'baby universe Hilbert space' $\mathcal{H}_{\rm BU}$ control an ensemble of results for quantities $Z[J]$ computed at asymptotically AdS boundaries.
This version of the argument uses only manifest properties of the path integral and makes no further assumptions about locality.
Nevertheless, the final result is much the same as in [55,56]. In particular, the full bulk theory naturally includes both $\mathcal{H}_{\rm BU}$ and what one may call asymptotically AdS states, and there is a sense in which the two sectors interact. However, the theory has superselection sectors for the algebra of operators on the asymptotically AdS states, so that an observer with no access to $\mathcal{H}_{\rm BU}$ naturally experiences an ensemble. The superselection sectors are associated with a complete orthonormal basis $\{|\alpha\rangle\}$ of $\mathcal{H}_{\rm BU}$ in which the $Z[J]$ take definite values and exhibit factorization. Thus for a given state $|\Psi\rangle \in \mathcal{H}_{\rm BU}$, the probability of outcome $Z_\alpha[J]$ is $p_\alpha = |\langle\Psi|\alpha\rangle|^2$. Furthermore, all properties of the full spectrum of superselection sectors can at least in principle be computed from correlators in the Hartle-Hawking no-boundary state $|\mathrm{HH}\rangle \in \mathcal{H}_{\rm BU}$.
We then explored this construction in detail in simple topological models inspired by Jackiw-Teitelboim gravity with and without end-of-the-world branes (EOW branes, see e.g. [49,103]), and perhaps also with an extra boundary degree of freedom. Without EOW branes, there is a single asymptotically AdS boundary condition $Z$, for which the associated operator $\hat{Z}$ is naturally interpreted as the dimension of the CFT Hilbert space. This operator is also present in the model with EOW branes. Interestingly, the models predict this operator to have a quantized spectrum with eigenvalues $Z_\alpha \in e^{S_\partial - S_0}\,\mathbb{N}$, where $S_\partial$ is a parameter associated with the extra boundary degree of freedom. The potential eigenstates associated with other potential eigenvalues turn out to be null states. Perhaps even more intriguingly, unless $S_\partial$ is taken to be larger than $S_0 + \log k$, the models with EOW branes are reflection positive only when all $Z_\alpha$ are nonnegative integers, and thus only when $e^{S_\partial - S_0} \in \mathbb{N}$. The particular ensemble defined by the Hartle-Hawking no-boundary state gives a Poisson distribution for the $Z_\alpha$.
Models with EOW branes have additional boundary conditions $(\psi_j, \psi_i)$ for $i, j = 1, \ldots, k$. The $(\psi_j, \psi_i)$ are naturally interpreted as the matrix of inner products between EOW brane states in a dual boundary quantum mechanics. For given (integer) $Z_\alpha$, the eigenvalues of $(\psi_j, \psi_i)$ take the form $\sum_a \bar{\psi}^a_j \psi^a_i$ for some rectangular matrix $\psi^a_i$ of size $k \times Z_\alpha$. As a result, the rank of any $(\psi_j, \psi_i)_\alpha$ cannot exceed either $k$ or $Z_\alpha$. The ensemble defined by the Hartle-Hawking no-boundary state arises from choosing independent complex Gaussian random entries for each of the $\psi^a_i$. For $k \gg Z_\alpha$, this structure $(\psi_j, \psi_i)_\alpha = \sum_a \bar{\psi}^a_j \psi^a_i$ requires a sizeable compression of the naive CFT Hilbert space (which would have had dimension $k$). In particular, any list of more than $Z_\alpha$ states in the CFT Hilbert space turns out to be linearly dependent due to the presence of null states. We also argued that a similar constraint on the number of linearly independent states must arise in any theory where the gravitational path integral defines a positive semi-definite physical inner product. Our general argument is closely related to ideas in [96], and various related suggestions can be found in e.g. [104-108]. But the result is deeply related to recent successes [42,43,48,49] in reproducing various forms of the Page curve associated with the black hole information problem. With hindsight one can say that it was implicit in all of these works, and in fact moderately explicit in [49]. But here we see that it is an exact statement at finite $Z$ in every possible baby universe state.
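The rank bound is easy to see in a small numerical experiment. The following sketch is illustrative only: it draws a matrix $\psi^a_i$ with independent complex Gaussian entries, forms the matrix of inner products $\sum_a \bar{\psi}^a_j\psi^a_i$, and estimates its rank by Gaussian elimination with a tolerance. The function names, the unit variance of the entries, and the tolerance are assumptions, not taken from the model.

```python
import random


def gram_matrix(k, z, rng):
    """Inner-product matrix G[j][i] = sum_a conj(psi[a][j]) * psi[a][i]."""
    # psi[a][i] with a = 0..z-1 (boundary index) and i = 0..k-1 (brane state)
    psi = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(k)]
           for _ in range(z)]
    return [[sum(psi[a][j].conjugate() * psi[a][i] for a in range(z))
             for i in range(k)] for j in range(k)]


def numerical_rank(matrix, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank
```

For $k=6$ and $Z_\alpha=3$ the rank comes out as 3, so three linear combinations of the six a priori independent brane states are null; for $k=3$ and $Z_\alpha=6$ the rank is 3 and there are no null states.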
Indeed, in order to explain the Rényi computations of [49] for typical members of the Hartle-Hawking ensemble some version of this compression must occur whenever the number of a priori independent states inside a quantum extremal surface exceeds the generalized entropy defined by the region outside. And due to a maximin argument [42,43], one expects this to occur whenever the number of a priori independent quantum states that can exist inside a given bulk domain of dependence with fixed exterior geometry exceeds the area of the codimension-2 surface where the past and future boundaries of this domain of dependence intersect; see also [109] for more on quantum maximin surfaces.
In the context of black hole evaporation, for general baby universe states $|\Psi\rangle$ this picture gives a sense in which interactions with baby universes formally lead to loss of information during the evaporation of black holes. But as described previously in [55-57,99], since the $\alpha$-states define superselection sectors for asymptotic observers, any given asymptotic observer can find no operational signs of this information loss. In particular, while the observer may not be able to predict the exact outcome of an experiment involving black holes, they may simply consider the experiment to be a partial measurement of the previously unknown (but in this interpretation unique) value of $\alpha$ describing the universe in which they live. To the extent that $\alpha$ has been measured, no further information is then lost.
At the technical level there remain many interesting generalizations to explore in the future. For example, even in the models discussed here, it would be useful to understand if one can formulate the Hilbert spaces $\mathcal{H}_{\rm BU}$ using slices at 'finite time', or in other words without reference to asymptotic boundaries. Moving beyond the current model, one would like to add topological matter, and also to explore a similarly topological version of the de Sitter models of [110] and [49]. Work along these lines is in progress and we hope to report soon. In the longer term, it is also clearly of interest to study more realistic models.

Transcending the ensemble: implications and interpretations for each α-sector
We now turn to more speculative comments concerning the implications of our results above.
A key lesson from this work appears to be that, at least in sufficiently simple models, gravitational path integrals by themselves succeed in describing a great deal of microscopic information. In particular, in our models the bulk path integral leads to a definite construction of the possible boundary theories (defined by simultaneous eigenvalues $Z_\alpha[J]$) and also of the ensemble defined by the Hartle-Hawking state. However, this was possible only due to the exact solubility of the model, and in particular the convergence of the sum over topologies. In more realistic models, we will surely not be so fortunate.
Even in the simple case of JT gravity and its cousins [49,62,87], the gravitational path integral fails to converge. Though the model is sufficiently simple that the path integral for any given topology is exactly computable, the sum over topologies is an asymptotic series with zero radius of convergence in the expansion parameter $e^{-S_0}$. While there is an extremely natural completion of the model defined by a dual double-scaled matrix integral, it remains unclear whether the gravitational path integral uniquely selects this completion, or how it is realised in the bulk. This completion is associated with nonperturbative effects in the sum over topologies, which are doubly nonperturbative in $G_N$. The same doubly nonperturbative scale was associated with truncation of the baby universe Hilbert space in our model, suggesting a tantalising connection to explore in more generality.
If we apply the ideas of this paper to more conventional 'top-down' examples of AdS/CFT duality, such as type IIB supergravity (or string theory) with $\mathrm{AdS}_5 \times S^5$ boundary conditions, there are several possible outcomes. The first possibility, suggested by our simple model and JT gravity, is that a nonperturbatively complete bulk theory defines a large Hilbert space $\mathcal{H}_{\rm BU}$ of baby universes. The eigenstates $|\alpha\rangle$ would then be associated with a menagerie of dual CFTs, and the Hartle-Hawking state again defines an ensemble of them. However, this is in tension with the established statement of the duality, which uniquely selects $\mathcal{N} = 4$ Yang-Mills theory as a CFT dual. 21 A nontrivial ensemble would require surprising new families of maximally supersymmetric CFTs; in particular, since $\mathcal{N} = 4$ Yang-Mills is the unique such theory at weak coupling, these new CFTs must be strongly coupled throughout their moduli space.
21 Recall that a given $\alpha$-state determines partition functions for all possible boundary conditions on the bulk fields. These boundary conditions include specifications of the flux on $S^5$ and the asymptotic dilaton, associated with the rank $N$ of the dual $U(N)$ gauge group and the 't Hooft coupling $\lambda$ respectively. An $\alpha$-state would specify a family of theories labelled by these parameters.
Perhaps the more likely scenario is that $\mathcal{N} = 4$ Yang-Mills is the unique dual and there is no ensemble. The baby universe Hilbert space interpretation is that $\mathcal{H}_{\rm BU}$ is one-dimensional, so the Hartle-Hawking state is the unique state of closed universes. The nonperturbative diffeomorphism invariance that produced null states is then required to act in the most emphatic possible fashion, rendering every possible state gauge equivalent. This unique state must then also be an $\alpha$-state, and must exhibit factorization despite the existence of spacetime wormholes. Nevertheless, in analogy with typical $\alpha$-states in our model, it remains possible that simple spacetime wormhole configurations still give excellent approximations to certain amplitudes. Of course, in analogy with highly atypical $\alpha$-states in our model, it is also possible that simple spacetime wormhole configurations always receive large corrections.
An intermediate position is that the bulk theory leads to an ensemble interpretation in an asymptotic (say, large N ) expansion, but there is a unique theory at any finite N . This is consistent with the observation [111] that essentially any effective field theory in AdS solves the bootstrap order by order in large N perturbation theory. We can thus emulate a consistent CFT in a large N expansion, which nevertheless need not exist at any given finite N .
In any case, the suggestion is that the gravitational path integral should contain the full physics in each consistent α-sector. And since the baby universe state in such sectors does not change, there is no room in a given sector for information loss. As a result, the gravitational path integral should teach us how each consistent α-sector transfers information to the outgoing Hawking radiation.
With this in mind, we recall that a key feature of the discussion in [55-57] was the idea that one could integrate out the spacetime wormholes and describe their effects in terms of a modified effective action in which the detailed couplings were controlled by the $\alpha$-states. In other words, the original theory with specified couplings and spacetime wormholes was equivalent (from the asymptotic point of view) to a theory with an ensemble of bulk couplings but where spacetime wormholes were forbidden. The same construction will apply in our context, but with one important distinction. Namely, [55-57] focussed on wormholes with Planck-sized cross-sections under the assumption that microscopic wormholes would dominate in any physical process. But the mouths of the replica wormholes in [48,49] are determined by the location of a quantum extremal surface. As a result, they approximately coincide with the relevant black hole horizons and thus are macroscopic in size. Integrating out such wormholes thus induces an ensemble of highly non-local couplings in the effective action. Indeed, the couplings naturally mediate transitions in which any given interior configuration specifying the geometry and matter fields arbitrarily far inside the black hole can be replaced by any other, no matter how deep the black hole's throat may have become. At least for replica numbers $n$ near 1, the action for a replica wormhole whose mouth has area $A$ is of order $\frac{A}{4G}$ [112], so the amplitude for such processes should be exponentially small in this quantity. However, in an old black hole the large number of internal states can lead to a large effect as seen directly above and in [49] (and as foreshadowed in [113-115]).
The exact location and nature of the above non-local interactions is clearly of some interest. In particular, while quantum extremal surfaces may appear outside the black hole's event horizon [45], for black holes evaporating into a vacuum they should always lie inside [42,43]. Were all of the physics determined by replica wormholes confined far enough inside the horizon, there would be no possibility of affecting the exterior, and in particular no way it could purify the emitted Hawking radiation. However, any separation of the QES from the horizon arises from time dependence, which is typically associated with quantum effects. The backreaction of such effects on the spacetime is then suppressed by a power of G. As a result, the QES tends to be adiabatically close to any horizon, and thus separated by an amount only of order G. In addition, since the QES is determined by balancing the quantum effect of evaporating against a classical effect, the saddle-point is somewhat broad. A rough estimate of the width of the saddle-point suggests that the typical fluctuations of the area are also of order G. 22 This places the QES outside the horizon with order one amplitude. The associated non-local interactions will then naturally transfer information from the deep black hole interior into the outgoing Hawking radiation in much the form suggested in [22,30].
However, for a full understanding of the physics associated with such interactions it appears one must take into account the corrections they imply for the theory's physical inner product. As described in section 4, such corrections are associated with extending the familiar diffeomorphism invariance of gravitational systems to a more general slicing invariance of the path integral with topology change. Extending this to arbitrary Euclidean time evolution, even involving processes that change the topology of the slice used to define the quantum state, implies spacetimes of different topologies to be gauge related. In other words, this is a restatement of the old maxim that for gravitational systems time evolution is a gauge symmetry unless it involves evolution along an asymptotic boundary. This then directly implies that the path integral computes the gauge invariant physical inner product as one would expect from general arguments [69,76-78] (though admittedly those arguments are most direct in contexts where it is not obvious that topology change should be included).
22 For example, we can perform the path integral over replicated geometries and matter, while leaving unfixed the location of the QES where branching between replicas occurs. This leaves a final integral over the QES location to compute, with integrand roughly $e^{-S_{\rm gen}}$ for $n$ close to 1, where $S_{\rm gen}$ is the generalised entropy of the QES and we integrate over its location. The integral over the area of the QES (fixing ingoing time, for example) is then $\int dA\, e^{-S_{\rm gen}(A)}$, with $S_{\rm gen}(A) \sim \frac{A}{4G} + \#\log(A_0 - A)$ [42,43], where $A_0$ is the area of the (stretched) horizon. At the saddle point, where $A_0 - A$ is of order $G$, we have $S''_{\rm gen}(A)$ of order $G^{-2}$, leading to a width $\Delta A$ of order $G$.
As a result, one may think of the induced nonlocal interactions as modifying the gravitational constraints; i.e., with new terms in the Wheeler-DeWitt equation. The interesting feature, however, is that these modifications are highly non-generic. In the regime that in our models corresponds to $k \gg Z_\alpha$, there are a large number of strongly correlated small corrections, where the correlations conspire to give a large number of null states; i.e., they make the physical inner product highly degenerate so that a priori independent states are in fact linearly dependent in the physical Hilbert space, and so that the dimension of the physical Hilbert space is bounded by $Z_\alpha$. Furthermore, following ideas related to [96], we argued in section 4 that null states must enforce a similar bound in a general reflection positive gravitational path integral.
It is this bound that leads to the Page curve, and which thus determines the rate at which the above interactions transfer information out of the black hole. As a result, while the above non-local interactions are intimately tied to this change in the inner product, it is natural to think of the former as secondary and the latter as primary. In particular, it is in terms of the inner product that (for reflection positive path integrals) we find a clean statement of the correlations and conspiracies inherent in the details of the induced interactions; see again section 4.
We believe the explicit demonstration of such a large number of null states to be a lesson of fundamental importance. It implies that, due to the above mentioned conspiracies, the gauge symmetry of gravitational systems is much larger and more powerful than had been previously established. The idea that bounds on entropy might be related to such a gauge symmetry dates back at least to the early 1990's, when such suggestions arose in discussions of black hole complementarity proposals (see e.g. comments in [104]) and cosmological analogues in de Sitter space. It is also much like the truncation of the bulk Hilbert space implicit in random tensor network models [116,117] in which the disorder is implemented by inserting randomly chosen projections into the bulk. However, we now see this to be a direct result of the gravitational path integral.
The physics of this enlarged gravitational gauge invariance remains to be understood in detail, especially in the context of more realistic models. Nevertheless, the argument of section 4 indicates that the long discussed relation [11,96,118,119] between two-sided bulk black holes and bulk thermofield double states (4.15) should be understood as an example of this gauge equivalence. In particular, we now see that the so-called "superselection sectors" of [120], which were argued there to be physically distinct, are in fact gauge equivalent. 23
We now speculate further on the implications of this enhanced gauge invariance for issues involving black hole information and the connection to other works. It seems clear that in sufficiently old black holes (where the number of a priori independent internal states is sufficiently large), this gauge invariance implies that vast numbers of a priori independent states must in fact be regarded as physically equivalent. Furthermore, at least in our model, this happens in an essentially random way that does not respect any additional structure 24 . Extrapolating this result to more complicated models suggests that one will find many states which a priori seem to have very different physics (and in particular in which infalling observers have vastly different experiences) but which are nevertheless gauge equivalent. For example, just as there can be gauge equivalence between Alice meeting Bob and Alice finding only empty space, there is no reason for the physical inner product to respect Alice's notion of particle number (as distinguished, say, from total charges coupled to a gauge field), or even her notion of particle number in a given mode. As a result, even for pure state black holes, the experience of observers inside the black hole may fundamentally fail to be well-defined as a gauge invariant concept.
One may view this as a variant of the firewall-like possibility described in [41] that black holes may have 'no interior', or at least no interior from which familiar physics can be extracted.
Nevertheless, as with any gauge symmetry, one is free to fix a gauge in order to define a language (i.e., a set of observables) with which to describe the physics. In particular, as noted above, at the level of Hilbert spaces any gauge invariance is naturally associated with what one may roughly call a projection $P$ from some kinematic Hilbert space $\mathcal{H}_{\rm kin}$ to a physical Hilbert space 25 $\mathcal{H}_{\rm phys} \subset \mathcal{H}_{\rm kin}$. In this sense, one may think of a general gauge fixing procedure as a choice of linear subspace $\mathcal{H}_{\rm GF} \subset \mathcal{H}_{\rm kin}$ such that $P$ defines a bijection between $\mathcal{H}_{\rm GF}$ and $\mathcal{H}_{\rm phys}$. Within a given such gauge fixing scheme, it may then be that the experiences of infalling observers become well-defined. For example, in describing the interior of a black hole of radius $R_0$ that recently formed from collapse, it would be natural to choose a gauge in which the interior is of size comparable to $R_0$ (even if such small interiors are gauge equivalent to certain much larger interiors that might form when an initially much larger black hole decays to size $R_0$), and in particular in which standard effective field theory is a good approximation.
With this in mind, we recall that the discussions of [42,44-49,121] described a close parallel between old black holes that have been radiating into an external system ('the bath') and the ER=EPR paradigm of [31]. In particular, these works suggested that infalling observers experience only standard physics even at the horizon of black holes that have been evaporating for longer than the Page time. At first sight such statements may seem to be in great tension with our bound on the number of linearly independent states inside the black hole. But this tension can be resolved by interpreting the comments of [42,44-49,121] as providing a gauge fixed description, where in this case the choice of gauge depends on the state of the bath. In other words, if the black hole system with physical Hilbert space $\mathcal{H}_{\rm phys}$ is considered in the presence of another system with Hilbert space $\mathcal{H}_{\rm bath}$ then, even if the bath system by itself has no gauge invariance, one is free to gauge fix by choosing a general linear subspace $\mathcal{H}_{\rm GF,joint} \subset \mathcal{H}_{\rm kin} \otimes \mathcal{H}_{\rm bath}$ for which $P$ defines a bijection to $\mathcal{H}_{\rm phys} \otimes \mathcal{H}_{\rm bath}$. Note that there is no requirement for $\mathcal{H}_{\rm GF,joint}$ to be a tensor product $\mathcal{H}_{\rm GF}^0 \otimes \mathcal{H}_{\rm bath}$ for any fixed subspace $\mathcal{H}_{\rm GF}^0 \subset \mathcal{H}_{\rm kin}$. Instead, one is free to effectively let the choice of subspace $\mathcal{H}_{\rm GF}^0 \subset \mathcal{H}_{\rm kin}$ vary with the choice of state in $\mathcal{H}_{\rm bath}$.
The connection with the above works is particularly clear in the discussion of Petz reconstruction in [49]. There one wishes to reconstruct an operator $O$ on $\mathcal{H}_{\rm kin}$ using an operator $O_R$ on $\mathcal{H}_{\rm bath}$. Now, since $O_R$ is an operator on $\mathcal{H}_{\rm bath}$, it is automatically gauge invariant. However, since the operators $O$ discussed in that work were constructed without regard to the (random) physical inner product, they are not gauge invariant. This is consistent, as $O_R$ reconstructs $O$ only on a subspace $\mathcal{H}_{\rm code} \subset \mathcal{H}_{\rm kin} \otimes \mathcal{H}_{\rm bath}$ that similarly fails to be gauge invariant. However, at least to good approximation we can think of $\mathcal{H}_{\rm code}$ as defining a partial gauge fixing (meaning that we could choose some $\mathcal{H}_{\rm GF,joint} \supset \mathcal{H}_{\rm code}$). In particular, we may use any bath bra-state $\langle\psi_{\rm bath}|$ to define a linear map from $\mathcal{H}_{\rm code}$ to $\mathcal{H}_{\rm kin}$ via its natural action on $\mathcal{H}_{\rm bath}$. And for any choice of $\langle\psi_{\rm bath}|$, the image defines a subspace $\mathcal{H}_\psi \subset \mathcal{H}_{\rm kin}$ with at most dimension $d_{\rm code} \ll e^{S_{\rm BH}}$, i.e., where this dimension is much less than the dimension of $\mathcal{H}_{\rm phys}$. As a result, with high probability distinct states in $\mathcal{H}_\psi$ will project to distinct states of $\mathcal{H}_{\rm phys}$. In this sense $\mathcal{H}_{\rm code}$ approximately satisfies the requirements for a partial gauge fixing; a complete gauge fixing would result from extending $\mathcal{H}_{\rm code}$ to make the projection of each $\mathcal{H}_\psi$ isomorphic to $\mathcal{H}_{\rm phys}$. We note that such a gauge fixed interpretation allows all of the hallmarks of what is often called state dependence [25-28] and which is naturally associated with the ER=EPR paradigm. In particular, in contexts where one expects to find only a small number of black hole states (states in $\mathcal{H}_{\rm phys}$) for each bath state, it will be possible to choose a partial gauge fixing of the form described above that selects only states in $\mathcal{H}_{\rm kin}$ with no drama at the horizon. In particular, one will be able to choose a code subspace within which the evolution can be well-described by standard local effective field theory.
In addition, we note that standard objections [40,41,122-124] to state dependence focus on non-uniqueness of the predicted physics, and that such objections are clearly moot in a context where the state dependence is simply a choice of gauge (so that non-uniqueness of $\mathcal{H}_{\rm GF}$ is to be expected, and so that the gauge invariant predictions are in fact identical).
Nevertheless, the non-uniqueness arguments of [40,41,122-124] then show the sort of states that, while they appear at first sight to be physically distinct, must in fact be related by the enlarged gauge symmetry described above. In particular, tracing through such arguments leads to other gauges in which infalling observers experience varying amounts and types of drama at the horizon, as well as to gauges where the observer simply fails to exist in the interior of the black hole. 26 Furthermore, just as there is a particular gauge (or class of gauges) realizing ER=EPR-like scenarios, it seems likely that one can also find gauges realizing fuzzball scenarios (see e.g. [17,19,24,29,125-129]), the non-violent non-locality proposal 27 [20,22,30], proposals emphasizing the bulk Wheeler-DeWitt equation [130,131], the black hole final state proposal [14], and perhaps other proposals as well.
On the other hand, the above discussion immediately raises the question of how different experiences of a given observer could possibly be gauge related, and thus how the above scenario could possibly be realized in models that are sufficiently realistic to describe our own universe. While there is surely more to be said about this issue, we note that any gauge fixing scheme can be used to define an associated gauge invariant observable. I.e., just as one can use Coulomb gauge in electromagnetism to define gauge invariant operators ("the potential in Coulomb gauge"), in the above scenario one can use any gauge to define a notion of observer inside the black hole. The variety of possible gauges would then mean that there are a variety of possible gauge invariant definitions of the observer which happen to coincide (or nearly coincide) under familiar conditions outside old black holes but which differ greatly inside old black holes. One may then rephrase the above statement in a less surprising manner: While we may well-enough understand how to define an observer at the leading semi-classical level, there may be many possible extensions of this definition at the level of non-perturbative physics, and predictions for the observer inside old black holes may depend sensitively on the choice of this extension 28 . The scenario described above (in which apparently distinct observer experiences are gauge related) may thus be considered to be just another version of this idea. It will likely be of great interest to further explore such conjectures and related physics in future work.
26 If one imposes the constraint that the observer survives (in a recognizable form) for a given proper time behind the black hole horizon, then one would expect a generic gauge consistent with this constraint to predict the maximum amount of such drama consistent with the observer's survival to that point.
27 The non-locality scale $L_d$ in spacetime dimension $d$ is set by the condition $\Delta A \sim G$ described in footnote 22. On a Killing slice of a static black hole of area-radius $R$, the corresponding proper distance from the event horizon would be $L_d \sim \ell_p \left(\frac{\ell_p}{R}\right)^{\frac{d-4}{2}}$. With respect to the definitions of [22], $L_d$ then gives "non-violent" physics for $d < 4$.
From this, one can check the recurrence relation $B_{n+1}(\lambda) = \lambda\left(B_n(\lambda) + B_n'(\lambda)\right)$ (A.3) and $B_0(\lambda) = 1$, from which we can see that $B_n(\lambda)$ is a monic polynomial of degree $n$. In particular this gives us the scaling at large $\lambda$ and fixed $n$, $B_n(\lambda) \sim \lambda^n$, $\lambda \to \infty$, $n$ fixed. (A.4)
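As a small illustration, the recurrence can be iterated directly on polynomial coefficients. This is a minimal sketch assuming the recurrence reads $B_{n+1}(\lambda)=\lambda\left(B_n(\lambda)+B_n'(\lambda)\right)$ with $B_0=1$; the function names are hypothetical.

```python
def next_touchard(coeffs):
    """Map the coefficients of B_n to those of B_{n+1}.

    coeffs[m] multiplies lambda^m.
    """
    # Derivative B_n'(lambda), padded with a zero to keep the same length.
    deriv = [m * c for m, c in enumerate(coeffs)][1:] + [0]
    summed = [c + d for c, d in zip(coeffs, deriv)]
    return [0] + summed  # multiplying by lambda shifts every power up by one


def touchard(n):
    """Coefficients of B_n(lambda), starting from B_0 = 1."""
    coeffs = [1]
    for _ in range(n):
        coeffs = next_touchard(coeffs)
    return coeffs
```

The returned list has length $n+1$ and its last entry is 1, so each $B_n$ built this way is monic of degree $n$, in line with (A.4).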

A.1 Large n and convergence
For studying convergence of $\sum_n c_n |Z^n\rangle$, we require the moments at large $n$ and fixed $\lambda$. For this, observe that the ratio of consecutive terms in the sum defining $B_n(\lambda)$ is $\frac{\lambda}{d+1}\left(1+\frac{1}{d}\right)^n \sim \frac{\lambda}{d}\, e^{n/d}$, where the asymptotic form applies for $1 \ll n \ll d^2$. The terms are largest where this ratio is unity, and hence for large $n$ the $d$th term in the sum is maximal when $d \sim \frac{n}{\log n}$. Substituting this value back into the sum, we can find an estimate of $B_n(\lambda)$ at large $n$, which we can write as $\log B_n(\lambda) = n\log n - n\log\log n - n + o(n)$. For a more careful derivation and many more terms in the expansion, it is convenient to write $d = \frac{n}{\log n}\left(1 + \frac{x}{\sqrt{n}}\right)$ and take the limit of the terms in the sum as $n \to \infty$ at fixed $x$. In this limit, the series becomes a Gaussian integral in $x$. From this, we can estimate the norm of the basis state, $\| |Z^n\rangle \| = \sqrt{\langle Z^n|Z^n\rangle} = e^{-\lambda/2}\sqrt{B_{2n}(\lambda)}$: $\log \| |Z^n\rangle \| = n\log n - n\log\log n - n(1 - \log 2) + o(n)$ as $n \to \infty$. (A.8)
Now we can begin to characterise convergence of sums $\sum_n c_n |Z^n\rangle$ in the baby universe Hilbert space of section 3.3. By definition, the series converges if the partial sums form a Cauchy sequence. That is, $\sum_{n=0}^{\infty} c_n |Z^n\rangle$ converges $\iff \left\| \sum_{n=n_1}^{n_2} c_n |Z^n\rangle \right\| \to 0$ as $n_1, n_2 \to \infty$, (A.9) where in this limit we can take $n_1, n_2$ to infinity separately at different rates.$^{29}$ We will not characterise such series completely, but find a sufficient condition to give us a class of convergent series, and a necessary condition to constrain them.
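The location of the dominant term in the large-$n$ estimate above can be checked numerically. The sketch below assumes the $d$th term of the sum is $d^n\lambda^d/d!$, finds its maximum by brute force, and compares it with the root of the unit-ratio condition $\frac{\lambda}{d}e^{n/d} = 1$. The helper names `log_term`, `argmax_term` and `unit_ratio_d` are hypothetical.

```python
from math import e, lgamma, log


def log_term(d, n, lam):
    """log of the assumed d-th term d^n lam^d / d! (working in logs avoids overflow)."""
    return n * log(d) + d * log(lam) - lgamma(d + 1)


def argmax_term(n, lam):
    """Index d >= 1 of the largest term, found by brute-force search."""
    # For d >= e*lam + n the ratio of consecutive terms is below one,
    # so the peak cannot lie beyond this bound.
    d_max = int(e * lam + n) + 1
    return max(range(1, d_max + 1), key=lambda d: log_term(d, n, lam))


def unit_ratio_d(n, lam, iterations=200):
    """Solve (lam/d) * exp(n/d) = 1, i.e. n/d = log(d/lam), by bisection."""
    def g(d):
        return n / d - log(d / lam)  # strictly decreasing in d

    lo, hi = lam, e * lam + n  # g(lo) > 0 > g(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, argmax_term(n, 1.0), unit_ratio_d(n, 1.0), n / log(n))
```

For $\lambda = 1$ the root of the unit-ratio condition lies above $n/\log n$, since it satisfies $d = n/\log d$ with $\log d < \log n$. This is consistent with $d \sim n/\log n$ holding only to leading logarithmic accuracy.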
First, a necessary condition for convergence (coming from $n_1 = n_2$) is that the norm of individual terms goes to zero: convergence $\implies |c_n|\, \| |Z^n\rangle \| \to 0$ as $n \to \infty$. (A.10) Now, from (A.8), we see that $\| |Z^n\rangle \|$ is eventually larger than $R^n$ for any $R > 0$, so $|c_n| R^n$ is bounded, which implies that $f(z) := \sum_n c_n z^n$ converges in the disc $|z| < R$. Since this holds for all $R$, we find that our series defines an entire analytic function, $\sum_{n=0}^{\infty} c_n |Z^n\rangle$ converges $\implies f(z) = \sum_n c_n z^n$ is entire analytic. (A.11) We can thus characterise convergent series in terms of the class of allowed analytic functions. Improving on the analyticity result, we can bound the growth of allowed functions $f$. To do this, we introduce the order of an analytic function, which is the infimum over all $\rho$ such that $|f(z)| < \exp(|z|^{\rho})$ for sufficiently large $|z|$. We can strengthen our necessary condition to $\sum_{n=0}^{\infty} c_n |Z^n\rangle$ converges $\implies f(z) = \sum_n c_n z^n$ has order $\le 1$, (A.12) which means that for every $\epsilon > 0$, we have $|f(z)| < \exp(|z|^{1+\epsilon})$ for sufficiently large $|z|$.
To show this, we use a result expressing the order in terms of the Taylor coefficients, namely $\mathrm{order}(f) = \limsup_{n\to\infty} \frac{n \log n}{\log(1/|c_n|)}$. For the norm of the terms in the series to go to zero, we must have $\log(1/|c_n|) - \log \| |Z^n\rangle \|$ go to infinity, so for sufficiently large $n$ we have $\log(1/|c_n|) > \log \| |Z^n\rangle \|$. From (A.8), for any $\epsilon > 0$ and sufficiently large $n$ we have $\log \| |Z^n\rangle \| > (1 - \epsilon)\, n \log n$. In turn, this means that $\log(1/|c_n|) > (1 - \epsilon)\, n \log n$ for large enough $n$, and hence $\limsup_{n\to\infty} \frac{n \log n}{\log(1/|c_n|)} \le \frac{1}{1-\epsilon}$. Since $\epsilon$ is arbitrary, the order is at most 1.
Our sufficient condition is absolute convergence, which means that the sum of norms $\sum_n |c_n|\, \| |Z^n\rangle \|$ converges, and follows from the triangle inequality for the norm. Now, from (A.8), we have the result that $\| |Z^n\rangle \|$ grows more slowly than $n!\, a^n$ for any $a > 0$, that is, $\| |Z^n\rangle \| / (n!\, a^n) \to 0$. From this, we can find a simple sufficient bound on the coefficients for convergence, $|c_n| < A \frac{x^n}{n!}$ for some $A, x \implies \sum_n c_n |Z^n\rangle$ convergent. (A.14) In particular, this means that any exponential function $|e^{xZ}\rangle$, or more generally a function of exponential type, defines a convergent series by its Taylor expansion. The gap between our sufficient and necessary conditions (order one functions that are not of exponential type) is small but nonempty, for example containing $\frac{1}{\Gamma(-z)}$.
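The two conditions can be illustrated numerically in the simplest case. The sketch below assumes $\lambda = 1$, where the $B_m(1)$ are the Bell numbers, and takes $\| |Z^n\rangle \|^2 = B_{2n}(1)$, dropping any $n$-independent normalisation (which does not affect the growth rate). It then evaluates $\log\left(|c_n|\, \| |Z^n\rangle \|\right)$ for the exponential-type coefficients $c_n = x^n/n!$. All function names are hypothetical.

```python
from math import lgamma, log


def bell_numbers(m):
    """Bell numbers B_0, ..., B_m (exact integers), computed with the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(m):
        new_row = [row[-1]]  # each row starts with the last entry of the previous row
        for entry in row:
            new_row.append(new_row[-1] + entry)
        row = new_row
        bells.append(row[0])
    return bells


def log_norm(n, bells):
    """log of the assumed norm, (1/2) * log B_{2n}(1)."""
    return 0.5 * log(bells[2 * n])


def log_term_norm(n, x, bells):
    """log of |c_n| * norm for c_n = x^n / n!."""
    return n * log(x) - lgamma(n + 1) + log_norm(n, bells)


if __name__ == "__main__":
    bells = bell_numbers(240)
    for n in (10, 30, 60, 120):
        # Growth of the norm beats any fixed power R^n (here R = 5 for illustration),
        # while the norm divided by n! still decays.
        print(n, log_norm(n, bells) - n * log(5.0), log_term_norm(n, 1.0, bells))
```

For larger $x$ the decay of the terms sets in only at larger $n$, since by (A.8) the suppression of the norm relative to $n!$ is only of order $e^{-n \log\log n}$.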

A.2 Large λ and n
Here, we study a limit of $\lambda \to \infty$ and $n \to \infty$ at fixed ratio $\nu = \frac{n}{\lambda}$, which will interpolate between the large $\lambda$, fixed $n$ and large $n$, fixed $\lambda$ results. We could proceed from the same series expression, but we use an alternative method, starting from an integral representation of $B_n(\lambda)$. This expression extracts the moments from the generating function (3.11) by a contour integral, which we evaluate by steepest descent about a saddle point $u_*$ of an exponent $\lambda S(u)$. The result in fact interpolates between our two previous results for large $\lambda$, fixed $n$ (by taking $\nu \ll 1$) and large $n$, fixed $\lambda$ (by taking $\nu \gg 1$). It is interesting in particular to see how the large $\lambda$ result breaks down when $n$ becomes large. Taking $\nu \ll 1$ we have $u_* = \nu - \nu^2 + O(\nu^3)$, so $S(u_*) \sim -\nu \log \nu + \nu + \frac{1}{2}\nu^2 + \cdots$, with higher terms all integer powers of $\nu$. Substituting this into the steepest descent result, we have $\frac{e^{\lambda S(u_*)}}{u_* \sqrt{2\pi S''(u_*) \lambda}} \sim \frac{e^{n \log \lambda - n \log n + n + \frac{n^2}{2\lambda} + \cdots}}{\sqrt{2\pi n}} \sim \frac{\lambda^n}{n!}\, e^{\frac{n^2}{2\lambda} + \cdots}$, (A.19) where we applied Stirling's approximation to the factorial. The terms in the exponential are of the form $\frac{n^k}{\lambda^{k-1}}$ for $k = 2, 3, \ldots$, and become relevant when $n$ is of order $\lambda^{1-1/k}$. The first correction comes from the $k = 2$ term shown explicitly, first relevant when $n$ is of order $\sqrt{\lambda}$, when it contributes an order one rescaling of $B_n(\lambda)$: $B_n(\lambda) \sim \lambda^n e^{\frac{n^2}{2\lambda}}$. Higher order terms in the exponential are given by higher orders in the expansion of $S(u_*)$ at small $\nu$.
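The steepest descent estimate can be compared against exact values. The following sketch rests on assumptions that are reconstructed rather than quoted: that the generating function is $\exp(\lambda(e^u - 1))$, so that $S(u) = e^u - 1 - \nu \log u$ with saddle point $u_* e^{u_*} = \nu$, and that $B_n(\lambda) \approx n!\, e^{\lambda S(u_*)} / \left(u_* \sqrt{2\pi S''(u_*)\lambda}\right)$. These choices reproduce the small-$\nu$ expansions of $u_*$ and $S(u_*)$ quoted above. The function names are hypothetical.

```python
from math import exp, lgamma, log, pi


def touchard_exact(n, lam):
    """B_n(lam) = sum_k S(n, k) lam^k, with Stirling numbers of the second kind."""
    stirling = [1]  # row m = 0: S(0, 0) = 1
    for _ in range(n):
        prev = stirling + [0]
        # S(m+1, k) = k * S(m, k) + S(m, k-1)
        stirling = [k * prev[k] + (prev[k - 1] if k > 0 else 0)
                    for k in range(len(prev))]
    return sum(s * lam ** k for k, s in enumerate(stirling))


def saddle_u(nu, iterations=50):
    """Solve u * exp(u) = nu for u > 0 by Newton's method."""
    u = log(1.0 + nu)  # starts at or above the root, so the iteration decreases monotonically
    for _ in range(iterations):
        u -= (u * exp(u) - nu) / (exp(u) * (1.0 + u))
    return u


def log_saddle_estimate(n, lam):
    """log of the assumed steepest descent estimate of B_n(lam)."""
    nu = n / lam
    u = saddle_u(nu)
    s = exp(u) - 1.0 - nu * log(u)  # assumed S(u_*)
    s2 = exp(u) + nu / u ** 2  # S''(u_*)
    return lgamma(n + 1) + lam * s - log(u) - 0.5 * log(2.0 * pi * s2 * lam)


if __name__ == "__main__":
    for n, lam in ((5, 50.0), (40, 20.0), (60, 2.0)):
        ratio = exp(log_saddle_estimate(n, lam) - log(touchard_exact(n, lam)))
        print(n, lam, ratio)
```

The printed ratio of estimate to exact value should approach one as $\lambda$ and $n$ grow at fixed $\nu$, on both the $\nu \ll 1$ and $\nu \gg 1$ sides. Working with logarithms keeps the comparison stable when $B_n(\lambda)$ itself is very large.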