Nonanalyticity and On-Shell Factorization of Inflation Correlators at All Loop Orders

The dynamics of quantum fields during cosmic inflation can be probed via their late-time boundary correlators. The analytic structure of these boundary correlators contains rich physical information about the bulk dynamics, and is also closely related to cosmological collider observables. In this work, we study a particular type of nonanalytic behavior, called nonlocal signals, for inflation correlators with massive exchanges at arbitrary loop orders. We propose a signal-detection algorithm to identify all possible sources of nonlocal signals in an arbitrary loop graph, and prove that the algorithm is exhaustive. We then present several versions of the on-shell factorization theorem for the leading nonlocal signal in graphs with an arbitrary number of loops, and provide an explicit analytical expression for the leading nonlocal signal. We also generalize the nonlocal-signal cutting rule to arbitrary loop graphs. Finally, we provide many explicit examples to demonstrate the use of our results, including an n-loop melon graph and a variety of 2-loop graphs.


1 Introduction
The inflation patch of (3 + 1)-dimensional de Sitter spacetime features an exponential expansion of 3-dimensional flat space towards a spacelike future boundary. On the one hand, the exponential expansion can trigger active particle production through quantum fluctuations of the vacuum state; on the other hand, it redshifts all those quantum fluctuations to superhorizon scales. An observer who can access a finite portion of the future boundary could then probe the quantum field theory processes in the bulk by measuring the equal-time correlation functions of field operators. These equal-time correlation functions at the future boundary of the inflation patch are what we call inflation correlators.
It is widely believed that our own universe has undergone a period of inflation at very high energy scales which lasted for at least several tens of e-folds [1]. A happy consequence of this scenario is that we are the observers who can measure inflation correlators from the large-scale nonuniformity of the universe. This type of measurement has been emphasized in recent years as a promising tool to study not only the primordial evolution of the universe, but also particle physics at presumably the highest energy scale we can ever reach, a program known as cosmological collider (CC) physics.
Similar to ordinary scattering amplitudes, inflation correlators are functions of the external momenta of field operators. As has happened repeatedly in the history of physics, it turns out to be very useful to extend those external momenta to complex values and to study the analytic properties of inflation correlators on the complex planes. Indeed, the study of analytic properties of scattering amplitudes has led to many fruitful results in the past decades [64-66].
Through recent studies, the analytic structure of tree-level dS amplitudes has been understood quite well. In general, a tree-graph contribution to a $B$-point amplitude $T(\mathbf k_1, \cdots, \mathbf k_B)$ is regular in all physically accessible configurations (henceforth the physical region), but possesses singularities in the unphysical region. Generic singularities include the total-energy pole, which is a divergence of the amplitude when the sum of the magnitudes of all external momenta goes to zero: $k_1 + \cdots + k_B \to 0$. The magnitude of a momentum $k_i \equiv |\mathbf k_i|$ ($i = 1, \cdots, B$) is called an energy in the literature, hence the name total-energy pole. Likewise, a tree graph generally possesses poles when the sum of all energies at an interaction vertex goes to zero, which is called a partial-energy pole. Technically, the total-energy and partial-energy limits are singular due to the divergence of the Schwinger-Keldysh (SK) time integrals in the early-time limit. At early times, the spacetime curvature and the future boundary become irrelevant, and one expects that the residues of these poles are given by flat-space quantities. Indeed, for example, the residue of the total-energy pole is exactly given by the corresponding scattering amplitude in flat space. However, the total-energy and partial-energy singularities are in the unphysical region and thus inaccessible if one holds all individual energies finite.
Besides the total-energy and partial-energy singularities, there is another class of singularities which are characteristic of inflation correlators with massive exchanges. These are branch points sitting at the boundary of the physical region, where a partial sum of external momenta goes to zero. For a $B$-point amplitude $T(\mathbf k_1, \cdots, \mathbf k_B)$, this limit can be specified by $\mathbf P_i \equiv \mathbf k_1 + \cdots + \mathbf k_i \to 0$ ($i < B$). We note that these soft limits are physically accessible even when the individual $k_i$ remain finite. The branch points at such limits are generated by complex powers $P_i^{\pm i\omega}$, where $P_i \equiv |\mathbf P_i|$ and $\omega$ is a real parameter related to the mass of the intermediate particles. As shown in Fig. 1, the complex power function $P^{i\omega}$ is nonanalytic at $P = 0$: it produces logarithmic oscillations in $P$ in the physical region $P > 0$, and generates a branch cut when $P < 0$. These logarithmic oscillations are exactly the CC signals extensively studied in recent years. The detailed shape of these oscillations contains rich information about particle physics at the inflation scale. Thus, such complex-power branch points are phenomenologically important and deserve further study.
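The behavior of the complex power on the two sides of $P = 0$ can be made concrete numerically. The following is a minimal sketch, assuming the principal branch of the logarithm (so that the cut lies along the negative real axis) and using the value $\omega = 4$ quoted for Fig. 1; the helper name `complex_power` is ours and purely illustrative.

```python
import cmath
import math


def complex_power(P: complex, omega: float) -> complex:
    """Return P**(i*omega) on the principal branch, i.e. exp(i*omega*Log P)."""
    return cmath.exp(1j * omega * cmath.log(P))


omega = 4.0  # value quoted in the Fig. 1 caption

# Physical region P > 0: the modulus is 1 and the phase is omega*ln(P),
# so the function repeats whenever ln(P) is shifted by 2*pi/omega.
P = 0.3
value = complex_power(P, omega)
shifted = complex_power(P * math.exp(2.0 * math.pi / omega), omega)

# Just above and just below the negative real axis the principal logarithm
# differs by 2*pi*i, so the modulus jumps between exp(-pi*omega) and exp(+pi*omega).
eps = 1e-12
above = complex_power(complex(-1.0, eps), omega)
below = complex_power(complex(-1.0, -eps), omega)

print(abs(value), abs(value - shifted), abs(above), abs(below))
```

The first two printed numbers illustrate the logarithmic oscillation at $P > 0$ (unit modulus, periodic in $\ln P$), while the last two show the discontinuity across $P < 0$.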
Following our earlier works along this line, we shall call such complex-power singularities in partial sums of momenta nonlocal signals, to highlight their relation to CC observables. There is a similar type of complex-power branch point at the boundary of the physical region in the complex plane of a partial sum of energies $k_1 + \cdots + k_i$ instead of partial momenta. These complex energy powers are called local signals, which are also CC observables. The physical origins of nonlocal and local signals are quite different [83]. In this work, we focus on nonlocal signals, and a study of local signals will be presented elsewhere. It was realized in early studies that the form of a nonlocal or local signal is typically simpler than the whole amplitude [9,10]. From the viewpoint of the SK integral, the signals are generated from factorized time integrals; one does not need to worry about time orderings in the multi-layer time integrals so far as the (nonlocal or local) signal is the only concern. From the bootstrap viewpoint, the signal part corresponds to the solution to the homogeneous (sourceless) bootstrap equations [9,67,92]. All these facts show that the nonlocal signal in an amplitude should be much simpler to understand and compute than the full result.
Indeed, in [83] it was shown that one can formulate a cutting rule for signals in a tree graph, in the sense that one only needs to compute factorized time integrals. We proved this cutting rule in a more rigorous way in [92] with the partial Mellin-Barnes representation. Later in [94], we generalized this result to arbitrary 1-particle-irreducible 1-loop graphs, and showed that the nonlocal signals in 1-loop graphs are also free from time orderings. More importantly, we provided an explicit analytic expression for the leading nonlocal signal (leading in the partial sum $P_i$ as $P_i \to 0$), and showed that the leading nonlocal signal factorizes into two subgraphs together with a "bubble signal" at the 1-loop order. This factorization is very similar to the on-shell factorization of tree-level scattering amplitudes in flat spacetime, as we shall elaborate in Sec. 2. Thus, we will borrow the term "on-shell factorization" to describe the factorization of nonlocal signals in inflation correlators. We choose to use this term also to avoid potential confusion with the factorization of correlators at their partial-energy poles [67-69,78].
Figure 1: The function $P^{i\omega}$ on the complex plane with $\omega = 4$, taken from [94]. The color shows the phase of the function and the brightness shows the magnitude.
In this work, we generalize our earlier study of 1-loop nonlocal signals to graphs with an arbitrary number of loops, and thus provide a more complete understanding of leading nonlocal signals within the scope of perturbative diagrammatic expansion. The main results of this work are the following:
1. We properly define what a nonlocal signal is in an arbitrary massive inflation correlator. Specifically, a nonlocal signal is defined with respect to a particular nonlocal soft limit, where a particular partial sum of external momenta goes to zero, while all other independent partial sums remain finite. The nonlocal signal is then defined as the complex power dependence in the soft partial sum.
2. Intuitively, a nonlocal signal arises when some of the bulk propagators in a graph become soft simultaneously. We turn this intuition into a signal-detection algorithm which enables us to locate all sources of nonlocal signals in a given nonlocal soft limit. This is the first main result of our work, summarized in Theorem 1 in Sec. 3. Essentially, the algorithm says that we take all possible nonlocal cuts of the diagram with respect to a fixed bipartition. Then, the kinematic region where all cut lines become soft gives a candidate source of nonlocal signals. With the partial Mellin-Barnes representation, we prove rigorously that our signal-detection algorithm is exhaustive. Technically, we show that the nonlocal signal must come from the "degenerate singularities" of the loop integral, and that the "degenerate regions" and the nonlocal cuts are in one-to-one correspondence.
3. Given the complication of arbitrary-loop inflation correlators in the absence of full dS isometry, we are unable to rigorously formulate a factorization theorem in the most general case. Instead, we can prove the factorization of the leading nonlocal signal in graphs with an arbitrary number of loops in two broad cases: a) the graph can contain arbitrary dS-symmetry-breaking propagators and interactions, but has no internal vertices, i.e., all vertices are connected to at least one external line; b) the graph can have arbitrary loop topology but is dS covariant. In either of these two cases, we can show that the leading nonlocal signal in a given nonlocal soft limit is a sum of finitely many terms, each of which corresponds to a "minimal cut" of the diagram. With a given minimal cut, the nonlocal signal factorizes into three pieces: a left subgraph, a right subgraph, and a $(D-1)$-loop "melon signal," where $D$ is the number of lines in the minimal cut. This is the second main result of this paper, summarized in Theorem 2 in Sec. 4 and Theorems 3 and 4 in Sec. 5.
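The combinatorial idea behind items 2 and 3 can be sketched in a few lines. The code below is only an illustration under our own assumptions, not the algorithm of Theorem 1: we assume that a cut with respect to a fixed bipartition is the set of internal lines crossing between two connected halves of the graph, with the vertices attached to the soft group of external legs on the left and the remaining externally attached vertices on the right, and we read "minimal cut" as a cut with the fewest lines. The precise definitions are those of Sec. 3; all function names here are hypothetical.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """True if `vertices` is nonempty and connected using only edges inside it."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == v and y in vertices and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == vertices


def enumerate_cuts(vertices, edges, left, right):
    """List every cut as a list of edge indices (multi-edges are allowed).

    `left` / `right` are the vertices pinned to each side by the external legs;
    all other (internal) vertices may be assigned to either side.
    """
    left, right = set(left), set(right)
    free = [v for v in vertices if v not in left and v not in right]
    cuts = []
    for r in range(len(free) + 1):
        for extra in combinations(free, r):
            side_l = left | set(extra)
            side_r = set(vertices) - side_l
            if is_connected(side_l, edges) and is_connected(side_r, edges):
                cuts.append([i for i, (a, b) in enumerate(edges)
                             if (a in side_l) != (b in side_l)])
    return cuts


# Toy graph (assumed): a double line from vertex 0 to an internal vertex 2,
# followed by a single line from 2 to vertex 1.
edges = [(0, 2), (0, 2), (2, 1)]
cuts = enumerate_cuts([0, 1, 2], edges, left=[0], right=[1])
minimal = min(cuts, key=len)
print(cuts, "lines in minimal cut:", len(minimal))
```

In this toy graph the internal vertex can sit on either side, which gives two candidate cuts; the one with fewer lines is the single line, for which $D = 1$ and the associated "melon signal" has $D - 1 = 0$ loops.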
The rest of the paper is structured as follows. Given the technical complexity of this work, we include a nontechnical discussion of the nonlocal signal in Sec. 2, where we start from flat-space scattering amplitudes, explain the origin of singularities in QFT amplitudes, and show why the factorization of a nonlocal signal is similar to the on-shell factorization of tree amplitudes in flat space. Then, in Sec. 2.2, we discuss nonlocal signals from cutting a tree line in arbitrary inflation correlators, as a warm-up example for the more complicated cases of cutting loops.
Then, in Sec. 3 we set out to study nonlocal signals in arbitrary loop graphs, defining the nonlocal signal properly (Sec. 3.1), formulating the signal-detection algorithm (Sec. 3.2), and proving the algorithm with the partial Mellin-Barnes representation (Sec. 3.3). We also collect a few useful lemmas about cutting a graph in Sec. 3.4. In Sec. 4, we formulate and prove the on-shell factorization theorem (Theorem 2) for the leading nonlocal signal in an arbitrary graph without internal vertices, which we call a bulk-free graph.
Then, in Sec. 5, we discuss the complications of graphs with internal vertices (Sec. 5.1), and then formulate the on-shell factorization theorem of nonlocal signals for graphs with a unique minimal cut (Sec. 5.2) and graphs with multiple minimal cuts (Sec. 5.3).
We then provide in Sec. 6 a number of explicit examples, including the tree-level mixing graph, the L-loop melon graphs, and many two-loop examples. These examples serve as demonstrations of nonlocal signals in many different situations in multiloop graphs. The conclusion and outlook are in Sec. 7. We collect frequently used mathematical functions and notations in App. A, and useful integrals in App. B.
Notations and conventions. Most of the notations and conventions of this work are in line with the ones adopted in our previous works, especially [94]. We use the mostly-plus signature for the spacetime metric: $ds^2 = a^2(\tau)(-d\tau^2 + d\mathbf x^2)$, where $a(\tau) = -1/(H\tau)$ is the scale factor, $\tau \in (-\infty, 0)$ is the conformal time, and $H$ is the inflation Hubble parameter. We take $H = 1$ throughout this work for simplicity. We use bold letters such as $\mathbf k$ to denote 3-momenta and the corresponding italic letter $k \equiv |\mathbf k|$ to denote its magnitude. When encountering the sum of variables with different indices, we shall use a shorthand notation $k_{12} \equiv k_1 + k_2$, $E_{12s} \equiv E_1 + E_2 + E_s$, etc. We shall use Mellin variables such as $s_i$ and $\bar s_i$ extensively. We often use a pair of barred and unbarred variables to denote the two (independent) Mellin variables on the same bulk line. We shall also use $\bar s_i$ and $s_{\bar i}$ interchangeably. Following our notations, we shall denote the sum of Mellin variables with the shorthand $s_{ij} = s_i + s_j$, $s_{i\bar i j\bar j} = s_i + \bar s_i + s_j + \bar s_j$, etc.
2 Nonlocal Signals: Physical Understanding
2.1 From scattering amplitudes to inflation correlators
Flat-space scattering amplitudes and correlators. The occurrence of a singularity in amplitudes typically has a physical meaning. It is instructive to review a familiar example from scattering amplitudes in flat spacetime. Consider a simple model of a massless scalar $\phi$ and a massive scalar $\sigma$ of mass $m$, interacting through a simple cubic vertex $\mathcal L \supset -\frac{1}{2}\lambda\sigma\phi^2$. Then, the scattering $\phi(k_1)\phi(k_2) \to \phi(k_3)\phi(k_4)$ at the leading order is mediated by a single $\sigma$ exchange at the tree level, with three independent channels. Taking the $s$-channel exchange as an example, the scattering amplitude simply reads:
where $s = -(k_1^\mu + k_2^\mu)^2$ is the squared 4-momentum of the $s$-channel. Clearly, the amplitude possesses a simple pole at $s = m^2$, where the intermediate particle is on shell. In addition, the amplitude factorizes into three pieces at this pole:
This looks almost trivial for such a simple process, but the factorization is actually a very general result which holds nonperturbatively [108]. The crucial ingredient of this factorization is an intermediate state going on shell. To appreciate the physical meaning of this pole, we switch to a manifestly on-shell language, writing $s = (k_s^0)^2 - \mathbf k_s^2 = (E_1 + E_2)^2 - \mathbf k_s^2$ and using the on-shell energy $E_s^2 \equiv \mathbf k_s^2 + m^2$. (We emphasize that $E_s \neq k_s^0$.) Then:
This is nothing but what we would get by solving the Lippmann-Schwinger equation to the first nontrivial order in old-fashioned perturbation theory. In this formalism, all the states stay manifestly on shell, at the expense that the energy is not conserved at each interaction vertex, i.e., $E_1 + E_2 \neq E_s$, etc. (Indeed, the energy difference in the denominator is familiar from standard perturbation theory in quantum mechanics.)
The uncertainty principle tells us that the energy need not be conserved so long as the process happens within a short period of time $\Delta t \sim (E_1 + E_2 - E_s)^{-1}$. Therefore, when the intermediate particle "goes on shell" in the covariant language, it really means that energy conservation is satisfied at each interaction vertex so that the interaction can last indefinitely long, which becomes the source of the divergence. Also, since the intermediate on-shell particle can travel a long distance, the left and right subgraphs are naturally separated from each other. This gives us an intuitive understanding of the on-shell pole and the factorization of the graph.
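The passage from the covariant pole to energy denominators rests on a partial-fraction identity, which is easy to check numerically. The sketch below assumes only the kinematic definitions quoted above, $s = (E_1 + E_2)^2 - \mathbf k_s^2$ and $E_s^2 = \mathbf k_s^2 + m^2$; overall factors of the coupling and sign conventions of the amplitude are deliberately left out, and the function names are ours.

```python
import math


def covariant_pole(E1: float, E2: float, ks: float, m: float) -> float:
    """1/(s - m^2) with s = (E1 + E2)^2 - ks^2."""
    s = (E1 + E2) ** 2 - ks ** 2
    return 1.0 / (s - m ** 2)


def energy_denominators(E1: float, E2: float, ks: float, m: float) -> float:
    """Same quantity written with the on-shell energy Es = sqrt(ks^2 + m^2)."""
    Es = math.sqrt(ks ** 2 + m ** 2)
    E12 = E1 + E2
    # Partial fractions of 1/(E12^2 - Es^2): one term is singular at E12 = Es,
    # the other at E12 = -Es.
    return (1.0 / (E12 - Es) - 1.0 / (E12 + Es)) / (2.0 * Es)


E1, E2, ks, m = 1.3, 0.9, 0.7, 1.1  # arbitrary illustrative numbers
lhs = covariant_pole(E1, E2, ks, m)
rhs = energy_denominators(E1, E2, ks, m)
print(lhs, rhs)
```

The term singular at $E_1 + E_2 = E_s$ is the one whose inverse sets the time scale $\Delta t$ discussed above: the closer the intermediate state is to conserving energy, the larger this term and the longer the process can last.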
Although old-fashioned perturbation theory provides a clear physical intuition for the scattering process, it is rarely used in practical calculations due to the manifest breaking of Lorentz covariance. However, the noncovariant approach based on the Schwinger-Keldysh (SK) formalism (also called the in-in formalism) [19,109-112] proves enormously useful in cosmology with flat spatial slices, where the Lorentz symmetry is typically absent but the 3-dimensional translation and rotation symmetries are present. In this noncovariant approach, we use the time and the 3-momentum as independent variables, so that every interaction vertex is associated with a 3-momentum-conserving $\delta$-function as well as an integral over the time variable. There are four types of frequently used propagators, the Feynman $D_{++}$, the anti-Feynman $D_{--}$, and the two Wightman functions $D_{\pm\mp}$, which for a scalar of mass $m$ are respectively given by:
As a warm-up example, we use this formalism to recompute the scattering amplitude $T(\phi\phi \to \phi\phi)$. Note that the scattering amplitude is an in-out amplitude connecting states at $t = -\infty$ and $t = +\infty$, and the four external legs should be amputated, while the internal leg should take the Feynman propagator $D_{++}$. Therefore:
Here and below we use the shorthand $E_{ij\cdots} = E_i + E_j + \cdots$. This is exactly the familiar result (1).
In cosmology, observables usually correspond to equal-time correlation functions rather than scattering amplitudes. The analytic structure of the correlation functions is already different from that of the corresponding scattering amplitudes in flat space. Let us again take the above $\phi\phi \to \phi\phi$ process as an example. Now we want to compute the equal-time correlation function $\langle\phi_{\mathbf k_1}\phi_{\mathbf k_2}\phi_{\mathbf k_3}\phi_{\mathbf k_4}\rangle'$ instead of the scattering amplitude. For this purpose, we should take the final time at $t = 0$, and include all four possible propagators according to the SK formalism [19]. Also, the external legs are not amputated.
The $s$-channel exchange then gives the following contribution to the correlator:
Here the prime in $\langle\cdots\rangle'$ means that we have removed the 3-momentum-conserving $\delta$-function from the correlator. Using the explicit expressions for the propagators given in (4) and (5), we can evaluate the above integral directly, and the result is:
An immediate observation is that, unlike the scattering amplitude, the correlation function is regular in the interior of the physically reachable region. (In this case, the interior of the physical region is specified by the joint condition $E_i > 0$ ($i = 1, 2, 3, 4$), $E_s > m$, and $E_{12}, E_{34} > \sqrt{E_s^2 - m^2}$.) In particular, there is no physical analog of "on-shell poles" since we are not using the 4-momentum language, and all quantities are manifestly on shell. However, if we are allowed to go to the unphysical region by analytic continuation of some energy or momentum variables, we can find new singularities. For instance, we will hit poles when we send any of the three factors in the denominator to zero: $E_{1234}$, $E_{12s}$, or $E_{34s}$. The first is called a total-energy pole and the last two are called partial-energy poles in the literature. In addition, if we want to view $k_s = \sqrt{E_s^2 - m^2}$ as an independent variable and consider the complex $k_s$-plane, we will also find a pair of branch points at $k_s = \pm i m$. Or, if we stick to the energy variables $E_i$ ($i = 1, \cdots, 4, s$), we only find poles in the complex energy plane, without getting any branch cut [84]. There is a certain freedom and ambiguity in choosing variables for the analytic continuation, and it seems that there is no unique optimal choice. One can use this freedom to choose appropriate variables depending on the problem one wants to solve.
In any case, the lesson here is that, unlike scattering amplitudes, a graph contribution to the correlator is fully regular in the physical region, and singularities appear only in unphysical regions, or at most at the boundary of the physical region. (For instance, the total-energy pole can be physically approached by simultaneously sending $E_1, \cdots, E_4 \to 0$.) This result also extends to loop orders: new singularities such as branch cuts can appear in the complex energy plane at loop orders, but they are all in the unphysical region.
Inflation correlators. Now we proceed to inflation correlators. Here we are interested in the correlation function on the final time slice, marked by the conformal time $\tau_f \simeq 0$. Due to the exponential expansion, it is more convenient to use the comoving momentum $\mathbf k$ as the kinematic variable. So, the correlator we are interested in has the form $\langle\phi_{\mathbf k_1}\cdots\phi_{\mathbf k_4}\rangle'_{\tau=0}$. By a slight abuse of terminology, the magnitude $k_i \equiv |\mathbf k_i|$ is sometimes called the energy in the literature, although it is not directly related to the energy of the mode. Indeed, there is no simple time-independent relation between the energy and the comoving momentum in an inflating background.
Even with this kinematic difference, the analytic structure of inflation correlators, when viewed as functions of complex "energies" $k_i$, has similarities with that of the flat-space correlation functions. In particular, inflation correlators are also regular in the interior of the physical region. Singularities emerge only in unphysical regions, or at most at the boundary of the physical region. The difference is that the curved spacetime background distorts the mode function, which makes the expression of correlation functions more complicated, but also a lot more interesting. As a result, the tree-level correlators may also possess branch cuts in unphysical regions, which is a central topic of this work.
Again, let us take the 4-point $s$-channel exchange graph as the example, but in dS. (As a reminder, we always take $H = 1$.) Here we take the four external modes to be a conformal scalar field $\phi_c$ with mass $m_c^2 = 2$. (It is the conformal scalar rather than the massless scalar in dS that resembles a flat-space massless scalar field.) For the $s$-channel exchanged particle, we take it to be a massive scalar $\sigma$ in the principal series, namely, with mass $m > 3/2$. Still considering the direct cubic coupling $\sqrt{-g}\,\phi_c^2\sigma$, we can then compute the correlation function in the standard way. The complete expression is rather long, and here we only write it in the following schematic way:
Here we have defined $\nu \equiv \sqrt{m^2 - 9/4}$, and $|\tau_f| \ll 1$ is a late-time cutoff inserted to properly take account of the late-time falloff of a conformal scalar. All three functions $F^{(\mathrm{NL})}$, $F^{(\mathrm{L})}$, $F^{(\mathrm{BG})}$ are analytic in $k_s$ at $k_s = 0$, although each of them contains singularities in other places. Importantly, the nonanalytic behavior in $k_s$ at $k_s = 0$ comes entirely from the first term $\propto k_s^{\pm 2i\nu}$, which is called the nonlocal signal. In addition, there is a piece analytic in $k_s$ but nonanalytic in $k_{34}/k_{12}$, which is called the local signal. Finally, there is a piece analytic in all energy variables, which is called the background.
Away from the special point $k_s = 0$, the full correlator $\langle\phi_{\mathbf k_1}\phi_{\mathbf k_2}\phi_{\mathbf k_3}\phi_{\mathbf k_4}\rangle'_s$ is actually an analytic function on the whole complex $k_s$ plane except for isolated poles and branch cuts. As mentioned above, these singularities are all outside the physical region. The expression (9) can be viewed as an expansion of the full correlator around $k_s = 0$.
At this point we want to make a comment on the 3-point limit of (9). If we now send $k_4 \to 0$ to get a 3-point correlator, we see that this amounts to sending $k_{34} \to k_s$ due to momentum conservation. As a result, the nonlocal and the local signals become indistinguishable. Since we are only considering nonlocal signals in this work, we will stay away from such degenerate configurations.
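The degeneracy of the 3-point limit is a purely kinematic statement and can be seen with a few lines of vector arithmetic. The sketch below assumes only momentum conservation, $\mathbf k_s = \mathbf k_3 + \mathbf k_4$ up to an overall sign, together with $k_{34} = k_3 + k_4$; the sample momenta and function names are arbitrary choices of ours.

```python
import math


def norm(v):
    """Euclidean length of a 3-vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def s_channel_kinematics(k3, k4):
    """Return (k_s, k_34) with k_s = |k3 + k4| and k_34 = |k3| + |k4|."""
    ks = norm([a + b for a, b in zip(k3, k4)])
    return ks, norm(k3) + norm(k4)


k3 = (0.3, -0.4, 1.2)
direction = (1.0, 2.0, -0.5)  # fixed direction of k4; only its length is varied

for scale in (1.0, 1e-2, 1e-4, 1e-8):
    k4 = tuple(scale * c for c in direction)
    ks, k34 = s_channel_kinematics(k3, k4)
    # The triangle inequality keeps ks <= k34; the ratio tends to 1 as k4 -> 0.
    print(scale, ks / k34)
```

As the length of $\mathbf k_4$ shrinks, the ratio $k_s/k_{34}$ approaches one, so a dependence on $k_s$ can no longer be separated from a dependence on $k_{34}$.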
Readers may wonder why we care about the singularities of correlation functions if they all appear outside of the physical region. In fact, even in scattering amplitudes, a pole does not really appear in the physical region. The usual on-shell mass pole in a tree-level scattering amplitude, if kinematically reachable, implies that the intermediate particle cannot be stable. Its finite lifetime $T$ then has the effect of shifting the position of the on-shell pole off the real axis by an amount $T^{-1}$, which is the well-known Breit-Wigner construction. In this case, the pole off the real axis produces a peak in the scattering amplitude as a function of $s$. As is well known, searching for such peaks is a standard way to find new particles in collider experiments; see Fig. 2(a). Even when the on-shell pole is not kinematically reachable (for example, when $m < m_1 + m_2$ as shown in Fig. 2(b)), the tail of the pole may still be seen in the scattering amplitude near the threshold, so long as the pole is not too far away from the physical region. This is the case, for example, for the pion pole in nucleon-nucleon scattering [108].
In the case of inflation correlators, the appearance of a complex power at $k_s = 0$ has even more interesting consequences: it gives rise to an oscillatory dependence on the logarithm of a momentum ratio when this ratio approaches zero; see Fig. 2(c). Thus it becomes a true signal to be searched for in cosmological data, and this is the ultimate reason that we call it a signal. The main aim of this work is to define, to locate, and to characterize such signals in inflation correlators with an arbitrary number of external points and an arbitrary number of loops. However, before embarking on a full and technical discussion, let us first explain the physics behind these nonlocal signals.
Interpretation of nonlocal signals. There are two useful ways to understand the appearance of the nonlocal signal. Physically, it can be understood as a resonance between the soft massive mode and the adjacent hard modes. Mathematically, it can be understood as a manifestation of the conformal two-point correlator of boundary operators with complex scaling dimensions.
1. Resonance effect. The fast expansion of the spacetime background can trigger spontaneous pair production of massive scalar particles. Technically, the scalar mode function of a fixed comoving momentum develops a negative-frequency component when its physical wavelength $|k\tau|^{-1}$ is redshifted to a value comparable to its Compton wavelength $m^{-1}$. As a result, we find on-shell excitations and a nonzero occupation number for low-momentum modes with $|k\tau| \lesssim m$. This production can also be seen from the propagator of the massive particle in the late-time limit (16):
Here we have retained the terms nonanalytic in $k_s$ at the leading order. If we now sandwich this late-time propagator with the external modes and perform the two time integrals of the two interaction vertices, we get, schematically:
We see that the integral receives most of its contribution from the saddle point $|k_{12}\tau_1| \sim \nu$ and $|k_{34}\tau_2| \sim \nu$, where the nonrelativistic massive mode resonates with the external modes at the two endpoints. This resonance condition tells us that, by sending $k_s/k_{12}$ and $k_s/k_{34} \to 0$, we are effectively probing the $\tau \to 0$ limit of the soft mode. This turns out to be a useful physical intuition for our later analysis. Indeed, the resulting momentum dependence on the right-hand side of (11) is exactly the nonlocal signal shown in (9). This is also called a clock signal in the literature, as the massive mode can be viewed as a clock ticking with a fixed physical frequency, which records the expansion history $a(t)$ through the resonance [10].

2. Conformal two-point correlator. The nonlocal signal can also be understood from a late-time boundary point of view in terms of CFT correlators [94,113,114]. Very often, the massive propagator is covariant under the bulk isometries, which form SO(4,1) in the case of dS. This group is mapped to the conformal symmetry at the future boundary. Thus, when we pull a bulk 2-point function to the boundary, it will naturally satisfy all the boundary conformal Ward identities, and is thus classified by the scaling dimension. For instance, a principal-series scalar with mass $m > 3/2$ in the bulk will give rise to a pair of operators on the boundary with scaling dimensions $\Delta_\pm = \frac{3}{2} \pm i\nu$. Thus, the 2-point function of two $\Delta_\pm$ operators in position space has the form $1/|\mathbf x|^{2\Delta_\pm}$. In momentum space, this translates to $\int d^3x\, e^{-i\mathbf k\cdot\mathbf x}|\mathbf x|^{-2\Delta_\pm} \sim k^{2\Delta_\pm - 3} = k^{\pm 2i\nu}$. Now, by going to a squeezed limit, we are effectively pushing the massive bulk propagator to the boundary, and therefore we do expect to see the $k^{\pm 2i\nu}$ behavior in the correlation function, which is exactly the nonlocal signal. Incidentally, there is also a local part of the two-point function $\propto \delta(\mathbf x)$, which arises from the two-point correlator of $\Delta_\pm$ with its shadow counterpart $\Delta_\mp$ [115]. This local part leads to the local signal in (9). Let us emphasize that we are not assuming any CFT dual of the bulk theory. All we need is the kinematic information of the boundary correlators, which is constrained by the conformal symmetry.
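Both interpretations lend themselves to a quick numerical check. The sketch below is illustrative only. For the resonance picture it assumes that the time integrand carries the phase of $e^{ik\tau}(-\tau)^{i\nu}$ (a representative sign choice of ours, not an expression quoted from the text) and locates the point of stationary phase. For the boundary picture it evaluates $\Delta_\pm$ and the momentum-space exponent $2\Delta_\pm - 3$, which follows from dimensional analysis of the Fourier transform. All function names are hypothetical.

```python
import math


def phase_derivative(tau: float, k: float, nu: float) -> float:
    """d/dtau of the phase k*tau + nu*ln(-tau) of exp(i k tau) * (-tau)**(i nu)."""
    return k + nu / tau


def stationary_time(k: float, nu: float, lo: float = -1e3, hi: float = -1e-9) -> float:
    """Bisection for the stationary-phase time; assumes lo < -nu/k < hi < 0."""
    f_lo = phase_derivative(lo, k, nu)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = phase_derivative(mid, k, nu)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nu_from_mass(m: float) -> float:
    """nu = sqrt(m^2 - 9/4) for a principal-series scalar (H = 1, m > 3/2)."""
    if m <= 1.5:
        raise ValueError("principal series requires m > 3/2")
    return math.sqrt(m * m - 2.25)


def scaling_dimensions(m: float):
    """Pair of boundary scaling dimensions 3/2 +/- i*nu."""
    nu = nu_from_mass(m)
    return complex(1.5, nu), complex(1.5, -nu)


def momentum_exponent(delta: complex) -> complex:
    """Power of k in the 3-dimensional Fourier transform of |x|**(-2*delta)."""
    return 2 * delta - 3


m = 2.5                      # illustrative mass in Hubble units
nu = nu_from_mass(m)
for k in (0.5, 2.0, 8.0):    # the resonance always sits at |k * tau| = nu
    print(k, abs(k * stationary_time(k, nu)))
d_plus, d_minus = scaling_dimensions(m)
print(momentum_exponent(d_plus), momentum_exponent(d_minus))
```

The loop shows that the stationary point tracks $|k\tau| = \nu$ for every $k$, in line with the resonance condition, while the last line gives the exponents $\pm 2i\nu$ of the nonlocal signal.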

2.2 Nonlocal signals in tree graphs
The above observations for the nonlocal signal in a 4-point correlator are readily generalized to arbitrary tree graphs. As we have shown, the essence of a nonlocal signal is that an intermediate massive propagator becomes soft relative to all other propagators it connects to, so that a late-time expansion of this soft propagator can be performed, yielding a nonanalytic term in the soft momentum. In [92], it was shown using the partial Mellin-Barnes (MB) representation that this late-time limit of the internal soft propagator is the only source of the nonlocal signal. We shall generalize these arguments to arbitrary loop graphs in the subsequent sections. Now, focusing on tree graphs, the momentum of any internal line is fully controlled by the external momenta, and therefore, we can adjust the external momenta to make any internal line soft. This makes the computation of nonlocal signals rather straightforward.
In fact, if we adopt the late-time expansion of the soft internal propagator, we can directly work out an explicit expression for its nonlocal signal. So, let us consider an arbitrary tree graph, containing an internal massive scalar line $D_{\mathsf{ab}}(P; \tau_1, \tau_2)$ with soft momentum $P$ and two time variables $\tau_{1,2}$. We assume there are $I_L$ ($I_R$) additional tree internal lines, denoted by $D_{\mathsf{a}\mathsf{a}_i}$, $i = 1, \cdots, I_L$ ($D_{\mathsf{b}\mathsf{b}_j}$, $j = 1, \cdots, I_R$), connected to the vertex at $\tau_1$ ($\tau_2$). There may also be a number of bulk-to-boundary propagators connected to the vertex at $\tau_1$ ($\tau_2$), with total incoming energy $k_L$ ($k_R$). Then, the relevant part of the SK integral for this tree graph can be written as:
Terms not explicitly shown and denoted by "$\cdots$" are irrelevant to our current argument. At this point we have to introduce the explicit form of the bulk massive scalar propagator. The two Wightman functions $D_{\mp\pm}(k; \tau_1, \tau_2)$ are given by (13) and (14):
where $\mathrm H^{(j)}_\nu(z)$ is the Hankel function of the $j$-th kind with $j = 1, 2$, and the two (anti-)time-ordered propagators are constructed out of the two Wightman functions and given by (15):
These are complicated expressions. However, as mentioned above, so far as the nonlocal signal is the only concern, it is valid to expand the soft propagator $D_{\mathsf{ab}}(P; \tau_1, \tau_2)$ in the late-time limit:
Here we have ignored the terms that are analytic in $k$ as $k \to 0$, since we are only interested in the nonanalytic part in the final result. Importantly, the nonanalytic part of the propagator in the soft limit $k \to 0$ is manifestly real. As a result, this part is independent of the two SK indices, and thus independent of the time ordering. This is in nice agreement with the aforementioned physical argument: the two endpoints of a soft propagator are in spacelike separation, where the time ordering is irrelevant. Technically, it means that the two time integrals over $\tau_1$ and $\tau_2$ are factorized.
This is the essence of the cutting rule for inflation correlators. Now, substituting (16) into the SK integral (12), we get the nonanalytic part of the tree graph in the limit P → 0:

Figure 3: The factorization of the nonlocal signal from a tree propagator in a general inflation correlator in the soft limit P → 0. The orange cut on the left graph means to take the nonanalytic part of the soft (blue) propagator. On the right hand side, the three graphs correspond to the left subgraph, the tree signal, and the right subgraph. The two hatched vertices on the left/right subgraphs denote the complex-power-of-time insertions, i.e., the factors in blue in (17).

This shows that, in the soft limit P → 0, the tree diagram possesses a nonlocal signal $\propto P^{\pm 2i\nu}$. The signal factorizes into three pieces: a left subgraph and a right subgraph, given respectively by the expressions in the first and the second curly brackets in (17), as well as a nonanalytic "signal," given by $\frac{1}{4\pi}\Gamma^2(\mp i\nu)(P/2)^{\pm 2i\nu}$. This can be viewed as an on-shell factorization theorem for inflation correlators at the tree level, which is very similar to the on-shell factorization of scattering amplitudes discussed previously. We illustrate the factorization of the nonlocal signal in Fig. 3.
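The origin of the factor $\frac{1}{4\pi}\Gamma^2(\mp i\nu)(P/2)^{\pm 2i\nu}$ can be checked with a short worked expansion. The sketch below is not a reproduction of (13)-(16); it assumes the standard Bunch-Davies mode function with the Hubble parameter set to 1, and uses only the small-argument limit of the Hankel functions. The symbols $\sigma$ and $z_{1,2}$ are introduced here purely for illustration.

```latex
% Assumption: Bunch-Davies mode function with H = 1,
%   \sigma(k,\tau) = \frac{\sqrt{\pi}}{2} e^{-\pi\nu/2} (-\tau)^{3/2} H^{(1)}_{i\nu}(-k\tau).
\begin{align}
D_{-+}(k;\tau_1,\tau_2)
  &= \sigma(k,\tau_1)\,\sigma^*(k,\tau_2)
   = \frac{\pi}{4}\, e^{-\pi\nu} (\tau_1\tau_2)^{3/2}\,
     H^{(1)}_{i\nu}(-k\tau_1)\, H^{(2)}_{-i\nu}(-k\tau_2), \\
% Small-argument limits (z -> 0, real positive z):
H^{(1)}_{i\nu}(z)
  &\simeq -\frac{i}{\pi}\Big[\Gamma(i\nu)\,(z/2)^{-i\nu}
        + e^{\pi\nu}\,\Gamma(-i\nu)\,(z/2)^{+i\nu}\Big], \\
H^{(2)}_{-i\nu}(z)
  &\simeq +\frac{i}{\pi}\Big[\Gamma(-i\nu)\,(z/2)^{+i\nu}
        + e^{\pi\nu}\,\Gamma(i\nu)\,(z/2)^{-i\nu}\Big].
\end{align}
% With z_1 = -k\tau_1 and z_2 = -k\tau_2, the product contains four terms.
% Two of them depend on z_1/z_2 only, hence are independent of k (analytic in k).
% The other two carry (z_1 z_2/4)^{\pm i\nu} = (k/2)^{\pm 2i\nu}(\tau_1\tau_2)^{\pm i\nu}:
\begin{equation}
D_{-+}(k;\tau_1,\tau_2)\Big|_{\text{nonanalytic in }k}
  \simeq \frac{1}{4\pi}\,(\tau_1\tau_2)^{3/2}
     \Big[\Gamma^2(-i\nu)\Big(\frac{k}{2}\Big)^{2i\nu}(\tau_1\tau_2)^{i\nu}
        + \text{c.c.}\Big].
\end{equation}
```

The bracket is a number plus its complex conjugate, so this part is real and symmetric under $\tau_1 \leftrightarrow \tau_2$. It is therefore the same for $D_{+-} = D_{-+}^*$ and for the time-ordered combinations, which is the statement that the nonanalytic part does not depend on the SK indices.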
The above result can be pushed to all orders in $k_s$. In principle, we can simply do the late-time expansion of the soft propagator to all orders in $k_s$ and keep the nonanalytic part as in (16). In practice, we find it more straightforward to work with the partial Mellin-Barnes representation. In this formalism, the leading signal (17) corresponds to the residue of the leading IR poles in the Mellin variables, while the higher order terms can be found by including residues from more IR poles. (Don't worry about the jargon; more explanations will be given in subsequent sections.) The result is: Here the left and right subgraphs $G^{(L)}_{n_1}$ and $G^{(R)}_{n_2}$ are obtained from the leading order results in (17), by replacing the factor $(-\tau_1)^{p_1+3/2+ic\nu} \to (-\tau_1)^{p_1+3/2+ic\nu+2n_1}$ and $(-\tau_2)^{p_2+3/2+ic\nu} \to (-\tau_2)^{p_2+3/2+ic\nu+2n_2}$, respectively.
The tree-level result in (17) can also be easily generalized to arbitrary loop graphs so long as we are cutting a tree line, because the momentum of a tree line can always be controlled by the external momenta. One may find it slightly uncomfortable if the soft internal line is connected to loops at the two sides, as in this case the adjacent bulk propagators in the loop do not have definite momenta, and the resonance argument in the last subsection does not immediately apply to this case. However, so long as one still accepts that the nonlocal signal comes entirely from the late-time limit of the soft tree propagator, one can still derive the on-shell factorization of the nonlocal signal following exactly the same steps to get (17). As mentioned above, the validity of this late-time expansion can be more rigorously justified using the partial MB representation. This generalization to arbitrary loop graphs when cutting a tree line is implicitly included in Fig. 3: The two blobs can contain arbitrary loop structures.
It is when we try to extract nonlocal signals in loop propagators in a general loop graph that we find a real need to go beyond the simple late-time expansion, and to develop more advanced techniques. As we showed in previous works [51,92,94], the partial MB representation is suitable for this task. So, in the subsequent sections, we will use this method to develop a general algorithm and a theorem to detect and also to compute leading nonlocal signals in arbitrary loop graphs. Obviously, many technical details are involved. To help the readers keep track of what is going on, we will devote the subsection below to a brief discussion of the basic physical picture of nonlocal signals in loop diagrams.

Nonlocal signals in loop graphs
Intuitively, we can imagine that a loop inflation correlator also possesses nonlocal signals when some of its loop propagators become soft. In such situations, a late-time expansion of the soft loop propagator would be possible, and one can again generate oscillatory signals through the interference between the late-time massive modes and the effectively massless modes at an interaction vertex. However, here we immediately face a complication: The momentum of a loop propagator is not uniquely fixed by the choice of external momenta. Even if we can set up a momentum configuration such that the momentum transfer from one part of the graph to the other part is soft, the momentum of each individual loop line is not guaranteed to be soft. Given that the loop momentum can in principle be arbitrarily high, it is not immediately justified that we can make a late-time expansion of any internal loop propagator.
More careful analysis with partial MB representation for 1-loop graphs reveals that, for the purpose of generating a nonlocal signal from a loop process, it is not essential to make any loop propagator soft. Instead, what is important is that the external momentum configuration should be suitably chosen, such that the sum of momenta of at least two loop propagators is forced to approach zero [94].
This observation can be generalized to graphs with an arbitrary number of loops. The result is similar: To generate a nonlocal signal, one should take a soft limit of the external momentum configuration, such that the sum of momenta in several loop propagators is forced to approach zero. In this case, we can always make use of the freedom of redefining loop momentum variables to make all these loop lines soft simultaneously. Consequently, there is always a parameter space where all these loop lines go on shell in the sense described around (10). It is trivial to see that, when all these loop lines become soft with their momentum approaching zero, we can simply remove them without spoiling the momentum conservation in the whole graph. That is, we can take a cut of the graph. Of course, there are additional technical requirements for this cut to generate a nonlocal signal which will be detailed later, but this simple line of reasoning forms the intuitive basis behind our signal-detection algorithm to be introduced in Sec. 3.
In short, our signal-detection algorithm says that we take all possible nonlocal cuts of a graph with respect to a given bipartition. Then, each cut is a candidate of the nonlocal signal in the sense that the parameter region with all cut lines getting soft could (though not always) generate a complex-power contribution to the graph. It still remains to find an explicit expression for the nonlocal signal from a given cut. We explore this topic in Sec. 4 and Sec. 5, where we present and prove various versions of the on-shell factorization theorem for general loop graphs, where the signal generated from a (D − 1)-loop cut can be analytically computed, and is called a "melon signal," as shown in Fig. 4. This is exactly the loop version of the tree-level factorization shown in (17) and in Fig. 3.

Figure 4: An illustration of the factorization theorem (66) of the nonlocal signal in an arbitrary loop level inflation correlator in the soft limit P → 0. The orange cut on the left graph means to take the nonanalytic part of the soft (blue) propagator. On the right hand side, the three graphs correspond to the left subgraph (46), the melon signal (45), and the right subgraph (47). The hatched vertices on the left/right subgraphs denote the complex-power-of-time insertions.

It may come as a surprise that the phenomenon of on-shell factorization appears at loop levels for inflation correlators. In flat-space scattering amplitudes, putting loop lines on shell yields branch cuts rather than poles. The optical theorem then relates the discontinuity of this branch cut to a total cross section, where we need to sum over all possible final states. This summation over states shows that the factorization no longer holds when cutting loop lines. The situation for the inflation correlator is in fact similar: We also need to sum over all possible "intermediate states," which correspond to infinite towers of modes with definite scaling dimensions. For a melon graph of D lines, this scaling dimension can be organized as 3 However, an important point here is that, in the squeezed limit, only a finite number of terms, namely those with n = 0, are dominant, and we call them the leading signals. Then, each term in the leading signal is factorized.

Defining nonlocal signal
Above we have explained the occurrence of nonlocal signals in a given inflation correlator in a heuristic way. With this physical picture in mind, now we are going to develop the formalism to systematically investigate the nonlocal signals. The first task is to precisely define what we mean by a nonlocal signal in an arbitrary inflation correlator. This is the main topic of the current subsection.
Basic setup. We first introduce the basic setup. We shall begin with a B-point inflation correlator, by which we mean a correlation function of B boundary operators φ (i) (i = 1, · · · , B).
To make the nonlocal signal distinct from local signals, we require B ≥ 4. (See comments below (9).) The superscript (i) denotes the species of the boundary operator. Thus these B operators can be either identical or distinct. For CC applications, these boundary operators correspond to bulk fields which have a nonvanishing boundary limit, such as a (nearly) massless scalar field, including the inflaton fluctuation, or the massless spin-2 graviton. For theoretical investigations, one also frequently considers the case where φ is a conformal scalar, namely a scalar of effective mass m² = 2 in dS with 3 spatial dimensions. A conformal scalar dies away at the future boundary τ → 0 as φ ∝ |τ|, and therefore it does not immediately correspond to a realistic CC scenario. However, cases with conformal-scalar external modes are often simplest from a technical viewpoint, and conformal-scalar correlators are interesting theoretical objects in their own right.
We shall assume that the translation and rotation of the 3-dimensional spatial slices are kept by all classical backgrounds in the problem, and therefore it is beneficial to work in the 3-momentum space. Let us work in the Fourier space and assume that each of the B operators φ (i) k i carries a fixed 3-momentum k i . Clearly, the total momentum should be conserved, so the B-point correlator should be proportional to an overall δ-function of total momentum conservation. So, we can write the B-point correlator as: We assume that the cluster decomposition holds for this correlator: The function T ({k}) does not contain further momentum-conserving δ-function factors. All such factors are classified as "disconnected contributions." In this work, we shall always assume that the bulk theory is a weakly coupled quantum field theory, such that the weak coupling expansion is valid at least as an asymptotic series. In such cases, we can approximate the amplitude T ({k}) as a sum of connected graphs truncated at a finite order in the number of loops: Here G({k}) is the amplitude of an individual graph. Below we do not distinguish between the graph and the associated amplitude, and we shall directly call G({k}) a graph without causing any confusion. Given the perturbativity of the problem, it is legitimate to work with individual graph G({k}) instead of the total amplitude T ({k}). The analysis below will exclusively focus on individual graphs. The readers should keep in mind that the full result of the amplitude should be the sum of all diagrams to a desired order in the perturbation theory.
In this work, we shall mostly focus on the case where the internal lines of a graph G({k}) propagate scalars σ of arbitrary mass m > 3/2. The couplings among these scalars and with the boundary operator φ are assumed to be local, in the sense that the spatial or temporal derivatives only appear as polynomials of finite degree.
Nonlocal bipartition and nonlocal soft limit. The nonanalyticity we shall consider in this work arises from a particular soft limit where the sum of a subset of external momenta goes to zero. To characterize this type of soft limits more precisely, we introduce the concept of nonlocal bipartition of the graph and the corresponding nonlocal soft limit.
Let G({k}) be a graph of B ≥ 4 external points and L independent loops as prescribed above. Here {k} denotes the collection of all 3-momenta of the B external legs. Then, a nonlocal bipartition of the graph G means a separation of the set {k} into two disjoint subsets, {k} = {k (L) } ∪ {k (R) }, where {k (L) } and {k (R) } have B L and B R elements, respectively. As a part of the definition of the nonlocal bipartition, we demand B L ≥ 2 and B R ≥ 2. Clearly, B = B L + B R , and therefore, a nonlocal bipartition is possible only when B ≥ 4. As discussed in the previous section, this assumption is necessary in order to distinguish nonlocal signals from local signals.
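The counting implied by this definition can be made concrete with a small enumeration. The sketch below lists all nonlocal bipartitions of a set of external legs; the function name `nonlocal_bipartitions` and the string labels are hypothetical, and it assumes that swapping the roles of left and right gives the same bipartition (the two choices correspond to P and −P).

```python
from itertools import combinations


def nonlocal_bipartitions(labels):
    """Enumerate splits of the external legs into (left, right) with
    at least two legs on each side.

    Assumption: (left, right) and (right, left) are counted once, which is
    enforced by always placing the first leg in the left subset.
    """
    labels = list(labels)
    first, rest = labels[0], labels[1:]
    result = []
    # 'extra' legs join the first leg on the left; the left size is
    # len(extra) + 1 >= 2 and the right size is B - len(extra) - 1 >= 2.
    for n_extra in range(1, len(labels) - 2):
        for extra in combinations(rest, n_extra):
            left = (first,) + extra
            right = tuple(x for x in rest if x not in extra)
            result.append((left, right))
    return result


# Example: the three channels of a 4-point correlator.
four_point = nonlocal_bipartitions(["k1", "k2", "k3", "k4"])
for left, right in four_point:
    print(left, "|", right)
```

For B legs this yields 2^(B−1) − 1 − B bipartitions, so none exist for B = 3, in line with the requirement B ≥ 4.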
We then introduce the concept of partial sum of external momenta. By a partial sum, we mean a linear combination:

$$P_i \equiv \sum_{j=1}^{B} \beta_{ij} k_j, \qquad (21)$$

where the coefficients β ij take values from {0, +1}, with the requirement that at least one β ij = 0 for any given i. We shall denote the set of all partial sums of the external momenta by {P}. Now, we are ready to introduce the concept of the nonlocal soft limit. By a nonlocal soft limit associated with the nonlocal bipartition {k (L) } ∪ {k (R) }, we mean the limit specified by the following two conditions:
1. The partial sum $P \equiv \sum k^{(L)} \to 0$ (which then automatically implies $\sum k^{(R)} = -P \to 0$ due to the total momentum conservation);
2. All external momenta k j and all partial sums P i except ±P remain finite in the physical region.
Thus, by a nonlocal soft limit, we are considering a particular partial sum P being much softer than any external momenta and any other independent partial sums of them. That is, we only consider one soft configuration at a time, and this turns out to be a useful strategy to isolate nonlocal signals of the graph and analyze them one by one. Below whenever we mention a nonlocal soft limit P → 0, we always mean that the above two conditions are met.
Nonlocal signal. Now we introduce our definition of the nonlocal signal. As we shall see below, schematically, in a given nonlocal soft limit P → 0, the graph G in general has the following behavior: Here A b|ℓ ({k}) is a generally complex amplitude analytic and finite at P = 0. So, it allows a Taylor expansion in P with a nonvanishing O(P 0 ) term. α ℓ and ω ℓ are both real and positive numbers. The complex exponent α ℓ ± iω ℓ then implies the existence of a branch cut on the negative real axis of P , and also a logarithmic oscillatory signal on the real positive axis. This kind of nonanalyticity is the central topic of this work. As is made clear, different terms in the ℓ-summation are distinguished by different values of α ℓ ± iω ℓ , and we shall call each pair of terms with b = ± in the ℓ-summation a nonlocal signal. We will sometimes just call it a signal when no confusion could arise. This is clearly a generalization of the nonlocal signal defined for a tree diagram in the last section. It can be shown that the total number C of nonlocal signals in a given nonlocal soft limit of a given graph is finite. Let us organize these C signals such that α 1 ≤ α 2 ≤ · · · ≤ α C . Clearly, the nonlocal signal with the smallest α ℓ gives the leading contribution. We shall call such signal(s) the leading signal. It is possible that the leading signals are not unique. For instance, we may have α 1 = α 2 < α 3 < · · · . A major task of this work is to work out as explicitly as possible the form of the leading signal in a given nonlocal soft limit of an arbitrary loop graph. Below, we will first propose an algorithm to "locate" the origin of nonlocal signals. Then, in subsequent sections, we shall refine our result by giving more explicit expressions for the nonlocal signals for various special cases.
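To see why a complex exponent produces both a branch cut and a logarithmic oscillation, it may help to isolate a single pair of terms. The block below is a schematic illustration only: it assumes the two coefficients of the pair are complex conjugates of each other ($A_{-|\ell} = A_{+|\ell}^*$), which need not hold for every graph, and it treats them as constants near $P = 0$.

```latex
% One signal pair, with the assumption A_{-|\ell} = A_{+|\ell}^* \equiv |A_\ell| e^{-i\vartheta_\ell}:
\begin{align}
A_{+|\ell}\,P^{\alpha_\ell + i\omega_\ell} + A_{-|\ell}\,P^{\alpha_\ell - i\omega_\ell}
  &= P^{\alpha_\ell}\Big[A_{+|\ell}\,e^{+i\omega_\ell\log P} + A_{+|\ell}^{*}\,e^{-i\omega_\ell\log P}\Big] \nonumber\\
  &= 2\,|A_\ell|\,P^{\alpha_\ell}\cos\!\big(\omega_\ell\log P + \vartheta_\ell\big),
  \qquad P > 0 .
\end{align}
% Monodromy around P = 0: continuing P -> e^{2\pi i} P gives
\begin{equation}
P^{\alpha_\ell + i\omega_\ell} \;\longrightarrow\;
e^{2\pi i\alpha_\ell}\,e^{-2\pi\omega_\ell}\,P^{\alpha_\ell + i\omega_\ell},
\end{equation}
% which differs from 1 whenever \omega_\ell > 0, so P = 0 is a branch point.
```

On the positive real axis the pair thus oscillates in $\log P$ with frequency $\omega_\ell$ under the envelope $P^{\alpha_\ell}$, which is why the smallest $\alpha_\ell$ dominates as $P \to 0$.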

Signal detection algorithm
Given the complexity of an arbitrary loop diagram, not all nonlocal soft limits contain nonlocal signals. Thus, it would be useful first to have a practical algorithm to detect and locate all nonlocal signals in a given soft limit. We shall now present such an algorithm. At the end of this subsection, we shall introduce a theorem of signal detection, showing that our algorithm is exhaustive.
Nonlocal cut and its degree. The key concept of our algorithm is the nonlocal cut. Given a bipartition {k (L) } ∪ {k (R) } of the graph G({k}), we define a nonlocal cut of the graph with respect to this bipartition to be the following operation on the graph: We remove some of the bulk propagators from the graph such that the resulting cut graph satisfies the two conditions:
1. The left set {k (L) } is totally disconnected from the right set {k (R) };
2. The left set {k (L) } itself is fully connected, and so is the right set {k (R) }.
A removed line in this operation is called a cut line, and the number of lines removed in the operation is called the degree of the cut.
Irreducible cut and minimal cut. It is useful in the following analysis to introduce the concept of the irreducible cut and the minimal cut.
By an irreducible cut, we mean a nonlocal cut which ceases to be a nonlocal cut if we put any one cut line back into the cut graph. Clearly, a nonlocal cut is either irreducible or reducible, and any nonlocal cut can be made irreducible by putting some cut lines back. However, we note that the resulting irreducible cut might not be unique. There may be many distinct ways to reduce a nonlocal cut, and they may end up with distinct irreducible cuts. For instance, in Fig. 5, we can get a nonlocal cut by removing all horizontal internal lines. This nonlocal cut can be reduced to either of the two distinct irreducible cuts in the left and middle diagrams of Fig. 5.
By a minimal cut, we mean a nonlocal cut of the smallest degree for a given nonlocal bipartition. Clearly, a minimal cut is irreducible, but an irreducible cut does not have to be minimal. See Fig. 5. Also, we note that the minimal cuts may not be unique for a given bipartition. For example, in the double triangle diagram in Fig. 13, we can cut the two skew lines in either of the left and right triangle loops. Each of them is a minimal cut. However, if there is a unique irreducible cut for a given nonlocal bipartition, then it is automatically a minimal cut, and it is also the unique minimal cut for the given bipartition.
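The definitions of nonlocal, irreducible, and minimal cuts can be explored by brute force on small graphs. The following is a minimal sketch, not the algorithm of this work: it represents a graph by its vertices and bulk propagators only, records which vertices carry left or right external legs, and tests every subset of internal lines. All function names and the toy graph are hypothetical. It reads "fully connected" as "all legs of that side lie in one connected component" and assumes the uncut graph is connected.

```python
from itertools import combinations


def components(vertices, edges):
    """Label each vertex by its connected component (simple union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return {v: find(v) for v in vertices}


def is_nonlocal_cut(vertices, edges, cut, left, right):
    """Check the two conditions on the graph with the lines in 'cut' removed."""
    kept = [e for i, e in enumerate(edges) if i not in cut]
    comp = components(vertices, kept)
    left_labels = {comp[v] for v in left}
    right_labels = {comp[v] for v in right}
    # Each side sits in a single component, and the two components differ.
    return (len(left_labels) == 1 and len(right_labels) == 1
            and left_labels != right_labels)


def nonlocal_cuts(vertices, edges, left, right):
    n = len(edges)
    return [frozenset(c)
            for degree in range(1, n + 1)
            for c in combinations(range(n), degree)
            if is_nonlocal_cut(vertices, edges, frozenset(c), left, right)]


def irreducible_cuts(cuts):
    """Cuts that stop being nonlocal cuts when any one line is put back."""
    cut_set = set(cuts)
    return [c for c in cuts if all((c - {i}) not in cut_set for i in c)]


def minimal_cuts(cuts):
    smallest = min(len(c) for c in cuts)
    return [c for c in cuts if len(c) == smallest]


# Toy graph: a bubble between vertices a and c, followed by a tree line c-b.
# Left external legs attach to a, right external legs attach to b.
vertices = ["a", "b", "c"]
edges = [("a", "c"), ("a", "c"), ("c", "b")]  # internal lines 0, 1, 2
cuts = nonlocal_cuts(vertices, edges, left={"a"}, right={"b"})
print("irreducible:", irreducible_cuts(cuts))
print("minimal:", minimal_cuts(cuts))
```

In this toy graph the tree line alone is the unique minimal cut, while cutting both bubble lines is irreducible but not minimal, which illustrates the distinction drawn above. The enumeration is exponential in the number of internal lines and is meant only for small examples.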
Signal detection algorithm. Now we are ready to present one main result of this work, namely an algorithm to detect all possible nonlocal signals in an arbitrary graph. We state it as the following theorem.

Theorem 1.
1. The graph is an analytic function of P in the interior of the physical region.
2. In the limit P → 0, every nonlocal cut with respect to the nonlocal bipartition {k (L) } ∪ {k (R) } gives rise to a candidate of nonlocal signal when all cut lines become soft simultaneously.
The first part of this theorem ensures that no nonanalytic behavior appears in the interior of the physical region. So, to find a nonlocal signal, we do need to consider a soft limit. Then, in the second part, the theorem tells us that the nonlocal signal when P → 0 can be exhausted by enumerating all possible nonlocal cuts for a given nonlocal bipartition of the graph. When all the cut lines of a nonlocal cut become soft simultaneously, they could (though not always) contribute a nonlocal signal together. Whether such soft configurations really lead to nonlocal signals also depends on other details of the graph such as interaction types. Examples of nonlocal cuts with or without nonlocal signals will be given in Sec. 6. So, the existence of nonlocal cuts is a necessary but not sufficient condition for nonlocal signals. Also, the above theorem only tells us where nonlocal signals could possibly appear; it does not give us an explicit expression for these signals. All these remaining questions will be addressed in the following two sections.

Proof of the signal detection theorem
Now we are going to prove Theorem 1. The proof is somewhat lengthy. So, we provide an outline of the proof here, to give readers a quick idea of what is going on:
1. We first specify the diagram we are going to analyze, and then introduce the key tool, namely the partial MB representation. In one sentence, the partial MB representation means that we use the MB representation for all the bulk propagators, but leave all bulk-to-boundary propagators unchanged. The partial MB representation allows us to isolate a relatively simple loop momentum integral, which encodes all the dependences on the momentum partial sums. Therefore, to analyze the analyticity of the graph as a function of momentum partial sums, we can focus on the loop integral only.
2. We then present a lemma of degenerate singularity, which asserts that all possible nonanalytic behavior of the loop integral as a function of momentum partial sums must come from a region where two or more internal propagators carry linearly dependent momenta and these momenta all become soft simultaneously. We call such regions in the loop integral space degenerate regions, and the resultant nonanalytic behavior the degenerate singularity. Phrased in another way, a necessary condition for the existence of a nonlocal signal is the existence of a degenerate region.
3. We then show that each degenerate region of the loop integral space leads to a nonlocal cut and vice versa. Thus, the existence of a nonlocal cut is a necessary condition of a nonlocal signal. Therefore, by enumerating all possible nonlocal cuts with a given bipartition, we are guaranteed to get all possible nonlocal signals in the corresponding soft limit.
Below we carry out these steps in detail.
Specification of the graph. We assume that the graph G({k}) has B external legs, I internal legs, V vertices, and L independent loops. For definiteness, we take the external legs to be the bulk-to-boundary propagators of a conformal scalar with m² = 2: On the other hand, we take all of the I internal lines to be bulk propagators of massive scalars in the principal series, i.e., m i > 3/2 (i = 1, · · · , I). For these so-called principal scalars we define the mass parameter $\nu \equiv \sqrt{m^2 - 9/4}$, which is a positive real number. Then, the SK bulk propagators of a principal scalar with mass parameter ν are given as in (13)-(15). Below, we shall also call the two Wightman functions D ∓± in (13) and (14) the homogeneous propagators, since they are solutions to the homogeneous scalar equation of motion. On the other hand, the Feynman and anti-Feynman propagators D ±± in (15) are also called inhomogeneous propagators, as they are solutions to the scalar field equation with a δ-function source.
The interaction vertices can be taken to be arbitrary local couplings for the validity of the theorem, which can have arbitrary polynomial dependences on the space and time derivatives, and also general polynomial dependences on the time variable of the vertex, so long as the integral is well behaved in the late-time limit. (If the theory possesses the dilation symmetry, then the explicit time dependence of an interaction vertex is uniquely fixed.) Purely for notational simplicity, we shall take all interactions to be direct couplings with arbitrary power dependences on τ i , namely $(-\tau_i)^{p_i}$ for the i'th vertex. Again, it is understood that p i is always chosen such that the integral is well behaved in the late-time limit.
With all the propagators and interactions specified, we can write down the SK integral for the graph G({k}) as: Here a i = ± (i = 1, · · · , V ) is the SK index of the i'th vertex. We take the L independent loop momentum variables to be q k (k = 1, · · · , L). The momentum flowing in the ℓ'th bulk propagator is denoted as p ℓ . Also, in the bulk propagator D a ℓ1 a ℓ2 (p ℓ ; τ ℓ1 , τ ℓ2 ), the two time variables τ ℓ1 , τ ℓ2 should be identified with the time variables of the two vertices to which the propagator is attached, respectively, so are the two SK indices. The same identifications are also made for the time variables and SK indices of the bulk-to-boundary propagators C a j (k j ; τ j ).
Partial Mellin-Barnes representation. The basic tool of the proof is the partial MB representation introduced in [51,92]. In short, the partial MB representation means that we use the MB representation for all the bulk propagators. In particular, for the two homogeneous propagators, we have and the partial MB representation for the two inhomogeneous propagators can be obtained by substituting (25) into (15). After adopting the partial MB representation for all bulk propagators, the graph (24) takes the following form: Here we have used $(s_i, \bar s_i)$ as a pair of Mellin variables for the i'th internal propagator, and we have suppressed the SK-index dependences in the integrand for clarity. The integral then breaks into several pieces. The Mellin-space time integral $T(\{k\};\{s,\bar s\})$ takes the following form: where the function N (τ 1 , · · · , τ V ), called the nesting function, denotes all possible combinations of θ functions that order the time variables. The factor $e^{i a_i E_i \tau_i}$ comes from the bulk-to-boundary propagators C a j (k j ; τ j ), where E i denotes the sum of the magnitudes of all 3-momenta of the bulk-to-boundary propagators attached to the i'th vertex. In the power function $(-\tau_i)^{P_i - 2S_i}$, P i is a fixed real number which receives three contributions: which can be understood by looking at (24), (23), and (25). On the other hand, the term S i in the exponent of $(-\tau_i)^{P_i - 2S_i}$ denotes the sum of all Mellin variables for the bulk propagators attached at the i'th vertex. We note in passing that, in the time integral T, all dependences on external momenta are through the "partial energies" E i (i = 1, · · · , V ), namely the partial sums of the magnitudes of external momenta. The graph G({k}) does have nonanalytic behavior in the partial energies which, at the boundary of physical regions, corresponds to the so-called "local signals." We will study the local signals elsewhere.
Next, we have a factor of loop momentum integral $L(\{k\};\{s,\bar s\})$: where p i is the momentum of the i'th internal propagator. We emphasize that the Mellin-space loop integral $L(\{k\};\{s,\bar s\})$ is completely independent of SK indices. Finally, we collect all other time- and momentum-independent factors into $D(s_i,\bar s_i)$, including the exponential factors and Γ factors from the MB representation of the bulk propagators as in (25).
The decomposition of the graph G({k}) into different pieces in (26) is what makes the partial MB representation useful for our analysis. For example, in this work, we are interested in the analyticity of G({k}) as a function of partial sums of external momenta. Now, (26) tells us that only the loop integral $L(\{k\};\{s,\bar s\})$ could possibly depend on the partial sums of external momenta. Therefore, we only need to make a careful analysis of $L(\{k\};\{s,\bar s\})$ for our purpose, which is what we shall do as the next step.
Parametrization of internal momenta. We can choose loop momentum variables q ℓ (ℓ = 1, · · · , L) in such a way that the momentum p i of the i'th loop propagator (i = 1, · · · , I) takes the following form:

$$p_i = \sum_{\ell=1}^{L} \alpha_{i\ell}\, q_\ell + \sum_{j=1}^{B} \beta_{ij}\, k_j, \qquad (30)$$

where the coefficients α iℓ take values from {0, ±1}. The representation in the second sum $\sum_j \beta_{ij} k_j$ is not unique due to the total momentum conservation $\sum_j k_j = 0$. This freedom allows us to choose a representation in which β ij takes value from {0, +1} (see footnote 3). Then, the combination $\sum_j \beta_{ij} k_j$ is simply a partial sum of external momenta as defined in (21). Now, we want to understand the analyticity of G({k}) as a function of a particular partial sum P at either finite P in the physical region or at P = 0, with all other partial sums $P_i \neq \pm P$ held fixed and finite. Therefore, let us write

$$P = \bar P + \delta P, \qquad (31)$$

where $\bar P$ is either finite or zero, and we want to understand what happens when we send δP → 0.
A first observation is that the momentum p i of the i'th internal propagator, as given in (30), can be Taylor expanded around δP = 0, with all O(δP²) and higher order terms vanishing. Let us denote the zeroth order term in this expansion as $\bar p_i$; then, we can write:

$$p_i = \bar p_i + \gamma_i\, \delta P, \qquad (32)$$

where γ i is a momentum-independent number. We note that all loop momentum dependences in the above expansion are in the zeroth order term $\bar p_i$. It turns out that all the nonanalyticity of the loop integral $L(\{k\};\{s,\bar s\})$ as P → 0 comes from a particular region in the loop momentum space, where two or more linearly dependent internal momenta become soft simultaneously. We call a singularity from such a region the degenerate singularity. Below, we use a lemma to make this point more explicit.
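As a concrete instance of this parametrization, consider a 1-loop bubble with four external legs and the bipartition $\{k_1,k_2\}\cup\{k_3,k_4\}$. The momentum routing below is one illustrative choice among several equivalent ones; $\bar p_i$ denotes the part of $p_i$ that survives at $\delta P = 0$.

```latex
% Illustrative routing (an assumption): the loop momentum q flows through line 1,
% and the partial sum P = k_1 + k_2 is carried by line 2.
\begin{align}
p_1 &= q,          & \alpha_{11} &= 1, & \beta_{1j} &= 0 \ \ (j=1,\dots,4),\\
p_2 &= q + k_1 + k_2 = q + P, & \alpha_{21} &= 1, & \beta_{21} &= \beta_{22} = 1,\ \ \beta_{23}=\beta_{24}=0 .
\end{align}
% Writing P = \bar P + \delta P:
\begin{align}
p_1 &= \bar p_1 + \gamma_1\,\delta P, & \bar p_1 &= q,          & \gamma_1 &= 0,\\
p_2 &= \bar p_2 + \gamma_2\,\delta P, & \bar p_2 &= q + \bar P, & \gamma_2 &= 1 .
\end{align}
% In the nonlocal soft limit \bar P = 0 one has \bar p_1 = \bar p_2 = q:
% the two loop lines carry linearly dependent momenta and become soft together when |q| is small.
```

For $\bar P \neq 0$ the two lines cannot both be soft once $|\bar P|$ exceeds twice the softness scale, so the simultaneous soft region exists only in the soft limit $\bar P = 0$.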
Degenerate singularity. Now suppose $\bar P$ is fixed at a physical value either at or away from 0. By construction of the nonlocal soft limit, all partial sums of external momenta P i except P (and, trivially, $-P = \sum k^{(R)}$) are finite. This fact allows us to introduce a comoving cutoff Λ in the momentum integral, which satisfies

Now, the loop integral is performed over a 3L-dimensional space Q spanned by L independent loop momenta. To analyze the analyticity of the integral, we separate this loop momentum space into different regions. Specifically, for the i'th internal propagator with momentum p i given in (30), we define a soft zone U i , which is a codimension-3 sheet with O(Λ) thickness: Here $\bar p_i \equiv |\bar p_i|$ denotes the magnitude of the vector $\bar p_i$ defined in (32). Then, the whole loop momentum space Q can be separated into the following four types of regions:
1. Hard region, which is the region outside of all soft zones, namely $Q - \bigcup_{i=1}^{L} U_i$. In this region, the momenta of all loop propagators are harder than Λ.
2. Singly soft region, which is the interior of a given soft zone U i but with all zone-intersections removed, namely $U_i - \bigcup_{j\neq i} U_j$. In this region, there is a single loop propagator with momentum p i softer than Λ, while the momenta of all other loop propagators are harder than Λ.
3. Nondegenerate region. A nondegenerate region could appear when $\bigcap_{i\in I} U_i \neq \emptyset$ for a label set I ⊂ {1, · · · , L} that contains at least two elements. An intersection $\bigcap_{i\in I} U_i$ is called nondegenerate, if all momenta p i with i ∈ I are linearly independent. In this region, the propagators with momenta p i (i ∈ I) are softer than Λ, while all other loop propagators have momenta harder than Λ.
4. Degenerate region. A degenerate region means a nonempty intersection $\bigcap_{i\in I} U_i$ where all momenta p i with i ∈ I are linearly dependent. The relative softness or hardness of momenta in this region is similar to the nondegenerate region, but the degeneracy of p i with i ∈ I makes a huge difference in the analytic property of the loop momentum integral, as will be detailed below.

Footnote 3. Proof: We can remove the first L internal propagators carrying the loop momenta q i and set q i = 0 for simplicity (i = 1, · · · , L). The resultant graph is a tree graph, whose internal propagators all have fixed momenta determined by the external momenta k. Then, for any internal line with momentum p i in this tree diagram, there are two possibilities. Either p i = 0, meaning β ij = 0 for all j, or p i ≠ 0 and is determined by the external momenta {k}. In the latter case, removing this internal line with momentum p i would give a bipartition of the graph {k} → {k (L) } ∪ {k (R) }, and therefore, we have $p_i = \sum k^{(L)}$. Thus we see that, for this internal line, β ij = +1 for all k j ∈ {k (L) } and β ij = 0 for all k j ∈ {k (R) }.
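The four-way separation can be turned into a small classifier for a single point of the loop momentum space. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, each internal momentum at δP = 0 is taken to be a sum of loop momenta with integer coefficients `alpha` plus a fixed external offset, and "linearly dependent" is read as a rank deficiency of the `alpha` rows of the soft lines.

```python
from fractions import Fraction
from math import sqrt


def rank(rows):
    """Rank of an integer matrix by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                f = m[r][col] / m[rk][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def classify_region(alpha, offsets, loop_momenta, cutoff):
    """Classify one point of loop momentum space.

    alpha[i][l]     : coefficient of loop momentum l in internal line i
    offsets[i]      : fixed external 3-momentum carried by line i at deltaP = 0
    loop_momenta[l] : 3-vector value of loop momentum l
    cutoff          : the softness scale (the cutoff Lambda of the text)
    """
    soft = []
    for i, (row, off) in enumerate(zip(alpha, offsets)):
        p = [off[c] + sum(a * q[c] for a, q in zip(row, loop_momenta))
             for c in range(3)]
        if sqrt(sum(x * x for x in p)) < cutoff:
            soft.append(i)
    if not soft:
        return "hard", soft
    if len(soft) == 1:
        return "singly soft", soft
    if rank([alpha[i] for i in soft]) == len(soft):
        return "nondegenerate", soft
    return "degenerate", soft


zero = (0.0, 0.0, 0.0)
# 1-loop bubble in the soft limit: both lines carry the same loop momentum.
print(classify_region([[1], [1]], [zero, zero], [(0.01, 0.0, 0.0)], 0.1))
```

For the bubble with a vanishing external offset the point is classified as degenerate, while the same loop momentum with a finite offset on the second line falls in the singly soft region of the first line.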
We illustrate the separation into these four types of regions in Fig. 6. The usefulness of this separation comes from the following "lemma of degenerate singularity":

Lemma. All nonanalytic behaviors of the loop integral (29) as P → 0 come from the integral over the degenerate region. The integral, restricted in either the hard region, the singly soft region, or the nondegenerate region, is analytic in P as P → 0.

Proof:
1. We first show that the loop integral over the hard region $Q - \bigcup_{i=1}^{L} U_i$ is analytic as δP → 0.
In this region, we have p i > Λ > δP for all p i , and thus we can Taylor expand the i'th propagator into convergent power series of δP for all i = 1, · · · , I: U j , we have p j ≃ p j > Λ for any j ̸ = i. Then, we can Taylor expand |p j | −2s jj around δP = 0 and p i = 0 for all j ̸ = i, and we can keep the leading term only. The leading term is simply independent of δP and p i . So we conclude that, within U i − j̸ =i U j , at the leading order in δP and p i , the whole loop integral is proportional to This integral is analytic at δP = 0. To see this, we rewrite this integral as: The first integral on the right hand side of (37) can be carried out directly, which is independent of δP : On the other hand, in the second integral on the right hand side of (37), we can expand the integrand around δP = 0 since p i > Λ > δP in this region. Thus this integral is analytic in δP , so is the original integral (36). So we conclude that the loop momentum integral over the singly soft region is also analytic at δP = 0.
3. Next we show that the loop integral over any nondegenerate region $\bigcap_{i\in I} U_i$ is analytic at δP = 0. In this case, we can choose the momenta p i with i ∈ I, together with some other independent momenta, to form a new set of loop integral variables. Then, in parallel with the above argument, the integral in the region $\bigcap_{i\in I} U_i \neq \emptyset$ can be rewritten as factorized integrals: which is again analytic at δP = 0 by the same reasoning as above.
4. So, any possible singular behavior of the loop integral as δP → 0 must arise from a degenerate region $\bigcap_{i\in I} U_i$. In this case, we can no longer take all p i with i ∈ I as independent variables. Suppose such a degenerate region is an intersection of D soft zones, and the order of degeneracy is 1. (That is, D − 1 of the D vectors p i are linearly independent. Higher orders of degeneracy can be handled similarly.) Then we can write, say: Then the integral within this region will have the following form: We can certainly shift and rescale the D − 1 integral variables p i such that the above integral becomes: Here we have removed the upper limit Λ of the integral at the expense of introducing an (irrelevant) analytic term in δP. The integral is nothing but the melon integral of degree D which we shall encounter frequently in this work. We work out this integral explicitly in App. B. The result is: where the omitted terms are analytic in δP. Therefore, we see that the loop momentum integral in the degenerate region produces a term $\propto \delta P^{\,3(D-1)-2s_{1\bar 1\cdots D\bar D}}$, and this contributes to a nonlocal signal whenever the exponent $3(D-1)-2s_{1\bar 1\cdots D\bar D} \notin \mathbb{N}$.
Thus we conclude that a necessary condition for the appearance of singular behavior in δP → 0 is the existence of a degenerate region in the loop momentum space. This completes our proof of the lemma of degenerate singularity.
Equivalence between nonlocal cut and degenerate region. The lemma of degenerate singularity tells us that, when P → 0, nonanalyticity could (though not necessarily) arise from each degenerate region. Now, we are going to show that the existence of a nonlocal cut and the existence of a degenerate region imply each other. Thus, the existence of a nonlocal cut is a necessary condition for the existence of a nonlocal signal. By enumerating all nonlocal cuts in a given nonlocal soft limit, we are guaranteed to get all possible nonlocal signals in this limit. Naturally, our proof comes in two steps, one with "⇒" and the other with "⇐."

Nonlocal cut ⇒ degenerate region: First, we show that each nonlocal cut of the graph implies the existence of a degenerate region in the loop momentum space. In fact, a nonlocal cut of degree D means that we can set the momenta of all cut lines to zero under the constraint of the nonlocal soft limit. (That is, all partial sums except P remain finite.) This implies the existence of an intersection of at least D soft zones, and we only need to show that this intersection is degenerate. Suppose otherwise, namely that the soft intersection is nondegenerate. Then we can choose the D momenta in these D lines as independent loop momentum variables. This means that removing these D lines should result in a connected graph. However, being a nonlocal cut means that removing these D lines should result in a disconnected graph. This contradiction shows that the intersection of the D soft zones must be degenerate.

Degenerate region ⇒ nonlocal cut: Second, we show that a degenerate region from the intersection of D soft zones gives a nonlocal cut. In fact, the existence of such a degenerate region means that we can set the momenta of the corresponding D lines to zero up to O(Λ) without affecting the momenta of all other lines. So we can as well remove these D lines without affecting other internal lines.
Then, we only need to show that removing these D lines is a valid nonlocal cut. This comes in two steps:

1. We need to show that, after the removal of these D lines, the left set {k (L) } is totally disconnected from the right set {k (R) }. For simplicity, assume that the degeneracy of this region is of order 1. (Again, degeneracies of higher order can be handled similarly.) Then we can select the momenta p i of D − 1 lines as independent loop momentum variables, and rewrite the momentum of the remaining line as and P D is a partial sum of external momenta. Now, since the momenta of all these D lines can become 0 simultaneously, it implies that P D = 0. This is consistent with our assumption of the nonlocal soft limit (all partial sums but P remain finite) only if P D = P = k (L) (or if, trivially, P D = −P = k (R) ). Therefore, removing the first D − 1 lines will result in a tree graph where the D'th line is a tree line connecting a left subgraph with external momenta {k (L) } and a right subgraph with external momenta {k (R) }. Removing this remaining tree line thus makes the left subgraph and the right subgraph disconnected.⁴

2. We still need to show that, in the remaining graph after the removal of the D lines, the B L left points are fully connected, and so are the B R right points. Suppose this is not true, and a proper subset of {k (L) } is disconnected from the others. Then the sum of all momenta in this subset is zero, which contradicts our assumption of the nonlocal soft limit. Therefore, the B L left points must be fully connected, and so are the B R right points.
Now we have established the equivalence between the degenerate region and the nonlocal cut, and therefore we have proved the second statement of Theorem 1.
The loop integral is analytic at finite P. It remains to prove the first statement of Theorem 1, namely, the loop integral is analytic at finite P. By virtue of the lemma of degenerate singularity, we only need to show that a degenerate region never appears in the loop momentum space when all partial sums P i , including P ≡ k (L) , are held finite. Suppose it were otherwise, and we do have a degenerate region and the order of degeneracy is 1. (Again, degeneracies of higher order can be handled similarly.) Repeating the analysis of previous paragraphs, we see that, in this case, there must be at least one partial sum of external momenta approaching 0, which contradicts the original condition of all partial sums held finite. Therefore, there cannot be a degenerate region in this case and the loop integral must be analytic in P when all partial sums are held finite in the physical region. This completes the proof of the first statement of Theorem 1.

Some simple lemmas
At the end of this section, we collect several simple yet useful lemmas. These results will be used in the analysis of subsequent sections.
Lemma 2 (Perfect bipartition) An irreducible cut reduces the original graph into exactly two disconnected subgraphs; there exists no subgraph that is isolated from both the left and the right subgraphs. This is called a perfect bipartition.
Proof: If there exists a third connected component other than the left or the right graph, then we can always connect it to the left or to the right graph (but not to both), by putting back several cut lines. The resultant graph is still a valid nonlocal cut, showing that the original cut is not irreducible.
This almost trivial observation has several useful consequences for an irreducible cut graph:
1. All bulk vertices belong to either the left or the right graph; there can be no bulk vertex that is connected to neither the left nor the right graph.
2. It is impossible to cut all internal lines ending at the same bulk vertex. Otherwise this bulk vertex will be isolated from the left and right subgraphs.
3. A cut line in an irreducible cut must connect the left subgraph to the right subgraph. Proof: By the main lemma, each of the two endpoints of a cut line must belong to either the left or the right subgraph. If they both belong to the left (or right) graph, then we can put the cut line back, and the resultant graph is again a valid cut, showing that the original cut is not irreducible.
Note that the property of perfect bipartition no longer holds for reducible cuts. In a reducible cut graph, we can have multiple disconnected components. In particular, there can be a bulk vertex such that all bulk lines attached to it are cut. In this case, we will have no control over the relative softness of these lines. This makes the resonance argument inapplicable, and also makes the computation of nonlocal signals difficult. Fortunately, this happens only to reducible cuts, which are also nonminimal cuts, meaning that they do not contribute to the leading nonlocal signal.

Figure 7: Illustration of internal and external vertices. The magenta dots denote internal vertices, the blue dots denote external vertices, and the little white box denotes boundary points.

Lemma 3 (Maximal degree)
The degree D of an irreducible cut cannot exceed L + 1, where L is the number of loops in the whole graph.
Proof: By Lemma 2, in an irreducibly cut graph, a cut line must connect the left and the right subgraphs. Given that the left (right) subgraph is itself connected, we can always make use of a collection of bulk lines in the left subgraph to connect all endpoints of all cut lines that are attached to the left subgraph. We can do the same on the right side. With these additional lines, we have constructed a D − 1 loop graph which is a subgraph of the original graph. Since the number of loops in a subgraph must not exceed the number of loops L in the original graph, we conclude that D ≤ L + 1.
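The bound D ≤ L + 1 obtained in the proof can be checked on the simplest family of examples. The sketch below counts loops with the standard graph-theoretic formula L = E − V + C (edges, vertices, connected components) and applies it to an n-loop melon, modeled here as two bulk vertices joined by n + 1 parallel lines. The function names and this encoding of the melon graph are illustrative assumptions, and external legs are ignored since they do not add loops.

```python
def loop_number(num_vertices, edges):
    """Number of independent loops, L = E - V + C, of a multigraph.

    `edges` is a list of (u, v) pairs with vertex labels 0 .. num_vertices - 1.
    """
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = num_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return len(edges) - num_vertices + components


def melon(n_loops):
    """Two vertices joined by n_loops + 1 parallel bulk lines."""
    return 2, [(0, 1)] * (n_loops + 1)


for n in range(1, 5):
    num_vertices, edges = melon(n)
    L = loop_number(num_vertices, edges)
    D = len(edges)  # cutting every line separates the two vertices
    print(n, L, D, D <= L + 1)
```

For the melon family the cut through all lines has D = L + 1, so the bound is saturated rather than strict.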
Till now, we have been considering how to find the nonlocal signal for a given bipartition of the graph. However, not every bipartition leads to nonlocal signals. Let us call a bipartition that contains nonlocal signals signal-bearing. It would be useful to know a priori whether a bipartition is signal-bearing or not. Here we provide two necessary conditions for a signal-bearing bipartition.
To state these conditions, we introduce the concept of the internal vertex and the external vertex, which will be useful in the following sections. An internal vertex is a vertex to which all lines attached are bulk propagators. An external vertex is a vertex to which there is at least one bulk-to-boundary propagator attached. We stress that both internal and external vertices are bulk vertices. We illustrate these definitions in Fig. 7.
With the above preparation, we are now ready to state two necessary conditions for a signal-bearing bipartition.

Lemma 4 (Signal-bearing bipartition)
A signal-bearing bipartition of a graph has the following two properties:
1. A group of external points connected at a common vertex must be on the same side of the bipartition.
2. For any two external vertices on the same side, there must exist a path of bulk propagators connecting them without passing through any external vertices on the other side.

Figure 8: Illustration of the second condition in Lemma 4. In the limit k 1 + k 2 → 0, the left and middle graphs could give rise to a nonlocal signal while the right graph cannot.
The proof of this lemma is trivial. If either of the above two conditions is not met, a nonlocal cut would be impossible. Although trivial to prove, this lemma is useful in our analysis. The first condition shows that, not only the boundary points, but also all external vertices, are unambiguously separated into two groups by the bipartition. So we can meaningfully talk about left external vertices and right external vertices. The second condition shows that "disconnected soft limits" are signal-less. This is illustrated in Fig. 8.

On-Shell Factorization of Bulk-Free Graphs
The signal detection algorithm from the last section gives us a practical way to search for nonlocal signals in a graph. By executing all possible nonlocal cuts associated with a given nonlocal soft limit, we are guaranteed to locate the origins of all possible nonlocal signals. However, the signal detection algorithm does not provide explicit expressions for the nonlocal signals. In the present and the next sections, we are going to work out explicit expressions for the leading nonlocal signals in a given soft limit for various types of graphs. These expressions will make the property of on-shell factorization explicit at any loop orders: The leading nonlocal signal from a minimal cut of degree D naturally factorizes into three pieces: a left subgraph, a right subgraph, and a nonanalytic piece which we call the (D − 1)-loop melon signal. Also, the cutting rule follows as a natural byproduct of our proof of the factorization theorem. That is, all cut propagators can be replaced by their real parts, and thereby the time orderings between any two endpoints of a cut propagator can be avoided.
In this section, we consider graphs of arbitrary loop order that contain no internal vertices. This amounts to saying that the fields corresponding to all internal lines have no self or mutual interactions, except the couplings to the external mode. Thus we can think of these graphs as the expectation values of a bunch of free fields in the bulk, with external legs acting as external sources for these bulk fields and their products. For this reason, we call such graphs bulk-free graphs. As we shall see, the analysis for bulk-free graphs turns out to be free from many subtleties that arise in more general graphs. Thus, the factorization of nonlocal signals for such graphs holds under very general assumptions about the theory.
For definiteness, we shall first present and prove the factorization theorem for arbitrary bulk-free graphs under a relatively restricted set of assumptions. After completing the proof, we shall explore the consequences of loosening some of the assumptions made in the theorem.

Factorization theorem for bulk-free graphs
Theorem 2 (On-shell factorization of bulk-free graphs) Let G({k}) be a B-point and L-loop graph without any internal vertices, all of whose bulk propagators can be massive with arbitrary (possibly dS-boost-breaking) dispersions. Then, for an arbitrary nonlocal bipartition of the graph {k (L) } ∪ {k (R) }, the amplitude G({k}) exhibits a nonlocal signal in the P ≡ {k (L) } → 0 limit if and only if there exists a nonlocal cut associated with the bipartition. When a nonlocal cut exists for this bipartition, the minimal cut is unique. The explicit expression for the nonlocal signal in this limit is factorized into three parts: Here the subscript "nonlocal" below the summation sign means to sum over all possible combinations of c 1 , · · · , c D ∈ {±} subject to the condition $\sum_{i=1}^{D} c_i\nu_i \neq 0$. All the nonanalyticity in P at P = 0 is encoded in M c 1 ···c D (P ), which we call the "melon signal," and its explicit expression is: In the above expression, the repeated indices are not summed unless explicitly stated. On the other hand, the left subgraph G Here we are assuming the left subgraph G is specified similarly. In addition, we use V CL to denote the set of left endpoints of the D cut lines, and use V CR to denote the set of right endpoints of the D cut lines.

Proof of the factorization theorem
Outline of the proof. The proof of Theorem 2 is also long and technical, so we again outline the main idea of the proof before spelling out all the details.
1. We first show that, for a given nonlocal bipartition, if a nonlocal cut exists, then there will be a unique minimal cut associated with this nonlocal bipartition. So, we can focus on this minimal (and irreducible) cut for the subsequent analysis. Then, according to Lemma 2, this minimal cut makes a perfect bipartition of the graph, and we get a left subgraph, a right subgraph, and D cut lines.
2. We then take the partial MB representation for the graph, and rewrite the loop momentum integral to make manifest the structure that the graph is separated into the left and right subgraphs, together with D cut lines.
3. We then show that the loop integrals of the left and right subgraphs are analytic in the limit where all D cut lines become soft.
4. We can then decouple the left and right subgraphs from the loop momentum integral by restricting ourselves to the "degenerate soft region" where all D cut lines are soft, in accordance with the analysis in the proof of Theorem 1. In this particular case, we can compute the leading nonanalytic term explicitly by finishing a "melon integral."
5. We then analyze the pole structure of the whole graph in the Mellin space, and show that the nonlocal signal receives contribution only from the poles arising from the MB representation of the cut propagators. By collecting the residues at these poles, we finish the Mellin integral and get the final result (44), and thus complete the proof of the factorization theorem.
6. As a direct corollary of our proof, we show that, so long as the nonlocal signal is the only concern, we can replace all cut propagators by their real parts. As a result, the time integrals over all cut propagators are automatically factorized. This can be viewed as a cutting rule for computing nonlocal signals.
Below we carry out these steps in detail.
Unique minimal cut. We first show that, if there exists a nonlocal cut for the given nonlocal bipartition {k (L) } ∪ {k (R) }, then the irreducible cut is unique, and this irreducible cut is automatically a minimal cut. In fact, by our assumption of the bulk-free graph, there is no internal vertex. So, any cut line in the given nonlocal cut must connect two external vertices. Then, according to Lemma 4, in a signal-bearing bipartition, any external vertex is either a left vertex (L) or a right vertex (R). So, all cut lines can be classified as LL, RR, or LR, according to the sides of its two endpoints. We can restore all LL and RR lines, and the resultant graph is still a valid cut. However, we cannot restore any LR line, since this operation would reconnect the left and right subgraphs. So, we have found a unique irreducible cut, in which all LR lines are cut. Also, as will be made clear in the following analysis, the restored LL or RR lines in the above "cut reduction" procedure do not lead to independent new signals. So, we shall only consider the minimal cut in the rest of the proof.
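The "cut reduction" argument above amounts to a simple classification of bulk lines. As a minimal sketch, assuming a bulk-free graph is encoded as a list of lines between labeled external vertices and that the bipartition assigns each vertex a side "L" or "R" (this encoding, the function names, and the example graph are all hypothetical), the unique irreducible cut is just the set of LR lines:

```python
def classify_lines(side, edges):
    """Label each bulk line as 'LL', 'RR', or 'LR' from the sides of its endpoints."""
    labels = []
    for u, v in edges:
        if side[u] != side[v]:
            labels.append("LR")
        else:
            labels.append(side[u] * 2)  # 'LL' or 'RR'
    return labels


def minimal_cut(side, edges):
    """Lines that must be cut: exactly those joining a left and a right vertex."""
    labels = classify_lines(side, edges)
    return [edge for edge, label in zip(edges, labels) if label == "LR"]


# Hypothetical two-loop bulk-free graph with four external vertices.
side = {"A": "L", "B": "L", "C": "R", "D": "R"}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("B", "D")]

print(classify_lines(side, edges))
print(minimal_cut(side, edges))
```

The sketch does not check the conditions of Lemma 4, so it should only be applied to bipartitions that are already known to be signal-bearing.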
Loop momentum integral in the partial MB representation. We still work with the partial MB representation, and the graph G({k}) still has the form of (26). To study the analytic properties of G({k}) as a function of momentum partial sums, let us look at the loop momentum integral L({k}). In this section, we suppress the dependence of L({k}) on Mellin variables to avoid notational clutter.
The whole graph G({k}) has L independent loop integral variables. With the minimal cut of the graph given, we can choose these L independent loop momenta as follows. First, from Lemma 3, we see that the D cut lines belong to a (D − 1)-loop subgraph of the whole graph. Furthermore, it is possible to choose the momenta flowing in D − 1 of these D cut lines as independent integral variables. Second, after executing the nonlocal cut, the left subgraph is generally an N L -loop graph with N L ≥ 0 and the right subgraph generally an N R -loop graph with N R ≥ 0. Furthermore, we have N L + N R + D − 1 = L. Thus, we can choose the remaining N L + N R integral variables from N L of the loop lines in the left subgraph and N R of the loop lines in the right subgraph.
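The counting N L + N R + D − 1 = L can be verified on a small example. The sketch below again uses L = E − V + C for each piece; the graph, the vertex labels, and the helper name are illustrative assumptions, and the identity relies on the left and right subgraphs each being connected after the cut.

```python
def loop_number(vertices, edges):
    """L = E - V + C, with C the number of connected components."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    components = len(parent)
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return len(edges) - len(parent) + components


# Hypothetical graph: a one-loop bubble on the left attached to one right vertex.
side = {"A": "L", "B": "L", "C": "R"}
edges = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "C")]

left_edges = [e for e in edges if side[e[0]] == "L" and side[e[1]] == "L"]
right_edges = [e for e in edges if side[e[0]] == "R" and side[e[1]] == "R"]
cut_edges = [e for e in edges if side[e[0]] != side[e[1]]]

L = loop_number(list(side), edges)
N_L = loop_number([v for v in side if side[v] == "L"], left_edges)
N_R = loop_number([v for v in side if side[v] == "R"], right_edges)
D = len(cut_edges)

print(L, N_L, N_R, D, N_L + N_R + D - 1 == L)
```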
With the above choice made, we can rearrange the whole loop integral into the following form: Here L (L) is the loop integral of the N L -loop left subgraph. As made explicit, it is a function of all external momenta of the left subgraph {k (L) }, as well as the momenta of D cut lines, namely q 1 , · · · , q D . Here, for convenience, we denote the momentum of the D'th cut line by , although we stress that q D is not an independent loop momentum variable. Then the left loop integral can be written as: where p (L) i is the momentum of the i'th propagator. Likewise, the momentum integral of the right subgraph can be expressed as: Now, we have put the Mellin-space loop integral L {k} into a proper form. The next task is to detect potential nonanalytic behavior in L {k} in P as P → 0 while all external momenta k and partial sums P ̸ = P held fixed. In the following, we shall show that the loop integrals of the left and right subgraphs are both analytic in P as P → 0, and therefore, all possible nonanalytic behavior must be from the first line of (48).
Analyticity of left and right subgraphs. Now we show that the Mellin-space loop integrals for the left and right subgraphs are analytic in each of q i (i = 1, · · · , D) when we send all q i → 0 simultaneously. We prove the above statement for the left subgraph and the treatment for the right subgraph is identical.
Suppose the left subgraph has N L independent loops. If N L = 0 (tree graph), then it automatically allows a Taylor expansion around q ℓ → 0 (ℓ = 1, · · · , D) in the Mellin representation, and the leading term is automatically finite.
If N L ̸ = 0 (loop graph), let there be V L vertices in the left subgraph. By our assumption of the main theorem, these V L vertices are all external vertices, to which all bulk-to-boundary propagators and the cut propagators attach. Therefore, it is clear that the loop momentum integral L (L) given in (49) depends on external momenta (including {k (L) }, {q}, and P) only through certain combinations, namely, only through the total momenta injected into each of the V L vertices. Let the total momentum flowing into the i'th vertex be K i . Clearly, we can write where the coefficients A ij and B iℓ take their values from {0, +1}. We can choose loop momentum variables q (L) ℓ (ℓ = 1, · · · , N L ) in such a way that the momentum p (L) i of the i'th loop propagator (i = 1, · · · , I L ) takes the following form: where the coefficients α iℓ take values from {0, ±1}, and β ij take values from {0, +1}. (See Footnote 3 for the proof.) By our construction of the nonlocal soft limit, the first term of the above expression, A ij k (L) j , remains finite for all i when taking the soft limit. Therefore, taking the soft limit is equivalent to evaluating the left subgraph at the finite values of external momenta j . According to the first part of Theorem 1, the loop integral is analytic in the external momenta for generic finite values in the physical region. Therefore, we conclude that the left subgraph is analytic in q ℓ and P in the simultaneous soft limit q ℓ → 0 (ℓ = 1, · · · , D). In this limit, the left subgraph allows a Taylor expansion, whose zeroth order term is In exactly the same way, we can show that the right subgraph is also analytic in q ℓ in the same soft limit.
Factorization of loop momentum integral. The above analysis shows that we can Taylor expand both the left and right subgraphs around q ℓ = 0 and P = 0. When we insert such expansions back into the full loop integral L in (48), the higher order terms in q ℓ and P would eventually contribute to higher powers of P when we finish the q ℓ integral, which decrease faster in the P → 0 limit. Therefore, to single out the leading term, we keep only the zeroth-order terms, which simply means that we can directly set q ℓ = P = 0 in both L (L) and L (R) . In turn, it means that we can move L (L) and L (R) out of the q ℓ -integral. The result is: where M D−1 (P ) is the (D − 1)-loop melon integral in the Mellin representation: This integral can be directly done with details shown in App. B, and the result is: It is already clear that the result of the loop momentum integral at the leading order of P factorizes into three pieces, the left subgraph L (L) , the right subgraph L (R) , and the melon integral M D−1 (P ). In the P → 0 limit, both the left and right subgraphs become independent of P , and all the P dependences, and thus all potential nonanalyticities in P , are from the melon integral M D−1 (P ). From (56) we see that the (non)analyticity of M D−1 (P ) in P is fully controlled by the value of the power $-2s_{1\bar 1\cdots D\bar D}$. After we perform the Mellin integrals, this power will take its value from the location of poles in all Mellin variables. Therefore, the (non)analyticity of the graph would be controlled by the locations of these poles. A nonanalytic piece appears whenever we take poles such that $-2s_{1\bar 1\cdots D\bar D} \notin \mathbb{N}$. So the next step is to analyze the pole structure of the partial MB representation of the graph G as a function of all Mellin variables.
Pole structure. A Mellin integral contour runs along a path from −i∞ to +i∞, and in all cases of our interest, the Mellin integrand is meromorphic. So, we should close the contour properly and pick up the residues of appropriate poles. In our case, the power $P^{3(D-1)-2s_{1\bar 1\cdots D\bar D}}$ in the melon integral in (56) together with the limit P → 0 shows that we should close the MB contour from the left half plane. Now, let us look at the pole structure of the Mellin integrand. There are three types of poles:

1. Poles from the time integral. The integral converges when τ i → −∞, as is guaranteed by the Bunch-Davies initial condition. Any singularity in S i must be from the IR limit τ i → 0. In this limit, we can expand the exponential factor $e^{\mathrm{i}a_iE_i\tau_i} = \sum_{n=0}^{\infty} (\mathrm{i}a_iE_i\tau_i)^n/n!$, and we see that the integral (57) is divergent if 1 + P i − 2S i + n = 0 for any n = 0, 1, 2, · · · . These correspond to right poles of S i . In fact, we can do this integral explicitly, and get: The aforementioned poles are encoded in the Euler Γ-function. The above argument can be generalized to arbitrarily nested time integrals to show that the full time integral, including all possible nesting functions, only produces right poles in the Mellin variables. We refer the readers to [94] for details. Since we only need to pick up left poles, the time-integral poles are irrelevant to us.

2. Poles from the loop momentum integral. Second, we have a set of poles from the Euler Γ factors in the melon integral (56). The poles from $\Gamma(\tfrac{3}{2} - s_{\ell\bar\ell})$ are right poles and can be discarded. On the other hand, there is a series of left poles from the factor $\Gamma\big(s_{1\bar 1\cdots D\bar D} - \tfrac{3(D-1)}{2}\big)$. This set of poles arises due to the divergence of the melon loop integral in the UV limit where all loop momenta go to infinity. For this reason we call it the UV pole. Clearly, the UV pole forces the combination $-2s_{1\bar 1\cdots D\bar D}$ to be an integer and thus its contribution to the melon integral is analytic in P at P = 0. So, to evaluate the nonlocal signal, we can discard this set of poles as well.
3. Infrared poles from Hankel functions. Third, we have 2D sets of poles from the Euler Γ factors in the MB representation of the D cut propagators. Assuming distinct masses ν i (i = 1, · · · , D) for all lines, these Γ factors are: So the corresponding poles are: where c i , d i = ±. Note that these poles are from the MB representation of the bulk propagators, which correspond to a late-time expansion of the bulk propagators. The positions of these poles encode the information of the scaling dimensions of the late-time modes, and for this reason we shall call them infrared (IR) poles. As we shall show below, these poles are the only sources of nonlocal signals.
For principal scalars we have ν i ∈ R + and thus the IR poles can generate noninteger powers of P in the melon integral (56). So we should consider all possible combinations of IR poles to get the full nonanalytic contributions to the graph. That is, we should consider all possible choices of c i and d i in (61).
However, some combinations of c i and d i do not yield nonanalytic results. There are two cases where this can happen.
First, we have a string of Γ factors $1/\Gamma[s_{1\bar 1}, \cdots, s_{D\bar D}]$ in the melon loop integral (56), which give a set of zeros of degree 1 when $s_{\ell\bar\ell}$ are nonpositive integers. These zeros show that we should discard all pole combinations with c i = −d i , and only keep the poles with c i = d i for all i = 1, · · · , D. So, after this step, the possible signal poles reduce to:

Second, it may happen that, for a particular choice of c i , the combination $\sum_{i=1}^{D} c_i\nu_i$ vanishes, in which case the result is analytic in P. This is why such choices are excluded by the condition $\sum_{i=1}^{D} c_i\nu_i \neq 0$ in Theorem 2.

Nonlocal signal cutting rule. A direct consequence of choosing the nonlocal poles in (62) is that we can replace the propagators of all D cut lines by their real parts, known as the nonlocal signal cutting rule [51,83,92,94]. This is a direct generalization of the nonlocal signal cutting rule for one-loop graphs proposed in [94]. To see this, we note that the MB representation for a bulk propagator (25) contains a factor $e^{\mp\mathrm{i}\pi(s_i-\bar s_i)}$. When evaluating this factor with the nonlocal poles (62), we get $(-1)^{n_i-\bar n_i}$, which is real. The whole propagator (25) then becomes real as well. This is why we can replace the cut propagators by their real parts. Furthermore, after this replacement, all four propagators D ab with a, b = ± become identical, and the time-ordering θ-functions in D ±± automatically disappear. So, the time integral factorizes into a left part and a right part.
Let us emphasize again that the cutting rule presented here is conceptually different from the nonlocal cut introduced earlier. As we showed above, the cutting rule D ab → Re D ab is a consequence of taking nonlocal signal poles (62) for all cut propagators. On the other hand, a nonlocal cut only indicates which propagators should be made soft in order to generate a nonlocal signal. It is not a priori clear that we can take nonlocal signal poles to integrate out Mellin variables of all the cut propagators. While the procedure of "taking signal poles for all cut lines" is always possible for bulk-free graphs, we will see in the next section that this is not the case for more general graphs.
Completing the Mellin integral. Now we are ready to finish all the Mellin integrals in (26) for a bulk-free graph and complete the proof of Theorem 2. Since we are only concerned with the leading nonlocal signal, we only need to collect residues of leading nonlocal poles for the Mellin variables of all cut lines, namely $n_i = \bar n_i = 0$ for all i = 1, · · · , D in (62). Then, the melon integral (56), together with the factors from MB representations of cut lines (collectively denoted as $D(s_i, \bar s_i)$ in (26)), gives exactly the melon signal M c 1 ···c D (P ) in (45). Since the time integrals have been fully factorized into a left part and a right part, finishing the rest of the Mellin integrals simply generates the standard SK integrals for the left subgraph and the right subgraph, as shown in (46) and (47). In particular, the additional time-power insertion $(-\tau_i)^{3/2+\mathrm{i}c_i\nu_i}$ comes from the cut propagator, and is nothing but the leading mode of a bulk propagator in the late-time limit. Thus we see that, despite all the complications of loops, the cut lines can still be expanded in the late-time limit so far as the nonlocal signal is the only concern.
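The sum over sign choices in Theorem 2 can be enumerated mechanically. The following Python sketch lists all combinations c 1 , · · · , c D ∈ {±} with a nonvanishing sum of c i ν i and reports the corresponding power of P in the melon integral. It rests on an assumption: that the leading IR poles sit at $s_i = \bar s_i = -\mathrm{i}c_i\nu_i/2$, which is consistent with the time-power insertion $(-\tau_i)^{3/2+\mathrm{i}c_i\nu_i}$ quoted above, so that the power $3(D-1)-2s_{1\bar 1\cdots D\bar D}$ becomes $3(D-1)+2\mathrm{i}\sum_i c_i\nu_i$. The function name and the tolerance parameter are hypothetical.

```python
from itertools import product


def nonlocal_exponents(nus, tol=1e-12):
    """Sign choices with sum(c_i * nu_i) != 0 and the assumed leading power of P.

    Assumes leading IR poles at s_i = sbar_i = -i c_i nu_i / 2, so the power of P
    is 3(D - 1) + 2i * sum(c_i * nu_i). Returns a list of (signs, exponent).
    """
    D = len(nus)
    results = []
    for signs in product((+1, -1), repeat=D):
        total = sum(c * nu for c, nu in zip(signs, nus))
        if abs(total) > tol:  # drop combinations that give an integer power
            results.append((signs, complex(3 * (D - 1), 2 * total)))
    return results


# Two cut lines with equal masses: the (+,-) and (-,+) choices drop out.
equal = nonlocal_exponents([1.0, 1.0])
# Two cut lines with distinct masses: all four sign choices survive.
distinct = nonlocal_exponents([1.0, 2.0])
print(equal)
print(len(distinct))
```

With equal masses only the two same-sign combinations remain, while for generic distinct masses all 2^D combinations contribute.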

Discussions
From the proof of Theorem 2, we see that the assumption of bulk-free graphs brings two great simplifications. First, whenever nonlocal cuts exist, the minimal cut is guaranteed to be unique. Second, after taking the minimal cut, the resulting left and right subgraphs are both analytic and finite when P → 0. Importantly, the arguments leading to these simplifications do not rely on the type of couplings and the dispersions of bulk fields. Thus, we can immediately loosen some of the conditions of Theorem 2 as follows.
1. As a trivial extension, we can change the external lines to be (nearly) massless scalars (such as the inflaton fluctuations), massless spin-2 gravitons, or arbitrary mixtures of them. These are the most relevant cases for CC applications.
2. We can generalize the type of internal fields to include nonzero spins and more exotic (and possibly dS boost breaking) dispersions. The tensor structure introduced by the spinning fields does not affect the analytical properties, and the factorization of the leading nonlocal signal into three pieces still holds in this case. Of course, explicit expressions such as the melon signal (45) will be changed, but they are still calculable so long as the MB representation of the bulk propagators can be found analytically. As an example, we can consider a vector field of mass $m$ with a dS-boost-breaking chemical potential $\mu$. In terms of the vector mass parameter $\tilde\nu = \sqrt{m^2 - 1/4}$ and the dimensionless chemical potential $\tilde\mu = \mu/H$, its propagator can be found to be [92]: where $W_{\kappa,\mu}(z)$ is the Whittaker W function. The MB representation of this propagator is: where ${}_2\mathcal{F}_1$ is the dressed hypergeometric function, defined in App. A. Thus we see that the pole structure of this propagator is very similar to the case of a massive scalar. The integrand also possesses two pairs of IR poles at $s = -n \mp i\tilde\nu$ and $\bar s = -n \mp i\tilde\nu$.⁵
3. We can include arbitrary dS covariant or dS breaking couplings, such as uncontracted time derivatives or fully contracted space derivatives. A time derivative on external lines is inconsequential, while a time derivative on an internal line only shifts the position of the right poles produced by the Mellin-space time integral, which is again irrelevant for the computation of nonlocal signals. On the other hand, the spatial derivatives all become factors of momenta, which do not affect the analytic property. The only caveat is that a spatial derivative on a cut line would make the signal decrease faster than nonderivative couplings as $P \to 0$.

On-Shell Factorization of Arbitrary Graphs
In the previous section, we studied nonlocal signals in arbitrary bulk-free graphs, and derived an explicit and factorized expression for the nonlocal signal in the given nonlocal soft limit. The restriction to bulk-free graphs implies that there are no internal vertices, and this significantly simplifies the analysis. As a result, we are able to extend the factorization theorem to rather general couplings and field species, as discussed above.
However, we are ultimately interested in nonlocal signals in more general graphs, including arbitrary loop graphs with internal vertices. In this case, the analysis can be considerably more involved. Thus, in the following, we shall first discuss several complications with loop graphs containing internal vertices, and then focus on a particular case where the minimal cut is unique with respect to a given bipartition, to which we can directly generalize the on-shell factorization theorem proved in the previous section for bulk-free graphs. For graphs with multiple minimal cuts, we will see that there can be a peculiar type of nonlocal signal, which we call a hybrid signal. Hybrid signals further complicate the analysis. Fortunately, as we shall see, such hybrid signals are absent for graphs with complete dS covariance. Thus, we can still formulate an on-shell factorization theorem for dS covariant graphs with the most general loop topology. Below we spell out these points in turn.

Complications with internal vertices
Leading signal. The first complication for graphs with internal vertices is that there could be multiple distinct irreducible cuts associated with a given nonlocal bipartition. See Fig. 5 for an example. Then, according to Theorem 1, each of these cuts could potentially contribute a nonlocal signal and we need to collect all of them to get the full nonanalytic piece in the corresponding soft limit.
This complication is partially relieved if we only consider the leading signal. As shown in (22), each of the nonlocal signals in a given soft limit $P \to 0$ scales as $P^{\alpha_\ell \pm i\omega_\ell}$, where both $\alpha_\ell$ and $\omega_\ell$ are positive real numbers. Thus, the leading signal, namely the signal with the smallest $\alpha_\ell$, dominates over all other nonlocal signals in the soft limit.
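As a quick numerical illustration of this dominance, the following sketch compares the envelopes $P^{\alpha}$ of two model signals as $P$ decreases. The amplitudes are set to one and the exponents are made-up values chosen for illustration; they are not taken from any specific graph in this paper.

```python
import math

def signal(P, alpha, omega):
    """Real part of P**(alpha + i*omega), i.e. P**alpha * cos(omega * log P)."""
    return P**alpha * math.cos(omega * math.log(P))

def envelope(P, alpha):
    """Amplitude P**alpha of the oscillating signal."""
    return P**alpha

# Hypothetical (alpha, omega) pairs: one leading and one subleading signal.
leading = (3.0, 2.0)
subleading = (6.0, 5.0)

for P in (1e-1, 1e-2, 1e-3):
    ratio = envelope(P, subleading[0]) / envelope(P, leading[0])
    print(f"P = {P:g}: subleading/leading envelope = {ratio:.1e}")
```

The ratio falls like $P^{\alpha_2 - \alpha_1}$, so the signal with the smaller real exponent controls the soft limit regardless of the oscillation frequencies.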
It then remains to understand which cut gives the leading signal. Intuitively, we expect that a nonlocal signal would be more subleading if we cut more lines. The reason is that a nonlocal signal is generated from the on-shell resonance of heavy particles, whose number density is diluted by the expanding physical volume $V$ according to $1/V \sim 1/a^3 \sim (-\tau)^3$. From the analysis around (11), we learn that the late-time limit $\tau \to 0$ is translated to a squeezed limit $P \to 0$ in the correlation function. So, by asking one more internal line to contribute to the nonlocal signal, we pay the price of an additional suppression factor of $P^3$. Indeed, this intuition is well confirmed by the explicit expression of the melon signal (45), in which we can read off the real part of the power of $P$ as $P^{3(D-1)}$, where $D$ is the degree of the cut.
(Footnote: … we use the representation (64). As shown in [51], the MB representation is not unique, and in this case, one can use the so-called partially resolved MB representation for the bulk propagator to make the cutting rule manifest.)
The above analysis seems to suggest that we only need to consider minimal cuts in order to get the leading signal. However, there are further complications, as we shall discuss below.
Derivative couplings. An immediate complication comes from derivative couplings. In a general setup where the dS boosts are explicitly broken (such as in the EFT of inflation [116]), we can have spatial-derivative couplings without temporal-derivative counterparts. Suppose we have spatial derivatives $-\partial_i^2$ acting on an endpoint of a soft bulk propagator. Then, in momentum space, this gives rise to a factor of $P^2$, where $P$ is the momentum of the soft propagator. As a result, it might be possible that the signal from a minimal cut decreases even faster than the signal from a nonminimal cut, countering our earlier intuition.
We note that only spatial derivatives change the scaling behavior of the signal in the soft limit. Neither the time derivatives nor the non-scale-invariant time powers at each interaction vertex generate similar complications. This can be seen very easily using the partial MB representation, in which the time and momentum variables are fully separated, and thus any change in the time integral $T(\{k\}; \{s,\bar s\})$ cannot alter the power counting of momentum partial sums in the momentum sector $L(\{k\}; \{s,\bar s\})$.
The complication of derivative couplings can be addressed by defining the drop $\Delta$ of a nonlocal cut, which measures how fast the would-be signal decreases in the late-time/squeezed limit. In the late-time limit, a massive propagator behaves like $k^3 \times$ (nonanalytic factor). Then, when there are spatial derivatives, we can define the total drop of a cut of degree $D$ as: where $\delta_i \geq 0$ denotes the total number of spatial derivatives acting on both sides of the $i$'th cut propagator. Then, the leading signal would be given by the cut of minimal drop $\Delta$ rather than the minimal degree $D$. This is indeed the case if there is a unique cut of minimal drop $\Delta$ in a given bipartition. Due to another complication we shall discuss below, when there are multiple cuts sharing the same minimal drop, the situation can be more complicated.
On the other hand, we note that the complication of spatial derivatives is always irrelevant in dS covariant graphs. The reason is simple: Whenever there is a derivative coupling in a dS covariant graph, the spatial and temporal derivatives must appear together. So, in the above example of $\mathcal{L} \supset -(\partial_i^2\sigma)\cdots$, we must have terms like $\mathcal{L} \supset [\sigma'' - \partial_i^2\sigma]\cdots$. While the spatial derivatives make the signal decrease faster in the soft limit, the temporal derivatives do not. Thus, the temporal derivatives give the leading signal, whose scaling agrees with our naive counting by the number of cut lines. We conclude that, in dS covariant graphs, the leading nonlocal signal is given by the minimal cut, at least when the minimal cut is unique.
Thus, in the following analysis, when we mention a minimal cut, we always mean a cut of minimal degree D for dS covariant graphs, or a cut of minimal drop ∆ for boost breaking graphs.
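The selection rule just stated can be sketched in a few lines. The class `Cut`, the function `minimal_cuts`, and the example data are hypothetical names introduced only for illustration. The drop is assumed here to count 3 per cut line plus the number of spatial derivatives on that line; the precise definition in the text may differ by a constant offset, which would not change the ordering of cuts.

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass(frozen=True)
class Cut:
    name: str
    spatial_derivs: Sequence[int]  # delta_i for each cut line

    @property
    def degree(self) -> int:
        # Number of cut lines D.
        return len(self.spatial_derivs)

    @property
    def drop(self) -> int:
        # Assumed counting: 3 per cut line plus its spatial derivatives.
        return 3 * self.degree + sum(self.spatial_derivs)

def minimal_cuts(cuts: List[Cut], ds_covariant: bool) -> List[Cut]:
    """Cuts of minimal degree (dS covariant) or minimal drop (boost breaking)."""
    key = (lambda c: c.degree) if ds_covariant else (lambda c: c.drop)
    best = min(key(c) for c in cuts)
    return [c for c in cuts if key(c) == best]

# Hypothetical bipartition with two irreducible cuts:
# "A" cuts one line carrying four spatial derivatives, "B" cuts two plain lines.
cuts = [Cut("A", (4,)), Cut("B", (0, 0))]
print([c.name for c in minimal_cuts(cuts, ds_covariant=True)])   # ['A']
print([c.name for c in minimal_cuts(cuts, ds_covariant=False)])  # ['B']
```

In the boost-breaking case the degree-2 cut wins, which is the situation described above where a cut of minimal degree decreases faster than a cut of higher degree.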
There are additional complications with graphs containing internal vertices when the minimal cuts are not unique in a given nonlocal soft limit, to be discussed below. For now, the analysis of the derivative couplings is sufficient for us to formulate the on-shell factorization theorem for arbitrary loop diagrams with unique minimal cut. Thus, we shall first formulate this theorem in Sec. 5.2 before considering the most general case in Sec. 5.3.

On-shell factorization with unique minimal cut
The above analysis shows that the complication from internal vertices can be largely circumvented if we only consider the leading signal, and if there exists a unique minimal cut associated with a given nonlocal bipartition. Equipped with this knowledge, we are ready to generalize the on-shell factorization to arbitrary loop diagrams with a unique minimal cut, as summarized in the following theorem. As we shall see, this is the most general situation where we can formulate a theorem with certain mathematical rigor.
Proof. We focus on the nonlocal signal generated from the minimal cut with degree $D$. Since a minimal cut is necessarily irreducible, by Lemma 2, it generates a perfect bipartition of the graph. As a result, the graph is fully resolved into a left subgraph, a right subgraph, and $D$ cut lines. Then, the proof goes in the same way as we did for Theorem 2 in the last section. The difference is that the left and right subgraphs are not guaranteed to be analytic in $q_\ell$ as $q_\ell \to 0$, where $q_\ell$ is the momentum of the $\ell$'th cut line ($\ell = 1, \cdots, D$). Due to the existence of additional cuts, the left and right subgraphs can exhibit additional nonanalytic behaviors as $q_\ell \to 0$. However, thanks to the assumption of a unique minimal cut, the nonlocal signals arising from those additional nonanalytic terms would be subleading in the soft limit $P \to 0$.
Taking account of the additional nonanalytic terms in the left and right subgraphs, we can write the soft limit $q_\ell \to 0$ of the Mellin-space loop integral of the left subgraph as, That is, the analytic part of the left loop integral can be Taylor expanded around $q_\ell \to 0$, and the leading order ($n = 0$) term is generically nonzero. The nonanalytic term, on the other hand, typically decreases to zero in the $q_\ell \to 0$ limit, so long as the interactions are IR finite. Indeed, from the expression of the melon signal in (45), we see that a cut of degree $D$ yields a nonanalytic term (namely, a signal) that scales like $P^{3(D-1)+ic_\ell\tilde\nu_\ell}$. So, the leading contribution from the left subgraph in the $q_\ell \to 0$ limit is from the $q_\ell$-independent term of its analytic part. Then, the rest of the proof goes in exactly the same way as the last section. Now, so long as the couplings are dS covariant, the rate at which a nonlocal signal decreases is exactly given by the degree of the corresponding nonlocal cut. Therefore, it is guaranteed that the nonanalytic terms of the left or right subgraphs are subdominant relative to the nonlocal signal from the minimal cut. So, the leading nonlocal signal must be from the term explicitly shown in (67). Then, going through all similar steps as in the proof of Theorem 2, we end up with the factorized results in (66). This completes the proof of Theorem 3.
Clearly, the above proof can be readily generalized to the case of boost breaking interactions and dispersions. To address the complication of spatial derivatives, we only need to change the condition of unique minimal cut to the unique cut of minimal drop ∆. It is also straightforward to see that the cut of minimal drop is necessarily an irreducible cut, so the perfect bipartition (left subgraph + right subgraph + cut lines) still applies by Lemma 2.

On-shell factorization with multiple minimal cuts
Now let us consider the most general case where there exists more than one minimal cut in a given nonlocal bipartition. As alluded to several times, things get even more complicated if the minimal cut is not unique for a given bipartition. To appreciate the difficulty in this case, we need to introduce another subtlety associated with internal vertices, namely the δ-constraint in the Mellin space.
δ-constraint from internal vertices. The subtlety is this: While it may be possible to cut all the bulk lines attached to a given internal vertex, it is impossible to get nonlocal signals from all these lines, due to the δ-constraint of an internal vertex in the Mellin space.
To see this point more explicitly, we consider an internal vertex with the time variable $\tau$ and let there be $N$ bulk propagators attached to this vertex. Then, if we cut all these $N$ bulk propagators, we may want to naively apply the cutting rule $D_{ab} \to \mathrm{Re}\,D_{ab}$, which would allow us to remove all the time orderings between $\tau$ and any other time variables. In this case, the $\tau$-integral can be isolated: Now, in the Mellin space, each propagator $D^{(\tilde\nu_i)}$ gives rise to a factor of $(-\tau)^{-2s_{i\bar i}}$. Thus, the time integral yields a δ-function: On the other hand, as we showed in the last section, for all $N$ lines to contribute nonlocal signals, it is important to take the (IR) signal poles. In the current situation, it means that we should choose: where $c_1, \cdots, c_N = \pm 1$. So, the possibility that all these lines together contribute a nonlocal signal with $ic_i\tilde\nu_i \notin \mathbb{R}$ is obviously inconsistent with the δ-function constraint (69). This gives us a sharp example showing that, while we can take a nonlocal cut by making certain lines soft, we cannot get nonlocal signals from all these soft lines. As a result, the cutting rule does not apply.
The δ-constraints add another complication to our analysis. In the partial MB representation, we need to finish all Mellin integrals to recover the ordinary expressions for the graph. As shown in the proof of Theorem 2, we only need to pick up left poles of all Mellin variables, and the only left poles are UV poles from the loop momentum integrals, and the IR poles from the MB representation of the bulk massive propagators. Now, the δ-constraint introduces another possibility: we can carry out integration over some Mellin variables by the δ-constraint, and the resulting expression would flip the side of poles of other Mellin variables. While we can postpone the use of δ-constraint to deal with the side-flipping problem, we cannot avoid the δ-constraint altogether. As we shall see below, the δ-constraint may lead to peculiar signals when there are multiple minimal cuts in a graph.
Hybrid signal from multiple minimal cuts. Now we come back to the discussion of multiple minimal cuts. In this case, our first guess may be that the leading nonlocal signal in the soft limit is the sum of the nonlocal signals corresponding to each of the minimal cuts. However, our experience with explicit examples suggests that there can be additional contributions to the leading nonlocal signal, which can have the same fall-off as the signals from a minimal cut, but cannot be obtained by adding the nonlocal signals from all minimal cuts alone. This is closely related to the δ-constraint discussed above.
At this point, we suggest the readers have a look at the two-point mixing example in Sec. 6.2 before reading on. To see the peculiarity of this example, we note that, in all examples hitherto considered, the nonlocal signal was obtained by taking signal poles for both Mellin variables of all cut lines: Using our intuition of on-shell resonance, we may say that the signal comes from the resonance of both endpoints of a cut propagator. Now, we see that, in the example of the 2-point mixing graph in Fig. 11, one can generate a nonlocal signal by asking the two modes attached to the two cubic vertices to resonate, while the two modes at the two-point mixing vertex are "locked" by the δ-constraint. This yields a nonlocal signal that scales with the same real power as the signal from a single soft propagator, namely $P^0$, but with its imaginary power being $P^{\pm i\tilde\nu_1 \pm i\tilde\nu_2}$, which we call a hybrid signal. So, in this two-point mixing example, we would expect that the leading signals have three terms: The first two are ordinary signals, corresponding to cutting only the left internal line or only the right internal line in Fig. 11, and requiring both modes of a single line to resonate. The last one, namely the hybrid signal, corresponds to the peculiar situation where we cut two lines, but the resulting signal scales as if it were from cutting one line. We expect that this feature also appears in more general loop graphs whenever there are multiple minimal cuts: The full leading signal comes not only from the ordinary signal of each individual minimal cut, but also includes hybrid signals from several one-side resonances. It is actually not clear whether we would get more peculiar signals in more complicated loop graphs, and this is the major reason that we cannot get a simple and neat factorization theorem for the most general loop diagram.
One possible way to see the appearance of the hybrid signal is to think of a loop graph with internal vertices as a soft limit of a bulk-free graph without any internal vertices: We can artificially insert auxiliary external legs with energies $E_i$ to all internal vertices and thus turn them into external vertices. Then, we get a bulk-free graph which appears much simpler. The original graph can then be recovered by taking $E_i \to 0$ for all the auxiliary external legs. Then, one may attempt to think of the hybrid signal as a combination of ordinary signals in the bulk-free graph in the corresponding soft limit. However, the real situation is not as trivial. The reason is that, when we talk about ordinary signals, we always assume that the momenta $P_j$ of the signal-generating bulk propagators are much softer than any other external momenta, where the resonance picture is well applicable. That is, we need $P_j/E_i \ll 1$ for all $i$ and $j$. However, when we turn off the momenta of all auxiliary external legs, the above assumption no longer holds. As a consequence, we need to know how to analytically continue the bulk-free graph from the region $P_j/E_i \ll 1$ to $P_j/E_i \to \infty$. This is a nontrivial task which we leave for a future study.
Absence of hybrid signals in dS covariant graphs. Although the hybrid signals make the analysis of nonlocal signals rather intractable, they are likely absent in dS covariant graphs. In the example of two-point mixing graph in Sec. 6.2, the absence of the hybrid signal in the dS covariant limit is clearly observed. There, we have a simple explanation: a covariant mixing (such as a mass mixing) can always be rotated away by a linear field redefinition. Of course this explanation does not apply directly to arbitrary loop graphs, in which case we have to resort to other arguments.
Fortunately, we can find such an argument from CFT, using our interpretation of nonlocal signals as a two-point function at the future boundary. (See the end of Sec. 2.1, and also [94] for more detailed expositions.) From our proof of Theorem 2, we see that the leading nonlocal signal is obtained by cutting open all soft lines while pinching all hard lines. Here the "pinching" procedure means that we can take all hard lines out of the loop integral as we did around (54). As a result, nonlocal signals in arbitrary loop graphs can also be viewed as the late-time limit of a two-point function. We illustrate this "cut and pinch" procedure in Fig. 9. Then, so long as the bulk graph is dS covariant, the corresponding boundary two-point correlator will be conformally covariant, and thus diagonal in the scaling dimensions: where $\Delta_{1,2}$ denote the scaling dimensions of the boundary two-point function. The crucial point is that bulk dS isometry (or equivalently, the boundary conformal symmetry) demands that $\Delta_1 = \Delta_2$ or $\Delta_1 = \widetilde\Delta_2$, where $\widetilde\Delta_2 = 3 - \Delta_2$ is the shadow counterpart of $\Delta_2$. The latter case with $\Delta_1 = \widetilde\Delta_2$ is irrelevant to us since it makes $G(P)$ independent of $P$ and only produces a contact term $\propto \delta^{(3)}(x)$ in position space.⁶ Thus the only possible nonanalytic behavior must have $\Delta_1 = \Delta_2 \equiv \Delta$. Therefore the hybrid signal with $\tilde\nu_1 \neq \tilde\nu_2$ is precluded since it corresponds to $G(P; \tau_1, \tau_2) \propto (-P\tau_1)^{\pm i\tilde\nu_1}(-P\tau_2)^{\pm i\tilde\nu_2}$ in the late-time limit, inconsistent with the conformal constraint.
Figure 9: Illustration of the absence of hybrid signals in dS covariant graphs. The leading nonlocal signal in any loop diagram can be understood as a two-point function on the late-time boundary. In a dS covariant graph, the resulting two-point function respects the boundary conformal symmetry, and thus must be diagonal in the scaling dimensions at its two endpoints. This precludes all hybrid signals.
On-shell factorization of most general dS covariant graphs. Above we have shown that the main complication from internal vertices with non-unique minimal cuts is the generation of hybrid signals. However, this complication is absent if the graph respects the full dS isometry. Thus, if we restrict ourselves to dS covariant processes, it would be possible to formulate the following theorem for leading nonlocal signals in arbitrary loop graphs: is contributed (probably exclusively) by the sum of signals coming from all minimal cuts: + (other nonlocal signals) + (terms analytic in K).
Here C labels all minimal cuts with respect to the given bipartition, and D denotes the degree of the minimal cuts. The three factors G L(C) c 1 ···c D (K) denote the left subgraph, the right subgraph, and the melon signal generated by the cut C. The term denoted by "other nonlocal signals" is probably subleading in the P → 0 limit.

Discussions
Theorem 4 shows that the leading nonlocal signal of an arbitrary (dS covariant) graph is always generated by one (or several if the minimal cut is not unique) $(D-1)$-loop melon subgraph with $D \geq 1$, which is formed by all the cut lines in the corresponding minimal cut. This can be intuitively understood as a "cut and pinch" process: We first identify a minimal cut that could generate a melon signal, and then pinch all the other lines and loops. There is also a boundary OPE perspective: The cut lines are soft and we can push them to the future boundary of dS spacetime, where the massive modes become boundary operators with scaling dimensions $3/2 \pm i\tilde\nu_\ell$. Then we can apply the OPE at the endpoints of the cut lines, such that they are pinched together and the soft lines become a melon graph. See [94] for more details.
It is interesting to see how we can pinch a loop. In fact, this is related to the UV poles of the Mellin-space loop momentum integral $L(\{k\}; \{s,\bar s\})$ in (29). The simplest example is a bubble integral (namely the 1-loop melon integral): Clearly, the integral $M_1(P)$ possesses a set of UV poles at $s_{1\bar12\bar2} = -n + 3/2$ with $n = 0, 1, \cdots$. If we take the residue of $M_1(P)$ at the leading UV pole $s_{1\bar12\bar2} = 3/2$, we get: This shows that, at the leading UV pole, the residue of the bubble integral reduces to a constant number. In this sense, the bubble integral is "pinched" to a point.
This is in fact a more general statement. Consider a particular layer of an arbitrary loop momentum integral in the Mellin space, which corresponds to a particular sub-loop structure in the original loop graph: where $p_i$ is an arbitrary combination of external momenta and loop momenta other than $q$. In general, $L$ is a function of $\{s_i, \bar s_i\}$ with some poles: When $\{s_i, \bar s_i\}$ take certain values, this integral could diverge in the hard region of $q$, and the divergent part can be obtained by: Here $\Lambda$ is a hard scale: $\Lambda \gg |p_i|$. Thus we can neglect all the $p_i$, and therefore: Now notice that: we can then extract the possible UV divergence of $L$: Therefore, $L$ indeed has a UV pole at $s_{i\bar i} = 3/2$, and the residue is: That is, when we calculate the Mellin integral, we are able to take the UV pole of any loop integral and pinch the loop to a constant.
Now we can understand our theorem (74) with this pinch idea. All the soft lines will be diluted due to the exponential expansion of the spacetime. Obviously we cannot pinch all the loops, otherwise we will not obtain a signal. So we should pinch all the lines and loops but leave one melon subgraph with minimal degree, which corresponds to a minimal cut. If there are multiple minimal cuts, we should sum the signals together since we should sum the residues at all the allowable pole configurations.

Examples
Below we will present some concrete examples of four-point inflation correlators, and explicitly show how our theorems work. For definiteness, we set all the external lines to be the conformal scalar $\varphi$ with $m^2 = 2$, and all the bulk lines to be massive scalars $\sigma_\ell$ in the principal series with different mass parameters $\tilde\nu_\ell > 0$. It is assumed that all fields are coupled directly without any spatial or temporal derivatives, but with arbitrary power dependences $(-\tau_i)^{p_i}$ in the coupling. For simplicity we also assume that the $\tilde\nu_\ell$ take generic values such that $\sum_\ell c_\ell\tilde\nu_\ell \neq 0$ for any $c_\ell \in \{0, \pm 1\}$ with at least one $c_\ell \neq 0$. We will compute the Mellin-space loop integral (29) and analyze its possible pole structures which could generate a nonlocal signal.
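The genericity assumption on the mass parameters is easy to test for a given set of values. Below is a minimal sketch; the function name `is_generic` and the numerical tolerance are choices made here for illustration and do not appear in the text.

```python
from itertools import product

def is_generic(nus, tol=1e-9):
    """True if sum_l c_l * nu_l != 0 for every c_l in {0, +1, -1}, not all zero."""
    for cs in product((-1, 0, 1), repeat=len(nus)):
        if any(cs) and abs(sum(c * nu for c, nu in zip(cs, nus))) < tol:
            return False
    return True

print(is_generic([1.0, 2.3, 4.1]))  # True
print(is_generic([1.0, 2.0, 3.0]))  # False, since 1 + 2 - 3 = 0
```

The check enumerates $3^n - 1$ sign patterns for $n$ masses, which is fine for the small graphs considered in this section.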
Notations. In this section, we use solid lines to denote the massless inflaton, and the external momenta $k_1, \cdots, k_4$ flow inwards. We specify the nonlocal soft limit to be the s-channel squeezed limit, namely $k_s \equiv k_1 + k_2 = -k_3 - k_4$ and $k_s \to 0$. Each internal line $\ell$ is labeled by its mass parameter $\tilde\nu_\ell$. In the Mellin space, we assign two Mellin variables $s_\ell, \bar s_\ell$ to the two endpoints of the bulk line $\ell$, respectively. We use an arrow to clarify our choice: The arrow always points from $s_\ell$ to $\bar s_\ell$. The independent loop momenta $q_i$ will be explicitly labeled in graphs, flowing in the arrowed direction.

Melon graph
As a first example, let us consider the L-loop melon graph as shown in Fig. 10. This graph is bulk-free, and there is only one nonlocal cut, namely cutting all the (L + 1) massive propagators simultaneously.
Let us write down the expression for the melon graph. We neglect the constant factor $-\tau_f/(2k)$ in the bulk-to-boundary propagator (23), and absorb the factor $-\tau$ into the interaction $(-\tau_i)^{p_i}$; then the melon graph is expressed as the following: Here $M^{(L)}(k_s; \tau_1, \tau_2)$ is the $L$-loop integral of the $L + 1$ massive propagators: Then we apply the partial MB representation to extract the nonlocal signal from the Mellin-space loop integral (29), and the procedure is completely parallel to the proof of Theorem 2. The result is given by (44), and the nonlocal signal is: As we can see, the nonanalyticity in $k_s$ is fully encoded in the melon signal $M_{c_1\cdots c_{L+1}}(k_s)$, whose explicit form is given by (45). We also find that the signal contains $2^L$ (generally different) frequencies with respect to $\log k_s$, given by $|\omega_{c_1\cdots c_{L+1}}|$: where $c_1, \cdots, c_{L+1}$ can take values from $\pm$. Each choice gives rise to a value of the frequency, modulo the overall sign flip of $c_\ell$.
Figure 10: The 4-point L-loop melon graph.
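The counting of $2^L$ frequencies can be checked by direct enumeration. The sketch below assumes that each frequency is $|\sum_\ell c_\ell\tilde\nu_\ell|$ up to an overall normalization fixed by (45), and removes the overall sign flip by fixing the first sign to $+1$. The function name and the sample masses are illustrative only.

```python
from itertools import product

def melon_frequencies(nus):
    """Distinct |sum_l c_l nu_l| over c_l = +-1, modulo the overall sign flip."""
    first, rest = nus[0], nus[1:]
    freqs = []
    for cs in product((+1, -1), repeat=len(rest)):
        # Fixing c_1 = +1 quotients out the overall flip c -> -c.
        freqs.append(abs(first + sum(c * nu for c, nu in zip(cs, rest))))
    return sorted(freqs)

# A 2-loop melon has L + 1 = 3 massive lines and hence 2**2 = 4 frequencies.
nus = [1.0, 2.3, 4.1]
freqs = melon_frequencies(nus)
print(len(freqs), [round(f, 6) for f in freqs])
```

For generic masses all $2^L$ values are distinct; for degenerate masses some of them coincide.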

Two-point mixing
From now on, let us consider graphs that contain internal vertices. The simplest case is a tree-level two-point mixing, shown in Fig. 11. Given the subtleties associated with internal vertices, we will investigate this example more closely than the previous one. The purpose of this example is to explicitly show the appearance of a hybrid signal for non-dS covariant couplings and the absence of this signal for dS covariant couplings.
Since we are looking for nonlocal signals, let us assume that we can cut the graph as usual. The expression for this graph is the following: As always, we neglected unimportant prefactors from the bulk-to-boundary propagators. We first cut the $\tilde\nu_1$ propagator. It is not a priori clear that we can take a symmetric cut $D \Rightarrow \mathrm{Re}\,D$. So, we take a safer cut: By a cut we mean removing the blue term. Similarly, but here we do not remove any term.
Figure 11: The 4-point correlator with tree-level s-channel exchange of two mixed massive modes. The mixing is generally not dS covariant.
Under this cut, the integral of $\tau_1$ and $\tau_2$ is factorized. The original integral now becomes two parts: A (fully) factorized (F) part, and a (partially) nested (N) part which involves the integral of $\theta(\tau_3 - \tau_2)$. For example, we can write $G_{+\pm\pm} = G$ … We use the MB representation for the massive propagators: Here and below, we use $\int_s \cdots \equiv \int \prod_{i=1}^{2} \frac{ds_i}{2\pi i}\frac{d\bar s_i}{2\pi i}$, and, The other four branches can be calculated similarly. It is easy to check that the time integral will only give right poles of the Mellin variables. The next step is to integrate out these Mellin variables. Since the integrand of (93) is proportional to $k_s^{-2s_{1\bar12\bar2}}$ and $k_s \to 0$ in the squeezed limit, we should sum residues at left poles, and the only left poles are the IR poles. The leading poles are the following: However, the factor $e^{-i\pi(s_2-\bar s_2)} - e^{+i\pi(s_2-\bar s_2)}$ in the integrand of (93) requires that $c_3 = -c_4$, otherwise $G^{(N)}_{+++}$ would vanish. Furthermore, to generate a signal with nonzero frequency, we should take $c_1 = c_2$ such that $G^{(N)}$ … . This is the case that corresponds to a minimal cut on Line 1. Of course there is a similar signal $\propto k_s^{\pm 2i\tilde\nu_2}$ corresponding to a minimal cut on Line 2. These signals are ordinary signals of Fig. 11.
Now let us look at the factorized part. In this part, we have: … $e^{+ik_{12}\tau_1 + ik_{34}\tau_2}$, … The other four branches can be directly obtained by taking the complex conjugate. In total, Now, using the MB representation for the massive propagators, we get: The three time integrals give: We can first use the δ-function to remove $\bar s_2$, then we find the integrand of (99) is proportional to $k_s^{-2s_{12}}$. For the other three Mellin variables, we take the following left poles to collect leading terms proportional to $k_s^{i(\tilde\nu_1+\tilde\nu_2)}$: Together with the complex conjugate, the final result is: This is a hybrid signal as discussed in Sec. 5.3. The summation of residues at IR poles is given by
Figure 12: The 4-point double-bubble graph.
When $p_3 = -4$ (covariant mass mixing), the above dressed hypergeometric function is simplified and is manifestly real. It can be checked that the above $\mathrm{Im}\,{}_2\mathcal{F}_1 = 0$ for all integer $p_3 \leq -4$. Therefore, we conclude that there can be a hybrid signal for non-covariant mixing but such a signal does not exist for covariant mass mixing, in agreement with the general analysis in Sec. 5.3.

Double-bubble graph
The third example is the double-bubble graph in Fig. 12, which is like a loop version of the tree-level 2-point mixing graph. Below we mainly focus on the loop integral under the partial MB representation, but we also keep in mind that the time integral at the internal vertices could give extra δ-functions. The loop integral in Mellin space is: The two loop momentum integrals are factorized, and the integral can be easily computed: There are two sets of left poles from this loop integral, coming from the factors $\Gamma(s_{1\bar12\bar2} - 3/2)$ and $\Gamma(s_{3\bar34\bar4} - 3/2)$, respectively. In fact, they correspond to the UV poles of the left and right loops. If we take one of the UV poles, say $s_{3\bar34\bar4} = 3/2$, then the leading contribution becomes $k_s^{3-2s_{1\bar12\bar2}}$, and we can then set $s_1, \bar s_1, s_2, \bar s_2$ again at the IR poles: and the loop integral $L$ is proportional to $k_s^{3+2ic_1\tilde\nu_1+2ic_2\tilde\nu_2}$. This corresponds to the minimal cut of the left loop, namely cutting Line 1 and Line 2. Similarly, we can cut the right loop, namely Line 3 and Line 4, and take the UV pole $s_{1\bar12\bar2} = 3/2$; then we will get a signal proportional to $k_s^{3+2ic_1\tilde\nu_3+2ic_2\tilde\nu_4}$. These are all ordinary signals. Of course, we cannot take both UV poles for the purpose of generating a signal, since the result will simply be a constant.
The time integral at the internal vertex could give another type of signal. For simplicity, we take a direct coupling for the internal vertex with an arbitrary power $p_3$, and then the time integral gives:
Figure 13: The 4-point double-triangle graph.
We can use this δ-function to remove one Mellin variable. Then, the integrand of L becomes proportional to $k_s^{-1-p_3-2s_{1234}}$. To integrate out the remaining Mellin variables, we take residues of the integrand at the leading IR poles:
We find that there is a signal of frequency $|c_1\tilde\nu_1 + c_2\tilde\nu_2 + c_3\tilde\nu_3 + c_4\tilde\nu_4|$, which is generated by the resonances of the four modes at the two external vertices. This is exactly a hybrid signal of the type discussed in Sec. 5.3. However, for a dS-covariant coupling, namely $p_3 = -4$, this signal should vanish, just as in the case of a covariant 2-point mass mixing.$^7$

Double-triangle graph
The next example is the double-triangle graph in Fig. 13. We can write down the loop integral in Mellin space:
First consider the $\mathbf{q}_2$ integral:
Footnote 7: One can see this point more clearly by taking spectral decompositions of the two loops, as was done in [95].
Figure 14: The 4-point (planar) double-box graph.
The integral allows various series expansions, depending on the shape of the triangle formed by the vectors $\mathbf{k}_s$ and $\mathbf{q}_1$; see App. B. In particular,
Then we see that the $k_s$ power becomes k . This suggests that there is a hybrid signal from the four modes attached to the "outer" two vertices, with frequencies $|c_1\tilde\nu_1 + c_2\tilde\nu_2 + c_3\tilde\nu_3 + c_4\tilde\nu_4|$. Similar to the previous cases, this signal should vanish for dS-covariant couplings, namely when $p_3 = p_4 = -4$.
Depending on the shapes of the vectors $\mathbf{k}_s$, $\mathbf{q}_1$ and $\mathbf{k}_3$, L can be expanded as different series. Here we focus on the hard region:
Figure 15: The 4-point non-planar double-bubble graph.

Non-planar double-box graph
Now let us consider the non-planar double-box graph in Fig. 15. The loop integral is:
Figure 16: The 4-point bubble-on-bubble graph.

Bubble-on-bubble graph
Our final example is the bubble-on-bubble graph in Fig. 16. The loop integral can be calculated explicitly:
We can first take the UV pole of the small loop at $s_{4\bar 4 5\bar 5} = 3/2$; the integrand of L is then proportional to $k_s^{3-2s_{1\bar 1 2\bar 2 3\bar 3}}$. Next we should specify appropriate poles for $s_1, \bar s_1, s_2, \bar s_2, s_3, \bar s_3$. As before, the factor $\Gamma(s_{1\bar 1})$ in the denominator requires that we select the leading IR poles $s_1 = \bar s_1 = \mp\mathrm{i}c_1\tilde\nu_1$. Then, we can choose the leading IR poles of $s_2, \bar s_2, s_3, \bar s_3$. For example, we can take the leading poles $s_2 = \bar s_2 = \mp\mathrm{i}c_2\tilde\nu_2$ and $s_3 = -\bar s_3 = \mp\mathrm{i}c_3\tilde\nu_3$ (notice that there is no factor $\Gamma(s_{3\bar 3})$ in the denominator), which gives a signal proportional to $k_s^{3+2\mathrm{i}c_1\tilde\nu_1+2\mathrm{i}c_2\tilde\nu_2}$. This corresponds to cutting Line 1 and Line 2. Similarly, we can cut Line 1 and Line 3 and obtain a signal proportional to $k_s^{3+2\mathrm{i}c_1\tilde\nu_1+2\mathrm{i}c_3\tilde\nu_3}$.
However, if we try to cut Line 2 and Line 3 simultaneously, there would be an additional δ-constraint coming from the integrals over $\tau_3$ and $\tau_4$. For example, if we consider the $G_{+-+-}$ channel, then the time integrals give: $\int\mathrm{d}\tau_4\,(-\tau_4)^{p_4+3\times 3/2-2s_{345}} = 2\pi\delta\big(\mathrm{i}(p_4+11/2-2s_{345})\big)$, and thus the integrand is now proportional to $k_s^{-5-p_3-p_4-2s_{1\bar 1 23}}$, which indicates signals of frequencies $|2c_1\tilde\nu_1 + c_2\tilde\nu_2 + c_3\tilde\nu_3|$ in the non-dS-covariant case, corresponding to the oscillations of the four outer modes.
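The δ-function quoted above for the $\tau_4$ integral can be checked in two lines. The following is a sketch, assuming that the integral runs over $\tau_4 \in (-\infty, 0)$, that the integrand at this internal vertex is a pure power of $-\tau_4$ as written in the text, and that the exponent $\alpha + 1$ is purely imaginary on the Mellin contour, so that the integral is understood as a distribution. The symbols $\alpha$ and $u$ are introduced here only for illustration.

```latex
\begin{aligned}
\int_{-\infty}^{0}\mathrm{d}\tau_4\,(-\tau_4)^{\alpha}
  &= \int_{-\infty}^{+\infty}\mathrm{d}u\; e^{(\alpha+1)u}
     \qquad \bigl(-\tau_4 = e^{u}\bigr) \\
  &= 2\pi\,\delta\bigl(\mathrm{i}(\alpha+1)\bigr),
     \qquad \alpha \equiv p_4 + 3\times\tfrac{3}{2} - 2s_{345},
     \quad \alpha + 1 = p_4 + \tfrac{11}{2} - 2s_{345} .
\end{aligned}
```

The second step uses $\int\mathrm{d}u\,e^{\mathrm{i}\omega u} = 2\pi\delta(\omega)$ with $\omega = -\mathrm{i}(\alpha+1)$, together with the evenness of the δ-function. The argument reproduces the combination $p_4 + 11/2 - 2s_{345}$ that appears in the text.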

Conclusion and Outlook
Inflation correlators are central observables in Cosmological Collider physics, playing a role similar to that of scattering amplitudes and the S-matrix for scattering experiments in Minkowski spacetime. It is thus crucial to have a good understanding of the analytic structure of inflation correlators. Among all kinds of singularities, the nonlocal signal produced by massive exchanges is special: On the one hand, it gives rise to a characteristic oscillatory signal in the physical region and is thus the main observable in CC physics. On the other hand, it produces a branch cut starting from the origin in the complex plane of the momentum transfer, a feature that has no flat-spacetime counterpart. For these reasons, studying the analytic structure of the nonlocal signal is useful and important for both CC phenomenology and general dS QFTs.
In this work, we generalize our previous result in [94] to all loop orders: With the help of the partial Mellin-Barnes representation, we state and prove a factorization theorem (74), which can be used to detect and identify all possible nonlocal signals in an arbitrary graph. At 1-loop order, a nonlocal signal can appear when two internal lines become soft simultaneously; we can then cut these two soft lines, pinch their endpoints, and get a bubble signal. Similarly, the nonlocal signal of an arbitrary graph is associated with a nonlocal cut, and the leading signal in the nonlocal soft limit P → 0 corresponds to the minimal cut. The signal appears when all the cut lines become soft simultaneously; we can then pinch their endpoints and get a melon subgraph, and the nonanalyticity in P is manifest in its melon signal.
The partial Mellin-Barnes representation has proven suitable and useful for studying inflation correlators, and there are still many interesting targets to explore with this tool. Under this representation, the graph breaks into different pieces as in (26), and both the time integrals and the loop momentum integrals are much simplified. Moreover, the external momenta only appear in the loop integral L, while the external energies only appear in the time integral T (provided there is no degeneracy between momenta and energies). Therefore, it seems that considering the loop integral L is enough to analyze the nonlocal signals. However, this is not exactly the case, since the time integral T can give δ-functions of Mellin variables, which result in the hybrid signals briefly discussed in Sec. 5.3 and Sec. 6.2. Although such signals are absent in bulk-free graphs and forbidden by the conformal symmetry in graphs with dS-covariant couplings, it will be interesting to have a detailed study of such hybrid signals.
Also, as mentioned in the Introduction, there are other kinds of nonanalyticity, including the local signal associated with the partial external energy sum. Since the energy only appears in time integrals, we can derive similar detection rules for such local signals using partial MB representation. This requires a careful study of the Mellin-space time integral T in (27), and we will pursue this topic in another work.
Branch cuts associated with CC signals are in a sense unique to dS amplitudes: they reflect the particle production and resonances in an inflationary universe, and are therefore absent in the corresponding flat-space amplitudes. It is well known that we have full control of a meromorphic function if we know its behavior at all the singularities (for example, residues at simple poles). The case of functions with branch cuts is similar: once we know the discontinuity along every branch cut, we can restore the function using the dispersion relation. We have seen that the massive inflation correlators have a branch cut starting from the origin of the complex plane of certain momentum transfers, and we are able to calculate the nonanalyticity using our factorization theorem. It is thus natural to ask whether we can reconstruct the full correlator using a dispersion relation. This is not as trivial as it sounds: our result is only valid around the squeezed limit, whereas a dispersion integral along the full branch cut requires a more comprehensive understanding of the nonanalyticity away from the origin. Meanwhile, we should also identify all the other singular structures on the complex plane for the dispersion integrals to work. We also leave this question to future work.
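The statement that a function can be restored from its discontinuity along a branch cut can be illustrated numerically. The following is a minimal sketch on a toy function, not on an inflation correlator: it assumes $f(z) = 1/\sqrt{1-z}$, whose only singularity is a branch cut on $x > 1$ and which decays at infinity, so that $f(z) = \frac{1}{2\pi\mathrm{i}}\int_1^\infty \mathrm{d}x\,\mathrm{Disc}f(x)/(x-z)$ holds without subtractions. The function names `disc` and `reconstruct` and the change of variable $x = 1 + \tan^2\theta$ are choices made here for illustration.

```python
import cmath
import math


def disc(x):
    """Discontinuity f(x + i0) - f(x - i0) of the toy function
    f(z) = 1/sqrt(1 - z) across its branch cut x > 1."""
    return 2j / math.sqrt(x - 1.0)


def reconstruct(z, n=4000):
    """Rebuild f(z) from its discontinuity via an unsubtracted
    dispersion integral over the cut, evaluated with the midpoint rule.

    The substitution x = 1 + tan(theta)^2 maps the cut onto
    theta in (0, pi/2) and makes the integrand smooth and bounded.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h  # midpoints avoid both endpoints
        t = math.tan(theta)
        x = 1.0 + t * t
        dx_dtheta = 2.0 * t / math.cos(theta) ** 2
        total += disc(x) / (x - z) * dx_dtheta
    return total * h / (2j * math.pi)


if __name__ == "__main__":
    for z in (-3.0, 0.0, 0.5 + 2.0j):
        exact = 1.0 / cmath.sqrt(1.0 - z)
        print(z, reconstruct(z), exact)
```

In this toy case the discontinuity is known along the whole cut. For the correlators discussed above, the factorization theorem only controls the discontinuity near the origin, which is exactly the obstacle described in the text.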
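We also use the dressed version of generalized hypergeometric functions, defined in the following way when the series converges:
$${}_p\mathcal{F}_q\left[\begin{matrix} a_1,\cdots,a_p \\ b_1,\cdots,b_q \end{matrix}\,\middle|\, z\right] = \sum_{n=0}^{\infty} \Gamma\left[\begin{matrix} a_1+n,\cdots,a_p+n \\ b_1+n,\cdots,b_q+n \end{matrix}\right] \frac{z^n}{n!}.$$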
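The series definition above can be evaluated directly. The following is a minimal sketch, assuming real parameters (so that the standard-library `math.gamma` suffices) and $z$ inside the region where the series converges; the complex parameters that appear in the correlators would need a complex Γ function, which is not attempted here. The function name `dressed_pFq` and the truncation parameter `n_terms` are hypothetical. The bracketed Γ symbol is read as the product of Γ functions of the upper entries divided by the product of Γ functions of the lower entries.

```python
import math


def dressed_pFq(a, b, z, n_terms=200):
    """Truncated series for the dressed hypergeometric function:
    sum_n [prod_i Gamma(a_i + n) / prod_j Gamma(b_j + n)] * z**n / n!.

    Real parameters only. Successive terms are generated by a ratio
    recurrence so that Gamma is evaluated once, at n = 0.
    """
    # n = 0 term: this is the "dressing" factor multiplying the usual pFq.
    term = math.prod(math.gamma(x) for x in a) / math.prod(
        math.gamma(x) for x in b
    )
    total = 0.0
    for n in range(n_terms):
        total += term
        # Gamma(x + n + 1) = (x + n) * Gamma(x + n), and (n + 1)! = (n + 1) * n!
        ratio = math.prod(x + n for x in a) / math.prod(x + n for x in b)
        term *= ratio * z / (n + 1)
    return total


if __name__ == "__main__":
    # Dressed 2F1[1, 1; 2 | z] equals -log(1 - z) / z, since the dressing factor is 1.
    print(dressed_pFq([1.0, 1.0], [2.0], 0.5), -math.log(0.5) / 0.5)
    # Dressed 1F1[1; 1 | z] equals exp(z).
    print(dressed_pFq([1.0], [1.0], 1.0), math.e)
```

At $z = 0$ the function reduces to the dressing factor alone, which is a quick way to see how the dressed function differs from the conventionally normalized one.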
The definition can be extended beyond the radius of convergence by analytic continuation. A central formula for our partial MB representation of massive inflation correlators is the following MB representation of the Hankel functions:
Finally, we collect frequently used symbols in Table 1.

B Useful Integrals
Melon integral. To derive the melon signal $M_{c_1\cdots c_D}(P)$ (45), we need to compute the Mellin-space melon integral (55), and the result is given in (56). Below we show the calculation explicitly. The basic tool is the bubble integral:
We start with the definition:
We first consider the integral over $\mathbf{q}_1$, which is a bubble integral: (143)
Then we can finish the integral over $\mathbf{q}_2$, which is also a bubble integral:
We see that the two Γ functions in red cancel between (143) and (144), and so do those in blue. We can then repeat the procedure and finish the full integral, and the result is exactly (56):
Expansion of loop integrals. For more complicated Mellin-space loop momentum integrals, it is very difficult to compute the closed-form result. However, we can calculate the leading contribution in special hierarchical limits of the momentum configuration. For instance, we consider the triangle integral:
This is the loop integral of a triangle with external momenta $\mathbf{k}_s$, $\mathbf{k}_1$ and $\mathbf{k}_2 = -\mathbf{k}_1 - \mathbf{k}_s$. Furthermore, we can consider the squeezed limit $k_1 \gg k_s$. Then the leading result is:
The detailed derivation can be found in Appendix B of [94]. Here we provide a much simpler proof which can be applied to general situations, including the 2-loop examples in Sec. 6. In the limit $k_s \to 0$, there are two contributions. One piece is analytic in $k_s$, so we can set $k_s \to 0$ in the integrand of (146). Equivalently, we can expand the integrand and take the leading contribution in the region $q \gg k_s$:
which becomes a bubble integral and gives the first line of (147). The other piece is nonanalytic in $k_s$. This piece appears in the region $q \ll k_1$, so the factor $|\mathbf{k}_1 - \mathbf{q}|^{-2s_{3\bar 3}}$ becomes $|\mathbf{k}_1|^{-2s_{3\bar 3}}$, and we obtain:
which is again a bubble integral and gives the second line of (147).