An obstacle to sub-AdS holography for SYK-like models

We argue that"stringy"effects in a putative gravity-dual picture for SYK-like models are related to the branching time, a kinetic coefficient defined in terms of the retarded kernel. A bound on the branching time is established assuming that the leading diagrams are ladders with thin rungs. Thus, such models are unlikely candidates for sub-AdS holography. In the weak coupling limit, we derive a relation between the branching time, the Lyapunov exponent, and the quasiparticle lifetime using two different approximations.


Introduction
Out-of-time-order correlators (OTOCs) have attracted significant attention for two reasons. First, they describe quantum chaos in a way comparable to classical chaos (at least, for systems with a large parameter N , where the early-time OTOCs are characterized by a Lyapunov exponent). The second reason has to do with black holes and gravity. In this setting, OTOCs serve as a unique probe of high energy physics, namely, gravitational scattering between incoming and outgoing particles at the event horizon. The formal theory of such scattering was initiated by Dray and 't Hooft [1] and further developed by 't Hooft [2][3][4], but its physical meaning was hard to grasp because there seemed to be no observable effects. Only later it was realized that the relevant quantity is an OTOC and that it provides a basis for comparison between black holes and other systems. Maximal chaos, namely, the relation κ = 2πT between the Lyapunov exponent and temperature, is a hallmark of gravity [5], whereas small corrections to κ may be attributed to stringy effects [6] or some other form of nonlocality. (The term "maximal chaos" is due to the inequality κ 2πT , which holds under very general assumptions [7].) The Sachdev-Ye-Kitaev model realizing maximal chaos [8,9] has contributed to the concept of quantum gravity as a very general phenomenon, something like "quantum thermodynamics". We would like to move toward a more concrete understanding of gravity. The SYK model is relatively simple but has a very rough bulk-dual picture, as compared with the celebrated duality between certain supersymmetric conformal theories and superstrings in anti-de Sitter spaces [10]. The main goal of this paper is to figure what exactly is missing. An abstract answer is known: the SYK model lacks a gap in the conformal dimension spectrum, which is the standard condition for sub-AdS locality [11]. We will try to give a more intuitive explanation, which could help to dissect the problem and find a way around.
The rest of the introduction is devoted to the comparison between bulk-local and bulknonlocal effects in the context of the SYK model. In the main part of the paper, we show that the nonlocal effects cannot be diminished within a more general setting. This should provide useful guidance for the future search of holographic models.
In the discussion of holography for the SYK model, we will focus on the soft mode [12][13][14]. In its basic form, the holographic correspondence is between the Schwarzian action I Sch = −N α S J −1 1/T 0 Sch e iϕ(τ ) , τ dτ on the boundary and Jackiw-Teitelboim (JT) gravity in the bulk [15][16][17]. The latter has the following action: where the dilaton field Φ is normalized is such a way that its extremal value in an on-shell configuration equals the entropy. (The zero-temperature entropy, which can be represented by a topological term, is not included.) However, this correspondence is too simple because the Schwarzian action is already local. A more interesting form of holography would be one that provides a bulk-local description of some nonlocal boundary physics. Let us mention another, more compelling reason to regard the JT gravity degenerate. It has to do with Dray-'t Hooft "shock waves". In the JT theory, a gravitational shock can be eliminated by an SL(2, R) coordinate transformation on one of the two half-spaces (say, on the future of the shock), such that the only remaining effect is a boundary shift. As a consequence, the boundary representation of the shock generator is local; for example, a shock at the past horizon is represented by L −1 = e κt ∂ t . This operator is applied to one side of the thermofield double. In the low-temperature limit of the SYK model, the thermofield double is described by the Wightman function G W ; its perturbation by the shock is given by the eigenfunction of the retarded kernel, Υ R = L  [14,18]. The shock generator is more physical than the effective action because it is an on-shell object. We expect it to be well-defined (but not necessarily boundary-local) for all maximally chaotic systems. In fact, it is boundary-nonlocal for AdS black holes in 3 + 1 dimensions, and it would be very desirable to find a concrete quantum model with this property.
Unlike with microscopic models, boundary nonlocality is easily achieved in the dilaton gravity setting. To break the degeneracy of the JT theory, let us perturb the dilaton potential with a quadratic term: Its holographic dual is present in the SYK model as a correction to the Schwarzian action [14]. We will now make use of some formulas from the cited paper to quantify the corresponding physical effects, which we call bulk-local. Then we will compare them with the correction to the Lyapunov exponent, which is a measure of bulk nonlocality. This may seem too technical for an introduction, but all cumbersome factors will magically cancel, resulting in a simple figure of merit. The most important manifestation of the quadratic term is in thermodynamics. So let us consider static, i.e. rotationally symmetric, on-shell field configurations. Recall that the dilaton potential Φ at the center is equal to the entropy (up to a constant term). Furthermore, −U (Φ) is the temperature (up to an arbitrary factor) [14,19]. Thus, the entropy as a function of temperature is where c is arbitrary. On the SYK side, we have the following expansion in powers of N , where the extensive part is further expanded in T /J [20,21,14]: Here E 0 is the ground state energy, S is the zero-temperature entropy per Majorana site, α S is the coefficient in the Schwarzian action, and γ is the coefficient in the correction to it. Neglecting the O(N 0 ) term, we obtain the entropy in the thermodynamic limit: The comparison between equations (3) and (5) gives the value of parameter a in the dilaton potential: where we have used the expressions for α S and γ from table 1 in Ref. [14]. The notation k c (h) stands for the eigenvalue of the conformal kernel; k c (2) is its derivative at h = 2.
We are now in a position to determine the relative strengths of bulk-local and bulk-nonlocal corrections to the JT theory. To characterize the former, we consider the two terms in the specific heat: Denoting the second (relatively small) term by −δC, we get: This should be compared with the finite-temperature correction to the Lyapunov exponent, κ = 2πT − δκ. Taking the value of δκ from [13] or [18], we get: This is the promised figure of merit. It would be interesting to know how general this equation is, beyond the SYK setting. In any case, a holographic model would be one for which the left-hand side of equation (9) is small. For the SYK model, we have and hence, −k c (2) k R (−1) > 1. Looking for holography among slightly more general models, one could try to either decrease −k c (2) or increase k R (−1). We will see that the second recipe does not work because for a large class of SYK-like models. The number is called "branching time" [18]; its definition and use do not require maximal chaos.

Outline of the paper
The remainder of this paper is organized as follows. In section 2, we review the kinetic equation for OTOCs with general rung function and establish our conventions. In section 2.3, we review the concept of branching time. In section 3, we present the main result of this paper, the inequality (11). It is derived by relating the branching time to the winding speed of the phase of the Green function. The required upper bound for the winding speed is proved in appendix B. In section 4, we investigate the branching time for weakly coupled models and discuss its relation to the quasi-particle lifetime and the Lyapunov exponent.

Preliminaries
We adopt the notation of Ref. [18], namely, set β = 2π and denote the connected out-of-timeorder correlator by OTOC(t 1 , t 2 , t 3 , t 4 ). More explicitly, where θ j = τ j + it j (j = 1, 2, 3, 4) are complex variables with their real parts (i.e. imaginary times) fixed as follows: The − and + signs in (12) are for bosons and fermions respectively. Following [14], we assume the single mode ansatz within the time window where the scrambling time t scr is the time scale at which non-linear effects appear. For example, in the low temperature limit of the SYK model, C ∼ N βJ and t scr ≈ β ln C. Finally, Υ R X,Y (t) and Υ A X,Y (t) denote the retarded and advanced vertex functions, respectively.
Conventions. In this paper, we use the condensed matter convention in defining Green functions; for example, the imaginary-time and the retarded/advanced Green functions for Majorana operators are defined as follows: Here the factors −1 and ±i are chosen such that the bare Green functions in the frequency domain 1 have the standard form, G 0 (ω) = ω −1 . Let us also remind the reader that G R (ω) and G A (ω) are complex conjugates of each other and that they can be analytically continued to the upper and lower complex half-planes, respectively. These analytic continuations have the property that G R (z) = G A (z * ) * .

Rung function in SYK-like models
Connected four-point functions, including OTOCs, are given by sums of ladder diagrams. The function OTOC(t 1 , t 2 , t 3 , t 4 ) with fixed t 3 , t 4 satisfies a kinetic equation [9,13,22,18], with retarded kernel expressed as follows Here R, represented by the vertical wavy line and called the rung function, depends on the exact form of interactions in the model. In the most general case, a "rung" is a two-particle irreducible diagram that involves four times at its corners. However, we consider the class of models where R depends only on two times (or rather, their difference t 34 = t 3 − t 4 ) and refer to this assumption as the thin rung approximation. For example, in the SYK model, the rung function is a product of Wightman functions, multiplied by the second moment of the random coupling: (The dotted line, representing the averaging over disorder, contributes the factor (q − 1)!J 2 .) With our choice of imaginary time shifts (13), the Wightman function is related to the imaginarytime Green function as G W (t) = G(it + π). In the generalized Keldysh formalism (see Appendix C), the Wightman function is, essentially, the Keldysh Green function between two different contour folds (corresponding to the upper and lower rails in ladder diagrams), G W = − i 2 G K 21 . Meanwhile, the rung function can be expressed as the variational derivative of the Keldysh self-energy with respect to the Keldysh Green function, namely, R = δΣ K 21 δG K 21 .
One more thing the rung function is relevant to is the definition of the inner product of vertex functions, which has been used in the ladder identity [18].

Kinetic equation in the frequency domain
Following [18], we define a variant of the kernel utilizing the time translation symmetry: We denote its largest eigenvalue by k R (α) and the corresponding eigenvector by Υ R α (t), i.e.
In this notation, finding the Lyapunov exponent κ amounts to solving the equation k R (−κ) = 1. The retarded vertex function is given by the corresponding eigenvector Υ R (t) = Υ R −κ (t). Similarly, the advanced vertex function Υ A = Υ A −κ is the corresponding eigenvector of the operator K A α that is adjoint to K R α with respect to the inner product (20). For the purpose of this paper, it is useful to rewrite (22) with α = −κ in the frequency representation, where Υ R (ω) is the Fourier transform of Υ R (t), and K R −κ (ω, ω ) = K R −κ (t, t )e i(ωt−ω t ) dt dt . Combining (18) and (21), we have the following explicit expression 2 For later convenience, we define the W function as follows where we have used the property G R ω + i κ(µ) . Note that the equal sign in (25) is only possible on the real axis 3 , i.e. W > 0 in the upper half-plane.
In the Keldysh formalism, W can be understood as the variation of the Keldysh Green function with respect to the Keldysh self-energy, W = δG K 21 δΣ K

21
. In general, it depends on two variables, the center-of-mass frequency iκ and the relative frequency ω. However, in the current problem, the center-of-mass frequency is set to be imaginary as we assume exponential growth, and the relative frequency is kept real. So we recast the two variables into a complex argument, ω + i κ 2 . Diagrammatically, the kinetic equation (23) can be represented as follows Here the blue curve represents the vertex function Υ R , and the frequency ω in the loop should be integrated over. 2 where in the second step, we change the integration variables (s, t, t ) → s+ t−t 2 , −s+ t−t 2 , t with Jacobian 1 and reorganize ωt − ω t + iκs as ω 3 because the imaginary part of the retarded Green function is always negative in the upper half-plane (c.f. Eq. (92)). On the other hand, G R can have zeros on the real axis.
To summarize, we have obtained the equation for the Lyapunov exponent κ and the retarded vertex function Υ R in the frequency domain,

Branching time and a generating function
The branching time t B is defined in Ref. [18] as which measures the average rung separation. Physically, the branching time characterizes the sensitivity of the Lyapunov exponent to perturbations of the system. Let us imagine some theory with the retarded kernel K R , eigenvalue function k R (α), and Lyapunov exponent κ. Now we perturb the theory, which changes the retarded kernel, . The corresponding first-order shift δκ of the Lyapunov exponent can be found as follows: Now, we will compute the branching time by a generating function method. Within the thin rung approximation, the OTOC is given by a sum of diagrams with n rungs (denoted by F n ), We may interpret F n / OTOC as the "probability" 4 for a n-rung ladder to appear in the OTOC, and define the average number of rungs as .
To proceed, let us introduce a generating function with a parameter µ representing the chemical potential for rungs, then the average number of rungs is the logarithmic derivative of the generating function, namely, A useful observation is that the generating function Z(µ, t 1 , t 2 , t 3 , t 4 ) can be determined by the same kinetic equation approach, with a weighted kernel e µ K R : This suggests Z(µ, t 1 , t 2 , t 3 , t 4 ) should have a similar form to the OTOC, where C(µ) and Υ R/A (t, µ) are smooth functions near µ = 0. Plugging the last equation into (33), we get The number κ(µ) is determined by solving the eigenvalue equation e µ k R (−κ(µ)) = 1. Differentiating it with respect to µ, we find Practically speaking, this relation provides an alternative route to the branching time. The branching time is the change in the Lyapunov exponent when a weight e µ is added to the retarded kernel K R . This method will be applied in following sections to prove a bound for the branching time and to estimate the branching time of weakly interacting systems. We would also like to comment on the meaning of equation (36). Together with (37), it gives which is consistent with the interpretation of t B as the average rung separation. Note that (38) is correct only for large time differences such that and therefore, the distribution of n approaches a delta function.

A bound on the branching time
In this section, we present the main result of this paper: we show that within the thin rung approximation, the following bound on the branching time holds for fermionic models: We will prove it using the generating function trick with parameter µ, and express the branching time as the µ-derivative of κ via (37). Our proof involves two steps: is the phase of the retarded Green function on the upper half plane. This step is completed by a Hellmann-Feynman type argument, which will be shown momentarily.
This second part is common to fermionic Green functions (i.e. the result does not rely on the kinetic equation or the thin rung approximation). For this reason, we have put the proof of the bound in a separate section as Appendix B.
We start with the deformed kernel e µ K R and the following equation (i.e. introduce a parameter µ to Eq. (27)) Since W > 0 away from the real axis, we can equivalently write Using the property R(ω) = R(−ω) * (i.e. the reality of the rung function in the time representation, R(t) = R(t) * -see Appendix D for a proof), we get the following formula for the complex conjugate of (43): We have also used the fact that W is real. A neater way to write these formulas is to use bra Υ R | and ket | Υ R with the Hermitian inner product given by the integral over dω 2π . Then R may be regarded as a Hermitian operator with the matrix elements ω| Consequently, we have e µ Υ R | R| Υ R = Υ R |W −1 | Υ R . Next, we run a Hellmann-Feynman type of argument: when we take the µ derivative of both sides of the equation e µ Υ R | R| Υ R = Υ R |W −1 | Υ R , the terms involving derivatives of Υ R | and | Υ R cancel due to (45). Therefore, only the derivatives of e µ R and W −1 survive, namely, In the last step, we used equation (45) again. Recall that W ω+i κ is the square of the magnitude of G R , whose derivative along the imaginary axis is related to the derivative of the phase of G R along the real axis. More explicitly, we have the following formula: .
Setting µ = 0, we obtain t B = κ (0) −1 : We would like to interpret the above expression for t B as the average value of the winding speed of φ with some probability distribution P . Such an interpretation is indeed possible: is a normalization factor. Next, we will bound t B using a lemma that bounds the winding speed of φ.
Lemma. The retarded/advanced Green function G R/A of an arbitrary Fermi system satisfies the following inequality for all ω ∈ R and s ∈ R + : The proof is given in the appendix B.
Applying the Lemma with s = κ 2 to (49), we conclude that t B 2 κ , which is equivalent to the bound (40). Now we make a few comments on the bound.
The equal sign in the Lemma is achievable if the spectral function is a delta function; for example, if A(ω) = 2πδ(ω), then the equality is attained at ω = 0. Note that in order to get an equal sign in the branching time bound, t B κ = 2, one needs the bound in the Lemma to be tight whenever the ratio Υ R (ω)/ G R (ω + i κ 2 ) is non-zero. On the other hand, the equality in the Lemma can not hold for all ω because the total change of the phase difference between G R and G A is fixed, namely, Therefore, we expect that the equality in (50) can be achieved only at isolated points.
On the other hand, Υ R (t) tends to zero at t → ±∞, and therefore, its Fourier transform can not be a sum of delta functions. Combining these two observations, we expect that the branching time bound can not be saturated in any physical model.
For a more rigorous argument along similar lines, let us assume that G R (t, 0) decays as e −Γt/2 as t goes to infinity. In the frequency representation, this means that G R (z) is analytic for Im z > − Γ 2 . As a result, the lemma proved in appendix B can be strengthened as follows: which consequently provides a tighter upper bound on the branching time, 2. The key assumption for the bound to hold is the thin rung approximation for OTOCs. This condition might be violated, for example, in matrix models.
3. Our proof works only for fermionic systems, since the proof of the Lemma relies on the positivity of the spectral function, which is only true for fermionic Green functions.
4. The proof can be generalized to the higher-dimensional case under mild assumptions. The only change is to add momentum arguments to the vertex and Green functions and to integrate over the momentum whenever there is a loop in the diagram. The conclusion is that the bound t B κ 2 still holds, provided the center-of-mass momentum of the exponentially growing mode is zero. One also needs the identity R( p − q, ω − ω ) = R( q − p, ω − ω) * , which is analogous to the multi-flavor case discussed below. 5. The bound and its proof also generalize to multi-flavor Fermi system, where we need to introduce flavor indices for the rung function R ab , the Green functions G R,A a , and vertex functions Υ R,A a . Here, we have assumed that the Green function is diagonal in the flavor basis, i.e. G R,A ab = δ ab G R,A a . Diagrammatically, the rung function looks like this: Note that we require the incoming and outgoing flavors on each side to be the same. For SYK-like models with disorder, this assumption is true if the disorder is flavor-diagonal, e.g. if J abcd J a b c d ∝ δ aa δ bb δ cc δ dd . A natural generalization of the reality condition for R is Hermiticity, namely, R ab (ω − ω ) = R ba (−ω + ω ) * , see appendix D for a proof. The derivation of the inequality t B κ 2 requires minor modifications but remains correct.

Branching time at weak coupling
In this section, we investigate the special case where the Green function has a quasiparticle pole, which is expected at weak coupling. To be concrete, let us take the Majorana SYK model as an example and comment on the general case and higher dimensions later. In the limit of weak interaction, i.e. at temperatures β −1 J, we may approximate the retarded Green function by that of a quasiparticle with zero energy and lifetime τ qp = 1/Γ, where Γ ∼ J: In general, G R could have multiple poles characterized by decay rates Γ of the same order of magnitude. As a consequence, the calculation of t B based on (55) in the following subsections will only give an order-of-magnitude estimate. See appendix C.5 for more discussions on the estimation of Γ.

A bound state problem
We have previously mentioned the bound t B 2 κ+Γ . This result, along with an actual estimate for t B , can also be derived using the intuition from solving a bound state problem in ordinary quantum mechanics, similar to the large-q SYK discussed in Ref. [13]. We would like to explain the quantum mechanical interpretation here, which will also be useful for later discussions of various approximation methods.
Let us start with the kinetic equation (43) with parameter µ and insert the quasiparticle ansatz (55) into W = | G R | 2 , which yields W = 1 ω 2 +( κ+Γ 2 ) 2 so that equation becomes In the time domain, the above equation turns into a second order differential equation, which can be further interpreted as a Schrodinger equation with potential V = −e µ R(t) and total energy E = − (κ+Γ) 2

4
< 0 (i.e. a bound state problem). Using the Hellmann-Feynman theorem, we get the potential energy at µ = 0 by taking the µ-derivative of the total energy, i.e.
Note that the kinetic energy E K is always non-negative, and hence, E = E K + E V E V . Expressing E and E V in terms of κ, Γ, and t B , we get (κ + Γ) 2 4 which reproduces the bound at weak coupling (54).

Comments on various approximation methods
The bound state interpretation provides a useful clue for various approximation methods. We will start with the zero range potential approximation (see Ref. [23] for an exposition) for the rung function (potential term in the QM interpretation) and derive an approximate relation between the branching time t B , the Lyapunov exponent κ, and the quasiparticle decay rate Γ: In other words, the branching time is half of its upper bound (54). We also comment on a widely used approximation in the literature, the delta function approximation in the kinetic equation introduced by Stanford [24], and show that it leads to the same relation.
Zero-range potential approximation for the bound state energy. For the bound state problem (57), the simplest approximation is to replace the potential with a delta-function: Then the wave function and the corresponding bound state energy E = − (κ+Γ) 2 4 are expressed as follows Therefore, we have κ(µ) = e µ R(0) − Γ and Eliminating R(0), we get the relation (60): This formula shows that at weak coupling, branching is essential for scrambling to occur, which is in contrast to the strong coupling limit, where the branching slows down the scrambling. For example, in the SYK model with J 1, the deviation of the Lyapunov exponent from its maximal value is The contrast suggests that there are actually two different mechanisms for scrambling, one at weak coupling and another at strong coupling (corresponding, respectively, to high and low temperature) [25,26].
As to the validity of the zero-range potential approximation we have used, it should work when the range of the potential R(t) (denoted as t 0 ) is much smaller than the size of the wave packet given by Υ R (t, µ) in (62), namely For example, in appendix E, we show that this criterion is satisfied for the Brownian SYK [27,28], where the rung function is indeed a delta function, and therefore, the relation (60) is exact. On the other hand, there is also a weakly coupled model where this approximation (and the approximations below) are not accurate. In appendix F, we show that for the (regular) large q SYK at weak coupling, there is an order 1 prefactor discrepancy between the approximated and exact results.
Delta function approximation in the kinetic equation. The zero range potential approximation for the bound state problem above is similar to the type of approximation used in [24] by Stanford, where the product of retarded and advanced Green functions (which is denoted as W in this paper) is approximated by a delta function: In the last step, the Lorentzian function is replaced by the delta function. 5 Therefore, we have a simplified equation, The appearance of the delta function supplies additional convenience: the vertex function Υ R (ω, µ) should also be proportional to δ(ω) in this approximation. Thus, we have the following relation, κ(µ) + Γ = e µ R(0) , which is identical to the result obtained using the zero range potential approximation, and therefore, entails the same conclusion, We would like to make a few additional comments about this approach: 1. The approximation in (67) is valid when κ + Γ ω 0 , where ω 0 is the frequency scale below which the rung function R(ω) is almost constant. This is merely the frequency space version of the criterion (66).
2. The computations here generalize to higher dimensions and the multi-flavor case straightforwardly, where the retarded Green function has the following form The subscript a labels the quasiparticle flavor, and p is the momentum vector. In this case, we need to consider a generalized version of equation (42), namely, Next, we use the following approximation in the above equation, which leads to the following result: Here R and Γ are understood as matrices in the flavor (a, b) and momentum ( p, q) indices, namely, and the expectation value · is taken on the eigenvector with largest eigenvalue κ(µ). Then the branching time is given by the Hellmann-Feynman theorem, with 1/t B = R evaluated on the same eigenvector, which leads to the following relation:

Summary and discussion
Our main result is the bound on the branching time, κt B 2, for a large class of SYKlike models, which may be interpreted as a warning sign against naive attempts to obtain a holographic model with bulk locality. On the other hand, the assumptions used to prove the bound might serve as a guide for the search of desired models in a broader class. A general retarded kernel contains four matrix indices and a rung function of four time variables: Here all indices are different. Each operator can be bosonic or fermionic, as long as the total fermion parity is even. The rung function R ab;cd (t 1 , t 2 , t 3 , t 4 ) is the sum of general 2-particle irreducible (2PI) diagrams. To derive the bound, we made the following assumptions: 1. The conservation of flavor singlet. We assumed that we could restrict to the subspace where the Green functions on the two parallel rails for the retarded kernel are fermionic and with the same flavor index. In other words, we require the "scramblon" that mediates the propagation of chaos to be a singlet in the fermion flavor: In SYK-like models, this conservation owes to the disorder averaging and the condition that the disorder is flavor-diagonal.
2. The model is purely fermionic. We used the fact that the spectral function is positive semi-definite, which is not true for bosonic operators. It is then interesting to study models with both bosons and fermions [22], and ask whether similar bounds exists.
3. The thin rung approximation requires that the rung function only depend on two times (and further, due to the time translation symmetry, it only depends on the time difference of the two ends): It would be interesting to study models with more complicated rung functions. As an example, we could consider a matrix model, where the 2PI diagrams can be nontrivial: Here, the double lines represent the matrix fields that mediate the interaction between fermions.
It would also be interesting to understand logical connection between these conditions and bulk non-locality. Another motivation of our work was to investigate the relation between scrambling and branching. There are two distinct behaviors: for the SYK model at strong coupling, the branching (occurring at rate 1/t B ) slows down the scrambling [18], while in this paper, we have found the opposite tendency at weak coupling: κ ≈

A An alternative derivation of the ladder identity
In Ref. [18], the following identity relating t B , the Lyapunov exponent κ, and the prefactor C, was derived using a cut-and-glue consistency condition. Schematically, the procedure is to decompose a long ladder diagram into two shorter ladders glued by a "box" (formed by two rungs and two horizontal rails). The branching time t B appears as the typical size of the box in the aforementioned derivation. Now, we give an alternative derivation using formula (38) for the average number of rungs. The idea is that we can divide the long ladder by a rung instead of a box. In other words, if we connect two OTOCs by a rung, we will get a sum of longer ladders with certain multiplicities The l.h.s. of the equation consists of one retarded OTOC (as explained in figure 1; see also Ref. [18] section 5), a rung function, and an ordinary OTOC. The "dot" product represents the integration over intermediate times. There should be an additional N factor on the l.h.s. due to the summation over internal indices. The r.h.s. of the equation is a sum over n-rung diagrams F n with a counting factor n. That is to say, where the retarded OTOC R differs from the ordinary one by the factor of 2 cos κπ 2 , namely Next, we insert the single mode ansatz into the composition formula (85). Note that the r.h.s. of (85) is the average number of rungs times the OTOC, as discussed in section 2.3. Together with (38), we get the following expression: (87) 6 Here we follow the convention in Ref. [18] (consistent with the right-to-left operator multiplication order, see e.g. (85)). One can equivalently use the more standard notation as in appendix C, where the time goes right. Meanwhile, the l.h.s. of (85) consists of two OTOCs, where the last integral is done by switching to new integration variables, t 56 = t 5 − t 6 and t = t 5 +t 6 2 . The latter is constrained by the end points, i.e. t 1 +t 2 2 > t > t 3 +t 4 2 . The integration over t 56 gives the normalization factor Υ A , Υ R by definition, while the integration over t gives the factor t 1 +t 2 −t 3 −t 4 2 . Comparing (87) and (88), we prove the ladder identity (83).

B Proof of the Lemma
In this appendix, we present a proof of the Lemma stated in the main text. The lemma asserts a bound on the rate of phase winding for the retarded and advanced fermionic Green functions. For convenience, let us denote The lemma asserts that Our proof relies on the positivity of the fermionic spectral function, namely, The retarded Green function (91) is holomorphic in the upper half-plane. Moreover, for Im z > 0, we have In other words, f (z) = − G R (z) is an analytic function that maps the upper half-plane to itself. Our goal is to find a bound on the derivative of f (z) at z 0 = ω + is.
To apply the basic version of the Schwarz lemma (about a bounded holomorphic map of the unit disk preserving the origin), we need to construct two maps transforming the upper half-plane to the unit disk and vice versa: 1. y = y(η) maps the upper half-plane to the unit disk with y(−G R (ω + is)) = 0, e.g.
2. z = z(ξ) maps the unit disk to the upper half plane with z(0) = ω + is. Such a map can be defined by the formula and has the property z (0) = −2is.
Composing these two maps together with f (z) = − G R (z) in the middle, we obtain a holomorphic function that maps the unit disk to itself, with g(0) = 0. Therefore, according to the Schwarz lemma, we have |g (0)| 1, which implies Now we use the above inequality to bound I s (ω). Let us begin with this chain of inequalities: (97) Combining it with (96), we conclude that

C Keldysh formalism for multiple contour folds
In this appendix, we study correlation functions on contours with multiple folds, as illustrated by Fig. 2. In some parts, we will use the SYK model as an example. Our principal goal is to introduce the necessary language for the proof of Hermiticity of the rung function in appendix D.
The multi-fold Keldysh formalism has been discussed in the literature, e.g. in Ref. [29]. The main difference from the standard Keldysh formalism is that the Green functions G R , G A , G K are matrices with respect to the fold index (G R and G A are actually diagonal), but the equations have the same form as for scalar functions. We first derive the Keldysh equations, i.e. a variant of the Schwinger-Dyson equation and an expression for self-energy. Here we use the convention (as is customary in the literature) that time runs from left to right. Note that this is different from the convention for a similar figure in Ref. [18], where the forward time evolution is chosen to run from right to left for the convenience of operator interpretation.

C.1 Green function in the Keldysh basis
Points on the contour folds are specified by the real part of time, t, fold index (α = 1, 2), and the choice of particular side of the fold (u or d). Fields as such are functions of complex time, so for each t, we have ψ 1 (t) = ψ(t) and ψ 2 (t) = ψ(t − iτ ) on the first and second fold, respectively. The distinction between u and d comes into play in the construction of correlation functions, which are contour-ordered. Thus, the Green function is a matrix in the (u, d) basis (apart from its dependence on the fold indices): The symbol T c denotes the contour ordering: the operators are arranged such that their rightto-left order agrees with their sequence along the contour, as indicated by arrows in Fig. 2. For example, T c ψ u , where ζ is 1 for bosons and −1 for fermions. It is often convenient to introduce the (+, −) basis, Note that the −− correlator always vanishes. The +− and −+ correlators are the retarded and advanced Green functions, namely, G R = G +− and G A = G −+ ; they are diagonal in the contour index. The ++ correlator is the Keldysh Green function, G K = G ++ . We may summarize this notation in a matrix form, Written in terms of the field operators ψ α (t), the Green functions are as follows: Let us give explicit expressions for the off-diagonal Keldysh functions at thermal equilibrium: where A(ω) is the spectral function, (105) In this notation, the Wightman function G W we used in the main text is related to G K for the contour with fold separation τ = β/2 (or τ = π in dimensionless units): (106)

C.2 Schwinger-Dyson equations
The Keldysh equation is the Schwinger-Dyson equation, where the self-energy Σ and the inverse Green function G −1 0 are matrices: Here, ω = i∂ t is understood as the operator representing frequency. In terms of components, we have

C.3 Diagrammatic rules
Now, we derive the diagrammatic rules using the SYK model as an example. In addition to the Green functions, we will need the interaction term on the Keldysh contour: The prefactor 2 1−q/2 = 2 −1/2 q · 2 in the second line arises as follows: 2 −1/2 comes from the basis change formula (100), and the additional factor of 2 is due to duplicate contributions from the two terms in the first line. The factor q k counts different choices leading to the same term with k "+" fields.
We are now in a position to formulate the Feynman rules specific to the model.
1. Green function in (+, −) basis: Note the factor of i in the definition.
2. Interaction vertex in the (u, d) basis (for the purpose of illustration, we draw the diagrams for q = 4): In the (+, −) basis, The construction of diagrams involves one more step: Gaussian averaging over J jklm is represented by dotted lines connecting pairs of vertices.

C.4 Expression for the self-energy
Neglecting subleading (in 1/N ) terms, we have − iΣ αβ (t 1 , t 2 ) = 1 (q − 1)! where (q − 1)! in the denominator is the number of symmetries and the dotted line represents the disorder averaging, which gives a factor J 2 (q − 1)!. Both sides of the above equation are matrices in the (u, d) or (+, −) basis. For the calculation of its elements, we should sum over patterns of pluses and minuses with an odd number of pluses around each vertex. Not all such patterns are allowed because G −− = 0 and because the product of the retarded and advanced Green functions vanishes, i.e. G +− αβ (t 1 , t 2 )G −+ αβ (t 1 , t 2 ) = 0. In particular, the allowed Green functions in the expression for Σ R = Σ −+ are G R = G +− and G K = G ++ : In the above equation, we have assumed that q = 4. For a general q, the series continues as follows: The prefactor i q−2 /2 q−2 comes from (114), and (q − 1)!J 2 is due to disorder averaging. The combinatorial factor 1 (q−k)!(k−1)! is the inverse number of symmetries of the diagram, which has q − k retarded (i.e. +−) lines and k − 1 Keldysh (i.e. ++) lines. There are also some implicit factors that cancel each other: (−1) (q−1)q 2 (fermionic sign), i q−1 from q − 1 Green functions (see (112)), and i from the definition of self-energy (see (115)). We may transform the above series into a more compact form: Similarly, the advanced self-energy is given by this expression: The Keldysh self-energy needs to be derived separately, (121) where the last term is to compensate the double counting in the summation when k = 1. For later convenience, we may rewrite since the mixed terms all vanish due to time constraints for the retarded and advanced Green functions. Again, the k = 1 term needs individual care, and we find Note that the retarded and advanced Green functions G R,A αβ are diagonal in the fold index (i.e. vanish if α = β); therefore the corresponding self-energies Σ R,A αβ are also diagonal. In contrast, the Keldysh component Σ K αβ is not necessarily diagonal. An alternative way to obtain the above relations is to work in the (u, d) basis, where the vertex is diagonal Besides the basis transformation law of G shown in (101), one needs to use a similar rule for Σ: C.5 Quasiparticle decay rate Γ for SYK at weak coupling As a simple application of the formalism, we will estimate the quasiparticle decay rate Γ for the SYK model at βJ 1. This quantity is defined by the t → ∞ asymptotics of the Green function, G R (t, 0) ∼ e −Γt/2 . Let us make a stronger assumption and adopt the ansatz G R (ω) ≈ 1 ω+iΓ/2 , which is only qualitatively correct. The task is to self-consistently determine Γ using the equations for the Green function and self-energy. First, by definition, For a large t, we have Inserting these expressions into (117), we obtain an explicit formula for the retarded self-energy: We are interested in the high temperature limit, where x ≈ −i Γβ 4 is small, and therefore, negligible for the leading order calculation. (In other words, the diagonal component of the Keldysh function is small, |G K | |G R |.) Next, we estimate the zero frequency value of the self-energy by integrating its long-time asymptotic form (128), 7 i.e.
This quantity should be set to −iΓ/2 as a self-consistency condition. Thus, we obtain We now make a few remarks about this formula.
1. The quasiparticle decay rate Γ is of order J instead of J 2 ; the latter is what one might have guessed based on perturbation theory as all disorder-averaged diagrams come with even powers of J. However, the naive perturbative argument fails due to the divergence arising from the long-lived bare Green function. We need to include the quasiparticle decay in the first place. On the other hand, Γ ∼ J is a reasonable result from the dimensional analysis perspective since J is the only energy scale at infinite temperature.
2. We should not trust the order 1 prefactor because in (129), we have used the long-time asymptotics to approximate Σ R (t). It is not valid at times t 1/Γ, and therefore, the expression for Γ is off by an order 1 factor, though the J scaling is correct. We expect the approximation (129) and the final formula to become accurate in the q → 2 limit. Indeed, as shown in Fig. 3, the approximate formula (130) agrees with the numerical solution for q = 2.
3. We can also compare (130) with the exact large q result (see Appendix F), which we denote by Γ * . Specifically, Γ * = 2vq −1 and v ≈ 2 3/2−q/2 √ qJ for weak coupling. There is some discrepancy, with Γ given by (130) equal to Γ * / √ 2 in the q → ∞ limit. This is not a surprise for the reason discussed in point 2.

D Hermiticity of R(t)
In this appendix, we show the Hermiticity of the rung function, namely where a, b label field flavors (not to be confused with fold indices α, β). Here we assume that the flavors on the top and the bottom rails come in identical pairs. That is a nontrivial condition, restricting the class of models we can handle. To formulate the problem, we consider the interaction vertex λ abc ψ † a ψ b O c , where O c is a bosonic field that mediates the interaction between the rails (cf. Eq. (5.1) in Ref. [18]). Without loss of generality, we choose O c to be real; namely, we treat the real and imaginary parts of a complex field separately. In the Keldysh basis (u, d), the interaction is written as follows: In the (+, −) basis, In our formalism, only the third term in (133) will contribute to the rung function for the retarded kernel. Thus, we have R ab (t) = cc λ abc λ bac G K cc (t) , where G K cc (t) = Tr ( √ ρO c (t) √ ρO c (0)) , ρ = e −βH /Z .
To relate R ab to R * ba , we will examine how each factor in (134) is transformed under complex conjugation. In fact, G K cc (t) is real since O c and O c are real and the configuration we have chosen is symmetric on the imaginary time circle. As for the coefficients λ abc , we will use the Hermiticity of the interaction Hamiltonian: It follows that λ abc λ bac = (λ bac λ abc ) * .
Together with the reality of G K , this implies the desired identity, R ab (t) = R * ba (t).
has a delta function factor, and therefore, is an example where the zero-range potential approximation is exact. Finally, we insert the UV form of G K , into the formula for R(t) and get R(t) = (q − 1) Jδ(t) 2 q−2 , R(0) = (q − 1) Thus, we obtain the Lyapunov exponent and the branching time: The explicit form of the Lyapunov exponent is consistent with the expectation that for q > 2, the model is chaotic. We have also checked the calculated Lyapunov exponent against the numerical result in [28] and found a good agreement.
with the ones derived from the zero-range potential approximation, see section 4.2. Using the fact that Γ is small, we get κ approx = R(0) − Γ ≈ 2v , t B,approx = 1 Note an order 1 factor discrepancy between the approximate and exact results. This discrepancy is expected since the "potential term" R(t) = v 2 2 cosh 2 vt 2 can not be well approximated by a delta function.