Operator growth in the SYK model

We discuss the probability distribution for the"size"of a time-evolving operator in the SYK model. Scrambling is related to the fact that as time passes, the distribution shifts towards larger operators. Initially, the rate is exponential and determined by the infinite-temperature chaos exponent. We evaluate the size distribution numerically for $N = 30$, and show how to compute it in the large-$N$ theory using the dressed fermion propagator. We then evaluate the distribution explicitly at leading nontrivial order in the large-$q$ expansion.


Introduction
In quantum many-body systems, the butterfly effect is roughly the statement that time evolution takes simple (few-body) operators to complicated ones (many-body). This makes it possible for the disturbance of a single particle far in the past to have significant effects on all particles at a later time. In systems with spatial locality, this takes a while, since the disturbance has to spread through the system [1][2][3][4][5][6][7][8][9][10][11][12]. In nonlocal systems the process can be much faster. However, the concept of operator growth still makes sense if each term in the Hamiltonian only couples together a few degrees of freedom at a time. In this setting, simple operators still take time to become complicated [13][14][15][16][17].
A rough diagnostic of this effect is the commutator-squared between W (t) = e iHt W e −iHt and V , where W, V are simple operators [18][19][20][21][22]. The idea is that as time advances, W (t) grows in such a way that it has nontrivial effects at almost any site in the system. As a result, it then fails to commute with other simple operators, such as V , and so [W (t), V ] 2 becomes order one. In the case where W, V are fermionic operators, then one considers the anticommutator-squared instead.
In this paper we will consider another diagnostic, which is to compute the full probability distribution for the size of the time-evolving operator [3]. To define this, one expresses W (t) in a basis of operators organized by the number of "simple" operators that appear in a given product (the "size"). Let's explain this more concretely for the case of the SYK model [23,24]. In that case it is natural to take the simple operators to be the individual fermions, ψ i . We choose W to be one of those fermions, say W = ψ 1 . The time-evolving W (t) can be expanded as ψ 1 (t) = s, a 1 <...<as 2 s−1 2 c a 1 ...as (t)ψ a 1 ...ψ as , (1.1) where s is the "size" of the basis element, i.e. the number of elementary fermions that appear in the product. The factor 2 s− 1 2 is to compensate for the fact that we normalize the fermions so that ψ 2 = 1 2 . The probability distribution for size s is then defined as P s (t) = a 1 <...<as |c a 1 ...as (t)| 2 . (1.2) As time passes, this distribution shifts towards operators of larger size-the operator grows.
We can think of the c a 1 ...as (t) coefficients as describing a quantum wave function for the evolving operator. As we will see, in the infinite N SYK model, this "operator wave function" can be understood as a standard wave function for a quantum particle moving on a special, rapidly expanding graph shown in Fig. 2a. With time evolution, most of the particle's wave function moves deeper into the graph at an exponentially growing rate. This corresponds to the operator becoming larger and more complicated.
The fact that the rate is exponential is because the graph on which the particle is moving becomes more highly connected as we move deeper. In terms of the growing operator, this reflects the fact that once an operator has already become quite large, it has many different ways to grow larger still. This is the basic origin of exponential early-time behavior of correlators such as [W (t), V ] 2 , diagnosed by the chaos exponent (or many-body quantum Lyapunov exponent) λ L [22,25].
Although most of the wave function moves rapidly into the graph, there is a small exponentially decaying tail for the particle to remain at (or return to) the root of the graph. In operator language this is the amplitude for ψ 1 (t) to remain equal to ψ 1 (0) = ψ 1 , or more explicitly 1 2 N/2 Tr[ψ 1 (t)ψ 1 (0)], the infinite temperature two point function. This correlator exponentially decays because most of the wave function is leaking into the space of complicated many-fermion operators.
It is a challenging problem to go beyond this qualitative discussion and actually compute P s (t). In the rest of the paper, we will discuss some partial results for this quantity, mainly in the large N theory. In particular, we will explain an equivalent particle-moving-on-agraph problem, and we will show how to sum the 'melonic' (infinite N ) perturbation theory for the wave function using the dressed infinite-temperature two-point function of fermions. However, the dressed two-point function is not known analytically except for the large-q SYK model. So the only place where we will succeed in computing the operator wave function is at infinite N and at leading order in large q. We will use this wave function to compute a few things, including the (previously known) infinite-temperature chaos exponent.
Before we begin with the main calculation, we will make three preliminary comments.

Preliminary comments 2.1 A classical model
Before analyzing the quantum problem, we can discuss an analog of operator growth for a classical model of many-body chaos. This model was previously considered in the context of high energy scattering in weakly coupled gauge theory [26]. It goes like this. Suppose we have a collection of N particles where we initially label one as infected and the others as healthy. The rule for time evolution is that with some probability γ per unit time, any infected particle can heal itself at the cost of infecting (q−1) random other particles. For simplicity, we assume that the total number N is large enough and the time is short enough that the infected particles are always very dilute. If we are interested in the probability that some randomly chosen particle will be infected after time t, we can proceed in three different ways. The three ways get increasingly complicated, but each gives an interesting perspective.
The first way is to simply notice that the expected number of infected particles N inf is growing according to This leads to N inf = e (q−2)γt . The probability that a randomly chosen particle will be infected at time t is simply γt . This type of intuition was used in early discussions of scrambling by quantum circuits [15], and it is related to the kinetic equation method used for the OTOC in weakly coupled theories in [4].
A second way is to follow the possible chains of events that lead to our random final particle getting infected, and add up all of the probabilities. To start, the simplest way it could happen is that the original infected particle never infects any other particles, but by chance it happens to be our randomly selected particle at time t. The probability for this is 1 N e −γt . The second simplest thing would be for the original particle to infect one set of (q−1) particles, for one of these to be our randomly chosen particle, and for this one to not infect any further particles. The probability for this is Summing over all possible chains of infection, we find the probability This method of calculation is very similar to the calculation of the OTOC by ladder diagrams, where k labels the number of rungs in the ladder (see [24,27] in the SYK model and [7][8][9]28,29] in weakly coupled theories). In particular the crucial factors of (q − 1) that appear here and that also appear in the ladder diagrams. These factors correct for the fact that we are only following one possible chain of infection, picking one out of the (q−1) particles infected in each given event.
A final way, analogous to the discussion of the quantum problem in this paper, is to calculate the probability distribution for the number of infected particles at time t, and then take the expectation value explicitly. To do this we can use the equation 3) The solution with initial conditions P (s k , 0) = δ k,0 is With this probability distribution, we can of course find the same answer as with the other two methods for the probability that a random particle is infected, by taking the sum k s k N P (s k , t) = 1 N e (q−2)γt . It is interesting to note that if we scale time so that γ ∼ 1 q , this formula is quite similar to the probablity distribution we will find in the SYK model at large q, see (5.17).

A numerical plot for N = 30
We would now like to show an example numerical plot of P s (t) for the quantum problem. To begin we should explain how this can be computed numerically. The SYK model with N Majorana fermions lives in a Hilbert space of dimension 2 N/2 . The space of operators acting on that Hilbert space can be understood as a Hilbert space in its own right, with inner product given by (2.5) In this "operator Hilbert space" we can decompose ψ 1 (t) in a basis of operators of definite size, as in (1.1). A formula equivalent to (1.2) is Here, the sum is over an orthogonal basis s-fermion operators, for example all N s operators of the form ψ a 1 . . . ψ as with a 1 < ... < a s .
To evaluate this in practice for reasonably small values of N , we can compute ψ 1 (t) by exact diagonalization and exponentiation of the Hamiltonian and then evaluate the sum over operators in (2.6) by random sampling. In Fig. 1 we show the result of this computation. We also plot the expected value and variance of the size as a function of time. At t = 0 all of the probability is concentrated in size one, but as time passes we see successive peaks in larger sizes as the probability mass moves towards larger operators. At late time, the distribution appears to converge to the size distribution of a random fermionic operator. For such an operator, P s is proportional to the total number of operators of size s (for s odd only), which gives P s → N s 2 1−N . So for example, the most common size at late time is N 2 = 15. Notice that in the early phase, the peaks occur more rapidly as time passes. This is because alreadylarge operators can grow faster than small ones. The "scrambling time" where the operator reaches full size would fall somewhere around three-quarters of the way through the plot. At right, we plot both the mean value and the variance of the size.

The size and infinite-temperature OTOCs
As a final preliminary comment, we would like to show that out-of-time-order correlators at infinite temperature are related to the expectation value of s in the distribution P s (t). We define A(t) as the typical anticommutator squared at infinite temperature between ψ 1 (t) and a single-fermion operator Here, we have averaged over the index of the second operator ψ j . The growth of this object is a useful diagnostic for quantum chaos and is simply related to other infinite-temperature out-of-time-order correlators.
Inserting the expansion of ψ 1 (t) in (1.1), we have In order to simplify this expression, we can use that the ψ a 1 ...ψ as operators are orthogonal with respect to our inner product (·, ·) and that this is preserved after taking anticommutators with ψ j . In fact, when a 1 < ... < a s and b 1 < ... < b s , one has the useful formula This formula collapses the sum over {a 1 ...a s } and {b 1 ...b s } to the diagonal terms. It also allows us to sum over j, getting a factor of s from the s different values of j for which we get a nonzero answer. We find In other words, A(t) is simply related to the mean size of the operator ψ 1 (t).

The graph of operators
We will now proceed with the main part of the paper. From this point forward we will be discussing the SYK model in the large N limit. Our conventions for SYK [23,24] are that the Hamiltonian is Here J a 1 ...aq is an antisymmetric tensor, drawn from a Gaussian distribution, with mean zero and the property that the square of a given component has the average value Here, we introduced the dimensionful constant J. We will also use J , which differs by a q-dependent factor as in [30] We would like to understand the time evolution of a particular fermion operator ψ 1 (t) in the large N limit of this model. A key simplification will be that the this time evolution stays within a particular class of operators, which consist of many fermions contracted together in various ways with the J a 1 ...aq tensor.
It is convenient to organize this class of operators by their size, which as always refers to the number of elementary fermion operators. We will sometimes use "generation" in place of size, where generation refers to the number of times we have to commute H with ψ 1 for the operator to first appear in the Baker-Campbell-Hausdorff series for ψ 1 (t) (for further discussion of this perspective, see § 6). Size s and generation k are related by Let us now discuss the types of operators that appear in the time evolution of ψ 1 (t), choosing q = 4 for simplicity.
• Generation zero: At infinite N , the only size-one operator that appears in ψ 1 (t) is simply ψ 1 itself. It will be convenient to work with operators that are orthonormal with respect to the inner product defined in (2.5). A normalized version of the operator ψ 1 is simply √ 2ψ 1 , which we denote as Our notation for this operator as a horizontal line will become clear from further examples below.
• Generation one: At generation one (size q − 1), the operator that appears in the time evolution is simply the commutator of the Hamiltonian with ψ 1 . The normalized version of this operator is We can interpret this operator as follows. The original fermion ψ 1 has split into q − 1 fermions by a single action of the Hamiltonian.
• Generation two: In the second generation, it will be convenient to divide the operator into three (more generally q − 1) terms, corresponding to a further division into q − 1 fermions of any of the fermions present in the operator O 1 . These distinct terms correspond to the operators . . . (3.8) Our notation with the fan diagrams is that the three daughter lines coming out of a vertex are always ordered such that the index of the top line is less than the index of the middle line, which is less than the index of the bottom line. Because of this ordering convention, the operators shown above are different from each other.
• Generation three: In the third generation, there are different kind of operators that can appear, corresponding to the division of a fermion that was "born" in the first generation or the second generation. For example, we have the operators • Generation k + 1: More generally, the operators for generation k + 1 are obtained by considering all of the operators at generation k, and for each one, allowing one of the fermions to divide further, contracting with a J a 1 ...aq symbol, and normalizing. Graphically, we simply turn one of the lines into a fan.
In the infinite N limit, these operators all have definite size and are orthogonal. Note that at finite N , some of the indices of the fermions might happen to be the same. Using that ψ 2 a = 1 2 , this would imply that the operator is actually of smaller size than 1 + (q − 2)k. However, this does not happen at infinite N . Now that we have discussed the set of operators that we will use, we can describe the evolution of ψ 1 (t). The idea is that the operators we have described form a graph, and the time evolution of the operator is simply the quantum evolution of a particle moving on the graph. More precisely, we can think of the space of operators being a Hilbert space with inner product (2.5). In the infinite N SYK model, the operators we dissused above correspond to an orthonormal basis for a subspace of the space of all possible operators. It is helpful to think about these operators O ( ) k as basis states |O ( ) k for an abstract particle that represents the evolving operator ψ 1 (t). The Heisenberg equation is an ordinary Schrodinger equation acting in this space, for an appropriate Hamiltonian H: (3.10) We can now explain the point of the basis of operators that we have chosen. The nice feature is that in this basis, H is proportional to the adjacency matrix on the graph, The adjacency matrix is defined as the matrix that has a 1 at location i, j if i, j are vertices connected by an edge, and a zero otherwise. So, for example, we have The evolution of the operator ψ 1 (t) in the large N theory is therefore simply the quantum evolution of a particle moving on the graph shown in Fig. 2, with initial condition that the particle starts out at the leftmost vertex.
An obvious feature of this graph is that it is rapidly expanding. The degree of a vertex (the number of neighbors) grows roughly linearly with generation k. This corresponds to the fact that at generation k, the operators contain 1 + (q − 2)k fermions, and the Hamiltonian can act on any one of those, turning a single fermion into a fan of fermions and producing a new operator at generation k + 1. The fact that the degree is growing with distance in this way means that this graph expands more rapidly than e.g. a Cayley graph/Bethe lattice/discretization of hyperbolic space, for which the degree is constant.
We would like to call attention to two qualitative features of the evolution of a particle on such a graph. operators, whose associated fan diagrams are indicated in blue. The problem of the time evolution of ψ 1 (t) in the large N theory is equivalent to the motion of a quantum particle on this graph (extended to further layers). In (b), (c), and (d), we show versions of the graph where we limit the recursive depth of the fan diagrams. The return amplitude on these graphs gives the zeroth, first and second iterations of the real-time Schwinger-Dyson equations. For any finite cutoff these amplitudes oscillate in time, but for the infinite graph the return amplitude decays exponentially.

1.
A basic effect is that particles tend to move to the right, towards more complicated operators. This is because at any given vertex, there tend to be many more edges leading to the right than to the left: there are more ways for the operator to grow than to shrink. We expect this to lead to exponential decay of the amplitude that the particle remains (or returns) to the original leftmost vertex ψ 1 . This amplitude is simply the correlation function 2 2 N/2 Tr (ψ 1 (t)ψ 1 (0)). It exponentially decays due to the wave function for ψ 1 (t) leaking more and more into the space of complicated operators orthogonal to ψ 1 (0) = ψ 1 .
The graph picture gives an intuitive explanation for why the real-time correlator should exponentially decay, but it does not give an efficient method for computing the decay rate. The best way we know of to compute the correlator is by numerically solving the Schwinger Dyson equations in real time to compute the retarded propagator. At infinite temperature, the real-time equations are simply given by (3.13) At infinite temperature, the retarded propagator is simply G R (t) = 2θ(t) 2 N/2 Tr (ψ 1 (t)ψ 1 (0)).
For t > 0 this is exactly the return amplitude for the quantum particle to be at the leftmost vertex of the graph as a function of time. In the next section, we will see how to use the solution to these equations to write a formula for the wave function on other vertices. For now, we will make a side comment. One way to solve these SD equations is to start with the free answer Σ = 0 and simply iterate the equations. The function G R (t) that we get after a finite number of iterations sums a set of diagrams where we cut off the recursive structure of the melon diagrams at some level. For example, after iterating zero times, we simply take the free propagator. After one iteration or two iterations, respectively, we are effectively summing diagrams of the form (3.14) where all lines represent free propagators. Summing these diagrams is equivalent to evaluating the return amplitude for a particular cutoff version of the graph, where we keep all vertices that correspond to fan diagrams with 'recursive depth' equal to or less than the number of iterations of the SD equations. For example, the cutoffs corresponding to the zero-th, first and second iterations of the SD equations are shown in Fig. 2

in panels (b), (c) and (d).
This gives a perspective on why we get exponential decay of the two point function 2 2 N/2 Tr (ψ 1 (t)ψ 1 (0)). For example, consider the self-energy diagrams on the left in (3.14). These describe oscillation between the operator ψ 1 and J 1abc ψ a ψ b ψ c . In the graph picture, it represents a particle that is moving between the two states of the simple graph shown in (c) of Fig. 2. The result is a return amplitude that oscillates in time, cos(2 1− q 2 Jt). If we consider the SD equations after two iterations, we are studying a particle moving in the somewhat more complicated graph shown in (d). It still oscillates, but the return amplitude has a somewhat lower average value. For any finite cutoff, or any finite iteration of the SD equations, we will get a correlator that oscillates in time. But in the limit where we study the infinite graph, the return amplitude decays exponentially because the particle can continue moving to the right forever in the infinite graph 2. Another important qualitative feature is that the expected size of the operator grows exponentially in time. This is because the degree of the graph is growing linearly with the generation k. The timescale for evolution from generation k to k +1 is proportional to the inverse of the degree, which is proportional to 1/k. So as the particle moves farther out into the graph, it speeds up proportionally to its distance. This leads to the expectation value of k growing exponentially with time. 1

Computing the wave function on the graph
In principle, one could evaluate the wave function for the particle moving on the graph by directly studying that problem. However, it is more convenient to translate the problem into a correlation function in the infinite temperature SYK model and then re-sum the 'melonic' SYK perturbation theory in the usual way.
Let's imagine that we want to compute the wave function that corresponds to the time evolution of the operator O 0 (t) = 2 1 2 ψ 1 (t). We can write this explicitly as The diagram in the last expression is the time contour for a path integral that evaluates the correlator. The two horizontal lines with arrows on them represent the forwards and backwards time evolution operators in the expression ψ 1 (t) = e iHt ψ 1 e −iHt . In order to evaluate this quantity by perturbation theory, we should integrate interaction vertices of the SYK model everywhere on this folded time contour, connecting the loose propagators either to the ψ 1 operator at the right end, or the fermions in whatever O namely twice the two point function at infinite temperature for time separation t. If we like, we can write this (for t > 0) as the retarded propagator, since at infinite temperature G R (t) = 1 2 N/2 Tr ({ψ(t), ψ(0)}) θ(t) = 2θ(t)G(t). So the answer for the return amplitude is given by the solution to the Schwinger-Dyson equations (3.13).
It is helpful to have a quick look at the perturbation theory that generates the SD equations. At large N , the perturbation theory for the return amplitude looks like the following Let us explain this notation. In the first line, in the Feynman diagrams, the two endpoints represent the operators √ 2ψ 1 inserted at time zero and time t. The lines in the Feynman diagrams represent free propagators, so the first diagram is simply 1 = ( √ 2) 2 · 1 2 , where the two factors of √ 2 are for the normalizations of the external operators, and the 1 2 is a free fermion propagator. When we have interaction diagrams, we need to take care to sum over whether the interaction vertex is on the "forwards" or "backwards" portion of the time contour. It is easy to check that these contributions cancel unless all vertices are ordered in time in the same way that they are ordered in the diagram. In this case, the contributions from the two portions of the contour add together, giving an extra factor of two for each vertex. So for example the second diagram gives ( Here the (2iJ) 2 is for the two interaction vertices, the t 2 2 is for the integral over two ordered points between zero and t, and the ( 1 2 ) 5 is for the five free fermion propagators. This evaluates to − J 2 t 2 8 . In the second line, we represent a dressed retarded propagator, which is equal to the return amplitude, as a line with a black dot in the middle.
The next simplest case is when we take O . Now we need to evaluate a correlation function of a composite operator built out of three fermions, and the singlefermion operator ψ 1 (t). The lowest order diagram for this involves expanding down a single copy of the interaction vertex J 1abc ψ 1 ψ a ψ b ψ c , where the ψ 1 is contracted with our operator ψ 1 (t), and the other fermions are contracted with the O 1 operator at time zero. Note that this Feynman diagram has the same structure as the fan diagram that we used to label the operator itself. At infinite N , the only other diagrams that contribute are 'melonic' decorations of this diagram, as in These decorations can be summed by replacing the free propagators by dressed propagators that solve the Schwinger-Dyson equations. The full answer, including the numerical factor from the normalization of the operators, is This simple pattern persists for arbitrary operators O ( ) k : to compute the wave function, we can simply interpret the fan diagram of the operator O ( ) k itself as a Feynman diagram, where all of the edges are dressed retarded propagators G R (t). We then integrate over the times of the vertices (subject to the ordering constraint which is imposed by the θ(t) in the retarded propagator). Including the correct numerical prefactor, one has for k ≥ 1 This gives an algorithm for computing the wave function of the particle moving on the graph, i.e. the time evolving operator. However there are two problems. First, in general we do not have an exact expression for the infinite temperature G R (t). Second, the number of different fan diagrams grows rapidly with generation k. In the special case of large q, both problems go away, because there is a known formula for G R (t), and as we will see, the different fan diagrams at a given generation are all proportional to the same function of time.

The wave function in the large-q SYK model
In this section we will evaluate the wave function and corresponding probability distribution P s (t) to leading nontrivial order in the large-q SYK model, namely 1 q . To begin we will do a straightforward large q analysis, where t does not scale with q. This approximation breaks down at times of order q, and we will comment on how to resum t/q effects at the end of the section.
At large q and infinite temperature, there is a simple expression for the product of q propagators [30] Taking the 1/q-th power of this, we find that so a single propagator is almost given by the free answer. The fact that G R (t) q is nontrivial will lead to an interesting wave function. However, the computation will be simplified by the fact that any fixed O(1) number of propagators are simply step functions, which means that once the ordering of time arguments are imposed we can set them equal to unity. A very useful point is that this implies that the wave function has the same time dependence for all operators of a given generation k. This means we only have to compute a single representative r for each k. This will make the computation of the wave function tractable.
Let's understand how this works by considering the expressions for two different generation 3 operators, each "born" from | . The first expands "depth-first," and the second expands "breadth-first," We have not included constants of proportionality, because we will fix them below by a different argument. The propagators with a dot on them represent the dressed retarded propagators G R , so that explicitly The ratio of these expressions is at leading order in large q, we see that the nontrivial time dependence of these integrands is equal. The only difference is that they have a different set of θ functions that impose different orderings of the time arguments. This applies more generally: we can write the integrand of any dressed fan diagram at generation k in the simple form times a set of step functions that impose the ordering of time arguments appropriate for a given fan diagram.
We now have to do the integral. In principle, this integral should be over times t 1 . . . t k respecting the constraints from the step functions. In our example, the depth-first operator has t 3 < t 2 < t 1 , and the breadth-first operator has t 3 < t 1 and t 2 < t 1 , with no relationship between t 2 and t 3 . However, since the integrand (5.7) is symmetric under interchanges of the t j , these restrictions will only affect the numerical prefactor and not the time dependence of the result. Thus, the time dependence of any dressed fan diagram will be the same. Picking the "depth-first" expansion to be our representative r at each generation k, we have that This implies that at leading order in large q, we have P s k (t) ∝ tanh 2k (J t), where s k = 1 + (q − 2)k.
As a final step, we need to determine the numerical coefficients. We can do this by requiring that the probability distribution remain normalized for all times. The trick here is to use the fact that we already have an expression for P 1 (t) = G R (t) 2 from (5.1) that is accurate at first subleading order in the 1/q expansion: Now, to determine the numerical coefficients for P s k with k > 1, we try to solve (5.10) to order 1/q. Indeed, one can solve this equation by setting N k = 2/kq. This gives the probability distribution at leading nontrivial order in 1/q We will now make several comments about this result.
1. One can evaluate the expectation value of the size s in this distribution. At leading order in 1/q, we have: This result for the expected value of the size determines the initial exponential growth of the anticommutator-squared, via (2.10). We conclude that the chaos exponent at large q and infinite temperature is λ L = 2J . The formulas from [30] can be used to show that in the large-q model we have Here the energy spectrum is such that −1 < x < 1, and x = 0 corresponds to infinite temperature state. So we find agreement with previous results.
2. Note that the leading-order answer for (5.12) depends on the 1 q -suppressed probabilities for s > 1, because s ≈ qk and this factor of q cancels against the 1 q suppression. In other words, at large q, the operator initially has only small probability (of order 1 q ) to grow, but if it does grow it gets so big (size proportional to q) that this makes a large effect on the expected value of the size. This is reflected in the fractional variance of the size distribution, which is large, proportional to q (5.14) 3. As another example of something one can compute with this distribution, we can generalize the logic that led to (2.10) slightly, finding Evaluating this with our large q result (5.11), we find 4. So far, we have considered a simple large-q limit, where we do not allow t to scale with q. This approximation will break down at times of order q. It would be nice to extend our analysis to resum effects of order t/q. Although we have not studied this systematically, we will make a few comments. To capture the important effects, one can no longer approximate G R (t) as simply θ(t) in cases where the time argument can be long. Instead, we can approximate it as G R (t) = θ(t) cosh −2/q (J t) ≈ θ(t)e −2J t/q . We expect based on numerics and [31] that this expression is accurate for all time t, although it does not follow from the approximation worked out in [30].
A convenient feature of this approximation is that (ignoring the step functions) we have G R (t)G R (t ) = G R (t + t ). This composition property allows us to convert fan diagram integrands into each other, so we retain the property that only one representative from each generation must be computed. In the example given above, to convert (5.5) to . Another simplification is that inside the integrand, we can expect to approximate G R (t j ) q−α ≈ G R (t j ) q , because the presence of the factor G R (t j ) q will make the integral prefer the region where t j is of order one, so the factor G R (t j ) α will be approximately one.
Following these approximations, we find that the total effect is to multiply the wave function (5.8) by G R (t). Normalizing the probability distribution, we find the expression P s (t) = Γ(k + 2/q) Γ(k + 1)Γ(2/q) tanh 2k J t cosh 4/q J t , s = 1 + (q − 2)k. (5.17) This probability distribution resums the t/q corrections to our straightforward large-q result (5.11). However, it is not fully satisfactory because there are expected to be k/q corrections that are not accurately summed here. We hope that the expression is nevertheless qualitatively accurate even for large k. Note that at our level of approximation the denominator could be written e 4J t/q instead of cosh 4/q J t.
Our main purpose in writing the expression (5.17) is that one finds a very similar formula in a classical model of operator growth discussed in § 2.1.

5.
It is sometimes convenient to define a "coarse-grained" wave function by Ψ s (t) = (−i) k P s (t). This is the amplitude for ψ 1 (t) to be of size s at time t. We ought to have a composition property where the two point function of fermions can be computed by inserting a complete set of states at an intermediate time and summing over the sizes of all operators that appear. In other words, we should have Indeed, one can check that this property holds for (5.17). It follows that it also holds at order 1 q for (5.11).

Discussion
In this paper, we discussed the time evolution of a simple fundamental fermion operator ψ 1 in the SYK model. In the large N limit, we related the operator growth problem to the problem of a particle moving on a rapidly expanding graph. We computed the size distribution for the evolving operator explicitly in two cases: numerically for N = 30 fermions with q = 4 and analytically at large N and large q. We showed how to use this size distribution to compute out-of-time-order correlators at infinite temperature.
Throughout, we have emphasized a particular decomposition of the time evolving operator, into components with a given size (number of elementary fermions appearing in a product). We would like to contrast this with the Baker-Campbell-Hausdorff expansion These terms also give a decomposition of the time-evolving operator. We emphasize that this is different from the size decomposition, because the terms at order k in the BCH expansion do not all have the same size. While the kth nested commutator contains terms up to size s = k(q − 2) + 1, there is also weight on operators of shorter sizes. For instance, in the q = 4 SYK model, the k = 2 term in the BCH expansion contains operators of size 5 as well as an operator of size 1, namely ψ 1 . The fact that our wave function P s (t) is a nontrivial function of time indicates that it receives contributions from many different orders in the BCH expansion (starting at order k, where s = 1 + (q − 2)k).
Another point we would like to emphasize is the following. The notion of size that we have used makes explicit reference to a particular set of simple operators {ψ i }, out of which we construct complicated ones. This set of simple operators is determined by the Hamiltonian itself, and it depends on the q-local and sparse nature of H. If instead the Hamiltonian were a totally random matrix, we would have no sensible notion of simple operators, and no good way to define size. However, for Hamiltonians such as SYK, a preferred set of simple operators is selected by the fact that the interaction can be written in terms of finite (order q) products of them.
There are many possible directions for improvements on our work. For example, it would be interesting to understand 1/N corrections to the P s (t) distribution at a level that would make it possible to see saturation of the late-time distribution. Another challenge is to extend the approach studied here to compute out-of-time-order correlators at finite temperature. For instance, one might be tempted to try to define a size distribution P (β) s (t) with respect to an inner product (A, B) β ≡ Z(β) −1 Tr[A † e −βH/2 B e −βH/2 ], where Z(β) is the thermal partition function. In principle, one could use the Schwinger-Dyson equations at large q to compute a candidate wave function. However, this necessarily requires the use of a different set of operators O ( ) k (β) that now depend on the temperature β. Unfortunately these operators do not appear to admit a simple relationship between "generation" k and operator size s. This means we do not know how to extract the expected size from this candidate distribution, and we do not know how to relate it to the out-of-time-order commutator as we did in § 2.3.
In holographic theories, operator growth is described by a particle falling towards the black hole horizon. It is tempting to think of the radial direction in the graph of operators as being similar to the radial direction in the bulk theory, so that the particle propagating deeper into the graph resembles the particle falling into the bulk. We do not know if there is a more precise connection to be made there.

A Some numerical data
In this appendix we give numerical data for the infinite-temperature chaos exponent λ L and the rate of decay of the two point function in real time, µ, for different values of q. The parameter µ is defined by saying that for large t, G(t) ∝ e −µt .
Although we do not know how to compute these quantities analytically, it is straightforward to compute them by numerically solving the Schwinger-Dyson equations in real time. This directly gives µ, and by constructing the retarded ladder kernel [27], one can find λ L as described in [30]. We give a table here of the values that we found. Note that although the  physical model makes sense only for even integer q, the Schwinger-Dyson equations make sense for arbitrary q. Note that the values of µ for fairly large q seem to agree well with the formula µ 2J = 1 q + π 2 6q 2 + O(q −3 ) that one would expect based on [31].