Chaos, Complexity, and Random Matrices

Chaos and complexity entail an entropic and computational obstruction to describing a system, and thus are intrinsically difficult to characterize. In this paper, we consider time evolution by Gaussian Unitary Ensemble (GUE) Hamiltonians and analytically compute out-of-time-ordered correlation functions (OTOCs) and frame potentials to quantify scrambling, Haar-randomness, and circuit complexity. While our random matrix analysis gives a qualitatively correct prediction of the late-time behavior of chaotic systems, we find unphysical behavior at early times including an $\mathcal{O}(1)$ scrambling time and the apparent breakdown of spatial and temporal locality. The salient feature of GUE Hamiltonians which gives us computational traction is the Haar-invariance of the ensemble, meaning that the ensemble-averaged dynamics look the same in any basis. Motivated by this property of the GUE, we introduce $k$-invariance as a precise definition of what it means for the dynamics of a quantum system to be described by random matrix theory. We envision that the dynamical onset of approximate $k$-invariance will be a useful tool for capturing the transition from early-time chaos, as seen by OTOCs, to late-time chaos, as seen by random matrix theory.


Introduction
Quantum chaos is a general feature of strongly-interacting systems and has recently provided new insight into both strongly-coupled many-body systems and the quantum nature of black holes. Even though a precise definition of quantum chaos is not at hand, understanding how chaotic dynamics process quantum information has proven valuable. For instance, Hayden and Preskill [1] considered a simple model of random unitary evolution to show that black holes rapidly process and scramble information. The suggestion that black holes are the fastest scramblers in nature [2,3] has led to a new probe of chaos in quantum systems, namely the 4-point out-of-time-order correlation function (OTOC). Starting with the work of Shenker and Stanford [4,5], it was shown [6] that black holes are maximally chaotic in the sense that a bound on the early time behavior of the OTOC is saturated. Seperately, Kitaev proposed a soluble model of strongly-interacting Majorana fermions [7,8], which reproduces many features of gravity and black holes, including the saturation of the chaos bound [9,10]. The Sachdev-Ye-Kitaev model (SYK) has since been used as a testing ground for questions about black hole information loss and scrambling.
In recent work, [11] found evidence that the late time behavior of the SYK model can be described by random matrix theory, emphasizing a dynamical perspective on more standard notions of quantum chaos. Random matrix theory (RMT) has its roots in nuclear physics [12,13] as a statistical approach to understand the spectra of heavy atomic nuclei, famously reproducing the distribution of nearest neighbor eigenvalue spacings of nuclear resonances. Random matrix theory's early success was later followed by its adoption in a number of subfields, including large N quantum field theory, string theory, transport in disordered quantum systems, and quantum chaos. Indeed, random matrix eigenvalue statistics have been proposed as a defining characteristic of quantum chaos, and it is thought that a generic classically chaotic system, when quantized, has the spectral statistics of a random matrix ensemble consistent with its symmetries [14].
Current thinking holds that both spectral statistics and the behavior of the OTOC serve as central diagnostics of chaos, although the precise relation between the two is unclear. OTOCs have recently been studied using techniques from quantum information theory, and it was found that their decay as a function of time quantifies scrambling [15] and randomness [16]. The goal of this paper is to connect various concepts as a step towards a quantum information-theoretic definition of quantum chaos that incorporates scrambling, chaotic correlation functions, complexity, approximate randomness, and random matrix universality.
As alluded to above, an important first step to bridge early-time chaos and late-time dynamics is to understand the relation between the OTOC and the spectral statistics. We derive an explicit analytical formula relating certain averages of OTOCs and spectral form factors which holds for arbitrary quantum mechanical systems. A simple corollary is that spectral form factors can be approximated by OTOCs defined with respect to random (typically non-local) operators, highlighting the fact that spectral statistics are good probes of macroscopic thermodynamic properties, but may miss important microscopic physics such as early-time chaos. We also compute correlation functions for an ensemble of Hamiltonians given by the Gaussian Unitary Ensemble (GUE), and find that 4-point OTOCs decay faster than 2-point correlators contrary to findings for local quantum Hamiltonians [6]. Due to the basis independence of the GUE, averaged correlation functions do not depend on sizes of operators, and thus can be expressed solely in terms of spectral form factors. Furthermore, we find that correlators for GUE Hamiltonians do not even depend on the time-ordering of operators. These results imply that the GUE ignores not only spatial but also temporal locality.
Another important question is to understand the approach to entropic (as well as quantum complexity) equilibrium via pseudorandomization at late times in strongly coupled systems. We consider the ensemble of unitaries generated by fixed GUE Hamiltonians, namely and study its approach to Haar-randomness by computing frame potentials which quantify the ensemble's ability to reproduce Haar moments. We find that the ensemble forms an approximate k-design at an intermediate time scale, but then deviates from a k-design at late times. These results highlight that the k-design property fails to capture late time behavior of correlation functions. An interesting application of unitary k-designs is that Haar-randomness is a probe of quantum complexity. We apply techniques from [16] to lower bound the quantum circuit complexity of time evolution by GUE Hamiltonians and find a quadratic growth in time.
In order to make precise claims about the behavior of OTOCs and frame potentials for GUE Hamiltonians, we need explicit expressions for certain spectral quantities. Accordingly, we compute the 2-point and 4-point spectral form factors for the GUE at infinite temperature, as well as the 2-point form factor at finite temperature. We then use these expressions to discuss time scales for the frame potentials. We also analytically compute the late-time value of the k-th frame potential for arbitrary k.
Under time evolution by strongly-coupled systems, correlations are spread throughout the system and the locality of operators as well as time-ordering appear to be lost from the viewpoint of correlation functions, as implied by the late-time universality of random matrix theory. Also motivated by the k-design property's failure to capture late-time chaos (i.e., E GUE t fails to be Haar-random at late times), we propose a new property called k-invariance, which may provide a better probe of chaos at both early and late times. The property of kinvariance characterizes the degree to which an ensemble is Haar-invariant, meaning that the ensemble is invariant under a change of basis. When the dynamics becomes approximately Haar-invariant, correlation functions can be captured solely in terms of spectral functions, which signifies the onset of an effective random matrix theory description. We thus provide an information theoretically precise definition of what it means for a system's dynamics to be described by random matrix theory. Specifically, we say that an ensemble of Hamiltonian time evolutions E t is described by random matrix theory at times greater than or equal to t with respect to 2k-point OTOCs when E t is approximately k-invariant with respect to its symmetry class, for example the symmetry class of either the unitary, orthogonal, or symplectic groups.
The paper is organized as follows: In Section 2, we provide a brief overview of random matrix theory and explicitly compute the spectral form factors for the GUE at infinite and finite temperature. In Section 3, we compute correlation functions for the GUE, including the OTOC, and demonstrate that they can be expressed in terms of spectral correlators as well. In Section 4, we compute frame potentials for the GUE, and extract the timescales when it becomes an approximate k-design both at finite and infinite temperatures. We show that the frame potentials can be also expressed as products of sums of spectral correlators. In Section 5, we discuss complexity bounds and complexity growth for the GUE. In Section 6, we discuss Haar-invariance as a diagnostic of delocalization of spatial degrees of freedom and random matrix universality at late times. We conclude with a discussion in Section 7. The appendices contain an review of various information-theoretic definitions of scrambling in the literature, a discussion of information scrambling in black holes, more details of our random matrix calculations, and numerics.

Form factors and random matrices
For a long time, the spectral statistics of a random matrix were seen as a defining feature of quantum chaos. More recently, it has been proposed that the late time behavior of certain strongly coupled theories with large numbers of degrees of freedom also exhibit a dynamical form of random matrix universality at late times [11]. The central object of study in this recent work is the 2-point spectral form factor, 1 which is defined in terms of the analytically continued partition function R 2 (β, t) ≡ |Z(β, t)| 2 , where Z(β, t) ≡ Tr e −βH−iHt (2) and where · denotes the average over an ensemble of Hamiltonians. In SYK as well as standard RMT ensembles, the 2-point spectral form factor decays from its initial value and then climbs linearly back up to a floor value at late times. The early time decay of the form factor is called the slope, the small value at intermediate times is called the dip, the steady linear rise is called the ramp, and the late time floor is called the plateau. In Fig. 1 we observe these features in SYK with N = 26 Majoranas, which has GUE statistics at late times. 2 Furthermore, it was found that in SYK, time scales and many features of the slope, dip, ramp and plateau agree with predictions from RMT. In this section, we briefly review random matrix theory. Further, we study the 2-point spectral form factor for the GUE at both infinite and finite temperature, compute its analytic form, and extract its dip and plateau times and values. 3 In addition, we compute the 4-point form factor and extract relevant time scales and values. We find that the late-time rise in the 4-point form factor is quadratic in t, in contrast to the linear rise in the 2-point form factor. The expressions derived in this section will give us analytic control over the correlation functions and frame potentials discussed in later sections. For a detailed treatment of the random matrix ensembles, we refer the reader to [25,26,27]. 1 One motivation for studying this object is a simple version of the information loss problem in AdS/CFT [17], where the apparent exponential decay of 2-point correlation functions in bulk effective field theory contradicts the finite late-time value of e −O(S) implied by the discreteness of the spectrum. As we shall see in the next section, the 2-point form factor is equivalent to the average of 2-point correlation functions. More recently, chaos and information loss in correlation functions and form factors have also been studied in holographic CFTs [18,19,20,21,22]. 2 For SYK with N Majoranas, particle-hole symmetry dictates the symmetry class of the spectrum, where N (mod 8) ≡ 2 or 6 corresponds to GUE statistics [23]. Furthermore, the spectral density of SYK and its relation to random matrices has also been discussed in [24]. 3 We consider the GUE since it corresponds to the least restrictive symmetry class of Hamiltonians. The generalization of our analysis to the GOE or GSE is left for future work.

Random matrix theory
The Gaussian Unitary Ensemble GUE(L, µ, σ) is an ensemble of L × L random Hermitian matrices, where the off-diagonal components are independent complex Gaussian random variables N (µ, σ) C with mean µ and variance σ 2 , and the diagonal components are independent real Gaussian random variables N (µ, σ) R . It is common in the math literature to work with GUE(L, 0, 1) which has zero mean and unit variance, but we will instead use the normalization GUE(L, 0, 1/ √ L) so that the eigenvalues do not scale with the system size. 4 The probability density function of the ensemble has a Gaussian form up to a normalizing factor. As the GUE is invariant under unitary conjugation H → U HU † , the integration measure dH = d(U HU † ) is likewise invariant. The probability measure P (H) dH on the ensemble integrates to unity. Instead of integrating over dH directly, it is convenient to change variables to eigenvalues and diagonalizing unitaries. Up to a normalizing constant C defined in Eq. (166) in App. C, the measure becomes where dU is the Haar measure on the unitary group U (L) and ∆(λ) is the Vandermonde 4 The reason for using the normalization GUE(L, 0, 1/ √ L) instead of GUE(L, 0, 1) is as follows: With the standard normalization GUE(L, 0, 1), the energy spectrum ranges from −2 √ L to 2 √ L. This implies that by applying a local operator, one may change the energy of the system by O( √ L). With the physical normalization GUE(L, 0, 1/ √ L), the energies lie within the range −2 to 2, and local operators act with O(1) energy. See [28] for discussions on normalizing q-local Hamiltonians.
The joint probability distribution of eigenvalues is and is symmetric under permutations of its variables. For simplicity, we define a measure Dλ which absorbs the Gaussian weights, eigenvalue determinant, and constant factors. We integrate over the GUE in the eigenvalue basis as The probability density of eigenvalues ρ(λ), where can be written in terms of the joint eigenvalue probability density by integrating over all but one argument The spectral n-point correlation function, i.e. the joint probability distribution of n eigenvalues, ρ (n) is defined as With these definitions at hand, we quote a few central results. In the large L limit, the density of states for the Gaussian ensembles gives Wigner's famous semicircle law, where the semicircle diameter is fixed by our chosen eigenvalue normalization. Also in the large L limit, the spectral 2-point function can be expressed in terms of a disconnected piece and a squared sine kernel as [25] ρ (2)

Spectral form factors
The 2-point spectral form factor for a single Hamiltonian H is given in terms of the analytically continued partition function Z(β, t) = Tr (e −βH−iHt ) as Similarly, the spectral form factor averaged over the GUE is denoted by which is the Fourier transform of the spectral 2-point function. At infinite temperature β = 0, the Fourier transform of the density of states is just Z(t) = Tr (e −iHt ), the trace of unitary time evolution. Using the semicircle law, we take the average of Z(t) at large L where J 1 (t) is a Bessel function of the first kind. The function J 1 (2t)/t is one at t = 0 and oscillates around zero with decreasing amplitude that goes as ∼ 1/t 3/2 , decaying at late times. At infinite temperature, the 2-point spectral form factor for the GUE is More generally, we will also be interested in computing 2k-point spectral form factors the Fourier transform of the spectral 2k-point function ρ (2k) . 5 Although the form factors can be written exactly at finite L, our analysis will focus on analytic expressions that capture the large L behavior. 6 Note that in [11], 2-point form factors were normalized via dividing by Z(β) 2 . At infinite temperature, this simply amounts to dividing by L 2 , but at finite temperature the situation is more subtle. As we will comment on later, the correct object to study is the quenched form factor Z(β, t)Z * (β, t)/Z(β) 2 , but since we only have analytic control over the numerator and denominator averaged separately, we instead work with the unnormalized form factor R 2 as defined above. 5 In the random matrix literature, the 2-point form factor is often defined as the Fourier transform of the connected piece of the spectral 2-point correlation function, where the connected piece of the spectral 2k-point function is often referred to as the 2k-level cluster function. Our definition for the 2k-point spectral form factor R 2k includes both connected and disconnected pieces. 6 In addition to relating the form factor to the fidelty of certain states, [29] also studies the 2-point spectral form factor for the GUE, computing an analytic form at finite L and discussing the dip and plateau.

2-point spectral form factor at infinite temperature
Here we calculate the 2-point form factor at β = 0. Working at large L, we can evaluate R 2 by first pulling out the contribution from coincident eigenvalues In the large L limit, we can make use of the sine kernel form of the 2-point function Eq. (13). Using Eq. (16), we integrate the first term, a product of 1-point functions, and find In order to integrate the sine kernel, we make the change of variables: which allows us to rewrite the integral Having decoupled the variables, in order to integrate over u 1 and u 2 , we must employ a short distance cutoff. We develop a certain approximation method which we refer to as the 'box approximation,' and explain its justification in App. C. Specifically, we integrate u 1 from 0 to u 2 , and integrate u 2 from −π/2 to π/2, Note that in the random matrix theory literature, a common treatment [30] is to approximate the short-distance behavior of ρ (2) (λ 1 , λ 2 ) by adding a delta function for coincident points λ 1 = λ 2 and inserting a 1-point function into the sine kernel. For R 2 this gives the same result as the approximation above, but this short-distance approximation does not generalize to higher k-point form factors, as discussed in App. C. The 2-point form factor we compute where we define the functions As was discussed in [11], we can extract the dip and plateau times and values from R 2 . From the ramp function r 2 , we observe that the plateau time is given by (26) where after the plateau time, the height of the function R 2 is the constant L. This value can also be derived by taking the infinite time average of R 2 . The other important time scale is the dip time t d , which we can estimate using the asymptotic form of the Bessel function at large t, which gives oscillating at times ∼ O(1) with decaying envelope ∼ t −3/2 . While the first dip time is O(1), we will be interested in the dip time as seen by the envelope, especially because the oscillatory behavior disappears at finite temperature (see Fig. 3). Solving for the minimum of the envelope of R 2 , we find up to order one factors. The true minimum of the envelope and ramp is (6/π) 1/4 √ L ≈ 1.18 √ L, but in light of the approximations we made, and the fact that the precise ramp behavior is somewhat ambiguous, we simply quote the dip time as We plot the 2-point form factor for different dimensions L in Fig. 2. The oscillations in the early time slope behavior of the form factor simply arise from the oscillatory behavior of the Bessel function, i.e. the zeros of r 1 (t) 2 .

2-point spectral form factor at finite temperature
Recall that spectral 2-point function at finite temperature is defined as As described in App. C, we insert the spectral 2-point function ρ (2) and, using the shortdistance kernel, find R 2 (t, β) in terms of the above functions: First we comment on the validity of the approximations used in the finite temperature case. The first and third terms of Eq. (29), dominating at early and late times respectively, are computed from the 1-point function. Therefore, the expression captures the early time, slope, and plateau behaviors. The dip and ramp behavior, encoded in the r 2 term, are more subtle. The expression correctly captures the slope of the ramp, but deviates from the true ramp at large β. We will discuss this more in App. C, but here only discuss quantities around the dip for small β, where Eq. (29) is a good approximation.
The ramp function r 2 , which is the same as at infinite temperature, gives the plateau time For convenience we define the function h 1 (β) ≡ J 1 (2iβ)/iβ, which is real-valued in β. 8 The initial value and plateau value are thus given by To find the dip time, we make use of the asymptotic expansion of the Bessel function as Finding the minimum of the expression gives the dip time and evaluating R 2 at the dip gives up to order one factors. While we could write down full expressions for the dip time h 2 and dip value h 3 in terms of the Bessel function, we only trust Eq. (29) in this regime for small β, and thus report the functions perturbatively.
The 2-point form factor is plotted in Fig. 3 for various values of L and β. While increasing the dimension L lowers the dip and plateau values and delays the dip and plateau times, decreasing temperature raises the dip and plateau values and delays the dip times. We also note that lowering the temperature smooths out oscillations from the Bessel function. 9 After normalizing R 2 (β, t) by its initial value, the late-time value is 2 −S (2) where S (2) is the thermal Rényi-2 entropy.

4-point spectral form factor at infinite temperature
We can also compute the 4-point form factor at infinite temperature, defined as As we explain in App. C, we compute R 4 by replacing ρ (4) by a determinant of sine kernels and carefully integrating each term using the box approximation. The result is given in terms of the functions r 1 (t) and r 2 (t) defined above. The initial value of R 4 is L 4 . Given the dependence on the ramp function, the plateau time is still t p = 2L. The plateau value 2L 2 − L matches the infinite time average of Eq. (35). The dip time is found again by considering the leading behavior of R 4 and expanding the Bessel functions Solving for the minimum, we find the dip time where at the dip time R 4 (t d ) ≈ L. We plot the R 4 (t) for various values of L in Fig. 4. Let us summarize the time scales and values for the form factors considered above: form factor time scale time value The β-dependent functions were defined above.
With an understanding of the first few form factors, we briefly describe the expected behavior for 2k-point form factors R 2k (t) (with k L). Initially, R 2k decays from L 2k as The ∼ t k growth after the dip levels off at the plateau time 2L, with plateau value ∼ kL k .
Given that we employed some approximation to compute the form factors, we perform numerical checks for the expressions above in App. D. At both infinite and finite temperature, we correctly capture the time scales, early time decay, dip behavior, and the late-time plateau, but find slight deviations from the analytic prediction for the ramp. We discuss this and possible improvements to the ramp function in App. D.
Later we will study frame potentials which diagnose whether an ensemble forms a kdesign. We will find that the frame potentials for the ensemble of unitaries generated by the GUE can be written in terms of the spectral form factors discussed here, thereby allowing us to extract important time scales pertaining to k-designs.
3 Out-of-time-order correlation functions 3

.1 Spectral form factor from OTOCs
Although quantum chaos has traditionally focused on spectral statistics, recent developments from black hole physics and quantum information theory suggest an alternative way of characterizing quantum chaos via OTOCs [1,4,6,15]. In this subsection, we bridge the two notions by relating the average of 2k-point OTOCs to spectral form factors. We work at infinite temperature (β = 0), but note that by distributing operator insertions around the thermal circle, the generalization to finite temperature is straightforward. The results in this subsection are not specific to GUE and are applicable to any quantum mechanical system.
Consider some Hamiltonian H acting on an L = 2 n -dimensional Hilbert space, i.e. consisting of n qubits. We start by considering the 2-point autocorrelation function A(0)A † (t) , time evolved by H. We are interested in the averaged 2-point function: where dA represents an integral with respect to a unitary operator A over the Haar measure on U (2 n ). We note that since the 2-point Haar integral concerns only the first moment of the Haar ensemble, we can instead average over the ensemble of Pauli operators 10 where A j are Pauli operators and L 2 = 4 n is the number of total Pauli operators for a system of n qubits. To derive the spectral form factor, we will need the first moment of the Haar ensemble Applying Eq. (41) to Eq. (39), we obtain where R H 2k (t) ≡ |Tr(e −iHt )| 2k is the same as R 2k (t) from before, but written for a single Hamiltonian H instead of averaged over the GUE. Thus, the 2-point form factor is proportional to the averaged 2-point function.
This formula naturally generalizes to 2k-point OTOCs and 2k-point form factors. Consider 2k-point OTOCs with some particular ordering of operators Operators which do not multiply to the identity have zero expectation value at t = 0, and the value stays small as we time-evolve. We are interested in the average of such 2k-point OTOCs. By using Eq. (41) 2k − 1 times, we obtain where Thus, higher-point spectral form factors can be also computed from OTOCs. In fact, by changing the way we take an average, we can access various types of form factors. For instance, let us consider OTOCs We then have The fact that the expression on the right-hand side is asymmetric is because the operator is not Hermitian. 11 These expressions not only provides a direct link between spectral statistics and physical observables, but also give a practical way of computing the spectral form factor. If one wishes to compute or experimentally measure the 2-point form factor R 2 (t), one just needs to pick a random unitary operator A and study the behavior of the 2-point correlator A(0)A † (t) . In order to obtain the exact value of R 2 (t), we should measure A(0)A(t) for all possible Pauli operators and take their average. Yet, it is possible to obtain a pretty good estimate of R 2 (t) from A(0)A(t) with only a few instances of unitary operator A. Consider the variance of A(0)A(t) , If the variance is small, then the estimation by a single A would suffice to obtain a good estimate of R 2 (t). Computing this, we obtain This implies that the estimation error is suppressed by 1/L. By choosing a Haar unitary operator A (or 2-design operator, such as a random Clifford operator), one can obtain a good estimate of R 2 (t).

A check in a non-local spin system
To verify Eq. (42) and the claim that the variance of the 2-point functions is small, consider a random non-local (RNL) spin system with the Hamiltonian given as the sum over all 2-body operators with random Gaussian couplings J ijαβ [31]: where i, j sum over the number of sites and α, β sum over the Pauli operators at a given site. Such Hamiltonians have a particularly useful property where locally rotating the spins of H RNL with couplings J ijαβ creates another Hamiltonian H RNL with different couplings J ijαβ . More precisely, if we consider an ensemble of such 2-local Hamiltonians; the ensemble is invariant under conjugation by any 1-local Clifford operator Here a Clifford operator refers to unitary operators which transform a Pauli operator to a Pauli operator. For this reason, the 2-point correlation function A(0)A † (t) E RNL depends only of the weight of Pauli operator A: where A is an m-body Pauli operator (51) and where · E RNL denotes the ensemble (disorder) average. Thus, this system is desirable for studying the weight dependence of 2-point correlation functions.
As mentioned above, we can write the average over 2-point correlation functions as the average over all Paulis as time evolving with H RNL . Numerically, for a single instance of H RNL , we find that the average over all 2-point functions of Pauli operators gives R 2 as expected. In Fig. 5, for n = 5 sites and averaged over 500 random instances of H RNL to suppress fluctuations, we plot R 2 along side all 2-point functions of Pauli operators. We observe that correlation functions depend only on the weight of A, with the higher weight Pauli operators clustered around R 2 . The arrangement of the 2-point functions for Paulis of different weight depends on the number of sites n. But for n = 5, the even and odd weight Paulis are respectively below and above R 2 at later times and weight 2 and 3 Paulis are the closest to R 2 . We will comment on the size dependence of correlators in Sec. 6.
The conclusion is that we can choose a few random Paulis, and by computing 2-point functions, quickly approximate R 2 . We also checked that by increasing the number of spins, the variance becomes small and 2-point functions become closer to R 2 .

Operator averages and locality
Let us pause for a moment and discuss the meaning of considering the operator average from the perspective of spatial locality in quantum mechanical systems. In deriving the above exact formulae relating the spectrum and correlators, we considered the average of OTOCs over all the possible Pauli operators. For a system of n qubits, a typical Pauli operator has support on 3n/4 qubits because there are four one-body Pauli operators, I, X, Y, Z. It is essential to recognize that the average of correlation functions is dominated by correlations of non-local operators with big supports covering the whole system. Thus, the spectral statistics have a tendency to ignore the spatial locality of operators in correlation functions. 12 In fact, the spectral statistics ignore not only spatial locality but also temporal locality of operators. Namely, similar formulas can be derived for correlation functions with various ordering of time. For instance, consider the following 4-point correlation function: where the C operator acts at time 2t instead of 0 such that the correlator is not out-of-timeordered. Computing the average of the correlator with ABCD = I, we obtain which is exactly the same result as the average of 4-point OTOCs in Eq. (44). Indeed, timeordering is washed away since GUE Hamiltonians cause a system to rapidly delocalize, thus destroying all local temporal correlations.
In strongly coupled systems with local Hamiltonians, correlation functions behave rather differently depending on the time ordering of operators, as long as the time gaps involved are small or comparable to the scrambling time [33,4,5,9]. This observation hints that the spectral statistics are good probes of correlations at long time scales, but may miss some important physical signatures at shorter time scales, such as the exponential growth of OTOCs with some Lyapunov exponent.

OTOCs in random matrix theory
Next, we turn our attention to correlators averaged over random matrices, analytically computing the 2-point correlation functions and 4-point OTOCs for the GUE. We begin with the 2-point correlation functions for the GUE where dH represents an integral over Hamiltonians H drawn from the GUE. Since the GUE measure dH is invariant under unitary conjugation dH = d(U HU † ) for all U , we can express the GUE average as by inserting U, U † where dU is the Haar measure. Haar integrating, we obtain where AB represents the connected correlator. If A, B are non-identity Pauli operators, we have If R 2 (t) 1, we have for any non-identity Pauli operator A. It is worth emphasizing the similarity between Eq. (59) and Eq. (42). Recall that Eq. (42) was derived by taking an average over all Pauli operators A and is valid for any quantum mechanical system while Eq. (59) was derived without any additional assumption on the locality of Pauli operator A. Namely, the key ingredient in deriving Eq. (59) was the Haar-invariance of the GUE measure dH. The resemblance of Eq. (59) and Eq. (42) implies that the GUE is suited for studying physical properties of chaotic Hamiltonians at macroscopic scales such as thermodynamic quantities. Next, we compute the 4-point OTOCs for the GUE Inserting U, U † , we must compute the fourth Haar moment We can avoid dealing directly with the (4!) 2 terms generated by integrating here and focus on the leading behavior. Assuming that A, B, C, D are non-identity Pauli operators, we obtain Thus, OTOCs are almost zero unless ABCD = I. 13,14 A similar analysis allows us to obtain the following result for 2k-point OTOCs: The above equation is nonzero when Again, note the similarity between Eq. (63) and Eq. (44). Recall that in order to derive Eq. (44), we took an average over This analysis also supports our observation that the GUE tends to capture global-scale physics very well. Similar calculations can be carried out for correlation functions with arbitrary timeordering. For m-point correlators, at the leading order, we have where t ij = t j − t i . Namely, we have: So, for the GUE, GUE . This implies that the GUE does not care if operators in the correlator are out-of-time-ordered or not, ignoring both spatial and temporal locality.
Careful readers may have noticed that the only property we used in the above derivations is the unitary invariance of the GUE ensemble. If one is interested in computing correlation functions for an ensemble of Hamiltonians which are invariant under conjugation by unitary operators, then correlation functions can be expressed in terms of spectral form factors. Such techniques have been recently used to study thermalization in many-body systems, see [35] for instance. We discuss this point further in Sec. 6.

Scrambling in random matrices
Finally, we discuss thermalization and scrambling phenomena in random matrices by studying the time scales for correlation functions to decay.
We begin with 2-point correlators and thermalization. In a black hole (or any thermal system), quantum information appears to be lost from the viewpoint of local observers. This apparent loss of quantum information is called thermalization, and is often associated with the decay of 2-point correlation functions A(0)B(t) where A and B are some local operators acting on subsystems H A and H B which local observers have access to. In the context of black hole physics, H A and H B correspond to infalling and outgoing Hawking radiation and such 2-point correlation functions can be computed from the standard analysis of Hawking and Unruh [36,37]. 2-point correlation functions of the form A(0)B(t) have an interpretation as how much information about initial perturbations on H A can be detected from local measurements on H B at time t. A precise and quantitative relation between quantum information (mutual information) and 2-point correlation functions is derived in Appendix B. The upshot is that the smallness of A(0)B(t) implies the information theoretic impossibility of reconstructing from Hawking radiation (defined on H B ) an unknown quantum state (supported on H A ) that has fallen into a black hole.
Is the GUE a good model for describing thermalization? For the GUE, we found A(0)B(t) R 2 (t)/L 2 for non-identity Pauli operators with AB = I. Since the early time behavior of R 2 (t) factorizes and is given by the time scale for the decay of 2-point correlation functions, denoted by t 2 , is O(1). This is consistent with our intuition from thermalization in strongly coupled systems where t 2 β. As such, quantum information appears to be lost in O(1) time for local observers in systems governed by GUE Hamiltonians. Next, let us consider 4-point OTOCs and scrambling. To recap the relation between OTOCs and scrambling in the context of black hole physics, consider a scenario where Alice has thrown an unknown quantum state into a black hole and Bob attempts to reconstruct Alice's quantum state by collecting the Hawking radiation. Hayden and Preskill added an interesting twist to this classic setting of black hole information problem by assuming that the black hole has already emitted half of its contents and Bob has collected and stored early radiation in some quantum memory he possesses. The surprising result by Hayden and Preskill is that, if time evolution U = e −iHt is approximated by a Haar random unitary operator, then Bob is able to reconstruct Alice's quantum state by collecting only a few Hawking quanta [1]. This mysterious phenomenon, where a black hole reflects a quantum information like a mirror, relies on scrambling of quantum information where Alice's input quantum information is delocalized over the whole system [15]. The definition of scrambling can be made precise and quantitative by using quantum information theoretic quantities as briefly reviewed in App. A and App. B.
The scrambling of quantum information can be probed by the decay of 4-point OTOCs of the form A(0)B(t)A † (0)B † (t) where A, B are some local unitary operators. An intuition is that an initially local operator B(0) grows into some non-local operator under time evolution via conjugation by e −iHt , and OTOCs measure how non-locally B(t) has spread. For this reason, the time scale t 4 when OTOCs start decaying is called the scrambling time.
Having reviewed the concepts of scrambling and OTOCs, let us study scrambling in random matrices. For the GUE, we found A(0)B(t)C(0)D(t) R 4 (t)/L 4 for non-identity Pauli operators with ABCD = I. Since one can approximate R 4 as R 4 (t) R 2 (t) 2 at early times, we obtain This implies that the decay time scale of 4-point OTOCs is t 4 1 2 t 2 , which is O(1) and is faster than the decay time of 2-point correlation functions. This behavior is in strong contrast with behaviors in chaotic systems studied in the context of black hole physics. Namely, in holographic large-N CFTs with classical gravity duals, the decay times are with t 4 t 2 . Also, the scrambling time t 4 ∼ O(1) violates a bound on quantum signalling which would hold for quantum systems with local interactions [1,3]. The pathology can be also seen from the viewpoint of black hole information problems. If black hole dynamics is modeled by the time evolution of some Hamiltonian sampled from GUE random matrices, then the scrambling time for OTOC decay is O(1). So Bob might be able to reconstruct Alice's quantum state in O(1) time. If Bob jumps into the black hole after decoding Alice's quantum state, Alice can send a quantum message with O(1) energy to Bob and verify the quantum cloning.
Another difference between GUEs and actual chaotic systems can be seen from the behaviors of correlators of the form A(0)B(t)C(2t)D(t) . In the previous subsection, we showed that In strongly chaotic large-N systems, we expect the following behaviors [9,6]: Thus these two types of correlators should behave in a rather different manner. These discrepancies clearly highlight the failure of GUE to capture early-time quantum chaos behavior which is present in realistic strongly-coupled systems. What was wrong about random matrices? Recent developments from black hole physics teach us that the butterfly effect in chaotic systems stems from delocalization of quantum information where initially local operators grow into non-local operators. However, for the GUE, the system does not distinguish local and non-local operators. To be concrete, let A local be some one-qubit Pauli operator, and A non-local = U A local U † be some non-local operator created by conjugating A local via some non-local unitary U . Due to the Haar invariance of the GUE measure, we have As this argument suggests, the GUE is a good description of quantum systems which have no notion of locality. After the scrambling time, we expect that an initially local operator A local (0) will time evolve to A local (t) which has support on the whole system, and the notion of locality is lost (or at least obfuscated) after the scrambling time. We thus expect that A local (0)A local (t) GUE will be a good description of two-point correlation functions after the scrambling time. Similarly, the GUE does not distinguish time-ordering as seen from . This implies that, at late time scales when the GUE becomes a good description, the system forgets the locality of time. In this sense, the GUE captures physics of quantum chaos after the locality of spacetime is forgotten. We will elaborate on this issue in Sec. 6.

Frame potentials and random matrices
In discussions of black hole information loss, we often approximate the chaotic internal dynamics of a black hole as evolution by a Haar random unitary [1,4], and talk about typical black hole states as random pure states generated by Haar unitaries [38]. While it is impractical to generate a Haar random unitary operator -due to its exponential quantum circuit complexity, as noted by [1] -it often suffices to sample from an ensemble that only reproduces the first few moments of the Haar ensemble. [16] made significant progress in quantifying chaos in OTOCs by relating the late-time decay of 2k-point OTOCs to the k-th frame potential, measuring the distance to Haar-randomness. 15 One efficient way of generating a unitary k-design is to employ random local quantum circuits where one applies random two-qubit unitary gates at each unit time [40,1,41] and the ensemble monotonically becomes a k-design as time evolves. Motivated by tensor network descriptions of the AdS/CFT correspondence [42,43], random local quantum circuits have been used as a toy model of the Einstein-Rosen bridge and the dynamics of the two-sided AdS black hole [15]. While such toy models are successful in capturing key qualitative features such as fast scrambling and complexity growth, their dynamics is not invariant under time translations. A natural question is to ask if systems of time-independent Hamiltonians are able to form k-designs or not.
In this section we study time-evolution by the ensemble of GUE Hamiltonians and quantify its approach to Haar-randomness by asking when it forms a unitary k-design. We consider the ensemble of unitary time evolutions at a fixed time t, with Hamiltonians drawn from the GUE As the frame potential quantifies the ensemble's ability to reproduce Haar moments, i.e. form a k-design, we will be interested in the time scales at which we approach "Haar values." Making use of the spectral form factors computed for the GUE, we derive explicit expressions for the frame potentials and extract the key time scales. We find that the GUE ensemble forms an approximate k-design after some time scales, but then deviates from being a kdesign.

Overview of QI machinery
We begin by introducing the formalism of unitary k-designs and defining the frame potential.
for some function f and for all V ∈ U (L). Taking Given an ensemble of unitary operators E = {p i , U i }, we might ask how Haar-random it is. More specifically, we should ask to what extent our ensemble reproduces the first k moments of the Haar ensemble, a notion quantified by unitary k-designs. 17 The k-fold channel with respect to the ensemble E is written here for a continuous ensemble. We say that an ensemble E is a unitary k-design if and only if Φ meaning we reproduce the first k moments of the Haar ensemble. But it does not make sense to compute the k-fold channels and check this equality for all operators in the algebra. Thus, we want a quantity which measures how close our ensemble is to being Haar-random. The frame potential, defined with respect to an ensemble as [44] F (k) measures Haar-randomness in the sense that is tells us how close the ensemble is to forming a unitary k-design. More precisely, it measures the 2-norm distance between the k-fold channel Φ Haar with respect to the Haar ensemble. The frame potential will be a central object of study in this section.
The k-th frame potential for the Haar ensemble is given by Furthermore, for any ensemble E of unitaries, the frame potential is lower bounded by the Haar value F (k) with equality if and only if E is a k-design. In particular, the deviation from the Haar value F Haar corresponds to the 2-norm distance of 2-fold quantum channels. The notion of an approximate k-design is reviewed in App. A.
We will also need to compute moments of the Haar ensemble, i.e. the ability to integrate monomials of Haar random unitaries. The exact formula [45,46] for evaluating these moments is given by where, for the n-th moment, we sum over cycles of the permutation group S n . The Weingarten function Wg, a function of cycles σ ∈ S n , is defined in App. C.3. Performing Haar integrals then simply amounts to contracting indices and computing the Weingarten functions.

Frame potentials for the GUE
The first frame potential for the GUE is written as Noting that the GUE measure is invariant under unitary conjugation, we find where we define Λ ≡ U e −iHt U † , i.e. the matrix exponential of the GUE matrix in the diagonal basis. Going into the eigenvalue basis, we can express the GUE integral as where we have used the left and right invariance of the Haar measure to write the expression as a single Haar integral. Written out explicitly with indices, and we can do the Haar integral using the second moment We find written in terms of the 2-point form factor We know from the expression found in Sec. 2, that at early times R 2 ∼ L 2 , so the early time behavior of the frame potential is dominated by the R 2 2 term until near the dip time. At the dip time, R 2 ≈ √ L and F GUE ≈ 1, achieving the Haar value and forming a 1-design. At late times t → ∞, we take the late time limit of R 2 where only the δ ij terms contribute, and find R 2 ≈ L, meaning that the first frame potential F  A common intuition is that physical systems will become more and more uniformly random as time goes passes. Then one might expect that the frame potential, a measure of Haar randomness, would be a monotonically decreasing function with time. While it is monotonic for random local quantum circuits, we found that it is not generically monotonic for ensembles of unitaries generated by fixed Hamiltonians. 18 In Sec. 6, we propose an alternative quantity which may be monotonic at late times. k = 2 frame potential We can similarly compute the second frame potential using the unitary invariance of the GUE measure: where again, Λ is the exponentiated diagonal matrix. The fourth moment of the Haar ensemble that appears here generates 4! 2 = 576 terms. Recalling Eq. (80), we can compute the fourth moment by computing the necessary Weingarten functions and summing over δ-function contractions.
We relegate the presentation of the full expression for the k = 2 frame potential, and the definitions of the spectral quantities on which it depends, to Appendix C.2. While F As we approach the dip time, the spectral quantities in the second frame potential, At late times, in the infinite time average, we know that R 2 → L, and R 4 → 2L 2 − L from the two eigenvalue pairings in the sum where the exponent vanishes, i.e. δ ik δ j and δ i δ jk , and accounting for the i = j = k = terms. This tells us that the only terms that survive at late times, and are not suppressed in L, are which gives us F GUE ≈ 10, to leading order in 1/L.

Higher k frame potentials
Let us review what we have discussed so far.

k = 1 Frame Potential
We computed the first frame potential for the GUE to be Since R 2 → L and R 4 → 2L 2 − L in the late time limit, F GUE approaches 10.
Early : F The full expression for the third frame potential is given in App. C.2. The leading order behavior at early times is R 2 6 /L 6 , and at the dip time, the third frame potential approaches its Haar value. Again, the late time behavior above is better understood by looking at the dominant form factors. At late times, the terms that contribute at zeroth order in L are as R 2 → L, R 4 → 2L 2 , and R 6 → 6L 3 to leading order in L. In summary, Early : F where '#cycles' denotes the number of cycles in the permutation σ. The Weingarten function contributing at leading order in 1/L is the one labeled by the partitioning of 2k into ones, i.e. the trivial permutation of S 2k , which contributes as All other Weingarten functions, labeled by the integer partitions of 2k, contribute at subleading order at early and late times. Thus, instead of computing the full fourth frame potential, we can compute the terms of combinations of spectral functions with this Weingarten function as their coefficient. In the sum over elements of the permutation group σ, τ ∈ S 2k , we simply need the terms where τ σ −1 is the trivial permutation, i.e. τ = σ. Computing this we find the dominant contribution to the k = 4 frame potential, at leading order in 1/L. The full expression is still too large to reproduce here, but we can comment on the relevant features. The early time behavior is At the dip, where R n ∼ L n/2 , all terms are suppressed, leaving only the constant Haar value 24. Lastly, the late time behavior is In summary, k-th Frame Potential We are now poised to discuss the general form of the k-th frame potential We can also determine what the general late time value should look like. Above, we understood that the plateau value of the k-th frame potential is the sum of the Haar value and the contributions of the spectral functions. It was only the squares of the spectral functions that gave contributions which were not suppressed by 1/L at late times. Extrapolating from above, we expect the k-th frame potential to have with coefficients c . Given the way the spectral form factors are generated from Haar integration, we can understand these coefficients as the number of partial bijections of a given length. For example, for k = 3 there are 24 partial bijections on a 3 element set of length 2, i.e. 24 nonclosed cycles of length two, which gives us 24 ways of constructing the 2-point functions for k = 3. More generally, the coefficients above can be written as where for k = 4, we have the coefficients 1, 16, 72, 96, 24. The k-th coefficient is the Haar value c k (k) = k!, i.e. the number of ways to construct 0-point functions in the Haar integration. We can then write down the general late time behavior for the k-th frame potential Since the late time value of the 2k-point spectral form factor is, to leading order in L, R 2k = k!L k , the late time floor value for the k-th frame potential of the GUE is where the first few terms of this sequence are 2, 10, 96, 1560.
We emphasize that while the purpose of this section is to understand GUE Hamiltonians, the derivations in this subsection where we relate the frame potential to spectral 2k-point functions only used the unitary invariance of the measure to proceed in doing the calculations by Haar integration. Thus, if we are handed an ensemble whose measure is unitarily invariant, the same relations hold.

Frame potentials at finite temperature
We now generalize the discussion of the frame potential to ensembles at finite temperature and compute the thermal frame potential for the GUE. Again we consider the ensemble of unitary time evolutions at a fixed time t, with H drawn from an ensemble E. One might consider generalizing the frame potential to finite temperature by defining the frame potential with respect to a thermal density matrix ρ β = e −βH /Tr(e −βH ), and taking thermal expectation values. With this in mind, we define the frame potential at finite temperature by taking the average over all thermal 2k-point functions, with the operator insertions A and B spaced equidistant on the thermal circle AB(t) . . . AB(t) = Tr (e −βH/2k Ae −βH/2k B(t) . . . e −βH/2k Ae −βH/2k B(t) /Tre −βH . (108) Averaging the norm-squared 2k-point correlation function over all operators and then averaging over the ensemble, we find Tr(e −βH 1 )Tr(e −βH 2 )/L 2 .
Note that this definition differs from the one in the Appendix of [16] by a factor of L 2 . With this slight change in normalization, we reduce to the usual frame potential F (k) E at infinite temperature. k = 1 Frame Potential Let us compute the first thermal frame potential for GUE Hamiltonians: Tr(e −βH 1 )Tr(e −βH 2 )/L 2 .
where we define which is normalized such that we recover the infinite temperature form factor R 2 (t) when β → 0. This normalization differs from |Z(t, β)| 2 /Z(β) 2 , which gives an initial value of one.
Here the thermal form factor which naturally arises from the thermal frame potential has a late time value which is β-independent. The initial value of R 2 (t, β), and thus F GUE (t, β), depends on the β.
In stating the time scales for the thermal frame potential, we will work with the 'quenched' version of Eq. (112) where the numerator and denominator are averaged separately. As we mentioned in Sec. 2.2, the 'annealed' 2-point form factor is the correct object to consider, but we opt to work with the more analytically tractable quenched form factor. Numerically, the two functions are in close agreement with each other.

Time scales from GUE form factors
With an understanding of the behavior of the GUE spectral form factors from Sec. 2.2, we can now look at the time scales for the dip and plateau of the first frame potential At t d ≈ √ L, when R 2 ≈ √ L, we reach the minimal Haar value of 1, and at the plateau time t p = 2L, when R 2 = L, we reach the late time value of 2.
There is another time scale at play here which is an artifact of working at infinite temperature. We might also ask what is the first time the form factor or frame potential reaches its minimal value. This time scale can be attributed to the first zero of the Bessel function, J 1 (2t) = 0 at t ≈ 1.92, and is universal for all values of L. This is the first time at which the ensemble becomes a 1-design. Something like the scrambling time, where the frame potential begins to deviate rapidly from its initial value, occurs at O(1) time.
Using the explicit expression for the GUE 4-point form factor, we can also verify the expected time scales in the second frame potential F  Lastly, we can extract the time scales and values of the finite temperature frame potential from our discussion of R 2 (t, β). The initial value of the first frame potential is where h 1 (β) = J 1 (2iβ)/iβ. At the dip time, t d ≈ h 2 (β/2) √ L, the thermal form factor defined above R 2 (t d , β/2) ≈ √ Lh 3 (β/2)/h 1 (β), with the functions defined in Sec. 2.2. For β L, we have F Finally, as we can see from time averaging Eq. (112), at the plateau time for any β, as the late time value of the thermal frame potential does not depend on the temperature. Let us briefly comment on the dip value of the k-th frame potential at infinite temperature. As we discussed, at the dip time t d ≈ √ L, the frame potentials reached the Haar value and form an approximate k-design for some k. Determining the size of k requires an understanding of the corrections to the dip value. The leading order correction to the Haar value at the dip comes from R 2 2 /L 2 ∼ 1/L, the coefficient of which is c k−1 (k) = k! k. So at the dip time meaning we form an approximate k-design for k L. The claim that the GUE forms a k-design at intermediate times but then deviates from this behavior at late times might at first seem surprising, but the late time behavior makes sense if we consider the dephasing of GUE eigenvalues in the t → ∞ limit. Under the exponential map λ → e iλt , the GUE eigenvalues are distributed around the circle and at early times will still be correlated and logarithmically repel. However, at late times the eigenvalues will spread uniformly around the circle. Moreover, explicitly computing the level density for the GUE under the exponential map and taking the long time limit, one finds that the density becomes constant and the eigenvalues are independently and uniformly distributed. Eigenvalue statistics of Haar random unitary operators can be characterized by the following well-known relation [48] 19 If we suppose that the eigenvalue distribution of U is random, then dU tr(U t )tr(U † t ) would not depend on t. Therefore, the late-time eigenvalue statistics of unitaries generated by fixed GUE matrices is quite different from those of Haar unitaries, which have eigenvalue repulsion.

Complexity and random matrices
In recent years, the notion of quantum complexity has attracted significant attention in the study of quantum many-body systems [49,50,51]. By quantum complexity of a quantum state |ψ , we mean the minimal number of elementary local quantum gates necessary to (approximately) create |ψ from a trivial product state with no entanglement. A similar characterization applies to the quantum complexity of unitary operators constructed from the identity operator. Quantum complexity provides deep insight into what kinds of physical operations are allowed (or prohibited) in a given physical system as states or operators of very large complexity cannot be prepared or implemented in a short period of time by the evolution of local Hamiltonians with finite energy density. Quantum complexity has also proven useful in condensed matter physics where topological phases of matter can be classified in terms of the quantum complexity of ground state wavefunctions [52]. More recently, it was asked whether the AMPS thought experiment can be carried out in a physically reasonable amount of time and resources by considering the computational complexity of decoding the Hawking radiation [53]. In the past few years, quantum complexity has been considered in holography as a possible CFT observable 20 to study the late-time dynamics of the AdS black holes [50,51]. Despite all the promises of the usefulness of quantum complexity, a precise understanding of the growth of quantum complexity in quantum many-body systems, especially in AdS/CFT, continues to elude us. While it is possible to see a hint of complexity growth from entanglement dynamics at early times before the scrambling time, 21 the late-time complexity growth remains difficult to observe as the extremal surfaces do not go through the interior of the black hole and entanglement entropies get saturated at late times. From a mathematical perspective, it is extremely challenging to compute the quantum gate complexity of a given quantum state |ψ as one essentially needs to consider all the possible quantum circuits creating |ψ and find the one with the minimal number of gates. Thus it would be valuable to have an analytical toy example of Hamiltonians whose dynamics indeed makes the quantum complexity of wavefunctions increase even after the scrambling time by providing a rigorous lower bound on quantum complexity.
Here, we present analysis of complexity growth of typical Hamiltonian time evolution by GUEs and show that quantum complexity indeed grows in time. A lower bound on a typical unitary operator in an ensemble E can be computed from a simple counting argument. Observe that short depth quantum circuits can prepare only a small number of unitary operators which occupy a tiny fraction of the whole space of unitary operators. The idea is that, if there are so many unitary operators in E which are sufficiently far apart and distinguishable, then most of operators in E cannot be created by a short depth circuit. Furthermore, it has been found that lower bounds on the number of distinguishable unitary operators in E can be obtained by frame potentials, a measure of randomness in E. Although such a counting argument often gives a rather loose lower bound, it is still possible to obtain a rigorous complexity lower bound for a system of quantum many-body Hamiltonians. See [16] for a rigorous treatment and details.
To be concrete, let us consider a system of qubits where we pick a pair of qubits and apply an arbitrary two-qubit gate at each step. While the circuit complexity for generating an ensemble and the circuit complexity for generating a particular unitary in the ensemble are different, the former provides an approximate lower bound for the circuit complexity of typical unitary operators in the ensemble [16]. We define the number of quantum gates necessary to create an ensemble E by a quantum gate complexity C gate . The lower bound on the quantum gate complexity is then given by 20 At least with respect to some subspace of states of the boundary CFT. 21 For example, from the level-statistics of the entanglement spectrum [54].
up to some constant multiplicative factor. Let us consider the bound for small k. In Sec. 4, we found that F (k) drops to its minimal value ∼ k! at t ∼ O(1) (the first zero of the Bessel function). We thus have up to the first dip time t dip ∼ O(1) where we have used an approximation R 2k (R 1 ) 2k . Thus, at t ∼ O(1), the following lower bound on the complexity is obtained: Converting it into a quantum circuit complexity, we obtain This lower bound should be valid as long as k ∼ O(1). As we have discussed in Sec. 2 and Sec. 4, the early-time oscillations of spectral form factors and frame potentials disappear at finite temperature. It would be then useful to consider the complexity lower bound based on envelope functions of form factors and frame potentials. Since the asymptotic behavior is given by R 1 (t) ∼ 1/t 3/2 , we would have where β implies that we consider the asymptotic behaviors of the envelope. Thus, the quantum circuit complexity grows at least logarithmically in t up to the thermal dip time. While the above studies are able to provide rigorous lower bounds on quantum circuit complexity, the bounds are not meaningful when k is small. To obtain a meaningful lower bound on quantum complexity, we need to evaluate the frame potential and form factor for large k. Analytically computing R 2k and F (k) for large k seems rather challenging. Instead, we employ a certain heuristic argument to derive the decay of R 2k and F (k) . Let us begin by recalling the early-time behavior of 1-point form factor. The 1-point form factor R 1 (t) can be analytically written via a contour integral as follows [55] For L → ∞, the integral gives the Bessel function: But J 1 (2t) t for t 1, so we have where the Gaussian decay is dominant for t 1 while, for 1 t √ L, the Bessel function dominates the decay. In a similar manner, the 2k-point form factor can be analytically written as where the sign of ±it depends on the index of u i and the integral part is equal to unity at t = 0. In previous sections, we have neglected the Gaussian decay because our discussions were mostly centered on small k spectral form factors. But, for large k, the Gaussian decay part is no longer negligible. Let us bound the form factor by using the Gaussian decay part only by neglecting the decay contribution from Bessel functions in the integral part: While the validity of this inequality for large k remains unclear, we assume its validity up to the dip time ∼ √ L when ramp behavior kicks in. The notion of unitary k-design and its application to complexity would be meaningful only up to k ∼ O(L) (see [16] for instance). By using this approximate bound for k = cL with c ∼ O(1), we will have up to the dip time ∼ √ L. This leads to the following estimate of quantum complexity growth for the GUE: which predicts a quadratic growth of quantum complexity. Let us compare our estimate with predictions from the AdS/CFT correspondence. According to the conjecture that quantum complexity is proportional to the volume in the bulk, the early-time complexity (volume) growth is quadratic in time, and then becomes linear in time. Our analysis above suggests that the complexity growth for the GUE is (at least) quadratic in t for a long time until very close to the saturation of quantum complexity ∼ L. One may find that t 2 complexity growth is unphysical as the system has evolved only for time t. The point is that the GUE Hamiltonian is generically non-local and is comprised of O(n)-body terms whereas we measure quantum complexity by using two-local quantum gates as building blocks. where U is integrated over the unitary group U (L) and where f (H) is an arbitrary function. As a consequence, a typical GUE Hamiltonian is non-local (or O(n)-local), so local operators are delocalized essentially immediately. Indeed, the Haar-invariance of the GUE ensemble and non-locality of its Hamiltonians resulted in unusual behaviors of OTOCs whose decay time was shorter than that of 2-point correlation functions. It thus appears that local chaotic Hamiltonians and a typical Hamiltonian from a Haar-invariant ensemble behave in a dramatically different way. However, previous studies on chaotic Hamiltonians suggest that at late times, Haarinvariant Hamiltonian ensembles, such as the GUE, GOE and GSE, capture behaviors of correlation functions remarkably well. This apparent tension between early time and late time behaviors may be resolved in the following manner. Initially, any ensemble of local Hamiltonians is not Haar-invariant because Hamiltonians are made of local terms. This can be clearly seen from the fact that the OTOC, A(0)B(t)A(0)B(t) , behaves rather differently depending on the sizes of operators A, B. Yet, after the scrambling time when local operators become delocalized by Hamiltonian evolution, it becomes harder to tell whether the original operators A(0), B(0) were local or not, and we expect that the unitary ensemble becomes 'approximately' Haar-invariant.

Characterization of Haar-invariance
With this observation in mind, we are naturally led to consider a fine-grained characterization of Haar-invariance which we shall call k-invariance. Intuitively, k-invariance refers to an ensemble of unitary operators which appear to be Haar-invariant up to k-th moments. More precisely, let E be an ensemble of unitary operators. We define a Haar-invariant extension E of this ensemble by: From the construction, we can easily see W EW † = E for any unitary operator W , and so the Haar'ed ensemble is independent of any basis. Let us consider the k-fold twirl superoperator: Then, E is said to be k-invariant if and only if Φ (k) An ensemble of unitaries is Haar-invariant if and only if it is k-invariant for all k ≥ 1. Similar definitions apply to Haar-invariance with respect to orthogonal and symplectic groups. The utility of k-invariance can be seen from an explicit relation between correlation functions and spectral statistics. Recall that we have derived the following relation in the GUE by using the Haar-invariance of the GUE measure: It is clear that the same derivation applies to any ensemble which is k-invariant. The implication is that, after the k-invariance time, the behavior of 2k-point OTOCs can be completely determined by the spectral statistics alone. The physical significance of the kinvariance time is that it is the time scale when OTOCs behave in a similar way regardless of the locality or non-locality of the operators A j , B j (as well as their time-ordering). A similar conclusion holds for k-th frame potentials which can be written only in terms of spectral form factors for k-invariant ensembles. Thus, k-invariance and its associated time scale will be a useful notion to characterize the loss of locality from the perspective of 2k-point OTOCs and the onset of random matrix behavior. How can one verify that some ensemble E is k-invariant? One formal approach is to use frame potentials. Let us define the following operator which corresponds to the difference between tensor expanders from E and its Haar-invariant extension E. Then we have E is the k-th frame potential for an ensemble E. Here we used the fact that the Haar unitary ensemble is left and right invariant. Therefore, we arrive at the following inequality with equality if and only if E being k-invariant. The difference F (k) measures the 2-norm distance to being k-invariant. 22 The above derivation is a straightforward generalization of a method used in [44].

Haar-invariance in a spin system
Let us examine k-invariance for the random non-local (RNL) spin system discussed in Sec. 3.1 where we defined the Hamiltonian in Eq. (48) as the sum over all 2-body operators with random Gaussian couplings J ijαβ . The time evolution of the first frame potential for this ensemble as well as its Haar-conjugated generalization are shown in Fig. 7 along side the difference F (1) E , measuring the distance to 1-invariance. We only report numerics for a modest spin system of n = 6 spins. The difficulty of performing frame potential numerics is mentioned in App. D.
We find that in this chaotic spin system, at early times we quickly deviate from 1invariance, but after evolution by the system's chaotic dynamics, we observe an approach to approximate 1-invariance at late times. For this system, we see that the frame potential approaches, but does not equal, its Haar-invariant counterpart at later times. But we found numerically that increasing the number of sites makes this late time difference smaller. Thus we expect that at large N for chaotic systems, we reach k-invariance at late times. , computed numerically using the 2point form factor as in Eq. (86). On the right we plot the difference, measuring the 2-norm distance to 1-invariance and observe approximate 1-invariance at late times.

Comments on k-invariance
While frame potentials provide a quantitative way of judging if an ensemble E is k-invariant or not, it would be beneficial to relate it to some physical observables such as correlation functions. It is perhaps not a big surprise that k-invariance can be verified by 2k-point OTOCs. The following statement holds: where A j , B j are Pauli operators, andÃ j ,B j are some transformations from A j , B j such that where W is an arbitrary element of unitary 2k-design. The proof is straightforward and thus is skipped. Motivated by late-time random matrix universality of chaotic quantum systems, we have introduced a novel quantum information theoretic concept, k-invariance, as a possible way of bridging early-time and late-time physics. We would like to comment on a few caveats.
First, consider an ensemble of unitary operators E generated by some Hamiltonians. Since E t=0 = {I}, the ensemble is Haar-invariant at time t = 0. Thus, an ensemble is initially k-invariant and is expected to immediately deviate at t > 0 and then eventually become approximately k-invariant. Therefore F (k) E , which quantifies k-invariance, is not a monotonic quantity under time evolution. However, we expect that it is monotonically decreasing at late times. We observe these features in the non-local spin system described above. Depending on the symmetries of the system of interest, we would need to consider the Haar measure with respect to an appropriate Lie group G ⊂ U (L).
Second, for realistic physical systems with local Hamiltonians, it is not likely that an ensemble E t becomes k-invariant in an exact sense even at very late times. This can be seen from a recent work which shows that the late-time value of infinite temperature OTOCs A However, it should be noted that a prediction from the AdS/CFT seems to suggest that correlation functions may become exponentially small e −O(S) even if A, B are local operators. This may suggest a subtle but important distinction between ordinary strongly interacting systems and gravitational systems which leads to a far-reaching question concerning the universality of gravity and the universality of random matrix theory, seen from the lens of k-invariance.
Let us conclude the section with a brief remark on the Eigenstate Thermalization Hypothesis (ETH). The notion of k-invariance may be viewed a dynamical analog of Berry's conjecture about random eigenvectors, which was the motivation behind ETH [57,58,59]. A basic assumption of ETH is that matrix elements of a local operator O, with respect to energy eigenstates, look "random" inside some sufficiently small energy window ∆E. A system achieving k-invariance roughly tells us that energy eigenstates may be treated as random vectors after sufficiently long times for studying dynamics via OTOCs. 23 Given the prevalence of eigenstate thermalization in strongly correlated many-body systems, 24 a precise relation between k-invariance, ETH and OTOCs would provide clarity on defining what it means for a quantum system to be chaotic.

Discussion
Random matrix theory provides a powerful paradigm for studying late-time chaos. We have leveraged the technology of random matrix theory and Haar-invariance to study correlation functions like OTOCs which diagnose early-time chaos, and frame potentials which diagnose randomness and complexity. The salient feature of the GUE which gave us computational traction is its Haar-invariance, namely that the ensemble looks the same in any basis. As a result, the dynamics induced by GUE Hamiltonians is non-local (O(N )-local) with respect to any tensor factor decomposition of the Hilbert space, and so the dynamics immediately delocalizes quantum information and other more subtle forms of correlations. Accordingly, the GUE captures features of the long-time physics of a local system that has been delocalized.
In a chaotic quantum system described by a local Hamiltonian, there are two temporal regimes of interest: times before the system scrambles and thus has mostly local correlations, and times after the system scrambles when correlations have effectively delocalized. We suggested that the transition between these two regimes may be due to the onset of approximate Haar-invariance, and we defined k-invariance as a precise characterization. A careful understanding of Haar-invariance for ensembles of local quantum systems could yield precise insights into the apparent breakdown of locality, and tell us in what time regimes we can use Haar-invariance to calculate late-time physics (i.e., correlation functions, frame potentials, complexity, etc.) A concrete way of studying delocalization of operators and the emergence of k-invariance would be to compare connected pieces of OTOCs with local and non-local operators and observe their eventual convergence. Of particular interest is to find the 2-invariance time when all the 4-point OTOCs, regardless of sizes of operators, start to behave in a similar manner. This time scale must be at least the scrambling time since OTOCs with local operators start to decay only around the scrambling time while OTOCs with non-local operators decay immediately. Relatedly, we would like to draw attention to an upcoming work [63] which studies the onset of random matrix behavior at early times.
In this paper, we computed correlation functions averaged over an ensemble of Hamiltonians. Chaotic systems described by disordered ensembles tend to have small variance in their correlators, and their averaged correlation functions are close to those computed for a simple instance of the ensemble. Even in regimes where replica symmetries are broken, performing time bin averaging reproduces the averaged behaviors very well. We find in App. D.3 that the time bin-averaged frame potential in the large L limit for two samples agrees with averaging over the whole ensemble.
We conclude by mentioning a far reaching goal, but one that provides the conceptual pillars for these ideas, namely understanding black holes as quantum systems. While black holes are thermodynamic systems whose microscopic details remain elusive, questions about information loss can be precisely framed by late-time values of correlation functions within AdS/CFT [17], where unitary evolution can be discussed in terms of the boundary CFT. Ultimately, we would like to use random matrix theory to characterize chaos and complexity in local quantum systems and identify late-time behaviors which are universal for gravitational systems. An interesting future question is to see if gravitational systems are described by random matrices in the sense of k-invariance and pinpoint some late-time behavior which results from gravitational universality. and NHJ acknowledge support from the Simons Foundation through the "It from Qubit" collaboration. NHJ is supported the Institute for Quantum Information and Matter (IQIM), an NSF Physics Frontiers Center (NSF Grant PHY-1125565) with support from the Gordon and Betty Moore Foundation (GBMF-2644). JL is supported in part by the U.S. Department of Energy, Office of Science, Office of High Energy Physics, under Award Number de-sc0011632. Research at Perimeter Institute is supported by the Government of Canada through Industry Canada and by the Province of Ontario through the Ministry of Research and Innovation.

A Scrambling and 2-designs
Recently there has been growing interest in scrambling and unitary designs from the high energy and quantum information communities. Here we provide a short summary of different ways of quantifying them for infinite temperature cases.

A.1 Scrambling
We begin with scrambling. Consider a system of qubits and non-overlapping local (O(1)body) Pauli operators V, W and compute The initial value of OTOC at t = 0 is 1. Scrambling is a phenomenon where the OTOC becomes O( ) with 1 being a small but finite constant: It is often the case that OTOCs with local operators are the slowest to decay. This can be seen from our analysis on 4-point spectral form factors. So, by the scrambling time, OTOCs with non-local operators are already O( ) or smaller. The scrambling time is lower bounded by O(log(n)) in the case of 0-dimensional O(1)-local systems due to a Lieb-Robinson-like argument [3]. Scrambling has caught significant attention from the quantum gravity community since it is closely related to the Hayden-Preskill thought experiment on black hole information problems [1]. Assume that V, W act on qubits on some local regions A, D respectively, and define their complements by B = A c , C = D c . Imagine that A is an unknown quantum state |ψ thrown into a "black hole" B, and the whole system evolves by some time-evolution operator U = e −iHt . At time t, we collect the "Hawking radiation" D and attempt to reconstruct (an unknown) |ψ from measurement on D. Such a thought experiment was considered by Page who argued that, if a black hole's dynamics U is approximated by a random unitary operator, then reconstructing |ψ is not possible unless we collect more than n/2 qubits of the Hawking radiation [64]. As we shall show in Appendix B, the impossibility of reconstruction of A from D is reflected in the smallness of the 2-point correlation functions: The famous calculations by Hawking and Unruh imply that these two-point correlators are thermal, and quickly become small.
Hayden and Preskill considered a situation where a black hole B has already emitted half of its contents, and we have collected its early radiation and stored it in some secure quantum memory M . The quantum memory M is maximally entangled with B, and the question is whether we can reconstruct |ψ by having access to M . It has been shown that scrambling, as defined above, implies that we can reconstruct |ψ with some good average fidelity by collecting the Hawking radiation on D at time t: Therefore, scrambling implies the possibility of recovering local quantum information via local measurements on the Hawking radiation. A random unitary operator U typically gives very small OTOCs which enables reconstruction of A in the Hayden-Preskill thought experiment.
Reconstruction problems in the Hayden-Preskill setting are closely related to the problem of decoupling. A crucial difference between scrambling and decoupling is that decoupling typically considers A, D to be some finite fraction of the whole system and concerns the reconstruction of unknown many-body quantum states supported on a big region A. Since we quantify the reconstruction via fidelity for many-body quantum states, the requirement tends to be more stringent. The relation between scrambling and decoupling is discussed in [65] in the context of local random circuits.

A.2 Unitary designs
Next let us discuss unitary 2-designs. Consider an ensemble of time evolution operators U j with probability distributions p j ; E = {U j , p j } with j p j = 1. The 2-fold channels of E and the Haar ensemble are If Φ E (ρ) = Φ Haar (ρ) for all ρ, then we say E is 2-design. One can check if E is 2-design or not by looking at OTOCs. Consider the OTOC V W (t)V W (t) for arbitrary Pauli operators V, W which are not necessarily local operators. We will be interested in the ensemble averages of OTOCs: Haar for all pairs of Pauli operators V, W , then the ensemble forms a unitary 2-design [16]. A typical unitary operator from a 2-design achieves scrambling because for any (possibly non-local) Pauli operators V, W . The first equation implies that the OTOC value for a single instance from the ensemble is typically 1/L in absolute value while the second equation implies that the OTOC, after ensemble averaging, is 1/L 2 . Since OTOCs are small, a typical 2-design unitary operator U implies scrambling, but the converse is not always true. Recall that scrambling only requires OTOC = O( ). There is thus a big separation in the smallness of the OTOC, and the scrambling time may be much shorter than the 2-design time. Also, scrambling requires OTOC = O( ) only for local operators while a 2-design unitary makes the OTOC small for all pairs of Pauli operators. The lower bound for the exact 2-design time is O(log(n)), but no known protocol achieves this time scale.
One important distinction between scrambling and the 2-design time is how small the OTOCs becomes. The phenomena of scrambling concerns the deviation of OTOC values from the maximal value 1. The concept of a 2-design concerns the deviation of OTOC values from the minimal value O(1/L). The former is related to early-time chaos and the latter is related to late-time chaos.

A.3 Approximate 2-designs
Finally, let us briefly discuss the notion of approximate 2-design. When two quantum operations Φ E and Φ Haar are close to each other, we say that E is an approximate 2-design. In order to be quantitative, however, we need to pick appropriate norms with which two quantum operations can be compared. The 2-norm distance can be defined in a simple way via If S = 0, then Φ E and Φ Haar would be the same. We say that E is a δ-approximate 2-design in the 2-norm if tr(SS † ) ≤ δ. Frame potentials are closely related to the 2-norm distance because tr(SS † ) = F E − F Haar ≥ 0. In [16], a relation between the frame potential and OTOCs has been derived In practice, the main contribution to the left-hand side comes from OTOCs of the form AB(t)AB(t) E . For simplicity of discussion, let us assume that AB(t)CD(t) E = 0 when C = A or D = B (where A, B, C, D are non-identity Pauli operators). Then, a simple analysis leads to for typical non-identity Pauli operators A, B. Thus, being a δ-approximate 2-design in the 2-norm implies that OTOCs are typically small. However, this does not necessarily imply scrambling because OTOCs with local operators are often the slowest to decay. In order to guarantee scrambling, we would need a δ L -approximate design in the 2-norm (under an assumption on AB(t)CD(t) E = 0 for C = A or D = B). For this reason, an alternative distance measure called the diamond norm is often used in quantum information literature. See [66] for relations between different norms.

B Information scrambling in black holes
In this Appendix, we discuss behaviors of 2-point correlators and 4-point OTOCs from the viewpoint of information scrambling in black holes. We begin by deriving a formula which relates two-point autocorrelation functions and mutual information. We will be interested in the following quantity and U is the time-evolution operator of the system, and P A and P D are sets of Pauli operators on A and D. There are L 2 A and L 2 D Pauli operators. The relation between apparent information loss and two-point correlators can be understood by using the state representation |U of a unitary operator U . Given a unitary operator U acting on an n-qubit Hilbert space H, one can view U as a pure quantum state |U defined on a 2n-qubit Hilbert space H ⊗ H: Or equivalently, |U ≡ 1 One easily see that the quantum state |U is uniquely determined by a unitary operator U . The state representation allows us to view |U ABCD as a four-partite quantum state: where B = A c and D = C c in the original system of qubits. Given the state representation |U of a unitary operator, we can derive the following formula where I (2) (A, D) is the Rényi-2 mutual information between A and D for |Ψ , defined by AD . To derive the formula, let ρ AD be the reduced density matrix of |U on AD. Its graphical representation is The averaged 2-point correlator is given by where dotted lines represent averaging over Pauli operators. By using 1 Let us further ponder this formula. For strongly interacting systems, it is typically the case that So, the following relation for the autocorrelation functions holds approximately: where we took A and D to be the same subset of qubits.
The above formula has an interpretation as information retrieval from the early Hawking radiation. Consider scenarios where Alice throws a quantum state |ψ into a black hole and Bob attempts to reconstruct it from the Hawking radiation. In accordance with such thought experiments, let A be qubits for Alice's quantum state, B be the black hole, C be the remaining black hole and D be the Hawking radiation. Then, the averaged 2-point correlation functions have an operational interpretation as Bob's strategy to retrieve Alice's quantum state. Let us assume that the initial state of the black hole is unknown to Bob and model it by a maximally mixed state ρ B = I B L B . Alice prepares an EPR pair |EPR AR on her qubits and her register qubits. Notice the difference from the Hayden-Preskill setup where Bob had access to some reference system B which is maximally entangled with the black hole B. In this decoding problem, we do not grant such access to Bob. He just collects the Hawking radiation D and tries to reconstruct Alice's quantum state.
The most obvious strategy is to apply the inverse U † . However, Bob does not have an access to qubits on C. So, he applies U † CD ⊗ I R to ρ C ⊗ ρ DR where ρ C = I C L C . Graphically, this corresponds to The success of decoding is equivalent to distillation of an EPR pair between A and R. So, we compute the EPR fidelity. Namely, letting Π be a projector onto an EPR pair between A and R, we have which leads to Therefore, the decay of 2-point correlation functions indeed implies that Bob cannot reconstruct Alice's quantum state. Finally, let us summarize the known relations between correlation functions and mutual information: Note that the first formula proves that the decay of OTOCs leads to large I (2) (A, BD) which implies the possibility of Bob decoding Alice's quantum state by accessing both the early radiation B and the new Hawking radiation D. These two formulae allow us to formally show that a black hole can be viewed as a quantum error-correcting code. Let A, D be degrees of freedom corresponding to incoming and outgoing Hawking radiation, and B, C be degrees of freedom corresponding to other exotic high energy modes at the stretched horizon. Since a black hole is thermal, we know that | O A O D (t) avg | decays at t ∼ O(β). Also, due to the shockwave calculation by Shenker and Stanford [4], we know that N ). These results imply that after the scrambling time: The implication is that quantum information injected from A gets delocalized and non-locally is hidden between C and D. The error-correction property can be seen by where a is the number of qubits on A. Namely, if we see the black hole as a quantum code which encodes A into BCD, then the code can tolerate erasure of any single region B, C, D.
In other words, accessing any two of B, C, D is enough to reconstruct Alice's quantum state. Thus, black hole dynamics, represented as a four-partite state |U ABCD , can be interpreted as a three-party secret sharing quantum code.

C Spectral correlators and higher frame potentials
In this Appendix we will present formulas for form factors from random matrix theory. For GUE(L, 0, 1/ √ L), L × L matrices with off-diagonal complex entries and real diagonal entries chosen with variance σ 2 = 1/L, the joint probability of eigenvalues for GUE, with normalizing factors, is and the joint probability distribution of n eigenvalues (i.e., the n-point spectral correlation function), defined as ρ (n) (λ 1 , . . . , λ n ) = dλ n+1 . . . dλ L P (λ 1 , . . . , λ L ) .
We can compactly express ρ (n) (λ 1 , . . . , λ n ) in terms of a kernel K [26,25] as In the large L limit, the kernel K is approximately where the i = j case is called the sine kernel, and the i = j case is simply the Wigner semicircle. In the large L limit, the basic approach for computing spectral form factors will be expanding the determinant in Eq. (168) using the kernel in Eq. (169), and computing the Fourier transform of the resulting sums of product of kernels. Thus we will have sums of integrals of the form [25] where we define the Fourier transform of the sine kernel is not regulated in the (λ i + λ j ) direction. The most direct method to soften this divergence is to impose a cutoff which is fixed by the normalization condition L π While the 'box approximation' of applying the cutoff allows us to compute higher-point spectral correlators in the large L limit, it does lead to errors relative to an exact answer whose closed form is not tractable. 25 Thus we must be careful to keep track of these errors and compare with numerics. However, we find that at infinite temperature, the box approximation of the spectral form factors is analytically controlled at early times like O(1) and late times greater than O( √ L). To understand the errors of the box approximation, we first consider various cases heuristically: when we have i k i = 0, the λ integral in Eq. (170) is directly fixed by normalization. When i k i = 0, the λ integral in Eq. (170) dephases and so decays when | i k i | is large, and thus the induced error is unimportant at long times. At small, O(1) values of the |k i |'s (assuming that m is O(1)), the error induced by the box approximation is also small and the value is still close to the i k i = 0 value.
For instance, carefully keeping track of factors of L tells us that in R 4 , for early times like O(1) the error is suppressed by O(1/L) relative to largest order terms, while for late times after O( √ L) the error is suppressed by O(1/ √ L) relative to the largest order terms. In this discussion, particularly for i k i = 0, we assumed simple sine kernel correlations and found r 2 to be a pure linear function. However, a more delicate treatment shows some other transition time scale at early times, which likely complicates the functional form of r 2 and gives a different slope for the ramp. We briefly address this issue for our numerics in App. D.
Since the dephasing of the λ integral at large | i k i | is suppressed at finite temperature, to better capture long-time finite temperature eigenvalue correlations we use a modified kernel K which is valid in the short distance limit |λ a − λ b | ∼ O(1/L) [67,55], which naturally provides a cutoff in the (λ i + λ j ) direction. However, this approximation assumes the continued domination of the regulated integral in the short distance limit, which may not be true for large β. However, for small β the modified kernel is reliable. In the generic case, one should consider the full expression of Hermite polynomials as the sine kernel, and correctly take the limit. A complicated formula has been derived in [67,55] from a saddle point approximation.

C.1 Expressions for spectral correlators
Using the analysis above, it is straightforward to compute form spectral correlation functions for the GUE. It is convenient to define as mentioned earlier. The infinite temperature form factors which appear in the calculation of the first and second frame potentials are As R 4,2 is simply R 2 (2t), we only need to compute the first three spectral correlation functions. We will also investigate the finite temperature version of R 2 , which we defined as R 2 at infinite temperature We start by computing R 2 at infinite temperature: Evaluating the first term in the integral, we find The second term can be evaluated using Eq. (170), and we find The final result is R 2 at finite temperature As explained above, to better capture long-time correlations at finite temperature we will use the short-distance-limit kernel K. Firstly, for i = j, we have L Dλ e −2βλ 1 = Lr 1 (2iβ) .
For i = j we have Putting everything together, we obtain

R 4 at infinite temperature
We now compute R 4 (t), again by separately considering coincident eigenvalues, using the determinant of kernels, and Fourier transforming to find We can simplify this formula at early times of O(1) and late times greater than O( √ L) by dropping subdominant terms and find where the 2L 2 r 2 2 term gives a quadratic rise at late times, akin to the ramp in R 2 .

R 4,1 at infinite temperature
We find that Just as above, we can approximate R 4,1 at early and late times by C.2 Expressions for higher frame potentials k = 2 frame potential We computed the second frame potential for the GUE to be with form factors as defined in Eq. (176). Let us try and extract the interesting behavior encoded in the expression. We know the maximal value of the spectral n-point functions defined above at early times, R 2 ∼ L 2 , R 4 ∼ L 4 , R 4,1 ∼ L 3 , and R 4,2 ∼ L 2 . From the expression for the frame potential above, we keep the terms that are not suppressed in 1/L, i.e. can contribute at least at zeroth order: with the Haar value appearing at the beginning. At early times, the leading order behavior is F GUE ∼ R 2 4 /L 4 . From our calculation of the n-point form factors, we know that at the dip time all form factor terms above are suppressed in L, meaning the frame potential goes like the Haar value. Knowing the late time value of the 2-point and 4-point form factors, the terms above that will contribute at late times are which gives ≈ 10 in the large L limit. In the strict t → ∞ limit, where R 2 → L, R 4 → 2L 2 − L, and R 4,1 , R 4,2 → L, we have GUE ≈ 10 for L 1 .

C.3 Expressions for Weingarten
Lastly, we give the definition of the unitary Weingarten function, which appeared in the integration of Haar random unitaries in Eq. (80). The 2k-th moment of the Haar ensemble appeared in the k-th frame potential. For the n-th moment, the Weingarten function is a function of an element σ of the permutation group S n and presented as defined in [46], where we sum over integer partitions of n (recall that the conjugacy classes of S n are labeled by integer partitions of n). χ λ is an irreducible character of S n labeled by λ (as each irrep of S n can be associated to an integer partition) and e is the identity element. .

D Additional numerics
We conclude with a few numerical checks on the formulae we derived for the form factors and frame potentials.

D.1 Form factors and numerics
As we mentioned in Sec. 2.2 and discussed in App. C.1, in order to derive expressions for the form factors for the GUE we had to make approximations which should be compared to numerics for the GUE. We briefly remind the reader that at infinite temperature, we derived the expression R 2 (t) = L 2 r 2 1 (t) − Lr 2 (t) + L .
Numerical checks of this expression are shown in Fig. 8. We see that the approximations employed work well at β = 0, reproducing the early time oscillations, dip, plateau, and ramp features. But there is some discrepancy in the ramp behavior which merits discussion. As we take L → ∞, the difference between the predicted ramp and numerical ramp is not suppressed. In Fig. 8, we see that the relative error between the numerics and analytic prediction does not decrease as we increase L, indicating that this difference in the ramp prediction is not an artifact of finite L numerics. On a log-log plot, this shift from the numerics suggests that we capture the correct linear behavior, but with a slightly different slope for the ramp. The r 2 (t) = 1 − t/2L function which controls the slope behavior comes from the Fourier transform of the square of the sine kernel. Recall that in our approximation, we integrated over the entire semicircle. A phenomenological observation is that the modified ramp function defined byr 2 (t) ≡ 1 − 2t/πL, where we change the slope to 2/π, does a much better job of capturing the ramp behavior. Working in the short-distance limit of the 2-point correlator ρ (2) (λ 1 , λ 2 ) (as in [30]) and integrating the sine kernel over the entire semicircle, we obtaiñ r 2 whose behavior we only trust near the dip. Numerically, we find that this modified slope of 2π/L better captures the r 2 function near the dip, with error that is suppressed as we take L → ∞. The same numerics are reported in Fig. 9, but with the modified ramp behavior. There is still some discrepancy near the plateau time when we transition to the constant plateau value, but the ramp behaviors near the dip are in much better agreement. Relative Error Figure 9: The same numerics as reported in Fig. 8, but now compared to the analytic expression with the modified ramp behaviorr 2 (t).
We understand the Bessel function contribution to R 2 (t), which arises from 1-point func-tions. The subtlety above is really in the connected piece of the 2-point function R 2 (t) conn ≡ R 2 (t) − L 2 r 2 1 (t) .
Numerically, we see that the connected 2-point form factor for the GUE exhibits three different behaviors: an early time quadratic growth, an intermediate linear growth, and then a late-time constant plateau. The closed form expression we derived in Sec. 2 should be viewed as a coarse approximation before the plateau, approximately capturing the linear regime. The modified ramp functionr 2 (t) = 1−2t/πL appears to capture the linear behavior near the dip with the correct slope. In [55], a more detailed treatment of the connected correlator is given at early times. From the integral representation of the connected 2-point form factor, they find that to leading order in L (Eq.(2.28) in [55]). The three behaviors are compared with numerics in Fig. 10.   In summary, the three regimes of the connected 2-point form factor are roughly captured by The early time quadratic behavior does not play an important role in our analysis of GUE correlation functions and frame potentials, but is of independent physical interest. This intruiging early-time behavior of the connected 2-point form factor will be explored in [63]. At finite temperature we find good agreement between the expression R 2 (t, β) and numerics at early and late times, but again see a deviation of the dip and ramp behaviors from the analytic prediction, as shown in Fig. 11. Using the modified rampr 2 we find closer agreement at small β, but as we increase β the predicted ramp behavior again starts to deviate from the numerics, indicating that there is a β-dependence to the slope that we do not fully understand. But as we discussed in App. C.1, we only trust the short-distance approximation at finite temperature, and thus R 2 (t, β), for small β. We also report numerics for the R 4 expression in Fig. 12.

D.2 Frame potentials and numerics
As the frame potential depends on the eigenvectors of the elements in the ensemble (and not just the eigenvalues as per the form factors) and requires a double sum over the ensemble, numerical simulation of the frame potential is harder than for the form factors. For an ensemble of L × L matrices, we need to consider sample sizes greater than L 2k for the k-th frame potential, which amounts to summing over many samples for fairly modest Hilbert space dimension. Instead, for a given L, we can sequentially increase the sample size and extrapolate to large |E GUE |. In Fig. 13 we consider the first frame potential for the GUE at L = 32 and, in the limit of large sample size, find good agreement with the analytic expression computed from R 2 . Alternatively, we can numerically compute the frame potentials by ignoring the coincident contributions to the double sum in F (k) , i.e. when U = V . For a finite number of samples, these terms contribute L 2 /|E| to the sum, meaning we must look at large ensembles before their contribution does not dominate entirely. Ignoring these terms, we can time average over a sliding window to compute the frame potential with only a few samples, as shown in Fig. 13.
GUE . On the right, we time bin average F (1) GUE as described above and, for L = 32 and 100 samples, we find good agreement with the quantities on the left.

D.3 Minimal realizations and time averaging
Given an ensemble of disordered systems, one can ask whether a quantity averaged over the ensemble is the same as for a single random instance of the ensemble. It is known that up until the dip time, the spectral form factor is self-averaging, meaning that single instance captures the average for large L [68]. However, the spectral form factor is not self-averaging at late times. We can try to extract the averaged behavior from a single instance in regimes dominated by large fluctuations by averaging over a moving time window. In Fig. 14, we see that for a single instance of the GUE, the time average of the spectral form factor at finite β gives the same result as the ensemble average for sufficiently large L. For the frame potential, we can consider two instances, the smallest ensemble for which the frame potential makes sense. Ignoring the coincident terms in the sum, we see that the frame potential is also self-averaging at early times and that the time average at late times agrees with the ensemble average and analytic expression.