Chaos and complexity by design

We study the relationship between quantum chaos and pseudorandomness by developing probes of unitary design. A natural probe of randomness is the "frame potential," which is minimized by unitary $k$-designs and measures the $2$-norm distance between the Haar random unitary ensemble and another ensemble. A natural probe of quantum chaos is out-of-time-order (OTO) four-point correlation functions. We show that the norm squared of a generalization of out-of-time-order $2k$-point correlators is proportional to the $k$th frame potential, providing a quantitative connection between chaos and pseudorandomness. Additionally, we prove that these $2k$-point correlators for Pauli operators completely determine the $k$-fold channel of an ensemble of unitary operators. Finally, we use a counting argument to obtain a lower bound on the quantum circuit complexity in terms of the frame potential. This provides a direct link between chaos, complexity, and randomness.


Introduction
Random unitary operators have often been used to approximate chaotic dynamics. Notably, in the context of black holes, Hayden and Preskill used a model of random dynamics to show that such systems can be efficient information scramblers; initially localized information will quickly thermalize and become thoroughly mixed across the entire system [1]. It was later conjectured [2] and then proven [3,4] that black holes are the fastest scramblers in nature, suggesting their dynamics must have much in common with random unitary evolution. Such "scrambling" is a byproduct of strongly-coupled chaotic dynamics [5-7], and so the work of [1] suggests there should be a strong quantitative connection between such chaos and pseudorandomness.
The connection between pseudorandomness and chaotic dynamics can also be understood at the level of operators. For example, consider $W$, a local operator of low weight (e.g. a Pauli operator acting on a single spin). With a chaotic Hamiltonian $H$, the operator $W(t) = e^{iHt} W e^{-iHt}$ will be a complicated nonlocal operator that has an expansion as a sum of products of many local operators, with an exponential number of terms each with a pseudorandom coefficient [13]. We can gain intuition for this by considering the Baker-Campbell-Hausdorff expansion of $W(t)$,
$$W(t) = \sum_{j=0}^{\infty} \frac{(it)^j}{j!} \underbrace{[H, \ldots [H}_{j}, W] \ldots ].$$
If $H$ is $q$-local and sufficiently "generic" to contain all possible $q$-local interactions, then the $j$th term will consist of a sum of roughly $\sim (n/q)^{qj}$ terms, each of weight ranging from $1$ to $\sim j(q-1)$, where we assume the system consists of $n$ spins so that the Hilbert space is of dimension $d = 2^n$:
• At roughly $j \sim n/(q-1)$, there will be many terms in the sum of weight $n$. These terms are delocalized over the entire system. For a system without spatial locality, the relationship between time $t$ and when the $j$th term becomes $O(1)$ is roughly $t \sim \log j$. The timescale $t \sim O(\log n)$ for the operator to cover the entire system is indicative of fast-scrambling behavior.
• At around $j \sim 2n / (q \log(n/q))$, the total number of terms will reach $2^{2n}$, equal to the total number of orthogonal linear operators acting on the Hilbert space. Even after the operator covers the entire system, it continues to spread over the unitary group (though possibly only until a time roughly $O(\log n)$ plus a constant).
Furthermore, the coefficient of any given term will be incredibly complicated, depending on the details of the path through the interaction graph and the time. Over time, $W(t)$ should cover the entire unitary group (possibly quotiented by a group set by the symmetries of the Hamiltonian). At sufficiently large $t$, one might even suspect that for many purposes $W(t)$ can be approximated by a random operator $\tilde{W} \equiv U^\dagger W U$, with $U$ sampled randomly from the unitary group. If this is true, then we would say that $W(t)$ behaves pseudorandomly.
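The operator growth described above can be watched numerically on a very small system. The following is only a sketch under stated assumptions: the 3-qubit chain, its couplings, and all function names (`toy_hamiltonian`, `weight_distribution`, and so on) are hypothetical choices made for illustration and do not come from the paper. It expands $W(t)$ in the Pauli basis and tracks how much of the operator sits at each weight.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices.
PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(labels):
    """Tensor product of single-qubit Paulis, e.g. "ZII"."""
    out = np.eye(1, dtype=complex)
    for c in labels:
        out = np.kron(out, PAULI[c])
    return out

def toy_hamiltonian(n, J=1.0, hx=1.05, hz=0.5):
    """Hypothetical 2-local chain (illustrative couplings, not from the paper)."""
    H = np.zeros((2**n, 2**n), dtype=complex)
    for s in range(n - 1):
        H += J * pauli_string("I" * s + "ZZ" + "I" * (n - s - 2))
    for s in range(n):
        H += hx * pauli_string("I" * s + "X" + "I" * (n - s - 1))
        H += hz * pauli_string("I" * s + "Z" + "I" * (n - s - 1))
    return H

def heisenberg(W, H, t):
    """W(t) = e^{iHt} W e^{-iHt}, via exact diagonalization of H."""
    evals, evecs = np.linalg.eigh(H)
    U = evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T  # e^{-iHt}
    return U.conj().T @ W @ U

def weight_distribution(W, n):
    """Sum of |Pauli coefficient|^2 of W, grouped by Pauli weight 0..n."""
    d = 2**n
    dist = np.zeros(n + 1)
    for labels in itertools.product("IXYZ", repeat=n):
        coeff = np.trace(pauli_string(labels).conj().T @ W) / d
        dist[sum(c != "I" for c in labels)] += abs(coeff) ** 2
    return dist

def mean_weight(W, n):
    return float(weight_distribution(W, n) @ np.arange(n + 1))

n = 3
H = toy_hamiltonian(n)
W0 = pauli_string("ZII")  # weight-1 operator on the first spin
weights = {t: mean_weight(heisenberg(W0, H, t), n) for t in (0.0, 1.0, 2.0, 3.0)}
```

Because $W(t)$ stays unitary, the squared coefficients sum to one at every time, so `weight_distribution` can be read as a probability distribution over operator weights; its mean starts at exactly 1 and drifts upward as higher-weight strings are populated.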

Chaos
This pattern of growth of $W(t)$ can be measured by a second local operator $V$. For example, the group commutator of $W(t)$ with $V$, given by $W(t)^\dagger V^\dagger W(t) V$, measures the effect of the small perturbation $V$ on a later measurement of $W$. In other words, it is a measure of the butterfly effect and the strength of chaos:
• If $W(t)$ is of low weight and few terms, then $W(t)$ and $V$ approximately commute, $[W(t), V] \approx 0$, and the operator $W(t)^\dagger V^\dagger W(t) V$ is close to the identity.
• If instead the dynamics are strongly chaotic, $W(t)$ will grow to eventually have a large commutator with all other local operators in the system (in fact, just about all other operators), and so $W(t)^\dagger V^\dagger W(t) V$ will be nearly random and have a small expectation in most states.
Thus, the decay of out-of-time-order (OTO) four-point functions of the form
$$\langle W(t)^\dagger V^\dagger W(t) V \rangle = \langle U(t)^\dagger W^\dagger U(t)\, V^\dagger\, U(t)^\dagger W U(t)\, V \rangle \qquad (2)$$
can act as a simple diagnostic of quantum chaos, where $U(t) = e^{-iHt}$ is the unitary time evolution operator, and the correlator is usually evaluated on the thermal state $\langle \cdot \rangle \equiv \mathrm{tr}\{e^{-\beta H}\, \cdot\,\} / \mathrm{tr}\, e^{-\beta H}$ [14,6,3,15,4]. For further discussion, please see a selection (but by no means a complete set) of recent work on out-of-time-order four-point functions and chaos [3,16,13,17,15,18,4,7,19-31]. For sufficiently chaotic systems and sufficiently large times, the correlators in Eq. (2) will reach a floor value equivalent to the substitution $W(t) \to U^\dagger W U$ with $U$ chosen randomly. Furthermore, it can be shown [7] that the decay of the correlators in Eq. (2) implies the sort of information-theoretic scrambling studied by Hayden and Preskill in [1]. This explains why the random dynamics model of [1] was such a good approximation for strongly-chaotic systems, such as black holes. However, are the out-of-time-order four-point functions of Eq. (2) actually a sufficient diagnostic of chaos?
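As a rough numerical illustration of this diagnostic, the sketch below evaluates the four-point function $\langle W(t)^\dagger V^\dagger W(t) V \rangle$ at infinite temperature ($\beta = 0$, so the state is $I/d$) for a small spin chain. The Hamiltonian, its couplings, the choice of $W$ and $V$, and the helper names are all assumptions made for this example rather than anything specified in the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def site_op(op, site, n):
    """Single-site operator `op` acting on `site` of an n-qubit chain."""
    out = np.eye(1, dtype=complex)
    for s in range(n):
        out = np.kron(out, op if s == site else I2)
    return out

def toy_hamiltonian(n, J=1.0, hx=1.05, hz=0.5):
    """Hypothetical non-integrable chain; couplings are illustrative only."""
    H = np.zeros((2**n, 2**n), dtype=complex)
    for s in range(n - 1):
        H += J * site_op(Z, s, n) @ site_op(Z, s + 1, n)
    for s in range(n):
        H += hx * site_op(X, s, n) + hz * site_op(Z, s, n)
    return H

def otoc(W, V, H, t):
    """(1/d) tr{ W(t)^dag V^dag W(t) V } with W(t) = U(t)^dag W U(t)."""
    d = H.shape[0]
    evals, evecs = np.linalg.eigh(H)
    U = evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T  # U(t) = e^{-iHt}
    Wt = U.conj().T @ W @ U
    # For Hermitian unitary W and V (Paulis) this trace is real.
    return float(np.trace(Wt.conj().T @ V.conj().T @ Wt @ V).real / d)

n = 4
H = toy_hamiltonian(n)
W = site_op(Z, 0, n)      # perturbation at one end of the chain
V = site_op(Z, n - 1, n)  # probe at the other end
times = (0.0, 2.0, 3.0, 4.0, 5.0)
values = [otoc(W, V, H, t) for t in times]
```

At $t = 0$ the two operators act on different sites and commute, so the correlator equals 1; once $W(t)$ has spread across the chain the value drops. On a system this small the late-time value fluctuates rather than settling cleanly, so the numbers should be read qualitatively.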
In [1] the authors did not actually require the dynamics to be a uniformly random unitary operator sampled from the Haar measure on the unitary group. Instead, it would have been sufficient to sample from a simpler ensemble of operators that could reproduce only a few moments of the larger Haar ensemble. Of course, there may be other finer-grained information-theoretic properties of a system that are dependent on higher moments and would require a larger ensemble to replicate the statistics of the Haar random dynamics. If random dynamics is a valid approximation for computing some, but not all, of these finer-grained quantities, then they can represent a measure of the degree of pseudorandomness of the underlying dynamics. In this paper, we will make some progress in developing some of these finer-grained quantities and therefore connect measures of chaos to measures of randomness.

Unitary design
The extent to which an ensemble of operators behaves like the uniform distribution can be quantified by the notion of unitary $k$-designs [32-36]. A unitary $k$-design is a subset of the unitary group that replicates the statistics of at least $k$ moments of the uniform (Haar) distribution. Consider a finite-dimensional Hilbert space $\mathcal{H}^{\otimes k} = (\mathbb{C}^d)^{\otimes k}$ consisting of $k$ copies of $\mathcal{H} = \mathbb{C}^d$. Given an ensemble of unitary operators $\mathcal{E} = \{p_j, U_j\}$ acting on $\mathcal{H}$ with probabilities $p_j$, the ensemble is a unitary $k$-design if and only if averaging $(U_j^{\otimes k})^\dagger (\cdot)\, U_j^{\otimes k}$ over $\mathcal{E}$ gives the same result as averaging over the Haar measure.
One fine-grained quantity is the quantum circuit complexity of a quantum state $|\psi(t)\rangle = U(t)|\psi(0)\rangle$. Consider a simple initial state, such as the product state $|\psi(0)\rangle = |0\rangle^{\otimes n}$, undergoing chaotic time evolution. After a short thermalization time of $O(1)$, the system will evolve to an equilibrium in which local quantities will reach their thermodynamic values. Next, after the scrambling time of $O(\log n)$, the initial state $|\psi(0)\rangle$ will be forgotten: the information will be distributed in such a way that measurements of even a large number of collections of local quantities of $|\psi(t)\rangle$ will not reveal the initial state. However, even after the scrambling time, the quantum circuit complexity of the time-evolved state $|\psi(t)\rangle$, as quantified by the number of elementary quantum gates necessary to reach it from the initial product state, will continue to evolve. In fact, it is expected to keep growing linearly in time until it saturates at a time exponential in the system size, $e^{O(n)}$ [50,51].
We hope this presents an intuitive picture that chaos, pseudorandomness, and quantum circuit complexity should be related. To that end, having first established a connection between higher-point out-of-time-order correlators and pseudorandomness, we will next connect the randomness of an ensemble to computational complexity. Finally, we will use correlation functions to probe pseudorandomness by comparing different random averages to expectations from time evolution.
Below, we will summarize our main results, deferring the technical statements to the body and appendices.

Main results
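We will focus on a particular form of $2k$-point correlation functions evaluated on the maximally mixed state $\rho = \frac{1}{d} I$,
$$\langle A_1 U^\dagger B_1 U \cdots A_k U^\dagger B_k U \rangle := \frac{1}{d}\, \mathrm{tr}\{A_1 U^\dagger B_1 U \cdots A_k U^\dagger B_k U\}, \qquad (5)$$
where any of the $A_j$, $B_j$ may be a product of Pauli operators that each act on a single spin. (To be clear, this means that the operators we are correlating are not necessarily simple or local.) Note that each of the $B_j$ is conjugated by the same unitary $U$ (which is similar to picking all the time arguments in Eq. (4) to be either $0$ or $t$). Furthermore, $U$ will not necessarily represent Hamiltonian time evolution, and instead we will let $U$ be sampled from some ensemble. From this point forward, we will use the notation
$$\tilde{B} \equiv U^\dagger B U$$
to simplify expressions involving unitary conjugation. Therefore, we can represent the ensemble average of OTO $2k$-point correlation functions as
$$\langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle_{\mathcal{E}} \equiv \int_{\mathcal{E}} dU\, \langle A_1 U^\dagger B_1 U \cdots A_k U^\dagger B_k U \rangle,$$
where the integral is with respect to the probability distribution in an ensemble of unitary operators $\mathcal{E} = \{p_j, U_j\}$. Finally, the $k$-fold channel over the ensemble $\mathcal{E}$ is
$$\Phi^{(k)}_{\mathcal{E}}(\cdot) \equiv \int_{\mathcal{E}} dU\, (U^{\otimes k})^\dagger (\cdot)\, U^{\otimes k},$$
which is a superoperator.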

Chaos and k-designs
First, we will prove a theorem stating that a particular set of $2k$-point OTO correlators, averaged over an ensemble $\mathcal{E}$, is in a one-to-one correspondence with the $k$-fold channel $\Phi^{(k)}_{\mathcal{E}}$, and we provide a simple formula to convert from one to the other. Such an explicit relation between OTO correlators and the $k$-fold channel may have practical and experimental applications such as statistical testing (e.g. a quantum analog of the $\chi^2$-test) and randomized benchmarking [40]. Next, we prove that generic "smallness" of $2k$-point OTO correlators implies that the ensemble $\mathcal{E}$ is close to a $k$-design. We will make this statement precise by relating OTO correlators to a useful quantity known as the frame potential
$$F^{(k)}_{\mathcal{E}} = \int_{U, V \in \mathcal{E}} dU\, dV\, \left| \mathrm{tr}\{U^\dagger V\} \right|^{2k}.$$
This quantity, first introduced in [45], measures the (2-norm) distance between $\Phi^{(k)}_{\mathcal{E}}(\cdot)$ and $\Phi^{(k)}_{\text{Haar}}(\cdot)$, and has been shown to be minimized if and only if the ensemble $\mathcal{E}$ is a $k$-design. We will derive the following formula:
Average of |2k-point OTO correlator|^2 ∝ kth frame potential $F^{(k)}$,
which shows that the $2k$-point OTO correlators $\langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle_{\mathcal{E}}$ are measures of whether an ensemble $\mathcal{E}$ is a unitary $k$-design. Thus, the decay of OTO correlators can be used to quantify an increase in pseudorandomness.
(Because $U$ is sampled from an ensemble rather than generated by Hamiltonian evolution, these correlators are not really out-of-time-order, since there may not be a notion of time. Instead, they are probably more accurately called out-of-complexity-order (OCO), since we might generalize the notion of time ordering to complexity ordering, where we put unitaries of smaller complexity to the right of unitaries of larger complexity. In order to (hopefully) avoid confusion, we will continue to call the $2k$-point functions in Eq. (5) out-of-time-order, despite there not necessarily being a notion of time.)
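The proportionality between squared OTO correlators and the frame potential can be checked directly in the simplest case, $k = 1$. The sketch below assumes the standard definition $F^{(k)}_{\mathcal{E}} = \sum_{i,j} p_i p_j |\mathrm{tr}\{U_i^\dagger U_j\}|^{2k}$ for a discrete ensemble, and uses a proportionality constant of $d^{-4}$ for $k = 1$ that was worked out by hand for this example (the text above only states proportionality). The ensemble of five seeded random unitaries and all function names are hypothetical.

```python
import itertools
import numpy as np

SINGLE = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_strings(n):
    """All 4^n n-qubit Pauli operators (without global phases)."""
    out = []
    for combo in itertools.product(SINGLE, repeat=n):
        P = np.eye(1, dtype=complex)
        for s in combo:
            P = np.kron(P, s)
        out.append(P)
    return out

def random_unitary(d, rng):
    """Some unitary from a QR decomposition; exact Haar sampling is not needed here."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(g)
    return q

def frame_potential(unitaries, probs, k):
    """F^(k) = sum_{i,j} p_i p_j |tr(U_i^dag U_j)|^(2k)."""
    total = 0.0
    for p, U in zip(probs, unitaries):
        for q, V in zip(probs, unitaries):
            total += p * q * abs(np.trace(U.conj().T @ V)) ** (2 * k)
    return total

def two_point(A, B, unitaries, probs):
    """Ensemble average of (1/d) tr{A U^dag B U}."""
    d = A.shape[0]
    return sum(p * np.trace(A @ U.conj().T @ B @ U)
               for p, U in zip(probs, unitaries)) / d

n = 2
d = 2**n
rng = np.random.default_rng(0)
ensemble = [random_unitary(d, rng) for _ in range(5)]
probs = [1.0 / len(ensemble)] * len(ensemble)
paulis = pauli_strings(n)

# Average of |2-point correlator|^2 over all Pauli choices of A and B.
lhs = sum(abs(two_point(A, B, ensemble, probs)) ** 2
          for A in paulis for B in paulis) / d**4
rhs = frame_potential(ensemble, probs, 1) / d**4
```

Here `lhs` and `rhs` agree to numerical precision. Since a Haar-random ensemble attains the minimal value $F^{(1)} = 1$, any finite ensemble like this one has a larger frame potential and correspondingly larger averaged correlators.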

Chaos, randomness, and complexity
We prove a lower bound on the quantum circuit complexity needed to generate an ensemble of unitary operators $\mathcal{E}$,
$$\mathcal{C}_{\text{gate}} \geq \frac{2kn \log 2 - \log F^{(k)}_{\mathcal{E}}}{\log(\text{choices})}. \qquad (12)$$
This bound is actually given by a rather simple counting argument. The denominator should be thought of as (the log of) the number of choices made at each step in generating the circuit. For instance, if we have a set $G$ of cardinality $g$ of $q$-qubit quantum gates available, and at each step randomly select $q$ qubits out of $n$ total and select one of the gates in $G$ to apply, then we would make $g \binom{n}{q}$ choices at each step. Recalling our result relating OTO correlators and the frame potential, this result implies that generic smallness of OTO correlators leads to higher quantum circuit complexity. This is a direct and quantitative link between chaos and complexity.
However, we caution the reader that in many cases Eq. (12) may not be a very tight lower bound. We will provide some discussion of this point as well as a few examples; however, further work is most likely required to better understand the utility of this bound.
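To get a feel for the numbers, the counting bound can be evaluated for a given gate set. This sketch assumes the bound has the form $\mathcal{C} \geq (2kn \log 2 - \log F^{(k)}) / \log(\text{choices})$ with $\text{choices} = g \binom{n}{q}$ as described above; the function name and the example parameters are hypothetical. It also uses the known Haar value $F^{(k)}_{\text{Haar}} = k!$ (valid for $k \le d$) as the input frame potential of an exact $k$-design.

```python
import math

def complexity_lower_bound(frame_potential, k, n, g, q):
    """Counting bound on the number of gates, assuming the form
    (2 k n log 2 - log F^(k)) / log(g * C(n, q))."""
    choices = g * math.comb(n, q)  # gate choices available at each step
    return (2 * k * n * math.log(2) - math.log(frame_potential)) / math.log(choices)

n, k = 10, 2   # ten qubits, second moment (illustrative values)
g, q = 3, 2    # three available two-qubit gates (illustrative values)

# An exact k-design has the Haar value F^(k) = k!.
bound_design = complexity_lower_bound(math.factorial(k), k, n, g, q)

# A single fixed unitary has F^(k) = d^(2k), so the bound becomes trivial.
bound_trivial = complexity_lower_bound(2.0 ** (2 * k * n), k, n, g, q)
```

The second call shows the sanity limit of the formula: an ensemble with one element needs no random choices, and the bound correctly collapses to zero. The first call gives a bound that grows only linearly in $kn$, which is consistent with the caution above that the bound may be far from tight.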

Haar vs. simpler ensemble averages
Finally, we present calculations of the Haar average of some higher-point OTO correlators and compare them to averages in simpler ensembles. These results suggest that the floor value of OTO correlators of local operators might be good diagnostics of pseudorandomness.
For 4-point OTO correlators, we find
$$\langle A \tilde{B} C \tilde{D} \rangle_{\text{Haar}} = \langle AC \rangle \langle B \rangle \langle D \rangle + \langle A \rangle \langle C \rangle \langle BD \rangle - \langle A \rangle \langle C \rangle \langle B \rangle \langle D \rangle - \frac{1}{d^2 - 1} \langle AC \rangle_c \langle BD \rangle_c, \qquad (13)$$
where $\langle AC \rangle_c = \langle AC \rangle - \langle A \rangle \langle C \rangle$ represents a connected correlator and $d = 2^n$ is the total number of states. In contrast, if the $U$ are averaged over the Pauli operators (which form a 1-design but not a 2-design), we find a different result, Eq. (14). We will present intuitive explanations of this difference from the viewpoint of local thermal dissipation vs. global thermalization or scrambling. For 8-point OTO correlators with Pauli operators $A, B, C, D$, we compute averages over the unitary group and the Clifford group (which forms a 3-design on qubits but not a 4-design), Eq. (15). This suggests that forming a higher $k$-design leads to a lower value of the correlator. The results Eq. (13)-(15) are exact for any choice of operators. (The extended results for the correlators in Eq. (15) are presented in Appendix D.1.) However, for a particular ordering of OTO $4m$-point functions where we average over choices of operators, we will also show that a Haar average over the unitary group scales as $\sim d^{-2m}$. This result hints that these correlators continue to be probes of increasing pseudorandomness.

Organization of the paper
For convenience of the reader, we included a(n almost) self-contained introduction to Haar random unitary operators and unitary design in §2. In §3, we establish a formal connection between chaos and unitary design by proving the theorems mentioned above. In §4, we connect complexity to unitary design by proving the complexity lower bound (12). In §5, we include the explicit calculations of 2-point and 4-point functions averaged over different ensembles and discuss how these averages relate to expectations from time evolution with chaotic Hamiltonians. We also discuss results and expectations for higher-point functions.
We conclude in §6 with an extended discussion of these results, their relevance for physical situations, and outline some future research directions.
Despite page counting to the contrary, this is actually a short paper. A knowledgeable reader may learn our results by simply reading §3, §4 and §5. On the other hand, a large number of extended calculations and digressions are relegated to the Appendices:
• In §A, we collect some proofs that we felt interrupted the flow of the main discussion.
• In §B, we discuss the number of nearly orthogonal states in large-dimensional Hilbert spaces.
• In §C, we extend our complexity lower bound to minimum circuit depth by considering gates that may be applied in parallel. We also derive a bound on early-time complexity for evolution with an ensemble of Hamiltonians.
• In §D, we hide the details of our 8-point function Haar averages and also derive the $\sim d^{-2m}$ scaling of Haar averages of certain $4m$-point functions.
• In §E, we provide a generalization of the frame potential that equals an average of the square of OTO correlators for arbitrary states rather than just the maximally mixed state.
• Finally, in §F we prove some extended results relating to our earlier work [7] that are somewhat outside the main focus of the current paper.

Measures of Haar
The goal of this section is to provide a review of the theory of Haar random unitary operators and unitary design in a self-contained manner. The presentation of this section owes a lot to a recent paper by Webb [43] as well as a course note by Kitaev [52].

Haar random unitaries
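Schur-Weyl duality states that an operator $A$ acting on $\mathcal{H}^{\otimes k}$ commutes with $V^{\otimes k}$ for all unitaries $V$ on $\mathcal{H}$ if and only if $A$ is a linear combination of the permutation operators $W_\pi$, $\pi \in S_k$. When an operator $A$ is a linear combination of permutation operators, it is clear that $A$ commutes with $V^{\otimes k}$ (⇐). The difficult part is to prove the converse (⇒), which relies on von Neumann's double commutant theorem.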

Pauli operators
Pauli operators for $\mathbb{C}^d$ (i.e. $d$-state spins or qudits) are defined by
$$X |j\rangle = |j+1\rangle, \qquad Z |j\rangle = \omega^j |j\rangle, \qquad (20)$$
where $\omega \equiv e^{2\pi i/d}$ and $j + 1$ is taken modulo $d$. We note that Eq. (20) implies $ZX = \omega XZ$ and $X^d = Z^d = I$, and that for $d > 2$, the Pauli operators are unitary and traceless, but not Hermitian. The Pauli group is $\tilde{\mathcal{P}} = \langle \tilde{\omega} I, X, Z \rangle$, where $\tilde{\omega} = \omega$ for odd $d$, and $\tilde{\omega} = e^{i\pi/d}$ for even $d$. Since we are usually uninterested in global phases, we will consider the quotient of the group
$$\mathcal{P} = \tilde{\mathcal{P}} / \langle \tilde{\omega} I \rangle.$$
There are $d^2$ (representative) Pauli operators in $\mathcal{P}$. When the Hilbert space is built up from the space of $n$ qubits, we will denote the Pauli group by $\mathcal{P}_n$. For such systems, (the representatives of) $\mathcal{P}_n$ consist of tensor products of qubit Pauli operators, such as $X \otimes Y \otimes I \otimes Z \otimes \cdots$, without any global phases.
The Pauli operators provide a basis for the space of linear operators acting on the Hilbert space. They are orthogonal, $\mathrm{tr}\{P_i^\dagger P_j\} = d\, \delta_{ij}$ for $P_i, P_j \in \mathcal{P}$, and therefore we can expand any operator $A$ acting on $\mathcal{H}$ as
$$A = \sum_j a_j P_j, \qquad a_j = \frac{1}{d}\, \mathrm{tr}\{P_j^\dagger A\}.$$
With this property, the cyclic permutation operator $W_{\text{cyc}}$ on $\mathcal{H}^{\otimes k}$ can be decomposed as
$$W_{\text{cyc}} = \frac{1}{d^{k-1}} \sum_{P_1, \ldots, P_{k-1}} P_1 \otimes P_2 \otimes \cdots \otimes P_{k-1} \otimes Q^\dagger, \qquad Q = P_1 P_2 \cdots P_{k-1},$$
where the sum is over $k - 1$ copies of $\mathcal{P}$.
The case of $k = 2$ is particularly important, giving an operator that swaps two subsystems. Explicitly, we have $\text{SWAP} = \frac{1}{d} \sum_P P \otimes P^\dagger$, or graphically (up to a multiplicative factor):
[Figure: diagrammatic form of the SWAP decomposition, Eq. (24).]
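The Pauli-basis identities in this subsection are easy to confirm numerically for qubits. The sketch below checks orthogonality, the operator expansion, the SWAP decomposition, and basic properties of the $k = 3$ cyclic permutation built from the decomposition formula. The helper names and the random test operator are illustrative assumptions; for the $k = 3$ case the prefactor $1/d^{k-1}$ is assumed.

```python
import itertools
import numpy as np

SINGLE = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_strings(n):
    """All 4^n n-qubit Pauli operators (without global phases)."""
    out = []
    for combo in itertools.product(SINGLE, repeat=n):
        P = np.eye(1, dtype=complex)
        for s in combo:
            P = np.kron(P, s)
        out.append(P)
    return out

def swap_operator(d):
    """SWAP on C^d x C^d: |i, j> -> |j, i>."""
    S = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            S[j * d + i, i * d + j] = 1.0
    return S

n = 2
d = 2**n
paulis = pauli_strings(n)

# Orthogonality: tr{P_i^dag P_j} = d delta_ij.
gram = np.array([[np.trace(P.conj().T @ Q) for Q in paulis] for P in paulis])

# Expansion of an arbitrary operator, a_j = tr{P_j^dag A} / d.
rng = np.random.default_rng(1)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
coeffs = [np.trace(P.conj().T @ A) / d for P in paulis]
A_rebuilt = sum(a * P for a, P in zip(coeffs, paulis))

# SWAP = (1/d) sum_P P x P^dag.
swap_from_paulis = sum(np.kron(P, P.conj().T) for P in paulis) / d

# k = 3 cyclic permutation on three qubits (single-qubit d = 2).
one_qubit = pauli_strings(1)
w_cyc = sum(np.kron(np.kron(P1, P2), (P1 @ P2).conj().T)
            for P1 in one_qubit for P2 in one_qubit) / 2**2
```

The resulting `w_cyc` is a permutation matrix that cubes to the identity and has trace $d$, as expected of a single 3-cycle acting on three copies of the space.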

Here and in what follows, a dotted line represents an average over all Pauli operators. For example, let us consider the Pauli channel
$$\frac{1}{d^2} \sum_{P \in \mathcal{P}} P^\dagger A P = \frac{1}{d}\, \mathrm{tr}\{A\}\, I,$$
where $A$ is any operator on the system. This equation can be derived graphically by applying Eq. (24).
[Figure: graphical derivation of the Pauli channel identity.]
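A quick numerical check of the Pauli channel, assuming it takes the form $\frac{1}{d^2}\sum_P P^\dagger A P = \frac{1}{d}\mathrm{tr}\{A\}\, I$, is sketched below for two qubits. The function names and the random input operator are illustrative only.

```python
import itertools
import numpy as np

SINGLE = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_strings(n):
    """All 4^n n-qubit Pauli operators (without global phases)."""
    out = []
    for combo in itertools.product(SINGLE, repeat=n):
        P = np.eye(1, dtype=complex)
        for s in combo:
            P = np.kron(P, s)
        out.append(P)
    return out

def pauli_channel(A, paulis):
    """Uniform average of P^dag A P over the d^2 Pauli operators."""
    return sum(P.conj().T @ A @ P for P in paulis) / len(paulis)

n = 2
d = 2**n
rng = np.random.default_rng(2)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

twirled = pauli_channel(A, pauli_strings(n))
expected = np.trace(A) / d * np.eye(d)
```

The average wipes out every traceless component of `A`, which is the sense in which the Pauli operators already form a 1-design.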

k-fold channel
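Let $A$ be an operator acting on $\mathcal{H}^{\otimes k}$. The $k$-fold channel of $A$ with respect to the unitary group is defined as
$$\Phi^{(k)}_{\text{Haar}}(A) \equiv \int_{\text{Haar}} dU\, (U^{\otimes k})^\dagger A\, U^{\otimes k},$$
where the integral is taken over the Haar measure. Note that this is sometimes referred to as the $k$-fold twirl of $A$. The Haar measure is the unique probability measure on the unitary group that is both left-invariant and right-invariant [54],
$$\int_{\text{Haar}} dU = 1, \qquad \int_{\text{Haar}} dU\, f(VU) = \int_{\text{Haar}} dU\, f(UV) = \int_{\text{Haar}} dU\, f(U), \qquad (28)$$
for all $V \in U(\mathcal{H})$, where $f$ is an arbitrary function. If we take $f(U) = (U^{\otimes k})^\dagger A\, U^{\otimes k}$, then we can show that the twirl of $A$ is invariant under $k$-fold unitary conjugation, and that the twirl of the $k$-fold unitary conjugation of $A$ equals the twirl of $A$,
$$(V^{\otimes k})^\dagger\, \Phi^{(k)}_{\text{Haar}}(A)\, V^{\otimes k} = \Phi^{(k)}_{\text{Haar}}(A), \qquad (29)$$
$$\Phi^{(k)}_{\text{Haar}}\big( (V^{\otimes k})^\dagger A\, V^{\otimes k} \big) = \Phi^{(k)}_{\text{Haar}}(A), \qquad (30)$$
where for Eq. (29) we used the right-invariance property, and for Eq. (30) we used left-invariance.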

Weingarten function
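The content of Eq. (29) is that $\Phi^{(k)}_{\rm Haar}(A)$ commutes with all operators $V^{\otimes k}$. By the Schur-Weyl duality, it is therefore a linear combination of permutation operators $W_\pi$:

$$\Phi^{(k)}_{\rm Haar}(A) = \sum_{\pi \in S_k} W_\pi\, u_\pi(A).$$

Here, $S_k$ is the permutation group, and $u_\pi(A)$ is some linear function of $A$. Since $u_\pi(A)$ is a linear function, it can be written as $u_\pi(A) = \mathrm{tr}(C_\pi A)$ for some operators $C_\pi$. From Eq. (30), we find that $C_\pi$ commutes with all operators $V^{\otimes k}$. Then again, by the Schur-Weyl duality, we have

$$\Phi^{(k)}_{\rm Haar}(A) = \sum_{\pi, \sigma \in S_k} c_{\pi,\sigma}\, W_\pi\, \mathrm{tr}(W_\sigma A).$$

The coefficients $c_{\pi,\sigma}$ are called the Weingarten matrix [55]. Since $\Phi^{(k)}_{\rm Haar}(W_\lambda) = W_\lambda$, we have

$$\delta_{\pi,\lambda} = \sum_{\sigma} c_{\pi,\sigma}\, Q_{\sigma,\lambda}, \qquad Q_{\sigma,\lambda} \equiv \mathrm{tr}(W_\sigma W_\lambda).$$

So finally, we find

$$\Phi^{(k)}_{\rm Haar}(A) = \sum_{\pi, \sigma \in S_k} (Q^{-1})_{\pi,\sigma}\, W_\pi\, \mathrm{tr}(W_\sigma A).$$

Here, we assumed the existence of the inverse $Q^{-1}$, which is guaranteed for $k \le d$.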

Examples
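For $k = 1$, $Q_{I,I} = d$, so one has

$$\Phi^{(1)}_{\rm Haar}(A) = \frac{1}{d}\,\mathrm{tr}(A)\, I.$$

For $k = 2$,

$$Q = \begin{pmatrix} d^2 & d \\ d & d^2 \end{pmatrix}, \qquad Q^{-1} = \frac{1}{d^2-1}\begin{pmatrix} 1 & -1/d \\ -1/d & 1 \end{pmatrix},$$

so one has

$$\Phi^{(2)}_{\rm Haar}(A) = \frac{1}{d^2-1}\Big( I\,\mathrm{tr}(A) + S\,\mathrm{tr}(SA) - \frac{1}{d}\, S\,\mathrm{tr}(A) - \frac{1}{d}\, I\,\mathrm{tr}(SA) \Big),$$

where $S$ is the SWAP operator.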
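As a quick numerical companion to the $k = 2$ example, the following minimal sketch inverts the $2 \times 2$ Gram matrix $Q$ exactly and compares it with the closed form $\frac{1}{d^2-1}\begin{pmatrix} 1 & -1/d \\ -1/d & 1 \end{pmatrix}$. It assumes the ordering (identity, SWAP) for rows and columns and the standard fact $\mathrm{tr}(W_\pi W_\sigma) = d^{\#\text{cycles}(\pi\sigma)}$. The function names are illustrative only.

```python
from fractions import Fraction


def weingarten_k2(d):
    """Exact inverse of the k = 2 Gram matrix Q, ordered as (identity, SWAP)."""
    # Q[pi][sigma] = tr(W_pi W_sigma): d^2 when pi*sigma is the identity, d when it is SWAP.
    q = [[Fraction(d * d), Fraction(d)],
         [Fraction(d), Fraction(d * d)]]
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    # Standard closed-form inverse of a 2x2 matrix.
    return [[q[1][1] / det, -q[0][1] / det],
            [-q[1][0] / det, q[0][0] / det]]


def closed_form(d):
    """The matrix (1/(d^2 - 1)) [[1, -1/d], [-1/d, 1]]."""
    pref = Fraction(1, d * d - 1)
    return [[pref, -pref / d], [-pref / d, pref]]


if __name__ == "__main__":
    for d in (2, 3, 4, 8):
        assert weingarten_k2(d) == closed_form(d)
    print(weingarten_k2(2))
```

The diagonal entry $1/(d^2-1)$ is the coefficient that multiplies the leading permutation term in the 2-fold Haar average.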

Haar random states
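We can also consider the k-fold average of a Haar random state. Define a random state by $|\psi\rangle = U|0\rangle$, where $U$ is sampled uniformly from the unitary group. Then, we have

$$\int_{\rm Haar} d\psi\, (|\psi\rangle\langle\psi|)^{\otimes k} = \sum_{\pi \in S_k} c_\pi W_\pi$$

for some coefficients $c_\pi$. Since $\int_{\rm Haar} d\psi\, (|\psi\rangle\langle\psi|)^{\otimes k}$ commutes with $V^{\otimes k}$, we again decomposed it with permutation operators. Furthermore, one has

$$W_\sigma \int_{\rm Haar} d\psi\, (|\psi\rangle\langle\psi|)^{\otimes k} = \int_{\rm Haar} d\psi\, (|\psi\rangle\langle\psi|)^{\otimes k}$$

for all $\sigma \in S_k$, which implies $c_\pi = c$ for all $\pi$. By taking the trace in Eq. (39), we have

$$1 = c \sum_{\pi \in S_k} \mathrm{tr}(W_\pi).$$

Defining the projector onto the symmetric subspace as $\Pi_{\rm sym} = \frac{1}{k!}\sum_{\pi \in S_k} W_\pi$, we obtain

$$\int_{\rm Haar} d\psi\, (|\psi\rangle\langle\psi|)^{\otimes k} = \frac{\Pi_{\rm sym}}{\mathrm{tr}(\Pi_{\rm sym})} = \frac{\Pi_{\rm sym}}{\binom{k+d-1}{k}}.$$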
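The normalization constant $c$ can be checked by brute force for small $k$. The sketch below assumes the standard fact that $\mathrm{tr}(W_\pi) = d^{\#\text{cycles}(\pi)}$ and verifies that $\sum_{\pi \in S_k} \mathrm{tr}(W_\pi) = k!\binom{k+d-1}{k}$, i.e. $k!$ times the dimension of the symmetric subspace. The helper names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple with i -> perm[i]."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles


def sum_of_permutation_traces(d, k):
    """Sum over pi in S_k of tr(W_pi), using tr(W_pi) = d ** cycles(pi)."""
    return sum(d ** cycle_count(p) for p in permutations(range(k)))


def symmetric_dimension(d, k):
    """Dimension of the symmetric subspace of k copies of a d-dimensional space."""
    return comb(d + k - 1, k)


if __name__ == "__main__":
    for d in (2, 3, 5):
        for k in (1, 2, 3, 4):
            assert sum_of_permutation_traces(d, k) == factorial(k) * symmetric_dimension(d, k)
    print(sum_of_permutation_traces(2, 3), symmetric_dimension(2, 3))
```

With this identity, $c = 1/\sum_\pi \mathrm{tr}(W_\pi)$ is equivalent to normalizing $\Pi_{\rm sym}$ by its trace.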

Frame potential
In this paper, we will be particularly interested in the following quantity Using Eq. (35), we can rewrite this

Unitary design
Consider an ensemble of unitary operators $E = \{p_j, U_j\}$, where the $p_j$ are probabilities such that $\sum_j p_j = 1$, and the $U_j$ are unitary operators. The action of the k-fold channel with respect to the ensemble $E$ is given by

$$\Phi^{(k)}_{E}(A) = \sum_j p_j\, (U_j^{\otimes k})^\dagger A\, U_j^{\otimes k},$$

or for continuous distributions

$$\Phi^{(k)}_{E}(A) = \int_{E} dU\, (U^{\otimes k})^\dagger A\, U^{\otimes k}.$$

The ensemble $E$ is a unitary k-design if and only if $\Phi^{(k)}_{E}(A) = \Phi^{(k)}_{\rm Haar}(A)$ for all $A$. Intuitively, a unitary k-design is as random as the Haar ensemble up to the kth moment. That is, a unitary k-design is an ensemble which satisfies the definition of the Haar measure in Eq. (28) when $f(U)$ contains up to kth powers of $U$ and $U^\dagger$ (i.e. balanced monomials of degree at most k). By this definition, if an ensemble is a k-design, then it is also a (k − 1)-design. However, the converse is not true in general.
It is convenient to write the above definition of a k-design in terms of Pauli operators. An ensemble $E$ is a k-design if and only if $\Phi^{(k)}_{E}(P) = \Phi^{(k)}_{\rm Haar}(P)$ for all Pauli operators $P \in (P)^{\otimes k}$, since the Pauli operators form a basis of the operator space. Furthermore, for an arbitrary ensemble $E$,

$$\Phi^{(k)}_{E}\big(\Phi^{(k)}_{\rm Haar}(P)\big) = \Phi^{(k)}_{\rm Haar}\big(\Phi^{(k)}_{E}(P)\big) = \Phi^{(k)}_{\rm Haar}(P)$$

due to the left/right-invariance of the Haar measure. By using Eq. (47), we can derive the following useful criterion for k-designs [43]:

$E$ is a k-design ⇔ $\Phi^{(k)}_{E}(P)$ is a linear combination of $W_\pi$ for all $P \in (P)^{\otimes k}$.
To make use of this, we will look at some illustrative examples.
Pauli is a 1-design

The Pauli operators form a unitary 1-design. In fact, we have already shown this with Eq. (25): an average over Pauli operators gives

$$\frac{1}{d^2}\sum_{P} P^\dagger \rho P = \frac{1}{d}\, I$$

for all $\rho$. For example, if $d = 2$ and $A = X$ (the Pauli X operator), then we have

$$\frac{1}{4}\big(IXI + XXX + YXY + ZXZ\big) = \frac{1}{4}\big(X + X - X - X\big) = 0,$$

which is consistent with tr{X} = 0. Thus, an average over Pauli operators is equivalent to taking a trace. Since Pauli operators can be written in a tensor product form, $P_1 \otimes P_2 \otimes \ldots \otimes P_n$ for a system of n qubits, they do not create entanglement between different qubits and do not scramble. Instead, they can only mix quantum information locally. This implies some kind of relationship between 1-designs and local thermalization that we will revisit in the discussion §6.
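The single-qubit statement can be checked directly. The following minimal sketch (plain Python lists as $2 \times 2$ matrices, with hypothetical helper names) computes the Pauli average $\frac{1}{4}\sum_P P^\dagger A P$ and compares it with $\frac{1}{2}\mathrm{tr}(A)\, I$, which is the $k = 1$ Haar average for $d = 2$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]


def pauli_twirl(a):
    """(1/d^2) * sum_P P^dagger A P for a single qubit (d = 2)."""
    total = [[0, 0], [0, 0]]
    for p in PAULIS:
        term = matmul(matmul(dagger(p), a), p)
        total = [[total[i][j] + term[i][j] / 4 for j in range(2)] for i in range(2)]
    return total


def close(a, b, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    assert close(pauli_twirl(X), [[0, 0], [0, 0]])   # tr(X) = 0
    a = [[1, 2], [3, 5]]                             # tr(a) = 6
    assert close(pauli_twirl(a), [[3, 0], [0, 3]])   # (tr(a) / 2) * I
    print(pauli_twirl(a))
```

The second check uses a generic (non-Hermitian) matrix to emphasize that the twirl only retains the trace.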

Clifford is a 2-design
The Clifford operators form a unitary 2-design. The Clifford group $C_n$ is a group of unitary operators acting on a system of n qubits that transform a Pauli operator into another Pauli operator:

$$C^\dagger P\, C = Q, \qquad P, Q \in P_n, \quad C \in C_n.$$

Clearly, Pauli operators are Clifford operators, since $P^\dagger Q P = e^{i\theta} Q$ for any pair of Pauli operators $P, Q$: Pauli operators transform a Pauli operator to itself up to a global phase. However, the non-trivial Clifford operators are those which transform a Pauli operator into a different Pauli operator. An example of such an operator is the Control-Z gate,

$$\mathrm{CZ}\,|a, b\rangle = (-1)^{ab}\,|a, b\rangle,$$

where $a, b \in \{0, 1\}$ and arithmetic is modulo 2. The conjugation of a Pauli operator with a Control-Z gate is as follows:

$$X \otimes I \to X \otimes Z, \qquad I \otimes X \to Z \otimes X, \qquad Z \otimes I \to Z \otimes I, \qquad I \otimes Z \to I \otimes Z.$$

Let us prove that the Clifford group is a 2-design by using Eq. (48) [43]. For qubit Pauli operators of the form $P \otimes P$ ($P \neq I$), the action of the Clifford 2-fold channel is

$$\Phi^{(2)}_{\rm Clifford}(P \otimes P) = \frac{1}{d^2 - 1} \sum_{Q \neq I} Q \otimes Q,$$

because a random Clifford operator will transform $P$ into some other non-identity Pauli operator. Recalling the definition of the swap operator, $\mathrm{SWAP} = \frac{1}{d}\sum_{P \in P_n} P \otimes P$, the RHS is a linear combination of $I \otimes I$ and SWAP. On the other hand, for other Pauli operators $P \otimes Q$ with $P \neq Q$, the action of the channel is

$$\Phi^{(2)}_{\rm Clifford}(P \otimes Q) = 0.$$

This can be seen by rewriting the sum as

$$\frac{1}{|C_n|}\sum_{C \in C_n} C^\dagger P C \otimes C^\dagger Q C = \frac{1}{2|C_n|}\sum_{C \in C_n} \Big( C^\dagger P C \otimes C^\dagger Q C + C^\dagger R^\dagger P R\, C \otimes C^\dagger R^\dagger Q R\, C \Big),$$

since the Clifford group is invariant under element-wise multiplication by a Pauli operator $R$. Since we assumed the Pauli operators are different, we have $PQ \neq I$, and thus we can pick $R$ such that it anti-commutes with $PQ$. This implies $R^\dagger P R \otimes R^\dagger Q R = -P \otimes Q$, and therefore the two terms cancel. Finally, $I \otimes I$ remains invariant. Thus, since the action of the channel on Pauli operators gives a linear combination of permutation operators, by Eq. (48) the Clifford group is a unitary 2-design.

Note that Clifford operators do not have a tensor product form, in general. This means that unlike evolution restricted to Pauli operators, they can change the size of an operator, as seen in the Control-Z gate example Eq. (53). In other words, they can grow local operators into global operators, indicative of the butterfly effect.
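The operator growth caused by the Control-Z gate can be verified with explicit $4 \times 4$ matrices. This is a small sketch assuming the standard convention $\mathrm{CZ} = \mathrm{diag}(1, 1, 1, -1)$ in the computational basis. The helper names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
# Assumed convention: CZ |a, b> = (-1)^(a*b) |a, b>.
CZ = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, -1]]


def conjugate_by_cz(op):
    """CZ^dagger * op * CZ; CZ is real, diagonal and its own inverse."""
    return matmul(matmul(CZ, op), CZ)


if __name__ == "__main__":
    assert conjugate_by_cz(kron(X, I2)) == kron(X, Z)   # weight 1 -> weight 2
    assert conjugate_by_cz(kron(I2, X)) == kron(Z, X)   # weight 1 -> weight 2
    assert conjugate_by_cz(kron(Z, I2)) == kron(Z, I2)  # diagonal operators are unchanged
    print("CZ conjugation rules verified")
```

The first two checks show a single-site Pauli operator becoming a two-site operator, which is the size growth referred to in the text.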
In fact, Clifford operators can prepare a large class of interesting quantum states called the stabilizer states. Let $|\psi_0\rangle = |0\rangle^{\otimes n}$ be an initial product state. This state satisfies $Z_j|\psi_0\rangle = |\psi_0\rangle$. Let $U$ be an arbitrary Clifford operator and consider $|\psi\rangle = U|\psi_0\rangle$. This state $|\psi\rangle$ satisfies the following:

$$S_j|\psi\rangle = |\psi\rangle, \qquad S_j = U Z_j U^\dagger.$$

By definition, the $S_j$ are Pauli operators and commute with each other. A quantum state that can be represented by a set of commuting Pauli operators $S_j$ is called a stabilizer state. Examples of stabilizer states include ground states of the toric code and the perfect tensors used in the construction of holographic quantum error-correcting codes [56]. The upshot is that Clifford operators can create global entanglement and can scramble quantum information. We will return to this point again in the discussion §6.

? is a higher-design
Currently there is no known method of constructing an ensemble which forms an exact k-design for k ≥ 4 in a way which generalizes to large d. Instead, there are several constructions for preparing approximate k-designs in an efficient manner [41,48].

Measures of chaos and design
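In this section, we show that 2k-point OTO correlators are probes of unitary k-designs. We will focus on a Hilbert space $\mathcal{H} = \mathbb{C}^d$ with 2k-point correlators of the following form:

$$\langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle \equiv \frac{1}{d}\,\mathrm{tr}\{A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k\}, \qquad \tilde{B}_j = U^\dagger B_j U. \qquad (58)$$

We can think of this as a correlator evaluated in a maximally mixed or infinite temperature state $\rho = \frac{1}{d} I$. The trace can be rewritten as

$$\langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle = \frac{1}{d}\,\mathrm{tr}\{(A_1 \otimes \cdots \otimes A_k)(U^\dagger \otimes \cdots \otimes U^\dagger)(B_1 \otimes \cdots \otimes B_k)(U \otimes \cdots \otimes U)\, W_{\pi_{\rm cyc}}\} \qquad (59)$$

by considering an enlarged Hilbert space $\mathcal{H}^{\otimes k}$ that consists of k copies of the original Hilbert space $\mathcal{H}$, where $W_{\pi_{\rm cyc}}$ represents a cyclic permutation operator on $\mathcal{H}^{\otimes k}$. The action of $W_{\rm cyc}$ is to send the jth Hilbert space to the (j + 1)th Hilbert space (modulo k); see Fig. 1 for a graphical representation. 11 Observe that Eq. (59) contains a k-fold unitary action, suggesting that correlators of the form Eq. (58) have the potential to be sensitive to whether an ensemble is or is not a k-design. Unitary k-designs concern the k-fold channel of an ensemble of unitary operators,

$$\Phi^{(k)}_{E}(\cdot) = \int_{E} dU\, (U^{\otimes k})^\dagger (\cdot)\, U^{\otimes k},$$

where $E$ is an ensemble of unitary operators.

Figure 1: Schematic form of the 2k-point OTO correlation functions Eq. (58), interpreted as a correlation function on the enlarged k-copied system. The part of the diagram surrounded by the dotted line is $\Phi_E(B_1 \otimes \cdots \otimes B_k)$.

To further this point, let us consider an average of these correlators over an ensemble of unitary operators $E$:

$$\langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle_{E} \equiv \int_{E} dU\, \langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle.$$

Looking back at Eq. (59), the idea is that the operators $A_1, \ldots, A_k$ probe the outcome of the k-fold channel $\Phi_E(B_1 \otimes \cdots \otimes B_k)$. Indeed, the part of Fig. 1 surrounded by a dotted line is $\Phi_E(B_1 \otimes \cdots \otimes B_k)$. Below, we will make this intuition precise by proving that this set of OTO correlators Eq. (58) completely determines the k-fold channel $\Phi^{(k)}_E$.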

Chaos and k-designs
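In this subsection, we prove that 2k-point OTO correlators completely determine the k-fold channel of an ensemble $E$, denoted by

$$\Phi^{(k)}_{E}(\rho) = \int_{E} dU\, (U^{\otimes k})^\dagger \rho\, U^{\otimes k},$$

where $\rho$ is defined over k copies of the system, $\mathcal{H}^{\otimes k}$. The map is linear, completely positive and trace-preserving (CPTP), i.e. a quantum channel. For simplicity of discussion, we assume that the system $\mathcal{H}$ is made of n qubits so that $\mathcal{H} = \mathbb{C}^d$ with $d = 2^n$. The input density matrix $\rho$ can be expanded in Pauli operators,

$$\rho = \sum_{B_1, \ldots, B_k} \beta_{B_1, \ldots, B_k}\, B_1 \otimes \cdots \otimes B_k, \qquad \text{where} \qquad \beta_{B_1, \ldots, B_k} = \frac{1}{d^k}\,\mathrm{tr}\{(B_1^\dagger \otimes \cdots \otimes B_k^\dagger)\,\rho\}.$$

The output density matrix is given by

$$\Phi^{(k)}_{E}(\rho) = \sum_{B_1, \ldots, B_k} \beta_{B_1, \ldots, B_k}\, \Phi_E(B_1 \otimes \cdots \otimes B_k).$$

For given Pauli operators $B_1, \ldots, B_k$, we would like to examine $\Phi_E(B_1 \otimes \cdots \otimes B_k)$. Let us fix the Pauli operators $B_1, \ldots, B_k$ for the rest of the argument. Note that the output $\Phi_E(B_1 \otimes \cdots \otimes B_k)$ can also be expanded in Pauli operators:

$$\Phi_E(B_1 \otimes \cdots \otimes B_k) = \sum_{C_1, \ldots, C_k} \gamma_{C_1, \ldots, C_k}\, C_1 \otimes \cdots \otimes C_k. \qquad (65)$$

Since we have fixed $B_1 \otimes \cdots \otimes B_k$, for notational simplicity we have not included the $B_1, \ldots, B_k$ indices on the tensor $\gamma$. In order to characterize the k-fold channel, we need to know the values of $\gamma_{C_1, \ldots, C_k}$ for a given $B_1, \ldots, B_k$. We would like to show that we can determine the values of $\gamma_{C_1, \ldots, C_k}$ by knowing a certain collection of OTO correlators. Consider a 2k-point OTO correlator labeled by the set of $A$ operators, averaged over an ensemble $E$:

$$\alpha_{A_1, \ldots, A_k} \equiv \langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle_{E}, \qquad (66)$$

where as always $\tilde{B}_j = U^\dagger B_j U$ and $A_1, \ldots, A_k$ are Pauli operators. As before, for simplicity of notation we have not included the $B_1, \ldots, B_k$ indices on $\alpha$. Now that the notation is set up, the main question is whether one can determine the coefficients $\gamma_{C_1, \ldots, C_k}$ from the numbers $\alpha_{A_1, \ldots, A_k}$. Substituting Eq. (65) into Eq. (66), we see

$$\alpha_{A_1, \ldots, A_k} = M_{A_1, \ldots, A_k}^{C_1, \ldots, C_k}\, \gamma_{C_1, \ldots, C_k}, \qquad M_{A_1, \ldots, A_k}^{C_1, \ldots, C_k} \equiv \langle A_1 C_1 \cdots A_k C_k \rangle,$$

where tensor contractions are implicit following the Einstein summation convention. This shows that we can compute the OTO correlators $\alpha_{A_1, \ldots, A_k}$ from the coefficients $\gamma_{C_1, \ldots, C_k}$ defining the k-fold channel. To establish the converse, we must prove that the tensor $M$ is invertible.

Here $\delta_{PQ}$ is the delta function for Pauli operators $P, Q$. The proof of this theorem is somewhat technical and has been relegated to Appendix A.1. Thus, from the OTO correlators $\alpha_{C_1, \ldots, C_k}$, we can completely determine the k-fold channel. As an obvious corollary, this means that 2k-point OTO correlators can measure whether or not an ensemble forms a k-design.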

Frame potentials
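In this section we introduce the frame potential, a single quantity that can measure whether an ensemble is a k-design. Furthermore, we show how the frame potential may be computed from OTO correlators. Given an ensemble of unitary operators $E$, the kth frame potential is defined by the following double sum [45]:

$$F^{(k)}_{E} \equiv \frac{1}{|E|^2} \sum_{U, V \in E} \big|\mathrm{tr}\{U^\dagger V\}\big|^{2k},$$

where $|E|$ denotes the cardinality of $E$. Denote the frame potential for the Haar ensemble as $F^{(k)}_{\rm Haar}$. Then, the following theorem holds.

Theorem 3. For any ensemble $E$ of unitary operators,

$$F^{(k)}_{E} \ge F^{(k)}_{\rm Haar},$$

with equality if and only if $E$ is a k-design.

The proof of this theorem, which we reprint from [45], is insightful and elegant.

Proof. Let

$$S = \int_{E} dU\, (U^{\otimes k})^\dagger \otimes U^{\otimes k} - \int_{\rm Haar} dU\, (U^{\otimes k})^\dagger \otimes U^{\otimes k}.$$

Then, using the invariance of the Haar measure to evaluate the cross terms,

$$0 \le \mathrm{tr}\{S^\dagger S\} = F^{(k)}_{E} - 2F^{(k)}_{\rm Haar} + F^{(k)}_{\rm Haar} = F^{(k)}_{E} - F^{(k)}_{\rm Haar},$$

with equality if and only if $E$ is a k-design.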
Note that we derived the minimal value of the frame potential in §2, which holds for k ≤ d. The frame potential quantifies the 2-norm distance between the Haar ensemble and the k-fold E-channel. 12 Here we show that the frame potential can be expressed as a certain average of OTO correlation functions.

Theorem 4. For any ensemble $E$ of unitary operators,

$$\frac{1}{d^{4k}} \sum_{A_1, \ldots, A_k, B_1, \ldots, B_k} \big| \langle A_1 \tilde{B}_1 \cdots A_k \tilde{B}_k \rangle_{E} \big|^2 = \frac{1}{d^{2(k+1)}}\, F^{(k)}_{E},$$

where summations are over all possible Pauli operators.

The LHS of the equation is the operator average of the 2-norm of OTO correlators, and the RHS is the kth frame potential up to a constant factor. There are $d^{4k}$ choices of Pauli operators $A_1, \ldots, A_k, B_1, \ldots, B_k$, which leads to the factor $1/d^{4k}$. The theorem implies that the quantitative effect of random unitary evolution is to decrease the frame potential, which is equivalent to the decay of OTO correlators.
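To make the frame potential concrete, the following sketch evaluates it for the single-qubit Pauli ensemble. It assumes the standard definition $F^{(k)}_E = \frac{1}{|E|^2}\sum_{U,V \in E} |\mathrm{tr}\{U^\dagger V\}|^{2k}$ for a uniformly weighted finite ensemble and the Haar value $k!$ quoted in the text. The helper names are hypothetical.

```python
from math import factorial


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def frame_potential(ensemble, k):
    """F^(k) = (1/|E|^2) * sum_{U,V} |tr(U^dagger V)|^(2k) for a uniform finite ensemble."""
    total = 0.0
    for u in ensemble:
        for v in ensemble:
            total += abs(trace(matmul(dagger(u), v))) ** (2 * k)
    return total / len(ensemble) ** 2


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, frame_potential(PAULIS, k), factorial(k))
```

For this ensemble only the diagonal pairs contribute, so $F^{(k)} = 4^{k-1}$. The value at $k = 1$ equals the Haar value $1! = 1$ (the Paulis form a 1-design), while at $k = 2$ it is $4 > 2! = 2$, so the Paulis alone are not a 2-design.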
Proof. We take the averages over $A_1, \cdots, A_k, B_1, \cdots, B_k$ first. Expanding the LHS gives

$$\frac{1}{d^{4k}} \sum_{A_1, \ldots, B_k} \frac{1}{d^2} \int_{E} dU \int_{E} dV\, \mathrm{tr}\{A_1 U^\dagger B_1 U \cdots A_k U^\dagger B_k U\}\; \mathrm{tr}\{V^\dagger B_k^\dagger V A_k^\dagger \cdots V^\dagger B_1^\dagger V A_1^\dagger\}.$$

For k = 2, this can be depicted graphically.

[Figure: graphical depiction of the expanded LHS for k = 2.]

Recall that a SWAP operator is given by $\mathrm{SWAP} = \frac{1}{d}\sum_P P \otimes P^\dagger$.
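Thus, we replace each average of Pauli operators $A_1, \cdots, A_k, B_1, \cdots, B_k$ by SWAP operators. There are 2k loops, where k of them contribute to $\mathrm{tr}\{U V^\dagger\}$ and the remaining k loops contribute to $\mathrm{tr}\{V U^\dagger\}$. Keeping track of the number of factors of d, we find

$$\frac{1}{d^{2(k+1)}} \int_{E} dU \int_{E} dV\, \big|\mathrm{tr}\{U V^\dagger\}\big|^{2k} = \frac{1}{d^{2(k+1)}}\, F^{(k)}_{E},$$

which is the desired result.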
Lastly, we note that we can use the frame potential to lower bound the size of the ensemble. Since all the terms in the sum are positive, by taking the diagonal part of the double sum, we find

$$F^{(k)}_{E} \ge \frac{1}{|E|^2} \sum_{U \in E} \big|\mathrm{tr}\{U^\dagger U\}\big|^{2k} = \frac{d^{2k}}{|E|}, \qquad \text{i.e.} \qquad |E| \ge \frac{d^{2k}}{F^{(k)}_{E}}.$$

Large k

In Appendix B, we provided an intuitive way of counting the number of nearly orthogonal states in a high-dimensional vector space $\mathbb{C}^d$ by introducing a precision tolerance $\epsilon$. A similar argument holds for the number of operators; if we can only distinguish operators up to some precision $\epsilon$, then the total number of unitary operators is given by the volume of the unitary group measured in balls of size $\epsilon$, which roughly goes like $\sim \epsilon^{-2^{2n}}$.

The key point is that if we have a tolerance $\epsilon$, then the number of operators is finite even for a continuous ensemble. This means that there is a maximum size to any ensemble that is a subset of the unitary group. Looking at our bound Eq. (82), we see that for $k \sim d$ we begin to reach the maximum size for some $\epsilon$.

As a corollary, this means that for large $k \gg d$, being a k-design implies that the ensemble is an approximate (k + 1)-design. In this sense, there are really only O(d) nontrivial moments of the Haar ensemble.
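As an explicit example, we will consider the case of d = 2. Here, for Haar we can compute the frame potential exactly for all k [45],

$$F^{(k)}_{\rm Haar} = \frac{(2k)!}{k!\,(k+1)!},$$

and thus we have the following relationship:

$$F^{(k+1)}_{\rm Haar} = \Big(4 - \frac{6}{k+2}\Big) F^{(k)}_{\rm Haar}.$$

On the other hand, for any ensemble the frame potential satisfies

$$F^{(k+1)}_{E} \le d^2\, F^{(k)}_{E} = 4\, F^{(k)}_{E}.$$

Therefore, for d = 2 we see that if an ensemble is a k-design, for $k \gg 2$ it will automatically be close to being a (k + 1)-design.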

Ensemble vs. time averages
In this subsection, we consider the time average of the frame potential to ask whether the ensemble specified by sampling U (t) = e −iHt at different times t will ever form a k-design.
In the classical statistical mechanics and chaos literature, this is the question of whether a system is ergodic, i.e. whether the average over phase space equals an average over the time evolution of some initial state. Consider the one-parameter ensemble of unitary matrices defined by evolving with a fixed time-independent Hamiltonian H,

$$E_t = \{ e^{-iHt} \}_{t \ge 0}.$$

We can compute the frame potential as

$$F^{(k)}_{E_t} = \lim_{T \to \infty} \frac{1}{T^2} \int_0^T dt_1 \int_0^T dt_2\, \big|\mathrm{tr}\{e^{-iH(t_1 - t_2)}\}\big|^{2k} = \lim_{T \to \infty} \frac{1}{T^2} \int_0^T dt_1 \int_0^T dt_2 \sum_{i_1, \ldots, i_k, j_1, \ldots, j_k} e^{-i(t_1 - t_2)(E_{i_1} + \cdots + E_{i_k} - E_{j_1} - \cdots - E_{j_k})}.$$

If the spectrum is chaotic/generic, then the energy levels are all incommensurate. The terms in the sum evaluate to zero unless every $E_{i_\ell}$ is paired with some $E_{j_m}$, i.e. unless the two sets of energies coincide for one of the possible pairings. To leading order in d (for $k \ll d$), there are $k!\, d^k$ such terms, so $F^{(k)}_{E_t} \approx k!\, d^k$. This is larger than $F^{(k)}_{\rm Haar} = k!$ by a factor of roughly $d^k$.

This "half-randomness" can be understood in the following way. Let us write the Hamiltonian as

$$H = \sum_j E_j\, |\psi_j\rangle\langle\psi_j|.$$

Rotating H by a unitary operator does not affect the frame potential, so we can consider a classical Hamiltonian

$$H' = \sum_j E_j\, |j\rangle\langle j|$$

with the same frame potential. Even though $H'$ is classical, it has the ability to generate entanglement. Namely, if the initial state is $|+\rangle \propto \sum_j |j\rangle$, then time evolution will create entanglement. In terms of the original Hamiltonian H, we see that the system keeps evolving in a non-trivial manner as long as the initial state is "generic" and is different from the eigenstates of the Hamiltonian H. This argument suggests that the frame potential, in a time-average sense, can only see the distribution of the spectrum.

This fact, that Hamiltonian time evolution can never lead to random unitaries, can also be understood in terms of the distribution of the spacing of the eigenvalues. 14 The phases of the Haar random unitary operators have Wigner-Dyson statistics; they are repulsed from being degenerate. The eigenvalues of a typical Hamiltonian H also have this property. However, these eigenvalues live on the real line, while the phases of $e^{-iHt}$ live on the circle. In mapping the line to the circle, the eigenvalues of H wrap many times. This means that the differences of neighboring phases of $e^{-iHt}$ will not be repulsed; $e^{-iHt}$ will have Poisson statistics!
In this sense, the ensemble formed by sampling e −iHt over time may never become Haar random. 15 The calculation in the beginning of this subsection provides an explicit calculation of the 2-norm distance between such an ensemble and k-designs and quantifies the degree to which the ensemble average is "too random" as compared to the time average.

Measures of complexity
For an operator or state, the computational complexity is a measure of the minimum number of basic elementary gates necessary to generate the operator or the state from a simple reference operator (e.g. the identity operator) or reference state (e.g. the product state). With the assumption that the elementary gates are easy to apply, the complexity is a measure of how difficult it is to create the state or operator.
On the other hand, an ensemble contains many different operators with different weights or probabilities. In that case, the computational complexity of the ensemble should be understood as the number of steps it takes to generate the ensemble by probabilistic applications of elementary gates. For instance, to generate the ensemble of Pauli operators, we randomly choose with probability 1/4 a Pauli operator I, X, Y, Z to apply to a qubit, and then repeat this procedure for all the qubits.
The complexity of an ensemble is related to the complexity of an operator in the following way. If an ensemble can be prepared in C steps, then all the operators in the ensemble can be generated by applications of at most C elementary gates. On the other hand, if an ensemble cannot be prepared (or approximated) in C steps, then-for the sorts of ensembles we are interested in-most of the operators cannot be generated by applications of C elementary gates. For example, generating the Haar ensemble will take exponential complexity since, on average, individual elements have exponential complexity.
The complexity of the ensemble can be lower bounded in terms of the number of elements or cardinality of the ensemble |E|. If all the elements are represented equally (with uniform probabilities), then clearly at least |E| circuits need to be generated from probabilistic applications of the elementary gates. Making use of a fact introduced in the previous section, that $F^{(k)}_E$ provides a lower bound on |E|, here we show that the frame potential provides a lower bound on the circuit complexity of generating E. We will also explain how this bound applies to ensembles that depend continuously on some parameters and thus have a divergent number of elements.
Two additional bounds that are somewhat outside the scope of the main presentation, one on circuit depth and one on the early-time complexity growth with a disordered ensemble of Hamiltonians, are relegated to Appendix C.

Discrete ensembles
Consider a system of n qubits. Let G denote an elementary gate set that consists of a finite number of two-qubit quantum gates. We denote the number of elementary two-qubit gates by g := |G|. At each time step we assume that we can implement any of the gates from G. One typically chooses G so that the gates in G enable universal quantum computation. A well-known example is where T is the π/4 phase shift operator, $T = \mathrm{diag}(1, e^{i\pi/4})$. (Of course, this is not the only choice of elementary gate set.) Our goal is to generate an ensemble of unitary operators E by sequentially implementing quantum gates from G. Let us denote the necessary number of steps (i.e. the circuit complexity) to generate E by C(E). Then one has the following complexity lower bound.
Theorem 5. Let g be the number of distinct two-qubit gates from the elementary gate set. Then the circuit complexity C(E) to generate an ensemble E is lower bounded by

$$C(E) \ge \frac{\log |E|}{\log\big(g \binom{n}{2}\big)}.$$

The proof relies on an elementary counting argument. Arguments along this line of thought have been used commonly in the literature.

Proof. At each step, we randomly pick a pair of qubits. Since there are g implementable quantum gates and $\binom{n}{2}$ qubit pairs, there are in total $g\binom{n}{2}$ choices at each step. If this procedure runs for C steps, the number of unique circuits this procedure can implement is upper bounded by

$$\#\text{ of circuits} \le \Big(g \binom{n}{2}\Big)^{C}.$$

Since there are |E| unitary operators in an ensemble E, we must have

$$\Big(g \binom{n}{2}\Big)^{C} \ge |E|,$$

which implies

$$C \ge \frac{\log |E|}{\log\big(g \binom{n}{2}\big)}.$$

In Appendix B, we provide an intuitive way of counting the number of states in a $d = 2^n$-dimensional Hilbert space (which has long been well known; see [50]). As a sanity check on Theorem 5, if we substitute $|E| \sim 2^{2^n}$ into Eq. (92), we see that the complexity of most states is exponential in the number of qubits, $C > 2^n \log 2$ up to the logarithmic denominator $\log\big(g\binom{n}{2}\big)$.
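The counting argument can be turned into a small calculator. This is a minimal sketch assuming the bound $C \ge \log|E| / \log\big(g\binom{n}{2}\big)$ that follows from $\big(g\binom{n}{2}\big)^C \ge |E|$. The function name and the example gate-set size g = 3 are hypothetical choices for illustration.

```python
from math import comb, log


def counting_lower_bound(log_ensemble_size, g, n):
    """Lower bound on the number of steps C from (g * C(n, 2))**C >= |E|.

    log_ensemble_size is the natural logarithm of |E|, passed as a log so that
    doubly exponential ensemble sizes do not overflow.
    """
    choices = g * comb(n, 2)          # gates times qubit pairs available per step
    return log_ensemble_size / log(choices)


if __name__ == "__main__":
    g = 3                              # hypothetical gate-set size
    for n in (4, 8, 12, 16):
        log_size = (2 ** n) * log(2)   # |E| ~ 2^(2^n), the sanity check in the text
        print(n, counting_lower_bound(log_size, g, n))
```

The printed bound grows essentially like $2^n$, since the denominator only grows logarithmically in n.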
Finally, let us examine the relation between the frame potential and the circuit complexity. Using Eq. (81),

$$C(E) \ge \frac{2kn \log 2 - \log F^{(k)}_{E}}{\log\big(g \binom{n}{2}\big)}.$$

Of course, this bound depends on the choice of basic elements g and the fact that we are using two-qubit gates. If we had considered q-body gates, we would have found a denominator $\log\big(g\binom{n}{q}\big) \approx \log(g n^q)$ for $n \gg q$. This is no more than a choice of "units" with which to measure complexity. Thus, we state our key result as:

Theorem 6. For an ensemble E with the kth frame potential $F^{(k)}_E$, the circuit complexity is lower bounded by

$$C(E) \ge \frac{2kn \log 2 - \log F^{(k)}_{E}}{\log(\text{choices})}.$$

In this context, log(choices) simply indicates the logarithm of the number of decisions that are made at each step. If we imagine we have some kind of decision tree for determining which gate to apply, where we make a binary decision at each step (and use $\log_2$), then we may set the denominator to unity and measure complexity in bits rather than gates, i.e. $C(E) \ge 2kn - \log_2 F^{(k)}_E$.

In the above discussion, we glossed over a subtlety. Here, we considered the quantum circuit complexity to prepare or approximate an entire ensemble E. A closely related but different question concerns the quantum circuit complexity required to implement a typical unitary operator from the ensemble E. Nevertheless, in ordinary settings the typical operator complexity and the ensemble complexity are roughly of the same order. While establishing a rigorous result in this direction is beyond the scope of this paper, see [50] for some basic proof techniques that are useful in establishing this connection.
An important consequence of Theorem 6 is that the smallness of the frame potential (i.e. generic smallness of OTO correlators) implies an increase in the quantum circuit complexity of generating the ensemble E. As a corollary, we can simply rewrite Eq. (97) in terms of OTO correlators as

$$C(E) \ge -\frac{2n \log 2 + \log\big(\text{average of } |2k\text{-point OTO}|^2\big)}{\log(\text{choices})}. \qquad (98)$$

In this sense, we see how the decay of OTO correlators is directly related to an increase in the (lower bound of the) complexity. Next, recall that the frame potential for a k-design is given by $F^{(k)}_{\rm Haar} = k!$, which does not grow with n. The complexity of a k-design is thus lower bounded as

$$C(k\text{-design}) \ge \frac{2kn \log 2 - \log k!}{\log(\text{choices})} \approx \frac{2kn \log 2 - k \log k}{\log(\text{choices})},$$

which for large n grows roughly linearly in k and n (at least). In §C.1, we show that the minimum circuit depth to make a k-design also grows linearly in k (at least).

Finally, we offer an additional information-theoretic interpretation for our lower bound and generalize it for ensembles with non-uniform probability distributions. Consider an ensemble $E = \{p_j, U_j\}$ with probability distribution $\{p_j\}$ such that $\sum_j p_j = 1$. The second Rényi entropy of the distribution $\{p_j\}$ is defined as $S^{(2)} = -\log \sum_j p_j^2$. In this more general situation, we can still bound the frame potential by considering the diagonal part of the sum:

$$F^{(k)}_{E} = \sum_{i,j} p_i p_j \big|\mathrm{tr}\{U_i^\dagger U_j\}\big|^{2k} \ge \sum_j p_j^2\, d^{2k} = d^{2k} e^{-S^{(2)}}.$$

Since the von Neumann entropy $S^{(1)} = -\sum_j p_j \log(p_j)$ is always greater than or equal to the second Rényi entropy, $S^{(1)} \ge S^{(2)}$, we can bound the von Neumann entropy as

$$S^{(1)} \ge S^{(2)} \ge 2kn \log 2 - \log F^{(k)}_{E}.$$

The entropy of the ensemble is a notion of complexity measured in bits. 16

Footnote 16: This is not to be confused with the entanglement entropy. The entropy of the ensemble is essentially the logarithm of the number of different operators, and therefore can be exponential in the size of the system. By contrast, the entanglement entropy (as a measure of entanglement) can only be as large as (half) the size of the system.
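The bounds of this subsection are easy to evaluate numerically. The sketch below assumes the form $C(E) \ge (2kn\log 2 - \log F^{(k)}_E)/\log(\text{choices})$ with $\text{choices} = g\binom{n}{2}$, uses $F^{(k)}_{\rm Haar} = k!$ for a k-design, and also checks the entropy ordering $S^{(1)} \ge S^{(2)}$ on a sample distribution. The function names, the gate-set size g = 3 and the sample probabilities are illustrative assumptions.

```python
from math import comb, factorial, log


def complexity_from_frame_potential(frame_potential, k, n, g):
    """Lower bound (2 k n log 2 - log F) / log(g * C(n, 2)) on the circuit complexity."""
    choices = g * comb(n, 2)
    return (2 * k * n * log(2) - log(frame_potential)) / log(choices)


def k_design_bound(k, n, g):
    """Same bound evaluated at the Haar value F = k! of the frame potential."""
    return complexity_from_frame_potential(factorial(k), k, n, g)


def renyi2_entropy(probs):
    """Second Renyi entropy S2 = -log(sum_j p_j^2)."""
    return -log(sum(p * p for p in probs))


def shannon_entropy(probs):
    """S1 = -sum_j p_j log p_j (natural log), skipping zero-probability entries."""
    return -sum(p * log(p) for p in probs if p > 0)


if __name__ == "__main__":
    n, g = 10, 3                        # hypothetical system size and gate-set size
    for k in (1, 2, 3, 4):
        print(k, k_design_bound(k, n, g))
    probs = [0.5, 0.25, 0.25]           # sample non-uniform ensemble weights
    print(shannon_entropy(probs), renyi2_entropy(probs))
```

For fixed n the k-design bound increases with k, and for a uniform distribution over N elements both entropies reduce to log N, recovering the cardinality bound.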

Continuous ensembles
Many interesting ensembles of unitary operators are defined by continuous parameters, e.g. a disordered system has a time evolution that may be modeled by an ensemble of Hamiltonians. 17 While the counting argument in §4.1 is not directly applicable to these systems with continuous parameters, the complexity lower bound generalizes to such systems by allowing approximations of unitary operators. To be concrete, imagine that we wish to create some unitary operator $U_0$ by combining gates from the elementary gate set. In practice, we do not need to create an exact unitary operator $U_0$. Instead, we may be fine with preparing some $U$ that faithfully approximates $U_0$ to within a trace distance $\epsilon$,

$$\|U - U_0\|_1 < \epsilon,$$

where the notation $\|\cdot\|_p \equiv (\mathrm{tr}\{|\cdot|^p\})^{1/p}$ specifies the p-norm of an operator.

Now, let us derive a complexity lower bound up to an $\epsilon$-tolerance. We begin by taking $N_s$ samples from the ensemble and use them to estimate the frame potential of the continuous distribution,

$$F^{(k)}_{E} \approx \frac{1}{N_s^2} \sum_{i,j} \big|\mathrm{tr}\{U_i^\dagger U_j\}\big|^{2k}, \qquad (103)$$

where each of the two sums runs over all $N_s$ samples. We can lower bound Eq. (103) as follows:

$$F^{(k)}_{E} \ge \frac{1}{N_s^2} \sum_{i} \sum_{j \in N_\epsilon(U_i)} \big|\mathrm{tr}\{U_i^\dagger U_j\}\big|^{2k}, \qquad (104)$$

where the sum over i runs from 1 to $N_s$. The sum over j includes only a smaller subset, the $N_\epsilon(U_i)$ operators within a trace distance $\epsilon$ of a particular $U_i$. To continue, let us bound the summand. First, note that

$$\|U_i - U_j\|_2^2 = 2d - 2\,\mathrm{Re}\,\mathrm{tr}\{U_i^\dagger U_j\}.$$

The 2-norm is upper bounded by the 1-norm as $\|O\|_2 \le \sqrt{d}\,\|O\|_1$, which lets us rewrite this as

$$\big|\mathrm{tr}\{U_i^\dagger U_j\}\big| \ge d\Big(1 - \frac{\epsilon^2}{2}\Big). \qquad (106)$$

For this formula to be sensible, this approximation requires that $\epsilon < \sqrt{2}$. Substituting Eq. (106) into Eq. (104), we can bound the frame potential:

$$F^{(k)}_{E} \ge \frac{d^{2k}}{N_s}\Big(1 - \frac{\epsilon^2}{2}\Big)^{2k} \Big[\frac{1}{N_s}\sum_i N_\epsilon(U_i)\Big] = d^{2k}\Big(1 - \frac{\epsilon^2}{2}\Big)^{2k} \frac{\bar{N}_\epsilon}{N_s}.$$

The term in brackets in the middle expression is the average number of operators within a trace distance $\epsilon$ of an operator in our sample set. In the final expression this is represented by the symbol $\bar{N}_\epsilon$. Now, let us run the counting argument again. If we want to make $N_s$ circuits exactly, then in C steps we must have

$$(\text{choices})^{C} > N_s,$$

where as before (choices) summarizes the information about our choice of gate set, etc.

Instead, if we only care about making circuits to within an $\epsilon$-accuracy, then in C steps $N_s$ instead satisfies

$$(\text{choices})^{C}\, \bar{N}_\epsilon > N_s.$$

This lets us lower bound the complexity of the ensemble at precision $\epsilon$ as

$$C_\epsilon(E) \ge \frac{\log(N_s/\bar{N}_\epsilon)}{\log(\text{choices})} \ge \frac{2kn \log 2 + 2k \log(1 - \epsilon^2/2) - \log F^{(k)}_{E}}{\log(\text{choices})}.$$

We then take the continuum limit by taking $N_s \to \infty$. The number of operators within an $\epsilon$-ball of a given sample will also diverge, but the ratio $N_s/\bar{N}_\epsilon$ should remain finite and converge to some value, roughly the volume of the ensemble as measured in balls of radius $\epsilon$. 18

Finally, in §C.2, we further extend this notion of bounding complexity for continuous ensembles to show that the early-time complexity for evolution with an ensemble of Hamiltonians grows initially as $t^2$ for times $t < 1/\log(d)$.

Measures of correlators
While much of the focus of this paper has been on the behavior of ensembles, we were originally motivated by the following question: When is a random unitary operator an appropriate approximation to the underlying dynamics? In this section, we will attempt to return the focus to this question by computing random averages over correlation functions and comparing them to expectations for chaotic time evolution in physical systems.

Haar random averages
In this subsection, we will explicitly compute some ensemble averages of OTO correlators for different choices of ensembles. A particular goal will be to understand the asymptotic behavior of these averages in the limit of a large number of degrees of freedom d → ∞. We present explicit calculations for 2-point and 4-point functions here and provide results for 6-point and 8-point functions. Additional calculations may be found in Appendix D.

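2-point functions

Consider a 2-point correlator, averaged over Haar random unitary operators:

$$\langle A \tilde{B} \rangle_{\rm Haar} \equiv \frac{1}{d} \int_{\rm Haar} dU\, \mathrm{tr}\{A\, U^\dagger B U\}.$$

Since $U$ and $U^\dagger$ each only appear in the expression once, we will obtain the same answer if the average is instead performed over a 1-design: $\langle A \tilde{B} \rangle_{\rm Haar} = \langle A \tilde{B} \rangle_{\text{1-design}}$. By using a formula from §2, we can derive the following expression:

$$\langle A \tilde{B} \rangle_{\rm Haar} = \langle A \rangle \langle B \rangle.$$

[Figure: graphical derivation of the Haar-averaged 2-point function.]

It is often convenient to consider physical observables with zero mean by shifting $A \to A - \langle A \rangle$. Then, we see that these 2-point correlation functions vanish. Of course, this always holds for Pauli operators: $\langle A \tilde{B} \rangle_{\rm Haar} = 0$ for $A, B \in P$ with $A, B \neq I$.

Next, let us consider the norm squared of a 2-point correlator averaged over Haar random unitary operators:

$$\big\langle |\langle A \tilde{B} \rangle|^2 \big\rangle_{\rm Haar} = \frac{1}{d^2} \int_{\rm Haar} dU\, \mathrm{tr}\{A\, U^\dagger B U\}\, \mathrm{tr}\{U^\dagger B^\dagger U A^\dagger\}.$$

Note that we take the Haar average after squaring the correlator. Since there are two pairs of $U$ and $U^\dagger$ appearing, we can perform the average over a 2-design: $\langle |\langle A \tilde{B} \rangle|^2 \rangle_{\rm Haar} = \langle |\langle A \tilde{B} \rangle|^2 \rangle_{\text{2-design}}$. Let us assume that $A, B$ are Pauli operators so that we can neglect contributions from $\langle A \rangle$ and $\langle B \rangle$. There are four terms, but only one term survives because the trace of non-identity Pauli operators is zero.

[Figure: graphical depiction of the surviving term.]

Here $C_I = 1/(d^2 - 1)$ comes from the Weingarten function as shown in §2. The final result is

$$\big\langle |\langle A \tilde{B} \rangle|^2 \big\rangle_{\rm Haar} = \frac{1}{d^2 - 1}.$$

Thus, the variance of the Haar averaged 2-point function is exponentially small in the number of qubits.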

4-point functions
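Next, consider a 4-point OTO correlator averaged over Haar random unitary operators:

$$\langle A \tilde{B} C \tilde{D} \rangle_{\rm Haar} \equiv \frac{1}{d} \int_{\rm Haar} dU\, \mathrm{tr}\{A\, U^\dagger B U\, C\, U^\dagger D U\}.$$

As has already been explained, we will obtain the same answer if the average is performed with a 2-design: $\langle A \tilde{B} C \tilde{D} \rangle_{\rm Haar} = \langle A \tilde{B} C \tilde{D} \rangle_{\text{2-design}}$. By using formulas from §2 we can derive the following expression 19

$$\langle A \tilde{B} C \tilde{D} \rangle_{\rm Haar} = \langle AC \rangle \langle B \rangle \langle D \rangle + \langle A \rangle \langle C \rangle \langle BD \rangle - \langle A \rangle \langle C \rangle \langle B \rangle \langle D \rangle - \frac{1}{d^2 - 1} \langle AC \rangle_c \langle BD \rangle_c,$$

where $d = 2^n$ and $\langle AC \rangle_c \equiv \langle AC \rangle - \langle A \rangle \langle C \rangle$. In particular, for Pauli operators $A, B, C, D \neq I$, one has

$$\langle A \tilde{B} C \tilde{D} \rangle_{\rm Haar} = -\frac{1}{d^2 - 1} \langle AC \rangle \langle BD \rangle.$$

When nonzero, the result is exponentially small in the number of qubits n. The derivation of the aforementioned formula can be understood graphically as follows: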

[Figure: graphical derivation of the Haar-averaged 4-point OTO correlator.]

19 Note: this formula was independently obtained by Kitaev.

By rewriting the expression in terms of connected correlators, we obtain the formula Eq. (118). Of course, we can obtain the same result by instead averaging over the Clifford group.
Since by definition Clifford operators transform a Pauli operator into another Pauli operator, $\tilde{B}$ is a random Pauli operator with $\tilde{B} \neq I$. There are $d^2 - 1$ non-identity Pauli operators. Among them, $d^2/2 - 1$ Pauli operators commute with $A$, and $d^2/2$ anti-commute with $A$. Therefore, we find

$$\langle A\tilde{B}A^\dagger\tilde{B}^\dagger\rangle_{\rm Clifford} = \frac{1}{d^2-1}\left[\left(\frac{d^2}{2}-1\right) - \frac{d^2}{2}\right] = -\frac{1}{d^2-1}.$$

As expected, the 4-point OTO correlator ensemble average over the Clifford group equals the Haar average. Recalling our result from §3 that 4-point OTO values completely determine the 2-fold channel, this explicit calculation gives an alternative proof that the Clifford group forms a unitary 2-design.

Finally, we will present one additional way of computing the Haar average of 4-point OTO correlators. For convenience, we introduce the following notation.
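The counting of commuting and anti-commuting Pauli operators used in the Clifford average above can be checked by brute force for a small number of qubits. The following is a minimal sketch, not the authors' code: Pauli operators are represented as strings over `IXYZ` (phases are ignored, since they cancel between $\tilde{B}$ and $\tilde{B}^\dagger$), and the function names `commutes` and `clifford_oto_average` are hypothetical. It assumes, as in the text, that the Clifford image of $B$ is uniformly distributed over the non-identity Pauli operators.

```python
from fractions import Fraction
from itertools import product


def commutes(p, q):
    """Two Pauli strings commute iff they clash (both non-identity and
    different) on an even number of sites."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def clifford_oto_average(a):
    """Average of <A B~ A^dag B~^dag> with B~ uniform over non-identity Paulis.

    Each term equals +1 if B~ commutes with A and -1 if it anti-commutes.
    Returns (number commuting, number anti-commuting, average).
    """
    n = len(a)
    identity = "I" * n
    n_comm = n_anti = 0
    for letters in product("IXYZ", repeat=n):
        b = "".join(letters)
        if b == identity:
            continue  # the Clifford image of a non-identity Pauli is never I
        if commutes(a, b):
            n_comm += 1
        else:
            n_anti += 1
    return n_comm, n_anti, Fraction(n_comm - n_anti, n_comm + n_anti)


# Example: n = 3 qubits, so d = 8 and d^2 - 1 = 63.
n_comm, n_anti, avg = clifford_oto_average("XZI")
print(n_comm, n_anti, avg)
```

For $n = 3$ the counts are $d^2/2 - 1 = 31$ and $d^2/2 = 32$, and the average is $-1/(d^2-1) = -1/63$, in agreement with the Haar value for non-identity Pauli operators.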
The more general expression is slightly complicated though has the same scaling. Thus, the Haar average of 6-point OTO correlators does not reach a lower floor value than the Haar average of 4-point OTO correlators.

8-point functions
Finally, we will study Haar averages of 8-point OTO correlators. In this case, there are two different types of nontrivial out-of-time ordering, which behave differently at large $d$. These computations are rather technical, and so the details are relegated to Appendix D.1. The 8-point OTO correlators of the first type can be written in the following manner For Hermitian operators, this essentially repeats $A\tilde{B}C\tilde{D}$ twice. For reasons that will subsequently become clear, we will call such OTO correlators "non-commutator types." (However, the result does depend on the commutation relations between $A, C$ and $B, D$.) For these correlators, the scaling of the Haar average with respect to $d$ is and are denoted "commutator-type" correlators. These correlators have the property that they can be written in the form i.e. they are the expectation value of the group commutator $AKA^\dagger K^\dagger$. The OTO correlators Eq. (129) cannot be written in this way. As with the non-commutator types, the exact Haar average depends on commutation relations between $A, C$ and $B, D$.

However, the scaling with respect to $d$ does not: as computed in Appendix D.1, the Haar average of the commutator-type correlators is $O(d^{-4})$. The Haar average of these correlators is much smaller than the Haar average of the non-commutator types and the 4- and 8-point Haar averages! This suggests they might be a useful statistic for distinguishing ensembles that form a 4-design and ensembles that form a 2-design but do not form a higher design.

To test this idea, we can take an average of the commutator-type 8-point OTO correlators over the Clifford group. Since we have assumed the operators $A, \ldots, D$ are Pauli operators, we find where in the first equality we commuted $C, \tilde{D}$ and $C^\dagger, \tilde{D}^\dagger$, which holds because $C, \tilde{D}$ are Pauli operators, and in the second equality we have defined $K(P, Q)$ by for Pauli operators $P, Q$. The final answer is Recall that the Clifford group is a unitary 2-design, is not a 3-design in general (except for a system of qubits) [43], and is never a 4-design. Therefore, we see that commutator-type correlators may provide a statistical test of whether an ensemble forms a $k$-design but not a $(k+1)$-design. We explore this idea further in §D.2.

Dissipation vs. scrambling
In this subsection, we will return to the 2- and 4-point averages and compare them against expectations from time evolution. Furthermore, we will attempt to provide some physical intuition for the behavior of these averages over different ensembles. This will support our picture of chaotic time evolution leading to increased pseudorandomness.

For strongly coupled thermal systems, it is expected that the connected part of the 2-point correlation functions decays exponentially within a time scale $t_d$ of order the inverse temperature $\beta$,

$$\langle A(0)B(t)\rangle = \langle A\rangle\langle B\rangle + O(e^{-t/t_d}). \qquad (137)$$

This time scale is often referred to as a "dissipation" or "thermalization" time and is related to the time it takes local thermodynamic quantities to reach equilibrium. 20 It is suggestive that the results Eq. (112) and Eq. (137) are so similar. After a short time $t_d$, for these 2-point functions the chaotic dynamics give the same results as the Haar random dynamics.

Next, we turn to the variance of the 2-point correlator $\langle A(0)B(t)\rangle$. For a closed system with a finite number of degrees of freedom, the 2-point function will be quasi-periodic with recurrences after a timescale $t_r \sim e^{d}$, which is exponential in the dimension $d = 2^n$ and doubly exponential in the number of degrees of freedom $n$. As such, the long-time average of $|\langle A(0)B(t)\rangle|^2$ must be nonzero. This can be estimated by performing a time average and gives a well known result [61][62][63]. Comparing against our result for the Haar-averaged dynamics Eq. (116), we see that they coincide.

Next, let's consider 4-point correlators in strongly-coupled theories with a large number of degrees of freedom $N$. 22 (For example, this can be thought of as a system of $N$ qubits where all the qubits interact but the interactions are at most $q$-local, with $q \ll N$ and $N \to \infty$.) First, let's consider the case of a time-ordered correlator.

Similar to the case of the 2-point functions, two of the three Wick contractions are expected to decay exponentially within a dissipation time $t_d$ which for qubits is analogous to considering a 2-point function between the composite operators $AC$ and $BD$. 23 Thus, this correlator will equilibrate after a time $t_d$ with a late-time value that depends on the expectations $\langle AC\rangle$ and $\langle BD\rangle$.

Now, let's consider the out-of-time-order 4-point correlator in a large $N$ strongly interacting theory. For $t \sim t_d$, this will behave similarly to the time-ordered correlator, with two of the three Wick contractions decaying exponentially. However, for $t > t_d$ the correlator obtains an exponentially growing connected component. This growth occurs in the regime $t_d < t < t_*$. The time scale $t_* = \lambda^{-1}\log N$, known as the fast scrambling time, is the time at which the exponentially growing piece of the correlator compensates its $1/N$ suppression and becomes $O(1)$. The coefficient $\lambda$ has the interpretation of a new kind of Lyapunov exponent [15] and is bounded from above by $2\pi/\beta$ [4]. Finally, for $t > t_*$, these OTO 4-point correlators are expected to decay to a small floor value that is exponentially small in $N$. A natural guess for this floor is the value reproduced from Eq. (118) with 1-point functions assumed to be subtracted off.
As we mentioned, the 2-point function Eq. (137) reached its Haar random value after a short dissipation time t d . This is very suggestive of a picture where chaotic dynamics behave as a pseudo-1-design after a time t d . Taking this point further, let's consider the 4-point OTO correlator averaged over Pauli operators, an ensemble that forms a 1-design, but not a 2-design. Furthermore, we will assume that the operators A and B have zero overlap. Under this assumption, we can show that The proof of this is relegated to §A.2. Apparently, Pauli operators capture the behavior of the dynamics around t ∼ t d , i.e. from after the dissipative regime until the scrambling regime, but then a 2-design is required to capture the behavior after t ∼ t * , i.e. the post-scrambling regime. 24 Thus, we might say that after a time ∼ t d , the system becomes a pseudo-1-design, and then after ∼ t * the system becomes a pseudo-2-design. (See Fig. 2 for a cartoon of this behavior.) However, it remains an open question whether there are any additional meaningful timescales that can be probed with correlators after t * , though we are hopeful that such timescales might be hiding in higher-point OTO correlators.

Discussion
In this paper, we have connected the related ideas of chaos and complexity to pseudorandomness and unitary design. A cartoon of these ideas is expressed nicely by Fig. 3. Operators can be thought of as being organized by increasing complexity. Regions defined by circles of larger and larger radius can be thought of as defining designs with increasing $k$. 25 In the rest of the discussion, we will make some related points, tie up loose ends, and mention future work.

24 Note that these observations depend on the ensemble we average over actually being the Pauli operators, and not just any ensemble that forms a 1-design without forming a 2-design. Furthermore, we assume that $A, B, C, D$ are simply few-body operators so that the correlator is of local operators. These choices are determined for us by the basis in which the Hamiltonian is $q$-local.

Figure 3: A cartoon of the unitary group, with operators arranged by design. We pick the identity operator to be the reference operator of zero complexity and place it at the center. Typical operators have exponential complexity and live near the edge. Operators closer to the center have lower complexity, which makes them both atypical and more physically realizable in a particular computational model.

Generalized frame potentials and designs
In realistic physical systems, one usually does not have access to the full Hilbert space. For example, there may be some conserved quantities, such as energy or particle number, or the system may be at some finite temperature $\beta$. In that case, one would be interested in understanding pseudorandomness inside a subspace of the Hilbert space, i.e. restricted to some state $\rho$. In Appendix E, we generalize the frame potential for an arbitrary state $\rho$, finding that the generalized quantity has all the useful properties desired of a frame potential. In particular, it is minimized by the Haar ensemble, and it provides a lower bound on ensemble size and complexity. However, if the state $\rho$ is the thermal density matrix, $\rho \propto e^{-\beta H}$, and the ensemble is given by time evolution with an ensemble of Hamiltonians $\mathcal{E} = \{e^{-iHt}\}$, then we need to take into account the fact that the state itself depends on the ensemble. Instead, we can define a thermal frame potential. In this case, even at $t = 0$ one can derive an interesting bound on the complexity of the ensemble. We hope to return to this in the future to analyze the complexity of formation: the computational complexity of forming the thermal state $\rho_\beta$ from a suitable reference state. 26

Finally, it would be similarly interesting to consider a different generalization of unitary designs where, under some physical constraints, we can only access some limited degrees of freedom in the system. In this sense, one could think of the unitary ensemble (as opposed to the state) as being generated by tracing over these additional degrees of freedom. These "subsystem designs" would then be "purified" by integrating back in the original degrees of freedom. 27 This interesting direction is a potential subject of future work.

More chaos in quantum channels
In Appendix F, we revisit some ideas from our previous work [7], though these ideas are also relevant to the current work. In particular, in §F.1 we reconsider the Hayden-Preskill notion of black hole scrambling [1]. In this thought experiment, Alice throws a secret quantum state into a black hole. Assuming Bob knows the initial state and dynamics of the black hole, we show how the question of whether Bob can reconstruct Alice's secret is related to the decay of an average of a certain set of OTO four-point correlators.
In §F.2, we provide an operational interpretation to taking averages over OTO correlators. We show that this is related to a quantum game of "catch" where Alice may "spit-on" or otherwise perturb the ball before throwing it to Bob. The average over four-point OTO correlators gives the probability that Alice did not modify the ball. We also show that an average over higher-point OTO correlators can be interpreted as an "iterated" game of catch (i.e. what normally people just call "catch") where both Alice and Bob have the opportunity to modify the ball each turn. In this case, the OTO correlator average is related to the joint probability that neither Alice nor Bob perturb the ball.
Finally, in §F.3 we show that an average over a particular ordering of 2k-point OTO correlators can be related to the $k$th Rényi entropy of the operator $U$ interpreted as a state. We find that

$$-\log(\text{a certain average of } 2k\text{-point OTO correlators}) \propto S^{(k)}_{\rm subsystem},$$

where $S^{(k)}_{\rm subsystem}$ is the Rényi $k$-entropy of a particular subsystem of the density matrix $\rho = |U\rangle\langle U|$.

Volume of unitary operators
The argument in §4.2 led to a bound on the ratio $N_s/N$, which can be interpreted as the volume of $\mathcal{E}$ in terms of $\epsilon$-balls. An interesting application of this bound might be to think about the volume of unitary operators in $U(d)$ that can be probed in a finite time scale $T$, i.e. the volume of operators with depth $D \sim T$. (See §C.1 for further discussion of a lower bound on circuit depth in terms of the frame potential.) In fact, in certain situations (such as the Brownian circuit introduced in [5] or the random circuit model), it is not difficult to show, by computing the $k = 1$ frame potential, that the volume of unitary operators with depth $T$ grows at least as $\sim \exp(\text{const} \cdot nT)$ for small $T$, with a constant independent of $n$. This implies that the space of unitary operators, with the metric being quantum gate complexity, has hyperbolic structure with constant curvature, as discussed in e.g. [64,67,65] (see also [68]). On the other hand, one can upper bound the volume of unitary operators with circuit depth $T$ by counting the ways the qubits can be paired up in each layer:

$$V(T) \sim \left[\binom{n}{2}\binom{n-2}{2}\binom{n-4}{2}\cdots\binom{2}{2}\right]^{T} \approx \exp(n\log n \cdot T).$$

Thus, for small $T$ and large $n$, the lower bound seems to be reasonably tight.

Once a lower bound on the volume of unitary operators in an ensemble is obtained in units of $\epsilon$-balls, we can also obtain a lower bound (of the same order) on the complexity of a typical operator in the ensemble. This seems possible by using the formal arguments given in [50] even when the elementary gate set is not discrete, e.g. all the two-qubit gates.

Finally, it's a curious fact that for systems with time-dependent Hamiltonian ensembles (such as the random circuit models or the Brownian circuit of [5]) we get an initial linear growth of the volume exponent with $T$. As argued in §C.2 (and confirmed numerically), for time-independent Hamiltonian evolution (e.g. in SYK or in the Gaussian unitary ensemble (GUE)) we get a lower bound $V(T) \sim \exp(\text{const} \cdot nT^2)$, which persists for a short time $T \sim 1/\sqrt{n}$. It would be very interesting to understand the difference in this scaling. 28

Tightness of the complexity bound
While the frame potential provides a rigorous lower bound on the complexity of generating an ensemble of unitary operators, there may be a cost: the bound may not be very tight when applied to time evolution by an ensemble of Hamiltonians. 29 Let us try to understand this better. To be concrete, consider the $k = 2$ frame potential for a strongly coupled spin system that scrambles in $t_* \sim \log n$ time. In such a system, for local operators $W, V$ of unit weight, OTO four-point correlators $\langle W(t)^\dagger V^\dagger W(t) V\rangle$ will begin to decay after $t \sim O(\log n)$. Since the $k = 2$ frame potential is the average of four-point OTO correlators, one might expect that the frame potential will also start to decay at $t \sim O(\log n)$.

However, this is not quite right. We expect the decay time for more general correlators of larger operators to be reduced to $\tilde{t}_* \sim t_* - O(\log(\text{size } W)) - O(\log(\text{size } V))$, where $t_* \sim O(\log n)$ is the scrambling time when $V$ and $W$ are low-weight operators. If we randomly select Pauli operators $W$ and $V$, they will typically be nonlocal operators with $O(n)$ weights, and therefore the OTO decay time $\tilde{t}_*$ will be reduced to $O(1)$ for $V$ and $W$ of typical sizes.

In fact, the above estimate suggests most of the correlators determining the complexity bound should begin to decay immediately. As we can see in Eq. (98), each correlator itself only makes a logarithmic contribution to the complexity and so we shouldn't expect the remaining slow decaying local correlators to be dominant. (To be sure, a further investigation of this point is required.) One possible way to fix this problem would be to generalize the frame potential by using a $p$-norm with $p \neq 2$ so that it is more sensitive to the slower decaying local correlators. We leave the study of such a generalization to the future.

Complexity and holography
Finally, we will return to the question of complexity and holography discussed in the introduction. In the context of holography, computational complexity was "introduced" [70] as a possible resolution to the firewall paradox of [71,6]. A direct connection between complexity and black hole geometry was first proposed by Susskind [72,73], which culminated in proposals that the interior of the black hole geometry is holographically dual to the spatial volume [74] or the spacetime action [75,76]. These proposals are motivated by the fact that the black hole interior continues to grow as the state evolves long past the time entropic quantities equilibrate [77]. While there is nice qualitative evidence for both of these proposals [67,13,65], missing is a direct understanding of computational complexity in systems that evolve continuously in time with a time-independent Hamiltonian.
A hint can be obtained by considering some of the motivations for these holographic complexity proposals. In particular, building on the work of [77] and previous work of Swingle [78,79], Maldacena suggested that the black hole interior could be found in the boundary theory by a tensor network construction of the state [80]. A tensor network toy model of the evolution of the black hole interior was investigated in [7]. In this toy model, the interior of the black hole was modeled as a flat tiling of perfect tensors, see Fig. 4. These tensors were elements of the Clifford group and acted as two-qubit unitary operators that highly entangled neighboring qubits at each time step. From the perspective of the boundary theory, this is a model for Hamiltonian time evolution. 30 This toy model captures some important features of the complexity growth of the black hole state. The number of tensors in the network grows linearly in time, by construction. Operators will grow ballistically, exhibiting the butterfly effect, and the network scrambles in linear time. Thus, this network captures the aspects of black hole chaos related to local scrambling and ballistic operator growth discussed in [13] as well as aspects of complexity growth discussed in [77, 73-75].

However, since in this model the perfect tensor is a repeated element of the Clifford group, the complexity can never actually grow to be very big. In fact, the quantum recurrence time of the model was investigated in [7] and was found to be exponential in the entropy $\sim e^{n}$ rather than doubly exponential $\sim e^{e^{n}}$ as expected in a fully chaotic model. This is related to our oft-stated fact that the Clifford group generally does not form a higher-than-2-design. In fact, this model can actually be mapped to a classical problem, and by the Gottesman-Knill theorem its complexity can be no greater than $O(n^2)$ gates [82]. 31 These observations were the inspiration for this current work, since in this toy model 4-point OTO correlators behave chaotically, but higher-point OTO correlators do not.

Nevertheless, this model can be "improved" by using random 2-qubit tensors rather than a repeated perfect tensor. 32 In [41], it was shown that this local random quantum circuit approaches a unitary $k$-design in a circuit depth that scales at most as $O(k^{10})$. Our complexity lower bound for a $k$-design Eq. (99) suggests that the time to become a $k$-design is lower bounded by $k$, and we suspect that this can be saturated. 33 It is in this sense that we speculate that the complexity growth of the chaotic black hole is pseudorandom. That is, we suspect that as the complexity of the black hole state increases linearly with time evolution $t$, the dynamics evolve to become pseudo-$k$-designs, with the value $k$ roughly scaling with $t$, and that this may be quantified by either representative 2k-point OTO correlators or by an appropriate generalization of unitary design.

Figure 4: A 6-qubit tensor network model for the geometry of the interior of a black hole. Via holography, the growth of the interior is expected to correspond to chaotic time evolution of a strongly coupled quantum theory. Here, each node corresponds to a perfect tensor and the numbers label the qubit.
With this in mind, it would be interesting to see whether one could use the tools of unitary design to prove a version of the conjectures of [75,76] suggesting that complexity (is greater than or) equal to action.

A.1 Proof of Theorem 2
Proof. The LHS of Eq. (68) can be written explicitly as Both $C_k^\dagger A_k^\dagger \cdots C_1^\dagger A_1^\dagger$ and $A'_1 C_1 \cdots A'_k C_k$ are Pauli operators with complex phases. This means that the traces will give a nonzero contribution only if both $A_1 \cdots A_k C_1 \cdots C_k \propto I$ and $A'_1 \cdots A'_k C_1 \cdots C_k \propto I$. Thus, the sum will vanish unless $A_1 \cdots A_k \propto A'_1 \cdots A'_k$. So, we consider the case where $A_1 \cdots A_k \propto A'_1 \cdots A'_k \propto P$ for some Pauli operator $P$. Then, the summation can be written as Here we expanded the trace because $C_k^\dagger A_k^\dagger \cdots C_1^\dagger A_1^\dagger$ and $A'_1 C_1 \cdots A'_k C_k$ are proportional to identity operators when $C_1 \cdots C_k \propto P^\dagger$, and the $d$ comes from the trace of an identity operator. By using the cyclic property of the trace, we can eliminate $C_k^\dagger$ and $C_k$. Here we used the fact that fixing $C_1, \ldots, C_{k-1}$ uniquely determines $C_k$ when $C_1 \cdots C_k \propto P$. Next, recall the relationship for summing over Pauli operators Eq. (25), where $(\cdots)$ represents arbitrary operators. This gives By repeated action of Eq. (151), we can establish the desired result.

A.2 Proof of Eq. (143)
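We can expand operators $A, B, C, D$ by using Pauli operators as a basis of operators,

$$A = \sum_i a_i Q_i, \quad B = \sum_j b_j Q_j, \quad C = \sum_k c_k Q_k, \quad D = \sum_l d_l Q_l,$$

where $Q_i$ are Pauli operators. Notice that, for a Pauli operator $P$,

$$P^\dagger Q_j P = K(Q_j, P)\, Q_j \quad \text{and} \quad \frac{1}{d^2}\sum_{P} K(Q_j, P)\, K(Q_l, P) = \delta_{jl},$$

where again we have defined $K(P, Q)$ by $Q^\dagger P Q = K(P, Q) P$. Thus, the desired average becomes

$$\langle A\tilde{B}C\tilde{D}\rangle_{\rm Pauli} = \sum_{i,j,k} a_i b_j c_k d_j \langle Q_i Q_j Q_k Q_j\rangle.$$

Observe that $\langle Q_i Q_j Q_k Q_j\rangle \neq 0$ only when $Q_i = Q_k$, which gives

$$\langle A\tilde{B}C\tilde{D}\rangle_{\rm Pauli} = \sum_{i,j} a_i b_j c_i d_j \langle Q_i Q_j Q_i Q_j\rangle.$$

Next, we assume that $A$ and $B$ have no overlap. This implies that $[Q_i, Q_j] = 0$ if $a_i, b_j \neq 0$, and we obtain

$$\langle A\tilde{B}C\tilde{D}\rangle_{\rm Pauli} = \sum_{i,j} a_i c_i\, b_j d_j.$$

Since $\langle AC\rangle = \sum_i a_i c_i$ and $\langle BD\rangle = \sum_j b_j d_j$, we see that $\langle A\tilde{B}C\tilde{D}\rangle_{\rm Pauli} = \langle AC\rangle\langle BD\rangle$, as desired.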
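As a sanity check of this argument, the Pauli-ensemble average can be evaluated by brute force on two qubits. The sketch below is illustrative only and rests on stated assumptions: we read the claim being proven as "the 4-point OTO correlator averaged over the Pauli group equals $\langle AC\rangle\langle BD\rangle$ when $A$ and $B$ have no overlap", operators are stored as dictionaries from Hermitian Pauli strings to coefficients, and the helper names (`mul`, `expval`, `pauli_averaged_oto`) are hypothetical.

```python
from itertools import product

# Single-qubit Pauli multiplication table: (P, Q) -> (phase, P*Q).
_MUL = {
    ("I", "I"): (1, "I"), ("I", "X"): (1, "X"), ("I", "Y"): (1, "Y"), ("I", "Z"): (1, "Z"),
    ("X", "I"): (1, "X"), ("X", "X"): (1, "I"), ("X", "Y"): (1j, "Z"), ("X", "Z"): (-1j, "Y"),
    ("Y", "I"): (1, "Y"), ("Y", "X"): (-1j, "Z"), ("Y", "Y"): (1, "I"), ("Y", "Z"): (1j, "X"),
    ("Z", "I"): (1, "Z"), ("Z", "X"): (1j, "Y"), ("Z", "Y"): (-1j, "X"), ("Z", "Z"): (1, "I"),
}


def mul(op1, op2):
    """Product of two operators given as {pauli_string: coefficient}."""
    out = {}
    for s1, c1 in op1.items():
        for s2, c2 in op2.items():
            phase, label = 1, []
            for p, q in zip(s1, s2):
                ph, r = _MUL[(p, q)]
                phase *= ph
                label.append(r)
            key = "".join(label)
            out[key] = out.get(key, 0) + c1 * c2 * phase
    return out


def expval(op, n):
    """Normalized trace tr(op)/d, i.e. the coefficient of the identity string."""
    return op.get("I" * n, 0)


def pauli_averaged_oto(A, B, C, D, n):
    """Average of <A (U B U) C (U D U)> over all 4^n Pauli strings U.

    Pauli strings are Hermitian, so U^dagger = U.
    """
    total = 0
    for letters in product("IXYZ", repeat=n):
        U = {"".join(letters): 1}
        Bt = mul(mul(U, B), U)
        Dt = mul(mul(U, D), U)
        total += expval(mul(mul(A, Bt), mul(C, Dt)), n)
    return total / 4 ** n


n = 2
# A, C act on qubit 0 and B, D act on qubit 1, so A and B have no overlap.
A = {"XI": 0.6, "ZI": 0.8}
C = {"XI": 0.5, "YI": 0.3, "ZI": -0.2}
B = {"IX": 1.0, "IZ": 0.5}
D = {"IZ": 2.0, "IY": 1.0}

lhs = pauli_averaged_oto(A, B, C, D, n)
rhs = expval(mul(A, C), n) * expval(mul(B, D), n)
print(lhs, rhs)

# With overlapping support the factorization fails: X and Z anti-commute.
overlap = pauli_averaged_oto({"XI": 1}, {"ZI": 1}, {"XI": 1}, {"ZI": 1}, n)
print(overlap)
```

In the non-overlapping example both sides equal $\langle AC\rangle\langle BD\rangle = 0.14$ up to rounding, whereas the overlapping example gives $-1$ rather than $+1$, which illustrates why the no-overlap assumption is needed.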

B More orthogonal states
In the Hilbert space $\mathcal{H} = \mathbb{C}^d$, there are $d$ orthogonal states which form the basis of the Hilbert space, $\langle i|j\rangle = \delta_{ij}$ $(i, j = 1, \ldots, d)$.

Yet, there are $\sim 2^d$ states which are nearly orthogonal to each other, $|\langle \tilde{i}|\tilde{j}\rangle| \ll 1$ for $i \neq j$. A quantum state in this Hilbert space can be written as $|\psi\rangle = \sum_{j=1}^{d} a_j |j\rangle$, where $|a_1|^2 + |a_2|^2 + \cdots + |a_d|^2 = 1$. Thus, a quantum state can be associated with a point on a $2d$-dimensional unit sphere. By computing the volume of states whose inner product with a given reference state is larger than $\epsilon$, we can find the number of nearly orthogonal states. Such a volume was explicitly computed in [50], but the point is that there are a doubly exponential number of nearly orthogonal states in terms of the number of degrees of freedom $n$, with $d = 2^n$.

Here we provide another simple argument for this scaling. 35 Define $\{a\} = (a_1, \ldots, a_d)$ with $a_j = \pm 1$. We consider the following state

$$|\{a\}\rangle = \frac{1}{\sqrt{d}}\sum_{j=1}^{d} a_j |j\rangle.$$

There are $2^d = 2^{2^n}$ states which can be represented in this way. Let us pick $\{a\}$ and $\{b\}$ randomly, and consider the inner product between the two states $|\{a\}\rangle$ and $|\{b\}\rangle$,

$$\langle\{a\}|\{b\}\rangle = \frac{1}{d}\sum_{j=1}^{d} a_j b_j.$$

Since we chose $\{a\}$ and $\{b\}$ randomly, the product $a_j b_j = \pm 1$ is random, and the inner product will be close to zero, implying that there are $O(2^d) = O(2^{2^n})$ nearly orthogonal states.
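The random-sign argument can be illustrated numerically. The following sketch assumes the normalized form $|\{a\}\rangle = d^{-1/2}\sum_j a_j|j\rangle$ for the sign states, so that the overlap of two such states is $(1/d)\sum_j a_j b_j$; the function names, the seed, and the choice $n = 12$ are arbitrary and for illustration only. For independent random signs the overlap is a sum of $d$ random $\pm 1$ terms divided by $d$, so its typical size is $1/\sqrt{d}$.

```python
import random


def sign_state_overlap(a, b):
    """Inner product <{a}|{b}> = (1/d) * sum_j a_j b_j for sign vectors a, b."""
    assert len(a) == len(b)
    return sum(x * y for x, y in zip(a, b)) / len(a)


def random_sign_vector(d, rng):
    """A uniformly random vector of d signs, each +1 or -1."""
    return [rng.choice((-1, 1)) for _ in range(d)]


rng = random.Random(0)  # fixed seed so that the run is reproducible
n = 12                  # number of qubits (illustrative choice)
d = 2 ** n

vectors = [random_sign_vector(d, rng) for _ in range(5)]
overlaps = [
    sign_state_overlap(vectors[i], vectors[j])
    for i in range(len(vectors))
    for j in range(i + 1, len(vectors))
]
max_overlap = max(abs(o) for o in overlaps)

print("typical scale 1/sqrt(d):", d ** -0.5)
print("largest pairwise |overlap|:", max_overlap)
```

Each state has unit norm, while the pairwise overlaps are of order $1/\sqrt{d} \approx 0.016$ for $n = 12$, consistent with the states being nearly orthogonal.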

C More complexity bounds
In this appendix we consider complexity lower bounds for when gates can be applied in parallel (§C.1) and for disordered Hamiltonian systems at early times (§C.2).

35 Part of this presentation follows a nice talk given by Adam Brown [85].

C.1 Circuit depth
In Theorem 6, the complexity $C(\mathcal{E})$ was defined as the number of steps required to generate the ensemble $\mathcal{E}$ when only a single two-qubit gate from the gate set can be applied at each step. Yet, quantum gates may be applied in parallel if they do not act on the same qubits. By allowing simultaneous applications of quantum gates, we can instead consider the minimum quantum circuit depth $D(\mathcal{E})$ to generate $\mathcal{E}$ and obtain a lower bound with a very similar argument. At each step, we can group the qubits into sets of $q$ in

$$\binom{n}{q}\binom{n-q}{q}\binom{n-2q}{q}\cdots\binom{q}{q} = \frac{n!}{(q!)^{n/q}}$$

different ways (assuming $n$ is divisible by $q$ for simplicity). So, in a depth 1 circuit there are roughly $g\, n!/(q!)^{n/q}$ choices. 36 Thus, the lower bound on the depth is For $q \ll n$, the denominator is $\log g + n\log n$. For large $n, q$, the denominator goes like $\log g + n\log(n/q)$. Essentially, the effect of large $q$ is only important if there is a larger than exponential number of gates, $g \gtrsim 2^n$.
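The counting identity for the number of ways to group the qubits in one layer can be checked directly. This is a small sketch with hypothetical function names; it only verifies the combinatorial identity and prints the logarithm that enters the denominator of the depth bound, next to the two approximations $n\log n$ and $n\log(n/q)$ quoted in the text.

```python
from math import comb, factorial, log


def groupings_product(n, q):
    """Product C(n,q) * C(n-q,q) * ... * C(q,q), assuming q divides n."""
    assert n % q == 0
    total = 1
    remaining = n
    while remaining > 0:
        total *= comb(remaining, q)
        remaining -= q
    return total


def groupings_closed_form(n, q):
    """Closed form n! / (q!)^(n/q), assuming q divides n."""
    assert n % q == 0
    return factorial(n) // factorial(q) ** (n // q)


for n, q in [(4, 2), (6, 2), (6, 3), (64, 2), (64, 16)]:
    exact = groupings_product(n, q)
    assert exact == groupings_closed_form(n, q)
    print(n, q, log(exact), n * log(n), n * log(n / q))
```

The approximations capture the leading behavior in $n$ only; the subleading corrections visible in the printed values do not change the form of the bound.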

C.2 Early times
Here, we consider complexity lower bounds for disordered Hamiltonian systems at early times. This discussion is intended to be completely general, but for concreteness you may imagine that we are referring to the ensemble implied by time evolving with the Sachdev-Ye-Kitaev Hamiltonian [58][59][60]

$$H = (i)^{q/2}\sum_{1 \le i_1 < \cdots < i_q \le N} j_{i_1 \ldots i_q}\, \psi_{i_1}\cdots\psi_{i_q}, \qquad (169)$$

with the variance of the couplings given by $\overline{j^2_{i_1 \ldots i_q}}$, where the overline denotes an ensemble average. The model consists of $N$ Majorana fermions $\psi_i$, and each term in the Hamiltonian involves $q$ of the fermions interacting with a random coupling independently picked from a Gaussian with variance $\overline{j^2_{i_1 \ldots i_q}}$. We want to understand the initial growth of complexity of an ensemble of time evolution operators $\mathcal{E}(t) = \{e^{-iHt}\}$, where $H$ is disordered, e.g. defined by Eq. (169). In general, we assume that $\overline{\mathrm{tr}\{H\}} = 0$, and so the first nontrivial moment is the second moment. Start by expanding $U_i(t) = e^{-iH_i t}$ and $V_j(t) = e^{-iG_j t}$ for early times

36 We thank Fernando Brandao for this suggestion.

Plugging in to the definition of the frame potential, we get Next, we use the fact that $\overline{\mathrm{tr}\{H_i G_j\}} = 0$ by independence, and expand assuming $t^2\, \mathrm{tr}\{H^2\} \ll d$ to get an expression for the frame potential This implies a lower bound on the initial growth of complexity of 37 valid for early times such that $\mathrm{tr}\{H^2\}\, t^2/d \ll 1$. For the SYK model Eq. (169), we have $\mathrm{tr}\{H^2\} = J^2 d\, (N/2q^2)$, and $C(t) > k (Jt)^2 (N/q^2)$ for times $t^2 \ll 2q^2/(J^2 N)$. Thus, while complexity is expected to eventually grow linearly in time [50,51], for early times our bound predicts a quadratic phase of growth.

D More Haar random averages
In this appendix, we present the details of the Haar random averages of 8-point OTO correlators as well as an argument for the scaling with d of higher-point functions.

D.1 8-point functions
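A general formula to compute the Haar average of 2k-point OTO correlators is where $\mathrm{Wg}(\pi)$ is the Weingarten function. Because of the Weingarten function, this is difficult to evaluate. The unitary Weingarten function on $S_n$ is given by [55]

$$\mathrm{Wg}(\pi) = \frac{1}{n!}\sum_{\lambda \vdash n} \frac{f_\lambda\, \chi_\lambda(\pi)}{s_\lambda(d)},$$

where the summation is over all partitions $\lambda$ of $n$. $s_\lambda$ is a polynomial in $d$ given by

$$s_\lambda(d) = \prod_{i=1}^{l(\lambda)} \frac{(d + \lambda_i - i)!}{(d - i)!},$$

where $l(\lambda)$ is the length of $\lambda$, $f_\lambda$ is the dimension of the irreducible representation associated with $\lambda$, and $\chi_\lambda$ is the irreducible character.

37 We thank Jordan Cotler for discussions.

For $S_2$, using the character table, we have $s_{(2)} = d(d+1)$ and $s_{(1,1)} = d(d-1)$. Thus,

$$\mathrm{Wg}(1,1) = \frac{1}{d^2-1}, \qquad \mathrm{Wg}(2) = -\frac{1}{d(d^2-1)}.$$

For $S_3$, using the character table, we have $s_{(3)} = d(d+1)(d+2)$, $s_{(2,1)} = (d-1)d(d+1)$, and $s_{(1,1,1)} = d(d-1)(d-2)$. Thus

$$\mathrm{Wg}(1,1,1) = \frac{d^2-2}{d(d^2-1)(d^2-4)}, \qquad \mathrm{Wg}(2,1) = -\frac{1}{(d^2-1)(d^2-4)}, \qquad \mathrm{Wg}(3) = \frac{2}{d(d^2-1)(d^2-4)}.$$

For $S_4$, using the character table, we have $s_{(4)} = d(d+1)(d+2)(d+3)$, and similarly for the other partitions, from which the Weingarten functions for the five cycle types follow. We also note here the large $d$ asymptotic behavior

We now will evaluate the following 8-point OTO correlator average of the commutator-type correlators where $\tilde{B} = U B U^\dagger$ and $\tilde{D} = U D U^\dagger$. We assume that $A, B, C, D$ are Pauli operators, $A, B, C, D \neq I$ and $AC, BD \neq I$. We write the correlator as where $W_{2341}$ is a cyclic permutation. We have For large $d$, the coefficients $\mathrm{Wg}(\pi)$ scale as follows At first sight, the dominant contribution might seem to be $O(1)$ but, due to nice cancellation, the above expression becomes which is $O(d^{-4})$. Finally, we can write down the complete answer (commutator type), which we rewrite as The difference from the commutator type is that the first trace contains the operator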
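The character expansion of the Weingarten function used in this appendix can be evaluated exactly for small $n$. The sketch below is an illustration under stated assumptions rather than the authors' calculation: it assumes the standard form $\mathrm{Wg}(\pi) = \frac{1}{n!}\sum_\lambda f_\lambda \chi_\lambda(\pi)/s_\lambda(d)$ with $s_\lambda(d) = \prod_{(i,j)\in\lambda}(d + j - i)$ (which gives $s_{(4)} = d(d+1)(d+2)(d+3)$ as quoted in the text), hard-codes the well-known character tables of $S_2$ and $S_3$, and requires $d \geq n$ so that no $s_\lambda(d)$ vanishes. The names `content_product`, `CHARACTERS`, and `weingarten` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def content_product(partition, d):
    """s_lambda(d): product over boxes (i, j) of the Young diagram of (d + j - i)."""
    result = 1
    for i, row_len in enumerate(partition):
        for j in range(row_len):
            result *= d + j - i
    return result


# Character tables: partition -> {cycle type: character value}.
# The value on the identity class (1, ..., 1) is the dimension f_lambda.
CHARACTERS = {
    2: {
        (2,): {(1, 1): 1, (2,): 1},
        (1, 1): {(1, 1): 1, (2,): -1},
    },
    3: {
        (3,): {(1, 1, 1): 1, (2, 1): 1, (3,): 1},
        (2, 1): {(1, 1, 1): 2, (2, 1): 0, (3,): -1},
        (1, 1, 1): {(1, 1, 1): 1, (2, 1): -1, (3,): 1},
    },
}


def weingarten(cycle_type, d):
    """Unitary Weingarten function for a permutation of the given cycle type.

    Only n = 2, 3 are tabulated here, and d >= n is assumed.
    """
    n = sum(cycle_type)
    identity = (1,) * n
    total = Fraction(0)
    for partition, chi in CHARACTERS[n].items():
        f_lambda = chi[identity]
        total += Fraction(f_lambda * chi[cycle_type], content_product(partition, d))
    return total / factorial(n)


d = 4
print(weingarten((1, 1), d), Fraction(1, d * d - 1))
print(weingarten((2,), d), Fraction(-1, d * (d * d - 1)))
print(weingarten((1, 1, 1), d), weingarten((2, 1), d), weingarten((3,), d))
```

The $S_2$ identity value reproduces the coefficient $C_I = 1/(d^2-1)$ that appears in the variance of the 2-point function. Extending the table to $S_4$ would follow the same pattern with the five irreducible characters of $S_4$.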

D.2 Higher-point functions
Finally, we will make a conjecture regarding the behaviors of higher-point OTO correlation functions. We speculate that 4m-point OTO correlators of the commutator-type form will have the following asymptotic scaling.
Conjecture 1. The 4m-point OTO correlation functions of the commutator type asymptotically scale as $\sim d^{-2m}$ when averaged over Haar random unitary operators.
For m = 1, 2, we recover the analytical results we have already obtained. Below, we provide a supporting argument for this conjecture.
First, let's consider 8-point functions using the method from §5. We are interested in computing the following quantity OTO (8) (A 1 , B 1 , A 2 From a simple calculation, we find If B 1 = I, we have since A 1 A 2 = I and B 2 = I. Thus, we have By a recursive argument, we can show the following The average of OTO(A 1 , B 1 , . . . , A m , B m ) over B 1 is zero as long as B m = I. If B 1 = I, then we have Thus, we see This argument suggests that the Haar average of 4m-point OTO correlators have a scaling ∼ d −2m . One would need to evaluate the Weingarten function for S 2m in order to check this exactly, and we will not attempt to do that. In particular, the Haar average of OTO (4) (A, B) does not depend on A, B as long as A, B = I. For 8-point or higher-point OTO correlators, OTO (4m) (A 1 , B 1 , . . . , A m , B m ) depends on the details of A 1 , B 1 , . . . , A m , B m , even when we impose B 1 , . . . , B k = I and A 1 · · · A k = I. Namely, this suggests that the exact result will depend on the commutation relations of the A j and B j , and we do not know an exact form of dependence. Our argument can only tell us that Eq. (202) has a value that scales as O (d −2m ).
Finally, for these commutator-type higher-point OTO correlators, we can compute Clifford averages easily by using the fact that $A_j$ and $\tilde{B}_j$ are Pauli operators. This supports our belief that higher-point correlators might be useful probes of whether an ensemble forms a $k$-design but does not form a higher design.

E More (general) frame potentials
In this appendix, we generalize the notion of unitary design for states described by an arbitrary density matrix $\rho$. We will find two possible generalizations of the frame potential: The difference is the ordering of $V$ and $U^\dagger$ in the second trace. For the maximally mixed state $\rho = I/d$, both expressions reduce to the original frame potential $F^{(k)}_{\mathcal{E}}$ up to a factor of proportionality: However, the two expressions behave quite differently when $\rho$ is not the maximally mixed state. Below, we explain how we arrived at these expressions and study their basic properties such as their Haar average values and the fact that they are minimized when the ensemble is a $k$-design. Our conclusion is that the first expression, $F^{(k)}_{\mathcal{E}}(\rho)$, seems to be a more appropriate generalization of the frame potential.

Properties
The original $k$th frame potential, defined for $\rho \propto I$, is minimized if and only if the unitary ensemble $\mathcal{E}$ is a $k$-design. We expect that any sensible generalization of the frame potential should have a similar minimization property. If so, then e.g. if $F^{(k)}_{\mathcal{E}}(\rho) = F^{(k)}_{\rm Haar}(\rho)$, we can think of the ensemble $\mathcal{E}$ as forming a $k$-design with respect to the state $\rho$. We prove the following lemma.
with equality if E is a k-design.
Proof. We begin with G (k) . Consider the following operator Observe For F (k) , consider the following quantity where σ L = (ρ 1/k ) ⊗k ⊗ (I) ⊗k and σ R = (I) ⊗k ⊗ (ρ 1/k ) ⊗k . Define We then have In §3, we showed that the kth frame potential is proportional to the average of certain 2k-point OTO correlators, where the correlators are evaluated for maximally mixed states. Here, we compute an average of OTO correlators for general ρ to derive a candidate expression of generalized frame potential. Inspired by [4], we consider regulated OTO correlators of the form
When ρ is the thermal density matrix, Eq. (216) uniformly distributes the operators around the thermal circle. Using Eq. (216), it is not difficult to prove the following lemma.
Lemma 2. For regulated 2k-point OTO correlators, we have This is one of our motivations for considering F (k) (ρ) as the proper generalization of the frame potential. 39 Also, we note that due to the fact that the maximally mixed state is normalized with a factor of 1/d, for these generalized frame potentials the scaling with respect to d is a bit different from the original definition of the frame potential F (k) . For a maximally mixed state (ρ = I/d), we have generalized original ensemble Next, we will analyze the properties of F Thus, it behaves differently on pure states vs. mixed states, e.g.

F
(1) so that there is one ρ per trace. However, this means that the correlators Eq. (216) will have many copies of ρ. This is similar to what happens when using correlators to compute Rényi entropies.
For $\rho = |\psi\rangle\langle\psi|$ with $|\psi\rangle = |0\rangle^{\otimes n}$, we find that random Pauli-X operators, operators that can either act as X or I on each qubit, achieve the Haar value. To see this, we can evaluate $\mathrm{tr}\{|\psi\rangle\langle\psi| U V^\dagger\}$. Since we choose U and V randomly from a set of Pauli-X operators, $U V^\dagger |\psi\rangle$ will be given by a random product state $|a_1, \ldots, a_n\rangle$ with $a_j = 0, 1$. Since $\mathrm{tr}\{|\psi\rangle\langle\psi| U V^\dagger\} = \langle 0, \ldots, 0 | a_1, \ldots, a_n \rangle$, the non-zero contribution has probability 1/d. This means that $F_{\text{Pauli-}X} = 1/d$, which is the Haar value. Finally, as before we can use the generalized frame potential to bound the size of the ensemble. Note that random Pauli-X operators for $\rho = |0\rangle\langle 0|$ are tight in terms of this lower bound since there are exactly d different Pauli-X operators.
Next, we will analyze the properties of $G^{(1)}$. Note that the result does not depend on ρ. The cardinality bound implies that forming a k-design requires $|\mathcal{E}| \geq d^2$. Note that this is a tight lower bound since the set of all Pauli operators consists of $d^2$ different operators. Therefore, the minimization of $G^{(1)}$ requires a 1-design, regardless of ρ.
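The Pauli-X counting argument can be checked numerically. The following is a minimal sketch, assuming that for k = 1 and a pure state the generalized frame potential reduces to the uniform ensemble average of $|\langle\psi|UV^\dagger|\psi\rangle|^2$; the function names are illustrative and do not come from the paper.

```python
import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def pauli_x_string(bits):
    """Tensor product acting as X on qubits with bit 1 and as I on qubits with bit 0."""
    op = np.array([[1.0 + 0j]])
    for b in bits:
        op = np.kron(op, X if b else I2)
    return op

def pauli_x_frame_potential(n):
    """Average of |<0...0| U V^dagger |0...0>|^2 over all pairs of Pauli-X strings.

    Assumption: this is the k = 1 generalized frame potential for the pure
    state |0...0>, with every ensemble element weighted uniformly.
    """
    d = 2 ** n
    psi = np.zeros(d, dtype=complex)
    psi[0] = 1.0
    ops = [pauli_x_string(bits) for bits in itertools.product([0, 1], repeat=n)]
    total = 0.0
    for U in ops:
        for V in ops:
            amp = np.vdot(psi, U @ V.conj().T @ psi)
            total += abs(amp) ** 2
    return total / len(ops) ** 2

n = 3
print(pauli_x_frame_potential(n), 1 / 2 ** n)
```

Only the d pairs with U = V contribute, each with unit weight, so the sum over the d² pairs gives 1/d.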
The fact that all these properties of $G^{(1)}$ are independent of the state ρ, in conjunction with Lemma 2, suggests that $F^{(k)}$ is the more interesting generalization of the frame potential. As a result, we will continue by focusing only on $F^{(k)}$.
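As an illustration of the statement that the $d^2$ Pauli operators saturate the 1-design bound, the sketch below evaluates the original frame potential of the Pauli ensemble. It assumes the standard finite-ensemble form $F^{(k)} = |\mathcal{E}|^{-2}\sum_{U,V}|\mathrm{tr}\{U^\dagger V\}|^{2k}$ with uniform weights and the known Haar value $k!$ (valid for $k \le d$); the helper names are hypothetical.

```python
import itertools
import numpy as np

PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_strings(n):
    """All 4**n tensor products of single-qubit Pauli operators on n qubits."""
    ops = []
    for labels in itertools.product(range(4), repeat=n):
        op = np.array([[1.0 + 0j]])
        for label in labels:
            op = np.kron(op, PAULIS[label])
        ops.append(op)
    return ops

def frame_potential(ensemble, k):
    """Uniformly weighted average of |tr(U^dagger V)|^(2k) over pairs in the ensemble."""
    total = 0.0
    for U in ensemble:
        for V in ensemble:
            total += abs(np.trace(U.conj().T @ V)) ** (2 * k)
    return total / len(ensemble) ** 2

paulis = pauli_strings(2)            # d = 4, so d**2 = 16 operators
print(frame_potential(paulis, 1))    # compare with the Haar value 1! = 1
print(frame_potential(paulis, 2))    # compare with the Haar value 2! = 2
```

For k = 1 the Pauli ensemble reaches the Haar value, while for k = 2 it gives $d^2$, well above 2, consistent with the Pauli operators forming a 1-design but not a 2-design.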
We continue by considering F Haar . The Haar value is E (ρ) are of the same order regardless of the state ρ, we see that the maximum values are significantly different.
Finally, there is an easy way to compute the minimal value of the generalized frame potential $F^{(k)}$ for any pure state, e.g. $\rho = |0\rangle\langle 0|$. Writing the frame potential in terms of the ensemble of wavefunctions $\mathcal{E}|0\rangle = \{U|0\rangle : U \in \mathcal{E}\}$, we see that it is minimized when $\mathcal{E}|0\rangle$ forms a k-design as an ensemble of quantum states. To obtain the Haar value, recall Eq. (42) for the k-fold average of a Haar random state, which is proportional to the projector $\Pi_{\text{sym}} = \frac{1}{k!}\sum_{\pi \in S_k} W_\pi$ onto the symmetric subspace. From this, we see that the Haar value of $F^{(k)}$ evaluated on a pure state will be $k!\,(d-1)!/(k+d-1)!$. At large d, the Haar value for pure states scales like $\sim k!/d^k$, which for k > 2 is much smaller than the Haar average for a mixed state. It's also interesting to point out that if we consider the time average of the generalized frame potential on a pure state $|+\rangle$, this is exactly equal to $k!/d^k$. Thus, in comparison to §3.3, the time average of this generalized frame potential can actually attain an almost Haar value with respect to $|+\rangle$.
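The pure-state Haar value can be cross-checked by building $\Pi_{\text{sym}}$ explicitly for small d and k. This is a sketch under the assumption that the k-fold Haar average of a state equals $\Pi_{\text{sym}}/\mathrm{tr}\,\Pi_{\text{sym}}$, so that the Haar value is $\langle 0|^{\otimes k}\Pi_{\text{sym}}|0\rangle^{\otimes k}/\mathrm{tr}\,\Pi_{\text{sym}}$; all function names are illustrative.

```python
import itertools
import math
import numpy as np

def permutation_operator(perm, d):
    """W_pi on (C^d)^{tensor k}: maps |i_0,...,i_{k-1}> to |i_{perm[0]},...,i_{perm[k-1]}>."""
    k = len(perm)
    dims = (d,) * k
    W = np.zeros((d ** k, d ** k))
    for idx in itertools.product(range(d), repeat=k):
        out = tuple(idx[perm[j]] for j in range(k))
        W[np.ravel_multi_index(out, dims), np.ravel_multi_index(idx, dims)] = 1.0
    return W

def symmetric_projector(d, k):
    """Pi_sym = (1/k!) * sum over all permutations of W_pi."""
    perms = list(itertools.permutations(range(k)))
    return sum(permutation_operator(p, d) for p in perms) / len(perms)

def haar_pure_state_value(d, k):
    """<0..0| Pi_sym |0..0> / tr(Pi_sym), the assumed Haar value on a pure state."""
    P = symmetric_projector(d, k)
    return P[0, 0] / np.trace(P)

def closed_form(d, k):
    """k! (d-1)! / (k+d-1)!, which approaches k!/d^k at large d."""
    return math.factorial(k) * math.factorial(d - 1) / math.factorial(k + d - 1)

for d, k in [(2, 2), (3, 2), (2, 3)]:
    print(d, k, haar_pure_state_value(d, k), closed_form(d, k))
```

The trace of $\Pi_{\text{sym}}$ is the dimension of the symmetric subspace, which is where the binomial factor in the closed form comes from.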

Thermal frame potentials
The above generalization is incorrect for a system in a thermal state that evolves in time by an ensemble of Hamiltonians. This is because the thermal state $\rho_\beta = e^{-\beta H}/\mathrm{tr}\{e^{-\beta H}\}$ depends on the Hamiltonian and thus the ensemble itself. Instead, we will define a thermal frame potential W By first averaging over Hamiltonians and then taking an average over all choices of operators of the norm squared of these correlators, we find For β = 0 (infinite temperature) the thermal frame potential reduces to the original frame potential $F^{(k)}$ up to a factor of proportionality where $F^{(k)}$ is the original frame potential, $F^{(k)}(I/d)$ is the generalized frame potential from the previous section, and both are evaluated on the ensemble $\mathcal{E}(t) = \{e^{-iHt}\}$. We can also use $W^{(k)}_\beta(t)$ to bound the cardinality of the ensemble. For simplicity, let's assume that the ensemble $\mathcal{E}(t)$ is a discrete set of $e^{-iHt}$ taken with a uniform weight. This lets us make a bound W (k) or in terms of the cardinality, For k = 1, the bound takes a particularly simple form One nice property of the thermal frame potential is that even when t = 0 we may obtain a non-trivial bound. For example, with k = 1 we can express the thermal frame potential as If β = 0, the bound is trivial, $|\mathcal{E}(0)| \geq 1$. However, if β > 0 the thermal frame potential can be smaller than unity, which gives a non-trivial lower bound on $|\mathcal{E}(0)|$! We can see this straightforwardly by applying the Cauchy-Schwarz inequality to the integrand of Eq. (245) The intuition for such a lower bound at t = 0 comes from the fact that $e^{-\beta G}$, $e^{-\beta H}$ contain imaginary time evolution. We suspect that this will also let us bound the "complexity of formation" of such states $\rho_\beta$ with respect to an ensemble of Hamiltonians.

F More chaos in quantum channels
Finally, in this appendix we expand on some of the results of [7] that are otherwise somewhat outside the main focus of the current paper:
• In §F.1, we revisit the black hole thought experiment of Hayden and Preskill [1] and discuss its relationship to the decay of OTO four-point functions.
• Next, in §F.2 we provide an operational meaning to taking averages over OTO correlation functions as introduced in [7]. The argument is partly inspired by a course taught by Kitaev [52].
• Finally, in §F.3 we generalize the relationship between OTO correlators and Rényi entropy obtained in [7], relating an average over 2k-point functions to an expression involving the kth Rényi entropy.

F.1 OTO correlators and black holes as mirrors
In the conventional black hole information thought experiment, Alice throws some quantum state (or herself!) into a black hole, the system evolves by some chaotic unitary operator U , and then Bob attempts to reconstruct Alice's quantum state (or Alice!) by collecting the outgoing Hawking radiation emitted by the black hole. Hayden and Preskill [1] added an interesting twist to this classic setup by assuming that Bob knows both the initial state and the dynamics U of the black hole. In this scenario, they showed that if the dynamics U are sampled from a 2-design, Alice's m-qubit quantum state can be immediately reconstructed by collecting m+ qubits of the Hawking radiation. The black hole acts as if it is a quantum information mirror, "reflecting" Alice's state almost immediately.
To be more precise, let's split the input state into subsystems A and B, and let's split the output state into different subsystems C and D. We let A represent Alice or her input quantum state, B represent the initial black hole state, C represent the remaining black hole state after some evolution and evaporation, and D represent the emitted Hawking radiation. The black hole dynamics U take AB to CD. This setup is shown in Fig 5. In the conventional setting, Bob has access only to D. However, in the Hayden-Preskill modification, Bob has access to both B and D.
The mirror phenomenon can be better understood by interpreting the unitary operator U as a state $|U\rangle$ via the Choi-Jamiołkowski isomorphism [7]. The original operator U acts unitarily on n-qubit states, while the state $|U\rangle$ is defined on a 2n-qubit Hilbert space. This interpretation lets us compute entropies and informations between the input and the output. In particular, whether Bob is able to reconstruct Alice's quantum state A can be quantified by the mutual information $I(A : BD) = S_A + S_{BD} - S_{ABD}$, where e.g. $S_A = -\mathrm{tr}\{\rho_A \log \rho_A\}$ is the entanglement entropy evaluated on the density matrix $\rho_A = \mathrm{tr}_{BCD}\{|U\rangle\langle U|\}$. If I(A : BD) is near its maximal value $2S_A$, then Bob can reconstruct Alice's unknown quantum state. 40 It's easy to show that for a 2-design unitary operator U the mutual information is close to its maximum, $I(A : BD) \approx 2S_A$, thus enabling reconstruction of Alice's quantum state [1].
The information reconstruction problem is closely related to four-point OTO correlators. It was shown in [7] that four-point OTO correlation functions averaged over all Pauli operators in A and D are related to the second Rényi entropy by the formula where dAdD means an average over all Pauli operators in A and D, $S^{(2)}_{AC}$ is the Rényi-2 entropy for the joint region AC, and $d_A = 2^{S_A}$ is the Hilbert space dimension of A, etc. From a simple calculation, we obtain where $I^{(2)}(A : BD)$ is the Rényi-2 mutual information between A and BD.
This inequality holds since S A = S BD ≤ S BD . The decay of all the four-point OTO correlators between Alice A and the Hawking radiation D implies a strong correlation between Alice A and BD (the subsystem that Bob has access to), thus establishing a direct link between four-point OTO correlators and the information reconstruction problem.
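To make the state interpretation concrete, the sketch below builds $|U\rangle$ for two toy two-qubit unitaries and evaluates $I(A : BD) = S_A + S_{BD} - S_{ABD}$ in bits. The normalization $|U\rangle = d^{-1/2}\sum_i |i\rangle \otimes U|i\rangle$ and the qubit assignment (A, B as inputs and C, D as outputs, one qubit each) are assumptions made for illustration. SWAP and the identity are not chaotic dynamics; they are chosen only because they give the two extreme values of the mutual information.

```python
import numpy as np

def choi_state(U, n):
    """|U> = d^{-1/2} sum_i |i>_in (U|i>)_out as a tensor with axes (inputs..., outputs...)."""
    d = 2 ** n
    return (U.T / np.sqrt(d)).reshape((2,) * (2 * n))

def entropy(psi, keep):
    """Von Neumann entropy (in bits) of the qubits listed in `keep`."""
    rest = [ax for ax in range(psi.ndim) if ax not in keep]
    M = np.transpose(psi, list(keep) + rest).reshape(2 ** len(keep), -1)
    evals = np.linalg.eigvalsh(M @ M.conj().T)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def mutual_information_A_BD(U):
    """I(A:BD) for a two-qubit unitary with axes A=0, B=1 (inputs) and C=2, D=3 (outputs)."""
    psi = choi_state(U, 2)
    return entropy(psi, (0,)) + entropy(psi, (1, 3)) - entropy(psi, (0, 1, 3))

IDENTITY = np.eye(4, dtype=complex)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)

print(mutual_information_A_BD(SWAP))      # A is routed to D: maximal value 2 * S_A
print(mutual_information_A_BD(IDENTITY))  # A is routed to C: Bob learns nothing
```

With SWAP, Alice's qubit ends up entirely in D, so the mutual information reaches $2S_A$; with the identity it stays in C and the mutual information vanishes.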

F.2 Operational interpretation
In this subsection, we attempt to provide an operational interpretation of the sums over out-of-time-order correlation functions discussed in [7] and in this work.
Consider the following classical analogy. Alice and Bob are playing a game of catch, and Alice is about to throw the ball to Bob. However, Alice is known to cheat sometimes by perturbing the ball (she throws a mean spitball!), and so Bob would like to determine whether Alice has cheated. One way for Bob to check is to ask someone that he trusts, Charlie, to throw an identical ball to Bob. Then, Bob can compare the two balls and determine whether Alice has modified the ball. This setup is shown in Fig. 6. As we will explain, the average of all OTO correlators over the operators being correlated is closely related to the probability of Bob detecting Alice's perturbation.

In a quantum setting, we prepare the state $|\text{EPR}\rangle$ of n EPR pairs and give Alice one half of all the pairs so that Alice's density matrix $\rho_A$ is a maximally mixed state. We give the other half of the pairs to Charlie. Thus, Charlie has a perfectly correlated copy of Alice's initial state for Bob to later compare against. Now, Alice applies a perturbation, the Pauli operator $A_i$ with probability $p_i$, which corresponds to the superoperator $\rho \to \sum_i p_i A_i \rho A_i^\dagger$. Of course, included in this set is the identity operator I and an associated probability $p_I$ that Alice does not modify her state. 41 Next, Alice throws her ball to Bob by applying U, and Charlie throws his ball to Bob by applying $U^*$. 42 The overall state is described by the statistical ensemble of states $(U A_i \otimes U^*)|\text{EPR}\rangle$ occurring with probabilities $p_i$.
Bob would like to compare the two quantum states from Alice and Charlie to determine whether Alice applied a perturbation. Note that if Alice did not apply a perturbation ($A_i = I$), then the final state is equal to the initial state $|\text{EPR}\rangle$ because $(U \otimes U^*)|\text{EPR}\rangle = |\text{EPR}\rangle$. Therefore, Bob's strategy is to perform a projective measurement $\Pi = |\text{EPR}\rangle\langle\text{EPR}|$ on his state. The projection operator $|\text{EPR}\rangle\langle\text{EPR}|$ can be represented as an average over all the Pauli operators $B_j$ on Bob's state: $|\text{EPR}\rangle\langle\text{EPR}| = \frac{1}{d^2}\sum_j B_j \otimes B_j^*$. Thus, the probability of Bob measuring $|\text{EPR}\rangle$ is given by Eq. (255), which is easy to check by making use of Eq. (253). Eq. (255) is depicted graphically in Fig. 7, which makes it clear that we can interpret it as an average over four-point OTO correlators of the operators $A_i$ and $B_j$. 43 In the Hayden-Preskill setup, Bob performs a projective measurement $|\text{EPR}\rangle\langle\text{EPR}|$ only on qubits on D. Then, assuming that Alice applies perturbations $A_i$ to qubits on A with equal probabilities, the probability of Bob's measuring $|\text{EPR}\rangle$ on D is exactly given by Eq. (249).
41 If $p_I = 0$, then Bob would already know that Alice always applies a perturbation.
42 The conjugation appears because Charlie's copy of Alice's original state is actually a CPT conjugate. The application of $U^*$ can also be interpreted as Bob's uncomputation.
43 Note that the average over $A_i$ is with probabilities $p_i$, but the average over $B_j$ is uniform with probabilities $p_j = 2^{-2n}$.
This game of "catch" can be generalized by considering multiple rounds of throwing the ball back and forth between Alice and Bob. In this case, both Alice and Bob have the option of applying perturbations at any round of the game (except for the final time Bob receives the ball), but Charlie always faithfully throws a copy of the original ball to Bob for comparison at the end. This process turns out to be equal to an average over 4m-point OTO correlators with the same ordering as those studied in §D.2. This setup, with m = 2 rounds of catch, is shown in Fig 6(b).
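The single-round game can be simulated directly for a small system. The sketch below assumes the convention $|\text{EPR}\rangle = d^{-1/2}\sum_i |i\rangle|i\rangle$, with Alice's half as the first tensor factor, and treats the case where Bob measures the full projector on all qubits. The random unitary, the example perturbation probabilities, and the function names are illustrative choices, not taken from the paper.

```python
import itertools
import numpy as np

PAULIS = [
    np.eye(2, dtype=complex),
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def pauli_string(labels):
    """Tensor product of single-qubit Paulis; labels use 0=I, 1=X, 2=Y, 3=Z."""
    op = np.array([[1.0 + 0j]])
    for label in labels:
        op = np.kron(op, PAULIS[label])
    return op

def epr_state(d):
    """d^{-1/2} sum_i |i>|i>, with Alice's half as the first tensor factor."""
    return np.eye(d, dtype=complex).reshape(-1) / np.sqrt(d)

def epr_projector_from_paulis(n):
    """(1/d^2) sum_j B_j tensor B_j^*, summed over all n-qubit Pauli strings."""
    d = 2 ** n
    total = np.zeros((d * d, d * d), dtype=complex)
    for labels in itertools.product(range(4), repeat=n):
        B = pauli_string(labels)
        total += np.kron(B, B.conj())
    return total / d ** 2

def detection_probability(U, perturbations):
    """Probability that Bob projects onto |EPR> when Alice applies A_i with probability p_i."""
    d = U.shape[0]
    epr = epr_state(d)
    throw = np.kron(U, U.conj())  # Alice applies U, Charlie applies U^*
    prob = 0.0
    for p, A in perturbations:
        state = throw @ np.kron(A, np.eye(d)) @ epr
        prob += p * abs(np.vdot(epr, state)) ** 2
    return prob

n, d = 2, 4
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

# Hypothetical perturbation: do nothing with p_I = 0.6, apply X on the first qubit otherwise.
perturbations = [(0.6, pauli_string((0, 0))), (0.4, pauli_string((1, 0)))]
print(detection_probability(U, perturbations))
```

Because Bob measures all of the qubits here, any non-identity Pauli is detected with certainty and the success probability equals $p_I$ for every unitary U. The dependence on U, and hence on the OTO correlators, only appears once Bob's measurement is restricted to a subsystem such as D.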

F.3 Rényi-k entropy
Finally, we generalize the relationship between the second Rényi entropy and an average over four-point OTO correlators that was obtained in [7]. 44 We will consider 2k-point OTO correlation functions of the form We assume that $A_j$ and $D_j$ only act on subsets of qubits, and we will denote these subregions by A and D, respectively. We are interested in taking an average of correlators of the form Eq. (257), where we will let one of the operators from each subregion, $A_k$ and $D_k$, depend on the other operators such that the average gives a permutation operator (see Eq. (23) and the discussion in §2). In particular, to get a cyclic permutation $\pi_{\text{cyc}} = \pi_{23\ldots k1}$, we will take an average in the following way
We will denote such an average as follows where the subscript notation $\pi_A, \pi_D$ means that we took the average such that $A_1 \otimes A_2 \otimes \cdots \otimes A_k$ and $D_1 \otimes D_2 \otimes \cdots \otimes D_k$ form the permutation operators associated with $\pi_A, \pi_D$. Now, let us represent U as a state $|U\rangle$ by using the channel-state isomorphism Eq. (247) discussed in §F.1. We divide the input into subsystems A and B, and we divide the output into subsystems C and D. Our key result is a formula relating an average over 2k-point OTO correlators with operators in A and D to the exponential of the k-th Rényi entropy of the subsystem AC with $S^{(k)}_{AC}$ the Rényi k-entropy of AC. To derive this, we also use the following relation
$\mathrm{tr}\{A_1 \tilde{D}_1 \cdots A_k \tilde{D}_k\} = \mathrm{tr}\{(A_1 \otimes \cdots \otimes A_k) \cdot (\tilde{D}_1 \otimes \cdots \otimes \tilde{D}_k) \cdot W_{\pi_{\text{cyc}}}\}$,
discussed in the beginning of §3.
From Eq. (23) we know that taking an average of the form Eq. (258) replaces $(A_1 \otimes \cdots \otimes A_k)$ and $(D_1 \otimes \cdots \otimes D_k)$ by permutation operators $W_{\pi_{\text{cyc}}^{-1}}$ and $W_{\pi_{\text{cyc}}}$, which act on A and D, respectively. Graphically, we have
[Diagram with $U^{\otimes k}$ and $U^{\dagger \otimes k}$]
where $\pi_A = \pi_{\text{cyc}}^{-1}$ and $\pi_D = \pi_{\text{cyc}}$. Careful consideration of this picture elucidates that it is proportional to $\mathrm{tr}\{\rho_{AC}^k\}$.
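The role of the cyclic permutation operator can be checked numerically for small dimensions. This sketch assumes the convention that $W_{\pi_{\text{cyc}}}$ maps $|i_1, i_2, \ldots, i_k\rangle$ to $|i_2, \ldots, i_k, i_1\rangle$, which matches $\pi_{\text{cyc}} = \pi_{23\ldots k1}$ and gives $\mathrm{tr}\{(X_1 \otimes \cdots \otimes X_k) W_{\pi_{\text{cyc}}}\} = \mathrm{tr}\{X_1 \cdots X_k\}$; the opposite convention would reverse the order of the product. The Rényi entropy is taken in bits as $S^{(k)} = \frac{1}{1-k}\log_2 \mathrm{tr}\{\rho^k\}$, and the function names are illustrative.

```python
import functools
import itertools
import numpy as np

def cyclic_shift_operator(d, k):
    """W_cyc on (C^d)^{tensor k}: maps |i_1, i_2, ..., i_k> to |i_2, ..., i_k, i_1>."""
    dims = (d,) * k
    W = np.zeros((d ** k, d ** k))
    for idx in itertools.product(range(d), repeat=k):
        shifted = idx[1:] + idx[:1]
        W[np.ravel_multi_index(shifted, dims), np.ravel_multi_index(idx, dims)] = 1.0
    return W

def tensor_product(ops):
    """Kronecker product of a list of matrices, in order."""
    return functools.reduce(np.kron, ops)

def renyi_entropy_via_permutation(rho, k):
    """S^(k) in bits from tr(rho^k) = tr(rho^{tensor k} W_cyc), for integer k >= 2."""
    d = rho.shape[0]
    moment = np.trace(tensor_product([rho] * k) @ cyclic_shift_operator(d, k)).real
    return float(np.log2(moment) / (1 - k))

# Check the trace relation on random matrices (d = 2, k = 3).
rng = np.random.default_rng(1)
mats = [rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)) for _ in range(3)]
lhs = np.trace(mats[0] @ mats[1] @ mats[2])
rhs = np.trace(tensor_product(mats) @ cyclic_shift_operator(2, 3))
print(np.isclose(lhs, rhs))

# A maximally mixed qubit has Renyi-k entropy of one bit for every k.
print(renyi_entropy_via_permutation(np.eye(2) / 2, 3))
```

Setting every $X_j = \rho$ in the trace relation gives $\mathrm{tr}\{\rho^k\}$, which is why inserting cyclic permutation operators on A and D produces a quantity proportional to $\mathrm{tr}\{\rho_{AC}^k\}$.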