Universal chaotic dynamics from Krylov space

Krylov complexity measures the spread of the wavefunction in the Krylov basis, which is constructed using the Hamiltonian and an initial state. We investigate the evolution of the maximally entangled state in the Krylov basis for both chaotic and non-chaotic systems. For this purpose, we derive an Ehrenfest theorem for the Krylov complexity, which reveals its close relation to the spectrum. Our findings suggest that neither the linear growth nor the saturation of Krylov complexity is necessarily associated with chaos. However, for chaotic systems, we observe a universal rise-slope-ramp-plateau behavior in the transition probability from the initial state to one of the Krylov basis states. Moreover, a long ramp in the transition probability is a signal for spectral rigidity, characterizing quantum chaos. Also, this ramp is directly responsible for the late-time peak of Krylov complexity observed in the literature. On the other hand, for non-chaotic systems, this long ramp is absent. Therefore, our results help to clarify which features of the wave function time evolution in Krylov space characterize chaos. We exemplify this by considering the Sachdev-Ye-Kitaev model with two-body or four-body interactions.


Outline and results
Krylov complexity measures the spread of a time-evolving state in a Hilbert space. For a maximally entangled state, this complexity only depends on the spectrum of the Hamiltonian and is independent of the choice of fundamental gates. We study the universal behavior of Krylov complexity for Hamiltonians describing chaotic systems. The Krylov approach consists of defining a particular Hilbert space basis, the Krylov basis. State evolution in this basis can be mapped to a particle moving on a one-dimensional chain. We exploit this map to equivalently describe state evolution in the Krylov basis in terms of forces acting on the particle. At early times, Krylov complexity displays a linear growth. Here we find that this linear growth is described by a generalized Ehrenfest theorem in Krylov space, providing an effective classical equation of motion for Krylov complexity. The linear growth of Krylov state complexity is not a characteristic of chaos, since it may also appear in non-chaotic systems. For late times, in chaotic systems Krylov complexity shows a characteristic peak and saturation structure, as numerically observed in Ref. [1]. By taking the continuum limit of the one-dimensional chain, we derive an analytical expression for the Krylov complexity at late times that confirms the observation of Ref. [1]. Moreover, we calculate the wave function in the Krylov basis. The norm of the amplitude squared of this wave function is referred to as transition probability. We find that it exhibits a universal rise-slope-ramp-plateau behavior with a long ramp. The ramp-plateau behavior is characteristic for chaos. Due to probability conservation, the characteristic long ramp that we find gives rise to the peak structure of the Krylov complexity. Moreover, we find that in non-chaotic systems, the long ramp of the transition probability disappears. This implies that in this case, the peak in the Krylov complexity is absent. Our results thus clarify which features of the wave function time evolution in Krylov space characterize chaos.

Chaos and Krylov space
To put our results into context, we begin with a brief review of quantum chaotic systems and the Krylov approach. The time evolution of a quantum chaotic system is characterized by the statistics of the energy spectrum. In a quantum chaotic system, the energy levels are correlated and exhibit two salient phenomena: level repulsion and spectral rigidity [2][3][4]. Level repulsion refers to the fact that energy levels tend to avoid clustering. Spectral rigidity means that the number of levels within an energy interval of given size has small fluctuations. Both of these properties are due to the precise nature of correlations between level spacings in chaotic systems. More precisely, it is expected that the level spacing statistics coincides with the random matrix theories (RMT) and are well-approximated by the Wigner-Dyson distribution [5][6][7][8]. The most studied RMT is theβ-Gaussian ensemble [9], which we will focus on in this paper. Recently, the Sachdev-Ye-Kitaev (SYK) model [10,11], a quantum mechanical model with all-to-all interactions (as opposed to nearest-neighbor interactions), was found to exhibit the level statistics of Gaussian RMT [12,13]. Moreover, Jackiw-Teitelboim (JT) gravity, a two-dimensional dilaton gravity that shows similar features of the SYK model, is precisely consistent with a double-scaled random matrix integral [14].
The level statistics also determines the behavior of the spectral form factor (SFF) given by the square of the absolute value of the partition function with a complex time argument [15,16], where E p and D are the p-th eigenvalue and the dimension of the Hamiltonian H, respectively, and β, t ∈ R. In a chaotic system, the SFF for the ensemble average usually shows three regions as a function of time: slope, ramp, and plateau [17][18][19], as shown in Fig. 1. Roughly speaking, these features arise from the width of the spectrum, spectral rigidity, and level repulsion, respectively. Since its time evolution reflects these properties, the SFF may be used to diagnose quantum chaos. The time at which there is a cross-over between the slope and ramp evolution is referred to as dip time. A chaotic system usually has an exponentially late dip time and an exponentially long and linear ramp region, controlled by the long-range spectral rigidity [5,8,17]. Chaotic evolution is a complicated process that requires a complexity measure for its quantitative analysis. Partially motivated by new relations between quantum computation and the time evolution of black holes [20][21][22][23], several concepts of complexity were proposed to measure how many computational steps are required to reach a target state or operator from a reference state or operator. One of the motivations for the investigations in the present paper is to examine how complexity reflects late-time chaos, based on level repulsion and spectral rigidity.
Among the complexity measures in information theory, Nielsen defined the complexity of a unitary operator U (t) = e −iHt as the minimal distance to the identity in the unitary group [24][25][26]. The minimal distance on the group manifold is defined in terms of some cost function. The definition of the cost relies on the choice of few-or many-body terms based on the locality properties of the Hamiltonian H. A similar choice of fundamental operations appears in the notion of computational complexity, which measures the complexity of producing a target state |ψ T ⟩ starting from a reference state |ψ R ⟩ [27,28]. Given a set of elementary quantum gates, the computational complexity is the minimum number of elementary gates necessary to achieve a unitary transformation U within a precision so that |ψ T ⟩ = U |ψ R ⟩.
When discussing the relation between late-time chaos and these notions of complexity, we note that level statistics does not provide the information about the locality properties of the Hamiltonian directly. Given a Hamiltonian from a Gaussian matrix ensemble, there is no natural way to define locality [18,43], let alone few-body or many-body interaction terms. As we will describe below, Krylov complexity is unambiguously defined even in this case, and hence well-suited for matrix ensembles.
Notions of complexity were also proposed in the context of the AdS/CFT correspondence [44]. In particular, the volume or action of a wormhole connecting the two sides of an eternal black hole is conjectured to be related to the complexity of preparing the dual state in quantum field theory [21,[45][46][47][48][49]. The real-time evolution of holographic complexity exhibits a similar linear-to-plateau behavior as the computational complexity [50], where the growth rate is argued to be bounded by the energy [20]. Moreover, holographic complexity has a nonzero initial value that is proportional to the initial volume of the wormhole in the dual gravity theory. The eternal black hole corresponds to a thermofield double (TFD) state in the field theory [51]. So, the initial volume as well as the complexity are generated by imaginary time evolution in preparing the TFD state. So far, a precise holographic dual of complexity is still an open question, despite recent progress [48,49,[52][53][54][55][56][57]. One of the remaining challenges is to precisely define complexity for interacting quantum field theories, in particular since their Hilbert space is infinite dimensional. One approach in this direction is to consider CFTs and to construct gates from conformal symmetry transformations [39,40,[58][59][60][61]. This also allows to construct a gravity dual of the cost function [62].
With potential relations to holography in mind, the notion of Krylov complexity draws increasing attention [1,[63][64][65][66][67][68][69] since it is well-defined in any quantum theory. Krylov complexity has the advantage that its complexity measure is independent of the locality properties of the Hamiltonian. It does not rely on defining elementary gates or a given tolerance. This makes it very appealing in the context of holographic dualities. According to the Hilbert space on which Krylov complexity is defined, it describes the evolution of states [1] or operators [63], in both real time and imaginary time [70,71].
In [1], a notion of Krylov state complexity is defined that realizes the appealing visualization of a wavefunction spreading over the Hilbert space in a basis-independent way. The authors of [1] refer to this Krylov state complexity as 'spread complexity'. It measures how far the target state spreads in the Hilbert space H. The target state |ψ τ ⟩ = e −τ L |0⟩ starts from a reference state |0⟩ at τ = 0 and evolves under a Liouvillian operator L constructed from the Hamiltonian H. Based on the Taylor series of L, this evolution may be studied in Krylov space that is constructed by applying L on |0⟩ repeatedly. In the orthogonal and normalized basis of Krylov space, namely {|O n ⟩} with |O n ⟩ = ψ n (L) |0⟩ and ψ n (x) a polynomial of degree n, the Liouvillian L becomes a tridiagonal matrix, whose components are called Lanczos coefficients [72,73], denoted as {a n , b n }. In terms of the Krylov basis, the time evolution of a state |ψ τ ⟩ = e −τ L |0⟩ can be effectively mapped to the propagation of a quantum particle along a one-dimensional chain, which is referred to as Krylov chain [63]. Krylov complexity is then defined as the location of the particle in the Krylov chain. This is equivalent to the expected number of times of applying L on |0⟩ required to generate |ψ τ ⟩.
Krylov operator complexity measures how far an operator in the Heisenberg picture spreads in the space of operators. By the Gelfand-Naimark-Segal (GNS) construction [74][75][76], the space of operators is isometric to a double-copy Hilbert space. More precisely, the reference state is defined as |O⟩ = (O ⊗ I) |0⟩, with O an arbitrary operator and I the identity, acting on the single-copy Hilbert space, respectively, and |0⟩ is a maximally entangled state in the double-copy Hilbert space. Moreover, one considers a Liouvillian L = H ⊗I−I⊗H, where H is the Hamiltonian acting on the single-copy Hilbert space. Since L |0⟩ = 0, the application of L on |O⟩ is nothing but the commutator, namely L |O⟩ = ([H, O] ⊗ I) |0⟩. Then, the Krylov operator complexity e −iHt Oe iHt is identified as the Krylov state complexity e −itL |O⟩ in the double-copy Hilbert space. Once the operator O has a nonzero commutator with the Hamiltonian, it will grow under the evolution with L. Methods for studying the time evolution of Krylov complexity were recently obtained by decomposing Liouvillian L into annihilation and creation operators and analyzing the "complexity algebra" [68,[77][78][79].
The exponential growth of Krylov operator complexity, and also the linear growth of Lanczos coefficients, allow to obtain the Lyapunov exponent [63,80] which characterizes the exponential operator size growth [81,82] given by, e.g., the out-of-time-ordered correlator (OTOC) [83][84][85][86]. However, the maximally exponential growth of Krylov operator complexity at early times is also observed in integrable systems, including free field theories. Exponential growth is therefore not necessarily related to chaos [67,87]. Here, we hence also turn our attention to the relation between the late-time behavior of Krylov complexity and chaos [65,88]. Moreover, it is argued in [89][90][91][92] that the descent in the Lanczos coefficients as well as the late-time behavior of the Krylov operator complexity given by the evolution with a chaotic Hamiltonian H is expected to be governed by the RMT. The relation between Krylov complexity and chaos has further been discussed for a number of models, including the SYK models [66,93,94], quantum field theories [95][96][97][98][99], many-body localization system [100], and open systems [101][102][103][104]. Krylov complexity has also been used for distinguishing topological phases [105,106] and for investigating the quantum charging advantage of SYK-like quantum batteries [107].
To study late-time chaos from Krylov complexity, it appears to be more convenient to study Krylov state complexity directly. The Krylov complexity of the maximally entangled state only depends on the spectrum of the Hamiltonian H. It is thus tied to the SFF and suitable to describe late-time chaos [1], in particular during the time range when the chaotic level spacing becomes manifest [65,89]. In Fig. 1, we show the Krylov state complexity of the maximally entangled state, and its correspondence to the SFF. It exhibits quadratic growth, linear growth, a peak, and saturation, whose transition time scales are close to those in the SFF. We refer to the time when it reaches its peak as "peak time". We further refer to the quadratic growth and linear growth regions as the early-time behavior and to the peak and saturation as the late-time behavior. We will show that the early-time behavior is given by a double time integral of the SFF via an Ehrenfest theorem and the latetime behavior is determined by the universal behavior of the probability given by the wave function in the Krylov space.

Organization of the paper
In Sec. 2, we first review the construction of Krylov space and Krylov complexity. We introduce the continuum limit of the Krylov approach in a first-order and a secondorder formalism, respectively. Moreover, we determine the Krylov complexity for obtaining the TFD state from a reference state given by a maximally entangled state. This Krylov complexity is entirely determined by the Hamiltonian spectrum.
In Sec. 3, we consider the Gaussian matrix ensemble and study the evolution of Krylov complexity at early times. We propose an Ehrenfest theorem for Krylov complexity, which linearly relates the second-order time derivative of the Krylov complexity to the SFF, see (1.1). In particular, with the Lanczos coefficients given by Gaussian matrix ensemble, the linear growth of the Krylov complexity is the determined by the slope of SFF, which is not necessarily related to chaos. In Sec. 4, we study the evolution of a maximally entangled state in Krylov space at late times. For the Gaussian unitary ensemble (GUE), we numerically study the distribution and evolution of the transition probability | O n |e −τ L |0 | 2 , namely the probability for reaching each state in the Krylov basis. We find that the transition probability universally exhibits a rise-slope-ramp-plateau behavior with an exponentially long ramp. Like the SFF, the ramp-plateau behavior exhibited in the transition probability characterizes chaos. To analytically explain and estimate this behavior, we further approximate the polynomial ψ n (L) and derive an expression for the rise-slope-ramp-plateau behavior in App. C. Moreover, we show that the above ramp-plateau behavior generally appears in any subspace observable in the Krylov space. Finally, we show that the existence of the long ramp in the transition probability is directly responsible for the peak in the Krylov complexity.
In Sec. 5, we study the transition probability and Krylov complexity for a nonchaotic spectrum, where the levels are uncorrelated. In contrast to the chaotic case, the transition probability here exhibits a rise-slope-plateau behavior without a ramp. The absence of a ramp in the transition probability is directly responsible for the absence of a peak in Krylov complexity. However, the linear growth of Krylov complexity persists.
In Sec. 6, we further study the transition probability and Krylov complexity in the SYK model. In the SYK 4 model, the transition probability exhibits a rise-sloperamp-plateau behavior similar to the case of RMT. In the SYK 2 model, the transition probability exhibits a rise-slope-ramp-plateau behavior with a short ramp at small n and no ramp at large n.
We conclude in Sec. 7 with an outlook to future directions.

Krylov state complexity
In this section, we detail the general framework for Krylov space and Krylov complexity of a time-evolving state.

Krylov space
Given a Hilbert space H, a reference state |0⟩ ∈ H, and a Hermitian operator L acting on H called Liouvillian [63], we can construct the Krylov space as follows. First, we construct a sequence of normalized states {|A j ⟩} K−1 j=0 by subsequently applying L to |0⟩, namely |A j ⟩ = µ −1/2 2j L j |0⟩ , for j = 0, 1, 2, · · · , K − 1, with µ j = ⟨0| L j |0⟩ the moments and K the minimal number such that µ 2K = 0. In general, these states may not be independent of each other. Let L to be the index of the first state |A L ⟩ becoming linearly dependent on the former states {|A j ⟩} L−1 j=0 . Then its latter states {|A j ⟩} K−1 j=L are also linearly dependent on these former states only span a L-dimensional space K, called Krylov space. Usually, we take the set of the former L states {|A j ⟩} L−1 j=0 as its basis. In general, the states {|A j ⟩} L−1 j=0 are not orthogonal to each other. We may apply the Gram-Schmidt orthogonalization to the sequence {|A j ⟩} L−1 j=0 to generate a sequence of orthogonal states where p n (x) and ψ n (x) are respectively the monic and normalized orthogonal polynomial of degree n with a measure given by the spectrum of L [108]. The norm h n will be determined later. Define the projection on the Krylov space as π K = L−1 n=0 |O n ⟩ ⟨O n |. Let |E p ⟩ ∈ K to be the eigenstate of L K = π K Lπ K , namely, π K L |E p ⟩ = E p |E p ⟩ for p = 0, 1, · · · , K − 1. The orthogonality relation and completeness relation are where n is the shorthand of L−1 n=0 and the measure in E on the spectrum with the spectral density and a continuation of inner product |⟨E p |0⟩| 2 . Formally, we can also write the completeness relation as n ψ n (E)ψ n ( The above Gram-Schmidt orthogonalization is realized by the following iterative algorithm [72,73] where {a n , b n } are the Lanczos coefficients with the dimension of energy. By default, we choose b n ≥ 0. The Gram-Schmidt orthogonalization is equivalent to a tridiagonalization of the Liouvillian L into a matrix L, The Lanczos coefficient can also be generated by the moments µ j and vice versa. The Lanczos coefficients give the monic polynomials, where L (n) is the n × n sub-matrix {L pq } n−1 p,q=0 . Obviously, the spectrum of L are the roots of p L (E) = 0. The norm h n of the monic polynomial p n (E) is given by b 2 n = h n /h n−1 and h 0 = 1. The polynomials satisfy the recurrence relation. For the normalized polynomials, 3), the n-th component of the eigenvector of L for energy E p is given by v n (E p ) = ⟨0|E p ⟩ ψ n (E p ). Then the eigenstate of L K in K can be written as Given a (non-normalized) state |ψ⟩ ∈ K, we can expand it on the normalized orthogonal basis {|O n ⟩} L−1 n=0 as |ψ⟩ = n |O n ⟩ ϕ n , ϕ n = ⟨O n |ψ⟩ . (2.10) Then we can define the Krylov complexity of the state |ψ⟩ as where J = n n |ϕ n | 2 , P = n |ϕ n | 2 . (2.12)
According to the Schrödinger equation (2.15), we have the imaginary time derivative 21) and the real time derivatives ∂ t P (t; β) = 0 and where τ = β + it and b 0 = 0 according to the convention in (2.6). The r.h.s. of (2.22) is just the expectation value of the commutator, such that we have with Krylov complexity operator defined asK = n n |O n ⟩ ⟨O n |. We refer to this equation as the Ehrenfest theorem of Krylov complexity, since it relates the second derivative of Krylov complexity to the expectation value of the gradient of the square of Lanczos coefficients. In this sense, it provides a classical equation of motion for the Krylov complexity. This feature will become more clear in the continuum limit in App. A. We note that a version of (2.22) for Krylov operator complexity in the case where a n = 0 was derived in [108]. We will also explain the relation between the Ehrenfest theorem (2.22) and the "complexity algebra" [68] in Sec. 3.2.
To calculate K(t; β), we could start from K(0; 0) = 0, evolve it along imaginary time for β, get K(0; β), evolve it along real time for t with the initial condition ∂ t K(t; β)| t=0 = 0, and finally get K(t; β). Usually, we will integrate (2.22) over the real time and get the complexity difference ∆K(t; β) ≡ K(t; β) − K(0; β) = t 0 dt 2 We discuss two limits of Krylov complexity below. At the low temperature limit, we may use eigenstates with the lowest two energies E 0,1 to approximate the wave function (2.24) The Krylov complexity converges to a constant plus an oscillation with frequency 10 2 cos(tE 10 ) |⟨E 1 |0⟩| 2 n nψ n (E 0 )ψ n (E 1 ). (2.25) Similarly to the SFF [17], the transition probability |ϕ n | 2 is determined by the energy levels in the long-time average, where oscillating phases average to zero and terms with E p = E q survive, namely where we have assumed for simplicity that there is no degeneracy. We refer to the time when |ϕ n | 2 converges to this value as plateau time t p . Then the long-time average of the complexity is given by (2.28) The late time average at β = 0 will be simplified if the reference state is taken to be a maximally entangled state in Sec. 2.4.

Continuum limit
It is difficult to solve the recurrence relation and Schrödinger equation with general Lanczos coefficients on the discrete Krylov chain. To simplify this problem, we consider the continuum limit n → x, with x a continuous coordinate, and solve the corresponding differential equations.

First-order formalism
We may map the polynomials ψ n (E) and the wave function ϕ n (τ ) to some continuous functions of n (as given below in (2.29) and (2.36)). Assuming that these functions depend smoothly on n, we may approximate their differences in n by their derivative w.r.t. n, and write the recurrence relation (2.9) and Schrödinger equation (2.15) as first-order differential equations [92,108]. We refer to this approach as the firstorder formalism of the continuum limit. Based on this simplifying approach, we may easily derive the Krylov complexity in the continuum limit. However, we will see that the assumption of smoothness is subtle and has to be clarified in a second-order formalism.
The authors of [92,108] developed an approach to calculate the polynomial ψ n (E) and wave function ϕ n (it) in the continuum limit as follows. The continuum limit is defined to take the form x n = ϵn, Ψ(E, x n ) = i n ψ n (E), Φ(t, x n ) = i n ϕ n (β + it), b(x n ) = b n , a(x n ) = a n , (2.29) which is valid when a n , b n , i n ψ n (E), and i n ϕ n (β + it) are smooth functions of n. For real time t, the recurrence relation (2.9) and the Schrödinger equations (2.15) become where b ′ = ∂ x b, and equivalently for the other variables. The above two equations are related by the transformation (2.14) from energy E to time t. Due to the ϵb ′ Ψ term in (2.30), the norm in (2.2) is not preserved by the evolution along x. Using the coordinate y with dy = dx/(2ϵb(x)) and with the new variableΨ = √ bΨ and Φ = √ bΦ, these equations simplify to Using the coordinate x, the solutions then become where the function f (t − ) is determined by the initial condition and t − (t, x) labels the characteristic curves [108] t − (t, x) = t − x dr 2ϵb(r) . (2.35) This shows that the wave function Φ(t, x) propagates forward with a local velocity 2ϵb(x), from n = 0 to n = L. Notice that (2.33) and (2.34) do not contain the end point of the Krylov chain x = ϵL since the last state in the Krylov basis (2.1) is |O L−1 ⟩. Since the initial condition ϕ n (0) = δ 0n is highly discontinuous, the continuum limit is valid only when the wave function spreads out. However, we notice that the discrete recurrence relation (2.9) and Schrödinger (2.15) enjoy the parity symmetry n → L − n, but their continuum versions (2.30) and (2.31) break the parity x → ϵL − x. As a result, the characteristic curves (2.39) have a preferred direction. The breaking of parity is due to the assumption on the smoothness of i n ψ n (E) and i n ϕ n (β + it) as functions of n in the continuum limit. If we consider an alternative continuum limit, namely This result corresponds to the backward characteristic curves . (2.39) This backward propagation will be important after the wave function is reflected by the endpoint at n = L. The forward propagation and backward propagation are unified by the second-order formalism presented in the next section. Finally, we note that the polynomials ψ n (E) obtained from (2.33) have some artefacts. First, as a function of E, ψ n (E) is a Fourier mode of frequency L(1 − 1 − n/L) instead of a polynomial of degree n. Second, in general it does not obey the orthogonality and completeness relations (2.2)(2.3) and is not normalized to 1. Third, it usually takes a complex value. The first two artefacts are the results of continuum limit. Finally, since (2.30) does not preserve the normalization, we have to renormalize ψ n (E) for each n and E. The third artefact is solved by the second-order formalism as well.

Second-order formalism
We will adopt the following second-order formalism, which develops from the approach in [108]. Applying the recurrence relation (2.9) and Schrödinger equations (2.15) twice, we obtain where we dropped the arguments of ψ n (E) and ϕ n (τ ) and the coefficients are c n = b n b n+1 , d n = b n (a n−1 + a n ), e n = b 2 n + a 2 n + b 2 n+1 . (2.42) When a n = 0 and then d n = 0, the second-order formalism results (2.40) and (2.41) are factorized into even and odd parts, respectively [108]. This happens in the case of even-parity spectrum {E p } = {−E p }. Moreover, we consider even L for simplicity. The recurrences of even sector {ψ 0 , ψ 2 , · · · , ψ L−2 } and odd sector {ψ 1 , ψ 3 , · · · , ψ L−1 } are decoupled. The same applies to the evolution of wave function. It is therefore not appropriate in general to assume that the even sector smoothly connects to the odd sector. We therefore proceed as follows. We consider the continuum limit c(x n ) = c n , g(x n ) = e n − 2c n , where we assume that the i n ψ n (E) and i n ϕ n (β + it) for even n and for odd n are continuous respectively. The second-order formalism of (2.40) and (2.41) becomes, for either the even sector or the odd sector, Since these are real equations, we may obtain real solutions with real boundary conditions. Using the coordinate y with dy = dx/(2ϵ c(x)) and with the variables Ψ = c 1/4 Ψ,Φ = c 1/4 Φ, these equations are simplified into two wave equations Thus, the characteristic curves are They correspond to the forward and backward characteristic curves also given by the first-order formalism, (2.35) and (2.39). Since ψ 0 (E) = 1 in the even sector and ψ 1 (E) = E/b 1 in the odd sector, we impose the boundary conditions for simplicity of the solutions. The functions Ψ(E, x) determined by solving the wave equations with the corresponding boundary conditions for the even sector and odd sector, respectively. A better approximation may be obtained by modifying the boundary conditions according to the values of ψ 2 (E) and ψ 3 (E).

Krylov complexity for the TFD state
The Krylov approach relies on the choice of the Liouvillian and reference state. In this subsection, to study the direct relation between chaos in the spectrum of a Hamiltonian H and Krylov state complexity, we will construct the Liouvillian from a Hamiltonian H and consider a maximally entangled state as the reference state. This is motivated by the construction of Ref. [1]. Consider a Hilbert space H with dimension D = dim H, and a Hamiltonian H with eigenstates |E i ⟩ 1 , i = 0, 1, 2, · · · , D − 1, where the subscript " 1 " denotes the single Hilbert space. We consider a double-copy of the Hilbert space H L ⊗H R . Given a copy of energy basis |E i , E j ⟩ = |E i ⟩ 1 ⊗ |E j ⟩ 1 , we can define a maximally entangled state in the double-copied Hilbert space where U L,R is an unitary operator acting on H L,R . We will take |0⟩ as the reference state. We consider the Liouvillian given by where I is the identity operator. Since the choice of unitary operators in (2.50) will not affect the moment ⟨0| L j |0⟩, we are free to chose U L = U R = I. The Lanczos algorithm only depends on the spectrum of H.
. Their inner product can be written as the trace Tr in the single-copy Hilbert space, Vandermonde matrix, whose rank is the dimension of Krylov space L = dim K. L is reduced by the degree of degeneracy in the spectrum for the following reasons. Consider the decomposition which overlaps with the reference state as ⟨E p |0⟩ = m p /D, and the remaining m p − 1 states are orthogonal to |0⟩. Then the dimension of Krylov space is reduced by m p − 1 for each H p . Since the Vandermonde matrix of the non-degenerate spectrum E n p for 0 ≤ p, n < l has full rank, L equals the number of the non-degenerate energies, namely L = l. In this paper, we focus on the case that the spectrum is uniformly d-fold degenerate. So, the Krylov space of the maximally entangled state has dimension L = D/d and overlap |⟨E p |0⟩| 2 = 1/L. Obviously, L K is diagonal on the basis |E p ⟩. Notice the difference between the spectral density ρ(E) of L K in (2.5) and the spectral density of H. The Krylov basis defined in (2.1) obeys the orthogonality and completeness relations (2.2)(2.3), with (2.4) equal to (2.52).
The target state with τ = β + it is a TFD state with inverse temperature 2β and evolved by the left side Hamiltonian for time t. Its coefficient on the Krylov basis, i.e. the wave function on the Krylov chain, is Obviously, the survival probability is related to the SFF (1.1) as |ϕ 0 (τ ) With S(2β) = Z(2β)/D, the Krylov complexity is written as Thus, the Krylov complexity for the TFD state only depends on the spectrum. The above Krylov approach based on the maximally entangled state works at finite D. In the D → ∞ limit, one have to firstly regularize the dimension D by truncating the spectrum. Then rescale the spectrum into an finite energy interval such that all the moments µ j for finite j are finite. Finally, one can take the D → ∞ limit. Taking the Gaussian ensembles as examples, we will normalize the Hamiltonian so that the spectral density ⟨ρ(E)⟩ in ensemble average vanishes when The Krylov space for the maximally entangled state has overlap |⟨E 0 |0⟩| 2 = 1/L. At β = 0, the long-time averages of the transition probability and of the Krylov complexity, given by (2.26) and (2.28), are simplified due to the orthogonality relation (2.2), as shown in [65,91]: These results are independent of whether chaotic behavior is present or not. They are essentially a consequence of taking the maximally entangled state as the reference state. However, we will show in Sec. 4.2 that the fluctuations of transition probability and Krylov complexity are indeed sensitive to the presence of chaotic behavior.

Krylov complexity at early times
Here, we will consider the Hamiltonian H drawn from the Gaussian orthogonal ensemble (GOE), the Gaussian unitary ensemble (GUE) and the Gaussian symplectic ensemble (GSE). They belong to theβ-Hermite (Gaussian) ensemble with Dyson indexβ = 1, 2, 4 respectively [9]. The measures of their random spectra are given by (C.9) in App. C. Due to the level repulsion in RMT, the spectra in both the GOE and GUE are non-degenerate, where L = D, and the spectrum in the GSE is doubly degenerate, where L = D/2. To simplify the notation, we are free to rescale the Hamiltonian so that the first Lanczos coefficient b 1 = 1. To recover the dimension, we can rescale the energies and times as E p → E p /λ, a n → a n /λ, b n → b n /λ, and τ → τ λ, where λ has the dimension of energy. In Fig. 2, we display the Lanczos coefficients for the Krylov space of maximally entangled states for a realization of the GUE. When n increases, a n fluctuates around 0 and b n decreases from 1 to 0. Their expectation values in the large L limit are further discussed in Sec. 3.1. Their fluctuations become stronger when n increases. In Fig. 3, we show the snapshots of the transition probability at some instants of real time t and imaginary time β in one realization of the GUE. Along the real time evolution, the profile consists of a shock wave, a long tail, and some residual noise following. The strength of the shock wave and the tail decreases along the time, but the tail becomes longer. The shock wave becomes tiny when reaching the end of the Krylov chain. Along the imaginary time evolution, the wave function is localized at the ground state of the Schrödinger equation (2.15), which is similar to the fate of the wave function in the Krylov space of open systems [101][102][103][104].
In this section, we will develop an analytical approach for obtaining the evolution of Krylov complexity at early times.

Lanczos coefficients in the large dimension limit
For the Liouvillian drawn from theβ-Hermite ensemble and a fixed reference state |0 ′ ⟩, the recent results of [69] show that the Lanczos coefficients obey the statistical distribution function where N (r, s) is the Gaussian distribution with mean r and variance s, and χ r (x) is the chi-distribution, given by χ r (x) = 2 −r/2 x r−1 e −x 2 /2 /Γ(r/2). The Lanczos coefficients have the expectation value and variance [69] ⟨a n ⟩ = 0, ∆ 2 (a n ) = 4 where ⟨· · · ⟩ denotes average over theβ-Hermite ensemble, and ∆ 2 denotes the variance. In the large L limit with finite x = n/L, we have ∆ 2 (b n ) = O(L −1 ). So the variance is small compared to the average of b n . Thus, in the large L limit, we can take the expectation values a n = 0, b n = 1 − n L , (3.4) where the deviation from the convention b 2 1 = 1 is negligible for large L. In our analysis below, we will maintain (3.4) for conciseness. The same scaling behavior in (3.4) was applied to the Krylov operator complexity at late times in [89].
We notice that, for the maximally entangled state, the Liouvillian L in (2.51) is factorized and thus it is not a random Hamiltonian acting on the double-copy Hilbert space H L ⊗ H R . The statistics of Lanczos coefficients (3.2) is not simply applicable. However, we find that if we take the limit (3.4) as the tridiagonalization of H and apply the algorithm (2.6) on the maximally entangled state |0⟩ with the Liouvillian L, we get the same Lanczos coefficients (3.4). In order words, the limit (3.4) is a fixed point of the algorithm on maximally entangled state (2.50). So, (3.4) may be taken as an approximation for the Lanczos coefficients of maximally entangled state in the large L limit, as shown in Fig. 2.
We now consider the continuum limit (2.29) of the Lanczos coefficients (3.4), which is given by a(x) = 0, b(x) = 1 − x/(ϵL). Then the y coordinate in the continuum limit is y(x) = L 1 − 1 − x/(Lϵ) . From the characteristic curves (2.47), at time t the shock wave will reach the site Thus the shock wave reaches the last site at t = L. When t > L, the shock wave gets reflected and travels backward. The authors of [69] gave an approximate way to relate the density of state to the Lanczos coefficients in the large L limit. Setting b n = 0 for a relatively small number of n, e.g., n = ms for m = 1, 2, ..., r with integers r, s ∼ √ L and L = rs, the density of states is slightly affected, however we neglect this effect since only a small proportion of b n are sent to zero for large L. This approach makes the tridiagonal matrix L into a block diagonal matrix with r blocks of size s × s. Furthermore, in the large L limit, the Lanczos coefficients change smoothly, e.g. (3.4). We can further approximate the a n 's and b n 's in the m-th block by their mean valuesā m ,b m . From the case of constant Lanczos coefficients (D.7), the density of state in the m-th block is with normalization L/s. The total density of state is where a n = a(x), b n = b(x) are introduced in the continuum limit with x = n/L. Obviously, via (3.7), in the large L limit the Lanczos coefficients (3.4) give the semicircle law With the above ingredients, in the next subsection we will find an analytical approximation to the Krylov complexity at early times.

Krylov complexity from spectral form factor
Based on the Ehrenfest theorem (2.22) and the limit value of Lanczos coefficients (3.4) in the RMT ofβ-ensemble, we now give a direct relation between the Krylov complexity for maximally entangled states and the SFF at early times.
To proceed, we insert (3.4) into the Ehrenfest theorem (2.22) and intermediately find the simply combinations of coefficients It is reminiscent of the "complexity algebra" in [68,79]. In their case, b 2 n+1 −b 2 n = An+ B, ∀n ≥ 0 and then L, B,K form a closed algebra, where B = [L,K],K = AK + B. The closed algebra completely determines the evolution of complexity. However, here we have an "anomaly" δ n0 , which prevents the "complexity algebra" from being closed. From the n = 0 term in the Ehrenfest theorem (2.22), we obtain that the external input for the evolution of complexity is |ϕ 0 (τ )| 2 , which is proportional to the SFF according to (2.54). More precisely, we get the following expression of the second derivative of Krylov complexity, where in the last expression, only different levels will contribute to the summation. This equation states that under the approximation for the Lanczos coefficient (3.4), the second derivative of Krylov is given by the SFF (1.1). In deriving this equation, we should note that the wave function also implicitly depends on the RMT Hamiltonian. Thus, we have actually neglected the statistical correlation between the Lanczos coefficients and the wave function resulted from the RMT. Surprisingly, if we take the double integral over the time on both sides of (3.10), we obtain exactly the spectral complexity defined in [110], This implies that the Krylov state complexity and the spectral complexity are related by the Ehrenfest theorem in Krylov space. The authors of [92] also proposed an equivalence between Krylov complexity and spectral complexity in the continuum limit. However, since we neglect the fluctuation of the Lanczos coefficients, we will see that the two complexities match each other through the linear growth region but deviate from each other at late times. We will distinguish the Krylov complexity K and its approximation from the Ehrenfest theorem, i.e. the spectral complexity C. Notice that, beyond the plateau time, the SFF converges to the plateau value |Z(β + it)| 2 → dZ(2β) in ensemble average [17] with d-fold degenerate spectrum. Then (3.10) will vanish, which is a necessary condition for the saturation of spectral complexity C. However, it is not a sufficient condition, as (3.10) does not ensure that ∂ t C(t; β) vanishes. It is necessary to check the saturation case by case. In Ref. [110], the authors calculated the spectral complexity in the microcanonical ensemble for the GUE, GOE, GSE. Here we will work on the canonical ensemble, focus on the GUE in the main text and leave the calculation of the GOE and GSE in App. B. Finally, we will compare the numerical result of Krylov complexities to those spectral complexities.
For the GUE, the one-point function of the spectral density ⟨ρ(E)⟩ at large L obeys the semicircle law (3.8), where the bracket is the matrix ensemble average. Its two-point correlation is given by the sine kernel [17][18][19] ⟨ρ( The sine kernel shows the short-range correlation between the spectrum. By Fourier transformation, we obtain the SFF which contributes the slope of SFF. The connected part contributes to the ramp of SFF via the t/2π in the integrand in the energy window satisfying t < 2π ⟨ρ(E)⟩ and contributes to the plateau when the energy window shrinks to zero.
We were unable to find an analytic formula of the SFF for a general β [19]. So, we will consider two limits β = 0 and β ≫ 1 below.

Infinite temperature limit
We first consider the infinite temperature limit β = 0 so that the SFF and Krylov complexity are sensitive to the full spectrum. Evaluating the integral (3.13) with (3.8), we obtain the SFF [19,111] where J n is the n-th Bessel function of the first kind. Inserting this into the Ehrenfest theorem (3.10) and taking the double-time integral with the initial condition C (1,0) (0; 0) = C(0; 0) = 0, we obtain the spectral complexity which asymptotes to a constant at t → ∞. More precisely, the spectral complexity has the asymptotic behavior The comparison between our analytical expression and the numerical calculation for the Krylov complexity at β = 0 is shown in the right panel of Fig. 4. They agree well before the peak time of the Krylov complexity. The linear growth 16t/3π is the result of the double-time integral of the slope of the SFF, which is uniquely determined by the one-point function of the spectral density and not the spectral correlation. In Sec. 5, we will show that an uncorrelated spectrum leads to the quadratic-tolinear growth in complexities. Similarly, in App. D, we calculate the complexities in the case of constant Lanczos coefficients and find the same quadratic-to-linear growth. We therefore state that the linear growth of the Krylov complexity with a maximally entangled reference state is not related to chaos. The saturation value of the spectral complexity C(∞; 0) = L − 1 is different from the saturation value of the Krylov complexity K ∞ = (L − 1)/2 in (2.56), as shown in the right panel of Fig. 4. The saturation of the spectral complexity after the plateau time is due to the cancellation between the factor sin 2 (π ⟨ρ(E)⟩ x) in the sine kernel and the Fourier mode sin 2 (tx/2) in (3.11) at either the imaginary infinity x → i∞ or x → −i∞. This saturation is related to chaos. Meanwhile, the saturation of the Krylov complexity for maximally entangled states is due to the discrete spectrum, which is not necessarily related to chaos. In App. B, the spectral complexities in the GOE and GSE exhibit the same quadratic-to-linear growth as in (3.16). But they saturate to (L/2) log t and 2L/3 respectively when t ≫ L. In Fig. 5, we compare the numerical Krylov complexities and the spectral complexities for the three Gaussian ensembles, where they agree with each other at early times only.

Low temperature limit
In the low temperature limit 1 ≪ β ≪ L 2/3 , we may approximate the spectral density by the lower edge of the spectrum, where we have shifted the energy as E → E − 2 and considered E ≪ 1, i.e. the "double scaled" limit [14,17]. By evaluating (3.13), we obtain the SFF z 0 e −t 2 dt the error function. Using the Ehrenfest theorem (2.22) and taking the double-time integral, we obtain a change of complexity of the form Asymptotically, we find Similarly, the ∂ t C(t; β) → 0 when t → ∞ and then C(t; β) saturates at late times. We compare the spectral complexity C(t; β) to the change of the numerical Krylov complexity ∆K(t; β) = K(t; β) − K(0; β) in the right panel of Fig. 4. They have the same quadratic-to-linear growth behavior, but asymptote to different saturation values at late times. The initial value of Krylov complexity K(0; β) also grows with the imaginary time β, as shown in the left panel of Fig. 4. for small n, and the saturation is due to the discreteness of the spectrum. In fact, in Sec. 5, we demonstrate that the linear growth and saturation of the Krylov state complexity remain the same for non-chaotic spectra with uncorrelated levels.

Krylov complexity and chaos at late times
The authors of Ref. [1] observed that the peak in Krylov complexity is sensitive to chaos. In this section, we focus on the late-time behavior of the transition probability |ϕ n (β + it)| 2 and Krylov complexity K(t; β). In the transition probability, we discover a universal rise-slope-ramp-plateau behavior that includes a long ramp related to chaos. We find that the long ramp in the transition probability is responsible for the peak in the Krylov complexity.
Our analysis of the late-time behavior of the transition probability and the Krylov complexity in the GUE is organized as follows. In Sec. 4.1, we calculate the transition probability and Krylov complexity numerically and discuss their typical behavior. In Sec. 4.2, we examine their fluctuations and give a first look at the ramp and the saturation. In Sec. 4.3, we calculate the Krylov complexity in the continuum limit without taking the correlation between the spectral density and the polynomials of the Krylov basis into account. A part of this correlation is then included into the analysis in App. C. In Sec. 4.4, we examine the relationship between the ramp in the transition probability and the peak in the Krylov complexity.

Numerical simulation
The Krylov complexity (2.55) in the ensemble average may be written as (4.1) To study the chaotic dynamic from the Krylov space, we further examine the transition probability |ϕ n (β + it)| 2 for the ensemble average of the GUE. This quantity records the square of the amplitude of the wave function in the n-th Krylov basis. In addition, we examine its disconnected part |⟨ϕ n (β + it)⟩| 2 and connected part |ϕ n (β + it)| 2 conn. = |ϕ n (β + it)| 2 − |⟨ϕ n (β + it)⟩| 2 . The former measures the square of the average amplitude of the wave function in the ensemble, and the latter measures the correlation between the fluctuation of the wave function over the ensemble. Other observables in the Krylov space, such as the quantity in (2.11), can be split into their disconnected and connected parts in a similar manner. Our numerical simulation focuses on the case of β = 0 as the peak of the Krylov complexity is already significant at this value.
In Fig. 6, we show numerical snapshots of the probability wave |ϕ n (it)| 2 at different times in the ensemble average of the GUE with a large L and a huge number of realizations. We observe a shock wave with a long tail propagating forward and decaying during the evolution. Notably, it decays quickly when it reaches the end of the Krylov chain at t ≈ L. The reflected wave is completely broken up. Once the shock wave passes by the Krylov chain, the global evolution of the probability wave is governed by diffusion. In particular, after t ≈ L, the probability wave smoothly diffuses to a plateau value 1/L, where the probability ramps up for n ≲ L/2, and ramps down for n ≳ L/2.
In Fig. 7, we show the numerical time-evolution of the transition probability in the ensemble average |ϕ n (it)| 2 , its disconnected part |⟨ϕ n (it)⟩| 2 , and its connected part |ϕ n (it)| 2 conn. for various values of n. We observe a rise-slope-ramp-plateau behavior with a ramp-up for n ≲ L/2 and a ramp-down for n ≳ L/2. This behavior has different features for different scales of n compared to L. Specifically: n ≪ L The rise-slope is short and is mainly contributed by the disconnected part.
The ramp starts from a small step that depends on n, linearly increases, and then gradually slows down. The ramp is contributed by the connected part always, and gradually stops at a plateau time near 2L.
n ≲ L/2 The rise-slope becomes longer and is partially contributed by the connected part. The ramp starts from an obvious step below the plateau, linearly increases with a smaller rate, and then gradually slows down. The plateau time is still near 2L.
L/2 ≲ n < L The rise-slope is long and is mainly contributed by the connected part. The ramp starts from a step larger than the plateau, linearly decreases, and then gradually slows down. The plateau time is obviously greater than 2L.
In Fig. 8, we show the numerical time-evolution of the Krylov complexity in the ensemble average ⟨K(t; 0)⟩, as well as its disconnected part n n |⟨ϕ n (it)⟩| 2 , and its connected part n n |ϕ n (it)| 2 conn. . The peak is partially contributed by both the disconnected part and connected part. with m ≲ L/2 will exhibit ramps after the dip time of |ϕ m | 2 . For example, the m-site probability P (m) defined as ramps up after its slope in chaotic systems, as shown in Fig. 9.
In the following subsections, we will provide an analytical explanation for these numerical results from the perspective of chaos.

A first look at late times
To explain the behavior of the transition probability, we first discuss the late-time plateau, and then the linear ramp appearing prior to the plateau.
Let us consider the behavior once the plateau time is reached. Recall the plateau value of the transition probability and the saturation value of Krylov complexity in (2.56). To further estimate their fluctuations a long time after the plateau time t p , we consider the long-time average of the probability correlation between sites m and n with time lag dt, where the subscript in (· · · ) ∞ denotes the long-time average defined in (2.26). For the first step in (4.3), we use the property that E p − E q = E p ′ − E q ′ holds only when   p = q, p ′ = q ′ or p = p ′ , q = q ′ in a chaotic spectrum [18]. The final result in (4.3) is proportional to the transition probability between sites m and n. From the characteristic curves (2.47), we expect that it reaches a peak when dt = ± |y(ϵm) − y(ϵn)|, with y(x) the coordinate in (2.46). When β = dt = 0, the correlation (4.3) is simply L −2 δ mn . Thus the local fluctuations take the form which is of the same order as [(|ϕ n | 2 ) ∞ ] 2 in (2.56). From (4.3), we may easily calculate the fluctuation of Krylov complexity with β = 0 after saturation, So the relative fluctuation is (∆K) ∞ /K ∞ ≈ 2/ √ 3L, which is negligible for large L. We conclude that at β = 0, the transition probability is not self-averaging after saturation, similarly to the SFF [16]. On the other hand, we note that the Krylov complexity is indeed self-averaging after saturation, similar to the spectral complexity [110]. So the relative fluctuation of the transition probability is stronger than the relative fluctuation of the Krylov complexity.
We move on to discussing the ramp before the plateau time. It is well known that the SFF, which is proportional to the survival probability, has a linear ramp. We expect that, due to the spectral rigidity, a linear ramp also appears in the transition probability |ϕ n (β + it)| 2 for n ̸ = 0 . For a first look at the linear ramp in the transition probability, we consider the two-point function (3.12) in the GUE in the box approximation [17,19] ⟨ρ(E 1 )ρ(E 2 )⟩ = ⟨ρ(E 1 )⟩ ⟨ρ(E 2 )⟩ + ⟨ρ(E)⟩ δ(s) − sin 2 (Ls) πLs 2 , (4.6) where we have sent E 1,2 → E in the connected part as the sine kernel is localized at s = 0. Moreover, we also sent ⟨ρ(E)⟩ → ⟨ρ(0)⟩ = L/π in the sine kernel, and have given an appropriate normalization to the sine kernel, such that the two-point function reduces to the one-point function ⟨ρ(E 1 )⟩ if E 2 is integrated out. From (4.7a) In (4.7c), we replace ψ(E 1 )ψ(E 2 ) → ψ(E) 2 by considering the that the sine kernel behaves as a narrow peak at s = 0 with width 1/L and the polynomial ψ(E) with low degree n is relatively smooth compared with the sine kernel if n ≪ L and then the energy difference s in ψ(E 1 )ψ(E 2 ) is negligible. When β = 0, the prefactor in (4.7d) is reduced to 1 because of (2.2). So, at β = 0, the connected part of the transition probability exhibits a ramp-to-plateau behavior, with the ramp t/(2L 2 ), the plateau values 1/L, and the plateau time 2L, which approximately matches the numerical result for small n in Fig. 7. In App. C.2, we show that the transition probability in a general Gaussian ensemble also exhibits a ramp. As we will show in Sec. 5, the transition probability in a non-chaotic system have short ramp regions and different plateau times. So, the above ramp-to-plateau behavior characterizes the chaos in the Krylov space. Note however that in (4.7), we neglected two effects: • When ψ(E) is a n-degree polynomial with n ≲ L, the high-frequency oscillation prevent us from neglecting the energy different s in ψ(E 1 )ψ(E 2 ).
• When the given polynomial ψ(E) is taken as the polynomial ψ n (E) determined by the Lanczos coefficients, which are in principle determined by the spectrum {E p } in a complicated way. The statistical correlation between the two polynomials and the two spectral density could also contribute to the connected part of the transition probability.
The first effect will leads to a n-dependent time shifting in the transition probability such that the ramp does not start from zero for finite n, as given in (C.38) in App. C.5.
The second effect is more complicated. In most of the literature, the statistical correlations between the two polynomials ψ n (E 1 )ψ n (E 2 ) and the density densities ρ(E 1 )ρ(E 2 ) in the ensemble average of the transition probability are neglected. Although we are unable to deal with all those complicated correlation in this paper, we still consider a most significant effect of these correlations step by step: In App. C.3, we will include an obvious effect of the correlation between the polynomial and spectral density, called the confinement of polynomials. In App. C.4, we will argue the approximated expressions of the reduced spectral density and polynomials after taking the effect of the confinement of polynomials into account. In App. C.5, we will find an improved expression of the transition probability with a ramp-to-plateau behavior in its connected part, as expected.
In the following subsections, we will first ignore the statistical correlation of the second effect and calculate the Krylov complexity in the first-order formalism of the continuum limit. The Krylov complexity already exhibits a peak as a result of the ramp in the transition probability at this level.

Krylov complexity in the continuum limit
In this subsection, we derive the complexity in the first-order formalism of the continuum limit, while neglecting the statistical correlation between the polynomial ψ n (E) and the spectrum {E p }. According to [92], and using the solution (2.33), we may define a complexity operator on the energy basis in the continuum limit 1 . However, in general, the polynomial in the continuum limit is not normalized to 1, meaning that (1/L) dE |Ψ(E, x)| 2 ρ(E) = 1/b(x) and (1/ϵL) dx |Ψ(E, x)| 2 = dx/(ϵLb(x)). Therefore, we should introduce a normalization factor b(x) to each Ψ(E, x) before integrating over x. We obtain a complexity operator and total probability on the energy basis in the continuum limit, where we have taken the symmetric part under exchanging E 1 ↔ E 2 . Taking the limit value of the Lanczos coefficients (3.4), we get Then we get the complexity operator and the total probability in the continuum limit J cont. (s) = ϵ s 2 L 2 s 2 + 6 sinc 2 Ls 2 − 6 , (4.11) where we may set the overall factor as ϵ = 1/L such that P cont. where the L/2 will be the saturation value of the Krylov complexity at β = 0, as a consequence of our normalization. From (2.55), the expectation value of the Krylov complexity in the continuum limit is With the two-point function in the GUE (3.12), we can calculate them in the ensemble average  will use the box approximation (4.6) by replacing ρ sc (E) → ρ sc (0) in the sine kernel. The final result at β = 0 is whose configuration is plotted in Fig. 10. ⟨K cont. (t; 0)⟩ reaches a peak of value 0.75L at time t ≈ 0.7L and nearly saturates to 0.5L when t > L. We further separate the contribution of the disconnected and connected parts of two-point function in the integrand of (4.17) and show them in Fig. 10. We see that the linear growth as well as the peak are mainly contributed by the disconnected part and the saturation is contributed by the connected part in this continuum limit. While the peak in the numeric Krylov complexity is partially contributed by the connected part, as shown in Fig. 8. The reason for discrepancy is that we neglect the fluctuation in the polynomial ψ n (E) coming from the fluctuation on levels. Those fluctuation transform some contribution from disconnected part to the connected part. We will come to this point in App. C.5.

The peak of Krylov complexity
We can now explain the peak observed in the Krylov complexity, which appears slightly before the plateau time t p and much later than the dip time t d for small n. The transition probability for all small n is in the ramp region, while the transition probability for all large n is in the rise-slope region. Considering the weight n in the Krylov complexity, the former is negligible while the latter dominates. Thus, the observed peak in the Krylov complexity is essentially the weighted combination of peaks in the rise-slope region of the transition probability at large n.
More precisely, we consider the saturated case where the shock wave reaches the n-th site at time t, and the transition probability before the n-th site is in its ramp region, i.e., |ϕ m (it)| 2 ≳ t/(2L 2 ) with 0 ≤ m < n. Due to probability conservation, we have |ϕ n (it)| 2 ≲ 1 − tn/(2L 2 ) at β = 0, where the total probability is 1. The shock wave travels according to (3.5). Then, the Krylov complexity is bounded above by which has a peak at around t = 0.62L. We can also explain why the linear growth of Krylov complexity is not a result of chaos before its peak time. The dip times of the transition probabilities for different n are mismatched, so the rise-slope of the transition probability in the shock wave always covers the ramps of other transition probabilities.

Non-chaotic spectrum
As a comparison, following the example given in [69], we can consider a uncorrelated spectrum whose density satisfies the same semicircle law (3.8). Technically, we randomly sample the energies individually from the spectra in the GUE with dimension L. We calculate the Lanczos coefficients, transition probability, and Krylov complexity numerically, as shown in Fig. 11. The m-site probability is shown in Fig. 9. The Lanczos coefficients approach the limit value (3.4) in ensemble average in consistence with the semi-circle law and (3.6). But they have strong fluctuation especially at large n.
To study the behavior of transition probability, we consider the two-point function of uncorrelated spectrum where the spectral density in ensemble average obeys the semi-circle law ⟨ρ(E)⟩ = ρ sc (E). Similarly, considering the confinement and then decorrelating the polynomials and the spectral density, we obtain the SFF and the transition probability respectively The first terms will give a plateau and there is no ramp. Since the Lanczos coefficients have strong fluctuations, the wave function ϕ n (it), which is initially localized at n = 0, quickly diffuses as it propagates, as observed in [1] as well. Thus, when n is not too large, the transition probability |ϕ n (it)| 2 exhibits rise-slope-plateau behaviors without any ramp, as shown in Fig. 11. The plateau value is universally 1/L, but the plateau time, i.e., the time when its slope become smaller than 1/L, highly depends on n and is much earlier than 2L. When n is increasing and approaching L, the peak between the rise and the slope decays quickly and even ceases to exist. Finally, when the transition probability only gradually rises and approaches the plateau 1/L from below. The decay or even absence of the peak in the transition probability for n ≲ L is the result of probability conservation. Consider the time when |ϕ n (it)| 2 is rising, due to the absence of a ramp, all the |ϕ m (it)| 2 with m ≪ n have entered their plateau regions and all the |ϕ m (it)| 2 with m ≲ n have entered their slope region. In both cases, |ϕ m (it)| 2 ≥ 1/L. From (2.18), we get |ϕ n (it)| 2 < 1 − n/L, which decreases with increasing n and approaches its plateau value 1/L when n ≲ L.
Due to the absence of a ramp in the transition probability, we will not find any ramp in m-site observable (4.2), as shown in Fig. 9. Following the strong fluctuations in the Lanczos coefficients, the polynomials ψ n (E p ; {E}) will get a stronger fluctuation at larger n via the spectrum {E}. So we observe that the connected parts of the transition probability |ϕ n | 2 conn. become dominated at large n. Since the Lanczos coefficients are still around the limit value (3.4). By plugging the SFF (5.2) into the Ehrenfest theorem (2.22) and doubly integrating over the time, we obtain the Krylov complexity at the early time. In particular at β = 0, which agrees with the numerical result displayed in Fig. 11. This linear growth has the same rate as the growth in RMT at early times. We thus confirm that, when the reference state is the maximally entangled state, the linear growth of Krylov state complexity does not characterize chaos in the spectrum 2 . However, at late times, the Krylov complexity for a non-chaotic spectrum gradually slows down and reaches a plateau from below at t ≳ L, without going through a peak (see Fig. 11), which is in contrast to the Krylov complexity in the GUE at late times (see Fig. 8). The absence of the peak in the Krylov complexity is due to the absence of the ramp in the transition probability. Recall that |ϕ n (it)| 2 < 1 − n/L due to the early plateau time and probability conservation, which bounds the complexity contributions from probability at large n. To describe this in detail, let us consider the saturated case, in which the shock reaches the n-th site at time t, i.e. |ϕ n (it)| 2 = 1 − n/L and |ϕ m (it)| 2 = 1/L with 0 ≤ m < n. With shock wave (3.5), the Krylov complexity is bounded from above by We distinguish chaos in spectrum from chaos characterized by the exponentially decay of the OTOC, which is related to operators and locality [18,43].

The SYK model
Here we show that the universal rise-slope-ramp-plateau behavior also appears in the SYK model with quartic or higher fermion interactions. The SYK q model describes N Majorana fermions with q-fermion random coupling, namely and anti-commutation relation {χ j , χ k } = δ jk . We specify the value of J so that the first Lanczos coefficient is b 1 = 1. The SYK q model with q ≥ 4 has the same level spacing distribution as the RMT [12,13,17]. The particle-hole symmetry determines the class of RMT statistics (GOE, GUE, and GSE) of each charge parity sector. For the GUE, we will consider N mod 8 = 2 or 6, where the spectrum consists of even and odd parity sectors which are mapped to each other by the particle-hole symmetry. Each sector corresponds to the GUE with dimension L = 2 N/2−1 .
The density of state in the triple-scaled limit N ≫ q 2 ≫ 1 of the SYK model is [17] ⟨ρ(E)⟩ ∼ J −1 e S 0 sinh π 2N (E − E 0 )/(q 2 J ) with E 0 the ground state energy and S 0 the zero temperature entropy. Due to the scaled limit, only the lower edge of the spectrum is accessible. So a finite temperature is introduced to regularize the SFF, which is similar to Sec. 3.2.2. The SFF of the SYK q model with the GUE statistic follows the integral (3.13) with the above spectral density. So the normalized SFF |Z(β + it)| 2 / ⟨Z(β)⟩ 2 , i.e. the normalized survival probability, exhibits a long ramp ∼ te −2S(β) between the dip time t d ≲ e S 0 /2 and the plateau time t p ∼ e S(β) , where S(β) is the thermal entropy at the inverse temperature β [17]. We would leave the analytical calculation of the transition probability and Krylov complexity in the future. Here we will numerically study them in the cases of q = 4, 2.
In the SYK 4 model, the spectral density is Gaussian law rather than semi-circle law [11,13]. We show the Lanczos coefficients, Krylov complexity, and transition probability at β = 0 in Fig. 12. The Lanczos coefficients b n decrease linearly for n ≪ L and decrease like ∼ 1 − n/L for n ≲ L. As (3.9) no longer holds, we cannot simply apply the Ehrenfest theorem as we did with (3.10). Surprisingly, from the numerical Lanczos coefficients, we observe a modification of (3.9) as b 2 n+1 − b 2 n ≈ Cδ n0 + An + B with A ∼ 1/L 2 , B ∼ −1/L and C ∼ 1, which has an "anomaly" Cδ n0 beyond the "complexity algebra" in [68,79]. Plugging it into the Ehrenfest theorem (2.22), we get For β = 0, since Z(it) decreases from D and K(t; β) ≪ L in the early time, the first term dominates (6.2) and leads to a linear growth of order t in the Krylov complexity. In Fig. 12, the transition probability exhibits the rise-slope-ramp-plateau behavior with ramp ∼ t/(2L 2 ), a common plateau time t p ≈ 2L and a common plateau value 1/L. The connected contributions to the rise-slope are more obvious. Due to the existence of a ramp in the transition probability, the Krylov complexity has a peak after its linear growth and before the plateau time. The SYK 2 model is a many-body model of N/2 free Dirac fermions filling random matrix single-particle energy levels. The model is integrable. The many-body spectral density satisfies a Gaussian law. The levels have weak correlation and repulsion. The SFF has a short and exponential ramp compared to the dimension of Krylov space L = 2 N/2 , namely [112] We show the Lanczos coefficients, Krylov complexity, and transition probability at β = 0 in Fig. 13 The Lanczos coefficients b n decrease faster than linearly for n ≪ L and a n = 0 due to the exact reversal symmetry of the spectrum. The transition probability exhibits the rise-slope-ramp-plateau behavior with a negligible ramp at finite n, a n-dependent plateau time t p and a common plateau value 1/L. Due to the absent of a long ramp in the transition probability, the Krylov complexity does not have a peak.

Conclusion and outlook
We have studied the Krylov complexity in random matrix theory by mapping it to an effective Krylov spin chain model. Under the reasonable assumption of neglecting the statistical correlation between wavefunction and spectra, we find that the Krylov complexity satisfies an Ehrenfest theorem, in which the spectral form factor serves as a force to drive the complexity to grow. For random matrix theory, this also enables us to derive an analytical approximation of Krylov complexity, in terms of spectral complexity (3.11). The analytical expression matches well with numerical calculations up to the linear growth region, while its asymptotic late-time value deviates from the actual saturation value. We attribute this deviation to the fact that neglecting the fluctuation of Lanczos coefficient is not valid at late times, as revealed by Fig. 7. This deviation also shows the fundamental difference between the saturation of Krylov complexity and the saturation of spectral complexity at late times: the former relies on the discreteness of the spectrum (2.56), while the latter relies on level rigidity. For early times, our generalized version of the Ehrenfest theorem is valid for the linear growth region of both chaotic and non-chaotic systems. This implies that the linear growth of complexity is not sufficient to discriminate chaotic and non-chaotic systems.
This further motivated us to study the quantum dynamics in the Krylov basis in general. We find that the transition amplitude, properly defined in a two-copy Hilbert space, shows a universal rise-slope-ramp-plateau behavior as function of time. The linear ramp behavior is a robust indicator for chaotic systems, similarly to the spectral form factor. Therefore, unlike global observables that span over all of the Krylov basis, such as the Krylov complexity, any observable defined in the subspace of the Krylov space can reveal the linear ramp behavior that is unique for chaotic systems. Our study of the transition probability also explains the existence (absence) of the peak in the Krylov complexity in chaotic (non-chaotic) systems discussed in [1]. Our analysis thus reveals a close relation between chaotic spectra and the Krylov state complexity.
In this context, there remain some open questions for the future. First, the saturation value predicted by the Ehrenfest theorem is not accurate, and the error stems from the approximation that we neglect the fluctuations of the Lanczos coefficients in Eqs. (3.4) and (3.10). It will be interesting to improve the approximation by including more statistical correlations between the Lanczos coefficients and the Krylov wavefunction. A similar improvement can be obtained for the Krylov wavefunction ϕ n at chain sites with large n ∼ L.
Second, in addition to the Ehrenfest theorem presented in this paper, further theorems in quantum mechanics may provide useful approximations or constraints for observables on the Krylov chain. For instance, the Robertson uncertainty relation provides a dispersion bound for Krylov complexity [79]. It will be interesting to investigate this bound in each region of Krylov state complexity discussed in this paper.
Third, for n ≳ L/2, the statistical correlation between the two polynomials in the transition probability makes the ramp-down behavior at late times difficult to analyze. In the numerical simulation, we observe that the late-time evolution of the transition probability is governed by diffusion rather than propagation. It would be more straightforward to describe the ramp-down behavior by writing down a diffusion equation for the Krylov chain.
Fourth, going beyond random matrix ensembles, it would be interesting to study Krylov state complexity in the SYK model [113][114][115] and in holographic models such as JT gravity [14,110], as well as in lattice models or spin models to benchmark the universal behavior studied in our paper.
Finally, let us highlight the fact that (2.55) provides an analytical expression for the Krylov complexity for the TFD state. This may serve as a starting point for constructing the gravity dual of Krylov complexity [116]. As we show in Sec. 4, the connected part of the Krylov complexity plays an important role for its peak and saturation structure at time scales comparable to the dimension of the Hilbert space. This may be relevant in particular for describing the late-time evolution of black holes, where higher genus effects should be taken into account at finite N in holography [110].

A Imaginary time evolution
For imaginary time evolution, we consider the following continuum limit with a small spacing ϵ such that Eq. (2.15) becomes In the continuum limit, the Ehrenfest theorem (2.22) becomes where the last two terms in the integrand may be neglected in the small ϵ limit. To further write this into a typical Schrödinger equation in the continuous coordinates, we introduce a new wave functionφ(τ, y) = b(x) 1/4 φ(τ, x) on coordinate y = y(x) with dy = dx/(ϵ √ b), and arrive at the following Schrödinger equation with the potential We can read out another Ehrenfest theorem on the coordinate y and real time t = Imτ , However, the Krylov complexity ⟨xL⟩ is not directly related to ⟨y⟩.

B Spectral complexity for Gaussian ensembles
We may generalize the calculation of the spectral complexity in Sec. 3.2 to the GOE and GSE. Instead of integrating the SFF over time, we directly use the spectral complexity (3.11) and replace the two-point function (3.12) with [19] ⟨ρ( and ⟨ρ(E)⟩ is taken as the semi-circle law (3.8). Similarly to the case of JT gravity [110], the contact term in (B.1) will not contribute to the complexity. In all of the three Gaussian ensembles, the spectral complexity is always written as with an ensemble-dependent kernel The asymptotic behaviors of the kernel functionk(x) arẽ where d = 1, 1, 2 for the GOE, GUE, and GSE respectively and f (x) is a polynomial of finite degree. The integrand is therefore finite at s = 0. For simplicity, we will focus on the β = 0 case. We will split the spectral complexity into the disconnected part and the connected part, namely, C(t) = C(t) disc. + C(t) conn. . Obviously, the 1 term and thek(X) term in the integral factor 1 −k (X) respectively contribute to the disconnected part C(t) disc. and the connected part C(t) conn. . The disconnected part is universal for those three ensembles, namely, which exhibits a quadratic-to-linear growth in order L 0 . The connected part is highly dependent on the kernel. Since the kernelk(π ⟨ρ(E)⟩ s) is a narrow function of s with width 1/ ⟨ρ(E)⟩ ∼ 1/L, we can replace ⟨ρ(E 1 )⟩ ⟨ρ(E 2 )⟩ → ⟨ρ(E)⟩ 2 and extend the domain of integral w.r.t. s to the real axes. By integrating out s, we get the connected part Only for the GUE, we can perform the integral w.r.t. E analytically and reproduce (3.15) in the main text. For other ensembles, we have to compute it numerically. The results of spectral complexity are shown in Fig. 5. At the early time t ≪ L, as g(u) = −3u 2 + O(u 3 ), the connected part scales as C(t) conn. ∼ t 2 /L and is negligible compared to C(t) disc. , as discussed in Sec. 3.2.
At the late time t ≳ L, the growth of the spectral complexity will slow down, stop, or even rebound in different ensembles, due to the cancellation between the disconnected part and connected part, as explained in [110] and the following text. Let's first assume thatk(x) is holomorphic function of x. Based on the asymptotic behavior (B.7), when t > 2dL ≥ 2πd ⟨ρ(E)⟩, the integrand e its (1 −k(π ⟨ρ(E)⟩ s)) is analytic on the upper half-plane and the real axis of s. So we can close the contour of integral ds by using the arc at complex infinity in the upper-half plane, such that the integral of those terms vanishes and the spectral complexity saturates to a time-independent value given by the other terms. However, for the GOE, the sgnx ink(x) is not analytic. As a result, the C(t) for the GUE approaches log t instead of a constant.

C.1 Formal Krylov approach in ensemble average
Here we give the formal expressions of the transition probability and the Krylov complexity in the ensemble average of Gaussian ensembles. As explained in the main text, to construct the Krylov space of the maximally entangled state |0⟩, we can consider non-degenerate spectrum {E p } 0≤p≤L−1 . The Krylov space is spanned by the states L n |0⟩ for 0 ≤ n ≤ L − 1. The components of the n-th state on the energy basis of L are the n-th rows of the Vandermonde matrix V = E n p 0≤n,p≤L−1 (C.1) times 1/ √ L. So the Krylov basis |O n ⟩ is obtained from the orthogonalization of V, namely where C is a lower-triangular matrix and I is the identity matrix. From the moments and Hankel matrix one can also derive the orthogonal polynomials by If the spectrum is symmetric, one can also divide the Lanczos coefficients from the Hankel determinant The transition probability and Krylov complexity are respectively where the evolution matrix and complexity matrix are respectively In an ensemble average, they become Forβ-ensemble, the measure of the spectrum is which is normalized to 1 and we have rescaled the potential such that the domain of spectrum is [−2, 2] in expectation. We will work at the GUE (β = 2) for simplicity. It can be solved with the oscillator wave functions where H n (x) is the Hermite polynomial. By using the kernel one can write down the one point function and two point function In the large L limit, they become the semi-circle law and sine kernel Since ψ n (E p ) is a complicated function of the spectrum of degree n, the ensemble average of probability |ϕ n | 2 is a complicated summation of maximally 2n-point functions and thus difficult to solve analytically. So we will approximate ψ n (E p ) by the polynomials ψ n (E) given by the limit value (3.4) and take its confinement into account.

C.2 General linear ramp from Gaussian ensemble
Similar to the SFF, the ramp in the transition probability does not rely on the specific sine kernel. Here, we work at β = 0 for simplicity. Since the ψ n (E) in |ϕ n | 2 is a n-degree polynomial of E, we can consider the generating function in the Gaussian ensemble average ⟨· · · ⟩, namely (C.14) where S[ρ] is an effective action of spectral density ρ from Gaussian ensemble. The saddle point of S is the semi-circle ρ s . Considering fluctuation ρ = ρ s + δρ and expanding S around its saddle, one finds the quadratic term [18]  By replacing t 1 → t + t 1 and t 2 → t − t 2 and letting |t 1,2 | ≪ t, we obtain a linear ramp on t in the connected part of the generating function Thus, by taking the derivative of (C.19) with respect to t 1,2 , we will obtain a linear ramp.

C.3 Confinement of polynomials
The transition probability in ensemble average is defined as where ψ n (E p ; {E}) is a polynomial of its first argument E p and the polynomial is determined by the orthogonality relation (2.2), or equivalently, the Lanczos algorithm (2.6)(2.8). Since the orthogonalization is determined by the whole spectrum {E} = {E 0 , E 1 , · · · , E L−1 }, where we have shown the dependence on {E} explicitly, the ensemble average of the transition probability (C.20) essentially relies on the multipoint (more than two points) function of the spectral density. In principle, the orthogonalization before the ensemble average is different from the orthogonalization after the ensemble average. In other words, the former case corresponds to applying the Lanczos algorithm (2.6) for each realization of ensemble and taking their average finally, which is what we are simulating in Sec. 4.1. The latter case corresponds to a single Lanczos algorithm (2.6) for the ensemble average of the inner product.
In this paper, we are not able to solve (C.20) exactly. Instead, in this subsection, we try to develop an approximate method by including the a significant effect in the level correlation between the argument E p and the spectrum {E} in the polynomial ψ n (E p ; {E}), called the confinement of polynomials.
Instead of thinking ψ n (E ′ ; {E}) as a function of a continuous energy E ′ , we only have to consider the its dependence on the discrete levels E ′ = E p ∈ {E} with p = 0, 1, · · · , L−1. Recall that p n (E ′ ; {E}) = det(E ′ − L (n) ) from (2.8). So p n (E p ; {E}) automatically vanishes on the spectrum of L (n) . Then we can reduce the summation where the power is taken as j = 1, 2, the (L−n) p is taken on the spectrum of R (L−n) , and f (E p ) is an arbitrary function of E p .
After the confinement of polynomials is taken into account, the reduced summation (C.24) in the ensemble average given in (C.9) is At the last step, we consider the average separately by neglecting the correlation between each term and define ρ (L−n) (E) and ⟨p n (E)⟩ as the reduced spectral density and the polynomials in ensemble average respectively. The ensemble average of the reduced one-point function is The ensemble average of the normalized polynomial ⟨ψ n (E ′ )⟩ is proportional to ⟨p n (E ′ )⟩. But the normalization is slightly subtle. Due to the trick of setting b n = 0 and replacing L byL, the original norm n m=1 b 2 m vanishes for each realization. We have to renormalize the polynomial in the ensemble average with the reduced spectral density ρ (L−n) (E) . By considering the confinement and the renormalization of polynomials, we take the ensemble average as Finally, a physical consequence of the confinement property discussed above is the decrease of entanglement in the Krylov basis from the maximally entangled state. Given a state |O n ⟩ in the Krylov basis, we expand it on the equalenergy states as |O n ⟩ = (1/ √ L) L−1 p=0 ψ n (E p ) |E p , E p ⟩, take the partial trace on H R , obtain a reduced density matrix in H L , and calculate the α-Renyi entropy S α = log( L−1 p=0 |ψ n (E p )| 2α /L) /(1 − α). We display the α-Renyi entropies for all the states in the Krylov basis in the GUE in Fig. 15. Recall that |O 0 ⟩ is the maximally entangled state, whose α-Renyi entropy is S α = log L, ∀α. When n ≫ 1, after the recurrence (2.9) with random levels, the resulting Krylov state |O n ⟩ has a nonuniform and random component on the equal-energy basis |E p , E p ⟩ with E p ∈ [−2 1 − n/L, 2 1 − n/L]. This confinement of the spectrum leads to the decrease of entanglement for increasing n. In particular, when 1 ≪ n ≪ L, the dimension of the subspace spanned by the equal-energy basis within [−2 1 − n/L, 2 1 − n/L] is exponentially large. So, first the entanglement decreases slowly. When n approaches L, the dimension of the subspace shrinks quickly and the entanglement decreases quickly as well.

C.4 Approximate the reduced spectral density and polynomials
We are constructing the reduced spectral density ρ (L−n) (E) and the polynomials ⟨ψ n (E)⟩ in the continuum limit.

Reduced spectral density
The reduced spectral density ρ (L−n) (E) is determined by the spectrum of R (L−n) . We consider the limit value (3.4) for the right most Krylov chain of length L − n. From (3.7), it turns out to be a semicircle law as well where we have introduced an additional normalization factor 1/(1 − n/L) so that the reduced spectral density is normalized to L. This normalization will lead to the normalization (C.27) with a proper polynomial ⟨ψ n (E)⟩. In the upper-left panel of Fig. 16, we compare (C.29) to the numerical result of the levels statistic in R (L−n) , which match very well.

Polynomials in the continuum limit
We will construct the polynomials in the following two ways. First, we calculate the polynomial ⟨ψ n (E)⟩ in the continuum limit in the second-order formalism in Sec. (2.3). We consider ϵ = 1/L, finite n/L, and large L. By inserting the limit value which coincides with the real part and imaginary part of (4.10). The frequency L(1 − 1 − n/L) in (C.30) is just the coordinate y corresponding to n in (2.46). The transformation (2.14) from E to t on (C.30) gives a wave function ϕ n (it). Since (C. 30) is not a polynomial with finite degree and only contains two frequencies ±L(1 − 1 − n/L), the wave function is localized around the characteristic curve t = ±L(1− 1 − n/L) in (2.47), which is different from the numerical result in Fig. 6, where the waves have long tails. Such difference in the tails will lead to a wrong slope behavior in the disconnected part of the transition probability. So we will only adopt it in the connected part of the transition probability.
Whatever, we find that ψ cont.

Polynomials from the Chebyshev polynomials
We can alternatively construct the polynomial ψ n (E) by deforming the Chebyshev polynomial of the second kind U n (E/2), which satisfies the orthogonality relation (2.2) with respect to the measure given by the semi-circle law (3.8). However, we show that U n (E/2) are the not the polynomials we need since they are given by constant Lanczos coefficients a n = 0, b n = 1 (see (D.3) in App. D). Here, the Lanczos coefficients are those given in (3.4). So, compared to the polynomials given by the Lanczos coefficients (3.4) with a plateau for n ≪ L and a descent for n ≲ L, the Chebyshev polynomials U n (E/2) work for n ≪ L only.
We may modify the Chebyshev polynomials based on the above discussion of the matrixL. To this end, we approximate the sub-matrix L (n) by replacing the Lanczos coefficients with their square mean value. For the limit value (3.4), we set a m → 0 and b m →b n , ∀m ∈ [0, n − 1] in L (n) . We may rescale the energy E → E/b n in the Chebyshev polynomial U n (E/(2b n )) such that the original domain of Chebyshev polynomials is rescaled to the domain E ∈ [−2b n , 2b n ]. Recall that the spectral domain of R (L−n) is [−2 1 − n/L, 2 1 − n/L]. We will chooseb n = b n = 1 − n/L such that U n (E/(2 1 − n/L)) is approximately orthogonal to other U n ′ (E/(2 1 − n/L)) with n ′ ∼ n on the measure of ρ (L−n) sc (E). However, we immediately encounter the following problem. U n (E/(2 1 − n/L)) around E = 0 is proportional to exp(±iE(n + 1)/(2 1 − n/L)). Again, by transforming E to t, we find a shock wave propagating along t = ±(n + 1)/(2 1 − n/L) that is very different from the characteristic curve t = ±L(1 − 1 − n/L) found in the continuum limit for large n. To fix this problem, recalling that the spectral domain has been rescaled, we tune the degree of the Chebyshev polynomial to match the characteristic curve t = ±L(1 − 1 − n/L). The resulting polynomial is which will be confined in E ∈ [−2 1 − n/L, 2 1 − n/L]. The normalization coefficient h n is determined by (C.27). The degree d n approaches n for small n and 0 for n = L. We compare the polynomials (C.32) to the numerical result in Fig. 14.
By taking f (E p ) = δ(x − E p ) in (C.28), we get the distribution of E p with weight ψ n (E p ) j . We further compare the numerical statistics with the above weight |ψ n (E p )| j to functions ψ Cheb. Fig. 16. We see that he approximate polynomials have closed moving average and oscillation frequency even at n = L/2 but their amplitudes and the edges behaviors are different. The numeric statistics give weaker oscillation mainly due to two reasons: • The approximation on L byL will affect the levels slightly such that some levels from L (n) skip the zero points of ψ n (E p ) 2 .
• The fluctuation of the polynomials ψ n (E p ) will also shift the phase of the oscillation. The mismatch between the phases in different realizations will suppress the oscillation in average.
The numeric statistic has smooth edges while the approximation has cliffy edges, where the discrepancy is due to our hard cut in breaking L toL. The comparison in Fig. 16 shows that the normalization in (C.29) is a good approximation for the spectral density with weight |ψ n (E)| j .

C.5 More on chaos in transition probabilities
By considering the finite degree and the confinement of the polynomials, we improve the calculation of the transition probability for n ≲ L/2 as compared to (4.7). We should first discuss the confinement effect in for the two-point function of the reduced spectral density. The selection in (L−n) p will not change the correlation between levels. We consider the following two-point function We have introduced the improved sine kernel in [19] to include the correlation between E p , E q in the two delta functions. Since the sine kernel is localized in −1/L < E p − E q < 1/L, we assume that E p and E q belongs to the summation (L−n) p̸ =q or not simultaneously. So, only one reduced spectral density ρ (L−n) (Ē) is present with the sine kernel. If we integrate out E ′′ , we reproduce the one point function (C.26). We can also use the box approximation by replacing ρ(Ē) → ⟨ρ(0)⟩ in the sine kernel [19].
Similarly, by assuming the decorrelation between polynomials and spectral density after considering the confinement of ψ n (E p ), we approximate the transition prob-ability as At the last step, we neglect the spectral correlation between the polynomials and delta functions. Although the sine kernel is localized in −1/L < s < 1/L, we can not identify the E ′ and E ′′ in the two polynomials, since their product, as a function of s, oscillates quickly when n ≲ L.
We have dropped the complicated correlation between the spectra in polynomials, namely which we are unable to calculated. However, we will see its contribution to the transition probability numerically. Finally, we will approximate the ensemble averages in (C.34c) with the reduced spectral density ρ (L−n) sc (E) in (C.29) and the deformed Chebyshev polynomial ψ Cheb. n (E) in (C.32) or the polynomial in the continuum limit ψ cont. n (E) in (C.30). The transition probability is written as The first term of (C.36) contributes to the rise-slope. From our numerical results in Fig. 6, we expect a sharp shock wave and a long tail on the Krylov chain, which corresponds to a sharp rise and a long slope along the real-time axis. The continuum limit fails to describe the long slope since it is localized at a frequency. We will adopt the deformed Chebyshev polynomial (C.32) in the first term of (C.36). The first term is the absolute square of ⟨ϕ n (τ )⟩ ≈ 1 2π (1 − n/L) with τ = β + it and I n (x) the n-the modified Bessel function of the first kind. So the transition probability |ϕ n (it)| 2 rises as (1 − n/L) dn t 2dn /Γ(d n + 1) 2 initially, reaches its first peak around the time L(1 − 1 − n/L), and then decays as (d n + 1) 2 / [πt 3 (1 − n/L) 2 ] over time.
The second term of (C.36) contributes to the plateau. From the normalization (C.27), the plateau value is 1/L.
The third term of (C.36) contributes to the ramp-up behavior for n ≲ L/2. The product ⟨ψ n (E + s/2)⟩ ⟨ψ n (E − s/2)⟩ is an even and oscillating function of s with a frequency of order n. Under the Fourier transformation of the level difference s, the oscillation will shift the time t by some scales of n, which is not negligible when n is comparable to L. To estimate this effect analytically for n ≫ 1, we employ the polynomial in the continuum limit (C.30) and the reduced semicircle law (C.29) with a normalization factor of 2 1 − n/L from (C.31). To get a simple expression, we adopt the box approximation by sending ρ sc (E) → ρ sc (0) in the sine kernel. The sum of the second and third terms for even n at β = 0 is given by For odd n, we obtain the same result. The function is plotted in Fig. 17. The plateau value is always 1/L, and the time dependence can be understood by the movement of the two min terms for different n. Each of them, as a function of t, is given by an upside-down triangle centered on the characteristic curve t = ±L(1 − 1 − n/L), with a width of 4L. When n = 0, the two triangles, centered at t = 0, stack up to form a larger triangle, which corresponds to the ramp t/(2L 2 ) of survival probability before the plateau time t p = 2L. When n > 0, the two triangles move apart due to the oscillation of the polynomials on s. Then a step appears between the centers of the two triangles, whose height is (1− √ 1 − x)/(2L) and width is 2L(1− 1 − n/L).  The ramp behaves as t/(2L 2 ) when L(1 − 1 − n/L) < t < L(1 + 1 − n/L), and as t/(4L 2 ) when L(1 + 1 − n/L) < t < L(3 − 1 − n/L). Thus, the plateau time is t p = L(3 − 1 − n/L). The dip time, which is the time when the ramp surpasses the slope, is given by t d ≈ 4 2/π L(n + 1) for n ≲ L/2. Therefore, the ramp region of the transition probability |ϕ n | 2 lasts for (L − n)L long.
In Fig. 18, we compare the numerical results with the analytical results obtained from (C.37) and (C.38). The rise-slope behaviors match well until n approaches L, but the ramp-up behaviors obtained from (C.38) are smaller than the numerical results due to neglecting the fluctuations from the polynomials, as discussed in (C.35) of App. C.
For n ≳ L/2, we are unable to reproduce the behavior of the transition probability analytically, including the rise-slope and ramp-down. This is because the Lanczos coefficients exhibit strong fluctuations at large n ≳ L/2, as shown in Fig. 2. Consequently, the polynomial ψ n (E p ; {E}) receives strong fluctuations from the spectrum {E}, and the correlation between the two polynomials is not negligible. Such fluctuations in polynomials lead to strong fluctuating phenomena in the transition probability for n ≳ L/2 in both rise-slope behavior and ramp-down behavior.
As an evidence of the strong fluctuation, in Fig. 3, the transition probability in a single realization fluctuate strongly at large n. Also, in Fig. 7, the connected part nearly dominates the whole transition probability in the ensemble average, which implies strong fluctuations in each realization.
A possible explanation of the ramp-down behavior for n ≳ L/2 is as follows. The ramp-down behavior could be originated from the slope but with strong fluctuation from the polynomials. The fluctuation in the polynomials leads to phase fluctuation in the oscillation of the slope. For example, we can imagine a phase φ fluctuation in I dn+1 (2it 1 − n/L + φ) 2 of (C.37). In the ensemble average, the oscillations in the slope between different realization are cancelled with each other due to the phase fluctuation and end up with a moving average. The moving average of the slope behaves like the ramp down behavior in the numerical average.

D Constant Lanczos coefficients
Here we give a simple example for the application of our Ehrenfest theorem and show several typical features of Krylov complexity. We consider the case of constant Lanczos coefficients in the interior a n = 0, b n = 1, (D.1) and b 0 = b L = 0 at the chain endpoints. Without loss of generality, we work with dimensionless quantities, so that the spectrum {E p }, the Lanczos coefficients {a n , b n }, and time τ = β + it are dimensionless. The b 2 n+1 − b 2 n and a n+1 − a n terms in the Ehrenfest theorem (2.22) vanish except b 2 1 − b 2 0 = 1 and b 2 L − b 2 L−1 = −1. So the growth of Krylov complexity is driven by the probability at the two endpoints, namely, Eq. (D.1) effectively describes the flat Lanczos coefficients in a chaotic system at the n ≪ L limit. Since the wave function is initially localized at n = 0, the short time evolution of the wave function as well as the Krylov complexity in a general system is effectively described by the first term of Eq. (D.2).

D.2 Large dimension limit
In the large L limit, we can calculate the evolution of wave function and complexity analytically. The density of state and the measure are approximately The wave function is given by the integral ϕ n (τ ) ≈ 1 2π where I n is the n-th modified Bessel function of the first kind. ϕ n (τ ) becomes nonzero for all n in this large L limit. We can calculate the survival amplitude and Krylov complexity along imaginary time S(2β) ≈ I 1 (4β) 2β , (D.9) K(0; β) ≈ 2β (I 0 (2β) 2 + I 1 (2β) 2 ) I 1 (4β) − 1 ≈ 2 2β π − 1, when β ≫ 1, (D. 10) where we have used the recursion relations 2nI n (x) = x(I n−1 (x) − I n+1 (x)) and 2I ′ n (x) = I n−1 (x) + I n+1 (x). Along the real time, we use (D.2) and find where F is the generalized hypergeometric function and we show the linear growth at the last step. The above results show the typical square-to-linear-to-plateau growth of Krylov complexity along the real time at finite temperature. We compare these formulas to the numeric results in Fig. 19.

D.3 Finite dimension effects
We would like to study the behavior of the Krylov complexity near the plateau time. We will consider a large but finite L, such that β, t > L is accessible. For simplicity, we will adopt the continuous Schrödinger equation where k is originally bounded by the dimension of the Krylov space L, but we have released this constraint by consider the boundL instead and will sentL → ∞ finally for simplicity. We consider the initial stateφ(τ = 0) = δ(y − ϵ) locating beside the boundary with small distance ϵ. Its overlap to the eigenstate is ⟨E k |φ(0)⟩ = √ 2πkϵ/L 3/2 . So the state evolves as ke −E k τ sin πky L , (D.14) Since the Lanczos coefficients are flat, the Krylov complexity K is the expectation value of the position operator y. The complexity operator on the energy basis are Its expectation value is where τ = β + it, k ± = k ± l, and By taking the summation, one can get the value of Krylov complexity for general β and t.