Sparse random matrices and Gaussian ensembles with varying randomness

We study a system of N qubits with a random Hamiltonian obtained by drawing coupling constants from Gaussian distributions in various ways. This results in a rich class of systems which include the GUE and the fixed q SYK theories. Our motivation is to understand the system at large N. In practice most of our calculations are carried out using exact diagonalisation techniques (up to N = 24). Starting with the GUE, we study the resulting behaviour as the randomness is decreased. While in general the system goes from being chaotic to being more ordered as the randomness is decreased, the changes in various properties, including the density of states, the spectral form factor, the level statistics and out-of-time-ordered correlators, reveal interesting patterns. Subject to the limitations of our analysis which is mainly numerical, we find some evidence that the behaviour changes in an abrupt manner when the number of non-zero independent terms in the Hamiltonian is exponentially large in N. We also study the opposite limit of much reduced randomness obtained in a local version of the SYK model where the number of couplings scales linearly in N, and characterise its behaviour. Our investigation suggests that a more complete theoretical analysis of this class of systems will prove quite worthwhile.


Introduction and Motivation
Random Matrix theory (RMT) is well known to have interesting connections with two-dimensional gravity. See for review [1,2]. More recently an interesting and new version of this connection has been discovered in the study of two-dimensional Jackiw-Teitelboim (JT) [3,4] gravity. It has been shown that the key features of JT gravity are correctly reproduced by the low-energy limit of the Sachdev-Ye-Kitaev (SYK) model, which is a quantum mechanical model of N flavors of Majorana fermions with random couplings [5][6][7][8].
Furthermore it was shown in [9] that the partition function of JT gravity on surfaces with an arbitrary number of boundaries and handles is correctly reproduced by Random Matrix theory in a suitable double scaling limit. Since JT gravity arises quite universally as a description of the low-energy dynamics for a wide class of higher dimensional near extremal black holes [10,11], these fascinating results promise to hold general lessons for the study of quantum gravity and black holes in higher dimensions as well.
The Hamiltonian of the SYK model with a ψ q coupling (with q being an even number) is given by Here ψ i , i = 1, · · · N, are the N flavors of Majorana fermions and j i 1 i 2 ···iq are real random couplings drawn from a Gaussian ensemble with variance j i 1 i 2 ···iq , j j 1 j 2 ···jq = (q − 1)!J 2 N q−1 δ i 1 ,j 1 δ i 2 ,j 2 · · · δ iq,jq (1.2) We refer to this model as SYK q model. Quite remarkably, this model reproduces many aspects of JT gravity, as was mentioned above. In particular, the low-energy dynamics of the SYK model exhibits a characteristic pattern of symmetry breaking, tied to how time reparametrisation symmetry is realised. This gives rise to soft Goldstone-like modes whose dynamics is governed by the Schwarzian action. The behaviour of this action accounts for the thermodynamics and low-energy density of states and the resulting dynamics gives rise to Out-of-Time-Ordered Correlators (OTOCs) in the system saturating the chaos bound. These features -the pattern of symmetry breaking, the Schwarzian action for the time reparametrizations, the resulting thermodynamics and the behaviour of OTOCs -are all shared by JT gravity.
As we will discuss in detail below, the fermions ψ i can be realised as operators in an L = 2 N/2 dimensional Hilbert space obtained by taking a tensor product of Hilbert spaces of N/2 qubits and the Hamiltonian H can then be thought of as a L × L dimensional Hermitian matrix acting on this Hilbert space. Such a matrix in general has L 2 real independent elements. Taking all these to be independent Gaussian random variables with zero mean and equal variance, gives rise to the Gaussian Unitary Ensemble (GUE) of Random Matrix Theory (RMT). Its behaviour, even at low-energies, is quite different from that of JT gravity.
In contrast, in the SYK model, even though the random variables present are independent Gaussian variables, as was mentioned above, their total number is only O(N q ) which is much smaller than L 2 -the number of random variables present in the GUE.
These observations raise several interesting questions: How much randomness is needed for agreement with gravity? What happens when we start from the GUE and begin reducing the randomness, by decreasing the number of Gaussian random variables? Is the resulting behaviour, at low-energies, dependent on only the number of random variables or also on which variables have been retained? When do we get the behaviour at low-energies to agree with JT gravity? Etc.
The current investigation is prompted by questions such as these and will attempt to address some of them.
Briefly, we follow two lines of investigation here. Our analysis is mostly numerical. In the first investigation, starting with the GUE we reduce the randomness by retaining only n < L 2 of the matrix elements to be non-zero and taking each of these n elements to be independent Gaussian random variables. We find that the behaviour changes in a fairly pronounced manner as the randomness reduces and n becomes smaller. Subject to the limitations of our numerical analysis, we also find that this change occurs in quite a small interval, when n reaches a value, n c ∼ L (1.5) , so that the fraction of random matrix elements n c /L 2 ∼ L (−0.5) → 0, as L → ∞. We study several properties of the system as n is varied, including the density of states, thermodynamics, the nearest neighbour spacing distribution, spectral form factors and Out of Time Correlators and find that these reveal interesting difference with both the GUE and the SYK cases. For example, initially, starting from the GUE, as n is reduced, we find that the density of states is well fitted to a rescaled Wigner semi-circle distribution. After the transition though, this is no longer true.
The second line of investigation involves going to the other limit, and now having very little randomness. In this analysis we study a "local" version of the SYK model. In the Hamiltonian eq.(1.1) the Hamiltonian connects all the N fermions to each other through the ψ q couplings which take fluctuating values. In the altered system we study -which we often refer to as the local SYK model below -only sets of q nearest neighbour fermions will be coupled together by random couplings. This reduces the number of random couplings to O(N ), instead of the O(N q ) in the SYK model. To differentiate our model with the more conventional one described above, eq.(1.1), we will refer to the latter sometimes as the "Non-local" SYK model. Once again we study several properties of the local system, along the lines of the first investigation mentioned above, and find these compare and contrast in interesting ways with other cases.
Two comments are worth making before we proceed. In the double scaling limit considered in [9], which gives agreement with JT gravity, we have to fine-tune the potential V (M ), and furthermore "zoom in" on the edge of the matrix theory potential. This means while all L 2 random variables in the Hermitian matrix M are retained, its matrix elements are not independent Gaussian Random variables. In contrast in our study, as was mentioned above, we reduce the number of random variables, but those which are retained are taken to be independent Guassian random variables. Second, there are other variations of the SYK model which have also been considered in the literature. These involve taking q → ∞, along with N → ∞. The case where N → ∞ first, then q → ∞, was studied in [7]. In [28,30] a double scaled limit was considered in which q 2 /N is held fixed while q, N → ∞. These limits are also quite revealing and we will make some comments about them below.

More on the Relation Between the SYK Model and RMT:
It is helpful at this stage to give some more details on how the N -flavor SYK model can be realised in the Hilbert space obtained from the tensor product of N/2 qubits; this will help clarify the relation between the SYK model and RMT.
The Majorana fermions ψ i , i = 1, · · · N of the SYK model satisfy the Clifford algebra On the Hilbert space obtained by taking the tensor product of N/2 qubits, this algebra can be realised by taking the following representation for the fermions in terms of tensor products of Pauli matrices: where σ x , σ y , σ z are the usual Pauli matrices and 1 is 2 × 2 identity matrix.
Note that in this representation the ψ i are L × L matrices where It follows then that a general Hermitian matrix acting on the Hilbert space of the N/2 qubits can be written as the sum over various ψ q terms, with q taking both even and odd values: and j i 1 ···iq is real. The matrix eq.(1.6) has N q=0 N C q = 2 N = 2 N/2 × 2 N/2 = L 2 (1.8) number of real independent couplings due to binomial theorem, which is equal to the total number of independent elements in an L × L Hermitian matrix. It is also easy to see that if we now draw all the couplings j i 1 ···iq which appear on the RHS in eq.(1.6) from a Gaussian random ensemble with equal variance j i 1 i 2 ···iq , j j 1 j 2 ···jq = σ 2 δ i 1 ,j 1 δ i 2 ,j 2 , · · · δ iq,jq , (1.9) i.e., with a variance σ 2 independent of q and the coupling j i 1 i 2 ,···iq , then the resulting ensemble for H is the GUE. This follows from noting that the set of matrices α q ψ i 1 ψ i 2 · · · ψ iq form a basis in which any Hermitian matrix can be expanded. Furthermore, this basis is orthonormal with respect to the inner product, M 1 , M 2 = 1 L Tr(M † 1 M 2 ). Restating the GUE in this way allows for a better comparison with the SYK model. As was mentioned above, with fixed value of q, the SYK model has only N C q ∼ O(N q ) random couplings. In addition we note that in the SYK model, eq.(1.2) the variance (for a fixed value of q) is taken to scale as 1/N q−1 , to ensure that the model has a good large N limit. This is very different from the GUE where, as mentioned above, besides including all powers of q we also take the variance to be independent of q. A key difference in the resulting behaviour as was emphasised above, between the GUE and the SYK model at fixed q, is in the low energy dynamics, which for the ψ q case arises from the dynamics of the Schwarzian mode. More recently, as has been discussed in [28,30], even when q is not fixed but instead scales like N a , where a ≤ 1/2, as N → ∞, one continues to get the Schwarzian mode and its low-energy dynamics. At q = N 1/2 there are O(N √ N 2 ) number of random couplings. It is not clear using analytic techniques what happens when the number of random couplings becomes even bigger. Here we are finding that when the number n becomes sufficiently big, n > n c ∼ L 1.5 ∼ 2 0.75N , the behaviour is of GUE type with no Schwarzian dynamics. For n < n c , but still much bigger that O(N √ N 2 ), the behaviour changes and is qualitatively different from the GUE. Unfortunately, with the limitations of our analysis, we are not able to establish the precise nature of the low-energy behaviour, in particular whether or not a Schwarzian mode is present for such values of n.
At the other extreme, in terms of having less randomness, one can take the ψ q theory, for fixed q, as discussed in [14] and reduce the number of couplings by setting some of them to vanish at random. It was found, as we will also review in appendix A, that as long as the number of couplings present still grows like N α , with α > 1, i.e., the number of couplings grows faster than a linear power of N , the low-energy theory will be due to the Schwarzian mode. Together, these results imply that the Schwarzian dynamics at low energies is in fact quite universal -arising for any fixed value of q, or in double scaling limits as long as q ≤ √ N , and persisting in any of these cases even with vastly reduced randomness as long as the total number of couplings present are of order N (1+ ) , for > 0. What happens when the randomness is reduced even further, so that the number of couplings grows only linearly with N ? The local SYK model is an example of such a situation. Our analysis suggests that interesting differences set in, in the linear case. In particular, our analysis suggests that the Schwarzian mode is absent.
Many of our comments have focussed on the low energy dynamics and the presence of a Schwarzian theory, because they are of primary interest from the point of view of connecting with a theory of gravity. However, it is worth mentioning that the detailed spectrum, beyond the low-energy limit, and many other properties, are of course much less universal and will change. This is true as n, the number of non-zero matrix elements in H, is varied. Or as the power q in the SYK model is varied, or in SYK models as the number of non-vanishing couplings for fixed q is varied. Even when the Schwarzian mode is present, the detailed nature of the conformal fixed point at low energies changes as q is varied, with the fermion having anomalous dimension 1/q. Interestingly though, for fixed q, when we reduce the number of random couplings, keeping the total number to grow faster than O(N ), the anomalous dimension of the fermion fields is still given by 1/q and does not change.
One more comment is in order. We had mentioned above that among the properties we study are the OTOCs and the growth of operator size, which tell us about the scrambling behaviour. Using the mapping above, eq.(1.4), we see that the fermionic operators are defined in general for any system in an L = 2 N/2 dimensional Hilbert space, including RMT. One can therefore use these fermionic operators, for example, to study the growth of OTOC, or the growth of operator size. In particular, we will often consider the OTOC which involves a sum over all flavours and is given by, [6][7][8], This saturates the chaos bound with a maximal Lyapunov exponent in the fixed q SYK model. In contrast, the behaviour of the same correlator in the GUE is very different, and is that of a "hyperfast scrambler" as discussed further below.

Organisation:
The paper is organised as follows. In section 2 we review some aspects of the SYK model, the behaviour of the GUE and the connection between these two systems. In section 3 we report on our investigation of L × L matrices with reduced randomness and in section 4 on the local version of the SYK model. We end with discussion in section 5.
Before we end this introduction section, let us comment on some additional related literature. Our local SYK model belongs to the so-called "sparse" SYK models, where the number of random couplings is reduced and which have been actively studied recently [13][14][15][16]. Our local SYK model is based on the locality of fermion interactions. The question of how universal these sparse SYK models are, compared with usual SYK model, is interesting and needs to be investigated more 1 .

Convention in figures:
In all figures, whenever we denote log x, we choose its basis 10, i.e., log x = log 10 x. On the other hand, for ln x, its basis is e, i.e., ln x = log e x.
2 The GUE and SYK q Theories: We briefly summarise the behaviour of the GUE and the SYK ψ q models here. For concreteness in the numerical plots we take q = 4.

Partition functions, eigenvalue densities and thermodynamics
The partition function for the GUE, Z GU E is given by Here M are L × L Hermitian Matrices, Z 0 is a normalisation constant, and σ is given by The matrix integral in eq.(2.1) is invariant under the U (N ) symmetry M → U † M U . Using this, we can obtain the density of eigenvalues ρ L (λ), and K L (x) is given in terms of Hermite polynomials H j as The expectation values of single trace operators, Tr(M n ), can be computed using ρ L (λ) as In the L → ∞ limit, K L (λ) -the Kernel function -becomes the famous Wigner semi-circle distribution, which is normalised to unity, The eigenvalue density ρ L (λ) for the GUE with L = 4096, N = 24 and the density of states for SYK model with q = 4, J = 1, eq. (1.2), are shown in Fig 1. Qualitatively the eigenvalues are more "clumped" together in the SYK model towards the centre and the density is correspondingly smaller towards the edges, compared to the GUE case. One can heuristically understand this as arising in the SYK model due to the reduced randomness which causes smaller eigenvalue repulsion.
Reducing the rank L for the GUE gives rise to oscillations, which grow as L becomes smaller.

Thermodynamics
As we have discussed in the introduction, regarding the random matrix M as Hamiltonian H H = M , (2.9) one can interpret λ as energy eigenvalue, and ρ W (λ) as the density of states. Throughout the paper we will refer to the normalised density of states as ρ, where it satisfies the condition dλρ(λ) = 1.
Then the thermal partition function 2 of the GUE in the L → ∞ limit is where I 1 is the modified Bessel function of the first kind and the second line is the T → 0 behaviour, and the resulting entropy at low temperature is given by The T → 0 features are due to the density of states going like ρ W → 1 π √ 2 + λ near the lower edge of the spectrum at λ = −2. Note that the temperaturedependent part of the entropy is negative indicating that thermodynamics is not a good approximation in the very low temperature limit. In fact one can show that thermodynamics is not a good approximation at any temperature in this model. Thermodynamics corresponds to a saddle point in the evaluation of the partition function (2.10), which here takes the form, It is easy to see that there is no good saddle point approximation to this integral, since L is an overall factor and we do not have large N parameter here.
In contrast, the behaviour of the SYK ψ q model is very different. Denoting the energy difference from the ground state by E we have at large N that the density of states for energies E J is given by [7,17,18,29] ρ(E) ∝ e S 0 sinh cN E J (2.14) where c is a positive O(1) number (for q = 4, c ≈ 0.396). Near E → 0, ρ ∝ E/J which is analogous to the Wigner distribution near the edge λ → −2. However, in the range the partition function has a good saddle point approximation (for N 1) (2. 16) and is given by the second line on the RHS above. As a result thermodynamics is a good approximation in this regime. One can also see this from the entropy which is positive and with a resultant positive specific heat which is linear in T . The constant S 0 scales with N , where c 0 is also an O(1) constant. The presence of this constant is a characteristic feature of the SYK model. Although for any finite N the ground state degeneracy goes to zero, in the temperature range eq. (2.15), the system behaves as if, in effect, it has a big "ground state" degeneracy, since eigenvalue differences are O(e −S 0 ) which is very tiny compared to the range eq. (2.15). Most of the above features arise from the Schwarzian action.
A plot of the entropy S as a function of T for the GUE and SYK models is shown in

Spectral form factor (SFF) for RMT and SYK model
This subsection briefly reviews the work of [29].
The spectral form factor is an important quantity to study from the perspective of chaos. It is defined to be , and it can be divided into a disconnected and connected piece, where Z(β, t) = Z(β + it). In the rest of the paper, we call g(t, β) as total SFF, g d (t, β) as disconnected SFF, and g c (t, β) as connected SFF respectively.
We note that the SFF can be expressed in terms of the two point correlator ρ(λ 1 )ρ(λ 2 ) This shows that as t increases the SFF is sensitive to the two point density correlator at smaller separations in the eigenvalues.
Interestingly the SFF of the SYK model agrees with RMT [29]. Depending on whether N (mod 8) is 2 or 6, 0, 4, respectively, the SFF of SYK model agrees with that of the GUE, GOE or GSE [29].
As a function of time t there are three basic features seen in the SFF: an initial decay, dominated by the disconnected part; a subsequent ramp due to the connected part and a final plateau. The disconnected part in GUE is given by as a result, at late times, t → ∞, (2.24) so that both the real and imaginary part of Z(β +it) decay with time, leading to In contrast to the disconnected piece, the connected part grows with time, for 1 t < 2L The connected and disconnected contributions therefore become equal at a time of order which is called dip time. Thereafter the connected part dominates and the SFF exhibits a ramp-like character growing linearly with t.
Finally at time t ∼ 2L, g c reaches a plateau attaining a constant value The plateau region arises due to nearest neighbour eigenvalues repelling each other. Compared with the plateau region, the ramp region is sensitive to interactions between eigenvalues which are further apart. As time increases, the SFF becomes more sensitive to the eigenvalue interactions between the closer ones.
All of these features can be seen in Fig 4 which also show, [29] that the SFF in RMT and SYK are very similar. We also note that the behaviour of the disconnected piece of the SFF and the ramp region in the connected piece, for the SYK model, are in good agreement (for sufficiently large β) with the JT theory where they correspond to the behaviour of the disk and double trumpet partition functions, respectively [21].

Nearest Neighbour Level Spacing for RMT and SYK model
The nearest neighbour level spacing is also well known to be a good diagnostic of chaos. For GOE this spacing takes the form of the well known Wigner surmise. The probability of the spacing taking value s, P (s) is given by In Fig 5 we plot this probability distribution for the GOE and the SYK q = 4, N = 24, models. We see very good agreement with the Wigner surmise, [29]. Note that these plots have been obtained after unfolding the spectrum, as described in [27], see also section 3.2 below. The comparison is made with the GOE since the number of flavours, N = 24, is divisible by 8, see [27].

OTOC
Finally we turn to OTOCs. These are defined in eq.(1.10) above. More precisely, we consider the normalized OTOCs where Z is the partition function, Z = Tre −βH and (2.31) In the non-local SYK model, the OTOCF ij (t) is same for all pairs as far as i = j, since all the flavour of fermions are connected to each other with random couplings j ijkl of the same strength. Numerically, we calculate the OTOC by averaging over the pairs (i, j) for i = j and denote this quantity asF (t); since the OTOC is independent of the choice of flavour, this average should be the same as that for any particular values of (i, j). Below we plot the function The plots for the non-local SYK model are obtained as follows, see also appendix E. The dots in the left hand panel are the data, obtained with an ensemble of 2 elements. The solid curves are obtained by fitting the initial region (up to a maximum time t max ) for each β with a curve of the form A + Be λt . The best fit value of λ obtained in this manner is then plotted against β in the right hand panel. The maximum time t max up to which we fit the OTOC with the exponential form is obtained by starting with an initial range for t where this form is a good fit and then extending this range to the largest time, t max , where the exponential fit continues to be a good one. This was done by first fitting the data naively with "the eye" and then, more systematically, by using a chi-square distribution to estimate the goodness of the fit, as discussed in appendix E. In the right side panel the error bars for each data point corresponds to the 1σ errors obtained by fitting the OTOC up to t max . The red curve in this panel corresponds to λ = 2π β , and the black curve is obtained by fitting the data with a curve of the form λ = A β + B. The best fit gives B = 0 and A = 6.5, which is in reasonable agreement with the chaos bound exponent 2π β . Strictly speaking, the Lyapunov exponent is defined in the large N limit. Obtaining it in this limit numerically though is quite challenging, see [31]. From the right hand panel above we see that while working at a moderately large value of N , our analysis still allows for a fairly reliable estimate of the Lyapunov exponent.
Theoretically, one can show in the large N limit that G 4 takes the form where A is a coefficient of order unity determined by the fermionic fields. This shows that G 4 reaches its asymptotic value of order unity at a scrambling time t ∼ β 2π ln N.
(2.34) G 4 is suppressed by a factor of 1/N because the non-local SYK model has a saddle point in the large N limit, and the value of the action at the saddle point scales like N . We see that as a result the scrambling time is parametrically larger than β by ln N .
Random matrix OTOC, L=4096 The OTOC for RMT is plotted in Fig 7. In appendix B, eq. (B.13), we show that in the L → ∞ limit, G 4 is given by where In Fig. 7, we show this function as a solid curve for three different values of β and also the numerically obtained data for L = 4096, which is in reasonable agreement.
The important point to note is that G 4 reaches its asymptotic value 1 in a time scale of order unity, not scaling with L. This shows that the GUE exhibits "hyperfast" scrambling. As was noted in [32] the thermalisation time scale is also of order unity and therefore comparable to this scrambling time. This in contrast to what we saw above for the non-local SYK model where the scrambling time scales like β ln N .
The underlying reason for this hyperfast scrambling is the U (L) symmetry which is preserved by the GUE, see appendix B for further discussion and also in [32]. There is no notion of "simple operators" which is invariant under this symmetry. The ψ i operators under a general U (L) transformation can be turned into arbitrarily complicated operators involving up to N factors of the ψ's. States prepared by acting with such complicated operators are already scrambled at the get-go, resulting in the hyperfast scrambling in the GUE.

Sparse Random Matrices
In this section, we shall explore the consequences of considering random matrices which are not fully random in the sense that not all elements of the matrix are randomly drawn numbers. We refer to this more general class of matrices as sparse random matrices.
The reduction of randomness is achieved as follows.
We set the diagonal matrix elements M ii to vanish. We then pick randomly and uniformly, n off-diagonal matrix elements, M ij , with i < j. The value of each of these matrix elements is chosen independently with its real and imaginary parts being drawn from a Gaussian distribution with variance 1 2L . All other off-diagonal matrix elements of M are also set to zero. Finally the matrix is made Hermitian by taking it to be 1 2 (M + M † ). The behaviour of the resulting random ensemble we obtain in this way, as a function of the number of random elements n, is then studied below numerically.
We now present results of numerical analysis for various quantities such as density of states, specific heat, level spacing and spectral form factor for the resulting ensemble.

The density of states
In the plots below, the density of states (DoS) is normalized to unity unless specified otherwise, i.e., The plot for a matrix of size L = 4096, as n (≤ 1.6×10 7 ) is varied, is shown below. The fully random case corresponds to the GUE with L = 4096. We see that there is a general tendency for the eigenvalues to get more concentrated towards the centre, as n decreases, i.e., the sparseness is increased, indicating that the repulsion between eigenvalues decreases.
However starting from the GUE there is an interesting change in behaviour as n continues to decrease. Initially, when n is not too small, corresponding to the values, n ≥ 1.5 × 10 5 , i.e., , n L 2 ≥ 9 × 10 −3 , in Fig. 8 (a), the density of states can be fitted to a Wigner semicircle, after rescaling the energy. We find, where ρ W is the Wigner semicircle, eq. (2.7). The parameter B scales with n in a power law fashion, B ∼ n L 2 0.45 , as can be seen in Fig. 9. However once the randomness becomes very small, n ≤ 1.5 × 10 4 , i.e., n L 2 ≤ 9 × 10 −4 , in Fig. 8 (b), there is an important change. The effect of the repulsion among eigenvalues decreases to a point where the eigenvalues now like to clump much more at the center, near E = 0, resulting in a sort of cusp at E = 0. This is reminiscent of a condensate forming in bosonic systems. As n decreases further, the clumping and resulting cusp at E = 0 get more pronounced. The transition to the regime where the Wigner distribution stops describing ρ occurs over a fairly narrow range in n. From Fig 8 we see that the value n = 1.5 × 10 5 is well fitted with ρ W , but this is no longer true for n = 1.5 × 10 4 .
It would be very interesting to find out how sudden is this transition, as L → ∞, but this is not easy to analyse numerically. Let us denote by n c the approximate value of n which separates the two regimes. One would like to understand how n c behaves as L → ∞; a natural ratio to consider is n c /L 2 which is the fraction of total matrix elements that are non-zero. This ratio is plotted in Fig. 9 below as a function of 1/L. We find that, subject to the limitations of our numerical analysis, an approximate fit is given by Note that in Fig. 10 the solid line is the best fit obtained after discarding the last point (at the value of log L = 4.623).
We leave a better understanding of this transition for the future.

Comments on Various Properties
Here we discuss some of the properties of ensemble obtained above with reduced randomness.

Thermodynamics
The density of states determines the finite temperature behaviour of the thermal partition function. Since for n > n c , ρ is well fitted by ρ W , in this case the thermodynamics will agree with that of the GUE, after an appropriate rescaling of energy or temperature. However once n < n c this will no longer be true. The resulting behaviour of the entropy S at low temperatures is shown in Fig. 11. We have not been able to obtain a good analytical fit to this result. Note that in this case n c lies in between 1.5 × 10 4 and 1.5 × 10 5 .
It is interesting to contrast the behaviour of entropy, S, for the random sparseness case with that of the GUE and the SYK 4 theory shown in Fig 3. In particular, for the SYK 4 model, in the large N limit, we saw that there is a range of temperatures, eq.(2.15), where a saddle point approximation is valid resulting in the entropy behaving linearly with temperature, eq. (2.17). One would like to study if something similar happens for the random sparseness case, for n < n c , L → ∞. This is however difficult to do numerically. Figure 11: The Entropy at low temperature for varying values of n. Here we plotted only the temperature dependent part of the entropy, after subtracting log L.

Spectral Form Factor (SFF)
We now turn to the spectral form factor, defined in eq. (2.19), with connected and disconnected pieces given in eq. (2.20), (2.21). We shall analyse how the SFF changes with changing randomness, n.
We recall, eq. (2.22), that the SFF is determined by the two-point density correlator and therefore is sensitive to the interactions between the eigenvalues.
The numerical results for the random sparseness case are in Fig. 12, again for L = 4096. We take β = 5 and consider varying values for n. The plots show much richer behaviour, compared to the GUE, Fig.4. In Fig. 12 besides the fully random case, the SFF is plotted for 5 other values of n. Of these n = 138240, is roughly comparable to n c , all other values of n lie in the range n < n c , where ρ W is no longer a good fit. Generally speaking, differences in the SFF compared with the GUE grow as n decreases. However, it is interesting to note that the n = 138240 case already shows differences with the GUE. Figure 12: Left: disconnected, middle: connected, right: total spectral form factor with changing number of random elements, parametrised by n, for a matrix of rank 2 12 = 4096. We have taken an average over 300 ensembles for each cases. Here n c ∼ 1.5 × 10 5 .
While the general features of a dip, ramp and plateau are present, there are interesting differences: • The top panel has n going down till 9000. The bottom panel has lower values of n ≤ 5760. In each case we plot the disconnected (left), connected (middle) and total (right) SFF. We see that in the top panel the ramp region in the connected SFF has much more structure, and unless n becomes very small and we go to the bottom panel (n ≤ 5760), it is not monotonic, and exhibits an intermediate local maximum, which we call a "peak", followed by the local minimum and then grows until the final plateau. This indicates that while the nearest neighbour eigenvalues continue to repel, leading to a plateau at late times; at intermediate separation, beyond nearest neighbours, in fact an attraction develops between eigenvalues.
• The height of the plateau decreases with decreasing n suggesting that the repulsion between the nearest neighbours also reduces. This also agrees with what we saw in the density of states, Fig. 8, which get more clumped in the centre as n decreases.
Some more comments are as follows: • The disconnected piece itself has interesting structure. For the GUE and SYK models, the disconnected piece is known to decay and be self averaging, meaning that a single instantiation of the Hamiltonian (for big enough N ), reproduces the ensemble average. Here we see that the disconnected piece has an intermediate 'spike' followed by a slower decay, with the intermediate spike being more pronounced for in-between values of n. See Fig.12, left panels, n = 13824, 10000, 9000 around log(t) = 2 for spike.
• The connected piece becomes smaller in magnitude as n decreases, and the full SFF is increasingly dominated, for smaller n, by the disconnected term.

Level Spacing
Next we turn to the nearest neighbour level spacing. There are a few points to keep in mind.
It is important to compute the level spacing after unfolding the spectrum to obtain the Wigner surmise for the Gaussian ensembles or SYK q theory 3 . The eigenvalue spacing after unfolding is given by we see that it is the local average value of the eigenvalue difference. The histogram plot of the quantity ∆λ for Gaussian ensembles or the SYK q theory then gives the standard Wigner Surmise.
Alternatively, another quantity which avoids this procedure is the ratio of eigenvalue differences, since the dependence of the density of states in eq. (3.4) cancels out in the ratio; Some more information on how the distribution of r i behaves for various Gaussian ensembles is given in appendix D.
If the eigenvalues are uncorrelated then the statistics will be that of the Poisson distribution which corresponds to As in [27], it will actually be convenient to plot the distribution of the logarithmic ratio ln r, which is, Also, we need to be careful about degeneracies in the spectrum. There will be a degeneracy in the spectrum if the model under consideration possesses symmetries. For example, if the Hamiltonian has fermion parity symmetry, then the eigenvalues will be 2-fold degenerate. Therefore in order to compute the level statistics, one has to first separate the eigenvalues according to the parity sector and plot the differences in eigenvalues corresponding to any one sector. As mentioned in [27], for N = 0 (mod) 8, which is the case for N = 24 for which we present the analysis, there are no degeneracies after we do this.
Let us now turn to the analysis of the plots. We have considered matrix of size L = 4096 and an ensemble size of 1000 in obtaining these plots. The left panel is a plot of the probability P (s) versus the unfolded level spacing s given by eq. (3.4), for varying values of the randomness parameter in the range n ≥ n c . The Wigner distribution is given by the black curve and corresponds for the GUE to the function, The fully random case is the data for the GUE, and other curves are obtained by reducing n until it reaches a value close to n c .
In the right hand panel we plot P (ln(r)), defined in eq.(3.7), vs ln(r). Again the black curve is the result for the Wigner surmise, the fully random case is the data for the GUE, and other cases correspond to varying values of n, which for n ≤ 5760 are less than n c . Note that for the Wigner surmise p(r) is given by from which it follows from eq. (3.7) that for GUE(β = 2). (3.11) For the Poisson distribution it is given by The result in eq. (3.9) is derived in appendix D for the simple case of 3 × 3.
We see from the left panel of fig. (13) that up till n = 1.5 × 10 5 , when the density of states fits a rescaled Wigner semi-circle, the level statistics is also well fitted by the Wigner surmise. However when n decreases further past n c , then differences set it, as we can see from the right panel. These grow, as n decreases and eventually for n = 576, when n/L 2 ∼ 3 × 10 −5 and the randomness is very sparse, the statistics becomes Poisson.
It is not unexpected that for vastly reduced randomness the statistics becomes close to Poisson. Ideally one would have liked to obtain the behaviour with varying n in the limit when L → ∞, but that is difficult to do numerically and left for the future.

OTOCs
Finally, we consider the behaviour of the OTOC, G 4 = 1−F (t), eq. (2.32), as a function of time t and analyse it as we vary n, the number of random elements 4 . In each case we work with L = 4096, and an ensemble size of 2. Four figures are given below for different values of n, 10 6 ,10 5 ,10 4 , 4.5 × 10 3 . In the left panel of each figure the points are the data. Note that the range of t is different for n = 10 6 , 10 5 and n = 10 4 , 4500. Comparing the left panels for various n, we can see that in general, the growth of the OTOCs slows down as n decreases. Furthermore, there is a noticeable slowing down once n crosses the value n c , which is somewhat smaller than 10 5 . For example, the OTOCs in Fig. 14, with n = 10 6 , exhibit a saturation time scale, at which they approach their asymptotic value, which is quite similar to that for the GUE, Fig. 7. for example, for β = 10, Fig. 14, this time scale is t ∼ 10, which is comparable to the time scale in fig. 7. Whereas for n = 10 4 the saturation time scale, for β = 10 is t ∼ 50 − 60.
The solid curves in the left panel are obtained by fitting an initial region, t ≤ t max , of the OTOCs to a functional form A + Be λt , as in the previous discussion of OTOCs above. The value of t max for each curve has been estimated by choosing the maximum range in t for which such a functional form is a good fit. The resulting estimate for the Lyapunov exponent is then plotted in the right panel of each figure. We have done the estimate for t max in a preliminary manner in this analysis and our corresponding estimate for λ should also be taken to be preliminary one.
In particular, we recall that the GUE exhibits hyperfast scrambling with the OTOCs saturating in a time scale of order unity, Fig 7. One would expect this to continue to be true for sufficiently large values of n ∼ L 2 . As n beginning to decrease it is possible that exponential growth sets in and a Lyapunov exponent can be defined in the L → ∞ limit, once n becomes sufficiently small, perhaps smaller that n c . However we have not been able explore this possibility systematically and leave it for future investigation.     Before closing this section let us make one final comment. Here we have studied the ensemble obtained by randomly choosing some fraction of matrix elements to vanish. In appendix C we discuss another model where the nonvanishing matrix elements are chosen in a more systematic manner. We first choose an integer q, and then only take the matrix elements, H ij , for i − j = 0 mod q, to be non-zero. The resulting matrix then can be block diagonalised into q × q blocks. Each of these blocks behaves like a GUE. E.g., the level spacing statistics of each block follows the Wigner surmise, however when we combine all the blocks the resulting level statistics becomes Poisson.

Local SYK model 4.1 Local SYK model and the reduction of random couplings
In this section, we consider a variation of the SYK model where the number of random couplings is vastly reduced and only O(N ). The fermions in this model can be thought of as lying on a circle and the only couplings present are those between q nearest neighbours. The Hamiltonian is given by with the fermions satisfying periodic boundary conditions: To distinguish between this model and the conventional SYK model, with all to all couplings, we will refer to the model above as the local SYK model, in contrast to the conventional SYK model which we have often referred to as the non-local SYK model above. The couplings j i 1 i 2 ...iq present are taken to be random with vanishing mean and variance We are particularly interested in the behaviour as N → ∞, withĴ being held fixed. Here we will also keep q fixed.
Notice that the varianceĴ 2 above has a different N dependence from the non-local SYK model, eq. (1.2). The choice above is appropriate for a local system and ensures that a good thermodynamic limit exists with energy, entropy, etc, scaling extensively with the number of sites, N , when N → ∞, at fixed T,Ĵ.
Also notice that with the choice we made in eq. (4.3), we get the dispersion in energy which has the same linear N dependence as in the non-local SYK model in the large N limit One more comment is worth making here. While the four-fermi interaction in eq. (4.1) does preserve a notion of locality, bilinear terms which would typically be present in a local theory are absent. For Majorana fermions, which we are considering here, the bilinear terms couple nearest neighbours and result in an additional "hopping" contribution to H More generally one can consider bilinear terms which couple next to nearest neighbours, etc, as well. We will neglect all such terms below. Study of the local SYK model with these hopping contributions are done in [40].
We restrict ourselves to the particular case of q = 4 in this section. We will compare and contrast the behaviour of the local and non-local SYK model by studying various quantities such as density of states, specific heat, 2-pt Green's functions, spectral form factor and OTOCs.
Before proceeding let us mention one potentially confusing point while reading the various plots of this section. The numerical results have been obtained for finite values of N . In all the plots, we take J = 1 for the non-local model and also takeĴ = 1 for the local model.

The density of states (DoS) and specific heat
The unit normalised density of states for the non-local and local models are presented in fig. 18 below for a few different values of N . We see qualitatively that the states are more clumped at the center (near E = 0 in the plot) in the local SYK model and the behaviour close to the two ends is more like the tails of a normal distribution. The reader will recall that when we discussed the non-local SYK and GUE models ( fig. 1) we saw that compared to the GUE the spectrum of the non-local model indicates that the repulsion between eigenvalues is weaker, leading to the eigenvalues being more concentrated near the center; we see now that this tendency is even more accentuated in the local case. We have scaled the axes in fig. 18 appropriately with N that ensures the width of the distribution is almost same for all values of N . The non-local SYK model has a specific heat which is linear in temperature T in the low energy region, eq. (2.17), eq. (2.15). The resulting behaviour of ln(Z) in the non-local case also has a constant and T linear term in the temperature regime, eq. (2.15).
The plot for entropy S vs T in the non-local SYK and the GUE was presented in fig. 3. The corresponding plot for the local SYK model is in fig. 19. We see from fig. 19 that there are qualitative similarities between them, essentially due to the fact that in all cases the entropy vanishes at T → 0, and increases monotonically achieving its maximum value, S = N ln 2, as T → ∞. An important feature for the non-local SYK is the low-temperature regime eq. (2.15), where the entropy has the characteristic behaviour, eq. (2.17). The presence of the constant S 0 and the linear T behaviour are closely tied to the presence of the Schwarzian action. It would be very interesting to check whether the same kind of behaviour arises in the local case. Unfortunately, the numerics which have been carried out at finite N do not allow us to obtain a conclusive answer and we leave this question for the future. Let us make one comment before proceeding. We see from fig. 19 that the rate of change of S/N from its low-temperature value, where it is close to vanishing, to its asymptotic value as T → ∞, becomes more rapid as N increases, suggesting that this change should be quite rapid in the N → ∞ limit.

Green's function
The two point function of fermions is another important quantity and is given by In the non-local SYK model, the two point function can be computed analytically in the large N limit, in the low-energy regime. Due to the emergent conformal symmetry, it takes the scaling form, where g 0 is a constant.
The corresponding plot for the Green's function in the non-local model are shown in fig. 20. The left panel shows log(Im (G(ω))) against log(ω).
There are three regions which can be fitted to a power law. For ω ≥ 1 we have a good fit, shown in green, to the free field result Im G(ω) = 1 ω . For 0.22 < ω < 0.65, the red line is a good fit to the conformal result, eq.(4.8).
For sufficiently small ω we find the fit to eq.(4.8) breaks down, and at very low ω < 0.1 a new scaling regime arises shown in black, with Im (G) ∼ 1 ω 0.16 . This break down and the new scaling regime are finite N effects. The middle panel in fig. 20 is Im (G) vs ω showing the region ω > 1 in green, and the right panel shows that the region 0.22 < ω < 0.65 is well fitted by eq.(4.8).  fig. 21. Once again there are three panels. The left panel is a plot of log(Im G(ω)) vs log(ω) showing that for ω > 0.1 there is a reasonably good fit with the free field result, Im G(ω) = 1 ω . The middle and right panel show that the regions 0.1 < ω < 1 and ω > 1 respectively, are well fitted by the free field form. There are deviations for ω < 0.1. We have not been able to reliably estimate the behaviour in this region. Here we rescale ω as ω R = ω × N 3/2 . (b) Im (G(ω R )) vs ω R for the region 0 < ω R < 100. (c) Im (G(ω R )) vs ω R for the region ω R > 100 .

Spectral form factor (SFF)
The SFF was discussed for the non-local SYK and GUE in subsection 2.2, fig.  4 and for the random sparseness model in section 3.2, fig. 12. In the figure below we plot the disconnected, connected and total SFF for the non-local and local models. The most immediate difference that catches the eye is in the total SFF ((c) panel) where we see that the ramp in the non-local model is almost flattened in the local SYK.
For the case with random sparseness the reader will recall, see fig. 12 that we had seen a similar behaviour for the SFF, namely, the flattening of the ramp as we increase sparseness, in particular for 1 n n c . It therefore seems that the reduction of eigenvalue repulsion which occurs in the random sparseness case, is also present here with O(N ) randomness. We had noticed in the case of sparse random matrix analysis that as the sparseness is increased, when 1 n n c , the ramp in the SFF starts to disappear along with the level spacing becoming more Poisson-like, see fig.13.
Recall that the SFF of the non-local SYK model is quite similar to that of the GUE where the ramp can be understood as a consequence of the sine kernel for the two point density correlator. The absence of a ramp in the local SYK model indicates that the two-point density correlator is quite different for this system. One more comment worth making about fig. 22 pertains to the disconnected part ((a) panel). For the non-local model, the disconnected piece falls like 1/t 3 (at t β). We see from the figure that in the local model, the disconnected term follows a similar fall-off.
From fig. 23 we also see that the local maximum in the connected SFF for the local case occurs somewhat early, at a time when the disconnected part has still not decayed away and is comparable in magnitude to the connected one. As a result such a local maximum does not appear in the full SFF.

Level Spacing
The nearest neighbour level spacings statistics is also a useful diagnostic of chaos. After unfolding the spectrum, as described in subsection 3.2, and taking care to separate the eigenvalues for eigenvectors of different parity, the resulting statistics is plotted in fig. 28.
The left panel is the probability P (s) vs the spacing s. The green, blue and brown points are the data for the GOE, non-local SYK and local SYK models respectively (the GOE is the relevant ensemble since N = 0 mod 8, [27]). The solid curves are the best fits to this data, with the green, blue and brown curves being the fit to the green, blue and brown points, respectively. We see that the non-local SYK model is in good agreement with the GOE RMT result, eq. (2.29), see subsection 2.3 and [29], while the local SYK model is in good agreement with Poisson statistics, (4.9) In the right hand panel the same nearest level statistics is presented now as a plot of the variable r, eq. (3.5), along x-axis, vs P (ln(r)), eq.(3.7). The GOE, non-local SYK and local SYK data is in green, blue and orange respectively, and the corresponding curves for the GOE RMT, eq. (2.29) and Poisson statistics, eq. (4.9), are in green and red respectively. We see that again the non-local and local SYK model are in good agreement with GOE and Poisson statistics respectively.
The bottom line is that we find for the local SYK model that the nearest neighbour level statistics agrees with the Poisson distribution to very good approximation, showing, at least as far as this diagnostic is concerned, that the model is not chaotic. This is also in agreement with what we have observed in the analysis of SFF earlier in subsection 4.4.

Out-of-time-ordered correlators
Next we turn to OTOCs.
As was discussed in [24,37] , one needs a small parameter to be present, analogous to 1 N in a large N vector model or gauge theory, or to , for a well defined notion of scrambling time and the associated Lyapunov exponent in OTOCs, to exist. The small parameter suppresses the connected component of the OTOC compared to the disconnected one. We do not have such a parameter in the local model and in its absence it is unclear if the OTOCs can serve as a useful diagnostic of chaos. The parameter N present in the local model, as was discussed at the beginning of this section, should be thought of as labelling a site with one fermion being present per site. The N → ∞ limit in the local model is therefore an infinite volume limit which is different from the vector model or gauge theory cases where in this limit the degrees of freedom per site diverge.
Despite this observation it is worth examining the behaviour of the OTOCs in the local model and comparing it to the non-local case. An important limitation of our analysis below is that we will be working numerically at finite N and our conclusions will mostly be of a qualitative nature.
Let us also note the following. Drawing on the discussion about quantum chaos in higher dimensional local systems [24,25], even in systems without a small parameter of the type mentioned above, one can often define a velocity dependent Lyapunov exponent λ(v) which vanishes at the butterfly velocity v B . For v = |x|/t where |x| is the spatial distance, with v > v B , the OTOC C(x, t), defined for an local operator O(t) at x = 0 and another local operator W x at x, takes the form with λ(v) being the velocity dependent Lyapunov exponent. When there is a large N (or small ) parameter which suppresses the connected component, α is unity and the behaviour of C(x, t) also continues to be of the same form when v < v B , resulting in exponential growth of the OTOC. However, in the absence of such a parameter suppressing the connected component, often α is bigger than unity and the growth of C(x, t) does not continue for v < v B .
The local SYK model we are studying is not a conventional local system since there is no quadratic hopping term in the Hamiltonian, eq. (4.6). Even so, it would be interesting to ask whether a butterfly velocity exists in the system and the behaviour of the OTOC close to it. We will not be able to address this question in a definitive manner here and leave it for the future.
For the non-local model the OTOCs were discussed in section 2.4. By averaging over flavours we defined the function G 4 , eq. (2.32) whose behaviour is plotted in figure 6, obtaining, at N = 24, good agreement for the Lyapunov exponent with the N → ∞ result.
The plots for local SYK are shown in fig. 26, 27. In obtaining the plots for the OTOC, we have considered whereF ij is given in eq. (2.30), then averaged the value of G ij for all possible pairs (i, j) keeping the distance p = |j − i| fixed. This averaged correlation is denoted asḠ i,i+p below. Fig. 26 (a) is for p = 1, and varying β, fig. 26 (b) is for p = 2, and varying β and fig. 26 (b) is for various p and two different values of β.
We see that the OTOCs in general grow rapidly before saturating to a maximum value, in a manner, qualitatively speaking, which is similar to the non-local SYK model and also the GUE RMT, section 2.4. The rate of growth is fastest forḠ i,i+1 , i.e., the nearest neighbour sites and then slows down as p increases. It is worth noting in this context that for p = 4, 5 there are no terms in the Hamiltonian which directly connect the two sites, and this might account for some of the slow down. Also for p = 1, 2 the rate of growth slows down as β increases, this is similar to what happens for the non-local SYK model too.
In conclusion, qualitatively the OTOcs in the local case are similar to the non-local and GUE cases. A more systematic analysis of the OTOCs in the local model, especially in the large N limit, is left for the future. It will be especially interesting to study whether a butterfly velocity can be defined in the system, and whether there is evidence for exponential growth of the OTOCs inside the light cone, v < v B . Some further discussion on the OTOCs in this model can be found in appendix F.

Operator Growth in SYK Models
The growth of operators from simple to complex ones is also a way to characterise the scrambling properties of a system, [26,33].
In the non-local SYK model it was suggested that the number of fermionic terms which appear in an operator can be a measure of its complexity. Thus a single fermion ψ i is a simple operator whereas another having a product of many fermions would be complex. Staring with the single fermion ψ i at t = 0 its time evolution is given by Products of fermions of the type ψ a 1 , ψ a 2 · · · ψ as for a complete basis in the space of operators and one can therefore expand ψ i (t) in this basis, 14) The probability of having an operator of size s for a particular combination can then be computed by finding all the c and this leads to c {i} a 1 ···as (t) = 2 1+s 2 ψ a 1 · · · ψ as , ψ i (t) (4.16) The resulting probability distribution of having operators of size s is then In the equation above the operator O denotes all the N s operators of size s; ψ a 1 ψ a 2 · · · ψ as , with condition a 1 < a 2 < · · · < a s .
Starting with the operator ψ 1 at t = 0 (setting i = 1 entails no loss of generality), we can find P s (t) numerically. Fig. 28 and 29 are plots obtained for the non-local and local SYK models for s = 1, 3, 5. We see that in both cases the operator growth is qualitatively similar. P 1 (t) decays with time.
In the non-local case, P s (t) increases, reach a maximum and then decreases, with P 3 (t) peaking before P 5 (t). In the local case, the behaviour is similar, but there is a longer tail characterising the slow decrease of P 5 (t).

Gaussian Ensembles with Varying Randomness
As was discussed in the introduction, the SYK model, like the GUE, can be thought of as a random Hamiltonian with matrix elements being drawn from a suitable Gaussian distribution. We explored some other random systems above where the random Hamiltonians were obtained by choosing other Gaussian distributions, with interesting differences in their behaviour. Here, we will make some additional comment on Gaussian ensembles.
Before proceeding let us note that the low-energy dynamics of the nonlocal SYK model, which is JT gravity, is known to also arise in Random Matrix theory by choosing a non-Gaussian potential V (H) and considering a suitable double scaling limit [9]. We restrict ourselves to commenting on Gaussian ensembles below, since this class of random systems is already quite rich.

General Observations
An Hermitian matrix H can be thought of as being a vector in an L 2 dimensional Hilbert space H of L × L Hermitian matrices. H can be expanded in any basis of H, where T a are Hermitian matrices which are the basis elements, and c a are real coefficients. An inner product in this space is given by the trace, By drawing the coefficients c a from various Gaussian distributions one can get different types of Gaussian ensembles. Under a change of basis the T a 's, and H, transform by conjugation, where U is the L × L unitary matrix specifying the change of basis.
A few different bases for H -are : • A standard basis corresponding to the root vectors of U (L) [38].
Each of these bases is useful in their own way, since they allow us to use different types of physical intuition in the study of randomness. The first basis connects to Random matrix theory, the second to the study of spin systems, and the third to the study of the SYK model.
An important property of the GUE is that the resulting probability distribution is U (L) symmetric. Many key properties of the GUE, including the fact that OTOCs and operators grow in an hyper-fast manner, as discussed in section 2, are tied to this U (L) invariance. For other random ensembles the U (L) symmetry will be broken in general. This is true for the non-local SYK model, and the breaking of the U (L) symmetry is in part responsible for interesting differences in behaviour with the GUE. It is also true for the random sparseness model studied in section 3, where we keep n < L 2 matrix elements, or the local SYK model studied in section 4, where we keep only O(N ) random variables.
We can also consider other ways of breaking the U (L) symmetry, these could also yield interesting properties from the point of view of thermodynamics or chaos. For example we can take H to be block diagonal with If σ M L = 0, the off diagonal terms in the matrix integral go to zero and the distributions breaks up into two GUEs of rank M, L − M respectively. But when σ M L σ M , σ L , the system behaves like two GUEs with a weak coupling between them. E.g., the density of states is only perturbed slightly by the offdiagonal terms, which are "very massive", where mass m ∼ O (1/σ M L ). This perturbation can have interest effects for chaos related properties though. The off-diagonal terms could alter the nearest-neighbour spacings which are of O(1/L). And for operator growth, one expects that a simple operator acting in the M × M block will at first scramble in an hyperfast manner into a matrix filling out the M × M block, but its subsequent growth into a full L × L matrix would be much slower and governed by the off-diagonal terms 5 . By changing the strength of the off-diagonal term compared to the diagonal ones, once can in this way explore a range of behaviours.
Other random ensembles are also worth studying. The GUE eq. (2.1), eq. (2.2), corresponds to choosing the coefficients c a (for unit normalised T a 's) to be independent Gaussian random variables with the probability distribution Taking σ a → 0, we will have a highly ordered matrix, where the dispersion about the meanc a is small. As σā ca increases the amount of disorder increases, and in this manner one can study the change in behaviour from the ordered to disordered situations.
We will report on the behaviour of some of these systems in a subsequent study [36].

More on SYK Models
Let us now turn to the SYK model in the context of the wider discussion above.
As was noted in the introduction, eq. (1.6), eq. (1.9), the GUE can be obtained by summing over all ψ q terms with equal variance. The non-local SYK model corresponds to only keeping the terms of the ψ q kind, i.e., involving products of q number of ψ i fields. The large N saddle point arises when the coefficients of these terms are drawn from a Gaussian distribution with variance given in eq. (1.2). The fixed q model only preserves an SO(N ) subgroup of the U (L) symmetry mentioned above.
In [14] a version of the fixed q model was studied where the number of couplings was reduced in a random manner. This model is reviewed in appendix A. In the large N limit the fixed q model, to begin, has O(N q ) couplings. As long as the number of random couplings after reduction scale like N α , α > 1, a saddle point analogous to the fixed q case exists in the large N limit, for a suitably chosen variance for the non-zero couplings. In fact, this is also true when q does not take a fixed value, as long as it grows with N as a power q ∼ N β , with β < 1 2 . This sparse SYK model, [14], is analogous to the sparse random matrix model studied in section 3. If we think of the GUE, as was mentioned above, by summing over all ψ q terms, then at large N the maximum number of terms are contributed by operators with q = N/2. In fact, the number of random variables for q = N/2 is which is exponential to N . Thus, instead of starting with the GUE and making the matrix H sparse, we could at large N , only take terms with q = N 2 , and consider the effects of making the randomness sparser by setting some terms to vanish in a random fashion 6 . The difference with the [14] case is that the value of q = N 2 now scales linearly with N . As we saw in section 3 as the number of terms n is reduced an interesting change in behaviour occurs in this case when n ∼ n c , with nc L 2 , scaling with L as given in eq. (3.3). It would be interesting to try and understand this transition using saddle point methods but we have unfortunately not been able to do so 7 .
The saddle point referred to above, obtained by starting with the fixed q model and keeping O(N α ) terms is analogous to the one obtained for the fixed q case (where all terms are kept). In particular, the saddle point continues to break time reparametrisation invariance giving rise to the Schwarzian mode and the low-energy behaviour is a CFT with the fermionic fields having the anaomalous dimension 1/q.
In this example we then see that most of the essential features found in the non-local SYK model -tied to the realisation of the time reparametrisation symmetry and the Schwarzian action are quite robust and continue to occur even for much smaller number of random couplings and are independent of the underlying Hamiltonian. Some of the more detailed features though depend on the underlying model. For example, we can have two different models with the same number of couplings O(N α ), obtained by starting with two different values of q, q 1 , q 2 , and reducing the number of couplings to different extents. The anamolous dimensions of the ψ i fields in the low energy CFT will then be different, 1 q 1 and 1 q 2 respectively. Let us also comment on symmetries. When we keep only even q terms in the sum to obtain H there is a fermion symmetry, ψ → −ψ symmetry, which is preserved. This symmetry is broken once odd q terms are included. Also, among even q terms if we only keep q = 0 ( mod 4) terms then H has time reversal invariance -as can be seen from the fact that the couplings are real. Terms of the same symmetry type can be transformed into one another under U (L) transformations. Nevertheless and quite interestingly, as mentioned in the previous paragraph, if we take terms for two different powers q 1 , q 2 both of which preserve the same class of symmetries and also take the same number of couplings in both cases, the low energy limit has some differences in the resulting anomalous dimensions of the fermionic fields.
We can now comment on the figures plotted below. Fig. 30 (a) plots the density of states for the GUE, the ψ q model where all even powers of q are included with the same variance, and the model where all ψ q terms with odd powers and the same variance are included. Fig. 30 (b) included the three models again, but with a rescaling carried out along the x (energy axis) so that the extent along the x axis of the three curves coincides. The y axis is also correspondingly rescaled to keep the area the same under each curve. We see that after this rescaling, the density of states for the models obtained by keeping only even or odd terms agree with the GUE. The density of states for the model obtained by keeping all q terms agrees with the GUE even before rescaling, as is to be expected. In fig. 31 working with the N = 20 case, we show the density of states when only some terms of the ψ q type, with q = 0 (mod 4), are kept in H. From our discussion of symmetries above we see that in this case all the terms we retain preserve the fermion and time reversal symmetries. The left panel of the figure shows the density of states for various values of nthe number of terms which have been kept, with the black curve showing density of states, ρ, for the SYK q = 4 model. The right hand panel is the same data after rescaling the energy E and ρ axis, so that all the curves coincide in their extent along the x axis, as discussed above. In addition, in the right panel we have included the RMT (all q) case and also the case where all q = 0 (mod 4) terms are included -these are shown as yellow and red curves respectively. We see after the rescaling, ρ for the GUE and the model where all q = 0 (mod 4) terms are kept, are in good agreement. As the total number of terms in H is decreased ρ begins to change. The black curve in the right panel is also for the q = 4 SYK model (where all q = 4 terms are included) -this model has a total of N C 4 = 4845 number of terms. Qualitatively, the right panel shows that for intermediate values of n, ρ near the center, around E = 0, agrees more with the q = 4 case. Fig. 32 shows a similar plot, with rescaled energy and ρ axes, for N = 24. The yellow curve is for the GUE and the black curve is for the q = 4 SYK. We see in this plot more clearly that at intermediate values of n, for E close to the edges, ρ differs from both the GUE and the SYK 4 theories, while for E close to the center, ρ lies in between the GUE and SYK 4 cases. It could be that some of these differences are due to finite N effects.  In all the figures above the variance for all the non-zero couplings was taken to be the same. One can also examine what happens if we keep terms whose couplings are normal distributed with different values for the variance. An example of this is shown in Fig. 33 where we have kept all terms of ψ 4 and ψ 6 type, for N = 28. The ψ 4 , ψ 6 terms are drawn from distributions, eq. (1.2), with q = 4, 6 respectively, and we have set J = 1. The full Hamiltonian is taken to be H full = λH SYK 4 + (1 − λ)H SYK 6 (5.9) The resulting density of states is plotted in fig.33 for various values of λ, where λ is an interpolating parameter.
Let us also note that a large N saddle point in terms of the G and Σ fields, [7], exists in this case. However solving for the values of the G, Σ fields analytically, at the saddle point, is difficult in general. When λ is small it is easy to see that the ψ 4 coupling is relevant in the conformal theory to which the ψ 6 model flows, thus at very low energies the behaviour will be governed by the ψ 4 fixed point, while at intermediate energies E J one will have the Schwarzian mode coupled to the ψ 4 perturbation. As λ is increased one expects that the model will flow from the UV more directly to the CFT governed by the ψ 4 coupling.
The left panel of Fig. 33 is a plot of the density of states for varying values of λ and the right hand panel is a plot where the density of states is plotted after rescaling the E axis and also ρ, as discussed above. We see from both plots that the density of states interpolates smoothly from the SYK 4 to the SYK 6 case as λ goes from 1 to 0. It is also worth noting that in the right hand panel, once the rescaling is carried out, the density of states at the edges (i.e., at low and high energies), for all values of λ = 0, agrees with the SYK 4 model. This connects with our comment above that the low-energy behaviour should be of the SYK 4 type. The discussion in this section is indicative of the rich set of behaviours one can get for various Gaussian ensembles, with varying degrees of randomness. A more definitive analysis is clearly called for, especially in the large N limit, and we hope to return to this in subsequent work, as was also mentioned above.

Summary and Discussion
The conventional SYK model, with ψ q couplings, is referred to as the nonlocal SYK model in this paper, due to the all-to-all nature of its fermionic couplings. This model and the Gaussian Unitary Ensemble (GUE) are both random systems, with some similarities. Their spectral form factors are similar and the level statistics, of nearest neighbour energy levels, in both follow the same Wigner distribution. However, there are some important difference as well. In the large N limit, the non-local SYK model has interesting lowenergy behaviour characterised by the breaking of time reparametrisation invariance which gives rise to a Schwarzian action. This is not true in the GUE.
The pattern of symmetry breaking for the SYK model determines much of its low energy behaviour. Its low-energy thermodynamics is governed by a saddle point resulting in a linear specific heat, and a big entropy, S 0 , scaling like N , eq. (2.17), eq. (2.18), close to ground state. And the OTOCs exhibit chaotic behaviour with a Lyapunov exponent saturating the chaos bound. These features are absent in the GUE. Many properties of the GUE, for L × L matrices, are tied to the U (L) invariance of the resulting probability distribution. This invariance results in hyperfast scrambling for operators and also for the OTOCs, which exhibit a saturation time of order the thermalisation time, see appendix B. The low-energy thermodynamics for the GUE is also quite different, see section 2.
The behaviour of the non-local SYK model is of considerable interest from the point of view of gravity, agreeing in particular with JT gravity and being akin, quite universally, to near-extremal black holes with an AdS 2 factor in their near horizon geometry, [7,11]. The differences in behaviour with the GUE prompt one to ask why these differences arise and more generally when can behaviour of interest from the point of view of gravity arise, as the randomness is varied 8 . Such questions were some of the central motivations behind this paper.
Both the GUE and the SYK model correspond to Gaussian Ensembles, see section 5. In the GUE, with L × L Hermitian matrices, H, all L 2 matrix elements are random variables drawn from independent Gaussian distributions with the same variance and no mean. Equivalently, one can expand a general matrix H in the basis of products of fermionic operators, eq. (1.6). The number of fermion flavours, N , needed is related to L by L = 2 N 2 . The total number of coupling constants in this fermionic basis is 2 N = L 2 , which is of course the same as the number of independent matrix elements in H, and for the GUE these couplings in the fermionic basis are also independent Gaussian random variables with equal variance and vanishing mean. In contrast, the non-local SYK model has fewer random variables. For the SYK q model, only operators with products of q fermions are present in the Hamiltonian, giving rise to N C q ∼ O(N q ) random variables, which is much smaller than L 2 .
It is quite clear that some of the difference between the two models arise because of this reduction in randomness. To investigate this further, we studied two kinds of models in this paper. First, starting with the GUE we reduced the number of random variables by setting some number of matrix elements to vanish, at random, and choosing the remaining non-vanishing elements from the same kind of Gaussian distribution as in the GUE. This gave rise to the sparse random matrix model studied in section 3. As the number of non-vanishing matrix elements, n, is reduced, we move away from the GUE and one can study the effects of reduced randomness. Second, at the other end with much reduced randomness, we considered the case of the local SYK model where only 4 adjacent fermions were coupled to each other through ψ 4 type couplings, unlike the non-local SYK 4 model where all fermions are coupled to each other in ψ 4 type couplings, section 4. This model has only O(N ) random couplings, which is even smaller than the O(N 4 ) couplings in the corresponding non-local SYK 4 model.
For the sparse random matrix model we found that as the number of random variables was reduced there was a marked change of behaviour when n reached a value n c , which was numerically found to scale like nc L 2 ∼ L −0.48 , for large L. This change of behaviour is tied to the reduction in eigenvalue repulsion, resulting in energy eigenvalues clumping in an increasingly pronounced manner near the origin (E = 0), fig. 8. At values of n < n c our analysis showed that the behaviour becomes quite different from the GUE and several properties begin to differ. The spectral form factor now showed evidence of attraction between eigenvalues; the level statistics began to depart from the Wigner statistics as n decreases and become more Poisson-like distributed; the ramp in the SFF is flattened and the OTOCs indicated a slower growth of the correlators. A key limitation of our study is that the numerical work could only be carried out at finite N and many of our conclusions could not be reliably extrapolated to the infinite L, N, limit. In particular, we could not investigate the change of behaviour near the transition at n c , in the L → ∞ limit. It would also be very interesting to know if for n < n c a Schwarzian mode arises in this limit.
For the local SYK model, studied in section 4, we found that the behaviour was quite different from the non-local SYK case. These differences could be seen in the spectral form factor, which has some similarities in fact with the sparse random matrix model case (for 1 n n c ); in the level statistics, which was Poisson and not Wigner; and also, in a qualitative sense, in the OTOCs.
It would be interesting to understand our results in greater depth, augmenting the numerical work presented here with additional analytical insights. As mentioned above, the GUE and the SYK models both belong to the class of Gaussian ensembles, where the Hamiltonian is determined by couplings drawn from Gaussian random distributions. Our investigation reveals that this class is in fact quite rich and such a detailed understanding, we think, will prove quite rewarding.
Some analytical results along the lines of our investigation are already known. It was shown in [13,14] that starting with the non-local SYK q model, if we set some of the ψ q couplings to vanish at random, the behaviour would continue to be of the SYK q type, as long as the total number of non-vanishing couplings scaled like N α , α > 1, in the large N limit. The local model we have studied has O(N ) couplings, which is smaller and lies outside this range. In contrast, the sparse random matrix model of section 3 exhibits a change of behaviour when the number of couplings is n c ∼ L 1.52 -this is much bigger than any power of N . The results of [14] can be extended to the case where q is not fixed and also diverges when N → ∞. One finds, as discussed in appendix A, that a G − Σ type of saddle point which arises in the non-local model, continues to arise here, with the same breaking of time reparametrisation invariance etc, as long as q < √ N and the total number of non-vanishing terms scales like N α , with α > 1. If we think of the GUE in the fermionic basis, eq. (1.4), when N → ∞, its Hamiltonian will be determined by the ψ q terms with q = N 2 , since these terms out-number all others. Thus we see that the sparse random matrix model of section 3 is related to the fixed q model of [14], with q now scaling linearly with N .
More generally, one could consider many other models in the class of Gaussian ensembles. For example, we can include a mean value as well as a variance in the normal distributions, see section 5 and [36]. Varying their ratio allows us to explore what happens as we go from ordered to disordered systems. We could also study such Gaussian models from the perspective of the breaking of the U (L) symmetry present in the GUE, see section 5. For example, one can have blocks in the random Hamiltonian of GUE type, and couple these blocks to each other with intermediate blocks that have reduced randomness, including of the SYK type. The resulting behaviour would then be quite rich. As far as the spread of entanglement goes for example, which could be measured through OTOCs, or the growth of operators, one would expect GUE like blocks to exhibit hyperfast scrambling, with t scrambling ∼ O(1), this would become somewhat slower t scrambling ∼ O(ln N ), in the non-local SYK blocks of the random matrix and then even slower, t scrambling ∼ O(N ), in the blocks where the couplings preserve spatial locality. Bottle neck for the spread of entanglement, may perhaps arise in highly ordered blocks. A detailed study along these lines is also left for the future.

A Sparse SYK
In this appendix, we discuss the sparse version of SYK following [14]. The sparse SYK model considered in [14] is a variant of the original SYK model where each of the q-fermi interaction are assigned a probability. The Hamiltonian for the model is given by where the variables j Q and x Q are independent random variables, with j Q being drawn from a Gaussian distribution and x Q is a boolean variable with a binary distribution. Note that in the notation above Q denotes a specific set of q flavour indices, {i 1 , i 2 , · · · i q }, and ψ Q denotes the product of the corresponding q fermions, ψ i 1 ψ i 2 · · · ψ iq . By the probability distribution we mean more precisely and The case of p = 1, σ = (q−1)! N q−1 would correspond to the usual SYK model, where N is the total number of fermion flavours. The variable x Q being a binary variable will be unity for some bonds and zero for others and so only some of the bonds are turned on in any instantiation. The total number of terms in the Hamiltonian, n, is when p = 1 For p < 1, the number of terms in any instantiation of the model is given by The disorder averaged path integral for the model is given by Doing the integral over j Q first, we get Now doing the average over the sparseness variable x Q , we get Now, if p is such that the average number of bonds turned on is for some α > 0, we have

A.1 q = fixed case
Let us first consider where q is fixed with large N limit. Then we get Let us consider the case where both p and σ small, i.e., For p → 0 in the large N limit with q fixed, we require Then we can exponentiate the term in the brackets in (A.8) and so we get In the second line, we assume σ → 0 and also that ψ Q ψ Q ∼ O(1). We will see the self-consistency of these assumption soon.
We now introduce the flavour dependent bi-local fields Σ Q (t, t ), G Q (t, t ) and follow the same manipulations as in the case of usual SYK. The path integral then becomes where the symbol Q is the summation symbol with sum being taken over all sets Q of the form in eq. (A.1) and in the second line above we have used the notation that For a moment let us assume that there exists a solution where all flavour two point functions are same, i.e., The equation of motion obtained by varying with respect to Σ and G are then given by First, the way it works in the conventional SYK model is as follows. The parameters for the conventional SYK model are Then it is clear that eq. (A.20) becomes which gives exactly the conventional SYK saddle point, where G ∼ O(1) in terms of N counting.
However it must be clear now that the way to obtain conventional SYK saddle point, therefore G ∼ O(1), is not only (A.21), but more generically possible as long as is satisfied. The condition of p → 0 and σ → 0 is due to the consistency with (A.12). With (A.11), this is equivalent to in the large N limit. In this case, the number of random variables, n, is n = N α with α > 1. Thus, we conclude that even in sparse case, as long as (A.24) is satisfied, we end up with the same saddle point equation and therefore solution as in the conventional SYK, (A.22). It is also worth mentioning, as we see from eq.(A.24), that as the sparseness increases, i.e., as α becomes smaller, the variance σ increases.
On the other hand, for α = 1 case, the model has become sufficiently sparse that the effective number of terms is of O(N ). In this case, the approximation we used in (A.15) breaks down.The action then becomes It is straightforward to see the modified equation of motion obtained by varying G in this case Another way to see the conclusion in eq. (A.24) is to compute tr(H 2 ) , which is given by For the saddle point solutions to exist we require tr(H 2 ) ∼ J 2 N and so we find Thus, we reach the conclusion that α > 1 in this viewpoint.
Note that all the manipulations above are done for a fixed q. Another interesting limit to consider is when q also becomes large. This can be done in two ways. First, we can take N → ∞ followed by q → ∞. Alternatively, we can consider q ∼ N a , 1 > a > 0, so that q 1 a /N is fixed as N → ∞. In both these cases, the above manipulations have to be redone carefully.

A.2 Large q limit
In the previous subsection, we focussed on the case of finite q and large N . Another interesting regime to consider is when both q and N are large. Let us first analyze the case of large q but with N taken to infinity first without introducing any sparseness. This part will be a review of [7]. The equations of motion for both G and Σ fields are In momentum space this becomes In the large q limit, we can take the ansatz for G to be given by so that we can do an expansion in 1 q as The fourier transform of the above is given by The crucial point here is that, usually in the analysis at fixed q, low energies correspond to small ω and so the explicit ω dependence in eq. (A.30) is dropped in solving the saddle point equation. However, in the large q limit, even in the regime of low energies, we should not drop this term but instead do a perturbation around the free theory result as shown in eq. (A.32).
Using eq. (A.33) in eq. (A.30), it is easy to see that We can self-consistently assume that in terms of q, N counting g(t We can choose σ to cancel all factors of N and q above, so that g(t) can be obtained to be of order unity. We take, In this analysis, we have only assumed that both q and N are large but have not assumed anything about q/N . As we shall argue later, the saddle point analysis breaks when q ∼ O( √ N ).
Before we discuss about the breakdown of saddle point, let us generalize the above discussion to the case where we introduce sparseness as mentioned in the previous section for the case of finite q by introducing sparseness variables x Q . The following discussion is also applicable to the case when q ∼ N a , 1 2 > a > 0 (A. 38) with the upper limit on ρ due to the breakdown of the saddle point estimate.
In fact, both these generalizations can be carried out together which is what we shall present now. The steps leading to eq. (A. 19) are valid even when q is large as long as As before, equating eq. (A.40) and eq. (A.34) we get Thus, for a good large N and large q limit to exist, we now have to take σ to be given by where we used N C q p = N α . Naively it may appear that σ is large in the large q and large N if q ∼ N a , a > 0 due to the factor of 2 q−1 . However this factor is cancelled by a similar factor coming from G q in eq. (A.39). From eq. (A.42) and using eq. (A.38), we see that eq. (A.39) is satisfied as long as Thus, we see that eq. (A.39) is satisfied even if the total effective number of terms is linear in N , i.e., α = 1, if q is large. This is to be contrasted to the case of finite q where we required α > 1, see eq. (A.24).
However, we can argue that we cannot make q arbitrarily large as the saddle point approximation will break down for q ∼ O( √ N ). One way to understand this is to compute the fluctuations around the saddle point and see when they become important. To see this consider the expansion of the fields around the saddle Expanding the action in eq. (A.19) by considering the fluctuations above, and noting the expansions where A is any matrix in the last line above and is a small parameter in which the expansion is done. In our case Tr corresponds to the trace in time coordinates t, t , the arguments of the functions G, Σ. We find that the quadratic and cubic terms in fluctuations have the schematic form (ignoring the time integrals and traces) in terms of N and q counting given by eq. (A.31),(A.35) and that σ is given by eq. (A.42), we immediately see from the quadratic terms in the fluctuations that Using these estimates in the cubic terms, we find that the cubic terms are suppressed compared to the quadratic terms if and only if So, precisely when q 2 ∼ O(N ), the saddle point estimates fail. In these cases there are other methods such as Chord diagram techniques [28] that can used to solve the model in such a limit. We shall not delve further into these in this appendix.

B Four Point function in RMT
In this section, we shall elaborate on the four point function calculations in GUE. To begin with, we illustrate the calculation in detail for the case of 2 point function which can then be straightforwardly generalized to four point function. Let us consider the averaged two point function of two operators A, B, with the averaging done over the Haar ensemble. This is given by where in obtaining the third line we used the fact that the Hamiltonian H and its measure are invariant under H → U HU † . We now use the identity where L is the rank of the matrix. Using this in eq. (B.1) it is easy to see that we get Further, for simplicity, if we consider operators such that Tr(A(0)) = 0 = Tr(B(0)), then we get To leading order in large L, the quantity ZZ H goes like O(L 2 ) and thus gives the dominant contribution to the two point function above. So, in the large L limit, we have We now turn to the 4-pt function. Since we are mainly interested in the OTOCs, we consider a 4-pt function of the form The analog of the identity eq. (B.2) we now need invoves four U s and same number of U † s. This identity is very cumbersome and so we shall not write it out explicitly here. Again, assuming all one point functions are zero, we find that, in the large L limit, the leading term in the four point function is given by The discussion so far pertained to the case of β = 0. We can generalize these results for the case of finite β as follows. The appropriate 4-pt function we for the finite temperature case is of the form yA(t)yB(0)yC(t)yD(0) , y = e − βH 4 (B.9) The analog of eq. (B.7) now becomes Thus, in the limit of large L, the 4-pt function essentially is given by eq. (B.8) with the simple modification that now Z = Tr(e −iHt ),Z = Tr(e iHt * ) (B.12) witht,t * as defined in eq. (B.11). Thus, the 4-pt function in the limit of large L is given by Tr(A(0)B(0)C(0)D(0)) (B.13)

C Systematic Sparseness
In this appendix, we consider a different way of sparsing a random matrix. In this version of constructing a sparse random matrix, we start by picking a positive integer p ∈ [1, L]. We then take M ij to have the following form, More precisely: the GUE, eq.(2.1), corresponds to the partition function where Z 0 is some normalization factor. This shows that the matrix elements are independent Gaussian random variables with the diagonal elements M ii having variance 1 L ; among the off-diagonal elements M ij , with i < j, can be taken to be the independent variables, with M ij for i > j being determined by the condition M † = M . R(M ij ), I(M ij ) denote the real and imaginary parts of M ij and these have variance 1 2L . Now in the systematic randomness case the matrix elements which are non-zero, M ij , with i − j = 0 mod p, eq. (C.1) are drawn in the same manner as independent Gaussian Random variables; i.e., the diagonal elements M ii are taken to have variance 1 L and among the off diagonal elements, M ij with i − j = 0 mod p that are non-zero, with M ij , i < j, being the independent ones and the variance for R(M ij ), I(M ij ) being 1 2L . Let us now consider a few special cases. For p = 1, the random matrix is fully random, i.e., all the elements of the matrix are non-zero random elements. For p = L, the only non-zero elements are the diagonal elements. More generally, let L = p q, then the matrix breaks up into p block diagonal sub-matrices 9 , each of rank q. The first block has the matrix elements M ij with i and j both being equal to 1 mod p. The second block has the matrix elements M ij , with i, j = 2 mod p, etc. The total number of (real) random elements is therefore equal to q 2 p = L 2 /p. It is also easy to see that each block is a GUE of q × q matrices with variance being L.
Thus we see that the amount of sparseness can be controlled by varying the value of p with only a fraction 1/p of the L 2 total matrix elements being present.
Due to the block diagonal nature, with each block being a GUE, most of the properties of this ensemble easily follow from those of the GUE as we will see below in Fig 34 and 35. The above is a plot of the rescaled density of states for various values of p. It is evident from the above plot that the except for highly sparse matrix, the case of p = 576 in the plot above when only L 2 p random elements are kept, the rest of the cases fit well with a Wigner distribution. For the case of p = 576, the size of each block is L p ∼ 7 and hence the wiggles around the Wigner distribution. The case of p = 2 is shown explicitly in the right panel where it can be seen that the two blocks have identical distribution, each contributing half to the total distribution, as expected. The above plot shows the level spacing distribution for the systematically sparsed random matrix. The left panel is the level spacing distribution for unfolded eigenvalues for the full random matrix and for varying values of p. As can be seen from the plots, for p > 1, when the full matrix splits in to blocks, the level spacing distribution is no longer follows Wigner surmise. As p increases, the distribution moves from Wigner Surmise to a Poisson distribution. This is just because of the fact that as the value of p increases, the full matrix splits into more number of blocks each of which are uncorrelated. So, increasing p corresponds to decreasing correlations between the eigenvalues of the full matrix. Each of the sub blocks in each matrix are still completely correlated as can be seen from the right panel. The mixing of the blocks leads to the distribution becoming more Poisson-like.
We shall now calculate the 2-pt function for the case of sparse random matrix corresponding to systematic sparseness. As already mentioned above the hamiltonian for the systematically sparse random matrix becomes a block diagonal matrix. In particular, the unitary matrices which preserve this block diagonal form the symmetry group for such Hamiltonians. So, we shall compute the averaged 2-pt function in a manner similar to the previous section B but with the understanding that the corresponding unitary matrices are also block diagonal to preserve block diagonal nature of the Hamiltonian. Let the Hamiltonian be a block diagonal matrix with p blocks each of size q. Any matrix element of a block diagonal unitary matrix will be of the form U a a = 0 iff a = αq + i, a = αq + i , 1 ≤ α ≤ p, 1 ≤ i, i ≤ q. (C. 3) The expression for the two point function is still given by the expression in eq. (B.1). The operators A and B need not be block diagonal. The measure dHdU in eq. (B.1) means that where the quantities with subscripts ρ, κ indicate the corresponding blocks. Also, since we are assuming A, B are non block diagonal, the product U † AU will be of the form where in the second line the quantities with the subscripts corresponds to the appropriate blocks. A similar expression can be written for operator involving B. The two point function eq. (B.1) written explicity in terms of blocks reads dH κ e −iHαt l i e iHρt j k δ ασ δ βρ (A αβ (0)) i j (B ρσ (0)) k where we used a = αq + i, b = βq + j, c = ρq + k, d = σq + l with 1 ≤ α, β, ρ, σ ≤ p, 1 ≤ i, j, k, l ≤ q (C.7) We now need the analog of the identity eq. (B.2). It is given by δ αβρσ q 2 − 1 δ i j δ j i δ k l δ l k + δ i l δ k j δ l i δ j k − 1 q (δ i j δ j k δ k l δ l i + δ j i δ k j δ l k δ i l ) + 1 − δ αβρδ q 2 δ αβ δ ρσ δ i j δ j i δ k l δ l k + δ ασ δ ρβ δ i l δ l i δ j k δ k j (C.8) The delta function with four indices means that it gives unity only if block diagonal indices are such that α = β = ρ = σ and zero otherwise. Using the identity eq. (C.8) in eq. (C.6) and contracting the indices appropriately, we get A(t)B(0) Sys = α 1 q 2 − 1 q − 1 q Z αZα Tr(A αα (0))Tr(B αα (0)) + (Z αZα − 1)Tr(A αα (0)B αα (0)) + α,β (1 − δ αβ ) q 2 Z αZβ Tr(A αβ (0)B βα (0)) (C.9) where we have explicitly written the summation to make it clear and where Z α = Tr(e −iHαt ),Z α = Tr(e iHαt ). The first term is the usual term that one get if the operators A, B are also block diagonal. The second term is an extra contribution that arises due to the off diagonal blocks in A, B.
The four-point can also be generalized in a manner analogous to the two point function but involves many more terms. We do not explicitly write it here except to note that the leading term in the OTOC will also be given by eq. (B.8).

D Wigner surmise
In this appendix we outline the derivation of the distribution of the ratio of nearest neighbour eigenvalues, eq. (3.9). For Wigner Surmise p(r) is given by p(r) = K (r + r 2 ) β A quick derivation of the above result for the case of 3 × 3 matrix is shown below.

E Error Analysis for non-local SYK OTOC fit
In this appendix we present a detailed error analysis we have done for the case of OTOCs. We have taken non-local SYK OTOC as the set up to perform the error analysis since we already know the Lyapunov exponent is the limit of large N saturates the Chaos bound value of 2πT .
The main point of doing the error analysis is to estimate the maximum time till which the growth of the OTOC can be approximated by an exponential. Starting with the smallest values of time, the exponential fit will be good till a certain time beyond which the exponential approximation to the OTOC will not be good and we are interested in estimating this critical time. We will denote this critical time as t c below.
The error analysis is carried out using the Monte Carlo method. Given that our numerical data is a discrete set as a function of time, we first fit our data to the the function F th = A + Be λt , 0 < t < t max (E.1) for various values of t max . This is done using the best fit package of Python (scipy.optimize.curve f it). The python output will be the values of A, B, λ for the best fit case along with the associated covariance matrix σ αβ , of which we only show the diagonal elements corresponding to σ AA , σ BB , σ λλ denoted by σ A , σ B , σ λ in the table. This is the starting point for the analysis. We explain the methods for a single value of β. We start by choosing a value of t max , the maximum time till which we fit the exponential. Let y = {A, B, λ} and σ y = {σ A , σ B , σ λ }.

E.1 Monte Carlo Simulation
This method is described in [34] Chapter 15.6.1, 'Monte Carlo Simulation of Synthetic Data Sets'. We will also explain it, to some extent, below.
From the quantities y = {A, B, λ} and σ y = {σ A , σ B , σ λ }, mentioned above, we can build three independent normal distribution of 5000 elements, around mean y i with variance σ 2 y i . So in total we have 5000 3 elements of type {A i , B j , λ k } where A i ∈ N (A, σ A ), B j ∈ N (B, σ B ), λ k ∈ N (λ, σ λ ). We retain only those data points which are within 1σ deviation from the mean value. Next we fix a time point (t n ) and generate the corresponding OTOC value using eq. (E.1) for F th (t n , A i , B j , λ k ). Around each t n we now have a total of ≈ 5000 3 such points. We take this distribution of ≈ 5000 3 points and compute the σ for that distribution. This σ can be denoted by σ tn , we use σ n . Now we use this to compute the value of χ 2 using the formula (6.14.38), & (15.1.6) in [34] χ 2 0 (t max ) = tn≤tmax (F numerical (t n ) − F th (t n , A, B, λ)) 2 σ n (E.2) then compute P (χ 2 > χ 2 0 ) given by We repeat this for various values of t max . The one with lowest value of P will give the best fit and we set t c equal to this value of t max . Once we have fixed t c then our best estimate for the Lyapunov exponent is the corresponding fitted parameter(λ) from Python.
In fact this more systematic way of obtaining t c and λ agrees with the more naive way of fitting "with the eye". This justify our choosing of t max for fitting the data naively with "the eye". The error in λ for the value of β under consideration corresponds to the standard deviation obtained from the best fit till t max . From the data shown below in fig.36, we see that for β = 20, 30, 50, correspondingly we get t max = 4, 5.5, 7. Thus for smaller β, the exponential fit is only good till small value of t.

F Lyapunov Exponent for Local SYK
In the local SYK section 4, we have discussed about OTOCs in Local SYK model forḠ i,i+1 andḠ i,i+2 , see fig. 26. We had argued there that there is no good large N limit that suppresses the connected part compared to the disconnected part. However, the plots for OTOCs are qualitatively similar to the case of Non-local SYK and so we can try to extract a Lyapunov exponent. This is shown in the plots below. In fig. 38, the blue curves are for fitting the Lyapunov exponents at different β with a functional form λ = A β + B. Black curves are for fitting with λ = A β . The green error bars correspond to 1σ error obtained from fitting the OTOCs with eq. (E.1). Within our numerical limitations we see that both fitting gives value of A to be lesser than 2π.