Speeding up the spread of quantum information in chaotic systems

We explore the effect of introducing mild nonlocality into otherwise local, chaotic quantum systems, on the rate of information spreading and associated rates of entanglement generation and operator growth. We consider various forms of nonlocality, both in 1-dimensional spin chain models and in holographic gauge theories, comparing the phenomenology of each. Generically, increasing the level of nonlocality increases the rate of information spreading, but in lattice models we find instances where these rates are slightly suppressed.


Introduction
The dynamics of entanglement generation and the growth of local Heisenberg operators in chaotic quantum systems have been subjects of intense investigation in recent years, bringing together the interests of the condensed matter, quantum gravity, and quantum information theory communities. Generation of entanglement is one of the primary features characterizing the thermalization process for out-of-equilibrium states in a wide variety of chaotic systems. Operator spreading, as defined through various operator entanglement measures and through out-of-time-order correlators (OTOCs), has likewise been used to diagnose chaos in an equally diverse set of systems. In this work we focus on the related notion of information spreading, which is linked to both phenomena.
In any local chaotic quantum system, information that is initially localized will spread, such that its full recovery requires knowledge of an increasing number of degrees of freedom around the location of origin. Suppose we wish to track the spread of information that is initially localized on some test subsystem T . The smallest region/subsystem at any time which can be used to fully reconstruct the initial state of T constitutes the "information cone" of T . At late times in homogeneous systems, the boundary of this cone grows linearly at a rate dubbed the information velocity, v I [31]. It was argued in [31] that this velocity is limited both by the rate of entanglement generation and by the speed of local operator spreading. These disparate phenomena are therefore both brought to bear on a single physical question of how rapidly information delocalizes. This question is itself rather fundamental, inducing a form of causal structure for thermal-scale operators on the theory which may be more stringent than the ultimate relativistic bounds, and which is present even in nonrelativistic quantum systems.
The effect of nonlocality on entanglement growth and on various diagnostics of chaos has been previously explored in holographic theories [32]. Interestingly, introducing nonlocality into the system was found to enhance scrambling as well as entanglement generation, eluding previous bounds proposed in the context of local quantum field theory. Further studies support this idea, including tantalizing results on the speed-up of thermalization, dissipation and complexification rates [33][34][35] due to nonlocal effects. In this work, we extend these studies to ask in more detail how the presence of nonlocality affects the rate of information spreading. The forms of nonlocality we will consider are "mild," in the sense that there is always a finite length scale associated with nonlocal interactions, and we consider regions larger than this length scale. Thus, upon coarse-graining over short distances we can always recover standard local physics. This mild nonlocality is in contrast, for example, to matrix models, or to lattice systems with all-to-all couplings which, in the extreme, could render any notion of local subsystems and neighborhoods irrelevant.
In this paper we will consider 1-dimensional spin chain systems as well as holographic gauge theories with various forms of mild nonlocality. Even though the presence of all-to-all interactions seems ubiquitous and necessary for fast scrambling [36], there are examples of systems that exhibit this phenomenon regardless of the absence of the former [37]. These are spin chains with a specified combination of some sectors, including a subset of next-to-nearest neighbor interactions. This form of mild nonlocality can in fact be realized experimentally. Known examples include materials with high polarizability which, in the presence of a strong electromagnetic field, mimic the presence of such couplings due to the alignment of their molecular dipoles. We believe that the results of this study could thence be important for emerging quantum technologies such as quantum computation and quantum communications.
We begin the work with a review of related concepts and techniques in section 2. We pay particular attention to the effective description of information spreading that arises from coarse-graining over short distances, a.k.a. the membrane or hydrodynamic theory, developed in [38] and adapted to holographic settings in [39]. Section 3 asks how nonlocality affects information spreading in chaotic, 1-dimensional spin chain systems, and discusses our numerical results for spin chains with next-nearest neighbor interactions and generalizations. Section 4 then poses the same questions in holographic systems inspired by the nonlocal spin chains considered previously. These include non-commutative SYM theory (NCSYM) and dipole-deformed SYM theory (DDSYM), both of which include fundamental dipole-type interactions. Even though NCSYM was already studied in [32], the methods employed there (shock waves) are technically different from those used in this paper (membrane theory). As expected, the results agree with each other. This provides evidence that the hydrodynamic description is valid for systems with mild nonlocality such as the ones considered in this paper. We conclude in section 5 with a comparative discussion of our results and future directions.

Review of concepts and methods
Before turning to consider the effects of nonlocality, we here summarize the various terms, concepts, and techniques employed in the investigation. We begin by reviewing two other characteristic quantities, the entanglement velocity v E and the butterfly velocity v B , and their relationship to the information velocity v I . We describe our method of numerically tracking the information cone and computing v I in chaotic 1-D spin chains. We then provide a review of the membrane theory of entanglement dynamics [38,39], and introduce the membrane tension function ε(v), which encodes information about the aforementioned quantities. We review its origin, use, and calculation, both in spin chains and in holographic contexts.

Three characteristic speeds:
In addition to our primary quantity of interest, the information velocity v_I, we will frequently refer to two other velocities characteristic of local quantum chaotic systems: the entanglement velocity v_E and the butterfly velocity v_B. We review these here for future reference. The former, v_E, is not a velocity in the strict sense, but a quantity that controls the rate of entanglement growth in thermalizing states [40][41][42]. For large regions and early times (but much larger than local thermal scales) after a homogeneous quench, the entropy of a region A grows linearly according to
$$S[A(t)] = s_{th}\, v_E\, |\partial A|\, t + \ldots \qquad (2.1)$$
where s_th is the thermal entropy density, and |∂A| denotes the area of the boundary of A. This linear growth is explained heuristically in terms of an emergent "entanglement tsunami", which is valid in the hydrodynamic regime. 1 For strip-like regions, this linear growth regime persists until saturation. The ellipses include an initial entropy density, as well as sub-extensive contributions which we ignore as subleading. Some dependencies are suppressed in this expression: both v_E and s_th may depend on the energy and charge density (or equivalently, the inverse temperature and chemical potential) of the state. Additionally, while v_E is usually defined in the context of entropy growth from an unentangled state (or a vacuum state, with S[A(t)] taken to be the vacuum-subtracted entropy), it can more generally depend on details of the initial state. For instance, in this work we will consider initial states with a uniform entropy density greater than zero, but less than the thermal value, referring to f ≡ s_initial/s_thermal as the initial entanglement fraction of the state. In this case, v_E depends on f as well.
The second velocity, v_B [1,2], is associated with the spread of local perturbations, as diagnosed through the growth of local Heisenberg operators. The thermal expectation value of the squared commutator of one simple local operator and another, displaced in space and time, e.g. $\langle [W(x,t), V(y,0)]^2 \rangle_\beta$, vanishes as |x − y| → ∞, but becomes non-negligible over a displacement range that grows with time. The region over which the expectation value of the square of this commutator is O(1) serves to define the butterfly cone. After initial transient behavior, the boundaries of such a cone expand linearly at a rate v_B. Such cones map the regions of influence of perturbations by local operators. 2
In [31] it was argued that the information velocity v_I can be understood through an argument that relates it to both v_E and v_B. The argument follows a variation of the Hayden and Preskill protocol of information recovery [45], and states that the information cone of a test site T can be identified, at any given time, by finding the largest region centered on the test site which satisfies two conditions. First, it must have reached entanglement saturation, and second, it must be within the butterfly cone of the test site T. The former condition indicates that the active degrees of freedom in the region are maximally entangled with degrees of freedom in the complement system. The latter serves as an indicator that the region is sufficiently scrambled. 3 When both criteria are met by a region R, a Hayden-Preskill style protocol for information recovery implies that the complement region R̄, and not R, can recover the initial data of T. 4 Considering larger and larger regions centered on T at a fixed time until one of the two criteria fails identifies the smallest region R_I that can be used to reconstruct the initial data of T. This discussion assumes a sharp transition between regions of recovery, R and R̄. In reality, the "edge" of the information cone may be smoothed; however, we assume this does not affect the asymptotic scaling of the growth of R_I, and therefore v_I.
The above discussion alone indicates that either the scrambling condition or the entanglement condition could act as a bottleneck to the spread of information. However, in all systems considered, the condition of entanglement saturation always sets the more stringent constraint, at least in the large-region, late-time limit. 5 Therefore we henceforth take for granted that v_I can be identified by finding the growth of the largest region that has just reached entanglement saturation.

2 Aside from choosing operators that do not represent locally conserved charge, the cone is largely insensitive to the choice of local operators.
3 The precise requirement for "scrambling" is not rigorously identified. In the quantum information argument by Hayden and Preskill, the scrambling operation corresponds to a Haar-averaging over unitaries, though this is presumed excessive. In the variation of [31], it is assumed that degrees of freedom within the butterfly cone are sufficiently scrambled (for purposes of the argument), because all local operators at T can be replaced by truncated operators with support only within the (growing) butterfly cone, and within this region the chaotic dynamics is presumed sufficiently close to a "random" unitary.
4 Actually, the argument implies that the complement region R̄, plus any small region ∆R from within R, with Hilbert space dimension at least greater than that of T, can recover the information. In our large region limit we assume |R| >> |T| and treat the difference between the size of R and R − ∆R as negligible.
5 This statement is related to several known bounds in the literature. In [40,46] it was proven that v_E ≤ c in relativistic systems and then by a causality argument that t_sat ≥ r_ins/c, where r_ins is the radius of the largest
For an arbitrary region A, the time to entanglement saturation is not solely controlled by v E , but in cases where linear growth persists until saturation, in the scaling limit it is simply given by [31,38,47], this entails that for strip-shaped regions, v I interpolates between v E at f = 0 and v B at f = 1. Finding v I for strips across all values of f thereby also provides information about the rates controlling entanglement growth and operator spreading in the system. Computing v I (f ) for strip-like regions will therefore be our primary focus, leaving extensions to other shapes and states to future work.
The fact that v I is simply related to the largest region to reach entanglement saturation also allows us to relate v I directly to the membrane tension function, a quantity described in the section 2.3. In spin chain systems, we can only roughly approximate the membrane tension function due to computational limitations and finite size effects. Hence, we are better served by computing v I directly through a method explained in section 2.2. Holographic systems, on the other hand, are particularly amenable to the membrane method. In these systems we compute v I by relating it to the membrane tension function, as explained in section 2.3.2.

Tracking the information cone in 1-D spin chains
For the purpose of identifying the information cone in a local quantum chaotic system 6 , we imagine that the degrees of freedom on a local test site, T , are initially maximally entangled with a reference system R of equal or greater Hilbert space dimension. The reference R is thereafter completely decoupled from the system dynamics. Initially, R has maximal mutual information with the test degrees of freedom, but as the system evolves, the mutual information between R and T decreases as data from T scrambles into the rest of the system. The information cone can be identified by tracking the smallest region around T that retains maximal mutual information with the reference system. Any other initial state at T (besides the maximally mixed state used in this procedure) could, in principle, be inferred from data within this region [50].
For a 1-dimensional chaotic spin chain, we instantiate a version of this setup as follows (this is a slight modification of a procedure employed in [31]): We consider L-qubit systems with nearest-neighbor σ^z σ^z couplings, with longitudinal and transverse magnetic field terms chosen to render the Hamiltonian chaotic:
where J_zz = 1, h_x = 1.05, h_z = 0.5. Variations of this Hamiltonian have been widely used in the literature to study thermalization. The superscript on H^(0)_L merely indicates that this serves as our "base" Hamiltonian, to which we will later add various nonlocal couplings. However, the nonlocal couplings are irrelevant for the demonstrations of this section. We also define the following set of states: with the matrix a given by These states allow us to initialize a uniform "volume law" entanglement fraction f, such that the entanglement entropy of a subsystem of N_sub < L/2 neighboring qubits is f N_sub. The f dependence is implicitly related through (2). For all f, these states are at the center of the energy spectrum, in that ⟨ψ|H_L|ψ⟩ = 0. An additional Bell pair is then appended to the system. One member of the pair is dynamically coupled to the system while the other remains decoupled. These are the "test" and "reference" qubits respectively.

5 (continued) ball that fits in the region being considered. Then in [47,48] it was shown that v_E ≤ v_LC and t_sat ≥ r_ins/v_LC, where v_LC is an effective light cone speed, usually assumed equal to v_B (though not always, see [10]).
6 Controlled information transport in non-chaotic spin chains is another interesting vein of research. See, for example, [49] and references therein.
The system is then evolved with the L + 1 qubit Hamiltonian (2.2), treating the test qubit as the first in an L + 1 qubit spin chain. The reference and test qubits initially have maximal mutual information, but as the information associated with the test qubit scrambles into the rest of the system, an ever-growing number of nearby qubits are required to recover the mutual information with the reference 7 . The growth of the smallest subsystem necessary to achieve this defines the growth of the information cone. Our Hamiltonian and states have a quasi-translation invariance, voided by the existence of the boundaries, or by the fact that we are computationally restricted to finite L. We therefore track the cone across just under half the system size, extracting v I as a linear fit on the cone in this region. Most of our results are for systems with total L = 26. In fitting v I , we exclude the initial and final three subsystems of the half-system to minimize finite-time and edge effects. We implement time evolution with a Krylov subspace method to exponentiate the Hamiltonian. This code was written in C, and based on the well known Fortran package, Expokit [51], where modifications were made to improve efficiency and parallelize the computation. Each time evolution and entropy computation for a particular Hamiltonian was run on a single node across 44 cores on TACC's Stampede2 computer. The results were verified for smaller L using the Dynamite [52] and Qutip [53,54] python packages as well as a Mathematica [55] implementation. Figure 1 shows examples of information cones from states initialized with varying degrees of uniform entanglement, f . The information velocity is plotted as a function of initial entanglement fraction. As f → 1, v I is expected to approach v B . 
As a consistency check, we computed v B independently by tracking the growth of squared commutators of simple operators, finding values in approximate agreement, considering finite size effects in both computations.
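The tracking procedure of this subsection can be illustrated on a very small chain. The following is a minimal sketch, not the C/Expokit code described above: it uses exact diagonalization with NumPy, assumes an overall plus sign for every term of the mixed-field Ising Hamiltonian, uses a y-polarized product state as a hypothetical stand-in for the f = 0 state, and takes an arbitrary 5% tolerance for "maximal" mutual information. The names `chain_hamiltonian`, `evolve`, `mutual_info` and `cone_size` are illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def site_op(op, i, n):
    """Embed a single-qubit operator at site i of an n-qubit chain."""
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, op if k == i else I2)
    return out


def chain_hamiltonian(n, jzz=1.0, hx=1.05, hz=0.5):
    """Open mixed-field Ising chain; the overall signs are an assumption."""
    h = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for i in range(n):
        h += hx * site_op(SX, i, n) + hz * site_op(SZ, i, n)
        if i + 1 < n:
            h += jzz * site_op(SZ, i, n) @ site_op(SZ, i + 1, n)
    return h


def entropy(psi, keep, n_total):
    """Von Neumann entropy (in nats) of the qubits in `keep` for a pure state."""
    keep = list(keep)
    rest = [q for q in range(n_total) if q not in keep]
    t = psi.reshape((2,) * n_total).transpose(keep + rest)
    p = np.linalg.svd(t.reshape(2 ** len(keep), -1), compute_uv=False) ** 2
    p = p[p > 1e-14]
    return float(-np.sum(p * np.log(p)))


def mutual_info(psi, k, n_total):
    """I(reference : first k chain qubits); qubit 0 is the reference."""
    a = list(range(1, k + 1))
    return (entropy(psi, [0], n_total) + entropy(psi, a, n_total)
            - entropy(psi, [0] + a, n_total))


def evolve(n_chain, t):
    """Bell pair (reference, test) times a y-polarized product state, at time t."""
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # (reference, test)
    plus_y = np.array([1, 1j], dtype=complex) / np.sqrt(2)     # assumed f = 0 state
    rest = np.array([1.0 + 0j])
    for _ in range(n_chain - 1):
        rest = np.kron(rest, plus_y)
    psi0 = np.kron(bell, rest).reshape(2, 2 ** n_chain)        # (reference, chain)
    evals, evecs = np.linalg.eigh(chain_hamiltonian(n_chain))
    u = (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T     # acts on the chain only
    return (psi0 @ u.T).reshape(-1)


def cone_size(psi, n_chain, tol=0.05):
    """Smallest k whose mutual information with the reference is within tol of 2 ln 2."""
    for k in range(1, n_chain + 1):
        if mutual_info(psi, k, n_chain + 1) >= 2 * np.log(2) * (1 - tol):
            return k
    return n_chain


n_chain = 6  # the test qubit is the first site of the chain
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, cone_size(evolve(n_chain, t), n_chain))
```

At such small sizes the output only shows the cone leaving the test site; extracting v_I as a linear fit requires the much longer chains discussed above.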

Membrane theory of entanglement dynamics
An interesting quantity which can serve to simultaneously characterize aspects of entanglement growth, operator dynamics, and information spreading in local quantum chaotic systems is the membrane tension function, E(v) [38]. The membrane model of entanglement dynamics emerges in a coarse-grained, hydrodynamic type approximation of such systems. It computes the entanglement entropy of a state via an integrated energy density along a timelike membrane stretching through an auxiliary Minkowski spacetime, analogous to a minimal cut through a tensor network or quantum circuit representing unitary time evolution. The membrane method has been applied with great success to holographic systems [39,56,57], where the membrane is merely a projection of bulk extremal surfaces, with the bulk dimension integrated out. In this section, we review the origin of the membrane method in a thermodynamic limit of chaotic quantum systems. We review a method of approximating E(v) in 1-D spin chain systems which we will later employ in section 3.3. We then discuss its derivation and use in the context of the AdS/CFT correspondence as well as its extension to more general gauge/gravity dualities. This section is largely a review of work appearing in [38] and [39], though we generalize the techniques to theories with inherent nonlocality.

Membrane theory for spin chains
To motivate the membrane method, we restrict ourselves to states in local chaotic quantum systems exhibiting a "volume law" entanglement pattern, such that the entanglement entropy of a subregion is simply proportional to the volume of the subregion (plus subleading corrections). For simplicity we also first consider 1-dimensional systems, though the generalization to higher dimensions is immediate. In a system of total length L, pure states partitioned by a cut at position x exhibit, at equilibrium, a bipartite entanglement entropy of
$$S(x) = s_{eq} \min(x,\, L - x),$$
where s_eq is the equilibrium entanglement density. The leading, coarse-grained dynamics of out-of-equilibrium states can be captured in a hydrodynamic description, governed by
$$\frac{\partial S}{\partial t} = s_{eq}\, \Gamma\!\left(\frac{\partial S}{\partial x}\right). \qquad (2.7)$$
Here Γ(∂S/∂x) = Γ(s) is the entropy production rate. In general this function could depend explicitly on position or on any locally conserved charge densities, though we consider the simplest, spatially uniform case. Physical considerations immediately put some constraints on Γ(s). At saturation, entanglement growth vanishes, ∂S/∂t = 0, implying that Γ(s_eq) = Γ(−s_eq) = 0. Considering the growth of entanglement from a completely unentangled state leads to Γ(0) = v_E. The entropy production rate is only defined over −s_eq ≤ s ≤ s_eq, and we assume a smooth, symmetric function with Γ''(s) < 0 over this domain.
Next suppose that there is an equivalent representation of the entropy dynamics, staged on an auxiliary Minkowski spacetime associated with the time evolution operator, where the entanglement entropy is computed by minimizing the integrated local "energy" along a timelike membrane stretching from the initial to final time and obeying certain boundary conditions. The tension function E(v) specifies the energy density along this membrane as a function of its orientation in the spacetime (its local "velocity"). For the one-dimensional translation invariant system described above, the entropy at position S(x, t) would be obtained by the following minimization (see figure 2): Differentiating with respect to time and comparing to equation (2.7) entails that E(v) can be directly related to Γ(s) (and vice versa) by Legendre transformation: From the properties of Γ(s), we can infer related properties of ε(v). In particular, considering Γ(s eq ) = 0 leads to v max = s eq Γ (s eq ). In the tensor network or circuit picture, the maximum velocity v max serves as an effective lightcone velocity, which in known cases is equal to the butterfly velocity v B . Results in the systems we consider are consistent with this expectation, and we assume it throughout. In conjunction with the other characteristics of Γ(s) and properties of the Legendre transform, in the range The definition of ε(v) can be extended outside this domain, but this is the region of dynamical interest. It is evident that the membrane tension function encodes a lot of information characterizing the system, including not only entanglement dynamics (v E ), but also operator growth (v B ) and indirectly, information spreading (v I ). For this reason it will be one of the primary quantities which we compute and compare for systems with variable degrees of nonlocality.
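The Legendre relation described above can be made explicit in a short worked derivation. The block below is a sketch in the convention of [38], assuming that the hydrodynamic equation is normalized as ∂S/∂t = s_eq Γ(∂S/∂x) and that the initial profile is linear, S(y,0) = s y; the sign convention for v_B is part of that assumption.

```latex
% Membrane minimization for a 1-D, translation-invariant system
S(x,t) = \min_{y}\left[\, s_{eq}\, t\, \mathcal{E}\!\left(\frac{x-y}{t}\right) + S(y,0) \right]

% Linear initial profile S(y,0) = s\,y, with v \equiv (x-y)/t
S(x,t) = s\,x + s_{eq}\, t\, \min_{v}\left[\, \mathcal{E}(v) - \frac{v\,s}{s_{eq}} \right]

% Comparing with \partial_t S = s_{eq}\,\Gamma(\partial_x S) gives the Legendre pair
\Gamma(s) = \min_{v}\left[\, \mathcal{E}(v) - \frac{v\,s}{s_{eq}} \right],
\qquad
\mathcal{E}(v) = \max_{s}\left[\, \Gamma(s) + \frac{v\,s}{s_{eq}} \right]

% Stationarity conditions
\mathcal{E}'(v) = \frac{s}{s_{eq}}, \qquad v = -\,s_{eq}\,\Gamma'(s)

% \Gamma(0) = v_E with \Gamma'(0) = 0 (symmetry) implies
\mathcal{E}(0) = v_E, \qquad \mathcal{E}'(0) = 0

% \Gamma(s_{eq}) = 0 with v_B \equiv -\,s_{eq}\,\Gamma'(s_{eq}) implies
\mathcal{E}(v_B) = v_B, \qquad \mathcal{E}'(v_B) = 1
```

In this convention the concavity of Γ(s) translates into convexity of the tension, so that E(v) ≥ |v| with equality at |v| = v_B.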
In general it is not possible to compute E(v) analytically for a given Hamiltonian. A numerical approximation technique was introduced in [38], which we summarize here and later employ. The membrane tension function at infinite temperature can be simply related to the operator entanglement of the time evolution unitary. Operator entanglement is computed by treating the operator as a state vector in a doubled Hilbert space. For instance, from the time evolution operator U (t) we obtain: Here, the first and second Hilbert space factors can be thought of as representing the same degrees of freedom at time 0 and time t, respectively. An ordinary entanglement entropy can then be computed on various subsystems of this state. In a 1-D system, tracing out degrees of freedom up to x on the first factor and up to y on the second factor gives the entanglement entropy S U (x, y, t), which in the membrane picture corresponds to a minimal membrane stretching from x to y through an auxiliary spacetime of height t. This leads to a method of approximating the membrane tension function via operator entanglements of the time evolution unitary: Here, for system of finite length L, the origin of the spatial arguments is placed at the center. At any given t, evaluating the right hand side gives an approximation of the membrane tension function at infinite temperature. This approximation is limited by both finite size effects and finite time effects. In an infinite system, taking a large time limit would render finite time effects negligible, and the approximation would converge to E(v) over the range −v B < v < v B , and to |v| outside this range. But for a finite system as t increases, the range of v over which the cut can be made without exceeding the boundaries (or more stringently, before edge effects become substantial) becomes more limited. In our spin chain system, we are computationally limited to 13 qubits. 
So in practice, we obtain a series of curves for a set of t values (or rather, discrete data points such that |vt/2| corresponds to an integer number of spins traced out) and schematically piece these together to give a qualitative picture of E(v) under different Hamiltonians. This method is employed in section 3.3 as a qualitative check on our more direct methods of investigating the effect of variable nonlocality in spin chain systems.
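The operator-entanglement construction summarized above can be sketched for a very small chain as follows. This is an illustrative NumPy sketch, not the code used for the results of section 3.3: the sign convention of the Hamiltonian is an assumption, and the final estimate E(v) ≈ S_U(vt/2, −vt/2, t)/(s_eq t), with s_eq = ln 2 per qubit at infinite temperature, is our reading of the relation described in the text and in [38]. The function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def site_op(op, i, n):
    """Embed a single-qubit operator at site i of an n-qubit chain."""
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, op if k == i else I2)
    return out


def chain_hamiltonian(n, jzz=1.0, hx=1.05, hz=0.5):
    """Open mixed-field Ising chain; the overall signs are an assumption."""
    h = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for i in range(n):
        h += hx * site_op(SX, i, n) + hz * site_op(SZ, i, n)
        if i + 1 < n:
            h += jzz * site_op(SZ, i, n) @ site_op(SZ, i + 1, n)
    return h


def propagator(n, t):
    """U(t) = exp(-i H t) by exact diagonalization."""
    evals, evecs = np.linalg.eigh(chain_hamiltonian(n))
    return (evecs * np.exp(-1j * evals * t)) @ evecs.conj().T


def operator_entropy(u, n, x, y):
    """Entropy (nats) of the normalized state |U> on the first x qubits of the
    time-0 factor together with the first y qubits of the time-t factor.
    Since |U> is pure, keeping or tracing out this set gives the same entropy."""
    state = u.reshape((2,) * (2 * n)) / np.sqrt(2 ** n)  # axes: n time-t, then n time-0
    keep = list(range(n, n + x)) + list(range(y))
    rest = [q for q in range(2 * n) if q not in keep]
    mat = state.transpose(keep + rest).reshape(2 ** len(keep), -1)
    p = np.linalg.svd(mat, compute_uv=False) ** 2
    p = p[p > 1e-14]
    return float(-np.sum(p * np.log(p)))


def tension_estimate(n, t, k):
    """Rough (v, E(v)) estimate from cuts displaced by +k and -k sites about the center."""
    u = propagator(n, t)
    x, y = n // 2 + k, n // 2 - k
    return 2 * k / t, operator_entropy(u, n, x, y) / (np.log(2) * t)


for k in (0, 1, 2):
    print(tension_estimate(6, 2.0, k))
```

A useful sanity check is t = 0, where U is the identity and the entropy reduces to |x − y| ln 2, the number of cut Bell pairs; this matches the statement that the estimate tends to |v| outside the butterfly cone.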

Membrane theory for holographic systems
The membrane theory of entanglement dynamics has been successfully generalized to strongly coupled large-N theories with holographic duals [39,56]. This provides strong evidence that this effective description can efficiently encapsulate the entanglement dynamics in all chaotic systems. In [39] it was first realized that a membrane description naturally emerges upon studying the late-time and large-distance regime of the RT prescription, akin to hydrodynamics. This study was later complemented in [56], showing that this theory is robust under a large set of generalizations, including different quench protocols and finite coupling corrections holographically dual to higher derivative gravity corrections. In this section we will first review the results of [39], valid for asymptotically AdS spaces in Einstein gravity. We will then generalize this framework to more general holographic scenarios, including those described by non-AdS spaces with extra matter fields and fluxes, ubiquitous in top-down constructions emerging from string or M theory.

The case of asymptotically AdS spaces
We start with a generic asymptotically AdS d+1 black brane where v is an "infalling" time coordinate. For instance, for the standard Schwarzschild-AdS d+1 we have We follow [39] but keep working in cartesian coordinates. In this way, we can later generalize the analysis to non-AdS settings in a more straightforward way. First, consider the scalings (2.14) Defining ξ ≡ ξ 1 , ξ ⊥ ≡ {ξ 2 , . . . , ξ d−1 } and parameterizing the RT surface with functions ζ(τ, ξ ⊥ ), ξ(τ, ξ ⊥ ), one finds that in the limit β/Λ → 0 the entanglement entropy functional reduces to: In this limit, the equation of motion for ζ is algebraic: .
We now define: so that (2.16) becomes v^2 = c(ζ). Plugging back into the action the solution to this equation, ζ = c^{-1}(v^2), we can rewrite the entropy functional as a membrane functional: where is the thermal entropy density of the black brane. In the above we have identified the standard "area" element in Minkowski space and defined the membrane tension E(v) to be . (2.21) Notice that the factor ζ_h^{d−1} is included in (2.21) to make E(v) dimensionless. For example, for Schwarzschild-AdS_{d+1} (2.12) we find that the equation (2.16) becomes which can be inverted as follows: Plugging this back into (2.21) we find that It can be checked that this function has all the desired properties [39,56] (in the dynamically relevant regime, i.e., 0 ≤ v ≤ v_B): .
One final remark is in order. In general, solving for v B from (2.25) may be very difficult. However, it is useful to notice that so c −1 (v 2 B ) = ζ h must yield the horizon. On the other hand, and c −1 (0) = ζ HM yields the Hartman-Maldacena surface [40].

General theory for non-AdS spaces
The gravity duals of the theories we want to study are non-asymptotically AdS and possibly anisotropic. In particular, their metrics are not of the form (2.11). This means we need to generalize the membrane theory starting from a more general ansatz. For concreteness, we will take the metric to be of the following form: where d is the number of spacetime dimensions in the boundary and dΩ_p^2 is the metric on a p-dimensional compact space M_p. Importantly, we will assume that this compact space may have a non-trivial dependence with respect to the radial coordinate z, such that its volume factorizes as for a suitable "normalized" M_p. Following the derivation of the previous section, we first consider the scalings v ≡ Λ τ, while keeping the coordinates of the compact space untouched. Defining ξ ≡ ξ_i, ξ_⊥ ≡ ξ_j (j ≠ i) and parameterizing the RT surface with functions ζ(τ, ξ_⊥), ξ(τ, ξ_⊥) (assuming no dependence with respect to the coordinates of the compact space), one finds that in the limit β/Λ → 0 the entanglement entropy functional reduces to: (2.33) In this limit the equation of motion for ζ is algebraic; however, it generally cannot be written in the form v^2 = c(ζ). (2.34)
This is due to the anisotropy in the metric (2.29). To move forward we specialize to the case of infinite strips, so that the embedding functions are independent of the transverse coordinates, i.e., ζ(τ ), ξ(τ ) and hence ∂ ξ j ξ = 0. In this case we recover (2.32) but now with The equation of motion for ζ is now: (2.37) As a consistency check, we note that for asymptotically AdS, isotropic black branes we have and we recover (2.17) from (2.37). Plugging back into the action the solution to this equation, ζ = c −1 (v 2 ), we can rewrite the entropy functional as a membrane functional: . (2.41) We note thatẼ (v) still needs to be normalized. In order to do so, we identify the entropy density as: Hence, we find that . (2.44)

Information velocity from the membrane theory
We can include the effects of the entanglement fraction f [31] in the membrane theory in a manner developed by Mark Mezei, who obtained the results of this section in unpublished work [58]. The inclusion of f amounts to the introduction of a penalty factor for the lower end of the membrane: Thus, the membrane is anchored on ∂A(τ ) and ∂A (0) on the upper and lower boundaries respectively. Since the latter is just a boundary term, it does not affect the equations of motion. For the case of strips of width 2R one gets membranes with constant v, and (2.45) becomes independently of τ . The entropy saturates when S A (A(τ sat )) = s th area(∂A)R, which gives where Γ(f ) is defined as the Legendre transform of E(v). It can be checked that v I (f = 0) = v E and v I (f = 1) = v B .
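The interpolation of v_I(f) between v_E and v_B can be checked numerically. The sketch below assumes two things that are not spelled out in the surviving text: that the saturation condition for a strip gives v_I(f) = Γ(f)/(1 − f), with Γ(f) = min_v [E(v) − f v] in units where s_th = 1, and that the Schwarzschild-AdS_{d+1} tension takes the closed form E(v) = v_E/(1 − v^2)^{(d−2)/(2d)} reported in [39]. Function names are illustrative.

```python
import math


def tension_ads(v, d):
    """Membrane tension for Schwarzschild-AdS_{d+1}; closed form assumed from [39]."""
    eta = 2.0 * (d - 1) / d
    v_e = (eta - 1.0) ** ((eta - 1.0) / 2.0) / eta ** (eta / 2.0)
    return v_e / (1.0 - v * v) ** ((d - 2) / (2.0 * d))


def butterfly_velocity(d):
    """Butterfly velocity of the same black brane."""
    return math.sqrt(d / (2.0 * (d - 1)))


def gamma(f, d, n_grid=20001):
    """Legendre transform of the tension: minimum over 0 <= v <= v_B of E(v) - f v."""
    v_b = butterfly_velocity(d)
    best = float("inf")
    for i in range(n_grid):
        v = v_b * i / (n_grid - 1)
        best = min(best, tension_ads(v, d) - f * v)
    return best


def information_velocity(f, d):
    """v_I(f) = Gamma(f) / (1 - f) for strips, for 0 <= f < 1 (assumed form)."""
    return gamma(f, d) / (1.0 - f)


d = 4
for f in (0.0, 0.25, 0.5, 0.75, 0.99):
    print(f, information_velocity(f, d))
```

For d = 4 the endpoints are v_E ≈ 0.62 and v_B ≈ 0.82, and the curve rises monotonically between them, since d v_I/df is proportional to E(v) − v ≥ 0 at the minimizing velocity.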

Information propagation in spin chains with variable nonlocality
We now turn to the question of how the rate of information spreading in chaotic quantum systems is affected by the presence of nonlocality in various forms. We begin with the base Hamiltonian H^(0)_L of equation (2.2). By adding next-nearest-neighbor terms (or next-next-nearest-neighbor terms, etc.) with variable couplings, we introduce a form of mild nonlocality into the Hamiltonian and investigate the effect of these terms on information spreading. 8 More precisely, our total Hamiltonian will be H_L = H^(0)_L + H^(nonlocal)_L, (3.1) where H^(nonlocal)_L is a sum over sites of two-site terms σ^x_i σ^x_{i+2}, σ^z_i σ^z_{i+2}, σ^x_i σ^x_{i+3}, σ^z_i σ^z_{i+3}, ..., weighted by the couplings J_xox, J_zoz, J_xoox, J_zooz, ... respectively. (3.2)
All the individual terms in H^(nonlocal)_L are quadratic, in the sense that they contain two active sites (i.e. only two Pauli operators per summed term). The labelling scheme for nonlocal coefficients should be clear: the number of o's in J_xo...ox (for example) indicates the number of spaces between active σ^x sites. In most cases, we will set only a subset of the couplings J_xox, J_zoz, J_xoox, J_zooz, ... to nonzero values simultaneously; however, we do allow them to be of the same order as the J_zz = 1 coupling in H^(0)_L, so they do not generally constitute small perturbations to H^(0). In fact, in our scans below we choose to consider nonlocal couplings up to precisely J_nonlocal = 1 because in some cases going far beyond this causes the Hamiltonian eigenstate level-spacing to deviate from the Wigner-Dyson distribution indicative of a chaotic regime [66], where we wish to remain. Equation (3.2) obviously does not represent the unique set of nonlocal terms which could be considered. We choose to work with σ^z σ^z terms and σ^x σ^x terms, because unlike σ^y σ^y, such terms leave the states (2.3) at the center of the energy spectrum, regardless of the values of nonlocal coupling coefficients. These two types of terms also provide clean examples of nonlocal additions which are, respectively, commuting and non-commuting with the nearest neighbor piece of H^(0)_L, which we speculate may lead to qualitatively different behavior. In the following sections we display some results obtained using the methods outlined in section 2.2, to show the effects of nonlocal terms on the rate of information spreading.

8 Spin chains with next-nearest neighbor interactions or related couplings have been studied in many other contexts, see for example [59][60][61][62][63]. Entanglement growth in integrable models with a form of nonlocality induced by a Lifshitz scaling has been studied in [64,65].
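The labelling scheme for the nonlocal couplings can be made concrete with a small construction. The sketch below is illustrative only: the overall signs of all terms are an assumption, and the y-polarized product state is a hypothetical stand-in used to illustrate the statement about the center of the spectrum, not the states (2.3) themselves. The helper names (`coupling`, `total_hamiltonian`, `y_polarized`) are ours.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)


def two_site(op, i, j, n):
    """op acting on sites i and j of an n-qubit chain (a single site if i == j)."""
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, op if k in (i, j) else I2)
    return out


def coupling(op, gap, strength, n):
    """Sum over i of strength * op_i op_{i+gap+1}; `gap` counts the o's in the label."""
    h = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for i in range(n - gap - 1):
        h += strength * two_site(op, i, i + gap + 1, n)
    return h


def base_hamiltonian(n, jzz=1.0, hx=1.05, hz=0.5):
    """Nearest-neighbor mixed-field Ising chain (sign convention assumed)."""
    h = coupling(SZ, 0, jzz, n)
    for i in range(n):
        h += hx * two_site(SX, i, i, n) + hz * two_site(SZ, i, i, n)
    return h


def total_hamiltonian(n, j_xox=0.0, j_zoz=0.0, j_xoox=0.0, j_zooz=0.0):
    """Base Hamiltonian plus next-nearest and next-next-nearest neighbor terms."""
    h = base_hamiltonian(n)
    h += coupling(SX, 1, j_xox, n) + coupling(SZ, 1, j_zoz, n)
    h += coupling(SX, 2, j_xoox, n) + coupling(SZ, 2, j_zooz, n)
    return h


def y_polarized(n):
    """Product state with every qubit in the +1 eigenstate of sigma^y."""
    plus_y = np.array([1, 1j], dtype=complex) / np.sqrt(2)
    psi = np.array([1.0 + 0j])
    for _ in range(n):
        psi = np.kron(psi, plus_y)
    return psi


def energy(h, psi):
    """Expectation value <psi|H|psi> for a normalized state."""
    return float(np.real(np.vdot(psi, h @ psi)))


n = 6
psi = y_polarized(n)
h = total_hamiltonian(n, j_xox=0.7, j_zoz=0.4)
print("with xox and zoz terms:", energy(h, psi))
print("with an added yoy term:", energy(h + coupling(SY, 1, 0.3, n), psi))
```

For this stand-in state, σ^x σ^x and σ^z σ^z additions leave the energy expectation at zero for any coupling strength, whereas a σ^y σ^y addition shifts it, which is the qualitative behavior described above.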

Next-nearest neighbor interactions: J xox and J zoz
First we consider the case where only one of the nonlocal terms in (3.2) is turned on. Figure 3 shows $v_I$ as a function of $f$ for different values of either $J_{xox}$ (left panel) or $J_{zoz}$ (right panel). All other nonlocal couplings are set to zero for these scans. In the case of $J_{xox}$, the most apparent qualitative behavior is perhaps unsurprising: as the amount of nonlocality is increased, the information velocity increases as well, at all values of $f$. For small $J_{xox} \lesssim 0.2$, however, this effect is suppressed or even reversed, as $v_I$ is either unchanged or slightly decreased for small $f$. This behavior is far more pronounced in the case of $J_{zoz}$, which shows a clear nonmonotonicity in $v_I$ as a function of $J_{zoz}$, at all values of $f$. This is displayed more clearly in figure 4, and we will discuss it further below.
As a confirmation that the information velocity $v_I(f)$ interpolates between $v_E$ at $f = 0$ and $v_B$ at $f = 1$, we have computed both of these velocities independently for the same set of system parameters, using basic methods. Results are shown as the dashed lines in figure 4. To compute $v_E$, we tracked the rate of growth of the entanglement entropy on half of the system, evolving the state $\psi_{f=0}$, i.e. the state (2.3) with $f = 0$; these results should not substantially differ from those obtained by averaging over random unentangled states. On the other hand, the $v_B$ computation is a state-independent estimation of the butterfly velocity at infinite temperature, as is appropriate for comparisons with states of energy $\langle H \rangle = 0$. The approximate match of these curves with the behavior of $v_I$, shown in figure 4, is good support for the analysis of [31] and a good overall consistency check. It also confirms the non-monotonic behavior seen in the $v_I$ curves, which we now discuss.
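The half-system entropy tracking described above can be sketched as follows. This is a minimal illustration, not the procedure of section 2.2: the function names are hypothetical, the entropy is measured in nats, and the conversion from the fitted slope $dS/dt$ to $v_E$ depends on a normalization that is not reproduced in this section, so the sketch stops at the slope.

```python
import numpy as np


def half_chain_entropy(state, n_sites):
    """Von Neumann entropy (in nats) of the left half of a pure qubit-chain state."""
    n_left = n_sites // 2
    # View the state vector as a (left, right) matrix; the squared singular
    # values of that matrix are the Schmidt probabilities of the bipartition.
    matrix = np.asarray(state).reshape(2 ** n_left, 2 ** (n_sites - n_left))
    probs = np.linalg.svd(matrix, compute_uv=False) ** 2
    probs = probs[probs > 1e-14]  # drop numerical zeros before taking logs
    return float(-np.sum(probs * np.log(probs)))


def entropy_growth_rate(times, entropies):
    """Least-squares slope dS/dt over a window of roughly linear growth."""
    slope, _intercept = np.polyfit(times, entropies, 1)
    return float(slope)
```

In practice one would evaluate `half_chain_entropy` on the evolved state at a series of times and pass only the linear-growth window, before finite-size saturation sets in, to `entropy_growth_rate`.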
The nonmonotonicity evident in figure 4 is somewhat counter-intuitive, as it indicates that in some cases the inclusion of nonlocal terms in the Hamiltonian can suppress the rate of information propagation. Such behavior is not observed in any of the holographic systems investigated in section 4. For the case of $J_{xox}$ couplings, the suppressive effect is almost small enough to be dismissed as an artifact of finite size effects or procedural uncertainties, with $v_I$ never falling more than about 6 percent below its value at $J_{xox} = 0$ (though even "flatness" of the $v_I(J_{xox})$ curve for small $J_{xox}$ would be noteworthy). The case of $J_{zoz}$ shows a much larger suppressive effect, peaking around $J_{zoz} \sim 0.4$, where $v_I$ is nearly 35 percent lower than its value with no nonlocal coupling. We focus on this effect in the left panel of figure 5, plotting the ratio of the minimum $v_I$ (with respect to $J_{\mathrm{nonlocal}}$) to $v_I$ at $J_{\mathrm{nonlocal}} = 0$. The effect is most stark at $f = 0$.
In the right panel of figure 5, we have plotted the ratio $v_B/v_E$, as inferred from $v_I(f = 1)/v_I(f = 0)$, at various values of nonlocal coupling. The precise form of the curves emerging from this data is surely dependent on our specific choices of Hamiltonian and couplings, but we include this plot for comparison with holographic results (see figures 8 and 11, left panels), where the analogous ratios are monotonically increasing and relatively featureless.

More nonlocality: fixed width scans
We now move beyond the case of next-nearest neighbor interactions, and consider adding a higher degree of nonlocality. We fix all $J_{xox}$, $J_{xoox}$, ..., $J_{xo\ldots ox}$ couplings, up to a fixed "width", to the same $J_{\mathrm{nonlocal}}$ value and turn them on simultaneously. Results for $v_I$ as a function of $J_{\mathrm{nonlocal}}$ are displayed in figure 6. In this case, the effect of additional nonlocal terms is simply to raise the $v_I(J_{\mathrm{nonlocal}})$ curves, more so with the inclusion of each additional "width" of nonlocal terms. This also has the effect of washing out the nonmonotonic behavior observed in section 3.1, approaching behavior more similar to the holographic cases we consider in section 4. It is interesting that the growth of $v_I$ begins to level off near $J_{\mathrm{nonlocal}} = 1$. In principle these coupling parameters could be made arbitrarily large, but we have chosen to limit $J_{\mathrm{nonlocal}} \le 1$ in order to leave a comfortable margin before the Hamiltonian level-spacing departs from the Wigner-Dyson distribution expected of chaotic systems.

Figure 6: Left: Nonlocal terms are turned on up to fixed "width". Right: Information velocity as a function of nonlocal $\sigma^z$ couplings, at $f = 0$. Nonlocal terms are turned on up to fixed "width".
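The Wigner-Dyson criterion invoked above can be monitored in several ways. One common choice, which is an assumption here and not necessarily the diagnostic used for the scans in this section, is the mean ratio of adjacent level spacings: it needs no unfolding of the spectrum and takes a value of about 0.53 for GOE (Wigner-Dyson) statistics versus $2\ln 2 - 1 \approx 0.386$ for uncorrelated (Poisson) levels. The sketch below uses only the standard library; the function and constant names are hypothetical.

```python
import math
import random

POISSON_MEAN_RATIO = 2.0 * math.log(2.0) - 1.0  # about 0.386, uncorrelated levels
GOE_MEAN_RATIO = 0.5307  # approximate literature value for GOE statistics


def mean_gap_ratio(levels):
    """Average of min(s_n, s_{n+1}) / max(s_n, s_{n+1}) over adjacent gaps s_n."""
    ordered = sorted(levels)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    ratios = []
    for s_first, s_second in zip(gaps, gaps[1:]):
        larger = max(s_first, s_second)
        if larger > 0.0:  # skip exactly degenerate triples
            ratios.append(min(s_first, s_second) / larger)
    if not ratios:
        raise ValueError("need at least three non-degenerate levels")
    return sum(ratios) / len(ratios)


def uncorrelated_levels(count, seed=0):
    """Reference spectrum of independent uniform levels (Poisson statistics)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]
```

When applied to a spin chain, the levels passed to `mean_gap_ratio` should come from a single symmetry sector of the Hamiltonian, since mixing sectors pushes the statistic toward the Poisson value even in a chaotic system.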

Membrane tension function and nonlocality
We now approximate the membrane tension function for the systems of section 3.1 with variable next-nearest neighbor couplings, using the method outlined in section 2.3.1 with $L = 13$. Results are displayed in figure 7. Finite size effects are substantial, so these curves can only be considered schematic representations of the effect of the nonlocal coupling on $E(v)$ (i.e. they do not give precise values of $v_E = E(0)$ or $v_B = E(v_B)$). They are obtained by computing $E_{\mathrm{eff}}(t, v)$ over a series of $t$ values up to the $v_B$ crossing time for each Hamiltonian and fitting the boundary of the resulting data set with an even polynomial of degree 12. The central portion (toward $v = 0$) of the curves, which in an infinite system would converge through late time values of $E_{\mathrm{eff}}(t, v)$, tends to be overestimated. This effect is larger for higher (true) values of $v_E$, $v_B$, because finite size effects prohibit a good estimate from $E_{\mathrm{eff}}(t, v)$ at much lower values of $t$. All these caveats aside, in qualitative terms we find good agreement with the results of the previous section: the effect of adding $J_{xox}$ is initially fairly flat, but past $J_{xox} \sim 0.25$ it dramatically raises $E(v)$ at all $v$ (entailing an increase of $v_B$, $v_E$, and $v_I$). Increasing $J_{zoz}$ from zero initially has a small suppressive effect, but by $J_{zoz} \sim 0.5$ the curve rises to match the original ($J_{zoz} = 0$) values and continues to increase uniformly.
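The final fitting step can be sketched as a linear least-squares problem in the even monomials $v^0, v^2, \ldots, v^{12}$. This is a minimal illustration under stated assumptions: the extraction of the boundary points from the $E_{\mathrm{eff}}(t, v)$ data is not shown, and the function names `fit_even_polynomial` and `evaluate_even_polynomial` are hypothetical.

```python
import numpy as np


def fit_even_polynomial(v, e, degree=12):
    """Least-squares fit of e ~ sum_k c_k v^(2k), with 2k running up to `degree`."""
    if degree % 2:
        raise ValueError("degree must be even")
    powers = np.arange(0, degree + 1, 2)
    # Design matrix with one column per even power of v.
    design = np.power.outer(np.asarray(v, dtype=float), powers)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(e, dtype=float), rcond=None)
    return powers, coeffs


def evaluate_even_polynomial(powers, coeffs, v):
    """Evaluate the fitted even polynomial at the points v."""
    return np.power.outer(np.asarray(v, dtype=float), powers) @ coeffs
```

Restricting to even powers builds the symmetry $E(v) = E(-v)$ into the fit, and the constant coefficient `coeffs[0]` is then the fitted value of $E(0)$, which is the (schematic) estimate of $v_E$.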

Information propagation in nonlocal holographic systems
Inspired by the lattice results of the previous section, we now turn our attention to large-$N$ gauge theories with holographic duals. We will consider two prototypical theories with different types of nonlocality: non-commutative SYM (NCSYM) theory and dipole-deformed SYM (DDSYM) theory. The former is an example of a nonlocal theory with UV/IR mixing. A simple model that realizes non-commutativity consists of a set of oppositely charged particles (dipoles) moving in a strong magnetic field [67]. The UV/IR mixing then implies that the transverse size of the dipoles grows with their longitudinal momentum, which may in turn pose obstructions for renormalizability [68,69]. Even though this theory is ultimately finite, noncommutativity is arguably not the simplest way to introduce nonlocality. We thus consider a second theory, inspired by [70], which introduces a set of fundamental dipoles of constant length $L$, hence resembling more closely the lattice systems studied in section 3. As a result, one obtains a nonlocal theory without the issue of UV/IR mixing.

Figure 7: Numerical estimation of the membrane tension function $E(v)$ for the spin chain with a single nonlocal coupling turned on. Left: The overall effect of increasing the $J_{xox}$ coupling is to raise the membrane tension function. Right: Increasing the $J_{zoz}$ coupling initially has a small suppressive effect on the membrane tension function, but beyond $J_{zoz} \gtrsim 0.5$ it surpasses the $J_{zoz} = 0$ values.
In the following, we will review the basic features of the holographic duals of NCSYM and DDSYM, realized as top-down constructions in string theory. We will also comment on their finite temperature black brane solutions, which are needed to tackle the questions pertaining to thermalization and information spreading.⁹ Finally, we will use the tools developed in section 2.3.2 to write down membrane tension functions for both systems and analyze the results.

Gravity dual of NCSYM
The gravity dual of maximally supersymmetric non-commutative SYM (NCSYM) is given by type IIB supergravity with non-zero NS-NS B-field. Here and below we assume that the non-commutative parameter is non-vanishing only in the $(x_2, x_3)$-plane, i.e., $[x_2, x_3] \sim i\theta$. This amounts to replacing all multiplication in the Lagrangian of the $\mathcal{N} = 4$ SYM theory with a noncommutative star product.

⁹ To our understanding, this is the first time that a black brane solution is derived for the gravity dual of DDSYM theory.
The gravity background dual to this theory, in the string frame is [71,72]: and

$h(r) = \frac{1}{1 + a^4 r^4}$, $a = \lambda^{1/4} \sqrt{\theta}$. (4.4)

In the above, $d\Omega_5^2$ denotes the metric on a unit $S^5$, $R^4 = 4\pi \hat{g} N \alpha'^2$, $\hat{g}$ denotes the string coupling and $\alpha'$ sets the (inverse) string tension, which is related to the 't Hooft coupling through the standard relation $\sqrt{\lambda} = R^2/\alpha'$. To use the results for the membrane theory we first express the metric in Einstein frame,¹⁰ and go to a radial coordinate $z = 1/r$, so that: where and We further define the infalling time coordinate $v$ such that hence the metric becomes: This metric is of the form (2.29), with the following identifications: Furthermore, the metric of the compact space is so we obtain that (4.13) With the above identifications, we are now ready to apply our general results of section 2.3.2, specializing to this system.

Gravity dual of DDSYM
The gravity dual of the dipole deformed maximally supersymmetric SYM (DDSYM) is given by type IIB supergravity with non-zero NS-NS B-field and a deformed compact space. Here and below we assume that the dipoles have fixed length $L$ and are all oriented along the $x_3$-direction. This amounts to replacing all multiplication in the Lagrangian of the $\mathcal{N} = 4$ SYM theory with the star product (4.14) The gravity background dual to this theory, in the string frame is [73]:¹¹ and In the above, $d\Omega_5^2$ denotes the metric on a deformed unit $S^5$. This compact space has the structure of an $S^1$ (Hopf) fibration over a base $\mathbb{CP}^2$. The global angular 1-form of the Hopf fibration is denoted as $\psi$. This fiber acquires an $r$-dependent radius $h(r)^{1/2}$. The volume of the $\mathbb{CP}^2$ is constant and given by $\pi^2/2$, so the total volume of the compact manifold is $\pi^3 h(r)^{1/2}$.
To use the results for the membrane theory we first express the metric in the Einstein frame and go to a radial coordinate $z = 1/r$: where and h(z) = 1 We further define the infalling time coordinate $v$ such that hence the metric becomes: This metric is of the form (2.29), with the following identifications: Furthermore, the metric of the compact space is From the discussion below (4.17) we know that $\mathrm{Vol}(\Omega_5) = \pi^3 h(z)^{1/2}$. Extracting all $z$-dependent terms into $V_\Omega(z)$, we obtain that Again, with these identifications, we can now apply our general results of section 2.3.2, specializing to this system.

Membrane theory for NCSYM
We can analyze two cases, depending on the orientation of the strip. In both cases we find: however, the membrane tensions differ.
• Commutative strip ($\xi = \xi_1$ and $\xi_\perp = \{\xi_2, \xi_3\}$)

First, note that (2.36) can be written as: (4.28) and this equation can be inverted analytically to give: We have added a subindex "c" to indicate that we are working out the commutative strip. Finally, the membrane tension is This is exactly the same result as for a strip in Schwarzschild-AdS$_5$. It satisfies all the desired properties, including: (4.32)

• Non-commutative strip ($\xi = \xi_2$ and $\xi_\perp = \{\xi_1, \xi_3\}$)

In this case (2.36) can be written as: and can also be inverted analytically to yield:¹² We have added a subindex "nc" to indicate that we are working out the non-commutative strip. Finally, the membrane tension is We can check that in the $a \to 0$ limit we recover (4.30). It also satisfies all the desired properties: (4.37)

Note that in all the above formulas, the non-commutative parameter $\theta$ only appears through the dimensionless combination: → ∞ as $\vartheta \to \infty$, but this is the expected behavior for an "infinitely" nonlocal theory. Even so, they diverge with the same power of $\vartheta$ ($v_B^{(nc)} \propto \vartheta^2$, $v_E^{(nc)} \propto \vartheta^2$) so their ratio remains finite. Expanding this ratio in the limits of strong and weak non-commutativity, we obtain:

Information velocity
Finally, we can obtain the information velocity $v_I$ as a function of the entanglement fraction $f$ for the NCSYM theory. In order to do so, we invert equation (2.47) (to obtain $v_I(f)$) and then plug this into (2.48). Unfortunately, it is not possible to invert (2.47) analytically, even for the commutative strip, so we proceed numerically. In figure 9 we show our findings for this quantity. In general we find that $v_I$ increases monotonically both with $f$ and with $\vartheta$, in contrast to the 1-dimensional spin chain systems we studied previously. We believe that the large-$N$ limit is in fact responsible for washing out the initial suppression of $v_I$ with respect to the nonlocal scale.
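A numerical inversion of this kind can be sketched with a simple bracketing root finder. The sketch below is generic and makes loud assumptions: the relation being inverted is taken to be continuous and monotone on the bracketing interval, the function `toy_relation` is a purely hypothetical stand-in (the actual equation (2.47) is not reproduced here), and the name `invert_monotone` is illustrative.

```python
def invert_monotone(func, target, lo, hi, tol=1e-12, max_iter=200):
    """Solve func(x) = target on [lo, hi] by bisection (func continuous, monotone)."""
    f_lo = func(lo) - target
    f_hi = func(hi) - target
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        raise ValueError("target is not bracketed on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid) - target
        if f_mid == 0.0 or (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0.0:
            hi = mid  # the root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # the root lies in the upper half
    return 0.5 * (lo + hi)


def toy_relation(v):
    """Hypothetical monotone map from a velocity to a fraction; NOT equation (2.47)."""
    return v ** 3
```

Scanning `target` over a grid of $f$ values and calling `invert_monotone` at each point yields a tabulated inverse, which can then be substituted into the second relation. Bisection is chosen here only because it is robust for monotone functions; a faster bracketing method would serve equally well.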

Membrane theory for DDSYM
We can analyze two cases, depending on the orientation of the strip. In both cases we find: however, the membrane tensions differ.
Figure 9: Right: Information velocity for the non-commutative strip as a function of $f$ for various values of $\vartheta \in \{0, 1/2, 1, 2, 5, 10\}$, from bottom to top, respectively.

• Standard strip ($\xi = \xi_1$ and $\xi_\perp = \{\xi_2, \xi_3\}$)

This is completely analogous to the commutative strip. Here we repeat the analysis for completeness. First, note that (2.36) can be written as: (4.41) and this equation can be inverted analytically to give:
We have added a subindex "s" to indicate that we are working out the standard strip. Finally, the membrane tension is This is also exactly the same result as for a strip in Schwarzschild-AdS$_5$. It satisfies all the desired properties, including: with $v^{(s)}$ (4.45)

• Dipole oriented strip ($\xi = \xi_3$ and $\xi_\perp = \{\xi_1, \xi_2\}$)

In this case (2.36) can be written as: This expression can also be inverted analytically.¹³ However, the final expression is lengthy and not very illuminating, so we will not transcribe it here. To get a flavor of the effects due to nonlocality, we expand the resulting functions in two limits. Defining a dimensionless parameter, we find that (4.48) We have added a subindex "d" to indicate that we are working out the dipole oriented strip. In this case, the membrane tension is We can check that in the $\delta \to 0$ limit we recover (4.43). At finite $\delta$ we can also verify (numerically, or analytically in some limits) that it satisfies all the desired properties: and $v^{(d)}$ (4.52) In fact, we can obtain a very compact expression for $v_B$ by noticing that it can alternatively be obtained from (2.27) and (4.46): (1 + δ) . Unfortunately, if we try to use (2.28) to recover $v_E$ we run into similar problems as before, since the expression for the Hartman-Maldacena $\zeta_{HM}$ turns out to have a very complicated dependence on $\delta$. We will therefore refrain from writing down a closed expression for $v_E$.

Figure 10 illustrates our main findings. In general, we notice that both (4.51) and (4.52) provide very good fits in their regime of applicability. We also observe that both velocities undergo a transition from subluminal to superluminal in the intermediate regime, where $\delta \sim O(1)$, and ultimately diverge in the limit of very strong nonlocality, $\delta \to \infty$. This is in qualitative agreement with the results we obtained for the NCSYM theory. Finally, we can also verify that $v_B^{(d)} > v_E^{(d)}$ for all $\delta$. Since both quantities diverge with the same power of $\delta$ ($v_B^{(d)}, v_E^{(d)} \propto \delta^{1/2}$), their ratio remains finite at large $\delta$. Expanding this ratio in the two limits we obtain: (4.54) These results are illustrated in figure 11. From the plots we can confirm the monotonicity of $v_B^{(d)}/v_E^{(d)}$ with the strength of nonlocality. More specifically, the ratio interpolates between its Schwarzschild-AdS$_5$ value (for $\delta \to 0$) and $v_B^{(d)}/v_E^{(d)} \to 2\sqrt{2/3} \approx 1.633$ (for $\delta \to \infty$). We see here that the nonlocal effects in the NC case are a bit more severe (in that case v ...).

¹³ There are 6 roots (4 complex and 2 real), but only one coincides with the standard case in the limit $b \to 0$.

Figure 11: Left: ratio between the butterfly velocity and the entanglement velocity as a function of the dimensionless combination $\delta = \lambda L^2 T^2/4$. We also show the asymptotic value $v_B/v_E \to 2\sqrt{2/3} \approx 1.633$, depicted in orange, and the expansions for $\delta \ll 1$ and $\delta \gg 1$, depicted in red. Right: Membrane tensions for $\delta \in \{0, 1/2, 1, 2, 5, 10\}$, from bottom to top, respectively. In each case, we indicate the point at which the curve touches the straight line $E(v_B) = v_B$.
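As a numerical cross-check of the undeformed ($\delta \to 0$) limit, the sketch below evaluates a membrane tension for planar Schwarzschild-AdS$_5$ and tests the properties $E(0) = v_E$, $E(v_B) = v_B$ and $E'(v_B) = 1$, the last being the tangency with the straight line mentioned in the caption of figure 11. The closed forms used for $v_B$, $v_E$ and $E(v)$ are quoted from the general membrane-theory literature for a $d$-dimensional boundary theory; that they coincide with the expressions (4.30) and (4.43) of this paper, which are not reproduced in the text above, is an assumption. All names in the code are illustrative.

```python
import math

D = 4  # boundary spacetime dimension for Schwarzschild-AdS_5

# Butterfly and entanglement velocities for planar Schwarzschild-AdS_{D+1}.
V_B = math.sqrt(D / (2.0 * (D - 1)))
V_E = math.sqrt(D) * (D - 2) ** (0.5 - 1.0 / D) / (2.0 * (D - 1)) ** (1.0 - 1.0 / D)


def tension(v):
    """Assumed form E(v) = v_E / (1 - v^2)^((D-2)/(2D)), valid for |v| < 1."""
    return V_E / (1.0 - v * v) ** ((D - 2) / (2.0 * D))


def tension_slope(v, step=1e-6):
    """Central finite-difference estimate of dE/dv."""
    return (tension(v + step) - tension(v - step)) / (2.0 * step)


def large_delta_ratio():
    """Asymptotic ratio v_B / v_E quoted in the text for delta -> infinity."""
    return 2.0 * math.sqrt(2.0 / 3.0)
```

With these definitions, `tension(V_B) - V_B` and `tension_slope(V_B) - 1.0` should both vanish up to numerical error, and `large_delta_ratio()` can be compared directly with the undeformed ratio `V_B / V_E` to see how far the dipole deformation moves $v_B/v_E$.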

Information velocity
Finally, we can obtain the information velocity $v_I$ as a function of the entanglement fraction $f$ for the DDSYM theory. In order to do so, we invert equation (2.47) (to obtain $v(f)$) and then plug this into (2.48). Again, we proceed numerically here because it is not possible to invert (2.47) analytically, even for the standard strip. In figure 12 we show our findings for this quantity. We find that $v_I$ increases monotonically both with $f$ and with $\delta$, as was also found for the NCSYM system. This suggests that in large-$N$ systems the interference effects between the local and nonlocal couplings are effectively washed out.

Conclusions and future work
In this paper we have considered the possibility of speeding up the information transfer in a variety of systems by turning on mild nonlocalities. Our studies include 1-dimensional spin chains as well as strongly-coupled, large-$N$ theories with classical holographic duals. The nonlocal interactions that we consider only act below a certain length scale, and hence can be integrated out by a coarse graining procedure. Thus, in the large-distance, late-time regime we do recover an effective notion of locality. Nevertheless, we find that these nonlocal interactions leave a clear imprint in this regime, which in most cases translates to an enhancement in the rates of information transfer. Known systems that mimic this type of interaction include certain materials with high polarizability, which in the presence of a strong electromagnetic field enter an ordered phase with their molecular dipoles aligned. We thus believe that our results could have practical applications in the areas of quantum computation and quantum communications.

Figure 12: Right: Information velocity for the dipole oriented strip as a function of $f$ for various values of $\delta \in \{0, 1/2, 1, 2, 5, 10\}$, from bottom to top, respectively. This quantity behaves qualitatively the same as for the non-commutative strip, provided we identify the values of $\delta \leftrightarrow \vartheta$. However, there are very minor numerical differences for $f < 1$ which make the curves for the dipole case slightly more concave. For $f = 1$ they are exactly the same, because the functional dependence of $v_B$ with respect to $\delta$ or $\vartheta$ is the same for both theories.
The first part of the paper focused on the study of 1-dimensional spin chain systems with the addition of next-nearest neighbor and next-next-nearest neighbor interactions. We have computed the information velocity for states of uniform entanglement fraction, which yields not only the rate at which information propagates in these systems, but also the entanglement velocity $v_E$ and the butterfly velocity $v_B$, characterizing the rates of entanglement generation and operator growth, respectively. Our results provide further support for the theory of information spreading of [31] and extend it to systems with mild nonlocalities. We find that the addition of these extra couplings has a nontrivial effect on the information velocity, either suppressing (for very weak nonlocality) or enhancing $v_I$, though the generic behavior seems to be to raise $v_I$ (and $v_E$, $v_B$).

The second part of the paper focused on holographic systems dual to strongly-coupled nonlocal gauge theories. In this case, we found that the suppressive behavior is completely washed out, in agreement with the earlier results in [32]. This behavior can possibly be explained as an effect of the large-$N$ limit. Indeed, preliminary results suggest that increasing the size of the local Hilbert space in the spin chain systems (e.g. considering qudits instead of qubits) suppresses the interference effects [74]. Some interesting directions for future work include: