A Tale of Two Butterflies: An Exact Equivalence in Higher-Derivative Gravity

We prove the equivalence of two holographic computations of the butterfly velocity in higher-derivative theories with Lagrangian built from arbitrary contractions of curvature tensors. The butterfly velocity characterizes the speed at which local perturbations grow in chaotic many-body systems and can be extracted from the out-of-time-order correlator. This leads to a holographic computation in which the butterfly velocity is determined from a localized shockwave on the horizon of a dual black hole. A second holographic computation uses entanglement wedge reconstruction to define a notion of operator size and determines the butterfly velocity from certain extremal surfaces. By direct computation, we show that these two butterfly velocities match precisely in the aforementioned class of gravitational theories. We also present evidence showing that this equivalence holds in all gravitational theories. Along the way, we prove a number of general results on shockwave spacetimes.


Introduction
Chaos is prevalent in the physical world: small changes in initial conditions can lead to drastic variations in the outcome. This sensitive dependence on the initial state is known as the butterfly effect. In classical systems, chaotic dynamics is characterized by an exponential deviation between trajectories in phase space; the Lyapunov exponent parameterizes the degree of deviation.
In quantum mechanical systems, defining chaos is a more challenging task as the wavefunction is governed by a linear evolution [1]. Nevertheless, one can characterize quantum chaos by the strength of the commutator [V (t), W (0)] between two generic operators with time separation t [2,3]. 1 One useful measure of the typical matrix elements of this commutator is the expectation value of |[V (t), W (0)]| 2 (where the square is defined as |O| 2 ≡ O † O for any operator O). In a chaotic system, this quantity grows with t and becomes significant around the scrambling time t * [5,6], behaving like e λ L (t−t * ) . Here λ L is the (quantum) Lyapunov exponent.
In this paper, we are interested in the spreading of the quantum butterfly effect in spatial directions. We therefore specialize W and V into local operators with spatial separation x in d − 1 spatial dimensions. The corresponding expectation value behaves like [7][8][9] C β (x, t) ≡ |[V (x, t), W (0, 0)]| 2 β ∼ e λ L (t−t * −|x|/v B ) , (1.1) where · β denotes the expectation value in a thermal state with inverse temperature β.
The quantity C β (x, t) characterizes the strength of the butterfly effect at (x, t) detected by a probe V following an earlier perturbation W (0, 0).
Here v B is known as the butterfly velocity. It is the speed at which the region with C β (x, t) 1 expands outward. Intuitively, the commutator probes how a small perturbation W (0, 0) spreads over the system. For two operators that are far separated, the commutator is zero at early times and becomes large at sufficiently late times. In particular, the region in which C β (x, t) is order unity gives a measure of the size of the operator W (0, 0) at time t, and the speed at which the size grows over time is precisely the butterfly velocity.
In recent years, there has been much interest in the role of chaotic dynamics in holographic quantum systems, particularly in the context of AdS/CFT [7][8][9][10][11][12][13]. In this context, a thermal state in the boundary CFT can be realized as a black hole in the bulk AdS spacetime [14]. The rapid thermalization of a local perturbation on the boundary can be understood from the fast scrambling dynamics of the black hole [5,6,15,16]. In particular, the Lyapunov exponent is related to the exponential blueshift of early infalling quanta in the near-horizon region. For CFTs dual to Einstein gravity (possibly corrected by a finite number of higher-derivative terms), the Lyapunov exponent is universal and saturates [7,8,10,11] the chaos bound λ L ≤ 2π/β [17]. This can be understood from the universality of the near-horizon Rindler geometry. If one considers perturbations sent from sufficiently far in the past, the quanta are significantly blueshifted by the Rindler geometry. At around a scrambling time, the backreaction on the background geometry can be described by a shockwave on the horizon [7,10,11]. The strength of the shockwave grows exponentially as we insert the perturbation at earlier and earlier times.
This observation leads to one way of obtaining the butterfly velocity. To see it concretely, consider a localized perturbation in the thermofield-double (TFD) state dual to a two-sided (d + 1)-dimensional planar black hole. Inserting such a spatially localized perturbation on one boundary corresponds to injecting a small number of quanta which then proceed to fall towards the black hole in the bulk. The result of doing so is a localized shockwave [8]. The spatial region in which the shockwave has non-trivial support defines a size of the corresponding boundary operator. As we send the perturbation at earlier and earlier times, the size of this region grows, and this "speed of propagation" determines a butterfly velocity v B .
An alternative way of calculating the operator size and the butterfly velocity in holography was proposed in [18]. It is based on entanglement wedge reconstruction, which states that a bulk operator within the entanglement wedge of any boundary subregion can be represented by some boundary operator on that subregion [19][20][21][22]. Consider again a local operator inserted on the boundary, which in the bulk can be thought of as a particle falling into the black hole. By entanglement wedge reconstruction, a boundary region whose entanglement wedge contains the particle possesses full information about the boundary operator. The smallest spherical region that does so defines a notion of size for the boundary operator. At early times, the particle is near the boundary; the boundary region, and hence the operator size, is small. At very late times, say after a scrambling time, the particle is very close to the horizon and the entanglement wedge needs to extend deep into the bulk. The corresponding boundary region, and hence the operator size, is very large and in fact grows linearly with time. The corresponding speed, which we denote as v B , quantifies the growth of the operator size. It provides a second way of computing the butterfly velocity holographically.
These two holographic computations of the butterfly velocity appear to be very different and unrelated to each other. However, it was shown directly in [18] that the results of the two computations agree for Einstein gravity and for higher-derivative gravity with up to four derivatives on the metric.
The goal of this paper is to prove that the two computations of the butterfly velocity continue to agree in general higher-derivative theories of gravity. We will focus on the family of general f (Riemann) theories, namely those with Lagrangians built from arbitrary contractions of an arbitrary number of Riemann tensors: 2 where the higher-derivative terms are viewed as perturbative corrections to the leading Einstein-Hilbert action. The main purposes of studying higher-derivative theories are twofold: (1) they arise generally as perturbative corrections to Einstein gravity in low energy effective theories of UV-complete models of quantum gravity such as string theory; (2) the agreement between two computations of the butterfly velocity for general higher-derivative theories would suggest an equivalence between the two methods themselves rather than a coincidence in certain theories. This equivalence then suggests a deeper connection between gravitational shockwaves and holographic entanglement, as well as providing further evidence for entanglement wedge reconstruction. The rest of the paper is organized as follows. In Section 2, we begin with a detailed review of the two holographic calculations of the butterfly velocity. In Section 3, we derive general expressions for the two butterfly velocities in f (Riemann) theories. In Section 4, we prove v B = v B for this class of theories, which is our main result. In Section 5, we end with a discussion of this result and comment on potential future directions.

Operator size and the butterfly velocity
In chaotic many-body systems, the size of generic local operators -the spatial region on which the operator has large support -grows ballistically under Heisenberg time evolution. More specifically, consider a perturbation by such a local operator. Under chaotic evolution, information about the perturbation is scrambled amongst the local degrees of freedom and spreads throughout the system. The maximum speed at which this occurs is the butterfly velocity v B . It can be regarded as a finite temperature analogue of the Lieb-Robinson bound [9,23]. The existence of this bound on information scrambling is a general feature of chaotic systems.
One can define operator size more precisely using the square of the commutator for two generic local operators W and V : where t W > 0 so that W is inserted at an earlier time than V and the expectation value is taken in a thermal state with inverse temperature β. 3 The second term in the second line is called an out-of-time-order correlator (OTOC) and carries all of the non-trivial information in (2.1). The exponential decay of the OTOC over time is commonly used as an indicator of chaos in quantum many-body systems. Under chaotic time evolution, the commutator with x = 0 exhibits an exponential growth near the scrambling time t * , 4 defined as the time at which the commutator becomes of order unity. Now considering non-zero x, information from a localized Figure 1. Illustration of the butterfly cone. A localized perturbation at some initial time scrambles everything at that location after a scrambling time t * . This effect then spreads out spatially at the butterfly velocity v B , defining the butterfly cone where C β is of order one.
perturbation is scrambled among the local degrees of freedom and spreads throughout the system at the butterfly velocity v B . In this scrambling regime, the behavior of the commutator has the following universal form [9], (2. 2) The Lyapunov exponent λ L gives a time scale for scrambling at a fixed spatial location, while v B parametrizes the delay in scrambling due to spatial separation (see Figure 1). At time t W after the insertion of the W perturbation, the commutator is order unity in the spatial region defined by |x| ≤ v B (t W − t * ), and is exponentially suppressed outside this region. This gives a precise notion of the size of an operator under time evolution. A nice way to think about the OTOC, and hence the commutator, is as the overlap ψ |ψ between the two states Here |TFD β is the thermofield-double state: where L denotes the original system and R denotes an identical copy, |n is a complete set of energy eigenstates with energy E n , and Z(β) is the thermal partition function at inverse temperature β. The states (2.3) are obtained from |TFD β by acting with operators in the L system. In particular, the state |ψ corresponds to acting with W in the past at t = −t W and time evolving to t = 0 before inserting an operator V . The state |ψ corresponds to creating a perturbation V at t = 0, time evolving to the past, inserting W , and finally evolving back to t = 0. Under chaotic dynamics, the operator W is scrambled amongst degrees of freedom in its neighborhood. If W is inserted far enough in the past, it can interfere with the perturbation due to V and prevent it from reappearing at t = 0 in the state |ψ . Consequently, the state |ψ would have a small overlap with |ψ since the latter has V inserted at t = 0 by construction. This small overlap means that the commutator (2.1) is of order one, which defines the butterfly cone.
We now review two methods of calculating the butterfly velocity in holographic systems, which we call the shockwave method and the entanglement wedge method.

The shockwave method
One way of capturing the chaotic behavior in a holographic CFT is to introduce a perturbation at the asymptotic boundary and study the backreaction to the geometry as it propagates into the bulk. We will consider two copies of the CFT in a thermofield double state, which is dual to a two-sided eternal black hole in the bulk. Now let us act on the left boundary CFT with a local operator W at the origin x = 0 and boundary time −t W : In the bulk, this perturbation corresponds to inserting an energy packet near the asymptotic boundary, which then falls towards the black hole. If we take t W to be large, by the time it reaches t = 0, it will have gained considerable energy due to the exponentially large blueshift near the horizon, which then backreacts significantly on the spacetime. For large enough t W , 5 the backreacted geometry is well-described by a shockwave along the horizon as shown in Figure 2.
To be more specific, let us consider a general (d + 1)-dimensional planar black hole. The metric can be written in Kruskal coordinates as where the functions A(uv) and B(uv) in general depend on the gravitational theory and the matter content, and i ∈ {1, . . . , d − 1} labels the transverse directions. The two horizons live at u = 0 and v = 0. We will also rescale the transverse directions so that B(0) = 1. A localized shockwave on the horizon at u = 0 is sourced by a change in the stress tensor due to a perturbation with initial asymptotic energy E The prefactor Ee 2π β t W can be thought of as the effective energy of the blueshifted perturbation. Note that t W is not a spacetime coordinate but rather parameterizes the time at which the perturbation is inserted. Inserting the perturbation at earlier times increases this effective energy and results in a larger backreaction.
To solve for the backreaction, it is sufficient to perturb only the uu component of metric by an amount parameterized by some function h(x), (2.8) The function h(x) -which we will refer to as the shockwave profile -is determined by the equations of motion. Assuming that the equations of motion for the unperturbed background solution (2.6), are satisfied (where S is the gravitational part of the action), it suffices to consider the perturbed equations of motion δE ν µ = δT ν µ (2.10) sourced by the shockwave stress tensor (2.7). For Einstein gravity, this shockwave equation of motion was first derived for a vacuum background by Dray and t'Hooft [24], and later generalized to allow a non-trivial background stress-tensor in [25]. Now we consider general theories of gravity. As we prove in Appendix A, the only non-trivial component of (2.10) is h(x) Figure 3. Bulk representations of |ψ and |ψ . For the state |ψ (left), the shockwave (red) created by W is inserted first and the probe operator V is then inserted at t = 0, manifesting itself as a particle (green) propagating into the future and the past. On its way to the past, it encounters the already existent shockwave and is shifted by an amount h(x). On the other hand, |ψ (right) is created by inserting V first and then placing the shockwave W . The shockwave alters the original trajectory of the V particle, such that it is shifted to the future of the shockwave.
Furthermore, in the same appendix we show that (2.11) truncates automatically at linear order in h(x) (and its x i -derivatives). Indeed, the equations of motion reduce to a single ODE for the shockwave profile h(x), which we will refer to as the shockwave equation. As we will show, (2.11) can be solved for large r = |x| with the following ansatz for the shockwave profile where µ > 0 and # is some integer that will not be important. To see the connection to the OTOC, and hence the butterfly velocity, let us consider the overlap of the two states defined in (2.3) from the bulk perspective (see Figure 3). The states differ in the trajectory of the particle V , due to the shift h(x) from the shockwave W occurring either in the past or future evolution. Because the only difference between the two states is in the particle trajectory, the size of the overlap is controlled by the shift: it is close to unity when the h is small and close to zero when the h is large. Let us for concreteness define the boundary of the butterfly cone to be where C β = 1, which translates to Re ψ |ψ = 1 2 . This corresponds to some threshold value of the shift, say h(x) = h * . According to (2.12), the size of the butterfly cone then grows as a function of t W with the velocity v B = 2π βµ (2.13) at large r. We identify this with the butterfly velocity. Solving the equation of motion (2.11) yields a value for µ and thus a value for v B in terms of the functions A(uv) and B(uv) in the background metric (2.6).
Let us now look at some examples.
Einstein gravity Let us start with the simplest case of Einstein gravity, L EH = 1 2 (R − 2Λ). The calculations follow that in [7,8], which we now review. Using (2.11), the shockwave equation is given by (2.14) In the large-r limit, we can then substitute the ansatz (2.12), and obtain an algebraic equation for the parameter µ, where A 0 ≡ A(0) and B 1 ≡ B (0). Choosing the positive root for µ and using (2.13), we therefore find the butterfly velocity Note that the equation for µ and thus the butterfly velocity depend only on the behavior of the metric near the u = 0 horizon. This is enforced by the overall factor of δ(u) in the equation of motion. As we will see, in the entanglement wedge method, this feature will be reproduced via a different mechanism -by taking a near-horizon limit of an extremal surface. This is one of the many distinctions between the two methods, making their agreement quite non-trivial.
Lovelock gravity As was shown in [8], for Gauss-Bonnet gravity whose Lagrangian is the Einstein-Hilbert term L EH plus the coupling constant λ GB does not contribute to the shockwave equation (2.14).
We can further show that this is the case for Lovelock gravity, a general 2pderivative theory with second order equations of motion. The Lagrangian is L EH plus where the generalized delta symbol is a totally antisymmetric product of Kronecker deltas, defined recursively as Choosing p = 1 gives Einstein-Hilbert, while p = 2 reduces to Gauss-Bonnet. The metric equation of motion is given by (2.20) We are interested in δE v u . It is not difficult to show that the only way to get a non-zero contraction is to make one of the Riemann tensors R vj ui and the rest R mn kl . Using our metric ansatz (2.8), one can show that δR mn kl = 0, so any potential contribution can only come from δR vj ui ∝ δ(u) multiplied by p − 1 factors of R mn kl (see Eq. (3.3) for the detailed expressions). The latter vanishes on the horizon of the planar black hole, due to the flatness of the transverse directions. Hence, we find that as long as p > 1 the Lovelock corrections do not contribute to the shockwave equation (2.14).

The entanglement wedge method
In holography, the butterfly effect manifests itself in another way as first observed in [18]. The intuition comes from entanglement wedge reconstruction [19][20][21][22], which states that a given boundary region contains all of the information inside its entanglement wedge, which is a bulk region bounded by the boundary region and the corresponding extremal surface. As was argued in [18], this allows us to define a notion of operator size on the boundary.
Consider a thermal state in the boundary CFT dual to a planar black hole in the bulk. Let us perturb the boundary state by acting with a local operator. Under the chaotic time evolution, information from the perturbation is scrambled throughout the system. At late times, the boundary region over which this information is smeared propagates outwards at a constant velocity. In the bulk, the perturbation corresponds to a probe particle (or wavepacket) originating from the asymptotic boundary and falling towards the black hole, its trajectory determined from our choice of the bulk theory and the background geometry. According to entanglement wedge reconstruction, any boundary region whose entanglement wedge contains the particle should contain all the information of the corresponding boundary operator. In particular, we would like to consider the extremal surface which barely encloses the particle in its entanglement wedge. The corresponding boundary region then defines a size for the boundary operator.
To extract the butterfly velocity, we now study how this extremal surface changes as it follows the trajectory of the particle. Note that even though the location of the particle is time-dependent, the background spacetime is static 6 and at any given time we may use a Ryu-Takayanagi (RT) surface [26,27] (instead of its dynamical generalization -the HRT surface [28]). At early times, the shape of the RT surface will depend sensitively on details of the background metric. However, at late times, the surface approaches the near-horizon region and exhibits a characteristic profile which propagates outwards at a constant velocity (see Figure 4). This velocity, which we can identify as the butterfly velocity v B , depends only on the bulk theory and the nearhorizon geometry of the black hole (as long as the theory admits a black hole solution, which we can ensure by taking the coefficients of higher-derivative terms to be small so that a solution perturbatively close to the static planar black hole in Einstein gravity exists).
In general higher-derivative gravity, the location of the RT surface is determined by extremizing the holographic entanglement entropy functional S EE among all bulk surfaces homologous to the corresponding boundary region [29][30][31][32]. For the f (Riemann) theories that we focus on, the entropy functional S EE can be found in [29]. The important terms for our purpose is where y denotes a set of coordinates on an appropriate codimension-2 surface, γ is the determinant of its induced metric, K λρσ is its extrinsic curvature tensor, n µν is the induced metric (and ε µν is the Levi-Civita tensor) in the two orthogonal directions while vanishing in the remaining directions, and · · · denotes terms that are higherorder 7 in K λρσ and its derivatives than the second-order term shown here. As we will explain in Section 3.3, these higher-order terms do not affect the calculation of the butterfly velocity. Note that Eq. (2.21) works in Lorentzian signature, which we obtain by analytically continuing the corresponding Euclidean expression via L → −L, n µν → 6 Here we do not need to analyze the backreaction of the particle, unlike in the shockwave method. 7 These higher-order terms are difficult to write down explicitly because of 'splitting' [29,[33][34][35], although they can in principle be determined by using appropriate equations of motion [29,31,32]. Fortunately, here we only need S EE up to second order in K (and its derivatives), which can be obtained by setting q α = 0 in Eq. (3.30) of [29] and is free from the splitting difficulty -this follows roughly from the results of [29] but will also be proved carefully in a separate work [36]. Moreover, the same S EE up to second order in K (and its derivatives) was derived using the second law of black hole thermodynamics [37]. . An illustration of the entanglement wedge method. A particle is released from the asymptotic boundary and we show the RT surface which barely encloses the particle. At early times (left), the profile of the RT surface depends sensitively on the metric away from the horizon; but at late times (right), the RT surface in the near-horizon region has a characteristic profile which propagates outwards at the butterfly velocity v B . n µν , ε µν → −iε µν . 8 Extremizing S EE leads to a differential equation for the location of the RT surface, which we refer to as the RT equation.
For a general planar black hole, the metric is given in (2.6), which can be written in Poincaré-like coordinates as where the horizon is located at z = 1. In the near-horizon region, we can Taylor expand the functions As in [18], we will consider spherical boundary regions, for which the corresponding RT surfaces are also spherical. We use r = |x| to denote the radial coordinate in the x i directions. This reduces the RT equation to a simple ODE. At late times, the probe particle is exponentially close to the horizon and follows a trajectory 1 − z(t) ∼ e − 4π β t given by the geodesic equation. As such, we will focus on the near-horizon region near z = 1. In particular, defining the RT surface by z = Z(x), we can parametrize the near-horizon limit by writing where is a small positive number and we have defined a new function s(x) which we call the RT profile. The profile is determined by solving the RT equation, which we Taylor expand around = 0. The RT surface will stay close to the horizon for 8 The extra factor of −i can be traced back to the definition ε µν = ε ab n (a) with ε ab being the Levi-Civita symbol. Going to the Lorentzian version which comes with √ −g requires a factor of i. The minus sign is a result of going from the convention in [29] where ε τ x = −1 to the standard Lorentzian convention ε tx = 1. small r and start to depart the near-horizon region at some large radius r = r * , where s(x) 2 | r=r * ∼ O(1), after which it reaches the asymptotic boundary within an order one distance. Thus, the corresponding boundary region has approximately radius r * and this defines a size of the boundary operator. To model this behavior, one can use an ansatz 9 where µ > 0 and # is some unimportant integer. At each point in time, we demand the tip of the RT surface intersects the particle, which is enforced by setting s(x = 0, t) ∼ e − 2π β t . Therefore, the time-dependent RT profile is given by This in turn gives the radius of the boundary region (modulo an order one distance), which propagates outward with characteristic velocity Solving the RT equation yields a value for µ and thus a value for v B in terms of the functions f (z) and h(z) in the background metric (2.22). This is the second way of calculating the butterfly velocity.
Let us now revisit the examples in Section 2.1 using the entanglement wedge method.
Einstein gravity For Einstein gravity with L EH = 1 2 (R − 2Λ), the entropy functional is simply given by the area: The entire dependence of this functional on the choice of the surface is then in the determinant of the induced metric γ. Using (2.23) and expanding to linear order in , the induced metric of the surface is given by This is a flat metric up to O( ) corrections, which is consistent with the fact that we are expanding near the horizon of a planar black hole. The determinant is then given by The RT equation is then determined by varying with respect to s(r), which at leading order in gives Substituting the ansatz (2.24) and dropping higher order terms in 1/r then gives where the second equality follows from a coordinate transformation. Using (2.26), we thus find that the resulting butterfly velocity matches the shockwave result (2.16).
Lovelock gravity We will now show that the Lovelock corrections L 2p with p > 1 do not contribute to the RT equation, consistent with the shockwave calculation in Section 2.1. It is well-known that the entropy functional for this theory is given by the Jacobson-Myers formula [38] where R ij kl is the intrinsic curvature of the codimension-2 surface. For an RT surface perturbatively close to the horizon, the induced metric is again given by (2.28). We can therefore immediately write R ij kl ∼ O( ), which implies For p > 2, these terms are higher order in and therefore do not contribute. Thus these Lovelock corrections do not modify (2.31). For the case of p = 2, namely Gauss-Bonnet gravity, a more refined argument is required. In this case, the entropy functional (2.32) depends on the intrinsic curvature scalar R = R ij ij which, up to unimportant numerical factors, is given by where we have neglected terms that only contribute higher orders in 1/r to the RT equation compared to the Einstein-Hilbert term (2.29). The entropy is then given by The first term is a total derivative and therefore does not contribute to the RT equation.
The second term is down by a factor of 1/r compared to the contribution from the Einstein-Hilbert term (2.29). Since we are only interested in the leading large-r profile, this term will not contribute either. We therefore conclude that the Gauss-Bonnet correction does not modify (2.31), reproducing the result from [18].

Butterfly velocities for f (Riemann) gravity
In this section we derive general formulae for the butterfly velocities using the two holographic methods described in Section 2, valid for all f (Riemann) theories. Before diving into the derivations, we will lay out a few useful definitions and notations and calculate a few basic quantities related to the background metric.

Definitions, notations, and the spacetime
As reviewed in Section 2.1, the metric for the shockwave geometry is given by The background metric with h = 0 is simply the planar black hole (2.6). We will denote the functions A, B and their derivatives evaluated on the horizon at u = 0 by A n ≡ ∂ n A(uv)/∂(uv) n | u=0 and B n ≡ ∂ n B(uv)/∂(uv) n | u=0 . Throughout our calculations, we set B 0 = 1 by rescaling the x i coordinates. The non-zero Christoffel symbols are where h i ≡ ∂ i h(x), and we have dropped terms 10 containing uδ(u). It will become clear in our later calculations that the second term vA −1 A hδ(u) in Γ v uu is unimportant because it always enters the equations of motion together with at least one power of u.
The non-zero components of the Riemann tensor are where we have again discarded terms which vanish as a distribution. We see that the only correction due to h is in R uiuj , and the correction is linear in h. The same statement applies for R vj ui . The fact that, given the specific ansatz (3.1), these are the only possible non-zero components of the Riemann tensor is crucial in establishing our proof.
For notational convenience, we define the following tensors and use them to denote the corresponding values evaluated in the background solution and on the u = 0 horizon: (3.4) For example, C ρσ denotes the value of ∂L/∂g ρσ in the background h = 0 solution and on the u = 0 horizon. Here, we have viewed the Lagrangian L as a function of R µνρσ and g µν , i.e., we lower all indices on the Riemann tensor and raise all indices on the metric and treat these two types of tensors as independent variables when taking derivatives. Throughout this paper, we define derivatives with respect to tensors such as R µνρσ and g µν in the standard way; for example, ∂L/∂R µνρσ has the Riemann symmetry and satisfies the identity δL = ∂L ∂Rµνρσ δR µνρσ . Since the i, j indices can only appear in the combination δ ij for any background quantity, we can define the following notation where the transverse components are stripped away: where a, b, c, · · · ∈ {u, v} and i, j, k, . . . are the transverse directions.

The shockwave method
For general f (Riemann) theories, we start by writing down the equations of motion: , E (4)µν = 4 ∂L ∂R µρνσ ;σ;ρ sym(µν) , (3.6) where the Lagrangian L is viewed as a function of R µνρσ and g µν . Here the parenthesized numbers (1), . . . , (4) merely label the various terms and do not have any physical meaning.
We view the shockwave spacetime (3.1) as a perturbation from the background geometry (2.6) with δg uu = −2Ahδ(u). (3.7) The only non-zero component of the perturbation of the inverse metric is As we show in Appendix A, the only component of the equations of motion E ν µ that receives a non-vanishing perturbation from the shockwave is E v u , with We now calculate these two terms separately.
For δE vv , we have (3.10e) In arriving at this, it is important that we use the distributional identity uδ (u) = −δ(u) (see Footnote 10). The δ-function sets u = 0 so the quantities are all evaluated on the horizon. In deriving (3.10), it is useful to note the following simplifying properties. First, in the background solution, every extra v-index downstairs (beyond those paired with a u-index downstairs or a v-index upstairs) costs a factor of u. Similarly, a single i-type index cannot contribute in the background solution since it must come paired with another such index to form a Kronecker delta.
To work out the second term on the right-hand side of (3.9), we find an expression for E uv in the background solution on the horizon: Finally, collecting all the terms into (3.9) and plugging in the ansatz for h(x) yields where we have taken the large-r limit and neglected higher-order terms in 1/r, and is a function of quantities on the horizon through imposing δ(u). This is a quartic 11 equation for µ and a quadratic equation for µ 2 . The correct root is the one that is positive and continuously connected to the unperturbed Einstein gravity result given in Eq. (2.15). We then extract the butterfly velocity v B from (2.13).

The entanglement wedge method
Let us now move on to the entanglement wedge method. As reviewed in Section 2.2, our objective is to find the size of the smallest spherical boundary region whose entanglement wedge encloses a probe particle falling into the black hole. We will work in the same coordinate system as used for the shockwave method. This will make it easier to see the matching with the shockwave result, but at the expense of making the time translation symmetry slightly less explicit. The metric is the planar black hole given by (3.14) We would now like to derive the RT equation for a spherical boundary region in this background using (2.21). Before we proceed, let us make the following simplifying observations: 1. Entanglement surfaces anchored to a single boundary can never penetrate the horizon, so we can choose to work in one of the exterior patches and exploit the time translation symmetry. Because of this symmetry, we only need to look for the RT surface rather than the HRT surface. This means we can restrict to the u = −v hypersurface in order to get the spatial profile of the entanglement surface. This is the t = 0 surface in the original (t, z, x i ) coordinates.
2. Since we are interested in the near-horizon limit, the butterfly velocity can be calculated by extremizing the entropy functional S EE with respect to a candidate RT surface defined by uv = − s(x) 2 to linear order in . Since u = −v, we can consider each factor of u or v as contributing a factor of √ .
3. The entropy functional S EE given in Eq. (2.21) is only accurate at second order in the extrinsic curvature K and its derivatives, but this is sufficient to determine the butterfly velocity. In particular, higher-order terms in K and its derivatives are suppressed by additional powers of , and can thus be neglected in the nearhorizon limit.
4. The ε λ 1 λ 2 term in Eq. (2.21) vanishes when restricting to RT surfaces. To see this, go to a coordinate system where the time translation symmetry is manifest (so ∂ t is the timelike Killing vector field), and note that K λρσ vanishes if λ = t but ε λ 1 λ 2 vanishes unless one of λ 1 and λ 2 is t.
Implementing these simplifications and writing S EE = 2π d d−1 y √ γL EE , we have We call the second term the extrinsic curvature term. The non-zero components of the Riemann tensor in the background solution are again given by (3.3) but with h set to zero, i.e., without the shockwave. With our candidate entanglement surface defined on uv = − s(x) 2 , the components of the two normals are then given by where s i stands for ∂ i s(x). In deriving this, we used the fact that t is a function of u/v, n (1) ∼ dt, and n (2) ∼ df where f = uv + s 2 . Next, we need the following tensors, defined by To linear order in , the non-zero components of ε µν are given by It turns out that we will only need the O(1) term in n µν , and the only non-zero component at this order is n uv = A 0 + O( ). (3.20) We will also need the extrinsic curvatures. To leading order, the only non-zero component is We can now derive the contributions to S EE at linear order in . For any quantity X, we denote the linear order coefficient in a Taylor expansion in by ∆X. Then where the barred quantities are evaluated on the horizon at = 0. Note that quantities such as D µρνσ and H µ 1 ρ 1 ν 1 σ 1 µ 2 ρ 2 ν 2 σ 2 do not need to have bars because they are already defined to be evaluated on the horizon. The last piece is the only contribution from the extrinsic curvature term of S EE since K 2ij = O( √ ). The determinant of the induced metric is given by which can be derived by substituting uv = − s(x) 2 into the metric and expanding the identity det exp M = exp Tr M to linear order. Then where we have used the fact that only ε uv = 0 at zeroth order. For the next term, we need ∆ ∂L ∂R µρνσ = H µρνσµ ρ ν σ ∆R µ ρ ν σ + 2F µρνσ uv ∆g uv + F µρνσ ij ∆g ij , (3.29) We also notice that H vanishes if the numbers of lower u and v indices do not match (each upper u is considered one lower v and vice versa). Then we have Using the expressions for ε µν above, the third term is simply given by In the last term ∆L EE , we notice that only K 2ij has low enough order in to contribute, so we have Finally, putting everything together, the total contribution to the entropy functional at linear order in is given by To obtain the butterfly velocity, we vary ∆S EE = 2π d d−1 y ∆( √ γL EE ) with respect to s(x) and then substitute our ansatz s(x) ∼ e µr r # from (2.24), keeping only leading terms in 1/r. It is not hard to see that the number of x i -derivatives on s will be the number of factors of µ. From this, we obtain a polynomial equation for µ: Notice that all coefficients only involve quantities evaluated on the horizon; this is true as well in the shockwave calculation, where it is enforced by the presence of δ(u). Similar to the shockwave result (3.13), this is a quartic equation for µ and a quadratic equation for µ 2 , from which we choose the positive root continuously connected to the result for Einstein gravity (2.15). The butterfly velocity v B is then obtained from µ using (2.26).
Before proceeding to show that the two butterfly velocities we have derived agree, let us pause for a moment and use the results from this and the previous subsection in an explicit example. Consider the following four-derivative correction to Einstein gravity: L ⊃ R µνρσ R µνρσ = R µνρσ R µ ν ρ σ g µµ g νν g ρρ g σσ . (3.38) The non-vanishing tensor components are given by Substituting these expressions into either (3.13) or (3.37) reproduces the result in [18].

Equivalence of the two butterfly velocities
In Section 3, we derived general expressions for the butterfly velocities from two distinct holographic calculations -the shockwave method and the entanglement wedge method. More specifically, we have obtained polynomial equations for the parameters µ and µ given by f SW (µ) = 0 and f EE ( µ) = 0, respectively. In both cases, to solve for the value of µ (or µ) we treat the higher-derivative couplings perturbatively and choose the positive root that is continuously connected to the value in Einstein gravity. Recalling that the butterfly velocities are related to these parameters via (2.13) and (2.26), it then suffices to prove that f SW and f EE are the same function.
With f SW given in (3.13) and f EE given in (3.37), the two functions are the same if the following two equations hold in the background solution and on the u = 0 horizon: In the rest of this section, we use 'on the background' to mean 'in the background solution and on the u = 0 horizon'. In writing the above two equations, we have used the fact that g uv = A −1 0 and δ ij R uivj = − 1 2 (d − 1)B 1 on the background. We will refer to (4.1a) and (4.1b) as the first and second relations, respectively.
We will now prove the above relations for any given choice of Lagrangian involving contractions of Riemann tensors. To that end, it is useful to think of the terms in (4.1a) and (4.1b) as differential operators acting on the Lagrangian L. For example, C uv can be thought of as the operator acting on the Lagrangian L with the result evaluated on the background. A useful quantity to define is where the factor 1/4 is a symmetry factor from the Riemann tensor. For example, when we act ∂/∂ R uu on a function of R µρνσ such as L, we take the derivative with respect to R uu while holding the traceless part of R uiuj fixed. Now we can rewrite (4.1a) and (4.1b) by defining two operators Our goal is then to prove that on the background. This will be the goal of the remainder of this section. For any Lagrangian composed of a covariant combination of an arbitrary number of the Riemann tensor and inverse metric, we expand it by decomposing the sum over any dummy index into two sums, one over {u, v} and the other over the x i directions. This can be written in the following schematic form where L is an object of the form Here, A denotes any a-type index labelling either u or v, I denotes any i-type index, and the tensor X is a product of any number of inverse metric and Riemann tensor components not explicitly shown in (4.8), i.e., g II , g AI , R IIII , R AIII , R AAII , and R AAAI . Different A (or I) indices may specialize to different a-type (or i-type) indices. As a concrete example, (4.7) for L = g µν g ρσ R µρνσ can be written as L = g ab g cd R acbd + g ab g ij R aibj + · · · where the first term g ab g cd R acbd is of the form g AA g AA R AAAA and the second term g ab g ij R aibj is of the form g AA R AIAI g II , with a, b, c, · · · ∈ {u, v} and i, j, k, . . . labelling transverse coordinates. Notice that all a-type and i-type indices are contracted.
As we are only interested in O i L on the background, we may simplify (4.8) significantly by dropping those terms that vanish eventually. In particular, g AI , R IIII , R AIII , R AAII , and R AAAI all vanish on the background. 12 As the derivatives in O i do not involve these components, if L in (4.8) contains any of these components, it would vanish after acting with O i and evaluating on the background. Therefore, we can restrict L to those that do not contain any of these components. Similarly, the traceless part R aibj − R ab δ ij of R aibj vanishes on the background, and as the derivatives in O i do not involve this traceless part, we can restrict L to those that do not contain the traceless part, and thus we may replace all instances of R AIAI in (4.8) with R AA . Therefore, we replace (4.8) with L = g AA · · · g AA R AAAA · · · R AAAA R AA · · · R AA , (4.9) up to a multiplicative constant that we do not need to keep track of. Now define index loops by connecting the two indices of g ab , the two indices of R ab , the first two indices of R abcd , and its last two indices. For example, the term g ab R ab has a single (index) loop. In general, L contains one or more loops, and the two antisymmetric pairs of indices in any R abcd = R [ab] [cd] need not be part of the same loop. In order for a loop not to vanish on the background, it must consist of alternating u and v: either (uvuv · · · uv) or (vuvu · · · vu). For example, g ab g cd g ef R af bc R de consists of a single loop (abcdef ), with non-vanishing contributions on the background (4.10) while g ab g cd g ef R adbc R ef has the two loops (abcd) and (ef ), with non-vanishing contributions on the background It turns out that it is sufficient to prove O i L = 0 for L made of a single loop, because even for L made of multiple loops, O i must act entirely on a single loop to have a chance to be non-trivial: in particular, if we act the two derivatives in any second-derivative term of (4.5) -such as ∂ 2 /∂ R uu ∂g uu -on two different loops, one of the two loops would have to contain an extra factor of g uu , R vv , R vvab , or R abvv , thus vanishing on the background. 13 Therefore, from now on we consider a general L made of a single loop. It may be written as where n is the number of g ab factors and the tensor T is a product of a suitable number of R abcd and R ab . 14 Our goal is thus to prove on the background for any L of the form (4.12). The general statement (4.6) then follows. Before proceeding, let us introduce some useful terminology. For simplicity, we rename the g ab factors appearing in (4.12) so that the loop is precisely (a 1 b 1 a 2 b 2 · · · a n b n ). For any neighboring pair of inverse metrics g a k b k , g a k+1 b k+1 (where k = 1, 2, · · · , n and a n+1 ≡ a 1 , b n+1 ≡ b 1 ), the b k , a k+1 indices are either (1) contracted with some R b k a k+1 cd or R cdb k a k+1 , or (2) contracted with R b k a k+1 . In the first case, we say that there is an R-contraction between g a k b k and g a k+1 b k+1 , while in the second case, we say that there is an R-contraction between g a k b k and g a k+1 b k+1 . More generally, for any k ≤ l we say that there is an R-contraction between g a k b k and g a l b l if there is an R-contraction between any neighboring pair among g a k b k , g a k+1 b k+1 , · · · , g a l b l , and we say that there is an R-contraction not between g a k b k and g a l b l if there is an R-contraction between any of the other neighboring pairs (i.e., among g a 1 b 1 , · · · , g a k b k or g a l+1 b l+1 , · · · , g anbn ). Similar statements apply for R-contractions. Note that the number of R-contractions and R-contractions add up to n. As an example, in the loop g a there is an R-contraction between g a 3 b 3 and g a 1 b 1 , as well as between g a 1 b 1 and g a 2 b 2 ; and there is an R-contraction between g a 2 b 2 and g a 3 b 3 .

First relation We now prove the first relation
on the background for any L of the form (4.12). First, consider O On the background, we have O is the sum of the two terms in (4.12) where the loop (a 1 b 1 a 2 b 2 · · · a n b n ) consists of alternating u and v: either (uvuv · · · uv) or (vuvu · · · vu). These two terms differ by a factor of (−1) m where m is the total number of R-contractions, 15 because R uv = R vu but exchanging u and v in each R-contraction (i.e., each antisymmetric pair of indices in R abcd ) costs a minus sign. In other words, where the factor of 1/2 comes from the symmetry of g ab . Second, consider O On the background, we have O where L 1 is the sum of all terms in (4.12) where the loop (a 1 b 1 a 2 b 2 · · · a n b n ) is alternating except for two 'defects' at g a k b k = g uu and g a l b l = g vv , for any k = l. The two derivatives in O 1 act precisely on these two defects. If k < l, compared to the alternating loop (uvuv · · · uv) we are exchanging u and v in all R-and R-contractions between g a k b k and g a l b l . Since each R-contraction costs a minus sign and each R-contraction gives a plus sign, such a loop contributes (−1) s kl g uu g vv (g uv ) n−2 T uv···uv (4.21) to L 1 , where s kl (sometimes also written as s k,l ) is defined to be the number of Rcontractions between g a k b k and g a l b l . By definition we have s kl = s lk and s kk = 0, with no summation implied.
If k > l, compared to the alternating loop (uvuv · · · uv) we are exchanging u and v in all R-and R-contractions not between g a k b k and g a l b l . Since there is a total of m R-contractions and thus the number of R-contractions not between g a k b k and g a l b l is m − s kl , such a loop contributes (−1) m−s kl g uu g vv (g uv ) n−2 T uv···uv (4.22) to L 1 .
Combining the above two cases, we find On the background, we have O where L 1 is the sum of all terms in (4.12) where the loop (a 1 b 1 a 2 b 2 · · · a n b n ) is alternating except for two 'defects' at g a k b k = g uu and R b l a l+1 = R uu , for any k, l, whether or not they are equal. If k ≤ l, compared to the alternating loop (uvuv · · · uv) we are exchanging u and v in all R-and R-contractions between g a k b k and g a l b l . Such a loop contributes 1 . This expression is nice because it applies to any l satisfying k ≤ l, whether or not there is actually an R-contraction between g a l b l and g a l+1 b l+1 . If there is, we have s k,l+1 = s kl and (4.27) gives the correct contribution. If not, there must be an R-contraction instead between g a l b l and g a l+1 b l+1 , so we find s k,l+1 = s kl + 1 and (4.27) vanishes.
If k > l, compared to the alternating loop (uvuv · · · uv) we are exchanging u and v in all R-and R-contractions not between g a k b k and g a l+1 b l+1 . Such a loop contributes 1 . Again, this expression vanishes if there is actually an R-contraction between g a l b l and g a l+1 b l+1 .
Combining the above two cases, we find 1 .
This can be obtained from O 1 L by exchanging u with v. This leads to Combining all four pieces of O 1 , we find On the background, we have O 2 L (2) 2 (4.39) where L (2) 2 is the sum of all terms in (4.12) where the loop (a 1 b 1 a 2 b 2 · · · a n b n ) is alternating except for two defects at R b k a k+1 = R uu and R b l a l+1 = R vv , for any k = l. If k < l, compared to the alternating loop (uvuv · · · uv) we are exchanging u and v in all R-and R-contractions between g a k+1 b k+1 and g a l b l . Such a loop contributes to L 2 . As with (4.27), this expression applies to any l satisfying k < l, regardless of whether the b k , a k+1 and b l , a l+1 indices are contracted to some R b k a k+1 and R b l a l+1 .
If k > l, compared to the alternating loop (uvuv · · · uv) we are exchanging u and v in all R-and R-contractions not between g a k b k and g a l+1 b l+1 . Such a loop contributes Using the sum relations that we show in Appendix B, the prefactors in (4.40) and (4.41) after summing over k, l simplify to Combining the two cases, we find They were worked out in (4.30) and (4.32), respectively. We therefore simply quote the results here: Combining all four pieces of O 2 , we find (4.46) We have therefore proven that the two functions f SW and f EE are the same for any f (Riemann) theory, as claimed. This immediately implies our main result v B = v B via (2.13) and (2.26).

Discussion
In this paper, we have shown that the butterfly velocity can be calculated using two distinct methods in holography: the shockwave method or the entanglement wedge method. We proved that the two methods give the same result for any f (Riemann) theory by direct computation. To find the butterfly velocity, we have solved the metric perturbation in the shockwave calculation and the near-horizon shape of extremal surfaces in the entanglement wedge calculation. In both methods, we have also taken a large-radius expansion in the transverse directions. After finding general expressions using both methods, their matching was not immediate. Nevertheless, exploiting the symmetry of the background solution on the horizon, we have shown that the difference indeed vanishes.
While our calculations show explicitly that the two methods are equivalent for a large class of theories, a deeper and more intuitive understanding of the equivalence remains an interesting open question. In particular, the holographic entanglement entropy formula was derived by evaluating the gravitational action on a Euclidean conical geometry and varying it with respect to the conical angle [29][30][31][32]39], whereas the shockwave equation is derived in a Lorentzian spacetime with no conical defects. Furthermore, the shockwave profile (2.12) is exponentially decreasing in r, but the RT profile (2.24) is exponentially increasing in r. All these distinctions make the two methods appear very different, and finding a more direct way to connect them will likely shed light on the relationship between holographic entanglement and gravitational dynamics in general.
We now describe some potential future directions: More general gravitational theories: It would be interesting to see if the equivalence holds beyond f (Riemann) theories. To that end, we have worked out an example whose Lagrangian depends explicitly on the covariant derivative and found that the two methods continue to agree. More precisely, the Lagrangian contains and we find that its contributions to f SW (µ) and f EE (µ) are equal (in d = 3) and given by The holographic entanglement entropy functional for this theory can be found in [33]. This example suggests that the two methods continue to agree in higher-derivative theories beyond f (Riemann). It would be interesting to prove this generally, including cases where the gravitational theory is coupled to matter fields with general interactions. It would also be interesting to understand this better in the context of string theory, perhaps building on the results of [11,40].
Beyond the butterfly velocity: It would also be interesting to see if other properties of the OTOC related to shockwave quantities besides the butterfly velocity can be connected to properties of the entanglement wedge, further strengthening the link between gravity and entanglement.
Connections to the Wald entropy: An interesting connection between gravitational shockwaves and the Wald entropy was found in [41]. It was shown that the shockwave and microscopic deformations of the Wald entropy were related by a thermodynamic relation on the horizon. Since our main result establishes a connection between shockwaves and the generalized entropy (2.21), it would be worth investigating to what extent their result can be related to ours.
Constraints on higher-derivative couplings: As a potential application of our results, one could try to understand the constraints on higher-derivative couplings from the perspective of quantum chaos. To avoid issues related to unitarity and causality at finite couplings, we have treated the higher-derivative interactions perturbatively. The signs of these couplings appear constrained by the butterfly velocity. For example, in d = 2 the butterfly velocity equals the speed of light in Einstein gravity. Therefore, requiring it be subluminal with higher-derivative corrections imposes constraints on the signs of the couplings. Given our expressions for a large class of higher-derivative theories, it would be interesting to see if requiring the butterfly velocity be subluminal can provide further constraints.
Relation to pole-skipping: Throughout the paper we have focused on two methods of calculating the butterfly velocity -the shockwave method and the entanglement wedge method. However, it has been suggested that the butterfly velocity (and more generally the OTOC) is also related to the phenomenon of pole-skipping [42][43][44]. In the gravitational context, this is related to the appearance of special points in Fourier space of the Einstein equations near the horizon, from which the Lyapunov exponent and butterfly velocity can be extracted. Although both the pole-skipping calculation and the shockwave method involve finding solutions to certain metric perturbations, the exact details are different. It would be interesting to explore their connections further.
Asymptotically flat spacetimes: Finally, both methods we discussed rely only on the near-horizon geometry and are therefore potentially generalizable beyond AdS spacetimes, such as asymptotically flat spacetimes, perhaps along the lines of [45].
gravity including, but not limited to, f (Riemann). As an aside, we will also show that the only component of the equations of motion perturbed by the shockwave is E v u . To achieve this, it will be useful to define a notion of chirality. Consider a (not necessarily covariant) tensor of the form X a 1 ···a b 1 ···bn built out of g µν , g µν and ∂ µ , where indices a 1 , · · · , a , b 1 , · · · , b n can be either u or v, and we have suppressed i-type indices on X. We define the chirality of any of its components as We refer to any tensor component with χ = 0 as being non-chiral, and otherwise as being chiral. For example, the components g uu and E v u are chiral since both have χ = 2, while R uivj is non-chiral since it has χ = 0.
For all higher-derivative gravity theories, the equations of motion involve the metric, the Riemann tensor, and covariant derivatives. We can rewrite them using only the metric, the inverse metric, and partial derivatives. The only metric component that contains hδ(u) is g uu = −2Ahδ(u); similarly, the only inverse metric component having hδ(u) is g vv = 2A −1 hδ(u). A general term in E v u therefore takes the form where we have collected all v-derivatives into the beginning of the expression (so that they are understood to act on particular parts of X but not necessarily on X as a whole), and collected everything that does not involve g uu , its u-derivatives, or g vv into X 0 . As g uu = g vv = 0, X 0 is a product of g uv , ∂ # u g uv , and g uv . Let χ 0 be the chirality of X 0 ; it is equal to the total number of u-derivatives, and thus always non-negative. As g uv is a function of uv only, each ∂ u acting on g uv produces a factor of v, and we find where f 0 (uv) is some function of uv. Since E v u has chirality 2, we need for the chirality of the term in (A.2) to agree.
Since v appears only in the combination uv in all metric functions, each ∂ v in (A.2) produces a factor of u unless it acts on an explicit factor of v produced by ∂ u . In general, the ∂ u acting on g uu = −2A(uv)hδ(u) in (A.2) can act either on A(uv) or the δ-function.
Let us first consider the simplest case where all ∂ u shown in (A.2) act on the δ-function. In this case, using (A.3) we find X = v χ 0 f 1 (uv)δ (n 1 ) (u) · · · δ (n k ) (u) (δ(u)) m , (A. 5) where f 1 (uv) is some function of uv. Therefore, the term (A.2) in E v u behave at most as X ≡ (∂ v ) N X ∼ u N −χ 0 δ (n 1 ) (u) · · · δ (n k ) (u) (δ(u)) m , (A. 6) keeping only the leading dependence on u. Here we have acted as many ∂ v as possible on v χ 0 ; if not, we would get subleading contributions that are suppressed by additional powers of u. We will show momentarily that the leading contribution (A.6), understood as a distribution, vanishes under the condition (A.4) unless it is actually δ(u) or u n δ (n) (u) for some n. Thus any subleading contribution suppressed by additional powers of u would always vanish as a distribution. Now consider the more general case where not all ∂ u shown in (A.2) act on the δfunction. Every ∂ u that does not act on the δ-function must act on A(uv) and produce an additional factor of v (for one more ∂ v to act on) -thus the net effect on the term X in (A.6) is to decrease one of the n i by 1 and effectively increase χ 0 by 1. This preserves the condition (A.4), so it does not change our argument below.
We now show that the distribution (A.6) vanishes under the condition (A.4) unless it is actually δ(u) or u n δ (n) (u) for some n. To see this, we regularize the δ-functions in (A.6) as narrow Gaussian functions: We find where we have used (A.4) in going to the second line. In the first line, we have written down a contribution to the regularized X(u) where all u-derivatives act on the exponent of e −u 2 / 2 ; every u-derivative that does not act on the exponent would remove a factor of u 2 / 2 from the first line, but would not change the final result.
We now take the → 0 limit. By construction, k + m ≥ 1 since we are interested in corrections to the equations of motion due to the shockwave which have at least one factor of δ(u) or its derivative. If k + m > 1, then the integral I vanishes as we send to zero. We are left with only two cases: either k = 0, m = 1 where X ∼ δ(u), or k = 1, m = 0 where X ∼ u n δ (n) (u) for some n. In either case, the term is a well-defined distribution and linear in h, concluding our proof for E v u . Finally, consider other components of the equations of motion, e.g., E v i , E v v , etc. They have χ ≤ 1, so we must have more powers of ∂ v compared to (A.4), and the corresponding distribution must have more powers of u compared to (A.6). Thus the integral I would go like at least O( k+m ), which vanishes in the → 0 limit as long as k + m ≥ 1. Therefore, other components of the equations of motion are not perturbed by the shockwave.

B Proof of sum relations
In this appendix, we prove the sum relations (4.42a) and (4.42b) used in the main proof. We will need the following identities where in going to the last line we have used the fact that the sum is the same as (B.2) with k ↔ l.