Statistics on Lefschetz thimbles: Bell/Leggett-Garg inequalities and the classical-statistical approximation

Inspired by Lefschetz thimble theory, we treat Quantum Field Theory as a statistical theory with a complex Probability Distribution Function (PDF). Such complex-valued PDFs permit the violation of Bell-type inequalities, which cannot be violated by a real-valued, non-negative PDF. In this paper, we consider the Classical-Statistical approximation in the context of Bell-type inequalities, viz. the familiar (spatial) Bell inequalities and the temporal Leggett-Garg inequalities. We show that the Classical-Statistical approximation does not violate temporal Bell-type inequalities, even though it is in some sense exact for a free theory, whereas the full quantum theory does. We explain the origin of this discrepancy, and point out the key difference between the spatial and temporal Bell-type inequalities. We comment on the import of this work for applications of the Classical-Statistical approximation.


Introduction
What is the obstacle that generally hinders computations in Quantum Field Theory (QFT), when we can readily perform the corresponding computations in Quantum Mechanics (QM)? The short answer is the infinite number of degrees of freedom in QFT. To see how the degrees of freedom may be enumerated, we can look at the functional Schrödinger equation for the real, interacting Klein-Gordon field φ. This takes the form i ∂ ∂t φ, t|in = d d x − 2 2 δ 2 δφ 2 (t, x) + 1 2 (∇φ(t, x)) 2 + V (φ(t, x)) φ, t|in , (1.1) where V is the potential and φ, t|in is the field-configuration representation of the wavefunctional with initial state |in . Each φ, t|in corresponds to a cross-section of φ line bundle, and the total number of cross-sections would amount to N (Nx) d φ , if we discretized both the x and φ spaces. Notice the appearance of the number of spatial dimensions d in the exponent, which indicates an exponential growth in the number of degrees of freedom with either N φ or N x . The special case, of course, is d = 0, which is, in fact, QM. Furthermore, the evaluation of the functional derivative, δ 2 /δφ 2 (t, x), makes use of the neighbouring states/cross-sections. Thus, to solve the functional Schrödinger equation, we should keep the values for all of them, and the exponential growth in the number of degrees of freedom renders the approach insoluble on a normal computer.
While, in quantum theory, all states are equal, some states are more equal than others. If we do the statistics correctly, we do not really need to include all states in order to get an accurate enough result. This is the idea behind the stochastic method. The question is: What determines whether one state is more important than another, or, in other words, from which Probability Distribution Function (PDF) should we sample these states? The answer is well known in QFT and it can be extracted from a path-integral approach to the problem.
Recall that, to calculate the expectation value Ô (t) of some operatorÔ(t), we compute Ô = Tr ρ(t 0 )Ô(t) /Tr ρ(t 0 ) , where we assume the density operatorρ is known at the time t 0 . The partition function Tr [ρ(t 0 )] is the integral of the PDF. For an analogy, consider the normalization dx e −x 2 , arising when calculating x 2 = dx e −x 2 x 2 / dx e −x 2 for the Gaussian PDF e −x 2 . In the expression for Ô (t) , the initial density operator ρ(t 0 ) and the operatorÔ(t) are evaluated at different times. Therefore, to construct a path-integral representation of this expectation value, we should connect these operators together by constructing a closed-time path integral, according to the Schwinger-Keldysh prescription [1,2]. More specifically, we can write the expectation value as where Dφ ≡ Dφ + Dφ − is the functional integration measure, L is the Lagrangian and O(t) is the field-configuration representation of the operator. The C on the time integrals indicates the closed-time contour: a contour with a time-ordered "+" branch, starting from t 0 and ending at some time t m > t > t 0 , and an anti-time-ordered "−" branch running backwards from t m to t 0 . The superscripts "+" and "−" on the initial field configuration φ 0 , identify whether their time arguments lie on the + or − branches of the closed-time contour C. Following the Schwinger-Keldysh approach, the expression (1.2) can be obtained directly by using the time-evolution operatorÛ (t 1 , t 2 ) = T exp − i t 2 t 1 dt Ĥ (t ) , whereĤ is the Hamiltonian operator and T is the time-ordering operator, 1 or by inserting complete sets of eigenstates of the Heisenberg-picture field operators. We note that the choice of the end time of the contour t m is arbitrary, so long as t m > t. Alternatively, one can arrive at the same expression from the functional Schrödinger equation (1.1) by recalling that its solution can be written as a convolution of the path integral with the initial wavefunctional, i.e., Ψ(φ m ; t m ) = φm Dφ e iS/ Ψ(φ 0 ; t 0 ), with the expectation value computed . With this path integral in mind, our plan to compute expectation values by picking a sample of important states is analogous to doing an integral using generated samples, such as one might do in a Monte-Carlo approach. However, by rephrasing QFT in the language of statistics, we are not led to any direct solution. In fact, we face a more challenging question than is found in standard statistics: How to deal with the fact that the PDF is complex-valued.
The non-negativity of the PDF in standard statistical analysis has a number of crucial effects. On one hand, non-negativity is an important condition to guarantee a well-behaved Markov chain, thereby allowing a Monte Carlo computation. On the other hand, there are some key properties that are implied by non-negativity. One of them relates to the Bell inequalities [3]. For instance, from the viewpoint of statistics, there can be no violation of the inequality if the observables A, B and D, of value ±1, are drawn from some real-valued, non-negative distribution function. One can prove this by generating samples according to such a distribution. Since any single realisation will satisfy the inequality, so will the average, given that AB = N i=1 A i B i /N and A i , B i and D i are restricted to be ±1. However, a complex PDF can lead to violations of such inequalities.
Complex PDFs call for complex analysis. Consider, e.g., the complex Gaussian integral dx e ix 2 . With the complex PDF e ix 2 , we cannot ascertain which states are more probable than others, unlike with the normal distribution e −x 2 . Nevertheless, we know that we can perform the integral by deforming the contour of integration into the complex plane by Cauchy's theorem. By choosing a better contour, we can then make sense of the statistics. In this simple case, we find a normal distribution if we rotate the integration contour by π/4 in the complex x plane, so that ix 2 → −x 2 . Under the same principle, the Lefschetz thimble approach provides a powerful tool for QFT, whereby we complexify all real-valued fields [20] and are furnished with a prescription for finding a suitable integration contour. Applications of the Lefschetz thimble approach to real-time path integrals has recently attracted significant attention [22,23,25,26,[28][29][30]33].
A popular approximation used in the evaluation of the real-time path integral is the Classical-Statistical (CS) approximation. There, the quantum evolution is replaced by the classical evolution of an ensemble of initial conditions, drawn from a non-negative PDF. The interpretation in the context of the Lefschetz thimble Monte-Carlo is that we keep only the critical points of the action and allow only for non-negative Wigner functions. This is quantitatively a good approximation in the limit where field occupation numbers are large. This may be expressed in terms of discarding diagrams in a perturbative expansion (at high temperature [9]), in terms of the "statistical" propagator being much larger than the "spectral" propagator (e.g., in preheating [12,15]) or in terms of the state being "squeezed" (e.g., in cosmological perturbations during inflation [6]).
The prescription is as follows: When field modes acquire large occupation numbers, one may evolve them using classical equations of motion, and compute observables as in classical field theory. When the fields interact, all modes are coupled, but the dynamics is still expected to be dominated by the ones with high occupancy. In a few specific cases, one may opt to seed the evolution with an initial state resembling the quantum zero-point fluctuations (the "half" [10,13,14], and for a critical analysis, see Ref. [34]). This assumes that the evolution is linear (non-interacting) until the phenomenon under consideration (e.g., resonance, instability or inflation) amplifies these seeds into exponentially large occupation numbers. The "half" can of course not lead to truly quantum phenomena and is, from the point of view of the classical evolution, simply a curious non-thermal initial condition. Even so, it can give rise to spurious effects, since it has divergent energy in the UV (see, for instance, Ref. [15]).
In the case of a free theory, the CS approximation is, in some sense, exact. In spite of this, the aim of this article is to describe where and why the CS approximation is nevertheless incapable of describing the violation of Bell-type inequalities, both spatial Bell inequalities and temporal Leggett-Garg inequalities, even for a free theory.
The remainder of this article is organised as follows. In Sec. 2, we provide a brief introduction to the CS approximation within the context of the Lefschetz thimble approach. In Secs. 3 and 4, we focus on the temporal Leggett-Garg inequalities and the spatial Bell inequalities, respectively. Our conclusions are presented in Sec. 5. Some useful results are collected in the Appendices.

Classical-Statistical approximation and Lefschetz thimbles
To compute the expectation value appearing in Eq. (1.2), it proves convenient to introduce the following field variables: 2 In terms of these variables, the closed-time path integral involves the exponent with the bulk term The partition function can then be arranged in the form is the initial Wigner function, in which the boundary term from Eq. (2.2) has fulfilled the role of the kernel in a Weyl transformation of the initial density operator. Notice that we have relabelled π cl 0 ≡φ cl 0 to make contact with the Hamiltonian form of the path integral (see Ref. [31]). The prime on the integration measure indicates that the fields on the initial temporal boundary, such as φ 0 (x) and π 0 (x), have been excluded.
Due to the Hermiticity of the density operator, i.e.,ρ † =ρ, the initial Wigner function must be real-valued. In comparison, the bulk term e −I is purely a phase term. Notice in addition that, to make the bulk path integral well-defined, the existence ofφ cl requires two temporal boundary conditions, which are provided here by the Wigner function through φ cl 0 andφ cl 0 . This structure also has an impact on the critical point and related Lefschetz thimble as follows.
We first notice the critical point 3 of I satisfies with the initial values of φ cl andφ cl determined by the initial Wigner function W (φ cl 0 , π cl 0 ; t 0 ). That is, the critical point corresponds to a φ cl that follows the classical trajectory with initial data specified by W (φ cl 0 , π cl 0 ; t 0 ), and, as an initial value problem, there exists one and only one solution. This is the virtue of the two-step evaluation of the path integral. Namely, if we separate the path integral into the initial Wigner function W (φ cl 0 , π cl 0 ; t 0 ) and the dynamical part e −I , there will exist one and only one Lefschetz thimble for each initialization generated by the Wigner function. By Lefschetz thimble, we mean the manifold generated by the gradient flow originating from the critical point [20], and therefore the number of thimbles equals the number of critical points. For more on the two-step evaluation, we refer the readers to Refs. [28,29].
The dynamical part I consists of odd terms in φ q . If there are only linear terms in φ q , we can integrate φ q out to obtain functional delta functions in φ cl . Conversely, for nonlinear potentials that will yield odd terms in φ q of higher powers, imposing the same delta functions corresponds to the approximation of dropping these higher terms from Eq. (2.3). This is known as the Classical-Statistical (CS) approximation [8,9,11,17,21]. Note that this approximation maintains the non-linearity in φ cl . In corollary, no such approximation is needed if there are only quadratic terms in the Lagrangian, and the CS approximation is then, in some sense, exact. The situation remains, however, non-trivial when the parameters vary in space and time. For instance, certain quantum effects in the early Universe can be well approximated via quadratic terms on the time varying background, and the ensemble average of classical evolutions will provide a honest description.
After integrating out φ q , the CS approximation leads to where the delta function is understood in the functional sense. We see that the CS approximation to the partition function makes use of the critical points only, i.e., the classical trajectories, with their initialization distributed according to the initial Wigner function. Notice that the whole distribution function above is non-negative if the Wigner function is non-negative. In comparison to the original expression (2.4), we may regard the φ q as hidden variables, only this time, the PDF is complex. However, this analogy is not quite complete, as we may also wish to measure φ q -dependent operators. We want to stress that the complex PDF is a necessary condition for the violation of Bell inequalities, since if the distribution function is non-negative, one can always do the sampling and the generated samples cannot violate Bell inequalities. Now, with Eq. (2.7), we might speculate that there should not exist any violation of Bell inequalities in the free theory when a non-negative PDF about φ cl can be drawn. As we will see, this is not the case.

(Temporal) Leggett-Garg inequalities
With regards the CS approximation, it will turn out that the temporal Bell-type inequalities due to Leggett and Garg [4] are of a richer structure, and we therefore choose to treat these before the more familiar spatial Bell inequalities.
The Leggett-Garg inequalities deal with measurements at different times. For the measurement operator, we chooseQ = sign(φ), which is a proper dynamical operator that maps the continuous variable φ into a dichotomous one [16,27], taking values ±1.
It is useful to consider which correlators the experiment can measure and which twopoint correlation functions they correspond to in the theory. In an experiment, we prepare an ensemble of sets, consistent with the same initial state |ψ . For each set, a measurement Q 1 ≡ Q 1 = r is read out at t 1 and another measurement Q 2 ≡ Q 2 = s is read out at a later time t 2 , with r, s = ±. The joint probability P (r, s) can then be calculated [19], e.g., as where N (r, s) is the number of sets of Q 1 = r and Q 2 = s. Accordingly, the correlator is defined as For dichotomous variables, the correlator can further be related to the quantum two-point correlation function [19]. To see this, recall that we have two probabilities for the two measurements: which correspond to the probability of finding r at t 1 , given the initial state |ψ , and the probability of finding s at t 2 , given the state |r; t 1 , i.e., the state into which the first measurement collapses the system. The joint probability of finding r at t 1 and s at t 2 is then simply the product P (r, s) = | s; t 2 |r; t 1 | 2 | r; t 1 |ψ | 2 = ψ|r; t 1 r; t 1 |s; t 2 s; t 2 |r; t 1 r; t 1 |ψ .

(3.4)
A dichotomous operator admits the following propertieŝ and we can therefore write It is then straightforward to obtain the following equalities: The last of these [Eq. (3.7d)] in particular means that the correlator measured in the experiment via Eq. (3.2) is really the quantum two-point correlation function involving the anti-commutator of the measurement operators. Interestingly, the above four equations indicate that the measurement procedure allows us to find ψ|Q 1 |ψ and ψ|{Q 1 ,Q 2 }|ψ , but not ψ|Q 2 |ψ . As we shall see, this is not the case in the CS approximation, where we are able to find ψ|Q 1 |ψ and ψ|Q 2 |ψ , but not ψ|{Q 1 ,Q 2 }|ψ .

Violation of the Leggett-Garg inequalities
Since, in the case of the Leggett-Garg inequalities, the measurements are performed at the same spatial site, we can simplify the analysis significantly and consider the question in d = 0 spatial dimensions, i.e., in QM. As a concrete example, we focus on the quantum harmonic oscillator, subject to the Schrödinger equation We assume a Guassian initial state, displaced from the minimum of the potential well by an amount ∆, i.e., This leads to a positive Wigner function of the initial density matrix (with the explicit form shown in Eq. (3.27)). The full expression for the time-dependent wave function can be written in terms of the Feynman kernel (3.12) By inspection, we see that this corresponds to a Gaussian probability distribution, whose central peak oscillates about the origin with frequency ω and amplitude ∆. In fact, it is a coherent state of the form [5] (3.14) We know the one-point function of the coherent state will oscillate between ∆ and −∆, and this provides a perfect scenario for Leggett-Garg's proposal [24]. With this in place, we can calculate the two-point function directly via where Q(φ) ≡ sign(φ). This integral can be computed numerically, and we present some additional details of its structure in Appendix A.

A conundrum
As we argued earlier, the CS approximation cannot violate Bell inequalities, due to the non-negative PDF. We also saw that the CS approximation is actually an exact description of the free theory, for which we have just found a violation of Bell inequalities. So why do we get this apparently contradictory conclusion for the Bell inequalities? To answer this question, it is useful to check the definition of the QFT two-point function in the φ cl -φ q basis; given t 2 > t 1 , where we have introduced the shorthand notation φ i ≡ φ(t i , x i ). We recall that Dφ ≡ Dφ + Dφ − . The appearances of φ ± in Eq. (3.18a) can be understood by looking at Fig. 1. Specifically, if we want to compute the correlator Q 2Q1 = Tr ρ(t 0 )Q 2Q1 /Tr ρ(t 0 ) , we construct the path integral by inserting complete sets of states from t 0 to t 1 , then from t 1 to t 2 and finally from t 2 back to t 0 , which is just the path given in the upper plot of Fig. 1. The operator at the larger time may appear on either branch of the contour without affecting the result. In fact, we can contract the path so that it ends at the larger time, i.e., t 2 , and then use the results of Refs. [28,29] to set φ q (t 2 , x) = 0, yielding the final line of Eq. (3.18b). Instead, the CS approximation can only compute (if we restore both the φ + and φ − integrals), We can already conclude that C QFT and C CS do not calculate the same thing, and that the temporal Bell inequalities rely upon more than just φ cl .
For the free scalar field, we can proceed further, by integrating out φ q . Using the results of Appendix B, we find In the expression above,φ cl (t, x) denotes the field as evolved from the initial dataφ cl (t 0 , x) = φ cl 0 (x) andφ cl (t 0 , x) = π cl 0 (x) via the classical equation of motion, which may be represented using the retarded Green function G R as The associated momentum field is defined as There are two more fields involved in the expression: The integration variable (or π cl 1 for the shorthand notation) is the momentum field at time t 1 and position x 1 , and the fieldφ cl 2 depends on π cl 1 via 4

(3.24)
That is,φ cl 2 is almost a classical solution, except with the momentum coming from time t 1 and position x 1 , which is determined by the integration variable π cl 1 . Eventually, the integration over π cl 1 leads to, where Si(w) ≡ Returning to our QM example, the retarded Green's function, for t 2 > t 1 , and the normalized Wigner function are given respectively by (3.27) The two-point function (3.25) is then Less obvious is that this expression yields the same result as that given by Eq. (3.15), which one can evaluate numerically. Notice that we could obtain the correct violation of the Bell inequality from a CS calculation if we were to use the Si function and retarded Green's function as in Eq. (3.25), instead of the product of sign functions that the CS approximation naively leads to. This prescription, however, only applies to the free theory, and there is no obvious way to determine what replacement should be used for general interactions.

(Spatial) Bell inequalities
An interesting observation is that the Si function appearing in Eq. (3.25) admits the following limit: Since the retarded propagator G R (t 2 , x 2 ; t 1 , x 1 ) is causal, it vanishes when the separation of two points is space-like, i.e., (t 2 − t 1 ) 2 < |x 2 − x 1 | 2 , which means that, for the free theory, C QFT and C CS give the same result in the case of spatial Bell inequalities. On the other hand, when the Wigner function is non-negative, the two-point functions in form of Eq. (3.26) can never violate Bell inequalities. Thus, there can be no violation of Bell inequalities among space-like correlation functions in free scalar field QFT, unless some entanglement exists in the initialization. Beyond the free theory, we point out that it is a general property of QFT that φ q does not appear explicitly in the calculation of the spatial Bell inequalities. In fact, the following equations are valid in general QFT: with the reason discussed in Appendix B. In comparison, the CS approximation computes where I denotes the I with the higher-order terms in φ q discarded. In what follows, we will refer to those higher-order terms in φ q as "quantum vertices", a name that can reflect their roles in Feynman diagrams [9,21,28]. Since the CS approximation with Eq. (4.3) cannot violate any Bell-like inequalities, while the full quantum theory with Eq. (4.2) can, and given that the only difference between the two concerns the quantum vertices, we might speculate that these quantum vertices must have something to do with the origin of quantum entanglement. (Here, we equate quantum entanglement with the violation of Bell inequalities.) This seems to be the case with the following arguments.

Quantum vertices and quantum aspect of φ q
We first notice that there exists a loophole in the above reasoning, which is related to the initialization. When quantum entanglement appears in the initial state, the Wigner function will be negative-valued in some field region. The negative distribution will make the standard sampling method difficult, if not impossible, to generate initializations for the CS approximation. That being said, if there exists some sophisticated re-sampling method for doing the initialization, the CS approximation will escape the constraint of not violating Bell inequalities. On the other hand, the entanglement that already exists in the initialization does not help us to understand the origin of quantum entanglement. For this purpose, the best scenario is where the system starts from a non-negative Wigner function, and gains some violation of spatial Bell inequalities during the evolution. Consider, for instance, the decay of a spinless particle into two photons. In this case, as we argued above, the CS, as an approximation, cannot capture such quantum entanglement. We can therefore conclude that it is those higher order φ q -terms in the action -the quantum vertices -that make quantum entanglement possible, and it is the absence of them that renders the CS approximation unable to capture quantum entanglement.
We further point out in the φ cl -φ q representation that the quantum properties are usually accompanied by the appearance of φ q . Recall, from the last section, the central role that the φ q plays in the Leggett-Garg or temporal Bell inequalities and here in the origin of the violation of spatial Bell inequalities, through the quantum vertices.
Another example is the commutation relation, which is in fact a q-cl two-point function. To see this, first recall, from the field-configuration representation of path integral, that the field momentum can be computed via It is then straightforward to derive the commutation relation and obtain In the free theory, the relevant two-point function in Eq. (4.6a) is given by which, as a double-check, gives the right delta function in the limit dt → 0. In retrospect, when facing Eq. (3.27), the explicit Wigner function in the example, and noticing φ cl 0 and π cl 0 are independently distributed, one might doubt whether the initialization would respect the commutation relation. Now, as we see, one should in fact compute the two-point function in Eq. (4.6b) in order to check the commutation relation. Note that φ q 0 and π cl 0 appear in the kernel of the Weyl transform (2.5). Thus, by a substitution φ q 0 W = i δW/δπ cl 0 and then integration by parts, we can verify [φ(t 0 , x),π(t 0 , y)] = i δ d (x − y). In particular, the validity does not depend on the detail of the Wigner function or the initial density matrix, but only on the Weyl transform.

Conclusions
We have studied QFT from the viewpoint of statistics. A general feature of the real-time path integrals is that the PDF is complex-valued, and, as a result, standard statistical tools, such as Markov chain or Monte Carlo, cannot be applied directly. This is reminiscent of the so-called "numerical sign problem". To deal with the complex path integral, we can seek to use the Lefschetz thimble method, and we can summarize the following general properties of the real-time path integral (for further details, see Refs. [28,29]): • The real-time path integral admits a two-part separation into the initial density matrix (via the Wigner function W ) and the dynamical part (e −I ). The initial Wigner function provides the initial data for the dynamical part, and it can be either non-negative-or generally real-valued. The dynamical part, on the other hand, is purely a phase term, which is always situated on the edge of the convergent region in the complexified φ-space.
• Applying the Lefschetz thimble approach to the dynamical part, we find that the saddle points consist of classical trajectories. In particular, for each initialization, there exists a unique classical trajectory/saddle point/Lefschetz thimble.
• The exponent in the dynamical part (e −I ) includes only odd terms in φ q . If we throw away any higher-order terms in φ q , integrating out the φ q leads to functional delta functions that pick out the classical solutions. This leads to the CS approximation. Note that, after they have been integrated out, the φ q fields become hidden in the sense that we can no longer compute φ q -dependent operators within the CS approximation.
In this paper, we have further pointed out that the complex PDF is a necessary condition for the violation of Bell-type inequalities, since, if the distribution function is nonnegative, one can always apply the sampling method, and the so-generated samples cannot violate Bell-type inequalities. In this sense, the CS approximation is not expected to yield any violation of Bell-type inequalities, as it is restricted to non-negative PDFs.
This observation, however, leads to a conundrum that the free theory in a coherent state, for which the CS approach makes no approximations, admits a violation of the Leggett-Garg or temporal Bell inequalities. We resolved this puzzle by demonstrating that, for the example measurement operatorQ = sign(φ), the temporal two-point function depends on both φ cl and φ q . Note that the PDF involving φ cl only is non-negative, but the PDF with φ q included is complex. This further validates our conclusion above that a complex PDF is needed for violations of Bell-type inequalities.
We have also identified a key difference between the spatial and temporal Bell-type inequalities. We first recall that, in any (local) QFT [32], any two operators with spacetime arguments that are spacelike separated commute. In the language of φ cl -φ q , this is obvious because the product of two operators contains φ cl only. Thus, the spatial Bell inequalities have classical analogues, and one can therefore compute them directly within the CS approximation (but one will not see a violation). In comparison, there are no such analogues for the temporal Bell-type inequalities due to the explicit dependence on φ q . In this case, the CS approximation should not be used to compute Leggett-Garg inequalities directly, not even as an approximation.
These observations in relation to the CS approximation and Bell-type inequalities can be summarized as follows: 1. The CS approximation is consistent with the full QFT treatment for a non-negative initial Wigner function in the case of spatial Bell inequalities and in the absence of interactions, because the configuration φ q = 0 is picked out.
2. The CS approximation fails to capture violations of spatial Bell inequalities for the free theory, because we are restricted to non-negative Wigner functions.
3. We suspect that the CS approximation cannot be made consistent for spatial Bell inequalities for interacting theories, even for a non-negative initial Wigner function, because of the non-trivial φ q dependence, which will prevent us from identifying a CS equivalent for the true QFT correlation function.
4. The naive application of the CS approximation to the free theory cannot capture the violation of temporal Bell inequalities due to the φ q dependence of the measurement operators. While, for the free theory, we have been able to identify a CS-equivalent correlation function (viz. the sine integral expression rather than the product of sign functions in our example), the identification of such a prescription may not be straightforward in the interacting case, as per point 3. above.
The CS approximation is quantitatively sound when occupation numbers are large, i.e., φ cl is much larger than φ q . Discarding higher order terms in φ q altogether then allows straightforward sampling of a non-negative initial PDF and evolution with classical equations of motion. While this prescription follows explicitly from the quantum path integral, as we have described, it amounts in practice to performing a (ensemble-averaged) classical field theory simulation. However, what we have seen in this work is that attempts to stretch the CS approximation to truly quantum phenomena require extreme care. In the non-interacting case, one would expect agreement between quantum and classical results because of the linearity of both the quantum operator equations and the classical equations of motion. However, we have seen that the CS prescription does not in general give the correct qualitative result, because the observables themselves may be constructed from the φ q . As a result, the step in which φ q is integrated out does not go through. It is certainly illuminating to think of the CS approximation as a limit of the thimble formalism of the quantum path integral, but its applicability remains valid only for phenomena where φ cl φ q , both for the evolution and the observables -the classical realm.
where we have written with ≡ 0 + , and defined The φ + and φ − integrals can be performed analytically, so long as Re 4A 2 − C 2 > 0, and we arrive at the result After making the change of variables √ ωξ ± → ξ ± , √ ω → , (A.6) we have in which we see that the only dependence on the model parameters is through the parameter c = ω∆ 2 /(2 ). Instead, by making the change of variables in Eq. (A.2), the integral can be written in the form The factor is independent of time in the limit t 1 = t 2 . In fact, this is because one obtains a delta function of φ q in this limit. To see this, we need to make use of the following limit representation of the delta function: We can then take lim and since lim and lim 16) we are justified in writing lim (A.17) as above.

B Path integral for the dynamical part
For the dynamical part e −I , we have the continuum expression given in Eq. (2.2). However, to gain a better understanding of the real-time path integral, it is convenient to consider the discrete form where we have adopted the shorthand notation 5 We first notice that there are only linear terms in φ cl m (x), where t m is the latest time on the contour. Thus, if we integrate all φ cl m (x) out, we will obtain the functional delta functions δ φ q m−1 (x) . We can then proceed to integrate out all φ q m−1 (x), and the result will have an exponent similar to Eq. (B.1), but now with t m−1 as the end point. This corresponds to the contraction of the contour along the real-time line, and it is for this reason that the φ q will play no role in the spatial Bell inequalities in Sec. 4. For further details, see Refs. [28,29].
On the other hand, without the higher-order derivatives of the potential, e.g., in a free theory, we can integrate out all φ q simply via (B.3) Given the shorthand notation (B.2), the delta function above simply enforces the classical trajectory for φ cl j+1 , given φ cl j and φ cl j−1 . We also note that we interpret Eq. (B.3) in the sense that space has also been discretized, with Eq. (B.3) referring to a particular spatial point x. With this delta function, we can further integrate out the φ cl fields. This is how we proceed with the denominator in passing from Eq. (3.18b) to Eq. (3.20).
For the numerator, we need to proceed a little differently, due to the presence of external operators. For sites on which none of the operators are located, we can still apply the integral above, which leads to the same delta functions. On the special sites at t 1 (denoted with a discrete index k 1 ), where an operator is situated, we apply the following integral: . ( .

(B.6)
Notice that this reduces to the same delta function in the limit 2φ cl k 1 (x 1 )d d x/( dt) → 0. We are now in a position to integrate out the remaining φ q , which fall into two categories: those between t 0 and t 1 , and those between t 1 and t 2 . Each integration results in a delta function of the form in Eq. (B.3), and so the remaining φ cl integrals (except for φ cl k 1 +1 (x 1 )) pick out the classical trajectories, one starting at t 0 , and the other starting at t 1 . As a result, we can integrate out all φ cl in the numerator, except φ cl k 1 +1 (x 1 ). If we now return to Eq. (3.18b) and define the double angle brackets . . . to be a shorthand notation for the path integrals without the φ ± 0 integrals, we find 1 2 sign(φ + k 1 (x 1 )) + sign(φ − k 1 (x 1 )) sign(φ cl k 2 (x 2 )) = 1 π dφ cl k 1 +1 (x 1 ) sin 2φ cl The tilde refers to the classical trajectories with the initial data at t 0 . Theφ cl k 2 (x 2 ) also satisfies the classical equation, but with the initial data at t 1 , i.e., including π cl k 1 (x 1 ). Thus, with the initial Wigner function, we obtain the expression given in Eq. (3.20).