Bell inequalities and quantum entanglement in weak gauge boson production at the LHC and future colliders

Quantum entanglement of weak interaction gauge bosons produced at colliders can be explored by computing the corresponding polarization density matrix. To this end, we consider the Higgs boson decays H→WW∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H\rightarrow W W^*$$\end{document} and H→ZZ∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H\rightarrow Z Z^*$$\end{document}, in which W∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$W^*$$\end{document} and Z∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Z^*$$\end{document} are off-shell states, and the WW, WZ and ZZ di-boson production in proton collisions. The polarization density matrix of the di-boson state is determined by the amplitude of the production process and can be experimentally reconstructed from the angular distribution of the momenta of the final states into which the gauge bosons decay. We show that a suitable instance of the Bell inequality is violated in H→ZZ∗\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$H\rightarrow Z Z^*$$\end{document} to a degree that can be tested at the LHC with future data. The same Bell inequality is violated in the production of WW and ZZ boson pairs for invariant masses above 900 GeV and scattering angles close to π/2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\pi /2$$\end{document} in the center of mass frame. LHC data in this case are not sufficient to establish the violation of the Bell inequality. We also analyze the prospects for detecting Bell inequality violations in di-boson final states at future e+e-\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$e^+e^-$$\end{document} and muon colliders. A further observable that provides a lower bound on the amount of polarization entanglement in the di-boson system is computed for each of the examined processes. The analytic expressions for the polarization density matrices are presented in full in an Appendix. We also provide the unitary matrices required in the optimization procedure necessary in testing the Bell inequalities.


Introduction
The most natural way to generate entanglement [1] between two quantum systems is through their mutual interaction; any interaction dynamics involving the degrees of freedom of both systems is bound to create quantum correlations and yield detectable effects measurable through suitable quantum observables.
An instance of such an interaction is high-energy collisions: they give rise to quantum entanglement among the elementary particles partaking in a scattering process-thereby providing the possibility to study and test entanglement in a novel setting. This opportunity has indeed recently drawn some interest and has been explored in a series of papers [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17]. Most of these analyses have focused on distinguishing quantum mechanics from alternative local and deterministic theories through the exploration of Bell inequalities [18][19][20][21][22] in the energy range probed at the Large Hadron Collider (LHC). We continue these studies by analyzing possible quantum correlations in the polarization states of weak interaction gauge bosons produced at colliders.
Massive gauge bosons act as their own polarimeters and their spin polarizations can be reconstructed from the angular distribution of the final leptons or jets (when generated by down-type quarks). We assume that the polarization density matrix of the two bosons can be fully reconstructed (albeit with limited efficiency) from the angular distributions of their decays into final states. The actual uncertainty affecting such a reconstruction-as well as the effect of backgrounds, unfolding and of the detector-can only be estimated through dedicated numerical simulations.
In the following, we analyze the production of gauge bosons via the resonant Higgs boson decays H → W W * and H → ZZ * , where W * and Z * denote off-shell states, and study the W W , W Z and ZZ production proceeding from proton-proton collisions in a manner reminiscent of the Drell-Yan mechanism. The density matrix of the di-boson state is determined analytically for each process from the corresponding amplitudes and depends on the kinematic variables characterizing the scattering process. Once the density matrix is known, it is possible to test the (Collins-Gisin-Linden-Massar-Popescu) CGLMP inequality [23,24]-a Bell inequality optimized for three-level systems as those describing the polarizations of massive spin-1 particles-on the available kinematic configurations. We also compute an observable used as proxy for the entanglement in processes characterized by massive spin-1 final states. The corresponding operator yields a lower bound for the entanglement of the two boson system in each of the analyzed cases, thereby serving as a witness of the entanglement in their polarizations.
We find that the CGLMP inequality is violated in di-boson Higgs decays and that such a violation in the case H → ZZ * case could be tested at the LHC with future data. For the W W , W Z and ZZ production in proton collisions, instead, the CGLMP inequality is violated only in the W W and ZZ channels for invariant masses above 900 GeV and scattering angles close to π/2 in the center of mass frame. Due to the small number of events expected in this kinematic region, it is difficult to assert the CGLMP inequality violation with sufficient accuracy. A better significance will be achieved at future lepton colliders, as we also show in the following.
The theoretical uncertainty of our results is at most of order 10% in the continuum W W , W Z and ZZ Drell-Yan processes, due to the implied QCD next-to-leading order (NLO) contributions [25]. We checked that uncertainties in the parton distribution functions (PDF) are negligible. Smaller theoretical uncertainties of order of a few percent are expected for the W W * and ZZ * processes in the resonant Higgs decay region, due to NLO EW corrections. [26] We estimate the overall uncertainty in the event selection by considering the efficiency in the identification of the lepton final states and their momenta reconstruction-which conservatively we take to be 70% for each lepton-and a distribution of the observables due to uncertainty in the off-shell gauge boson mass. When W -bosons are in the final states, we add a systematic error to take into account the intrinsic difficulty of the physical analysis that requires dedicated algorithms to reconstruct the neutrino momenta. Though naive, our estimates still indicate the most promising processes for this kind of studies at the considered collider machines. It is our hope that this preliminary study encourages the experimental collaborations to assess the power of current and future machine to probe quantum entanglement through full simulations. Since our results are analytic and the related uncertainties, as mentioned, only include a guess on the efficiency of the reconstruction, these numerical simulations are paramount for a realistic estimate of the uncertainty. Numerical simulations were performed at the parton level in [5] and [12], for the Higgs boson decays, and in [15] for the di-boson production from quarks. We compare these numerical results with ours when discussing our findings.
The polarization of weak gauge bosons has been studied in the literature in terms of helicity amplitudes [27]. Our approach differs in that we derive the full density matrix of the di-boson system as required for studying the presence of entanglement.
The paper is organized as follows. Sec. 2 introduces the entanglement witness we utilize, the Bell and CGLMP inequalities and shows how the polarization density matrix of interest can be built from the amplitude of the underlying scattering process. In Sec. 3 we apply our methodology to two massive gauge bosons originated in a Higgs boson decay. The entanglement of a di-boson system created at the LHC or at future colliders is investigated in Sec. 4. We conclude in Sec. 5 by briefly summarizing our findings. The correlation coefficients of all the considered density matrices are listed in C. These expressions can be useful for future studies and, to the best of our knowledge, have not been reported in the literature before.

Methods and tools
In this section we introduce the observables used in the study of entanglement and violation of Bell inequalities. We also briefly review the calculation of the polarization density matrix for a system formed by one or two massive gauge bosons, as well as the procedure to reconstruct it from the momenta of the leptons emitted in their decay process.

Observables
We start by discussing quantum correlations in the context of three-level systems-in short, qutritsimplemented, for instance, by the possible polarizations of massive gauge bosons. We use a specific Bell inequality optimized for this system and a suitable measure of entanglement to probe the correlations encoded in the polarization density matrix of the two qutrits.

Entanglement
Quantifying the entanglement content of the state of a quantum system is generally challenging as the complexity of the problem increases with the system dimensionality [1]. For pure states-systems described by a vector in the Hilbert space, or equivalently, by a density matrices that is a projectorthe problem can be addressed by considering their Schmidt decomposition [1], but already for the simple case of bipartite systems only partial answers are available. No general rule is applicable to mixed states, in which case one can only rely on so-called entanglement witnesses: quantities that give conditions sufficient to establish the presence of entanglement in the system. A computable example of such a witness is connected to concurrence, a reliable entanglement measure for bipartite two-level systems-that is, consisting of two qubits [28].
Consider a bipartite quantum system comprising two subsystems of equal dimensionality, A and B, described by a normalized pure state |Ψ⟩ and density matrix |Ψ⟩⟨Ψ|. The concurrence of the system is then defined as [29] C[|Ψ⟩] = 2 1 − Tr (ρ r ) 2 , r = A or B , (2.1) where ρ r is the reduced density matrix obtained by tracing over the degrees of freedom of either subsystem: e.g. for r = A one has ρ A = Tr B |Ψ⟩⟨Ψ| . Any mixed state ρ of the bipartite system can be decomposed into a set of pure states {|Ψ i ⟩}, its concurrence is then defined by means of the concurrence of the pure states appearing in the decomposition through an optimization process: where the infimum is taken over all the possible decompositions of ρ into pure states. Clearly, for a pure state (2.1) the concurrence vanishes if and only if the state is separable, that is: |Ψ⟩ = |Ψ A ⟩ ⊗ |Ψ B ⟩.
As the same holds for mixed states [30], the concurrence appears to be a good entanglement detector. Unfortunately, the optimization problem appearing in (2.3) makes the evaluation of the concurrence a very hard mathematical task with a simple analytic solution only when A and B are two-level systems. Any approximation or numerical computation of C[ρ] only holds as an upper bound and thus cannot serve to reliably distinguish between entangled and separable states, or to give an estimate of a state entanglement content.
Lower bounds on C[ρ] for a generic density matrix ρ can be analytically computed and, if nonvanishing, unequivocally signal the presence of entanglement. One of these bounds is easily computable, yielding [31] C with ρ A = Tr B [ρ] and ρ B = Tr A [ρ] being the reduced density matrices. A non-vanishing value of C 2 then implies a concurrence larger than zero, thus witnessing the entanglement of the density matrix ρ.
Interestingly enough, an upper bound for C[ρ] has also been obtained [32]; explicitly, one finds The maximum value for the concurrence is obtained for a totally symmetric and maximally entangled pure state. For two qutrits this is The concurrence lower bound (2.5) will play the role of entanglement witness in our study of the spin polarization states formed with two massive gauge bosons.
If the bipartite state of interest is a pure state, it is possible to quantify its entanglement by computing the entropy of entanglement:

Bell inequalities
Local deterministic theories provide descriptions of a physical system that match the results of quantum mechanics for the averages of relevant system observables. Yet, in view of the deterministic and locality assumptions, these stochastic classical models are bound to satisfy a set of inequalities known as Bell inequalities [18][19][20][21][22], which are instead violated by the statistical predictions of quantum mechanics. An experimental determination of any Bell inequality is thus able to discriminate between these classical local models and quantum mechanics.
Whereas an essentially unique Bell inequality can be formulated [33] in the case of a bipartite system made of two qubits, different Bell inequalities can be found in the literature for systems of higher dimensionality. Among these, the CGLMP inequality [23,24] is an optimal generalization of the qubit inequality for systems made of two qutrits.
In order to explicitly write this inequality, consider again the two components A and B of the two qutrit system. For the qutrit A, select two spin measurement settings,Â 1 andÂ 2 , which correspond to the projective measurement of two spin-1 observables having each three possible outcomes {0, 1, 2}. Similarly, the measurement settings and corresponding observables for the other qutrit B areB 1 and B 2 . Then, denote by P (A i = B j + k) the probability that the outcome A i for the measurement of A i and B j for the measurement ofB j , with i, j either 1 or 2, differ by k modulo 3. One can then construct the combination: For deterministic local models, this quantity satisfies the following generalized Bell inequality, which instead can be violated by computing the above joint probabilities using the rules of quantum mechanics. Given a state ρ of the two-qutrit system, the above probabilities are computed in quantum mechanics as expectation values of suitable projector operators; for instance, the probability of the outcome A 1 = B 1 = 1, when measuringÂ 1 andB 1 , is given by , where e.g. P A 1 =1 projects onto the subspace of the A-Hilbert space whereÂ 1 assumes the value 1. Therefore, in quantum mechanics, I 3 in (2.9) can be similarly expressed as an expectation value of a suitable Bell operator B: The explicit form of B depends on the choice of the four measured operatorsÂ i ,B i , i ∈ {1, 2}. Hence, given the two-qutrit state ρ, it is possible to enhance the violation of the Bell inequality (2.10) through a specific choice of these operators. We remark that the numerical value of the observable is bound to be less than or equal to 4. For the case of the maximally entangled state in (2.7), ρ = |Ψ + ⟩⟨Ψ + |, the problem of finding an optimal choice of measurements has been solved [23]. By working in the single spin-1 basis formed by the eigenstates of the S 3 spin operator (A.2) with eigenvalues {1, 0, −1}, the Bell operator takes the following explicit form (see [34], though there it is written in the so-called computational basis): It should be noticed that, perhaps surprisingly, the maximal violation of (2.10) obtained with B is for a density matrix which is not maximally entangled [34], making it evident that entanglement theory in higher dimensions is rather intricate. Within the choice of measurements leading to the Bell operator in (2.12), there is still the freedom of modifying the measured observables through local unitary transformations, which effectively corresponds to local changes of basis. Correspondingly, the Bell operator undergoes the change: where U and V are independent three-dimensional unitary matrices. In the following we make use of this freedom to maximize the value of I 3 for any given density matrix ρ; as the gauge boson polarization states depend on the relevant kinematic variables, this optimization procedure is to be performed independently for each point in phase space. We give the explicit forms of the matrices that maximize the observable I 3 for the processes analyzed as they can be useful in future numerical simulations.

Density matrix for one spin-1 particle
Let us start by defining the reference frame we use to describe the polarization of a spin-1 particle at rest. To this purpose we introduce a set of three orthonormal (three-)vectors, n,r,k , forming a right-handed system:n =r ×k. The normalized helicity eigenvectors ψ ±,0 of the massive spin-1 particle of mass M , corresponding respectively to eigenvalues λ = ±1, 0, are having chosen thek direction as the direction of quantization. In order to describe the helicity of the spin-1 particle in a more general reference frame and in a covariant manner, we first promote the three basis vectors to four-vectors by extending them with a null temporal component and then perform a Lorentz boost along the −k direction. As a result, in the new frame the spin-1 particle acquires a velocity β = 1 − M 2 /E 2 along the positivek direction and possesses a 4-momentum p µ = E(1,kβ), where E is the particle energy in this frame. By construction, the boosted basis vectors 15) are orthogonal to the four-vector n µ 0 = E/M (1,kβ) (proportional to the particle momentum) and with it form an orthonormal vierbein n µ m . The label m ∈ {0, 1, 2, 3} indicates the vector: g µν n µ m n ν n = −δ mn with g µν = diag(1, −1, −1, −1) being the Minkowski metric.
The wave vector ε µ (p, λ) of a spin-1 particle can then be expressed in a covariant form as a linear combination of the three reference vectors (n µ 1 , n µ 2 , n µ 3 ) orthogonal to the particle momentum giving the standard representation of the spin-1 wave vector in the helicity λ basis. It can be easily checked that in the particle rest frame, where (β → 0), the equation above reduces to Eq. (2.14). From Eq. (2.16) we can construct the covariant helicity projector operator of a spin-1 particle with four-momentum p, mass M and polarization ε µ (p, λ) [35] where S i , i ∈ {1, 2, 3}, are the spin-1 representations 1 of the SU (2) generators and ϵ µναβ is the fully antisymmetric Levi-Civita tensor with ε 0123 = 1. The matrices S ij are defined as with i, j ∈ {1, 2, 3} and 1 being the 3 × 3 unit matrix. The covariant relation (2.17) can be verified by substituting the expression for ε µ (p, λ) in Eq. (2.16) with n µ i given as in Eq. (2.15) and the (S i ) λλ ′ and (S ij ) λλ ′ matrix elements as provided in Appendix C.
Consider now the probability amplitude M for the production of a massive spin-1 particle of momentum p and helicity λ, given by (2.20) Then, the polarization density matrix of a massive spin-1 particle can be written in the helicity basis as 1 Explicit matrix representations are given in Appendix A on the basis where the eigenstates of S3 read corresponding to the eigenvalues +1, 0 and −1, respectively.
where the |M| 2 = λ M † (λ)M(λ) is the unpolarized square amplitude (the sum in Eq. (2.21) over possible internal degrees of freedom of initial state particles is understood). By using the expression in Eq. (2.20) we have that where the expression for the covariant projector is given in Eq. (2.17).
The relation above provides a simple way to compute the polarization density matrix of one massive spin-1 particle starting from the amplitudes M of the related production process. As all 3×3 matrices, ρ can be decomposed on the basis formed by the eight Gell-Mann matrices T a (see Appendix A) and the unit matrix as follows where the T a satisfy the orthogonality condition Tr [T a T b ] = 2 δ ab . The coefficients v a , which depend on the kinematic variables of the process, are scalar quantities and can be easily obtained by projecting the ρ matrix on the Gell-Mann basis: Expressions for S i and S ij , i, j ∈ {1, 2, 3}, in terms of the Gell-Mann matrices are given in Appendix A.

Density matrix for two spin-1 particles
We first analyze the case of a pair of spin-1 particles with same mass M , and then generalize the kinematics to the case of two particles with different masses. Consider the production of a pair V 1 V 2 via the Drell-Yan topology initiated by quark-antiquark fusionq where p i are the momenta of initial state quarks and k i and λ i (i ∈ {1, 2}) the four-momenta and helicities of V 1 and V 2 , respectively. For processes initiated in proton-proton collisions we can assume massless quarks. Without the loss of generality we choose to work in the center of mass (CM) frame, where the orientation of the basis unit vectors {n,r,k} relative to the quark beam is illustrated in Fig. 1. In this frame, thek direction is taken to coincide with the axis defined by the three-momenta of the two spin-one particles produced, with k 1 indicating the positive verse. Then, taking the 3-momentum of the antiquark along thep direction, so that p 1 = E(1,p) and the scattering angle Θ matches the angle between the vectorsp andn. The remaining unit vectors composing the orthogonal {n,r,k} system are then given byr We define the spin eigenstates for each particle in its own rest frame as in Eq. (2.14). The corresponding expressions for the CM frame are then obtained via a Lorentz boost by −β, for V 1 , and +β for V 2 , where β = 1 − 4M 2 /s and s is the squared of the CM energy. We report below the form taken in this frame by the initial and final states momenta Given the polarization vectors ε µ (k 1 , λ 1 ) and ε ν (k 2 , λ 2 ), associated respectively with V 1 and V 2 , the corresponding polarization bases n µ i (1) and n µ i (2) (i ∈ {1, 2, 3}, cf. Eqs. (2.16)-(2.15)) are given by n µ 1 (1) = n µ 1 (2) = (0,n) , n µ 2 (1) = n µ 2 (2) = (0,r) , n µ 3 (1) = γ (β,k) , n µ 3 (2) = γ (−β,k) , (2.28) where γ = 1/ 1 − β 2 is the Lorentz factor. The above vectors satisfy the normalization conditions In case of production of two different gauge bosons, as well as for the decay of the Higgs boson into a pair of weak interaction gauge bosons (one necessarily off-shell 2 ), the above relations generalize as follows.
Let k be the common magnitude of the momenta of the produced particles in the CM frame, √ s = E 1 + E 2 the total energy in the same frame, M the heaviest mass of the two spin-1 particles and f M the lightest one, with 0 < f < 1. The four-vectors k 1 and k 2 are then given by with corresponding velocities β 1,2 = k/E 1,2 . The expression of the vectors n µ 1,2 (1) and n µ 1,2 (2) remain the same as in Eq. (2.28), while here where γ 1,2 = 1/ 1 − β 2 1,2 are the corresponding Lorentz factors. The normalization conditions remain the same as in the degenerate case, Eq. (2.29), but with scalar product Turning now to the computation of the polarization density matrix for two spin-1 bosons of arbitrary non vanishing masses, the matrix element M(λ 1 , λ 2 ) for the related production amplitude can be written as (2.34) The polarization density matrix is accordingly defined as where, as usual, |M| 2 stands for the unpolarized square amplitude and a sum over the possible internal degrees of freedoms of initial state particles is understood. By using the covariant expression for the spin-1 projectors P µν λλ ′ (k) defined in Eq. (2.17), we can rewrite the the density matrix in Eq. (2.35) as (2. 36) In the case at hand, ρ(λ 1 , λ ′ 1 , λ 2 , λ ′ 2 ) can be decomposed on the basis of the 9×9 matrices formed by the tensor products {1 ⊗ 1, 1 ⊗ T a , T a ⊗ 1, T a ⊗ T b }, with T a again the 3×3 Gell-Mann matrices. In particular, we have 3 The eight components of f a and g a , as well as the 64 elements of h ab , can be obtained by projecting ρ on the desired subspace basis via All the terms computed via Eq. (2.38) are Lorentz scalars which depend only on the energy E, the velocity β and the scattering angle Θ in the CM frame. It is possible to compute the observable quantifying the entanglement in the gauge boson system once the coefficients f a , g a and h ab are known. The lower bound C 2 , introduced in Section 2.1 as an entanglement witness, can be written in terms of the coefficients in Eq. (2.38) as

Reconstructing the correlation coefficients from the data
The actual processes observed at colliders are with missing energy E miss T due to the possible presence of neutrinos in the final state. These processes include the production of the gauge bosons through the resonant Higgs boson channel, as well as via quark fusion, and include the consequent decays into the final leptons (for the Zs) or the jets of interest (for the W s)-plus the jets originating from X spectator quarks.
The spin 1 gauge bosons act as their own polarimeters. For instance, in the decay W + → ℓ + ν ℓ the lepton ℓ + is produced in the positive helicity state while the neutrino ν ℓ in the negative helicity state. The polarization of the W + is therefore measured to be +1 in the direction of the lepton ℓ + . The opposite holds for the decay W − → ℓ −ν ℓ and the polarization of the W − is therefore measured to be −1 in the direction of the lepton ℓ − . In both the cases, the momenta of the final leptons (see Fig.1) provide a measurement of the gauge boson polarizations. The same is true for final jets from d and s quarks. These momenta are the only information that we need to extract from the numerical simulation or the actual data.
How do we go about reconstructing the correlation coefficients h ab , f a and g a of the density matrix starting from the momenta of the final leptons? This problem has been recently discussed in [15], which we mostly follow in the remainder of this section.
The cross section we are interested in can be written as [36] 1 σ in which the angular volumes dΩ ± = sin θ ± dθ ± dϕ ± are written in terms of the spherical coordinates (with independent polar axes) for the momenta of the final charged leptons in the respective rest frames of the decaying particles. The dependence on the invariant mass m V V and scattering angle Θ in Eq. (2.42) is implied. The density matrix ρ V 1 V 2 in Eq. (2.42) is that for the production of two gauge bosons given in Eq. (2.37). The density matrices Π ± describe the polarization of the decaying gauge bosons. The final leptons are taken to be massless-for their masses are negligible with respect to that of the gauge boson. They are projectors in the case of the W -bosons because of their chiral coupling to leptons. These matrices can be computed by rotating to an arbitrary polar axis the spin ±1 states of the weak gauge bosons taken in the z direction and are given, in the Gell-Mann basis, as where the functions q a ± can be written in terms of the respective spherical coordinates, as reported in Eq. (B.1) of Appendix B, for the decay of W -bosons. 4 We can define another set of functions which is assumed to exist. The explicit form of the functions p n ± are given in Appendix B Eq. (B.2) . The functions in Eq. (B.2) can be used to extract the correlation coefficients h ab from the bidifferential cross section in Eq. (2.42) through the projection The correlation coefficients f a and g a can be obtained in similar fashion by projecting the single differential cross sections: The density matrices Π ± are not projectors in the case of the Z-bosons because the coupling between Z-bosons and leptons contains both right-and left-handed components, whose strengths are controlled by the coefficients g L = −1/2 + sin 2 θ W and g R = sin 2 θ W . In this case, one must introduce a generalized form of the functions in Eq. (B.1) which is defined as the following linear combinations and define from these the corresponding orthogonal functionsp n to be used in Eq. (2.38). They are the same for both the ± coordinate sets and given bỹ where the matrix a n m is given in Eq. The analysis outlined in this Section is experimentally rather challenging because both the CM frame of the collision and the rest frame of the gauge bosons must be determined as precisely as possible to compute the correlation coefficients h ab , f a and g a with reasonable uncertainties.

Estimating the uncertainty
We model the uncertainty in the value of our observables as a Gaussian dispersion-controlled by the number of events-in the determination of the kinematical variables. The number of events is modulated by the efficiency in the identification of the final charged leptons or jets.
To this random error, we add, in the case of the W W final states, a systematic error that takes into account the significant uncertainty in the reconstruction of the gauge boson momentum from the missing momentum of the neutrinos. This reconstruction comes from the kinematical constraints together with dedicated algorithms. Estimates show that the distribution of the differences between the true and the reconstructed momentum over the true momentum has a standard deviation ranging from 30%, for simple kinematical reconstructions with smearing effects of the detector, to 3% in more advanced machine learning algorithms (see, for instance, [37][38][39][40][41]). We take for this uncertainty a conservative benchmark value of 30% (and show for the W W continuum how the determination changes for a smaller value) and add it in quadrature to the Gaussian error coming from the number of available events.
We run from 1000 (Higgs decay) to 10000 (Drell-Yan-like production) pseudo experiments as we vary the kinematical variables and compute for each of these values the observable I 3 . The distribution so obtained is skewed because the observable is computed near its maximum value and the random variation can only reduce this value.
The significance of the violation of the Bell inequality can be defined as 52) and the p-value p refers to the null hypothesis that the Bell inequality is not violated. The value of Z assigns a statistical significance to the separation between the distribution we obtain for the values of I 3 and the value 2, above which the Bell identity is violated. Values of the significance larger than 5 requires a very large number of pseudo experiments to be performed in order to find the actual value. For this reason, when this is the case, we do an extrapolation and quote a lower bound.

Di-boson production in Higgs boson decays
with V ∈ {W, Z}, and V * regarded as an off-shell vector boson. In the following, we treat the latter as an on-shell particle characterized by a mass reduced by a factor 0 < f < 1 with respect to the original mass M V . The amplitude of the Higgs boson decay (3.1) is given by where g is the weak coupling, ξ W = 1, and ξ Z = 1/(2c W ), with c W = cos θ W and θ W the Weinberg angle. From the amplitude in Eq. (3.3) we obtain for the on-shell and off-shell boson, respectively.
Following the procedure explained in Section 2.3 for a CM energy √ s = m H , we obtain the coefficients f a , g a , and h ab (a, b ∈ {1, . . . , 8}) reported below. There is no dependence on the scattering angle Θ because we are considering the decay of the Higgs boson at rest.
The non-vanishing f a elements are and we find g a = f a for a ∈ {1, . . . , 8}. The non-vanishing h ab elements are The unpolarized squared amplitude |M H | 2 of the process instead reads The main theoretical uncertainty affecting the correlation coefficients in Eq. (3.6) is due to higher order corrections to the tree-level values. To estimate the size of these contributions, we take as guidance the results in [26]-in which the NLO EW corrections have been computed. According to these results, we expect the error induced by these missing corrections yields at most a few percent of uncertainty on the main entanglement observables, in the relevant kinematic regions in which one of the two EW gauge boson are on-shell [26]. This expectation is based on the fact that these corrections give a 1-2% effect on the total width [26].
We then compute through Eq. (2.37) the polarization density matrix ρ H for the two vector bosons emitted in the decay of the Higgs boson with the condition Tr [ρ H ] = 1 following from the relation 4(h 33 + h 44 ) = 1. We remark that although some f a and g a are non vanishing, the dependence of ρ H on these quantities cancels in the final expression. Furthermore, due to the following identity among the correlation coefficients the above polarization density matrix is idempotent signaling that the final V V * state is a pure state. The density matrix in Eq. (3.8) can then be written [12] and κ = 1 corresponding to the production of two gauge bosons at rest. Because the di-boson system is described by a pure state, we can measure its entanglement through the entropy of entanglement defined in Eq. (2.8). This quantity is plotted in Fig. 3 as a function of the of the mass of virtual W or Z boson and reaches the theoretical maximum at the kinematic threshold, signaling a maximally entangled state. The dependence of the polarization entanglement on the mass of the virtual state is due the contribution of the longitudinal polarization, the coefficient κ in Eq. The maximization of the I 3 observable, which depends in this case only on the M * V mass, is obtained through the rotation   and H → ZZ * decays, respectively. For the W W channel, we find while for the ZZ channel The matrices in Eq. (3.15) and Eq. (3.16) are given in terms of rational numbers which approximate the corresponding real values with a 1% precision. The unitary condition is satisfied barring O(10 −2 ) factors. These matrices cannot be directly compared with the similar expressions given in [12] because of the different assumptions in the utilized optimization procedure. The C 2 observable admits here a simple analytical expression (3.17) The plots on the right-hand side in Figs. 4 and 5 nicely show that the value of C 2 decreases as the pure state in Eq. (3.8) becomes less and less entangled, for decreasing values of M * V .

Events and sensitivity
In order to evaluate the sensitivity of current experiments to the observables I 3 and C 2 , we estimate the number of suitable events available. These are given in Table 1 for the run2 at the LHC. The cross sections for p p → H → W + ℓ −ν ℓ and p p → H → Zℓ + ℓ − utilized in the estimates are computed with MADGRAPH5 [42] at the LO and then corrected by the κ-factor given at the N3LO+N3LL [43].
Even the definition of the rest frame of one decaying W -boson introduces an essential uncertainty because of the ambiguity in the reconstruction of the longitudinal momentum of the neutrino and the possibility of misidentifications (and other errors) in the identification of the missing momentum.
There is no such a problem in the case of the Z-boson decay which may though suffer of other generic inefficiencies. In the case of the Higgs boson decay into two W -bosons, the problem is exacerbated: the full reconstruction is not possible even in principle because there are more variables than constraints (since one of the masses of the gauge bosons is necessarily off-shell and the missing transverse momentum includes both neutrinos).
The problem of actually estimating the size of these uncertainties (for a given choice of an algorithm for the neutrino momenta reconstruction) is the central problem of any physical analysis from the actual or simulated data of the process and cannot be resolved here.
Hi-Lumi (L = 3 ab −1 ) 8.0 × 10 4 589 We take into account the problem of these irreducible uncertainties in the evaluation of the operators in the decays of the W W * by introducing a systematic error that mimics the significant uncertainty in the reconstruction of the neutrino momenta, which has to come from a dedicated algorithm (see, for instance, [37,38,40,41]). Since the uncertainty would be dominated by the error in the reconstruction of the two neutrino momenta, it is better to consider the semi-leptonic decay H → jjℓν ℓ and use the momentum from the s-quark jet (s-jet)-identified via the c-tagging of the companion jet-to measure the polarization of one of the two W -bosons. It has been shown that the efficiency of the jet tagging and the decreased uncertainty in the single neutrino momentum may improve the polarization reconstruction [44]. In this case, we take a conservative benchmark value of 0.3 for the systematic error in the single neutrino momentum and an overall efficiency of 40% in the c-jet tagging and the identification of the momentum associated to the s-jet that carries (by the same degree as the charged lepton in leptonic decays) the polarization of the W .
In addition we include an efficiency factor of 70% in the identification of each charged lepton [45]. The irreducible background for the H → W + ℓ −ν ℓ signal comes from the continuum electroweak production of W + W − pairs. It can be reduced by considering the characteristic distribution of the kinematical variables to a manageable size [44,46]. In addition, one has to remove the reducible background events from tt and W t production. The irreducible background for the H → Zℓ + ℓ − Figure 7: Distribution of the events at the LHC Hi-Lumi for the H → W + ℓ −ν ℓ and H → Zℓ + ℓ − processes. The set of events for W W * has mean value I 3 = 2.5, that for ZZ * has mean value I 3 = 2.9. The threshold value of 2 for Bell inequality violation is shown as a dashed red line.
signal is rather small and dominated by the electroweak process pp → ZZ/Zγ → 4ℓ, which is about 4 times smaller at the Higgs peak [47]. We neglect all backgrounds in our assessment of the significance even though they will have to be included in the actual analysis from the data and will affect the uncertainty.
To show the impact of possible irreducible backgrounds we show in Fig. 8 the values for I 3 and C 2 found as a generic factorizable density matrix is added with weight (1 − α) to that of the H → W W * process, which is accordingly multiplied by a factor α (with α between 0 and 1). For the case shown in Fig. 8, for values of α smaller than 0.7, the uncertainty substantially reduce the possibility of assessing the Bell inequality violation. We run 1000 pseudo experiments as we vary the invariant mass of the off-shell gauge boson around the mean value with a dispersion given by the (statistical and systematic) uncertainty as discussed above, and compute the observable I 3 . Fig. 6 and Fig. 7 show the distributions which are obtained for, respectively, LHC run 2 and Hi-Lumi. The distributions are skewed because the observable is computed near its maximum value and the random variation can only reduce this value. Fig. 6 shows that, at the LHC run 2, the significance for rejecting the null hypothesis I 3 ≤ 2 is 1 for the W W * case and 1.3 for the ZZ * case. Fig. 7 shows that, at the LHC Hi-Lumi, the significance for rejecting the null hypothesis I 3 ≤ 2 remains 1 for the W W * case, since the uncertainty is dominated by the statistical error, while it reaches 5.6 for the ZZ * case. These significances are likely to decrease in a more complete analysis based on a full simulation because of the reconstruction from the final lepton angular distributions and the systematic uncertainties of the unfolding, which is particular severe for the W + W − case due to the presence of neutrinos and background events.
Our results confirm the numerical simulations presented in [5] for the H → W W * process and in [12] for the H → ZZ * case. These works estimate the uncertainties from a parton-level reconstruction of the final lepton angular distributions. Yet, a fully realistic estimate of the uncertainty is still missing as uncertainties due to detector unfolding and background have not been modeled.

Di-boson production via quark fusion
Final W W , ZZ, and W Z states can be produced via electroweak processes in a continuous range of di-boson invariant masses. We show in the following how the polarization density matrix of the di-boson system can be computed starting from the density matrices obtained for the involved parton contributions, presented in Fig. 9 for the processes at hand. For the sake of simplicity, when possible we leave implicit the dependence of the correlation coefficients h ab (m VV , Θ), g a (m VV , Θ) and f a (m VV , Θ) on the scattering angle Θ in the CM frame and on the invariant mass of the bosons m VV .
The polarization density matrix ρ for two bosons produced in proton collisions is given by the convex combination of the density matrices of the involved parton contributions. Given initial state quarks q 1 andq 2 , we compute through Eq. (2.37) the polarization density matrix ρ q 1q2 of the parton contribution, from the scattering amplitude of the process where the sum runs over all the allowed initial states, including both the configurations where the anti-quark originates from either proton. 5 The coefficients w q 1q2 , satisfying {q 1 ,q 2 } w q 1q2 = 1, are given by and depend on the unpolarized squared amplitude of the parton process, |M q 1 ,q 2 V 1 V 2 | 2 , as well as on the parton luminosity of the initial q 1q2 state In the formula above q j (x) is the parton distribution function (PDF) of the parton j and τ = m VV / √ s. We utilized the numerical values provided by the recent (PDF4LHC21) release [48] for √ s = 13 TeV and factorization scale m VV (see Fig. 10).  where we introduced the abbreviations A qq = |M qq W W | 2 andh ab = A qq h ab . Similar results hold for the remaining correlation coefficients f a and g a , a ∈ {1, . . . , 8}. The explicit expressions ofh ab ,f a = A qq f a , g a = A qq g a for all the analyzed processes are collected in C.
The main source of theoretical uncertainty in the determination of the correlation coefficients comes from higher order QCD corrections. Taking as a guidance the results in [25], we assume that the error induced by these missing corrections yields approximately a 10% uncertainty on the main entanglement observables in the relevant kinematic regions. The theoretical uncertainties coming from the PDFs and the top-quark mass are negligible: by comparing results obtained with two different set of PDFs, we estimate the related uncertainty to be of the order of per mille. This is due to the fact that only ratios of PDFs enter in Eq. (4.5) for the h ab coefficients, and analogously for the g a and f a ones, and therefore most of the PDF uncertainty cancels out. This is of the same order as the uncertainty due to the top-quark mass, obtained by varying the parameter around its experimental value at most by two standard deviations.
In the following, we present our results for the entanglement observables for the W W , W Z and ZZ cases separately.
Our results differ from those presented in [15], obtained through a parton-level numerical simulation. In particular, we find substantially lower values for the observable I 3 in the W + Z process, larger for the W + W − and ZZ, and a general reduction of all C 2 values. Since the results in [15] come without an estimate of the uncertainty, the comparison is not straightforward; dedicated work is needed to fully understand the origin of these discrepancies. 6 4.1 p p → W + W − The tree-level Feynman diagrams contributing to the process at the parton level are shown in the top part of Fig. 9. The polarization vectors of W + and W − are ε µ (k 1 , λ 1 ) and ε ν (k 2 , λ 2 ), respectively. The polarized amplitude for the process in Eq. (4.6), for u andū initial states, is given by where the effective vertex Γ W W αβ is with s W = sin θ W and e being the unit of electric charge. The effective couplingsḡ q V,A are given bȳ where g q V = T q 3 − 2Q q s 2 W , g q A = T q 3 and T q 3 and Q q are the isospin and electric charge (in unit of e) of the quark q. The χ term in Eq. (4.9), which weights the contribution of the virtual Z channel, is real since we neglect the Z width contribution. The function V ανµ (k 1 , k 2 , k 3 ) is the Feynman rule of the for incoming momenta (k 1 + k 2 + k 3 = 0). The Mandelstam variables are defined as From the amplitude in Eq. (4.7), summing over the spin of quarks we obtain whereΓ µν = γ 0 (Γ µν ) † γ 0 and the projector P µν λλ ′ (k) is given in Eq. (2.17) with M = M W . The unpolarized square amplitude for the process uū → W + W − is given by where N c = 3, c Θ = cos Θ, β W = 1 − 4M 2 W /m 2 WW , m WW is the invariant mass of the W pair and we chose Θ as the angle between the anti-quark and W + momenta in the CM frame. Our convention for the polarization density matrix is that the W + momentum defines thek unit vector of the basis in Eq. (2.26).
The result for the dd → W + W − process follows from Eqs. (4.12)-(4.13) through the substitutions with the angle Θ being defined as before by the anti-quark and W + momenta. The contribution of strange quark initial states equals that of d quarks in the considered massless limit. The Eq. (4.12) (together with the corresponding ones for dd and ss processes) makes it possible to compute the unnormalized correlation coefficientsf a ,g a , andh ab of the density matrix for the process at hand (given in Appendix C) and consequently, the value of the operators I 3 and C 2 . As explained in Section 2.1, for the observable I 3 we find at each point in the kinematic space the unitary matrices U and V that maximize the violation of Bell inequalities.
The results obtained for the two observables of interest are shown in Fig. 11, as functions of the kinematic variables. We observe that the violation of the Bell inequalities takes place only in a limited range of the kinematic variables. The bin in which I 3 > 2 is indicated by the hatched area in first plot of Fig. 11. The matrices maximizing the Bell observable are given by with a precision of 1% with respect to the numerical solutions we found. Accordingly, unitarity is satisfied barring O(10 −2 ) terms. These expressions might be useful in a future simulation of the process. The observable C 2 follows roughly the pattern of I 3 and reaches the largest values in the upper-left quadrant, thus witnessing the presence of states more entangled than in the rest of the kinematic space. This feature can be made manifest by considering the density matrix of the process. For instance, at m W W = 900 GeV and cos Θ = 0, the polarization density matrix for the W + W − states can be approximated up to terms O(10 −3 ) by the following combination of pure state density matrices with decreasing weights: α ≃ 0.72, β ≃ 0.18, γ ≃ 0.07 and δ ≃ 0.02; the normalization condition α + β + γ + δ = 1 is satisfied within the adopted approximation. The involved pure states are where |a b⟩ = |a⟩ ⊗ |b⟩ with a, b ∈ {+, 0, −} are the polarization states of the two W gauge bosons at rest in the single spin-1 basis. Though the dominant contribution in (4.17) comes from the entangled pure state |Ψ +− ⟩-a result that justifies the high value of C 2 -the actual density matrix ρ describes a mixture, even more so if the discarded O(10 −3 ) terms were included. This feature explains why the corresponding value of C 2 , in this corner of the kinematic space, is large but far from maximal.

Events and sensitivity
Having identified the best region to test the data, we estimate the corresponding number of events expected at the LHC. This is given in Table 2, where the cross sections needed for the estimates were computed with MADGRAPH5 [42] at the LO and then correcting by the κ-factor given at the NNLO [50]. This is a good approximation, since there is little variation in the k-factors in the range of W W , ZZ, W Z invariant masses between 200 and 800 GeV [50]-which is the one we consider. We reduce the number of events thus found by the efficiency in the identification of the final leptons-which we take conservatively to be 70% for each lepton [45]. We consider semi-leptonic decays of the W and proceed as explained in Section 3. Though there are irreducible background events from the H → W + W − decay, they are few in the cos Θ ≤ 0.25 bin where the observable is to be estimated. Events of the reducible background, coming from tt and W t production, must be selected out.
We run 10 4 pseudo experiments as we vary the invariant mass and the scattering angle around the mean value with a dispersion given by the (statistical and systematic) uncertainty as discussed in the previous Section, and compute the observable I 3 . Fig. 12 shows the distribution which is obtained for LHC run2 and Hi-Lumi. The distributions are skewed because the observable is computed near its maximum value and the random variation can only reduce this value. We find that run2 yields a significance of 0.8 for rejecting the null hypothesis I 3 ≤ 2 (see Fig. 12) which remains about the same at Hi-Lumi because the uncertainty is dominated by the systematic error. Fig. 13 shows how the distribution of the pseudo-experiment changes as the systematic error is decreased to 0.1. In the latter case, the significance grows and reaches the value 6. Not surprisingly, the better the reconstruction of the neutrino momenta, the higher the significance of the violation of Bell inequality.
The significances we quote are bound to decrease in a full simulation because of other systematic uncertainties and the smearing of the events in the detector.

p p → ZZ
The tree-level Feynman diagrams contributing to the process q(p 1 )q(p 2 ) → Z(k 1 , λ 1 )Z(k 2 , λ 2 ) , (4.19) at the parton level are shown in the middle row of Fig. 9. We indicate the polarization vectors of the two Z bosons with ε µ (k 1 , λ 1 ) and ε ν (k 2 , λ 2 ). The polarized amplitude for the process in Eq. (4.19) is given by where The Mandelstam variables u and t are defined as and with the g q V,A couplings defined as in Eq. (4.9). Summing over the quark polarizations and colors we then obtain where P µν λλ ′ (k) is given in Eq. (2.17) with M = M Z . The corresponding unpolarized square amplitude is then obtained by summing over the polarizations of the two Z bosons where with β Z = 1 − 4M 2 Z /m 2 ZZ . The angle Θ is here defined as the angle between the anti-quark momentum and k 1 in the CM frame. The orientation of the latter coincides with that of thek unit vector of the basis in Eq. (2.26).
The Eq. (4.24) makes it possible through Eq. (2.38) to compute the unnormalized correlation coefficientsf a ,g a , andh ab (given in Appendix C) of the density matrix for the process at hand and consequently, the value of the operators I 3 and C 2 .
In Fig. 14 we present our results for the entanglement observables. The violation of the Bell inequalities takes place only in a limited range of the kinematic variables. The bin in which I 3 > 2 is shown as a hatched area in the left panel.
The observable C 2 follows the pattern of I 3 -as it does in the case of the W + W − final states-and reaches the largest values in the upper-left quadrant. In this region it witnesses the presence of states more entangled than in the rest of the kinematic space.

Events and sensitivity
The number of expected events at the LHC is given in Table 3. As before, the relevant cross sections were computed with MADGRAPH5 [42] at the LO and then corrected by the κ-factor given at the NNLO [50]. We reduce the number of events thus found by the efficiency in the identification of the final leptons-which we take conservatively to be 70% for each of the identified leptons [45]. We consider semi-leptonic decays of the W and proceed as explained in Section 3.
Though there are irreducible background events from the H → ZZ decay, they are negligible in the kinematic bin where the observables are to be estimated.  Table 3: Number of expected events in the kinematic region m ZZ > 500 GeV and cos Θ < 0.25 at the LHC with √ s = 13 TeV and luminosities L =140 fb −1 , run2, and L = 3 ab −1 for Hi-lumi. A benchmark efficiency of 70% is assumed in the identification of each charged lepton.
We run 10 4 pseudo experiments as we vary the invariant mass and the scattering angle around the mean value with a dispersion given by the (statistical and systematic) uncertainty as discussed in the previous Section, and compute the observable I 3 . Fig. 15 shows the distribution which is obtained for LHC run2 and Hi-Lumi. The distributions are skewed because the observable is computed near its maximum value and the random variation can only reduce this value. We find that run2 yields an average value of I 3 ≤ 2 that is below the threshold for Bell violation. At Hi-Lumi the significance for rejecting the null hypothesis I 3 ≤ 2 (see Fig. 12) is more than 2.
The significance we quote is bound to decrease in a full simulation because of the reconstruction from the final lepton angular distributions and the systematic uncertainties of the unfolding.

p p → W Z
Let us consider the tree-level Feynman diagrams contributing to the process at the partonic level, shown in the last row of Fig. 9. We indicate the polarization vectors of the W + and Z with ε µ (k 1 , λ 1 ) and ε ν (k 2 , λ 2 ), respectively. The polarized amplitude of the process is with Mandelstam variables s = (p 1 + p 2 ) 2 , t = (k 1 − p 1 ) 2 , u = (k 1 − p 2 ) 2 , the vertex V q µ (q ∈ {u, d}) being defined in Eq. (4.23) and the vertex function V ναµ defined in Eq. (4.10). We neglected up-strange quark transitions by setting cos θ c = 1, with θ c the Cabibbo angle.
Summing over the internal degrees of freedom of the initial state quarks gives  The following expressions approximate the quantity above in the limit The angle Θ is here implied by the momenta of anti-down quark and W in the CM frame. As before, our convention for the polarization density matrix for the W Z production is that the momentum of W is alongk, cf. Eq. (2.26). Analogous results hold for the process p p → W − Z initiated by theūd quarks.
We compute the unnormalized correlation coefficientsf a ,g a , andh ab (given explicitly in Appendix C) of the density matrix by using Eqs. (4.30)-(2.38). Fig. 16 shows the values obtained for the observables I 3 (left panel) and C 2 (right panel) for the process p p → W Z. By inspection, the observable I 3 is less than 2 regardless of the value of the kinematic variables. The final states are less entangled than in the case of the weak gauge boson pairs and the observable C 2 presents low values everywhere.

Lepton colliders
We consider now the charged di-boson production at e + e − and muon colliders, proceeding from the process where ℓ ∈ {e, µ}. We neglect the contribution of an intermediate Higgs boson regarding the leptons as massless. The analytical results for the amplitude and the polarization density matrix coefficients can be obtained from those given in Section 4.1 and Appendix C through the replacementsḡ d V,A →ḡ ℓ V,A . Because the initial state is unique, the total density matrix comprises only one contribution. For the correlation coefficients h ab , f a , g a we then find where the scattering angle Θ is defined as the angle between the anti-lepton and W + momenta.
The results for the entanglement observables are shown in Fig. 17. The violation of the Bell inequalities takes place in a range of the kinematic variables broader than in the LHC case and it is larger. The theoretical uncertainty of the result is negligible. The same results for the ZZ di-bosons are shown in Fig. 18. The violation of the Bell inequalities in this case takes place in a range of the kinematic variables more or less equivalent to that at the LHC In this instance, as for the W Z process, the process is generated by only one kind of diagram (see bottom diagrams of Fig. 9) and the PDF dependence exactly cancels out in the h ab coefficients in Eq. (4.5), as well as in the f a , g a ones. This PDF factorization in the density matrix for the W Z production at the LHC takes always place at the lepton colliders, where no dependence on the PDF appears.

Events and sensitivity
The bin in which I 3 > 2 for lepton colliders is shown as a hatched area in Fig. 17. Having identified the best region to confront the data, we can estimate the number of events expected at a muon collider working at an energy of √ s = 1 TeV and at the future circular collider (FCC) working at an energy of √ s = 368 GeV. These numbers are given in Table 4, where the relevant cross sections were computed with MADGRAPH5 [42] at the LO. We reduce the number of events thus found by the efficiency in the identification of the final leptons-which we take conservatively to be 70% per lepton as we did for the LHC. We consider semi-leptonic decays of the W and proceed as explained in Section 3.
It is premature to discuss any background-except for stressing that at √ s = 1 TeV the leptons initiated production is 10 times that of vector boson fusion (see, for example, [51]).
In the case of W W di-bosons, both the future muon collider and the FCC can provide a significance equal to 2 for rejecting the null hypothesis I 3 ≤ 2 (see Fig. 19). In the case of ZZ di-bosons, the future muon collider can provide a significance equal to 2 for rejecting the null hypothesis I 3 ≤ 2, the FCC-which is expected to produce many more events-a significance of more than 4 (see Fig. 20).
A more realistic estimate of these numbers can only be provided by a full numerical simulation. In particular, the systematic uncertainties of the unfolding because of the presence of the neutrinos, and the background events may further decrease these significances.

Summary
We have computed the value of two observables-C 2 and I 3 , linked respectively to quantum entanglement and violation of Bell inequalities-in processes yielding two weak interaction gauge bosons in the final state. These particles, being spin-1 and massive, are qutrit states and, as such, more complicated to treat than the more ordinary qubit states implemented with fermions and photons.
We find that the most promising processes for testing Bell inequalities and the presence of entan-glement are, by far, those in which the final gauge bosons result from the decay of a Higgs boson. In this case the null hypothesis that the Bell inequalities be satisfied can be excluded using the data of Hi-Lumi at the LHC with a significance of 6 for a ZZ final state. The systematic uncertainty in the reconstruction of the neutrino momenta in the case W W final states makes it very hard to reach a satisfactory significance in this channel. We hope that these provisional results encourage the experimental collaborations to estimate the actual significance in a full simulation. In our opinion this is where Bell inequalities violation stands the best chance of being observed at energies around the weak scale.
The same observables can also be measured in the di-boson production initiated by electroweak quark fusion, reminiscent of the Drell-Yan processes. In this case, the invariant mass required to achieve significant values of the observables is rather large and only few events are expected at the LHC. These processes will become more competitive at future lepton colliders, with both the FCC and the muon collider reaching a significance of about 2 in testing the violation of Bell inequalities with W W di-bosons. A better result is expected in the case of ZZ di-bosons, in particular at the FCC.

A Spin and Gell-Mann matrices
The spin-1 representation of the three SU (2) generators S i , i ∈ {1, 2, 3}, used throughout the text is These can be expressed as a function of the Gell-Mann matrices T a as and the matrices S ij in Eq. (2.19) are: The functions q n ± and p n ± and the matrix a n m In this Appendix we follow [15]. The q n ± functions introduced in Section 2.4 are given by the following expressions in terms of the spherical coordinates of the two decaying particle rest frames.
The non-vanishing elementsh uū ab (h uū ba =h uū ab ) are given bỹ The non-vanishing elementsf uū a are given bỹ The elementsg uū a are identical:g uū a =f uū a .
The elementsh dd ab ,f dd a ,g dd a can be obtained from the following transformations where in this case the angle Θ is the angle between the antidown quarkd and the W + momenta. The effective couplingsḡ u,d V,A are defined in Eq. (4.9).

C.2 Polarization density matrix for qq → ZZ
We write below the coefficients A qq [Θ, m VV ],f qq a [Θ, m VV ],g qq a [Θ, m VV ], andh qq ab [Θ, m VV ], appearing in the polarization density matrix for qq → ZZ. The angle Θ is the scattering angle in the CM frame from the anti-quark and one of the Z momenta. Our convention is that the Z is in this case the one with momentum parallel to thê k unit vector of the spin right-handed basis in Eq. (2.26). Results below will be given for a generic quark q.
where the expression for the unpolarized square amplitude |M qq ZZ | 2 is given in Eq. (4.25). Throughout the following expressions we use c Θ ≡ cos Θ, s Θ ≡ sin Θ and D ZZ , f ZZ which are given in Eq. (4.26).
The non-vanishing elementsh qq ab (h qq ba =h qq ab ), are given bỹ The non-vanishing elementsf qq a are given bỹ The elementsg qq a are identical:g qq a =f qq a .

C.3 Polarization density matrix for ud → W + Z
We write below the expressions for the coefficients A ud [Θ, m VV ],f ud a [Θ, m VV ],g ud a [Θ, m VV ], andh ud ab [Θ, m VV ], appearing in the polarization density matrix for ud → W + Z in the limit M W = M Z = M V , where m VV is the invariant mass of W Z system in this approximation. The angle Θ is the scattering angle in the CM frame from the anti-down quark and W + momenta. Our convention for the polarization matrix is that the momentum of W + is chosen parallel to thek unit vector of the spin right-handed basis in Eq. (2.26).
where the expression for the unpolarized square amplitude |M ud W Z | 2 is given in Eq. (4.31). Throughout the following expressions we use c Θ ≡ cos Θ, s Θ ≡ sin Θ and D WZ , f WZ which are given in Eq. (4.32).
The non-vanishing elementsh ud ab (h ud ba =h ud ab ) are given by The non-vanishing elementsf ud a are given bỹ The elementsg ud a are identical:g ud a =f ud a .