1 Introduction

Entanglement is considered a quintessentially quantum property, one with no classical analogue. Schrödinger described it as “the characteristic trait of quantum mechanics” [1]. Yet despite its central role in our understanding of quantum physics and its many applications in quantum information science, the fundamental nature of entanglement remains as mysterious as when it was first conceived.

Mathematically, entanglement may be defined as the property of nonseparability for vectors in (or operators on) a tensor product of Hilbert spaces. When combined with the Born rule, this property entails the many observational consequences of quantum entanglement, but as a mathematical property alone it is by no means limited to quantum systems. This trivial observation has given rise to the notion of classical entanglement, wherein certain classical systems (e.g., the modes of a vibrating drum) may also be described as nonseparable under a suitable identification of a product Hilbert space [2,3,4].

It is important to recognize that mere nonseparability in a classical system does not entail the many curious observational consequences of true quantum entanglement. The physical significance of entanglement lies in the unique statistical characteristics of entangled systems and the nonlocal effects they seem to imply. This behavior has been demonstrated most strikingly in a series of experiments considered to be free of all reasonable loopholes that might permit a local realist interpretation [5,6,7].

In this paper, we consider an interesting relationship between quantum entanglement and classical statistics that appears, until now, to have gone unnoticed. Inspiration is taken from the notion that many quantum effects may be reproduced by replacing the virtual zero-point field of quantum electrodynamics with one that is real and stochastic [8]. This approach has been used extensively as a method for classical modeling of certain quantum systems [9]. For example, the relationship to entanglement was studied by deriving a Wigner function representation of spontaneous parametric downconversion through a detailed physical modeling of nonlinear optical processes using a classical zero-point field [10, 11].

This result may seem surprising since squeezed vacuum states do not admit a positive P representation, even though they have a positive Wigner function. From a mathematical perspective, this is simply a consequence of the optical equivalence theorem, which implies the equivalence of a Gaussian quantum state and a corresponding classical Gaussian random vector [12]. The present work generalizes prior research to arbitrary multi-modal squeezed states arising from symmetric squeezing matrices and examines the relationship to improper complex Gaussian random variables. In addition, the behavior under a deterministic model of photon detection is considered [13,14,15,16], which is an approach the previous work had not considered. Although Gaussian states may in some respects be deemed classical, the introduction of a nonlinear measurement scheme, such as we consider, when combined with post-selection can give rise to contextuality and, hence, quantum-like behavior such as violations of the Bell–CHSH inequality [15, 17, 18].

2 Multi-mode squeezing

Let \(\varvec{\mathsf {\xi }}\) be a \(d \times d\) symmetric matrix defining the quantum mechanical multi-mode squeezing operator

$$\begin{aligned} {\hat{S}} = \exp \left[ \frac{1}{2} \left( \hat{\varvec{a}}^\dagger \right) ^{\mathsf {T}} \varvec{\mathsf {\xi }} \, \hat{\varvec{a}}^\dagger - \frac{1}{2} \hat{\varvec{a}}^{\mathsf {T}} \varvec{\mathsf {\xi }}^{\mathsf {H}} \hat{\varvec{a}} \right] \; , \end{aligned}$$
(1)

where \((\hat{\varvec{a}}^\dagger )^{\mathsf {T}} = [{\hat{a}}_1^\dagger , \ldots , {\hat{a}}_d^\dagger ]\) is a row vector of creation operators over d distinct modes and \(\varvec{\mathsf {\xi }}^{\mathsf {H}} = (\varvec{\mathsf {\xi }}^*)^{\mathsf {T}}\) is the Hermitian conjugate of the matrix \(\varvec{\mathsf {\xi }}\). The latter notation is used to distinguish the conjugate transpose of a matrix from the adjoint of an operator.

We may write \(\varvec{\mathsf {\xi }}\) in the general polar form \(\varvec{\mathsf {\xi }} = \varvec{\mathsf {R}} \varvec{\mathsf {Q}}\), where \(\varvec{\mathsf {R}}\) is positive semi-definite and \(\varvec{\mathsf {Q}}\) is unitary. Since \(\varvec{\mathsf {\xi }}\) is symmetric and, therefore, normal, \(\varvec{\mathsf {R}} = (\varvec{\mathsf {\xi }}\varvec{\mathsf {\xi }}^{\mathsf {H}})^{1/2}\). If, furthermore, \(\varvec{\mathsf {R}}\) is positive definite, then it is also invertible and we may take \(\varvec{\mathsf {Q}} = \varvec{\mathsf {R}}^{-1} \varvec{\mathsf {\xi }}\). In the degenerate case \(\varvec{\mathsf {R}} = \varvec{\mathsf {0}}\), we may simply take \(\varvec{\mathsf {Q}} = \varvec{\mathsf {I}}\) to be the identity. More generally, if \(\varvec{\mathsf {\xi }} = \varvec{\mathsf {U}} \varvec{\mathsf {D}} \varvec{\mathsf {V}}^{\mathsf {H}}\) is a singular value decomposition of \(\varvec{\mathsf {\xi }}\), where \(\varvec{\mathsf {U}}\) and \(\varvec{\mathsf {V}}\) are unitary and \(\varvec{\mathsf {D}}\) is diagonal and positive semi-definite, then \(\varvec{\mathsf {R}} = \varvec{\mathsf {U}} \varvec{\mathsf {D}} \varvec{\mathsf {U}}^{\mathsf {H}}\) and \(\varvec{\mathsf {Q}} = \varvec{\mathsf {U}} \varvec{\mathsf {V}}^{\mathsf {H}}\). Using the polar decomposition \(\varvec{\mathsf {\xi }} = \varvec{\mathsf {RQ}}\), we may now write the corresponding Bogoliubov transformation of \(\hat{\varvec{a}}\), denoted \(\hat{\varvec{b}} = {\hat{S}}^\dagger \hat{\varvec{a}} {\hat{S}}\), as follows [19]:

$$\begin{aligned} \hat{\varvec{b}} = (\cosh \varvec{\mathsf {R}}) \, \hat{\varvec{a}} + (\sinh \varvec{\mathsf {R}}) \, \varvec{\mathsf {Q}} \, \hat{\varvec{a}}^\dagger \; . \end{aligned}$$
(2)
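As a numerical aside, the polar factors entering Eq. (2) can be obtained from a singular value decomposition, as described above. The following is a minimal sketch in Python with NumPy, not part of the original analysis; the function names `polar_from_svd` and `bogoliubov_coefficients` and the random test matrix are illustrative assumptions.

```python
import numpy as np

def polar_from_svd(xi):
    """Return (R, Q) with xi = R @ Q, R positive semi-definite, Q unitary."""
    U, d, Vh = np.linalg.svd(xi)          # xi = U @ diag(d) @ Vh
    R = U @ np.diag(d) @ U.conj().T       # R = U D U^H
    Q = U @ Vh                            # Q = U V^H
    return R, Q

def bogoliubov_coefficients(R, Q):
    """Return (cosh R, sinh(R) Q), the matrices multiplying a and a^dagger in Eq. (2)."""
    w, V = np.linalg.eigh(R)              # R is Hermitian, so eigh applies
    cosh_R = (V * np.cosh(w)) @ V.conj().T
    sinh_R = (V * np.sinh(w)) @ V.conj().T
    return cosh_R, sinh_R @ Q

# Hypothetical input: a random complex symmetric squeezing matrix.
rng = np.random.default_rng(0)
m = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
xi = 0.2 * (m + m.T)
R, Q = polar_from_svd(xi)
A, B = bogoliubov_coefficients(R, Q)
```

In this sketch, `A @ A.conj().T - B @ B.conj().T` should reproduce the identity, which is the usual condition for the transformation to preserve the commutation relations.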

A classical analogue will now be considered by replacing \(\hat{\varvec{a}}\) with the random vector \(\varvec{a} = \sigma \varvec{z}\), where \(\varvec{z}\) is a d-dimensional standard complex Gaussian random vector representing the d distinct vacuum modes and \(\sigma ^2\hbar \omega \) is the modal energy. Specifically, \(\varvec{z}\) is a complex Gaussian random vector such that \({\mathsf {E}}[\varvec{z}] = \varvec{0}\), \({\mathsf {E}}[\varvec{z}\varvec{z}^{\mathsf {H}}] = \varvec{\mathsf {I}}\), and \({\mathsf {E}}[\varvec{z}\varvec{z}^{\mathsf {T}}] = \varvec{\mathsf {0}}\), where \({\mathsf {E}}[\cdot ]\) denotes the expectation value. Note that \(\sigma = 1/\sqrt{2}\) corresponds to the (pure) vacuum state, while larger values of \(\sigma \) correspond to a (mixed) thermal state. The analogue of the squeezed state \(\hat{\varvec{b}}\) is then the random vector \(\varvec{b}\) defined by

$$\begin{aligned} \varvec{b} = (\cosh \varvec{\mathsf {R}}) \, \varvec{a} + (\sinh \varvec{\mathsf {R}}) \, \varvec{\mathsf {Q}} \, \varvec{a}^* \; . \end{aligned}$$
(3)

Transformations of this form appear in classical nonlinear mixing [20], so we may also view this as a classical model arising from nonlinear optics and reified (i.e., real, not virtual) vacuum modes. Our fundamental hypothesis is that \(\varvec{b}\) provides an accurate statistical representation of \(\hat{\varvec{b}}\) when applied to vacuum or thermal states.

Since \(\varvec{b}\) is a linear combination of complex Gaussian random variables, it, too, is a complex Gaussian random vector. As such, it is defined by a mean value, a covariance matrix, and a pseudo-covariance matrix. The mean is clearly zero, and the covariance is given by

$$\begin{aligned} \varvec{\mathsf {\Gamma }}= {\mathsf {E}}\left[ \varvec{b} \varvec{b}^{\mathsf {H}}\right] = \sigma ^2 \cosh (2\varvec{\mathsf {R}}) \; , \end{aligned}$$
(4)

which is positive semi-definite. Unlike \(\varvec{a}\), however, \(\varvec{b}\) is not generally a proper random vector since the pseudo-covariance,

$$\begin{aligned} \varvec{\mathsf {C}}&= {\mathsf {E}}\left[ \varvec{b} \varvec{b}^{\mathsf {T}}\right] = \sigma ^2 \left[ (\cosh \varvec{\mathsf {R}}) \varvec{\mathsf {Q}}^{\mathsf {T}} \sinh \varvec{\mathsf {R}}^{\mathsf {T}} + (\sinh \varvec{\mathsf {R}}) \varvec{\mathsf {Q}} \cosh \varvec{\mathsf {R}}^{\mathsf {T}} \right] \; , \end{aligned}$$
(5)

is not necessarily zero. In such cases, \(\varvec{b}\) is said to be an improper complex Gaussian random vector [21, 22].
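The second moments in Eqs. (4) and (5) can be checked by direct sampling. The sketch below, in Python with NumPy, draws realizations of \(\varvec{b}\) from Eq. (3) and estimates the covariance and pseudo-covariance empirically. The function names, sample size, and seed are illustrative assumptions, and the example matrices (\(\varvec{\mathsf {R}}\) proportional to the identity and a signed permutation for \(\varvec{\mathsf {Q}}\)) are chosen only for convenience.

```python
import numpy as np

def squeezed_samples(R, Q, n, sigma2=0.5, seed=0):
    """Draw n realizations of b = cosh(R) a + sinh(R) Q a*, one per column."""
    rng = np.random.default_rng(seed)
    d = R.shape[0]
    # Standard proper complex Gaussian z: E[z z^H] = I, E[z z^T] = 0.
    z = (rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n))) / np.sqrt(2)
    a = np.sqrt(sigma2) * z
    w, V = np.linalg.eigh(R)                      # matrix functions of Hermitian R
    cosh_R = (V * np.cosh(w)) @ V.conj().T
    sinh_R = (V * np.sinh(w)) @ V.conj().T
    return cosh_R @ a + sinh_R @ Q @ np.conj(a)

def second_moments(b):
    """Empirical covariance E[b b^H] and pseudo-covariance E[b b^T]."""
    n = b.shape[1]
    return b @ b.conj().T / n, b @ b.T / n

# Illustrative choice: R = 0.5 I and a real, symmetric, unitary Q.
Q = np.array([[0, 0, 0, 1],
              [0, 0, -1, 0],
              [0, -1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
R = 0.5 * np.eye(4)
gamma_hat, c_hat = second_moments(squeezed_samples(R, Q, n=200_000))
```

For this choice the estimates should approach \(\sigma ^2 \cosh (1)\, \varvec{\mathsf {I}}\) and \(\sigma ^2 \sinh (1)\, \varvec{\mathsf {Q}}\), up to sampling error.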

Improper Gaussian random vectors arise in several signal and image processing applications. However, to date, they have received little attention within the physics community in relation to quantum optics and entanglement. Impropriety, a measure of the degree to which a random vector is improper, may be interpreted as a correlation between the real and imaginary parts of \(\varvec{b}\). A popular measure of the degree of impropriety is the following:

$$\begin{aligned} {\mathcal {I}} = \frac{|\det \varvec{\mathsf {C}}|^2}{(\det \varvec{\mathsf {\Gamma }})^2} \; . \end{aligned}$$
(6)

This definition is equivalent to others that have been proposed for characterizing improper random vectors [23]. It can be shown that \(0 \le {\mathcal {I}} \le 1\) and, for proper random vectors, \({\mathcal {I}} = 0\). A random vector for which \({\mathcal {I}} = 1\) is considered maximally improper [24]. If \(\varvec{\mathsf {C}}\) is singular but nonzero, then \(\varvec{b}\) is improper but has zero impropriety. Note that the definition of impropriety may be applied to any complex random vector, whether it is Gaussian or not, provided the second moments are well defined.

As a statistical relationship between the real and imaginary parts of \(\varvec{b}\) (or, equivalently, \(\varvec{b}\) and \(\varvec{b}^*\)), we might expect impropriety to be fundamentally related to the commutation relations between the quadratures of \(\hat{\varvec{b}}\) (or, equivalently, \(\hat{\varvec{b}}\) and \(\hat{\varvec{b}}^\dagger \)). In quantum mechanics, this relationship is captured in such familiar quantities as the Mandel \(Q_M\) parameter, which compares the mean and variance of the (normally ordered) number operator, and the squeezing parameter \(S_\theta \), which measures the imbalance between the quadratures of a squeezed state [25,26,27]. For, say, a single squeezed vacuum mode, these nonclassicality parameters take on anomalous values precisely when the squeezing parameter r is nonzero. This, as we now show, can be tied directly to impropriety.

Let us consider the special case \(\varvec{\mathsf {R}} = r\varvec{\mathsf {I}}\), for \(r \ge 0\). In this case, the covariance matrix is \(\varvec{\mathsf {\Gamma }} = \sigma ^2 \cosh (2r) \, \varvec{\mathsf {I}}\), \(\varvec{\mathsf {Q}}\) is symmetric, and the pseudo-covariance matrix takes the simple form

$$\begin{aligned} \varvec{\mathsf {C}} = \sigma ^2 \sinh (2r) \, \varvec{\mathsf {Q}} \; . \end{aligned}$$
(7)

Since \(\varvec{\mathsf {Q}}\) is unitary, the impropriety is found to be

$$\begin{aligned} {\mathcal {I}}(r) = \tanh (2r)^{2d} \; . \end{aligned}$$
(8)

Note that \({\mathcal {I}}(r)\) grows monotonically with r and is independent of both \(\varvec{\mathsf {Q}}\) and \(\sigma \). As the squeezing parameter increases (i.e., as \(r \rightarrow \infty \)), the degree of impropriety converges to unity, so that \(\varvec{b}\) approaches a state of maximal impropriety.
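The closed form in Eq. (8) can be compared against a direct evaluation of Eq. (6). The following Python sketch does this for the special case \(\varvec{\mathsf {R}} = r\varvec{\mathsf {I}}\); the helper names and the particular symmetric unitary \(\varvec{\mathsf {Q}}\) (a phase times a swap matrix, with an arbitrary phase of 0.7) are assumptions made for illustration.

```python
import numpy as np

def impropriety(gamma, c):
    """Eq. (6): |det C|^2 / (det Gamma)^2."""
    return abs(np.linalg.det(c)) ** 2 / np.linalg.det(gamma).real ** 2

def special_case_moments(r, Q, sigma2=0.5):
    """Covariance and pseudo-covariance for R = r I (Eq. (7) for C)."""
    d = Q.shape[0]
    gamma = sigma2 * np.cosh(2 * r) * np.eye(d)
    c = sigma2 * np.sinh(2 * r) * Q
    return gamma, c

# Symmetric unitary Q for d = 2; the phase 0.7 is arbitrary.
Q = np.exp(0.7j) * np.array([[0, 1], [1, 0]], dtype=complex)
d = Q.shape[0]
values = {r: impropriety(*special_case_moments(r, Q)) for r in (0.0, 0.3, 1.0)}
closed_form = {r: np.tanh(2 * r) ** (2 * d) for r in values}
```

Changing `sigma2` or the phase of `Q` should leave `values` unchanged, in line with the independence noted above.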

For this special case, it can furthermore be shown that the probability density function for \(\varvec{b}\) is given by [28]

$$\begin{aligned} f(\varvec{\beta }) \propto \exp \left[ -\frac{1}{2} \begin{pmatrix} \varvec{\beta }^{\mathsf {H}}&\varvec{\beta }^{\mathsf {T}} \end{pmatrix} \begin{pmatrix} \varvec{\mathsf {\Gamma }} &{} \varvec{\mathsf {C}} \\ \varvec{\mathsf {C}}^* &{} \varvec{\mathsf {\Gamma }}^* \end{pmatrix}^{-1} \begin{pmatrix} \varvec{\beta } \\ \varvec{\beta }^* \end{pmatrix} \right] = \frac{\exp \left[ -\Vert (\cosh r) \, \varvec{\beta } - (\sinh r) \, \varvec{\mathsf {Q}} \, \varvec{\beta }^* \Vert ^2 / \sigma ^2 \right] }{(\pi \sigma ^2)^d} \; , \end{aligned}$$
(9)

where \(\varvec{\beta } \in {\mathbb {C}}^d\). This matches precisely the Wigner function \(W(\varvec{\beta },\varvec{\beta }^*)\) for \(\hat{\varvec{b}}\). This is of course unsurprising since the second moments \({\mathsf {E}}[b_i b_j^*]\) of \(\varvec{b}\) match the symmetrized expectations \({\langle { ({\hat{b}}_i{\hat{b}}_j^\dagger + {\hat{b}}_j^\dagger {\hat{b}}_i )/2 }\rangle }\) of \(\hat{\varvec{b}}\).

Clearly, \(f(\varvec{\beta })\) and \(W(\varvec{\beta },\varvec{\beta }^*)\) are nonnegative (indeed, Gaussian) and, in this sense, classical, even though the P function of \(\hat{\varvec{b}}\) may not be. Nevertheless, the squeezed quantum state may still exhibit entanglement (i.e., nonseparability). The Peres–Horodecki criterion, extended to continuous variables, can be used to determine separability [29]. For the special case of, say, a symmetric two-mode Gaussian state with squeezing matrix

$$\begin{aligned} \varvec{\mathsf {\xi }} = r e^{i\phi } \begin{pmatrix} 0 &{} \;\;1 \\ 1 &{} \;\;0 \end{pmatrix} \; , \end{aligned}$$
(10)

the Peres–Horodecki criterion requires \(\sigma ^2 e^{-2r} < 1/2\) for entanglement. For the vacuum state (\(\sigma ^2 = 1/2\)), we have entanglement for all \(r > 0\), while a general thermal state will be entangled only if \(r > (1/2) \log (2\sigma ^2)\). For a general two-mode Gaussian quantum state, whether squeezed or not, propriety in the corresponding random vector (i.e., \(\varvec{\mathsf {C}} = \varvec{\mathsf {0}}\)) implies that the quantum state is separable [30]. The converse, as we have seen, need not be true. Thus, entanglement may imply impropriety, as it does in this case, but impropriety alone does not entail that the state is entangled.
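The entanglement condition quoted above for the symmetric two-mode state of Eq. (10) reduces to a one-line test. The Python sketch below encodes the inequality \(\sigma ^2 e^{-2r} < 1/2\) and the equivalent threshold \(r > (1/2) \log (2\sigma ^2)\); the function names are hypothetical, and the code applies only to this special case.

```python
import math

def is_entangled(sigma2, r):
    """Condition sigma^2 exp(-2r) < 1/2 for the state of Eq. (10)."""
    return sigma2 * math.exp(-2.0 * r) < 0.5

def minimum_squeezing(sigma2):
    """Squeezing r must exceed this value for entanglement (sigma2 >= 1/2 assumed)."""
    return max(0.0, 0.5 * math.log(2.0 * sigma2))

# Vacuum (sigma^2 = 1/2): threshold is zero, so any r > 0 suffices.
r_vacuum = minimum_squeezing(0.5)
# A thermal example with sigma^2 = 2 (arbitrary illustrative value).
r_thermal = minimum_squeezing(2.0)
```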

3 Post-selected states

Let us consider again the general case \(\varvec{\mathsf {\xi }} = \varvec{\mathsf {R}} \varvec{\mathsf {Q}}\) and let \({|{\varvec{\mathsf {\xi }}}\rangle } = {\hat{S}} {|{\varvec{0}}\rangle }\) denote the corresponding multi-modal squeezed vacuum state. If the degree of squeezing is small, as measured by some suitable norm on \(\varvec{\mathsf {\xi }}\), we may approximate the squeezed state \({|{\varvec{\mathsf {\xi }}}\rangle }\) as follows:

$$\begin{aligned} {|{\varvec{\mathsf {\xi }}}\rangle } \approx {|{0,\ldots ,0}\rangle } + \tfrac{1}{2} \sum _{ij} \xi _{ij} {\hat{a}}_i^\dagger {\hat{a}}_j^\dagger {|{0,\ldots ,0}\rangle } \;. \end{aligned}$$
(11)

Neglecting the vacuum state, the second term in this approximation represents a two-photon state of the form

$$\begin{aligned} {|{\psi }\rangle } \propto \tfrac{1}{2} \xi _{11} {|{2,0,\ldots ,0}\rangle } + \xi _{12} {|{1,1,0,\ldots ,0}\rangle } + \cdots + \xi _{1d} {|{1,0,\ldots ,0,1}\rangle } + \cdots + \tfrac{1}{2} \xi _{dd} {|{0,\ldots ,0,2}\rangle } \; . \;\, \end{aligned}$$
(12)

Note that, although \({|{\varvec{\mathsf {\xi }}}\rangle }\) is Gaussian, the approximate post-selected state \({|{\psi }\rangle }\) generally is not.

Of particular interest will be squeezing matrices of the form

$$\begin{aligned} \varvec{\mathsf {\xi }} = r\begin{pmatrix} 0 &{} 0 &{} \alpha _1 &{} \alpha _2 \\ 0 &{} 0 &{} \alpha _3 &{} \alpha _4 \\ \alpha _1 &{} \alpha _3 &{} 0 &{} 0 \\ \alpha _2 &{} \alpha _4 &{} 0 &{} 0 \end{pmatrix} \; , \end{aligned}$$
(13)

as these can be used to represent two-photon polarization states of the form

$$\begin{aligned} {|{\psi }\rangle } = \alpha _1 {|{HH}\rangle } + \alpha _2 {|{HV}\rangle } + \alpha _3 {|{VH}\rangle } + \alpha _4 {|{VV}\rangle } \; . \end{aligned}$$
(14)

Here, the basis states \({|{HH}\rangle }, {|{HV}\rangle }, \ldots \) are used to represent the Fock states \({|{1,0,1,0}\rangle }, {|{1,0,0,1}\rangle }, \ldots \). The corresponding random vector \(\varvec{b}\) shall be denoted as \([b_{AH}, b_{AV}, b_{BH}, b_{BV}]^{\mathsf {T}}\), so the pair \((b_{AH}, b_{AV})\) may be associated with the first photon and \((b_{BH}, b_{BV})\) may be associated with the second photon.

Taking \(\alpha _1 = \alpha _4 = 0\) and \(\alpha _2 = -\alpha _3 = 1/\sqrt{2}\), for example, gives the Bell state \({|{\psi }\rangle } \propto {|{HV}\rangle } - {|{VH}\rangle }\). The polar decomposition in this case is \(\varvec{\mathsf {\xi }} = \varvec{\mathsf {R}} \varvec{\mathsf {Q}}\), with \(\varvec{\mathsf {R}} = (r/\sqrt{2}) \, \varvec{\mathsf {I}}\) and

$$\begin{aligned} \varvec{\mathsf {Q}} = \begin{pmatrix} 0 &{} \;\;\,0 &{} \;\;\,0 &{} \;\;\,1 \\ 0 &{} \;\;\,0 &{} -1 &{} \;\;\,0 \\ 0 &{} -1 &{} \;\;\,0 &{} \;\;\,0 \\ 1 &{} \;\;\,0 &{} \;\;\,0 &{} \;\;\,0 \end{pmatrix} \; . \end{aligned}$$
(15)

As noted previously, the corresponding Gaussian random vector \(\varvec{b}\) is improper for all \(r > 0\). For post-selected squeezed vacuum states, the corresponding quantum state \({|{\psi }\rangle }\) is, of course, maximally entangled. However, for general thermal states this need not be the case, even when \(\varvec{b}\) is improper.

By contrast, suppose \(\alpha _1 = \alpha _2 = \alpha _3 = \alpha _4 = 1/2\), corresponding to a separable superposition of all four modes. In this case, \(\varvec{\mathsf {\xi }} = \varvec{\mathsf {U}} \varvec{\mathsf {D}} \varvec{\mathsf {V}}^{\mathsf {H}}\), where \(\varvec{\mathsf {D}} = \mathrm {diag}(r, r, 0, 0)\) and \(\varvec{\mathsf {U}}, \varvec{\mathsf {V}}\) are unitary matrices taken to be

$$\begin{aligned} \varvec{\mathsf {U}}= & {} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 &{} \;\;\,0 &{} -1 &{} \;\;\,0 \\ 1 &{} \;\;\,0 &{} \;\;\,1 &{} \;\;\,0 \\ 0 &{} -1 &{} \;\;\,0 &{} -1 \\ 0 &{} -1 &{} \;\;\,0 &{} \;\;\,1 \end{pmatrix} \; ,\end{aligned}$$
(16)
$$\begin{aligned} \varvec{\mathsf {V}}= & {} \frac{1}{\sqrt{2}} \begin{pmatrix} 0 &{} -1 &{} \;\;\,0 &{} \;\;\,1 \\ 0 &{} -1 &{} \;\;\,0 &{} -1 \\ 1 &{} \;\;\,0 &{} \;\;\,1 &{} \;\;\,0 \\ 1 &{} \;\;\,0 &{} -1 &{} \;\;\,0 \end{pmatrix} \; . \end{aligned}$$
(17)

Using \(\varvec{\mathsf {R}} = \varvec{\mathsf {U}} \varvec{\mathsf {D}} \varvec{\mathsf {U}}^{\mathsf {H}}\) and \(\varvec{\mathsf {Q}} = \varvec{\mathsf {U}} \varvec{\mathsf {V}}^{\mathsf {H}}\), we find that the pseudo-covariance of \(\varvec{b}\) is nonzero but singular, since \(\det \varvec{\mathsf {R}} = 0\) implies \(\det \varvec{\mathsf {C}} = 0\). So, \(\varvec{b}\) is improper, but the degree of impropriety is zero, in keeping with the separability of \({|{\psi }\rangle }\).
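The two examples of this section, the Bell state and the separable superposition, can be compared numerically. The Python sketch below builds \(\varvec{\mathsf {\xi }}\) from Eq. (13), forms \(\varvec{\mathsf {\Gamma }}\) and \(\varvec{\mathsf {C}}\) from Eqs. (4) and (5) via a singular value decomposition, and evaluates Eq. (6). The function names and the choice \(r = 1\) are illustrative assumptions. NumPy may return different unitary factors than Eqs. (16) and (17), but \(\varvec{\mathsf {C}}\) does not depend on that choice.

```python
import numpy as np

def squeezing_matrix(alpha, r):
    """Squeezing matrix of Eq. (13) for amplitudes alpha = (a1, a2, a3, a4)."""
    a1, a2, a3, a4 = alpha
    return r * np.array([[0, 0, a1, a2],
                         [0, 0, a3, a4],
                         [a1, a3, 0, 0],
                         [a2, a4, 0, 0]], dtype=complex)

def moments(xi, sigma2=0.5):
    """Covariance (Eq. (4)) and pseudo-covariance (Eq. (5)) for xi = R Q."""
    U, d, Vh = np.linalg.svd(xi)
    cosh_R = (U * np.cosh(d)) @ U.conj().T        # cosh of R = U D U^H
    sinh_R = (U * np.sinh(d)) @ U.conj().T
    Q = U @ Vh
    M = sinh_R @ Q @ cosh_R.T
    gamma = sigma2 * (U * np.cosh(2 * d)) @ U.conj().T
    return gamma, sigma2 * (M + M.T)

def impropriety(gamma, c):
    return abs(np.linalg.det(c)) ** 2 / np.linalg.det(gamma).real ** 2

r = 1.0
s = 1 / np.sqrt(2)
gamma_bell, c_bell = moments(squeezing_matrix((0, s, -s, 0), r))
gamma_sep, c_sep = moments(squeezing_matrix((0.5, 0.5, 0.5, 0.5), r))
i_bell = impropriety(gamma_bell, c_bell)   # expected: tanh(sqrt(2) r)^8
i_sep = impropriety(gamma_sep, c_sep)      # expected: zero up to rounding
```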

4 Bell–CHSH inequality violations

To further verify entanglement, we computed a Bell statistic for the Clauser–Horne–Shimony–Holt (CHSH) inequality [17]. For the Bell statistic, we used the observables \(\varvec{\mathsf {A}}_1 = \varvec{\mathsf {Z}}\), \(\varvec{\mathsf {A}}_2 = \varvec{\mathsf {X}}\), \(\varvec{\mathsf {B}}_1 = (\varvec{\mathsf {Z}}+\varvec{\mathsf {X}})/\sqrt{2}\), and \(\varvec{\mathsf {B}}_2 = (\varvec{\mathsf {Z}}-\varvec{\mathsf {X}})/\sqrt{2}\), where \(\varvec{\mathsf {X}}\) and \(\varvec{\mathsf {Z}}\) are the Pauli x and z matrices. For quantum observables, the Bell statistic

$$\begin{aligned} S = \bigl | C_{11} + C_{12} \bigr | + \bigl | C_{21} - C_{22} \bigr | \;, \end{aligned}$$
(18)

using \(C_{ij} = {\langle {\psi | \varvec{\mathsf {A}}_i \otimes \varvec{\mathsf {B}}_j |\psi }\rangle }\), is \(S = 2\sqrt{2}\).
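The quantum value \(S = 2\sqrt{2}\) can be reproduced with a few lines of linear algebra. The Python sketch below evaluates Eq. (18) for the Bell state \({|{\psi }\rangle } \propto {|{HV}\rangle } - {|{VH}\rangle }\); the basis ordering (HH, HV, VH, VV) and the convention that H is the \(+1\) eigenstate of \(\varvec{\mathsf {Z}}\) are assumptions of the example.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

A = [Z, X]                                         # Alice: A_1, A_2
B = [(Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)]   # Bob: B_1, B_2

# Bell state (|HV> - |VH>)/sqrt(2) in the ordering HH, HV, VH, VV.
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

def correlation(a_op, b_op):
    """C_ij = <psi| A_i (x) B_j |psi>."""
    return psi @ np.kron(a_op, b_op) @ psi

C = [[correlation(a_op, b_op) for b_op in B] for a_op in A]
S = abs(C[0][0] + C[0][1]) + abs(C[1][0] - C[1][1])
```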

For classical observables, a specific definition of measurement is needed to compute the correlations \(C_{ij}\), and for this we used local amplitude threshold crossings as a model for single-photon detection [15, 16]. In this approach, a detection of the modal component \(b_i\) is said to occur when \(|b_i| > \gamma \) for some fixed amplitude threshold \(\gamma \ge 0\). In the analysis to follow, we take \(\gamma = 1\) for all detectors.

Verification was done numerically as follows. First, a random sample of \(N = 2^{20}\) realizations of the vacuum states \(\varvec{a} = [a_{AH}, a_{AV}, a_{BH}, a_{BV}]^{\mathsf {T}}\) was generated, where the components of \(\varvec{a}\) are independent and identically distributed proper complex Gaussian random variables with zero mean and variance \(\sigma ^2 = 1/2\). Using Eqns. (3) and (15) with \(\varvec{\mathsf {R}} = (r/\sqrt{2}) \varvec{\mathsf {I}}\), we computed the squeezed modes \(\varvec{b}_{A} = [b_{AH}, b_{AV}]^{\mathsf {T}}\) and \(\varvec{b}_{B} = [b_{BH}, b_{BV}]^{\mathsf {T}}\) representing two entangled photons measured by Alice and Bob, respectively. Specifically, these are given by

$$\begin{aligned} \varvec{b}_{A}&= \begin{bmatrix} a_{AH} \cosh \left( r/\sqrt{2}\right) + a_{BV}^* \sinh \left( r/\sqrt{2}\right) \\ a_{AV} \cosh \left( r/\sqrt{2}\right) - a_{BH}^* \sinh \left( r/\sqrt{2}\right) \end{bmatrix} \; , \end{aligned}$$
(19)
$$\begin{aligned} \varvec{b}_{B}&= \begin{bmatrix} a_{BH} \cosh \left( r/\sqrt{2}\right) - a_{AV}^* \sinh \left( r/\sqrt{2}\right) \\ a_{BV} \cosh \left( r/\sqrt{2}\right) + a_{AH}^* \sinh \left( r/\sqrt{2}\right) \end{bmatrix} \; . \end{aligned}$$
(20)

In the simulation, r was varied from 0 to 3. Note that, although \(r \gg 1\) exceeds the validity condition of Eqn. (11), the equations for \(\varvec{b}_A\) and \(\varvec{b}_B\) remain valid.

For the random vectors \(\varvec{b}_{A}\) and \(\varvec{b}_{B}\), measurements were performed as follows. To measure \(\varvec{\mathsf {A}}_1 = \varvec{\mathsf {Z}}\), Alice need only consider the components \(b_{AH}\) and \(b_{AV}\). Let \(I_H\) denote the subset of all realizations for which \(|b_{AH}| > \gamma \), and let \(I_V\) be the subset for which \(|b_{AV}| > \gamma \). Events in \(I_H\) correspond to an outcome of \(+1\) for measuring \(\varvec{\mathsf {A}}_1\), while events in \(I_V\) correspond to an outcome of \(-1\). By contrast, the complementary set \({\bar{I}}_H\) denotes the set of realizations for which an outcome of \(+1\) did not occur, either because the outcome was \(-1\) or because there was no detection observed by Alice.

To measure, say, \(\varvec{\mathsf {B}}_2\), we first applied a unitary matrix \(\varvec{\mathsf {U}}_-^\dagger \) to \(\varvec{b}_{B}\) to obtain \(\varvec{b}_{B}' = \varvec{\mathsf {U}}_-^\dagger \varvec{b}_{B} = [b'_{BH}, b'_{BV}]^{\mathsf {T}}\), where

$$\begin{aligned} \varvec{\mathsf {U}}_{\pm } = \begin{pmatrix} \;\;\,\cos (\pi /8) &{} \pm \sin (\pi /8) \\ \pm \sin (\pi /8) &{} -\cos (\pi /8) \end{pmatrix} \end{aligned}$$
(21)

is such that \(\varvec{\mathsf {U}}_{+}^\dagger \varvec{\mathsf {B}}_1 \varvec{\mathsf {U}}_{+} = \varvec{\mathsf {U}}_{-}^\dagger \varvec{\mathsf {B}}_2 \varvec{\mathsf {U}}_{-} = \varvec{\mathsf {Z}}\) is diagonal. Let \(J_H\) denote the subset of all realizations for which \(|b'_{BH}| > \gamma \), and, similarly, define \(J_V\) to be those for which \(|b'_{BV}| > \gamma \). Much as before, events in \(J_H\) correspond to an outcome of \(+1\) for measuring \(\varvec{\mathsf {B}}_2\), while events in \(J_V\) correspond to an outcome of \(-1\). Note that measurements of \(\varvec{\mathsf {A}}_1\) and \(\varvec{\mathsf {B}}_2\) are each performed locally.
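The diagonalization property of Eq. (21) is easy to confirm numerically. The short Python sketch below checks that \(\varvec{\mathsf {U}}_{\pm }\) is unitary and maps \(\varvec{\mathsf {B}}_1\) and \(\varvec{\mathsf {B}}_2\) to \(\varvec{\mathsf {Z}}\); the helper name `u_matrix` is hypothetical.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
B1 = (Z + X) / np.sqrt(2)
B2 = (Z - X) / np.sqrt(2)

def u_matrix(sign):
    """U_+ (sign = +1) or U_- (sign = -1) from Eq. (21)."""
    c, s = np.cos(np.pi / 8), np.sin(np.pi / 8)
    return np.array([[c, sign * s], [sign * s, -c]])

U_plus, U_minus = u_matrix(+1), u_matrix(-1)
diag_B1 = U_plus.conj().T @ B1 @ U_plus     # should equal Z
diag_B2 = U_minus.conj().T @ B2 @ U_minus   # should equal Z
```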

Now, the correlation \(C_{12}\) between measurements of \(\varvec{\mathsf {A}}_1\) and those of \(\varvec{\mathsf {B}}_2\) was computed as follows:

$$\begin{aligned} \begin{aligned} C_{12} = (+1)(+1) \frac{\Pr [E_{HH}]}{\Pr [E]} + (+1)(-1) \frac{\Pr [E_{HV}]}{\Pr [E]} + (-1)(+1) \frac{\Pr [E_{VH}]}{\Pr [E]} + (-1)(-1) \frac{\Pr [E_{VV}]}{\Pr [E]}, \end{aligned} \end{aligned}$$
(22)

where \(\Pr [E_{HV}]/\Pr [E]\), say, is the probability of the event \(E_{HV} = I_H \cap {\bar{I}}_V \cap {\bar{J}}_H \cap J_V\) conditioned on the set of coincident single-detection events \(E = E_{HH} \cup E_{HV} \cup E_{VH} \cup E_{VV}\). Note that conditioning on coincident detection events corresponds to post-selecting for \({|{\psi }\rangle }\) upon preparing \({|{\varvec{\mathsf {\xi }}}\rangle }\). (Higher order multi-photon terms are considered negligible if r is small.) This post-selection is necessary to prepare the desired entangled state but introduces contextuality since the set E will be different for each of the four measurement selections.

The other correlations are computed similarly, and doing so for all four combinations of observables results in the Bell statistic S. In Fig. 1, we have plotted S as a function of r, the magnitude of the squeezing parameter. We see that for r greater than about 0.5, we obtain a violation of the CHSH inequality \(S \le 2\). The ability to violate the CHSH inequality is, of course, a result of the post-selection performed on coincident single-detection events, which gives rise to contextuality and the detection loophole [31,32,33]. We furthermore note that for r greater than about 1, the validity of Eqn. (11) is compromised and, so, we also get a violation of the Tsirelson bound of \(2\sqrt{2}\) [34]. The upward trend continues monotonically towards an asymptote of 4, which is the algebraic upper bound on S.
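The procedure described above can be sketched in a few dozen lines of Python. This is an illustrative reimplementation under stated assumptions, not the code used to produce Fig. 1: the function name, sample size, and seed are arbitrary, a Hadamard matrix is assumed as the basis change for Alice's \(\varvec{\mathsf {A}}_2 = \varvec{\mathsf {X}}\) measurement (the text specifies only Bob's unitaries), and the squeezed modes are formed from Eq. (3) with \(\varvec{\mathsf {Q}}\) of Eq. (15) applied to the vacuum modes \(\varvec{a}\).

```python
import numpy as np

def bell_statistic(r, n=2**18, gamma=1.0, sigma2=0.5, seed=0):
    """Monte Carlo estimate of S for the amplitude-threshold detection model."""
    rng = np.random.default_rng(seed)
    # Vacuum modes [a_AH, a_AV, a_BH, a_BV]: i.i.d. proper complex Gaussians.
    a = np.sqrt(sigma2 / 2) * (rng.standard_normal((4, n))
                               + 1j * rng.standard_normal((4, n)))
    ch, sh = np.cosh(r / np.sqrt(2)), np.sinh(r / np.sqrt(2))
    # Squeezed modes: b = cosh(R) a + sinh(R) Q a*, with R = (r/sqrt(2)) I.
    b_A = np.stack([ch * a[0] + sh * np.conj(a[3]),
                    ch * a[1] - sh * np.conj(a[2])])
    b_B = np.stack([ch * a[2] - sh * np.conj(a[1]),
                    ch * a[3] + sh * np.conj(a[0])])

    c8, s8 = np.cos(np.pi / 8), np.sin(np.pi / 8)
    hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # assumed for A_2 = X
    alice = [np.eye(2), hadamard]                 # bases for A_1 = Z, A_2 = X
    bob = [np.array([[c8, s8], [s8, -c8]]),       # U_+ for B_1
           np.array([[c8, -s8], [-s8, -c8]])]     # U_- for B_2

    def measure(U, b):
        hit = np.abs(U.conj().T @ b) > gamma      # threshold crossings per detector
        single = hit[0] ^ hit[1]                  # exactly one detector fires
        outcome = np.where(hit[0], 1, -1)         # +1 for H, -1 for V
        return single, outcome

    def correlation(i, j):
        s_a, o_a = measure(alice[i], b_A)
        s_b, o_b = measure(bob[j], b_B)
        keep = s_a & s_b                          # coincident single detections
        return np.mean(o_a[keep] * o_b[keep])     # Eq. (22)

    C = [[correlation(i, j) for j in range(2)] for i in range(2)]
    return abs(C[0][0] + C[0][1]) + abs(C[1][0] - C[1][1])

S_unsqueezed = bell_statistic(0.0)
S_squeezed = bell_statistic(2.0)
```

With \(r = 0\) the four modes are independent, so the estimate should be close to zero, whereas larger values of r should exceed the classical bound of 2 while remaining below the algebraic bound of 4.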

Such large values for S are not possible within quantum mechanics but do occur in post-quantum models such as Popescu–Rohrlich (PR) boxes [35]. Despite their unusual behavior, these models are not merely speculative but can be created artificially. For example, it has been observed that a PR box may be created simply through post-selection [36]. Indeed, an experimental realization of this for a three-photon state has already been performed [37]. A similar experiment using pairs of photons in entangled orbital angular momentum states was used to demonstrate near maximal violations [38]. Recently, a notional scheme for an optical realization of a PR box for two polarization photons has also been proposed [39]. The importance of the present work is to demonstrate how a physical model, such as we have described, can plausibly violate the fair-sampling hypothesis and result in extreme violations of the CHSH inequality, as have been observed experimentally.

Fig. 1 (color online) Plot of the Bell statistic S (black dots) and coincident detection efficiency \(\eta \) (blue squares) versus the squeezing parameter r for an entangled Bell state. The solid red horizontal line is the classical bound of 2, while the dashed green horizontal line is the Tsirelson bound of \(2\sqrt{2}\).

High values of S rely upon post-selection of rare events, which can lead to low coincident detection efficiencies. We define the efficiency, \(\eta \), as the probability of a coincident detection, conditioned on a single detection for either measurement, and minimized over all measurements, as suggested in Ref. [33]. As shown in Fig. 1, the efficiency, \(\eta \), is maximized near \(r = 0.8\), which is the regime in which we get a CHSH violation. Much larger values of r, which are needed for more extreme violations, occur with vanishing probability. The maximum efficiency of about 38% is consistent with the detection loophole and comparable to what is observed for typical avalanche photodiodes [40].

5 Conclusion

In this paper, we have considered a classical model for certain multi-mode squeezed states in which the quantum mechanical annihilation operators are replaced with independent proper complex Gaussian random variables. The transformed random vector representing the squeezed state was found to be complex Gaussian distributed but not necessarily proper, due to the presence of a possibly nonzero pseudo-covariance matrix. Entangled quantum states were found to correspond to improper classical random vectors, but impropriety alone was not found to entail entanglement. The model was further examined by demonstrating violations of the Bell–CHSH inequality on post-selected coincident detections using an amplitude threshold crossing model of single-photon detection, with results that conform well with experimental observations for typical avalanche photodiode detectors.