1 Introduction

Together with the notion of general covariance, the equivalence principle represents a cornerstone of general relativity. Since its inception more than one century ago, there have been a number of distinct interpretations of the equivalence principle, each with its own set of assumptions and scope of use [1,2,3,4,5,6,7]. The simplest version of the equivalence principle, the so-called weak equivalence principle (WEP), asserts that non-inertial effects caused by acceleration are indistinguishable from the effects of an external gravitational field. This is encoded in the identity between the inertial and gravitational mass of test bodies subject to gravity.

Among the many theoretical and experimental efforts that have been made in connection with WEP (a comprehensive review of recent developments can be found in Ref. [7]), particularly prominent are those based on particle mixing, because these can also shed light on the validity of WEP in the quantum regime. Specifically, the investigations of WEP in neutrino physics have been a driving force behind a large number of studies, cf. Refs. [8,9,10,11,12,13,14,15,16,17] and references therein. Along these lines, the primary focus has been on ultra-relativistic neutrinos, as they are the most pertinent from a phenomenological standpoint. However, recently it has been pointed out [18] that, in a non-relativistic setting, it is simple to differentiate between the inertial neutrino mass \(m_i\) and the gravitational neutrino mass \(m_g\) in the weak interaction basis (i.e., flavor basis). In other words, for low-energy neutrinos the violation of WEP becomes manifest.

In the case of neutrinos, the mismatch between \(m_i\) and \(m_g\) can be attributed to the unavoidable presence of flavor mixing. In particular, when performing the non-relativistic limit, one has to simultaneously deal with large and small bispinor components (e.g., \(\psi ^L\) and \(\psi ^S\)), which in the case of mixing are both comparably important. In fact, to identify the low-energy inertial mass \(m_i\), one has to work interchangeably with both small and large components because these are interlocked at all energy scales. On the other hand, when the conventional minimal coupling to gravity is considered in the weak-field approximation, the gravitational potential couples directly to the original flavor masses, allowing them to be interpreted as equivalent to gravitational masses. Since the gravitational mass does not undergo the same redefinition as the inertial one, the violation of WEP arises.

At this stage, one may wonder if a similar violation of WEP would also be observed for spin-0 particles, since particle mixing is not exclusive to neutrino physics, but is also found for weakly interacting mesons. The most important mixing phenomena in this context are represented by the oscillations of neutral kaons \(K^0\rightleftharpoons \bar{K}^0\), neutral and strange B mesons, \(B^0\rightleftharpoons \bar{B}^0\) and \(B_s^0\rightleftharpoons \bar{B}_s^0\), respectively, and neutral D mesons \(D^0\rightleftharpoons \bar{D}^0\). Although the last example has been definitively established only recently [19], the other meson oscillations have been known for a long time. In particular, the strange B meson has attracted significant attention over the years, as its phenomenology is considered to be relevant for understanding the asymmetry of matter and antimatter in the visible universe [20, 21].

Apart from the different spin, there are also other compelling reasons to extend the considerations carried out in Ref. [18] to bosons. For instance, there is a stark contrast between the oscillations of mesons and neutrinos. While neutrinos are elementary particles, mesons are not. They are hadrons made up of two valence quarks, which means that the first-quantized description of meson oscillations is only an effective one that is valid for energies smaller than the typical mass scales involved. On the other hand, analyzing low-energy meson oscillations could potentially facilitate the investigation of WEP violations for composite quantum systems. Such a study is, however, beyond the scope of this paper and it will not be pursued here. Another crucial reason to analyze meson oscillations is their potential for experimentation. Indeed, the elusive nature of neutrinos renders them unreliable as test particles for preparation and detection with the currently available technology. Conversely, much heavier mesons offer a wider admissible energy range for experiments, thus enabling the development of potential experimental setups to test the WEP violation.

In this paper, we aim to expand the results of Ref. [18] by studying the non-relativistic limit of mixed scalar particles and to determine whether the difference between \(m_i\) and \(m_g\) observed in [18] for mixed Dirac fermions also holds for spinless particles. To keep our considerations as close as possible to the spin-\(\frac{1}{2}\) case, we will resort to the Feshbach–Villars (FV) representation of a Klein–Gordon (KG) particle [22, 23]. The FV representation allows one to reformulate the Klein–Gordon equation in a Schrödinger-like form (thereby employing first-order time derivatives, as for the Dirac equation), where the ensuing wave function has a two-component form, which reflects the presence of particles and antiparticles [22]. A particular advantage of the FV representation is that it allows one to pursue a non-relativistic approximation in a systematic way by mimicking the Foldy–Wouthuysen (FW) procedure known from the spin-\(\frac{1}{2}\) wave equation [24, 25]. It should also be stressed that, in the non-relativistic limit, the positive-energy plane-wave solutions of the FV equation have the upper component much larger than the lower component [23]. Similarly, for the negative-energy plane-wave solutions, the lower component is much larger. The analogous situation also holds for Dirac wave functions, with the only difference that, in the latter case, one must also consider helicity components.

In contrast to Dirac fermions where gravity couples through minimal coupling (with the Fock–Ivanenko connection), there is not yet a standard theory of massive spinless bosons in curved spacetime [28]. In our considerations, we will employ the commonly used conformal coupling, which, among others, has a correct quasiclassical limit, avoids a tachyonic behavior and allows for a straightforward translation in terms of the FV representation [28,29,30].

For the sake of consistency, we supplement our discussion by examining Bargmann’s superselection rules [31]. Specifically, we demonstrate why superpositions of states with different masses are not problematic in the non-relativistic limit of relativistic quantum mechanics, unlike the case in which one instead starts from Galilean (rather than Lorentz) invariance.

The paper is organized as follows: in Sect. 2, we briefly review the non-relativistic limit for mixed neutrinos in the weak interaction basis. In Sect. 3, we study in detail the non-relativistic limit of the FV equation for mixed scalar particles and show that, in this framework, one inevitably comes across a non-trivial correction to the initial inertial mass, which leads to a low-energy effective inertial mass \(m_i\). In addition, if a weak gravitational field is present, we show that the corresponding gravitational mass \(m_g\) does not undergo the same redefinition as the inertial mass, hence \(m_i \ne m_g\), which is a direct signature of WEP violation. In Sect. 3.3, we briefly comment on the violation of Bargmann’s superselection rules in connection with the non-relativistic limit of a relativistic theory with mixed particles. Concluding remarks and generalizations are proposed in Sect. 5.

2 Non-relativistic mixed neutrinos

In this section, we briefly summarize the results of Ref. [7] regarding the violation of WEP for mixed neutrinos. To this end, we start from the two-flavor Dirac equation associated with neutrinos \(\nu _e\) and \(\nu _\mu \). In a compact notation, this reads

$$\begin{aligned} \left(i\gamma '^\alpha \partial _\alpha \ - \ \mathbb {M}\right)\Psi \ = \ 0, \end{aligned}$$
(1)

where \(\gamma '^\alpha \) is the \(8\times 8\) matrix \(\mathbb {I}_{2\times 2}\otimes \gamma ^\alpha \), \(\mathbb {M}\) is the \(8\times 8\) non-diagonal mass matrix, which in the \(4\times 4\) block formalism is given by

$$\begin{aligned} \mathbb {M} \ = \ \begin{pmatrix}m_e &{} m_{e\mu } \\ m_{e\mu } &{} m_\mu \end{pmatrix}, \end{aligned}$$
(2)

(note that \(m_{e\mu } = m_{\mu e}\)), whilst the wave-function \(\Psi \) is a short-hand notation for the neutrino doublet

$$\begin{aligned} \Psi \ = \ \begin{pmatrix}\psi _e\\ \psi _\mu \end{pmatrix}. \end{aligned}$$
(3)

At this stage, one can choose to work with the electron neutrino only, as the implications for the muon neutrino can be derived by simply swapping the subscripts \(e\leftrightarrow \mu \). So, for positive energy solutions, we obtain from Eq. (1) two coupled algebraic equations in momentum space:

$$\begin{aligned} \left(E_e-m_e\right)\varphi _e \ - \ {\varvec{\sigma }}\cdot {\varvec{p}}\,\chi _e= & {} m_{e\mu }\varphi _\mu ,\nonumber \\ {\varvec{\sigma }}\cdot {\varvec{p}}\,\varphi _e \ - \ \left(E_e+m_e\right)\chi _e= & {} m_{e\mu }\chi _\mu , \end{aligned}$$
(4)

with \(\varphi _{e,\mu }\) and \(\chi _{e,\mu }\) denoting the “large” (upper) and “small” (lower) bispinor components. To consider the non-relativistic limit, one should bear in mind that the dominant contribution to the energy comes from the rest mass. In light of this, in Eq. (4) we can factor out from the components of each spinor with definite flavor \(\nu =(e,\mu )\) a fast-oscillating phase \(\exp [-im_\nu t]\). In position representation, a derivative with respect to time allows us to obtain a mass term in both expressions (4), but as the rest mass is the leading term in the considered regime, we can assume \(|i\partial _t\psi _\nu |\ll |m_\nu \psi _\nu |\) and simplify the equations accordingly. In momentum space, this is equivalent to defining the non-relativistic energy \(E^{N\!R}_{e} \equiv E_{e} - m_{e}\), approximating \(E_e + m_e \approx 2m_e\), and writing (4) as

$$\begin{aligned} E^{N\!R}_{e}{\varphi }_e \ - \ {\varvec{\sigma }}\cdot {\varvec{p}}\ \! \,{\chi }_e= & {} m_{e\mu }\ \!{\varphi }_\mu ,\nonumber \\ {\varvec{\sigma }}\cdot {\varvec{p}}\ \!\,{\varphi }_e\ - \ 2m_e{\chi }_e= & {} m_{e\mu }\ \!{\chi }_\mu . \end{aligned}$$
(5)

Analogous equations hold for \(e\leftrightarrow \mu \).

If there were no mixing, the small component \(\chi \) would be negligible compared to the large component \(\varphi \). When particle mixing is taken into account, the small component \(\chi _e\) still remains much smaller than \(\varphi _e\), provided a small admixture of the large component \(\varphi _{\mu }\) is included (a similar statement also holds for \(e\leftrightarrow \mu \)). This can be seen as follows: we first eliminate \(\chi _\mu \) in the second equation in (5) by inserting \(\chi _\mu \) from the analogous equation where e and \(\mu \) are exchanged. With this, the second equation in (5) can be cast in the form

$$\begin{aligned} \chi _e \ = \ \frac{{\varvec{\sigma }}\cdot {\varvec{p}}}{2 m_e} \ \! \varphi _e\ - \ \frac{m_{e\mu }}{4m_e m_{\mu }}\ \! {\varvec{\sigma }}\cdot {\varvec{p}} \ \! \varphi _{\mu }\ + \ \omega \ \!\chi _e, \end{aligned}$$
(6)

or equivalently

$$\begin{aligned} \chi _e \ = \ \frac{{\varvec{\sigma }}\cdot {\varvec{p}}}{(1-\omega )2 m_e} \ \! \varphi _e\ - \ \frac{m_{e\mu } \ \! {\varvec{\sigma }}\cdot {\varvec{p}}}{(1- \omega )4m_e m_{\mu }} \ \! \varphi _{\mu }, \end{aligned}$$
(7)

where

$$\begin{aligned} \omega \ = \ \frac{m_{e\mu }^2}{4m_em_\mu }. \end{aligned}$$
(8)
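The elimination leading to (7) can be cross-checked with a few lines of exact rational arithmetic. The following is only a sketch: the operator \({\varvec{\sigma }}\cdot {\varvec{p}}\) is treated as a common factor, the function names are illustrative, and the mass values are arbitrary test numbers rather than physical neutrino masses.

```python
from fractions import Fraction


def chi_e_coefficients(m_e, m_mu, m_emu):
    """Solve the second equation in (5) and its e<->mu partner for chi_e.

    With s standing for sigma.p, the linear system is
        2*m_e*chi_e + m_emu*chi_mu = s*phi_e
        m_emu*chi_e + 2*m_mu*chi_mu = s*phi_mu
    Returns (a, b) such that chi_e = s*(a*phi_e + b*phi_mu).
    """
    det = 4 * m_e * m_mu - m_emu**2   # determinant of the 2x2 system
    a = 2 * m_mu / det                # Cramer's rule: coefficient of phi_e
    b = -m_emu / det                  # Cramer's rule: coefficient of phi_mu
    return a, b


def chi_e_from_eq7(m_e, m_mu, m_emu):
    """Coefficients of phi_e and phi_mu as written in Eq. (7)."""
    omega = m_emu**2 / (4 * m_e * m_mu)           # Eq. (8)
    a = 1 / ((1 - omega) * 2 * m_e)               # first term of Eq. (7)
    b = -m_emu / ((1 - omega) * 4 * m_e * m_mu)   # second term of Eq. (7)
    return a, b


# Arbitrary illustrative masses (not physical values).
masses = (Fraction(1), Fraction(3), Fraction(1, 2))
direct = chi_e_coefficients(*masses)
quoted = chi_e_from_eq7(*masses)
print(direct == quoted)
```

Since the arithmetic is exact, agreement of the two coefficient pairs confirms that (7) is the solution of the coupled system and not merely an approximation in \(\omega \).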

Should mixing be absent (i.e., \(m_{e\mu }=\omega =0\)), we would end up with

$$\begin{aligned} {\chi _\nu \ = \ \frac{{\varvec{\sigma }}\cdot {\varvec{p}}}{2 m_\nu } \ \! \varphi _\nu \qquad \nu =(e,\mu )\, ,} \end{aligned}$$
(9)

which would clearly yield the standard, non-relativistic Schrödinger equation and thus prevent the inertial mass from undergoing any form of redefinition (such as in Ref. [7]). Furthermore, we stress that the expression (9) explains why the “large” and “small” components are so named: in the non-relativistic limit, \(m\gg |{\varvec{p}}|\), which makes \(\chi \) subdominant with respect to \(\varphi \).

It is not difficult to see that, even with weak mixing, the notions “large” and “small” bispinor components still retain their traditional meaning. In fact, going back to the case of flavor mixing, by assuming that \(m_e \le m_\mu \) and expanding (7) up to \({\mathcal {O}}(\omega )\), we note that

$$\begin{aligned} {|\!| \chi _e |\!|_2=\frac{|\varvec{p}|}{2}\Big |\!\Big |\frac{1+\omega }{m_e}\ \! \varphi _e\ - \ \frac{m_{e\mu }(1+\omega )}{2m_e m_{\mu }}\ \! \varphi _\mu \Big |\!\Big |_2 ,} \end{aligned}$$
(10)

where \(|\!| \ldots |\!|_2\) denotes the \(\ell _2\)-norm. Now, since from Eq. (8) we see that \(m_{e\mu }\propto \sqrt{\omega }\), the linear terms in \(\omega \) are of higher order, and can thus be neglected. Furthermore, by making use of the triangle inequality, the above expression becomes

$$\begin{aligned} |\!| \chi _e |\!|_2\le & {} \frac{|\varvec{p}|}{2}\left( \frac{|\!|\varphi _{e} |\!|_2}{m_e} + \frac{\sqrt{\omega }}{\sqrt{m_e m_{\mu }}}\ \! |\!|\varphi _{\mu } |\!|_2\right) \nonumber \\\le & {} \frac{|\varvec{p}|}{2 m_e}\left( |\!|\varphi _{e} |\!|_2 + \sqrt{\omega } |\!|\varphi _{\mu } |\!|_2\right) , \end{aligned}$$
(11)

where in the last step we have relied on our initial assumption of the hierarchy of masses, \(m_e \le m_\mu \). Clearly, a similar relation also holds for the muon neutrino

$$\begin{aligned} |\!| \chi _\mu |\!|_2\le & {} \frac{|\varvec{p}|}{2 \sqrt{m_e m_{\mu }}}\left( |\!|\varphi _{\mu } |\!|_2 + \sqrt{\omega } |\!|\varphi _{e} |\!|_2\right) . \end{aligned}$$
(12)

In particular, when \(\sqrt{\omega } |\!|\varphi _{\mu } |\!|_2 \lesssim |\!|\varphi _{e} |\!|_2\) and \(\sqrt{\omega } |\!|\varphi _{e} |\!|_2 \lesssim |\!|\varphi _{\mu } |\!|_2\), we have that in the non-relativistic limit (i.e., when \(|\varvec{p}| \ll m_e\)) the large components are much larger than their respective small components. Note that the relativistic regime enters at scales where \(|\varvec{p}| \gg \sqrt{m_e m_{\mu }}\), which is also in agreement with QFT considerations [32].

By substituting \(\chi _e\) from (7) into the first equation in (5), we eliminate the small component from the equation and obtain

$$\begin{aligned} E^{N\!R}_{e}\ \!\varphi _e= & {} \frac{\varvec{p}^2}{2m_e(1-\omega )}\ \!\varphi _e \nonumber \\{} & {} + \left[m_{e\mu }\ - \ \frac{m_{e\mu }}{2m_e}\ \! \frac{1}{(1-\omega )} \ \! \frac{\varvec{p}^2}{2m_\mu }\right]\ \!\varphi _\mu \nonumber \\= & {} \frac{\varvec{p}^2}{2m_{i,e}}\ \!\varphi _e \ + \ m_{e\mu }\left[ 1 \ - \ \frac{\varvec{p}^2}{4m_e m_{i,\mu }}\right] \varphi _\mu . \end{aligned}$$
(13)

where we have defined the effective inertial mass \(m_{i,e} = m_{e}\left( 1-{\omega }\right) \) (and similarly for \(m_{i,{\mu }}\)). In Ref. [18], the same result was obtained by means of the iteration method. An analogous relation also holds for \(e\leftrightarrow \mu \). Before we proceed, let us clarify why we have identified the quantity \(m_{i,e}=(1-\omega )m_e\) as the effective inertial mass. We can draw here upon the arguments presented in Refs. [33, 34], which state that “... at the level of the relativistic wave equation (both for the Klein–Gordon and the Dirac scenario) there is no unambiguous way of discerning the inertial from the gravitational mass, as one can only introduce the notion of “phase space” mass, which is the pole of the propagator associated with the given field. To establish a meaningful definition of inertial mass, one has to perform a non-relativistic limit and look at the mass term appearing in the kinetic operator”.

Bearing this in mind, we can move on. Equation (13) is the sought non-relativistic limit of the Dirac equation for a mixed electron neutrino. If \(m_{e\mu }\) is equal to zero, the equations for \(\varphi _e\) and \(\varphi _\mu \) become independent, resulting in free electron and muon neutrinos with masses \(m_e\) and \(m_\mu \), respectively. The presence of the amplitude in square brackets in the r.h.s. of (13), however, creates a connection between the two flavor neutrinos, implying that one flavor might “leak” into the other. In fact, such an amplitude is nothing but the “flip-flop” amplitude of a two-state system [35]. The amplitude modulus is manifestly invariant under the exchange of flavors \(e \leftrightarrow \mu \), thus reflecting the principle of detailed balance in the oscillation phenomenon. From (13), it is evident that the inertial mass \(m_i\) appearing in the kinetic term must be re-scaled by a factor \(1-\omega \), which reduces to unity only when mixing is removed, i.e., when \(m_{e\mu }\rightarrow 0\).

The root cause of the violation of the weak equivalence principle for mixed particles can be traced back to the previous observation. Indeed, the weak equivalence principle states that inertial and gravitational mass are equal, and it is not difficult to check that, when a gravitational potential is switched on, then the gravitational mass is not redefined in the same way as the inertial mass. To see this, it is convenient to study the Dirac equation in the weak-field regime of the Schwarzschild solution in isotropic coordinates. The line element in this case is expressed as follows [28]:

$$\begin{aligned} ds^2 = \left(1+2\,\phi \,\right)dt^2 \ - \left(1-2\,\phi \right)\left(dx^2+dy^2+dz^2\right), \end{aligned}$$
(14)

with \(\phi =-GM/|\varvec{x}|\) being the Newtonian potential. Now, the Dirac equation must be rephrased to take into account the presence of gravity. This is done by means of the Fock–Ivanenko connection \(\Gamma _\mu \), see Refs. [26, 27], i.e.,

$$\begin{aligned} {\Gamma _\mu \ = \ \frac{1}{8}\left[ \gamma ^a,\gamma ^b\right] e_a^\lambda \nabla _\mu e_{b\lambda } ,} \end{aligned}$$
(15)

which enters the covariant derivative that replaces the standard derivatives appearing in (1), namely \(\partial _\mu \rightarrow D_\mu =\partial _\mu +\Gamma _\mu \). The quantities \(e_a^\lambda \) are called tetrad fields or vierbeins, and are necessary for the treatment of spinor dynamics on curved backgrounds. These fields are determined by the requirements:

$$\begin{aligned} {g_{\mu \nu } \ = \ e_\mu ^ae_\nu ^b\eta _{ab}, \quad e_\mu ^ae_b^\mu \ = \ \delta _b^a, \quad e_a^\mu e_\nu ^a \ = \ \delta _\nu ^\mu ,} \end{aligned}$$
(16)

meaning that \(e^{\mu }_a\) is the inverse of \(e^{a}_{\mu }\). Whenever vierbeins are present in the equations, Greek indices are associated with manifold coordinates, whilst Latin indices are related to local Lorentz frame vector labels. Accordingly, Latin indices are raised and lowered with \(\eta ^{ab}\), \(\eta _{ab}\) and Greek ones with \(g^{\mu \nu }\), \(g_{\mu \nu }\). For further details on this formalism, the interested reader is referred, e.g., to Ref. [28].
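As an illustration (a standard diagonal choice, assumed here rather than taken from the text), a tetrad compatible with the metric (14) and the conditions (16) to first order in \(\phi \) can be sketched as

$$\begin{aligned} e_\mu ^a \ = \ \mathrm{diag}\left( 1+\phi ,\, 1-\phi ,\, 1-\phi ,\, 1-\phi \right) , \qquad e_a^\mu \ = \ \mathrm{diag}\left( 1-\phi ,\, 1+\phi ,\, 1+\phi ,\, 1+\phi \right) , \end{aligned}$$

so that \(g_{00} = (1+\phi )^2 \approx 1+2\phi \) and \(g_{ii} = -(1-\phi )^2 \approx -(1-2\phi )\), while \(e_\mu ^a e_a^\mu = 1-\phi ^2 \approx 1\) for each diagonal entry. With this choice, the contraction \(\gamma ^a e_a^\mu \) rescales the time-like gamma matrix by \((1-\phi )\) and the space-like ones by \((1+\phi )\).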

In the simplest case of a weak and slowly varying potential (i.e., \(|\phi | \ll 1\) and \(\partial _i\phi \approx 0\) with \(i=x,y,z\)), however, the Fock–Ivanenko connection can be neglected, since for the metric (14) its expression reads [7]

$$\begin{aligned} {\Gamma _\mu \ = \ \frac{1}{8}\left[ \gamma ^a,\gamma ^b\right] e_a^\lambda \left( \eta _{\mu \lambda }\partial _\rho \phi -\eta _{\mu \rho }\partial _\lambda \phi \right) e_b^\rho \, .} \end{aligned}$$
(17)

Thus, all the information on the curved nature of spacetime is stored in the “generalized” gamma matrices \(\gamma ^\mu =\gamma ^ae_a^\mu \), cf. Ref. [28], where \(\gamma ^a\) fulfill the usual Clifford algebra \(Cl_{1,3}(\mathbb {R})\). Therefore, Eq. (4) now becomes

$$\begin{aligned}{} & {} {\left[(1-\phi )E_e-m_e\right]\varphi _e \ - \ {\varvec{\sigma }}\cdot {\varvec{p}}\,\chi _e } \nonumber \\{} & {} \quad = \; \ {m_{e\mu }\varphi _\mu \ + \ {\mathcal {O}}({\varvec{p}}\phi ),}\nonumber \\{} & {} {{\varvec{\sigma }}\cdot {\varvec{p}}\,\varphi _e \ - \ \left[(1-\phi )E_e+m_e\right]\chi _e}\nonumber \\{} & {} \quad = \; \ {m_{e\mu }\chi _\mu \ + \ {\mathcal {O}}({\varvec{p}}\phi ),} \end{aligned}$$
(18)

where \({\mathcal {O}}(\varvec{p}\phi )\) represents post-Newtonian corrections. If we now want to define the non-relativistic energy \(E_e^{N\!R}\) as before, we have to add and subtract the quantity \(m_e\phi \). In so doing, the above expression can be simplified, thereby yielding up to \({\mathcal {O}}(\phi )\)

$$\begin{aligned} (E^{N\!R}_{e}-m_e\phi ){\varphi }_e \ - \ {\varvec{\sigma }}\cdot {\varvec{p}}\ \! \,{\chi }_e= & {} {(1+\phi )m_{e\mu }\ \!{\varphi }_\mu \ + \ {\mathcal {O}}({\varvec{p}}\phi ),}\nonumber \\ {{\varvec{\sigma }}\cdot {\varvec{p}}\ \!\,{\varphi }_e\ - \ 2m_e{\chi }_e}= & {} \ {m_{e\mu }\ \!{\chi }_\mu \ + \ {\mathcal {O}}({\varvec{p}}\phi ).} \nonumber \\ \end{aligned}$$
(19)
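The step from (18) to (19) can be made explicit as follows (a sketch at first order in \(\phi \), with all terms of order \({\varvec{p}}\phi \) collected in the remainder). Writing \(E_e = E^{N\!R}_e + m_e\), one has

$$\begin{aligned} (1-\phi )E_e - m_e \ = \ (1-\phi )E^{N\!R}_e \ - \ m_e\phi , \end{aligned}$$

and multiplying the first equation in (18) by \((1+\phi )\) gives, up to \({\mathcal {O}}(\phi ^2)\),

$$\begin{aligned} \left( E^{N\!R}_e - m_e\phi \right) \varphi _e \ - \ (1+\phi )\,{\varvec{\sigma }}\cdot {\varvec{p}}\,\chi _e \ = \ (1+\phi )\, m_{e\mu }\varphi _\mu \ + \ {\mathcal {O}}({\varvec{p}}\phi ), \end{aligned}$$

where the term \(\phi \,{\varvec{\sigma }}\cdot {\varvec{p}}\,\chi _e\) is itself of order \({\varvec{p}}\phi \) and can be absorbed into the remainder.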

At this point, we follow exactly the same steps as in the previous paragraphs where the presence of gravity was not considered. In this way, Eq. (13) is modified into (cf. Ref. [18])

$$\begin{aligned} \nonumber E^{N\!R}_{e}\ \!\varphi _e \ {}= & {} \ \left[\frac{\varvec{p}^2}{2m_e(1-\omega )} \ + \ m_e\phi \right] \varphi _e \\{} & {} + \ V\left(\varvec{p}^2,\phi \right)\ \!\varphi _\mu \ + \ {\mathcal {O}}(\varvec{p}\phi ), \end{aligned}$$
(20)

where the details of \(V(\varvec{p}^2,\phi )\) are not relevant for the purpose of the current analysis. The non-relativistic wave equation (20), where the large bispinor component is separated from the small one, represents the leading-order non-relativistic approximation in the Foldy–Wouthuysen procedure. In practice, using the FW procedure, one can systematically proceed to higher orders in (kinetic energy/\(m_e\)) or (kinetic energy \(\times \) potential/\(m_e^2\)); however, this is unnecessary here. In fact, the most important term in (20) is the term representing the Newtonian coupling to the external potential, from which the expression for the gravitational mass \(m_g=m_e\) can be unambiguously identified. Therefore, since \(m_g\) is not redefined in the same way as the inertial mass, we can conclude that \(m_i \ne m_g\) for mixed particles, which implies the violation of the weak form of the equivalence principle. A similar equation also holds when \(e \leftrightarrow \mu \).
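To make the size of the mismatch concrete, the short sketch below evaluates \(m_i\), \(m_g\) and their fractional difference from the flavor-basis mass parameters. The function name and the input values are hypothetical and serve only as an illustration; the formulas are those of Eqs. (8), (13) and (20).

```python
def wep_mismatch(m_e, m_mu, m_emu):
    """Return (omega, m_i, m_g, fractional mismatch) for the electron flavor.

    omega : mixing parameter of Eq. (8)
    m_i   : effective inertial mass m_e*(1 - omega), read off from Eq. (13)
    m_g   : gravitational mass m_e, read off from the m_e*phi term in Eq. (20)
    """
    omega = m_emu**2 / (4.0 * m_e * m_mu)
    m_i = m_e * (1.0 - omega)
    m_g = m_e
    return omega, m_i, m_g, (m_g - m_i) / m_g


# Hypothetical mass parameters in arbitrary units (not measured values).
omega, m_i, m_g, mismatch = wep_mismatch(m_e=1.0, m_mu=4.0, m_emu=1.0)
print(omega, m_i, m_g, mismatch)
```

In this parametrization the fractional difference \((m_g-m_i)/m_g\) coincides with \(\omega \), so it vanishes exactly when the off-diagonal mass \(m_{e\mu }\) does.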

Let us stress in passing that the results of the WEP violation were obtained by considering mixed neutrinos in the flavor basis and with only two generations. This approach is not applicable in the mass basis, because the non-relativistic limit and the mixing transformations are not interchangeable in this case [18].

In the following section, we will observe that the violation of WEP is not exclusive to mixed Dirac fields, but also occurs for mixed scalar fields.

3 Mixed Klein–Gordon particles

3.1 General setup

To keep our discussion as close as possible to the Dirac equation treatment from the previous section, we start from the observation that the Klein–Gordon equation for a free spinless particle can be rewritten in a Schrödinger-like form – the so-called Feshbach–Villars representation, cf. e.g., Refs. [22, 23]. In order to see what is involved, we start with the standard Klein–Gordon equation

$$\begin{aligned} {\left( \Box \ + \ m^2c^2\right) \Psi \ = \ 0,} \end{aligned}$$
(21)

(\(\Box =\partial _t^2-\nabla ^2\) denotes the flat D’Alembertian), and define a two-component wave function

$$\begin{aligned} \Phi \ = \ \left( \begin{array}{c} \zeta \\ \bar{\zeta }\\ \end{array} \right) , \end{aligned}$$
(22)

where the components \(\zeta \) and \(\bar{\zeta }\) are represented as [23]

$$\begin{aligned} \zeta= & {} \frac{1}{\sqrt{2}} \left( \Psi - \frac{1}{imc^2} \frac{\partial \Psi }{\partial t} \right) ,\nonumber \\ \bar{\zeta }= & {} \frac{1}{\sqrt{2}} \left( \Psi + \frac{1}{imc^2} \frac{\partial \Psi }{\partial t} \right) . \end{aligned}$$
(23)

Here, \(\Psi \) is a Klein–Gordon field fulfilling (21). Combining (23) with (21), one can easily check that \(\Phi \) satisfies the parabolic Schrödinger-like equation

$$\begin{aligned} i \partial _t \Phi \ = \ H_{FV}(\hat{{\varvec{p}}}) \Phi , \end{aligned}$$
(24)

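where the Hamiltonian operator is a \(2\times 2\) matrix, which is not Hermitian but pseudo-Hermitian with respect to \(\sigma _3\) (i.e., \(\sigma _3 H_{FV}^\dagger \sigma _3 = H_{FV}\)),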

$$\begin{aligned} H_{FV}(\hat{{\varvec{p}}}) \ = \ \left( \sigma _3 + i \sigma _2 \right) \frac{\hat{{\varvec{p}}}^2}{2m} + \sigma _3 m c^2, \end{aligned}$$
(25)

with \(\hat{{\varvec{p}}} = - i \nabla _{{\varvec{x}}}\). We might note in passing that the charge-conjugated wave function has the form [22]

$$\begin{aligned} \Phi _c \ = \ \sigma _{1} \Phi ^* \ = \ \left( \begin{array}{c} \bar{\zeta }^* \\ \zeta ^*\\ \end{array} \right) . \end{aligned}$$
(26)

The two-component form of the FV wave function thus indicates the existence of both particles and their antiparticles.
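As a quick consistency check of (25), one can verify numerically that the FV Hamiltonian reproduces the relativistic dispersion relation \(E=\pm \sqrt{{\varvec{p}}^2+m^2}\) (with \(c=1\)) and that the lower component of a positive-energy solution is suppressed. The sketch below uses only plain Python; the function names and the numerical values are illustrative assumptions.

```python
import math


def fv_hamiltonian(p2, m):
    """2x2 FV Hamiltonian of Eq. (25) with c = 1, as nested lists.

    Uses sigma_3 + i*sigma_2 = [[1, 1], [-1, -1]] and sigma_3 = diag(1, -1).
    """
    a = p2 / (2.0 * m)
    return [[a + m, a], [-a, -(a + m)]]


def eigen_energies(h):
    """Eigenvalues of a traceless real 2x2 matrix: E = +/- sqrt(-det)."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    e = math.sqrt(-det)
    return e, -e


def small_to_large_ratio(p2, m):
    """Ratio (lower/upper component) of the positive-energy eigenvector.

    Requires p2 > 0. Follows from the first row of (H - E) Phi = 0.
    """
    h = fv_hamiltonian(p2, m)
    e, _ = eigen_energies(h)
    return (e - h[0][0]) / h[0][1]


p2, m = 0.09, 1.0   # illustrative values with |p| << m
e_plus, e_minus = eigen_energies(fv_hamiltonian(p2, m))
print(e_plus, math.sqrt(p2 + m * m))
print(small_to_large_ratio(p2, m), -p2 / (4.0 * m * m))
```

The second printed pair compares the exact component ratio \((m-E)/(m+E)\) with its leading non-relativistic estimate \(-{\varvec{p}}^2/(4m^2)\), which is the scaling that reappears in the bounds on the small components in the mixed case.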

In the context of spin-0 particles, the FV representation is very practical and in many respects superior to the conventional KG formulation [24]. A particular advantage of the parabolic form of the wave equation (24) is that it allows one to pursue a non-relativistic approximation in a systematic way by mimicking the procedure from the previous section. While the subsequent analysis could in principle be carried out within the KG representation, we find it more instructive to work with the FV representation. In particular, the latter allows one to rotate the mass basis to the flavor basis before taking the non-relativistic limit along the same lines we used in Dirac’s case (hence the mass matrix is related to m, not \(m^2\)), and treats both particles and antiparticles along with their coupling to the gravitational field in a unified manner. Further technical details related to the FV representation can be found, e.g., in Refs. [24, 36].

Within the FV representation, the two-flavor mixing of scalar particles can be formulated in a similar manner to that of neutrino mixing. Namely, we first write two decoupled copies of Eq. (24) for different masses \(m_1\) and \(m_2\) (as in the case of the mass basis). We then rotate the ensuing diagonal mass matrix to a flavor basis where the mass matrix also acquires off-diagonal terms, so that

$$\begin{aligned} \left( \begin{array}{cc} m_1 &{} 0 \\ 0 &{} m_2 \\ \end{array} \right) \ \mapsto \ \left( \begin{array}{cc} m_{_\mathrm{{I}}} &{} m_{_\mathrm{{I,II}}} \\ m_{_\mathrm{{I,II}}} &{} m_{_\mathrm{{II}}} \\ \end{array} \right) \ \equiv \ \mathbb {M}, \end{aligned}$$
(27)

and \((\Phi _{1}, \Phi _{2})^{{{T}}} \mapsto (\Phi _{_\mathrm{{I}}}, \Phi _{_\mathrm{{II}}})^{{{T}}}\).

In analogy with the mixing of spin-1/2 particles, we now formally replace in (25) the mass m with the mass matrix \(\mathbb {M}\). This yields

$$\begin{aligned} {H_{FV}(\hat{{\varvec{p}}}) \ = \ \left( \sigma _3 + i \sigma _2 \right) \frac{\hat{{\varvec{p}}}^2}{2}\mathbb {M}^{-1} + \sigma _3 \mathbb {M} c^2.} \end{aligned}$$
(28)

Starting from this form of \(H_{FV}\), we can explicitly write the two FV equations for mixed particles in the form (\(c= 1\))

$$\begin{aligned} i \partial _t \Phi _{_\mathrm{{I}}}= & {} \left( \sigma _3 + i \sigma _2 \right) \frac{\hat{{\varvec{p}}}^2}{2D} \ \! \left( m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}} - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}} \right) \nonumber \\ {}{} & {} + \ \sigma _3 \ \! \left( m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}} + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}} \right) , \end{aligned}$$
(29)
$$\begin{aligned} i \partial _t \Phi _{_\mathrm{{II}}}= & {} \left( \sigma _3 + i \sigma _2 \right) \frac{\hat{{\varvec{p}}}^2}{2D} \ \! \left( m_{_\mathrm{{I}}}\Phi _{_\mathrm{{II}}} - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{I}}} \right) \nonumber \\{} & {} + \ \sigma _3 \ \! \left( m_{_\mathrm{{II}}}\Phi _{_\mathrm{{II}}} + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{I}}} \right) , \end{aligned}$$
(30)

where D represents the (flavor basis) mass matrix determinant

$$\begin{aligned} D \ = \ m_{_\mathrm{{I}}}m_{_\mathrm{{II}}} - m_{_\mathrm{{I,II}}}^2 \ \equiv \ m_{_\mathrm{{I}}}m_{_\mathrm{{II}}} (1- \bar{\omega }). \end{aligned}$$
(31)

Here, \(\bar{\omega } = m_{_\mathrm{{I,II}}}^2/(m_{_\mathrm{{I}}}m_{_\mathrm{{II}}})\) is an analogue of \(\omega \) from Eq. (8).
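The passage from (27) to (29)–(31) can be illustrated numerically. The sketch below assumes the standard two-flavor rotation by a mixing angle \(\theta \) (a parametrization not spelled out above, introduced here only for illustration) and checks that the flavor-basis determinant equals \(m_1 m_2\) and that the matrix with entries \(m_{_\mathrm{{II}}}/D\), \(-m_{_\mathrm{{I,II}}}/D\), \(m_{_\mathrm{{I}}}/D\) used in (29)–(30) is indeed \(\mathbb {M}^{-1}\). All names and values are hypothetical.

```python
import math


def flavor_mass_matrix(m1, m2, theta):
    """Rotate diag(m1, m2) by the angle theta (assumed parametrization).

    Returns (m_I, m_II, m_I_II), the entries of the matrix M in Eq. (27).
    """
    c, s = math.cos(theta), math.sin(theta)
    m_I = m1 * c * c + m2 * s * s
    m_II = m1 * s * s + m2 * c * c
    m_I_II = (m2 - m1) * s * c
    return m_I, m_II, m_I_II


def inverse_mass_matrix(m_I, m_II, m_I_II):
    """Return (M^{-1}, D), with D the determinant of Eq. (31)."""
    D = m_I * m_II - m_I_II**2
    inv = [[m_II / D, -m_I_II / D], [-m_I_II / D, m_I / D]]
    return inv, D


m1, m2, theta = 1.0, 2.0, 0.3   # illustrative values
m_I, m_II, m_I_II = flavor_mass_matrix(m1, m2, theta)
inv, D = inverse_mass_matrix(m_I, m_II, m_I_II)
omega_bar = m_I_II**2 / (m_I * m_II)
print(D, m1 * m2)
print(omega_bar, 1.0 - D / (m_I * m_II))
```

Because the determinant is invariant under the rotation, \(D=m_1m_2\) stays positive for positive masses, which in turn guarantees \(\bar{\omega }<1\).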

3.2 Non-relativistic limit

Let us now focus on Eq. (29), since the following reasoning for \(\Phi _{_\mathrm{{I}}}\) can be easily repeated also for \(\Phi _{_\mathrm{{II}}}\) via the exchange of subscripts \(\mathrm{{I}} \leftrightarrow \mathrm{{II}}\). In momentum representation, the positive-energy wave functions satisfy the algebraic equations

$$\begin{aligned} E_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L= & {} \frac{{{\varvec{p}}}^2}{2D}\left( m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^L + m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^S - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^L - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^S \right) \nonumber \\{} & {} + \ m_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L + m_{_\mathrm{{I,II}}} \Phi _{_\mathrm{{II}}}^L, \end{aligned}$$
(32)
$$\begin{aligned} E_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^S= & {} \frac{{{\varvec{p}}}^2}{2D}\left( m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^L + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^S - m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^L - m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^S \right) \nonumber \\{} & {} - \ m_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^S - m_{_\mathrm{{I,II}}} \Phi _{_\mathrm{{II}}}^S. \end{aligned}$$
(33)

The superscripts L and S denote the “large” (upper) and “small” (lower) components of the FV wave functions, respectively [22, 23]. The non-relativistic limit of (32)–(33) is now obtained along the same lines as in the Dirac case, namely

$$\begin{aligned} E^{N\!R}_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L= & {} \frac{{{\varvec{p}}}^2}{2D}\left[ m_{_\mathrm{{II}}}(\Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S) - m_{_\mathrm{{I,II}}}(\Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S) \right] \nonumber \\{} & {} + \ m_{_\mathrm{{I,II}}} \Phi _{_\mathrm{{II}}}^L, \end{aligned}$$
(34)

and

$$\begin{aligned} \Phi _{_\mathrm{{I}}}^S= & {} \frac{{{\varvec{p}}}^2}{4Dm_{_\mathrm{{I}}} }\left[ m_{_\mathrm{{I,II}}}(\Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S) - m_{_\mathrm{{II}}}(\Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S) \right] \nonumber \\{} & {} - \ \frac{m_{_\mathrm{{I,II}}}}{2m_{_\mathrm{{I}}}} \ \!\Phi _{_\mathrm{{II}}}^S, \end{aligned}$$
(35)

where \(E^{N\!R}_{_\mathrm{{I}}} \equiv E_{_\mathrm{{I}}} - m_{_\mathrm{{I}}}\) and \(E_{_\mathrm{{I}}} + m_{_\mathrm{{I}}} \approx 2m_{_\mathrm{{I}}}\). As in the neutrino case, one can now use the non-relativistic relation for \(\Phi _{_\mathrm{{II}}}^S\), which reads, cf. (35)

$$\begin{aligned} \Phi _{_\mathrm{{II}}}^S= & {} \frac{{{\varvec{p}}}^2}{4Dm_{_\mathrm{{II}}} }\left[ m_{_\mathrm{{I,II}}}(\Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S) - m_{_\mathrm{{I}}}(\Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S) \right] \nonumber \\{} & {} - \ \frac{m_{_\mathrm{{I,II}}}}{2m_{_\mathrm{{II}}}} \ \!\Phi _{_\mathrm{{I}}}^S, \end{aligned}$$
(36)

and from (35)–(36) resolve \(\Phi _{_\mathrm{{I}}}^S\) and \(\Phi _{_\mathrm{{II}}}^S\) in terms of \(\Phi _{_\mathrm{{I}}}^L\) and \(\Phi _{_\mathrm{{II}}}^L\). On the one hand, by assuming that \(m_{_\mathrm{{I}}}\le m_{_\mathrm{{II}}}\) and \(m_{_\mathrm{{I}}} \gg |{\varvec{p}}|\), we can write for \(|\Phi _{_\mathrm{{I}}}^S|\) up to the order \({\mathcal {O}}(\bar{\omega })\)

$$\begin{aligned} |\Phi _{_\mathrm{{I}}}^S|\le & {} \frac{{{\varvec{p}}}^2}{4} \left( \frac{|\Phi _{_\mathrm{{I}}}^L|}{m_{_\mathrm{{I}}}^2} \ + \ \frac{\sqrt{\bar{\omega }} \ \!(m_{_\mathrm{{I}}}+2m_{_\mathrm{{II}}})}{2 (m_{_\mathrm{{I}}}m_{_\mathrm{{II}}})^{3/2}} \ \! |\Phi _{_\mathrm{{II}}}^L|\right) \nonumber \\\le & {} \frac{{{\varvec{p}}}^2}{4m_{_\mathrm{{I}}}^2}\left( |\Phi _{_\mathrm{{I}}}^L| \ + \ \frac{3}{2}\sqrt{\bar{\omega }} \ \! |\Phi _{_\mathrm{{II}}}^L| \right) . \end{aligned}$$
(37)

Similarly, for \(|\Phi _{_\mathrm{{II}}}^S|\) we get

$$\begin{aligned} |\Phi _{_\mathrm{{II}}}^S| \ \le \ \frac{{{\varvec{p}}}^2}{4m_{_\mathrm{{I}}}^2}\left( |\Phi _{_\mathrm{{II}}}^L| \ + \ \frac{3}{2}\sqrt{\bar{\omega }} \ \! |\Phi _{_\mathrm{{I}}}^L| \right) . \end{aligned}$$
(38)

The results of (37) and (38) demonstrate that the small components \(\Phi _{_\mathrm{{I}}}^S\) and \(\Phi _{_\mathrm{{II}}}^S\) still remain much smaller than \(\Phi _{_\mathrm{{I}}}^L\) and \(\Phi _{_\mathrm{{II}}}^L\), respectively, even after having introduced particle mixing.

On the other hand, by inserting the solutions for \(\Phi _{_\mathrm{{I}}}^S\) and \(\Phi _{_\mathrm{{II}}}^S\) back into (34), we obtain after some algebra that

$$\begin{aligned} E^{N\!R}_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L= & {} \bar{A}(\mathbb {M})\ \! \frac{{\varvec{p}}^2}{2 m_{_\mathrm{{I}}}}\ \! \Phi _{_\mathrm{{I}}}^L \ + \ \bar{B}(\mathbb {M})\ \!\Phi _{_\mathrm{{II}}}^L, \end{aligned}$$
(39)

where

$$\begin{aligned}{} & {} \bar{A}(\mathbb {M}) \ = \ \frac{4 m_{_\mathrm{{I}}}^2 (4 m_{_\mathrm{{II}}}^2 \ + \ {\varvec{p}}^2) \ - \ 4 m_{_\mathrm{{I}}}m_{_\mathrm{{II}}}m_{_\mathrm{{I,II}}}^2 }{4D(4m_{_\mathrm{{I}}}m_{_\mathrm{{II}}} - m_{_\mathrm{{I,II}}}^2) \ + \ 4(m_{_\mathrm{{I}}}^2 + m_{_\mathrm{{II}}}^2 + m_{_\mathrm{{I,II}}}^2){\varvec{p}}^2 \ + \ {\varvec{p}}^4}, \nonumber \\ \end{aligned}$$
(40)
$$\begin{aligned}{} & {} \bar{B}(\mathbb {M})\ = \ m_{_\mathrm{{I,II}}} \ - \ \frac{m_{_\mathrm{{I,II}}}\left( 8m_{_\mathrm{{I}}}m_{_\mathrm{{II}}} - 2 m_{_\mathrm{{I,II}}}^2 - {\varvec{p}}^2 \right) }{2m_{_\mathrm{{I}}} \left( 4 m_{_\mathrm{{II}}}^2 + {\varvec{p}}^2\right) - 2 m_{_\mathrm{{II}}} m_{_\mathrm{{I,II}}}^2 }\ \! \bar{A}(\mathbb {M}) \ \! \frac{{\varvec{p}}^2}{2m_{_\mathrm{{I}}}}. \end{aligned}$$
(41)

By employing the non-relativistic assumption \(m_{_\mathrm{{I}}}, m_{_\mathrm{{II}}} \gg |{\varvec{p}}|\), Eq. (39) reduces to

$$\begin{aligned} E^{N\!R}_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L \ {}= & {} \ \frac{{\varvec{p}}^2}{2m_{_\mathrm{{I}}} (1-\bar{\omega })} \ \! \Phi _{_\mathrm{{I}}}^L \ + \ \left\{ m_{_\mathrm{{I,II}}}\left[ 1 - \frac{{\varvec{p}}^2}{2m_{_\mathrm{{I}}} m_{_\mathrm{{II}}} (1-\bar{\omega })}\right] \right\} \Phi _{_\mathrm{{II}}}^L \nonumber \\ {}= & {} \ \frac{{\varvec{p}}^2}{2m_{i,{_\mathrm{{I}}}}} \ \! \Phi _{_\mathrm{{I}}}^L \ + \ \left\{ m_{_\mathrm{{I,II}}}\left[ 1 - \frac{{\varvec{p}}^2}{2m_{_\mathrm{{I}}}m_{i,{_\mathrm{{II}}}} } \right] \right\} \Phi _{{_\mathrm{{II}}}}^L , \end{aligned}$$
(42)

where we have defined the effective inertial mass \(m_{i,{_\mathrm{{I}}}} = m_{_\mathrm{{I}}}\left( 1-\bar{\omega }\right) \) (and similarly for \(m_{i,{_\mathrm{{II}}}}\)). An analogous equation holds for \(I\leftrightarrow II\). This outcome should be compared with expression (13) for flavor neutrinos. The extra factor of 2 in (13) (respectively, 4 in \(\omega \)) is a consequence of the way the factor appears in the kinetic versus the mixing term in the non-relativistic Eqs. (5) and (34)–(35), and thus reflects the different spin content. Such a spin-dependent behavior of the effective mass can also be observed for higher-spin particle states described via Bargmann–Wigner equations.
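The reduction from (39)–(41) to (42) can be checked with a short computation. The Python sketch below is only an illustration under stated assumptions: it takes \(D\) to be the determinant of the flavor mass matrix, \(D = m_{_\mathrm{{I}}}m_{_\mathrm{{II}}} - m_{_\mathrm{{I,II}}}^2\), and \(\bar{\omega } = m_{_\mathrm{{I,II}}}^2/(m_{_\mathrm{{I}}}m_{_\mathrm{{II}}})\), both of which are defined outside this section. The function names and the sample masses are hypothetical. Exact rational arithmetic is used so that the \({\varvec{p}}\rightarrow 0\) value of \(\bar{A}(\mathbb {M})\) can be compared with \(1/(1-\bar{\omega })\) without rounding.

```python
from fractions import Fraction as F


def det_M(mI, mII, mX):
    # Assumption: D is the determinant of the 2x2 flavor mass matrix.
    return mI * mII - mX**2


def omega_bar(mI, mII, mX):
    # Assumption: omega-bar = m_{I,II}^2 / (m_I m_II).
    return mX**2 / (mI * mII)


def A_bar(mI, mII, mX, p2):
    # Eq. (40), with p2 standing for |p|^2.
    num = 4 * mI**2 * (4 * mII**2 + p2) - 4 * mI * mII * mX**2
    den = (4 * det_M(mI, mII, mX) * (4 * mI * mII - mX**2)
           + 4 * (mI**2 + mII**2 + mX**2) * p2 + p2**2)
    return num / den


def B_bar(mI, mII, mX, p2):
    # Eq. (41).
    ratio = (mX * (8 * mI * mII - 2 * mX**2 - p2)
             / (2 * mI * (4 * mII**2 + p2) - 2 * mII * mX**2))
    return mX - ratio * A_bar(mI, mII, mX, p2) * p2 / (2 * mI)


def B_limit(mI, mII, mX, p2):
    # Mixing coefficient in the first line of Eq. (42).
    return mX * (1 - p2 / (2 * mI * mII * (1 - omega_bar(mI, mII, mX))))


if __name__ == "__main__":
    mI, mII, mX = F(1), F(2), F(1, 2)  # illustrative values only
    print("A_bar at p = 0      :", A_bar(mI, mII, mX, F(0)))
    print("1 / (1 - omega_bar) :", 1 / (1 - omega_bar(mI, mII, mX)))
    p2 = F(1, 10**6)
    print("B_bar - B_limit     :",
          float(B_bar(mI, mII, mX, p2) - B_limit(mI, mII, mX, p2)))
```

Under these assumptions the two printed values of \(\bar{A}\) coincide, and the difference between \(\bar{B}(\mathbb {M})\) and its limiting form is of second order in \({\varvec{p}}^2\).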

3.3 Non-relativistic limit in presence of gravitational field

Let us now focus on what happens when we switch on a gravitational potential. It is not a priori evident whether the effective inertial masses \(m_{i,{_\mathrm{{I}}}}\) and \(m_{i,{_\mathrm{{II}}}}\) will also couple to the gravitational potential. To explore this issue, we will employ the minimal coupling to gravity and restrict our analysis to the weak-field metric (14).

When mixing is absent, one can show [29] that the form of the Klein–Gordon equation in the Feshbach–Villars representation (24) gets modified in the following way:

$$\begin{aligned} H_{FV}(\hat{{\varvec{p}}})= & {} \left( \sigma _3 + i \sigma _2 \right) \left[ \left( 1+4\phi \right) \frac{\hat{{\varvec{p}}}^2}{2m} \ + \ m\phi \right] \nonumber \\ {}{} & {} +\ \sigma _3 m . \end{aligned}$$
(43)

This Hamiltonian can be easily deduced from the Klein–Gordon equation (21) by recalling that, in the presence of an underlying curved background, the action of the D’Alembertian operator on a scalar quantity \(\Psi \) is given by [28]

$$\begin{aligned} {\Box \,\Psi \ = \ \frac{1}{\sqrt{-g}}\partial _{\mu }\left( \sqrt{-g}g^{\mu \nu }\partial _{\nu }\Psi \right) .} \end{aligned}$$
(44)

Thus, by following the same steps that led to the introduction of \(\zeta \) and \(\bar{\zeta }\) in (23) and resorting to the weak-field metric tensor that can be deduced from the line element (14), one can show that the new Hamiltonian which allows for a rephrasing of the Klein–Gordon equation in a Schrödinger-like form is precisely the one reported in Eq. (43).

Bearing this in mind, when we rotate in \(H_{FV}\) from the diagonal mass matrix (with masses \(m_1\) and \(m_2\)) to the flavor mass matrix \(\mathbb {M}\), Eq. (29) becomes

$$\begin{aligned} i \partial _t \Phi _{_\mathrm{{I}}}= & {} \left( \sigma _3 + i \sigma _2 \right) \Bigl [\left( 1+4\phi \right) \frac{\hat{{\varvec{p}}}^2}{2D} \ \! \left( m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}} - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}} \right) \nonumber \\{} & {} + \left( m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}} + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}\right) \phi \Bigr ] \ + \ \sigma _3 \ \! \left( m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}} + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}} \right) , \nonumber \\ \end{aligned}$$
(45)

and similarly for \(\Phi _{_\mathrm{{II}}}\).

Following the analysis of the previous section, we pursue our argument in momentum representation, in which the large and small components of the particle \(\textrm{I}\) satisfy the following equations:

$$\begin{aligned} E_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L= & {} \left( 1+4\phi \right) \frac{{{\varvec{p}}}^2}{2D}\left( m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^L + m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^S - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^L - m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^S \right) \nonumber \\{} & {} \qquad + \left( m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}}^L + m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}}^S + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^L + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^S \right) \phi \nonumber \\{} & {} \qquad + m_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L + m_{_\mathrm{{I,II}}} \Phi _{_\mathrm{{II}}}^L, \end{aligned}$$
(46)
$$\begin{aligned} E_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^S= & {} \left( 1+4\phi \right) \frac{{{\varvec{p}}}^2}{2D}\left( m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^L + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^S - m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^L - m_{_\mathrm{{II}}}\Phi _{_\mathrm{{I}}}^S \right) \nonumber \\{} & {} \qquad - \left( m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}}^L + m_{_\mathrm{{I}}}\Phi _{_\mathrm{{I}}}^S + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^L + m_{_\mathrm{{I,II}}}\Phi _{_\mathrm{{II}}}^S \right) \phi \nonumber \\{} & {} \qquad - m_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^S - m_{_\mathrm{{I,II}}} \Phi _{_\mathrm{{II}}}^S. \end{aligned}$$
(47)

We obtain the non-relativistic limit by setting \(E^{N\!R}_{_\mathrm{{I}}} \equiv E_{_\mathrm{{I}}} - m_{_\mathrm{{I}}}\) and assuming that \(E_{_\mathrm{{I}}} + m_{_\mathrm{{I}}} \approx 2m_{_\mathrm{{I}}}\). With this, we can write

$$\begin{aligned} E^{N\!R}_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L= & {} \ \left( 1+4\phi \right) \frac{{{\varvec{p}}}^2}{2D}\left[ m_{_\mathrm{{II}}}\left( \Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S\right) - m_{_\mathrm{{I,II}}}\left( \Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S\right) \right] \nonumber \\{} & {} + \ \left[ m_{_\mathrm{{I}}}\left( \Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S\right) + m_{_\mathrm{{I,II}}}\left( \Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S\right) \right] \phi \nonumber \\{} & {} + \ m_{_\mathrm{{I,II}}} \Phi _{_\mathrm{{II}}}^L, \end{aligned}$$
(48)

and

$$\begin{aligned} \Phi _{_\mathrm{{I}}}^S= & {} \ \left( 1+4\phi \right) \frac{{{\varvec{p}}}^2}{4Dm_{_\mathrm{{I}}} }\left[ m_{_\mathrm{{I,II}}}(\Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S) - m_{_\mathrm{{II}}}(\Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S) \right] \nonumber \\{} & {} - \left[ m_{_\mathrm{{I}}}\left( \Phi _{_\mathrm{{I}}}^L + \Phi _{_\mathrm{{I}}}^S\right) + m_{_\mathrm{{I,II}}}\left( \Phi _{_\mathrm{{II}}}^L + \Phi _{_\mathrm{{II}}}^S\right) \right] \frac{\phi }{2m_{_\mathrm{{I}}}}\nonumber \\{} & {} - \ \ \frac{m_{_\mathrm{{I,II}}}}{2m_{_\mathrm{{I}}}} \ \!\Phi _{_\mathrm{{II}}}^S. \end{aligned}$$
(49)

In analogy with the previous section, we now express \(\Phi _{_\mathrm{{I}}}^S\) and \(\Phi _{_\mathrm{{II}}}^S\) in terms of \(\Phi _{_\mathrm{{I}}}^L\) and \(\Phi _{_\mathrm{{II}}}^L\). This allows us to cast Eq. (48) for \(\Phi _{_\mathrm{{I}}}^L\) (and the analogous equation for \(\Phi _{_\mathrm{{II}}}^L\)) in terms of large components only. If post-Newtonian corrections of the order \({\mathcal {O}}(\varvec{p}\phi )\) are neglected (as their explicit form is irrelevant for the identification of \(m_i\) and \(m_g\)), we obtain after some simple algebra the non-relativistic, Schrödinger-like equation for \(\Phi _{_\mathrm{{I}}}^L\) in the regime \(m_{_\mathrm{{I}}}, m_{_\mathrm{{II}}}\gg |{\varvec{p}}|\), which turns out to be

$$\begin{aligned} \nonumber E^{N\!R}_{_\mathrm{{I}}} \Phi _{_\mathrm{{I}}}^L= & {} \left( \frac{{\varvec{p}}^2}{2m_{i,{_\mathrm{{I}}}} } + m_{_\mathrm{{I}}}\phi \right) \Phi _{_\mathrm{{I}}}^L\\{} & {} + \left\{ m_{_\mathrm{{I,II}}}\left[ 1 + \phi - \frac{{\varvec{p}}^2}{2m_{_\mathrm{{I}}}m_{i,{_\mathrm{{II}}}} } \right] \right\} \Phi _{_\mathrm{{II}}}^L. \end{aligned}$$
(50)

From this, we can immediately deduce that, whilst the inertial mass is still represented by the effective quantity \(m_{i,{_\mathrm{{I}}}}\), the gravitational mass \(m_{g,{_\mathrm{{I}}}} \) must be identified with \(m_{_\mathrm{{I}}}\), and similarly for \(I\leftrightarrow II\). This implies a violation of WEP, which is completely analogous to the WEP violation already encountered in the case of neutrino mixing.
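To get a feeling for the size of the mismatch, one can evaluate the ratio \(m_{g,{_\mathrm{{I}}}}/m_{i,{_\mathrm{{I}}}}\) for given entries of the mass matrix. The Python sketch below is a minimal illustration that assumes \(\bar{\omega } = m_{_\mathrm{{I,II}}}^2/(m_{_\mathrm{{I}}}m_{_\mathrm{{II}}})\), a definition given outside this section. The function names and the numerical inputs are hypothetical and carry no phenomenological meaning.

```python
def inertial_mass(m_self, m_other, m_mix):
    # m_i = m (1 - omega_bar), assuming omega_bar = m_mix^2 / (m_self * m_other).
    omega_bar = m_mix**2 / (m_self * m_other)
    return m_self * (1.0 - omega_bar)


def gravitational_mass(m_self):
    # Eq. (50): the potential phi couples to the unmodified flavor mass.
    return m_self


def wep_ratio(m_self, m_other, m_mix):
    # m_g / m_i; equals 1 only when the mixing term vanishes.
    return gravitational_mass(m_self) / inertial_mass(m_self, m_other, m_mix)


if __name__ == "__main__":
    # Illustrative values in arbitrary mass units.
    for m_mix in (0.0, 0.1, 0.5):
        print(m_mix, wep_ratio(1.0, 2.0, m_mix))
```

In this sketch the ratio deviates from unity by \(\bar{\omega }/(1-\bar{\omega })\), so the deviation is controlled entirely by the mixing term.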

In passing, we note that we employed the FV representation because we wished, as in Sect. 2, to rotate the mass basis to the flavor basis before taking the non-relativistic limit, as the two operations do not commute according to Ref. [18]. With this ordering, the ensuing non-relativistic limit turned out to be more straightforward to carry out in the FV representation than in the KG one.

4 A closer look at the non-relativistic limit of mixed particles: inadequacy of Bargmann’s superselection rule

So far, we have shown that, in order to render WEP violation explicit in the context of flavor mixing, one has to start from a fully relativistic theory and take a non-relativistic limit to observe the discrepancy between \(m_i\) and \(m_g\), regardless of the nature of the particle considered (i.e., spin-0 boson or spin-1/2 fermion). In principle, if it were possible to perform the analysis directly with the full-fledged non-relativistic wave equation, we would not need to distinguish between fields possessing different spins (and thus to carry out distinct investigations), as we could take advantage of the unifying framework of Schrödinger’s equation. Unfortunately, in such a physical setting flavor mixing cannot be consistently described because of a superselection rule (SSR) known as Bargmann’s superselection rule.

Bargmann’s SSR arises as a consequence of demanding Galilean covariance for the Schrödinger equation. This, in turn, implies that superpositions of states with different masses are forbidden [31] and thus one cannot coherently describe unstable or oscillating particles at non-relativistic energies [37,38,39,40,41]. So, in order to keep our presentation self-consistent, we should demonstrate that Bargmann’s SSR is not applicable in the present case.

While the impossibility of oscillating particles can be easily deduced in the context of Galileo transformations, it is in conflict with the principles of relativistic quantum theory. Indeed, the description of systems where superpositions of states with different mass occur can be carried out without difficulty within relativistic quantum mechanics and quantum field theory, and the significant number of mixed particles observed in high-energy particle physics is evidence of the consistency of such treatment. It is thereby unclear why they should cease to oscillate in the non-relativistic limit.

We propose that this issue can be addressed by recognizing that non-relativistic quantum mechanics can be influenced by relativistic effects, which are not visible or even forbidden when Galileo covariance is strictly enforced. The appearance of such effects is often proclaimed as non-physical and banished from the general framework via superselection rules. A particular example of the latter is represented by measurable phase shifts in particle mixing. For the sake of simplicity, we will employ mixed neutrinos, though our argument can be adapted also to scalar particles with minor adjustments. We start by considering plane-wave solutions of Eq. (4), which in position representation acquires the form

$$\begin{aligned} \left(i\partial _0-m_e\right)\varphi _e \ + \ i{\varvec{\sigma }}\cdot {\varvec{\nabla }}\chi _e= & {} m_{e\mu }\varphi _\mu ,\nonumber \\ -i{\varvec{\sigma }}\cdot {\varvec{\nabla }}\varphi _e \ - \ \left(i\partial _0+m_e\right)\chi _e= & {} m_{e\mu }\chi _\mu . \end{aligned}$$
(51)

By realizing that, for an observer moving with the particle, the plane-wave phase is

$$\begin{aligned} k^{\mu }x_{\mu } \ = \ \omega {t} - \textbf{k}\cdot \textbf{x} \ = \ mc^{2}\tau /\hbar , \end{aligned}$$
(52)

where \(\tau \) is the observer’s proper time (for future convenience, we have reinstated c and \(\hbar \)), we can write for the positive-energy plane waves

$$\begin{aligned} \psi _e= & {} \cos \theta e^{-im_1c^2\tau _1/\hbar } u_1(k) + \sin \theta e^{-im_2c^2\tau _1/\hbar } u_2(k) \nonumber \\= & {} e^{-im_1c^2\tau _1/\hbar } [\cos \theta \ \!u_1(k) + \sin \theta \ \! \tilde{u}_2(k)], \end{aligned}$$
(53)

with \(\tilde{u}_2 = e^{i(m_1-m_2)c^2\tau _1/\hbar }u_2\). Here, \(m_1\) and \(m_2\) are masses in the mass basis and \(\theta \) is the mixing angle. The flavor masses and mixing term \(m_e\), \(m_\mu \), \(m_{e\mu }\) are related to \(m_1\) and \(m_2\) through [32]

$$\begin{aligned} m_e= & {} m_1\,\textrm{cos}^2\theta +m_2\,\textrm{sin}^2\theta ,\nonumber \\ m_\mu= & {} m_1\,\textrm{sin}^2\theta +m_2\,\textrm{cos}^2\theta ,\nonumber \\ m_{e\mu }= & {} \left(m_2-m_1\right)\textrm{sin}\theta \ \!\textrm{cos}\theta . \end{aligned}$$
(54)
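Relations (54) state that the flavor mass matrix is the rotation of \(\textrm{diag}(m_1, m_2)\) by the mixing angle \(\theta \). As a quick consistency check, the Python sketch below builds \(m_e\), \(m_\mu \) and \(m_{e\mu }\) from (54) and recovers \(m_1\) and \(m_2\) as the eigenvalues of the symmetric matrix with diagonal entries \(m_e\), \(m_\mu \) and off-diagonal entry \(m_{e\mu }\). The function names and the sample numbers are hypothetical and serve only as an illustration.

```python
import math


def flavor_masses(m1, m2, theta):
    # Eq. (54): entries of the mass matrix in the flavor basis.
    c, s = math.cos(theta), math.sin(theta)
    m_e = m1 * c**2 + m2 * s**2
    m_mu = m1 * s**2 + m2 * c**2
    m_emu = (m2 - m1) * s * c
    return m_e, m_mu, m_emu


def mass_eigenvalues(m_e, m_mu, m_emu):
    # Closed-form eigenvalues of the symmetric matrix [[m_e, m_emu], [m_emu, m_mu]].
    mean = 0.5 * (m_e + m_mu)
    half_gap = math.hypot(0.5 * (m_e - m_mu), m_emu)
    return mean - half_gap, mean + half_gap


if __name__ == "__main__":
    m_e, m_mu, m_emu = flavor_masses(1.0, 3.0, 0.6)  # illustrative values
    print("flavor entries  :", m_e, m_mu, m_emu)
    print("eigenvalues     :", mass_eigenvalues(m_e, m_mu, m_emu))
    print("trace, det      :", m_e + m_mu, m_e * m_mu - m_emu**2)
```

The trace and determinant printed at the end equal \(m_1 + m_2\) and \(m_1 m_2\) up to rounding, as expected for a rotation.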

Let us now apply to \(\psi _e\) a sequence of transformations à la Bargmann [31], but instead of Galilean boosts we use Lorentz boosts. We start from the original system S and then perform four transformations [41]

$$\begin{aligned}{} & {} \hbox {Translation by}\,a\hbox { from } S\hbox { to }S_I:\nonumber \\{} & {} \quad ~ x \rightarrow x + a = x_I,\nonumber \\{} & {} \quad ~ x_0 = x_{0,I}. \nonumber \\{} & {} \hbox {Boost by}\, v\hbox { from } S_I\hbox { to }S_{II}: \nonumber \\{} & {} \quad ~ x_{II} = \gamma (x_I - \beta x_{0,I}), \nonumber \\{} & {} \quad ~ x_{0,II} = \gamma (x_{0,I} - \beta x_I).\nonumber \\{} & {} \hbox {Translation by}\,-a\hbox { from } S_{II}\hbox { to }S_{III}:\nonumber \\{} & {} \quad ~ x_{II} \rightarrow x_{II} - a/\gamma = x_{III}, \nonumber \\{} & {} \quad ~ x_{0,II} = x_{0,III}. \nonumber \\{} & {} {\hbox {Boost by}}\,-v\hbox { from } S_{III}\hbox { to }S_{IV}: \nonumber \\{} & {} \quad ~ x = x_{IV} = \gamma (x_{III} + \beta x_{0,III}), \nonumber \\{} & {} \quad ~ x_{0,IV} = \gamma (x_{0,III} + \beta x_{III})= x_{0} - \beta a. \end{aligned}$$
(55)

Here, \(\beta = v/c\) and \(\gamma = (1- \beta ^2)^{-1/2}\). After the sequence of transformations \(S \rightarrow S_I\rightarrow S_{II} \rightarrow S_{III} \rightarrow S_{IV}\) we end up at the original point \(\textbf{x}\) but at the Lorentz-shifted time \(t_{IV} \ne t\). In the version with Galileo boosts, we would have \(t=t_{IV}\), and so we would end up in the frame \(S_{IV} =S\). From the point of view of the observer who has undergone the above sequence of transformations, the mixed state (53) reads

$$\begin{aligned} \psi _e'= & {} \cos \theta e^{-im_1c^2\tau _2/\hbar } u_1 + \sin \theta e^{-im_2c^2\tau _2/\hbar } u_2 \nonumber \\= & {} \ e^{-im_1c^2\tau _2/\hbar } [\cos \theta \ \!u_1 + \sin \theta \ \! e^{-i\Delta mc^2 \Delta \tau /\hbar } \tilde{u}_2], \end{aligned}$$
(56)

where \(\Delta m = m_1 - m_2\) and \(\Delta \tau = \tau _1 - \tau _2\) is the difference between the proper times of the two observers. Note that the momentum bispinors \(u_1\) and \(u_2\) are not affected by the combined transformation, as the net effect of the sequence of transformations is reflected only in the phase parts.

Due to the lack of simultaneity between the two observers (twin paradox), the two states \(\psi _e\) and \(\psi _e'\) must be different (i.e., they are not members of the same projective ray in the Hilbert space). The appearance of the extra relative phase factor \(e^{-i\Delta mc^2 \Delta \tau /\hbar }\) is thus not surprising as \(\Delta \tau \ne 0\).

In the Galilean framework, the analogous situation looks different. The sequence of the four transformations gives an identity operation and the presence of a non-relativistic analogue of the above relative phase is inconsistent with the fact that the state \(\psi _e\) should coincide with the state \(\psi _e'\) (they lie on the same ray). In fact, such a relative phase does not have any meaning and is forced to be 1 by proclaiming that the only logically consistent situation is \(m_1 = m_2\) (i.e., Bargmann’s SSR), implying that no neutrino mixing can take place non-relativistically.
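The contrast between the two frameworks can be made concrete by composing the four transformations in (55) numerically. The Python sketch below is a minimal illustration in one spatial dimension with \(x_0 = ct\). The function names and sample numbers are hypothetical, and the Galilean version assumes that the second translation is by \(-a\) (the \(\gamma \rightarrow 1\) limit of \(-a/\gamma \)).

```python
import math


def lorentz_loop(x, x0, a, beta):
    # Sequence (55): translation, boost, translation back, inverse boost.
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    x1, t1 = x + a, x0                                             # S -> S_I
    x2, t2 = gamma * (x1 - beta * t1), gamma * (t1 - beta * x1)    # S_I -> S_II
    x3, t3 = x2 - a / gamma, t2                                    # S_II -> S_III
    x4, t4 = gamma * (x3 + beta * t3), gamma * (t3 + beta * x3)    # S_III -> S_IV
    return x4, t4


def galilean_loop(x, t, a, v):
    # Same sequence with Galilean boosts; time is never transformed.
    x1 = x + a
    x2 = x1 - v * t
    x3 = x2 - a
    x4 = x3 + v * t
    return x4, t


if __name__ == "__main__":
    x, x0, a, beta = 2.0, 5.0, 1.5, 0.3  # illustrative values
    print("Lorentz :", lorentz_loop(x, x0, a, beta), " expected shift:", -beta * a)
    print("Galilean:", galilean_loop(x, x0, a, beta))
```

In the Lorentz case the spatial coordinate returns to \(x\) while the time coordinate is shifted by \(-\beta a\), whereas the Galilean loop returns both coordinates unchanged.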

Let us note that the phase factor \(e^{-i\Delta mc^2 \Delta \tau /\hbar }\) will not disappear in the non-relativistic limit, but it will leave an imprint that is independent of c. Indeed

$$\begin{aligned} \Delta \tau= & {} \tau _1 - \tau _2 \ = \ t - \int _{t_0}^{t_{0,IV}} \sqrt{1 - \frac{v^{2}(t)}{c^2}} \ \! dt \nonumber \\= & {} t - \int _{t_{0,I}}^{t_{0,II}} \sqrt{1 - \frac{v^{2}}{c^2}} \ \! dt \ + \ \int _{t_{0,III}}^{t_{0,IV}} \sqrt{1 - \frac{v^{2}}{c^2}}\ \! dt \nonumber \\= & {} t - \sqrt{1 - \frac{v^{2}}{c^2}} \ \!t \ \ {\mathop {\rightarrow }\limits ^{NR}} \ \ \frac{v^2}{2 c^2} t_{NR} \ = \ \frac{v a}{ c^2}, \end{aligned}$$
(57)

where \(ct = x_{0,IV} - x_{0}\) and \(t_{NR} = [(a + a/\gamma )/v]_{NR} = 2a/v\). Hence, we obtain that

$$\begin{aligned} e^{-i\Delta mc^2 \Delta \tau /\hbar } \ \ {\mathop {\rightarrow }\limits ^{NR}} \ \ e^{-i\Delta m v a/\hbar }. \end{aligned}$$
(58)

This phase factor, though not depending on c, has no basis in the Galileo transformations (where the concept of proper time is meaningless) and is erroneously seen as non-physical and removed via SSR.
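That the surviving phase does not depend on c can also be seen numerically. The Python sketch below evaluates \(c^2\Delta \tau \) using the last line of (57) together with the assumption \(t = (a + a/\gamma )/v\), i.e. the bracket in the definition of \(t_{NR}\) taken before the non-relativistic limit. The function name and the sample values are hypothetical. Only moderate values of c/v are used, because the direct subtraction \(1 - \sqrt{1-\beta ^2}\) loses floating-point precision for very small \(\beta \).

```python
import math


def c2_delta_tau(v, a, c):
    # c^2 * (t - sqrt(1 - v^2/c^2) * t), with t = (a + a/gamma) / v (assumption).
    beta = v / c
    inv_gamma = math.sqrt(1.0 - beta**2)
    t = (a + a * inv_gamma) / v
    return c**2 * (t - inv_gamma * t)


if __name__ == "__main__":
    v, a = 1.0, 2.0  # illustrative values
    for c in (2.0, 10.0, 100.0, 1000.0):
        print(f"c = {c:7.1f}   c^2 * delta_tau = {c2_delta_tau(v, a, c):.9f}   v * a = {v * a}")
```

Under this assumption the product \((1-\gamma ^{-1})(1+\gamma ^{-1}) = \beta ^2\) makes \(c^2\Delta \tau \) equal to \(va\) for every c, so the phase in (58) is the same at all the sampled values, up to rounding.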

In short, the non-relativistic limit of superpositions of states with different masses can comfortably accommodate a relative phase that is otherwise problematic from the point of view of Galileo transformations. So, neutrino mixing and the ensuing oscillations do not pose any conceptual difficulties in the non-relativistic limit, and they are certainly not prohibited by Bargmann’s SSR. Similar considerations also hold true for scalar particles in the FV representation. The only difference is that for spinless particles \(\psi _e \rightarrow \Phi _{_\mathrm{{I}}}\), \(m_e, m_\mu , m_{e\mu } \rightarrow m_{_\mathrm{{I}}}, m_{_\mathrm{{II}}}, m_{_\mathrm{{I,II}}}\) and \(\theta \rightarrow \tilde{\theta }\), where \(\tilde{\theta }\) is the mixing angle through which we must rotate the diagonal mass matrix to obtain \(\mathbb {M}\) in (27).

5 Conclusions

In this Letter, we have investigated the non-relativistic limit of the Klein–Gordon equation for mixed scalar particles. To mimic our treatment for spin-1/2 particles outlined in Ref. [18] (and summarized in Sect. 2), we have employed the Feshbach–Villars representation, according to which the wave function of a spinless particle becomes a two-component object and the equation of motion is of the first order in time. Within this setting, we have demonstrated that the resulting Schrödinger-like equation predicts effective inertial masses, which do not coincide with the eigenvalues of the mass matrix in the relativistic regime. In particular, the ensuing low-energy inertial masses depend non-trivially on the mixing term.

We have also shown that, when a weak external gravitational field is taken into account, the resulting gravitational masses remain unchanged in the non-relativistic limit, thus giving rise to a violation of WEP. Interestingly, the rate of this violation is identical to the one encountered in the framework of spinor flavor mixing, the only difference being an overall numerical factor which is associated with the spin of the considered particle. For the sake of consistency in our presentation, we have further proved that the non-relativistic limit for superpositions of states with different masses does not produce any inconsistency in non-relativistic quantum mechanics, as could be naively inferred from the application of Bargmann’s SSR.

Finally, it is important to note that, unlike the case in neutrino physics, the current model of meson oscillations is inevitably an effective model since mesons are not fundamental particles. Indeed, a full-fledged treatment should involve quarks, whose mixing properties are encoded in the Cabibbo–Kobayashi–Maskawa matrix [42, 43]. However, such an analysis would pose a series of problems, the majority of which are related to the fact that mesons are made up of quarks tied together by the strong force. In order to properly understand the issue at high enough energies, it is necessary to use quantum field theory instead of first quantization. For energies lower than \(m_1\) and \(m_2\), it is reasonable to assume that our first-quantized analysis should be viable.