1 Introduction

The Eötvös, Pekár and Fekete (EPF) equivalence test of the gravitational and inertial masses of a body was an outstanding achievement in physics at the beginning of the twentieth century [1]. They improved on the precision of previous tests by more than three orders of magnitude and were the first to use a torsion balance to test the equivalence principle. They found no violation at an accuracy level of 1/100,000,000 [2].

In 1986 Fischbach and his coworkers reanalyzed the EPF data and found a surprising composition dependence in terms of the baryon number-to-mass ratios of the samples [3, 4]. They hypothesized a composition-dependent fifth force. A series of novel gravitational experiments followed, most notably from the Eöt-Wash group [5,6,7,8,9,10,11,12], and found no evidence of such a fifth force. Lacking experimental support, the original hypothesis was abandoned. Valid questions remained, however, about the EPF experiment. This experiment was quite different from the more precise equivalence tests that followed, and the EPF correlation has, despite every effort, not yet been explained in terms of conventional or unconventional physics [4, 13, 14].

This situation motivated us to investigate the role of gravity gradients in the EPF test. Also we were motivated by our experience with the torsion balance in a nonlinear gravity field [15], and we asked whether any such effect might be visible in the results of the EPF experiment.

Fig. 1
A In the EPF experiment the lower mass of the balance was replaced with different samples. Variation in sample geometry changed the coupling with the ambient gravity field and led to variation of the gravitational torque. This caused the balance to move into a new equilibrium position, even when the equivalence principle was not violated and the ambient gravity field was unchanged. B The gravity field might also have changed during the experiment

In this paper we present arguments that the EPF results might have been seriously biased by a classical systematic effect related to the ambient gravity field. Using source mass modeling of the ambient gravity field, we found that a gravity gradient bias is enough to fully reproduce the EPF results without any fifth force effect. The essential aspects of our analysis are summarized in Fig. 1.

In Sect. 2 we overview relevant details of the EPF measurements and show the origin of the gravity gradient bias. In Sect. 3 we introduce multipole formalism and use it to model the interaction between the balance and the ambient gravity field. Next we evaluate and discuss gravity gradient bias in the EPF experiment with our model. In Sect. 4 we construct a simple ambient gravity field model and present numerical results with this model. In Sect. 5 other more precise equivalence tests are examined in connection with gravity gradient bias. Finally, we point out some important conclusions.

2 EPF measurement principle and origin of gravity gradient bias

The purpose of the EPF experiment was to compare gravitational acceleration G due to the Earth on different materials or samples [1]. The main idea was to compare the horizontal component of gravitational force acting on mass m with the horizontal component of centrifugal force mC (see Fig. 2). Eötvös assumed that centrifugal force mC is composition independent; hence, if gravitational force depends on material composition, the imbalance of horizontal forces can be detected with his torsion balance. If angle \(\varepsilon \) is the direction difference between gravity force mg (sum of gravitational and centrifugal forces) and gravitational force mG, \(mG\sin \varepsilon \) is the horizontal component of this force. At geodetic latitude \(\varphi \) the horizontal component of centrifugal force is \(mC\sin \varphi \).

EPF introduced the Eötvös parameter \(\eta \) to characterize possible composition dependence of the gravitational force through the formula \((1+\eta )mG\), assuming \(\eta =0\) for a selected reference material. This parameter is the ratio of the horizontal component of the differential acceleration of the upper and lower masses of the balance to the horizontal component of the gravitational acceleration [16]. EPF worked with 10 pairs of samples. The effect on the samples below the arm was compared to the fixed upper mass by the Eötvös parameter \(\eta \). The results of the EPF tests were finally described in terms of the variation of the Eötvös parameter \({\Delta }\eta \) between different pairs of samples. In the absence of bias, a nonzero Eötvös parameter variation would indicate a violation of the equivalence principle.

Fig. 2
Principle of the EPF experiment

Since the small force to be detected is in the north–south direction, the arm of the balance must be set to the east–west direction for maximum effect. (For direction reference we will consistently use the position of the lower mass of the balance.) When the lower mass first lies to the east, the net torque on the balance is \(-(\eta _a - \eta _b) mGl\sin \varepsilon = -{\Delta }\eta mGl\sin \varepsilon \). Here l is the half arm length and \(\eta _a\), \(\eta _b\) are the Eötvös parameters belonging to the lower and upper masses, respectively. (A positive torque is one that causes a positive rotation from north to east.) Next, when the balance is rotated by 180\(^\circ \), the net torque on the balance changes to \({\Delta }\eta mGl\sin \varepsilon \). The difference of torques is thus \(-2{\Delta }\eta mGl\sin \varepsilon \). After rotation the azimuth of the lower mass will change by \(180^\circ +v_1\), i.e., the change will not be exactly \(180^\circ \) in the case of a nonzero \({\Delta }\eta \). This differential rotation \(v_1\) is proportional to the difference of torques. The constant of proportionality is the reciprocal of the torsion constant \(\tau \) of the fiber. If \(v_1\) is measured, the difference of \(\eta \) can be calculated

$$\begin{aligned} {\Delta }\eta = -\frac{\tau v_1}{2mlG\sin \varepsilon }. \end{aligned}$$
(1)

Unfortunately, this simple formula gets more complicated because of the spatial variation of the gravitational force. We use a local north-east-down reference frame: the x-axis points to north, y to east and z to down. In this frame only the x-component of the gravitational force, \(g_x\), exerts torque on the balance in east–west position. In linear approximation its variation between the masses in east and west positions is \(mg_{x}(z) = mg_{xz}\, z\), where \(g_{xz}\) is the vertical gradient of \(g_x\). Hence, differential rotation due to gravitational gradients is

$$\begin{aligned} v_2 = - \frac{2}{\tau } mlh\,g_{xz}, \end{aligned}$$
(2)

where h is vertical distance between upper and lower masses.

Equation (2) makes it clear that at least two important requirements must be met to avoid gravity gradient bias in the EPF experiment. Both come from the requirement that \(v_2\) should be kept strictly constant during measurement. Otherwise, any change in the total differential rotation angle \(v = v_1 + v_2\) might, according to Eq. (1), be interpreted as a false violation of the equivalence principle. We note that Eötvös and his coworkers were fully aware of these requirements and, as we will see immediately, they changed their experimental protocol accordingly.

The first requirement was that the torsion constant \(\tau \), mass m, half arm length l and vertical mass distance h should either be the same for the sample pair or should be measured and used for correction. Since constancy of \(\tau \) cannot be assumed and its variation cannot be measured accurately, Eötvös and his coworkers devised a clever way to eliminate its effect. They made use of the fact that there is no torque due to the composition-dependent gravitational force on the balance in the north–south direction, but there is a gradient effect causing a differential rotation w very similar to Eq. (2)

$$\begin{aligned} w = \frac{2}{\tau } mlh\,g_{yz} \end{aligned}$$
(3)

due to vertical gravity gradient \(g_{yz}\). (We note that Eqs. (2) and (3) differ in sign since forces \(mg_x\) and \(mg_y\) with positive sign cause opposite torques.) The ratio v/w is free of the critical parameter \(\tau \), but \({\Delta }\eta \) can still be calculated from the change of this ratio across different samples if gravity gradients were unchanged. This was their Method 2.

The second requirement was that ambient gravity gradient \(g_{xz}\) (and in Method 2 also \(g_{yz}\)) must be unchanged during the experiment. To get rid of this requirement and still avoid bias EPF took simultaneous measurements with a pair of samples using a double balance. Hence, any possible change of gradients had the same effect on the sample pair; by differencing v/w across the two balances at the same time these effects disappeared. After the first set of measurements they measured a second set by exchanging the samples between the two balances to cancel any false effect coming from the slightly different parameters and orientation of the individual balances of the double balance. This was their most advanced Method 3. It must be noted, however, that the output of their experiment, \({\Delta }\eta \) for measured sample pairs, contained results obtained with both methods.

Now we consider the origin of a gravity gradient bias that was not recognized by EPF. Equation (2) is valid both for point masses and for homogeneous circular cylinders if l and h refer to their centers of mass. The latter can easily be verified by integration.

But what would be the result, if the vertical variation of \(g_x\) were not strictly linear, i.e., not well described by the formula \(g_{x}(z) = g_{xz}\, z\)? The next possibility would be to use the quadratic approximation \(g_{x}(z) = g_{xz}\, z + g_{xzz}\,z^2\). For cylindrical samples used by EPF the total gravitational force must be calculated by integration of \(g_{x}(z)\)

$$\begin{aligned} v_2 = -\frac{2}{\tau }\int _{z_1}^{z_2} m_z l g_x(z) \hbox {d}z, \end{aligned}$$
(4)

where \(z_1\), \(z_2\) denote heights of upper and lower faces of the cylindrical sample and \(m_z\) is mass of infinitesimal cross section. After easy calculation for a sample with height \(H = z_2 - z_1\) we get

$$\begin{aligned} v_2 = -\frac{2}{\tau } m l \left( hg_{xz} + \left( h^2 + \frac{H^2}{12} \right) g_{xzz} \right) . \end{aligned}$$
(5)
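
The step from Eq. (4) to Eq. (5) can be checked numerically. The following minimal sketch averages the quadratic model \(g_{x}(z) = g_{xz}\, z + g_{xzz}\,z^2\) over a thin homogeneous cylinder of height H centered at h and compares the result with the bracketed expression of Eq. (5). The common prefactor \(-2ml/\tau \) is omitted, the numerical values are arbitrary illustrations rather than EPF parameters, and the function names are our own.

```python
def mean_gx_over_cylinder(h, H, g_xz, g_xzz, n=10_000):
    """Average g_x(z) = g_xz*z + g_xzz*z**2 over z in [h - H/2, h + H/2].

    Uses the midpoint rule; a homogeneous thin cylinder has constant mass
    per unit height, so the mass-weighted mean is a plain mean over z.
    """
    z1 = h - H / 2.0
    dz = H / n
    total = 0.0
    for i in range(n):
        z = z1 + (i + 0.5) * dz
        total += g_xz * z + g_xzz * z * z
    return total / n


def closed_form(h, H, g_xz, g_xzz):
    """Bracketed term of Eq. (5): h*g_xz + (h^2 + H^2/12)*g_xzz."""
    return h * g_xz + (h * h + H * H / 12.0) * g_xzz


# Arbitrary illustrative values (any consistent units)
h, H = 20.0, 24.0
g_xz, g_xzz = 1.5, 0.07

numeric = mean_gx_over_cylinder(h, H, g_xz, g_xzz)
analytic = closed_form(h, H, g_xz, g_xzz)
print(numeric, analytic)
```

The two values agree up to the discretization error of the midpoint rule, which confirms that the height dependence enters only through the \(H^2/12\) term multiplying \(g_{xzz}\).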

Equation (5) clearly points to a new source of gravity gradient bias in the EPF experiment. It is due to sample height dependence. If sample height varies from H to \(H'\) and \(g_{xzz}\) is nonzero, there is a gravity gradient bias

$$\begin{aligned} {\Delta }\eta _{{\mathrm{bias}}} = -\frac{g_{xzz}}{12G\sin \varepsilon } (H^2 - H'^2) \end{aligned}$$
(6)

in the Eötvös parameter. A false violation of the equivalence principle is detected. And since EPF used samples with very different heights in their experiment, there is room for this bias. For example, the height of the Pt cylinder was 6 cm, that of magnalium (Mg–Al alloy) was 11.9 cm, and their snakewood cylinder was 24 cm long. (We remark that Eq. (5) is valid only for thin cylinders. A better approximation will be shown later that contains a term proportional to \(H^2/12 - R^2/4\). This expression depends on the radius R of the cylinder as well, but our conclusion on the origin of the bias is the same.)

Let us estimate the magnitude of the bias. According to Eq. (6) it is proportional to \(g_{xzz}\), the coefficient of the quadratic term in the height dependence of \(g_x\). Close to surfaces of density jumps there are strong nonlinearities of \(g_{x}\). Since the original field books, notes and any possible sketches about the EPF measurements are unavailable at the moment, we can only guess the masses that were originally close to the balances at the measurement site(s). Our calculations with mass models showed that \(g_{xzz}\) may be as large as 0.2–\(3\,\mathrm {nGal/cm}^2\) within 1 m of a strong density jump (floor, walls, etc.). We recently measured \(g_{xzz}=0.07\,\mathrm {nGal/cm}^2\) in an underground tunnel with an improved Pekár G-2B balance [17]. Hence, the gravity gradient bias in \({\Delta }\eta \) may reach \(2 \cdot 10^{-9}\) to \(8\cdot 10^{-8}\) and depends strongly on the local structure of the ambient gravity field and on sample shape. Compare this bias with the magnitudes of \({\Delta }\eta = \pm (1\text {–}6)\cdot 10^{-9}\) reported by EPF [1].
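
As an order-of-magnitude illustration of Eq. (6), the following sketch evaluates the bias for the Pt (6 cm) and snakewood (24 cm) sample heights quoted above at the two ends of the modeled \(g_{xzz}\) range. The value of \(G\sin \varepsilon \) is an assumption that is not given in the text: it is set equal to the horizontal centrifugal acceleration \(C\sin \varphi \) of a spherical Earth at an assumed latitude of 47.5°. All function and variable names are illustrative.

```python
import math

OMEGA = 7.292115e-5    # Earth's rotation rate, rad/s
R_EARTH_CM = 6.371e8   # mean Earth radius in cm (spherical-Earth assumption)
LATITUDE_DEG = 47.5    # assumed site latitude, not taken from the text


def horizontal_centrifugal_gal(lat_deg):
    """C*sin(phi) in Gal (cm/s^2), with C = omega^2 * R * cos(phi)."""
    phi = math.radians(lat_deg)
    return OMEGA**2 * R_EARTH_CM * math.cos(phi) * math.sin(phi)


def delta_eta_bias(g_xzz_ngal_cm2, H_cm, H_prime_cm, g_sin_eps_gal):
    """Eq. (6): bias in the Eotvos parameter when height changes from H to H'."""
    g_xzz = g_xzz_ngal_cm2 * 1e-9  # nGal/cm^2 -> Gal/cm^2
    return -g_xzz * (H_cm**2 - H_prime_cm**2) / (12.0 * g_sin_eps_gal)


# Assumption: G*sin(eps) is approximated by C*sin(phi)
g_sin_eps = horizontal_centrifugal_gal(LATITUDE_DEG)

bias_low = delta_eta_bias(0.2, 6.0, 24.0, g_sin_eps)   # lower end of g_xzz range
bias_high = delta_eta_bias(3.0, 6.0, 24.0, g_sin_eps)  # upper end of g_xzz range

print(f"G sin(eps) ~ {g_sin_eps:.3f} Gal")
print(f"bias ~ {bias_low:.1e} to {bias_high:.1e}")
```

Under these assumptions the bias for this sample pair lies between a few times \(10^{-9}\) and somewhat below \(10^{-7}\), which is comparable to or larger than the \({\Delta }\eta \) values reported by EPF. Sample pairs with a smaller height contrast give proportionally smaller values.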

We have shown in this section the origin and expected magnitude of the ambient gravity gradient bias. Next we further formulate and discuss the bias using multipoles.

3 Multipole formulation and discussion of possible gravity gradient effects

Multipoles [18] proved to be useful to describe gravitational interaction between the masses of the torsion balance and the masses outside that produce the ambient gravity field [9]. The ambient field is characterized by \(Q_{lm}\) multipole fields; with this characterization the gravitational torque on the balance is

$$\begin{aligned} T_g = -\frac{\partial W}{\partial \phi } = -4\pi iG \sum _{l=2}^\infty \frac{1}{2l+1} \sum _{m=-l}^l m \; q_{lm} Q_{lm} e^{-im\phi } . \end{aligned}$$
(7)

Here W is gravitational potential energy, G is the universal constant of gravitation, \(q_{lm}\) are multipole moments of the balance calculated in a body-fixed frame, and \(\phi \) is azimuth of the balance’s arm. Azimuth is measured from the x-axis and positive toward the y-axis. No torque is produced by the \(Q_{11}\) multipole field, because the arm is hanging freely on the torsion fiber; hence, the sum starts from \(l = 2\).

Our goal is to find Eötvös parameter variation \({\Delta }\eta \) in terms of multipole moments and ambient multipole fields. This quantity is expressed for Method 2 by

$$\begin{aligned} {\Delta }\eta =c\left( \frac{v}{w} - \frac{v'}{w'} \right) \end{aligned}$$
(8)

where v and w denote deflection differences of the balance arm in E–W and N–S directions expressed in scale units and by

$$\begin{aligned} {\Delta }\eta =\frac{c}{2} \left[ \left( \frac{v_1}{w_1} - \frac{v_2'}{w_2'} \right) + \left( \frac{v_2}{w_2} - \frac{v_1'}{w_1'} \right) \right] \end{aligned}$$
(9)

for Method 3, where primes indicate a different sample and subscripts denote individual balances of the double balance [1]. The proportionality constant is

$$\begin{aligned} c= \frac{w\tau }{4LM_a l_a C\sin \varphi }, \end{aligned}$$
(10)

where L is distance to the scale in scale units, \(M_a\) is mass of the sample, \(l_a\) is length of the balance arm, \(\tau \) is torsion constant of the fiber, and \(C \sin \varphi \) is the horizontal component of centrifugal acceleration. Small variations of the Eötvös parameter can also be caused by azimuthal differences of the arm; we omitted these since we were interested in gravity gradient effects.
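
The difference between Eqs. (8) and (9) can be illustrated with a toy calculation. In the sketch below every number is hypothetical, and the additive error model is our own assumption: each measured ratio is taken to be an undisturbed sample-dependent value plus a balance-specific offset plus a shift that is common to both balances during one measurement set. The sample geometry effect discussed in this paper is deliberately left out.

```python
def delta_eta_method2(c, vw, vw_prime):
    """Eq. (8): ratios from two successive measurement sets on one balance."""
    return c * (vw - vw_prime)


def delta_eta_method3(c, vw1, vw2_prime, vw2, vw1_prime):
    """Eq. (9): two simultaneous sets on a double balance, samples exchanged."""
    return 0.5 * c * ((vw1 - vw2_prime) + (vw2 - vw1_prime))


# Hypothetical toy values (not EPF data)
c = 1.0e-6                  # proportionality constant of Eq. (10)
a, a_prime = 0.250, 0.248   # undisturbed v/w of the sample and the primed sample
b1, b2 = 0.003, -0.002      # balance-specific offsets of balance 1 and balance 2
s1, s2 = 0.010, -0.004      # common field-induced shifts during set 1 and set 2

true_value = c * (a - a_prime)

# Method 2: sample measured in set 1, primed sample in set 2, same balance
m2 = delta_eta_method2(c, a + b1 + s1, a_prime + b1 + s2)

# Method 3: set 1 has the sample on balance 1 and the primed sample on
# balance 2; in set 2 the samples are exchanged between the balances
m3 = delta_eta_method3(
    c,
    vw1=a + b1 + s1, vw2_prime=a_prime + b2 + s1,
    vw2=a + b2 + s2, vw1_prime=a_prime + b1 + s2,
)

print(true_value, m2, m3)
```

In this toy model the Method 3 combination returns the undisturbed difference, because the common shifts cancel within each simultaneous set and the balance offsets cancel between the two sets, whereas the Method 2 value is contaminated by the change of the shift between sets.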

Next, we expressed the ratio of v and w in terms of gravitational torque differences, which in turn are related to field multipoles according to Eq. (7). For negative m we utilized the relation \(Q_{l,-m} = (-1)^m Q^*_{lm}\) where star denotes complex conjugation. We assumed a symmetrical mass distribution of the balance with respect to the plane of the arm’s axis and the fiber. In this case all \(q_{lm}\) are real and by keeping terms up to \(l \le 4\)

$$\begin{aligned} \frac{v}{w} = -\frac{\hbox {Re}(p)}{\hbox {Im}(p)}, \end{aligned}$$
(11)

where

$$\begin{aligned} p = q_{21}Q_{21} + \frac{5}{7}\; q_{31}Q_{31} - \frac{5}{7}\; q_{33}Q^*_{33} + \frac{5}{9}\; q_{41}Q_{41} - \frac{5}{9}\; q_{43}Q^*_{43}. \end{aligned}$$
(12)

To summarize, Eqs. (8)–(12) formulate the effect of gravity gradients on the output of the EPF experiment for multipole moments and fields for \(l \le 4\).

Now we discuss possible sample and gravity field-related variation of the v/w ratio, for any such change can bias the Eötvös parameter.

First consider shape dependence of \(q_{lm}\)’s. Multipole moments for a mass distribution with density \(\rho ({r})\) over domain \({\varOmega }\) are defined as

$$\begin{aligned} q_{lm} = \int _{{\varOmega }} \rho ({r}) r^l Y_{lm}^*({\hat{r}}) \hbox {d}^3r \end{aligned}$$
(13)

where \(Y_{lm}\) is spherical harmonic, star denotes complex conjugation, and \({\hat{r}}\) is the unit vector along r. The samples used in the EPF experiment were all cylinders suspended vertically. For homogeneous vertical cylinders placed at the origin, the expression due to [19] was utilized and resulting multipoles were translated using the technique disclosed by [18]. We found the following low-degree multipole moments (\(l = 2,3,4\)) of vertical cylinders to be shape dependent

$$\begin{aligned} q_{20}= & {} \frac{1}{24} \sqrt{\frac{5}{\pi }} M (H^2 -3R^2-6x^2-6y^2+12z^2) \end{aligned}$$
(14)
$$\begin{aligned} q_{31}= & {} \frac{1}{8} \sqrt{\frac{21}{\pi }} M (x-iy)(H^2 -3R^2-3x^2-3y^2+12z^2) \end{aligned}$$
(15)
$$\begin{aligned} q_{41}= & {} -\frac{3}{8} \sqrt{\frac{5}{\pi }} M z (x-iy)(H^2 -3R^2-3x^2-3y^2+4z^2) \end{aligned}$$
(16)

where M is the mass of the cylinder, R is its radius, H is its height, and x, y, z are the coordinates of its center of mass. If the center of mass lies in the xz plane, these coefficients are real. These multipole moments indeed include a shape dependence that is proportional to \(H^2/12-R^2/4\). What is relevant for the EPF experiment is the shape dependence of \(q_{31}\) and \(q_{41}\), because these also appear in Eq. (12).
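
The shape dependence can be verified directly for the simplest case, Eq. (14). The sketch below integrates \(r^2 Y_{20}^* = \sqrt{5/(16\pi )}\,(2z^2 - x^2 - y^2)\) over a homogeneous vertical cylinder with a midpoint rule in cylindrical coordinates and compares the result with the closed form. The parameter values are arbitrary illustrations, not the dimensions of any EPF sample, and the function names are our own.

```python
import math


def q20_closed_form(M, R, H, x, y, z):
    """Eq. (14) for a homogeneous vertical cylinder centered at (x, y, z)."""
    return math.sqrt(5.0 / math.pi) / 24.0 * M * (
        H**2 - 3 * R**2 - 6 * x**2 - 6 * y**2 + 12 * z**2
    )


def q20_numeric(M, R, H, x, y, z, n_s=60, n_phi=16, n_t=60):
    """Midpoint-rule integral of rho * r^2 * Y20^* over the cylinder volume."""
    rho = M / (math.pi * R**2 * H)
    ds, dphi, dt = R / n_s, 2.0 * math.pi / n_phi, H / n_t
    total = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds                      # radial distance from the axis
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            px = x + s * math.cos(phi)
            py = y + s * math.sin(phi)
            for k in range(n_t):
                pz = z - H / 2.0 + (k + 0.5) * dt
                # volume element is s ds dphi dt
                total += (2.0 * pz * pz - px * px - py * py) * s
    return math.sqrt(5.0 / (16.0 * math.pi)) * rho * total * ds * dphi * dt


# Arbitrary illustrative values
M, R, H = 25.0, 1.0, 24.0
x, y, z = 20.0, 0.0, 21.0

q20_num = q20_numeric(M, R, H, x, y, z)
q20_ana = q20_closed_form(M, R, H, x, y, z)

# Shape term: cylinder minus a point mass of the same mass and position
shape_term = q20_ana - q20_closed_form(M, 0.0, 0.0, x, y, z)
expected_shape = 0.5 * math.sqrt(5.0 / math.pi) * M * (H**2 / 12.0 - R**2 / 4.0)

print(q20_num, q20_ana, shape_term, expected_shape)
```

The difference between the cylinder and a point mass at the same position isolates the term proportional to \(H^2/12-R^2/4\).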

The largest gravity field-related bias is expected from the lowest degree multipole fields. Also, the largest shape effect is expected to come from the \(q_{31}Q_{31}\) term because Eq. (7) converges as \((r/R)^l\) where r is a typical dimension of the torsion balance and R is a characteristic distance from the pendulum to the closest source [9]. Hence, for the moment we keep only the first two terms in Eq. (12) to discuss shape and gravity field-dependent bias. This means that we assume a two-component gravity field with nonzero field multipoles \(Q_{21}\) and \(Q_{31}\).

To quantify bias the total derivative of v/w was calculated in terms of infinitesimal increments \(\hbox {d}q_{31}\), \(\hbox {d}Q_{21}\) and \(\hbox {d}Q_{31}\) of multipole moments and fields

$$\begin{aligned} \hbox {d}\left( \frac{v}{w}\right) = -\frac{1}{\hbox {Im}(p)}\left( \frac{5}{7} \hbox {d}q_{31} Q^+_{31} + q_{21}\hbox {d}Q^+_{21} + \frac{5}{7} q_{31} \hbox {d}Q^+_{31}\right) , \end{aligned}$$
(17)

where we used the abbreviations \(\hbox {d}Q^+_{21}=\hbox {Re}(\hbox {d}Q_{21})+v/w\,\hbox {Im}(\hbox {d}Q_{21})\), \(\hbox {d}Q^+_{31}=\hbox {Re}(\hbox {d}Q_{31})+v/w\,\hbox {Im}(\hbox {d}Q_{31})\), \(Q^+_{31}=\hbox {Re}(Q_{31})+v/w\,\hbox {Im}(Q_{31})\) and \(p = q_{21}Q_{21} + \frac{5}{7}\; q_{31}Q_{31}\). The change of v/w described by Eq. (17) is directly related to the bias of \({\Delta }\eta \) through Eqs. (8) and (9).
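
Equation (17) can be checked against a finite difference of Eq. (11). The following sketch uses the two-term p and hypothetical values for the moments, fields and their increments; \(q_{21}\) is held fixed, as in Eq. (17). None of the numbers represents the EPF balance or site.

```python
def ratio_vw(q21, q31, Q21, Q31):
    """Eq. (11) with p truncated to its first two terms."""
    p = q21 * Q21 + (5.0 / 7.0) * q31 * Q31
    return -p.real / p.imag


def d_ratio_eq17(q21, q31, Q21, Q31, dq31, dQ21, dQ31):
    """First-order change of v/w according to Eq. (17); q21 is held fixed."""
    p = q21 * Q21 + (5.0 / 7.0) * q31 * Q31
    vw = -p.real / p.imag

    def plus(c):
        # The '+' abbreviation: Re(c) + (v/w) * Im(c)
        return c.real + vw * c.imag

    bracket = (
        (5.0 / 7.0) * dq31 * plus(Q31)
        + q21 * plus(dQ21)
        + (5.0 / 7.0) * q31 * plus(dQ31)
    )
    return -bracket / p.imag


# Hypothetical moments, fields and small increments (arbitrary units)
q21, q31 = 1.0, 0.3
Q21, Q31 = complex(2.0, 1.0), complex(0.5, -0.2)
dq31 = 1e-7
dQ21 = complex(1e-7, -2e-7)
dQ31 = complex(3e-7, 1e-7)

predicted = d_ratio_eq17(q21, q31, Q21, Q31, dq31, dQ21, dQ31)
actual = ratio_vw(q21, q31 + dq31, Q21 + dQ21, Q31 + dQ31) - ratio_vw(q21, q31, Q21, Q31)

print(predicted, actual)
```

For sufficiently small increments the two numbers agree to first order, which also makes explicit that a change of \(q_{21}\) is not included in Eq. (17).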

The first term on the right-hand side of Eq. (17) gives a shape bias that depends on \(\hbox {d}q_{31}\), while the next two terms give a gravity field bias related to \(\hbox {d}Q^+_{21}\) and \(\hbox {d}Q^+_{31}\).

For Method 2 (see Eq. 8) all three terms may bias the Eötvös parameter, since the v/w and \(v'/w'\) values were calculated from successive measurement sets with different samples in the EPF experiment. There must have been a considerable time delay between these sets. Although [1] disclosed no details on the timing of their measurements, they did give the number of observations in each set. From this it can be estimated that at least one week, or even a couple of weeks, must have passed between subsequent sets of measurements, during which constancy of the gravity field cannot safely be assumed. And, of course, there is a sample shape-dependent bias due to \(\hbox {d}q_{31}\), irrespective of any change in the ambient gravity field.

For Method 3 (see Eq. 9) EPF made simultaneous measurements with a double balance. Values of v/w and \(v'/w'\) were calculated from measurements taken at almost the same time; therefore, the bias terms containing \(\hbox {d}Q^+_{21}\) and \(\hbox {d}Q^+_{31}\) must have been nearly the same for both v/w and \(v'/w'\) even in a changing gravity field. Their differences according to Eq. (9) must have canceled, giving no effect. The first term in Eq. (17) may still have biased the results, because \(q_{31}\) was not the same for the samples, i.e., \(\hbox {d}q_{31}\) was nonzero in the experiment. Clearly this bias also depends on the multipole field through \(Q^+_{31}\). This field might have changed a little because in this method there was another set of measurements with the samples exchanged between the two balances. So even in this method there was a slight dependence of the Eötvös parameter on time variation of the ambient gravity field due to the nonzero shape effect (nonzero \(\hbox {d}q_{31}\)).

If we keep the first term in Eq. (17) and introduce the relative change \(\hbox {d}q_{31}/q_{31}\)

$$\begin{aligned} \hbox {d}\left( \frac{v}{w}\right) = -\frac{5}{7}\frac{Q^*_{31}}{\hbox {Im}(Q_{21})+5/7 \hbox {Im}(Q_{31})\,q_{31}/q_{21}} \cdot \frac{\hbox {d}q_{31}}{q_{31}} \end{aligned}$$
(18)

we see linear dependence of the bias for Method 3 (or for Method 2 in unchanging gravity field) on \(\hbox {d}q_{31}/q_{31}\) when constancy of the \(q_{31}/q_{21}\) ratio and of field multipoles \(Q_{21}\), \(Q_{31}\) is assumed across different sample pairs. This important relation will be used later for checking the EPF results, since no knowledge of field multipoles is required for using it.

Finally we remark that if we neglect terms in p of higher than second degree (this was the analysis done by EPF), any false gravity gradient violation signal must come from the change of \(Q_{21}\), that is from the second term of Eq. (17). This bias was avoided by EPF through their Method 3.

We have seen that if higher-degree field multipoles are not negligible, even Method 3 results may be biased. The origin of this bias is the change of coupling to the gravity field as a function of sample geometry (see Fig. 1).

4 Calculation and discussion of possible gravity gradient bias in the EPF experiment

Before starting the interpretation of the results of the EPF experiment, it is useful to show the results themselves. In Fig. 3 we plotted the recalculated Eötvös parameter differences \({\Delta }\eta \) and their standard deviations for the 10 measured sample pairs. Recalculation was based on the v/w ratios and their standard errors published in the original paper [1].

Fig. 3
Recalculated Eötvös parameter differences \({\Delta }\eta \) with standard errors for the 10 sample pairs measured by EPF in the order these results were published in their paper [1]. Recalculation was based on the published v/w ratios and their standard errors. Shading indicates measurements with Method 3

Equation (17) shows clearly that both multipole moments and field multipoles are required to calculate the Eötvös parameter bias.

It was relatively straightforward to calculate the multipole moments of the balances used by EPF from the parameters published in their paper [1]. We used closed expressions for the inner multipole moments [18, 19] of cylinders in upright and horizontal positions. The balance beam in the EPF experiment was a hollow brass cylinder with a diameter of 0.5 cm. The upper cylindrical Pt mass, weighing 30 g, was inserted into one end of the hollow cylinder [20]. Missing parameters of the beam geometry were determined from the mass moments given by [1]. We emphasize that a complete mass model of the balance (with different probe masses, the balance beam and the upper mass) was used for all the calculations unless indicated otherwise. For reference, the calculated multipole moments of the complete balance and its \(q_{31}/q_{21}\) ratios are shown in Table 1.

Table 1 Calculated multipole moments and \(q_{31}/q_{21}\) ratios of the complete balance including the arm for each measured sample of the EPF experiment

As explained in Sect. 3, the bias for Method 3 (or for Method 2 in an unchanging gravity field) is expected to depend linearly on \(\hbox {d}q_{31}/q_{31}\) if the \(q_{31}/q_{21}\) ratio and the field multipoles \(Q_{21}\), \(Q_{31}\) are constant across different sample pairs. From Table 1 it can be seen that \(q_{31}/q_{21}\) varies by less than 2% across samples for Method 3 and by up to 16% for Method 2. In a steady-state ambient gravity field an approximately linear correlation with the Eötvös parameter variation \({\Delta }\eta \) is thus expected. Figure 4 shows the correlation with the computed relative changes \({\Delta } q_{31}/q_{31}\) between the samples of the same sample pair. For Method 3 the average of the two slightly different \({\Delta } q_{31}/q_{31}\) values was used.

Fig. 4
Variation of the Eötvös parameter \({\Delta }\eta \) as a function of the relative multipole change \({\Delta } q_{31}/q_{31}\) of the balance for sample pairs. If we assume a constant \(q_{31}/q_{21}\) ratio of the sample pairs, linear dependence is expected for Method 2 in an unchanging ambient gravity field and for Method 3 even in a changing ambient gravity field. Approximate linear dependence is clearly seen for the Method 3 EPF results, which strongly supports a bias originating from the gravity field in the experiment

The distribution and size of the masses close to the EPF measurement sites are essentially unknown to us. As mentioned, neither sketches nor field notes were available that might have helped us in this respect. Hence, one can only guess at the magnitude of the gravity gradient bias in the EPF experiment. In contrast, in modern rotating torsion balance tests the multipole fields were carefully measured and compensated for [9, 12, 21] to minimize any possible false gravity gradient violation signal.

Fig. 5
Point mass model of the ambient gravity field. The first part contains two point masses \(M_1\) and \(M_1\) symmetrically placed around the origin. It models the \(Q_{21}\) field multipole and has a nonzero \(Q_{43}\). Three point masses \(2M_2\), \(M_2\) and \(M_2\) form the second part to model the \(Q_{31}\) field multipole (see Table 2). Parameter \(\alpha \) is \(5/2 \root 8 \of {4/5}\)

We wanted, however, to demonstrate the effect of a changing ambient gravity field on the output of the EPF experiment even though the field multipoles are unknown. Therefore, we constructed a simple source mass model of the ambient gravity field (Fig. 5). We admit that this model is quite artificial. Nevertheless, we think it serves its purpose of demonstrating how sensitive the EPF experiment was to the ambient gravity field through the sample geometry bias.

Table 2 Details of the ambient gravity field model of low-order (\(l \le 4\)) \(Q_{lm}\) field multipoles

Our mass model consisted of 5 point masses with 4 independent model parameters. Details of the model are given in Table 2. The v/w ratio computed from the model must conform to the EPF measurements. Hence, the azimuths \({\varPhi }_1\) were constrained to yield the measured v/w ratios. We assumed that the point mass sources of the \(Q_{31}\) field multipole lie at a characteristic distance of 20 m, because a strong concrete tower was reported about 20 m from the measurement site [2].

More precise values of the Eötvös parameter variations \({\Delta }\eta \) and their standard deviations for the 10 sample pairs were recalculated from the original data [1]. The parameter space of the model was searched for optimum solutions by differential evolution [22]. The optimality criterion was that the sum of weighted squared differences between the model \({\Delta }\eta _{{\mathrm{model}}}\) and the measured \({\Delta }\eta \) be minimal. Weights were assigned from the standard deviations of the results.
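
The fitting procedure can be sketched as follows. This is a bare-bones rand/1/bin differential evolution written for illustration only; the text does not state which variant or implementation was used. The model here is a hypothetical straight line that stands in for the mass model, the data are synthetic, and all names and control parameters (F, CR, population size) are our own assumptions.

```python
import random


def weighted_sse(params, model, xs, ys, sigmas):
    """Sum of squared model-minus-data differences weighted by 1/sigma^2."""
    return sum(((model(params, x) - y) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def differential_evolution(cost, bounds, pop_size=30, F=0.7, CR=0.9,
                           generations=300, seed=0):
    """Minimal rand/1/bin differential evolution with box constraints."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # three distinct members, all different from the target i
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:  # greedy selection
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]


def toy_model(params, x):
    """Hypothetical stand-in for the mass model: a straight line."""
    return params[0] * x + params[1]


# Synthetic data in units of 1e-9 (not EPF data), generated without noise
xs = [-0.4, -0.2, 0.0, 0.1, 0.3, 0.5]
ys = [toy_model((2.0, -0.5), x) for x in xs]
sigmas = [0.5, 1.0, 0.8, 1.5, 0.6, 1.2]

best_params, best_cost = differential_evolution(
    lambda p: weighted_sse(p, toy_model, xs, ys, sigmas),
    bounds=[(-10.0, 10.0), (-10.0, 10.0)],
)
print(best_params, best_cost)
```

Dividing each residual by its standard deviation before squaring gives well-determined sample pairs more influence on the fit, which is the weighting described above.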

We considered two extreme cases: In Case 1, no variation of the model was allowed, i.e., point masses were fixed for all measurement epochs. Upper subfigure of Fig. 6 shows the correlation between this model and the original EPF measurement in terms of variations of the Eötvös parameter \({\Delta }\eta \). In Case 2, reasonable changes were allowed for parameters of the mass model between different measurement epochs. Lower subfigure of Fig. 6 shows that this way a perfect match between the model and the original EPF measurement could be achieved in terms of \({\Delta }\eta \) Eötvös parameter variation.

Fig. 6
The two panels show two extreme cases of correlation of our ambient gravity field bias model with the EPF experiment in terms of modeled and measured Eötvös parameter differences \({\Delta }\eta \). The upper panel shows the results of Case 1. In this case no variation of the ambient gravity field was allowed. Consequently, the modeled \({\Delta }\eta _{{\mathrm{model}}}\) was due to varying sample geometry alone. The lower panel shows the perfect correlation in Case 2. This fit was achieved by allowing small variations of the ambient gravity field model between measurements that were not taken at the same time (see Fig. 7). Although it is unreasonable to require a perfect fit, it demonstrates clearly that the original EPF measurements can be interpreted fully as a false gravity gradient effect. \(R^2\) and \(R^2_{{\mathrm{adj}}}\) both denote the coefficient of determination, the latter adjusted for the number of terms in the model. F is the F-test statistic for overall significance, and p(F) is the p value of the F test

Fig. 7
Parameter values of the ambient gravity field model that were required for exactly reproducing the EPF results (see lower subfigure of Fig. 6). Shading indicates measurements with Method 3. For each sample pair measured with Method 2 there were two models since these measurements were not taken at the same time. The ambient gravity field model was a simple 5-point mass model composed of a 2-point \(Q_{21}\), \(Q_{43}\) field and a 3-point \(Q_{31}\) field. Parameters \(d_1\) and \({\varPhi }_1\) are the horizontal distance and azimuth of the 2-point mass model, and \(d_2\) and \({\varPhi }_2\) are those of the 3-point mass model. \(M_2/M_1\) is the mass ratio of the two models. 84% of the relative parameter changes are below \(\pm 5\%\), and the maximum is 17.9%. These changes seem reasonable since EPF reported on the construction of a nearby building during the observations

Figure 7 presents model parameters required for the perfect fit in Case 2. This fit was achieved with a 2.6% average absolute variation of the mass model’s parameters. Additionally, 84% of the relative parameter changes were below \(\pm 5\%\). Maximum parameter variation was 17.9%, and the three largest variations were found in the mass ratio parameter \(M_2/M_1\).

Figure 4 indicates no linear dependence for the results obtained by Method 2, whereas the results obtained by Method 3 show an approximately linear dependence. No dependence was expected for Method 2, since time variation of the ambient gravity field between the two measurement sets probably hid the effect of sample geometry. We checked this assumption with a simple calculation using the same mass model, whose parameters are shown in Fig. 7. The gravity field was changing according to the mass model, but the shape of the probe masses remained constant within any given pair of Method 2 measurements. The results in Table 3 show that for this particular mass model the variations of the Eötvös parameter are indeed of the same order of magnitude as, or even larger than, those obtained by EPF. We mention that EPF reported on the construction of a nearby building and the excavation of a deep pit during the observations [20]. This construction work must have caused a significant local gravity field variation. On the other hand, the Method 3 results are virtually insensitive to this time variation, so the sample geometry effect became clearly visible. We think these results strongly support our hypothesis of a sample geometry-dependent ambient gravity field bias. We argue that in the case of purely random effects there would be no such clear distinction between the two methods in the results of the EPF experiment.

Table 3 Effect of the time-varying gravity field on Eötvös parameter differences calculated according to Method 2 of EPF

The simple source mass model has demonstrated that a particular ambient gravitational field can even reproduce the measured effects. Interpretation of the results obtained by source mass modeling confirmed the possible role of time variation of the ambient gravity field during the EPF experiment. When no time variation was allowed, we found only a moderate correlation between modeled and measured Eötvös parameter differences \({\Delta }\eta \). Even this fit required quite unrealistic model parameters (too large an \(M_2/M_1\) ratio and too small a \(d_2\)).

On the other hand, when time variation of the source mass model was allowed, we obtained reasonable results. Although the assumption of a perfect fit without any statistical fluctuation is obviously unrealistic, Fig. 7 shows that both the magnitudes of the calculated mass model parameters and the range of their variations are feasible.

5 Examination of other equivalence tests in connection with gravity gradient bias

In view of the above results for the EPF experiment, the question is whether other, more precise equivalence tests might also be affected by a similar gravity gradient bias. Example calculations were made to assess the magnitude of such effects. In this section we present results of calculations of gravity gradient effects for three more precise tests made by Dicke et al. [23], Braginsky [24] and the Eöt-Wash group [12]. We also remark in this context on the space equivalence experiment MICROSCOPE [25]. For a recent overview of weak equivalence principle tests the reader is referred to [26].

In Sect. 3 we have shown that the EPF bias is due to either differences in proof mass geometry or time variation of ambient gravity or both. First let us discuss bias due to proof mass geometry in the context of the above more precise tests.

Compared to the EPF experiment, these tests used better proof mass geometry that was designed to reduce spurious ambient gravity field effects. Dicke and his collaborators [23], for example, reduced field gradient effects by (1) making the torsion balance small, with a moment arm of only 3.3 cm, (2) giving the balance an approximate threefold symmetry axis and (3) operating the instrument remotely. Braginsky [24] reduced the effects of variable local gradients by constructing a balance in the form of an eight-pointed star with equal masses at the points. The rotating pendulum designed by Wagner et al. [12], with fourfold azimuthal and up–down symmetries, reduced systematic effects by minimizing coupling to the gravity gradients and by allowing four different orientations of the pendulum with respect to the turntable rotor.

Another difference between the Dicke and Braginsky tests and the EPF test was that their balances were fixed in the N–S direction, because rotation of the balance was not necessary to test for the effect of the solar gravity field. (While EPF made some tests with a fixed balance for the solar gravity, their results were of inferior accuracy with respect to those in the Earth’s gravity field.) With a fixed balance there is only a constant torque due to shape effects, which cannot bias the amplitude of the expected 24-hour solar equivalence violation signal. Wagner et al. [12] did rotate the balance to measure in the Earth’s gravity field but, in contrast with the EPF experiment, they very carefully considered and controlled coupling to external gravity gradients (multipole fields) due to shape effects.

In the following we present example calculations of the bias due to time variation of ambient gravity. Our intent was just to show the expected magnitude of certain environmental effects.

Dicke and his collaborators [23] estimated and reported the effects of time-varying gravitational disturbances due to anthropogenic causes, precipitation, atmospheric masses and imperfections of the triangular torsion balance. Their balance was mounted and remotely controlled inside an instrument pit 12 ft deep by 8 ft square on a poured concrete floor resting on rock. Figure 11 of their paper [23] shows that the instrument pit was surrounded by soil; hence, the balance might have been affected by soil moisture effects. These effects were not mentioned in their paper [23]. To calculate the torque due to soil moisture variations we used the multipole formalism introduced in Sect. 3, specifically Eq. (7). To use this equation, \(q_{lm}\) multipole moments and \(Q_{lm}\) multipole fields are required.

Assuming perfect geometry of the balance, i.e., identical moment arms, the \(q_{lm}\) multipole moments of the Dicke Al–Au balance were calculated up to degree and order 10. If water content increased by 50% next to the pit in the pores of a rectangular volume of soil 60 cm \(\times \) 70 cm \(\times \) 1.5 cm, assuming normal porosity of 40%, the density change was \( 0.2 \,\hbox {g/cm}^3\), and an extra mass of 1.3 kg appeared inside that volume of the soil. Using the corrected formula, Eq. (19), from [19] for the multipole moments of a cuboid with mass M, height H, sides a, b and orientation \(\phi \), the \(Q_{lm}\) multipole fields of the rectangular mass anomaly were obtained up to the same degree and order 10 with the analytic method described in [18]. With these data we calculated the maximum torque on the balance \(T_g = 6.4 \cdot 10^{-10}\,\hbox {dyn}\,\hbox {cm}\). If, however, a 1% imperfection in one of the arms of the balance was assumed, the maximum torque reached \(7.9 \cdot 10^{-10}\,\hbox {dyn}\,\hbox {cm}\). Both torques are higher than the one that corresponds to the standard deviation of the Eötvös parameter \(\eta = 1 \cdot 10^{-11}\) reported by Dicke et al., since the torque \(T_g\) on the balance that corresponds to this value is \(T_g = 5.9 \cdot 10^{-10}\,\hbox {dyn}\,\hbox {cm}\). Such mass changes and the corresponding torque would, of course, need a 24-h periodicity to be misinterpreted as an equivalence violation. Nevertheless, our simple calculations show the sensitivity of the apparatus used by Dicke et al. [23] to possible time-varying environmental gravitational disturbances.
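The density change and extra mass quoted for the soil volume follow from simple arithmetic, sketched below. The sketch assumes that the 50% increase refers to the fraction of pore volume filled with water and that water has a density of \(1\,\hbox {g/cm}^3\); the function name `moisture_mass_anomaly` is introduced here for illustration only.

```python
# Illustrative check of the soil-moisture mass anomaly quoted in the text.
# Assumptions (ours): the 50% increase is the added water-filled fraction
# of the pore volume, and water density is 1 g/cm^3.

WATER_DENSITY = 1.0  # g/cm^3


def moisture_mass_anomaly(sides_cm, porosity, saturation_increase):
    """Return (density change in g/cm^3, extra mass in kg) for a soil cuboid."""
    a, b, c = sides_cm
    volume_cm3 = a * b * c
    # Extra water per unit bulk volume of soil.
    density_change = porosity * saturation_increase * WATER_DENSITY
    extra_mass_kg = density_change * volume_cm3 / 1000.0
    return density_change, extra_mass_kg


density_change, extra_mass = moisture_mass_anomaly((60.0, 70.0, 1.5), 0.40, 0.50)
print(f"density change = {density_change:.2f} g/cm^3")
print(f"extra mass     = {extra_mass:.2f} kg")
```

Under these assumptions the extra mass comes out at 1.26 kg, consistent with the rounded 1.3 kg used in the text.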

$$\begin{aligned} q_{lm}= & {} M\sqrt{\frac{(2l+1)(l+m)!(l-m)!}{4\pi }} e^{-im\phi } \nonumber \\&\times \, \sum _{k=0}^{(l-m)/2} \frac{(-1)^{k}{H}^{l-2k-m} }{(m+k)!k!2^{l+2k+m}(l-m-2k+1)!(2k+m+2)} \nonumber \\&\times \, \sum _{p=0}^{m/2 } (-1)^p {m\atopwithdelims ()2p}\sum _{j=0}^k {k\atopwithdelims ()j} \nonumber \\&\times \, \frac{(-1)^{m/2}a^{2k+m-2j-2p}b^{2j+2p}+b^{2k+m-2j-2p}a^{2j+2p}}{2j+2p+1}\,,\nonumber \\&\quad \text {for both } m \text { and } l \text { even, and } m\ge 0\,. \end{aligned}$$
(19)
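As an illustration, Eq. (19) can be transcribed term by term into a short function. This is only a sketch of the formula as printed here, not the code used for the calculations in this paper; the name `cuboid_q_lm` and the argument order are hypothetical. It is restricted to even \(l\) and even \(m\ge 0\), as the equation requires.

```python
import cmath
from math import comb, factorial, pi, sqrt


def cuboid_q_lm(l, m, M, H, a, b, phi=0.0):
    """Term-by-term transcription of Eq. (19) for a homogeneous cuboid.

    M: mass, H: height, a, b: horizontal sides, phi: orientation (rad).
    Only even l and even m with 0 <= m <= l are covered by the formula.
    """
    if l % 2 or m % 2 or not 0 <= m <= l:
        raise ValueError("Eq. (19) covers even l and m with 0 <= m <= l only")

    prefactor = M * sqrt(
        (2 * l + 1) * factorial(l + m) * factorial(l - m) / (4 * pi)
    )
    sign_m = (-1) ** (m // 2)

    total = 0.0
    for k in range((l - m) // 2 + 1):
        # Height-dependent factor of the outer sum over k.
        height_term = (-1) ** k * H ** (l - 2 * k - m) / (
            factorial(m + k) * factorial(k) * 2 ** (l + 2 * k + m)
            * factorial(l - m - 2 * k + 1) * (2 * k + m + 2)
        )
        # Inner double sum over p and j with the side lengths a and b.
        side_sum = 0.0
        for p in range(m // 2 + 1):
            for j in range(k + 1):
                e1 = 2 * k + m - 2 * j - 2 * p
                e2 = 2 * j + 2 * p
                side_sum += (
                    (-1) ** p * comb(m, 2 * p) * comb(k, j)
                    * (sign_m * a ** e1 * b ** e2 + b ** e1 * a ** e2)
                    / (2 * j + 2 * p + 1)
                )
        total += height_term * side_sum

    return prefactor * cmath.exp(-1j * m * phi) * total


# Two sanity checks of the transcription (arbitrary units):
q00 = cuboid_q_lm(0, 0, M=2.0, H=1.0, a=3.0, b=4.0)
q20_cube = cuboid_q_lm(2, 0, M=1.0, H=1.0, a=1.0, b=1.0)
print(abs(q00), abs(q20_cube))
```

The two checks are that the monopole term reduces to \(M/\sqrt{4\pi }\) and that \(q_{20}\) vanishes for a cube, as expected for a body with cubic symmetry.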

The experiment by Braginsky et al. [24] was performed in a very carefully thermally insulated basement room of Moscow State University. In the absence of additional details, the size of the room and the moisture control of the walls can only be guessed. Again assuming perfect geometry of the balance, i.e., identical arms, the \(q_{lm}\) multipole moments were calculated up to degree and order 10. Assuming a 0.8% moisture change in a 35-cm-thick wall of size 2.5 m \(\times \) 5 m at a distance of 5 m from the balance, field multipoles were again calculated using Eq. (19). In this case the maximum torque was only \(T_g = 7.2 \cdot 10^{-15}\,\hbox {dyn}\,\hbox {cm}\). This is negligible considering the standard deviation of the Eötvös parameter \(\eta = 0.9 \cdot 10^{-12}\) reported by the Braginsky team, since the torque \(T_g\) on the balance that corresponds to this value is \(T_g = 7.5 \cdot 10^{-13}\,\hbox {dyn}\,\hbox {cm}\). The relative machining tolerance of the balance was reported by [27] as \(4 \cdot 10^{-4}\). If, therefore, one of the arms of the balance is assumed to deviate by this amount (for a 10-cm arm by \(40\,\mu \hbox {m}\)), the quadrupole moment \(q_{22}\) becomes nonzero. With this slight imperfection the maximum torque on the balance increased by two orders of magnitude to \(T_g = 8.1 \cdot 10^{-13}\,\hbox {dyn}\,\hbox {cm}\), assuming the same moisture change as above. This torque is slightly larger than the one corresponding to the reported precision of the experiment.

The Eöt-Wash group at the University of Washington, Seattle, performed a series of equivalence tests with uniformly rotating balances; the last and most precise of their tests was reported in [28, 29]. The \(Q_{21}\), \(Q_{31}\) and \(Q_{41}\) moments of the environmental gravity gradient fields were measured with a specially designed gradiometer pendulum. Daily variations of \(Q_{21}\) and \(Q_{31}\) were also monitored for about 10 days each. To avoid a false equivalence principle violating signal from ambient gravity gradient field multipoles due to \(m=1\) moments, compensating masses near the apparatus reduced the \(Q_{21}\) and \(Q_{31}\) moments, as well as the \(Q_{22}\) moment. The \(Q_{41}\) field was shown to be negligible. However, the observed seasonal variations of \(\sim 1\%\) due to changes in the moisture content of the soil behind the laboratory and adjustments to the water level in Lake Washington limited their ability to fully compensate for environmental gravity gradients.

The best standard deviation of the Eötvös parameter obtained in the Earth’s gravity field was \(\eta = 1.2 \cdot 10^{-13}\) for the Be–Al configuration, and the corresponding torque on the balance was estimated as \(T_g = 2.6 \cdot 10^{-12}\,\hbox {dyn}\,\hbox {cm}\). Multipole moments of the balances in Be–Ti and Be–Al configurations are taken from Table 5.4 in [29]. We assumed 0.8% moisture change in a 35-cm-thick wall of size 1 m \(\times \) 1 m at a distance of 0.8 m from the balance and calculated multipole fields of this source as above. The maximum torque was \(T_g = 1.1 \cdot 10^{-13}\,\hbox {dyn}\,\hbox {cm}\) for the Be–Al and \(2.1 \cdot 10^{-13}\,\hbox {dyn}\,\hbox {cm}\) for the Be–Ti configuration. Both values are less than 10% of the value corresponding to the best standard deviation.
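The comparisons made in this section can be collected in a few lines of arithmetic. The sketch below uses only the torque values quoted in the text; the dictionary layout and the name `torque_ratios` are ours, and the "threshold" is the torque that corresponds to the reported standard deviation of the Eötvös parameter in each experiment.

```python
from math import log10

# (modeled torque, threshold torque) in dyn cm, copied from the text.
# The threshold is the torque corresponding to the reported standard
# deviation of the Eotvos parameter.
CASES = {
    "Dicke, ideal balance": (6.4e-10, 5.9e-10),
    "Dicke, 1% arm imperfection": (7.9e-10, 5.9e-10),
    "Braginsky, ideal balance": (7.2e-15, 7.5e-13),
    "Braginsky, 4e-4 arm tolerance": (8.1e-13, 7.5e-13),
    "Eot-Wash, Be-Al": (1.1e-13, 2.6e-12),
    "Eot-Wash, Be-Ti": (2.1e-13, 2.6e-12),
}


def torque_ratios(cases):
    """Return modeled/threshold torque ratio for each named case."""
    return {name: modeled / threshold
            for name, (modeled, threshold) in cases.items()}


ratios = torque_ratios(CASES)
for name, ratio in ratios.items():
    print(f"{name:32s} modeled/threshold = {ratio:7.3f}")

# Increase of the Braginsky torque caused by the assumed arm imperfection,
# expressed in orders of magnitude.
braginsky_gain = log10(8.1e-13 / 7.2e-15)
print(f"Braginsky torque increase: {braginsky_gain:.2f} orders of magnitude")
```

A ratio above 1 means that the modeled disturbance exceeds the torque equivalent of the reported precision. This is the case for both Dicke configurations and for the imperfect Braginsky balance, while both Eöt-Wash configurations stay below 0.1.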

The MICROSCOPE space mission implemented a new approach to test the weak equivalence principle [25]. Nongravitational forces acting on the satellite are counteracted by thrusters, making it possible to compare the accelerations of two concentric hollow cylindrical test masses of different compositions. The shape of the test masses was designed to reduce the local self-gravity gradients by making them approximate gravitational monopoles [30]. For ideal gravitational monopoles no perturbing gravitational source, however close, could induce a differential acceleration between test masses [31]. Therefore, this smart design almost completely eliminated the bias that may result from interaction of the test masses with a possibly time-varying environmental gravity field.

6 Conclusions

We think that our findings shed fresh light on the EPF data. We have demonstrated that the gravity field-related bias discussed in the present paper must be taken into account in any future attempt to explain the EPF results.

We thus suspect that the EPF results were affected by this gravity gradient-related systematic error, although its precise magnitude remains unknown. Therefore, we propose a remeasurement with an original torsion balance to provide experimental verification of the gravity gradient effect. Such an effect is not excluded by much more precise equivalence tests, since it can only be measured with an Eötvös torsion balance. We mention in this regard that a remeasurement of the EPF equivalence test is under way this year in Budapest. An original Pekár G-2B balance with computer-assisted automatic rotation and reading was recently installed in a deep tunnel 30 m below ground. We hope this remeasurement, with careful control of the gravity gradient effect, may provide further details on the EPF bias and on the EPF experiment itself. This experimental validation could also exclude or confirm the existence of an effect that could be related to the rotation of the Earth [14], another possible reason for the apparent systematic deviation from the equivalence principle observed in the EPF experiment.

Considering the equivalence tests that followed the EPF experiment, our example calculations have shown that, even if the balance is not rotated, environmental gravity effects may reach the sensitivity of some of the experiments. This was the case for the tests made by Dicke, Braginsky and their collaborators. Even slight imperfections may increase the sensitivity of the balances to relatively small variations of environmental gravity by orders of magnitude. In contrast, owing to the careful consideration of ambient gravity effects and/or smart design, the more recent equivalence tests by the Eöt-Wash group and by the MICROSCOPE mission are well controlled in terms of the gravity gradient bias.