1 Two Pillars of Modern Physics

The methodology of physics, pioneered by Archimedes, Galileo and Newton, has been crystallising with the development of formal languages and statistical analysis. Its “unreasonable effectiveness” (Wigner 1960) is founded on two basic assumptions: Firstly, physical systems are modelled via mathematical structures, which guarantee the universality and objectivity of the description. Secondly, the models are verifiable via a comparison of their predictions against empirical data.

The question whether there exists an ‘ultimate’ mathematical model of physical reality (or an overarching “law of physics”), and whether it is intelligible, is controversial and has long been the subject of philosophical debate (Wheeler 1983; Deutsch 1986; Weinberg 1994; Hawking 2002; Heller 2009). This problem, however, seems irrelevant for the practice of doing physics. Indeed, any realistic experiment on a physical system is affected by some noise resulting from an environment. Hence, in practice, no model can become arbitrarily exact. Yet, we have to assume that our ignorance is the only source of the uncertainty, for otherwise we would decree “[...] that there exist aspects of the natural world that are fundamentally inaccessible to science.” (Deutsch 1986).

The second inexorable (Ellis and Silk 2014) pillar of modern physics is the falsifiability of mathematical models against empirical data. Again, in practice the hypotheses can only be confirmed at some confidence level, yet we have to assume that this is solely a result of our own limitations. In order for the experiments to be conclusive and repeatable, we have to assure that they are free—that is, the input cannot be correlated with the studied physical system until the experiment is actually performed. The pertinence of this assumption has been recognised only recently, on the occasion of the Bell tests (Aspect 2015; Hall 2010; Rauch 2018) demonstrating the predictive supremacy of quantum mechanics over “hidden variables” explanations. Heuristically, it has been voiced by Stephen Hawking and George F.R. Ellis: “[...] the whole of our philosophy of science is based on the assumption that one is free to perform any experiment.” (Hawking and Ellis 1973, p. 189)

2 The Inconsonance of Experimental Physics

2.1 Experiments and Models

The central requirement imposed on any valid experiment is that it can be understood and reproduced by any other experimenter. This demand is conveniently formalised within the assumption of intersubjectivity in the sense of Ajdukiewicz (1949, 1978). An experiment consists of two sets of data: an input \(d_{\text {in}}\) and an output \(d_{\text {out}}\), along with an unambiguous instruction on how to conduct the experiment. Both the data and the instruction ought to be intersubjectively communicable. In practice, it is of course never possible to reproduce an experiment ideally. Yet, the assumption of intersubjectivity guarantees that there are no a priori limits to doing so. In other words, our empirical knowledge is limited solely by the amount of available resources, such as time, energy or memory.

Any set of experimental data must eventually be expressed as a finite combination of some intersubjective symbols. Hence, without loss of generality, we can assume that \(d_{\text {in}}\) and \(d_{\text {out}}\) are finite sets of bits, as any more complicated universal description of the data sets can eventually be rewritten in the binary form.

Every experiment is associated with some events and a physical system F. Among the relevant events are the ones associated with the preparation of the experimental setup (the input \(d_{\text {in}}\)) and the ones corresponding to the response of the apparatus (the output \(d_{\text {out}}\)). Let us observe that the distinction between the input and the output requires some concept of time. Indeed, the input always comes before the output and, more generally, the experimental instruction always has an inbuilt time-ordering. In the same vein, in order to conceive two or more independent experiments one needs some notion of space. These are, however, primitive notions—not related to any specific model of spacetime—and it is in this sense that they should be understood here.

The experimental data alone is but a set of symbols and does not have any epistemic value. It must be interpreted within some theoretical scheme in order to provide information about the chosen physical system F and the relevant events. A theoretical model \({\mathcal {M}}\) is a mathematical formalism, which describes the system F (or, more precisely, a chosen aspect of F) through states \(\omega\) from the set \(\varOmega\) of all possible states. A state represents the full knowledge about the system, within a given formalism. To every event related to F, a model \({\mathcal {M}}\) associates a fixed state \(\omega\) (or, more precisely, an equivalence class of states which result in the same event).

A given model \({\mathcal {M}}\) thus provides explanations of the past events pertaining to F and implies correlations among them. For instance, a classical dynamical model explains a set of subsequent events \(p_1,p_2,p_3,\ldots\) in terms of a causal chain induced by a dynamics of states \(\omega _1 \rightarrow \omega _2 \rightarrow \omega _3 \rightarrow \ldots\) of the system F. On the other hand, a quantum model implies that two events p and q of simultaneous measurements on two particles are correlated, because the pair was in an entangled state.

Yet, any scientific model must not only provide explanations of past events, but also provide predictions about some future events. Indeed, the very idea of falsification is founded on the possibility of comparing the experimental outcomes with the theoretical predictions.

A sharp prediction takes the form of a conditional claim concerning two datasets \(d_{\text {in}}^p\), \(d_{\text {out}}^p\): “If \(d_{\text {in}}^p\) was input and \({\mathcal {M}}\) is valid, then \(d_{\text {out}}^p\) ought to be registered.” In such a case a single experiment with \(d_{\text {in}}= d_{\text {in}}^p\), but \(d_{\text {out}}\ne d_{\text {out}}^p\), would be sufficient to falsify the model \({\mathcal {M}}\). In general, one formulates more modest statistical predictions. These are expressed as conditional probabilities \(P(d_{\text {out}}^p \, \vert \, d_{\text {in}}^p)\) and they require multiple independent experiments with \(d_{\text {in}}= d_{\text {in}}^p\) in order to validate a statistical prediction of \({\mathcal {M}}\) at a prescribed confidence level, which we deem satisfactory. Let us stress that at this level of formalisation it is irrelevant whether a model \({\mathcal {M}}\) is fundamentally probabilistic—as quantum models are—or effectively statistical—as a result of ignorance of some of the system’s aspects, for instance its microscopic structure and/or interactions with an environment.
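As a schematic illustration of such a statistical test (a minimal sketch with purely made-up numbers, not tied to any specific experiment), one may compare the observed frequency of \(d_{\text {out}}^p\) in repeated runs with fixed \(d_{\text {in}}= d_{\text {in}}^p\) against the probability predicted by \({\mathcal {M}}\):

```python
from math import sqrt, erf

# A schematic sketch (illustrative numbers only): compare the observed count of
# d_out = d_out^p in n independent runs with fixed d_in = d_in^p against the
# probability predicted by a model M, at a prescribed confidence level.
def consistent_with_model(k_observed, n_runs, p_predicted, confidence=0.95):
    """Two-sided test via the normal approximation to the binomial distribution.

    Assumes 0 < p_predicted < 1 and n_runs large enough for the approximation.
    """
    mean = n_runs * p_predicted
    std = sqrt(n_runs * p_predicted * (1.0 - p_predicted))
    z = abs(k_observed - mean) / std
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0))))
    return p_value > 1.0 - confidence

# e.g. M predicts P(d_out^p | d_in^p) = 0.75; d_out^p was registered in 690 of 1000 runs
print(consistent_with_model(690, 1000, 0.75))   # False: M is rejected at the 95% level
```

The normal approximation here merely stands in for whatever statistical machinery one prefers; the essential point is that acceptance or rejection always happens at a conventionally chosen confidence level.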

Any two competing models \({\mathcal {M}}\) and \({\mathcal {M}}'\) of a given phenomenon ought to be discernible, \({\mathcal {M}}\not \equiv {\mathcal {M}}'\), that is, there must exist at least one experimental setting for which \(P(d_{\text {out}}^p \, \vert \, {\mathcal {M}}, d_{\text {in}}^p) \ne P(d_{\text {out}}^p \, \vert \, {\mathcal {M}}', d_{\text {in}}^p)\). Let us note that there could be two models which offer different explanations of some past events, but have the same set of predictions. Then, they are not experimentally discernible. Consequently, a model \({\mathcal {M}}\) is operationally defined by its set of predictions \({\mathcal {M}}:=\{P(d_{\text {out}}^{p,(i)} \, \vert \, {\mathcal {M}}, d_{\text {in}}^{p,(j)})\}_{i,j}\), where i and j are indices parametrising the possible outputs and inputs, respectively.
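The operational notion of a model can thus be pictured as a table of predictions; a toy sketch of the discernibility criterion, with made-up entries:

```python
# A sketch of a model as its table of predictions {P(d_out^(i) | d_in^(j))}.
# Two models are discernible iff the tables differ for at least one setting.
# (The labels and probabilities below are purely illustrative.)
ModelM  = {("in0", "out0"): 0.50, ("in0", "out1"): 0.50,
           ("in1", "out0"): 0.85, ("in1", "out1"): 0.15}
ModelM2 = {("in0", "out0"): 0.50, ("in0", "out1"): 0.50,
           ("in1", "out0"): 0.75, ("in1", "out1"): 0.25}

def discernible(m1, m2, tol=1e-9):
    """True iff some (d_in, d_out) setting is assigned different probabilities."""
    settings = set(m1) | set(m2)
    return any(abs(m1.get(s, 0.0) - m2.get(s, 0.0)) > tol for s in settings)

print(discernible(ModelM, ModelM2))   # True: the setting d_in = "in1" discriminates them
```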

2.2 Model Testing

For an experiment to be trustworthy, one has to warrant that the input data is free, that is, independent of the history of the physical system F at hand. Concretely, we have to ensure the statistical independence of \(d_{\text {in}}\) and the past states \(\varOmega _{\text {past}}= \varOmega _{\text {past}}({\mathcal {M}})\) pertaining to the system. In other words, a model \({\mathcal {M}}\) must not imply any correlations between \(\varOmega _{\text {past}}\) and the future events related to \(d_{\text {in}}\). For if it did, it would induce a statistical bias in what could be tested and how, hence a priori excluding a part of physical reality from our cognition. If multiple experiments with different \(d_{\text {in}}\)’s are performed, so that \(P(d_{\text {in}})\) can be defined a posteriori, we can express the demand of freedom as \(P(d_{\text {in}}\, \vert \, \varOmega _{\text {past}}) = P(d_{\text {in}})\).
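Operationally, the demand of freedom is a statement of statistical independence, which could in principle be checked against data; a minimal sketch, with toy labels for past states and binary inputs:

```python
from collections import Counter, defaultdict

# A schematic check of the freedom condition P(d_in | Omega_past) = P(d_in):
# estimate both sides from a sample of (past_state, d_in) pairs and compare.
# (The sample below is toy data for illustration only.)
samples = [("w1", 0), ("w1", 1), ("w2", 0), ("w2", 1),
           ("w1", 1), ("w2", 0), ("w1", 0), ("w2", 1)]

marginal = Counter(d for _, d in samples)
n = len(samples)
print("P(d_in):", {d: c / n for d, c in marginal.items()})

by_state = defaultdict(Counter)
for w, d in samples:
    by_state[w][d] += 1
for w, counts in sorted(by_state.items()):
    total = sum(counts.values())
    print(f"P(d_in | {w}):", {d: c / total for d, c in counts.items()})
```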

Any experiment has to involve at least one free bit of input data, as the experiment might or might not be actually performed. Note that “experiment not done” is indeed a piece of intersubjective information, which corresponds to some definite event – for instance, a publication in a report. In this case, there is no output and there are no output-related events, as “Unperformed experiments have no results” (Peres 1978). On the other hand, if the experiment was done, \(d_{\text {out}}\) must take a definite value. For even if the experimental apparatus did not register any signal, this mere fact corresponds to an intersubjective bit of information (cf. Peres and Terno 2004).

Let us now fix a model \({\mathcal {M}}\) of a chosen physical system F. Let us also suppose that F is embedded in an environment E, which is not modelled, but does affect the experimental outcomes. This simply signifies that \({\mathcal {M}}= {\mathcal {M}}(F;E)\) includes some noise and/or free parameters. On the physical side, it means that F did and does interact with E.

Suppose now that an experiment with some \(d_{\text {in}}\) and \(d_{\text {out}}\) has been performed, to check the validity of \({\mathcal {M}}\). Since \(d_{\text {in}}\) constitutes some intersubjective information, it must be physical, hence it had to be supported (or ‘written’) in some form of ‘matter’ G. Whatever model of G we might consider, G must be allowed to interact with the studied physical system F, because it actually did so in the experiment just performed. If G were a part of F, then \(d_{\text {in}}\) could not have been free, hence G has to be included in the environment part.

However, any scientific model \({\mathcal {M}}\) ought to be universal, that is, the laws governing a chosen system F should also apply to its environment E. For instance, in order to test an atomic model it is desirable to detach a single atom from a larger molecule and shield it from its influence. Yet, we assume that the tested model eventually applies equally well to the atoms within the molecule.

Consequently, in order to guarantee the freedom of the experimental input, we need to assume that G is a part of the total system \(F+E\) not modelled within \({\mathcal {M}}\). But G is physical, as it carries information, so it is in principle modellable. One can thus seek an extended model \({\overline{{\mathcal {M}}}}\), which would encompass both F and G in abstraction from some other environment \(E'\). However, if we succeed in providing such an extended model \({\overline{{\mathcal {M}}}}\), we have to recognise that the entire ‘experiment’ designed to test the validity of \({\mathcal {M}}\) was actually a natural, that is modellable, phenomenon. Consequently, even if no correlations between \(d_{\text {in}}\) and \(\varOmega _{\text {past}}\) were assumed within \({\mathcal {M}}\), there exists an extended model \({\overline{{\mathcal {M}}}} = {\overline{{\mathcal {M}}}}(F,E)\) providing a natural explanation of the ‘experiment’ and hence a statistical dependence between \(d_{\text {in}}\) and \({\overline{\varOmega }}_{\text {past}} = {\overline{\varOmega }}_{\text {past}}({\overline{{\mathcal {M}}}})\).

Therefore, in order to guarantee the freedom of \(d_{\text {in}}\), we have to assume its independence in any conceivable model

$$\begin{aligned} P(d_{\text {in}}\, \vert \, \varOmega _{\text {past}}({\mathcal {M}})) = P(d_{\text {in}}), \quad \text { for all } {\mathcal {M}}. \end{aligned}$$

In other words, the experimental input \(d_{\text {in}}\) has to be random (cf. Colbeck and Renner 2012). But this means that the experimental input \(d_{\text {in}}\), which does physically affect the system F, is not modellable, even in principle. Consequently, after every experiment some aspect of the physical system F changes in a completely uncontrollable way.

Let us emphasise that these rather dramatic consequences are completely independent of the actual outcomes of the experiment. They stem from the very fact that an experiment was performed!

This is what we might call the experiment paradox: we must assume that the experimental input is free in order to perform credible tests of theoretical models, but then we agree that any experiment physically changes the system in an unintelligible way.

Before we discuss the methodological and philosophical consequences of this paradox, let us illustrate its manifestations in well-established physical models.

3 Faces of the Experiment Paradox

3.1 The Measurement Paradox in Quantum Mechanics

The indeterministic nature of the measurement process in quantum mechanics has been a major source of philosophical controversies for almost a century (Born 1926) (cf., for instance, Landsman 2017, Chapter 11, for a modern discussion). Its essence is illustrated in the diagram in Fig. 1.

Fig. 1: Let F be a physical system described by the quantum state \(\rho\) and suppose that we choose to measure an observable A, according to our free input \(d_{\text {in}}\). Then, the standard von Neumann postulate of quantum mechanics implies that after the measurement the system’s state jumps abruptly to \(\rho '\)—one of the (pure) eigenstates of the observable A. Such an abrupt collapse can be given a natural explanation by embedding F in a (quantum) environment E and invoking the formal equivalence of projective measurements on F with a unitary evolution on \(F \otimes E\) (von Neumann 2018). But then, the entire system \(F \otimes E\) undergoes a unitary evolution and no definite outcome \(d_{\text {out}}\) is ever produced.

There are many competing theoretical proposals on how to solve the measurement problem. One major branch of research relies on the concept of decoherence (Zurek 2003; Schlosshauer 2005), either through the scheme of quantum Darwinism (Zurek 2009) or the spectrum broadcast structure (Horodecki et al. 2015; Mironowicz et al. 2017). Another approach seeks to relax the assumption about the unitarity of the quantum states’ evolution, which could facilitate some “objective collapse” mechanism (Bassi et al. 2013). However, all these efforts are aimed at the “small” rather than the “big” measurement problem (Pitowsky 2006; Bub and Pitowsky 2010) (see also Bub 2015; Landsman 2017). The former asks how a quantum superposition evolves into a classical statistical mixture, while the latter consists in the puzzle of how a single outcome is eventually chosen from a statistical mixture.

Both the small and the big measurement problems involve some kind of information loss. In the first case it concerns the relative phases between the quantum states; in the second, the discarded classical elements of the statistical mixture. It seems therefore natural to argue that quantum mechanics is simply incomplete as a theory of natural phenomena. This point, in the context of composite—entangled—systems, has been the topic of the famous Einstein–Bohr debate (Einstein et al. 1935; Bohr 1935).

The question whether there could be a hidden local realistic theory behind quantum mechanics has been turned into a testable prediction by means of Bell’s theorem (Bell 1964). In the simplest scenario (Clauser et al. 1969) two parties perform simultaneous measurements on two particles from an entangled pair. If the measurement outcomes are determined by a “hidden variable” \(\lambda\), then a certain measure of correlations S between the outcomes cannot exceed the value of 2. On the other hand, the quantum formalism implies that S can be as large as \(2\sqrt{2}\) (Cirel’son 1980).
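A back-of-the-envelope numerical sketch of the quantum violation (purely illustrative code, not tied to any particular experiment): for the singlet state, quantum mechanics predicts the correlation \(E(a,b) = -\cos(a-b)\) for spin measurements along in-plane angles a and b, and the angles below are chosen to saturate Tsirelson’s bound.

```python
import numpy as np

# Quantum correlation for the singlet state, measurements along angles a and b.
E = lambda a, b: -np.cos(a - b)

def chsh(a, a2, b, b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return E(a, b) - E(a, b2) + E(a2, b) + E(a2, b2)

# Angles saturating Tsirelson's bound 2*sqrt(2); any local hidden-variable
# model constrains |S| <= 2.
S = chsh(0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4)
print(f"|S| = {abs(S):.4f}  (classical bound: 2, Tsirelson bound: {2 * np.sqrt(2):.4f})")
```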

Since the 1970s, multiple “Bell tests” on different physical systems have shown, with a high statistical significance, that the experimental value of S can easily exceed the classical bound and approach the quantum limit (Aspect 2015). However, as recognised already by John Bell himself, any Bell test relies on three external assumptions called “loopholes”. Two of them, “locality”, imposing spacelike separation of the two measurements, and “fair sampling”, requiring sufficiently sensitive detection devices, were successfully closed very recently (Hensen et al. 2015; Giustina et al. 2015; Shalm et al. 2015; Rosenfeld et al. 2017).

Yet, there exists a third loophole, the “freedom of choice”, which assumes that the settings of the parties’ devices are uncorrelated with the hidden variable governing the outcomes. It is well known (Brans 1988; Hall 2010) that by relaxing this assumption one can easily attain, and even surpass, the quantum bound on S with a local realistic model. This loophole can be mitigated, for instance through explicit human choices (The BIG Bell Test Collaboration 2018) or signals from remote quasars (Rauch 2018), but it cannot be ultimately overcome (Hall 2016). Its status is therefore metaphysical rather than physical, as realised already by Bell (2001).

In conclusion, for any experiment demonstrating the validity of quantum mechanics one can find a local (or nonlocal, as in the case of Bohmian mechanics) realistic model, which explains its outcomes. Consequently, there exists no unequivocal way to prove that random events exist in nature. We note, however, that such models are operationally equivalent, because they offer exactly the same predictions. On the other hand, the quantum predictions can be conveniently regarded as conditional statements of the form: “If the experimental input was random, then the output will also be random”. This viewpoint is adopted, for instance, by Conway and Kochen in their “Free will theorem” (Conway and Kochen 2006, 2009). It can also be quantified in the “randomness amplification” schemes (Colbeck and Renner 2012; Brandão et al. 2016).

The (“big”) measurement problem in quantum mechanics is in fact an instance of the experiment paradox. It arises primarily because of the random input \(d_{\text {in}}\), which disturbs the system F in an uncontrollable way. The consequence of the latter is an unavoidable indeterminacy of some of the measurement outcomes. This harmonises with the “randomness amplification” viewpoint on quantum experiments. It can always be avoided by questioning the freedom of, at least a part of, the experimental input \(d_{\text {in}}\). However, the general experiment paradox also induces a second riddle—the “preparation paradox”, to which we now turn.

3.2 The Preparation Paradox

In contradistinction to the measurement paradox, the preparation paradox is present not only in quantum theory. A convenient universal framework for both classical and quantum mechanics uses the algebraic language (Strocchi 2008; Keyl 2002) of states, which encode the properties of a given physical system F, and observables, which represent measurable physical quantities. Any observable A has a spectrum \({\mathrm {sp}}(A) \subset {\mathbb {R}}\), that is, a set of possible measurement outcomes, and any state \(\omega\) defines a probability distribution \(\mu _{\omega ,A}\) over the set \({\mathrm {sp}}(A)\). Models of physical phenomena are formulated in terms of dynamical equations

$$\begin{aligned} f(\omega (t),t) = 0, \quad \text {for} \quad t \in [0,T], \end{aligned}$$
(1)

where t is a time parameter and f is a functional (typically, a linear differential operator) acting on the space of states. Such a model specifies the time-evolution of the system’s state \(\omega (t)\) from an initial condition

$$\begin{aligned} g(\omega (t),t) \vert _{t=0} = 0, \end{aligned}$$
(2)

determined by a (collection of) functionals g. A typical example is a first order partial differential equation with an initial condition \(\omega (0) = \omega _0\).

The predictions of the model (1) are then formulated as follows: If the system was initially described by (2) and an observable A was measured at a time \(t > 0\), then an outcome \(a\in {\mathrm {sp}}(A)\) will be obtained with probability \(\mu _{\omega (t),A}(a)\). Hence, to test a model determined by equation (1) one has to prepare the system in an initial condition (2) and then measure an observable A at some time \(t \in (0,T]\). Multiple experiments with different inputs \(g_1, g_2, \ldots , g_x\) would tell us whether the predicted probabilities match the observed ones.
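A minimal concrete instance of this testing scheme (an illustrative quantum example rather than the general algebraic setting; the Hamiltonian, initial state, observable and measurement time below are arbitrary choices): take equation (1) to be the Schrödinger equation for a qubit with Hamiltonian \(H = \sigma_x\), the initial condition (2) to be the state \(|0\rangle\), and the observable \(A = \sigma_z\) measured at a freely chosen time t.

```python
import numpy as np

# Illustrative instance of Eqs. (1)-(2): a qubit evolving under the Schrodinger
# equation with H = sigma_x (hbar = 1), prepared at t = 0 in |0>, with the
# observable A = sigma_z measured at a freely chosen time t.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

omega0 = np.array([1.0, 0.0], dtype=complex)      # initial condition (2): state |0>
t = 0.7                                           # freely chosen measurement time

# solve (1): omega(t) = exp(-i H t) omega(0), via the eigendecomposition of H
eigvals, eigvecs = np.linalg.eigh(sigma_x)
U = eigvecs @ np.diag(np.exp(-1j * eigvals * t)) @ eigvecs.conj().T
omega_t = U @ omega0

# predicted distribution mu_{omega(t), A} over sp(A) = {+1, -1} (Born rule)
a_vals, a_vecs = np.linalg.eigh(sigma_z)
probs = {float(a): abs(a_vecs[:, i].conj() @ omega_t) ** 2 for i, a in enumerate(a_vals)}
print(probs)   # to be compared with the observed frequencies over many runs
```

The predicted distribution \(\{\cos^2 t,\ \sin^2 t\}\) over \({\mathrm {sp}}(\sigma_z) = \{+1,-1\}\) is then to be confronted with the observed frequencies, as described above.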

Note that the experimental input listed above is indeed free within model (1), because the latter specifies neither the initial conditions (the model does specify the admissible form of the initial conditions, but not their numerical values) nor the observable A and the measurement time t. However, we have to admit that the studied system had been in some state \(g_0\) (for instance, the vacuum state) before it was prepared by the experimentalist.

This means that in every experiment the state of the studied system F changes from \(g_0\), in which it would have been had the experiment not been performed, to a prepared state \(g_i\). Clearly, such a preparation procedure is physical, as it effectuates a physical change in the system F. Yet, this change is not modelled within (1), for if it could have been modelled within (1), then the entire “experiment” would have been a natural evolution of the system, rather than a valid test.

To understand the ‘resetting problem’ we could construct an extended model

$$\begin{aligned} f(\omega (t),t) = 0, \quad \text {for} \quad t \in [t_0,T], \end{aligned}$$
(3)

describing the evolution of the system from some earlier time \(t_0 < 0\) until T. Then, we assume that its dynamics has been disturbed

$$\begin{aligned} f(\omega (t),t) = j(t), \quad \text {for} \quad t \in [t_0,0), \end{aligned}$$
(4)

with a suitable source j, so that the desired condition (2) at \(t=0\) is met, regardless of the system’s primordial initial conditions at \(t=t_0\). But, clearly, models (3) and (4) are different and the introduction of a source term forces us to change the model. Had the experiment not been performed, the system would evolve according to equation (3) rather than (4). If, on the other hand, we attempt to model the source itself, we lose (or rather, shift to another level of complexity) its tunability—hence the ‘preparation paradox’ (see Fig. 2).
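The mechanism can be illustrated with a toy relaxation model (a sketch under the assumed dynamics \(f(\omega(t),t) = \dot{\omega}(t) + \gamma\,\omega(t)\); the particular choices of f and of the source j are purely illustrative): a constant source acting on \([t_0, 0)\) steers the state to the desired initial condition at \(t=0\) regardless of the primordial state at \(t_0\), after which the undisturbed model (1) takes over.

```python
import numpy as np

# Toy sketch of Eqs. (1)-(4) with assumed dynamics dw/dt = -gamma*w + j(t).
gamma, dt = 1.0, 1e-3
t0, T = -10.0, 5.0
w_target = 2.0                      # desired initial condition (2): w(0) = 2

def evolve(w, t_start, t_end, source=lambda t: 0.0):
    """Euler-integrate dw/dt = -gamma*w + j(t) from t_start to t_end."""
    for t in np.arange(t_start, t_end, dt):
        w += dt * (-gamma * w + source(t))
    return w

j = lambda t: gamma * w_target      # constant source acting on [t0, 0), Eq. (4)

for w_primordial in (-3.0, 0.0, 7.0):          # unknown primordial state at t = t0
    w_at_0 = evolve(w_primordial, t0, 0.0, j)  # 'preparation' phase, Eq. (4)
    w_at_T = evolve(w_at_0, 0.0, T)            # undisturbed model, Eq. (1)
    print(f"w(t0) = {w_primordial:+.1f} -> w(0) = {w_at_0:.3f} -> w(T) = {w_at_T:.3f}")
```

The point of the sketch is that the very source term which makes the preparation possible is absent from the undisturbed model (3).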

Fig. 2: The preparation paradox is the ‘mirror’ version of the measurement paradox. Suppose that we have obtained an output \(d_{\text {out}}\) from an experiment performed on the system F. By seeking a ‘purely natural’ explanation of \(d_{\text {out}}\) we have to embed F in an environment E, the interaction with which caused \(d_{\text {out}}\). But then no \(d_{\text {in}}\) has ever occurred and we have never actually prepared the system in any way.

The preparation paradox applies equally well to classical and quantum mechanics. In the latter case one can take, for instance, a model (1) with a Schrödinger or von Neumann equation describing the dynamics of a quantum state. We are free to prepare the initial quantum state of the system—for instance, through a projective measurement—but we have to assume that the tested dynamical equation does not model the entire preparation process. The same line of reasoning could be followed in the Heisenberg picture, in which the system’s state remains steady, but the observables evolve in time. Let us also note that the source term j need not be a function—it could be, for instance, a time-dependent Hamiltonian appended to the Schrödinger equation of a given system.

In conclusion, regardless of whether the theory entails that the measurement—i.e. information acquisition—disturbs the system or not, the preparation procedure is always invasive.

Preparing a system “from outside” is not problematic if we consider a restricted model, say, of a bouncing basketball. Obviously, one does not expect such a model to say anything about the motion of our hand, which prepares the initial state of the ball. The problems start if the tested model is universal, as we expect a law of physics to be. If we want to test, for instance, Newton’s law of universal gravitation, we have to assume that it does not model the entire preparation procedure. Consequently, there are phenomena to which it does not apply and hence it is not universal.

Typically, the experimental outcomes depend very weakly on how the system has been prepared. The ‘triggering effects’, that is, the details of the source j, can usually be reduced below the noise level shaped by the uncontrolled interaction of the system with its environment. The paradox is, however, more salient in the cosmological context, to which we now turn.

3.3 The Cosmological Paradox

Modern cosmological models are formulated in the framework of field theory. Let us emphasise that the fields do not evolve per se—a solution to the field equations specifies the field content in the entire spacetime. Therefore, any disturbance coming ‘from outside’ would effectuate a global change. In other words, a local terrestrial experiment affects both future and past states of the Universe (see Fig. 3). Note also that cosmological observations are indeed genuine experiments for, firstly, they might, but need not, be effectuated and, secondly, they involve a number of free input data, such as the telescope’s location and direction or its sensitivity range in the electromagnetic spectrum.

Fig. 3: The conformal diagram for the Minkowski spacetime. The field content of the entire spacetime is uniquely determined by initial data imposed on a Cauchy hypersurface S. Consequently, a free intervention effectuated in the region K induces a change both in the causal future \(J^+(K)\) and the causal past \(J^-(K)\) of K. More generally, the outer diamond could serve as an illustration for the maximal Cauchy development of the hypersurface S.

As an illustration, let us consider a cosmological model based on Einstein’s equations

$$\begin{aligned} G_{\mu \nu } = \tfrac{8 \pi G}{c^4} T_{\mu \nu }, \end{aligned}$$
(5)

with a matter energy–momentum tensor \(T_{\mu \nu }\) (possibly including the “dark energy”, i.e. the cosmological constant term \(\varLambda g_{\mu \nu }\)). The geometrical Bianchi identity \(\nabla ^{\mu } G_{\mu \nu } = 0\) implies the local covariant conservation of energy and momentum \(\nabla ^{\mu } T_{\mu \nu } = 0\) (Wald 1984). But, if an ‘external’ source term \(j_{\nu }\) is introduced into (5), the conservation law is violated, \(\nabla ^{\mu } T_{\mu \nu } = -j_{\nu }\), explicitly breaking general covariance. In other words, if one introduces into the universe some information which was not there, one creates ex nihilo a local source of energy–momentum.
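One way to make this step explicit (a schematic reconstruction; the argument does not fix the precise form of the coupling) is to collect the ‘external’ disturbance into a tensor \(J_{\mu \nu }\) on the right-hand side of (5) and take the covariant divergence:

$$\begin{aligned} G_{\mu \nu } = \tfrac{8 \pi G}{c^4} T_{\mu \nu } + J_{\mu \nu }, \qquad \nabla ^{\mu } G_{\mu \nu } = 0 \;\; \Rightarrow \;\; \nabla ^{\mu } T_{\mu \nu } = - \tfrac{c^4}{8 \pi G} \, \nabla ^{\mu } J_{\mu \nu } =: -j_{\nu }. \end{aligned}$$

The Bianchi identity thus leaves no room for a non-conserved disturbance without spoiling the local conservation law stated above.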

In quantum field theory, while the energy and momentum need not be conserved locally, the suitable expectation values ought to be conserved. Concretely, if \({\hat{T}}_{\mu \nu }\) is the energy–momentum operator constructed from quantum matter fields, then

$$\begin{aligned} \nabla _{\mu } \langle \psi \vert {\hat{T}}^{\mu \nu } \vert \psi \rangle = 0 \end{aligned}$$
(6)

should hold (Bertlmann 2000) for any state vector \(\vert \psi \rangle\). The introduction of a (possibly quantum) source \({\hat{j}}\) violates the constraint (6), leading to the Einstein anomaly and, eventually, to the breakdown of general covariance (Bertlmann 2000) (see also Bednorz 2016).

In order to perceive the experiment paradox from the perspective of ‘cosmic evolution’ we firstly need to choose a time function—that is, an observer—which fixes an effective splitting of the global spacetime into space and time (Wald 1984). Secondly, one has to assure that equations (5) allow for a well-defined Cauchy problem (Ringström 2009). The latter consists in imposing initial data on a time-slice, say at the observer’s time \(t=0\), and studying its (maximal) hyperbolic development (see Fig. 3). This guarantees that both past and future field configurations are uniquely derived from the imposed initial data. The objectivity of the evolution is guaranteed by general covariance, which enables an unequivocal transcription of the time-slice field configurations for different observers.

Now, a free perturbation, that is, an abrupt change of the initial data on a time-slice, in a region K of space inflicts a change in both the causal future \(J^+(K)\) and the causal past \(J^-(K)\) of K. The problem persists in the context of quantum field theory, because of the “time-slice axiom” (Haag 1996). This is independent of the fact that projective measurements are as harmful to quantum field theory as they are to the non-relativistic quantum theory.
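A toy numerical illustration of this point (a sketch using the 1D wave equation, which shares the relevant hyperbolic Cauchy structure; the data and the perturbation region K are made up): the same slice data determine the solution at negative and positive times, so a bump added in K shifts the field values both in \(J^+(K)\) and in \(J^-(K)\).

```python
import numpy as np

# Sketch: initial data for the 1D wave equation u_tt = u_xx on the slice t = 0
# determine the solution at all times, past and future, via d'Alembert's formula.
def dalembert(phi, x, t):
    """u(x, t) = [phi(x - t) + phi(x + t)] / 2, assuming zero initial velocity."""
    return 0.5 * (phi(x - t) + phi(x + t))

bump = lambda x, c, w: np.exp(-((x - c) / w) ** 2)   # localized profile

phi0 = lambda x: bump(x, 0.0, 1.0)                    # original slice data
phi1 = lambda x: phi0(x) + 0.1 * bump(x, 5.0, 0.2)    # data perturbed in K ~ {x around 5}

x_future, x_past = 7.0, 3.0   # points in J+(K) and J-(K), respectively
for t in (+2.0, -2.0):
    x = x_future if t > 0 else x_past
    print(f"t = {t:+.1f}, x = {x}: "
          f"unperturbed u = {dalembert(phi0, x, t):.4f}, "
          f"perturbed u = {dalembert(phi1, x, t):.4f}")
```

Running the sketch shows that the perturbed and unperturbed values differ both at t = +2 (a point in \(J^+(K)\)) and at t = -2 (a point in \(J^-(K)\)).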

As an illustration, suppose that we put a satellite into orbit to test the validity of a cosmological model. Clearly, the matter distribution on a time-slice is slightly different with the satellite and without it. Then, if F is taken literally to be the whole universe, we would have to adjust the past states of the universe to match its present state with the satellite, which effectively means changing the model. Alternatively, one could maintain that the satellite was predetermined to be put into orbit, which means that \(G(d_{\text {in}}) \subset F\) and we could not have refrained from performing the ‘experiment’.

Such absurd conclusions can be avoided by recognising that it is more appropriate to say that modelling always involves some aspect of the system and not the system itself. In the cosmic context, one seeks to model the global properties of the universe, while neglecting its microscopic details. The latter could then be treated as the “environment” E, in which the consequences of the experiment paradox could be hidden. Indeed, if one adopts a coarse-grained model of matter, say with a pixel the size of a galaxy, then obviously the presence or absence of a satellite in Earth’s orbit does not make any difference.

The cosmic example illustrates again that the experiment paradox is not about the unavoidable presence of systematic errors in any experiment, but about the existence of consistent foundations of natural sciences.

4 Conclusions and Consequences

The experiment paradox shows that the concepts of universal laws of physics and testability are not compatible. There are two obvious ways to overcome it:

The first route, discussed already in the cosmological context, consists in admitting that any model-testing procedure is in fact illusory. This inevitably pushes one into a superdeterministic position (cf., for instance, ’t Hooft 2017), which reduces the whole scientific discourse to passive observation (see Fig. 4). Apart from resting on dubious philosophical foundations, such a standpoint raises pertinent practical questions about how science should be done. If one is not allowed to trust any experiment, then what should be the basis of a model’s assessment? And why, after all, do some models accurately predict future events, while others do not?

Fig. 4: One could maintain that we never actually prepare physical systems or perturb them—we just observe them evolving without effectuating any disturbance. But such a superdeterministic viewpoint excludes a priori the possibility of any interaction with the ‘physical world’; in particular, it disallows any experiments. This firstly annihilates the explanatory power of science and, secondly, is highly impractical, for it excludes a priori the existence of devices functioning according to our inputs.

The second way out of the paradox is to question the universality of models, which we invoked to argue that any theory modelling a chosen aspect of physical reality must eventually apply equally well to the studied system and its environment. This could be done, for instance, by claiming that the laws of physics are only an effective concept (cf. Wheeler 1983), which does not reflect the true nature of things, if the latter exist at all. Such a standpoint forces us to discard some aspects of the natural world from scientific cognition (Deutsch 1986) and thus undermines the objectivity of the scientific discourse. Furthermore, it fails to explain why our present physical theories are so successful.

Both these solutions essentially boil down to a radical version of the underdetermination of scientific theories, which eventually challenges the very rationality of science (Stanford 2017). While logically consistent, such a viewpoint does not grasp the full complexity of the scientific discourse. Indeed, the scientific method has been crystallising over the centuries rather than starting from fixed methodological principles. Given the multitude of different contexts—ranging from observational cosmology to quantum engineering (see Fig. 5)—it seems more adequate to adopt a pluralistic perspective (cf. Davies 2010).

Fig. 5: In scientific practice we assume that our interactions with the studied system F encoded in \(d_{\text {in}}\) do not have natural, i.e. modellable, causes and that the obtained information \(d_{\text {out}}\) is always definite and objective. In order to save the model’s consistency we have to warrant that our interventions do not affect the past states of the system F. To this end we need to embed it in a suitable environment E, which absorbs the ‘retrocausal’ effects (\(E \rightarrow {\widetilde{E}}\)) and enables a consistent description of the system’s history by multiple observers. Whether we wish our interventions to affect the system’s future states (\(F \ne F'\), \(E = E'\)) or not (\(F = F'\), \(E \ne E'\)) depends on whether we work in the observational paradigm (as, for instance, in cosmology) or in the engineering one.

The underdetermination of scientific theories is well known, even if not too popular among physicists. However, the standard arguments based on the Quine–Duhem thesis do not exclude the possibility of the existence of an ultimate model of physical reality. Indeed, such a “Theory of Everything” has become a dream of many physicists (Weinberg 1994; Hawking 2006). Even if we are never able to fully understand it, because of the finiteness of the available resources, one could hope that our theoretical models can be consistently expanded (Hawking 2002). The experiment paradox presented here undermines these hopes and casts doubts over the programme of unification of physics, at least in its reductionist version.

This insight uncovers a captivating similarity between theories and experiments. Any formal theory is based on a collection of axioms, from which theorems are deduced. The axioms cannot be ultimately proven true or false—this is a general fact stemming from the limiting theorems for formal systems. In practice, we assume the axioms to be true and check whether such a premise is useful for deriving new results. Similarly, any experiment is based on a set of free bits (the ‘input’), which facilitate an explanation—within a theoretical scheme—for the ‘output’. We do not ask whether these bits were ‘truly random’ or not, but rather whether the assumption about their freedom is useful. The pluralistic scheme proposed in Fig. 5 induces a new type of top-down causation—from the observer to the observed system. In the cosmological context it should be minimalistic, but in the engineering paradigm it should play a pivotal role.

The experiment paradox offers a new perspective on the limits of natural sciences, established from inside the scientific paradigm. As such, it may be seen as an analogue of the incompleteness theorems in mathematics. Here, the argument is based on both mathematical modelling and empirical testing. It should be stressed that the experiment paradox is not a consequence of the fact that any experimental procedure is limited by the finiteness of empirical resources. Rather, it is thanks to the finiteness of the available resources, which compels us to adopt the concept of the environment, that we evade the consequences of the paradox in scientific practice.

Because the experiment paradox sets limits on epistemic models, it implies an ‘ontic’ underdetermination of science, one not based on its psychological and sociological aspects. No mathematical model of physical reality can be both arbitrarily exact and complete. This uncertainty renders the effectiveness of natural sciences even more “unreasonable”. Its existence might even provoke us to rethink the necessary preconditions for the very possibility of formulating consistent ontological statements supported by empirical data.