1 Introduction

Any theory of mind that postulates mental states as fundamentally different from physical states has to explain how mental states can—apparently or actually—influence physical states and vice versa. The existence of such a mind–body interaction (MBI) is a natural common-sense belief. However, no theory has been developed to substantiate this belief, and several authors have argued that such interactions are incompatible with the laws of physics. For example:

For apart from the fact that the physical influence of one of these substances on the other is inexplicable, I recognized that without a complete derangement of the laws of Nature the soul could not act physically upon the body. (Leibnitz [1])

If immaterial mind could move matter, then it would create energy; and if matter were to act on immaterial mind, then energy would disappear. In either case energy, a property of all and only concrete things, would fail to be conserved. And so physics, chemistry, biology, and economics would collapse. Faced with a choice between these “hard” sciences and primitive superstition, we opt for the former. (Bunge [2])

How can the nonphysical give rise to the physical without violating the laws of the conservation of mass, of energy and of momentum? (Fodor [3])

As exemplified by the quotes above, the violation of conservation laws is the most commonly used argument against MBI; see Refs. [4,5,6,7] for recent discussions of this particular argument. Arguments about the incompatibility of MBI with physics are more common in non-physics literature, for example, in philosophy of mind or neuroscience. In contrast, there is a strong tradition of physicists arguing against the reduction of the mental to the physical; see Ref. [8] for a recent selection of quotes. This tradition was particularly strong among the founders of quantum theory. Indeed, the Copenhagen interpretation of quantum theory and its variations presuppose the existence of observers with non-trivial abilities, as distinct from any specific quantum system under consideration.

In this paper, we contend that MBI and physics are easily compatible: the conceptual tools for describing MBI are already present in contemporary theoretical physics. This is not to say that MBI is contained within existing fundamental theories of physics, but that those theories have a mathematical structure that can be employed in order to formulate theories with MBI.

We demonstrate this point by the explicit construction of a general mathematical framework for MBI theories. The latter extends existing fundamental physical theories towards the inclusion of mental degrees of freedom. This framework does not depend on specific features or interpretations of quantum theory; in fact, it would also work if matter were described by classical physics. As a result, we provide an explicit counterexample to any claim about fundamental incompatibility between MBI and modern physics.

Contemporary physics is based on a small number of fundamental theories, such as quantum theory and general relativity, that purport to describe all phenomena in their domain. These theories are mathematical: their concepts are represented by mathematical objects, and their principles can be expressed as mathematical axioms. Physical theories place primary emphasis on the causal and structural features of the objects they study, and they can be highly successful even if their underlying ontology is vague or underdeveloped.

Certainly, ontology is important to any physical theory: a theory is not complete unless it has a well-specified ontology. However, when theories are viewed as research programs that develop in time, their early and intermediate stages may lack clear definitions or characterizations of key concepts. A prime example is the notion of force in Newtonian mechanics. Newton himself admitted its ambivalence; it was a topic of debate for more than a century, and the debate laid the ground for the modern notion of the field. However, the mathematical structure of Newtonian mechanics was not affected by different views on the nature of force.

The emphasis on mathematical structure is a crucial factor that could lead to theories of MBI that make empirically testable predictions, even if this requires postponing the answer to questions about the underlying ontology. Our current fundamental theory for the microcosm, quantum mechanics, works this way. Different interpretations of quantum theory postulate radically different ontologies. However, the theory works excellently with only a commitment to a minimal pragmatic interpretation [9]. To quote Dirac [10],

When you ask what are electrons and protons I ought to answer that this question is not a profitable one to ask and does not really have a meaning. The important thing about electrons and protons is not what they are but how they behave, how they move. I can describe the situation by comparing it to the game of chess. In chess, we have various chessmen, kings, knights, pawns and so on. If you ask what a chessman is, the answer would be that it is a piece of wood, or a piece of ivory, or perhaps just a sign written on paper, or anything whatever. It does not matter. Each chessman has a characteristic way of moving and this is all that matters about it. The whole game of chess follows from this way of moving the various chessmen.

We think that the attitude expressed by Dirac could prove very fruitful for the mind–body problem. One should simply substitute the “electrons and protons” in Dirac’s quote with thoughts, qualia or any other mental object. We can even expand on Dirac’s metaphor. A theory with MBI is analogous to a game that is played with two chess-boards, the mental board and the physical board. We should look for a new set of rules that describe how the movements of the pieces on one board are interconnected and affect the movements of the pieces on the other.

To this end, we must first reformulate the existing physical theories in a way that is amenable to such a generalization. We will show that the logical reformulation of physical theories in terms of histories (see Sect. 3 for explanations of the terminology) satisfies this condition. The result is a general mathematical framework for theories that describe MBI. We will refer to this framework as the \(\Psi \Phi\)I formalism, i.e., a formalism for theories with psycho-physical interaction. It turns out that some \(\Psi \Phi\)I-based theories are fully compatible with energy conservation, hence explicitly removing the most common objection to MBI.

\(\Psi \Phi\)I-based theories will involve psycho-physical laws that ‘relate experience to elements of physical theory’ as proposed by Chalmers [11, 12]. However, in contrast to Chalmers’ proposal, the \(\Psi \Phi\)I formalism does not describe the physical world as causally closed. The physical and the mental fully act upon each other.

To clarify this point, we note that the existence of MBI is a stronger assertion than the independent existence of mental states, a position usually referred to as dualism. It is logically conceivable that mental states exist but do not act upon physical states. This position is known as epiphenomenalist dualism or epiphenomenalism. It allows one to preserve the causal closure of the physical world, while avoiding the philosophical problems of positions that deny the existence of mental states. Epiphenomenalism conflicts with our common-sense intuition that our thoughts and feelings make a difference to our behavior. It also conflicts with a key background idea of modern physics, namely that all degrees of freedom have causal powers and influence the rest of the world. \(\Psi \Phi\)I-based theories involve full causal action of the mental degrees of freedom.

The \(\Psi \Phi\)I formalism exemplifies the idea that mind is a substance subject to mathematical laws that are different from laws of physics which describe material substances. Both types of law coexist within a unifying structure, through which MBI is described. This perspective naturally transfers a common and successful practice of theoretical physics to the mind–body problem. Nonetheless, it is rarely brought forward in contemporary discussions. Consider, for example, the following quote by Penrose.

We [...] are part of a universe that obeys, to an extraordinary accuracy, immensely subtle and broad-ranging mathematical laws. That our physical bodies are precisely constrained by these laws has become an accepted part of the modern scientific viewpoint. [...] Many people find the suggestion deeply unsettling that our minds might also be constrained to act according to those same mathematical laws. Yet to have to draw a clean division between body and mind, the one being subject to the mathematical laws of physics and the other being allowed its own kind of freedom, would be unsettling in a different way. For our minds surely affect the ways in which our bodies act, and must also be influenced by the physical state of those same bodies. [...] But if the mind were able to influence the body in ways that cause its body to act outside the constraints of the laws of physics, then this would disturb the accuracy of those purely physical scientific laws. It is thus difficult to entertain the [...] view that the mind and the body obey totally independent kinds of law. Even if those physical laws that govern the action of the body allow for a freedom within which the mind may consistently affect its behaviour, then the particular nature of this freedom must itself be an important ingredient of those very physical laws. Whatever it is that controls or describes the mind must indeed be an integral part of the same grand scheme which governs, also, all the material attributes of our universe. [13]

The realm of the mind is a priori presumed to be either non-mathematical or lawless (“its own kind of freedom”), unless subordinated to the laws of physics. The existence of a mathematical lawfulness for mind that is different from that of matter is not even hinted at as an alternative. This omission is quite common, and, in our opinion, it is very puzzling. After all, the idea of the mathematical lawfulness of the mind or soul originates from the same philosophical tradition as that of the mathematical lawfulness of matter, and it is at least as old as Plato [14].

The structure of this paper is the following. In Sect. 2, we explain how interactions of fundamentally different substances are described in current theories of physics. In Sect. 3, we present the main ideas of histories-based reformulations of physical theories. In Sect. 4, we argue that the natural way to explain the physical correlates of consciousness is to view such correlations as dynamical, i.e., as arising from the interaction of mental and physical degrees of freedom. Then, we show how this interaction is described in the \(\Psi \Phi\)I formalism. In Sect. 5, we discuss the role of energy and information in the \(\Psi \Phi\)I formalism and how they relate to open problems in physics. In the last section, we summarise and discuss our results.

2 General Relativity as a Prototype for MBI

A traditional objection to theories of MBI is that if mental and physical properties are ontologically different, then they lack the commonality that is necessary for interaction. This objection was raised against Cartesian dualism as early as the 17th century. It was a strong argument as long as physical interactions were deemed to be mechanistic and to be based on physical contact of extended objects. Today, this argument has little force, for two reasons. First, Bell’s theorem asserts that it is impossible to describe the fundamental physical interactions mechanistically. Second, physics already contains theories about the interaction between radically different ‘substances’. General Relativity (GR), our current theory of gravity, describes the interaction between spacetime geometry and matter. In this section, we elaborate on the latter point, and we argue that GR provides a methodological template for formulating theories of MBI.

We first explain the meaning of the terms “spacetime geometry” and “matter”. Spacetime geometry is the structure that determines what clocks and rods measure, i.e., spatial and temporal distances between physical events. Events are represented by points on a four-dimensional manifold M. Three coordinates on M refer to space and one coordinate refers to time. In GR, matter can be taken roughly to refer to entities that extend in space and carry energy and momentum. Spacetime geometry and matter are essentially distinct, and their differences are not blurred by the fact that they interact.

General Relativity describes the interaction between matter and geometry by embedding both entities within a broad mathematical framework, namely, Lagrangian field theory (LFT) [15]. LFT was initially conceived as a generalization of classical mechanics for continuous systems. The properties of continuous systems are expressed in terms of classical fields. A classical field is a map \(\phi : M \rightarrow S\), for some set S, i.e., a map that assigns one mathematical object \(\phi (X) \in S\) to each spacetime point X. LFT enables us to describe a system’s dynamics in terms of partial differential equations that are satisfied by the classical fields.

Note that the formulation of GR in terms of LFT was not accidental. LFT originates from the tradition of 19th-century analytical dynamics. The latter was explicitly championed as an abstract theory of dynamics that could be used without commitment to a specific ontology of the microscopic structure of matter and/or the ether [16]. This is the reason why it survived the demise of mechanistic models for matter and ether, and why it remains an indispensable tool of physics.

Spacetime geometry is expressed in terms of a field g, the Lorentzian metric. The metric g incorporates all geometric information in a compressed form. Spatial and temporal distances are obtained by decompressing the information contained in the metric through the solution of the so-called geodesic equations. The correspondence between geometries and metrics is not one-to-one. A metric also carries some non-geometric information about the choice of a coordinate system, with the result that one geometry corresponds to an infinity of different metrics. [Footnote 1]

In contrast, there is a huge loss of information when we use classical fields to describe matter. Fundamentally, matter is described by quantum theory; in the LFT description, quantum effects are ignored (or averaged out). The treatment of matter as continuous at macroscopic scales also implies that the discrete structure at the atomic level is ignored.

The description of matter and geometry in terms of classical fields is suboptimal for both matter and gravity. The field description of geometry contains too much information; the field description of matter contains too little. The LFT formulation of matter-gravity interaction in GR is a working compromise, and not a perfect fit. But it suffices for a formulation of a theory in which ‘spacetime tells matter how to move; matter tells spacetime how to curve’ [17].

The compromises involved in the formulation of GR are anything but benign. They are the source of major problems in any attempt to extend GR, for example, towards a quantum theory of gravity. [Footnote 2] Nonetheless, GR works. It has a consistent and elegant mathematical structure, and it leads to predictions that are in excellent agreement with experiment.

There is also an interesting analogy between the historical development of GR and contemporary discussions of the mind–body problem. Einstein was strongly influenced in the development of GR by Mach’s ideas about the inter-relation between space, time and inertia [21]; he referred to these ideas as Mach’s principle. The main point is that space and time are inextricably linked with the existence of physical bodies; in the absence of bodies there is no space and time. Inertial reference frames originate from distant cosmic masses, and the inertia of individual bodies is determined by all bodies in the Universe. Using contemporary terminology from the philosophy of mind, Mach’s principle entails that spacetime and inertia are supervenient upon matter. However, GR differs from Mach’s principle in that non-trivial spacetime structure can exist independently of a matter source, as is the case for gravitational waves. Furthermore, Mach’s principle suggests that the material degrees of freedom are causally closed, while in GR this is certainly not the case.

GR suggests the following strategy for describing the interaction between two ontologically different entities A and B.

  1. Identify an appropriate mathematical framework \(\text{ Dyn }\) for the description of dynamics (the analogue of LFT for GR).

  2. Represent entity A by mathematical objects \(F_A\) and entity B by mathematical objects \(F_B\), where both \(F_A\) and \(F_B\) fit within the structure of \(\text{ Dyn }\). These representations need not be one-to-one; they may involve either redundancy or information loss.

  3. Identify all possible dynamics in \(\text{ Dyn }\) that involve interaction between \(F_A\) and \(F_B\).

In the present context, we want A to correspond to mental states/processes and B to physical states/processes. We have a very good idea of the mathematical structures involved in B, and, thus, of possible frameworks \(\text{ Dyn }\) compatible with B. In the next section, we will present what we believe to be the most appropriate framework for theories of MBI, namely, histories theory.

3 Histories Theory

In this section, we present the logical reformulation of physical theories that is based on histories, to which we will refer as histories theory. A history is a time-ordered sequence of properties of a physical system. In histories theory, any physical system is described in terms of (i) logical propositions about histories of the system and (ii) the probabilities associated with such propositions. In particular, the notion of dynamics is incorporated into the rule of probability assignment. [Footnote 3] The emphasis on the logical and probabilistic aspects of physical theories makes histories theory particularly suitable for the description of MBI, because propositions and probabilities are ontologically neutral. They can be meaningfully defined also for mental processes.

The idea of translating physics into the language of logical propositions originates from von Neumann and Birkhoff [22]. The idea that dynamics can be incorporated into the probability assignment for histories (in both classical and quantum physics) is due to Wigner and collaborators [23]. The identification of a logical structure for quantum mechanical histories is a key achievement of the consistent/decoherent histories approach to quantum theory by Griffiths, Omnès, Gell-Mann and Hartle [24,25,26,27,28]. This logical structure is the basis of our presentation here [Footnote 4], as it can easily be adapted to any physical theory by using the temporal logic axiomatization developed by Isham [29, 30]. The consistent incorporation of dynamics in histories theory and the analysis of the theory’s temporal structure are due to Savvidou [31, 32]. For the relation of histories theory to stochastic processes, see Ref. [33], and for the histories theory version of GR, see Refs. [34, 35].

3.1 History Propositions

Consider Usain Bolt running a hundred-meter race. He defines a physical system, to which we can assign the following propositions.

  • \(\kappa _t\) = “at time t, Bolt is between the 65- and the 70-meter mark”;

  • \(\lambda _t\) = “at time t, Bolt’s momentum p lies between 3500 kg m/s and 4000 kg m/s”;

  • \(\mu _t\) = “at time t, Bolt’s kinetic energy E lies between 0 kJ and 75 kJ”.

The propositions above refer to a single moment of time t—we take \(t = 0\) to correspond to the start of a 100-meter race. For a specific race and for a specific moment of time t, they may be true or they may be false.

Single-time propositions that refer to a pointlike particle (rather than an athlete of finite size) can be expressed solely in terms of the particle’s position x and momentum p. Each proposition corresponds to a subset C of the state space \(\Gamma\), i.e., a set that consists of points (x, p). The elements of \(\Gamma\) are called classical microstates and they provide the most precise description of the system at one moment of time.
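As an illustration, the following minimal Python sketch treats the three propositions above as membership tests on microstates (x, p). It is only a sketch under stated assumptions: the runner is idealized as a pointlike, non-relativistic particle, the mass value `MASS_KG` is hypothetical, and the names `Microstate`, `kappa`, `lam` and `mu` are chosen here for illustration.

```python
from dataclasses import dataclass

MASS_KG = 94.0  # hypothetical mass of the pointlike runner (an assumption)


@dataclass(frozen=True)
class Microstate:
    """A point (x, p) of the state space Gamma."""
    x: float  # position along the track, in metres
    p: float  # momentum, in kg m/s


def kappa(s: Microstate) -> bool:
    """Position between the 65- and the 70-metre mark."""
    return 65.0 <= s.x <= 70.0


def lam(s: Microstate) -> bool:
    """Momentum between 3500 and 4000 kg m/s."""
    return 3500.0 <= s.p <= 4000.0


def mu(s: Microstate) -> bool:
    """Kinetic energy E = p^2 / (2m) between 0 and 75 kJ."""
    energy_kj = s.p ** 2 / (2.0 * MASS_KG) / 1000.0
    return 0.0 <= energy_kj <= 75.0


state = Microstate(x=67.0, p=3600.0)
print(kappa(state), lam(state), mu(state))  # prints: True True True
```

Each function is the indicator of a subset C of \(\Gamma\): the proposition is true of a microstate exactly when the microstate lies in the corresponding subset.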

We can also consider history propositions, i.e., propositions that refer to more than one moment of time. The following are examples.

  • \(\alpha\) = “at time \(t_1\), Bolt is between the 65- and the 70-meter mark, and at time \(t_2\) he has momentum between 3500 kg m/s and 4000 kg m/s”;

  • \(\beta\) = “at all times t between \(t_1\) and \(t_2\), Bolt’s momentum is greater than 3000 kg m/s”;

  • \(\gamma\) = “at some time t between \(t_1\) = 9.58 s and \(t_2\) = 9.59 s, Bolt crosses the finish line”.

We will denote the set of history propositions of a system by \(\mathcal{V}\).

For physical systems, history propositions correspond to subsets C of the history space \(\Pi\). In order to define the latter, we first identify a time-set \(\mathcal{T}\) that consists of all time instants t and is equipped with an ordering relation \(\le\), with the physical interpretation of “earlier than”.

The history space is defined as the space of all paths on \(\Gamma\), where a path \(\xi : \mathcal{T}\rightarrow \Gamma\) is a function that assigns one point \(\xi (t) \in \Gamma\) to each time \(t \in \mathcal{T}\). The points of \(\Pi\) are called fine-grained histories. They define propositions that give the most precise description of the physical system at all times (fine-grained propositions). History propositions that are not fine-grained are called coarse-grained. For example, in a theory where the fine-grained histories refer to atomic motions, any history about properties of neurons is coarse-grained.

We can combine history propositions using logical operators such as \(\text{ AND }\), \(\text{ OR }\), \(\text{ IMPLIES }\), \(\text{ NOT }\), and so on. Given two history propositions \(\alpha\) and \(\beta\), we can always define the history propositions \(\alpha \, \text{ AND }\, \beta\), \(\alpha \, \text{ OR }\, \beta\), \(\alpha \, \text{ IMPLIES }\, \beta\), and \(\text{ NOT }\, \alpha\). We also define the impossible history proposition \(\emptyset\) (the proposition that can never be true) and the trivial history proposition \(\mathbb {1}\) (the proposition that can never be false). We call two history propositions \(\alpha\) and \(\beta\) disjoint if there is no way that they can both be true, i.e., if \(\alpha \, \text{ AND }\, \beta = \emptyset\). Coarse-grained propositions are obtained by joining disjoint fine-grained propositions through the connective \(\text{ OR }\).

Obviously, single-time propositions are special cases of history propositions. Some multi-time history propositions can be constructed from single-time propositions using the connective \(\text{ AND } \text{ THEN }\) of temporal conjunction. For example, the propositions \(\alpha\), \(\kappa _t\) and \(\lambda _t\) defined earlier satisfy

$$\begin{aligned} \alpha = \kappa _{t_1} \, \text{ AND } \text{ THEN }\, \lambda _{t_2}, \quad t_1 < t_2. \end{aligned}$$

In systems described by classical physics, we can always find a set \(\Pi\) of fine-grained histories, so that any history proposition \(\alpha\) corresponds to a subset \(C(\alpha )\) of \(\Pi\). Hence, we can express all logical operations set-theoretically. For example, \(C(\alpha \, \text{ AND }\, \beta ) = C(\alpha ) \cap C(\beta )\), \(C( \text{ NOT }\, \alpha ) = \Pi - C(\alpha )\), and so on. Hence, the set of history propositions has the structure of a Boolean algebra.
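The set-theoretic reading of the logical operations can be made concrete on a small finite example. The sketch below is illustrative only: the three-element state space, the three time instants and all function names are assumptions introduced here, not part of the formalism.

```python
from itertools import product

# Toy assumptions: three microstates and three time instants.
GAMMA = ("a", "b", "c")   # state space
TIMES = (0, 1, 2)         # time-set

# History space: every path assigns one microstate to each time instant.
PI = frozenset(product(GAMMA, repeat=len(TIMES)))


def single_time(t, states):
    """Subset of histories whose microstate at time t lies in `states`."""
    return frozenset(h for h in PI if h[t] in states)


def AND(c1, c2):
    return c1 & c2          # intersection


def OR(c1, c2):
    return c1 | c2          # union


def NOT(c):
    return PI - c           # complement in the history space


def and_then(t1, states1, t2, states2):
    """Temporal conjunction of two single-time propositions, t1 < t2."""
    if not t1 < t2:
        raise ValueError("AND THEN requires t1 < t2")
    return AND(single_time(t1, states1), single_time(t2, states2))


kappa_0 = single_time(0, {"a"})
lambda_2 = single_time(2, {"b", "c"})
alpha = and_then(0, {"a"}, 2, {"b", "c"})

print(len(PI), len(alpha))  # prints: 27 6
```

In this toy model the impossible proposition is the empty set and the trivial proposition is the whole of `PI`, so the Boolean identities (double negation, De Morgan's laws, excluded middle) hold automatically.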

History propositions in quantum systems are very different. They do not have a Boolean algebra structure. Furthermore, there is an infinity of different fine-grained sets of histories, and each set defines a different physical description of a physical system. The mathematical structure of the space of history propositions is significantly more complex. Its intricacies, while crucial from the perspective of quantum foundations, are peripheral to the aims of this paper. A brief description of quantum history propositions and of their mathematical structure is given in Appendix 1. The reader may consult Ref. [29] for a detailed analysis.

3.2 Probability Assignment

The predictions of all physical theories are expressed in terms of probabilities assigned to history propositions. A partial probability function \(\text{ Prob }(\cdot )\) is a rule that assigns a probability \(\text{ Prob }(\alpha )\) to history propositions \(\alpha\) that belong to a subset \(\mathcal{W}\) of the set \(\mathcal{V}\); \(\mathcal{W}\) is typically closed under the logical operations mentioned earlier. The restriction of history propositions to a subset \(\mathcal{W}\) is a typical quantum phenomenon. Quantum probabilities are always defined in reference to a context, for example, a specific experimental configuration. If \(\mathcal{W} = \mathcal{V}\), then we call \(\text{ Prob }(\cdot )\) a complete probability function.

Given a probability function \(\text{ Prob }(\cdot )\), one defines the conditional probability of a history proposition \(\beta\) given a history proposition \(\alpha\) with \(\text{ Prob }(\alpha ) > 0\), as

$$\begin{aligned} {} \text{ Prob }(\beta |\alpha ) = \frac{\text{ Prob }(\alpha \, \text{ AND }\, \beta )}{\text{ Prob }(\alpha )}. \end{aligned}$$

The physical predictions of the theory are typically expressed in terms of conditional probabilities \(\text{ Prob }(\alpha |\beta )\), where \(\alpha\) refers to measurement outcomes.
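A small numerical illustration of the ratio \(\text{ Prob }(\alpha \, \text{ AND }\, \beta )/\text{ Prob }(\alpha )\) is sketched below. The four two-step histories and their weights are assumptions made purely for the example, and `prob` and `cond_prob` are hypothetical helper names.

```python
from fractions import Fraction

# Toy two-step histories with assumed weights (they sum to one).
WEIGHT = {
    ("H", "H"): Fraction(1, 4),
    ("H", "T"): Fraction(1, 12),
    ("T", "H"): Fraction(1, 6),
    ("T", "T"): Fraction(1, 2),
}


def prob(c):
    """Probability of a history proposition, given as a set of histories."""
    return sum((WEIGHT[h] for h in c), Fraction(0))


def cond_prob(beta, alpha):
    """Prob(beta | alpha) = Prob(alpha AND beta) / Prob(alpha)."""
    p_alpha = prob(alpha)
    if p_alpha == 0:
        raise ValueError("undefined when Prob(alpha) = 0")
    return prob(alpha & beta) / p_alpha


alpha = frozenset(h for h in WEIGHT if h[0] == "H")  # "H at the first step"
beta = frozenset(h for h in WEIGHT if h[1] == "H")   # "H at the second step"

print(cond_prob(beta, alpha), cond_prob(alpha, beta))  # prints: 3/4 3/5
```

The two printed values differ, which shows that the order of the arguments in the conditional probability matters.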

Three distinct types of probability functions are used in physics, whereupon one talks about three types of processes: deterministic, stochastic and quantum. (Further types of processes are mathematically possible, but have not yet found use in physics.)

Deterministic processes have the following property. For any single-time proposition \(\alpha _t\), and a time \(t' > t\), there is a unique minimal single-time proposition \(\beta _{t'}\), such that \(\text{ Prob }(\beta _{t'}|\alpha _t) = 1\). The proposition \(\beta _{t'}\) is minimal in the sense that, if some other proposition \(\gamma _{t'}\) satisfies \(\text{ Prob }(\gamma _{t'}|\alpha _t) = 1\), then \(\beta _{t'} \, \text{ IMPLIES }\, \gamma _{t'}\).

This definition of deterministic processes is restricted, as it ignores processes with memory, but it suffices for present purposes. It characterizes all processes described by dynamical systems, i.e., by differential equations on the state space \(\Gamma\), like Newton’s equations of classical mechanics. The key point is that in deterministic processes, probabilities refer solely to the ignorance of the system’s precise initial conditions.
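The minimality property can be checked by brute force on a finite toy model. In the sketch below, the six-element state space, the cyclic-shift dynamics and the uniform weighting of the unknown initial microstates are all assumptions made for illustration; the relation "\(\beta\) IMPLIES \(\gamma\)" is read as set inclusion.

```python
from fractions import Fraction
from itertools import combinations

GAMMA = tuple(range(6))  # toy state space with six microstates (assumption)


def step(x):
    """Hypothetical deterministic dynamics: a cyclic shift by one site."""
    return (x + 1) % len(GAMMA)


def evolve(x, n):
    """Apply the dynamics n times to the microstate x."""
    for _ in range(n):
        x = step(x)
    return x


def minimal_later_proposition(alpha, n):
    """Image of the subset alpha under n steps of the dynamics."""
    return frozenset(evolve(x, n) for x in alpha)


def cond_prob(gamma, alpha, n):
    """Prob(gamma at time t+n | alpha at time t), with uniform ignorance
    over the initial microstates in alpha (an assumption)."""
    hits = sum(1 for x in alpha if evolve(x, n) in gamma)
    return Fraction(hits, len(alpha))


alpha = frozenset({0, 1})
beta = minimal_later_proposition(alpha, 2)

# Collect every proposition gamma that is certain given alpha.
certain = [
    frozenset(c)
    for k in range(len(GAMMA) + 1)
    for c in combinations(GAMMA, k)
    if cond_prob(frozenset(c), alpha, 2) == 1
]
print(sorted(beta), all(beta <= gamma for gamma in certain))  # prints: [2, 3] True
```

In this toy model the minimal proposition is simply the image of the initial subset under the dynamics, and every other certain proposition contains it.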

Stochastic processes are characterised by a complete probability function on \(\mathcal{V}\) that satisfies the Kolmogorov additivity condition: for all disjoint history propositions \(\alpha\) and \(\beta\),

$$\begin{aligned} \text{ Prob }(\alpha \, \text{ OR }\, \beta ) = \text{ Prob }(\alpha ) + \text{ Prob }(\beta ). \end{aligned}$$

Examples of stochastic processes are Brownian motion, random walks and evolutionary processes. Strictly speaking, deterministic processes are a special case of stochastic processes.
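The additivity condition can be verified explicitly for a short random walk. The sketch below assumes a three-step walk with an arbitrary bias `P_RIGHT`; these values and the helper names are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import product

STEPS = 3
P_RIGHT = Fraction(2, 3)  # assumed probability of a +1 step

# Fine-grained histories: all sequences of +1 / -1 steps.
HISTORIES = frozenset(product((+1, -1), repeat=STEPS))


def weight(h):
    """Probability of one fine-grained history (independent steps)."""
    w = Fraction(1)
    for s in h:
        w *= P_RIGHT if s == +1 else 1 - P_RIGHT
    return w


def prob(c):
    """Probability of a (coarse-grained) history proposition."""
    return sum((weight(h) for h in c), Fraction(0))


alpha = frozenset(h for h in HISTORIES if sum(h) == 3)   # walker ends at +3
beta = frozenset(h for h in HISTORIES if sum(h) == 1)    # walker ends at +1
gamma = frozenset(h for h in HISTORIES if h[0] == +1)    # first step is +1

# alpha and beta are disjoint, so their probabilities add.
print(prob(alpha | beta) == prob(alpha) + prob(beta))    # prints: True
# beta and gamma overlap, so simple addition overcounts.
print(prob(beta | gamma) == prob(beta) + prob(gamma))    # prints: False
```

The second comparison fails only because `beta` and `gamma` are not disjoint; the additivity condition makes no claim about such pairs.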

Quantum processes are different. The natural probability function for quantum history propositions does not satisfy Kolmogorov’s additivity condition for arbitrary history propositions \(\alpha\) and \(\beta\). This implies that in any given physical situation, probabilities cannot be assigned to all possible history propositions, but only to a specific subset thereof. Hence, quantum probability measures are partial. The exact specification of the propositions to which probabilities can be assigned is an open issue in quantum foundations that is closely related to the quantum measurement problem. However, an uncontroversial choice is to restrict to history propositions that describe measurement outcomes in specific experiments. In this case, the probabilities associated with quantum processes coincide with those of the standard formulations of quantum theory in the Copenhagen interpretation [23, 33].

To clarify this point, we distinguish between the mathematical structure of history theories and the best-known interpretation of quantum theory that employs this structure, namely, decoherent histories [24, 25, 27, 28]. The latter purports to construct a realist interpretation of quantum mechanics that does not require measurement as a primitive notion; see Refs. [36,37,38] for a critique of this claim and related discussion. However, quantum histories are also meaningful in the Copenhagen interpretation. This was the case in the first histories formulation of quantum theory [23], where histories were thought of in terms of measurement outcomes. Recently, we have used the logical structure of quantum histories in order to construct temporal observables [39, 40], while remaining within the confines of Copenhagen quantum mechanics.

In this paper, we take a neutral viewpoint on the interpretation of quantum histories. We only assert that probabilities can be assigned to a sublattice of history propositions. This statement is compatible with both the decoherent histories interpretation and the Copenhagen interpretation. In both cases, the relevant sublattices correspond to measurement outcomes. In the former case, we have to include the measuring device in the quantum description, as the decoherent histories approach is supposed to apply to closed systems. In the latter, we have to introduce a Heisenberg cut between the measured system that is being described quantum mechanically and the classical apparatus that records the measurement outcomes.

The three types of processes above are related. Deterministic processes can arise as limiting behavior of either stochastic or quantum processes, and stochastic processes can arise as limiting behavior of either deterministic or quantum processes. In what follows, we shall refer to deterministic and stochastic processes as classical processes, in the sense that they are compatible with classical physics (as opposed to quantum physics).

4 Histories Theory Description of MBI

4.1 Propositions About Mental Processes

In this section, we consider history propositions associated to a psycho-physical system and we argue that the most natural mathematical description of such systems introduces irreducibly mental degrees of freedom in addition to physical ones.

Consider a system that consists of Mary, a human person who lives in a closed room, together with all other physical objects in the room. We denote by \(\mathcal{V}_{\Phi }\) the set of all history propositions about physical properties of the system. For example, \(\mathcal{V}_{\Phi }\) contains propositions about a grey couch in the room, about Mary’s movements as she sits on the couch, or about Mary’s neurons firing while she sleeps on the couch. In principle, \(\mathcal{V}_{\Phi }\) is fully determined from existing theories of physics.

There is also a set \(\mathcal{C}\) of history propositions about mental properties in the system. Of course, these properties refer to Mary and not to any other object in the room. \(\mathcal{C}\) includes propositions about Mary’s emotions, thoughts and qualia. Physicalist theories of mind would identify \(\mathcal{C}\) with a subset of \(\mathcal{V}_{\Phi }\). We will argue that it is more reasonable to assume that \(\mathcal{C}\) is a subset of a different set \(\mathcal{V}_{\Psi }\) of mental propositions that does not overlap with \(\mathcal{V}_{\Phi }\). With this assumption, the set of all possible propositions about the system is the Cartesian product \(\mathcal{V}_{\Phi } \times \mathcal{V}_{\Psi }\).

To this end, let us assume that Mary’s room initially contains no green or red object (Footnote 5). At time \(t_0\), an object is inserted in the room. This object may be either a green pepper or a red rose. The pepper is green in the sense that it reflects light with wavelength of 500–550 nm; the rose is red in the sense that it reflects light with wavelength of 650–700 nm. Consider the history propositions,

\(\alpha _g\) = “A green pepper is inserted in the room at time \(t_0\), and then Mary sees it”,

\(\alpha _r\) = “A red rose is inserted in the room at time \(t_0\), and then Mary sees it”.

By “Mary sees it” we mean a conjunction of propositions that include light from the object reaching Mary’s retina, and an electrochemical signal carrying this particular information into the brain.

Next, we consider two history propositions that refer to mental properties,

\(\beta _G\) = “Mary has a GREEN experience at some time t after \(t_0\)”,

\(\beta _R\) = “Mary has a RED experience at some time t after \(t_0\)”.

GREEN and RED in capital letters refer to color qualia, i.e., individual instances of color experience [43]. We can avoid using the word “Mary” (which might require explaining what a person is) by rephrasing \(\beta _G\) as “There is a GREEN experience at some time t after \(t_0\)” and similarly for \(\beta _R\).

Obviously, there is a strong correlation between \(\alpha _g\) and \(\beta _G\) and between \(\alpha _r\) and \(\beta _R\). Consider an experiment in which either the rose or the pepper is inserted into the room and Mary tells us the color she sees. Repeating this experiment many times, we expect to find the probabilities,

$$\begin{aligned} \text{ Prob }( \alpha _r \, \text{ AND }\, \beta _R) = 1, \text{ Prob }( \alpha _r \, \text{ AND }\,\beta _G) = 0, \nonumber \\ \text{ Prob }( \alpha _g \, \text{ AND }\, \beta _R) = 0, \text{ Prob }( \alpha _g \, \text{ AND }\, \beta _G) = 1, \end{aligned}$$

modulo some errors of order \(\epsilon \ll 1\).
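The repeated experiment described above can be mimicked numerically. The following is only a minimal sketch under stated assumptions: the error rate `eps`, the equal chance of inserting either object, and the function names `run_trial` and `estimate` are all hypothetical, and the probabilities of Eq. (4) are read here as relative frequencies within the runs in which a given object was inserted.

```python
import random

def run_trial(rng, eps):
    """One run: insert an object, then record the experience Mary reports."""
    physical = rng.choice(["alpha_r", "alpha_g"])       # rose or pepper, assumed equally likely
    expected = "beta_R" if physical == "alpha_r" else "beta_G"
    other = "beta_G" if expected == "beta_R" else "beta_R"
    mental = other if rng.random() < eps else expected  # rare mismatch, with assumed rate eps
    return physical, mental

def estimate(n_runs=100_000, eps=0.01, seed=0):
    """Relative frequency of each mental report, given the inserted object."""
    rng = random.Random(seed)
    counts = {}
    totals = {"alpha_r": 0, "alpha_g": 0}
    for _ in range(n_runs):
        physical, mental = run_trial(rng, eps)
        totals[physical] += 1
        counts[(physical, mental)] = counts.get((physical, mental), 0) + 1
    return {key: c / totals[key[0]] for key, c in counts.items()}

freqs = estimate()
for key in sorted(freqs):
    print(key, round(freqs[key], 3))
```

The matched pairs come out close to 1 and the mismatched pairs close to 0, with deviations of order `eps`, which is the pattern that Eq. (4) expresses.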

The aim of many physicalist research programs is to map \(\beta _R\) and \(\beta _G\) to elements of \(\mathcal{V}_{\Phi }\) and to compute the probabilities (4) solely in terms of physics. The problem with this program is that existing physical theories are expressed in terms of particle properties, field properties, and spacetime properties; qualia do not fit in.

Suppose then one finds a map that expresses \(\beta _R\) in terms of physical properties, i.e., that \(\beta _R\) logically coincides with some proposition \(\gamma _R \in \mathcal{V}_{\Phi }\). In what sense does the proposition \(\gamma _R\) depend on the quale RED? Since qualia appear neither in the construction of the space of propositions, nor in the probability assignment, RED can only be used as a label, i.e., as a non-dynamical index that identifies this proposition. It is certainly not a property to which the proposition refers. However, labels are arbitrary in physics: they are chosen as a matter of convention and they can be interchanged at will. This implies that there is no explanation from physics why one particular physical proposition \(\alpha \in \mathcal{V}_{\Phi }\) is correlated to \(\beta _R\) and not to \(\beta _G\), i.e., why a red rose corresponds to the experience RED and not to the experience GREEN (Footnote 6).

Let us consider the situation formally, and ignore for the moment the meaning of the propositions. We have two sets of propositions \(A = \{\alpha _r, \alpha _g\}\) and \(B = \{\beta _R, \beta _G\}\),

(i) with strong probabilistic correlations given by Eq. (4);

(ii) with no known way of logically identifying elements of A with elements of B;

(iii) with strong arguments that such an identification may not be possible; see [43] and references therein.

A physicist encountering this state of affairs in some problem would not hesitate to conclude that the correlations are dynamical. He or she would propose a model in which A describes one particular set of degrees of freedom and B describes a set of different degrees of freedom. These degrees of freedom interact dynamically in order to produce the probabilistic correlations of Eq. (4). The properties of the different degrees of freedom and the details of the interaction depend on the system under consideration, but the logic of the explanation is system-independent.

Suppose we transfer this way of thinking to the mind–body problem. We should introduce a set \(\mathcal{V}_{\Psi }\) of history propositions about mental degrees of freedom, such that \(B \subset \mathcal{V}_{\Psi }\). \(\mathcal{V}_{\Psi }\) must be disjoint from \(\mathcal{V}_{\Phi }\), i.e., \(\mathcal{V}_{\Psi }\) and \(\mathcal{V}_{\Phi }\) contain different propositions. Since \(A \subset \mathcal{V}_{\Phi }\), the dynamical correlations of Eq. (4) are to be explained in terms of a probability rule for elements of the set \(\mathcal{V}_{\Phi } \times \mathcal{V}_{\Psi }\).

The history propositions in \(\mathcal{V}_{\Psi }\) and \(\mathcal{V}_{\Phi }\) may be different, but the two sets must have isomorphic time-sets \(\mathcal{T}_{\Psi }\) and \(\mathcal{T}_{\Phi }\). This means that there is a bijection \(f: \mathcal{T}_{\Phi } \rightarrow \mathcal{T}_{\Psi }\), such that \(f(t_1) \le f(t_2)\) for all \(t_1 \le t_2\). This is physically obvious: if Mary experiences GREEN before RED, then the time of the GREEN experience (as measured by a physical clock) must be prior to the time of the RED experience. In other words, psychological time and physical time have the same ordering.
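For finite time-sets, the condition on \(f\) can be checked directly. The sketch below is illustrative only: the two lists of instants and the helper name `is_order_isomorphism` are hypothetical and not part of the formalism.

```python
def is_order_isomorphism(f, t_phys, t_ment):
    """Check that f maps t_phys one-to-one onto t_ment and preserves <=."""
    images = [f(t) for t in t_phys]
    is_bijection = len(set(images)) == len(images) and set(images) == set(t_ment)
    preserves_order = all(
        f(t1) <= f(t2) for t1 in t_phys for t2 in t_phys if t1 <= t2
    )
    return is_bijection and preserves_order

# Hypothetical finite time-sets: clock readings and "experienced" instants.
T_PHI = [0.0, 1.0, 2.5, 4.0]
T_PSI = [10, 20, 30, 40]

monotone = dict(zip(T_PHI, T_PSI)).get               # keeps the ordering of instants
scrambled = dict(zip(T_PHI, [10, 30, 20, 40])).get   # swaps two instants

print(is_order_isomorphism(monotone, T_PHI, T_PSI))
print(is_order_isomorphism(scrambled, T_PHI, T_PSI))
```

The second map is still a bijection, but it fails the ordering condition; it would correspond to an experience that is earlier in psychological time yet later on the physical clock.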

4.2 The \(\Psi \Phi\)I Formalism

The arguments above suggest the following framework for theories with psycho-physical interaction (\(\Psi \Phi\)I formalism).

1. Causal structure. The studied systems involve both physical and mental processes. All possible scenarios about a specific system can be expressed in terms of history propositions defined with respect to a time-set \(\mathcal{T}\).

2. Set of propositions. History propositions have one physical and one mental component, i.e., they belong to the set \(\mathcal{V}_{\Phi } \times \mathcal{V}_{\Psi }\), where \(\mathcal{V}_{\Phi }\) contains history propositions about physical degrees of freedom and \(\mathcal{V}_{\Psi }\) contains history propositions about mental degrees of freedom.

Purely physical propositions are of the form \((\alpha , \mathbb {1}_{\Psi })\) and purely mental propositions are of the form \((\mathbb {1}_{\Phi }, \alpha )\). We will denote such propositions as \(\alpha _{\Phi }\) and \(\alpha _{\Psi }\), respectively. Hence, we can express a general history proposition as \(\alpha = \alpha _{\Phi } \; \text{ AND }\; \alpha _{\Psi }\), i.e., as a logical conjunction of its physical component \(\alpha _{\Phi }\) and its mental component \(\alpha _{\Psi }\).

3. Probability rule. There is a class of partial probability functions \(\text{ Prob }(\cdot )\) on \(\mathcal{V}_{\Phi } \times \mathcal{V}_{\Psi }\) that determine physical predictions. The assumption of partial probability functions allows us to accommodate a quantum theory, without committing to a particular interpretation. The complexity of the quantum probability rule for histories implies that there are many more possible ways of expressing mental-physical interaction than in classical physics.

If we treat matter as classical, then we can take \(\text{ Prob }(\cdot )\) to be a complete probability function.

The probability function incorporates non-trivial dynamical interaction between mental and physical degrees of freedom. This means that conditional probabilities of the form \(\text{ Prob }(\alpha _{\Phi }|\beta _{\Psi })\) and \(\text{ Prob }(\gamma _{\Psi }|\delta _{\Phi })\) have non-trivial dependency on propositions \(\beta _{\Psi }\) and \(\delta _{\Phi }\), respectively.

The degenerate case where physical events affect mental ones, but not vice versa, corresponds to the scenario of epiphenomenalism that was mentioned in the Introduction.

4. Limiting behavior. Let us denote by \(\Omega _{\Phi }\) the proposition that there are no physical processes at any \(t \in \mathcal{T}\). \(\Omega _{\Phi }\) is not to be confused with the impossible proposition \(\emptyset _{\Phi }\). We also denote by \(\Omega _{\Psi }\) the proposition that there are no mental processes at any \(t \in \mathcal{T}\). We expect that in the absence of mental processes, the probability function reduces to the known one of physics, denoted by \(\text{ Prob}_{\Phi }\), i.e.,

$$\begin{aligned} \text{ Prob }(\alpha _{\Phi } \; \text{ AND }\; \Omega _{\Psi }) = \text{ Prob}_{\Phi } (\alpha _{\Phi }). \end{aligned}$$

Unless we wish to entertain the possibility of ghosts, we must postulate that no mental processes are possible in the absence of physical processes, i.e.,

$$\begin{aligned} \text{ Prob }( \Omega _{\Phi } \; \text{ AND }\; \alpha _{\Psi }) = 0. \end{aligned}$$
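In the simplest classical setting, principles 2 to 4 can be illustrated with a small joint probability table. The sketch below is a toy model under stated assumptions: the three physical and three mental alternatives, the numerical weights in `JOINT`, and the helper names `prob` and `conditional` are all hypothetical. It checks the "no ghosts" condition and shows conditional probabilities that depend non-trivially on the conditioning proposition in both directions.

```python
PHYS = frozenset({"Omega_phi", "pepper", "rose"})   # Omega_phi: no physical processes
MENT = frozenset({"Omega_psi", "GREEN", "RED"})     # Omega_psi: no mental processes

# Hypothetical weights on fine-grained pairs; pairs not listed have probability 0.
JOINT = {
    ("Omega_phi", "Omega_psi"): 0.10,
    ("pepper", "Omega_psi"): 0.05,
    ("pepper", "GREEN"): 0.39,
    ("pepper", "RED"): 0.01,
    ("rose", "Omega_psi"): 0.05,
    ("rose", "GREEN"): 0.01,
    ("rose", "RED"): 0.39,
}

def prob(phys, ment):
    """Probability of the product proposition (phys, ment), each a set of alternatives."""
    return sum(p for (x, y), p in JOINT.items() if x in phys and y in ment)

def conditional(phys, ment, given_phys=PHYS, given_ment=MENT):
    """Prob(phys AND ment | given_phys AND given_ment)."""
    numerator = prob(set(phys) & set(given_phys), set(ment) & set(given_ment))
    return numerator / prob(given_phys, given_ment)

# "No ghosts": mental processes without physical processes have zero probability.
no_ghosts = prob({"Omega_phi"}, MENT - {"Omega_psi"})

# Physical proposition conditioned on a mental one, and vice versa.
p_rose_given_red = conditional({"rose"}, MENT, given_ment={"RED"})
p_rose_given_green = conditional({"rose"}, MENT, given_ment={"GREEN"})
p_red_given_rose = conditional(PHYS, {"RED"}, given_phys={"rose"})
p_red_given_pepper = conditional(PHYS, {"RED"}, given_phys={"pepper"})

print(no_ghosts, p_rose_given_red, p_rose_given_green)
print(p_red_given_rose, p_red_given_pepper)
```

Purely physical propositions correspond to calls of the form `prob(phys, MENT)` and purely mental ones to `prob(PHYS, ment)`, mirroring the pairs \((\alpha , \mathbb {1}_{\Psi })\) and \((\mathbb {1}_{\Phi }, \alpha )\). A table of this kind records only correlations; it does not by itself distinguish interaction from epiphenomenalism, which requires the temporal structure of the histories.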

The principles above can be naturally incorporated into the temporal logic description of histories [29], and, thereby, provide an axiomatic characterization of the \(\Psi \Phi\)I formalism. Technical details in the formulation and elaboration of the axioms will be presented in a different publication. The key point is that the principles above define a general framework that can accommodate many different theories of MBI. Such theories will differ on the mathematical characterization of \(\mathcal{V}_{\Psi }\) (the fundamental mental variables), and on the explicit construction of the probability function \(\text{ Prob }(\cdot )\) (dynamics).

At the moment, we know the structure of \(\mathcal{V}_{\Phi }\) and its associated probability rule. \(\mathcal{V}_{\Phi }\) consists of all history propositions allowed by the Standard Model of particle physics, and \(\text{ Prob }(\cdot )\) is the standard probability assignment for history propositions in quantum theory. Physics at the nuclear scale and beyond is most likely irrelevant to a mind–body coupling, so we can coarse-grain away Standard Model physics, and take \(\mathcal{V}_{\Phi }\) to contain propositions about nuclei, electrons and the EM field; the probability assignment will again be quantum. It is a widely held belief among neuroscientists that we can coarse-grain even further, and that it suffices to work with a set \(\mathcal{V}_{\Phi }\) of propositions about macromolecules or cells, subject to a classical probability assignment. In any case, the \(\Phi\) part of a \(\Psi \Phi\)I theory is based on known physics.

In contrast, the \(\Psi\) part is largely unknown. We know many history propositions about mental processes at a phenomenological level, and these are elements of \(\mathcal{V}_{\Psi }\). However, we know nothing about the mathematical structure of \(\mathcal{V}_{\Psi }\), i.e., its fine-grained histories and how they can be joined through the OR operation in order to form coarse-grained propositions. Hence, the known elements of \(\mathcal{V}_{\Psi }\) are disconnected from any underlying structure. The latter can be provided only by a mature mathematical theory of mind. It is quite possible that our familiar mental processes correspond to highly coarse-grained mental history propositions, the same way that our everyday experience lies at a level of description much coarser than that of fundamental physics.

The probability assignment on \(\mathcal{V}_{\Phi } \times \mathcal{V}_{\Psi }\) is also unknown, except for the condition that it reduces to the probability rules of physics in the absence of mental phenomena. Since matter is fundamentally quantum mechanical, a fundamental \(\Psi \Phi\)I theory cannot be classical. Then, there are only three conceivable scenarios. Fundamental \(\Psi \Phi\)I processes either (i) are fully quantum, which means that mental processes should also be subject to the probabilistic rules of quantum mechanics, or (ii) involve a quantum/classical hybrid (see Sect. 5.3), or (iii) follow a probability rule with no analogue in current physics.

The conclusion above does not preclude the use of classical \(\Psi \Phi\)I theories at coarser levels of description, since classical processes often emerge as limiting cases of quantum ones. Indeed, most neuroscience research proceeds under the assumption that the physical phenomena correlated to mental processes are essentially classical. This means that the relevant physical objects have lost their irreducible quantum features (except for the ones pertaining to the chemical properties of atoms). There are theoretical objections to this point of view, which originate from proposals that quantum phenomena are important for understanding consciousness. This idea was first made explicit by Pauli in his famous collaboration with Jung [46, 47], who argued about the relation of Jung’s concept of synchronicity [48] to quantum phenomena. For recent theories about the interplay of quantum and consciousness, see, Refs. [49,50,51,52].

In this paper, we take an agnostic stance on a special relationship between quantum theory and consciousness. The \(\Psi \Phi\)I formalism works either way, but it is much easier to work with if quantum phenomena can be ignored. In principle, classical probabilistic models for a few discrete degrees of freedom can be constructed directly from the \(\Psi \Phi\)I axioms, without any knowledge of the deep structure of \(\mathcal{V}_{\Psi }\). Such models could describe elementary mental processes, like, for example, the distinction of a small number of colors by a living person. It is too early to tell whether they could lead to testable predictions or not.

Finally, we remind the reader that the overall rationale of the \(\Psi \Phi\)I formalism conforms to the strategy that was sketched in Sect. 2—see, Table 1 for the analogy to GR.

Table 1 Structural correspondence between the \(\Psi \Phi\)I formalism and GR

5 \(\Psi \Phi\)I Theories and Physics

In the previous section, we argued that the histories description of physical theories can easily be extended to incorporate mental degrees of freedom. Here, we explore plausible properties of such theories, especially, in relation to open issues in physics.

5.1 Energy Conservation

As mentioned in the Introduction, the most common objection to theories of MBI is that MBI conflicts with fundamental laws of physics, in particular the conservation of energy. This objection is mainly popular with non-physicists, as it presupposes a rather outdated, 19th-century understanding of physics. Today, we know that energy conservation is not a universal law, as it does not hold in General Relativity (Footnote 7) and it holds only with qualifications in quantum theory (Footnote 8). Thus, an MBI theory with no strict energy conservation is not problematic in an epistemic sense.

More importantly, MBI does not always lead to a violation of energy conservation. In Appendix 1, we study the status of energy conservation for classical \(\Psi \Phi\)I processes. We identify dynamics that are fully compatible with energy conservation, in the sense that the MBI neither adds energy to nor subtracts energy from the physical degrees of freedom. Energy conservation is a consequence of a particular coupling between the mental and the physical degrees of freedom. This coupling is neither artificial nor contrived: it is mathematically elegant and simple, as befits a fundamental theory. It can also be generalized for quantum systems.

Energy-conserving \(\Psi \Phi\)I theories are aesthetically appealing, but certainly, other options are available. For example, one may consider MBI dynamics that conserve a generalised notion of energy. In mechanical systems, energy conservation is a consequence of the symmetry of time translation, i.e., the requirement that the dynamics is unchanged under a transformation that moves the time of the various events by a constant amount. If a \(\Psi \Phi\)I theory shares this symmetry, then a generalised energy variable—that depends on both physical and mental degrees of freedom—is plausibly conserved. Hence, we can continue to use our current notion of energy, provided we include a contribution from mental processes. After all, the notion of energy has been generalised many times ever since its inception, as it has been applied to increasingly broader categories of phenomena.
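As a purely illustrative sketch of this argument (an assumption made here, not a result derived in the text), suppose that the coupled dynamics can be written in Hamiltonian form, with a generalised energy function \(E\) that depends on physical variables \(x_{\Phi }\), on mental variables \(x_{\Psi }\), and possibly on time, and with a Poisson bracket \(\{\cdot , \cdot \}\) defined on both sets of variables. Then

$$\begin{aligned} \frac{dE}{dt} = \{E, E\} + \frac{\partial E}{\partial t} = \frac{\partial E}{\partial t}, \end{aligned}$$

so that \(E\) is conserved whenever the dynamics carries no explicit time dependence, i.e., whenever it is symmetric under time translations. If we further split the generalised energy as \(E = E_{\Phi }(x_{\Phi }) + E_{\Psi }(x_{\Psi }) + E_{int}(x_{\Phi }, x_{\Psi })\), where the three terms are hypothetical labels, conservation of \(E\) gives

$$\begin{aligned} \frac{dE_{\Phi }}{dt} = - \frac{d}{dt}\left( E_{\Psi } + E_{int}\right) , \end{aligned}$$

i.e., the familiar physical energy \(E_{\Phi }\) changes only by the amount exchanged with the mental and interaction contributions.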

It is also possible that a \(\Psi \Phi\)I theory admits neither energy conservation nor conservation of some generalised notion of energy. In this case, energy conservation is an approximate conservation law that holds in regimes where the MBI is negligible. Approximate conservation laws are quite common in physics: they apply to physical quantities that are conserved in large classes of interactions but not in all. One example is isospin, which is conserved in strong interactions but not in electromagnetic or weak interactions.

5.2 Threshold States

We believe that a living person has mental activity, but we ordinarily deny such activity to the dead body of the same person, as we deny it to rocks, cars, electrons or computers. Thus, we believe that in some systems mental processes never occur, and that in some other systems mental processes sometimes occur and sometimes do not. In a \(\Psi \Phi\)I theory, we implement such distinctions by introducing a set of propositions \(\Omega _{t} \in \mathcal{V}_{\Psi }\) which assert that no mental processes take place at time t. For rocks, cars and so on, the probabilities associated with propositions of \(\mathcal{V}_{\Psi }\) other than \(\Omega _{t}\) are always zero. For living people, probabilities for propositions other than \(\Omega _{t}\) can obviously be non-zero.

The analogue of \(\Omega _{t}\) in physics is the vacuum of quantum field theory (QFT). The vacuum is the state of the system in which no particles (of a given type A) are present. In QFT, one is often interested in the generation of A particles from the vacuum, for example, in the presence of other particles or external fields. These phenomena reveal the interaction channels of the A-particles with the rest of the world.

We can follow an analogous reasoning in the \(\Psi \Phi\)I formalism. Let C denote a configuration of a physical system, i.e., a subset of the system’s state space \(\Gamma\). We represent the proposition that C is present at time t by \(C_t\). The family of history propositions

$$\begin{aligned} \eta (C) = (\Omega _t \, \text{ AND }\, C_t) \, \text{ AND } \text{ THEN }\, ( \text{ NOT }\, \Omega _{t'}) \end{aligned}$$

for \(t' > t\) describes the generation of mental states out of the mental ‘vacuum’ \(\Omega\). If C describes a rock, a car, or a dead body, any reasonable probability assignment will give \(\text{ Prob }[\eta (C)] = 0\). The same holds if C describes a living person. However, there exist physical configurations C for which \(\text{ Prob }[\eta (C)] > 0\). Such configurations are responsible for the generation of (proto)mental states from non-mental states: they are the threshold to the world of mental processes—see, Fig. 1.

A formal definition of threshold states should also involve some condition of minimality, otherwise the whole Earth 4.5 billion years ago would qualify as a threshold state. Hence, we require that a threshold state satisfies \(\text{ Prob }[\eta (C)] > 0\) and also \(\text{ Prob }[\eta (D)] = 0\), for any \(D \subset C\).
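The minimality condition can be made concrete in a finite toy setting. In the sketch below, a configuration is modelled, as a simplifying assumption, by a finite set of labelled components, and \(D \subset C\) is read as "D contains only some of the components of C". The rule `toy_prob_eta` and the helper names are hypothetical; no actual probability assignment for \(\eta (C)\) is known.

```python
from itertools import combinations

def proper_subsets(config):
    """All non-empty proper subsets of a configuration (a frozenset of components)."""
    parts = sorted(config)
    for size in range(1, len(parts)):
        for combo in combinations(parts, size):
            yield frozenset(combo)

def is_threshold(config, prob_eta):
    """Prob[eta(C)] > 0, while every proper sub-configuration D has Prob[eta(D)] = 0."""
    return prob_eta(config) > 0 and all(
        prob_eta(d) == 0 for d in proper_subsets(config)
    )

# Hypothetical toy rule: mental states can arise only if components "a" and "b" are both present.
def toy_prob_eta(config):
    return 0.2 if {"a", "b"} <= config else 0.0

print(is_threshold(frozenset({"a", "b"}), toy_prob_eta))       # minimal configuration
print(is_threshold(frozenset({"a", "b", "c"}), toy_prob_eta))  # contains a smaller threshold
print(is_threshold(frozenset({"a"}), toy_prob_eta))            # cannot generate mental states
```

Only the first configuration qualifies. The second plays the role of the whole Earth in the remark above: it can generate mental states, but only because it contains a smaller configuration that already does.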

Fig. 1 A graphical description of threshold states. We assume that mental degrees of freedom are initially in their ‘vacuum’ \(\Omega\). Processes of type 1 cannot generate mental states out of \(\Omega\). Only if the physical state is a threshold state (processes of type 2) is it possible to reach a non-trivial mental state.

Threshold states are crucial for understanding how organisms with mental processes emerged in the history of life. Furthermore, if the analogy to QFT is valid, the properties of threshold configurations may suggest the form of the MBI-generating terms in the probability assignment.

In QFT, the defining feature of threshold configurations is energy: no A particle can be generated from the vacuum unless the available energy is greater than the rest mass \(m_A\) of A. When asking what the analogue of energy is for the threshold configurations of a \(\Psi \Phi\)I theory, there seems to be an obvious answer: information. Many mental processes can be described in terms of information processing, and, of course, the brain is an information-processing system. Like causal ordering, the notion of information straddles both sides of the mental-physical divide. Indeed, it has been proposed as a crucial component of psychophysical theories [12]. It is therefore plausible that threshold states are characterized by high informational capacity.

The problem here is that information theory concepts are not fundamental to our current physical theories. They only provide an additional layer of interpretation and some technical tools. Notions such as informational capacity or information processing are not absolute properties of physical systems. One deterministic process is as good as another in terms of information processing: there is no criterion for distinguishing the electric currents inside a laptop from the motions of air molecules in an empty bottle. We call the former information processing because of its meaningful output: the electric currents cause a complex production of light on the liquid crystal display that we interpret as text. This criterion presupposes us, i.e., the existence of mindful observers.

It is a plausible conjecture that the notions of information processing and informational capacity are fundamentally defined in terms of mental processes, and not physical ones. This would imply that threshold states ‘distribute’ the mental concept of information to the physical degrees of freedom. Hence, MBI could lead to a fundamental definition of information in physics.

5.3 MBI and Quantum State Reduction

The measurement problem in quantum theory is that quantum theory does not explain the emergence of definite properties for physical systems (for example, measurement outcomes). Definite properties are not, in general, compatible with quantum processes. One possible resolution, suggested by the founders of quantum theory, is the use of irreducibly classical concepts for the description of the measurement device. But what constitutes a measuring device? If a particle’s position is correlated to the reading of a pointer in an apparatus, should we describe the pointer classically? What if the pointer is recorded by a camera? Should we treat the pointer as a quantum system and the camera as classical? The boundary of the quantum/classical divide appears arbitrary.

The problem is aggravated in interpretations of quantum theory that treat the quantum state as an objective feature of a physical system. In these interpretations, the change of state after measurement (quantum state reduction) is a physical process. Then, the arbitrariness of the quantum/classical split implies an ambiguity in the physical description. Von Neumann and Wigner proposed that the quantum/classical split and the body/mind split coincide: the quantum state is reduced when the result of a measurement enters the observer’s consciousness [53, 54].

In contrast, dynamical state reduction models [55] postulate that there is a tiny probability of reduction for each particle’s wavefunction, which adds up to be significant for systems that contain a huge number of particles. This explains why macroscopic measuring apparatuses behave classically. Alternatively, one may postulate irreducibly classical entities that universally interact with quantum systems. Barring consciousness, the main candidate for this role is the gravitational field [56,57,58,59,60]. The only way to consistently formulate a quantum-classical interaction (without introducing hidden variables as in Bohmian mechanics) involves dynamical state reduction for the quantum system (Footnote 9). Again, this process can cause measurement apparatuses to always behave classically.
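The amplification with particle number in dynamical reduction models can be illustrated with a short calculation. The sketch below is a rough estimate under stated assumptions: reduction events are treated as independent Poisson events, and the per-particle rate `LAMBDA` is set to the order of magnitude commonly quoted for the Ghirardi-Rimini-Weber model. The function names are hypothetical.

```python
import math

def reduction_rate(n_particles, rate_per_particle):
    """Total reduction rate (per second) for n independent particles."""
    return n_particles * rate_per_particle

def prob_reduced(n_particles, rate_per_particle, duration):
    """Probability of at least one reduction event within `duration` seconds."""
    # 1 - exp(-x), computed with expm1 so that tiny probabilities are not rounded to 0
    return -math.expm1(-reduction_rate(n_particles, rate_per_particle) * duration)

LAMBDA = 1e-16  # assumed per-particle rate in 1/s (order of magnitude, not a measured value)

for n in (1, 1e10, 1e23):
    print(n, reduction_rate(n, LAMBDA), prob_reduced(n, LAMBDA, duration=1e-3))
```

With these assumed numbers, a single particle is essentially never reduced within a millisecond, while a macroscopic body of about \(10^{23}\) particles is reduced almost with certainty.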

The latter scenario is relevant to \(\Psi \Phi\)I theories. If mental processes are treated classically, the resulting \(\Psi \Phi\)I theory involves coupling of quantum to classical variables. Hence, quantum systems undergo dynamical state reduction as a result of MBI. In other words, dynamical reduction is a natural candidate for the physical channel through which mind acts upon matter.

We do not propose \(\Psi \Phi\)I theories as a solution to the quantum measurement problem. We think that consciousness-based solutions to the measurement problem are highly counterintuitive in the context of quantum cosmology. They seem to imply that no definite properties could exist prior to the emergence of the first conscious observers. Nonetheless, the \(\Psi \Phi\)I formalism can, in principle, be used in order to construct predictive models of the von Neumann–Wigner idea.

Reduction in \(\Psi \Phi\)I theories is not the universally occurring process postulated by dynamical reduction models. Most probably, it would only occur in the nervous system of biological organisms. Interestingly, \(\Psi \Phi\)I predictions may turn out to be compatible with proposals that relate dynamical reduction in the brain to consciousness—like, for example, the theory of Orchestrated Objective Reduction by Hameroff and Penrose [50]—even if the latter treats the physical world as causally closed.

6 Conclusions

The main aim of this paper is to show that the popular assertion that MBI is incompatible with physics is wrong. This was achieved by constructing a mathematical framework that allows theories with MBI to be formulated as extensions of current physical theories. The \(\Psi \Phi\)I formalism originates from the histories formulation of physical theories, and it describes irreducibly mental degrees of freedom that interact with the physical degrees of freedom.

The \(\Psi \Phi\)I formalism can incorporate any mental concept that can be expressed in terms of abstract structural and causal relations. Certainly, there are aspects of the mind that go beyond such relations; they cannot be described by the formalism. This is not a problem. GR demonstrates that even a fundamental theory does not require a faithful representation of the entities it describes in order to be highly successful.

Many mental processes involve conscious experience. We can represent a conscious experience mathematically by introducing a variable, say Con, that takes the value 1 on all states that involve conscious experience and 0 otherwise. The time evolution and causal properties of the variable Con can then be studied for any psycho-physical configuration. Obviously, the \(\Psi \Phi\)I formalism cannot explain what “conscious experience” is. The existence of conscious experience has to be taken as a brute fact about the mind that defines the fundamental building blocks of a theory.
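As a minimal sketch of how the variable Con could be tracked in time, consider a classical stochastic model with three states. Everything in it is a hypothetical illustration: the state labels, the transition probabilities in `TRANSITION`, and the helper names `step` and `prob_con` are assumptions, not part of the formalism.

```python
STATES = ["no_mental", "nonconscious", "conscious"]

# Con takes the value 1 on states that involve conscious experience and 0 otherwise.
CON = {"no_mental": 0, "nonconscious": 0, "conscious": 1}

# Hypothetical one-step transition probabilities (each row sums to 1).
TRANSITION = {
    "no_mental": {"no_mental": 0.9, "nonconscious": 0.1, "conscious": 0.0},
    "nonconscious": {"no_mental": 0.1, "nonconscious": 0.6, "conscious": 0.3},
    "conscious": {"no_mental": 0.0, "nonconscious": 0.2, "conscious": 0.8},
}

def step(dist):
    """Evolve a probability distribution over STATES by one time step."""
    return {s2: sum(dist[s1] * TRANSITION[s1][s2] for s1 in STATES) for s2 in STATES}

def prob_con(dist):
    """Probability that Con = 1, i.e., the expectation value of Con."""
    return sum(dist[s] * CON[s] for s in STATES)

dist = {"no_mental": 1.0, "nonconscious": 0.0, "conscious": 0.0}
history = []
for _ in range(5):
    history.append(prob_con(dist))
    dist = step(dist)

print(history)
```

In this toy model, the probability of Con = 1 stays at zero for the first two steps, because the conscious state can only be reached through the intermediate state, and becomes positive afterwards.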

Physicalist theories of mind often invoke the remarkable success of physics to explain a huge number of phenomena through reduction to elementary physical processes. In our opinion, such arguments only pay lip service to physics, while ignoring its history and actual research practice. The point is that neither reduction nor supervenience has been particularly successful as a research strategy in physics.

The key step in constructing fundamental physical theories has always been the identification of the appropriate degrees of freedom for the problem at hand. In all major discoveries, it was necessary to introduce new degrees of freedom, well beyond the ones that were known at the time [64]. Examples include the electromagnetic field, the Rutherford-Bohr model of the atom, the concept of particle spin, the spacetime metric in GR, and the large number of new particles, charges and fields that had to be postulated in order to construct the Standard Model of strong and electroweak interactions.

Ever since Newton, the most successful strategy in physics has been the search for theories that unify seemingly very different phenomena by incorporating them in an overarching mathematical structure. Hence, the \(\Psi \Phi\)I formalism is much closer to the research practices of physics than any physicalist research program.