The Weirdness Theorem and the Origin of Quantum Paradoxes

We argue that there is a single, simple reason for all quantum paradoxes, and that this reason is not uniquely tied to quantum theory. It is rather a mathematical question that arises at the intersection of logic, probability, and computation. We give our ‘weirdness theorem’, which characterises the conditions under which weirdness shows up. It shows that, whenever logic has bounds due to the algorithmic nature of its tasks, weirdness arises in the special form of negative probabilities or non-classical evaluation functionals. Weirdness is not logical inconsistency, however. It is only the expression of the clash between an unbounded and a bounded view of computation in logic. We discuss the implications of these results for quantum mechanics, arguing in particular that its interpretation should ultimately be computational rather than exclusively physical. In addition, we develop a probabilistic theory in the real numbers that exhibits the phenomenon of entanglement, thus concretely showing that the latter is not specific to quantum mechanics.


Introduction
We are interested in defining bounds on the algorithmic capabilities of a mathematical theory and in analysing their implications. We articulate our views from a logical standpoint, by postulating the following principle of rationality: (Coherence) The theory should be logically consistent.
This is what we essentially require of each well-founded mathematical theory: it has to be based on a few axioms and rules from which we can unambiguously derive its mathematical truths. The next postulate defines the computational limitations we want our theory to be subject to: (Computation) Inferences in the theory should be computable in polynomial time.
The second postulate will turn out to be central. It requires that there should be an efficient way to execute the theory, and in fact we are going to adopt the metaphor of a computer that executes the theory, i.e., that yields inferences out of it.
In what follows, we shall develop our considerations with regard to the special case of a theory of uncertainty. It will essentially coincide with the Bayesian theory once it is freed of the constraint of completeness (or precision); loosely speaking, such a theory is equivalent to modelling uncertainty with sets of probabilities. This choice will define a perimeter for the mathematical technicalities, while focusing on a case of wide interest and impact.
The postulates of coherence and computation are apparently in conflict with each other: intuitively, if the computer can only execute polynomial tasks, the theory will be consistent only up to what polynomial calculus allows. This is a view from outside the computer, however; it is the view of a hypothetical 'classical' observer with no computational limitations and thus external to the theory. An observer that is instead internal to the theory and behaves according to it is still subject to the coherence of the theory; it will therefore be impossible to prove any inconsistency from the inside. This is an instance of what we call an external-internal clash.

1. We formalise such a clash by what we refer to as 'the weirdness theorem'. It shows that any theory of 'algorithmic rationality', that is, one that obeys the two postulates of coherence and computation, necessarily departs in a very peculiar way from the probabilistic point of view. In particular, the theorem proves that all models compatible with the theory will present some negative 'probabilities' (such models are sometimes referred to as 'quasi-probabilities' in the literature).
Negative probabilities are however incompatible with classical rationality, and for this reason a hypothetical classical observer may regard the internal world as incoherent. Equivalently, to a classical observer the behaviour of the internal world may appear to be incompatible with so-called classical evaluation functionals (a concept used in particular in quantum logic).

2. We then define a theory of probability on a continuous space of complex vectors that complies with the two postulates of coherence and computation, and we show that its deductive closure (internal view) is tantamount to quantum theory (QT). The complex vectors represent the possible states of the computer while it runs the theory, and any such state bears within it the properties of the particles in QT, such as their directions or angles of polarisation. By framing it as a theory of rationality, we therefore view QT as a normative theory guiding an agent in assessing her subjective beliefs on the results of a quantum experiment. As we are going to stress throughout the paper, we ground the normativity of QT on three aspects. Firstly, its deductive structure is tantamount to a logical theory, and therefore it is based on a requirement of consistency (coherence): to follow the rules of QT is to be assured of being consistent. Secondly, the model is based on a possibility (phase) space whose elements are interpreted as states of the world. Finally, we advance that the specific feature of our world that grounds the use of QT is its being a computation.
The external-internal clash, when transposed to QT, is thus a clash between a computational view and a view stemming from classical physics, the weirdness theorem providing its formal, mathematical formulation. When we try to give a classical physical interpretation to QT we fail, because classical physics, in our common understanding, needs classical probability, and the latter grounds its normativity essentially only on its internal consistency, given that it does not impose any limitations on the computational resources available for executing its inferences. As such, to an external observer, QT presents a number of weird phenomena, such as entanglement, and is made up of negative probabilities or is characterised by a non-Boolean structure of events. We will show that this weirdness follows from the computational postulate.
There is also more to it. In our framework QT is naturally based on sets of (or imprecise) probabilities: in fact, requiring the computation postulate is similar to defining a probabilistic model using only a finite number of moments and therefore, implicitly, to defining the model as the set of all the (quasi-)probabilities compatible with the given moment constraints.

3. As mentioned above, quantum paradoxes appear to be entirely a consequence of the weirdness theorem; in particular, the weirdness does not follow from having to deal with complex numbers or quantised states. We reinforce this view by working out another example theory, which is unrelated to QT. Such a theory uses real numbers to model the experiment of tossing a classical coin under algorithmic rationality. The theory turns out to be based on Bernstein polynomials and to admit entanglement. This shows in addition that the quantum-logic and the quasi-probability foundations of QT are two faces of the same coin, being natural consequences of the computation principle, as formalised by the weirdness theorem.
In order to develop the results mentioned above, we rely on a dual characterisation of probability in terms of lotteries (or gambles). In doing so, we thus provide a subjective foundation, à la de Finetti, of so-called generalised probability theories. Compared to algebraic or information-based extensions of probability theory (e.g., [3,4]), a gambling foundation, which emphasises the notion of logical consistency, ensures soundness and naturally provides a ground for comparing different theories: it boils down to assessing the compatibility of their different notions of consistency.

Related Work
The perspective given in this paper may be related to the agent-centred interpretation of QT advanced by QBism [5,6]. However, by grounding the use of QT in the world's being a computation, we depart from QBism, which puts the Born rule at the centre but, for now, is unable to ground its use on anything other than a coherence constraint. In this, our view is closer to the one advanced by Pitowsky [7], whose empirical premise in the derivation of the Born rule is that the structure corresponding to the outcomes of incompatible measurements is a non-Boolean algebra. (Notice that, in our perspective, the non-Boolean structure is a consequence of algorithmic rationality, as it follows from 'the weirdness theorem': the non-Boolean structure arises from the non-standard evaluation functionals.) There is a long tradition of denying to quantum states any reference to the outside world, which can be traced back at least to Bohr and, more broadly, to the Copenhagen interpretation of QT. Similar contemporary views are labelled as ψ-epistemic; see for instance [11] for an overview. In addition to QBism, they include Healey's quantum pragmatism [8], Rovelli's relational quantum mechanics [9], and the empiricist interpretation of de Muynck [10]. All these interpretations "do not view the quantum state as an intrinsic property of an individual system and they do not believe that a deeper reality is required to make sense of quantum theory" [12, p. 72]. Opposite to this tradition stand ψ-ontic views such as the many-worlds interpretation [13,14], hidden-variable theories like Bohmian mechanics [15,16], collapse theories [17-20], and the transactional interpretations [21-24], the common trait being that quantum states are regarded as descriptions of physical systems.
Without aiming at reconstructing QT, the present manuscript provides an alternative and original explanation of the differences between quantum and classical probability: the algorithmic intractability of classical probability theory contrasted with the polynomial-time complexity of QT.

Outline of the Paper
Section 2 is concerned with the coherence postulate. We recall the relatively little-known fact that (imprecise) probability is the mathematical dual of a coherent logical theory. Addressing consistency (coherence or rationality) in such a setting is a standard task in logic; in practice, it reduces to proving that a certain real-valued function is nonnegative. Section 3 details the computation postulate and its role in developing a model of algorithmic rationality. We consider the problem of verifying the nonnegativity of a function as above. This problem is generally undecidable or, when decidable, NP-hard. We make the problem polynomially solvable by redefining the meaning of (non)negativity. We give our fundamental weirdness theorem (Theorem 1), showing that such a redefinition is at the heart of the clash between classical probability and algorithmic rationality.
We show in Sect. 4 that QT is a special instance of algorithmic rationality and hence that Theorem 1 is the distinctive reason for all quantum paradoxes: the case of entanglement is detailed in Sect. 4.3; in Sect. 4.4 we show that the witness function, in the fundamental 'entanglement witness theorem', is nothing else than a negative function whose negativity cannot be assessed in polynomial time, whence it is not 'negative' in QT.
Section 5 devises a further theory related to the experiment of tossing a classical coin under algorithmic rationality, which is hence unrelated to QT. We show that the theory admits entangled states, as prescribed by the weirdness theorem.
We give our concluding views in Sect. 6.
Appendix A discusses in more detail our view of QT, highlighting in particular some aspects of our theory as well as some connections with other research fields. Appendix B contains the proofs of the formal statements.

Classical Rationality
De Finetti's subjective foundation of probability [51] is based on a notion of rationality (consistency or coherence). The idea is that of introducing a betting scheme and defining bettors as rational if their stakes are placed so as to avoid a sure loss (this is traditionally called a Dutch book; Economics refers to it as 'arbitrage'). De Finetti shows that avoiding sure loss is equivalent to representing a bettor's beliefs through classical (subjective) probability, thus providing a solid foundation for the latter.

Desirability
What is less known, however, is that de Finetti's bright intuition has been greatly extended in [52,53], giving rise to the so-called theory of desirable gambles (TDG). This can equivalently be regarded as a reformulation of the well-known Bayesian decision theory (à la Anscombe-Aumann [54]) once it is freed of the constraint of dealing with complete preferences [55,56]. TDG is a dual theory of probability in the sense that probability is recovered from TDG through standard mathematical duality. In such a dual form, TDG appears just as a set of logical axioms.
These axioms have a natural interpretation as rationality requirements on the way a 'classical' subject (we call him Isaac) accepts gambles on the results of an uncertain experiment. For instance, Isaac might claim: 'I find the gamble that returns 1 utile if the coin lands heads (H) and −2 utiles if it lands tails (T) to be desirable'. This means that he is willing to accept the gamble g = (1, −2) , that is, g(H) = 1 and g(T) = −2 : that is, to commit to both winning 1 utile if the coin lands heads and losing 2 utiles if it lands tails.
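As a minimal sketch (our own illustration, not from the paper), the betting language can be coded directly: a gamble is a map from outcomes to utiles, and a subjective belief on the coin induces an expected payoff.

```python
# Minimal sketch (ours): Isaac's gamble g = (1, -2) on a coin toss.
g = {"H": 1.0, "T": -2.0}   # payoff in utiles for each outcome

def payoff(gamble, outcome):
    """Utiles Isaac wins (or loses, if negative) when `outcome` occurs."""
    return gamble[outcome]

def expectation(gamble, p_heads):
    """Expected payoff of the gamble under a belief p(H) = p_heads."""
    return p_heads * gamble["H"] + (1 - p_heads) * gamble["T"]

print(payoff(g, "H"))         # 1.0: he wins 1 utile on heads
print(payoff(g, "T"))         # -2.0: he loses 2 utiles on tails
print(expectation(g, 0.8))    # ≈ 0.4: under p(H)=0.8 the gamble looks favourable
```

The point of the example is only notational: a gamble is a bounded real-valued function on the possibility space, and accepting it is a commitment to the corresponding transaction.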
Gambles are thus rewards contingent on the uncertain outcome of an 'experiment', such as tossing a coin in the example above. We denote by Ω its possibility space (e.g., {heads, tails} , ℝ n , ℂ n ). For many experiments, there may be more than one possibility space of interest to the 'experimenter', Ω 1 , Ω 2 , … , Ω k . A possibility space describing the joint outcome of this k-valued experiment can be constructed as the Cartesian product Ω = Ω 1 × Ω 2 × ⋯ × Ω k (in other words, the outcomes of different experiments are assumed to be logically independent). Formally, a gamble g on Ω is a bounded real-valued function of Ω.

In an experiment, not all the quantities are observable and, therefore, bettable; we denote by L R the restricted set of all 'permitted gambles' on Ω . We assume that L R is a linear space (a vector space) including the constant functions. The subset of all nonnegative gambles in L R , that is, of gambles for which Isaac is never expected to lose utiles (abstract units of utility, which we can approximately identify with money provided we deal with small amounts of it [57, Sect. 3.2.5]), is denoted by L ≥ R ∶= {g ∈ L R ∶ inf g ≥ 0} (analogously, negative gambles are denoted by L < R ∶= {g ∈ L R ∶ sup g < 0} ). In the following, with G ∶= {g 1 , g 2 , … , g |G| } ⊂ L R we denote a finite set of gambles that Isaac finds desirable: these are the gambles that he is willing to accept and thus commits himself to the corresponding transactions.
The crucial question is how to provide a criterion for a set G of gambles, representing assessments of desirability, to be regarded as rational. As we said, rationality is traditionally imposed by avoiding sure losses: that is, by requiring that Isaac should not be forced to find a negative gamble desirable as a logical consequence of his initial assessments of desirability. An elegant way to formalise this intuition is to regard L R as an algebra of formulas on top of which to define a logic. This leads us directly to formulate rationality as logical consistency.
To proceed on this route, we first need to define an appropriate logical calculus (characterising the set of gambles that Isaac must find desirable as a consequence of having desired G in the first place) and based on it to characterise the family of consistent sets of assessments.
First of all, since nonnegative gambles may increase Isaac's utility without ever decreasing it, we have that: A0. L ≥ R should always be desirable.
This defines the tautologies of the calculus.
Moreover, whenever f, g are desirable for Isaac, then any positive linear combination of them should also be desirable (this amounts to assuming that Isaac has a linear utility scale, which is a standard assumption in probability). Hence the corresponding deductive closure of a set G is given by:

A1. K ∶= posi(G ∪ L ≥ R ).

Here 'posi' denotes the conic hull operator, posi(A) ∶= { ∑ i λ i g i ∶ λ i ∈ ℝ ≥ , g i ∈ A} (a technicality is that, when G is not finite, A1 should require in addition that K is topologically closed; the case when G is infinite is analogous, but we will only consider the finite case in this paper because it suffices for deriving finite-dimensional QT). In the betting interpretation given above, a sure loss for an agent is represented by a negative gamble: by accepting a negative gamble an agent will lose utiles no matter the outcome of the experiment. We are therefore led to the following:

A2. K ∩ L < R = ∅.
Note that K is incoherent if and only if −1 ∈ K ; therefore −1 can be regarded as playing the role of the Falsum in logic, and hence A2 can be reformulated as: A2′. −1 ∉ K.
An example that gives an intuition of the postulates is given in Fig. 1.
Postulate A2 (resp. A2 ′ ), which presupposes postulates A0 and A1, provides the normative definition of TDG, referred to as T . Moreover, as simple as it looks, it alone is the pillar of the foundation of classical subjective probability.

Probability (The Desirability Dual)
Let us show that probability is dual to desirability as described in Sect. 2.1. First of all, however, let us make some terminology precise: when we write probability, as a function, we mean a probability charge, i.e., a finitely additive probability (historically, de Finetti works with finitely additive probabilities, while Kolmogorov stays within the special case of sigma additivity); indeed, the analysis literature calls 'charge' a finitely additive set function [58, Chap. 11]. A charge then coincides with what we have called a quasi-probability; if we instead want to refer to an actual probability, we have to use the qualified expression probability charge.

Assume K is coherent. We give K a probabilistic interpretation by first observing that, since L R is a topological vector space (equipped with the supremum norm, L R constitutes a Banach space, and its topological dual L * R is the space of all bounded linear functionals on it; we assume the weak* topology on L * R ), we can consider its dual space L * R of all bounded linear functionals L ∶ L R → ℝ . Then the dual of K is defined as:

K • ∶= {L ∈ L * R ∶ L(g) ≥ 0 for all g ∈ K, L(1) = 1}.

K • is the set of (belief) states; L(1) = 1 means that linear functionals preserve the unitary gamble (normalisation); L(g) ≥ 0 means that L(g) must be a nonnegative real number for all gambles g ∈ G that Isaac finds desirable. Technically, one can show that K is a closed convex cone and K • is a section of the dual cone of K ; K • is a closed convex set. Note also that, given a closed convex set R of states L, we can define its dual cone R • ∶= {g ∈ L R ∶ L(g) ≥ 0 for all L ∈ R} , which is a coherent set of desirable gambles (it satisfies A0-A2 ′ ). Therefore there is a bijection between coherent sets of desirable gambles and closed convex sets of states. Since this relation preserves all relevant operations, such as conditioning and marginalisation, the two views (in terms of sets of desirable gambles and of convex sets of states) are, mathematically, the same. To K • we can then associate its extension K̄ • in M , that is, the set of all probability charges on Ω extending an element in K • .

In other words, we can write L(g) as an expectation with respect to a probability: L(g) = ∫ g(ω) dμ(ω) . One can then show that the extension K̄ • is equal to:

K̄ • = {μ ∈ P ∶ ∫ g(ω) dμ(ω) ≥ 0 for all g ∈ K} , (3)

where P is the set of all probability charges on Ω , and M the set of all charges on Ω .
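For a finite possibility space the duality is elementary and can be checked numerically. The following sketch is our own toy example: starting from the single coin gamble g = (1, −2) of Sect. 2.1, the probabilities compatible with the assessment (the dual set) are exactly those with p(H) ≥ 2/3.

```python
# Toy duality check (ours): which probabilities p on {H, T} give every gamble
# in Isaac's assessment set a nonnegative expectation?
import numpy as np

g = (1.0, -2.0)                       # Isaac's desirable gamble from Sect. 2.1
grid = np.linspace(0.0, 1.0, 10001)   # candidate values of p(H)
compatible = [p for p in grid if p * g[0] + (1 - p) * g[1] >= 0]

print(min(compatible))  # ≈ 0.6667: exactly the bound p(H) >= 2/3 predicted by duality
print(max(compatible))  # 1.0
```

Solving p · 1 + (1 − p) · (−2) ≥ 0 by hand gives 3p − 2 ≥ 0, i.e. p ≥ 2/3, which is what the grid search recovers: the dual of the coherent set generated by g is a (convex) set of probabilities, not a single one.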
Equation (3) states that, whenever an agent is coherent, desirability of g corresponds to nonnegative expectation, that is, ∫ g(ω) dμ(ω) ≥ 0 for all probabilities μ in P . When K is incoherent, P turns out to be empty: there is no probability compatible with the assessments in K . Stated otherwise, satisfying the axioms of classical probability (that is, being a nonnegative function that integrates to one) is tantamount to being in the dual of a set K satisfying the coherence postulate ('no Dutch book').

Fig. 1 … (2) because, no matter Einstein's height, they may increase your wealth without ever decreasing it. Assume you have accepted all gambles in the green area and in addition gambles g 1 , g 2 in Plot (c); note that the acceptance of g 1 , g 2 depends on your beliefs about Einstein's height. Since you have accepted g 1 , g 2 , then you should also accept, because of A1, all gambles λ 1 g 1 (x) + λ 2 g 2 (x) + h(x) with λ i ≥ 0 and h ∈ L ≥ . Some of these gambles are depicted in Plot (d). Assume that, instead of Plot (c), you have accepted the green area and g 1 , g 2 in Plot (e). Then you must also accept g 1 + g 2 , because of A1. However, g 1 + g 2 is always negative: it is the blue function in Plot (f). You always lose utiles in this case. In other words, by accepting g 1 , g 2 you incur a sure loss: A2 is violated and so you are irrational (Color figure online)

Algorithmic Rationality
Let us reconsider the classical theory introduced in the previous section. Assume that Isaac makes an initial (finite) set of assessments G , which represent his initial beliefs about an experiment. In order to evaluate Isaac's desirability of a further gamble f ∈ L R , we need to solve the membership problem f ∈? K . This can equivalently be expressed as the following nonnegativity decision problem:

f − ∑ i λ i g i ∈ L ≥ R for some λ i ∈ ℝ ≥ , g i ∈ G ? (4)

If the answer is 'yes', then the gamble f belongs to K , which is the conic closure of G ∪ L ≥ R , and this proves its desirability. Note that checking whether K is coherent or not is tantamount to solving (4) for f = −1.
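When the possibility space is finite, the membership problem above reduces to a linear-programming feasibility question. The following sketch (our own illustration, under the assumption of the two-outcome coin space) decides whether f − ∑ λᵢgᵢ is nonnegative for some λ ≥ 0:

```python
# Sketch (ours): for a FINITE possibility space, problem (4) is an LP feasibility
# question. Here Omega = {H, T}; a gamble is the vector (g(H), g(T)).
import numpy as np
from scipy.optimize import linprog

def in_closure(f, G):
    """Is f in posi(G ∪ L≥), i.e. does sum_i lam_i g_i(w) <= f(w) hold for some lam >= 0?"""
    A_ub = np.array(G).T                 # one row of constraints per outcome w
    res = linprog(c=np.zeros(len(G)), A_ub=A_ub, b_ub=np.array(f),
                  bounds=[(0, None)] * len(G))
    return res.status == 0               # 0 = feasible: f is a desirable consequence of G

G = [(1.0, -2.0)]                        # Isaac's assessment: win 1 on heads, lose 2 on tails
print(in_closure((2.0, -4.0), G))        # True: a positive scaling of an accepted gamble
print(in_closure((-1.0, -1.0), G))       # False: -1 is not derivable, so G is coherent
```

The second call is exactly the coherence check described in the text: solving (4) for a strictly negative gamble; infeasibility of the LP means no sure loss is derivable from G.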

Algorithmic Desirability
However, computing such an inference may be 'costly', if not outright unfeasible. Indeed, when Ω is infinite (later on we shall consider the case Ω ⊂ ℂ n ) and for generic functions f, g i , the nonnegativity decision problem is undecidable. In this paper, we consider the case where gambles are (complex) multivariate polynomials of degree at most d. In this case, by Tarski-Seidenberg quantifier elimination [59,60], problem (4) becomes decidable but remains intractable, being in general NP-hard. From this perspective, the classical theory is therefore not suitable as a realistic model of rationality.
The idea of modifying the standard theory by considering computational issues traces back to the work of Good [61] and Simon [62]. Since then there have been two main approaches to the problem: either charging an agent for doing costly computation (as initiated in [63]), or limiting the computation that agents can do (as initiated in [64], and first used in the context of decision theory in [65]).
In what follows, we take inspiration from the second approach, and, employing a terminology stemming from [66], develop a model of algorithmic rationality for the framework under consideration. Our subject in such an algorithmic world is now called Alice, to distinguish her from Isaac, who lives in the classical world.
The intuition behind our approach is the following. Assume that, due to computational, or other types of, limits, Alice can only work out the decision problem (4) for a closed subcone Σ ≥ of the nonnegative gambles L ≥ R :

f − ∑ i λ i g i ∈ Σ ≥ for some λ i ∈ ℝ ≥ , g i ∈ G ? (5)

This means in particular that there will be a nonnegative gamble f ∈ L ≥ R that Alice cannot actually assess as nonnegative; thus she may well decide not to accept it. Similarly, Alice's initial set of assessments G may contain a negative gamble, but this notwithstanding, the answer to the corresponding coherence decision problem may be positive (solving (5) for f = −1 may lead to a negative answer).
Should these behaviours be counted as rational? Logic claims that they should: in fact, from the perspective of an agent whose rationality is constrained by Σ ≥ , a collection of assessments is logically consistent whenever its deductive closure contains all the tautologies as given by Σ ≥ but does not contain −1 , the Falsum.
In other words, an algorithmic TDG, which we denote by T ⋆ , should be based on a logical redefinition of the tautologies, i.e., on stating that

B0. Σ ≥ should always be desirable,

in the place of A0, where Σ ≥ is a closed subcone of L ≥ R whose corresponding membership problem (5) delimits the type of computation that an agent can actually do.
The rest of the theory follows exactly in the footprints of T . In particular, the deductive closure for T ⋆ is defined by:

B1. C ∶= posi(G ∪ Σ ≥ ).

Finally, the coherence postulate is simply reformulated by stating that a set C of desirable gambles is said to be A-coherent if and only if

B2. C ∩ Σ < = ∅ ,

where Σ < denotes the interior of −Σ ≥ , and 'A' stands for the fact that in T ⋆ the algorithmic bounds of the coherence problem for a finite set of assessments are established according to the particular choice of Σ ≥ .
Hence, T ⋆ and T have the same deductive apparatus; they just differ in the considered set of tautologies, and thus in their (in)consistencies. An example that gives an intuition of the postulates is given in Fig. 2.

Quasi-Probability (The Algorithmic Desirability Dual)
Interestingly, as we did previously, we can associate a 'probabilistic' interpretation to the desirability calculus, now defined by B0-B2, through the dual of an A-coherent set.
So let us consider again the dual space L * R of all bounded linear functionals L ∶ L R → ℝ . With the additional condition that linear functionals preserve the unitary gamble, the dual cone of an A-coherent C ⊂ L R is given by

C • ∶= {L ∈ L * R ∶ L(g) ≥ 0 for all g ∈ C, L(1) = 1}.

To C • we can associate its extension C̄ • in M , that is, the set of all charges on Ω extending an element in C • . In other words, we can attempt to write L(g) as an 'expectation', that is, an integral with respect to a charge: L(g) = ∫ g(ω) dμ(ω) . In general, however, this set does not yield a classical probabilistic interpretation to T ⋆ : in fact, whenever Σ ≥ ⊊ L ≥ R , there are negative gambles that Alice, given her rationality constraints, does not recognise as such and that therefore, from her perspective, do not lead to a sure loss. This is stated more precisely by the following:

Theorem 1 (The weirdness theorem) Assume that Σ ≥ includes all positive constant gambles and that it is closed (in L R ). Denote by Σ < the interior of −Σ ≥ . Let C ⊆ L R be an A-coherent set of desirable gambles. The following statements are equivalent:

1. C includes a negative gamble g ∈ L < R (with g ∉ Σ < );
2. the extension C̄ • includes no probability charge, i.e., C has no classical probabilistic interpretation;
3. any evaluation functional compatible with C is non-classical;
4. any charge in C̄ • is a quasi-probability: it satisfies L(1) = ∫ dμ(ω) = 1 but is negative somewhere.

Theorem 1 is the central result of this paper (its proof is in Appendix B). It states that whenever C includes a negative gamble (item 1), there is no classical probabilistic interpretation for it (item 2). The other items suggest alternative solutions to overcome this deadlock: either to change the notion of evaluation functional (item 3) or to use quasi-probabilities as a means for interpreting T ⋆ (item 4). The latter case means that, when we write L(g) = ∫ g(ω) dμ(ω) , then μ satisfies L(1) = ∫ dμ(ω) = 1 but it is not a probability charge.
Observe that requiring polynomial time complexity is just one way to create the conditions for Theorem 1 to hold. But there are others, in that it is enough that one single negative gamble belongs to C to make the theorem hold. In other words, even if we allowed for exponential time complexity, there would still be gambles whose negativity we would not be able to evaluate (those that lead to undecidability). This is the reason why we use the terminology 'algorithmic' rationality, which appears to faithfully capture the idea that our capabilities are limited by the very fact of reasoning algorithmically.
However, and since we are particularly concerned with physics in this paper, we also embrace Aaronson's point of view in [67]: … while experiment will always be the last appeal, the presumed intractability of NP-complete problems might be taken as a useful constraint in the search for new physical theories as a reason to focus on a polynomial-time complexity definition of algorithmic rationality.

QT as a Theory of Algorithmic Rationality
We are going to show that QT can be deduced from a particular instance of the theory T ⋆ . As a consequence, we get that the computation postulate, and in particular B0, is the unique reason for all its paradoxes, which all boil down to a rephrasing of the various statements of Theorem 1 in the considered quantum context.

Setting
Let us initially focus on the possibility space we shall use. Consider first a single particle n-level system and let In some cases we can interpret an element x ∈ ℂ n as 'input data' for some classical preparation procedure. For instance, in the case of the spin-1/2 particle ( n = 2 ), if = [ 1 , 2 , 3 ] is the direction of a filter in the Stern-Gerlach experiment, then x is its one-to-one mapping into ℂ 2 (apart from a phase term). For spin greater than 1/2, Let us reconsider Einstein's height example. Assume we can split the nonnegative gambles in two groups: (i) those whose nonnegativity can be assessed in polynomial time (orange colour in Plot (a)); (ii) those whose nonnegativity cannot be assessed in polynomial time (blue colour in Plot (a)). If you are an algorithmically rational agent, then you should surely accept all the orange gambles: they are nonnegative and you can evaluate their nonnegativity in polynomial time. A-coherence demands that you should accept all nonnegative orange gambles, Plot (b), and avoid all negative orange gambles, Plot (c). Assume that you have accepted all nonnegative orange gambles and g 1 (red), g 2 (green) in Plot (d). Then you must also accept g 1 + g 2 because of B1. Note that g 1 + g 2 is always negative. However, according to A-coherence, you are algorithmically irrational only if g 1 + g 2 is of type orange, that is, if you can evaluate in polynomial time that g 1 + g 2 is negative, Plot (d). In case g 1 + g 2 is of type blue Plot (e), although you are indirectly accepting a gamble that is negative, you are not algorithmically irrational. The reason is that you may not be able to evaluate its negativity (Color figure online) however, the element x ∈ ℂ n cannot directly be interpreted only in terms of a 'filter direction'. In our framework element x is thus better interpreted as the state of the ontological world, which we have sketched in the Introduction. 
It is a world that is not directly accessible to an observer inside the theory (Alice), albeit it has implications for observables within such a theory.
For a composite systems of m particles (each one is an n j -level system), the joint possibility space is the Cartesian product Having defined the possibility space, the next step is the definition of the observables, which define the gambles in our setting. Let us recall that in QT any real-valued observable is described by a Hermitian operator. This naturally imposes restrictions on the type of 'permitted gambles' g on a quantum experiment. For a single particle, given a Hermitian operator G ∈ H n×n (with H n×n being the set of Hermitian matrices of dimension n × n ), a gamble on x ∈ ℂ n can be defined as: Since G is Hermitian and x is bounded ( x † x = 1 ), g is a real-valued bounded function ( g(x) = ⟨x�G�x⟩ in 'bra-ket' notation). For a composite systems of m particles, the gambles are m-quadratic forms: with G ∈ H n×n , n = ∏ m j=1 n j , and where ⊗ denotes the tensor product between vectors regarded as column matrices. Therefore, we have that is the restricted set of 'permitted gambles' in a quantum experiment. We can also define the subset of nonnegative gambles L ≥ R ∶= {g ∈ L R ∶ min g ≥ 0} and the subset of negative gambles Remark 2 (The tensor product) In our setting the tensor product is ultimately a derived notion, not a primitive one, as it follows by the properties of m-quadratic forms (see Appendix A.2).

Polynomial Inference and Agreement with Born's Rule
For m = 1 (a single particle), evaluating the nonnegativity of the quadratic form x † Gx boils down to checking whether the matrix G is positive semi-definite (PSD) and therefore the membership problem can be solved in polynomial time and so can be problem (4). This is no longer true for m ≥ 2 : indeed, in this case there exist polynomials of type (7) that are nonnegative, but whose matrix G is indefinite (it has at least one negative eigenvalue). Moreover, it turns out that problem (4) is not tractable: The problem of checking the nonnegativity of functions of type (7) is NP-hard for m ≥ 2.
What to do? As discussed previously, we could change the meaning of 'being nonnegative' by considering a subset Σ≥ ⊊ L_R^≥ for which the membership problem, and thus (4), is in P. For functions of type (7), we can extend the notion of nonnegativity that holds for a single particle to m > 1 particles: the function (10) is declared 'nonnegative' whenever G is PSD. Note that Σ≥ is the so-called cone of Hermitian sum-of-squares polynomials (see Sect. A.4), and that in Σ≥ the nonnegative constant functions take the form g(x_1, …, x_m) = c (⊗_{j=1}^m x_j)† I (⊗_{j=1}^m x_j) with c ≥ 0. Now, consider any set of desirable gambles C satisfying B0-B2 with the given definition (10) of Σ≥; this results in an algorithmic rationality theory that is precisely QT. In other words, from the algorithmic rationality axioms and the given definition (10), we can derive the first postulate of QT (see for instance Postulate 1 in [70, p. 110]): Associated to any isolated physical system is a complex vector space with inner product (that is, a Hilbert space) known as the state space of the system. The system is completely described by its density operator, which is a positive operator with trace one, acting on the state space of the system.
Indeed, although the possibility space is infinite (e.g., the 'directions' of the particle's spins), the vector space of gambles L_R is finite dimensional: any polynomial (⊗_{j=1}^m x_j)† G (⊗_{j=1}^m x_j) ∈ L_R can be written as the inner product of a vector of complex coefficients, coming from the matrix G, and a vector of complex monomials: the elements of the matrix (⊗_{j=1}^m x_j)(⊗_{j=1}^m x_j)†, which constitute a basis of the vector space L_R. Therefore the dual space L*_R is finite dimensional too and corresponds to a space of linear operators L: L_R → ℝ, whose basis is given by the elements of the matrix L((⊗_{j=1}^m x_j)(⊗_{j=1}^m x_j)†). That said, let 𝒢 be a finite set of assessments, and K the deductive closure as defined by B1; it is not difficult to prove that the dual of K is Q = {ρ ∈ S : Tr(Gρ) ≥ 0 for every gamble in K with matrix G}, where S = {ρ ∈ H^{n×n} | ρ ≥ 0, Tr(ρ) = 1} is the set of all density matrices. As before, whenever the set C representing Alice's beliefs about the experiment is coherent, Eq. (11) means that desirability implies nonnegative 'expected value' for all models in Q. Note that in QT the expectation of g is Tr(Gρ). This follows by Born's rule, a law giving the probability that a measurement on a quantum system will yield a given result.
The agreement with Born's rule is an important constraint on any alternative axiomatisation of QT. This is also the case for our theory, but in the sense that Born's rule can be derived from it. In fact, viewing the density matrix as a dual operator, Tr(Gρ) is formally equal to L((⊗_{j=1}^m x_j)† G (⊗_{j=1}^m x_j)); this follows by the linearity of the trace operator.

Example 1 Consider the case n = m = 2. The expression L((⊗²_{j=1} x_j)(⊗²_{j=1} x_j)†) means that the operator L is applied component-wise to the elements of the matrix (⊗²_{j=1} x_j)(⊗²_{j=1} x_j)†, where the monomials inside this matrix constitute the basis of L_R and L: L_R → ℝ.

Hence, when a projection-valued measurement characterised by the projectors Π_1, …, Π_n is considered, it holds that E[(⊗_{j=1}^m x_j)† Π_i (⊗_{j=1}^m x_j)] = Tr(Π_i ρ). Since Π_i ≥ 0 and the polynomials (⊗_{j=1}^m x_j)† Π_i (⊗_{j=1}^m x_j) for i = 1, …, n form a partition of unity, i.e., ∑_{i=1}^n (⊗_{j=1}^m x_j)† Π_i (⊗_{j=1}^m x_j) = 1, we have that Tr(Π_i ρ) ≥ 0 and ∑_{i=1}^n Tr(Π_i ρ) = 1, which is Born's rule.
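As a quick numerical sanity check of the derivation (a sketch with an assumed Bell-type state, not an example from the text): for a computational-basis measurement, the Born probabilities Tr(Π_i ρ) are the diagonal entries of ρ, are nonnegative, and sum to one:

```python
import math

# Pure state |psi> = (|00> + |11>)/sqrt(2) and its density matrix rho = |psi><psi|.
s = 1 / math.sqrt(2)
psi = [s, 0.0, 0.0, s]
rho = [[a * b for b in psi] for a in psi]  # real amplitudes, so no conjugation needed

# The projector onto basis vector e_i picks out a diagonal entry: Tr(Pi_i rho) = rho[i][i].
p = [rho[i][i] for i in range(4)]
print(abs(sum(p) - 1) < 1e-12)  # True: the projectors form a partition of unity
```

The outcomes |00⟩ and |11⟩ each receive probability 1/2, the other two probability 0, as Born's rule prescribes for this state.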
Remark 3 (Discrete vs. continuous space probability) Quantum measurements are discrete: when we perform a measurement, we observe a detection along one of the directions Π_i. This phenomenon of quantisation is one of the major differences between quantum and classical physics. We took it into account in the choice of the framework: the possibility space consists (only) of the 'directions' of the particles' spins, and the measurement apparatus senses only certain fixed 'directions' v_i (a measurement gamble is a function of two 'directions', the state x and the apparatus direction v_i). Despite its centrality, we want however to point out that quantisation is not the source of Bell-like inequalities and entanglement. As said before, this is because 'quantum weirdness' is intrinsic to any theory of algorithmic rationality as above, and is hence not confined to QT only.
It is often claimed that QT includes classical probability theory (CPT) as a special case, or better that QT includes discrete-space CPT (in the present framework, such a view is due to properties of quadratic forms: x†Π_1x, …, x†Π_nx form a partition of unity, and therefore the expectations Tr(Π_iρ) define a discrete probability distribution, with ∑_{i=1}^n Tr(Π_iρ) = 1). However, as the possibility space is infinite (e.g., the 'directions' of the particle's spins), in this paper when we speak about CPT (and compare it with QT) we mean continuous-space classical probability theory (in the complex numbers). Hence again, since both B1, B2 and A1, A2 are the same logical postulates parametrised by the appropriate meaning of 'being negative/nonnegative', the only axiom truly separating (continuous-space) classical probability theory from the quantum one is B0 (with the specific form of (10)), thus implementing the requirement of computational efficiency.

In other words, we claim that QT is 'easier' than CPT because, once the appropriate possibility space, observables and queries are specified, evaluating the consistency of the theory is NP-hard for CPT. In QT, we realise this clearly when we try to address the question of whether or not an experimentally generated state is entangled. We will discuss in Sect. 4.3 that determining entanglement of a general state is equivalent to proving the nonnegativity of a polynomial which, as we discussed in Proposition 1, is NP-hard. In fact, we can reformulate the entanglement witness theorem as the clash between the classical notion of coherence and A-coherence (see Theorem 2).
Remark 4 (Truncated moment matrices vs. density matrices) In a single-particle system of dimension n, ρ = L(xx†). In such a case, ρ can be interpreted as a truncated moment matrix, i.e., there exists a probability distribution μ on the complex vectors x ∈ Ω such that ρ = ∫ xx† dμ(x). In fact, consider the eigenvalue-eigenvector decomposition of the density matrix, ρ = ∑_i λ_i v_i v_i†, with λ_i ≥ 0 and v_i ∈ ℂ^n orthonormal. We can define the probability distribution μ = ∑_i λ_i δ_{v_i}, where δ_{v_i} is an atomic charge (Dirac's delta) on v_i. Then it is immediate to verify that ∫ xx† dμ(x) = ∑_i λ_i v_i v_i† = ρ. In Sect. 4.4, we will extend this result to separable states. Note also that a truncated moment matrix does not uniquely define a probability distribution, i.e., for a given ρ there may exist two probability distributions μ_1(x) ≠ μ_2(x) such that ρ = ∫ xx† dμ_1(x) = ∫ xx† dμ_2(x). This means that, if we interpret ρ as a truncated moment matrix, and thus as defining via (15) a closed convex set of probabilities (more precisely, charges), QT is a theory of imprecise probability [53]. We will discuss more on this topic in Sect. A.3. In fact, Karr [71] has proved that the set of probabilities which are feasible for the truncated moment constraint, e.g., ρ = L(xx†), is convex and compact with respect to the weak*-topology. Moreover, the extreme points of this set are probabilities that have a finite number of distinct points of support (e.g., they are finite mixtures of Dirac's deltas). A similar characterisation for POVM measurements is discussed in the QT context in [72]. The case of a many-particle system is discussed in the next sections.
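The non-uniqueness noted in the remark is easy to exhibit concretely (a minimal sketch with real vectors, assumed data): two distinct discrete distributions can share the same truncated moment matrix ∑_i w_i v_i v_i†, here ρ = I/2:

```python
import math

def moment_matrix(weights, points):
    """Truncated moment matrix sum_i w_i v_i v_i^T of a discrete distribution."""
    n = len(points[0])
    M = [[0.0] * n for _ in range(n)]
    for w, v in zip(weights, points):
        for i in range(n):
            for j in range(n):
                M[i][j] += w * v[i] * v[j]
    return M

s = 1 / math.sqrt(2)
rho1 = moment_matrix([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])  # uniform on e1, e2
rho2 = moment_matrix([0.5, 0.5], [[s, s], [s, -s]])         # uniform on (e1 +- e2)/sqrt(2)
same = all(abs(rho1[i][j] - rho2[i][j]) < 1e-12 for i in range(2) for j in range(2))
print(same)  # True: both distributions have moment matrix I/2
```

The two mixtures place mass on different support points yet are indistinguishable at the level of second moments, which is exactly the imprecise-probability reading of ρ given above.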

Entanglement
Entanglement is usually presented as a characteristic of QT. In this section we are going to show that it is actually an immediate consequence of algorithmic rationality.
To illustrate the emergence of entanglement from A-coherence, we verify that the set of desirable gambles whose dual is an entangled density matrix ρ_e includes a negative gamble that is not in Σ<; thus, although it is logically coherent, it cannot be given a classical probabilistic interpretation.
In what follows we focus only on bipartite systems Ω_A × Ω_B, with n = m = 2. The results are nevertheless general.
Let (x, y) ∈ Ω_A × Ω_B, where x = [x_1, x_2]^T and y = [y_1, y_2]^T. We aim to show that there exists a gamble h(x, y) = (x ⊗ y)† H (x ⊗ y) satisfying (16): Tr(Hρ_e) ≥ 0 and h(x, y) < 0 for all (x, y). The first inequality says that h is desirable in T⋆, that is, h is a gamble desirable to Alice, whose beliefs are represented by ρ_e. The second inequality says that h is negative and, therefore, leads to a sure loss in T. By B0-B2, the inequalities in (16) imply that H must be an indefinite Hermitian matrix.
Assume that n = m = 2 and consider the entangled density matrix ρ_e = |Φ⟩⟨Φ| with |Φ⟩ = (1/√2)[1, 0, 0, 1]^T. Let x_i = x_{ia} + i x_{ib} and y_i = y_{ia} + i y_{ib}, with x_{ia}, x_{ib}, y_{ia}, y_{ib} ∈ ℝ for i = 1, 2, denote the real and imaginary components of x, y. The gamble h in (17) can then be rewritten as a polynomial in these real variables (Eq. (18)), from which both inequalities in (16) can be verified. This is the essence of the quantum puzzle: C is A-coherent but (Theorem 1) there is no P associated to it and therefore, from the point of view of Isaac, who holds a classical probabilistic interpretation, it is not coherent: in any classical description of the composite quantum system, x and y appear to be entangled in a way unusual for classical subsystems.
As previously mentioned, there are two possible ways out from this impasse: to claim the existence of either non-classical evaluation functionals or of negative probabilities. Let us examine them in turn.
(1) Existence of non-classical evaluation functionals: From an informal betting perspective, the effect of a quantum experiment on h(x, y) is to evaluate this polynomial to return the payoff for Alice. By Theorem 1, there is no compatible classical evaluation functional, and thus in particular no values x, y ∈ Ω_A × Ω_B such that h(x, y) = 1. Hence, if we adopt this point of view, we have to find another, non-classical, explanation for h(x, y) = 1. The following evaluation functional, denoted ev(⋅), may do the job: ev(x_1y_1) = ev(x_2y_2) = √2/2 and ev(x_1y_2) = ev(x_2y_1) = 0. Note that x_1y_1 = √2/2 and x_2y_1 = 0 together imply that x_2 = 0, which contradicts x_2y_2 = √2/2. Similarly, x_2y_2 = √2/2 and x_1y_2 = 0 together imply that x_1 = 0, which contradicts x_1y_1 = √2/2. Hence, as expected, the above evaluation functional is non-classical: it amounts to assigning a value to the products x_iy_j but not to the single components of x and y separately. Quoting Holevo [68, Supplement 3.4]: entangled states are holistic entities in which the single components only exist virtually.
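The contradiction can also be seen through a small rank argument (a sketch, not from the text): a classical evaluation would set m_{ij} = x_i y_j, so the matrix M = (m_{ij}) would be an outer product, of rank at most one and with zero determinant; the assignment ev(⋅) above violates this:

```python
import math

s = math.sqrt(2) / 2
# Values assigned by ev to the four products x_i * y_j.
M = [[s, 0.0],
     [0.0, s]]

# Any classical assignment M = x y^T has det(M) = 0; here det(M) = 1/2 != 0,
# so no pair of vectors x, y can reproduce these four values simultaneously.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(abs(det - 0.5) < 1e-9)  # True
```

This is the linear-algebra shadow of Holevo's remark: only the products x_iy_j receive values, never the factors.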
(2) Existence of negative probabilities: Negative probabilities are not an intrinsic characteristic of QT. They appear whenever one attempts to explain QT 'classically' by looking at the space of charges on Ω. To see this, consider ρ_e, and assume that, based on (12), one calculates the corresponding moments. Because of Theorem 1, there is no probability charge satisfying these moment constraints; the only compatible charges are quasi-probabilities. Table 1 reports the nine components and corresponding weights of one of them. Note that some of the weights are negative but ∑_{i=1}^9 w_i = 1, meaning that we have an affine combination of atomic charges (Dirac's deltas). Consider for instance the first monomial x_1x_1†y_1y_1† in (12): its expectation w.r.t. the above charge reproduces the corresponding moment of ρ_e. The charge described in Table 1 is one among the many that satisfy (12) and has been derived numerically. Explicit procedures for constructing such negative-probability representations have been developed in [73-76].

Table 1 Weights for the charge in (12). The i-th column of the row denoted as x (resp. y) corresponds to the element x^(i) (resp. y^(i)).
Again, we want to stress that the two paradoxical interpretations above are a consequence of Theorem 1, and therefore can emerge in any instance of a theory of A-coherence in which the hypotheses of this result hold.

Entanglement Witness
Do quantum and classical probability sometimes agree? Yes they do, namely for density matrices such that Eq. (16) does not hold, and thus in particular for separable density matrices. We make this claim precise by providing a link between Eq. (16) and the entanglement witness theorem [77, 78]. We first report the definition of entanglement witness [79, Sect. 6.3.1]:

Definition 2 (Entanglement witness) A Hermitian operator W is an entanglement witness if and only if W is not a positive operator but (x_1 ⊗ x_2)† W (x_1 ⊗ x_2) ≥ 0 for all x_1, x_2.

The next well-known result (see, e.g., [79, Theorem 6.39, Corollary 6.40]) provides a characterisation of entangled and separable states in terms of entanglement witnesses.

Proposition 2 A state ρ_e is entangled if and only if there exists an entanglement witness W such that Tr(ρ_eW) < 0. A state ρ is separable if and only if Tr(ρW) ≥ 0 for all entanglement witnesses W.
Assume that W is an entanglement witness for the entangled density matrix ρ_e and consider W′ = −W. By Definition 2 and Proposition 2, it follows that Tr(ρ_eW′) > 0 and (x_1 ⊗ x_2)† W′ (x_1 ⊗ x_2) ≤ 0. The first inequality states that the gamble (x_1 ⊗ x_2)† W′ (x_1 ⊗ x_2) is strictly desirable for Alice (in theory T⋆) given her belief ρ_e. Since the set of desirable gambles (B1) associated to ρ_e is closed, there exists ε > 0 such that W″ = W′ − εI is still desirable, i.e., Tr(ρ_eW″) ≥ 0 and (x_1 ⊗ x_2)† W″ (x_1 ⊗ x_2) ≤ −ε, where we have exploited that (x_1 ⊗ x_2)† I (x_1 ⊗ x_2) = 1. Therefore, (21) is equivalent to (16).
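For a concrete (assumed, standard) instance: W = ½I − |Φ⟩⟨Φ| with |Φ⟩ = (|00⟩ + |11⟩)/√2 is a well-known witness. The sketch below checks Tr(ρ_eW) = −½ for ρ_e = |Φ⟩⟨Φ| and that (x_1 ⊗ x_2)†W(x_1 ⊗ x_2) stays nonnegative on random product states:

```python
import math, random

s = 1 / math.sqrt(2)
phi = [complex(s), 0j, 0j, complex(s)]  # Bell state |Phi>

def w_value(f):
    """f† W f for W = I/2 - |phi><phi| (f a unit vector)."""
    overlap = sum(p.conjugate() * a for p, a in zip(phi, f))
    norm2 = sum(abs(a) ** 2 for a in f)
    return 0.5 * norm2 - abs(overlap) ** 2

def random_unit(n):
    v = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]

print(abs(w_value(phi) + 0.5) < 1e-9)  # True: Tr(rho_e W) = -1/2 detects entanglement
vals = []
for _ in range(2000):
    x, y = random_unit(2), random_unit(2)
    f = [xi * yj for xi in x for yj in y]  # product state x (tensor) y
    vals.append(w_value(f))
print(min(vals) >= -1e-9)  # True: W is nonnegative on product states
```

The nonnegativity on product states follows because no product state has squared overlap with |Φ⟩ larger than 1/2; sampling is only a spot-check of that fact, not a proof.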
Hence, by Theorem 1, we can equivalently formulate the entanglement witness theorem as an arbitrage/Dutch book: let C_ρ̃ be the set of desirable gambles corresponding to some density matrix ρ̃; then ρ̃ is entangled if and only if C_ρ̃ includes a negative gamble that is not in Σ< (Theorem 2). This result provides another view of the entanglement witness theorem in light of A-coherence. In particular, it tells us that the existence of a witness satisfying Eq. (21) boils down to the disagreement on rationality (coherence) between Isaac's classical probabilistic interpretation and Alice's theory T⋆; whenever they agree, ρ_e is separable. This connection explains why the problem of characterising entanglement is hard in QT: it amounts to proving the nonnegativity of a polynomial, which is NP-hard. We can also prove the following.

Corollary 1 Let ρ̃ be separable; then ρ̃ is a truncated moment matrix.
In other words, when ρ̃ is separable, we have an agreement between Isaac's classical view and Alice's theory T⋆ of rationality, and therefore we can give ρ̃ a fully classical probabilistic interpretation by regarding it as a truncated moment matrix.

A Theory of Algorithmic Rationality and Entanglement in the Reals
In this section we are going to present an example of entanglement in an A-coherent theory of probability that is different from QT. For this purpose, we consider two classical coins, which we denote as l (left) and r (right), and define θ_1, θ_2, θ_3 as the probabilities of the joint outcomes H_lH_r, H_lT_r, T_lH_r respectively, where H_i, T_j denote the outcomes heads and tails for the left or right coin (so that the probability of T_lT_r is 1 − θ_1 − θ_2 − θ_3). We consider the possibility space Ω = {θ = [θ_1, θ_2, θ_3] ∈ [0, 1]³ : θ_1 + θ_2 + θ_3 ≤ 1}. Note that the marginal relationships P(H_l) = θ_1 + θ_2 and P(H_r) = θ_1 + θ_3 hold. As the space of gambles L_R, we consider the set of all polynomials of the unknowns θ = [θ_1, θ_2, θ_3] of degree 2 (Eq. (24)).19

19 Degree-2 polynomials allow Alice to express desirability judgements about the probability that the outcome is H_lH_r (e.g., is the gamble θ_1 − 0.5 desirable?), and also about the probability of the outcome H_lH_r or H_lT_r (e.g., is the gamble θ_1 + θ_2 − 0.25 desirable?). Therefore, this choice of L_R is expressive; we have fixed the maximum degree to 2 just to keep the dimension of L_R small.
For instance, two elements of L_R are the gambles g_1(θ) and g_2(θ). Evaluating the nonnegativity of polynomials in L_R is in general NP-hard. Therefore, Alice may not have the computational resources to enforce full rationality, A0-A2, or, equivalently, to solve (4). However, she can use a quick algorithm to prove a sufficient condition for a polynomial in L_R to be nonnegative: a polynomial of θ is nonnegative on Ω if its coefficients are nonnegative. For instance, under this criterion, Alice can easily verify that g_2(θ) is nonnegative.
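The sufficient condition can be sketched in a few lines (with hypothetical coefficient data, not the paper's g_2): since every θ_i ≥ 0 on Ω, a polynomial all of whose coefficients are nonnegative is automatically nonnegative, and the check is linear in the number of coefficients:

```python
import random

# A degree-2 polynomial in theta = (theta1, theta2, theta3),
# stored as {exponent tuple: coefficient} (hypothetical example data).
g = {(0, 0, 0): 0.1, (1, 0, 0): 0.5, (0, 1, 1): 2.0, (2, 0, 0): 1.0}

def certainly_nonnegative(poly):
    """Sufficient condition: all coefficients nonnegative (valid since theta_i >= 0)."""
    return all(c >= 0 for c in poly.values())

def evaluate(poly, theta):
    return sum(c * theta[0]**e0 * theta[1]**e1 * theta[2]**e2
               for (e0, e1, e2), c in poly.items())

print(certainly_nonnegative(g))  # True
# Spot-check on random points of the probability simplex.
ok = True
for _ in range(200):
    a, b = sorted(random.random() for _ in range(2))
    ok = ok and evaluate(g, (a, b - a, 1 - b)) >= 0
print(ok)  # True
```

The condition is only sufficient: a nonnegative polynomial may well have some negative coefficients, which is exactly why the full problem stays NP-hard.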
Proposition 3 Let C ⊆ L_R be a set of desirable gambles satisfying B0-B1, with L_R defined in (24) and Σ≥ defined as in (27), the cone of (multivariate) Bernstein polynomials of degree at most 2. Then A-coherence of C (or, equivalently, B2) can be proven in polynomial time by solving a linear programming problem.
Therefore, the definition of nonnegativity (27) gives an algorithmically efficient way to assess rationality: linear programming.
Also in this case, we can define the dual operator L. First of all, observe that the vector of monomials b(θ) = [1, θ_1, θ_2, θ_3, θ_1θ_2, θ_1θ_3, θ_2θ_3, θ_1², θ_2², θ_3²] constitutes a basis for L_R in (24). Therefore, the dual space L*_R corresponds to the space of linear operators L: L_R → ℝ, which are determined by their values L(b(θ)) on the elements of b(θ). The dual of an A-coherent set of desirable gambles C is C• = {L ∈ 𝒮 | L(g) ≥ 0 ∀g ∈ C}, where 𝒮 = {L ∈ L*_R | L(1) = 1, L(g) ≥ 0 ∀g ∈ Σ≥} is the set of states.
Recall from (24) that L_R = {g(θ) : g(θ) is a polynomial of degree at most 2}. Consider the state L given in (29), which, as can be verified, belongs to 𝒮. We aim to show that there exists a gamble h ∈ L_R such that L(h) ≥ 0 but h(θ) < 0 for all θ ∈ Ω. Consider the polynomial gamble h built from g_1 (defined in (26)) and a constant ε > 0. It can be shown that h(θ) ≤ −ε, and so the polynomial is negative. However, its 'expectation' w.r.t. the state (29) is nonnegative. Therefore, we have violated an inequality that holds in classical probability (E(h) ≤ −ε in T), although the set of desirable gambles with L defined in (29) is logically consistent in T⋆ (A-coherent). This is the essence of Bell-type inequalities: the quantum weirdness that is also present in this example.
It is then possible [80] to set up a thought experiment where two coins are drawn from a bag in the state (29). If we give the left coin to Alice and the right coin to her friend Bob, as depicted in Fig. 3, then we can show that after the coins move apart there are 'matching' correlations between the outputs of their tosses. That is, if Alice measures (through a toss) the bias of one coin, then she can predict with certainty the outcome of the measurement (toss) on the other coin. This correlation cannot be explained classically, because no classical correlation model can violate the Bell-type inequality (30). We have entanglement!

Discussions
This paper grew out of our desire to understand QT, in the sense of giving it a meaning clear to us. We have been favoured in this by the fact that we have quite a strong background on the foundations of probability, and QT, mathematically, can be regarded as a generalised theory of probability. But, given this, why is probability generalised in such a way, and what does it mean? We believe that the present paper, without aiming at reconstructing QT, provides a new way to explain the differences between classical and quantum probability: the algorithmic intractability of classical probability theory contrasted with the polynomial-time complexity of QT.
We have obtained this result in a setting that is more general than QT itself. Our 'weirdness theorem' establishes that the weirdness of QT is not exclusive to QT: it appears in any probabilistic theory that is (i) logically consistent and (ii) computationally bounded. 21 QT is just a special case, in the same way as our theory of Bernstein polynomials is another special case.
Yet, our result does speak in particular of QT. And hence it is interesting to know, for one thing, that QT is logically consistent, in the sense that it is a mathematical theory that cannot be proven inconsistent from the inside, by Alice. But it is actually inconsistent from the 'outside', i.e., from the point of view of our external observer Isaac, who has unbounded computational capabilities and who, in other words, identifies rationality with the logical consistency of classical probability theory. This is the essence of the clash between classical and quantum physics. It also explains why QT is so peculiarly hard for us to grasp: to classical eyes there is a degree of incoherence in it, and we tend to be able to actually understand only logically consistent theories or ideas.
We believe that such a degree of incoherence is also the reason why we should abandon our attempt to reconcile traditional physics with quantum theory. In our narrative, such an abandonment is embodied by the metaphor of a computer that 'runs the universe'. This is not a new idea at all [86]. However, it is new in the sense that the computer has limits due to the algorithmic nature of its tasks; and this is the reason for the weirdness of QT. Stated differently, what follows from this work, in our view, is that there is room for the idea of a more fundamental reality than classical physics, a reality that is just computational. It is by detaching computation from classical physics, in such a way, that we can finally have a solid grip on the meaning of QT and eventually be able to identify the specific features of our world that ground its use.
In order to hold onto some purely physical intuition, instead, one might want to consider for instance the many-worlds interpretation of QT [13], as many physicists do nowadays. It is certainly a fascinating view of QT, of which we feel the appeal. However, we perceive also the discomfort of having to embrace an interpretation that appears to require an incommensurably huge, and possibly infinite, amount of resources in order to have a universe that branches continuously in multiple copies of itself. Our own algorithmically bounded theory is much more parsimonious.
It tells us that we can implement a quantum world in polynomial time, by definition, and such a world would obey the usual axioms of QT: Bob might then just as well believe that he is living in one of many worlds, but he would simply be wrong. So should we, as entities of our universe, really go as far as postulating the existence of many worlds in the presence of such a more parsimonious alternative? Is there not an Occam's razor issue at stake here?
Of course, one could still criticise our appeal to a more fundamental algorithmic reality on the basis of our postulating the existence of a computer that executes the universe. We have been careful to refer to this as a metaphor, however: it need not be a computer that someone has built, and in particular there is no need for a programmer. It can simply be another level of reality, which can be interpreted as a computer;22 in a sense, our picture only suggests that there can be more levels of reality, one nested into the other.
One might also wonder why we humans perceive the clash between quantum and classical physics, given that we, as Alice, are subjects within the quantum theory: in our narrative the inconsistencies of QT are observed from Isaac's point of view, externally to the theory. The explanation that we give ourselves on this point is that we are used to the illusion of living in a classical universe. This is just in our minds, however, as we cannot make any physical experiment that reveals an actual inconsistency in our wonderland. And yet, we believe that this illusion can be explained within our framework: classical rationality emerges from algorithmic rationality when we consider the joint state of a system of many identical particles. We plan to address this issue in future work.
Finally, we think that the foundation of generalised probability theory via algorithmic rationality provided in this paper could be useful outside the context of QT, for instance in decision theory. We also plan to address this research direction in future work.

22 Reality is, for instance, interpreted as a computer in a recent conjecture that the holographic universe could act as a quantum error-correcting code [87]; in a sense, our view is similar in spirit.

Appendix: Additional Discussion on QT in Relation to Other Notions
In the present section we shall discuss a few main questions that our view of QT appears to raise.

The Class of Hermitian Sum-of-Squares
The class Σ≥ of nonnegative gambles, defined in Sect. 4, is the closed convex cone of all Hermitian sum-of-squares in L_R = {(⊗_{j=1}^m x_j)† G (⊗_{j=1}^m x_j) | G ∈ H^{n×n}}, that is, of all gambles g(x_1, …, x_m) ∈ L_R for which G is PSD. In particular this means that Alice can efficiently determine whether a gamble belongs to Σ≥ or not. But is this class the only closed convex cone of nonnegative polynomials in L_R for which the membership problem can be solved efficiently (in polynomial time)? It turns out that the answer is negative (see for instance [88, 89]): in addition to Hermitian sum-of-squares (the one that Nature has chosen for QT), one could also consider real sum-of-squares in L_R, that is, polynomials of the form (⊗_{j=1}^m x_j)† G (⊗_{j=1}^m x_j) that are sums of squares of polynomials of the real and imaginary parts of x_j.
A separating example is the polynomial in (17), which is not a Hermitian sum-of-squares but is a real sum-of-squares, as can be seen from (18). This polynomial was used in our example because it can be constructed by inspection and its nonnegativity follows immediately from (18). Clearly, there exist nonnegative polynomials in L_R that are neither Hermitian sum-of-squares nor real sum-of-squares.
Why has Nature chosen Hermitian sum-of-squares? This is an open question that we will investigate in future work. A possible explanation may reside in the different size of the corresponding optimisation problems [89]. Another possible explanation is that the class of Hermitian sum-of-squares is always strictly included in the class of real sum-of-squares polynomials. Therefore the former may be the smallest class of gambles that allows one to efficiently determine whether a gamble is nonnegative according to it, but that is still expressive enough [90,Proposition 6].

On the Use of Tensor Product
In Sect. 4 we saw that the possibility space of composite systems of m particles, each one with n_j degrees of freedom, is given by Ω = ∏_{j=1}^m ℂ^{n_j}. We saw that gambles on such a space are bounded real functions of the form (⊗_{j=1}^m x_j)† G (⊗_{j=1}^m x_j), where ⊗ denotes the tensor product between vectors, understood as column matrices.
In what follows, we justify the use of the tensor product, and more specifically the type of gambles on the possibility space of composite systems, as a consequence of the way a multivariate theory of probability is usually formulated.
As a start, let us consider the case of classical probability. In CPT, under the reasonable assumption that, since agents are expressing beliefs about physical systems, the underlying notion of dependence/independence should be compatible with that of a generative model,23 structural judgements of independence/dependence are expressed via products: given factorised gambles g(x_1, …, x_m) = ∏_{j=1}^m g_j(x_j), independence amounts to E[∏_{j=1}^m g_j(x_j)] = ∏_{j=1}^m E[g_j(x_j)], where E denotes the expectation operator. With this in mind, let us go back to our setting. Marginal gambles are of type g_j(x_j) = x_j† G_j x_j. This means that structural judgements are performed by considering factorised gambles of the form ∏_{j=1}^m x_j† G_j x_j. It is then not difficult to verify that ∏_{j=1}^m x_j† G_j x_j = (⊗_{j=1}^m x_j)† (⊗_{j=1}^m G_j) (⊗_{j=1}^m x_j). By closing the set of factorised gambles under the operations of addition and scalar multiplication, one finally gets a vector space that coincides with the collection of all gambles of the form (⊗_{j=1}^m x_j)† G (⊗_{j=1}^m x_j). Hence, structural judgements of independence/dependence are stated by considering the desirability of gambles belonging to L_R.
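The factorisation identity ∏_j x_j†G_jx_j = (⊗_j x_j)†(⊗_j G_j)(⊗_j x_j) is easy to verify numerically (a sketch for m = 2 with arbitrary Hermitian 2×2 matrices):

```python
def kron_vec(x, y):
    """Tensor (Kronecker) product of two vectors as column matrices."""
    return [xi * yj for xi in x for yj in y]

def kron_mat(A, B):
    """Kronecker product of two square matrices."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def quad(x, G):
    """The quadratic form x† G x."""
    return sum(x[i].conjugate() * G[i][j] * x[j]
               for i in range(len(x)) for j in range(len(x)))

x1, x2 = [0.6, 0.8j], [1 / 2 ** 0.5, 1 / 2 ** 0.5]
G1 = [[2.0, 1j], [-1j, 3.0]]       # Hermitian
G2 = [[1.0, 0.5], [0.5, 1.0]]      # Hermitian
lhs = quad(x1, G1) * quad(x2, G2)
rhs = quad(kron_vec(x1, x2), kron_mat(G1, G2))
print(abs(lhs - rhs) < 1e-9)  # True: the product factorises through the tensor product
```

This is the sense in which the tensor product is a derived notion here: it is forced by requiring that factorised gambles multiply.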

Hidden Variable Models
In [91], Kochen and Specker gave a hidden variable model for QT.24 Their idea amounts to introducing a 'hidden variable' for each observable H producing stochasticity in the outcomes of measurements of H. The totality of all such hidden variables is then the phase space variable of the model.
In the case of a single n-dimensional quantum system, our model based on the phase space Ω can also be understood as a hidden variable model, and essentially coincides with the one introduced by Holevo in [68, Sect. 1.7]. The point is that for a single quantum system, Σ≥ = L_R^≥ and Σ< = L_R^<, meaning that Alice will never accept negative gambles. Hence, in such a case, the density matrix ρ = L(xx†) can be interpreted as a truncated moment matrix and is therefore compatible with (can be extended to) a probability distribution over Ω. Now, as there may be more than one probability compatible with it,25 such a model does not fulfil one of the key requirements imposed in many existing 'no-go' theorems, namely the uniqueness of the associated classical description. However, despite the fact that a hidden variable theory would necessarily treat as distinct two probabilities that define the same density matrix, they are underdetermined by the observations and therefore they can be regarded as corresponding to two unidentifiable, or indistinguishable, classical models. In fact, since any real-valued observable is described by a Hermitian operator and the expectation of a Hermitian operator w.r.t. a given density matrix (truncated moment matrix) is unique (Tr(Gρ)), ρ is sufficient to provide an adequate characterisation of these two probabilities. To sum up, if we accept the view that, because of the underdetermination of classical models by observations, the requirement of a one-to-one correspondence between classical and quantum states is not grounded and hence can be relaxed, a hidden-variable model may simply be defined as the equivalence class of all probabilities associated to a given truncated moment matrix.26 What about the case when there are m > 1 particles?
In this case, Theorem 1 applies, and it can therefore be read as a no-go theorem pointing to two ways to extend the classical model: either by allowing negative probabilities or by redefining the notion of evaluation functionals. Moreover, the result elucidates the role of the tensor product. In order to see this, let us consider two quantum systems A and B, with corresponding Hilbert spaces H_A and H_B. By duality, the density matrix (state) of the joint system lives in the tensor product space H_A ⊗ H_B: indeed, L((⊗²_{j=1} x_j)(⊗²_{j=1} x_j)†) belongs to H_A ⊗ H_B. However, as mentioned before, when (16) holds, we may justify entanglement by hypothesising the existence of non-classical evaluation functionals or, equivalently, of a larger possibility space (Theorem 1). This is clearly discussed in [68, Supplement 3.4]: "Since the set of pure states of the composite system is larger than the Cartesian product Ω_A × Ω_B, the phase space of the classical description of the composite system will be larger than the product of phase spaces for the components: Ω_A × Ω_B ⊊ Ω. Therefore this classical description is not a correspondence between the categories of classical and quantum systems preserving the operation of forming composite systems. Moreover, it appears that there is no way to establish such a correspondence. In any classical description of a composite quantum system the variables corresponding to observables of the components are necessarily entangled in a way unusual for classical subsystems." To sum up, Ω_A × Ω_B ⊊ Ω may be understood as another manifestation of algorithmic rationality.

Sum-of-Squares Optimisation
The theory of moments (and its dual theory of positive polynomials) is used to develop efficient numerical schemes for polynomial optimisation, i.e., global optimisation problems with polynomial functions. Such problems arise in the analysis and control of nonlinear dynamical systems, and also in other areas such as combinatorial optimisation. The scheme consists of a hierarchy of semidefinite programs (SDP) of increasing size, which define tighter and tighter relaxations of the original problem. Under some assumptions, it can be shown that the associated sequence of optimal values converges to the global minimum; see for instance [1, 95]. Note that every polynomial in Σ≥ is a (Hermitian) sum-of-squares because, writing G = HH†, it can be rewritten as g = (⊗_{j=1}^m x_j)† HH† (⊗_{j=1}^m x_j) = ∑_i |(H† ⊗_{j=1}^m x_j)_i|². In QT, SDP has been used to numerically prove that certain states are entangled [96-104]. The works [97, 98] realised that the set of separable quantum states can be approximated by sum-of-squares hierarchies. This leads to the SDP hierarchy of Doherty-Parrilo-Spedalieri, which is extensively employed in quantum information.
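The rewriting G = HH† can be made concrete with a Cholesky factorisation (a real-symmetric sketch, not the paper's SDP machinery): then x†Gx = ∑_k |(H†x)_k|², an explicit sum of squares:

```python
import math

def cholesky(G):
    """Lower-triangular H with G = H H^T, for G real symmetric positive definite."""
    n = len(G)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(H[i][k] * H[j][k] for k in range(j))
            H[i][j] = math.sqrt(G[i][i] - s) if i == j else (G[i][j] - s) / H[j][j]
    return H

G = [[2.0, -1.0], [-1.0, 2.0]]  # PSD, so x'Gx must be a sum of squares
H = cholesky(G)
x = [0.3, -0.7]
quadratic = sum(x[i] * G[i][j] * x[j] for i in range(2) for j in range(2))
sos = sum(sum(H[i][k] * x[i] for i in range(2)) ** 2 for k in range(2))
print(abs(quadratic - sos) < 1e-9)  # True: x'Gx equals the sum of squares (H^T x)_k^2
```

SDP solvers search for exactly such a factorisable G representing a given polynomial; the factorisation is the certificate of nonnegativity.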
The present, purely foundational, work differs from these approaches by stating that the (microscopic) world is actually running on a 'computer' that solves SOS optimisation problems.

Sections 2 and 4
The proofs of the results in Sect. 2.2 were derived in [105]. Hereafter, we extend those results to prove Theorem 1.
We define the dual of a subset K of L as K• := {L ∈ L* | L(g) ≥ 0, ∀g ∈ K}. By an argument analogous to that of [106, Th. 4], it is easy to check that the map K ↦ P := K• ∩ S establishes a bijection between coherent sets of desirable gambles and non-empty closed convex sets of states.
It is also easy to verify the following characterisation of the dual of a closed convex cone which is not coherent. Essentially, Proposition 5 tells us that, from the dual point of view, non-degenerate closed convex cones of gambles that are not coherent are characterised by quasi-probabilities (charges).

Proof of Theorem 1 Assume that L_R includes all positive constant gambles and that the cone of A-nonnegative gambles is closed (in L_R). Let C ⊆ L_R be an A-coherent set of desirable gambles. We have to verify that the following statements are equivalent:
1. C includes a negative gamble that is not in <;
2. posi(L≥ ∪ G) is incoherent, and thus P is empty;
3. C• is not (the restriction to L_R of) a closed convex set of mixtures of classical evaluation functionals;
4. The extension of C• to the space M of all charges includes only quasi-probabilities.
First of all, notice that the restriction to L_R of the set of all normalised charges that correspond to bounded linear functionals coincides with C•. Given this, the equivalence between (3) and (4) is immediate, whereas the equivalence between (2) and (4) is given by Proposition 5. It remains to verify the equivalence between (1) and (3). The direction from left to right is obvious; the converse follows from the fact that g ≤ f for every g ∈ C and f ∈ posi(L≥ ∪ C) ⧵ C. ◻

Section 4: Duality in QT
Recall from Sect. 3 that the set {L ∈ L*_R | L(g) ≥ 0 ∀g ∈ C, L(1_R) = 1} is the dual of C ⊂ L_R.
The monomials ⊗_{j=1}^m x_j form a basis of the space L_R. Define the Hermitian matrix of scalars Z, and let {z_ij} ∈ ℂ^d, with d = n(n+1)/2 and n = ∏_{j=1}^m n_j, be the vector of variables obtained by taking the elements of the upper triangular part of Z. Given any gamble g, we can therefore rewrite L(g) as a function of the vector {z_ij} ∈ ℂ^d. This means that the dual space L*_R is isomorphic to ℂ^d, and we can then define the dual maps (⋅)• between L_R and ℂ^d as follows.
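The coordinate map between the matrix Z and the vector {z_ij} can be sketched as follows. The dimension n and the matrix are illustrative choices; the real symmetric case is used for simplicity (in the complex Hermitian case the upper triangle likewise has n(n+1)/2 entries).

```python
import numpy as np

n = 4
d = n * (n + 1) // 2                  # d = n(n+1)/2 independent entries

Z = np.arange(float(n * n)).reshape(n, n)
Z = (Z + Z.T) / 2                     # symmetrise: Z plays the Hermitian role

iu = np.triu_indices(n)               # indices of the upper triangular part
z = Z[iu]                             # the coordinate vector {z_ij}

# The map is invertible: Z is recovered from z alone, so the correspondence
# between symmetric matrices and upper-triangle vectors is a bijection.
Z_back = np.zeros((n, n))
Z_back[iu] = z
Z_back = Z_back + Z_back.T - np.diag(np.diag(Z_back))

print(len(z) == d and np.allclose(Z, Z_back))  # True
```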
Definition 3 Let C be a closed convex cone in L_R. Its dual cone is defined accordingly, where L(g) is completely determined by {z_ij} via definition (32).
In discussing properties of the dual space, we need the following well-known result from linear algebra: Lemma 1 For any M ∈ H^{d×d} and v ∈ ℂ^d, it holds that v†Mv = Tr(M vv†). By Lemma 1 and the definitions of g and Z, we obtain the following result.
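The quadratic-form/trace identity v†Mv = Tr(M vv†) (the standard linear-algebra fact that Lemma 1 appeals to) can be verified numerically; the dimension and random data below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
M = (M + M.conj().T) / 2                        # make M Hermitian
v = rng.standard_normal(d) + 1j * rng.standard_normal(d)

lhs = (v.conj() @ M @ v).real                   # v† M v (real, since M is Hermitian)
rhs = np.trace(M @ np.outer(v, v.conj())).real  # Tr(M v v†)

print(np.isclose(lhs, rhs))  # True
```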
It is then possible to verify the following: Proposition 7 Let C be an A-coherent set of desirable gambles. Then the following holds. Proof By A-coherence, C includes the cone of A-nonnegative gambles, which is isomorphic to the closed convex cone of PSD matrices. By a standard result in linear algebra, see for instance [68, Lemma 1.6.3], this implies that Z ≥ 0, i.e., Z must be a PSD matrix. ◻ In what follows, we verify that the dual C• is completely characterised by a closed convex set of states. But before doing that, we have to clarify what a state is in this context.
In an algorithmic TDG, postulate A0 is replaced with postulate B0. Hence, to define what a state is, one can no longer refer to nonnegative gambles, but only to gambles that are A-nonnegative. This means that states are linear operators that: (1) assign nonnegative real numbers to A-nonnegative gambles, and (2) preserve the unit gamble. In the context of Hermitian gambles, the unit gamble is the identity matrix I; therefore, we require L(I) = 1. We denote the resulting set of states by S_B. By reasoning exactly as for Theorem 4, we then have the following result.

Theorem 3 The map C ↦ C• ∩ S_B is a bijection between A-coherent sets of desirable gambles in L_R and closed convex subsets of S_B.
We can therefore identify the dual of an A-coherent set of desirable gambles C with a closed convex set of states. Notice that, since matrices corresponding to states are density matrices, (39) is in fact equivalent to {ρ ∈ H^{n×n} : ρ ≥ 0, Tr(ρ) = 1, Tr(Gρ) ≥ 0, ∀G ∈ C}, meaning that we can identify the set S_B with the set of density matrices and denote its elements as usual with the symbol ρ.
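The dual-set characterisation above reduces membership to finitely many trace inequalities when C is generated by finitely many gambles. The following sketch checks the two conditions for illustrative generators and an illustrative density matrix, none of which come from the paper.

```python
import numpy as np

# Hypothetical generators of an A-coherent cone C (illustrative only).
gambles = [np.array([[1.0, 0.0], [0.0, 0.0]]),
           np.array([[0.5, 0.5], [0.5, 0.5]])]

rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])          # Hermitian, unit trace

# (1) rho is a density matrix: Hermitian, PSD, Tr(rho) = 1.
is_density = (np.allclose(rho, rho.conj().T)
              and np.all(np.linalg.eigvalsh(rho) >= -1e-12)
              and np.isclose(np.trace(rho), 1.0))

# (2) rho lies in the dual set: Tr(G rho) >= 0 for every generator G of C.
in_dual = all(np.trace(G @ rho).real >= -1e-12 for G in gambles)

print(is_density and in_dual)  # True
```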
Funding Open Access funding provided by the IReL Consortium.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.