Quantum theory in finite dimension cannot explain every general process with finite memory

Arguably, the largest class of stochastic processes generated by means of a finite memory consists of those that are sequences of observations produced by sequential measurements in a suitable generalized probabilistic theory (GPT). These are constructed from a finite-dimensional memory evolving under a set of possible linear maps, with probabilities of outcomes determined by linear functions of the memory state. Examples of such models are given by classical hidden Markov processes, where the memory state is a probability distribution that evolves at each step according to a non-negative matrix, and hidden quantum Markov processes, where the memory state is a finite-dimensional quantum state that evolves at each step according to a completely positive map. Here we show that processes admitting a finite-dimensional explanation need not be explainable in terms of either classical probability or quantum mechanics. To wit, we exhibit families of processes that have a finite-dimensional explanation, defined manifestly by the dynamics of an explicitly given GPT, but that do not admit a quantum, and therefore not even a classical, explanation in finite dimension. Furthermore, we present a family of quantum processes on qubits and qutrits that do not admit a classical finite-dimensional realization, which includes examples introduced earlier by Fox, Rubin, Dharmadikari and Nadkarni as functions of infinite-dimensional Markov chains, and we lower bound the size of the memory of a classical model realizing a noisy version of the qubit processes.


Introduction
Modeling a hidden cause mechanism for the probability distribution of a time series of observations is a ubiquitous task, from fundamental science experiments to data analysis. Considering classical hidden dynamics gives rise to hidden Markov models (HMM) [1,2], which have key applications in fields where time series arise [3], among them speech recognition [4] and genomics [5], where they remain an important part of the data analysis toolbox [6], while new uses are emerging, such as in ecology [7]. On the other hand, repeated measurements on a quantum system also define probabilities of sequences of outcomes with a hidden mechanism, in this case a quantum one. Landmark experiments can be modeled as such [8]. Infinite sequences of identical repeated measurements define the class of hidden quantum Markov models (HQMM), a special case of C*-finitely correlated states when the state is classical, i.e. diagonal in a given product basis [9]. HQMMs can serve not only as tools for the analysis of quantum experiments and for modeling quantum technologies, but also as tools for data analysis applications, implemented in a classical simulator or on actual controllable quantum systems (be it NISQ devices or universal quantum processors).
Removing the restriction to classical or quantum dynamics, and keeping only the linearity of the hidden dynamics and the nonnegativity of the function used to compute the probabilities of sequences, enlarges the class of possible models and ensuing processes to so-called quasi-realizations [2]. These generalized models are known under several different names in different communities, e.g. observable operator models (OOM) [10] or weighted finite automata [11], or indeed (classical) finitely correlated states [9]. Considering this extended class greatly simplifies the inference of the hidden mechanism from the probabilities of the sequences, as a minimal description can be obtained by simple linear algebra, while this is not the case for a classical or quantum one. Moreover, from a physical point of view, this extended space of models can be seen as the class of models describing repeated measurements on a system in general probabilistic theories (GPTs) [12], including alternatives or extensions of quantum theory. The immediate question presenting itself is whether the inclusions between the sets of HMM, HQMM and general models are strict. For these sets, and for any other subclass of models that can be conceived, this is an interesting question from a fundamental point of view, since one could say that the possibility of generating every stochastic process with finite memory is a desirable property of a general theory of nature; it also has practical consequences for applications, since it can exhibit strengths or limitations of specific classes. Already in [13,14] it was shown that there exist processes admitting general models which, however, are not representable classically by any HMM. In [15] it was shown that there exist processes given by HQMM which cannot be represented by a classical HMM. Perhaps then quantum mechanics is sufficiently powerful to realize any discrete process admitting a finite-memory general model by means of finite-dimensional quantum systems [15]?
The main contribution of the present paper is a negative answer to this question, via the explicit construction of processes admitting a general linear model, but for which the underlying possible GPT is so tightly constrained that we can exclude the possibility of a realization by HQMM by inspection. Our result also answers a question raised in [9, Sec. 7.1]. The argument is geometric, as pioneered in [15] (there for separating HMM and HQMM): our examples are such that the GPTs of their quasi-realizations have unique mutually dual convex cones of effects and states, respectively; in other words, there is only one possible operational probabilistic theory that can describe the observable statistics. As HQMM give rise to semi-definite representable (SDR) cones, i.e. projections of sections of the positive-semidefinite cone of matrices, we can exclude a quantum realization by forcing our cone to be not semi-algebraic. On the other hand, to better appreciate the power of HQMM and to motivate the question of establishing a separation from general theories, we show that the non-classical examples in [13,14] are representable by HQMMs, and thus are not sufficient to show the new separation. This is remarkable since these examples were naturally formulated as functions of infinite-alphabet classical Markov models, showing that small quantum systems can be expressive enough to represent rich stochastic processes that are not inherently quantum, and supporting the possibility that quantum systems can be useful for modeling real-world data streams. Furthermore, by simplifying the original examples, we remark that already a class of binary sequential measurements on a qubit cannot be reproduced by an HMM. This fact was already noticed in [16], where an HQMM for the so-called probability clock of Jaeger [10,17] was found, which itself is a simplified version of the older example in [13,14].
Before going into a mathematically precise description of our framework and results, let us discuss further related work. The notion of quantum hidden Markov models seems to have appeared in [18]. In [19] a process was constructed which can be represented on a qubit but not on a binary classical space. Several papers analyzed how, for a quantum process representing a hidden Markov model, the entropy of the average stationary state can be less than in the classical case [20-24], and how to construct a quantum representation from an HMM or from the outcome probabilities [25,26]. In particular, an example of a class of classical processes which require infinite memory in a so-called unifilar HMM, but can be implemented on a qubit, was shown in [22]. A gap in the memory required by an ε-machine to simulate sequential measurements in contextuality experiments was also observed [27]. Note however that it is well known that there exist processes generated by a finite HMM for which the ε-machine and any other unifilar HMM necessarily have infinite memory [28,29]. The non-asymptotic behaviour of the sample mean of an HQMM has been studied in [30], giving bounds for the tail probabilities and deriving a central-limit-theorem type result. Algorithms to find an HQMM modelling a sequence of observations have been presented in [16,31]. Note that HQMM can equivalently be obtained from locally measuring C*-finitely correlated states [9]; this implies that our work also shows the existence of finitely correlated states which are not C*-finitely correlated, answering an open question of [9], which has received attention but no conclusive answer. For example, [32] shows that a similar separation exists for sequences of finite-size states in the non-translation-invariant setting, while [33] shows that a separation exists for sequences of periodic finite-size states. Moreover, several works have investigated the use and advantages of tensor networks for probabilistic modeling, e.g. [34-37].
The cones used to show the separation are the power cone and the exponential cone [38], the power cone being the more general one, since the exponential cone can be obtained as a limiting case of the power cone plus a linear transformation. They have no clear physical interpretation as general probabilistic theories (yet), but appear as models for several practical optimization problems, with applications to chemical process control [39], circuit design [40], or electric vehicle charging [41], among many others. Both the power cone and the exponential cone have self-concordant barriers [38,42,43], which make them suitable for conic optimization methods like interior point algorithms, and although they are non-symmetric cones the implementation of the algorithms is feasible [44]. The exponential cone can also be used to model relative entropy programs, which include geometric programming [45] and second-order conic programming [46]. Extensions to quantum relative entropy programs include tasks like quantum channel capacity approximation [47] or quantum state tomography [48].
The paper is organized as follows. In the results section we start by reviewing key properties of finite-dimensional linear models for stochastic processes, and of their classical and quantum realizations. Then we show that the processes in [13,14], which do not admit a classical realization, do in fact admit a quantum realization. Moreover, we quantitatively evaluate the robustness of this statement by considering perturbations of the quantum realizations of these processes by depolarizing noise. We then present our main result: two families of processes with a three-dimensional quasi-realization, which we show not to admit any finite-dimensional quantum realization. Finally, in the discussion section, we present generalizations of the convex state spaces of the GPTs underlying the models, which also extend quantum theory.

Stationary stochastic processes and quasi-realizations
We start by reviewing the formalism for general linear models with memory of stochastic processes, or quasi-realizations [2]. Let $M$ be an alphabet with $|M| = m$ symbols and let $M^\ell$ be the set of words of length $\ell$. This includes $\ell = 0$, in which case $M^0$ consists only of the empty word $\epsilon$. By $M^* = \bigcup_{\ell \geq 0} M^\ell$ we denote the set of all finite words, which forms a semigroup under concatenation, with neutral element $\epsilon$. We focus on stationary processes, meaning that the probability $p(u)$ of observing a word $u$ starting at time $t$ does not depend on $t$. For the empty word, we have $p(\epsilon) = 1$. The largest class of hidden cause models we consider is the class of quasi-realizations, defined as follows.
Definition 1. A quasi-realization of a stationary stochastic process $p$ is a quadruple $(V, \pi, D, \tau)$, where $V$ is a real vector space, $\tau \in V$, $\pi \in V^*$, and $D : M^* \to L(V)$, mapping a word $u \in M^*$ to a linear map $D^{(u)}$ of $V$, is a semigroup homomorphism, i.e.

$$D^{(\epsilon)} = \mathbb{1}, \qquad D^{(uv)} = D^{(u)} D^{(v)} \quad \forall u, v \in M^*. \tag{2}$$

In addition, the following fixed-point relations hold,

$$\pi \sum_{u \in M} D^{(u)} = \pi, \qquad \sum_{u \in M} D^{(u)} \tau = \tau, \tag{3}$$

and

$$p(u) = \pi D^{(u)} \tau \quad \forall u \in M^*. \tag{4}$$

The right hand side of Eq. (4) can be visually represented as in Fig. 1. Quasi-realizations that generate the same stochastic process are said to be equivalent. Quasi-realizations of a process with minimal dimension of $V$ are called regular, and they are related to each other by a similarity transformation, i.e. for two equivalent regular realizations $(V, \pi, D, \tau)$, $(V', \pi', D', \tau')$, $V$ is linearly isomorphic to $V'$ through an invertible linear map $T$, with $\pi' = \pi T^{-1}$, $\tau' = T\tau$, $D'^{(u)} = T D^{(u)} T^{-1}$. Note that due to the semigroup law Eq. (2), $D$ is really given entirely by the maps $D^{(u)}$, $u \in M$, making a quasi-realization a finite object in linear algebraic terms, as it can be given by a finite list of real numbers.
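To make the definition concrete, the following minimal sketch evaluates word probabilities as a row vector pushed through a product of matrices and contracted with a column vector. The two-state model, and the names `word_probability` and `total_probability`, are hypothetical illustrations chosen here and do not appear in the paper; exact rational arithmetic is used so that the fixed-point relations can be checked without rounding.

```python
from fractions import Fraction as F
from itertools import product


def word_probability(pi, D, tau, word):
    """Return pi D^(u_1) ... D^(u_l) tau, with pi acting as a row vector."""
    row = list(pi)
    for letter in word:
        M = D[letter]
        # Row vector times matrix: the memory is updated from the right.
        row = [sum(row[i] * M[i][j] for i in range(len(row)))
               for j in range(len(row))]
    return sum(r * t for r, t in zip(row, tau))


# Hypothetical toy model (an assumption for illustration): two hidden states.
# D[u][i][j] = probability of emitting u while moving from state i to state j.
D = {
    "a": [[F(1, 2), F(0)], [F(1, 4), F(0)]],
    "b": [[F(0), F(1, 2)], [F(0), F(3, 4)]],
}
pi = [F(1, 3), F(2, 3)]  # left fixed point of D^(a) + D^(b)
tau = [F(1), F(1)]       # right fixed point of D^(a) + D^(b)


def total_probability(length):
    """Sum of p(u) over all words of a given length (should equal 1)."""
    return sum(word_probability(pi, D, tau, w)
               for w in product("ab", repeat=length))
```

In this toy case both matrices are non-negative, so the model is also a classical hidden Markov model; a general quasi-realization only requires the resulting probabilities to be non-negative.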
The linear structure of quasi-realizations alone is not sufficient to guarantee the positivity of the probabilities. However, any quasi-realization of a stochastic process can be understood as arising from the dynamics of a (possibly exotic) general probabilistic theory. In fact, it is immediate to show that:

Proposition 2. A quasi-realization defines a non-negative measure if and only if there is a convex cone $C \subset V$ such that $\tau \in C$, $D^{(u)}(C) \subseteq C$ for all $u \in M$, and $\pi \in C^*$.

Figure 1. A depiction of a general stationary process with finite memory. The probability of a sequence $u_{-k}, \ldots, u_0, \ldots, u_l$ can be computed as the inner product between a left stationary state $\pi$, evolved through a sequence of linear maps $D_{u_{-k}}, \ldots, D_{u_l}$ acting from the right, and a right stationary state $\tau$. The hidden vector space in which $\pi D_{u_{-k}} \cdots D_{u_l}$ lives represents the memory of the process. For quantum hidden Markov models, $\pi$ is a state and $\tau$ is the trace functional in the dual of the state space, while the $D_u$ are CP maps such that $\sum_{u \in M} D_u$ is unital.
Note that, without loss of generality, the cone in the last proposition can be chosen to be closed: otherwise simply go to the closure of $C$, $\overline{C} = C^{**}$, which is stable under the maps $D^{(u)}$ and has the same dual $C^*$. In fact, the cone $C$ can be viewed as the cone of effects of a general probabilistic theory (GPT) with $\tau$ being the unit [12,49,50], and $C^*$ as the cone of states. A pair of cones, one of effects and one of states contained in its dual, is what defines a general probabilistic theory; the maps $D^{(u)}$ stabilize the cone $C$, and their adjoints stabilize $C^*$, therefore they can be considered as physical maps of the GPT. A quasi-realization does not immediately identify a unique stable cone $C$ in general. However, we can put inner and outer bounds on it from the cones generated by the quasi-realization dynamics itself: any stable cone satisfies $C_{\min} \subseteq C \subseteq C_{\max}$, where $C_{\min}$ is the convex cone generated by the vectors $D^{(u)}\tau$, $u \in M^*$, and $C_{\max}$ is the dual of the convex cone generated by the functionals $\pi D^{(u)}$, $u \in M^*$.

An important result in the theory of quasi-realizations is that a stochastic process has a finite-dimensional quasi-realization if and only if the rank of a suitable Hankel-type matrix constructed from the probabilities of the finite words is finite. This matrix $H$ is an infinite matrix with entries indexed by pairs of words, such that $H_{u,v} = p(uv)$. Writing the columns of $H$ as $h_v = H_{\cdot,v}$, a potentially infinite-dimensional quasi-realization in the column space $V = \mathrm{span}\{h_v\}$ is obtained by choosing $\pi = (1, 0, 0, \ldots) + \ker V$, $\tau = h_\epsilon$ and $D^{(u)} h_v = h_{uv}$. This is a bona fide finite-dimensional quasi-realization if and only if the rank of $H$ is finite. We will focus on such processes and denote their set as $G$, with the idea in mind that they represent a privileged class of candidate processes, since they can in principle be reconstructed from a finite number of quantities, obtainable from observations of the process if enough data is available.
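The Hankel criterion can be illustrated on a finite truncation of $H$. The sketch below builds the block of $H$ indexed by all words up to a given length and computes its exact rank by Gaussian elimination over the rationals. The two-state toy process and the helper names (`p`, `words_up_to`, `rank`, `hankel_rank`) are assumptions made for this example only; a truncation can only give a lower bound on the rank of the full matrix in general.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical two-state toy process (illustrative assumption, not from the paper).
D = {
    "a": [[F(1, 2), F(0)], [F(1, 4), F(0)]],
    "b": [[F(0), F(1, 2)], [F(0), F(3, 4)]],
}
pi = [F(1, 3), F(2, 3)]
tau = [F(1), F(1)]


def p(word):
    """Word probability p(u) = pi D^(u) tau."""
    row = list(pi)
    for letter in word:
        M = D[letter]
        row = [sum(row[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(r * t for r, t in zip(row, tau))


def words_up_to(max_len, alphabet="ab"):
    """All words of length 0..max_len, shortest first."""
    return ["".join(w) for n in range(max_len + 1)
            for w in product(alphabet, repeat=n)]


def rank(matrix):
    """Exact rank by Gaussian elimination (entries are Fractions)."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def hankel_rank(max_len):
    """Rank of the truncated Hankel matrix H[u][v] = p(uv)."""
    ws = words_up_to(max_len)
    return rank([[p(u + v) for v in ws] for u in ws])
```

For this toy process the truncated rank stabilizes at the dimension of the hidden memory, in line with the statement that a finite Hankel rank corresponds to a finite-dimensional quasi-realization.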

Classical and quantum processes
A subset $P$ of $G$ consists of those processes admitting a classical probability interpretation in finite dimension, called a positive realization, also known as a hidden Markov model. In this case the process $p$ admits a quasi-realization $(\mathbb{R}^d, \pi, D, \mathbf{1})$, such that the $D^{(u)}$ are non-negative matrices and $D = \sum_{u \in M} D^{(u)}$ is (right) stochastic, $\pi \in (\mathbb{R}^d)^*$ is a stationary distribution of $D$, and $\mathbf{1} = (1, 1, \ldots, 1) \in \mathbb{R}^d$. A larger subset is given by the processes $CP$ which admit a finite-dimensional quantum explanation, that is a completely positive realization: in this case the quasi-realization can be chosen to be $(B(H)^{\mathrm{sa}}, \rho, D, \mathbb{1})$, where $B(H)$ is the space of bounded operators on some finite-dimensional Hilbert space $H$ and $B(H)^{\mathrm{sa}}$ the space of selfadjoint operators, $\rho$ is a positive semidefinite density operator in $B(H)$, the $D^{(u)}$ are completely positive maps on $B(H)$ such that $D = \sum_{u \in M} D^{(u)}$ is unital, and $\mathbb{1}$ is the identity of $B(H)$. Positive and completely positive realizations are guaranteed to give positive probabilities.
A natural question is then to ask if the inclusions $P \subseteq CP \subseteq G$ are strict. This question makes sense only if one restricts to finite memory systems, since from the infinite-dimensional quasi-realization we presented in the last paragraph, an HMM with countably infinite classical memory can be constructed [2,51]. As already mentioned, $P \subsetneq G$ was shown as an early result by [13,14], while $P \subsetneq CP$ was shown first in [15]. We are going to prove here that even $CP \subsetneq G$ holds. In order to show these separations, it is useful to establish necessary and sufficient conditions for a process to have a positive or completely positive realization.
For the classical case, these were provided by [52]: given a quasi-realization $(V, \pi, D, \tau)$, an equivalent positive realization exists if and only if there is a convex pointed polyhedral cone $C \subset V$ such that $\tau \in C$, $D^{(v)}(C) \subseteq C$, $\pi \in C^*$. For the quantum case, an analogous characterization was given in [15], highlighting the role of semidefinite representable cones, defined as follows.
Definition 4. Let $V$ be a finite-dimensional real vector space. A semidefinite representable (SDR) cone is a set $C \subset V$ such that there exists a subspace $W \subseteq B(\mathbb{C}^d)^{\mathrm{sa}}$ for some $d$ and a linear map $L : W \to V$ with $C = L(W^+)$, where $W^+ = W \cap S^+$, $S^+$ being the cone of positive-semidefinite matrices.
For our purposes we will use that a necessary condition for a process to have a completely positive realization is that any regular representation of the same process must admit an SDR cone [15].Note that an SDR cone is semi-algebraic, that is, it can be defined through a finite number of inequalities involving polynomials of the coordinates.
Since neither the characterization of classical processes nor that of quantum processes gives a prescription for how to find the stable polyhedral or SDR cone, respectively, they are not immediately usable to establish whether a given process has a positive or completely positive realization. However, they are powerful enough to exclude the existence of such realizations if one is able to rule out the existence of stable cones with the desired properties.

HMM vs HQMM
The processes presented in [13,14], which we refer to as Fox-Rubin-Dharmadikari-Nadkarni (FRDN) processes, were shown to be in $G$ by defining them explicitly as a function of Markov chains with infinite memory (non-negative integers as internal states), and then proving that the rank of the Hankel matrix $H$ is finite. As we have observed, this means that the processes can be explained with a finite-dimensional quasi-realization. In particular, the Markov chain is specified by its transition probabilities, which depend on parameters $\alpha \in \mathbb{R}$ and $0 < \lambda \leq 1/2$, and the function is defined as $f(0) = a$ and $f(x) = b$ if $x > 0$. The resulting processes do not have a finite-dimensional classical realization when $\pi$ and $\alpha$ are not commensurate. It was unknown whether the processes in [13,14] had a quantum realization or not, and since the example was formulated naturally as an infinite-dimensional classical model, it could have been sufficient to show the separation $CP \subsetneq G$. We show that this is not the case, since a quantum realization exists.
To obtain this result, we first derive an explicit quasi-realization of the model (which was not given previously), and then look for an equivalent quantum realization imitating its main features, in particular the eigenvalues of the maps. Thus, the FRDN processes cannot separate $CP$ from $G$.
Some remarks are in order:
• The non-existence of a positive realization was proven by showing that in any realization the map $D_b$ must have eigenvalues of maximum modulus with arguments that are non-commensurate with $\pi$, which is impossible for non-negative matrices by the Perron-Frobenius theorem [53].
• Theorem 5 defines bona fide HQMM even if $p$ and $\xi$ are not tuned to give exactly the FRDN models (only $r$ has to satisfy some constraints in order for $D_a^\dagger$ to be completely positive). The argument of the proof that there does not exist any finite-dimensional classical HMM implementing the process is unchanged, since the eigenvalues of the map $D_b$ do not change.
• The proof of the impossibility of a classical model for this family of quantum realizations differs somewhat from the argument provided for the family in [15], which defines processes that are naturally representable by a 2-qubit quantum system; there, the existence of a stable polyhedral cone was excluded directly by looking at the symmetry properties of the stable cones, which are incompatible with polyhedral cones. This geometric approach proves to be decisive for the separation between quantum and general theories, as we will show in the next section. There, in fact, looking at the spectra of the maps does not seem to help much.
When $\alpha$ is commensurate with $\pi$, say $\alpha/\pi = s/t$ with coprime integers $s$ and $t$, the FRDN models admit a positive (classical) realization, with minimal dimension $t$ [14]. In fact, when there are no eigenvalues with arguments incommensurate with $\pi$, the spectral argument cannot rule out classical realizations. However, the dimension of the minimal positive realization can be bounded from below, since the allowed region for eigenvalues of $n \times n$ matrices with non-negative elements is a subset of the convex hull of the $k$-th roots of unity, $k = 1, \ldots, n$, multiplied by the maximum positive eigenvalue [54]. We use this fact to prove a noise robustness result for the quantum processes of Theorem 5, in the presence of depolarizing noise, in the special case $p = 1$ where the process effectively takes place on a qubit. We believe the argument can be adapted also for general $0 \leq p < 1$.

Theorem 6. For $0 \leq q < 1$ and $0 < s \leq 1$, consider the processes defined by the HQMM with CP maps perturbed by depolarizing noise, at fixed $r = 0$ and varying $\alpha$. If positive realizations exist for every $\alpha$, their maximum dimension (i.e. number of states of the HMM) must be $\geq \Omega\!\left(\frac{\lambda}{s\sqrt{1-q}\,(\cosh 4r)}\right)$, assuming that $1 - q$ is small enough.
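The eigenvalue-region argument can be turned into a small numerical check. The sketch below tests whether a complex number lies in the convex hull of the $k$-th roots of unity and returns the smallest such $k$; since an eigenvalue of an $n \times n$ non-negative matrix (with Perron root normalized to one) must lie in one of these hulls for some $k \leq n$, this smallest $k$ is a lower bound on $n$. The function names and the tolerance are choices made for this illustration, and the hull region is only an outer approximation of the true allowed region, so the resulting bound is necessary but not tight.

```python
import cmath
import math


def in_hull_of_roots(z, k, tol=1e-12):
    """True if z lies in the convex hull of the k-th roots of unity."""
    if k == 1:
        return abs(z - 1) <= tol                  # the hull is the single point 1
    if k == 2:
        return abs(z.imag) <= tol and abs(z.real) <= 1 + tol  # segment [-1, 1]
    r, theta = abs(z), cmath.phase(z)
    sector = 2 * math.pi / k
    # Angle measured from the midpoint direction of the nearest polygon edge.
    offset = (theta % sector) - sector / 2
    # Each edge of the regular k-gon sits at distance cos(pi/k) from the origin.
    return r * math.cos(offset) <= math.cos(math.pi / k) + tol


def min_dimension_bound(z, k_max=10_000):
    """Smallest k <= k_max with z in the hull of the k-th roots of unity."""
    for k in range(1, k_max + 1):
        if in_hull_of_roots(z, k):
            return k
    return None
```

As an example of the qualitative behaviour behind Theorem 6, an eigenvalue of fixed phase moving towards the unit circle is excluded from more and more polygons, so the bound returned by `min_dimension_bound` grows as the modulus approaches one.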

Processes without quantum realization
Our main result is to present non-semi-algebraic 3-dimensional cones which are the only closed stable cones for models of certain stochastic processes, thus ruling out the possibility that these processes admit a quantum realization. These cones are the exponential cone $K_{\exp}$ and the power cones $K_\alpha$ (for $0 < \alpha < 1$) [38]. Both $K_{\exp}$ and the $K_\alpha$ are closed convex cones, and they are all not semi-algebraic (the latter for irrational $\alpha$). Indeed, the boundary of $K_{\exp} \cap \{x_2 = 1\}$ is the graph of the transcendental exponential function, while the boundary of the analogous section of $K_\alpha$ is the graph of a power function, which is transcendental for irrational $\alpha$.
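For readers who want to experiment with these cones, the sketch below implements membership tests under an assumed parametrization, namely the one commonly used in the conic optimization literature: $K_{\exp}$ as the closure of $\{x_1 \geq x_2 e^{x_3/x_2},\ x_2 > 0\}$ and $K_\alpha = \{x_1^\alpha x_2^{1-\alpha} \geq |x_3|,\ x_1, x_2 \geq 0\}$. The ordering of the coordinates in the paper may differ; it is only assumed here that the section $x_2 = 1$ of the exponential cone is bounded by the graph of the exponential, as stated in the text. The function names are hypothetical.

```python
import math


def in_exp_cone(x, tol=1e-12):
    """Membership in the (assumed) exponential cone, including its closure."""
    x1, x2, x3 = x
    if x2 > 0:
        return x1 >= x2 * math.exp(x3 / x2) - tol
    # Points added by taking the closure: x2 = 0, x1 >= 0, x3 <= 0.
    return x2 == 0 and x1 >= -tol and x3 <= tol


def in_power_cone(x, alpha, tol=1e-12):
    """Membership in the (assumed) power cone K_alpha, 0 < alpha < 1."""
    x1, x2, x3 = x
    if x1 < 0 or x2 < 0:
        return False
    return x1 ** alpha * x2 ** (1 - alpha) >= abs(x3) - tol


def exp_boundary_point(t):
    """Point on the boundary of the section x2 = 1 of the exponential cone."""
    return (math.exp(t), 1.0, t)
```

With this convention, `exp_boundary_point(t)` traces the graph of the exponential function, which is the curve whose transcendence rules out a semi-algebraic description.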
The minimal example we can find, using an alphabet of 3 letters, is the following.
Theorem 7. It is possible to choose $\nu, a, b \in \mathbb{R}$, $m_0, \mu_0 \in \mathbb{R}^3$, such that the linear maps $D_0$, $D_1$, $D_2$ satisfy:
• $(\mathbb{R}^3, \pi, D, \tau)$ is a bona fide regular quasi-realization of a stochastic process,
• $K_{\exp}$ is the unique stable closed convex cone admitted by $(\mathbb{R}^3, \pi, D, \tau)$.
Thus, the resulting stochastic process does not admit a quantum realization.
The crucial observation, as in [15], is that any candidate closed stable cone $C$ has to satisfy $C_{\min} \subseteq C \subseteq C_{\max}$. On the other hand, for the given process the parameters are chosen in such a way that $C_{\min} = K_{\exp} = C_{\max}$, and therefore the only possible choice is $C = K_{\exp}$. Indeed, the matrices are defined in such a way that after a reset, which must happen at some point, the rays generated by the repeated action of the matrices $D_1$ and $D_2$ in any order densely explore the extremal rays of the exponential cone.
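The mechanism of densely exploring the extremal rays can be illustrated with a hedged toy construction. Assuming the standard parametrization in which the extremal rays of the exponential cone pass through $(e^t, 1, t)$, the linear map $(x_1, x_2, x_3) \mapsto (e^a x_1, x_2, x_3 + a x_2)$ sends the ray with parameter $t$ to the ray with parameter $t + a$. Two such maps with shifts $a > 0$ and $b < 0$ of irrational ratio then reach parameters $ma + nb$ that come arbitrarily close to any target. These maps and the names below are illustrative assumptions and are not claimed to be the matrices $D_1$, $D_2$ of Theorem 7.

```python
import math


def shift_map(a):
    """Matrix of (x1, x2, x3) -> (e^a x1, x2, x3 + a x2) (assumed form)."""
    return [[math.exp(a), 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, a, 1.0]]


def apply(M, x):
    """Matrix times column vector."""
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]


def boundary_ray(t):
    """Representative of the extremal ray with parameter t."""
    return [math.exp(t), 1.0, t]


def closest_orbit_parameter(a, b, target, max_steps):
    """Closest value to target among m*a + n*b with 0 <= m, n <= max_steps."""
    best = 0.0
    for m in range(max_steps + 1):
        for n in range(max_steps + 1):
            t = m * a + n * b
            if abs(t - target) < abs(best - target):
                best = t
    return best
```

For instance, with $a = 1$ and $b = -\sqrt{2}$ the reachable parameters approach any chosen target as more applications of the two maps are allowed, which is the kind of density property that forces $C_{\min}$ to fill the whole cone.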
With the same strategy we can also show that the power cones with irrational power give processes that are not representable by an HQMM. In this case the invertible matrices are diagonal, but we need an alphabet of four letters, rather than three.

Theorem 8. It is possible to choose $\nu, a, b \in \mathbb{R}$, $m_0, \mu_0 \in \mathbb{R}^3$, such that the linear maps $D_0$, $D_1$, $D_2$, $D_3$ are such that $D = D_0 + D_1 + D_2 + D_3$ has unique left and right eigenvectors with eigenvalue 1, respectively $\pi$, $\tau$, so that, with $D : \{0, 1, 2, 3\}^* \to L(\mathbb{R}^3)$ generated by $D_0, D_1, D_2, D_3$:
• $(\mathbb{R}^3, \pi, D, \tau)$ is a bona fide regular quasi-realization of a stochastic process,
• $K_\alpha$ is the unique stable closed convex cone admitted by $(\mathbb{R}^3, \pi, D, \tau)$.
Thus, the resulting stochastic process does not admit a quantum realization when $\alpha$ is irrational.

Discussion
The result of the previous section has an important consequence: one could have hoped that $CP = G$, meaning that quantum theory would be able to explain any sequence of observations from a finite GPT dynamics, and this property could be a principle that distinguishes quantum theory among general probabilistic theories. This is not the case, and the study of extensions of quantum mechanics giving rise to larger sets of quasi-realizations is interesting to pursue, with possible applications in data analysis, in many-body physics and in the foundations of quantum mechanics. In particular, the exponential and power cones discussed here, and their associated GPTs, have a rich symmetry structure, as indeed the respective cones are generated by the action of a group of matrices on their boundary, reminiscent of the fact that in quantum mechanics the pure states are the orbit of any fiducial pure state under the action of the unitary group. This translates into a large set of essentially reversible dynamics of the GPTs.
As classical and quantum models are actually not restricted to a specific dimension, it is interesting to look for possible multivariate generalizations of power cones and exponential cones, which can be used to provide richer quasi-realizations, and which might unify classical, quantum and the present new state spaces (see e.g. [55]). Commutative multivariate generalizations that come to mind (for $\alpha \in \mathbb{R}^n$ with $\alpha_i \geq 0$ and $\sum_{i=1}^n \alpha_i = 1$) are the multivariate power cone and the multivariate exponential cone. These cones, however, can be represented with inequalities involving linear constraints and vectors belonging to the previously discussed 3-dimensional exponential and power cones [38], therefore they do not really give new structural building blocks. On the other hand, and perhaps more interesting from the point of view of quantum foundations, are extensions using positive semidefinite matrix cones, which reduce to the power cones and the exponential cones on specific sections, and to the positive semidefinite cone on others. As usual in non-commutative settings, there is more than one natural extension to matrices, and we briefly discuss a few possibilities.
• Matrix exponential cone $L_{\exp}$: as the exponential function is neither matrix convex nor matrix monotone, we define it by applying the logarithm (which is matrix monotone and concave) instead.
• Matrix power cones: there are at least two natural versions (for $0 < \alpha < 1$ and a fixed $X \in \mathbb{R}^{d \times d}$), based on Lieb's concavity theorem, the latter admitting an obvious generalization to $\alpha_i \geq 0$, $\sum_{i=1}^n \alpha_i = 1$ by way of $n$-fold tensor products.
• Matrix relative entropy cone $\mathcal{D}$.

Notice that the section $\{t = 0\}$ of $L_{\exp}$ is $\{(A, B) : A \geq 0, B \leq 0\}$. Both versions of the matrix power cone have the property that the section with $\{t = 0\}$ (resp. $\{T = 0\}$) gives just a double copy of the cone of positive semi-definite matrices in dimension $d$. Finally, $\mathcal{D}$ intersected with $\{t = 0\}$ is $\{(A, B) \in \mathbb{R}^{d \times d} \times \mathbb{R}^{d \times d} : A, B \geq 0,\ \operatorname{supp} A \subseteq \operatorname{supp} B,\ \operatorname{Tr}(A \log A - A \log B) = 0\}$. This means that quantum dynamics can be obtained by projecting onto the $t = 0$, $A = B$ hyperplane and applying the same CP map to $A$ and $B$. On the other hand, acting with the map which projects $A$ and $B$ to $(\operatorname{Tr} A)\mathbb{1}$ and $(\operatorname{Tr} B)\mathbb{1}$, and does not touch $t$, and then with the maps seen in the examples, one recovers the power cone and the exponential cone. Transcendental matrix cones could also be useful in the study of finitely correlated states, and it would be interesting to exhibit genuinely quantum (e.g. not diagonal in a product basis as in our examples) finitely correlated states that are not C*-finitely correlated.
Another important direction to investigate is the classical-quantum separation in the presence of noise, to understand to which extent classical models can simulate noisy dynamics. We have shown a specific example where the memory of the classical model has to increase as $\Omega\big((1 - q)^{-1/2}\big)$, where $q$ is the noise parameter and the noiseless case corresponds to $q = 1$. This holds if we insist on looking for exact realizations, and it is likely to be a generic feature of quantum models without classical realizations. What happens if we allow some level of approximation has yet to be formalized and studied.
Finally, there is a lot of room for improvement of necessary and sufficient conditions for a process to have a quantum realization. It would be interesting to single out some criteria which are easily verifiable from a quasi-realization. For example, our proof for excluding a quantum realization is heavily based on the fact that there is only one possible stable cone, and it is not SDR. In general the stable cone is not unique, and it would be interesting to find a way to exclude quantum realizations in this case.

A.1 Quasi-realization
We present an explicit quasi-realization $(V, \pi, D, \tau)$ of FRDN processes. We fix $V = \mathbb{R}^4$. The matrix $D_b$ corresponding to the output $b$ is given explicitly, and the matrix $D_a$ corresponding to the output $a$ is a rank-one matrix built from vectors $\pi_0, w \in \mathbb{R}^4$ to be determined. We first want to fix the vector $\pi_0$. In order to do that we consider the probabilities $p(b^n|a)$ of the sequences $b^n = bb \ldots b$ after an $a$ is output.
Using the above expression and $p(b^n|a) = \pi_0 D_b^n \tau$ for every $n \geq 0$ (this also fixes $\pi_0 \tau = 1$), we obtain $\pi_0$ in terms of two coefficients $a_{\lambda,\alpha}$ and $b_{\lambda,\alpha}$. Requiring that $(D_a + D_b)\tau = \tau$ then determines $w$. This construction also fully determines $p(b^n a|a)$, and therefore all the probabilities $p(u|a)$.
We are left with checking that p(a) is positive and equal to the desired value.
We have the condition $\pi D_a + \pi D_b = \pi$, which determines $\pi$, since $\mathbb{1} - D_b$ is invertible. Now, the candidate left fixed point $\pi$ gives a value of $p(a)$ which is the desired one. By virtue of the fixed-point constraints and of the reset property, the probabilities of all words are completely determined and they coincide with those given by the FRDN process.

A.2 Quantum (completely positive) realization
We are going to verify that the quantum process given in Theorem 5 gives the probabilities of the FRDN process.
To start, observe that $\Phi_{r,\alpha}$ is a completely positive map and that its non-zero eigenvalues coincide with those of $D_b$. Compatibility with Eq. (29) then constrains the remaining parameters of the realization. We note that $\beta = \bar{\gamma}$; imposing $r \geq 0$, we obtain the value of $r$ in Eq. (49), which involves a quantity that is less than 1 if $0 < \lambda \leq 1/2$, and a condition on $\phi$ which has as a solution

$$\tan\phi = e^{2r} \tan\left(\frac{1}{2}\arctan\frac{\lambda \sin\alpha}{1 - \lambda\cos\alpha}\right),$$

and the expression for $\arg\beta$ comes from

$$\sqrt{2}\,\beta = \cos\phi\,(\cosh r + \sinh r) + i \sin\phi\,(\cosh r - \sinh r) = e^{r}\cos\phi + i e^{-r}\sin\phi = \sqrt{e^{2r}\cos^2\phi + e^{-2r}\sin^2\phi}\; e^{i \arctan(e^{-2r}\tan\phi)}. \tag{52}$$

With this choice of $r$ and $\phi$, the value of $p$ that solves Eq. (47) and Eq. (48) is the same. To compute it, observe that

$$(|\beta|^2 + |\gamma|^2)\cosh 2r - (\gamma\bar{\beta} + \beta\bar{\gamma})\sinh 2r = 1, \tag{53}$$

therefore we get an expression for $p$, from which it follows that $0 \leq p \leq 1$ as desired.
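The two surviving identities of this derivation, Eq. (52) and Eq. (53), can be checked numerically. The sketch below does so under the assumption that $\gamma$ is the complex conjugate of $\beta$, which is the reading under which Eq. (53) evaluates to one; the polar form is checked for $\phi \in (-\pi/2, \pi/2)$, where the arctangent gives the correct branch. The function names are chosen for this example only.

```python
import cmath
import math


def beta_cartesian(r, phi):
    """beta from the first expression in Eq. (52)."""
    re = math.cos(phi) * (math.cosh(r) + math.sinh(r))
    im = math.sin(phi) * (math.cosh(r) - math.sinh(r))
    return complex(re, im) / math.sqrt(2)


def beta_polar(r, phi):
    """beta from the modulus-phase expression in Eq. (52)."""
    modulus = math.sqrt(math.exp(2 * r) * math.cos(phi) ** 2
                        + math.exp(-2 * r) * math.sin(phi) ** 2)
    phase = math.atan(math.exp(-2 * r) * math.tan(phi))
    return modulus * cmath.exp(1j * phase) / math.sqrt(2)


def normalization(r, phi):
    """Left-hand side of Eq. (53), assuming gamma = conj(beta)."""
    beta = beta_cartesian(r, phi)
    gamma = beta.conjugate()  # assumption, see the lead-in
    cross = gamma * beta.conjugate() + beta * gamma.conjugate()
    return ((abs(beta) ** 2 + abs(gamma) ** 2) * math.cosh(2 * r)
            - cross.real * math.sinh(2 * r))


def phi_solution(lam, alpha, r):
    """phi solving tan(phi) = e^{2r} tan(half_angle)."""
    half_angle = 0.5 * math.atan(lam * math.sin(alpha) / (1 - lam * math.cos(alpha)))
    return math.atan(math.exp(2 * r) * math.tan(half_angle))
```

With $\phi$ chosen by `phi_solution`, the phase of $\beta$ reduces to the half-angle $\frac{1}{2}\arctan\frac{\lambda\sin\alpha}{1-\lambda\cos\alpha}$, independently of $r$.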
We also need to check that $\Phi_{r,\alpha}$ is trace non-increasing, that is $\Phi^\dagger_{r,\alpha}(\mathbb{1}) \leq \mathbb{1}$, which is guaranteed since the eigenvalues $\omega_\pm$ of $\Phi^\dagger_{r,\alpha}(\mathbb{1})$ evaluate to $\omega_+ = 1$ and $\omega_- = \lambda^2$ when we substitute the value of $r$ given by Eq. (49). Finally, $p(a)$ is fixed as in the quasi-realization.
B Noise robustness of the size of classical memory: proof of Theorem 6
The impossibility of a classical realization of FRDN models crucially uses the fact that the maps have eigenvalues of maximum modulus whose phase factors are not roots of unity. This cannot happen for irreducible maps [56]. Taking the qubit reduction of our example quantum realization (just take $p = 1$ and choose the initial state to be in the $\{|0\rangle, |1\rangle\}$ subspace), and mixing our invertible map with completely depolarizing noise, we obtain a map $D_b^\dagger(q, s)$ that reduces to the noiseless one when $q = 1$, $s = 0$, while otherwise the maximum modulus eigenvalues of $D_b^\dagger(q, s)$ have phases that are commensurate with $\pi$, since $D_b^\dagger(q, s)$ is irreducible [56]. Classical realizations cannot be excluded in this way, but it is interesting to understand how large the dimension of the memory should be as $q$ approaches one, and this can be understood again by looking at eigenvalues.
In fact we have that These relations hold for any quasi-realization, for every value of $1/z$ inside the radius of convergence of , $\|\cdot\|$ being the operator norm. This holds in particular if the quasi-realization is classical. From the quantum realization one obtains a meromorphic continuation of $f(1/z)$ on all of $\mathbb{C}$, since $f(1/z)$ is rational; by inspection, the continuation can have poles only for $1/z = 1/\lambda$, where $\lambda$ is an eigenvalue of $D_b(q,s)$. Any classical realization will result in a function of $1/z$ coinciding with the function obtained from the quantum realization inside the minimum radius of convergence, therefore resulting in the same meromorphic continuation. We note that, again by inspection, the meromorphic continuation for a given quasi-realization has poles only at $z = \lambda$, where $\lambda$ is an eigenvalue of $D_b(q,s)$, and thus if a pole at $\lambda$ exists for the meromorphic continuation of the quantum realization, $\lambda$ has to be an eigenvalue of $D_b(q,s)$ in any realization.
For $n \times n$ non-negative stochastic matrices, the allowed region of the eigenvalues is contained in the convex hull of the $k$-th roots of unity, $k \leq n$ [53,54], and this holds also for general non-negative matrices once their maximum eigenvalue is renormalized to one, since they are similar to a stochastic one [57]. We can thus determine a lower bound on the dimension of the classical memory by showing that there are eigenvalues of the quantum map $D_b(q,s)$, associated to poles in Eq. (61), which are outside the allowed region unless $n$ is large enough. Suppose that two eigenvalues of $D^\dagger_b(1,0)$ are $\eta_{\max}$ (which is on the maximal circle and real) and $\eta$. First of all, we observe that a perturbation bound constrains the eigenvalues of Let $\eta'_{\max}$ be its maximum modulus eigenvalue, which is real and positive. The map $\rho \mapsto q\lambda\, e^{i\alpha Z/2}\rho\, e^{-i\alpha Z/2} + (1-q)s\,\mathrm{Tr}[e^{-2rX}\rho]\,\frac{e^{2rX}}{2}$ has an eigenvector $|0\rangle\langle 0| - |1\rangle\langle 1|$ with eigenvalue $q\lambda$, therefore $\eta'_{\max} \geq q\lambda$.
We denote by $\sigma(A)$ the $n$-tuple of eigenvalues of the $n \times n$ matrix $A$, counted with algebraic multiplicity. The optimal matching distance between two $n$-tuples $u$, $v$ is $d(u,v) = \min_{g\ \mathrm{permutation}} \max_{1 \leq i \leq n} |u_i - v_{g(i)}|$. Theorem VI.5.1 in [58] says that if $A$ is a normal matrix and $B$ is an arbitrary matrix such that $\|A - B\|$ is less than half the distance between any two distinct eigenvalues of $A$, then $d(\sigma(A), \sigma(B)) \leq \|A - B\|$. In our case, the eigenvalues of $A = q\lambda\, U \cdot U^\dagger$, where $U = e^{i\alpha Z/2}$, are $\{q\lambda, q\lambda, q\lambda e^{i\alpha}, q\lambda e^{-i\alpha}\}$, and half the minimum distance between distinct eigenvalues is more than $q\lambda|\sin(\alpha)|$. By taking we have that $\|A - B\| = 2(1-q)s\cosh(4r)$. Denoting by $\{\eta_i\}$ and $\{\eta'_i\}$ the eigenvalues of $A$ and $B$ respectively, note also that $d(\sigma(A), \sigma(B)) \geq \min_{g\ \mathrm{permutation}} |\eta'_i - \eta_{g(i)}|$ for any $i$, and that $\min_{i=1,\ldots,4} |\eta'_{\max} - \eta_i| = |\eta'_{\max} - \eta_{\max}|$. Supposing that $q$ is such that $\|A - B\| \leq q\lambda|\sin(\alpha)|$, we can find $|\eta_{\max} - \eta'_{\max}| \leq d(\sigma(A), \sigma(B)) \leq 2(1-q)s\cosh(4r)$ and also an eigenvalue $\eta'$ such that $|\eta - \eta'| \leq d(\sigma(A), \sigma(B)) \leq 2(1-q)s\cosh(4r)$.
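The optimal matching distance used above can be evaluated by brute force for small tuples. The following is a minimal sketch, not part of the proof: the function names, the numerical values of $q$, $\lambda$, $\alpha$, and the perturbed spectrum are all hypothetical and chosen only for illustration.

```python
import cmath
import itertools
import math


def matching_distance(u, v):
    """Optimal matching distance: min over permutations g of max_i |u_i - v_g(i)|."""
    if len(u) != len(v):
        raise ValueError("tuples must have the same length")
    return min(
        max(abs(a - b) for a, b in zip(u, perm))
        for perm in itertools.permutations(v)
    )


def unperturbed_spectrum(q, lam, alpha):
    """Eigenvalues {q*lam, q*lam, q*lam*e^{i alpha}, q*lam*e^{-i alpha}} listed in the text."""
    base = q * lam
    return [base, base, base * cmath.exp(1j * alpha), base * cmath.exp(-1j * alpha)]


# Hypothetical parameter values, for illustration only.
q, lam, alpha = 0.95, 0.5, math.pi / 5
sigma_A = unperturbed_spectrum(q, lam, alpha)

# A made-up perturbed spectrum: every eigenvalue is displaced by exactly 0.01.
sigma_B = [z + 0.01 * cmath.exp(1j * k) for k, z in enumerate(sigma_A)]

d = matching_distance(sigma_A, sigma_B)
print(d)
```

The brute-force search over permutations is factorial in the tuple length, which is harmless for the four eigenvalues considered here but would need an assignment algorithm for larger tuples.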
By repeated application of the triangle inequality, and supposing $2(1-q)s\cosh(4r) \leq q\lambda|\sin\alpha|$, we have the following: Let us focus on the segment between $1$ and $e^{i 2\pi/n}$: if (1 − η η max ) is outside the bigger circular segment cut off by this segment, then there is no classical model with such eigenvalues in dimension $n$, because this point is outside the convex hull of $e^{ir\pi/k}$, $r = 0, \ldots, k-1$, $k = 0, \ldots, n$. The maximum distance between this segment and the boundary of the circle is $1 - \cos(\pi/n)$, which happens at $\alpha = \pi/n$. For this value of $\alpha$ there is no classical model with memory smaller than $n$ if $2(1-q)s\cosh(4r) \leq q\lambda|\sin(\pi/n)|$ and $4(1-q)s\cosh(4r) \leq q\lambda(1 - \cos(\pi/n))$ from Eq. (63 , it is sufficient to require $4(1-q)s\cosh(4r) \leq q\lambda\,\frac{1}{6}(\pi/n)^2$ to exclude the existence of a classical model. Therefore if a classical model exists we need $\frac{1}{6}(\pi/n)^2 < \frac{4(1-q)s\cosh(4r)}{q\lambda}$.
We now have to show that in fact there are poles of $f(1/z)$ corresponding to $\eta'_{\max}$ and $\eta'$. Since probabilities are real, if a complex eigenvalue is a pole, its conjugate must be too. We also note that in our example $D_b(q,s)$ is guaranteed to be diagonalizable if $2(1-q)s\cosh(4r) \leq q\lambda|\sin\alpha|$. In fact, this map is completely positive, therefore it admits a positive semi-definite eigenvector with real eigenvalue. We note that the operator $e^{rX}(|0\rangle\langle 0| - |1\rangle\langle 1|)e^{rX}$ is an eigenvector with eigenvalue $q\lambda$, therefore a linearly independent eigenvector with real eigenvalue exists. Finally, for these values of $q$, $D^\dagger_b(q,s)$ admits two distinct complex eigenvalues, again by $d(\sigma(A), \sigma(B)) \leq \|A - B\|$. Since $D_b(q,s)$ is a $4 \times 4$ matrix, it must be diagonalizable. This implies that if a complex eigenvalue $\eta'$ is not a pole, then either $D_a(q,s)\tau = 0$ or $\pi D_a(q,s) = 0$, which is excluded by looking at the definition of $D_a(q,s)$ for q = 1, s = 1, or $D_a(q,s)\tau$ is orthogonal to the right eigenspace of $D_b(q,s)$ corresponding to $\eta'$, or $\pi D_a(q,s)$ is orthogonal to the left eigenspace of $D_b(q,s)$ corresponding to $\eta'$. The latter two conditions are excluded by observing that the spans of the orbits, $\mathrm{span}\{D_b(q,s)^n D_a(q,s)\tau,\ n \geq 0\}$ and $\mathrm{span}\{\pi D_a(q,s) D_b(q,s)^n,\ n \geq 0\}$, are at least 3-dimensional (therefore both complex eigenvalues are poles). This is seen explicitly for $q = 1$, and for other values one can observe that the orbit is generated by linear combinations of the vectors in the orbit of the case $q = 1$ and $\mathbb{1}$, in both cases. Since the orbits for $q = 1$ densely explore a cone which is a linear transformation of a circular cone, there are always at least two points on the cone such that $\mathbb{1}$ is not in their span, therefore also in the case $q \neq 1$ the orbits must span at least a three-dimensional space.
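To get a feeling for how the bound scales as $q$ approaches one, the necessary condition $\frac{1}{6}(\pi/n)^2 < 4(1-q)s\cosh(4r)/(q\lambda)$ can be solved for $n$, which gives $n > \pi\sqrt{q\lambda/(24(1-q)s\cosh(4r))}$. The sketch below only performs this rearrangement; the function name and the sample parameter values are hypothetical, and the code does not check the side conditions under which the inequality was derived in the text (in particular the choice $\alpha = \pi/n$).

```python
import math


def min_classical_memory(q, lam, s, r):
    """Smallest integer n satisfying (1/6)(pi/n)^2 < 4(1-q)s cosh(4r)/(q lam).

    This is only an algebraic rearrangement of the inequality in the text;
    the assumptions behind that inequality are NOT verified here.
    """
    noise = 4.0 * (1.0 - q) * s * math.cosh(4.0 * r) / (q * lam)
    if noise <= 0.0:
        raise ValueError("the bound is only meaningful for q < 1 and s > 0")
    threshold = math.pi / math.sqrt(6.0 * noise)  # n must exceed this value
    return math.floor(threshold) + 1


# Hypothetical parameter values, for illustration only.
for q in (0.99, 0.9999):
    print(q, min_classical_memory(q, lam=0.5, s=1.0, r=0.0))
```

The threshold grows like $(1-q)^{-1/2}$, so in this sketch the required classical memory diverges as the noise is switched off.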

C Processes without a quantum realization
In this section we prove that there exist stochastic processes with a finite dimensional quasi-realization and that are not quantum realizable.

C.1 Proof of Theorem 7: Exponential cone
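Recall the definition of the exponential cone:
$$K_{\exp} := \mathrm{cl}\left\{(x_1, x_2, x_3) \in \mathbb{R}^3 : x_1 \geq x_2\, e^{x_3/x_2},\ x_2 > 0\right\}. \quad (64)$$
We consider a quasi-realization on $V = \mathbb{R}^3$, alphabet $M = \{0, 1, 2\}$ and generators where Here, $\nu$ is a normalization constant such that the largest absolute value of the (in general complex) eigenvalues of $D_0 + D_1 + D_2$ is 1.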
In order to check that the above quasi-realization defines a non-negative measure we are going to use a standard result that states this happens if and only if there is a convex cone $C \subset V$ such that $\tau \in C$, $D^{(u)}(C) \subseteq C$, $\pi \in C^* = \{f \in V^* : f(x) \geq 0\ \forall x \in C\}$. Thus we need to describe what kind of cone $C$ is preserved under the transformations $\{D^{(u)}\}_{u \in M}$. In fact, we argue that for any non-zero convex cone $C$ that is stable under all the transformations $D_u$ we can find $\tau \in C$ such that $\sum_{u \in M} D_u \tau = \tau$. This is a consequence of a generalized version of the Perron-Frobenius theorem [59][60][61] that states that if $K$ is a convex cone preserved by a nonzero matrix $A$ then:
• The spectral radius $\rho(A)$ is an eigenvalue of $A$.
• The cone $K$ contains an eigenvector of $A$ corresponding to $\rho(A)$.
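The two statements above can be illustrated in the simplest cone-preserving setting, a matrix with positive entries acting on the non-negative orthant. The following is a minimal sketch: the matrix `A` is a hypothetical example unrelated to the generators $D_u$, and power iteration is used here only as a convenient way to exhibit the eigenvector inside the cone.

```python
def power_iteration(matrix, start, steps=200):
    """Iterate v -> A v / sum(A v); returns an estimate of (rho(A), eigenvector)."""
    v = list(start)
    rho = 0.0
    for _ in range(steps):
        w = [sum(a * x for a, x in zip(row, v)) for row in matrix]
        rho = sum(w) / sum(v)       # growth factor of the l1 norm
        total = sum(w)
        v = [x / total for x in w]  # renormalize so the entries sum to one
    return rho, v


# Hypothetical matrix with positive entries: it maps the non-negative orthant into itself.
A = [[2.0, 1.0], [1.0, 3.0]]
rho, vec = power_iteration(A, [1.0, 1.0])
print(rho, vec)
```

For this matrix the iteration converges to the eigenvalue $(5+\sqrt{5})/2$, which is the spectral radius, and the associated eigenvector has strictly positive entries, so it lies in the preserved cone.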
It can be shown by inspection that $D_1$, $D_2$ preserve $K_{\exp}$ acting from the left on column vectors, and $D_0$ also does so provided that we choose $\mu_0 \in K^*_{\exp}$ and $m_0 \in K_{\exp}$, therefore one can find $\nu > 0$ such that $(D_0 + D_1 + D_2)\tau = \tau$, $\tau \in K_{\exp}$. The same argument can be applied to $D_0$, $D_1$, $D_2$ acting from the right on row vectors, which preserve $K^*_{\exp}$, therefore there exists $\pi \in K^*_{\exp}$ such that $\pi(D_0 + D_1 + D_2) = \pi$. The minimal stable cone is given by and what we just observed shows that $C_{\min} \subseteq K_{\exp}$. On the other hand, provided that $D_0\tau \neq 0$, we also have Indeed, when exploring the dynamics of this quasi-realization the operator $D_0$ acts as a "reset" to $m_0$ since it is defined as a rank-1 projector. We can ensure that $D_0\tau \neq 0$ in the following way.
Looking back at the orbit of $m_0$, The matrices $D_1$ and $D_2$ commute, so it suffices to consider where $x = s\ln(a) + t\ln(b)$. Note that the set of such $x$ is dense in $\mathbb{R}$ due to the incommensurability condition and Kronecker's Theorem. Thus, It is easy to see that or in other words that the epigraph $t \geq e^x$ is a section of $K_{\exp}$. Setting $m_{02} = 1$ and $m_{01} = e^{m_{03}}$ we thus have that the orbit of $\tau$ densely explores the curve $(e^x, 1, x)$, and its closed conic hull is This can be seen as follows: • x 2 , x 2 > 0} [38], and ) is contained in the convex hull of $\{(e^x, 1, x),\ x \in \mathbb{R}\}$ by convexity of the exponential function, and thus convexity of its epigraph (as a set). This means that The dual of $K_{\exp}$ is given by
$$\mathrm{int}\, K^*_{\exp} := \mathrm{cone}\left\{(y_1, y_2, y_3) \in \mathbb{R}^3 : y_1 \geq -y_3\, e^{y_2/y_3 - 1},\ y_1 > 0,\ y_3 < 0\right\} \quad (75)$$
The argument to characterize On the other hand, asking that $D_0$ is such that $\pi D_0 \neq 0$ and choosing $\mu_{03} = -1$, $\mu_{01} = e^{-\mu_{02}-1}$ we obtain that where the last passage is due to the fact that for any $(y_1, y_2, y_3) \in \mathrm{int}\, K^*_{\exp}$, $(-y_1/y_3, -y_2/y_3, -1) \in \mathrm{int}\, K^*_{\exp}$, but $(-y_1/y_3, -y_2/y_3, -1)$ is also in the convex hull of $\{(e^{-x-1}, x, -1),\ x \in \mathbb{R}\}$, by the convexity of the function $e^{-x-1}$. Note that any stable cone $C$ has to satisfy $C_{\min} \subseteq C \subseteq C_{\max}$. Thus, by the observations above our quasi-realization has $K_{\exp}$ as the only closed stable cone. Since $C_{\min}$ and $C^*_{\max}$ both span the full three-dimensional space, the quasi-realizations are also regular [2,15]. Moreover, since $K_{\exp}$ is not semi-algebraic, by the conditions in [15] the quasi-realization does not admit a completely positive realization. Now considering the arguments above, we can give a specific example with $a = e$, $b = \frac{1}{2}$ and satisfying the conditions that $\pi m_0 > 0$ and $\mu_0\tau > 0$.
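The density of the exponents $x = s\ln(a) + t\ln(b)$ can be probed numerically for the specific choice $a = e$, $b = 1/2$. The sketch below is illustrative only: it assumes that $s$ and $t$ range over non-negative integers (counts of the symbols 1 and 2 in a word) and that $\ln a > 0 > \ln b$; the function name, the search range and the target values are hypothetical.

```python
import math


def closest_orbit_exponent(target, a, b, max_t):
    """Search non-negative integers (s, t), t <= max_t, minimizing |s ln a + t ln b - target|.

    Assumes ln a > 0 > ln b, so that for each t the best s is found by rounding.
    """
    la, lb = math.log(a), math.log(b)
    best = None
    for t in range(max_t + 1):
        s = max(0, round((target - t * lb) / la))
        err = abs(s * la + t * lb - target)
        if best is None or err < best[0]:
            best = (err, s, t)
    return best


# Hypothetical targets; a = e and b = 1/2 as in the specific example of the text.
for target in (0.3, -2.5):
    err, s, t = closest_orbit_exponent(target, a=math.e, b=0.5, max_t=200)
    print(target, s, t, err)
```

Increasing `max_t` lets the search approach any real target more closely, which is the numerical counterpart of the density statement.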
As a check of consistency, notice that since $C_{\min}$ and $C_{\max}$ span $\mathbb{R}^3$ and $\pi D^{(u)}\tau \geq 0$ for every $u \in M^*$, there must exist a word $u^*$ such that $\pi D^{(u^*)}\tau > 0$, which implies that $\pi\tau > 0$, otherwise the probabilities would be all zero. In practice, this is shown already by $\pi D_0 \tau > 0$. We can compute the following fixed points (up to normalization) $\tau = (17.855\ldots,\ 5.959\ldots,\ 1)^T$, $\pi = (2.996\ldots,\ -1.167\ldots,\ -1)$, and numerically check that $D_0\tau \neq 0$, $\pi D_0 \neq 0$. We can then check that $\tau \in \mathrm{int}(K_{\exp})$ and $\pi \in \mathrm{int}\, K^*_{\exp}$ explicitly using the expressions (64) and (75), which must be true in general because our quasi-realization has minimum dimension (3) among all quasi-realizations of the generated process.
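The membership check can be reproduced with the truncated values quoted above. This is a minimal sketch under two assumptions: the exponential cone is taken in the standard parametrization $x_1 \geq x_2 e^{x_3/x_2}$, $x_2 > 0$, which is consistent with the curve $(e^x, 1, x)$ explored by the orbit, and the digits hidden behind the ellipses are simply dropped. The function names are hypothetical.

```python
import math


def in_int_exp_cone(x1, x2, x3):
    """Strict test x1 > x2 * exp(x3 / x2) with x2 > 0 (assumed standard form of K_exp)."""
    return x2 > 0 and x1 > x2 * math.exp(x3 / x2)


def in_int_dual_exp_cone(y1, y2, y3):
    """Strict version of (75): y1 > -y3 * exp(y2 / y3 - 1), y1 > 0, y3 < 0."""
    return y1 > 0 and y3 < 0 and y1 > -y3 * math.exp(y2 / y3 - 1.0)


# Fixed points as quoted in the text, truncated to the printed digits.
tau = (17.855, 5.959, 1.0)
pi_vec = (2.996, -1.167, -1.0)

pairing = sum(p * t for p, t in zip(pi_vec, tau))  # the scalar pi * tau
print(in_int_exp_cone(*tau), in_int_dual_exp_cone(*pi_vec), pairing)
```

Both inequalities hold with a wide margin for these values, so the truncation of the digits does not affect the outcome of the check.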

C.2 Proof of Theorem 8: Power cone
Using the same techniques as before we can give a quasi-realization that does not admit a quantum realization using a reset matrix and diagonal invertible matrices. Since the reasoning is very similar to that of the previous section, the argument is streamlined. We consider the quasi-realization on $V = \mathbb{R}^3$ with alphabet $M = \{0, 1, 2, 3\}$ and generators The power cone and its dual are given by (see Section 4 in [38]) where Observe that, choosing $m_0 \in K_\alpha$ and $\mu_0^T \in (K_\alpha)^*$, the maps $D_u$, $u = 0, 1, 2, 3$, preserve the power cone acting from the left and preserve its dual acting from the right. Therefore we can find stationary states $\pi \in K^*_\alpha$ and $\tau \in K_\alpha$, and $C_{\min} \subseteq K_\alpha$, $C^*_{\max} \subseteq K^*_\alpha$. Now note that where $x = a^t b^s$, which is dense in $\mathbb{R}_+$ due to the incommensurability condition and Kronecker's Theorem.
Using that $a + b = 1$ and a . Using that any stable cone has to satisfy $C_{\min} \subseteq C \subseteq C_{\max}$ we have that our quasi-realization only has $K_\alpha$ as a stable cone and, by the choice of $\alpha$, it is not semi-algebraic, implying that the quasi-realization cannot have a quantum realization.

Figure 1: A depiction of a general stationary process with finite memory. The probability of a sequence $u_{-k}, \ldots, u_0, \ldots, u_l$ can be computed as the inner product between a left stationary state $\pi$, evolved through a sequence of linear maps $D_{u_{-k}}, \ldots, D_{u_l}$ acting from the right, and a right stationary state $\tau$. The hidden vector space in which $\pi D_{u_{-k}} \cdots D_{u_l}$ lives represents the memory of the process. For quantum hidden Markov models, $\pi$ is a state and $\tau$ is the trace functional in the dual of the state space, while the $D_u$ are CP maps such that $\sum_{u \in M} D_u$ is unital.
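The rule described in the caption, $p(u_1 \ldots u_n) = \pi D_{u_1} \cdots D_{u_n} \tau$, can be sketched for the simplest classical case. Everything in the example below is hypothetical: a two-state hidden Markov chain with transition matrix `T` and emission probabilities `E` is used to build the generators, $\pi$ is its stationary distribution and $\tau$ is the all-ones vector.

```python
from itertools import product


def vec_mat(row, matrix):
    """Row vector times matrix (the maps act from the right on pi)."""
    return [
        sum(row[i] * matrix[i][j] for i in range(len(row)))
        for j in range(len(matrix[0]))
    ]


def word_probability(pi, generators, tau, word):
    """p(word) = pi D_{u_1} ... D_{u_n} tau."""
    row = list(pi)
    for symbol in word:
        row = vec_mat(row, generators[symbol])
    return sum(r * t for r, t in zip(row, tau))


# Hypothetical two-state hidden Markov model.
T = [[0.9, 0.1], [0.2, 0.8]]   # T[i][j]: probability of jumping from state i to state j
E = [[0.7, 0.3], [0.1, 0.9]]   # E[j][u]: probability that state j emits symbol u
generators = {
    u: [[T[i][j] * E[j][u] for j in range(2)] for i in range(2)] for u in (0, 1)
}
pi = [2 / 3, 1 / 3]            # left stationary vector: pi T = pi
tau = [1.0, 1.0]               # right stationary vector: T tau = tau

total = sum(word_probability(pi, generators, tau, w) for w in product((0, 1), repeat=3))
p0 = word_probability(pi, generators, tau, (0,))
print(total, p0)
```

Because $\sum_u D_u = T$ fixes $\tau$ from the left and $\pi$ from the right, the probabilities of all words of a given length add up to one, which the variable `total` confirms for length three.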