Controlling a Random Population

Bertrand et al. introduced a model of parameterised systems, where each agent is represented by a finite state system, and studied the following control problem: for any number of agents, does there exist a controller able to bring all agents to a target state? They showed that the problem is decidable and EXPTIME-complete in the adversarial setting, and posed as an open problem the stochastic setting, where the agent is represented by a Markov decision process. In this paper, we show that the stochastic control problem is decidable. Our solution makes significant use of well quasi orders, of the max-flow min-cut theorem, and of the theory of regular cost functions.


Introduction
The control problem for populations of identical agents. The model we study was introduced in [3] (see also the journal version [4]): a population of agents is controlled uniformly, meaning that the controller applies the same action to every agent. The agents are represented by a finite state system, the same for every agent. The key difficulty is that there is an arbitrarily large number of agents: the control problem is whether for every n ∈ N, there exists a controller able to bring all n agents synchronously to a target state.
The technical contribution of [3,4] is to prove that in the adversarial setting where an opponent chooses the evolution of the agents, the (adversarial) control problem is EXPTIME-complete.
In this paper, we study the stochastic setting, where each agent evolves independently according to a probabilistic distribution, i.e. the finite state system modelling an agent is a Markov decision process. The control problem becomes whether for every n ∈ N, there exists a controller able to bring all n agents synchronously to a target state with probability one.
Our main technical result is that the stochastic control problem is decidable. In the next paragraphs we discuss four motivations for studying this problem: control of biological systems, parameterised verification and control, distributed computing, and automata theory.
Modelling biological systems. The original motivation for studying this model was the control of populations of yeasts [21]. In this application, the concentration of some molecule is monitored through fluorescence level. Controlling the frequency and duration of injections of a sorbitol solution influences the concentration of the target molecule, triggering different chemical reactions which can be modelled by a finite state system. The objective is to control the population to reach a predetermined fluorescence state. As discussed in the conclusions of [3,4], the stochastic semantics is more satisfactory than the adversarial one for representing the behaviours of the chemical reactions, so our decidability result is a step towards a better understanding of the modelling of biological systems as populations of arbitrarily many agents represented by finite state systems.
From parameterised verification to parameterised control. Parameterised verification was introduced in [12]: it is the verification of a system composed of an arbitrary number of identical components. The control problem we study here, introduced in [3,4], is a first step towards parameterised control: the goal is to control a system composed of many identical components in order to ensure a given property. To the best of our knowledge, the contributions of [3,4] are the first results on parameterised control; by extension, we present the first results on parameterised control in a stochastic setting.
Distributed computing. Our model resembles two models introduced for the study of distributed computing. The first and most widely studied is population protocols, introduced in [2]: the agents are modelled by finite state systems and interact by pairs drawn at random. The mode of interaction is the key difference with the model we study here: in a time step, all of our agents perform simultaneously and independently the same action. This brings us closer to broadcast protocols as studied for instance in [8], in which one action involves an arbitrary number of agents. As explained in [3,4], our model can be seen as a subclass of (stochastic) broadcast protocols, but key differences exist in the semantics, making the two bodies of work technically independent.
The focus of the distributed computing community when studying population or broadcast protocols is to construct the most efficient protocols for a given task, such as (prominently) electing a leader. A growing literature from the verification community focusses on checking the correctness of a given protocol against a given specification; we refer to the recent survey [7] for an overview. We concentrate on the control problem, which can then be seen as a first result in the control of distributed systems in a stochastic setting.
Alternative semantics for probabilistic automata. It is very tempting to consider the limit case of infinitely many agents: the parameterised control question becomes the value 1 problem for probabilistic automata, which was proved undecidable in [13], even in very restricted cases ([10]). Hence abstracting continuous distributions by a discrete population of arbitrary size can be seen as an approximation technique for probabilistic automata. Using n agents corresponds to using numerical approximation up to 2^{-n} with random rounding; in this sense the control problem considers arbitrarily fine approximations. The plague of undecidability results on probabilistic automata (see e.g. [9]) is nicely contrasted by our positive result, which is one of the few decidability results on probabilistic automata not making structural assumptions on the underlying graph.
Our results. We prove decidability of the stochastic control problem. The first insight is given by the theory of well quasi orders, which motivates the introduction of a new problem called the sequential flow problem. The first step of our solution is to reduce the stochastic control problem to (many instances of) the sequential flow problem. The second insight comes from the theory of regular cost functions, providing us with a set of tools for addressing the key difficulty of the problem, namely the fact that there are arbitrarily many agents. Our key technical contribution is to show the computability of the sequential flow problem by reducing it to a boundedness question expressed in cost monadic second order logic using the max-flow min-cut theorem.
Related work. The notion of decisive Markov chains was introduced in [1] as a unifying property for studying infinite-state Markov chains with finite-like properties. A typical example of decisive Markov chains is lossy channel systems where tokens can be lost anytime inducing monotonicity properties. Our situation is the exact opposite as we are considering (using the Petri nets terminology) safe Petri nets where the number of tokens along a run is constant. So it is not clear whether the underlying argument in both cases can be unified using decisiveness.
Organisation of the paper. We define the stochastic control problem in Section 2, and the sequential flow problem in Section 3. We construct a reduction from the former to (many instances of) the latter in Section 4, and show the decidability of the sequential flow problem in Section 5.

The stochastic control problem

A Markov decision process (MDP) M is given by a finite set of states Q, a finite set of actions A, and a transition table ρ assigning to each state p and action a a probability distribution ρ(p, a) over Q. The interpretation of the transition table is that from the state p under action a, the probability to transition to q is ρ(p, a)(q). The transition relation ∆ is defined by ∆ = {(p, a, q) ∈ Q × A × Q : ρ(p, a)(q) > 0}. We also use ∆_a given by {(p, q) ∈ Q × Q : (p, a, q) ∈ ∆}. We refer to [17] for the usual notions related to MDPs; it turns out that very little probability theory will be needed in this paper, so we restrict ourselves to mentioning only the relevant objects. In an MDP M, a strategy is a function σ : Q → A; note that we consider only pure and positional strategies, as they will be sufficient for our purposes.
Given a source s ∈ Q and a target t ∈ Q, we say that the strategy σ almost surely reaches t if the probability that a path starting from s and consistent with σ eventually leads to t is 1. As we shall recall in Section 4, whether there exists a strategy ensuring to reach t almost surely from s (the almost sure reachability problem for MDPs) can be reduced to solving a two player Büchi game, and in particular does not depend upon the exact probabilities. In other words, the only relevant information for each (p, a, q) ∈ Q × A × Q is whether ρ(p, a)(q) > 0 or not. Since the same will be true for the stochastic control problem we study in this paper, in our examples we do not specify the exact probabilities, and an edge from p to q labelled a means that ρ(p, a)(q) > 0.
Let us now fix an MDP M and consider a population of n tokens (we use tokens to represent the agents). Each token evolves in an independent copy of the MDP M. The controller acts through a strategy σ : Q^n → A, meaning that given the state each of the n tokens is in, the controller chooses one action to be performed by all tokens independently. Formally, we are considering the product MDP M^n whose set of states is Q^n, set of actions is A, and transition table is ρ^n(u, a)(v) = ∏_{i=1}^{n} ρ(u_i, a)(v_i), where u, v ∈ Q^n and u_i, v_i are the i-th components of u and v.
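To make the product construction concrete, here is a minimal Python sketch (the dictionary encoding of ρ is ours, not from the paper) computing ρ^n(u, a)(v) as a product of per-token probabilities:

```python
from math import prod

def product_transition(rho, u, a, v):
    """rho^n(u, a)(v) = product over tokens i of rho(u[i], a)(v[i]).

    rho: dict mapping (state, action, state) -> probability, i.e.
         rho[(p, a, q)] = rho(p, a)(q); absent keys mean probability 0.
    u, v: n-tuples of states, one component per token.
    """
    return prod(rho.get((p, a, q), 0.0) for p, q in zip(u, v))

# Toy MDP: under action 'a', a token on s stays or moves to t with probability 1/2.
rho = {('s', 'a', 's'): 0.5, ('s', 'a', 't'): 0.5, ('t', 'a', 't'): 1.0}
product_transition(rho, ('s', 's'), 'a', ('t', 't'))  # 0.25
```

Since the tokens move independently, the probability of a joint move is simply the product of the individual ones.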
Let s, t ∈ Q be the source and target states; we write s^n and t^n for the constant n-tuples where all components are s and t. For a fixed value of n, whether there exists a strategy ensuring to reach t^n almost surely from s^n can be reduced to solving a two player Büchi game in the same way as above for a single MDP, replacing M by M^n. The stochastic control problem asks whether this is true for arbitrary values of n:

Problem 1 (Stochastic control problem). The inputs are an MDP M, a source state s ∈ Q and a target state t ∈ Q. The question is whether for all n ∈ N, there exists a strategy ensuring to reach t^n almost surely from s^n.
Our main result is the following. Theorem 1. The stochastic control problem is decidable.
The fact that the problem is co-recursively enumerable is easy to see: if the answer is "no", there exists n ∈ N such that there exists no strategy ensuring to reach t^n almost surely from s^n. Enumerating the values of n and solving the almost sure reachability problem for M^n eventually finds this out. However, it is not clear whether one can place an upper bound on such a witness n, which would yield a simple (yet inefficient!) algorithm. As a corollary of our analysis we can indeed derive such an upper bound, but it is non-elementary in the size of the MDP.
In the remainder of this section we present a few interesting examples.
Example 1 Let us consider the MDP represented in Figure 1. We show that for this MDP, for any n ∈ N, the controller has an almost sure strategy to reach t^n from s^n. Starting with n tokens on s, we iterate the following strategy:
- Repeatedly play action a until all tokens are in q;
- Play action b.
The first step is eventually successful with probability one, since at each iteration there is a positive probability that the number of tokens in state q increases. In the second step, with non zero probability at least one token goes to t, while the rest go back to s. It follows that each iteration of this strategy increases with non zero probability the number of tokens in t. Hence, all tokens are eventually transferred to t almost surely.

Example 2 We now consider the MDP represented in Figure 2. By convention, if from a state some action does not have any outgoing transition (for instance the action u from s), then it goes to the sink state ⊥. We show that there exists a controller ensuring to transfer seven tokens from s to t, but that the same does not hold for eight tokens. For the first assertion, we present the following strategy:
- Play a. One of the states q_1^{i_1} for i_1 ∈ {u, d} receives at least 4 tokens.
- Play i_1 ∈ {u, d}. At least 4 tokens go to t while at most 3 go to q_1.
- Play a. One of the states q_2^{i_2} for i_2 ∈ {u, d} receives at least 2 tokens.
- Play i_2 ∈ {u, d}. At least 2 tokens go to t while at most 1 token goes to q_2.
- Play a. The token (if any) goes to q_3^{i_3} for some i_3 ∈ {u, d}.
- Play i_3 ∈ {u, d}. The remaining token (if any) goes to t.

Now assume that there are 8 tokens or more on s. The only choices for a strategy are to play u or d on the second, fourth, and sixth move. First, with non zero probability at least 4 tokens are in each of q_1^i for i ∈ {u, d}. Then, whatever the choice of action i ∈ {u, d}, there are at least 4 tokens in q_1 after the next step. Proceeding likewise, there are at least 2 tokens in q_2 with non zero probability two steps later. Then again two steps later, at least 1 token falls in the sink with non zero probability. Generalising this example shows that if the answer to the stochastic control problem is "no", the smallest number of tokens n for which there exists no almost sure strategy for reaching t^n from s^n may be exponential in |Q|. This can be further extended to show a doubly exponential in |Q| lower bound, as done in [3,4]; the example produced there holds for both the adversarial and the stochastic setting. Interestingly, for the adversarial setting this doubly exponential lower bound is tight. Our proof for the stochastic setting yields a non-elementary bound, leaving a very large gap.
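The counting argument above can be checked with a two-line computation (a toy sketch of the halving bound, not code from the paper): with non zero probability the tokens split evenly at each level, so whichever of u, d the controller plays, at least ⌊n/2⌋ tokens stay behind.

```python
def worst_remaining(n, levels=3):
    """Tokens the adversary can keep away from t after the given number of
    levels, assuming an even split occurs (which has non zero probability)."""
    for _ in range(levels):
        n = n // 2  # at least floor(n/2) tokens miss the exit the controller picks
    return n

worst_remaining(7)  # 0: consistent with the 7-token strategy above
worst_remaining(8)  # 1: with 8 tokens, some token reaches the sink with non zero probability
```

Generalising to k levels, 2^k tokens suffice to defeat the controller, giving the exponential lower bound mentioned above.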
Example 3 We consider the MDP represented in Figure 3. For any n ∈ N, there exists a strategy almost surely reaching t^n from s^n. However, this strategy has to pass tokens one by one through q_1. We iterate the following strategy:
- Repeatedly play action a until exactly 1 token is in q_1.
- Play action b. The token goes to q_i for some i ∈ {l, r}.
- Play action i ∈ {l, r}, which moves the token to t.

Note that the first step may take a very long time (the expected number of a's to be played until this happens is exponential in the number of tokens), but it is eventually successful with probability one. This very slow strategy is necessary: if q_1 contains at least two tokens, then action b should not be played: with non zero probability, at least one token ends up in each of q_l, q_r, so at the next step some token ends up in ⊥. It follows that any strategy almost surely reaching t^n has to be able to detect the presence of at most 1 token in q_1. This is a key example for understanding the difficulty of the stochastic control problem.

Fig. 3. The controller can synchronise any number of tokens almost surely on the target state t, but they have to go one by one.

The sequential flow problem
We let Q be a finite set of states. We call configuration an element of N^Q and flow an element of N^{Q×Q}. A flow f induces two configurations pre(f) and post(f) defined by pre(f)(p) = Σ_{q∈Q} f(p, q) and post(f)(q) = Σ_{p∈Q} f(p, q). Given c, c′ two configurations and f a flow, we say that c goes to c′ using f, and write c →_f c′, if pre(f) = c and post(f) = c′. We write c ⇝_f c′ for a flow word f = f_1 ⋯ f_ℓ if there exists a sequence of configurations c = c_0, c_1, . . . , c_ℓ = c′ such that c_{i−1} →_{f_i} c_i for all i ∈ {1, . . . , ℓ}. In this case, we say that c goes to c′ using the flow word f.
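These definitions transcribe directly into Python (the Counter-based encoding is ours):

```python
from collections import Counter

def pre(f):
    """pre(f)(p) = sum over q of f(p, q); a flow f is a dict (p, q) -> N."""
    c = Counter()
    for (p, q), n in f.items():
        c[p] += n
    return c

def post(f):
    """post(f)(q) = sum over p of f(p, q)."""
    c = Counter()
    for (p, q), n in f.items():
        c[q] += n
    return c

def step(c, f):
    """Return c' if the configuration c goes to c' using f, else None."""
    if +pre(f) != +Counter(c):  # unary '+' drops zero entries before comparing
        return None
    return post(f)

# 3 tokens on s; the flow keeps one token on s and sends two to t.
f = {('s', 's'): 1, ('s', 't'): 2}
step({'s': 3}, f)  # Counter({'t': 2, 's': 1})
```

A flow word is then just a list of such dictionaries, applied with `step` one after the other.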
We now recall some classical definitions related to well quasi orders ([15,16], see [19] for an exposition of recent results). Let (E, ≤) be a quasi ordered set (i.e. ≤ is reflexive and transitive); it is a well quasi ordered set (WQO) if any infinite sequence contains an increasing pair. We say that S ⊆ E is downward closed if for any x ∈ S, if y ≤ x then y ∈ S. An ideal is a non-empty downward closed set I ⊆ E such that for all x, y ∈ I, there exists some z ∈ I satisfying both x ≤ z and y ≤ z.

Lemma 1.
- Any infinite sequence of decreasing downward closed sets in a WQO is eventually constant.
- A subset is downward closed if and only if it is a finite union of incomparable ideals. We call it its decomposition into ideals (or simply, its decomposition), which is unique (up to permutation).
- An ideal is included in a downward closed set if and only if it is included in one of the ideals of its decomposition.
We equip the set of configurations N^Q and the set of flows N^{Q×Q} with the quasi order ≤ defined componentwise, yielding thanks to Dickson's Lemma [6] two WQOs. In these WQOs, the ideals are exactly the sets of the form a↓ = {x : x ≤ a} for some a ∈ (N ∪ {ω})^X (in which ω is larger than all integers).
We represent downward closed sets of configurations and flows using their decomposition into finitely many ideals of the form a↓ for a ∈ (N ∪ {ω})^Q or a ∈ (N ∪ {ω})^{Q×Q}.
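These representations are easy to manipulate: membership in an ideal, and inclusion of an ideal in a downward closed set (the last item of Lemma 1), reduce to componentwise comparisons. A small Python sketch, with ω encoded as infinity (encoding ours):

```python
OMEGA = float('inf')  # stands for ω, larger than all integers

def leq(a, b):
    """Componentwise order on (N ∪ {ω})^X; a, b are dicts with the same keys."""
    return all(a[x] <= b[x] for x in a)

def in_ideal(c, a):
    """Membership c ∈ a↓, i.e. c ≤ a componentwise."""
    return leq(c, a)

def ideal_included(a, decomposition):
    """a↓ is included in a downward closed set iff a↓ ⊆ b↓ for some ideal b↓
    of its decomposition (Lemma 1), and a↓ ⊆ b↓ iff a ≤ b componentwise."""
    return any(leq(a, b) for b in decomposition)

a = {'s': OMEGA, 't': 0}
in_ideal({'s': 7, 't': 0}, a)                                    # True
ideal_included(a, [{'s': OMEGA, 't': 1}, {'s': 2, 't': OMEGA}])  # True
```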
The sequential flow problem takes as input a downward closed set of flows Flows ⊆ N^{Q×Q} and a downward closed set of configurations F ⊆ N^Q, and asks to compute the downward closed set Pre*(Flows, F) of configurations from which one may reach F using only flows from Flows.

Reduction of the stochastic control problem to the sequential flow problem
Let us consider an MDP M and a target t ∈ Q. We first recall a folklore result reducing the almost sure reachability question for MDPs to solving a two player Büchi game (we refer to [14] for the definitions and notations of Büchi games). The Büchi game is played between Eve and Adam as follows. From a state p:
1. Eve chooses an action a and a transition (p, q) ∈ ∆_a;
2. Adam can either choose to agree, and the game continues from q, or interrupt and choose another transition (p, q′) ∈ ∆_a, and the game continues from q′.
The Büchi objective is satisfied (meaning Eve wins) if either the target state t is reached or Adam interrupts infinitely many times.
Lemma 3. There exists a strategy ensuring almost surely to reach t from s if and only if Eve has a winning strategy from s in the above Büchi game.
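Since by Lemma 3 only the supports matter, almost-sure reachability can also be computed by a purely graph-based fixpoint. The following Python sketch (a standard construction, ours rather than the paper's) computes the set of states from which such a strategy exists:

```python
def almost_sure_reach(states, actions, delta, target):
    """Almost-sure reachability in an MDP given by supports only.
    delta: dict (p, a) -> set of q with rho(p, a)(q) > 0 (absent = unavailable).
    Greatest fixpoint: shrink W to the states that can reach target inside W
    using only actions whose successors all stay inside W.
    """
    W = set(states)
    while True:
        # actions safe at p: all successors remain in W
        safe = {p: [a for a in actions
                    if delta.get((p, a)) and delta[(p, a)] <= W]
                for p in W}
        # states that can reach target inside W via safe actions
        reach = {target} & W
        changed = True
        while changed:
            changed = False
            for p in W - reach:
                if any(delta[(p, a)] & reach for a in safe[p]):
                    reach.add(p)
                    changed = True
        if reach == W:
            return W
        W = reach

# Figure-1-like MDP: a keeps a token on s or moves it to q; b from q sends it
# to t or back to s; t is absorbing.
delta = {('s', 'a'): {'s', 'q'}, ('q', 'a'): {'q'},
         ('q', 'b'): {'s', 't'}, ('t', 'a'): {'t'}, ('t', 'b'): {'t'}}
almost_sure_reach({'s', 'q', 't'}, {'a', 'b'}, delta, 't')  # {'s', 'q', 't'}
```

From any state of the returned set, playing a safe action that makes progress towards the target yields progress with probability bounded below, so the target is reached almost surely.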
We now explain how this reduction can be extended to the stochastic control problem. Let us consider an MDP M and a target t ∈ Q. We now define an infinite Büchi game G_M. The set of vertices is the set of configurations N^Q. For a flow f, we write supp(f) = {(p, q) ∈ Q^2 : f(p, q) > 0}. The game is played as follows from a configuration c:
1. Eve chooses an action a and a flow f with supp(f) ⊆ ∆_a and pre(f) = c;
2. Adam can either choose to agree, and the game continues from post(f), or interrupt and choose another flow f′ with supp(f′) ⊆ ∆_a and pre(f′) = c, and the game continues from post(f′).

Note that Eve choosing a flow f is equivalent to choosing for each token a transition (p, q) ∈ ∆_a, inducing the configuration c′, and similarly for Adam should he decide to interrupt. Eve wins if either all tokens are in the target state, or if Adam interrupts infinitely many times.
Note that although the game is infinite, it is actually a disjoint union of finite games. Indeed, along a play the number of tokens is fixed, so each play stays within the finite set of configurations with n tokens, for some n ∈ N.
Lemma 4. Let c be a configuration with n tokens in total; the following are equivalent:
- There exists a strategy almost surely reaching t^n from c,
- Eve has a winning strategy in the Büchi game G_M starting from c.

Lemma 4 follows from applying Lemma 3 on the product MDP M^n.
We also consider, for each i ∈ N, the game G^{(i)}_M, which is played like G_M except that Eve wins if either all tokens are in the target state or Adam interrupts more than i times.

Lemma 5. There exists i_0 ∈ N such that the winning region for Eve in G^{(i_0)}_M coincides with the winning region for Eve in G_M.

Proof: Let X^{(i)} ⊆ N^Q be the winning region for Eve in G^{(i)}_M. We first argue that X = ∩_i X^{(i)} is the winning region in G_M. It is clear that X is contained in the winning region: if Eve has a strategy to ensure that either all tokens are in the target state, or that Adam interrupts infinitely many times, then in particular this is true for Adam interrupting more than i times for any i. The converse inclusion holds because G_M is a disjoint union of finite Büchi games. Indeed, in a finite Büchi game, since Adam can restrict himself to playing a memoryless winning strategy, if Eve can ensure that he interrupts a certain number of times (larger than the size of the game), then by a simple pumping argument this implies that Adam will interrupt infinitely many times.
To conclude, we note that each X^{(i)} is downward closed: indeed, a winning strategy from a configuration c can be used from a configuration c′ where there are fewer tokens in each state. It follows that (X^{(i)})_{i≥0} is a decreasing sequence of downward closed sets in N^Q, hence it stabilises thanks to Lemma 1, i.e. there exists i_0 ∈ N such that X^{(i_0)} = ∩_i X^{(i)}, which concludes.
Note that Lemma 4 and Lemma 5 substantiate the claims made in Section 2: pure positional strategies are enough and the answer to the stochastic control problem does not depend upon the exact probabilities in the MDP. Indeed, the construction of the Büchi games does not depend on them, and the answer to the stochastic control problem is equivalent to determining whether Eve has a winning strategy in each of them.
We are now fully equipped to show that a solution to the sequential flow problem yields the decidability of the stochastic control problem.
Let F be the set of configurations for which all tokens are in state t, and recall that X^{(i)} ⊆ N^Q denotes the winning region for Eve in the game G^{(i)}_M. Let Flows_0 be the set of flows f ∈ N^{Q×Q} such that supp(f) ⊆ ∆_a for some action a ∈ A; then X^{(0)} is Pre*(Flows_0, F). We generalise this by setting Flows_i for all i > 0 to be the set of flows f ∈ N^{Q×Q} such that for some action a ∈ A, supp(f) ⊆ ∆_a, and for all f′ with pre(f′) = pre(f) and supp(f′) ⊆ ∆_a, we have post(f′) ∈ X^{(i−1)}.
Equivalently, this is the set of flows for which, when played in the game G M by Eve, Adam cannot use an interrupt move and force the configuration outside of X (i−1) .
We now claim that X^{(i)} = Pre*(Flows_i, F) for all i ≥ 0. We note that this means that for each i, computing X^{(i)} reduces to solving one instance of the sequential flow problem. This induces an algorithm for solving the stochastic control problem: compute the sequence (X^{(i)})_{i≥0} until it stabilises, which is ensured by Lemma 5 and yields the winning region of G_M. The answer to the stochastic control problem is then whether the initial configuration where all tokens are in s belongs to the winning region of G_M.
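The overall loop can be sketched as follows; here `refine` abstracts one round of "compute Flows_i from X^(i−1), then call the sequential flow procedure", which is assumed rather than implemented, and the instance is a toy finite one, so stabilisation (guaranteed in general by Lemma 5) is immediate to observe:

```python
def iterate_until_stable(refine, X0):
    """Compute the decreasing sequence X^(0), X^(1), ... and return its limit."""
    X = X0
    while True:
        Y = refine(X)
        if Y == X:
            return X
        X = Y

# Toy refinement on subsets of {0, ..., 5}: each round drops the largest
# remaining non-zero element, mimicking the shrinking winning regions.
refine = lambda X: {x for x in X if x == 0 or x < max(X)}
iterate_until_stable(refine, set(range(6)))  # {0}
```

In the actual algorithm each `refine` call is one instance of the sequential flow problem, and termination is guaranteed by the WQO argument rather than by finiteness.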
Let us prove the claim by induction on i. Let c be a configuration in Pre*(Flows_i, F). This means that there exists a flow word f = f_1 ⋯ f_ℓ such that f_k ∈ Flows_i for all k, and c ⇝_f c_ℓ ∈ F. Expanding the definition, there exist c_0 = c, . . . , c_ℓ such that c_{k−1} →_{f_k} c_k for all k.
Let us now describe a strategy for Eve in G^{(i)}_M starting from c. As long as Adam agrees, Eve successively chooses the sequence of flows f_1, f_2, . . . and the corresponding configurations c_1, c_2, . . . . If Adam never interrupts, then the game reaches the configuration c_ℓ ∈ F, and Eve wins. Otherwise, as soon as Adam interrupts, by definition of Flows_i, we reach a configuration d ∈ X^{(i−1)}. By induction hypothesis, Eve has a strategy which ensures from d to either reach F or that Adam interrupts more than i − 1 times. In the latter case, adding the interrupt move leading to d yields more than i interrupts, so this is a winning strategy for Eve in G^{(i)}_M. Conversely, let c ∈ X^{(i)} and consider a winning strategy for Eve from c. As long as Adam agrees, this strategy produces a sequence of flows f_1, f_2, . . . , f_ℓ leading to F; moreover, for each k, should Adam interrupt, the resulting configuration belongs to X^{(i−1)}, implying that f_k ∈ Flows_i. Thus f = f_1 f_2 ⋯ f_ℓ is a witness that c ∈ Pre*(Flows_i, F).

Computability of the sequential flow problem
Let Q be a finite set of states, Flows ⊆ N^{Q×Q} a downward closed set of flows and F ⊆ N^Q a downward closed set of configurations. The sequential flow problem is to compute the downward closed set Pre*(Flows, F) = {c ∈ N^Q : c ⇝_f c′ for some c′ ∈ F and flow word f ∈ Flows*}, i.e. the configurations from which one may reach F using only flows from Flows.
The following classical result of [22] allows us to further reduce our problem. Lemma 6. The task of computing a downward closed set can be reduced to the task of deciding whether a given ideal is included in a downward closed set.
Thanks to Lemma 6, it is sufficient for solving the sequential flow problem to establish the following result. Lemma 7. Let I be an ideal of the form a↓ for a ∈ (N ∪ {ω})^Q, and Flows ⊆ N^{Q×Q} be a downward closed set of flows. It is decidable whether F can be reached from all configurations of I using only flows from Flows.
We call a vector a ∈ (N ∪ {ω})^{Q×Q} a capacity. A capacity word is a finite sequence of capacities. For two capacity words w, w′ of the same length, we write w ≤ w′ to mean that w_i ≤ w′_i for each i. Since flows are particular cases of capacities, we can compare flows with capacities in the same way.
Before proving Lemma 7 let us give an example and some notations. Given a state q, we write q ∈ N^Q for the vector which has value 1 on the q component and 0 elsewhere. More generally we let αq for α ∈ N ∪ {ω} denote the vector with value α on the q component and 0 elsewhere. We use similar notations for flows. For instance, ωq_1 + q_2 has value ω in the q_1 component, 1 in the q_2 component, and 0 elsewhere.
We write a[ω ← n] for the configuration obtained from a by replacing all ωs by n.
The key idea for solving the sequential flow problem is to rephrase it using regular cost functions (a set of tools for solving boundedness questions). Indeed, whether F can be reached from all configurations of I = a↓ using only flows from Flows can be equivalently phrased as a boundedness question, as follows: does there exist a bound on the values of n ∈ N such that a[ω ← n] ⇝_f c for some c ∈ F and f ∈ Flows*?
We show that this boundedness question can be formulated as a boundedness question for a formula of cost monadic logic, a formalism that we introduce now. We assume that the reader is familiar with monadic second order logic (MSO) over finite words, and refer to [20] for the definitions. The syntax of cost monadic logic (cost MSO for short) extends MSO with the construct |X| ≤ N, where X is a second order variable and N is a bounding variable. The semantics is defined as usual: w, n |= ϕ for a word w ∈ A*, with n ∈ N specifying the bound N. We assume that there is at most one bounding variable, and that the construct |X| ≤ N appears positively, i.e. under an even number of negations. This ensures that the larger N, the more true the formula is: if w, n |= ϕ, then w, n′ |= ϕ for all n′ ≥ n. The semantics of a formula ϕ of cost MSO induces a function [[ϕ]] : A* → N ∪ {∞} defined by [[ϕ]](w) = inf{n ∈ N : w, n |= ϕ} (with inf ∅ = ∞). The boundedness problem for cost monadic logic is the following problem: given a cost MSO formula ϕ over A*, is the function [[ϕ]] bounded, i.e.: ∃n ∈ N, ∀w ∈ A*, w, n |= ϕ?
The decidability of the boundedness problem is a central result in the theory of regular cost functions ([5]). Since in the theory of regular cost functions we are only interested in whether functions are bounded or not, we consider functions "up to boundedness properties". Concretely, this means that a cost function is an equivalence class of functions A* → N ∪ {∞}, with the equivalence being f ≈ g if there exists α : N → N such that f(w) is finite if and only if g(w) is finite, and in this case, f(w) ≤ α(g(w)) and g(w) ≤ α(f(w)). This is equivalent to stating that for all X ⊆ A*, f is bounded over X if and only if g is bounded over X.
Let us now establish Lemma 7.
Proof: Let T = {q ∈ Q : a(q) = ω}. Note that for n sufficiently large, we have a[ω ← n]↓ = I ∩ {0, 1, . . . , n}^Q. We let C ⊆ (N ∪ {ω})^{Q×Q} be the decomposition of Flows into ideals, that is, C is the minimal finite set such that Flows = ∪_{b∈C} b↓. We let k denote the largest finite value that appears in the definition of C, that is, k = max{b(q, q′) : b ∈ C, q, q′ ∈ Q, b(q, q′) ≠ ω}. Let us define the function Φ mapping a capacity word w to the largest n ∈ N such that a[ω ← n] ⇝_f c for some c ∈ F and flow word f ≤ w with all its flows in Flows. By definition Φ is unbounded if and only if F can be reached from all configurations of I. Since boundedness of cost MSO is decidable, it suffices to construct a formula in cost monadic logic for Φ to obtain the decidability of our problem. Our approach will be to additively decompose the capacity word w into a finitary part w^{(fin)} (which is handled using a regular language), and several unbounded parts w^{(s)} for each s ∈ T. The unbounded parts require a more careful analysis which notably goes through the use of the max-flow min-cut theorem. Note that a[ω ← n] decomposes as the sum of its finite part a_fin = a[ω ← 0] and Σ_{s∈T} n·s. Since flows are additive, a flow word f ≤ w from a[ω ← n] to F decomposes into a flow word from a_fin to F together with, for each s ∈ T, a flow word from n·s to the target, carried by capacity words w^{(fin)} and (w^{(s)})_{s∈T} summing to at most w. For s ∈ T, we let Ψ^{(s)}(w) denote the largest n such that some flow word f ≤ w transfers n tokens from s to the target. We also define Ψ^{(fin)} ⊆ ({0, . . . , k, ω}^{Q×Q})* to be the language of capacity words w^{(fin)} such that there exists a flow word f ≤ w^{(fin)} with a_fin ⇝_f c for some c ∈ F. Note that Ψ^{(fin)} is a regular language, since it is recognised by a finite automaton with states in {0, 1, . . . , k|Q|}^Q that may update the current bounded configuration only with flows smaller than the current letter of w^{(fin)}.
It follows that Φ can be expressed, up to boundedness properties, from the regular language Ψ^{(fin)} and the functions Ψ^{(s)} for s ∈ T. Hence, it is sufficient to prove that for each s ∈ T, Ψ^{(s)} is definable in cost MSO.
Let us fix s and a capacity word w ∈ ({0, . . . , k, ω}^{Q×Q})* of length |w| = ℓ. Consider the finite graph G with vertex set Q × {0, 1, . . . , ℓ} and, for all i ≥ 1, an edge from (q, i − 1) to (q′, i) labelled by w_i(q, q′). Then Ψ^{(s)}(w) is the maximal flow from (s, 0) to (t, ℓ) in G. We recall that a cut in a graph with distinguished source s and target t is a set of edges such that removing them disconnects s and t. The cost of a cut is the sum of the weights of its edges. The max-flow min-cut theorem states that the maximal flow in a graph is exactly the minimal cost of a cut ([11]).
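As an illustration of the theorem the construction relies on, here is a self-contained Edmonds-Karp sketch (ours, not the paper's); the value it returns is the maximal flow and hence, by the max-flow min-cut theorem, also the minimal cost of a cut:

```python
from collections import deque

def max_flow(vertices, edges, s, t):
    """Edmonds-Karp: maximal s-t flow. edges: dict (u, v) -> capacity."""
    cap = dict(edges)
    adj = {u: set() for u in vertices}
    for (u, v) in edges:
        adj[u].add(v)
        adj[v].add(u)
        cap.setdefault((v, u), 0)  # residual arc
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # recover the path, push the bottleneck along it
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[e] for e in path)
        for (u, v) in path:
            cap[(u, v)] -= bottleneck
            cap[(v, u)] += bottleneck
        flow += bottleneck

# Layered graph in the style of G: two disjoint s-t paths with capacities 2 and
# 3; the max flow is 5, matching the min cut {(0, 1), (0, 2)} of cost 2 + 3.
edges = {(0, 1): 2, (1, 3): 2, (0, 2): 3, (2, 3): 3}
max_flow(range(4), edges, 0, 3)  # 5
```

Capacities ω can be encoded as a number larger than any cut that avoids ω-edges, which is exactly how the formula below sidesteps infinite capacities.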
We now define a cost MSO formula Ψ̃^{(s)} which is equivalent (in terms of cost functions) to the minimal cost of a cut in the previous graph G, and thus to Ψ^{(s)}. In the following formula, X = (X_{q,q′})_{q,q′∈Q} represents a cut in the graph: i ∈ X_{q,q′} means that the edge ((q, i − 1), (q′, i)) belongs to the cut. Likewise, P = (P_{q,q′})_{q,q′∈Q} represents a path in the graph: i ∈ P_{q,q′} means that the path uses the edge ((q, i − 1), (q′, i)). Let Ψ̃^{(s)}(w) be defined by

inf { n ∈ N : ∃X ( ⋀_{q,q′} n ≥ |X_{q,q′}| ) ∧ ( ∀i, ⋀_{q,q′} (i ∈ X_{q,q′} ⟹ w_i(q, q′) < ω) ) ∧ Disc_{s,t}(X, w) },

where Disc_{s,t}(X, w) expresses that X disconnects (s, 0) and (t, ℓ) in G. It is defined by

∀P, [ ( ∀i, ⋀_{q,q′} (i ∈ P_{q,q′} ⟹ w_i(q, q′) > 0) ) ∧ ( ⋁_q 1 ∈ P_{s,q} ) ∧ ( ⋁_q ℓ ∈ P_{q,t} ) ∧ ( ∀i ≥ 2, ⋀_{q,q′} (i ∈ P_{q,q′} ⟹ ⋁_{q″} i − 1 ∈ P_{q″,q}) ) ] ⟹ ( ∃i, ⋁_{q,q′} (i ∈ X_{q,q′} ∧ i ∈ P_{q,q′}) ).
Now Ψ̃^{(s)}(w) does not exactly define the minimal total weight Φ^{(s)}(w) of a cut, but rather the minimal value over all cuts of the maximum over (q, q′) ∈ Q^2 of how many cut edges are of the form ((q, i − 1), (q′, i)). This is good enough for our purposes since these two values are related by Ψ̃^{(s)}(w) ≤ Φ^{(s)}(w) ≤ k|Q|^2 · Ψ̃^{(s)}(w), implying that the functions Ψ̃^{(s)} and Φ^{(s)} define the same cost function. In particular, Φ^{(s)} is definable in cost MSO.

Conclusions
We showed the decidability of the stochastic control problem. Our approach uses well quasi orders and the sequential flow problem, which is then solved using the theory of regular cost functions.
Together with the original result of [3,4] in the adversarial setting, our result contributes to the theoretical foundations of parameterised control. We return to the first application of this model, the control of biological systems. As we discussed, the stochastic setting is perhaps more satisfactory than the adversarial one, although, as we saw, very complicated behaviours emerge in the stochastic setting involving single agents, which are arguably not pertinent for modelling biological systems.
We thus pose two open questions. The first is to settle the complexity status of the stochastic control problem. Very recently, [18] proved the EXPTIME-hardness of the problem, which is interesting because the underlying phenomena involved in this hardness result are specific to the stochastic setting (and do not apply to the adversarial setting). Our algorithm does not yield even elementary upper bounds, leaving a very large complexity gap. The second question is towards more accurately modelling biological systems: can we refine the stochastic control problem by taking into account the synchronising time of the controller, and restrict it to reasonable bounds?