Integral feedback in synthetic biology: negative-equilibrium catastrophe

A central goal of synthetic biology is the design of molecular controllers that can manipulate the dynamics of intracellular networks in a stable and accurate manner. To address the fact that detailed knowledge about intracellular networks is unavailable, integral-feedback controllers (IFCs) have been put forward for controlling molecular abundances. These controllers can maintain accuracy in spite of the uncertainties in the controlled networks. However, this desirable feature is achieved only if stability is also maintained. In this paper, we show that molecular IFCs can suffer from a hazardous instability called negative-equilibrium catastrophe (NEC), whereby all nonnegative equilibria vanish under the action of the controllers, and some of the molecular abundances blow up. We show that unimolecular IFCs do not exist due to a NEC. We then derive a family of bimolecular IFCs that are safeguarded against NECs when uncertain unimolecular networks, with any number of molecular species, are controlled. However, when IFCs are applied on uncertain bimolecular (and hence most intracellular) networks, we show that preventing NECs generally becomes an intractable problem as the number of interacting molecular species increases. NECs therefore place a fundamental limit to design and control of molecular networks.

Schematic representation of biochemical control. A black-box input network R α is displayed, consisting of unknown biochemical interactions shown in grey, where X 1 → X 2 (respectively, X 1 X 2 ) indicates that species X 1 influences species X 2 positively (respectively, negatively). The particular input network contains one interfacing species X 1 , shown as a green hexagon, that can be interfaced with a controller. The rest of the input species, that cannot be interfaced with a controller, are called residual species; one of the residual species, X 2 , is highlighted as a yellow triangle, while some of the other ones are displayed in grey. A controller network R β,γ = R β ∪ R γ is also shown, consisting of controlling species Y 1 and Y 2 shown as a red circle and blue square, respectively, whose predefined biochemical interactions are shown in black. The controller consists of the core R β = R β (Y 1 , Y 2 ), specifying how the controlling species interact among themselves, and the interface R γ = R γ (X 1 , Y 1 , Y 2 ), specifying how the controlling species Y 1 and Y 2 interact with the species X 1 . The composite network R α,β,γ = R α ∪ R β,γ is called an output network. The particular controller R β,γ displayed corresponds to the network (8), with i = j = 1, from Sect. 3 (Color figure online) (N) Nonlinearity. Intracellular networks are nonlinear (bimolecular), i.e. they include reactions involving two reacting molecules. (D) Dimensionality. Intracellular networks are higher-dimensional, i.e. they contain a large number of coupled molecular species. (U) Uncertainty. The experimental information about the structure, rate coefficients and initial conditions of intracellular networks is uncertain (incomplete).
When embedded into an input network satisfying properties (N), (D) and (U), an ideal molecular controller network would ensure that the resulting output network autonomously traces a predefined dynamics in a stable and accurate manner over a desired time-interval. Controllers that maintain accuracy in spite of suitable uncertainties are said to achieve robust adaptation (homeostasis)-a fundamental design principle of living systems [7][8][9][10][11][12]. Control can be sought over deterministic dynamics when all of the molecular species are in higher-abundance [13,15], or over stochastic dynamics when some species are present at lower copy-numbers [16][17][18]. See also Fig. 1, and Appendices A and B.
In context of electro-mechanical systems, accuracy robust to some uncertainties can be achieved via so-called integral-feedback controllers (IFCs) [19]. Loosely speaking, IFCs dynamically calculate a time-integral of a difference (error) between the target and actual values of the controlled variable. The error is then used to decrease (respectively, increase) the controlled variable when it deviates above (respectively, below) its target value via a negative-feedback loop. However, IFCs implementable with electro-mechanical systems are not necessarily implementable with biochemical reactions [20]. Central to this problem is the fact that the error takes both positive and negative values and, therefore, cannot be directly represented as a nonnegative molecular abundance. In this context, a non-biochemical IFC has been mapped to a bimolecular one in [21], which has been adapted in [22] and called the "antithetic integral-feedback controller" (AIFC).
Performance of the AIFC has been largely studied when unimolecular and/or lower-dimensional input networks are controlled [22][23][24][25]; in contrast, intracellular networks are generally bimolecular and higher-dimensional (challenges (N) and (D) stated above). For example, authors from [22] analyze performance of the AIFC in context of controlling average copy-numbers of intracellular species at the stochastic level. In this setting, in [22,Theorem 2], the authors specify a class of unimolecular input networks that can be controlled with the AIFC; in particular, to ensure stability, these input networks have to satisfy an algebraic constraint given as [22,Eq. 7]. This technical condition cannot generally be guaranteed to hold as it not only depends on the rate coefficients of the controller, but also on the uncertain rate coefficients of a given input network. More precisely, affine input networks, i.e. unimolecular input networks that contain one or more basal productions (zero-order reactions), can violate condition [22,Eq. 7]. In contrast, linear input networks, i.e. unimolecular networks with no basal production, always satisfy this condition. To showcase the performance of the AIFC, the authors from [22] put forward a gene-expression system as the input network, given by [22,Eq. 9], and demonstrate that the AIFC can arbitrarily control the average protein copy-number, and mitigate the uncertainties in the input rate coefficients (challenge (U) stated above). However, this unconditional success arises because basal transcription is not included in the linear gene-expression input network [22,Eq. 9], which ensures that condition [22,Eq. 7] always holds. A similar particular choice of an input network without basal production is put forward in [23], where the AIFC is experimentally implemented. The AIFC has also been analyzed in context of controlling species concentrations at the deterministic level in [24,25]; however, the results are derived only for a restricted class of linear input networks that, due to lacking basal production, unconditionally satisfy [22,Theorem 2].
Questions of critical importance arise in context of controlling unimolecular networks: When the AIFC is applied on affine input networks (unimolecular networks with basal production), how likely is control to fail? Are the consequences of control failures biochemically safe or hazardous [26]? Does there exist a molecular IFC with a better stability performance than the AIFC? In particular, assume that a given intracellular network is modelled as being affine; then, the AIFC is successful provided the stability condition [22,Eq. 7] holds. However, a critical challenge arises because it is not possible to a-priori guarantee that [22,Eq. 7] holds, because the rate coefficients of the intracellular network are uncertain [challenge (U)].
While analyzing control of unimolecular networks in their own right is of some theoretical relevance, the vast majority of intracellular networks are not unimolecular [challenge (N)] and, hence, e.g. condition [22,Eq. 7] does not even apply. There-fore, a far more important question for intracellular control is: How do molecular IFCs perform when applied to bimolecular and higher-dimensional input networks? In particular, the structure of intracellular networks is itself in general uncertain; consequently, only approximate models can be put forward, that neglect a number of coupled molecular species and processes that are "hidden" in the intracellular system. Hence, even if a less-detailed model of an intracellular network happens to be successfully controlled, there is in general no guarantee that this remains true when a more-detailed model is used-a phenomenon we call phantom control.
The main objective of this paper is to address these questions. We show that at the center of these issues are equilibria-stationary solutions of the reaction-rate equations (RREs) that govern the deterministic dynamics of biochemical networks [13,14]. In particular, molecular concentrations can reach only equilibria that are nonnegative. In this context, we show that IFCs can destroy all nonnegative equilibria of the controlled system and lead to a control failure; furthermore, this failure can be catastrophic, as some of the molecular concentrations can then experience an unbounded increase with time (blow-up). We call this hazardous phenomenon, involving absence of nonnegative equilibria and blow-up of some of the underlying species abundances, a negative-equilibrium catastrophe (NEC), which we outline in Fig. 2. To the best of our knowledge, analyses of NECs and the related challenges in molecular control, which are the focus of this paper, are absent from the literature. For example, the only form of instability presented in [22,24,25] are bounded deterministic oscillations, which average out at the stochastic level and do not correspond to violation of condition [22,Eq. 7]; in contrast, we show that this condition is violated when NECs occur.
The paper is organized as follows. In Sects. 2-4, we present some fundamental results regarding control of unimolecular networks; these results are then used to facilitate the analysis of control of bimolecular networks, which we present in Sect. 5. In particular, in Sect. 2, we prove that unimolecular IFCs do not exist due to a NEC. We then derive a class of bimolecular IFCs given by (8) in Sect. 3; as a consequence of demanding in the derivation that the controlling variables are positive, we obtain IFCs that influence the target species both positively and negatively, in contrast to the AIFC that acts only positively. In Sect. 4, we apply different variants of these controllers on a unimolecular gene-expression network (9), and then generalize the results. In particular, we show that the AIFC can lead to a NEC when applied to (9), both deterministically and stochastically; more broadly, we show that the AIFC does not generically operate safely when applied to unimolecular networks. Furthermore, we prove that there exists a two-dimensional (two-species) IFC that eliminates NECs when applied to any (arbitrarily large) stable unimolecular input network. However, in Sect. 5 we demonstrate that, without detailed information about the input systems, NECs generally cannot be prevented when bimolecular networks are controlled. In particular, we show that, in stark contrast to dimension-independent control of unimolecular networks, control of bimolecular networks suffers from the curse of dimensionality -the problem becomes more challenging as the dimension of the input network increases. We conclude the paper by presenting a summary and discussion in Sect. 6. Notation and background theory are introduced as needed in the paper, and are summarized in Appendices A and B. Rigorous proofs of the results presented in Sects. 2, 4 and 5 are provided in Appendices C-D, E and F, respectively. In particular, the target species X 1 approaches a desired equilibrium, shown as a black dashed line, and the equilibrium for the residual species X 2 is positive. c Displays a cell that has taken lethal damage due to a failure of the IFC. In particular, as shown in d, the target equilibrium for X 1 enforces a negative equilibrium for the residual species X 2 . However, since molecular concentrations are nonnegative, this equilibrium cannot be reached and, therefore, control fails. Furthermore, the failure is catastrophic, as concentrations of some of the underlying species (in this example, species X 2 and Y 1 ) blow up, placing a lethal burden on the cell. b and d are obtained by solving the reaction-rate equations for the output network (16)∪ (19) from Sect. 5 with the dimensionless coefficients (α 0 , α 1 , α 2 , α 3 ) = (1, 1, 1/10, 3/2), (β 0 , β 1 , γ 1 , γ 2 , γ 3 ) = (100, 1, 10, 10, 1), and with (α 0 , α 1 , α 2 , α 3 ) = (1, 25, 2/5, 3/2), (β 0 , β 1 , γ 1 , γ 2 , γ 3 ) = (100, 1, 10, 2/3, 1), respectively

Nonexistence of unimolecular IFCs
In this section, we consider an arbitrary one-dimensional black-box input network R α = R α (X 1 ), where X 1 is a single interfacing species and there are no residual species, and the goal is to control the dynamics of the target X 1 ; see also Fig. 1 for a more general setup. In this paper, we assume all reaction networks are under mass-action kinetics [13,14] with positive dimensionless rate coefficients, which are displayed above or below the reaction arrows; we denote the rate coefficients of R α by α ∈ R a > , where R > is the space of positive real numbers. In what follows, we say that a reaction network is unimolecular (respectively, bimolecular) if it contains at least one reaction with one (respectively, two) reactants, but no reaction with more reactants.

Non-biochemical controller
Let us consider an affine controller formally described by the networkR β,γ = R β (Ȳ 1 ) ∪R γ (X 1 ,Ȳ 1 ), given byR Here,R β =R β (Ȳ 1 ) is the controller core, describing the internal dynamics of the controlling speciesȲ 1 , where the reactant ∅ denotes a source, whileR γ =R γ (X 1 ,Ȳ 1 ) is the controller interface, specifying interactions betweenȲ 1 and the target species X 1 from the input network, see also Fig. 1. Let us denote abundances of species At the deterministic level, formal reaction-rate equations (RREs) [13,14] read where f 1 (x 1 ; α) is an unknown function describing the dynamics of R α , and (x * 1 ,ȳ * 1 ) ∈ R 2 is the unique equilibrium of the output network, obtained by solving the RREs with zero left-hand sides. Assuming that (x * 1 ,ȳ * 1 ) is globally stable, network (1) is an IFC; in particular, in this case, x * 1 = (β 0 /γ 1 ) is independent of the initial conditions and the input coefficients α. However, controller (1) cannot be interpreted as a biochemical reaction network. In particular, the term (−γ 1 x 1 ) in (2) induces a process graphically described by X 1 (1), which consumes speciesȲ 1 even when its abundance is zero. Consequently, variables (x 1 ,ȳ 1 ) may take negative values and, therefore, cannot be interpreted as molecular concentrations [20].

Unimolecular controllers
The only unimolecular analogue of the IFC (1), that contains only one controlling species Y 1 , is of the form The RREs and the equilibrium for the output network R α ∪ R β,γ are given by Given nonnegative initial conditions, variables (x 1 , y 1 ) from (4) are confined to the nonnegative quadrant R 2 ≥ , and represent biochemical concentrations. However, the x 1 -component of the equilibrium from (4) is negative and, therefore, not reachable by the controlled system. Furthermore, y 1 is a monotonically increasing function of time, dy 1 /dt > 0, i.e. y 1 blows up. We call this phenomenon a deterministic negativeequilibrium catastrophe (NEC), see also Appendix A. Network (3) not only fails to achieve control, but it introduces an unstable species and is, hence, biochemically hazardous. In Appendix C, we prove that a NEC occurs at both deterministic and stochastic levels for any candidate unimolecular IFC, which we state as the following theorem.

Theorem 2.1 There does not exist a unimolecular integral-feedback controller.
Proof See Appendix C.
To the best of our knowledge, Theorem 2.1 has not been previously reported in the literature. A related result is presented in [23,Proposition S2.7] and states that a molecular controller R β ∪ R γ , satisfying a set of assumptions, including the assumption that the interface R γ contains only catalytic reactions, is a molecular IFC only if the core R β contains a bimolecular degradation. No such assumptions have been made in Theorem 2.1, which holds for all unimolecular networks; in particular, we allow interface R γ to contain non-catalytic reactions, such as Y i → X j and X i → Y j .

Design of bimolecular IFCs
Theorem 2.1 implies that only bimolecular (and higher-molecular) biochemical networks may exert integral-feedback control. An approach to finding such networks is to map non-biochemical IFCs into biochemical networks, while preserving the underlying integral-feedback structure. This task can be achieved using special mappings called kinetic transformations [20]. Let us consider the non-biochemical system (2). The first step in bio-transforming (2) is to translate relevant trajectories (x 1 ,ȳ 1 ) into the nonnegative quadrant. However, since R α (X 1 ) is a black-box network, i.e. f 1 (x 1 ; α) is unknown and unalterable, onlyȳ 1 can be translated; to this end, we define a new variable y 1 ≡ (ȳ 1 + T ), with translation T > 0, under which (2) becomes Terms (−γ 1 x 1 ) and (−γ 2 T ), called cross-negative terms [20], do not correspond to biochemical reactions and, therefore, must be eliminated. Let us note that cross-negative terms also play a central role in the questions of existence of other fundamental phenomena in biochemistry, such as oscillations, multistability and chaos [14,20,27]. Term (−γ 1 x 1 ) can be eliminated with the so-called hyperbolic kinetic transformation, presented in Appendix D, which involves introducing an additional controlling species Y 2 and extending system (5) into Note that (5) and (6) have identical equilibria (time-independent solutions) for the species X 1 and Y 1 , and that the equilibria for the species Y 1 and Y 2 have a hyperbolic relationship; furthermore, provided β 1 is sufficiently large, time-dependent solutions of (5) and (6) are close as well, see Appendix D. On the other hand, cross-negative term (−γ 2 T ) can be eliminated via multiplication with x 1 and any other desired factor; such operations do not influence the x 1 -equilibrium, which is determined solely by the RREs for y 1 and y 2 , and which we want to preserve. One option is to simply map (−γ 2 T ) to (−γ 2 T x 1 ), and take T large enough to ensure that the y * 1 -equilibrium is positive; however, this approach requires the knowledge of f 1 (x 1 ; α). A more robust approach is to map (−γ 2 T ) to (−γ 2 T x 1 y 2 ), under which, defining γ 3 ≡ γ 2 T , one obtains The quadratic equation for y * 1 from (7) always has one positive solution; therefore, there always exists an equilibrium with positive y 1 -and y 2 -components.
In what follows, we largely consider input networks R α (X ) with at most two interfacing species {X 1 , X 2 }, and focus on controlling the target species X 1 with the bimolecular controllers induced by (7), given by In particular, the controller core R β (Y 1 , Y 2 ) consists of a production of Y 1 from a source, and a bimolecular degradation of Y 1 and Y 2 . On the other hand, the controller interface consists of the unimolecular reactions R 0 γ (Y 2 ; X 1 ), and R + γ (X i ; Y 1 ), that produce Y 2 catalytically in X 1 , and X i catalytically in Y 1 , respectively, and the bimolecular reaction R − γ (X j ; Y 2 ) that degrades an interfacing species X j catalytically in Y 2 . We call reactions R + γ (X i ; Y 1 ) and R − γ (X j , Y 2 ) positive and negative interfacing, respectively. Furthermore, we say that positive (respectively, negative) interfacing is direct if i = 1 (respectively, if j = 1), i.e. if it is applied to the target species X 1 ; otherwise, the interfacing is said to be indirect. In Fig.1, we display controller (8) with direct positive and negative interfacing applied to an input network with a single interfacing species X 1 .
As shown in this section, positive and negative interfacing arise together naturally when molecular IFCs are designed using the theoretical framework from [20]. In this context, it is interesting to note that the "housekeeping" sigma/anti-sigma system in E. coli, proposed to implement integral control [22], has been experimentally shown to be capable of exhibiting both positive and negative transcriptional control, at least when hijacked by bacteriophage [28]. On the other hand, while the AIFC from [22] is of the form (8), it lacks negative interfacing R − γ (X j ; Y 2 ). In view of the derivation from this section, the AIFC is missing a key designing step, namely the translation from (5); consequently, NECs may occur due to y * 1 -and y * 2 -equilibria being negative. Let us note that the negative interfacing R − γ (X j ; Y 2 ) has also been considered in [29], where this reaction is shown to be capable of eliminating oscillations at the deterministic level, and reducing variance at the stochastic level, for a particular gene-expression input network. In contrast, in this section, we have systematically derived reaction R − γ (X j ; Y 2 ) in order to ensure that a positive equilibrium for Y 1 and Y 2 exists. Such matters are not discussed in [29], where basal transcription is set to zero in the geneexpression input network considered, so that negative equilibria are not encountered.

Control of unimolecular input networks
In this section, we study performance of the IFCs (8) when applied on unimolecular input networks. To this end, let us consider the input network We interpret (9) as a two-dimensional reduced (simplified) model of a higherdimensional gene-expression network. In this context, X 1 is a degradable protein species that is produced via translation from a degradable mRNA species X 2 , which is transcribed from a gene; some of the "hidden" species (dimensions), that are not Fig. 3 Schematic representation of the gene-expression input network (9). a Displays (9) with basal transcription rate α 0 . b Displays network (9) with tripled effective transcription rate, 3α 0 , arising when an activating transcription factor binds to the underlying gene promoter explicitly modelled, such as genes, transcription factors and waste molecules, are denoted by ∅. See also Fig. 3a for a schematic representation of network (9). The RREs of (9) have a unique globally stable equilibrium given by For simplicity, in this section we assume that X 1 is an interfacing species, while X 2 is residual (i.e. X 2 cannot be interfaced with a controller); the goal is to control the equilibrium concentration of the target species X 1 at the deterministic level, and its average copy-number at the stochastic level. To this end, we embed different variants of the controller (8) into (9).

Pure positive interfacing
Let us first consider controller (8) with only positive interfacing, i.e. the AIFC from [22], which is denoted by The RREs for the output network (9)∪(11) are given by with the unique equilibrium As anticipated in Sect. 3, the AIFC can lead to equilibria with negative y 1 -and y 2components. In particular, Eq. (13) implies that the output nonnegative equilibrium is destroyed when Hence, using only positive interfacing, it is not possible to achieve an output equilibrium below the input one.
To determine the dynamical behavior of (9)∪(11) when the nonnegative equilibrium ceases to exist, let us consider the linear combination of species concentration (α −1 When (14) implies that a species concentration blows up for all nonnegative initial conditions, i.e. the output network displays a deterministic NEC; by applying identical argument to the first-moment equations, it follows that a stochastic NEC occurs as well. This result is summarized as a bifurcation diagram in Fig. 4a. Let us consider network (9) with rate coefficients α = (α 0 , α 1 , α 2 , α 3 ) fixed so that the input x 1 -equilibrium from (10) is given by x * * 1 = 5; then, the output x 1 -equilibrium from (13) must satisfy the constraint x * 1 > 5. Let us stress that, since the input rate coefficients α (and the structure of the input network itself) are generally uncertain (see property (U) from Sect. 1), condition x * 1 > 5 is not a-priori known. Assume the goal is to steer the output equilibrium to 10, i.e. we fix the control coefficients β 0 and γ 1 so that x * 1 = β 0 /γ 1 = 10; this setup is shown as a black dot at (5, 10) in Fig. 4a, which happens to lie in the region where the output network displays a nonnegative equilibrium. However, assume that at a future time, as a response to an environmental perturbation, an activating transcription factor binds to the underlying gene promoter, tripling the transcription rate of the input network, which we model by allowing the transcription rate coefficient α 0 to be time-dependent, see Fig. 3b. Such a perturbation would move the system from coordinate (5,10) to (15,10), into the unstable region, which we show as a black arrow in Fig. 4a. Put another way, the AIFC would fail at its main objective-maintaining accurate control robustly with respect to uncertainties.
In Fig. 4b-d, we show the deterministic and stochastic trajectories for the species X 1 , Y 1 and Y 2 , obtained by solving the RREs (12) and applying the Gillespie algorithm [30] on (9)∪(11), respectively; also shown as a dashed grey line in Fig. 4b is the target equilibrium x * 1 = 10. For time t < 50, when the input network operates as in Fig. 3a, and the output system is in the configuration (5, 10) from Fig. 4a, the AIFC achieves control. However, for time t ≥ 50, when the transcription rate has increased as in Fig. 3b, and the output system is at (15,10)

Fig. 4
Application of the IFCs (8) on the input network (9) with rate coefficients (α 1 , α 2 , α 3 ) = (2/5, 2, 4), and piecewise constant α 0 = α 0 (t) which changes at t = 50 and leads to a catastrophic bifurcation. a Displays a bifurcation diagram for the output network (9)∪(11), while b-d show the underlying deterministic and stochastic trajectories with α 0 = 4 for t < 50 and α 0 = 12 for t ≥ 50, leading to a change in the parameter space indicated by the black arrow in a. The control coefficients are fixed to (β 0 , β 1 , γ 1 , γ 2 ) = (40,4,4,4). Analogous plots are shown in e-h for the output network (9)∪(15) with α 0 = 12 for t < 50 and α 0 = 4 for t ≥ 50, and (β 0 , β 1 , γ 1 , γ 3 ) = (40,4,4,4), and for the output network (9)∪(16) in i-l using the same coefficient values as in a-d, and with γ 3 = 4 the species Y 2 blows up for all admissible initial conditions. Intuitively, this hazardous phenomenon (NEC) occurs because, when the target equilibrium is below the input one, the best accuracy result that the AIFC can achieve is to minimally increase X 1 . Such a task is accomplished with y 1 → 0 which, as a consequence of a hyperbolic relationship between y 1 and y 2 , enforces y 2 → ∞, which is a worst stability result.

Pure negative interfacing
Consider controller (8) By analogous arguments as with controller (11), one can prove that deterministic and stochastic NECs occur when x * 1 > x * * 1 , i.e. controller (15) cannot steer the output equilibrium above the input one; when control above the input equilibrium is attempted, species Y 1 blows up. We display a bifurcation diagram and trajectories for the output network (9)∪(15) in Fig. 4e-h.

Combined positive and negative interfacing
Let us now analyze controller (8) with positive and negative interfacing applied together to the target species X 1 , as suggested by the derivation in Sect. 3. This controller is denoted by The RREs of the output network (9)∪(16) have two equilibria, given by where y * 1 satisfies By design from Sect. 3, quadratic Eq. (18) always has exactly one positive equilibrium, so that no NEC can occur with controller (16); we confirm this fact in Fig. 4i-l.

Arbitrary unimolecular input networks
Test network (9) demonstrates that controller (8) with only positive, or only negative, interfacing does not generically ensure stability of the output network. In other words, the corresponding output network experiences NECs over a large region in the parameter space, as displayed in Fig. 4a and e. On the other hand, controller (8) with combined positive and negative interfacing, applied directly to the target species, induces no NEC, as shown in Fig. 4i. A special feature of unimolecular networks is that distinct species cannot influence each other negatively. Consequently, to ensure existence of a nonnegative equilibrium, negative interfacing must generally be applied directly to the target species, while positive interfacing can be applied directly or indirectly (i.e. to any suitable interfacing species). We now more formally state this result; for more details, see Appendix E. To aid the statement of the theorem, we say that an input network is degenerate if its x 1 -equilibrium is zero, x * * 1 = 0; otherwise, the network is said to be nondegenerate. The set of all degenerate networks forms a negligibly small subset of general unimolecular networks.

Theorem 4.1
Consider an arbitrary nondegenerate unimolecular input network R α whose RREs have an asymptotically stable equilibrium, the family of controllers R β,γ given by (8), and the output network R α,β,γ = R α ∪R β,γ . Then, controller R ± β,γ with both positive and negative interfacing, with negative interfacing being direct, ensures that the output network R α,β,γ has a nonnegative equilibrium for all parameter values On the other hand, the other variants of the controller (8) do not generically ensure that the output network R α,β,γ has a nonnegative equilibrium; furthermore, when a nonnegative equilibrium does not exist, these controllers induce deterministic and stochastic blow-ups (NECs) for all nonnegative initial conditions.
Note that the only variant of (8) that generically eliminates NECs may be experimentally most challenging to implement: bimolecular reaction R − γ must be applied directly to the target species.
In the degenerate case, when the equilibrium of the target species from the input network is zero, x * * 1 = 0, in addition to the positive-negative controller R ± β,γ , the AIFC R + β,γ also generically ensures existence of a nonnegative equilibrium. However, the degenerate input networks can describe only a small class of biochemical processes. For example, when there is no basal transcription, i.e. when α 0 = 0, the equilibrium of network (9) is zero and, consequently, the output equilibrium (13) is always nonnegative. In particular, the output equilibrium is then nonnegative independent of the uncertainties in the input coefficients, so that the key challenge (U) highlighted in Sect. 1 is mitigated. This gene-expression input network without basal transcription, α 0 = 0, has been used in [22] to demonstrate a desirable stochastic behavior of the AIFC. However, as we have shown in this section, when a more general gene-expression model is used, with α 0 = 0, the AIFC can fail and induce both deterministic and stochastic catastrophes as a consequence of the challenge (U). The results from [24,25] also focus on similar degenerate input networks.
Let us stress that, while the AIFC is NEC-free when applied to unimolecular input networks with zero basal production, even an arbitrarily small basal production may cause a NEC to occur over a large parameter regime. For example, let us consider network (9) with the following rate coefficients: α 0 = ε, α 2 = 1/ε 2 and α 1 = α 3 = 1, where 0 < ε 1 is sufficiently small, i.e. we are considering a slower-transcription and faster-translation gene-expression network [31]. It then follows from (13) that the AIFC can achieve control only when the target equilibrium is set to a sufficiently large value, β 0 /γ 1 > α 0 α 2 /(α 1 α 3 ) = 1/ε; in other words, even though the basal transcription is small, nevertheless the AIFC fails catastrophically over a large parameter regime due to a large translation rate.

Control of bimolecular input networks: curse of dimensionality
As expressed by challenge (N) in Sect. 1, most intracellular networks are bimolecular, rather than unimolecular, limiting the applicability of Theorem 4.1. For example, in Sect. 4, we have used the unimolecular input network (9) with a time-dependent rate coefficient to model intracellular gene expression with regulated transcription. To obtain a more realistic model, instead of introducing an effective time-dependent rate coefficient, the reduced network (9) must be extended by including other coupled auxiliary species (e.g. transcription factors and genes) and processes (e.g. pre-transciption and post-translation events); the resulting extended input network is then bimolecular, and therefore Theorem 4.1 no longer applies. In particular, a special property of unimolecular networks is that distinct species can influence each other only positively; in contrast, distinct species can influence each other both positively and negatively in bimolecular networks. For this reason, when stable unimolecular input networks are controlled, NECs can be eliminated purely by ensuring that the controlling species equilibrium y * is positive; the input species equilibrium x * is then necessarily nonnegative. In other words, the problem of controlling unimolecular networks is independent of the challenge (D) from Sect. 1, i.e. the problem does not become more challenging as the dimension of the input network increases. On the other hand, we show in this section that, as a consequence of nonlinearities and positive-negative interactions among distinct species, the problem of controlling bimolecular networks suffers from the curse of dimensionality-the problem becomes more challenging as the dimension of the input network increases.

Two-species reduced input network: residual NEC
Let us consider a two-dimensional reduced model of an intracellular process, given by the bimolecular input network R 2 α (X 1 , X 2 ) which reads where X 1 is produced from a source and converted into a degradable species X 2 via first-and second-order conversion reactions. We assume that X 1 is an interfacing and target species, while X 2 is residual. The RREs of (19) have a unique asymptotically stable equilibrium, given by where the function I 2 = I 2 (x 1 ; α) is given by We call (21) a residual invariant, which is simply the x 2 -equilibrium expressed as a function of the x 1 -equilibrium and the input coefficients α. Let us embed the controller (16) into (19), leading to the RREs of the output network given by where h(x 1 , y 1 , y 2 ; γ ) = γ 2 y 1 − γ 3 x 1 y 2 , which display two equilibria, one of which is never nonnegative, while the other equilibrium satisfies In particular, the functional form of the residual equilibrium x * 2 from (23) is the same as the form of x * * 2 from (20); put another way, the form of the residual species equilibrium is invariant under control, justifying calling the function (21) a residual invariant. To ensure that the output network (16)∪(19) displays a nonnegative equilibrium, the residual invariant (21), now evaluated at the target equilibrium x * 1 = β 0 /γ 1 , must be nonnegative, giving rise to the condition Therefore, while (16) guarantees existence of a nonnegative equilibrium for stable unimolecular input networks (see Theorem 4.1), the same is generally false for bimolecular networks, as the equilibrium of the residual species, which are not interfaced with the controller, can become negative. In Fig. 5, we display the deterministic and stochastic trajectories for the output network (16)∪ (19) over a time-interval such that condition (24) is satisfied for t < 50, and violated for t ≥ 50. One can notice that the output network undergoes deterministic and stochastic NECs. Critically, not only does the controlling species Y 1 blow up, but also the residual species X 2 . In other words, controller (16) destabilizes the originally asymptotically stable input network (19). We call this hazardous phenomenon a residual NEC, as it arises because a residual species has no nonnegative equilibrium (equivalently, because a residual invariant is not nonnegative). Intuitively, when the concentration of the target species X 1 is increased beyond the upper bound from (24), residual species X 2 , which influences X 1 negatively, counteracts the positive action of the controlling species Y 1 , resulting in a joint blow-up. We also display this phenomenon in context of intracellular control in Fig. 2.
Let us stress that the condition I 2 x * 1 , α ≥ 0 from (24) must be obeyed by every molecular controller (e.g. containing integral, proportional and/or derivative actions [19]) that cannot be interfaced with X 2 . Put another way, no matter how one chooses the functions g 1 , g 2 and h in (22), the inequality I 2 x * 1 , α ≥ 0 must be satisfied, which imposes an upper bound on the achievable output equilibrium via x * 1 < α 3 /α 2 . The only way to eliminate this residual invariant condition is to eliminate the residual species X 2 , i.e. to design an appropriate controller that can be interfaced with both X 1 and X 2 . However, as stated in challenges (D) and (U) in Sect. 1, intracellular networks generally contain a large number of coupled biochemical species with different biophysical properties, some of which may be unknown (hidden) or poorly characterized; therefore, it is generally unfeasible to demand that a controller is designed that can be interfaced with any desired species. In what follows, we further investigate this issue.

Three-species extended input network: phantom control
The two-dimensional network (19) has been put forward as a reduced model of an intracellular process, obtained by neglecting a number of molecular species that do not influence the dynamics of X 1 and X 2 , or by using perturbation theory to eliminate slower or faster auxiliary species from considerations [32]. However, the goal of such model reductions is to capture the dynamics of the species X 1 and X 2 on a desired time-scale, and not necessarily to capture how the underlying higher-dimensional intracellular process responds to control. In this context, let us extend network (19) by including a "hidden" residual species X 3 into consideration, which interacts with X 1 and X 2 according to the three-dimensional input network The residual species X 3 influences X 1 and X 2 only weakly via the slower reaction X 3 ε − → X 1 + X 3 in (25), where 0 ≤ ε 1 is sufficiently small. One can readily show that the dynamics of the species X 1 and X 2 from the input networks (19) and (25) are identical as ε → 0, which we denote by writing lim ε→0 R 3 α,ε = R 2 α . Furthermore, the RREs of the network (25) have a unique asymptotically stable positive equilibrium, given at the leading order by where the residual invariants I 2 = I 2 (x 1 ; α) and I 3 = I 3 (x 1 ; α) are given by In what follows, we let α = (α 0 , α 1 , α 2 , α 3 , α 4 , α 5 , α 6 ) = (200, 1/7, 1/3, 5, 1, 4, 1) and ε = 10 −2 ; in Fig. 6, we demonstrate that the (x 1 , x 2 )-dynamics of networks (19) and (25) are then close.
Let us now embed the controller (16) into (25); the RREs of the output network (16)∪(25) have two equilibria, both of which have identical (x 1 , x 2 , x 3 )components, given by In addition to requiring that I 2 (x * 1 ; α) ≥ 0, one must now also demand that I 3 (x * 1 ; α) ≥ 0, to ensure that the (previously neglected) residual species X 3 displays a nonnegative equilibrium, leading to By accounting for the residual species X 3 , a lower bound is imposed on the achievable output equilibrium x * 1 = β 0 /γ 1 in (29), while no such lower bound is imposed in (24). Therefore, while the reduced network (19) is suitable to approximate the dynamics of X 1 and X 2 from the extended network (25), i.e. lim ε→0 R 3 α,ε = R 2 α , network (19) is not suitable to approximate how (25) responds to control, i.e. lim ε→0 ( . When a reduced network is successfully controlled under a parameter choice for which a corresponding extended network fails to be controlled, we say that a phantom control, as opposed to genuine control, occurs for the reduced network. Hence, when the lower bound in (29) is violated, network (16)∪(19) displays a phantom control.
For the chosen input coefficients α, it follows from (29) that one can achieve the output equilibrium only within the small interval approximately given by 13.6 ≤ x * 1 ≤ 15; therefore, even small uncertainties in the input coefficients (challenge (U) from Sect. 1) can then move the system outside of this range, where the control fails. In Fig. 7a-c, we display the deterministic trajectories for the species X 1 , X 3 and Y 2 when the target equilibrium is given by x * 1 = β 0 /γ 1 = 5, thus violating only the lower bound from (29). One can notice that a deterministic NEC occurs-the target species X 1 fails to reach the desired equilibrium, while the residual species X 3 and the controlling species Y 2 blow-up; one can similarly show that a stochastic NEC occurs. Analogous plots are shown in Fig. 7d-f when x * 1 = β 0 /γ 1 = 30, violating the upper bound from (29); one can notice that the species X 2 and Y 1 blow up, as in Fig. 5.

Arbitrary bimolecular input networks
One can continue the model-refinement process which led from network (19) to (25) by including more auxiliary species X 4 , X 5 , X 6 , . . ., each of which generally introduces an additional constraint, I 4 , I 5 , I 6 , . . . ≥ 0, which must be obeyed for an equilibrium to be nonnegative. More generally, let R α be an arbitrary N -dimensional input network satisfying properties (N), (D) and (U) from Sect. 1, R β,γ an arbitrary Mdimensional IFC, and R α,β,γ = R α ∪ R β,γ the corresponding (N + M)-dimensional output network. In order to ensure that an equilibrium (x * , y * ) ∈ R N +M of R α,β,γ is nonnegative, there are exactly two options.

Fine-tuning the control coefficients
The first option is to choose the control coefficients β and γ so that the equilibria of all of the (N + M) species are nonnegative. However, this approach involves solving a system of (N + M) nonlinear inequalities with uncertain coefficients α, which is an intractable theoretical problem. Furthermore, owing to a large number of inequalities, even when a nonnegative equilibrium does exist, the admissible values for β and γ may be confined to a small set, which can lead to a large parameter regime where IFCs can catastrophically fail. The fact that these issues are of concern when intracellular networks are controlled has been demonstrated already with as low as two-dimensional network (19) containing only one bimolecular reaction, and the three-dimensional network (25) containing two bimolecular reactions. Let us note that these issues are related to the fact that the proportion of the state-space R N +M occupied by the nonnegative orthant R N +M ≥ is given by 2 −(N +M) , which decreases exponentially as the dimension of the input network N increases-a fact known as the curse of dimensionality.

Interfacing the controller with every input species
A necessary condition to bypass the intractable problem of fine-tuning the control coefficients is to eliminate all of the residual invariant constraints I n , I n+1 , . . . ≥ 0, which can be achieved only by eliminating all of the residual species. Therefore, the second option is to design a suitable controller that can be interfaced with all of the N input species, which is an unfeasible experimental problem. However, for theoretical purposes, assume all the input species are interfacing; does there then exist an IFC that ensures a nonnegative equilibrium exists for every choice of the coefficients α, β and γ , thus mitigating challenge (U)?

Theorem 5.1 Assume R α is an arbitrary mass-action input network with N input species, all of which are interfacing. Then, there exists a bimolecular integral-feedback
controller R β,γ , containing 2N controlling species, such that the output network R α,β,γ = R α ∪R β,γ has a positive equilibrium for every choice of the rate coefficients (α, β, γ ) ∈ R a+b+c > . Proof See Appendix F for a constructive proof.
In Appendix F, we design a controller that achieves this task by generalizing the approach from Sect. 3, and applying the controller of the form (16) to every input species; therefore, the dimension of the controller scales with the dimension of the input network.

Discussion
In this paper, we have demonstrated that molecular IFCs can display severe stability issues when applied to biochemical networks subject to uncertainties. In particular, all nonnegative equilibria of the controlled network can vanish under IFCs, and some of the species abundances can then blow up. We call this hazardous phenomenon a negative-equilibrium catastrophe (NEC). In context of electro-mehanical systems, analogous phenomenon is known as integrator windup [19]-equilibria of the controlled system reach beyond the boundary of physically allowed values. For some electro-mechanical systems, one only requires the equilibria to be real (as opposed to complex); for biochemical systems, one additionally requires that the equilibria are also nonnegative. Let us stress that requiring an equilibrium to be nonnegative is significantly more restrictive than only requiring it is real. For example, while real linear systems of equations generically have a unique real solution, finding parameter regimes where a positive solution exists is non-trivial [33]; for nonlinear systems, determining such parameter regimes is generally even more challenging, see Sect. 5.3. The consequences of these issues, which are unavoidable for biochemical systems, have been under-explored in the molecular control literature to date.
We have shown in Sect. 2 that, due to the nonnegativity constraint, affine (unimolecular) biochemical systems cannot achieve integral control and, even worse, lead to catastrophes (NECs); in contrast, some affine electro-mechanical systems can achieve integral control [19]. In Sect. 3, using the theoretical framework from [20], we have then constructed a family of bimolecular (nonlinear) IFCs (8). In Sect. 4, we have proved in Theorem 4.1 that a particular two-dimensional (two-species) molecular IFC of the form (8) ensures existence of a nonnegative equilibrium when applied to stable input unimolecular networks of arbitrary dimensions; in particular, NECs can be eliminated in a dimension-independent manner when unimolecular networks are controlled. In contrast, in Sect. 5, we have demonstrated that control of bimolecular networks suffers from the curse of dimensionality-every species in the input network can in general introduce a constraint which must be obeyed for a nonnegative equilibrium to exist, leading to an intractable problem. For theoretical purposes, we have proved in Theorem 5.1 that, assuming all of the input species are known and interfacing-a generally experimentally unfeasible assumption, then there exists a higher-dimensional IFC that always eliminates NECs. Let us note that, in all of the biochemical networks studied in this paper, NECs simultaneously occur at both deterministic and stochastic levels. In particular, as opposed to the instability arising from bounded deterministic oscillations [22,24,25], which average out at the stochastic level, NECs in general persist in the stochastic setting.
Intracellular networks are in general bimolecular, higher-dimensional and subject to uncertainties, as respectively described by the properties (N), (D) and (U) in Sect. 1. Due to these challenges, generally only reduced (approximate) models of intracellular networks are available, which are obtained by eliminating a number of the underlying auxiliary coupled molecular species and reactions. The objective of these lower-dimensional reduced models is to capture the dynamics of desired intracellular species on a time-scale of interest [32]. However, using reduced models for purpose of control is in general unjustified due to NECs, i.e. reduced models do not necessarily capture how the underlying extended models respond to control. In particular, it takes only one of the many input species to display a negative equilibrium for control to fail and a catastrophic event to unfold; hence, including a previously neglected molecular species into a successfully controlled reduced model can result in an extended model for which control fails-a phenomenon we call phantom control, see Sect. 5. Let us note that, when a reduced model displaying a NEC is extended to include e.g. finite resources or dilution, then some of the underlying species concentrations, instead of growing to infinity, would reach finite, but large, values. Nevertheless, the effects of unwanted large concentrations, such as sequestration of ribosomes and depletion of metabolites, are potentially very harmful; moreover, in such a scenario, the underlying IFCs has failed to deliver control and perfect adaptation. NECs therefore place a fundamental limit to applicability of molecular IFCs in synthetic biology. In particular, if one attempts to address NECs, instead of a systematic approach, an ad-hoc approach is in general necessary, consisting of gathering detailed experimental information about a desired intracellular network and designing suitable higher-dimensional controllers that can be interfaced with appropriate input species.
Author Contributions TP, AD and TEO concetualized the study; TP performed the mathematical analyses and simulations, and wrote the paper; TEO secured the funding.
Funding This work was supported by the EPSRC Grant EP/P02596X/1. Thomas E. Ouldridge would like to thank the Royal Society for a University Research Fellowship.

Competing interests The authors declare that they have no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Notation
Given sets A 1 and A 2 , their union, intersection, and difference are denoted by A 1 ∪A 2 , A 1 ∩ A 2 , and A 1 \A 2 , respectively. The largest element of a set of numbers A is denoted by max A. The empty set is denoted by ∅. Set Z is the space of integer numbers, Z ≥ the space of nonnegative integer numbers, and Z > the space of positive integer numbers. Similarly, R is the space of real numbers, R ≥ the space of nonnegative real numbers, and R > the space of positive real numbers. Euclidean column vectors are denoted in boldface, where · denotes the transpose operator. The i-standard basis vector is denoted by e i ≡ (δ i,1 , δ i,2 , . . . , δ i,N The zero-vector is denoted by 0 ≡ (0, 0, . . . , 0) ∈ R N , and we let Given two Euclidean vectors x, y ∈ R N , their inner-product is denoted by x, y ≡ N i=1 x i y i . Abusing the notation slightly, given two sequences we let u l 1 ≡ x∈Z N ≥ |u(x)| denote the l 1 -norm of u. Given a matrix A ∈ R N ×N , with (i, j)-element α i, j ∈ R, we denote the i-row and j-column of A by α i,· ≡  (α i,1 , α i,2 , . . . , α i,N ) ∈ R 1×N , and α ·, j ≡ (α 1, j , α 2, j , . . . , α N , j ) ∈ R N ×1 , respectively. The identity matrix is denoted by I ≡ (e 1 , e 2 , . . . , e N ) ∈ R N ×N , the zero matrix by 0 ≡ (0, 0, . . . , 0) ∈ R N ×N , and diagonal matrices are denoted by diag (α 1,1 , α 2,2 , . . . , α N ,N ) ≡ (α 1,1 e 1 , α 2,2 e 2 , . . . , α N ,N e N ) ∈ R N ×N . Determinant of a matrix A is denoted by |A|.

Appendix A.1: Biochemical reaction networks
Let R α = R α (X ) be a reaction network describing interactions, under mass-action kinetics, between N biochemical species X = {X 1 , X 2 , . . . , X N } in a well-mixed unit-volume reactor [13], as specified by the following a reactions: Here, α = (α 1 , α 2 , . . . , α a ) ∈ R a > are the positive rate coefficients of the reactions from R α . Nonnegative vectors ν j,· = (ν j,1 , ν j,2 , . . . , ν j,N ) ∈ Z N ≥ and ν j,· = (ν j,1 ,ν j,2 , . . . ,ν j,N ) ∈ Z N ≥ are the reactant and product stoichiometric coefficients of the j-reaction, respectively. If ν j,· = 0 (respectivelty,ν j,· = 0), then the reactant (respectively, product) of the j-reaction is the null-species, denoted by ∅; such reactions involve biochemical species that are not explicitly modelled, or represent non-biochemical processes. When convenient, we denote two irreversible reactions Species X i is a catalyst in the j-reaction from R α if ν j,i =ν j,i = 0; if X i is a catalyst in all of the reaction from R α , then we write R α = R α (X \X i ; X i ). The order of the j-reaction from network R α is given by 1, ν j,· ∈ Z ≥ . The order of reaction network R α is given by max{ 1, ν j,· | j ∈ A}; first-order (respectively, second-order) reaction networks are also said to be unimolecular (respectively, bimolecular).
Given a class of biochemical reaction networks, parametrized by the underlying rate coefficients, it may be of interest if a given property is likely to be true when all admissible values of the rate coefficients are considered, which motivates the following definition. In what follows, sets whose total size is arbitrarily small are said to be of measure zero. Definition A.1 (Genericity) Consider a mass-action reaction network R α parametrized by the rate coefficients α ∈ S α , where S α ⊂ R a > is a nonempty open set. Assume S α is partitioned according to S α = α ∪ ω α , with α ∩ ω α = ∅, where ω α is a set of measure zero. A property is said to be generic for network R α and set S α if it holds for all α ∈ α and fails to hold for all α ∈ ω α .

Appendix A.2: Dynamical models of reaction networks
In what follows, we present deterministic and stochastic models of mass-action reaction networks, and provide definitions in context of blow-ups.  x 1 (t; α), x 2 (t; α), . . . , x N (t; α)) ∈ R N ≥ be a concentration vector at time t ∈ R ≥ for the species X = {X 1 , X 2 , . . . , X N } from the network R α , given by (30). A deterministic model of the network R α describes the time-evolution of x = x(t; α) with a system of first-order ordinary differential equations (ODEs), called the reaction-rate equations (RREs) [13,14,16], given by where x j,· ≡ (ν j,· − ν j,· ) ∈ Z N is the reaction vector of the j-reaction, and We now present an important property of the RREs.

Theorem A.1 The nonnegative orthant R N ≥ is an invariant set for the ODEs (31).
Proof See [13].
In this paper, unimolecular networks are of interest and, to this end, we introduce the following definition.
Definition A.2 (Cross-nonnegative matrix) A matrix A ∈ R N ×N with nonnegative off-diagonal elements, α i, j ≥ 0 for all i, j ∈ {1, 2, . . . , N } such that i = j, is said to be cross-nonnegative.
Remark Cross-nonnegative matrices are known as negative Z , quasi-positive, essentially nonnegative, and Metzler matrices in the literature [35,36].

Corollary A.1 For unimolecular reaction networks, the kinetic function from the RREs (31) is given by
A linear ODE system dx/dt = (α ·,0 + Ax) is said to be asymptotically stable (also simply referred to as stable in this paper) if all the eigenvalues of A have negative real parts.
Nonnegative ODEs, such as Eqs. (31) and (33), need not have a nonnegative timeindependent solution, i.e. the dynamics can be confined to an unbounded invariant set devoid of any equilibria. In this context, we introduce the following definition.
Definition A.4 (No-equilibrium catastrophes (NECs)) Reaction network R α (X ) is said to display a deterministic (respectively, a stochastic) no-equilibrium catastrophe (NEC) if, for all α ∈ R a > such that the RREs have no nonnegative equilibria, R α (X ) blows up deterministically (respectively, stochastically) for some nonnegative initial conditions.
Remark More specifically, a nonnegative equilibrium can cease to exist by attaining a negative component, becoming complex, or vanishing all together. In this paper, we focus on the NEC that involves equilibria attaining negative components, which we call negative-equilibrium catastrophe and also abbreviate as NEC.

Appendix B: Stochastic biochemical control
In this section, we formulate the problem of achieving biochemical control over a given reaction network, starting with the following definition.
Definition B.1 (Black, grey and white box) Network R α (X ) = ∅ with unknown (respectively, only partially known) structure and dynamics is called a black-box (respectively, grey-box) network; R α (X ) = ∅ is called a white-box network if its structure and dynamics are completely known.
Given an input (uncontrolled) reaction network R α = R α (X ), the objective of biochemical control is to design a controller network R β,γ in order to ensure that the dynamics of desired input species X is suitably controlled in the resulting output (controlled) network R α,β,γ ≡ R α ∪ R β,γ ; in what follows, we focus on blackbox input networks. To this end, we partition the input species into X = X I ∪ X ρ , where X I = {X 1 , X 2 , . . . , X N I } are the 1 ≤ N I ≤ N interfacing species that can be interfaced with a given controller, while X ρ = X \X I = {X N I +1 , X N I +2 , . . . , X N } are the N ρ = (N − N I ) residual species that cannot be interfaced with the controller; input species whose dynamics we wish to control are called target species. The controller can be decomposed into two sub-networks, , called the core, contains all the reactions that involve only the controlling species Y), called the interface, contains all of the remaining reactions, involving both X I and Y. Put more simply, R β describes internal dynamics of the controlling species, while R γ describes how the interfacing and controlling species interact.
In what follows, we focus on controlling average copy-number of a single target species which is interfacing; see [39] for a more general multi-species control of the full PMF when the targets are both interfacing and residual species. To this end, we denote the rate coefficients from the sub-networks R α , R β and R γ by α ∈ R a > , β ∈ R b > and γ ∈ R c > , respectively. We also let EX 1 (t; ·, · ; α) : R b > × R c > → R be the firstmoment of the target species X 1 ∈ X I , which is a function of the control parameters (β, γ ) for every time t, input coefficient α and every initial condition. More precisely, p(x, y, t; β, γ ; α) x,y , where p(x, y, t; β, γ ; α) is the time-dependent PMF of the output network. In what follows, we denote the gradient operator with respect to x = (x 1 , x 2 Definition B.2 (Control) Consider a black-box input network R α (X ), and the corresponding output network R α,β,γ (X , Y) = R α (X ) ∪ R β,γ (X I , Y). Assume we are given a nonempty open set S α ⊂ R a > , stability index p ∈ Z > , and a target valuex 1 ∈ R > together with a tolerance ε ∈ R ≥ . Then, the first-moment EX 1 = EX 1 (t; β, γ ; α) is said to be controlled in the long-run if there exists a nonempty open set S β,γ ⊂ R b+c > such that the following two conditions are satisfied for all initial conditions (X(0), Y(0)) ∈ Z N +M ≥ and rate coefficients (α, β, γ ) ∈ S α × S β,γ : Condition (C.I) requires boundedness of the long-time moments, up to order p > 1, of all of the species from the output network R α,β,γ (X , Y). Minimally, one requires that the long-time first-moments are bounded, i.e. that the controller does not trigger a stochastic blow-up (see Definition A.3). The larger p ≥ 1 one chooses, the thinner the tail of the long-time output PMF, which ensures a more stable behavior of the output network. Condition (C.II) demands that the long-time average of X 1 , which is required to depend on at least one control parameter, is sufficiently close to the target value. Let us remark that (C.I)-(C.II) must hold within neighborhoods of the underlying rate coefficient values, reflecting the fact that measurement and fine-tuning of rate coefficients is not error-free. For the same reason, we have put forward a more relaxed accuracy criterion by allowing nonzero tolerance in condition (C.II).
Remark Robust controllers are also called integral-feedback controllers [19].
Remark Conditions (C.II) and (R.II) demand that the first-moment of X 1 is nondegenerate with respect to the control parameters, ∇ β,γ EX * 1 (β, γ ; α) = 0, and that it is degenerate with respect to the input parameters, ∇ α EX * 1 (β, γ ; α) = 0, respectively. As we prove in Lemma C.1 below, the degeneracy condition enforces singularity of an appropriate kinetic matrix.

Appendix C: Nonexistence of unimolecular integral-feedback controllers
Let R α (X ) be a black-box input network with a desired target species X 1 ∈ X I , and let Y) be a unimolecular network. The first-moment equations for the output network R α,β,γ (X , Y) = R α (X ) ∪ R β,γ (X I , Y) can be written in the following form: where L α is the (unknown) generator of the input network, β ·,0 = (β 1,0 , β 2,0 , . . . , β M,0 ) ∈ R M ≥ is induced by the core network R β (Y), while cross-nonnegative matrix C 1,1 ∈ R N I ×N I and nonnegative matrices C 1,2 ∈ R N I ×M ≥ and C 2,1 ∈ R M×N I ≥ are induced by the interfacing network R γ (X I , Y). MatrixC 2,2 ∈ R M×M ≥ can be written asC 2,2 = (B + C 2,2 ), with cross-nonnegative matrices B and C 2,2 being induced by R β (Y) and R γ (X I , Y), respectively. Note thatC 2,2 encodes reactions that change the copy-numbers of the controlling species Y either in X I -independent (via matrix B) or X I -dependent (via matrix C 2,2 ) manner. Let us also note that, by definition, β ∈ R b > contains nonzero elements from vector β ·,0 , and absolute value of nonzero elements from matrix B; similarly, γ ∈ R c > contains absolute values of nonzero elements from C 1,1 , C 1,2 , C 2,1 and C 2,2 . We now present conditions on the matrices C 2,1 = (γ 2,1 ·,1 , γ 2,1 ·,2 , . . . , γ 2,1 ·,n ) andC 2,2 that are necessary for unimolecular networks to exert robust (integral-feedback) control. Lemma C.1 Assume a unimolecular reaction network R β,γ (X I , Y) is a robust controller for a black-box network R α (X ). Then,C 2,2 from (34) is a singular matrix all eigenvalues of which have nonpositive real parts for all (β, γ ) ∈ S β,γ . Furthermore, the first column of matrix C 2,1 is nonzero.
Assume the first column of C 2,1 is zero, γ 2,1 ·,1 = 0. By the Fredholm alternative theorem, a necessary condition for the stationary first-moment EY * to exist is then given by ( w, β ·,0 + n i=2 w, γ 2,1 ·,i EX * i ) = 0 for every w ∈ R M such that (C 2,2 ) w = 0. The stationary average EX * 1 is not uniquely determined by these constraints and, hence, depends on α. Therefore, condition (R.II) from Definition B.3 does not hold.

Appendix D: Hyperbolic kinetic transformation
In this section, we briefly discuss how to map arbitrary polynomial ODEs into dynamically similar mass-action RREs [20]. To this end, let us note that every component of every polynomial function P(x; α) = (P 1 (x; α), P 2 (x; α), . . . , P N (x; α)) ∈ P m (R N ; R N ), where α ∈ R a > , can be written as Here, are the indices of all of the distinct positive and negative terms in P i (x; α) . Definition D.1 (Cross-negative terms; kinetic functions) Consider a polynomial function P(x; α) ∈ P m (R N ; R N ) with α ∈ R a > , whose i-component is given by (35).  N (x; α); the set of all non-kinetic functions of degree at most m is denoted by P N m (R N ; R N ). Remark Definition D.1 captures the fact that biochemical reactions cannot consume a species when the concentration of that species is zero, e.g. see network (1); Theorem A.1 is a direct consequence of the absence of cross-negative terms in kinetic functions [14].
In order to map an arbitrary input non-kinetic function N (x; α) ∈ P N m (R N ; R N ) into an output kinetic one K(x;ᾱ) ∈ P K m (RN ; RN ), while preserving desired dynamical features over suitable time-intervals, a number of so-called kinetic transformations have been developed in [20]. Such mappings involve a dimension expansion (i.e. an introduction of additional biochemical species,N ≥ N ) or an increase in non-linearity (m ≥ m). We now present one such kinetic transformation.
Definition D.2 (Hyberbolic transformation) Consider a system of polynomial ODEs given by with the index sets as defined in (35). Consider also the RREs given by with a parameter ε ∈ R > . Kinetic transformation ε H : n) ), mapping the right-hand side of ODEs (36) to the right-hand side of (37), withm ≤ m + 2, is called a hyberbolic transformation.
By comparing equilibria of (36) and (37), the following proposition is established.
Let us note that, while the x-equilibria are preserved by the hyperbolic kinetic transformation, their properties, such as stability, are not necessarily preserved; a stronger dynamical preservation is ensured by taking ε sufficiently small. (36), with (x n+1 , x n+2 , . . . , x N ) ∈ R N −n > , are asymptotically equivalent to the solutions of (37) in the limit ε → 0.
We now prove that if, for a particular choice of the rate coefficients, the output network from Theorem E.1 has no nonnegative equilibria, then the network displays deterministic and stochastic blow-ups, i.e. we prove that the output network undergoes deterministic and stochastic NECs (see also Definition A.4 in Sect. 1). In what follows, the critical values of β 0 /γ 1 at which nonnegative equilibria cease to exist in Theorem E.1 are called bifurcation points.
Theorem E.2 (Negative-equilibrium catastrophe) Consider a unimolecular input network R α whose RREs have an asymptotically stable equilibrium, and a controller R β,γ of the form (8). Then, excluding the bifurcation points, the output network R α,β,γ = R α ∪ R β,γ displays deterministic and stochastic negative-equilibrium catastrophe for all nonnegative initial conditions. conditions, i.e. the output network displays a deterministic NEC. Identical argument implies that then the output network displays a stochastic NEC as well. Deterministic and stochastic NECs for cases (ii) and (iii) from Theorem E.1 are established analogously.
and the statement of the theorem follows using the same argument as in (7) from Sect. 3.