Identifiability from a Few Species for a Class of Biochemical Reaction Networks

Bulletin of Mathematical Biology

Abstract

Under mass-action kinetics, biochemical reaction networks give rise to polynomial autonomous dynamical systems whose parameters are often difficult to estimate. In this paper, we address the problem of identifying the kinetic parameters of an abundant class of biochemical networks, which includes multisite phosphorylation systems and phosphorylation cascades (for example, MAPK cascades). For any system of this class, we explicitly exhibit a single species for each connected component of the associated digraph such that the successive total derivatives of its concentration allow us to identify all the parameters occurring in the component. The number of derivatives needed is essentially bounded by the length of the corresponding connected component of the digraph. Moreover, in the particular case of the cascades, we show that the parameters can be identified from a bounded number of successive derivatives of the last product of the last layer. This theoretical result also induces a heuristic interpolation-based identifiability procedure to recover the values of the rate constants from exact measurements.

Acknowledgements

The authors wish to thank the anonymous referees for their thoughtful comments which helped to improve the manuscript.

Author information

Corresponding author

Correspondence to Mercedes Pérez Millán.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Partially supported by UBACYT 20020170100048BA (MPM), UBACYT 20020160100039BA (GJ, PS), CONICET PIP 11220150100483 (MPM), CONICET PIP 11220130100527CO (GJ), CONICET P-UE 22920170100037CO (GJ, MPM, PS), and ANPCyT PICT 2016-0398 (MPM), Argentina.

Proofs


Throughout this “Appendix,” we maintain the notation and assumptions introduced in Sects. 2 and 3.

Before stating and proving our results, we introduce some further notation and formulas we will use in our analysis. We consider an autonomous dynamical system

$$\begin{aligned} {{\dot{{\mathbf {x}}}}}= \underset{y\rightarrow y'}{\sum } k_{yy'} \, {\mathbf {x}}^y \, (y'-y), \end{aligned}$$
(8)

arising from a chemical reaction network satisfying the assumptions stated in Sect. 3.

For a non-intermediate species X, let

$$\begin{aligned} {\mathscr {Z}}_X = \{Z : Z \text{ reacts } \text{ with } X\} \quad \hbox {and} \quad {\mathscr {W}}_X = \{ W : W \text{ reacts } \text{ to } X\}. \end{aligned}$$
(9)

By the shape of the networks we consider, \({\mathscr {Z}}_X\) is a set of non-intermediate species and \({\mathscr {W}}_X\) is a set of intermediate species. From (8), we then have that

$$\begin{aligned} {\dot{x}}=-\sum _{Z\in {\mathscr {Z}}_X} \mu _{z} x z+ \sum _{W\in {\mathscr {W}}_X}\eta _{w}w, \end{aligned}$$
(10)

for suitable nonnegative real numbers \(\mu _z\) and \(\eta _w\). For \(\ell \ge 2\), the Leibniz rule implies that

$$\begin{aligned} x^{(\ell )}=-\sum _{Z\in {\mathscr {Z}}_X}\mu _{z}\sum _{h+i=\ell -1}\left( {\begin{array}{c}\ell -1\\ h\end{array}}\right) x^{(h)}z^{(i)}+ \sum _{W\in {\mathscr {W}}_X}\eta _{w}w^{(\ell -1)}. \end{aligned}$$
(11)

If \(W\in {\mathscr {W}}_X\) is involved in a block of reactions

$$\begin{aligned} Z_{w,1}+Z_{w,2} \overset{a_w}{\underset{b_w}{\rightleftarrows }} W \overset{c_w}{\rightarrow } Z_{w,3}+X, \end{aligned}$$

then, according to (8), the differential equation \({\dot{w}}=a_wz_{w,1}z_{w,2}-K_w w,\) with \(K_w = b_w+c_w\), is satisfied, and

$$\begin{aligned} w^{(\ell -1)}= \sum _{h+i\le \ell -2} (-K_w)^{\ell -2-h-i}a_w\left( {\begin{array}{c}h+i\\ h\end{array}}\right) z_{w,1}^{(h)}z_{w,2}^{(i)}+(-K_w)^{\ell -1} w. \end{aligned}$$
(12)
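
As a quick check of (12), the case \(\ell =3\) can be written out by hand. The computation below is only an illustration; it uses nothing beyond the equation \({\dot{w}}=a_wz_{w,1}z_{w,2}-K_w w\) and the product rule:

$$\begin{aligned} \ddot{w}&= a_w\,[{\dot{z}}_{w,1}z_{w,2}+z_{w,1}{\dot{z}}_{w,2}]-K_w\,{\dot{w}} \\ &= a_w\,{\dot{z}}_{w,1}z_{w,2}+a_w\,z_{w,1}{\dot{z}}_{w,2}-K_w a_w\,z_{w,1}z_{w,2}+K_w^{2}\,w. \end{aligned}$$

The three product terms are the summands of (12) with \((h,i)=(1,0)\), \((0,1)\) and \((0,0)\), and the last term is \((-K_w)^{2}w\).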

By separating the cases where \(X\in \{Z_{w,1}, Z_{w,2}\}\) and \(X\notin \{Z_{w,1}, Z_{w,2}\}\), we can simplify:

$$\begin{aligned} x^{(\ell )}=\sum _{Z\in {\mathscr {Z}}_X} \sum _{h+i\le \ell - 1} \beta _{z,h,i}\,x^{(h)}z^{(i)} +\sum _{\begin{array}{c} W\in {\mathscr {W}}_{X} \\ X\notin \{Z_{w,1},Z_{w,2}\} \end{array}} \sum _{h+i\le \ell -2}\gamma _{w,h,i}\,z_{w,1}^{(h)}z_{w,2}^{(i)} + \sum _{W\in {\mathscr {W}}_{X}} \delta _w \, w, \end{aligned}$$
(13)

for suitable real numbers \(\beta _{z,h,i}, \gamma _{w,h,i}\) and \(\delta _w\) that depend on \(\ell \) and the reaction rate constants.

From the previous formulas, interpreted as polynomials in the variables x, z and w, we straightforwardly deduce the following lemma.

Lemma 1

For a reaction network satisfying the assumptions of Sect. 3, we have:

  1. The constant monomial does not appear in any derivative of any species.

  2. The only monomials of degree 1 appearing in a derivative \(x^{(\ell )}\), \(\ell \ge 1\), for a non-intermediate species X, are the monomials w corresponding to \(W\in {\mathscr {W}}_X\), that is, those that appear in \(\dot{x}\).

1.1 Proofs of Sect. 4.1: Identifying the Constants in One Connected Component from One Variable

Here we give the proofs of our identifiability result for a connected component of the type:

$$\begin{aligned} Y+S_0\overset{a_1}{\underset{b_1}{\rightleftarrows }} U_1 \overset{c_1}{\rightarrow } Y+S_1\overset{a_2}{\underset{b_2}{\rightleftarrows }} U_2 \overset{c_2}{\rightarrow } \dots Y+S_{L-1}\overset{a_L}{\underset{b_L}{\rightleftarrows }} U_L \overset{c_{L}}{\rightarrow }Y+S_L \end{aligned}$$
(14)

We maintain the hypotheses and notations introduced in Sect. 3 and previously in this “Appendix.”

Lemma 2

Given a connected component as in (14), the constants \(a_L, b_L\) and \(c_L\) can be identified from \({\dot{s}}_L\) and \(\ddot{s}_L\), and, if \(L>1\), the constants \(a_j\) and \(K_j:=b_j+c_j\), for \(1\le j \le L-1\), can be identified from \({\dot{s}}_L, \ddot{s}_L\) and \(s_L^{(3)}\).

Proof

Following (10), we have

$$\begin{aligned} {\dot{s}}_{L}=-\sum _{Z\in {\mathscr {Z}}_L} \mu _{z}s_{L}z+ \sum _{W\in {\mathscr {W}}_L}\eta _{w}w \end{aligned}$$
(15)

where, using the notation in (9), \({\mathscr {Z}}_L:= {\mathscr {Z}}_{S_L}\) and \({\mathscr {W}}_L:= {\mathscr {W}}_{S_L}\). By separating the term corresponding to \(U_L\in {\mathscr {W}}_L\), we obtain

$$\begin{aligned} {\dot{s}}_{L}=-\sum _{Z\in {\mathscr {Z}}_L} \mu _{z}s_{L}z+ c_L u_L+\sum _{W\in {\mathscr {W}}_L^*}\eta _{w}w \end{aligned}$$

where \({\mathscr {W}}_L^*:= {\mathscr {W}}_L\backslash \{U_L\}\). Then, we can identify \(c_L\) from \({\dot{s}}_L\) as the coefficient of the monomial \(u_L\).

Consider now

$$\begin{aligned} \ddot{s}_{L}=-\sum _{Z\in {\mathscr {Z}}_L} \mu _{z} [{\dot{s}}_{L}z+s_{L}{\dot{z}}]+ c_L(a_Lys_{L-1}-K_Lu_L)+\sum _{W\in {\mathscr {W}}_L^*}\eta _{w} [a_w z_{w,1} z_{w,2}-K_w w]. \end{aligned}$$

From this expression, since \(c_L \ne 0\), we can identify \(a_L\) and \(K_L\) from the coefficients of the monomials \(ys_{L-1}\) and \(u_L\) (which only appear in \(\ddot{s}_L\) from the derivative \({\dot{u}}_L\)) and, as we know \(c_L\), we can also identify \(b_L\). If \(L=1\) we have identified all the constants.

If \(L>1\), consider the third derivative

$$\begin{aligned} \displaystyle \begin{array}{rcl} s^{(3)}_{L}&{}=&{}\displaystyle \sum _{Z\in {\mathscr {Z}}_L} \sum _{h+i\le 2} \beta _{z,h,i} \, s_{L}^{(h)} z^{(i)} + c_L a_L[{\dot{y}}s_{L-1}+y{\dot{s}}_{L-1}]-c_L K_L(a_Lys_{L-1}-K_Lu_L) \\ &{} &{} {} + \displaystyle \sum _{\begin{array}{c} W\in {\mathscr {W}}_L^* \\ S_L \notin \{ Z_{w,1}, Z_{w,2}\} \end{array}}\gamma _{w,1} [{\dot{z}}_{w,1} z_{w,2}+ z_{w,1} {\dot{z}}_{w,2}] + \gamma _{w,0} z_{w,1} z_{w,2} + \delta _w w. \end{array} \end{aligned}$$
(16)

The constants \(a_j\) and \(K_j\), for \(1\le j \le L-1\), appear in \(\dot{y} = \sum _{1\le j \le L} (- a_j y s_{j-1} + K_j u_j) + \cdots \) as the coefficients (up to sign) of the monomials \(y s_{j-1}\) and \(u_j\), respectively. Then, they appear in the expression (16) from the product \({\dot{y}} s_{L-1}\) in the coefficients of the monomials \(y s_{j-1} s_{L-1}\) and \(u_{j}s_{L-1}\), for \(1\le j\le L-1\). We will now look for these monomials in the whole expression (16) and show that they come only from the product \({\dot{y}} s_{L-1}\).

As \(Y \notin {\mathscr {Z}}_L\) and, for every \(Z\in {\mathscr {Z}}_L\), by Assumption 2, we have \(Z\ne S_l\) for all \(0\le l\le L-1\), the monomials \(y s_{j-1} s_{L-1}\) and \(u_{j}s_{L-1}\), for \(1\le j\le L-1\), do not appear in \(s_L^{(h)} z\), for \(0\le h \le 2\). Also, it is clear that they do not appear in \(s_L z^{(i)}\), for \(0\le i \le 2\). On the other hand, every monomial of degree 3 that appears in a product of two derivatives of order 1 is a multiple of an intermediate; so, \(y s_{j-1} s_{L-1}\) does not appear in \({\dot{s}}_L {\dot{z}}\) and, by Lemma 1, \(u_j s_{L-1}\) does not appear either since no derivative contains a constant term or the degree one monomial \(s_{L-1}\).

Now, consider \(W\in {\mathscr {W}}_L^*\) such that \(S_L\notin \{Z_{w,1}, Z_{w,2}\}\), and the corresponding block of reactions \(Z_{w,1}+Z_{w,2}\rightleftarrows W \rightarrow Z_{w,3} + S_L\). Since \(U_j\notin {\mathscr {W}}_L^*\) for every \(1\le j \le L\), then \(Z_{w,1}+Z_{w,2} \ne Y +S_{j-1}\). Also, by Assumption 2, \(Z_{w,1}+Z_{w,2} \ne S_{l} +S_{L-1}\) for every \(0\le l\le L\). Every monomial in \({\dot{z}}_{w,1} z_{w,2}\) is either of the form \(w_0 z_{w,2}\) for an intermediate \(W_0\) that reacts to \(Z_{w,1}\) or of the form \(z_0 z_{w,1} z_{w,2}\) for a non-intermediate \(Z_0\) that reacts with \(Z_{w,1}\). If \(z_0 z_{w,1} z_{w,2} = y s_{j-1} s_{L-1}\), it follows that \(Z_{w,1} +Z_{w,2}\in \{ Y+S_{j-1}, Y+S_{L-1}, S_{j-1}+S_{L-1}\}\), leading to a contradiction. If \(w_0 z_{w,2} = u_j s_{L-1}\), then \(Z_{w,2} = S_{L-1}\) and \(U_j\) reacts to \(Z_{w,1}\), meaning that \(Z_{w,1}\in \{Y, S_{j-1}, S_j\}\), which is not possible.

Finally, the monomial \(ys_{j-1} s_{L-1}\) does not appear in \(y {\dot{s}}_{L-1}\) since, by Assumption 2, \(S_{j-1}\) does not react with \(S_{L-1}\) for every j.

We conclude that, for \(1\le j \le L-1\), the coefficients in \(s_L^{(3)}\) of the monomials \(ys_{j-1} s_{L-1}\) and \(u_j s_{L-1}\) are \( -c_L a_L a_j\) and \(c_L a_L K_j\), respectively. As we have already identified \(c_L\) and \(a_L\), these coefficients enable us to identify \(a_j\) and \(K_j=b_j+c_j\), for \(1\le j \le L-1\). \(\square \)
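
To make the bookkeeping concrete, here is a worked sketch for \(L=2\). It rests on a simplifying assumption that is not part of the lemma: the component (14) is taken to be isolated, so that \(S_2\) is involved in no other reaction (\({\mathscr {Z}}_L=\emptyset \) and \({\mathscr {W}}_L=\{U_2\}\)). With \(K_j=b_j+c_j\), the first three derivatives are then

$$\begin{aligned} {\dot{s}}_2&= c_2u_2, \\ \ddot{s}_2&= c_2a_2\,ys_1-c_2K_2\,u_2, \\ s_2^{(3)}&= c_2a_2[{\dot{y}}s_1+y{\dot{s}}_1]-c_2K_2(a_2ys_1-K_2u_2), \\ {\dot{y}}&= -a_1ys_0+K_1u_1-a_2ys_1+K_2u_2. \end{aligned}$$

Reading off coefficients gives \(c_2\) from \(u_2\) in \({\dot{s}}_2\); then \(a_2\) and \(K_2\) (hence \(b_2\)) from \(ys_1\) and \(u_2\) in \(\ddot{s}_2\); and finally, from the product \({\dot{y}}s_1\) in \(s_2^{(3)}\), the monomials \(ys_0s_1\) and \(u_1s_1\) carry the coefficients \(-c_2a_2a_1\) and \(c_2a_2K_1\), which yield \(a_1\) and \(K_1\).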

We now show some auxiliary results concerning the behavior of monomials appearing in the successive derivatives of some variables and their relations with the reaction network. They will allow us to prove Lemma 5, the key recursive tool to show the identifiability results of Sect. 4.1.

Lemma 3

If \({\prod _{i=1}^{m}}z_i\), with \(m\ge 2\), is a monomial of \(x^{(\ell )}\) where \(Z_i\) is a non-intermediate species for every i, then there exist \(1\le i_1<i_2\le m\) such that \(Z_{i_1}\) reacts with \(Z_{i_2}\).

Proof

If \(\ell =1\), this is true. Assume \(\ell \ge 2\). Recalling that \(x^{(\ell )} = \sum \nolimits _{v} \frac{\partial {x^{(\ell -1)}}}{\partial v} \dot{v}\) (where the sum runs over all variables v representing non-intermediates or intermediate species), it follows that \({\prod _{i=1}^{m}}z_i\) is a monomial in \(\frac{\partial {x^{(\ell -1)}}}{\partial v} \dot{v}\) for some variable v. Since every monomial appearing in \(\dot{v}\) is either a single intermediate or a product of two non-intermediate species that react together, the result follows. \(\square \)

Corollary 2

If X is a non-intermediate species, no derivative \(x^{(\ell )}\) for \(\ell \ge 1\) contains a monomial which is a pure power of degree \(m\ge 2\) of a variable corresponding to a non-intermediate species.

Lemma 4

Given an intermediate species U and non-intermediate species X and Y such that \(Y\ne X\), if a monomial \(y^r u\), \(r\ge 0\), appears in \(x^{(\ell )}\) for some \(\ell \ge 1\), then either U reacts to X or \(\ell \ge 2\), the network contains a block of reactions

$$\begin{aligned} Y+Z_w \rightleftarrows W \rightarrow {{\widetilde{Z}}}_w + X, \end{aligned}$$
(17)

where \(Z_w\ne X\), and a monomial \(y^{t} u\) with \(t<r\) appears in \(z_w^{(i)}\) for some \(i\le \ell -2\). If, in addition, Y acts as an enzyme in all the reactions of the connected component determined by U, then \(X \in {\mathscr {S}}_U\) and the block of reactions in (17) is \(Y+Z_w \rightleftarrows W \rightarrow Y + X\), and it is contained in the connected component determined by U.

Moreover, if U does not react to X and \(\ell \) is the smallest integer such that a monomial \(y^r u\) appears in \(x^{(\ell )}\), then \(r\ge 1\), \(\ell \ge 2\), and the monomial \(y^{r-1} u \) appears in \(z_w^{(i)}\) for some \(i\le \ell -2\).

Proof

We prove the first part by induction on r. If \(r=0\), then u appears in \(x^{(\ell )}\) for some \(\ell \ge 1\); by Lemma 1 (2), this is equivalent to the fact that U reacts to X. In particular, if \(Y\ne X\) acts as an enzyme in the connected component determined by U, then \(X\in {\mathscr {S}}_U\).

Now, if \(r\ge 1\), since no monomial \(y^r u\) with \(r\ge 1\) appears in \(\dot{x}\), it follows that \(\ell \ge 2\). Then, by identity (13), the monomial \(y^ru\) can only appear in a product of derivatives of two species, and by Lemma 1 and Corollary 2, one of these species must be Y and the corresponding order of derivation must be zero.

If \(y^r u\) appears in a product \(x^{(h)} z^{(i)}\) for some \(Z\in {\mathscr {Z}}_X\) and \(h+i \le \ell -1\), as \(X\ne Y\), then \(Z=Y\) and \(y^{r-1} u\) appears in \(x^{(h)}\); then, the result follows by the inductive hypothesis.

Finally, if \(y^r u\) appears in a product \(z_{w,1}^{(h)} z_{w,2}^{(i)}\) with \(h+i \le \ell -2\) for some \(W \in {\mathscr {W}}_X\) such that \(X\notin \{ Z_{w,1}, Z_{w, 2}\}\), again by Lemma 1 and Corollary 2, we may assume that \(Y= Z_{w,1}\) and \(y^{r-1} u\) appears in \(z_{w,2}^{(i)}\). Since \(X\ne Z_{w,2}\) and W reacts to X, we must have \(Y+Z_{w,2} \rightleftarrows W \rightarrow {{\widetilde{Z}}}_w + X\) for some species \({{\widetilde{Z}}}_w\), that is, a block of reactions as in (17). By the induction hypothesis applied to the non-intermediate \(Z_{w,2} \ne Y\), if Y acts as an enzyme in the connected component determined by U, it follows that \(Z_{w,2}\in {\mathscr {S}}_U\). Then, \(Y+Z_{w,2}\) is a complex in the connected component determined by U, where Y acts as an enzyme. As \(X\ne Y\), necessarily \({{\widetilde{Z}}}_{w} =Y\) and \(X\in {\mathscr {S}}_U\).

To see that the last statement of the lemma holds, note that if U does not react to X and a monomial \(y^r u\) appears in a derivative \(x^{(\ell )}\), then \(r\ge 1\) and \(\ell \ge 2\) and, by assuming \(\ell \) minimal, the only possibility in the above reasoning is the last one. \(\square \)

Now, we are able to prove the key lemma for the proof of our main result on the identifiability of constants in a single connected component. We keep our previous notation and assumptions.

For technical reasons, we define the empty product of factors \(\alpha _i\) as \({\prod _{i=0}^{-1}}\alpha _i=1\).

Lemma 5

Given a connected component as in (14), with \(L\ge 1\), let \(1\le n\le L\) and \(0\le k \le n-1\) be fixed. If \(\ell \) is minimum such that \(y^r u_{n-k}\) is a monomial of \(s^{(\ell )}_n\) for some \(r \ge 0\), then \(\ell =2k+1\), \(r=k\) and the coefficient of \(y^k u_{n-k}\) in \(s^{(2k+1)}_n\) is

$$\begin{aligned} c_{n-k} \prod _{j=0}^{k-1} a_{n-j} \, c_{n-j}. \end{aligned}$$

Proof

For \(k= 0\), first notice that, for all \(1\le n \le L\), as \(U_n\) reacts to \(S_n\), then \(u_n\) appears in \({\dot{s}}_n\) and so, \(\ell =1\), \(r=0\), and the coefficient of \(u_{n}\) is \(c_n\), as we wanted to prove.

We continue the proof by induction on n.

If \(n=1\), the only possibility is \(k=0\), which we have already proven.

Assume now \(n\ge 2\), and let \(k\ge 1\). Suppose that a monomial \(y^r u_{n-k}\) appears in \(s_n^{(\ell )}\), with \(\ell \) minimal. As \(U_{n-k}\) does not react to \(S_n\), by Lemma 4 applied to \(U:=U_{n-k}\) and \(X:= S_n\), the network contains a block of reactions \(Y+Z_w \rightleftarrows W \rightarrow Y + S_n,\) and the monomial \(y^{r-1} u_{n-k}\) appears in \(z_w^{(i)}\) for some \(i\le \ell -2\). This block of reactions is necessarily \(Y+S_{n-1} \rightleftarrows U_n \rightarrow Y+ S_n\) and so, \(y^{r-1} u_{n-k}\) appears in \(s_{n-1}^{(i)}\) for some \(i\le \ell -2\). Moreover, by formula (13) applied to \(x=s_{n}\), the only terms contributing to the monomial \(y^r u_{n-k}\) come from products \(y s_{n-1}^{(i)}\) with \(i\le \ell -2\). Since \(y^{r-1} u_{n-k} = y^{r-1} u_{(n-1)-(k-1)}\), by the induction hypothesis, \(i\ge 2(k-1)+1= 2k-1\); then, \(\ell -2\ge 2k-1\) or, equivalently, \(\ell \ge 2k +1\).

Consider now formula (13) for \(s_n^{(2k+1)}\). The only product of derivatives where a monomial \(y^r u_{n-k}\) may appear is \(y s_{n-1}^{(2k-1)}\), since \(i\le 2k-1\) for all derivatives \(s_{n-1}^{(i)}\) involved. Then, the coefficient of \(y^r u_{n-k}\) in \(s_n^{(2k+1)}\) equals \(\gamma _{u_n, 0, 2k-1}\) multiplied by the coefficient of \(y^{r-1} u_{n-k}\) in \(s_{n-1}^{(2k-1)}\). By the induction hypothesis, a monomial \(y^{r-1}u_{n-k}\) appears with nonzero coefficient in \(s_{n-1}^{(2k-1)}\) if and only if \(r-1 = k-1\), that is \(r=k\), and the corresponding coefficient is \(c_{n-1-(k-1)}\prod _{j=0}^{k-2} a_{n-1-j} c_{n-1-j}\). To determine \(\gamma _{u_n, 0, 2k-1}\) note that, by formula (12) applied to \(u_n\), the product \(y s_{n-1}^{(2k-1)}\) appears in \(u_n^{(2k)}\) multiplied by \(a_n\) and, by formula (11), \(u_n^{(2k)}\) appears in \(s_n^{(2k+1)}\) multiplied by \(c_n\); then, \(\gamma _{u_n, 0, 2k-1} = c_n a_n\).

Summarizing, the monomial \(y^k u_{n-k}\) appears with nonzero coefficient in \(s_n^{(2k+1)}\); hence, \(\ell = 2k+1\). Moreover, it is the only monomial of the form \(y^r u_{n-k}\) effectively appearing in \(s_n^{(2k+1)}\), and its corresponding coefficient is \(c_n a_n c_{n-k}\prod _{j=0}^{k-2} a_{n-1-j}\, c_{n-1-j} = c_{n-k} \prod _{j=0}^{k-1} a_{n-j} \, c_{n-j}.\)\(\square \)
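
The statement of Lemma 5 can be checked mechanically on small instances. The following is a minimal sketch, not the authors' procedure: it assumes an isolated chain of the form (14) (no other reactions in the network), treats the rate constants as extra polynomial variables with zero time derivative, and computes total derivatives via the chain rule \(x^{(\ell )} = \sum _v \frac{\partial x^{(\ell -1)}}{\partial v}\,\dot{v}\). All function names (`chain_field`, `ddt`, `state_coeff`, `first_occurrence`, `lemma5_coefficient`) are hypothetical helpers introduced for this illustration.

```python
from collections import defaultdict

# A polynomial is a dict {monomial: int}; a monomial is a sorted tuple of
# (variable name, exponent) pairs. Rate constants a_j, b_j, c_j are ordinary
# variables that simply have no entry in the vector field (zero derivative).

def add(p, q):
    out = defaultdict(int, p)
    for m, c in q.items():
        out[m] += c
    return {m: c for m, c in out.items() if c}

def mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            exps = defaultdict(int)
            for v, e in m1 + m2:
                exps[v] += e
            out[tuple(sorted(exps.items()))] += c1 * c2
    return {m: c for m, c in out.items() if c}

def term(coef, *names):
    """coef times the product of the named variables."""
    p = {(): coef}
    for name in names:
        p = mul(p, {((name, 1),): 1})
    return p

def partial(p, v):
    """Partial derivative of p with respect to the variable v."""
    out = defaultdict(int)
    for m, c in p.items():
        exps = dict(m)
        e = exps.pop(v, 0)
        if e:
            if e > 1:
                exps[v] = e - 1
            out[tuple(sorted(exps.items()))] += c * e
    return dict(out)

def chain_field(L):
    """Mass-action right-hand sides for an isolated chain as in (14) (assumption)."""
    field = defaultdict(dict)
    for j in range(1, L + 1):
        s_in, s_out, u = f"s{j-1}", f"s{j}", f"u{j}"
        # (rate, species produced, species consumed) for the reactions of block j
        reactions = [
            (term(1, f"a{j}", "y", s_in), [u], ["y", s_in]),
            (term(1, f"b{j}", u), ["y", s_in], [u]),
            (term(1, f"c{j}", u), ["y", s_out], [u]),
        ]
        for rate, produced, consumed in reactions:
            for v in produced:
                field[v] = add(field[v], rate)
            for v in consumed:
                field[v] = add(field[v], mul(rate, term(-1)))
    return dict(field)

def ddt(p, field):
    """Total derivative of p along the vector field (chain rule)."""
    out = {}
    for v, rhs in field.items():
        out = add(out, mul(partial(p, v), rhs))
    return out

def state_coeff(p, target, params):
    """Coefficient of a state monomial, as a polynomial in the rate constants."""
    want = tuple(sorted((v, e) for v, e in target.items() if e))
    out = {}
    for m, c in p.items():
        if tuple(x for x in m if x[0] not in params) == want:
            out[tuple(x for x in m if x[0] in params)] = c
    return out

def first_occurrence(L, n, k, max_order):
    """Smallest order at which some y^r u_{n-k} occurs in s_n^{(order)}."""
    field = chain_field(L)
    params = {f"{x}{j}" for x in "abc" for j in range(1, L + 1)}
    p = {((f"s{n}", 1),): 1}
    for order in range(1, max_order + 1):
        p = ddt(p, field)
        hits = []
        for r in range(order + 1):  # state degree of s_n^{(order)} is at most order + 1
            coeff = state_coeff(p, {"y": r, f"u{n-k}": 1}, params)
            if coeff:
                hits.append((r, coeff))
        if hits:
            return order, hits
    return None, []

def lemma5_coefficient(n, k):
    """c_{n-k} * prod_{j<k} a_{n-j} c_{n-j}, as a polynomial in the constants."""
    names = [f"c{n-k}"]
    for j in range(k):
        names += [f"a{n-j}", f"c{n-j}"]
    return term(1, *names)

print(first_occurrence(2, 2, 1, 4))
```

Keeping the constants symbolic, rather than substituting numbers, lets the extracted coefficient be compared with the closed formula of the lemma as a polynomial identity.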

Remark 2

An interesting fact is that the previous lemmas also hold for networks where not all the reactions are enzymatic. By this we mean that the blocks of reactions are of the form:

$$\begin{aligned} X_1+X_2\overset{a}{\underset{b}{\rightleftarrows }}U \overset{c}{\rightarrow } X_3+X_4, \end{aligned}$$

with \(X_1\ne X_2\), \(X_3\ne X_4\) but not necessarily \(\{X_1,X_2\}\cap \{X_3,X_4\}\ne \emptyset \).

Combining Lemmas 2 and 5, we may now prove the main result of Sect. 4.1 (Proposition 2 in the main text):

Proposition 4

All the constants in a connected component as in (14) of a network satisfying the assumptions in Sect. 3 can be identified from \(s^{(\ell )}_{L}\) with \(1\le \ell \le \max \{2,2L-1\}\).

Proof

By Lemma 2, we can identify \(a_L, b_L\) and \(c_L\) from \({\dot{s}}_L\) and \(\ddot{s}_{L}\), which implies the statement of the proposition for \(L=1\).

For \(L\ge 2\), again by Lemma 2, we can also identify \(a_j\) and \( K_j = b_j +c_j\), for \(1\le j \le L-1\), from \(s_L^{(3)}\). In order to identify all the constants, we need to “separate” \(b_j\) from \(c_j\) for \(1\le j\le L-1\). We do this by identifying the constants \(c_{L-k}\) recursively, for \(k=1,\dots , L-1\), from the successive derivatives of \(s_L\).

Let \(k\ge 1\) and assume \(c_{L-j}\) has been identified, for \(0\le j<k\). By Lemma 5, the coefficient of the monomial \(y^k u_{L-k}\) in \(s^{(2k+1)}_L\) is \(c_{L-k} \prod _{j=0}^{k-1} a_{L-j} \, c_{L-j}.\) As \( a_{L-j}\) and \(c_{L-j}\) for \(0\le j \le k-1\) are known and nonzero, from this coefficient we identify \(c_{L-k}\). \(\square \)
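
The recursion in this proof can be sketched as a short routine. This is an illustration under stated assumptions, not the authors' implementation: the dictionary keys (`d1_u`, `d2_ys`, `d2_u`, `d3_yss`, `d3_us`, `odd`) and the functions `forward_coefficients` and `recover` are hypothetical names, and the coefficient values are assumed to be available exactly. The forward map encodes the coefficients stated in Lemmas 2 and 5; the inverse follows the order of the proof.

```python
from fractions import Fraction
from math import prod

def forward_coefficients(a, b, c):
    """Coefficients named in Lemmas 2 and 5; a, b, c are dicts indexed 1..L."""
    L = len(a)
    K = {j: b[j] + c[j] for j in a}
    return {
        "d1_u": c[L],                      # u_L in the first derivative of s_L
        "d2_ys": c[L] * a[L],              # y s_{L-1} in the second derivative
        "d2_u": -c[L] * K[L],              # u_L in the second derivative
        # y s_{j-1} s_{L-1} and u_j s_{L-1} in the third derivative
        "d3_yss": {j: -c[L] * a[L] * a[j] for j in range(1, L)},
        "d3_us": {j: c[L] * a[L] * K[j] for j in range(1, L)},
        # y^k u_{L-k} in the derivative of order 2k+1
        "odd": {k: c[L - k] * prod(a[L - j] * c[L - j] for j in range(k))
                for k in range(1, L)},
    }

def recover(q, L):
    """Invert forward_coefficients step by step, as in the proof."""
    c = {L: q["d1_u"]}
    a = {L: q["d2_ys"] / c[L]}
    K = {L: -q["d2_u"] / c[L]}
    for j in range(1, L):
        a[j] = -q["d3_yss"][j] / (c[L] * a[L])
        K[j] = q["d3_us"][j] / (c[L] * a[L])
    for k in range(1, L):
        # all factors below were identified in earlier steps
        known = prod(a[L - j] * c[L - j] for j in range(k))
        c[L - k] = q["odd"][k] / known
    b = {j: K[j] - c[j] for j in range(1, L + 1)}
    return a, b, c

# Usage with made-up rate constants (exact rational arithmetic).
true_a = {1: Fraction(2), 2: Fraction(3, 2), 3: Fraction(5)}
true_b = {1: Fraction(1, 3), 2: Fraction(4), 3: Fraction(7, 2)}
true_c = {1: Fraction(6), 2: Fraction(1, 5), 3: Fraction(9)}
print(recover(forward_coefficients(true_a, true_b, true_c), 3) == (true_a, true_b, true_c))
```

Exact fractions are used so that the round trip is free of rounding error; with measured data the divisions would only be approximate.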

1.2 Proofs of Sect. 4.2: Identifying the Constants in Two Connected Components from One Variable

The case of two connected components of the type

$$\begin{aligned} \begin{array}{c} Y+S_0\overset{a_1}{\underset{b_1}{\rightleftarrows }} U_1 \overset{c_1}{\rightarrow } Y+S_1\overset{a_2}{\underset{b_2}{\rightleftarrows }} U_2 \overset{c_2}{\rightarrow } \dots Y+S_{L-1}\overset{a_L}{\underset{b_L}{\rightleftarrows }} U_L \overset{c_{L}}{\rightarrow }Y+S_L,\\ {{\widetilde{Y}}} +S_L\overset{{\tilde{a}}_L}{\underset{{\tilde{b}}_L}{\rightleftarrows }} V_L \overset{{\tilde{c}}_L}{\rightarrow } {{\widetilde{Y}}}+S_{L-1}\overset{{\tilde{a}}_{L-1}}{\underset{{\tilde{b}}_{L-1}}{\rightleftarrows }} V_{L-1} \overset{{\tilde{c}}_{L-1}}{\rightarrow } \dots {{\widetilde{Y}}}+S_1\overset{{\tilde{a}}_1}{\underset{{\tilde{b}}_1}{\rightleftarrows }} V_1 \overset{{\tilde{c}}_1}{\rightarrow }{{\widetilde{Y}}}+S_0 \end{array} \end{aligned}$$
(18)

considered in the paper is handled similarly to the case of one connected component. The first result concerning this class of networks is in the spirit of Lemma 2.

Lemma 6

Given two connected components as in (18), the constants \({\tilde{a}}_L\), \({\tilde{b}}_L\), \({\tilde{c}}_L\), and \({\tilde{a}}_j, {\tilde{K}}_j:={\tilde{b}}_j + {\tilde{c}}_j\), for \(1\le j\le L-1\), can be identified from \({\dot{s}}_L\) and \(\ddot{s}_L\).

Proof

Consider the formula for \({\dot{s}}_L\) given in (15). Separating the terms corresponding to \({{\widetilde{Y}}}\in {\mathscr {Z}}_L\) and \(V_L\in {\mathscr {W}}_L\), and writing \({\mathscr {Z}}_L^{\times }:= {\mathscr {Z}}_L\backslash \{ {{\widetilde{Y}}}\}\) and \({\mathscr {W}}_L^{\times }:= {\mathscr {W}}_L\backslash \{ V_L\}\), we obtain

$$\begin{aligned} {\dot{s}}_L=-\sum _{Z\in {\mathscr {Z}}_L^{\times }} \mu _{z}s_L z+ \sum _{W\in {\mathscr {W}}_L^{\times }}\eta _{w}w -{\tilde{a}}_L {\tilde{y}} s_L + {\tilde{b}}_L v_L. \end{aligned}$$

Then, we can identify \({\tilde{a}}_L\) and \({\tilde{b}}_L\) as the coefficients (up to sign) of the monomials \({\tilde{y}} s_L\) and \(v_L\) in \({\dot{s}}_L\).

Consider now

$$\begin{aligned} \ddot{s}_L= & {} -\sum _{Z\in {\mathscr {Z}}_L^\times } \mu _{z} [{\dot{s}}_L z+s_L {\dot{z}}]+ \sum _{W\in {\mathscr {W}}_L^\times }\eta _{w} [a_w z_{w,1} z_{w,2}-K_w w] \nonumber \\&-\, {\tilde{a}}_L [\dot{{\tilde{y}}} s_L+{\tilde{y}} {\dot{s}}_L]+ {\tilde{b}}_L[{\tilde{a}}_L {\tilde{y}} s_L -{\tilde{K}}_L v_L]. \end{aligned}$$
(19)

From the coefficient of \(v_L\) in \(\ddot{s}_L\), we can identify \({\tilde{K}}_L\) and, therefore, \({\tilde{c}}_L\), since we have already identified \({\tilde{b}}_L\). The constants \({\tilde{a}}_j\) and \({\tilde{K}}_j\), for \(1\le j \le L-1\), appear in the derivative

$$\begin{aligned} \dot{{\tilde{y}}} = \sum _{1\le j \le L} (- {\tilde{a}}_j {\tilde{y}} s_{j} + {\tilde{K}}_j v_j) + \cdots , \end{aligned}$$

then, they appear in the expression (19) from the product \(\dot{{\tilde{y}}} s_L\) in the coefficients of the monomials \({\tilde{y}} s_j s_L\) and \(v_js_L\), respectively. By Assumption 2, \(S_j \notin {\mathscr {Z}}_L\) for every \(1\le j\le L-1\); hence, the monomials \({\tilde{y}} s_j s_L\) do not come from any other term in (19). Also, it is immediate that the monomials \(v_j s_L\) only come from \(\dot{{\tilde{y}}} s_L\). Then, the coefficients of \({\tilde{y}} s_j s_L\) and \(v_j s_L\) in \(\ddot{s}_L\) are \({\tilde{a}}_L {\tilde{a}}_j\) and \( -{\tilde{a}}_L {\tilde{K}}_j\), respectively, and enable us to identify \({\tilde{a}}_j\) and \({\tilde{K}}_j\), for \(1\le j \le L-1\), since \({\tilde{a}}_L \ne 0\). \(\square \)
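
As an illustration, consider \(L=2\) under the simplifying assumption (not required by the lemma) that the network consists only of the two components in (18), so that \({\mathscr {Z}}_L=\{{{\widetilde{Y}}}\}\) and \({\mathscr {W}}_L=\{U_2,V_2\}\). Then

$$\begin{aligned} {\dot{s}}_2&= c_2u_2-{\tilde{a}}_2{\tilde{y}}s_2+{\tilde{b}}_2v_2, \\ \ddot{s}_2&= c_2(a_2ys_1-K_2u_2)-{\tilde{a}}_2[\dot{{\tilde{y}}}s_2+{\tilde{y}}{\dot{s}}_2]+{\tilde{b}}_2[{\tilde{a}}_2{\tilde{y}}s_2-{\tilde{K}}_2v_2], \\ \dot{{\tilde{y}}}&= -{\tilde{a}}_1{\tilde{y}}s_1+{\tilde{K}}_1v_1-{\tilde{a}}_2{\tilde{y}}s_2+{\tilde{K}}_2v_2, \\ -{\tilde{a}}_2\,\dot{{\tilde{y}}}\,s_2&= {\tilde{a}}_2{\tilde{a}}_1\,{\tilde{y}}s_1s_2-{\tilde{a}}_2{\tilde{K}}_1\,v_1s_2+{\tilde{a}}_2^{2}\,{\tilde{y}}s_2^{2}-{\tilde{a}}_2{\tilde{K}}_2\,v_2s_2. \end{aligned}$$

Hence \({\tilde{a}}_2\) and \({\tilde{b}}_2\) are read from \({\tilde{y}}s_2\) and \(v_2\) in \({\dot{s}}_2\), the coefficient \(-{\tilde{b}}_2{\tilde{K}}_2\) of \(v_2\) in \(\ddot{s}_2\) gives \({\tilde{K}}_2\), and the coefficients \({\tilde{a}}_2{\tilde{a}}_1\) of \({\tilde{y}}s_1s_2\) and \(-{\tilde{a}}_2{\tilde{K}}_1\) of \(v_1s_2\) give \({\tilde{a}}_1\) and \({\tilde{K}}_1\).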

In order to establish a statement extending Lemma 5 to this new setting, we first need a technical lemma (a suitable analogue of Lemma 4):

Lemma 7

Given an intermediate species V and a non-intermediate species Y that acts as an enzyme in a connected component where the set of substrates and products is \({\mathscr {S}}_V\), if X is a non-intermediate species such that \(X\in {\mathscr {S}}^{(\alpha )}\) for some \(\alpha \ge 1\) and \(Y,{{\widetilde{Y}}} \notin {\mathscr {S}}^{(\alpha )}\), where \({{\widetilde{Y}}}\) is the enzyme in the connected component determined by V, and the monomial \(y^r v\) appears in \(x^{(\ell )}\) for some \(r\ge 0\) and \(\ell \ge 1\), then \(X\in {\mathscr {S}}_V\).

Moreover, either V reacts to X or \(r\ge 1\), \(\ell \ge 2\) and a monomial \(y^{t} v \) with \(t<r\) appears in \(z_w^{(i)}\), for some \(i\le \ell - 2\), for a species \(Z_w\) involved in a block of reactions \(Y+Z_w \rightleftarrows W \rightarrow Y + X.\) If \(r\ge 1\) and \(\ell \) is minimal, then \(t=r-1\).

Proof

First, note that \(X\ne Y\) and \(X\ne {{\widetilde{Y}}}\), because of the assumption that \(X\in {\mathscr {S}}^{(\alpha )}\) and \(Y, {{\widetilde{Y}}} \notin {\mathscr {S}}^{(\alpha )}\). We proceed by induction on \(r\in {\mathbb {N}}_0\).

If \(r=0\), by Lemma 1 (2), V reacts to X. As X is not the enzyme \({{\widetilde{Y}}}\), then \(X\in {\mathscr {S}}_V\).

For \(r\ge 1\), since \(X\ne Y\), Lemma 4 states that either V reacts to X (which we have already considered) or the network contains a block of reactions \(Y+Z_w \rightleftarrows W \rightarrow {{\widetilde{Z}}}_w + X\), where \(Z_w \ne X\), and a monomial \(y^t v\) with \(t<r\) appears in \(z_w^{(i)}\) for some \(i\le \ell -2\) (furthermore, \(t= r-1\) if \(\ell \) is minimal). In the latter case, \({{\widetilde{Z}}}_w\) acts as an enzyme in the connected component determined by W and \(X\in {\mathscr {S}}_W\), which implies that \({\mathscr {S}}_W \subset {\mathscr {S}}^{(\alpha )}\). If \({{\widetilde{Z}}}_w = Z_w\), then \(Y \in {\mathscr {S}}_W \), contradicting the assumption that \(Y \notin {\mathscr {S}}^{(\alpha )}\); therefore, \({{\widetilde{Z}}}_w = Y\), and \(Z_w \in {\mathscr {S}}^{(\alpha )}\). By the induction hypothesis, \(Z_w\in {\mathscr {S}}_V\). As \({\mathscr {S}}_V\) is the set of substrates and products in a connected component where Y acts as an enzyme, the complex \(Y+Z_{w}\) lies in that component, and so, \(X\in {\mathscr {S}}_V\). \(\square \)

We can now prove the result that plays the key role in the recursive argument identifying all the constants in suitable pairs of connected components:

Lemma 8

Given two connected components as in (18) with \(L\ge 1\), let \(1\le n\le L\) and \(0\le k \le n-1\) be fixed. If \(\ell \) is minimum such that \(y^r v_{n-k}\) is a monomial of \(s^{(\ell )}_n\) for some \(r \ge 0\), then \(\ell =2k+1\), \(r=k\) and the coefficient of \(y^k v_{n-k}\) in \(s_n^{(2k+1)}\) is

$$\begin{aligned} {\tilde{b}}_{n-k}\prod _{j=0}^{k-1} a_{n-j}c_{n-j} . \end{aligned}$$

Proof

For \(k=0\), and all \(1\le n \le L\), \(v_{n}\) appears in \({\dot{s}}_n\) (since \(V_n\) reacts to \(S_n\)) with coefficient \({\tilde{b}}_{n}\) and so, \(r=0\) and \(\ell =1\). We now proceed by induction on n.

If \(n=1\), the only possibility is \(k=0\), which has already been considered.

For \(n\ge 2\), let \(k\ge 1\). By Assumption 2, there exists \(\alpha \ge 1\) such that \(S_j \in {\mathscr {S}}^{(\alpha )}\) for every \(0\le j \le L\), and \(Y, {{\widetilde{Y}}} \notin {\mathscr {S}}^{(\alpha )}\). If the monomial \(y^rv_{n-k}\) appears in a derivative of \(s_n\) and \(\ell \) is the minimum derivation order where it appears, as \(V_{n-k}\) does not react to \(S_n\), by Lemma 7, \(r\ge 1\), \(\ell \ge 2\) and the monomial \(y^{r-1}v_{n-k}\) appears in \(z_w^{(i)}\), for some \(i\le \ell -2\), for a species \(Z_w\) in a block of reactions \(Y+Z_w \rightleftarrows W \rightarrow Y + S_n\). Then, \(W= U_n\) and \(Z_w = S_{n-1}\); so, \(y^{r-1} v_{n-k}\) appears in \(s_{n-1}^{(i)}\) for some \(i\le \ell -2\). By the induction hypothesis, we have that \(i\ge 2k-1\); therefore, \(\ell \ge 2k+1\).

Now, following mutatis mutandis the proof of Lemma 5, we deduce that the coefficient of the monomial \(y^k v_{n-k}\) in \(s_n^{(2k+1)}\) is equal to \(c_n a_n\) multiplied by the coefficient of \(y^{k-1} v_{n-k}\) in \(s_{n-1}^{(2k-1)}\), and we conclude by applying the induction hypothesis. \(\square \)
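As an illustration (a worked instance obtained by unrolling the recursion above, not an additional result), for \(n\ge 3\) the cases \(k=1\) and \(k=2\) read as follows, where \([m]\,p\) denotes the coefficient of the monomial m in the polynomial p (a notation introduced only for this example):

$$\begin{aligned} k=1:&\quad [y\, v_{n-1}]\, s_n^{(3)} = c_n a_n \cdot [v_{n-1}]\, {\dot{s}}_{n-1} = {\tilde{b}}_{n-1}\, a_n c_n, \\ k=2:&\quad [y^2 v_{n-2}]\, s_n^{(5)} = c_n a_n \cdot [y\, v_{n-2}]\, s_{n-1}^{(3)} = {\tilde{b}}_{n-2}\, a_n c_n\, a_{n-1} c_{n-1}. \end{aligned}$$

Both agree with the general formula \({\tilde{b}}_{n-k}\prod _{j=0}^{k-1} a_{n-j}c_{n-j}\) and with the derivation order \(2k+1\).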

As in the previous subsection, from Lemmas 6 and 8 we deduce the following identifiability result for two connected components, which extends Proposition 4 and constitutes the main result in Sect. 4.2 (Proposition 3 in the main text):

Proposition 5

Given a chemical reaction network satisfying the assumptions in Sect. 3, all the constants in two connected components as in (18) can be identified from \(s^{(\ell )}_{L}\) with \(1\le \ell \le \max \{2,2L-1\}\).

Proof

The result holds for \(L=1\), since by Lemmas 2 and 6, we can identify \(a_L, b_L, c_L, {\tilde{a}}_L, {\tilde{b}}_L\) and \({\tilde{c}}_L\) from \({\dot{s}}_L\) and \(\ddot{s}_L\).

Assume now \(L\ge 2\). By Proposition 4, all the constants \(a_j, b_j\) and \(c_j\), for \(1\le j \le L\), can be identified from \(s_L^{(\ell )}\) with \(1\le \ell \le \max \{2, 2L-1\}\). It remains to show that we can also identify \({\tilde{a}}_j, {\tilde{b}}_j\) and \({\tilde{c}}_j\), for \(1\le j \le L\).

By Lemma 6, the constants \({\tilde{a}}_L\), \({\tilde{b}}_L\), \({\tilde{c}}_L\) and \({\tilde{a}}_j\) and \({\tilde{K}}_j={\tilde{b}}_j + {\tilde{c}}_j\), for \(1\le j \le L-1\), are identifiable from \({\dot{s}}_L\) and \(\ddot{s}_L\). We just need to “separate” \({\tilde{b}}_j\) and \({\tilde{c}}_j\) for \(1\le j \le L-1\). Due to Lemma 8, this can be done by identifying \({\tilde{b}}_{L-k}\) recursively, for \(k=1,\dots , L-1\), from the coefficients of the monomials \(y^kv_{L-k}\) in \(s_L^{(2k+1)}\). \(\square \)
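The separation step can be sketched in Python. This is only an illustration under stated assumptions: the constants \(a_j\), \(c_j\) and \({\tilde{K}}_j\) are taken as already identified and nonzero where needed, the coefficients of \(y^kv_{L-k}\) in \(s_L^{(2k+1)}\) are taken as given exact numbers, and the function name and dictionary layout are hypothetical.

```python
from fractions import Fraction

def separate_b_c(a, c, K_tilde, coeffs):
    """Split K~_j = b~_j + c~_j into b~_j and c~_j for 1 <= j <= L-1.

    a, c     : dicts j -> a_j, c_j for 1 <= j <= L (assumed known, nonzero).
    K_tilde  : dict j -> K~_j for 1 <= j <= L-1.
    coeffs   : dict k -> coefficient of y^k v_{L-k} in the (2k+1)-th
               derivative of s_L, for 1 <= k <= L-1.
    """
    L = max(a)
    b_tilde, c_tilde = {}, {}
    prod = Fraction(1)
    for k in range(1, L):
        # Extend prod to prod_{j=0}^{k-1} a_{L-j} c_{L-j} (new factor: j = k-1).
        prod *= Fraction(a[L - k + 1]) * Fraction(c[L - k + 1])
        j = L - k
        b_tilde[j] = Fraction(coeffs[k]) / prod      # Lemma 8 coefficient
        c_tilde[j] = Fraction(K_tilde[j]) - b_tilde[j]
    return b_tilde, c_tilde
```

The running product mirrors the factor \(\prod _{j=0}^{k-1} a_{L-j}c_{L-j}\) of Lemma 8, so each \({\tilde{b}}_{L-k}\) is obtained by a single division.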

1.3 Proofs of Sect. 5: Identifying the Cascade

The following two auxiliary technical lemmas will be used in subsequent arguments concerning the identifiability in the cascade.

Lemma 9

If \({\prod _{j=1}^{M}}z_j\), with \(Z_j\) non-intermediate species for all j, is a monomial of \(x^{(\ell )}\) for a non-intermediate species \(X\in {\mathscr {S}}^{(\alpha )}\) and \(\ell \ge 1\), then there exist \(1\le j_1, j_2\le M\) such that \(Z_{j_1}\in {\mathscr {S}}^{(\alpha )}\) and \(Z_{j_2} \in {\mathscr {S}}^{(\beta )}\) for some \(\beta \) such that the network contains a complex \(X+Z\) with \(Z\in {\mathscr {S}}^{(\beta )}\).

Proof

For \(\ell =1 \) the result is true, since the only products of non-intermediate species appearing in \({\dot{x}}\) are of the form xz for a species Z that reacts with X. Assume the lemma holds for derivatives of order \(1\le h\le \ell -1\) of non-intermediate species.

By equation (13), if the monomial appears in \(x^{(h)}z^{(i)}\) for some \(h+i \le \ell -1\) and \(h>0\), by Lemma 1(1), there is a monomial \(\prod _{l=1}^{M'} z_{j_l}\) in \(x^{(h)}\) with \(1\le h\le \ell -1\), and the induction hypothesis gives the result. Assume now \(h=0\) and \(x= z_M\). If the monomial appears in \(x z^{(i)}\) with \(i\le \ell -1\), either \(i=0\), and the monomial is xz with X and Z reacting together, which implies the statement, or \(1\le i \le \ell -1\) and the monomial \(\prod _{j=1}^{M-1} z_j\) appears in \(z^{(i)}\). If the latter holds, \(Z_{j_1} = X \in {\mathscr {S}}^{(\alpha )}\) and, by the induction hypothesis applied to \(Z \in {\mathscr {S}}^{(\beta )}\) for some \(\beta \) and \(1\le i \le \ell -2\), there exists \(j_2\) such that \(Z_{j_2} \in {\mathscr {S}}^{(\beta )}\).

If the product appears in \(z_{w,1}^{(h)}z_{w,2}^{(i)}\), for some \(h+ i\le \ell -2\), coming from a block of reactions \(Z_{w,1}+Z_{w,2}\rightleftarrows W \rightarrow Z_{w,3}+X\) with \(X\notin \{Z_{w,1},Z_{w,2}\}\), then the enzyme is \(Z_{w,3}\) and, assuming \(Z_{w,1}=Z_{w,3}\), it follows that \(Z_{w,2}\) and X lie in \({\mathscr {S}}_W\). Since \(X\in {\mathscr {S}}^{(\alpha )}\), then \({\mathscr {S}}_W\subset {\mathscr {S}}^{(\alpha )}\); in particular, \(Z_{w,2}\in {\mathscr {S}}^{(\alpha )}\). On the other hand, \(Z_{w,1}= Z_{w,3} \in {\mathscr {S}}^{(\beta )}\) for some \(\beta \ne \alpha \). If \(i=0\), there exists \(1\le j_1\le M\) such that \(Z_{j_1} = Z_{w,2} \in {\mathscr {S}}^{(\alpha )}\) and, if \(i\ge 1\), by the induction hypothesis applied to \(Z_{w,2}\in {\mathscr {S}}^{(\alpha )}\) and the factor of the monomial appearing in \(z_{w,2}^{(i)}\), there exists \(j_1\) such that \(Z_{j_1}\in {\mathscr {S}}^{(\alpha )}\). Similarly, if \(h=0\), there exists \(1\le j_2\le M\) such that \( Z_{j_2} = Z_{w,1} \in {\mathscr {S}}^{(\beta )}\) and, if \(h\ge 1\), by the induction hypothesis applied to \(Z_{w,1}\in {\mathscr {S}}^{(\beta )}\), there exists \(j_2\) such that \(Z_{j_2}\in {\mathscr {S}}^{(\beta )}\). \(\square \)

Lemma 10

If \(u\prod _{i=1}^{M}z_i\), with \(M\ge 1\), is a monomial of \(x^{(\ell )}\) for U an intermediate species and \(X, Z_i\) non-intermediate species for all i, either there exist \(1\le i_1<i_2\le M\) such that \(Z_{i_1}\) reacts with \(Z_{i_2}\) or there exist \(1\le i_0\le M\) and a species V that reacts with \(Z_{i_0}\) such that U reacts to a complex containing V.

Proof

Note that there are no monomials of this type in \({\dot{x}}\); thus, \(\ell \ge 2\). For \(\ell =2\), the only monomials in \(\ddot{x}\) that are multiples of an intermediate and non-intermediates are:

  • uz, for an intermediate species U that reacts to X and a non-intermediate Z that reacts with X. In this case, the statement holds with \(V=X\) and \(Z_{i_0} = Z\);

  • ux, for an intermediate species U that reacts to a non-intermediate species Z reacting with X. The statement holds with \(V=Z\) and \(Z_{i_0}=X\).

For \(\ell >2\), recalling that \(x^{(\ell )} = \sum \nolimits _{v} \frac{\partial {x^{(\ell -1)}}}{\partial v} \dot{v}\) (where the sum runs over all variables v representing non-intermediate or intermediate species), it follows that \(u \prod _{i=1}^{M}z_i\) is a monomial in \(\frac{\partial {x^{(\ell -1)}}}{\partial v} \dot{v}\) for some variable v. Every monomial in \(\dot{v}\) is either a single intermediate or a product of two non-intermediate species in a reaction. In the second case, the result follows. Now, if \(\prod _{i=1}^{M}z_i\) is a monomial of \(\frac{\partial {x^{(\ell -1)}}}{\partial v}\) and u is a monomial of \(\dot{v}\), we have that \(v\prod _{i=1}^{M}z_i\) is a monomial of \(x^{(\ell -1)}\) and one of the following possibilities holds for V:

  • \(V= U\); then, \(u\prod _{i=1}^{M}z_i\) is a monomial of \(x^{(\ell -1)}\) and the result follows by induction.

  • V is a non-intermediate species such that U reacts to a complex containing V. By Lemma 3, there are two variables in \(v\prod _{i=1}^{M}z_i\) that react together. If neither of these variables is v, there exist \(1\le i_1, i_2\le M\) such that \(Z_{i_1}\) and \(Z_{i_2}\) react together; otherwise, there exists \(1\le i_0\le M\) such that V reacts with \(Z_{i_0}\). \(\square \)

We follow the notation introduced in Sect. 5, more precisely, that of the general cascade (6). We also set \(S_{0,L_0}:=E\).

For \(1\le n\le N\), we have

$$\begin{aligned} {\dot{s}}_{n,L_n}= & {} c_{n,L_n} u_{n,L_n} -{\tilde{a}}_{n,L_n} s_{n,L_n} f_n+{\tilde{b}}_{n,L_n}v_{n,L_n}\\&- \sum _{j=1}^{L_{n+1}} a_{n+1,j} s_{n, L_n} s_{n+1, j-1}+ \sum _{j=1}^{L_{n+1}} K_{n+1,j} u_{n+1,j} \end{aligned}$$

and, for \(n=N\), only the first three terms appear in the derivative, i.e., \(a_{N+1,j}=0\) and \(K_{N+1,j}=0\) for all j.

For \(\ell \ge 2\), by Eq. (13):

$$\begin{aligned} \begin{array}{rcl} s^{(\ell )}_{n,L_n} &{}=&{} \displaystyle \sum _{h+i \le \ell -1} \beta _{f_n,h,i}\, s_{n, L_n}^{(h)} f_n^{(i)} + \displaystyle \sum _{j=1}^{L_{n+1}}\sum _{h+i \le \ell -1} \beta _{s_{n+1, j-1},h,i}\, s_{n, L_n}^{(h)} s_{n+1, j-1}^{(i)} \\ &{}&{} + \displaystyle \sum _{h+i\le \ell -2} \gamma _{u_{n, L_n},h,i}\, s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}^{(i)}+\delta _{u_{n, L_n}} u_{n, L_n} + \delta _{v_{n,L_n}} v_{n, L_n} + \displaystyle \sum _{j=1}^{L_{n+1}} \delta _{u_{n+1,j}} u_{n+1, j} \end{array} \end{aligned}$$
(20)

where

$$\begin{aligned} \beta _{f_n,h,i}= & {} {\left\{ \begin{array}{ll} - \left( {\begin{array}{c}\ell -1\\ h\end{array}}\right) {\tilde{a}}_{n, L_n} &{} \text{ if } h+i= \ell -1\\ {\tilde{a}}_{n, L_n}{\tilde{b}}_{n, L_n} \left( {\begin{array}{c}h+i\\ h\end{array}}\right) (-{\tilde{K}}_{n, L_n})^{\ell -2-h-i}&{} \text{ if } h+i\le \ell -2\end{array}\right. }, \\ \beta _{s_{n+1, j-1},h,i}= & {} {\left\{ \begin{array}{ll} - \left( {\begin{array}{c}\ell -1\\ h\end{array}}\right) a_{n+1, j} &{} \text{ if } h+i= \ell -1\\ - \left( {\begin{array}{c}h+i\\ h\end{array}}\right) a_{n+1, j} (-K_{n+1, j})^{\ell -1-h-i}&{} \text{ if } h+i\le \ell -2\end{array}\right. }, \\ \gamma _{u_{n,L_n},h,i}= & {} c_{n,L_n} a_{n, L_n} \left( {\begin{array}{c}h+i\\ h\end{array}}\right) (-K_{n, L_n})^{\ell -2-h-i} \quad \hbox {for } 0\le h+i\le \ell -2,\\ \delta _{u_{n, L_n}}= & {} c_{n, L_n} (-K_{n, L_n})^{\ell -1}, \quad \delta _{v_{n,L_n}} = {\tilde{b}}_{n, L_n} (-{\tilde{K}}_{n, L_n})^{\ell -1}, \\ \delta _{u_{n+1, j}}= & {} (-1)^{\ell -1} K_{n+1, j}^{\ell } \quad \hbox {for } 1\le j\le L_{n+1}. \end{aligned}$$
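These coefficients are straightforward to evaluate numerically. The following Python sketch transcribes three of them under the assumption that the rate constants are given as plain numbers; the function and parameter names are hypothetical and are not part of the original text.

```python
from math import comb

def gamma_u(c, a, K, ell, h, i):
    """gamma_{u_{n,L_n},h,i}: defined for 0 <= h + i <= ell - 2."""
    if h < 0 or i < 0 or h + i > ell - 2:
        raise ValueError("need h, i >= 0 and h + i <= ell - 2")
    return c * a * comb(h + i, h) * (-K) ** (ell - 2 - h - i)

def beta_f(a_t, b_t, K_t, ell, h, i):
    """beta_{f_n,h,i}: two cases, h + i = ell - 1 and h + i <= ell - 2."""
    if h < 0 or i < 0 or h + i > ell - 1:
        raise ValueError("need h, i >= 0 and h + i <= ell - 1")
    if h + i == ell - 1:
        return -comb(ell - 1, h) * a_t
    return a_t * b_t * comb(h + i, h) * (-K_t) ** (ell - 2 - h - i)

def delta_u(c, K, ell):
    """delta_{u_{n,L_n}}: coefficient of u_{n,L_n} in the ell-th derivative."""
    return c * (-K) ** (ell - 1)
```

With \(i=0\) and \(h=\ell -2\), `gamma_u` reduces to the product \(c_{n,L_n}a_{n,L_n}\), and with \(h=\ell -3\) it picks up one extra factor \(-K_{n,L_n}\); for instance, `gamma_u(2, 3, 5, 6, 4, 0)` gives 6 and `gamma_u(2, 3, 5, 7, 4, 0)` gives -30.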

According to formula (20), every monomial of \(s_{n, L_n}^{(\ell )}\) is either an intermediate species that appears in \({\dot{s}}_{n, L_n}\), or it appears as a monomial in one of the products:

  (a) \( s_{n, L_n}^{(h)} f_n^{(i)}\) for \(h+i\le \ell -1\),

  (b) \(s_{n, L_n}^{(h)} s_{n+1, j-1}^{(i)} \) \((1\le j \le L_{n+1})\) for \(h+i \le \ell -1\),

  (c) \( s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}^{(i)}\) for \(h+i\le \ell -2\).

The following three technical lemmas describe how the coefficients of some distinguished monomials change recursively after differentiation. These results allow us to obtain Proposition 6 and hence, the identifiability result about the cascade stated in Sect. 5 (Theorem 2 in the main text).

Lemma 11

Let \({\mathcal {M}}= \prod _{j=1}^M z_j\) be a monomial of \(s_{n-1, L_{n-1}}^{(\ell _0)}\) which is not a monomial of any derivative of \(s_{n-1, L_{n-1}}\) of lower order and only involves variables corresponding to species in \({\mathscr {S}}^{(k)}\), \({\mathscr {S}}^{(N+k)}\), for \(1\le k \le n-1\), and \({\mathscr {S}}^{(2N+1)}\). Assume that:

  • \({\mathcal {M}}\) is square free and does not involve two disjoint pairs of variables corresponding to species that react together;

  • if \(s_{n-1, L_{n-1}}\) divides \({\mathcal {M}}\), then for every \(1\le j_1, j_2\le M\) such that \(Z_{j_1}\) and \(Z_{j_2}\) react together, \(Z_{j_1} = S_{n-1, L_{n-1}}\) or \(Z_{j_2} = S_{n-1, L_{n-1}}\).

Then, \({\widehat{{\mathcal {M}}}} := s_{n, L_n-1} {\mathcal {M}}\) is a monomial of \(s_{n, L_n}^{(\ell _0+2)}\) and of no lower-order derivative of \(s_{n, L_n}\). Moreover, if \(C_{\mathcal {M}}\) is the coefficient of \({\mathcal {M}}\) in \(s_{n-1, L_{n-1}}^{(\ell _0)}\), the coefficient of \({\widehat{{\mathcal {M}}}}\) in \(s_{n, L_n}^{(\ell _0+2)}\) is \(c_{n, L_n} a_{n, L_n} C_{\mathcal {M}}\).

Proof

Assume \(\widehat{{\mathcal {M}}}\) is a monomial of \(s_{n, L_n}^{(\ell )}\) for some \(\ell \ge 1\). Then, it is a monomial of one of the products in cases (a), (b) or (c) stated above. We will show that it can only appear in case (c) with \(i=0\).

In cases (a) or (b), we must have \(i>0\), since the variables \(f_n\) and \(s_{n+1, j-1}\) do not divide \(\widehat{{\mathcal {M}}}\). Then, a factor of \(\widehat{{\mathcal {M}}}\) is a monomial of a derivative \(f_n^{(i)}\) or \(s_{n+1, j-1}^{(i)}\) of positive order and, by Lemma 9, it contains a variable in \({\mathscr {S}}^{(N+n)}\) or \({\mathscr {S}}^{(n+1)}\), contradicting the assumption on the variables involved in \({\mathcal {M}}\). It follows that \(\widehat{{\mathcal {M}}}\) is a monomial in a product in (c).

Assume that \(i\ge 1\). If \(h=0\), then \(s_{n-1, L_{n-1}}\) divides \({\mathcal {M}}\) and \(\widetilde{{\mathcal {M}}}:= s_{n, L_n-1} \cdot \frac{{\mathcal {M}}}{s_{n-1, L_{n-1}}}\) is a monomial of \(s_{n, L_n-1}^{(i)}\). Due to Lemma 3, \(\widetilde{{\mathcal {M}}}\) contains two variables corresponding to species that react together. By the second assumption of the lemma and the fact that \(S_{n, L_n-1}\) only reacts with \(S_{n-1, L_{n-1}}\) or \(F_n\) (and \(f_n\) does not divide \(\widehat{{\mathcal {M}}}\)), one of these variables must be \(s_{n-1, L_{n-1}}\); but \(s_{n-1, L_{n-1}}\) does not divide \(\widetilde{{\mathcal {M}}}\), since \({\mathcal {M}}\) is square free. If \(h\ge 1\) and \(\widehat{{\mathcal {M}}} = {\mathcal {M}}_1 \cdot {\mathcal {M}}_2\), where \({\mathcal {M}}_1\) is a monomial in \(s_{n-1, L_{n-1}}^{(h)}\) and \({\mathcal {M}}_2\) is a monomial in \(s_{n, L_n-1}^{(i)}\), by Lemma 3, each of the monomials \({\mathcal {M}}_1\) and \({\mathcal {M}}_2\) contains two variables corresponding to species that react together. One of these variables must be \(s_{n, L_n-1}\), because \({\mathcal {M}}\) does not contain two pairs of variables corresponding to species that react together. Since \(S_{n, L_n-1}\) only reacts with \(S_{n-1, L_{n-1}}\) or \(F_n\), this is only possible in the case where \(s_{n-1, L_{n-1}}\) divides \({\mathcal {M}}\), but then \(s_{n-1, L_{n-1}}\) does not divide \(\frac{{\mathcal {M}}}{s_{n-1, L_{n-1}}}\) and this quotient does not contain two variables corresponding to species that react together.

Then, necessarily \(i=0\) and \({\mathcal {M}}\) is a monomial of \(s_{n-1, L_{n-1}}^{(h)}\) for \(h\le \ell -2\). This implies that \(\ell \ge \ell _0+2\).

Finally, let us show that \(\widehat{{\mathcal {M}}}\) effectively appears in \(s_{n, L_n}^{(\ell _0 +2)}\) and compute its coefficient. Considering formula (20) for \(\ell = \ell _0+2\), by our previous arguments, we have that \(\widehat{{\mathcal {M}}} = s_{n, L_n-1} {\mathcal {M}}\) can only arise from a product \(s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}\) when \({\mathcal {M}}\) is a monomial of \(s_{n-1, L_{n-1}}^{(h)}\) and \(h\le \ell _0\). By the minimality of \(\ell _0\), the only possibility is that \(h= \ell _0\); moreover, if \(C_{\mathcal {M}}\) is the coefficient of \({\mathcal {M}}\) in \(s_{n-1, L_{n-1}}^{(\ell _0)}\), the coefficient of \(\widehat{{\mathcal {M}}}\) in \(s_{n, L_n}^{(\ell _0 +2)}\) is \(\gamma _{u_{n, L_n}, \ell _0, 0 } C_{\mathcal {M}}= c_{n, L_n} a_{n, L_n} C_{\mathcal {M}}\). \(\square \)

Lemma 12

Let \(u\, {\mathcal {M}}\) be a monomial of \(s_{n-1, L_{n-1}}^{(\ell _0)}\) which is not a monomial of any derivative of \(s_{n-1, L_{n-1}}\) of lower order, where U is an intermediate species and \({\mathcal {M}}\) only involves variables corresponding to species in \({\mathscr {S}}^{(k)}\), for \(1\le k \le n-1\), and \({\mathscr {S}}^{(2N+1)}\). Assume that \({\mathcal {M}}\) does not involve two variables corresponding to species that react together and \(s_{n-1, L_{n-1}}\) does not divide \({\mathcal {M}}\).

Then, \({\widehat{{\mathcal {M}}}} := s_{n, L_n-1} u\, {\mathcal {M}}\) is a monomial of \(s_{n, L_n}^{(\ell _0+2)}\) and of no lower-order derivative of \(s_{n, L_n}\). Moreover, if \(C_{\mathcal {M}}\) is the coefficient of \(u\, {\mathcal {M}}\) in \(s_{n-1, L_{n-1}}^{(\ell _0)}\), the coefficient of \({\widehat{{\mathcal {M}}}}\) in \(s_{n, L_n}^{(\ell _0+2)}\) is \(c_{n, L_n} a_{n, L_n} C_{\mathcal {M}}\). In addition, if \({{\widetilde{C}}}_{\mathcal {M}}\) is the coefficient of \(u\, {\mathcal {M}}\) in \(s_{n-1, L_{n-1}}^{(\ell _0+1)}\), the coefficient of \({\widehat{{\mathcal {M}}}}\) in \(s_{n, L_n}^{(\ell _0+3)}\) is \(c_{n, L_n} a_{n, L_n} ({{\widetilde{C}}}_{\mathcal {M}}- K_{n, L_n} C_{\mathcal {M}})\).

Proof

Assume \(\widehat{{\mathcal {M}}}\) is a monomial of \(s_{n, L_n}^{(\ell )}\) and consider the three cases (a), (b) and (c) listed above. We will show that it can only appear in case (c) with \(i=0\).

If \(\widehat{{\mathcal {M}}}\) appears from a product of type (a), (b), or (c) with \(h\ge 1\) and \(i\ge 1\), there is a factor of \(\widehat{{\mathcal {M}}}\) not involving intermediate species which is a monomial of a derivative of positive order of a non-intermediate species and, by Lemma 3, this factor involves two variables of species that react together. But \({\mathcal {M}}\) does not contain two variables of species reacting together; in addition, the only species in \({\mathscr {S}}^{(k)}\), for \(1\le k\le n-1\), that reacts with \(S_{n, L_n-1}\) is \(S_{n-1, L_{n-1}}\), and \(s_{n-1, L_{n-1}}\) does not divide \({\mathcal {M}}\).

On the other hand, \({\widehat{{\mathcal {M}}}}\) cannot appear from cases (a) or (b) with \(h=0\) or \(i=0\), since none of the variables \(s_{n, L_n}\), \(f_n\) or \(s_{n+1, j-1}\), for \(1\le j \le L_{n+1}\), divides \({\widehat{{\mathcal {M}}}}\). Finally, the assumption that \(s_{n-1,L_{n-1}}\) does not divide \({\mathcal {M}}\) implies that the monomial cannot appear in case (c) with \(h=0\).

We conclude that \(\widehat{{\mathcal {M}}}\) only appears as a monomial in \(s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}\) for \(1\le h\le \ell -2\), that is, when \(u{\mathcal {M}}\) is a monomial of \(s_{n-1, L_{n-1}}^{(h)}\). Then, \(\ell \ge \ell _0+2\).

The computation of the coefficient of \({\widehat{{\mathcal {M}}}}\) in \(s_{n, L_n}^{(\ell _0+2)}\) follows as in the proof of Lemma 11.

Finally, let us obtain the coefficient of \({\widehat{{\mathcal {M}}}}\) in \(s_{n, L_n}^{(\ell _0+3)}\). As shown before, in formula (20) the monomial \({\widehat{{\mathcal {M}}}}\) may only appear from terms of the form (c) with \(i=0\) and \(1\le h\le \ell _0+1\) such that \(u{\mathcal {M}}\) is a monomial of \(s_{n-1, L_{n-1}}^{(h)}\). By the minimality of \(\ell _0\), the only possible values of h are \(\ell _0\) and \(\ell _0+1\); thus, taking \(\ell =\ell _0+3\) in formula (20), the corresponding coefficient is \(\gamma _{u_{n, L_n}, \ell _0+1, 0} {{\widetilde{C}}}_{\mathcal {M}}+ \gamma _{u_{n, L_n}, \ell _0,0} C_{\mathcal {M}}= c_{n, L_n} a_{n, L_n} {{\widetilde{C}}}_{\mathcal {M}}+ c_{n, L_n} a_{n, L_n} (-K_{n,L_n})C_{\mathcal {M}}= c_{n, L_n} a_{n, L_n} ( {{\widetilde{C}}}_{\mathcal {M}}-K_{n,L_n}C_{\mathcal {M}})\). \(\square \)

Lemma 13

For \(1\le l\le L_m-1\),

$$\begin{aligned} {\mathcal {M}}_{n, s_{m,l}}= s_{m,l }\, f_m\, s_{m,L_m} \prod \limits _{i=m+1}^n s_{i,L_i-1} \quad \hbox { and }\quad {\mathcal {M}}_{n, v_{m,l}}= v_{m,l}\, s_{m,L_m}\prod \limits _{i=m+1}^n s_{i,L_i-1} \end{aligned}$$

are monomials of \(s_{n,L_n}^{(2(n-m+1))}\) for every \(n\ge m+1\), and they are not monomials of any derivative of \(s_{n, L_n}\) of lower order. The corresponding coefficients are, respectively,

$$\begin{aligned} {\tilde{a}}_{m,l}\,{\tilde{a}}_{m,L_m}\prod _{i=m+1}^n c_{i,L_i} a_{i, L_i}\quad \hbox { and }\quad {\tilde{K}}_{m,l}\,{\tilde{a}}_{m,L_m} \prod _{i=m+1}^n c_{i,L_i} a_{i,L_i}. \end{aligned}$$

Proof

For \(n=m+1\), we must show that, for every \(1\le l\le L_{m}-1\),

$$\begin{aligned} {\mathcal {M}}_{m+1,s_{m,l}} = s_{m,l} f_m s_{m, L_m} s_{m+1, L_{m+1}-1} \hbox { and } {\mathcal {M}}_{m+1,v_{m,l}} = v_{m,l} s_{m, L_m} s_{m+1, L_{m+1}-1} \end{aligned}$$

are monomials of \(s_{m+1, L_{m+1}}^{(4)}\) and of no lower-order derivative of \(s_{m+1, L_{m+1}}\).

It is easy to see that none of the required monomials appears in \({\dot{s}}_{m+1, L_{m+1}}\) or \(\ddot{s}_{m+1, L_{m+1}}\), because these derivatives do not contain monomials of degree 4 and the monomials that are multiples of intermediates have degree at most 2 (see the proof of Lemma 10).

Consider now the expression of \(s_{m+1, L_{m+1}}^{(\ell )}\) following (20), with \(\ell \ge 3\).

The monomials \({\mathcal {M}}_{m+1,s_{m,l}}\) and \({\mathcal {M}}_{m+1,v_{m,l}}\) do not arise from products of type (a) or (b) with \(h=0\) or \(i=0\), since they are not multiples of \(s_{m+1, L_{m+1}}\), \(f_{m+1}\) or \(s_{m+2, j-1}\). Taking into account that every monomial in a first-order derivative of a non-intermediate is either a multiple of the non-intermediate or an intermediate that reacts to it, we have that the monomials do not appear either from products of type (a) or (b) with \(h=1\) or \(i=1\). As \(h+i\le \ell -1\) in products of type (a) or (b), we deduce that \({\mathcal {M}}_{m+1,s_{m,l}}\) and \({\mathcal {M}}_{m+1,v_{m,l}}\) do not appear in these products for \(\ell =3\) nor \(\ell =4\).

In products of type (c), if \(h+i\le 1\), there are no monomials of degree 4, and those that are multiples of an intermediate have degree at most 2.

We conclude that \({\mathcal {M}}_{m+1,s_{m,l}}\) and \({\mathcal {M}}_{m+1,v_{m,l}}\) are not monomials of \(s_{m+1, L_{m+1}}^{(3)}\) and that they may only appear in \(s_{m+1, L_{m+1}}^{(4)}\) from products of type (c) with \(h+i =2\).

  • \(h=0\), \(i=2\): By looking at the expansion of \(s_{m+1, L_{m+1}-1}^{(2)}\), we deduce that \(s_{m,l} f_m s_{m+1, L_{m+1}-1}\) and \(v_{m,l} s_{m+1, L_{m+1}-1} \), for \(l<L_m\), are not monomials of this derivative.

  • \(h=i=1\): The monomials \({\mathcal {M}}_{m+1,s_{m,l}}\) do not appear in this product because the only variable involved that reacts with \(S_{m+1, L_{m+1}-1}\) is \(S_{m, L_m}\) and the monomials \(s_{m,l} f_m\) do not appear in \({\dot{s}}_{m, L_m}\) for \(l<L_m\). The monomials \({\mathcal {M}}_{m+1,v_{m,l}}\) do not appear since \(V_{m,l}\) does not react to \(S_{m, L_m}\) or \(S_{m+1, L_{m+1}-1}\) for \(l<L_m\).

  • \(h=2\), \(i=0\): As in the proof of Lemma 6, it follows that \(s_{m,l} f_m s_{m, L_m}\) and \(v_{m,l} s_{m, L_m}\) are monomials of \(s_{m, L_m}^{(2)}\) with respective coefficients \({\tilde{a}}_{m,l}{\tilde{a}}_{m,L_m} \) and \({\tilde{K}}_{m,l}{\tilde{a}}_{m,L_m}\).

Therefore, \({\mathcal {M}}_{m+1,s_{m,l}}\) and \({\mathcal {M}}_{m+1,v_{m,l}}\) effectively appear in \(s_{m+1, L_{m+1}}^{(4)}\); more precisely, they arise from the product \(\gamma _{u_{m+1, L_{m+1}}, 2, 0}\, s_{m, L_m}^{(2)} s_{m+1, L_{m+1}-1}\). The corresponding coefficients can be obtained from the fact that \(\gamma _{u_{m+1, L_{m+1}}, 2, 0} = c_{m+1, L_{m+1}}\, a_{m+1, L_{m+1}}\).
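The index bookkeeping in the case \(n=m+1\) can be checked mechanically. The sketch below enumerates the pairs \((h,i)\) allowed by formula (20); it only counts indices and does not inspect monomials, and the helper name `pairs` is hypothetical.

```python
def pairs(max_sum, excluded=()):
    """All (h, i) with h, i >= 0, h + i <= max_sum, and h, i not in excluded."""
    return [(h, i)
            for h in range(max_sum + 1)
            for i in range(max_sum + 1 - h)
            if h not in excluded and i not in excluded]

# Types (a) and (b): h + i <= ell - 1, and the proof rules out h or i in {0, 1}.
# No admissible pair is left for ell = 3 or ell = 4.
for ell in (3, 4):
    assert pairs(ell - 1, excluded=(0, 1)) == []

# Type (c) at ell = 4: h + i <= 2. Pairs with h + i <= 1 are ruled out by the
# degree argument, leaving exactly the three cases treated in the bullets.
type_c = [p for p in pairs(2) if sum(p) == 2]
print(type_c)  # [(0, 2), (1, 1), (2, 0)]
```

The same helper shows that products of type (a) or (b) could first contribute at \(\ell =5\), through the single pair \((2,2)\).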

Let \(n>m+1\) and assume the monomials \({\mathcal {M}}_{n-1, s_{m,l}}\) and \({\mathcal {M}}_{n-1, v_{m,l}}\) appear in \(s_{n-1, L_{n-1}}^{(2(n-m))}\) and in no derivative of \(s_{n-1, L_{n-1}}\) of a lower order.

Let \(1\le l \le L_m-1\). Consider first \({\mathcal {M}}_{n, s_{m,l}}\), which is a product of non-intermediates. If it appears in a derivative \(s_{n, L_n}^{(\ell )}\), it arises from a product in case (a), (b) or (c) listed previously.

Since \({\mathcal {M}}_{n, s_{m,l}}\) does not contain any variable corresponding to a species in \({\mathscr {S}}^{(N+n)}= \{ F_n\}\) or \({\mathscr {S}}^{(n+1)}= \{ S_{n+1, j}, 0\le j \le L_{n+1}\}\), by Lemma 9, it cannot appear from cases (a) or (b). Then, it is a monomial in a product \(s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}^{(i)}\) for \(h+i \le \ell -2\). If \(i>0\), the factor \({\mathcal {M}}_1\) of \({\mathcal {M}}_{n, s_{m,l}}\) which is a monomial in \(s_{n, L_n-1}^{(i)}\) contains a variable in \({\mathscr {S}}^{(n)}\), namely \(s_{n, L_n-1}\), and another variable in a set \({\mathscr {S}}^{(k)}\) that contains a species reacting with \(S_{n,L_n-1}\). Since the only species that react with \(S_{n,L_n-1}\) are \(S_{n-1, L_{n-1}}\) and \(F_n\), it follows that \({\mathcal {M}}_1\) contains a variable in \({\mathscr {S}}^{(n-1)}\). Now, \({\mathcal {M}}_{n, s_{m,l}}/{\mathcal {M}}_1\) is a monomial in \(s_{n-1, L_{n-1}}^{(h)}\); therefore, it also contains a variable in \({\mathscr {S}}^{(n-1)}\). But, since \(n>m+1\), the only factor of \({\mathcal {M}}_{n, s_{m,l}}\) in \({\mathscr {S}}^{(n-1)}\) is \(s_{n-1, L_{n-1}-1}\), leading to a contradiction. We conclude that \(i=0\) and \({\mathcal {M}}_{n, s_{m,l}}\) appears as a monomial in \(s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}\), namely \({\mathcal {M}}_{n, s_{m,l}}= {\mathcal {M}}_{n-1, s_{m,l}} s_{n, L_n-1}\) with \({\mathcal {M}}_{n-1, s_{m,l}}\) a monomial in \(s_{n-1, L_{n-1}}^{(h)}\) for \(h\le \ell -2\). Then \(\ell \ge 2(n-m+1)\).

Now, consider \({\mathcal {M}}_{n, v_{m,l}}\) and assume it is a monomial of \(s_{n, L_n}^{(\ell )}\). As, for \(n>m+1\), none of the variables \(s_{n, L_n}\), \(s_{n+1, j-1}\), \(f_n\) or \(s_{n-1, L_{n-1}}\) divides \({\mathcal {M}}_{n, v_{m,l}}\), this monomial cannot arise from cases (a) or (b) with either \(i=0\) or \(h=0\), nor from (c) with \(h=0\). If it arises from cases (a), (b) or (c) with \(h\ge 1\) and \(i\ge 1\), then \({\mathcal {M}}_{n, v_{m,l}} = {\mathcal {M}}_1 {\mathcal {M}}_2\) with \({\mathcal {M}}_1\) and \({\mathcal {M}}_2\) monomials appearing in derivatives of positive order of non-intermediate species. Assume \(v_{m,l}\) divides \({\mathcal {M}}_1\). Then, \({\mathcal {M}}_2\) is a product of non-intermediates; by Lemma 3, it contains the only two variables of \({\mathcal {M}}_{n, v_{m,l}}\), \(s_{m, L_m}\) and \(s_{m+1, L_{m+1}-1}\), corresponding to species that react together. On the other hand, \({\mathcal {M}}_1 = v_{m, l} {\mathcal {M}}\), where \({\mathcal {M}}\) is not constant since \(V_{m,l}\) does not react to \(S_{n,L_n}\), \(F_n\), \(S_{n+1, j-1}\), \(S_{n-1,L_{n-1}}\) nor \(S_{n,L_n-1}\) (so, \(v_{m,l}\) is not a monomial in a derivative of \(s_{n,L_n}\), \(f_n\), \(s_{n+1, j-1}\), \(s_{n-1,L_{n-1}}\) nor \(s_{n,L_n-1}\)). By Lemma 10, taking into account that \(V_{m,l}\) is only involved in the reactions \(F_m + S_{m, l} \rightleftarrows V_{m,l} \rightarrow F_m + S_{m, l-1}\), we have that \({\mathcal {M}}\) contains either two variables corresponding to species that react together or it contains one variable that reacts with \(F_m\), \(S_{m, l-1}\) or \(S_{m, l}\). But none of these possibilities happen.

We conclude that \({\mathcal {M}}_{n, v_{m,l}}\) arises from (c) with \(i=0\) and it appears in \(s_{n-1, L_{n-1}}^{(h)} s_{n, L_n-1}\) for \(h\le \ell -2\), that is, \({\mathcal {M}}_{n-1, v_{m,l}}\) is a monomial of \(s_{n-1, L_{n-1}}^{(h)}\) for \(h\le \ell -2\). Then \(\ell \ge 2(n-m+1)\).

The fact that the monomials effectively appear in \(s_{n, L_n}^{(2(n-m+1))}\) and the computation of their coefficients follow similarly as in the proof of Lemma 11. \(\square \)
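In terms of identification, Lemma 13 says that \({\tilde{a}}_{m,l}\) and \({\tilde{K}}_{m,l}\) are obtained by dividing two coefficients of \(s_{n,L_n}^{(2(n-m+1))}\) by a common, already identified factor. A minimal Python sketch, assuming the coefficients are taken exactly as stated in the lemma and that \({\tilde{a}}_{m,L_m}\) and the products \(c_{i,L_i}a_{i,L_i}\) are known and nonzero (all function and argument names are hypothetical):

```python
from fractions import Fraction

def lemma13_order(n, m):
    """Derivation order 2(n - m + 1) at which the monomials first appear."""
    if n < m + 1:
        raise ValueError("Lemma 13 is stated for n >= m + 1")
    return 2 * (n - m + 1)

def lemma13_recover(coef_s, coef_v, a_tilde_m_Lm, ca_products):
    """Recover (a~_{m,l}, K~_{m,l}) from the two coefficients of Lemma 13.

    ca_products lists c_{i,L_i} * a_{i,L_i} for i = m+1, ..., n.
    """
    scale = Fraction(a_tilde_m_Lm)
    for p in ca_products:
        scale *= Fraction(p)
    if scale == 0:
        raise ValueError("common factor must be nonzero")
    return Fraction(coef_s) / scale, Fraction(coef_v) / scale
```

For instance, with \(n=m+2\) the relevant derivative has order 6, and `lemma13_recover(210, 330, 2, [3, 5])` returns the pair (7, 11).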

From the previous lemmas and the results for the case of a single layer proved in Proposition 3, we obtain the following proposition that leads to our identifiability result for the cascade (see Table 3). The highlighted constant in each case is the one we will identify from the corresponding coefficient.

Proposition 6

For network (6), for every \(n\ge m\), the following monomials \({\mathcal {M}}\) appear in \(s_{n, L_n}^{(\ell )}\) with coefficient \(\pm C_{\mathcal {M}}\) for the stated value \(\ell \), and they do not appear in any derivative of \(s_{n, L_n}\) of lower order:

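[Figure d: table of the monomials \({\mathcal {M}}\) (items 1–8), with their coefficients \(C_{\mathcal {M}}\) and derivation orders \(\ell \); image not reproduced.]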

Proof

Fix m with \(1\le m\le N\). We prove the proposition inductively for \(n\ge m\).

The case \(n=m\) is considered in Sect. 4.2.

Let \(n\ge m+1\). Items 5 and 6 are proved in Lemma 13. For the remaining monomials, assuming the statement holds for \(n-1\), we deduce that it is also true for n by applying Lemma 11 (for items 1, 4 and 7) and Lemma 12 (for items 2, 3, 8 and the last statement of the proposition).

We present a complete proof in the first two cases. The induction step for the monomials of the remaining items follows similarly.

  1. Consider \({\mathcal {M}}_1 =f_m \,s_{m,L_m}\prod \nolimits _{i=m+1}^{n-1} s_{i,L_i-1}\). By the inductive assumption, this monomial appears in \(s_{n-1,L_{n-1}}^{(2(n-1-m)+1)}\) with coefficient \(C_{{\mathcal {M}}_1} = {\tilde{a}}_{m,L_m}\prod \nolimits _{i=m+1}^{n-1} c_{i,L_i} a_{i, L_i}\), and in no derivative of \(s_{n-1, L_{n-1}}\) of a lower order. Let us show that \({\mathcal {M}}_1\) satisfies the assumptions of Lemma 11. First, note that \({\mathcal {M}}_1\) is square free and only involves variables corresponding to species in \({\mathscr {S}}^{(k)}\), for \(m\le k \le n-1\), and \({\mathscr {S}}^{(N+m)}\). In addition, since two species \(S_{i,L_{i}-1}\), \(S_{j,L_{j}-1}\), for \(m\le i,j\le n-1\), do not react together and \(F_m\) does not react with \(S_{i, L_i-1}\) for \(m+1\le i \le n-1\), then \({\mathcal {M}}_1\) does not contain two disjoint pairs of variables corresponding to species that react together. Finally, we have that \(s_{n-1, L_{n-1}}\) divides \({\mathcal {M}}_1\) only when \(n= m+1\), and in this case, \({\mathcal {M}}_{1} = f_{n-1} s_{n-1, L_{n-1}}\), which clearly satisfies the assumptions of the lemma. Therefore, by Lemma 11, we conclude that \(s_{n, L_n-1} {\mathcal {M}}_1 = f_m \,s_{m,L_m}\prod \nolimits _{i=m+1}^{n} s_{i,L_i-1}\) is a monomial of \(s_{n,L_n}^{(2(n-1-m)+1+2)}= s_{n,L_n}^{(2(n-m)+1)}\) and of no lower-order derivative of \(s_{n, L_n}\), and its corresponding coefficient is \(c_{n, L_n}a_{n,L_n}C_{{\mathcal {M}}_1} = {\tilde{a}}_{m,L_m}\prod \nolimits _{i=m+1}^{n} c_{i,L_i} a_{i, L_i}\).

  2.

    For a fixed \(k\), with \(0\le k \le L_m-1\), the monomial \({\mathcal {M}}\) can be written as \({\mathcal {M}}= s_{n, L_{n}-1}u{\mathcal {M}}_2\), where \(u:=u_{m, L_m-k}\) is a variable corresponding to an intermediate species and \({\mathcal {M}}_2:= s_{m-1, L_{m-1}}^k \prod \nolimits _{i=m+1}^{n-1} s_{i,L_i-1}\). By the inductive assumption, \(u{\mathcal {M}}_2\) is a monomial of \(s_{n-1, L_{n-1}}^{(2(n-1-m)+2k+1)}\), with coefficient \(C_{{\mathcal {M}}_2} = c_{m,L_m-k}\Big (\prod \nolimits _{j=0}^{k-1}a_{m, L_m-j} \, c_{m,L_m-j}\Big )\Big ( \prod \nolimits _{i=m+1}^{n-1} c_{i,L_i} a_{i, L_i}\Big )\), and it does not appear in any lower-order derivative of \(s_{n-1, L_{n-1}}\). Let us show that \({\mathcal {M}}_2\) satisfies the assumptions of Lemma 12. It is clear that \({\mathcal {M}}_2\) only involves variables in \({\mathscr {S}}^{(i)}\) for \(i \le n-1\) and \({\mathscr {S}}^{(2N+1)}\) and that \(s_{n-1, L_{n-1}}\) does not divide \({\mathcal {M}}_2\), since \(n\ne m\). Also, since two species \(S_{i,L_{i}-1}\) and \(S_{j,L_{j}-1}\), for \(m\le i,j\le n-1\), do not react together and \(S_{m-1, L_{m-1}}\) does not react with \(S_{i, L_i-1}\) for \(i\ge m+1\), it follows that \({\mathcal {M}}_2\) does not involve two variables corresponding to species that react together. Then, by Lemma 12, we conclude that \(s_{n, L_n-1} u{\mathcal {M}}_2 = s_{m-1, L_{m-1}}^k u_{m, L_m-k} \prod \nolimits _{i=m+1}^n s_{i,L_i-1}\) is a monomial of \(s_{n, L_n}^{(2(n-1-m)+2k+1+2)} = s_{n, L_n}^{(2(n-m)+2k+1)}\) and of no lower-order derivative of \(s_{n, L_n}\), and its coefficient is \(c_{n,L_n}a_{n,L_n} C_{{\mathcal {M}}_2} = c_{m,L_m-k}\Big (\prod \nolimits _{j=0}^{k-1}a_{m, L_m-j} \, c_{m,L_m-j}\Big )\Big ( \prod \nolimits _{i=m+1}^{n} c_{i,L_i} a_{i, L_i}\Big )\). \(\square \)
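
The two induction steps above share one bookkeeping pattern, which can be summarized as a pair of recursions. This is only a sketch: the symbols \(d_n\) and \(C_n\) are shorthand introduced here and are not notation from the paper. Write \(d_n\) for the lowest order of a derivative of \(s_{n,L_n}\) in which the tracked monomial appears, and \(C_n\) for its coefficient there. Each application of Lemma 11 or Lemma 12 multiplies the monomial by \(s_{n,L_n-1}\), raises the order by 2, and multiplies the coefficient by \(c_{n,L_n}a_{n,L_n}\):

\[ d_n = d_{n-1} + 2, \qquad C_n = c_{n,L_n}\,a_{n,L_n}\,C_{n-1}, \qquad n\ge m+1. \]

Unrolling down to level \(m\) gives

\[ d_n = d_m + 2(n-m), \qquad C_n = C_m \prod \nolimits _{i=m+1}^{n} c_{i,L_i}\,a_{i,L_i}. \]

Assuming the base values are those that the stated formulas give at \(n=m\), item 1 has \(d_m = 1\) and \(C_m = {\tilde{a}}_{m,L_m}\), so \(d_n = 2(n-m)+1\). Item 2 has \(d_m = 2k+1\) and \(C_m = c_{m,L_m-k}\prod \nolimits _{j=0}^{k-1}a_{m, L_m-j} \, c_{m,L_m-j}\), so \(d_n = 2(n-m)+2k+1\). Both agree with the orders and coefficients obtained in the proof.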


Cite this article

Jeronimo, G., Pérez Millán, M. & Solernó, P. Identifiability from a Few Species for a Class of Biochemical Reaction Networks. Bull Math Biol 81, 2133–2175 (2019). https://doi.org/10.1007/s11538-019-00594-0


  • DOI: https://doi.org/10.1007/s11538-019-00594-0
