# Symbolic Versus Numerical Computation and Visualization of Parameter Regions for Multistationarity of Biological Networks

## Abstract

We investigate models of the mitogen-activated protein kinases (MAPK) network, with the aim of determining where in parameter space there exist multiple positive steady states. We build on recent progress which combines various symbolic computation methods for mixed systems of equalities and inequalities. We demonstrate that those techniques benefit tremendously from a newly implemented graph theoretical symbolic preprocessing method. We compare computation times and quality of results of numerical continuation methods with our symbolic approach before and after the application of our preprocessing.

## 1 Introduction

The mathematical modelling of intra-cellular biological processes has been using nonlinear ordinary differential equations since the early days of mathematical biophysics in the 1940s and 50s [28]. A standard modelling choice for cellular circuitry is to use chemical reactions with mass action law kinetics, leading to polynomial differential equations. Rational function kinetics (for instance Michaelis-Menten kinetics) can generally be decomposed into several mass action steps. An important property of biological systems is their multistationarity, which means having multiple stable steady states. Multistationarity is instrumental to cellular memory and cell differentiation during development or regeneration of multicellular organisms and is also used by micro-organisms in survival strategies. It is thus important to determine the parameter values for which a biochemical model is multistationary. With mass action reactions, testing for multiple steady states boils down to counting real positive solutions of algebraic systems.
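As a small illustration of how mass action kinetics leads to polynomial equations, consider the textbook enzymatic scheme \(E + S \rightleftharpoons ES \rightarrow E + P\) with rate constants \(k_1\), \(k_{-1}\), \(k_2\). This scheme and its notation are our own choice for illustration; it is not one of the models benchmarked in this paper. Under the mass action law the concentrations satisfy

\[
\begin{aligned}
\dot{x}_S &= -k_1 x_E x_S + k_{-1} x_{ES},\\
\dot{x}_E &= -k_1 x_E x_S + (k_{-1}+k_2)\, x_{ES},\\
\dot{x}_{ES} &= k_1 x_E x_S - (k_{-1}+k_2)\, x_{ES},\\
\dot{x}_P &= k_2 x_{ES}.
\end{aligned}
\]

All right hand sides are polynomials in the concentrations. Moreover \(\dot{x}_E + \dot{x}_{ES} = 0\), so \(x_E + x_{ES} = c\) for a constant \(c\): a linear conservation law that introduces one additional parameter.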

The models benchmarked in this paper concern intracellular signaling pathways. These pathways transmit information about the cell environment by inducing cascades of protein modifications (phosphorylation) all the way from the plasma membrane via the cytosol to genes in the cell nucleus. Multistationarity of signaling usually occurs as a result of activation of upstream signaling proteins by downstream components [2]. A different mechanism for producing multistationarity in signaling pathways was proposed by Kholodenko [26]. In this mechanism multistationarity is caused by multiple phosphorylation/dephosphorylation cycles that share enzymes. A simple, two-step phosphorylation/dephosphorylation cycle is capable of ultrasensitivity, a form of all-or-nothing response with no multiple steady states (Goldbeter–Koshland mechanism). In multiple phosphorylation/dephosphorylation cycles, enzyme sharing provides competitive interactions and positive feedback that ultimately lead to multistationarity [23, 26].

Our study is complementary to works applying numerical methods to ordinary differential equation models used for biology applications. Gross et al. [18] used polynomial homotopy continuation methods for global parameter estimation of mass action models. Bifurcations and multistationarity of signaling cascades were studied with numerical methods based on the Jacobian matrix [30]. Other symbolic approaches to multistationarity either propose necessary conditions or work for particular networks [8, 9, 20, 27].

Our work here follows [5], where it was demonstrated that determination of multistationarity of an 11-dimensional model of a mitogen-activated protein kinases (MAPK) cascade can be achieved by currently available symbolic methods when numeric values are known for all but potentially one parameter. We show that the symbolic methods used in [5], viz. real triangularization and cylindrical algebraic decomposition, and also polynomial homotopy continuation methods, benefit tremendously from a graph theoretical symbolic preprocessing method. This method has been sketched by Grigoriev et al. [17] and has been used for a “hand computation,” but had not been implemented before. For our experiments we use the model already investigated in [5] and a higher dimensional model of the MAPK cascade.

## 2 The Systems for the Case Studies

For our investigations we use models of the MAPK cascade that can be found in the Biomodels database^{1} as numbers 26 and 28 [24]. We refer to those models as Biomod-26 and Biomod-28, respectively.

### 2.1 Biomod-26

The steady state problem for the MAPK cascade can now be formulated as a real algebraic problem as follows. We replace the left hand sides of all equations in (1) with 0 and substitute the values from (2). This together with (3) yields a system of parametric polynomial equations with polynomials in \(\mathbb {Z}[k_{17},k_{18},k_{19}][x_1,\dots ,x_{11}]\). Since all entities in our model are strictly positive, we add to our system positivity conditions \(k_{17}>0\), \(k_{18}>0\), \(k_{19}>0\) and \(x_1>0\), ..., \(x_{11}>0\). In terms of first-order logic the conjunction over our equations and inequalities yields a quantifier-free Tarski formula.

### 2.2 Biomod-28

## 3 Graph-Theoretical Symbolic Preprocessing

The complexity, primarily in terms of dimension, of polynomial systems obtained with steady-state approximations of biological models plus conservation laws is comparatively high for the application of symbolic methods. It is therefore highly relevant for the success of such methods to identify and exploit particular structural properties of the input. Our models have remarkably low total degrees with many linear monomials after some substitutions for rate constants. This suggests preprocessing with what is essentially Gaussian elimination, in the sense of solving single suitable equations with respect to some variable and substituting the corresponding solution into the system.

Generalizing this idea to situations where linear variables have parametric coefficients in the other variables requires, in general, a parametric variant of Gaussian elimination, which replaces the input system with a finite case distinction with respect to the vanishing of certain coefficients and one reduced system for each case. With Biomod-26 and Biomod-28 considered here it turns out that the positivity assumptions on the variables are strong enough to effectively guarantee the non-vanishing of all relevant coefficients so that case distinctions are never necessary. On the other hand, those positivity conditions establish an apparent obstacle, because we are formally not dealing with a parametric system of linear equations but with a parametric linear programming problem. However, here the theory of real quantifier elimination by virtual substitution tells us that it is sufficient that the inequality constraints play a passive role. Those constraints must be considered when substituting Gauss solutions from the equations, but otherwise can be ignored [22, 25].
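To make the role of the positivity conditions concrete, consider a hypothetical equation of the kind that arises here (the equation and its symbols are chosen for illustration only and are not taken from Biomod-26 or Biomod-28):

\[
(k_1 x_2 + k_3)\, x_1 - k_2 x_3 = 0, \qquad k_1, k_2, k_3, x_1, x_2, x_3 > 0 .
\]

The equation is linear in \(x_1\) with the parametric coefficient \(k_1 x_2 + k_3\). In general one would have to distinguish the case \(k_1 x_2 + k_3 = 0\), but positivity of \(k_1\), \(k_3\) and \(x_2\) rules it out, so we may solve without a case distinction:

\[
x_1 = \frac{k_2 x_3}{k_1 x_2 + k_3} .
\]

Substituting this solution into the constraint \(x_1 > 0\) gives \(k_2 x_3 / (k_1 x_2 + k_3) > 0\), which already follows from the remaining positivity conditions. The constraint on \(x_1\) is thus passive in the sense described above and can be dropped after the substitution.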

*maximum independent set*, which is the complement of a *minimum vertex cover*. Figure 1 shows that graph for Biomod-26, where \(\{x_4,x_5\}\) is a minimum vertex cover, and all other variables can be linearly eliminated. Similarly, for Biomod-28 we find \(\{x_5,x_6\}\) as a minimum vertex cover. Recall that minimum vertex cover is one of Karp’s 21 classical NP-complete problems [21]. However, our instances considered here and instances to be expected from other biological models are so small that the use of existing approximation algorithms [16] appears unnecessary. We have used real quantifier elimination, which did not consume measurable CPU time; alternatively one could use integer linear programming or SAT-solving.
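For instances of this size even exhaustive search is adequate. The following Python sketch finds a minimum vertex cover by brute force; it is not the method used above (which was real quantifier elimination), and the toy graph is hypothetical rather than the graph of Biomod-26. We assume here that an edge joins two variables that occur together in some monomial.

```python
from itertools import combinations

def minimum_vertex_cover(vertices, edges):
    """Return one smallest set of vertices touching every edge (brute force)."""
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return set(vertices)

# Hypothetical toy graph, NOT the graph of Biomod-26. Assumption: an edge
# joins two variables that are multiplied together in some monomial.
vertices = ["x1", "x2", "x3", "x4", "x5", "x6"]
edges = [("x1", "x4"), ("x2", "x4"), ("x3", "x5"), ("x4", "x5"), ("x5", "x6")]

cover = minimum_vertex_cover(vertices, edges)
# The complement of a minimum vertex cover is a maximum independent set:
# no two of these variables share a monomial.
eliminable = sorted(set(vertices) - cover)
print(sorted(cover), eliminable)
```

The search is exponential in the number of variables, which is acceptable only because the graphs in question are tiny.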

It is a most remarkable fact that a significant number of biological models in the databases have that property of loosely connected variables. This phenomenon resembles the well-known *community structure* of propositional satisfiability problems, which has been identified as one of the key structural reasons for the impressive success of state-of-the-art CDCL-based SAT solvers [15].

## 4 Determination of Multiple Steady States

We aim to identify via grid sampling regions of parameter space where multistationarity occurs. Our focus is on the identification of regions with multiple positive real solutions for the parameters introduced with the conservation laws. We will encounter one or three such solutions and allow ourselves for biological reasons to assume monostability or bistability, respectively. Furthermore, a change in the number of solutions between one and three is indicative of a saddle-node bifurcation between a monostable and a bistable case. A mathematically rigorous treatment of stability would, possibly symbolically, analyze the eigenvalues of the Jacobian of the respective polynomial vector field. We consider two different approaches: first a polynomial homotopy continuation method implemented in Bertini, and second a combination of symbolic computation methods implemented in Maple. We compare the approaches with respect to performance and quality of results for both the reduced and the unreduced systems.

### 4.1 Numerical Approach

We use the homotopy solver Bertini [1] in its standard configuration to compute complex roots. We parse the output of Bertini using Python, and determine numerically which of the complex roots are real and positive, using a threshold of \(10^{-6}\) for positivity. Computations are done in Python with Bertini embedded.
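The filtering step can be sketched as follows. This is a minimal illustration and not the script used for the experiments: the function name and the sample data are hypothetical, parsing of Bertini output is omitted, and applying the same threshold to the imaginary parts is our assumption (the text above specifies the threshold for positivity only).

```python
def count_positive_real(solutions, tol=1e-6):
    """Count solutions whose coordinates are all numerically real and positive.

    `solutions` is an iterable of tuples of complex numbers, one tuple per root.
    Assumption: the same tolerance is used for |Im z| and for positivity.
    """
    count = 0
    for point in solutions:
        is_real = all(abs(z.imag) < tol for z in point)
        is_positive = all(z.real > tol for z in point)
        if is_real and is_positive:
            count += 1
    return count

# Hypothetical roots of a system in two variables.
solutions = [
    (complex(1.5, 0.0), complex(0.2, 1e-12)),  # real and positive
    (complex(2.0, 0.0), complex(-0.3, 0.0)),   # real, but one coordinate negative
    (complex(0.7, 0.4), complex(0.7, -0.4)),   # genuinely complex
    (complex(3.0, 0.0), complex(1e-9, 0.0)),   # second coordinate below threshold
]
print(count_positive_real(solutions))
```

A root whose coordinate lies within the tolerance of zero, as in the last sample, is discarded, which is one way in which a numerical count can differ from the exact one.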

For Biomod-26 we produced the two plots in Fig. 2 using the original system and the two in Fig. 3 using the reduced system. The sampling range for \(k_{19}\) was from 200 to 1000 by 50. In the left plots the sampling range for \(k_{17}\) is from 80 to 200 by 10 with \(k_{18}\) fixed at 50. In the right plots the sampling range for \(k_{18}\) is from 5 to 75 by 5 with \(k_{17}\) fixed at 100. We see two regions forming according to the number of fixed points: yellow discs indicate one fixed point and blue boxes three. The diamonds indicate numerical errors where zero (red) or two (green) fixed points were identified. We analyse these further in Sect. 4.3.
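The sampling grids just described can be written down directly. The following Python sketch only enumerates the sample points (the helper name is ours) and counts them; each point would then be passed to the solver.

```python
def inclusive_range(start, stop, step):
    """Integer samples start, start + step, ..., up to and including stop."""
    return list(range(start, stop + 1, step))

k19_values = inclusive_range(200, 1000, 50)

# Left plots: k17 varies, k18 is fixed at 50. Points are (k17, k18, k19).
left_grid = [(k17, 50, k19)
             for k17 in inclusive_range(80, 200, 10)
             for k19 in k19_values]

# Right plots: k18 varies, k17 is fixed at 100.
right_grid = [(100, k18, k19)
              for k18 in inclusive_range(5, 75, 5)
              for k19 in k19_values]

print(len(left_grid), len(right_grid), len(left_grid) + len(right_grid))
# prints: 221 255 476
```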

For Biomod-28 we produced the two plots in Fig. 5 using the original system. The sampling range for \(k_{30}\) was from 100 to 1600 by 100. In the left plots the sampling range for \(k_{28}\) is from 40 to 160 by 10 with \(k_{29}\) fixed at 180. In the right plots the sampling range for \(k_{29}\) is from 120 to 240 by 10 with \(k_{28}\) fixed to 100. The colours and shapes indicate the number of fixed points as before. For the reduced system Bertini (wrongly) could not find any roots (not even complex ones) for any of the parameter settings. The situation did not change when going from adaptive precision to a very high fixed precision. However, we have not attempted more sophisticated techniques like providing user homotopies. We analyse these results further in Sect. 4.3.

### 4.2 Symbolic Approach

Our next approach will still use grid sampling, but each sample point will undergo a symbolic computation. The result will still be an approximate identification of the region (since the sampling will be finite) but the results at those sample points will be guaranteed free of numerical errors. The computations follow the strategy introduced in [5, Sect. 2.1.2]. This combines tools from the Regular Chains Library^{2} available for use in Maple. Regular chains are the triangular decompositions of systems of polynomial equations (triangular in terms of the variables in each polynomial). Highly efficient methods for working in complex space have been developed based on these (see [29] for a survey).

We make use of recent work by Chen et al. [6] which adapts these tools to the real analogue: semi-algebraic systems. They describe algorithms to decompose any real polynomial system into finitely many regular semi-algebraic systems: both directly and by computation of components by dimension. The latter (the so called *lazy* variant) was key to solving the 1-parameter MAPK problem in [5]. However, for the zero dimensional computations of this paper there is only one solution component and so no savings from lazy computations.

For a given system and sample point we apply the real triangularization (RT) on the quantifier-free formula (as described at the end of Sect. 2.1: a quantifier-free conjunction of equalities and inequalities) evaluated with the parameter estimates and sample point values. This produces a simplified system in several senses. First, as guaranteed by the algorithm, the output is triangular according to a variable ordering. So there is a univariate component, then a bivariate component introducing one more variable and so on. Secondly, for all the MAPK models we have studied so far, all but the final (univariate) one of these equations have been linear in their main variable. This thus allows for easy back substitution. Thirdly, most of the positivity conditions are implied by the output rather than being an explicit part of it, in which case a simpler sub-system can be solved and back substitution performed instantly.

**Biomod-26.** For the original version of Biomod-26 the output of RT was a component consisting of 11 equations and a single inequality. The equations were in ascending main variable according to the provided ordering (same as the labelling). All but the final equation are linear in their main variable, with the final equation being univariate and of degree 6 in \(x_1\). The output of the triangularization requires that this variable be positive, \(x_1>0\), with the positivity of the other variables implied by solutions to the system. So to proceed we must find the positive real roots of the degree 6 univariate polynomial in \(x_1\): counting these will imply the number of real positive solutions of the parent system. We do this using the root isolation tools in the Regular Chains Library. This whole process was performed iteratively for the same sampling regime as Bertini used to produce Fig. 4.
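Counting positive real roots of a univariate polynomial can be done exactly. The following Python sketch illustrates this with a Sturm sequence over the rationals; note that this is a swapped-in textbook technique, not the root isolation of the Regular Chains Library used in our computations, and the degree 6 polynomial is a hypothetical example rather than one obtained from Biomod-26.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zeros; coefficients are listed highest degree first."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(a, b):
    """Remainder of a divided by b (b has a nonzero leading coefficient)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a = a[1:]  # the leading term has been cancelled
    return trim(a) if a else [Fraction(0)]

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def sturm_chain(p):
    """p, p', then negated remainders until the remainder vanishes."""
    chain = [p, derivative(p)]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not any(r):
            return chain
        chain.append([-c for c in r])

def variations(values):
    """Number of sign changes in a sequence, ignoring zeros."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_positive_roots(coeffs):
    """Number of distinct real roots in (0, +infinity), computed exactly."""
    p = trim([Fraction(c) for c in coeffs])
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]  # divide out x so that p(0) != 0
    if len(p) <= 1:
        return 0
    chain = sturm_chain(p)
    at_zero = [q[-1] for q in chain]     # value at 0 is the constant term
    at_infinity = [q[0] for q in chain]  # sign at +infinity: leading coefficient
    return variations(at_zero) - variations(at_infinity)

# Hypothetical example: (x - 1)(x - 2)(x - 3)(x + 4)(x^2 + 1).
example = [1, -2, -12, 36, -37, 38, -24]
print(count_positive_roots(example))
```

Because all arithmetic is done with `Fraction`, the count is free of rounding errors, which is the property that distinguishes the symbolic approach from the numerical one at each sample point.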

We repeated the process on the reduced version of the system. The triangularization again reduced the problem to univariate real root isolation, this time with only one back substitution step needed. As to be expected from a fully symbolic computation, the output is identical and so again represented by Fig. 4. However, the computation was significantly quicker with this reduced system. More details are given in the comparison in Sect. 4.3.

**Biomod-28.** The same process was conducted on Biomod-28. As with Biomod-26 the system was triangular with all but the final equation linear in its main variable; this time the final equation is of degree 8. However, unlike Biomod-26, two positivity conditions were returned in the output, meaning we must solve a bivariate problem before we can back substitute to the full system. Rather than just perform univariate real root isolation we must build a Cylindrical Algebraic Decomposition (CAD) (see, e.g., [4] and the references within) sign invariant for the final two equations and interrogate its cells to find those where the equations are satisfied and the variables positive. Counting these we find always 1 or 3 cells, with the latter indicating bistability. This is similar to the approach used in [5], although in that case the 2D CAD was for one variable and one parameter. We used the implementation of CAD in the Regular Chains Library [3, 7] with the results producing the plots in Fig. 6.

### 4.3 Comparison

**Table 1.** Timing data (in seconds) of the grid samplings described in Sect. 4. Numerical computation is using Bertini; symbolic computation is using Maple Regular Chains.

| System | Numerical: Mean | Symbolic: Mean | Symbolic: Median | Symbolic: StdDev | Symbolic: Maximum |
|---|---|---|---|---|---|
| 026 – Original | 2.4 | 0.568 | 0.530 | 0.107 | 0.905 |
| 026 – Reduced | 0.85 | 0.053 | 0.047 | 0.036 | 0.343 |
| 028 – Original | 16.57 | 42.430 | 40.529 | 8.632 | 84.116 |
| 028 – Reduced | \(\bot \) | 0.485 | 0.468 | 0.119 | 0.796 |

Bertini computations (v1.5.1) were carried out on a Linux 64 bit Desktop PC with Intel i7. Maple computations (v2016 with April 2017 Regular Chains) were carried out on a Windows 7 64 bit Desktop PC with Intel i5.

For Biomod-26 the pairs of plots together contain 476 sample points. Table 1 shows timing data. We see that both Bertini and Maple benefited from the reduced system: Bertini took about a third of the original time, while the speed-up for Maple was even greater, at about a tenth of the original. Also, perhaps surprisingly, the symbolic methods were quicker than the numerical ones here. For Biomod-28 the speed-up enjoyed by the symbolic methods was even greater (almost 100 fold). However, for the original version of this system Bertini was significantly faster. The symbolic methods used are well known for their doubly exponential computational complexity (in the number of variables), so it is not surprising that the outcome of the comparison changes as the system size increases. We see some other statistical data for the timings in Maple: the standard deviation is fairly modest, but in each row there are outliers well above the mean value (the maximum is between roughly 1.6 and 6.5 times the mean), and so the median is always a little less than the mean.
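The speed-up factors quoted here follow directly from the mean timings in Table 1, as the following short Python computation shows (the dictionary layout is ours; the numbers are those of the table).

```python
# Mean timings in seconds from Table 1: (original system, reduced system).
mean_timings = {
    ("Biomod-26", "numerical"): (2.4, 0.85),
    ("Biomod-26", "symbolic"): (0.568, 0.053),
    ("Biomod-28", "symbolic"): (42.430, 0.485),
}

speedup = {key: original / reduced
           for key, (original, reduced) in mean_timings.items()}

for (model, method), factor in speedup.items():
    print(f"{model} {method}: {factor:.1f}x")
```

No factor is given for the numerical approach on Biomod-28, since Bertini found no roots for the reduced system.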

### 4.4 Going Further

Of course, we could increase the sampling density to get an improved idea of the bistability region, as in Figs. 7 and 8. However, a greater understanding comes with 3D sampling. We have performed this using the symbolic approach described above, at a linear cost proportional to the increased number of sample points. This was completed for Biomod-26: the region in question is bounded to both sides in the \(k_{17}\) and \(k_{18}\) directions but extends infinitely above in \(k_{19}\). With the \(k_{19}\) range bound at 1000 the region is bounded by extending \(k_{17}\) to 800 and \(k_{18}\) to 600. For obtaining exact bounds (in one parameter) see [5].

We note the lens shape seen in the orientation in the left plots is comparable with the image in the original paper of Markevich et al. [26, Fig. S7].

## 5 Conclusion and Future Work

We described a new graph theoretical symbolic preprocessing method to reduce problems from the MAPK network. We experimented with two systems and found the reduction offered computation savings to both numerical and symbolic approaches for the determination of multistationarity regions of parameter space. In addition, the reduction avoided instability from rounding errors in the numerical approach to one system, but uncovered major problems in that approach for the other. An interesting side result is that, at least for the smaller system, the symbolic approach can compete with and even outperform the numerical one, demonstrating how far such methods have progressed in recent years.

In future work we intend to combine the results of the present paper and our recent publication [5] to generate symbolic descriptions of the bistability region beyond the 1-parameter case. Another possible route to achieve this is to consider the effect of the various degrees of freedom within the algorithms used. For example, we have a free choice of variable ordering: Biomod-26 has 11 variables corresponding to 39 916 800 possible orderings while Biomod-28 has 16 variables corresponding to more than \(10^{13}\) orderings. Heuristics exist to help with this choice [10] and machine learning may be applicable [19]. Also, since MAPK problems contain many equational constraints an approach as described in [13] may be applicable when higher dimensional CADs are needed.
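The numbers of variable orderings are simply the factorials of the numbers of variables, which can be checked in a few lines of Python:

```python
import math

orderings_26 = math.factorial(11)  # orderings of the 11 variables of Biomod-26
orderings_28 = math.factorial(16)  # orderings of the 16 variables of Biomod-28

print(orderings_26)            # 39916800
print(orderings_28 > 10**13)   # True
```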

## Notes

### Acknowledgements

D. Grigoriev is grateful to the grant RSF 16-11-10075. H. Errami, O. Radulescu, and A. Weber thank the French-German Procope-DAAD program for partial support of this research. M. England and T. Sturm are grateful to EU H2020-FETOPEN-2015-CSA 712689 SC\(^{2}\).

**Research Data Statement:** Data supporting the research in this paper is available from doi:10.5281/zenodo.807678.

## References

- 1. Bates, D.J., Hauenstein, J.D., Sommese, A.J., Wampler, C.W.: Bertini: software for numerical algebraic geometry. doi:10.7274/R0H41PB5
- 2. Bhalla, U.S., Iyengar, R.: Emergent properties of networks of biological signaling pathways. Science **283**(5400), 381–387 (1999)
- 3. Bradford, R., Chen, C., Davenport, J.H., England, M., Moreno Maza, M., Wilson, D.: Truth table invariant cylindrical algebraic decomposition by regular chains. In: Gerdt, V.P., Koepf, W., Seiler, W.M., Vorozhtsov, E.V. (eds.) CASC 2014. LNCS, vol. 8660, pp. 44–58. Springer, Cham (2014). doi:10.1007/978-3-319-10515-4_4
- 4. Bradford, R., Davenport, J., England, M., McCallum, S., Wilson, D.: Truth table invariant cylindrical algebraic decomposition. J. Symb. Comput. **76**, 1–35 (2016)
- 5. Bradford, R., Davenport, J., England, M., Errami, H., Gerdt, V., Grigoriev, D., Hoyt, C., Kosta, M., Radulescu, O., Sturm, T., Weber, A.: A case study on the parametric occurrence of multiple steady states. In: Proceedings of the ISSAC 2017, pp. 45–52. ACM (2017)
- 6. Chen, C., Davenport, J., May, J., Moreno Maza, M., Xia, B., Xiao, R.: Triangular decomposition of semi-algebraic systems. J. Symb. Comput. **49**, 3–26 (2013)
- 7. Chen, C., Moreno Maza, M., Xia, B., Yang, L.: Computing cylindrical algebraic decomposition via triangular decomposition. In: Proceedings of the ISSAC 2009, pp. 95–102. ACM (2009)
- 8. Conradi, C., Mincheva, M.: Catalytic constants enable the emergence of bistability in dual phosphorylation. J. Roy. Soc. Interface **11**(95) (2014)
- 9. Conradi, C., Flockerzi, D., Raisch, J.: Multistationarity in the activation of a MAPK: parametrizing the relevant region in parameter space. Math. Biosci. **211**(1), 105–31 (2008)
- 10. Dolzmann, A., Seidl, A., Sturm, T.: Efficient projection orders for CAD. In: Proceedings of the ISSAC 2004, pp. 111–118. ACM (2004)
- 11. Dolzmann, A., Sturm, T.: Redlog: computer algebra meets computer logic. ACM SIGSAM Bull. **31**(2), 2–9 (1997)
- 12. Dolzmann, A., Sturm, T.: Simplification of quantifier-free formulae over ordered fields. J. Symb. Comput. **24**(2), 209–231 (1997)
- 13. England, M., Bradford, R., Davenport, J.: Improving the use of equational constraints in cylindrical algebraic decomposition. In: Proceedings ISSAC 2015, pp. 165–172. ACM (2015)
- 14. Famili, I., Palsson, B.Ø.: The convex basis of the left null space of the stoichiometric matrix leads to the definition of metabolically meaningful pools. Biophys. J. **85**(1), 16–26 (2003)
- 15. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA **99**(12), 7821–7826 (2002)
- 16. Grandoni, F., Könemann, J., Panconesi, A.: Distributed weighted vertex cover via maximal matchings. ACM Trans. Algorithms **5**(1), 1–12 (2008)
- 17. Grigoriev, D., Samal, S.S., Vakulenko, S., Weber, A.: Algorithms to study large metabolic network dynamics. Math. Model. Nat. Phenom. **10**(5), 100–118 (2015)
- 18. Gross, E., Davis, B., Ho, K.L., Bates, D.J., Harrington, H.A.: Numerical algebraic geometry for model selection and its application to the life sciences. J. Roy. Soc. Interface **13**(123) (2016)
- 19. Huang, Z., England, M., Wilson, D., Davenport, J.H., Paulson, L.C., Bridge, J.: Applying machine learning to the problem of choosing a heuristic to select the variable ordering for cylindrical algebraic decomposition. In: Watt, S.M., Davenport, J.H., Sexton, A.P., Sojka, P., Urban, J. (eds.) CICM 2014. LNCS, vol. 8543, pp. 92–107. Springer, Cham (2014). doi:10.1007/978-3-319-08434-3_8
- 20. Joshi, B., Shiu, A.: A survey of methods for deciding whether a reaction network is multistationary. Math. Model. Nat. Phenom. **10**(5), 47–67 (2015)
- 21. Karp, R.M.: Reducibility among combinatorial problems. In: Complexity of Computer Computations, pp. 85–103. Plenum Press, New York (1972)
- 22. Košta, M.: New concepts for real quantifier elimination by virtual substitution. Doctoral dissertation, Saarland University, Germany, December 2016
- 23. Legewie, S., Schoeberl, B., Blüthgen, N., Herzel, H.: Competing docking interactions can bring about bistability in the MAPK cascade. Biophys. J. **93**(7), 2279–2288 (2007)
- 24. Li, C., Donizelli, M., Rodriguez, N., Dharuri, H., Endler, L., Chelliah, V., Li, L., He, E., Henry, A., Stefan, M.I., Snoep, J.L., Hucka, M., Le Novère, N., Laibe, C.: BioModels database: an enhanced, curated and annotated resource for published quantitative kinetic models. BMC Syst. Biol. **4**, 92 (2010)
- 25. Loos, R., Weispfenning, V.: Applying linear quantifier elimination. Comput. J. **36**(5), 450–462 (1993)
- 26. Markevich, N.I., Hoek, J.B., Kholodenko, B.N.: Signaling switches and bistability arising from multisite phosphorylation in protein kinase cascades. J. Cell Biol. **164**(3), 353–359 (2004)
- 27. Pérez Millán, M., Turjanski, A.G.: MAPK’s networks and their capacity for multistationarity due to toric steady states. Math. Biosci. **262**, 125–37 (2015)
- 28. Rashevsky, N.: Mathematical Biophysics: Physico-Mathematical Foundations of Biology. Dover, New York (1960)
- 29. Wang, D.: Elimination Methods. Springer, Heidelberg (2000)
- 30. Zumsande, M., Gross, T.: Bifurcations and chaos in the MAPK signaling cascade. J. Theoret. Biol. **265**(3), 481–491 (2010)

## Copyright information

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.