1 Introduction

Whilst there are different approaches to mathematically model biochemical networks, a commonly used model formalism describes continuous changes in concentration of molecular species over time using ordinary differential equations (ODEs), commonly known as reaction rate equations (RREs) (Jiang et al. 2022). ODE models are studied to better understand generic biochemical network topologies and motifs as well as dynamics of well-defined pathways in specific organisms (Vaseghi et al. 2001; Swameye et al. 2003). RREs are almost always non-linear as they describe protein interactions (Rao et al. 2014). Reaction rates can vary greatly, giving rise to different time scales. Elaborate integrated interactions give rise to complex nonlinear behaviours, such as ultra-sensitivity (Angeli et al. 2004), bi-stability (Rombouts and Gelens 2021), and oscillation (Russo et al. 2009). Numerous studies of RREs have revealed general network properties that give rise to these nonlinear behaviours and, at the same time, give high level understanding for how biochemical networks encode cellular scale responses to stimuli. The general network properties are qualitative in nature, identified by investigating only the network topology, and are now often considered without substantial accompanying quantitative mathematical analysis. Some of these network-scale properties that can be qualitatively associated with complex behaviours include feedback loops, feed-forward loops, cross-talk, compartmentalisation, and noise (Schwartz and Baron 1999). Whilst it may be easy to identify a macro-feature responsible for certain behaviour, it is not always easy to predict the behaviour simply because the macro-feature is present without quantitative analysis.

Two biological signalling pathways can differ in their specific molecular components, mechanisms, and downstream effects. Differentiation metrics can be used to compare the differences between these pathways. When a model is almost symmetrical, it suggests that both pathways have comparable structures, properties, and functions, and their impact on the entire system is similar. The oriented and non-oriented pathways would produce similar results concerning their steady-state behaviour, output response, or sensitivity to perturbations. For instance, if both pathways regulate the same biological process, such as cell differentiation, and share several identical molecular components like transcription factors or signalling molecules, their effects on the final output would be alike. In such a case, the steady-state error between the oriented and non-oriented pathways is low, indicating that they contribute evenly to the outcome.

In mathematical models of biochemical networks, symmetry is typically not included as an explicit feature. However, it is generally assumed that these networks exhibit some form of symmetry in a qualitative sense. In this work, symmetry specifically refers to the mathematical term used to represent the kinetics of a single edge of the network, which is assumed to be nonlinear but possess certain symmetrical qualities. In some literature, symmetry is referred to in a different context, that of the symmetry of the actual network structure. The mathematical terms used to describe this type of symmetry are linear and therefore a specific case of the general framework. References to these concepts can be found in studies by Asllani et al. (2018) and O’Brien et al. (2021).

Fig. 1
figure 1

A model diagram of a simple pathway (left) and its oriented form (right). Solid nodes represent protein concentrations and hollow nodes represent a reverse/flip of a node (decreasing if the protein level is increased and vice versa). The oriented form shows identical behaviour in the output node k as the simple pathway under unbiased and symmetric conditions defined later the manuscript. The oriented form allows for the more simple identification that input node i indirectly activates output node k

Analysing the topological structure of the network through its interactive diagram is often a focal point for understanding the system’s driving mechanisms and qualitative behaviour (Navlakha et al. 2014; Glass 1975; Glass and Kauffman 1972). A common qualitative technique involves systematically processing the causal links described by the edges of the diagram (Gilbert and Heiner 2006). For example, if a node i inhibits node j which inhibits node k, it is qualitatively assumed that the two inhibitions counteract each other in such a way that this may be seen as node i activates a ‘reverse’ of node j which then activates node k; in both cases, node i indirectly activates node k. This qualitative equivalency is shown in Fig. 1. Typically, this equivalency is assumed irrespective of the specific mathematical representation of the model. However, the similarities between these two interpretations can vary depending on the quantitative description of activators and inhibitors; in particular, how closely these relationships are opposites/inverses of each other. The ability to discern when two pathways will result in ‘similar’ behaviour is highly useful. In this simple example, we may be able to scientifically agree on a single-orientated model (say the one with only activators), which characterises all “qualitatively similar” networks. In the absence of any standard form, we define the oriented model as a model which contains all activation interactions along predefined linear pathways embedded within the network. In the case where activators and inhibitors can be interchanged (exactly or approximately), the oriented model characterises a broad collection of models whereby activation and inhibition are able to be used interchangeably as opposites; thereby making it easier to identify similar networks as well as the nature of features such as feedback, feed forward and cross-talk.

For an instance, the paper by Pigolotti et al. (2007) observes a general class of negative feedback loops. Although the paper uses a real-world system to illustrate the observation, the focus of the paper is on behaviour that can be generalised. The paper generally describes negative feedback as any loop of activators and repressors embedded in the network that contains an odd number of repressors. Consider two loops, one with three repressors and one with five, and describe them as different loops from the loop’s perspective. However, if we choose a start and finish point arbitrarily and turn the loop into a pathway with a single feedback edge, then the algorithm of inversions (orientation) that we will present will explicitly reduce this system into the same system. This means that we can study the general features of a negative feedback loop for generic behaviour without describing many topologies, but rather just one (the oriented form). In principle, if repressors and activators do not behave in a reciprocal manner (which is almost all the time) it should not be possible to generally classify different network topologies into recurring motifs characterised by these oriented forms, but this is nevertheless what is done in practise (Zhang et al. 2005; Milo et al. 2002; Ferrell 2013). Our paper discusses how appropriately we can use the oriented form as a fair representation for the many topologies that constitute a motif (similar recurring structures that imply the inverse inhibition/activation relationship). Our framework is similar to theirs. In particular, they represent edges independently and define repressor/inhibitor vs activator as constraints on the sign of partial derivatives of the function the edge contributes to the right-hand side of the model.

The pertinent question is ‘What properties allow for the exact or approximate reduction of a biochemical mathematical model to its oriented model?’. We shall focus our attention predominantly on a property of a biochemical model which we call ‘model bias’ (the tendency for an interaction to be conferred directly effecting activation or deactivation of the downstream node) and secondly on a property which we call ‘model symmetry’ (the tendency for an interaction to be conferred directly as a result of elevated or lowered amounts of the upstream node). We find that for ‘symmetric models’ (where equal changes above and below a rest state in the upstream node confer equal and opposite changes in the downstream node), model bias plays a central role in determining the approximate tractability of a network to an oriented form. We investigate how the network connections together with model bias determine the suitability of this oriented form simplification through a numerical study.

Considering inhibition as an exact counterbalance to activation can result in nearly identical network results when the inhibitory mechanisms accurately reflect the activating ones, though in a reverse manner. Put differently, if the influences of inhibition are consistently and precisely represented as the reverse of activation across the network, the overall performance and dynamics of the network might remain similar to those where inhibition and activation are treated separately.

This assumption is valid when the network shows a strong symmetry between activation and inhibition pathways. In this scenario, symmetry suggests that inhibitory mechanisms produce effects that directly counteract the influence of activation without causing significant alterations or complexities in the dynamics of the network. Furthermore, when the network is relatively straightforward and the interplays between elements are well comprehended, considering inhibition as an exact counterbalance to activation might provide reasonably precise forecasts of network performance. Nonetheless, it’s crucial to acknowledge that this assumption might fail in more intricate networks or scenarios where the interconnections between activation and inhibition are more subtle. In these instances, a more comprehensive and nuanced modelling strategy might be required to accurately depict the network’s performance.

The study by Neubert et al. (2002) focuses on reactivity and its importance in demonstrating how disturbances immediately affect a stable balance. They found a connection between reactivity and Turing instability, showing that the positive reactivity of uniform spatial steady states is vital for generating spatial patterns through Turing instability, especially in cases of dispersal. The paper highlights the relationship between transient and asymptotic behaviours in various systems.

Bennett et al. (2007) have devised an approximation scheme that can be used to derive reaction rate equations for genetic regulatory networks. This scheme is more accurate than the standard quasi-steady state analysis as it predicts the timescales of transient dynamics of such networks more precisely. It does so by introducing prefactors into the ODEs that govern the dynamics of protein concentrations. These prefactors slow down the ODE systems, making them more accurate than their quasi-steady state approximation counterparts. The method is demonstrated by examining a positive feedback gene regulatory network and showing how the inclusion of the prefactor more accurately models the transient dynamics of this network.

Ohlsson et al. (2020) have proposed a technique for analysing a system of ODEs by studying the set of symmetries of its solutions. This approach helps in developing mechanistic models and describing structures that correspond to the underlying dynamics of biological systems. In a study, researchers demonstrated the ability of symmetry-based methods by considering the nonlinear Hill model, which describes enzymatic reaction kinetics. They derived a set of symmetry transformations for each order of the model. In a model selection problem, the symmetry-based methods were compared to a minimal example, and they demonstrated superior performance compared to the ordinary residual-based model selection. They showed that symmetries reveal the intrinsic properties of a system of interest based on a single time series. Finally, they propose that symmetry-based methodology should be the first step in a systematic model building approach. When multiple time series are available, it should complement the commonly used statistical methodologies.

In recent work by Russo and Slotine (2011), the relationship between symmetries and stability in the analysis and control of nonlinear dynamical systems and networks is discussed. The study combines standard results on symmetries and equivalence with new convergence analysis tools based on nonlinear contraction theory and virtual dynamical systems. This combination of structural properties (symmetries) and convergence properties (contraction) is exemplified in the contexts of network motifs, such as those arising in genetic networks, invariance to environmental symmetries, and the enforcement of different patterns of synchrony in a network.

Ma et al. (2009) investigated the network topologies and their behaviour, particularly adaptation, using a model similar to the one we use in our study. However, the authors differentiate between topologies which might otherwise be considered ‘similar’. They examined thousands of topologies and found ones that exhibit similar behaviour, which we believe stems from them being related by a common oriented form. Our study explains how different topologies can cluster together due to their similar behaviour. If our models are unbiased and sufficiently close to symmetric, these topologies (which might be classified together as examples of the same general motif) would behave identically but instead they have various degrees of divergence due to symmetry breaking properties between inhibition and activation (such as bias).

The framework presented in this manuscript is a proposal for an agreed upon diagrammatical representation of large-scale models of biochemical networks (the “oriented form” of a biochemical network). Pathways/networks/motifs with the same oriented form may not behave the same, but under some conditions (which we will explore) may behave in sufficiently similar ways that they can be classified together. The challenge that is faced in translating (“orienting”) a general biochemical network into this proposed framework is that a diagram alone shows only the generic relationships between nodes (excluding the mathematical model specifics). This means that to assume a class of networks/motifs (with the same oriented form) can, and often is, a quantitative error. We find that it is only fine to assume this when the model is perfectly ‘symmetric’ and ‘unbiased’ (when inhibitions and activations are perfect inverses of each other) ; a very unrealistic assumption. The main goal of the paper is to investigate under what circumstances/conditions may this reduction to network/motif classes be at least approximately appropriate from the perspective of common output behaviour.

In Sect. 2, we propose exploring a potential symmetric non-linear model that represents network configurations. This model simplifies the process of modelling mathematical constraints on qualitative relationships. These models are helpful in formulating hypotheses that can be experimentally verified in different biological contexts, including large biochemical networks. By creating a correspondence between pathways and networks, this framework offers a way to represent each group of equivalent pathways using an orientable pathway. This feature makes it easier to analyse and explore the underlying dynamics of the network.

The manuscript is structured as follows. In Sect. 2, we mathematically define the framework for orientation of a biochemical network. We also define an unbiased symmetric model; a standard for which exact equivalence between a model and its oriented form can be shown. In Sect. 3, we explore increases in intrinsic bias in a model and how this mainly translates into divergence in its orientated form and, in particular, how this divergence is regulated specifically by network/pathway properties. We study systematically increasing complexity from simple chain-like pathways to an example of an integrated pathway with cross-talk. Our studies provide new insights into when reduction of a network with an oriented form may be appropriate and when caution should be taken.

2 Model framework

In the context of systems biology, the terms ‘pathway’ and ‘network’ are sometimes used interchangeably. A ‘pathway’ typically refers to a relatively small, well-defined set of entities and relationships, such as a specific signal transduction pathway (Alberts et al. 2002). As the name implies, it should be clear in a pathway where the signal ‘begins’ and where it ‘ends’. On the other hand, a ‘network’ frequently refers to a larger, less constrained set of entities and relationships (Azeloglu and Iyengar 2015). A network, on the other hand, is often a set of pathways connected through cross-talk and feedback mechanisms. We will define both a pathway and a network (the latter consisting as a collection of the former) as a directed graph where each node \(x_i\), \(i=1,\ldots ,N\), is represented by a scalar state variable (with the same label), and each edge determines distinct terms on the RHS of the dynamical system describing the rates of change in these variables. Importantly, we use the term ‘state’ rather than concentration. Non-dimensionalisation of the active protein concentrations associated with each node followed by shifting allows us define each state variable on the range \(x_i\in [-1,1]\). Here 0 represents half of the saturation concentration and \(-1\) and 1 represents no protein and maximum protein respectively. We choose this framework because we consider that absence (and not just presence) of a protein can cause a response in downstream nodes.

The general mathematical model we investigate consists of the differential Eqs. (1);

$$\begin{aligned} \frac{\textrm{d} {\textbf{x}}}{ \textrm{d} t} = \varvec{\psi }({\textbf{x}}), \end{aligned}$$
(1)

describing the evolution of all N state variables \({\textbf{x}} = (x_1,x_2,\ldots , x_N)^{\dagger }\). Here \(\psi _i = \varrho _i^+ - \varrho _i^-\) and

$$\begin{aligned} \varrho _i^+&= \sum _{{\mathcal {E}}\in {\mathcal {E}}_i} r_{{\mathcal {E}}}^+(x_i,y_{{\mathcal {E}}};{\mathcal {T}}_{{\mathcal {E}}}) , \quad \text {and} \end{aligned}$$
(2)
$$\begin{aligned} \varrho _i^-&= \sum _{{\mathcal {E}}\in {\mathcal {E}}_i} r_{{\mathcal {E}}}^-(x_i,y_{{\mathcal {E}}};{\mathcal {T}}_{{\mathcal {E}}}) , \end{aligned}$$
(3)

where each of the sums are taken over the set of edges \({\mathcal {E}}_i\) that point towards the node \(x_i\), \(r_{{\mathcal {E}}}^\pm \) are functions that completely depend on the model and the type of edge \({\mathcal {T}}_{{\mathcal {E}}}\in \{ {\mathcal {A}},{\mathcal {I}}\}\) (activator \({\mathcal {A}}\) or inhibitor \({\mathcal {I}}\)). These are functions of the node \(y_{{\mathcal {E}}}\) from which each edge \({\mathcal {E}}\) originates. Finally, some edges in a network may not come explicitly from another node in the network. These edges represent sources or external stimuli to the system and must point towards a node in the network and for these edges it is assumed in the context of (2) and (3) that \(y_{{\mathcal {E}}} = 0\) such that \(y_{{\mathcal {E}}}\) and \(y_{{\mathcal {E}}}^*\) are non-zero and balanced. We use the superscript asterisk to represent a ‘flip’ in a state variable; negating it algebraically and representing it in diagrams by changing nodes between solid and hollow style.

An edge \({\mathcal {E}}\) is associated with the qualitative description of an activator or inhibitor. Activation is achieved in one of two ways, either \(r_{{\mathcal {E}}}^+\) relatively increases with \(y_{{\mathcal {E}}}\) and/or \(r_{{\mathcal {E}}}^-\) relatively decreases with \(y_{{\mathcal {E}}}\). Inhibition is associated with opposite conditions.

In particular, for a given activation edge \({\mathcal {E}}\) connecting node y to x, we require by definition that

$$\begin{aligned} \frac{\partial r_{{\mathcal {E}}}^+(x,y;{\mathcal {A}})}{\partial y} \ge \frac{\partial r_{{\mathcal {E}}}^-(x,y;{\mathcal {A}})}{\partial y} \end{aligned}$$
(4)

everywhere in the state space determined by x and y. On the other hand, we require

$$\begin{aligned} \frac{\partial r_{{\mathcal {E}}}^+(x,y;{\mathcal {I}})}{\partial y} \le \frac{\partial r_{{\mathcal {E}}}^-(x,y;{\mathcal {I}})}{\partial y} \end{aligned}$$
(5)

everywhere if an edge is to be an inhibition edge.

2.1 Model bias and symmetry

The conditions (4) and (5) can be achieved by adjusting either side of the inequalities. Treating activation as the opposite of inhibition requires that condition (54) is the same as (5) after swapping \(+\) and − then swapping the direction of the inequality. We will show it is necessary and sufficient for activation and inhibition to equal and opposite in this respect if both sides of the inequalities are equal and opposite in magnitude (unbiased) and this magnitude is the same for each condition (symmetric). We define these properties mathematically in Appendix A.

In practise, each node of a biochemical network graph/diagram often represents a protein concentration. Concentrations can either be increased/activated (increasing the state variable for the node) or decreased/deactivated (decreasing the state variable for the node). To better explain the mechanisms which determine bias and asymmetry (weighting) in a model, it is useful to add detail to each node and consider an ‘active’ (solid dots) and ‘inactive’ (hollow dots) component. Of course, these are just labels as we assume that information may equally use the active and inactive components and that these labels just determine the positive and negative direction of the node state variable. A protein \(x_i\) can be activated or deactivated and the mathematical model encodes this in the terms \(\varrho _i^+\) and \(\varrho _i^-\) respectively. Showing generic nodes x and y where y affects x through an edge, it is useful to visualise the mechanistic manner with which bias and weight/asymmetry are manifested in models. We catalogue the different extreme cases for both activator and inhibitor in Fig. 2.

Fig. 2
figure 2

Mechanistic diagrams for asymmetric and biased activator a and inhibitor b model edges. Each node y (upstream) and x (downstream) is represented as an actively switching chemical species between ‘active’ (solid dot) and ‘inactive’ (hollow dot) forms. The switching from active to inactive form is represented in the model for each node by \(\varrho ^-\) (Eq. (3)) and from inactive to active by \(\varrho ^+\) (Eq. (2)) which are influenced by the network edges. The dotted connectors indicate how biased (negative or positive) and weighted (activator or inhibitor) models for the network edges mechanistically confer activation or inhibition from node y to node x

Bias and weighting just indicate deviation away from the unbiased symmetric case as it is difficult to assign a sensible generalised quantitative definition to these qualities. We focus primarily on bias in this manuscript as issues with symmetry can be somewhat mitigated by ensuring that the amplitude \(r_{{\mathcal {E}}}\) is unchanged when comparing activators and inhibitors. It is possible to get a sense of increasing and decreasing bias by plotting the nullcline in the (x-y) state space that corresponds to \(r_{{\mathcal {E}}}^+(x,y) = r_{{\mathcal {E}}}^-(x,y)\) (the steady state in x caused by a state variable y in the absence of other edges). We show these nullclines for a symmetric model edge of activator (Fig. 3a) and inhibitor (Fig. 3b) type respectively. The unbiased case can be seen clearly in the solid black curves. Increasing negative bias is shown by the dashed blue curves whereas increasing positive bias is shown by the dashed red curves. Increasing bias shifts the nullcline further to the left or right respectively when compared to the unbiased case in black. Furthermore, the unbiased case necessarily has the feature that the nullclines can be rotated around (0, 0) an angle of \(\pi \) without changing the nullcline.

Fig. 3
figure 3

Example symmetric activator a and inhibitor b nullclines defined by \(r_{{\mathcal {E}}}^{+}(x,y;{\mathcal {T}})=r_{{\mathcal {E}}}^{-}(x,y;{\mathcal {T}})\). The solid black curves corresponds to an example unbiased model. Blue dotted lines show the nullclines for increasingly net negative bias whilst and red dotted lines show the nullclines for increasingly net positive bias

2.2 Oriented pathways and pathway similarity

Consider the example of the simple linear pathway shown in Fig. 1. If it is justified that both representations behave the same, we can orient the pathway by a standard ‘oriented’ form. The orienting process is a method that helps identify similarities among different pathways or networks. This is useful for understanding signalling pathways and how they function. It involves standardising all interactions along a canonical (main) pathway, characterising them as activation events. This approach, known as the oriented form, provides a qualitative interpretation of the complex interactions within the pathways and orients the network of pathways so that all elements within a pathway are “activated” along a designated central pathway. This simplifies complex information by streamlining diverse pathways into a common framework.

We shall denote the oriented form of the pathway as that portrayed on the right of Fig. 1; that is, where all relationships are denoted as activators. If we are unable to treat activators and inhibitors as opposites (because the model is biased or assymetric) then it is still qualitatively feasible that the oriented and unoriented pathways of Fig. 1 (and others) behave ‘similarly’ but not exactly the same. To what degree bias results in dissimilar/divergent oriented and unoriented pathways is the focus of this paper. In appendix B we define how to find the oriented form of a pathway and a network associated with the proposed model.

In the work by Angeli et al. (2004), a similar algorithm was proposed to create a network consisting of only positive edges. The algorithm selects a reference node, denoted by r, as the starting node. We have aimed to convert a signalling pathway into an oriented structure without compromising its output status, which should remain consistent with the non-oriented pathway. We started the process at the last node of the pathway to achieve this objective. This strategic decision has been taken to ensure that the status of the output node is preserved, while transitioning to an oriented form. Our overarching goal is to maintain the quantitative properties of the pathway throughout the transformation process.

Inspired by the proof of Theorem 2, the follow algorithm is used to orient any given pathway.

Algorithm 1
figure a

Finding an oriented pathway

The steps outlined in Algorithm 1 are shown in Fig. 4.

Fig. 4
figure 4

An example of how to convert a pathway into its oriented form using the steps outlined in Algorithm 1

Generalisation to orienting a full network is done using Algorithm 2.

Algorithm 2
figure b

Finding a oriented network.

3 Numerical results and discussion

Evident by the prevalence of the idea of protein ‘activation’ and ‘deactivation’, most biochemical network models have bias. We conduct a numerical exploration of positive and negative bias and the robustness of the oriented form approximation. We focus first on single pathways and then explore the effect of feedback and network cross-talks. As expected, making the networks more complicated makes the pathways more idiosyncratic (diverges away from the behaviour of a common oriented form). However, for a given network, the oriented form can be much more robust if the bias is positive or negative depending on the situation. Furthermore, we find that for simple networks robustness is strong as bias is increased but beyond a critical bias robustness is lost rapidly and the oriented form does not represent the unoriented network; in some cases even behaving in an opposite way.

3.1 Test model

We use a caricature test model so that we can increase or decrease bias explicitly. The model is parameterised by three parameters: \(\alpha \), \(\beta \), and \(\phi \). The parameter \(\alpha \) describes the amplitude (timescale), \(\beta \) prescribes variable nonlinearity, and \(\phi \) is a proxy for the model’s bias and is chosen independently for each edge. This model is unlikely to represent a real biological system. That being said, by the manipulation of parameters \(\alpha \), \(\beta \) and \(\phi \) there is a lot of flexibility in the model functions \(r_{\mathcal {E}}^\pm (x,y;{\mathcal {T}}_{\mathcal {E}})\). In this sense, the test model allows for generic observations in the role of bias in the approximation that inhibition is the opposite of activation in the context of increasingly complex networks. The rates \(r_{{\mathcal {E}}}^+\) and \(r_{{\mathcal {E}}}^-\) are defined as follows:

For activation,

$$\begin{aligned}&r_{{\mathcal {E}}}^{+}(x,y;{\mathcal {A}}) = \Biggl ( \frac{1+\phi }{4}\Biggl )(1+y) F(x;\alpha ,\beta ) \end{aligned}$$
(6)
$$\begin{aligned}&r_{{\mathcal {E}}}^{-}(x,y;{\mathcal {A}}) = \Biggl ( \frac{1-\phi }{4}\Biggl )(1-y)F(-x;\alpha ,\beta ). \end{aligned}$$
(7)

For inhibition,

$$\begin{aligned}&r_{{\mathcal {E}}}^{+}(x,y;{\mathcal {I}}) = \Biggl ( \frac{1+\phi }{4}\Biggl )(1-y) F(x;\alpha ,\beta ) \end{aligned}$$
(8)
$$\begin{aligned}&r_{{\mathcal {E}}}^{-}(x,y;{\mathcal {I}}) = \Biggl ( \frac{1-\phi }{4}\Biggl )(1+y)F(-x;\alpha ,\beta ). \end{aligned}$$
(9)

where,

$$\begin{aligned} F(x;\alpha ,\beta ) = \frac{\alpha \beta (1-x)}{2\beta -(1+x)}. \end{aligned}$$
(10)

Here \(\phi = 0\) indicates no bias and \(\phi >0\) (\(\phi <0\)) indicates positive (negative) bias. The function F is based on Michaelis–Menten enzyme kinetics but translated to our state variable framework (where state variables vary between \(-1\) and 1). Our study uses MATLAB’s ODE45 solver to run simulations. Specifically, where not stated, for each bias value \(\phi \), we test a total of 150 diverse sets of \(\alpha \) and \(\beta \) values to better isolate behaviour attributable to bias and network topology rather than from non-linearity and timescale.

3.2 Linear pathways

3.2.1 Unbiased linear pathways

We begin by investigating single linear pathways of up to 5 nodes activated at the input node. Node 1 is the input node and Node 5 is the output node.

In Fig. 5 we compare the time evolution of each node for a sample pathway against its oriented form obtained by Algorithm 1. In this particular case, we demonstrate numerically Theorem 2 by setting the bias \(\phi = 0\). The observation here is that the outputs (orange) are identical between pathway and oriented pathway whilst all flipping of internal nodes that is done in the orienting process correctly negates the time evolution of the node at all times.

It is clear that when there is no (or very little) bias in the symmetric model, observing two inhibitions in series is quantitatively identical (or very similar) to two activators. This is because, at least qualitatively, it is understood that inhibiting an inhibitor is akin to a double negative; resulting in a positive. Understanding this equivalence (and when it is appropriate to assume it) makes it easier to qualitatively assess the pathway (and network) behaviour from generic topological relationships – as is common with biochemical network models.

Fig. 5
figure 5

a A cellular biochemical signalling pathway consisting of five nodes \(x_i\) for \(i=1,2,3,4,5\). The black circles and connectors show the pathway, and the orange node \(x_5\) indicates the output node of the pathway. An external stimulus is indicated by a red arrow, which activates the node \(x_1\). The (symmetric) test model (Sect. 3.1) is used to model the pathway edges. In this model, each edge of the network is associated with two parameters, \(\alpha _{ij}\) and \(\beta _{ij}\) where i is the source and j is the destination of the connector between two nodes. The parameter values for \(\alpha \) and \(\beta \) were chosen arbitrarily as \(\alpha = \{\alpha _{12},\alpha _{23},\alpha _{34},\alpha _{45}\}= \{0.5,1,2,1.3\}\) and \(\beta = \{\beta _{12},\beta _{23},\beta _{34},\beta _{45}\}= \{10,5,15,18\}\). A boxes on the right side each node displays the state variables of each node from \(t=0\) to \(t=100\). All states are initialised in the neutral position of 0. The pathway in b is the oriented representation of the signalling pathway shown in a. The hollow nodes represent nodes that are flipped in the orienting process described by Algorithm 1. Since this process results in a flipped \(x_1\) the input edge should be interpreted as an inhibition which is indicated in blue. In choosing \(\phi = 0\) (no bias) and noting that \(x_5\) is the same for a and b we have verified the result of Theorem 2

3.2.2 Biased linear pathways

We now shift our investigation to the robustness of the oriented pathway reduction of single pathways with bias. To compare a pathway/network against its oriented form we measure the similarity between the two by focusing on two metrics. In both metrics we initialise the pathway and its oriented form in the neutral position (all node states equal to zero). We then run the model for both oriented and unoriented forms numerically until steady state. The first metric comparing the unoriented to oriented descriptions is the difference in the long term behaviour. We have restricted our study to non-oscillatory systems which limit to fixed steady states over time. We define the steady state difference to be

$$\begin{aligned} \delta _{ss} = \langle \delta - {\bar{\delta }}\rangle _{\alpha ,\beta } = \left\langle \lim _{t\rightarrow \infty } x_N(t) - \lim _{t\rightarrow \infty } {\bar{x}}_N(t)\right\rangle _{\alpha ,\beta }, \end{aligned}$$
(11)

where \(\delta \) and \({\bar{\delta }}\) are the unoriented and oriented steady states of the output node \(x_N\), respectively. The bar notation here corresponds to the oriented form. The brackets \(\langle \cdot \rangle _{\alpha ,\beta } \) indicate averaging over many combinations of parameter sets \(\alpha \) and \(\beta \) describing the edges in the model.

A second metric is defined to measure the difference in the transient aspects of the response of the unoriented and oriented cases. It is constructed by first normalising each output node by its steady state and then integrating the difference between the unoriented and oriented transient output values. The resultant error is positive (negative) if the unoriented form approaches steady state faster (slower) than the oriented form. Very large values in this metric may indicate the presence of some temporal behaviour not present in the other form (for example, rebounding behaviour). This is especially the case if one of the forms responds to the input and returns close to the neutral state over time.

$$\begin{aligned} \delta _\tau = \left\langle \int _0^\infty \left( \frac{x_N(t)}{\delta } - \frac{{\bar{x}}_N(t)}{{\bar{\delta }}} \right) \ \textrm{d}t \right\rangle _{\alpha ,\beta } \end{aligned}$$
(12)

Behavioural differences between unoriented and oriented pathways can also lead to small errors \(\delta _\tau \). The purpose of \(\delta _\tau \) is to get insight into response times of unoriented and oriented pathways under simple conditions. This metric is less useful in the case where responses are sufficiently different in nature.

Figure 6 displays the steady-state error \(\delta _{ss}\) for all possible linear pathways of length \(N=5\). Each chart in the figure is a plot of \(\delta _{ss}\) versus \(\phi \) in the model. That is, left of centre represented increasing negative bias and right of centre represents increasing positive bias. The pathways in the figure are shown at the top of each chart and immediately below these pathways are the oriented forms (showing specifically which nodes are flipped in the orienting process). The charts themselves are strategically organised into columns based on the total number of nodes that are flipped in order to create the oriented form (one node on the left, two nodes in the centre, three and four nodes on the right). From top to bottom, we order the charts such that in the orienting process flips that are generally more upstream are at the top whereas downstream flips are generally down the bottom. We also plot each chart in red if, after the orienting process, the input stimulus does not change type from activation but in blue if the input stimulus changes from activation to inhibition.

Setting the charts out like that in Fig. 6 shows a number of interesting and nontrivial properties for linear pathways:

  1. 1.

    Locally around the unbiased case \(\phi = 0\) we see good agreement in steady states between the unoriented and oriented pathways.

  2. 2.

    Agreement with the oriented pathway is very robust if the model is negatively biased and the oriented form requires a flip in the input stimulus from activation to inhibition. If there is no flip in the input stimulus then agreement is very robust if the model is positively biased.

  3. 3.

    As bias is increased, agreement with the oriented pathway remains robust up until some critical level of bias at which stage agreement rapidly decreases until under some conditions the worst cases reach \(\delta _{ss} \approx \pm 2\) indicating that the oriented and unoriented forms of the pathway limit towards extreme opposite outputs.

  4. 4.

    The critical bias, before disagreement with the oriented form rapidly increases, is closer to the unbiased case – that is, the oriented form is less robust – if the orienting process flips over nodes in higher quantities or further downstream. Pathways which satisfy these criterion also have more extreme disagreements with their oriented form.

These observations were also consistent with results found for pathways of lengths \(N=3\) and \(N=4\) (not shown here).

Fig. 6
figure 6

This figure displays the steady-state error \(\delta _{ss}\) of different signalling pathways of length \(N=5\) as defined by Eq. (11). The model used is the test model in Sect. 3.1. The error is plotted in charts as functions of the bias parameter \(\phi \) and the pathway with its oriented form are shown above each chart. The colors of the plots correspond to flipping (blue) or no flipping (red) of the input stimulus as a result of the orienting process. The columns separate the number of flipped nodes in the orienting process whilst each column is ordered from top to bottom where these flips occur more upstream or downstream respectively

It is also interesting to observe that disagreement between the oriented and unoriented forms as a result of bias compounds as the number of flips increases. This can be seen in Fig. 7 where the sum of all errors for each of the four single flip pathways (left column) in Fig. 6 are plotted against the steady state error associated with the pathway which requires all four flips to orient (bottom right chart in Fig. 6). The later error eclipses the sum of the former errors however in both cases robustness remains fairly strong for a significant interval around \(\phi = 0\) (the unbiased case).

Fig. 7
figure 7

The figure shows the sum of steady-state errors \(\delta _{ss}\) of all pathways of length \(N=5\) with one-node flips in their oriented form (blue) against the steady state error of a single pathway with and four nodes flipped in its oriented form (black dashed)

Using the same charting order used in Fig. 6, we chart the errors \(\delta _\tau \) in Fig. 8. The plots of \(\delta _\tau \) show similar behaviour under changes in bias to \(\delta _{ss}\). This is especially the case for pathways requiring few flips in orienting. In this case, oriented and unoriented pathways which tend towards different steady states also take different rates in getting there. Pathways requiring many flips produce more complex behaviour over time and \(\delta _\tau \) is less insightful.

Fig. 8
figure 8

This figure displays the steady-state error \(\delta _{ss}\) of different signalling pathways of length \(N=5\) as defined by Eq. (11). The model used is the test model in Sect. 3.1. The error is plotted in charts as functions of the bias parameter \(\phi \) and the pathway with its oriented form are shown above each chart. The colors of the plots correspond to flipping (blue) or no flipping (red) of the input stimulus as a result of the orienting process. The chart arrangement is the same as that of Fig. 6

In the following section we shift our attention to the affect of bias on the oriented form on networks. We begin with simple pathways with feedback before looking at an example network with significant integration of multiple pathways.

3.3 Networks

3.3.1 Unbiased pathways with feedback

Investigation of linear pathways in the previous section shows that care should be taken when assuming that inhibitors act as diametrically opposite activators under biased conditions in the case where there this assumption is taken in multiple instances (where there are many flips to get to the oriented form). We now attempt to investigate the validity of this assumption under increasing complexity and in particular in the case of feedback.

In this section, we will fix a linear pathway of length \(N=5\) consisting of 3 inhibitions and ending in a single activation. The orienting procedure then requires a flip of node 1 and 3; in the process flipping the input stimulus from activator to inhibitor. We then add a single feedback in the form of an inhibition to this pathway.

Figure 9, like Fig. 5, compares the evolution of each node in this pathway under unbiased conditions in using the test model in Sect. 3.1 with an inhibition feedback from the output node 5 to node 3. The purpose is to validate Theorem 2; noting that output nodes are exactly the same between oriented and unoriented pathways.

Fig. 9
figure 9

Here we compare an unoriented pathway with feedback a against its oriented form b for the test model in Sect. 3.1 under unbiased conditions (\(\phi = 0\)). The parameters and formatting used is the same as that in Fig. 5. The figure shows equivalency in the two networks and validates Theorem 2

Figures 10 and 11 show an array of pathways of length \(N=5\). These pathways all require two flips at nodes 1 and 3 to orient. The figures show how the errors \(\delta _{ss}\) (Fig. 10 showing error in steady state) and \(\delta _\tau \) (Fig. 11 describing discrepancy in temporal behaviour) are influenced by bias. In each figure, the array consists of the same pathway subject to all combinations of inhibition feedback. The first column consists of feedback a distance of a single node, the second column, a feedback a distance of two nodes, and in the last column a feedback is a distance of three and four nodes. The top charts display feedback which is introduced further downstream than the charts at the bottom. The pathways are oriented the same in all cases but the process leaves the feedback sometimes as an activator (red) and sometimes as an inhibitor (blue) depending on its location in the pathway. The charts are plotted in these colours respectively to highlight this feature across the chart which ultimately determines if the feedback is a negative or positive feedback.

Fig. 10
figure 10

The figure shows the steady-state error of different signalling pathways with five nodes, as determined by varying \(\phi \) values under an external stimulus. The y-axis on each chart represents the steady-state error ranging from \(-2\) to 2, while the x-axis shows \(\phi \) values ranging from \(-1\) to 1. The extremes in each graph indicate where the non-oriented and oriented pathways disagree significantly, and the flat line portion of the chart indicates the range of robustness in the steady state under increasing levels of bias

Fig. 11
figure 11

This figure shows the \(\delta _{\tau }\) for the same pathways in Fig. 10, as determined by varying \(\phi \) values under an external stimulus. The y-axis represents \(\delta _{\tau }\), while the x-axis shows \(\phi \) values ranging from \(-1\) to 1. The figure arrangement is the same as that of Fig. 10

The following key observations are found across our numerical investigation and exemplified in Figs. 10 and 11.

  1. 1.

    Orientability is more robust with the addition of negative (or positive) bias than the alternative if the oriented pathway is inhibited (or activated, respectively). This was found previously in the case of a simple linear pathway but this phenomena seems to be also valid with the introduction of complexity such as feedback (and later we will show this to be also the case for a significantly more complex network). Feedback tends to even amplify the significance of negative versus positive bias, especially in the case of the steady state (and excepting for our next observation).

  2. 2.

    Orientability (in the steady state but not the temporal behaviour) even close to the unbiased case is compromised in the specific case where (a) there is negative feedback (blue curves in Fig. 10) and (b) when the feedback loop includes at least one node which is flipped in the orienting process.

  3. 3.

    Orientability (in the temporal behaviour but not the steady state) even close to the unbiased case is compromised under positive feedback (red curves in Fig. 11) but not under negative feedback (red curves). This compromise is still more pronounced in the case of positive bias due to the inhibition stimulation of the pathway.

3.4 Network complexity and pathway cross-talk

As it is difficult to do a systematic numerical study of complex network topologies and pathway crosstalk we will instead only look anecdotally at a realistic network and in particular the EGFR/HER2 signalling network model published by Yamaguchi et al. in 2014 Yamaguchi et al. (2014). Using Algorithm 2 to orient a general network requires first to assign nodes to pathways (as defined explicitly in this manuscript not necessarily in a general biological sense) based on key flows of information. Each node must belong to a pathway and each pathway has to have an input node and an output node with edges directed from input to output. Of course, this leaves the choice of pathways an open problem with many solutions. We attempt to do this in such a way that makes the most amount of sense biologically and favoring input nodes as nodes with external stimulus and aligning as much as possible to conventional pathway families. The unoriented network diagram and its oriented form are shown in Fig. 12a, b respectively. Pathway edges are represented in black. There are six mathematical pathways in the network and some can be collectively grouped into four biological pathways. To satisfy our mathematical definition, a single node (representing the protein PTEN) which acts as an intermediary between the biological pathways \(P_3\) and \(P_4\) is technically labelled as its own pathway. The red and blue colours indicate activation and inhibition cross-talks and/or external stimulus, respectively. The diagram includes various coupled biological pathways, such as Wnt/\(\beta \)-catenin (\(P_1\)), EGFR family (\(P_2\)), Notch family (\(P_3\)), and TNF-R pathway (\(P_4\)). The two output nodes labelled \(x_1\) and \(x_2\) are shown in orange and we do not demand similarity in any of the other nodes. We choose these two nodes as the output as this model has been constructed to focus on the EGFR pathway and how it is affected by the other signalling pathways (Yamaguchi et al. 2014).

Immediately we highlight the power of the oriented form. In ensuring that all pathways are oriented, we can see the effect of the stimulii on this network. In the oriented form all stimulii except the stimulii of \(P_4\) are contributing to activation of the output via the direct pathways. Furthermore, it is more easy to see the effect of the cross talk interactions. All cross-talk in the oriented form is positively stimulatory. That is, they all contribute to positive feedback or feed forward through cross-talk. The exception, of course, is the cross-talk with \(P_4\) which is inhibited by its stimulus and through single inhibitory cross-talk with other pathways also acts as a positive stimulant. It is satisfying therefore that a network which is actively stimulated and contains positive feedback behaves (despite its added complexity) as a simple pathway in regards to the bias in the model. We can see this explicitly in the plots of \(\delta _{ss}\) in Fig. 13. In particular, we notice that error \(\delta _{ss}\) is greatly restricted to negative bias which has been an observation derived from our simpler tests for activated oriented pathways with positive feedback.

Fig. 12
figure 12

A network model of EGFR/HER2 and its integration with other pathways as described in Yamaguchi et al. (2014) in both its unoriented form a and oriented form b. The dashed boxes represent the four underlying interacting biological pathways. These are \(P_1\) TNF-R, \(P_2\) combination of Wnt/\(\beta \)-catenin and MAPK \(P_3\) P13/AKT and \(P_4\) Notch. On the other hand, individual columns (shown with black edges) represent the six pathways as we define them mathematically in this manuscript. The red and blue arrows indicate activation and inhibition cross-talks and stimulii, respectively. We focus on the outputs of the model in Yamaguchi et al. (2014) which are indicated in orange and labelled \(x_1\) and \(x_2\)

Fig. 13
figure 13

Steady state error \((\delta _{ss})\) for output nodes \(x_1\) and \(x_2\) using the test model described in Sect. 3.1. The steady-state error accumulates as \(\phi < 0\), but both networks maintain near equivalence when \(\phi > 0\)

Fig. 14
figure 14

Diagrams show the output of \(x_1\)(purple) and \(x_2\)(blue) in the model shown in Fig. 12. The simulation output is averaged over many combinations of parameters \(\alpha \) and \(\beta \) for each edge and shown in three columns for \(\phi =-0.5\) (left), \(\phi = 0\) (centre), and \(\phi = 0.5\) (right). The first and third rows display the non-oriented output of \(x_1\) and \(x_2\), while the second and the fourth row show the respective oriented network outputs

We can also visualise (more explicitly than simply plotting \(\delta _\tau \) for this system) the discrepancy in the temporal evolution of the nodes \(x_1\) and \(x_2\) by plotting \(x_1(t)\) and \(x_2(t)\) explicitly for \(\phi = 0\) and \(\phi = \pm 0.5\) (representative of positive and negative bias respectively). This is done in the set of charts in Fig. 14. The first row of the figure shows the non-oriented output of \(x_1\) (purple), while the second row shows the oriented network output. The third and fourth rows display the non-oriented and oriented network outputs for \(x_2\) (blue) respectively. We observe perfect agreement for the unbiased model (centre column) as expected and very good agreement for the positively biased model as a result of positive stimulus and positive feedback throughout the oriented network (right column). Furthermore, as expected, the negatively biased model exhibits significant discrepancies between the oriented and unoriented forms in line with the observations of \(\delta _{ss}\) in Fig. 13 and consistent with observations of simpler pathways.

4 Discussion and conclusion

This manuscript is concerned with the appropriateness of reducing the complexity of a biochemical network/pathway by assuming that inhibition and activation relationships behave as perfect inverses of each other. Such reduction is common as a method of identification of topological properties and possible qualitative behaviours but this reduction is only appropriate under certain conditions explored in this manuscript.

We define an oriented form as a standard/simplest way of reducing a pathway/network. We also prove that if a model is so-called ‘unbiased’ and ‘symmetric’ then reduction to the oriented form by successive exchanging of activations and inhibitions (keeping track of all operations by ‘flipping’ the sign of the relevant nodes) produces a model indistinguishable (from the perspective of measured output) from the unoriented form.

We focus our study on the nature of (positive and negative) bias in the formation of error between an unoriented form and its reduced oriented form (both in the case of differing steady states and differing dynamic properties). We restrict ourself to symmetric models only. Our investigation is computational. Pathways in their oriented form have more easily identifiable structure and our investigation leads us to propose the following general conjectures for symmetrically modelled pathways and networks.

  1. 1.

    For many systems bias can be added to a model without generating significant errors in the orienting process. When errors appear they do so rapidly and suddenly and therefore care should be exercised when reducing a network qualitatively in this way.

  2. 2.

    When the oriented network is externally stimulated then the reduction to the oriented form produces significant errors if the model is negatively biased. If it is externally inhibited errors are formed if the model is positively biased.

  3. 3.

    Errors are compounded as more approximations (replacing activators for inhibitors and flipping a node to compensate) are taken. Furthermore, if these approximations are made further downstream, the errors are expected to be larger.

  4. 4.

    The former statements extend to pathways with feedback. However, error in the steady state is observed for both negative and positive bias models if the feedback is negative (and the loop includes a flipped node) and error in the temporal behaviour is observed for both negative and positive bias models if the feedback is positive.

  5. 5.

    The conclusions seem to be consistent if these general properties can be identified in the oriented form of more complex network models.

The complexity of real biological networks is well-known, and we acknowledge that the independent combination of arrows between nodes adds to this complexity. It is not within the scope of this work to claim a representation of the full complexity of possible models. Rather, we employ a common modelling framework that is frequently used to examine the impacts of network structure and topology on network behaviour Ma et al. (2009). Therefore, to maintain a consistent general definition of the network diagram, we have chosen to limit the vast possibilities of real networks to a series of independent links. Moreover, we have adopted the pathway discussed in Yamaguchi et al. (2014) solely for its network structure, which represents an example of realistic biological topology, in contrast to the highly simplified ones examined earlier in this paper. The purpose of adopting this pathway is not to investigate any biological pathway, but rather to provide an example of realistic biological topology.

We acknowledge that there are ways to manipulate the phase portraits of the dynamical systems that describe these networks. However, we are limiting ourselves to a specific way of examining different topologies to identify if they share common features. In doing so, we choose a method and remain consistent with it when comparing topologies, for instance, a topology with what we refer to as the oriented form. To achieve this, we require comparable networks to agree on the embedded pathways (the starting and ending points). From this common ground, the orientation process yields unique results. We understand that this is a complex question, and we hope that we have given it the attention it deserves.

This study has many limitations and suggests very strongly more rigorous work that should be done. We have opted not to look into detail at symmetry in this paper. This is partly due to the length of the paper but also because bias seems more common/significant in mathematical models in the literature. Furthermore, we do not have an objective definition to measure the extent of bias that makes sense. We have a measure of bias for the model used in this paper \(\phi \) but this measure is arbitrary. It is therefore important that the scale of \(\phi \) not be given too much emphasis. We also investigate one single model. We attempted to create this model with symmetry but also with the kinds of nonlinear relationships common in biochemical network models. It remains as future work to investigate the generalisability and analysis of the conjectures posed in this manuscript to a much more broad class of model.

The observations found in this study form a framework with which to assess biochemical networks and determine if qualitative reductions are appropriate or if typologies are likely to be more idiosyncratic based on the specific quantitative model used to simulate it.