Background

Topological or structural analysis of biological networks can provide new insights into the design principles and evolutionary mechanisms of network molecules [1–4]. For instance, it has been widely accepted that biological networks have scale-free characteristics and that a few highly connected network nodes (hubs) play pivotal roles in maintaining the global network structure [5]. Moreover, other topological characteristics such as connectivity, clustering coefficient, and shortest path length have been proposed to explain the evolutionary rate and/or the lethality of network nodes. It has been shown that highly connected proteins in protein-protein interaction networks have a higher clustering coefficient and a smaller shortest path length. Consequently, such proteins are more likely to be essential and to evolve slowly [1, 3, 6–8]. There is, however, a pressing need to develop another topological measure that can better explain the relationship between network characteristics and the biological importance of network nodes [1, 9].

We note that feedback loops are ubiquitously found in various biological networks and play important roles in amplifying (positive feedback loop) or inhibiting (negative feedback loop) intracellular signals [10–15]. It has been suggested that such a feedback loop could be an important network motif [16–18]. Yet, it has not been fully investigated whether there exists a correlation between feedback loops and the functional importance of network nodes. Hence, we address this problem here and propose that the number of feedback loops (NuFBL) is a novel network measure characterizing the functional importance of network nodes.

To test our hypothesis, we use random Boolean network models in which directed links between nodes are randomly chosen. Random Boolean network models have been widely used to represent various biological networks and have successfully captured some biological properties [19–23]. For instance, they were used to show that the yeast transcriptional network converges to the same stable state and is robust against mutations of initial states [19]. They were also used to explain the remarkable robustness observed in genetic regulatory networks [20], as well as some properties of cell cycle networks, such as stability in relation to genome size and the number of active genes in relation to the in-degree distribution [21]. Previous studies also adopted random Boolean network models to show that the global dynamics of the genetic regulatory network of HeLa cells are highly ordered [22] and that the dynamics of various biological networks, such as multistability and oscillations, are related to positive or negative feedback loops [23]. These previous studies have validated the usefulness of random Boolean network models in analyzing the dynamical characteristics of biological networks.

Results and discussion

Correlation between the functional importance of network nodes and the NuFBL

The hippocampal CA1 neuronal signal transduction network

We considered the large signal transduction network of the hippocampal CA1 neuron of mice to examine the NuFBL as a new network measure [6]. We first confirmed the previous observation that proteins with a higher connectivity are more likely to be lethal and to have a slower evolutionary rate (data not shown). Lethal proteins are considered more essential than proteins that show no obvious phenotype when deleted [1]. Functionally important proteins are also known to be under a strong regulatory constraint, resulting in relatively slow evolution [24, 25]. To examine whether the NuFBL of a protein is similarly related to its functional importance, the NuFBL was plotted against the degree of phenotype and the evolutionary rate for grouped proteins as described in Methods. As shown in Fig. 1, more essential proteins (Fig. 1a) and more slowly evolving proteins (Fig. 1b) tend to have a larger NuFBL, which suggests that functionally important proteins in the signal transduction network are more likely to be regulated by many feedback loops. In contrast, the nonessential proteins in the "Not obvious" phenotype group showed a very small NuFBL and are less likely to be regulated by feedback loops. Note that, apart from the proteins with the slowest evolutionary rate, most proteins show little difference in the NuFBL.

Figure 1

Correlation between the functional importance of proteins and the NuFBL. (a) The NuFBLs were plotted against the mutant phenotypes of the proteins in the network, where proteins were classified according to the previous report [1]. (b) The NuFBLs were plotted against the evolutionary rate [1] (dN/dS) of proteins, which were grouped into five different classes according to their evolutionary rates. For each protein group, the average and the 95% confidence interval of the NuFBL are shown on the y-axis (see additional data file 4 for further details).

Boolean network models of biological networks

To further investigate whether the positive correlation between the NuFBL and the functional importance is an intrinsic principle of network dynamics, we performed extensive computer simulations for generalized biological network models represented by Boolean networks (see Methods). The importance of a node in the Boolean network model was defined as the probability with which either an initial state mutation or an update rule mutation of the node makes the network converge to a new attractor. In Boolean network models, a state trajectory starts from an initial state and eventually converges to either a fixed-point or a limit-cycle attractor. These attractors therefore represent diverse behaviors of biological networks such as multistability, homeostasis, and oscillation [26–28]. For instance, in the regulatory network that induces phenotype variations in bacteria, some epigenetic traits are represented by multiple fixed-point attractors [29]. In addition, mitogen-activated protein kinase cascades in animal cells [26, 27] and cell cycle regulatory circuits in Xenopus and Saccharomyces cerevisiae [28, 30] are known to produce multistable attractors. On the other hand, the transcriptional network of mRNAs for Notch signaling molecules shows an oscillation with a 2-h cycle driven by hes1 transcription [31], which corresponds to a limit-cycle attractor. From these examples, we can see that attractors represent the essential dynamics of biological networks. Therefore, if a mutation at a node makes the network converge to a different attractor, the node has a significant role in the network. This concept has been widely used in a number of previous studies based on computational approaches [32–35].

Fig. 2 shows the results for the Boolean networks with |V| = 14 and |A| = 19. The network nodes with a higher connectivity or NuFBL are more important, which is consistent with the observation in the neuronal signal transduction network above. We observed the same result for networks of different sizes (see additional data file 1). Moreover, we found that the NuFBL is a better network measure than the connectivity in evaluating the functional importance of a network node.

Figure 2

Correlation of connectivity and the NuFBL to the functional importance in Boolean networks. (a) Correlation between connectivity and the functional importance of network nodes with respect to initial state mutations. (b) Correlation between the NuFBL and the functional importance of network nodes with respect to initial state mutations. (c) Correlation between connectivity and the functional importance of network nodes with respect to update rule mutations. (d) Correlation between the NuFBL and the functional importance of network nodes with respect to update rule mutations. In each panel, all nodes were classified into five groups according to their connectivity or NuFBL ranks. All the results represent the average over 2000 randomly generated Boolean networks with |V| = 14 and |A| = 19. For each group, the average and the 95% confidence interval of the functional importance are shown on the y-axis. Here, the functional importance of a network node is defined as the probability with which the network converges to a different attractor when the value of the node is mutated. We obtained similar results for other Boolean networks with different |V| and |A| (see additional data file 1).

In addition to the NuFBL, we can think of another measure that represents the particular characteristics of feedback loops. For instance, we have investigated the relationship between the length of feedback loops at a node and its functional importance which is defined in the same way as in Fig. 2. In this case, the nodes with relatively longer or shorter loop lengths were functionally less important while the nodes with medium loop lengths were more important (see additional data file 2 for details). So, the length of feedback loops can be considered as another measure, but it is no longer linearly correlated with the functional importance unlike the NuFBL.

Comparison of the NuFBL and the connectivity

Correlation between the NuFBL and the connectivity in the neuronal signal transduction network

We compared the NuFBL and the connectivity as measures of network characteristics. As shown in Fig. 3, there is a strong positive correlation between the connectivity and the NuFBL (the correlation coefficient is 0.73). Interestingly, the positive correlation was stronger for the lethal and slowly evolving proteins, which have a high connectivity and a large NuFBL (red plus sign points in Fig. 3a, b). In contrast, there was only a weak correlation for the proteins of the non-lethal group or the rapidly evolving group (blue circle points in Fig. 3a, b). For the 152 proteins whose connectivity ranged from 5 to 9, the correlation coefficient was only 0.14.
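
The comparison above rests on a correlation coefficient computed over all proteins and over a restricted connectivity range. The following is a minimal sketch of that calculation, assuming the Pearson coefficient is meant (the text does not name the estimator). The function names and the toy values are hypothetical and are not data from the network.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def pearson_in_range(connectivity, nufbl, low, high):
    """Correlation restricted to nodes whose connectivity lies in [low, high]."""
    pairs = [(c, f) for c, f in zip(connectivity, nufbl) if low <= c <= high]
    return pearson([c for c, _ in pairs], [f for _, f in pairs])

# Made-up toy values (one entry per protein), for illustration only.
toy_connectivity = [2, 5, 6, 7, 9, 15, 20]
toy_nufbl = [0, 3, 1, 4, 2, 40, 90]
r_all = pearson(toy_connectivity, toy_nufbl)
r_mid = pearson_in_range(toy_connectivity, toy_nufbl, 5, 9)
```

Restricting the range in this way is what allows a strong overall correlation to coexist with a weak correlation among proteins of intermediate connectivity.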

Figure 3

Distribution of proteins with respect to connectivity and the NuFBL. (a) Proteins were classified into "Lethal", "Viable", and "Not obvious", respectively, according to their mutant phenotypes. (b) Proteins were classified into "Slow", "Middle", and "Fast", respectively, according to their evolutionary rates (see additional data file 3 for further details).

Classification of proteins in the CA1 neuronal signal transduction network

To probe the distribution of proteins, we classified the proteins into four different groups (see Methods): "no feedback loop & low connectivity", "no feedback loop & high connectivity", "feedback loop & low connectivity", and "feedback loop & high connectivity" (Table 1). The functional importance (R) estimated by the lethal mutant phenotype or slow evolutionary rate was significantly higher for the "feedback loop & high connectivity" group. Note that the connectivity or the NuFBL alone was not enough to discern all the different network characteristics.
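
The four-way grouping can be sketched as follows. This is an illustration under stated assumptions: the connectivity cut-off (`connectivity_threshold`) is a hypothetical parameter, since its value is not given in this section, and R is taken here to be the fraction of important proteins within a group, which is our reading of the text rather than a confirmed definition.

```python
def classify(connectivity, nufbl, connectivity_threshold):
    """Split nodes into four groups by loop membership and connectivity.

    connectivity, nufbl: dicts mapping node name -> value.
    connectivity_threshold: hypothetical cut-off for "high connectivity".
    """
    groups = {
        "no feedback loop & low connectivity": [],
        "no feedback loop & high connectivity": [],
        "feedback loop & low connectivity": [],
        "feedback loop & high connectivity": [],
    }
    for node in connectivity:
        loop = "feedback loop" if nufbl[node] > 0 else "no feedback loop"
        high = connectivity[node] >= connectivity_threshold
        level = "high connectivity" if high else "low connectivity"
        groups[f"{loop} & {level}"].append(node)
    return groups

def importance_ratio(group, important):
    """Fraction of nodes in `group` that belong to the set `important`."""
    if not group:
        return float("nan")
    return sum(node in important for node in group) / len(group)

# Made-up toy example, not data from the CA1 network.
toy_connectivity = {"A": 2, "B": 8, "C": 3, "D": 9}
toy_nufbl = {"A": 0, "B": 0, "C": 4, "D": 12}
toy_groups = classify(toy_connectivity, toy_nufbl, connectivity_threshold=5)
toy_ratio = importance_ratio(toy_groups["feedback loop & high connectivity"], {"D"})
```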

Table 1 Classification of proteins and their functional importance in the hippocampal CA1 neuronal signal transduction network

We analyzed the distinct features of the proteins in the four groups with respect to their functional roles (Fig. 4). Interestingly, we found that receptor proteins were enriched in the "high connectivity & no feedback loop" group (Fig. 4c) and that downstream kinases and proteins from receptors were enriched in the "high connectivity & feedback loop" group (Fig. 4d). These observations suggest that the proteins downstream of receptors in the signal transduction network are primarily responsible for the intensification of signals, and that feedback regulation is therefore required for the amplification and control of signals [36, 37].

Figure 4

Classification of proteins according to their function in the hippocampal CA1 neuronal signal transduction network and the proportion of each classified group. The proteins were classified into four groups: (a) "no feedback loop & low connectivity" group, (b) "no feedback loop & high connectivity" group, (c) "feedback loop & low connectivity" group, and (d) "feedback loop & high connectivity" group. For each group, the proportion of proteins classified according to their functions is specified.

Classification of proteins in the computational networks

Using simulations based on the Boolean network models, we further investigated the relationship between the connectivity and the NuFBL. All network nodes were classified into four groups as in Table 1, and the simulations confirmed that the connectivity is positively correlated with the NuFBL with respect to the functional importance of network nodes (Table 2). This was verified with other Boolean networks of different sizes (see additional data file 3). In particular, we note that the nodes involved in no feedback loop show comparatively low functional importance on average. This implies that if a protein in the "no feedback loop" group is relatively important, a new feedback loop is likely to be discovered around this protein.

Table 2 Classification of network nodes and their functional importance in generalized Boolean network models

Conclusion

We propose the NuFBL as a new network measure that can characterize the functional importance of network nodes. We have shown that the NuFBL is positively correlated with the connectivity in measuring network characteristics, and the network nodes with a higher NuFBL and a higher connectivity are more essential (lethal) and evolve slowly. Through extensive computational simulations, we found that the positive correlation between the NuFBL and the functional importance is an intrinsic property of network dynamics.

Unfortunately, at present, there are few large-scale biological networks that include information about feedback loops. A future study will therefore include a verification of the presented results in many other kinds of real biological networks. A further study is also needed to investigate the characteristics of feedback loops that can help us predict the functional importance of network nodes from other aspects of the data. Such characteristics include the timing of expression, the number of members in the loop, and the integrative sign of multiple interactions.

Methods

Connectivity, feedback loop, loop length, and the number of feedback loops (NuFBL)

Given a network composed of a set of nodes and a set of links between the nodes, the connectivity of a node is defined as the number of links connected to the node. A feedback loop means a closed simple cycle where nodes are not revisited except the starting and ending nodes. For instance, $v_0 \to v_1 \to v_2 \to \cdots \to v_{L-1} \to v_L$ is a feedback loop of length $L$ ($\geq 1$) if there are links from $v_{i-1}$ to $v_i$ ($i = 1, 2, \ldots, L$) with $v_0 = v_L$ and $v_j \neq v_k$ for distinct $j, k \in \{0, 1, \ldots, L-1\}$. The NuFBL of a node $v$ denotes the number of different feedback loops starting from $v$.
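
The definitions above can be made concrete with a short sketch. It is an illustration only, not the implementation used for the results: the function names, the depth-first enumeration, and the `max_length` parameter (a cap on the loop length) are assumptions introduced here, and each link is counted once when computing the connectivity.

```python
from collections import defaultdict

def connectivity(links, node):
    """Number of directed links that touch `node` (each link counted once)."""
    return sum(1 for source, target in links if node in (source, target))

def count_feedback_loops(links, node, max_length=10):
    """Count simple directed cycles that start and end at `node`.

    links: iterable of (source, target) pairs.
    max_length: longest loop length (number of links) that is counted.
    """
    successors = defaultdict(set)
    for source, target in links:
        successors[source].add(target)

    def extend(current, visited, length):
        # `length` is the number of links used so far on the current path.
        count = 0
        for nxt in successors.get(current, ()):
            if nxt == node:
                count += 1  # closes a loop of length `length + 1`
            elif nxt not in visited and length + 1 < max_length:
                count += extend(nxt, visited | {nxt}, length + 1)
        return count

    return extend(node, {node}, 0)

# Toy directed network: a -> b -> a, a -> b -> c -> a, and a self-loop on c.
toy_links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "a"), ("c", "c")]
nufbl_a = count_feedback_loops(toy_links, "a")
```

In the toy network, node a lies on the loops a → b → a and a → b → c → a, so its NuFBL is 2. Exhaustive enumeration of this kind grows quickly with network size, which is why a length cap is useful in large networks.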

Analysis of the hippocampal CA1 neuronal signal transduction network

We considered all 545 proteins and their 1258 interactions in the signal transduction network of the hippocampal CA1 neuron of mice [6]. Following the previous study [1], proteins were grouped according to their lethality and evolutionary rates. As it is difficult to enumerate all possible feedback loops in such a large network, we considered only the feedback loops whose length (i.e., the number of links comprising a feedback loop) is less than or equal to 10. Important proteins are defined as those with "lethal" phenotypes, and these are shown in the upper part of Table 1. The 20% most slowly evolving proteins are shown in the lower part of Table 1.

Analysis of generalized biological network models represented by Boolean networks

Boolean network models composed of a set of Boolean variables and regulatory relationships between the variables have been widely used as a useful tool for investigating the complex dynamics of various biological networks [38, 39]. In spite of their structural simplicity, Boolean networks can represent a variety of complex behaviors [23] and share many features with other continuous models [40, 41]. We employed such a Boolean network model and described biological networks by a directed graph, $G = (V, A)$, where $V$ is a set of Boolean variables and $A$ is a set of ordered pairs of the variables, called directed links ($|V|$ and $|A|$ denote the numbers of nodes and links, respectively). Each $v_i \in V$ has the value of 1 ("on") or 0 ("off"). A directed link $(v_i, v_j)$ has a positive ("activating") or negative ("inhibiting") relationship from $v_i$ to $v_j$. The value of each variable $v_i$ at time $t + 1$ is determined by the values of $k_i$ other variables $v_{i_1}, v_{i_2}, \ldots, v_{i_{k_i}}$ having a link to $v_i$ at time $t$ through a Boolean function $f_i : \{0, 1\}^{k_i} \to \{0, 1\}$.
Hence, we can represent the update rule as $v_i(t + 1) = f_i(v_{i_1}(t), v_{i_2}(t), \ldots, v_{i_{k_i}}(t))$, where we randomly use either a logical conjunction or disjunction for all the signed relationships in $f_i$. For instance, if a Boolean variable $v$ has a positive relationship from $v_1$ and a negative relationship from $v_2$, the conjunction and disjunction update rules are $v(t + 1) = v_1(t) \wedge \overline{v_2}(t)$ and $v(t + 1) = v_1(t) \vee \overline{v_2}(t)$, respectively. We defined the functional importance of a node in Boolean networks as follows: Given a network with $N$ Boolean variables, a state denotes a vector consisting of $N$ Boolean variables; there are $2^N$ states in total. Each state makes a transition to another state through the Boolean update function.
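
The update rule can be illustrated with a minimal sketch. Several details are assumptions made here for illustration and are not stated in the text: all nodes are updated synchronously, self-links may be drawn, link signs and the conjunction/disjunction choice are drawn with equal probability, and a node without incoming links keeps its value. The names `make_random_network` and `update` are hypothetical.

```python
import random

def make_random_network(n_nodes, n_links, rng):
    """Random signed directed links plus one AND/OR choice per node."""
    pairs = [(i, j) for i in range(n_nodes) for j in range(n_nodes)]
    # Map (source, target) -> +1 (activating) or -1 (inhibiting).
    links = {pair: rng.choice([+1, -1]) for pair in rng.sample(pairs, n_links)}
    use_and = [rng.random() < 0.5 for _ in range(n_nodes)]
    return links, use_and

def update(state, links, use_and):
    """Apply v_i(t + 1) = f_i(inputs of v_i at time t) to every node at once."""
    new_state = list(state)
    for i in range(len(state)):
        # A positive link passes the input value; a negative link passes its negation.
        inputs = [state[src] if sign > 0 else 1 - state[src]
                  for (src, dst), sign in links.items() if dst == i]
        if inputs:  # assumption: a node without inputs keeps its value
            new_state[i] = int(all(inputs) if use_and[i] else any(inputs))
    return tuple(new_state)

# The example from the text: node 0 is v, node 1 is v1, node 2 is v2.
example_links = {(1, 0): +1, (2, 0): -1}
conj = update((0, 1, 1), example_links, [True, True, True])     # v1 AND (NOT v2)
disj = update((0, 1, 1), example_links, [False, False, False])  # v1 OR (NOT v2)

random_links, random_rules = make_random_network(14, 19, random.Random(0))
```

With $v_1 = 1$ and $v_2 = 1$, the conjunction rule sets $v$ to 0 and the disjunction rule sets $v$ to 1, matching the two formulas given in the text.
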
We constructed a state transition network that describes the transitions of all the states. For a network node $v$, its functional importance can be considered in two ways. One is the functional importance with respect to initial state mutations. It is defined as the probability with which two state trajectories starting from $s$ and $s'$ converge to different attractors, computed over all $2^{N-1}$ pairs of states $s$ and $s'$ that differ only in the value of $v$. The initial state mutation corresponds to the abnormal state (or malfunctioning) of a protein or gene caused by mutations. The other is the functional importance with respect to update rule mutations. It is defined as the probability with which two state trajectories starting from the same state converge to different attractors, where one trajectory is obtained without the update rule mutation and the other is obtained with an error in updating the value of $v$ occurring with a probability of 0.2. The update rule mutation corresponds to the change of relationships between nodes by removal or addition of links.
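
The functional importance with respect to initial state mutations can be sketched as below. This is an illustrative sketch under assumptions, not the simulation code behind the reported results: updates are synchronous, a node without inputs keeps its value, and an attractor is identified by the set of states it contains. The function names are hypothetical, and the update rule mutation is not covered.

```python
from itertools import product

def update(state, links, use_and):
    """Synchronous Boolean update; links maps (source, target) -> +1 or -1."""
    new_state = list(state)
    for i in range(len(state)):
        inputs = [state[src] if sign > 0 else 1 - state[src]
                  for (src, dst), sign in links.items() if dst == i]
        if inputs:  # assumption: a node without inputs keeps its value
            new_state[i] = int(all(inputs) if use_and[i] else any(inputs))
    return tuple(new_state)

def attractor_of(state, links, use_and):
    """Follow the trajectory from `state`; return its attractor as a frozenset."""
    first_seen = {}
    trajectory = []
    while state not in first_seen:
        first_seen[state] = len(trajectory)
        trajectory.append(state)
        state = update(state, links, use_and)
    # The states from the first repeated state onward form the attractor.
    return frozenset(trajectory[first_seen[state]:])

def importance_initial_state(node, n_nodes, links, use_and):
    """Fraction of the 2^(N-1) state pairs differing only at `node`
    whose trajectories converge to different attractors."""
    others = [i for i in range(n_nodes) if i != node]
    differing = 0
    for values in product((0, 1), repeat=n_nodes - 1):
        s = [0] * n_nodes
        for i, value in zip(others, values):
            s[i] = value
        s_mutated = list(s)
        s_mutated[node] = 1
        a = attractor_of(tuple(s), links, use_and)
        b = attractor_of(tuple(s_mutated), links, use_and)
        if a != b:
            differing += 1
    return differing / 2 ** (n_nodes - 1)

# Toy network: node 0 activates itself and node 1.
toy_links = {(0, 0): +1, (0, 1): +1}
toy_rules = [True, True]
imp0 = importance_initial_state(0, 2, toy_links, toy_rules)
imp1 = importance_initial_state(1, 2, toy_links, toy_rules)
```

In the toy network every state (a, b) moves to (a, a), so flipping node 0 always changes the attractor (importance 1.0) while flipping node 1 never does (importance 0.0). For larger networks, caching the attractor reached from each state would avoid recomputing trajectories.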