1 Introduction

In recent years, several blackouts have caused enormous economic losses and severely threatened social security, drawing wide attention to power system vulnerability [1, 2]. The collapse of some critical nodes in a power grid leads to severe loss of connectivity [3]. Severe faults at important nodes may induce cascading failures and finally result in instability of the whole system [4]. Therefore, effective identification of important nodes is the foundation of differentiated management and stable operation of power grid. For example, stable operation can be assured by enhancing the protection of important nodes. Besides, node importance evaluation can be applied to network reconfiguration in black start [5], vulnerability analysis [6], relay settings in a complicated ring network [7], etc.

Complex network theory is an effective tool for studying the topological and dynamic properties of networks [8]. A power grid is a complex network [9, 10], and most available methods for evaluating node importance of power grid are based on complex network theory. In [4, 5], the change of the weighted network agglomeration after node contraction is defined as the index of node importance. In [11], an improved model that considers topology and power injection is proposed based on complex network theory. In [12], an index of node importance is proposed that combines the overlapping community structure of the power grid, the system operating characteristics and the reactance values of transmission lines. In [13], extended betweenness and net-ability drop are defined as indexes of node importance with consideration of electric distance, power transfer distribution factors and line flow limits. In the above methods, the power grid is abstracted as a weighted and undirected graph whose edge weights are given by the reactance parameters of transmission lines. These methods evaluate node importance mainly from the perspective of topology, using quantities such as degree, betweenness and network agglomeration. In fact, a power grid can be abstracted as a directed graph according to power flow direction, and its node importance is related not only to the topology but also to line flow, power sources and loads.

Power grid and Internet have some common properties [14]: both are complex networks and can be abstracted as directed graphs. However, few papers apply Internet algorithms to power grid. A method for evaluating node importance of power grid based on the PageRank algorithm is proposed in [15], but PageRank evaluates node importance mainly from the perspective of in-degree.

Taking power flow, load and power source into consideration, a method for evaluating node importance of power grid based on MBCC-HITS algorithm is proposed in this paper. The method defines authority score and hub score to estimate node importance of power grid from the perspectives of inflow and outflow power. The sum of two scores is defined as an index of node importance of power grid, which evaluates node importance more objectively and comprehensively.

2 MBCC-HITS algorithm

The MBCC-HITS algorithm is based on the hyper-text induced topic selection (HITS) algorithm [16]. In HITS, an authority score and a hub score are defined as measures of web page importance; they represent the contributions of a web page to information originality and information transmission, respectively [17]. Internet can be abstracted as a directed and unweighted graph according to the connection relations among web pages. Each web page corresponds to a node in the directed graph, and the direction of each edge is given by the direction of the corresponding hyperlink. The authority score of a web page equals the sum of the hub scores of the web pages that cite it (i.e. its in-linked web pages), so it reflects the importance of the page from the perspective of in-degree. The hub score of a web page equals the sum of the authority scores of the web pages that it cites (i.e. its out-linked web pages), so it reflects the importance of the page from the perspective of out-degree.

The set of web pages is \(S = \{1, 2, 3, \ldots, N\}\), and the adjacency matrix of the directed graph corresponding to the connection relations of S is H. \(y_{n}\) and \(z_{n}\) are the authority score and hub score of page n. \(A_{n}\) is the set of in-linked web pages, i.e. the pages that point to page n, and \(B_{n}\) is the set of out-linked web pages, i.e. the pages that page n points to. The vectors of authority scores and hub scores of S are \(\varvec{y}^{*} = (y_{1}, y_{2}, \ldots, y_{N})\) and \(\varvec{z}^{*} = (z_{1}, z_{2}, \ldots, z_{N})\). The relationships between authority scores and hub scores are as follows.

$$y_{n} = \sum\limits_{{i \in A_{n} }} {z_{i} } ,\;\varvec{y}^{*} = \varvec{H}^{\text{T}} \varvec{z}^{*}$$
(1)
$$z_{n} = \sum\limits_{{i \in B_{n} }} {y_{i} } ,\;\varvec{z}^{*} = \varvec{Hy}^{*}$$
(2)

The vector of authority score is calculated iteratively by (3) when the initial vector of authority score is given.

$$\varvec{y}^{*k} = \varvec{H}^{\text{T}} \varvec{Hy}^{*k - 1} \quad k = 1,\; 2,\; 3,\; \ldots$$
(3)

where k is the iteration index.

Once the authority score vector has converged, the final hub score vector is calculated by \(\varvec{z}^{*} = \varvec{Hy}^{*}\).
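The iteration in (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `hits`, the tolerance, the starting vector and the 2-norm rescaling in each step are choices made here (the rescaling mirrors the unitization rule quoted later in Section 3.4), and the 3-page graph is hypothetical.

```python
import math

def hits(H, tol=1e-10, max_iter=1000):
    """Plain HITS on an unweighted adjacency matrix H (H[i][j] = 1 if page i links to page j)."""
    n = len(H)
    y = [1.0 / math.sqrt(n)] * n  # assumed starting authority vector
    for _ in range(max_iter):
        # Eq. (2): z = H y (hub score = sum of authority scores of out-linked pages)
        z = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
        # Eq. (1): y = H^T z; combined with the line above this is Eq. (3)
        y_new = [sum(H[i][j] * z[i] for i in range(n)) for j in range(n)]
        norm = math.sqrt(sum(v * v for v in y_new))
        if norm == 0.0:
            return y_new, z  # graph without links: all scores are zero
        y_new = [v / norm for v in y_new]  # assumed 2-norm rescaling
        delta = max(abs(a - b) for a, b in zip(y, y_new))
        y = y_new
        if delta < tol:
            break
    # Final hub scores from the converged authority vector, z* = H y*
    z = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    return y, z

# Hypothetical 3-page web: pages 0 and 1 both link to page 2.
H = [[0, 0, 1],
     [0, 0, 1],
     [0, 0, 0]]
authority, hub = hits(H)
```

In this toy graph all authority ends up on page 2, the only page that is cited, while pages 0 and 1 share the hub role.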

MBCC-HITS algorithm, which originates from HITS algorithm, proposes a new hyperlink weighting scheme based on common citations (cocitations) among a set of web pages to estimate the importance of web pages more reasonably. Besides, it also eliminates defects in calculation process of HITS algorithm and ensures the existence and uniqueness of solutions [18]. Then (3) is modified as follows.

$$\varvec{y}^{*k} = [\beta {\varvec{D}}_{c}^{ - 1} (\varvec{H}^{\text{T}} \varvec{H} + {\varvec{pe}}^{\text{T}} ) + (1 - \beta )(1/N){\varvec{ee}}^{\text{T}} ]^{\text{T}} \varvec{y}^{*k - 1}$$
(4)

where \(\varvec{D}_{c}\) is an N × N matrix generated according to cocitations among a set of web pages; p is an N-dimensional vector, generated according to whether the in-degree of a web page equals zero, which ensures the existence of a solution; β is a damping coefficient that also ensures the existence of a solution; e is the N-dimensional vector whose elements all equal one. The specific calculation processes of \(\varvec{D}_{c}\) and p are given in [16].
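The structure of iteration (4) can be illustrated with the following Python sketch. It is not the scheme of [16]: the cocitation-based matrix \(\varvec{D}_{c}\) is not reproduced in this paper, so the sketch assumes, purely for illustration, that \(\varvec{D}_{c}\) is the diagonal matrix of row sums of \(\varvec{H}^{\text{T}}\varvec{H} + \varvec{pe}^{\text{T}}\), which makes the bracketed matrix row-stochastic. It also assumes that p has a one for every node with zero in-degree, that link weights are non-negative, and that β = 0.85 (the value quoted in Section 3.4). The function name and the test graph are hypothetical.

```python
import math

def mbcc_hits_authority(H, beta=0.85, tol=1e-12, max_iter=1000):
    """Power iteration y^k = M^T y^(k-1) with M built as in Eq. (4), under an ASSUMED D_c."""
    n = len(H)
    # p_i = 1 for nodes with zero in-degree (assumed reading of the definition of p)
    p = [1.0 if all(H[k][i] == 0 for k in range(n)) else 0.0 for i in range(n)]
    # C = H^T H + p e^T, with e the all-ones vector
    C = [[sum(H[k][i] * H[k][j] for k in range(n)) + p[i] for j in range(n)]
         for i in range(n)]
    # ASSUMPTION: D_c = diag(row sums of C); the cocitation-based D_c of [16] is not used here
    d = [sum(row) for row in C]
    M = [[beta * C[i][j] / d[i] + (1.0 - beta) / n for j in range(n)]
         for i in range(n)]
    y = [1.0 / math.sqrt(n)] * n
    for _ in range(max_iter):
        y_new = [sum(M[i][j] * y[i] for i in range(n)) for j in range(n)]  # M^T y
        norm = math.sqrt(sum(v * v for v in y_new))
        y_new = [v / norm for v in y_new]  # keep the 2-norm equal to one
        delta = max(abs(a - b) for a, b in zip(y, y_new))
        y = y_new
        if delta < tol:
            break
    return y

# Hypothetical 3-node graph: nodes 0 and 1 both point to node 2.
y = mbcc_hits_authority([[0, 0, 1],
                         [0, 0, 1],
                         [0, 0, 0]])
```

Under these assumptions every row of the iteration matrix is positive and sums to one, which is what guarantees a unique positive limit vector regardless of dangling nodes.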

3 Application of MBCC-HITS algorithm in node importance evaluation of power grid

3.1 Corresponding relations between power grid and Internet

Recent research shows that power grid and Internet are both complex networks and have some properties in common [14]. By analogy with Internet, each bus (after removing transformer branches and regarding the two or three terminals of a transformer as one bus) can be regarded as a web page, and each transmission line (after merging double-circuit lines on the same tower) can be regarded as a hyperlink whose direction is the direction of active power flow. A directed graph corresponding to the topology of power grid can then be established without consideration of power sources, loads and power flow values, so that Internet algorithms based on directed graphs can be applied to power grid. The corresponding relations between power grid and Internet are shown in Table 1.

Table 1 Comparisons of power grid and Internet

3.2 Index for evaluating node importance of power grid based on MBCC-HITS algorithm

Generally, web pages are ranked according to authority scores. The importance of a web page depends on the numbers and importance of the web pages which cite it.

For a node of power grid, importance is determined by both inflow and outflow power. Therefore, the importance of a node depends on the number and importance of its incoming and outgoing lines, as well as the importance of the nodes connected to it.

When the MBCC-HITS algorithm is applied to node importance evaluation of power grid, the authority score \(y_{n}\) and hub score \(z_{n}\) evaluate the importance of node n from the perspectives of inflow and outflow power, respectively. The authority score \(y_{n}\) depends on the number and power flow of the incoming lines of node n, as well as the importance of the nodes connected to it by incoming lines. The hub score \(z_{n}\) depends on the number and power flow of its outgoing lines, as well as the importance of the nodes connected to it by outgoing lines. Because the total inflow power of a node equals its total outflow power, the authority score and hub score should have the same weight in determining node importance. Thus, the sum of the two scores, \(R_{n}\), is defined as the index for evaluating the importance of node n.

$$R_{n} = y_{n} + z_{n}$$
(5)

The higher \(R_{n}\) is, the more important node n is.

3.3 Modifications of MBCC-HITS algorithm based on operational characteristics of power grid

Power grid can be abstracted as a directed graph according to the corresponding relations in Table 1, and the authority score and hub score of a node can be calculated by the MBCC-HITS algorithm. However, in practical node importance evaluation, node importance is related not only to the topology of power grid but also to the following key factors, which reflect its operational characteristics.

1) Power flow. It reflects the operating mode of power grid. For a fixed topology, different operating modes result in different line flows and different node importance.

2) Load capacity. It influences power flow distribution and reflects voltage grade. Generally, the larger the load capacity is, the higher the voltage grade and the importance of the load are.

3) Power source. If a node is connected to a power source, the output of the source largely determines the importance of the node.

3.3.1 Modification based on power flow

The operating mode of power grid, namely line flow, should be taken into consideration when evaluating node importance. The importance of a node depends largely on the inflow and outflow power transferred through its incoming and outgoing lines. Therefore, the flow value of a line can be set as the weight of this line. A weighted adjacency matrix H′ is then generated according to the topology of power grid and the values and directions of power flow. The elements of H′ are given as follows.

$$H_{ij}^{\prime} = \left\{ {\begin{array}{ll} {S_{i \to j} }, & {{\text{nodes}}\;i\;{\text{and}}\;j\;{\text{are}}\;{\text{connected}}\;{\text{by}}\;{\text{line}}\;l_{i \to j} } \\ {0}, & {{\text{no}}\;{\text{connection}}\;{\text{between}}\;{\text{nodes}}\;i\;{\text{and}}\;j} \\ \end{array} } \right.$$
(6)

When evaluating node importance of power grid based on the MBCC-HITS algorithm, the initial unweighted adjacency matrix H is replaced by H′. This differs from web page ranking, where Internet is abstracted as a directed and unweighted graph, and is a key feature of this paper. With this modification, node importance evaluation is related not only to the topology but also to the operating mode of power grid.
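As a minimal sketch of (6), the weighted matrix can be assembled from a list of line flows. The function name `weighted_adjacency`, the tuple layout and the flow values below are hypothetical and chosen only for illustration; summing the flows of parallel lines between the same pair of buses is an assumption made here to mimic the merging of double-circuit lines.

```python
def weighted_adjacency(num_nodes, line_flows):
    """Build H' from (sending_bus, receiving_bus, flow_magnitude) tuples, following Eq. (6)."""
    H = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, s in line_flows:
        if s <= 0:
            raise ValueError("list each line from sending end to receiving end with a positive flow")
        # Assumption: parallel lines between the same two buses are merged by summing flows
        H[i][j] += s
    return H

# Hypothetical 3-bus example (per-unit flows, not taken from any IEEE test case)
line_flows = [(0, 1, 1.5), (0, 1, 0.5), (1, 2, 0.75)]
H_prime = weighted_adjacency(3, line_flows)
```

Entries for unconnected pairs, and for the direction opposite to the power flow, stay at zero, so the matrix remains directed.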

3.3.2 Modification based on load

For a load \(l_{i}\) connected to node i, a load node \(n_{{l_{i} }}\) and a directed edge \(l_{{i \to n_{{l_{i} }} }}\) are added to the initial directed graph generated according to the topology of power grid. The direction of \(l_{{i \to n_{{l_{i} }} }}\) is from node i to load node \(n_{{l_{i} }}\), and its apparent power flow is the load capacity \(S_{{l_{i} }}\). This modification increases the out-degree of node i and uses the load capacity as the edge weight, which makes the evaluation more reasonable. The load node \(n_{{l_{i} }}\) itself is excluded from the final node importance ranking.

3.3.3 Modification based on power source

For a power source connected to node i, a generator node \(n_{{G_{i} }}\) and a directed edge \(l_{{n_{{G_{i} }} \to i}}\) are added to the initial directed graph generated according to the topology of power grid. The direction of \(l_{{n_{{G_{i} }} \to i}}\) is from generator node \(n_{{G_{i} }}\) to node i, and its apparent power flow is the generator output \(S_{{G_{i} }}\). Similarly, the generator node \(n_{{G_{i} }}\) is excluded from the final node importance ranking.
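The two graph modifications of Sections 3.3.2 and 3.3.3 can be sketched as follows. The helper name `augment_with_loads_and_generators`, the dictionary inputs and the numbers are hypothetical; placing the auxiliary nodes after the original buses is a convention chosen here so that they are easy to drop from the final ranking.

```python
def augment_with_loads_and_generators(H, loads, generators):
    """Append one node per load and per generator to the weighted bus matrix H.

    loads / generators map a bus index to its apparent power (load capacity or
    generator output). Auxiliary nodes are indexed after the original buses.
    """
    n = len(H)
    extra = len(loads) + len(generators)
    size = n + extra
    H_aug = [row[:] + [0.0] * extra for row in H]
    H_aug += [[0.0] * size for _ in range(extra)]
    idx = n
    for bus, s_load in sorted(loads.items()):
        H_aug[bus][idx] = s_load   # edge from bus to its load node
        idx += 1
    for bus, s_gen in sorted(generators.items()):
        H_aug[idx][bus] = s_gen    # edge from generator node to its bus
        idx += 1
    return H_aug

# Hypothetical 2-bus system: generator at bus 0, load at bus 1, one line 0 -> 1
H = [[0.0, 2.0],
     [0.0, 0.0]]
H_aug = augment_with_loads_and_generators(H, loads={1: 1.5}, generators={0: 2.0})
```

Here node 2 is the load node of bus 1 and node 3 is the generator node of bus 0; only indices 0 and 1 correspond to real buses.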

3.4 Calculation process for evaluating node importance of power grid

In Section 2, the vector p in (4) ensures the existence of a solution in the iterative process. However, it causes the authority scores of all generator nodes, whose in-degree is zero, to be equal. This means that generators with different outputs would have the same influence on the importance of the nodes connected to them. Therefore, further iterations are proposed in this paper to solve this problem. Firstly, after the authority score vector calculated iteratively by (4) has converged, the final hub score vector is calculated by (2). Secondly, the final authority score vector is calculated by (1) from the final hub score vector. After these iterations, generators with different outputs make different contributions to the authority scores of the corresponding nodes, which further emphasizes the influence of generator output on node importance.
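The effect of the two extra steps can be seen in a small sketch. The function name `final_scores`, the 4-node graph and the uniform stand-in for the converged vector are hypothetical; the code simply applies (2) and then (1) with the weighted matrix.

```python
def final_scores(H, y_converged):
    """Apply Eq. (2) and then Eq. (1) once, starting from the converged authority vector."""
    n = len(H)
    # Eq. (2): final hub scores, z = H y
    z = [sum(H[i][j] * y_converged[j] for j in range(n)) for i in range(n)]
    # Eq. (1): final authority scores, y = H^T z
    y = [sum(H[i][j] * z[i] for i in range(n)) for j in range(n)]
    return y, z

# Hypothetical graph: generator node 2 feeds bus 0 with output 1.0,
# generator node 3 feeds bus 1 with output 3.0, and there are no other edges.
H = [[0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 3.0, 0.0, 0.0]]
y_final, z_final = final_scores(H, [0.5, 0.5, 0.5, 0.5])
```

Although both buses start from the same score, bus 1 ends with a larger authority score than bus 0 because the larger generator output enters twice: once through the hub score of the generator node and once through the edge weight.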

The calculation process for evaluating node importance of power grid is shown in Fig. 1.

Fig. 1 Flowchart for evaluating node importance of power grid

Some supplementary notes on Fig. 1 are as follows.

1) The initial authority score vector is \((1/\sqrt{N}, 1/\sqrt{N}, \ldots, 1/\sqrt{N})\), where N is the total number of nodes in the directed graph G′.

2) In (4), the elements of p corresponding to generator nodes equal one and the other elements equal zero; \(\varvec{D}_{c}\) is generated according to G′, H′ and p, and its specific calculation process is given in [16]; β = 0.85.

3) The unitization rule for every iteration is \(\sum\limits_{n = 1}^{N} (y_{n}^{k})^{2} = 1\).

4) The final authority score vector and the final hub score vector also need to be unitized.

5) The node importance ranking is obtained from the node importance vector.
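The last steps of the flowchart (unitization of the two final vectors, the index \(R_{n} = y_{n} + z_{n}\) of (5), and the ranking without load and generator nodes) can be sketched as below. The function names and the sample vectors are hypothetical, and the sketch assumes that the real buses occupy the first `num_buses` positions of each vector.

```python
import math

def unitize(v):
    """Scale v so that the sum of its squared elements equals one."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def rank_buses(y_final, z_final, num_buses):
    """Return (R, order): the importance index of each bus and the buses sorted by it."""
    y = unitize(y_final)
    z = unitize(z_final)
    # Eq. (5): R_n = y_n + z_n; auxiliary load/generator nodes are dropped here
    R = [y[n] + z[n] for n in range(num_buses)]
    order = sorted(range(num_buses), key=lambda n: R[n], reverse=True)
    return R, order

# Hypothetical final vectors for 3 buses plus 1 auxiliary node
R, order = rank_buses([3.0, 0.0, 4.0, 0.0], [0.0, 2.0, 0.0, 0.0], num_buses=3)
```

In this toy case bus 1 ranks first on the strength of its hub score alone, followed by bus 2 and bus 0, which shows why both scores enter the index.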

4 Case study

4.1 Application in IEEE 14-bus system

The IEEE 14-bus system [19] is shown in Fig. 2. Three methods are adopted to obtain the node importance ranking of this system. Method 1 is the method proposed in this paper. Method 2 is based on the MBCC-HITS algorithm without the modification based on power flow, i.e. the adjacency matrix used in its iterative calculation is unweighted. Method 3 adopts electrical betweenness [20] to evaluate node importance. The node importance rankings of the three methods, shown in Table 2, are used to verify the rationality of the modification based on power flow and the correctness of method 1. In Table 2, A, B and C represent the node importance rankings obtained by methods 1, 2 and 3, respectively.

Fig. 2 IEEE 14-bus system

Table 2 Node importance rankings of IEEE 14-bus system

The node importance rankings of methods 1 and 2 are compared to verify the rationality of the modification based on power flow. As shown in Table 2, node 1 is the most important node in method 1. This node has two outgoing lines and is connected to a generator. The active power output of the generator is 232.84 MW, which is 89.90 % of the total active load in the IEEE 14-bus system. The flow values of its two outgoing lines are the largest and second largest among the twenty lines of the system. Node 1 is therefore a vital node of the IEEE 14-bus system, which is consistent with the ranking of method 1. However, in method 2, node 1 ranks 13th, while node 4, which has the most outgoing and incoming lines, ranks 1st. This shows that a node importance ranking that ignores power flow is unreasonable.

The rationality can also be verified by comparing pairs of nodes. For example, nodes 5 and 9 have the same number of incoming and outgoing lines. The flow values of the two incoming lines and the two outgoing lines of node 5 rank 2nd and 7th, and 4th and 6th, respectively. Those of node 9 rank 9th and 13th, and 14th and 17th, respectively. Therefore, node 5 is more important than node 9 because of the larger power flow on its incoming and outgoing lines. In method 1, the two nodes rank 3rd and 6th, which accords with the actual situation. In method 2, however, the rankings of the two nodes are much closer because they have the same number of outgoing and incoming lines, which is unreasonable. The same analysis can be applied to nodes 2 and 6. It shows that a node importance ranking that ignores power flow is determined mainly by the topology of power grid, which is inconsistent with the practical situation.

Besides, in method 1, node 3 ranks 5th because its load is the largest in the IEEE 14-bus system. Node 11 is evaluated as the least important node: it is connected to no generator or load and has only one outgoing line and one incoming line, whose flow values rank 18th and 11th, respectively.

The correctness of method 1 is verified by comparison with method 3. Electrical betweenness, adopted in method 3, originates from betweenness in complex network theory and is modified to consider power transmission characteristics. It is a reasonable index for evaluating node importance of power grid. The node importance rankings evaluated by methods 1 and 3 are highly consistent; only the rankings of nodes 1, 2 and 8 differ slightly.

The above analyses show that the MBCC-HITS algorithm used in Internet can be applied to node importance evaluation of power grid.

4.2 Application in IEEE 118-bus system

To further verify the validity of the proposed method, node importance of the IEEE 118-bus system [19] is evaluated by three methods, and the resulting node importance rankings are compared in Table 3. Method 1 is the method proposed in this paper. Methods 2 and 3 measure node importance by extended betweenness T(v) and net-ability drop ΔA(Y) [13], respectively. Extended betweenness evaluates the importance of a node from the perspective of topology and power transmission limits. Net-ability drop evaluates the importance of a node by the change in the global efficiency of power grid after this node is removed.

Table 3 Node importance rankings of three methods

Table 3 shows the top 20 nodes in the node importance rankings evaluated by the three methods. CR represents the corresponding ranking of a node in method 1.

In Table 3, 16 of the top 20 nodes in method 2 can also be found in the top 20 nodes of method 3. Of the top 20 nodes in method 1, 9 and 10 can be found in the top 20 nodes of methods 2 and 3, respectively. Notably, there are 9 common nodes (emphasized in bold) among the top 20 nodes of methods 1, 2 and 3. Moreover, 16 and 14 of the top 20 nodes in methods 2 and 3, respectively, can be found in the top 30 nodes of method 1, and 19 and 18 of them, respectively, can be found in the top 50 nodes of method 1. The relative rankings of these common nodes differ among the three methods. These facts show that the node importance rankings evaluated by methods 1, 2 and 3 have many similarities.
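Overlap counts of this kind can be reproduced from any pair of ranked node lists with a few lines of Python. The function names and the short rankings below are hypothetical placeholders, not the data of Table 3.

```python
def top_k_overlap(ranking_a, ranking_b, k):
    """Nodes that appear in the top k of both rankings."""
    return set(ranking_a[:k]) & set(ranking_b[:k])

def count_within(ranking_ref, ranking_other, k, m):
    """How many of the top-k nodes of ranking_other lie in the top m of ranking_ref."""
    top_ref = set(ranking_ref[:m])
    return sum(1 for node in ranking_other[:k] if node in top_ref)

# Hypothetical rankings (most important node first)
method_1 = [1, 2, 3, 4, 5]
method_2 = [3, 1, 6, 2, 7]
common = top_k_overlap(method_1, method_2, k=3)
covered = count_within(method_1, method_2, k=3, m=5)
```

Widening the reference window (a larger m for the same k) corresponds to the top 30 and top 50 comparisons made in the text.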

The nodes that are not common to methods 1, 2 and 3 are analyzed as follows.

1) The importance ranking of node 89 is 2nd in method 1. This node has 4 outgoing lines, whose flow values rank 5th, 12th, 22nd and 33rd among the 197 lines in the IEEE 118-bus system. The generator connected to node 89 has the largest active power output (13.19 % of the total active power output of all generators). Node 66 is connected to a load and a generator and has 4 outgoing lines, whose flow values rank 6th, 42nd, 64th and 92nd. The first two outgoing lines are connected to the critical nodes 49 and 65, whose importance ranks 3rd and 4th among the 118 nodes, respectively. The active power output of its generator ranks 5th among all generators (8.52 % of the total active power output of all generators). However, nodes 89 and 66 cannot be found in the top 20 nodes of methods 2 and 3.

2) The importance ranking of node 59 is 5th in method 1. The load of this node is the largest in the system (5.86 % of all loads in the system). This node has 4 outgoing lines and 3 incoming lines, one of which ranks 18th in line flow value. It is also connected to a generator whose output ranks 12th. However, node 59 cannot be found in the top 20 nodes of methods 2 and 3.

3) Nodes 10 and 9 form a unit connection. The active power output of the generator connected to node 10 is the third largest (9.78 % of the total active power output of all generators). Although these two nodes have few outgoing and incoming lines, the flow values of lines 10–9 and 9–8 rank first and second among all lines. These facts show that the two nodes are highly important and that their importance is close. Nodes 8 and 5 are downstream nodes of the unit connection, and the flow value of line 8–5 ranks 3rd. These two nodes play an important role in transmitting power from node 10 to other nodes through their many outgoing lines. Besides, both of them are connected to loads and generators. Therefore, nodes 8 and 5 are slightly more important than nodes 9 and 10. The importance rankings of these four nodes in method 1 are consistent with the above analysis. However, none of the four nodes can be found in the top 20 nodes of methods 2 and 3.

4) Node 70 ranks 14th and 6th in methods 2 and 3, respectively, but only 63rd in method 1. This node has 2 incoming lines and 3 outgoing lines, whose flow values are very small: they rank only 74th and 152nd, and 118th, 161st and 168th, respectively. The node is connected to a load and a generator. The output of the generator ranks only 24th among the 54 generators, and the load ranks 22nd among the 99 loads. These facts result in a lower ranking of this node in method 1. Similarly, node 19 ranks 19th in method 3 but 83rd in method 1. The flow values of its outgoing and incoming lines rank 111th, 135th, 175th and 150th, the output of the generator connected to it ranks 29th, and the capacity of the load connected to it ranks 34th.

5) Node 30 is connected to no load or generator. However, nodes 8 and 26, whose importance ranks 7th and 26th, are connected to node 30, and the flow values of the two corresponding incoming lines rank 7th and 20th. Nodes 17 and 38, whose importance ranks 23rd and 24th, are also connected to node 30, and the flow values of the two corresponding outgoing lines rank 9th and 29th. Therefore, node 30 ranks 18th in method 1, because both the authority and hub scores of its adjacent nodes and the weights of its incoming and outgoing lines are large. This reflects the key idea of the MBCC-HITS algorithm.

5 Conclusion

Node importance evaluation of power grid has not yet been studied thoroughly, mainly because the evaluation index of node importance has not been unified. This paper introduces a web page importance evaluation algorithm to power grid and improves its applicability, and on this basis proposes a new method for evaluating node importance of power grid based on the MBCC-HITS algorithm. The characteristics of this method are as follows.

1) Both the authority score and the hub score are adopted in order to evaluate node importance of power grid comprehensively. The two scores measure node importance from the perspectives of inflow and outflow power. The importance of a node is related to the number and importance of its incoming and outgoing lines, as well as the importance of its adjacent nodes.

2) The node importance evaluated by this method is consistent with the operating state of power grid. The method takes loads, power sources, and the values and directions of power flow into consideration. These factors closely reflect the operational characteristics of power grid.