1 Introduction

The modern power system is pushed close to critical operating limits in the market environment. High capacity and long transmission networks are widely used to meet the power supply demand of modern society. Wind and solar power, as clean and renewable energy sources, are widely adopted, but they are inherently volatile, intermittent and random. Therefore, improper handling of certain partial failures can easily lead to accidents and severe chain reactions, and may eventually cause a large-scale blackout.

In recent years, there have been a great number of widespread blackouts around the world. For example, the North American blackout on August 14, 2003 caused a huge loss and the restoration of power supply lasted almost two weeks [1]. The European power outage on November 4, 2006 affected 15 million people and lasted up to 2 hours. Brazil and Paraguay experienced an extensive blackout on November 10, 2009. The Fukushima nuclear power plant was shut down in an emergency after the earthquake and tsunami on March 11, 2011; its external power and emergency power were inadequate to support the cooling system, which led to radiation leakage and other catastrophic consequences. The largest power outage in northern India involved 50 GW of load, affected 670 million people and lasted from July 30 to 31, 2012 [2].

Large-scale blackout risks still exist and are inevitable, although a great amount of work has been done to make power systems resilient against outages [3]. A proper restoration plan can effectively mitigate the negative impact on the public, the economy, and the power system itself. Research on how to restore the power system quickly and effectively after outages is of vital significance.

Power system restoration after a partial or complete collapse is quite a complex process. Many factors need to be considered including the operating status of the system, the equipment availability, the restoration time and the success rate of operation. It needs not only a large amount of analysis and verification, but also decisions made by dispatching personnel. Power system restoration is a multi-objective, multi-stage, multi-variable and multi-constraint optimization issue, and is full of non-linearity and uncertainty. It can be described as a typical semi-structured decision-making and it is difficult to obtain a complete solution [4].

The objectives of restoration are to enable the power system to return to normal conditions securely and rapidly, minimize losses and restoration time, and diminish adverse impacts on society. Many non-structured methods and technologies and object-oriented expert systems have been employed in making restoration schemes to address the above objectives, but the establishment and maintenance of a knowledge base of past restorations remains a bottleneck. Case-based reasoning [5] is dependent on typical scenarios. With the development of computational intelligence, some heuristic algorithms such as genetic algorithms [6], artificial neural networks [7] and fuzzy logic [8] have been applied to system restoration, which may be a promising path forward. Graph theory with Petri nets [9] is also employed, but verification of constraints and reduction of uncertainties both need improvement. Based on the regional distribution characteristics in space, multi-agent technologies [10–13] have been developed and show potential. As a functional extension of expert systems and heuristic rules, decision support systems [14] have been shown to be efficient.

A bibliographical survey of publications covering the 1980s and 1990s is provided in nine different topics by [15]. Power system restoration remains a very active research area. To analyze it effectively, the three stages of black-start, network reconfiguration and load restoration are considered. This paper presents a review of the last decade of research, from 2006 onward, covering black-start, network reconfiguration, load restoration, and emerging technologies as applied to transmission system restoration; distribution system restoration [16] is beyond this paper’s focus.

2 Black-start

The black-start stage, also called preparation period, is a stage in which a black-start generator provides cranking power to restart a non-black-start (NBS) generator. Related research mainly includes black-start power source selection, scheme formulation and assessment, field testing, dynamic and protection issues, and sectionalization strategy for parallel restoration.

2.1 Black-start power source selection

Usually, black-start power sources include the units with self-start ability such as hydroelectric generating units, fuel and gas turbine units and support power provided by adjacent interconnected systems. Gas-turbine-based plants can be profitably used in power system restoration [17]. The black-start resource procurement decision can integrate with a restoration planning model using optimization to produce a minimal cost procurement plan [18].

2.2 Scheme formulation and evaluation

Black-start schemes are generally formulated according to effectiveness and reliability. General guidelines of the considerations and needs for the development of effective restoration plans are described in [19] for electric utilities serving large metropolitan areas. Ref. [20] proposes the concept of success rate of unit restoration. It selects the optimal restored unit in advance and restoration paths are obtained by the algorithm of K-shortest paths. To generate the most suitable black start schemes, knowledge-based expert systems with decision support techniques are utilized [14, 21]. In [21], an expert system is developed by combining a fast reasoning mechanism with object-oriented features considering time-varying load. Restoration plans developed offline are used as guidelines for dispatchers in an online environment. Ref. [14] discusses online decision support tools for system restoration. Ref. [22] develops a black-start decision-support system that serves as an offline planning tool with an interactive graphical user interface. A hybrid approach combining graph algorithm and swarm intelligence is adopted to achieve the optimal solutions. Ref. [23] analyzes the current automated black-start program of the ERCOT grid, and gives a new black-start service annual selection analysis by a method of islanded formation.
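The K-shortest-paths idea mentioned for [20] can be illustrated with a minimal sketch. It is not the algorithm of [20]: it is a simple best-first enumeration of loop-free paths (not Yen's algorithm), and the bus names and line weights are hypothetical. The weights stand for any non-negative cost of energizing a line, such as charging capacitance or operation time.

```python
import heapq


def k_shortest_paths(graph, source, target, k):
    """Return up to k loop-free paths from source to target, cheapest first.

    graph: dict mapping node -> dict of neighbour -> non-negative weight.
    """
    found = []
    # Each heap entry is (cost so far, path so far).
    heap = [(0.0, [source])]
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            found.append((cost, path))
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:  # keep paths simple: no bus is visited twice
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return found


# Hypothetical network: a black-start unit (BS), two substations, one NBS unit.
example_grid = {
    "BS": {"A": 1.0, "B": 2.0},
    "A": {"BS": 1.0, "B": 0.5, "NBS": 2.0},
    "B": {"BS": 2.0, "A": 0.5, "NBS": 2.0},
    "NBS": {"A": 2.0, "B": 2.0},
}

for cost, path in k_shortest_paths(example_grid, "BS", "NBS", 3):
    print(cost, " -> ".join(path))
```

Because partial paths are expanded in order of accumulated cost and weights are non-negative, complete paths are returned in non-decreasing cost order. The enumeration can grow exponentially on meshed networks, so a dedicated K-shortest-paths algorithm would be preferable at realistic scale.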

Evaluation methods are available to choose the best from the feasible solutions offered by black start schemes such as those above. A hierarchical process-based method using fuzzy analysis is proposed to assess black-start schemes [24]. Vague set theory is also introduced to make black-start scheme decisions in [25], which can reasonably deal with the correlation among the evaluation indices of the black-start schemes and fuzzy information in the black-start decision. Ref. [26] presents an intuitive fuzzy distance-based method to analyze the consistency of black-start group decision-making results. The weights of black-start indices as well as the weights of decision-making experts are modified dynamically until a satisfactory result is obtained.

These black-start scheme generation and evaluation processes are generally carried out offline. Default scenarios used offline may be different from actual black-start situations, so a black-start scheme may not play its role as expected.

2.3 Field testing

In order to improve the efficiency of restoration schemes, many power companies have conducted black-start field tests. The Shandong power grid of China has performed a series of black-start field tests using Taishan pumped storage power station as cranking power, which makes full use of the long-term operation capability of pumped storage units in ultra-low load conditions, according to principles of risk management, continuous improvement, and accident priority control. Ref. [27] summarizes the main problems to be addressed during the field tests, and proposes a black-start field test management method based on the plan-do-check-action (PDCA) model to achieve standardized management of black-start field tests. Italy [17], Japan [28] and Sweden [29] have also reported their experiences of black-start tests in verifying the feasibility of black-start schemes. These field tests provide valuable advice to help system operators to improve their restoration schemes. It is necessary to do some field tests, alongside simulations, to improve the restoration efficiency.

2.4 Dynamic and protection issues

The frequency excursions are arrested automatically by various control and protection devices immediately after a major disturbance. However, it is challenging to coordinate the control and protective mechanisms with the operation of generating plants and the electrical system. During the subsequent restoration phases, plant operators in coordination with system operators attempt to manually maintain a real and reactive power balance. The duration of these manual procedures has invariably been much longer than equipment can endure. Several dynamic phenomena are presented and the required research and development identified in [30].

Ref. [31] updates the protection issues that occur during power system restoration. An effort has been made to cover most of the general protection related issues that can arise during power system restoration, and the functions that can be used to improve relay operation and also support the system operator in the restoration process.

2.5 Sectionalization strategy

To facilitate rapid power system restoration, the power system needs to be divided into several subsystems for parallel restoration. Ref. [32] employs a method based on an ordered binary decision diagram to derive proper splitting strategies for large-scale power systems. Ref. [33] presents a novel sectionalization method based on the wide area measurement system (WAMS) for the build-up strategy in power system restoration. In [34], a new method is proposed to divide systems for restoration based on the community structure of complex network theory. A modularity index is employed to evaluate the optimal number of the restoration subsystems. Ref. [35] proposes an optimization model for optimal sectionalization of restoration subsystems with a special emphasis on the coordination of generator ramping and the pickup of critical loads. A methodology based on constrained spectral clustering uses the physical and inherent properties of the network to determine a suitable sectionalization strategy in [36]. The dynamic restoration partition method adapts to deal with different scenarios after the blackout.
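As an illustration of the modularity index mentioned for [34], the sketch below evaluates the standard Newman modularity of a candidate partition, Q = sum over subsystems c of [L_c/m - (d_c/(2m))^2], where m is the number of lines, L_c the number of lines inside subsystem c and d_c the total degree of its buses. Treating the grid as an unweighted graph is an assumption made here for brevity; [34] may weight lines differently, and the bus numbers are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an unweighted, undirected graph.

    edges: list of (u, v) bus pairs, one per line.
    community: dict mapping bus -> subsystem label.
    """
    m = len(edges)
    internal = {}  # lines with both ends inside the same subsystem
    degree = {}    # sum of bus degrees per subsystem
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Hypothetical 6-bus system: two meshed areas joined by a single tie line (3-4).
lines = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_islands = {1: "west", 2: "west", 3: "west", 4: "east", 5: "east", 6: "east"}
one_island = {bus: "all" for bus in range(1, 7)}

print(modularity(lines, two_islands))  # positive: the split follows the structure
print(modularity(lines, one_island))   # zero: no partition at all
```

Comparing Q across partitions with different numbers of subsystems is one way to choose how many islands to restore in parallel; a higher Q indicates denser connections inside subsystems than between them.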

For some systems, inability to control the voltage and frequency may lead to unsuccessful restoration. Suitable islanding can help to improve the feasibility and reliability of restoration by simplifying complex restoration processes [37].

3 Network reconfiguration

In the network reconfiguration phase, the generation capacity obtained after the black-start is used to restore other important units and substations in a reasonable start-up sequence, establishing a stable backbone network and laying the foundation for full load restoration. Due to its complex features, network reconfiguration needs to be carried out under the guidance of restoration strategies. A three-stage restoration strategy [38] is proposed to manage this process, i.e. start-up of main units, restoration of important substations, and restoration of regional interconnection. Related research focuses on the unit start-up sequence optimization, the backbone-network determination, the restoration path optimization, and the transmission loop paralleling.

3.1 Unit start-up sequence optimization

The start-up of main units is one of the core missions. A reasonable unit start-up sequence can effectively speed up the restoration process and shorten the outage time. Related research can be divided into two categories: decision-making methods, and multi-optimization methods.

A decision-making technology is developed to determine the start-up sequence according to some established rules or policies in [39]. A series of restoration strategies are applied for optimizing the unit start-up sequence for the purpose of maximizing restored generation capacity according to the start-up power, starting time, ramping rate and critical maximum time interval of generating units.
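The quantities listed above (start-up power, starting time, ramping rate and critical maximum time interval) can be combined into a simple generation capability curve. The sketch below is an assumed piecewise-linear model, not the formulation of [39]: a unit draws cranking power while starting, then ramps linearly to its rating. All unit data and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    p_start: float     # cranking power drawn while starting, MW
    crank_time: float  # minutes from cranking to synchronization
    ramp_rate: float   # MW per minute after synchronization
    p_max: float       # rated output, MW
    t_crit_max: float  # latest allowed cranking time, minutes


def net_output(unit, t_start, t):
    """Net MW of one unit at time t, given that cranking begins at t_start."""
    if t < t_start:
        return 0.0
    if t < t_start + unit.crank_time:
        return -unit.p_start  # assumed: cranking power is drawn only while starting
    return min(unit.p_max, unit.ramp_rate * (t - t_start - unit.crank_time))


def restored_capacity(units, start_times, t):
    """Total net generation capacity available at time t for a start-up sequence."""
    return sum(net_output(units[name], start_times[name], t) for name in units)


def respects_time_limits(units, start_times):
    """True if every unit is cranked before its critical maximum time interval."""
    return all(start_times[name] <= units[name].t_crit_max for name in units)


# Hypothetical data for two NBS units and one candidate sequence.
units = {
    "G1": Unit(p_start=5, crank_time=30, ramp_rate=2, p_max=100, t_crit_max=60),
    "G2": Unit(p_start=8, crank_time=20, ramp_rate=4, p_max=200, t_crit_max=40),
}
sequence = {"G1": 0, "G2": 10}

print(respects_time_limits(units, sequence))
for minute in (25, 50, 200):
    print(minute, restored_capacity(units, sequence, minute))
```

Under these assumptions, comparing `restored_capacity` at a fixed horizon for different candidate sequences gives a direct, if simplified, measure of the objective of maximizing restored generation capacity.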

A large number of studies consider the unit start-up sequence as an optimization problem and employ nonlinear programming or artificial intelligence methods to solve it. Ref. [40] formulates the starting sequence of generators as a mixed integer linear programming problem, using the transformation technique on the nonlinear generation capability curves. However, it is not suitable for large-scale power systems due to the computational limitations of this method. In [41], unit start-up is treated as a two-layer restoration process, namely network-layer unit restarting and plant-layer unit restarting. The lexicographic optimization method is used to solve the resulting multi-objective optimization problem. Ref. [6] proposes a multi-objective optimization model for the unit start-up sequence, system-partitioning strategy and time requirements. NSGA-II is applied to find Pareto solutions and the Dijkstra algorithm is utilized to search for the restoration paths in the identified unit restoration sequence.

To reduce the optimization scale in the above-mentioned research, a two-phase method is usually employed, at the cost of lower solution quality. Ref. [42] establishes a multi-objective optimization model of unit restoration with primary and secondary relations in optimization goals, simultaneously taking the optimization of unit restoration sequence and the target network into account in the network reconfiguration phase.

3.2 Backbone-network determination

In most current studies, evaluation indices of the target network are set up in terms of topological structure or electrical parameters. A network reconfiguration objective function is established, and a suitable optimization algorithm is selected to solve the problem. In [43], the method of node contraction is introduced to assess the importance of nodes. The network reconfiguration efficiency is defined as the optimization objective and a discrete particle swarm is employed to obtain the optimal skeleton networks. In [44], the discrete particle swarm is used to determine the backbone network with the objective that important load restoration accounts for the maximum percentage of the total load restoration. However, the above studies do not consider the constraints of the unit start-up time. In [45], a fuzzy chance constrained model is proposed to cope with the uncertainty of operating time and lines put into operation. A combined method of fuzzy simulation, crossing particle swarm optimization (PSO) with the Dijkstra algorithm, is used to solve this model.
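The node contraction idea can be sketched as follows. A bus and its neighbours are merged into one node, and the resulting network is scored by its agglomeration 1/(n·l), where n is the number of remaining nodes and l the average shortest-path length; a higher score marks a more important bus. This follows the commonly cited definition of the method and is an assumption here, since the exact index used in [43] is not given in this review. The network is assumed connected and unweighted, and the bus names are hypothetical.

```python
from collections import deque


def contract(adj, node):
    """Merge `node` and its neighbours into a single node; return new adjacency."""
    merged = {node} | adj[node]
    new_adj = {}
    for u, neighbours in adj.items():
        cu = node if u in merged else u
        new_adj.setdefault(cu, set())
        for v in neighbours:
            cv = node if v in merged else v
            if cu != cv:
                new_adj[cu].add(cv)
    return new_adj


def agglomeration(adj):
    """1 / (n * average shortest-path length); defined as 1 for a single node."""
    n = len(adj)
    if n == 1:
        return 1.0
    total = 0
    for src in adj:  # breadth-first search from every node
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    average_length = total / (n * (n - 1))
    return 1.0 / (n * average_length)


def node_importance(adj, node):
    """Agglomeration of the network after contracting `node`."""
    return agglomeration(contract(adj, node))


# Hypothetical hub substation feeding three radial substations.
grid = {"hub": {"s1", "s2", "s3"}, "s1": {"hub"}, "s2": {"hub"}, "s3": {"hub"}}
for bus in sorted(grid):
    print(bus, node_importance(grid, bus))
```

In this example the hub scores highest because contracting it collapses the whole network into one node, which matches the intuition that well-connected, central substations should be restored first.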

Multi-attribute decision-making methods are also used to solve this problem with positive outcomes. Ref. [46] proposes a dynamic optimization decision-making method for power system skeleton restoration. It employs a three-stage skeleton restoration strategy and a shortest path searching method to generate candidate schemes for each restoration step. The main factors influencing the power system security or restoration speed are chosen as the attributes of candidate schemes and a multi-attribute decision-making method is used to make the optimal decision. Ref. [47] presents a group decision support system which combines multiple attribute decision-making and group decision-making to aggregate different attributes and multiple experts’ opinions to make decision-making more reasonable. In order to evaluate candidate restoration schemes, multiple types of attributes including crisp data, fuzzy numbers, interval numbers and linguistic terms are employed in [48]. A hybrid multiple attribute decision-making method based on VIKOR is proposed to provide compromise solutions.
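To make the VIKOR step concrete, the sketch below implements the standard crisp VIKOR ranking. It is a simplification of the hybrid method of [48], which also handles fuzzy, interval and linguistic attributes; the candidate schemes, attributes and weights are hypothetical. For each scheme, S is the weighted sum of normalized distances to the best value of each attribute, R is the largest single weighted distance, and Q blends the two with a trade-off parameter v.

```python
def vikor(scores, weights, benefit, v=0.5):
    """Return dict scheme -> Q value; a smaller Q means a better compromise.

    scores: dict scheme -> list of attribute values.
    weights: list of attribute weights.
    benefit: list of bools, True if larger is better for that attribute.
    """
    names = list(scores)
    count = len(weights)
    best, worst = [], []
    for j in range(count):
        column = [scores[name][j] for name in names]
        best.append(max(column) if benefit[j] else min(column))
        worst.append(min(column) if benefit[j] else max(column))

    group_utility, individual_regret = {}, {}
    for name in names:
        terms = [
            weights[j] * (best[j] - scores[name][j]) / (best[j] - worst[j])
            if best[j] != worst[j]
            else 0.0
            for j in range(count)
        ]
        group_utility[name] = sum(terms)
        individual_regret[name] = max(terms)

    def scale(x, lo, hi):
        return (x - lo) / (hi - lo) if hi != lo else 0.0

    s_lo, s_hi = min(group_utility.values()), max(group_utility.values())
    r_lo, r_hi = min(individual_regret.values()), max(individual_regret.values())
    return {
        name: v * scale(group_utility[name], s_lo, s_hi)
        + (1 - v) * scale(individual_regret[name], r_lo, r_hi)
        for name in names
    }


# Hypothetical schemes: (restored capacity in MW, time in min, switching operations).
schemes = {"A": [100, 30, 5], "B": [80, 20, 3], "C": [60, 40, 8]}
q_values = vikor(schemes, weights=[0.5, 0.3, 0.2], benefit=[True, False, False])
print(sorted(q_values, key=q_values.get))  # schemes from best to worst compromise
```

Full VIKOR also checks the "acceptable advantage" and "acceptable stability" conditions before declaring a single compromise solution; those checks are omitted here for brevity.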

3.3 Restoration path optimization

The establishment of the backbone-network has to be carried out step by step and the restoration path optimization in each step needs to minimize the cost of the target network formation while considering constraints. Each study has a unique feature in this area. In [49], the data envelopment analysis method incorporating preference information is employed to implement the modeling and evaluation of the weight coefficient with respect to each transmission line. Considering the failure risk of restoring transmission lines, two robust optimization models are presented in [50] to minimize total line restoration risk, which are applied to the series and parallel restoration stages, respectively. Ref. [51] proposes a network reconfiguration method based on the weighted complex network model. It can simultaneously obtain the globally optimal sequence of restoration paths and the optimal skeleton-network. Ref. [52] proposes a hierarchical collaborative optimization method according to the characteristics of hierarchical scheduling in power systems. The objectives are defined as network reconfiguration degree and total power production. It combines hierarchical optimization with overall searching of the feed point index value, which considerably reduces the scale of the problem. In [53], two types of restoration performance indices are used to rank all possible restoration paths according to their expected performance characteristics. The power transfer distribution factors and weighting factors are used to determine the order of restoration paths, which can enable the load to be picked up by lightly loaded lines or to relieve stress on heavily loaded lines.

3.4 Transmission loop paralleling

When two or more sectionalized subsystems are restored to a certain extent, a parallel operation is needed to interconnect them. The standing phase angle (SPA) difference is critical to improve the success rate of the parallel operation. Ref. [54] models the SPA reduction problem as a mixed discrete-continuous optimization with various constraints. The objective function is to minimize the weighted sum of the active power generation adjustments and load shedding, which is solved by a genetic algorithm. Ref. [55] proposes an optimal control strategy for SPA reduction. Load pickup is combined with active power generation increment according to a control strategy developed with an alternative two-stage decoupled algorithm and mixed integer nonlinear programming. It is an additional advantage that more load restoration in the regulating process can meet the parallel conditions.

4 Load restoration

The fundamental purpose of restoration is to resume the power supply to the users or loads. Both in black-start and network reconfiguration phases, it is necessary to restore certain loads for balancing the unit output, maintaining the power balance and voltages within an acceptable range, and these are called ballast loads. Large-scale load restoration can be carried out when the backbone network is restored and main generators are connected to the network. It is very important to restore all the loads as fast as possible.

4.1 Load restoration time

The restoration time is the most important variable in the load restoration problem, because penalty regimes for network operators are usually based on the length and capacity of load interruption. The methodology in [56] graphically visualizes power system restoration plans to help evaluate the restoration duration and the scheduled critical path, as part of a method of evaluation and review. Ref. [57] proposes a new method to estimate the recovery time based on machine learning methods applied to previous restoration events.
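The notion of a scheduled critical path can be illustrated with a deterministic critical-path calculation over a restoration plan. This is a minimal sketch of the general technique and is not taken from [56]; the task names, durations and precedence relations are hypothetical, and the plan is assumed to contain no cyclic dependencies.

```python
def critical_path(durations, predecessors):
    """Return (total duration, ordered list of tasks on the critical path).

    durations: dict task -> duration in minutes.
    predecessors: dict task -> list of tasks that must finish first.
    """
    finish, via = {}, {}

    def earliest_finish(task):
        if task not in finish:
            start, previous = 0.0, None
            for pred in predecessors.get(task, []):
                pred_finish = earliest_finish(pred)
                if pred_finish > start:
                    start, previous = pred_finish, pred
            finish[task] = start + durations[task]
            via[task] = previous  # predecessor that determines the start time
        return finish[task]

    task = max(durations, key=earliest_finish)  # task that finishes last
    total = finish[task]
    path = []
    while task is not None:
        path.append(task)
        task = via[task]
    return total, path[::-1]


# Hypothetical restoration plan (durations in minutes).
plan_durations = {
    "start_BS_unit": 15,
    "energize_line_1": 10,
    "energize_line_2": 10,
    "crank_NBS_unit": 30,
    "pick_up_load": 20,
}
plan_predecessors = {
    "energize_line_1": ["start_BS_unit"],
    "energize_line_2": ["start_BS_unit"],
    "crank_NBS_unit": ["energize_line_1"],
    "pick_up_load": ["crank_NBS_unit", "energize_line_2"],
}

total, path = critical_path(plan_durations, plan_predecessors)
print(total, path)
```

Tasks off the critical path, such as `energize_line_2` in this example, have slack and can be delayed without extending the overall restoration time, which is the kind of insight a graphical plan is meant to expose.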

4.2 Load restoration amount

In general, load restoration is composed of multi-step load pickups. Single-step load restoration in a substation usually aims to activate one or more circuits at the same time. If a single pick-up is too heavy, it will lead to lower voltage and even frequency excursion. On the other hand, if the load pick-up is too light, it will increase the number of operations and delay the restoration process. To evaluate the load restoration amount, the cold load pick-up (CLPU) characteristics should be considered [58]. Ref. [59] uses linearization techniques to construct a generic model for a fast assessment of the dynamic characteristics of system frequency during the period of cold load pickup. The PSO algorithm is applied to determine the optimal load restoration amounts and locations in the case of cold load pickup [60]. Ref. [61] deals with discrete load restoration based on the real-time system status measured by the WAMS.

It is necessary not only to keep steady-state constraints, but also to maintain transient constraints during the load restoration process. Ref. [62] proposes a comprehensive model for calculating the maximum restorable load amount within the constraints. It considers transient frequency constraints, transient voltage-dip constraints, steady-state voltage constraints, and CLPU characteristics using a modified bisection algorithm.
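The search for a maximum restorable load under several constraints can be sketched with a plain bisection. This is not the modified bisection algorithm of [62]: it assumes a hypothetical `is_secure` check that is monotone in the load amount (secure below some threshold, insecure above), and the linear sensitivities and limits in the example are invented for illustration.

```python
def max_restorable_load(is_secure, upper, tol=0.1):
    """Largest load in [0, upper] MW for which is_secure(load) holds, within tol.

    is_secure: callable returning True if all constraints are met for that pickup.
    Assumes is_secure is monotone: True below some threshold, False above it.
    """
    if is_secure(upper):
        return upper
    if not is_secure(0.0):
        return 0.0  # no pickup is acceptable in the current system state
    low, high = 0.0, upper
    while high - low > tol:
        mid = (low + high) / 2
        if is_secure(mid):
            low = mid   # mid is feasible, so the answer is at least mid
        else:
            high = mid  # mid violates a constraint, so the answer is below mid
    return low


# Hypothetical linear sensitivities standing in for dynamic simulation results.
def example_check(load_mw):
    frequency_dip_hz = 0.01 * load_mw    # assumed transient frequency dip
    voltage_dip_pu = 0.0005 * load_mw    # assumed transient voltage dip
    return frequency_dip_hz <= 0.5 and voltage_dip_pu <= 0.05


print(max_restorable_load(example_check, upper=200.0))
```

In practice each call to the security check would involve a time-domain simulation or a frequency response model with CLPU characteristics, so the logarithmic number of evaluations required by bisection is the main attraction of this kind of search.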

Ref. [63] analyzes the main considerations of load restoration in the unit start-up stage, and of restoration strategies in coordination with unit start-up. Regarding load restoration in the last stage of network reconfiguration, Ref. [64] presents a combinatorial optimization model to address the sequencing problem of load pickup.

5 Emerging technologies

With the integration of variable renewable energy [65] and development of the smart grid, emerging technologies are causing widespread impact. Although there are many new challenges [66, 67], some new opportunities exist to restore power systems quickly.

5.1 Microgrid

The emergence of microgrids embedded in power systems enhances self-healing capability and allows distribution systems to recover faster in the event of an outage. A microgrid can operate in an islanded mode in isolation from the connected system during an outage. In [68], a graph-theoretic restoration strategy is presented incorporating microgrids that maximizes the restored load and minimizes the number of switching operations. Spanning tree search algorithms are applied to find the candidate restoration strategies by modeling microgrids as virtual feeders. A simultaneous bidirectional approach in [69] deals with black-start restoration sequences. The control strategies to be adopted for microgrid black-start and subsequent islanded operation, as well as the appropriate rules and conditions, are derived and evaluated by numerical simulations. Ref. [70] deals with the multi-stage restoration scheduling problem in an islanded microgrid system with multiple distributed generators and a radial configuration. Two stochastic methods are used to solve it. Under appropriate conditions, distributed generation (DG) can maintain power supply for loads in a microgrid after a blackout, and can give some power support to adjacent microgrids, or even the main grid if properly synchronized [71].

Recent developments in wind farm control have enhanced the integration of wind power into power systems. However, the risk of power system blackouts is also increased due to the volatility and uncertain nature of wind power generation. There has not been much research on the potential restoration function of wind farms in power systems. Since the starting time of wind turbines is shorter than that of non-black-start (NBS) generating units, some wind farms have the potential to function as black-start power sources in certain circumstances. The firefly optimization algorithm is used in [72] to find the optimal final sequence of NBS unit restoration, the optimal transmission path, and the optimal load pick-up sequence with and without integration of wind farms in the system. Ref. [73] introduces a hybrid black-start unit containing a combustion gas turbine, a wind farm and a static synchronous compensator, together with a three-level control model to achieve comprehensive functionality. A control scheme can allow variable speed wind turbines to participate effectively in system frequency and voltage regulation, as Ref. [73] describes. When wind farms with adequate energy storage are guided by a suitable control strategy, they can be a great help to power system restoration.

5.2 VSC based HVDC

High voltage direct current (HVDC) transmission lines using voltage-source converters (VSC) have superior performance and characteristics compared to traditional line commutated converter (LCC) technology, and can become a reliable external power supply for supporting black-start [74]. The study in [75] presents the possibility that HVDC links can use a synchronous compensator at the receiving end. Such systems have high regulating capacity and reliability, which are valuable characteristics during the critical phases of power system restoration. A detailed stability analysis is developed for checking the correct size of the compensator. Electromagnetic simulations prove the validity of the solution. VSC-based HVDC is therefore a potential source of power support from neighboring systems during power system restoration.

5.3 WAMS and cyber-attack

WAMS provides more timely and accurate data to power system operators. In [33], a novel sectionalizing method is proposed for the build-up strategy including two processes, restoring separated parts (islands) in the power system and then interconnecting them afterwards. Each island ensures observability and provides the phase angle difference between adjacent stations using the WAMS. In [61], a restorable load estimation method is proposed employing WAMS data after the network frame has been reenergized, and the WAMS is also employed to monitor the system parameters in case the newly recovered system becomes unstable again. Ref. [76] presents a method for optimal restoration planning which uses observability analysis and power transfer distribution factors (PTDF) to decrease the overvoltages caused by energizing transmission lines with light load. The use of the WAMS at the early stages of restoration provides precise determination of generator loading steps. A significant benefit of the WAMS is that it unifies phase angle references at different islands and presents a tie line energizing priority list at the last stages of restoration [77]. Ref. [78] explores a methodology to control and monitor frequency during the early steps of power system restoration, and uses generator modeling to calculate the single machine equivalent of the power system based on phasor measurement unit (PMU) measurements. There is a potentially highly valuable opportunity to take full advantage of the WAMS’s timely situation awareness for rapid and effective power system restoration.

Ukraine suffered a serious cyber-attack on December 23, 2015, leaving hundreds of thousands of homes in the dark in what security experts say was a first for hackers with ill intent. A security firm claimed that it obtained malicious code used to execute a temporary takedown of three power substations on the Ukrainian national grid [79]. The outage left about half of the homes in the Ivano-Frankivsk region of Ukraine without electricity. This is the first known instance of a blackout being credibly linked to the actions of malicious hackers. Cyber-attacks differ from conventional faults and physical attacks. They exploit vulnerabilities of the information domain to disturb the normal work of the monitoring system or communication network. A cyber-attack can trigger a blackout due to the dependence of the power system on information infrastructure. Power system restoration also relies on the information system, especially SCADA, to acquire the system status and thus to develop the restoration plan. Therefore, a cyber-attack can delay or mislead the system restoration process. Cyber-attacks have a long incubation period and low cost, and present a significant risk to power systems and the power industry. The Israel Electric Authority, not the power grid, has also been subjected to a cyber-attack, though this was misinterpreted by the media as a power system attack [80]. It is crucial to be prepared to defend power systems against such attacks.

A statistical summary of the reviewed publications, each tagged with a simple key word for method or purpose, is listed in Table 1. The most popular technologies for developing restoration plans are various forms of artificial intelligence. Cold load pickup is a serious concern in load restoration that has not yet received enough research.

Table 1 Statistical result of reviewed publications

6 Conclusions

Although a great amount of research has strengthened power systems to be resilient against outages, the risk of large-scale blackouts has not disappeared. This paper discusses the advances and hot topics in power system restoration in the past decade. A large amount of work has been done and great progress has been made towards fast and effective power system restoration. Modern power systems have become complex cyber-physical systems due to significant integration of variable renewable energy, rapid development of the smart grid and wide application of emerging techniques. Many new factors have the ability to cause a widespread blackout. Unlike preventive, corrective and emergency control, state-of-the-art technologies for restoration have barely achieved online application. There is still a very long way to go to realize automatic self-healing in bulk power systems.

Step-by-step dynamic decision-making based on situation awareness from SCADA and WAMS will be realized in the near future with effective application of fast developing artificial intelligence technologies. Furthermore, a comprehensive understanding of the socio-technical role in electricity outages within broader disaster contexts can only be improved through interdisciplinary research [81]. In addition, several to a dozen Ultra-HVDC projects will transmit about 100 GW of power to the east of China. Due to the multi-infeed HVDC interaction [82], the control and management of the most complex ultra-high voltage power system in the world will meet more new challenges. Therefore, it is extremely urgent to do much more research to achieve fast and reliable restoration when the system is facing increasing blackout risks.