High-performance predictor for critical unstable generators based on scalable parallelized neural networks

LIU, Youbo; LIU, Yang; LIU, Junyong; LI, Maozhen; MA, Zhibo; TAYLOR, Gareth

doi:10.1007/s40565-016-0209-4

High-performance predictor for critical unstable generators based on scalable parallelized neural networks

Special Issue on Wide Area Monitoring, Protection and Control in Smart Grid
Open access
Published: 14 July 2016

Volume 4, pages 414–426, (2016)
Cite this article

Download PDF

You have full access to this open access article

Journal of Modern Power Systems and Clean Energy

High-performance predictor for critical unstable generators based on scalable parallelized neural networks

Download PDF

Youbo LIU¹,
Yang LIU¹,
Junyong LIU¹,
Maozhen LI²,
Zhibo MA³ &
…
Gareth TAYLOR²

2401 Accesses
13 Citations
1 Altmetric
Explore all metrics

Abstract

A high-performance predictor for critical unstable generators (CUGs) of power systems is presented in this paper. The predictor is driven by the MapReduce based parallelized neural networks. Specifically, a group of back propagation neural networks (BPNNs), fed by massive response trajectories data, are efficiently organized and concurrently trained in Hadoop to identify dynamic behavior of individual generator. Rather than simply classifying global stability of power systems, the presented approach is able to distinguish unstable generators accurately with a few cycles of synchronized trajectories after fault clearing, enabling more in-depth emergency awareness based on wide-area implementation. In addition, the technique is of rich scalability due to Hadoop framework, which can be deployed in the control centers as a high-performance computing infrastructure for real-time instability alert. Numerical examples are studied using NPCC 48-machines test system and a realistic power system of China.

Ensembles of Artificial Neural Networks for Smart Grids Stability Prediction

Applications of Artificial Neural Networks in the Context of Power Systems

Automatic Generator Re-dispatch for a Dynamic Power System by Using an Artificial Neural Network Topology

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Transient stability has been widely regarded as one of the most concerned issues of modern power system. In the last two decades, a number of large blackouts occurred all over the world due to the loss of synchronization caused by cascading failures [1]. Insufficient online implementations and lack of timely emergency controls, such as load shedding, generator tripping and proactive islanding, are said to be the common causes of those accidents [2]. The increasing renewable energy integration brings dynamic security deterioration of power systems, which would lead to the operation risks [3]. However, the deployment of phasor measurement units (PMUs) provides a promising way to improve awareness ability of control centers for the disturbed operation scenarios. PMUs, the infrastructure of wide-area monitoring system (WAMS) of power system, is able to measure synchronized phasor data with much higher sampling frequency compared with supervisory control and data acquisition (SCADA) [4]. The measurement accuracy is also reported to be sufficiently satisfactory. Since PMUs can grasp the instant response of power systems when faults occur, how to utilize the massive disturbed trajectories has been significantly investigated in the last decade.

As WAMS are now being deployed in quite a few power systems, PMU is playing an ever increasingly vital role in transient stability awareness [5]. A number of researches have been carried out to evaluate the transient stability by using PMU data. PMU trajectories based indicators are considered as the efficient estimators to understand dynamic features of power systems, especially during severe disturbances. For example, Alvarez et al proposed seven trajectory based indices which are suitable for fuzzy inference on real-time dynamic vulnerability [6]. A phasor data–based energy function indicator was developed in [7] aiming at monitoring the dynamic status of power transfer paths. A real-time transient stability assessment (TSA) method based on centre-of-inertia estimation from PMU records was reported in [8]. From voltage stability aspect, a coupled single-port model was applied to establish WAMS based assessment indicator [9]. Furthermore, Makarov et al. [10] presented a review on PMU-based TSA, offering a clear roadmap for further development.

Machine learning techniques have been widely applied for TSA. Most of the existing works are focused on the binary state prediction for global stability using clustering and classification. For example, support vector machine, decision tree and artificial neural network (ANN) are widely used to detect instability of power systems by using post-fault dynamic data during a few cycles [11–13]. Guo and Milanović presented a probabilistic framework to evaluate the accuracy of data mining tools applied for online prediction of transient stability [14], enabling the comprehensive analysis of performance of different implementations.

However, few machine learning techniques have considered the impact of the critical unstable generators (CUGs) of power systems. The majority of the researches have focused on the identification of the global system status due to the fact that a power system normally has hundreds of generators which generate massive volumes of data [15]. As a result, it has become a challenge for standalone machine learning techniques running on single computers to deal with stability assessment taking into account CUGs identification. For example, Passaro et al. [16] employed adaptive neural network to evaluate stability for each generator, admitting that standalone neural networks can hardly solve the problem in a reasonable time. For this purpose, applying advanced computing techniques to enable high-performance training and prediction associated with PMU measured data has become a necessity.

It is well known that neural network is highly adapted to classification tasks [17]. A number of researchers employed neural network to achieve high accuracy classifications in both academia and industrial fields. References [18, 19] figured out that BPNN encounters low efficiency issue due to large number of sum and sigmoid calculations. Some researchers focused on speeding up BPNN using cloud computing techniques. For example, Yuan et al. [20] implemented parallel BPNN using cloud computing technique. Ikram et al. [21] also employed cloud computing to parallelize BPNN in training phase. And also some researchers focused on solving the issue using MPI [22]. However, their ideas are all based on data separation, which does not consider the accuracy loss caused by the simple data separation. Therefore, to improve the efficiency of BPNN whilst maintains classification accuracy in predicting CUGs, this paper presents a MapReduce based parallel back propagation neural network(BPNN) algorithm. The algorithm firstly employs ensemble techniques [23] to complement the data information loss in data separation. And then the mappers in Hadoop clusters start training a number of sub-BPNNs. Finally, these sub-BPNNs can be employed to classify instances by fed with a few cycles of post-fault data and output final prediction results based on majority voting.

2 ANN-based CUGs prediction

2.1 Definition

CUGs are defined as the earliest group of generators rotor angles of which have a leading or lagging tendency compared with the rest units after fault clearing. The term of tendency refers to the given threshold of power angle difference between any pair of generators. Technically, CUGs are the most severely disturbed units that may lead to the ultimate loss of stability [24]. On the other hand, they are the potential control candidates for emergency tripping or correction action which is able to quickly diminish instability risk of power systems. The clustering-based method of identifying CUGs is detailed in the following section. Fig. 1 illustrates a few examples of CUGs in terms of rotor angle trajectories.

The unstable generators is belonged to CUGs, because their leading (or lagging) rotor angle against other units must exceed the given threshold which is usually set to be equal or little smaller than the wide-accepted instability criterion. For example, Fig. 1a and Fig. 1b illustrate rotor angle trajectories of CUGs which also contain all the unstable generators. It is a similar situation in Fig. 1d. In this situation, all the generators are determined as unstable ones at the end of observation time window, 150 cycles. But before that, none of generators reaches the CUG threshold criterion. Therefore, the strict two-cluster instability pattern corresponds to the situation that all the generators are CUGs, such as the case of Fig. 1d. However, unlike Fig. 1a, Fig. 1b and Fig. 1d, Fig. 1c offers the different pattern in which CUGs only are part of unstable units. Although belongs to the leading cluster, ahead of other leading generators, the two generators indicated in Fig. 1c meet CUGs identification criterion at the very beginning of time windows. These two units are considered to be the most effective objects for the further control strategy.

The primary aim of this study is to enable fast CUGs prediction by means of large-scale parallelized BPNNs learning method, providing more in-depth information for situational awareness of power systems transient stability.

2.2 Clusterwise CUGs identification

It is not difficult to distinguish unstable generators from the rest stable ones through the trajectory plot of rotor angle in a few seconds, as shown in Fig. 1a, and Fig. 1b. However, due to the lack of commonly used confirmation criteria for CUG, a method which is able to identify CUGs based on k-means clustering algorithm is presented in the paper. For each fault scenario, CUGs can be confirmed by means of following procedure.

Step 1: Collect rotor angle trajectory of each generator in a few seconds.

Step 2: Calculate rotor angle difference of any two generators i and j cycle by cycle from the very beginning of post-fault point according to:

$$ \Delta \delta_{ij} (t) = \delta_{i} (t) - \delta_{j} (t)\quad \forall i,j $$

(1)

where ∆δ _ij(t) is the angle difference between i and j at cycle point t after fault. If ∆δ _ij(t) exceeds the given threshold, e.g. 170°, where the power system is critically unstable, then record the time point t. Otherwise, the procedure is terminated for the next round.

Step 3: Extract a specific power angle trajectory δ _i(t+∆t) of every individual generator for further analysis. Here ∆t refers to the CUG validation interval. If ∆t is selected to be a relatively large value, like 3 s, it is hardly possible to distinguish CUGs from subsequently potential unstable generators. Empirically, ∆t is preferably set to be 50 cycles, i.e. 1 s.

Step 4: Classify all the δ _i(t+∆t) trajectories into two groups by means of widely-used k-means clustering algorithm, followed by calculating center of inertia (COI) against classified rotors of each individual group A and B respectively using:

$$ \delta_{\text{COI}}^{k} = \frac{1}{{M_{T}^{k} }}\sum\limits_{i = 1}^{{N_{k} }} {M_{i}^{k} \delta_{i}^{k} } ,\quad M_{T}^{k} = \sum\limits_{i = 1}^{{N_{k} }} {M_{i}^{k} } , \, k \in \{ A,B\} $$

(2)

where δ ^k_i and M ^k_i are rotor angle and inertia constant of generator i which belongs to group k, N _k refers to the number of generators in group k. It is worth noting that since the number of clusters is confirmed according to first swing situation, the replicate k-means, detailed in Appendix A, is employed in order to overcome the drawback of randomly selecting initial centroids.

2.3 Features selection

According to the previous works, a variety of dynamic parameters can be selected to compose features vector for the training procedure of particular machine learning algorithms. There exist two basic types of feature selection, i.e. time-series synchronized data such as a few cycles of voltage trajectory [25] and dynamic performance indices such as kinetic energy indicator of rotors [11].

In order to avoid information loss, the features in this study are confirmed to be the straightforward trajectory data after fault clearing including voltage amplitude, rotor angle and rotor speed of each individual generator. In addition, the maximal kinetic energy, a widely-used indicator highly related to disturbance severity of single generator, is also considered as one of the features. The features vector of generator n for one sample with time interval ∆T is denoted as:

$$ F_{n} (\Delta T) = \{ \varvec{V}(T), \,\varvec{\delta}_{\text{COI}} (T), \,\varvec{\omega}_{\text{COI}} (T), \, KE_{\text{COI}}^{n} \} $$

(3)

where V(T), δ _COI(T) and ω _COI(T) represent time-series data of voltage amplitude, rotor angle and speed during the time window T, respectively. Let t _c denote the exact time when fault clearing accomplishes, T is acquired from t _c to t _c+∆T. The symbol $ KE_{\text{COI}}^{n} $ refers to the kinetic energy at the instant of one cycle after fault clearing. It is noted that except voltage trajectory array, the time-dependent states such as δ and ω would be mapped to COI coordinate so as to consider mechanical effects of generators. The COI coordinate transformation and kinetic energy calculation are detailed in [24]. The COI coordinate is a widely accepted method of considering global pattern of system stability including the impacts of all the generators. The COI coordinate is applied in training and prediction stage, ensuring generalization ability of ANNs for a specific generator with the inclusion of global impacts of stability pattern. In addition, it is noted that the shape of transient voltage recovery which reflects disturbance severity of a generator closely relates to the stability evaluation of this generator. The disturbed voltage trajectory is also reported as a well-performing attribute for machine learning based transient stability prediction of power systems [25].

Features dimension significantly affects ANN training performance as well as generalization ability. Large size dimension is prone to over-fitting while the short one probably leads to inadvertent information loss. However, it is difficult to confirm the preferable dimension of feature candidates before validating fitting performance. In this regard, the parameters vectors with different dimension are tested by training a set of parallel ANNs in order to determine the optimal dimension for the input features. According to the vector defined by (3), the length of input features, d(∆T), is of linear dependence in terms of ∆T, shown as:

$$ d(\Delta T) = 3 \times \text{num} (\Delta T) + 1 $$

(4)

where the symbol num(∆T) refers to the number of cycles existing during ∆T interval. The triple means adding the length of vector containing V(T), δ _COI(T) and ω _COI(T) shown in (3). The kinetic energy indicator $ KE_{\text{COI}}^{n} $ is selected as the last feature, adding one more factor into the final vector of attributes. Taking voltage trajectory of generator bus as an example, the principle of determining preferable features dimension is detailed as follows.

As shown in Fig. 2, the time-series voltage amplitude after fault clearing during ∆T is taken as a sub-array of features, i.e. V(T). If ∆T equals 0.4 s, for example, the number of cycles during ∆T is 20 according to PMU data acquisition frequency. In this case, the total dimension given by (4) is 61. Aiming to determine the most effective features dimension for training regarding the trade-off between accuracy and computing cost, a number of ANNs fed by diverse dimension features associated with different ∆T are trained simultaneously. Specifically, ∆T is taken from 0.08 s to 0.56 s with step of 0.04 s, respectively corresponding to the intervals of 4 to 28 cycles using 2 cycles step-size after fault clearing. That means, for a single generator n, a set of ANNs denoted as:

$$ ANN_{1}^{n} \quad ANN_{2}^{n} \quad ANN_{3}^{n} \, \ldots \, ANN_{13}^{n} $$

(5)

are involved in the same training scheme which is challenged by the extremely computing-intensive machine learning tasks. However, training ANNs tentatively with all the potential input dimension is a straightforward way to ascertain the most generalized structure for the ANN trained for each generator.

At the output side of ANN, on the other hand, learning target against each data sample is given to be a binary variable reflecting status of an individual generator. If a particular generator is identified to be critically unstable by clusterwise method, the target status is tagged as 1, otherwise, it equals to 0.

2.4 Data samples production

A time-domain simulation based program is designed to automatically produce massive data samples associated with determined features. Specifically, the data generation procedure is applied by the following steps.

Step 1: Scale the initial load level of base case through multiplying active power P _Li0 on each bus by a random stress factor α, while update reactive power Q _Li using the constant power factor shown as:

$$ P_{Li} = P_{Li0} (1 + \alpha )\quad i \in \{ {\text{PQ}}\} $$

(6)

$$ Q_{Li} = P_{Li} \cdot \tan \left\{ {\cos^{ - 1} \left( {{{P_{Li0} } \mathord{\left/ {\vphantom {{P_{Li0} } {\sqrt {P_{Li0}^{2} + Q_{Li0}^{2} } }}} \right. \kern-0pt} {\sqrt {P_{Li0}^{2} + Q_{Li0}^{2} } }}} \right)} \right\} $$

(7)

where α ranges from −0.25 to 0.6. All generators, except the reference one, offset the load variation proportional to their base generations. The updated output of generator n can be calculated using:

$$ P_{Gn} = P_{Gn0} \left( {\frac{{1 + \sum\limits_{{i \in \{ {\text{PQ}}\} }} {\alpha P_{Li0} } }}{{\sum\limits_{{n \ne {\text{ref}}}} {P_{Gn0} } }}} \right)\;\;\; \, n \in \{ {\text{PV}}\} $$

(8)

Step 2: Execute time-domain simulation for a severe fault randomly selected from the pre-defined contingency list. The fault clearing time is applied as a random number between 0.15 s and 0.35 s. The longer clearing time implies the higher possibility of instability of power system.

Step 3: Organize features sample for each generator by using the simulated trajectories according to (3). It is noted that for any generator, a set of data samples with different feature dimensions due to the different ∆T selection (see Fig. 1) are produced at the same time for further parallel training, providing a way to determine the ANN structure with highest generalization ability for each individual generator.

Step 4: Identify CUGs based on the simulation results according to the proposed clusterwise method, enabling the learning target for each data sample.

Step 5: Save all the samples for the current round and repeat the complete procedure from the first step until the given total number of iterations is achieved.

In order to enhance generalization level of data samples, multiple faults are simulated in data producing procedure to consider N − k (k ≤ 3) scenarios.

3 Methodology of MapReduce based parallel BPNN

3.1 Issue of data volume

As discussed previously, BPNN has been widely used in quite a lot fields due to its stable performance and remarkable classification accuracy [26]. Therefore, we also employ such type of ANN to create online predictors to classify CUGs rapidly fed by PMUs measured data.

However, in this study, the time-series statuses of all generators produce an extremely large volume of data for ANN learning, which is completely different from the traditional applications. Traditionally, BPNN is frequently applied to train hundreds or a few thousands of instances in order to identify global instability. Thus, the algorithm generates less overhead. Currently because of the sharp increase of data volume, BPNN is forced to deal with massive scale data.

The volume of data is a specific term generally referring to the computable size and scale of a massive amount of data. Fig. 3 illustrates the quantification on the data volume of input instances fed to BPNN training for predicting CUGs.

In Fig. 3, N represents the total length of selected features, M is the number of available samples, and K refers to the sum of generators. Therefore, the cubic data volume can be estimated by

$$ {\text{Data Volume}} = Byte(M \times N \times K) $$

(9)

The data volume of fed instances highly affects training efficiency. For processing one instance, overhead occurs in both training and classification phases due to a large number of sum and sigmoid calculations existing in the network. Additionally, in training phase, BPNN has to execute back propagation to tune all the parameters, which generates overhead. At last each instance is not only trained once but a number of times. The training loops also generate overhead. Therefore, the standalone BPNN will meet a critical bottleneck for processing large volume of data in terms of efficiency. Our data intensive task will deteriorate the performance of CUGs prediction. This motivates us to parallelize BPNN using MapReduce [27].

3.2 Parallelization of BPNN

3.2.1 Standalone BPNN

BPNN is a widely used machine learning technique for classification due to its remarkable function approximation ability. It normally employs only feed forward to output final classification result for each input instance according to the trained weights and biases in training phase. In feed forward, the definitions of related variables are listed in Table 1.

Table 1 Definition of variables used in feed forward stage

Full size table

The number of inputs in input layer is decided by n, and the number of outputs in output layer is decided by the length of the encoded classifications. Therefore, I _j can be represented by:

$$ I_{j} = \mathop \sum \limits_{i} w_{ij} o_{lj} + \theta_{j} $$

(10)

In typical BPNN, the non-linear equation is frequently using sigmoid, therefore the output of the j ^th neuron from the current layer to next layer can be represented by:

$$ o_{cj} = \frac{1}{{1 + e^{{ - I_{j} }} }} $$

(11)

Output layer finally outputs its o _cj which indicates the classification result, and then feed forward completes. Following, back propagation starts. In back propagation, the related variables are defined in Table 2.

Table 2 Definition of variables used in back propagation

Full size table

Therefore, Err _j in output layer is expressed by:

$$ Err_{j} = o_{j} (1 - o_{j} )(t_{j} - o_{j} ) $$

(12)

while Err _j in hidden layers can be represented by:

$$ Err_{j} = o_{j} (1 - o_{j} )\mathop \sum \limits_{k} Err_{k} w_{kj} $$

(13)

As a result, the weight w _ij and bias θ _j can be tuned using:

$$ w_{ij} = w_{ij} + Err_{j} o_{j} $$

(14)

$$ \theta_{j} = \theta_{j} + Err_{j} $$

(15)

BPNN terminates its training procedure based on two conditions. The first one is that if the loop reaches a certain number, the algorithm terminates. The second condition is that if the error reaches a given threshold according to (16) for the single output and (17) for the multi-outputs:

$$ { \hbox{min} }(E[e^{2} ]) = { \hbox{min} }(E[(t - o)^{2} ]) $$

(16)

$$ \hbox{min} \left( {E\left[ {e^{\text{T}} e} \right]} \right) = { \hbox{min} }\left( {E\left[ {\left( {t - o} \right)^{\text{T}} \left( {t - o} \right)} \right]} \right) $$

(17)

3.2.2 MapReduce and Hadoop framework

MapReduce is a remarkable distributed computing model, offering two main operations named as Map and Reduce. Map function is responsible for data processing and computation. Reduce function, however operates the collecting and outputting operations. Specifically, the inputs and outputs for Map and Reduce are controlled by key-value pairs. Map processes each input key-value pair {K1, V1} and outputs intermediate output {K2, V2}. Reduce collects the output pairs with the same keys and executes merging, shuffling operations. And then Reduce outputs the final results {V2}.

Among a number of MapReduce implementations [28, 29], Hadoop framework [30] is the most famous one. Specifically, the nodes in a Hadoop cluster contribute their resources including processors, memory, hard disks and network adaptors to form hadoop distributed file system (HDFS) which is not only aiming at storing data but also the basic infrastructure of Hadoop. The nodes are categorized into one NameNode managing the metadata of cluster and several DataNodes executing computations. The implementations of Map and Reduce functions in Hadoop are named as mapper and reducer which are located in DataNodes. Based on the design of HDFS, Hadoop supplies remarkable scalability, fault tolerance, load balancing, heterogeneous environment support, and efficiency in dealing with large volume of data.

3.2.3 Ensemble technique

The presented parallelization of BPNN is based on data separation. The main idea is to divide the training data set into a number of sub-sets. Each sub-set is input into a sub-BPNN maintained by one mapper in the Hadoop cluster. As a result, each sub-BPNN is only trained by a part of original data set so as to improve the training efficiency. However, the merely simple data separation encounters a problem that partial training data results in insufficiently trained NN, which may lose accuracy in classification. Therefore, this paper employs ensemble technique including bootstrapping and majority voting. The solution of insufficient training for sub-BPNN based on bootstrapping and majority voting [23] is further detailed in Appendix B.

3.2.4 Algorithm design

As long as each bootstrapped sub-set is generated by using bootstrapping, each instance in one sub-set is defined in the format of <instance _k, target _k, type>, where

instance _k represents the bootstrapped instance, which is the input of neural network; target _k represents the desirable output if instance _k is a training instance; type field has two values: “train” and “test”, which labels the type of instance _k. Therefore the sub-BPNN in a mapper is aware of the instance type moreover executing training or classification operations.

Afterwards each individual mapper in the Hadoop cluster constructs one BPNN and initializes weights and biases with random values between -1 and 1 for its neurons. And then the mapper inputs one instance in the form of <instance _k , target _k , type> from one input sub-set of the mapper.

The mapper parses the data and retrieves the type of the instance. If the type value is “train”, the instance is fed into the input layer. Secondly, each neuron in different layers computes its output until the output layer generates an output which indicates the completion of feed forward process. And then the mapper starts a back propagation process and updates weights and biases for neurons. If one input instance is labeled as “test”, all the mappers start to classify the instances. In this case, each mapper generates an intermediate output in the form of < instance _k , o _jm > where instance _k is the key and o _jm represents the classification result of the m ^th mapper.

Finally, a reducer collects the intermediate outputs of all the mappers. As all the outputs have the same key instance _k, the reducer merges these outputs into one set, in which the reducer executes majority voting and outputs the finally voted result of instance _k into HDFS in the form of <instance _k , r _k > where r _k represents the voted classification result of instance _k. Fig. 4 and Fig. 5 illustrate the algorithm architecture for training and classification procedure of MapReduce based BPNN respectively.

Based on the majority voting, a number of sub-BPNNs (weak classifiers) can form a strong classifier. Therefore, although each sub-BPNN is only trained by a portion of the original training data which may lead to the wrongly classification, the final voted classification result of a number of sub-BPNNs has a higher chance to be correct with higher efficiency for dealing with a large volume of data.

3.3 Implementation framework

The presented technique is able to be implemented to WAMS application platform, enabling online prediction of transient stability for each generator after fault clearing. Compared with the conventional methods only predicting global stability [16], the proposed approach could provide more in-depth information for the potential emergency control which aims to resynchronize the disturbed power system as fast as possible. The implementation framework is simply shown in Fig. 6.

As illustrated in Fig. 6, a distributed simulator for fault response of power systems according to previous study [31] serves to generate massive data samples based on the given operation point in a parallel computing environment. Although it is an off-line procedure from the traditional concept, however, due to the fast variability of operation point including unexpected topology change, updating training samples ensures generalization of data. Therefore, the simulated fault scenarios which reflect various stability patterns of all the generators are collected accumulatively to update sample database in the proposed framework, resulting in an intensive computation burden of the standalone ANN training.

In the previous literature, the well-trained ANN is used as stability predictor which normally does not need update. However, the generalization of ANN could be enhanced by means of re-training the updated transient samples. The presented technique, shown in the dotted frame, enables high-performance computing framework and algorithm of large-scale parallelized BPNNs training. The training efficiency highly depends upon the number of DataNodes, which operates in a similar way of distributed samples generation [31].

Theoretically, in order to enable industrial application, the time consumption of samples update and thousands of BPNNs re-training could be reduced to an acceptable level, few minutes for example, by configuring sufficient computing resources based on the scalable Hadoop framework. The trained parallel BPNNs for each generator are stored in the distributed environment. When a fault occurs in power system, the PMUs installed on generator busbars capture the disturbance signal in a few cycles, providing the trajectories-based features which are imported into the well-trained parallel CUGs predictors that can avoid time-consuming series ANNs prediction. As a result, CUGs information, the most leading or lagging generators, will be sent back to the WAMS application, contributing to further emergency or correction control.

4 Case study

4.1 Test systems and training data

NPCC 48-machines test system and a provincial power system in the southwest of China [31] are used to validate the proposed technique. Details of test systems are shown in Table 3.

Table 3 Details of test system

Full size table

In order to obtain BPNN-based CUGs predictor with high generalization ability, rich data samples are required, preferably including all the potential operation conditions and fault modes. Therefore, massive samples are generated for both test systems. In this work, a distributed random fault simulator has been developed to generate massive samples [31]. Random fault refers to stochastic three-phase short circuits on any transmission lines. In addition, fault clearing time is randomly set between 0.1 s to 0.35 s. The samples generation procedure is in the following steps.

Step 1: Load base case, if the initial outage exists, trip the component and calculate power flow.

Step 2: Change P and Q on each bus by multiply a random number in the range of [0.8, 1.4] to simulate load level, allocate unbalance power to all generators in proportion to their base generation.

Step 3: Implement three-phase fault on a randomly selected component at time T _f, clear fault at T _f+μ, where μ is a random decimal in [0.1, 0.35].

Step 4: Perform time-domain simulation for above randomly configured operation and fault scenario, collect output trajectories to calculate features.

The results of data samples production based on the fault simulator are detailed in Table 4. In addition, the volume of data which are fed to train and validate parallel BPNNs in this study is shown in Table 5 according to (9).

Table 4 Details of simulated samples

Full size table

Table 5 Volume of data

Full size table

In Table 5, V _max and V _min represent the maximal and minimal block of data samples for individual generator in test systems. Table 4 and Table 5 indicate that the massive samples production as well as ∆T-dependent features definition for generators according to (5) results in a huge volume of data. However, a standalone BPNN inputs instances one by one leading to both sizable IO overhead and considerable computational overhead. As a result, the most reasonable way of identifying the CUGs in such a great volume of data is to apply the presented parallel BPNN which parallelizes both training and classification phase. Furthermore, it is the most feasible approach to enable WAMS application integrated with CUGs prediction according to the architecture in Fig. 6.

4.2 Computing cluster configuration

In order to evaluate the performance of MapReduce based parallel BPNN enabling predicting CUGs after fault clearing, a practical Hadoop cluster was built up. The cluster contains ten nodes, nine of which are Datanodes and the rest one is Namenode. Table 6 shows the configuration of the cluster.

Table 6 Configuration of computing cluster

Full size table

4.3 Evaluation of MapReduce based parallel BPNN

4.3.1 Precision of CUGs prediction

In this evaluation, we tested the algorithm precision of the generator status prediction. In terms of precision, when the number of training instance is large, the presented MapReduce based parallel BPNN has the same precision compared to that of standalone BPNN. Therefore, the following only lists the precision of the parallel BPNN without comparison with a standalone BPNN algorithm. Fig. 7 illustrates the precision of CUG identification for two test systems.

The figure recording the CUGs predicting precision of test systems indicates that the parallel BPNN is of satisfactorily high precision in identifying the generators transient status during the post-fault period of power systems. The average precisions for all generators of two test systems are 99.18% and 98.57% respectively.

However, it is worth noting that features dimension which depends upon the determination of ∆T affects CUGs prediction precision. In this study, we choose BPNNs possessing the highest average precision to be the CUGs predictor, even though it is not the best ∆T choice for all the generators. Table 7 gives the details.

Table 7 Precision details of test systems

Full size table

Table 7 indicates that when the value of ∆T equal to 5 cycles and 8 cycles respectively, two test systems get the BPNNs predictor with highest average precision. However, the particular machines, 3^rd generator in system I and 16^th generator in system II for example, could only obtain their most precise NN-based predictors by setting longer ∆T.

4.3.2 New samples validation

In order to test the generalization ability of parallelized ANNs based CUGs predictor, thousands of new simulated samples of two test systems are generated and fed into the trained ANNs to assess their prediction precision. The additional samples information is listed in Table 8.

Table 8 Details of new cases for prediction precision test

Full size table

Aiming to validate the comprehensive performance of MapReduce based CUGs predictor with new data cases, the numerical studies are investigated by retraining the BPNNs using the initial data samples introduced in Table 4 with different ∆T setting. Then the well-trained BPNNs are applied to predict CUGs using the data of new cases as input with the corresponding ∆T setting. The test results of precision statistics are provided in Table 9.

Table 9 Results of new sample cases test

Full size table

The validation results listed in Table 9 indicate that the parallelized BPNNs based on the presented technique have a well-performance of generalization ability for the new data cases which are not included in the training data set.

4.3.3 Features dimension impact analysis

As discussed above, the features dimension determined by ∆T according to (3) is highly related to the prediction precision. However, it is hardly to understand in advance that which ∆T choice is of the highest precision. Owing to MapReduce and HDFS technique, the parallelized BPNNs with thirteen types of features dimension for every single generator are trained and validated at the same time.

In this test, we focus on the comparison of algorithm precision with increasing dimension of the instance. Fig. 8 shows the variation of CUGs prediction precision against the change of ∆T.

It can be observed that, along with an increasing number of elements involved in an instance, the final precision of BPNNs trained for single generator changes in a nonlinear way. It is assumed that over-fitting largely affects training procedure. Moreover, due to the major part of generators obtain their most accurate BPNNs when ∆T equal 0.1 s and 0.16 s respectively in System I and System II, the average precisions are highest by selecting these ∆T for all generators. Fig. 9 illustrates statistical distribution of BPNNs with highest precision in term of ∆T.

4.3.4 Validation of ensemble training

Figure 10 indicates that the presented ensemble based neural network algorithm outperforms the standalone neural network in terms of precision.

The figure shows that, when the number of training instance is small, the precision of the ensemble training strategy in the presented parallel BPNN outperforms that of a standalone BPNN algorithm. The figure also tells that the precision of ensemble training strategy increases stably without fluctuations.

4.3.5 Algorithm efficiency

In this test, we primarily focused on the evaluation of algorithm efficiency. We duplicate the training data from 1 MB to 1024 MB, with 16 mappers employed. The experimental result is shown in Fig. 11.

It can be observed that when the data size is small, the standalone BPNN outperforms the parallel BPNN due to the overhead of Hadoop framework. However, when the data size keeps increasing, the parallel BPNN can still execute efficiently. Contrarily, the standalone BPNN needs more time to execute the data processing.

According to the test result, the time consumption of training 1 GB data by standalone BPNN costs around 9000 s. It is obviously not acceptable for continually updated samples production application, such as the technique architecture presented in Fig. 6. In order to investigate efficiency of the presented CUGs prediction algorithm, the comprehensive comparison in term of training time consumption is performed in Table 10.

Table 10 Efficiency comparison

Full size table

The efficiency comparison indicates that along with the computing resource increasing, the algorithm processing time is largely reduced without accuracy loss. If more Datanodes are configured, the efficiency will be further improved so as to accomplish training procedure in a few minutes even a few seconds. That enables the on-line training for the huge bulk of updated samples.

5 Conclusion

This paper presents a high-performance CUGs prediction approach using MapReduce based parallel BPNN. Our work in the first place employs time domain simulation to generate massive disturbed scenarios using the published fault simulator. Secondly we propose features selection principles to produce feature vector which represents the system status with reasonable data dimension. Thirdly for overcoming the disadvantages of standalone application, MapReduce based BPNNs technique is developed aiming to facilitate simultaneous training for every single generator. The presented methodology employs ensemble technique and data separation to enhance the training efficiency, whilst it uses data separation in enabling scalable integration of computing resource as well as large classification efficiency improvement. The experiment results show that the presented technique is able to predict CUGs especially in large-scale data with high accuracy and efficiency, providing a way to in-depth awareness of stability based on WAMS architecture.

References

Baldick R, Chowdhury B, Dobson I et al (2009) Vulnerability assessment for cascading failures in electric power systems. In: Proceedings of the 2009 power systems conference and exposition (PSCE’09), Seattle, WA, 15–18 Mar 2009, 9 pp
Vaiman M, Bell K, Chen Y et al (2012) Risk assessment of cascading outages: methodologies and challenges. IEEE Trans Power Syst 27(2):631–641
Article Google Scholar
Miao L, Fang JK, Wen JY et al (2013) Transient stability risk assessment of power systems incorporating wind farms. J Mod Power Syst Clean Energy 1(2):134–141. doi:10.1007/s40565-013-0022-2
Article Google Scholar
De La Ree J, Centeno V, Thorp JS et al (2010) Synchronized phasor measurement applications in power systems. IEEE Trans Smart Grid 1(1):20–27
Article Google Scholar
Makram EB, Vutsinas MC, Girgis AA et al (2012) Contingency analysis using synchrophasor measurements. Electr Power Syst Res 88(1):64–68
Article Google Scholar
Alvarez JMG, Mercado PE (2007) Online inference of the dynamic security level of power systems using fuzzy techniques. IEEE Trans Power Syst 22(2):717–726
Article Google Scholar
Chow JH, Chakrabortty A, Arcak M et al (2007) Synchronized phasor data based energy function analysis of dominant power transfer paths in large power systems. IEEE Trans Power Syst 22(2):727–734
Article Google Scholar
Cepeda JC, Rueda JL, Colomé DG et al (2014) Real-time transient stability assessment based on centre-of-inertia estimation from phasor measurement unit records. IET Gener Transm Distrib 8(8):1363–1376
Article Google Scholar
Liu JH, Chu CC (2014) Wide-area measurement-based voltage stability indicators by modified coupled single-port models. IEEE Trans Power Syst 29(2):756–764
Article Google Scholar
Makarov YV, Du PW, Lu S et al (2012) PMU-based wide-area security assessment: Concept, method, and implementation. IEEE Trans Smart Grid 3(3):1325–1332
Article Google Scholar
Gomez FR, Rajapakse AD, Annakkage UD et al (2011) Support vector machine-based algorithm for post-fault transient stability status prediction using synchronized measurements. IEEE Trans Power Syst 26(3):1474–1483
Article Google Scholar
Kamwa I, Samantaray SR, Joos G (2010) Development of rule-based classifiers for rapid stability assessment of wide-area post-disturbance records. In: Proceedings of the 2010 IEEE PES general meeting, Minneapolis, MN, 25–29 Jul 2010, 1 pp
Xu Y, Dong ZY, Zhao JH et al (2012) A reliable intelligent system for real-time dynamic security assessment of power systems. IEEE Trans Power Syst 27(3):1253–1263
Article Google Scholar
Guo TY, Milanović JV (2014) Probabilistic framework for assessing the accuracy of data mining tool for online prediction of transient stability. IEEE Trans Power Syst 29(1):377–385
Article Google Scholar
Hashiesh F, Mostafa HE, Khatib AR et al (2012) An intelligent wide area synchrophasor based system for predicting and mitigating transient instabilities. IEEE Trans Smart Grid 3(2):645–652
Article Google Scholar
Al-Masri AN, Ab Kadir MZA, Hizam H et al (2013) A novel implementation for generator rotor angle stability prediction using an adaptive artificial neural network application for dynamic security assessment. IEEE Trans Power Syst 28(3):2516–2525
Article Google Scholar
Cui MJ, Ke DP, Gan D et al (2015) Statistical scenarios forecasting method for wind power ramp events using modified neural networks. J Mod Power Syst Clean Energy 3(3):371–380. doi:10.1007/s40565-015-0138-7
Article Google Scholar
Gu R, Shen FR, Huang YH (2013) A parallel computing platform for training large scale neural networks. In: Proceedings of the 2013 IEEE international conference on big data, Silicon Valley, CA, 6–9 Oct 2013, pp 376–384
Rizwan M, Jamil M, Kothari DP (2012) Generalized neural network approach for global solar energy estimation in India. IEEE Trans Sustain Energy 3(3):576–584
Article Google Scholar
Yuan JW, Yu SC (2014) Privacy preserving back-propagation neural network learning made practical with cloud computing. IEEE Trans Parallel Distrib Syst 25(1):212–221
Article Google Scholar
Ikram AA, Ibrahim S, Sardaraz M et al (2013) Neural network based cloud computing platform for bioinformatics. In: Proceedings of the 2013 IEEE Long Island conference on systems applications and technology (LISAT’13), Farmingdale, NY, 3 May 2013, 6 pp
Long LN, Gupta A (2008) Scalable massively parallel artificial neural networks. J Aerosp Comput Inf Commun 5(1):3–15
Article Google Scholar
Alham NK (2011) Parallelizing support vector machines for scalable image annotation. PhD Thesis. Brunel University, London
Kundur P (2012) Power system stability and control, 3rd edn. McGraw-Hill, New York
Google Scholar
Rajapakse AD, Gomez F, Nanayakkara K et al (2010) Rotor angle instability prediction using post-disturbance voltage trajectories. IEEE Trans Power Syst 25(2):947–956
Article Google Scholar
Hagan MH, Demuth HB, Beale MH (1996) Neural network design. PWS Publishing Company, Boston
Google Scholar
Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Commun ACM 51(1):107–113
Article Google Scholar
He BS, Fang WB, Govindaraju NK et al (2008) Mars: a MapReduce framework on graphics processors. In: Proceedings of the 17th international conference on parallel architectures and compilation techniques (PACT’08), Toronto, 25–29 Oct 2008, pp 260–269
Taura K, Kaneda K, Endo T et al (2003) Phoenix: a parallel programming model for accommodating dynamically joining/leaving resources. ACM SIGPLAN Notices 38(10):216–229
Article Google Scholar
Apache Hadoop. http://hadoop.apache.org/. Accessed 07 June 2016
Liu YB, Liu Y, Liu JY et al (2014) A cloud computing framework for cascading failure simulation and analysis of large-scale transmission systems. In: Proceedings of the 2014 international conference on power system technology (POWERCON’14), Chengdu, 20–22 Oct 2014, pp 287–293

Download references

Author information

Authors and Affiliations

School of Electrical Engineering and Information, Sichuan University, Chengdu, 610065, China
Youbo LIU, Yang LIU & Junyong LIU
Electronic and Computer Engineering, Brunel University, Uxbridge, Middx, London, UB8 3PH, UK
Maozhen LI & Gareth TAYLOR
Senior Power System Engineer, National Grid, Bearwood road, Wokingham, RG6 3DU, UK
Zhibo MA

Authors

Youbo LIU
View author publications
You can also search for this author in PubMed Google Scholar
Yang LIU
View author publications
You can also search for this author in PubMed Google Scholar
Junyong LIU
View author publications
You can also search for this author in PubMed Google Scholar
Maozhen LI
View author publications
You can also search for this author in PubMed Google Scholar
Zhibo MA
View author publications
You can also search for this author in PubMed Google Scholar
Gareth TAYLOR
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yang LIU.

Additional information

CrossCheck date: 3 June 2016

Appendices

Appendix A

Generally k-means algorithm tries to minimize the within-cluster sum of squares, that is, to minimize

$$ \mathop {\arg \hbox{min} }\limits_{S} \sum\limits_{i = 1}^{k} {\sum\limits_{{x_{j} \in S_{i} }} {D^{2} \left( {x_{j} ,\mu_{i} } \right)} } $$

where k is the number of cluster, S _i is the set of the i ^th cluster (or elements in the i ^th cluster), μ _i is the mean of the points in i ^th cluster and D ²(x _j, μ _i) is the distance between point x _j and μ _i.

The above flow chart is the primary procedure of k-means algorithm. The third and fourth step can be further expressed as following mathematic equations respectively

$$ S_{i}^{(t)} = \left\{ {\left. {x_{p} :D\left( {x_{p} ,\mu_{i}^{(t)} } \right) \le D\left( {x_{p} ,\mu_{j}^{(t)} } \right)\quad \forall 1 \le j \le k} \right\}} \right. $$

$$ m_{i}^{(t + 1)} = \frac{1}{{\left| {S_{i}^{(t)} } \right|}}\sum\limits_{{x_{j} \in S_{i}^{(t)} }} {x_{j} } $$

where (t) means the t ^th iteration and |S ^(t)_i | denotes the number of elements in ith set (S _i). Each point x _p is assigned to exactly one set S _i.

The “replicate” indicates the number of times to repeat the clustering, each with a new set of initial cluster centroid positions. k-means returns the solution with the lowest value for sum of distance. The times to repeat the clustering depends on the distinction in clustering data. The choice of the initial clustering center has a large effect on the clustering results. It may lead bad results if the initial clustering center is not appropriate. In this case, increase the times to repeat the clustering with different set of initial cluster centroid positions can improve the accuracy of clustering results.

Appendix B

To solve the issue of insufficient training for sub-BPNN, the solution is to re-sample the training dataset based on bootstrapping and then execute majority voting. The operations are able to reduce the misclassification errors and increase the classification accuracy. Balanced bootstrapping ensures that each training instance equally appears in the bootstrap samples. However, it might not be always the case that each bootstrapping sample contains all the training instances. The method to create the balanced bootstrap samples is to construct a number of instances X ₁, X ₂, X ₃,…, X _n repeating B times so that a sequence of Y ₁ , Y ₂ , Y ₃,…,Y _Bn is achieved. A random permutation p of the integers from 1 to B _n is taken. Therefore the first bootstrapping sample can be created from Y _p(1), Y _p(2), Y _p(3),…, Y _p(n), moreover the second bootstrapping sample from Y _p(n+1), Y _p(n+2), Y _p(n+3),…, Y _p(2n) and so on, until Y _p((B−1)n+1), Y _p((B−1)n+2), Y _p((B-1)n+3),…, Y _p(Bn) is the B ^th bootstrapping sample. The majority voting is a commonly used combination technique. The ensemble classifier predicts a class for a testing instance which is predicted by the majority of base classifiers.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Reprints and permissions

About this article

Cite this article

LIU, Y., LIU, Y., LIU, J. et al. High-performance predictor for critical unstable generators based on scalable parallelized neural networks. J. Mod. Power Syst. Clean Energy 4, 414–426 (2016). https://doi.org/10.1007/s40565-016-0209-4

Download citation

Received: 30 November 2015
Accepted: 25 April 2016
Published: 14 July 2016
Issue Date: July 2016
DOI: https://doi.org/10.1007/s40565-016-0209-4

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

High-performance predictor for critical unstable generators based on scalable parallelized neural networks

Abstract

Similar content being viewed by others

Ensembles of Artificial Neural Networks for Smart Grids Stability Prediction

Applications of Artificial Neural Networks in the Context of Power Systems

Automatic Generator Re-dispatch for a Dynamic Power System by Using an Artificial Neural Network Topology

1 Introduction

2 ANN-based CUGs prediction

2.1 Definition

2.2 Clusterwise CUGs identification

2.3 Features selection

2.4 Data samples production

3 Methodology of MapReduce based parallel BPNN

3.1 Issue of data volume

3.2 Parallelization of BPNN

3.2.1 Standalone BPNN

3.2.2 MapReduce and Hadoop framework

3.2.3 Ensemble technique

3.2.4 Algorithm design

3.3 Implementation framework

4 Case study

4.1 Test systems and training data

4.2 Computing cluster configuration

4.3 Evaluation of MapReduce based parallel BPNN

4.3.1 Precision of CUGs prediction

4.3.2 New samples validation

4.3.3 Features dimension impact analysis

4.3.4 Validation of ensemble training

4.3.5 Algorithm efficiency

5 Conclusion

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Appendices

Appendix A

Appendix B

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation