A Review on Providing Realistic Electric Grid Simulations for Academia and Industry

Engineering analysis and design for large-scale electric power grids require advanced modeling and simulation capabilities for a variety of studies, with two of the key study types being steady-state power flow and time-domain stability. In order to promote innovation in this area, during a time of rapid change, much recent work has been done on enhancing the availability of grid models and simulation datasets for the benefit of both academia and industry. The purpose of this paper is to review these new developments. Over the last several years, there have been many different developments in electric grid power flow and stability analysis. In power flow, key new changes include (1) the inclusion of geographic coordinates, (2) the addition of geomagnetic disturbance analysis, (3) the direct inclusion of weather data, and (4) new optimal power flow (OPF) and security-constrained OPF algorithms, some of which utilize machine learning. Key developments in stability are (1) many new models, particularly for inverter-based resources, (2) wider availability of interactive stability simulations, and (3) greater use of wide-area visualization in both power flow and stability. The paper shows the range of software platforms available for large-scale electric grid power flow and stability simulations, along with associated data formats. It also considers modeling enhancements, including the ability to capture more detailed dynamics and coupling to inter-related infrastructure. The paper also summarizes the availability of test case datasets, both real and synthetic.


Introduction
The focus of this paper is to provide a survey of some of the recent work associated with providing access to large-scale electric grid simulations and the associated software. Electric grids worldwide are in a time of rapid transition due to a wide variety of changes including the addition of large amounts of renewable and distributed resources, the electrification of transportation, much more energy storage, more technology for monitoring and control, smarter distribution systems, and much more sophisticated electricity markets.
One result has been the need for many more people to learn about the design and operation of large-scale electric grids.
Given the breadth of this topic and because electric grids have dynamics over many different time scales, there are naturally many software tools available, well beyond the scope of a single paper to cover. Therefore, the focus here is on electric transmission grid simulations done either in the quasi-steady state "power flow" timeframe, which assumes a constant or uniform electric grid frequency, or in what has been called the transient or frequency stability timeframe, with dynamics ranging from cycles to minutes [1,2]. Early papers describing these applications include [3][4][5][6][7], while [8] provides a summary of the use of such tools in electric grid operations. Software that represents the much shorter dynamics based on the electromagnetic transients approach [9] warrants its own paper and hence is not covered here. This paper's focus is on two general algorithm classes. The first, generically called power flow, assumes the electric grid is operating at a constant frequency and has few, if any, explicitly modeled dynamics. Here, this generic term includes three related applications: contingency analysis, which involves multiple power flow solutions, each applying a particular disturbance (i.e., a contingency) to the case, such as opening a transmission line; optimal power flow (OPF), which combines the power flow with an economic operation optimization; and the security-constrained OPF (SCOPF), which is similar to the OPF except that it also optimizes over a set of contingencies.
The second, generically called stability, models the electric grid with time scales down to about an electrical cycle. Compared to power flow, stability provides a much more detailed representation of the electric grid, but because it has a shorter time frame, there is essentially no economic optimization. Sitting between these two is a third algorithm class, generically known as operator training simulators (OTSs). OTSs assume the electric grid is operating at uniform frequency and include dynamics on the order of seconds. As the name implies, OTSs are usually used to train electric grid operators and therefore have more limited applications compared to the other algorithms and will not be a focus here.
Before going further, it is helpful to first define several terms. A model is a mathematical description of how a type of device behaves. The term model instance (or just instance) is used to denote a particular device of the model type. Models have parameters that are fixed for each instance and inputs that can vary. Also, different models are commonly used in different algorithms for the same devices. For example, in the power flow, an electric load might be modeled as having a fixed power consumption, with the location of the load a parameter and its power consumption an input. In stability, a much more detailed model might be used. The combination of all the instances is then called an electric grid (or just a grid), and the combination of the grid instances with particular inputs is known as a case.
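The terminology above can be sketched as a small data structure. This is only an illustration of the model, instance, grid, and case distinction; the class and field names are hypothetical and do not correspond to any particular software package or case format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class LoadInstance:
    """One instance of a fixed-power load model (hypothetical naming)."""
    name: str
    bus: int  # parameter: the load's location, fixed for this instance


@dataclass
class Grid:
    """The combination of all model instances; only loads are shown here."""
    loads: List[LoadInstance] = field(default_factory=list)


@dataclass
class Case:
    """A grid combined with one particular set of inputs."""
    grid: Grid
    load_mw: Dict[str, float]  # input: power consumption, varies by case

    def total_load_mw(self) -> float:
        return sum(self.load_mw[load.name] for load in self.grid.loads)


# The same grid reused for two cases that differ only in their inputs.
grid = Grid(loads=[LoadInstance("L1", bus=2), LoadInstance("L2", bus=3)])
peak = Case(grid, {"L1": 120.0, "L2": 80.0})
light = Case(grid, {"L1": 60.0, "L2": 35.0})
```

Keeping the instance immutable while the inputs live on the case mirrors the distinction made in the text: parameters belong to the instance, and inputs belong to the case.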
The commonality between the power flow and stability is that they represent the grid using a quasi-static phasor representation [10] in which it is modeled as a set of algebraic equations. Where they differ is the degree to which their models represent the dynamics in electric grid devices. In the power flow, many of the faster dynamics are assumed to be algebraic constraints. An example of this is representing a synchronous generator as a source of real power with a fixed voltage magnitude (i.e., a PV bus); the exciter dynamics are assumed to be algebraic. However, the longer time frame of the power flow allows for generator redispatch and hence the optimization of the OPF. Stability represents the dynamics of many types of devices, with algorithms now supporting hundreds of models. References summarizing many of the current models include [11,12].
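As a rough illustration of the purely algebraic nature of power flow, and of contingency analysis as a series of power flow solutions, the following sketch uses the linearized "dc" power flow approximation rather than the full ac power flow used by the applications surveyed here. The three-bus grid, the per-unit values, and the function names are all assumptions made for this example.

```python
def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def dc_power_flow(n_bus, lines, injections, slack=0):
    """Return per-unit line flows keyed by (from_bus, to_bus)."""
    keep = [b for b in range(n_bus) if b != slack]
    idx = {b: i for i, b in enumerate(keep)}
    bmat = [[0.0] * len(keep) for _ in keep]
    for i, j, x in lines:
        for a, b in ((i, j), (j, i)):
            if a in idx:
                bmat[idx[a]][idx[a]] += 1.0 / x
                if b in idx:
                    bmat[idx[a]][idx[b]] -= 1.0 / x
    theta = solve(bmat, [injections[b] for b in keep])
    angle = {slack: 0.0}
    angle.update({b: theta[idx[b]] for b in keep})
    return {(i, j): (angle[i] - angle[j]) / x for i, j, x in lines}


def n_minus_1(n_bus, lines, injections, limit):
    """Open each line in turn and list the lines loaded above the limit.

    Islanding is not handled: every outage must leave the grid connected.
    """
    violations = {}
    for out in lines:
        remaining = [ln for ln in lines if ln is not out]
        flows = dc_power_flow(n_bus, remaining, injections)
        violations[(out[0], out[1])] = [
            k for k, f in flows.items() if abs(f) > limit
        ]
    return violations


# Hypothetical grid: (from_bus, to_bus, reactance in per unit).
lines = [(0, 1, 0.1), (0, 2, 0.1), (1, 2, 0.1)]
# Net injections in per unit; bus 0 is the slack bus and holds the generation.
injections = {0: 1.5, 1: -1.0, 2: -0.5}

base_flows = dc_power_flow(3, lines, injections)
overloads = n_minus_1(3, lines, injections, limit=1.2)
```

In this toy grid, opening either line connected to the slack bus pushes all 1.5 per unit of generation onto the other one, so each of those two contingencies produces an overload, while opening the line between buses 1 and 2 does not.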
The algorithms are then implemented in specific software applications, and usually a single application contains a number of different algorithms. The applications also include additional complementary functionality, such as visualization, that adds to their core usefulness. The applications are then used to simulate a case or a set of cases using the desired algorithm(s). Exactly what is in each case depends on the desired algorithm and the specific software. For example, an OPF case would be different from a stability case. How a simulation is done varies. Examples include a single, noninteractive power flow or stability solution, a series of solutions, or an interactive time-domain simulation involving one or multiple participants.
The rest of the paper is organized as follows. The next section provides a survey of some of the major software applications, with the following section discussing some of the recent developments impacting the models, algorithms, and complementary functionalities. The fourth section then provides a survey of the availability of different grids and cases, both real and synthetic [13]. The last section provides a summary and future directions.

Software Applications and Case Formats
Given that power flow and stability software dates back more than 50 years, there are many different available choices. Also, software popular in one country or region might not be common elsewhere. In addition, power flow and sometimes stability are included in the energy management systems (EMSs) used in electric grid control centers [8]. However, these applications are more specialized and are only available with the EMSs; hence, they are not covered here.
If the scope is limited to North America, the most popular generally available commercial tools continue to be those mentioned in [14], that is, GE Energy Consulting's PSLF [15], PowerWorld Simulator [16], Siemens' PSS®E [17], and Powertech Labs' DSATools [18]. All of these have both power flow and stability functionality and support a wide range of algorithms and complementary functionality. They also have lower-cost educational versions, and some have free downloads that can solve small grids; an educational version of PowerWorld Simulator is included with a textbook [19]. All of these tools can solve large-scale grids, which can now have more than 100,000 buses.
There are also a number of open-source power flow and stability applications, with [20] providing a summary. The most widely used open-source power flow application is Matpower [21,22], which consists of Matlab-language M-files implementing a number of power flow algorithms; Matpower does not include stability. Another Matlab-based tool that includes power flow and stability is the Power Systems Toolbox (PST) [23], which is included with a power dynamics textbook [1].
What enables people to use multiple software applications is common case formats. Most applications can save cases in proprietary binary formats and save at least some of the case information in text-based formats. An early public, text-based power flow format is given in [24], while [25] lists many of the current file types. However, many of the common text-based formats are not public, with [26] recommending that common file formats be publicly available.

New Power Flow and Stability Developments
While the core power flow and stability functionality has remained in place since its introduction in the 1960s, new developments have occurred over the decades. This section summarizes some of the more recent ones (see Fig. 1). One of the most important has been the trend toward including geographic information. Traditionally, cases have not included model geographic information since it is not needed to do the core studies and early computers had very limited memory. However, this is rapidly changing due to (1) the widespread availability of geographic information systems, (2) the visualization techniques that can leverage geographic information, and (3) the need for geographic information for geomagnetic disturbance (GMD) studies [27]. Examples of North American Electric Reliability Corporation (NERC) regions requiring the submission of geographic information for at least some applications include [28,29]. In the USA, the latitudes and longitudes for all generators larger than 1 MW are available at [30]. Hence, many cases now have geographic information.
Associated with the need for geographic information is expanding power flow and sometimes stability to account for the impact of geomagnetically induced currents (GICs) on the electric grid. This was driven in part by [31] highlighting the importance of considering high-impact, low-frequency (HILF) events, and [32] presenting a methodology for considering the GICs caused by GMDs. Then, [33], building on [34], integrates GMD analysis into the power flow. The NERC requirements for GMD studies are given in [27]. As a result, most commercial power flow software now includes GMD analysis algorithms. More recently, there has also been interest in including the impact of high-altitude electromagnetic pulses (HEMPs) in stability [35,36].
Also driven by the growing availability of cases with geographic information is the inclusion of weather in power flow studies. Given the rapid growth in large amounts of strongly weather-dependent wind and solar generation and the need for ambient-adjusted transmission line ratings [37], expanding power flow to include weather has a number of advantages. Works showing how this can be done include [38,39], with commercial packages starting to directly support the inclusion of weather. An example of this is shown in Fig. 1, in which historical weather is applied to a more recent power flow case. In the figure, a contour is used to visualize the wind speed, while the green and yellow ovals show the calculated outputs for the wind and solar generation that can then be used in power flow or OPF calculations.
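To give a sense of how a weather value could be turned into a generator output for use in a power flow case, the following is a minimal sketch of a simplified wind turbine power curve. It is not the method of [38,39] or of any commercial package; the cubic shape between cut-in and rated speed and all of the default speed values are assumptions chosen only for illustration.

```python
def wind_output_mw(speed_ms, rated_mw,
                   cut_in=3.0, rated_speed=12.0, cut_out=25.0):
    """Approximate wind generator output from wind speed in m/s.

    The cut-in, rated, and cut-out speeds are hypothetical defaults.
    """
    # Below cut-in there is too little wind; at or above cut-out the
    # turbine is assumed to shut down to protect itself.
    if speed_ms < cut_in or speed_ms >= cut_out:
        return 0.0
    # Between rated and cut-out speed the output is held at rated power.
    if speed_ms >= rated_speed:
        return rated_mw
    # Between cut-in and rated speed, scale with the cube of wind speed.
    fraction = (speed_ms**3 - cut_in**3) / (rated_speed**3 - cut_in**3)
    return rated_mw * fraction


# Hypothetical wind speeds (m/s) at the locations of three 100 MW wind plants.
site_speeds = {"WindA": 2.0, "WindB": 8.0, "WindC": 14.0}
outputs_mw = {name: wind_output_mw(v, 100.0) for name, v in site_speeds.items()}
```

The resulting values could then be used as the real power inputs of the corresponding generator instances in a power flow or OPF case; geographic coordinates are what make it possible to look up the wind speed at each plant in the first place.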
Recently, a number of new simulation developments have been taking place in the OPF area, including the SCOPF. The need for this was noted by two of the recommendations from [26], looking broadly both at optimization algorithms and at SCOPF. A survey of current methods is given in [40], which divides the SCOPF solution algorithms into two broad categories: model-based methods and machine learning (ML)-based methods. While model-based approaches are the most common, a key challenge continues to be making the algorithms fast enough to solve large cases with potentially large numbers of contingencies. Hence, some of the new research is focused on ML to more quickly solve at least part of the SCOPF.
Helping to drive the rapid innovation that has taken place in the SCOPF domain over the last few years is the US ARPA-E Grid Optimization (GO) Competition [41]. The goal of the GO competition is to accelerate the development of new grid optimization algorithms. GO is now on its third challenge, with details on the earlier results given in [42]. Some of the new results include the use of ML to help select the important SCOPF contingencies [43] and the use of principal component analysis (PCA) to restrict the SCOPF solution space [44].
There have also been many new developments associated with stability simulations. Much of this work has been associated with improving the quality of the stability models, including developing appropriate models for newer technologies such as renewable generators and storage. Two recent development summaries are [45,46]. Associated with this is the use of playback models to allow stability to play back recorded actual grid disturbances with the goal of verifying and improving the models. An ongoing issue in stability simulations is the degree to which generic models, with appropriately chosen parameters, can be used to represent grid behavior compared with more proprietary user-defined models (UDMs) [47][48][49][50]. Another development is supplementing the traditional physics-based models with more data-driven ones [51]. An example of large-scale stability simulation using a 110,000 bus grid is in [52].
While most stability simulations are not interactive, because of the recent growing interest in grid dynamics, interactive simulations are becoming more common. There is a long history of longer time-scale interactive simulations, with a nice summary provided in [53]. Examples of more recent stability simulations, capable of showing grid dynamics and oscillations, include [54,55]. An example of using such a simulation for university education is given in [56]. An example of a cyber-coupled stability simulation is given in [57], while co-simulations with part of the simulation represented in the stability timeframe and part in the shorter EMTP time frame are given in [58].
The last software topic considered is wide-area visualization. A key simulation consideration is that the person performing it understands what is going on. The term used to convey this concept is situational awareness (SA). While defined informally as "knowing what's going on," a more formal definition is "the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning and the projection of their status in the near future" [59,60]. In grid simulations, SA is often enhanced using wide-area visualization, with a survey of some of the recent visualization techniques given in [61] and a more focused consideration of SA in [62]. When the focus changes to communicating the results of an entire simulation, the concept of visual storytelling is helpful, with [63] providing an overview and [64] presenting grid examples. Figure 2 shows an example of a newer technique for visualizing the aggregate power flow and voltages in large-scale grids. In the figure, which shows results for an 82,000 bus synthetic grid, the size of the rectangles is proportional to the total generation in different regions, the rectangle's color shows its minimum bus voltage, and the green arrows visualize the flow of real power.

Available Electric Grids and Cases
Electric utilities and other entities responsible for the planning and operation of actual grids typically have a large number of available models and cases. Many engineers work on an ongoing basis to assemble datasets that represent a utility's portion of the grid, both as it currently exists and as it is predicted to be in the future. Both power flow and stability cases are created for those planning horizons, with typical practice being to create cases for multiple extreme conditions such as maximum (peak) summer, peak winter, off-peak shoulder, or minimum loading. In addition to planning cases, operational cases exist, which capture a snapshot of the grid as it was operated at a particular time, usually as an export of the EMS. These cases can be used operationally for assessing potential contingency concerns or instabilities, or as case studies for planning purposes.
Because electric utilities are interconnected into large-scale grids, maintaining useful model instances involves coordination between a large number of entities. For example, in the US state of Texas, the Electric Reliability Council of Texas (ERCOT), which is responsible for the Texas Interconnect, requires transmission system operators such as major utilities to supply power flow and stability cases, which ERCOT then coordinates, using what is known as the network model management system (NMMS), to create the overall planning base cases for the ERCOT system [66]. Most recent ERCOT planning cases are modeled with about 7000 buses. In the western portion of North America, the Western Electricity Coordinating Council (WECC) has a Modeling Subcommittee which performs a similar function for this system that spans parts of 13 states, Canada, and Mexico [67], with WECC planning models often being about 20,000 buses. In the North American Eastern Interconnection, the grid now is modeled with about 80,000 buses [68]. To further facilitate this effort, the US Federal Energy Regulatory Commission (FERC) has a required filing process for the Annual Transmission Planning and Evaluation Report (Form 715) for which US entities submit cases to facilitate reliability coordination [69].
Historically, actual grid cases were typically quite widely accessible by a variety of entities, including academic researchers; however, this changed in 2001. FERC has designated much actual electric grid information to be critical energy infrastructure information (CEII), meaning that it cannot be freely shared [68]. Sometimes, this data can still be accessible by academic researchers and others, but this must be done through the use of non-disclosure agreements (NDAs) to protect these CEII cases. While the concerns about data availability are certainly legitimate, the implications for research are significant. Early-stage research developments are often not tested on realistic grid cases, and any results tested on actual grid models cannot be published in detail, prohibiting peer researchers from cross-validating the results without identical access to the same CEII [26].

Fig. 2 Aggregate power flow visualization for an 82,000 bus grid using the approach from [65]
There are a few actual electric grid datasets outside the USA that are available in some form. One example is an older model of the electric transmission system of Poland [70]. This case contains power flow data and is sometimes used to test OPF algorithms. It does not contain stability information. To the authors' knowledge, no actual stability case of a large-scale transmission grid is publicly available.
For many decades, academic research has utilized what are known as the IEEE Test Cases [71]. Most of these test cases originated in the 1960s as simplified, anonymized versions of actual grids at that time. They range in size from 14 buses up to 300 buses and are very commonly used in the academic literature. In addition, there are other smaller test cases from textbooks, software vendors, and researchers, which sometimes are based on real systems and other times are fictitious. While these cases serve the research community for initial research, they are much smaller and simpler than actual grid cases. Many of them also have features that do not reflect modern grids, such as a strong dependence upon voltage support from synchronous condensers.
To address this problem, over the last several years, the concept of synthetic grids has emerged. Synthetic grids are also fictitious test cases, but they are built at least in part with an automated algorithm (rather than entirely by hand) so that they can be scaled to sizes comparable to industry interconnect cases and include additional complexities and features. The key research challenge in building synthetic grids is developing an algorithm that is completely free of CEII and yet can generate a case that is highly realistic and useful for research and development. One of the key aspects of the synthetic grid research problem is characterizing the topological structure of realistic electric grids. Early work in analyzing electric grids as graphs includes [72][73][74][75], with much of the emphasis being on showing how grid reliability (as in vulnerability to cascading failures) is tied to its topological structure.
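As a small illustration of treating a grid as a graph, the sketch below computes two generic graph measures, the average nodal degree and the average shortest path length, with buses as nodes and lines as edges. These are standard graph-theory quantities chosen for this example; they are not claimed to be the specific metrics used in [72][73][74][75], and the five-line grid is hypothetical.

```python
from collections import deque


def build_adjacency(lines):
    """Map each bus to the set of buses it is directly connected to."""
    adj = {}
    for i, j in lines:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj


def average_degree(adj):
    """Average number of lines incident to a bus."""
    return sum(len(neighbors) for neighbors in adj.values()) / len(adj)


def average_path_length(adj):
    """Mean hop count over all connected ordered bus pairs (BFS per bus)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical four-bus grid: a ring of four lines plus one diagonal.
lines = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 4)]
adj = build_adjacency(lines)
mean_degree = average_degree(adj)
mean_hops = average_path_length(adj)
```

A synthetic grid generator could compare statistics of this kind between its output and the ranges observed in actual grids, which is one way of checking topological realism without exposing any CEII.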
Several efforts have been made to reproduce topological characteristics of grids in synthetic networks, and some of these have also included various electrical properties and, most recently, geographic information [13,72,76,77,78,79]. When including geography, an automated approach begins with selecting a geographic footprint and seeding the algorithm with load and generation data (non-CEII) from that footprint. For example, in the USA, generator data is publicly available [30], and the load can be estimated from census data. The transmission planning algorithm follows, where an iterative process selects a subset of possible transmission lines that balances geographic, topological, and electric metrics. Once a candidate grid has been formed, a power flow solution is obtained by gradually transitioning to an ac power flow and adding in appropriate reactive power resources to control voltage and ensure power flow convergence. Finally, additional modeling data can be added, such as the parameters needed to solve for GMDs, OPF, or stability, as in [80]. Figure 3 shows an example of a synthetic grid case, geolocated on the ERCOT footprint of Texas. Although it shares geography, voltage levels, and generator locations with the actual ERCOT grid, none of the transmission facilities in this case are derived from any CEII. Other notable synthetic grids developed include a 2000-bus Texas case, a 10,000-bus Western US case, a 70,000-bus Eastern US case, and several smaller cases. Each of these cases also includes stability and GMD models; they are available at [81].
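The line selection step can be hinted at with a deliberately simplified sketch. It is not the algorithm of the cited works: it considers only geography, connecting hypothetical substations with a minimum spanning tree over great-circle distances, which would at most give a connected starting skeleton before the topological and electric metrics described above are brought in. The substation names and coordinates are made up.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def minimum_spanning_lines(substations):
    """Prim's algorithm: the shortest set of lines connecting all substations."""
    names = list(substations)
    in_tree = {names[0]}
    lines = []
    while len(in_tree) < len(names):
        # Pick the shortest candidate line leaving the connected set.
        length, u, v = min(
            (haversine_km(substations[u], substations[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        lines.append((u, v, length))
        in_tree.add(v)
    return lines


# Hypothetical substations as (latitude, longitude) in degrees.
substations = {
    "A": (30.0, -97.0),
    "B": (30.5, -97.0),
    "C": (30.0, -96.0),
    "D": (32.0, -97.0),
}
skeleton = minimum_spanning_lines(substations)
```

A spanning tree has no redundancy, so any single line outage would island part of the grid; additional lines would have to be added before the result could resemble an actual meshed transmission grid.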

Conclusion and Future Directions
Much work has been done in recent years toward enabling realistic power grid simulations. Commercial and open-source tools have continued to develop with a growing number of documented, interchangeable data formats. One of the big, recent changes in power flow cases has been the addition of geographic information, which has advantages for GMD, weather, visualization, and coordination with external datasets. Other changes include increased economic data for more advanced SCOPF, as well as expanding options for stability modeling of newer technologies. Tremendous progress has also recently occurred on developing synthetic grid cases, making them more realistic and larger.

Fig. 3 A 7000-bus synthetic electric grid geo-located in Texas, showing generator fuel types and transmission lines
A key area for future work is associated with the growing complexity of electric grids in both power flow and stability. Determining the appropriate level of modeling detail continues to be a challenge, as is computation and the need for the people in the simulations to have good situational awareness. Electric grids worldwide are transitioning to a higher level of inverter-based renewable generation, with periodic changes that need to be reflected in the models as the technology advances. The role of the distribution network and devices connected to the edge of the grid (such as electric vehicles and distributed generation like rooftop solar) is another area of particular attention in transmission system modeling, with challenges relating to how to aggregate these small, wide-area effects, which are in many cases impacted by weather, for bulk system studies at the power flow and stability time frames. Extreme events, like severe weather, present unique challenges for modeling and software, as they may in some cases stress models beyond their original intended region of operation; hence, they require further validation or modification. Another area for improvement is developing better synthetic datasets, including ones coupled with other infrastructure, and higher-quality stability test cases. As mentioned above, additional recommendations are to improve public documentation of exchangeable data formats and stability models.

Fig. 1 Power flow visualization of wind and solar generator outputs using historical weather information