1 Introduction

The rapid development of data science and artificial intelligence (AI) has made digitalization one of the major solutions for tackling the sustainability issues of energy systems [1, 2]. The benefits of energy system digitalization are being actively explored by government [3], industry [4] and academia [5] from various perspectives, ranging from fault diagnosis of power generation units [6] to the synthesis of regional energy infrastructure networks [7]. By virtue of improved sensors, AI algorithms, cloud computing, remote control and related technologies, digitalization provides a new norm for systematizing knowledge of physical phenomena, in particular for energy systems, which include a variety of energy conversion, transmission and consumption processes. Starting from the monitoring and recording of massive amounts of data, both real-time and static [2, 8], the power of data can be unleashed to support timely decision-making for optimized operation, anomaly diagnosis, demand-side response and much more [9]. Therefore, a periodic summary of what the digitalization of energy systems has achieved so far, and of what lies ahead for further synchronization between digitalization and the energy transition, is necessary.

Compared to the current fossil fuel dominated energy system, digital technology will become even more important as the energy system transitions to net-zero, with deep structural changes and a variable renewable energy supply dominated by solar and wind [10]. On the one hand, the inherent intermittency of solar and wind needs to be numerically calibrated for accurate energy system planning and operation at different time scales (e.g. year, day, hour); on the other hand, the decentralization of energy supply, together with the blurred boundary between energy producer and consumer, highlights the necessity of cross-region energy flow coordination, which will also need data-enabled intelligence. Nevertheless, many legacy business models in the energy sector may not be up to these challenges. For example, the match between supply and demand in the current energy system is mainly dominated by the supply side. Energy supply barely changes while demand remains largely flat; this will not be the case in a net-zero energy system, because demand will have to fluctuate with supply as well. Clearly, the current labor force of the energy system is not prepared for such an operational transition; by contrast, digitalization has access to more granular data and advanced analytics capability, which allows net-zero emission energy systems to accurately quantify the benefits brought by their operations. In the past five years, global early-stage venture capital investments in digital energy start-ups increased from USD 167 million to USD 447 million [10]. Yet despite the successful digital transformation cases reported, the design principles and operation regimes of energy systems remain largely unchanged in practice [11]. The integration of digital technologies and energy systems is still poorly articulated: for example, almost every piece of equipment (e.g. pump, boiler, turbine, pipe) in a thermal power plant has its own high-fidelity simulation model, but these models are rarely fully used in integrated plant simulation and control. In other words, the digital artifacts of the different components of energy systems have not been well linked to each other to fully unleash the power of numerical modeling and optimization in the energy transition.

A high-level conceptualization of using digital technologies through a cyber-physical system is shown in Fig. 1. Cyber-physical systems for future energy systems have been proposed and intensively investigated in recent years [12, 13]. A Cyber-Physical System (CPS), defined as “co-engineered interacting networks of physical and computational components” [14], is a further advancement of the Internet of Things (IoT), embedded systems, cloud computing and ambient intelligence. CPS aims to create a virtual representation, or digital twin, of real entities in physical space and to seek optimal solutions to real-world problems by exploring the cyber space, as shown in Fig. 1. Essentially, information about real entities is first collected through sensors and then sent to distributed computing platforms. The computational engine in such a platform then comes up with a simulated strategy and applies it to the simulator. The simulated strategies are then evaluated in reality, reconfigured if necessary and sent back to the simulator until the final optimal solution is found and applied to the real problem. The digital assistance provided by such CPS has already transformed many traditional industries (e.g. automobile, electricity, health care); however, the energy industry, although it plays a fundamental role in current society, has not been sufficiently engaged. In this review, digital assistance refers to the integration of human knowledge and machine agents during task execution [15]; a “task” is a general description of an energy system question, which could be finding better catalysts for energy product formulation, synthesizing a better power generation flow sheet, or configuring a better international energy supply chain. In terms of computational framework, there are three basic elements of such energy system digital assistance, namely data, analysis and connectivity. Data provides the necessary information input as the starting point of digital assistance; analysis is responsible for producing useful insights from the data input; and connectivity bridges the communication gaps between humans, devices and machines (including M2M) so that data and models can be effectively collected, analyzed and implemented.
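The loop of sensing, simulating, evaluating in reality and reconfiguring described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `physical_system` stands in for a real plant, the linear twin `y = gain * u + bias` is a deliberately crude digital representation, and all names and numbers are assumptions chosen for illustration rather than part of any CPS framework cited in this review.

```python
def physical_system(u):
    """Stand-in for the real plant; its parameters are unknown to the cyber layer."""
    return 3.0 * u + 2.0


def cps_loop(target, gain=1.0, bias=0.0, tol=1e-6, max_iter=50):
    """Propose an input in cyber space, evaluate it in reality, reconfigure the twin."""
    history = []  # (input, measured output) pairs collected from the "sensors"
    u = None
    for _ in range(max_iter):
        # Cyber space: input that the twin y = gain * u + bias predicts will hit the target.
        u = (target - bias) / gain
        # Physical space: apply the input and measure the response.
        y = physical_system(u)
        history.append((u, y))
        if abs(y - target) < tol:
            break
        if len(history) >= 2 and history[-1][0] != history[-2][0]:
            # Reconfigure gain and bias from the two most recent observations.
            (u0, y0), (u1, y1) = history[-2], history[-1]
            gain = (y1 - y0) / (u1 - u0)
            bias = y1 - gain * u1
        else:
            # With a single observation, only the bias can be corrected.
            bias += y - (gain * u + bias)
    return u, history


u_opt, history = cps_loop(target=11.0)
```

In this toy setting the twin starts with wrong parameters and is corrected from measurements until the strategy proposed in cyber space also reaches the target in the physical system; a real CPS would replace both functions with far richer models and data streams.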

Fig. 1 Architecture of cyber-physical system enabled digitalization

Previous representative studies on digital technology enabled energy system transition have been examined and are summarized here. A recent paper points out that digital transformation could add USD 300 to 500 billion of economic value to the chemical industry, together with a 60–100 million tonne reduction in CO2 emissions [16]. The newly launched US Department of Energy Clean Energy Smart Manufacturing Innovation Institute (CESMII) also supports the future integration of smart manufacturing and the energy industry, one important aspect of which is exploring the use of smart manufacturing conceptions to improve the efficiency and sustainability of the energy industry [17]. One of many examples is the application of machine learning algorithms (e.g. support vector machines, artificial neural networks) to predict the utility (power, heat) demand of various processes [18]. Similarly, the impact of digitizing process control and operation has been investigated in the literature [19]. However, it has to be underlined that constructing digital assistance for energy industry applications is far more than building a machine learning algorithm for a single application: digital assistance aims for a fully integrated control loop that contains the hierarchy of information at different levels of detail and can thus enable automatic scenario analysis through intelligent agent action. The detailed advantages of digitization are elaborated and demonstrated in Sect. 4. Summarizing the research achievements of energy system digitization is no easy task, because many technologies already contribute to digitization without being clearly labeled as such, and because it is difficult to define the system boundary of CPS projects when physical space and cyber space are so tightly intertwined.

Yet in order to understand the current status and future trends of energy system digitization, the following questions need to be answered: What kind of data exist in the energy industry? How can data be efficiently collected and processed given the current data architecture? What kind of intelligent agents could be developed based on such data? Are there any successful demonstration projects at various scales? If so, what lessons could be learned from such pilot projects? To answer these questions, a preliminary literature review is presented in this paper. First, we summarize the main conceptions and enabling technologies of current energy system digitalization; second, several digitization-enabled energy system research projects are presented as examples; finally, based on the experience accumulated from these projects, a technology road map towards net-zero energy system transition using digital innovations is suggested and discussed.

The rest of this paper is organized as follows: Sect. 2 provides an introduction of energy system digitization conceptions and key enabling technologies; Sect. 3 lists several representative energy industry digitization applications; Sect. 4 presents the current energy system digitization challenges and gaps; Sect. 5 delivers the final conclusions.

2 Energy system digitization: conceptions and key enabling technologies

2.1 Energy system digitization conceptions

The evolution of digitization can be traced back to the 1940s, when the American mathematician Norbert Wiener introduced cybernetics as “control of any system using technology” [20], after which the term cyber space came to be widely used to describe “the infinite artificial world where humans navigate in information-based space” [21]. Recently, CPS has become a well-recognized scientific term [22]. Several digitization conceptions are quite similar to CPS, including:

  • Internet of Things (IoT) [23]. IoT is a vision of the future where many millions of devices are connected over the internet, allowing them to collect information about the real world remotely, and share it with other systems and devices.

  • Embedded System [23]. An embedded system is a self-contained system that incorporates elements of control logic and real-world interaction.

  • System of systems (SoS) [23]. A system of systems (SoS) is a system composed of components, which are also independent systems in their own right.

  • Industry 4.0 [24]. Industry 4.0 is a term that originated in the area of manufacturing engineering and represents the fourth industrial revolution: the ability of industrial components to communicate with each other.

  • Machine-to-Machine (M2M) [24]. M2M communication refers to the ability of industrial components to communicate with each other.

The above conceptions mutually overlap in the sense that they all highlight the orchestration of linked computers and physical systems, both horizontally and vertically. Generally, digitization is a quite inclusive conception, covering implementation approaches (e.g. IoT), sectoral applications (e.g. Industry 4.0), and systematic abstractions (e.g. SoS) [24]. In addition, the NIST definition of CPS [14], “co-engineered interacting networks of physical and computational components”, is used here to denote that digitization emphasizes the iterative interaction between physical space and cyber space rather than pure numerical simulation and optimization. This definition also aligns with the future trend of the net-zero energy system transition, in which the energy system will evolve into a decentralized and interconnected network rather than isolated dots. In this context, interpreting digitization from the viewpoint of “interacting networks” bears more significance.

2.2 Key enabling technologies

Digitization, as a collective innovation, is enabled by various technologies related to both hardware setup and software design. The EU CyPhERS (Cyber-Physical European Roadmap and Strategy) project summarized the key enabling technologies as computation, sensor/actuator, communication and informatics [23], whereas the German Academy of Science and Engineering (Acatech) paper (Living in a networked world) categorized the required technologies into six aspects, namely physical awareness, prediction ability, coordination, human–machine interaction, learning ability, and adaption flexibility [24]. In the context of energy system digitization, the key enabling technologies include data acquisition, data fusion, data analytics, decision-making and decision implementation. Figure 2 depicts how these key technologies enable the digitization of the energy system; detailed descriptions of the technologies are as follows:

  (a) Data acquisition. Acquiring up-to-date data across the different sectors of the energy system (i.e. generation, storage, transmission, and consumption) is the very first step towards any later data manipulation. For example, phasor measurement units (PMUs) have been widely used to monitor the transient processes of power supply and play an important role in modern power grid optimization [25]. Yet the high penetration of variable renewable energy has changed the operation regime of the power system, bringing additional difficulty to data acquisition [26]. Similarly, data from other energy domains, for example natural gas networks, have been monitored on different time scales, yet face new challenges such as increasing variability and regional coordination. Such challenges correspond to information flows 1 and 2 in Fig. 1, which aim to bring the important operational information about the energy system into the computation engine for post-processing and/or numerical manipulation.

  (b) Data fusion. Data fusion mainly refers to the process of merging data from different sources and dealing with possible heterogeneity and inconsistency [27]. Data integration is an indispensable step for many digitization applications, especially when cloud computing is involved. For example, many district-scale energy systems have their own control center where sub-system level information is exchanged. In such cases, it is important to have a standard platform to handle the heterogeneity of the different data sources (i.e. the selection and processing parts in Fig. 2 (a)). Enforcing consistent coded rules and logical relationships for data is one promising option, and some early demonstrations are available (e.g. [28]). However, whether this is the best way to overcome data heterogeneity, and how to combine such symbolic reasoning methods with machine intelligence, remain open questions.

  (c) Data analytics. Most of the literature on energy system digitization is concerned with data analytics, and many promising cases have been reported, including building energy management [29], power plant optimization [30], and district-scale energy system planning [31]. Re-usability is a major concern here. Most such studies use handcrafted data and deliberately tailored algorithms that are suitable only for the specific problem at hand, which leads to low re-usability of the solution [29]. To some degree, the challenges of large-scale energy system digitization largely come down to this re-usability issue: instead of being case-dependent, solutions should be designed for high adaptability, so that the digital assistance is able to solve the Nth-of-a-kind (NOAK) problem once it has successfully solved the first-of-a-kind (FOAK) problem [32].

  (d) Decision-making. Another important feature of energy system digitization is the coordinated decision-making ability. On the one hand, energy systems have physical entities at different levels of detail; on the other hand, different interacting agents exist in energy systems, as shown in Fig. 2 (b). As a result, digital assistance should be able to provide a multi-agent simulation platform that can mimic the interaction between different entities in the energy realm (e.g. between power generator and end user, or between market operator and system regulator) [33]. A coordinated problem-solving strategy is of great importance for future energy system digitization because the utilization of distributed energy resources will be a key feature of the future energy system [34]. In addition, digital assistance for energy systems can integrate external information to improve the entire energy supply chain and thereby achieve enterprise-wide optimization [35].

  (e) Decision implementation. Digitization offers the possibility of applying closed-loop control to the energy system by coupling cyber and physical spaces, ideally automatically. Successful examples include the scheduling of appliances in smart homes through optimization algorithms embedded in the different appliances [36]. However, digitization-enabled fully optimal operation of energy systems is so far limited to certain scenarios under specific conditions. A large gap remains before the future smart energy system has a cross-sector self-configuring regime that achieves better deployment through digitization [37].
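As an illustration of the data fusion step in item (b), the following minimal Python sketch merges readings of the same quantity from two sources that report in different units, and flags records on which the sources disagree. The record layout, the source names (`scada`, `meter`), the asset names, the conversion table and the 5% tolerance are all assumptions made for this example; they do not come from any of the cited platforms.

```python
from collections import defaultdict

# Hypothetical conversion factors to a common unit (kW).
TO_KW = {"kW": 1.0, "MW": 1000.0, "W": 0.001}


def fuse(readings, conflict_tol=0.05):
    """Merge (source, asset, hour, value, unit) records into one value per (asset, hour)."""
    grouped = defaultdict(list)
    for source, asset, hour, value, unit in readings:
        # Resolve unit heterogeneity before comparing sources.
        grouped[(asset, hour)].append((source, value * TO_KW[unit]))
    fused = {}
    for key, items in grouped.items():
        values = [v for _, v in items]
        mean = sum(values) / len(values)
        spread = max(values) - min(values)
        # Flag an inconsistency if the sources differ by more than the relative tolerance.
        conflict = mean != 0 and spread / abs(mean) > conflict_tol
        fused[key] = {
            "kW": mean,
            "sources": [s for s, _ in items],
            "conflict": conflict,
        }
    return fused


readings = [
    ("scada", "boiler_1", 0, 1.2, "MW"),
    ("meter", "boiler_1", 0, 1210.0, "kW"),
    ("scada", "pump_3", 0, 45.0, "kW"),
    ("meter", "pump_3", 0, 60000.0, "W"),
]
result = fuse(readings)
```

Here the two `boiler_1` readings agree within tolerance and are averaged, whereas the `pump_3` readings are flagged as conflicting; how such conflicts should be resolved (for example by rules, by source priority or by learned models) is exactly the open question raised in item (b).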

Fig. 2 Schematic of energy system digitization: (a) a holistic working flow; (b) links between physical and cyber spaces

3 Energy system digitization applications

Applications of digital technologies in energy systems cover various sectors, including primary energy resource identification, energy production and chemical manufacturing, energy transportation and distribution, and energy retail, at different levels (equipment, process, plant, region), as shown in Fig. 3. Several typical cases are presented in this section to demonstrate the capabilities of digitization.

Fig. 3 Landscape of energy system: (a) hierarchy perspective; (b) horizontal perspective

3.1 Model predictive control

Model predictive control (MPC) has been a classic topic in energy system operation for years [38]; in MPC, developing large-scale, hierarchical, high-fidelity yet fast-response models is essential [39]. Two types of numerical models are commonly used in MPC, namely first-principle models and data-driven models. First-principle models start from the underlying physics of energy conversion and utilization processes (e.g. chemical reaction, heat and mass transfer) and express it as carefully calibrated mathematical equations. First-principle models (e.g. CFD models) can provide very high-fidelity representations of energy system entities, but they are computationally expensive and difficult to reproduce. Data-driven models offer some advantages compared to first-principle models but normally require large training data sets. In this context, digitization, through data acquisition and data fusion, provides a perfect platform to advance the application of data-driven model based MPC because: (a) digital assistance provides rich data sources from the sensor network, which can feed the data-driven model; (b) data-driven model development and simulation can be done in a cloud environment or on a cloud-edge platform, which provides better computational resources for MPC.

A successful MPC application in steam-methane reforming (SMR) control is reported by Thomas et al. [17]. SMR is a common chemical process with high energy consumption because the required reaction temperature is higher than 400 ℃. Current industrial practice in firebox control is basically to use thermocouple measurements of the firebox temperature to adjust the fuel flow rate correspondingly. This is a very simple control logic that does not make full use of MPC. In this case, CPS-enabled digital assistance can solve the problem. First, a pre-developed CFD model is used to generate a surrogate model for MPC, and an infrared thermal camera is used to capture the detailed temperature distribution of the firebox during process operation. A computer vision agent is then deployed to analyze the infrared camera images and extract the temperature at key points in the firebox, and the results are compared with the surrogate model outputs to find the optimal input (e.g. fuel injection angle). Finally, the temperature distribution measured under the new configuration is compared with the surrogate model outputs to check whether the surrogate is accurate enough; if not, the deviations are used to further improve the surrogate model.

This modeling approach is referred to as the egg-create SMR (EC-SMR) model, which reaches a compromise between computing time and prediction accuracy. When used for furnace balancing, this model takes about a minute to compute the optimal valve position and achieves a 44% reduction in the standard deviation of the tube wall temperature distribution [17]. By contrast, the computing time of CFD models is too long for them to be implemented in real-time control. Wehinger et al. [40] establish a high-precision particle-resolved CFD model to simulate the SMR process in a small firebox containing 113 catalyst particles. A laminar case with a mesh of 3.2 million cells takes 1.7 × 10⁷ s on an Intel Xeon 3.07 GHz CPU. Furthermore, Pashchenko et al. [41] develop a pseudo-homogeneous CFD model to investigate the SMR process. The numerical results show good agreement with the experimental measurements: the mean deviation is less than 5%, and the computing time for a single case ranges from several minutes to hours. It is also notable that, in the architecture of the EC-SMR model, the computation-intensive parts, such as CFD simulation and computer vision, are conducted on the cloud, whereas the computationally light parts, such as calibration of the surrogate model, are done on the edge. As a result, such digital assistance can be applied to various cases in which local computation resources are limited.
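The workflow described above (build a cheap surrogate from an expensive model, use it to pick the control input, then correct it with measurements) can be sketched in a few lines of Python. This is a one-step simplification under stated assumptions, not a full receding-horizon MPC and not the EC-SMR implementation of [17]: `expensive_model` and `plant` are invented stand-ins for a CFD simulation and a real furnace, and the linear surrogate with an additive correction term is chosen only for brevity.

```python
def expensive_model(u):
    """Stand-in for a slow high-fidelity simulation (temperature vs. fuel input)."""
    return 400.0 + 50.0 * u - 2.0 * u * u


def plant(u):
    """Stand-in for the real furnace, offset from the model by an unknown 5 degrees."""
    return expensive_model(u) + 5.0


def fit_line(points):
    """Ordinary least-squares fit of y = a * u + b."""
    n = len(points)
    su = sum(u for u, _ in points)
    sy = sum(y for _, y in points)
    suu = sum(u * u for u, _ in points)
    suy = sum(u * y for u, y in points)
    a = (n * suy - su * sy) / (n * suu - su * su)
    b = (sy - a * su) / n
    return a, b


def run_controller(setpoint, steps=5):
    # Offline: sample the expensive model a few times and fit a cheap surrogate.
    a, b = fit_line([(u, expensive_model(u)) for u in (0.0, 2.0, 4.0)])
    correction = 0.0  # learned offset between surrogate and measurements
    log = []
    for _ in range(steps):
        # Online: invert the corrected surrogate to pick the next input.
        u = (setpoint - b - correction) / a
        measured = plant(u)
        log.append((u, measured))
        # Use the deviation between measurement and surrogate to improve the surrogate.
        correction = measured - (a * u + b)
    return log


log = run_controller(setpoint=500.0)
```

Only the three offline samples call the expensive model; every online step uses the surrogate plus one measurement, which mirrors the split between computation-intensive work and light calibration discussed in the text.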

3.2 Enterprise-wide optimization

Enterprise-wide optimization is a new frontier in energy system engineering that aims to simultaneously optimize the material supply chain, manufacturing, product distribution and retail [35]. Compared to traditional optimization of a specific power plant or factory, it emphasizes the coordinated interactions between transactional management (e.g. energy supply chain and organization management) and production optimization (e.g. production planning, production process and product quality), as shown in Fig. 4. A key feature of enterprise-wide optimization is the spatial and temporal integration of information across different subsystems and timescales. Beyond the fundamental ability to provide information, enterprise-wide optimization places higher requirements on comprehensive decision-making. To realize such a capability, the digital assistance should be able to handle the following challenges: (a) development of mathematical models that capture the complexity of energy system operations; (b) multi-scale optimization frameworks that work across various spatial and temporal scales; (c) stochastic programming tools that can account for the inherent uncertainties of energy system optimization (e.g. demands, prices); (d) efficient computational algorithms and platforms that achieve a just-in-time trade-off between accuracy and speed [42]. Digitization provides a perfect platform for realizing such enterprise-wide optimization: (a) digitization encompasses hierarchical models that provide detailed information about different units, and surrogate models that determine the optimized operation of the energy system; (b) the optimization framework in energy system digitization can be automatically formulated and solved by intelligent agents, which could greatly facilitate the solution of stochastic programs under uncertainty.
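To make challenge (c) concrete, the following Python sketch solves a toy two-stage stochastic program by enumeration: energy is contracted ahead of time at a fixed price (first stage), and any shortfall is bought on a more expensive spot market once demand is known (second stage). The scenario probabilities, demands, prices and the candidate grid are invented for illustration, and real enterprise-wide problems would require a proper mathematical programming solver rather than enumeration.

```python
def expected_cost(contracted, scenarios, contract_price, spot_price):
    """Expected total cost of a first-stage contract over all demand scenarios."""
    total = 0.0
    for probability, demand in scenarios:
        # Second stage: cover any shortfall on the spot market.
        shortfall = max(0.0, demand - contracted)
        total += probability * (contract_price * contracted + spot_price * shortfall)
    return total


def best_contract(scenarios, contract_price, spot_price, candidates):
    """Pick the candidate contract volume with the lowest expected cost."""
    return min(
        candidates,
        key=lambda c: expected_cost(c, scenarios, contract_price, spot_price),
    )


# Hypothetical demand scenarios as (probability, demand in MWh).
scenarios = [(0.3, 80.0), (0.5, 100.0), (0.2, 130.0)]
candidates = range(0, 151, 10)
best = best_contract(scenarios, contract_price=50.0, spot_price=120.0, candidates=candidates)
best_cost = expected_cost(best, scenarios, 50.0, 120.0)
```

With these assumed numbers the cheapest decision is to contract 100 MWh, which hedges the two most likely scenarios and leaves the rare high-demand case to the spot market.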

Fig. 4 Schematic of enterprise-wide optimization

The MANGOret optimization framework reported by Petkov et al. [43] is a good example showing that digitization is a powerful tool for the enterprise-wide optimization of existing building retrofits. The building sector accounts for 28% of energy-related CO2 emissions. Most existing buildings will still be standing when the 2050 climate goal is due, and therefore have to be retrofitted in the coming decades [44]. Digital assistance enables the MANGOret optimization framework to answer questions that traditional optimization approaches cannot, including when to initiate a retrofit, which retrofit to carry out and how to choose investment strategies by balancing the trade-off between cost and CO2. The optimization results show that: (a) a large commercial building in St. Gallen has an aggregate emissions reduction of 510 tCO2 between the minimum-cost and minimum-emissions Pareto points for an overall 10% increase in cost; (b) to balance the two objectives (cost vs. CO2 emissions), the optimal solution is to wait as long as possible until the components must be retrofitted; and (c) buildings equipped with solar PV and heat pumps, which are already attractive today, make up a significant portion of the solutions for building retrofits.
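The notion of Pareto points between cost and CO2 emissions used above can be illustrated with a short Python sketch that filters a set of candidate retrofit options down to the non-dominated ones. The option names and all numbers are invented for this example and are not MANGOret results; the framework in [43] solves a much richer optimization problem.

```python
def pareto_front(options):
    """Return options not dominated in (cost, emissions); lower is better for both."""
    front = []
    for name, cost, co2 in options:
        dominated = any(
            (other_cost <= cost and other_co2 <= co2)
            and (other_cost < cost or other_co2 < co2)
            for _, other_cost, other_co2 in options
        )
        if not dominated:
            front.append((name, cost, co2))
    # Sort from the minimum-cost end to the minimum-emissions end.
    return sorted(front, key=lambda option: option[1])


# Hypothetical retrofit options: (name, relative cost, emissions in tCO2).
options = [
    ("no_retrofit", 100.0, 900.0),
    ("envelope_only", 105.0, 700.0),
    ("heat_pump_and_pv", 110.0, 390.0),
    ("late_heat_pump", 112.0, 500.0),
    ("deep_retrofit", 130.0, 380.0),
]
front = pareto_front(options)
```

In this invented data set `late_heat_pump` is dropped because `heat_pump_and_pv` is both cheaper and cleaner, while the remaining options trace the trade-off curve from which a decision maker would choose.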

Enterprise-wide optimization can also be narrowed down to the single-plant level. The power plant, whether fossil fuel based or renewable based, is one of the most important components of the energy system. The current operation regime of power plants is mainly dominated by static empirical rules, which are being challenged by the growing attention to the dynamic and transient characteristics of power generation. Digitization can provide more power in this transition. I4GEN (Insight through the Integration of Information for Intelligent Generation) is an exemplary project run by the Electric Power Research Institute (EPRI) to create digitally connected and dynamically optimized power plants [30]; the optimization system structure of this project is shown in Fig. 5. The ultimate goal of the project is to transform the huge amount of power plant operation data into actionable intelligence whenever needed, so that future power plant operation can be intelligent. To make this transformation possible, I4GEN defines three enabling technologies (real-time information; distributed and adaptive intelligence; action and response) and six digital networks (sensors and actuation; data integration and information management; advanced process control; asset monitoring and diagnostics; advanced O&M; optimization). In the first instance, the project demonstrates its capability in power plant fault diagnosis through advanced pattern recognition algorithms, such as analyzing turbine blade vibration data to prevent turbine damage and analyzing cooling tower motor temperature data to spot possible clogs. In other cases, the project also shows that, with proper communication between neighboring power plants in the same grid, a networked operation control strategy can be formulated to coordinate fossil fuel based generation and variable generation to produce reliable and cheap power. In the future scenario of low-carbon power supply, more projects like I4GEN should be conducted to unleash the potential of digitization in power plant operation.
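As a much simplified illustration of data-driven fault diagnosis on sensor streams such as vibration data, the Python sketch below flags samples that deviate strongly from a trailing window of recent values. The trailing-window z-score is a generic technique swapped in for illustration, not the pattern recognition algorithm used in I4GEN, and the signal, window length and threshold are assumed values.

```python
from statistics import mean, stdev


def detect_anomalies(signal, window=5, threshold=3.0):
    """Flag indices whose value deviates from the trailing-window mean by more than
    `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(signal)):
        reference = signal[i - window:i]
        mu = mean(reference)
        sigma = stdev(reference)
        if sigma > 0 and abs(signal[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical vibration amplitudes with one spike at index 7.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 2.5, 1.0, 1.1]
anomalies = detect_anomalies(vibration)
```

Only the spike is flagged here; the samples after it are not, because the spike itself inflates the standard deviation of the following windows. This is one reason why practical diagnostics use more robust statistics or learned models.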

Fig. 5 EPRI I4GEN project structure

3.3 Smart city

The conception of enterprise-wide optimization can be extended to other domains such as smart eco-industrial parks and smart cities. Digital City Exchange, a Research Councils UK funded project at Imperial College London [45], is a good example. The Digital City Exchange project aims to revolutionize urban infrastructure by integrating energy, transport, waste and utility resources. The project takes advantage of recent advances in pervasive sensing, large-scale modeling, new analytical and optimization techniques, web services technologies, the Internet of Things and cloud computing to find innovative solutions for optimizing the use and planning of cities. Specifically, the project looks into the following aspects: sensors and data, cross-sector integration, real-time data incorporation, and digital services implementation. Key findings of the project include urban sensor data integration techniques [46], the design of reliable communication networks for sensors [47], the interaction between behavioral economics and transportation energy consumption [48] and the impact of dynamic pricing on demand response [49]. The multi-scale 3D-GIS approach for solar potential analysis [50] is another example of a digital application for smart cities. This approach integrates data acquisition, data preparation, solar analysis, and solar income dissemination, and was developed to support the effective planning of solar panel installations. According to the computed direct solar radiation for an old town in Innsbruck, the roofs have a mean radiation value of 688.65 kWh over half a year and the walls 191.68 kWh, which results in a mean radiation value of 339.20 kWh per half year over all building surfaces.

This approach has been applied to the energy system management of the Jurong Island eco-industrial park in Singapore through an energy system digitization project named J-Park Simulator (JPS). In JPS, every entity, from a network of industrial plants down to individual parts of a unit operation, can be digitally represented. This requires both suitable representations of entities and their collected or measured data, and representations of physical behavior as a function of parameters or process conditions, in other words computational models. The overall system architecture of JPS is depicted in Fig. 6 and incorporates several ideas from the previous sections. JPS utilises concepts and relations from different domains that have been designed in a modular manner, including OntoCAPE as a large-scale ontology for chemical process engineering [51], OntoEIP as an extension of the former to eco-industrial parks and their networks, and OntoCityGML as an ontological version of CityGML, a standardised format for 3D models of cities and landscapes [52]. With the help of these vocabularies, entities can be described semantically and form a large knowledge graph in which each entity is represented by a node and each relation by an edge between two nodes. The knowledge graph can span several application domains and several levels of hierarchy. Moreover, the knowledge graph can be distributed across different hosts, which is represented in Fig. 6 by rectangles enclosing different parts of the knowledge graph. JPS provides the ability to grant independently controlled access privileges to different parts of the knowledge graph. For example, all tenants of an industrial park could be allowed to access a private data repository that provides information about the available central utilities; using this in conjunction with their own private repositories, the tenants could optimize their use of the central utilities.
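The idea of a knowledge graph with independently controlled access to its parts can be sketched with a toy triple store in Python. This is only a conceptual illustration: JPS itself relies on ontologies such as OntoCAPE and OntoCityGML and on distributed hosts, whereas the class below, its method names, the entity names and the single `owner` field per triple are all assumptions made for this example.

```python
class KnowledgeGraph:
    """Toy triple store: entities are nodes, (subject, relation, object) triples are edges."""

    def __init__(self):
        self.triples = []  # each entry is (subject, relation, object, owner)

    def add(self, subject, relation, obj, owner="public"):
        """Add an edge; `owner` names the only non-public user allowed to read it."""
        self.triples.append((subject, relation, obj, owner))

    def query(self, user, subject=None, relation=None):
        """Return the triples visible to `user` that match the optional filters."""
        return [
            (s, r, o)
            for s, r, o, owner in self.triples
            if owner in ("public", user)
            and (subject is None or s == subject)
            and (relation is None or r == relation)
        ]


kg = KnowledgeGraph()
kg.add("IndustrialPark", "hasPlant", "PlantA")
kg.add("CentralUtility", "suppliesSteamTo", "PlantA")
kg.add("PlantA", "hasUnit", "Reactor1", owner="tenant_a")

visible_to_a = kg.query("tenant_a", subject="PlantA")
visible_to_b = kg.query("tenant_b", subject="PlantA")
shared = kg.query("tenant_b", relation="suppliesSteamTo")
```

Both tenants can read the shared information about the central utility, but only `tenant_a` can see the internal structure of its own plant, which mirrors the shared and private repositories described above.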

Fig. 6 System architecture of J-Park Simulator

The structure of the knowledge graph and the data stored therein are not static but evolve with time, for example during the operation of a chemical plant or when analyzing what-if scenarios. JPS realizes this dynamic behavior through agents. Agents are software components that work together and cover a broad range of functionality. This is represented by the upper red layer in Fig. 6, where triangles and dashed lines denote agents and the collaboration between them, respectively. Agents can operate on specific parts of the knowledge graph or on specific types of entities. Agents can also execute optimization algorithms by using solvers and reasoners. Some agents might even formulate simulation strategies or make decisions.
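How collaborating agents can update a shared store of facts may be sketched as follows. The sketch is a deliberately small assumption-laden illustration, not the JPS agent framework: a plain dictionary stands in for the knowledge graph, each hypothetical `Agent` declares which entries it reads and which entry it writes, and the quantities and factors are invented.

```python
class Agent:
    """A software component that reads some entries of a shared store and writes one."""

    def __init__(self, name, reads, writes, compute):
        self.name = name
        self.reads = reads
        self.writes = writes
        self.compute = compute

    def run(self, graph):
        graph[self.writes] = self.compute(*(graph[key] for key in self.reads))


def run_agents(agents, graph):
    """Run every agent once, as soon as all the entries it reads are available."""
    pending = list(agents)
    progressed = True
    while pending and progressed:
        progressed = False
        for agent in list(pending):
            if all(key in graph for key in agent.reads):
                agent.run(graph)
                pending.remove(agent)
                progressed = True
    return graph


# Hypothetical agents: one estimates fuel input, the other derives emissions from it.
emission_agent = Agent("emission", ["fuel_input"], "co2", lambda fuel: fuel * 2.0)
fuel_agent = Agent(
    "fuel", ["steam_demand", "boiler_efficiency"], "fuel_input",
    lambda demand, efficiency: demand / efficiency,
)

graph = {"steam_demand": 40.0, "boiler_efficiency": 0.8}
run_agents([emission_agent, fuel_agent], graph)
```

Although the emission agent is listed first, it only runs after the fuel agent has written `fuel_input`, so the collaboration is driven by the data in the store rather than by a fixed call order.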

4 Digitization for net-zero energy system transition

4.1 Potential applications

In order to reach economy-wide net-zero, decarbonization of the energy system is indispensable. A successful transition to the future net-zero energy system calls for combinations of technologies to eliminate emissions, whose innovation and deployment require overcoming technological and organizational challenges [53]. In general, the decarbonization of the energy system can be achieved in three ways, namely lowering demand-side energy use through energy efficiency, changing the supply-side energy structure from fossil fuels to renewables, and deploying carbon capture, utilization and storage. None of these transitional changes would be possible without the assistance of digitization, which applies to both the planning and operation stages of the energy system. At the planning stage, digitization can facilitate the accommodation of variable renewable generation, both centralized and decentralized, mostly through the use of energy system models. At the operation stage, digitization provides customers with the necessary information on real-time supply and demand, so that they can either shave their demand or maximize their own benefits by selecting the optimal supplier in a free energy market. In a similar manner, peer-to-peer energy trading in a decentralized energy system can also be facilitated by digitization.

In addition to the above-mentioned power system applications, digitization could also support emerging areas of the net-zero energy system, for example H2, electric vehicles, and carbon emission management. In terms of H2, numerous studies have acknowledged its role as an important energy vector in a net-zero future because of its high energy density of about 120 MJ/kg, which is 1.5 times larger than that of methane [54]. Nevertheless, as of 2020, 95% of H2 was produced from fossil fuels, especially by steam reforming of natural gas, which emitted 830 tons/year of CO2 [55]. Hence, green hydrogen production through electrolysis has been recognized as an important clean energy source with a relatively low carbon footprint. For green hydrogen production, there appears to be a clear need for digitization applications that map real-time production rates to operating conditions and support forward planning of the entire renewable energy extraction and capital investment strategy [54]. To predict hydrogen production using distillery wastewater as substrate, Sridevi et al. [56] developed a back-propagation neural network model with a 4–20–1 network topology, which uses the Levenberg–Marquardt algorithm for learning. The developed model achieves a regression coefficient of 0.87 between experimental and simulated data for the hydrogen production rate. Electric vehicles, together with heat pumps, have been identified as important flexible demand-side resources in the future energy system. Two-way communication between electric vehicles and the power grid could create a win–win situation in which electric vehicles are charged at low power rates without affecting their daily travel schedules.
Realizing such a scenario relies on the power of data: traffic big data is needed to infer daily travel patterns, electricity market data could be used to obtain near real-time tariff information, and optimization engines have to be deployed to reach a multi-objective Pareto optimum between charging cost and travel reliability. With the assistance of digitization technologies, Ren et al. [57] proposed a strategy to minimize the payback period of the rooftop PV and batteries deployed to achieve net-zero energy for electric bus transportation. A case study covering 28 bus routes and 1224 building rooftops in a real region of Hong Kong shows that the shortest payback period of 3.98 years and an annual solar energy generation of 9007 MWh are achieved by effectively tackling design issues such as battery oversizing, PV misallocation, and battery misallocation. Carbon emission management is another feature of the net-zero energy system that needs digital assistance. Accurate and reliable emission accounting serves as the foundation for further emission trading and/or pricing, and merging data from various sources, such as energy consumption and emission factors, needs a generic framework that builds on mutual trust.
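
The trade-off between charging cost and travel reliability mentioned above for electric vehicles can be sketched as a small multi-objective search. The example below is only an illustration under stated assumptions, not the method of any cited study: the hourly tariff values, the 7 kW charger power, the function names, and the use of "idle hours between the last charging hour and departure" as a proxy for travel reliability are all hypothetical.

```python
from itertools import combinations


def charging_plans(tariff, hours_needed, power_kw):
    """Enumerate every way to pick `hours_needed` charging hours before departure."""
    n = len(tariff)
    plans = []
    for hours in combinations(range(n), hours_needed):
        cost = sum(tariff[h] * power_kw for h in hours)
        # Reliability proxy (assumption): idle hours left after the last charging hour.
        slack = n - 1 - hours[-1]
        plans.append({"hours": hours, "cost": round(cost, 2), "slack": slack})
    return plans


def pareto_front(plans):
    """Keep plans that no other plan beats on both cost (lower) and slack (higher)."""
    front = []
    for p in plans:
        dominated = any(
            q["cost"] <= p["cost"] and q["slack"] >= p["slack"]
            and (q["cost"] < p["cost"] or q["slack"] > p["slack"])
            for q in plans
        )
        if not dominated:
            front.append(p)
    return sorted(front, key=lambda p: p["cost"])


# Hypothetical tariff (currency per kWh) for six plug-in hours before departure.
TARIFF = [0.30, 0.25, 0.12, 0.10, 0.15, 0.28]

front = pareto_front(charging_plans(TARIFF, hours_needed=2, power_kw=7.0))
for plan in front:
    print(plan)
```

Each plan on the resulting front is a defensible compromise: cheaper plans finish charging closer to departure, while plans with more slack pay higher tariffs. Exhaustive enumeration is only practical for a handful of hours; a realistic optimization engine would use a dedicated multi-objective solver.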

4.2 Overcoming challenges

  • Heterogeneity. From a data perspective, energy system data is characterized not only by large volume but also by high heterogeneity [58]. With modern smart electricity meters, data could be collected every minute, resulting in millions of data entries per year. Also, data from different sources (sensors, texts, web etc.) could come in totally different formats (tables, figures, natural language, mathematical equations etc.). Moreover, such data may not be semantically interoperable [59], because domain knowledge is often known only implicitly to domain experts, and different domains might refer to the same concept by different terms, and vice versa (see data fusion and data analytics in Fig. 2 (a)). To some degree, overcoming such data heterogeneity is as important as handling the huge data volume in energy system digitization. Data heterogeneity is a significant issue that can affect communication performance and the design of communication protocols. Systems need to be able to support a great number of different applications and devices.

  • Re-usability. Currently, handcrafted data and tailored analytics methods are used for the specific problem at hand, which leads to solutions with quite low re-usability. Instead of being case-dependent, analytics methods should be designed for high adaptability, so that they can deal with similar problems derived from the specific one. Hence, re-usability is a non-trivial question and requires the data analytics method to be continuously trained and to learn and evolve accordingly (see data analytics in Fig. 2 (a)).

  • Privacy. The challenge is to balance privacy concerns and personal data control against the possibility of accessing data to provide better services. Because the digitization framework manages large amounts of data, including sensitive information such as proprietary and commercial data, significant data privacy issues are raised. Digitization requires privacy policies to address these issues; thus, a data anonymization management tool is required to anonymize information before the system processes it (see data acquisition and data fusion in Fig. 2 (a)).

  • Security. Digitization must ensure security during communications because all actions among devices are coordinated in real time (see decision implementation in Fig. 2 (a)). As digitization expands and increases interactions between physical and cyber systems, security problems become more important. Traditional security infrastructures are not enough to address the issue and new solutions must be found; such challenges apply to both new and previously stored data. Lastly, digitization is based on heterogeneous applications and wireless communications, which often raise critical security issues due to the current vulnerability of complex communication networks [60].
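
To make the heterogeneity and privacy items above more concrete, the sketch below shows one possible pre-processing step for smart-meter records: readings reported in different units are converted to a common unit, direct identifiers are dropped, the meter ID is replaced by a keyed hash, and the timestamp is coarsened to the hour. All field names, the secret key and the sample records are hypothetical. Note also that a keyed hash is pseudonymization, which is weaker than full anonymization, so this is a starting point rather than a complete privacy solution.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key-management service.
SECRET_KEY = b"demo-key-not-for-production"

# Conversion factors to a common unit (kWh).
TO_KWH = {"Wh": 0.001, "kWh": 1.0, "MWh": 1000.0}


def pseudonymize(meter_id, key=SECRET_KEY):
    """Replace a meter ID with a keyed hash (HMAC-SHA256), truncated for readability."""
    digest = hmac.new(key, meter_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def prepare_reading(raw):
    """Return a record with a common unit, a coarsened timestamp and no direct identifiers."""
    return {
        "meter": pseudonymize(raw["meter_id"]),
        "hour": raw["timestamp"][:13],  # keep 'YYYY-MM-DDTHH' only
        "kwh": raw["value"] * TO_KWH[raw["unit"]],
    }


# Made-up sample records from two sources that report in different units.
raw_readings = [
    {"meter_id": "M-001", "owner": "Alice", "timestamp": "2024-01-15T08:42:10",
     "value": 1500, "unit": "Wh"},
    {"meter_id": "M-002", "owner": "Bob", "timestamp": "2024-01-15T08:43:55",
     "value": 2.0, "unit": "kWh"},
]

clean = [prepare_reading(r) for r in raw_readings]
for record in clean:
    print(record)
```

Because the same meter always maps to the same pseudonym under a given key, downstream analytics can still aggregate per meter without seeing the original ID or the owner field.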

5 Conclusions

As an emerging technology, digitization is expected to greatly promote the transition to a net-zero energy system. However, the materialization of digitization in the energy system has introduced a new level of complexity that is not yet ready to scale. How to deal, through digitalization, with the unintended effects resulting from the complicated human-nature-engineering interaction in the energy system remains unclear. Such challenges range from data management to communication protocols, and from efficient algorithm development to instructive visualization. To better understand the current status and future challenges of energy system digitization, this paper first analyzes the origins and history of digitization technologies. Key enabling technologies for digitization are also summarized, including increasing the number of intelligent devices, providing smaller, more energy-efficient sensing and actuating devices, building frameworks, standards and platforms for multi-agent systems (ontologies), and developing algorithmic solutions for mining and fusing data and for learning and adapting plans.

A list of digitization applications in the energy system is investigated in the review, such as model predictive control, enterprise-wide optimization, and power plant process control. Inspection of such demonstration projects shows that data heterogeneity is still a significant issue that can undermine digital system performance and re-usability. One potential solution for such non-trivial data integration could be to use an ontology (e.g., a definition of entities and the relationships between them) to structure and describe the data semantically. The overall success of energy system digitization also depends on a consistent approach that allows seamless data communication across different domains and secure interoperability of different components within the cyber-physical system. If such heterogeneity, together with the privacy and security challenges, is successfully overcome, digitization could offer additional potential, such as linking data from different sources, facilitating the integration of sensor data into computational models, combining such models into a complex model structure, and implementing the model results in real applications. Such an integrated framework is particularly promising when applied horizontally across different energy system sections (e.g., production, transmission, consumption), making it a powerful tool for the energy system transition to net-zero.

Finally, it has to be underlined that both energy systems and digitalization are quite broad research topics in themselves, let alone their combination. Thus, our preliminary review of such an important yet still unclear topic cannot provide a holistic view of the problem; more specific case studies or an even broader literature review could be pursued as follow-ups.