Enhancing privacy in smart energy systems

The mission to move from fossil to renewable energy sources is accompanied and enabled by the digitalization of our energy systems. With the introduction of information and communication technologies, the widespread integration of distributed, renewable sources, even in the distribution grid, are enabled. New use cases such as fast EV charging, local energy communities and dynamic energy tariffs are also enabled. However, this move toward digitalization also increases the exposure of the energy systems for cybercrime and raises concerns regarding the privacy of personal data. In this article, we address the issue of privacy in smart energy systems and give an overview of current methods to enhance privacy. Die Energiewende weg von fossilen und hin zu erneuerbaren Energieträgern wird von der Digitalisierung unserer Energiesysteme begleitet und ermöglicht. Durch den Einsatz von Informations- und Kommunikationstechnologien wird die umfassende Einspeisung durch erneuerbare und dezentrale Energiequellen auch im Verteilnetz ermöglicht. Neue Anwendungsfälle wie das schnelle Laden von Elektroautos, gemeinschaftliche Erzeugungsanlagen und dynamische Energietarife werden dadurch realisierbar. Neben allen Vorteilen macht die Digitalisierung Energiesysteme aber auch verwundbarer gegen Cyber-Kriminalität und es bestehen Bedenken, was den Schutz persönlicher Daten betrifft. In diesem Artikel werden die Themen Privatsphäre und Schutz persönlicher Daten näher beleuchtet und aktuelle Methoden zum Schutz der Privatsphäre diskutiert.


Introduction
The term "smart grids" is used to describe the next-generation energy systems -digitized systems of systems that are an important enabler for turning from fossil energy sources to renewable energy sources. Smart grids employ state-of-the-art information and communication technology to control generation, distribution and consumption of energy. With smart grids the power network organization moves from a hierarchical to a decentralized structure and communication flow moves from largely uni-directional to bi-directional. Compared to traditional power networks, significantly more data is acquired, transmitted and made available to different stakeholders in fine granularity, sometimes in near real-time.
One of the most important goals of smart grids is the accommodation of environmentally sustainable energy sources. Traditional, non-renewable energy sources can be controlled with a hierarchical network structure, with energy sources and energy sinks at opposing levels of the hierarchy. Many types of renewable energy sources, such as photovoltaics or wind power, generate power at the distribution level. In order to integrate a high number of these renewable sources to produce in bulk quantities, an evolution of the network infrastructure and the intense use of information and communication technologies, even in the distribution grid, become necessary.
A significant portion of (potential) end-users at this point in time are wary about possible disadvantages of the new smart grid technology, like the uncertainty regarding the level of privacy and possible security breaches [1,2]. Apart from privacy and security concerns, end-users are skeptical regarding possible benefits of smart grids as a new technology, such as cost savings [2]. End-users have difficulties understanding the level of control they can exert in a smart grid environment [3].
It is valid to diagnose a severe lack of trust towards smart grids on the end-user level. If this lack of trust were to persist, it would prevent many important features of the smart grid: dynamic energy tariffs [4], distributed energy management, adaptive load balancing, e-mobility, private renewable energy sources and the usage of smart grid infrastructure for other areas, such as home automation or local energy communities.
In order to establish the needed degree of trust in the end-user domain, providing a visible level of both, security and privacy, are imperative. In addition, a further component is necessary, namely: user control. In order to alleviate concerns of a lack in benefits (especially on a personal level) a sufficient degree of understanding of possible interaction is required. Correspondingly, user interaction needs to be reflected accurately on the system side.
Users need to be informed of what choices can be made (user information), how these choices influence smart grid processes (functional transparency), what data items at what granularity need to be disclosed for this purpose (user-managed privacy) and that processes and data transfer are operated in a secure way (traceable security). On all items, systemic feedback to the user needs to be provided consistently in order to raise user awareness and ultimately create a level of user trust that allows meaningful interaction.
trol" was founded. Since 2018 the scope of the research center has been broadened under the new name "Center for Secure Energy Informatics". The goal of the methods developed at the research center is enabling trust in the smart grid end-user domain, by improving existing approaches and developing new mechanisms to establish secure, private and informed -i.e., trusted -user interaction with smart grid technology. The fundamental question addressed is: How can a sustainable level of trust by the users be established the various entities (applications, processes, data protocols, and interfaces) of the smart grid user domain, in order to foster participation and acceptance on the side of the users?
The main focus is on methods that safeguard end-user data, both in terms of preserving end-user privacy in face of potentially curios or malicious insiders and providing security to fend off outside attackers. Apart from privacy and security, general methods for data handling, such as compression, are discussed. Furthermore, the important topic of user control is introduced, i.e., methods that aim at a two-fold benefit: incentivizing users to participate in smart grid optimization schemes and allowing informed interaction with the new technology. Both, of course, under the assumption that appropriate methods for preserving privacy and providing security are in place.
Overall, the research at the center aims at enabling user acceptance of smart grid technologies by providing methods for the important fields of privacy, security and user control. In this paper, we give an overview of selected published research results in the area of smart grid privacy. Note that this paper does not present original research, but is intended as a summary of research results. We will focus on smart meter privacy as a an example for the research performed at the center.

Related work
Overviews of privacy issues and privacy enhancing are given by [5][6][7]. There is a number of contributions that deal with technological approaches to end-user privacy in general, for an overview see [8]. In the context of smart grid privacy, a major part of current proposals is focused on smart metering and the load profiles generated by smart meters.
It has been argued, that approaches relying on policy alone, may prove inadequate to provide a sufficient level privacy and that technological methods that enforce privacy by virtue of "strength of mechanism" need to be employed [9]. Indeed, a number of such technological approaches have been suggested to remedy the (perceived) loss in privacy and still enable smart metering functionality on a broad basis. In the following, we give a brief overview of these contributions, closely based on own work published in [10]. More detailed surveys can be found in [5-7, 9, 11].
The only approach that is widely used in the real world at this point in time, is anonymization or pseudonymization of smart metering data. Consumption data and the personal data are split and stored separately. Methods for de-anonymization are a major threat for these types of approaches. It has been shown that even after anonymization or pseudonymization, data items can still be attributed to the individual that originated them. Jawurek et al. [12] show that de-anonymization can also be done in the smart grid user domain. This structural traceability is a problem for schemes that rely on anonymization or pseudonymization only without the use of additional encryption.
Simple aggregation tries to hide data related to individuals by aggregating over a number of house-holds, e.g., all households in a neighborhood are network (NAN). For example, Bohli et al. [13] propose a privacy scheme in which high resolution smart meter readings are aggregated at NAN level and only the aggregate is sent to the utility. They introduce two solutions both with and with-out involvement of trusted third parties.
Due to the inherent link between load data resolution and privacy, splitting the load data into a variety of different resolutions, each associated with different authorization levels, has been proposed by a number of contributions. For example, the anonymization scheme proposed by Efthymiou and Kalogridis [14] is based on two different resolutions: a low resolution that can be used for billing purposes, and a high resolution that allows further investigation. This scheme employs a trusted third party escrow service. In our own work, we propose wavelet-based multi-resolution privacy (e.g., [15]).
Masking relates to approaches which add numerical artifacts, e.g., random sequences to the original load data to obfuscate individual contributions. The added artifacts are constructed in such a way that they cancel each other out upon aggregation. The aggregator can therefore combine the data values of all participant to create an accurate aggregation, but cannot gain access to individual contribution. For example, Kursawe et al. [16] propose such an aggregation protocol, which compared to other approaches has the advantage of relatively low computational complexity. Defend and Kursawe [17] further improve on this idea. Danezis et al. [18] present another lowoverhead protocol for aggregation of smart meter data, which puts minimal computational demands on the smart meter hardware.
Differential privacy, as Dwork [19, p. 1] puts it, roughly speaking, "ensures that (almost, and quantifiably) no risk is incurred by joining a statistical database". Adding or removing an item from the database will not (or only to a very limited degree) affect the result of statistical computations. This is commonly achieved by the distributed generation of noise which is added to the individual data contribution. Shi et al. [20] propose a scheme for adding random noise to time series data using a symmetric geometric distribution. An advantage of this scheme is that the participants need not trust each other, nor rely on a trusted aggregator. As another example, Ács and Castelluccia [21] obscure individual data sets by adding Laplacian noise, which is jointly generated by the participants. Apart from the obvious drawback that the data is no longer exact after differential privacy is applied, data pollution by malicious participants is another issue with this approach [20].
Secure Signal Processing (SSP) refers to the possibility to perform certain computations, such as aggregation in the encrypted domain. A commonly employed mechanism in SSP is additively homomorphic encryption, which allows some specific manipulations of the ciphertext to be reflected in the plaintext domain. For example, Li et al. [22] propose an overlay network in a tree-like topology and the use of a Paillier cryptosystem. Garcia and Jacobs [23] combine secret sharing with a Paillier cryptosystem to add flexibility in the aggregation (at the expense of additional computational complexity). Erkin and Tsudik [24] extend the idea of homomorphic encryption of smart meter readings by splitting the module into random shares, which, in combination with a modified Pailler cryptosystem, allows flexible spatial and temporal aggregation for different use cases, such as billing or network monitoring. In our own work, we propose the combination of SSP with multi-resolution methods to increase customer privacy choices [25,26].

Results on smart meter privacy from the center of secure energy informatics in Salzburg
The primary goal of the methods proposed at the research center is to improve existing approaches and develop new mechanisms to establish secure, private and informed -i.e., trusted -user interaction with smart grid technology. Results have been published in a number of peer-reviewed journal publications as well as a number of publications in conference proceedings. In the following, the contribution of the publications in the area of smart meter privacy are summarized briefly and put into context to each other.
Smart metering has been in the focus of the discussion on privacy issues in smart grids. This discussion has been led without taking the resolution of the underlying data into account, also in the academic discourse. In two papers [27,28], we have presented a formal analysis on the impact of resolution on smart meter privacy.
In [27], we analyse the impact of resolution on a the first step of non-intrusive load monitoring (NILM), i.e., the identification of individual appliances in a household load profile. We show that decreasing resolution has an impact mainly on recall (rather than on precision) in NILM.
In [28], we extend on the previously presented NILM results and discuss the influence of load profile resolution on the degree of extractable personal information. The intuitive claim that lowering the resolution will increase privacy has been studied systematically. We show that this is indeed the case and that a dyadic series of decreasing resolutions is suitable for providing a series of privacy levels to the end-user. Although this paper was published after our work on wavelet-based smart meter privacy, it provides the formal basis for the utility of this approach. As an aside, we also investigated how well owners of swimming pools could be detected in an energy consumption set in [29].
In terms of methods for privacy enhancement in smart metering, our contributions focus on the idea of lowering the resolution of the available data, providing a wavelet-based data representation, that integrates all resolutions in a single bitstream without data expansion and providing conditional access combined with hierarchical key management.
In [30], the initial idea of wavelet-based data representation of load profiles is proposed. We have shown that the wavelet transform is a suitable tool to provide the aforementioned dyadic decreases in resolution, as each iterative application of the wavelet transform effectively halves the resolution. A first proof-of-concept implementation is presented and evaluated on inexpensive hardware (a Beagleboard embedded computing platform in this case).
The idea of multi-resolution representation is elaborated in [31]. Two wavelet filters are investigated for this purpose: the simple Haar Wavelet and the LeGall 5/3 Wavelet. We show that while the integer-based LeGall 5/3 Wavelet is faster, due to the necessary border handling and the ensuing errors, it is not a possible choice. The Haar Wavelet is well suited for application, as it is also fast in application and provides lossless transformation in practice. Furthermore, we discuss the fact that the original sum is preserved over multiple wavelet decomposition, which is an important property for use-cases such as billing (which also has to work at the lowest resolution). The performance of the approach is evaluated at the example of an extended prototypical implementation. It is shown that the performance these environments offer is sufficient for use in the field.
We show that our approach can be combined with additively homomorphic encryption to provide additional degrees of freedom in [25]. Formal proof is given that wavelets are fully compatible with additively homomorphic encryption in the context of a Paillier cryptosystem [32]. A proof-of-concept implementation is presented and the high computational demands are discussed.
Privacy-aware data representations need to be secured for access by authorized parties. In order to facilitate secured, authorized access, different keys need to be provided for different privacy levels. For example, for multi-resolution load profile data, a different key is required for each resolution. For this setup, in [33] we propose a hierarchical key scheme, based on previous work by Lamport [34]. In order to limit the number of keys a user needs to handle, a hierarchical key generation scheme is used. With this key scheme it is ensured that a user who has the key for the highest resolution can derive all necessary sub-keys to the lower resolutions (which are, of course, needed to reconstruct the highest resolution) In [15], all of the components for wavelet-based multi-resolution data representation in Smart Metering are integrated into system of a whole. The underlying protocols are presented and discussed in detail, as well as the different communication paths from Smart Meter to distribution system operator, energy provider or another third party. The degrees of freedom are addressed, e.g., the alternatives to run the protocols with or without a data concentrator. For the first time, a comprehensive and detailed security and privacy analysis is conducted, including the basic security assumptions, different attacker models and the discussion of privacy properties from an information-theoretic point of view.
Based on the idea of multi-resolution data representation in smart metering, we explore the options to combine the previously proposed wavelet-based representation with other privacy enhancing technologies (PETs) in [26]. The paper discusses the applicability of multi-resolution methods to three PETs: masking protocols, differential privacy and secure aggregation.
We found that in literature, the presentation of PETs for the energy domain is very heterogeneous, especially when it comes to privacy analysis. We therefore teamed up with colleagues from Koç University to make the concept of game-based privacy proofs viable for energy consumption protocols. The results and the formal framework are presented in [35].
Privacy measures try to assess the amount of information contained in data. While the previously discussed papers tried to assess privacy from a model-based view, there were also some ideas to come up with an entropy-like measure for privacy. While exploring options in this direction, we found that the compressibility (which, of course, is related with entropy) of load profiles had not been studied systematically before. Therefore, in [36], an approach for compression of load data is proposed, that is based on ideas for compression of multimedia data. It is shown that the approach is lightweight on the side of the smart meter and that it is resumable, which is an important property if a smart meter loses connectivity for a period of time.
We extend the previous work on load data compression and data granularity in [37], where we explore the effect of data granularity on compression results. For this investigation we joined forces with Martin Ringwelski the main author of the only other algorithm specifically designed for load data compression. We investigate the properties of our algorithm compared to Ringwelski's approach. It turns out that depending on data resolution, one or the other may be preferable.
In discussion with the company partners of the Josef Ressel Research Center, it became evident that there is no clear view on the privacy relevance of different smart metering use cases and how this could be addressed by specific PETs. Such an account is also missing in literature. We address this issue in [38], where we aim at bridging the gap between privacy requirements of smart metering use cases and the features different PETs have to offer.
At the research center, data analytics has mainly been used to identify the leak of personal information from energy consumption data. In a joint project with the colleagues from Energy Institute in Linz, we broadened the scope to investigate dynamic network tariffs and their impact on different households. In [4], different dynamic pricing schemes are discussed and their impact on sociademographic groups is assessed.

Conclusion
In this paper, we gave an overview of the research done at the Center for Secure Energy Informatics at the Salzburg University of Applied Sciences. We mainly focused on our work in the field of privacy enhancing technologies in the energy domains. While investigating privacy issues, we also came across other research questions we found interesting, such as compression of load data or the socioeconomic impact of dynamic network tariffs. The discussion of other research topics addressed by the center in the field of smart grid architecture and dependable system-of-systems engineering were out of the scope of this paper. A complete list of publications can be found at https://www.en-trust.at/publications.