1 Introduction

The smart grid has become a global trend due to its use of information and communication technologies [14]. Smart grids originally had a simple two-level architecture in which data is sent from devices directly to the cloud to be processed and the results are sent back. This would not be possible without the Internet of Things (IoT), which forms the infrastructure of the whole architecture [11, 32]. IoT made it possible for such systems to be maintainable, secure, flexible, and interactive [36]. Unfortunately, this two-level architecture has many problems, such as latency, since all data is sent directly to the cloud, and a lack of control over location, since devices at different locations all pour their data into the same cloud [11]. To overcome these problems, a new three-level architecture emerged. The new level is called the fog, and it is the second level, i.e., the intermediate medium between the devices and the cloud.

The fog is responsible for receiving data from devices, applying data pre-processing, removing unnecessary (or noisy) data, and applying computations on the data based on rules received from the cloud. These rules represent the data in an appropriate form so that it can be stored in a local server inside the fog to provide quick responses for real-time applications [51]. Periodically, data is sent from the local server inside the fog to a larger server managed by the cloud called the Cloud Computing Data Center (C2DC). The data is then removed from the local server so that it is ready to receive new data from the devices. Fog computing has succeeded in reducing the load on the cloud and the amount of data being sent to it [12, 33, 51].

The smart grid made it possible to ensure stability in the electrical network by monitoring the relationship between the supplies and demands of all users at once and tracking the users' behaviors across different hours, days, or sudden events. Thus, the user is an essential part of the network for maintaining its stability and controlling the amount of energy required from the stations [19, 52]. However, this poses a problem, as many of these events or data points are unnecessary and unhelpful and may affect the whole system negatively or lead it to unintended behavior [41, 57]. Fortunately, we can eliminate them inside the fogs before they are sent to the mining algorithms in the cloud. This can be done by using feature selection algorithms.

1.1 Feature selection (FS) methods

Feature selection (FS) decides which features are more relevant and useful and which to drop without affecting the overall performance of the system [1, 54]. In electric power systems, extreme values are represented as bad records that can have an unexpected effect on loads. For example, consider public holidays that occur once or twice a year in winter or summer: the demand for electrical energy will be higher than the normal load at normal times as a result of such an event [58]. Because the event is not repeated, it is not reliable for the learning system. Extreme values must be eliminated from the datasets before applying any further processing, since these extreme values can introduce noise into many data mining algorithms [22].

FS is employed in many tasks in diverse fields including Machine Learning (ML) [34], Pattern Recognition (PR), Image Processing (IP), and multimedia. FS is the process of selecting relevant and informative features so that redundant information can be avoided and ignored [1]. It focuses mainly on selecting a subset of features from the input dataset that can effectively describe it. FS can significantly minimize the detrimental effects of noise and irrelevant characteristics on data [13, 22]. Some of the dependent features may supply no additional information; in other words, the majority of the critical information can be obtained via a few unique features that provide class-discriminative information. As a result, removing dependent features that do not correlate with the classes is essential in some cases.

There are two mainstream ways to categorize FS methods, and the associated taxonomy is illustrated in Table 1: by label information and by search strategy. Concerning the search strategy, FS algorithms can be categorized into three classes: (1) filter, (2) wrapper, and (3) hybrid. Filter methods apply different statistical tests to each feature, rank the features based on the resulting score [22], and select the subset of features as a pre-processing step before classification [62]. Wrapper methods select a set of features, pass them to a classifier to check the accuracy, and repeat the same process with different sets of features until the maximum accuracy is reached [10]; they use a learning algorithm to evaluate subsets of features according to their predictive power and accuracy [18]. A hybrid method is a mixture of filter and wrapper methods: first, features are ranked using filter methods, and then only the top-scoring features are passed to the wrapper [22].

Table 1 Categorization of the feature selections techniques

Moreover, FS methods can be categorized into three types according to the class label information: (1) supervised FS, (2) unsupervised FS, and (3) semi-supervised FS. Supervised FS methods employ labeled data for feature selection and measure the correlation of the features with the class label to determine feature significance. To evaluate feature relevance, semi-supervised FS algorithms leverage label information from labeled data as well as the data distribution structure of both labeled and unlabeled data. Unsupervised FS methods assess feature relevance by the capability of preserving specific attributes of the data [27].

The objective of the current study is to filter the collected data, selecting only the effective features of the data sent from the fogs to the cloud for the subsequent load prediction phase. A good feature selection methodology not only improves the model's prediction efficiency but also speeds up the forecasting process by considering fewer features. The main contribution of this paper is to present a new, effective hybrid feature selection technique named FSBR. Figure 1 represents the framework of the proposed method, which contains two phases. The first phase is the ranking phase: the features go through three different ranking methods, (1) relative weight ranking (RFRW), (2) effectiveness ranking (RFE), and (3) information gain ranking (RIG). Then, the three ranks are passed to a fuzzy inference system to generate the final ranks based on a set of rules. The second phase is the wrapper phase: different sets of top-ranked features are passed through three different classifiers, (1) Naïve Bayes, (2) Support Vector Machine (SVM), and (3) Neural Network (NN), and only one set of features is chosen based on the performance of the three classifiers.

Fig. 1 The framework for the proposed feature selection-based ranking method

The contributions of the current study can be summarized as follows:

  • Proposing a hybrid technique in the feature selection field called FSBR.

  • FSBR consists of two phases: a feature ranking phase and a feature selection phase.

  • The feature ranking phase combines three different ranking methods: relative weight ranking, effectiveness ranking, and information gain ranking.

  • The feature selection phase combines different machine learning classifiers: Naïve Bayes, Support Vector Machine, and Neural Network.

  • Applying the proposed method to two different types of datasets.

  • Comparing the current study with a set of state-of-the-art studies.

1.2 Paper organization

This paper is organized as follows: Section 2 discusses the previous efforts concerning the feature selection strategy. Section 3 presents the proposed user load forecasting strategy. Section 4 shows the experimental results. The conclusions and future work are discussed in Section 5.

2 Literature review

Initially, this section introduces a set of previous efforts in the field of FS generally. Then, it introduces previous efforts concerning smart grid systems and FS methods. Currently, there are many related works on the concept of fog computing that discuss its differences from cloud computing [37] and edge computing technologies, its applications, emerging key technologies (i.e., communication and storage technologies), and the various challenges involved in fog technology [17, 48].

Bellavista et al. [8] presented a survey on fog computing for the IoT, illustrating the architecture of fog and some fog-based applications. Javadzadeh et al. [25] provided a systematic survey with a different analytical evaluation of fog computing applications in smart cities. The authors of [22] presented a differential evolution approach that incorporates filter and wrapper methods into an enhanced local-knowledge computational search process based on fuzziness principles to cope with both continuous and discrete datasets. Another study by Mafarja et al. proposed an approach to solve FS problems using two incremental hill-climbing techniques (i.e., quick reduct and CEBARKCC) hybridized with the binary ant lion optimizer in a model called HBALO [29]. Sayed et al. [44] suggested a metaheuristic optimizer, namely a chaotic crow search algorithm, to find an optimal feature subset that maximizes the classification performance and minimizes the number of selected features.

Mafarja et al. [30] presented a binary grasshopper optimization algorithm for FS problems, in which binary variants were recommended and used to select the best feature subset within a wrapper-based system for classification purposes. Cilia et al. [9] presented a ranking-based approach to FS for handwritten character recognition. Zhu et al. [63] suggested a supervised FS algorithm that simultaneously preserves the local structure (through adaptive structure learning in a low-dimensional feature space of the original data) and the global structure (through a low-rank constraint) of the data. Bassine et al. [7] proposed an improved Arabic text classification system that uses Chi-square FS, called ImpCHI, to improve classification efficiency. Verma et al. [53] applied a new hybrid approach using three FS techniques, Chi-square, Information Gain, and Principal Component Analysis, and then merged them to select the best available subset of the collected data for skin disease [60].

Ahmed et al. [3] suggested a supervised machine learning-based approach to detect covert cyber deception assaults in state estimation, with genetic algorithm-based feature selection to improve detection accuracy. Hafeez et al. [20] provided two FS modules, random forest and Relief-F, which were merged to create a hybrid FS algorithm. Ahmad et al. [2] presented an artificial neural network-based day-ahead load forecasting model for smart grids, which is made up of three modules: the data preparation module, the FS module, and the forecast module; the data preparation module makes the historical load curve compatible with the FS module to predict the future load based on the selected features. Niu et al. [35] developed a practical machine learning model based on FS using a binary-valued cooperation search algorithm and parameter optimization for short-term load prediction with a support vector machine. The related studies are summarized in Table 2 according to the publication year in ascending order.

Table 2 Related Studies Summarization

Based on what was mentioned in this section, some studies work only with filter methods, while others work only with wrapper methods. In addition, several hybrid approaches to feature selection have been proposed, but they have limitations such as long running times and high computational complexity. In our proposed method, we combine the filter and wrapper approaches, which gives higher accuracy when implemented and compared to other methods, and we use equations that are easy to implement.

3 The suggested feature selection strategy

An efficient forecasting model makes good use of the electric load data with all of its characteristics and also reduces its dimensionality. Load forecasting is classified into three types based on the time interval. The first is short-term load forecasting, which forecasts the load for a period of 24 hours to one week; it has made great progress recently thanks to the large amount of data collected from smart meters [40]. The second is medium-term load forecasting, which anticipates the load for a week to a year. The third is long-term load forecasting, which forecasts the load for a period of one year to more than two years. With the presence of the fog and the development of the centralized computing topology, we can train the load forecasting models and forecast the workloads of distributed smart meters so that consumers' raw data is handled locally and then forwarded to a central cloud [46].

The suggested forecasting model [59] is built on an architecture that consists of three tiers. The first tier contains the IoT devices such as sensors, smart meters, monitoring systems, wireless communication devices, and demand response [6, 26]. The data is collected and sent to the fog computing layer, which is the second tier. The fog is responsible for taking data from devices such as smart meters. In the fog layer, two operations are carried out: (1) a pre-processing layer and (2) a short-term prediction model layer, as shown in Fig. 2. The third tier is the cloud, which contains a set of integrated data that can be used for long-term prediction. Thus, we can obtain an improved electrical grid for accurate load prediction [38].

Fig. 2 User load forecasting strategy

In this paper, we focus on the first layer inside the fog: the current study works in the data pre-processing layer using the suggested FSBR approach. In this section, we go through the proposed feature selector in detail. It consists of two phases. First, the feature ranking phase (filter methods) uses the proposed methods [relative weight ranking, effectiveness ranking, and information gain ranking]. Second, the feature selection phase (wrapper method) uses different machine learning classifiers [Naive Bayes, Support Vector Machine, and neural network]. As shown in Fig. 3, data is collected from smart devices and sent to the fog to start the execution of the feature selection algorithm.

Fig. 3 The proposed feature selection phases

3.1 Feature ranking phase (FRP)

FSBR starts with the ranking phase, which uses three different filter methods: (1) Feature Relative Weight Ranking (RFRW), (2) Feature Effectiveness Ranking (RFE), and (3) Information Gain (IG) [49]. Each of these methods produces a different ranking of the priority and importance of each feature; therefore, to obtain a single ranking per feature, a fuzzy inference system is used. The fuzzy system takes these ranks as input and produces a final ranking from the three methods that orders the features from best to worst, so it can be used in the second phase. The feature selection phase determines the best set of top-ranked features: different sets of top-ranked features are formed and passed to a classifier, and based on the classifier's results, the best set is selected. This is represented graphically in Fig. 3.

3.1.1 Feature relative weight ranking (RFRW)

The first ranking method depends on the feature's impact on the output, where the number of unique states in each feature affects the priority of the feature and the strength of its bond with the output. The bond between an input feature and the output is a true metric of the importance of each feature: the stronger the bond, the greater the feature's impact on the output. To achieve this, we follow the assumption that each data column, including input and output, consists of a finite number of unique states. Assume a feature Fi consists of n unique states S1, S2, …, Sn, and the output consists of m unique states O1, O2, …, Om, where n and m are finite numbers. S1 is repeated k times inside the feature's column, which means the corresponding output could be any of the output's states, and the same applies to the other states. To measure the bond's strength, we calculate the probability of the different output states occurring within S1 on its own, and then for S2, S3, and so on. The formulas used for this method are presented in Eqs. (1) and (2).

$$ \mathrm{FRW}\left({F}_i,{O}_j\right)=P\left({S}_1,{O}_j\right)+P\left({S}_2,{O}_j\right)+\dots +P\left({S}_n,{O}_j\right) $$
(1)
$$ {\mathrm{R}}_{\mathrm{FRW}}\left({F}_i\right)=\left|\mathrm{diff}\ \left(\mathrm{FRW}\left({F}_i,{O}_1\right),\mathrm{FRW}\left({F}_i,{O}_2\right),\dots, \mathrm{FRW}\left({F}_i,{O}_m\right)\right)\right|\times n $$
(2)

where i is the feature's index, j is the output state's index, P(Sn, Oj) is the probability of output Oj occurring when the input's state is Sn, and RFRW(Fi) is the feature relative weight ranking of feature Fi.

The RFRW calculation is presented in Algorithm 1 (Fig. 4). To simplify the illustration of this method, a sample dataset (Table 3) of 15 items (6 features and 1 output) is used for the explanation, where T1 indicates (Saturday, Sunday, Monday), T2 indicates (Tuesday, Wednesday, Thursday), and T3 indicates (Friday). The employed states of the features are presented in Table 4, and Table 5 illustrates how to execute the RFRW method on this sample.

Fig. 4 Algorithm 1: Feature ranking using the RFRW algorithm

Table 3 A Sample Dataset with 15 Records
Table 4 The Employed States of the Features of the Sample Dataset
Table 5 Illustration of the Relative Weight Ranking Method on an Example
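To make the calculation in Eqs. (1) and (2) concrete, the following is a minimal Python sketch of the RFRW ranking. It assumes the data sits in a pandas DataFrame with hypothetical column names, reads P(S_k, O_j) as the joint probability estimated from counts, and interprets diff() as the spread (maximum minus minimum) of the per-output FRW values; these reading choices, and the toy data, are our assumptions rather than the paper's exact implementation.

```python
import pandas as pd

def rank_frw(df: pd.DataFrame, feature: str, output: str) -> float:
    """Feature relative weight ranking (Eqs. 1-2) for a single feature."""
    n_states = df[feature].nunique()   # n: number of unique states of the feature
    total = len(df)
    frw_per_output = []
    for o in df[output].unique():      # one FRW value per output state O_j
        # Eq. (1): sum of joint probabilities P(S_k, O_j) over all feature states
        frw = sum(((df[feature] == s) & (df[output] == o)).sum() / total
                  for s in df[feature].unique())
        frw_per_output.append(frw)
    # Eq. (2): |diff(...)| * n, with diff taken here as the max-min spread
    return abs(max(frw_per_output) - min(frw_per_output)) * n_states

# Toy usage on data shaped like Table 3 (hypothetical column names and values)
if __name__ == "__main__":
    data = pd.DataFrame({"Season": ["W", "W", "S", "S"],
                         "Load":   ["High", "Low", "Low", "Low"]})
    print(rank_frw(data, "Season", "Load"))   # 2 * |0.25 - 0.75| = 1.0
```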

3.1.2 Feature effectiveness ranking

The second ranking method is derived from the popular Naïve Bayes classifier, presented in Eq. (3).

$$ P\left(O \mid F\right)=\frac{P\left(F \mid O\right)\times P(O)}{P(F)} $$
(3)

Here, P(O | F) is the probability of target O given attribute F, P(F | O) is the conditional probability of F given O, P(O) is the probability of the class, and P(F) is the probability of the attribute. Naïve Bayes is used as a classifier by giving attributes as input and getting a probability as an output: the higher the probability, the higher the chance of the output occurring. We derive a new ranking technique from this probability based on the different states of each feature, as mentioned in the first ranking method.

The newly proposed ranking technique inherits the same formula, but instead of applying it once, using only one state of each feature to get the probability of the output, we apply it to each feature's state to get the probability of each output's state, as presented in Eq. (4).

$$ P\left({O}_j \mid {F}_i\right)=\sum_{k=1}^{n} P\left({O}_j \mid {S}_k\right) $$
(4)

where P(Oj | Fi) is the probability of getting output state Oj given feature Fi, P(Oj | Sk) is the probability of getting output state Oj given the feature's state Sk, n is the number of feature states, and Oj is the required output state. Then the following formula is used to get a final rank that represents the whole feature's column, as presented in Eq. (5).

$$ {R}_{FE}\left({F}_i\right)=\mathrm{diff}\left(P\left({O}_1 \mid {F}_i\right),P\left({O}_2 \mid {F}_i\right),\dots, P\left({O}_m \mid {F}_i\right)\right) $$
(5)

where RFE(Fi) is the feature effectiveness of the feature's column Fi, and P(O1 | Fi) is the probability of getting output state O1 given feature Fi. The feature effectiveness ranking is illustrated in Algorithm 2 (Fig. 5). To simplify the previous formulas and the ranking technique, the dataset shown in Table 3 is used again, as shown in Table 6.

Fig. 5 Algorithm 2: Feature ranking using the RFE algorithm

Table 6 The Followed Steps of Rank Feature Effectiveness
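Under the same assumptions as the RFRW sketch (pandas input, hypothetical column names, and diff() read as the max-minus-min spread), a minimal sketch of the feature effectiveness ranking of Eqs. (4) and (5) could look as follows.

```python
import pandas as pd

def rank_fe(df: pd.DataFrame, feature: str, output: str) -> float:
    """Feature effectiveness ranking (Eqs. 4-5) for a single feature."""
    scores = []
    for o in df[output].unique():
        # Eq. (4): sum of conditional probabilities P(O_j | S_k) over all states S_k
        p_o_given_f = sum(
            ((df[feature] == s) & (df[output] == o)).sum() / (df[feature] == s).sum()
            for s in df[feature].unique()
        )
        scores.append(p_o_given_f)
    # Eq. (5): diff over the per-output scores, taken here as max - min
    return max(scores) - min(scores)
```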

3.1.3 Information gain

Information gain [15] is used as the third ranking method alongside the previous two; it measures the mutual dependence between two variables, such as the dependence between an input feature and the output. In our context, the higher the information gain, the stronger the relation between the input feature and the output. Hence, we aim to find which features have the highest values with respect to the output. Equation (6) shows the formula used to calculate IG.

$$ IG\left(y,X\right)=E(y)-E\left(X \mid y\right) $$
(6)

where E(y) is the target’s entropy and E (X | y) is the measure of the entropy of target y given variable X. The term Entropy represents the measure of uncertainty or disorder, which has the formula shown in Eq. 7.

$$ E(y)=-\sum \limits_{i=1}^mP\left({y}_i\right)\times {Log}_2\left(P\left({y}_i\right)\right) $$
(7)

where m is the number of target’s classes, P(yi) is the probability of class yi, and Log2(P(yi)) is the logarithmic value of the class’s probability. E (X | y) can be calculated from:

$$ E\left(X \mid y\right)=-\sum_{i=1}^{m}P\left({y}_i\right)\times \sum_{j=1}^{n}P\left({X}_j \mid {y}_i\right)\times {\log}_2\left(P\left({X}_j \mid {y}_i\right)\right) $$
(8)

To measure a feature's IG, we first measure the entropy of the output over the whole dataset. Then, we measure the entropy of the output given the input feature. Finally, we subtract the two values to get the final IG value. Table 7 shows the IG values for each feature of the dataset presented in Table 3. Thus, we have obtained the third and final rank of the proposed types of ranks.

Table 7 Third Rank Values using the IG Method
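A minimal sketch of the information gain computation of Eqs. (6)–(8) is given below; entropies are estimated from empirical frequencies of pandas Series, and the conditional term is computed as the standard weighted conditional entropy of the output given the feature, which is our reading of Eq. (8).

```python
import numpy as np
import pandas as pd

def entropy(y: pd.Series) -> float:
    """Eq. (7): entropy of the output from empirical class frequencies."""
    p = y.value_counts(normalize=True).values
    return float(-(p * np.log2(p)).sum())

def conditional_entropy(x: pd.Series, y: pd.Series) -> float:
    """Weighted entropy of the output y within each state of the feature x."""
    return sum((len(g) / len(y)) * entropy(g) for _, g in y.groupby(x))

def information_gain(x: pd.Series, y: pd.Series) -> float:
    """Eq. (6): IG = entropy of the output minus the conditional entropy."""
    return entropy(y) - conditional_entropy(x, y)
```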

3.2 Fuzzy inference engine

The previous ranking methods lead to three different ranking tables; hence, it is necessary to merge them into a single one. A fuzzy inference system is the best among the alternatives for doing this. Fuzzy inference is based only on simple IF-THEN rules combined with fuzzy logic operators (e.g., AND and OR) to enhance decision-making in a way that is similar to human reasoning. Fuzzy inference is the process of mapping an input to an output using fuzzy logic [43]. The process goes as follows: (1) the crisp input values are converted into fuzzy quantities, (2) these pass through the fuzzy rules and fuzzy memberships to generate an output, and (3) the output is in the form of a fuzzy set that needs to be defuzzified to obtain crisp output values once again.

The input to the fuzzy system is the crisp values shown in Table 8. They need to be converted into fuzzy sets using the memberships shown in Fig. 6. To generate the corresponding memberships for the three ranking methods, we need to determine the α, β, and γ values as presented in Eqs. 9 to 11 [43].

Table 8 Crisp Input Values of the Fuzzy Inference Engine
Fig. 6 The membership function

$$ \alpha =\frac{0.5\times \sum_{x=1}^{n}{(\mathrm{value})}_x}{n} $$
(9)
$$ \beta =2\alpha $$
(10)
$$ \gamma =3\alpha $$
(11)

After calculating the three values of α, β, and γ, each value from each ranking is converted into a fuzzy input that can be small (S), medium (M), or large (L) using Eqs. 12 to 14.

$$ \mu {(x)}_{small}=\begin{cases}1 & x\le \alpha \\ \frac{\beta -x}{\beta -\alpha } & \alpha <x\le \beta \\ 0 & x>\beta \end{cases} $$
(12)
$$ \mu {(x)}_{medium}=\begin{cases}0 & x\le \alpha \\ \frac{x-\alpha }{\beta -\alpha } & \alpha <x\le \beta \\ \frac{\gamma -x}{\gamma -\beta } & \beta <x\le \gamma \\ 0 & x>\gamma \end{cases} $$
(13)
$$ \mu {(x)}_{large}=\begin{cases}0 & x\le \beta \\ \frac{x-\beta }{\gamma -\beta } & \beta <x\le \gamma \\ 1 & x>\gamma \end{cases} $$
(14)

where n is the number of input values for the method and (value)_x is the rank of feature x under that particular method.
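The fuzzification step of Eqs. (9)–(14) can be sketched in a few lines of Python; the function below computes α, β, and γ from the rank values produced by one method and maps each crisp rank to small/medium/large membership degrees. The function and variable names are illustrative only.

```python
def fuzzify(values):
    """Map the crisp rank values of one method to S/M/L membership degrees."""
    alpha = 0.5 * sum(values) / len(values)   # Eq. (9)
    beta, gamma = 2 * alpha, 3 * alpha        # Eqs. (10)-(11)

    def mu_small(x):                          # Eq. (12)
        if x <= alpha:
            return 1.0
        return (beta - x) / (beta - alpha) if x <= beta else 0.0

    def mu_medium(x):                         # Eq. (13)
        if x <= alpha or x > gamma:
            return 0.0
        if x <= beta:
            return (x - alpha) / (beta - alpha)
        return (gamma - x) / (gamma - beta)

    def mu_large(x):                          # Eq. (14)
        if x <= beta:
            return 0.0
        return (x - beta) / (gamma - beta) if x <= gamma else 1.0

    return [{"S": mu_small(x), "M": mu_medium(x), "L": mu_large(x)} for x in values]
```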

Consequently, the input to the fuzzy rules is the output of the fuzzification process. A set of rules of the form "if (X is A) AND (Y is B) THEN (Z is C)" is considered here, where X, Y, and Z represent the variables (the input ranks RFRW, RFE, and RIG and the output rank) and A, B, and C represent the corresponding linguistic values (e.g., small, medium, and large). The first part of the rule (before THEN) is called the "antecedent"; the second part (after THEN) is called the "consequent". The input fuzzy sets go through these if-then rules to determine the output. In this paper, 27 different rules are used to determine the output, as shown in Table 9. The output then goes through a defuzzification process to recover crisp values representing the final ranking.

Table 9 The Used Fuzzy Rules

Defuzzification can be performed using different methods such as max-min, max criterion, center of gravity (COG), and mean of maxima [5, 39, 43]. The max-min method chooses a min operator for the conjunction in the premise of the rule as well as for the implication function, and a max aggregation operator [5]. Considering a simple case of two elements of evidence per rule, the corresponding rules will be:

(15)

This yields:

$$ {\mu}_M=\max \left(\min \left({\mu}_{k_{11}},{\mu}_{k_{12}}\right),\min \left({\mu}_{k_{21}},{\mu}_{k_{22}}\right),\dots, \min \left({\mu}_{k_{N1}},{\mu}_{k_{N2}}\right)\right) $$
(16)

COG is the most popular method [5] and is the one used in the current study. It is identical to the formula for calculating the center of gravity in physics: in our case, the crisp output is the weighted average of the output membership function, i.e., the COG of the area under it. Defuzzification can be accomplished by the output membership function shown in step 3 of Table 10, assuming α = 3, β = 6, and γ = 9 according to Eqs. 9 to 11. Algorithm 3 (Fig. 7) presents the final ranking using the fuzzy inference engine, which produces the results of the implemented fuzzy rank (R) method. An example is illustrated in Table 10.

Table 10 The Followed Steps of the Fuzzy Rank Method
Fig. 7 Algorithm 3: Final ranking using the fuzzy inference engine
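As an illustration of the max-min aggregation of Eq. (16) followed by COG defuzzification, the sketch below clips each rule's output set by its firing strength, aggregates with max, and returns the weighted average over the output universe. The discretized universe and the triangular output sets (built around α = 3, β = 6, γ = 9 from the worked example) are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np

def cog_defuzzify(rule_strengths, output_sets, universe):
    """Max-min aggregation (Eq. 16) followed by center-of-gravity defuzzification."""
    # Clip each rule's consequent set by its firing strength, then aggregate with max
    aggregated = np.max([np.minimum(s, out) for s, out in zip(rule_strengths, output_sets)],
                        axis=0)
    # COG: weighted average of the output universe by the aggregated membership
    return float((universe * aggregated).sum() / aggregated.sum())

# Illustrative usage: two fired rules over a 0..9 output universe
universe = np.linspace(0.0, 9.0, 91)
small = np.clip((6.0 - universe) / 3.0, 0.0, 1.0)   # illustrative "small" output set
large = np.clip((universe - 6.0) / 3.0, 0.0, 1.0)   # illustrative "large" output set
print(cog_defuzzify([0.4, 0.7], [small, large], universe))
```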

3.3 Feature selection phase (FSP)

The result of the previous phase is a ranking of all features from the most effective to the least effective. This is useful because we can drop the least effective ones to reduce computation and complexity, ending up with the most effective features; however, there could still be many of them, and for big systems they could number in the hundreds or thousands. Therefore, the question is: what is the smallest set of the most effective features that is enough to operate as if we had used every feature in the feature set? This question is answered in this section. The wrapper phase is a trial-and-error phase in which we use different classifiers with different numbers of top-ranked features in order to end up with the smallest number of features that can do the same job as the whole feature set.

In this work, we use three different classifiers: (1) a simple Neural Network, (2) Naïve Bayes, and (3) Support Vector Machine. First, we determine the accuracy of each classifier so we can compare the results across the different combinations of top-ranked features. Then, we determine the average accuracy for each combination, and only one set of features is chosen based on the highest average accuracy. As an illustration, the dataset described in Table 3 is divided into 15 items for training and 10 items for testing.

First, we test the top-ranked sets. The top-ranked features for this sample are [Season, Weather, Time, Weekday, Holidays, and Events]. Six different sets were made, Top 1 to Top 6, where Top 1 contains only the first top-ranked feature (Season) and Top 5 contains the top five ranked features. Results are shown in Table 11. Nevertheless, for the sake of simplicity, fewer computations, and faster decision-making, Top 3 is preferred over Top 5, as only three features can do the same job as the whole feature set.

Table 11 The Accuracy Results
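A minimal sketch of this wrapper phase with scikit-learn (one of the packages listed in Section 4) is shown below: incrementally larger top-k feature sets are evaluated with the three classifiers, and the set with the best average accuracy is kept. The ranked feature list and the train/test frames are placeholders, and the classifier hyperparameters are assumptions rather than the settings used in the experiments.

```python
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

def select_top_k(ranked_features, X_train, y_train, X_test, y_test):
    """Return the top-k feature set with the best average accuracy over NN, NB, and SVM."""
    classifiers = [MLPClassifier(max_iter=1000), GaussianNB(), SVC()]
    best_k, best_avg = 1, 0.0
    for k in range(1, len(ranked_features) + 1):
        cols = ranked_features[:k]                 # candidate Top-k set
        accuracies = []
        for clf in classifiers:
            clf.fit(X_train[cols], y_train)
            accuracies.append(accuracy_score(y_test, clf.predict(X_test[cols])))
        avg = sum(accuracies) / len(accuracies)    # average over the three classifiers
        if avg > best_avg:
            best_k, best_avg = k, avg
    return ranked_features[:best_k], best_avg
```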

As a summary of our work so far, the first phase, the feature ranking phase (FRP), applies the filter feature selection methods; the final feature ranking is the output of the fuzzy method, and illustrative calculations were performed on a sample dataset (Table 3). The second stage, the feature selection phase (FSP), applies a wrapper methodology to select the best features from the ranking according to three classifiers: (1) neural network, (2) naive Bayes, and (3) support vector machine. Based on the fuzzy output, the order of features is [Season, Weather, Time, Holidays, Events, Weekday]. First, we test each classifier on the data multiple times: the first experiment calculates the accuracy using only the first feature, TOP1 [Season], then the accuracy is calculated after adding the following feature, TOP2 [Season, Weather], and so on. We continue to add features gradually, obtaining six results for each classifier. With the values shown in Table 11, the average accuracy of TOP3 is 89.66% and that of TOP5 is 88.33%; therefore, the TOP3 features are the best choice for the sake of simplicity.

4 Evaluation and results

The current section shows the experiments, the reported results, and the corresponding discussions. The experiments are performed on Windows 10 using the Python programming language. The packages used are NumPy, Pandas, Keras, and Scikit-Learn. The environment has an Intel Core i7 processor with 6 GB of RAM.

4.1 Datasets

The current study uses two datasets. The first is the EUNITE dataset while the second is the USPS dataset.

4.1.1 EUNITE dataset

The European Network on Intelligent Technologies (EUNITE) dataset [15] contains the electrical loads of the Eastern Slovakian Electricity Corporation during the period between January 1, 1997, and December 31, 1998. It is in the form of four columns that contain daily information: (1) date, (2) temperature, (3) holiday, and (4) load. The holiday column indicates whether the corresponding day is an annual holiday (e.g., Christmas or Easter). The first three columns are the features while the last one (load) is the desired output, but for a complete test and representation of our proposed method, we added three more columns based on the day's date: (1) weekday, (2) event, and (3) season. The event column indicates whether any occasional event happened on that day. The resulting modified dataset consists of 730 samples and seven columns (i.e., six feature columns and one output column). Samples from the modified dataset are shown in Table 12.

Table 12 Samples from the Modified EUNITE Dataset

The data in its current form cannot be used, neither for the filter phase nor for the selection phase, because each column has a different format and the numerical columns (temperature and load) have wide ranges. Hence, it is necessary to go through a preprocessing stage to reform the data into an appropriate numerical form. This stage converts the date column into a number that represents the day of the week. It also converts the holiday, weekday, and event columns into 1 or 0, where 1 means Yes and 0 means No. Finally, it normalizes the temperature and load columns: the temperature range is set to [−1, 1] and the load range to [0, 1]. Table 13 shows the results after the preprocessing stage.

Table 13 Samples from the Preprocessed Modified EUNITE Dataset
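A minimal sketch of the described preprocessing, assuming the modified EUNITE data is loaded into a pandas DataFrame with hypothetical column names, could look as follows: the date is mapped to a day-of-week number, the Yes/No columns are encoded as 1/0, temperature is scaled to [−1, 1], and load to [0, 1].

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

def preprocess_eunite(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["Day"] = pd.to_datetime(out["Date"]).dt.dayofweek + 1   # date -> day-of-week number
    for col in ["Holiday", "Weekday", "Event"]:                 # Yes/No -> 1/0
        out[col] = out[col].map({"Yes": 1, "No": 0})
    out["Temperature"] = MinMaxScaler((-1, 1)).fit_transform(out[["Temperature"]]).ravel()
    out["Load"] = MinMaxScaler((0, 1)).fit_transform(out[["Load"]]).ravel()
    return out.drop(columns=["Date"])
```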

4.1.2 USPS dataset

The handwritten digits USPS dataset [23] was created by the United States Postal Service by scanning envelopes automatically. It contains 9298 images divided into 7291 images for training and 2007 images for testing, roughly an 80%:20% split. There are 256 features. The images have been deslanted and size-normalized, resulting in 16 × 16 grayscale images.

4.2 Performance metrics

The used performance metrics are (1) accuracy, (2) precision, (3) recall, and (4) F1-score. They, and their equations, are represented in Table 14.

Table 14 The used Performance Metrics

where TP (True Positive) is the number of samples classified positive correctly, TN (True Negative) is the number of samples classified negative correctly, FP (False Positive) is the number of samples wrongly classified positive, and FN (False Negative) is the number of samples wrongly classified negative.
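For completeness, the four metrics of Table 14 can be computed directly with scikit-learn; the sketch below assumes y_true and y_pred are the label arrays produced by one of the classifiers above, and macro averaging is an illustrative choice for the multi-class case.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def report(y_true, y_pred):
    """Compute the four metrics of Table 14 for one classifier's predictions."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall": recall_score(y_true, y_pred, average="macro"),
        "f1-score": f1_score(y_true, y_pred, average="macro"),
    }
```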

4.3 Evaluation of EUNITE

The current subsection shows a comparison between the proposed method and other filter methods: (1) Chi-square [7], (2) mutual information [7], (3) feature importance [56], (4) ACC2 [42], and (5) AACC2 [42]. The results are shown in Table 15.

Table 15 Filter phase results using different methods

Table 15 shows the top-ranked features up to top-5. As shown, the top-1 feature is the same for all methods, whereas the other top-k sets differ from one method to another. To prove that the proposed method ranks the features accurately, each method is tested separately during the second phase. Figures 8, 9, and 10 show the accuracy for each method using the Neural Network, Naïve Bayes, and Support Vector Machine classifiers, respectively. The results of this comparison are shown in Table 16. Using all features, the NN classifier gives the highest accuracy (0.97), compared to 0.79 for NB and 0.92 for SVM. For Top 3, NN again gives the highest accuracy (0.97), compared to 0.82 for NB and 0.95 for SVM, and for Top 4, NN gives 0.97, NB 0.81, and SVM 0.95. These values are close to those of Top 5, which gives the highest value when using all the features, and our proposed method achieves the highest values. To take advantage of the different classifiers in each method and obtain the best values from merging them, we calculate the average accuracy of each Top for the proposed filter method and the other feature selection techniques; Fig. 11 illustrates the average result for each Top.

Fig. 8 Accuracy for different methods using the Neural Network classifier

Fig. 9 Accuracy for different methods using the Naïve Bayes classifier

Fig. 10 Accuracy for different methods using the SVM classifier

Table 16 The result of the FSBR and other methods with different classifiers on the EUNITE dataset
Fig. 11 The average accuracy for the different methods

From the previous figures, we can conclude that the top 3, top 4, and top 5 ranked features are the best to use, but for the sake of simplicity and less time and hardware computation, the top 3 is chosen to continue our comparisons between the proposed filter method and the other filters. The implementation of these different feature selection techniques provides different subsets of selected features. However, the Chi-square, mutual information, and feature importance techniques provide the same feature subset; for this reason, they have the same Accuracy, Precision, Recall, and F1-score values based on the average classifier. On the other hand, the other techniques (Most correlated features, ACC2, AACC2, and FSBR) give different results for these measurements, as shown in Table 17.

Table 17 The performance metrics of FSBR compared to other methods on the EUNITE dataset

According to the reported results and Figs. 12, 13, 14, and 15, the proposed feature selection method, FSBR, achieves the best Accuracy, Precision, Recall, and F1-score. Thus, FSBR proves to be the most efficient methodology. On the other hand, the Most correlated features method shows the worst performance in terms of Accuracy, Precision, Recall, and F1-score. For Most correlated features and FSBR, the error reaches 22% and 9%, respectively. FSBR outperforms Most correlated features because it is based on a hybrid technique that combines filter and wrapper methods. First, the hybrid technique is composed of three filter methods (FRW, FE, and IG) that cooperate, through the fuzzy method, to provide the best rank of features. Then, NN, NB, and SVM are combined and applied as a wrapper method to determine exactly the most significant subset of features. In contrast, Most correlated features is based only on a filter selection method, so it provides a less accurate subset of features.

Fig. 12 Accuracy for the Top 3 ranked features for different filter methods

Fig. 13 Precision for the Top 3 ranked features for different filter methods

Fig. 14 Recall for the Top 3 ranked features for different filter methods

Fig. 15 F1-score for the Top 3 ranked features for different filter methods

Finally, FSBR outperforms the other techniques in selecting the best subset of features, improving the performance of the classifier or the prediction model while using only three features.

4.4 Generalization using USPS

The USPS dataset is used to prove the applicability of our proposal on a big dataset. Table 18 shows a comparison between our proposed method for feature ranking and selection and the different feature selection algorithms in [31]. The comparison is made as follows: all features go through each feature selection algorithm, including the proposed one; the top-ranked features are then used to train a Neural Network; finally, the test results are used to decide which method is best. Figure 16 illustrates the resulting accuracy, precision, and recall of the proposed method and the different feature selection algorithms in Table 18. From these results, our proposed method reports the best performance.

Table 18 The performance metrics of FSBR compared to other methods on the USPS dataset
Fig. 16 The resulting accuracy, precision, and recall

As shown in Table 18, the smallest number of features needed to reach the maximum accuracy was 10; the number is the same for each method, but each method chooses its own features based on their ranks. The proposed method succeeded in achieving the highest accuracy compared to 12 of the methods and is very close to the others. It also obtained the highest precision among all methods, and its recall was better than that of 10 methods.

5 Conclusions and future work

As a recap, the proposed method is a feature selection method based on filter and wrapper techniques. The filter phase consists of three different filters: relative weight ranking, effectiveness ranking, and information gain ranking. In the wrapper phase, we used three different classifiers, Neural Network, Naïve Bayes, and Support Vector Machine, to select the smallest set of top-ranked features (from the previous phase) without affecting the performance. Hence, our main contribution is to improve the smart electrical grid by optimizing the data being sent to the fog and the cloud: only important data is selected while repetitive and irrelevant data is dropped, maintaining the performance of the system. We have therefore proposed a new feature selection method that successfully chooses only important data, and we proved its correctness by applying it to two different datasets: EUNITE, which is related to the electrical field, and USPS, which is not related to the electrical field but consists of images. In both cases, the results were satisfying enough to place our proposed method in a comparison with other methods. Experimental results have shown that the proposed feature selection technique provides more accurate results than the existing methods in terms of accuracy, precision, recall, and F1-score. FSBR provides accuracy, precision, recall, and F1-score values reaching 91%, 90%, 91%, and 90%, respectively, on EUNITE, while 90%, 93%, and 88% were the best accuracy, precision, and recall, respectively, on the USPS dataset. In future work, we will work on the second layer, the user load forecasting strategy, and on the possibility of saving electricity consumption by predicting the load per user in the short term and the long term.