1 Introduction

A microgrid (MG) is a localized power system of hybrid renewable power sources that can operate either independently or in combination with the main grid. Faults or abnormalities in a microgrid can disrupt the power supply and degrade the stability and reliability of the system [1]. Timely detection and classification of faults allow rapid response and corrective action to maintain grid stability. Faults within a microgrid can also damage electrical equipment such as generators, inverters, transformers, and batteries [2]. By detecting and classifying faults promptly, appropriate actions can be taken to protect the equipment from further damage, minimize downtime, and avoid costly repairs or replacements [3]. In addition, faults in a microgrid can result in inefficient energy utilization. For example, voltage sags or imbalances can lead to suboptimal performance of electrical devices, reducing energy efficiency [4]. Some types of faults also pose safety hazards; for instance, equipment failures or electrical faults may lead to electric shocks or fires. By detecting and classifying faults, safety protocols and emergency response mechanisms can be activated promptly to mitigate potential risks and protect individuals in the vicinity. Fault detection and classification also provide valuable insight into the health and performance of the microgrid. By monitoring and analyzing fault occurrences, patterns, and trends, maintenance activities can be scheduled proactively, allowing efficient resource allocation, minimizing downtime, and reducing maintenance costs. A well-functioning microgrid requires effective control and management [5].

Fault detection and classification can contribute to the optimization of system operation by identifying deviations from normal behavior, enabling timely actions to restore the system to its desired state, and facilitating efficient load balancing and fault isolation. An in-depth analysis of fault data can help identify the root causes of faults and aid in improving system design, component selection, and overall system performance. By understanding the types and frequencies of faults, system designers and operators can implement preventive measures and enhance the resilience of the microgrid. Overall, fault detection and classification play a crucial role in maintaining the stability, reliability, safety, and efficiency of microgrid systems [6]. By leveraging machine learning algorithms, it becomes possible to automate the process and enable real-time monitoring and response, leading to improved performance and optimized operation [7].

In recent years, machine learning algorithms such as deep learning and reinforcement learning have played a key role in enhancing the reliability, safety, and performance of microgrid systems [8, 9]. In [10], the authors developed a DC microgrid to model the DC system under both normal and transient conditions, analyzed the fault profile, and created training sets for artificial neural networks (ANN); several fault types with various fault resistances and fault locations were explored in the test system. In [11], the authors used deep artificial neural networks to tackle one of the most significant issues in process systems engineering: fault detection and classification. In [12], deep neural networks (DNN), a class of data-driven machine learning algorithms, are combined with discrete wavelet transforms (DWT) to create an intelligent fault detection scheme. The authors in [13] trained a fault identification model using the modified K-means algorithm, the FP-growth algorithm, and the mini-batch gradient descent (MBGD) algorithm, all of which are based on machine learning theory. Moreover, the authors in [14] used an artificial intelligence-based radial basis fault classifier for detecting faults in a microgrid.

The authors in [15] suggested a method that combines a convolutional neural network and the wavelet transform to create an intelligent fault classification mechanism. After the voltage and current results are obtained for every potential fault in the MG network, wavelet transforms are used for preprocessing and image conversion. The authors in [16] provide a bearing fault detection approach (GNNBFD) based on graph neural networks. Using the similarity between samples, the method first creates a graph; this graph is then applied as input to a graph neural network (GNN) for feature mapping, and the GNN produces the output samples. Moreover, reference [17] presented protection of an MG using machine learning techniques such as the Naive Bayes (NB) classifier, support vector machine (SVM), and extreme learning machine (ELM). The extracted three-phase current signals are used as inputs to these machine learning methods to classify the various fault events. Reference [18] provides a fault diagnostic approach for microgrids based on the whale optimization algorithm-extreme learning machine (WOA-ELM) to address the issue of fault identification. The WOA-ELM model uses the whale algorithm to optimize the input weights and hidden-layer neuron thresholds of the ELM. Moreover, the authors of [19] focus on the artificial neural network (ANN), an artificial intelligence-based technology, for fault detection, classification, and localization in an AC microgrid. Reference [20] presented blockchain technology and artificial intelligence techniques for fault detection and relay protection for wind power supply in AC/DC hybrid microgrids, in which a regionally layered power supply fault diagnosis model is created by combining machine intelligence-based identification models.

In [21], the authors presented a new idea by combining three computational tools, i.e., maximum overlap discrete wavelet packet transform (MODWPT) signal processing, augmented Lagrangian particle swarm optimization (ALPSO) parameter optimization, and support vector machine (SVM) machine learning, for detecting microgrid faults. The research in [22] proposes a fault detection technique based on the wavelet transform and chaotic neural networks. The chaotic neural network overcomes the problem of becoming trapped in a local optimum, and it performs well in terms of fault tolerance and associative memory. In [23], the authors introduce a new fault detection technique for microgrid applications that combines dq0 and wavelet processing with local measurements and is employed in real time. Reference [24] presented a method for detecting motor faults based on the wavelet transform, an improved particle swarm optimization with linearly increasing inertia weight, and a backpropagation (BP) neural network. To provide a workable strategy, the research in [25] presents a new fault identification approach for low-voltage DC microgrids incorporating renewable energy sources. The proposed fault detection method makes use of the absolute detailed energy, the DWT detail coefficient, and the instantaneous current change rate. To detect islanding and fault disturbances in a microgrid made up of resources such as wind turbine generators, fuel cells, and microturbines, the research in [26] suggested wavelet transform-based approaches.

The literature survey above shows that many researchers have developed artificial intelligence techniques to detect faults in microgrids. On this basis, the following gaps are identified in the literature.

  • Most of the studies discuss the fault detection/classification of a single microgrid with limited resources.

  • Limited studies have addressed the feature extraction in the machine learning models for fault detection and classification for microgrid applications.

To bridge these research gaps, this work provides a DWT-DNN-based fault detection scheme for hybrid energy-based multi-area grid-connected microgrid clusters. DWT on its own is susceptible to noise and power disturbances; because a DNN has an exceptional capacity to handle noisy data, combining the two makes the DWT-based scheme more resilient. The key contributions of this paper are as follows.

  • A hybrid energy-based multi-area grid-connected MG cluster is modeled with the available resources with different load profiles.

  • A fuzzy-based MPPT algorithm is proposed to extract the maximum power from the solar PV system modeled in the MG.

  • Discrete wavelet transform-based feature extraction is adopted to train the machine learning model considered for study.

  • A deep neural network (DNN) technique is proposed for the detection and classification of faults occurring at the point of common coupling (PCC) of the system considered for study.

The other portions of the paper are structured as follows. The description of the system considered for the study is explained in Sect. 2; Sect. 3 introduces the concept of the proposed DWT-DNN methodology; Sect. 4 summarizes the results of the simulation; and Sect. 5 provides the paper's conclusion.

2 Description of system under study

The architecture of the hybrid energy-based grid-connected microgrid cluster to be implemented is shown in Fig. 1. The system under study consists of two areas, area-1 and area-2; each area is associated with its available renewable sources and is treated as a single MG.

Fig. 1 Layout of the two-area MG cluster to be implemented

In area-1, the MG is modeled with a solar PV system [27] along with a fuzzy-based perturb and observe (P&O) MPPT algorithm; in area-2, the MG is associated with a fuel cell along with a controlled PWM technique. Each MG consists of variable building loads [1] and circuit breakers (CBs) for disconnecting the system from the grid if any abnormality occurs. In each MG, a DC-to-DC converter, supplied with the variable voltage from the renewable energy source, provides a constant DC bus voltage of 500 V, and the corresponding inverter output voltage is 415 V [3].

3 Modeling of solar PV system

The system shown in Fig. 1 consists of two areas, of which area-1 is considered a single MG with solar PV as its energy source. The maximum power from the solar PV system is extracted using a fuzzy logic-based MPPT algorithm. Figure 2 shows the equivalent circuit and characteristics of the solar PV system modeled in Simulink. The mathematical equations used for implementing the model in MATLAB/Simulink are given in Eqs. (1)–(4) [28]. Equation (1) shows that a solar cell's output characteristic is nonlinear and significantly influenced by solar radiation, temperature, and load conditions. The photocurrent is directly proportional to solar irradiance, as given in Eq. (2). The photocurrent and saturation current also depend on temperature and irradiance, as given in Eqs. (3) and (4).

$$ \hat{I}_{\text{out}} = \hat{I}_{\text{photo}} - \hat{I}_{s} \left( \exp \left[ \frac{q}{AkT}\left( v + \hat{I}_{\text{out}} R_{s} \right) \right] - 1 \right) $$
(1)
$$ \hat{I}_{{{\text{photo}}}} = \hat{I}_{{{\text{sc}}}} \frac{G}{{G_{{{\text{std}}}} }} $$
(2)
$$ \hat{I}_{{{\text{photo}}}} (G,T) = \hat{I}_{{{\text{scs}}}} \frac{G}{{G_{{{\text{std}}}} }}\left( {1 + \Delta \hat{I}_{{{\text{sc}}}} \left( {T - T_{{{\text{std}}}} } \right)} \right) $$
(3)
$$ \hat{I}_{\text{sat}} = \frac{\hat{I}_{\text{photo}} (G,T)}{e^{\left( v_{\text{oc}} / v_{t} \right)} - 1} $$
(4)
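Because Eq. (1) contains the output current on both sides, it has to be solved numerically. The following is a minimal sketch of one way to do that, not the Simulink model used in this work: it assumes the thermal voltage is v_t = AkT/q so that Eqs. (1) and (4) share the same exponent scale, assumes v ≥ 0, and uses bisection; the function names and all numerical values in the usage note are hypothetical.

```python
import math

Q = 1.602176634e-19   # electron charge (C)
K = 1.380649e-23      # Boltzmann constant (J/K)


def photocurrent(i_sc, g, g_std=1000.0):
    """Eq. (2): photocurrent scales linearly with irradiance G."""
    return i_sc * g / g_std


def saturation_current(i_photo, v_oc, v_t):
    """Eq. (4): saturation current from the open-circuit condition."""
    return i_photo / (math.exp(v_oc / v_t) - 1.0)


def pv_current(v, i_photo, i_sat, r_s, v_t):
    """Solve the implicit Eq. (1) for I_out by bisection (assumes v >= 0).

    residual(I) = I_photo - I_sat*(exp((v + I*R_s)/v_t) - 1) - I
    is strictly decreasing in I, so it has a single root.
    """
    def residual(i):
        return i_photo - i_sat * (math.exp((v + i * r_s) / v_t) - 1.0) - i

    lo, hi = 0.0, i_photo          # residual(hi) <= 0 whenever v >= 0
    step = max(i_photo, 1.0)
    while residual(lo) < 0.0:      # widen the bracket below if v > v_oc
        lo -= step
        step *= 2.0
    for _ in range(200):           # fixed number of halvings
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with hypothetical single-cell values A = 1.3, T = 298.15 K, a short-circuit current of 8 A, G = 800 W/m², an open-circuit voltage of 0.6 V, and R_s = 0.005 Ω, the solver returns a current close to the photocurrent at v = 0 and approximately zero at v = v_oc, which is the expected shape of the I–V curve.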
Fig. 2 PV cell: (a) equivalent circuit and (b) I–V and P–V characteristics

3.1 Fuzzy rule-based MPPT algorithm

In the traditional P&O MPPT technique, the array terminal voltage is perturbed in every MPPT cycle. As a result, once the maximum power point is reached, the output power oscillates around it, resulting in power loss in the PV system. This occurs under both stable and unstable atmospheric conditions. Applying fuzzy logic is expected to lessen the operating voltage oscillation, which, in turn, minimizes the power loss in the PV system. Fuzzy logic is also more reliable than conventional nonlinear controllers. For these reasons, we have designed a fuzzy logic-based MPPT technique to track the maximum power from the solar PV system, as described below. In this design, the fuzzy controller has two inputs: input 1 is the change in solar PV power (\(\Delta P_{{{\text{sol}}}}\)) and input 2 is the change in solar PV voltage (\(\Delta V_{{{\text{sol}}}}\)) at any sampling instant “r”; the output is the change in reference solar PV voltage (\(\Delta V_{{{\text{sol}}}}^{*}\)). This output is then used to generate an error signal \(e(r)\) and its change \(\Delta e(r)\), which are given by Eqs. (5) and (6).

$$ e(r) = \frac{{P_{{{\text{sol}}}} (r) - P_{{{\text{sol}}}} (r - 1)}}{{I_{{{\text{sol}}}} (r) - I_{{{\text{sol}}}} (r - 1)}} $$
(5)
$$ \Delta e(r) = e(r) - e(r - 1) $$
(6)
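As an illustration of Eqs. (5) and (6), the sketch below computes the error and its change from sampled power and current values. It is only a sketch under stated assumptions: the function names are hypothetical, and returning zero when the current does not change between samples is an assumed guard against division by zero, not part of the controller described here.

```python
def mppt_error(p_now, p_prev, i_now, i_prev, eps=1e-9):
    """Eq. (5): e(r) = (P(r) - P(r-1)) / (I(r) - I(r-1))."""
    d_i = i_now - i_prev
    if abs(d_i) < eps:          # assumed guard: no change in current
        return 0.0
    return (p_now - p_prev) / d_i


def error_sequence(power, current):
    """Return the lists of e(r) and of its change (Eq. (6)) for sampled P and I."""
    errors = [
        mppt_error(power[r], power[r - 1], current[r], current[r - 1])
        for r in range(1, len(power))
    ]
    deltas = [errors[r] - errors[r - 1] for r in range(1, len(errors))]
    return errors, deltas
```

Since the power curve is flat at the maximum power point, e(r) tends toward zero there, which is what makes it usable as a tracking error.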

The FIS editor window and the surface plot of the fuzzy logic-based MPPT controller of the solar PV system are shown in Fig. 3. The proposed fuzzy logic-based MPPT control is characterized as follows.

  1. It consists of five fuzzy sets, namely, Negative Big (−B), Negative Small (−S), Extreme Zero (Z), Positive Small (+S), and Positive Big (+B).

  2. The Mamdani inference mechanism is used.

  3. Defuzzification is done by the centroid method.

  4. The fuzzy controller is designed with 25 rules, as shown in Table 1.

Fig. 3 Fuzzy logic controller: (a) FIS editor and (b) surface plot

Table 1 Fuzzy rules developed for the MPPT algorithm

Based on the mathematical modeling, a fuzzy-based MPPT controller for solar PV system is developed and implemented in MATLAB/Simulink which is shown in Fig. 4.

Fig. 4 Implementation of solar PV system in Simulink environment

4 Modeling of fuel cell

A polarization curve, which depicts the nonlinear relationship between voltage and current density, is used to evaluate the steady-state characteristic of a proton exchange membrane fuel cell (PEMFC) source. The mathematical model of the fuel cell is given in Eqs. (7)–(10). Figure 5 shows the implementation of a fuel cell rated at 6 kW and 45 V DC [28].

$$ V_{\text{FC}} = E_{\text{ner}} - V_{a} - V_{\text{ohm}} - V_{\text{con}} $$
(7)
$$ V_{a} = T\left( p + q \ln (I) \right) $$
(8)
$$ V_{\text{ohm}} = I R_{\text{ohm}} $$
(9)
$$ V_{\text{con}} = - \frac{RT}{zF}\ln \left( 1 - \frac{I}{I_{\lim }} \right) $$
(10)

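where \(E_{\text{ner}}\) is the reversible voltage of the fuel cell (thermodynamic potential), \(V_{a}\) is the activation drop, \(V_{\text{ohm}}\) is the ohmic drop, \(V_{\text{con}}\) is the concentration drop, \(I\) is the cell current, \(I_{\lim}\) is the limiting current, \(R_{\text{ohm}}\) is the ohmic resistance, \(T\) is the temperature, \(R\) is the universal gas constant, \(F\) is the Faraday constant, \(z\) is the number of electrons transferred, and \(p,q\) are constants.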
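Equations (7)–(10) can be evaluated directly for a given current. The sketch below is illustrative only and is not the Simulink block of Fig. 5: the function name is hypothetical, z = 2 is an assumed default for hydrogen, and the parameter values in the usage note are placeholders rather than the parameters of the 6-kW stack.

```python
import math

R_GAS = 8.314462618     # universal gas constant (J/(mol K))
FARADAY = 96485.33212   # Faraday constant (C/mol)


def fuel_cell_voltage(i, e_ner, temp, p, q, r_ohm, i_lim, z=2):
    """Cell voltage from Eqs. (7)-(10) for a current 0 < I < I_lim."""
    if not 0.0 < i < i_lim:
        raise ValueError("current must satisfy 0 < I < I_lim")
    v_a = temp * (p + q * math.log(i))                            # Eq. (8)
    v_ohm = i * r_ohm                                             # Eq. (9)
    v_con = -(R_GAS * temp) / (z * FARADAY) * math.log(1.0 - i / i_lim)  # Eq. (10)
    return e_ner - v_a - v_ohm - v_con                            # Eq. (7)
```

With placeholder values such as E_ner = 1.2 V, T = 338.15 K, p = 6e-4, q = 1e-4, R_ohm = 0.002 Ω, and I_lim = 150 A, sweeping the current from low to high values reproduces the falling shape of a polarization curve, since each of the three loss terms grows with current.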

Fig. 5 MATLAB/Simulink implementation of PEM fuel cell

5 Proposed discrete wavelet transform with deep neural network (DWT-DNN)

5.1 Concept of discrete wavelet transforms in extracting the input features

Feature extraction transforms unprocessed data into usable numerical features while preserving the information in the original data set. Compared with manual extraction, automatic feature extraction is helpful for moving swiftly from raw data to building machine learning models. The wavelet transform is a signal processing method that examines disturbances in the power system using a time–frequency multi-resolution approach. It uses a movable window that shrinks at high frequencies and expands at low frequencies. With the help of basis functions called "mother wavelets," the signal can be dilated and translated into distinct frequency levels. In the time–frequency domain, wavelet transforms can both represent functions and make their local properties evident, which makes it easier to train neural networks to model highly nonlinear signals. The DWT represents a given signal as the sum of wavelet and scaling functions with coefficients at various time shifts and scales. It can extract information from transient signals by separating signal components that overlap in both time and frequency. In the DWT, the approximation and detail coefficients of a time signal \(\alpha \left( {\overline{\rm T}} \right)\) are obtained using the scaling function \(\beta_{k} \left( {\overline{\rm T}} \right)\) and the mother wavelet function \(\mu_{k} \left( {\overline{\rm T}} \right)\), which are given in Eqs. (11) and (12) [3]. In this work, we used Daubechies-4 wavelet decomposition.

$$ \beta_{kx} \left( {\overline{\rm T}} \right) = 2^{ - \frac{j}{2}} \beta \left( {2^{ - j} \overline{\rm T} - n} \right) $$
(11)
$$ \mu_{kx} \left( {\overline{\rm T}} \right) = 2^{ - \frac{j}{2}} \mu \left( {2^{ - j} \overline{\rm T} - n} \right) $$
(12)

Here \(j\), \(k\), and \(n\) are integers, and the base function is shifted by “\(n\)” units. The scaling function \(\beta_{k} \left( {\overline{\rm T}} \right)\) is associated with the low-pass filter (LPF) coefficients, and the wavelet function is associated with the high-pass filter (HPF) coefficients; these relations are expressed in Eqs. (13) and (14).

$$ \beta \left( {\overline{\rm T}} \right) = \sum\limits_{n} {h(n).\sqrt 2 } .\beta \left( {2\overline{\rm T} - n} \right) $$
(13)
$$ \mu \left( {\overline{\rm T}} \right) = \sum\limits_{n} {g(n).\sqrt 2 } .\beta \left( {2\overline{\rm T} - n} \right) $$
(14)

To illustrate how the detail coefficients are decomposed using the wavelet transform, a sample three-level wavelet decomposition is shown in Fig. 6 [3]. The DWT of a one-dimensional signal \(x\left( n \right)\) is computed by passing the signal through the LPF and HPF; the output \(y\left( n \right)\) of a filter with impulse response \(g\left( n \right)\) is the convolution given in Eq. (15).

$$ y\left( n \right) = \sum\limits_{m = - \infty }^{\infty } {x\left( m \right)} \, g\left( {n - m} \right) $$
(15)

where \(g\left( n \right)\) and \(h\left( n \right)\) are the filter sequences of the LPF and HPF.

Fig. 6 Decomposition procedure of detailed coefficients using wavelet transform of level 3

In this stage, the original data, consisting of eight input features (the three-phase voltages, the three-phase currents, and the positive- and negative-sequence voltages), are first obtained at the 11-kV bus shown in Fig. 1.

The step-by-step procedure for obtaining the required output signal to train the DNN is as follows: (1) set the fault resistance; (2) read the three-phase voltages, currents, and sequence voltages at the 11-kV bus; (3) apply the wavelet decomposition to the signal, [C, L] = wavedec(x, N, wavelet name); (4) extract the detail coefficients, D = detcoef(C, L, N); (5) repeat the same process for different fault resistances; and (6) save the data to the workspace to train the DNN. Figure 7 shows the decomposition of detail coefficients using the Daubechies-4 wavelet at level 5. The resulting output is applied to the DNN for identification and classification of faults that occur at the PCC of the cluster.
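The decomposition steps above can also be sketched outside MATLAB. The following pure-Python example is a minimal illustration, not the implementation used in this work: it assumes periodic boundary extension (MATLAB's wavedec may treat the signal edges differently, so coefficients near the edges can differ), the function names dwt_step and detail_energies are hypothetical, and summarizing each level by the energy of its detail coefficients is an assumed choice of feature.

```python
import math

# Daubechies-4 low-pass analysis filter (8 taps).
DB4_LO = [
    -0.010597401784997278, 0.032883011666982945,
    0.030841381835986965, -0.18703481171888114,
    -0.02798376941698385, 0.6308807679295904,
    0.7148465705525415, 0.23037781330885523,
]
# Quadrature-mirror high-pass filter derived from the low-pass taps.
DB4_HI = [(-1) ** k * DB4_LO[len(DB4_LO) - 1 - k] for k in range(len(DB4_LO))]


def dwt_step(x):
    """One analysis level with periodic extension: returns (approx, detail)."""
    n = len(x)
    approx, detail = [], []
    for i in range(n // 2):            # filter, then keep every second sample
        a = d = 0.0
        for k in range(len(DB4_LO)):
            sample = x[(2 * i + k) % n]
            a += DB4_LO[k] * sample
            d += DB4_HI[k] * sample
        approx.append(a)
        detail.append(d)
    return approx, detail


def detail_energies(x, levels):
    """Detail-coefficient energy at levels 1..levels, plus the final approximation."""
    energies = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        energies.append(sum(c * c for c in detail))
    return energies, approx
```

For a signal whose length is a multiple of 2^levels, the detail energies plus the energy of the final approximation add up to the energy of the original signal, which is a convenient sanity check; a sudden change such as a fault inception shows up as increased energy in the detail levels.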

Fig. 7 Decomposition procedure of detailed coefficients using wavelet transform of level 5

6 Deep neural network

An artificial neural network (ANN) that has more than one hidden layer of neurons between the input and the output is called a DNN. It is frequently used to model intricate nonlinear systems. Furthermore, DNN computation is quick because it only requires evaluating basic algebraic equations, so a trained DNN can respond almost immediately. The premise behind the proposed DNN-based fault detection scheme is that the branch current and voltage measurements at the cluster system's PCC can quickly reveal the presence of a fault in the system. DWT first processes the measurements to extract the features. The data and features are then fed into a DNN for fault-type classification. If a fault is found, its position is ascertained using the fault detection DNN. In addition, if the fault is classified as unbalanced, the faulted phase is identified using the fault phase identification DNN. Lastly, the information produced can be used in decision-making for later control operations, such as fault isolation and recovery.

Fault detection and classification in the system under study can be accomplished using deep neural networks (DNNs), a type of machine learning algorithm that is particularly effective at learning complex patterns and relationships in data. A deep neural network contains four main types of layers: the input layer, hidden layers, softmax layer, and output layer. As is frequently the case with data-driven fault diagnostic approaches, input data must be normalized before being sent into the input layer. To ensure that all of the values lie within the range [0, 1], feature scaling of the following form can be used.

$$ p^{\prime} = \frac{p - \min \left( p \right)}{{\max \left( p \right) - \min \left( p \right)}} $$
(15)
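The scaling above can be written in a few lines. This is a minimal sketch with a hypothetical function name; mapping a constant feature to zero is an assumption made here to avoid division by zero.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1] using p' = (p - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # assumption: a constant feature maps to 0
    return [(v - lo) / (hi - lo) for v in values]
```

For instance, the values 2, 4, and 6 are mapped to 0, 0.5, and 1. In practice, the minimum and maximum would be taken from the training data and reused for the test data so that both are scaled consistently.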

The hidden layers apply the nonlinear transformations given in Eq. (16), which convert the input data into higher-level features, where \(x = \left( {2, \ldots ,d} \right)\), \(p\) is the input vector, \(\Psi\) is the hidden vector, \(W\) is the weight matrix, \(\beta\) is the bias vector, and \(\lambda\) is the activation function. The output of the final hidden layer is then transformed without using the activation function, as given in Eq. (17).

$$ \left. \begin{gathered} \Psi_{1} = \lambda \left( {W_{1} .p + \beta_{1} } \right) \hfill \\ \Psi_{x} = \lambda \left( {W_{x} .\Psi_{x - 1} + \beta_{x} } \right) \hfill \\ \end{gathered} \right\} $$
(16)
$$ \Psi_{s} = W_{s} \Psi_{d} + \beta_{s} $$
(17)

The softmax layer uses the softmax function given by Eq. (18) to determine the values of each output neuron. The network then chooses the label with the highest output value to apply as a predicted label to the input data.

$$ q_{j} = \frac{e^{\Psi_{s,j}}}{\sum\limits_{i = 1}^{n_{\Psi_{s}}} e^{\Psi_{s,i}}} $$
(18)
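The forward pass described by Eqs. (16)–(18) can be sketched as follows. This is an illustrative example rather than the trained networks of this work: the activation function is assumed to be ReLU because the text does not specify it, the function names are hypothetical, and any weights passed in are placeholders.

```python
import math


def relu(v):
    """Assumed activation function (not specified in the text)."""
    return [max(0.0, x) for x in v]


def affine(weights, bias, v):
    """Compute W.v + b for a weight matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def softmax(v):
    """Eq. (18), shifted by max(v) for numerical stability."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(p, hidden_layers, output_layer):
    """hidden_layers: list of (W, b) pairs; output_layer: (W_s, b_s)."""
    psi = p
    for weights, bias in hidden_layers:        # Eq. (16)
        psi = relu(affine(weights, bias, psi))
    w_s, b_s = output_layer
    psi_s = affine(w_s, b_s, psi)              # Eq. (17), no activation
    q = softmax(psi_s)                         # Eq. (18)
    label = max(range(len(q)), key=q.__getitem__)
    return q, label
```

The returned label is the index of the largest softmax output, which corresponds to choosing the class with the highest output value as the predicted label.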

7 Implementation of DNN

The general structure of the DWT-based DNN is shown in Fig. 8. This structure consists of three stages, namely, (1) data set preparation, (2) feature extraction using DWT, and (3) training of the DNN. Deep neural networks can be applied to fault detection and classification in a microgrid as follows.

  • Data collection: Collect the voltage, current, and sequence components from the 11-kV bus of the system shown in Fig. 1. These data should cover a wide range of normal and faulty operating conditions. A total of eight input features are extracted from the 11-kV bus.

  • Data preprocessing: Clean and preprocess the collected data by removing noise, outliers, and inconsistencies. Perform normalization or scaling to ensure that the data are on a consistent scale and format, using the min–max scaling given in Eq. (15).

Fig. 8 Structure of proposed DWT-DNN

8 Training of DNN for fault detection and fault classification

Designing the architecture of a deep neural network involves choosing the number and kind of layers, the number of neurons in each layer, and the activation function. In this study, a total of 360 samples at the PCC were considered for designing the DNNs for both fault detection and fault classification. Of these, 70% (252 samples) are used for training, 15% (54 samples) for testing, and 15% (54 samples) for validation. Common architectures for fault detection and classification tasks include convolutional neural networks (CNNs) and recurrent neural networks (RNNs), such as long short-term memory (LSTM) networks. Figures 9, 10, 11, and 12 show the performance of DNN-1 (fault detection) and DNN-2 (fault classification) in terms of regression, performance, training state, and error histogram.
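The 70/15/15 partition of the 360 samples can be reproduced with a simple index split. The sketch below is one possible way to do it, with a hypothetical function name and an assumed random shuffle with a fixed seed; the actual partitioning procedure used for training is not specified here.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and split them into train/validation/test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # assumed: random partition
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]          # remainder goes to testing
    return train, val, test
```

Calling split_indices(360) yields 252 training, 54 validation, and 54 test indices, matching the sample counts stated above.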

Fig. 9 Regression plots of (a) DNN-1 and (b) DNN-2

Fig. 10 Performance plots of (a) DNN-1 and (b) DNN-2

Fig. 11 Training state plots of (a) DNN-1 and (b) DNN-2

Fig. 12 Error histogram plots of (a) DNN-1 and (b) DNN-2

From the regression plots shown in Fig. 9, the regression value for DNN-1 is equal to 1, which means that DNN-1 is accurately trained to identify the faults in the system under study; the mean square error between actual and predicted values is also very low. Similarly, for DNN-2, the regression value is approximately equal to 1, which means that DNN-2 is properly trained for classifying faults at the specified location.

The flowchart of the DNN implementation for both fault identification and fault classification is shown in Fig. 13. Two DNNs are used in the proposed DWT-DNN fault detection strategy, one for fault identification and the other for fault classification. Their structures differ slightly because of the differences in their outputs. First, we build the DNN for fault detection. This network detects faults by accepting as input the DWT-extracted features and the three-phase time-series current and voltage measurements at the system's PCC. Three different sorts of faults are examined in this work: LG faults, LLG faults, and LLLG faults. The output of the built DNN has four 0–1 indicators, each of which denotes a different kind of fault. Additionally, a no-fault indicator is added to the output since this DNN needs to differentiate between cases with and without faults.

Fig. 13 Implementation of deep neural network. DNN-1: fault detection and DNN-2: fault classification

9 Simulation results

The two-area microgrid cluster system with the proposed DWT-DNN, shown in Fig. 1, is modeled in Simulink and implemented in MATLAB 2022a. The recommended computational facility for this simulation includes the Windows 11 operating system, any Intel or AMD x86-64 processor with four or more cores and AVX2 instruction set support, 16 GB of RAM, and 23 GB of SSD storage for an all-products installation. There are two green microgrids in the system: MG1 and MG2. Each microgrid has a circuit breaker, controlled by an energy management system, through which it connects to the neighboring microgrid. Additionally, the PCC and circuit breakers are used to link the integrated MG system to the electrical grid.

10 Analysis of the system under normal working conditions

Initially, the cluster system considered for study is operated in grid-connected mode. The system is connected to the utility grid through a circuit breaker (CB3). Under normal working conditions, the system can exchange power with the utility grid: during excess power conditions, the cluster exports power to the utility grid, and during deficit power conditions, it imports power from the utility grid. Figure 14 illustrates how CB3 operates under excess and deficit power conditions. At 0.01 s and at 0.26 s, the cluster system has an excess of power, which is transferred to the utility grid and is shown as the highlighted zone. At all other times, the system imports power from the utility grid. The instantaneous voltage and current waveforms at the 11-kV bus, shown in Fig. 15, are at their nominal values. In this situation, the trained DNN-1 gives an output of fault level “0,” which means that it correctly identifies that there is no fault in the system.

Fig. 14 Power available at PCC of two-area MG cluster for exchange of power during grid-connected mode

Fig. 15 Instantaneous voltage, current waveforms, and fault detection signal of DNN-1 during no fault

11 Analysis of the system under abnormal working conditions

11.1 Line–ground (LG) fault

During this stage, the system under study is subjected to LG fault conditions. The fault is applied from 0.1 s to 0.25 s, and correspondingly, the instantaneous voltage, current waveforms, and fault detection signals are measured at the 11-kV bus as shown in Fig. 16.

Fig. 16 Instantaneous voltage, current waveforms, and fault detection signal of DNN-1 during LG fault

From the results, it is observed that the voltage suddenly drops to 1400 V and the current increases to 20 A in the phase where the LG fault occurred. The output of DNN-1 correctly indicates the LG fault condition by producing a fault level of “1.” As soon as a fault is detected, DNN-2, which is trained with the features extracted at the 11-kV bus, classifies the type of fault. As shown in Fig. 17, the output indicates an LG fault in the system by showing fault level “1” for the LG fault while keeping the remaining faults at fault level “0.”

Fig. 17 Fault classification with DNN-2 at 11-kV bus: (a) LG fault, (b) LL fault, (c) LLG fault, and (d) LLL fault

12 Line–line (LL) fault

The system under study is now subjected to LL fault conditions. Again the fault is applied from 0.1 s to 0.25 s, and the instantaneous voltage, current waveforms, and fault detection signals are obtained at the 11-kV bus as shown in Fig. 18.

Fig. 18 Instantaneous voltage, current, and fault detection signals from DNN-1 during LL fault condition

From the results, it is observed that the voltage drops to 3050 V and the current increases to 16.5 A in the phases where the LL fault occurred. The output of DNN-1 correctly indicates the LL fault condition by producing a fault level of “1.” As soon as a fault is detected, DNN-2, which is trained with the features extracted at the 11-kV bus, classifies the fault. As shown in Fig. 19, the output clearly indicates an LL fault in the system by showing fault level “1” for the LL fault while keeping the remaining faults at fault level “0.”

Fig. 19 Fault classification with DNN-2 at 11-kV bus: (a) LG fault, (b) LL fault, (c) LLG fault, and (d) LLL fault

13 Line–line–ground (LLG) fault

In this case, the system is subjected to an LLG fault. As in the previous cases, the fault is applied from 0.1 s to 0.25 s, and the instantaneous voltage, current waveforms, and fault detection signals are obtained as shown in Fig. 20. From the results, it is observed that the voltage drops to 1580 V and the current increases to 20 A in the phases where the LLG fault occurred. The output of DNN-1 correctly indicates the LLG fault condition by producing a fault level of “1.” As soon as a fault is detected, DNN-2, which is trained with the features extracted at the 11-kV bus, classifies the fault. As shown in Fig. 21, the output clearly indicates an LLG fault in the system by showing fault level “1” for the LLG fault while keeping the remaining faults at fault level “0.” The same analysis is also carried out for the remaining faults, and Table 2 compares the actual outputs with the outputs produced by the DNN.

Fig. 20 Instantaneous voltage, current waveforms, and fault detection signal during LLG fault

Fig. 21 Fault classification with DNN-2 at 11-kV bus: (a) LG fault, (b) LL fault, (c) LLG fault, and (d) LLL fault

Table 2 Comparison of actual and DNN outputs

In this study, we mainly focused on unsymmetrical faults. The primary goal of any network training is to increase the network's accuracy, which is defined by Eq. (19).

$$ \% {\text{Accuracy}} = \frac{M}{N} \times 100 $$
(19)

where \(M\) is the number of samples with correct labels and \(N\) is the total number of samples.

When the system is under normal working conditions, both the wavelet packet transform (WPT) [3] and the proposed DWT-DNN-based methodologies accurately detect the “no fault” condition by indicating level “0,” as shown in Fig. 22.

Fig. 22 Fault detection and classification with both WPT and DWT-DNN under no fault condition

From Fig. 22, the total number of samples (N) is 11 and the number of samples with correct labels (M) is 11. So, using Eq. (19), the accuracy under the no-fault condition is 100% for both the WPT and DWT-DNN methods. Similarly, when the system is subjected to an LG fault, the WPT methodology detects the LG fault by indicating level 1 at AG but also gives one incorrect label at ABG, as shown in Fig. 23. From Fig. 23, the total number of samples (N) is again 11, and the number of samples with correct labels (M) is 10. So, using Eq. (19), the accuracy under the LG fault condition obtained with WPT is 90.90%.
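The accuracy calculation of Eq. (19) for the two cases above can be checked with a short sketch. The function name and the 0/1 label vectors are hypothetical; in particular, the positions of the entries in the vectors are placeholders and do not correspond to the label ordering of Figs. 22 and 23.

```python
def accuracy_percent(predicted, actual):
    """Eq. (19): percentage of samples whose predicted label matches the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    m = sum(1 for p, a in zip(predicted, actual) if p == a)   # correct labels M
    return 100.0 * m / len(actual)                            # M / N x 100


# Hypothetical 11-label example: one fault indicator is expected to be 1.
actual = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
all_correct = list(actual)
one_wrong = list(actual)
one_wrong[7] = 1                 # a single incorrect label

print(accuracy_percent(all_correct, actual))
print(accuracy_percent(one_wrong, actual))
```

With 11 of 11 labels correct the function returns 100, and with 10 of 11 correct it returns about 90.9, in line with the values quoted above.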

Fig. 23 Fault detection and classification with WPT under LG fault conditions

Similarly, from Fig. 24, it is observed that the total number of samples (N) is 11, and the total samples with correct labels (M) is 11. So, the accuracy under LG fault condition obtained with DWT-DNN is 100%.

Fig. 24 Fault detection and classification with proposed DWT-DNN under LG fault conditions

The proposed DWT-DNN method is likewise compared with WPT in terms of accuracy for various faults, and the quantitative comparison is given in Table 3. The simulation results and the quantitative comparison with the WPT method show that the proposed DWT-DNN accurately identified and classified the various unsymmetrical faults that occurred at the 11-kV bus of the two-area MG cluster system.

Table 3 Quantitative comparison of WPT and DWT-DNN (proposed)

14 Conclusions

In this study, discrete wavelet transform-based deep neural networks were used to detect and classify faults in a two-area MG cluster. We observed how the number of hidden layers and the number of neurons in the final hidden layer affected the performance of the networks for fault detection. It is observed that above a certain level (roughly 95%), increasing the network size does not improve fault detection accuracy. We also demonstrated how data augmentation may help further improve fault detection accuracy, and how it worked well for the fault classification example. In this study, we first developed and trained a DNN for fault detection, and then modeled another DNN to classify the faults. These models were then connected to the system under study to observe their performance in terms of accuracy. The results show that the proposed DWT-DNN outperforms the WPT technique. However, the proposed method has certain limitations related to feature extraction, data preprocessing complexity, and real-time processing. As future scope, careful feature selection, parameter tuning, and hybrid approaches that combine DNNs with other machine learning techniques and domain expertise may be required to lessen these constraints. Additionally, some of the drawbacks of DWT can be mitigated by utilizing wavelet packet decomposition or lightweight DWT variations, which offer greater feature extraction flexibility.