1 Introduction

Globally, solar energy technology has seen significant, ongoing progress. It is safe for people and other living things and operates without noise, which makes it one of the most environmentally friendly renewable energy sources. Solar energy production is rising steadily because it is a pollution-free source with comparatively low installation costs. The report of the International Renewable Energy Agency [37] stated that the installed PV capacity in the reported year was approximately 700,000 MW, and that number continues to increase.

The energy losses in photovoltaic (PV) systems are mainly due to faults that seriously affect system efficiency. A PV module failure degrades its output power and reduces the performance and reliability of the overall system [24], and it may eventually cause a safety issue [26]. Faults in PV systems can therefore cause significant energy loss as well as fire hazards. To ensure reliable and safe operation, photovoltaic installations must be accompanied by monitoring and fault diagnosis systems that detect and resolve problems in a timely manner. Many monitoring and fault diagnosis methods have been considered in the literature; they differ in their requirements for speed, complexity, and sensors, and in their ability to identify a large number of faults.

From the above, it is clear that PV systems, which are being deployed at an increasing rate, need effective and robust mechanisms for fault detection and diagnosis (FDD) and for continuous monitoring. Hence, an appropriate solution is to use intelligent techniques based on deep learning architectures to achieve high performance in determining the type and location of a fault. Among the most suitable and effective techniques are those based on infrared (IR) thermography, which is quick, simple, and cheap. The main trend in building such intelligent systems relies on deep learning architectures, which are used in intelligent FDD for PV systems to trigger appropriate actions and responses at the appropriate time. Consequently, the motivation of this work is to keep track of the latest AI architectures for intelligent FDD of PV panels. Many common AI architectures are reviewed, especially the convolutional neural network (CNN), long short-term memory (LSTM), generative adversarial network (GAN), auto-encoder/decoder, Boltzmann machine (BM), and stacked neural networks.

The main contributions of this paper are as follows:

  1. A comprehensive systematic review of FDD methods for photovoltaic systems is presented.

  2. Intelligent thermography-based FDD techniques and their benefits in classifying and localizing different types of faults are addressed.

  3. Adequate guidance and recommendations for future research in this area are provided.

This paper is organized as follows. Section 2 discusses the types of PV system failures. Section 3 presents the main fault detection and diagnosis strategies. Section 4 describes various PV FDD methods in the literature, including thermography as one of the most promising methods. Section 5 covers the artificial intelligence techniques that are used in fault detection of PV systems. Section 6 concludes the paper and outlines future work.

2 Classification of PV faults

Faults are classified into three main categories based on power losses during the operating time, as shown in Fig. 1: infant failures, midlife failures, and wear-out failures [24]. Infant failures occur at the beginning of the operating life of a PV system. The manufacturer or the installer of the modules is often responsible for them; the power of the PV modules drops quickly and dramatically, which causes a large loss. Wear-out failures occur at the end of the lifetime of PV modules. The lifetime ends either with a safety problem or when the power of the PV module falls to a certain level (70–80% of the initial power).

Fig. 1 PV failure main categories

PV faults are also classified according to their severity: more severe faults are called acute, while less severe ones are called chronic. Short-circuit and open-circuit faults are acute, as they can leave the PV system with no output power and shut it down. In contrast, shading, hot spot, degradation, and bypass diode faults are chronic because of their lower severity.

Faults can also be classified as permanent (internal cause) or temporary (external cause) [38]. The performance of PV modules depends on the received light and on the condition of the cells and their connections.

Permanent faults due to the condition of the cells include delamination, bubbles, yellowing, burns, degradation, hot spots, scratches, and cracks. Permanent faults due to failed connections between electrical elements in the PV system include open circuit, short circuit, potential-induced degradation (PID), junction box failures, diode failures, and inverter failures. Permanently faulty modules can simply be removed and replaced. Temporary faults depend on the received light or on partial shading effects. Temporary faults such as shading and soiling (dirt, snow, dust, or other elements accumulating on the module) can be resolved easily without removing the modules [17], as shown in Fig. 2.

Fig. 2 PV failure examples

PV plants must be protected from hazards such as lightning, overcurrent, and overvoltage to ensure stability, availability, reliability, and security in production. Many standards are used to protect PV plants, such as the National Electrical Code [41], which addresses some of the safety requirements for PV plant installation (protection devices, circuit breakers, overcurrent protection, and ungrounded systems). However, not all PV faults can be detected, and undetected faults can create serious risks such as fires caused by line-to-line and ground faults [14].

Faults must be monitored continuously to protect the PV system from different losses, so a fault diagnosis tool is essential to the reliability and durability of the PV panels.

3 Fault detection strategies

Fault detection and diagnosis (FDD) methodologies include three main approaches, as shown in Fig. 3. The first approach is qualitative and covers both if–then rules and decision trees. The second approach is quantitative. The last approach is based on process history data.

Fig. 3 FDD main strategies and their examples

In model-based FDD, mathematical models of the system are built from an understanding of the physical principles of the design under normal operating conditions. The nonlinear equations of the PV panel model produce the expected input–output data, which are then analyzed together with the measured signals. The differences, or residuals, between the measurements of the actual system and the model predictions are used to determine the presence of a fault in the system. Well-known models used for FDD in PV systems are the single-diode model [5], the double-diode model, and the current-driven three-diode model. Model-based strategies achieve acceptable accuracy at high irradiance, but they are less accurate at low irradiance, and they also need an accurate mathematical model, which is complicated and sometimes impossible to obtain in the real world [46].
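The residual idea can be illustrated with a minimal sketch built on the single-diode model. All parameter values, the function names, and the fault tolerance are assumptions chosen for illustration; they are not taken from the reviewed works. The implicit diode equation is solved by bisection, and a fault is flagged when the measured current deviates from the model prediction by more than the assumed tolerance.

```python
import math

# Hypothetical module parameters, chosen only for illustration.
I_PH = 8.0        # photo-generated current [A]
I_0 = 1e-9        # diode saturation current [A]
R_S = 0.3         # series resistance [ohm]
R_SH = 300.0      # shunt resistance [ohm]
N_IDEALITY = 1.3  # diode ideality factor
N_CELLS = 60      # number of cells in series
V_T = 0.0257      # thermal voltage near 25 degC [V]


def model_current(voltage):
    """Solve the implicit single-diode equation for the current by bisection.

    Intended for 0 <= voltage <= 40 V with the parameters above.
    """
    a = N_IDEALITY * N_CELLS * V_T

    def mismatch(current):
        # Zero when `current` satisfies the single-diode equation.
        v_diode = voltage + current * R_S
        return (I_PH - I_0 * (math.exp(v_diode / a) - 1.0)
                - v_diode / R_SH - current)

    # mismatch() decreases with current: positive at `low`, negative at `high`.
    low, high = -100.0, I_PH + 1.0
    for _ in range(80):
        mid = 0.5 * (low + high)
        if mismatch(mid) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def residual(voltage, measured_current):
    """Difference between the measurement and the model prediction [A]."""
    return measured_current - model_current(voltage)


def is_faulty(voltage, measured_current, tolerance=0.4):
    """Flag a fault when |residual| exceeds an assumed tolerance in amperes."""
    return abs(residual(voltage, measured_current)) > tolerance
```

In practice the tolerance would have to account for sensor noise and for the reduced model accuracy at low irradiance noted above.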

The data-driven approach focuses on collecting a massive amount of data for analysis and interpretation, unlike the model-based approach, which needs a priori qualitative or quantitative knowledge about the system [11, 52].

Data-driven FDD methods use a large amount of training data, representing different operating conditions and several faulty scenarios, to find the relationship between input and output signals [53]. When the output is a feature or sensor value to be predicted, the task is called regression; when it is a class label assigned to the input data, the task is called classification.

4 PV FDD methods

The data types commonly used in PV FDD systems are electrical measurements, environmental data, and images of photovoltaic panels. According to the data type, fault detection and categorization techniques in photovoltaic systems can be divided into two classes: a non-electrical class, which includes visual and thermal methods (VTMs), and a traditional electrical class [49], as shown in Fig. 4.

Fig. 4 PV FDD categories and some examples

The electrical-based methods (EBMs) focus on IV characteristic curve analysis or on statistical and signal processing techniques [21].

4.1 Electrical-based methods (EBMs)

4.1.1 IV curve analysis

IV curve analysis is a traditional FDD strategy in which the electrical measurements of a module give the short-circuit current, the open-circuit voltage, and other parameters that can reveal a failure of the system. The current–voltage curve is measured while the voltage or the current across the module is varied by applying an external electronic load or power source [24]. Cells or modules with identical response characteristics are usually used as a reference and compared with the module under test. Under normal operation, the IV characteristic follows a specific curve, as in Fig. 5, which changes when a fault occurs. The degree of that change depends on the type and severity of the fault [13].

Fig. 5 I-V curve parameters: Isc, short-circuit current; Voc, open-circuit voltage; Pmpp, maximum power point; PT, virtual power point

The IV curve of a module may be useful in detecting various faults. Unfortunately, the precise location of those failures is not detected, so other techniques are often necessary to find their locations. This makes the whole process difficult, as it takes a huge amount of time and money [22, 25].
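As a hedged illustration of IV curve analysis, the sketch below extracts Isc, Voc, Pmpp, and the fill factor from a sampled curve and compares a module under test with a reference. The function names and the synthetic curves are hypothetical; the sketch assumes that the samples are sorted by increasing voltage, start at V = 0, and cross zero current within the sweep.

```python
def iv_parameters(voltages, currents):
    """Return Isc, Voc, Pmpp and fill factor from a sampled IV curve."""
    isc = currents[0]  # assumes the first sample is taken at V = 0
    voc = None
    for k in range(1, len(voltages)):
        i0, i1 = currents[k - 1], currents[k]
        if i0 > 0.0 >= i1:
            # Linear interpolation of the zero-current crossing.
            v0, v1 = voltages[k - 1], voltages[k]
            voc = v0 + (v1 - v0) * i0 / (i0 - i1)
            break
    if voc is None:
        raise ValueError("current never reaches zero in the sweep")
    pmpp = max(v * i for v, i in zip(voltages, currents))
    return {"isc": isc, "voc": voc, "pmpp": pmpp, "ff": pmpp / (isc * voc)}


def relative_deviation(test, reference):
    """Relative change of each parameter with respect to the reference."""
    return {k: (test[k] - reference[k]) / reference[k] for k in reference}


# Synthetic curves (illustrative shape only, not measured data).
v = [float(k) for k in range(41)]
reference = iv_parameters(v, [8.0 * (1.0 - (x / 40.0) ** 10) for x in v])
under_test = iv_parameters(v, [6.0 * (1.0 - (x / 40.0) ** 10) for x in v])
deviation = relative_deviation(under_test, reference)
```

A large deviation in one of the parameters indicates that something is wrong with the module, but, as stated above, it does not reveal where the failure is located.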

Therefore, the use of visual and automatic anomaly classification can make systems monitoring and maintenance simpler and provide lower operating costs, while also saving more time [22, 36].

4.1.2 Statistical and signal processing techniques

Signal processing methods rely on waveform signal analysis, such as Earth Capacitance Measurement, Spread Spectrum, and Time Domain Reflectometry (TDR) [39]. The TDR method is used to detect and locate defects in PV module arrays. Unfortunately, its results can depend on the installation conditions, such as PV component materials and wiring.

4.2 Visual and thermal methods (VTMs)

The VTMs include the following techniques: visual inspection, ultraviolet (UV) fluorescence (FL) imaging, electroluminescence imaging, and thermography, which is illustrated in Fig. 6.

Fig. 6 Different faults for PV modules in thermography. A Offline (open circuit): "The panel is hotter than others". B Short circuit (bypass diode): "One row is hot". C Wrong connection, shading, short circuit, or severe soiling: "Different cells have different temperatures". D Delamination, defective cell, or shading: "One cell is hot". E Snail trails, discoloration, or shading: "Pointed heat". F Broken cell: "Part of a cell is hot"

4.2.1 The electroluminescence (EL) method

The electroluminescence (EL) method is one of the well-known VTM strategies. It tests PV modules or cells and detects failures using EL images as a data set [1, 12], as in Fig. 7 [24]. The PV modules are supplied with a DC current to stimulate radiative recombination in the solar cells. The electroluminescence emission is measured with a silicon charge-coupled device (CCD) camera, which is commercially available.

Fig. 7 EL failure image example

4.2.2 The UV fluorescence method

The UV fluorescence method (FL imaging) of ethylene vinyl acetate (EVA) in PV cells can be used to analyze the discoloration of photovoltaic modules, as shown in Fig. 8 [23, 47]. Even in a dark outdoor setting, it can determine the number and location of cell cracks in PV modules, but it cannot detect cracks on the border of the cell [24].

Fig. 8 UV FL failure images

4.2.3 Infrared thermography

Infrared (IR) thermal imaging is one of the most important non-destructive and contactless techniques for failure detection. Radiation occurs when a surface of the PV system or of its electrical components releases energy as electromagnetic waves. Infrared waves are generated by the moving atoms of any object that has a temperature above 0 K or that receives external energy [22]. Thermography can be used for failure localization and classification of PV modules, as illustrated in Fig. 6, as well as of additional components of the system, such as cabling, diodes, DC combiner boxes, junction boxes, and connectors.

Infrared thermography (IRTG) is widely used because it is fast, reliable, accurate, and economical, and because it provides 2D distributions of characteristic features of PV modules. Figure 9 shows two different thermography techniques for PV module failure detection: active IRTG and passive IRTG.

Fig. 9 Thermography techniques

Active IRTG

Using an external heat source, active thermal imaging creates an internal heat flow in an object, raising its temperature [22].

Pulsed thermography is one type of active IRTG and is easy and fast to apply. The body is heated with a heat pulse from a source such as a lamp or a heating gun [35]. In long-pulse TG, a continuous low-power heating source is applied to the PV modules, and the focus is on the cooling phase [48]. In lock-in TG, the object is heated with an oscillating temperature field, so internal failures can be detected from changes in the thermal wave [8]. Vibro-thermography uses mechanical vibrations, which are converted into thermal energy and cause hot spots to appear in defective areas of PV modules, such as cracks and delamination [45].

Passive IRTG

The passive IRTG method (also called “thermography under steady state conditions”) does not need any external heat source; it simply collects the IR radiation emitted by the PV modules. Passive TG is the most common type because it is simpler and cheaper, as it needs only an infrared camera [22]. Fault detection can be done without touching the object and without additional hardware or physical intervention. It offers real-time imaging: the images are observed while they are recorded, which minimizes data errors. The authors in [44] propose a solution for PV fault detection that applies a deep learning method to a thermal image dataset to perform cell detection and instance segmentation, which makes the algorithm useful for automated inspection.
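Before any learning-based method is applied, the thermal patterns listed in Fig. 6 can be approximated by a simple rule on per-cell temperatures. The sketch below is only an illustration under stated assumptions: the module is given as a grid of cell temperatures, the 10 K margin above the module median is an assumed threshold, and the function names are hypothetical.

```python
from statistics import median


def hot_cells(temps, delta=10.0):
    """Return (row, column) indices of cells hotter than the median by delta."""
    reference = median(t for row in temps for t in row)
    return [(r, c)
            for r, row in enumerate(temps)
            for c, t in enumerate(row)
            if t - reference > delta]


def describe(temps, delta=10.0):
    """Map the set of hot cells to a coarse pattern name, loosely after Fig. 6."""
    cells = hot_cells(temps, delta)
    if not cells:
        return "no hot spot"
    if len(cells) == 1:
        return "one cell is hot"
    rows = {r for r, _ in cells}
    if len(rows) == 1 and len(cells) == len(temps[0]):
        return "one row is hot"
    return "several cells are hot"


# Illustrative temperatures in degC (not measured data).
single = [[35.0, 35.0, 35.0, 35.0],
          [35.0, 35.0, 60.0, 35.0],
          [35.0, 35.0, 35.0, 35.0]]
row_fault = [[55.0, 55.0, 55.0, 55.0],
             [35.0, 35.0, 35.0, 35.0],
             [35.0, 35.0, 35.0, 35.0]]
```

Such a rule cannot tell apart faults that share the same thermal signature, which is one reason why the learning-based classifiers reviewed in Sect. 5 are used.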

IRTG has some limitations in its application. It requires accurate cameras, which are often expensive, to avoid errors in measurements. In addition, well-trained, experienced operators are required to use those cameras in the right way. A pre-detection study for choosing the correct altitude is needed to enhance image resolution [22].

5 Artificial intelligence (AI) techniques for FDD systems

AI techniques have found many applications in recent decades in fields such as medicine, astronomy, engineering, robotics, speech recognition, natural language processing, and behavioral sciences. AI is a powerful and important tool in many areas of PV systems research, including forecasting and prediction [21, 29]. Different techniques can be used in data-driven fault detection for PV systems, such as statistical methods or machine learning (ML), which can handle complex and nonlinear problems. Examples of AI methods used in PV systems include artificial neural networks [12, 18], fuzzy logic [6], support vector machines [2], decision trees, and the k-nearest neighbor algorithm [34].

ML methods are a subset of AI techniques that give computers the ability to learn automatically from previous experience, such as stored data, without being explicitly programmed by humans. Deep learning (DL) is in turn a special kind of machine learning, so both machine learning and deep learning are parts of AI. Computer vision (CV), on the other hand, is a broad field that gives computers the ability to process, analyze, and interpret the visual world using artificial intelligence, as shown in Fig. 10. Deep learning has progressed in recent years, and it now addresses the vast majority of traditional CV problems [43]. Table 1 introduces recent algorithms that combine thermography with different artificial intelligence and computer vision methods for classifying and localizing PV faults.

Fig. 10 An analysis of how AI, ML, DL, and CV relate

Table 1 Different recent methods in PV FDD systems using thermography and AI

Machine learning techniques are used when traditional techniques cannot find a solution to complex problems. They can handle changing environments because of their ability to adapt to new data. Machine learning is useful for lengthy and difficult problems because it learns the links between inputs and outputs directly from data.

A fuzzy classification algorithm is proposed by the authors in [6]. In this work, failures are classified by applying a pixel counting technique to thermal images in order to detect EVA discoloration and delamination failures based on three index values. However, it focuses only on the location of the hot spot rather than on diagnosing other types of faults. A machine learning methodology is introduced in [2] that uses a hybrid-features-based support vector machine model for hot spot detection and classification of PV panels. Color histograms, a second-order co-occurrence matrix, and local binary pattern features are combined using a data fusion approach to increase efficiency.

DL is a subset of ML that is often more powerful on large datasets. It is based on neural networks with many hidden layers that accept and learn from large amounts of data.

5.1 Deep learning (DL) frameworks

The most popular deep learning frameworks for photovoltaic fault detection and classification are the convolutional neural network, long short-term memory, recurrent neural network, generative adversarial network, Boltzmann machine, and auto-encoder/decoder [3, 21].

5.1.1 Convolutional neural network (CNN)

According to the authors in [28], a CNN is a specific type of ANN for processing data that is known to have a grid-like structure, and it uses convolution rather than standard matrix multiplication [16]. As seen in Fig. 11, a CNN is made up of an input layer, an output layer, and a sizable number of hidden layers, mostly convolutional, pooling, and fully connected layers [4]. Max-pooling compares the values within each local window and keeps the highest one, which retains the most prominent characteristics of the image. The activation function frequently employed to accelerate convergence is the Rectified Linear Unit (ReLU).
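The three building blocks named above (convolution, ReLU, and max-pooling) can be sketched in a few lines of plain Python. This is a didactic sketch on nested lists with hypothetical function names, not an implementation from any of the reviewed works; a real system would normally rely on a deep learning library.

```python
def conv2d(image, kernel):
    """Valid 2D convolution as used in CNNs (no kernel flip, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k_h) for j in range(k_w))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Rectified Linear Unit: negative activations are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in cols]
            for r in rows]


# Illustrative 4 x 4 input and a 1 x 1 identity kernel.
image = [[1, -2, 3, 0],
         [4, 5, -6, 1],
         [0, 1, 2, 3],
         [-1, -1, 7, 8]]
pooled = max_pool(relu(conv2d(image, [[1]])))
```

With the identity kernel the convolution leaves the image unchanged, so the example isolates the effect of ReLU followed by 2 x 2 max-pooling.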

Fig. 11 CNN structure for PV FDD system

The authors in [36] have investigated different convolutional neural network models and applied them to IRTG images taken by ground-based operators and unmanned aerial vehicles (UAVs). Using pre-processing techniques such as normalization, grayscaling, thresholding, Sobel–Feldman filtering, and box blur filtering, they obtain a high-performance classifier that labels each PV image as an operative or a hotspot PV module.
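Several of the pre-processing steps listed above can be sketched as follows. This is a generic illustration, not the pipeline of [36]: the function names are hypothetical, the image is assumed to be a single-channel (already grayscale) nested list, and only the valid image region is filtered.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def normalize(img):
    """Rescale pixel values linearly to the range [0, 1]."""
    flat = [v for row in img for v in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1.0  # avoid division by zero on flat images
    return [[(v - low) / span for v in row] for row in img]


def _windows(img):
    """All 3 x 3 windows of img, arranged as a grid (valid region only)."""
    h, w = len(img), len(img[0])
    return [[[img[r + i][c:c + 3] for i in range(3)] for c in range(w - 2)]
            for r in range(h - 2)]


def box_blur(img):
    """Replace each pixel by the mean of its 3 x 3 neighbourhood."""
    return [[sum(map(sum, win)) / 9.0 for win in row] for row in _windows(img)]


def sobel_magnitude(img):
    """Gradient magnitude from the horizontal and vertical Sobel kernels."""
    def response(win, kernel):
        return sum(win[i][j] * kernel[i][j] for i in range(3) for j in range(3))
    return [[math.hypot(response(win, SOBEL_X), response(win, SOBEL_Y))
             for win in row]
            for row in _windows(img)]


def threshold(img, level):
    """Binarize: 1 where the pixel exceeds level, otherwise 0."""
    return [[1 if v > level else 0 for v in row] for row in img]


# Illustrative image with a vertical edge between columns 1 and 2.
edge = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = threshold(sobel_magnitude(edge), 2.0)
```

The order and parameters of these steps are design choices; the threshold level of 2.0 in the example is arbitrary.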

An intelligent method that combines region-based convolutional neural networks (R-CNN) with telemetry data is suggested in [18] to automatically identify and assign the relative hot areas of solar panels. Another powerful method is proposed in [25] to classify 11 classes of PV module faults using a CNN with multi-scale kernels at several visual perception levels, based on a transfer learning strategy. The pre-trained knowledge of AlexNet is used to increase the capability of the network. Offline augmentation, such as oversampling, is performed to address the unbalanced distribution of the classes. The method detects and classifies PV faults correctly and efficiently using thermography.
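Oversampling for an unbalanced class distribution can be illustrated with a minimal random-oversampling sketch. It is a generic technique shown under assumptions (the function name, the seed, and the class labels are hypothetical) and is not claimed to be the exact augmentation used in [25].

```python
import random
from collections import Counter


def oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes are equal."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Illustrative file names and labels (hypothetical).
images = ["img0", "img1", "img2", "img3", "img4", "img5"]
classes = ["healthy", "healthy", "healthy", "healthy", "hotspot", "cracking"]
balanced_images, balanced_classes = oversample(images, classes)
counts = Counter(balanced_classes)
```

Because duplicated samples carry no new information, oversampling is usually applied to the training split only, so that the evaluation data stay untouched.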

5.1.2 Long short-term memory networks (LSTM)

LSTM is a type of recurrent NN (RNN). Using a forget gate, shown in Fig. 12, the LSTM network is able to deal with long dependencies (connecting the information when the gap between the output data sequence and the input data sequence increases), which standard recurrent NNs handle poorly [19].
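The role of the forget gate can be seen in a single LSTM time step. The sketch below uses the standard LSTM cell equations with scalar input, hidden state, and cell state to keep it short; the parameter layout and names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input, hidden state and cell state.

    params maps each gate name to a (w_x, w_h, bias) triple (assumed layout).
    """
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    i = sigmoid(pre_activation("input"))        # how much new content to write
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content
    c = f * c_prev + i * g                      # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c


# Forget gate saturated open, input gate closed: the memory is carried over.
keep = {"forget": (0.0, 0.0, 50.0), "input": (0.0, 0.0, -50.0),
        "output": (0.0, 0.0, 0.0), "candidate": (1.0, 1.0, 0.0)}
h_keep, c_keep = lstm_step(1.0, 0.0, 0.7, keep)

# Forget gate closed as well: the memory is erased.
erase = dict(keep, forget=(0.0, 0.0, -50.0))
h_erase, c_erase = lstm_step(1.0, 0.0, 0.7, erase)
```

When the forget gate stays close to 1, the cell state passes through the step almost unchanged, which is what lets information survive across long gaps in the sequence.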

Fig. 12 LSTM structure

In [50], PV modules are linked to an IEEE bus system, and the LSTM RNN algorithm detects a high-impedance fault with an accuracy of 91.21%.

5.1.3 Generative adversarial networks (GAN)

A GAN consists of two networks: a generative network that produces new data instances, and a discriminative network that evaluates the data for authenticity, as shown in Fig. 13 [15]. Each network is trained against a static adversary. In order to reconstruct the input layer, GAN uses supervised learning [38].
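The adversarial relation between the two networks can be made concrete through their loss functions. The sketch below computes the standard binary cross-entropy losses from discriminator outputs only; it assumes that the discriminator returns a probability of "real" for each sample, uses the commonly used non-saturating generator loss, and leaves the networks themselves out. The function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross-entropy of the discriminator on real and generated samples.

    d_real and d_fake hold the discriminator's probability of 'real' for
    real samples and for generated samples, respectively.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """Non-saturating generator loss: reward fooling the discriminator."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# An undecided discriminator versus a confident and correct one.
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
confident = discriminator_loss([0.9, 0.9], [0.1, 0.1])
```

The real/generated labels implicit in the two terms of the discriminator loss provide its training signal, while the generator is updated so that its samples raise the discriminator's probability of "real".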

Fig. 13 GAN architecture

In [33], a convolutional GAN with domain adaptation is used to detect DC series arc faults.

5.1.4 Auto-encoder/decoder networks

An auto-encoder ANN is trained to encode the input data into a specific representation (the code) from which the input can be reconstructed [7]. The target output of the auto-encoder is therefore the auto-encoder input itself. The code represents the learned features when the reconstruction error is minimized [51].

Stacked auto-encoder networks

A stacked auto-encoder is a neural network with multiple hidden layers that is created by stacking several auto-encoder networks (each made up of an encoder and a decoder), as illustrated in Fig. 14.

Fig. 14 Stacked auto-encoder networks architecture

A stacked auto-encoder clustering method is applied to the IV curves of a PV system in [31] to detect short-circuit faults.
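The reconstruction objective that each auto-encoder in such a stack minimizes can be illustrated with a deliberately tiny example: a linear auto-encoder that compresses 2D inputs into a one-dimensional code and is trained by full-batch gradient descent. The data, the initial weights, the learning rate, and the function names are assumptions made for illustration; this is not the clustering method of [31].

```python
def reconstruction_loss(data, enc, dec):
    """Mean squared reconstruction error of a 2 -> 1 -> 2 linear auto-encoder."""
    total = 0.0
    for x in data:
        code = enc[0] * x[0] + enc[1] * x[1]
        total += sum((dec[k] * code - x[k]) ** 2 for k in range(2))
    return total / len(data)


def train_autoencoder(data, lr=0.02, epochs=2000):
    """Minimize the reconstruction error with full-batch gradient descent."""
    enc, dec = [0.1, 0.1], [0.1, 0.1]  # assumed small initial weights
    n = len(data)
    for _ in range(epochs):
        grad_enc, grad_dec = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            code = enc[0] * x[0] + enc[1] * x[1]
            err = [dec[k] * code - x[k] for k in range(2)]
            back = sum(err[k] * dec[k] for k in range(2))  # error at the code
            for k in range(2):
                grad_dec[k] += 2.0 * err[k] * code / n
                grad_enc[k] += 2.0 * back * x[k] / n
        enc = [enc[k] - lr * grad_enc[k] for k in range(2)]
        dec = [dec[k] - lr * grad_dec[k] for k in range(2)]
    return enc, dec


# Inputs that lie on a line, so a one-dimensional code is sufficient.
data = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.5, 1.0)]
initial_loss = reconstruction_loss(data, [0.1, 0.1], [0.1, 0.1])
enc, dec = train_autoencoder(data)
final_loss = reconstruction_loss(data, enc, dec)
```

The input serves as its own target, so no labels are needed; in a stacked auto-encoder, the code of one trained auto-encoder becomes the input of the next.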

5.1.5 Boltzmann machine networks (BM)

BM is a stochastic, unsupervised learning ANN that can solve hard problems by learning to identify fundamental features of the data. The restricted Boltzmann machine (RBM) has a visible layer and a hidden layer. The deep belief network (DBN) is a special kind of BM [51]: it consists of multiple RBMs and an output layer (often a classification layer), which together make up a multi-hidden-layer probabilistic generative model, as shown in Fig. 15.
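The interaction between the visible and hidden layers of an RBM can be sketched with its standard energy function and the conditional activation probability of the hidden units. Binary units are assumed, and the weights, biases, and function names below are arbitrary illustrative choices.

```python
import math


def energy(v, h, weights, visible_bias, hidden_bias):
    """RBM energy E(v, h) = -a.v - b.h - v^T W h for binary unit vectors."""
    bias_term = (sum(a * vi for a, vi in zip(visible_bias, v))
                 + sum(b * hj for b, hj in zip(hidden_bias, h)))
    pair_term = sum(v[i] * weights[i][j] * h[j]
                    for i in range(len(v)) for j in range(len(h)))
    return -(bias_term + pair_term)


def hidden_probabilities(v, weights, hidden_bias):
    """P(h_j = 1 | v) for every hidden unit (logistic of its total input)."""
    return [1.0 / (1.0 + math.exp(-(hidden_bias[j]
                                    + sum(v[i] * weights[i][j]
                                          for i in range(len(v))))))
            for j in range(len(hidden_bias))]


# Two visible units and one hidden unit with illustrative parameters.
W = [[2.0], [3.0]]
a = [0.5, 0.5]
b = [-1.0]
e = energy([1, 0], [1], W, a, b)
p = hidden_probabilities([1, 0], W, b)
```

There are no connections within a layer, which is why the hidden units are conditionally independent given the visible layer and can be evaluated one by one as above.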

Fig. 15 DBN architecture

In [42], the DBN is used to train the initial values of the NN in order to detect cracks in PV modules. The reconstructed and training images are used as the supervised data.

5.2 Ensemble learning algorithms

To create a better prediction model, the ensemble learning technique combines a range of base learner algorithms. The resulting prediction model can outperform the individual base learning algorithms by a wide margin [9, 27].

5.2.1 Stacking (stacked generalization)

Stacking has been extensively used in a variety of fields. During stacking, the outputs of the various base learner models are combined to train a new meta-learner model that produces the final output. Stacking is built on two stages of algorithms: several base learner algorithms are included in the first stage, and the meta-learner algorithm is included in the second stage, as displayed in Fig. 16. The authors in [32] employ deep neural networks, long short-term memory, and bi-directional long short-term memory as their three base learners for diagnosing PV faults. To combine the predictions of the base learners and conduct a more extensive analysis of PV arrays, they use multinomial logistic regression as a meta-learner.
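The two-stage structure of stacking can be sketched with very simple stand-in models. In this illustration the base learners are one-feature threshold classifiers and the meta-learner is a lookup table trained on held-out base predictions; both are swapped in for brevity and are not the deep networks or the multinomial logistic regression used in [32]. The data and all names are hypothetical.

```python
from collections import Counter, defaultdict


def fit_stump(xs, ys, feature):
    """Base learner: threshold one feature midway between the class means."""
    mean0 = sum(x[feature] for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x[feature] for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    cut = 0.5 * (mean0 + mean1)
    above = 1 if mean1 > mean0 else 0  # label predicted above the cut
    return lambda x: above if x[feature] > cut else 1 - above


def fit_meta(base_predictions, ys):
    """Meta-learner: most frequent true label per pattern of base outputs."""
    votes = defaultdict(Counter)
    for pattern, y in zip(base_predictions, ys):
        votes[tuple(pattern)][y] += 1
    table = {k: c.most_common(1)[0][0] for k, c in votes.items()}
    fallback = Counter(ys).most_common(1)[0][0]
    return lambda pattern: table.get(tuple(pattern), fallback)


def fit_stacking(xs, ys, n_features):
    """Stage 1 on the first half of the data, stage 2 on the held-out half."""
    half = len(xs) // 2
    bases = [fit_stump(xs[:half], ys[:half], f) for f in range(n_features)]
    held_out = [[base(x) for base in bases] for x in xs[half:]]
    meta = fit_meta(held_out, ys[half:])
    return lambda x: meta([base(x) for base in bases])


# Illustrative two-feature data: label 0 = healthy, label 1 = faulty.
xs = [(1.0, 5.0), (2.0, 6.0), (8.0, 1.0), (9.0, 2.0),
      (1.5, 5.5), (2.5, 6.5), (8.5, 1.5), (9.5, 2.5)]
ys = [0, 0, 1, 1, 0, 0, 1, 1]
model = fit_stacking(xs, ys, n_features=2)
```

Training the meta-learner on predictions for data that the base learners have not seen is the key design choice: it keeps the second stage from simply trusting base learners that overfit their own training set.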

Fig. 16 Stacking ensemble learning architecture

6 Conclusions and future work

In this study, many aspects of PV fault diagnosis, including fault classification, detection, and identification, have been surveyed through a comprehensive study of the modern literature. Such diagnosis is needed to protect PV systems from losses in power, efficiency, and reliability. The survey of PV FDD methods demonstrates the importance of thermal imaging, which is a non-destructive and simple way of detecting and locating failures effectively. Various computational methods used in PV system failure analysis were investigated, including statistical methods and artificial intelligence (AI) techniques.

PV fault diagnosis is therefore an important research topic that has the potential to be improved further in the future. Future research could focus on improving fault categorization and fault identification using hybrids of various deep learning models. It can also be extended to monitor and diagnose PV plants remotely through internet of things (IoT) and edge computing technologies. Finally, future work could focus on predicting PV faults from large datasets.