Application of Deep Learning in Photovoltaic Array Fault Identification

A photovoltaic power station cannot operate normally without its photovoltaic array, which also accounts for about half of the cost of the photovoltaic system. Failures and low efficiency are the two main causes of energy loss in photovoltaic power plants, and the failure of photovoltaic arrays is one of the most important contributors. This article uses the strengths of deep neural networks in fault identification to build a fault identification model for photovoltaic arrays.


Research Purposes
Photovoltaic power generation is a new-energy power generation technology in which light energy is converted into electrical energy by the photovoltaic effect of semiconductor materials. A photovoltaic power generation system generally comprises four parts: combiner box, photovoltaic array, controller and inverter. The photovoltaic array is composed of a large number of solar panels connected in series and packaged. The photovoltaic power generation device adds a power controller to the photovoltaic array [1].
Photovoltaic systems often suffer efficiency drops in engineering applications, mainly because photovoltaic arrays are particularly vulnerable to adverse external environments. Operating data show that interference from the external environment generally costs photovoltaic systems about 20% of their efficiency, that is, their actual efficiency is only about 80% of the total array efficiency [2]. According to research results, the main cause of low photovoltaic system efficiency is the mismatch problem caused by shading [3]. Mismatch arises when dust, aging, partial shading [4][5] and other factors make the output characteristics of photovoltaic modules deviate from their expected values, which reduces the efficiency of the photovoltaic array and significantly lowers the power generation efficiency of the photovoltaic system. Faults in the photovoltaic system must therefore be identified quickly and accurately so that the mismatch problem and the hot spot effect can be mitigated in a targeted way and the reliability of the system improved. The photovoltaic array is the core of the photovoltaic system, and its working status directly determines the output power of the system. To generate power efficiently and keep accident rates to a minimum, photovoltaic modules must be monitored and diagnosed in real time.

Photovoltaic Array Failure Analysis
Some problems in the materials, design and manufacturing process of a photovoltaic array are difficult to detect during project commissioning, but they gradually lead to failures under site conditions after a period of operation. The failure types of photovoltaic arrays are roughly as follows:
(1) Short-circuit fault. The short circuit is one of the most common faults in photovoltaic arrays. Mechanical vibration and other damage to the cells inside the array, local corrosion and insulation damage caused by adverse weather, and wiring errors can all easily cause short circuits. A large-scale short-circuit fault in a photovoltaic power station can very easily cause a fire. Regularly checking the array for short-circuit faults is therefore imperative and is the top priority when inspecting a photovoltaic power station.
(2) Open-circuit fault. An open-circuit fault inevitably breaks the current path of the array, so that normal power supply cannot be maintained and electrical energy is lost.
(3) Aging fault. When photovoltaic cells work under load for a long time, the array materials inevitably corrode and deteriorate, and aging cannot be avoided, particularly because photovoltaic power generation systems mainly operate under harsh external conditions such as beaches, deserts and remote mountainous areas. Aging of the photovoltaic array covers both the aging of the packaging materials and the aging of the photovoltaic cell itself. Degradation of the packaging material is mainly caused by long-term ultraviolet radiation and shows, for example, as yellowing of the backplane and of the cell encapsulation film (EVA). The cell itself also ages over time, to a degree that depends on the production process and the cell type. Aging is an irreversible process that not only lowers the output power of the photovoltaic array but also accelerates its damage under uninterrupted long-term use.
(4) Shading fault. Shading of a photovoltaic power station includes self-shading, temporary shading and building shading. Self-shading refers to shadows cast between neighbouring photovoltaic arrays (left, right, front and rear) or by equipment such as combiner boxes. Temporary shading refers to partial shading caused by external factors such as dust, snow, fallen leaves and bird droppings, which must be cleaned off when necessary. Building shading generally concerns surrounding vegetation, nearby buildings and the like, and careful initial planning of the power station can avoid it to the greatest extent. When a photovoltaic array remains shaded for a long time without being dealt with, the hot spot effect occurs very easily and damages the packaging material on the array surface. Although this fault is recoverable, it still reduces the output power of the photovoltaic system.

Data-driven Fault Identification Methods
At present, fault identification relies mainly on computer-based analysis. Data are acquired with sensors or other collection methods, and data-driven algorithms then compare and analyse the data to obtain the operating state. Compared with traditional methods, this type of method greatly improves the accuracy of fault identification and fault type classification. The most commonly used data-driven fault identification methods include statistical methods, signal processing techniques, and machine learning and deep learning methods.
The most common signal processing techniques for fault identification are the fast Fourier transform (FFT) and wavelet decomposition. The FFT is a fast algorithm for computing the discrete Fourier transform (DFT). Because the FFT is sensitive in the frequency domain and can promptly detect the vibration signal triggered by a fault, it is applied in fault identification. For example, in [6] the authors use the Fourier transform to obtain the spectral characteristics of the inverter's instantaneous voltage and use the spectral difference as the feature quantity for inverter fault detection and identification, with good results for normally closed faults. Wavelet decomposition decomposes a signal on the basis of the wavelet transform. It can analyse frequency content locally in time, which remedies the inability of the traditional Fourier transform to handle non-stationary signals. Wavelet decomposition also filters well: it smooths the data and removes noise and interference, that is, various unstable components, from measured data, yielding more accurate and compact data. For example, [7][8] use wavelet decomposition in data preprocessing and complete fault detection by extracting wavelet coefficients.
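The idea of using a spectral difference as a fault feature can be sketched as follows. This is a minimal illustration and not the method of [6]: a naive O(N²) DFT stands in for the FFT so that the example needs only the standard library (both give the same spectrum), and the signals, the added harmonic and the function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the normalized magnitude spectrum |X[k]|/N for k = 0 .. N/2 (naive DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_difference(reference, measured):
    """Sum of absolute differences between two magnitude spectra."""
    ref_mag = dft_magnitudes(reference)
    mea_mag = dft_magnitudes(measured)
    return sum(abs(a - b) for a, b in zip(ref_mag, mea_mag))


# Hypothetical signals: a clean 5-cycle sine, and the same sine with an
# extra 15-cycle component standing in for a fault signature.
N = 64
healthy = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
faulty = [h + 0.3 * math.sin(2 * math.pi * 15 * t / N) for t, h in enumerate(healthy)]

feature_same = spectral_difference(healthy, healthy)
feature_fault = spectral_difference(healthy, faulty)
```

A feature value close to zero indicates that the measured spectrum matches the reference, while a larger value flags a deviation; in practice a threshold or a classifier would be applied to this feature.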
In fault identification systems, statistical methods are efficient, but their disadvantage is that they rely heavily on expert knowledge. Statistical methods generally include correlation analysis, variance, mean, regression analysis, and difference analysis. Correlation analysis examines the degree of correlation between variables, and Pearson coefficient analysis [9] is the most widely used. Linear correlation is judged from the value of the correlation coefficient. The formula is:
$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{D(X)}\,\sqrt{D(Y)}}$$
where $\mathrm{Cov}(X, Y)$ is the covariance of X and Y, $D(X)$ and $D(Y)$ are the variances of X and Y, and $|\rho_{XY}| \le 1$. The larger $|\rho_{XY}|$ is, the stronger the linear correlation between X and Y, and vice versa. When $|\rho_{XY}| = 0$, there is no linear relationship between X and Y. Correlation analysis selects candidate features from the variables according to their degree of correlation, and this step usually does not require expert advice. Variance, mean, and difference analysis directly and specifically express how a sample changes. Regression analysis, linear or nonlinear, reflects how several variables change relative to one another. Literature [10] built a model for real cases and realized fault warning and identification with a linear regression algorithm.
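As a small illustration of correlation-based feature selection, the sketch below computes the Pearson coefficient as the covariance divided by the product of the standard deviations and keeps the candidate variables whose absolute correlation with a target exceeds a threshold. The data, the variable names and the threshold of 0.8 are assumptions chosen for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient: Cov(X, Y) / (sqrt(D(X)) * sqrt(D(Y)))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


def select_features(candidates, target, threshold=0.8):
    """Keep the candidate variables whose |rho| with the target reaches the threshold."""
    return [name for name, values in candidates.items()
            if abs(pearson(values, target)) >= threshold]


# Hypothetical measurements: output power against two candidate variables.
power = [10.0, 20.0, 30.0, 40.0, 50.0]
candidates = {
    "irradiance": [200.0, 400.0, 600.0, 800.0, 1000.0],
    "sensor_noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = select_features(candidates, power)
```

Here the irradiance series is perfectly linear in the power series and is kept, while the weakly correlated noise series is discarded.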
Research on machine learning and deep learning methods is relatively mature, and these artificial intelligence algorithms further advance fault identification. Machine learning explores how to give machines a learning ability similar to that of humans. It has achieved remarkable results, has been applied in many fields, and classifies well. The most common structure is binary classification, which accurately and effectively distinguishes the normal state from the fault state. To discriminate each specific fault type, multiple binary classifiers can be combined. The support vector machine (SVM) [11] is the most frequently used machine learning method in this domain. The basic SVM model is a linear classifier with the maximum margin [12]. The classifier works on data in the feature space and solves the classification problem by discriminating the fault state from the normal operating state. Literature [13] shows the advantage of the support vector machine in fault identification. Through continuous innovation, traditional neural networks overcame the gradient diffusion problem, which gave rise to deep learning methods. Abstracting the received signal layer by layer is a salient feature of deep learning [14]. Machine learning and deep learning models can be divided into shallow models and deep models. Shallow models generally contain only one or two nonlinear transformation layers, so they are limited and extract features poorly when the internal structure is complex. Deep learning approximates complex functions better: it increases the number of hidden layers and trains them layer by layer, using the training result of one layer as the training input of the next. This not only contains the gradient diffusion problem effectively but also lets the network learn deeper characteristics of the data itself. The development of deep learning algorithms has gone through several stages, and computer vision was the earliest application area. Deep learning networks perform outstandingly in feature extraction, especially in image classification and processing, so deep learning methods have many applications in equipment fault identification. For example, [15][16][17] use various deep learning algorithms to identify faults in industrial equipment. Because data in real working environments are strongly coupled, nonlinear, large in volume and of high order, conventional methods struggle to meet practical needs, and introducing deep learning methods into the fault identification of various components has become a hot topic.
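The idea mentioned above of combining several binary classifiers to discriminate specific fault types can be sketched with a one-vs-rest scheme: each binary scorer separates one class from all others, and the class with the highest score wins. The linear scorers, their hand-picked weights and the two normalized features below are purely hypothetical; in practice each scorer would be the decision function of a trained classifier such as an SVM.

```python
def linear_scorer(weights, bias):
    """Build a binary scorer: a higher score means 'this class' rather than 'rest'."""
    def score(features):
        return sum(w * x for w, x in zip(weights, features)) + bias
    return score


def one_vs_rest_predict(binary_scorers, features):
    """Return the label whose binary scorer gives the highest score."""
    return max(binary_scorers, key=lambda label: binary_scorers[label](features))


# Hypothetical, hand-picked scorers over (normalized current, normalized voltage).
SCORERS = {
    "normal": linear_scorer((1.0, 1.0), -1.5),
    "short_circuit": linear_scorer((1.0, -1.0), -0.4),
    "open_circuit": linear_scorer((-1.0, 1.0), -0.4),
}

label_normal = one_vs_rest_predict(SCORERS, (1.0, 1.0))
label_low_voltage = one_vs_rest_predict(SCORERS, (1.0, 0.2))
label_low_current = one_vs_rest_predict(SCORERS, (0.1, 1.0))
```

Adding a new fault type then only requires training one more binary scorer and adding it to the dictionary.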

Photovoltaic Cell Model Construction and Simulation
A photovoltaic array consists of multiple interconnected solar cells and is one of the most important parts of a photovoltaic system. The smallest power generation unit in a photovoltaic system is the solar cell, which absorbs light energy and generates an electromotive force through the photovoltaic effect. The current produced by a single solar cell cannot meet production needs, so the photovoltaic array connects multiple solar cells in series and parallel to increase the electrical output. Series or parallel connection does not change the basic structure of the solar cell, so the photovoltaic array can be understood by analysing a single solar cell. A solar cell is equivalent to a small nonlinear DC power supply. However, under everyday operating conditions the cell may be shaded, which is the shading fault among the main array faults described above, so this paper establishes a double-diode (double-exponential) mathematical model [18][19]. Under partial shading, the photovoltaic cell carries a negative voltage. Once the reverse voltage accumulated across the PN junction becomes too high, the current increases rapidly, and in severe cases the PN junction breaks down. In the model, $I_D$ is the current flowing through the diode (A); $I_{ph}$ is the photocurrent (A); $R_s$ and $R_{sh}$ are the series and parallel equivalent resistances (Ω).
In the formula, $I_s$ is the reverse saturation current of diode D; $V_{br}$ is the avalanche breakdown voltage (V); $A$ is the quality factor of diode D; $T$ is the reference temperature, 300 K; $\alpha$ and $n$ are avalanche breakdown characteristic constants; $q$ is the electron charge, $1.6 \times 10^{-19}$ C; $K$ is the Boltzmann constant, $1.38 \times 10^{-23}$ J/K. The bypass diode protects the cell from the negative voltage that forms under partial shading and prevents avalanche breakdown. Therefore, the IV term in equation (4.1) is generally ignored for circuits with bypass diodes. In practice, $I_{ph}$ and $I_s$ change continuously with irradiance and temperature, and the mathematical model takes this dependence into account. Based on the above mathematical model, the photovoltaic array is simulated on the Matlab/Simulink platform. The simulation diagram is shown in Figure 4.2.
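To make the role of the listed parameters concrete, the sketch below solves a simplified single-exponential cell equation, I = I_ph - I_s·(exp(q(V + I·R_s)/(A·K·T)) - 1) - (V + I·R_s)/R_sh, for the current at a given voltage. This form is an assumption: it omits the avalanche breakdown term (as the text does for circuits with bypass diodes) and is not the paper's equation (4.1) or its Simulink model. All default parameter values are illustrative, and the bisection solver is one possible choice for the implicit equation.

```python
import math

Q = 1.602176634e-19  # electron charge, C
K = 1.380649e-23     # Boltzmann constant, J/K


def cell_current(v, i_ph=8.0, i_s=1e-9, a=1.3, t=300.0, r_s=0.005, r_sh=100.0):
    """Solve the assumed single-exponential cell equation for I at voltage v.

    All default parameter values are illustrative assumptions for one cell.
    """
    if v < 0:
        raise ValueError("this sketch only covers forward voltages (v >= 0)")

    def residual(i):
        v_d = v + i * r_s  # voltage across the diode and the shunt resistance
        diode = i_s * (math.exp(Q * v_d / (a * K * t)) - 1.0)
        return i_ph - diode - v_d / r_sh - i

    # residual() decreases monotonically in i, so bisection is safe.
    hi = i_ph                # residual(hi) <= 0 whenever v >= 0
    lo = -(i_ph + 1.0)
    while residual(lo) < 0:  # widen the bracket for voltages above open circuit
        lo *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def iv_curve(voltages, **params):
    """Return (voltage, current) pairs for a list of voltages."""
    return [(v, cell_current(v, **params)) for v in voltages]


curve = iv_curve([0.0, 0.2, 0.4, 0.6, 0.7])
```

With these assumed values the current stays close to the photocurrent at low voltage and falls off steeply as the voltage approaches the open-circuit point, which is the characteristic shape of a photovoltaic I-V curve.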

Current-voltage Characteristic Analysis
The output characteristics of a photovoltaic array, that is, its current-voltage characteristics, often deviate from the ideal because of interference from shading and component aging. In particular, when the hot spot effect occurs in a photovoltaic panel, the sharp rise in temperature reduces the power supply efficiency of the array and in severe cases can easily cause the entire system to collapse. Figure 4.3 shows the simulated current-voltage curves at different temperatures, with the irradiance held at the standard value ($G_r = 1000$ W/m²). Temperature affects both the short-circuit current and the open-circuit voltage of photovoltaic cells to varying degrees. As the temperature increases, the short-circuit current increases (a positive correlation), whereas the open-circuit voltage decreases (a negative correlation).
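The two trends can be illustrated with a linear temperature-coefficient approximation of the kind found on module datasheets. This is an assumed simplification and not the paper's simulation model; the coefficient values (+0.05 %/°C for the short-circuit current, -0.3 %/°C for the open-circuit voltage) and the reference values are illustrative only.

```python
def temperature_adjusted(i_sc_ref, v_oc_ref, temp_c,
                         alpha_isc=0.0005, beta_voc=-0.003, temp_ref_c=25.0):
    """Scale the reference short-circuit current and open-circuit voltage linearly
    with the temperature difference from the reference temperature.

    alpha_isc and beta_voc are assumed relative coefficients per degree Celsius.
    """
    delta = temp_c - temp_ref_c
    i_sc = i_sc_ref * (1.0 + alpha_isc * delta)
    v_oc = v_oc_ref * (1.0 + beta_voc * delta)
    return i_sc, v_oc


# Hypothetical cell with 8.0 A and 0.6 V at 25 °C, evaluated at 60 °C.
i_sc_hot, v_oc_hot = temperature_adjusted(8.0, 0.6, 60.0)
```

Under these assumptions the short-circuit current rises slightly with temperature while the open-circuit voltage drops by a larger relative amount, so the net output power falls as the cell heats up.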

Fig.4.3 I-V curves at different temperatures
Irradiance is the energy source of the photovoltaic array, and temperature also affects the power generation efficiency of the photovoltaic power generation system. The current-voltage characteristics of photovoltaic modules change with irradiance. Figure 4.4 shows the current-voltage curves obtained under different irradiance conditions, with the temperature held at the standard value of 25°C.

DNN Structure
A deep neural network (DNN) is a multi-layer neural network structure that performs well in classification tasks [20]. It processes raw data directly, learns and extracts features automatically, and can meet the demands of various practical engineering applications. A DNN has a multi-layer structure: the first layer is the input layer, the middle layers are hidden layers, whose number can be set according to the actual situation, and the last layer is the output layer.

Fig.4.5 DNN model
Suppose that the input vector of the input layer is $X_i$, $L$ is the number of layers of the network, $u_{lj}$ is the input of the $j$th neuron of the $l$th layer, $Y_{lj}$ is the output of the $j$th neuron of the $l$th layer, and $Y_l$ is the output of the $l$th layer. With activation function $f$, the output of any neuron in any layer can be expressed as $Y_{lj} = f(u_{lj})$. Generalizing to the entire $l$th layer, the output of the layer can be expressed as $Y_l = f(u_l)$, where $u_l$ collects the inputs $u_{lj}$ of all neurons in the layer. Deep learning models generally have a multi-layer structure, most of it nonlinear, so compared with shallow models their training rate and classification accuracy are higher. The activation function provides this nonlinearity, and differentiability and monotonicity are also indispensable properties of the activation function. For example, when the activation function is the linear map $f(x) = x$, there is essentially no difference between a shallow neural network and a deep neural network, and the training rate and accuracy do not change. When the activation function is nonlinear, the neural network becomes a nonlinear model. A neural network trains well and extracts features well: by training to approximate the functional relationship between the input and the expected output, it can approximate a wide variety of complex functions. Differentiability of the activation function is required by the gradient computation in the stochastic gradient descent (SGD) algorithm, and monotonicity ensures that the error surface of a single-layer network is convex.
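The layer-by-layer computation and the remark about the linear activation can be checked with a small forward pass. The sketch assumes the usual fully connected form, in which the input of layer l is a weighted sum of the previous layer's outputs plus a bias; the weight matrices, biases and input below are hypothetical values, and ReLU is just one example of a nonlinear activation.

```python
def dense(weights, bias, inputs, activation):
    """One fully connected layer: activation(W * inputs + b), computed row by row."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, bias)]


def forward(layers, inputs, activation):
    """Feed the inputs through all (weights, bias) layers in order."""
    out = inputs
    for weights, bias in layers:
        out = dense(weights, bias, out, activation)
    return out


def identity(x):
    return x


def relu(x):
    return max(0.0, x)


# Hypothetical two-layer network: 2 inputs -> 2 hidden neurons -> 1 output.
W1, B1 = [[1.0, -2.0], [0.5, 1.0]], [0.0, 1.0]
W2, B2 = [[1.0, 1.0]], [-1.0]
LAYERS = [(W1, B1), (W2, B2)]
X = [1.0, 2.0]

# With f(x) = x the two layers collapse into one linear layer:
# W = W2 * W1 = [[1.5, -1.0]] and b = W2 * B1 + B2 = [0.0].
deep_linear = forward(LAYERS, X, identity)
single_linear = dense([[1.5, -1.0]], [0.0], X, identity)

# With a nonlinear activation the extra layer changes the result.
deep_relu = forward(LAYERS, X, relu)
```

The linear network gives the same output as the collapsed single layer, whereas the ReLU network does not, which is why a nonlinear activation is needed for depth to add expressive power.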

Identification Results and Analysis
The current-voltage characteristic curves show that irradiance and temperature affect the current and voltage of the photovoltaic array. The calculated influence of temperature on the output power P is shown in Table 5.1. Similarly, the statistics of how irradiance changes the power P are shown in Table 5.2. According to the data in the tables, both irradiance and temperature cause large changes in the output power of photovoltaic modules, so it is imperative to classify the types of module failures under various conditions. The selected DNN network structure is shown in Figure 5.1, the DNN algorithm flow chart is shown in Figure 5.2, and the test results for power varying with irradiance are shown in Table 5.3. The results in Table 5.3 show that the system accuracy is highest when current and voltage are used as the feature matrix. Figure 5.3 shows the trends of the loss function and the accuracy with current and voltage as the input.
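The two quantities tracked during training, accuracy and loss, can be computed as sketched below. The text does not state which loss function was used, so the cross-entropy loss here is an assumption, and the class probabilities and labels are made-up values for illustration.

```python
import math


def predict_labels(probabilities):
    """Pick the index of the highest class probability for each sample."""
    return [max(range(len(p)), key=p.__getitem__) for p in probabilities]


def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for p, t in zip(predicted_labels, true_labels) if p == t)
    return correct / len(true_labels)


def cross_entropy(probabilities, true_labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class (assumed loss)."""
    total = 0.0
    for probs, label in zip(probabilities, true_labels):
        total -= math.log(max(probs[label], eps))
    return total / len(true_labels)


# Hypothetical classifier outputs for four samples and three classes.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6], [0.3, 0.5, 0.2]]
truth = [0, 1, 2, 2]
acc = accuracy(predict_labels(probs), truth)
loss = cross_entropy(probs, truth)
```

A model that improves during training shows rising accuracy and falling loss on the test samples, which is the comparison the figures in this section report.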

Fig.5.2 DNN algorithm flow chart
The experiment on the influence of temperature on the output power follows the same principle as above: the current-voltage feature matrix is again fed to the DNN algorithm for training, and the accuracy obtained is 95.76%. The comparison with the traditional MLP algorithm is shown in Figure 5.4. Similarly, Figure 5.5 compares the training trends of the two algorithms under different irradiance. The DNN algorithm outperforms the MLP algorithm in both higher fault detection accuracy and lower loss on the test samples.
This article first reviews the main types of faults that occur in photovoltaic arrays. It then analyses data-driven fault identification methods and simulates the different operating states of photovoltaic modules based on the mathematical model of the photovoltaic array. The current-voltage curves under different conditions and the resulting changes in the output power P are then used to identify photovoltaic array faults. Relying on the deep neural network algorithm, this paper provides a DNN-based photovoltaic array fault identification model. The DNN method is well suited to fusing a large number of features and fits photovoltaic module fault identification well. However, this article did not try other intelligent algorithms that may perform better for the fault identification of photovoltaic arrays. Similar ideas can also be applied to more in-depth research on photovoltaic inverter circuits.