Site assessment of transformer state based on individual Raman spectrum equipment

The transformer is pivotal equipment in the power system, and its operating condition is critical to the entire power grid, so its state must be evaluated in a timely manner. This paper presents a method for on-site assessment of transformer state using individual Raman spectrum equipment. First, transformer oil samples representing different health states were collected through accelerated thermal aging tests and field sampling. The oil samples were then subjected to Raman testing, and the Raman spectral features that reflect the transformer state were extracted by multidimensional scaling analysis (MDS). An evaluation model of transformer condition based on the Raman spectra of insulating oil was constructed with random forest (RF), and the evaluation accuracy reached 90% for 20 actual samples. The results of this paper can provide technical support for the intelligent operation and maintenance of substations.


Introduction
Oil-immersed power transformers, which are mainly oil-paper insulated equipment, account for a relatively high proportion of the equipment in the power grid and carry out the key task of power transformation. Their insulation operating condition and aging state underpin the power industry's ability to safeguard normal production and daily life [1]. Therefore, it is of great importance to assess the transformer state in a timely manner.
Because of the complex power environment and the complex internal structure of transformers, transformer aging results from a combination of factors. As the power industry advances, so do its technical means, yet detecting the transformer state more efficiently remains a difficult problem. The degree of polymerization (DP) of insulating paper is currently the most direct and reliable quantity for characterizing the degree of transformer aging [2]. However, its complex test procedure limits its application in practical engineering. Since many aging markers are generated and dissolved in the insulating oil during transformer aging, most transformer condition assessment studies have focused on the insulating oil [3]. Some studies have shown that DP can be calculated from the furfural content using a logarithmic equation, so some scholars have studied methods for detecting the furfural content in oil [4]. However, because furfural is volatile and its content is easily affected by chemical reagents such as additives, inferring the transformer state from the furfural content in oil alone has certain limitations [5].
Raman spectroscopy is a detection and diagnosis technology that has emerged in recent years and has been widely used in fields such as the petroleum industry and jewelry identification [6]. In transformer condition assessment, Raman spectroscopy can capture information about molecular changes in the insulating oil of transformers during the aging process [7]. Combined with intelligent algorithms, it makes it possible to establish a mapping between spectral information and transformer states.
On this basis, this paper proposes a method for field evaluation of transformers using individual Raman spectrum equipment installed in the operation and maintenance vehicle of a substation. First, transformer oil samples with different health states were collected through accelerated thermal aging tests and field sampling. The oil samples were then subjected to Raman testing, and the Raman spectral features that reflect the transformer state were extracted by multidimensional scaling analysis (MDS). The evaluation model of the transformer state based on the Raman spectra of insulating oil was constructed with random forest (RF).

Accelerated thermal aging test and individual Raman spectrum equipment
Although the operating environment of a transformer is very complex, in actual operation transformer aging is dominated by thermal and electrical stresses. For transformers without electrical stress failure, thermal aging plays a more important role than electrical aging. In this paper, the samples were prepared by an accelerated thermal aging test following IEEE guidelines [8] and the work of Yang et al. [9].
In essence, Raman spectroscopy characterizes inelastic light scattering, in which molecules excited by the incident light do not return to their original energy state. The individual Raman spectrum equipment used in this paper mainly includes a laser, spatial filters, notch filters, a spectrometer, and a charge-coupled device (CCD). The required wavelength region of the Raman scattered light from the sample is separated by the spectrometer, detected by the CCD, and then transmitted to the computer in the cockpit of the operation and maintenance vehicle for display and analysis. Because of the strong bonding force between the atoms in the oil molecules, the internal vibration Raman spectrum lies at relatively high frequencies, generally within 800 to 3200 cm$^{-1}$, so this band is selected as the main band in this study. Figure 1 shows the Raman spectra of the prepared samples with the laser power, exposure time, and number of accumulations set to 300 mW, 0.3 s, and 10, respectively.
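The band selection described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names are hypothetical, and the min-max normalization is an assumption (the paper only states later that MDS operates on a normalized matrix).

```python
import numpy as np


def select_band(shift_cm1, intensity, low=800.0, high=3200.0):
    """Keep only the points whose Raman shift lies in [low, high] cm^-1."""
    shift_cm1 = np.asarray(shift_cm1, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = (shift_cm1 >= low) & (shift_cm1 <= high)
    return shift_cm1[mask], intensity[mask]


def min_max_normalize(intensity):
    """Scale one spectrum to [0, 1]. Assumption: the paper does not name its normalization."""
    intensity = np.asarray(intensity, dtype=float)
    span = intensity.max() - intensity.min()
    if span == 0:
        # A flat spectrum carries no shape information; return zeros.
        return np.zeros_like(intensity)
    return (intensity - intensity.min()) / span
```

Each cropped and normalized spectrum would then form one row of the matrix passed to the dimensionality reduction step.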

Multidimensional scaling analysis
MDS consists of three main steps: constructing the distance matrix, calculating the inner product matrix, and calculating the low-dimensional matrix [10].
Step 1: Construction of the distance matrix. For each vector $x_i$ in the normalized matrix $X$, the Euclidean distance $d_{ij}$ between $x_i$ and $x_j$ is calculated to obtain the distance matrix $D \in \mathbb{R}^{N \times N}$. $d_{ij}$ is defined as follows:
$$d_{ij} = \sqrt{\sum_{l} \left(x_{il} - x_{jl}\right)^2} \qquad (1)$$
where $x_{il}$ and $x_{jl}$ are the $l$-th elements of $x_i$ and $x_j$, respectively.
Step 2: Calculation of the inner product matrix. The distance matrix $D$ is transformed into the inner product matrix $B$ with the following transformation equation:
$$B = -\frac{1}{2} J D^{(2)} J \qquad (2)$$
where $D^{(2)}$ is the matrix of squared distances $d_{ij}^2$ and $J$ is the centering matrix, calculated as follows:
$$J = E - \frac{1}{N} e e^{T} \qquad (3)$$
where $E$ is the identity matrix and $e$ is the $N$-dimensional all-ones vector.
Step 3: Calculation of the low-dimensional matrix. Because the inner product matrix $B$ is symmetric and positive semidefinite, it can be decomposed into the following form:
$$B = S V S^{T} \qquad (4)$$
where $V$ is the diagonal matrix of the singular values of $B$, and $S$ is the matrix of the corresponding singular vectors. The low-dimensional matrix $G$ of matrix $X$ is then obtained by extracting the first $d$ column vectors of matrix $Z$, which is calculated as follows:
$$Z = S V^{1/2} \qquad (5)$$
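The three steps above can be sketched with NumPy as follows. This is an illustrative sketch of classical MDS under stated assumptions, not the authors' implementation: the function names are hypothetical, the double centering is applied to squared distances, and `numpy.linalg.eigh` is used for the decomposition because, for a symmetric positive semidefinite matrix, the eigendecomposition coincides with the singular value decomposition.

```python
import numpy as np


def euclidean_distance_matrix(X):
    """Step 1: pairwise Euclidean distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    sq_norms = (X ** 2).sum(axis=1)
    # ||xi - xj||^2 = ||xi||^2 + ||xj||^2 - 2 xi.xj; clip tiny negatives from rounding.
    sq_dist = np.clip(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0, None)
    return np.sqrt(sq_dist)


def classical_mds(X, d):
    """Reduce the rows of X to d dimensions with classical MDS."""
    D = euclidean_distance_matrix(X)
    n = D.shape[0]
    # Step 2: double centering of the squared distances gives the inner product matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    B = (B + B.T) / 2.0  # remove rounding asymmetry before the decomposition
    # Step 3: decompose B and keep the d leading components.
    eigvals, eigvecs = np.linalg.eigh(B)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    Z = eigvecs * np.sqrt(eigvals)              # Z = S V^(1/2)
    return Z[:, :d]
```

For example, `classical_mds(spectra, 8)` would return one 8-dimensional feature vector per spectrum, where `spectra` is a hypothetical array holding one normalized spectrum per row.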

Random Forest
RF is mainly divided into three steps: constructing training sample sets based on Bagging or Boosting, building decision trees whose split attributes are selected from random feature subsets, and combining the results of the decision trees by majority voting [11].
Step 1: Building the training sample set. The Bagging method draws unweighted samples from the sample set with replacement. It is mainly concerned with reducing the variance and focuses on the stability of the prediction. The Boosting method obtains a group of training sample sets, assigns a weight to each sample set, and samples with replacement. Initially, all sample sets are given the same weight. The model is then tested on each sample set, and the weights of the sample sets with low prediction accuracy are increased while those of the sample sets with high prediction accuracy are decreased. Finally, the weights are updated iteratively to obtain the set of weights that maximizes the prediction accuracy.
Step 2: Building the decision tree. A decision tree is constructed for each training sample set obtained in Step 1. When a node is split in a decision tree, some attributes are randomly selected from the feature attributes to form an attribute subset for node splitting, and the number of attributes in the subset, $L_{sub}$, satisfies:
$$L_{sub} = \log_2 L \qquad (6)$$
where $L$ is the number of feature attributes.
Gini impurity is used as the evaluation criterion, with the following mathematical expression:
$$\mathrm{Gini}(\Lambda) = 1 - \sum_{j=1}^{y} \lambda_j^2$$
where $\Lambda$ is the sample set, $\lambda_j$ is the proportion of samples in the $j$-th category, and $y$ is the total number of categories. The Gini index of the $m$-th branch node in the attribute set is computed from this impurity.
Step 3: Voting decision. After training on each sample set, each decision tree predicts the original sample set, so $k$ decision trees give $k$ decision results for one sample. The majority voting method selects as the predicted result the category that receives the most votes:
$$H(x) = \theta_{j^*}, \qquad j^* = \arg\max_{1 \le j \le y} \sum_{i=1}^{k} \mathbb{1}\left[h_i(x) = \theta_j\right]$$
where $h_i(x)$ is the result of the $i$-th decision tree for sample $x$, $j$ is the index of the category, $\theta_j$ is the labeled category, and $y$ is the number of categories.
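The building blocks of the three RF steps can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' model: all function names are hypothetical, only the Bagging variant of Step 1 is shown, the logarithm in the subset-size rule is taken as base 2 and rounded down (both assumptions), and the tree-growing procedure itself is omitted.

```python
import math
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Step 1 (Bagging): draw len(samples) items with replacement, unweighted."""
    return [rng.choice(samples) for _ in samples]


def feature_subset_size(num_features):
    """Step 2: size of the random attribute subset.

    Assumption: log base 2 of the feature count, rounded down, at least 1.
    """
    return max(1, int(math.log2(num_features)))


def gini_impurity(labels):
    """Step 2: 1 minus the sum of squared category proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def majority_vote(tree_predictions):
    """Step 3: return the category predicted by the largest number of trees.

    Tie handling is left to Counter.most_common and is not specified in the paper.
    """
    return Counter(tree_predictions).most_common(1)[0][0]
```

With the 8 MDS features used later in the paper, `feature_subset_size(8)` gives 3 candidate attributes per split under these assumptions.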

Results and discussion
In this paper, real oil sample data were collected in a substation using individual Raman spectrum equipment. In a real substation, the DP of the insulation paper inside the transformer cannot be measured, so the aging state can only be calibrated by other parameters (years of operation, operation and maintenance records, and the color and temperature of the oil). None of the transformers operating in the substation has reached the level of aging that requires decommissioning. The Raman spectra of the oil samples collected from operating transformers show that operating age is the factor most closely related to the degree of aging. Although the power environment is becoming more and more complex, the quality of transformers is also improving, and the power industry attaches great importance to real-time monitoring and regular testing of transformers, so most in-service transformers are not deeply aged.
Comparing Figure 2 with Figure 1 shows that the difference between the Raman spectra of the real oil samples and those of the prepared oil samples is small, so it is feasible to combine the data of the two kinds of oil samples to establish a transformer condition assessment model. The baseline of the oil samples in Figure 2 is lower than the baseline of the oil sample aged for 30 days in Figure 1, even for transformers that have been in operation for more than 15 years. The data of the prepared oil samples span a wider range than those of the real oil samples, because severely aged transformers are not allowed to continue operating in practice.
In this paper, MDS is used to reduce the dimension of the original Raman spectra, and the dimension reduction effect is analyzed by self-validation. Figure 3 shows that the self-validation accuracy exceeds 95% when the original data are reduced to 8 dimensions. Following [3], the transformer state is divided into four stages (I: healthy, II: good, III: attention, IV: returned), which are taken as the model output. Of the 240 sets of samples collected (prepared samples: 200 groups; real samples: 40 groups), 180 sets were randomly selected to train the RF model. As mentioned earlier, since the actual transformers have operated normally from commissioning until now, they can be judged to be subject mainly to thermal stress aging. Therefore, the difference between the spectra of the real and prepared samples is small, and the two kinds of samples can be combined directly for modeling without correction.
The evaluation results for the remaining samples are shown in Figure 4. The height of each bar represents the diagnostic accuracy for the corresponding aging stage, and the label above it gives the number of correct diagnoses and the total number of samples for that stage. For example, the second column in Figure 4(a) shows that the diagnostic accuracy of the proposed method for aging state II on the test set is 86.7%. As can be seen in Figure 4(a), the accuracy of the proposed model on the test set is 93.3%. All samples of aging state IV were diagnosed correctly, and the remaining three aging states all showed one incorrectly diagnosed sample. Aging state IV means that the transformer has major aging defects and should be taken out of service. It is very important that state IV be identified correctly in every case: the method then never judges a transformer that should be taken out of service to be equipment that can continue to operate, which avoids the huge economic losses caused by transformer damage in operation.
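The random split and the per-stage accuracy reported in Figure 4 can be sketched as follows. This is an illustrative sketch only: the function names and the seed are hypothetical, the sizes 240 and 180 come from the text, and any labels passed to it in an example are toy values rather than the paper's data.

```python
import random
from collections import defaultdict


def random_split(n_samples, n_train, seed=0):
    """Randomly split sample indices into a training part and a test part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]


def per_stage_accuracy(true_stages, predicted_stages):
    """Return {stage: (number correct, total)} as shown on the bar labels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(true_stages, predicted_stages):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {stage: (correct[stage], total[stage]) for stage in total}


def overall_accuracy(true_stages, predicted_stages):
    """Fraction of samples whose predicted stage matches the true stage."""
    pairs = list(zip(true_stages, predicted_stages))
    return sum(t == p for t, p in pairs) / len(pairs)
```

Calling `random_split(240, 180)` leaves 60 indices for testing, which is the size of the held-out set implied by the sample counts above.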
To verify the practical engineering application effect of the proposed method, we also carried out blind sample verification in an actual substation. All the blind samples for practical engineering application are from the substation of Guangzhou Power Supply Bureau of Guangdong Power Grid Co., Ltd., China Southern Power. In practical applications, the operation and maintenance vehicle equipped with individual Raman spectrum equipment travels to the transformer. The oil samples collected from the transformer are then placed directly into the individual Raman spectrum equipment for Raman detection. Finally, the collected data are transferred to the computer in the cockpit of the operation and maintenance vehicle and evaluated using the proposed method. The evaluation results are shown in Figure 4(b). Since no transformers belonging to aging state IV are in operation in the grid, there are no samples of aging state IV in practical engineering applications. As can be seen from Figure 4(b), most of the transformers in operation belong to aging state I and aging state II. All samples of aging state III were diagnosed correctly, and the remaining two aging states each showed one incorrectly diagnosed sample. The diagnostic accuracy of this method in engineering applications (90%) is lower than on the test set, mainly because the internal molecular conditions of real oil samples are more complex, and because some transformers undergo oil treatment during operation and maintenance, which makes the trend of the transformer state differ before and after treatment. However, the accuracy is sufficient for practical engineering needs.
Through practical application, we also found that the effect of electromagnetic interference on the individual Raman spectrum equipment at the substation site is almost negligible. In addition, Raman measurements are very susceptible to temperature and light, but the equipment is fitted with a semiconductor cooling module and a shading plate that effectively solve these problems. From sampling to completion of testing, the whole process can be kept within 2 minutes, which greatly improves the efficiency of operation and maintenance in the power industry. The individual Raman spectrum equipment has a precision optical path built in, and small changes in that path can cause large detection errors, while the ground at the transformer site is often uneven. Because the equipment is installed in the operation and maintenance vehicle, where both testing and analysis are carried out, the influence of uneven ground on the field evaluation of transformer condition is avoided.

Conclusion
This paper uses individual Raman spectrum equipment to conduct Raman spectrum tests on prepared samples and real samples. The 8-dimensional Raman spectral features that reflect the transformer state were extracted by MDS. The random forest method was used to build the transformer state evaluation model based on the Raman spectrum. The model was tested and verified on laboratory-prepared samples and on oil samples from transformers in service. The diagnostic accuracy of this method was 93.3% on the test set and 90% on the blind field samples. The research results of this paper can provide technical support for the intelligent operation and maintenance of substations.

Figure 1. Raman spectra of transformer oil with different aging days

Figure 2. Raman spectra of transformer oil with different years of operation

Figure 3.

Figure 4. Evaluation results of (a) test set and (b) engineering application