Study on fault diagnosis method for nuclear power plant based on hadamard error-correcting output code

The technology of real-time fault diagnosis for nuclear power plants(NPP) has great significance to improve the safety and economy of reactor. The failure samples of nuclear power plants are difficult to obtain, and support vector machine is an effective algorithm for small sample problem. NPP is a very complex system, so in fact the type of NPP failure may occur very much. ECOC is constructed by the Hadamard error correction code, and the decoding method is Hamming distance method. The base models are established by lib-SVM algorithm. The result shows that this method can diagnose the faults of the NPP effectively.


Introduction
The technology of real-time fault diagnosis for NPP has great significance to improve the safety and economy of reactor. At present, many diagnostic methods are successfully used in nuclear power plant fault diagnosis [1][2]. As compared with the single classifier, the ensemble learning has higher accuracy and better generalization ability, so in recent years, the ensemble learning has been deeply researched in theory, and it has also been applied in the equipment fault diagnosis.
In the process of ensemble learning, how to generate a set of difference based models is a very important research content. And the methods, which are used to generate difference base models, include bagging, boosting, attribute combination and Error-correcting Output Coding (ECOC).
Different training subsets are obtained by extracting different subsets of attributes in attribute combination method. The author has done some work in this field. The model with good generalization ability is obtained by this method.
The model can still work properly in the absence of some parameters. However, this method is based on the fact that the parameters are redundant. But nuclear power plant is a complex system, Many kinds of faults may occur in the actual operation of nuclear power plants. So there are not so many redundant parameters. Therefore, in this paper, the sample set of the basic model by ECOC. The basic model is built by SVM algorithm.

SVM
SVM is an effective approach for pattern recognition. It is a machine learning method based on structural risk minimization principle [3]. SVM is a binary classifier, by solving a quadratic programming problem, we can find an optimal hyperplane that separates the two training data, and the SVM is constructed. The optimal hyperplane is that not only separates the two types of data correctly, but also makes the interval between two classes maximum. When a new sample x is introduced, its class is determined by the following decision function: is also the core of SVM. The linearly indivisible samples are mapped into high-dimensional space by the kernel function, which makes these samples linearly separable in high-dimensional space. SVM can effectively solve the problem of binary classification. At present, the commonly used multiclass SVM methods are one-versus-rest (1 V R), one-versus-one (1 V 1), Directed Acyclic Graph SVMS, Error Correcting Output Codes SVMS (ECOC -SVMS), etc. In this paper, ECOC algorithm is used to realize multi-class SVM.

ECOC
ECOC multi-class classification framework uses a binary or ternary coding matrix to achieve multiclass class decomposition and base classifier integration [4]. In the Coding matrix The binary code is represented by {-1, + 1}, and ternary code is represented by { -1, 0, +1}. Where "-1" represents one class, "+ 1" represents another, and "0" indicates that the corresponding class of the codeword bit is ignored in the second-class partition formed. Figure1shows four common ECOC classification system diagram, namely: Figure 1(a) 1-V-R, Figure 1 Figure 1(c) Dense Random, Figure 1 Each row of the code array of Fig. 1 represents a codeword of a certain class, where is class label. Each column represents a second class partition of the sample. The symbols "1", "-1" and "0" are respectively expressed in white, black and gray. In the training phase, the training samples of each base classifier are transformed into two classes according to their corresponding columns in the coding matrix. And then two classes of classifiers corresponding to this column are trained respectively. In the test phase, a test sample X is given, and these classifiers are used to classify X. The result is a codeword vector . Finally, the encoding matrix is decoded according to a certain decoding rule, and the final classification results are obtained by decoding. The key to multi-class classification using ECOC is to determine the effective coding matrix and decoding strategy.

Hadamard error correction code
The principle of how to design a good ECOC coding matrix is as follows： 1) The Hamming distances between each row are maximized as far as possible, so that the error correction ability of ECOC is stronger.
2) The Hamming distances between the column and the other columns and their supplements are maximized as much as possible, so that the difference between the base classifier is larger.
Based on the above principles, Dietterich and Bakiri give four kinds of ECOC coding methods: exhaustive codes，column selection from exhaustive codes，randomized hill climbing，BCH codes.
However, the error correction code obtained by this method has some drawbacks: 1) There is no algorithm for any class, and when the number of classes is changed, the algorithm must be replaced accordingly.
2) Some algorithms are very complex.
3) When the number of classes increased, the code length of the error correction code is longer. So It is difficult to store parameters and run in real time.
In this paper, ECOC is constructed by the Hadamard matrix. The Hadamard matrix is a binary code matrix. The order of the coding matrix can only be H . Complement operation, that is, elements "-1" and "1" interchange. The biggest characteristic of Hadamard matrix is that any two row or two columns are orthogonal. For the Hadamard matrix of order N, the Hamming distance between each row is 2 / N . So the Hadamard matrix has good discriminative performance. However, since its first row is all "-1", and the order of the coding matrix can only be , so in practice, the Hadamard matrix needs to be modified in order to get a satisfactory Hadamard error correction code Step 1 Determine the order of the Hadamard matrix. If Step 2 The Hadamard matrix is generated. The Hadamard matrix i H 2 of order i 2 is generated according to (3).
Step 3 the first column of the Hadamard matrix Step 4 The first k rows of the resulting matrix are taken according to the class number k. The error correction code matrix which is The error correction code matrix can be generated by the above method corresponding to the multi class classification problem of any class. Simple and fast, and the code length is shorter. So the storage capacity is reduced and the operating efficiency is improved.

Decoding strategy
Hamming distance decoding is one of the most common decoding methods; its decoding rule is simple, high accuracy, and has a certain error correction capability. In this paper, Hamming distance method is selected as the decoding method.

Simulation Analysis
The typical faults of NPP which are chosen for analysis are as follows: loss of coolant accident (LOCA), feed water pipe rupture, steam generator tube rupture (SGTR).
The sample set is as table 1. The sample set includes training sample set and test sample set. There are 48 characteristic parameters of NPP, such as water level of pressurizer, flow, pressure and temperature of main steam etc.

Table1. Information of Sample Set
The magnitude of the parameters varies widely. So the parameters must be normalized. At First the parameters are linearly normalized as follows: The sample set contains data on 4 faults and normal operating states. The error correction code matrix of 5 classes is as (5)  We test 1v1 method and ECOC algorithm, the diagnostic results is shown in Table 2. The test results show that ECOC algorithm based on Hamming codes has higher accuracy, but the difference is not great. Since we only collected four classes of faults, so the sample set contains only five classes. In the 5-class problem, 1V1 algorithm needs to build 10 base models, and Hamming code ECOC algorithm to build 7 base models. Therefore, when the number of classes is small, there is not much difference between 1V1 algorithm and ECOC algorithm in the implementation of the efficiency. When the actual application of the fault diagnosis model is established, the number of classes

Summary
The fault samples are collected from the NPP simulator of QINSHAN NPP Unit 1. ECOC is constructed by the Hadamard matrix, and Hamming distance method is selected as the decoding method. The base models are established by lib-SVM algorithm. The test results show that ECOC algorithm based on Hamming codes has a little higher accuracy. When the number of classes increased, the ECOC algorithm based on Hamming codes will show the obvious advantages of the implementation of efficiency. So the algorithm has a certain practical value, and it is very suitable for nuclear power plant fault diagnosis.