Valve flow profile optimization based on data mining

In this paper, a valve flow curve optimization technique based on data mining is proposed. First, the data mining technique is described. Then, a valve flow curve optimization method based on improved cluster analysis is proposed. Finally, the effectiveness of the method is verified through experiments. The results show that the valve flow curve optimization technique based on data mining can effectively improve the performance and reliability of the valve.


Introduction
Valves are among the most important control elements in thermal power units. They regulate fluid flow and pressure and play a crucial role in ensuring the safe operation and stable power generation of thermal power units [1].
The turbine valve flow characteristic parameters are a set of parameters that reflect the correspondence between the inlet steam flow and the turbine high-pressure regulating valves [2]. When the valve flow characteristic parameters set in the digital electro-hydraulic (DEH) control system of the turbine match the actual valve flow characteristic, the turbine shows good control performance. Otherwise, problems such as valve oscillation and large load fluctuations during single-valve/sequence-valve switching may occur. During load changes and primary frequency regulation, the load may change abruptly or respond slowly. In some cases this even causes power system oscillation accidents, which makes the unit difficult to control and affects its safety, stability and economic performance. Therefore, the valve flow characteristic (VFC) parameters need to be optimized [3].
To solve these problems, a valve flow curve optimization technique is proposed. The technique builds on data mining and performs the valve flow curve optimization calculations on a large amount of historical operating data. On this basis, the application of data mining to valve flow curve optimization is implemented.

Data mining
Data mining (DM) is the discovery of specific, comprehensible, predictable and potentially useful information and related knowledge from massive amounts of data. The original data are often large in volume, incomplete, noisy and drawn from complex practical applications. In a broad sense, any useful rule obtained from the data can count as a result of data mining, so its scope is very wide [4][5]. See Figure 1 for details. In this process, data preparation comprises three parts: data selection, data quality analysis and data pre-processing. Data exploration is a preliminary study of the data. Model building is the core of data mining: the algorithm to be used is determined in this step, and the chosen algorithm is then used to derive the model parameters and thus obtain the final model.

Data pre-processing
The control system of a thermal power unit is very large. The massive amount of unit operation data provides the basic data for calculating the turbine regulating VFC and acquiring the valve curve. These data include unit power, valve flow command, actual opening of each valve, main steam pressure, main steam temperature, regulating stage pressure and other related operating parameters. However, before the historical data are used for valve flow characteristic parameter optimization, they must be screened to obtain the operation data that reflect the characteristics of the unit.
Abnormal and redundant data caused by limited measurement accuracy and other factors are eliminated using the t-test method and the isolation forest algorithm. The main steam temperature, main steam pressure and unit load are used as the basis for judging whether the unit is in a stable operating condition: the standard deviation of each of these three parameters is computed over consecutive time windows and compared with a set deviation limit. When the data remain within the limits for at least three sampling cycles, the unit is judged to be in a thermally stable condition.
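The stability criterion can be illustrated with a minimal sketch. The function name `steady_mask`, the sliding window length and the deviation limits are assumptions made for illustration; the paper does not state them, and "three sampling cycles" is read here as three consecutive windows within the limits.

```python
from statistics import pstdev


def steady_mask(temp, pressure, load, window, limits, min_cycles=3):
    """Flag the samples that belong to a thermally stable stretch.

    temp, pressure, load: equally long lists of samples.
    window: samples per sliding window (assumed value, not from the paper).
    limits: (temp, pressure, load) standard-deviation limits (assumed values).
    min_cycles: consecutive in-limit windows required (three in the text).
    """
    n = len(load)
    series = (temp, pressure, load)
    in_limit = [False] * n
    for end in range(window - 1, n):
        start = end - window + 1
        # All three parameters must stay within their deviation limit.
        in_limit[end] = all(
            pstdev(s[start:end + 1]) <= lim for s, lim in zip(series, limits)
        )
    stable = [False] * n
    run = 0
    for i, ok in enumerate(in_limit):
        run = run + 1 if ok else 0  # length of the current in-limit run
        stable[i] = run >= min_cycles
    return stable


# Hypothetical data: constant load followed by a ramp.
load = [500.0] * 8 + [520.0, 540.0, 560.0, 580.0]
temp = [566.0] * 12
pressure = [24.0] * 12
mask = steady_mask(temp, pressure, load, window=3, limits=(1.0, 0.1, 2.0))
```

Only the samples flagged `True` would be passed on to the outlier removal and clustering steps.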

Improved clustering algorithm
Data mining algorithms are implemented mainly by constructing models, each targeted at analysing problems with different and unrelated characteristics.
The K-means algorithm is a machine learning method for data mining that builds models for tasks such as pattern classification and clustering [6]. It proceeds as follows. First, a number of operating data points are selected according to the magnitude of the data as the initial cluster centroids. The Euclidean distance is adopted as the distance metric, and each operating data point is assigned to the nearest centroid. Then the centroid of each cluster is updated. Finally, the assignment and update steps are repeated until the centroids no longer change, at which point the cluster analysis is finished.
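The assignment and update steps can be sketched in a few lines. This is a generic K-means illustration, not the authors' implementation; the function name `kmeans`, the iteration cap and the sample points are assumptions, and the initial centroids are simply passed in by the caller.

```python
import math


def kmeans(points, centres, max_iter=100):
    """Plain K-means with Euclidean distance and caller-supplied initial centroids."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centres)), key=lambda k: math.dist(p, centres[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centres = []
        for k in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centres.append(centres[k])  # keep an empty cluster's centroid
        if new_centres == centres:  # centroids unchanged: converged
            break
        centres = new_centres
    return labels, centres


# Hypothetical two-dimensional operating points.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centres = kmeans(points, centres=[(0, 0), (10, 10)])
```

Running the same data with different initial centroids can give a different partition, which is the sensitivity that motivates the density-based improvement.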
In practical applications, different choices of the initial centres lead directly to different results. For this reason, the density peak clustering algorithm, which combines distance and density, is introduced. The algorithm proceeds as follows:
(1) Calculate the local density ρi of each point, where dij denotes the distance between points i and j and dc denotes the neighbourhood radius.
(2) Calculate the cluster-centre distance δi of each point: the points are arranged in descending order of local density, and the cluster-centre distance of the point with the highest density is set to its distance from the point furthest away from it. In the standard density peak formulation, δi of every other point is the minimum distance to any point of higher density.
(3) Determine the cluster centres from the decision value ri = ρi × δi; points with a large ri are taken as centres.
(4) Determine the categories of the remaining points: once the cluster centres are fixed, each remaining point takes the label of the nearest point whose density is higher than its own.
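The four steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the local density uses the cutoff kernel (the number of neighbours closer than dc), which the paper does not specify, the number of clusters is given by the caller, and ties in density are broken by index. The name `density_peaks` and the sample points are hypothetical.

```python
import math


def density_peaks(points, dc, n_clusters):
    """Density peak clustering with a cutoff-kernel density (assumed kernel)."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    # (1) Local density: number of other points closer than dc.
    rho = [sum(1 for j in range(n) if j != i and d[i][j] < dc) for i in range(n)]
    # Descending density order; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    # (2) Cluster-centre distance and nearest higher-density neighbour.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(d[i])  # densest point: distance to the farthest point
        else:
            j = min(order[:rank], key=lambda k: d[i][k])
            delta[i] = d[i][j]
            parent[i] = j
    # (3) Decision value r_i = rho_i * delta_i; the largest values become centres.
    r = [rho[i] * delta[i] for i in range(n)]
    centres = sorted(order, key=lambda i: -r[i])[:n_clusters]
    # (4) Remaining points inherit the label of their higher-density neighbour.
    labels = [-1] * n
    for cluster_id, c in enumerate(centres):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centres


# Hypothetical data: two compact groups, each with one dense middle point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels, centres = density_peaks(points, dc=1.0, n_clusters=2)
```

Unlike K-means, the result here does not depend on randomly chosen initial centres; it depends only on dc and the requested number of clusters.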

Application of data mining in valve flow curve optimization
A 660 MW unit in a region is selected as the research object. It is equipped with four high-pressure regulating valves, and the valve opening sequence is CV1/2→CV3→CV4. The valve parameters are taken from the operation database of the DCS.
First, the stable operating condition data are screened out, and the t-test method and the isolation forest algorithm are used to eliminate the abnormal data. Then the improved clustering algorithm is used to cluster the stable operating condition data; the results are shown in Figure 2. The data pre-processing does not noticeably change the trend of any unit parameter or distort the data, which shows that the pre-processing method is feasible and effective.
Using the data after initial screening, the relationship between the turbine steam flow rate and the valve position command (VPC) is calculated with the Flügel formula, as shown in Figure 3.
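For illustration, the general form of the Flügel formula for a stage group relates the flow at an operating point to the flow at a reference point through the inlet and outlet pressures and the inlet temperature. The sketch below assumes this general form; the paper does not state which measured pressures were used or how the reference condition was chosen, so the function name, arguments and numbers are hypothetical.

```python
import math


def flugel_flow_ratio(p1, p2, t1, p1_ref, p2_ref, t1_ref):
    """Relative flow G / G_ref through a stage group (Flügel formula).

    p1, p2: absolute inlet and outlet pressures of the stage group.
    t1: inlet temperature in kelvin.
    *_ref: the same quantities at the reference condition.
    """
    pressure_term = (p1 ** 2 - p2 ** 2) / (p1_ref ** 2 - p2_ref ** 2)
    return math.sqrt(pressure_term) * math.sqrt(t1_ref / t1)


# Hypothetical values: both pressures halved at unchanged temperature.
ratio = flugel_flow_ratio(p1=10.0, p2=2.0, t1=800.0,
                          p1_ref=20.0, p2_ref=4.0, t1_ref=800.0)
```

With both pressures halved and the temperature unchanged, the relative flow is one half, which reflects the near-proportionality between stage-group inlet pressure and flow that makes such measurements usable as a flow indicator.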
Figure 3 shows that, during unit operation, the turbine flow rate and the integrated VPC have an obviously non-linear relationship. When the integrated VPC of the unit is below 68%, the actual steam flow into the turbine increases rapidly. When the integrated VPC is above 68%, the actual steam flow into the turbine increases slowly with the integrated VPC. Accordingly, the VFC set before optimization is clearly unreasonable. Data mining is then carried out on the initially screened data using the density-based clustering method. Figure 4 gives the relationship between the turbine flow rate and the integrated VPC under various operating conditions obtained after data mining.
Figure 4 shows that data mining further reduces the amount of data remaining after the initial screening. Moreover, the turbine flow rates obtained under the various operating conditions are not distorted, which demonstrates the feasibility and effectiveness of the method.
The optimized turbine valve flow characteristic function curve is shown in Figure 5. The difference between the valve flow characteristic curves before and after optimization is obvious. When the integrated VPC is below 83%, the optimized valve flow characteristic curve is flatter than the original. When the integrated VPC is above 83%, the optimized curve is steeper than the original curve. This is more in line with the actual flow characteristics of the turbine valves. Accordingly, the controllability of the turbine under different loads and the stability of operation can be significantly improved.
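One way to turn a set of mined (integrated VPC, flow) points into a usable characteristic function is a piecewise-linear lookup. The sketch below assumes that representation; the paper does not describe how the optimized curve is stored, and the function name `make_flow_curve` and the breakpoints are hypothetical values chosen only to show the mechanics.

```python
from bisect import bisect_right


def make_flow_curve(vpc_points, flow_points):
    """Build a piecewise-linear flow curve from (VPC, flow) breakpoints."""
    pairs = sorted(zip(vpc_points, flow_points))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def curve(vpc):
        # Clamp outside the breakpoint range.
        if vpc <= xs[0]:
            return ys[0]
        if vpc >= xs[-1]:
            return ys[-1]
        k = bisect_right(xs, vpc)  # xs[k - 1] <= vpc < xs[k]
        x0, x1 = xs[k - 1], xs[k]
        y0, y1 = ys[k - 1], ys[k]
        return y0 + (y1 - y0) * (vpc - x0) / (x1 - x0)

    return curve


# Hypothetical breakpoints in percent, not taken from the paper.
curve = make_flow_curve([0.0, 50.0, 100.0], [0.0, 80.0, 100.0])
flow_at_25 = curve(25.0)
flow_at_75 = curve(75.0)
```

In practice the breakpoints would be the cluster centres obtained from data mining, one per operating condition.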

Conclusion
In this paper, a data mining algorithm is presented for obtaining the actual flow characteristic curve of the regulating valve group of a steam turbine in hybrid valve mode. The data mining process uses the t-test method, the isolation forest method and the improved clustering algorithm to obtain the actual flow characteristic curve of the regulating valve group, and the optimization of the valve management curve is finally achieved. This method overcomes the shortcomings of the traditional test method and reduces the workload.

Figure 1. Schematic diagram of data mining process.

Figure 2. Comparison of important parameters of the unit before and after data pre-processing.(a) Before preprocessing; (b) After preprocessing.

Figure 4. Relationship between actual inlet steam flow and integrated valve position after data mining.