Review of Recent Development of Dynamic Wind Farm Equivalent Models Based on Big Data Mining

Recently, big data mining methods have been applied to the dynamic equivalent modeling of wind farms. This paper reviews recent developments in this area, covering research both domestic and overseas. Firstly, studies on wind speed prediction, wind speed distribution within the wind farm, and equivalent wind speed are summarized. Secondly, two typical approaches that use big data mining are introduced. For single wind turbine equivalent modeling, the focus is on how to choose and identify the equivalent parameters. For multiple wind turbine equivalent modeling, three aspects are addressed: clustering of wind turbines, aggregation of parameters within the same cluster, and equivalence of the collector system. Thirdly, an outlook on the future development of dynamic wind farm equivalent models is given.


Introduction
The penetration of wind energy in power systems has been growing because wind is an inexhaustible resource with mild environmental effects. In 2016, more than 54 GW of wind power was installed across the global market, and the cumulative capacity reached 486.8 GW, a growth of 12.6% [1]. Wind power has thus become an important part of the grid power supply. At the same time, large-scale wind power integration has brought many stability problems that endanger the operational security of power systems. Dynamic modeling of wind power is therefore required for power system analysis. Dynamic models of wind farms include detailed models and equivalent models. With the growing size of modern wind farms, the detailed model is no longer practical because of its complexity and long simulation time [2] [3]. Equivalent modeling methods have been developed to reduce the computational burden and the simulation time. For specific studies, a reasonable equivalent simplification does not affect the accuracy of the analytical results.
In recent years, big data mining has developed rapidly. It has gradually been recognized that the deeper and more important information hidden in the data can describe the overall characteristics of the data and predict development trends. This information can play an important role in decision-making [4]. Some researchers have therefore begun to apply big data mining to the dynamic equivalent modeling of wind farms. This paper reviews and analyzes the recent development of dynamic equivalent modeling of wind farms based on big data mining, in three parts. The first part covers wind speed in the wind farm, including wind speed prediction, the wind speed distribution within the wind farm, and the equivalence of wind speed. The second part covers dynamic single wind turbine equivalent modeling of the wind farm, including the selection of equivalent parameters and parameter identification. The third part covers dynamic multiple wind turbine equivalent modeling of wind farms, which includes three aspects: clustering of wind turbines, parameter identification of the clustered wind turbines, and equivalence of the wind farm collector system.

Wind speed in the wind farm
For dynamic wind farm equivalent modeling and model verification, data on wind speed and wind direction are needed. Taking into account factors such as the wake effect and the turbine layout, the wind speed distribution and the incoming wind speed of each wind turbine can be calculated. Similarly, the equivalent wind speed of each equivalent wind turbine can also be calculated.

The wind speed prediction in the wind farm
To obtain the natural wind speed and direction upstream of the wind farm, neural network-based models [5] and Weibull distribution models are widely used.
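
As a minimal sketch of the Weibull approach, the snippet below evaluates the two-parameter Weibull density and the corresponding mean wind speed. The shape parameter k = 2 and scale parameter c = 8 m/s are illustrative assumptions and are not taken from the cited work.

```python
import math

def weibull_pdf(v, k, c):
    """Two-parameter Weibull probability density for wind speed v (m/s)."""
    if v < 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def weibull_mean(k, c):
    """Mean wind speed of the Weibull distribution: c * Gamma(1 + 1/k)."""
    return c * math.gamma(1.0 + 1.0 / k)

# Assumed site parameters (hypothetical): shape k = 2, scale c = 8 m/s.
K_SHAPE, C_SCALE = 2.0, 8.0
mean_speed = weibull_mean(K_SHAPE, C_SCALE)
```

In practice the two parameters would be fitted to measured wind speed records of the site before the density is used for prediction.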

The wind speed distribution in the wind farm
The distance between wind turbines is limited by the collector system and geographical constraints, especially in large-scale wind farms. Wind turbines extract energy from the wind to produce electricity. Accordingly, the wind downstream of a turbine has lower energy than the wind upstream of it. This shadowing effect of upstream turbines on turbines further downstream is referred to as the wake effect [6]. Based on the wind attenuation factor, a method to calculate the wakes of multiple turbines is proposed in [5].
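
One widely used way to quantify the wake deficit is the Jensen (Park) wake model. The sketch below uses it as an assumption; it is not necessarily the attenuation-factor method of [5]. The thrust coefficient, rotor radius, wake decay constant and turbine spacing are illustrative values, full wake overlap is assumed, and the deficits of several upstream turbines are combined by root-sum-square.

```python
import math

def jensen_deficit(ct, rotor_radius, distance, decay=0.075):
    """Fractional wind speed deficit at `distance` (m) behind one turbine."""
    return (1.0 - math.sqrt(1.0 - ct)) / (1.0 + decay * distance / rotor_radius) ** 2

def waked_speed(free_speed, deficits):
    """Combine the deficits of several upstream turbines by root-sum-square."""
    total = math.sqrt(sum(d * d for d in deficits))
    return free_speed * (1.0 - total)

# Assumed layout (hypothetical): two upstream turbines 400 m and 800 m ahead,
# rotor radius 40 m, thrust coefficient 0.8, free wind speed 10 m/s.
deficits = [jensen_deficit(0.8, 40.0, x) for x in (400.0, 800.0)]
v_down = waked_speed(10.0, deficits)
```

The nearer turbine contributes the larger deficit, and the deficit decays with distance as the wake expands.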

The application of equivalent wind in the wind farm
Once the natural wind upstream of the wind farm has been calculated, the equivalent power curve can be obtained. Methods for calculating the equivalent wind of the aggregated turbines, such as the average wind speed method and the method based on a typical equivalent power curve, are widely used. Building on a neural network static model, a dynamic real-time simulation model was developed in [7]. Two conventional approaches to the equivalent steady-state modeling of the wind farm are discussed: a) the aggregate-measured power curve, obtained as an averaged aggregate representation of the outputs of all wind turbines, and b) the cluster-measured power curve, obtained using the support vector clustering technique [8].
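
The two equivalent wind calculations mentioned above can be sketched as follows. The power curve is a simplified assumption (cubic between cut-in and rated speed), and the function names and turbine data are hypothetical. The power-curve method averages the turbine powers and then inverts the curve by bisection.

```python
def power_curve(v, cut_in=3.0, rated_v=12.0, cut_out=25.0, rated_p=2.0):
    """Simplified (assumed) turbine power curve in MW."""
    if v < cut_in or v > cut_out:
        return 0.0
    if v >= rated_v:
        return rated_p
    return rated_p * (v ** 3 - cut_in ** 3) / (rated_v ** 3 - cut_in ** 3)

def equivalent_speed_average(speeds):
    """Average wind speed method."""
    return sum(speeds) / len(speeds)

def equivalent_speed_power(speeds, lo=3.0, hi=12.0, tol=1e-9):
    """Invert the power curve at the mean turbine power using bisection.

    lo and hi bracket the monotonic part of the curve (cut-in to rated speed).
    """
    target = sum(power_curve(v) for v in speeds) / len(speeds)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power_curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

speeds = [6.0, 8.0, 10.0]  # assumed incoming wind speeds of three turbines (m/s)
v_avg = equivalent_speed_average(speeds)
v_pow = equivalent_speed_power(speeds)
```

Because power grows with the cube of wind speed in this assumed curve, the power-based equivalent speed is higher than the plain average, which illustrates why the two methods can give different equivalent models.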

Dynamic single turbine equivalent models of the wind farm based on big data mining
Using a single turbine equivalent model to represent the dynamics of a large-scale wind farm can significantly simplify the calculation during simulation compared with the detailed model. Building an equivalent single turbine model involves two steps: choosing the equivalent parameters and calculating them.

Choose equivalent parameters
To simplify the wind turbine model, the first step is to choose typical parameters that reflect the characteristic features of the wind farm under different conditions. Active power, reactive power and the voltage at the grid bus of the wind farm can be used as such features. An algorithm based on these output features is proposed in [9].

Calculation of equivalent parameters
Several algorithms can be used to calculate the equivalent parameters. Some methods, such as weighted summation, have a simple calculation process, but their error is large in large-scale wind farm simulation. In this paper, parameter identification algorithms based on big data mining, such as the genetic algorithm, principal component analysis and the auto mutation particle swarm optimization algorithm (AMPSO), are discussed; their advantages and disadvantages are shown in table 1.

Table 1. Advantages and disadvantages of the parameter identification algorithms.
Principal component analysis. Disadvantages: vague meaning of the comprehensive evaluation function when the signs of the factor loads of the principal components are both positive and negative; low naming clarity.
AMPSO. Advantages: faster search, fewer parameters to adjust, simple structure, easy to implement. Disadvantages: easy to fall into a local optimal solution.

Genetic algorithm.
The genetic algorithm is a random search method derived from the theory of evolution, which includes the principles of natural selection and survival of the fittest [9][10].

Principal component analysis.
Principal component analysis is a feature extraction method that eliminates the correlation between the component vectors, removes the axes that carry little information, and thereby reduces the dimension of the feature space [11].
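
The dimension reduction idea can be sketched with the leading principal component only. The snippet below builds the sample covariance matrix and finds its dominant eigenvector by power iteration; this is a simplified stand-in, not the procedure of [11], and the sample data (pairs of active and reactive power) are assumed for illustration.

```python
import math

def first_principal_component(rows, iters=200):
    """Return (direction, explained_ratio) of the leading principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector of the largest eigenvalue
    # (provided the start vector is not orthogonal to it).
    vec = [1.0] * d
    for _ in range(iters):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(d))
                     for i in range(d))
    total_var = sum(cov[i][i] for i in range(d))
    return vec, eigenvalue / total_var

# Assumed samples (hypothetical): nearly collinear (active power, reactive power) pairs.
samples = [[1.0, 0.5], [2.0, 1.1], [3.0, 1.4], [4.0, 2.1], [5.0, 2.5]]
direction, ratio = first_principal_component(samples)
```

When the explained ratio of the first component is close to one, as with these assumed samples, the second axis carries little information and can be dropped.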

AMPSO.
The auto mutation particle swarm optimization algorithm takes the wind turbine type as the grouping rule for the wind turbine generator system. The relevant parameters are divided into transient and steady-state parameters, which are obtained by test measurement and by the AMPSO algorithm, respectively [12].
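
A minimal sketch of particle swarm optimization with a mutation step is given below. The mutation rule (restarting a particle at a random position with a fixed probability) is an assumption chosen for illustration and is not necessarily the rule used in [12]. The identified model, a first-order step response with a gain and a time constant, and all numerical settings are likewise hypothetical.

```python
import math
import random

def pso_with_mutation(cost, bounds, n_particles=20, iters=200,
                      w=0.7, c1=1.5, c2=1.5, p_mut=0.05, seed=0):
    """Minimise cost(x) inside box bounds with PSO plus random mutation."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_position():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    pos = [random_position() for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            if rng.random() < p_mut:
                # Mutation (assumed rule): restart the particle at a random point
                # to reduce the risk of the swarm stalling in a local optimum.
                pos[i] = random_position()
                vel[i] = [0.0] * dim
            else:
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (best_pos[i][d] - pos[i][d])
                                 + c2 * r2 * (g_pos[d] - pos[i][d]))
                    lo, hi = bounds[d]
                    pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
            if val < g_val:
                g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Synthetic "measurement": step response with gain 2.0 and time constant 0.5 s.
times = [0.1 * n for n in range(1, 21)]
measured = [2.0 * (1.0 - math.exp(-t / 0.5)) for t in times]

def squared_error(params):
    """Sum of squared errors between the candidate model and the measurement."""
    gain, tau = params
    return sum((gain * (1.0 - math.exp(-t / tau)) - y) ** 2
               for t, y in zip(times, measured))

identified, error = pso_with_mutation(squared_error, [(0.1, 5.0), (0.05, 2.0)])
```

The same structure applies when the cost function compares the response of an equivalent turbine model with recorded wind farm data; only `squared_error` and the parameter bounds would change.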

Multiple wind turbine dynamic equivalent modelling based on big data mining
If a large wind farm is aggregated into a single wind turbine model, the error can be large. Thus, an equivalent method that uses a small number of representative wind turbines for dynamic wind farm modeling has been proposed and shows satisfactory simulation results. Wind turbines with similar operating points can be grouped into the same class and represented by one equivalent turbine. This method involves three aspects: cluster analysis of wind turbines, aggregation of parameters in the same cluster, and equivalence of the collector system. The multiple wind turbine dynamic equivalent modeling flow chart is shown in figure 1.

Cluster analysis of wind turbines
With a suitable index and aggregation algorithm, cluster analysis classifies wind turbines with the same or similar operating points into one group, so that a large number of wind turbines can be divided into several groups. Aggregation algorithms are a branch of data mining. They divide a set of physical or abstract objects into groups, each of which is called a cluster. Objects in the same cluster are highly similar, while objects from different clusters are clearly distinct.
The k-means algorithm is one of the traditional aggregation algorithms. Its main idea is to divide objects into clusters through iteration so that the clusters become compact and independent. The algorithm usually terminates at a local optimum. When doubly fed induction generators (DFIGs) experience wind speed disturbances or faults on the system side, the control system of the generators works as before. Ref. [13] took the initial values of 13 state variables to reflect the initial operating point of the generators and used the k-means algorithm to aggregate them. Ref. [14] analyzed the eigenvalues of the linearized equations of a single-machine infinite-bus system with different kinds of generators under different conditions in order to find the dominant eigenvalues. It then calculated the related factors between the state variables and the eigenvalues to find the optimal state variables to serve as cluster indices, which reduces the number of cluster indices and improves the aggregation speed. With large-scale wind power generation, the behavior of relay protection and control equipment that relies on voltage and current waveform features is significantly affected. Ref. [15] established a multi-machine model for wind farm electromagnetic transients, also with the k-means algorithm. The cluster index is the rotor speed of the DFIG at the moment before the short circuit occurs.
Another aggregation method is hierarchical clustering. It builds clusters level by level to form a tree, called the clustering tree, in which every cluster is a node. This algorithm can explore the data at different levels of granularity, and similarity and distance measures are easy to define. However, its biggest problems are that the stopping condition is vague and that a merge or split operation cannot be corrected once it has been performed. These disadvantages may lower the quality of the clustering result. To overcome them, hierarchical clustering is combined with other algorithms. Ref. [16] applied hierarchical clustering as the aggregation method, with the characteristics of the transient voltages as the grouping rule of the wind turbines. Ref. [17] also identified wind farm groups with similar dynamic behavior by hierarchical clustering. The cluster index is the similarity of the low voltage ride through behavior at the connection point of grid-connected wind farms.
Considering the advantages and disadvantages of the two aggregation methods above, [18] set up a combined algorithm named H-K clustering. Hierarchical clustering was used to obtain the initial information and the k-means algorithm to complete the aggregation process. The silhouette value was used as the measurement indicator.
Spectral clustering is an algorithm that evolved from graph theory. Compared with traditional clustering methods, spectral clustering adapts better to the data distribution and gives a better clustering result, while its computational complexity is smaller and its implementation is simpler. For wind farms with complex terrain or irregular layout, a spectral clustering method based on diffusion mapping theory is proposed in [19]. It uses as the grouping rule the output properties of active power, reactive power and voltage, combined with the operating environment, wind speed fluctuation, rotational speed and other parameters.
Support Vector Clustering (SVC), which uses the support vector machine as the aggregation tool, belongs to kernel clustering. It has two significant advantages: first, the cluster boundary can take any shape; second, noisy data points can be handled and overlapping clusters can be separated, which most aggregation methods cannot achieve. Ref. [20] proposed the concept of a group: wind turbines were divided into clusters by SVC according to wind speed, and the clusters were then divided into groups based on wind direction. Compared with traditional methods, the simulation of how the wind farm will operate over the whole of the following year can be more accurate.
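
As an illustration of the k-means idea described in this section, the following minimal sketch clusters wind turbines by their initial operating point. The two features (wind speed and rotor speed), the data values and the deterministic farthest-first initialization are assumptions made for the example; they are not the 13 state variables of [13] nor the index of [15].

```python
def kmeans(points, k, iters=100):
    """Plain k-means with deterministic farthest-first initialisation."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Start from the first point, then repeatedly add the point that is
    # farthest from all centres chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the centre of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Assumed initial operating points (hypothetical): (wind speed in m/s, rotor speed in p.u.).
turbines = [(6.1, 0.78), (6.3, 0.80), (5.9, 0.77),
            (11.2, 1.19), (10.8, 1.18), (11.5, 1.20)]
labels, centers = kmeans(turbines, k=2)
```

With real data the features would normally be normalized first so that no single index dominates the distance, and several initializations would be compared because the algorithm only reaches a local optimum.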

Aggregation of parameters in the same cluster
To simulate the dynamics of the wind farm more efficiently in grid-connected analysis, an equivalent generator and representative parameters that stand for the generators in the same cluster are needed after the cluster analysis. A common approach is to use the capacity-weighted method and single machine multiplication to obtain the parameters of the lumped model [21]. This approach is simple, requires little calculation with the actual system model, and is widely applied in practice. Moreover, the aggregation of parameters in the same cluster can be treated as a single-machine equivalent problem, which can be handled with parameter identification methods.
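
A minimal sketch of capacity-weighted aggregation is shown below, assuming that each per-unit parameter of the equivalent machine is the rating-weighted mean of the individual values and that the equivalent rating is the sum of the ratings. The exact weighting used in [21] is not given here, and the dictionary keys (`s_mva`, `h_s`, `xs_pu`) are hypothetical names.

```python
def aggregate_cluster(turbines):
    """Capacity-weighted aggregation of per-unit parameters within one cluster."""
    total = sum(t["s_mva"] for t in turbines)
    keys = [k for k in turbines[0] if k != "s_mva"]
    # Each per-unit parameter is weighted by the rated capacity of its turbine.
    equivalent = {k: sum(t["s_mva"] * t[k] for t in turbines) / total for k in keys}
    # The equivalent machine carries the summed rating of the cluster.
    equivalent["s_mva"] = total
    return equivalent

# Assumed cluster (hypothetical): rating in MVA, inertia constant in s, stator reactance in p.u.
cluster = [
    {"s_mva": 2.0, "h_s": 4.0, "xs_pu": 0.10},
    {"s_mva": 1.0, "h_s": 1.0, "xs_pu": 0.13},
]
eq = aggregate_cluster(cluster)
```

For a cluster of identical machines the weighted mean leaves the per-unit parameters unchanged and only the rating is multiplied, which corresponds to the single machine multiplication case.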

Equivalence of wind farm collector system
Conventionally, a large wind farm is made up of dozens or even hundreds of wind turbines, which are connected to the grid through step-up transformers and underground cables that are collected at the bus. This leads to complicated internal wiring and a large number of cable routes. Thus, the collector system must be considered in dynamic equivalent analysis. Ref. [22] presented an equivalence rule for the collector system. Based on detailed wind farm parameters, [23] proposed a parallel transform method and a single impedance equivalent method for the collector system.
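
To illustrate what a single equivalent impedance can look like, the sketch below reduces one feeder with turbines connected in a daisy chain by matching the series losses. This loss-equivalence rule is used here as an assumption and is not necessarily the rule of [22] or the methods of [23]. It assumes identical turbines that inject equal currents; the impedance values are hypothetical.

```python
def daisy_chain_equivalent(segment_impedances):
    """Equivalent series impedance of one feeder by loss matching.

    segment_impedances[m - 1] is the impedance of the cable segment that
    carries the current of m turbines, counted from the far end of the feeder.
    With equal turbine currents I, the losses are sum((m * I)**2 * Z_m), and the
    equivalent impedance carries n * I, so Z_eq = sum(m**2 * Z_m) / n**2.
    """
    n = len(segment_impedances)
    weighted = sum((m ** 2) * z for m, z in enumerate(segment_impedances, start=1))
    return weighted / n ** 2

# Assumed feeder (hypothetical): three identical cable segments of 0.1 + 0.2j ohm.
z_eq = daisy_chain_equivalent([0.1 + 0.2j, 0.1 + 0.2j, 0.1 + 0.2j])
```

For a single segment the rule returns the segment impedance itself, and for longer feeders the segments nearest the bus dominate because they carry the largest current.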

Conclusion and outlook
To sum up, big data mining has been widely used in dynamic wind farm modeling, with good results in wind speed modeling, cluster analysis and aggregation of multiple wind turbines, and identification of equivalent parameters for a single wind turbine. In the future, the volume of operating data will grow as the scale and number of wind farms increase further, and big data mining will have greater potential for research and application in dynamic wind farm modeling.