Research on Wind Power Prediction Algorithm Based on Fusion Model

Wind power generation is currently one of the most promising power generation technologies. Improving the prediction accuracy of wind power output is particularly important, because it effectively reduces the impact on the grid when wind power is connected. This paper integrates the fractal model with a wind power prediction model, combines it with a custom K-nearest neighbor algorithm, and evaluates the prediction performance using multi-dimensional indicators. Finally, taking the data of a wind farm in northwest China as an example, the proposed model is compared with random forest, support vector machine and gradient boosting decision tree prediction models to verify the effectiveness of the prediction algorithm.


Introduction
Fossil energy on Earth is gradually being exhausted, and the development and utilization of new energy is imperative. Among new energy sources, wind energy has many advantages, such as wide distribution, almost no pollution, and renewability, and it has become one of the most promising. However, natural wind is strongly random and intermittent [1]. When wind power is connected to the grid in a large-scale, centralized way, it poses a threat to the stable operation of the power grid. Accurately predicting wind power over a future period is therefore of great significance to power dispatching and safe operation.
Physical methods and statistical methods are commonly used in wind power prediction. The physical method does not rely on the historical data of the wind farm [2]. It is mainly based on numerical weather prediction: wind direction, air humidity, air pressure and similar quantities are used as inputs to a model of the wind farm location. However, environmental factors differ considerably across locations and times, which makes the physical method impractical and difficult to generalize. The statistical learning method uses a large amount of historical wind power generation data [3], together with wind speed, wind direction, air pressure and other data at the turbine hub, to learn the mapping between input data such as numerical weather parameters and the actual wind power output, and thereby establishes the input-output relationship. Commonly used methods include the support vector machine and the artificial neural network.
IOP Conf. Series: Earth and Environmental Science 898 (2021) 012004, IOP Publishing, doi:10.1088/1755-1315/898/1/012004
Another approach is the similar day method [4]. After years of research, this method has been applied to wind power forecasting [5] and photovoltaic power forecasting [6]. The authors of [7] combine neural networks, fuzzy inference and the similar day method, integrating the advantages of each to predict the wind speed of a day. Zhang Yiyang et al. [8] subdivided similar days into "similar periods", "reference segments" and "prediction segments" and predicted at different levels, but ignored the mutual influence between reference power and meteorological characteristics. In [9], a clustering method was first used to select similar days while avoiding a hard division of clusters. However, this unsupervised method also has obvious shortcomings, such as high requirements on the original samples and sensitivity to outliers, which can easily lead to excessive classification, so the final accuracy is difficult to guarantee. Some researchers considered how wind speed changes within a certain period and proposed a method based on continuous-time clustering combined with the SVM algorithm to predict wind power. However, this method struggles to reflect the difference between historical data and forecast data. Zhao Ting et al. [11] studied the power curve of the previous K days and took the characteristics of the power change curve into account in the model, but did not analyze the correlation between earlier and later trends. Li Hui et al. [12] first extracted similar days and then used principal component analysis to reduce dimensionality and computational complexity, but the prediction accuracy was not high and the model was not interpretable. In this paper, a hybrid K-nearest neighbor algorithm based on the fractal model is proposed for wind power prediction.
Drawing on the fractal model and taking into account the reference power curve and the meteorological characteristic values, fractal interpolation is used to capture the local information of adjacent samples and is then combined with the custom KNN algorithm to generate a prediction model. Finally, using the historical measured data of a wind farm, the fusion model is compared with several existing prediction models; the results show that it achieves high prediction accuracy, reduced complexity and better performance.

Fractal Lemma
Lemma 1 [12]: In a complete metric space $(X, d)$, the set of contraction mappings with contraction factor $0 \le c < 1$ is defined as $\{\omega_i : X \to X,\ i = 1, 2, \dots, M\}$, which constitutes an iterated function system (IFS). When $W$ is a family of contraction mappings with contraction ratio $c$ on $(X, d)$, $W$ generates a contraction mapping $F$ on the complete metric space $(H(X), h)$, $F : H(X) \to H(X)$, $F(A) = \bigcup_{i=1}^{M} \omega_i(A)$, $A \in H(X)$, with contraction ratio $c$; that is, $h(W(A), W(B)) \le c\, h(A, B)$ for $A, B \in H(X)$.
Lemma 2 [13]: Suppose $\{X; W\}$ is a hyperbolic iterated function system on the complete metric space $(X, d)$ with contraction factor $s$ and attractor $A$. Then $h(L, A) \le (1 - s)^{-1}\, h\bigl(L, \bigcup_{i=1}^{M} \omega_i(L)\bigr)$ for any $L \in H(X)$. The collage theorem gives the approximate similarity between a set and an invariant set, so it is sufficient to select an appropriate IFS. Similarly, a set of iterated function systems can be generated by continuously constructing interpolation nodes.
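The contraction property in Lemma 1 can be checked numerically on a small case. The sketch below is illustrative only: it assumes finite sets of real numbers, uses the two Cantor-set maps as a hypothetical IFS with contraction ratio 1/3, and the names `hausdorff` and `hutchinson` are invented for this example.

```python
def hausdorff(a, b):
    """Hausdorff distance h(A, B) between two finite sets of real numbers."""
    def directed(p, q):
        # Largest distance from a point of p to its nearest point in q.
        return max(min(abs(x - y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))


def hutchinson(maps, points):
    """F(A): the union of the images of A under every map of the IFS."""
    return sorted({w(x) for w in maps for x in points})


# Hypothetical IFS on the real line (the Cantor-set maps), ratio c = 1/3.
c = 1 / 3
maps = [lambda x: c * x, lambda x: c * x + 2 / 3]

A = [0.0, 0.5, 1.0]
B = [0.1, 0.9]

lhs = hausdorff(hutchinson(maps, A), hutchinson(maps, B))  # h(F(A), F(B))
rhs = c * hausdorff(A, B)                                  # c * h(A, B)
print(lhs <= rhs + 1e-12)
```

For these two sets the bound holds with equality up to rounding, which is why a small tolerance is added to the comparison.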

Fractal Interpolation Theory
An IFS [14] that satisfies the lemmas above is constructed so that the functions on different subintervals do not intersect. Each transformation must satisfy equation (2), so that the left and right endpoints of the whole interval are mapped to the left and right endpoints of the corresponding subinterval. The vertical scaling factor of $\omega_m$ is $d_m$ [15]; taking $d_m$ as the free variable and solving the above equations gives expression (3). Traditional interpolation methods struggle to reflect the local features between two adjacent known points, whereas fractal interpolation makes better use of the local information of the samples, so that most features of the original sampled curve are effectively supplemented and retained [16].
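As a minimal sketch of how such an interpolation can be computed, the code below assumes the standard affine form of fractal interpolation, $\omega_m(x, y) = (a_m x + e_m,\ c_m x + d_m y + f_m)$, with the endpoint conditions described for equation (2) and a single vertical scaling factor `d` shared by all maps. The function names and the use of one shared `d` are assumptions made for illustration, not details taken from the paper.

```python
def ifs_coefficients(xs, ys, d):
    """Coefficients (a, c, e, f) of each affine map for a shared vertical scaling d."""
    x0, xn, y0, yn = xs[0], xs[-1], ys[0], ys[-1]
    span = xn - x0
    coeffs = []
    for m in range(1, len(xs)):
        # Solved from: map(x0, y0) = node m-1 and map(xn, yn) = node m.
        a = (xs[m] - xs[m - 1]) / span
        e = (xn * xs[m - 1] - x0 * xs[m]) / span
        c = (ys[m] - ys[m - 1] - d * (yn - y0)) / span
        f = (xn * ys[m - 1] - x0 * ys[m] - d * (xn * y0 - x0 * yn)) / span
        coeffs.append((a, c, e, f))
    return coeffs


def fractal_interpolate(xs, ys, d=0.3, iterations=3):
    """Apply every map to the current point set `iterations` times."""
    if abs(d) >= 1:
        raise ValueError("vertical scaling factor must satisfy |d| < 1")
    coeffs = ifs_coefficients(xs, ys, d)
    points = [(float(x), float(y)) for x, y in zip(xs, ys)]
    for _ in range(iterations):
        # Each pass multiplies the number of points by the number of maps;
        # shared interior nodes appear twice and are not deduplicated here.
        points = sorted(
            (a * x + e, c * x + d * y + f)
            for (a, c, e, f) in coeffs
            for (x, y) in points
        )
    return points


curve = fractal_interpolate([0, 1, 2, 3], [0.0, 2.0, 1.0, 3.0], d=0.3, iterations=2)
print(len(curve))
```

With `d = 0` the construction reduces to piecewise linear interpolation, which is a convenient sanity check; nonzero `d` adds the self-similar detail between known points.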

KNN Algorithm Description
Given a set of training data, the KNN algorithm finds the K instances in the training set that are closest to the sample and takes the classes of these K instances as candidates [17]. The similarity between the sample and each instance is then used as a weight, and the weighted result is compared with a preset threshold to determine the class of the sample [18]. The algorithm flow chart is shown in Figure 1.
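The description above can be sketched as a similarity-weighted vote. This is an illustrative version under stated assumptions: similarity is taken as `1 / (1 + distance)`, and the threshold is read as the minimum share of total weight that the winning class must exceed; neither choice is specified in the text, and `knn_classify` is a hypothetical name.

```python
import math
from collections import defaultdict


def knn_classify(samples, labels, query, k=3, threshold=0.5):
    """Similarity-weighted K-nearest-neighbor vote.

    Returns the winning label, or None when its share of the total
    similarity weight does not exceed `threshold` (assumed rule).
    """
    neighbors = sorted(
        ((math.dist(s, query), label) for s, label in zip(samples, labels)),
        key=lambda pair: pair[0],
    )[:k]
    weights = defaultdict(float)
    for dist, label in neighbors:
        weights[label] += 1.0 / (1.0 + dist)  # assumed similarity measure
    best = max(weights, key=weights.get)
    share = weights[best] / sum(weights.values())
    return best if share > threshold else None


samples = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["low", "low", "low", "high", "high", "high"]
print(knn_classify(samples, labels, (0.2, 0.1)))
```

Returning `None` when no class is sufficiently dominant keeps ambiguous samples visible instead of forcing a label on them.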

Prediction Algorithm of Hybrid KNN Algorithm Based on Fractal Model
Wind power generation time series are periodic. The daily power generation curve is self-similar on the time scale, the daily curves have very similar fractal dimensions [19], and successive segments of the power generation time series are correlated. The historical data and the power generation curve can be fitted over a similar period, and fractal interpolation makes good use of the local information of adjacent known points [16], so the algorithm converges quickly to the true value. In the traditional KNN algorithm, the historical sample set has to be searched every time to obtain n similar historical samples [20]. As the value of K increases, the number of searches keeps growing and the same samples are retrieved repeatedly, which not only consumes extra storage space but also greatly reduces the operating speed of the system. To make up for these shortcomings, this paper designs a hybrid custom KNN algorithm based on the fractal model. The algorithm uses the self-similar fractal dimension from fractal theory, combined with the theory of the KNN algorithm, to improve the search over similar historical sample sets, which reduces both the memory consumed and the time complexity of the algorithm. The main process of the prediction model is as follows:
1) Set the prediction day as the starting point and use it to split the fractal dimension data: one part is used for training and the other part for verification.
2) Take the horizontal axis as the time coordinate of the set of fractal dimension points, then analyze the power curve of the reference day to find the main characteristic points of the curve. In this example, three main weather characteristics are considered, namely temperature, wind speed and wind direction, and the power values at nine integral time points are selected.
3) Establish the power curve IFS of the base day. The iterated function system is built from the set of interpolation points in step 2). The prediction error is smallest when the vertical scaling factor d is set to 0.9~0.95, and the coefficients are calculated with equation (3).
4) Establish the power curve IFS. The set of interpolation points is obtained from the reference time coordinates of step 2) and the power value corresponding to each similar point. The iterated functions are established separately, with the value of d unchanged.
5) After the calculation, save the fractal dimensions of wind speed, wind direction and temperature in memory. The KNN algorithm no longer searches the entire historical sample set; it searches over the fractal dimensions of the main features, which greatly reduces the search volume of the algorithm.
6) According to the test set date specified in the current data set, take the observations at the latest 9 time points, combine them into a data frame, and calculate the fractal dimensions of wind speed, wind direction and temperature in this data frame.
Figure 2 shows the prediction model of the hybrid KNN algorithm based on the fractal model.
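The caching and search steps can be sketched as follows. The paper does not name the fractal dimension estimator it uses, so Katz's estimator is swapped in here purely as an assumption. The record layout (`wind_speed`, `wind_dir`, `temp`, and a single `power` value per window) and all function names are likewise hypothetical.

```python
import math

FEATURES = ("wind_speed", "wind_dir", "temp")


def katz_fd(series):
    """Katz fractal dimension of a series sampled at unit time steps (assumed estimator)."""
    steps = len(series) - 1
    if steps < 2:
        raise ValueError("need at least three samples")
    # Total curve length and largest distance from the first point, in the (t, value) plane.
    length = sum(math.hypot(1.0, series[i + 1] - series[i]) for i in range(steps))
    extent = max(math.hypot(i, series[i] - series[0]) for i in range(1, steps + 1))
    return math.log(steps) / (math.log(steps) + math.log(extent / length))


def build_cache(history):
    """Compute the fractal dimensions once and keep them in memory with the power value."""
    return [(tuple(katz_fd(rec[name]) for name in FEATURES), rec["power"]) for rec in history]


def predict_power(cache, window, k=3):
    """Search the cached dimensions only, then average neighbor power by inverse distance."""
    query = tuple(katz_fd(window[name]) for name in FEATURES)
    nearest = sorted(cache, key=lambda item: math.dist(item[0], query))[:k]
    weights = [1.0 / (1e-9 + math.dist(fd, query)) for fd, _ in nearest]
    return sum(w * p for w, (_, p) in zip(weights, nearest)) / sum(weights)


smooth = [float(i) for i in range(9)]    # nine observations, steady trend
zigzag = [float(i % 2) for i in range(9)]  # nine observations, oscillating
history = [
    {"wind_speed": smooth, "wind_dir": smooth, "temp": smooth, "power": 10.0},
    {"wind_speed": zigzag, "wind_dir": zigzag, "temp": zigzag, "power": 50.0},
]
cache = build_cache(history)
print(predict_power(cache, history[0], k=1))
```

Because the neighbor search runs over three cached numbers per record and not over the raw series, each query costs the same regardless of how long the underlying windows are.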

Case Analysis
Taking a wind farm (20 units) as an example, the data of 334 days from January 1, 2019 to December 1, 2019 were selected as the sample data. The first 290 days were used as the training sample set, and the test set of the following 44 days was predicted using the training sample set. In Figure 3, the abscissa is the prediction time axis and the ordinate is the power value. For the model in Figure 4, random search combined with a global grid search was used for parameter tuning, which gave n_estimators=100 and max_depth=10. In Figure 5, the SVM uses a linear kernel function with degree=3, and the penalty coefficient is adjusted to prevent over-fitting. The predicted power curve of the GBDT model is shown in Figure 6; through repeated hyperparameter tuning, the model settled on learning_rate=0.1, n_estimators=500 and max_depth=3.
Table 1 shows the root mean square error (RMSE), the time consumed by training the model (CT) and the goodness of fit (SCORE) of the four prediction models after hyperparameter tuning.
As can be seen from Table 1, the SVM prediction model performs worst. When low-dimensional data is mapped to a high-dimensional space, model training takes a long time and the RMSE does not decrease. After continuous hyperparameter tuning of GBDT, the training time is reduced, but the improvement in RMSE and goodness of fit is not obvious. After hyperparameter tuning of the RFR prediction model, the goodness of fit improves significantly, but the training time increases as well. The hybrid KNN wind power prediction model based on the fractal model proposed in this paper performs better in root mean square error, goodness of fit and training time, which verifies that this model can predict wind power well.
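The three indicators can be computed as sketched below. SCORE is assumed here to be the coefficient of determination R², which is what scikit-learn regressors return from `score()`; the paper does not define it explicitly. The helper names and the day-indexed list are illustrative.

```python
import math
import time


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination (assumed meaning of SCORE)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def timed(fn, *args):
    """Run fn and return (result, elapsed seconds); a stand-in for CT."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def chronological_split(days, n_train=290):
    """Keep time order: the first n_train days train, the rest test."""
    return days[:n_train], days[n_train:]


days = list(range(334))
train, test = chronological_split(days)
print(len(train), len(test))
```

Splitting chronologically, not at random, keeps every test day later than every training day, which matches the 290/44 split described above.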

Conclusion
The grid-connected capacity of wind power is still increasing, and large-scale grid connection of wind power has a great impact on the operation of the grid. To cope with the challenges brought by the strong randomness of wind power, the accuracy of wind power forecasting has received great attention, and the studies in this paper aim to improve it. The proposed method combines fractal model theory with a custom KNN algorithm. The fractal dimensions of several important sample characteristics are first calculated and stored in memory. After interpolation, the KNN algorithm searches twice: once to find the nearest-neighbor times within the cached fractal dimensions, and once to find the weighted average power value of the nearest neighbors. Compared with the existing wind power prediction methods, the proposed method performs better on each index. Introducing the fractal idea preserves the local information of the samples; the model is simple and its complexity is low, and the algorithm still performs well on large numbers of samples. The selection of the optimal value of the fractal dimension still needs further research, which will be the main direction of follow-up work.