Integrated algorithm evaluation of a scalable digital power management innovation model

Maintaining contemporary society’s steady progress depends on the reliable functioning of the electrical grid. The complexity and nonlinearity of today’s power networks make them difficult to manage with conventional mathematical modeling approaches. To overcome these problems, researchers have turned to machine learning methods, such as machine-learning-based transient stability evaluation. This research proposes a way to evaluate the transient stability of power systems by exploiting the complex feature extraction capabilities of multi-layer perceptron (MLP) networks. It further proposes an ensemble learning model that uses the easy-ensemble undersampling technique to deal with imbalanced power system transient stability data. This method preserves the unstable power system samples, performs multiple rounds of undersampling on the stable samples, and combines the unstable and undersampled samples into multiple new balanced training sets for MLP model training. The final ensemble model is obtained through a voting integration strategy. Through a comparative evaluation of individual models and ensemble models, this paper finds that the ensemble learning models exhibit superior sensitivity and accuracy in detecting system instability.


Introduction
Electricity, as an efficient energy source, is derived from many aspects of nature. Power stations generate electricity from natural resources such as coal, wind, water, and tides. Different types of power stations produce electricity according to regional resource endowments and transmit it to the power grid for unified regional allocation [1]. To maintain a sufficient power supply, prevent major outages, and meet regional demand, the grid must allocate production tasks to thermal power plants according to the region's monthly, weekly, and daily power demand. Under these production tasks, each thermal power station generates electricity and immediately delivers it to the grid system for distribution. Both power production planning and task allocation depend on how well the grid's overall power demand is predicted [2]. When the regional power supply falls short of demand, the resulting power paralysis directly impacts the economic activities of cities. Conversely, when regional supply far exceeds demand, the surplus electricity causes energy price swings and raises the cost of power storage. To accelerate national development, it is imperative to build a safer and more reliable power system, which requires maintaining transient stability at all times. A power system is deemed unstable when significant disturbances cause its various generators to lose synchronism [3]. Failure to promptly take effective measures to restore stability under such circumstances can lead to widespread outages, disrupt daily life, and inflict substantial economic losses on society. Consequently, transient stability evaluation of power systems has become a prominent research area for scholars worldwide. There are two key approaches to ensuring transient stability in power systems. The first is to promptly deploy safety and stability devices in response to power system faults, preventing the escalation of accidents, maintaining system stability, and averting further societal losses [4]. The second is to use transient stability assessment methods to evaluate the system's transient stability in advance of any fault and to take timely preventive measures, effectively suppressing power accidents at their source. Transient stability assessment methods for power systems comprise time-domain analysis, direct methods, and artificial intelligence techniques.
Traditional mathematical methods such as time-domain simulation and direct methods are no longer adequate for determining transient stability in modern power systems. The system scale keeps growing because of factors such as large-scale grid interconnection, new energy generation sources, and load integration [5]. The use of AI approaches for assessing the transient stability of power systems is therefore becoming increasingly popular. Thanks to smart grid technologies and the broad adoption of smart meters over the last several decades, transient stability in power networks is now much easier to observe, and the data captured by smart meters supports the transient stability evaluation of electricity systems. Unlike direct methods and time-domain simulation, artificial intelligence approaches can fully exploit the potential of big data mining.
This article examines the features of power system transient stability, takes into account the effects of external factors on it, and models the transient stability evaluation problem using a combination of ensemble learning and multi-layer perceptron networks. According to the experimental findings, the proposed model construction technique can assess the power system's transient stability online and in real time with excellent predictive accuracy.

Related work
Reference [7] investigated a hybrid approach utilizing support vector machines and decision trees for evaluating power system transient stability. By preprocessing sample data with SVM, this approach reduces the misclassification rate. Reference [8] presented the extended equal-area approach, which allows quick evaluation of transient stability in power systems by integrating with the equal area criterion. Reference [9] proposed combining decision trees with semi-supervised learning to assess the transient stability of regional power systems, which alleviates the problem of insufficient real data in power systems. Reference [10] explored an SVM model that uses key samples to predict transient stability; compared with typical SVM models, it improves the detection rate of the unstable category.
Several approaches have been suggested to enhance the efficiency and precision of transient stability evaluation in power systems. One is the two-stage support vector machine (SVM) technique described in reference [11]: the first stage uses the SVM algorithm to investigate the intrinsic links among power system feature values, and the second stage evaluates the system's transient stability with the SVM method; this two-stage strategy improves both the accuracy and the training time of the model. Reference [12] studies the use of extreme learning machines to build a transient stability assessment model, applying a bacterial-population optimization method to the parameter-determination problem of the extreme learning machine. Reference [13] presents an ensemble-learning-based extreme learning machine model for predicting transient stability, featuring a novel feature selection strategy based on binary Jaya that makes predictions faster and more accurate. Finally, reference [14] suggests a random forest approach to evaluating transient stability; by merging several decision tree classifiers, it shows improved evaluation performance compared with other approaches, backed by robust experimental evidence.

Multi-layer perceptron network model
The MLP is a neural network consisting of multiple hidden layers, which allows for a larger number of network parameters. This enables MLPs to effectively address local optimization problems and extract significant insights from large datasets. Figure 1 illustrates the fundamental structure of an MLP network, consisting of an input layer, an output layer, and two hidden layers. The layers are fully connected, and each layer's fundamental component is a neuron, characterized by a linear transformation combined with an activation function.
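The fully connected structure described above can be sketched with plain NumPy; the layer sizes, activations, and initialization scale below are illustrative assumptions rather than the paper's exact configuration:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(n_in, n_out, rng):
    # Weight matrix W and offset vector b start as random values,
    # matching the forward-propagation setup described in the text.
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def mlp_forward(x, layers):
    # Each layer applies a linear map followed by an activation function;
    # the last layer uses a sigmoid to produce a stability probability.
    h = x
    for i, (W, b) in enumerate(layers):
        z = h @ W + b
        h = sigmoid(z) if i == len(layers) - 1 else relu(z)
    return h

rng = np.random.default_rng(0)
# Illustrative sizes: 12 input features, two hidden layers, 1 output unit.
sizes = [12, 32, 16, 1]
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]
out = mlp_forward(rng.normal(size=(5, 12)), layers)
```

With random weights the outputs are not yet meaningful; backpropagation, described next, adjusts W and b.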
Both the weight matrix W and the offset vector b are assigned random values during forward propagation; the backpropagation process then learns their correct values by locating the extremum of a loss function through optimization. Using MSE as the loss metric, the loss function for each sample is as follows.
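The referenced equation did not survive extraction; a standard per-sample MSE loss and the corresponding gradient-descent updates, consistent with the text's description (here $\eta$ denotes an assumed learning rate), are:

```latex
% Per-sample MSE loss for an MLP with prediction \hat{y}_i and target y_i
L_i = \tfrac{1}{2}\,\lVert \hat{y}_i - y_i \rVert^2
% Backpropagation updates the parameters by gradient descent:
W \leftarrow W - \eta\,\frac{\partial L_i}{\partial W},
\qquad
b \leftarrow b - \eta\,\frac{\partial L_i}{\partial b}
```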

Evaluation algorithm based on ensemble learning
An imbalanced dataset is one in which the sample sizes of the categories are extremely uneven. Taking a binary dataset as an example: D(S, L) is the full sample dataset, S is the minority-class dataset, and L is the majority-class dataset. R denotes the ratio of the majority class to the minority class, R = |L| / |S|. The closer R is to 1, the more balanced the data; conversely, the larger R is, the greater the degree of imbalance. For the reduced-dimensionality dataset used here, the calculated R value is 1.85, indicating imbalanced data. At present, the main methods for dealing with data imbalance operate at the data level and at the algorithm level.
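The ratio R is a one-line computation; as an illustration the snippet below uses the raw class counts reported in the dataset section of this paper (6380 stable vs. 3620 unstable), which give a ratio of about 1.76 before the reduced-dimensionality figure of 1.85 quoted here:

```python
def imbalance_ratio(n_majority, n_minority):
    # R = |L| / |S|: the closer to 1, the more balanced the dataset.
    return n_majority / n_minority

# Counts taken from the dataset section of this paper (illustrative use).
R = imbalance_ratio(6380, 3620)
print(round(R, 2))
```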
Data-level imbalance can be addressed by undersampling, oversampling, and mixed sampling. The undersampling technique keeps the minority class and samples from the majority class so that the majority class approaches the size of the minority class. Although undersampling is easy to apply and can raise the recognition rate for minority classes, discarding a large number of sample characteristics may cause the model to overfit. Its two most common varieties are random undersampling, which repeatedly samples at random within the majority class, and clustering undersampling, which uses clustering techniques to pick representative majority-class samples as the training set. The oversampling approach balances the dataset by preserving the majority-class samples and generating additional minority-class samples through a model. While oversampling increases minority-class sample diversity and the quantity of learnable samples, it does not provide real samples, so the model's accuracy is susceptible to sample errors. The mixed sampling method merges the oversampling and undersampling approaches through suitable procedures: it generates samples for the minority class while selecting samples from the majority class, ensuring the model learns enough minority-class samples without sacrificing majority-class samples or producing excessive sample error. Ensemble learning is a machine learning technique that trains multiple base learners and combines their prediction results through an ensemble strategy, as shown in figure 2.
In the problem of imbalanced data classification, ensemble learning algorithms mainly fall into two categories: Boosting ensemble learning algorithms and Bagging ensemble learning algorithms.
For $t = 1, \dots, T$, learn from the training set under the weighted distribution $D_t$ to obtain an estimate $h_t : X \to \{-1, +1\}$, and calculate the weighted training error of each $h_t$ according to equation (6), $\varepsilon_t = \sum_i D_t(i)\,\mathbb{1}[h_t(x_i) \neq y_i]$. The weight of each iteration is then set per equation (7) as $\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}$, and the sample weights are updated using the error function defined in equation (8), $D_{t+1}(i) \propto D_t(i)\exp(-\alpha_t\, y_i\, h_t(x_i))$. Each round's $\alpha_t$ and $h_t$ update the weights of the training set, and the final estimate $H(x) = \operatorname{sign}\big(\sum_{t=1}^{T} \alpha_t h_t(x)\big)$ is obtained through voting, as shown in equation (9). Bagging is another representative ensemble algorithm. Its main idea is to draw, by sampling the training set with replacement, a training set of the same size from the original dataset for training each base classifier, and finally to integrate the obtained base classifiers by simple majority voting to predict the test sample. Because the training subsets may overlap, the difference between base classifiers cannot be well guaranteed; however, Bagging improves the accuracy of individual classifiers, making up for their shortcomings to some extent and still improving the generalization performance of the integrated system. The Bagging and AdaBoost ensemble algorithms share similarities as well as differences. The similarities are as follows: both use the same integration strategy of simple voting, and the integrated base classifiers are homogeneous.
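The AdaBoost weight-update loop described above can be sketched from scratch; the snippet below uses one-dimensional decision stumps as weak classifiers on toy data, all of which are illustrative assumptions rather than the paper's setup:

```python
import numpy as np

def stump_predict(x, threshold, polarity):
    # A weak classifier h_t: a threshold test returning labels in {-1, +1}.
    return np.where(polarity * x > polarity * threshold, 1, -1)

def adaboost_train(x, y, T):
    n = len(x)
    D = np.full(n, 1.0 / n)              # initial sample weights D_1(i) = 1/n
    ensemble = []
    for _ in range(T):
        # Pick the stump minimizing the weighted error eps_t (equation (6)).
        best = None
        for thr in x:
            for pol in (1, -1):
                pred = stump_predict(x, thr, pol)
                eps = D[pred != y].sum()
                if best is None or eps < best[0]:
                    best = (eps, thr, pol, pred)
        eps, thr, pol, pred = best
        eps = min(max(eps, 1e-10), 1 - 1e-10)       # avoid log(0)
        alpha = 0.5 * np.log((1 - eps) / eps)       # weight alpha_t (eq. (7))
        D *= np.exp(-alpha * y * pred)              # re-weight samples (eq. (8))
        D /= D.sum()
        ensemble.append((alpha, thr, pol))
    return ensemble

def adaboost_predict(x, ensemble):
    # Final estimate by weighted voting: H(x) = sign(sum_t alpha_t h_t(x)).
    score = sum(a * stump_predict(x, thr, pol) for a, thr, pol in ensemble)
    return np.sign(score)

x = np.array([1.0, 2.0, 3.0, 6.0, 7.0, 8.0])
y = np.array([-1, -1, -1, 1, 1, 1])
model = adaboost_train(x, y, T=5)
pred = adaboost_predict(x, model)
```

On this separable toy set a single stump already classifies perfectly; with noisier data, later rounds would concentrate weight on the misclassified samples.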

Construction of integrated learning evaluation model
This section addresses the data imbalance problem using the easy-ensemble random undersampling algorithm. Random undersampling is performed repeatedly on the positive samples, and each randomly sampled subset is combined with the negative samples to obtain multiple new training sets. This lets the learning algorithm focus on different class distributions while preserving the minority-class samples; a classifier is trained on each newly synthesized dataset, and these classifiers are integrated according to the integration strategy to construct a more robust power system transient stability assessment model. The easy-ensemble random undersampling algorithm tackles imbalanced binary data by using integration techniques to mine potentially useful majority-class information that plain random undersampling overlooks. First, the training sets are initialized with the chosen number of samples; then a model is trained on each generated training set; finally, the models are integrated through the corresponding integration strategy. Let the training set be Sr, with Sr+ the positive sample set, Sr- the negative sample set, and T the number of undersampling rounds. In each round, a subset Srk+ is randomly drawn from the majority-class sample set Sr+ by repeated random sampling. The number of samples in Srk+ equals that of the minority-class sample set Sr-, and together they form a new training set Srk.
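The subset-construction step can be sketched as follows; the index ranges mirror the class counts reported in the dataset section, and the seed and T value are illustrative assumptions:

```python
import numpy as np

def easy_ensemble_subsets(majority_idx, minority_idx, T, seed=0):
    """Build T balanced training sets: each keeps every minority sample
    and pairs it with a fresh random draw from the majority class."""
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(T):
        drawn = rng.choice(majority_idx, size=len(minority_idx), replace=False)
        subsets.append(np.concatenate([drawn, minority_idx]))
    return subsets

# Illustrative index sets: 6380 majority (stable) vs. 3620 minority (unstable),
# matching the counts given in the dataset section.
maj = np.arange(6380)
mino = np.arange(6380, 10000)
subsets = easy_ensemble_subsets(maj, mino, T=10)
```

Each draw is taken without replacement within a round, so no majority sample repeats inside one subset, while different rounds see different majority samples.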
Train an individual model on each new training set Srk using the AdaBoost algorithm, as shown in equation (12), where h_{k,j} is the j-th weak classifier, alpha_{k,j} is the weight of h_{k,j}, and y is the actual category set of the training subset. The models are integrated according to equation (13) to obtain the final model H(x). Based on the reduced-dimensionality power system stability dataset and the trained multi-layer perceptron model, a multi-layer perceptron power system stability prediction algorithm based on easy-ensemble undersampling is designed. The detailed algorithm is as follows. First, the power system transient stability dataset, after dimensionality reduction by principal component analysis, is divided into a training set and a testing set. Second, the training set is divided into a positive sample set Sr+ and a negative sample set Sr-; the positive samples indicate that the power system state is transiently stable, while the negative samples indicate that it is unstable. Third, the negative sample set Sr- is retained, and the easy-ensemble undersampling algorithm performs T rounds of repeated undersampling on the positive sample set Sr+, yielding T positive-sample subsets Srk+ (k = 1, 2, ..., T). Combining the negative sample set Sr- with each positive subset Srk+ gives T new training sets Srk+ ∪ Sr-, k = 1, 2, ..., T. Fourth, MLP models are trained on the T new training subsets, with the hyperparameters of each MLP generated by a random function, giving T MLP power system transient stability prediction models. Bagging integration is then performed on the T MLP models to obtain the final integrated MLP power system transient stability prediction model.
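The four steps above can be sketched end to end with scikit-learn; the synthetic dataset, PCA component count, hyperparameter ranges, and vote threshold below are all illustrative assumptions standing in for the paper's data and tuning:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the transient-stability dataset: 12 features,
# roughly 64/36 class imbalance, as in the dataset section.
X, y = make_classification(n_samples=1200, n_features=12, weights=[0.64],
                           random_state=0)

# Step 1: PCA dimensionality reduction, then train/test split.
X = PCA(n_components=8).fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 2: separate the majority (stable) and minority (unstable) classes.
maj, mino = np.where(y_tr == 1)[0], np.where(y_tr == 0)[0]
if len(maj) < len(mino):
    maj, mino = mino, maj

# Steps 3-4: T undersampled balanced subsets, one MLP per subset with
# randomly drawn hyperparameters, then a simple majority vote (Bagging-style).
T, models = 10, []
for _ in range(T):
    drawn = rng.choice(maj, size=len(mino), replace=False)
    idx = np.concatenate([drawn, mino])
    hidden = int(rng.integers(16, 64))       # random hyperparameter draw
    clf = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=300,
                        random_state=0)
    models.append(clf.fit(X_tr[idx], y_tr[idx]))

votes = np.mean([m.predict(X_te) for m in models], axis=0)
y_hat = (votes >= 0.5).astype(int)
acc = (y_hat == y_te).mean()
```

The majority vote over T balanced learners is what lets the ensemble retain the minority-class sensitivity that a single undersampled model would trade away.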

Dataset
The dataset used in this article is the 2020 power system transient stability dataset of a certain university. It has 12 feature quantities and 2 labels, totaling 11000 records. The characteristic quantities include energy consumption, reaction time, motor power, average time, damping constant, and line capacity. The label indicates whether the power system is stable. In the observed reduced-dimensionality data, there are 6380 positive samples and 3620 negative samples.

Analysis
The new training sets are formed by combining each positive-sample subset produced by random undersampling with the negative-sample dataset. The parameters of each MLP are first set at random, so a new MLP model is created for each sampled training set. The trained models are then integrated using the Bagging technique to build the best predictor for evaluating the transient stability of power systems. The number of integrated models is chosen from the test set's Acc and Auc values: beyond a certain point, Acc and Auc oscillate around their levels without a notable rise, so the total number of integrated models is set to 10 MLPs. This article evaluates the performance of the integrated classification model using accuracy, recall, precision, f-score, Auc value, and ROC curve. First, the individual model before integration is compared with the integrated model on these indicators, as shown in table 1. Figure 3 compares the ROC curves of the multi-layer perceptron model and the integrated perceptron model, indicating that the integrated multi-layer perceptron model is more accurate. According to figures 4 and 5, the accuracy of the MLP model is 92.35%. Compared with traditional machine learning algorithms, the MLP achieves high accuracy in evaluating the transient stability of power systems. The ensemble learning method raises the model's accuracy to 95.2% while also improving the recall rate, precision, and f-score, solving the data imbalance problem.
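The indicators listed above are standard scikit-learn metrics; the labels and scores below are illustrative stand-ins for the test-set predictions, not the paper's results:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Illustrative ground truth, hard predictions, and probability scores.
y_true  = np.array([1, 1, 1, 0, 0, 1, 0, 1])
y_pred  = np.array([1, 1, 0, 0, 0, 1, 1, 1])
y_score = np.array([0.9, 0.8, 0.4, 0.2, 0.3, 0.7, 0.6, 0.95])

metrics = {
    "accuracy":  accuracy_score(y_true, y_pred),
    "recall":    recall_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "f_score":   f1_score(y_true, y_pred),
    "auc":       roc_auc_score(y_true, y_score),   # needs scores, not labels
}
```

Note that Auc is computed from the continuous scores (here, the vote fraction of the ensemble), while the other four indicators use the thresholded labels.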

Conclusion
Transient stability assessment during power system operation is becoming harder to accomplish with traditional methods due to the growing complexity of modern power systems, driven by the deepening reform of the power market, the rapid development of renewable energy, and the continuous expansion of the power system. Ongoing advances in computer science and technology have made it possible to use synchronous measuring units in real power systems to track their operational condition and transient stability. This paper presents a data-mining approach, based on machine learning, to building an MLP model for evaluating the transient stability of power systems. An ensemble-learning-based assessment model is also developed to address the data imbalance problem in these systems. The suggested approach combines the predicted samples with the relevant influencing factors to assess transient stability, and the easy-ensemble undersampling approach is used to build an integrated MLP transient stability evaluation model that deals with unbalanced data. According to the comparative assessment indicators, the suggested integrated model outperforms generic MLP models in sensitivity and accuracy when detecting power system instability.

Figure 1. Structure of a multi-layer perceptron network model. A combination of forward and backpropagation techniques is used to train multi-layer perceptron networks. The forward propagation algorithm builds upon each preceding layer's output, from one layer to the next and finally to the output layer.

Figure 2. Integrated learning training framework diagram. The idea of the Bagging algorithm is to sample the training set with replacement, generate several training subsets, train a classification model on each subset, and obtain the final model through an integration strategy. The most representative ensemble learning boosting algorithm is AdaBoost, on which various other algorithms are based. In the governing equations, D represents the entire dataset, S the training set, x the feature quantity, and y the label quantity; the sample weights and iteration number are initialized from the training set, with Dt(i) the weight of the corresponding training sample and T the number of iterations.

Figure 3. Model ROC curve. According to table 1, the integrated MLP model shows significant improvements in recall, accuracy, precision, f-score, and Auc value. The improvement in f-score indicates better predictive ability of the ensemble model for positive and negative samples, and the improvements in accuracy and Auc indicate a significant gain in the model's predictive ability and performance. The integrated model was also compared with traditional machine learning algorithms; the experimental results are shown in figures 4 and 5.

Figure 4. Accuracy and recall of each model.

Figure 5. Precision and f-score of each model.
Journal of Physics: Conference Series 2729 (2024) 012018, IOP Publishing, doi:10.1088/1742-6596/2729/1/012018
The differences are as follows: the Bagging ensemble algorithm uses parallel training, completing the training of all base learners simultaneously, while the AdaBoost ensemble algorithm uses serial training, where the next learner can only be trained after the previous one finishes. The training sets of the Bagging ensemble algorithm are independent of one another and have no impact on each other, whereas the AdaBoost training sets are interdependent: each round's training set is generated from the previous round's.

Table 1. Various indicators before and after model integration.