Construction of short-term electricity demand streaming forecasting model in demand-side dynamic response

In the context of demand-side dynamic response, the electricity supply-demand relationship undergoes constant changes, and short-term electricity load exhibits strong randomness and volatility, making load conditions challenging to predict. Hence, this paper proposes a short-term electricity demand streaming forecasting model that combines wavelet decomposition with Random Forest to enhance the accuracy of short-term electricity load forecasting. This model establishes a load feature system, utilizing a three-scale wavelet decomposition algorithm to break down the load sequence into several sub-sequences of different frequency bands. Subsequently, Random Forest load forecasting models are separately established for these sub-sequences. The final load prediction is obtained through reconstruction. This approach enables quasi-real-time short-term forecasting analysis of demand-side electricity demand.


Introduction
As the transition to a new power system dominated by new energy sources advances, the scale of new energy sources is increasing significantly and the gap between peak and off-peak loads is widening. The integration of a substantial proportion of new energy sources into the grid places higher demands on the security and reliability of large-scale power grid operations. Consequently, the need for flexibility within the power system becomes increasingly urgent [1]. Electricity demand-side management represents an innovative energy management model within the electricity market. It has the potential to significantly alleviate power supply tensions and maintain the stability of the power system. To achieve the coordinated development of flexible resources with both new and traditional energy sources, constructing a fine-grained, market-oriented, specialized, and intelligent demand-side resource utilization system is one of the critical avenues to pursue. In the realm of demand-side dynamic response, the relationship between electricity supply and demand is in a constant state of flux. The real-time analysis of vast quantities of electrical energy data is instrumental in enabling more precise management of the dynamic response process. Within this context, short-term forecasting for demand response plays a pivotal role in ensuring the stability and reliability of the power grid and in formulating effective dispatching plans [2]. Accurate short-term electricity load forecasting is essential for enhancing the accuracy of power-related decisions, facilitating the efficient allocation of power resources, and optimizing energy utilization [3]. These forecasting results generally serve as crucial reference points for system dispatch management and analytical decision-making. Therefore, proposing high-precision and viable methods for short-term load forecasting is of great significance.
The short-term load forecasting domain has advanced continuously, both domestically and internationally, and short-term load forecasting technology has gradually matured. Forecasting accuracy has reached a level that essentially meets the demands of modern power system development. Several efficient short-term load forecasting methods have found extensive applications in practical power systems. Traditional approaches to user load forecasting have predominantly hinged on techniques like linear regression and time series regression analysis. However, the actual process of short-term load forecasting entails more than just considering load data; it necessitates a comprehensive evaluation of cross-influences stemming from complex factors such as meteorological conditions and day types. Faced with these nonlinear data patterns, conventional mathematical and statistical methods often fall short in providing practical solutions; the task is better handled by artificial intelligence algorithms with more robust nonlinear processing capabilities [4]. The load forecasting field has recently shifted towards artificial intelligence-based methods, with various artificial intelligence prediction models gaining popularity. These models encompass artificial neural networks, wavelet analysis, support vector machines, Random Forests, various advanced and improved algorithms, and combination algorithms. However, as research progresses, datasets have become increasingly complex, exposing the limitations of individual algorithms when applied to different datasets. As a result, researchers have proposed the fusion of multiple algorithms to harness their strengths and offset weaknesses, aiming for improved predictive performance. Examples of such approaches include the Bayesian-optimized convolutional neural network-bidirectional gated recurrent unit (CNN-BiGRU) [5] and the stacking of long short-term memory (LSTM) with lightweight gradient boosting machine algorithms [6].
As the future smart grid evolves and upgrades, short-term load forecasting technology will unavoidably encounter new challenges and tests. The power system will place increased demands on forecasting models, requiring higher levels of accuracy and adaptability [7]. Consequently, the pursuit of more adaptable, higher-precision, scientifically rigorous, and productive forecasting methods represents a critical direction for the ongoing development and enhancement of short-term load forecasting technology in future power systems. In alignment with this imperative, this paper introduces a short-term electricity demand streaming forecasting model that combines wavelet analysis with Random Forest. This model facilitates quasi-real-time short-term forecasting analysis of demand-side electricity requirements, thereby contributing to the assessment of power supply-demand balance within the framework of advanced power systems [8].

Data selection and processing
In this study, data were collected from 3,234 electricity users in a specific region of Guangxi over the course of a week, with readings taken at 96 time points each day. Initial data preprocessing was performed. Missing power/load data for a user were imputed using the data from the subsequent hour. Factors such as the power supply substation, user addresses, distribution transformers [9], and terminal station numbers were deemed irrelevant to the prediction results. Therefore, the analysis focused solely on factors such as whether a user was considered high-risk, their subscribed power capacity, voltage level, and the hourly power/load data.
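The imputation rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `impute_from_next_hour` is hypothetical, and `steps_per_hour=4` assumes that 96 readings per day correspond to 15-minute sampling.

```python
import numpy as np

def impute_from_next_hour(load, steps_per_hour=4):
    """Fill NaN readings with the reading taken one hour later.

    Assumption: 96 readings per day means 15-minute sampling,
    so one hour corresponds to 4 steps.
    """
    filled = np.asarray(load, dtype=float).copy()
    # Walk backwards so that a donor reading that was itself missing
    # has already been filled before an earlier gap needs it.
    for t in range(len(filled) - 1, -1, -1):
        donor = t + steps_per_hour
        if np.isnan(filled[t]) and donor < len(filled):
            filled[t] = filled[donor]
    return filled

# Illustrative readings with two gaps (synthetic values).
readings = np.array([1.0, np.nan, 3.0, 4.0, 5.0, 6.0,
                     np.nan, 8.0, 9.0, 10.0, 11.0, 12.0])
filled_readings = impute_from_next_hour(readings)
```

Gaps in the final hour of a series have no later donor and stay missing under this rule, so they would need a separate treatment.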

Analyzing dimensions of load features
Load forecasting involves inherent randomness and uncertainty, so it cannot be approached solely as a mathematical problem. Nevertheless, the influencing factors of load features exhibit discernible patterns. We can gain insights into load variation patterns and conduct causal analyses through dedicated research, investigation, and in-depth exploration of the mechanisms that underlie load composition. This process helps elucidate the construction of input features crucial for short-term load forecasting. The overarching influencing factors for load features include historical load factors, meteorological factors, seasonal factors, and day-type factors.
(1) Historical Load Factors: Historical load data is an indispensable input when determining load forecasting models. Daily load curves exhibit periodic patterns, and historical load data is one of the most valuable reference factors when selecting input variables for load forecasting models.
(2) Meteorological Factors: Meteorological factors typically include temperature, atmospheric pressure, wind speed, humidity, and weather conditions. They largely influence people's lifestyles and electricity consumption habits. For example, in the summer, air conditioners operate at full load, while on cloudy or rainy days, lighting usage may increase, and during thunderstorms, power sources may be shut down. These meteorological factors are important influencing factors that cannot be neglected in load forecasting.
(3) Seasonal Factors: Seasonal factors are, in essence, somewhat aligned with meteorological factors and manifest in aspects such as daily temperature fluctuations and daily maximum and minimum temperatures. In the short term, seasonal factors have a less pronounced impact, so daily load curves stay roughly at the same level. In the long term, however, the load fluctuates noticeably with the changing seasons.
(4) Day-Type Factors: Day-type factors refer to the influence of weekdays, weekends, holidays, and similar factors on load. Over the course of a week, weekdays generally exhibit consistent load patterns, while weekends often see a certain degree of load reduction. This variation is related to the distribution of industrial structures and people's daily routines, constituting a cyclical influencing factor. Likewise, national statutory holidays, major social events, environmental protection activities, and other public welfare events can also impact load variation. However, these factors lack predictable patterns and can only be considered in particular circumstances, falling into the category of stochastic influencing factors.
(5) Other Factors: Electricity prices, electromagnetic disturbances caused by external or internal factors, equipment insulation protection issues, and sudden faults or power outages are also significant drivers of load variation.

Constructing a load feature index system
Building upon the analysis results of feature dimensions, we construct a load feature index system for each dimension, as illustrated in Table 1.
Table 1. Load feature index system.

Influencing factor | Feature input quantity
Historical load | Load at time t-1 on the day before the forecasted day, $L_{t-1}^{d-1}$
Historical load | Load at time t on the day before the forecasted day, $L_{t}^{d-1}$
Historical load | Load at time t+1 on the day before the forecasted day
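The historical-load inputs listed in Table 1 can be assembled from a load series as in the sketch below. It is an illustration under stated assumptions: the function name `previous_day_features` is hypothetical, the series is assumed to be sampled at 96 points per day, and the days are treated as one continuous index, so the "t-1" neighbour of the first slot of a day is the last slot of the day before it.

```python
import numpy as np

POINTS_PER_DAY = 96  # assumption: 96 readings per day, as in the data description

def previous_day_features(load, points_per_day=POINTS_PER_DAY):
    """Build (X, y) where each row of X holds the previous day's load
    at times t-1, t, and t+1, and y is the load at time t on the target day."""
    load = np.asarray(load, dtype=float)
    # First usable target index: its t-1 neighbour on the previous day must exist.
    idx = np.arange(points_per_day + 1, len(load))
    X = np.column_stack([
        load[idx - points_per_day - 1],  # previous day, time t-1
        load[idx - points_per_day],      # previous day, time t
        load[idx - points_per_day + 1],  # previous day, time t+1
    ])
    y = load[idx]
    return X, y

# Synthetic three-day series whose value equals its own index, for easy checking.
series = np.arange(3 * POINTS_PER_DAY, dtype=float)
X, y = previous_day_features(series)
```

Other feature groups (meteorological, day-type) would be appended as extra columns in the same way once the corresponding data are available.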

Three-scale wavelet decomposition
In the context of demand-side dynamic response, where load fluctuations are significant and can affect the stability of load forecasting, we applied the Mallat algorithm to perform three-scale wavelet decomposition on the load data, resulting in sub-wavelet sequences, as presented in Figure 1. Reconstruction is the inverse process of decomposition. By reconstructing the signal step by step, we can restore the original signal $C_N$. Figure 2 elucidates the reconstruction process.
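The decomposition and reconstruction steps can be illustrated with a small filter-bank sketch. To stay self-contained it uses the Haar wavelet, which is a swapped-in simplification and not necessarily the wavelet used by the authors; all function names are hypothetical, and the signal length is assumed to be divisible by 8.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def analysis_step(c):
    """One Mallat analysis step with Haar filters (assumption: Haar, for brevity).
    Returns (low-frequency part, high-frequency part), each half as long as c."""
    even, odd = c[0::2], c[1::2]
    return (even + odd) / SQRT2, (even - odd) / SQRT2

def synthesis_step(low, high):
    """Inverse of analysis_step: merge the two half-length parts."""
    c = np.empty(2 * len(low))
    c[0::2] = (low + high) / SQRT2
    c[1::2] = (low - high) / SQRT2
    return c

def decompose(signal, levels=3):
    """Repeatedly split the low-frequency part; highs[0] is the finest scale."""
    low = np.asarray(signal, dtype=float)
    highs = []
    for _ in range(levels):
        low, high = analysis_step(low)
        highs.append(high)
    return low, highs

def reconstruct(low, highs):
    """Step-by-step reconstruction, from the coarsest scale to the finest."""
    for high in reversed(highs):
        low = synthesis_step(low, high)
    return low

def sub_sequences(signal, levels=3):
    """Full-length series per band: keep one band and zero out the others."""
    low, highs = decompose(signal, levels)
    zeros = [np.zeros_like(h) for h in highs]
    bands = [reconstruct(low, zeros)]
    for i, h in enumerate(highs):
        kept = list(zeros)
        kept[i] = h
        bands.append(reconstruct(np.zeros_like(low), kept))
    return bands

# Synthetic one-day load curve (96 points) for illustration only.
t = np.arange(96)
load = 50 + 10 * np.sin(2 * np.pi * t / 96) + np.cos(2 * np.pi * t / 8)
low, highs = decompose(load)
restored = reconstruct(low, highs)
bands = sub_sequences(load)
```

Because the transform is linear, the four full-length sub-sequences add up to the original series, which is what allows each band to be forecast separately and the forecasts to be recombined afterwards.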

$$\psi_{j,k}(t) = 2^{-j/2}\,\psi\left(2^{-j}t - k\right), \quad j, k \in \mathbb{Z}$$
Binary wavelets transition from one scale to the next with a corresponding time shift that either doubles or halves. Binary wavelets exhibit a binary scaling behavior, resulting in a relatively coarse partitioning of the frequency domain channels. However, they offer faster computational speeds, especially when employing fast wavelet decomposition algorithms, making them well-suited for digital signal processing.

Data standardization.
In addition to handling outlier data, it is essential to preprocess the variations in electricity load values. Because electricity consumption can differ significantly among users of the same type, the disparities between low-capacity and high-capacity users might be more substantial than the differences caused by abnormal usage patterns arising from electricity theft among users of similar capacity. This could lead to incorrect categorization, making it more likely for low-capacity users to be mistakenly classified as high-risk electricity theft users, which in turn can affect the accuracy of identifying high-level electricity theft behaviors.
To eliminate the influence of load capacity, the daily average electricity load of each user is taken as the baseline value, and the power at each time point is transformed into a standardized value relative to that baseline. This normalization process helps eliminate differences in load magnitude between different users, allowing for a more focused comparison of differences in load curve shapes. Additionally, given the dimensional differences in load feature input variables, data standardization is necessary to accurately represent their influence on load forecasting.
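A minimal sketch of the baseline normalization described above, assuming that "standardized value" means each reading divided by that day's average load (the function name `normalize_by_daily_mean` is hypothetical, and a non-zero daily mean is assumed):

```python
import numpy as np

def normalize_by_daily_mean(load, points_per_day=96):
    """Divide each reading by the average load of its own day.

    Assumptions: the series covers whole days and every daily mean is non-zero.
    """
    days = np.asarray(load, dtype=float).reshape(-1, points_per_day)
    baseline = days.mean(axis=1, keepdims=True)  # one baseline per day
    return (days / baseline).reshape(-1)

# Two synthetic users with the same curve shape but a 10x difference in capacity.
t = np.arange(2 * 96)
small_user = 5.0 + np.sin(2 * np.pi * t / 96)
large_user = 10.0 * small_user
small_norm = normalize_by_daily_mean(small_user)
large_norm = normalize_by_daily_mean(large_user)
```

After this step the two users yield the same normalized curve, so any remaining difference between users reflects curve shape and not magnitude.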
By using data normalization methods, the denoised load data and the original data for load characteristic influencing factors are processed to generate new sequences, where x denotes the original data,

Forecasting model with random forest
After performing the three-scale wavelet decomposition and obtaining one low-frequency component and three high-frequency components, each of these components is subjected to load forecasting using the Random Forest algorithm. The forecasting process is illustrated in Figure 3 and can be outlined as follows:
Step 1: Relevant load feature influencing factors are chosen based on the load variation characteristics of the wavelet components.
Step 2: The dataset that has undergone preprocessing and feature engineering is split into training and testing sets, typically following a 7:3 ratio. Techniques like K-fold cross-validation and grid search are used to enhance the model's generalization capability.
Step 3: The Random Forest algorithm is employed to perform forecasting after selecting different subsets of training and testing samples.
Step 4: Grid search is used to systematically traverse the Random Forest hyperparameters, such as the number of estimators, max_depth, and max_features, and to fine-tune the model in conjunction with evaluation metrics. The optimal parameters, the best training-testing set split configuration, and the optimal model are determined based on the mean absolute error.
Step 5: The sub-wavelet components are forecasted, and the corresponding forecasting results are obtained.
Step 6: The forecasted sub-wavelet components are reconstructed to get the final forecasting result.
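Steps 2 to 5 can be sketched for a single wavelet component as follows. This is an illustration under assumptions, not the authors' implementation: it uses scikit-learn's `RandomForestRegressor` (a regressor, since load is a continuous quantity), a synthetic series in place of a real sub-sequence, only the three previous-day load features, and an arbitrary small hyperparameter grid.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for one wavelet sub-sequence: 10 days x 96 points.
t = np.arange(10 * 96)
component = 50 + 10 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 0.5, t.size)

# Features: previous-day values at t-1, t, t+1 (assumed feature set).
idx = np.arange(97, component.size)
X = np.column_stack([component[idx - 97], component[idx - 96], component[idx - 95]])
y = component[idx]

# Step 2: 7:3 split; shuffle=False keeps the time order intact.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=False
)

# Steps 3-4: grid search with K-fold cross-validation, scored by
# (negated) mean absolute error. The grid values are illustrative.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [4, 8],
    "max_features": [1, 2],
}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=3,
)
search.fit(X_train, y_train)

# Step 5: forecast the held-out part of this component.
y_pred = search.predict(X_test)
test_mae = float(np.mean(np.abs(y_test - y_pred)))
```

In the full procedure the same fit would be repeated for each of the four components, and the four forecasts would then be recombined by wavelet reconstruction (Step 6).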

Accuracy measurement of load forecasting
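The accuracy of load forecasting is measured using the Mean Absolute Percentage Error (MAPE), which is calculated as follows:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{L_i - L_i^{f}}{L_i}\right| \times 100\%$$
where $n$ is the number of forecast points.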
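A minimal sketch of the MAPE computation, assuming the standard definition (the mean of absolute errors relative to the true load, expressed in percent) and non-zero true load values; the function name `mape` and the sample numbers are illustrative only.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error in percent.

    Assumption: no true load value is zero.
    """
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

# Illustrative values: relative errors of 10%, 5%, and 0%.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]
score = mape(actual_load, forecast_load)
```

Because each error is divided by the true load, MAPE becomes unstable when the load is close to zero, which is worth checking before applying it to high-frequency wavelet components.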

Example verification
Taking the load data in May 2023 as an example, the db4 wavelet function is used to perform three-scale wavelet decomposition. The obtained low-frequency component, A3, and three high-frequency components, D1, D2, and D3, are shown in Figure 4. The wavelet decomposition and Random Forest prediction model constructed in this paper is used to predict the sub-wavelet components on May 30, as shown in Figure 5.
The predicted sub-wavelet components of May 30 are reconstructed to obtain the final forecast load of May 30, as shown in Figure 6. Comparing the forecast results with the real load data, the mean absolute percentage error (MAPE), used to measure the prediction accuracy of the model, is 1.59%.

Conclusion
This paper forms the data foundation for an extended analysis of demand-side response, aiming to enhance the precision of short-term electricity demand forecasting. It introduces a hybrid forecasting model that combines wavelet decomposition with Random Forest, leveraging quasi-real-time energy consumption data streams collected by metering automation systems. The paper constructs a load feature system from four main facets: historical load patterns, meteorological factors, day-type influences, and other contributing factors. This systematic approach enables a thorough comprehension of the patterns and causal factors underlying load variations. Through the utilization of wavelet decomposition, the load time series is dissected into high-frequency and low-frequency load components. The developed Random Forest load prediction model is then applied to forecast each of these sub-load components independently. Finally, the ultimate forecasted results are obtained by reconstructing the load components through wavelet reconstruction. This forecasting model can swiftly extract information from the original load sequence, demonstrating robustness, resistance to local optima, and high predictive accuracy. However, due to limitations in sample size, the model's accuracy cannot be guaranteed for other scenarios. Furthermore, this model has not undergone improvements in the underlying Random Forest algorithm itself. In our forthcoming steps, we intend to augment the Random Forest algorithm to improve the forecasting model's accuracy and continually refine short-term load forecasting technology.

Figure 1. Three-scale wavelet decomposition. $C_N$ represents the original signal. In the first-level decomposition, we obtain the low-frequency component $d_{N-1}$ and the high-frequency component $C_{N-1}$. The low-frequency component $d_{N-1}$, after second-level decomposition, yields the next level's low-frequency component $d_{N-2}$ and high-frequency component $C_{N-2}$. The low-frequency component $d_{N-2}$, when subjected to a third-level decomposition, results in the low-frequency component $d_{N-3}$ and high-frequency component $C_{N-3}$. Ultimately, the load data is divided into one low-frequency component and three high-frequency components.

Figure 2. Reconstruction of three-scale wavelet decomposition.

Construction of a short-term electricity load forecasting model based on random forest

Data preprocessing

Data denoising.
Wavelet transformation is a novel analytical method that effectively highlights specific features of a problem. It allows for local time-frequency (or space-frequency) analysis, progressively refining signals (or functions) through scale and shift operations. This approach achieves fine time division for high frequencies and fine frequency division for low frequencies. It automatically adapts to the requirements of time-frequency signal analysis, enabling a focus on any detail of the signal. This resolves the challenges encountered with Fourier transformation. Following data cleaning, the load data should be subjected to denoising using the discrete wavelet transform method.

Figure 3. The forecasting process of the hybrid model combining wavelet decomposition and Random Forest.
Random Forest is represented as $\{H(X, \theta_i),\ i = 1, 2, \ldots, k\}$

$L_i$ represents the true load value, $L_i^{f}$ denotes the forecasted load value, and $i$ indicates 24 h.

Figure 6.Comparison of actual load and predicted load on May 30.