Multistep Prediction of Multiple Small Hydropower Stations’ Total Power in a Watershed within a Day Considering the Distributed Discharge TCN-LSTM Model

First, this paper observes that although the cascade of small hydropower stations in a basin as a whole generates electricity from the hydrological natural flow, every station except those of the first stage also reuses a certain proportion of the outflow of the station immediately upstream. This clarifies the relationship between predicting the total power of the small hydropower group and predicting the hydrological natural flow at each level. Second, this paper proposes a method for selecting satellite remote sensing monitoring points based on topographic elevation, which determines the increment of the rainfall collection area above each small hydropower dam site. This resolves the correspondence between the source of each station's inflow and the rainfall collection areas above the dam sites, which are nested step by step, and accounts for the fact that each station takes its share of the hydrological natural flow of the whole basin according to the increment of the rainfall collection area above its dam site. Third, this paper proposes a method that combines the calculation of runoff yield, deep learning simulation of flow concentration, and fitting of the small hydropower generation equation, namely the Distributed Runoff TCN-LSTM model (DR-TCN-LSTM). This addresses the problem that, for every station except those of the first level, the major part of the inflow is the reused outflow of the immediately upstream station, which has been regulated by that station, rather than a purely natural product. In an example of 8-step intraday prediction of the total power of multiple small hydropower stations in a basin in Guangxi, China, the proposed DR-TCN-LSTM model reaches a Nash coefficient of 0.919 for the total power. Using the calculated runoff increases the Nash coefficient by 0.02.
In the selection of deep learning models, the Nash coefficient of the prediction model proposed in this paper is 0.02 higher than that of DR-TCN-GRU and 0.011 higher than that of DR-BP-LSTM.


INTRODUCTION
The power grid cannot dispatch most small hydropower stations within its jurisdiction (hydropower stations with a rated capacity of less than 50 MW) and has to absorb all of their power generation. Therefore, the power grid must predict the power generation of these stations before it can reasonably dispatch adjustable energy. Upstream small hydropower stations strongly interfere with downstream ones, especially during the dry season. The inflow of the downstream small hydropower stations comes almost entirely from the outflow of the upstream ones. There is therefore a water flow connection between the small hydropower stations in a basin cascade, and they must start and stop generating almost simultaneously. A group of small hydropower stations in a basin cascade must thus be treated as one power source. From the perspective of the power grid, this article makes a multi-step prediction of the total power of multiple small hydropower stations in the basin within a day.
Previous research methods for predicting the total power (including hydrological natural flow) of small hydropower stations can be divided into two categories: statistical laws and genetic laws.
In terms of statistical laws, because the total power of small hydropower stations (including the hydrological natural flow) is non-aftereffect and non-periodic, the procedure is as follows. First, according to the "Code for Design Flood Calculation of Hydraulic and Hydroelectric Engineering" [1], the prior probability density of the hydrological natural flow in China adopts the positively skewed P-III type. The Markov Chain Monte Carlo method [2] is then used to sample the historical sequence of the hydrological natural flow, and the posterior distribution is obtained by Monte Carlo integration. Next, the multivariate joint probability distribution of the hydrological natural flow and the control inputs (such as rainfall intensity) is expressed with Copula functions. Finally, the parameters of each preset function are obtained by minimizing the Euclidean distance between the theoretical Copula function and the actual Copula function.
In terms of genetic laws, the methods follow the hydrological theory of runoff generation and concentration and regard rainfall intensity and the underlying surface as source inputs [3]. There are roughly two types:

The physical hydrological model
The physical hydrological model treats the watershed as a simulation unit and describes the hydrological natural flow as generated by rainfall intensity acting on the underlying surface through runoff generation and concentration. Because satellite remote sensing enriches the meteorological and underlying surface data for the runoff generation process, distributed physical hydrological models with multi-point monitoring have become the mainstream of hydrological prediction. However, they have three disadvantages. (1) Detailed data on the concentration process are lacking, such as the inflow hydrograph of the excess infiltration and full storage discharge at the end of the slope catchment; physical hydrological models also place strict requirements on input data and boundary conditions and require excessive computational power. (2) The physical hydrological model is still solved with linear differential equations, which requires the approximate assumption that the hydrological system satisfies homogeneity and superposition. The results are therefore accurate only for floods in large watersheds: the smaller the watershed area and the rainfall intensity, the greater the error caused by this approximation. (3) Many empirical equations assume that the underlying surface remains unchanged, whereas the actual underlying surface changes constantly, which makes physical hydrological models poorly robust. Predicting the wet and dry seasons separately is often a forced compromise.

The distributed deep learning model
Deep learning is a class of simple nonlinear computing modules connected through a deep multi-level network structure to learn the mapping relationship from sample variables to label variables.
Taking meteorological and underlying surface data, such as distributed rainfall intensity, as sample inputs and the power generated by small hydropower stations as label outputs, the distributed deep learning model [4] simulates three nonlinear processes: runoff generation, runoff concentration, and power generation. Among hydrological models of this type, distributed deep learning models combined with LSTM [5][6][7][8][9] and GRU [10] are superior to the others.

THEORETICAL BACKGROUND
Relationship between total power prediction of small hydropower stations and hydrological natural flow prediction
During the runoff generation process, the distributed rainfall intensity, acted on by vegetation coverage, evapotranspiration rate, soil water capacity, soil permeability, etc., forms a water supply on the soil surface and in the soil. This supply flows to the different slope surfaces in a confluence manner and is known as distributed runoff (rs, rg). The flow generated on the soil surface is called the excess infiltration flow rs, and the flow generated in the soil is called the full storage flow rg; both can be calculated.
During the confluence process, the distributed runoff $(rs_s, rg_s)$ is routed through the elevation, terrain, and land use type distribution of the basin to generate the hydrological natural flow $N_{s,t}$ of the level-s small hydropower station. Between small hydropower stations, the inflow $I_{s,t}$ of the level-s station is equal to its hydrological natural flow $N_{s,t}$ plus a certain proportion of the outflow of the level s-1 station on each branch, delayed by Δt, the flow time difference between adjacent small hydropower stations. Within a small hydropower station, the inflow is divided between the outlet flow O and the reservoir capacity flow D, and T is the data sampling period.
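The inflow relation between adjacent levels can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's program: the function name `cascade_inflow`, the reuse shares `alpha`, the branch labels, and the integer lag `lag_steps` (Δt expressed in sampling periods) are all hypothetical, and the numbers are made up.

```python
from typing import Dict, List


def cascade_inflow(
    natural_flow: List[float],
    upstream_outflows: Dict[str, List[float]],
    alpha: Dict[str, float],
    lag_steps: Dict[str, int],
) -> List[float]:
    """Inflow I[s,t] = N[s,t] + sum over branches b of alpha[b] * O[b, t - lag[b]].

    natural_flow      : N[s,t] of the level-s station, one value per step
    upstream_outflows : O[b,t] of each level s-1 branch b (hypothetical labels)
    alpha             : assumed share of each branch outflow that is reused
    lag_steps         : assumed travel time of each branch, in sampling steps
    """
    inflow = []
    for t, n in enumerate(natural_flow):
        total = n
        for branch, outflow in upstream_outflows.items():
            src = t - lag_steps[branch]
            if src >= 0:  # nothing has arrived yet before the first lagged sample
                total += alpha[branch] * outflow[src]
        inflow.append(total)
    return inflow


# Toy data: one downstream station fed by two upstream branches.
inflow = cascade_inflow(
    natural_flow=[1.0, 1.0, 1.0, 1.0],
    upstream_outflows={
        "branch_1": [10.0, 12.0, 14.0, 16.0],
        "branch_2": [4.0, 4.0, 4.0, 4.0],
    },
    alpha={"branch_1": 0.5, "branch_2": 1.0},
    lag_steps={"branch_1": 1, "branch_2": 0},
)
print(inflow)
```

With these toy values the upstream contribution dominates the natural flow from the second step onward, which is the situation the text describes for stations below the first level.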
During the power generation process, according to the operating characteristic curve of the hydraulic turbine, the power generation P is determined by the power generation flow Q and the gross water head $H_g$. After transformation, P can equally be determined by Q and the storage capacity C. The outlet flow O is equal to the sum of the power generation flow Q and the wastewater (spilled) flow W. Because measurement data for W are lacking for small hydropower stations across the country, this article assumes that W is 0, that is, no water is spilled and no economic benefit is wasted.
Because the prediction of the total power $P_\Sigma$ of multiple small hydropower stations in a watershed involves multiple physical processes in series and in parallel, this article predicts it with a method that combines the calculation of runoff yield, deep learning simulation of flow concentration, and fitting of the power generation equation for small hydropower stations.

The time-varying distributed flow calculation method
The flow production process mainly involves 3 parts: vegetation interception, evapotranspiration, and soil water retention of rainfall.
The influence of vegetation on the flow production process is reflected in 2 aspects.
(1) Vegetation divides the area of each monitoring point into a vegetated zone with area ratio VFC and a non-vegetated zone with area ratio 1-VFC.
(2) Rainfall falling into the vegetated zone undergoes an additional vegetation interception process: it must pass through the stems and leaves, where surface water holding and surface evapotranspiration reduce it before it reaches the soil.
In the vegetated zone, let the water holding capacity of the stem and leaf surfaces be c and their evapotranspiration rate be ec. Because the vegetation blocks the sunlight, soil evapotranspiration in the vegetated zone is mainly transpiration, at rate ep. The net rainfall rate vp on the soil of the vegetated zone is then given by Equation (2). In the non-vegetated zone, the soil evapotranspiration rate is eb, and the net rainfall rate vb is given by Equation (3).
The soil infiltration rate is fp in the vegetated zone and fb in the non-vegetated zone, and the excess infiltration flow rs is determined from the net rainfall rates and these infiltration rates. The rate of water absorption at the upper soil surface is min(vp, fp) in the vegetated zone, where the soil water holding capacity is wp, and min(vb, fb) in the non-vegetated zone, where the soil water holding capacity is wb; the full storage flow rg is determined from these absorption rates and holding capacities.
The hydrological natural flow N is the superposition of the excess infiltration flow rs and the full storage flow rg after concentration over the increment ΔS of the rainfall collection area above the small hydropower dam site. From runoff generation to confluence into the small hydropower reservoir, rs takes on the order of minutes, while rg takes several hours.
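The runoff calculation for one monitoring point can be sketched as below. The sketch takes the net rainfall rates vp and vb as given and assumes textbook infiltration-excess and saturation-excess (bucket) forms for rs and rg. These forms, the `Zone` class, the function `distributed_runoff`, and all numbers are assumptions made for illustration, not the paper's equations.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Zone:
    """Vegetated or non-vegetated part of one monitoring-point cell."""
    area_ratio: float    # VFC for the vegetated zone, 1 - VFC otherwise
    infiltration: float  # fp or fb, water depth the soil can absorb per step
    capacity: float      # wp or wb, soil water holding capacity (depth)
    stored: float = 0.0  # soil water currently held (assumed state variable)

    def step(self, net_rain: float) -> Tuple[float, float]:
        """Advance one step; return (excess-infiltration, full-storage) depth."""
        net_rain = max(net_rain, 0.0)                     # assumed non-negative
        surface = max(net_rain - self.infiltration, 0.0)  # cannot infiltrate
        self.stored += min(net_rain, self.infiltration)   # min(v, f) as in the text
        spill = max(self.stored - self.capacity, 0.0)     # above holding capacity
        self.stored -= spill
        return surface, spill


def distributed_runoff(veg: Zone, bare: Zone, vp: float, vb: float) -> Tuple[float, float]:
    """Area-weighted (rs, rg) of one cell for one sampling step."""
    rs_p, rg_p = veg.step(vp)
    rs_b, rg_b = bare.step(vb)
    rs = veg.area_ratio * rs_p + bare.area_ratio * rs_b
    rg = veg.area_ratio * rg_p + bare.area_ratio * rg_b
    return rs, rg


# Made-up numbers: VFC = 0.5, the same net rainfall in two consecutive steps.
veg = Zone(area_ratio=0.5, infiltration=2.0, capacity=3.0)
bare = Zone(area_ratio=0.5, infiltration=1.0, capacity=1.0)
first = distributed_runoff(veg, bare, vp=4.0, vb=4.0)
second = distributed_runoff(veg, bare, vp=4.0, vb=4.0)
print(first, second)
```

In this toy run rs appears in the first step, while rg only appears in the second step once the soil storage is full, which mirrors the faster response of rs noted in the text.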
The forecasting period of intraday multi-step prediction is 24 hours. Because the data sampling period T available from satellite remote sensing technology is 3 hours, this paper predicts 8 steps within the day. Assuming that the current moment is t, the hydrological natural flow in the prediction time has two sources, the excess infiltration flow rs and the full storage flow rg. For the full storage flow, the input covers the prediction time and the immediately adjacent historical time, a total of brgs steps, $rg_{s,\,t+9-brgs \sim t+8}$, with brgs ≥ 8.
This area should not be counted in the sum of the rainfall collection areas above the dam sites of the level s-1 small hydropower stations on each branch, $\sum S_{s-1}$, and its altitude should be slightly higher than the altitude of the level-s small hydropower station.
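The step arithmetic in this paragraph can be checked with a few lines of Python. This is only an indexing sketch: the helper name `rg_window` and the use of integer step indices are assumptions introduced here.

```python
FORECAST_HOURS = 24
SAMPLING_HOURS = 3                            # data sampling period T
N_OUT = FORECAST_HOURS // SAMPLING_HOURS      # number of predicted steps


def rg_window(t: int, brgs: int, n_out: int = N_OUT) -> range:
    """Step indices t+9-brgs .. t+8 of the full storage flow used as input.

    The window ends at the last predicted step and reaches back brgs steps,
    so it contains historical steps whenever brgs exceeds n_out.
    """
    if brgs < n_out:
        raise ValueError("brgs must be at least the number of predicted steps")
    return range(t + n_out + 1 - brgs, t + n_out + 1)


window_8 = rg_window(t=100, brgs=8)     # prediction steps only
window_12 = rg_window(t=100, brgs=12)   # 4 historical steps plus 8 predicted
print(N_OUT, list(window_8), list(window_12))
```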

Satellite remote sensing monitoring point selection method
If the increment $\Delta S_s$ of the rainfall collection area above the dam site of the level-s small hydropower station is too small, the hydrological natural flow $N_s$ of that station can be neglected as an approximation. On this basis, the satellite remote sensing monitoring points corresponding to the hydrological natural flow N of each small hydropower station are selected, as shown in Table 1. The runoff (rs, rg) of the four selected monitoring sites is shown in Figure 2.
Two categories of station are distinguished. (1) Small hydropower plants a, b, and g: the main source of inflow is the corresponding runoff $(rs_s, rg_s)$. (2) Small hydropower stations c, d, e, f, h, i, j, and k. In particular, the reservoir capacity $C_b$ is 2 orders of magnitude larger, and the reservoir capacity $C_g$ 1 order of magnitude larger, than those of the other small hydropower stations, and neither provides full annual regulation, so reservoir storage has a certain effect on the daily regulation of small hydropower stations c, d, e, f, h, i, j, and k. The smaller the proportion of the hydrological natural flow $N_s$ in the inflow $I_s$ of these stations, the more the fluctuation pattern of $I_s$ differs from that of $N_s$, and the greater the influence of upstream large reservoir storage becomes, especially in the dry period. Therefore, the main source of the inflow $I_s$ of these small hydropower stations is not the corresponding runoff $(rs_s, rg_s)$ but the generation flow of the small hydropower stations of the immediately upper level. For this reason, the linkage between the small hydropower plants in the basin cascade is a linkage of water flow, not a linkage of power generation P.
Therefore, this paper argues that the power generation P is only the surface result; the cascade transfer of the power generation flow Q between the small hydropower plants in the basin cascade, together with the gross head $H_g$ or reservoir capacity C formed by the water stored in each plant, is the essence of the prediction. Instead of using the generation power P of each small hydropower plant directly as the deep learning label, the cascade prediction should use the generation flow Q of each plant as one of the labels, so that deep learning only simulates the concentration mapping of the distributed runoff (rs, rg) at each level. In addition, because the same generation flow Q produces different generation power P at different gross heads $H_g$ or reservoir capacities C, the gross head $H_g$ or reservoir capacity C of each plant must be derived for the prediction time. Finally, the fitted power generation equation of the small hydropower plants is used to obtain the generation power P of each plant in the basin and the total power $P_\Sigma$ for the prediction time.

Selection of independent variables for power generation in the prediction period
At the statistical granularity of the 3-hour data sampling period T, the data show that the reservoir capacity flow $D_s$ is much smaller than the power generation flow $Q_s$, so the power generation flow $Q_{s,\,t+1\sim t+8}$ is almost equal to the inflow $I_{s,\,t+1\sim t+8}$.

Fitting of the small hydropower plant generation equation
First, we normalize the data. The reference value of the power generation P is taken as 1 MW and that of the power generation flow Q as 50 m^3/s; the reference value of the reservoir capacity C is then 54*10^4 m^3, which is the integral of the reference value of Q over the data sampling period T = 3 hours. The reference values of the excess infiltration flow rs and the full storage flow rg are taken as their maxima over the 4 monitoring points, $rs_{\max}$ and $rg_{\max}$.
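The reference value of the reservoir capacity follows directly from the other two choices. Writing the reference values as $Q_{\mathrm{ref}}$ and $C_{\mathrm{ref}}$ (notation introduced here only for this check):

$$
C_{\mathrm{ref}} = Q_{\mathrm{ref}} \, T = 50\ \mathrm{m^3/s} \times 3\ \mathrm{h} \times 3600\ \mathrm{s/h} = 5.4 \times 10^{5}\ \mathrm{m^3} = 54 \times 10^{4}\ \mathrm{m^3}
$$

With this choice, a normalized capacity change of 1 over one sampling period corresponds to a normalized flow of 1 sustained for that period.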
Then we remove the points where the actual power generation P equals 0. After data normalization, we fit a bivariate quadratic equation relating the power generation P to the power generation flow Q and the reservoir capacity C for the basin, as shown in Equation (6).
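A fit of this kind can be sketched as an ordinary least-squares problem. The sketch below assumes the full six-term form P ≈ b0 + b1·Q + b2·C + b3·Q² + b4·Q·C + b5·C², which may differ from the form of Equation (6). The function names and the synthetic data are hypothetical. The normal equations are solved in pure Python to keep the example dependency-free; `numpy.linalg.lstsq` would be the usual choice in practice.

```python
from typing import List, Sequence


def design_row(q: float, c: float) -> List[float]:
    """Terms of the assumed full bivariate quadratic in (Q, C)."""
    return [1.0, q, c, q * q, q * c, c * c]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(q: Sequence[float], c: Sequence[float],
                  p: Sequence[float]) -> List[float]:
    """Least-squares coefficients from (X^T X) beta = X^T p, skipping P == 0."""
    samples = [(design_row(qi, ci), pi) for qi, ci, pi in zip(q, c, p) if pi != 0.0]
    n = 6
    xtx = [[sum(row[i] * row[j] for row, _ in samples) for j in range(n)]
           for i in range(n)]
    xtp = [sum(row[i] * pi for row, pi in samples) for i in range(n)]
    return solve(xtx, xtp)


# Synthetic, normalized data generated from made-up coefficients.
true_beta = [0.2, 3.0, 0.5, -1.0, 0.8, -0.3]
grid = [0.2, 0.4, 0.6, 0.8, 1.0]
q_data, c_data, p_data = [], [], []
for qi in grid:
    for ci in grid:
        q_data.append(qi)
        c_data.append(ci)
        p_data.append(sum(b * x for b, x in zip(true_beta, design_row(qi, ci))))
q_data.append(0.5); c_data.append(0.5); p_data.append(0.0)  # shutdown sample, dropped
beta = fit_quadratic(q_data, c_data, p_data)
print([round(b, 6) for b in beta])
```

Dropping the P = 0 samples before fitting matters here: the shutdown point at the end of the synthetic data would otherwise pull the fitted surface away from the operating points.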

Figure 1. Altitude terrain of the basin
At the intersections of the gray latitude and longitude grid lines in Figure 1, satellite remote sensing technology provides meteorological and underlying surface data acting on the runoff generation process, on a 27.8 km × 27.8 km grid with a data sampling period T of 3 hours. A tributary flowing through small hydropower plants a, b, c, d, e, and f merges with another tributary flowing through small hydropower plant g, and the combined flow then passes through small hydropower plants h, i, j, and k. The source of the inflow $I_s$ of the level-s small hydropower stations corresponds to the rainfall collection area $S_s$ above the dam site and consists of the hydrological natural flow $N_s$ plus a certain proportion of the outflow of the level s-1 small hydropower stations.

Figure 2. Runoff at 4 monitoring sites
Because the increments $\Delta S_a$, $\Delta S_b$, $\Delta S_g$ of the rainfall collection area above the dam site are larger and $\Delta S_c$, $\Delta S_d$, $\Delta S_e$, $\Delta S_f$, $\Delta S_h$, $\Delta S_i$, $\Delta S_j$, $\Delta S_k$ are smaller, the main sources of the inflow $I_s$ of each small hydropower plant in the basin are divided into 2 major categories.

The prediction model of generation flow in the prediction period of s-class small hydropower
In this paper, we propose a prediction model for the generation flow $Q_{s,\,t+1\sim t+8}$ of the level-s small hydropower plant during the prediction period, the Distributed Runoff TCN-LSTM Model (DR-TCN-LSTM), as shown in Figure 3.

Figure 3. The prediction model of generation flow in the prediction period of s-class small hydropower
The data period is from 19:00 on June 16, 2021, to 9:00 on April 8, 2022, with the first 80% as the training set and the last 20% as the validation and test sets. With an input step of n_in=16 and an output step of n_out=8 for data stacking and a row-by-row sliding window step of 1, a Python program is written according to the prediction model in Figure 3.
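The data stacking step can be illustrated with a short sketch. It is not the paper's program: the function names are hypothetical, a single univariate series stands in for the multivariate inputs, and splitting the stacked windows chronologically (rather than splitting the raw series first) is an assumption.

```python
from typing import List, Sequence, Tuple

Window = Tuple[List[float], List[float]]


def make_windows(series: Sequence[float], n_in: int = 16, n_out: int = 8,
                 stride: int = 1) -> List[Window]:
    """Stack (input, label) pairs: n_in past steps map to the next n_out steps."""
    pairs = []
    last_start = len(series) - n_in - n_out
    for start in range(0, last_start + 1, stride):
        x = list(series[start:start + n_in])
        y = list(series[start + n_in:start + n_in + n_out])
        pairs.append((x, y))
    return pairs


def chronological_split(pairs: List[Window],
                        train_ratio: float = 0.8) -> Tuple[List[Window], List[Window]]:
    """Keep time order: the earliest windows train, the latest validate and test."""
    cut = int(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]


series = [float(i) for i in range(100)]   # stand-in for a 3-hourly record
pairs = make_windows(series)
train, held_out = chronological_split(pairs)
print(len(pairs), len(train), len(held_out))
```

With n_in=16 and a 3-hour sampling period, each input window spans the previous 48 hours, and each label window spans the following 24 hours.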

Table 1. The selection of rainwater collection area and satellite remote sensing monitoring points above small hydropower dam sites