Evolutionary algorithm-based multiobjective reservoir operation policy optimisation under uncertainty

Reservoir operation optimisation is a decision support tool to assist reservoir operators with water release decisions to achieve management objectives, such as maximising water supply security, mitigating flood risk, and maximising hydroelectric power generation. The effectiveness of reservoir operation decisions is subject to uncertainty in system inputs, such as inflow; therefore, methods such as stochastic dynamic programming (SDP) have traditionally been used. However, these methods suffer from the three curses of dimensionality, modelling, and multiple objectives. Evolutionary algorithm (EA)-based simulation-optimisation frameworks such as the Evolutionary Multi-Objective Direct Policy Search (EMODPS) offer a new paradigm for multiobjective reservoir optimisation under uncertainty, directly addressing the shortcomings of SDP-based methods. They also enable the consideration of input uncertainty represented using ensemble forecasts, which have become more accessible recently. However, there is no universally agreed approach for incorporating uncertainty into EA-based multiobjective reservoir operation policy optimisation, and it is not clear which approach is more effective. Therefore, this study conducts a comparative analysis to demonstrate the advantages and limitations of different approaches to account for uncertainty in multiobjective reservoir operation policy optimisation via a real-world case study, and provides guidance on the selection of appropriate approaches. The results show that each approach has both advantages and limitations. A suitable approach needs to be carefully selected based on the needs of the study, e.g., whether a hard constraint is required or whether a well-established decision-making process exists. In addition, potential gaps for future research are identified.


Introduction
Reservoir operation optimisation is a complex problem due to the multiple and often conflicting objectives that need to be achieved (Changchit and Terrell 1993, Cheng et al 2017, McMahon and Petheram 2020, Yu et al 2021) and the uncertainty in system inputs, such as inflow, that obscures operation decisions and limits their effectiveness (Kuria and Vogel 2014, Schwanenberg et al 2015, Berghout et al 2017, Bozorg-Haddad et al 2022). Therefore, probabilistic optimisation approaches, such as stochastic dynamic programming (SDP), have been used to incorporate uncertainty in reservoir operation optimisation due to their ability to handle the noncontinuous solution space and exploit the sequential nature of reservoir operation decisions (Labadie 2004, Macian-Sorribes and Pulido-Velazquez 2020).
However, SDP has been referred to as having three curses (Giuliani et al 2021): (i) multiple objectives: an inability to explicitly account for multiobjective tradeoffs, so a weighted sum method is often used (Soleimani et al [...]); (ii) dimensionality: [...] system state significantly increases computational cost when a large system is considered (Sahu and McLaughlin 2018, Dobson et al 2019, Hooshyar et al 2020); and (iii) modelling: all variables need to be described in the simulation model, thus specific model and problem formulations are needed (Mortazavi et al 2012, Soleimani et al 2016, Sahu and McLaughlin 2018, Dobson et al 2019, Ortiz-Partida et al 2019, Hooshyar et al 2020, Celeste et al 2021), which restrict its real-world applications. Evolutionary algorithms (EAs) offer a new paradigm for multiobjective reservoir operation optimisation under uncertainty and have been applied to a range of management decisions, including social and environmental impact (Mortazavi et al 2012), hydropower generation objectives (Tsoukalas and Makropoulos 2015, Chen et al 2018), and water quality and irrigation objectives (Saadatpour et al 2020). EAs directly address the three curses of SDP-based methods (Maier et al 2019, Giuliani et al 2021), enabling modellers to explore the large search spaces of complex reservoir optimisation problems, which would otherwise be impossible without undesired assumptions. In addition, an EA-based simulation-optimisation framework such as the Evolutionary Multi-Objective Direct Policy Search (EMODPS) allows operation policies to be optimised directly based on operation objective functions and enables the consideration of input uncertainty represented using ensemble forecasts, which have become more accessible recently.
However, when EAs are used for multiobjective reservoir operation policy optimisation under uncertainty, a wide variety of approaches have been applied and there is no agreed approach on how uncertainty in system inputs should be incorporated.
This study aims to conduct a comparative analysis to demonstrate the advantages and limitations of different approaches to account for uncertainty in EA-based multiobjective reservoir operation policy optimisation via a real-world case study (sections 2, 3 and 4), and provide guidance on the selection of appropriate approaches (section 4). Based on the findings, future research is also recommended.

2. Reservoir operation optimisation integrating uncertainty: a brief review
2.1. Reservoir operation optimisation
Based on the decision variables that are optimised, reservoir operation optimisation can be divided into two categories: release sequence optimisation and operation policy (rules or strategies) optimisation. The aim of release sequence optimisation is to find a sequence of reservoir water releases over a pre-defined time period (e.g., one year), such that the objective function(s) during this time period are minimised or maximised (Wang et al 2012, Schwanenberg et al 2015, Ortiz-Partida et al 2019). The outcomes of release sequence optimisation are simple to understand and use. However, as they are deterministic in nature, the optimised release sequences are only valid for the data used during the optimisation process, and the performance of the optimised solutions can degrade significantly as the system state deviates from the data used in the original optimisation (Dobson et al 2019). In addition, as the duration of the operation period increases, the number of decision variables for release sequence optimisation can become so large that the problem is impractical to optimise (Dobson et al 2019). Consequently, release sequence optimisation is more commonly used to explore system responses to specific conditions and provide system understanding (Castelletti et al 2012).
Operation policy optimisation aims to derive an operation policy that will help reservoir operators and managers to determine release sequences (Oliveira and Loucks 1997, Macian-Sorribes and Pulido-Velazquez 2020). A significant advantage of operation policy optimisation is that operation policies can be applied to future operation periods. Operation policies can be derived using a two-step process (Young 1967, Karamouz and Houck 1987), where a deterministic release sequence is first obtained and then a parameterised function of the release sequence (e.g., as a function of input variables such as reservoir system state, inflow and time of year) is identified. However, this approach relies on release sequence optimisation in the first step and therefore suffers from limitations similar to those of release sequence optimisation (Giuliani et al 2021).
Alternatively, operation policies can be obtained using direct policy search (Schmidhuber 2001), where the parameters of a pre-defined policy function are directly optimised during the optimisation process based on operation objectives, such as maximisation of hydro-electric energy generated or minimisation of flood risk. Various functional forms can be used as the policy function, including simple rule curves (Li et al 2020), complicated hedging rules (Xu et al 2019), mathematical equations such as polynomial functions (Tsoukalas and Makropoulos 2015, Saadatpour et al 2020), or data-driven models such as artificial neural networks (ANNs) (Culley et al 2016, Zatarain Salazar et al 2017). Mathematical equations are commonly used as they are more flexible than simple rule curves and their behaviour is well understood (Schmidhuber 2001). However, there is no direct evidence to indicate that a certain mathematical equation should be used for a particular system and therefore, they may lead to poor performance (Labadie 2004, Macian-Sorribes and Pulido-Velazquez 2020). Data-driven models such as ANNs offer great flexibility in simulating release decisions based on input variables. For this reason, ANNs have been successfully applied to many reservoir operation optimisation problems.

Incorporating uncertainty in reservoir operation optimisation using EAs
There are various methods to incorporate input uncertainty in reservoir operation optimisation using EAs, typically through the evaluation of constraints. First, inflow uncertainty can be incorporated as a probability constraint so that only solutions within a pre-defined confidence bound are considered. Uncertainty can also be directly accommodated within the objective function(s), for example, by averaging objective function values across the range of inflows used (Saadatpour et al 2020). This is an intuitive approach and has been used in other applications, for example, post-processing ensemble climate forecasts (Zhao et al 2022). Yet another approach is to calculate the objective function as the total of a criterion over the range of inflows used, for example, total hydro-electric power generated (Ghimire and Reddy 2014), total demand deficit (Saadatpour et al 2020) or total environmental stress (Mortazavi et al 2012). Alternatively, objective function values can be estimated from the worst realisation of all inflow cases (Zatarain Salazar et al 2017, Chen et al 2018). Furthermore, an additional reliability objective evaluating the performance of solutions across the whole range of system input values can be used (Mortazavi-Naeini et al 2015). Given the wide variety of approaches used, each applied to a separate case study, it is difficult to assess their relative benefits.
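The three ways of folding ensemble inflows into an objective function described above (average, total and worst realisation) can be illustrated with a minimal Python sketch. The function name `aggregate_objective` and the energy values are hypothetical and are used for illustration only; the sketch assumes an objective that is to be maximised, so the worst realisation is the minimum.

```python
from statistics import mean


def aggregate_objective(values, method):
    """Collapse per-member objective values into a single number.

    `values` holds one value per inflow ensemble member for an
    objective that is to be maximised (assumption of this sketch).
    """
    if method == "average":
        return mean(values)
    if method == "total":
        return sum(values)
    if method == "worst":
        # For a maximised objective, the worst realisation is the minimum.
        return min(values)
    raise ValueError(f"unknown aggregation method: {method}")


# Hypothetical annual energy values (arbitrary units), one per inflow member.
energy_by_member = [310.0, 295.0, 342.0, 301.0]
summary = {
    method: aggregate_objective(energy_by_member, method)
    for method in ("average", "total", "worst")
}
```

For an objective that is to be minimised (e.g., total demand deficit), the worst realisation would be the maximum instead.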

Direct policy search-based optimisation framework
In this study, the optimisation framework used (figure 1) is based on the well-known Evolutionary Multi-Objective Direct Policy Search (EMODPS). Within this framework, an ANN model is used as the functional form for the operation policy, where the ANN output is the water released in a particular time step (for example, a month) and is determined based on a number of inputs, such as the time of the year, the reservoir storage at the beginning of the time step and the inflow during the time step. The water release is then passed into a reservoir model, where the reservoir state in each time step is updated and the optimisation objective functions and constraints are estimated. Within this framework, the decision variables are the parameters of the ANN model and therefore, policy optimisation is carried out together with reservoir system simulation in a single-step process. A three-layer feed-forward multilayer perceptron network is used, as it is the most common ANN used for environmental modelling due to its simple structure and lower data requirements, and it is able to simulate complex environmental systems.
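The coupling between the ANN policy and the reservoir model can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the network size (4 inputs, 3 hidden nodes), the input scaling, the sigmoid output scaled to a hypothetical maximum release `r_max`, and the simple mass balance with spill are all choices made for this sketch. The flat list `params` plays the role of the decision variables that an EA would evolve.

```python
import math

N_INPUTS, N_HIDDEN = 4, 3
# Hidden weights and biases plus output weights and bias.
N_PARAMS = N_HIDDEN * (N_INPUTS + 1) + (N_HIDDEN + 1)


def policy_release(params, month, storage, inflow, s_max, r_max):
    """Three-layer feed-forward network mapping inputs to a release."""
    x = [
        math.sin(2 * math.pi * month / 12),  # time of year (cyclic encoding)
        math.cos(2 * math.pi * month / 12),
        storage / s_max,                     # scaled storage
        inflow / r_max,                      # scaled inflow
    ]
    idx, hidden = 0, []
    for _ in range(N_HIDDEN):
        weights = params[idx:idx + N_INPUTS]
        bias = params[idx + N_INPUTS]
        hidden.append(math.tanh(bias + sum(w * xi for w, xi in zip(weights, x))))
        idx += N_INPUTS + 1
    weights = params[idx:idx + N_HIDDEN]
    z = params[idx + N_HIDDEN] + sum(w * h for w, h in zip(weights, hidden))
    # Sigmoid written with tanh (numerically stable), scaled to [0, r_max].
    return r_max * 0.5 * (1.0 + math.tanh(0.5 * z))


def simulate(params, inflows, s0, s_max, r_max):
    """Run the policy through a simple monthly mass balance with spill."""
    storage, releases, storages, spills = s0, [], [], []
    for t, q in enumerate(inflows):
        r = policy_release(params, t % 12, storage, q, s_max, r_max)
        r = min(r, storage + q)              # cannot release more than is available
        storage = storage + q - r
        spill = max(0.0, storage - s_max)    # excess above capacity is spilled
        storage -= spill
        releases.append(r)
        storages.append(storage)
        spills.append(spill)
    return releases, storages, spills


# Hypothetical run: all-zero parameters give a constant release of r_max / 2.
releases, storages, spills = simulate(
    [0.0] * N_PARAMS, [3000.0] * 12, s0=15000.0, s_max=17450.0, r_max=6000.0
)
```

In a direct policy search, an EA would repeatedly call `simulate` with candidate parameter vectors and compute the objective functions and constraints from the returned series.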

Different approaches to incorporate uncertainty
Reservoir inflow is a major input for the ANN-based policy. However, it is not known at the beginning of the time step and therefore is the main source of uncertainty, which can be represented using ensemble forecasts. To investigate the impact of different approaches for incorporating input uncertainty into multiobjective reservoir operation optimisation using EAs, six analyses incorporating existing approaches from the literature (see section 2.2) have been conducted. The details of these analyses are summarised in table 1.
Analyses 1 to 3 have objective functions estimated from the worst case across all ensemble inflow members. For Analysis 1, a probability constraint is used, where the probability of constraint violation needs to be smaller than a pre-defined value (e.g., 1%). For Analysis 2, in contrast, the constraints are evaluated for the worst-case realisation across the ensemble inflows. For Analysis 3, the constraints are converted into an additional reliability objective, where the probability of constraint violation across all inflow ensemble members is minimised. Analyses 4 to 6 follow the same approaches as Analyses 1 to 3 for handling the constraints, but the original objective functions are evaluated based on the average value across all ensemble inflow members.
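The three constraint handling approaches can be contrasted with a small Python sketch. The function names and the example threshold are hypothetical; the sketch assumes that, for each candidate policy, a simulation has already flagged whether any constraint is violated for each inflow ensemble member.

```python
def violation_probability(violated):
    """Fraction of ensemble members in which a constraint is violated."""
    return sum(violated) / len(violated)


def handle_constraints(violated, approach, max_probability=0.01):
    """Return (feasible, extra_objective) for one candidate policy.

    violated: list of booleans, one per inflow ensemble member.
    """
    p = violation_probability(violated)
    if approach == "probability":   # as in Analyses 1 and 4
        return p < max_probability, None
    if approach == "worst_case":    # as in Analyses 2 and 5
        return not any(violated), None
    if approach == "reliability":   # as in Analyses 3 and 6
        # Always feasible; p is minimised as an additional objective.
        return True, p
    raise ValueError(f"unknown approach: {approach}")


# Hypothetical policy that violates a constraint in 1 of 50 members.
flags = [False] * 49 + [True]
prob_result = handle_constraints(flags, "probability", max_probability=0.05)
worst_result = handle_constraints(flags, "worst_case")
rel_result = handle_constraints(flags, "reliability")
```

In this example the same policy is accepted under a 5% probability constraint, rejected under worst-case handling, and retained with a violation probability of 0.02 when reliability is treated as an objective.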

Case study and data
Danjiangkou reservoir is located on the Han River in the Hubei province of central China (see figure 2). It is the second largest artificial freshwater lake in Asia and also home to one of the largest hydro-electric power stations in China. The reservoir has a catchment area of 95,217 km². Prior to 2012, before the extension to Danjiangkou Dam was completed, the reservoir had a long-term average surface area of approximately 700 km², an average annual inflow of approximately 39,400 million m³, and a maximum storage of 17,450 million m³. In total, 31 years of monthly inflow data are available from 1979 to 2009 for Danjiangkou reservoir.
Inflow uncertainty can be represented using a range of methods.

Optimisation problem formulation
The formulation of the optimisation problem adopted in this study follows the formulation reported in Zhao and Zhao (2014), where two operation objectives have been considered. The first objective, f_1, is to maximise the total hydro-electric energy generated during the typical operation period (i.e., one year): where p_t is the hydro-electric power generated in time step t, which is a month; D is the simulation period, which is a year; and T is the total number of time steps in the simulation period. p_t can be estimated based on: where h is the energy conversion coefficient, the value of which is 0.0082; s_t and r_t are respectively the reservoir storage and release at time t; and SSR(s_t) and SDR(r_t) are respectively the stage storage relationship and the stage discharge relationship. These functions are represented using the following equations for Danjiangkou reservoir: and p_max is the maximum turbine capacity, which is 940 MW.
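As a hedged sketch only, the symbol definitions above are consistent with expressions of the following general form. This form is an assumption: it takes the hydraulic head to be the difference between the stage given by SSR(s_t) and the stage given by SDR(r_t), caps the power at p_max, and omits any unit conversion factors. The fitted SSR and SDR functions for Danjiangkou reservoir are not specified here.

```latex
% Assumed general form, built only from the symbols defined in the text.
f_1 = \max \sum_{t=1}^{T} p_t \, \frac{D}{T},
\qquad
p_t = \min\left\{ h \, r_t \left[ \mathrm{SSR}(s_t) - \mathrm{SDR}(r_t) \right],\; p_{\max} \right\}
```

Under this form, energy is the power in each time step multiplied by the step length D/T, summed over the T steps of the simulation period.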
The second objective, f_2, is to maximise the firm energy generated. The firm energy is defined as the minimum guaranteed energy that can be generated during a given time step (e.g., one month) that can be sold at premium prices (Tsoukalas and Makropoulos 2015). Firm energy is proportional to the minimum power (referred to as firm power in this study) generated during the given time step:
The optimisation problem has several constraints. Apart from the common water balance constraints, the minimum and maximum storage of the reservoir are 12,100 million m³ and 17,450 million m³, respectively, which are determined based on the desired water supply security and the capacity of the reservoir. The minimum release is 900 million m³/month (Zhao and Zhao 2014). During the simulation process, spill occurs when the maximum storage of the reservoir is reached. The minimum storage and release requirements are handled either as constraints or as an additional objective to minimise constraint violation during the optimisation process, depending on the analysis conducted (see table 1) [...] where P(x) is the probability of x. For Analyses 2 and 5 (with worst-case constraint handling), the minimum storage and release constraints are estimated based on the following equations:
For Analyses 3 and 6 (with the probability of constraint violation considered as an additional reliability optimisation objective), the third objective function, f_3, is to minimise the probability of overall constraint violation, which is defined as:
Figure 4 shows the pairwise objective function values of the Pareto solutions obtained in each analysis (worst case in the left panel; ensemble average in the right panel) using the 50 ensemble members for evaluation.
For both objective evaluation approaches, the two constraint handling approaches generally lead to non-overlapping domains, with the probability constraint handling approach (blue shade) always yielding solutions with higher total energy and firm power than the worst-case constraint handling approach (red shade). In contrast, when a reliability objective is used to account for inflow uncertainty (i.e., Analyses 3 and 6), the domain of the obtained solutions completely overlaps those of the two constraint handling approaches. Importantly, the reliability objective leads to a broader set of solutions along the Pareto front. Comparing the left and right panels, there is some overlap between the respective analyses, yet the ensemble-averaged objective functions (right panel) yield consistently narrower Pareto fronts (due to the central tendency of averaging), with objective values that are generally higher than those from the ensemble worst case (left panel).
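For reference, the notion of a Pareto (non-dominated) set used in this comparison can be sketched with a simple filter. The function names and the candidate values are hypothetical, and the sketch assumes that all objectives are to be maximised, as is the case for total energy and firm power; it is not the selection mechanism of any particular EA.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in one.

    All objectives are assumed to be maximised.
    """
    no_worse = all(x >= y for x, y in zip(a, b))
    better = any(x > y for x, y in zip(a, b))
    return no_worse and better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (total energy, firm power) pairs for five candidate policies.
candidates = [
    (100.0, 20.0),
    (120.0, 15.0),
    (90.0, 18.0),
    (110.0, 15.0),
    (80.0, 25.0),
]
front = pareto_front(candidates)
```

A minimised objective, such as the probability of constraint violation, can be handled with the same filter by negating its values before comparison.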
In addition, there are evident trade-offs between the firm power objective and storage constraint violation (represented by the reliability objective), and between the total energy objective and water supply security. Including the third reliability objective accommodates this trade-off and yields a wider Pareto front. The purple-bordered cells in figure 4 show a wider Pareto front for strict non-violation of constraints, while the green-bordered cells yield a considerably wider front for cases with mild constraint violation.
Figure 5 shows, for each analysis, the respective marginal distributions of the additional system variables: monthly release, annual release, minimum storage and annual spill. There is similar performance for all four variables between pairings of ensemble-worst and ensemble-averaged objectives, i.e., comparing Analyses 1 and 4, Analyses 2 and 5, and Analyses 3 and 6. This indicates that the method for evaluating the energy and power objective functions, whether ensemble-worst or ensemble-averaged, does not significantly affect the performance of the optimal operating policies obtained. Consistent with the results in figure 4, the worst-case analyses (Analyses 1, 2 and 3) yield slightly more spread-out distributions; probability constraint handling (Analyses 1 and 4) yields better values (higher releases, lower spill) than worst-case constraint handling (Analyses 2 and 5); and the reliability objective (Analyses 3 and 6) yields a noticeably larger spread, with numerous solutions that appear to be 'good' in a marginal sense, but with many solutions that have reduced water supply security from violating the storage constraint (minimum 12,000 million m³).
To better understand the differing performance of approaches incorporating inflow uncertainty in multiobjective optimisation of reservoir operation, the performance of the solutions relative to each member of the ensemble is investigated. The results in figure 6 are presented for Analysis 2 (see supplementary material for other analyses, which yield similar observations). Figure 6(a) summarises the monthly inflow for each evaluation ensemble member, where three low-inflow members (2, 15 and 48, shown in red) and five high-inflow members (22, 26, 29, 42 and 47, shown in green) have been highlighted for further comparison. The remaining panels of figure 6 show distributions of the reservoir variables, including annual spill, total energy, and firm power, across all optimised solutions. The figure shows that spill and total energy are highly responsive to the mean inflow, whereas the firm power is less sensitive. In general, the worst-case approaches (as is the case for Analysis 2) are dominated by ensemble members 2 and 15, which have significantly lower total energy generated across all optimised solutions. The low/high inflow members generally have low/high spill, where differences between the performance of the members depend on the seasonal timing of inflows. For example, member 26 has higher inflow than member 22, yet it produces lower total energy across all optimised policies due to a higher spill, which mostly results from the extremely high inflow in one month, as shown in figure 6(a). Similarly, member 48 produces significantly more total energy than members 2 or 15 despite having a similarly low average inflow, due to having two months of relatively higher inflows.

Discussion
The results show that the way in which the original energy and power objective functions are evaluated across the inflow ensemble does not have a significant impact on the range of the overall performance of the optimal solutions obtained. The worst-case objective function handling approach leads to a larger spread across the Pareto front and thus more diverse solutions, whereas the probability constraint handling approach leads to Pareto solutions with improved energy and power objective function values, at the cost of slight storage constraint violation. Although expected, this finding confirms the value of relaxing constraint conditions, especially if they do not have to be strictly satisfied.
Similarly, the worst-case constraint handling approach leads to a wider spread across the Pareto front without any constraint violation. Including the minimisation of the probability of overall constraint violation as an additional reliability objective function, on the other hand, introduces more Pareto solutions. The performance range of these solutions covers the solutions obtained from the analyses with the constraint handling approaches, providing tradeoffs between the original energy and power objectives and the water supply security with only a slight compromise of the reliability objective. However, this approach increases the total number of solutions, and thus the complexity of the decision-making process. Without a well-developed decision-making process to handle the increased complexity, this approach will not necessarily be as effective in reaching an acceptable solution.
Further analysis of the performance of optimised solutions across the different evaluation ensemble members reveals several case study specific findings. First, the total energy objective function is dominated by the total inflow of each year, and there is high variability in the total energy across the set of Pareto policies for each ensemble inflow member. In contrast, the firm power objective function is less sensitive to the total inflow and therefore can be optimised even in a dry year. However, due to the impact of dry years on the total energy objective function, the optimised policies can be dominated by a few low-inflow ensemble members, leading to higher spills and therefore wastage during wet years, especially if the worst-case approach is used.
Due to the consistency of system variable values obtained from each ensemble member across all the optimised policies, it would be possible to achieve similar performance with a significantly reduced number of ensemble members, as long as the key ensemble members are included. This provides an opportunity to improve optimisation efficiency, for example with a reduced number of ensemble members selected using sampling methods. Finally, the findings in this study are based on the assumption that the statistical properties of the inflow ensemble are unchanging. More research is required to understand the impact of future changes such as climate change on the optimisation of reservoir operation policies, and which approach(es) may be more suited to adapt to the changes.

Conclusions
This study provides a comparison of the different approaches that are commonly used to incorporate input uncertainty into multiobjective reservoir operation policy optimisation using evolutionary algorithms (EAs) via a real-world case study. The results show that the worst-case approach for objective function evaluation leads to more diverse solutions than the averaged-case approach. However, the worst-case approach can be dominated by several extremely low-inflow ensemble members and lead to solutions that may not fully utilise all the water available during wet years. The probability constraint handling approach generally leads to solutions with improved performance, albeit with slight constraint violation, and is therefore suitable for situations where constraint levels are negotiable. When input uncertainty is accounted for via an additional reliability objective in the optimisation process, more Pareto solutions are found. Many of these solutions have improved values for the original management objectives with little constraint violation. However, the increased number of potential solutions also increases the complexity of the final decision-making process.
This study affirms the role of EAs in multiobjective reservoir optimisation for their ability to include uncertainty and develops guidance on method selection for objective setting and constraint handling. An important assumption used in this study is that the uncertainty in inflow due to natural variability is unchanging, so that the ensemble inflows used for operation policy optimisation have similar statistical properties to those used to evaluate the optimised policies (i.e., in application). This may not be the case in the future considering climate change. It will be important to develop methods that are robust to potential changes in input uncertainty in future research. In addition, the results obtained show similar performance across different ensemble sizes, indicating that sampling methods can potentially be used to improve optimisation efficiency without compromising performance.