Development of a Sparse Polynomial Chaos Expansions Method for Parameter Uncertainty Analysis

Incorporating uncertainty assessment into hydrological simulation is vital for providing information to conserve and restore the ecological environment in arid and semiarid regions. In this study, a sparse polynomial chaos expansions method was developed to quantify the effects of hydrological model parameter uncertainties on model performance in the Kaidu River Basin, China. A four-dimensional, second-order polynomial chaos expansions model was built, and the effects of four parameters were quantified based on the coefficients of the polynomial chaos expansions model. Results indicated that precipitation in summer has a more significant influence on model output than that in other seasons. High Sobol sensitivity indices (0.22 in spring, 0.17 in summer, 0.21 in autumn and 0.29 in winter) for the interaction of precipitation and maximum capacity for fast store demonstrate that they are the major factors affecting runoff generation. These results can help reveal the flow processes and provide valuable information for water resources management.


Introduction
With the widespread application of digital elevation models, geographical information systems and remote sensing, hydrological models have been widely used in catchment management to provide information on the interaction of water, energy and vegetation processes distributed over space and time in a way that cannot be achieved through field experiments and direct observation [1] [2]. Nevertheless, a variety of uncertainties are involved in hydrological processes. Such uncertainties influence model performance because of the random character of precipitation, temperature, infiltration and so on. Model representations of real-world hydrological systems are complicated by a variety of factors, including inadequate conceptualizations of physical processes, errors related to spatial and temporal scales and derivation of model-parameter values directly from basin traits [3]. Parameters obtained from calibration are also affected by several factors, such as correlations among parameters, sensitivity or insensitivity of parameters and statistical features of model residuals. Therefore, evaluating the effect of parameter uncertainties on model output would provide more information for hydrological forecasting and related catchment management.
A number of efforts have been made to develop more effective methods for reflecting parameter uncertainty in hydrological modelling and its effects on model performance [4]. Among them, the most widely used parameter uncertainty assessment methods are based on Bayesian approaches. These methods can be roughly classified into formal Bayesian methods, which use an explicit statistical error model and a Markov Chain Monte Carlo sampling procedure [5][6], and informal Bayesian approaches, which are based on the Generalized Likelihood Uncertainty Estimation method [7]. Although such stochastic analysis methods are capable of handling uncertainties with known probability distributions, they cannot ensure sufficient precision of the statistics inferred from the retained solutions unless the sampling of the parameter space is dense enough [8].
Therefore, a sparse polynomial chaos expansions method, integrating the semi-distributed land use based runoff processes (SLURP) model, polynomial chaos expansions, sparse grid collection and Sobol sensitivity indices, is developed to quantify the effects of parameter uncertainties on model performance. The SLURP model is capable of simulating the interaction among overland, river and groundwater flows. A polynomial chaos expansions model is built as a surrogate model for SLURP. Sparse grid collection is used to determine the sampling points of the different parameters. Finally, the Sobol sensitivity indices are calculated from the coefficients of the polynomial chaos expansions model to quantify the contributions of the individual parameters and of their interactions to the total variance.

Polynomial chaos expansions
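The generalized polynomial chaos expansion model can be written in the form:

$$ y = a_0 \Gamma_0 + \sum_{i_1=1}^{n} a_{i_1} \Gamma_1(\xi_{i_1}) + \sum_{i_1=1}^{n} \sum_{i_2=1}^{i_1} a_{i_1 i_2} \Gamma_2(\xi_{i_1}, \xi_{i_2}) + \cdots $$

where y is the output, a_{i_1 \ldots i_p} are the expansion coefficients and \Gamma_p is the Hermite polynomial chaos of order p:

$$ \Gamma_p(\xi_{i_1}, \ldots, \xi_{i_p}) = (-1)^p \, e^{\frac{1}{2} \xi^T \xi} \, \frac{\partial^p}{\partial \xi_{i_1} \cdots \partial \xi_{i_p}} \, e^{-\frac{1}{2} \xi^T \xi} $$

where \xi is the vector of n random variables. Then, the second and third order Hermite polynomials with two random variables are

$$ \{1,\ \xi_1,\ \xi_2,\ \xi_1^2 - 1,\ \xi_1 \xi_2,\ \xi_2^2 - 1\} $$

and

$$ \{1,\ \xi_1,\ \xi_2,\ \xi_1^2 - 1,\ \xi_1 \xi_2,\ \xi_2^2 - 1,\ \xi_1^3 - 3\xi_1,\ \xi_1^2 \xi_2 - \xi_2,\ \xi_1 \xi_2^2 - \xi_1,\ \xi_2^3 - 3\xi_2\} $$

The n-dimensional polynomial chaos series is truncated to a finite number of terms for practical computation.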
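The truncation described above can be illustrated with a minimal Python sketch. It assumes a total-degree truncation (all terms with combined order at most p) and probabilists' Hermite polynomials; the function names `hermite`, `multi_indices` and `basis_values` are hypothetical helpers introduced here, not part of the original study.

```python
from itertools import product
from math import comb


def hermite(k, x):
    """Probabilists' Hermite polynomial He_k(x) via the three-term recurrence."""
    if k == 0:
        return 1.0
    h_prev, h = 1.0, x
    for m in range(1, k):
        # He_{m+1}(x) = x * He_m(x) - m * He_{m-1}(x)
        h_prev, h = h, x * h - m * h_prev
    return h


def multi_indices(n, p):
    """Exponent tuples of length n with total degree <= p (assumed truncation rule)."""
    idx = [a for a in product(range(p + 1), repeat=n) if sum(a) <= p]
    # Order by total degree, then with higher powers of the first variables first.
    return sorted(idx, key=lambda a: (sum(a), [-v for v in a]))


def basis_values(xi, p):
    """Evaluate every basis polynomial of the truncated expansion at the point xi."""
    values = []
    for alpha in multi_indices(len(xi), p):
        val = 1.0
        for k, x in zip(alpha, xi):
            val *= hermite(k, x)
        values.append(val)
    return values


# Number of terms equals (n + p)! / (n! p!).
print(len(multi_indices(4, 2)), comb(4 + 2, 2))  # 15 15
# Two variables, second order: {1, x1, x2, x1^2 - 1, x1*x2, x2^2 - 1} at (2, 3).
print(basis_values([2.0, 3.0], 2))  # [1.0, 2.0, 3.0, 3.0, 6.0, 8.0]
```

Under this assumption, a four-dimensional second-order expansion has 15 unknown coefficients, which sets the minimum number of model runs needed to fit the surrogate.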

Sparse grid collection method
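The quadrature method takes a tensor product of the univariate rules, so the number of points grows exponentially with the dimension n. To alleviate this "curse of dimensionality", the sparse grid quadrature method was proposed based on tensor products of hierarchical difference sets. The Smolyak interpolation operator can be given as follows [10]:

$$ A(N + L, N) = \sum_{L + 1 \le |\mathbf{i}| \le L + N} (-1)^{L + N - |\mathbf{i}|} \binom{N - 1}{L + N - |\mathbf{i}|} \left( U^{i_1} \otimes \cdots \otimes U^{i_N} \right) $$

where U^{i_k} is the one-dimensional rule of level i_k and |\mathbf{i}| = i_1 + \cdots + i_N. Based on the collection points for the one-dimensional integral, the collection points of the multi-dimensional integral can be obtained as follows:

$$ H(N + L, N) = \bigcup_{L + 1 \le |\mathbf{i}| \le L + N} \left( \Theta^{i_1} \times \cdots \times \Theta^{i_N} \right) $$

where \Theta^{i_k} is the set of collection points of U^{i_k}. In detail, the collection points for N = 3 and L = 2 are (0, 0, 0), ( 1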
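The construction of a sparse grid from one-dimensional point sets can be sketched as follows. The one-dimensional rule used in the study is not recoverable from the text, so this sketch assumes nested Clenshaw-Curtis nodes on [-1, 1] and the common convention that the index sum is bounded by N + L; the resulting points therefore need not coincide with those of the study. `nodes_1d` and `smolyak_points` are hypothetical names.

```python
from itertools import product
from math import cos, pi


def nodes_1d(level):
    """Assumed nested Clenshaw-Curtis nodes: 1 node at level 1, 2**(level-1)+1 above."""
    if level == 1:
        return [0.0]
    m = 2 ** (level - 1) + 1
    # Rounding makes shared nodes of different levels compare equal; + 0.0 removes -0.0.
    return [round(-cos(pi * j / (m - 1)), 12) + 0.0 for j in range(m)]


def smolyak_points(n, L):
    """Union of the tensor grids whose level indices satisfy L + 1 <= |i| <= L + n."""
    points = set()
    for idx in product(range(1, L + 2), repeat=n):
        if L + 1 <= sum(idx) <= L + n:
            points.update(product(*(nodes_1d(i) for i in idx)))
    return sorted(points)


grid = smolyak_points(3, 2)
print(len(grid))                      # 25 sparse-grid points
print(len(nodes_1d(3)) ** 3)          # 125 points for the full tensor grid
print((0.0, 0.0, 0.0) in grid)        # True
```

With these assumptions, three dimensions at level 2 need 25 model runs instead of the 125 required by the full tensor grid, and four dimensions need 41 instead of 625.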

Sobol sensitivity indices
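Sobol sensitivity indices, which describe the contribution of a set of variables u to the total variance, can be formulated as

$$ S_u = \frac{v_u}{v_T} $$

where v_u is the variance attributable to the variables u and v_T is the total variance. To calculate the sensitivity indices from the polynomial chaos coefficients, the truncated polynomial chaos expansion model is rewritten as follows:

$$ y = a_0 + \sum_{i=1}^{n} \sum_{\alpha \in I_i} a_\alpha \Psi_\alpha(\xi_i) + \sum_{1 \le i < j \le n} \sum_{\alpha \in I_{ij}} a_\alpha \Psi_\alpha(\xi_i, \xi_j) + \cdots $$

where I_i is the set of terms that depend only on \xi_i and I_{ij} is the set of terms that depend only on \xi_i and \xi_j. Based on the orthogonality of the basis polynomials of the different random variables:

$$ E[\Psi_\alpha \Psi_\beta] = 0, \quad \alpha \ne \beta $$

Thus the total variance and the variance of different variables can be determined and the Sobol sensitivity indices can be calculated as follows:

$$ v_T = \sum_{\alpha \ne 0} a_\alpha^2 E[\Psi_\alpha^2], \qquad S_i = \frac{\sum_{\alpha \in I_i} a_\alpha^2 E[\Psi_\alpha^2]}{v_T}, \qquad S_{ij} = \frac{\sum_{\alpha \in I_{ij}} a_\alpha^2 E[\Psi_\alpha^2]}{v_T} $$

where S_i is the main effect of variable i and S_{ij} is the interaction effect of variables i and j.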
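The calculation of Sobol indices from expansion coefficients can be sketched in a few lines of Python. The sketch assumes an unnormalized Hermite basis with standard normal inputs, for which the expected square of a basis term is the product of the factorials of its exponents; the coefficient values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import factorial


def sobol_from_pce(coeffs):
    """Map each set of active variables to its share of the total variance.

    coeffs: dict from exponent tuple alpha to the coefficient a_alpha.
    """
    def norm_sq(alpha):
        # E[He_alpha^2] = prod(alpha_k!) for probabilists' Hermite polynomials.
        out = 1
        for k in alpha:
            out *= factorial(k)
        return out

    partial = {}
    for alpha, a in coeffs.items():
        active = tuple(i for i, k in enumerate(alpha) if k > 0)
        if not active:
            continue  # the constant term carries the mean, not the variance
        partial[active] = partial.get(active, 0.0) + a * a * norm_sq(alpha)

    total = sum(partial.values())
    return {u: v / total for u, v in partial.items()}


# Hypothetical coefficients of a two-variable, second-order expansion.
coeffs = {(0, 0): 10.0, (1, 0): 2.0, (0, 1): 1.0, (2, 0): 1.0, (1, 1): 1.0}
print(sobol_from_pce(coeffs))
# {(0,): 0.75, (1,): 0.125, (0, 1): 0.125}
```

Here the key `(0,)` is the main effect of the first variable and `(0, 1)` is the interaction of the first and second variables; the indices sum to one because every non-constant term belongs to exactly one set of active variables.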

Case study
The Kaidu River Basin is located in Northwest China, with an area of 18,827 km² and an average elevation of 3100 m [11]. It plays an important role in protecting the ecological environment and green corridor of the lower reaches of the Tarim River [12]. The low rainfall and high temperature in this area bring notable drought periods, and the regional water resources system has degraded because of increasingly frequent human activities. Because the region is sparsely populated and lacks regular monitoring, precise data for the Kaidu River Basin are hard to obtain. Moreover, uncertainties associated with the temporal and spatial variations in hydrological processes (e.g. precipitation, topography and evaporation) may introduce errors into hydrological simulation and influence model performance. Thus, it is necessary to develop an effective uncertainty quantification method for hydrological simulation to provide detailed information for water resources management and environmental protection. The meteorological data and daily streamflow records, from 1996 to 2001, are obtained from the Bayanbulak meteorological station and the Dashankou hydrological station, respectively. The meteorological inputs for each aggregated simulation area (ASA) in the SLURP model are derived using a weighted Thiessen polygon method, with elevation adjustment rates of 0.75 °C per 100 m for temperature and 1% per 100 m for precipitation. The river network and watershed boundaries are obtained from the topographic map, with a resolution of 100 × 100 m, using the topographic parameterization package.
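The elevation adjustment of the station data can be illustrated with a short sketch. It assumes that temperature decreases linearly with elevation and that precipitation increases by a fixed percentage of the station value per 100 m; the text gives only the two rates, so the functional forms, the function names and the elevations in the example are hypothetical.

```python
def adjust_temperature(t_station, z_station, z_target, rate=0.75):
    """Shift temperature (deg C) by `rate` deg C per 100 m of elevation gain (assumed cooling)."""
    return t_station - rate * (z_target - z_station) / 100.0


def adjust_precipitation(p_station, z_station, z_target, rate_pct=1.0):
    """Scale precipitation by `rate_pct` percent per 100 m of elevation gain (assumed increase)."""
    return p_station * (1.0 + rate_pct / 100.0 * (z_target - z_station) / 100.0)


# Hypothetical example: station at 2500 m, simulation area at a mean elevation of 3100 m.
print(adjust_temperature(10.0, 2500.0, 3100.0))               # 5.5
print(round(adjust_precipitation(20.0, 2500.0, 3100.0), 2))   # 21.2
```

A 600 m elevation difference therefore lowers the temperature by 4.5 °C and raises precipitation by 6% under these assumptions.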

Results and Discussion
The performance of the SLURP model was evaluated using the Nash-Sutcliffe efficiency (NSE) and the determination coefficient (R²). NSE values for calibration (2001-2005) and verification (2006-2010) are 0.693 and 0.673, respectively. R² values for calibration and verification are 0.850 and 0.826, respectively. The results not only indicate a good performance of SLURP for hydrological simulation in the Kaidu River Basin, but also lay a good foundation for further revealing the factors affecting hydrological processes from rainfall to streamflow in the study area. Then, a four-dimensional, second-order polynomial chaos expansions model based on sparse grid collection was formulated, and the Sobol sensitivity indices were calculated for four sensitive parameters of SLURP, i.e., precipitation factor (P1), maximum capacity for fast store (P2), retention constant for fast store (P3) and retention constant for slow store (P4) (Figure 2). In summer, the index of P1 is 0.535, which is higher than that of the other parameters. This indicates that precipitation in summer is the major factor affecting runoff generation. In contrast, the index of P1 in winter is only 0.065, which is lower than that of the other parameters. This can be attributed to the low winter precipitation and to freezing temperatures, which form a thick layer of ice and snow on the land. The indices of P4 are 0.04 in spring, 0.01 in summer, 0.04 in autumn and 0.06 in winter, which are the lowest among the four parameters. This demonstrates that the effect of the retention constant for slow store, which is associated with the residence time of water in the saturated zone, is not significant. The indices of P2 are 0.19, 0.22 and 0.24 in spring, autumn and winter, respectively, which are higher than that in summer (0.1). This indicates that water holding in soil has a greater effect in the dry season than in the rainy season.
The indices of P2×P1 are 0.22 in spring, 0.17 in summer, 0.21 in autumn and 0.29 in winter, which are higher than the values of the other interactions. This is because infiltration excess is the main runoff mechanism, so precipitation and maximum capacity for fast store are the major factors affecting runoff generation.
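The two goodness-of-fit measures reported at the start of this section can be computed as in the following sketch. The Nash-Sutcliffe efficiency follows its standard definition; the determination coefficient is assumed here to be the squared Pearson correlation between observed and simulated flows, since the text does not define it. The flow values are hypothetical.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - sum of squared errors / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def r_squared(obs, sim):
    """Squared Pearson correlation between observations and simulations (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov * cov / (var_obs * var_sim)


# Hypothetical flows: the simulation tracks the observations with a constant bias.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.5, 2.5, 3.5, 4.5]
print(round(nse(obs, sim), 3))        # 0.8
print(round(r_squared(obs, sim), 3))  # 1.0
```

The example shows why the two measures are reported together: a constant bias lowers NSE but leaves the correlation-based R² unchanged, which is consistent with R² exceeding NSE in both the calibration and verification periods.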

Conclusion
In this study, a sparse polynomial chaos expansions method was developed to quantify the effects of parameter uncertainties on model performance. The hydrological process from rainfall to runoff was simulated using the SLURP model with consideration of canopy interception, snowpack, aerated soil storage and groundwater. The Sobol sensitivity indices were calculated based on the polynomial chaos expansions model and the sparse grid collection. However, a uniform distribution of model parameters was used as the sampling distribution for sparse grid collection. In the real world, multiple uncertain parameters may be interrelated, which makes the parameter distributions themselves uncertain. Thus, other distributions (e.g., Gaussian) may be introduced to enhance the applicability of the developed sparse polynomial chaos expansions model.