Mind the (spectral) gap: how the temporal resolution of wind data affects multi-decadal wind power forecasts

To forecast wind power generation on the scale of years to decades, outputs from climate models are often used. However, one major limitation of the data projected by these models is their coarse temporal resolution, usually not finer than three hours and sometimes as coarse as one month. Due to the non-linear relationship between wind speed and wind power, and the long forecast horizon considered, small changes in wind speed can result in big changes in projected wind power generation. Our study indicates that the distribution of observed 10 min wind speed data is relatively well preserved using three- or six-hourly instantaneous values. In contrast, daily or monthly values, as well as any averages, including three-hourly averages, are almost never capable of preserving the distribution of the underlying higher resolution data. Assuming that climate models behave in a similar manner to observations, our results indicate that output at three-hourly or six-hourly temporal resolution is high enough for multi-decadal wind power generation forecasting. In contrast, wind speed projections of lower temporal resolution, or averages over any time range, should be handled with care.


Introduction
Wind is one of the main sources of renewable power and its utilisation is on the rise in many countries [e.g. Soares-Ramos et al., 2020]. It is therefore of utmost importance to have reliable wind power forecasts in the range of years to decades (i.e. a turbine's lifetime) for site assessment and reliable future power supply [Copernicus, 2023]. However, the high variability and stochasticity of weather and wind introduce uncertainty that makes multi-decadal planning difficult. Furthermore, computational complexity limits the resolution of forecasts; the temporal and spatial resolutions of long-term and multi-decadal forecasts are therefore usually much coarser than those of short-term forecasts [Eyring et al., 2016].
But do the low temporal resolution outputs from e.g. climate models provide wind speeds representative of the true site-specific high resolution wind speed distribution? An indicator that this could be the case is the so-called wind power spectral gap [Van der Hoven, 1957]. Wind speed variability can be assessed in the frequency domain in terms of power spectra by (Fourier-)decomposing high-resolution wind speed observations. This decomposition reveals an amplitude gap between high and low frequencies, where the strong variability at high frequencies (on the order of seconds to minutes) is associated with turbulence, while strong low-frequency variability (hours to days) is associated with synoptic weather systems [Van der Hoven, 1957], [Stull, 1988]. Between the synoptic weather and turbulence peaks in the frequency spectrum lie frequencies with little variability; this gap was found to be in the range of $10^{-3}$ Hz (∼17 min) to $10^{-4}$ Hz (∼3 h) by Van der Hoven [1957]. However, other research suggests the gap is smaller [Kang and Won, 2016] or may not exist at all [Larsén et al., 2016]. This spectral gap is described in many research papers [Horvath et al., 2012], [Kang and Won, 2015], [Larsén et al., 2016], [Lopez-Villalobos et al., 2021], indicating that observations that fall into the corresponding frequencies do not add much knowledge. Given the inconsistency of the width of this spectral gap, the question remains of what temporal resolution we should aim for in multi-decadal wind power forecasting.
Due to the non-linear relationship between wind speed and wind power, small changes in the wind speed distribution can lead to significant changes in wind power generation. However, if the underlying wind speed characteristics are preserved, lower-resolution data is preferred for multi-decadal forecasting to reduce required storage space and potentially also computational costs.
Additionally, multi-decadal wind speed projections should also account for climate change and interannual climate variability [Pryor et al., 2018]. Several studies indicate that climate change will affect average wind speeds [McVicar et al., 2012], [Tobin et al., 2015] as well as wind speed variability [Pryor and Barthelmie, 2010], [Tobin et al., 2016], [Dunn et al., 2019], [Jeong and Sushama, 2019], [Ringkjøb et al., 2020], which will impact wind power generation. In general, it must be assumed that current wind conditions may not be representative of future wind conditions [Jung and Schindler, 2019]; we thus often rely on output from climate models to help inform about future potential wind power. But which data (resolution) do we need for meaningful wind speed and power predictions? There seems to be agreement among researchers that certain temporal resolutions are too low. It is therefore common to account for additional variability using so-called downscaling techniques [Pryor and Barthelmie, 2010]. In the past, different statistical downscaling approaches have been introduced, e.g. [Von Storch, 1999], [Tobin et al., 2015], [Shin et al., 2018], as well as dynamical downscaling using regional climate models such as CORDEX [Giorgi and Gutowski Jr, 2015], used by e.g. Davy et al. [2018] and Yang et al. [2022]. The question remains, however: what temporal resolution of wind speed data is high enough?
The output from climate models is often available in various temporal aggregations, where some datasets represent temporal averages and other datasets consist of instantaneous values. In the CMIP6 datasets [Eyring et al., 2016], one of the most widely used sets of global circulation models (GCMs), wind projections are available as temporal means and instantaneous values [CMIP6 data request, 2016]. Currently, however, little focus has been placed on the choice of data and many studies are performed without explicitly stating whether averages or instantaneous values are being used.
With this study we show that the type of data (instantaneous vs averages) influences the wind speed distributions and thus the estimated wind power generation. We also analyze whether there is a temporal resolution that is high enough, i.e. for which added temporal resolution provides little additional information and accuracy. We give recommendations regarding the choice of data and the temporal resolution that downscaling techniques, and climate model outputs, should aim for. To do this, we conduct empirical data analysis using data from eight different mid-latitude sites across Europe and North America. In Section 2 we describe the data and the methodology used. Our results are presented in Section 3 and we discuss their implications in Section 4. Finally, we conclude in Section 5.

Methods
To investigate how aggregating wind speed data affects the wind speed distribution we use both parametric and non-parametric approaches. In the main manuscript we focus on observations of turbine-hub-height winds from four sites, with locations shown in Figure 1. We use these as our primary data as they are hub-height data; however, all four sites have a relatively limited observational period of only a few years (see Table 1). We thus also use 10m wind speeds from an additional four observational met masts from locations across Germany (see Figure A.1 for locations) that have between 18 and 34 years of data available. These data show very similar results to the hub-height datasets (analysis presented in the supplementary material in Section A). We also use these longer datasets to analyze multi-decadal tendencies of wind power generation in Section 3. In the following we first give a description of the data used and then describe our methodology.

Data
We investigate wind speed observations using open-source high met mast wind data of four mid-latitude locations. All of the wind speeds are either measured at wind turbine hub-height directly, i.e. by nacelle anemometers (sites Penmanshiel and Kelmarsh), or by high met masts (sites NWTC and Owez). The observation heights are between 59m and 116m and all of the measurements are provided as 10min averages, a very common aggregation level of wind resource data [Harper et al., 2010].
In Table 1 we present static information on the observation sites. Static information on the longer datasets (at 10m height) is presented in Table A.1. Unless specified otherwise, the following abbreviations for the datasets can be found in the top left corner of figures: a) Kelmarsh, b) Penmanshiel, c) NWTC, d) Owez, e) Aachen, f) Zugspitze, g) Boltenhagen, h) Fichtelberg.

Figure 1: Locations of the two wind farms in the UK and the tall towers in central North America and the Netherlands. Our hub-height observation locations include one mountainous site (NWTC), one off-shore site (Owez), one coastal site (Penmanshiel) and one site on flat terrain (Kelmarsh). Penmanshiel and Kelmarsh are wind farm sites; their wind speeds are therefore influenced by wake effects of the surrounding turbines [e.g. González-Longatt et al., 2012] and we only used one turbine for our evaluations. Ramon et al. [2022b]
Table 1: Static data of the four different sites with hub-height measurements. Our chosen datasets cover a large range of observation heights as well as different mean wind speeds and variances.
To bring the data to a format where we can compare different temporal resolutions, we pre-process the data by excluding all days where at least one observation is missing. We then average the 10 min wind speed observations to three-hourly, six-hourly and daily data: the n'th wind speed value $w^t_n$ in the time series averaged over $t$ consecutive time steps is computed as

$$w^t_n = \frac{1}{t} \sum_{i=(n-1)t+1}^{nt} w_i,$$

where $w_i$ is the i'th 10 min observation and $t = 18$ (three-hourly), 36 (six-hourly) and 144 (daily) respectively; for $t = 1$ we get the original 10 min resolution time series $w^1$. In addition to calculating averages, we also consider wind speed time series of lower resolution, which we call instantaneous values. To do so, we use wind speed measurements every t'th time step only and exclude all other wind speed measurements $w_i$ where $i \neq nt$. The results are eight observations per day (every three hours), four observations per day (every six hours) and one observation per day respectively.

Comparing wind speed distributions
To determine whether wind speed distributions from data of different temporal resolution are statistically different, we compute pairwise Kolmogorov-Smirnov test statistics of their cumulative distribution functions. The Kolmogorov-Smirnov statistic $D$ is given by

$$D = \sup_{w} |T(w) - S(w)|,$$

where $w$ are the wind speed values, $T$ and $S$ are the cumulative wind speed distributions to be compared, and the supremum, sup, is the largest value of the set of values $|T(w) - S(w)|$ across all $w$. The Kolmogorov-Smirnov test only takes the largest absolute difference between the two distributions across all $w$ values into account, and we identify a statistically significant difference if the p-value of an individual test is p ≤ 0.05.
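As an illustration of the statistic $D$, the following sketch computes the two-sample Kolmogorov-Smirnov distance from empirical cumulative distribution functions. The function name `ks_statistic` is an assumption made for this example, and the p-value calculation used in the study is not reproduced here.

```python
from bisect import bisect_right


def ks_statistic(sample_t, sample_s):
    """Largest absolute difference between two empirical CDFs."""
    t_sorted, s_sorted = sorted(sample_t), sorted(sample_s)
    d = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to evaluate the difference at those values.
    for w in t_sorted + s_sorted:
        cdf_t = bisect_right(t_sorted, w) / len(t_sorted)
        cdf_s = bisect_right(s_sorted, w) / len(s_sorted)
        d = max(d, abs(cdf_t - cdf_s))
    return d
```

Identical samples give a distance of zero and samples with no overlap give a distance of one; every other case falls in between.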

Modeling wind speeds using Weibull distributions
The Kolmogorov-Smirnov test can tell us whether wind distributions of different temporal resolution differ; in order to quantify the differences found, we model the wind speeds, $w$, as Weibull distributions. This is done by fitting the parameters of a three-parameter Weibull distribution to the different temporal resolution datasets. The three-parameter Weibull distribution described in Equation (4) is defined by

$$f(w; \beta, \lambda, \theta) = \frac{\beta}{\lambda} \left(\frac{w - \theta}{\lambda}\right)^{\beta - 1} \exp\left(-\left(\frac{w - \theta}{\lambda}\right)^{\beta}\right) \tag{4}$$

for $w \geq \theta$ and $f(w; \beta, \lambda, \theta) = 0$ for $w < \theta$, where $\beta > 0$ is the shape parameter, $\lambda > 0$ is the scale parameter and $\theta$ is the location parameter of the distribution, which equals the lowest possible value of the distribution. For $\beta \approx 3$, the Weibull distribution approximates a Gaussian distribution, while $3 > \beta \geq 1$ corresponds to a right-skewed distribution and $\beta > 3$ corresponds to a left-skewed distribution; for $\beta < 1$ the density values are steadily decreasing with increasing $w$. $\lambda$ represents the variability, i.e. smaller values of $\lambda$ are associated with less variability [Rinne, 2008]. We fit the parameters using Maximum Likelihood Estimation (MLE). This approach requires maximizing the likelihood function

$$L(\beta, \lambda, \theta) = \prod_{i=1}^{n} f(w_i; \beta, \lambda, \theta)$$

over the $n$ wind speed values $w_i$ of a dataset. We can then evaluate the change of the parameters when wind speeds are averaged or discarded to produce datasets with different temporal resolution. While Weibull distributions are commonly used to model wind speed distributions [e.g. Mert and Karakuş, 2015], we additionally use a generalized Gamma distribution to test the sensitivity of our results to the choice of underlying distribution. The conclusions are unchanged (figures shown in the appendix) and we therefore consider our results to be insensitive to the distribution choice.
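To make the likelihood maximization concrete, the sketch below evaluates the Weibull log-likelihood and fits the shape and scale by MLE. It is a simplification rather than the procedure used in the study: the location θ is held fixed (the text fits all three parameters), the shape is found by bisection on the profile score equation, and all function names are hypothetical.

```python
import math
import random


def weibull_logpdf(w, beta, lam, theta=0.0):
    """Log-density of the three-parameter Weibull distribution for w > theta."""
    z = (w - theta) / lam
    return math.log(beta / lam) + (beta - 1) * math.log(z) - z ** beta


def log_likelihood(data, beta, lam, theta=0.0):
    """Sum of log-densities, i.e. the logarithm of the likelihood function."""
    return sum(weibull_logpdf(w, beta, lam, theta) for w in data)


def fit_weibull_fixed_location(data, theta=0.0, lo=0.05, hi=50.0, iters=100):
    """MLE of shape and scale with theta held fixed (an assumption made here)."""
    x = [w - theta for w in data]
    if min(x) <= 0:
        raise ValueError("all observations must exceed theta")
    x_max = max(x)
    mean_log = sum(math.log(v) for v in x) / len(x)

    def score(beta):
        # Root of this function is the MLE of the shape parameter.
        # Values are scaled by x_max to avoid overflow; the scaling cancels.
        powers = [(v / x_max) ** beta for v in x]
        weighted = sum(p * math.log(v) for p, v in zip(powers, x))
        return weighted / sum(powers) - 1.0 / beta - mean_log

    for _ in range(iters):  # bisection works because score increases with beta
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    scaled_mean = sum((v / x_max) ** beta for v in x) / len(x)
    lam = x_max * scaled_mean ** (1.0 / beta)
    return beta, lam


# Synthetic check: draw from a known Weibull (scale 8, shape 2) and refit.
rng = random.Random(0)
data = [rng.weibullvariate(8.0, 2.0) for _ in range(5000)]
beta_hat, lam_hat = fit_weibull_fixed_location(data)
print(round(beta_hat, 2), round(lam_hat, 2))
```

Applying the same fit to averaged and to sub-sampled versions of one series gives the kind of parameter comparison described in the text, although a full three-parameter fit would also estimate θ.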

Validating the Weibull parametrization
Using a kernel density estimation we confirm that Weibull distributions are a reasonable representation of our data, with the exception of monthly averages. The kernel density estimator $\hat{f}$ of an unknown density $f$ at a point $x$ is defined by

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right),$$

where $x_1, \dots, x_n$ are the observations and we choose $K$ to be the Gaussian kernel

$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right).$$

The bandwidth $h$ is selected using Scott's rule [Scott, 2015].
Figures A.2 and A.4 show kernel density estimations for averaged and instantaneous wind speeds respectively; it is clear that monthly values cannot be described by a Weibull distribution and thus we do not include this temporal resolution in subsequent analysis. To validate the fit of the Weibull distributions to the original wind speed distributions we generate quantile-quantile plots of the observations of length $l_t$ against $l_t$ randomly drawn samples from the corresponding Weibull distribution; these plots are shown in Figure A.5 and Figure A.6. The Weibull distributions generally provide a good fit to the data, although in some locations the fit is poorer at higher wind speeds.
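A Gaussian kernel density estimate of this form can be written directly from its definition. The sketch below is illustrative only: the bandwidth uses one common form of Scott's rule for one-dimensional data, $h = \hat{\sigma} n^{-1/5}$, which is an assumption here since the exact constant used in the study is not stated, and the function names are hypothetical.

```python
import math
import random
from statistics import stdev


def scott_bandwidth(x):
    """Scott's rule in its common 1-D form: sample std times n^(-1/5)."""
    return stdev(x) * len(x) ** (-1 / 5)


def gaussian_kde(x, h=None):
    """Return a function that evaluates the Gaussian KDE of the sample x."""
    h = scott_bandwidth(x) if h is None else h
    norm = 1.0 / (len(x) * h * math.sqrt(2 * math.pi))

    def f_hat(point):
        return norm * sum(math.exp(-0.5 * ((point - xi) / h) ** 2) for xi in x)

    return f_hat


# Synthetic wind-speed-like sample (not real observations).
rng = random.Random(0)
sample = [rng.weibullvariate(8.0, 2.0) for _ in range(300)]
density = gaussian_kde(sample)
print(round(density(8.0), 4))
```

The estimate can then be compared visually with a fitted Weibull density, which is the role the kernel density estimate plays in the validation step.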

Power generation and transferability of results
As a last step, we relate the insights from the wind speed distributions to multi-decadal wind power forecasts. Wind power generation is often forecasted using hub-height wind speed forecasts with a turbine-specific wind power curve [Wang et al., 2019] that describes the relationship between wind (speed) and potential wind power generation and is highly non-linear. We apply the Enercon E92/2350 wind power curve, visualized in Figure 2. Given the relatively short duration of the main observational datasets we use in this study (around 3-5 years of non-missing data) and our interest in multi-decadal forecasts, we study potential power generation using four 18 to 34 year long wind speed datasets.
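The non-linearity of a power curve, and why it makes averaged wind speeds problematic, can be illustrated with an idealized curve. The sketch below is not the Enercon E92/2350 curve: the cut-in, rated and cut-out speeds, the cubic ramp, and the function name `power_curve` are all illustrative assumptions.

```python
def power_curve(v, cut_in=2.0, rated_speed=14.0, cut_out=25.0, rated_power=2350.0):
    """Idealized power curve in kW; all parameter values are illustrative."""
    if v < cut_in or v > cut_out:
        return 0.0  # below cut-in or shut down above cut-out
    if v >= rated_speed:
        return rated_power  # constant output between rated and cut-out speed
    # Cubic ramp between cut-in and rated wind speed.
    return rated_power * (v ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)


# Two wind speeds and their average: power of the mean vs. mean of the power.
speeds = [4.0, 12.0]
mean_speed = sum(speeds) / len(speeds)
power_of_mean = power_curve(mean_speed)
mean_of_power = sum(power_curve(v) for v in speeds) / len(speeds)
print(power_of_mean < mean_of_power)
```

On the convex part of such a curve, feeding an averaged wind speed into the curve gives less power than averaging the power of the individual wind speeds, which is one way to understand the underestimation associated with averaged data.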
Additionally, to determine whether the results we find for observational datasets are also applicable to climate model data, we repeat our analysis using data generated by the historical run of the MPI-ESM1-2-LR general circulation model [Eyring et al., 2016] which was the only model with three-hourly data available on the ESGF node [WCRP CMIP6, 2023] at the time of research.

Results
We analyze the distributions of wind speed averages and instantaneous wind speed time series, and find consistent results across all sites investigated: averaging introduces shifts to the wind speed distributions, while three-hourly and six-hourly instantaneous data are usually close to the original.This is clearly demonstrated in Figure 3, with differences between 10min wind speeds and averaged data (left hand column) larger than the differences between 10min wind speeds and instantaneous wind speeds of lower resolution (right hand column).Similar patterns can be observed in the four longer datasets, see Figure A.7.
The impact of averaging can also be seen in the variance of the data: while averaging does not affect the mean, it reduces the variance of the averaged time series (Figure 4). In contrast, three-hourly and six-hourly instantaneous values (dashed lines in Figure 4) preserve the variance, with daily instantaneous values preserving substantially more of the variance than daily averages and, at some sites, more than six-hourly averages.
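The variance reduction can be made explicit with a standard result for the mean of correlated samples. As a sketch, assume (this is not stated in the text above) that the 10 min series is weakly stationary with variance $\sigma^2$ and lag-$k$ autocorrelation $\rho_k$; both symbols are introduced here for illustration only. The variance of a block mean over $t$ time steps is then:

```latex
\operatorname{Var}\!\left(\frac{1}{t}\sum_{i=1}^{t} w_i\right)
  = \frac{\sigma^2}{t}\left[1 + 2\sum_{k=1}^{t-1}\left(1-\frac{k}{t}\right)\rho_k\right]
  \le \sigma^2 ,
```

with equality only if all samples within a block are perfectly correlated. Sub-sampling, by contrast, leaves each retained value unchanged, so under the same stationarity assumption its marginal distribution, and hence its variance, matches that of the 10 min data up to sampling noise.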

Kolmogorov-Smirnov tests
To quantify whether the differences between the 10min wind speed distribution and the averaged and instantaneous distributions of lower resolution are statistically significant, we compute pairwise Kolmogorov-Smirnov test statistics of their cumulative distribution functions. The results for the Penmanshiel site are presented in [...]
Figure 3: Difference of cumulative densities from the 10min data to the other temporal resolution datasets for average wind speeds (left) and instantaneous wind speeds (right); x-axis: wind speed (m/s), y-axis: cumulative density, curves: day, 6h, 3h. It can be seen that the averaged wind speeds are visually distinguishable, which is less the case for instantaneous wind speeds, particularly for data with a temporal resolution of six-hourly or higher.

Changes in Weibull parameters
The cumulative density plots and Kolmogorov-Smirnov tests indicate that wind speed distributions are likely to change when wind speeds are averaged and likely to stay similar when measurements are discarded, at least until around six-hourly resolution.
The aim of the next steps is to quantify these changes by parameterizing the distributions as Weibull distributions.
Figure 5 shows the values of the three Weibull parameters for the different aggregation levels for averaging (top row) and instantaneous (bottom row). The shape parameter β stays approximately constant across all aggregation levels and types, for both averaged and instantaneous data. For averaging, the location parameter θ increases with higher aggregation levels. This is consistent with the lowest values of the dataset increasing as they are averaged. Conversely, with lower resolution instantaneous data, the lowest values remain similar, leading to similar θ for all resolutions studied. The scale parameter λ decreases with increased averaging length, and remains relatively constant for instantaneous data, consistent with the changes in variance shown in Figure 4.
To test for robustness of our results we also use an MLE to fit a generalized Gamma distribution (see Appendix for details); both Weibull and Gamma distributions are regarded as suitable statistical models for wind speed data [e.g. Mert and Karakuş, 2015]. We find very similar results (see Figure A.11 to Figure A.14), with large changes in parameters when averaging data, and relatively small changes for three- and six-hourly instantaneous values. This suggests our conclusions are not sensitive to our choice of parameterization.

Implications for multi-decadal wind power forecasting
As introduced in Section 1, wind speeds and wind speed variability are subject to interannual variability and climate change.
Hence, for multi-decadal wind power forecasts climate models can provide useful information [e.g.Tobin et al., 2015].To understand how our results apply on multi-decadal timescales, we repeat our analysis using data from four sites where multiple decades of 10m wind speed observations are available.We also repeat our analysis on climate model output data to determine whether our conclusions are applicable to model data.We extract 10m wind speeds from the historical run of the CMIP6 model MPI-ESM1-2-LR for grid points closest to these four longer-term observational sites (locations shown in Figure A.1).We only use direct output from the model, and thus at daily temporal resolution we do not have instantaneous values, only averages.
For these multi-decadal datasets the Kolmogorov-Smirnov tests and Weibull parameterization analysis produce results comparable to those for the hub-height observations shown in the previous sections.The parameters of fitted Weibull distributions behave similarly in response to changing temporal resolution, with a decrease of λ as temporal resolution decreases indicating a decrease in variability (see Figure 6).The climate model data do not show an increase in θ with decreasing temporal resolution, although for three-hourly and six-hourly averages θ fitted to the climate model data is very close to the parameters fitted to the observational data.
So far we have only looked at wind speeds and their distributions. However, these are just a proxy for wind power, our variable of interest, and wind power depends non-linearly on wind speed. To estimate the power a wind turbine could generate over its lifetime, we apply the Enercon E92/2350 power curve (see Figure 2) to the four multi-decadal observational wind speed datasets as well as to the wind speeds from the corresponding closest grid points in the CMIP6 dataset. We then compare the expected cumulative power generation of the highest available resolution (10min in observations and three-hourly instantaneous values in the CMIP6 model) to lower resolutions (three-hourly, six-hourly, and daily averages and instantaneous values). Although the highest available resolution in the CMIP6 model is only three-hourly, our previous analysis shows that these values are closely aligned with 10min data.
Figure 7 shows the expected cumulative power generation using wind speed observations of different resolutions. We show this as a fraction of the total power generation achieved when applying the wind power curve to the 10min observational wind speed data and integrating over the whole time period. The dotted gray line shows 100% and thus if the expected cumulative power generation reaches this threshold without overshooting we consider the change introduced by the temporal resolution to be small. The top row of Figure 7 shows that averaging values, particularly to daily, but even to three- or six-hourly at some locations, can lead to relatively large errors in estimated power generation, with errors of up to -34.48% (daily average), -15.45% (six-hourly average), and -10.06% (three-hourly average). Conversely, three-hourly and six-hourly instantaneous values, shown in the bottom row, reveal very similar results to the 10min data, with errors less than 2%.
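The comparison behind these percentages can be sketched as follows: apply a power curve to each series, weight every value by the time span it represents, accumulate, and express the final total relative to the 10 min reference. Everything named below is an assumption made for illustration: `capped_cubic` is a stand-in curve rather than the Enercon E92/2350 curve, and the wind speeds are a synthetic autocorrelated series, so the printed numbers are not comparable to the results above.

```python
import random
from itertools import accumulate


def cumulative_energy(speeds, hours_per_sample, power_curve):
    """Running energy total (kWh) when each sample stands for hours_per_sample hours."""
    return list(accumulate(power_curve(v) * hours_per_sample for v in speeds))


def relative_error_percent(low_res_total, reference_total):
    """Signed error of a lower-resolution total relative to the reference."""
    return 100.0 * (low_res_total - reference_total) / reference_total


def capped_cubic(v):
    """Hypothetical stand-in power curve (kW), not the Enercon E92/2350 curve."""
    return 0.0 if v < 2 or v > 25 else min(2350.0, 2350.0 * (v / 14.0) ** 3)


# Hypothetical autocorrelated 10 min wind speeds (AR(1) around 8 m/s), 30 days.
rng = random.Random(1)
speeds, v = [], 8.0
for _ in range(30 * 144):
    v = 8.0 + 0.98 * (v - 8.0) + rng.gauss(0, 0.6)
    speeds.append(max(0.0, v))

reference = cumulative_energy(speeds, 1 / 6, capped_cubic)[-1]
averaged = [sum(speeds[i:i + 18]) / 18 for i in range(0, len(speeds), 18)]
instantaneous = speeds[17::18]
for name, series in [("3h average", averaged), ("3h instantaneous", instantaneous)]:
    total = cumulative_energy(series, 3.0, capped_cubic)[-1]
    print(name, round(relative_error_percent(total, reference), 2), "%")
```

Keeping the whole running total, rather than only the final value, makes it possible to plot how the difference to the reference develops over the forecast horizon.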
Figure 8 demonstrates that very similar results are found using the output from the MPI-ESM1-2-LR global climate model, with results given relative to the total amount of power generated using three-hourly instantaneous values. In all cases six-hourly instantaneous values are closest to three-hourly instantaneous values, followed by three-hourly averages and six-hourly averages; daily averages differ substantially.
Reducing temporal resolution leads to an underestimation of expected power generation at all sites studied except for daily instantaneous values at Zugspitze (Figure 7 bottom row, site (f)), a site with relatively high mean wind speeds and high wind speed variance, situated at a high elevation above sea level (2956m). This is very likely a function of the particular wind turbine power curve we have chosen. It is important to note that in all cases, both with observational data and climate projections, the difference in power generation compared to higher-resolution data increases with an increasing forecast horizon and does not average out: the shift in the wind distribution leads to a systematic error in wind power estimation. In general, averaging tends to result in an underestimation of expected power generation, while discarding values has only minor impacts.

Discussion
Global climate models simulate climate dynamics physically using partial differential equations. Their long forecast horizons make them computationally expensive with large storage space needs, and their output is usually provided with a temporal resolution ranging from three hours to one month, either as instantaneous or averaged values. Using hub-height wind speed observations, this study investigates how the temporal resolution of wind speed data affects the wind speed distributions. Using multi-decadal observational datasets at 10m height we study the corresponding potential power generation and which time resolution is actually necessary (see Table A.16).
Using hub-height data we find that in all cases investigated, three-hourly and six-hourly instantaneous observations preserve the underlying 10min wind speed distribution well, whilst three- and six-hourly averaging leads to distributional shifts. However, whether a significant wind speed distribution shift results in a significant change in wind power generation depends on the turbine and its corresponding power curve. Our results, using an example turbine power curve, indicate that the differences in wind distribution highlighted in this study can lead to accumulating errors when power generation is forecasted for years to decades (compare Figure 7 and Figure 8).
Our results are consistent across different observational sites and a GCM. We can therefore give two suggestions when working with wind speed projections for wind power modelling. First, instantaneous wind speed projections should be preferred over wind speed averages, as sub-sampling wind speed data introduces relatively minor errors in contrast to averaging wind speeds, where we observe a characteristic distributional shift. This shift associated with averaging data indicates that we might consistently over- or underestimate wind power generation. Second, instantaneous wind speed projections of six or three hours suffice, whilst daily data may be too low resolution, even with instantaneous values. For our sites, temporal downscaling of either three- or six-hourly data is unlikely to provide substantial improvement in accuracy. For example, instantaneous observational wind speeds with a three-hourly temporal resolution lead to errors of less than 0.29%, with six-hourly data leading to errors of less than 1.57%. The experiments conducted using climate model data support these results. This knowledge can be useful to reduce storage space almost loss-free and decrease computational complexity in further applications using the data.
The primary shortcomings of our investigations include a potential lack of generalizability across turbines, sensitivity to our underlying highest temporal data resolution, and uncertain transferability to other climate models. More specifically, as wind speed data measured at hub-height is often confidential, we are limited to relatively few datasets. Furthermore, for site assessment, wind speeds are usually transformed to hub-height, which is non-trivial [e.g. Bañuelos-Ruedas et al., 2010]. However, the close agreement of results from stations across a range of different locations, including different local geography (off-shore, on-shore, different altitudes and local topography), suggests that our results likely hold for the majority of locations. For 7 out of the 8 locations studied, using daily instantaneous or average values underestimates the power generation (see Figure 7), while six-hourly values are good approximations. For one station, however, the daily instantaneous data is an overestimate; understanding the conditions under which daily data over- or underestimates the power generation would be useful for studies in which only daily data is available. In addition, while the Enercon power curve we use in this study has the characteristic shape of any modern horizontal wind turbine power curve, it is not necessarily representative of the turbines at a particular site.
Figure 8: [...] wind power generation at the four CMIP6 locations closest to e) Aachen, f) Zugspitze, g) Boltenhagen, h) Fichtelberg, computed by feeding the wind speeds into a power curve. Top row: Average values underestimate wind power generation. Bottom row: Instantaneous values (note that daily instantaneous values were not available as direct output, and so are not included). Using six-hourly instantaneous values introduces only minor errors relative to three-hourly instantaneous values.
Regarding the sensitivity to the underlying data resolution, we use data in 10min resolution, hence variability on shorter time scales, often associated with turbulence [e.g. Stull, 1988], is not preserved. However, higher temporal resolution data is rarely available or used in wind power forecasting [Tawn and Browell, 2022], [Effenberger and Ludwig, 2022], and thus we assume that this is a minor issue. This claim is also supported by Lopez-Villalobos et al. [2021], who investigate wind power spectra [Van der Hoven, 1957] and find only small differences between power production given wind speeds of different resolution between 1min and 6h.
Lastly, climate projections are characterized by different pathways that describe anthropogenic climate change [Eyring et al., 2016]. This makes handling the data cautiously even more important, especially in wind power forecasting, where wind speed projections are often used as proxies for power generation even though power depends non-linearly on wind speed. Our results using one historical CMIP6 run indicate that changes in observational wind speed distributions for different temporal resolution data can be seen in data from climate models as well. However, future research has to investigate the sensitivity of these results to different climate models.

Conclusion
Wind power generation depends non-linearly on wind speeds. For wind power forecasts on the order of years to decades, small changes in modelling the wind speed distributions can result in large systematic errors in power estimation, with absolute errors increasing with longer forecast horizon. Using hub-height wind speed observations and climate model output data, this study investigates how the temporal resolution of wind speed data affects wind speed distributions and corresponding potential power generation.
We show that instantaneous wind speeds of lower resolution more accurately represent the underlying distribution of higher resolution data when compared to averaged wind speeds. Three- and six-hourly instantaneous values preserve the wind speed distribution of 10min wind speed averages well. Small changes in the wind speed distribution, through averaging or using daily data, have significant impacts on the estimated wind power generation of a turbine over its lifetime. These results hold true across several observational sites and a global climate model. Based on our results, we argue that modelling wind speed distributions correctly is what we should aim for in multi-decadal wind power forecasting.

Appendix A Supplementary Material
In the following we present results from additional datasets. Unless specified otherwise, we assign the following abbreviations to the different datasets: a) Kelmarsh, b) Penmanshiel, c) NWTC, d) Owez, e) Aachen, f) Zugspitze, g) Boltenhagen, h) Fichtelberg. They can be found in the top-left corner of all plots. To test for robustness of our results we also use MLE to fit a generalized Gamma distribution instead of a Weibull distribution. Both distributions were regarded as suitable statistical models in earlier studies [e.g. Mert and Karakuş, 2015].
Table A.16: Errors (in MW) of wind power generation prediction when using 30 years of data of lower resolution (three-hourly average and three-hourly and six-hourly instantaneous) wind speed observations compared to 10min wind speed observations.

Figure 2: Wind power curve of Enercon E92/2350 turbine. The relationship between wind speed and wind power can be roughly divided into four different regions: No power is generated if the wind speed is below the cut-in wind speed or if the wind speed is above the cut-out wind speed where the turbine is shut down to protect it from damage. In between, wind power generation first increases rapidly with increasing wind speed. Once the maximum wind speed that the wind turbine can convert to power is reached, the power output is usually constant until the wind speed exceeds the cut-out wind speed.

Figure 4: Variances of averaged (solid lines) and instantaneous (dashed lines) values. The variances steadily decrease when wind speeds are averaged and stay close to the 10 min variances for the instantaneous three-hourly and six-hourly distributions.
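The behaviour shown in Figure 4 can be reproduced qualitatively with a synthetic series. The sketch below assumes an AR(1) process as a stand-in for 10 min wind speeds; the autocorrelation, mean and standard deviation are hypothetical values chosen for illustration and are not derived from the datasets in this study.

```python
import numpy as np

rng = np.random.default_rng(0)
STEPS_PER_3H = 18                  # eighteen 10 min values per three hours
N_BLOCKS = 5000                    # number of three-hour windows
PHI = 0.95                         # assumed lag-1 autocorrelation
MEAN_SPEED, STD_SPEED = 8.0, 3.0   # assumed mean and spread in m/s

# Build an autocorrelated synthetic 10 min series (AR(1) around the mean).
n = STEPS_PER_3H * N_BLOCKS
noise = rng.normal(0.0, STD_SPEED * np.sqrt(1.0 - PHI**2), size=n)
anomaly = np.empty(n)
anomaly[0] = rng.normal(0.0, STD_SPEED)
for t in range(1, n):
    anomaly[t] = PHI * anomaly[t - 1] + noise[t]
speed_10min = np.maximum(MEAN_SPEED + anomaly, 0.0)  # no negative speeds

# Two ways of going to three-hourly resolution.
blocks = speed_10min.reshape(N_BLOCKS, STEPS_PER_3H)
speed_3h_avg = blocks.mean(axis=1)   # averaged values
speed_3h_inst = blocks[:, 0]         # instantaneous values (one per window)

var_10min = speed_10min.var()
var_avg = speed_3h_avg.var()
var_inst = speed_3h_inst.var()
print(f"10 min: {var_10min:.2f}, 3h avg: {var_avg:.2f}, 3h inst: {var_inst:.2f}")
```

Averaging smooths out the fluctuations within each window and so reduces the variance, whereas keeping one value per window only thins the sample and leaves the spread of the distribution essentially unchanged.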

Figure 5 shows the values of the three Weibull parameters at the different aggregation levels for averaged (top row) and instantaneous (bottom row) data. The shape parameter β stays approximately constant across all aggregation levels, for both averaged and instantaneous data. For averaging, the location parameter θ increases with higher aggregation levels. This is consistent with the lowest values of the dataset increasing as they are averaged. Conversely, with lower resolution instantaneous data, the lowest values remain similar, leading to similar θ for all resolutions studied. The scale parameter λ decreases with increased averaging length and remains relatively constant for instantaneous data, consistent with the changes in variance shown in Figure 4. We observe very similar results for ∼30 years of data measured at 10 m height (see Figure A.8 and Figure A.9). In addition, two of our observational sites, NWTC and Owez, have wind speed data at multiple heights, ranging from 10 m to 130 m; the changes in Weibull parameters seen in Figure 5 are found at all heights studied (see Figure A.10).

Figure 5: Parameters when data are averaged (top row) or discarded (bottom row) for four different datasets: a) Kelmarsh, b) Penmanshiel, c) NWTC, d) Owez. The Weibull parameters change when wind speeds are averaged and stay similar when values are discarded. The corresponding non-parameterized wind speed distributions are visualized in Figure A.2 and Figure A.4.

Figure 6: Parameters of Weibull distribution fitted to observational data averages (dashed lines) of the four multi-decadal sites and to the closest grid points in the CMIP6 MPI-ESM1-2-LR dataset (solid lines). The parameter λ decreases in all cases when comparing three-hourly to daily averages, indicating a decrease in variability. The shape parameter β does not show any consistent trends. θ increases with daily averages in observational data, but not in the CMIP6 data.

Figure 8: Cumulative power generation of daily, six-hourly and three-hourly values relative to three-hourly instantaneous (3h inst.) wind power generation at the four CMIP6 locations closest to e) Aachen, f) Zugspitze, g) Boltenhagen, h) Fichtelberg, computed by feeding the wind speeds into a power curve. Top row: Average values underestimate wind power generation. Bottom row: Instantaneous values (note that daily instantaneous values were not available as direct output, and so are not included). Using six-hourly instantaneous values introduces only minor errors relative to three-hourly instantaneous values.
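The comparison in Figure 8 rests on feeding wind speeds of different resolutions into a power curve and accumulating the result. A minimal sketch of that calculation is given below. The tabulated curve is hypothetical, and the wind speeds are drawn independently from an assumed Weibull distribution; since real wind speeds are autocorrelated, this setup exaggerates the effect of averaging and should be read as qualitative only.

```python
import numpy as np

# Hypothetical tabulated power curve (wind speed in m/s -> power in kW).
# Illustrative values only; not a manufacturer data sheet.
CURVE_SPEED = np.array([0.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 14.0, 25.0])
CURVE_POWER = np.array([0.0, 0.0, 107.0, 294.0, 624.0, 1140.0, 1881.0, 2350.0, 2350.0])

def power_kw(speed):
    """Interpolate the tabulated curve; above 25 m/s (cut-out) the output is 0."""
    return np.interp(speed, CURVE_SPEED, CURVE_POWER, right=0.0)

rng = np.random.default_rng(42)
STEPS_PER_3H = 18   # eighteen 10 min values per three hours
N_BLOCKS = 5000     # number of three-hour windows

# Assumption: independent Weibull(shape=2, scale=8 m/s) wind speeds.
speed_10min = 8.0 * rng.weibull(2.0, size=STEPS_PER_3H * N_BLOCKS)
blocks = speed_10min.reshape(N_BLOCKS, STEPS_PER_3H)

# Energy in MWh: sum of (power in kW * step length in h), divided by 1000.
energy_10min = power_kw(speed_10min).sum() * (1.0 / 6.0) / 1000.0
energy_3h_avg = power_kw(blocks.mean(axis=1)).sum() * 3.0 / 1000.0
energy_3h_inst = power_kw(blocks[:, 0]).sum() * 3.0 / 1000.0

print(f"3h averages / 10 min:      {energy_3h_avg / energy_10min:.2f}")
print(f"3h instantaneous / 10 min: {energy_3h_inst / energy_10min:.2f}")
```

In this synthetic setup the averaged speeds cluster around the mean, where the curve is low and steep, so the energy computed from them falls short of the 10 min reference, while the instantaneous subsample stays close to it.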

Figure A.1: Locations of German weather stations that serve as longer datasets in blue. The corresponding CMIP6 locations we investigate for Figure 6 and Figure 8 are marked in orange.

Figure A.2: Kernel density estimations of averaged wind speeds. The distributions of 10 min, three-hourly, six-hourly and daily data look similar to a Weibull distribution; monthly data are not Weibull distributed.

Figure A.5: QQ-plots of averaged wind speeds and fitted Weibull distributions. From left to right the data resolution is 10 min, three-hourly averages, six-hourly averages and daily averages.

Figure A.6: QQ-plots of instantaneous wind speeds and fitted Weibull distributions. From left to right the data resolution is 10 min, three-hourly averages, six-hourly averages and daily averages.

Figure A.8: Weibull parameter trends when fitted to averaged wind speed distributions.

Figure A.9: Weibull parameter trends when fitted to instantaneous wind speed distributions.

Figure A.10: Weibull parameter trends at different observation heights for the tall towers NWTC and Owez.

Figure A.11: QQ-plots of averaged wind speeds and fitted Gamma distributions. From left to right the data resolution is 10 min, three-hourly averages, six-hourly averages and daily averages.

Figure A.13: Gamma parameter trends when fitted to averaged wind speed distributions.

Figure A.14: Gamma parameter trends when fitted to instantaneous wind speed distributions.
p-values of Kolmogorov-Smirnov test for instantaneous NWTC wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.

(partial table rows: … · 10⁻¹, 9.20 · 10⁻²; three-hourly: 1, 4.95 · 10⁻¹)

Table 2 (for averaged data) and Table 3 (instantaneous data), where values in bold indicate no statistically significant difference in the distributions, i.e. that using the dataset of lower temporal resolution may retain all information about the wind speed distribution. We observe that all averaged distributions, as well as daily instantaneous distributions, differ significantly from the 10 min wind speeds (right-most column), whilst three- and six-hourly instantaneous values are not distinguishable from the 10 min data. The results for all other locations are very similar (see Table A.2 to Table A.15): in general, averages do not preserve the wind speed distribution, and three- and six-hourly instantaneous values are almost always not statistically distinguishable from the 10 min data.

Table 2: Test statistics of Kolmogorov-Smirnov test for Penmanshiel averages. We reject the hypothesis that the wind speed distributions are equal if p ≤ 5 · 10⁻². Therefore, only the three-hourly averages are not significantly different from six-hourly averages.

Table 3: Test statistics of Kolmogorov-Smirnov test for Penmanshiel instantaneous data. We reject the hypothesis that the wind speed distributions are equal if p ≤ 5 · 10⁻², which reveals that only daily instantaneous values are significantly different from all other distributions.
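The decision rule used in these tables can be illustrated with a two-sample Kolmogorov-Smirnov test. The sketch below uses `scipy.stats.ks_2samp` on synthetic data; the independent Weibull wind speeds are an assumption made here for illustration, and the exact test configuration used for the tables is not specified in this section.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
STEPS_PER_3H = 18   # eighteen 10 min values per three hours
N_BLOCKS = 2000     # number of three-hour windows
ALPHA = 0.05        # reject equality of distributions if p <= ALPHA

# Assumption: independent Weibull(shape=2, scale=8 m/s) wind speeds.
speed_10min = 8.0 * rng.weibull(2.0, size=STEPS_PER_3H * N_BLOCKS)
blocks = speed_10min.reshape(N_BLOCKS, STEPS_PER_3H)
speed_3h_avg = blocks.mean(axis=1)   # three-hourly averages
speed_3h_inst = blocks[:, 0]         # three-hourly instantaneous values

# Compare each lower-resolution sample against the 10 min reference.
res_avg = stats.ks_2samp(speed_3h_avg, speed_10min)
res_inst = stats.ks_2samp(speed_3h_inst, speed_10min)

for label, res in [("3h averages", res_avg), ("3h instantaneous", res_inst)]:
    verdict = "significantly different" if res.pvalue <= ALPHA else "not distinguishable"
    print(f"{label}: D={res.statistic:.3f}, p={res.pvalue:.3g} -> {verdict}")
```

The test statistic D is the largest vertical distance between the two empirical distribution functions, so a sample whose spread has been reduced by averaging yields a much larger D than a thinned sample drawn from the same distribution.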

Table A.1: Static data of the four longer-term 10 m meteorological measurement sites in Germany. None of the stations was moved during the time period evaluated.
Table A.2: p-values of Kolmogorov-Smirnov test for averaged Kelmarsh wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.

Table A.3: p-values of Kolmogorov-Smirnov test for instantaneous Kelmarsh wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.

Table A.4: p-values of Kolmogorov-Smirnov test for averaged NWTC wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.
Table A.6: p-values of Kolmogorov-Smirnov test for averaged Owez wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.

Table A.7: p-values of Kolmogorov-Smirnov test for instantaneous Owez wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.

Table A.8: p-values of Kolmogorov-Smirnov test for averaged Aachen wind speed distributions. All values where p > 0.05 are marked bold and the corresponding distributions are not considered to be significantly different.