Spatial optimality and temporal variability in Australia’s wind resource

To meet electricity demand using renewable energy supply, wind farm locations should be chosen to minimise variability in output, especially at night when solar photovoltaics cannot be relied upon. Wind farm location must balance grid proximity, resource potential, and wind correlation between farms. A top-down planning approach for farm locations can mitigate demand unmet by wind supply, yet the present Australian wind energy market has bottom-up short-term planning. Here we show a computationally tractable method for optimising farm locations to maximise total supply. We find that Australia’s currently operational and planned wind farms produce less power with more variability than a hypothetical optimal set of farms with equivalent capacity within 100 km of the Australian Energy Market Operator grid. Despite its superior output, this hypothetical set is still subject to variability due to large-scale weather correlated with climate modes (e.g. El Niño). We study multiple scenarios and highlight several internationally transferable planning implications.


Introduction
Energy supply is changing from non-renewable to renewable sources globally [1]. In 2021, 29% of Australia's energy supply came from renewable sources, up from 8% in 2001 [2]. In a scenario where supply is entirely renewable, meeting demand with these sources requires new planning approaches [3-6]. The current power grid reflects population density and the sites of large stores of on-demand non-renewable supply [2,7], and owing to Australia's climate and geography, non-dispatchable renewable sources (solar photovoltaics and wind) are more feasible over most of the nation than dispatchable stored renewable sources (e.g. bioenergy and hydro-schemes) [2,8-11]. To minimise energy storage costs while meeting demand, a blend of daytime solar and night-time wind power will be necessary [12,13]. Ideally, minimum wind supply at night should meet baseload demand [1,3-5].
Each individual wind farm experiences a timeseries of local weather and climate, and alone cannot always meet demand. Choosing locations with minimal correlation in wind power potential between them will create more consistent aggregated output [21-27]; however, these locations may not necessarily have strong winds, and for many farms this choice quickly becomes intractable. For example, there are around as many combinations of placing 97 farms (the number currently operating and connected to the Australian Energy Market Operator (AEMO) grid) in 300 prospective locations as there are atoms in the Universe ($^{300}C_{97} \approx 10^{80}$). This problem must also consider myriad other factors, including the proximity of prospective farm locations to the energy grid to minimise transmission costs and energy losses [7,9,16,17,19,20].
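The size of this search space can be checked directly. The short sketch below evaluates the binomial coefficient exactly using Python's standard library; the variable names are illustrative only, and the two counts are the figures quoted in the text above.

```python
import math

n_sites = 300  # prospective locations (figure quoted in the text)
n_farms = 97   # operational farms on the AEMO grid (figure quoted in the text)

# Exact number of ways to choose n_farms distinct sites out of n_sites.
n_sets = math.comb(n_sites, n_farms)

# Order of magnitude = number of decimal digits minus one.
order_of_magnitude = len(str(n_sets)) - 1
print(f"C({n_sites}, {n_farms}) is of order 10^{order_of_magnitude}")
```

Exhaustively scoring every candidate set is clearly out of reach at this scale, which is the motivation for a clustering-based selection.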
The existing set of 97 operational wind farms in Australia connected to the power grid, which has a total capacity of 10.3 GW, was not chosen with a top-down optimality approach; rather, each location was chosen with the goal of maximising return on investment for that particular farm. In addition to the existing farms, as of 2022 there are 150 farms in some planning stage, totalling 113.2 GW capacity. The capacity of these farms (figure S1), like the population and power grid, is overwhelmingly concentrated in the southeast of the country (figure 1(d)). The locations of these farms are chosen within a legal framework and are approved by energy market operators, but are largely selected based on sub-decadal assessments of local wind resource potential, land ownership and the interests of the farm's stakeholders [14-17,19]. This process does not necessarily create optimal wind power supply for the grid and, given the ∼20-30 year life cycle of farms [28], can create lasting inefficiencies.
Regardless of the spatial configuration of a set of wind farms within a region, there is unmitigable variation in the wind itself [22,29-31]. At the sub-farm scale, variability can occur due to local topography [32-34] or extreme winds posing a hazard to infrastructure [1,13,33,35], while at the inter-farm scale large weather or climate modes can create significant changes in wind supply which persist for days to many years [29,36-41]. For all locations, wind speeds in the atmospheric boundary layer have approximately Weibull distributions (a peak with a long right tail) and often a diurnal cycle [3,31]. This last point is particularly relevant for night-time wind supply, since there is an increased advantage in higher turbine hub heights to access the nocturnal jet over the slow, stably stratified near-surface flow [42-45]. In general, the wind speed at hub height can be mapped onto a capacity factor $c_f$ (i.e. the proportion of the turbine's rated capacity that is produced as power) using a 'power curve' [1,7,20]. Typical power curves rise after a 'cut-in' wind speed with a wind speed cubed relationship before plateauing at the turbine's maximum capacity, and are zero after a 'cut-out' speed (figures 1(a) and S2; Methods).
In this study we compare supply from Australia's operational and planned wind farms to a hypothetical set of wind farms chosen to optimise night-time supply. We employ 43 years (1979-2021) of wind data at 100 m height above ground level from the ECMWF Reanalysis 5th Generation (ERA5) at 0.25° spatial and 1-hour temporal resolution [46] to conduct the comparison and assess the viability of Australia's wind resource (figure 1(b); Methods). In each distinct area of interest (within 100 km of the AEMO grid, figure 1(e); a hypothetically linked AEMO and southern Western Australia grid, figure 1(f); or a 'copper plate' grid anywhere in Australia, figure 1(g)), we use a correlation clustering algorithm to select a hypothetical set of farms with a total capacity equal to that of the operational and planned wind farms. Each of the farms in these hypothetical sets has an individual capacity of 500 MW, such that the total number of farms matches the operational and planned set (N = 247; 123.5 GW total capacity). Our analysis has two central results: (1) existing methods of optimal set selection can miss more cost-effective and powerful sets, and (2) there exists a significant unmitigable variability in supply which is attributable to larger-scale climate variability.

Set selection
We consider each point in the ERA5 reanalysis grid as a potential distinct location for a wind farm and produce a cross-correlation matrix of the daily night-time average capacity factor timeseries over the 1979-2021 period for the points. We then organise these points into clusters, where clusters are chosen such that points are more similar to other points within them than to points in other clusters (using a farthest point algorithm; Methods). The measure of dissimilarity between sites is the $L_2$-norm (i.e. Euclidean distance) in the correlation space with dimensions equal to the number of points. This clustering method is hierarchical and nested: to split a domain into N + 1 clusters, the cluster with points least correlated within itself in a set of N clusters is split in two (as reflected in a dendrogram). In this study, we choose a hypothetical set as each site with the maximum average night-time capacity factor within 247 clusters, equal to the number of hypothetical farms.
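The final selection step, taking the strongest site in each cluster, can be sketched as follows. This is a minimal illustration assuming cluster labels are already available; the function name `pick_sites` and the five-site example values are hypothetical and not taken from the study.

```python
import numpy as np

def pick_sites(mean_night_cf, labels):
    """For each cluster label, return the index of the member site with the
    highest long-term mean night-time capacity factor."""
    mean_night_cf = np.asarray(mean_night_cf, dtype=float)
    labels = np.asarray(labels)
    chosen = {}
    for lab in np.unique(labels):
        members = np.flatnonzero(labels == lab)  # site indices in this cluster
        best = members[np.argmax(mean_night_cf[members])]
        chosen[int(lab)] = int(best)
    return chosen

# Hypothetical example: five candidate sites grouped into two clusters.
example_cf = [0.31, 0.42, 0.28, 0.39, 0.35]
example_labels = [1, 1, 2, 2, 2]
example_choice = pick_sites(example_cf, example_labels)
print(example_choice)
```

One farm is placed per cluster, so the number of clusters requested from the hierarchy fixes the number of farms in the set.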
This clustering method, novel to wind farm optimisation but common in other sciences such as genetics and image analysis, reveals the primary issue facing Australia's operational wind supply: most of the currently operational capacity lies within a correlated region extending across Tasmania and the southeast coast of mainland Australia [7]. To illustrate this, in figure 2(b) we plot the 'copper plate' (CP) scenario split into 15 correlated clusters; night-time wind power in the pink region is more correlated with itself than anywhere else. In figure 2(c), we show the nested regions within the pink region, each of which contains one of the 247 farms in the CP hypothetical set. Furthermore, the AEMO grid covers just 7 of the 15 regions in figure 2(b).
The method also reveals the information lost by the most common alternative technique for optimal set selection: the mean pair-wise correlation-distance function [7,34,39,47] (figure 2(a)). Often a length-scale characterising the decay in the average pair-wise correlation-distance function is defined as the minimum distance between farms [7,34,39,47], yet this disguises key results of the clustering analysis: correlated regions do not have to be the same shape or size, or even contiguous (figure 2(b)). In figure 2(a) we plot the average correlation-distance functions for each scenario over the total CP probability density function, showing that excluding farms within some decay length-scale of one another would omit many potential pairs which are anti-correlated.
Hypothetical sets of equivalent total capacity have the same or higher average supply (intercept with grey line in figure 3(a)) than the currently operational and planned set (O+P; yellow dot in figure 3(a)). For sets selected with the clustering strategy, this leads to an optimum set size (around ten farms) for variability relative to power (figure 3(b)), where the gain from minimising the set correlation is weakened least by including farms more variable relative to their power (figures 3(a) and (c)). We find that all hypothetical sets outperform their operational plus planned set counterpart with equal capacity (figures 3(a)-(c)). In figure S3 and table S1 we extend this analysis to compare hypothetical sets of varying farm capacities to just the existing operational farms, finding similar results.
In figures 3(d) and (e) we show the performance of the sets statistically. The probability distributions of daily night-time average set power are narrower and peak higher for the most spatially dispersed sets (figure 3(d)). These highly asymmetric probability density functions (PDFs) show the modal night-time supply is below the average night-time supply. Another way of assessing the skewed power output of these sets is by calculating the duration of 'wind droughts' (i.e. periods where the set does not supply power above some threshold) in the 1979-2021 observation period [21,38]. Figure 3(e) shows the maximum contiguous duration (in hour increments, due to the timestep of the ERA5 reanalysis) for which the total wind supply from a set never exceeds some threshold amount over the 43-year observation period. The similarity of these curves for different sets for long drought lengths hints at the role of unmitigable climate variability in times of poor wind resource potential. For example, in all cases, total wind supply from the set never exceeds 44% of its capacity in any hour for an entire month in 1979-2021 (March of 1982).
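The drought metric described above amounts to finding the longest run of consecutive hours at or below a threshold. A minimal sketch follows, using a short synthetic hourly series rather than real set output; the function name is hypothetical.

```python
def longest_drought_hours(power, threshold):
    """Longest run of consecutive hourly samples in which power never
    exceeds the threshold (i.e. power <= threshold throughout)."""
    longest = 0
    current = 0
    for p in power:
        if p <= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # supply exceeded the threshold, so the drought ends
    return longest

# Synthetic hourly set power in GW (illustrative values only).
hourly_power_gw = [5.0, 2.0, 1.5, 3.0, 7.0, 2.5, 2.0, 9.0]
drought = longest_drought_hours(hourly_power_gw, threshold=3.0)
print(drought, "hours")
```

Sweeping the threshold from zero to the total set capacity and recording the result at each step would trace out a curve of the kind shown in figure 3(e).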

Climate variability
To investigate this temporal variability, we look at annual-smoothed timeseries of change around the mean set output (figure 4(a)). Despite hypothetical sets being chosen to minimise variability in total output by selecting uncorrelated farms, there are still >10% changes in supply between consecutive years. These are generally smaller than for the operational and planned set of equivalent capacity, but at times are clearly driven by the same variability in the climate. The El Niño Southern Oscillation climate mode has a large influence over interannual variability in many aspects of Australian climate and, indeed, energy demand and supply [2,18,36,37,41]. We plot a correlation map between the annual averages of capacity factor and the Niño3.4 index, which quantifies the El Niño Southern Oscillation, finding that La Niña generally produces stronger wind supply in Australia [39], while the spatial signature shows the strongest absolute correlations exist nearer the coast (figure 4(c)).
The above analysis shows large year-to-year variability in power output due to intrinsic variability in the winds. This means that planning wind farm locations using short records of a few years can be very misleading as to what capacity is needed. In addition, the identification of minimally correlated regions likely also depends on the amount of data used; in this study we have used 43 years of data, near the typical life of a wind farm. We investigate this by repeating the clustering described above for individual years. We show the total set power for the copper plate hypothetical set if just a single year of data were used to find the set configuration, as a percentage of the power obtained when all 43 years of data are used (figure 4(b)). Typically around 1% of power is lost, rising to as much as 4% in some years.
At the farm level, we plot a map of the standard deviation in annual average capacity factor normalised by the average capacity factor, with the extent of area within 100 km of the AEMO grid overlaid (figure 4(d)). We find that the highest interannual variability at a site in Australia is 16% of its mean, which is half the spatial variation in average capacity factors across Australia (32% of the average).

Discussion
There are a few important caveats to this study. Foremost, the ERA5 reanalysis is not a perfect representation of past weather [30, 44-46, 48, 49], and the spatial resolution does not allow access to fine-scale topographic effects which are important for farm site selection [32-34]. We also chose equal-capacity farms in the hypothetical sets (a constraint which could be relaxed and optimised) and did not account for losses due to scale-dependent collective turbine wakes [1,50-52]. While we considered proximity to the power grid, we did not incorporate any costs or penalties for supply as a function of grid proximity which would be relevant in a Levelised Cost of Energy analysis [7,19], or land-use and government planning restrictions. These assumptions, however, do not preclude the main findings we present, which are principally a proof-of-concept of a generic approach to optimal site selection and assessing the role of large-scale variability in climate on wind supply. These findings are readily applicable to other energy markets, and to climate datasets such as regional (e.g. BARRA [48]) or higher-resolution (e.g. ERA5-Land [49]) reanalyses.
This study raises some important points for national energy market planning. Increasing the spatial coverage of power grids is essential for a cost-effective transition to renewables [2,3,10,18,21,23]. We find that hypothetical sets in the scenario where farms are within 100 km of a connected AEMO and southern WA power grid produce on average 12% more power than sets within 100 km of the AEMO grid alone with equivalent total capacity (figure 3(a)). We also find that despite the progress in wind supply, which accounted for 13% of total energy supply on the AEMO grid in 2021, optimally selected farms within 100 km of the AEMO grid could produce the same as the operational and planned farm supply with a 13% increase in consistency using equivalent capacity (figures 3(a) and (b)), and farms of reasonable size within 100 km of the AEMO grid could produce 35% more supply than the existing operational farms with a 39% increase in consistency using equivalent capacity (figure S3a, table S1). This is somewhat attributable to top-down strategy; however, we also showed how sites assessed with one year of wind observations may underperform over a farm life cycle by up to 16% due to interannual variability in local wind climate (figure 4(d)). Wind farm stakeholders take on significant risk with short site assessments.
We also found that achieving average night-time power consistently above baseload (∼25 GW for AEMO) with wind supply is not feasible, even with strategically selected farms, owing to large-scale coherent features in wind climate (figure 3(c)). Including all planned additional capacity, however, does reduce these droughts to only a few days (figure 3(e)), a dramatic improvement over the presently operational set, for which a period where supply did not exceed 6 GW at any hour lasted for over 3 months (figure 4(c)). This lack of on-demand power, especially at night, highlights the need for stored energy given our present demand [3,4,8]. Finally, we find that there is little performance lost in hypothetical sets when selecting sites iteratively rather than as a collective, since the primary control on performance is power grid extent (figures 3(a)-(c)). Interannual variability in wind resource is significant and spatially correlated (figure 4(d)) and therefore, even for a perfectly designed set of wind farm locations, exerts a strong control on wind supply (figure 4(a)) [41].
Future work can take this study in several directions. Australia-specific policy choices or site assessment could be informed by a tailored version of this analysis with narrower scope. Detailed assessments of the advantages of the nocturnal and off-shore boundary layers for wind resources [36,42,44], which extend beyond the single-turbine scale and contextualise wind supply with demand and solar, may help inform future resource allocation. Our preliminary findings on diurnal variability suggest offshore sites tend to provide higher power but do not have relatively strong night-time power (to be expected with the higher ocean heat capacity), and that power is slightly more variable at night offshore (figure S4). More broadly, a framework, perhaps borrowed from other climate studies, for attribution of wind droughts to predictable climate states could help provide lead time for ramping up alternative supply sources [30,36-40]. We also believe that the clustering analysis outlined here could be a useful tool in studies of other aspects of climate. Finally, the climate system is changing on a timescale well within the life cycle of wind farms with unclear effects on wind; to what extent this influences the renewable energy transition must be assessed [8,53].

Power curve
We convert wind speed $U$ (m s$^{-1}$) data into capacity factor $c_f$ using the grey curve in figure 1(a). At and above the cut-out speed $U_1 = 25$ m s$^{-1}$ there is no power generated (for turbine safety): $c_f(U \geqslant U_1) = 0$. For winds below the cut-out speed, $U < U_1$, the power ramps up like the cube of the wind speed, $c_f = f(U) = A U^3$, then saturates at $c_f = g(U) = 1$. The factor $A$ is found using typical parameters for turbines: $A = C_B \rho_f \pi L^2 e / (2C)$, where $C_B = 16/27$ is the Betz constant setting the theoretical maximum power extraction, $\rho_f = 1.2$ kg m$^{-3}$ is the air density, $L = 150/2$ m is the blade length, $e = 0.65$ is a typical turbine efficiency, and $C = 5$ MW is a typical modern turbine rated capacity [7]. We blend the cubic $f$ and saturated $g$ behaviour with a spline function, where $\beta = 5$ sets the sharpness of the transition between the functions $f$ and $g$. All capacity factors computed from reanalysis wind data are then scaled by farm capacities when required to produce power values. A comparison of this curve to real turbine capacity factor curves is given in figure S2.
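A minimal sketch of such a power curve is given below, using the parameter values stated above. The exact blending function is not reproduced in this text, so the smooth-minimum form used here, $(f^{-\beta} + g^{-\beta})^{-1/\beta}$ with $g = 1$, is an assumption chosen only because it follows the cubic at low speeds and saturates at 1; it should not be read as the study's actual spline. No cut-in speed is modelled, and all names are illustrative.

```python
import math

C_B = 16 / 27      # Betz constant
RHO_F = 1.2        # air density (kg m^-3)
BLADE = 150 / 2    # blade length L (m)
EFF = 0.65         # typical turbine efficiency e
RATED = 5e6        # rated capacity C (W), i.e. 5 MW
U_CUT_OUT = 25.0   # cut-out speed U_1 (m s^-1)
BETA = 5           # sharpness of the cubic-to-rated transition

# Cubic prefactor A = C_B * rho_f * pi * L^2 * e / (2 C), in (m s^-1)^-3.
A = C_B * RHO_F * math.pi * BLADE**2 * EFF / (2 * RATED)

def capacity_factor(u, beta=BETA):
    """Capacity factor for a non-negative hub-height wind speed u (m s^-1)."""
    if u >= U_CUT_OUT:
        return 0.0                 # shut down at and above cut-out
    f = A * u**3                   # cubic end-member
    # ASSUMED blend (not from the source): smooth minimum of f and g = 1,
    # written as f / (1 + f^beta)^(1/beta) to avoid dividing by zero at u = 0.
    return f / (1.0 + f**beta) ** (1.0 / beta)

for speed in (0.0, 5.0, 10.0, 15.0, 24.9, 25.0):
    print(f"U = {speed:5.1f} m/s -> c_f = {capacity_factor(speed):.3f}")
```

With these parameters the cubic end-member alone reaches $c_f = 1$ near 10.7 m s$^{-1}$, which sets where the transition to rated power occurs.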

ERA5 reanalysis
We used 0.25° resolution ERA5 reanalysis data [46] for hourly instantaneous wind speeds at 100 m for the period 1979-2021 (inclusive) to obtain capacity factors using the power curve explained in the previous Methods section. We further used the hourly average surface downward short-wave radiation flux to mask out periods where co-located solar supply would not produce power, and classified those as night-time with a threshold flux of 10 W m$^{-2}$. For each day (UTC), the daily average capacity factors and night-time-only average capacity factors were analysed. Spatially, ERA5 data was masked according to the model's native land-sea mask, cropped specifically to Australia. Offshore regions were included in the analysis by extending this land-sea mask 2 grid-points (i.e. approximately 30 km) beyond the land edge. The mask is then adjusted to the scenario based on a 100 km range to the closest line on the power grid. This method is implemented with Xarray [54], NumPy [55] and Shapely [56].
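The night-time averaging step can be sketched with NumPy as below. This is an illustration on synthetic data for a single grid point, assuming hourly arrays that start at 00:00 UTC and span whole days; the function name and the example values are hypothetical.

```python
import numpy as np

NIGHT_FLUX = 10.0  # W m^-2; hours with flux below this count as night-time

def nightly_mean_cf(cf_hourly, flux_hourly):
    """Daily mean capacity factor over night-time hours only.
    Inputs are 1-D hourly arrays covering whole UTC days."""
    cf = np.asarray(cf_hourly, dtype=float).reshape(-1, 24)
    flux = np.asarray(flux_hourly, dtype=float).reshape(-1, 24)
    night = flux < NIGHT_FLUX
    counts = night.sum(axis=1)                   # night-time hours per day
    sums = np.where(night, cf, 0.0).sum(axis=1)  # night-time capacity factor total
    # Days with no night-time hours are returned as NaN.
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

# Two synthetic days: dark before 06:00 and from 18:00, bright in between.
hours = np.arange(48) % 24
flux = np.where((hours < 6) | (hours >= 18), 0.0, 500.0)
cf = np.concatenate([np.where(flux[:24] < NIGHT_FLUX, 0.2, 0.6), np.full(24, 0.4)])
night_means = nightly_mean_cf(cf, flux)
print(night_means)
```

In a gridded workflow the same mask would be applied along the time axis for every grid point at once.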

Operational and planned farms
Operational and planned farm data was collected as explained in the data availability statement and is given in table S2. These farms have known total capacities, but each farm's turbine-weighted power curve and hub height are not known. We assumed that their performance was equivalent to the hypothetical farms, such that we used the same 100 m winds and power curve to obtain a capacity factor, then multiplied this by the total farm capacity. These capacity factors were obtained using the data from the ERA5 grid point closest to the farm location. Some farms lie closest to the same ERA5 grid point but are treated independently.
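Matching a farm to its closest grid point can be sketched as a simple rounding operation. The example assumes grid nodes sit at integer multiples of 0.25° in latitude and longitude, which should be checked against the dataset actually used; the farm coordinates are hypothetical.

```python
def nearest_grid_point(lat, lon, res=0.25):
    """Snap a location to the nearest node of a regular lat-lon grid whose
    nodes are ASSUMED to lie at integer multiples of `res` degrees."""
    return round(lat / res) * res, round(lon / res) * res

# Two hypothetical farms a few kilometres apart.
farm_a = nearest_grid_point(-37.61, 143.87)
farm_b = nearest_grid_point(-37.55, 143.80)
print(farm_a, farm_b)
```

Both hypothetical farms map to the same node, so they would share a capacity factor timeseries while keeping their own capacities, as described above.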

Correlation clustering
Correlation values were found between each pair of timeseries for daily-average night-time capacity factors for 1979-2021. This correlation matrix, with size $(X, X)$ where $X$ is the number of grid points, gives each location a coordinate in a space spanned by $X$ dimensions. The distance between points $m$ and $n$ in this space is defined as $d_{m,n} = \sqrt{\sum_{i=1}^{X} (m_i - n_i)^2}$, i.e. the Euclidean distance or $L_2$-norm. Locations can then be clustered by looking at their distances from each other in this space. We use a farthest point algorithm (referred to as 'complete' in figure S5) to cluster points, which merges the pair of clusters A and B with the smallest distance $D$, where $D$ is the largest distance between any point in A and any point in B: $D = \max\{d_{m,n}\}$, where A contains points $m$ and B contains points $n$. In this study clusters are formed by iterative merging in a hierarchical fashion up to a prescribed number of clusters. The spatial sites are then labelled by the cluster they belong to and a cluster map can be produced. A comparison of the clustering algorithm we use to alternatives is given in figure S5, and example truncated dendrograms for clusters formed in each scenario are given in figure S6. This method is implemented with SciPy [57].
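The procedure above can be sketched with SciPy's hierarchical clustering routines, where the `"complete"` linkage method is the farthest point algorithm. The data here are synthetic (three independent "weather regimes", each driving four sites), and the array names and sizes are illustrative assumptions, not the study's configuration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
n_days, n_sites, n_clusters = 400, 12, 3

# Synthetic night-time capacity factor anomalies: each site follows one of
# three independent regimes plus a little site-specific noise.
regimes = rng.normal(size=(n_days, n_clusters))
driver = np.repeat(np.arange(n_clusters), n_sites // n_clusters)
series = regimes[:, driver] + 0.3 * rng.normal(size=(n_days, n_sites))

# (X, X) correlation matrix; row m is site m's coordinate in correlation space.
corr = np.corrcoef(series, rowvar=False)

# Farthest-point ('complete') linkage on Euclidean distances between rows.
tree = linkage(corr, method="complete", metric="euclidean")

# Cut the hierarchy at the prescribed number of clusters.
labels = fcluster(tree, t=n_clusters, criterion="maxclust")
print(labels)
```

Because the hierarchy is built once, different numbers of clusters can be obtained by cutting the same tree at different levels, which gives the nested structure described in the main text.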

Figure 1. Scenarios and data for Australia's wind resources. (a) Power curve (grey) given in capacity factor $c_f$ for farms used in this study, with dashed lines showing the Betz limit (blue), cubic end-member (orange), rated power end-member (green), cut-out wind speed (red) and cut-in wind speed (purple). (b) Example ERA5 hourly timeseries of wind power P (MW) and surface downward short-wave radiation flux FS (MW m$^{-2}$) for the largest-capacity operational wind farm, with night-time (FS < 10 W m$^{-2}$) regions shaded (navy). (c) Full ERA5 1979-2021 timeseries of average daily P (MW) for the case in (b) (grey) with annual-smoothing overlaid (black). Magenta region shows location of (b) timeseries. Maps of scenarios analysed in this study: (d) operational (black; O) and planned (yellow; P) farms (O+P), (e) area (pink) within 100 km of the AEMO (magenta) grid (A), (f) area (pink) within 100 km of hypothetically linked (by red line) AEMO and southern WA (magenta) grids (A+W), and (g) a 'copper plate' (pink) area (CP). Maps in (e)-(g) are blue where farms are not allowed, and lighter shading indicates off-shore locations within 30 km of land.

Figure 2. Correlation of night wind power. (a) The probability density function (colour-map) of pairwise correlation $r_{i,j}$ and distance $d$ (km) for Australia's average night wind power overlaid by the distance-binned average lines for each scenario given in legend. Zero correlation is marked (grey dashed line). (b) Fifteen regions clustered by night wind power correlation overlaid by the AEMO grid (grey lines) and operational farms (black dots), along with the dendrogram showing the hierarchy of these clusters. (c) Eighteen regions within the pale green region in (b) which are members of the 247 clusters in the CP scenario.

Figure 3. Total set performance. For increasing number of farms (i.e. clusters), the 1979-2021 set: (a) average night-time capacity factor $\langle c_{f,n} \rangle$; (b) normalised night-time capacity factor standard deviation $\sigma_{c_{f,n}} / \langle c_{f,n} \rangle$; (c) average pair-wise correlation $\langle r_{i,j} \rangle$ in the set. Legend in (c) gives the scenario corresponding to each line, yellow marker shows the planned and operational set, grey line shows the studied sets of 247 farms. (d) Probability density functions of average night set power produced by equivalent-capacity sets. (e) Maximum 'drought' duration (days) within the 1979-2021 period where the sets in (d) do not produce power above some 'drought' threshold (GW). Grey dashed lines indicate a day, month and year, and black dashed lines indicate ∼25 GW AEMO baseload and total set capacity.

Figure 4. Interannual variability in set power. (a) The annual-smoothed percentage change from mean power in the equivalent-capacity sets; legend gives the scenario corresponding to each line. Dashed grey line indicates mean power. (b) For sets in the copper plate scenario, each bar indicates the percentage of 1979-2021 set output found by choosing farms employing just that single year's data compared with the full 1979-2021 timeseries of data. (c) Correlation map of Niño3.4 index with annual-average capacity factor, colour-bar below. (d) Map of standard deviation of annual-average capacity factor $\sigma_{c_f}$ normalised by average capacity factor $c_f$, colour-bar below, overlaid by the extent of the region within 100 km of the AEMO grid (pale blue line).