Standalone, Descriptive, and Predictive Digital Twin of an Onshore Wind Farm in Complex Terrain

In this work, a digital twin with standalone, descriptive, and predictive capabilities is created for an existing onshore wind farm located in complex terrain. A standalone digital twin is implemented with a virtual-reality-enabled 3D interface using openly available data on the turbines and their environment. Real SCADA data from the wind farm are used to elevate the digital twin to the descriptive level. The data are complemented with weather forecasts from a microscale model nested into Scandinavian meteorological forecasts, and wind resources are visualized inside the human-machine interface. Finally, the weather data are used to predict the hourly power production of each turbine and of the whole wind farm with a 61-hour forecasting horizon. The digital twin provides a data platform and interface for power predictions with a visual explanation of the prediction, and it serves as a basis for future work on digital twins.


Introduction
The importance of wind energy production efficiency cannot be overstated in the context of combating climate change and achieving a net-zero emissions target by 2050 [1]. With the proliferation of cheaper sensors and the growing trend of the Internet of Things, the potential for extracting data from wind farms has increased significantly. However, it is not sufficient to store collected information in data silos. Instead, real-time data analysis and visualization can be leveraged to enable optimal control and informed decision-making and to unlock the full potential of the data.
The concept of the digital twin has emerged as a promising solution to address these challenges. A digital twin utilizes available data in real-time to monitor the current state of an asset and its environment, predict future states, detect faults, perform what-if scenario analysis, provide decision support, and ultimately enable autonomous control of the asset [2]. The use of a suitable human-machine interface enhances the interpretation of analysis results and allows for effective communication with stakeholders.
A survey conducted with industry partners of the Norwegian Research Centre on Wind Energy "FME NorthWind" indicates that the wind industry is keenly interested in utilizing digital twins to reduce the cost of wind energy [3]. However, several challenges must be addressed before the full potential can be unlocked in wind energy applications. These challenges relate to both the implementation and acceptance of digital twins within the industry [3]. Overcoming these challenges will be critical to advancing the development and adoption of digital twins in the wind energy sector.
To this end, the current work attempts to realize the following:
• Introducing readers to the concept of digital twins within the context of wind energy applications and providing a scale to rank digital twins based on their capabilities.
• Demonstrating a digital twin of an onshore wind farm with standalone, descriptive, and predictive capabilities. This will provide a practical illustration of the potential benefits of digital twins in wind energy applications, as well as offer insights into the challenges of developing such models.
• Discussing the potential for further research on digital twins for wind farm applications.
By highlighting areas where additional research is needed, we hope to catalyze progress in this field and drive innovation in the wind energy sector.
The article is structured as follows: First, the definition of the term digital twin used in this work is clarified in section 2. The capability level scales are explained briefly. In section 3, the implementation of the standalone digital twin is given with a focus on terrain and visual interface. The onshore Bessakerfjellet wind farm is used as a demonstration site. It is operated by Aneo and is located at (64°13' N, 10°23' E) on the Norwegian coastline. Section 4 explains the integration and visualization of data measured at the turbines. Predictive capabilities are added in section 5 by implementing weather forecasts and performing predictions of the wind turbines' power production. The work is discussed in section 6 and an outlook into future work is given. Finally, the work is summarized in section 7.

Definition and Capability Levels
The term digital twin is used for different concepts. Here, the digital twin is "a virtual representation of a physical asset or a process enabled through data and simulators for real-time prediction, optimization, monitoring, control, and informed decision making" [2]. The concept of a digital twin with all capabilities is shown in figure 1. Since this definition still leaves some room for interpretation, we use the capability level scale from [5] to specify the digital twin's exact capabilities. As such, a digital twin can be ranked on a scale from 0 to 5 as a standalone, descriptive, diagnostic, predictive, prescriptive, or autonomous digital twin as shown in figure 2. Here, a brief overview of the capability level scale is given. A more detailed description can be found in [6] and [3].
• The standalone digital twin is a virtual representation of a wind farm that lacks a real-time connection to the physical wind farm. It can be utilized in the design, planning, and construction stages before the wind farm is operational.
• In the descriptive digital twin, measurements from the wind farm are streamed into the digital twin. The descriptive digital twin mirrors the state of the real wind farm at each point in time and provides a platform on which data can be bundled, enhanced (e.g. through virtual sensing), processed, and visualized to the human operators and other stakeholders.
• The diagnostic digital twin uses the data gathered in the descriptive digital twin as input for analysis such as condition-based maintenance. The condition of components is tracked through e.g. vibration and temperature measurements, and anomalies are diagnosed. This way, minor deficits can be detected early and resolved before they result in major faults like turbine damage and unexpected downtime.
• A predictive digital twin does not only use current and historical data but also forecasts parameters to predict future asset states. The predictive capabilities can be used for predictive maintenance or through power forecasts for the energy market.
• In a prescriptive digital twin, recommendations are provided through what-if scenario analysis and risk assessment. Such prescriptions can include a balancing of component wear against power production based on current electricity prices and demand, or optimal maintenance scheduling based on component wear, estimated remaining useful lifetime, data anomalies, and weather forecasts.
• The autonomous digital twin acts on the prescriptions on its own. Autonomous digital twin capabilities can range from farm-wide wake steering and component wear balancing over inspection through the usage of autonomous drones to automated operation and maintenance of the wind farm.

Standalone Digital Twin
In this work, a standalone digital twin of the onshore wind farm has been implemented following a similar approach as is explained in [6] for a floating offshore wind turbine, including a user interface using virtual reality. In contrast to the single-turbine implementation in [6], a whole wind farm is implemented here. Additionally, the local terrain around the wind farm is included.

Terrain
As an input to the terrain generation, height maps of the local terrain are downloaded from [7] on a 1 m × 1 m resolution grid. Since the wind farm is located at a shoreline and the LIDAR-based height measurements cannot penetrate the water surface, the height maps are complemented with information on ocean depth contour lines available from [8]. All information on onshore and offshore terrain height is combined in a single terrain map. The height is then binned into int16 and the map is split into equal chunks to improve computational efficiency during rendering. Next, aerial images are downloaded from [9] with a 1 m × 1 m resolution. The images are combined and split into chunks matching the chunks of the terrain height. Terrain height and texture are then imported into the Unity game engine, where they are combined. As evident from figure 3, a top-down view of the 3D terrain inside the game engine (center) can only be distinguished from aerial images from [9] (surrounding) by its improved resolution, 3D terrain, animated water, and dynamic lighting. Note that the terrain is not implemented for visual realism alone. It also contains information on logistical access through roads and nearby villages, and information on terrain height, water bodies, and forestation relevant for understanding wind flow.
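
The height binning and chunking steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the linear mapping of the height range onto the int16 range, and the chunk size are all hypothetical choices.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def quantize_heights(heights, h_min, h_max):
    """Map heights in metres linearly onto the int16 range (assumed scheme)."""
    scale = (INT16_MAX - INT16_MIN) / (h_max - h_min)
    quantized = []
    for row in heights:
        q_row = []
        for h in row:
            q = int(round((h - h_min) * scale)) + INT16_MIN
            # Clamp to guard against rounding at the range limits.
            q_row.append(min(max(q, INT16_MIN), INT16_MAX))
        quantized.append(q_row)
    return quantized


def split_into_chunks(grid, chunk_size):
    """Split a 2D grid into equal square chunks keyed by (chunk_row, chunk_col)."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % chunk_size or n_cols % chunk_size:
        raise ValueError("grid dimensions must be multiples of chunk_size")
    chunks = {}
    for r in range(0, n_rows, chunk_size):
        for c in range(0, n_cols, chunk_size):
            key = (r // chunk_size, c // chunk_size)
            chunks[key] = [row[c:c + chunk_size] for row in grid[r:r + chunk_size]]
    return chunks


# Toy 4 x 4 height map in metres (negative values stand for ocean depth).
toy_map = [[-20.0, -5.0, 3.0, 40.0],
           [-10.0, 0.0, 25.0, 80.0],
           [0.0, 12.0, 60.0, 150.0],
           [5.0, 30.0, 110.0, 300.0]]
toy_chunks = split_into_chunks(quantize_heights(toy_map, -20.0, 300.0), 2)
```

Splitting the aerial images with the same `split_into_chunks` call and the same chunk size would keep texture chunks aligned with the height chunks.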

Turbines
Since no CAD model of the turbines was available at the start of the project, a model was created in Blender. Tower height and rotor diameter are based on data sheets, while the nacelle and blades are based on pictures of the Enercon E70-4. The 3D CAD model of the turbine is shown in figure 4. The horizontal position of each turbine is known, while the vertical position is inferred from the terrain height.

Descriptive Digital Twin
The digital twin is enhanced with descriptive capabilities by including SCADA data from each wind turbine. At this stage, the digital twin mirrors the state of the physical wind turbine.
Only minor changes were made to the implementation in [6]. Namely, the data structure gained an additional hierarchical level to advance from turbine to farm-level and the interactable components of the turbine were adjusted to the new turbine type. Additionally, the data input format changed, which required rebuilding the data reading module. Finally, two visualization methods were added to depict the current power production, as it cannot be directly seen on the turbine models.

Data
The available data consist of wind speed, wind direction, nacelle direction, and active power from each turbine. The measurement intervals vary from 3 to 10 minutes. Since the data are non-equidistant in time, an updating function constantly checks for new measurements. For real-time operation, the persistence method is used to bridge time spans without measurements, i.e. the last measured value is held until a new one arrives.
If the digital twin is instead used to inspect historic data, the values are interpolated between measurements. The feasibility of real-time data streaming is demonstrated as explained in section 5.
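
The two gap-handling strategies above can be illustrated with a short sketch. The function names are hypothetical, and linear interpolation is an assumption, since the interpolation scheme is not specified here. Time stamps are given in minutes and must be sorted in ascending order.

```python
from bisect import bisect_right


def persistence(times, values, t):
    """Return the most recent measurement at or before time t (real-time mode)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no measurement available at or before t")
    return values[i]


def interpolate(times, values, t):
    """Linearly interpolate between the measurements bracketing t (historic mode)."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the measured time span")
    i = bisect_right(times, t) - 1
    if i == len(times) - 1:
        return values[-1]
    weight = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + weight * (values[i + 1] - values[i])


# Non-equidistant active power samples: time in minutes, power in kW.
sample_times = [0.0, 3.0, 13.0]
sample_power = [100.0, 160.0, 60.0]
held = persistence(sample_times, sample_power, 8.0)     # holds the value from minute 3
blended = interpolate(sample_times, sample_power, 8.0)  # halfway between minutes 3 and 13
```

Angular quantities such as wind direction and nacelle direction would additionally need wrap-around handling at 360°, which this sketch omits.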

Visualization
The yaw angle of each turbine is directly visible from the orientation of the turbine model.The active power can be shown in text above each turbine, or alternatively through gauges with dial and color indications as shown in figure 5.
Figure 5: Descriptive digital twin with text and gauges showing active power of each turbine.

Predictive Digital Twin
First predictive capabilities are added to the digital twin by streaming publicly available weather forecast data. These external forecasts are then used to predict the theoretical power production at the turbine and farm levels.

Wind Field
A vector field for wind speed and direction is implemented by streaming weather forecasts from the Norwegian Meteorological Institute's Thredds service [10] in real-time. The MetCoOp Ensemble Prediction System (MEPS) [11] issues forecasts every 6 h, covering up to 61 h ahead at a 1 h resolution. Parameters of interest are wind speed, wind direction, air pressure, air temperature, and relative air humidity. However, the MEPS model has a horizontal resolution of only 2.5 km. For this reason, the SIMRA microscale model nested into the HARMONIE mesoscale model is used around the wind farm to increase the lateral and vertical resolution of the forecast and include effects induced through the complex terrain. More information on the HARMONIE and SIMRA models can be found in [12]. The SIMRA model evaluated around the Bessakerfjellet wind farm is available at [10]. It includes wind speed, wind direction, air pressure, and air temperature in 1 h intervals for 6 h to 18 h ahead and has been evaluated every 12 h for this particular data set. There is no technical reason preventing the SIMRA model from being evaluated more frequently and with a longer forecasting horizon apart from saving on computational resources. The wind is visualized in the digital twin through wind trails moving through the vector field, or by showing parts of the vector field directly as can be seen in figure 6. Vector direction matches wind direction, vector length represents wind speed, and color indicates the turbulence index.
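
To draw such vectors, forecast wind speed and direction have to be converted into horizontal components. The sketch below assumes the meteorological convention, in which the direction is the bearing the wind blows from, measured clockwise from north. Whether the forecast files follow this convention is an assumption here, and the function name is hypothetical.

```python
import math


def wind_to_vector(speed, direction_deg):
    """Convert wind speed and meteorological direction to (east, north) components.

    direction_deg is assumed to be the bearing the wind comes FROM,
    in degrees clockwise from north.
    """
    theta = math.radians(direction_deg)
    u_east = -speed * math.sin(theta)
    v_north = -speed * math.cos(theta)
    return u_east, v_north


# A 10 m/s westerly wind (coming from 270 degrees) blows towards the east.
u, v = wind_to_vector(10.0, 270.0)
```

The length of the resulting vector equals the wind speed, which matches the visual encoding described above.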
Figure 6: Vector field and profile of wind with wind speed (length), wind direction (orientation), and turbulence (color) for any forecast horizon.

Physics-based power prediction
In the next step, the weather forecast is used to estimate the power production at each turbine.
In [13], weather forecasts such as the MEPS forecast were used with support vector machines, clustering methods, and random forest algorithms to map from wind to power production in flat terrain. Here, the weather forecast is also used as input, but the mapping from weather to produced power is done through physics-based models (PBMs) only, to circumvent the black-box problem of data-driven methods (DDMs). A data sheet for the turbine type is used that contains the direct mapping from wind speed to power production, as well as the power coefficient as a function of wind speed, in 1 m/s intervals. The power coefficient can be used in the well-known relation

$$P(v) = \frac{1}{2} \rho A C_P(v) v^3 \tag{1}$$

where $v$ is the wind speed, $P(v)$ is the produced power, $\rho$ is the density of the air, $C_P(v)$ is the turbine-specific power coefficient, and $A$ is the area swept by the blades. The blade sweeping area $A$ is known to be 3959 m². In the first approach, the air density is assumed to be constant with $\rho_s$ = 1.225 kg/m³. However, air density depends on temperature and pressure. Treating air as an ideal gas, the density of dry air $\rho(T, p)$ can be calculated as

$$\rho(T, p) = \frac{p M_d}{R T} \tag{2}$$

where $T$ is the air temperature, $p$ is the pressure of the air, $M_d$ = 0.0289652 kg/mol is the molar mass of dry air, and $R$ = 8.31446 J/(K mol) is the universal gas constant. Furthermore, humidity can be included through

$$\rho(T, p, \varphi) = \frac{(p - \varphi p_{sat}) M_d + \varphi p_{sat} M_v}{R T} \tag{3}$$

with $M_v$ = 0.018016 kg/mol as the molar mass of water vapour, $\varphi$ as the relative humidity, and $p_{sat} = 6.11\,\mathrm{hPa} \cdot 10^{7.5 T_C / (T_C + 237.3)}$ as the saturation pressure, where $T_C$ is the air temperature in °C, calculated with the Tetens equation as done in [14]. The air density can be used to modify the power curve by

$$P(v, \rho) = \frac{\rho}{\rho_s} P(v) \tag{4}$$

The quality of the power prediction is calculated on a one-year training set with an hourly resolution for each combination of
• wind speed $v$, air temperature, and air pressure based on the MEPS or SIMRA model,
• air density $\rho$ as constant, dry air, or humid air (for SIMRA-based models only),
• calculation from the power curve or through the power coefficient from the turbine manufacturer and equation 1,
• linear or cubic interpolation of the power curve or power coefficient,
• with or without imposing an upper limit on power output according to turbine specifications.
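
A minimal sketch of one such physics-based prediction chain is given below. It is not the implementation evaluated in this work and rests on several assumptions: the power curve values and the rated power are illustrative placeholders rather than data sheet values, the Tetens constants 7.5 and 237.3 are taken from the commonly used form of the equation, the power curve is scaled linearly with the ratio of actual to standard air density (assumed from the proportionality of power to air density), and linear interpolation is used for brevity. All function names are hypothetical.

```python
R = 8.31446       # universal gas constant, J/(K mol)
M_D = 0.0289652   # molar mass of dry air, kg/mol
M_V = 0.018016    # molar mass of water vapour, kg/mol
RHO_S = 1.225     # standard air density, kg/m^3

# Placeholder power curve as (wind speed in m/s, power in kW) pairs.
# These numbers are illustrative only, NOT manufacturer data.
CURVE = [(2.0, 0.0), (6.0, 400.0), (10.0, 1500.0), (14.0, 2000.0), (25.0, 2000.0)]


def saturation_pressure(temp_k):
    """Saturation vapour pressure in Pa from a Tetens-type formula (assumed constants)."""
    t_c = temp_k - 273.15
    return 611.0 * 10 ** (7.5 * t_c / (t_c + 237.3))


def air_density(temp_k, pressure_pa, rel_humidity=0.0):
    """Ideal-gas air density in kg/m^3; rel_humidity = 0 gives dry air."""
    p_vapour = rel_humidity * saturation_pressure(temp_k)
    p_dry = pressure_pa - p_vapour
    return (p_dry * M_D + p_vapour * M_V) / (R * temp_k)


def interp_power_curve(curve, v):
    """Linearly interpolate the power curve; zero outside its wind speed range."""
    if v < curve[0][0] or v > curve[-1][0]:
        return 0.0
    for (v0, p0), (v1, p1) in zip(curve, curve[1:]):
        if v0 <= v <= v1:
            return p0 + (p1 - p0) * (v - v0) / (v1 - v0)
    return 0.0


def predict_power(curve, v, rho, rated_power):
    """Density-corrected power in kW, capped at the rated power."""
    return min(interp_power_curve(curve, v) * rho / RHO_S, rated_power)


# Example: 8 m/s wind, 5 degC, 1000 hPa, 80 % relative humidity.
rho_example = air_density(278.15, 100000.0, 0.8)
power_example = predict_power(CURVE, 8.0, rho_example, 2000.0)
```

Each of the listed model variants corresponds to swapping one step of this chain, for example replacing `air_density` by the constant `RHO_S`, or removing the `min` cap.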

Data-Driven Predictions
Purely data-driven predictions using dense neural networks (DNN) and long short-term memory (LSTM) neural networks are implemented for measurement-based time series prediction and compared with the results from the PBMs. In the DDMs, two years of data are used to train the neural networks (NNs), where 10% are split off for validation. The NNs are trained for one-step-ahead prediction of the power production, and are evaluated iteratively on their own output to obtain a forecast with the full 61 h forecasting horizon. Therefore, the NN output is of size 1. The architecture of the NNs is kept simple with three layers with 5, 3, and 1 units respectively. The input lag is chosen to be 4 h for the DNN based on the partial autocorrelation. In contrast, the cells of the LSTM keep information from previous evaluations in memory. Therefore, only one input is given at a time but the NN is evaluated on a sequence of previous data points. The NNs are trained with the Adam optimizer with a default learning rate, a batch size of 64, and the mean squared error as the loss metric. A validation-loss-based early stopping is used to avoid overfitting. Since the partial autocorrelation suggests that the last measured hour has by far the most substantial contribution to the short-term prediction, the NNs are compared to the persistence model, which always predicts the last measured value.
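
The iterative evaluation of a one-step-ahead model can be sketched as follows. The trained network is replaced here by an arbitrary callable, so the sketch shows only the rollout logic and the persistence baseline, not the NN training itself. All names are hypothetical.

```python
def iterated_forecast(one_step_model, history, lag, horizon):
    """Roll a one-step-ahead model forward by feeding its own output back in.

    one_step_model: callable mapping a list of the last `lag` values to the next value.
    history: measured hourly power values, oldest first.
    """
    if len(history) < lag:
        raise ValueError("history must contain at least `lag` values")
    window = list(history[-lag:])
    forecast = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        forecast.append(next_value)
        # Drop the oldest input and append the prediction as the newest input.
        window = window[1:] + [next_value]
    return forecast


def persistence_model(window):
    """Baseline: the next value equals the last known value."""
    return window[-1]


# Hourly power in kW; the persistence rollout repeats the last measurement.
measured = [820.0, 910.0, 1005.0, 980.0, 1040.0]
baseline = iterated_forecast(persistence_model, measured, lag=4, horizon=61)
```

Because each step consumes earlier predictions, errors accumulate along the rollout, which is consistent with the fast decay of DDM accuracy over longer horizons.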

Results
The PBMs and DDMs are compared against the measured power production for every single turbine and for the whole farm production by using the normalized root mean squared error (NRMSE) for 3 years of available data. The best-performing model in each category is determined. The NRMSE across turbines is shown in figure 7, and the NRMSE on farm-level prediction in figure 8. Note that the farm-level predictions are more accurate as prediction errors of different turbines can cancel each other. The DDMs perform best for small forecast horizons, but their accuracy decays quickly. For one-hour-ahead forecasts, the DNN performs marginally better than the LSTM and persistence model with <0.2% NRMSE. The SIMRA models outperform the DNN after 2 h on the farm-level and after 5 h on the turbine level. All predictions using the SIMRA model as input achieve similar accuracy, but a dynamic air density does improve the forecast by 0.4% NRMSE. The improvement from dry to humid air density and differences between the interpolation method, the cap on maximum power production, and the difference between using the calculated power curve or power coefficient as input are much smaller with <0.1% NRMSE. The best-performing SIMRA-based model uses the turbine's power curve with cubic interpolation, a limit on the maximum power production as the rated power, and a correction for air density that accounts for humid air. The micro-scale SIMRA model gives significantly better results than the MEPS model. Here the SIMRA model was only available up to 18 h ahead, but it is expected that the SIMRA-based models will keep outperforming the MEPS-based models also for longer forecasting horizons, as the decay of accuracy with increasing prediction horizon is slow. Differences within the MEPS models are small (<0.12% NRMSE). Like the SIMRA-based models, the best-performing MEPS-based model uses the power curve directly with cubic interpolation.
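
The error metric and the cancellation effect at farm level can be illustrated with a toy example. The normalization constant is not specified above, so normalizing by rated power is an assumption, and all numbers are made up for illustration.

```python
import math


def nrmse(predicted, measured, norm):
    """Root mean squared error divided by a normalization constant."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)
    return math.sqrt(mse) / norm


RATED_KW = 2000.0  # assumed normalization per turbine (illustrative value)

# Two turbines whose prediction errors have opposite signs in each hour.
measured_a, predicted_a = [1000.0, 1000.0], [1100.0, 900.0]
measured_b, predicted_b = [1000.0, 1000.0], [900.0, 1100.0]

turbine_error_a = nrmse(predicted_a, measured_a, RATED_KW)
turbine_error_b = nrmse(predicted_b, measured_b, RATED_KW)

# Farm level: sum the turbines first, then compare, normalized by total rated power.
farm_measured = [a + b for a, b in zip(measured_a, measured_b)]
farm_predicted = [a + b for a, b in zip(predicted_a, predicted_b)]
farm_error = nrmse(farm_predicted, farm_measured, 2 * RATED_KW)
```

In this constructed case both turbine-level errors are 5% while the farm-level error vanishes. Real errors are only partially anti-correlated, so the cancellation is partial.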
The different models can be combined in a simple hybrid analysis and modeling (HAM) approach for optimal wind farm power prediction on all forecasting horizons by using the DNN for 1 h to 2 h ahead predictions, the SIMRA-based model for 3 h to 18 h ahead predictions, and the MEPS-based model for 19 h to 61 h ahead predictions. Deriving the farm-level power forecast from the turbine-level forecasts makes it possible to assess the impact of each turbine on the farm power production separately inside the virtual-reality-enabled interface and visually trace reasons for fluctuations between turbines back to wind speed, direction, and turbulence, as well as to terrain geometry and surface roughness. Therefore, the digital twin can be used to explain the farm-level power forecast.

Discussion and Future Work
In this work, a functional digital twin of a wind farm was presented with standalone, descriptive, and predictive capabilities. There is much room for further research and demonstration on all capability levels. Hence, this section discusses potential improvements and future work.

Standalone
The modular nature of the implementation allowed upscaling the standalone digital twin from the turbine level presented in [6] to the farm-level without any difficulties, but a terrain had to be added for the onshore wind farm. In this work, the aerial images were directly mapped onto the terrain, resulting in 81 big textures with 1 m resolution and ca. 1 km² area. This did not compromise the real-time execution of the digital twin, but it may be advantageous to use one smaller, high-resolution texture per land cover class (grass/moss/rock/forest/etc.) instead. In future work, their placement can be automated by compiling the color channels of the aerial images into arrays with texture information. The software could be extended to include automated placing of forestation and houses. This way, the detail in the visual component of the digital twin can be increased, and the project size can be reduced.

Descriptive
The upgrade to the farm-level required an additional level in the data hierarchy. A new visualization method was included in the digital twin to ease the assessment of data such as power production across the whole wind park. The different data formats required a new interface between the raw data and the digital twin. Standardization will play an important role in the commercialization of digital twins, and more efforts are needed to establish common standards throughout the whole wind energy industry. Both wind turbine operators and original equipment manufacturers are already collaborating on standardization efforts [15,16].

Predictive
For predictions in this work, three models have been combined in a pipeline where the best model is used depending on the time to be predicted ahead. More sophisticated HAM approaches could potentially improve the predictions further by combining information from measured data and numerical weather models. In the broader context of digital twins for wind energy, PBMs, DDMs, and HAMs have been discussed in [3]. In the context of wind power predictions, an example of a HAM approach includes a data-driven regression from mesoscale weather forecasts to power production, as has been evaluated for flat and open terrain in [13]. The microscale weather model could be replaced by a resolution-enhancing generative adversarial network, as has been attempted in [17], but the results were criticized in [18]. Finally, ensemble methods with secondary models for combining DDM and PBM outputs may extract additional value from measurements and numerical models. This approach will be investigated in future work with a more thorough investigation of different DDMs.
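
The horizon-based pipeline of this work (DNN for 1 h to 2 h, SIMRA-based model for 3 h to 18 h, MEPS-based model for 19 h to 61 h ahead) amounts to a simple lookup, sketched below. The function names and the dictionary layout of the forecasts are hypothetical, and the forecast values are dummies.

```python
def select_model(horizon_h):
    """Return the name of the model used for a given forecast horizon in hours."""
    if 1 <= horizon_h <= 2:
        return "DNN"
    if 3 <= horizon_h <= 18:
        return "SIMRA"
    if 19 <= horizon_h <= 61:
        return "MEPS"
    raise ValueError("horizon must lie between 1 h and 61 h")


def hybrid_forecast(forecasts, max_horizon=61):
    """Stitch per-model forecasts into one series.

    forecasts: dict mapping model name to a dict {horizon in hours: power}.
    """
    return [forecasts[select_model(h)][h] for h in range(1, max_horizon + 1)]


# Dummy per-model forecasts in kW; each model only needs to cover its own horizons.
dummy_forecasts = {
    "DNN": {h: 1000.0 for h in range(1, 3)},
    "SIMRA": {h: 1100.0 for h in range(3, 19)},
    "MEPS": {h: 1200.0 for h in range(19, 62)},
}
combined = hybrid_forecast(dummy_forecasts)
```

A secondary model that weights the three inputs per horizon, as suggested above for future work, would replace the hard switch in `select_model`.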

Diagnostic, Prescriptive, and Autonomous
In addition to the standalone, descriptive, and predictive capabilities explored here, diagnostic modules could be evaluated on the measured data for condition monitoring and component wear tracking including weather effects and turbine load. Prescriptive modules could include weather- and power-aware maintenance scheduling. Finally, the turbine state could be used for autonomous farm optimization and control to balance power production and turbine wear [19].

Conclusion
In this work, a digital twin of an onshore wind farm in complex terrain with standalone, descriptive, and predictive capabilities was built. The standalone digital twin was implemented into a game engine by creating a 3D CAD model of the turbines and combining it with height maps and aerial images of the surrounding terrain. It includes a human-machine interface capable of interaction through virtual reality and simultaneously contains meta-data about the wind turbines, the farm layout, and the environment. The digital twin was elevated to the descriptive level by including SCADA data measured at each of the wind turbines. Some data were visualized directly on the 3D CAD model, while other information was shown with animated gauges and text. Predictive capabilities were implemented to forecast the power production of each turbine in the wind farm and the results were visualized in the interface. The predictions were performed by combining existing weather forecasts and physics-based models. They were made intuitively understandable by showing wind as vector fields and trails on the terrain. Finally, the results were discussed, and an outlook on future work was given. On top of the continuation of current research, the digital twin can be extended with additional modules to cover more aspects and evolve throughout the whole life-cycle of a wind farm. Such full-fledged digital twins will have the potential to substantially contribute towards cheaper and more sustainable wind energy for a greener future.

Figure 3: Comparison of terrain between top-down view within the digital twin interface (inside) and a picture from [9] (outside).
Figure 4: 3D CAD model of the Enercon E70-4 turbine.
Figure 8: Farm-level NRMSE of the best model in each category.