Integrated condition monitoring of a fleet of offshore wind turbines with a focus on streaming processing of acceleration data

Particularly offshore, there is a trend to cluster wind turbines in large wind farms and, in the near future, to operate such a farm as an integrated power production plant. Predictability of individual turbine behaviour across the entire fleet is key to such a strategy. Failure of turbine subcomponents should be detected well in advance to allow early planning of all necessary maintenance actions, such that they can be performed during periods of low wind and low electricity demand. To obtain the insights needed to predict component failure, an integrated, clean dataset spanning all turbines of the fleet over a sufficiently long period of time is necessary. This paper illustrates our big-data approach to achieve this. In addition, advanced failure detection algorithms are necessary to detect failures in this dataset. This paper discusses a multi-level monitoring approach that combines machine learning with advanced physics-based signal-processing techniques. The advantage of combining different data sources to detect system degradation lies in the higher certainty obtained from multivariable criteria. To perform long-term signal processing of high-frequency acceleration data, a streaming processing approach is necessary. This allows the data to be analysed as the sensors generate it. This paper illustrates this streaming concept on 5 kHz acceleration data from a real-life offshore wind turbine: a continuous spectrogram is generated from the data stream. Using this streaming approach to calculate bearing failure features on continuous acceleration data will support the detection of failure propagation.


Introduction
Wind energy has been booming for the past decade and a massive number of turbines have been installed both onshore and offshore. Given the increasing share of wind energy in the renewable energy mix, it becomes more important to deliver as constant a power output as possible, particularly at moments of high electricity demand. To achieve this, turbines are clustered in large wind farms. This is particularly the case offshore, where wind farms even have their own transformer substation. Such a plant will in the future be operated as an integrated power plant that needs to meet the electricity demanded by the grid [1]. This will make the farm operator much more responsible for the way in which this energy is produced. Currently, the wind speed determines how much energy each turbine produces. In the future, the wind farm operator will decide how much each turbine in the farm produces, taking into account the price of energy, the condition of the turbines and other factors.
A second important aspect of farm operation is maintenance planning. Drivetrain reliability plays an important role here [2]. Each turbine needs yearly replacement of certain components and fluids. Additionally, if a component fails it needs to be replaced. Ideally, maintenance actions are performed during periods of low wind and low electricity demand, so that the revenue lost due to downtime is minimal.
To achieve this vision, it is essential to have a good estimate of the current condition of each individual turbine in the farm. Early failure detection and insight into the historic loading of each turbine form the basis for such optimized operation of the farm. This paper suggests an integrated approach to assess the current condition of all turbines in a farm.

Methods
A first requirement for assessing the current condition of a system is the availability of data in a consistent dataset. This dataset should span all turbines in the farm to allow comparison between the behaviour of the different turbines. To deal with the massive amount of measurement data coming from the more than 200 sensors mounted on each turbine in a farm, we suggest the use of a big-data storage and streaming approach.
A second requirement is the availability of dedicated data analysis tools to perform anomaly detection. We suggest a combination of dedicated physics-based condition monitoring signal-processing techniques for high-frequency data [3,4] and data-driven machine learning algorithms for anomaly detection on signals coming from low-frequency sampled (<10 Hz) sensors embedded in the turbine [5].
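As an illustration of what such a data-driven anomaly detector on low-frequency SCADA signals could look like, the sketch below flags 10-minute samples whose power deviates strongly from a fitted power curve. The cubic power-curve model, the synthetic data and the z-score threshold are illustrative assumptions, not the algorithm of [5]:

```python
import numpy as np

def scada_anomaly_flags(wind_speed, power, z_thresh=3.0):
    """Flag SCADA samples whose power output deviates strongly from a
    power curve fitted to the data (simple polynomial residual check)."""
    # Fit a cubic power curve to the bulk of the data
    coeffs = np.polyfit(wind_speed, power, deg=3)
    residual = power - np.polyval(coeffs, wind_speed)
    # Standardize residuals and threshold them
    z = (residual - residual.mean()) / residual.std()
    return np.abs(z) > z_thresh

# Synthetic example: an idealized cubic power curve plus one
# underperforming sample injected at index 42
rng = np.random.default_rng(0)
ws = rng.uniform(4, 12, 500)
p = 0.5 * ws**3 + rng.normal(0, 5, 500)
p[42] -= 400                      # injected anomaly
flags = scada_anomaly_flags(ws, p)
```

In practice a per-turbine or fleet-wide reference model would replace the simple polynomial fit, but the principle of flagging deviations from expected behaviour is the same.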
The third requirement is a capable decision support framework to interpret the alarms from the anomaly detection algorithms. It is important to learn from previous experiences (maintenance actions) how certain alarm sequences can be coupled to certain failures. These links can then be used to predict incipient failures. Again, machine learning algorithms can add value here.
This integrated approach is too broad to discuss fully in the context of this paper. Therefore, one aspect is treated in detail: streaming signal processing of high-frequency data. The other aspects are discussed briefly, with references to other papers where the details are given.

Big-data storage
In general, a large volume of data is generated by the different turbines in the farm. To perform analysis on this data, it is necessary to store it in one consistent dataset. This data can be time series or more contextual. For the wind turbine application, the most important time-series data sources are: Supervisory Control And Data Acquisition (SCADA) data, sampled either on a 10-minute basis or at 1 Hz; turbine status codes [6]; data from condition monitoring systems; and dedicated measurements of custom monitoring systems for certain components, such as the gearbox. Context information is linked to, for example, maintenance actions and meteorological information. These data sources can provide additional insight into the behaviour seen in the different time series of the turbines and link them to failure events.
Given the high variety and high volume of the data, we needed to go beyond traditional database approaches. Our automated analysis platform therefore uses a big-data NoSQL architecture. Automation is needed to process the different sensor channels in parallel. An overview of the different data sources is shown in Figure 1.
In addition to raw data, processed data is also stored in the big-data platform. Particularly features linked to anomaly information are useful to keep over long periods of time. Such a feature can for example be the amplitude at the ball pass frequency of the outer ring (BPFO). These features are kept in order to train classifiers that combine the anomalies found for different sensor signals in a decision support strategy.
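As an illustration of such a bearing feature, the sketch below computes the theoretical BPFO from bearing geometry using the standard kinematic formula and extracts its spectral amplitude from an acceleration signal. The geometry values and the test signal are hypothetical; this is a minimal stand-in for the feature extraction, not the platform's actual implementation:

```python
import numpy as np

def bpfo_hz(shaft_hz, n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Ball pass frequency, outer ring (Hz), from bearing geometry:
    BPFO = (n/2) * f_shaft * (1 - (d/D) * cos(phi))."""
    phi = np.deg2rad(contact_angle_deg)
    return 0.5 * n_balls * shaft_hz * (1.0 - (ball_d / pitch_d) * np.cos(phi))

def bpfo_feature(signal, fs, bpfo, bandwidth=2.0):
    """Amplitude of the spectrum in a narrow band around the BPFO."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal) * 2  # amplitude scaling
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    band = (freqs >= bpfo - bandwidth) & (freqs <= bpfo + bandwidth)
    return spectrum[band].max()

# Hypothetical bearing: 9 balls, 7 mm ball diameter, 39 mm pitch diameter,
# shaft turning at 20 Hz
f_bpfo = bpfo_hz(20.0, 9, 7.0, 39.0)

# One second of synthetic vibration at the BPFO, sampled at 5120 Hz
fs = 5120
t = np.arange(fs) / fs
amp = bpfo_feature(np.sin(2 * np.pi * f_bpfo * t), fs, f_bpfo)
```

Storing such scalar features over long periods, instead of the raw spectra, is what keeps the long-term dataset tractable.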
In this paper we focus on the high-frequency acceleration data rather than the low-frequency SCADA data. This particular data source consists of the data of accelerometers mounted on the main bearing, gearbox and generator. It is the data source that the condition monitoring signal-processing algorithms use as input.
Acceleration data is typically sampled between 500 Hz and 10 kHz. This high data rate poses a challenge for more traditional processing approaches. In most of those cases the raw data is aggregated into averaged values, at a rate of one per minute to one per ten minutes, by an Extract, Transform and Load (ETL) process. The aggregated data is stored permanently in the database and used as the basis for further analysis.
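A minimal sketch of such an aggregating ETL step, reducing a raw acceleration stream to one RMS value per window (the window length and test signal here are illustrative, not the settings of an actual system):

```python
import numpy as np

def aggregate_rms(raw, fs, window_s=600):
    """Reduce a raw acceleration stream to one RMS value per window,
    mimicking an ETL step that stores only aggregated statistics."""
    samples_per_window = int(fs * window_s)
    n_windows = len(raw) // samples_per_window
    # Drop the incomplete tail and reshape into (window, sample)
    trimmed = raw[: n_windows * samples_per_window]
    windows = trimmed.reshape(n_windows, samples_per_window)
    return np.sqrt(np.mean(windows**2, axis=1))

# Three seconds of a unit-amplitude 50 Hz sine at 5120 Hz, aggregated
# per second: each window's RMS is close to 1/sqrt(2)
fs = 5120
t = np.arange(3 * fs) / fs
rms = aggregate_rms(np.sin(2 * np.pi * 50 * t), fs, window_s=1)
```

The aggregation discards exactly the spectral detail that bearing-fault features need, which motivates the streaming alternative discussed next.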
For condition monitoring, however, the different features need to be calculated on each of the high-frequency sampled channels and not on the aggregated data. With a 'store-all' approach this would result in a massive amount of data in the database. Traditionally this is overcome by reducing the number of measurements the condition monitoring system sends to the database: key statistics are sent frequently, whereas detailed high-frequency data is transferred only a limited number of times each day. This raw data is then processed into features, which are again stored in the database.
Having access to only a limited number of data points each day allows for alarming, but makes it difficult to detect trends in system degradation. Particularly for the wind turbine application it is necessary to have more long-term data, in order to have multiple measurement points in time at consistent operating conditions.
To overcome this problem we suggest the use of a streaming processing approach. In such a data architecture the data is processed continuously as it is generated.

High frequency data streaming signal processing
In the streaming context the condition monitoring processing algorithms need to be able to keep up with the data-generation. Therefore, it is necessary to perform the different processing tasks in parallel. Figure 2 illustrates the dataflow.
In this case the input to the data-streaming engine is a stream of files. These files contain the data of one or more measurement channels and are generated by the data-acquisition system. Alternatively, the measurement system may be capable of streaming the data directly to the streaming engine.
The streaming engine buffers the data long enough for processing. It splits the data over different parallel data streams, tailored to each of the analysis algorithms. Measurement channels can be repeated in several parallel streams if they are used as inputs for different signal-processing sequences. The signal features resulting from the signal-processing algorithms are stored in the database. All actions are implemented to run autonomously and continuously.
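The buffering and fan-out steps can be sketched as follows. This is a simplified, single-process illustration of the concept, not the actual streaming engine; the block size and the feature set are assumptions:

```python
import numpy as np
from collections import deque

class StreamBuffer:
    """Accumulate incoming chunks until a full analysis block is
    available, then emit complete blocks for processing."""
    def __init__(self, block_size):
        self.block_size = block_size
        self._buf = deque()
        self._count = 0

    def push(self, chunk):
        """Add one incoming chunk; return any complete blocks."""
        self._buf.append(np.asarray(chunk, dtype=float))
        self._count += len(chunk)
        blocks = []
        while self._count >= self.block_size:
            data = np.concatenate(list(self._buf))
            blocks.append(data[: self.block_size])
            rest = data[self.block_size :]
            self._buf = deque([rest]) if len(rest) else deque()
            self._count = len(rest)
        return blocks

# Fan each complete block out to several feature pipelines, standing in
# for the parallel streams of the engine
features = {
    "rms": lambda x: float(np.sqrt(np.mean(x**2))),
    "peak": lambda x: float(np.max(np.abs(x))),
}

buf = StreamBuffer(block_size=1024)
results = []
for _ in range(5):                       # five incoming 512-sample chunks
    for block in buf.push(np.ones(512)):
        results.append({name: f(block) for name, f in features.items()})
```

In a real deployment the fan-out would run on a distributed streaming framework so that each feature pipeline keeps up with the data generation independently.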

Offshore wind turbine gearbox acceleration measurements
To illustrate the working principle of our data streaming approach, we show the example of long-term spectrogram processing on an acceleration signal of an offshore wind turbine. The spectrogram is often used as input for condition monitoring analysis and operational modal analysis. It offers insight into the dynamic behaviour of a system over time.
The measurement data used to illustrate the concept of streaming acceleration processing comes from an offshore wind turbine, on which a multi-month measurement campaign was performed. The instrumentation package consisted of accelerometers mounted on the gearbox. All channels were sampled at 5120 Hz. For rotating machines the spectrogram is typically characterized by the clear presence of harmonics linked to the different rotating shafts and gears. The frequency of these harmonics changes when the speed of the system changes; tracking the harmonics therefore gives insight into the speed behaviour of the machine. This can for example be used for virtual tachometer extraction, e.g. using a Multi-Order Probabilistic Approach (MOPA) [7]. Additionally, the spectrogram can be used in the context of Campbell diagram analysis for the detection of resonance frequencies of rotating machines. During the visualized time period the turbine is continuously changing regime. For this particular case the turbine was operational until minute 4. It was then manually stopped for 1.5 minutes and restarted. This stop event is shown in Figure 4. The orders clearly mark the braking of the system. During the stop the turbine is still idling very slowly, as marked by the remaining low-frequency, low-amplitude orders. At minute 20 the turbine is stopped again. A more detailed view of this stop is shown in Figure 5.
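To make the spectrogram computation concrete, the sketch below implements a basic Hann-windowed block spectrogram at the 5120 Hz sampling rate and applies it to a synthetic decelerating harmonic, standing in for an order during a stop event. The segment length, overlap and test signal are illustrative assumptions, not the actual measurement pipeline:

```python
import numpy as np

FS = 5120  # sampling rate of the measurement campaign (Hz)

def spectrogram_block(x, fs=FS, nperseg=4096, noverlap=2048):
    """One spectrogram update: Hann-windowed FFT magnitudes for each
    overlapping segment of an incoming acceleration block."""
    window = np.hanning(nperseg)
    step = nperseg - noverlap
    n_segments = 1 + (len(x) - nperseg) // step
    freqs = np.fft.rfftfreq(nperseg, 1.0 / fs)
    spec = np.empty((len(freqs), n_segments))
    for i in range(n_segments):
        seg = x[i * step : i * step + nperseg] * window
        spec[:, i] = np.abs(np.fft.rfft(seg))
    return freqs, spec

# Synthetic stand-in: a shaft harmonic sweeping down in frequency over
# ten seconds, as the orders do when the turbine brakes
t = np.arange(10 * FS) / FS
f_inst = np.linspace(150.0, 20.0, t.size)        # decelerating "order"
x = np.sin(2 * np.pi * np.cumsum(f_inst) / FS)   # phase = integral of f
freqs, spec = spectrogram_block(x)
```

In the streaming setup this function would be applied to each buffered block as it arrives, and the columns appended to the continuous long-term spectrogram.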

Conclusions
This paper illustrated our integrated big-data monitoring approach for a fleet of wind turbines. The general concept was presented, and the concept of streaming processing of high-frequency acceleration data was discussed in detail. This approach was illustrated by means of a streaming spectrogram calculation. By performing this analysis on experimental data of an offshore wind turbine, it was shown that the dynamic behaviour of the system is clearly visible, making this a good starting point for streaming modal analysis and condition monitoring analysis. This will be further developed in future work.