Predictive Maintenance of Oil and Gas Equipment using Recurrent Neural Network

Oil and gas industry projects involving equipment acquisition and installation are usually capital intensive. The recent fall in crude oil prices has tightened expenditure and therefore reinforced the importance of effective maintenance management across the oil and gas industry. Rotating mechanical equipment such as induction motors, compressors and pumps are essential elements in industrial processes. Effective maintenance of this equipment is crucial to avoid severe damage and downtime for repair. Predictive maintenance, driven by sensors and data acquisition, has attracted huge attention in this industry. This paper focuses on developing a machine learning algorithm based on a recurrent neural network (RNN) with long short-term memory (LSTM) to carry out predictive maintenance of an air booster compressor (ABC) motor. The experiment demonstrates the performance of the RNN-LSTM algorithm implemented as a fault prognosis model for predictive maintenance of rotating equipment. The application of such algorithms could mitigate risk and reduce cost in oil and gas operations.


Introduction
Periodic and preventive maintenance have been used in the oil and gas industry for the past few decades. However, these maintenance techniques sometimes lead to unnecessary maintenance, which is labor-intensive and costly. Predictive maintenance, on the other hand, allows users to predict possible future faults of a particular piece of equipment or instrument. Unlike periodic and preventive maintenance, this reduces the frequency of maintenance needed. Preventive maintenance is ineffective in identifying problems and is labor-intensive. Predictive maintenance is very important to the industrial economy because it improves equipment efficiency and reliability and reduces downtime. In recent years, abundant data on rotating equipment have become readily available from various sources. However, these data are not being utilized and analyzed to improve maintenance performance. Advanced techniques are required to analyze this variety of data and transform it into relevant information.
The air booster compressor (ABC) motor is a crucial device in oil and gas processes. There is strong demand for motor reliability and safe operation, since motor failure leads to downtime and great losses in income and maintenance cost. ABC motor failure is costly and time consuming, because the cost and time of maintenance increase when the motor or its internal components need to be replaced. A few studies have applied neural networks to time series prediction. With artificial neural networks (ANN), a mathematical model can be trained on a set of pre-determined data to predict future values. Based on the data acquired, the machine state is assessed for abnormal events and failure indications, which are the factors used to identify upcoming maintenance. Malhi and Gao [1] proposed a modified recurrent neural network (RNN) approach to multi-step prediction for prognosis on rolling bearing data. Liang et al. [2] proposed an RNN-based health indicator (RNN-HI) to enhance remaining useful life (RUL) prediction of bearings. Chen et al. [3] utilized an RNN as a predictive maintenance technique to design an intelligent prediction system for mechanical state.
This paper covers the predictive maintenance of an ABC motor installed in the air separation unit of a utility plant in the oil and gas industry. It aims to develop a prediction model, based on an RNN-LSTM that is trained and simulated, to predict the condition of the motor and provide early warning of motor faults. The selected parameters are current, active power, discharge temperature, and winding and bearing temperature (Table 1). The rest of the paper is organized as follows: Section II introduces the basic concept of the RNN-LSTM algorithm. Section III presents the methodology developed for predictive maintenance of the ABC motor. Section IV discusses the results obtained from the designed model. The last section presents the concluding remarks for this work.

RNN-LSTM Principle
RNN is a subclass of ANNs designed for capturing information from sequence or time series data. In a feed-forward network, signals flow in only one direction, from input to output, one layer at a time. Unlike a feed-forward network, a recurrent network can receive a sequence of values as input and can also produce a sequence of values as output. The ability to operate with sequences opens these networks to a wide variety of applications. An RNN has feedback connections from the hidden or output layer to the preceding layer, which allow the network to store its internal state, process sequential input and perform temporal tasks such as speech recognition and prediction. RNNs can process dynamic information: their state changes continuously until it reaches an equilibrium point. An RNN works on the recursive formula given in (1):

$$h_t = \sigma(W_{xh} x_t + W_{hh} h_{t-1}) \qquad (1)$$

where $x_t$ is the input at the $t$-th time, $h_t$ is the state of the hidden layer at the $t$-th time, $W_{xh}$ is the weight between the input layer and the hidden layer, $W_{hh}$ is the weight between the current hidden layer and the hidden layer at the next time, and $\sigma$ is the activation function.

Typically, an RNN is an extremely difficult network to train. Since the network is trained with backpropagation through time, the gradient tends to vanish when it is passed through many time steps; this is the vanishing gradient problem. To solve this problem and improve accuracy, Sepp Hochreiter and Jürgen Schmidhuber [4-6] developed an advanced version of the RNN called long short-term memory (LSTM). LSTM networks are able to learn long-term dependencies by utilizing a memory cell that remembers important features of the time series data and tends to forget unimportant ones. An LSTM consists of three gates and one cell state, as represented in Figure 1. Each line carries a vector from the output of one node to the inputs of others [7][8].
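Returning to the plain recurrence in (1), the following minimal sketch may help to show how the hidden state carries information from earlier inputs. It is a scalar illustration only: the function name `rnn_forward`, the weight values and the choice of `tanh` as activation are assumptions made for the example and are not taken from the model developed in this paper.

```python
import math

def rnn_forward(inputs, w_xh, w_hh, h0=0.0):
    """Scalar recurrence h_t = tanh(w_xh * x_t + w_hh * h_{t-1}).

    tanh and the scalar weights are illustrative assumptions.
    """
    h = h0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_xh * x + w_hh * h)
        states.append(h)
    return states

# A single pulse followed by zeros: the state stays non-zero afterwards,
# which is the "memory" that a feed-forward network lacks.
states = rnn_forward([1.0, 0.0, 0.0], w_xh=0.5, w_hh=0.8)
print(states)
```

In this example the second and third states are non-zero even though the corresponding inputs are zero, and they shrink at every step, which hints at why information (and gradients) fade over long sequences.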
The cell state line runs through the entire network; along its path it undergoes minor nonlinear interactions that regulate the information it carries. The LSTM has the ability to remove or add information to the cell state, and this change is regulated by structures called gates: the input gate, the forget gate and the output gate. The input gate is computed as follows:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \qquad (2)$$

where $W_{xi}$ is the weight between the input gate and the input layer, $W_{hi}$ is the weight between the hidden layer at the last time and the hidden layer at the current time, $b_i$ is the bias vector and $\sigma$ is the activation function. The forget gate is given as follows:

$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \qquad (3)$$

where $W_{xf}$ is the weight between the forget gate and the input layer, $W_{hf}$ is the weight between the hidden layer at the last time and the hidden layer at the current time, $b_f$ is the bias vector and $\sigma$ is the activation function. The output gate is written as follows:

$$o_t = \sigma(W_{xo} x_t + W_{co} c_t + b_o) \qquad (4)$$

where $W_{xo}$ is the weight between the output gate and the input layer, $W_{co}$ is the weight between the output gate and the state of the memory block at the $t$-th time, and $b_o$ is the bias vector. The state of the memory cell at the $t$-th time is computed as follows:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_{xc} x_t + W_{cc} c_{t-1} + b_c) \qquad (5)$$

where $W_{xc}$ is the weight between the input layer and the state of the memory block at the $t$-th time, $W_{cc}$ is the weight between the state of the memory block at the $t$-th time and at the next time, and $b_c$ is the bias vector.
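To make the role of the three gates concrete, one time step of an LSTM cell can be sketched as below. This is the common textbook formulation, in which every gate reads the current input and the previous hidden state; it may differ in detail from the weight layout described above. The parameter names in `params`, their values and the scalar (single-unit) setting are all assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar textbook LSTM cell (illustrative, not the paper's exact model)."""
    i = sigmoid(p["w_xi"] * x + p["w_hi"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_xf"] * x + p["w_hf"] * h_prev + p["b_f"])    # forget gate
    o = sigmoid(p["w_xo"] * x + p["w_ho"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_xc"] * x + p["w_hc"] * h_prev + p["b_c"])  # candidate cell value
    c = f * c_prev + i * g   # keep part of the old state, add part of the candidate
    h = o * math.tanh(c)     # expose a gated view of the cell state
    return h, c

# Hypothetical parameter values chosen only to make the example run.
params = {name: 0.5 for name in
          ("w_xi", "w_hi", "w_xf", "w_hf", "w_xo", "w_ho", "w_xc", "w_hc")}
params.update({"b_i": 0.0, "b_f": 1.0, "b_o": 0.0, "b_c": 0.0})

h, c = 0.0, 0.0
for x in [0.2, -0.1, 0.4]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

The additive update of `c` is the design choice that lets information survive many steps: when the forget gate is close to one, the previous cell state passes through almost unchanged.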

Research Methodology
Various neural network structures have been employed for prediction. Traditional neural networks such as multilayer perceptrons are not well suited to time series or sequential data prediction because they treat the inputs as independent of each other, whereas in an RNN the output state depends not only on the current input but also on the inputs of previous time steps. Recently, researchers have been using deep learning-based methods such as RNN-LSTM for time series prediction because of their ability to learn from data. LSTMs have produced valuable results in stock exchange prediction [9] and traffic flow prediction [10].

A. Data Preprocessing
An optimized model requires good historical data from well-functioning sensing instruments. In this study, the data were obtained from a real plant. The experimental data span a two-year period, from January 2015 to January 2017, at one-minute intervals. The ABC motor experienced downtime three times over these two years because of different fault conditions; therefore, the data are divided into three events, each consisting of the data recorded before a failure. Data preprocessing is important for improving data quality by removing noise, missing values and inconsistencies, which makes the prediction model more accurate and faster. The data were divided into two categories: healthy and unhealthy. The unhealthy data were further categorized into three types of common fault: shutdown, outlier and out-of-range. A shutdown fault is the condition in which the supervision system reads a zero value from the motor; an outlier fault is a data point that lies at an abnormal distance from other values; and an out-of-range fault is a reading of an ABC motor parameter that falls outside the range of the process parameter. The unhealthy data were identified and filtered out by setting upper and lower limits for each parameter. The healthy data values were normalized using z-score normalization so that no measurement outweighs the others. The z-score is calculated using Eq. (6).
$$z = \frac{x - \mu}{\sigma} \qquad (6)$$

where $x$ is the input, $\mu$ is the mean and $\sigma$ is the standard deviation of the data. The data are then scaled into the range [-1, 1] so that every data input has the same minimum and maximum values.

Figure 2.

The training and prediction phases are implemented in MATLAB. Several model architectures were built and simulated in order to choose the best one for prediction. The designed model was able to perform short-term (hourly) forecasting accurately. A total of 5700 data points of experimental data were used for RNN-LSTM model development. The data are divided into three parts, training, validation and testing, to assess the accuracy of the model. Event 1 data are used for training, event 2 data for validation, and the model's accuracy is tested on the event 3 dataset. The training set is the data used to train the model; during each epoch the model learns the features of these data. The recorded data were divided in a 6:2:2 ratio: 60% for training, 20% for validation and 20% for testing. The aim is that the model can later be deployed to give accurate predictions on data that it has never seen before.
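The preprocessing chain described in this subsection (range filtering, z-score normalization, scaling to [-1, 1] and a 6:2:2 split) can be sketched as follows. The study itself was carried out in MATLAB; this Python version is only an illustration. The function names, the sample readings and the limits of 40 and 70 are hypothetical, and the use of the sample standard deviation is an assumption, since the paper does not state which estimator was used.

```python
import statistics

def filter_in_range(values, lower, upper):
    """Keep readings inside [lower, upper]; zero (shutdown) readings drop out when lower > 0."""
    return [v for v in values if lower <= v <= upper]

def zscore(values):
    """z = (x - mean) / std, using the sample standard deviation (an assumption)."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [(v - mu) / sigma for v in values]

def scale_to_unit_range(values):
    """Linearly map the values so that min -> -1 and max -> 1."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def split_622(values):
    """Chronological 60/20/20 split into training, validation and test parts."""
    n = len(values)
    n_train = n * 6 // 10
    n_val = n * 2 // 10
    return (values[:n_train],
            values[n_train:n_train + n_val],
            values[n_train + n_val:])

# Hypothetical readings: 0.0 mimics a shutdown, 120.0 an out-of-range value.
readings = [0.0, 52.0, 55.0, 54.0, 120.0, 53.0, 56.0, 51.0, 50.0, 57.0, 58.0, 49.0]
healthy = filter_in_range(readings, lower=40.0, upper=70.0)
z = zscore(healthy)
scaled = scale_to_unit_range(z)
train, val, test = split_622(scaled)
print(len(train), len(val), len(test))
```

The split is kept chronological rather than shuffled so that the model is always evaluated on data recorded after the data it was trained on.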

B. Development of RNN-LSTM model
Validation data provide information that helps in adjusting the hyperparameters of the model. One of the major reasons for using a validation set is to make sure that the model is not overfitting the training data. Overfitting means that the model gives accurate predictions on the training set but is unable to generalize and make accurate predictions on data it was not trained on. The test set is the data used to test the model: after the model has been trained and validated, it is used to predict the output on the test set.
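One simple way in which a validation set can guide training is to keep the epoch at which the validation error is lowest. The sketch below illustrates the idea; the helper `best_epoch` is hypothetical and the error values are invented for the example, not results from this study.

```python
def best_epoch(val_errors):
    """Return (epoch, error) for the lowest validation error; epochs are 1-based."""
    idx = min(range(len(val_errors)), key=val_errors.__getitem__)
    return idx + 1, val_errors[idx]

# Illustrative numbers only: training error keeps falling,
# while validation error turns upward once the model starts to overfit.
train_rmse = [0.90, 0.60, 0.40, 0.30, 0.25, 0.22]
val_rmse = [0.95, 0.70, 0.50, 0.45, 0.48, 0.55]

epoch, err = best_epoch(val_rmse)
print(epoch, err)
```

In this toy series the training error is still improving after the selected epoch, which is exactly the situation in which a validation set reveals overfitting that the training error alone would hide.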
In this study, a recursive strategy is applied for multi-step prediction. In this strategy, a one-step model is used multiple times. The first step is forecast by applying the model. Subsequently, the forecast value is used as an input for the next-step prediction. This process continues until the entire horizon has been forecast. The predicted value after each step is inserted into the input vector for predicting the next values of the ABC motor data.
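The recursive strategy can be sketched as a short loop around any one-step predictor. In the sketch below, `one_step_model` stands for the trained network; the stand-in `mean_of_window` is a hypothetical placeholder used only so that the example runs, and the window is stored oldest value first as a matter of convenience.

```python
def recursive_forecast(one_step_model, history, n_lags, horizon):
    """Forecast `horizon` steps ahead by feeding each prediction back as an input."""
    window = list(history[-n_lags:])       # most recent n_lags observations
    forecasts = []
    for _ in range(horizon):
        y_hat = one_step_model(window)     # one-step-ahead prediction
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]      # drop the oldest value, append the forecast
    return forecasts

def mean_of_window(window):
    """Hypothetical stand-in for a trained one-step model."""
    return sum(window) / len(window)

preds = recursive_forecast(mean_of_window, [1.0, 2.0, 3.0], n_lags=2, horizon=3)
print(preds)  # [2.5, 2.75, 2.625]
```

Because later steps are computed from earlier forecasts rather than from measurements, any one-step error propagates along the horizon, which is the main trade-off of this strategy.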

Figure 3. ABC motor prediction based on RNN-LSTM Prediction model
The structure of the designed RNN-LSTM predictive model is shown in Figure 3. The model is designed to predict the hourly motor condition using the input $X_t$. The input of the model is $X_t = (x_{t-1}, x_{t-2}, \ldots, x_{t-n})$, where $x_{t-n}$ is the value of the motor condition indicator $n$ hours before the $t$-th hour.
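The construction of such lagged input vectors from a recorded series can be illustrated as follows. The function name `make_lagged_samples` and the example values are hypothetical; each sample pairs the previous `n_lags` values, most recent first, with the value to be predicted.

```python
def make_lagged_samples(series, n_lags):
    """Build (input, target) pairs where input = (x[t-1], x[t-2], ..., x[t-n_lags])."""
    samples = []
    for t in range(n_lags, len(series)):
        x_t = tuple(series[t - k] for k in range(1, n_lags + 1))
        samples.append((x_t, series[t]))
    return samples

samples = make_lagged_samples([10, 11, 12, 13, 14], n_lags=3)
print(samples)  # [((12, 11, 10), 13), ((13, 12, 11), 14)]
```

A series of length N yields N minus `n_lags` samples, so a longer input window leaves fewer training examples from the same record.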
The number of hidden layers and the number of nodes in each hidden layer were determined by running several simulations of the designed model. The activation function used in this model is the sigmoid function. The network architectures were trained for up to 1000 epochs. The weights were initialized randomly before training. The computation of the model is as follows:

Results and Discussions
The performance of the model relies on several factors, i.e. the architecture of the model, the network parameters and the training function. The RNN-LSTM model parameters were selected by running several simulations in order to find the ideal parameters for the designed model. The robustness and accuracy of the model are judged by the accuracy of the predicted values, i.e. by the lowest root mean square error (RMSE). RMSE is calculated using (9).
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \qquad (9)$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value and $N$ is the number of samples.

Two of the most popular training algorithms, Levenberg-Marquardt and Bayesian regularization, were compared for training the network, as shown in Table 2. Levenberg-Marquardt proved to be faster: it is tailored to functions of the sum-of-squared-error type, which makes it very fast when training neural networks measured on that kind of error. An RNN-LSTM network with one hidden layer was selected, since a model with one hidden layer of sigmoid units can approximate any continuous mapping arbitrarily well.

The model was trained with different numbers of epochs and of neurons in the hidden layer. Table 3 shows the simulation results for the proposed model; 15 hidden nodes give the lowest RMSE on the training and testing data. As the number of neurons in the hidden layer increases toward 15, the RMSE decreases drastically. Once the number of neurons exceeds 15, the RMSE starts to increase gradually. There is no particular approach for optimizing the number of neurons in the hidden layer; it depends on trial and error. Similarly, the RMSE decreases as the number of epochs increases. After 750 epochs the model starts to overfit the training set, and the RMSE on the validation and testing data suddenly increases. Overfitting is indicated when the model fits the training data but underperforms on test data. This issue can be addressed through techniques such as regularization, which constrains the optimization problem to discourage complex models and improves the generalization of the model on unseen data.

The designed prediction model is able to perform univariate forecasting. As long as the value of the process variable for the ABC motor remains within the given range, the motor is fully functional and safe to operate.
Once the value goes beyond the limit, the motor needs to stop operating immediately, as it is dangerous and harmful to operate a damaged motor. The process variable can be used to give prediction alerts about the condition of the monitored motor. The prediction identifies potential faults and offers the opportunity to take action beforehand to avoid equipment damage. The figure presents the prediction results of the resistance temperature detector (RTD-2) for the training and testing data.
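The two quantities used in this section, the RMSE of a prediction and the check of a forecast against the permitted operating range, can be sketched as below. The function names, the forecast values and the limits of 40 and 70 are hypothetical and serve only to illustrate how an early warning could be raised from a multi-step forecast.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def first_out_of_range(forecast, lower, upper):
    """Index of the first forecast value outside [lower, upper], or None if all are inside."""
    for step, value in enumerate(forecast):
        if value < lower or value > upper:
            return step
    return None

print(rmse([0.0, 0.0], [3.0, 3.0]))                           # 3.0
# Hypothetical temperature forecast that drifts above an upper limit of 70.
print(first_out_of_range([60.0, 68.0, 75.0, 80.0], 40.0, 70.0))  # 2
```

Returning the step index rather than a simple flag indicates how much lead time remains before the predicted limit violation.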

Conclusion
Equipment failure in the oil and gas industry can be forecast using predictive maintenance to prevent major loss. In this paper, fourteen parameters were collected from an industrial-grade air booster compressor (ABC) motor. These parameters are fed sequentially into the LSTM model to obtain an optimal set of model weights. The trained model is used to predict the signal for 30 minutes of sampling after the current timestamp. The major advantage of using LSTM is that it does not require expert knowledge or feature engineering. Through its gates, the LSTM can capture long-term dependencies and discover meaningful features. In addition, LSTM can perform multi-step-ahead prediction with high accuracy. In this study, the model was trained on data obtained from a real plant. Several simulations were carried out to find the optimum model parameters for higher prediction accuracy. The only disadvantage of the proposed algorithm is its longer training time. However, network training is an offline task, so training time is not crucial for the application.