Research on Underdriven HOV Fault Diagnosis based on Physical Attention-Multidimensional Recurrent Neural Networks

In recent years, underdriven human occupied vehicles (HOVs) have gained popularity owing to their maneuverability and efficiency. However, the complexity of their propulsion systems, coupled with the harsh underwater environment, makes fault diagnosis challenging. To address this issue, this study proposes a fault diagnosis approach for underdriven HOV propulsion systems based on Physical Attention-Multidimensional Recurrent Neural Networks (PA-MDRNN). The PA-MDRNN model is trained on a dataset comprising data from both normal and faulty HOV propulsion systems, and the trained model is then used for real-time identification and diagnosis of faults. Unlike conventional data-driven models, our approach introduces a novel non-consistent attention mechanism based on fundamental computational physical properties. This mechanism operates within a hybrid network, where attention is computed on the multidimensional mapping results. The computation strengthens the focus on desired features that adhere to traditional physical properties, thereby excluding training results that contradict established physical laws and reinforcing the adherence of the final training results to physical properties.


Introduction
Owing to their technical advantages in on-site observation and underwater operation, deep-sea human occupied vehicles (HOVs), such as the Alvin of the U.S. and the Jiaolong of China, are highly valued. As an important tool for ocean exploration, a deep-sea HOV must be safe and reliable when working underwater. Fault diagnosis has therefore become one of the most important research topics in the deep-sea HOV field in recent years [1]. The underwater operations of a deep-sea HOV are carried out by a thruster system that is exposed to seawater for long periods, so thruster faults are among the most common fault sources of deep-sea HOVs. An effective and accurate fault diagnosis method for deep-sea HOV thrusters is thus urgently needed [2]. In reported underwater vehicle fault diagnosis studies [3][4][5][6][7][8], propeller failures of HOVs have been treated crudely as only two failure cases. The most obvious problem with this approach is that it ignores propeller failures of varying severity: minor failures are often mistaken for no failure, while severe failures are treated as complete failures. As a result, the diagnosis results of such methods differ significantly from the actual situation. To improve the accuracy of thruster fault identification, self-organizing map (SOM) neural networks have been introduced into HOV thruster fault diagnosis [9][10][11]. The more fault scenarios are considered, the more accurate the diagnosis becomes and the closer it is to the actual situation. However, in real-world applications of deep-sea HOVs, thruster failures are closely related to the external environment, and the degree of failure is uncertain and constantly changing. Three or more fault cases cannot cover all fault situations, which inevitably affects the accuracy of fault identification. Therefore, there is an urgent need for a new thruster fault diagnosis method that can effectively diagnose continuously changing and unknown failure modes.
With the development of artificial intelligence, multi-sensor information fusion methods based on artificial neural networks (ANNs) have gradually become a key research area [12,13], and the accuracy of fault identification has improved greatly. Traditional ANNs, such as back propagation neural networks (BPNNs), generally have only one hidden layer and are prone to converging to a local optimum. Unlike the BPNN, the SOM is an unsupervised learning network that constructs an ordered feature map in the output layer reflecting the distribution of the input patterns. Its main disadvantage is that the output is discrete: it always matches a new pattern to the closest pattern among the input patterns. The extracted features are not rich enough to fit complex nonlinear functions, so fault diagnosis methods based on BPNN and SOM inevitably have large errors. Other large deep neural networks, such as convolutional neural networks (CNNs), often overfit or fail to converge when applied to fault diagnosis with few fault training samples. In contrast, deep belief networks (DBNs) with multiple hidden layers can extract more fault features from experimental training samples; through unsupervised layer-wise learning and fine-tuning of the network parameters, they achieve excellent local generalization ability, which gives them a unique advantage in classification [14][15][16][17][18].
Based on the above analysis, and in conjunction with data on HOV propulsion systems and their fault characteristics, this paper presents a novel fault diagnosis method for HOV propulsion systems named PA-MDRNN. Unlike conventional purely data-driven models, this method incorporates an attention mechanism. First, the collected data are preprocessed, and a convolutional neural network (CNN) captures the local spatial characteristics of the monitoring signals obtained from the multiple sensors of the HOV propulsion system. Simultaneously, the standard operating condition data are trained using a bidirectional long short-term memory (BiLSTM) network, and the two sets of data are mapped to yield multidimensional features. Subsequently, a non-consistent attention mechanism grounded in traditional computational physical properties computes attention on the multidimensional mapping results. This computation amplifies the emphasis on desired features that conform to traditional physical properties, thereby excluding training results that contradict established physical laws and strengthening the adherence of the training outcomes to physical properties. Finally, the fault output of the HOV propulsion system is obtained through an activation function. To validate the proposed method, an experiment was conducted using a thruster propeller with specifications similar to those of the Jiaolong. The experimental results demonstrate that the proposed method outperforms existing approaches in fault diagnosis accuracy and in consistency with traditional physical laws, and that it meets the requirements of practical applications.
In the structure shown in Fig. 1, the right module is the attention mechanism module. It first performs multidimensional feature mapping on the feature data processed by the left module. It then applies the non-consistent attention mechanism based on traditional computational physical properties to the results of the multidimensional mapping, raising the attention paid to desired features that satisfy traditional physical properties. This excludes training results that are contrary to traditional physical laws and strengthens the physical characteristics of the training results. Finally, the fault output of the HOV propulsion system is obtained after the activation function.

CNN-based local feature extraction module
In fault detection, when the raw input data have too many dimensions, solving the problem with neural networks incurs a higher time cost and is prone to overfitting. Considering the local correlations between the HOV monitoring variables and early faults, as well as the temporal ordering of the sequence data, a CNN is used to extract local features from the input data in order to speed up training and improve generalization. The one-dimensional CNN is calculated as follows:
where the input is the HOV detection sequence, $w_t$ and $u_t$ are the CNN weight matrices, $b_t$, $\beta_t$ and $\varepsilon_t$ are the bias terms, $c_t$ is the convolutional output feature, $r$ is the neural network activation function, $\eta$ is the max-pooling subsampling function, max takes the maximum value of the specified region of the computed feature, which highlights the local critical fault information, and $z_t$ denotes the output of the pooling layer. Pooling has two main advantages.
(i) Removing redundant information in the features, which reduces their dimensionality, simplifies the computation of the network, improves training efficiency and reduces training cost.
(ii) Reducing the shift in the estimated mean caused by parameter errors in the convolutional layer.
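The convolution, activation and max-pooling steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the difference kernel, the toy signal and the function names `conv1d_relu` and `max_pool1d` are assumptions chosen for illustration, and only the pooling window of 3 is taken from the hyperparameters reported later in the paper.

```python
def conv1d_relu(x, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation form) followed by ReLU."""
    k = len(kernel)
    features = []
    for i in range(len(x) - k + 1):
        s = sum(kernel[j] * x[i + j] for j in range(k)) + bias
        features.append(max(0.0, s))  # ReLU plays the role of the activation r
    return features


def max_pool1d(c, window=3):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(c[i:i + window]) for i in range(0, len(c) - window + 1, window)]


# Toy monitoring signal with a short spike (hypothetical data).
signal = [0, 1, 2, 3, 2, 1, 0, 5, 0, 0]
# Hypothetical difference kernel: responds to rising edges in the signal.
c = conv1d_relu(signal, kernel=[-1.0, 1.0])
z = max_pool1d(c, window=3)
```

In this toy case the pooled output keeps the largest response in each window, so the spike survives while the sequence shrinks from nine features to three, which is the dimensionality reduction described in point (i).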

BiLSTM network structure
A bidirectional LSTM, with forward and backward hidden layers, encodes the input information in both directions and can capture information that a unidirectional LSTM may miss. In this paper, BiLSTM is used for bidirectional extraction of the local spatial correlation features of the input. $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ are the state information of the forward and backward hidden layers associated with the current input fault sequence, respectively, and both are computed by a standard LSTM. The final output at each time step consists of the forward and backward hidden states spliced together.
Eq. 5 couples the sequences $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ by means of a series function $\sigma$ to obtain the hidden state $h_t$, which contains information on the correlation between early HOV failures and the variables.
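The bidirectional pass and the splicing of the two hidden states can be sketched as follows. To keep the example short, a scalar tanh recurrent cell stands in for the LSTM cell, so this shows only the data flow, not the gating; the weights `w_x` and `w_h` and the function names are hypothetical.

```python
import math


def rnn_scan(xs, w_x=0.5, w_h=0.5):
    """Run a scalar tanh recurrent cell over xs (a stand-in for an LSTM cell)."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


def bidirectional(xs):
    """Return one (forward, backward) hidden-state pair per time step."""
    forward = rnn_scan(xs)
    # Scan the reversed sequence, then flip back so index t lines up again.
    backward = rnn_scan(xs[::-1])[::-1]
    return list(zip(forward, backward))


h = bidirectional([1.0, 0.0, -1.0])
```

Each element of `h` pairs a state that has seen the past of the sequence with one that has seen its future, which is what the splicing in Eq. 5 provides to the later layers.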

A hybrid model of two neural networks
The hybrid neural network model fuses a CNN local signal feature extraction module with a BiLSTM network. First, a one-dimensional CNN extracts local features of the fault variables. The BiLSTM then captures, in both directions, the long-term contextual dependencies between HOV early fault variables. Finally, a fully connected layer integrates the learned fault features and outputs the early fault prediction. The hybrid model uses ReLU as the nonlinear activation function $r$, which reduces the computation required for the error gradient, speeds up convergence and helps the model escape local optima. L2 regularization is added to limit the magnitude of the learned weights, so that the model cannot arbitrarily fit random noise in the data, which prevents overfitting.

Traditional Physical Properties Attention Mechanisms
The attention mechanism originates from image recognition research and draws on the fact that the human brain attends unequally to different parts of a scene, so as to highlight the weights of key factors. Neural network methods based on attention are now widely used and have achieved good results. The time series data input to the hybrid model are activated by the softmax activation function. However, the diagnostic accuracy of the hybrid model reaches a bottleneck as the length of the time series increases. Given the strong temporal correlation and physical characteristics of the HOV monitoring data, whether the temporal information is fully used determines the model's ability to diagnose early failures, and training can easily produce results that completely contradict traditional physical laws. Therefore, we propose a traditional physical property attention mechanism combined with temporal information and embed it into the hybrid model. It dynamically selects the sequence states associated with HOV faults, highlights the key information that is consistent with the physical properties, adaptively obtains the hidden state weights of the corresponding time steps and suppresses inconsistent information, which helps the model make more accurate decisions. The physical information attention weight at the current moment is affected by the hidden layer state $h_\partial$ ($\partial = 1, 2, \cdots, N$) of the BiLSTM unit and depends on the unit states $h_{m-1}$, $h_{m+1}$ and hidden layer states $h_{t-1}$, $h_{t+1}$ of the BiLSTM unit at the previous and next moments. From these, the physical information attention weight $d_t^{\partial}$ of the fault variable at the current moment is calculated as shown in Eq. 6.
where Q and W are the multilayer perceptron weight matrices in the physical information attention mechanism, K is the bias term, and tanh is the hyperbolic tangent activation function.
So that the weight coefficients $d_t^{\partial}$ sum to 1, they are normalized using the softmax activation function, as shown in Eq. 7.
The $g_t^{\partial}$ derived in Eq. 7 quantifies the importance of the fault variable features of hidden layer $\partial$ in the prediction of HOV faults at the current moment. By a weighted summation of the states of the corresponding hidden layers with the weights $g_t^{\partial}$, comprehensive information about the state characteristics of the predicted time series, $c_t$, is obtained as shown in Eq. 8.
The extracted local spatial features $v_t$ of the multidimensional correlation variables and $c_t$ are fused as the input $x_t$ to the BiLSTM cell network.
In Eq. 9, $W$ and $b$ are the weights and bias of the BiLSTM cells, respectively. After the physical information attention is introduced, the degree of relevance to the physical information at the critical moments before and after is taken into account to obtain the hidden layer state $h_t$ at moment $t$, as shown in Eq. 10.
In Eq. 10, $f$ is a BiLSTM network cell whose input is no longer the information extracted from the raw data but a weighted feature that encodes the magnitude of the association weights between the HOV detection variables and the faults.
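The scoring, normalization and weighted summation of Eqs. 6 to 8 follow the usual additive attention pattern, which can be sketched as below. The exact form of Eq. 6 is not reproduced here: the score `tanh(q . h_i + w . ref + k)`, the use of weight vectors rather than matrices, and the names `attention_context` and `ref` are assumptions made only for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(hidden, q, w, k, ref):
    """hidden: N hidden-state vectors; ref: a neighbouring state (assumed input).

    Hypothetical score in the spirit of Eq. 6: tanh(q . h_i + w . ref + k).
    """
    scores = [math.tanh(dot(q, h) + dot(w, ref) + k) for h in hidden]
    weights = softmax(scores)  # normalized weights, as in Eq. 7
    dim = len(hidden[0])
    # Weighted sum of the hidden states, as in Eq. 8.
    context = [sum(g * h[j] for g, h in zip(weights, hidden)) for j in range(dim)]
    return weights, context


hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_context(
    hidden_states, q=[1.0, 1.0], w=[0.0, 0.0], k=0.0, ref=[0.0, 0.0]
)
```

With these toy values the third state receives the largest weight because its score is highest, and the weights sum to 1 as required by the softmax normalization.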

PA-MDRNN model training
The overall hybrid structure fuses BiLSTM networks with a local feature extraction module and a physical information attention mechanism. After the acquired monitoring data are preprocessed, the CNN extracts local features from the multidimensional data. The BiLSTM network then performs automatic feature learning and feature encoding, and the physical information attention mechanism strengthens the key factors that influence the HOV fault output by adaptively extracting the feature contributions of the monitoring variables, capturing the semantic relationship between the monitoring variables and early faults. Finally, the HOV early fault $y$ is predicted according to Eq. 11.
A cross-entropy loss function is used to calculate the loss between the predicted values and the actual faults, which avoids the problem of slow weight updates, as shown in Eq. 12.
where $n$ is the number of HOV fault states, $j$ is the specific state class, $\hat{y}$ represents the state output of the target, and $y$ represents the actual state of the model output. The network parameters $\theta$ are obtained by gradient-based optimization using the Adam optimizer.
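A minimal sketch of the cross-entropy loss for one sample is given below, assuming a one-hot target and a softmax output layer (the softmax output is an assumption, since Eq. 11 is not reproduced here). The arguments are named `target` and `predicted` rather than using the paper's symbols. The helper `grad_wrt_logits` illustrates why weight updates do not slow down: under this assumption the gradient with respect to the logits is simply the prediction minus the target.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a one-hot target and predicted class probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def grad_wrt_logits(target, logits):
    """Gradient of softmax + cross-entropy with respect to the logits."""
    return [p - t for p, t in zip(softmax(logits), target)]


loss = cross_entropy([0.0, 1.0, 0.0], [0.1, 0.8, 0.1])
grad = grad_wrt_logits([0.0, 1.0, 0.0], [0.0, 0.0, 0.0])
```

The gradient contains no derivative of a saturating activation, so a confidently wrong prediction still produces a large update.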
The PA-MDRNN hyperparameters are set as follows. In the local feature extraction module, the convolutional kernel size of the one-dimensional CNN is 64 and the window length of the max-pooling layer is 3. The BiLSTM layer has 64 hidden units, and the time series window covers 80 samples of historical moments. The dropout rate is 0.4, the L2 regularization coefficient is 0.02, the batch size for model training is 128, the learning rate is $1 \times 10^{-3}$ and the number of iterations is 10,000. These hyperparameters are determined by k-fold cross-validation on the training set (k = 5), as follows.
(i) Divide the original training set equally into 5 sub-datasets.
(ii) Select one sub-dataset as the test set and the remaining 4 sub-datasets as the training set.
(iii) Train the model on the 4 sub-datasets, then calculate the accuracy and loss of the model on the test set.
(iv) Repeat steps (ii) and (iii) 5 times, each time selecting a different sub-dataset as the test set.
(v) Calculate the loss and accuracy over the 5 test sets to evaluate the diagnostic performance of the model.
(vi) Adjust the hyperparameters of the model and repeat the above steps so that the predicted values approach the true values, then select the combination of hyperparameters with the best diagnostic performance across the 5 rounds of cross-validation to obtain the optimal model.
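The cross-validation procedure in steps (i) to (v) can be sketched as follows. The `evaluate` callback, which would train a model on the training indices and return its accuracy on the held-out fold, is a hypothetical placeholder, and the contiguous, unshuffled folds are a simplification.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k roughly equal contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(n_samples, evaluate, k=5):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, k=5))
# Dummy evaluation: the fraction of samples held out, standing in for accuracy.
mean_score = cross_validate(10, lambda train, test: len(test) / 10, k=5)
```

Step (vi) would wrap `cross_validate` in a loop over candidate hyperparameter combinations and keep the one with the best mean score. In practice the samples would usually be shuffled, and possibly stratified by fault class, before splitting.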

Failure experiment data analysis
In this paper, experimental data from a T561 propeller with specifications similar to those of the Jiaolong HOV are used for verification; the specific technical parameters are shown in Table 1. The experimental data contain four kinds of faults: propeller loss, propeller winding, propeller jamming and motor blocking. Together with the normal working condition samples, there are 27,815 samples in total, and the dataset information is shown in Table 2.
The propeller signals are acquired as follows: the host computer, through the microcontroller, drives the propeller according to the set control signals and receives the propeller speed signal and the propeller power current signal fed back from the propeller. The control signals include linear frequency-modulated (FM) signals, pulse signals and random signals generated by the program. By applying the various fault modes shown in Fig. 2 to the thruster, a series of thruster signals was collected in the normal state and in the propeller loss, propeller winding (cable), propeller blocking (fishing net) and motor blocking states.
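For illustration, the three types of control signal mentioned above could be generated as in the sketch below. The sampling rate, frequency range, pulse period and amplitude range are hypothetical values; the paper does not state the parameters actually used.

```python
import math
import random


def linear_fm(n, fs, f0, f1):
    """Linear FM (chirp) signal sweeping from f0 to f1 Hz over n samples at fs Hz."""
    duration = n / fs
    rate = (f1 - f0) / duration  # sweep rate in Hz per second
    return [
        math.sin(2 * math.pi * (f0 * t + 0.5 * rate * t * t))
        for t in (i / fs for i in range(n))
    ]


def pulse_train(n, period, width):
    """Rectangular pulses: high for `width` samples out of every `period`."""
    return [1.0 if (i % period) < width else 0.0 for i in range(n)]


def random_signal(n, seed=0):
    """Uniform random commands in [-1, 1]; seeded so that runs are repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


chirp = linear_fm(n=100, fs=100.0, f0=1.0, f1=5.0)
pulses = pulse_train(n=6, period=3, width=1)
noise = random_signal(5, seed=1)
```

Driving the thruster with varied excitations of this kind exposes the speed and current responses over a range of operating points, which is presumably why several signal types were used.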

Analysis and comparison of results
To eliminate the effect of randomness as far as possible, the squared prediction error (SPE) and Hotelling's $T^2$ statistic were used in the experiments for quantitative comparison. SPE is the squared norm of the residual between a sample and its reconstruction, and the process is considered normal if the statistics remain within their control limits.
Once a fault is detected, SPE and $T^2$ can be used to evaluate the performance of the fault detection algorithm. The detection results for the four faults are shown in Fig. 3 and Fig. 4.
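The two monitoring statistics can be sketched in their standard textbook forms, which are assumed here because the paper's own equations are not reproduced. SPE is the squared residual between a sample and its reconstruction, and $T^2$ is the sum of squared principal component scores scaled by their variances. The control limits passed to `is_normal` are hypothetical inputs that would normally be estimated from data recorded under normal operation.

```python
def spe(x, x_hat):
    """Squared prediction error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def hotelling_t2(scores, variances):
    """Hotelling's T^2 for uncorrelated scores with the given variances."""
    return sum(t * t / v for t, v in zip(scores, variances))


def is_normal(spe_value, t2_value, spe_limit, t2_limit):
    """A sample is treated as normal when both statistics are within limits."""
    return spe_value <= spe_limit and t2_value <= t2_limit


spe_value = spe([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])
t2_value = hotelling_t2([2.0, 1.0], [4.0, 1.0])
normal = is_normal(spe_value, t2_value, spe_limit=5.0, t2_limit=3.0)
```

A sample that exceeds either limit would be flagged as a fault candidate.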
Table 3 shows a comparison of the accuracy of different fault diagnosis methods for the

Conclusion
Based on extensive analysis of operational and fault data accumulated from T561 thrusters, this study presents a novel fault diagnosis method for HOV propulsion systems, named PA-MDRNN, which is based on a hybrid complex neural network and a physical information attention mechanism. The proposed method exploits the correlation between diagnostic data and fault types instead of relying on a mathematical model of the HOV propulsion system, achieving single-system diagnosis and thereby simplifying the diagnostic process and improving diagnostic efficiency. Comparison with a single model shows that the hybrid model has superior fault diagnosis capability for the thruster. Even in the early stages of model training, when diagnostic accuracy is relatively low, the hybrid model achieves an accuracy of over 90%, surpassing the average performance of the CNN and fast wavelet transform methods. After training is complete, the detection accuracy exceeds 95% in all four operating conditions. Considering the strong temporal and physical characteristics of the monitoring variables, a physical information attention mechanism that incorporates temporal information is introduced to enhance the feature fusion capability of the model. The method provides a means for accurate and rapid fault diagnosis of HOV propulsion systems. Exploiting the deep information embedded in both fault and normal data with existing techniques is important for improving the safety and reliability of deep-sea submersibles, and it provides reference support for the health management and timely maintenance of HOV systems. However, because supervised learning experiments cannot cover all fault types, future research will explore semi-supervised or unsupervised learning methods for HOV health management and optimize these approaches to achieve high-precision monitoring of unknown operating conditions. Furthermore, while complex models can improve detection accuracy, they also entail substantial computational costs, so reducing training costs is another important focus of our future research.

Fig. 1 illustrates the complete PA-MDRNN structure. The left module, the data preprocessing module, is divided into an upper and a lower step. The local spatial characteristics of the

Figure 4. Loss of propeller and motor stalling

Table 1. Technical parameters of the thruster

Table 2. Propulsion dataset information
Thruster failure | Number of training sets | Number of validation sets

Table 3. Comparison of detection accuracy of different methods (%)
T561 thruster. It can be seen that the purely data-driven CNN method does not improve the accuracy much over the traditional fast wavelet transform method, and the advantage is not