Gaussian Process-Based Model Predictive Control for Autonomous Underwater Vehicles

Traditional MPC algorithms, which assume constant model parameters, suffer from performance degradation caused by model mismatch. This paper addresses the enhancement of predictability in Autonomous Underwater Vehicles (AUVs) under uncertain disturbances and unknown system dynamics through the design of Model Predictive Controllers (MPCs). We propose a hybrid model that integrates a first-principles nominal model with a learning-based model built on Gaussian Processes (GPs). The algorithm addresses model mismatch in AUV motion control by constructing a GP model that captures the dynamic characteristics of the process by collecting and learning the deviations between the reference model and the controlled system. Additionally, the GP model allows stochastic constraints to be transformed into deterministic convex constraints, enhancing safety guarantees in complex and challenging environments. The effectiveness of the proposed algorithm is demonstrated through two simulation examples.


Introduction
Autonomous Underwater Vehicles (AUVs) find extensive application in tasks that demand reliability and enhanced autonomy, such as surveillance, monitoring, search and rescue operations, and automated transportation [1,2]. These systems possess a high degree of intelligence and autonomy. Because AUV kinematic and dynamic models are strongly nonlinear, predictive control, which handles constraints well in both linear and nonlinear control problems, is well suited to AUV control [3,4,5].
However, selecting a suitable predictive model in MPC requires balancing control computation complexity against effectiveness. It is worth noting that AUVs generate a substantial amount of operational data, making data utilization for model optimization and calibration a crucial research endeavor [6].
Data-driven modeling approaches have gained widespread application in recent years, including their use as predictive models in MPC [7,8,9]. However, relying exclusively on data-driven models for control purposes faces challenges and lacks acceptance in control engineering, often leading to failures. For systems such as AUVs that allow mechanistic analysis, a viable solution is to employ mechanistic models as the system's baseline and appropriately incorporate data-driven models to capture system dynamics [10]. Zhang et al. [11] designed a learning-based vehicle motion MPC algorithm that estimates unknown system parameters while ensuring non-increasing estimation errors.
Despite the advantages of learning-based control methods in capturing complex dynamics, ensuring constraint satisfaction in unstructured environments remains a challenge.
To overcome the overfitting issue of online learning algorithms, researchers have proposed different approaches. Shao et al. [12] introduced a weighted Gaussian Process regression (GPR) model, while Särkkä et al. [13] developed a hybrid model combining first principles and non-parametric GP models, enabling a stochastic control method. Wu et al. [14] integrated GP with online optimization techniques to enhance controller performance while maintaining constraints. While GP is widely used in learning-based model predictive control (LBMPC) for addressing model mismatch [15], existing approaches often assume knowledge of the relationship between residual and true dynamics, limiting the GP model's ability to accommodate uncertainty.
The main contributions of this research can be summarized as follows:
1) Proposal of a novel trajectory tracking model for AUVs: The research introduces a new approach that combines a first-principles-based model with a data-driven GP model. This integration allows for improved trajectory tracking performance in AUVs.
2) The GP model utilizes uncertainty models associated with the state and input to establish the relationship between the controlled object's state and the actual state. This GP-based approach can rectify estimation results and adjust the predicted output of the first-principles model.

Kinematics of AUV
This paper focuses on the horizontal motion of a 3-DOF AUV (surge, sway, and yaw) under the assumption that the roll and pitch angles are small, which is a valid approximation for conventional AUVs. By disregarding the elements associated with heave, roll, and pitch, we arrive at the following simplified expression:

$$\dot{\eta} = R(\psi)\,\mathbf{v}, \qquad R(\psi) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix},$$

where $\eta = [x, y, \psi]^T$ denotes the position and orientation, and $\mathbf{v} = [u, v, r]^T$ denotes the velocities.

Dynamics of AUV
The equations of motion for an AUV comprise statics and dynamics. Statics deals with the equilibrium of the vehicle when it is at rest or moving at a constant velocity, while dynamics focuses on the vehicle's accelerated motion. The 3-DOF nonlinear dynamic equations of motion can be stated concisely as

$$M\dot{\mathbf{v}} + C(\mathbf{v})\,\mathbf{v} + D(\mathbf{v})\,\mathbf{v} + g(\eta) = \tau,$$

where $D(\mathbf{v})$ denotes the damping matrix. Assuming non-coupled motion of the AUV, the damping matrix is approximated by its diagonal terms. The system inertia matrix, $M = \mathrm{diag}(M_x, M_y, M_\psi)$, includes the added mass. The restoring force, $g(\eta)$, is set to zero, and the centers of gravity and buoyancy are vertically aligned along the z-axis. The thrust force (control input) is denoted as $\tau = [F_u, F_v, F_r]^T$, and the Coriolis-centripetal matrix, $C(\mathbf{v})$, incorporates the added mass. This simplification is achieved when the body coordinate origin, $(x_b, y_b, z_b)$, coincides with the center of gravity of the AUV.
By combining the kinematic and dynamic equations, we establish the system model for the AUV tracking control problem.
The AUV system (5), with state $x = [x, y, \psi, u, v, r]^T$, is inherently nonlinear, which makes it difficult to handle with linear controllers. In this context, MPC emerges as an appealing solution to address the nonlinearity in the AUV model.
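As a minimal sketch of how the combined model can be evaluated numerically, the following Python function returns the time derivative of the state $[x, y, \psi, u, v, r]^T$ for a thrust input $[F_u, F_v, F_r]^T$. It assumes the standard planar kinematics, a diagonal inertia matrix, zero restoring force and, as a simplification not taken from the text, linear diagonal damping. The function names, the default mass and damping values, and the forward-Euler step are hypothetical and chosen only for illustration.

```python
import math


def auv_state_derivative(state, tau, mass=(1.0, 1.0, 1.0), damping=(0.0, 0.0, 0.0)):
    """Time derivative of the 3-DOF AUV state [x, y, psi, u, v, r].

    mass = (M_x, M_y, M_psi) and damping = (d_u, d_v, d_r) are placeholder
    values; linear diagonal damping is an assumption of this sketch.
    """
    _, _, psi, u, v, r = state
    f_u, f_v, f_r = tau
    m_x, m_y, m_psi = mass
    d_u, d_v, d_r = damping

    # Kinematics: rotate body-frame velocities into the inertial frame.
    x_dot = u * math.cos(psi) - v * math.sin(psi)
    y_dot = u * math.sin(psi) + v * math.cos(psi)
    psi_dot = r

    # Dynamics: M * vel_dot = tau - C(vel) * vel - D * vel, with g(eta) = 0.
    # For diagonal M, C(vel) * vel = [-M_y v r, M_x u r, (M_y - M_x) u v].
    u_dot = (f_u + m_y * v * r - d_u * u) / m_x
    v_dot = (f_v - m_x * u * r - d_v * v) / m_y
    r_dot = (f_r + (m_x - m_y) * u * v - d_r * r) / m_psi
    return (x_dot, y_dot, psi_dot, u_dot, v_dot, r_dot)


def euler_step(state, tau, dt, **model):
    """One forward-Euler step; a crude integrator used only for illustration."""
    deriv = auv_state_derivative(state, tau, **model)
    return tuple(s + dt * d for s, d in zip(state, deriv))
```

An MPC prediction model would call such a step function repeatedly over the horizon; a higher-order integrator would normally replace the Euler step in practice.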

Gaussian Process Based Model Predictive Control
In this section, we use the kinematic model of the AUV and the GP-based MPC algorithm to design a controller and to investigate the relationship between the speed and position of the AUV.
The kinematic model allows the AUV to achieve omnidirectional motion in three degrees of freedom when traveling in a plane. The objective of this controller is to enable the AUV to closely follow the desired trajectory, minimize error states, enhance stability, and improve the reliability of the vehicle during operation.

Model Predictive Control
We consider the general nonlinear dynamical system in equation (5). Both the state vector $x \in \mathbb{R}^{n_x}$ and the input vector $u \in \mathbb{R}^{n_u}$ are assumed to vary within bounded ranges, and the controlled system is assumed to be free of measurement noise. To ensure the effectiveness of the GPR model and avoid interference from measurement noise, the modelable part $f(\cdot)$ of the controlled system is assumed to be Lipschitz continuous.
In practical systems, constraints on the system's input, output, and intermediate variables are often required by physical limitations of the equipment or by safety requirements during operation. As a consequence, both the system state and input are subject to constraints defined by the bounds $x_{\min}$, $x_{\max}$, $u_{\min}$ and $u_{\max}$, each element of which is a positive real number.

Define the deviation state as the deviation of the system state from its reference; the deviation state then obeys its own dynamics equation (9). Redefining the control input $\tilde{u}_k \in \tilde{U} \subseteq \mathbb{R}^{n_u}$ and the function $F(\cdot, \cdot)$ of the deviated state dynamics model, equation (9) can be rewritten in terms of the deviation state and $\tilde{u}_k$.

To achieve effective tracking of AUVs, it is imperative to employ a performance metric function that enables fast and smooth tracking. We have selected a quadratic cost function in which $Q > 0$, $R > 0$ and $P > 0$ are symmetric weighting matrices, and the decision variables are defined by a control policy $\Pi = \{\pi_0, \ldots, \pi_N\}$ over the prediction horizon $N$. The objective function comprises a scalar stage cost $J$.
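One common form of such a cost, assumed here because it matches the weights $Q$, $R$ and $P$ named above, is $J = \sum_{k=0}^{N-1} (\tilde{x}_k^T Q \tilde{x}_k + \tilde{u}_k^T R \tilde{u}_k) + \tilde{x}_N^T P \tilde{x}_N$, where $\tilde{x}_k$ and $\tilde{u}_k$ are the deviation state and input. The following Python sketch evaluates it for given deviation sequences. The function names are hypothetical, and the weights are passed as diagonals purely to keep the example free of dependencies.

```python
def weighted_square(vec, diag_weights):
    """Quadratic form vec^T W vec for a diagonal weight matrix W."""
    return sum(w * e * e for w, e in zip(diag_weights, vec))


def tracking_cost(dev_states, dev_inputs, q_diag, r_diag, p_diag):
    """Finite-horizon quadratic cost (assumed form, diagonal weights).

    dev_states holds the N + 1 deviation states over the horizon and
    dev_inputs the N deviation inputs applied between them.
    """
    if len(dev_states) != len(dev_inputs) + 1:
        raise ValueError("expected N + 1 states for N inputs")
    # Stage cost accumulated over steps 0 .. N - 1.
    stage = sum(
        weighted_square(x, q_diag) + weighted_square(u, r_diag)
        for x, u in zip(dev_states[:-1], dev_inputs)
    )
    # Terminal cost on the last predicted deviation state.
    terminal = weighted_square(dev_states[-1], p_diag)
    return stage + terminal
```

In an MPC loop this value would be minimized over the input sequence at every sampling instant, and only the first input would be applied.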

A recursive GPR-based online modeling approach for model mismatch
In this paper, we examine the compensation of model mismatch in the MPC rolling optimization process. We assume that the prediction error, arising from both model mismatch and measurement noise, affects the actual and predicted system outputs. Consequently, we establish a relationship that links the prediction error of the system with the corresponding control action through $g$, the mismatch model between the real system model and the nominal model. Moreover, the prediction error $e(k)$ is assumed to be white noise with zero mean and a variance of $\sigma^2$.
The system control action and prediction error at moment $k$ can be described by a linear regression model, which takes into account the control action and prediction error data for the past $T$ moments. Estimating the mismatch model is therefore a nonparametric model identification problem. We again use the nonparametric GP-based identification algorithm mentioned above and assume a GP prior for the mismatch model, whose covariance is a kernel matrix $K$ of dimension $n \times n$; each element of the kernel matrix is determined by the kernel function $k$.
Since the prediction error vector and the mismatch parameter vector satisfy the linear regression model, the prediction error vector $d$ follows a Gaussian distribution, and the parameter vector of the mismatch model and the prediction error vector are jointly Gaussian. The posterior distribution of the mismatch impulse response model given the prediction error then follows by conditioning this joint distribution.

The mismatch impulse response model obtained from the previous identification is denoted as $\theta_{-1}$. Taking it as the prior information of the mismatch model at the current moment, the mismatch model and the prediction error are again jointly Gaussian, and the posterior distribution of the mismatch model given the prediction error is obtained in the same way.

To train the GPR model, it is crucial to estimate the hyperparameters associated with the kernel function $k$. The hyperparameters, denoted by $\eta$, are estimated by maximizing the marginal likelihood function. From the derivation of the recursive GP-based mismatch model and the hyperparameter solution above, the estimate of the mismatch model depends not only on the hyperparameters $\eta$ but also on the variance of the white noise, $\sigma^2$. However, $\sigma^2$ is unknown and needs to be estimated in the actual process. One simple and effective way is to estimate the mismatch impulse response model using least squares and then use the sample variance of the residuals as an estimate of the noise variance.
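The update described above can be sketched in a few lines of Python. The sketch assumes the regression model $d = \Phi\theta + e$ with $e \sim N(0, \sigma^2 I)$ and a Gaussian prior on $\theta$ whose covariance is the kernel matrix $K$. The regressor matrix $\Phi$, the squared-exponential kernel and all function names are assumptions made for illustration, not the specific choices of this paper.

```python
import numpy as np


def se_kernel_matrix(n, variance=1.0, length_scale=2.0):
    """Squared-exponential kernel over lag indices 0..n-1 (assumed kernel)."""
    idx = np.arange(n, dtype=float)
    diff = idx[:, None] - idx[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_update(prior_mean, prior_cov, Phi, d, noise_var):
    """Posterior of theta given d = Phi @ theta + e, e ~ N(0, noise_var * I)."""
    S = Phi @ prior_cov @ Phi.T + noise_var * np.eye(len(d))  # covariance of d
    gain = np.linalg.solve(S, Phi @ prior_cov).T  # prior_cov @ Phi.T @ inv(S)
    post_mean = prior_mean + gain @ (d - Phi @ prior_mean)
    post_cov = prior_cov - gain @ Phi @ prior_cov
    return post_mean, post_cov


def log_marginal_likelihood(prior_mean, prior_cov, Phi, d, noise_var):
    """Log density of d under the prior; maximized over kernel hyperparameters."""
    S = Phi @ prior_cov @ Phi.T + noise_var * np.eye(len(d))
    resid = d - Phi @ prior_mean
    _, logdet = np.linalg.slogdet(S)
    quad = resid @ np.linalg.solve(S, resid)
    return -0.5 * (quad + logdet + len(d) * np.log(2.0 * np.pi))


def estimate_noise_variance(Phi, d):
    """Least-squares fit followed by the sample variance of the residuals."""
    theta_ls, *_ = np.linalg.lstsq(Phi, d, rcond=None)
    resid = d - Phi @ theta_ls
    dof = max(len(d) - Phi.shape[1], 1)  # guard against a non-positive divisor
    return float(resid @ resid) / dof
```

Calling `gp_update` again with the returned mean and covariance as the new prior processes the next batch of prediction errors, which gives the recursive form. For this linear-Gaussian model the result coincides with a single update on the stacked data.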

Uncertain output transformation based on Gaussian process
Because the MPC controller model is itself uncertain, the prediction of the future output $x(k)$ is inherently uncertain. Given that the MPC controller model follows a GP, imposing a strict constraint on the output at this stage may result in an excessively conservative or even infeasible controller. To address this issue, the traditional hard constraint on the output is replaced by a chance constraint.
In the chance constraint, $\Pr(A)$ denotes the probability of event $A$ occurring, and $\alpha$ denotes the confidence level with which the output constraint must be satisfied. Because the process output is predicted using a linear model with Gaussian-distributed model parameters, the chance constraint can be reformulated as a deterministic convex constraint on the process input.
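For a scalar output that is Gaussian with predicted mean $\mu$ and standard deviation $s$, the chance constraint $\Pr(y \le y_{\max}) \ge \alpha$ is equivalent to the deterministic condition $\mu + z_\alpha s \le y_{\max}$, where $z_\alpha$ is the $\alpha$-quantile of the standard normal distribution. The Python sketch below applies this tightening to a two-sided bound, treating each side separately at level $\alpha$. The function names and the per-side treatment are assumptions made for illustration.

```python
from statistics import NormalDist


def tightened_bounds(lower, upper, std, alpha):
    """Bounds on the predicted mean such that each one-sided constraint
    on a Gaussian output holds with probability at least alpha."""
    if not 0.5 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0.5, 1)")
    if std < 0.0:
        raise ValueError("std must be non-negative")
    z = NormalDist().inv_cdf(alpha)  # standard normal quantile z_alpha
    return lower + z * std, upper - z * std


def satisfies_chance_constraint(mean, std, lower, upper, alpha):
    """Check the deterministic surrogate of the chance constraint."""
    lo, hi = tightened_bounds(lower, upper, std, alpha)
    return lo <= mean <= hi
```

A larger predictive standard deviation shrinks the admissible interval for the mean, so the controller becomes more cautious exactly where the GP model is less certain.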

Simulation
Case 1
The first reference trajectory in the simulation is a sinusoidal trajectory. The initial condition is $(0, -0.5, \pi/2, 0, 0, 0)$. In the cost function, the weighting matrices are chosen as $Q = \mathrm{diag}(10^5, 10^5, 10^3, 10^2, 10^2, 10^2)$, $R = \mathrm{diag}(10^{-4}, 10^{-4}, 10^{-4}, 10^{-4})$ and $F = \mathrm{diag}(10^3, 10^3, 10^2, 10, 10, 10)$. In the initial phase of control, the controller gives a large control input because of the large deviation of the current position of the AUV from the reference position; the control input tends to 0 after 2 s and remains stable. The LMPC controller effectively utilizes the onboard propulsive capability, ensuring rapid convergence while adhering to system constraints. This implies that the control commands remain within the expected allowable range.

Conclusion
In this research paper, we develop a dynamic model for the movement of an AUV and utilize a model predictive controller to govern the system. To address the disparity between the model and the actual AUV dynamics, we propose a GPR-MPC algorithm. By incorporating GPR into the controller, we are able to learn and account for the deviation between the model predictions and the actual dynamics of the AUV. This controller effectively achieves precise trajectory tracking even in the presence of random disturbances and modeling errors, enabling adaptation to unforeseen changes like movement obstacles and discrepancies in the model. The proposed GPR-MPC trajectory tracking control algorithm enhances stability and enables faster attainment of the desired tracking.

Figure 2: Error between actual state and expected state.