Modeling and Optimization of Security Robot Gait Balance Based on the Principle of Inverted Pendulum

Robots are increasingly applied in security protection. This study discusses the modeling and optimal control of robot gait balance based on a double inverted pendulum system. First, a trolley-type double inverted pendulum system was obtained by abstracting human gait, and the corresponding mathematical model was constructed through the analytical dynamics method. Then, a controller for the double inverted pendulum system was designed using the linear quadratic regulator (LQR) algorithm, and a gain matrix K with optimal performance was obtained by combining the empirical trial-and-error method with regional dichotomy. Finally, the effectiveness of the modeling and optimization was demonstrated by comparing the system response under typical weight values with that under the optimized value. This study is expected to provide a reference for enhancing the gait balance of security robots.


Introduction
With the rapid development of artificial intelligence (AI) in recent years, robots have entered all walks of life, and robot technologies have attracted extensive attention from experts and scholars in China and abroad. Research on the modeling and optimization of robot gait balance is therefore of great significance.
Because it cannot adapt to ever-changing application environments or to flexible modification of control algorithms, the traditional double inverted pendulum modeling system no longer meets the development needs of robots, so modeling and optimization methods that keep pace with robot technologies urgently need to be improved [1]. Intensive research has been devoted to this field. For instance, Xu et al. [2] studied fuzzy control of the double inverted pendulum and presented two fuzzy control methods, but the stability of the established system was unsatisfactory. Xiong et al. [3] explored a new parameter optimization method for the linear double inverted pendulum LQR controller based on particle swarm optimization (PSO), but the convergence precision of PSO was not high in the later phase, and it could easily become stuck in a local optimum. Xue et al. [4] used three transformation plans to improve the structural control of the double inverted pendulum, but many uncertain factors existed in the real-time control of the established system, so its stability and practical application value were low.
The linear quadratic regulator (LQR) algorithm was the earliest control method applied to research on double inverted pendulum control. It can reach favorable system performance indexes at minimum cost, and it is simple and easy to implement. The empirical trial-and-error method, which is applicable to all kinds of control methods, adjusts parameters proportionally and can thereby improve algorithm efficiency. In this study, the physical model of a trolley-type double inverted pendulum was first obtained by abstracting human gait, and the corresponding mathematical model was constructed via the analytical dynamics method. Next, the LQR algorithm was adopted to design a controller for the double inverted pendulum system. The empirical trial-and-error method was then combined with the idea of regional dichotomy to acquire the optimal gain matrix K, and the system characteristics under the typical values and the optimized value were compared. The results show that the theoretical analysis of the established model accords with the physical laws of the system and that the system characteristics under the optimized value are clearly superior, which demonstrates the effectiveness of the modeling and optimization in this study.

Double inverted pendulum system modeling
The human leg comprises the hip, knee and ankle joints. During walking, the hip joint keeps the upper body balanced, while the knee and ankle joints drive the lower body and help balance the body, so the three joints together make steady walking possible. Because a robot's gait resembles the human gait, the robot joint design takes the leg joint structure as a reference. To study the robot gait balance problem while reducing system complexity and keeping the model scientifically sound, the working principle of the three joints was abstracted into a double inverted pendulum model [1]. Figure 1 shows the structure of the abstracted trolley-type double inverted pendulum system, which consists mainly of a motor, a belt pulley, synchronous driving belts, a trolley, a guide rail and two pendulum rods. The motor drives the trolley in reciprocating rectilinear motion along the guide rail, which is not frictionless, via the belt pulley and driving belt. Rod #1 is hinged on the trolley, Rod #1 and Rod #2 are connected by a hinge, and the two rods can rotate freely within the vertical plane parallel to the guide rail. Driven by the motor, the trolley reciprocates on the guide rail so that the double inverted pendulum stays balanced. The following hypotheses are made in the modeling process: 1) the upper and lower rods do not elongate; 2) there is no slip between the belt pulley and the driving belt, and the driving belt does not elongate; 3) the frictional drag on the trolley is directly proportional to the trolley speed, the frictional drag torque on Rod #2 is directly proportional to the relative rotational speed between Rod #1 and Rod #2, and the frictional drag torque on Rod #1 is directly proportional to the rotational speed of Rod #1.
The physical parameters of the trolley-type double inverted pendulum system are listed in Table 1. The double inverted pendulum system is a multi-variable, nonlinear, strongly coupled and unstable system. In the linearized model of this system, any initial state that deviates from the equilibrium state can be driven back to equilibrium by an input of unrestricted amplitude (the model is controllable), and the system state can be determined from the measurable output y (the model is observable). The mathematical model of the inverted pendulum system can be established using Lagrange's equation from the analytical dynamics method [5].
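The two properties described above can be checked with the Kalman rank conditions. The following is a minimal pure-Python sketch on a hypothetical 2-state system (a linearized single pendulum with a made-up coefficient of 10), not the six-state model of this study; all function names and numerical values are illustrative assumptions.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(M):
    return [list(row) for row in zip(*M)]


def rank(M):
    """Rank by Gauss-Jordan elimination with exact rational arithmetic."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def controllability_matrix(A, B):
    """Build [B, AB, A^2 B, ..., A^(n-1) B] column block by column block."""
    blocks, current = [], B
    for _ in range(len(A)):
        blocks.append(current)
        current = matmul(A, current)
    return [sum((blk[i] for blk in blocks), []) for i in range(len(A))]


def is_controllable(A, B):
    return rank(controllability_matrix(A, B)) == len(A)


def is_observable(A, C):
    # Duality: (A, C) is observable iff (A^T, C^T) is controllable.
    return is_controllable(transpose(A), transpose(C))


# Hypothetical linearized pendulum: state = [angle, angular speed].
A = [[0, 1], [10, 0]]   # made-up coefficient, NOT from Table 1
B = [[0], [1]]
C = [[1, 0]]            # only the angle is measured

controllable = is_controllable(A, B)
observable = is_observable(A, C)
```

For the six-state model the same functions apply unchanged once A, B and C are filled in from the linearized equations.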

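$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i} + \frac{\partial V}{\partial q_i} + \frac{\partial D}{\partial \dot{q}_i} = F_{q_i} \qquad (1)$$
where $q_i$ is the i-th generalized coordinate; $F_{q_i}$ is the non-potential generalized force acting upon the inverted pendulum; T is the kinetic energy of the system; V is the potential energy of the system; D is its dissipated energy. On this basis, the nonlinear model is obtained from the expressions for the kinetic energy, potential energy and dissipated energy. The specific parameter values are given in Table 1, from which the numerical values of the model coefficients can be calculated.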
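As an illustration of how hypothesis 3 enters equation (1), the dissipated energy can be written as a Rayleigh-type function of the velocities. This is a sketch under stated assumptions: the generalized coordinates are taken to be the trolley displacement $x$ and the rod angles $\theta_1$, $\theta_2$ (both measured from the vertical), and the friction coefficients $f_0$, $f_1$, $f_2$ are assumed symbols that are not taken from Table 1.

```latex
D = \tfrac{1}{2} f_0 \dot{x}^2
  + \tfrac{1}{2} f_1 \dot{\theta}_1^2
  + \tfrac{1}{2} f_2 \left(\dot{\theta}_2 - \dot{\theta}_1\right)^2
\\[4pt]
\frac{\partial D}{\partial \dot{x}} = f_0 \dot{x}, \qquad
\frac{\partial D}{\partial \dot{\theta}_1} = f_1 \dot{\theta}_1 - f_2\left(\dot{\theta}_2 - \dot{\theta}_1\right), \qquad
\frac{\partial D}{\partial \dot{\theta}_2} = f_2\left(\dot{\theta}_2 - \dot{\theta}_1\right)
```

Each partial derivative is linear in the corresponding speed or relative speed, which is how the proportional friction terms of hypothesis 3 appear in the equations of motion.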

System control and optimization
The LQR controller can reach favorable system performance indexes even with the minimum cost, and the method is simple and easy to implement. Moreover, the LQR can acquire the optimal control law of linear state feedback and form the optimal closed-loop control [6].
The LQR state feedback controller is designed for the system described by the following state equation:
$$\dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t)$$
where A is the system matrix; B is the input matrix; C is the output matrix.
Optimizing the system control mainly means minimizing the index J by determining the matrix K that corresponds to the optimal control input u(t); the system characteristics then tend toward the optimum. Here u(t) is a linear state feedback through K, and J is a quadratic performance index in x and u, where x is the state vector; u is the control vector; R is the weight on the control input and is a positive definite matrix; Q is the diagonal weight matrix of the performance index with respect to the state and is a positive definite or positive semidefinite matrix. q is the weight on the angular speed of Rod #2. The Q value affects both the time the system takes to return to steady balance from its initial state and the time it takes to recover its balance after being disturbed. The optimal gain matrix K is determined by the weight matrix Q and the control-energy weight R, so the selection of Q and R is especially important. To bring the system to balance as quickly as possible, K should be calculated by optimizing Q until the quadratic objective function J reaches its minimum. To conform to practical application, R was taken as 1 in this study, and the optimal value of the weight matrix Q was acquired by combining the empirical trial-and-error method with regional dichotomy [6].
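For reference, the standard infinite-horizon LQR problem that matches the symbols defined above is usually written as follows. The factor 1/2 in J and the symbol P (the solution of the algebraic Riccati equation) are assumptions of this sketch; the scaling of J does not change the optimal K.

```latex
u(t) = -K x(t), \qquad
J = \frac{1}{2}\int_{0}^{\infty} \left( x^{T} Q x + u^{T} R u \right) dt
\\[4pt]
K = R^{-1} B^{T} P, \qquad
A^{T} P + P A - P B R^{-1} B^{T} P + Q = 0
```

With R = 1, as chosen in this study, the gain reduces to $K = B^{T} P$, so K depends on Q only through the Riccati solution P.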
As a classical and widely applicable optimization method, the empirical trial-and-error method adjusts the parameters in a certain proportion. During the optimization, the K matrices at the starting point, midpoint and ending point of the given value interval were calculated first, and the absolute value of the element difference between two adjacent K matrices was denoted as E. For the system analyzed in this study, the simulation curves were similar when E<0.5, so no finer division was made in that case. If E>0.5, the region was bisected again and the E values of adjacent K matrices were compared; this cycle continued until E<0.5, at which point the regional division ended. The concrete process is shown in Figure 2. This method can therefore reduce the amount of calculation by half and improve efficiency.
Figure 2. Flowchart of regional dichotomy idea
After the regional division was completed, the midpoint of each region was taken, and the optimal Q value meeting the system design requirement was finally acquired through comparative calculation. According to practical needs, the value range of Q was selected as diag[10,10,10,0,0,0] to diag[600,600,600,0,0,0], a comparative analysis was conducted based on the variation trend of the angle of Rod #2, and the initial state of the system was set as x=[1,0,0,0,0,0]. A unit pulse disturbance was applied at 6 s and lasted 0.3 s.
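The regional division described above can be sketched in a few lines of Python. This is one possible reading of the procedure, not the authors' implementation: the gain function is a stand-in (the closed-form LQR gain of a hypothetical scalar plant x' = a·x + b·u with a = b = 1 and R = 1), and the names `scalar_lqr_gain` and `divide_region` are illustrative.

```python
import math


def scalar_lqr_gain(q, a=1.0, b=1.0, r=1.0):
    """Closed-form LQR gain for the scalar plant x' = a*x + b*u.

    The scalar Riccati equation 2*a*p - (b*b/r)*p*p + q = 0 has the
    positive root p = r*(a + sqrt(a*a + b*b*q/r)) / (b*b), and K = b*p/r.
    This plant is a toy stand-in, NOT the double inverted pendulum.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def divide_region(lo, hi, gain, tol=0.5, min_width=1e-6):
    """Bisect [lo, hi] until the gains at adjacent points differ by < tol.

    Returns a list of (lo, hi) sub-regions. min_width is a safety stop
    added for this sketch so the recursion always terminates.
    """
    e = abs(gain(hi) - gain(lo))          # the quantity called E in the text
    if e < tol or (hi - lo) <= min_width:
        return [(lo, hi)]
    mid = 0.5 * (lo + hi)
    return (divide_region(lo, mid, gain, tol, min_width)
            + divide_region(mid, hi, gain, tol, min_width))


# Divide the weight interval used in the study, then take region midpoints
# as the candidate weights to be compared by simulation.
regions = divide_region(10.0, 600.0, scalar_lqr_gain)
candidates = [0.5 * (lo + hi) for lo, hi in regions]
```

Because the gain changes quickly for small weights and slowly for large ones, the sub-regions come out narrow near the lower end of the interval and wide near the upper end, which is where the saving in calculation comes from.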

Results & Discussion
For simplicity, the parameters of matrix Q were adjusted simultaneously from small to large, and only the comparison between three groups of typical Q values and the optimal Q value is given. The typical weight matrices were selected as Q1=diag[100,100,100,0,0,0], Q2=diag[200,200,200,0,0,0] and Q3=diag[300,300,300,0,0,0], and the optimal matrix Q=diag[563,563,563,0,0,0] was obtained through the aforementioned optimization method. The curves of the angular displacement of Rod #2 with and without the unit pulse disturbance are shown in Figure 3. Figure 3 shows that before the pulse disturbance, the system returned to steady balance under all Q values. The time needed to reach balance and the maximum overshoot were, respectively, 5 s and 0.218° for Q1; 4.8 s and 0.202° for Q2; 4.6 s and 0.190° for Q3; and 4.1 s and 0.17° for Q. After the pulse disturbance was applied, the system also recovered steady balance under all Q values. From the curves, the time needed to reach steady balance and the maximum overshoot were, respectively, 3.3 s and 0.025° for Q1; 3.0 s and 0.024° for Q2; 2.9 s and 0.023° for Q3; and 2.8 s and 0.021° for Q.
This comparison demonstrates that, relative to the other Q matrices, the system under the optimized Q matrix took less time to return to steady balance and had a smaller maximum overshoot.
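The two metrics compared above can be extracted from a sampled response curve as in the following sketch. The definitions are assumptions, since the study does not state them: settling time is taken as the first sample after which the response stays inside a tolerance band around equilibrium, and overshoot as the largest excursion on the opposite side of equilibrium from the initial value. The test signal is synthetic, not data from Figure 3.

```python
import math


def settling_time(t, y, band):
    """First sample time after which |y| stays within the band.

    Returns t[0] if the response never leaves the band, and None if it
    is still outside the band at the last sample.
    """
    last_outside = None
    for i, value in enumerate(y):
        if abs(value) > band:
            last_outside = i
    if last_outside is None:
        return t[0]
    if last_outside == len(y) - 1:
        return None
    return t[last_outside + 1]


def max_overshoot(y):
    """Largest excursion past equilibrium (zero), opposite to the initial sign."""
    sign = 1.0 if y[0] >= 0 else -1.0
    return max(0.0, max(-sign * value for value in y))


# Synthetic damped oscillation standing in for a rod-angle response.
t = [0.01 * i for i in range(1001)]
y = [math.exp(-ti) * math.cos(3.0 * ti) for ti in t]

ts = settling_time(t, y, band=0.02)
overshoot = max_overshoot(y)
```

Applying the same two functions to the responses under Q1, Q2, Q3 and the optimized Q gives a like-for-like comparison, provided the same band is used for every curve.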

Conclusion
The double inverted pendulum control system, derived from robot gait balance, was modeled and optimized. Specifically, the system balance controller was designed via the LQR algorithm, the empirical trial-and-error method was combined with the idea of regional dichotomy to divide the interval of Q values, and the resulting Q values were calculated and compared. According to the computational analysis of multiple groups of weight matrices, the method used in this study obtained the optimal Q matrix (Q=diag[563,563,563,0,0,0]) within the range from diag[10,10,10,0,0,0] to diag[600,600,600,0,0,0]. Because the procedure does not depend on the particular interval, the method is also applicable to optimization over other value intervals of the Q matrix. Furthermore, the optimization results meet the design requirements and practical application needs.