Individual deformability compensation of soft hydraulic actuators through iterative learning-based neural network

Taku Sugiyama; Kyo Kutsuzawa; Dai Owaki; Mitsuhiro Hayashibe

doi:10.1088/1748-3190/ac1b6f

1. Introduction

Soft robots have great advantages in safe and effective interaction with humans and the physical environment [1]. Many researchers have exploited these advantages to realize various new applications (e.g. bio-inspired soft robots [2], soft robotic grippers [3]). Rehabilitation devices can also benefit from soft robotics technology. Robotic rehabilitation devices have been developed to provide effective rehabilitation for patients with motor paralysis [4–8]. Recently, wearable rehabilitation devices with soft actuators have become an extensively studied topic in this field (see also figure 1) [9–12]. For example, Polygerinos et al [9] developed a soft robotic glove for hand rehabilitation. This glove utilizes mechanically programmable soft actuators to match the range of motion of a user's fingers. The authors demonstrated its ability to assist a wide variety of functional grasps. Additionally, the authors improved this glove and showed its effectiveness for an actual patient with muscular dystrophy [11]. Such devices are safer and less expensive than conventional rigid robotic devices and can thus be applied for at-home rehabilitation [9, 13].

**Figure 1.** Fiber-reinforced soft bending actuator. The gray spheres are markers for the motion capture.

However, the control of soft actuators is a challenging development aspect owing to characteristics such as high nonlinearity, hysteresis, and response delay [13]. The manual fabrication of soft actuators induces individual deformability (i.e. individual differences) [14–16], resulting in low control performance of soft actuators. Many researchers have utilized model-based and model-free controllers for soft robot control [17–19]. Owing to the individual deformability, an accurate model derivation for sufficient control performance of model-based controllers is challenging and time-consuming. Moreover, simple model-free controllers require laborious, time-consuming parameter tuning to compensate for this individual deformability. Wearable rehabilitation devices require adequate control performance of multiple soft actuators to ensure that they properly assist users and minimize the need for soft actuator calibration (i.e. deformability compensation) for simplicity with at-home rehabilitation. For these reasons, the simple application of model-free or model-based controllers is not suitable for rehabilitation devices.

Some researchers have examined learning control methods to overcome the stated control difficulties of soft actuators. Previous studies mainly proposed three approaches: (1) a model-based learning method; (2) a model-free learning method; (3) a data-driven modeling method. As a model-based learning approach, Tang et al [12] proposed an iterative learning model predictive control method to control soft pneumatic actuators. In this method, an iterative learning controller (ILC) gradually improves an approximated initial model. In the 21st iteration, the controller provided precise trajectory tracking performance with a root mean square tracking error (RMSE) of less than 0.03 rad. The researchers also utilized an adaptive scheme to improve the model and achieved precise tracking performance with an RMSE of less than 0.05 rad [20]. As an example of a model-free learning method, Balasubramanian et al [6] developed a PID + ILC controller for an upper-limb rehabilitation device with pneumatic McKibben artificial muscle actuators. The control method provided satisfactory control performance with an absolute mean tracking error of less than 3°. Moreover, the authors achieved automatic compensation for the individual differences among the users of the rehabilitation device during control. Some researchers focused on a data-driven modeling approach. Elgeneidy et al [15] proposed a data-driven modeling approach with feed-forward artificial neural networks. The authors derived a model of a soft pneumatic actuator to predict its bending angle and demonstrated high model accuracy. Giorelli et al [21] utilized a neural network to obtain the inverse kinematics of a soft octopus-like manipulator. These data-driven approaches do not need manual derivation of the model. These three approaches effectively solved the stated control difficulties of soft actuators without a manual derivation of an accurate model or laborious parameter tuning.

On the other hand, these approaches have drawbacks that hinder the effective control of wearable rehabilitation devices. As for the model-based learning methods, the initial approximated model limits the performance of the model-based learning controllers. This aspect can hinder the user-assistance capability of wearable rehabilitation devices. Furthermore, a new model needs to be derived for other types of soft actuators or rehabilitation scenarios. The model-free learning method requires an iterative learning process for each target movement. This characteristic is unfavorable for wearable rehabilitation devices, which need to execute a wide variety of movements. In addition, the structure of the upper-limb rehabilitation device in [6] reduces the nonlinearity of the device, owing to the use of rigid parts. Thus, the effectiveness of the controller for soft bending actuators, which are widely used in hand rehabilitation devices [9–12] and have higher nonlinearity, is uncertain. The data-driven modeling methods need a large amount of experimental data for each soft actuator. A shortage of data leads to poor model accuracy and control results. This characteristic especially becomes a problem when multiple actuators need to be used. In these cases, the calibration of the soft actuators in wearable rehabilitation devices becomes complex, and the application of these devices for at-home rehabilitation becomes difficult.

Tang et al [16] proposed a probabilistic model-based online learning optimal control method and overcame the problems of the three approaches. A probabilistic model is automatically derived as a Gaussian process without approximation and updated online with fewer sensory data. This method does not require the collection of many experimental data. Also, the obtained model can automatically deal with various target motions of soft actuators owing to online model update. This method was evaluated with a soft pneumatic actuator and resulted in excellent trajectory tracking results and robustness.

However, this method has not yet been verified with soft actuators with a large response delay and may not be optimal for such soft actuators. Note that in this paper, the response delay means a higher-order time lag (i.e. the actuator takes more time to reach a given output than other soft actuators), caused mainly by the soft actuator characteristics and the system configuration. Wearable rehabilitation devices need to be portable to realize at-home rehabilitation. Thus, the pump for soft actuator pressurization should be small. This pump size limitation leads to a slow fluid discharge rate and pressurization speed. Also, soft actuators for rehabilitation devices need to output a sufficient assistive force at the distal tip (around 7.3 N [9]). To meet this requirement, the elastic modulus of the soft actuators needs to be high for higher force generation per angle (i.e. strain) [9]. That leads to a larger required pressure to reach a certain angle. The slow pressurization speed and the large pressure required for sufficient deformation result in a large response delay of soft actuators in wearable rehabilitation devices.

To the best of our knowledge, no research has yet proposed a learning control method that can compensate for the differences in the individual deformability of soft actuators with a large response delay and that is usable for wearable rehabilitation devices. Therefore, in this study, a feed-forward learning control method for individual deformability compensation, which is applicable to soft actuators with a large response delay, is developed. The proposed method consists of a simple feed-forward neural network (FNN) and an ILC. The FNN is trained to acquire the inverse model of the soft actuators. The obtained inverse model realizes tracking control for various generalized trajectories. For the supervised learning of the FNN, the simple trajectory tracking results obtained using the ILC are utilized as the training data. The ILC can compensate for the individual deformability in soft actuators. Thus, the control results of the ILC include information on how to cope with the individual deformability to realize target movements. Thereby, the FNN can efficiently learn the deformability from the training data. The FNN can also learn the effect of the response delay owing to its time series data input. The key contribution of this work is the development of a feed-forward control method consisting of a simple FNN and an ILC for individual deformability compensation of soft actuators with a large response delay.

2. Methods

2.1. Soft hydraulic actuators

The control target was a fiber-reinforced soft bending actuator (FRSBA) [9], shown in figure 1, which was chosen because it has been used in a wearable hand rehabilitation device [9, 11]. The FRSBA has a tubular structure with a semicircular cross-section. The structure is made of an elastomer, with fiber reinforcements in an elastomeric matrix. The radial reinforcements limit the radial expansion, and the reinforcements on the flat surface inhibit its extension, so that the actuator bends through the extension of the curved surface upon pressurization. Water pressure was used for the actuation. Compared with pneumatic actuation, energy generated by a pump is more efficiently converted into internal pressure of a soft actuator in hydraulic actuation, owing to the incompressibility of water. As a result, hydraulic actuation generates a larger force for the same energy consumed by the pump [22, 23]. Thus, a soft hydraulic actuator is suitable for portable at-home rehabilitation devices, which need to provide users with sufficient assistance force using a compact pump.

A three-port solenoid valve was used to adjust the amount of pressure applied to the FRSBA. The valve was connected next to the FRSBA, and a pulse width modulation (PWM) signal was employed to control the opening.

2.2. Overview of proposed control method

The proposed method consists of the FNN and the ILC. In general, neural networks have been widely used to learn the models of soft actuators in the soft robotics domain [24]. The FNN with one hidden layer is trained to obtain an inverse model of the FRSBA (figure 2). To learn the effect of the response delay, the FNN receives the data of a certain time series. The ILC is used to accomplish simple trajectory tracking tasks, and the tracking results are used as the training data. These tracking results include data of deformability which the ILC learned. Thus, the FNN can efficiently learn this individual deformability from the training data. The upper area of figure 2 shows the process flow of the ILC at the ith iteration. First, the ILC calculates the input u_i(t) from the data of the previous iteration. Next, the conversion function converts u_i(t) to v_i(t) ≡ f(u_i(t)). The soft actuator is controlled by v_i(t). The output of the soft actuator θ_i(t) and u_i(t) are saved in a memory of a control device (the 'memory' block in figure 2) for use in the subsequent iteration, and θ_i(t), v_i(t) are used for training of the FNN.

**Figure 2.** The process flow of the proposed method. The upper area describes the ILC at the ith iteration. The lower area describes the FNN training and process flow. The solid arrow indicates data transfer related to control, and the dotted arrow is related to the FNN training.

The controlled variable θ (rad) is the angle between the initial and current base-to-tip vectors of the FRSBA. The control inputs u and v are PWM duty cycles (D.C.) (%) of the valve. θ is calculated using equation (1).

$\begin{equation}\theta =\mathrm{arccos}\left(\frac{{\mathbf{a}}_{\mathrm{i}\mathrm{n}\mathrm{i}\mathrm{t}}\cdot {\mathbf{a}}_{\mathrm{c}\mathrm{u}\mathrm{r}\mathrm{r}\mathrm{e}\mathrm{n}\mathrm{t}}}{\left\vert {\mathbf{a}}_{\mathrm{i}\mathrm{n}\mathrm{i}\mathrm{t}}\right\vert \left\vert {\mathbf{a}}_{\mathrm{c}\mathrm{u}\mathrm{r}\mathrm{r}\mathrm{e}\mathrm{n}\mathrm{t}}\right\vert }\right),\end{equation} \tag{ 1 }$

where a_init denotes the initial base-to-tip vector, a_current denotes the base-to-tip vector at a time t (figure 1), a_init⋅a_current is the dot product of two vectors, and $\left\vert {\mathbf{a}}_{\mathrm{i}\mathrm{n}\mathrm{i}\mathrm{t}}\right\vert$ and $\left\vert {\mathbf{a}}_{\mathrm{c}\mathrm{u}\mathrm{r}\mathrm{r}\mathrm{e}\mathrm{n}\mathrm{t}}\right\vert$ are the magnitudes of each vector.
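As an illustration of equation (1), the following minimal Python sketch computes θ from two base-to-tip vectors. The function name `bending_angle`, the clamping of the cosine, and the example vectors are assumptions made for illustration; they are not taken from the experimental software.

```python
import math

def bending_angle(a_init, a_current):
    """Angle (rad) between the initial and current base-to-tip vectors, as in equation (1)."""
    dot = sum(p * q for p, q in zip(a_init, a_current))
    norm_init = math.sqrt(sum(p * p for p in a_init))
    norm_current = math.sqrt(sum(q * q for q in a_current))
    if norm_init == 0.0 or norm_current == 0.0:
        raise ValueError("base-to-tip vectors must be non-zero")
    # Clamp to [-1, 1]: rounding can push the ratio slightly outside the arccos domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_init * norm_current)))
    return math.acos(cos_theta)

# Hypothetical vectors (mm): a straight actuator, then a tip displaced perpendicular to it.
theta = bending_angle((0.0, 0.0, 95.0), (60.0, 0.0, 0.0))
print(math.degrees(theta))
```

The clamp is a defensive choice: with noisy marker data, the normalized dot product of two nearly parallel vectors can exceed 1 by a rounding error, which would otherwise raise a domain error.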

2.3. Iterative learning controller (ILC)

The ILC technique can gradually improve the tracking performance of a system by repeating the same movement [25] and has been used to control several soft actuators with satisfactory results [26–28]. The ILC iteratively generates the input for the FRSBA and learns its deformability simultaneously. The iterative learning process is carried out individually for each FRSBA, and in the end, the ILC can automatically generate a suitable control input for the deformability of each FRSBA without laborious parameter tuning.

The simple model-free PD-type ILC is used in the method:

$\begin{equation}{u}_{i+1}(t)=\begin{cases}{u}_{i}(t)+{{\Gamma}}_{1}{e}_{i}(t)+{{\Gamma}}_{2}{\dot {e}}_{i}(t)\quad & (i\ne 0)\\ \enspace 0\quad & (i=0)\end{cases},\end{equation} \tag{ 2 }$

where e_i(t) ≡ θ_d(t) − θ_i(t) is the error between the target angle θ_d(t) and the actual resulting angle θ_i(t), and ${\dot {e}}_{i}(t)$ is the time derivative of e_i(t). ${\dot {e}}_{i}(t)$ is computed by the backward finite difference method after the calculation of e_i(t). Γ₁ and Γ₂ are the learning gains. To reduce the noise, θ_i(t) was processed using the discrete-time low-pass filter in equation (3) before the calculation of ${\dot {e}}_{i}(t)$, and the filtered angle θ_LPF(t) was used for that calculation.

$\begin{equation}{\theta }_{\mathrm{L}\mathrm{P}\mathrm{F}}(t)=r{\theta }_{i}(t)+(1-r){\theta }_{\mathrm{L}\mathrm{P}\mathrm{F}}(t-1),\end{equation} \tag{ 3 }$

where 0 < r < 1 is an arbitrary constant.

As the input u(t) is the PWM D.C., its value was limited to the range 0–100% as shown in equation (4) before the value conversion.

$\begin{equation}u(t)=\begin{cases}100\quad & (u(t){\geqslant}100)\\ u(t)\quad & (0{< }u(t){< }100)\\ 0\quad & (u(t){\leqslant}0)\end{cases}.\end{equation} \tag{ 4 }$
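Equations (2)–(4) can be combined into a single update step, sketched below in Python. This is a minimal illustration rather than the authors' implementation. The function names are hypothetical, and three details are assumptions: the filter state is initialized with the first sample, the derivative at the first sample is set to zero, and the proportional term uses the unfiltered angle while only the derivative term uses θ_LPF(t).

```python
def low_pass(theta, r):
    """Discrete-time low-pass filter of equation (3).

    Assumption: the filter state starts at the first measured sample.
    """
    filtered = [theta[0]]
    for sample in theta[1:]:
        filtered.append(r * sample + (1.0 - r) * filtered[-1])
    return filtered

def saturate(u):
    """Limit the PWM duty cycle to the range 0-100 %, as in equation (4)."""
    return max(0.0, min(100.0, u))

def ilc_update(u_prev, theta_d, theta_meas, gamma1, gamma2, r, dt):
    """One PD-type ILC step of equation (2): u_{i+1}(t) from u_i(t) and theta_i(t)."""
    theta_lpf = low_pass(theta_meas, r)
    error = [d - m for d, m in zip(theta_d, theta_meas)]
    # Assumption: only the derivative term uses the filtered angle.
    error_lpf = [d - m for d, m in zip(theta_d, theta_lpf)]
    u_next = []
    for k in range(len(u_prev)):
        # Backward finite difference; the first sample has no predecessor, so use 0.
        e_dot = 0.0 if k == 0 else (error_lpf[k] - error_lpf[k - 1]) / dt
        u_next.append(saturate(u_prev[k] + gamma1 * error[k] + gamma2 * e_dot))
    return u_next

# Hypothetical first update: zero initial input and an actuator that has not moved yet.
u1 = ilc_update([0.0, 0.0, 0.0], [0.0, 0.1, 0.2], [0.0, 0.0, 0.0],
                gamma1=32.0, gamma2=0.1, r=0.75, dt=1.0 / 30.0)
```

Repeating `ilc_update` once per trial, with `theta_meas` taken from the latest run, gives the iterative learning process; the input of the initial iteration is all zeros, as in the second case of equation (2).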

2.4. Conversion function f(u)

As shown in figure 3(a), the relation between the step input and the corresponding steady-state response of the FRSBA is nonlinear. Thus, using the ILC alone for the FRSBA may introduce instability in the iterative learning process and degrade the control result. For example, the ILC changes the input at each iteration in proportion to e_i(t), regardless of the nonlinearity. Consequently, the same change in the input considerably affects the output in some ranges of the input and only slightly in others.

Therefore, the conversion function f(u) is used to reduce the nonlinearity and improve the control result. The conversion function f(u), which is similar to gain scheduling [29], is a nonlinear function that maps the PWM D.C. value u calculated by the ILC to a valve input v such that the output angle of the FRSBA, expressed as a percentage of its maximum output angle at a PWM D.C. of 100% (angle%), equals u. This processing approximately linearizes the input–output relation of the FRSBA. Figure 3(c) shows the graph of f(u) and a processing example: when f(u) receives u = 60%, it outputs v = 76.3% so that the output of the FRSBA (angle%) is 60%.

The specific design process of f(u) is as follows.

(a) Record the step responses to the input from 0% to 100% in 10% intervals. Next, prepare a graph plotting the input (PWM D.C.) and angles at the steady-state. The angles in the dead zone of the FRSBA are considered to be zero.
(b) Change the scale of the vertical axis to 'angle%'. Then, interpolate each point with a straight line (piecewise linear interpolation, figure 3(b)).
(c) Calculate the inverse function of the function obtained through piecewise linear interpolation in the previous design process. If multiple experimental points are in the dead zone, only the point which has the maximum corresponding PWM D.C. value is used for the calculation. That inverse function is f(u) (figure 3(c)). Note that the labels of f(u) are replaced with u (%) and v (%).
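The three design steps above can be sketched in Python as follows. The function name `build_conversion` and the steady-state angles are hypothetical and serve only as an example; the sketch also assumes that the angle increases strictly with the PWM D.C. outside the dead zone and that the angle at 100% is the maximum.

```python
def build_conversion(pwm_points, angle_points):
    """Return f(u), mapping a desired angle% to a PWM duty cycle (%).

    pwm_points: duty cycles (%) in ascending order (step (a)).
    angle_points: steady-state angles at those duty cycles, dead-zone angles set to zero.
    """
    max_angle = angle_points[-1]
    # Step (b): rescale the angles to angle%.
    angle_pct = [100.0 * a / max_angle for a in angle_points]
    # Step (c): keep only the dead-zone point with the largest duty cycle.
    start = max(k for k, a in enumerate(angle_pct) if a == 0.0)
    xs = angle_pct[start:]
    ys = list(pwm_points[start:])

    def f(u):
        u = max(0.0, min(100.0, u))
        # Inverse of the piecewise linear curve: interpolate duty cycle over angle%.
        for k in range(len(xs) - 1):
            if xs[k] <= u <= xs[k + 1]:
                t = (u - xs[k]) / (xs[k + 1] - xs[k])
                return ys[k] + t * (ys[k + 1] - ys[k])
        return ys[-1]

    return f

# Hypothetical steady-state angles (deg) with a dead zone up to 30 % duty cycle.
pwm = list(range(0, 101, 10))
angles = [0.0, 0.0, 0.0, 0.0, 5.0, 12.0, 22.0, 35.0, 50.0, 68.0, 88.0]
f = build_conversion(pwm, angles)
v = f(25.0)  # duty cycle expected to produce 25 % of the maximum angle
```

With these example data, f(0) returns the upper edge of the dead zone, so that any positive ILC output immediately moves the valve input out of the region where the actuator does not respond.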

In the experiments, another type of conversion function, f_square(u), was also designed. Figure 3(d) depicts f(u) and f_square(u) on the same u–v graph to visualize their differences. It is simpler to design f_square(u) than f(u), as the graph obtained in design step (a) is replaced with a quadratic function Φ(x):

$\begin{equation}{\Phi}(x)=\begin{cases}\frac{{y}_{\mathrm{max}}-{y}_{\mathrm{t}\mathrm{h}}}{{({x}_{\mathrm{max}}-{x}_{\mathrm{t}\mathrm{h}})}^{2}}{(x-{x}_{\mathrm{t}\mathrm{h}})}^{2}+{y}_{\mathrm{t}\mathrm{h}}\enspace & (x{\geqslant}{x}_{\mathrm{t}\mathrm{h}})\\ 0\quad & (x{< }{x}_{\mathrm{t}\mathrm{h}})\end{cases},\end{equation} \tag{ 5 }$

where x indicates the input (PWM D.C.) in design step (a), and y indicates the steady-state angle corresponding to x. The quadratic function passes through the point at which the PWM D.C. is 100%, (x_max, y_max), and its minimum lies at the upper edge of the dead zone, (x_th, y_th). In this manner, only two experimental values are needed to design f_square(u) (figure 3(e)).
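A possible realization of f_square(u) is sketched below: Φ(x) of equation (5) is rescaled to angle% and inverted in closed form. The function names and the numerical values are hypothetical, and setting y_th = 0 by default follows the convention that dead-zone angles are treated as zero; the exact inversion used in the experiments is not specified in the text.

```python
import math

def make_f_square(x_th, x_max, y_max, y_th=0.0):
    """Build Phi(x) of equation (5) and its inverse on the angle% scale."""
    scale = (y_max - y_th) / (x_max - x_th) ** 2

    def phi(x):
        """Quadratic steady-state model: PWM duty cycle (%) -> angle."""
        return scale * (x - x_th) ** 2 + y_th if x >= x_th else 0.0

    def f_square(u):
        """Desired angle% -> PWM duty cycle (%), obtained by inverting phi."""
        target = max(0.0, min(100.0, u)) / 100.0 * y_max
        if target <= y_th:
            return x_th
        return x_th + math.sqrt((target - y_th) / scale)

    return phi, f_square

# Hypothetical values: dead zone up to 30 % duty cycle, 88 deg at 100 % duty cycle.
phi, f_square = make_f_square(x_th=30.0, x_max=100.0, y_max=88.0)
v = f_square(25.0)  # 25 % of the maximum angle
```

Only the two points (x_th, y_th) and (x_max, y_max) enter the construction, which reflects why f_square(u) needs fewer measurements than the piecewise linear f(u).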

2.5. Feed-forward neural network (FNN)

The FNN is trained with supervised datasets obtained using the ILC. The trained FNN can be used to realize various generalized trajectory tracking tasks. Note that these tracking tasks can be realized without the iterative learning process like the ILC. The FNN receives the desired angle of the FRSBA around the target time t. Subsequently, the FNN outputs the value of the PWM D.C. at t (figure 4). The input to the FNN (angle) outside of control time is 0 (i.e. the angle at t < 0, t_end < t is 0, t_end indicates the termination time of the control).

**Figure 4.** Input and output of the FNN.

The number of neurons in the input layer is varied depending on the delay time of the FRSBA, and the output layer has one neuron. Before training, the input training data (angle data of the FRSBA) were filtered using a low-pass filter with a cutoff frequency of 2.5 Hz to reduce the noise. In this work, the Adam optimizer [30] was used, and the mean absolute error loss function was employed to reduce the effect of the noise in training.
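The time-window input of the FNN described in this subsection can be prepared as in the following sketch, which pads the target trajectory with zeros outside the control time. The function name `make_windows` and the parameter `half_width` (the number of samples taken before and after t) are hypothetical; the network itself and its training are outside the scope of this example.

```python
def make_windows(target_angle, half_width):
    """Build one FNN input per time step.

    Each input holds the target angles from t - half_width to t + half_width,
    with zeros for samples outside the control time (t < 0 or t > t_end).
    """
    n = len(target_angle)
    padded = [0.0] * half_width + list(target_angle) + [0.0] * half_width
    return [padded[k:k + 2 * half_width + 1] for k in range(n)]

# Hypothetical three-sample trajectory with one sample on each side of t.
windows = make_windows([1.0, 2.0, 3.0], half_width=1)
```

Each window has 2 × `half_width` + 1 entries, which is the number of input layer neurons; the corresponding training label would be the PWM D.C. at the center time t.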

2.6. Experimental setup

The experimental setup is shown in figure 5(a). A hydraulic pump (Flojet LFP series, Xylem) supplied water pressure, and a three-port solenoid valve (VDW200 series, SMC) regulated the amount of pressure supply to the FRSBA. The air chamber between the pump and valve reduced the periodic fluid discharge rate variation caused by the operation principle of the pump (diaphragm pump). The control input was determined using the PC and transmitted to the microcontroller (Arduino Mega 2560 R3, Arduino) which generated the PWM signal. This PWM signal operated the valve via the electric circuit. A motion capture system (OptiTrack Prime 17W, NaturalPoint) tracked markers on the FRSBA (see figure 1), and the track data were transmitted to the PC to calculate the subsequent iteration input. The PWM frequency was empirically set as 50 Hz. The control frequency and the motion capture frequency were 30 Hz, which is higher than the FRSBA mechanical frequency (1–2 Hz) [9] and lower than the PWM frequency.

The FRSBA was made of an elastomer (Elastosil M4601, Wacker Chemie). Thread (Kevlar Yarn #30, ESCO) and a glass fiber tape (Tiger G Fiber Tape N, Yoshino Gypsum) were used as the radial and the flat surface reinforcements. The position of the FRSBA was constrained by its connection to the fixed solenoid valve (figure 5(b)).

3. Experimental results

This section describes the physical experiments conducted to evaluate the effectiveness of the proposed method.

3.1. Compensation of the individual deformability with the ILC

Four FRSBAs were used to evaluate whether the ILC could compensate for the individual deformability. The parameters of the FRSBAs are listed in table 1. The frequency of the target trajectory was 0.2 Hz, and its amplitude was set to ensure that the FRSBA did not exceed its range of motion. Two target trajectories with different amplitudes were set for FRSBA No. 4 to compare the control results with those of FRSBA No. 3. The step response to the same 100% PWM D.C. for each FRSBA is shown in figure 6. The results show significant differences in the individual deformability in both the transient and steady-state responses. Note that each of the maximum bending angle values in table 1 is the average from 9 s to 10 s of the steady-state response depicted in figure 6. Each FRSBA was controlled using three types of control law: ILC-only, ILC + f(u), and ILC + f_square(u). Note that f(u) and f_square(u) were designed for each FRSBA. Also, the maximum PWM D.C. of the dead zone was 30% for all FRSBAs. The number of iterations for each strategy was 15. The gains were empirically set as Γ₁ = 32.0 and Γ₂ = 0.1. The r in equation (3) was also empirically set as r = 0.75, which corresponds to a cutoff frequency of approximately 14 Hz.
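The relation between r and the cutoff frequency can be checked with a short calculation. The sketch below uses the common continuous-time equivalence of a first-order filter, in which the time constant is τ = Δt(1 − r)/r and the cutoff is 1/(2πτ). That this particular approximation underlies the stated 14 Hz, and that the filter runs at the 30 Hz control frequency, are assumptions; the function name is hypothetical.

```python
import math

def cutoff_hz(r, sample_rate_hz):
    """Approximate cutoff of theta_LPF(t) = r*theta(t) + (1 - r)*theta_LPF(t - 1).

    Uses the continuous-time RC equivalence tau = dt * (1 - r) / r.
    """
    dt = 1.0 / sample_rate_hz
    tau = dt * (1.0 - r) / r
    return 1.0 / (2.0 * math.pi * tau)

print(round(cutoff_hz(0.75, 30.0), 1))  # prints 14.3
```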

Table 1. The parameters of the FRSBA.

No.	Length (mm)	Amplitude of the target trajectory (°)	The maximum bending angle (θ) at 100% PWM duty cycle (°)
1	95.6	0–20	39.2
2	94.5	0–60	81.7
3	96.2	0–45	62.8
4	95.9	(a) 0–45, (b) 0–60	87.8

**Figure 6.** Step response for each FRSBA. The input was the PWM duty cycle of 100%.

The control results and the corresponding control inputs at the 15th iteration are illustrated in figure 7 and table 2. Note that the experiment was repeated five times. For the same control strategy, the resulting trajectories were similar for all of the FRSBAs. In contrast, the corresponding control inputs are significantly different among the FRSBAs, which means that simple scaling of the control input is insufficient for the individual deformability compensation. The control results of FRSBA No. 3 and No. 4(a) clearly show that point. These results indicate that the ILC successfully calculated a suitable control input for each FRSBA and compensated for the individual deformability. In particular, when f(u) or f_square(u) was used, deformability compensation and accurate tracking performance were achieved. As shown in figure 7, the ILC-only method caused a delay at the beginning, followed by an overshoot. However, the use of f(u) or f_square(u) to linearize the input–output relation of the FRSBA improved the results. Thus, by measuring only a small number of steady-state responses beforehand, the proposed ILC could cope with the nonlinearity shown in figure 3(a).

Table 2. Average RMSE at 15th iteration.

No.	ILC-only RMSE (°)	ILC + f(u) RMSE (°)	ILC + f_square(u) RMSE (°)
1	3.41 ± 0.11	0.82 ± 0.07	0.77 ± 0.05
2	5.55 ± 0.19	2.39 ± 0.11	2.69 ± 0.04
3	5.67 ± 0.23	1.64 ± 0.33	1.77 ± 0.12
4(a)	5.76 ± 0.22	2.13 ± 0.06	2.38 ± 0.14
4(b)	6.60 ± 0.25	3.36 ± 0.17	3.14 ± 0.08

Figure 8 illustrates the root mean square of the error e(t) (RMSE) at each iteration. The solid line indicates the average values for the five runs. The RMSE decreased with each iteration as the ILC iteratively learned the suitable input for each FRSBA. Moreover, the use of f(u) or f_square(u) decreased the final RMSE and accelerated the reduction of the RMSE with each iteration. Additionally, the performance of f(u) and f_square(u) was sufficiently similar, considering their standard deviations. This result shows that the ILC could learn and cope with the nonlinearity even if the simpler conversion function, f_square(u), is used.
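For reference, the RMSE of one run and the mean ± standard deviation over repeated runs can be computed as in the following sketch. The function names and the sample values are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not state which was reported.

```python
import math

def rmse(target, measured):
    """Root mean square of the tracking error over one trajectory."""
    if len(target) != len(measured) or not target:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [(d - m) ** 2 for d, m in zip(target, measured)]
    return math.sqrt(sum(squared) / len(squared))

def mean_and_std(values):
    """Mean and population standard deviation over repeated runs (assumed convention)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

# Hypothetical RMSE values (deg) from five repeated runs.
runs = [0.80, 0.75, 0.90, 0.85, 0.80]
mean_rmse, std_rmse = mean_and_std(runs)
```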

3.2. Trajectory tracking tasks with the FNN

The inverse model (i.e. the model that can generate control input to the soft actuator from a target trajectory) of one FRSBA (No. 4) was trained using the FNN. After training, the input calculated using the FNN was used to control the FRSBA for various trajectories. Figure 6 shows that the delay time of the FRSBA No. 4 was 0.57 s, corresponding to approximately 17 samples at a control frequency of 30 Hz. Thus, the number of input layer neurons was set as 35, which is the sum of the sample at the target time t, the 17 samples before t, and the 17 samples after t (1 + 17 + 17 = 35). The number of hidden layer neurons was set as 100 to avoid overfitting of the FNN. To collect adequate training data, the FRSBA was controlled to achieve single period sine trajectories with various amplitudes and frequencies. Figure 9 represents the characteristics of the trajectories. The results of all 15 iterations were utilized for training to make the training data more diverse for the versatility of the FNN and to reduce the time required for training data acquisition. Note that the ILC + f_square(u) strategy was used for the control, which could achieve accurate tracking performance with the simpler conversion function. The learning rate (α) of the optimizer was empirically set as 1.0 × 10⁻³, and the other parameters were set to the same values as in [30]. The training was repeated for 250 epochs.

**Figure 9.** Set of amplitudes and frequencies of the trajectories in the training data and figures 10(a) and (b).

The experiment was repeated five times with the same input. In each subfigure of figure 10, the solid line indicates the average value. Figure 10(a) shows the tracking results for the single period sine trajectories, same as those in the training data, and figure 10(b) shows the results for the cases in which the trajectories were different from the training data. Figure 9 illustrates the amplitude and frequency combinations of the trajectories. The RMSE is sufficiently low in both cases, with the worst value of 5.76 ± 0.25° when the amplitude is 55°. This value is lower than the RMSE of the ILC-only strategy (table 2). Notably, tracking with a low RMSE could be realized even when the target trajectories were different from those in the training data. In general, the RMSE was slightly higher when the trajectories had a high amplitude; however, this phenomenon likely occurred because the actuation speed was close to the physical limits of the FRSBA.

The result of the complex trajectory tracking is shown in figure 11. Precise tracking performance for the complex trajectory was achieved, with an RMSE of only 2.11 ± 0.35°. Interestingly, the tracking of the triangle and square trajectories could also be realized even though the training data only included the tracking results of the sine waves.

3.3. The effect of time series data input on the FNN

Two FNNs with different time series input lengths (i.e. different numbers of input layer neurons) were trained to investigate the effect of time series data input on the response delay compensation. The number of input layer neurons was set as 17 and 1, respectively. One FNN received the data at t and the eight samples before and after t as the input, and the other FNN received only the data at t. The training data and the training parameters were the same as in the previous subsection. The number of hidden layer neurons was also the same. Note that the learning curves of the FNNs and the experimental results showed that overfitting did not occur.

The target trajectory was the same as in figure 11, and the experiment was repeated five times with the same input. Figure 12 shows the tracking results of each FNN with a different number of input layer neurons. Table 3 shows the delay until the FRSBA reached peaks A, B, and D in figure 12. For peaks C and E, the delay was not calculated because the FRSBA did not reach a steady state at these peaks in some trials. Also, the negative values in table 3 indicate that the FRSBA reached the peaks earlier than the target trajectory. Figure 12 and table 3 show that the FNN with one input layer neuron induced a large overall delay against the target trajectory. The FNN with 17 input layer neurons did not show a clear delay at the peaks compared to the FNN with 35 input layer neurons; however, figure 12 shows a slight delay in the rise sections of the trajectory. These results show that a certain amount of time series data input enables the FNN to learn the effect of the response delay and compensate for it.

Table 3. The delay to reach the peaks of the complex trajectory in figure 12.

Peak	Delay (s), input length 35	Delay (s), input length 17	Delay (s), input length 1
A	−0.06 ± 0.17	0.03 ± 0.04	0.82 ± 0.07
B	−0.16 ± 0.02	−0.14 ± 0.07	0.24 ± 0.21
D	0.01 ± 0.03	−0.04 ± 0.05	0.48 ± 0.03

4. Discussion

Figure 6 shows that the same control input to different FRSBAs induces considerably different control results due to the individual deformability. However, as demonstrated in figure 7, table 2, and figure 8, the proposed ILC successfully compensated for the individual deformability. In figure 7 in particular, the same control strategy led to similar tracking results even though the corresponding control inputs differed greatly among the FRSBAs. These results show that the ILC makes it possible to achieve very similar control performance among soft actuators with different deformability, even when the control strategy is the same. In other words, the use of the ILC resulted in individual deformability compensation. If a PID controller were used instead, laborious parameter tuning would be mandatory to achieve the deformability compensation (i.e. a similar control performance among different soft actuators), unlike with the ILC. Additionally, the tracking performance would be worse due to the response delay of the FRSBA, which is mainly caused by the low pressurization capability of the pump and the large internal pressure that the FRSBA requires for sufficient deformation and assistance force. Note that the response delay caused by the fluid power system accounts for about 70% of the total response delay, and the remaining 30% is due to the characteristics of the FRSBA (see also figure S3 in the supplementary material). Owing to the training data collected with the ILC, the FNN efficiently learned the inverse model of the FRSBA, including the deformability. Figures 10 and 11 illustrate the capability of the FNN for the generalized trajectory tracking tasks. Moreover, figure 12 and table 3 demonstrate that the time series data input to the FNN contributes to learning the response delay of the FRSBA and improves the tracking accuracy through response delay compensation.

The results show that the ILC achieved the individual deformability compensation, and the FNN learned the inverse model of the FRSBA from the training data, all with precise tracking performance. This fact suggests that applying the same FNN training procedure to the other FRSBAs should yield an individual inverse model for each of them, including its deformability. Moreover, precise tracking performance with the other FRSBAs should also be achievable using the obtained individual inverse models without complex model derivation or laborious parameter tuning. These results confirm the effectiveness of the proposed feed-forward control method for soft actuators with a large response delay.

The research direction of this paper shares common topics with [16]. Tang et al [16] proposed a probabilistic model-based optimal control policy with online learning. The researchers achieved excellent performance on complex trajectory tracking tasks using soft pneumatic actuators. Moreover, the individual deformability compensation could be realized. However, this method employed a one-step model prediction in the input calculation. In addition, an integral-like feedback controller was used to decrease the tracking errors. Thus, for soft actuators with a large response delay, the prediction step may be insufficient, and the feedback controller may lead to unstable control results. In comparison, the proposed method is suitable for soft actuators with a large response delay. The experiments demonstrated satisfactory tracking performance when using the FRSBAs, even though it took 0.27–0.36 s for the FRSBAs to raise their angle by 16° (figure 6). This value was larger than that of the soft pneumatic actuators used in [16], which was approximately 0.1 s at most.

Regarding potential limitations, aperiodic disturbances (e.g. abrupt force disturbances) considerably influence the performance of the ILC. The ILC expressed in equation (2) is simple and easy to implement, but it cannot distinguish the effect of a disturbance from the results. Therefore, when a disturbance force is artificially applied in an iteration, the ILC calculates the input of the subsequent iteration from the results of the disturbed iteration, even if those results differ considerably from what the original deformability of the FRSBA would produce. Moreover, many iterations may be required to compensate for the effect of the disturbance. This aspect can result in inaccurate tracking performance in the iterative learning process. This point is an issue of the ILC, that is, an issue of data collection. Once the FNN has learned the inverse model with datasets obtained under a calm experimental condition, the FNN should work as long as the property of the soft actuator is the same and no disturbances occur during the control with the FNN. However, a disturbance during the control with the FNN considerably affects the performance of the generalized trajectory tracking. The control with the FNN is completely open-loop, and the FNN cannot cope with disturbances. This point will be a significant problem for the application of the proposed method to wearable rehabilitation devices. For example, if a user intentionally applies aperiodic resistance to the soft actuators or accidentally hits the soft actuators against the environment, the FNN cannot properly perform the trajectory tracking. Therefore, it will be essential to investigate solutions to this important limitation of the proposed method (e.g. the augmentation of feedback to the FNN) for the application to wearable rehabilitation devices. As another limitation, the proposed method for inverse model acquisition is an offline process. The physical parameters of patients (e.g. level of paralysis) and the rehabilitation environment vary over time in an actual rehabilitation process. The proposed method needs inverse model relearning to deal with such changes. Thus, the augmentation of feedback or the addition of an online-learning scheme to the proposed method should be investigated to improve its application in wearable rehabilitation devices.
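The sensitivity of the iterative learning process to a single disturbed iteration can be sketched as follows. This is only an illustration under stated assumptions: equation (2) is not reproduced here, so a generic P-type update with a time shift, u_next[t] = u[t] + gain * error[t + shift], is used in its place. The toy plant (a first-order lag with dead time), the gain, the shift, and the additive angle disturbance are likewise hypothetical and do not model the FRSBA.

```python
import numpy as np

def simulate(u, a=0.8, delay=5):
    """Toy actuator (assumption): first-order lag with dead time, not the FRSBA."""
    y = np.zeros(len(u))
    for t in range(len(u) - 1):
        u_delayed = u[t - delay] if t >= delay else 0.0
        y[t + 1] = a * y[t] + (1.0 - a) * u_delayed
    return y

def ilc_update(u, error, gain=1.0, shift=6):
    """Generic P-type ILC step (assumption): u_next[t] = u[t] + gain * error[t + shift]."""
    u_next = u.copy()
    u_next[:-shift] += gain * error[shift:]
    return u_next

def run_ilc(ref, iterations=40, disturbed_iteration=None):
    """Return the RMSE of the measured tracking error at every iteration."""
    u = np.zeros_like(ref)
    rmse = []
    for k in range(iterations):
        y = simulate(u)
        if k == disturbed_iteration:
            y[40:60] -= 5.0  # abrupt external disturbance on the measured angle
        error = ref - y
        rmse.append(float(np.sqrt(np.mean(error ** 2))))
        u = ilc_update(u, error)  # the update cannot tell disturbance from deformability
    return np.array(rmse)

t = np.arange(100)
ref = 8.0 * (1.0 - np.cos(2.0 * np.pi * t / 100))  # 0 -> 16 deg -> 0 reference
clean = run_ilc(ref)
disturbed = run_ilc(ref, disturbed_iteration=20)

for k in (19, 20, 21, 25, 39):
    print(f"iteration {k}: clean RMSE {clean[k]:.3f}, disturbed RMSE {disturbed[k]:.3f}")
```

In this sketch the disturbance is present only in iteration 20, yet the input computed from that iteration carries its effect into the following, undisturbed iterations, which then need further iterations to recover. This corresponds to the data-collection issue described above.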

The proposed method can solve the control problems (e.g. individual deformability, high nonlinearity, and response delay) of a soft actuator used for wearable rehabilitation devices. The combination of the ILC and the FNN enables efficient inverse model acquisition from the training data, which are the actual tracking results of practical trajectories. Once the FNN has learned the inverse model, accurate tracking control of soft actuators with a large response delay is possible for various generalized trajectories without the iterative learning process. The application of the proposed method to wearable rehabilitation devices with soft actuators provides sufficient control performance for user assistance and easy compensation for individual deformability. Therefore, the proposed control method should contribute to the development of wearable rehabilitation devices with soft actuators. Additionally, the proposed method should also contribute to the field of soft robotics, owing to such control capabilities for soft actuators with a large response delay. Future work may focus on validating the effectiveness of the proposed method for other types of soft actuators. The application to force control (which would be useful for rehabilitation devices) is also to be investigated. In addition, applying the proposed method to a soft actuator with an integrated sensor will be meaningful future research because it is not practical to measure the FRSBA angle with a motion capture system in an actual at-home rehabilitation situation. Finally, future work also includes investigating the effectiveness of the proposed method on hysteresis, a common soft actuator problem [13].

5. Conclusions

This paper proposed a feed-forward control method that consists of a simple FNN and an ILC. This method can be applied to precisely control soft actuators with a large response delay. The FNN is trained to acquire an inverse model of the soft actuators, including the individual deformability. Moreover, the FNN receives time-series data of a certain length so that it can learn the effect of the response delay. The ILC is utilized to collect the training data, achieving deformability compensation in simple trajectory tracking tasks. Thereby, the FNN can efficiently learn the deformability from the training data. After training, the FNN can calculate the input for various generalized trajectories. Experiments with soft hydraulic actuators demonstrated the effectiveness of the proposed method, where the individual deformability compensation and various generalized trajectory tracking were achieved. This paper is the first work that achieved both the deformability compensation and generalized trajectory tracking for soft actuators with a large response delay. The proposed method can solve the control problems of such soft actuators. Thus, the method can contribute to wearable rehabilitation device development and the field of soft robotics, achieving adequate control performance and easy compensation for individual deformability of soft actuators with a large response delay.

Acknowledgments

This research is supported by the JSPS Grant-in-Aid for Scientific Research (B) (18H01399) and by the JSPS Grant-in-Aid for Scientific Research on Innovative Areas (20H05458) Hyper-Adaptability project.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.
