A time-delayed physical reservoir with various time constants

Physical reservoir computing has been attracting attention in recent years. However, it remains unclear how much nonlinearity is required in the physical dynamics to achieve a high computational performance. Therefore, we focused on a resistor–capacitor circuit, which exhibits simple transient characteristics, and investigated the performance required for a physical reservoir. As a result, the proposed reservoir shows a high performance for time series prediction tasks and can be used as a computational resource even without high nonlinearity in the physical dynamics. These results are expected to help establish design guidelines that support the hardware implementation of physical reservoirs.


Edge computing, which performs arithmetic processing at a location physically close to the user or data source, is attracting attention due to its processing speed and safety. 1) To enable data processing at any edge, a scheme that reduces calculation costs while maintaining a high learning performance is required. One such candidate is reservoir computing, which is a computational framework suitable for processing time-series data. 2)[5] Since the role of the intermediate-layer "reservoir" in the reservoir computing approach is to convert the system input into a higher-dimensional signal, an actual physical system that responds nonlinearly to the input can be used instead. Such physically implemented reservoirs are called physical reservoirs (PRs), 6) and these have been demonstrated experimentally and computationally in various physical systems, including buckets of water, 7) soft material, 8) magnetic tunnel junctions, 9) spintronics, 10,11) photonics, [12][13][14] electrochemical reactions, 15,16) electronic circuits, 17,18) ionic liquid (IL) devices, [19][20][21] FET devices, [22][23][24] and memristors. [25][26][27][28][29][30] However, the timescale of the signals that a PR can process in real time is limited by the inherent transient characteristics of the selected physical system. Echo state networks (ESNs) with leaky integrator neurons (LI-ESNs) have been widely used to adapt the model to the temporal characteristics of a learning target; 31) in several cases in which LI-ESNs have been applied, they have successfully improved the short-term memory (STM) of ESNs. 32) More recently, an ESN with diverse timescales (DTS-ESN), in which the leak rates of the LI neurons are distributed, was proposed and reported to flexibly handle dynamics on various timescales. 33) Inspired by these ESN approaches, we propose a PR consisting of parallel-connected reservoirs with various time constants (VTC-PR).
Numerical calculations were performed using Python 3.10.9 and NumPy 1.23.5 on an ARM64 CPU running macOS 13.2.1. LTspice (version 17.0.42) was used for the circuit simulation.
VTC-PR comprises several parallel-connected RC circuit elements, each consisting of one resistor and one capacitor, with different time constants (n elements; n = 1 corresponds to a single-RC circuit reservoir), as shown in Fig. 1. The time constant τ of an RC circuit is given by the product of the resistance R and the capacitance C. In this study, the resistor value R_i (i = 1, …, n) was varied to control the time constant of each RC circuit (C was fixed at 1 μF). Each RC circuit receives a discrete signal V_in, which is the input time series u(t) in discrete time t (t = 1, …, T) converted directly into a voltage value, and produces an output voltage V_out. Subsequently, V_out is converted into a high-dimensional reservoir state $x(t) \in \mathbb{R}^{nN_x}$ (where N_x is the number of virtual nodes per RC circuit) at each time step (Δt) using the virtual node method. 17) Like DTS-ESN, VTC-PR allows for diversity in the timescale of the reservoir dynamics, because the time constants of the RC circuits used as reservoirs are distributed. Note that, unlike DTS-ESN, which is based on a network of interconnected neurons (nodes), VTC-PR has no recurrent connection weights, so there is no interaction among its RC circuit elements.
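The state construction described above can be sketched numerically as follows. This is a minimal illustration and not the LTspice model used in this study: it assumes that each element behaves as an ideal first-order low-pass stage whose capacitor voltage is the output, that V_in is held constant during each step Δt, and that the virtual nodes are N_x equally spaced samples of V_out within the step. The function names `rc_virtual_nodes` and `vtc_pr_states` are hypothetical.

```python
import numpy as np

def rc_virtual_nodes(u, tau, n_nodes, dt=1.0):
    """Sample the output of one RC stage n_nodes times per input step.

    Assumes the input voltage is constant during each step of length dt,
    so the first-order response can be advanced exactly between samples.
    """
    decay = np.exp(-(dt / n_nodes) / tau)  # relaxation factor over one sub-interval
    v = 0.0                                # capacitor initially discharged (assumption)
    states = np.empty((len(u), n_nodes))
    for t, v_in in enumerate(u):
        for j in range(n_nodes):
            v = v_in + (v - v_in) * decay  # exact update for a piecewise-constant input
            states[t, j] = v
    return states

def vtc_pr_states(u, taus, n_nodes, dt=1.0):
    """Concatenate the virtual nodes of independent RC stages (no coupling)."""
    return np.hstack([rc_virtual_nodes(u, tau, n_nodes, dt) for tau in taus])

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 0.5, size=1000)
X = vtc_pr_states(u, taus=[0.2, 0.7, 1.4, 2.0, 2.9], n_nodes=10)
print(X.shape)  # (1000, 50): one row per time step, n * N_x columns
```

Because the stages are simulated independently and only concatenated, the sketch reflects the absence of interaction among the RC circuit elements.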
Koh et al. proposed a PR based on dielectric relaxation at the electrode/IL interface (IL-PR). 19) In IL-PR, the timescale of the reservoir dynamics is adjusted by selecting a combination of anion and cation for the IL whose viscosity gives a dielectric relaxation time matched to the timescale of the input signal. However, this approach makes device implementation difficult, since the viscosity must be readjusted to process other input signals with significantly different timescales. In contrast, the proposed VTC-PR obtains diverse timescales by connecting reservoirs with different fixed time constants, which allows it to process signals with a wide range of timescales without retuning any individual time constant.
In this study, the readout was trained by ridge regression 34) to calculate the weights W_out that minimize the squared error between the output y(t) and the target d(t), as shown in Eq. (1):
$$W_\mathrm{out} = D X^{\top} \left( X X^{\top} + \beta I \right)^{-1}, \tag{1}$$
where D and X denote the matrices collecting all d(t) and x(t), respectively, β is the regularization parameter, and I is the identity matrix. When β = 0, Eq. (1) reduces to linear regression.
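A minimal sketch of the readout training is given below. It assumes the common closed form W_out = D Xᵀ (X Xᵀ + βI)⁻¹ with the states x(t) and targets d(t) stored as columns of X and D; the function name `train_readout` and the synthetic data are illustrative only.

```python
import numpy as np

def train_readout(X, D, beta):
    """Ridge readout: W_out = D X^T (X X^T + beta I)^{-1}.

    X: (n_features, T) reservoir states as columns.
    D: (n_outputs, T) target signals as columns.
    """
    n_features = X.shape[0]
    A = X @ X.T + beta * np.eye(n_features)
    # A is symmetric, so W A = D X^T is solved as A W^T = X D^T,
    # which avoids forming the inverse explicitly.
    return np.linalg.solve(A, X @ D.T).T

# Synthetic check: targets that are an exact linear function of the states.
rng = np.random.default_rng(1)
X = rng.normal(size=(5, 200))
W_true = rng.normal(size=(2, 5))
D = W_true @ X
W_out = train_readout(X, D, beta=1e-5)
Y = W_out @ X  # readout output y(t) for every time step
```

Solving the linear system instead of inverting the matrix is a numerical choice of this sketch; with β = 0 the same routine performs ordinary linear regression, provided X Xᵀ is nonsingular.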
We evaluated the performance of VTC-PR on (i) a time-series prediction task for nonlinear autoregressive moving average (NARMA) models (NARMA task), (ii) an STM task, and (iii) a time-series prediction task for the Hindmarsh-Rose (HR) model (HR task). The details of each task are as follows.
(i) The NARMA task is widely used as a benchmark task for PRs. Given a time-series input u(t), the NARMA model of order m 35) defines the target d(t) through Eq. (2), whose second term on the right-hand side is a nonlinear term that depends on the state of the past m steps. Here, m is an integer from 2 to 10, and the corresponding tasks are denoted in the main text as NARMA2 to NARMA10. The NARMA task verifies whether the PR can mimic the nonlinear transformation from u(t) to d(t). Thus, if the reservoir can reproduce the NARMA model, it has both "nonlinearity" and "memory" of the input at least m steps prior. To verify the NARMA task, we generated time-series data u(t) based on a random number sequence following a uniform distribution in the interval [0, 0.5]. The Δt of u(t) was set to 1 s, and the target time-series data d(t) were generated from u(t) using Eq. (2). The total length of the data was 1000, with the first 900 data points used for training and the remaining 100 for validation. In the training phase, the readout was trained with ridge regression (β = 10^−5). The normalized mean squared error (NMSE) defined in Eq. (3), in which ‖·‖ denotes the Euclidean norm, was used to evaluate the computational performance.
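The target generation and error measure for the NARMA task can be sketched as follows. The recursion below uses the coefficients 0.3, 0.05, 1.5, and 0.1 that are commonly quoted for NARMA10, generalized to order m, and the NMSE is normalized by the variance of the target. Both choices are assumptions of this sketch and should be checked against Eqs. (2) and (3); the function names are hypothetical.

```python
import numpy as np

def narma(u, m, a=0.3, b=0.05, c=1.5, e=0.1):
    """Generate d(t) from an order-m NARMA recursion (assumed coefficients)."""
    d = np.zeros(len(u))
    for t in range(m - 1, len(u) - 1):
        d[t + 1] = (a * d[t]
                    + b * d[t] * np.sum(d[t - m + 1:t + 1])  # past m outputs
                    + c * u[t - m + 1] * u[t]                # input m - 1 steps back
                    + e)
    return d

def nmse(d, y):
    """Squared error normalized by the target variance (one common convention)."""
    d, y = np.asarray(d), np.asarray(y)
    return np.sum((d - y) ** 2) / np.sum((d - d.mean()) ** 2)

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 0.5, size=1000)  # uniform input in [0, 0.5], as in the text
d = narma(u, m=2)
d_train, d_valid = d[:900], d[900:]   # 900 points for training, 100 for validation
print(nmse(d, d))  # 0.0 for a perfect prediction
```

Under this normalization, a predictor that always outputs the mean of the target scores an NMSE of 1, which gives a simple reference level when reading reported errors.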
(ii) The STM task is a method for quantitatively evaluating the memory performance of machine learning models, and was introduced by Jaeger. 36) Given a time-series input u(t), the target time-series data d(t) of the STM task are the series of u(t) delayed by k steps; that is, d(t) = u(t − k). Here, when the learned model output at delay length k is ŷ_k(t), the memory capacity (MC) can be measured by Eq. (4):
$$\mathrm{MC}_k = \frac{\mathrm{Cov}^2\left(u(t-k),\, \hat{y}_k(t)\right)}{\mathrm{Var}\left(u(t)\right)\,\mathrm{Var}\left(\hat{y}_k(t)\right)}, \tag{4}$$
where Cov(·) is the covariance and Var(·) is the variance. If the delayed series can be recovered as the model output, MC_k takes a value close to 1. To verify the STM task, we generated time-series data u(t) based on a random number sequence following a uniform distribution in the interval [0, 0.5]. The Δt of u(t) was set to 1 s. The total length of the data was 1000. The first 200 data points were treated as transient data and the next 800 as training data. The transient data were discarded in order to eliminate the effect of the initial state on x(t). In the training phase, the readout was trained with linear regression (β = 0).
(iii) The HR task is a method to evaluate the performance of reservoir computing in predicting multiscale dynamics, and was introduced by Tanaka et al. 33) The HR model aims to reproduce the spike-burst behavior of the membrane potential observed in single-neuron experiments, and consists of a fast subsystem modeling fast ion channel behavior and a slow subsystem modeling slow ion channels. 37) The HR model consists of coupled ordinary differential equations in the variables (x, y, z); its parameters, including x_0 = −8/5, were set so that the model exhibits chaotic dynamics, based on the study by Tanaka et al. 33) The ordinary differential equations that make up the HR model were integrated numerically with a time step Δt = 0.05 using the Runge-Kutta method. The total length of the data was 2000. The first 200 data points were treated as transient data, the next 1200 were used for training, and the remaining 600 for validation. The transient data were discarded in order to eliminate the effect of the initial state on x(t). For evaluation, the task was to predict the time evolution of all variables (x, y, z) one step ahead from only the input time series of the fast variables (x, y). Since the inference involves unknown slow dynamics, diversity in the timescale of the reservoir dynamics is required to handle this task. In the training phase, the readout was trained with ridge regression (β = 10^−3). The NMSE was used to evaluate the computational performance.
We evaluated the computational performance of a single-RC circuit reservoir for the NARMA task. The results are presented in Fig. 2(a). Table I lists the prediction errors of the NARMA2 task compared with various physical reservoir computing systems in previous studies. Although a single-RC circuit reservoir is very simple and its reservoir size is small (five nodes), it exhibits a computational performance comparable to the PRs of previous studies aimed at hardware implementation.
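The memory capacity at delay k used in the STM task can be sketched as follows, assuming the squared-correlation form Cov²/(Var · Var) between the delayed input u(t − k) and the model output. The function name `memory_capacity_k` and the way the two series are aligned are illustrative choices.

```python
import numpy as np

def memory_capacity_k(u, y_hat, k):
    """Squared correlation between u(t - k) and the model output y_hat(t)."""
    target = u[:-k] if k > 0 else u  # u(t - k) for t = k, ..., T - 1
    output = y_hat[k:]               # y_hat(t) over the same range of t
    cov = np.cov(target, output)     # 2 x 2 sample covariance matrix
    return cov[0, 1] ** 2 / (cov[0, 0] * cov[1, 1])

rng = np.random.default_rng(0)
u = rng.uniform(0.0, 0.5, size=1000)
k = 3
perfect = np.roll(u, k)              # an output that reproduces u(t - k) for t >= k
unrelated = rng.uniform(size=1000)   # an output carrying no memory of the input
mc_perfect = memory_capacity_k(u, perfect, k)      # close to 1
mc_unrelated = memory_capacity_k(u, unrelated, k)  # close to 0
```

In practice `y_hat` would be the output of a readout trained with d(t) = u(t − k), evaluated after the transient data have been discarded.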
Figure 2(b) shows the reservoir size dependence of the NARMA2-NARMA10 task performances. Increasing the reservoir size excessively did not improve the prediction performance. This is because, for RC circuits that exhibit simple transient characteristics, neighboring sampling points are strongly correlated, so sampling more densely within a time step does not increase the dimensionality of the effective feature space.
The effect of the time constant of the reservoir dynamics on the computational performance was investigated. Figure 2(c) shows the relationship between the reservoir time constant and the NARMA2-NARMA10 task performances. The results showed that the tasks in which the reservoirs excel are distributed according to their time constants. Connecting reservoirs with different time constants in parallel therefore enables complementary learning, and VTC-PR is expected to improve the computational performance compared to the single-RC circuit reservoir described above.
The computational performance of the proposed VTC-PR on the NARMA task was evaluated. The results are shown in Fig. 3(a), which compares the computational performance of VTC-PR with that of a single-RC circuit reservoir. Here, the time constant of the single-RC circuit was 1 s to match the Δt of u(t), and the VTC-PR had five RC circuit elements with time constants τ = [0.2, 0.7, 1.4, 2.0, 2.9] s (10 virtual nodes were extracted from each element and combined to obtain a total of 50 nodes in X). The combination of time constants was optimized to minimize the sum of the prediction errors of the NARMA2-NARMA10 tasks (time constants were examined in the range of 0.1-3.0 s). VTC-PR may have exhibited a higher prediction performance than a single-RC circuit for the NARMA model due to the increased linear independence of X caused by connecting reservoir elements that have different time constants in parallel. We used the rank of the matrix X to evaluate the complexity and diversity of the nonlinear transformations of the input performed by VTC-PR. The rank of X is a measure of the reservoir's ability to separate input patterns. 38) The rank of X was computed based on the singular value decomposition (SVD) method, which has been used in previous studies to calculate the rank of X and evaluate reservoir performance (e.g., Refs. 39-41). To calculate the effective rank, it is necessary to remove small singular values using a threshold. In this study, the threshold was set to 10^−2. Consequently, the rank of X was calculated as 11 for a single-RC circuit reservoir and 25 for VTC-PR. Therefore, VTC-PR can convert the input time series into a high-dimensional feature space, providing richer reservoir states, and this improvement has led to its high computational performance.
© 2024 The Author(s). Published on behalf of The Japan Society of Applied Physics by IOP Publishing Ltd
Figure 3(a) shows that VTC-PR significantly improved the performance in the NARMA3 and NARMA4 tasks. Because these tasks require the storage of at least three and four previous inputs, respectively, we investigated the STM performance of VTC-PR and a single-RC circuit reservoir. The results of the STM task are presented in Fig. 3(b). The MC of VTC-PR was significantly improved compared with that of a single-RC circuit reservoir, especially for delay lengths of k = 3 and 4. Therefore, in addition to the improved rank of X, the improvement in the STM performance may have led to the improved performance of the NARMA3 and NARMA4 tasks in VTC-PR. Table II lists the MC of the STM task compared to the various physical reservoir computing systems in previous studies. The time constants of VTC-PR were optimized to maximize the MC, and the combination was τ = [0.2, 0.6, 1.1, 1.7, 2.8] s.
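The effective rank used above to compare the reservoirs can be sketched with NumPy as follows. The sketch assumes that the threshold of 10^−2 is applied to the singular values as an absolute cutoff; whether the cutoff is absolute or relative to the largest singular value is not specified here, and the function name `effective_rank` is hypothetical.

```python
import numpy as np

def effective_rank(X, threshold=1e-2):
    """Count the singular values of X that exceed an absolute threshold."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values only
    return int(np.sum(s > threshold))

# Two informative directions plus one direction below the cutoff.
A = np.diag([1.0, 0.5, 1e-3])
rank_A = effective_rank(A)

# Columns that are multiples of each other span a single direction.
B = np.outer(np.arange(1.0, 5.0), np.arange(1.0, 4.0))
rank_B = effective_rank(B)
```

Applied to a state matrix X, strongly correlated virtual nodes contribute singular values below the cutoff and are therefore not counted, which is consistent with the rank being smaller than the number of nodes.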
We have shown that VTC-PR has a high computational performance for the NARMA and STM tasks. However, this is not the only advantage of VTC-PR; the wide distribution of time constants greatly extends the timescale of the input signals that can be processed, and VTC-PR is expected to be applicable to prediction tasks for multiscale dynamics. Figure 3(c) shows the results of the computational performance evaluation of VTC-PR in the HR task. For comparison, the results of the HR task for a single-RC circuit reservoir are shown in Fig. 3(d). Here, the time constant of the single-RC circuit reservoir was set to 1 s. Although the single-RC circuit reservoir handled the one-step-ahead prediction of the fast dynamics with high accuracy, it did not estimate the evolution of the unknown slow dynamics (NMSE > 100). Note that a single-RC circuit could not accurately predict both fast and slow dynamics simultaneously, no matter what time constant was used (time constants were examined in the range of 1-100 s). Conversely, VTC-PR successfully predicted the slow dynamics one step ahead from only the fast dynamics input (NMSE < 0.04). The time constant combination for VTC-PR was τ = [1, 10, 20, 30, 40] s. The combination of time constants was optimized to minimize the sum of all prediction errors of the fast dynamics (x, y) and slow dynamics (z). The VTC-PR used for the HR task in this study consisted of only five RC circuits with different time constants, which significantly reduced the reservoir size compared to the 200-400 nodes used in a previous study. 33) Although the prediction performance was inferior to that of DTS-ESN (network-based reservoirs), VTC-PR (simple time-delayed reservoirs) could still handle the HR task accurately. These results suggest that VTC-PR, depending on its configuration, can be applied not only to nonlinear time-series tasks but also to modeling and predicting multiscale dynamics in the real world.
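The data generation for the HR task can be sketched as follows, assuming a commonly used form of the Hindmarsh-Rose equations integrated with the classical fourth-order Runge-Kutta scheme. Only x_0 = −8/5, the step Δt = 0.05, and the length of 2000 points are taken from the text. The remaining parameters (b, I, eps, s) and the initial state are illustrative assumptions, and are not claimed to match the values used in this study or to reproduce its chaotic regime.

```python
import numpy as np

def hr_rhs(state, b=3.0, I=3.25, eps=0.005, s=4.0, x0=-8.0 / 5.0):
    """Right-hand side of an assumed form of the Hindmarsh-Rose model."""
    x, y, z = state
    return np.array([
        y - x**3 + b * x**2 - z + I,  # fast variable x (membrane potential)
        1.0 - 5.0 * x**2 - y,         # fast variable y
        eps * (s * (x - x0) - z),     # slow variable z
    ])

def rk4(f, state, dt, n_steps):
    """Integrate ds/dt = f(s) with the classical fourth-order Runge-Kutta scheme."""
    traj = np.empty((n_steps, len(state)))
    for i in range(n_steps):
        k1 = f(state)
        k2 = f(state + 0.5 * dt * k1)
        k3 = f(state + 0.5 * dt * k2)
        k4 = f(state + dt * k3)
        state = state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj[i] = state
    return traj

traj = rk4(hr_rhs, np.array([-1.0, 0.0, 3.0]), dt=0.05, n_steps=2000)
fast_inputs = traj[:-1, :2]  # (x, y) at step t, used as the reservoir input
targets = traj[1:, :]        # (x, y, z) at step t + 1, the one-step-ahead target
```

The last two lines show one way to pair the fast variables with the one-step-ahead targets for all three variables. The split into transient, training, and validation segments would then be applied along the first axis.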
We propose VTC-PR, which is based on a simple time-delayed PR, with a focus on the hardware implementation of reservoir computing. The proposed PR, in which basic RC circuits are connected in parallel, was shown to establish a higher-dimensional feature space than that of a PR with a single-RC circuit. This is because RC circuits with different time constants provide non-parallel state vectors. As a result, a high performance was achieved not only in nonlinear time-series prediction but also in multiscale dynamics prediction. The present result contrasts with a recent trend in physical implementations of reservoir computing, in which the best computational performance is achieved by employing a complicated physical system as the PR. The high practicability of VTC-PR, which is simple and highly effective, supports the hardware implementation of reservoir computing systems.

Fig. 1. Schematic computational flow of VTC-PR. During the learning process, only W_out is adjusted by linear regression.

Table I. Comparison of prediction performance of NARMA2 task.

Table II. Comparison of MC in STM task.