Remaining useful life prediction towards cycling stability of organic electrochemical transistors

Organic electrochemical transistors (OECTs) show great potential in biosensors, artificial neuromorphic systems, brain-machine interfaces, etc. With the fast development of novel functional materials and new device structures, OECTs with high transconductance (gm > mS) and good cycling stabilities (> 10,000 cycles) have been developed. However, stability characterization is time-consuming, so tools for stability prediction are urgently needed to accelerate the development and commercialization of OECTs. In this paper, OECTs with good cycling stabilities are realized by minimizing the gate voltage amplitude during cycling, and a remaining useful life (RUL) prediction framework for OECTs is proposed. Specifically, OECTs based on p(g2T-T) show substantially enhanced stability, with decreases of only 46.1% in on-current (ION) and 33.2% in peak gm after 80,000 cycles (53 min). RUL prediction is then built on run-to-failure (RtF) aging tests (the cycling stability tests of OECTs). By selecting two aging parameters (ION and peak gm) as health indicators (HI), a novel multi-scale feature fusion (MFF) method for RUL prediction is proposed, which consists of a long short-term memory (LSTM) neural network based multi-scale feature generator (MFG) module for feature extraction and an attention-based feature fusion (AFF) module for feature fusion. Richer effective information is thereby utilized to improve the prediction performance, and the experimental results show the superiority of the proposed framework on multiple OECTs in RUL prediction tasks. By introducing such a framework for evaluating the lifetime of OECTs, further optimization of materials, devices, and integrated systems relevant to OECTs is expected to be stimulated. Moreover, this tool can also be extended to other relevant bioelectronics.


Introduction
Due to the rapid development of advanced organic mixed ionic-electronic (semi)conductors (OMIECs) and innovative device structures, organic electrochemical transistors (OECTs) with high transconductance (g_m), low driving voltage, and good biocompatibility have been realized [1][2][3]. Consequently, various applications, including biosensors, artificial neuromorphic devices, and brain-machine interfaces, have adopted OECTs as key sensing/processing units [4][5][6]. The ultra-high performance of OECTs relies on bulk electrochemical doping/dedoping of the OMIEC channel, which constantly involves ion injection into and ejection from the channel (along with water molecules when an aqueous electrolyte is used) [7][8][9][10]. Such a process not only swells the OMIECs and leads to uncontrollable microstructure/nanostructure variation, but also places constant electrochemical stress on the OMIECs, which can gradually alter the chemical composition of the active materials [11,12]. Consequently, developing OECTs with high cycling stability has been a research focus in recent years, and cycling stability has been gradually improved by incorporating novel OMIECs and device structures.
Up to now, OECTs with >10,000 cycles and >1 h cycling time have been extensively reported [13][14][15][16][17][18]. For instance, by incorporating diazaisoindigo and fluorinated thiophene units in an n-type OMIEC (gAIID-2FT), Wan et al. fabricated OECTs with exceptional cycling stability (>3 h), which is attributed to the integration of strong electron acceptors with donor fluorination [19]. Moreover, by adopting a solid-state electrolyte, Wang et al. introduced an OECT for neuromorphic processing with excellent switching endurance (>100,000,000 cycles) [20]. Note that, besides the solid-state electrolyte minimizing the swelling of the OMIEC, such high cycling stability may also be attributed to the non-conventional cycling paradigm, which does not fully dope/dedope the channel in each cycle and thus introduces less stress during the cycling test. By introducing a vertically stacked channel geometry, Facchetti et al. demonstrated both p- and n-type vertical OECTs that remain stable for more than 50,000 cycles [3]. As the OMIEC is effectively 'encapsulated' by the top electrode in the vertical structure, the swelling effect is expected to be partially controlled.
Although the cycling stability of OECTs is being improved continuously, the corresponding characterization and measurement of the performance parameters are labor-intensive and time-consuming. Accurate and fast prediction of electronic device lifetime from specified aging parameters could unlock new opportunities in the mass production, utilization, and optimization of various electronics, such as Li-ion batteries [21][22][23]. Therefore, it is expected that the research and development (R&D) cycle of OECTs can be effectively accelerated by rapidly validating innovative device architectures and advanced functional materials for the OECT channel. However, to the best of our knowledge, research on integrating remaining useful life (RUL) prediction with OECT cycling stability is still in its infancy, and we are not aware of any existing methods that address this challenging issue.
Here, OECTs with enhanced cycling stability, obtained by simply manipulating the cycling gate voltage (V_G), are first fabricated and characterized, followed by a multi-scale feature fusion (MFF) method for RUL prediction. Specifically, it is observed that decreasing the cycling voltage window from 0.4 ∼ −0.6 V to 0.4 ∼ −0.5 V effectively enhances the cycling stability: the on-current (I_ON, the drain current under the highest given V_G in the transfer curve) decreases by 87.7% and 46.1% after 80,000 cycles, respectively. Building on these aging data, a deep learning (DL) based MFF method is presented to predict the RUL of OECTs with diverse cycling stability. Specifically, two contributions are summarized as follows: (1) A long short-term memory (LSTM) neural network is utilized to effectively extract temporal aging features from two time series-based aging parameters (i.e., I_ON and peak g_m) of OECTs; (2) Because the global and local aging features contained in each parameter differ, their importance to the prediction results also varies. Accordingly, an attention mechanism is used to fuse global and local temporal features. It is shown that the developed RUL prediction protocol can accurately and efficiently predict the lifetimes of three different OECTs with varying cycling stabilities. Therefore, the RUL prediction methodology can be seamlessly integrated with the estimation of performance variation, especially the cycling stability of OECTs. Introducing such a tool into the design and characterization of OECTs with both high performance and stability is expected to accelerate the R&D cycle, experimental validation, and large-scale commercialization in the bioelectronics industry.

Fabrication of OECTs
Si/SiO2 wafers (total wafer thickness of 625 μm, including 100 nm of SiO2), which act as the substrate, are first cleaned in an ultrasonic bath of isopropanol, blown dry with N2, and treated with UV-ozone (15 min, Shenzhen Hwo Technology Co., Ltd.). Then, gold source and drain electrodes (120 nm) are deposited on the substrate through shadow masks with a vacuum thermal evaporator (5 × 10^−4 Pa) at an evaporation rate of 1.5 Å s^−1 (note that 3 nm of Cr is evaporated before Au to enhance interface adhesion). The resulting channel width and length are 200 μm and 20 μm, respectively. Next, the P(g2T-T):DtFDA blend solution is spin-coated at 3000 rpm for 20 s to form an OMIEC channel of ∼100 nm. Subsequently, the blend film is exposed to 365 nm UV light (450 mW cm^−2) for 2 min through a shadow mask to crosslink the channel area, developed in chloroform for 5 s, and blown dry with N2. Note that un-crosslinked DtFDA is expected to be washed away during development. Last, the Cin-Cell:DtFDA blend solution is spin-coated and patterned to form the encapsulation layer, following the same procedure as for the P(g2T-T):DtFDA blend but with a different shadow mask.

Cycling stability characterization of OECTs
The transfer and cycling characteristics of the OECTs are measured in air with a source meter (FS-Pro), with the OECTs contacted on a probe station (Chengdu Chiptest Technology CO., Ltd.). A PDMS well is placed on top of the channel area to hold ∼40 μl of PBS (1×) electrolyte, and an Ag/AgCl electrode (World Precision Instruments, EP1) is immersed in the electrolyte to act as the gate electrode. During the cycling stability characterization, a transfer curve of the OECT is first collected, followed by a drain current (I_D) cycling test (500 cycles, where I_D versus time is recorded under a square-wave V_G switching between 0.4 V and −0.5 V or between 0.4 V and −0.6 V, with a frequency of 25 Hz and V_D = −0.1 V). This process is then repeated multiple times to evaluate the long-term cycling stability of the OECTs.
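As a quick check on the time scale of this protocol, the pure cycling time follows directly from the 25 Hz square wave and the 500-cycle blocks. The sketch below is illustrative only: the function names are hypothetical, and the time needed to collect the transfer curves between blocks is not specified in the text, so it is left out.

```python
# Illustrative arithmetic for the cycling protocol (function names are hypothetical).
CYCLE_FREQUENCY_HZ = 25   # square-wave frequency stated in the text
CYCLES_PER_BLOCK = 500    # cycles recorded after each transfer curve


def cycling_time_s(total_cycles, frequency_hz=CYCLE_FREQUENCY_HZ):
    """Pure cycling time in seconds, excluding the transfer-curve sweeps."""
    return total_cycles / frequency_hz


def number_of_blocks(total_cycles, cycles_per_block=CYCLES_PER_BLOCK):
    """Number of cycling blocks (rounded up) needed to reach total_cycles."""
    return -(-total_cycles // cycles_per_block)  # ceiling division on integers


if __name__ == "__main__":
    total = 80_000
    print(cycling_time_s(total) / 60)  # minutes of pure cycling
    print(number_of_blocks(total))     # blocks, each preceded by a transfer curve
```

For 80,000 cycles this gives 3200 s of pure cycling, i.e. about 53 min, spread over 160 blocks.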

Multi-scale feature fusion method for RUL prediction
As shown in figure 1, the proposed MFF consists of a multi-scale feature generator (MFG) and an attention-based feature fusion mechanism (AFF). Specifically, MFG extracts multi-scale temporal aging features from the aging parameters, so that aging information at every level is included. AFF then fuses the multi-scale aging features by calculating the corresponding weights of the global and local aging features.

Multi-scale feature extraction
The proposed MFG is structured as a four-layer LSTM to extract aging features across various levels of depth. LSTM, as an enhanced recurrent neural network model, excels at capturing the long-term dependencies in OECT aging data and effectively addresses the issue of gradient vanishing. As shown in figure 2, LSTM maintains a memory cell $c_t$ to preserve the long-term state in the sequence until time step $t$. The behavior of $c_t$ is determined by the forget gate $f_t$, the input gate $i_t$, and the output gate $o_t$.
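First, the forget gate $f_t$ determines how much of the aging information stored in the last memory cell $c_{t-1}$ is kept in the current memory cell $c_t$:

$$f_t = \sigma\left(U_f x_t + W_f h_{t-1} + b_f\right)$$

where $x_t$ is the input of OECT aging parameters at time step $t$, $h_{t-1}$ is the hidden state of OECT aging information at time step $t-1$, $\sigma$ is the Sigmoid function, and $U$, $W$, and $b$ are the weights and biases, respectively. Then, the input gate $i_t$ is calculated, which decides the aging information to be updated for the current input:

$$i_t = \sigma\left(U_i x_t + W_i h_{t-1} + b_i\right)$$

The output gate $o_t$ is used to decide the influence of the memory cell $c_t$ on the new hidden state $h_t$:

$$o_t = \sigma\left(U_o x_t + W_o h_{t-1} + b_o\right)$$

A hidden state vector $\hat{c}_t$ is generated, contributing to the current memory cell:

$$\hat{c}_t = \tanh\left(U_c x_t + W_c h_{t-1} + b_c\right)$$

The memory cell $c_t$ is calculated according to $f_t$, $c_{t-1}$, $i_t$, and $\hat{c}_t$:

$$c_t = f_t \odot c_{t-1} + i_t \odot \hat{c}_t$$

The output hidden state $h_t$ is calculated for the next LSTM unit:

$$h_t = o_t \odot \tanh\left(c_t\right)$$

where $\odot$ denotes the Hadamard product.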
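To make the gate updates concrete, the following is a minimal sketch of a single LSTM time step in plain Python. It follows the standard LSTM formulation and assumes that U acts on the input and W on the previous hidden state; the function names, the parameter layout, and the toy values are hypothetical and are not taken from the authors' implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(U, x, W, h, b):
    """Compute U x + W h + b for list-based matrices and vectors."""
    return [
        sum(u * xj for u, xj in zip(U_row, x))
        + sum(w * hj for w, hj in zip(W_row, h))
        + b_k
        for U_row, W_row, b_k in zip(U, W, b)
    ]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'o', 'c' to a tuple (U, W, b)."""
    pre = {gate: affine(U, x_t, W, h_prev, b) for gate, (U, W, b) in params.items()}
    f_t = [sigmoid(v) for v in pre["f"]]        # forget gate
    i_t = [sigmoid(v) for v in pre["i"]]        # input gate
    o_t = [sigmoid(v) for v in pre["o"]]        # output gate
    c_hat = [math.tanh(v) for v in pre["c"]]    # candidate memory content
    # Memory cell: keep part of the old state and add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    # Hidden state: gated, squashed view of the memory cell.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


if __name__ == "__main__":
    # Toy setting (assumed): 2 inputs, e.g. normalized I_ON and peak g_m, 1 hidden unit.
    zero = ([[0.0, 0.0]], [[0.0]], [0.0])
    params = {"f": zero, "i": zero, "o": zero, "c": ([[1.0, 1.0]], [[0.0]], [0.0])}
    h, c = lstm_step([0.5, 0.5], [0.0], [0.0], params)
    print(h, c)
```

Running the step repeatedly over a window of aging samples, feeding each returned pair back in as `h_prev` and `c_prev`, yields the hidden-state sequence that the next layer would consume.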
To fully leverage the temporal aging information contained in the aging parameters, the proposed method extracts aging features at different scales and concatenates them together. The multi-scale aging features of OECTs can be formulated as:

$$F = \left[F_1, F_2, \ldots, F_i\right]$$

where $F$ is the multi-scale aging features, $i$ is the number of LSTM layers, and $[F_1, F_2, \ldots, F_i]$ denotes the concatenation of the features generated by LSTM layers $1, 2, \ldots, i$.
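The stacking-and-concatenation idea can be sketched as follows. To keep the example short, each layer is a simple scalar recurrent stand-in rather than an LSTM, and the feature taken from each layer is its last output; both choices are assumptions made for illustration, and all names are hypothetical.

```python
import math


def recurrent_layer(sequence, weight):
    """Stand-in recurrent layer (NOT an LSTM): h_t = tanh(weight * (x_t + h_{t-1}))."""
    h, outputs = 0.0, []
    for x_t in sequence:
        h = math.tanh(weight * (x_t + h))
        outputs.append(h)
    return outputs


def multi_scale_features(sequence, layer_weights):
    """Stack the layers and concatenate one feature per layer: F = [F_1, ..., F_i]."""
    features, current = [], sequence
    for weight in layer_weights:
        current = recurrent_layer(current, weight)  # layer k reads the output of layer k-1
        features.append(current[-1])                # F_k: last-step summary (assumed)
    return features


if __name__ == "__main__":
    window = [0.9, 0.8, 0.7, 0.6]                   # toy aging window, made-up values
    print(multi_scale_features(window, [1.0, 0.5, 0.5, 0.5]))  # four layers, four features
```

Shallow layers stay close to the raw aging signal, while deeper layers see an increasingly transformed version of it, which is why the concatenated vector carries information from several depths at once.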

Attention-based feature fusion
To effectively fuse the multi-scale features extracted from OECT aging parameters, an attention-based feature fusion (AFF) mechanism is developed to fuse the local and global aging patterns of different aging features. The local aging pattern refers to the unique aging trend in each aging feature, while the global aging pattern refers to the common aging trend contained in different aging features of OECTs. The key idea is that channel attention can be implemented at multiple scales by varying the spatial pooling size. To keep AFF as lightweight as possible, the local aging pattern of the OECT aging features is added to the global aging pattern by the attention block [26].
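The local aging features $L(X) \in \mathbb{R}^{C \times H \times W}$ are computed as follows:

$$L(X) = B\left(\mathrm{Conv}_2\left(\delta\left(B\left(\mathrm{Conv}_1(X)\right)\right)\right)\right)$$

The global aging features $G(X) \in \mathbb{R}^{C \times H \times W}$ are computed as follows:

$$G(X) = B\left(\mathrm{Conv}_2\left(\delta\left(B\left(\mathrm{Conv}_1(g(X))\right)\right)\right)\right)$$

where $g(X) = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_{[:,\,i,\,j]}$ is the global average pooling.
Given the global aging features $G(X)$ and local aging features $L(X)$, the refined feature $X' \in \mathbb{R}^{C \times H \times W}$ produced by the attention block can be obtained as follows:

$$X' = X \otimes M(X) = X \otimes \sigma\left(L(X) \oplus G(X)\right)$$

where $M(X) \in \mathbb{R}^{C \times H \times W}$ denotes the attentional weights generated by the attention block, $\sigma$ is the Sigmoid function, $\oplus$ denotes broadcasting addition, and $\otimes$ denotes element-wise multiplication.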
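A minimal sketch of the attention block is given below, assuming the bottleneck structure Conv1, ReLU, Conv2 in both the local and the global branch. It is written in plain Python on a feature of shape C × P (the H × W positions flattened into P), batch normalization is treated as the identity for brevity, and all weights and values are made up for illustration.

```python
import math


def pointwise_conv(weight, feature):
    """1x1 convolution: mix channels independently at every position."""
    positions = len(feature[0])
    return [
        [sum(w * feature[c][p] for c, w in enumerate(row)) for p in range(positions)]
        for row in weight
    ]


def relu(feature):
    return [[max(0.0, v) for v in row] for row in feature]


def bottleneck(feature, w1, w2):
    """Conv1 -> ReLU -> Conv2; batch normalization is omitted (assumed identity)."""
    return pointwise_conv(w2, relu(pointwise_conv(w1, feature)))


def attention_weights(x, local_w, global_w):
    """M(X) = sigmoid(L(X) + G(X)), with G(X) broadcast over all positions."""
    local = bottleneck(x, *local_w)                  # shape C x P
    pooled = [[sum(row) / len(row)] for row in x]    # global average pooling, C x 1
    global_ = bottleneck(pooled, *global_w)          # shape C x 1
    return [
        [1.0 / (1.0 + math.exp(-(loc + g_row[0]))) for loc in l_row]
        for l_row, g_row in zip(local, global_)
    ]


def refine(x, local_w, global_w):
    """X' = X * M(X), element-wise."""
    m = attention_weights(x, local_w, global_w)
    return [[xv * mv for xv, mv in zip(x_row, m_row)] for x_row, m_row in zip(x, m)]


if __name__ == "__main__":
    # C = 2 channels and r = 2, so the bottleneck has C / r = 1 channel.
    w1 = [[0.5, 0.5]]       # Conv1 weights, shape (C/r) x C
    w2 = [[1.0], [-1.0]]    # Conv2 weights, shape C x (C/r)
    x = [[0.2, 0.4, 0.6], [1.0, 0.8, 0.6]]
    print(refine(x, (w1, w2), (w1, w2)))
```

Because the global branch sees only the pooled feature, it contributes one value per channel, while the local branch contributes one value per channel and position; adding the two before the Sigmoid is what lets a single weight map reflect both patterns.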
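Given two aging features $X$ and $Y$, the fused features $Z$ can be expressed as

$$Z = M(X \uplus Y) \otimes X + \left(1 - M(X \uplus Y)\right) \otimes Y$$

where $X \uplus Y$ denotes the initial integration of the two features (for example, element-wise summation). Finally, the objective function of the proposed method can be formulated as: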
To evaluate our proposed method for RUL prediction tasks, the aging data of each kind of OECT are divided into different partitions of training data and testing data. All methods are compared under the same experimental setting. For a fair comparison, all methods are evaluated on 50%-50% tasks, i.e. trained on 50% of the data and tested on the other 50% [32].
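The partitioning can be sketched as below. The text does not state how the two halves are chosen; this sketch assumes a chronological split (early aging data for training, later data for testing), and the function name is hypothetical.

```python
def chronological_split(samples, train_fraction=0.5):
    """Split an ordered list of samples into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


if __name__ == "__main__":
    windows = list(range(10))          # stand-in for ten time-ordered aging windows
    train, test = chronological_split(windows, 0.5)
    print(train, test)
```

Keeping the time order means the model is always tested on a later, more degraded stage than it was trained on, which is the situation an RUL predictor faces in use.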

Cycling stability of OECTs
Based on the fabricated OECTs and the cycling stability characterization method introduced above, the cycling stabilities of the OECTs are tested under two different V_G cycling biases (from 0.4 V to −0.5 V and from 0.4 V to −0.6 V). As shown in figures 3(a)-(c), the degradation trends of the OECTs are similar under the different V_G biases: decreasing I_ON and peak g_m (g_m_peak), and a negatively shifted V_ON, are extracted from the transfer curves as the cycle number increases. Specifically, under V_G from 0.4 V to −0.5 V (OECT-1), I_ON and peak g_m decrease by only 46.1% and 33.2% after 80,000 cycles, respectively, along with a shift of V_ON from −0.04 V to −0.23 V. In contrast, when the V_G cycling window is widened by 0.1 V (from 0.4 V to −0.6 V, OECT-2), larger decreases in I_ON and peak g_m (87.7% and 82.5%, respectively) are obtained, along with an apparent V_ON shift from −0.01 V to −0.20 V. Therefore, it is suggested that a narrower V_G cycling window enhances the stability of OECTs, which may be attributed to less stress on the OMIEC due to fewer injected ions and lower current passing through the material. An additional cycling test on another OECT (OECT-3) with V_G from 0.4 V to −0.6 V gives a similar degradation trend, indicating the reliability of the above results. Note that it takes ∼4 h to conduct such simple cycling stability tests, which severely hinders the efficient evaluation of the lifetime of OECTs. Cycling stability results for other OMIEC blends and other innovative device structures are not shown, as the major task of the current work is to introduce an efficient RUL prediction protocol for OECTs.

Experimental result on RUL prediction
Based on the above cycling stability aging results, 30 cycles of data are contained in each time window to ensure sufficient temporal information. The health indicator (HI) ∈ [0, 1] directly indicates the degree of degradation and is appropriate as the label of a sample in neural network training. In this paper, HI is defined as the ratio of the remaining cycles to all cycles, from which the remaining cycles can be calculated directly. Similar to [32], we assess the prediction accuracy of HI by using three statistical metrics: mean square error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
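The HI definition and the windowing can be sketched as follows. The sketch assumes consecutive, non-overlapping windows and labels each window with the HI at its last cycle; neither convention is stated in the text, and all function names are hypothetical.

```python
WINDOW_CYCLES = 30  # cycles of data per time window, as stated in the text


def health_indicator(current_cycle, total_cycles):
    """HI = remaining cycles / all cycles: 1 at the start, 0 at end of life."""
    return (total_cycles - current_cycle) / total_cycles


def rul_from_hi(hi, total_cycles):
    """Invert the HI definition to recover the remaining cycles."""
    return hi * total_cycles


def make_windows(series, window=WINDOW_CYCLES):
    """Cut an ordered series into consecutive, non-overlapping windows (assumed)."""
    return [series[k:k + window] for k in range(0, len(series) - window + 1, window)]


def label_windows(n_windows, total_cycles, window=WINDOW_CYCLES):
    """Label each window with the HI at its last cycle (an assumed convention)."""
    return [health_indicator((k + 1) * window, total_cycles) for k in range(n_windows)]


if __name__ == "__main__":
    series = list(range(90))                     # stand-in for 90 cycles of aging data
    windows = make_windows(series)
    labels = label_windows(len(windows), total_cycles=len(series))
    print(len(windows), labels)
```

Because HI is a simple ratio, a predicted HI converts back to a cycle count by multiplying with the total number of cycles, which is how an error in HI translates into an error in cycles.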
The MSE indicator represents the mean of the squared error between the predicted and ground-truth HIs. It can be formulated as:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ denotes the real HI, $\hat{y}_i$ denotes the predicted HI, and $n$ is the number of samples. The rule of thumb suggests that the prediction model obtains satisfactory performance when the MSE is smaller than 0.05.
The MAE indicator measures the absolute error between the ground-truth and predicted HIs, and the MAPE quantifies model performance as a percentage. MAE and MAPE can be formulated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|, \qquad \mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

Lower MAE, MSE and MAPE indicate better prediction performance. We also calculate the MAE of RUL to show directly the error, in cycle numbers, over the whole degradation process. The MAE of RUL can be formulated as:

$$\mathrm{MAE}_{\mathrm{RUL}} = \frac{1}{n}\sum_{i=1}^{n}\left|z_i - \hat{z}_i\right|$$

where $z_i$ denotes the real RUL value and $\hat{z}_i$ denotes the predicted RUL value.
Experimental results on the RUL prediction tasks of our method and the compared methods are reported in table 1 and visualized in figure 4. Bold numbers represent optimal results. These results show that our proposed method achieves satisfactory results for all kinds of OECTs. Compared with other methods, our proposed method achieves the best results on OECT-1 and OECT-2. Over the 160,000 cycles of the whole process, our method achieves the best results, with 4585 error cycles on OECT-1 and 3653 error cycles on OECT-2 in RUL prediction tasks.
For OECT-3, LA [30] obtains a lower error; however, the LA curve in figure 4(c) oscillates more strongly in the test stage. In contrast, the curve of our prediction result is smoother and closer to the real degradation trend, which demonstrates the robustness of our method. By extracting features of each parameter at different scales and fusing them with an attention-based strategy, our proposed method is able to leverage the degradation information in the precursor parameters. Hence, our method has high performance and robustness in RUL prediction tasks.
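The error metrics used in this section can be computed as in the sketch below, which uses the standard definitions of MSE, MAE and MAPE. Two points are assumptions of this sketch rather than statements from the text: samples whose true HI is zero are skipped in MAPE to avoid division by zero, and the RUL error is obtained by scaling the HI error with the total cycle number. The values in the usage part are made up.

```python
def mse(y_true, y_pred):
    """Mean square error between true and predicted HIs."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted HIs."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error; zero targets are skipped (an assumption)."""
    terms = [abs((y - p) / y) for y, p in zip(y_true, y_pred) if y != 0]
    return 100.0 * sum(terms) / len(terms)


def mae_rul(hi_true, hi_pred, total_cycles):
    """MAE in cycles, using RUL = HI * total cycles."""
    return total_cycles * mae(hi_true, hi_pred)


if __name__ == "__main__":
    hi_true = [1.0, 0.5, 0.25]     # toy values, not experimental data
    hi_pred = [0.75, 0.5, 0.5]
    print(mse(hi_true, hi_pred), mae(hi_true, hi_pred), mape(hi_true, hi_pred))
    print(mae_rul(hi_true, hi_pred, total_cycles=1000))
```

MAPE grows quickly near the end of life, where the true HI approaches zero, so it is best read together with MAE and MSE rather than on its own.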

Discussion about training size
In practice, collecting aging data of OECTs by RtF tests is time-consuming and resource-consuming. It is therefore necessary to balance experimental cost against prediction accuracy by choosing an appropriate training size. To find the appropriate training size, we conduct experiments under different training size conditions to analyze our method in detail.
The results of our method on RUL prediction tasks with different training sizes are shown in table 2. A 70%-30% task means training the model with 70% of the data and testing with the rest. The prediction performance is poor when little training data is available, and the prediction accuracy increases as the training size increases. When the training size reaches a threshold, our proposed method achieves a satisfactory result. As shown in figure 5, when the training size reaches 45% for OECT-1 and OECT-2, and 50% for OECT-3, our method obtains an MSE of less than 0.05. This result indicates that an appropriate training size can be found for each kind of OECT to achieve a prediction result with an MSE under 0.05.
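The threshold search described above can be sketched as follows: given the test MSE obtained at each training fraction, pick the smallest fraction that falls below the 0.05 rule of thumb. The function name is hypothetical and the numbers in the usage part are placeholders chosen for illustration, not values from table 2.

```python
def smallest_sufficient_fraction(mse_by_fraction, threshold=0.05):
    """Return the smallest training fraction whose MSE is below threshold, or None."""
    for fraction in sorted(mse_by_fraction):
        if mse_by_fraction[fraction] < threshold:
            return fraction
    return None


if __name__ == "__main__":
    # Placeholder results (made up): training fraction -> test MSE.
    toy_results = {0.3: 0.20, 0.4: 0.08, 0.5: 0.03, 0.6: 0.02}
    print(smallest_sufficient_fraction(toy_results))
```

In a real study each entry would come from retraining the model at that training size, so the search cost is dominated by training rather than by this selection step.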

Evaluation of AFF strategy
We verify the effect of the AFF strategy by conducting experiments on 50%-50% prediction tasks with different feature-processing variants (MFF I, MFF G, MFF*, and MFF). Table 3 shows the results of the comparison experiments: the MAE_RUL of MFF is lower than those of MFF I, MFF G and MFF*, which demonstrates the effectiveness of the AFF strategy in exploiting information from two parameters. It is worth noting that directly concatenating the features of the two parameters already gives a better result than a single-dimensional HI, which shows that the aging parameters contain complementary information for RUL prediction.

Conclusion
In this paper, a novel RUL prediction method is proposed to accurately and efficiently predict the lifetimes of OECTs with diverse cycling stabilities, which is expected to facilitate the R&D cycle, experimental validation, and large-scale commercialization of OECTs. A key characteristic of the proposed RUL prediction model is the integration of an LSTM-based multi-scale feature extractor with an attention-based feature fusion module, which lays the foundation for mining richer degradation information hidden in two time series-based aging parameters of OECTs. The experimental results indicate that OECTs based on p(g2T-T) under an appropriate gate bias (i.e., 0.4 ∼ −0.5 V) show clearly enhanced cycling stability even after 80,000 cycles. Meanwhile, compared with other popular DL-based prediction approaches, the proposed RUL prediction method obtains the best prediction accuracy on two of the three OECTs (OECT-1 and OECT-2) and a smoother, more robust prediction on OECT-3, which supports its effectiveness and generalization capacity.

Figure 1. Overview of the RUL prediction framework based on MFF for OECTs. It consists of a multi-scale feature generating (MFG) module and an attention-based feature fusion (AFF) module. MFG extracts temporal aging features at different depths using a multi-scale LSTM neural network. AFF extracts global attention and local attention features through two branches, which aims to obtain more representative fused aging features by finding the global aging pattern of OECTs and the local aging pattern of each aging parameter.
In the expressions for the local and global aging features, $B$ denotes batch normalization (BN), $\delta$ denotes the rectified linear unit (ReLU), and $\mathrm{Conv}_1$ and $\mathrm{Conv}_2$ denote the convolutional layers. The kernel sizes of $\mathrm{Conv}_1$ and $\mathrm{Conv}_2$ are $\frac{C}{r} \times C \times 1 \times 1$ and $C \times \frac{C}{r} \times 1 \times 1$, respectively. $C$ is the number of channels and $r$ is a hyperparameter that denotes the scaling ratio of $C$. It is worth noting that $L(X)$ has the same shape as the input feature, which can preserve and highlight the subtle details in the shallow features of the multi-scale aging features.

Figure 2. Network structure of LSTM. The aging characteristics of OECTs are conveyed in each LSTM unit, guaranteeing the retention of valuable aging information.

Figure 4. Prediction results of different RUL methods on the 50%-50% task for each kind of OECT. The results show that our proposed method achieves satisfactory results for all kinds of OECTs.

(1) MFF I: model trained with features extracted from I_ON only; (2) MFF G: model trained with features extracted from g_m only; (3) MFF*: model trained with 2D features obtained by directly concatenating the features extracted from each parameter; (4) MFF: model trained with features extracted from each parameter and fused with the AFF strategy.

Figure 5. MAE of our method on RUL prediction tasks with different training sizes for each kind of OECT. It indicates that an appropriate training size can be found to balance data collection against prediction accuracy.

Table 1. Prediction results of different RUL methods on 50%-50% tasks for each kind of OECT.

Table 2. Results of our method on RUL prediction tasks with different training sizes.