Fusion Neural Network for Gas Concentration Prediction in Mixed Gas Environments

Due to the inherent complexity and nonlinearity of mixed gas data, existing pattern recognition algorithms used in electronic noses often struggle to predict gas concentrations accurately. To address this issue, we propose a fusion neural network that merges a Long Short-Term Memory (LSTM) network and a Temporal Convolutional Network (TCN), which we denote the LSTM-TCN fusion model. The LSTM module effectively captures long-term dependencies in time-series data, while the TCN targets local correlations, thereby enhancing prediction accuracy for complex gas concentrations. Experimental validation was conducted on a mixed gas dataset comprising ethylene and carbon monoxide. Compared with traditional models, including LSTM, TCN, and GRU, the proposed LSTM-TCN model demonstrated superior performance, achieving an R² value of 0.9922. This research holds considerable practical significance and shows promising application prospects, contributing novel insights and methods to the study and application of electronic nose technology.


Introduction
As people become more aware of environmental protection, there is an increasing demand for indoor environmental quality. However, harmful gases may be present in the air and have negative effects on people's health [1][2]. Prediction of mixed gas concentrations is one of the most important applications in environmental monitoring and air quality control [3][4][5]. By predicting the concentration of mixed gases, the presence of harmful gases can be detected and controlled early to protect people's health and safety, which also provides dedicated support for research and applications in related fields [6].
Currently, sensor technology is becoming increasingly advanced, and many soft sensor designs have been proposed [7]. Among them, gas concentration prediction methods mainly include traditional machine learning algorithms and deep learning algorithms [8]. Deep learning algorithms have made notable progress in gas concentration prediction. Bakiler et al. [9] proposed a method using the Long Short-Term Memory (LSTM) network to extract useful features when estimating different gas concentrations with an electronic nose (E-nose). Compared with other feature extraction methods, the proposed method achieved high accuracy in both classification and regression. Ni et al. [10] introduced a new deep-learning model, Gaussian-TCN, based on the Temporal Convolutional Network (TCN) and Gaussian Error Linear Units. The final results indicated that Gaussian-TCN outperforms TCN, LSTM, and GRU for CO concentration regression prediction.
Traditional machine learning algorithms have the advantages of simple model structures and fast computation. However, they are not effective at handling high-dimensional, complex non-linear data, and they struggle to fully exploit the latent features of the data, which limits their performance in gas concentration prediction. Therefore, this paper proposes a hybrid gas concentration prediction method that combines LSTM and TCN. The method pre-processes the input data appropriately and uses LSTM to model the long-term dependencies of the time series. Moreover, TCN is used to further capture local temporal dependencies, enhancing the robustness and generalization performance of the model.

Data Processing
In this study, we utilize the mixed gas dataset of carbon monoxide and ethylene collected by Fonollosa et al. [11]. We employ a series of effective data processing techniques to enhance model performance and optimize prediction outcomes. First, we apply down-sampling to reduce the original sampling frequency from 100 Hz to 20 Hz, so that the original five sampling points in each time interval are reduced to one. This significantly alleviates the computational load, accelerates the model's training and prediction, and mitigates the impact of noise, thereby enhancing prediction accuracy. Second, we use only the data from the first 40 seconds of each response stage. This not only reduces the data quantity but also retains the essential features, conserving computational resources and storage space. Importantly, this choice is made with practical application in mind: using only the early response shortens the prediction time and allows a relatively quick response. Finally, to train and evaluate the model, we divided the dataset into a training set, a validation set, and a test set at a ratio of 6:2:2.
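The preprocessing steps above can be sketched in a few lines of NumPy. The array shapes and the 16-sensor count below are illustrative assumptions, not details taken from the dataset:

```python
import numpy as np

def preprocess(signal, raw_hz=100, target_hz=20, window_s=40):
    """Down-sample a response stage and keep only its first 40 seconds.

    `signal` is a hypothetical (n_samples, n_sensors) array recorded at 100 Hz.
    """
    step = raw_hz // target_hz              # 100 Hz -> 20 Hz: keep every 5th point
    down = signal[::step]                   # down-sampled sequence
    return down[: window_s * target_hz]     # first 40 s = 800 points at 20 Hz

def split_6_2_2(data):
    """Split along the time axis into train/validation/test at a 6:2:2 ratio."""
    n = len(data)
    i, j = int(n * 0.6), int(n * 0.8)
    return data[:i], data[i:j], data[j:]

# Example: 60 s of synthetic 100 Hz data from 16 sensors
x = np.random.rand(6000, 16)
clipped = preprocess(x)                     # (800, 16)
train, val, test = split_6_2_2(clipped)     # 480 / 160 / 160 rows
```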
As depicted in Figure 1, the concentration levels of the two gases and the sensor response curves for certain time periods are illustrated. The vertical axis of the first graph shows the current concentrations of carbon monoxide (blue line) and ethylene (red line), whereas the vertical axis of the second graph shows the normalized value of the gas sensor response.

Temporal Convolutional Networks
TCN has achieved excellent performance in various sequence-related tasks by combining key techniques such as dilated convolution, causal convolution, and residual connections. These core techniques enable TCN not only to maintain causality and improve training stability but also to effectively extract local temporal features.

Dilated Causal Convolution
Dilated convolution is a method for expanding the receptive field of a convolutional kernel [12]. By introducing intervals between adjacent elements of the kernel, dilated convolution can capture a wider range of contextual information without adding extra parameters or computational complexity. In TCN, the dilation factor of each layer usually increases layer by layer to capture long-distance dependencies more effectively.
Causal convolution represents a special type of convolution, which operates solely on the current and preceding inputs, thereby ensuring that information from a future time point does not leak into the past. Causal convolution maintains causal relationships in the temporal dimension, giving TCN good online prediction capabilities. When implementing causal convolution, zero padding is typically used to prevent changes in the output sequence length. As shown in Figure 2, the output at the current moment is related only to the previous 15 historical moments: the receptive field is 15, the convolutional kernel size is 3, and the dilation coefficients are 1, 2, and 4. As the network depth increases, the dilation coefficients grow exponentially. For the one-dimensional sequence x and filter f in this paper, the dilated convolution F(t) at time t is written as Equation (1):

F(t) = (x *_d f)(t) = Σ_{i=0}^{k−1} f(i) · x_{t−d·i} (1)

where k represents the size of the filter; *_d indicates the dilated convolution; d refers to the dilation factor; and x_{t−d·i} represents the past information.
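Equation (1) can be sketched directly in NumPy; the input sequence and kernel below are toy values chosen only to make the arithmetic easy to check:

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """Compute F(t) = sum_{i=0}^{k-1} f(i) * x[t - d*i], zero-padding the past."""
    k = len(f)
    y = np.zeros_like(x, dtype=float)
    for t in range(len(x)):
        for i in range(k):
            if t - d * i >= 0:          # indices before the sequence start are zero
                y[t] += f[i] * x[t - d * i]
    return y

x = np.arange(8, dtype=float)           # toy input sequence 0, 1, ..., 7
f = np.array([1.0, 1.0, 1.0])           # kernel of size k = 3
y = dilated_causal_conv(x, f, d=2)      # dilation factor d = 2
# y[5] = x[5] + x[3] + x[1] = 9.0
```

Stacking such layers with kernel size 3 and dilation factors 1, 2, and 4 yields a receptive field of 1 + 2·(1 + 2 + 4) = 15, matching the configuration shown in Figure 2.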

Residual Connection
The residual structure of the TCN network in this paper is shown in Figure 3. Each layer comprises a dilated causal convolution, layer normalization, an activation function, and dropout. The residual connection is a strategy designed to enhance the training stability of deep networks [13]. It adds the input of each convolutional layer to its output, creating a direct pathway. This design mitigates the vanishing gradient problem, sustains gradient propagation, and consequently accelerates the model's convergence. As dilated convolutions deepen the TCN, residual connections play a crucial role in ensuring the stability of the model in multi-layer convolutional structures.
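A minimal NumPy sketch of one such residual block follows (dilated causal convolution, layer normalization, activation, identity skip). Dropout is omitted since it only acts during training, and the shapes and random weights are illustrative assumptions:

```python
import numpy as np

def layer_norm(z, eps=1e-5):
    # Normalize each time step independently across its channels
    return (z - z.mean(-1, keepdims=True)) / np.sqrt(z.var(-1, keepdims=True) + eps)

def residual_block(x, weight, d):
    """One TCN residual block: dilated causal conv -> layer norm -> ReLU, plus skip.

    x: (T, C) sequence; weight: (k, C, C) kernel; d: dilation factor.
    """
    k = weight.shape[0]
    T, _ = x.shape
    h = np.zeros_like(x)
    for t in range(T):
        for i in range(k):
            if t - d * i >= 0:                  # causal: only past inputs contribute
                h[t] += x[t - d * i] @ weight[i]
    return x + np.maximum(layer_norm(h), 0.0)   # identity skip + activated branch

x = np.random.rand(10, 4)            # 10 time steps, 4 channels
w = np.random.rand(3, 4, 4)          # kernel size 3
y = residual_block(x, w, d=2)        # output keeps the input shape (10, 4)
```

Because every operation in the block is causal, perturbing a future input cannot change any earlier output, which is what gives TCN its online prediction capability.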

Long Short-Term Memory network
LSTM is an improved version of the traditional RNN [14], with a more complex and efficient structure. The LSTM model consists of several key components: forget gates, input gates, output gates, and cell states. These components work together to provide the neural network with more refined information-processing capabilities. The basic structure of LSTM is shown in Figure 4. The forget gate filters information, retaining useful task-related information and discarding irrelevant history. The input gate determines how much of the current input is written into the cell state. Together, these two gates achieve effective filtering and fusion of input information. The output gate determines the network output based on the cell state, ensuring that valuable information is passed on to the next step.

Proposed LSTM-TCN Model
LSTM-TCN is a combination of LSTM and TCN for time series prediction tasks. LSTM is a classical recurrent neural network structure capable of handling sequential data and capturing long-distance dependencies. In the proposed model, the time series of the sensor response signal is first processed by an LSTM network to generate a sequence of LSTM hidden states. This hidden state sequence is then fed into a second LSTM layer, and finally a TCN network is used to extract temporal features. The main workflow is shown in Figure 5. The proposed LSTM-TCN model first uses an LSTM layer with 32 units to perform initial feature extraction on the input data. A second LSTM layer with 16 units then carries out further feature extraction. Finally, the model uses a TCN network to extract temporal features; the TCN has 64 filters, a kernel size of 2, dilation rates of [1, 2, 4], layer normalization, and a dropout rate of 0.2. The processed data passes through a global max-pooling layer and a fully connected layer to produce the target gas concentration prediction. The model employs the Adam optimizer with Mean Squared Error (MSE) as the loss function.
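Under the layer sizes given above, the stack can be sketched roughly as follows in Keras. The padding choice, the channel-matching 1×1 convolution in the skip path, and the input shape are assumptions, since the text does not specify them:

```python
import numpy as np
from tensorflow.keras import layers, models

def build_lstm_tcn(timesteps, n_sensors):
    """Sketch of the LSTM-TCN stack; layer sizes follow the text above."""
    inp = layers.Input(shape=(timesteps, n_sensors))
    x = layers.LSTM(32, return_sequences=True)(inp)   # first LSTM layer, 32 units
    x = layers.LSTM(16, return_sequences=True)(x)     # second LSTM layer, 16 units
    for d in (1, 2, 4):                               # TCN: dilated causal convolutions
        y = layers.Conv1D(64, kernel_size=2, dilation_rate=d, padding="causal")(x)
        y = layers.LayerNormalization()(y)
        y = layers.Activation("relu")(y)
        y = layers.Dropout(0.2)(y)
        if x.shape[-1] != 64:                         # match channels for the skip path
            x = layers.Conv1D(64, 1, padding="same")(x)
        x = layers.add([x, y])                        # residual connection
    x = layers.GlobalMaxPooling1D()(x)
    out = layers.Dense(1)(x)                          # predicted gas concentration
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")       # Adam + MSE, as in the text
    return model

model = build_lstm_tcn(timesteps=800, n_sensors=16)
pred = model.predict(np.zeros((2, 800, 16)), verbose=0)   # one scalar per sample
```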

Result
The algorithm proposed in this article was developed in Python 3.8 using PyCharm Community Edition on Windows 11, with TensorFlow 2.3.0 and CUDA 12.0. The hardware environment comprised an Intel 12700K CPU, an NVIDIA RTX 3090 Ti GPU, and 32 GB of RAM.
Table 1 and Table 2 show the prediction performance of different deep learning models (TCN, LSTM, GRU, and LSTM-TCN) on the carbon monoxide (CO) and ethylene (Eth) mixed gas dataset. Each model is evaluated with three metrics: MAE, RMSE, and R². From the two tables, the LSTM-TCN model performs best, as it combines the long- and short-term memory ability of LSTM with the local temporal feature extraction ability of TCN. GRU performs second best; even though it is a single model, it can still capture long-sequence patterns well, outperforming the standalone TCN and LSTM models. The TCN model focuses on extracting local temporal features, so its ability to capture long-term dependencies may not match that of LSTM and GRU. Conversely, the LSTM and GRU models have advantages in long- and short-term memory but may be slightly inferior to TCN in handling local temporal features. The relative performance of LSTM and GRU varies: GRU sometimes outperforms TCN and sometimes falls slightly behind, possibly because the different gating structures of GRU and LSTM capture information more effectively in some cases than in others. In summary, the LSTM-TCN model captures time series patterns more accurately and thus achieves better prediction performance.
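The three evaluation metrics used in the tables can be computed directly; the arrays below are toy values, not the paper's data:

```python
import numpy as np

def mae(y, p):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y - p)))

def rmse(y, p):
    """Root Mean Squared Error."""
    return float(np.sqrt(np.mean((y - p) ** 2)))

def r2(y, p):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - p) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([11.0, 19.0, 31.0, 39.0])
# mae = 1.0, rmse = 1.0, r2 = 1 - 4/500 = 0.992
```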

Conclusion
In this study, we explored the use of deep learning models for predicting gas concentrations in a mixed gas dataset. Specifically, we compared the performance of the TCN, LSTM, GRU, and LSTM-TCN models using the evaluation metrics MAE, RMSE, and R². The results showed that the LSTM-TCN model outperformed the other models, which can be attributed to its ability to combine the long-term memory capacity of LSTM with the local temporal feature extraction ability of TCN. Furthermore, the results indicated that the LSTM-TCN model achieved better prediction performance in a shorter training time than the other models. Overall, our study demonstrates the effectiveness of deep learning models for gas concentration prediction, with the LSTM-TCN model showing the best performance.

Figure 1. The concentration levels of the mixed gas and the corresponding normalized response curves of the gas sensor arrays.

Figure 3. Residual structure in the TCN network.

Figure 5. The proposed workflow of the LSTM-TCN network.

Figure 6 shows the experimental results on the CO and Eth mixed gas dataset under three different evaluation metrics over 200 epochs. The models generally converged around 100 epochs, whereas the LSTM-TCN model achieved the best prediction performance and converged after about 60 epochs. This indicates that the LSTM-TCN model has better training efficiency and prediction ability than the other models, as it can learn the underlying patterns and structures of the data in a shorter time. The LSTM-TCN model maintained a higher R² score than the other models throughout the training stage, possibly due to its faster adaptation to the time series features in the dataset. As training progressed, the fused model gradually exerted its advantages, capturing long-term dependencies and complex patterns in the time series and learning more diverse and complex features, thereby improving its prediction performance.

Figure 6. Performance of different models over 200 epochs.

Table 1. Performance comparison of different models on predicting ethylene concentration.

Table 2. Performance comparison of different models on predicting CO concentration.