Soft-Sensing Estimation of Optical Density for PHA Production Using Multilayer Perceptron Neural Network

Biomass concentration is an important indicator of production rate in the polyhydroxyalkanoate (PHA) fermentation process. In current practice, biomass concentration is measured off-line by laboratory analysis, which is unsuitable for online process monitoring and control. A soft sensor is often used as an alternative that estimates hard-to-measure parameters from easy-to-measure process data. However, most soft-sensor studies use simulated data or data generated from a mathematical model that was developed without full consideration of process and measurement uncertainty. In this study, a soft sensor is developed from real production data for PHA fermentation in a pilot-scale bioreactor, with appropriate data pre-processing techniques applied to the process data obtained from this system. A multilayer perceptron (MLP) neural network is used to estimate biomass concentration using secondary process parameters, namely dissolved oxygen (DO), temperature, pH and agitation speed, as inputs. Different models are developed from different batches of production data and various network architectures in order to find the combination of process data and network topology that gives the best model accuracy. Results indicate that the biomass soft sensor developed using the MLP neural network provides a better estimate of biomass than radial basis function (RBF) neural network and support vector regression (SVR) methods. The developed soft sensor can be further used in monitoring and control of production output.


Introduction
In bioprocesses, the output variable related to product quality is usually a biological parameter rather than a physical or chemical variable. Online sensors for biological parameters are either unavailable, costly to operate and maintain, or difficult for the end user to maintain [1]. In bioplastic production of polyhydroxyalkanoates (PHA), optical density (OD) is often used as an indicator of biomass, which is the output quality variable of the fermentation process [2]. Measurement of OD, alternatively known as turbidity, is among the simplest and most frequently used biomass monitoring technologies and has been used in many microbial, fungal and mammalian bioprocesses [1], [3]. OD values are commonly obtained through offline analysis, in which the product is sampled and time-consuming laboratory analysis is performed. This approach is not suitable for online monitoring and control of the fermentation process.
A review of biomass estimation methods [4] suggested the use of optical density measurement together with a software sensor based on the commonly measured online process variables of the bioreactor. This approach is an economical means of biomass estimation as it uses the existing sensors in the bioreactor. Wide commercial production of PHA is limited by its high cost [5]; thus, improvement in process monitoring at minimal cost is desirable.
A soft sensor, a term that combines "software" and "sensor", is an alternative to laboratory analysis. Its principle is that an estimate of a process output variable is obtained from online data that are relatively easy to measure or access [6]. The use of soft sensors helps in understanding complex bioprocess dynamics and batch variations [1].
There are two approaches to soft sensor development, namely model-driven and data-driven. Model-driven soft sensors are mostly based on first-principle models that describe the physical and chemical background of the process, and thus require in-depth knowledge of the process [7], [8]. The complex nature of bioprocesses requires many assumptions, resulting in inaccurate model-based soft sensors.
A data-driven soft sensor reflects the actual conditions of the process because it is built from data collected in the processing plant. Nonlinear data-driven soft sensor models are used in this work to account for process nonlinearity. In addition, a comparative study on E. coli showed that an artificial neural network (ANN), which is a nonlinear method, outperformed linear software sensor methods [4].
Artificial neural networks have been used in soft sensor development to estimate process output variables related to product quality. Examples include biomass concentration for baker's yeast fermentation [9], glucose concentration for secreted protein production [10], biomass, substrate and nosiheptide concentrations in nosiheptide fermentation [11], biomass in nosiheptide fermentation [12], optical density and dry cell weight in recombinant protein production [13] and hydrogen in biological hydrogen production [14]. However, none of these works uses large-scale real production data. The ANN soft sensors were built only from data generated by simulation or mathematical models, which do not fully account for process and measurement uncertainty, or from lab-scale production processes.
In this work, process data from a 2000 L pilot-scale bioreactor for PHA production were used to develop a soft sensor for optical density estimation using a multilayer perceptron (MLP) neural network. The process data were pre-processed using interpolation and normalization techniques. For comparison with other nonlinear data-driven methods, Radial Basis Function Neural Network (RBFNN) and Support Vector Regression (SVR) soft sensors were also developed.

Fermentation Process Data
The PHA fermentation process was carried out in a 2000 L pilot-scale fed-batch bioreactor. The bioreactor was fitted with in-situ optical sensors to measure process variables such as temperature, pH, agitation speed and dissolved oxygen level. The bioreactor is equipped with a Supervisory Control and Data Acquisition (SCADA) system. Process data from six different fermentation batches (A to F) were recorded. The fermentation batches vary in length from about 21 hours to 45 hours.
The chosen inputs are temperature, agitation speed, pH, and dissolved oxygen (DO), while the output is optical density (OD). These variables have a functional relationship with optical density, a variable that indicates the amount of biomass in the fermentation broth. In a fermentation process, pH, temperature and dissolved oxygen are important environmental variables that need to be controlled. Low temperature slows protein reactions, while high temperature alters the characteristic properties of proteins. A suitable pH level is required for efficient transfer of feedstuff to the cell membrane and of energy out of the cell [15]. The output of PHA production is strongly influenced by operating conditions such as temperature, pH level and dissolved oxygen [16], [17]. Dissolved oxygen is a key micro-nutrient for growth, metabolic production and maintenance of the microorganism. Adjusting the agitation speed is one of the means of controlling DO concentration in a fermentation broth [18].

Data Pre-processing
Temperature, pH, agitation speed and dissolved oxygen are the online process data obtained from the Supervisory Control and Data Acquisition (SCADA) system with a sampling time of 1 minute. These data are then downsampled to one sample per hour so that the difference in sampling frequency relative to the offline data is adequately reduced. The process output variable, optical density, is sampled offline at a lower frequency of one sample every 6 hours.
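The downsampling step can be illustrated with a minimal sketch. The source does not state whether the hourly values were obtained by averaging or by keeping every 60th sample, so block averaging is an assumption here, and the function name and the pH values are hypothetical.

```python
from statistics import mean

def downsample_mean(values, factor):
    """Average consecutive blocks of `factor` samples.

    Assumption: block averaging; the source does not say how the
    1-minute data were reduced. A trailing partial block is dropped.
    """
    n_blocks = len(values) // factor
    return [mean(values[i * factor:(i + 1) * factor]) for i in range(n_blocks)]

# Hypothetical 1-minute pH log covering 3 hours (180 samples).
ph_per_minute = [7.0 + 0.001 * t for t in range(180)]

# One value per hour, matching the 1 hour/sample rate used in the text.
ph_per_hour = downsample_mean(ph_per_minute, 60)
```

Averaging also suppresses some sensor noise, whereas simple decimation would keep the original instantaneous readings; either choice yields the same hourly time grid.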
Optical density values are then interpolated using the Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) so that the input and output data have equal sampling times for use in soft sensor modelling. PCHIP was chosen rather than linear interpolation since it interpolates while preserving the shape of the data without unnecessary oscillations [19], [20]. Figure 1 shows the comparison between the original optical density values, PCHIP interpolation and linear interpolation for Batch A.
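To make the shape-preserving property concrete, the following is a minimal pure-Python sketch of a PCHIP-style interpolant. It is not the authors' implementation: the interior slopes use the weighted harmonic mean commonly used for PCHIP, the end-point slopes are simplified to secant slopes (library routines typically use a three-point end formula, so values near the ends may differ slightly), and the OD samples are hypothetical.

```python
import bisect

def pchip_slopes(x, y):
    """Knot derivatives for a shape-preserving cubic Hermite interpolant."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    delta = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    d = [0.0] * n
    for i in range(1, n - 1):
        if delta[i - 1] * delta[i] > 0:
            # Neighbouring secants share a sign: weighted harmonic mean.
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            d[i] = (w1 + w2) / (w1 / delta[i - 1] + w2 / delta[i])
        # Otherwise the slope stays 0 so a local extremum is not overshot.
    # Simplifying assumption: secant slopes at the two end points.
    d[0], d[-1] = delta[0], delta[-1]
    return d

def pchip_eval(x, y, d, xq):
    """Evaluate the interpolant at xq, with x[0] <= xq <= x[-1]."""
    i = min(max(bisect.bisect_right(x, xq) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    t = (xq - x[i]) / h
    h00 = 2 * t**3 - 3 * t**2 + 1   # cubic Hermite basis functions
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * y[i] + h10 * h * d[i] + h01 * y[i + 1] + h11 * h * d[i + 1]

# Hypothetical offline OD samples taken every 6 hours.
t_lab = [0, 6, 12, 18, 24]
od_lab = [0.5, 2.0, 9.0, 15.0, 16.0]

slopes = pchip_slopes(t_lab, od_lab)
od_hourly = [pchip_eval(t_lab, od_lab, slopes, t) for t in range(25)]
```

Because the lab samples in this example increase monotonically, the hourly curve does too; an unconstrained cubic spline through the same points carries no such guarantee.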
The process variables differ in order of magnitude, as shown by the inputs for Batch C in Figure 2. Normalization was performed to prevent variables of larger magnitude from dominating smaller ones and to improve the learning process [21], [22]. Thus, all process data are normalized to the range [0.1, 0.9] using Equation (1).
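As a rough illustration of scaling a variable to [0.1, 0.9], the sketch below applies linear min-max scaling. The function name and agitation values are hypothetical, and the source does not say whether the minimum and maximum were taken per batch or over the whole training set; here they come from the list passed in.

```python
def minmax_scale(values, lo=0.1, hi=0.9):
    """Linearly map `values` so that min -> lo and max -> hi."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        raise ValueError("a constant signal cannot be min-max scaled")
    return [lo + (hi - lo) * (v - v_min) / span for v in values]

# Hypothetical agitation speeds in rpm.
agitation_rpm = [150.0, 200.0, 250.0, 300.0]
agitation_scaled = minmax_scale(agitation_rpm)
```

Keeping the targets away from 0 and 1 is a common choice for sigmoid-type networks, since those activations only approach their limits asymptotically.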
$$x_n = 0.1 + 0.8\,\frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

In Equation (1), $x_n$ is the normalized data, $x$ represents the original data, while $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data, respectively.

In Equation 2, the network output is expressed in terms of the number of neurons, the transfer function, the weights, the number of data points, the inputs and the bias term.
The ANN model has four neurons in the input layer and one neuron in the output layer, equal to the number of input and output variables, respectively, as shown by the model structure in Figure 3. The network has one hidden layer with a logarithmic sigmoid (logsig) activation function, and a linear transfer function (purelin) is used in the output layer. Equation 3 describes the logsig function, while the purelin function is described by Equation 4.

$$\mathrm{logsig}(n) = \frac{1}{1 + e^{-n}} \tag{3}$$

$$\mathrm{purelin}(n) = n \tag{4}$$
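The forward pass of such a network (four inputs, one logsig hidden layer, one purelin output) can be sketched as follows. The weights, biases and input values are arbitrary placeholders rather than trained parameters from the study, and the function names are hypothetical.

```python
import math

def logsig(n):
    """Logarithmic sigmoid: maps any real n into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden logsig layer followed by a linear (purelin) output neuron."""
    hidden = [
        logsig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # purelin is the identity, so the output is just the weighted sum plus bias.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Normalized inputs in the order [temperature, agitation, pH, DO] (placeholders).
x = [0.5, 0.3, 0.7, 0.2]

# Placeholder parameters for a network with two hidden neurons.
w_hidden = [[0.4, -0.2, 0.1, 0.3], [1.0, -1.0, 0.5, 0.0]]
b_hidden = [0.0, 0.1]
w_out = [2.0, -1.0]
b_out = 0.25

od_estimate = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
```

In practice the parameters would be fitted by a training algorithm; the sketch only shows how an estimate is computed once they are known.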
A weight and a bias were added to every neuron in the hidden layer and output layer. Performance of the ANN model was evaluated using the correlation coefficient ($R^2$) given by Equation 5. The correlation coefficient is used as a measure of how closely the predicted OD values agree with the actual OD values.
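Since Equation 5 is not reproduced here, the sketch below assumes the common definition $R^2 = 1 - SS_{res}/SS_{tot}$; the authors may have used a different form, such as the squared Pearson correlation. The OD values are hypothetical.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot (assumed form; see the note above)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical lab OD values and soft-sensor estimates.
od_actual = [1.0, 2.0, 4.0, 8.0]
od_predicted = [1.1, 1.9, 4.2, 7.8]
score = r_squared(od_actual, od_predicted)
```

With this definition, a perfect estimate gives 1 and always predicting the mean of the lab values gives 0.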

Radial Basis Function Neural Network (RBFNN) Model Development
The Radial Basis Function Neural Network (RBFNN) is among the classical feedforward neural networks. It has three layers: an input layer, a hidden layer and an output layer [25]. The basic structure of an RBFNN is shown in Figure. The hidden layer uses a nonlinear radial basis activation function and the output layer uses a linear function [26]. The output of an RBFNN output neuron is represented by

$$y(x) = \sum_{i=1}^{h} w_i\,\phi_i(x) \tag{6}$$

where $h$ is the number of hidden neurons, $\phi_i$ is the radial basis activation function, $x$ is the input and $w_i$ is the connecting weight between hidden neuron $i$ and the output neuron [27]. A Gaussian radial basis activation function is used, represented by

$$\phi_i(x) = \exp\left(-\frac{\lVert x - c_i \rVert^2}{2\sigma^2}\right) \tag{7}$$

where $c_i$ is the center and $\sigma$ is the spread of the radial basis function, which represents its width. The spread determines the smoothness of the radial basis interpolating function. In order to determine the optimum spread, different values were used in RBFNN training and the network performance was evaluated in terms of its correlation coefficient as given in Equation (5). The RBFNN was developed in the MATLAB R2017b environment.
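The role of the spread can be illustrated with a small forward-pass sketch of a Gaussian RBF network. The scaling of the exponent ($2\sigma^2$) is an assumption, since the exact Gaussian form used in the MATLAB implementation is not given; the centers, weights and function names are hypothetical.

```python
import math

def gaussian_rbf(x, center, spread):
    """Gaussian basis function; assumes the exp(-||x - c||^2 / (2 * spread^2)) form."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * spread ** 2))

def rbf_forward(x, centers, weights, spread):
    """Linear output neuron: weighted sum of the hidden-unit activations."""
    return sum(w * gaussian_rbf(x, c, spread) for w, c in zip(weights, centers))

# Hypothetical network with two hidden units on a 2-D input.
centers = [[0.2, 0.2], [0.8, 0.8]]
weights = [1.0, 3.0]

narrow = rbf_forward([0.5, 0.5], centers, weights, spread=0.1)
wide = rbf_forward([0.5, 0.5], centers, weights, spread=1.0)
```

A small spread makes each hidden unit respond only near its own center, so a point between the centers gets almost no activation; a large spread lets the units overlap and yields a smoother mapping, which is why the spread is tuned by trial and error.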

Support Vector Regression (SVR) Model Development
Support Vector Regression is an adaptation of the Support Vector Machine (SVM), a classification paradigm based on statistical and machine learning theory [28]. SVM was initially used in pattern recognition problems and was later modified to solve regression problems. Through training, a nonlinear mapping $\phi(\cdot)$, or correlation, is obtained between the input and output of the learner [29]. The regression function that describes the nonlinear relationship between the input and output of the learner is represented by

$$f(x) = w^{T}\phi(x) + b \tag{8}$$

In Equation (8), the coefficients $w$ and $b$ need to be adjusted. In this work, the SVR model was developed in the MATLAB R2017b environment and the default linear kernel function was used.
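With a linear kernel the mapping reduces to the identity, so the regression function becomes a weighted sum of the inputs plus a bias. The sketch below shows that function together with the epsilon-insensitive loss that standard SVR formulations minimize; the loss is not described in the text above and is included as background, and the coefficient values and function names are hypothetical.

```python
def svr_predict(x, w, b):
    """Linear-kernel regression function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Errors smaller than eps cost nothing; larger errors grow linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)

# Hypothetical coefficients for inputs [temperature, agitation, pH, DO].
w = [0.5, 0.25, -0.1, 0.3]
b = 0.05

od_estimate = svr_predict([0.5, 0.3, 0.7, 0.2], w, b)
loss = eps_insensitive_loss(0.4, od_estimate)
```

Because a linear kernel gives a linear function of the inputs, a nonlinear kernel would be needed for the SVR to capture curvature in the relationship between the process variables and OD.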

MLP Neural Network Model
For the MLPNN, six models were developed using different combinations of batch data for training and testing the neural network. A different subset was used for testing in each model to test the generalization capability of the network. The number of hidden neurons was determined by trial and error over repeated runs, varying from 2 to 11 nodes. Figure 4 (a)-(f) shows the correlation coefficient values for different numbers of hidden nodes for the six models. A higher correlation coefficient indicates better model performance. However, good performance on the training subset combined with poor performance on the testing subset indicates overfitting. This happens when the number of hidden nodes is too large, as can be clearly seen for 11 hidden nodes in all six models. Each model was simulated five times for each number of hidden nodes, and the best performance was chosen from these runs. A comparison of the best results from each model is shown in Figure 5 and Table 5. The best MLPNN is Model 5 with 10 hidden neurons, based on its higher correlation coefficient values for the training and testing subsets.
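The search procedure described above can be sketched as a simple loop. Two assumptions are made that the text does not confirm: each model holds out one batch for testing (the exact batch combinations are not listed), and the best run is the one with the highest testing score. The scoring function is a stand-in that returns made-up numbers, not results from the study.

```python
BATCHES = ["A", "B", "C", "D", "E", "F"]

def leave_one_batch_out(batches):
    """Assumed split scheme: each model tests on one batch, trains on the rest."""
    return [([b for b in batches if b != held_out], held_out) for held_out in batches]

def select_hidden_nodes(train_and_score, node_range=range(2, 12), runs=5):
    """Try 2..11 hidden nodes, five runs each; keep the run with the best test score."""
    best = None
    for n_hidden in node_range:
        for run in range(runs):
            r2_train, r2_test = train_and_score(n_hidden, seed=run)
            if best is None or r2_test > best[2]:
                best = (n_hidden, r2_train, r2_test)
    return best

def dummy_train_and_score(n_hidden, seed):
    """Placeholder for real training; the scores are synthetic, for illustration only."""
    r2_train = 1.0 - 1.0 / (n_hidden + 1)
    r2_test = 0.9 - 0.01 * abs(n_hidden - 6) - 0.001 * seed
    return r2_train, r2_test

splits = leave_one_batch_out(BATCHES)
best_nodes, best_train, best_test = select_hidden_nodes(dummy_train_and_score)
```

Repeating each configuration several times matters because network weights are initialized randomly, so a single run can under- or over-state what a given topology can achieve.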

RBF Neural Network Model
RBF neural network models were simulated using different spread values from 30 to 50. Table 6 summarizes the trial-and-error iterations conducted to determine the spread value that gives the highest correlation coefficient for each model. The best spread value and the corresponding correlation coefficient are listed for each model. The best model for the RBFNN soft sensor is Model 4, as shown in Figure 6.

SVR Model
SVR model performances for all models using the training and testing subsets are shown in Figure 7 and tabulated in Table 7. Model 4 gives the best performance, based on the highest correlation coefficient for the training subset and the smallest difference from the testing subset.

Figure 8. Comparison of optical density estimation using the training subset (blue line: lab analysis; red line: soft-sensor estimation).

Figure 9. Comparison of optical density estimation using the testing subset (blue line: lab analysis; red line: soft-sensor estimation).

Conclusion
The soft-sensor approach for estimating optical density in PHA production achieved good accuracy, with a correlation coefficient exceeding 0.9. Three nonlinear data-driven methods were used to build soft sensor models from online process data from a pilot-scale bioreactor. The MLPNN soft sensor outperformed both the RBFNN and SVR soft sensors. The optical density estimates are close to the values obtained by laboratory analysis. The developed soft sensor could be used to predict process output quality as part of process monitoring and control.