Prediction of River Water Treatment Plant Operational Performances using Optimization Approach in Artificial Neural Network Model

Uncontrolled development upstream of river catchments has had detrimental effects on the environment, including the degradation of river water quality. River Water Treatment Plant (RWTP) technology was introduced worldwide to reduce the contaminant loading entering river systems. The technology offers a biological treatment process valued for its simplicity and stable removal efficiency. However, plant performance is difficult to predict, which can affect operational control. Recently, artificial neural network (ANN) models have been widely applied in environmental engineering because, unlike conventional physically based models, they do not require assumptions about complex processes and unknown variables. In this study, an ANN model of RWTP performance was developed from 3 years of performance data. A feed-forward back-propagation network trained with the Levenberg–Marquardt algorithm (trainlm) was used for this predictive approach. The ideal configuration uses the tangent sigmoid transfer function (Tansig) in the hidden layer and a linear transfer function (Purelin) in the output layer, with 25 neurons. This configuration yields an R2 value of 0.963 and the lowest mean square error (MSE) of 30.39. In a comparison of the two models (bio-kinetic and ANN), the performance indicators show the ANN model to be the better and more optimal model. Ultimately, RWTP optimization using the black-box ANN model is more reliable and less time-consuming than the conventional bio-kinetic model. The proposed model can be applied to various water quality improvement facilities and used to predict the target effluent parameters of an RWTP with a higher degree of accuracy.


Introduction
The pollution of the Klang River is a significant concern due to the improper treatment of sewage discharges, industrial wastewaters, and land runoff. The river traverses various districts and local authorities within the Klang Valley, including Kuala Lumpur. Beyond the upstream sources, coastal waters also face direct discharges of surface runoff, domestic sewage, ship wastes, and industrial discharges. As a result, the river is contaminated by E. coli. Inorganic chemicals and various other contaminants are also present, posing a dual threat: they endanger the ecosystem and enter the food chain, thereby posing risks to human health [1].

The biological treatment process in a River Water Treatment Plant (RWTP) is considered an effective way to remove pollutants from river water. Several advanced processes have been widely studied in European and Asian countries. This study investigates the biological treatment performance of carriers in treating polluted river water [1]. The attached growth treatment system in RWTP, although desirable, is neither widely accepted nor widely applied in river purification, mainly because the stability of its biological performance is unclear and because the complexity of the biological system makes it difficult to establish generic knowledge about performance in biofilm development. To comprehend and investigate the efficacy of biological treatment in RWTP, this research is devoted to understanding RWTP performance in terms of pollutant removal efficiency, pollutant loading rates, and bio-kinetic performance.

The Artificial Neural Network (ANN) is a black-box model with the capacity to predict outputs from intricate inputs, resembling the structure and functionality of the human brain and nervous system [2]. Recently, the ANN has been widely applied to water and wastewater processes [3]. An ANN model does not formulate an explicit equation or relationship between independent and dependent variables. Instead, it establishes relationship functions between input and output data with a specified accuracy. This connection is established through a training process. A data-driven model optimized to simulate RWTP performance can therefore serve as a predictive tool for future river water quality upgrading works. This paper aims to produce a reliable prediction of RWTP operational performance using an optimization approach in an ANN model, and the findings will be used in the operation and maintenance of RWTPs at Klang River purification treatment plants, such as those of the River of Life (ROL) Project.

(IOP Publishing, 1296 (2024) 012015, doi:10.1088/1755-1315/1296/1/012015)

Methodology
Predictive approaches such as the ANN have recently been shown to be more effective in estimating RWTP performance, especially effluent water quality and the MLVSS (suspended biomass) value. In particular, a feed-forward back-propagation neural network trained with the Levenberg-Marquardt algorithm (trainlm) has proved effective in estimating RWTP performance, using previous performance data as input variables (RWTP influents and suspended biomass content) and target variables (RWTP effluents), with the aim of achieving the ROL Project target of INWQS Class II. For the kinetic white-box counterpart to this black-box model, the Monod model was chosen because the Monod equation is the first and most basic equation applied in biokinetic studies. The algorithm for the procedure was implemented in MATLAB 2018.

Building upon this rationale, the study used the Feed Forward Neural Network (FFNN) method with back-propagation (BP) learning algorithms to develop the model, specifically to estimate the effluents of the RWTP. The influent and effluent data come from the monitoring of seven RWTPs for biomass by the Malaysian Department of Irrigation and Drainage (DID) over a twelve-month sampling period, from November 2016 to October 2017. As depicted in Figure 1, the ANN model development involved selecting eight water quality parameters as input parameters and seven as target parameters. The FFNN model comprised an input data set (8 × 196) and a target data set (7 × 196), and was organized into three layers: the input layer, the hidden layer, and the output layer. The independent variables were the input parameters (DO influent, pH influent, COD influent, BOD influent, TSS influent, NH3-N influent, WQI influent, and MLVSS influent). These parameters served as the basis for the model. The dependent predicted variables were the target outputs (DO effluent, pH effluent, COD effluent, BOD effluent, TSS effluent, NH3-N effluent, and WQI effluent). The hidden layer functioned as the transformation layer for the input information [4].
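As an illustrative sketch only (the study itself used MATLAB, and the trained weights are not reported), the forward pass of a three-layer network of this shape can be written in Python as follows. The helper names `init_layer`, `layer_forward`, and `forward`, the random initialization, and the sample values are all assumptions made for illustration; `tansig`, `logsig`, and `purelin` mirror the mathematical definitions of the transfer functions named in the text.

```python
import math
import random

def tansig(x):
    # The tangent sigmoid transfer function is mathematically tanh(x).
    return math.tanh(x)

def logsig(x):
    # Log-sigmoid transfer function: 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + math.exp(-x))

def purelin(x):
    # Linear transfer function: output equals input.
    return x

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def layer_forward(inputs, weights, biases, activation):
    """Apply one fully connected layer followed by its transfer function."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(sample, hidden, output, hidden_fn=tansig):
    """8 inputs -> hidden layer (Tansig or Logsig) -> 7 linear outputs."""
    h = layer_forward(sample, hidden[0], hidden[1], hidden_fn)
    return layer_forward(h, output[0], output[1], purelin)

rng = random.Random(0)
N_INPUTS, N_HIDDEN, N_OUTPUTS = 8, 25, 7  # sizes taken from the text
hidden = init_layer(N_HIDDEN, N_INPUTS, rng)
output = init_layer(N_OUTPUTS, N_HIDDEN, rng)

sample = [0.5] * N_INPUTS  # hypothetical normalized influent sample
prediction = forward(sample, hidden, output)
print(len(prediction))  # one value per effluent target parameter
```

Passing `hidden_fn=logsig` gives the Logsig-Purelin variant with the same structure.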
During training, the input patterns in the input layer of the network are propagated to produce an output in the output layer, using the selected algorithm and a specific number of neurons [5]. To achieve a precise target, the optimal number of hidden neurons was determined through a trial-and-error process during the training of the FFNN model. This involved experimenting with the number of neurons, ranging from 5 to 35. This iterative approach aims to enhance the learning performance without compromising the network's ability to train effectively on the data [6]. Additionally, two types of transfer function were considered in the hidden layer: the Tangent sigmoid (Tansig) transfer function and the Log-sigmoid (Logsig) transfer function. The linear activation function (Purelin) was used in the output layer for both model setups because it is suitable after a non-linear transfer function in this multilayer perceptron [7].

The initial dataset was randomly partitioned into three subsets, following the rule of thumb for ANNs: training, validation, and test sets. In the modeling process, 70% of the 196 data points were assigned to the training set, 15% to the validation set, and the remaining 15% to the test set. A benchmark comparison was carried out to aid in the identification of the optimal neural network during the ANN modelling process [4]. For model optimization, a few performance indicators were selected to find the optimum number of neurons. The model performance indicators are listed in Table 1.
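A random 70/15/15 partition of the 196 samples can be sketched as below. This is a minimal Python illustration, not the MATLAB routine used in the study; the function name `split_indices`, the fixed seed, and the rounding rule (which subset absorbs the leftover sample) are assumptions.

```python
import random

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into train/validation/test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(196)
print(len(train_idx), len(val_idx), len(test_idx))  # 137 29 30
```

Because 196 is not divisible into exact 70/15/15 shares, the subset sizes are necessarily approximate; here the rounding remainder is assigned to the test set.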
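In the equations below, var denotes the variance, y the measured value, ŷ the simulated value, ȳ the mean of the measured values, and N the number of samples. A model with a Variance Accounted For (VAF) of 100% and a Root Mean Square Error (RMSE) of 0 is considered excellent.

Maximum Variance Accounted For (VAF) [6]:
$$\mathrm{VAF} = \left[1 - \frac{\mathrm{var}(y - \hat{y})}{\mathrm{var}(y)}\right] \times 100 \quad \text{(Equation 1)}$$

Minimum Root Mean Square Error (RMSE) [6]:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \quad \text{(Equation 2)}$$

Minimum Mean Absolute Percentage Error (MAPE) [6]:
$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100 \quad \text{(Equation 3)}$$

Maximum Coefficient of Determination (R2):
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \quad \text{(Equation 4)}$$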
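The four indicators named above can be computed as in the following Python sketch, which assumes their standard textbook definitions (population variance for VAF, non-zero measured values for MAPE, and R2 as one minus the ratio of residual to total sum of squares). The function names and the small data set are illustrative only and are not taken from the study.

```python
import math
from statistics import fmean, pvariance

def vaf(y, y_hat):
    """Variance Accounted For, in percent."""
    residuals = [a - b for a, b in zip(y, y_hat)]
    return (1.0 - pvariance(residuals) / pvariance(y)) * 100.0

def rmse(y, y_hat):
    """Root Mean Square Error."""
    return math.sqrt(fmean([(a - b) ** 2 for a, b in zip(y, y_hat)]))

def mape(y, y_hat):
    """Mean Absolute Percentage Error, in percent (requires y != 0)."""
    return fmean([abs((a - b) / a) for a, b in zip(y, y_hat)]) * 100.0

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = fmean(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - sse / sst

# Hypothetical measured and simulated values, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0]
simulated = [1.0, 5.0, 5.0, 9.0]
print(vaf(measured, simulated), rmse(measured, simulated),
      mape(measured, simulated), r_squared(measured, simulated))
```

A perfect prediction gives VAF = 100, RMSE = 0, MAPE = 0, and R2 = 1, which matches the criterion for an excellent model stated above.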

Results and Discussion
The data from the River Water Treatment Plants (RWTP) were modeled in MATLAB using the Levenberg-Marquardt (trainlm) algorithm. Within the Levenberg-Marquardt feed-forward network, the Logsig-Purelin transfer function pair was compared with the Tansig-Purelin pair to identify the optimal model. The chosen model was subsequently used to predict the performance of the RWTP. Throughout the testing and validation of the data, the number of neurons in the hidden layer was systematically tested, ranging from 3 to 35. The raw data were normalized for both input and output by dividing all input data by the maximum input and all output data by the maximum output. This ensured that the data fell within the range of 0 to 1. The code systematically tested various numbers of neurons in the hidden layer, ranging from 5 to 35. The objective was to identify the number of neurons that would enhance learning performance without constraining the neural network's capacity to model the process and without compromising the generalization of the training trend [8]. The model underwent 100 runs for each tested number of neurons, allowing for improved initialization and enhancing the robustness of the results.

While exploring the range of neurons in the hidden layer from 3 to 35, Figure 3 revealed that beyond 25 neurons, the Tansig R2 of the testing and validation sets diverged from the R2 of the training set. This observation suggests that overfitting occurred, leaving the model unable to generalize the patterns learned from the training data to the validation data [8], [9]. Consequently, the number of neurons in the hidden layer was constrained to the range of 3 to 24, and the optimal neuron count was determined.
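The maximum-based normalization described above can be sketched in Python as follows. The text does not state whether the maximum is taken per variable or over the whole data set; this sketch assumes a per-variable maximum, and the function names and sample values are hypothetical.

```python
def normalize_by_max(values):
    """Scale one variable by its maximum; returns the scaled list and the maximum."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("maximum must be positive for max-based scaling")
    return [v / peak for v in values], peak

def denormalize(scaled, peak):
    """Map scaled predictions back to the original units."""
    return [v * peak for v in scaled]

# Hypothetical influent readings for one parameter, for illustration only.
raw = [2.0, 4.0, 8.0]
scaled, peak = normalize_by_max(raw)
print(scaled)  # [0.25, 0.5, 1.0]
```

The result lies between 0 and 1 only when the raw values are non-negative, which is the case for the concentration-type parameters considered here.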

Model Performance
The dataset in the input layer was processed with the default Levenberg-Marquardt (trainlm) learning algorithm using various numbers of neurons. The Logsig-Purelin transfer function suggested that 25 neurons were the best fit, whereas the Tansig-Purelin transfer function indicated 26 neurons as the optimal choice. Generally, it is preferable to opt for simpler models with fewer parameters, when possible, rather than more complex ones with a higher number of parameters [10]. Consequently, the tangent sigmoid transfer function (Tansig) in the hidden layer and the linear transfer function (Purelin) in the output layer with 25 neurons emerged as the optimum configuration. This decision was based on the RMSE and R-square values, which indicated the modeling accuracy of the selected architecture with 25 neurons relative to the alternative architecture with 26 neurons, as illustrated in Figure 4.

The plots show that Total Suspended Solids (TSS) has the highest R-square value (0.72) of all the parameters. This may be attributed to the fact that TSS has the highest coefficient of variation (1.15) among the parameters. A high coefficient of variation indicates greater variability, which can contribute to improved accuracy in predicting that parameter [9]. TSS also has stronger relationships with the other parameters. Nevertheless, the accuracy for the other parameters is acceptable.

The factors influencing the effluent Water Quality Index (WQI) and other effluent parameter values are not limited to the eight (8) input parameters considered in this study. Various other quality factors and external environmental factors might also play a significant role. This study used only water quality data and the quantity of suspended biomass (MLVSS) to predict the 7 effluent parameters. Hydraulic retention time, flow, and oxidation tank volume were measured uniformly at all RWTP stations but were neglected in this study. While other parameters could be taken into account to improve predictions and to minimize disparities between simulated values and observations, it is essential to justify the costs associated with including these additional factors.
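For reference, the coefficient of variation mentioned above is the standard deviation divided by the mean. A minimal Python sketch is given below; the use of the population standard deviation and the sample values are assumptions for illustration, since the text does not state how the reported value of 1.15 was computed.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(values):
    """Ratio of the (population) standard deviation to the mean."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / mean

# Hypothetical readings, for illustration only.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(coefficient_of_variation(readings))  # 0.4
```

A value above 1, as reported for TSS, means that the standard deviation exceeds the mean.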

RWTP Optimum Model
In the bio-kinetic model validation, brush media gave better results than bioball media, based on the maximum R-square and other performance indicators. According to the sensitivity analysis performed on the Artificial Neural Network (ANN) model, the optimal configuration uses the tangent sigmoid transfer function (Tansig) in the hidden layer and a linear transfer function (Purelin) in the output layer. To find the optimum model, the bio-kinetic model and the ANN model were compared in terms of their performance indicators. This comparison is presented in Table 3. The ANN model is the better and more optimal model on all of the listed performance indicators except RMSE. Ultimately, RWTP optimization using the black-box ANN model is more reliable and less time-consuming than using the white-box (bio-kinetic) model, which is more challenging, time-consuming, and expensive [11].

Korkandi [12] predicted COD effluent from experimental testing with the same back-propagation Levenberg-Marquardt algorithm, obtaining a coefficient of correlation R2 of 0.949 (lower than in this study) and the linear function COD simulated = 0.999 COD observed + 0.724 (better than in this study). However, the type of transfer function was not mentioned in that study. In a different study, the optimal coefficient of correlation for the selected ANN in calculating the river Water Quality Index (WQI) was 0.954. This was achieved with 23 input nodes and 34 neurons, using the Quick Propagation (QP) algorithm [13]. That study focused primarily on simplifying a relatively extensive set of river water quality variables into a more manageable subset, without incorporating any treatment processes. Rather than employing intricate physical modeling approaches such as Qual2K and WASP, another study [14] successfully predicted river water dissolved oxygen content. This was accomplished by using BOD and COD as input values in both a feed-forward neural network (FFNN) model and a radial basis function neural network (RBFNN) model. The coefficient of correlation values obtained were 0.904 for the FFNN model and 0.966 for the RBFNN model. In additional studies, the ANN modeling method was employed to assess the efficiency of full-scale wastewater treatment plants, yielding correlation coefficient (R-square) values within the range of 0.63 to 0.99 [14]-[16]. These findings suggest that neural networks can effectively predict output data, achieving high coefficient of correlation values, notably around 0.9. Moreover, various descriptive statistics were used to evaluate the significance of the ANN [5].

Artificial neural networks prove to be valuable tools for plant operators, facilitating the recognition of operational states in a cost-effective and efficient manner. This is achieved through the accurate prediction of River Water Treatment Plant (RWTP) process variables such as COD, BOD, TSS, and WQI. The prediction results demonstrate that the proposed method performs sufficiently well and is comparatively simpler than a white-box model such as the biokinetic Monod equation. The design of the soft sensor relies solely on knowledge about the RWTP process, represented through input/output operational monitoring data.

Conclusion
Data-derived modelling was conducted to achieve the third objective of this study. Effluent RWTP water quality was predicted from RWTP influent using an artificial neural network. The model was designed and developed using a feed-forward neural network. Sensitivity analysis was conducted to search for the optimum model. Finally, the bio-kinetic model and the ANN model were compared in terms of their performance indicators to find the model that best fits RWTP prediction. In this comparison, the performance indicators show the ANN model to be the better and more optimal model. Ultimately, RWTP optimization using the black-box ANN model is more reliable and less time-consuming than using the white-box (bio-kinetic) model. The RWTP itself has been shown to reduce organic contaminants in river water, as reported in previous studies [17]-[20].

Figure 1: The structure of the proposed neural network model

Figure 4 and Figure 5 illustrate the performance of the training, validation, and test subsets of the Artificial Neural Network (ANN). The mean square error (MSE) is high at the outset but decreases as the number of epochs increases, indicating that the network training procedure was executed correctly. Figure 5 displays three lines, reflecting the random allocation of the data into three categories: training (70%), validation (15%), and testing (15%). Up to epoch 6, which represents the optimal performance point of the model, no overfitting has occurred, and the model maintains an accuracy equivalent to this point. The optimal result was achieved by a model employing the tangent sigmoid transfer function (Tansig) in the hidden layer and a linear transfer function (Purelin) in the output layer, with 25 neurons. The model yielded an R2 value of 0.96 and a best mean square error (MSE) of 30.39, as depicted in Figure 5. Additionally, the calculated metrics include a root mean square error (RMSE) of 33.54, a Variance Accounted For (VAF) of 61.08, a correlation coefficient (R2) of 0.34, and a Mean Absolute Percentage Error of 47.14%.
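The idea of an optimal performance point on the validation curve can be illustrated with a simple early-stopping rule: keep the epoch with the lowest validation MSE and stop once the validation error has failed to improve for a set number of consecutive epochs. The Python sketch below is a generic illustration, not the MATLAB training routine used in the study; the function name, the `patience` value, and the error curve are hypothetical.

```python
def best_validation_epoch(val_mse, patience=6):
    """Return (epoch, mse) of the lowest validation error seen before
    `patience` consecutive epochs pass without improvement."""
    best_epoch, best_mse, fails = 0, val_mse[0], 0
    for epoch, mse in enumerate(val_mse[1:], start=1):
        if mse < best_mse:
            best_epoch, best_mse, fails = epoch, mse, 0
        else:
            fails += 1
            if fails >= patience:
                break  # validation error stopped improving: likely overfitting
    return best_epoch, best_mse

# Hypothetical validation MSE per epoch, for illustration only.
curve = [100.0, 60.0, 40.0, 35.0, 36.0, 37.0, 38.0]
print(best_validation_epoch(curve, patience=3))  # (3, 35.0)
```

The weights from the returned epoch are the ones that would be kept, since later epochs reduce the training error without improving the validation error.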

Figure 4: R2 for sub-sets of the optimum model
Figure 5: Validation performance

Table 2: ANN Results Equations

Table 3: Comparison of Models