Neural Network Modelling for Prediction of Coagulant Dosage

Water treatment plants (WTPs) in Indonesia commonly apply conventional treatment that relies, among other processes, on coagulation to treat turbid surface water and produce clean, safe water for daily use. However, the standard jar test method for determining coagulant dosage is inefficient and inaccurate, and it may not perform well when raw water quality is highly variable. This study aims to improve and optimize the coagulant dosing process by considering the main parameter that affects coagulation, i.e., turbidity. For this purpose, a multilayer artificial neural network (ANN) model with an adaptive backpropagation algorithm was developed using MATLAB software. A network architecture with 1 input, 5 hidden, and 1 output neurons was implemented. For the input, 4465 turbidity measurements of raw water taken from the Surabaya Municipal Water Enterprise (PDAM) were used, while the output is the optimal dose obtained from jar test results. The study produced an empirical model of the optimum coagulant dosage as a function of raw water turbidity. Training and testing of the ANN gave a regression coefficient R of 0.99444 (R2 = 0.98891), with the following regression equation: output = 0.99 × target + 0.66. These results show that the ANN model predicts the coagulant dose with very high accuracy.


Introduction
Coagulation technology is the most commonly used water treatment technology in both small and large-scale water treatment plants (WTPs). The term coagulation commonly represents two inseparable processes, i.e., coagulation and flocculation [1]. Consecutively, these processes instigate the agglomeration of colloidal particles suspended in water [2] and form settleable flocs [3]. By adding coagulants, small suspended solids and colloids are destabilized and agglomerated to form larger flocs, thereby increasing the settling efficiency [4]. In most WTPs in Indonesia, the required coagulant dose is determined manually. Coagulant requirements are calculated periodically using the jar test method. This process takes a long time and is prone to miscalculation because raw water quality fluctuates. The coagulant dosage must be adjusted to the state of the raw water entering the coagulation unit at the drinking water treatment plant (DWTP) so that the effluent complies with the desired quality standards. In this procedure, the coagulant dosage and coagulation pH are optimized on the basis of turbidity removal. The turbidity of the raw water affects the optimal dosage of the coagulant used.
Optimum coagulant application is necessary to maintain the quality of the water produced by the DWTP, but it is very difficult to ensure the correct dosage when raw water quality fluctuates. Therefore, empirical modelling is needed to create a model capable of determining the optimal coagulant dosage in real time. Such an empirical model can be obtained through several methods, one of which is an artificial neural network (ANN). An artificial neural network is a group of connected input/output units in which each connection has an associated weight. Artificial neural networks build predictive models from large databases and are designed to mimic the human nervous system. These networks can be applied to image understanding, human learning, computer speech, and other tasks. Backpropagation is at the heart of training artificial neural networks. This method adjusts the weights of the network according to the error rate obtained in the previous epoch (i.e., iteration). Proper weight adjustment reduces the error rate and makes the model reliable by improving its generalization [5].
The objective of this study is to build and test an ANN to predict the dose of aluminium sulfate coagulant that should be applied based on empirical data. The variable used is turbidity, which is a parameter to be reduced in the coagulation unit at the drinking water treatment plant.

Methodology
This section describes the method used to predict coagulant dosing from turbidity, the parameter with the greatest influence on the coagulation process. The resulting data are subsequently used to train and test the artificial neural network.

Materials
The water samples used in this study are raw water from a drinking water treatment plant in Surabaya, East Java, Indonesia, taken from the Jagir and Surabaya Rivers. The turbidity of these samples was measured during both the dry and rainy seasons to capture the existing raw water quality conditions.
The coagulant used is technical grade aluminium sulfate (Al2(SO4)3·16H2O) at a concentration of 1%. This alum was chosen because it represents the alum used by the drinking water treatment plants in Surabaya.

Methods
The steps of the coagulation process were carried out consecutively by the jar test method as follows:
• step 1: fast mixing phase (100 rpm, 1 minute) to facilitate the coagulation process.
• step 3: sedimentation phase without mixing (15 minutes) to allow the floc to settle.
To predict coagulant dosage, this study uses a multilayer artificial neural network with a backpropagation algorithm. A network architecture with 1 input neuron (i.e., turbidity), 5 hidden neurons, and 1 output neuron (i.e., coagulant dosage) was used. The training and testing sets were drawn from the 4465 data points obtained from earlier sample measurements. Of these, 3572 arbitrarily selected samples (80%) formed the training set. The components of the training set's input vector correspond to the neurons in the input layer. Training and testing were conducted using a backpropagation algorithm that iterates until the error is close to zero.
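The data split and the 1-5-1 network described above can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in this study: it swaps in plain stochastic gradient descent for the adaptive backpropagation algorithm, and the names `split_indices` and `Net151`, the tanh activation, the learning rate, the 5% test fraction, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def split_indices(n, train_frac=0.80, test_frac=0.05, seed=0):
    """Shuffle 0..n-1 and cut it into train / validation / test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_frac)
    n_test = round(n * test_frac)
    return idx[:n_train], idx[n_train:n - n_test], idx[n - n_test:]


class Net151:
    """1 input -> 5 tanh hidden neurons -> 1 linear output (hypothetical sketch)."""

    def __init__(self, hidden=5, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden
        self.b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(w * x + b) for w, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return y, h

    def train_step(self, x, t, lr=0.05):
        """One gradient step on the squared error 0.5 * (y - t)**2."""
        y, h = self.forward(x)
        err = y - t
        for j, hj in enumerate(h):
            # Error propagated back through tanh, using the weight before its update
            grad_hidden = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.w1[j] -= lr * grad_hidden * x
            self.b1[j] -= lr * grad_hidden
        self.b2 -= lr * err


def mse(net, xs, ts):
    """Mean squared error of the network over paired inputs and targets."""
    return sum((net.forward(x)[0] - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


# Synthetic stand-in data, NOT the PDAM measurements: inputs and targets in [0, 1]
rng = random.Random(1)
xs = [rng.random() for _ in range(200)]
ts = [0.2 + 0.6 * x for x in xs]

net = Net151()
before = mse(net, xs, ts)
for epoch in range(200):
    for x, t in zip(xs, ts):
        net.train_step(x, t)
after = mse(net, xs, ts)
print(f"MSE before training: {before:.5f}, after: {after:.5f}")
```

With 4465 samples, `split_indices(4465)` yields 3572 training and 223 test indices, matching the counts reported in this paper; the remaining 670 indices are treated here as a validation set, which is an inference rather than a figure stated in the text.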

Results and Discussions
This section is presented in two subsections. The first subsection provides some descriptive statistics of the data obtained from the jar test experiment for coagulant dose determination affected by water turbidity. The second subsection presents the results of the neural network test, mainly in the form of regression statistics. The results are then compared with previous studies found in the literature.

Coagulant Dose Determination
The evaluation of experimental values is based on the well-known fact that the turbidity value has the most significant effect on the coagulant dose, as evidenced by Figure 2. The dose of coagulant introduced during the rapid mixing process is always changing due to the change in the quality of raw water entering the drinking water treatment plant. Changes in water quality, based on turbidity parameters, are caused by the content of suspended particles in the water [6]. The difference in weather greatly affects the variation of water turbidity. During the dry season, the turbidity is low, below 50 NTU. However, in the rainy season, river water contains more solids and other impurities from run-off, which cause turbidity to increase significantly, ranging from 50 NTU to 950 NTU.

Figure 2. The Effect of Turbidity on Coagulant Dosage
Turbidity represents the murkiness of water caused by suspended solids such as silt and clay, organic particles, i.e., waste from humans and livestock, plant and biological debris, as well as chemical deposits such as manganese and iron [7]. Raw water with high turbidity indicates a contamination problem at the inlet area or surface water source. This is detrimental to the performance of the coagulation-flocculation-sedimentation, filtration, and disinfection units that make up the treatment train of a WTP. If the filtered water turbidity goal is not achieved, this may signify the existence of pathogens in the water, and high levels of turbidity in the distribution system could suggest the infiltration of external pollution into the pipe or may indicate biofilm release and oxide deposits [8].
Generally, coagulation and flocculation processes can significantly remove turbidity from the water. These processes involve the addition of coagulants. The coagulant used in this study is aluminium sulfate, which is the actual coagulant used in the studied WTPs. Figure 2 shows that the coagulant dosage increases significantly as turbidity increases. The turbidity of water in WTPs in Indonesia is usually less than 200 NTU, so the coagulant dosage requirement is also less than 300 mg/L. However, there are conditions when the turbidity exceeds 200 NTU, resulting in a sharply increased dose of coagulant, as shown in Figure 2.
Deshmukh et al. [9] observed that the decrease in the turbidity value at a WTP was dependent on the turbidity of the raw water. A study [10] on treated turbid water from the Guadiana River in Badajoz, Spain reported that the removal efficiency of turbidity was affected by the initial turbidity of the raw water. The greatest reduction was observed in raw water with high turbidity. The decrease in turbidity with an increasing dose of coagulant is due to the fact that the coagulant in water hydrolyses into various products containing cations that can adsorb onto negatively charged particles. This neutralization destabilizes the particles and causes them to clump together [11,12]. At high doses of the coagulant, a charge inversion occurs and the particles begin to stabilize again, which reduces turbidity removal at high coagulant doses [13].

Artificial Neural Network (ANN)
Out of a total of 4465 measured samples, n = 223 randomly chosen samples were used as the test set. The predictions of the artificial neural network are shown in Figure 3, with the measured turbidity on the horizontal axis and the coagulant dosage on the vertical axis. The selection of the x and y axes is arbitrary. The performance of the artificial neural network in predicting coagulant dosage during training, validation, and testing is shown in Figure 4 in terms of the network's mean squared error (MSE). Figure 4 illustrates how the error gradually decreased from its high initial value during training, validation, and testing as the number of epochs rose. To avoid overfitting the dataset, training stops once the validation error has failed to improve for six consecutive epochs. The number of epochs thus represents the rate of training for an artificial neural network [14]. As shown in Figure 4, the best validation performance occurs at epoch 125, and after six further epochs without improvement the process was terminated at epoch 131. The MSE of the network during training, validation, and testing is 38.1089, 38.3686, and 35.7107, respectively. These MSEs represent the mean squared difference between the network outputs and the targets in the training, validation, and testing sets [15]. In network behaviour analysis, however, validation is the most important metric [14]. Accordingly, Figure 4 shows a best validation performance of 38.3686 at epoch 125, indicating that the network is good and stable [14].
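The stopping rule described above (best validation result at epoch 125, termination at epoch 131) can be sketched in Python as follows. The function name `early_stopping_epoch`, the `patience` parameter, and the validation curve in the usage example are hypothetical; the curve is constructed only so that its minimum falls at epoch 125 and is not the measured curve of Figure 4.

```python
def early_stopping_epoch(val_errors, patience=6):
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    Training stops once the validation error has failed to improve on its
    best value for `patience` consecutive epochs. Epochs are 0-indexed.
    """
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    # The curve ended before the patience limit was reached
    return best_epoch, len(val_errors) - 1


# Hypothetical validation curve: falls until epoch 125, then rises slowly
val_curve = [38.3686 + 0.01 * abs(e - 125) for e in range(200)]
best, stop = early_stopping_epoch(val_curve)
print(best, stop)
```

The weights kept after stopping are those from the best epoch, not from the final one, which is why the validation MSE reported for the network is the value at epoch 125.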
The error histogram of the artificial neural network for coagulant dose prediction is shown in Figure 5. The histogram shows that the majority of errors range from -14.1 to 15.68. This means that the difference between the targets and the network outputs in training, validation, and testing is small [15]. A yellow line represents zero error. Figure 5 shows that the data fit errors are fairly well distributed around zero [15].
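An error histogram of this kind can be computed as in the short Python sketch below. It assumes that the error is defined as target minus output and that 20 equal-width bins are used; both are assumptions, as are the function name `error_histogram` and the example values.

```python
def error_histogram(targets, outputs, n_bins=20):
    """Return (bin_edges, counts) for the errors, defined as target - output."""
    errors = [t - y for t, y in zip(targets, outputs)]
    lo, hi = min(errors), max(errors)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all errors are equal
    counts = [0] * n_bins
    for e in errors:
        # The largest error would land one past the end, so clamp it to the last bin
        k = min(int((e - lo) / width), n_bins - 1)
        counts[k] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Illustrative doses in mg/L (hypothetical values, not data from this study)
edges, counts = error_histogram([10, 20, 30, 40], [12, 19, 30, 36], n_bins=3)
print(edges, counts)
```

A histogram that is concentrated in the bins nearest zero, with few samples in the outer bins, corresponds to the behaviour described for Figure 5.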
According to Eq. (1), the black dotted line in Figure 6 shows a linear dependence between the predicted and target coagulant doses. The estimated regression coefficient is 0.99395. The regression value measures the relationship between the network outputs and the targets during training, validation, and testing [15]. An interval estimate is more informative: with 95% confidence, the coefficient lies between 0.99368 and 0.99444.
output = 0.99 × target + 0.66    (1)
Table 1 displays the neural network's experimental output on a test set of 223 randomly selected samples, together with the characteristic values of the linear regression (Figure 6). R = 0.99444 and R2 = 0.98891 are the correlation coefficient and the determination coefficient, respectively. Both of these values are close to one, indicating that the predicted and target coagulant doses have a strong linear relationship. As a result, the proposed ANN explains 98.8% of the variance in the data.
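The regression statistics discussed above can be reproduced with an ordinary least-squares fit of the network outputs against the targets. The following Python sketch is an illustration, not the MATLAB routine used in this study; the function name `regress_output_on_target` and the example data, which are generated directly from Eq. (1), are assumptions.

```python
import math


def regress_output_on_target(targets, outputs):
    """Least-squares line output = slope * target + intercept, plus R and R^2."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r = s_to / math.sqrt(s_tt * s_oo)  # Pearson correlation coefficient
    return slope, intercept, r, r * r


# Hypothetical targets; outputs follow Eq. (1) exactly, so R is 1 up to rounding
targets = [0.0, 100.0, 200.0, 300.0]
outputs = [0.99 * t + 0.66 for t in targets]
slope, intercept, r, r2 = regress_output_on_target(targets, outputs)
print(slope, intercept, r, r2)

# Consistency check on the reported values: squaring R = 0.99444 gives R2
print(round(0.99444 ** 2, 5))
```

The last line shows how the two reported statistics are related: the determination coefficient is the square of the correlation coefficient.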

Conclusions
Fluctuations in the turbidity of raw water make coagulant dose determination by conventional means both time-consuming and inaccurate. Predicting coagulant dosage by mathematical and statistical methods might be beneficial for the coagulation process in WTPs. An ANN has been developed and tested to reliably estimate the coagulant dose from water turbidity values. The ANN in this study used the 1-5-1 architecture, i.e., 1 input neuron, 5 hidden neurons, and 1 output neuron. The resulting model performs well, explaining 98.8% of the variance in the data.

Figure 1. Neural network topology consisting of input and output layer nodes.

Figure 3. Coagulant dosage predicted by the ANN.

Figure 4. Training errors of the ANN

Figure 6. Coagulant doses predicted by the neural network for the training, validation, and test sets. The black dotted line shows the linear trend.