Spatial Variability Model for Water Quality Assessment of the Physicochemical Parameters and the Water Quality Index of Laguna Lake and its Tributaries

Assessment and mapping of quality of surface water are essential because chemical and physical characteristics of the water determine its suitability for domestic, industrial, and agricultural usages. Geographic information system (GIS) is an effective and efficient technique in problem-solving where spatial data are essential. Six Physicochemical parameters; Biological Oxygen Demand, Minimum Dissolved Oxygen, Level of Acidity or pH, Nitrate, Phosphate, and Ammonia from Laguna Lake and its tributary rivers were examined from the year 2016 to 2019. Ordinary Kriging and Universal Kriging were used as the spatial interpolation methods that carried out the estimation of the concentrations of the parameters at the unsampled location. It showed that Ordinary kriging performed better compared to Universal Kriging. This paper attempted to obtain the water quality index (WQI) from the year 2016 to 2019. Results showed that Laguna Lake has a WQI rank of marginal to excellent. The WQI of its tributary rivers is mostly poor. The physicochemical parameters indicated that the life of the lake and its tributaries are in threat. Rehabilitation must be given attention and must be suitably planned and executed. The national government and the responsible authorities must collaborate for the effort in rehabilitating and cleaning.

The Lake's water quality and ecology are in threat as it faces various environmental issues such as population expansion, urbanization, industrialization, deforestation, and land conversion causing massive changes in the basin and the lake itself [6]. Domestic waste accounts for 81 % of the pollution that flows into the lake [7]. In an assessment conducted by the Laguna Lake Development Authority (LLDA), from "A" to "F" with "F" being the lowest, the lake got a grade "C" for water quality and "F" for fisheries [8; 9].
The index of the health of a community relies on water quality. Urbanization, agriculture practices, and industrialization have a direct impact on water resources. These aspects influence the water resources both qualitatively and quantitatively. Water is one of the ecosystem's most significant and abundant element. All living organisms require water for their growth and survival. This study aims to inspire Filipinos of the exquisiteness of Laguna Lake, and how it can turn it into a thriving economic zone that serves to benefit all citizens. The researchers expect to improve Laguna Lake's capability and to encourage citizens to take a more active role in nation-building and protecting our nature. This study will assist in developing Laguna lake through the utilization of interagency that will help in undertaking development planning, law enforcement; determining areas for industrial and urban, and reforestation and agricultural; reduce water consumption, and control contaminants.
Generally, this study intends to generate a spatial variability model of the physicochemical water quality parameters using 2016 to 2019 data for the prediction of concentration levels of the parameters considered in the various areas of the lake. Specifically, this study aims to: 1. To assess the trend in physicochemical water quality of the lake from 2016 to 2019 data from LLDA Quarterly Water Quality Report. 2. To determine areas of the lake with critical water quality by visual inspection using Geographic Information system (GIS) as a decision support system. 3. To determine the water quality index (WQI) of the Lake and its tributaries reflecting the extent of the lake water pollution. 4. To generate composite maps of water quality depicting the required parameters using GIS. This study only focuses on the water quality parameters measured in Laguna Lake and its tributaries during the years 2016 to 2019. Thus, data outside the area and time frame is not within the scope of this research. Furthermore, the monitoring stations to be considered are only at Laguna Lake and its tributary rivers as identified by LLDA. The parameters included in this study are those released in the agency's quarterly water quality report. Biological Oxygen Demand, Minimum Dissolved Oxygen, Level of Acidity or pH, Nitrate, and Phosphate are the five primary parameters to be studied, and among the five secondary parameters, only Ammonia will be considered. Hence, the findings of this study may not apply to other areas and time frames.    Figure 1 illustrates the conceptual framework of the study. There are two main goals of the study these are to generate a spatial variability model and to obtain the CCME Water quality indices of each station that would reflect the state of the water. The processes are described in the succeeding paragraphs. Figure 2 shows the map of Laguna Lake and its surrounding mainland areas along with the monitoring stations represented by blue points. There are 9 stations located in the lake itself while 37 stations are situated in the tributary rivers of the lake.

Data Collection and Specification
The data consisting of water quality parameters for Laguna Lake and its tributary rivers were obtained from LLDA Water Quality Monitoring Report. Six physicochemical were considered, after excluding the total and fecal coliform since it has a different unit of measurement. The data to be used will be from the year 2016 to 2019. The six parameters to be examined are 1. Biological Oxygen Demand, 2. Minimum Dissolved Oxygen, 3. Level of Acidity or pH, 4. Nitrate, 5. Phosphate, and 6. Ammonia. Missing observations were observed from the report and were later known to be caused by non-collection of water samples due to dense accumulation of water hyacinth or very shallow or dry water during the time of sampling. These water quality parameter concentration averages, together with the coordinates of the monitoring stations obtained from the study of Maruyama and Kato (2017), were then compiled to form the final dataset. It is important to note that not all coordinates of the monitoring stations are obtained from the said study; some are estimated by the researchers.

Descriptive Statistics and Exploratory Spatial Data Analysis
Descriptive statistics is simply describing what the data shows. It is employed to give quantitative descriptions in a convenient form. The standard deviation, mean, maximum, and minimum value of each variable is necessary to be obtained as these will be valuable in comparing and confirming the predicted values presented in the maps as displayed in the next section. The researchers will be utilizing the ArcGIS software, a programming language common to statistical computing and such, to simplify the compilation of codes for the calculation of relevant statistics and estimates as well as the presentation of graphs used in this study. To perform kriging interpolations, the data will be first analyzed with Exploratory Spatial Data Analysis (ESDA).

Variogram Modelling
Before conducting spatial predictions, a variogram must be constructed first. This study utilized the estimator of semi-variances by Cressie and Hawkins (1980). Bailey and Gatrell (1995) further that a variogram is envisioned to investigate spatial dependence in the observed data, especially in the fields of geology and environmental science. Generally, the estimators of a variogram are more robust to minor departures from the assumption of stationarity in studying the first order component [13]. A natural sample estimator of the variogram is given by: (1) where the summation is divided by all pairs of observed data points within a distance separation h and n(h) is the total number of pairs. Assuming a stationarity process, a sample variogram possesses two important components: the sill which coincides with σ, and the range or the 2-upper-bound of a sill. There are three common variogram models obtained to fit a smooth and continuous covariance structure for stationary processes. These are (4) where r is the range and σ is the sill. The discontinuity at the origin or a nugget effect and when γ (0) = 0, which indicates a lack of spatial dependence, is then represented by a constant term a. Thus, adjustments are made in these three models. After adding the nugget effect, the spherical model changes to , 0  (6) Furthermore, in an interpolation method, a model with the best fit is the one with the smallest sum of squared error (SSError) or the sum of the squared differences between each observation and its respective mean.

Kriging Interpolation Method
The kriging method as defined by Bailey and Gatrell (1995) is a technique that exemplifies spatial patterns, predicts values at points other than those in the sample -it undertakes that the direction between sample points signifies a spatial correlation that describes variation in the surface. It evaluates the uncertainty associated with a predicted value at the unsampled locations. The weights used to predict are from the spatial dependence between the sample points.

Ordinary Kriging Interpolation Method
Ordinary Kriging is like simple kriging, but it is assumed that the process has an unknown but constant mean. However, this method forms the predicted values, (s), in one step, from the original process Y(s), at s, with weights equal to a linear combination of the observed values at the sample sites . To visualize, that is: (7) Typically, the values are chosen for the weights such that the mean value of is constrained to be , like the predicted random variable . Since the mean μ is constant among random variables, the mean for will also be μ provided that . The objective of this is to predict all values between and while yielding a minimum expected mean square error. Prediction is unbiased when the expected value of the difference between sample values and estimated values is equal to 0 -if and only if the sum of weights is in unity and mean squared error is minimized. Additionally, the mean squared prediction error, also known as the kriging variance, will be: (8) where C+ is an augmented matrix and c+(s) is an augmented vector equal to C+ω+(s). To solve for the predicted values, ω+(s) to extract ω(s) so that (s) = ω(s)y, where y is the vector containing the original observations y.

Universal Kriging Interpolation Method
To accommodate a global trend and when the mean is unknown and not constant anymore, the method of ordinary kriging is extended to a more general case which they refer to as universal kriging. In this technique, a first-order trend component is included. Like ordinary kriging, this technique forms a prediction for y in one step and utilizes a linear combination of weights from the observed values at the sample sites . Additionally, the predicted values will be unbiased if and only if the sum of weights is 1. Universal kriging estimates the trend to obtain residuals after modeling the variogram. The mean square prediction error is given by: (9)

Cross-Validation
To assess the adequacy of a spatial correlation model and results from kriging, cross-validation is introduced by using the sample data. This approach can be used to choose the appropriate lag and angle tolerances for the variogram estimation. Cross-validation is achieved by: 1. First, exclude the observation from the sample points temporarily 2. The observations are then estimated using the remaining points by an estimation method or model 3. The original observations and estimated ones are then compared for all data points For the method to be valid, the equation below must be approximately equal to zero (10) where: is the predicted observation from all data points except ; and is the mean squared prediction error of the predicted observations. Equation 10 is like what Oliver and Webster (2015) defined as the mean squared deviation ratio or MSDR. For a model to be unbiased, MSDR should be close to 1. Furthermore, the square root of the whole equation above with the values inside the summation being squared should approximately be equal to 1. This is much like the concept of a root mean squared error (RMSE) or simply the root of the squared difference between the observed and predicted values all over the number of observations. The smaller RMSE, the better. Root mean square standardized (StndRMSE) is used as an indicator if the  [10]. Furthermore, to determine if a model is a good fit for the data, the prediction sum of squares (PRESS) is calculated and is given by: (11) If a small value is obtained for a model it is considered as a good fit that is perhaps a value close to zero, since it is aimed at the difference between the original observations and predicted values is also zero or somewhat close to that value. This is called the mean squared error (MSE), a common indicator if a model or estimator fits well.

Water Quality Index
An advantage of WQI is that it combines the data related to all the tested physicochemical parameters for a specific location to produce a single value that makes it very easy to understand the overall water quality water at that area. The WQI equation is calculated based on the mathematical framework for the protection of marine life set by the Canadian water quality guidelines: Canadian Council of Ministers of the Environment Water Quality Index 1.0 (CCME 2001) for the assessment of ambient water quality in relation to the water quality objectives (Table 3 (17) The water quality index is computed by adding the three factors (assuming they are vectors). With this, the index changes in direct proportion in all factors. The CCME Water Quality Index (CCME WQI) is shown in Equation 13. The divisor 1.732 simplifies the resultant values between 0 and 100, where 0 signifies the "worst" and 100 as the "best" [11].
(18) Although the computation of WQI values can be achieved manually, this is impractical. A spreadsheet calculator that automates the process is available on the website of CCME [11]. The researchers will be utilizing the spreadsheet calculator of the CCME WQI.

Descriptive Statistics
A good dataset must have a histogram that has a normal curve, i.e. bell-shaped. The breaks and differing amounts can be an indication of clustering in the dataset.
As seen in Figure 3, the datasets for 2016, 2017, 2018, and 2019 are skewed to the right. For this reason, these variables were transformed to achieve a bell-shaped histogram. As seen in Figure 4, the dataset for 2016 is skewed. Log transformation is needed to be applied in the dataset of 2017, 2018, and 2019 to achieve a bellshaped histogram. As seen in Figure 5, the 2016 dataset is skewed. 2017, 2018, and 2019 dataset are skewed to the left for this reason these variables were transformed to achieve a bell-shaped histogram.
As seen in Figure 6, 2017, and 2018 datasets are highly skewed. 2016 and 2019 dataset are skewed to the right for this reason these variables were transformed to achieve a bell-shaped histogram. As seen in Figure 7, 2016, and 2019 datasets are skewed to the left. 2017 and 2018 dataset are skewed to the right. For this reason, log-transformation was applied to achieve a bell-shaped histogram. As seen in Figure 8, 2017, 2018, and 2019 datasets are skewed to the right. For this reason, these variables were transformed to achieve a bell-shaped histogram.

BOD
The bubble plot shown in Figure 9 indicates a relatively high BOD concentration in the west of Laguna Lake from 2016 to 2019. This area consists of rivers flowing around the Cavite area, the Metropolitan Manila area, and western Laguna. High levels of BOD indicate organic water pollution and attention must be given to these areas since they are at risk of water contamination. There is a significant change in the BOD concentration from 2016 to 2019. Figure 10 shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of BOD concentration. In ordinary kriging, a low BOD level is present in the east region of the Lake which is in line with the findings from the bubble plot. There is a high BOD level in the west region of the lake.

DO
The bubble plot indicates a relatively high DO concentration in the east area of Laguna Lake from 2017 to 2019. This area consists of rivers flowing around Rizal area, and eastern Laguna. In 2016, there is a high DO concentration in the west region of the lake. This area consists of rivers flowing around Cavite area, Metropolitan Manila area, and western Laguna. Figure 12 shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of DO concentration. In ordinary kriging, a high DO level is present in the west region of the Lake in 2016 and the east region from 2017 to 2019. There is a significant decrease in DO level in the western region from 2016 to 2019.

pH
The bubble plot shows that the central region of the lake is more basic. Monitoring Station 20 in Jala-jala River is more acidic from 2016 to 2019. Figure 14 shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of pH concentration. In ordinary kriging, the lake is more basic in 2016, and the region around monitoring station 20 in Jala-jala River can be seen to be more acidic from 2016 to 2019.

Nitrate
The bubble plot indicates a relatively high Nitrate concentration in the north area of Laguna Lake from 2017 to 2019. This area consists of rivers flowing around Rizal area, and northern Laguna. There is no significant change in the Nitrate concentration level from 2016 to 2019. Figure 16. shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of Nitrate concentration. In ordinary kriging, a high Nitrate level is present around the Lake. There is a significant decrease in DO level in the east region in 2017.  Figure 18 shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of Phosphate concentration. In ordinary kriging, the lake has a higher Phosphate level in the northwest region in 2016 and 2017 which lowers from 2018 to 2019. In 2017, there is an increase in Phosphate concentration.

Ammonia
The bubble plot shown above indicates a relatively high Ammonia concentration in the west of Laguna Lake from 2017 to 2019. This area consists of rivers flowing around Metropolitan Manila area, and western Laguna. In 2016, a high level of ammonia can be observed in the Rizal area and eastern Laguna. Ammonia concentration is significantly lower in 2017 to 2019 as compared to 2016. Figure 20 shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of Ammonia concentration. In ordinary kriging, a high Ammonia level is present in the west region starting from 2017 to 2019 which is in line with the findings from the bubble plot. In 2016, the ammonia level in the southeast region is high but in 2017 to 2019 the ammonia level in the southeast region have improved.   The The water quality in Laguna Lake is in the rank of marginal to excellent water quality whereas for its tributaries the quality is in poor to good water quality.

Descriptive Statistics
As seen in Figure 21, the data distribution of the WQI is not bell-shaped, for this reason, these variables were transformed to achieve a bell-shaped histogram. Log-transformation was applied.

Exploratory Spatial Data Analysis
The bubble plot in Figure 22 indicates that there is a relatively low rank of water quality indices in the west and north region. These are lakes flowing to Metropolitan Manila, Cavite area, and Rizal area. The WQI for the lake is Marginal to Excellent. Figure 23 shows the prediction maps obtained from the Ordinary Kriging interpolation method. It shows the different estimates of the water quality indices. The WQI map shows that the upper Northwest area of the lake has poor water quality. The water quality in the Lake per se is ranked as Marginal to Excellent. The WQI for most of the areas is poor. It can be observed that from 2016 to 2017 the water quality of the lake has worsened and improved in 2018.

Summary, Conclusions, and Recommendations
The following are the insights obtained from the interpolation: (a) In ordinary kriging, a low BOD level is present in the east region of the Lake: concentration level in the east region range from 0 to 5 mg/l. There is a high BOD level in the west region (10 to 15 mg/l). BOD concentration in the west region (tributary rivers found in Cainta, Taytay, Marikina, Muntinlupa, Taguig, Cabuyao, Calamba, San Pedro, and Sta. Rosa) have values that fail the DENR WQG for Classes A to D waters. The center region of the lake has a BOD concentration range of 5 to 10 mg/l. This means that organic pollution is present in these areas. Attention must be given to these areas since they are at risk of water contamination. (b) In ordinary kriging, in 2016 the DO level of the lake is in the range of 5 to 10 mg/l. In the succeeding years, the DO concentration in the west region has significantly lowered ranging from 0 to 4 mg/l  10 these include rivers flowing from Metropolitan Manila, Cavite area, and Rizal area. This value fails the DENR WQG for Classes A to D waters. These areas indicate that the body of water is not healthy enough to sustain many aquatic species. (c) In ordinary kriging, most of Laguna lake has a pH level of 8 to 8.4 while at the area surrounding the Monitoring station 20 (Jala-jala River) the pH concentration ranges from 6.5 to 6.8. These values comply with the DENR WQG For Classes A and B waters. In terms of pH level, the waters can be a source of water supply requiring conventional treatment. The water is also safe for primary recreation (swimming and bathing) (d) Under ordinary kriging, there is nitrate concentration of 0.5 to 5 mg/l in Laguna lake and those tributary rivers found in the Metropolitan Manila area and Rizal area where there is a lower nitrate concentration level in the rivers flowing in Cavite area. This range conforms with the DENR WQG for Classes A, B, and C waters. In terms of nitrate concentration, the waters can be a source of water supply requiring conventional treatment. Water is also safe for primary recreation. Additionally, the waters can be used for the fish and other aquatic resources growth and propagation as well as for boating, fishing, agriculture, irrigation, and livestock watering. (e) Under ordinary kriging, there is phosphate concentration with a range of 0.75 to 2 mg/l in tributary rivers found in the Metropolitan Manila area, Cavite area, and Rizal area. This value conforms with DENR WQG for Class D waters. The waters in these areas are navigable. There is a lower nitrate concentration level (0 to 5 mg/l) in Laguna Lake. This range conforms with the DENR WQG for Classes A, B, and C waters. In terms of nitrate concentration, the waters can be a source of water supply requiring conventional treatment. They are also safe for primary contact recreation. (f) Under ordinary kriging, Laguna Lake and its tributaries have an ammonia concentration of about 0.09 to 0.12 mg/l. Rivers flowing from Metropolitan Manila, Cavite Area, and Rizal Area have higher concentrations. Furthermore, the value of 0.09 to 0.12 mg/l fails the DENR WQG for Classes A to D waters. Ammonia concentration of this value is noxious to freshwater organisms thus actions must be taken to treat it to prevent damage to livelihood.
The integration of kriging within Geographic Information Systems provides a resourceful means for data integration, data analysis, and data visualization. It is therefore proper to conclude that GIS is an effective decision support system for the prediction of the conditions of the lake's water quality.
Water Quality Index indicated that there is poor water quality in the upper Northwest region of the Lake. The water quality of in the Lake per se is ranked as Marginal to Excellent. The WQI for most of the areas is poor. It can be observed that from 2016 to 2017 the water quality of the lake has worsened and improved in 2018. In comparison with the other water body classifications, the stations that have poor water quality as Class C as its objective value will also have a WQI of poor if Class A is used as an objective value. Those stations with excellent rank in Class C water may have a lower ranking when compared to Class A objective values.
The physicochemical parameters considered in the study indicate that the life of the lake and its tributaries are in threat. Rehabilitation of the lake must be given attention and must be properly planned and executed. The national government and the responsible authorities must collaborate for the effort in rehabilitating and cleaning the lake and its tributary rivers.