The Prediction of the Trihalogenomethanes Content in Drinking Water: Infiltration Water Intake

This study aims to establish a relationship between the trihalogenomethanes content in drinking water from an infiltration intake and parameters characterizing water quality (turbidity, chromaticity, oxidizability), chlorine dose, and water consumption. Trihalomethanes include chloroform, bromodichloromethane, dibromochloromethane, and bromoform. Trihalomethanes are known to harm public health: these compounds have a toxic polymorphism and can cause long-term effects, including carcinogenesis. It is shown that a satisfactory description of the trihalogenomethanes content can be obtained by modeling the annual cycle while taking into account the displacement of the parameters (2-3 months) relative to the trihalogenomethanes content. Equations that consider the displacement of water quality indicators have a high determination coefficient (0.95-0.97). Applying the same displacement to the time series of true values yields equations with a sufficiently high determination coefficient, which indicates that they can be used for the predictive assessment of the trihalomethanes content in drinking water.


Introduction
One of the sanitary and epidemiological well-being measures is to provide the population with high-quality drinking water [1]. The water supply of an urban agglomeration is usually organized through the use of surface or infiltration water. Infiltration water intake can provide higher drinking water quality. Water filtered through the soil is well clarified, since suspended particles are trapped in the soil pores. However, a disinfection stage is still required in this case, especially if the water distribution networks are long [8].
Thus, water disinfection includes a complex of measures taken to prevent and eliminate infectious diseases transmitted by water [15,16]. Chlorination is the most common and reliable method of water disinfection.
A significant drawback of this method is the formation of chlorination products, organohalogen compounds of various compositions [2], most of which are trihalogenomethanes. This group of compounds includes chloroform, bromodichloromethane, dibromochloromethane, and bromoform. Trihalomethanes (THM) can harm public health. These compounds have a toxic polymorphism and can cause long-term effects, including carcinogenesis [3,6,7]. The sources of THM formation are anthropogenic and natural substances [3]. One of the most significant sources of THM formation is the presence of aqueous humus in the river water [4,9,14] because, despite the filtration of groundwater through the soil, some of the humus penetrates through the filter layers.
The quantitative THM content is also determined by factors such as temperature, the composition of organic compounds, the material of the water pipes, the nature and dose of the chlorinating agent, pH, and others [5,10,11].
Analytical control of water quality at water treatment facilities includes turbidity, chromaticity, and oxidizability, and does not provide for direct determination of THM precursors. In this regard, it seems reasonable to consider the relationship between the generalized indicators of water quality (turbidity, chromaticity, oxidizability), the chlorine dose, and the amount of THM produced. Another parameter that can affect the concentration of THM in drinking water is water consumption. It largely determines the magnitude of the generalized water quality indicators due to the disturbance of bottom sediments, washout from the coastal zone, and other reasons [12,13].

Materials and Methods
The calculations were based on analytical control of the THM content in drinking water obtained by the Water Quality Analytical Control Center. The analytical control was performed in a water supply organization of a large urban agglomeration with infiltration-type water intake.
Multivariate correlation and regression analysis was used for modeling. This method allows us to identify the relationship between the THM content in drinking water and the chlorine dose (DCl) together with some generalized quality indicators of the water source: turbidity (T), chromaticity (C), and oxidizability (O), which were supplemented by the flow rate of the incoming water during the analysis of the water source (Q).
The THM content in drinking water is measured once a month, which gives 12 measurements per year. The total number of observations of the THM content was n = 216. In contrast to THM, turbidity, chromaticity, oxidizability, and water consumption at hydroelectric power stations are measured daily. Water consumption is the product of the average flow velocity and the cross-sectional area of the watercourse. For these indicators, average monthly values were calculated by summing all values for the month and dividing the result by the number of measurements in that month. Thus, data series were formed containing the same number of values as the series of THM values (n = 216). The chlorine dose was calculated from the average monthly water supply per day (m³/day) and the average monthly amount of chlorine spent (kg/day). As a result, the average chlorine dose for each month was obtained.
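The monthly averaging and the dose calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout `(year, month, value)`, the function names, the sample readings, and the choice of mg/L as the dose unit are all assumptions made for the example.

```python
from collections import defaultdict


def monthly_means(daily):
    """Average daily readings within each (year, month).

    `daily` is an iterable of (year, month, value) tuples. This record
    layout is an assumption of the sketch, not the study's data format.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, month, value in daily:
        sums[(year, month)] += value
        counts[(year, month)] += 1
    # Sum of all values for the month divided by the number of measurements.
    return {key: sums[key] / counts[key] for key in sorted(sums)}


def chlorine_dose(chlorine_kg_per_day, supply_m3_per_day):
    """Chlorine dose in mg/L (assumed unit): 1 kg/m^3 equals 1000 mg/L."""
    return chlorine_kg_per_day / supply_m3_per_day * 1000.0


# Synthetic readings, for illustration only.
daily_turbidity = [
    (2011, 1, 1.0), (2011, 1, 2.0), (2011, 1, 3.0),
    (2011, 2, 4.0), (2011, 2, 6.0),
]
turbidity_by_month = monthly_means(daily_turbidity)
dose = chlorine_dose(150.0, 100_000.0)
```

Applied to each daily indicator, this yields one value per month, so every series has the same length as the monthly THM series.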
From the monthly average values of the THM content, turbidity, chromaticity, oxidizability, and water consumption, long-term average values of these indicators are formed for each calendar month. This gives 12 values for each indicator.
The described algorithm allows us to exclude from the study the influence of variable seasonal factors while maintaining the general tendency and dynamics of changes in turbidity, chromaticity, oxidizability, and the flow rate of water entering the water source.
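A minimal sketch of this long-term averaging is given below, assuming the monthly means are stored in a dictionary keyed by `(year, month)`. The function name and the synthetic series (18 years of 12 values, i.e. 216 points) are hypothetical and serve only to illustrate the reduction to 12 values.

```python
def annual_cycle(monthly):
    """Average each calendar month over all years, giving 12 values.

    `monthly` maps (year, month) -> monthly mean; month is 1..12.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for (year, month), value in monthly.items():
        sums[month - 1] += value
        counts[month - 1] += 1
    if 0 in counts:
        raise ValueError("every calendar month needs at least one value")
    return [s / c for s, c in zip(sums, counts)]


# Synthetic series: 18 years x 12 months = 216 monthly values.
monthly = {
    (year, month): month + (year % 2)
    for year in range(18)
    for month in range(1, 13)
}
cycle = annual_cycle(monthly)
```

Each entry of `cycle` is the mean of one calendar month across all years, so year-to-year differences in the timing of seasonal events are smoothed out.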

Results and Discussion
The main parameter that affects the THM content is the chlorine dose (DCl). Over the study period, both the chlorine dose and the THM content tended to increase (figure 1).
$$[\mathrm{THM}] = b_{01} + b_{11} X \qquad (1)$$
The results of calculating the correlation coefficient between the chlorine dose and the THM content show a "weak" bond strength on the Chaddock scale (r = 0.10):
b03 = 4.9; b13 = 2.7; r = 0.1; R² = 0.01; F = 6.33; S = 22.2; A = 37.1
where r is the correlation coefficient; F is the Fisher test; b03, b13 are the coefficients of the regression equation; S is Student's criterion; A is the relative error (%).
Even though the chlorine dose is a priority factor in THM formation, modeling the THM content as a function of the chlorine dose alone [5] does not give satisfactory results. In this regard, an attempt was made to use multivariate correlation and regression analysis to model the THM content in drinking water obtained at the infiltration water intake. As independent variables, in addition to the chlorine dose, generalized indicators of source water quality were used: turbidity, chromaticity, and oxidizability. These indicators are primarily related to the content of THM precursors in water, such as humus and fulvic acid, and they determine the water purification technology [9].
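A single-factor fit of the form of equation (1), together with the statistics listed above, could be computed along the following lines. The formulas for F, Student's t, and the mean relative error are the standard textbook ones and are assumed here, since the source does not state which exact formulas were used. The data are synthetic.

```python
import math


def simple_regression(x, y):
    """Least-squares fit y = b0 + b1 * x with basic quality statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    r = sxy / math.sqrt(sxx * syy)
    r2 = r * r
    # F and t for one predictor; infinite when the fit is perfect.
    unexplained = 1.0 - r2
    f_stat = r2 / unexplained * (n - 2) if unexplained > 0 else math.inf
    t_stat = r * math.sqrt((n - 2) / unexplained) if unexplained > 0 else math.inf
    # Mean relative (approximation) error in percent; y must be nonzero.
    rel_err = 100.0 / n * sum(
        abs((yi - (b0 + b1 * xi)) / yi) for xi, yi in zip(x, y)
    )
    return {"b0": b0, "b1": b1, "r": r, "R2": r2,
            "F": f_stat, "t": t_stat, "A": rel_err}


# Synthetic dose / THM pairs, for illustration only.
dose = [1.0, 2.0, 3.0, 4.0, 5.0]
thm = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_regression(dose, thm)
```

For a single predictor the determination coefficient is the square of the correlation coefficient, which is consistent with r = 0.1 and R² = 0.01 reported above.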
However, even in this case, it is impossible to obtain a satisfactory relationship between the THM content and the chlorine dose, turbidity, chromaticity, and oxidizability (table 1). It is interesting to note that chromaticity is an insignificant parameter of the regression equation (4). According to the equation, the chlorine dose and oxidizability increase the THM concentration, while turbidity decreases it. The value of the Fisher test indicates the statistical significance of the determination coefficient and the statistical reliability of the regression equation (i.e., the coefficients are jointly significant). However, the low determination coefficient of the regression equation (R² = 0.08) indicates that the parameters (chlorine dose, turbidity, oxidizability) describe the changes in the THM content unsatisfactorily. The average approximation error, using the regression equation (3), is 67.0%.
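A multivariate fit of this kind can be sketched with ordinary least squares solved through the normal equations. This is an illustrative, standard-library-only sketch and not the authors' procedure; the function names and the synthetic data (generated from known coefficients so that the fit can be checked) are assumptions.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0, *row] for row in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(b * v for b, v in zip(coef[1:], row))


def r_squared(y, fitted):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def mean_approximation_error(y, fitted):
    """Average of |y - fitted| / |y| in percent; y must be nonzero."""
    return 100.0 / len(y) * sum(abs((yi - fi) / yi) for yi, fi in zip(y, fitted))


# Synthetic rows of (chlorine dose, turbidity, oxidizability).
rows = [(1, 2, 3), (2, 1, 5), (3, 4, 2), (4, 3, 6), (5, 6, 1), (6, 5, 4)]
# Synthetic response built as 2 + 3*dose - 1*turbidity + 0.5*oxidizability.
thm = [4.5, 9.5, 8.0, 14.0, 11.5, 17.0]
coef = fit_ols(rows, thm)
fitted = [predict(coef, row) for row in rows]
r2 = r_squared(thm, fitted)
error = mean_approximation_error(thm, fitted)
```

The signs of the fitted coefficients show the direction of each effect, in the same way that the text reads the signs for chlorine dose, oxidizability, and turbidity.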
The flow of the river on which the infiltration water intake is located is regulated by a reservoir. Reservoirs have a significant impact on the river water regime and, therefore, on the qualitative composition of the river water. In turn, the quality of the water source depends significantly on the flow rate and quality of the river water. Introducing the water flow rate through the water source can increase the degree of correlation between the THM content and the parameters. In this regard, an additional parameter, water consumption (Q), was introduced into equation (4), which takes into account the chlorine dose, turbidity, chromaticity, and oxidizability.
A comparative analysis of the characteristics of equations (4) and (5) shows that introducing the water consumption parameter into the regression equation has little effect on the bond strength between the THM content and the parameters (table 3). The determination coefficient increased slightly, and the relative error remained practically unchanged (table 2). Thus, the regression equation (5) cannot be used. It can be assumed that the weak relationship between the THM content and the chlorine dose, water consumption, and parameters of its condition is caused by climatic conditions, for example, different timing of the onset of spring and of the autumn floods. Some averaging of such an [...] (table 3), indicate the possibility of using equation (6) to assess the quality of drinking water according to the THM content. Thus, the correlation coefficient is 0.89, and the approximation error is 31.2%.
It is noteworthy that when the changes in the THM content and the other parameters are plotted over the annual cycle, their maximum values are displaced relative to each other. The maximum THM concentration occurs in July and coincides with the maximum chlorine dose; for chromaticity and oxidizability the maximum occurs in May, and for turbidity in April (Fig. 2). Thus, the maximum THM content is delayed by 2-3 months compared with the maximum values of chromaticity, oxidizability, water consumption, and turbidity.
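The displacement between the maxima can be read off the 12-value annual cycles programmatically. The sketch below is illustrative only: the function names are hypothetical and the three cycles are synthetic shapes whose peaks are simply placed in the months named in the text (July, May, April), not measured data.

```python
def peak_month(cycle):
    """Return the 1-based calendar month at which a 12-value cycle peaks."""
    return max(range(12), key=lambda i: cycle[i]) + 1


def lag_months(target_cycle, parameter_cycle):
    """Months by which the target's peak follows the parameter's peak (0..11)."""
    return (peak_month(target_cycle) - peak_month(parameter_cycle)) % 12


# Synthetic annual cycles (January..December), peaks placed as in the text.
thm_cycle = [4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5]            # peak in July
chromaticity_cycle = [6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 5]   # peak in May
turbidity_cycle = [7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 5, 6]      # peak in April

lag_chromaticity = lag_months(thm_cycle, chromaticity_cycle)
lag_turbidity = lag_months(thm_cycle, turbidity_cycle)
```

The modulo keeps the lag within one year, so a parameter peaking in December and a target peaking in February would give a lag of 2 rather than -10.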
Multivariate regression analysis performed to find a relationship between the THM content and the parameters, taking into account the displacement of the parameters' maximum values relative to the THM content, showed a significant increase in the correlation coefficient (r = 0.98), an increase in the pair correlation coefficients, and a decrease in relative error (table 4). Simultaneously, the bond strength "THM content - parameter" also increases, which allows the THM content to be assessed from each of the parameters (table 5). The obtained regression equation (7) for forecasting the THM content in the annual cycle (long-term forecasting), taking into account the displacement in turbidity, chromaticity, oxidizability, and river water consumption, is reliable. It has a high determination coefficient; the average relative forecast error was 5.8% (table 5).
It seems interesting to analyze whether the displacement used in modeling the annual cycle can also be applied to the true time series of the water parameters (turbidity, chromaticity, oxidizability, and water flow) for the entire observation period. For this purpose, a regression equation is defined. It describes the period 2011-2014 ([...]). We can assume that, in this case, the necessary reliability of the equation is achieved, and it becomes possible to model the THM content not only in the annual cycle but also for the current time (figure 3). Because the accuracy of THM analysis is 26%-42%, the obtained equation can be considered acceptable for a preliminary assessment of the THM content in drinking water.
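The effect of the displacement on the bond strength can be illustrated by shifting a parameter series before correlating it with the THM series. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the synthetic THM series is constructed as an exact 2-month-delayed copy of the parameter, so the lagged correlation is 1 by design.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shift_cycle(cycle, lag):
    """Delay a cyclic annual series by `lag` months, wrapping around."""
    lag %= len(cycle)
    return list(cycle[-lag:]) + list(cycle[:-lag])


def align_lagged(y, x, lag):
    """Pair y[t] with x[t - lag] for a true (non-cyclic) time series."""
    return list(x[:len(x) - lag]), list(y[lag:])


# Synthetic annual cycle of one parameter; THM follows it 2 months later.
parameter = [6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 5]
thm = shift_cycle(parameter, 2)

r_unshifted = pearson(parameter, thm)
r_shifted = pearson(shift_cycle(parameter, 2), thm)

# The same lag applied to a multi-year series built from three cycles.
parameter_series = parameter * 3
thm_series = shift_cycle(parameter_series, 2)
x_lagged, y_lagged = align_lagged(thm_series, parameter_series, 2)
r_series = pearson(x_lagged, y_lagged)
```

For a true time series the first `lag` THM values have no matching parameter value, so the aligned series are shorter than the originals by `lag` points.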

Conclusion
The analysis of the possibility of predicting the THM content from the generalized water quality indicators of the water source (turbidity, chromaticity, oxidizability), the chlorine dose, and water consumption shows that such predictions of the THM content in drinking water have a low determination coefficient.
Using the average monthly values of the specified parameters for prediction increases the degree of connection between the parameters. This allows the obtained regression equations to be used for the long-term prediction of the THM concentration. Additionally, a comparison of the monthly average values of turbidity, chromaticity, oxidizability, chlorine dose, water flow, and THM concentrations revealed that the maximum THM concentrations are shifted relative to the maxima of the other parameters by 1-3 months. Equations that consider the displacement of water quality indicators have a high coefficient of determination (0.95-0.97). Applying the same displacement to a time series of true concentrations yields equations with the determination coefficient R² = 0.65.

Acknowledgments
The work was performed within the framework of the state task of the Ministry of Science and Higher Education of the Russian Federation in the field of scientific activity. Publication number FEUR-2020-0004.