A comprehensive analysis of impacts of socio-economic development and land use on river water quality in a megacity-region: a case study

A comprehensive analysis of impacts of socio-economic development and land use on water quality is very useful for better planning and management of river basins by government. In this study, the North Canal River through a megacity-region of Beijing-Tianjin-Hebei Province in China was chosen to quantify impact of 10 socio-economic factors and 6 land use types on water quality in the watershed. The principal component regression (PCR) method was thus applied in this study to quantify effects of socio-economic development and land use types on river water quality through overcoming multicollinearity problems among influencing factors. Results showed that the water quality in the North Canal River improved significantly from serious pollution. Compared with 2010, the annual average pollution index (PI) of COD, NH3–N, TN, and TP decreased significantly in 2018, especially the PI of NH3–N decreased the most, from 8.62 in 2010 to <1 in 2018, implying that the concentration of NH3–N basically met the water quality assessment standard in 2018. The major water pollutant in the basin had shifted from NH3–N in 2010 to TN in 2018. The absolute impact coefficients of industrial restructuring, population density, upgraded municipal sewage treatment requirements (including sewage treatment rate and discharge standards of water pollutants for municipal WWTPs), and urbanization rate with COD, NH3–N and TP were higher than those of other factors, indicating that the impact of socio-economic factors on water quality was more significant than that of land use types, and the socio-economic factors may offset or alter the impact of land use on river water quality in areas disturbed by human activities.


Introduction
Cities concentrate capital, labor, and information, and are important regions reflecting the degree of national development. With rapid economic development, individual cities drive neighboring cities to form highly developed, integrated urban spatial patterns (Glazer et al 2003, Mossay et al 2020). Under this urban development pattern, economic resources are rapidly integrated, urbanization accelerates, and population aggregates and grows, placing further demands on the ecological environment, such as adequate water resources and a clean water environment. However, the limited resources of watersheds mean that water scarcity and water pollution occur frequently in developing countries, especially in countries or regions with rapid economic growth (Vörösmarty et al 2010, Xu et al 2019, Hong et al 2022), which places higher requirements on the environmental protection and management of urban river watersheds (Shi et al 2017). River pollution is influenced by a variety of anthropogenic and natural factors, such as land use, socio-economic status, and climate variations (Chen et al 2016, Luo et al 2019, Luo et al 2020). Understanding the variation of river water quality and identifying the key influencing factors are important for developing sustainable watershed management policies. Therefore, an in-depth analysis of the factors influencing river water quality based on socio-economic and land use changes is of great value for guiding future planning and management of urban development from the point of view of watershed management. A number of previous studies have identified the relationship between river water quality and land use types.
Different land use patterns vary widely in soil nutrient cycling, topography and vegetation, and these environmental parameters can alter the transport and transformation of pollutants in the watershed, ultimately leading to degradation of river water quality (Seitz et al 2011). For example, built-up land and agricultural land were significantly and positively correlated with pollutant concentrations in rivers, while grassland and undeveloped areas were negatively correlated with pollutant concentrations (Peterjohn and Correll 1984, Records et al 2016). Stets et al (2020) explored the factors influencing water quality by analyzing water quality changes in U.S. streams and rivers from 1982 to 2012, showing that undeveloped sites had the lowest concentrations of all nutrients, agricultural sites had the highest concentrations of total nitrogen (TN), nitrate nitrogen (NO3−), and total organic carbon (TOC), and urban sites had the highest total phosphorus (TP) concentration.
Socio-economic factors such as human population, population density, gross domestic product (GDP) and gross domestic product per capita (GDPPC) have a more complex impact on river water quality (Wang et al 2008, Chen et al 2016, Zhou et al 2019, Feng et al 2021). The most direct impact comes from discharges of urban domestic sewage and industrial wastewater, which are related to population density and the mode of industrial production. Estrada-Rivera et al (2022) reported that population growth, poorly planned industrial development and uncontrolled production processes caused serious water pollution in the Alto Atoyac watershed, where the water quality parameters BOD5, COD and TSS were positively correlated with population. Feng et al (2021) reported that the industrial wastewater compliance discharge rate and sewage treatment were the main socio-economic factors affecting the water quality (i.e., COD, NH3-N, TN, and TP) of Dongting Lake. Luo et al (2019) studied the relationship between a socio-economic system, consisting of factors such as population density, GDP and sewage treatment rate, and a river system, consisting of COD, TN, TP, etc, and found that the two systems were highly related, with an average influence degree greater than 0.9. Chen and Lu (2014) found that population density and GDPPC explained 60.8% of the total variance in river water quality in East China. The study also claimed that urban area was strongly correlated with the distributions of major river water quality parameters.
Within natural or semi-natural areas, the effects of land use types on water quality are similar across different geographical areas at the watershed scale because of the low intensity of human activities and similar land use characteristics (Wang et al 2008, Fu et al 2020, Omidvar et al 2021). Nevertheless, in areas more heavily disturbed by human activities, the types of nutrients in water bodies and the characteristics of their accumulated concentrations may differ even within the same land use type. Especially in urban areas, human activities cause heterogeneity not only in land use types, but also in the types and levels of socio-economic activities carried by the same land use type (Luo et al 2019). The link between land use change and water quality change may therefore be uncertain, and the effect on water quality may be offset or reduced by socio-economic differences, resulting in deviations between the analysis results and reality. Liu et al (2012) indicated that agricultural land use and socio-economic factors had no significant correlation with any eutrophication parameter in Yunnan Plateau lakes. However, in megacity regions, few investigations have simultaneously analyzed the relative importance of both land use types and factors associated with socio-economic development (i.e., industry, population, and technology, etc) to explain the variation pattern of river water quality. The complexity of human activities in megacity areas and the multiple, complex factors influencing water pollution in watersheds require more in-depth data mining. A comprehensive analysis of the impact of land use patterns and socio-economic development on water quality changes in the watershed, together with a specification of land use types and economic development patterns, can guide landscape planning and help establish best management practices in the watershed.
Recently, Pearson or Spearman correlation has commonly been used to determine the relationships between river water quality and land use types (Xu et al 2019, Krishnaraj and Deka 2020); it is a simple approach, but it does not quantify the interrelationships. Cluster analysis (CA), principal component analysis (PCA) (Chadwick et al 2006, Fu et al 2020), and redundancy analysis (RDA) (Chen and Lu 2014) provide only quantitative information about the proportion of variance explained by variables, by reducing the dimensionality of the data set to a few of the most important factors. Multiple stepwise regression (MSR) (Liu et al 2012, Fu et al 2020) and canonical correlation analysis (CCA) (Luo et al 2019, Feng et al 2021) provide quantitative ways to interpret complex water quality variations. The MSR method is widely employed to determine the land use patterns that best explain the spatial variability of individual river water quality variables. However, multicollinearity among influencing factors is the primary obstacle to the application of these methods. In previous studies, strongly intercorrelated variables were removed, leaving the remaining variables for further MSR or CCA analysis (Luo et al 2019). Nevertheless, intercorrelations between variables are common, and removing any variable will affect the results of the analysis. Chen et al (2016) considered the correlation of factors and applied a modified geographically weighted regression to analyze the impacts of land use and population density on surface water quality.
With this background in mind, this study applies principal component regression (PCR), which sidesteps the problem of multicollinearity among factors and quantifies the effects of both land use and socio-economic factors on a single river water quality indicator. This approach is a regression technique based on principal component analysis (PCA), and it has been widely used in medicine and socio-economics (Jolliffe 1982, Tu and Peng 2012). The main use of PCR is to overcome multicollinearity, which arises when two or more explanatory variables are close to being collinear. Therefore, the problem of multicollinearity among indicators need not be addressed separately in the application (Dodge and Commenges 2006, James et al 2013, Lee et al 2015, Chen et al 2016). The objective of this study is thus to analyze the water quality of rivers influenced by socio-economic development and land use in the megacity area, to explore the factors influencing river water quality, and to quantify their degree of influence. The North Canal River was taken as the study site. It is located in the megacity-region of the Jing-Jin-Ji Urban Economic Circle, which includes Beijing, Tianjin and Hebei Province and is the most important urban agglomeration in the northern part of China. In this region, most studies have focused on the impact of land use on a particular river water quality indicator and on the distribution characteristics of pollution sources (Dai et al 2015, Liao et al 2022, Liu et al 2023). Liu et al (2018) studied the effect of landscape pattern on riverine nitrogen pollution sources and found that the interspersion and juxtaposition index of forest land was negatively related to nitrate. Few studies have addressed the impact of socio-economic factors on river water quality in the region.
Therefore, during a 2-year (2010 and 2018) monitoring period, a dataset of water quality parameters, socio-economic and land use factors was analyzed using pollution index (PI), Pearson correlation and PCR analysis to (1) identify spatial and temporal trends of river water quality; (2) clarify the correlation between socio-economic factors, landscape patterns and river water quality; (3) quantify the key influencing factors on river water quality; and (4) provide more information and references for decision makers in managing and controlling river water pollution in the megacity-region.

Study area
The North Canal River crosses the mega-region of the Jing-Jin-Ji Urban Economic Circle. Originating from Yanshan Mountain in Beijing, the river flows 143 km through central Beijing, Langfang City of Hebei Province, and Tianjin City, an area with a population of 25.33 million, and ultimately joins the Hai River (figure 1). The total area of the North Canal River basin is 6,166 km², of which 952 km² are mountainous and 5,214 km² are plain. With hills in the northwest and plains in the southeast, surface elevation in the basin ranges from 20 to 1500 m. Located in a warm temperate monsoon climate zone, the river basin has four distinct seasons with an annual average temperature of 10°C-12°C. The annual average rainfall ranges from 500 mm to 700 mm, most of which falls in the summer months from July to September (Guo et al 2012).
The North Canal River has been facing the challenge of water pollution control and river restoration. In 2021, it carried 8% of the country's population on 2.3% of the land, with a population density more than 20 times the national average, and generated 8.4% of the gross domestic product (NBS 2022). The river receives 90% of the drainage from the central area of Beijing, including the effluent from the main WWTPs. In 2010, effluent from WWTPs in the river basin was subject to the Discharge Standard of Pollutants for Municipal Wastewater Treatment Plants (GB 18918-2002). However, to speed up the improvement of surface water quality, Beijing and Tianjin implemented more stringent local discharge standards of pollutants for municipal wastewater treatment plants in 2015 and 2018, namely DB11/890-2012 and DB12/599-2015, respectively. According to Class A of these standards, the discharge limit values of chemical oxygen demand (COD), ammonia nitrogen (NH3-N) and total phosphorus (TP) are equal to those of Class IV and V of the Surface Water Quality Standard of China (GB 3838-2002), respectively. According to the 2021 Annual Report of Beijing Ecology and Environment Statement, roughly 32% of the Beijing section of the North Canal River was assessed as Class IV and V of GB 3838-2002.

Water quality monitoring and sample analysis
In this study, 11 sample sites (D1-D11) were selected along the mainstream, its major tributaries and the confluence, considering the characteristics of the mainstream and tributaries, land use and land cover (LULC) types, and accessibility. Water sampling at the 11 sites was carried out each month from March to November in 2010 and 2018, with each round completed within two days; sampling was not possible in January because the river was frozen. At each site, water samples were collected in accordance with 'Water quality-Guidance on sampling techniques (HJ 494-2009)' (https://www.mee.gov.cn/ywgz/fgbz/bz/bzwb/jcffbz/200910/W020111114543133505806.pdf), with parallel samples (cross-sectional samples) taken from at least three locations at a depth of 0.5 m below the water surface. Water samples were transported to the laboratory in an incubator with ice for chemical analysis within 48 h. Four water quality indicators, COD, NH3-N, TN, and TP, were selected because they are the main indicators used in the water quality management assessment of the North Canal River. All water quality samples were measured using standard analytical methods (EPAC 2002).

Factors of water quality
Two categories of basin characteristics were included in the PCR models to evaluate their impacts on the water quality of the North Canal River (table 1). The impact of the socio-economic system (SES) on water quality was studied using data from 2010 and 2018. Most studies reported that the SES factors under the human … Data for SES variables were obtained on an administrative area basis, area data for LULC types were calculated on the basis of the study area, and river system indicators were obtained through field monitoring, so the types of data do not match on temporal and spatial scales (de Lange et al 2010, Feng et al 2021). Therefore, it is necessary to convert them to the same scale before proceeding with further analysis. On the time scale, data for SES variables (i.e., population, industry, output value and sewage treatment rate, etc), obtained from statistical yearbooks, together with land use types, are considered stable over the year; they could therefore be evenly distributed over the year and converted into monthly data, giving them the same time scale as the river system data. On the spatial scale, data on SES variables obtained on an administrative area basis can be assigned to sub-basins by area-percentage weighting; in addition, the arithmetic mean concentration of the sampling sites in a sub-basin was used as the water quality concentration of that sub-basin, according to the confluence pattern of water bodies and pollutants in rivers. Through these steps, the SES, LULC types and water quality indicators can be converted to the same spatial and temporal scales for further analysis.
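The spatial conversion described above can be illustrated with a minimal Python sketch. It assumes that the total area of each administrative unit and the area of its overlap with each sub-basin are known; the function names, argument names and numbers are hypothetical and are not taken from the study's data.

```python
def allocate_to_subbasins(admin_values, admin_areas, overlap_areas):
    """Assign administrative-area totals to sub-basins by area share.

    admin_values:  one value per administrative unit (e.g. population).
    admin_areas:   total area of each administrative unit.
    overlap_areas: overlap_areas[i][j] is the area of unit i that lies
                   inside sub-basin j (hypothetical GIS output).
    """
    n_sub = len(overlap_areas[0])
    result = [0.0] * n_sub
    for value, area, overlaps in zip(admin_values, admin_areas, overlap_areas):
        for j, overlap in enumerate(overlaps):
            # Each sub-basin receives the fraction of the unit it covers.
            result[j] += value * overlap / area
    return result


def subbasin_mean(site_concentrations, site_subbasins, n_subbasins):
    """Arithmetic mean concentration of the sampling sites in each sub-basin."""
    sums = [0.0] * n_subbasins
    counts = [0] * n_subbasins
    for conc, j in zip(site_concentrations, site_subbasins):
        sums[j] += conc
        counts[j] += 1
    return [s / k if k else float("nan") for s, k in zip(sums, counts)]


# Illustrative numbers only: two administrative units, two sub-basins.
print(allocate_to_subbasins([100.0, 200.0], [10.0, 20.0],
                            [[5.0, 5.0], [0.0, 10.0]]))
print(subbasin_mean([1.0, 3.0, 5.0], [0, 0, 1], 2))
```

Whether an annual figure is repeated for each month or divided by twelve depends on the variable, and the sketch leaves that choice to the user.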

Statistical analysis
To quantitatively describe the contribution of the SES and LULC types to water quality variation in a megacity-region of China, Pearson correlation was first performed to examine the coupling-feedback relationship among the SES, LULC types and water quality in the North Canal River basin in 2010 and 2018. The PCR approach was then applied to further investigate the main factors influencing water quality variations over the past 10 years. Notably, all variables were standardized before the analysis.
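As a minimal sketch of the first step, the following Python code standardizes two variables and computes their Pearson correlation coefficient. The function names and data are illustrative assumptions; significance testing (the p-values reported later) is not covered here.

```python
import math


def standardize(values):
    """Return z-scores: subtract the mean, divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation as the scaled inner product of the z-scores."""
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical example: a sewage treatment rate (%) against a pollutant concentration.
str_rate = [60.0, 70.0, 80.0, 90.0]
concentration = [9.0, 7.5, 4.0, 2.0]
print(round(pearson_r(str_rate, concentration), 3))
```

Because the coefficient is computed from z-scores, it is unaffected by the units of the original variables, which is one reason standardization precedes the analysis.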
In this study, the SES and LULC system consists of sixteen indicators (table 1), and the water quality system consists of eight indicators, which are COD, NH3-N, TN and TP in 2010 and 2018, respectively. The main idea of the PCR approach is to replace the explanatory variables with principal components obtained by PCA, then regress the outcome variable on the principal components to obtain the estimated regression coefficients; finally, the coefficient of each explanatory variable in PCR is a linear combination of the PCA loadings and the coefficients of the principal components. Those coefficients of the explanatory variables in PCR explain their effect on the output variables. We wanted to investigate the degree to which each water quality indicator was affected by the SES and LULC system in 2010 and 2018. Therefore, one PCR model has to be run for each water quality indicator in 2010 and 2018, which means eight PCR model runs in total. Here, we explain the process using one PCR model run as an example: the degree to which COD is influenced by the SES and LULC system in 2010. The process was as follows. Firstly, PCA was performed on the sixteen indicators of the SES and LULC system and COD (the input variables of the PCA model are represented by X) to extract M principal components (PCs). The results can be written as:

$$Z_m = X V_m = \sum_{p=1}^{P} v_{mp} X_p, \quad m = 1, 2, \ldots, M \qquad (1)$$

In the formula, Z represents the matrix of principal components, M is the number of principal components, m denotes the mth principal component, and m ≤ M; for example, the variable Z_1 is the first principal component. P is the number of input variables (P = 17); p denotes the pth input variable. V_m is the loading vector of the mth principal component, and v_mp is the loading of the pth input variable on the mth principal component. Notably, the M retained principal components are those that represent the maximum variances.
In this study, M was chosen so that the retained principal components represent more than 90% of the information in the original input variables.
Secondly, PCR forms the derived input columns Z and then regresses COD on Z_1, Z_2, …, Z_M. Since the Z_m are orthogonal, this regression is just a sum of univariate regressions. In other words, PCR modeling was performed based on the obtained PC_1, PC_2, …, PC_M, which resulted in:

$$y_{PCR} = \sum_{m=1}^{M} \theta_m Z_m \qquad (2)$$

In the formula, y_PCR represents the COD concentration after standardization, Z_m is from equation (1), and θ_m is the regression coefficient of the mth principal component. The coefficients of the original input variables are then:

$$\beta_{(\mathrm{COD},\ \mathrm{the\ SES\ and\ LULC\ system})} = \sum_{m=1}^{M} \theta_m V_m \qquad (3)$$

In the formula, β_(COD, the SES and LULC system) represents the coefficients of the original input variables in the PCR model and is used to indicate the degree of influence of the socio-economic system on COD. V_m is from equation (1), and θ_m is from equation (2).
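The PCR procedure described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the code used in the study: it follows the textbook form of PCR, in which PCA is applied to the explanatory variables only, obtains the loadings from a singular value decomposition, and retains the smallest number of components whose cumulative explained variance reaches the threshold. The function name `pcr_coefficients` and the synthetic data are hypothetical.

```python
import numpy as np


def pcr_coefficients(X, y, var_threshold=0.90):
    """Principal component regression on standardized data.

    Returns (beta, M): the coefficients of the original (standardized)
    explanatory variables and the number of retained components.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize all variables, as stated in the text.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)

    # PCA via SVD: the rows of Vt are the loading vectors V_m.
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    # Smallest M whose cumulative explained variance reaches the threshold.
    M = int(np.searchsorted(cumulative, var_threshold)) + 1
    M = min(M, len(s))

    V = Vt[:M].T                 # (P, M) loadings
    Z = Xs @ V                   # (n, M) principal component scores, eq. (1)
    # Orthogonal scores: each theta_m is a univariate regression, eq. (2).
    theta = (Z.T @ ys) / np.sum(Z**2, axis=0)
    beta = V @ theta             # back to the original variables, eq. (3)
    return beta, M


# Synthetic example with two nearly collinear explanatory variables.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.05 * rng.normal(size=50)   # almost a copy of x1
x3 = rng.normal(size=50)
X_demo = np.column_stack([x1, x2, x3])
y_demo = 2.0 * x1 - 1.0 * x3 + 0.1 * rng.normal(size=50)
beta_demo, M_demo = pcr_coefficients(X_demo, y_demo)
print(M_demo, np.round(beta_demo, 2))
```

When every component is retained, the result coincides with ordinary least squares on the standardized data; dropping the low-variance components is what stabilizes the coefficients of nearly collinear variables.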
The spatio-temporal variation of the conventional water quality parameters was studied using the pollution index (PI) (Wang et al 2008), which is calculated as follows:

$$PI_m = \frac{C_m}{C_m^{O}}$$

where PI_m is the pollution index of water quality parameter m; C_m^O is the standard concentration value, according to the definition of 'Environmental Quality Standard for Surface Water (GB 3838-2002)', mg l⁻¹; and C_m is the actual measured concentration, mg l⁻¹. When PI_m is greater than 1, the water at the sampling site is considered contaminated; otherwise, it is considered not polluted. Various sections of the North Canal River were divided into different water quality functional zones, based on the water quality categories of 'Environmental Quality Standard for Surface Water (GB 3838-2002)'. In this study, the water quality standards of each sub-basin were determined according to the standards of the water quality functional zone in which its sampling sites were located (tables S1 and S3).
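The pollution index can be computed with a short Python sketch. The measured NH3-N value of 2.19 mg l⁻¹ is the 2018 concentration reported for A1 in this paper; the standard value of 2.0 mg l⁻¹ is assumed purely for illustration, since the applicable limit depends on the functional zone listed in table S1.

```python
def pollution_index(measured, standard):
    """PI = measured concentration / standard concentration (same units)."""
    if standard <= 0:
        raise ValueError("standard concentration must be positive")
    return measured / standard


def is_polluted(pi):
    """A site is considered contaminated when PI exceeds 1."""
    return pi > 1.0


# Measured value from the text; the 2.0 mg/L limit is an assumed example.
pi_nh3n = pollution_index(2.19, 2.0)
print(round(pi_nh3n, 3), is_polluted(pi_nh3n))
```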

Spatio-temporal variability in water quality
The PI_m of the water quality parameters showed that the water quality in the North Canal River improved significantly in the past 10 years (table 2). The major water pollutant shifted from NH3-N to TN. The annual average PI values of COD, NH3-N, TN, and TP in 2010 were 1.77, 8.62, 11.25 and 5.47, respectively, with TN showing the highest PI, followed by NH3-N. In 2018, the overall annual average PI of each parameter decreased significantly. The PI of TN remained the highest, while the PI of NH3-N decreased from 8.62 in 2010 to basically attain the standard for the entire basin (except for A1, table S3), and the PI values of COD and TP decreased to below 1.5. Figure 2 shows significant spatial differences in water quality among the sub-basins. In both 2010 and 2018, water quality was best in sub-basins A2 and A3, followed by A6, and worst in A1. The water quality of each sub-basin improved in general, but the degree of improvement varied. NH3-N concentration in A1 decreased most dramatically, from 19.08 mg l⁻¹ in 2010 to 2.19 mg l⁻¹ in 2018. A2 and A3 followed, with an average decrease in NH3-N concentration of 14.94 mg l⁻¹ over the last 10 years, and the decrease then weakened along the spatial gradient. In particular, the TN:TP ratio in A2 increased dramatically in the last decade, from 24.29 to 69.40, making A2 the most variable sub-basin in the North Canal basin, followed by A3.

Spatio-temporal variability in socio-economic system and land use
With rapid socio-economic development and urbanization, the urban population has grown rapidly, and the overall economic strength of the whole basin has maintained rapid growth (figure 3, table 1). From 2010 to 2018, the industrial structure of the North Canal River basin changed significantly. In 2010, sub-basins A1, A2, A3 and A4 were dominated by PTI, followed by PSI; A5 and A6 were dominated by PSI, followed by PTI. In 2018, the entire basin was dominated by PTI, with the highest share of PTI in A2 at 90.94%. From 2010 to 2018, GDP, GDPPC, and PD increased in all sub-basins, with A2 showing the largest increase at 97.24 billion RMB, followed by A3 and A1 at 84.21 billion RMB and 56.85 billion RMB, respectively. The GDPPC of A1 increased slowly because of its PD, while the GDPPC of A2 and A3 was the largest. FAA increased from 2010 to 2018 in A1 to A4, while … The main LULC types in the North Canal River basin were cropland, built-up land and forestland (figure 4 and table 3), and the sum of these three types accounted for more than 90% of the total area. The area of cropland decreased each year, from 40.15% in 2010 to 33.02% in 2018. The areas of forestland, grassland and water bodies increased, with the largest increase in forestland at 2.17%, followed by grassland at 1.49%. Built-up land continued to expand, from 42.12% in 2010 to 45.19% in 2018.
The transfer of LULC types occurred mainly among cropland, forestland, water bodies and built-up land. From 2010 to 2018, the cropland area in each sub-basin decreased, but the types of land transfer were slightly varied. Cropland in A1 was mainly transferred to forestland, accounting for 41.29%; A2, A4, A5, and A6 areas were mainly transferred to built-up land, with the largest proportion of transfer in A6 area, accounting for 70.73%; and A3 area was mainly transferred to forestland, accounting for 44.43%. Unlike other sub-basins, only the area of water bodies in A2 decreased in the past decade.

Correlations among SES, LUCC and water quality
The correlation coefficients of the SES, LULC types and water quality indicators are shown in figure 5. Compared with LULC types, SES factors had more significant impacts on water quality variations. GDP, the proportion of primary industry (PPI), population density (PD) and sewage treatment rate (STR) were the main factors. COD was significantly negatively correlated with GDP, PTI, GDPPC and STR, and significantly positively correlated with PPI (p < 0.05). NH3-N, TN and TP concentrations were all significantly negatively correlated with STR (p < 0.05). The TN:TP ratio was significantly and positively correlated with GDP, GDPPC and STR (p < 0.05), which is probably related to wastewater treatment plant processes.

PCR analysis of the primary factors affecting water quality
As shown in figure 6, and consistent with the results of the Pearson correlation, socio-economic factors have a significant impact on changes in water quality, especially the industrial structure. The coefficient of COD with PSI in 2010 is 13.45, indicating that for each unit increase in PSI, the COD concentration will increase by 13.45 units; the coefficients of PPI, PD and UR are −5.42, −12.24, and −8.18, which means that for each unit increase in PPI, PD and UR, the COD concentration will decrease by 5.42, 12.24, and 8.18 units, respectively. Compared with 2010, the direction of the COD coefficients in 2018 is unchanged for each indicator, but all of them weakened to different degrees. For example, the coefficient of COD with PSI decreased to 6.29, and the coefficients of PPI, PD and UR increased to −0.22, −5.99 and −4.75, respectively.
The coefficients of NH3-N with each indicator differ slightly in trend from those of COD. The coefficient of NH3-N with PPI in 2010 was 6.78, which means that for each unit increase in PPI, the concentration of NH3-N will increase by 6.78 units; the coefficient of NH3-N with PD is 0.07. In 2018, the coefficients of NH3-N with all indicators ranged from −1 to 1, which indicates that the degree of NH3-N pollution was greatly reduced in 2018. This is consistent with the change in water quality (table 2 and figure 2). Compared with 2010, the relationship between TN concentration and some indicators changed markedly in 2018: the coefficient of PSI changed from −0.70 (2010) to −7.69 (2018), the coefficients of GDP, PPI, PTI and GDPPC decreased slightly, those of FAA and FGR increased slightly, and the coefficients of land use type indicators with TN ranged from −1 to 1.5, indicating that socio-economic … The coefficients of PD and UR increased significantly from 0.94 and 1.28 in 2010 to 6.72 and 7.70, respectively, implying that population-related socio-economic factors have a greater influence on TN. Compared to TN, the coefficients of TP with PPI and UR changed more, from −5.69 and −3.66 in 2010 to −1.31 and −7.90 in 2018, while the coefficients of TP with other indicators did not vary noticeably. In conclusion, land use type had a greater impact on COD in 2010 and a smaller impact on river water quality in 2018. The industrial structure, population and urbanization rate in the socio-economic system are the main factors influencing river water quality.

Discussion
Both socio-economic development and land use significantly affect river water quality (Chen et al 2016, Luo et al 2019, Feng et al 2021). In this study, the absolute impact coefficients of PPI, PSI, PD and UR with COD, NH3-N and TP were higher than those of other factors, indicating that SES factors have a more significant impact on water quality in the North Canal than LULC types. Industrial restructuring (including PPI, PSI and PTI), population density (PD), upgraded sewage treatment requirements (including STR and discharge standards of water pollutants for municipal WWTPs), and FGR may fully or partially offset the negative impacts of urbanization. Since the success of Beijing's Olympic bid in 2001, with the North Canal River basin in Beijing serving as the main host area of the 2008 Olympic Games, Beijing has increased its efforts to improve river water quality by implementing a series of engineering projects and by making regulations and action plans for water pollution control, such as water pollution source control and reduction projects, upgraded discharge standards for municipal wastewater treatment and reclamation, river restoration and other management measures, in a multipronged manner. Meanwhile, the area has been developed (figure 3). In A2 especially, the main host area of the Olympic Games, COD, NH3-N and TP concentrations met the water function zone requirements (Class III of GB 3838-2002, table S1) in 2018.
Firstly, in order to solve the 'big city disease' in Beijing and expand the environmental capacity and ecological space, China proposed the national strategy of coordinated development of the Beijing-Tianjin-Hebei region to promote industrial upgrading and transfer (Dai 2003, Yang and Pu 2018). As a result of this industrial policy shift, industrial sectors in urban areas of Beijing were moved out. In this study, PPI and PSI of the North Canal River basin decreased from 3.22% and 33.18% in 2010 to 1.23% and 27.33% in 2018, respectively (figure 3). A research report on the North Canal River basin (BMEE 2008) and a 2-year survey of major water pollutants entering the river by our group (Yu et al 2012) found that discharges from domestic sources, agricultural sources, and centralized treatment facilities were the main pollution pathways in the basin in 2007, accounting for 31%, 35%, and 32% of the total COD emission, respectively. With a series of projects, such as sewage interception along the river, river restoration, and centralized sewage treatment plants, the main inflow pollution load of the North Canal River in Beijing decreased significantly in 2010 compared to 2007, with COD and NH3-N decreasing by 3.08 × 10⁴ t/a and 0.17 × 10⁴ t/a, respectively (BMEE 2008). The black odor phenomenon of the North Canal River in Beijing was completely eliminated in 2011, and the major water pollutant shifted from COD to NH3-N (Guo et al 2012, Yu et al 2012). In this study, the coefficients of NH3-N with each indicator in 2018 ranged from −1 to 1, indicating that the pollution level of NH3-N was greatly reduced in 2018 (figure 6). Table 2 shows that the NH3-N concentration of the North Canal River in Beijing met the water function zoning target in 2018, and that NH3-N pollution in the Beijing basin then shifted to TN pollution.
Secondly, the spatial patterns of industrialization level and population density lead to differences in industrial wastewater and combined domestic wastewater discharges, which have different impacts on surface water (Chadwick et al 2006, Luo et al 2019). In the North Canal River basin, from 2010 to 2018, the proportions of primary and secondary industries decreased by about 1.99% and 5.86%, while the proportion of tertiary industries increased by 7.89% and population density increased by 10.81%, so that integrated domestic wastewater became the chief component of wastewater. In addition, the economy of the North Canal River basin has grown rapidly, with a 127% increase in GDP and an increase in urbanization rate from 67.63% to 73.40% from 2010 to 2018. Environmental investment also increased alongside the growth in GDP. Accordingly, wastewater treatment capacity was increased by building more wastewater treatment plants and improving the technology. There were 7 WWTPs operating in the Beijing basin with a treatment capacity of 809,000 m³ d⁻¹ in 2007, whereas 19 WWTPs were in operation with a treatment capacity of 2,050,800 m³ d⁻¹ in 2018, 2.53 times that in 2007 (Zhu et al 2021). The increased number of WWTPs and the upgraded wastewater treatment processes in the basin effectively reduced the input of pollutants into the river and contributed to the great improvement of river water quality, resulting in the large decrease in NH3-N concentration (table 2). This is consistent with the results of Stets et al (2020), who analyzed the concentrations and trends of various constituents in streams and rivers in different regions of the United States during 1982-2012.
Finally, on the technical level, economic development allows local governments to invest more in developing wastewater treatment technology and in tightening the effluent standards of wastewater treatment plants; accordingly, a series of laws and regulations have been formulated and implemented in Beijing and Tianjin to further improve water quality (table S2). For example, since 2007 Beijing has further divided poor V surface water bodies into four classes: poor V1, poor V2, poor V3, and poor V4. In addition, the Beijing local standard DB11/890-2012, a stricter discharge standard for WWTPs, has been in force since December 31st, 2015 (table S2). According to this standard, the discharge limits of COD, NH4+–N and TP for WWTPs are equal to those of Class IV of GB 3838-2002. In the North Canal River basin, the Wenyu River, Qing River and Ba River are the main urban drainage rivers of Beijing, and discharge from centralized sewage treatment facilities is their main source of recharge (Yu et al 2012). The implementation of the upgraded standard has therefore greatly improved the water environment quality of receiving water bodies in the basin and achieved the goal of Beijing's second 'three-year action plan' in 2018. Areas with better socio-economic development (A2 and A3) have C:N:P ratios in river water similar to those of WWTP effluent (tables 2 and S2). Improvement of the socio-economic level thus contributes to the effective implementation of environmental policies. For example, the US Clean Water Act (CWA) of 1972 focused on upgrading wastewater treatment plants to decrease point discharges of nutrients and organics, and the reauthorization of the CWA in 1987 added important provisions on stormwater management (Patrick et al 1992). The implementation of these provisions directly led to a significant downward trend in nutrient concentrations in urban areas throughout the United States (Carey and Migliaccio 2009).
In this study, the positive correlations of built-up land with COD and TP in the North Canal River basin (figures 5 and 6) are consistent with several previous studies examining pollutant inputs to urban areas (Liu et al 2012, Chen et al 2016, Fu et al 2020). The higher percentage of impervious surfaces in urban areas prevents stormwater from infiltrating into the soil, so that pollutants in soluble and particulate form are transported to nearby streams via surface runoff. For TN, Liu et al (2018) showed that industrial land and built-up land were positively correlated with riverine nitrogen in the North Canal River basin of Beijing, which is contrary to the results of this study. Here, once socio-economic factors were considered, built-up land was negatively correlated with TN, with an average coefficient of −0.64 over 2010 and 2018, indicating that socio-economic development may change the effects of land use types on water quality. Therefore, socio-economic characteristics cannot be ignored when investigating the factors affecting water quality in urban areas, especially in areas exposed to intense human disturbance. A comprehensive analysis of water quality impacts on watersheds that integrates land use types and socio-economic factors may better reflect the trend of water quality variations.
This study demonstrated that water quality parameters were significantly influenced by socio-economic factors and land use composition. Furthermore, the relationships between these metrics and water quality can be analyzed quantitatively. Metric analysis therefore offers a useful framework for indirectly indicating the association between socio-economic characteristics, landscape characteristics and water quality. The PCR results indicated that, in the megacity-region, socio-economic factors may offset or alter the impact of land use on water quality. The implementation of river restoration measures and policies, such as centralized treatment of domestic wastewater and upgraded discharge standards for wastewater treatment plants, significantly improved water quality. These findings can deepen our understanding of water pollution and provide a reference for the government in developing new and innovative planning and management. In this study, PCR was used to analyze the impact of socio-economic factors and land use types on river water quality. In the future, the impact of hydrological factors such as flow rate and water level, and of climate factors such as precipitation, on river water quality could also be considered, which would provide better guidance for the management of rivers with multiple dams.
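As a rough illustration of the PCR idea referred to above, the following Python sketch standardizes the predictors, keeps the leading principal components, regresses the standardized response on the component scores, and maps the coefficients back to the original factors. It is a minimal sketch under stated assumptions, not the implementation used in this study: the function name `pcr_fit`, the 85% cumulative-variance cutoff, and the synthetic data are all hypothetical.

```python
import numpy as np


def pcr_fit(X, y, var_threshold=0.85):
    """Principal component regression on standardized data.

    Returns (beta, k): coefficients for the standardized predictors and
    the number of principal components retained.
    The 0.85 cumulative-variance cutoff is an illustrative assumption.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Standardize predictors and response (z-scores).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)

    # SVD of the standardized predictors: rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    explained = s**2 / np.sum(s**2)

    # Keep the smallest number of components that reaches the variance cutoff.
    k = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    k = min(k, len(s))

    loadings = Vt[:k].T            # shape (n_factors, k)
    scores = Xs @ loadings         # mutually uncorrelated component scores
    gamma, *_ = np.linalg.lstsq(scores, ys, rcond=None)

    # Map component coefficients back to the standardized original factors.
    beta = loadings @ gamma
    return beta, k


# Synthetic, purely hypothetical data: two nearly collinear factors
# plus one independent factor.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
x3 = rng.normal(size=200)
X_demo = np.column_stack([x1, x2, x3])
y_demo = 2.0 * x1 + 2.0 * x2 - x3 + 0.1 * rng.normal(size=200)

beta_demo, k_demo = pcr_fit(X_demo, y_demo)
print(k_demo, np.round(beta_demo, 3))
```

Because the regression is run on uncorrelated component scores and the smallest-variance direction is dropped, the two nearly collinear factors receive similar, stable coefficients instead of the erratic values that ordinary least squares can produce. In a setting like this study, the columns of `X` would presumably hold the socio-economic and land use indicators and `y` a water quality parameter such as COD.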

Conclusion
In this paper, a PCR method that overcomes the problem of multicollinearity among factors was used to quantify the impact of socio-economic and landscape indicators on the water environment. The results indicated that the water quality in the North Canal River improved significantly from 2010 to 2018, and that the NH3–N concentration in the Beijing section met the water function zoning target in 2018. The best water quality was found in the urban areas (A2 and A3) and the worst in the sub-urban area (A1). Both socio-economic factors and land use factors have a significant impact on river water quality. However, in the North Canal River basin, a typical representative of a megacity-region, socio-economic variations play a more important role than land use types in determining the water quality of the whole basin. The absolute coefficients of industrial structure, population density and urbanization rate with COD, NH3–N and TP are higher than those of other factors, indicating that socio-economic factors may offset or alter the impact of land use on river water quality in areas disturbed by human activities.