Rising summer river water temperature across Canada: spatial patterns and hydroclimatic controls

Understanding the spatio-temporal variability of climate-induced river water temperature change is critical for identifying hotspots and assessing the impacts on ecological and socioeconomic systems. Here, we employ the air2stream model reconstructed river temperature records for 106 stations in Canada (Nash Sutcliffe coefficient goodness-of-fit: minimum = 0.79; median = 0.93; maximum = 0.97) to analyze summer temperature changes over the years 1980–2018. Results reveal widespread river temperature increases from June to September, with significantly increasing trends for about 40%–60% of stations. Additionally, we find significantly rising 7-day maximum temperature and increasing occurrences over the critical 18 and 20 °C thresholds for about 30%–65% of stations. Furthermore, by employing the Ward’s agglomerative hierarchical clustering machine learning (ML) method, we identify eight regions of spatially coherent variability and change. We find that the south-east, coast and northern prairies are the regions of high vulnerability because of the likely impacts of rising summer water temperatures on cold-water aquatic species. Additionally, by using the random forests ML method, we demonstrate that mean air temperature and its trends are the primary drivers of mean water temperature and trends, respectively. Thus, with the projected enhanced air temperature increase across Canada, an amplified future summer river warming can be expected, which could have severe consequences, particularly in already thermally-stressed river systems.


Introduction
River water temperature is a key hydro-ecological variable that exerts critical controls on biogeochemical processes and affects aquatic health and ecosystem services (Caissie 2006, Ramulifho et al 2018, Ouellet et al 2020).Specifically, water temperature regulates dissolved oxygen, influences carbon and nutrient cycles, and consequently affects river aquatic habitat and human usage (Ducharne 2008, Ormerod 2009, Duan and Kaushal 2013, Ficklin et al 2014).River temperature is also a barometer of the impacts of warming climate and other human activities on water dependent ecosystems and societal health (Ficklin et al 2023).In this respect, an important indicator of the climate-induced change is the summer river temperature increase, associated with the compounding influence of warmer air temperatures, lower streamflow and diminished cooling from snowmelt (Mantua et al 2010, Yan et al 2021).Such rising river temperatures could have substantial implications, for example, increased heat fluxes from northern river systems to the Arctic Ocean (e.g.Yang et al 2014, 2021) likely affect ocean stratification, circulation and ice cover (Nummelin et al 2016, Park et al 2020).Furthermore, increasing exposure of coldwater fish species, such as salmon and trout, to elevated summer temperatures have been associated with shifts in their migration phenology, increased reliance on thermal refuges, as well as high mortality (Eliason et al 2011, Williams et al 2015, Keefer et al 2018).In view of such challenges, Ficklin et al (2023) argued the need for an integrated climate-landhydrology-human framework for a more holistic understanding of river temperature dynamics and developing sustainable management and adaptation strategies.
The space-time variability and change in river temperature regime are affected by the heterogeneity of climate and hydrologic forcings, and modulation by basin properties and anthropogenic impacts (e.g.land use change and riverine development) (Hannah andGarner 2015, Kelleher et al 2021).Across Canada, given the highly variable spatial patterns of climate warming (e.g. higher air temperature increases in the north; Zhang et al 2019) and magnitude and direction of streamflow change (e.g.Bonsal et al 2019), river thermal response to a warming climate can be expected to be highly non-uniform.However, previous studies on river temperature variability and change over Canada have been limited to basin-scale or regional assessments (e.g.Daigle et al 2019, Islam et al 2019, Boyer et al 2021, Shrestha and Pesklevits 2023a).
Furthermore, critical questions remain on the spatial coherence and underlying controls of the river thermal regime.Understanding the coherence with, and controls of hydroclimatic and landscape drivers could disentangle the spatial and temporal variability and change, identify regions of critical change and provide a framework for analyzing second/third order impacts (e.g.ecology, human usage).Additionally, an analysis of the aggregate variability and changes across multiple river systems could provide a meaningful categorization into similar regimes, with potential transferability beyond studied sites (Dethier et al 2020).The insights could also provide a perspective on the likely direction of future river temperature change in a warming climate.Machine learning (ML) methods, such as cluster analysis and variable importance analysis, provide powerful and flexible tools for evaluating the spatial coherence and control versus response relationships, respectively.The ML methods, which are increasing being utilized to resolve complex and multifaceted problems in climate and earth system science (e.g.Similar to other water quality variables, the sparse and often irregular water temperature observations make the statistical analyses (e.g.trends and variability analyses) problematic (Hirsch 2014, Shrestha et al 2019b).Therefore, the lack of continuous water temperature observations is a major constraint for better understanding the variability and change in river thermal regimes across Canada.In this respect, the model-based reconstruction of water temperature provides a potential avenue for supplementing a limited observation-based dataset, and a means for better understanding of the complex interactions of different drivers affecting thermal dynamics, the cumulative impacts of human activities, and the cascading effect of changing temperature on ecological processes (Ouellet et al 2020, Shrestha and Pesklevits 2023a).For this purpose, the hybrid semi-empirical air2stream model (Toffolon and Piccolroaz 2015), which is derived from process-based equations, provides a balance between model complexity and accuracy (Zhu et al 2022).Air2stream has been successfully applied for reconstructing river temperature records over large regions, including Poland (Zhu et al 2022) and western Canada (Shrestha and Pesklevits 2023b).
This study advances our understanding of the spatio-temporal variability of, and underlying controls on river water temperature across Canada in the context of warming climate.We focus on key research questions: (i) whether the variability and change in river temperature patterns are spatially coherent over multidecadal time scales; and (ii) whether the variability and change can be explained by underlying controls.To that end, we reconstruct daily river water temperature records for 106 stations across Canada by using the air2stream model, and analyze monthly, seasonal and maximum water temperature trends over ice-free summer months (June-September).We employ a ML method, Ward's agglomerative hierarchical clustering (Murtagh and Legendre 2014), to resolve the spatial patterns of river temperature along with hydroclimatic and geospatial drivers.Further, we evaluate the relative controls of hydroclimatic and geospatial drivers on river temperature using the random forests (RFs) ML method (Breiman 2001).Finally, based on these analyses, we identify regional hotspots of critical river temperature change and provide a perspective on future change in the context of warming climate.S1.

Study region and observation data
Building on our previous study of river water temperature data reconstruction for 55 stations in western Canada (Shrestha and Pesklevits 2023b), we expanded the dataset to 106 stations covering all of Canada (figure 1, supplementary table S1).We selected the stations with a minimum of 50 water temperature observations of >0 • C over the years 1980-2018 to provide a balance between model performance and wider station coverage across Canada.Additional criteria included drainage basin area of >1000 km 2 , streamflow stations with continuous observations, and unregulated or relatively minor level of flow regulation.
Data required for the air2stream model setup and calibration/validation includes water temperature, air temperature and discharge (described in section 2.2).We compiled observed water temperature records from several online sources summarized in table S2.Additionally, analog records from the Canadian riverice database (de Rham et al 2020) were manually digitized.The datasets for some stations consist of continuous daily/sub-daily records over recent years, while most datasets are discontinuous (spot sampling typically 5-10 times in a year).These datasets were first checked for consistency before combining into a single database.
We extracted daily mean air temperature at the water temperature stations from the 10 km resolution gridded Canadian Surface Reanalysis, version 2.1 reanalysis (Gasset et al 2021), except the high resolution (500 m) NRCanMet_500 data (Associated Engineering Ltd 2019) for the Similkameen River stations.Daily discharge records nearest to the water temperature stations were obtained from the Water Survey of Canada hydrometric stations network (https://wateroffice.ec.gc.ca/), except the USGS streamflow data for the Yukon River at Eagle AK station (https://waterdata.usgs.gov/nwis/).For the stations in the Liard and Similkameen rivers, missing discharge in the observation time series were infilled with the hydrologic model simulated values from Shrestha et al (2019a) and Schnorbus (2022), respectively.For all other stations, missing discharge were infilled with averages before and after missing periods, and simulated water temperature values over the infilled periods were discarded from analyses.
To evaluate the relationship between snowpack and river temperature change, we obtained historical gridded snow water equivalent (SWE) data over Canada (https://climate-scenarios.canada.ca/?page=blended-snow-data), constructed using a multi-dataset approach as described in Mudryk et al (2015).The dataset consists of maximum annual SWE values (SWEmax) at 0.25 × 0.25 • grid for the years 1981-2016.For our analyses, we extracted average SWEmax values over the drainage basins that contribute to the river temperature stations.

Water temperature data reconstruction
We employed the semi-empirical air2stream model (Toffolon and Piccolroaz 2015) for reconstructing daily water temperature records over the period 1980-2018.Air2stream is derived from process-based equations and includes net heat exchange with the atmosphere and streamflow fluxes.The model simulates river water temperature by using air temperature as the only meteorological forcing and considering surface and sub-surface flow contributions in terms of lumped discharge, as given by Toffolon and Piccolroaz ( 2015) and Piccolroaz et al (2016): where, T w and T a are water and air temperature ( • C), respectively; a 1 to a 8 are the empirical model parameters; t is the time step (days) and t y is duration of one year; Q and Q are daily discharge and mean discharge over entire time period, respectively (both m 3 s −1 ); ϑ = Q Q is the dimensionless discharge; δ = ϑ a4 represents the dimensionless term expressed as the power law rating curve.The 8-parameter model can be simplified to 3, 4, 5, and 7-parameter models by considering different terms of equation ( 1) negligible.Readers are referred to the original study (Toffolon and Piccolroaz 2015) for a detailed description of air2stream and derivations of simplified forms of equation (1).
We used daily T a and Q datasets (described in section 2.1) as inputs to air2stream and calibrated the model by comparing simulated T w with observations for available days.We partitioned the observed dataset into years 2000-2018 and 1980-1999 and used the period with larger number of T w observations for calibration and smaller number for validation.We used the particle swarm optimization algorithm as the calibration method with 2000 model iterations and 1000 population particles, following Toffolon and Piccolroaz (2015).We repeated the calibration/validation for all five forms of air2stream equations (with 3, 4, 5, 7 and 8 parameters) and selected the best performing model (equation and parameter set) based on the Nash-Sutcliffe coefficient of efficiency (NSE) criterion.Additional metrics considered for the evaluation of model performance included: i) mean absolute error (MAE); ii) ratio of root mean square error to standard deviation of observation (RSR) and iii) Kling-Gupta coefficient (KGE) (Gupta et al 2009).The selected model for each station was used for reconstructing T w values over the years 1980-2018, which was subsequently used for all analyses outlined below.

Analyses
We analyzed 1980-2018 trends in T w , T a and Q over the ice-free period from June to September (JJAS) together with 1981-2016 trends in SWEmax.Additionally, we analyzed trends in 7-day maximum T w and occurrences over the 18 and 20 • C thresholds according to the tolerance limits of cold-water species (e.g.Eliason et al 2011, Keefer et al 2018).Trends were calculated using the non-parametric Thiel-Sen method, with the significance determined using the Mann-Kendall test and the effects of serial correlation removed using the iterative pre-whitening method (Zhang and Zwiers 2004), as implemented in the R 'zyp' package (Bronaugh and Werner 2019).We considered two significance levels of p ⩽ 0.05 and p ⩽ 0.10 for determining the statistically significant trends.
We evaluated spatial patterns of T w , T a , Q, SWEmax along with geospatial variables (latitude, longitude and elevation) by using several linkage criteria for the agglomerative hierarchical clustering ML method.Hierarchical clustering with the Ward's linkage criterion (Murtagh and Legendre 2014)presented here-uses minimum variance to successively merge the most similar clusters until the optimal number of clusters are formed, as implemented in the R 'Cluster' package (Maechler et al 2022).Twentynine variables were included in our analysis that consist of 26 hydro-climatic variables: monthly means and trends over June, July, August and September for T w , T a and normalized Q, and annual SWEmax and trend; and three geospatial variables: latitude, longitude and elevation.Station Q values were normalized by contributing drainage areas to obtain comparable values across all stations.
Next, we developed the RFs ensemble ML model (Breiman 2001) to evaluate the hydroclimatic and geospatial controls on mean June-September T w and trends.RFs are combination tree predictors, constructed with each tree using a random vector from bootstrap samples with the same distribution for all trees in the forest (Breiman 2001).We employed the R 'randomforest' package (Liaw and Wiener 2022) to train two independent RF models: (i) mean June-September (JJAS) T w as the response and mean JJAS T a , normalized mean JJAS Q, mean SWEmax, latitude, longitude and elevation as the explanatory variables; (ii) JJAS T w trend as the response and means and trends of JJAS T a , Q and SWEmax, means of JJAS T w and latitude, longitude and elevation as the explanatory variables.We evaluated the relative importance of driving variables on responses by using the variable importance function within RF, which is based on the change in prediction errors by randomly permuting an individual variable while leaving other variables unchanged (Liaw and Wiener 2002).
Since the variable importance analyses considered aggregate means and trends of T a , T w and Q across 106 stations, inherent relationships amongst variables arising out of air2stream model input-output can be assumed to be less influential.

Air2stream model evaluation
Before performing analyses using the air2stream model reconstructed T w data, we evaluated model performances by comparing simulated T w with available observed T w .The results for the analysis years 1980-2018, which include both calibration and validation periods, indicate a good ability of the selected air2stream model to replicate observed T w (see figure S1, sample scatter plots for eight stations), with reasonable goodness-of-fit metrics: NSE (minimum = 0.79, median = 0.93, maximum = 0.97); KGE (0.81, 0.95, 0.99); MAE (0.51, 1.06, 1.97 • C) and RSR (0.12, 0.26, 0.46).A summary of statistical performances along with the selected model in terms of number of parameters are provided in table S3.The results are likely affected by several sources of uncertainties, including the sparsity of observations especially at high T w values and the use of gridded dataset to represent station-specific T a values.Readers are referred to our previous study (Shrestha and Pesklevits 2023a) for more details on calibration/validation results and related uncertainties.

Water temperature trends
Trend analysis of reconstructed T w revealed consistent warming patterns over the years 1980-2018 (figure 2, table S4).Specifically, statistically significant increasing trends were obtained for 54, 46, 42 and 62 stations out of 106 stations for June, July, August and September, with median trends of 0.18, 0.20, 0.19 and 0.33 • /decade, respectively.The trends depict some spatial patterns, e.g.significant increases for most north-western stations in June, south-western and central stations in July, western stations in August, and central and eastern stations in September.Notably, 7-day maximum T w are increasing significantly for 32 out of 106 stations, with most significant increases in south-western stations (figure S2(a)).Additionally, the number of days over 18 • C are increasing significantly for most (40/60) stations in eastern and central Canada (figure S2(b)).Most stations (20/39) also showed increasing occurrences of over 20 • C, which are located primarily in eastern Canada (figure S2(c)).
Next, we evaluated T w trends in relation to driving variables T a , Q and SWEmax.In general, higher T a can be expected to lead to higher T w by virtue of the heat exchange with warmer atmosphere.Lower Q and snowpack could both lead to increased T w , due to greater heat exchange with the reduced mass of water (Booker and Whitehead 2021) and reduced cooling effect of the snowmelt driven runoff (Yan et al 2021), respectively.Our analysis also revealed increasing T a trends for most stations (table S5), with significantly increasing T w and T a stations generally coinciding (figures S3(a) and (b)).JJAS Q trends are mostly insignificant, with only spatially coherent pattern of significant increasing trends for southern prairies stations (figure S3(c), table S6).Furthermore, due to the limited number of stations with significant Q trends, no consistent pattern between increasing/decreasing Q and T w trends could be discerned.Likewise, SWEmax trends are generally small and insignificant for most stations, and no pattern between SWEmax and T w change could be discerned (figures S2(a), (d) and table S6).However, statistical significance only relates to robustness of trends and does not explain underlying nonlinear relationships and interactions among variables.Hence, we analyzed spatial patterns of, and underlying controls on T w means and trends in relation to driving variables by using the cluster and variable importance analyses.

Spatial patterns
Agglomerative hierarchical cluster analysis using the Ward's criteria was considered appropriate among several linkage criteria because it provided a balance between outlier detection and sensitivity for our dataset (see figures S4-S6).The method provided a categorization of the sites with similar variability and trends in T w , T a , Q and SWEmax (figures 3 and S3).The T w and T a trends are increasing across all regions and less spatially distinguishable when considered in isolation.However, we identify distinguishing patterns by analyzing multiple variables, particularly T w , T a , Q, SWEmax and their trends together (figure 4).Additionally, the inclusion of geospatial variables (latitude and longitude) resulted in spatially coherent categorization into physiographic regions that could be roughly described as: (1) south-east; (2) coast; (3) north; (4) eastern prairies; (5) northern prairies; (6) southern cordillera; (7) northern cordillera; and (8) north-western coast (figure 3).
The resulting eight clusters showed similarities in some of the variables, and differences among other variables (figure 4).For example, clusters (1) southeast and (4) eastern prairies have similar distribution of T w and T a trends but substantially different Q trends.The clusters (6) southern cordillera and (7) northern cordillera have similar variability and trends in T w and T a , but substantially different Q ranges and means.Furthermore, the transition from cluster ( 6) to ( 7) arises out of the differences in Q, likely due to differences in precipitation in windward and lee side of the cordillera mountains.The two clusters in northern Canada: (3) north and (7) northern cordillera have similar ranges of July, August and September T w values, however, cluster (3) has higher   T w trends and cooler June T w values compared to cluster (7).Cluster (2) differs from other clusters (e.g.cluster 1) in terms of higher June Q as well as generally cooler T a and T w values.
Although several stations in the clusters are spatially non-coherent, they have similar hydroclimatic characteristics.Notably, two Atlantic coast stations (# 2 and 3; see figure 1) have been grouped with the Pacific coast stations into cluster 2 due to similarity in Q response, which is characterized by snowmelt driven freshet from colder watersheds that peak in June or July.In contrast, Atlantic coast station 1, draining relatively lower elevation areas, has been grouped with the south-eastern stations into cluster 1 due to lower June and July Q.Three northwestern British Columbia coast stations, that appear to be in a proximity (station # 68, 69, 70 in figure 1), have distinct hydroclimatic conditions and grouped into three different clusters.Specifically, station 68, which receives larger fractions of cooler flows from the glaciated tributaries of the Babine River (Pitman and Moore 2021), is grouped in (3) north cluster.Station 69, which drains relatively colder and less wet areas of interior Skeena basin (Wild et al 2023), is grouped in (6) southern cordillera cluster.Station 70, which in the singleton cluster (8) north-western coast, drains relatively warmer and wetter region of the Skeena basin, and is characterized by substantially higher T a and T w trends.The clusters included several outliers, e.g.high T w and T a values in cluster 2 (figure 4) representing warmer stations in the southern British Columbia.This indicates that the ward's clustering method is not overly sensitive to outliers.
Key insights are also gained by comparing the variability and trends in different variables together.Specifically, T a trends are generally higher than T w trends across most regions.Furthermore, the clusters (1) south-east and (4) eastern prairies are the regions with warmest T w values.Consequently, with the warming T a trends, the mean monthly T w in these two regions reach and exceed the critical thresholds of 20 • C. Continued T w increase in the region would likely result in range expansion, northward range shift, and range contraction for warm-water, coolwater and cold-water fish species, respectively, and exacerbate ecosystem balance of the river and lake systems (Van Zuiden et al 2016).Clusters (2) coast and (5) northern prairies are the regions where mean monthly T w values approach or, in some instances, exceed 20 • C, which is being exacerbated by warming T a trends.Increasing T w is especially concerning for these regions given that they provide important habitat for cold-water fish species (e.g.salmon and trout) that have tolerance limits of about 18 • C-22 • C (Eliason et al 2011, Keefer et al 2018).Hence, the cluster analysis identified the regions with critical T w and hotspots of change in a warming climate, and provided a basis for assessing the cascading second/third order impacts, e.g. on ecosystem services and human usage.

Controls on water temperature
The sample size of 106 stations is a limiting factor for analyzing differences in hydroclimatic and geospatial controls for each of the eight clusters separately.Hence, we combined all stations but developed two independent RF models to analyze controls on T w means and trends.The RF model with JJAS T w as the response provided NSE, KGE and MAE values of 0.75, 0.76, and 1.22 • C, respectively, indicating a reasonable ability of the explanatory variables to simulate JJAS T w .As can be expected from previous studies (Mayer 2012, Laizé et al 2017), JJAS T a exerts the highest control on JJAS T w , out of six variables considered (figure 5).Partial dependence plots provide further insights into marginal controls, for instance the strong control of JJAS T a on JJAS T w is depicted by the increasing JJAS T w with JJAS T a in a relatively steep curve.Station longitude and latitude also influence JJAS T w , with higher JJAS T w over eastern longitudes and southern latitudes.Given the north-south T a gradient over Canada (e.g.Yang et al 2021), the lower JJAS T w at higher latitudes is to be expected.Additionally, as most eastern longitude stations are in southern Canada, the effect of longitude on JJAS T w could also be explained in terms of northsouth T a gradient.Considering interactions among the top three variables, colder T w values are prevalent with colder T a and western longitudes, and colder T a and northern latitudes (figure S7).However, the relationships are non-uniform, and transitions occur between cooler (e.g.cluster 3) and warmer stations at about 12 • C T a .The transitions are likely related to a larger fractional contribution of cooler snowmelt driven runoff in cooler stations compared to warmer stations, resulting in different T a vs. T w characteristic response.Furthermore, the effect of latitude and longitude on T w are more pronounced in the region where most stations are located (west of −100 • longitude and north of 52 • latitude).Thus, more stations are needed to better assess their controls on T w outside of this region.Mean SWEmax has some influence on JJAS T w , as depicted by the partial dependence plot that shows lower T w values at higher SWEmax, which is likely related to the cooling effect of runoff from snowmelt.The influences of mean JJAS normalized Q and elevation on JJAS T w are limited, as depicted by flat partial dependence relationships.
The RF model for JJAS T w trends as the response variable by using 10 explanatory variables provided a moderate ability, with NSE, KGE and MAE values of 0.42, 0.46, and 0.08 • C/decade, respectively.Most of the skills arise from JJAS T a trends, as depicted by about 40% variable importance score (figure 6).Furthermore, the dominant control of JJAS T a trends on JJAS T w trends is evident from the higher T w trends for higher T a trends in the partial dependence plot.Mean SWEmax values and trends also have some influence on the JJAS T w trends, which are depicted by increasing T w trends with higher SWEmax and increasing T w trends with declining SWEmax trends.However, the relationships of mean SWEmax and trends with T w trends only become pronounced when SWEmax > 300 mm, and SWEmax trends >0 (figure S8).Smaller T w increase with increasing SWEmax trend is likely linked to higher fraction of cooler snowmelt driven runoff, offsetting the T a driven T w increase.The variable importance of <7% for other seven variables suggests that these factors have very limited influences on JJAS T w trends.Overall, T a and T a trends are the most influential hydro-climatic variables controlling the T w and T w trends, respectively.Geospatial variables, latitude and longitude have some influence on T w , with the northsouth T a gradient as the underlying factor.

Discussion and conclusions
Our study contributes to new knowledge on the space-time variability of climate-induced river temperature change across Canada, through consistent analysis of trends, spatial patterns and hydroclimatic controls.The analyses revealed widespread increases in river temperature over the ice-free summer months of June to September over the period 1980-2018.Notably, significant warming trends were detected for about 40%-60% of stations over the four months.Additionally, some of the stations showed significantly increasing 7-day maximum values, and increasing occurrences over 18 • C and 20 • C thresholds.
The cluster analysis of river temperature along with the explanatory variables address the challenge of identifying regionally consistent variability and change across Canada.These patterns are less distinguishable when considered in isolation.However, by aggregating the multiple stations and variables with a clustering approach, we provide regionally distinguishable patterns of the magnitude and direction of change.The approach is especially relevant because of the lack of continuous water temperature observations, and because the identification of regional patterns allows us to generalize the variability and change over a specific region.From the hydro-ecological perspective, we identify multiple regions as hotspots of change, which include south-east, southern prairies, coast and northern prairies, and which are vulnerable to the rising water temperature because of impacts on cold-water fish species.Our approach could be transferred to other regions, and/or employed for other hydrological and ecological variables through a careful consideration of the explanatory and response variables.
We expanded our understanding of river temperature variability and change by identifying critical hydroclimatic and geospatial controls.We find that mean summer air temperature and its trends are the most influential hydro-climatic variables controlling the summer mean water temperature and trends, respectively.Geospatial variables including latitude and longitude also have some effect on water temperature, with the north-south air temperature gradient as the underlying factor.The lack of stations in the northeast limited our ability to provide more complete analyses of the hydroclimatic and geospatial controls, along with trends and spatial coherence of river temperature across Canada.Furthermore, the reconstructed water temperature data are affected by associated uncertainties, e.g.parameterization and calibration of the air2stream model (Piotrowski andNapiorkowski 2018, Almeida andCoelho 2023).
The dominant control of air temperature on water temperature leads to a subsequent question about future changes in water temperature with the warming climate.Air temperature in Canada is warming at about twice the rate of global mean temperature increase, with higher increases in the northern regions (Zhang et al 2019).Thus, widespread summer river warming can be expected across Canada, which could have severe consequences, especially in already thermally-stressed river systems of the coastal and northwestern regions where cold-water fish species are being impacted (Eliason et al 2011, Williams et al 2015).Furthermore, in the southeastern region water temperature increases would extend to the Great Lakes, which could inhibit the mixing of lake waters, increase oxygen depletion, promote the growth of harmful algal blooms, and lead to decline in cold-water species (Lam and Dokoska 2022).Warming river temperatures also contribute to increased heat fluxes to the receiving seas and oceans, and increasing heat flux from northern rivers could exacerbate the sea-ice cover loss over the Arctic shelves (Park et al 2020).In this respect, the insights from this study along with the reconstructed river temperature data could facilitate such second/third order analyses.Furthermore, there is a need to develop an integrated modelling system for projecting future river temperature change, and assessing the impacts on physical and ecological systems across Canada.

Figure 1 .
Figure 1.Locations of 106 water temperature stations across Canada together with the mean observed June to September (JJAS) river temperature ( • C) over the period 1980-2018.The station numbers correspond to the station identifiers listed in supplementary tableS1.

Figure 2 .
Figure 2. Historical trends in monthly (a) June, (b) July, (c) August and (d) September Tw over the years 1980-2018.The results show significantly increasing trends at 5% and 10% levels; no significantly decreasing trends were detected.

Figure 4 .
Figure 4. Box and whisker plots of means and trends of Tw, Ta and Q based on the eight clusters (denoted as 1-8) identified by the Ward's agglomerative hierarchical method.Each box plot illustrates the median and interquartile range (IQR), and the whiskers extend to 1.5 * IQR.Outliers beyond the 1.5 * IQR are plotted individually.

Figure 5 .
Figure 5. Predictor variable importance (%) and partial dependence plots for six explanatory variables with Tw as the response variable.Vertical lines on x-axes of the partial dependence plots indicate the data distribution.

Figure 6 .
Figure 6.Predictor variable importance (%) of 10 explanatory variables with Tw trends as the response variable.Partial dependence plots are shown for the top six explanatory variables.Vertical lines on x-axes of the partial dependence plots indicate the data distribution.
Huntingford et al 2019, Reichstein et al 2019), are also being used for river water temperature prediction (e.g.Feigl et al 2021, Qiu et al 2021) and understanding the controls on river temperature dynamics (e.g.Wade et al 2023).Continuous water temperature data are important for analyzing trends, spatial coherence, and underlying controls on river water temperature.However, most historical observations in Canada are based on spot sampling taken a few times in a year (e.g.Moore 2013, Boyer et al 2016) and both spatial coverage and sampling frequency are very low for the northern rivers (Yang et al 2021).