Seasonal characteristics and spatio-temporal variations of the extreme precipitation-air temperature relationship across China

It is assumed that extreme precipitation (P) increases with air temperature (T) by a scaling rate close to 7%/°C without moisture limitation according to the Clausius-Clapeyron (C-C) relationship. However, the spatial distribution of the P-T relationship in China is subject to divergent conclusions including both sub-C-C (<7%/°C) and super-C-C (>7%/°C) scaling with reasons yet to be examined. Based on the long-term observations, here we show that P-T relationships with peak structure exist in most regions across China. The scaling rate in the wet season shows a decreasing spatial pattern from the southeast to the northwest, while sub-C-C scaling in the dry season dominates most regions across China. Mixing precipitation events from different seasons could lead to miscalculation of the P-T scaling rate. Furthermore, significant increases in peak precipitation at high percentiles have been observed in southern regions of China during the historical period, indicating that the peak structure does not imply a potential upper limit for precipitation extremes. Our results highlight the importance of considering seasonal characteristics in analyzing the extreme precipitation-temperature relationship in a changing climate.


Introduction
A significant increase in extreme precipitation events has been detected across the world, triggering a series of hydrological disasters (Yin et al 2018, Merz et al 2021, Zhang et al 2022, Zhao et al 2023). A common explanation for increases in precipitation extremes is the rising air temperature in the context of climate change. The Clausius-Clapeyron (C-C) scaling is commonly used to describe the extreme precipitation-air temperature relationship (P-T relationship) (Wang et al 2017). According to the C-C scaling, the saturated water vapor pressure increases by approximately 7%/°C, which could lead to the same rate of change in the intensity of extreme precipitation in the absence of other limiting factors (Allan et al 2014).
However, multiple variants of C-C scaling, i.e. sub-C-C (<7%/°C) and super-C-C (>7%/°C) scaling, have been reported worldwide, resulting in great spatial heterogeneity in the actual P-T relationship across different regions (Lenderink et al 2011, Shaw et al 2011, Allan et al 2014). The spatial pattern of the P-T relationship and its driving mechanism have recently been an active research topic (Ali et al 2018, Gao et al 2020, Zeder and Fischer 2020, Yin et al 2023). For the mid-latitude regions, a peak-like (or hook-like) structure of the P-T relationship is described as typical: precipitation intensity increases with temperature until it reaches a peak and then decreases steeply. Previous studies suggest that a limited moisture supply or a decrease in duration at high temperatures might cause the peak structure of the P-T relationship (Boessenkool et al 2017, Visser et al 2021, Sun and Wang 2022).
The aforementioned P-T relationship with a peak structure has also been found in most regions of China (Gao et al 2018, Huang et al 2019, Shi et al 2019). However, there is disagreement about the spatial pattern of the P-T scaling rate before reaching the peak. For instance, Chen et al (2022) find that the scaling rates exhibit sub-C-C, super-C-C, C-C-like, and negative C-C scaling in southeastern, southwestern, northeastern, and northwestern China, respectively. In contrast, Wang et al (2018) suggest that most regions in China exhibit C-C-like (∼7%/°C) scaling rates. Meanwhile, the season of precipitation can be an influencing factor for the C-C scaling rate (Sun et al 2013, Zhang et al 2017, Schroeer and Kirchengast 2018), but this factor has seldom been considered when examining the P-T relationship across China. Therefore, exploring the seasonal characteristics of the P-T relationship could help clarify the spatial distribution and reconcile the disagreement about the P-T scaling rate across China.
The differences in previous conclusions could also arise from methodology. The most widely adopted method is to calculate the scaling rate for individual stations or grids and then obtain the spatial distribution by interpolation (Gao et al 2020, Yin et al 2021, Chen et al 2022). However, the small sample size of individual station (or grid) data allows detailed settings (such as the width of bins) to largely impact the results (Chen et al 2022). Another common method is to lump the data of stations (grids) in the same climate zone or watershed, and then calculate the scaling rate for the lumped data (Wang et al 2017). The shortcoming of this method is that existing climate zoning is based on a priori empirical knowledge, which might not reflect the actual precipitation characteristics. This could cause stations with different precipitation patterns to interfere with each other (Gao et al 2020). Consequently, improving climate zoning and lumping data in the same zone to increase sample size could be the key to resolving the above disagreement.
The peak structure of the P-T relationship indicates that the extreme precipitation increase will peak at a certain temperature within a certain period, but this does not mean that extreme precipitation will decrease in the future after reaching the peak temperature. Many studies based on future climate model projections have suggested that the peak temperature (T_p) and peak precipitation (P_p) will continue to increase with climate warming in the future (Wang et al 2017, Yin et al 2021). Some studies have shown that the two increases generally conform to the C-C scaling rate according to climate model simulations (Wang et al 2017, Yin et al 2021), but it remains unknown whether such increases in peak points are supported by historical observations.
To address this knowledge gap, this study investigates the spatial distribution of the P-T relationship and its historical change in China during 1958-2017. The objectives of this study are to (1) identify the spatial pattern of the P-T relationship across China based on refined climate zoning; (2) estimate the P-T scaling rate for different regions considering the effects of seasonal characteristics; (3) analyze the temporal trends of P_p and T_p based on long-term observation data across mainland China.

Data
The daily precipitation and daily average temperature data from 1958 to 2017 were obtained from the National Meteorological Information Centre of the China Meteorological Administration. As the clustering method adopted in this study has strict requirements for quality control, we selected 498 stations from 2157 national meteorological stations, at which the proportion of missing values was less than 3% each year. Days with precipitation larger than 0.1 mm d⁻¹ were defined as precipitation events. All other types of weather events such as snow (information provided by the dataset) were excluded, as Chen et al (2022) reported that they could cause slight deviations in calculating the scaling of extreme precipitation. This study defined the intensity of precipitation extremes based on the percentile thresholds of all wet events, as suggested by the World Climate Research Programme (Schär et al 2016). Specifically, the 95th, 99th, and 99.9th percentiles were used in the study.

K-Means clustering
K-means clustering is an unsupervised learning algorithm, grouping similar data points into clusters by minimizing the mean distance between geometric points. To do so, it iteratively partitions the dataset into 'K' non-overlapping clusters ('K' is an input variable that determines the number of clusters), wherein each data point belongs to the cluster with the nearest mean cluster center (Hartigan and Wong 1979, Liu et al 2019, Govender and Sivakumar 2020). Choosing proper indices to calculate the distance is crucial for K-Means clustering.
In this study, we aimed to group stations with similar precipitation characteristics into the same cluster. Since precipitation exhibits an annual cycle, we used the gamma distribution to describe the precipitation of each month at each station, an idea borrowed from the calculation procedure of the Standardized Precipitation Index (Edwards 1997, Xu et al 2015). The gamma distribution of each month was described by two parameters (shape and scale). Thus, for each station, 24 parameters were used in total.
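As a rough illustration of how the 24 per-station parameters could be assembled, the sketch below fits a two-parameter gamma distribution to each calendar month using the method of moments. The method of moments is an assumption made for brevity and is not necessarily the estimator used in the study; the function names and the input layout are hypothetical.

```python
from statistics import fmean, pvariance


def fit_gamma_moments(values):
    """Method-of-moments estimate of the gamma (shape, scale) parameters.

    Assumption: moments are used here for simplicity; other estimators
    (e.g. maximum likelihood) would give somewhat different values.
    """
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    if mean <= 0 or var <= 0:
        raise ValueError("need positive, non-constant data")
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


def station_parameters(monthly_precip):
    """Build the 24-value parameter vector of one station.

    monthly_precip: dict mapping month (1-12) to a list of precipitation
    values for that month (hypothetical layout).
    Returns [shape_1, scale_1, ..., shape_12, scale_12].
    """
    params = []
    for month in range(1, 13):
        shape, scale = fit_gamma_moments(monthly_precip[month])
        params.extend([shape, scale])
    return params
```

Each station then contributes one vector of length 24, which is the input to the dimensionality reduction described next.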
The dimensionality of the parameters was so high that they could not be adopted for distance calculation directly (Wainwright 2019). Thus, exploratory factor analysis (EFA) with the minimum residual solution was introduced to reduce dimensionality (Revelle 2020). The main steps were as follows. First, we calculated the covariance matrix of the above 24 parameters and performed the Kaiser-Meyer-Olkin (KMO) test on the correlation coefficient matrix. Then, parallel analysis was used to determine the number of factors, provided that the value of the KMO test was higher than 0.8 (it was 0.83 for the correlation coefficient matrix of the 24 parameters). The results suggested that four factors were sufficient. Finally, EFA was used to reduce the original parameters to four main factors, the scores of which described the precipitation characteristics of each station. Specific standardized loadings are shown in table S1 in the supplement.
In addition, the longitude and latitude of stations were standardized (by subtracting the mean and dividing by the standard deviation) as indices recording spatial information (Becker et al 1988). The above six indices were used for distance calculation. The R package 'NbClust' provides 30 indices for evaluating the performance of different cluster numbers (Charrad et al 2014). We tried the numbers recommended by most indices and adjusted them according to the results, eventually settling on ten clusters.
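The clustering step can be sketched as follows, assuming each station is represented by a vector of six indices (four factor scores plus standardized longitude and latitude). This is a plain Lloyd-style K-means written for illustration; it is not the R routine used in the study, and the population standard deviation in `standardize` is an assumption.

```python
import random
from statistics import fmean, pstdev


def standardize(column):
    """Subtract the mean and divide by the standard deviation.

    Assumption: the population standard deviation is used.
    """
    mu, sd = fmean(column), pstdev(column)
    return [(v - mu) / sd for v in column]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd iteration; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [fmean(dim) for dim in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
    return labels, centers
```

In practice the result depends on the random initialization, so several restarts with different seeds are commonly compared.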
To ensure data consistency within each cluster, the median absolute deviation (MAD) was used to remove outliers. MAD can be calculated as (Pham-Gia and Hung 2001):

$$\mathrm{MAD} = \mathrm{median}\left(\left|X_i - \mathrm{median}(X)\right|\right)$$

where X represents the multi-year average annual air temperature of all stations within a certain cluster and X_i is the value at station i. Air temperature was used because the precipitation characteristics had already been fully considered in the process of cluster analysis. If the difference between the multi-year average annual air temperature of a certain station and the cluster median was greater than three times the MAD, the station was excluded from the cluster as an outlier. All the above steps were performed in the R programming language. The gamma distribution fitting and EFA were conducted using the R packages 'SPEI' and 'psych', respectively.
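A minimal sketch of the outlier screening is given below, assuming the unscaled form of MAD, i.e. the median of the absolute deviations from the median. The dictionary layout and function names are hypothetical.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median(|x - median(x)|), unscaled."""
    center = median(values)
    return median([abs(v - center) for v in values])


def remove_outliers(station_temps, threshold=3.0):
    """Keep stations within threshold * MAD of the cluster median.

    station_temps: dict mapping a station id to its multi-year average
    annual air temperature (hypothetical layout).
    """
    values = list(station_temps.values())
    center = median(values)
    spread = mad(values)
    return {
        sid: temp
        for sid, temp in station_temps.items()
        if abs(temp - center) <= threshold * spread
    }
```

For example, with temperatures of 9.5, 10.0, 10.5, 11.0 and 20.0 °C the median is 10.5 °C and the MAD is 0.5 °C, so only the 20.0 °C station lies more than 1.5 °C from the median and is dropped.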

Scaling rate
The scaling rates of extreme precipitation with temperature were quantified by a binning technique (Wang et al 2017). First, all precipitation events of the same cluster were 'binned' according to the corresponding temperature, with a bin size of 0.5 °C. When analyzing the scaling rate in all seasons or the wet season, we required at least one thousand data values in each bin and eliminated bins with fewer events. The threshold in the dry season was lowered to one hundred. The wet and dry seasons were defined as follows. Precipitation in China is concentrated in certain months of the year due to the monsoonal climate, and these months are defined as the wet (or rainy) season. Previous studies suggest that the wet season in China starts around May and ends in September, although this range may vary depending on the climate of each region (Li et al 2016, Zhang et al 2021). The wet and dry seasons of the different regions in this study were defined and adjusted based on this time range. The specific months of the wet and dry seasons for each region are given in table S2 in the supplement.
The binned precipitation data were then adopted to estimate the 95th, 99th, and 99.9th percentiles. Those exceeding the percentiles were averaged to define the daily extreme for each bin temperature. The method of 3-bin moving window averaging was used for smoothing the results (Wang et al 2017).
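The binning procedure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the percentile uses linear interpolation, values equal to the threshold are counted as exceedances, and the end points of the moving average use only the neighbours that exist. None of these details are specified in the text.

```python
import math


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0-100) of an ascending list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def binned_extremes(events, q=99.0, bin_width=0.5, min_count=1000):
    """Mean of the values at or above the q-th percentile in each bin.

    events: iterable of (temperature, precipitation) pairs for wet days.
    Returns a list of (bin_center, extreme_precipitation), sorted by bin.
    """
    bins = {}
    for temp, precip in events:
        bins.setdefault(math.floor(temp / bin_width), []).append(precip)
    result = []
    for index in sorted(bins):
        values = sorted(bins[index])
        if len(values) < min_count:
            continue  # too few events for a stable percentile
        threshold = percentile(values, q)
        # ">=" guarantees at least one exceedance (assumption, see lead-in).
        exceed = [v for v in values if v >= threshold]
        center = (index + 0.5) * bin_width
        result.append((center, sum(exceed) / len(exceed)))
    return result


def moving_average3(values):
    """Centered 3-bin moving average; ends use the available neighbours."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 2]
        out.append(sum(window) / len(window))
    return out
```

The `min_count` argument corresponds to the thresholds of one thousand (all seasons and wet season) and one hundred (dry season) stated above.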
Following the C-C equation, the scaling rate (α) was defined as the logarithmic change of extreme precipitation from P_a to P_b in response to a temperature increase from T_a to T_b (Hardwick Jones et al 2010, Gao et al 2020). We roughly divided scaling rates before reaching the peaks into three intervals relative to C-C scaling: C-C-like scaling (5%/°C to 9%/°C), super-C-C scaling (>9%/°C), and sub-C-C scaling (<5%/°C) (Utsumi et al 2011, Chen et al 2022).
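The exact expression for α is not reproduced in this text. As a hedged illustration, the sketch below assumes the commonly used exponential form P_b = P_a (1 + α)^(T_b − T_a), which gives α = exp(ln(P_b / P_a) / (T_b − T_a)) − 1; the form used in the study may differ. The classification thresholds are those stated above, and the function names are hypothetical.

```python
import math


def scaling_rate(p_a, p_b, t_a, t_b):
    """Fractional change of extreme precipitation per degree Celsius.

    Assumed form (not confirmed by the text):
    P_b = P_a * (1 + alpha) ** (T_b - T_a).
    """
    if p_a <= 0 or p_b <= 0:
        raise ValueError("precipitation intensities must be positive")
    if t_a == t_b:
        raise ValueError("temperatures must differ")
    return math.exp(math.log(p_b / p_a) / (t_b - t_a)) - 1.0


def classify(alpha_percent):
    """Map a scaling rate in %/°C to the three intervals used in the text."""
    if alpha_percent < 5.0:
        return "sub-C-C"
    if alpha_percent <= 9.0:
        return "C-C-like"
    return "super-C-C"
```

For instance, an increase from 10.0 to 10.7 mm d⁻¹ over a 1 °C rise gives α = 0.07, i.e. 7%/°C, which `classify(7.0)` labels as C-C-like.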

Statistics of peak structure characteristics
The characteristics of the peak structure that we focused on included the scaling rate, the threshold temperature (THR), and the peak point (P_p and T_p). In some regions, the P-T scaling rate would increase from C-C-like to super-C-C scaling when the temperature exceeded a threshold (i.e. THR, which is below T_p). We performed segmented linear regression with the R package 'segmented' to estimate the scaling rates and THR (Muggeo 2003). The regression results included the suggested location of the breakpoints (corresponding to THR), the slope of each segment, and the coefficient of determination (R²). P_p and T_p were detected by applying locally weighted regression (LOESS) (Gao et al 2020).
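To illustrate the idea of locating a threshold temperature, the sketch below searches for a single breakpoint by brute force: it tries every split of the temperature-sorted data, fits an independent least-squares line to each side, and keeps the split with the lowest total squared error. This is a simplified stand-in, not the iterative continuous-segment method of the 'segmented' package, and the function names are hypothetical. The inputs would typically be bin temperatures and the logarithm of the extreme precipitation for the bins below T_p.

```python
def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def find_breakpoint(xs, ys, min_points=3):
    """Single-breakpoint search over x-sorted data.

    Returns (breakpoint_x, slope_below, slope_above). The two lines are
    fitted independently, so they need not meet at the breakpoint.
    """
    best = None
    for i in range(min_points, len(xs) - min_points + 1):
        left = fit_line(xs[:i], ys[:i])
        right = fit_line(xs[i:], ys[i:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            # Place the breakpoint midway between the two adjacent bins.
            best = (total, (xs[i - 1] + xs[i]) / 2.0, left[0], right[0])
    if best is None:
        raise ValueError("not enough points for two segments")
    return best[1], best[2], best[3]
```

When the fit is made on ln(P), a segment slope s corresponds to a scaling rate of exp(s) − 1 per degree Celsius.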
A sliding time window was applied when analyzing the temporal change of peak points during the historical period. The width of the window was set to 20 years and the step size to five years (i.e. 1958-1977, 1963-1982, …, 1998-2017). A time series was considered to have a significant trend if the p-value of the Mann-Kendall (MK) test was less than 0.05. The MK test was performed with the R package 'trend'. The other statistics above were implemented with the built-in functions of the R language.
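The window construction and trend test can be sketched as follows. The Mann-Kendall implementation below uses the normal approximation with the usual continuity correction and no correction for ties; this is an assumption, and the R package used in the study may treat ties and small samples differently.

```python
import math


def sliding_windows(start=1958, end=2017, width=20, step=5):
    """Inclusive (first_year, last_year) windows covering start..end."""
    windows = []
    first = start
    while first + width - 1 <= end:
        windows.append((first, first + width - 1))
        first += step
    return windows


def mann_kendall(series):
    """Two-sided Mann-Kendall test; returns (S, z, p_value).

    Assumptions: normal approximation, continuity correction, no ties.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value
```

With the settings stated above, `sliding_windows()` yields nine windows, so each trend test operates on a series of nine peak values; the normal approximation is rough for such short series.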

Basic spatial pattern of the P-T relationship
Based on clustering analysis, all stations across China are divided into ten clusters (figure 1). Each cluster corresponds to a specific geographic zone with a similar climate that shapes common precipitation characteristics. The spatial pattern of the P-T relationship in China is then analyzed based on the ten clusters. Figure S1 in the supplement compares the P-T relationship for individual stations within each cluster. The results show that the cluster-based station zoning of this study could overcome the shortcomings related to climate zoning in previous studies. For instance, the P-T relationship of the Southwest river basin in the study of Wang et al (2018) has two peaks; in our study this basin is divided into two clusters, each with a single peak (figures S1(h) and (j)). Besides, the range of the shaded area in figure S1 reflects the fluctuations in the P-T relationship for individual stations, while the P-T relationship of the cluster (corresponding to the dashed line) eliminates most of the fluctuations, limiting the statistical uncertainty caused by the small sample size of individual stations.
Figure 2 shows the relationship between the 99.9th, 99th, and 90th percentiles of daily precipitation intensity and daily mean temperature for each cluster compared with the C-C scaling. Different percentiles of precipitation intensity exhibit a similar relationship with temperature, despite differences in the magnitude of precipitation intensity. The P-T relationship has a peak structure in all regions of mainland China. Similar curves with peak structures can be found in Australia, the Indian monsoon region, and other mid-latitude regions (Wang et al 2017). Wang et al (2018) suggest that the P-T relationship in China is governed by C-C-like scaling before reaching T_p, except for the Northwest and Eastern Tibetan Plateau (TP) regions, where the P-T relationship is governed by sub-C-C scaling. These findings are consistent with our study in half of all regions (figures 2(e), (f) and (h)-(j)). However, we find that a threshold temperature (THR) between C-C-like and super-C-C scaling can be identified in the Southern Coast, South, Mid-low Yangtze, and Huaihe regions (figures 2(a)-(d)). Besides, the scaling rate of the P-T relationship in the Sichuan region (figure 2(g)) is governed by super-C-C scaling instead of C-C-like scaling.

Spatial distribution of the P-T relationship at different seasons
Considering the significant differences in precipitation types and influencing factors in different seasons, we isolate seasonal precipitation to analyze the causes of these super-C-C scaling rates. Precipitation intensity of the 99th percentile is taken for further investigation (figure 3). The results verify that the variation of the P-T scaling rate (when above the THR) is related to the discrepancy between seasons, as the scaling rates in the wet and dry seasons are close to super-C-C scaling and C-C-like scaling, respectively. Previous studies have also reported a similar steep increase in the P-T scaling above THR in Europe (Lenderink and van Meijgaard 2008, Berg et al 2013, Formayer and Fritz 2017). Blenkinsop et al (2015) suggest that the increase of the P-T scaling rate above THR depends on the seasonal processes that drive extreme rainfall. Besides, the super-C-C scaling in the Sichuan region is caused by mixing events from different seasons. The scaling rates of Sichuan in the wet season and in the dry season each exhibit C-C-like scaling, but the precipitation intensity is higher in the wet season than in the dry season for the same temperature.
Based on the above findings, this study reassesses the regional characteristics of the P-T relationship in mainland China by season. In the wet season, the P-T scaling rate decreases in space from southeast to northwest (figure 4(a)). The Southern Coast, South, Mid-low Yangtze, and Huaihe regions exhibit super-C-C scaling. The Northeast, North, Sichuan, and Southwest regions exhibit C-C-like scaling. The Eastern TP and Northwest regions exhibit sub-C-C scaling. In the dry season, on the other hand, mainland China is dominated by sub-C-C scaling. Only the Sichuan and Southwest regions exhibit C-C-like scaling.
The spatial distribution of the P-T relationship in different seasons is related to the monsoon climate of China. A set of possible explanations is as follows. In the wet season, monsoons from the sea provide a sufficient supply of water vapor. Extreme precipitation in different regions is mainly limited by thermodynamic factors, so its increase is highly sensitive to temperature. In the dry season, the monsoon from inland cannot provide sufficient moisture supply, and thus most of the regions exhibit sub-C-C scaling, except for the Sichuan and Southwest regions. The C-C scaling of the Sichuan and Southwest regions may be caused by topographic and dynamic differences at local scales (Zhang et al 2017). Besides, the Northwest region exhibits sub-C-C scaling in both seasons; in its arid climate, the dominant factor is water availability rather than atmospheric moisture-holding capacity (Chen et al 2022).
It should be noted that the spatial distribution of the P-T relationship in our study differs from that in previous studies (e.g. Chen et al 2022). The differences are mainly concentrated in the estimation of P-T scaling rates for regions in Southeast and Southwest China. For regions in Southeast China, the P-T scaling rates are described by previous studies as sub-C-C scaling (Chen et al 2022) and C-C-like scaling, while our results describe them as super-C-C scaling in the wet season and sub-C-C scaling in the dry season. For regions in Southwest China, the P-T scaling rates are described by previous studies as super-C-C scaling (Chen et al 2022) and sub-C-C scaling, while our results describe them as C-C-like scaling in both seasons. The aforementioned differences could arise from the size of the sample, the precipitation zoning, and the seasonal characteristics.

The historical changes in the P-T relationship
Here we further examine whether the historical observational data in China support the argument that extreme precipitation intensity increases with rising temperature. Considering the seasonal characteristics of the P-T relationship, the temporal changes of P_p and T_p in the historical period are derived from the 90th, 99th and 99.9th percentiles of precipitation intensities in the wet season for the different clusters of mainland China. Figure 5 shows that P_p in the Southern Coast, South and Mid-low Yangtze regions has increased significantly in the past 60 years, and the 99.9th percentile of precipitation intensities presents the strongest upward trends. In contrast, P_p in the North region experienced a significant decrease. There are no significant temporal trends in P_p in the other four regions, indicating that the historical trends of P_p are also spatially heterogeneous. Historically, the increase in extreme precipitation was mostly concentrated in southern China. Figure S2 in the supplement shows the relationship between P_p and T_p. Since the changes in T_p during the historical 60 years are small, they are not sufficient to identify the scaling rate between P_p and T_p. But we can still see a positive correlation between P_p and T_p in the North, Mid-low Yangtze, and Huaihe regions (figures S2(c), (d) and (f) of the supplement). In comparison, figure S3 of the supplement derives P_p and T_p from the 90th, 99th, and 99.9th percentiles of precipitation intensities of all seasons. P_p in most regions does not show a significant trend, indicating that ignoring seasonal characteristics could lead to misleading results regarding the temporal changes in extreme precipitation. One explanation is that the increase is concentrated in a few high-percentile extreme events in the wet season, and this increase is averaged out by precipitation events in the dry season. The evidence for this is that precipitation for the same percentile is significantly higher in figure 5 than in figure S3.

Figure 5. The historical change of wet season precipitation peaks measured by the moving-windows method during 1958-2017. The length of a step is 5 years, and the width of a window is 20 years. Solid color lines represent the 99.9th, 99th, and 90th percentiles. Positions on the x-axis correspond to the middle year of each window. The p-value is obtained from the MK test. When the p-value is less than 0.05, the trend is considered significant. The reference dashed line obtained from simple linear regression is given to make the trend intuitive. TP represents Tibetan Plateau.

Conclusions
In this study, we investigated the spatial pattern of the extreme precipitation-air temperature relationship in China and examined its historical change during 1958-2017 based on the precipitation zoning derived from clustering analyses. The following conclusions were reached:
(1) Seasonal characteristics have an important effect on the statistics of the P-T scaling rate. The differences in scaling rates between the dry and wet seasons could lead to a transition from C-C-like scaling to super-C-C scaling after reaching THR. Mixing precipitation events from different seasons may result in a miscalculation of the scaling rate.
(2) The spatial distribution of the P-T relationship across China in the wet season shows a decreasing pattern, i.e. from super-C-C through C-C-like to sub-C-C scaling from southeast to northwest. In the dry season, by contrast, most regions exhibit sub-C-C scaling.
(3) The observed peak precipitation in southern China has increased significantly over the past six decades, providing a historical basis for predictions of future extreme precipitation. However, a significant decrease in peak precipitation is also found in northern China. Not distinguishing precipitation of different seasons can lead to misleading results regarding historical trends in extreme precipitation.
It should be noted that the above conclusions are based on daily observations. Studies based on hourly or sub-daily observations may lead to different conclusions about the P-T relationship, such as higher scaling rates (Schroeer and Kirchengast 2018, Visser et al 2021). However, this study also obtained some conclusions that used to be considered features of hourly precipitation, such as the increase of the P-T scaling rate above THR in some regions (Park and Min 2017, Wang et al 2018). The findings of the study may therefore apply to a broader context.

Data availability statement
The data cannot be made publicly available upon publication due to legal restrictions preventing unrestricted public distribution. The data that support the findings of this study are available upon reasonable request from the authors.