Appropriately representing convective heating is critical for predicting catastrophic heavy rainfall in 2021 in Henan Province of China

An unprecedented heavy rainfall event occurred in Henan Province of central China during 19–20 July 2021 with a maximum hourly rainfall of 201.9 mm, which broke the historical record in mainland China. To investigate the impacts of predicted atmospheric circulation on the regional convection-permitting prediction of this event, two sets of nested experiments with different convective parameterizations (GF and MSKF) in the outer domain and at convection-permitting resolution in the inner domain are performed with the Weather Research and Forecasting (WRF) model. The analysis shows that the prediction of the ‘21.7’ rainstorm at convection-permitting resolution in the inner domain is largely affected by the convective scheme in the outer domain. The atmospheric circulation forcing from the outer domain differs significantly between the convective schemes, which ultimately affects the regional synoptic pattern and precipitation in the refined region through lateral boundary forcing. The difference in regional prediction at convection-permitting resolution can be mitigated by adjusting the convective latent heat parameterization in the outer domain. This work highlights that appropriately parameterizing convective latent heat is key to providing reasonable large-scale forcing for regionally predicting this catastrophic heavy rainfall event at convection-permitting resolution, a result that may also be applicable to other events and other regions.


Introduction
During 19-20 July 2021, a catastrophic heavy rainfall event occurred in Henan Province of central China (hereafter named the '21.7' rainstorm), which was characterized by high intensity, long duration, and wide coverage. The hourly rainfall intensity reached 201.9 mm, breaking the historical record in mainland China (198.5 mm). The cumulative precipitation on July 20, 2021, reached 624.1 mm, exceeding the total daily precipitation of 509.5 mm in 2019. Such torrential rainfall results in devastating floods, severe urban waterlogging, landslides and other natural disasters and ultimately causes numerous casualties and enormous property losses. Specifically, this rainstorm caused 292 deaths and a direct economic loss of $17.65 billion. In general, severe rainstorms are the product of the interaction of small-scale, mesoscale, and large-scale processes. Previous studies have shown that heavy rainfall events in China are often associated with multiscale weather systems such as the summer monsoon, typhoons, fronts, shear lines, westerly troughs and low-level jets (e.g., Zhao et al 2019a, Wu et al 2022, Zhao et al 2019b, Ren et al 2021, Du and Chen 2019). The warm and moist air transported from the tropics by the large-scale monsoon circulation provides an abundant water vapor supply for the occurrence and continuation of rainstorms in China. Orographic lifting can also play an important role in the formation of heavy rainfall (e.g., Goswami et al 2010, Xia and Zhang 2019, Zhao et al 2020). Accurate prediction of heavy precipitation events saves lives, supports emergency management and mitigation of impacts, and prevents economic losses. Numerical models are important tools for predicting heavy rainfall.
Although precipitation prediction has improved in recent decades, accurate prediction of extreme precipitation remains challenging because of the multiscale nonlinear interactions of the processes that generate heavy rainfall (e.g., Sukovich et al 2014, Bauer et al 2015, Ma et al 2021). During the '21.7' rainstorm, several weather systems coexisted in the East Asia region, such as the abnormally strong and northward-displaced western Pacific subtropical high (WPSH) in northeastern Asia, the South Asian high over the Tibetan Plateau (TP), Typhoon 'InFa' in the East China Sea and Typhoon 'Cempaka' in the South China Sea (Yin et al 2022). Under the control of such complex atmospheric circulations and the special topography around Henan Province, almost all operational numerical weather predictions (NWPs) significantly underestimated the intensity of precipitation.
Since convection generally cannot be resolved at a resolution of tens of kilometers, convective parameterization needs to be adopted and is often considered one of the largest sources of uncertainty in NWP. Many previous studies have demonstrated that modeling of heavy rainfall events is sensitive to the choice of cumulus parameterization scheme. For example, the Grell-Freitas (GF) and Kain-Fritsch (KF) schemes have been widely used to simulate heavy precipitation events in the Weather Research and Forecasting (WRF) model. Some studies found that the GF scheme outperformed the KF scheme (e.g., Sikder and Hossain 2016). Gao et al (2017) found that the simulation with the GF scheme captured U.S. summer precipitation better than that with the KF scheme. Wang et al (2021) indicated that the KF scheme could predict the diurnal rainfall peak better than the GF scheme for extreme rainfall in Shanghai. Jeworrek et al (2021) revealed that the GF scheme performed better for summertime convective precipitation, but the KF parameterization better simulated wintertime frontal precipitation over the complex terrain of southwest British Columbia. Furthermore, previous studies have also pointed out that simulations with convective parameterizations exhibited large deficiencies in capturing precipitation intensity and frequency and failed to reproduce the diurnal variation of heavy precipitation (e.g., Dai 2006, Stephens et al 2010, Lin et al 2017, Xu et al 2021). For instance, Li et al (2020) found that the simulation with convective parameterization underestimated precipitation intensity over the entire warm season during the East Asian summer monsoon.
Recent advancements in computational resources have allowed limited-area models or global variable-resolution models to be run at convection-permitting scale (∼4 km) regionally, which attempts to avoid the impacts of uncertainties associated with convective parameterization in the region of interest. Previous studies have suggested that increasing the horizontal grid resolution could significantly improve the capability to simulate extreme precipitation because the impacts of topography, land use, and other important processes are better resolved (e.g., Prein et al 2015). For weather prediction, many studies have shown that when the model resolution increases to convection-permitting scale, the spatial and temporal distributions, diurnal variations, and intensity of precipitation can be predicted more accurately (Zhu et al 2018, Zhao et al 2019, Xu et al 2021). Zhu et al (2018) indicated that the convection-permitting forecasts of precipitation in China during the summers of 2013-2014 outperformed global forecasts in terms of spatial distribution, intensity, and diurnal variation. With respect to regional climate simulation, previous studies have also demonstrated that convection-permitting resolution could significantly improve the simulation of the characteristics, duration and diurnal variability of hourly precipitation (e.g., Kendon et al 2012, Kendon et al 2014, Ban et al 2014, Gao et al 2017, Li et al 2022). Gao et al (2017) revealed that the convection-permitting simulations were more realistic in reproducing the observed spatial distributions and diurnal variability of precipitation than the simulations at 36 km resolution.
Although regional forecasts at high resolution, particularly at convection-permitting resolution, can improve the prediction accuracy of heavy precipitation using either limited-area nesting or variable mesh refinement, they still require large-scale forcing that is provided by a global forecast at a resolution of tens of kilometers with cumulus parameterization (Biswas et al 2014, Yin et al 2022). As mentioned above, many studies have investigated the impacts of convective parameterization on simulated precipitation through its direct influence on generating precipitation at model resolutions of tens of kilometers. However, only a few studies have examined how the modulation of atmospheric circulation by convective parameterization affects the modeling results within the refined region at convection-permitting resolution (e.g., Li 2013). Biswas et al (2014) found that the simulated hurricane intensity in the nested domain at a resolution of a few kilometers without convective parameterization was still sensitive to the choice of convective parameterization used in the outer simulation domain at a resolution of tens of kilometers. In addition, very few of these studies explored how convective parameterization affected atmospheric circulation and then influenced the simulated event within the refined region.
Although many studies have simulated or forecasted extreme precipitation events such as the '21.7' rainstorm in Henan Province at convection-permitting scale (Zhu et al 2018, Zhao et al 2019, Luu et al 2022, Yin et al 2022, Zhang et al 2022), very few have investigated the influence of forecasted atmospheric circulation on forecasting regional extreme precipitation events at convection-permitting resolution, particularly how convective parameterization can affect forecasted extreme precipitation at convection-permitting resolution by modulating the forecasted atmospheric circulation. Therefore, in this study, the '21.7' catastrophic rainstorm event of 2021 in Henan Province is investigated to reveal the impacts of forecasted atmospheric circulation modulated by convective parameterization on the regional convection-permitting forecast of this event. The remainder of the paper is organized as follows. In section 2, the setup of the numerical experiments and the observation and reanalysis datasets are described. In section 3, the influence of predicted atmospheric circulation dominated by convective parameterization on the regional prediction of the '21.7' rainstorm at convection-permitting resolution is examined. The conclusion and discussion are given in section 4.
The nested model domains are shown in figure S1. The outer domain covers the main weather systems that affected the '21.7' rainfall in Henan Province, and the inner domain covers Henan Province and the surrounding regions. The vertical coordinate is configured with 50 layers, and the pressure top is at 100 hPa. To investigate the impacts of forecasted atmospheric circulation modulated by convective parameterization on the regional convection-permitting forecast of this event, two experiments with different convective parameterizations in the outer domain, the GF scheme (Grell et al 2014) and the Multiscale Kain-Fritsch (MSKF) scheme (Zheng et al 2016), are conducted.
GF is an ensemble scheme in which multiple cumulus schemes and variants are applied within a single grid box to obtain an ensemble-mean realization. The KF scheme is a simple mass-flux-based scheme with moist updrafts and downdrafts; it includes a trigger function to initiate convection, a compensating circulation, and a closure assumption. The MSKF scheme includes updates to the traditional KF version involving 'subgrid-scale cloud-radiation interactions, a dynamic adjustment time scale, impacts of cloud updraft mass fluxes on grid-scale vertical velocity, and lifting condensation level-based entrainment methodology that includes scale dependency'. The convective parameterizations are turned off in the inner domain simulation.
The analysis in this study shows that the forecasted synoptic circulation in the outer domain can be largely modulated by the latent heat released in convective parameterization, which can significantly affect the forecasted rainfall of the event in the inner domain. The bias of forecasted rainfall with the default GF scheme may be related to its uncertainty in latent heat estimation. Therefore, an additional experiment is conducted to adjust the latent heat calculation of the GF scheme to demonstrate the coupling of parameterized convective latent heat, atmospheric circulation, and convection-permitting forecasted rainfall. In the current version of the model, the latent heat in GF is calculated proportionally to the total base updraft mass flux (UMF), and UMF is inversely proportional to the convective updraft fraction $\sigma$. $\sigma$ is specified as a function of the radius of convective updrafts ($R$) as follows:
$$\sigma = \frac{\pi R^{2}}{A}, \qquad R = \frac{0.2}{\varepsilon},$$
where $A$ is the area of the grid box and $\varepsilon$ is the initial fractional entrainment rate, which is set to $7.0 \times 10^{-5}\ \mathrm{m}^{-1}$ in this version of the model. Note that $\varepsilon$ is experimental and highly tunable. The analysis shows that the parameterized convective latent heat in GF may be overestimated. To reduce the latent heat in the GF scheme, five values of $\varepsilon$ (7.0, 2.0, 1.5, 1.1 and 1.0 $\times 10^{-5}\ \mathrm{m}^{-1}$, gradually decreasing) are chosen for a sensitivity test. As $\varepsilon$ decreases, the latent heating rate also decreases (figure S2). In particular, when $\varepsilon$ is equal to $1.0 \times 10^{-5}\ \mathrm{m}^{-1}$, the parameterized convective latent heat in GF is significantly reduced and the atmospheric circulation is closer to the reanalysis data (figure S3). Therefore, $\varepsilon$ is set to $1.0 \times 10^{-5}\ \mathrm{m}^{-1}$ for the sensitivity experiment with the GF scheme in this study.
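The dependence of the updraft fraction on the entrainment rate can be illustrated with a minimal sketch. It assumes the relations $R = 0.2/\varepsilon$ and $\sigma = \pi R^{2}/A$ commonly quoted for the GF scheme, and a hypothetical outer-domain grid spacing of 50 km chosen only for illustration; the function names are likewise illustrative, and any limits that the model code applies to $\sigma$ are omitted.

```python
import math

# Hypothetical grid spacing of the outer domain (assumption, for illustration only).
GRID_SPACING_M = 50_000.0

# The five entrainment rates tested in the sensitivity experiments (m^-1).
EPSILONS = [7.0e-5, 2.0e-5, 1.5e-5, 1.1e-5, 1.0e-5]


def updraft_radius(epsilon):
    """Radius of the convective updraft (m), assuming R = 0.2 / epsilon."""
    return 0.2 / epsilon


def updraft_fraction(epsilon, grid_spacing_m=GRID_SPACING_M):
    """Convective updraft fraction, assuming sigma = pi * R**2 / A."""
    area = grid_spacing_m ** 2  # area of a square grid box (m^2)
    return math.pi * updraft_radius(epsilon) ** 2 / area


if __name__ == "__main__":
    for eps in EPSILONS:
        print(f"epsilon = {eps:.1e} m^-1  R = {updraft_radius(eps) / 1000:.1f} km  "
              f"sigma = {updraft_fraction(eps):.4f}")
```

Under these assumptions, a smaller entrainment rate gives a larger updraft radius and hence a larger updraft fraction; because UMF is inversely proportional to the updraft fraction, this is consistent with the reduced latent heating reported for the smaller entrainment rates.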
The three experiments described above are referred to as EXP-GF, EXP-MSKF, and EXP-adjGF hereafter. Convective parameterizations are only applied in the outer domain and are turned off in the inner domain, so the forecasted rainfall in the inner domain is not directly modulated by the convective parameterizations but can be affected by the atmospheric circulation forcing from the outer domain, which is coupled with the parameterized convective latent heat. Physical schemes other than convective parameterization are kept the same for all the experiments, such as the Thompson cloud microphysics scheme, the YSU boundary layer scheme, the Noah land surface scheme, and the RRTMG shortwave and longwave radiation schemes. The experimental configurations and the physical schemes in this study are summarized in table S1. The initial and lateral boundary conditions are obtained from the National Centers for Environmental Prediction (NCEP) Final (FNL) Operational Global Analysis data (NCEP, 2000) with a horizontal resolution of 1° × 1° and a temporal resolution of 6 h. The lateral boundary conditions (LBCs) of the outer domain are updated every 3 h and are temporally interpolated from the 6-hourly NCEP FNL analysis data. The LBCs of the nested domain are updated with the fields from the outer domain at every integration time step of the outer domain (i.e., one minute). Although the LBCs of the outer domain are derived from the NCEP FNL analysis data, the LBCs of the inner domain are obtained from the prediction of the outer domain. Therefore, the experiments in this study can be used to understand the influence of atmospheric circulation on the regional prediction of the extreme event and the underlying mechanism.
Three ensemble simulations are performed for each experiment with different initial times: 2100 UTC July 17, 0000 UTC July 18, and 0300 UTC July 18, 2021. The initial conditions of the members that start between analysis times are temporally interpolated from the 6-hourly NCEP FNL analysis data. The results averaged over the three ensemble members for 0000 UTC July 19 to 0000 UTC July 21, 2021, are analyzed to reduce the influence of model internal variability.
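The time interpolation of the analysis fields and the ensemble averaging can be sketched as follows. This is a minimal illustration that assumes linear interpolation in time (the interpolation method is not specified here) and uses made-up field values; the function names are hypothetical and are not part of WRF.

```python
from datetime import datetime


def interpolate_in_time(field_a, field_b, time_a, time_b, target):
    """Linearly interpolate two fields (lists of floats) to the target time."""
    # Dividing two timedelta objects yields a plain float weight in [0, 1].
    weight = (target - time_a) / (time_b - time_a)
    return [a + weight * (b - a) for a, b in zip(field_a, field_b)]


def ensemble_mean(members):
    """Point-by-point mean over ensemble members (each a list of floats)."""
    return [sum(values) / len(values) for values in zip(*members)]


if __name__ == "__main__":
    # Two consecutive 6-hourly analyses with made-up values at two grid points.
    t18 = datetime(2021, 7, 17, 18)
    t00 = datetime(2021, 7, 18, 0)
    analysis_18 = [0.0, 10.0]
    analysis_00 = [6.0, 16.0]

    # Initial condition for the member that starts at 2100 UTC July 17.
    print(interpolate_in_time(analysis_18, analysis_00, t18, t00,
                              datetime(2021, 7, 17, 21)))

    # Mean of three made-up member forecasts at two grid points.
    print(ensemble_mean([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]))
```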

Observations and reanalysis data
Ground station-observed rainfall from the National Meteorological Information Center of the China Meteorological Administration (CMA) is employed to evaluate the simulated precipitation. In this dataset, rainfall was measured by tipping buckets, self-recording siphon rain gauges, or automatic rain gauges. The data were subject to strict three-step quality control by station-, provincial-, and national-level observational departments. The methods of quality control mainly include checking the climate threshold value, extreme value, spatial and temporal consistency, and human-computer interaction. All the data used in this study are quality controlled. The weather stations are densely distributed over East China, and the mean separation distance between two adjacent stations is approximately 25 km.
To validate the general conditions of the atmosphere, such as circulation and temperature, the global reanalysis dataset of the European Centre for Medium-Range Weather Forecasts (ERA5) and the CRA40 reanalysis dataset are used.
Figures 1(a)-(d) show the spatial distributions of 48-h accumulated precipitation within the inner domain from 0000 UTC July 19 to 0000 UTC July 21, 2021, from the observations and the three experiments. The black box (31.5-36.5°N, 110.5-116.5°E) in figure 1(a) denotes the Henan Province region used in the further analysis. In the observations (figure 1(a)), more than half of Henan Province had total accumulated precipitation exceeding 200 mm, with a maximum amount of 727.7 mm at Zhengzhou city, and a record-breaking hourly precipitation of 201.9 mm was observed at Zhengzhou station. Compared with the observations, the location of heavy precipitation from EXP-GF shows significant southwestward biases, and the precipitation intensity is much lower than observed (figure 1(b)). In comparison, the EXP-MSKF experiment performs much better than EXP-GF, although the precipitation intensity is still slightly underestimated (figure 1(c)). The spatial correlation coefficients are 0.08 and 0.74 for EXP-GF and EXP-MSKF, respectively. Figure 1(e) compares the time series of hourly precipitation averaged over Henan Province from the observations and the simulations (ensemble mean). All three ensemble members for each experiment exhibit high consistency, demonstrating the robustness of the sensitivity simulations (figure S4 in the supporting material). The observed area-averaged hourly precipitation increases after 0800 LST on 19 July, reaches its peak at approximately 0800 LST on 20 July and lasts until approximately 2200 LST on 20 July. The EXP-GF experiment reproduces the temporal evolution of precipitation poorly, showing a decreasing trend, and significantly underestimates the precipitation intensity. It has a temporal correlation coefficient of −0.42 against the observations. In comparison, the EXP-MSKF experiment generally captures the temporal characteristics of precipitation and has a correlation coefficient of 0.71 against the observations. Note that the high positive correlation for the EXP-MSKF experiment mainly reflects its increasing trend of precipitation over time, which is consistent with the observations. The EXP-MSKF experiment still cannot accurately capture the hourly variation of precipitation during the event, which deserves further investigation in the future.
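The correlation coefficients quoted above can be computed with a short routine. The sketch below assumes the standard Pearson correlation (the exact definition is not stated here): for the spatial coefficient the two inputs would be the accumulated precipitation at matching stations or grid points, and for the temporal coefficient the two area-averaged hourly series. The function name and the sample values are illustrative only.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variance_x = sum((a - mean_x) ** 2 for a in x)
    variance_y = sum((b - mean_y) ** 2 for b in y)
    if variance_x == 0.0 or variance_y == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(variance_x * variance_y)


if __name__ == "__main__":
    # Made-up hourly area-mean precipitation (mm): observed and simulated.
    observed = [1.0, 2.0, 4.0, 7.0, 5.0]
    simulated = [0.8, 1.5, 3.0, 5.5, 4.5]
    print(round(pearson_correlation(observed, simulated), 3))
```

A simulation whose precipitation decreases while the observed precipitation increases yields a negative coefficient, which is how a value such as −0.42 should be read.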

Prediction of atmospheric circulations with different convective parameterizations
The difference in precipitation between the two experiments, EXP-GF and EXP-MSKF, may result from their difference in atmospheric circulation pattern. Both the ERA5 and CRA40 reanalysis datasets show that a southwest−northeast wind shear line formed over western Henan Province (figure S6 in the supporting material). The location of the wind shear line is consistent with the spatial distribution of the precipitation. Although all the configurations are kept the same for the experiments in the inner domain, the spatial distributions of geopotential height and horizontal wind fields at 700 hPa averaged for 19-20 July 2021 within the inner domain are significantly different for EXP-GF and EXP-MSKF (figure S6). The circulation from EXP-GF shows significant biases, with the wind shear line shifted southwestward compared to the reanalysis, while EXP-MSKF captures the circulation pattern well (figure S6). The difference in circulations between the two experiments can only be explained by their difference in lateral boundary conditions from the outer domain simulation.
The spatial distributions of geopotential height and horizontal wind fields at 700 hPa in the outer domain averaged for 19-20 July 2021 are shown in figure 2. Both reanalysis datasets show that the abnormally strong WPSH extended to northeastern China during the rainstorm. Typhoon 'InFa' is located in the East China Sea, and Typhoon 'Cempaka' is located in the South China Sea. Under the joint control of several weather systems, a southwest−northeast wind shear line forms over western Henan Province, which corresponds to the rainstorm event. In the EXP-GF experiment, the low-pressure areas of Typhoons 'InFa' and 'Cempaka' are much larger than in the reanalysis, and their locations show westward biases, which results in almost a merger of the two typhoons. The simulated biases in the features of the two typhoons also reflect the biases in the simulated atmospheric circulation that shift the southwest−northeast wind shear line southwestward in EXP-GF relative to the reanalysis. In comparison, the features of the two simulated typhoons in EXP-MSKF are much more consistent with the reanalysis. The atmospheric circulation and the southwest−northeast wind shear line over western Henan Province are also well captured by EXP-MSKF. The difference in atmospheric circulations between EXP-GF and EXP-MSKF in the outer domain must stem from the convective parameterizations used, which are the only difference between the two experiments.
To better understand the mechanism by which different convective parameterizations govern the atmospheric circulation, figure 3(a) presents the spatial distributions of the difference in geopotential height and horizontal wind fields at 700 hPa between EXP-GF and EXP-MSKF within the outer domain averaged for 19-20 July 2021. It is evident that EXP-GF simulates a lower geopotential height at 700 hPa than EXP-MSKF over a broad area, particularly over the ocean, which results in the anomalies of two low-pressure centers and cyclonic (counterclockwise) circulations in EXP-GF. The biases of geopotential height and circulation pattern at 700 hPa in EXP-GF relative to EXP-MSKF result from its biases in atmospheric temperature. Figure 3(b) shows the difference in the mean temperature averaged from 700 hPa to 200 hPa between EXP-GF and EXP-MSKF. EXP-GF simulates a significantly higher mean temperature over a broad area, which corresponds well to its negative biases in geopotential height at 700 hPa. Considering that the parameterized convective latent heat can significantly contribute to atmospheric temperature, the area-averaged vertical latent heating profiles from EXP-GF and EXP-MSKF are shown in figure 4. The averaging areas are denoted by the black boxes in figure 3. The results show that the latent heating rate above 900 hPa in EXP-GF is significantly larger than that in EXP-MSKF, which is consistent with the higher mean temperature from 700 hPa to 200 hPa in EXP-GF. Therefore, the excessive latent heat in EXP-GF explains its positive bias in atmospheric temperature, which eventually leads to its deviation in geopotential height and circulation.
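The link between layer-mean temperature and geopotential height can be made concrete with the hypsometric relation, $\Delta Z = (R_d \bar{T} / g) \ln(p_{\mathrm{bottom}} / p_{\mathrm{top}})$. The sketch below is a simplified illustration that assumes hydrostatic balance and dry air; the constants and function names are not taken from the experiments. It gives only the change in thickness of the 700-200 hPa layer, and how that change is partitioned between a lower 700 hPa surface and a higher 200 hPa surface is not determined by this relation alone.

```python
import math

R_DRY = 287.05   # gas constant of dry air (J kg^-1 K^-1)
GRAVITY = 9.80665  # standard gravitational acceleration (m s^-2)


def layer_thickness(mean_temperature_k, p_bottom_hpa, p_top_hpa):
    """Thickness (m) of a pressure layer from the hypsometric relation."""
    return (R_DRY * mean_temperature_k / GRAVITY
            * math.log(p_bottom_hpa / p_top_hpa))


def thickness_change(delta_t_k, p_bottom_hpa, p_top_hpa):
    """Thickness change (m) caused by a layer-mean temperature change (K)."""
    # The relation is linear in temperature, so the same formula applies.
    return layer_thickness(delta_t_k, p_bottom_hpa, p_top_hpa)


if __name__ == "__main__":
    # Effect of a 1 K warmer 700-200 hPa layer on the layer thickness.
    print(f"{thickness_change(1.0, 700.0, 200.0):.1f} m per K")
```

Under these assumptions, each kelvin of layer-mean warming thickens the 700-200 hPa layer by a few tens of metres, so a temperature bias of a few kelvin is of the right order to account for a noticeable geopotential height bias at 700 hPa.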

Modulation of rainfall and atmospheric circulations by convective latent heat
To further investigate the impact of parameterized convective latent heat release on atmospheric circulation, a sensitivity experiment (EXP-adjGF) that adjusts the convective latent heat calculation in the GF scheme is conducted (see the details in section 2.1). Figure 4 shows that the latent heat profile in EXP-adjGF is much closer to that in EXP-MSKF. The simulated geopotential height at 700 hPa in EXP-adjGF is significantly improved and is much closer to the reanalysis and EXP-MSKF (figure 2(e)). The simulated atmospheric circulations are also improved in EXP-adjGF, including Typhoon 'InFa', Typhoon 'Cempaka' and the southwest−northeast wind shear line over western Henan Province. The spatial and temporal variations of the rainfall in EXP-adjGF are shown in figures 1(d) and (e), respectively. EXP-adjGF reproduces the location and intensity of precipitation over Henan Province in comparison with the observations and EXP-MSKF and performs much better than EXP-GF. The spatial and temporal correlation coefficients between the observations and the EXP-adjGF results reach 0.81 and 0.71, respectively. The analysis above demonstrates that the parameterized convective latent heat is critical for producing reasonable atmospheric circulation forcing for the convection-permitting forecast in the inner domain and thus for successfully predicting the location and intensity of the '21.7' extreme rainstorm in Henan Province.
Although this study focuses on the influence of forecasted atmospheric circulation on regional extreme precipitation events at convection-permitting resolution, the results from the outer domain in all the experiments are also shown (figure S5 in the supporting material). It is interesting to note that the results from the outer domain are similar to those from the inner domain, at least in terms of the spatial distribution and the average of precipitation over Henan Province as shown in figure 1. In detail, the MSKF experiment still outperforms the GF experiment in the outer domain, and the intensity and trend of precipitation simulated in the outer domain are similar to those from the inner domain. In addition, EXP-adjGF also performs much better than EXP-GF and is more consistent with EXP-MSKF. This implies that a forecast with an appropriate convective parameterization can sometimes perform comparably to one at convection-permitting resolution in terms of the spatial distribution and regional average of precipitation. More detailed analysis of the difference between the results from the inner domain and the outer domain is beyond the scope of this study but deserves further investigation in the future.

Summary and discussion
A record-breaking heavy rainfall event occurred in Henan Province of central China during 19-20 July 2021. In this study, the impacts of predicted atmospheric circulation modulated by convective parameterization on the regional convection-permitting prediction of this event are examined. Our model experiments and analysis demonstrate that the difference in latent heat release between convective parameterizations dominates their impacts on atmospheric circulation and ultimately affects the prediction of the '21.7' rainstorm in Henan Province. The location and intensity of heavy precipitation from the EXP-GF experiment show significant biases, while the EXP-MSKF experiment performs much better than the EXP-GF experiment. The spatial correlation coefficients are 0.08 and 0.74, and the temporal correlation coefficients are −0.42 and 0.71 for EXP-GF and EXP-MSKF against the observations, respectively. The MSKF scheme clearly outperforms the GF scheme for this heavy precipitation event, which is consistent with previous studies on the influence of convective parameterization on the modeling of heavy rainfall in China (Liang et al 2019). This finding implies that although the resolution reaches convection-permitting scale and no convective parameterizations are used in the inner domain, the forecast skill for heavy precipitation in the inner domain is still largely dependent on the representation of convection in the outer domain. The difference in precipitation between the EXP-GF and EXP-MSKF experiments is driven by the difference in atmospheric circulations. The difference in atmospheric circulations is caused by the different convective parameterizations used in the outer domain, which ultimately affects the circulation and precipitation in the refined region through lateral boundary forcing.
The EXP-GF experiment simulates a lower 700 hPa geopotential height than the EXP-MSKF experiment, which results in two low-pressure centers and cyclonic (counterclockwise) circulation in EXP-GF. The negative biases of geopotential height at 700 hPa in EXP-GF result from its positive biases of atmospheric temperature over a broad area, which are mainly due to the positive bias of the latent heat from GF. The EXP-adjGF experiment, which adjusts the latent heat parameterization in the GF scheme, predicts a more reasonable atmospheric circulation and captures the location and intensity of the '21.7' rainstorm. A better performance of the MSKF scheme than the GF scheme in modeling heavy precipitation in China was also found by previous studies (e.g., Liang et al 2019, Wang et al 2021). However, they did not explore how convective parameterization affected atmospheric circulation and then influenced the simulated event within the refined region. This study reveals that the parameterization of latent heat is the key factor behind the difference between the two schemes in terms of their impacts on atmospheric circulation.
This study highlights that the atmospheric circulation dominated by latent heat released in convective parameterizations largely affects the prediction of the 2021 catastrophic heavy rainfall in Henan Province of China. Although regional models at a high resolution of a few kilometers are widely used and expected to improve the prediction skill for extreme events, regional high-resolution prediction may not meet this expectation because of the uncertainty of atmospheric circulation dominated by convective parameterization, particularly its estimation of latent heat release. Global prediction at a resolution of a few kilometers may eliminate these uncertainties, but computational efficiency and resources currently impose hard limits on its operational application (Bauer et al 2015). Therefore, convective parameterization remains critical and needs further improvement for predicting extreme events accurately with global variable-resolution models or regional downscaling models.