Tropical cyclone intensity forecasting using model knowledge guided deep learning model

This paper develops a deep learning (DL) model for forecasting tropical cyclone (TC) intensity in the Northwest Pacific. A dataset of 20 533 synchronized and collocated samples was assembled from ERA5 reanalysis data and satellite infrared (IR) imagery, covering the period from 1979 to 2021. The u-, v- and w-components of wind, sea surface temperature, IR satellite imagery, and historical TC information were selected as the model inputs. A TC-intensity-forecast-fusion (TCIF-fusion) model was then developed, in which two special branches were designed to learn multi-factor information to forecast 24 h TC intensity. Finally, heatmaps capturing the model’s insights (model knowledge) were generated and applied to the original input data, creating an enhanced input set that was used to guide TCIF-fusion modeling and that yields more accurate forecasts. The model-knowledge-guided TCIF-fusion model achieved a 24 h forecast error of 3.56 m s−1 for Northwest Pacific TCs spanning 2020–2021. The results show that our method outperforms the official subjective prediction and advanced DL methods in forecasting TC intensity by 4% to 22%. Additionally, compared to operational approaches, the model-knowledge-guided method can better forecast the intensity of landfalling TCs.


Introduction
Tropical cyclones (TCs) are extremely powerful weather phenomena originating in tropical oceans, potentially unleashing catastrophic devastation upon coastal regions (DeMaria 2009, DeMaria et al 2014, Cangialosi et al 2020). The significant hazards TCs pose include severe flooding, destructive winds, and coastal inundation caused by storm surges, all of which substantially threaten human lives and property (Zheng et al 2015, 2019, Klotzbach et al 2018, Wang and Toumi 2021, Bhatia et al 2022, Li et al 2023). Hence, precise TC intensity forecasts are pivotal in helping individuals take proactive precautions and mitigate potential losses (Woodruff et al 2013, Landsea and Cangialosi 2018, Yu et al 2020, Wang and Toumi 2022, Zhang et al 2023, Wang and Li 2023a).
The dynamical model method is the primary approach for TC intensity forecasting (Saha et al 2014, Bao et al 2022). This method utilizes a set of mathematical equations to represent the fundamental principles governing atmospheric motion and thermodynamics (Weber 2003, Zheng et al 2017, 2021, 2023, Shen et al 2022). By assimilating extensive observational data into the forecasting models, a dynamical model can be initialized to forecast the evolution of TC intensity. The dynamical models consider a range of atmospheric and oceanic parameters, including sea surface temperatures (SSTs), humidity, wind patterns, and pressure systems, which influence the behavior of TCs. Recent advances in computing technology and data assimilation techniques have significantly improved the accuracy of dynamical models, resulting in a considerable reduction in the forecast error for TC tracks; progress in intensity forecasting, however, has been relatively slow (Landsea and Cangialosi 2018).
Statistical methods for TC intensity forecasting rely on historical observational data to develop mathematical models that capture the relationships between various meteorological parameters and the evolution of TC intensity (DeMaria and Kaplan 1994, 1999, DeMaria et al 2005, DeMaria 2009). These models often involve regression analysis, time series analysis, and other statistical techniques (Knaff et al 2003). The advantage of statistical methods lies in their simplicity and ease of implementation, particularly when real-time data is limited. However, these models are constrained by the assumption that historical patterns will continue to hold, limiting their effectiveness in forecasting extreme events, such as rapid intensification or sudden weakening of TCs (Lin et al 2009, Sandery et al 2010, DeMaria et al 2014). They are often combined with other methods to enhance overall forecast accuracy.
However, the intensity changes of TCs are influenced by many factors, such as the intricate interactions between the atmosphere and the ocean and the broader atmospheric conditions at play. These factors are complex and difficult to explain. Traditional methods cannot effectively capture the non-linear processes of TCs. Deep learning (DL) has emerged in response to this challenge. In recent years, there have been significant advancements in DL technology (Li et al 2020, 2022, Wang and Li 2023b), which has been successfully applied to various forecasting tasks (Lagerquist et al 2020, Zheng et al 2020, Ravuri et al 2021, Wang et al 2022, 2023, Zhang and Li 2022, Ren and Li 2023) and is currently being explored for TC intensity forecasting (Baik and Paek 2000, Pan et al 2019, Xu et al 2021, 2022, Yuan et al 2021, Zhang et al 2022, Ma et al 2023, Meng et al 2023b). DL algorithms, such as convolutional neural networks, can process vast amounts of data and identify complex patterns within meteorological datasets. This makes them particularly well-suited for handling the intricate and nonlinear characteristics of TC intensity forecasts (TCIFs). By learning from historical TC data, DL models can capture subtle relationships and nonlinear factors that traditional forecasting methods may overlook. With continuous learning and improvement from new data, DL methods demonstrate adaptability and flexibility in handling changing atmospheric conditions.
In recent years, DL has shown high accuracy and efficiency in the field of TCIFs. Baik and Paek (2000) designed a TC intensity forecasting model based on a multi-layer perceptron for the 12-72 h period, which resulted in a 7%-16% reduction in forecast errors compared to statistical methods. Pan et al (2019) and Yuan et al (2021) considered the time series dependency in intensity forecasting based on long short-term memory (LSTM) models (Graves 2012), leading to significant improvements in forecast accuracy. Xu et al (2021) utilized inputs from the Statistical Hurricane Intensity Prediction Scheme (SHIPS) and Dynamical Statistical Hurricane Prediction (DSHP) statistical methods, combining them with a multi-layer perceptron to reduce forecast errors by 5%-22%. Furthermore, Xu et al (2022) and Meng et al (2023b) introduced the three-dimensional structure of the TC wind field and constructed the spatial attention fusing network (SAF-NET) TC intensity forecasting model, resulting in enhanced forecasting performance. Zhang et al (2022) incorporated the two-dimensional sea surface field into the three-dimensional atmospheric field, augmenting the model with LSTM modules to improve its ability to extract temporal information. Ma et al (2023) introduced Gated Recurrent Unit (GRU) modules during the modeling process to further enhance the model's capability to extract temporal information, resulting in improved forecast accuracy.
Although these DL methods have achieved remarkable accuracy in TC intensity forecasting, three key issues remain to be addressed. Firstly, satellite infrared (IR) imagery is commonly utilized for rapid intensification forecasts due to its strong correlation with TC intensity changes (Adler and Rodgers 1977, Steranka et al 1986, DeMaria and Kaplan 1994, 1999, DeMaria et al 2005, Su et al 2020), yet existing DL methods consider only atmosphere and ocean factors from reanalysis data and ignore satellite IR imagery. Secondly, the TC process is inherently complex, and the interactions among multiple input factors are intricate. Available networks often extract features individually for each factor or use simple concatenation, leading to limited capability in learning the interplay between these factors. Lastly, in the three-dimensional atmospheric field data, besides the signals strongly correlated with TC intensity, numerous factors interfere with the model's forecasts, and these have previously been overlooked during the modeling process.
The goal of this paper is to achieve accurate 24 h TCIFs, where TC intensity is defined by the 2 min maximum sustained wind speed. The contributions of this study are threefold. Firstly, it demonstrates the positive impact of utilizing satellite IR imagery on TC intensity forecasting performance. Secondly, it designs feature fusion modules for different factors, enhancing the DL model's capacity to learn and represent interactions between them. Lastly, by leveraging DL model knowledge (MK) to guide the modeling process, the study focuses the model more on signals closely associated with TC intensity evolution, resulting in improved model performance and training efficiency.
Section 2 introduces the data and methods. Section 3 presents the performance of the 24 h TCIF models, together with analysis and discussion. Section 4 gives the conclusions.

Data
The TC IR images used in this study were from the Gridded Satellite Data (GridSat-B1). GridSat-B1 data were created to facilitate the use of geostationary data (Knapp et al 2011). GridSat-B1 data are gridded International Satellite Cloud Climatology Project B1 data on a 0.07° latitude equal-angle grid. Satellites are merged by selecting the nadir-most observations for each grid point. GridSat-B1 offers IR satellite imagery with a temporal resolution of 3 h, covering the period from 1981 to the present. The imagery includes wavelengths of 11, 0.6, and 6.7 µm. However, due to the absence of 0.6 and 6.7 µm band images before the year 2000, this study only utilized the 11 µm band images.
The model inputs included environmental variables provided by ERA5 reanalysis data (Hersbach et al 2023), comprising the u- (U), v- (V), and w- (W) components of wind and SST. These variables strongly correlate with TC intensity (Baik and Paek 2000, Vecchi and Soden 2007, Tang and Emanuel 2010). To represent the vertical structure of TCs, four isobaric levels at 200, 500, 850, and 1000 hPa were chosen. The data collected spans from 1979 to 2021, with a spatial resolution of 1° and a temporal resolution of 6 h.
The Best Track dataset for TCs provided by the Shanghai Typhoon Institute, Chinese Meteorological Administration (STI/CMA) was used to label and extract TC samples (Ying et al 2014). The track and intensity of each TC are recorded every 3 or 6 h in this dataset. The TC data were collected from 1979 to 2021. The data from 1979 to 2019 were partitioned, with 90% allocated for training and 10% reserved for validation. The data from 2020 and 2021 were utilized as independent test data.
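The partition described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (a dict with a `year` key), the function name `split_samples`, and the random shuffle with a fixed seed are hypothetical, since the text does not say whether the 90%/10% split of the 1979-2019 data is random or chronological.

```python
import random


def split_samples(samples, val_fraction=0.1, seed=0):
    """Split TC samples into train / validation / test lists by year.

    Assumption: each sample is a dict with a 'year' key. The 1979-2019
    samples are shuffled randomly (an assumption) before the 90/10 split.
    """
    # 2020 and 2021 are held out as independent test data.
    test = [s for s in samples if s["year"] in (2020, 2021)]
    pool = [s for s in samples if 1979 <= s["year"] <= 2019]

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(pool)

    n_val = round(len(pool) * val_fraction)
    val, train = pool[:n_val], pool[n_val:]
    return train, val, test


# Toy records: 10 samples per year for 1979-2021 (not real TC data).
samples = [{"year": 1979 + (i % 43), "id": i} for i in range(430)]
train, val, test = split_samples(samples)
print(len(train), len(val), len(test))
```

Holding out whole years for testing keeps the 2020-2021 storms fully unseen during training and validation.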
For details on data preprocessing, please refer to the supporting information section S1.

Design of the TCIF model input data
The input factors of the TCIF model can directly impact the results of TCIFs. Dynamical models usually consider atmospheric and oceanic elements. However, most existing DL-based methods mainly consider atmospheric factors (such as historical TC information (HIS), U, V, W), while Zhang et al (2022) and Ma et al (2023) have included oceanic factors (SST). Previous studies indicate that IR data can depict TC morphology and convective activities and are commonly used for rapid intensification forecasting (DeMaria and Kaplan 1994, 1999, DeMaria et al 2005, Su et al 2020). However, no DL-based TC intensity forecasting method has incorporated IR. This study therefore incorporates U, V, W, SST, and IR into the input factors of the TCIF model.
The model architecture for the input factor selection experiments is depicted in figure 1. For example, when the input factors are U, V, W, and HIS (supplementary table 1), the model exclusively comprises the U, V, W, and HIS branches (supplementary figure 4, gray arrows). However, with the addition of the input IR, the model integrates the IR branch (supplementary figure 4, gray and blue arrows).
In addition, the arrangement of data sequences plays a pivotal role in how the convolutional kernels are computed. For details on the arrangement of data sequences, please refer to the supporting information in section S2. As shown in supplementary table 2, compared with the 'x-y-z-t' arrangement, the 'x-y-t-z' arrangement leads to a 2.5% reduction in error. Therefore, all experiments in this paper use the 'x-y-t-z' data arrangement.

Design of the TCIF model architecture
In previous studies, a common approach involved using a multi-branch network structure (the 24 h TCIF-basic model, consisting of the blue arrows and blocks in figure 1) to independently extract each factor's features and concatenate them at the fully connected layer. The structure of the block is illustrated in supplementary figure 2.
Two improvements have been made to the basic framework shown in figure 1. The first adds a feature fusion branch, giving the 24 h TCIF-merge model (consisting of blue and green arrows and blocks in figure 1). The second adds an extra input branch, giving the 24 h TCIF-fusion model (consisting of blue, green, and gray arrows and blocks in figure 1). For details on model architecture design concepts and performances of different model architectures, please refer to the supporting information section S3. Finally, the TCIF-fusion model was selected for 24 h TC intensity forecasting.
In these models, the output of the TCIF model is the 24 h TC intensity. A linear activation function is employed in the output layer, while ReLU is used for the other layers. The optimizer is Adam, and the loss function is the mean absolute error (MAE).
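For reference, the training objective and activations named above can be written out as follows. The symbols are assumptions introduced here for illustration: N is the number of samples, y_i the observed 24 h intensity, \hat{y}_i the forecast, h_i the last hidden feature vector, and w, b the output-layer weights and bias.

```latex
% Hidden layers use the rectified linear unit
\mathrm{ReLU}(x) = \max(0, x)

% Linear output layer (no nonlinearity applied to the forecast)
\hat{y}_i = \mathbf{w}^{\top}\mathbf{h}_i + b

% Mean absolute error minimized by Adam
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|
```

A linear output is the usual choice for regressing an unbounded physical quantity such as wind speed, and MAE is expressed in the same units (m s−1) as the forecast errors reported in the paper.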

Design of MK-guided training strategy
Besides the signals strongly correlated with TC intensity in the three-dimensional atmospheric data, numerous spurious signals exist that can potentially interfere with the model's forecasts; these have previously been overlooked during the modeling process.
In DL, model interpretability has been a focal point of research. Among the various interpretability methods, Grad-CAM is used in this study to produce heatmaps of the inputs. As shown in figure 2, the heatmap assigns higher weights to regions with high wind speeds, while regions characterized by lower wind speeds and areas deemed insignificant by other models receive comparatively lower weights. Specifically, the areas with higher weights in the U and V heatmaps are situated near the TC center, where strong wind speeds are predominant. Notably, regions of significant weight in the W heatmap are observed both within the TC eye and in its peripheral areas. The SST heatmap closely mirrors the genuine SST distribution. Furthermore, regions exhibiting higher weights in the IR heatmap correspond to the TC's eye and its spiral bands. A new dataset is generated through an element-wise multiplication of the original data with the absolute values of the heatmap (figure 2). This new dataset enables the model to strengthen the extraction of features in regions exhibiting high wind speeds.
In light of this, a method is proposed that effectively utilizes 'MK' to guide the modeling process. The proposed approach involves four steps (supplementary figure 3): training an initial TCIF-fusion model, producing Grad-CAM heatmaps, weighting the original inputs by the heatmaps, and training a new TCIF-fusion model on the weighted inputs.
With HIS, U, V, and W as inputs, the MK-guided TCIF-1 model registers an error of 3.8 m s−1. With the addition of SST or IR inputs, the model's forecast errors are reduced to 3.74 m s−1 (MK-guided TCIF-2 model) and 3.69 m s−1 (MK-guided TCIF-3 model), respectively. The MK-guided TCIF-4 model, incorporating HIS, U, V, W, IR, and SST inputs in concert, achieves even further error reduction, yielding an error of 3.56 m s−1 and a performance improvement of 8.0% (compared with MK-guided TCIF-1).

The roles of SST and IR images in TCIFs
These findings underscore a significant observation: apart from wind speed components directly correlating with intensity, incorporating SST and IR inputs greatly enhances the model's performance in forecasting TC intensity.

The roles of MK in TCIFs
The original input data contains extraneous 'noise' beyond the information relevant to TC intensity, which could impede the model's learning process. Introducing MK can effectively reduce the impact of 'noise' on the model. For details on the method of introducing model knowledge, please refer to the Method section.
The progression of model training is represented by the loss curves presented in figure 3(b).
Figures 3(c) and (d) illustrate the forecasts of the TCIF-fusion model without MK guidance and of the MK-guided TCIF-fusion model against the test dataset. Notably, the MK-guided TCIF-fusion model exhibits a higher correlation (0.92) and lower MAE (3.56 m s−1). Figures 3(c) and (d) also demonstrate that incorporating MK mitigates the model's tendency to underestimate high wind speed samples. The results strongly indicate that incorporating MK contributes to enhanced model performance.
The results reveal that both the DL-based methods and our proposed approach outperform traditional methods by 4%-22%, underscoring the substantial potential of DL in TCIF. In contrast to other DL-based methods, the approach proposed in this paper (1) adds satellite imagery as input, (2) prioritizes the learning of interactions among factors, and (3) employs model-guided knowledge for modeling. The performance of the MK-guided TCIF-fusion model is more than 4% higher than that of other deep learning methods, indicating that the proposed method advances on them.
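The two scores quoted for figures 3(c) and (d), the correlation and the MAE between forecast and observed intensity, can be computed as in the minimal sketch below. The function names and the toy wind speeds are hypothetical, and the Pearson coefficient is assumed here because the text does not name the correlation measure.

```python
import math


def mean_absolute_error(obs, pred):
    """Mean absolute error between observed and forecast intensities."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def pearson_correlation(obs, pred):
    """Pearson correlation coefficient (assumed correlation measure)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


# Toy 24 h intensities in m s-1 (illustrative values, not from the paper).
observed = [20.0, 30.0, 40.0, 50.0]
forecast = [22.0, 29.0, 37.0, 46.0]

mae = mean_absolute_error(observed, forecast)
corr = pearson_correlation(observed, forecast)
print(f"MAE = {mae:.2f} m s-1, correlation = {corr:.3f}")
```

In this toy case the forecast falls further below the observation as intensity grows, the same kind of high-wind underestimation that the text says MK guidance mitigates.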

Results analysis
Additionally, the forecasting results of the MK-guided TCIF-fusion model have been compared across different TC intensities (figure 4). Figure 4(a) depicts a bar chart illustrating the forecast errors for various intensity levels. The forecast error of our model increases with higher TC intensity levels. In figure 4(b), different TC intensity occurrence frequencies are showcased in both observations and model forecasts. Notably, the occurrence frequencies of TCs of distinct intensities are quite comparable between our model and the observations. Our model tends to make fewer forecasts for TCs of tropical storm (TS) and SuperTY intensity and more forecasts for tropical depression (TD) (10.8-17.1 m s−1), severe tropical storm (STS) (24.5-32.6 m s−1), and typhoon (TY) (32.7-41.4 m s−1) intensity.
The frequency distribution illustrated in figure 4(b) implies that our model might misclassify TS (17.2-24.4 m s−1) TCs as TD and SuperTY (>51.0 m s−1) TCs as severe typhoon (STY) (41.5-50.9 m s−1) or TY. This tendency could stem from the varying sample sizes of TCs across different intensities, highlighting a limitation of DL methods. This issue poses a challenge to address in the future.
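The intensity classes used in this comparison can be sketched as a simple lookup on wind speed, followed by a frequency count of the kind shown in figure 4(b). The class boundaries are the ranges quoted in the text. The function names, the 'below TD' label, the STY label for the 41.5-50.9 m s−1 class, and the treatment of exactly 51.0 m s−1 as SuperTY are assumptions made for this illustration.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds (m s-1) of each class, from the ranges quoted in the text.
LOWER_BOUNDS = [10.8, 17.2, 24.5, 32.7, 41.5, 51.0]
# 'below TD' and the STY label for 41.5-50.9 m s-1 are assumptions.
LABELS = ["below TD", "TD", "TS", "STS", "TY", "STY", "SuperTY"]


def intensity_category(wind_speed):
    """Map a maximum sustained wind speed (m s-1) to an intensity class."""
    return LABELS[bisect_right(LOWER_BOUNDS, wind_speed)]


def category_frequencies(wind_speeds):
    """Relative frequency of each intensity class present in a sample."""
    total = len(wind_speeds)
    if total == 0:
        return {}
    counts = Counter(intensity_category(v) for v in wind_speeds)
    return {label: counts[label] / total for label in LABELS if counts[label]}


# Toy values (not from the paper): a TS forecast as TD, a SuperTY as STY.
observed = [15.0, 20.0, 28.0, 35.0, 45.0, 55.0]
forecast = [16.0, 16.5, 27.0, 36.0, 44.0, 49.0]
print(category_frequencies(observed))
print(category_frequencies(forecast))
```

Comparing the two frequency dictionaries shows how a modest underestimate near a class boundary shifts a sample into the neighboring weaker class.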
When a TC makes landfall, the transition from ocean to land alters the TC's energy source, encountering friction and changes in terrain, thereby

Conclusions
TCs represent some of the most formidable natural disasters, underscoring the pivotal significance of precise TCIFs. Despite achieving notable forecasting accuracy levels, DL-based methods continue to grapple with certain challenges. This paper verifies the advantageous role of SST and IR imagery in enhancing TCIF. By merging SST and IR data with atmospheric factors, the performance of the DL model improves by 8.0%. Given the highly intricate interactions governing the nonlinear dynamics of TCs involving multiple factors, this research introduces a model design that explicitly accounts for inter-factor dependencies. This design augments the model's capacity to capture the detailed evolution of TC intensity, effectively reducing intensity forecast errors. Furthermore, integrating model-guided knowledge during the modeling process mitigates the interference from environmental 'noise', subsequently improving both learning speed and model performance.
The MK-guided TCIF-fusion model delivers a 24 h forecast error of 3.56 m s−1 for Northwest Pacific TCs spanning 2020-2021. This method is comparable to, or surpasses, traditional and DL-based TC intensity prediction methods.
Given the complexity of multi-factor interactions inherent in the TC phenomenon, the tailored MK-guided TCIF-fusion model augments the comprehension of inter-factor relationships and effectively mitigates environmental noise interference in forecasting. This model framework is relevant not only to TCIFs but also to TC track forecasting, rainfall forecasting, and similar tasks.
Regrettably, similar to existing DL-based methods, this study relies on reanalysis data that is not available for operational forecasting. Developing purely satellite image-based TC forecasting models is an impending challenge. Furthermore, considering the intricate and dynamic nature of TC processes, integrating physical constraints or prior knowledge into DL models is a promising direction for future research within this domain.

The steps of the MK-guided training strategy (supplementary figure 3) are: (1) training a TCIF-fusion model, denoted as the TCIF-initial model; (2) utilizing Grad-CAM to produce heatmaps for the training, validation, and test datasets; (3) creating new input data by element-wise multiplication of the absolute value of the heatmaps with the corresponding original input data points (figure 2); (4) training a new TCIF-fusion model using the augmented dataset, thereby obtaining a TCIF-fusion model (the MK-guided TCIF-fusion model) that is guided by the acquired MK. Note that both the TCIF-initial model and the MK-guided TCIF-fusion model employ the same training, validation, and test datasets, ensuring the absence of any data leakage issue.
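Step (3) above, weighting each input field by the absolute value of its heatmap, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `apply_model_knowledge` and the toy values are hypothetical, plain nested lists stand in for the gridded fields to keep the sketch dependency-free, and the Grad-CAM heatmap is taken as given rather than computed.

```python
def apply_model_knowledge(field, heatmap):
    """Weight a 2-D input field by |heatmap|, element by element.

    Assumption: `field` and `heatmap` are nested lists of equal shape,
    with the heatmap already produced (e.g. by Grad-CAM) and resized to
    the grid of the input field.
    """
    same_rows = len(field) == len(heatmap)
    same_cols = all(len(f) == len(h) for f, h in zip(field, heatmap))
    if not (same_rows and same_cols):
        raise ValueError("field and heatmap must have the same shape")
    # The absolute value keeps negative heatmap weights from flipping signs.
    return [
        [value * abs(weight) for value, weight in zip(f_row, h_row)]
        for f_row, h_row in zip(field, heatmap)
    ]


# Toy 2 x 2 example (values are illustrative, not from the paper).
wind_field = [[10.0, 20.0], [30.0, 40.0]]
heatmap = [[0.5, -1.0], [0.0, 0.25]]
weighted = apply_model_knowledge(wind_field, heatmap)
print(weighted)  # [[5.0, 20.0], [0.0, 10.0]]
```

Grid points with near-zero heatmap weight are suppressed in the new input, which is how the weighting steers the retrained model toward the regions the initial model found informative.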
TCs are complex weather phenomena influenced by multiple factors. Most existing DL-based methods have mainly considered atmospheric factors (such as HIS, U, V, W), while Zhang et al (2022) and Ma et al (2023) have included oceanic factors (SST). Previous studies indicate that IR data can depict TC morphology and convective activities and are commonly used for rapid intensification forecasting. However, no DL-based TC intensity forecasting method has incorporated IR. This section compares the performance of the MK-guided TCIF-fusion model under varying inputs. As shown in table 1 and figure 3(a), the basic model employs HIS, U, V, and W as inputs, and SST and IR inputs are then introduced individually. Table 1 and figure 3(a) depict the average absolute errors of

Figure 2. The process of MK-guided generation of new input data. The deep blue dots in the SST represent processed NaN values; please refer to the supplementary information section S1 for details.
As shown in figure 3(b), the solid and dashed red lines denote the training and validation loss values of the model with the original input data. Correspondingly, the solid and dashed blue lines represent the training and validation loss values of the MK-guided TCIF-fusion model. The results show that the TCIF-fusion model without MK guidance converges after 40 training epochs, while the MK-guided TCIF-fusion model converges within just 20 training epochs and reaches even lower loss values.

Figure 3. TCIF-fusion model 24 h forecast performance: (a) MK-guided TCIF-fusion model performance with different inputs, (b) the loss curves for models with or without MK, (c) the TCIF-fusion model without MK, and (d) the TCIF-fusion model with MK.

Figure 4. Forecast results of the MK-guided TCIF-fusion model for TCs of different intensities: (a) forecast error at different intensities, (b) forecast frequency at different intensities.

Table 1. The mean absolute errors (MAE, m s−1) over the Northwest Pacific test data (2020-2021) of 24 h TC intensity forecasting with different model inputs.

References
Baik J-J and Paek J-S 2000 A neural network model for predicting typhoon intensity J. Meteorol. Soc. Japan II 78 857-69
Bao S, Zhang Z, Kalina E and Liu B 2022 The use of composite GOES-R satellite imagery to evaluate a TC intensity and vortex structure forecast by an FV3GFS-based hurricane forecast model Atmosphere 13 126
Bhatia K et al 2022 A potential explanation for the global increase in tropical cyclone rapid intensification Nat. Commun.
Ma D, Wang L, Fang S and Lin J 2023 Tropical cyclone intensity prediction by inter- and intra-pattern fusion based on multi-source data Environ. Res. Lett. 18 014020
Ma L-M 2014 Research progress on China typhoon numerical prediction models and associated major techniques Prog.
Ren Y and Li X 2023 Predicting the daily sea ice concentration on a sub-seasonal scale of the pan-Arctic during the melting season by a deep learning model IEEE Trans. Geosci. Remote Sens. 61 1-15
Saha S et al 2014 The NCEP climate forecast system version 2 J. Clim. 27 2185-208
Sandery P, Brassington G, Craig A and Pugh T 2010 Impacts of ocean-atmosphere coupling on tropical cyclone intensity change and ocean prediction in the Australian region Mon. Weather Rev. 138 2074-91
Selvaraju R R, Cogswell M, Das A, Vedantam R, Parikh D and Batra D 2020 Grad-CAM: visual explanations from deep networks via gradient-based localization Int. J. Comput. Vis. 128 336-59
Shen D, Bao S, Pietrafesa L J and Gayes P 2022 Improving numerical model predicted float trajectories by deep learning Earth Space Sci. 9 e2022EA002362
Steranka J, Rodgers E B and Gentry R C 1986 The relationship between satellite measured convective bursts and tropical cyclone intensification Mon. Weather Rev. 114 1539-46
Su H, Wu L, Jiang J H, Pai R, Liu A, Zhai A J, Tavallali P and DeMaria M 2020 Applying satellite observations of tropical cyclone internal structures to rapid intensification forecast with machine learning Geophys. Res. Lett. 47 e2020GL089102
Tang B and Emanuel K 2010 Midlevel ventilation's constraint on tropical cyclone intensity J. Atmos. Sci. 67 1817-30
Tian W, Zhou X, Niu X, Lai L, Zhang Y and Sian K T C L K 2022 A lightweight multitask learning model with adaptive loss balance for tropical cyclone intensity and size estimation IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 16 1057-71
Vecchi G A and Soden B J 2007 Effect of remote sea surface temperature change on tropical cyclone potential intensity Nature 450 1066-70
Wang C and Li X 2023a A deep learning model for estimating tropical cyclone wind radius from geostationary satellite infrared imagery Mon. Weather Rev. 151 403-17
Wang C, Xu Q, Cheng Y, Pan Y and Li H 2022 Ensemble forecast of tropical cyclone tracks based on deep neural networks Front. Earth Sci. 16 671-7
Wang H, Hu S and Li X 2023 An interpretable deep learning ENSO forecasting model Ocean Land Atmos. Res. 2 0012
Wang H and Li X 2023b Deepblue: advanced CNN applications for ocean remote sensing IEEE Geosci. Remote Sens. Mag. 2-25
Wang S and Toumi R 2021 Recent migration of tropical cyclones toward coasts Science 371 514-7
Wang S and Toumi R 2022 An analytic model of the tropical cyclone outer size npj Clim. Atmos. Sci. 5 46
Weber H C 2003 Hurricane track prediction using a statistical ensemble of numerical models Mon. Weather Rev. 131 749-70
Woodruff J D, Irish J L and Camargo S J 2013 Coastal flooding by tropical cyclones and sea-level rise Nature 504 44-52
Xu G, Lin K, Li X and Ye Y 2022 SAF-Net: a spatio-temporal deep learning method for typhoon intensity prediction Pattern Recognit. Lett. 155 121-7
ocean anomaly, air sea fluxes and the rapid intensification of tropical cyclone Nargis (2008) Geophys. Res. Lett. 36