Identifying the Physical Origin of Gamma-Ray Bursts with Supervised Machine Learning

The empirical classification of gamma-ray bursts (GRBs) into long and short GRBs based on their durations is firmly established. This empirical classification is generally linked to a physical classification into GRBs originating from compact binary mergers and GRBs originating from massive star collapses, or Type I and Type II GRBs, with the majority of short GRBs belonging to Type I and the majority of long GRBs belonging to Type II. However, there is significant overlap in the duration distributions of long and short GRBs. Furthermore, some intermingled GRBs, i.e., short-duration Type II and long-duration Type I GRBs, have been reported. A multiparameter classification scheme of GRBs is evidently needed. In this paper, we seek to build such a classification scheme with supervised machine learning methods, chiefly XGBoost. We utilize the GRB Big Table and Greiner's GRB catalog and divide the input features into three subgroups: prompt emission, afterglow, and host galaxy. We find that the prompt emission subgroup performs the best in distinguishing between Type I and II GRBs. We also find the most important distinguishing features in prompt emission to be T 90, the hardness ratio, and the fluence. After building the machine learning model, we apply it to the currently unclassified GRBs to predict their probabilities of belonging to either class, and we assign each GRB its most probable class as its possible physical class.


INTRODUCTION
Dating from the early days of gamma-ray burst (GRB) study, a clear bimodal distribution was identified in their durations (Kouveliotou et al. 1993). Two classes of GRBs were then proposed based on their durations, namely long GRBs (LGRBs) and short GRBs (SGRBs). The commonly used criterion is based on T 90, the time within which 90% of the fluence of the GRB is observed, with the dividing point set at T 90 = 2 s.
LGRBs are thought to be produced by the core collapse of massive stars (Woosley 1993), a theory subsequently supported by direct observational evidence of the association of some LGRBs with Type Ic supernovae (Galama et al. 1998; Woosley & Bloom 2006). SGRBs are thought to originate from compact star mergers (Eichler et al. 1989), a theory supported by the multi-messenger observations of the binary neutron star merger event GW170817/GRB 170817A (Abbott et al. 2017a,b,c; Goldstein et al. 2017; Zhang et al. 2018).
However, some GRBs defy this duration-based dichotomy: short-duration bursts of collapsar origin and long-duration bursts of merger origin have both been reported. The existence of these "intermingled" GRBs challenges the practice of classifying GRBs solely based on duration, as well as the names "long" and "short" GRBs. It is thus apparent that more sophisticated classification criteria involving multiple observational parameters are needed. Throughout this study, we refer to the GRB classes by their physical origins, namely Type I for compact merger GRBs and Type II for collapsar GRBs, following the classification scheme of Zhang (2006) and Zhang et al. (2009). Many other schemes for GRB classification have also been put forward (e.g., Zhang et al. 2009; Lü et al. 2010; Zhang et al. 2012; Bromberg et al. 2013; Lü et al. 2014; Yang et al. 2016; Li et al. 2016; Kulkarni & Desai 2017; Li et al. 2020; Minaev & Pozanenko 2020), yet the classification of long and short GRBs is still largely based on community consensus, and there is a lack of objective classification models with minimal human interference.
This is where machine learning comes in handy. Capable of automatically generating results without human input after training, machine learning can help us fathom the differences between Type I and II GRBs, as well as aid the classification of newly discovered GRBs. Machine learning has already been widely adopted in the study of GRBs (e.g., Horváth et al. 2006; Řípa et al. 2012; Huertas-Company et al. 2015; Tarnopolski 2015; Modak et al. 2018; Horváth et al. 2019; Jespersen et al. 2020; Salmon et al. 2022a; Modak 2021; Salmon et al. 2022b; Tarnopolski 2022; Bhave et al. 2022; Steinhardt et al. 2023). However, the above-mentioned studies predominantly use unsupervised machine learning methods, where only the observed features of the GRBs are input into the models, but not the labels (the GRBs' physical classes, Type I or II). The other type of machine learning method, supervised learning, is also commonly employed by astronomers in the classification of other astronomical objects (e.g., Connor & van Leeuwen 2018; Villa-Ortega et al. 2022; Butter et al. 2022; de Beurs et al. 2022; Yang et al. 2022a; Coronado-Blázquez 2022; Kaur et al. 2023; Fan et al. 2022; Luo et al. 2023; Zhu-Ge et al. 2023), although studies applying supervised methods to GRBs are scarce. Since supervised methods take both features and labels as input, and can produce deterministic predictions of the classes of new GRBs, they can be helpful in identifying the true physical origin of intermingled GRBs.
In this study, we apply supervised machine learning methods to the classification of Type I and II GRBs. The machine learning model we use is the eXtreme Gradient Boosting (XGBoost) classifier (Chen & Guestrin 2016). We employ XGBoost because it is one of the most popular and successful machine learning frameworks to date and can handle missing input values natively. The GRB catalog we use contains many missing values, so the ability to handle them is vital. In Section 2, we introduce the GRB catalogs and the machine learning methods we use. In Section 3, we present the classification results and feature importance from the machine learning models. In Section 4, we predict the classes of the unclassified GRBs. Finally, in Section 5, we put forward our conclusions and discuss the classifications of some recently discovered possible intermingled GRBs.

DATA AND METHODS
We use an updated version of the GRB Big Table (Wang et al. 2020; Wang et al., in prep.), which contains 7179 GRBs ranging from 1991 April 21 to 2021 July 8. Greiner's GRB catalog (https://www.mpe.mpg.de/~jcg/grbgen.html), on the other hand, has 2261 GRBs in the same time range. We match the two catalogs, requiring T 90 of the selected GRBs in the Big Table to be known, and we label the GRBs based on their labels in Greiner's catalog. GRBs with 'S' at the end of their names are marked as Type I GRBs, while the others are marked as Type II GRBs. We also adopt the consensus classification of some intermingled GRBs: Type II GRB 090426 (Antonelli et al. 2009; Guelbenzu et al. 2011), Type I GRB 060505 (Fynbo et al. 2006), and Type I GRB 060614 (Fynbo et al. 2006; Gal-Yam et al. 2006; Gehrels et al. 2006; Zhang et al. 2007). This leaves us with 144 Type I and 1761 Type II GRBs. We acknowledge that this matching method substantially reduces the size of our sample, but the unmatched GRBs do not have many known features to begin with, so we do not discard much information.
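The matching and labeling step above can be sketched with pandas. This is a hypothetical illustration: the column names ("name", "t90") and the toy catalog entries are assumptions for demonstration, not the actual schema of the Big Table or Greiner's catalog, and the consensus reclassification of individual bursts would be applied afterward.

```python
# Hypothetical sketch of the catalog matching and labeling step.
# Column names ("name", "t90") are illustrative assumptions.
import pandas as pd

def match_and_label(big_table: pd.DataFrame, greiner: pd.DataFrame) -> pd.DataFrame:
    """Match the two catalogs on GRB name and label by Greiner's 'S' suffix."""
    # Keep only Big Table entries with a measured T90
    big = big_table.dropna(subset=["t90"])
    # Greiner names ending in 'S' denote short (Type I) bursts
    greiner = greiner.copy()
    greiner["label"] = greiner["name"].str.endswith("S").map({True: "I", False: "II"})
    # Strip the suffix so the names match the Big Table convention
    greiner["name"] = greiner["name"].str.rstrip("S")
    return big.merge(greiner[["name", "label"]], on="name", how="inner")

big = pd.DataFrame({"name": ["050509B", "060614", "080319B"],
                    "t90": [0.073, 102.0, None]})
grn = pd.DataFrame({"name": ["050509BS", "060614", "080319B"]})
matched = match_and_label(big, grn)
print(matched)  # two rows: 050509B labeled I, 060614 labeled II
```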
The classification labels of our model are based on Greiner's catalog, which collects the community consensus based on both T 90 and afterglow/host galaxy information as presented in the literature. It is possible that a small fraction of the bursts is misclassified, but the very strength of our machine learning model is that it considers all the classifications of the input training sample. If a few GRBs are wrongly classified, they will not have a significant impact on the overall accuracy.
In this study, we pay special attention to the intermingled GRBs. We define intermingled GRBs as GRBs classified as Type I in Greiner's catalog but with T 90 > 2 s in the Big Table, or GRBs classified as Type II in Greiner's catalog but with T 90 < 2 s. There are 21 intermingled Type I GRBs and 59 intermingled Type II GRBs in our sample.
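The intermingled-GRB definition above is simple enough to express directly; a minimal sketch (T 90 in seconds, labels "I"/"II" following Greiner's catalog):

```python
# Minimal sketch of the intermingled-GRB definition used above.
def is_intermingled(label: str, t90: float) -> bool:
    """Type I with T90 > 2 s, or Type II with T90 < 2 s."""
    return (label == "I" and t90 > 2.0) or (label == "II" and t90 < 2.0)

print(is_intermingled("I", 102.0))   # True  (a long-duration Type I)
print(is_intermingled("II", 0.5))    # True  (a short-duration Type II)
print(is_intermingled("II", 30.0))   # False
```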
We also scrutinize the possible third, intermediate GRB type proposed by some studies (e.g., Horvath 1998; Mukherjee et al. 1998; Hakkila et al. 2000; Balastegui et al. 2001; Hakkila et al. 2003; Horváth et al. 2006; Chattopadhyay et al. 2007; Horváth et al. 2008; Huja et al. 2009; Řípa et al. 2009; Veres et al. 2010; Horváth et al. 2010; Řípa et al. 2012; Koen & Bere 2012; Zitouni et al. 2015; Kulkarni & Desai 2017; Horváth et al. 2018) based on the T 90 distributions of GRBs, by creating an intermediate sample consisting of GRBs with T 90 within 1-4 s. There are 31 intermediate Type I and 100 intermediate Type II GRBs in our sample.

We then divide the features in the Big Table into three subgroups: prompt emission, afterglow, and host galaxy. Three subsamples are subsequently created by requiring each GRB in a subsample to have at least one feature other than T 90 known in the corresponding feature group. We also divide each subsample into training and test sets with a 7:3 ratio, while keeping the ratio of Type I to Type II GRBs the same in the two sets.
The training sets are used to train the machine learning model, while the test sets are used to test the performance of the model after it is trained.
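The stratified 7:3 split can be sketched with scikit-learn, assuming a feature matrix X and a label vector y built from the catalogs (here replaced by synthetic placeholders):

```python
# Sketch of the stratified 7:3 train/test split.
# X and y are synthetic stand-ins for the real catalog data.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # placeholder features
y = np.array([1] * 10 + [0] * 90)      # ~1:9 class imbalance, 1 = Type I

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# stratify=y preserves the Type I : Type II ratio in both sets
print(y_tr.mean(), y_te.mean())  # both 0.10 for this toy sample
```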
While it is common practice to impute missing values with some algorithm, we find that imputation introduces false information into the feature importance we later calculate, as also suggested by other studies (e.g., Seijo-Pardo et al. 2019; Yu et al. 2022). Since the XGBoost classifier (Chen & Guestrin 2016) can handle missing values automatically, we simply input our data without imputation.
We then note that the Type I and Type II GRBs in our sample are significantly imbalanced, by a ratio of ∼ 1:10. Because this apparent ratio could be caused by selection effects, we should not introduce it into our training data. However, the commonly used synthetic minority over-sampling technique (SMOTE; Chawla et al. 2002) cannot be applied to data with missing values. Instead, we assign different sample weights to the two classes, calculated with the balanced sample weighting implemented in scikit-learn (Pedregosa et al. 2011):

w_i = N / (k n_i), (1)

where w_i is the sample weight of the ith class, N is the total number of data points, k is the number of classes (2 in this study), and n_i is the number of data points in the ith class. Finally, we input the training sets into the XGBoost classifier to train the machine learning model. After training, we use the test set and the commonly used F 1 score (van Rijsbergen 1979; Sasaki 2007) to assess the performance of our models. A more intuitive metric, accuracy, is disfavored here because our data are imbalanced: a model that simply predicts all GRBs as Type II can still score 92% accuracy.
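The balanced weighting w_i = N / (k n_i) takes only a few lines; this sketch mirrors what scikit-learn's compute_sample_weight("balanced", y) returns:

```python
# Sketch of the "balanced" class-weighting scheme, w_i = N / (k * n_i).
from collections import Counter

def balanced_sample_weights(y):
    counts = Counter(y)
    n, k = len(y), len(counts)
    class_w = {c: n / (k * m) for c, m in counts.items()}
    return [class_w[c] for c in y]

y = ["II"] * 9 + ["I"]          # ~1:9 imbalance, as in our sample
w = balanced_sample_weights(y)
print(w[0], w[-1])  # 0.555... for each Type II, 5.0 for the Type I
```

The resulting weights up-weight the rare Type I bursts so that both classes contribute equally to the training loss.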
To calculate the F 1 score, we first consider two metrics commonly used in evaluating the performance of machine learning models, precision and recall:

• Precision

Precision = True positives / (True positives + False positives). (2)

Precision measures how many of the items predicted as positive by the model (in this study, Type I GRBs) are true positives.
• Recall

Recall = True positives / (True positives + False negatives). (3)

Recall measures how many of the originally positive items are correctly predicted as positive by the model.
The F 1 score is then calculated as the harmonic mean of the two metrics:

F 1 = 2 × Precision × Recall / (Precision + Recall). (4)

The resulting F 1 score is a value between 0 and 1, with 0 meaning total failure in predicting the correct labels for the test set, and 1 meaning 100% accuracy. Since the F 1 score is a stricter metric than accuracy, F 1 scores are usually significantly smaller than accuracy scores calculated on the same classification results.
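Equations (2)-(4) reduce to a few lines of code; a sketch with Type I as the positive class and made-up confusion-matrix counts:

```python
# Precision, recall, and F1 from confusion-matrix counts
# (Type I = positive class), as in equations (2)-(4).
def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# e.g. 30 Type I correctly found, 10 false alarms, 10 missed:
print(round(f1_score(30, 10, 10), 3))  # 0.75
```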
To test which input features have the best capability of distinguishing between Type I and II GRBs, we use SHapley Additive exPlanations (SHAP; Lundberg & Lee 2017; Lundberg et al. 2020) to calculate the feature importance of the input features. For each data point, SHAP estimates the contribution of each input feature toward the output result in the form of SHAP values. Readers can refer to the above-mentioned references for a more detailed mathematical description of SHAP. In contrast to the also commonly used permutation feature importance (Breiman 2001; Altmann et al. 2010; Fisher et al. 2019), which generates a single feature importance value for a feature across all data points, SHAP can analyze the prediction contribution of features on individual data points. When the SHAP values from all the data points are combined, SHAP shows not only the importance of the input features, but also in which direction the feature values of each input data point pull the final output.
Figures 1d and 1e show an example of the results from SHAP. In Figure 1d, the absolute SHAP values of the individual data points are averaged for each feature. The length of each bar in the figure shows how important each feature is to the prediction result in general. Figure 1e, on the other hand, shows the individual SHAP values of each feature for each data point. In this beeswarm plot, the X-axis shows the SHAP values, with higher SHAP values leaning toward Type I and lower SHAP values leaning toward Type II. The feature values of each data point are also shown by the color of the points, so that readers can see in which direction a higher or lower value of one feature pulls the prediction.
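The bar-style importance of Figure 1d is simply the mean absolute SHAP value per feature. A sketch of that aggregation with a made-up SHAP value array (the numbers and feature list are illustrative, not our actual results):

```python
# Aggregating per-sample SHAP values into a global importance ranking:
# the bar length in a SHAP bar plot is mean(|SHAP value|) per feature.
import numpy as np

shap_values = np.array([[ 0.8, -0.1,  0.05],   # shape (n_samples, n_features)
                        [-0.6,  0.2, -0.05],
                        [ 0.7, -0.3,  0.10]])
feature_names = ["T90", "HR", "fluence"]

mean_abs = np.abs(shap_values).mean(axis=0)
ranking = [feature_names[i] for i in np.argsort(mean_abs)[::-1]]
print(ranking)  # ['T90', 'HR', 'fluence']
```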

Prompt emission
Many studies suggested that adding the hardness ratio (HR) to the T 90 classification criterion to form a two-dimensional criterion yields better results (e.g., Horváth et al. 2006, 2010; Řípa et al. 2012; Zhang et al. 2012; Bhat et al. 2016; Yang et al. 2016; Horváth et al. 2018; Tarnopolski 2019; Zhang et al. 2022). Similarly, the power-law index or the peak energy E p of the prompt emission spectrum can take the place of the hardness ratio (Zhang et al. 2012; Goldstein et al. 2010; Nava et al. 2011). In general, Type I GRBs have harder spectra than Type II GRBs. Goldstein et al. (2010) further proposed classification on the E p -fluence plane. Since fluence is highly correlated with duration, this scheme also follows the HR -T 90 scheme.
Some other studies (e.g., Zhang et al. 2009, 2012; Qin & Chen 2013; Tsutsui et al. 2013; Minaev & Pozanenko 2020) suggest that the famous Amati relation (Amati et al. 2002, 2009; Kumar & Zhang 2015) between the peak energy E p and the isotropic energy E iso of GRB prompt emission is different for Type I and II GRBs, and thus the E p -E iso plane can be used to distinguish between them.
In addition, Norris & Bonnell (2006) showed that Type I GRBs exhibit negligible spectral lags, making spectral lag another useful distinguishing feature.

Association with supernovae (SNe) is also a very important distinguishing factor between Type I and II GRBs, as SN associations provide smoking-gun evidence of the GRB progenitor. However, the Big Table only contains SN association information for 22 GRBs. For GRBs without SN association information, it is unknown whether there truly was no SN associated with the GRB, whether there simply was no observation, or, most likely, whether there was an optical observation but the SN was outshone by the bright optical afterglow. We find that including SN association in our model results in significantly lower F 1 scores. While the model correctly classifies GRBs with SN detections as Type II, including SN information also makes the model more likely to classify GRBs without SN detections as Type I. Since most GRBs in our data lack SN detections because they are too far away for SN detection, and because there are more Type II GRBs than Type I, including SN information yields worse results. Furthermore, the model without SN information can correctly classify almost all SN-associated GRBs as Type II. Therefore, we do not include SN information in our model.
With the prompt emission subgroup, we are able to obtain an F 1 score of 0.758 on the test set, 0.667 on the intermingled GRBs, and 0.821 on the intermediate GRBs. The corresponding confusion matrices and feature importance are shown in Figure 1. Our model can predict most GRBs correctly based on prompt emission data. T 90 is the most prominent feature, with a feature importance much higher than the other features, and a shorter T 90 pulls the predictions toward Type I. Since the intermingled GRBs are the ones that defy the classification based on T 90, the major features that cause their classifications to differ will be the features with high feature importance other than T 90. The same holds true for the intermediate GRBs: if T 90 cannot classify them clearly, they will be classified based on the other important features. However, when we remove T 90 from the prompt emission subgroup and carry out the same analysis, while we get a lower F 1 score of 0.581 on the test set as expected, we also get a higher F 1 score of 0.833 on the intermingled GRBs. The F 1 score for the intermediate GRBs remains at a similar value of 0.888. The corresponding confusion matrices and feature importance are shown in Figure 2. This shows that T 90 can be misleading to the machine learning model for intermingled GRBs, and multiple observational parameters are needed for more accurate classification of GRBs.
We also find the fluence F g and the hardness ratio HR to be the most important features after T 90. A lower fluence and a higher HR pull the predictions toward Type I. Since fluence is directly related to duration, our results confirm the findings of other studies.
In order to measure the importance of other features, we further exclude fluence and hardness ratio from our feature group and carry out the same machine learning analysis. We obtain an F 1 score of 0.485 on the test set, 0.815 on the intermingled GRBs, and 0.780 on the intermediate GRBs. The corresponding confusion matrices and feature importance are shown in Figure 3. While the overall F 1 score drops again, the F 1 scores for the intermingled and intermediate samples remain high. The most important features are again related to the spectral shape, such as E p cpl, E iso, alpha spl, and alpha cpl. The flux-related features L pk, F pk1, and P pk4 are also important, as are the redshift and the spectral lag. Generally, a harder spectrum, a lower flux, a shorter spectral lag, and a lower redshift pull the predictions toward Type I.

Afterglow
Previous studies pointed out that afterglows of Type I GRBs mostly have lower X-ray luminosities and energies, and that the X-ray luminosities and energies of Type I GRBs also decay faster. There are also correlations among the afterglow X-ray energy, the X-ray afterglow luminosity, the prompt emission isotropic energy E iso, the peak luminosity L p, and the peak energy E p. Combined with the findings mentioned in Section 3.1, the X-ray afterglow luminosity can also be employed for GRB classification. Kann et al. (2011) found that, similar to the X-ray band, optical afterglows of Type I GRBs are significantly fainter than those of Type II GRBs, and that similar afterglow-prompt emission correlations also exist in the optical band.

With the afterglow subgroup, we are able to obtain an F 1 score of 0.353 on the test set, 0.857 on the intermingled GRBs, and 0.75 on the intermediate GRBs. The corresponding confusion matrices and feature importance are shown in Figure 4. We find the most important feature to be the 11-hour beta index in X-rays. The 11-hour fluxes in the X-ray and optical bands are also important. A higher beta index and lower X-ray and optical fluxes pull the predictions toward Type I, consistent with other studies. In general, we find that afterglow features perform poorly in GRB classification.

Host galaxy
The different progenitors of Type I and II GRBs also correlate substantially with the properties of their host galaxies. The short lifetime of Type II GRB progenitors (Woosley et al. 2002) makes their event rate generally follow the star formation rate (SFR) of the host galaxies, and Type II GRB host galaxies generally have higher SFRs (Bloom et al. 2002; Chary et al. 2007; Savaglio et al. 2009; Levesque et al. 2010a; Robertson & Ellis 2011; Levesque 2014; Wei et al. 2014; Trenti et al. 2015; Cucchiara et al. 2015; Lan et al. 2022). The redshift distribution of Type I GRBs is found to be delayed with respect to the star formation history, and thus host galaxies of Type I GRBs generally have lower SFRs (Piran 1992; Nakar et al. 2006; Zheng & Ramirez-Ruiz 2007; Virgili et al. 2011; Wanderman & Piran 2015; Luo et al. 2022).
Type II GRBs usually occur in regions with active star formation and are therefore closer to the centers of their galaxies and in brighter regions. Type I GRBs, however, have larger offsets from the galactic center, as the evolution of compact binary mergers requires supernova events that "kick" the binary system away from the location where it formed (Bloom et al. 2002; Fruchter et al. 2006; Fong et al. 2013; Blanchard et al. 2016; Wang et al. 2018; Li et al. 2020; O'Connor et al. 2022; Fong et al. 2022).
With the host galaxy subgroup, we are able to obtain an F 1 score of 0.615 on the test set, 0.909 on the intermingled GRBs, and 0.933 on the intermediate GRBs. The corresponding confusion matrices and feature importance are shown in Figure 5. We find the most important feature to be the offset, with a higher offset pulling the predictions toward Type I. In order to find other important features, we also carry out the same analysis on the host galaxy subgroup without the offset, obtaining an F 1 score of 0.56 on the test set, 0.909 on the intermingled GRBs, and 0.857 on the intermediate GRBs. The corresponding confusion matrices and feature importance are shown in Figure 6. The dust extinction A V, the stellar mass, and the star formation rate (SFR) are fairly important. Stronger dust extinction, a higher stellar mass, and a lower SFR pull the predictions toward Type I.

All
We also combine all the feature subgroups to form an "all" group, and train and test our machine learning model with this group containing all the features. With all the features, we obtain an F 1 score of 0.8 on the test set, 0.649 on the intermingled GRBs, and 0.870 on the intermediate GRBs. The corresponding confusion matrices and feature importance are shown in Figure 7. The most important features all come from the prompt emission subgroup, which shows that prompt emission data are the most important in GRB classification.

Comparing the feature subgroups
Because the training and test set splitting process introduces randomness to the results, F 1 scores from a single trial may not fully reflect the ability of each feature subgroup to distinguish Type I and II GRBs. Therefore, we repeat the random splitting and training process 1000 times, and record the F 1 scores of each feature subgroup on the entire test set and on the intermingled GRBs.
We report the average F 1 scores, along with their spreads over the 1000 trials, for each feature subgroup on the entire test set and the intermingled GRBs in Table 4. We find that the prompt emission subgroup performs the best in predicting Type I and II GRBs, while the average F 1 score of the afterglow subgroup is significantly lower. Using all the features only marginally improves the performance of the model. The host galaxy subgroup comes in between the other two. However, prompt emission including T 90 performs the worst on the intermingled GRBs. Also, all the feature subgroups perform reasonably well on the intermediate GRBs, which indirectly rejects the existence of a third, intermediate GRB type.
Note that the intermingled and intermediate GRBs form a smaller sample that usually gets more attention from the scientific community than GRBs in general. Among the 32 features we use in this study (requiring each GRB to have T 90 and at least one other feature known), the general GRB sample has on average 2.5 known features, while the intermingled and intermediate samples have on average 9.3 and 9.4 known features, respectively.
We also compare the performance of our model with the traditional way of classifying GRBs on the T 90 -HR plane by building a decision tree (e.g., Breiman et al. 1984; Timofeev 2004; Loh 2011, 2014) with T 90 and hardness ratio as input. We use the implementation in scikit-learn and set the maximum depth of the decision tree to 3. We use this model instead of the more sophisticated XGBoost model employed in the other parts of this study because we think the decision tree better reflects the classification ability of a human scientist.
With this decision tree model, we achieve an F 1 score of 0.761 on all GRBs, 0.125 on the intermingled GRBs, and 0.667 on the intermediate GRBs. Example confusion matrices are shown in Figure 8. Comparing the average F 1 scores from multiple trials listed in Table 4, we find that while the performance of our XGBoost multi-parameter model and the simple decision tree model are comparable on all the GRBs, the multi-parameter model performs significantly better on the intermingled and intermediate GRBs. This shows that our new classification method is an improvement over the traditional one, especially for the intermingled and intermediate GRBs.
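The decision-tree baseline can be sketched as below. The two-feature training data here are synthetic stand-ins for log T 90 and log HR values, not our actual sample, and the class weighting is an assumption for this toy example:

```python
# Sketch of the depth-3 decision-tree baseline on the T90-HR plane,
# trained on synthetic data (real values come from the matched catalogs).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
# Columns: log10(T90), log10(HR).
# Class 1: short/hard bursts; class 0: long/soft bursts.
X1 = np.column_stack([rng.normal(-0.3, 0.4, 50), rng.normal(0.8, 0.2, 50)])
X0 = np.column_stack([rng.normal(1.5, 0.5, 300), rng.normal(0.3, 0.2, 300)])
X = np.vstack([X1, X0])
y = np.array([1] * 50 + [0] * 300)

tree = DecisionTreeClassifier(max_depth=3, class_weight="balanced",
                              random_state=0).fit(X, y)
print(tree.score(X, y))  # near 1.0 on this cleanly separated toy sample
```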

PREDICTING UNCLASSIFIED GRBS
After building the models, we move on to predict the classes of the unclassified GRBs in the Big Table. Since the all-feature subgroup achieved the best performance, we use all the features to train our model and predict the classes.
We train the model using the same method described in Section 2 on all the classified GRBs with T 90 and at least one feature we intend to use known, and use the trained model to predict the probabilities of the unclassified GRBs being either class. We also require the unclassified GRBs to have T 90 and at least one feature known. 1455 GRBs are used for training, and the class probabilities of 2809 unclassified GRBs are predicted. For each unclassified GRB, the class predicted with the highest probability is assigned as its class. 2181 GRBs are predicted as Type II, while 628 GRBs are predicted as Type I. The prediction results are listed in Table 5. We graph the probability distribution of the unclassified GRBs being Type II in Figure 9. To compare our results with the traditional method of classifying GRBs on the T 90 -hardness ratio plane, we also plot our prediction results on that plane in Figure 10.
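Assigning each burst its most probable class is a simple argmax over the predicted probabilities; a sketch with made-up probability rows (the numbers are illustrative, not model outputs):

```python
# Assigning each unclassified GRB its most probable class from
# predicted probabilities; columns are [p_I, p_II].
import numpy as np

proba = np.array([[0.13, 0.87],
                  [0.98, 0.02],
                  [0.55, 0.45]])
classes = np.array(["I", "II"])
predicted = classes[np.argmax(proba, axis=1)]
print(predicted)  # ['II' 'I' 'I']
```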

CONCLUSIONS AND DISCUSSIONS
In this paper, we applied supervised machine learning methods, mainly XGBoost, to the classification of Type I and II GRBs. We reach the following conclusions:

• Classifying GRBs solely based on T 90 can yield unsatisfactory results, especially for intermingled GRBs. Criteria based on multiple observational parameters are needed.
• Compared with traditional GRB classification methods, the machine learning method can effectively classify GRBs, especially intermingled and intermediate ones.

• We find that the best feature group for predicting whether a GRB is Type I or II is prompt emission. Among the prompt emission features, T 90 still separates Type I and II GRBs the best. Besides T 90, fluence and hardness ratio are also important features. Since fluence is correlated with T 90, this is consistent with the traditional way of classifying GRBs on the T 90 -HR plane.
• We predict the classes of some of the GRBs not present in Greiner's catalog. Their predicted classes and their probabilities of being either class are shown in Table 5.
• The methods employed in this study can be applied to future newly discovered GRBs to identify potentially peculiar GRBs in their early stages and help allocate resources for follow-up observations.The analysis code used in this study is available at https://github.com/Rigel7/grb-ml.
Recently, three possible intermingled GRBs have gained much attention from the scientific community. GRB 200826A is thought to be an intermingled Type II GRB (Zhang et al. 2021; Ahumada et al. 2021; Rossi et al. 2022), while GRB 211211A and GRB 230307A are thought to be intermingled Type I GRBs (Yang et al. 2022b; Rastinejad et al. 2022; Sun et al. 2023). We apply our trained model to these three GRBs to examine their observational properties. The features we gathered and the corresponding references for the three GRBs are listed in Tables 6, 7, and 8.
Since T 90 can be misleading for the classification of intermingled GRBs, we use all the features except T 90 to classify these three bursts. For GRB 200826A, the model predicts a 13% probability of being Type I and an 87% probability of being Type II. For GRB 211211A, the model predicts a 2% probability of being Type I and a 98% probability of being Type II. For GRB 230307A, the model predicts a 98% probability of being Type I and a 2% probability of being Type II. The SHAP values explaining how the individual features affect the prediction results are shown in Figure 11.
The prediction results for GRB 200826A and GRB 230307A from our model match the classifications of the two GRBs proposed in the literature, while the prediction result for GRB 211211A matches the classification based solely on its duration. This shows that these GRBs, particularly GRB 211211A, are truly peculiar, and more study should be done on their observational properties and physical origins. Indeed, Yang et al. (2022b) propose that GRB 211211A could originate from a white dwarf - neutron star merger, while Barnes & Metzger (2023) argue that GRB 211211A could still be explained by the normal Type II collapsar GRB model.
With its ability to not only predict the physical types of GRBs but also explain the importance of each parameter in individual classifications, our model can provide independent opinions on the classifications of possibly peculiar GRBs and help guide future observations and studies.

This work is partially supported by the Top Tier Doctoral Graduate Research Assistantship (TTDGRA) and the Nevada Center for Astrophysics at the University of Nevada, Las Vegas.
Figure 1. Examples of confusion matrices and SHAP feature importance values of the prompt emission subgroup.

Figure 4. Examples of confusion matrices and SHAP feature importance values of the afterglow subgroup.

Figure 5. Examples of confusion matrices and SHAP feature importance values of the host galaxy subgroup.

Figure 6. Examples of confusion matrices and SHAP feature importance values of the host galaxy subgroup without offset.
Figure 7. Examples of confusion matrices and SHAP feature importance values of the all features subgroup.

Figure 9. Probability distribution of the unclassified GRBs being Type II. The probability of being Type I is one minus the shown value.

Figure 10. Prediction results of the unclassified GRBs on the T 90 -hardness ratio plane.

Figure 11. SHAP values of individual feature values of GRB 200826A, GRB 211211A, and GRB 230307A. Features marked in red push the prediction toward Type I, while features marked in blue push it toward Type II. Upper: GRB 200826A. Middle: GRB 211211A. Lower: GRB 230307A.

Table 1 .
List of features used in the prompt emission subgroup. For features with multiple definitions (e.g., variability, F pk), we choose the one with the most known data. Directly measured features are listed above the horizontal line, while derived features are listed below the line.

Table 2 .
List of features used in the afterglow subgroup.

Table 3 .
List of features used in the host galaxy subgroup.

Table 4 .
List of average F1 scores and 16th/84th percentile values obtained with different feature subgroups and GRB samples.

Table 5 .
Prediction results of the unclassified GRBs. The probabilities of them being Type I or II are shown as pI and pII, respectively. This is an example of the first ten rows of the table. The full version is published in its entirety in machine-readable format.
• The fact that a supervised machine learning model trained on two classes of GRBs can effectively classify intermediate GRBs with T 90 between 1 and 4 s indirectly rejects the existence of a third, intermediate GRB class proposed based on the duration distribution.