Data-driven revision of coal spontaneous combustion risk classification for prevailing methods used in Australian mining industry

Spontaneous combustion poses persistent hazards in underground coal mining operations, threatening personnel and operations. To assess this risk in Australia, small-scale laboratory tests are conducted using the Adiabatic Oxidation (R70), Crossing Point (CPT) and Minimum Self-Heating Temperature (SHTmin) methods. The intrinsic spontaneous combustion propensity classification (ISCP) uses R70 results to establish a risk rating. This risk ranking provides guidance on the severity of spontaneous combustion, ranging from low to extremely high, aiding the implementation of preventive measures. However, the ISCP lacks supporting literature, and there is no existing literature on the correlation between the laboratory tests or on a risk matrix for CPT. This paper presents the results of a Spearman correlation study among R70, CPT and SHTmin based on a large historical database (n = 318), showing that only R70 and CPT can be used to reliably rank spontaneous combustion risk. CPT was found to be strongly correlated with R70 (ρ = -0.8875), whereas SHTmin shows a weaker correlation with R70 (ρ = -0.7265). The hierarchical clustering analysis resulted in a revised risk ranking: Low (R70 < 0.4 °C/h, CPT > 156 °C), Medium (0.4 °C/h < R70 < 3 °C/h, 132 °C < CPT < 156 °C), High (3 °C/h < R70 < 11 °C/h, 102 °C < CPT < 132 °C), and Very High (R70 > 11 °C/h, CPT < 102 °C).


Introduction
Coal remains an important primary energy source in the world due to its abundance and proximity to markets. However, spontaneous combustion of coal negatively affects the normal extraction of coal. Between 1972 and 2004, fifty-one spontaneous combustion incidents were identified in Queensland, Australia, alone [1]. Spontaneous combustion of coal occurs when sufficient oxygen is supplied for oxidation but the heat generated by the coal's low-temperature oxidation reaction is not sufficiently dissipated. As self-heating continues, the accumulated heat causes a thermal runaway and, consequently, spontaneous combustion [2]. To assess this risk, coal samples are regularly screened for their spontaneous combustion propensity by different methods, through examination of the chemical constituents of coals, thermal studies, and oxygen avidity studies [3].
The predominant methods currently adopted by the Australian mining industry for spontaneous combustion assessment are the Adiabatic Self-Heating Test (R70), the Crossing Point Test (CPT) and the Minimum Self-Heating Temperature (SHTmin) calculation. The R70 test, one of the typical adiabatic methods, defines the R70 index to express the self-heating rate in the linear part of the adiabatic self-heating curve from 40 to 70 °C [4]. The intrinsic spontaneous combustion propensity classification (ISCP) is used to establish a risk rating based on the R70 laboratory results. This risk ranking provides a guideline to spontaneous combustion severity on a scale from low to extremely high: the higher the R70, the higher the risk of spontaneous combustion [5] [6].
The CPT is defined as the temperature at which the coal temperature equals the reaction vessel temperature when the coal sample is contained in a reaction vessel placed in a programmed oven whose temperature increases at a constant rate. The higher the CPT, the lower the risk of spontaneous combustion [7].
The SHTmin is defined as the lowest temperature obtained in an adiabatic experiment demonstrating thermal runaway. SHTmin is calculated using empirical equation (1), which predicts SHTmin based on the dry ash-free oxygen content (ODAF) derived from coal quality analysis [8]. The lower the SHTmin, the higher the potential for spontaneous combustion.
(1)
These three methods can provide a quantitative assessment towards identifying and preventing spontaneous combustion. However, there are several limitations associated with these methods. Firstly, the currently adopted ISCP has minimal supporting literature. Secondly, there is no well-established CPT rating system, even though established procedures exist for CPT testing. Thirdly, the accuracy of SHTmin is in doubt, as it may provide an inaccurate and misleading indication of spontaneous combustion propensity: it is calculated from the dry ash-free oxygen content and excludes the coal's other intrinsic factors (i.e., ash, moisture, volatile matter, and fixed carbon). Coals with extremely high ash content (generally above 30%) usually possess low spontaneous combustion risk, though they may possess high dry ash-free oxygen content and consequently low SHTmin, which is indicative of a higher spontaneous combustion risk.
In this paper, spontaneous combustion testing, coal quality analysis and statistical analysis are presented to address the limitations of the current methods used for quantitative assessment of spontaneous combustion in Australia. Spearman correlation, data cleaning, principal component analysis (PCA), multivariate nonlinear regression analysis and hierarchical clustering analysis were performed on spontaneous combustion data from Simtars' spontaneous combustion testing. Additionally, a revised risk ranking matrix is proposed for both R70 and CPT, which could be applied to assess spontaneous combustion propensity in future R70 and CPT testing.

Methodology
This paper presents the data and the results of an analysis based on a large database (n = 318) obtained from Simtars' spontaneous combustion testing over the past two decades. The sample preparation and testing procedures varied slightly during this period. Therefore, the sample preparation and procedures were aligned with Simtars' current methodology.

Sample Preparation
Sample preparation was undertaken on 318 coal samples from six regions worldwide: 274 from Australia, 22 from New Zealand, 4 from Canada, 11 from the United States, 6 from Africa and 1 from Bangladesh. After sample acquisition, the samples were kept intact and stored in a freezer set to -18 °C until ready for testing in order to minimise oxidation. All sample identifiers were tokenised to allow de-identification of samples.
On the day of testing, the samples were dried on concrete out of direct sunlight. For the analysis, the samples had to be free from shale, clay, limestone, and other visible inclusions. Once the samples were visibly dry, they were halved according to their form, i.e., lengthwise for core or in half for each lump, if supplied. One half of each sample was crushed to < 4 mm using a Jaques laboratory jaw crusher (Model 127ST) and then homogenised a minimum of six times with a riffle splitter. Half of each crushed sub-sample was milled to < 250 μm using a Retsch Cross Beater mill (Model WRB 80 c/2q SIL).
A 1000 mL conical flask, equipped with a rubber stopper and modified with a gas inlet and gas outlet, was placed inside the laboratory dehydrating oven with 500 g of < 250 μm milled coal. Nitrogen at a flow rate of 250 mL/min was passed over the samples for one hour at room temperature (25 ± 5 °C), and then for 16 hours at 110 °C. The samples were allowed to return to 40 ± 2 °C before testing.

Coal Quality Analysis
The sub-samples crushed to < 4 mm were provided to an external laboratory for coal quality analysis according to ISO 11722, ISO 1171 and ISO 562 standards.

Adiabatic Self-Heating R70 Testing
One hundred fifty grams of dehydrated coal samples were transferred to a 473 mL (16 oz) Dwyer flask, equipped with a PTFE stopper, which was placed into an adiabatic oven set to 40 °C. Nitrogen at a flow rate of 50 mL/min was passed over the coal samples until the samples had equilibrated with the oven at 40 ± 0.3 °C. The oven was then switched to adiabatic mode, where 60 mL/min of oxygen was passed over the samples, and the set point of the oven was determined by the sample temperature. The testing continued until the samples passed 70 °C, 72 hours had passed, or the samples failed to show any signs of self-heating, whichever came first [9] [10].
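As an illustration of how an R70 index could be derived from a logged adiabatic trace, the following minimal sketch computes the average self-heating rate between 40 and 70 °C. The function name, the linear interpolation of the logged points and the use of a simple average rate over the interval are assumptions made for illustration; they are not taken from the test procedure described above.

```python
def r70_index(times_h, temps_c, t_start=40.0, t_end=70.0):
    """Average self-heating rate (°C/h) between t_start and t_end.

    times_h and temps_c are equal-length lists (hours, °C).
    Returns None if the trace never rises through the full interval.
    """
    points = list(zip(times_h, temps_c))

    def time_at(target):
        # Linearly interpolate the first time the trace reaches `target`.
        for (t0, y0), (t1, y1) in zip(points, points[1:]):
            if y1 > y0 and y0 <= target <= y1:
                return t0 + (target - y0) * (t1 - t0) / (y1 - y0)
        return None

    start, end = time_at(t_start), time_at(t_end)
    if start is None or end is None or end <= start:
        return None  # sample did not self-heat from t_start to t_end
    return (t_end - t_start) / (end - start)


# Hypothetical trace: 40 °C to 70 °C in 10 hours
rate = r70_index([0.0, 5.0, 10.0], [40.0, 55.0, 70.0])
```

A trace that stops before 70 °C (for example, at the 72-hour limit) yields None in this sketch, leaving the handling of such samples to the analyst.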

Crossing Point Testing (CPT)
Dehydrated coal samples, with a weight ranging from 60 to 70 g, were transferred to a brass reaction vessel, which was placed in a programmed laboratory oven. Nitrogen at a flow rate of 120 mL/min was passed over the coal samples until the samples had equilibrated with the reaction vessel at 40 ± 0.3 °C. Dry instrument air was preheated to the oven's temperature and then injected into the reaction vessel at a flow rate of approximately 120 mL/min. The oven temperature increased at a constant rate of 0.5 °C/min, raising the inlet gas and vessel temperatures. The temperatures of the oven and of the reaction vessel were recorded continuously.
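A crossing point can be located from two recorded traces by finding where the sample temperature first catches up with the reference temperature. The sketch below is illustrative only: the function name, the choice of reference trace and the linear interpolation between logged points are assumptions, and the example readings are hypothetical.

```python
def crossing_point(reference_c, sample_c):
    """Temperature (°C) at which the sample trace first crosses the reference trace.

    Both lists hold readings taken at the same instants.
    Returns None if the sample never reaches the reference.
    """
    for i in range(1, len(reference_c)):
        d0 = sample_c[i - 1] - reference_c[i - 1]
        d1 = sample_c[i] - reference_c[i]
        if d0 < 0 <= d1:
            # Fraction of the logging step at which the difference is zero
            frac = -d0 / (d1 - d0)
            return reference_c[i - 1] + frac * (reference_c[i] - reference_c[i - 1])
    return None


# Hypothetical readings: the sample overtakes the reference between the 2nd and 3rd points
cpt = crossing_point([100.0, 110.0, 120.0, 130.0], [90.0, 104.0, 122.0, 140.0])
```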

Minimum Self-Heating Temperature (SHTmin)
SHTmin was calculated using equation (1) based on coal quality analysis results.

Statistical Approaches
Within the MATLAB R2022b environment, Spearman correlation analysis, data cleaning, principal component analysis (PCA), multivariate nonlinear regression analysis and hierarchical clustering analysis were performed on the collected data.

Spontaneous Combustion Testing and Coal Quality Analysis Results
Spontaneous combustion testing (R70, CPT, SHTmin) and coal quality analysis were carried out on 318 coal samples. Figure 1 and Figure 2 present the distribution of R70, CPT, SHTmin and coal quality analysis results, while Table 1 summarizes the descriptive statistics of the spontaneous combustion database. This database was used in the subsequent statistical analysis.

Spearman Correlation Analysis on Spontaneous Combustion Testing Results
Since R70 is the predominant method for assessing coal's spontaneous combustion propensity in Australia and can provide a good indicator of coal reactivity to oxygen [11], a Spearman correlation study was performed between R70 and both CPT and SHTmin to assess their potential to consistently rank spontaneous combustion risk. The outcomes of these correlation analyses are graphically represented in Figure 3.
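For readers who wish to reproduce this kind of rank correlation outside MATLAB, a minimal sketch of Spearman's ρ is given here: the Pearson correlation of the average ranks, with ties sharing their mean rank. The function names and the four example values are hypothetical and are not drawn from the database.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: R70 rises while CPT falls, so rho is negative
rho = spearman_rho([0.2, 1.5, 4.0, 12.0], [160.0, 140.0, 120.0, 100.0])
```

A negative ρ is the expected sign here, since a higher R70 and a lower CPT both indicate a higher propensity.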

Data Cleaning
Data cleaning was conducted to address potential outliers within the dataset, which occurred due to errors in testing or transcription errors when recovering old test datasets (see Figure 4). The Isolation Forest algorithm [12] was applied to identify and remove outliers. After data cleaning, a total of 276 coal samples remained viable for subsequent analyses. Geographically, these samples originated from diverse sources, including 252 from Australia, 8 from New Zealand, 1 from Canada, 8 from the United States, 6 from Africa, and 1 from Bangladesh.

Spearman Correlation Analysis between Reliable Spontaneous Combustion Indicators and Coal Quality Analysis Results
A Spearman correlation study between the reliable spontaneous combustion indicators and the coal quality analysis results was carried out to identify which statistically significant variables derived from coal quality analysis could be used to predict the reliable spontaneous combustion indicators. The correlation coefficients are summarized in Table 2.

Principal Component Analysis
Principal component analysis was conducted on the reliable spontaneous combustion indicators and the coal quality analysis results. The variance explained by each principal component is graphically represented in Figure 5.
Figure 5. Scree plot of principal component analysis.
The reliable spontaneous combustion indicators and coal quality analysis results were projected on the space of the first two principal components, as shown in Figure 6, which led to the identification of predictors for the prediction models of the reliable spontaneous combustion indicators.
Figure 6. Projection of variables on principal component space.

Multivariate Regression Analysis
Multivariate linear and nonlinear models were tested for reliable spontaneous combustion indicators to determine the best possible relation with identified predictors, and the following two prediction models were configured with Table 3 summarizing their coefficients: (2)

Hierarchical Clustering Analysis
Hierarchical clustering analysis was performed on the reliable spontaneous combustion indicators, with the linkage distance calculated for each sample, and the samples were categorized into four clusters, as illustrated in Figure 8. Additionally, Figure 9 shows the scatter plot of the reliable spontaneous combustion indicators, grouped by hierarchical clustering analysis. Consequently, hierarchical clustering analysis led to the revision of a risk ranking matrix for coal's spontaneous combustion risk assessment, as shown in Table 5. The revised risk ranking matrix groups coals into four categories: Low, Medium, High and Very High.
The outcomes of the analysis can be explained by the fact that SHTmin may provide an inaccurate and misleading indication of spontaneous combustion propensity, since it is calculated from the dry ash-free oxygen content and excludes the coal's other intrinsic factors, such as ash, moisture, volatile matter and fixed carbon (Equation (1)). Coals with extremely high ash content (generally above 30%) usually possess low spontaneous combustion risk, though they may possess high dry ash-free oxygen content and, consequently, low SHTmin, which is indicative of a higher spontaneous combustion risk [13].
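The clustering step described in this section can be sketched as a bottom-up (agglomerative) procedure that repeatedly merges the two closest groups until the requested number of clusters remains. The linkage criterion and any scaling of R70 and CPT are not stated here, so the average linkage, the Euclidean distance and the pre-scaled example points below are all assumptions.

```python
def agglomerative(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain.

    points: list of equal-length tuples, assumed already scaled.
    Returns a cluster label for each point.
    """
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for m in members:
            labels[m] = label
    return labels


# Hypothetical scaled (R70, CPT) pairs forming two obvious groups
labels = agglomerative([(0.1, 1.0), (0.2, 0.9), (5.0, -1.0), (5.2, -1.1)], 2)
```

This brute-force version recomputes every linkage at each merge; it is adequate for a few hundred samples but is meant only to show the logic.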

Data Cleaning
Data cleaning was performed to exclude outliers after the reliable spontaneous combustion indicators (R70 and CPT) were identified. The scatter plots (Figure 4), depicting the relationship between these reliable spontaneous combustion indicators and the coal quality results, revealed strong correlations with moisture (ad%), volatile matter (ad%), fixed carbon (ad%), oxygen (ad%), carbon (daf%), nitrogen (daf%) and oxygen (daf%), with significant outliers deviating from the patterns.
Since the samples used for testing were collected over almost two decades, outliers were to be expected, due either to errors in testing or to transcription errors while recovering old test datasets. Therefore, an Isolation Forest algorithm was used to detect anomalies within the dataset. The algorithm was applied with the parameters recommended in [12], with a contamination ratio of 10%, distance factor = 2 and anomaly cluster = 2. The Isolation Forest detected 42 outliers within the overall dataset, reducing the size of the analysis from n = 318 to n = 276. The Isolation Forest provided a conservative outlier removal in contrast with other tested methods, including One-Class Support Vector Machine (SVM). Therefore, considering the dataset characteristics, the removal of the outliers was a necessary process.
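To make the outlier-detection idea concrete, the following is a minimal pure-Python sketch of an isolation forest: points that random splits isolate in fewer steps receive a higher anomaly score. It is a simplified illustration and not the MATLAB routine used in this study. The function names, tree count, seed and example data are assumptions, and the distance factor and anomaly cluster settings quoted above are not modelled.

```python
import math
import random


def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(point, tree, depth=0):
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(point, left if point[dim] < split else right, depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Score near 1 suggests an anomaly; needs at least two points."""
    rng = random.Random(seed)
    size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(points, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(size)
    return [2.0 ** (-sum(path_length(p, t) for t in trees) / n_trees / norm)
            for p in points]


def flag_outliers(scores, contamination=0.10):
    """Flag the top `contamination` share of scores (ties included)."""
    k = max(1, round(contamination * len(scores)))
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]


# Hypothetical (R70, CPT) pairs: a tight group plus one implausible record
data = [(1.0 + 0.05 * i, 150.0 - 0.5 * i) for i in range(20)] + [(30.0, 60.0)]
scores = anomaly_scores(data)
flags = flag_outliers(scores, contamination=0.10)
```

In this sketch the contamination ratio only sets how many of the highest-scoring samples are flagged; it does not change the scores themselves.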
In addition, as illustrated in Figure 4, spontaneous combustion propensity increases with higher levels of moisture (ad%), volatile matter (ad%), oxygen (ad%) and oxygen (daf%). Conversely, an increase in fixed carbon (ad%), carbon (daf%), and nitrogen (daf%) reduces this propensity. Therefore, the presence of extreme values in these variables can result in notable deviations in R70 and CPT values. These deviations introduce instability into analyses based on the database. The removal of the outliers therefore increased the stability and robustness of the subsequent statistical analysis.

Reliable Spontaneous Combustion Indicators Prediction
As shown in Table 2, the Spearman correlation study between the reliable spontaneous combustion indicators (R70 and CPT) and the coal quality analysis results (14 variables) reveals that both R70 and CPT correlate strongly with moisture (ad%), volatile matter (ad%), oxygen by difference (ad%), carbon (daf%) and oxygen by difference (daf%), with the absolute values of their correlation coefficients above 0.72. Therefore, it is concluded that these variables derived from coal quality analysis are suitable for predicting R70 and CPT.
Principal component analysis on R70, CPT and the coal quality analysis results (14 variables) shows that the first two principal components explain a reasonable share of the variance (71.06%), as illustrated in Figure 5. Therefore, the correlation among the reliable spontaneous combustion indicators and the coal quality analysis results can be examined by projecting these variables on the space of the first two principal components.
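For reference, the explained-variance share quoted above follows from the eigenvalues of the data matrix. The expression below is the standard definition, with λ_k denoting the k-th largest eigenvalue of the covariance or correlation matrix of the p analysed variables. Whether the variables were standardised beforehand is not stated here and is left as an assumption.

```latex
\mathrm{EVR}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j},
\qquad
\mathrm{EVR}_1 + \mathrm{EVR}_2
  = \frac{\lambda_1 + \lambda_2}{\sum_{j=1}^{p} \lambda_j}
  = 0.7106
```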
Figure 6 shows that CPT and R70 correlate strongly with each other. Additionally, moisture (ad%), oxygen by difference (ad%) and carbon (daf%) lie closest to CPT and R70, and thus correlate strongly with both R70 and CPT. Therefore, these three variables can be applied to predict R70 and CPT.
As shown in Table 4, the R70 and CPT models exhibit large R² values of 0.864 and 0.828, respectively, revealing that the models provide a good fit to the experimental data.
Additionally, Figure 7 shows that the experimental data fit the model equations closely. Thus, R70 and CPT can be confidently predicted from moisture (ad%), oxygen by difference (ad%) and carbon (daf%), supporting the view that R70 and CPT are reliable spontaneous combustion indicators.
However, the R70 prediction model cannot differentiate the coal samples with low reactivity, as shown in Figure 7. This may explain why coals with very low R70 may still be liable to spontaneous combustion.
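The fit statistics cited in this section can be computed from paired actual and predicted values as in the sketch below. The function name and example values are hypothetical. Here the RMSE divides by n, whereas some statistical packages divide by the residual degrees of freedom, so small differences from reported values would be expected.

```python
def fit_metrics(actual, predicted, n_predictors):
    """Return (R^2, adjusted R^2, RMSE) for a fitted model."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises the number of predictors used
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = (ss_res / n) ** 0.5  # assumption: population form, divides by n
    return r2, adj_r2, rmse


# Hypothetical values with a single predictor
r2, adj_r2, rmse = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0], 1)
```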

Reliable Spontaneous Combustion Indicators Ranking
Hierarchical clustering analysis was performed on the reliable spontaneous combustion indicators (R70 and CPT). Figure 9 is the scatter plot of R70 vs CPT, grouped by the hierarchical clustering analysis. It shows that most of the experimental data fall into Cluster 1 (low R70 and very high CPT), which is typical of Australian coals of low reactivity. The smallest share of the experimental data falls into Cluster 4 (very high R70 and low CPT), which corresponds to highly reactive New Zealand coals.
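The revised ranking thresholds reported in the abstract can be expressed as a simple lookup, sketched below. Two points are assumptions rather than part of the published matrix: values falling exactly on a boundary are assigned to the more severe class, and when R70 and CPT disagree the more severe of the two rankings is returned. All names are illustrative.

```python
# Thresholds as reported in the abstract
R70_UPPER = [(0.4, "Low"), (3.0, "Medium"), (11.0, "High")]      # °C/h
CPT_LOWER = [(156.0, "Low"), (132.0, "Medium"), (102.0, "High")]  # °C
SEVERITY = ["Low", "Medium", "High", "Very High"]


def rank_by_r70(r70):
    for upper, label in R70_UPPER:
        if r70 < upper:
            return label
    return "Very High"


def rank_by_cpt(cpt):
    for lower, label in CPT_LOWER:
        if cpt > lower:
            return label
    return "Very High"


def revised_rank(r70, cpt):
    """More severe of the two rankings (assumed rule for disagreement)."""
    return max(rank_by_r70(r70), rank_by_cpt(cpt), key=SEVERITY.index)


# Hypothetical samples
print(revised_rank(0.2, 160.0))   # both indicators in the Low band
print(revised_rank(0.2, 120.0))   # indicators disagree
```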

Comparison Between Revised Risk Ranking Matrix and ISCP
The comparison between the currently applied Intrinsic Spontaneous Combustion Propensity Classification (ISCP) and the revised risk ranking presented in Section 3.7 covers objectives, methodology and criteria, accuracy and reliability, data coverage and global applicability.
The revised risk ranking matrix and the ISCP classification are aligned in their objective, which is to provide comprehensive guidance for coal mining operations and to facilitate the implementation of effective preventive measures to mitigate the risk of spontaneous combustion.
Both classifications share a common process of categorizing coal samples into distinct risk levels, based on their inherent susceptibility to spontaneous combustion. R70 values are used as a criterion in both classifications to assess the likelihood of spontaneous combustion. However, the revised risk ranking matrix incorporates not only R70 values, as the ISCP classification does, but also Crossing Point (CPT) values, which provide a supplementary ranking criterion that enhances the robustness and reliability of the risk assessment system.
A notable distinction lies in the data-driven foundation of the revised risk ranking matrix, which draws on a database of 276 coal samples. In contrast, the ISCP classification lacks substantial supporting literature. This difference underpins the enhanced reliability of the revised matrix: the extensive dataset used in this study provides empirical evidence for the robustness and validity of the revision.
The revised risk ranking matrix has a broader global sampling scope, encompassing 252 coal samples from Australia, 8 from New Zealand, 1 from Canada, 8 from the United States, 6 from Africa, and 1 from Bangladesh. In contrast, the ISCP classification is confined to a regional scope, centred solely on coal originating from Queensland and New South Wales in Australia. By integrating data from diverse coal-producing regions worldwide, the revised matrix offers insights that extend beyond regional boundaries, facilitating a more holistic understanding of spontaneous combustion risk across varying coal compositions and geological contexts. This is a key advantage of the revised matrix: its enhanced applicability on a global scale.

Conclusions
The Adiabatic Self-Heating R70, Crossing Point Testing (CPT) and Minimum Self-Heating Temperature (SHTmin) methods are predominant for coal's spontaneous combustion risk assessment in Australia. A study of 318 coal samples was carried out for comparative review and analysis of spontaneous combustion testing.
The following conclusions emerged from this study:
(1) The Spearman correlation study of spontaneous combustion indicators (R70, CPT and SHTmin) revealed that both R70 and CPT can serve as reliable spontaneous combustion indicators due to their strong correlation (ρ = -0.8875). However, SHTmin might not reliably assess spontaneous combustion risk due to its weaker correlation (ρ = -0.7265). This results from the SHTmin calculation being skewed by high ash contents in some coals.
(2) After data cleaning, with 276 coal samples remaining, the Spearman correlation and principal component studies showed that moisture (ad%), oxygen by difference (ad%) and carbon (daf%) correlate strongly with R70 and CPT testing results, and thus these three variables derived from coal quality analysis can be applied to predict R70 and CPT.
(3) Multivariate nonlinear regression analysis configured R70 and CPT prediction models based on moisture (ad%), oxygen by difference (ad%) and carbon (daf%), with high R² values of 0.864 and 0.828, respectively. Therefore, R70 and CPT can be confidently predicted from these three variables and can reliably categorise coals according to their propensity towards spontaneous combustion.
(4) Hierarchical clustering analysis was useful for the risk ranking of coals using R70 and CPT. The revised risk ranking matrix may be applied to rank coal's risk in future spontaneous combustion testing.
This study advances the field of coal spontaneous combustion risk assessment and addresses key limitations in current industry practices. The development of a revised risk ranking matrix, incorporating R70 and CPT values, introduces a robust and globally applicable approach to evaluating spontaneous combustion risk. The study's data-driven foundation and empirical evidence contribute to the credibility of the revised risk ranking. The research not only fills knowledge gaps but also highlights the value of statistical analysis in establishing correlations and predictive models. This work enhances the precision of risk assessment, enabling safer coal mining operations and more effective safety measures. Overall, the study contributes to coal spontaneous combustion risk assessment, with implications for both academia and industry practice.

Future Research Directions
It is proposed that future work focus on improving and modifying the SHTmin equation for spontaneous combustion risk assessment. The proposed approach would include: (1) obtaining experimental data for SHTmin, (2) performing multivariate nonlinear regression analysis on experimental SHTmin and coal quality results (moisture (ad%), oxygen by difference (ad%) and carbon (daf%)), and (3) testing the nonlinear prediction model with a test dataset to seek any improvement.
A limitation of this study pertains to the distribution of the data: a substantial proportion of the coal samples come from Australia, potentially introducing a geographic bias. To enhance the study's comprehensiveness and global applicability, future efforts should prioritize the acquisition of coal samples from a more diverse range of international locations. This approach will give a more balanced representation of coal compositions and geological contexts and a broader understanding of spontaneous combustion risk worldwide. Additionally, as outlier detection was required due to the nature of the database, further research is being undertaken on future samples that will not require extensive outlier detection, in order to validate the reliability of the model hypothesised in this paper.

Figure 2. Box-and-whisker plots of historical spontaneous combustion data.

Figure 4. Scatter plots of R70 and CPT vs coal quality.

Figure 7 compares the actual and predicted values of reliable spontaneous combustion indicators to examine the models' performances.

Figure 7. Comparative plots of predicted and actual data.

Figure 8. Dendrogram of hierarchical clustering analysis on R70 and CPT.

Table 1. Descriptive statistics of historical spontaneous combustion database.
a Air dry basis. b Dry ash-free basis.

Table 2. Spearman correlation coefficients between R70, CPT and coal quality.
a Air dry basis. b Dry ash-free basis.

Table 3. Coefficients of R70 and CPT prediction models.

Table 4 shows the models' performance with R², adjusted R², P level and Root Mean Squared Error (RMSE), validating the reliability of the spontaneous combustion indicators.

Table 4. Measure of fit of experimental data to multivariate nonlinear regression models.

Table 5. Revised risk ranking matrix as per spontaneous combustion propensity.
Since R70 is the predominant testing method to assess coal's spontaneous combustion risk, CPT can be considered a reliable spontaneous combustion indicator, but SHTmin might not be able to consistently rank spontaneous combustion propensity.