Application of multivariate analysis to investigate the trace element contamination in top soil of coal mining district in Jorong, South Kalimantan, Indonesia

Multivariate analysis is applied to investigate the geochemistry of several trace elements in top soils and their relation to contamination sources influenced by coal mines in Jorong, South Kalimantan. Total concentrations of Cd, V, Co, Ni, Cr, Zn, As, Pb, Sb, Cu and Ba were determined in 20 soil samples by bulk analysis. Pearson correlation is applied to specify the linear correlation among the elements. Principal Component Analysis (PCA) and Cluster Analysis (CA) were applied to classify the trace elements and contamination sources. The results suggest that the contamination loading is contributed by Cr, Cu, Ni, Zn, As, and Pb. The elemental loading mostly affects the non-coal mining area, for instance the areas near settlements and agricultural land. Moreover, the contamination sources are classified into areas influenced by coal mining activity, agricultural types, and the river mixing zone. Multivariate analysis could elucidate the elemental loading and the contamination sources of trace elements in the vicinity of the coal mine area.


Introduction
Large coal deposits occur in Kalimantan as a consequence of its position in the Cenozoic sedimentary coal-bearing basins of Southeast Asia, in particular the Neogene Southern Sundaland [1]. Owing to its particular tectonic setting and coal development processes, the area contains not only extensive and thick coal deposits but also coal with low sulphur content [2][3][4]. The abundant deposits and the low price of coal have driven its extensive use as a primary energy resource in Indonesia [5,6].
Despite the considerable use of coal as an important energy resource, coal development faces many environmental issues, including metal contamination [7]. With regard to metal contamination of the terrestrial environment in Kalimantan, several studies have observed metal contamination from mine drainage in the coal mining area of the Asam-asam Basin, South Kalimantan. The pH and metal content of particular pit lakes were found to exceed the coal mining effluent standard. Furthermore, some remediation and mitigation strategies have been implemented to address metal contamination in the surrounding environment [8][9][10]. However, the loading of and association among trace elements have not yet been investigated. Moreover, since the soils are influenced by their rich metal deposits and are used for other land functions, it is also important to investigate the original contamination source. This study investigates the geochemistry of selected trace elements (As, Ba, Cd, Co, Cu, Cr, Ni, Pb, Sb, Zn, V) in top soils and their association with contamination sources influenced by coal mines in Jorong District, South Kalimantan, Indonesia. Specifically, this study determines (1) the loading of selected trace elements and (2) the contamination source types. To achieve these aims, this study applies multivariate analysis (PCA and CA).

Study area
The present study is carried out on soils in the vicinity of coal mines in Jorong District, Tanah Laut Regency, South Kalimantan Province, Indonesia. The mining area is associated with the Asam-asam River, which receives mine drainage from the mining operations [10]. The river flows into the Java Sea in the southern part of the province. The mines are located around the hilly terrain of the river's middle section. Numerous land uses surround the mining area, for instance plantation, industrial forest, farming, settlement, and mangrove [11].
The northern part of the study area is underlain by the Warukin Formation, developed during the Miocene. The coal was deposited in a paralic depositional environment, intercalated with quartz sandstone and sandy claystone. The Dahor Formation lies to the south of the Warukin Formation and consists of lignite, kaolin, and limonite minerals. Lignite is the dominant coal rank, with low sulphur and ash content. Alluvial soils dominate the coastal area in the southern part. Kalimantan coal has been observed to have low trace element contents compared to the worldwide coal range and average [3,12,13,14,15].

Methods
Sample collection.
A total of twenty (20) top soil samples (0-20 cm depth) were collected around the Asam-asam River Basin, as shown in figure 1. The soil samples were stored in polyethylene bags for immediate transport and storage. In the laboratory, the soil samples were oven-dried at 40 °C, pulverized, sieved through a 2-mm sieve, and then stored in sealed polyethylene bags until analysis.

Geochemical analysis.
Bulk analysis was performed to determine the total concentration. The representative samples were ground for 5 min in a planetary ball mill at 450 rpm and then pressed into 32-mm internal diameter pellets using a hydraulic press (20 tonnes pressure). The selected trace elements were analyzed by Energy Dispersive X-Ray Fluorescence (EDXRF) spectrometry (PANalytical Epsilon 5) [16]. pH and EC were measured following the method of [17].

Geostatistical analysis.
Statistical methods were applied to examine the correlation and classification among the selected trace elements. Basic statistical analyses were applied to determine the minimum, median, maximum, 1st quartile, 3rd quartile, average, standard deviation, coefficient of variation (CV), and skewness. Variability is classified as weak if CV < 10%, moderate if CV lies within 10-100%, and strong if CV > 100% [18].
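The CV thresholds above can be sketched in a few lines of Python. This is an illustrative sketch only: the study itself used Matlab, the function names are hypothetical, the sample (n-1) standard deviation is an assumption, and the example concentrations are invented rather than taken from Table 1.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent (sample standard deviation / mean * 100)."""
    mean = statistics.mean(values)
    # Assumption: sample (n-1) standard deviation; the text does not say which.
    return 100.0 * statistics.stdev(values) / mean


def classify_cv(cv_percent):
    """Apply the thresholds from the text: <10% weak, 10-100% moderate, >100% strong."""
    if cv_percent < 10.0:
        return "weak"
    if cv_percent > 100.0:
        return "strong"
    return "moderate"


# Hypothetical concentrations in mg/kg, not data from this study.
example = [12.0, 15.0, 11.0, 40.0, 13.0]
cv = coefficient_of_variation(example)
print(f"CV = {cv:.1f}% -> {classify_cv(cv)} variability")
```

A single high value, as in the example list, is enough to push the CV well above the weak-variability threshold, which is the pattern the text later reports for Cr and Ni.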
Several statistical analyses are applied to elucidate the correlation and classification of the studied elements. Pearson correlation (p < 0.05) is applied to investigate the inter-elemental correlation for all selected elements, based on the linear relationship between two variables [19].
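As a minimal sketch of the correlation test, the Pearson coefficient between two elements can be computed and checked against the 5% significance level with a t-test. The function names are hypothetical, and the critical value 2.101 is the standard two-tailed t value for 18 degrees of freedom, which assumes exactly n = 20 samples as in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Two-tailed critical t for alpha = 0.05 and df = 18 (assumes n = 20 samples).
T_CRIT_DF18 = 2.101


def is_significant(r, n=20, t_crit=T_CRIT_DF18):
    """Test H0: no linear correlation, using t = r * sqrt((n - 2) / (1 - r^2))."""
    if abs(r) >= 1.0:
        return True  # perfect correlation; the t statistic is unbounded
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return abs(t) > t_crit


# With 20 samples, |r| above roughly 0.44 is significant at the 5% level.
print(is_significant(0.50), is_significant(0.40))
```

The default `t_crit` is only valid for 20 samples; a different sample size would need the matching critical value.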
Moreover, multivariate analysis is applied to determine the contamination source and the elemental classification. All data are normalized by means of z-scores to put the variables on a common scale. Two methods of multivariate analysis are applied in this study. The first method is Principal Component Analysis (PCA), which has been widely applied to describe the relationship among variables by reducing the data dimension and combining correlated variables. The data are thereby transformed into a new data set delineating the factor loading (eigenvector) and score (eigenvalues) [17,20]. The second method is Cluster Analysis (CA), which partitions multivariate data into significant subgroups or clusters. Well-developed clusters show as much similarity as possible within each cluster and differences as large as possible between clusters [21,22]. This study applies PCA and hierarchical clustering using Ward's method with the Euclidean distance. All statistical analyses were performed in Matlab R2016b for Windows Professional. Spatial information was developed using QGIS 2.18.4.
Table 1 shows the geochemical results. The mean values of As, Cu, Cr, Ni, Pb, and V were higher than those of the worldwide upper continental crust reported by [23]. All elements have mean values greater than their median values, corresponding to their positive skewness. In particular, the mean values of Cr and Ni are much higher than those found in other studies [24][25][26][27]. Cr and Ni also show larger variation than the other elements, as indicated by their higher CV values. The maximum values of Cr and Ni were found in the estuary (A19) and near the farming area (A16), respectively. Furthermore, the maximum values of Ba, Sb, and V were observed in the area close to the coal mines. In addition, the maximum value of As was observed near the plantation area (A1), which is located on the upstream side of the coal mines.

Correlation analysis
Pearson correlation is performed at the 95% confidence level, as described in  
PC3 accounts for 12.9% of the total variance. It shows high positive eigenvectors for As and Pb, while Cr shows a high negative eigenvector. The PC3 score shows high loading in the plantation area (A1). The maximum value of As (27.76 mg/kg) is found in this area. On the other hand, Cr shows a low value (105.9 mg/kg), below its Q1 (126.3 mg/kg). It can be concluded that Cr, Cu, Ni, Zn, As, and Pb contribute to the high elemental loadings of the soil total concentration. In addition, the high elemental loadings are found in the area near the settlement (A15), the farming area (A16), the upstream plantation (A1), and the estuary (A19).
PCA could elaborate the elemental loading and inter-elemental classification of the selected elements. In addition, it can illustrate the contamination sources that have particular elemental loadings. This is consistent with the application of PCA in previous studies [28,29].
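The PCA workflow described in the Methods (z-score normalization followed by eigen-decomposition) can be sketched as follows. This is an illustrative Python/NumPy sketch, not the authors' Matlab code: the function names are hypothetical and the input matrix is synthetic random data with the same shape as the study data (20 samples by 11 elements). In this sketch the eigenvectors give the element loadings of each component, the eigenvalues give each component's share of the total variance, and the sample scores are obtained by projecting the standardized data onto the eigenvectors.

```python
import numpy as np


def zscore(data):
    """Standardize each column to zero mean and unit sample standard deviation."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)


def pca(data):
    """PCA on z-scored data via eigen-decomposition of the correlation matrix."""
    z = zscore(data)
    corr = np.cov(z, rowvar=False)            # covariance of z-scores = correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()       # fraction of total variance per component
    scores = z @ eigvecs                      # sample coordinates on each component
    return eigvals, eigvecs, explained, scores


# Synthetic stand-in for the 20 x 11 concentration matrix (not study data).
rng = np.random.default_rng(0)
demo = rng.normal(size=(20, 11))
eigvals, eigvecs, explained, scores = pca(demo)
print(np.round(100 * explained[:3], 1))  # percent variance of PC1-PC3
```

A sampling point with a large positive or negative value in a column of `scores` is one that is strongly affected by the elements with large loadings in the matching column of `eigvecs`.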

Cluster analysis
The hierarchical cluster analysis results are shown as a dendrogram in figure 3. CA of the total concentration yields four significant groups: group 1: Pb-Cr, group 2: Cd-Co-Sb, group 3: Ni-As-Ba, and group 4: V-Cu-Zn. The sources of Pb and Cr are markedly different from the sources of the other elements.
CA is also performed to classify the sampling points, as described in fig 7. CA yields four significant groups. The first group (A9, A16, A15, A8) is classified as the river flow-mixing area, where mixing may increase the elemental concentration [30]. The upstream parts of the rivers mix at these points. The second group (A19, A17, A13) and the third group (A18, A3, A1) are similar, so both are classified as plantation-affected areas. Soils in the third group represent natural vegetation types more than those in the second group. The last group strongly represents the coal mining affected area. Cluster analysis is not intended to elucidate the elemental loadings. Nevertheless, the association of the elements and the contamination source types could be described well by applying cluster analysis.
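The grouping of sampling points can be sketched with SciPy's hierarchical clustering, assuming Ward linkage on Euclidean distances of z-scored data. This is an illustrative sketch, not the authors' Matlab code: the function name `ward_groups` is hypothetical, and the input is synthetic data with four well-separated groups rather than the study's measurements.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def ward_groups(data, n_groups):
    """Cluster the rows of `data` into at most `n_groups` flat groups."""
    # z-score each column so that no single variable dominates the distance
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    tree = linkage(z, method="ward", metric="euclidean")
    labels = fcluster(tree, t=n_groups, criterion="maxclust")
    return tree, labels


# Synthetic example: 4 groups of 5 points each (not study data).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
demo = np.vstack([c + rng.normal(scale=0.5, size=(5, 2)) for c in centers])
tree, labels = ward_groups(demo, n_groups=4)
print(labels)
```

The returned `tree` is the linkage matrix that `scipy.cluster.hierarchy.dendrogram` can draw. Grouping elements instead of sampling points would mean passing the transposed standardized matrix to `linkage`, which this sketch does not do.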

Conclusions
It is concluded that multivariate statistical methods can be used to identify the association of the trace elements and the contamination source types in soils of the study area. PCA shows that Cr, Cu, Ni, Zn, As, and Pb contribute to the elemental loading in the top soils. Moreover, the cluster analysis could clearly classify the contamination sources into three types: the river flow-mixing area, the plantation area, and the coal mining affected area. Therefore, coal mining activity in Jorong District is not the sole factor affecting trace element contamination in the soils of the study area.