Unsupervised texture classification of 3D X-ray Micro-computed Tomography images

Characterizing rock properties is crucial in the oilfield to evaluate hydrocarbon reserves. Several studies have shown a high correlation between rock properties and textures. We therefore propose integrating the texture information in the images to precisely identify the most representative textures in highly heterogeneous rocks and estimate their properties. First, we implemented a steerable pyramid decomposition to extract the texture features. Those parameters were then used as input to a self-organizing map to classify the textures. Finally, by applying several models and comparing their results, we suggest the best approach to implement for texture classification.


Introduction
The most significant step in the oil industry to assess hydrocarbon reserves is characterizing the rock properties. The standard technique to measure rock properties experimentally is Special Core Analysis (SCAL). It is a laboratory procedure in which core plugs (cylindrical subsets with a diameter of 3.8 cm and a length of 5-8 cm) are taken from a petroleum reservoir. Experiments are then performed on them, and the rock properties are provided to the petroleum engineers, presented in a property log at each core plug. However, this procedure has limitations: the choice of core plug positions and their number are not always optimal [1-10].
Tomography has found applications in diverse fields, including medical imaging, digital rock physics (DRP), image processing, and others [11-15]. In particular, digital rock physics is a more advanced method for characterizing rock properties. It estimates properties such as porosity and permeability using 3D X-ray micro-computed tomography images and consists of three main stages. The first is the acquisition of raw grayscale digital rock images, in which high grey levels represent the solid frame and low grey levels represent pore space. The second stage separates pores and grains using image segmentation. In the last stage, a numerical simulation is run on the segmented images to estimate the rock properties [1]. Texture analysis is part of this approach to better understand rock properties. However, as it contains only basic classification tools, it fails when applied to carbonate reservoirs because of their heterogeneity. Several studies have shown a correlation between rock properties and texture information [1]. Therefore, we propose studying the texture information of the images to identify the most representative textures in highly heterogeneous rocks and estimate the rock properties. The two primary techniques to classify textural facies from CT scan images are supervised and unsupervised classification. The main difference between the two approaches is the availability of labeled data: in supervised techniques, the number of texture classes is known beforehand, whereas in unsupervised texture classification the machine learning model has no access to labeled data, i.e., there is no prior information about the number of classes [1]. The U-Net architecture is an example of a supervised machine learning technique used for fast and precise segmentation of images [2]. In contrast, independent component analysis, the self-organizing map, and square-error clustering using Gabor filters are unsupervised techniques for texture classification [1, 16-18].
In this study, we focus on unsupervised machine-learning techniques, specifically the self-organizing map, to find the best strategy for texture classification that could be implemented in the future on real data, using the output to predict the rock properties.

Steerable pyramid decomposition
Steerable pyramid decomposition generates a multi-scale and multi-orientation representation of an image. The main goal is to extract its most representative features, such as skewness, entropy, auto-correlation, cross-correlation, and kurtosis, at different scales and orientations, allowing us to capture image features that cannot be captured at a single scale [1, 19]. An illustration of the process is shown in figure 1. Initially, the input texture image is decomposed in the frequency domain into high-pass and low-pass subbands using high-pass and low-pass filters (H0, L0).

Figure 1. Steerable pyramid decomposition
A high-pass filter is designed to pass signals with frequencies above a particular cut-off frequency while blocking frequencies below it. Consequently, the high-pass subband contains the high-frequency components of the image, typically associated with edges and fine details. A low-pass filter, on the other hand, passes signals with frequencies below a particular cut-off frequency while attenuating those above it. The low-pass subband thus contains the low-frequency components of the image, typically associated with the coarsest-scale information representing its overall shape or structure. The low-pass subband is then decomposed into a set of oriented band-pass subbands and a new low-pass subband, and this process is repeated iteratively until the desired level of decomposition is achieved. The features of the image can be captured by extracting the features of each subband. The output images represent the decomposition coefficients at different scales and orientations. An example of steerable pyramid decomposition is shown in figure 2.
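The first split into high- and low-pass subbands can be sketched in the frequency domain. The ideal (hard-cutoff) radial filter below is an illustrative simplification; the actual pyramid uses smooth raised-cosine filters (H0, L0) and adds the oriented band-pass stages.

```python
# Sketch of one frequency-domain high/low-pass split, the first step of the
# steerable pyramid. The hard radial cutoff is an assumed simplification of
# the smooth filters used in the real decomposition.
import numpy as np

def lowpass_highpass_split(image, cutoff=0.25):
    """Split an image into low- and high-frequency components. `cutoff` is
    the radial frequency threshold in cycles per sample (illustrative)."""
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]    # vertical frequency of each row
    fx = np.fft.fftfreq(w)[None, :]    # horizontal frequency of each column
    radius = np.sqrt(fx ** 2 + fy ** 2)  # radial frequency of each bin
    spectrum = np.fft.fft2(image)
    low = np.real(np.fft.ifft2(spectrum * (radius <= cutoff)))
    high = np.real(np.fft.ifft2(spectrum * (radius > cutoff)))
    return low, high

img = np.random.default_rng(0).random((64, 64))
low, high = lowpass_highpass_split(img)
# The two ideal filters partition the spectrum, so the subbands sum back
# to the original image.
assert np.allclose(low + high, img)
```

In the full pyramid the low-pass output would then be fed to the oriented band-pass filters and the split repeated at the next scale.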

Figure 2. An example of steerable pyramid decomposition
One advantage of this representation is that the sub-bands are translation- and rotation-invariant: they are not affected by translations or rotations of the image. In other words, if the original image is shifted or rotated, the sub-bands remain the same, which allows us to analyze the data and extract information regardless of orientation [1, 19].

Self-organizing map
After extracting the texture parameters of the images with the steerable pyramid model, the self-organizing map (SOM) is implemented to find the main representative texture classes and classify the textural facies from the CT scan images. It is a type of artificial neural network that transforms input data from a high-dimensional space into a lower-dimensional space while preserving the topological relationships between the data points [1, 20].
Initially, N texture images obtained from the X-ray CT scan images, each with a set of n texture parameters extracted by the steerable pyramid model, are used as input data. Specifically, a matrix whose columns hold the n texture features of each texture image is projected onto the 2D SOM. The map space consists of components called "nodes" or "neurons," arranged as a two-dimensional hexagonal or rectangular grid. The number of nodes and their arrangement are specified beforehand, based on the larger goals of the analysis and exploration of the data. Each node in the map space is associated with a "weight" vector, which is the node's position in the input space.
In the training process, the nodes stay fixed in the map space, whereas the weight vectors move toward the input data. Each input vector is associated with a winning neuron, the neuron closest to the input in terms of Euclidean distance, whose weight vector moves towards that input vector without spoiling the topology induced by the map space. After finding the winning neuron of a particular input vector, the weight vectors of the nodes are updated by pulling them closer to the input vector. Mathematically, they are updated as

W_v(s + 1) = W_v(s) + θ(u, v, s) · α(s) · (D(t) − W_v(s)),    (1)

where v is the index of the node in the map, s is the current iteration, and u is the index of the Best Matching Unit (BMU), i.e., the winning neuron of the SOM: the neuron most similar to the input data vector. The neighborhood function θ(u, v, s) in equation (1) determines how much influence each neuron has on its neighbors during the learning process. It is multiplied by the learning restraint rate α(s), which controls how much the weights of the neurons in the SOM are updated during training. A higher learning restraint value means the weights are updated more drastically, leading to faster convergence but potentially causing the SOM to over-fit the data; a lower value results in slower convergence but may give a more robust representation of the input data. The last term of equation (1), (D(t) − W_v(s)), measures the difference between the input vector D(t) and the current weight vector of node v. This process is repeated while s < λ, where λ is an iteration limit that must be specified initially. Once the winning neurons are found, each input is visualized with the same color as its winning neuron. Inputs in proximal clusters have more similar values than observations in distal clusters. Since there is no prior information about the real number of classes, the SOM deliberately overestimates the number of representative classes in 2D, ensuring that all classes are captured without missing any [1, 20].
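A single training iteration of the update rule in equation (1) can be sketched as follows. The grid size matches the 10×10 map used later in the Results; the Gaussian neighborhood and exponential decay schedules are illustrative assumptions, not the paper's exact settings.

```python
# Minimal sketch of one SOM training iteration, implementing
# W_v(s+1) = W_v(s) + theta(u, v, s) * alpha(s) * (D(t) - W_v(s)).
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, n_features = 10, 10, 6               # 10x10 map, 6 texture features
weights = rng.random((grid_h * grid_w, n_features))  # one weight vector per node
# Node positions in the map space stay fixed; only the weight vectors move.
positions = np.array([(i, j) for i in range(grid_h) for j in range(grid_w)], float)

def train_step(x, weights, s, alpha0=0.5, sigma0=3.0, lam=1000):
    alpha = alpha0 * np.exp(-s / lam)                  # learning restraint alpha(s)
    sigma = sigma0 * np.exp(-s / lam)                  # shrinking neighborhood radius
    u = int(np.argmin(np.linalg.norm(weights - x, axis=1)))  # BMU by Euclidean distance
    d2 = np.sum((positions - positions[u]) ** 2, axis=1)     # map-space distances to BMU
    theta = np.exp(-d2 / (2 * sigma ** 2))             # Gaussian neighborhood theta(u, v, s)
    weights += (theta * alpha)[:, None] * (x - weights)      # the update of equation (1)
    return u

x = rng.random(n_features)
before = np.linalg.norm(weights - x, axis=1).min()     # distance of the BMU before
bmu = train_step(x, weights, s=0)
after = np.linalg.norm(weights[bmu] - x)
# The BMU's weight vector has been pulled closer to the input.
assert after < before
```

Because θ is 1 at the BMU and decays with map-space distance, neighboring nodes are dragged along, which is what preserves the topology of the map.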

Results
Three 512×512 images from the Brodatz texture data, shown in figure 3, were used in the application. In the first phase, we used steerable pyramid decomposition to extract the texture features. This was done by going over smaller regions of each texture and constructing a pyramid for each region.
For each example, we constructed a steerable pyramid, extracted the subbands, and computed their features: mean, standard deviation, autocorrelation, kurtosis, entropy, skewness, and cross-correlation. Combining those values in one vector gave the features of a small region. This procedure was repeated many times to obtain many examples for each texture. The result was a matrix whose columns represent the features of each small region. This matrix was then normalized and fed to the self-organizing map as input. The self-organizing map assigns each feature vector a class label based on the mapped inputs. This means that if we went through the combined image in figure 3 and obtained 20,000 small windows (examples), the SOM would assign a class to each one based on similarity: similar features receive a similar class label.
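The per-region feature computation can be sketched as below: each subband yields a small vector of statistics, and the vectors of all subbands are stacked into one column of the feature matrix. The random stand-in subbands and the 32-bin histogram used for entropy are illustrative assumptions.

```python
# Sketch of the per-region feature extraction: statistics of every subband
# stacked into one feature vector (one column of the feature matrix).
import numpy as np
from scipy.stats import skew, kurtosis

def subband_features(subband):
    """Mean, standard deviation, skewness, kurtosis, and Shannon entropy
    (from a 32-bin histogram, an assumed choice) of one subband."""
    flat = subband.ravel()
    hist, _ = np.histogram(flat, bins=32)
    p = hist[hist > 0] / flat.size             # bin probabilities
    entropy = -np.sum(p * np.log2(p))          # Shannon entropy in bits
    return np.array([flat.mean(), flat.std(), skew(flat), kurtosis(flat), entropy])

def region_feature_vector(subbands):
    """Concatenate the subband features into one column of the feature matrix."""
    return np.concatenate([subband_features(s) for s in subbands])

rng = np.random.default_rng(1)
subbands = [rng.random((16, 16)) for _ in range(6)]   # stand-in subbands
vec = region_feature_vector(subbands)
print(vec.shape)   # (30,) -- 5 statistics x 6 subbands
```

Repeating this over all regions and stacking the vectors column-wise produces the matrix that is normalized and passed to the SOM.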

Figure 3. Three 512x512 Brodatz textures
To visualize the classes on the original image, we first constructed a 512×1536 uint8 image, the same size as the original image in figure 3. Then, traversing it consistently with the traversal of the original image, we assigned the class labels to each small region. We used a 10×10 grid (100 classes) for the SOM. To better visualize the classes, the colors of the chromatic circle shown in figure 4 were mapped onto the results of the SOM. For example, cells with a class label of 1 are shown in pink, and cells with a class label of 100 in green.
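Mapping the 100 SOM classes onto the chromatic circle can be sketched by spacing the class labels evenly around the hue wheel, so that nearby labels receive nearby colors. The synthetic label image and the HSV parameterization are illustrative assumptions, not the paper's exact color assignment.

```python
# Sketch of mapping SOM class labels onto chromatic-circle colors: each of
# the n_classes labels gets an evenly spaced hue on the color wheel.
import colorsys
import numpy as np

def label_to_rgb(labels, n_classes=100):
    """Map integer class labels (1..n_classes) to RGB colors spaced evenly
    around the chromatic circle."""
    h, w = labels.shape
    rgb = np.zeros((h, w, 3), dtype=np.uint8)
    for c in range(1, n_classes + 1):
        hue = (c - 1) / n_classes                      # position on the circle
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)   # full saturation and value
        rgb[labels == c] = (np.array([r, g, b]) * 255).astype(np.uint8)
    return rgb

# Synthetic stand-in for the 512x1536 label image described above.
labels = np.random.default_rng(2).integers(1, 101, size=(64, 64))
colored = label_to_rgb(labels)
print(colored.shape)   # (64, 64, 3)
```

Because the hues wrap around the circle, labels that are close on the SOM grid appear as similar colors in the visualized image, which is what makes the side-by-side inspection against the chromatic circle possible.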

Figure 4. Chromatic circle
In the following subsections, the several models that were applied to the three Brodatz textures in figure 3 are presented, in order to identify the best possible approach for texture classification of 3D X-ray micro-tomography images representing rock samples. To evaluate their performance, we visually inspect the SOM result and the chromatic circle side by side and look for similarities and differences between the two.

Model 1
In this model, the smaller examples were extracted by going over each texture with a step size of 8, giving 2,304 examples per texture. For each example, we constructed the steerable pyramid with two scales and three orientations. Specifically, the orientation parameter defines the number of directions (or angles) used to decompose the image, whereas the scale parameter determines the number of levels (or scales) used to capture different image frequency bands. A subset of the subbands was then considered to extract the feature vectors. We picked only two subbands and computed the mean, standard deviation, skewness, kurtosis, entropy, and energy values of each. Additionally, we computed the autocorrelation matrix of each subband and took a window of size 5×5 around the maximum value. Remarkably, the autocorrelation of the subbands can reveal information about the spatial frequency content, which is related to the size and spacing of the repeating patterns, the orientation selectivity, and the texture properties of the image features. Consequently, our feature matrix was of size 62×6912; it was normalized and used as input for the SOM. The final classification result, after mapping the chromatic circle colors onto the SOM output, is shown in figure 5. As can be seen, the result is not very accurate: it is not clear that we have three distinct textures. Possible reasons are that we considered only a subset of examples of each texture, i.e., went over only some of the regions of each texture, and extracted only two subbands of each pyramid, which reduced the number of feature values per example. Furthermore, considering all the values of the symmetric autocorrelation window was time-consuming when running the code, so taking only its unique values was our next consideration.
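The autocorrelation window extraction can be sketched as follows. The use of scipy's correlate2d and the mean-removal step are illustrative assumptions; for an autocorrelation, the maximum sits at the zero-lag position, so the window is centred there.

```python
# Sketch of the autocorrelation feature of Model 1: autocorrelate a subband
# and keep a 5x5 window centred on the maximum (the zero-lag peak).
import numpy as np
from scipy.signal import correlate2d

def autocorr_window(subband, win=5):
    sub = subband - subband.mean()                 # remove the mean (assumed step)
    ac = correlate2d(sub, sub, mode='full')        # full autocorrelation matrix
    iy, ix = np.unravel_index(np.argmax(ac), ac.shape)  # peak = zero lag
    r = win // 2
    return ac[iy - r:iy + r + 1, ix - r:ix + r + 1]     # win x win window around it

patch = np.random.default_rng(3).random((16, 16))  # stand-in subband region
w = autocorr_window(patch)
print(w.shape)   # (5, 5)
```

Since the zero-lag peak of a full autocorrelation lies at the centre of the matrix, the 5×5 window never runs off the edge for patches of this size.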

Model 2
In this model, we took more examples of each texture, 9,216 samples per texture, to better capture its features. For each sample, we formed the steerable pyramid decomposition in the same way as in Model 1. The difference was that we considered all the subbands, computing for each the mean, standard deviation, skewness, kurtosis, entropy, and energy values. Moreover, we computed the autocorrelation matrix of each subband and took a 5×5 window as in Model 1, but kept only the unique values of the window to avoid redundancy and accelerate the training process. This resulted in a feature matrix of size 190×27648, which was standardized and sent to the SOM. Figure 6 shows the classification result of this model. As can be noticed, we obtained a better classification result than with Model 1. Overall, the first region has shades of blue and the second one shades of pink, reflecting the fact that these are two distinct textures. Additionally, for the last region, the colors that are close to each other in the SOM result are also close to each other on the chromatic circle. This means the features are similar or almost the same in this region, which increases the probability of a distinct texture here.
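The saving from keeping only the unique window values can be made concrete. The autocorrelation of a real image is point-symmetric about zero lag (ac[dy, dx] = ac[-dy, -dx]), so an n×n window centred on the peak has only (n² + 1)/2 distinct values. The figure of 10 subbands per pyramid below is an inference from the reported matrix sizes, not stated explicitly in the text.

```python
# Unique values of the point-symmetric autocorrelation window, and the
# resulting feature counts per example.
import numpy as np

def unique_half(window):
    """First half of the flattened window, including the centre; by point
    symmetry about the centre this covers every distinct value."""
    flat = window.ravel()
    return flat[:(flat.size + 1) // 2]

def features_per_example(n_subbands, win, n_stats=6):
    """Statistics plus unique autocorrelation-window values, per subband."""
    return n_subbands * (n_stats + (win * win + 1) // 2)

# With 10 subbands per pyramid (inferred), the counts reproduce the
# feature-matrix heights reported for Models 2, 3, and 5:
print(features_per_example(10, 5))   # 190  (Model 2, 5x5 window)
print(features_per_example(10, 7))   # 310  (Model 3, 7x7 window)
print(features_per_example(10, 3))   # 110  (Model 5, 3x3 window)
```

The 5×5 window thus contributes 13 values per subband instead of 25, roughly halving the autocorrelation part of each feature vector.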

Model 3
This model is the same as Model 2 but with a bigger window extracted from the autocorrelation matrix (7×7). Consequently, we obtained a larger feature matrix, of size 310×27648, to be used in the training process of the SOM. The classification result of applying this model to figure 3 is shown in figure 7.

Model 4
In this model, the same procedure as in Model 2 was applied, but going over all the regions of each texture. Thus, a much larger number of samples per texture (147,456) was extracted, resulting in a larger feature matrix of size 190×442368 that was fed to the SOM after normalization. With more examples of each texture, a more precise classification was expected, and indeed, as can be seen in figure 8, we obtained a smoother classification result than with Model 2. This comes from considering all the regions of each texture, capturing more features and increasing accuracy.

Model 5
In this model, we again considered every sample of each texture and constructed the steerable pyramid decomposition for it. The features were extracted from each subband, just as in Model 4, but the window size for the autocorrelation matrix was reduced to 3×3. Therefore, we extracted a feature vector of size 110×1 instead of 190×1 for each example, giving a feature matrix of size 110×147456 per texture and a total feature matrix of size 110×442368 as input for the SOM. Comparing the results of Models 4 and 5, reducing the window size of the autocorrelation matrix from 5×5 to 3×3 did not make any significant difference in the classification. Expanding the window as in Model 3, however, achieved a better classification result.

Model 6
To see the effect of changing the scale and orientation parameters of the steerable pyramid decomposition, in this model we increased the number of orientations to 6 and the number of scales to 3, keeping the rest of Model 2 as it is. Better classification is expected, as a higher number of scales and orientations results in a more finely detailed image decomposition, capturing minor variations and features at different orientations. Imagine, for each texture, a ball on the chromatic circle consisting of the corresponding colors in the SOM result. Visually, the radius of this ball is smaller for Model 6 than for Model 2 for every texture. This means the features within each region are more similar, and thus, as expected, Model 6 performed better than Model 2.

Model 7
The number of representations of each texture in Model 7 is the same as in Model 2, namely 9,216. In addition to the previous parameters (auto-correlation, standard deviation, etc.), the cross-correlation feature was added to the feature matrix of each texture. A 5×5 window was taken from the cross-correlation matrix, centered on its maximum value.
Furthermore, when decomposing the steerable pyramid, the orientation and scale were both set to 1 to speed up the code. Increasing them is generally preferable to capture more detailed information about the texture, but here we only wanted to test which option yields better classification: having cross-correlation and auto-correlation along with the statistical features, or considering the cross-correlation alone with the statistical features, which is examined in the following model. Figure 11 shows the classification result of applying Model 7 to figure 3.

Model 8
Model 8 repeats Model 7 but drops the auto-correlation feature, keeping only the cross-correlation along with the statistical features; its classification result is shown in figure 12. Comparing Models 7 and 8, we can see that, on the whole, considering both the autocorrelation and the cross-correlation leads to better classification results. Figure 11 reflects that we have three distinct textures. In figure 12, however, the first texture shows shades of pink and shades of blue, which also appear in texture 2, suggesting a similarity between the features of regions one and two. This is not true, as the two textures are distinct. A possible reason is that autocorrelation captures local variations in texture structure, such as the presence of edges or corners; by removing the auto-correlation feature, we may lose significant information about the textures, resulting in inaccurate results.
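The cross-correlation window extraction can be sketched in the same spirit as the autocorrelation one. The choice of scipy's correlate2d, the mean removal, and the padding to keep the window in bounds are illustrative assumptions; unlike the autocorrelation, the cross-correlation peak can sit anywhere in the matrix.

```python
# Sketch of the cross-correlation feature: correlate two subbands and keep
# a 5x5 window centred on the maximum of the cross-correlation matrix.
import numpy as np
from scipy.signal import correlate2d

def crosscorr_window(a, b, win=5):
    cc = correlate2d(a - a.mean(), b - b.mean(), mode='full')  # cross-correlation
    iy, ix = np.unravel_index(np.argmax(cc), cc.shape)         # location of the peak
    r = win // 2
    cc = np.pad(cc, r)                 # pad so the window fits even at an edge peak
    return cc[iy:iy + win, ix:ix + win]  # win x win window around the peak

rng = np.random.default_rng(4)
a, b = rng.random((16, 16)), rng.random((16, 16))  # two stand-in subbands
w = crosscorr_window(a, b)
print(w.shape)   # (5, 5)
```

The resulting 13 or 25 values (unique or full, depending on the model) are appended to the statistical features, exactly as the autocorrelation window was in the earlier models.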

Conclusion
In this paper, we presented a two-phase approach to classify textures. In the first phase, the feature extraction phase, we implemented the steerable pyramid decomposition, as it is a powerful technique for extracting texture features: it can analyse images at different scales and orientations and represents an image efficiently. In the second phase, the unsupervised classification phase, we used the self-organizing map to project the features extracted by the steerable pyramid decomposition onto a 2D map and classify each feature vector. The chromatic circle was then used to better visualize the classification result. In total, we developed eight models, changing something in the feature extraction process in each one, and applied them to three 512×512 Brodatz textures.
Based on the classification results, we believe the best strategy to implement for 3D X-ray micro-tomography images of carbonate rocks is to go over all the samples of each texture in the feature extraction procedure and to use all the steerable pyramid subbands to extract features. Furthermore, including both autocorrelation and cross-correlation among the parameters provides complementary information about the data and improves the performance of the classification model. Our results also suggest that increasing the number of orientations and scales in the steerable pyramid decomposition yields more detailed information about the texture. Additionally, we observed that a 7×7 window of the autocorrelation matrix leads to better classification results. Finally, to avoid redundancy and save time, it is essential to include only the unique values of that window, as the autocorrelation is symmetric.

Figure 5. Classification result for figure 3 using Model 1

Figure 6. Classification result for figure 3 using Model 2

Figure 7. Classification result for figure 3 using Model 3

Figure 8. Classification result for figure 3 using Model 4

Figure 9. Classification result for figure 3 using Model 5

Figure 10. Classification result for figure 3 using Model 6

Figure 11. Classification result for figure 3 using Model 7

Figure 12. Classification result for figure 3 using Model 8