Igneous Rock Classification Using Convolutional Neural Networks (CNN)

This paper describes how convolutional neural networks (CNNs) are used to identify and classify igneous rocks. Igneous rocks are formed when hot magma crystallises and solidifies. Melt originates deep beneath the Earth's surface, near active plate boundaries or hot spots, and then rises to the surface. There are many kinds of igneous rocks, which are addressed throughout this work; distinguishing each one is a difficult feat in and of itself. Deep learning, which is fundamentally a neural network with three or more layers, is a subset of machine learning. These neural networks attempt to mimic the function of the human brain by letting a model "learn" from massive volumes of data, though they are still far from matching it. A Convolutional Neural Network (ConvNet/CNN) is a deep learning approach that assigns importance (learnable weights and biases) to various aspects/objects in a picture while also distinguishing between them. Classification of images, audio and video segmentation, decision support systems, speech recognition, and image analysis are just a few of the applications of CNNs.


Introduction -
Igneous rocks are formed by the cooling and crystallisation of molten rock, known as magma. Magma is molten rock that exists deep within the ground; it is the source of all igneous rock. Because the planet was mostly molten when it originated, magma may be regarded as the beginning of the rock cycle. Igneous rocks' origins are documented in their composition. By carefully examining igneous rocks and interpreting the information they carry, we can infer processes that occur within the Earth and comprehend volcanic activity that occurs on the planet's surface. We can learn about the igneous portion of geologic history by studying igneous rocks [1]. Igneous petrology is the study of the identification, classification, origin, development, and formation and crystallisation processes of igneous rocks [2]. Igneous rock categorisation and identification are critical because they can help us understand the relative mineral richness of the rock. Based on their chemical/mineral composition, texture, and grain size, igneous rocks can be classified as felsic, intermediate, mafic, or ultramafic. Extrusive igneous rocks are fine-grained, i.e. finely textured (infinitesimally small crystals), or amorphous (no minerals; no crystalline structure), whereas intrusive igneous rocks are coarse-grained (visible crystals). The porphyritic texture of volcanic rocks, notably felsic and intermediate, is characterised by visible crystals floating in a fine-grained groundmass [3].
Physically identifying each igneous rock by its texture, grain size, colours, faults, and patterns is a difficult process that requires a skilled geologist with an understanding of rocks and minerals. The act of identifying and classifying rocks can be arduous and time-consuming for people from other fields. This paper discusses a tool that can aid in completing the task. We employed specific deep learning methods to create this tool; based on the image that the user uploads, our system offers an accurate prediction of the rock type [4].
Deep learning is a machine learning and artificial intelligence (AI) methodology that mimics human learning and biological neural networks. Data science, with its statistical insights and predictive modelling, has embraced deep learning technology as an essential component. Data scientists in charge of acquiring, evaluating, and interpreting large amounts of data benefit tremendously from deep learning because it improves and simplifies the task [7]. Deep learning advancements in computer vision and image identification have been established and improved over time, mainly due to the use of a single algorithm: the convolutional neural network.
A Convolutional Neural Network (ConvNet/CNN) is a deep learning system that can ingest a picture and assign relevance (learnable weights and biases) to various aspects/objects while also discerning between them. The amount of pre-processing required by a ConvNet is significantly lower than that required by other classification techniques: while basic approaches need hand-engineering of filters, a CNN can learn these filters/characteristics on its own given sufficient training [17]. The construction of a CNN was influenced by the arrangement of the visual cortex, and it resembles the connection pattern of neurons in the human brain. Individual neurons respond only to stimuli in the receptive field, a small portion of the visual field; a set of such overlapping fields covers the entire visual region. The CNN's role is to compress the images into a format that is easier to handle while keeping the properties crucial for accurate prediction [17].

Literature Review -
Sharma N., Jain V., Mishra A., "An analysis of convolutional neural networks for image classification". This study examines the performance of common convolutional neural networks (CNNs) for recognising objects in real-time image processing [23].
Xin M. and Wang Y., "Research on image classification model based on deep convolution neural network". This paper explains the architecture of image classification models based on deep convolutional neural networks [24].
H. C. Shin, Holger R. Roth, Gao M., Le Lu, Xu Z., Nogues I., Yao J., "Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning". Three crucial but formerly unexplored aspects of using deep convolutional neural networks to solve computer-aided detection challenges are described in this research. First, the study and assessment of several CNN architectures are described; the models analysed range in size from 5,000 to 160 million parameters, with varying numbers of layers. The effect of dataset scale and spatial image context on performance is then determined [25].
JP Iddings, "The origin of igneous rocks". This paper presents an in-depth analysis of igneous rocks' formation, features, and types [26].
MacKenzie W.S., Donaldson C. H. and C. Guilford, "Atlas of igneous rocks and their textures". The textures of various volcanic and plutonic igneous rocks are discussed [27].

Methodology-
To predict and identify rocks, the model first needs to be trained. For training, a good number of images with decent resolution and quality are required. The following flow chart explains the sequence of tasks we need to perform to create an accurate model.

a. Data collection and data splitting -
Data collection is the process of obtaining, measuring, and interpreting precise information from a multitude of sources in order to explore solutions, answer questions, and assess outcomes [12]. To train our model, images of various igneous rocks are required. It is recommended to collect at least 150-200 distinct images of each class of igneous rock. This paper aims to perform a multiclass classification of 4 types of igneous rocks; however, any number of igneous rock classes can be considered if enough image data is available for each class. The four classes of igneous rocks used in the scope of this paper are andesite, basalt, diorite, and obsidian.

Training and testing folders
First create two folders in any directory and name them "train" and "test". Create 4 folders named "andesite", "basalt", "diorite", and "obsidian" inside each of these training and testing folders. Split the total dataset of each class in a ratio of 80:20 between the training and testing folders (for example, if 200 images of andesite are available, move 160 images to the andesite folder inside the training folder and the remaining 40 images to the andesite folder of the testing folder). Once the images are moved to their respective folders, the process of training and testing the model begins.
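The split above can also be scripted rather than done by hand. A minimal sketch in Python (the "raw" source folder name and the 80:20 ratio are illustrative assumptions):

```python
import os
import random
import shutil

def split_class(src_dir, train_dir, test_dir, ratio=0.8, seed=42):
    """Copy the images of one rock class into train/test folders at the given ratio."""
    images = sorted(os.listdir(src_dir))
    random.Random(seed).shuffle(images)       # shuffle so the split is unbiased
    cut = int(len(images) * ratio)            # e.g. 160 of 200 go to training
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)
    for name in images[:cut]:
        shutil.copy(os.path.join(src_dir, name), train_dir)
    for name in images[cut:]:
        shutil.copy(os.path.join(src_dir, name), test_dir)

if os.path.isdir("raw"):   # hypothetical folder holding the unsplit images
    for rock in ["andesite", "basalt", "diorite", "obsidian"]:
        split_class(os.path.join("raw", rock),
                    os.path.join("train", rock),
                    os.path.join("test", rock))
```

A fixed seed keeps the split reproducible across runs, which matters when comparing models.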
Inside the folders, images of the specific rock class are stored; in this case, basalt.

b. Tools and techniques -
Python is the programming language used in this project. The Python code is developed in Jupyter Notebook. When hosted on a cloud service, the notebook saves information on a cloud server and does all processing in the cloud, allowing you to interact with applications and APIs that may demand a high system specification (RAM, GPU).
The Keras Sequential API is used to create and train the model. This API simplifies the process of building and training to only a few lines of code. The Keras API is bundled with TensorFlow; as a result, simply installing TensorFlow from your command prompt is enough [15].
TensorFlow is a machine learning framework that streamlines the entire process. It offers a diverse ecosystem of tools, libraries, and community resources that enables academics to continue advancing trailblazing techniques in machine learning, and developers to quickly build and deploy machine learning algorithms [7] [13]. TensorFlow, a Python library for swift execution of computationally intensive tasks, was designed and published by Google [13].
Keras is a deep learning API written in Python that operates on top of the TensorFlow machine learning framework. It was created with the specific aim of enabling speedy experimentation and research [8]. Keras uses many optimisation methods to make its high-level neural network API simpler and more performant [7]. It is a straightforward framework for creating deep learning models with TensorFlow, and it can run on both the CPU and GPU. Established neural network deployment approaches, by contrast, either add substantial latency, cost a lot of money and time to incorporate into embedded code bases, or support only a small number of model types.

Importing the libraries
First import the TensorFlow platform and the Keras API. Data augmentation is a crucial aspect of the training process; for it, we use the "ImageDataGenerator" class that comes with Keras. Then import the Sequential class. The basic concept of the Sequential API is to arrange the Keras layers in a sequential manner, hence the name. Most ANNs have layers arranged in a sequential order, with data flowing from one layer to the next in the specified order until it reaches the output layer. Then import the activation, dropout, flatten, dense, and CNN layers. An activation function governs how the sum of weighted inputs at a node or layer of the network is transformed into an output. Dropout is a method of preventing a model from being overfit: it sets the outputs of hidden nodes (the neurons that comprise the hidden layers) to 0 at random throughout every cycle of the training procedure [8] [15].
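Assuming TensorFlow 2.x (which bundles Keras), the imports described above might look like this:

```python
# TensorFlow platform and the Keras API it ships with
import tensorflow as tf

# Data augmentation utility
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Sequential model class: layers are stacked in order, input to output
from tensorflow.keras.models import Sequential

# Layers used in this paper's CNN: convolution, pooling, flatten,
# dense (fully connected), dropout, and activation
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Flatten,
                                     Dense, Dropout, Activation)
```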

Training the model -
Once the images are arranged in all folders, we define two variables, "train_datagen" and "training_set". The train_datagen variable calls the ImageDataGenerator class, which helps in the process of image augmentation. The second variable, training_set, stores the path of the directory in which the training images are stored. These images can be stored locally as well as in cloud storage platforms like Google Drive and Dropbox [11].
The ImageDataGenerator() class has several arguments: rescale, which rescales RGB pixel values (usually somewhere in the range 0-255) into numbers between 0 and 1; shear_range; and zoom_range. The flow_from_directory method has the following arguments: the directory path; target_size, the pixel size of the image; batch_size, the number of images to be taken in a single batch; and class_mode, where the categorical class mode is used since we have multiple classes.
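Putting these arguments together, a hedged sketch (the augmentation strengths, the 64x64 target size, and the batch size of 32 are illustrative choices, not prescriptions):

```python
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,    # map RGB values from the 0-255 range into 0-1
    shear_range=0.2,      # illustrative augmentation strengths
    zoom_range=0.2)

if os.path.isdir("train"):   # expects one sub-folder per rock class
    training_set = train_datagen.flow_from_directory(
        "train",
        target_size=(64, 64),    # assumed input size; any consistent size works
        batch_size=32,
        class_mode="categorical")
```

flow_from_directory infers the class labels from the sub-folder names, which is why the folder layout described earlier matters.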
Now the process of creating the model starts. To create a CNN model, we add several layers to it using "keras.layers" and the add() function. The CNN is made up of three types of standard layers: convolutional layers, pooling layers, and fully-connected (FC) layers. When these layers are combined, a standard CNN architecture is formed. The remaining two significant components are the activation function and the dropout function [10]. A standard diagram of a CNN architecture is shown below.
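As a concrete illustration, a small architecture of this kind could be assembled with add() as follows; the layer widths and the 64x64 input size are example values, not the paper's exact configuration:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
# Convolutional + pooling blocks extract and condense image features
model.add(Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
# Fully-connected head classifies the extracted features
model.add(Flatten())
model.add(Dense(128, activation="relu"))
model.add(Dropout(0.5))                       # guard against overfitting
model.add(Dense(4, activation="softmax"))     # one output per rock class

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The categorical cross-entropy loss matches the categorical class mode chosen for the data generator.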

i. Convolutional Layer-
To detect picture characteristics, a convolution filter is applied to the image. It works as follows: a convolution is a function that multiplies a collection of weights by the neural network's inputs. A kernel (a 2D array of parameters) or filter (a 3D structure) passes over the image several times during the multiplication process. The filter is applied from left to right and top to bottom to cover the entire image. During the convolution, a mathematical operation known as the dot or scalar product is used: the filter's weights are multiplied by the corresponding input values, and the products are summed together to give each filter position a unique value [18][19] [21].
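The sliding dot product can be demonstrated in a few lines of NumPy; the 4x4 image and 2x2 kernel values below are arbitrary illustrations:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and take the dot product at each position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)   # dot product of weights and inputs
    return out

image = np.array([[1, 2, 0, 1],
                  [0, 1, 3, 1],
                  [1, 0, 1, 2],
                  [2, 1, 0, 1]], dtype=float)
edge_kernel = np.array([[1, -1],
                        [1, -1]], dtype=float)   # responds to vertical contrast
print(convolve2d(image, edge_kernel))            # 4x4 input, 2x2 kernel -> 3x3 output
```

Note how the output is smaller than the input: a k x k kernel over an n x n image yields an (n-k+1) x (n-k+1) feature map.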

ii. Pooling Layer-
The pooling layers shrink the image over time, preserving just the most crucial details. For each set of pixels, either the pixel with the peak value is kept (this is known as max pooling) or simply their average is kept (average pooling). By lowering the number of calculations and parameters in the network, pooling layers aid in the management of overfitting [18].
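For example, 2x2 max pooling keeps one peak value per block, shrinking a 4x4 feature map to 2x2 (the values are arbitrary; replacing max with mean would give average pooling):

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Keep the peak value in each size x size block (max pooling)."""
    h, w = feature_map.shape
    return feature_map.reshape(h // size, size, w // size, size).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 5],
               [0, 1, 3, 2],
               [2, 6, 1, 1]], dtype=float)
print(max_pool(fm))   # the 4x4 map shrinks to 2x2, one peak per block
```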

iii. Fully Connected Layer-
A fully connected neural network is made up of layers whose nodes are all linked. A fully connected layer can be expressed as a function from R^m to R^n, where each output dimension depends on every input dimension. "Neurons" is the term used to describe the nodes in fully connected networks; as a result, fully connected networks are referred to as "neural networks" elsewhere in the literature [21].

iv. Dropout-
The Dropout layer, which helps minimise overfitting, randomly sets input unit values to 0 with a frequency given by a specified rate at each step during training. Inputs that are not set to 0 are scaled up by 1/(1 - r), where r is that specified rate, so that the sum over all inputs is unchanged [19].
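The scaling rule can be checked numerically; with r = 0.5, each surviving unit is doubled (the activation values and random seed below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.5
activations = np.array([0.2, 0.8, 0.5, 1.0])

keep = rng.random(activations.shape) >= rate                # each unit survives with probability 1 - rate
dropped = np.where(keep, activations / (1.0 - rate), 0.0)   # survivors scaled by 1/(1 - r)
```

Because the expected value of each unit is preserved, no rescaling is needed at inference time, when dropout is switched off.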

v. Activation Functions-
Furthermore, the activation function can be considered one of the most critical parts of a CNN model. These functions let the network learn and estimate correlations between nodes, including continuous and complicated ones. In simple words, an activation function defines which model characteristics are transmitted forward and which are suppressed at each node of the network [12][20].
Note: the activation function employed here is ReLU, the rectified linear activation function, a piecewise function that returns 0 if the input is negative and returns the input itself if it is positive. Because it is easier to train and often produces better-quality results, it has become the preferred activation function for many modelling techniques [13].
A nonlinear activation function is applied to the outcome of a linear operation, such as convolution. Although relatively smooth nonlinear functions like the sigmoid or hyperbolic tangent (tanh) were once prominent as mathematical representations of actual neuron behaviour, the rectified linear unit (ReLU) is now the most prominent nonlinear activation function [22].
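ReLU itself is a one-line function; note how negatives are clipped to 0 while positive inputs pass through unchanged:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # negatives become 0
```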
f(x) = max(0, x)

The testing of our model is complete. The performance parameter shows an accuracy of 95%, which is great. (Note: this accuracy might vary depending upon the type and quality of the data.)

Model summary -
Here we show our model summary. We need the model summary to confirm that everything (layers, arguments, and other parameters) is as expected [14][15].

Performance parameters -
For performance parameters, two line graphs are plotted: one of accuracy vs. epochs and the other of loss vs. epochs. The graphs show that the validation and training accuracies were almost identical, with an accuracy of over 90%. The CNN model's loss is a downward-trending graph, indicating that the model is functioning as predicted, with the loss decreasing with each epoch [16] [19]. Once the image path is entered into the input box shown above, the image data is passed through the neural network, which predicts the rock type.
The uploaded test image is displayed, and finally the model predicts an outcome. The output is usually in the form of an array; a simple if-else Python statement can convert it into the class name. Finally, the desired result is obtained.
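A hedged sketch of this prediction step (the helper name predict_rock, the alphabetical class order, and the 64x64 size are assumptions mirroring the training setup):

```python
import numpy as np
from tensorflow.keras.preprocessing import image

# Assumed to follow training_set.class_indices (alphabetical by folder name)
CLASS_NAMES = ["andesite", "basalt", "diorite", "obsidian"]

def predict_rock(model, img_path):
    """Load one image, run it through the trained model, return the class name."""
    img = image.load_img(img_path, target_size=(64, 64))  # same size as training
    arr = image.img_to_array(img) / 255.0                 # same rescaling as training
    arr = np.expand_dims(arr, axis=0)                     # add batch dimension
    probs = model.predict(arr)[0]                         # array of class probabilities
    return CLASS_NAMES[int(np.argmax(probs))]             # index of highest probability
```

Taking the argmax of the probability array replaces the chain of if-else statements mentioned above.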
In this case the Igneous rock is diorite!

Conclusion -
Classification of igneous rocks using a CNN algorithm can help us automate many aspects of geological surveys. First, organise the image dataset into two folders, "train" and "test". After this, the model is built from several layers, such as the convolutional layer, the pooling layer, and the activation function (in this case, ReLU). This CNN algorithm can classify images with considerably good accuracy. The four igneous rock classes we have taken for this paper are andesite, basalt, diorite, and obsidian. Once the user inputs a rock image to the algorithm, the result is predicted. In our case we provided an image of diorite, which our model predicted accurately. There is no restriction on the number of classes; however, it is important to organise the rock images in proper folders, and the images should be of good quality.
This tool makes geologists' lives easier. Geological surveying can be automated using such machine learning tools. This rock classification algorithm can also be used for surveying rocks and minerals on a larger scale, and for understanding underlying geological structures. The CNN image classification algorithm can also be applied to several other geological/mining problems, such as predicting rock fragmentation, predicting river basin sediments, etc.