Classification of flower images using SVM method through the colour, texture and shape using Histogram, Haar Wavelet and Robert Cross methods

Flowers are plants whose beauty makes the environment more attractive. Beyond decoration, flowers are also used as cut flowers, sowing flowers and herbal medicine. Flowers that grow in Indonesia come in two forms: single flowers and bunches. The characteristics taken from flower images for classification are colour, texture and shape. The purpose of this system design is to classify flowers that grow in Indonesia, which differ in colour, texture and shape, by using the Histogram, Haar Wavelet and Robert Cross methods to obtain feature values that are assembled into a vector and passed to the SVM method to determine the type of flower. SVM is a classifier that has the advantage of being able to process high-dimensional data without significant performance degradation. In experiments with this Indonesian flower classification application, the classification accuracy on the test flowers reached 81.67% when using the colour, texture and shape features extracted with the Histogram, Haar Wavelet and Robert Cross methods and classifying with the SVM method with Linear, Polynomial and RBF kernels.


Introduction
Image processing techniques are developing rapidly. Various techniques have been developed to facilitate human work, whether for processing images, analysing them, or using them for other purposes. Often, images are processed so that they can be used in further applications [1]. This work therefore applies image processing techniques to recognize flowers that grow in Indonesia.

Flowers are well-known plants that offer beauty and fragrance to the environment and have other uses such as herbal medicine. The most attractive part of a flower is its adornment (the petals), which comes in a variety of colours and shapes [2]. In this system, the flower images are obtained from the Google Image collection. The data consist of colour, texture and shape characteristics, extracted with the Histogram, Haar Wavelet and Robert Cross methods respectively. The images cover two forms of flowers that grow in Indonesia: single flowers, such as Roses, Hibiscus, Ashar, Daisies, Dahlias and others, and bunch flowers, such as Bougainvillea, Jasmine, Frangipani, Chrysanthemums and others, each in a variety of colours.

The purpose of this paper is to classify flowers that grow in Indonesia and show their types to users, by building the SVM, Histogram, Haar Wavelet and Robert Cross methods into an application. This work draws on a similar study on the classification of Lamongan batik based on colour, texture and shape features [3].

The data used in this study are images of Indonesian flowers: 960 training images and 120 test images. Figure 2 gives a general description of the system. As Figure 2 shows, there are three main processes: preprocessing, feature extraction, and classification.

Preprocessing
The purpose of preprocessing is to improve image quality. In this research, preprocessing consists of resizing the image from its initial size to 256x256 and converting it from Red, Green, Blue (RGB) to grayscale.
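The two preprocessing operations can be sketched in a few lines of plain Python. This is only an illustration: the paper does not state which interpolation or which RGB-to-grayscale formula it uses, so the nearest-neighbour sampling and the ITU-R BT.601 luminance weights below are assumptions, and the function names are hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of grey levels (0-255)."""
    # Assumption: ITU-R BT.601 luminance weights; the paper does not say
    # which RGB-to-grayscale formula was used.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def resize_nearest(image, new_height=256, new_width=256):
    """Resize a 2-D list of pixels with nearest-neighbour sampling."""
    # Assumption: nearest-neighbour interpolation, chosen only for brevity.
    old_height, old_width = len(image), len(image[0])
    return [[image[y * old_height // new_height][x * old_width // new_width]
             for x in range(new_width)]
            for y in range(new_height)]


def preprocess(rgb_image, size=256):
    """Resize to size x size, then convert the result to grayscale."""
    return to_grayscale(resize_nearest(rgb_image, size, size))
```

Calling `preprocess(image)` on any image given as rows of `(r, g, b)` tuples returns a 256x256 grid of grey levels.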

Feature Extraction
In this research, three feature extraction methods are used: Color Histogram for colour, Haar Wavelet for texture and Robert Cross for shape.

Color Histogram
In digital images, the Color Histogram method describes the number of pixels of each colour within a certain range of values covering the colour space of the image [4]. The steps of the Color Histogram method are as follows [5]: 1. Determine the number of bins using the bin formula. 2. Count the pixels that fall into each predetermined bin and create a histogram from the resulting frequencies.
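The binning and counting steps can be illustrated with a minimal sketch. The 32 bins per channel are an assumption, picked only because 3 x 32 = 96 matches the number of colour features reported in the results; the paper's own bin formula may give a different count. The function names are hypothetical.

```python
def channel_histogram(values, bins=32, max_value=256):
    """Count how many values fall into each of `bins` equal-width bins."""
    bin_width = max_value / bins
    counts = [0] * bins
    for value in values:
        # Clamp so that the largest value lands in the last bin.
        counts[min(int(value / bin_width), bins - 1)] += 1
    return counts


def colour_histogram(rgb_image, bins=32):
    """Concatenate the R, G and B histograms into one feature list."""
    # Assumption: one histogram per RGB channel, concatenated in R, G, B order.
    pixels = [pixel for row in rgb_image for pixel in row]
    features = []
    for channel in range(3):
        features.extend(channel_histogram([p[channel] for p in pixels], bins))
    return features
```

With the default settings the result has 96 entries, and the entries of each channel sum to the number of pixels in the image.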

Haar Wavelet
Haar Wavelet is a transformation built from a series of averaging and differencing operations. It is the simplest wavelet and has become a source of ideas for other wavelet families [6]. Here, Haar Wavelet is used to represent texture characteristics. To compute the transformation, the image is first converted to grayscale and its pixels are read; the Averaging and Differencing calculations are then carried out for each pair of samples, along the rows and along the columns. The absolute values of the resulting matrix are taken, and their mean and variance are arranged in vector form. This vector is used for classifying the texture of the images. The Averaging and Differencing values are given by formulas (2) and (3), respectively [7]. The mean and variance are computed with the formulas from [8].
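A minimal sketch of one averaging and differencing pass, and of the mean and variance statistics, is given here for orientation. It assumes the unnormalised form (a + b) / 2 and (a - b) / 2 and the population variance; the formulas from [7] and [8] may use a different scaling. It also assumes even image dimensions, and the function names are hypothetical.

```python
def haar_step(samples):
    """One averaging/differencing pass over a sequence of even length."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    # Assumption: unnormalised Haar step (no sqrt(2) scaling).
    averages = [(a + b) / 2 for a, b in pairs]
    differences = [(a - b) / 2 for a, b in pairs]
    return averages + differences


def haar_2d(image):
    """One level of the 2-D transform: every row first, then every column."""
    rows_done = [haar_step(row) for row in image]
    columns_done = [haar_step(list(column)) for column in zip(*rows_done)]
    # columns_done holds transformed columns; transpose back to rows.
    return [list(row) for row in zip(*columns_done)]


def mean_abs_and_variance(values):
    """Mean and population variance of the absolute values."""
    absolute = [abs(v) for v in values]
    mean = sum(absolute) / len(absolute)
    variance = sum((v - mean) ** 2 for v in absolute) / len(absolute)
    return mean, variance
```

For example, `haar_step([9, 7, 3, 5])` returns `[8.0, 4.0, 1.0, -1.0]`: the two pair averages followed by the two pair half-differences.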

Robert Cross
Robert Cross (commonly known as the Roberts cross operator) is a method for characterizing shapes in digital images by performing edge detection. Edge detection is one of the fundamental processes in image processing; it aims to identify points in a digital image where the brightness changes sharply or a discontinuity occurs [9]. Robert Cross is a cross operator built from a pair of 2x2 matrices. The operator examines one additional pixel in each gradient direction, but because that pixel lies in the diagonal direction, the pixels involved together form a 2x2 window. The Robert Cross gradients in the x-direction and y-direction are calculated by the formulas [10]:

$R_{+}(x, y) = f(x+1, y+1) - f(x, y)$
$R_{-}(x, y) = f(x, y+1) - f(x+1, y)$

The operator can equivalently be written as a pair of 2x2 convolution kernels.
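The 2x2 window can be illustrated with a short sketch that takes the two diagonal differences at every pixel and combines them into an edge strength. The Euclidean combination and the zero-filled last row and column are assumptions, since the paper does not state how the two gradients are merged or how the border is handled; the function name is hypothetical.

```python
import math


def roberts_cross(gray):
    """Return the gradient-magnitude map of a 2-D list of grey levels."""
    height, width = len(gray), len(gray[0])
    # Assumption: the last row and column, where the 2x2 window does not
    # fit, stay at 0 so that the output keeps the input size.
    edges = [[0.0] * width for _ in range(height)]
    for x in range(height - 1):
        for y in range(width - 1):
            r_plus = gray[x + 1][y + 1] - gray[x][y]    # main diagonal
            r_minus = gray[x][y + 1] - gray[x + 1][y]   # anti-diagonal
            # Assumption: Euclidean magnitude of the two differences.
            edges[x][y] = math.hypot(r_plus, r_minus)
    return edges
```

Keeping the output the same size as the input means a 256x256 image yields 256 x 256 = 65536 values, which equals the number of shape features reported in the results.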

Support Vector Machine (SVM)
Support Vector Machine (SVM) is a supervised classification algorithm that works by finding the hyperplane with the largest margin. SVM is a classifier that has the advantage of being able to process high-dimensional data without significantly reducing performance [11]. In classification with SVM, not every data set can be separated linearly by a hyperplane; many cases involve non-linear data. The kernel method was therefore introduced so that SVM can also be used on non-linear data. Some commonly used kernel functions are [12]: the Linear kernel, the Polynomial kernel and the RBF kernel.
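The three kernels named in this paper (Linear, Polynomial and RBF) can be written in their usual textbook forms, as in the sketch below. The parameter values (`degree`, `coef0`, `gamma`) are placeholders, not the settings used in the experiments, which the paper does not report.

```python
import math


def dot(u, v):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    """K(u, v) = u . v"""
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    """K(u, v) = (u . v + coef0) ** degree"""
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    """K(u, v) = exp(-gamma * ||u - v||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)
```

Each function measures the similarity of two feature vectors; an SVM with a non-linear kernel uses that similarity in place of the plain inner product, which is what lets it separate data that no hyperplane separates in the original feature space.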

Results and discussion
The experiment in this study aims to determine the class of a flower image entered by the user. Flower images are classified into several classes, namely Moon Orchids, Roses, Sunflowers, Ashar, Daisies, Bougainvillea, Jasmine, Chrysanthemums, Frangipani, Asoka and other flowers. The data set contains 1080 images, split into 960 training images and 120 test images. The image size used in this study is 256x256. Each image is described by 65646 features: 96 colour features, 14 texture features and 65536 shape features. In this study, the classification process is divided into two stages: training and testing.
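The feature count can be checked with a small sketch that concatenates the three groups into one vector. Reading the 65536 shape features as one value per pixel of a 256x256 edge map is an assumption based on 256 x 256 = 65536; the function and constant names are hypothetical.

```python
COLOUR_FEATURES = 96
TEXTURE_FEATURES = 14
# Assumption: one shape value per pixel of the 256x256 edge map.
SHAPE_FEATURES = 256 * 256


def build_feature_vector(colour, texture, shape):
    """Concatenate the three feature groups after checking their sizes."""
    groups = (("colour", colour, COLOUR_FEATURES),
              ("texture", texture, TEXTURE_FEATURES),
              ("shape", shape, SHAPE_FEATURES))
    vector = []
    for name, values, expected in groups:
        if len(values) != expected:
            raise ValueError(f"{name}: expected {expected} values, got {len(values)}")
        vector.extend(values)
    return vector
```

The resulting vector has 96 + 14 + 65536 = 65646 entries, so the shape group makes up almost all of it.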

Training Process
This process trains the system and is performed only once. The system cannot provide a conclusion or result until training has been done. Figure 3 shows the training process. Training begins by loading an image into the system. The image is then preprocessed, after which its features are extracted. Finally, the image features are saved into the database or into a file. The stored feature data serve as the knowledge used to determine the class of an input image. Data used for the training process are not reused for the testing process.
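The final training step, saving the extracted features to a file, might look like the following sketch. The JSON layout and the function names are hypothetical; the paper does not describe its database or file format.

```python
import json


def save_training_features(samples, path):
    """Write (label, feature_vector) pairs to a JSON file."""
    records = [{"label": label, "features": list(features)}
               for label, features in samples]
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle)


def load_training_features(path):
    """Read the stored pairs back as a list of (label, features) tuples."""
    with open(path, encoding="utf-8") as handle:
        return [(record["label"], record["features"])
                for record in json.load(handle)]
```

Because the features are written once and read back at test time, training needs to run only one time, as described above.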

Testing Process
In the testing process, the system returns a result for the input given by the user. The system classifies the input image based on the training data. Figure 4 shows the testing process. The test results in Table 1 show that the highest accuracy, 81.67%, is obtained with the Linear-Polynomial-RBF (L-P-R) kernels and the colour, texture and shape features together. Based on the experiments, the system is still not able to assign every image to its correct flower class. The system recognizes classes from colour features with high accuracy, because the images in each class have a base colour very different from that of the other classes. As for the SVM kernel, the highest accuracy is obtained with the Linear-Polynomial-RBF combination, because the kernels are compared to find which one is best to use.
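The accuracy figure and the kernel comparison can be illustrated with a minimal sketch. Choosing the kernel with the highest test accuracy is an assumption about how the comparison is made; the paper does not describe the procedure in detail, and the function names are hypothetical.

```python
def accuracy(predicted, actual):
    """Percentage of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


def best_kernel(predictions_by_kernel, actual):
    """Return the best-scoring kernel name and the score of every kernel."""
    # Assumption: kernels are ranked by plain accuracy on the same labels.
    scores = {name: accuracy(predicted, actual)
              for name, predicted in predictions_by_kernel.items()}
    return max(scores, key=scores.get), scores
```

For reference, 98 correct predictions out of 120 test images give 81.67% when rounded to two decimals, and no other whole number of correct images does; the per-class counts of Table 1 are not reproduced here.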

Conclusion
Based on the experiments, the following conclusions can be drawn: 1. The highest accuracy of 81.67% is obtained when the colour, texture and shape features are used together with the linear, polynomial and RBF kernels. 2. The colour-based feature provides higher accuracy than the other features.