Satellite classification and segmentation using non-additive entropy

Here we compare the Boltzmann-Gibbs-Shannon (standard) entropy with the Tsallis entropy in the pattern recognition and segmentation of colored images obtained by satellites, via "Google Earth". By segmentation we mean partitioning an image to locate regions of interest. Here, we discriminate and define image partition classes according to a training basis. This training basis consists of three pattern classes: aquatic, urban and vegetation regions. Our numerical experiments demonstrate that the Tsallis entropy, used as a feature vector composed of distinct entropic indexes q, outperforms the standard entropy. There are several applications of our proposed methodology, since satellite images can be used to monitor migration from rural to urban regions, agricultural activities, oil spreading on the ocean, etc.


I. INTRODUCTION
Image pattern recognition is a common issue in medicine, biology, geography, etc.; in short, in domains that produce huge amounts of data in image format. Entropy was originally interpreted as a measure of disorder. Nowadays, however, it is interpreted as a lack of information. Thus, it has been used as a methodology to measure the information content of a signal or an image. In image analysis, the greater the entropy, the more irregular and patternless a given image is. The additive property of the standard entropy allows its use in several situations just by summing up image characteristics. Among the non-additive entropies, we study the Tsallis entropy, which has been proposed to extend the scope of application of classical statistical physics. Here, we compare the additive Boltzmann-Gibbs-Shannon (standard) entropy [1] and the non-additive Tsallis entropy [2] when dealing with colored satellite images.
We start by defining the standard entropy for black and white images and simply extend its use to coloured images, justified by its additive property. Next, we consider the Tsallis entropy for black and white images and extend it to coloured images. Due to non-additiveness, we call attention to some characteristics that help to qualify these images more efficiently than the standard entropy.

II. NON-ADDITIVE ENTROPY
Firstly, consider a black and white image with $L_x \times L_y$ pixels. The integers $i \in [1, L_x]$ and $j \in [1, L_y]$ run along the $\hat{x}$ and $\hat{y}$ directions, respectively. Let the integer $p_{i,j} \in [0, 255]$ represent the gray-level intensity of pixel $(i, j)$. The histogram $p(x)$ of a gray-level image is obtained by counting the number of pixels with a given intensity $p_{i,j}$. For colored images, a given pixel has three components: red ($k = 1$), green ($k = 2$) and blue ($k = 3$), and the integer intensity of each of these colours is written as $p_{i,j,k} \in [0, 255]$, with $k = 1, 2, 3$. This leads to a different histogram for each color, $p_k(x)$, and hence a different entropy for each color, $H_k$, with $k = 1, 2, 3$.
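As an illustration of the per-color histograms $p_k(x)$, the following minimal sketch counts intensities channel by channel. It assumes the image is given as a nested list of `(r, g, b)` integer tuples and that the counts are normalized by the number of pixels so that each histogram sums to one; the function name `channel_histograms` and the toy image are hypothetical.

```python
def channel_histograms(image):
    """Return three normalized 256-bin histograms p_k(x), one per RGB channel.

    `image` is assumed to be a nested list with image[i][j] = (r, g, b),
    each component being an integer intensity in [0, 255].
    """
    counts = [[0] * 256 for _ in range(3)]
    n_pixels = 0
    for row in image:
        for pixel in row:
            n_pixels += 1
            for k in range(3):
                counts[k][pixel[k]] += 1
    # Assumption: dividing by the pixel count turns counts into probabilities.
    return [[c / n_pixels for c in channel] for channel in counts]


# A tiny 2 x 2 image: two pure-red pixels, one black and one white.
image = [[(255, 0, 0), (255, 0, 0)],
         [(0, 0, 0), (255, 255, 255)]]
p_red, p_green, p_blue = channel_histograms(image)
```

In this toy image three of the four pixels have a saturated red component, so `p_red[255]` is 3/4, while the green and blue histograms concentrate at intensity 0.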
For two images A and B, for a given color, the entropy of the composed image is the entropy of one image plus that of the other. This is the additivity property of the standard entropy, which leads to: $H_k(A + B) = H_k(A) + H_k(B)$.
Secondly, consider the black and white image mentioned before. For it, the Tsallis entropy generalizes the standard entropy [3]: $S_q = \sum_{x=0}^{255} p(x) \ln_q(1/p(x))$, where the generalized logarithmic function is $\ln_q(x) = (x^{1-q} - 1)/(1 - q)$, so that, as $q \to 1$, one retrieves the standard logarithm and, consequently, the standard entropy.
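The generalized logarithm and the resulting entropy can be sketched in a few lines. The sketch assumes the generalized logarithm in the form $\ln_q(x) = (x^{1-q} - 1)/(1 - q)$, which is the form that satisfies the product rule with the factor $(1 - q)$ and gives $S_q = (1 - \sum_x p(x)^q)/(q - 1)$. Skipping empty histogram bins is also an assumption (the usual convention), and the function names are hypothetical.

```python
import math


def ln_q(x, q):
    """Generalized logarithm; reduces to math.log(x) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def tsallis_entropy(p, q):
    """S_q = sum_x p(x) ln_q(1/p(x)); empty bins (p(x) = 0) are skipped."""
    return sum(px * ln_q(1.0 / px, q) for px in p if px > 0.0)


# Uniform distribution over four intensity levels.
uniform = [0.25] * 4
s_shannon = tsallis_entropy(uniform, 1.0)  # standard entropy, ln 4
s_q2 = tsallis_entropy(uniform, 2.0)       # 1 - sum p^2 = 0.75
```

For the uniform four-level histogram, $q = 1$ gives $\ln 4$ and $q = 2$ gives $1 - 4 \times (1/4)^2 = 0.75$, which is a quick sanity check of the limit and of the closed form.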
To build a feature vector, one simply uses $n$ different entropic values: $\vec{S}_{bw} = (S_{bw,q_1}, S_{bw,q_2}, \ldots, S_{bw,q_n})$, so that, for $n = 1$ and $q = 1$, one retrieves the standard entropy image qualifier. Notice the richness introduced by this qualifier. Even for $n = 1$, we already have an infinite range of entropic indexes to address. This richness is amplified for $n > 1$, considering instances of $q < 1$, $q = 1$ and $q > 1$ [4].
Since $\ln_q(x_1 x_2) = \ln_q(x_1) + \ln_q(x_2) + (1 - q) \ln_q(x_1) \ln_q(x_2)$, see Ref. [5], $S_{bw,q}$ is non-additive, leading to interesting results when composing two images A and B. The entropy of the composed image is $S_{bw,q}(A + B) = S_{bw,q}(A) + S_{bw,q}(B) + (1 - q) S_{bw,q}(A) S_{bw,q}(B)$, which, for $q \neq 1$, is not simply the summation of two entropic values. This property leads to different entropic values depending on how one partitions a given image. The final image entropy is not simply the summation of the entropies of all its partitions, but depends on the sizes of these partitions.
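The composition rule can be checked numerically. The sketch below assumes that composing A and B means taking the joint distribution of two statistically independent histograms, $p_{AB}(x, y) = p_A(x) p_B(y)$, and uses the closed form $S_q = (1 - \sum_x p(x)^q)/(q - 1)$; the two small histograms are hypothetical.

```python
def tsallis_entropy(p, q):
    """S_q = (1 - sum_x p(x)^q) / (q - 1), valid for q != 1."""
    return (1.0 - sum(px ** q for px in p if px > 0.0)) / (q - 1.0)


p_a = [0.5, 0.3, 0.2]
p_b = [0.6, 0.4]
# Assumption: A and B are independent, so the joint distribution factorizes.
p_ab = [pa * pb for pa in p_a for pb in p_b]

q = 0.5
s_a = tsallis_entropy(p_a, q)
s_b = tsallis_entropy(p_b, q)
s_joint = tsallis_entropy(p_ab, q)
# Non-additive composition rule with the cross term (1 - q) S(A) S(B).
s_rule = s_a + s_b + (1.0 - q) * s_a * s_b
```

Here `s_joint` agrees with `s_rule` up to rounding, while it differs from the plain sum `s_a + s_b` by the cross term, which vanishes only at $q = 1$.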
For colored images, we proceed as before: we calculate the entropy of each color component, in principle with different entropic index values: $q^{(r)}$, $q^{(g)}$ and $q^{(b)}$. For the sake of simplicity, we consider the same entropic index for all the color components. For color $k$ the entropy is: $S_{q,k} = \sum_{x=0}^{255} p_k(x) \ln_q(1/p_k(x))$ (2), with $k = 1, 2, 3$, which retrieves Eq. (1) for $q = 1$.
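A minimal sketch of a multi-q feature vector for a colored image follows. It assumes that the per-color entropies are simply concatenated, color by color, over a common grid of entropic indexes; the text does not specify how the colors are combined, so this layout, the function names and the toy histograms are all hypothetical.

```python
import math


def tsallis_entropy(p, q):
    """Tsallis entropy of a normalized histogram; Shannon form at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(px * math.log(px) for px in p if px > 0.0)
    return (1.0 - sum(px ** q for px in p if px > 0.0)) / (q - 1.0)


def color_features(histograms, q_values):
    """Concatenate S_{q,k} over the colors k and the entropic indexes q."""
    return [tsallis_entropy(p_k, q) for p_k in histograms for q in q_values]


# The same entropic indexes for all colors: q in [0, 2] in steps of 1/10.
q_values = [i / 10 for i in range(21)]

# Toy histograms: flat red, two-level green, single-level blue.
flat = [1.0 / 256] * 256
two_level = [0.5 if x in (0, 255) else 0.0 for x in range(256)]
constant = [1.0 if x == 128 else 0.0 for x in range(256)]
features = color_features([flat, two_level, constant], q_values)
```

With 21 entropic indexes and three colors the vector has 63 entries. The single-level channel contributes zeros for every q, and the entries at $q = 1$ are the standard entropies $\ln 256$ and $\ln 2$ of the flat and two-level channels.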

III. METHODOLOGY
Considering pattern recognition in images, the main objective is to classify a given sample according to a set of classes from a database. In supervised learning, the classes are predetermined. These classes can be conceived of as a finite set, previously arrived at by a human. In practice, a certain segment of data is labelled with these classifications. The classifier's task is to search for patterns and classify a sample as one of the database classes.
The reason to use the multi-q analysis is that a feature vector gives us more and richer information than a single value of entropy. The correct choice of q indexes emphasizes characteristics and provides better classifications.
The following steps describe image treatment, training and validation:
• using the Google Earth software, capture images from several locations (Figure 3);
• segment each image into partitions of 16 × 16 pixels;
• for each partition, write the colors Red, Green and Blue in a three-dimensional array;
• for each array and for each color, build the histograms and calculate the Tsallis entropies (Eq. 2), for q ∈ [0, 2] in steps of 1/10;
• create the feature vector and apply the classifiers k-nearest neighbors (KNN), Support Vector Machine (SVM) and Best-First Decision Tree (BFTree);
• deliver an output image with the segmented partitions highlighted (aquatic region = yellow, urban region = cyan, vegetation region = magenta) according to the classification of the KNN classifier.
Table I presents the hit rate percentage of each classifier evaluated for the three methods: multi-q analysis, multi-q analysis with attribute selection and standard entropy analysis. Although a feature vector gives us more information than a single entropy value, it also carries some redundant information. In this context, feature selection is important to eliminate those redundancies.
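The partitioning, feature extraction and classification steps above can be sketched as follows. This is a simplified illustration and not the implementation used in the experiments: it uses a plain Euclidean k-nearest-neighbors vote with k = 3 (an assumption), synthetic gray tiles instead of Google Earth captures, and hypothetical labels "flat" and "textured" in place of the aquatic, urban and vegetation classes.

```python
import math
from collections import Counter


def tsallis_entropy(p, q):
    """Tsallis entropy of a normalized histogram; Shannon form at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(px * math.log(px) for px in p if px > 0.0)
    return (1.0 - sum(px ** q for px in p if px > 0.0)) / (q - 1.0)


def tiles(image, size=16):
    """Yield (row, col, tile) for every full size x size partition."""
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


def tile_features(tile, q_values):
    """Per-channel histograms of one tile, then S_q for every q and color."""
    n_pixels = sum(len(row) for row in tile)
    features = []
    for k in range(3):  # red, green, blue
        counts = [0] * 256
        for row in tile:
            for pixel in row:
                counts[pixel[k]] += 1
        p_k = [c / n_pixels for c in counts]
        features.extend(tsallis_entropy(p_k, q) for q in q_values)
    return features


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors closest to x (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def flat_tile(value, size=16):
    """Synthetic tile with a single gray level (hypothetical training data)."""
    return [[(value, value, value)] * size for _ in range(size)]


def textured_tile(seed, size=16):
    """Synthetic tile with many gray levels (hypothetical training data)."""
    return [[((seed + 17 * i + 31 * j) % 256,) * 3 for j in range(size)]
            for i in range(size)]


q_values = [i / 10 for i in range(21)]  # q in [0, 2] in steps of 1/10
train_tiles = [flat_tile(10), flat_tile(200), textured_tile(0), textured_tile(5)]
train_y = ["flat", "flat", "textured", "textured"]
train_x = [tile_features(t, q_values) for t in train_tiles]

# A 16 x 32 test image: a flat left half next to a textured right half.
image = [a + b for a, b in zip(flat_tile(90), textured_tile(3))]
labels = [knn_predict(train_x, train_y, tile_features(t, q_values))
          for _, _, t in tiles(image)]
```

Because the features depend only on the shape of the histogram and not on which intensities are occupied, the flat test tile matches the flat training tiles even though its gray level differs from theirs. Producing the highlighted output image would amount to painting each tile at `(row, col)` with the color assigned to its predicted label.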
Table I. Several classifiers are used (SVM, KNN and BFTree) to compare the performance of the generalised entropy with respect to the standard one in pattern recognition. The number of features of each method is indicated in parentheses.

IV. CONCLUSION
Our study indicates that the Tsallis non-additive entropy can be successfully used in the construction of a feature vector for coloured satellite images. This entropy generalizes the Boltzmann-Gibbs one, which can be retrieved with $q = 1$. For $q \neq 1$, the image retrieval success is better than in the standard case ($q = 1$), since the entropic parameter $q$ allows a more thorough image exploration.

Figure 1 shows a gray-scale image and the histogram $p(x)$ produced from this image.

Figure 2. Comparison between images with low and high entropy.

Figure 5 depicts the image highlights produced by the KNN method, evaluated in a region that contains the three types of pattern classes: aquatic, urban and vegetation regions.

Figure 5. Segmentation obtained by the multi-q method and highlights provided by the KNN classifier. The yellow color indicates an aquatic region, the cyan color indicates an urban region and the magenta color indicates a vegetation region.