Nonlinear dimensionality reduction of hyperspectral data based on spectral information divergence preserving principle

The paper proposes a nonlinear dimensionality reduction technique for hyperspectral data based on the principle of preserving the spectral information divergence (SID). In this technique, the spectral information divergence is used as the spectral dissimilarity measure in both the original and the reduced space. The technique is evaluated experimentally on publicly available hyperspectral images by solving the problem of per-pixel classification. The developed method extends the existing hyperspectral data analysis toolkit for reducing the dimensionality of hyperspectral data based on the principle of preserving specified spectral dissimilarity measures.


Introduction
A hyperspectral image is a three-dimensional array with two spatial and one spectral dimension. Hyperspectral images are recorded using high spectral resolution sensors. Hence, each sample of such images is a vector that includes up to several hundred spectral components.
Hyperspectral images are widely used in various fields. On the one hand, their use provides new opportunities, making it possible to extract information about the materials (components) present in an image. On the other hand, it imposes difficulties caused by the high spectral dimension. As a result, the most important task is to eliminate the redundancy of such images while retaining the important spectral information.
Linear dimensionality reduction methods are most often used to eliminate this redundancy. The most popular linear method is principal component analysis (PCA) [1], which searches for a linear projection into a lower-dimensional subspace that maximizes the variance of the data. In recent years, nonlinear dimensionality reduction methods have grown in popularity for this problem. Of particular interest when working with nonlinear techniques is the possibility of taking a desired spectral dissimilarity measure into account. Although many nonlinear methods allow the chosen spectral dissimilarity measure to be used in the hyperspectral input space (see, for example, [8,9,10]), only a few works [11,12] are known that attempt to take the chosen dissimilarity measure into account in the reduced output space. In both of these works, the methods are based on the principle of preserving a selected measure of spectral dissimilarity. In [11], the authors proposed several nonlinear mapping methods based on the principle of preserving the spectral angle mapper (SAM) measure [13]. In [12], a method based on the principle of preserving the spectral correlation measure [13] was introduced.
The method proposed in this paper is based on the same ideas as [11,12]. Unlike those works, however, it is based on the principle of preserving the spectral information divergence (SID) [13]. Along with the spectral angle and spectral correlation measures, SID is one of the three best-known non-Euclidean measures of spectral dissimilarity in hyperspectral data analysis, and it shows good results in hyperspectral data classification problems.
The paper is organized as follows. Section 2 describes the basics of the developed dimensionality reduction method. In particular, we introduce the quality measure for the dimensionality reduction and derive a numerical optimization procedure based on stochastic gradient descent. Section 3 describes the experimental studies. In that section, the proposed method is compared, on publicly available hyperspectral images, with alternative nonlinear dimensionality reduction methods based on other measures of spectral dissimilarity. The effectiveness of the proposed approach is studied in terms of classification accuracy. The paper ends with a conclusion in Section 4 and a list of references in Section 5.

Method
Consider a hyperspectral image $X$ of size $W \times H$ containing $M$ spectral components. Each sample $x_i$, $i = 1, \dots, N$, $N = WH$, of such an image can be considered as a vector in the $M$-dimensional hyperspectral space $\mathbb{R}^M$.
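The sample notation above can be illustrated with a minimal sketch that flattens an image cube into an N x M matrix of spectral vectors. The function name `cube_to_samples`, the (H, W, M) axis order, and the synthetic data are assumptions made for illustration; real scenes may be stored with a different layout.

```python
import numpy as np

def cube_to_samples(cube):
    """Flatten an (H, W, M) hyperspectral cube into an (N, M) matrix, N = W * H."""
    h, w, m = cube.shape  # assumed axis order: rows, columns, spectral components
    return cube.reshape(h * w, m)

# Synthetic stand-in for a real scene: H = 4, W = 5, M = 6 spectral components.
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 6))
X = cube_to_samples(cube)
print(X.shape)  # (20, 6)
```

Each row of `X` then plays the role of one sample $x_i$; with row-major flattening, the pixel at row r and column c ends up at index r * W + c.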
As mentioned in the introduction, spectral information divergence (SID) is a fairly popular measure in hyperspectral image analysis. For a pair of vectors $x_i$ and $x_j$, this measure is given as follows [13]:

$$\mathrm{SID}(x_i, x_j) = D(x_i \| x_j) + D(x_j \| x_i), \qquad D(x_i \| x_j) = \sum_{k=1}^{M} p_k(x_i) \log \frac{p_k(x_i)}{p_k(x_j)}, \qquad p_k(x) = \frac{x_k}{\sum_{l=1}^{M} x_l}. \tag{1}$$

Here, $p(x) = (p_1(x), \dots, p_M(x))^T$ is considered as the probability distribution of a discrete random variable generated by the vector $x$, and $D(x_i \| x_j)$ is the relative entropy, that is, the difference between the self-information of the vector $x_j$ and that of the vector $x_i$, averaged over all spectral components with the weights $p_k(x_i)$. Note that the spectral information divergence (SID) is a symmetric measure by construction.
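The definition of SID can be made concrete with a short numerical sketch. It follows the standard formulation of the measure (normalize each spectrum to a probability vector, then sum the two relative entropies). The function name `sid` and the small constant `eps`, added to avoid taking the logarithm of zero, are assumptions made for illustration.

```python
import numpy as np

def sid(x, y, eps=1e-12):
    """Spectral information divergence between two non-negative spectra."""
    # Normalise each spectrum to a discrete probability distribution.
    p = np.asarray(x, dtype=float) + eps   # eps: hypothetical guard against log(0)
    q = np.asarray(y, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    # D(x||y) + D(y||x) collapses to a single sum over spectral components.
    return float(np.sum((p - q) * (np.log(p) - np.log(q))))

a = np.array([0.2, 0.5, 0.3])
b = np.array([0.4, 0.4, 0.2])
print(sid(a, b), sid(b, a))   # the two values coincide: the measure is symmetric
print(sid(a, 3.0 * a))        # close to zero: scaling a spectrum leaves p(x) unchanged
```

The second call shows a property worth keeping in mind when choosing the measure: SID compares the shapes of spectra and is insensitive to their overall brightness.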
The nonlinear method proposed in this paper for the dimensionality reduction of hyperspectral data is based on the principle of preserving the spectral information divergence between samples of a hyperspectral image.
First, with each sample $x_i$, $i = 1, \dots, N$, of the original image we associate a vector $y_i$ in the reduced space $\mathbb{R}^L$ of smaller dimension $L < M$. Next, we introduce the error functional (2), a measure of the preservation of the spectral information divergence. This measure shows how well the pairwise spectral dissimilarities between the samples of the hyperspectral image are preserved in the lower-dimensional space.
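To make the idea of a preservation measure tangible, the sketch below evaluates one plausible error of this kind. It is an assumption, not the paper's expression (2): the error is taken as $\mu \sum_{i<j} (\mathrm{SID}(x_i, x_j) - \mathrm{SID}(y_i, y_j))^2$ with the normalizing factor $\mu = 1 / \sum_{i<j} \mathrm{SID}(x_i, x_j)^2$. The names `pairwise_sid` and `preservation_error` and the synthetic data are likewise hypothetical.

```python
import numpy as np

def pairwise_sid(A, eps=1e-12):
    """All pairwise SID values between the rows of a non-negative matrix A."""
    P = np.asarray(A, dtype=float) + eps
    P = P / P.sum(axis=1, keepdims=True)
    logP = np.log(P)
    # SID(i, j) = sum_k (p_ik - p_jk) * (log p_ik - log p_jk), expanded into
    # two entropy-like terms and two cross terms.
    self_term = np.sum(P * logP, axis=1)
    cross = P @ logP.T                      # cross[i, j] = sum_k p_ik * log p_jk
    return self_term[:, None] + self_term[None, :] - cross - cross.T

def preservation_error(X, Y):
    """ASSUMED form: mu * sum_{i<j} (SID(x_i, x_j) - SID(y_i, y_j))**2."""
    iu = np.triu_indices(len(X), k=1)       # each unordered pair counted once
    d_in = pairwise_sid(X)[iu]
    d_out = pairwise_sid(Y)[iu]
    mu = 1.0 / np.sum(d_in ** 2)            # ASSUMED normalisation, constant for X
    return float(mu * np.sum((d_in - d_out) ** 2))

rng = np.random.default_rng(0)
X = rng.random((30, 50)) + 0.1              # 30 synthetic spectra, M = 50
Y = rng.random((30, 5)) + 0.1               # a random embedding, L = 5
print(preservation_error(X, X))             # 0.0 when dissimilarities are preserved exactly
print(preservation_error(X, Y))             # positive for an arbitrary embedding
```

Whatever the exact form of the functional, the design point is the same: the error depends on the data only through pairwise dissimilarities, so the same measure can be applied in the input and in the output space.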
The factor $\mu$ in expression (2) is assumed to be constant for a given image $X$; in the present work it is calculated by expression (3). Formally, the task of dimensionality reduction can be posed as the problem of minimizing the error functional (2) over all possible values of the parameters, which are the coordinates of the vectors (image samples) in the reduced space, $Y = \{y_i, i = 1, \dots, N\}$.
The minimization problem can be solved, by analogy with [11,12,14], using stochastic gradient descent. With this approach, the true value of the gradient is approximated at each iteration of the gradient descent procedure using a subset of $R$ randomly selected vectors. Implementing this optimization scheme requires the partial derivatives (4) of the error function (2) with respect to the coordinates of the vectors $y_i$.
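The stochastic scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the error is assumed to be a sum of squared SID differences over pairs (the constant factor is absorbed into the learning rate), and the names `sid_and_grad`, `stress`, and `sgd_embed`, the step size `lr`, and the positivity `floor` are all hypothetical. The gradient of SID with respect to $y$ follows from the chain rule through $p = y / \sum_l y_l$.

```python
import numpy as np

def sid_and_grad(y, z):
    """Return SID(y, z) and its gradient with respect to y (all entries > 0)."""
    s = y.sum()
    p, q = y / s, z / z.sum()
    log_ratio = np.log(p / q)
    value = float(np.sum((p - q) * log_ratio))
    g = log_ratio + 1.0 - q / p            # derivative with respect to p
    grad = (g - np.dot(g, p)) / s          # chain rule through p = y / sum(y)
    return value, grad

def pairwise_sid(A):
    """Dense matrix of SID values between all rows of A."""
    n = len(A)
    return np.array([[sid_and_grad(A[i], A[j])[0] for j in range(n)]
                     for i in range(n)])

def stress(D_in, Y):
    """ASSUMED error: sum over pairs of squared SID differences."""
    iu = np.triu_indices(len(Y), k=1)
    return float(np.sum((D_in[iu] - pairwise_sid(Y)[iu]) ** 2))

def sgd_embed(X, L, R=5, epochs=200, lr=0.05, floor=0.05, seed=0):
    """Embed the rows of X into L dimensions by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    n = len(X)
    D_in = pairwise_sid(X)                 # input-space dissimilarities
    Y = rng.random((n, L)) + 0.5           # positive start keeps SID defined
    Y0 = Y.copy()
    for _ in range(epochs):
        for i in range(n):
            others = np.delete(np.arange(n), i)
            grad = np.zeros(L)
            # Approximate the gradient for y_i from R randomly chosen partners.
            for j in rng.choice(others, size=R, replace=False):
                d_out, g = sid_and_grad(Y[i], Y[j])
                grad += 2.0 * (d_out - D_in[i, j]) * g
            # Clip at a small positive floor (hypothetical safeguard).
            Y[i] = np.maximum(Y[i] - lr * grad / R, floor)
    return Y0, Y, D_in

rng = np.random.default_rng(1)
X = rng.random((15, 30)) + 0.1             # synthetic spectra, not real data
Y0, Y, D_in = sgd_embed(X, L=3)
print(stress(D_in, Y0), stress(D_in, Y))   # error before and after optimization
```

Because SID is only defined for positive vectors, the sketch keeps the output coordinates above a small floor; how the original method handles this constraint is not stated in the text.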

Experiments
This section describes the results of the experimental comparison of the developed method with alternative techniques. For the experiments, we used two well-known hyperspectral scenes [14] for which per-pixel ground-truth classification data are available: Indian Pines and Kennedy Space Center. Both hyperspectral images were obtained using the AVIRIS sensor. The first scene contains 145 × 145 samples in 224 spectral channels. In the experiments, we used a derived image containing 204 spectral channels, from which some channels had been excluded because of a high noise level. The pixels of the ground-truth image are divided into 16 classes.
The Kennedy Space Center image contains 512 × 614 pixels. In the experiments, we used a version containing 176 spectral channels. The ground-truth image contains information only about a relatively small portion of the pixels, which are divided into 13 classes.
In our experiments, we performed a per-pixel classification of the above images using a two-step approach. At the first stage, we reduced the dimensionality. At the second stage, we performed the classification itself using the descriptions (features) obtained at the first stage. The experimental studies described below reflect the results obtained using the developed dimensionality reduction method as well as similar methods based on other spectral dissimilarity measures. In particular, we used the following nonlinear dimensionality reduction techniques: the nonlinear mapping method based on the Euclidean distance (NLM), the method based on the approximation of spectral angles by Euclidean distances (SAED), the method based on spectral correlation preservation (SCPM), and the developed method based on the preservation of the spectral information divergence (SDPM). At the classification stage, we used nearest neighbor (1-NN) classifiers matched to the methods used for dimensionality reduction. For the NLM and SAED methods, in which the dissimilarity between the output vectors is calculated as the Euclidean distance, the 1-NN classifier based on the Euclidean distance was used. For the SCPM method, 1-NN based on the Pearson correlation coefficient was used. For the developed SDPM method, the spectral information divergence (SID) was used as the dissimilarity measure in the 1-NN classifier.
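The classification stage for the SDPM features can be sketched as a nearest neighbor rule with SID as the dissimilarity. The function names `sid_matrix` and `predict_1nn`, the constant `eps`, and the toy feature vectors are assumptions made for illustration; the experimental setup of the paper is not reproduced here.

```python
import numpy as np

def sid_matrix(A, B, eps=1e-12):
    """SID between every row of A and every row of B (non-negative data)."""
    P = np.asarray(A, dtype=float) + eps
    Q = np.asarray(B, dtype=float) + eps
    P = P / P.sum(axis=1, keepdims=True)
    Q = Q / Q.sum(axis=1, keepdims=True)
    logP, logQ = np.log(P), np.log(Q)
    # sum_k (p_k - q_k)(log p_k - log q_k), expanded into four matrix terms.
    return (np.sum(P * logP, axis=1)[:, None]
            + np.sum(Q * logQ, axis=1)[None, :]
            - P @ logQ.T
            - logP @ Q.T)

def predict_1nn(train_feats, train_labels, test_feats):
    """Give each test vector the label of its SID-nearest training vector."""
    nearest = np.argmin(sid_matrix(test_feats, train_feats), axis=1)
    return np.asarray(train_labels)[nearest]

# Toy features standing in for reduced-space vectors y_i.
train = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
labels = np.array([0, 1, 2])
test = np.array([[1.4, 0.3, 0.3], [0.2, 0.2, 1.6]])
print(predict_1nn(train, labels, test))   # [0 2]
```

Using the same measure in the classifier as in the dimensionality reduction step keeps the two stages consistent: the neighbors that the embedding tries to preserve are the ones the classifier looks up.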
First, we applied the ground-truth classification mask to select the samples with a known per-pixel classification. We then reduced the dimensionality of the selected set of samples. We varied the output dimensionality over the values m = 3, 5, 10, ..., 30 and measured the classification quality for each value. As the measure of per-pixel classification quality we used the classification accuracy (CA), defined as the proportion of correctly classified samples among all classified samples. The results of the experiments are presented below.
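The classification accuracy defined above amounts to a one-line computation. The sketch below uses made-up label vectors; the function name `classification_accuracy` is an assumption.

```python
import numpy as np

def classification_accuracy(true_labels, predicted_labels):
    """Proportion of samples whose predicted class equals the true class."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    return float(np.mean(true_labels == predicted_labels))

# Four of the five illustrative samples are labeled correctly.
ca = classification_accuracy([1, 2, 2, 3, 1], [1, 2, 3, 3, 1])
print(ca)  # 0.8
```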
As can be seen from the results of the experiments, two methods showed the best classification quality: the method based on spectral angles (SAED) and the developed method based on spectral information divergence preservation (SDPM). The proposed method showed a comparable or better classification quality in the majority of the presented cases, starting from output dimensionality m = 10. The worst results were shown by the SCPM method for small dimensions of the output space.

Conclusion
In this paper, we proposed and investigated a nonlinear method for the dimensionality reduction of hyperspectral data (SDPM), which is based on preserving the spectral information divergence (SID). Experiments have shown that, compared to the other studied methods, the developed method achieves a comparable or better quality of per-pixel classification for output dimensionalities of at least m = 10.