An improved k-NN method based on multiple-point statistics for classification of high-spatial resolution imagery

In this paper, the potential of multiple-point statistics (MPS) for object-based classification is explored using a modified k-nearest neighbour (k-NN) classification method (MPk-NN). The method first utilises a training image derived from a classified map to characterise the spatial correlation between multiple points of land cover classes, overcoming the limitations of two-point geostatistical methods. The spatial information, in the form of a multiple-point probability, is then incorporated into the k-NN classifier. An IKONOS subscene of the Beijing urban area was selected to evaluate the method. The image was classified on an object basis using the MPk-NN method and several alternatives, including the traditional k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). The MPk-NN approach achieved greater classification accuracy than the alternatives, reaching 82.05% and 89.12% based on pixel and object testing data, respectively. Thus, the proposed method is appropriate for object-based classification.


Introduction
It is generally agreed that object-based image analysis (OBIA) has several advantages compared to pixel-based analysis for processing fine spatial resolution remotely sensed data [1]. Some advanced algorithms for object-based classification have been developed. For example, Chen et al. [2] introduced a modified object-based method integrating multiple characteristics based on fuzzy logic. Salehi et al. [3] developed a hierarchical rule-based, object-based classification framework coupled with height data for classification in complex urban environments. Object properties and spatial context are important for OBIA classification in remote sensing. Commonly, a spatial weighting is incorporated into the classifier to increase the accuracy of the classification result. It has been suggested that such contextual classifiers can reduce noise and achieve greater accuracy than noncontextual classifiers [4]. Geostatistical techniques have been applied to pixel-based classifiers as a contextual classification method [5]. However, all the spatial weighting schemes investigated were based on two-point statistics (i.e., the relationship between the central pixel and its neighbours).
Recently, advances in multiple-point statistics (MPS) have shown that two-point statistics are limited. Instead of two-point-based functions such as the variogram, MPS borrows structures from training images, from which higher-order local patterns of the target field can be captured. Thus, spatial structures can be characterised more completely using MPS [6]. The original implementation of the MPS approach builds on the paradigm of the traditional indicator simulation. Recently, a few studies have applied MPS to the classification of remotely sensed data. For example, Boucher [7] realised super-resolution mapping with MPS. Ge and Bai [8] extracted linear objects from remotely sensed imagery using MPS. Tang et al. [9] proposed a post-classification method using MPS and compared it with contextual classification based on the Markov random field model. Nevertheless, MPS for the classification of fine spatial resolution imagery remains under-explored. The goals of this paper are: (i) development of a new MPS spatial weighting for remotely sensed contextual classification, and (ii) evaluation of the applicability of the new method for handling objects (cf. the previous pixel-based implementation). The multiple-point weighted k-NN method is introduced in Section 2, along with a general description of MPS and a set of standard k-NN classification methods. Our experiment in the study area is presented in Section 3. Section 4 provides further discussion on the proposed method, followed by some concluding remarks.

Methods
In the traditional k-NN method, the classifier allocates each pixel to the class of the training neighbours to which it is closest in feature space. An inverse distance weighting (IDW) function can be incorporated into the k-NN classifier to give more weight to information from a neighbour close to an unclassified observation than from a more distant neighbour. IDW can be expressed as:

$$\omega_{uk} = \frac{1}{d_{uk}^{p}} \qquad (1)$$

where $d_{uk}$ measures the distance between the current pixel $u$ and its neighbouring training pixel $k$ in feature space, $\omega_{uk}$ is the weight based on an inverse distance, and the exponent $p$ is an integer that determines the magnitude of the weight. The term wk-NN is used to refer to the IDW-based k-NN method.
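As an illustration of the IDW-based k-NN idea, the following minimal sketch assumes the common inverse-distance form $1/d^{p}$ and Euclidean distance in feature space. The function names, the default values of `k` and `p`, and the small constant `eps` (which only guards against division by zero) are illustrative assumptions, not part of the original method description.

```python
import math
from collections import defaultdict


def idw_knn_probabilities(x, train_X, train_y, k=5, p=2, eps=1e-12):
    """Return {class: share of total IDW weight} for feature vector x."""
    # Euclidean distance in feature space to every training sample.
    dists = sorted(
        ((math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        # Inverse distance weight; eps is an assumption that avoids 1/0
        # when x coincides with a training sample.
        votes[label] += 1.0 / (d ** p + eps)
    total = sum(votes.values())
    return {label: w / total for label, w in votes.items()}


def idw_knn_classify(x, train_X, train_y, k=5, p=2):
    """Allocate x to the class with the largest share of IDW weight."""
    probs = idw_knn_probabilities(x, train_X, train_y, k=k, p=p)
    return max(probs, key=probs.get)
```

With `k=5` and the mean values of four spectral bands as features, this corresponds loosely to a wk-NN style allocation; a larger `p` concentrates the weight on the very nearest neighbours.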
In a geostatistically weighted k-NN classifier (gk-NN), the probability that a pixel $u$ belongs to class $m$ is evaluated using equation (2), where the subscript $uk$ of $h$ indicates the lag between pixel $u$ and its neighbour $k$, and the class-indexed function $p(h_{uk})$ is the fitted model of the spatial covariance, which is also referred to as the class-conditional probability. The term $m'$ is a class index for $m' = 1, \dots, M$ classes, and $m$ is the class of interest. $S_g$ is a proportional weight between 0 and 1.
In the proposed MPS-based k-NN approach, instead of training samples, the conditional probability is derived from the training image to provide spatial information. For an unknown location $u$, the $K$ nearest neighbour training pixels $u_k$ ($k = 1, \dots, K$) can be found. Thus, the data template at location $u$ can be defined as $T(u) = \{h_1, \dots, h_K\}$, where $h_k$ is the lag between $u_k$ and $u$. The template centred at $u$ therefore has the same separation lags (in both distance and direction) and the same classes as the $K$ neighbouring pixels. This template is used to scan the training image and derive the multiple-point probability for pixel $u$ by counting the replicates of the data event $dev(u)$, where $dev(u) = \{c(u_1), \dots, c(u_K)\}$. The training image can be obtained from an initial classification map. The probability that pixel $u$ has class $m$ equals the ratio of the number of replicates of $dev(u)$ that possess class $m$ at the central node to the total number of replicates of $dev(u)$. For another pixel, a different template is applied to estimate another probability from the training image. The multiple-point probability that a pixel $u$ belongs to class $m$ is, thus, expressed as:

$$p_{MP}\big(c(u) = m \mid dev(u)\big) = \frac{\#\{c(u) = m,\ dev(u) = 1\}}{\#\{dev(u) = 1\}} \qquad (3)$$

where $\#\{\cdot\}$ counts the locations in the training image that satisfy the condition. Note that $dev(u) = 1$ means that the data event $dev(u)$ is found in the training image, indicating that all $K$ pixels at locations $u_k$ match exactly the corresponding classes (i.e., $c(u_k) = m_k$, $k = 1, \dots, K$). The multiple-point based k-NN classifier (MPk-NN in brief) across $L$ probability levels is then given by equation (4). Similar to $S_g$, $S_{MP}$ is a multiple-point statistical weight given to the classifier, ranging from 0 to 1.
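The template scan described above can be sketched as follows. This is a simplified illustration under stated assumptions: the training image is a small 2D list of class labels, the lags are integer (row, column) offsets, and replicates that would fall outside the image are skipped. The function name and data layout are hypothetical, and the multi-grid levels and the weight $S_{MP}$ are not modelled here.

```python
from collections import Counter


def multiple_point_probability(training_image, offsets, neighbour_classes):
    """Scan the training image with a data template.

    offsets: list of (d_row, d_col) lags from the central node to each of
        the K template nodes (assumed integer lags).
    neighbour_classes: class expected at each template node, i.e. the data
        event to be matched.
    Returns {class at central node: proportion of replicates}; empty if the
    data event has no replicate in the training image.
    """
    rows, cols = len(training_image), len(training_image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            matched = True
            for (dr, dc), cls in zip(offsets, neighbour_classes):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if not inside or training_image[rr][cc] != cls:
                    matched = False
                    break
            if matched:
                # A replicate was found: record the class at the central node.
                counts[training_image[r][c]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}
```

Because every pixel has its own template, a call of this kind would be repeated per location, which is why the exhaustive top-to-bottom scan can be costly on large training images.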
To estimate the multiple-point probability in the MPk-NN method, a training image is defined first. It can be obtained either for the same area that is to be classified using the MPk-NN method or for a different place. However, the training image is required to have a class spatial distribution similar to that of the study area and, thus, to provide prior knowledge on the character of the spatial information. Then the data template is defined, which consists of the K nearest neighbour pixels of the current pixel. Therefore, one pixel in the image corresponds to only one data template, and the data templates are different at each location. The multi-grid concept is applied to the data template. Instead of expanding the data template, the rescaled template is constructed by condensing the original one.
Thus, the data template is formulated as a rescaled (condensed) version of the original template.
To apply the MPk-NN method, the difference between the pixel- and object-based methods is the measure of spatial distance when modelling the spatial correlation. When the pixel-based method is applied, the distance is measured directly between two points. For the object-based method, the point located at the centre of each segment is extracted first. The central locations are recorded, so that the distance between any two segments can be inferred. It should be noted that, here, the k-NNs are segments, and the distances from the current segment to the k-NNs are measures of the relations between the centres of the segments. However, the scanning process is pixel-based, from top to bottom and left to right of the whole training image. The replicates of the data events were recorded according to the class type of the central node. This information was then converted to a conditional probability for each class and incorporated into the gk-NN classifier, as shown in equation (4).
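The object-based distance measure can be illustrated with a short sketch. It assumes that the "centre" of a segment is the mean row and column of its pixels (a centroid), which is one plausible reading of the text rather than a confirmed detail, and that the segmentation is available as a 2D list of segment identifiers. The function names are hypothetical.

```python
import math
from collections import defaultdict


def segment_centroids(segment_map):
    """Return {segment_id: (mean_row, mean_col)} for a 2D map of segment ids."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # row sum, column sum, pixel count
    for r, row in enumerate(segment_map):
        for c, seg in enumerate(row):
            acc = sums[seg]
            acc[0] += r
            acc[1] += c
            acc[2] += 1
    return {seg: (sr / n, sc / n) for seg, (sr, sc, n) in sums.items()}


def segment_lag(centroids, seg_a, seg_b):
    """Return the lag vector and distance from the centre of seg_a to seg_b."""
    (ra, ca), (rb, cb) = centroids[seg_a], centroids[seg_b]
    d_row, d_col = rb - ra, cb - ca
    return (d_row, d_col), math.hypot(d_row, d_col)
```

The lag vectors obtained this way carry both distance and direction, so they could serve as the entries of a data template between the current segment and its nearest neighbour segments.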

Study area and data processing
The MPk-NN method proposed in this paper was tested on a 256 × 256 pixel subset (39°57ʹ55ʺ-39°58ʹ28ʺN, 116°24ʹ5ʺ-116°24ʹ49ʺE) of an IKONOS image acquired over the Beijing urban area in May 2000 (figure 1a). The dataset consists of four multispectral bands with a spatial resolution of 4 m. A typical urban area (including that in the present study) can generally be classified into four classes: buildings, vegetation, road/bare land and shadow. The focus is mainly on distinguishing the building class from road/bare land. Because these two classes, which are key for urban planning and resource management, have similar spectral responses but different spatial distributions, their separation is challenging using traditional classifiers. Reference data were provided to test the classification result. The reference image was produced using eCognition software. The standard k-NN classifier was applied to the object features of brightness, maximum difference, the mean values of the four spectral bands, and the ratio of length to width. Considering that some bias may be introduced in the classification process, the classification result was then modified manually with reference to a corresponding image in Google Earth, which was obtained in January 2001. The final reference image is shown in figure 1b, which has the same number of segments as the segmented image to be classified.
The training data were used with two objectives in mind. One was to train the classifier, and the other was to estimate the class-conditional probability for geostatistical modelling. It is common to use objects instead of pixels for object-based classification, since the object contains both spectral and shape information. A total of 35 training objects were selected manually from the 1348 possible segments, which are displayed in figure 1c. These training segments were used to train the k-NN classifier. Although the size of the training set (35 segments relative to the total number of 1348) was appropriate for classification, geostatistical modelling requires more sample data. Furthermore, the training segments are not suitable for geostatistical modelling as the model of spatial covariance is a function of distance. Each segment accounts for an area, so the distance measured between segments varies. Therefore, instead of segments, sample points were used for geostatistical modelling.
A stratified sampling scheme was applied to the 35 training segments, ensuring there was at least one point within each segment. The number of points sampled from each segment was based on the size of the segment: the larger the segment, the more sample points were selected. A total of 50 sample points were selected for each class, giving 200 points in total for geostatistical modelling, as displayed in figure 1d.
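One way to realise a size-proportional allocation with at least one point per segment is sketched below. The largest-remainder rounding, the function names, and the fixed random seed are assumptions made for illustration; the text does not state how the per-segment counts were actually derived.

```python
import random


def allocate_points(segment_sizes, n_points):
    """Split n_points among segments: one each, the rest proportional to size.

    segment_sizes: {segment_id: number of pixels} for the segments of one class.
    """
    if n_points < len(segment_sizes):
        raise ValueError("need at least one point per segment")
    alloc = {seg: 1 for seg in segment_sizes}  # guarantee one point each
    remaining = n_points - len(segment_sizes)
    total = sum(segment_sizes.values())
    quotas = {seg: remaining * size / total for seg, size in segment_sizes.items()}
    for seg, quota in quotas.items():
        alloc[seg] += int(quota)
    # Hand out the points lost to rounding, largest fractional part first.
    leftover = n_points - sum(alloc.values())
    by_fraction = sorted(quotas, key=lambda s: quotas[s] - int(quotas[s]), reverse=True)
    for seg in by_fraction[:leftover]:
        alloc[seg] += 1
    return alloc


def sample_points(segment_pixels, alloc, seed=0):
    """Draw the allocated number of pixel coordinates at random from each segment."""
    rng = random.Random(seed)
    return {
        seg: rng.sample(pixels, min(alloc[seg], len(pixels)))
        for seg, pixels in segment_pixels.items()
    }
```

Calling `allocate_points` once per class with `n_points=50` would mirror the 50 points per class reported above, under the assumptions just listed.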

Classification
The traditional k-NN classifier was first applied to the 35 training samples, based on the mean value of the four spectral bands of each segment and with K = 5. For the gk-NN method, class-conditional probability plots were first estimated from the 200 sample points in figure 1d and then fitted with covariance-type models. Anisotropy was not considered. gk-NN using equation (2) was performed, based on the same training segments as for k-NN. Conditional probabilities were estimated from the K nearest neighbour segments (K = 5) for each class. The proportional weight for the geostatistical information, S g , was taken as 0.8, after experimentation. For comparison with other classifiers, three more classification methods were applied as benchmarks: the Bayesian, decision tree classifier (DTC) and support vector machine (SVM) methods. Finally, the MPk-NN method was applied. Since a pre-classified image of the same area can guarantee a similar spatial distribution of land covers, the training image was derived from an initial classification result. Here, the land cover map derived from wk-NN was used as the training image; the wk-NN method itself was not used as a benchmark. The data template for each pixel consisted of the K nearest neighbour nodes (K = 5). The multi-grid level L was taken as 3. The classification results obtained using the above six methods are displayed in figure 2.
As seen in the reference map, for the main road near the bottom of the image, there are only a few trees along the road, without buildings or shadow within the road feature. For the same area, shadows and buildings appear in the main road in the Bayesian, k-NN, and gk-NN results. Here, the SVM and MPk-NN methods show closer results to the reference data than the other methods. However, for the two squares above the main road, the k-NN and DTC classifications were more accurate. This is probably because bare land arranged as a square has similar features to buildings both spectrally and spatially, so the spatial weights involved in the enhanced classification methods are ineffective. In contrast, in the upper right part of the image, the coverage of the buildings class was increased by the gk-NN and MPk-NN methods, and the narrow roads and small areas of bare land that were misclassified by the k-NN were corrected in the gk-NN and MPk-NN methods.
Table 1. Classification accuracies of the Bayesian, DTC, SVM, k-NN, gk-NN and MPk-NN methods in Beijing (AA and OA represent the average accuracy and overall accuracy, respectively). Class name: 1-buildings, 2-vegetation, 3-road / bare land, 4-shadow.
Table 1 summarises the accuracy results using the six classification methods. The accuracies were estimated based on both pixel-based and object-based testing data. The average accuracy is the average of the user's accuracy (commission errors) and producer's accuracy (omission errors). As can be seen, the accuracies of the vegetation class are rather high (all around 90%) except for the DTC method. The SVM and three k-NN methods resulted in high accuracies for shadows. For the classes of buildings and road/bare land, MPk-NN has the greatest average accuracies, whereas the SVM and Bayesian methods offered the lowest average accuracies for buildings and road/bare land, respectively. It can be seen that the object-based accuracy does not always accord with the pixel-based accuracy. This is because sometimes one classifier misclassifies some very large segments whereas another classifier misclassifies more but smaller segments. Nevertheless, the MPk-NN method produced the greatest overall accuracies relative to the other methods based on both pixel and object testing data.
An analysis of variance (ANOVA) for the classification accuracy was performed using the F-test. All the classification results were first compared to the reference data to produce a set of indicator values. The value was set to one if the classification code equalled the reference class, and to zero otherwise. This process was applied to both pixels and objects. Then the indicator values of each method were compared with those of the MPk-NN result in turn. As shown in table 2, the F-ratios are quite large for the object-based results. The MPk-NN result shows significant increases in accuracy with respect to all the other results at the 90% confidence level using pixel-based assessment. Using object-based assessment, the increases are smaller, but the increase in accuracy for the MPk-NN is still significant relative to all the other methods except for the gk-NN method.
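The accuracy measures and the indicator-based comparison can be sketched as follows. The sketch assumes that the average accuracy of a class is the mean of its user's and producer's accuracies, as defined in the text, and that the comparison between two classifiers is a one-way ANOVA F-ratio on their 0/1 indicator values. The function names are hypothetical and no significance threshold is computed.

```python
def accuracy_report(reference, predicted, classes):
    """Overall accuracy plus producer's, user's and average accuracy per class."""
    pairs = list(zip(reference, predicted))
    report = {"OA": sum(r == p for r, p in pairs) / len(pairs)}
    for cls in classes:
        hits = sum(r == cls and p == cls for r, p in pairs)
        n_ref = sum(r == cls for r, _ in pairs)    # reference total for the class
        n_pred = sum(p == cls for _, p in pairs)   # predicted total for the class
        pa = hits / n_ref if n_ref else float("nan")    # producer's accuracy
        ua = hits / n_pred if n_pred else float("nan")  # user's accuracy
        report[cls] = {"PA": pa, "UA": ua, "AA": (pa + ua) / 2}
    return report


def indicators(reference, predicted):
    """1 where the classified code equals the reference class, otherwise 0."""
    return [1 if r == p else 0 for r, p in zip(reference, predicted)]


def f_ratio(*groups):
    """One-way ANOVA F-ratio: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (len(groups) - 1)) / (ss_within / (n_total - len(groups)))
```

In use, `indicators` would be computed once per classifier against the reference data, and `f_ratio` would then be called with the MPk-NN indicators paired with those of each alternative in turn.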

Discussion and conclusion
Since the classification methods applied are all object-based, it is necessary to explore the applicability of the gk-NN and MPk-NN methods for object-based classification. For the spatial measures in geographical space in the gk-NN method, converting segments to points may lead to some problems. The first problem relates to the sampling density for geostatistical modelling. A stratified sampling scheme was used to estimate the class-conditional probability to ensure that the points have a random spatial distribution, while not too many points were chosen within one segment. Instead of the correlation within one segment, the gk-NN method characterises the correlation between segments. Another problem is that modelling the spatial correlation of areas (segments) with representative points may cause the modifiable areal unit problem (MAUP) [10], which is a very common problem of scale when working with spatial correlation. A solution to this problem is to use the area-to-point approach for geostatistical modelling. However, if the study area is very large, and a great number of objects are produced by image segmentation, then geometric features (size, shape, orientation) can be ignored. Therefore, although the gk-NN method generally results in a more accurate classification than the k-NN method, it is not ideally suited for object-based classification. The geostatistical modelling process, therefore, requires further exploration.
The research presented here explored the potential of MPS for spatially weighting a remote sensing classifier generally and, in the present case, for object-based classification. A multiple-point statistical k-NN classification method was tested on a remotely sensed image of an urban area in Beijing. The multiple-point classification method can account for spatial correlation at multiple points simultaneously, in contrast to common spatial weighting schemes, which are limited to two-point