Subcarrier selection for efficient CSI-based indoor localization

Indoor positioning systems have received increasing attention for supporting location-based services. In recent Wi-Fi networks, the rich physical-layer information known as channel state information (CSI) has been recognized as a more effective positioning feature than the traditional received signal strength (RSS). However, the positioning performance depends on very high-dimensional CSI, because CSI is reported for every pair of transmit and receive antennas, and this high dimensionality may cause over-fitting. This paper proposes a subcarrier-selection approach based on information theoretic learning to mitigate over-fitting in CSI-based localization systems. After equalizing the histogram of the CSIs, the proposed algorithm computes the information gain of each subcarrier and forms a new low-dimensional subset of CSIs, which reduces the complexity and the over-fitting caused by redundant CSIs. On-site experimental results demonstrate that the proposed approach outperforms traditional feature selection schemes.


Introduction
Indoor localization has become a crucial concern for various mobile applications in pervasive computing environments [1]. The massive deployment of Wi-Fi access points (APs) has made Wi-Fi a suitable technology for developing such indoor location systems. Numerous studies have addressed indoor location estimation by using the existing Wi-Fi infrastructure. Among the various indoor Wi-Fi positioning systems, the fingerprinting-based approach, in which the user's location is estimated by matching the received signal strength (RSS) against a radio map, is one of the most feasible solutions [2].
In recent Wi-Fi systems based on the IEEE 802.11a/g/n standards, orthogonal frequency division multiplexing (OFDM) and multiple-input multiple-output (MIMO) technologies have been used, where data are modulated on multiple subcarriers at different frequencies and transmitted over different antenna pairs to increase throughput [3][4]. Thanks to the development of firmware and open-source tools, the wireless MIMO channel response can be extracted from the receivers in the form of channel state information (CSI) [5][6][7].
CSI provides a set of channel measurements describing the amplitude and phase of every subcarrier. Recently, CSI-based fingerprinting approaches, which use fine-grained PHY-layer information, have been proposed for indoor localization. Instead of RSS, these approaches exploit CSI as a new positioning feature with finer location granularity to improve the accuracy [8][9][10][11][12][13][14]. However, the positioning performance depends on very high-dimensional CSI, because CSI is reported for every pair of transmit and receive antennas, and this high dimensionality may cause over-fitting.
The study in [15] proposed FIFS, which leverages both frequency and spatial diversity and uses a correlation filter with a probabilistic positioning algorithm. That is, FIFS represents a location by aggregating the power of all subcarriers. However, although summing over all subcarriers significantly reduces the dimensionality, it does not consider the detailed information within individual subcarriers. A similar approach, referred to as MIMO, was presented in [16][17], in which the amplitude and phase values of subsequent subcarriers were subtracted to generate the location fingerprint. Although this approach effectively extracts relatively robust fingerprints owing to the subtraction process, the dimensionality remains high.
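The contrast between the two ideas described above can be illustrated with a minimal Python sketch. It is not the published implementation of FIFS or of the approach in [16][17]; the function names, the (links, subcarriers) array layout, and the phase unwrapping step are assumptions made only for illustration.

```python
import numpy as np

def aggregate_power(csi):
    """Sum |H|^2 over the subcarrier axis: one value per antenna link."""
    return (np.abs(csi) ** 2).sum(axis=-1)

def adjacent_difference(csi):
    """Amplitude and phase differences between consecutive subcarriers."""
    amp = np.diff(np.abs(csi), axis=-1)
    # Unwrapping the phase before differencing is an assumption of this sketch.
    phase = np.diff(np.unwrap(np.angle(csi), axis=-1), axis=-1)
    return amp, phase

# Hypothetical measurement: 6 antenna links, 30 subcarriers each.
rng = np.random.default_rng(0)
csi = rng.standard_normal((6, 30)) + 1j * rng.standard_normal((6, 30))

power = aggregate_power(csi)            # shape (6,): strong dimension reduction
amp, phase = adjacent_difference(csi)   # shape (6, 29) each: still high-dimensional
```

With these hypothetical dimensions, aggregation leaves 6 values per measurement, whereas differencing leaves 2 × 6 × 29 = 348 values, which is the trade-off the paragraph above describes.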
This paper presents a robust CSI-based indoor localization method using histogram equalization (HEQ) [18] and information theoretic learning (ITL) [19] to reduce the required dimensions. Specifically, we propose a subcarrier-selection approach based on information theoretic learning to mitigate over-fitting. The proposed algorithm has two steps. First, we normalize the CSIs using the HEQ technique, which converts the coefficients to follow a reference distribution, to enhance the system robustness. Second, after equalizing the histogram of the CSIs, the proposed algorithm computes the information gain of each subcarrier based on ITL and forms a new low-dimensional subset of CSIs, which reduces the complexity and the over-fitting caused by redundant CSIs. On-site experimental results demonstrate that the proposed approach provides comparable or even better performance with a relatively low-dimensional subcarrier subset.
The rest of the paper is structured as follows. Section II illustrates the proposed localization algorithm. Section III presents the experimental setup and results. Finally, the conclusion is given in Section IV.

Proposed algorithm
To enhance data throughput, Wi-Fi networks based on the IEEE 802.11a/g/n standards use OFDM (orthogonal frequency division multiplexing) and MIMO (multiple input multiple output) techniques, where data are modulated on multiple channels at different frequencies and are transmitted simultaneously over multiple antenna pairs. In these systems, the channel response can be extracted from the receivers in the form of channel state information (CSI), as indicated in figure 1. Because CSI reports the channel state for each antenna pair, we assume that a complete measured CSI signal in the MIMO wireless network is given as equation (1). Let R be the entire set of reference locations; the CSI-fingerprinting localization problem can be formulated as equation (2):
\[ \hat{r} = \arg\max_{r \in R} P(H \mid \Lambda_r) \qquad (2) \]
where H denotes the measured CSI, \hat{r} is the estimated location, and \Lambda_r is the probabilistic model of the r-th reference location, which can be obtained on the basis of the training data during the offline stage [20].
A challenging issue in CSI-based localization is the very high dimensionality of CSI. In this study, a mechanism is designed to select the important CSIs as positioning features. First, we normalize the CSIs using the histogram equalization (HEQ) technique (i.e., converting the coefficients into a reference). HEQ provides a transformation that converts the probability density function of an original variable into a reference probability density function. This study uses a typical Gaussian as the reference. Then, the proposed approach ranks the CSIs in descending order of their InfoGain values, which are calculated as equation (3):
\[ \mathrm{InfoGain}(h_k) = I(D) - I(D \mid h_k) \qquad (3) \]
where h_k denotes the k-th CSI feature, I(D) is the information in the entire training database D (the entropy of the reference locations without h_k), and I(D \mid h_k) is the conditional entropy of the location given h_k. This approach calculates the discriminative ability of each subcarrier, and the top-ranked subcarriers are considered the best.
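The HEQ step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it estimates the empirical cumulative distribution from sample ranks and maps it through the inverse CDF of a standard Gaussian reference. The function names and the choice of mid-rank CDF estimate are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def heq_gaussian(x):
    """Map a 1-D sample onto a standard-normal reference via its empirical CDF."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Rank of each sample (0 .. n-1); ties are broken by order of appearance.
    ranks = np.argsort(np.argsort(x, kind="stable"), kind="stable")
    # Mid-rank empirical CDF, which stays strictly inside (0, 1).
    cdf = (ranks + 0.5) / n
    inv = NormalDist().inv_cdf
    return np.array([inv(p) for p in cdf])

def heq_features(X):
    """Equalize every column of X (one CSI magnitude per column) independently."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([heq_gaussian(X[:, j]) for j in range(X.shape[1])])
```

Because the mapping is monotonic, the ordering of the samples is preserved while their histogram is reshaped toward the Gaussian reference. In an online setting, the transformation learned from the training data would presumably have to be stored and reused for new measurements; that part is omitted here.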
As equation (3) indicates, a feature with a larger information gain is assigned a higher rank, representing a greater contribution to indoor localization.
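The ranking by information gain in equation (3) can be sketched in Python as below. The sketch assumes that each continuous CSI feature is discretized into equal-width bins before the conditional entropy is estimated; the paper does not state how the entropies are computed, so the binning, the number of bins, and all function names are assumptions.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def info_gain(feature, labels, n_bins=10):
    """I(D) - I(D | feature), with the feature cut into equal-width bins."""
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    edges = np.linspace(feature.min(), feature.max(), n_bins + 1)
    # Use interior edges only, so bin indices fall in 0 .. n_bins - 1.
    bins = np.digitize(feature, edges[1:-1])
    cond = 0.0
    for b in np.unique(bins):
        mask = bins == b
        # Weight the entropy inside each bin by the fraction of samples in it.
        cond += mask.mean() * entropy(labels[mask])
    return entropy(labels) - cond

def rank_subcarriers(X, labels, k, n_bins=10):
    """Return the indices of the k columns of X with the largest gain, and all gains."""
    gains = np.array([info_gain(X[:, j], labels, n_bins) for j in range(X.shape[1])])
    order = np.argsort(-gains, kind="stable")
    return order[:k], gains
```

Here X is a hypothetical training matrix with one row per measurement and one column per CSI feature, and labels holds the index of the reference location of each row. Features whose values separate the reference locations well receive a gain close to the entropy of the labels, while uninformative features receive a gain near zero.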

Experimental setup
This study involved experiments conducted on the fifth floor of the telecommunications building at Yuan-Ze University. Figure 2 shows where the experiments were performed. The size of the test-bed was 14 m × 10 m. One access point (ASUS RT-N12E) with two transmit antennas (M=2) was deployed at the corner. After installing the Linux 802.11n driver, which is built on the Intel Wi-Fi Wireless Link 5300 MIMO radios, the gain and phase of the signal path between each transmit-receive antenna pair are available on the measuring device (Lenovo x200). In the experiments, the laptop measured CSIs using three receive antennas (N=3), and thus there were 6 links available as positioning features. Each reported channel state contains 30 subcarrier groups. Each channel state matrix entry is a complex number, and only the magnitudes were used in this study. Figure 3 shows the realistic RSS and CSI measurements over 100 s at two different locations.
Figure 4 displays the mean positioning error versus the number of subcarriers for six algorithms: RSS, CSI, HEQ, MIMO, FIFS, and the proposed ITL algorithm. The HEQ approach refers to the direct normalization of CSI, and the RSS and FIFS curves are constant because their performance is independent of the number of subcarriers. The figure first shows that the error generally decreases as the number of subcarriers increases. Moreover, it indicates the effectiveness of ITL, which considerably reduces the error while using the lowest number of subcarriers. This shows that ITL can select the useful location-related information among the CSIs more effectively.
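To make the feature dimensions of this setup concrete, the following sketch flattens one complex CSI measurement into a magnitude vector. With M=2, N=3, and 30 subcarrier groups, this gives 2 × 3 × 30 = 180 candidate features. The (transmit, receive, subcarrier) array layout and the function name are assumptions; the actual output format of the measurement driver may differ.

```python
import numpy as np

N_TX, N_RX, N_SUB = 2, 3, 30  # values reported in the experimental setup

def csi_to_features(csi):
    """Flatten one complex CSI measurement into a vector of magnitudes."""
    csi = np.asarray(csi)
    if csi.shape != (N_TX, N_RX, N_SUB):
        raise ValueError(f"expected shape {(N_TX, N_RX, N_SUB)}, got {csi.shape}")
    # Only the magnitudes are kept; the phases are discarded.
    return np.abs(csi).reshape(-1)

# Hypothetical measurement with random complex entries.
rng = np.random.default_rng(0)
csi = rng.standard_normal((N_TX, N_RX, N_SUB)) + 1j * rng.standard_normal((N_TX, N_RX, N_SUB))
features = csi_to_features(csi)  # 180 magnitudes, one per link and subcarrier group
```

A subset chosen by a selection method can then be taken by indexing this vector with the selected positions.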

Performance evaluation
Next, figure 5 compares the performance of different subcarrier-selection methods in terms of mean positioning error. In this figure, the Maxmean method selects the strongest CSIs (those with the maximal mean magnitude over time), while the Minvar approach selects the most stable ones (those with the minimal variance over time). The figure shows that the proposed ITL algorithm outperforms these traditional selection approaches. Using 40 subcarriers, ITL reduces the mean localization error by 40% and 22% compared with Maxmean and Minvar, respectively. The trend is similar using 50 subcarriers. The results again demonstrate that ITL constructs a more suitable subset of subcarriers according to information gain, thus achieving more accurate location estimates with fewer CSIs. The computed information gain of CSIs is shown in figure 4.
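The two baseline selection rules described above can be sketched in a few lines of Python. The sketch assumes a hypothetical matrix X with one row per time sample and one column per CSI magnitude; the function names are illustrative only.

```python
import numpy as np

def select_maxmean(X, k):
    """Indices of the k columns with the largest mean magnitude over time."""
    return np.argsort(-X.mean(axis=0), kind="stable")[:k]

def select_minvar(X, k):
    """Indices of the k columns with the smallest variance over time."""
    return np.argsort(X.var(axis=0), kind="stable")[:k]

# Three CSI features observed at three time instants (made-up numbers).
X = np.array([[1.0, 10.0, 5.0],
              [3.0, 14.0, 5.5],
              [5.0,  6.0, 4.5]])

strongest = select_maxmean(X, 2)  # columns 1 and 2 have the largest means
stablest = select_minvar(X, 2)    # columns 2 and 0 have the smallest variances
```

Neither rule looks at the reference-location labels, which is the main difference from a ranking based on information gain.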

Conclusions
This paper proposes a subcarrier-selection approach based on ITL to mitigate over-fitting in CSI-based localization systems. After equalizing the histogram of the CSIs, the proposed algorithm computes the information gain of each subcarrier and forms a new low-dimensional subset of CSIs, which reduces the complexity and the over-fitting caused by redundant CSIs. The experimental results demonstrate that ITL constructs a more suitable subset of subcarriers according to information gain, thus achieving more accurate location estimates with fewer CSIs.