Assessing the accuracy of satellite-derived urban extent over major urban clusters in the WRF model

Urban land use data play a central role in assessing the climatic effects of urbanization. The Moderate Resolution Imaging Spectroradiometer (MODIS) land use data are used in the Weather Research and Forecasting (WRF) model, so it is important to understand the scaling effects of the MODIS urban data before applying them to regional climate modelling. In this study, we took the Landsat-derived National Land-Use/Land-Cover Dataset (NLCD) of China as the reference data to assess the accuracy of the MODIS urban map. Urban area sizes and the spatial agreement of urban pixels were assessed at the national and metropolitan levels over China. The results show that accuracy varies from region to region and highlight the strengths and weaknesses of the MODIS data in different metropolitan areas. The study gives the modelling community insight into the suitability of the MODIS urban data for specific regional modelling applications.


Introduction
Land use change is a central driver of change in the Earth system, and in climate in particular [1]. Surface properties change through the twin processes of urbanization and deforestation that accompany human activity. Urban land use change, typically from vegetated surfaces to impervious built-up surfaces, can have significant impacts on meteorological elements and pollutant concentrations, because it alters the exchange of heat and momentum between the land and the air [2] through factors such as surface soil moisture, albedo, and roughness length. Numerical models have been used extensively to investigate the direct and indirect mechanisms by which urban land use change affects climate [3]. Previous studies have found that the accuracy of the urban land use data incorporated into regional climate models directly influences the model outputs [4]. Cheng et al. [5] demonstrated that inaccurate representations of land use and land cover patterns led to overprediction or underprediction of meteorological parameters such as temperature and wind speed. It is therefore important to validate the urban land use data in models before using them for climate studies.
Several global land use datasets have been produced in response to the need to analyse the connections between natural and anthropogenic processes. The widely used WRF community model has provided the Moderate Resolution Imaging Spectroradiometer (MODIS) data as an alternative land use dataset since Version 3.1. The MODIS data were acquired in 2001 at 1-km spatial resolution by the Terra MODIS instrument [6]. Reference [7] found that the urban and water categories have the highest consistency in area size but a high mixed error in China. Such studies have improved our understanding of the accuracy and comparability of existing global urban maps. However, the validation of these maps for regional climate model applications remains inadequate [8]. Very little assessment has been undertaken to directly examine the accuracy of urban land use data, especially in the metropolitan regions of China. In this study, the MODIS urban land use data in the WRF model were assessed by comparison with finer-resolution Landsat-derived urban data at the national and metropolitan levels over China. Urban area sizes and the spatial agreement of urban pixels were investigated in order to outline (i) how the accuracy of the MODIS urban data compares across different metropolitan areas of China and (ii) the errors in urban land use data that arise when the data are applied to regional modeling of urbanization effects.

Datasets and preprocessing
Four major metropolitan areas of China were investigated because they have experienced remarkable economic growth and urban development during the past two decades: the Beijing-Tianjin-Hebei region, the Pearl River Delta, the Yangtze River Delta, and the Chengdu-Chongqing area.
The 1-km MODIS dataset, derived in 2001 and used by the WRF model, is classified according to the modified International Geosphere-Biosphere Programme (IGBP) categories. Because our interest was in urban land, the "Urban and Built-Up" category in the MODIS dataset was extracted for accuracy assessment. The extracted data were reclassified into urban and non-urban categories at 1-km resolution for comparison.
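The reclassification into urban and non-urban categories can be illustrated with a minimal sketch. It assumes that the "Urban and Built-Up" category carries class code 13, as in the IGBP legend; the constant `URBAN_CLASS`, the function `to_urban_mask`, and the small example grid are hypothetical names and values chosen for illustration, not part of the original processing chain.

```python
# Assumption: "Urban and Built-Up" is class code 13 in the IGBP-style legend.
URBAN_CLASS = 13


def to_urban_mask(landuse_grid, urban_class=URBAN_CLASS):
    """Return a binary grid: 1 where the cell is urban, 0 otherwise."""
    return [[1 if value == urban_class else 0 for value in row]
            for row in landuse_grid]


# Hypothetical 3 x 3 grid of land use class codes (not real data).
example_grid = [
    [12, 13, 13],
    [10, 13, 17],
    [5, 5, 16],
]
urban_mask = to_urban_mask(example_grid)
```

Every non-urban class collapses to 0, so the result can be compared cell by cell with any other binary urban map on the same grid.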
The National Land-Use/Land-Cover Dataset (NLCD) of China, acquired in 2000, was taken as the reference dataset. It was constructed by the Chinese Academy of Sciences from Landsat Enhanced Thematic Mapper (ETM) data [9]. The Landsat images were georeferenced and orthorectified using ground control points and high-resolution digital elevation models. Visual interpretation and digitization were then applied to the Landsat data to generate thematic maps of land use and land cover. The NLCD uses a hierarchical classification system of 25 land categories grouped into 6 aggregated classes, with an overall accuracy of 95%. The dataset is gridded as land use percentage data at 1-km resolution, which give the area share of each land use type within each one square kilometer cell. To match the MODIS urban and non-urban data at 1-km resolution, we applied a majority rule: a grid cell was labelled urban when the "built-up area" category was the most frequent land use type within that 1-km cell [10]. Finally, the MODIS and NLCD urban data were reprojected and registered to the Albers equal-area conic projection at a spatial resolution of 1 km, each with urban and non-urban categories.
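The majority rule applied to the NLCD percentage data can be sketched as follows. This is only an illustration under stated assumptions: each 1-km cell is represented as a dictionary of land use shares, the key names (`built_up`, `cropland`, and so on) are hypothetical, and ties are resolved as non-urban, a choice the text does not specify.

```python
def majority_is_built_up(cell_shares, built_up_key="built_up"):
    """Return 1 if built-up land has the strictly largest share in the cell.

    cell_shares maps a land use type to its percentage of the cell area.
    Assumption: a tie with another type is treated as non-urban.
    """
    built_up = cell_shares.get(built_up_key, 0.0)
    others = [share for key, share in cell_shares.items()
              if key != built_up_key]
    return 1 if built_up > 0 and all(built_up > s for s in others) else 0


# Hypothetical cells (percentages of a 1-km cell, not real data).
cells = [
    {"cropland": 30.0, "built_up": 55.0, "water": 15.0},  # built-up dominates
    {"cropland": 60.0, "built_up": 40.0},                 # cropland dominates
    {"cropland": 50.0, "built_up": 50.0},                 # tie
    {"woodland": 100.0},                                  # no built-up land
]
labels = [majority_is_built_up(cell) for cell in cells]
```

Under this rule a cell with a substantial but non-dominant built-up share is labelled non-urban, which is one way scattered small settlements can drop out of a 1-km binary map.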

Assessment methods
In this study, urban area sizes and the spatial agreement of urban pixels were used as assessment measures at the national and metropolitan levels. First, the comparison of urban area sizes provides a direct assessment of data accuracy in terms of quantity. The area mapped as urban land in the MODIS data was compared with that in the NLCD map within the same extent. The bias of urban area size was calculated using the following equation.
$$O = \frac{M - N}{N} \times 100\% \qquad (1)$$
In equation (1), O denotes the bias of the urban area size of the MODIS data, and M and N denote the urban area sizes of the MODIS and NLCD data, respectively.
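As a worked check of the size bias, the short sketch below computes the relative difference between the MODIS and NLCD urban areas, expressed as a percentage of the NLCD area. The function name `size_bias_percent` is a hypothetical helper; the two input areas are the national totals reported in this paper (93444 and 64705 km2).

```python
def size_bias_percent(modis_area, nlcd_area):
    """Bias O (%) of the MODIS urban area M relative to the NLCD area N."""
    if nlcd_area <= 0:
        raise ValueError("The reference (NLCD) urban area must be positive.")
    return (modis_area - nlcd_area) / nlcd_area * 100.0


# National urban area totals reported in the text (km2).
national_bias = size_bias_percent(93444, 64705)
print(f"National bias: {national_bias:.2f}%")
```

A positive value means that MODIS maps more urban land than the NLCD reference. With these inputs the result rounds to the 44.42% reported for the whole country.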
Second, the comparison of spatial agreement assesses location consistency pixel by pixel. The MODIS and NLCD data were overlaid for comparison. Pixels with the same category in both the MODIS and NLCD data retained their class values (i.e., urban and non-urban) and were regarded as agreement, whereas pixels with different categories were assigned new class values (i.e., urban/non-urban and non-urban/urban) and were regarded as disagreement. Thus, there were four class values after the overlay comparison: urban, urban/non-urban, non-urban/urban, and non-urban. The non-urban pixels were excluded from the calculation of overall agreement because our attention was focused on the accuracy of the urban category. The agreement of the urban category between the MODIS and NLCD data was calculated using the following equation.
$$D = \frac{A}{A + B + C} \times 100\% \qquad (2)$$
In equation (2), D is the spatial agreement of the urban category between the MODIS and NLCD data, and A, B, and C represent the numbers of urban, urban/non-urban, and non-urban/urban pixels after the overlay comparison, respectively.

Urban area sizes
Figure 1 illustrates the remapped urban and non-urban data derived from the MODIS and NLCD data for the four metropolitan areas. The overall urban land patterns of each metropolitan region are similar in the two datasets. Geographically, however, urban pixels are clustered in the MODIS data, whereas the NLCD data show many fragmented urban patches. The urban area sizes at the national and metropolitan levels are listed in Table 1. At the national scale, the total areas of urban land estimated from the MODIS and NLCD data are 93444 and 64705 km2, respectively. Urban land makes up 0.68% of the whole of China in the NLCD data. The national urban land total from the MODIS map therefore differs from that of the NLCD map, with a bias of 44.42% relative to the NLCD. At the metropolitan scale, the greatest percentage of urban land estimated from the NLCD data is 6.17%, for the Pearl River Delta. The biases of the four metropolitan regions with respect to the NLCD are all positive, but their magnitudes differ considerably (Table 1). The urban area sizes estimated from the MODIS data for the Beijing-Tianjin-Hebei and Chengdu-Chongqing regions are broadly consistent with the NLCD data, with biases of 9.92% and 16.11%, respectively. In contrast, the Pearl River Delta (147.47%) and the Yangtze River Delta (148.08%) have much higher biases relative to the NLCD data.
Table 1. Urban area sizes (km2) and biases of the MODIS data relative to the NLCD data. The Percentage column gives the percentage of urban land in each region calculated from the NLCD data.
Table 2. The number of urban pixels from the spatial overlay comparison and the agreement of urban distribution at the national and metropolitan levels.
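The pixel counts and agreement described in the assessment methods can be reproduced with a minimal sketch of the overlay step. It assumes that both maps are binary grids on the same 1-km grid, that "urban/non-urban" means urban in MODIS but non-urban in NLCD (and the reverse for "non-urban/urban"), and that the agreement D is the share of agreeing urban pixels A among all pixels flagged urban in either map (A + B + C). The function names and the example masks are hypothetical.

```python
def overlay_counts(modis_mask, nlcd_mask):
    """Count A (urban in both), B (MODIS only), and C (NLCD only) pixels."""
    a = b = c = 0
    for modis_row, nlcd_row in zip(modis_mask, nlcd_mask):
        for m, n in zip(modis_row, nlcd_row):
            if m and n:
                a += 1      # urban in both maps: agreement
            elif m:
                b += 1      # urban/non-urban: MODIS urban, NLCD non-urban
            elif n:
                c += 1      # non-urban/urban: MODIS non-urban, NLCD urban
            # pixels that are non-urban in both maps are ignored
    return a, b, c


def urban_agreement_percent(a, b, c):
    """Spatial agreement D (%) of the urban category."""
    total = a + b + c
    if total == 0:
        raise ValueError("No urban pixels in either map.")
    return a / total * 100.0


# Hypothetical 3 x 3 binary masks (1 = urban), not real data.
modis = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
nlcd = [[1, 0, 0], [1, 0, 1], [0, 0, 0]]
a, b, c = overlay_counts(modis, nlcd)
agreement = urban_agreement_percent(a, b, c)
```

Because pixels that are non-urban in both maps are excluded, D is not inflated by the large non-urban background, which is why its values are much lower than a conventional overall accuracy would be.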

Spatial agreement of urban pixels
The per-pixel spatial agreement of the urban category at the national and metropolitan levels is listed in Table 2. For the whole of China, the spatial agreement of the MODIS data with the NLCD data is 15.49%. Among the metropolitan areas, the greatest agreement between the MODIS and NLCD data is observed in the Pearl River Delta (26.68%), and the lowest in the Beijing-Tianjin-Hebei region (18.91%). Figure 2 illustrates the spatial distributions of urban pixels in the MODIS versus NLCD data for the four metropolitan areas. Overall, agreement pixels (urban/urban) are surrounded by disagreement pixels (urban/non-urban). For the Beijing-Tianjin-Hebei region, disagreement pixels (non-urban/urban) are mainly distributed along the coastline. This is because areas used for village settlements and transportation are mapped in the NLCD data. For the other three metropolitan areas, disagreement pixels (non-urban/urban) correspond to scattered urban patches of small and medium-sized cities and rural settlements.

Discussion
This study reveals varying levels of discrepancy between the MODIS and NLCD urban data across the four metropolitan areas. Both urban area sizes and the spatial agreement of urban pixels vary from region to region. The two assessments quantify size and pixel location, respectively, and they do not necessarily point in the same direction. Although the size bias for the Beijing-Tianjin-Hebei area is the smallest, the spatial agreement of urban pixels for this region is the lowest. This is because of the mismatch in urban spatial distribution between the MODIS and NLCD data (Figure 1). The numbers of disagreement pixels for urban/non-urban and non-urban/urban are comparable in the Beijing-Tianjin-Hebei area (Table 2), so the two types of error largely cancel in the total area even though the locations differ. Thus, errors in urban fraction, as an underlying surface parameter, should be taken into account when applying the MODIS data to modeling the effects of urbanization.

Conclusions
Satellite imagery plays an important role in the production of urban maps for the analysis of urbanization issues. Users of urban maps need to be aware of the applicability and accuracy of each map. This study outlines the strengths and weaknesses of the MODIS data at the metropolitan and national levels. Assessments of urban area sizes and spatial agreement of urban pixels provide the quantity of