Letter The following article is Open access

Combining GEDI and Sentinel-2 for wall-to-wall mapping of tall and short crops

Published 18 November 2021 © 2021 The Author(s). Published by IOP Publishing Ltd
Focus on The Global Ecosystem Dynamics Investigation: Research, Applications and Policy Implications. Citation: Stefania Di Tommaso et al 2021 Environ. Res. Lett. 16 125002. DOI: 10.1088/1748-9326/ac358c

Abstract

High resolution crop type maps are an important tool for improving food security, and remote sensing is increasingly used to create such maps in regions that possess ground truth labels for model training. However, these labels are absent in many regions, and models trained on optical satellite features often exhibit low performance when transferred across geographies. Here we explore the use of NASA's global ecosystem dynamics investigation (GEDI) spaceborne lidar instrument, combined with Sentinel-2 optical data, for crop type mapping. Using data from three major cropped regions (in China, France, and the United States) we first demonstrate that GEDI energy profiles can reliably distinguish maize, a crop typically above 2 m in height, from crops like rice and soybean that are shorter. We further show that these GEDI profiles provide much more invariant features across geographies compared to spectral and phenological features detected by passive optical sensors. GEDI is able to distinguish maize from other crops within each region with accuracies higher than 84%, and able to transfer across regions with accuracies higher than 82%, compared to 64% for transfer of optical features. Finally, we show that GEDI profiles can be used to generate training labels for models based on optical imagery from Sentinel-2, thereby enabling the creation of 10 m wall-to-wall maps of tall versus short crops in label-scarce regions. As maize is the second most widely-grown crop in the world and often the only tall crop grown within a landscape, we conclude that GEDI offers great promise for improving global crop type maps.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Crop type maps are a crucial step toward estimating crop area, mapping yield, studying local nutritional outcomes, and developing hydrological models (Boryan et al 2011, Jin et al 2019). Recent years have seen significant progress in remote sensing-based crop type mapping, particularly in high-income countries, with maps now produced in the US (USDA-NASS 2020), Canada (Agriculture and Agri-Food Canada 2021), much of Europe (Belgiu and Csillik 2018, Defourny et al 2019), and parts of Asia (You et al 2021). While often high in accuracy, the models that produce these maps remain local, in the sense that model application is confined to the region where ground labels exist for crop types. Applying these models outside the region of training sees rapid performance declines (Wang et al 2019, Kluger et al 2021), because the models largely use optically-sensed time series as features. These time series, which reflect crop phenology, change from region to region as growing season timing, climate, management practices, soil properties, and crop varieties change. As a result, crop type maps remain elusive in places where ground labels are scarce, which includes the vast majority of low- and middle-income countries.

To date, solutions proposed for creating crop type maps in label-scarce regions include matching satellite time series to crop type profiles (Foerster et al 2012, Belgiu et al 2021), designing machine learning models that need fewer labels to perform well (Jean et al 2019, Tseng et al 2021), substituting crowdsourced labels in lieu of survey-based labels (Wang et al 2020), and investing more resources to collect ground data in low-income regions (Lambert et al 2018, Jin et al 2019, Rustowicz et al 2019). Another potential solution is finding remote sensing features that are invariant to geographic shifts—in other words, finding a remote sensing modality under which a particular crop type looks the same way everywhere on Earth. So far, such a feature has not been found in multi-spectral imagery at the spectral resolution of MODIS, Landsat, or Sentinel-2, or in radar imagery like that acquired by Sentinel-1, but the ever-growing list of sensors offers new possibilities each year.

One yet-unexplored sensor is the global ecosystem dynamics investigation (GEDI). GEDI is a spaceborne light detection and ranging (lidar) sensor that was launched in late 2018 and installed on the International Space Station (Dubayah et al 2020). As a lidar waveform instrument, GEDI measures the reflection of a laser beam off of vegetation and the ground surface, with a nominal spatial resolution of 25 m. The waveforms are then processed to provide information on surface topography, canopy height, canopy cover, and vertical canopy structure (Dubayah et al 2020). GEDI was designed with the goal of improving measures of forest canopy structure, and several recent studies have applied GEDI to this end (Healey et al 2020, Potapov et al 2021, Schneider et al 2020). Because spaceborne lidar sensors typically provide only a sparse sampling of the Earth's surface—during its planned mission GEDI will measure 4% of the land surface—lidar data are commonly used as a source of training data for models that estimate forest structure from wall-to-wall imaging sensors, such as Landsat (Healey et al 2020, Potapov et al 2021), Tandem-X (Qi et al 2019), or Sentinel-1 (Bruggisser et al 2021, Chen et al 2021).

Although designed for forest systems, the GEDI measures could also prove useful in cropland systems. In particular, crop height may be a more consistent feature of crops across regions than the spectral and phenological features detected by passive optical sensors. For example, figure 1 displays the distribution of reported crop heights for thousands of varieties of different key species stored in the U.S. National Plant Germplasm System (https://npgsweb.ars-grin.gov/), which contains seed samples from around the world. Among the four staple crops grown most widely across the world, maize is clearly taller than the others, with even the 25th percentile of maize samples exceeding the 95th percentile of the other three crops (rice, wheat, and soybean). On average, maize is roughly 1 m taller than the other crops, about twice the reported 50 cm vertical resolution of GEDI (Dubayah et al 2020). Thus, it is plausible that GEDI could distinguish maize from other common staples, although it is unlikely that it could distinguish wheat from rice or soybean.

Figure 1. The distribution of crop heights for the four most widely grown crops in the world (a), as reported for germplasm in the U.S. National Plant Germplasm System. Central lines indicate the median height, boxes indicate the 25th–75th percentile, and whiskers indicate the 5th–95th percentile. Panel (b) compares maize with other tall crops commonly grown in the world. Data source: npgsweb.ars-grin.gov for crop heights, www.fao.org/faostat for global crop areas in 2019.

In this paper, we explore the potential of GEDI to distinguish between taller and shorter crops, and assess whether models trained on GEDI in one region can successfully transfer to far-away regions. Although several tall crops are commonly cultivated—figure 1 illustrates that crops such as sorghum and sunflower exhibit a similar height distribution as maize—we focus on maize for two main reasons. First, it is by far the most widely grown tall crop in the world, with many regions relying on maize either directly or indirectly (via animal feed) for a substantial portion of their calories and protein. In sub-Saharan Africa, for instance, distinguishing maize from all other crops is often a key step toward estimating national grain supply (Jin et al 2019, Nakalembe et al 2021). Second, maize is the predominant tall crop in the regions for which we have extensive field-scale crop maps to test our crop type estimates.

Since we are interested in finding geographically-invariant features, we test GEDI in three maize-producing regions around the world: the state of Iowa in the US, the province of Jilin in China, and the region of Grand Est in France. The three study areas were chosen for their geographic diversity and availability of accurate, up-to-date field-scale crop type maps (Agence de Services et de Paiement 2019a, USDA-NASS 2020, You et al 2021). We first trained maize classifiers within each region using GEDI as features and local crop type maps as labels, comparing models trained on GEDI against baseline models trained on Sentinel-2 optical features. Next, we applied the GEDI models trained in one region to a different region for all six pairs of regions. Lastly, we trained models in the transfer regions using Sentinel-2 as features and GEDI maize predictions as labels. We produced maize maps in the transfer regions and evaluated them against the existing crop type maps. The results show that (1) GEDI data can distinguish maize from non-maize crops based on height, (2) GEDI features transfer much better than optical features across regions spanning multiple continents, and (3) GEDI data can generate training labels that then enable wall-to-wall crop type mapping with optical imagery in the absence of other ground labels.

2. Datasets

This section describes the three study areas used to test GEDI for maize classification, GEDI waveforms and feature extraction, Sentinel-2 time series and feature extraction, and the crop type labels used as ground truth to evaluate the performance of all models.

2.1. Study areas

To evaluate the potential of GEDI to distinguish between tall and short crops, we considered three regions of the world: Jilin in China, Grand Est in France, and Iowa in the United States (figure 2). These regions are major agricultural production areas located on three separate continents. They each contain a mix of tall and short crops and have accurate, up-to-date field-scale crop type maps that are publicly available.

Figure 2. Overview of GEDI observations in the three study regions: Jilin in China, Grand Est in France, and Iowa in the United States. Maps show the shot locations for July–September for cropped area, and barplots indicate the number of shots from maize and non-maize fields for each observation date.

Jilin Province is located in Northeast China and is one of the most important maize-producing provinces in China. It spans the mid-latitudes from 40.8N–46.3N and 121.7E–131.3E and has a humid continental climate. Other major crops grown in the area are soybeans and rice. Because early frost usually appears in September and early October, fast maturing maize varieties are cultivated. Maize is typically planted in April and harvested in September, with an average maize cycle duration of 150 days.

The Grand Est administrative region in northeastern France cultivates a wide variety of crops including wheat, barley, maize, alfalfa, sugar beets, legumes, and oilseeds. Twenty-two percent of French sugar production, 21% of rapeseed production, 13% of wheat production, and 13% of maize production come from this region (Le Service statistique ministériel de l'agriculture 2019). Maize is generally planted between April and May, and harvested between September and November. The region spans 47.4N–50.2N and 3.4E–8.2E and has a climate that varies from oceanic in the west to humid continental in the east. While the administrative region Nouvelle-Aquitaine is the largest producer of maize in France (31%), Nouvelle-Aquitaine also produces significant quantities of sunflower, which is also a tall plant. To focus on evaluating GEDI's ability to distinguish maize, we instead conducted experiments in Grand Est, which is France's second-largest producer of maize. We elaborate on the application of GEDI in regions with more than one tall crop in the Discussion.

Iowa, a state located in the Midwestern region of the United States, is in the heart of the U.S. Corn Belt and is the country's largest producer of maize. Maize and soybean are the two primary crops cultivated, comprising well over 95% of total cropped area (figure A1). Maize in Iowa is planted from late April to May and harvested from late September to early November. Located between 40.4N–43.5N and 90.1W–96.6W, Iowa also experiences a humid continental climate with cold winters and hot summers.

While the three study regions are thousands of miles apart, they share some similarities: they are all in the Northern Hemisphere, have humid continental climates, practice predominantly rainfed agriculture, and sow maize around April and harvest it between September and November. Management practices, however, do differ; of the three study regions, Jilin has the smallest fields on average, with the median field size between 0.64 ha and 2.56 ha according to the GeoWiki database (Lesiv et al 2019). Fields in Grand Est are 4.5 ha on average (Agence de Services et de Paiement 2019b), while fields in Iowa are largest with an average area of 33 ha (Yan and Roy 2016). Differences in agricultural practices across regions may affect model transfer if the practices result in the same crop type having different GEDI waveforms. Smaller field sizes also cause more GEDI shots to land on field borders, which are likely to contain more than one crop type or a mix of crop and non-crop land covers.

2.2. GEDI data and feature extraction

The GEDI instrument is the first spaceborne lidar instrument specifically optimized to measure vegetation structure. By firing a laser at 25 meter spots (termed 'footprints') on the Earth's surface and observing the return of the laser pulse, the instrument is able to measure the vertical distribution of vegetation at each spot. GEDI collects data globally (between 51.6N and 51.6S latitudes) at the highest resolution and densest sampling of any lidar instrument in orbit to date (Dubayah et al 2020). The raw GEDI waveforms collected undergo several processing steps to retrieve a variety of metrics, and the derived products are saved as both footprint and gridded datasets.

For our analysis, we used the Level 2A Elevation and Height Metrics Data (L2A), which includes footprint-level elevation and relative height (RH) metrics. RH metrics represent the height (in meters) at which a percentile of the laser's energy is returned relative to the ground. For example, $\textrm{RH}50 = 20$ m means that 50% of the laser's energy was returned by objects up to 20 m above the ground. The ground position is determined based on the center of the lowest mode of the returned waveform (Hofton et al 2000). RH metrics are saved at 1% intervals, so each shot contains 101 values representing RH at 0%–100%. These footprint data are geolocated with a mean positional error of 10.3 m.
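As an illustration of the RH definition, the following minimal sketch computes relative heights from a toy waveform. It is a simplification for intuition only and is not the GEDI L2A processing chain; the function name `relative_heights` and the toy inputs are assumptions introduced here.

```python
import numpy as np

def relative_heights(heights_m, energy, percentiles=range(0, 101)):
    """Height (m, relative to ground) at which each percentage of the
    cumulative returned energy is first reached. Toy illustration only."""
    order = np.argsort(heights_m)
    h = np.asarray(heights_m, dtype=float)[order]
    e = np.asarray(energy, dtype=float)[order]
    cum_pct = np.cumsum(e) / e.sum() * 100.0  # cumulative energy in percent
    pct = np.asarray(list(percentiles), dtype=float)
    idx = np.searchsorted(cum_pct, pct, side="left")
    idx = np.clip(idx, 0, len(h) - 1)  # guard against rounding at 100%
    return h[idx]

# Hypothetical waveform: equal energy returned at four height bins.
# Heights are relative to the detected ground, so they can be negative.
rh = relative_heights([-1.0, 0.0, 1.0, 2.0], [1.0, 1.0, 1.0, 1.0])
print(rh[0], rh[50], rh[100])
```

Because heights are measured relative to the detected ground position, the lowest percentiles can fall below zero, as in this toy case.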

We downloaded the GEDI L2A version 2 (GEDI02_A v002) data from July to September 2019 that intersect our study regions through NASA's Earthdata Search website. GEDI L2A data come with a series of flags and properties to help the user filter for data of quality appropriate for the specific application. For this study, we omitted shots with a quality flag value of zero, which indicates poor quality, or a non-zero degrade flag, which indicates poor geolocation. We note that, unlike RH values observed for forests, RH values used here were commonly below zero. This happens because waveforms from agricultural areas often have only one mode, and the GEDI algorithms define RH relative to the center of the lowest mode. We also filtered out shots with RH100 greater than 10 m, as the field crops in the study areas do not grow that tall. In addition, we dropped a full orbit for September 25 in Jilin, China because of abnormally high RH100 values. Outlier removal dropped only a small percentage of points (1.5% in Iowa, 3.6% in Jilin, and 3.5% in Grand Est). A map of the shots left in each region after cleaning the dataset and filtering for cropland is shown in figure 2, and the counts of shot numbers for each crop type are summarized in figure A1.
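The shot-level filters described above can be sketched as a boolean mask. This is a minimal sketch, assuming the flags and RH100 have already been read into arrays; the function name `keep_shot` and the example values are hypothetical.

```python
import numpy as np

def keep_shot(quality_flag, degrade_flag, rh100, max_height_m=10.0):
    """Return True for shots that pass the three filters described in the text."""
    q = np.asarray(quality_flag)
    d = np.asarray(degrade_flag)
    r = np.asarray(rh100, dtype=float)
    # Keep: non-zero quality flag, zero degrade flag, RH100 not above 10 m
    return (q != 0) & (d == 0) & (r <= max_height_m)

# Four hypothetical shots: only the first passes all filters
mask = keep_shot(
    quality_flag=[1, 0, 1, 1],
    degrade_flag=[0, 0, 1, 0],
    rh100=[2.5, 3.0, 1.0, 12.0],
)
print(mask)
```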

Consecutive RH metrics are highly correlated with each other. We therefore sampled a metric every 10% to reduce the number of features used from 101 to 11. The accuracy of random forest classifiers trained on this subset of 11 RH metrics is only slightly lower than that of classifiers trained on all 101 RH metrics for all three study regions. This suggests that the information lost from reducing feature dimensionality had little impact on the ability to distinguish crop types.
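Sampling every tenth RH metric amounts to a strided slice over the 101 values. The sketch below assumes the RH metrics are stored as an array with one row per shot and columns RH0 to RH100; the helper name `subsample_rh` is hypothetical.

```python
import numpy as np

def subsample_rh(rh, step=10):
    """Keep RH0, RH10, ..., RH100 from an array whose last axis holds RH0..RH100."""
    return np.asarray(rh)[..., ::step]

# Placeholder array: 3 shots x 101 RH values, where each value equals its percentile
rh_all = np.tile(np.arange(101), (3, 1))
rh_11 = subsample_rh(rh_all)
print(rh_11.shape, rh_11[0])
```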

2.3. Sentinel-2 imagery and feature extraction

Current state-of-the-art crop type maps use time series from passive optical remote sensing as input features for classification (Defourny et al 2019, USDA-NASS 2020). To compare GEDI features to optical features for crop type classification, we extracted Sentinel-2 (S2) time series at each GEDI shot location. We also extracted S2 time series for the entirety of the study areas in order to demonstrate how GEDI can be used to create labels for wall-to-wall crop type mapping. All optical imagery was processed using the Google Earth Engine (GEE) platform.

The Sentinel-2A/B (S2) satellites acquire images with a spatial resolution of 10 m (Blue, Green, Red, and NIR bands) and 20 m (Red Edge 1, Red Edge 2, Red Edge 3, Red Edge 4, SWIR1, and SWIR2 bands), and together they provide images at a 5-day interval. The spatial resolution of 10 to 20 m is sufficient to resolve individual fields in the three study areas.

We used S2 surface reflectance data (Level-2A) present in GEE and filtered out clouds using the S2 Cloud Probability dataset provided by SentinelHub in GEE. To capture crop phenology, we used Sentinel-2 imagery from 1 January to 31 December 2019, using the same time window across the three regions. In our study areas, this time window encompasses a single growing season for the majority of crop types.

Features were extracted from S2 time series by fitting harmonic regressions to all cloud-free observations in 2019. The harmonic regression, also known as the discrete Fourier transform, decomposes signals over time into their constituent frequencies (cosines and sines). It has been shown to perform well at land cover and crop type classification (Moody and Johnson 2001, Jakubauskas et al 2002). Other options for feature extraction include monthly or seasonal medians and the double logistic regression. In our previous work in the U.S. Midwest (Wang et al 2019), we found that harmonic coefficients classified crop types better than seasonal median features. Furthermore, unlike the double logistic regression, the harmonic regression is easy to deploy over entire states and provinces in GEE using the built-in linear regression function.

For each spectral band or vegetation index f(t), the harmonic regression takes the form

$$f(t) = c + \sum_{k=1}^{n} \left[ a_k \cos(2\pi \omega k t) + b_k \sin(2\pi \omega k t) \right] \qquad (1)$$

where $a_k$ are cosine coefficients, $b_k$ are sine coefficients, and $c$ is the intercept term. The independent variable $t$ represents the time an image is taken within a year expressed as a fraction between 0 (1 January) and 1 (31 December). The number of harmonic terms $n$ and the periodicity of the harmonic basis controlled by $\omega$ are hyperparameters of the regression. We used a second order harmonic ($n = 2$) with $\omega = 1.5$, shown in previous work (Wang et al 2019) to result in good features for crop type classification in the U.S. Midwest. Since seasonality is similar in Jilin and Grand Est, the same values of $\omega$ and $n$ should also fit time series in the other two regions well. Model transfer also requires that the features be derived from the same regression. The harmonic regression above yields a total of five features per band or vegetation index (the intercept plus two cosine and two sine coefficients), resulting in 20 harmonic coefficients total.
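For readers who want to reproduce the features outside GEE, a harmonic regression of this kind can be fitted by ordinary least squares. The sketch below assumes the form $f(t) = c + \sum_{k=1}^{n} [a_k \cos(2\pi\omega k t) + b_k \sin(2\pi\omega k t)]$ with $n = 2$ and $\omega = 1.5$. It uses NumPy rather than the GEE linear regression used in the paper, and the function names and synthetic time series are assumptions.

```python
import numpy as np

def harmonic_design(t, n=2, omega=1.5):
    """Design matrix with columns [1, cos_1, sin_1, ..., cos_n, sin_n]."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n + 1):
        cols.append(np.cos(2 * np.pi * omega * k * t))
        cols.append(np.sin(2 * np.pi * omega * k * t))
    return np.column_stack(cols)

def fit_harmonics(t, y, n=2, omega=1.5):
    """Least-squares estimate of [c, a_1, b_1, ..., a_n, b_n]."""
    X = harmonic_design(t, n, omega)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic, noise-free series for one band (NOT real Sentinel-2 data)
t_obs = np.linspace(0.0, 1.0, 40)  # fraction of the year
true_coef = np.array([0.3, 0.1, -0.2, 0.05, 0.02])
y_obs = harmonic_design(t_obs) @ true_coef
coef = fit_harmonics(t_obs, y_obs)
print(np.round(coef, 3))
```

Repeating the fit for each of the four inputs gives 4 × 5 = 20 coefficients per location.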

We computed harmonic coefficients for three bands and one vegetation index: NIR, SWIR1, SWIR2, and GCVI. GCVI is the green chlorophyll vegetation index (Gitelson et al 2005) computed as

$$\mathrm{GCVI} = \frac{\mathrm{NIR}}{\mathrm{Green}} - 1.$$

Unlike the commonly-used NDVI, GCVI does not saturate at high values of leaf area. These three bands and one VI were chosen in prior work (Wang et al 2019), which found that crop type classification using NIR, SWIR1, SWIR2, and GCVI performed nearly as well as classification using all optical bands and a variety of other VIs.

2.4. Crop type labels

We used high-accuracy crop type maps in Jilin, China, Grand Est, France and Iowa, USA to filter out non-crop areas, train maize classifiers, and evaluate each classifier's performance. In each region, the corresponding map's value at each GEDI shot footprint centroid location or S2 pixel location was used as the ground truth for crop type. Note that GEDI footprints have a 12.5 m radius, so it is possible for a GEDI shot to span multiple crop type map pixels and have mixed crop type labels (figure 3). The data products available in each region for the year 2019 are described below.

Figure 3. Example of GEDI shot footprints in cropped area in Clay County, Iowa, U.S. In (a) the eight GEDI laser ground tracks and footprints are separated by 60 m along-track and 600 m across-track. In (b) GEDI returns along one track show differences in the vertical distribution for maize vs. soybean. A close-up of the shots with either (c) a high resolution satellite image as background or (d) the crop type map as background illustrate how shots can be distributed within fields. Some shots fall on field boundaries or outside fields, such as the three shots circled in grey, causing in this case the high RH returns visible in (b).

2.4.1. You et al (2021) Northeast China crop type map

You et al (2021) produced annual 10 m crop type maps in Northeast China from 2017 to 2019 for the three major crops in the area (maize, soybean, and rice) using S2 time series as features and ground samples from field surveys as labels. The overall accuracy for the 2019 crop map for the whole Northeast region is 87%, with F1-scores of 94%, 85%, and 87% for rice, maize, and soybeans, respectively. Maize and soybean have higher recall (producer's accuracy) (86% and 90%) than precision (user's accuracy) (both 84%), indicating that the commission errors of maize and soybean are higher than the omission errors. According to the authors, this mainly resulted from the incorrect identification of other crops as maize and soybean.
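As a quick consistency check, the reported F1-scores for maize and soybean follow from the quoted precision and recall, since F1 is their harmonic mean. The helper name below is ours.

```python
def f1_from_precision_recall(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall values quoted in the text for the 2019 map
f1_maize = f1_from_precision_recall(0.84, 0.86)
f1_soybean = f1_from_precision_recall(0.84, 0.90)
print(round(f1_maize, 2), round(f1_soybean, 2))
```

Rounded to two decimals, these give 0.85 and 0.87, matching the reported F1-scores for maize and soybean.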

We imported the 2019 crop type map for the province of Jilin in GEE and used it to sample crop type labels at GEDI shot locations for the three major crops mapped.

2.4.2. Registre parcellaire graphique (RPG)

The Registre Parcellaire Graphique (RPG) is a geographical database of agricultural fields in France maintained by the Service and Payment Agency (ASP). The ASP is the institution that pays aid to French farmers under the Common Agricultural Policy (CAP) of the European Union. As part of their request for CAP aid, farmers send the ASP plot boundaries and certain plot characteristics. Unlike the cropland data layer (CDL) in the US and the You et al (2021) map in Northeast China, the RPG in France is a georeferenced vector product derived via survey, rather than a raster product generated by a machine learning algorithm. Each plot is drawn to centimeter resolution and associated with a crop type also submitted by the farmer (Agence de Services et de Paiement 2019b).

An anonymized version of the dataset is released publicly by the ASP each year, and we accessed this dataset at www.data.gouv.fr/. The entire 2019 database contains 9.6 million plots; filtering for those that fall within the Grand Est region results in a dataset of 851 090 plots. Although the RPG does not include farmland not receiving CAP aid, in reality 98% of agricultural land in Grand Est is recorded in the RPG.

We imported the RPG dataset in GEE, filtered out non-crop parcels, and rasterized it to sample the crop type labels at GEDI shot locations.

2.4.3. USDA CDL

Each year, the U.S. Department of Agriculture produces the CDL for the lower 48 states of the U.S. A raster product with pixels at 30 m resolution, CDL covers 132 classes spanning field crops, tree crops, developed areas, forest, and water. It is the output of a decision tree algorithm trained on ground labels obtained through surveys and a combination of Landsat, Disaster Monitoring Constellation, ResourceSat-2, and S2 imagery (USDA-NASS 2019). The accuracy of CDL labels varies by class and geographic region but is generally high.

We accessed CDL via GEE and used it to filter out GEDI shots in non-crop areas of Iowa and assign crop type labels to GEDI shots in cropped areas. Of the cropped area, 57% of GEDI shots are maize and 41% are soybean (figure A1). In the 2019 Iowa CDL, maize is classified with a precision of 97% and recall of 95%, and soybean is classified with a precision of 96% and recall of 95% (USDA-NASS 2019), indicating that CDL is accurate enough to be used as ground truth to evaluate GEDI features for maize classification.

3. Methods

In this section, we describe how the maize classifier was implemented (section 3.1) and how we set up experiments to test GEDI features for maize classification (section 3.2), evaluate GEDI feature transfer across regions (section 3.3), and create wall-to-wall maps using GEDI and S2 together (section 3.4). Table 1 summarizes the GEDI experiments in this paper and the S2 baseline models used to benchmark GEDI.

Table 1. Summary of our experiments evaluating GEDI for maize classification. Models trained on Sentinel-2 (S2) features are the baselines against which models trained on GEDI features are compared. 'Local' models are trained and tested in the same region, and 'Transfer' models are trained in one region and tested in a different region. The final row, 'GEDI-S2 Transfer', is the ultimate goal of this paper: to use GEDI for wall-to-wall maize mapping in regions with no local crop type labels.

| Method name | Description | Training features | Training labels | Spatial coverage | Local labels |
| --- | --- | --- | --- | --- | --- |
| S2 Local | Trained and tested in different locations of the same region | Sentinel-2 (20 harmonic coefficients) | Training region crop type map | Wall-to-wall | Yes |
| GEDI Local | Trained and tested in different locations of the same region | GEDI (11 RH metrics) | Training region crop type map | Point | Yes |
| S2 Transfer | Trained and tested in different regions | Sentinel-2 (20 harmonic coefficients) | Training region crop type map | Wall-to-wall | No |
| GEDI Transfer | Trained and tested in different regions | GEDI (11 RH metrics) | Training region crop type map | Point | No |
| GEDI-S2 Transfer | Trained and tested in different locations of the same region | Sentinel-2 (20 harmonic coefficients) | GEDI Transfer predictions | Wall-to-wall | No |

3.1. Maize classifier implementation

We used a random forest classifier to classify maize vs. non-maize in all experiments. Random forests (Ho 1995, Breiman 2001) are an ensemble machine learning method comprised of many decision trees in aggregate. Each decision tree is trained on a bootstrapped version of the training set and a random subset of features to reduce the correlation of predictions across decision trees and improve performance when those predictions are averaged. Random forests are commonly used in crop type classification (Defourny et al 2019, Jin et al 2019) and other Earth observation tasks due to their high accuracy and computational efficiency.

We used the RandomForestClassifier implemented in Python's scikit-learn package. We kept the default parameters, with the exception of raising n_estimators from 10 to 100 to reduce prediction variance.
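A minimal sketch of this configuration is shown below. The feature matrix is synthetic and stands in for the 11 RH metrics; it is not real GEDI data, and the class separation is chosen arbitrarily. Note that in recent scikit-learn releases n_estimators already defaults to 100, so setting it explicitly mainly documents the choice.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for 11 RH metrics per shot (NOT real GEDI data):
# the "maize" class (y = 1) is shifted upward, more so at higher percentiles.
n_shots = 400
y = rng.integers(0, 2, size=n_shots)
X = (
    np.linspace(-1.0, 1.0, 11)
    + y[:, None] * np.linspace(0.0, 1.5, 11)
    + rng.normal(0.0, 0.3, size=(n_shots, 11))
)

# Default parameters except for the number of trees
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:300], y[:300])
accuracy = clf.score(X[300:], y[300:])
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```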

We discretized each study region into $0.5^{\circ} \times 0.5^{\circ}$ grid cells. All GEDI shots in each grid cell were placed entirely in either the training set or test set, with 80% of grid cells in training and 20% in test (figure A2). Splitting GEDI shots along grid cells reduces spatial correlation across the training and test sets, thereby preventing classification metrics from being artificially inflated. For each result reported in this paper, we ran the classifier 11 times using different training and test splits, and we report the mean and standard deviation of accuracy over all runs. We chose an odd number of runs (11 instead of, say, 10) to make it easier to select the run with median accuracy and visualize its confusion matrix.
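The grid-cell split can be sketched as follows. This is an illustrative implementation under our own assumptions: cells are defined by flooring coordinates to multiples of 0.5°, test cells are drawn at random, and the shot coordinates are synthetic. The function name `grid_split` is hypothetical.

```python
import numpy as np

def grid_split(lon, lat, cell_deg=0.5, test_frac=0.2, seed=0):
    """Assign whole grid cells to train or test so no cell is shared."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    cells = np.stack(
        [np.floor(lon / cell_deg), np.floor(lat / cell_deg)], axis=1
    ).astype(int)
    uniq, cell_id = np.unique(cells, axis=0, return_inverse=True)
    cell_id = cell_id.ravel()  # shape differs across NumPy versions
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_frac * len(uniq))))
    test_cells = rng.choice(len(uniq), size=n_test, replace=False)
    is_test = np.isin(cell_id, test_cells)
    return ~is_test, is_test, cell_id

# Synthetic shot coordinates drawn within Iowa's bounding box from the text
rng = np.random.default_rng(1)
lon = rng.uniform(-96.6, -90.1, size=1000)
lat = rng.uniform(40.4, 43.5, size=1000)
train_mask, test_mask, cell_id = grid_split(lon, lat)
print(train_mask.sum(), test_mask.sum())
```

Calling `grid_split` with 11 different seeds would give the repeated splits over which mean and standard deviation of accuracy are reported.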

3.2. Testing GEDI features for maize classification within regions

Our first experiments test how well GEDI waveforms can distinguish maize from non-maize crops based on maize being a significantly taller crop. As GEDI was designed to monitor forests, it is unknown whether the instrument would be able to resolve height differences between crop types at all.

We trained a random forest classifier for each study region using GEDI RH metrics ('GEDI Local') as features, and compared this model to a baseline random forest classifier trained on S2 harmonic coefficients ('S2 Local'). Both models were tasked with distinguishing maize samples from non-maize samples, where the ground truth for crop type was provided by the datasets described in section 2.4. The S2 Local model is representative of current state-of-the-art crop type classifiers that use optical imagery to predict crop types. It provides a reference for the GEDI Local model as well as models that are transferred across regions.

The number of samples and locations used for training and testing the GEDI and S2 models were identical. Although S2 provides wall-to-wall imagery while GEDI only observes a small subset of Earth's surface, we limited S2 samples to GEDI shot locations. By controlling for sample size and location, we directly compare the two sets of features.

The timing of GEDI observations is important, as maize will be most distinguishable from other crops when their height difference is the greatest. To test the sensitivity of classification performance to growing season timing, we compared the performance of GEDI Local models trained on July-only shots, August-only shots, September-only shots, and shots from all three months.

3.3. Testing GEDI feature transfer across regions

After classifying maize vs. non-maize crops within each region, we tested model transfer across regions. For each GEDI Local and S2 Local classifier trained in the U.S., China, or France, we applied the classifier to the test sets of the other two regions to separate maize from non-maize. We refer to these models as 'GEDI Transfer' and 'S2 Transfer'. The models were not shown any additional data from the new regions. High classification accuracy in a new region would indicate that the model's features generalize across space and few if any labels are needed from the new region to classify maize. Conversely, low classification accuracy would mean that the learned relationship between features and crop types holds true only locally, and labeled data is needed in the new region to learn new classification boundaries.

We compared the GEDI Transfer models to their Local counterparts to see how well GEDI RH metrics generalize across geography. To understand whether growing season timing affects model transfer, we repeated this analysis for each GEDI model trained on July-only shots, August-only shots, September-only shots, and shots from all three months.
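The transfer protocol over all six ordered pairs of regions can be sketched as a loop. The three "regions" below are synthetic and drawn from the same distribution, so high transfer accuracy here is built in by construction and says nothing about real GEDI data. The helper `make_region` and all values are assumptions for illustration.

```python
import numpy as np
from itertools import permutations
from sklearn.ensemble import RandomForestClassifier

def make_region(seed, n_shots=300):
    """Synthetic stand-in for one region's 11 RH features and maize labels."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_shots)
    X = (
        np.linspace(-1.0, 1.0, 11)
        + y[:, None] * np.linspace(0.0, 1.5, 11)
        + rng.normal(0.0, 0.3, size=(n_shots, 11))
    )
    return X, y

regions = {name: make_region(i) for i, name in enumerate(["Iowa", "Jilin", "Grand Est"])}

transfer_acc = {}
for src, dst in permutations(regions, 2):
    X_src, y_src = regions[src]
    X_dst, y_dst = regions[dst]
    # Train only on the source region; the target region is never seen in training
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_src, y_src)
    transfer_acc[(src, dst)] = clf.score(X_dst, y_dst)

for pair, acc in transfer_acc.items():
    print(pair, round(acc, 2))
```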

3.4. Wall-to-wall crop type mapping using GEDI as training labels

Even if GEDI features transfer perfectly across geography—i.e. maize is always identifiable in GEDI waveforms no matter where on Earth one looks—GEDI only samples 4% of the land surface and cannot alone generate a wall-to-wall crop type map. Achieving a continuous map in space requires GEDI-based approaches to be combined with wall-to-wall imagery like that provided by S2.

To create wall-to-wall maize maps using GEDI and S2 imagery, we used methods similar to those employed previously to calibrate local maps of forest height with GEDI and Landsat (Healey et al 2020). For each pair of regions, we first followed the steps in sections 3.2 and 3.3 to obtain GEDI Transfer models. For example, in China we had one GEDI Transfer model trained in the U.S. and another one trained in France. Then, for each GEDI Transfer model, we treated its predictions at GEDI shot locations as labels in the new region and trained another model to reproduce those predictions using local S2 harmonics as features. We call this second model 'GEDI-S2 Transfer'. To continue the example, we treated the predictions of the U.S.-trained (or France-trained) GEDI Transfer model at GEDI shots in China as true crop type labels, and then trained a classifier on S2 features in China to predict those 'labels'.

By applying the GEDI-S2 Transfer model to all cropland S2 pixels in a new region, we produced a wall-to-wall 10 m spatial resolution maize map without the need for local labels. Figure 4 presents a graphical explanation of the GEDI-S2 Transfer approach with transfer from the U.S. to China as a concrete example.
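
The GEDI-S2 Transfer procedure can be sketched as a chain of two classifiers, where the output of the first becomes the training labels of the second. This is an illustrative sketch only: a nearest-centroid classifier is swapped in for the random forest so that the example runs with the Python standard library, and all function names, argument names, and feature values are hypothetical.

```python
def fit_centroids(features, labels):
    """Nearest-centroid stand-in classifier: store the mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, features):
    """Assign each feature vector to the class with the closest centroid."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda y: dist2(centroids[y], x)) for x in features]


def gedi_s2_transfer(src_rh, src_labels, tgt_rh_shots, tgt_s2_shots, tgt_s2_pixels):
    # Step 1: train on GEDI RH features with labels from the source region.
    gedi_model = fit_centroids(src_rh, src_labels)
    # Step 2: predict at GEDI shot locations in the target region (GEDI Transfer).
    pseudo_labels = predict(gedi_model, tgt_rh_shots)
    # Step 3: train on target-region S2 features at those shots, using the
    # GEDI predictions as if they were labels (GEDI-S2 Transfer).
    s2_model = fit_centroids(tgt_s2_shots, pseudo_labels)
    # Step 4: classify every S2 cropland pixel to obtain a wall-to-wall map.
    return predict(s2_model, tgt_s2_pixels)


# Hypothetical toy inputs, not data from the study.
src_rh = [[-1.0, 3.0], [-0.9, 2.8], [-0.3, 1.0], [-0.2, 1.2]]
src_labels = ["maize", "maize", "other", "other"]
tgt_rh_shots = [[-0.95, 2.9], [-0.25, 1.1]]
tgt_s2_shots = [[0.2, 0.7], [0.6, 0.3]]   # S2 features at the same two shots
tgt_s2_pixels = [[0.25, 0.65], [0.55, 0.35], [0.1, 0.8]]

wall_to_wall = gedi_s2_transfer(
    src_rh, src_labels, tgt_rh_shots, tgt_s2_shots, tgt_s2_pixels
)
print(wall_to_wall)
```

No label from the target region enters any step: the only supervised signal comes from the source region, carried across by the GEDI features.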

Figure 4. Graphical overview of the GEDI-S2 Transfer approach: an example of transferring a model trained on US crop type labels and applying it to China to create wall-to-wall maize maps. In (a) a random forest model is trained using GEDI RH metrics as features and CDL as labels in Iowa, U.S. The model is tested in a different region within Iowa (GEDI Local) to evaluate model performance. In (b) the Iowa-trained model is used to create predictions in China at GEDI shot locations (GEDI Transfer). These predictions then serve as labels for a new model trained on Sentinel-2 harmonic coefficients. Combining Sentinel-2 and GEDI (GEDI-S2 Transfer) enables a wall-to-wall maize map in China without any ground truth labels from China. In this schematic, the GEDI-S2 Transfer predicted map is shown for the same region used for training, but model evaluation is always done using a test set not used in training.

We compared GEDI-S2 Transfer to two other models. The first is the S2 Local model, which provides an upper bound for how well S2 harmonic features can classify maize when trained on in-region ground truth. The gap between GEDI-S2 Transfer and S2 Local reflects the accuracy of GEDI Transfer's predictions relative to ground truth. The second benchmark is the S2 Transfer model, which shows how well a model trained on S2 harmonics in one region fares when applied to other regions. Differences between S2 Transfer and GEDI-S2 Transfer reveal how robust GEDI features are across space compared to optical features.

4. Results

In this section, we begin by comparing GEDI and S2 features across the three study areas to understand why each set of features may or may not transfer across geography. We then report the performance of GEDI and S2 models for each experiment.

4.1. GEDI and Sentinel-2 feature comparison

The median harmonics of GCVI from S2 and median RH energy curves from GEDI are shown for the top three crops in each region in figure 5. In each region, the S2 maize profile reaches peak greenness in August and is generally distinguishable from the S2 profiles of other crops because of different timing and magnitude of the greenness peaks. For the GEDI energy profiles, roughly 50% of the energy returned comes from negative RH values, which as mentioned previously arises from the GEDI algorithm's definition of ground elevation (section 2.2). Thus, we emphasize that the values of RH should not be interpreted as physically meaningful; for instance RH100 does not correspond to the physical crop height. Nonetheless, the GEDI curves exhibit a clear separation between maize—the tallest crop—and the other crops. This difference is especially apparent at the extremes of RH curves, shown as insets in figure 5. Specifically, maize has lower values for RH0 to RH50 and higher values for RH70 to RH100 compared to other crops. These differences are also apparent in the example profiles shown in figure 3.

Figure 5. Median harmonics of GCVI from Sentinel-2 (top row) and median RH energy curves from GEDI (bottom row) are shown for the top three crops in each region. GEDI profiles are computed from all shots covering the July–September period; please see figure A1 for the number of shots by region and crop type. Shading shows the 25th–75th percentile of observations.

Whereas figure 5 compares different crops within a region, figure 6 displays the median features for maize from different regions on the same plot. This comparison is especially relevant for the question of how well a model is likely to transfer across regions. Ideally, features would be similar across regions in order to use a model trained in one region on another. For the S2 harmonics, clear differences emerge between the regions. Maize in the U.S. generally has a steeper increase in GCVI during June and July and a higher peak in August compared to maize in the other regions. Maize curves in France are slightly earlier than maize curves in China on average and have a substantially higher variance. The harmonic curves for the non-maize crops also differ considerably, both in timing and magnitude of the peak (figure 5). In contrast, the GEDI curves are remarkably consistent across regions (figure 6). This difference between S2 and GEDI indicates that the peak height of maize is a better preserved characteristic across regions than the timing of the maize growing season and the total crop biomass, both of which influence the S2 harmonics.

Figure 6. Median maize profiles in the three regions, shown for Sentinel-2 GCVI harmonics in (a) and GEDI curves in (b). Shading shows the 25th–75th percentile of observations.

4.2. Local maize classification

The feature comparisons above suggest that GEDI features should be reliable for classifying tall crops (in this case, maize). To quantitatively evaluate its performance for local classification, we used these GEDI features to train a classifier for each region and compared it to a classifier trained on the S2 harmonics (section 3.2).

Since the task is to separate maize from other crops based on height, we considered various possible timings of the GEDI features. Test accuracies indicate that GEDI RH metrics can distinguish maize from other crops within each region with more than 79% accuracy in all three months tested (figure 7). The optimal timing of GEDI observations differs by region, with September for China, July for France, and August for the U.S. being the best times for classification, resulting in 88%, 85%, and 91% accuracy, respectively. August is generally a good month for GEDI observations in all regions, with performance in all regions above 83% for this month.

Figure 7. Classification accuracies for locally-trained GEDI and Sentinel-2 models. Bars indicate mean GEDI accuracies for models trained in different months. Error bars show one standard deviation. The dashed line indicates accuracy of the S2 Local model in each region, which is 93% in China, 95% in France, and 95% in the U.S. Note that training sample locations for the Sentinel-2 model were the same as the training shot locations for GEDI for all three months.

As expected, the locally-trained S2 models do well in each region (93% accuracy in China, 95% in France, and 95% in the U.S.), and indeed perform better than the GEDI models. The gap between S2 Local models (trained on the same locations as the July–September GEDI shots) and the best GEDI models is less than 5 percentage points for China and the U.S. and less than 10 percentage points in France.

To understand why GEDI model errors are larger, we show maps of GEDI misclassifications in appendix figure A3 and confusion matrices for representative median runs in appendix figure A4. From the maps, we see that a substantial share of errors occur at field borders, where the GEDI shot footprint contains multiple crop types or a mix of crop and non-crop classes. We also observe that the most difficult crop types for GEDI Local models to classify are soybeans in China and silage corn and sunflower in France. In China, this can be partly explained by the 84% precision for soybeans in the You et al (2021) map used for ground truth (section 2.4.1). In other words, up to half of the 'misclassification' of soybeans as maize in China could be correct. Since the 'ground truth' maps in China and the U.S. are created using optical satellites, the predictions of S2 Local models, both correct and incorrect, are likely to correlate with the ground truth more than those of GEDI Local models.

Local GEDI accuracies in France are overall lower than in the other two regions. This appears to arise mainly from the fact that two kinds of maize are grown in France, maize for grain and maize for silage (see figure A1 for the crop distribution by region). Whereas grain maize is always grown to maturity and thus has a more reliable seasonality, silage maize is grown for biomass and can be sown and harvested at any point in the season. As a sensitivity test, we recalculated local GEDI accuracy in France after omitting the silage maize from the test set, finding that accuracies improve by 4 percentage points or more in all periods (see figure A5 in appendix). The improvement is largest in September, consistent with the notion that early harvest of silage maize is affecting the performance of GEDI features, since a harvested crop is no longer tall. As both the U.S. and China grow predominantly grain maize, the less predictable timing of silage maize is not an issue in those regions.

4.3. Transferring classification across regions

As noted in section 4.1, the consistency of GEDI features across regions suggests that models trained in one region can be reliably applied to new regions. A quantitative test of this proposition is shown in figure 8, which compares the performance of GEDI models trained using data from the local region (GEDI Local) to those trained in other regions (GEDI Transfer). Although the locally trained models are typically the best performers, the transferred models perform nearly as well and are occasionally indistinguishable from the locally trained models. For example, models trained in the U.S. or China both perform as well in France in August as one trained in France.

Figure 8. Test accuracies for GEDI models when using different combinations of training and test regions and periods. Colors indicate the region where models were trained, with hatching indicating a model trained in the same region (but on different locations). Models trained in other regions typically perform similarly to those trained locally.

Transferred GEDI models are only able to make predictions at GEDI shot locations. In order to extrapolate beyond these point locations, we used transferred GEDI predictions as labels to train a new model that takes local S2 harmonics as input (GEDI-S2 Transfer). GEDI-S2 Transfer test accuracies for the month of August are reported in figure 9 together with S2 Transfer and S2 Local accuracies for comparison. S2 Transfer shows how well state-of-the-art optical features perform out-of-region, while S2 Local provides an upper bound on how well optical features can separate maize when paired with ample local ground truth.

Figure 9. Test accuracies of models using Sentinel-2 harmonic features for wall-to-wall mapping of maize and non-maize for the month of August, trained either with direct transfer of Sentinel-2 features from other regions (gray bars) or using labels from GEDI predictions (blue bars). Error bars show one standard deviation. Dashed lines show performance of a locally-trained Sentinel-2 model (with training samples from the same GEDI shot locations) for the month of August as a comparison.

The results in figure 9 show first that harmonic features, while good at distinguishing crop types locally, transfer poorly across study regions. The average accuracy of an S2 Transfer classifier is 64%. In the U.S., the S2 Local model achieves an accuracy of 94%, but S2 Transfer models trained in China and France only manage accuracies of 60% and 62%, respectively. This is consistent with previous work that found optical feature transferability to deteriorate across geography (Wang et al 2019), as well as with the differences in crop phenology observed in figures 5 and 6. As the phenology and prevalence of crop types shift across regions, the optimal decision boundary for classifying maize versus non-maize with harmonic features also changes. S2 Transfer therefore results in many misclassified samples.

GEDI RH features, on the other hand, transfer much better across regions, and consequently the S2 model they supervise (GEDI-S2 Transfer) also performs much better than direct S2 Transfer. Figure 9 shows that GEDI-S2 Transfer accuracies exceed 82% for all cross-region pairs. For example, GEDI-S2 Transfer achieves 86% accuracy in the U.S. for models trained in China or France. While the S2 Local model is 8 percentage points more accurate, GEDI-S2 Transfer outperforms S2 Transfer from China and France by 26 and 24 percentage points, respectively. GEDI-S2 Transfer and GEDI Transfer accuracies are about the same: in China they are the same, in the U.S. GEDI-S2 Transfer is 1 percentage point lower, and in France GEDI-S2 Transfer is 1 percentage point lower for the model trained in China and 1 percentage point higher for the model trained in the U.S.
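
The comparison for the U.S. test region can be reproduced as simple arithmetic on the accuracies quoted in the text. The snippet below is only a worked check of those reported numbers; the variable names are illustrative, and the differences are in percentage points.

```python
# Accuracies (in percent) reported in the text for the U.S. test region in August.
s2_local_us = 94
gedi_s2_transfer_us = 86
s2_transfer_us = {"China": 60, "France": 62}  # keyed by training region

# Gain of GEDI-S2 Transfer over direct S2 Transfer, in percentage points.
gain_over_s2_transfer = {
    src: gedi_s2_transfer_us - acc for src, acc in s2_transfer_us.items()
}

# Remaining gap to the locally trained S2 model, in percentage points.
gap_to_s2_local = s2_local_us - gedi_s2_transfer_us

print(gain_over_s2_transfer, gap_to_s2_local)
```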

Example crop type predictions for the three regions using the best GEDI-S2 Transfer model are shown in figure 10. The corresponding ground truth crop type maps are also shown for reference. For the most part, the GEDI-S2 Transfer predictions agree with the ground truth maps.

Figure 10. Ground truth crop type maps (a)–(c) compared with classification predictions of GEDI-S2 Transfer models (d)–(f). Prediction maps for China (d) and France (e) were created using models trained in the U.S. (84% accuracy for both). The prediction map for the U.S. (f) was created using the model trained in France (86% accuracy).

5. Discussion

The results show that GEDI features can distinguish a tall crop like maize from shorter crops, and that these features are highly transferable across geography. Should spaceborne lidar sensors someday sample the Earth's surface more densely, they would add a useful set of features for mapping crop types that are complementary to optical and radar features. In regions where crop type maps are already available, such as the study areas considered here, lidar could augment field surveys to generate crop type labels. This would reduce the cost of creating products like CDL, which currently rely on nationwide surveys for crop type labels. Lidar could be especially helpful in a system like the U.S. Corn Belt, where agriculture is heavily dominated by maize (a tall crop) and soybeans (a short crop).

Most importantly, lidar has the potential to enable mapping of tall crops like maize in areas of the world where crop type maps are not available due to a lack of ground labels. Our experiments transferring GEDI features and using GEDI to train wall-to-wall crop type maps in China, France, and the U.S. show the robustness of lidar features across continents, despite the GEDI instrument being designed to monitor forests rather than cropland. We found that S2 models trained on GEDI achieved 84% average accuracy, compared to 94% average accuracy when trained on local ground truth labels (figure 9). This is in stark contrast to models trained on optical features like S2 time series, which perform well locally but decrease rapidly in performance when transferred across geography (64% average S2 Transfer accuracy). Our results suggest a strategy for mapping tall vs. short crops in regions that lack crop type labels: train a classifier to distinguish tall vs. short crops in China, France, and the U.S. (GEDI Local); apply the classifier to GEDI shots in a new region (GEDI Transfer); and then train another classifier to reproduce the GEDI predictions using S2 time series covering the entire new region (GEDI-S2 Transfer).

Despite the promising results in this work, we recognize many potential issues that could emerge when extending this approach to the global scale. First, mapping locations of tall and short crops will not suffice for many applications, which require more detailed crop information. Where maize is the predominant tall crop, as was the case for the three regions studied here, a map of tall crops can be reliably used to identify maize areas. In regions with multiple tall crops, additional features would be needed to separate individual crop types; for instance, optical data has proven useful for distinguishing maize from sorghum (Soler-Pérez-Salazar et al 2021) and radar data for distinguishing maize from sunflower (Veloso et al 2017, Belgiu and Csillik 2018).

Second, many of the errors we observed in GEDI predictions occurred at the edges of fields. These disparities likely reflect some combination of errors in the labels, errors in GEDI horizontal geolocation, and mixed crop types within the GEDI shot footprint. For applications in smallholder regions where fields are very small, pixels near field edges will be increasingly common. Using GEDI to predict tall vs. short crops is therefore likely to have more errors in smallholder systems. It is possible that such errors would have only minimal effect on model training, since they are likely to be random, or that maps of field boundaries could be used to filter out shots near field edges (Waldner and Diakogiannis 2020).

Third, while we tested GEDI transfer across three regions on three different continents, these regions still only represent a subset of existing climates and agricultural management practices worldwide. For example, in regions with sparser maize sowing densities, different maize varieties, and more frequent crop failure, the GEDI waveforms could systematically differ from those observed in China, France, and the U.S. At the same time, as long as maize reaches a certain height, RH metrics near RH100 should still be higher for maize than for other crops, offering a physical justification for why the methods of this paper could still succeed.

Another issue, particularly in tropical systems, could be frequent cloud cover during the time of year when crops are at peak height, which could limit the availability of clear GEDI shots. Although we observed lower availability of clear shots during August for the current study regions (figure 2), it did not appear to compromise the performance of the model relative to other months with more observations, presumably because many thousands of clear observations were obtained even in the cloudier months. Nonetheless, clouds could emerge as an important constraint in other locations.

Lastly, data continuity could become a challenge; the planned lifetime of GEDI is only two years, with data acquisition beginning in March 2019 and expected to wrap up in September 2021 (GEDI Timeline 2019). To create maps of tall and short crops in years without GEDI observations, it may be possible to apply the GEDI-S2 Transfer models to S2 data before 2019 and after 2021. Such cross-year transfer is likely to result in lower accuracy than within-year transfer, though some of the decrease in performance might be alleviated by training on combined 2019–2021 data and by accounting for shifts in features over time (Kluger et al 2021). It may also be worth testing whether the longer-duration ICESat-2 mission, which carries a photon-counting lidar (compared to GEDI's full-waveform lidar), could also detect differences in crop height. We leave these explorations to future work.

6. Conclusions

Crop type mapping remains challenging in much of the world. This difficulty stems from a scarcity of crop type labels and the highly localized relationship between optical satellite features and crop type; currently, crop type maps mainly exist in high-income countries where crop type labels are abundant. In this study, we test whether lidar measurements can distinguish tall crops from short crops in a way that transfers across geographies. If lidar features successfully separate tall vs. short crops anywhere on Earth, then we can use crop type labels from a region with labels (e.g. the U.S.) to predict crop types in a region without labels (e.g. much of the world).

Using RH returns from GEDI, we demonstrate that lidar measurements can distinguish between maize and shorter crops in three study regions spanning China, France, and the U.S. Within-region classification accuracy for models using GEDI features averages 86% (for shots acquired in the month of August), compared to 94% for models using S2 features. We then show that GEDI features transfer better than S2 features across regions; on average, transferred GEDI models achieved 85% accuracy compared to 64% for transferred S2 models. Lastly, we train models in each transfer region to replicate GEDI maize/not-maize predictions using S2 features to create wall-to-wall crop type maps. These maps achieve 84% average accuracy.

We conclude that GEDI holds great promise for improving agricultural monitoring, because it captures features that are much more generalizable than those typically used in satellite-based crop type mapping. Of particular importance is its potential to map tall vs. short crops in regions of the world with no crop type labels. In addition, the demonstrated ability to distinguish crops with height differences of just 1 m suggests that other potential applications in agriculture should be feasible, such as monitoring the age and growth of tree crops or identifying intercropped fields that contain mixtures of different crops.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Acknowledgments

This work was supported by the NASA Harvest Consortium (NASA Applied Sciences Grant No. 80NSSC17K0652, sub-award 54308-Z6059203 to DBL). Work by SW was partially supported by the Ciriacy-Wantrup Postdoctoral Fellowship at the University of California, Berkeley. We thank the Google Earth Engine team for making large-scale computational resources available to researchers.

Appendix

Figure A1. Number of GEDI shots in each region by crop type.

Figure A2. To avoid inflating classification accuracy due to spatial correlation, we split the GEDI shots in each region into training and test sets along a grid. On the left, we show the grid along with GEDI shots in gray over Iowa (shaded). On the right, we show one possible assignment of grid cells to the training and test sets, with 80% of grid cells in training and 20% in test. We randomly assign grid cells to the training and test sets each time we run the classifier, for a total of 11 runs per experiment.
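
A grid-based split of this kind can be sketched as follows. This is a hedged illustration rather than the code used for the study: the cell size (`cell_deg`), the shot record layout, and the function names are assumptions, and only the 80/20 split of grid cells and the 11 repeated runs come from the caption above.

```python
import random


def grid_cell(lon, lat, cell_deg):
    """Index of the grid cell containing a point (cell size in degrees)."""
    return (int(lon // cell_deg), int(lat // cell_deg))


def spatial_split(shots, cell_deg=0.5, train_frac=0.8, seed=0):
    """Assign whole grid cells, not individual shots, to train or test."""
    cells = sorted({grid_cell(s["lon"], s["lat"], cell_deg) for s in shots})
    rng = random.Random(seed)
    rng.shuffle(cells)
    n_train = round(train_frac * len(cells))
    train_cells = set(cells[:n_train])
    train = [s for s in shots
             if grid_cell(s["lon"], s["lat"], cell_deg) in train_cells]
    test = [s for s in shots
            if grid_cell(s["lon"], s["lat"], cell_deg) not in train_cells]
    return train, test


# Hypothetical shots: two per grid cell across ten cells.
shots = []
for i in range(10):
    for offset in (0.1, 0.2):
        shots.append({"lon": -96.0 + 0.5 * i + offset, "lat": 42.1})

# One random cell assignment per run, for 11 runs.
splits = [spatial_split(shots, seed=run) for run in range(11)]
train, test = splits[0]
print(len(train), len(test))
```

Because neighbouring shots fall in the same cell, they always land on the same side of the split, which is what keeps spatially correlated shots from appearing in both training and test sets.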

Figure A3. GEDI Local predictions in the three regions. In blue are the correctly classified shots, in red the misclassified ones. Many errors occur at the borders of the fields, likely because of mixed crop types within GEDI shot footprints. Some other errors can be attributed to errors in the crop type maps used as ground truth.

Figure A4. Confusion matrices of GEDI Local classification for the best time in the three regions: September for China, July for France, and August for the U.S., which resulted in mean accuracies of 88%, 85%, and 91%, respectively. For our analysis, we ran the classification task multiple times, each time with different train and test sets, and computed mean accuracies. Here we are showing confusion matrices only for the median run, i.e. the run with median accuracy. We ran a binary classification (maize vs. other crop). Here we show ground truth labels detailed by crop type to get deeper insights into misclassifications.
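
Selecting the median run from a set of repeated runs can be sketched as below. The accuracies are made-up placeholders, not results from the study, and the helper name `summarize_runs` is hypothetical. The sketch assumes an odd number of runs, so that the median accuracy belongs to an actual run.

```python
import statistics


def summarize_runs(accuracies):
    """Return the mean accuracy and the index of the run with median accuracy."""
    mean_accuracy = statistics.mean(accuracies)
    # Rank run indices by accuracy and take the middle one.
    order = sorted(range(len(accuracies)), key=accuracies.__getitem__)
    median_run = order[len(order) // 2]
    return mean_accuracy, median_run


# Placeholder accuracies for 11 runs (hypothetical values).
accuracies = [0.90, 0.88, 0.91, 0.89, 0.92, 0.87, 0.93, 0.90, 0.91, 0.89, 0.94]
mean_accuracy, median_run = summarize_runs(accuracies)
print(f"mean={mean_accuracy:.3f}, median run index={median_run}")
```

The confusion matrix would then be computed from the predictions of the run at `median_run` only, while the reported accuracy remains the mean over all runs.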

Figure A5. Comparison of GEDI Local accuracies in France when (left) excluding versus (right) including maize for silage. Accuracies improve when considering only grain maize by more than 4% in all periods, with September improving 10% most likely due to early harvest of silage maize.
