Inferring alpha, beta, and gamma plant diversity across biomes with GEDI spaceborne lidar

Biodiversity-structure relationships (BSRs), which describe the correlation between biodiversity and three-dimensional forest structure, have been used to map spatial patterns in biodiversity based on forest structural attributes derived from lidar. However, with the advent of spaceborne lidar like the Global Ecosystem Dynamics Investigation (GEDI), investigators are confronted with how to predict biodiversity from discrete GEDI footprints, sampled discontinuously across the Earth's surface and often spatially offset from where diversity was measured in the field. In this study, we used National Ecological Observatory Network data in a hierarchical modeling framework to assess how spatially-coincident BSRs (where field-observed taxonomic diversity measurements and structural data from airborne lidar coincide at a single plot) compare with BSRs based on statistical aggregates of proximate, but spatially-dispersed GEDI samples of structure. Despite substantial ecoregional variation, results confirm cross-biome consistency in the relationship between plant/tree alpha diversity and spatially-coincident lidar data, including structural data from outside the field plot where diversity was measured. Moreover, we found that generalized forest structural profiles derived from GEDI footprint aggregates were consistently related to tree alpha diversity, as well as cross-biome patterns in beta and gamma diversity. These findings suggest that characteristic forest structural profiles generated from aggregated GEDI footprints are effective for BSR diversity prediction without incorporation of more standard predictors of biodiversity like climate, topography, or optical reflectance. Cross-scale comparisons between airborne- and GEDI-derived structural profiles provide guidance for balancing scale-dependent trade-offs between spatial proximity and sample size for BSR-based prediction with GEDI gridded products.
This study fills a critical gap in our understanding of how generalized forest structural attributes can be used to infer specific field-observed biodiversity patterns, including those not directly observable from remote sensing instruments. Moreover, it bolsters the empirical basis for global-scale biodiversity prediction with GEDI spaceborne lidar.


Introduction
Global biodiversity loss and ecosystem degradation have mobilized large-scale ecological monitoring initiatives that seek to fulfill goals outlined under international agreements like the Convention on Biological Diversity (Gonzalez and Londoño 2022). For example, the Group on Earth Observations Biodiversity Observation Network has espoused using remotely-sensed essential biodiversity variables (EBVs) to efficiently and consistently monitor spatial and temporal trends in biodiversity over large areas (Pereira et al 2013). Remotely-sensed EBVs, when paired with corresponding field data from global monitoring networks, can facilitate data-driven prioritization efforts for ecosystem restoration, biodiversity conservation, and carbon sequestration at unprecedented scales (Strassburg et al 2020).
Lidar (light detection and ranging) observations of three-dimensional forest structure (the structural dimensions, heterogeneity, and configuration of vegetative elements in a forest ecosystem) provide particularly effective EBVs that have seen increased use in recent decades (Coops et al 2021). Lidar-derived forest structural attributes like biomass and vertical profiles have been critical to international efforts to delineate forest condition and integrity (Hansen 2019), estimate global carbon dynamics (Duncanson et al 2022), and characterize habitat and biodiversity (Fagua et al 2021, Pillay et al 2022). For biodiversity mapping applications, the promise of lidar lies primarily in its ability to exploit empirically-observed biodiversity-structure relationships (BSRs) to predict geographical patterns in species diversity across taxonomic groups including tropical mammals (Deere et al 2020), birds (Goetz et al 2007, Burns et al 2020), reptiles (Shine et al 2002), arthropods (Müller et al 2014), as well as understory plants and trees (Hakkenberg et al 2018, Wang and Gamon 2019).
The covariance between biodiversity and forest structure that is described by BSRs is hypothesized to reflect both specific habitat affinities and cross-taxon habitat relationships (Yong et al 2018). For example, studies have noted a correlation between habitat heterogeneity and species diversity that emerges when spatially-heterogeneous resource distributions drive niche partitioning and species packing in structurally-complex ecosystems (Costanza et al 2011). Conversely, forest structure has been described as an emergent property of complex community processes (including forest self-organization through tree growth and allometry patterns as well as multi-species interactions), which together manifest as tree size-frequency distributions, forest spacing relations, and canopy configurations (West et al 2009). In this view, irrespective of factors like disturbance and succession, the more that diverse genotypes produce diverse phenotypes (e.g. tree crown morphologies), the greater the likelihood that more diverse community assemblages will manifest as structurally complex forests (Horn 1971). Thus, whether structure drives diversity or diversity drives structure, BSRs serve as an expedient means to infer community attributes undetected by air- and spaceborne sensors (e.g. understory herbaceous diversity) from attributes that can be remotely sensed, like canopy layering or structural diversity. Importantly, unlike similar formulations encompassed by the Diversity Begets Diversity Hypothesis (Palmer and Maurer 1997) or the Height Variation Hypothesis, which only applies to tree diversity (Torresani et al 2023), BSRs represent more generalized taxon-habitat relationships that can include structural metrics unrelated to heterogeneity, such as canopy height or plant area index (Hakkenberg and Goetz 2021).
Lidar mapping of ecosystem structure is unprecedented for its efficiency, accuracy, and extent compared with traditional field-based methods (Atkins et al 2023a). At local to landscape scales (<100 km²), airborne laser scanning (ALS) provides fine resolution, highly accurate data on forest canopy structure. However, despite the recent proliferation of ALS datasets, ALS can be too expensive or logistically challenging for large-scale applications or inadequate when fused piecemeal across disparate flights. On the other hand, for large-extent (>100 km²) applications, spaceborne lidar such as NASA's Global Ecosystem Dynamics Investigation (GEDI) provides consistent data on canopy structural attributes like plant area volume density, foliage height diversity, and aboveground biomass at near-global extents (Dubayah et al 2020). While orbital geometry and cloud cover limit observations, GEDI's spatial coverage transcends global administrative borders and mitigates geographical bias in ecological data collection (Meyer et al 2016). Importantly, GEDI is a sampling instrument, whose fundamental measurement is a vertical waveform profile of returned energy in discrete 25 m diameter footprints, such that the area between footprints (60 m along-track and 600 m across-track) is not sampled other than via additional orbital acquisitions (Dubayah et al 2020). Despite the GEDI Science Team having processed >20 billion high quality footprints, a substantial proportion of the Earth's surface remains unsampled. In response, researchers have implemented a variety of procedures to fill these interstitial gaps and provide continuous gridded raster data on canopy structure (figure 1), including GEDI footprint aggregation (Tang et al 2019) and model interpolation based on ancillary remote-sensing datasets (Healey et al 2020, Potapov et al 2021).
As with ALS point clouds gridded into raster products like canopy height models, there is no native pixel resolution for gridding GEDI footprint data into spatially-continuous maps. Grid cell resolution is instead determined in the context of the application, with the goal of maximizing representation while minimizing error due to factors like subpixel heterogeneity, footprint sample size, or noise introduced from ancillary datasets (Frazer et al 2011). For example, some recent GEDI gridded canopy height products have incorporated optical data to attain spatial resolutions of 30 m (Potapov et al 2021, Wang et al 2022). However, fine resolution gridding may come at the expense of accuracy, especially where interpolated values are modeled from saturation-prone optical data (Mutanga et al 2023). Gridding at coarser resolutions may mitigate some of these issues, as more GEDI samples per grid cell increase the chance of representative coverage in aggregate statistics, albeit at the expense of spatial proximity and fine detail. These trade-offs in the scale dependency of GEDI footprint aggregation have profound effects on its use in biodiversity models.
In this study, we sought to address these challenges to using GEDI data for taxonomic biodiversity prediction, and to infer the relationship between multi-scale lidar characterizations of forest structure and field-measured vascular plant and tree diversity (i.e. BSRs) across the United States. We addressed the following questions: (1) How do BSRs based on coincident ALS data compare with those derived from aggregated GEDI footprints? (2) How do BSRs vary between plants and trees (where trees are a subset of all vascular plants), among alpha, beta and gamma diversity indices, among six indices of forest structure, and across the ecoregions of the USA? (3) What scale dependencies do these relationships exhibit, and how do they vary by sensor type, sample spatial structure and sampling density?
We hypothesize that ALS point cloud data is a more consistent predictor of plant and tree diversity compared with GEDI waveforms owing to its finer resolution, complete plot coverage, and spatial coincidence with field-measured biodiversity data. Further, we expect stronger and more consistent BSRs for tree diversity versus plant diversity owing to lidar's ability to directly detect canopy trees, unlike smaller understory plants which can only be indirectly inferred. A critical extension of these expectations, with implications for global diversity modeling using spaceborne lidar, is the question of whether generalized characteristic forest structural profiles derived by aggregating proximate GEDI footprints are sufficient to infer alpha diversity at specific plots, as well as beta and gamma diversity gradients across landscapes.

Vascular plant and tree diversity
We calculated vascular plant and tree diversity from the plant presence and percent cover data and the woody plant vegetation structure data of the National Ecological Observatory Network (NEON), collected in 20 × 20 m (400 m²) field plots (NEON 2021). NEON is a hierarchically structured ecological observation facility distributed across the United States that consists of 47 terrestrial sites, each between 68 and 470 km². Sites possess 22-69 base plots ('basePlots') each, with specific locations determined through a stratified-random sampling design to represent the compositional and environmental variance across each site (Barnett et al 2019). Among all plots, we selected those located in forest or woodland cover types, defined as possessing ⩾30% canopy cover in a 2512 m² lidar extent subsuming each plot (Bonan et al 2002). Sites falling outside GEDI's ±52° latitudinal coverage (e.g. sites in Alaska) were excluded, as were those possessing fewer than 10 plots per site to ensure adequate site-level sample sizes in the mixed effects statistical design (Harrison et al 2017). When plot data were available for multiple years, the most recent date with temporally coincident diversity, ALS, and GEDI data was selected. The resulting dataset consisted of 978 plots across 32 sites (appx. S1). Local, or 'alpha', plant diversity (plant_alpha) was represented as total vascular plant species richness, including trees, as derived from cover values in NEON's spatially nested subsampled corners (see Barnett et al 2019) and aggregated to the 400 m² full plot extent (table 1). Tree alpha diversity (tree_alpha) corresponds to the species richness of live multi- and single-bole >1.3 m stems (NEON 2021). Beta diversity, which characterizes spatial turnover in composition among all plots per site, was quantified as total replacement diversity from a Sørensen dissimilarity matrix using species' cover values for plants (plant_beta) and basal area for trees (tree_beta) using the 'adespatial' package (Dray et al 2022) in R (R Core Development Team 2021). Site-level (68-470 km²) gamma diversity corresponds to the total species richness of vascular plants (plant_gamma) and trees (tree_gamma) across all plots per site (appx. S2).
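The three diversity levels described above can be illustrated with a minimal Python sketch. It is a simplification rather than the study's procedure: it uses presence-absence data only and reports plain pairwise Sørensen dissimilarity, whereas the study derived total replacement diversity from cover and basal area with the 'adespatial' package in R. The function name `diversity_summary` and the toy cover matrix are hypothetical.

```python
import numpy as np

def diversity_summary(cover):
    """Alpha, gamma, and pairwise Sorensen dissimilarity for one site.

    cover: 2-D array, rows = plots, columns = species, values = cover (0 = absent).
    Assumption: presence-absence only; abundance information is ignored.
    """
    present = np.asarray(cover) > 0
    alpha = present.sum(axis=1)              # species richness per plot
    gamma = int(present.any(axis=0).sum())   # richness pooled over all plots
    n = present.shape[0]
    sorensen = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            shared = np.sum(present[i] & present[j])
            total = present[i].sum() + present[j].sum()
            # Sorensen dissimilarity: 1 - 2a / (2a + b + c)
            d = 1.0 - 2.0 * shared / total if total > 0 else 0.0
            sorensen[i, j] = sorensen[j, i] = d
    return alpha, gamma, sorensen

# Toy site: three plots, four species (hypothetical cover values).
site_cover = np.array([
    [5.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 4.0, 6.0],
])
alpha, gamma, sorensen = diversity_summary(site_cover)
```

Here every plot holds two species while the site holds four, so turnover among plots (beta) accounts for the gap between local and site-level richness.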

Lidar data
Site-wide ALS data were acquired from the NEON Airborne Observation Platform flown during peak leaf-on conditions, defined as >90% of maximum greenness. After calibration for instrument characteristics, meteorology and geographic data, NEON derived an orthorectified, discrete return point cloud product in 1 km² tiles (NEON 2021). Point cloud tiles were mosaicked across each site, normalized to the ground surface based on a digital terrain model, and subjected to quality control for corrupted data, with returns greater than three standard deviations from the site mean return height removed as spurious. Processing of the 1 m resolution ALS 'census' (i.e. a wall-to-wall, continuous map) was performed using the 'lidR' package in R (Roussel and Auty 2019). GEDI L2A and L2B files were selected that cover NEON sites, correspond to leaf-on dates (day of year 105-319), and possess minimal positional degradation as indicated by the degrade flag (Dubayah et al 2021a, 2021b). Due to issues with ground-finding in steep terrain, we excluded all footprints occurring on slopes greater than 20°, or the 75th percentile of all site-level slope values, whichever was greater (Fayad et al 2021). We likewise excluded GEDI footprints that were more than 150 m higher or lower than a TanDEM-X digital elevation model (German Aerospace Center 2018). Owing to the potential for geolocational accuracy to affect comparisons with co-located ALS samples, we instituted a 'bullseye' procedure to improve the mean geolocational accuracy of GEDI footprints (Blair and Hofton 1999). To do this, we selected all GEDI L1B granules located within each NEON site. We then created a parallelized high-performance computing pipeline to simulate waveforms from NEON ALS tiles with the same parameters as GEDI waveforms using the 'colocateWaves' function (Hancock et al 2019). Because simulated ALS waveform locations have a geolocational accuracy of <1 m, we jittered the x and y locations of each simulated waveform within 20 m of the GEDI footprint's nominal location to determine which offsets to the original GEDI footprint locations would result in the highest correspondence with simulated waveform locations. Offset values (spatial adjustments in x and y directions) corresponding to the highest Pearson correlations for 352 height bins between the original and simulated waveform profiles were used to adjust GEDI footprint locations (Blair and Hofton 1999).
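The offset search at the core of the bullseye procedure can be sketched as a grid search that maximizes the Pearson correlation between an observed waveform and waveforms simulated at shifted locations. The sketch below is illustrative only: `simulate_waveform` is a toy stand-in for an ALS-based simulator (it is not the 'colocateWaves' function), and the 60 height bins, 5 m search step, and coordinates are assumptions chosen for brevity.

```python
import numpy as np

Z = np.arange(0.0, 60.0, 1.0)  # height bins in m (hypothetical; the study used 352 bins)

def simulate_waveform(x, y):
    """Toy simulator: two canopy layers whose heights vary smoothly with x and y."""
    lower = np.exp(-(Z - (15.0 + 0.2 * x)) ** 2 / 8.0)
    upper = np.exp(-(Z - (40.0 + 0.2 * y)) ** 2 / 8.0)
    return lower + upper

def best_offset(observed, x0, y0, simulate, max_shift=20, step=5):
    """Return (dx, dy, r) maximizing Pearson r within max_shift of (x0, y0)."""
    best = (0, 0, -np.inf)
    for dx in range(-max_shift, max_shift + 1, step):
        for dy in range(-max_shift, max_shift + 1, step):
            if dx * dx + dy * dy > max_shift ** 2:
                continue  # keep the search inside a circle of radius max_shift
            r = np.corrcoef(observed, simulate(x0 + dx, y0 + dy))[0, 1]
            if r > best[2]:
                best = (dx, dy, r)
    return best

# A footprint whose true location is displaced by (5, -10) m from its nominal one.
true_dx, true_dy = 5, -10
observed = simulate_waveform(100.0 + true_dx, 50.0 + true_dy)
dx, dy, r = best_offset(observed, 100.0, 50.0, simulate_waveform)
```

In this noise-free toy case the search recovers the imposed displacement. With real waveforms the correlation surface is noisier, which is presumably why the study describes the correction as improving mean geolocational accuracy.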

Forest structural metrics from ALS and GEDI
Six forest structural metrics that are parsimonious, interpretable, and representative across a range of forest structural conditions were derived from continuous ALS census data and discrete GEDI footprint samples (table 1). While these structural metrics possess varying degrees of correlation (appx. S3), they were selected to represent endpoints along three conceptually-orthogonal axes of structural variation: canopy dimensions, structural heterogeneity, and spatial configuration (Hakkenberg and Goetz 2021). All forest structural metrics were calculated to allow direct comparison between ALS point cloud and GEDI waveform data formats.
Canopy dimensions were represented by maximum height (H_max), or the 98th percentile of relative heights (RH98), and total plant area index (tPAI), which includes photosynthetic and nonphotosynthetic elements like stems and branches. For GEDI waveforms, tPAI was derived using 5 m binned PAI estimates from 1.3 m above ground to RH98, while for ALS, tPAI was estimated by summing 1 m PAI bins up to RH98, using a constant extinction coefficient to account for signal attenuation below the upper canopy (Bouvier et al 2015). Structural heterogeneity was quantified as foliage height diversity (FHD) in 1 m height bins and Pielou's J height evenness (H_J) between 1.3 m and RH98. Spatial configuration was characterized by understory structure, or PAI below canopy median height (PAI_b2), as well as vertical skewness using the vertical distribution ratio (VDR), where low VDR values correspond to top-heavy canopies and vice versa (Goetz et al 2007). For GEDI footprints, VDR was based on the distribution of 5 m PAI bins, while for ALS point clouds, it was estimated using the cumulative PAI profile across all 1 m voxels.
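As a rough illustration of how three of these metrics follow from a vertical PAI profile, the sketch below computes tPAI as the sum of binned PAI, FHD as the Shannon index of the bin proportions, and Pielou's J as FHD divided by its maximum possible value. It assumes the input bins already span 1.3 m to RH98 and that evenness is normalized by the total number of bins; the function name `profile_metrics` and the example profile are hypothetical.

```python
import numpy as np

def profile_metrics(pai_bins):
    """Return (tPAI, FHD, Pielou's J) for a vertical profile of binned PAI.

    Assumptions: bins span 1.3 m to RH98; J is normalized by the log of the
    total number of bins (empty bins included).
    """
    pai = np.asarray(pai_bins, dtype=float)
    total = float(pai.sum())              # total plant area index
    if total <= 0.0:
        return 0.0, 0.0, 0.0
    p = pai[pai > 0] / total              # proportion of PAI per non-empty bin
    fhd = float(-np.sum(p * np.log(p)))   # Shannon index of the profile
    n_bins = pai.size
    evenness = fhd / float(np.log(n_bins)) if n_bins > 1 else 0.0
    return total, fhd, evenness

# Hypothetical profile with most plant area in the upper bins (listed bottom to top).
tpai, fhd, h_j = profile_metrics([0.2, 0.4, 0.9, 1.5])
```

A profile with equal PAI in every bin yields the maximum FHD for that number of bins and an evenness of 1, while concentrating plant area in a few bins lowers both.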

Structural metric scaling and aggregation
We assessed multi-scale ALS-derived structural metrics and aggregated structural metrics from GEDI samples at varying proximity from NEON basePlots to characterize local to landscape-level vegetation structure. Owing to plant competition dynamics that may be influenced by factors outside of a given plot, like canopy shading from proximate stands, we hypothesized that structural profiles in areas in and outside of each plot mediate plot-scale community dynamics. Thus, to assess potential scale dependency of ALS wall-to-wall census data in capturing structurally mediated community dynamics outside of the plot, we clipped ALS point clouds at five extents: the 400 m² basePlot (10 m half-width square) and four concentric circular extents with radii 12.5, 25, 37.5, and 50 m from the basePlot center (table 2; figure 2(c)).
In contrast to NEON ALS, GEDI data consist of spatially discrete ∼25 m diameter footprints that form a crosshatch spatial structure resulting from instrument design and the ISS's orbital geometry (figure 2(b)). To derive a characteristic structural profile for a given basePlot, structural metrics from GEDI footprints were aggregated across distances of 200, 400, 700, 1100, 1600, 2200, 2900, 3700, 4600, and 5600 m from each basePlot center (table 2). In addition, GEDI-coincident ALS samples (i.e. 25 m diameter circular ALS samples co-located with GEDI footprints) were extracted from point cloud tiles, with bullseye geolocational correction ensuring relative fidelity between the two. Finally, two additional ALS datasets were derived for each plot and scale to assess the role of sampling spatial structure and density: a spatially-dispersed (not crosshatched) and sparse dataset (equivalent sample size to corresponding GEDI footprints), and a spatially-randomized, dense dataset where the sample size was approximately four times that of the sparse dataset (figure 2(b); table 2). All metrics derived from GEDI and ALS-coincident footprints were aggregated using inverse-distance weighted linear kriging. After removing outlier values (z ⩾ 4), kriged aggregates of all six structural metrics were represented by the distribution's weighted mean and standard deviation. To ensure compatibility for site-level analyses, we delineated the site area by a 10 km buffer surrounding the convex hull of all basePlots, or the site boundary, whichever was smaller.
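The aggregation step can be approximated with a short sketch. It swaps in plain inverse-distance weighting for the inverse-distance weighted kriging named above, so it should be read as an illustration of the weighted mean and standard deviation only, not as the study's estimator. The function name `aggregate_footprints`, the weighting power, the 1 m distance floor, and the example values are all assumptions.

```python
import numpy as np

def aggregate_footprints(values, distances, radius, z_max=4.0, power=1.0):
    """Inverse-distance weighted mean and SD of a metric within `radius` of a plot.

    values: metric value per footprint (e.g. RH98 in m)
    distances: footprint distance from the plot center (m)
    Footprints with |z| >= z_max are dropped before weighting.
    """
    v = np.asarray(values, dtype=float)
    d = np.asarray(distances, dtype=float)
    inside = d <= radius
    v, d = v[inside], d[inside]
    if v.size == 0:
        return float("nan"), float("nan")
    sd = v.std()
    z = np.abs(v - v.mean()) / sd if sd > 0 else np.zeros_like(v)
    keep = z < z_max
    v, d = v[keep], d[keep]
    w = 1.0 / np.maximum(d, 1.0) ** power   # floor at 1 m to avoid division by zero
    mean = np.sum(w * v) / np.sum(w)
    spread = np.sqrt(np.sum(w * (v - mean) ** 2) / np.sum(w))
    return float(mean), float(spread)

# Hypothetical canopy heights (m) at four footprints; the last lies beyond 400 m.
heights = [22.0, 24.0, 30.0, 18.0]
dist_m = [50.0, 150.0, 350.0, 900.0]
mean_400, sd_400 = aggregate_footprints(heights, dist_m, radius=400.0)
```

Widening `radius` through the aggregation distances listed above admits more footprints per plot, which is the trade-off between sample size and spatial proximity that the study examines.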

Ecoregions
To subset NEON's multi-biome domain into distinct biomes, we employed the hierarchically-nested Environmental Protection Agency (EPA) level I 'ecoregion' typology, which excels for its geographic detail and spatial realism that includes features like nested enclaves (EPA 2013). EPA level I ecoregions reflect the primary axes of compositional and ecoclimatic variation across the United States, and thus serve as an expedient means for assessing cross-biome effects across all 32 NEON sites versus intra-biome effects, which we delineate with the following ecoregions: (1) North American Deserts (Deserts), (2) Eastern Temperate Forests (Eastern Forests), (3) Great Plains, (4) Northern Forests, (5) Northwestern Forested Mountains (NW Forests), and (6) Tropical Wet Forests (Tropics). Owing to San Joaquin Experimental Range (SJER) being the sole site in its level I group (Mediterranean California), it was grouped with proximate Sierra Foothills sites in the NW Forests category, such as Lower Teakettle (TEAK) and Soaproot Saddle (SOAP) (figure 1).

Statistical analyses

GEDI footprint representative elementary area
To investigate scale dependency in GEDI footprint aggregation, we estimated the representative elementary area (REA) of GEDI-derived structural metrics with increasing distance from basePlots. REA estimates the spatial scale at which the variability of a response flattens, or falls to an acceptably low level, to determine optimal resolution for gridding lidar rasters (Atkins et al 2023b). To do this, we first calculated a distance window from the range and nugget of empirical semivariograms for all six structural metrics using the 'automap' package in R (Hiemstra and Skoien 2023), using these range values to determine a window midpoint extending equally on both sides until reaching the nugget. Second, we implemented a linear changepoint analysis within these distance windows for all basePlots and each structural metric based on segmented regression, which estimates the two linear fits that minimize the Akaike information criterion for each line while testing for statistical significance based on the Davies Test using the 'segmented' package in R (Muggeo 2022). The changepoint represents the point of stationarity for each GEDI metric, where the response of GEDI structure changes abruptly in response to the independent variable, the sample distance from each plot (Atkins et al 2023b).
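The changepoint idea can be sketched with a brute-force version of segmented regression: for each candidate breakpoint, fit a continuous two-segment line by least squares and keep the candidate with the smallest residual sum of squares. This is not the iterative algorithm of the 'segmented' package and omits the Davies test; the function name `fit_breakpoint` and the synthetic distance-variability curve are hypothetical.

```python
import numpy as np

def fit_breakpoint(x, y, candidates):
    """Return (breakpoint, SSE) of the best continuous two-segment linear fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_c, best_sse = None, np.inf
    for c in candidates:
        hinge = np.maximum(0.0, x - c)   # lets the slope change at c
        design = np.column_stack([np.ones_like(x), x, hinge])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if sse < best_sse:
            best_c, best_sse = float(c), sse
    return best_c, best_sse

# Synthetic example: variability declines with distance, then flattens at 500 m.
distance = np.arange(0.0, 2000.0, 50.0)
variability = np.where(distance < 500.0, 10.0 - 0.01 * distance, 5.0)
breakpoint_m, sse = fit_breakpoint(distance, variability, distance[2:-2])
```

Restricting `candidates` to a window of distances plays the same role as the semivariogram-derived distance window described above, although here the window is simply the interior of the sampled range.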

Spatial generalized modeling
We employed a spatial generalized linear mixed modeling (GLMM) framework to infer the magnitude and direction of structural effects on patterns of alpha diversity (appx. S4). Spatial GLMMs are well-suited to NEON's spatially-clustered and hierarchically-related plot-site sampling design because they simultaneously account for spatial autocorrelation among plots within sites, while employing a site-level random effects structure that mitigates inter-site confounding effects like climate and topography (Bolker et al 2008). Random effects also account for imbalances in plot sample sizes among sites (Harrison et al 2017). GLMMs assessed the bivariate relationship between censused and sample-aggregated structural metrics at a given scale (the predictor) and its corresponding field-plot derived biodiversity value (the response). All predictors were standardized and centered on zero, while model error distributions (e.g. normal, Poisson, negative binomial) were determined empirically (appx. S4).
We accounted for spatial autocorrelation among plots using an empirical spatial covariance matrix based on Euclidean pairwise distances among plots, with an exponential correlation structure (Venables and Ripley 2002). This explicit accounting of spatial autocorrelation at the model stage partially accounts for unmeasured environmental variables and spatially-structured error (Wimberly et al 2009). Spatial GLMMs were performed using the 'MASS' package in R (Venables and Ripley 2002). Site-level beta and gamma diversity data, which do not possess a spatialized hierarchical structure like the geographically nested plots, were modeled using spatial generalized least squares (GLS) in the 'nlme' package in R (Pinheiro and Bates 2022).
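To make the exponential correlation structure concrete, the following sketch builds a correlation matrix from pairwise Euclidean distances and uses it in a Gaussian generalized least squares fit of a response on one standardized predictor. It is a deliberate simplification: the correlation range is treated as known rather than estimated, there are no random effects, and the error distribution is normal, so it is not equivalent to the 'MASS' or 'nlme' fits used in the study. All names and values are hypothetical.

```python
import numpy as np

def exponential_correlation(coords, corr_range):
    """Correlation matrix exp(-d / range) from pairwise Euclidean distances."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / corr_range)

def gls_fit(x, y, corr):
    """GLS intercept and slope for y on standardized x, given a correlation matrix."""
    x = np.asarray(x, dtype=float)
    xs = (x - x.mean()) / x.std()                 # center on zero, unit variance
    X = np.column_stack([np.ones_like(xs), xs])
    Vinv_X = np.linalg.solve(corr, X)
    Vinv_y = np.linalg.solve(corr, np.asarray(y, dtype=float))
    # beta = (X' V^-1 X)^-1 X' V^-1 y
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Hypothetical site coordinates (m), structural metric, and a noise-free response.
coords = [[0, 0], [100, 0], [0, 100], [300, 300], [500, 100]]
structure = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
response = 2.0 + 3.0 * (structure - structure.mean()) / structure.std()
corr = exponential_correlation(coords, corr_range=200.0)
beta = gls_fit(structure, response, corr)
```

Because the predictor is standardized, the slope in `beta` is on the same scale as the standardized effect sizes reported in the study, although the estimates themselves come from a different model.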

Alpha diversity and forest structure across ecoregions
NEON basePlots across the biomes of the United States demonstrate distinct variation in both vegetation structure and diversity (appx. S5). Bivariate GLMMs indicate significant near-continental, cross-biome BSRs across most combinations of plant and tree alpha diversity indices, ALS- and GEDI-derived structural attributes, and spatial scales (figure 3; see appx. S6 for results with Simpson's D diversity). BSRs were especially consistent for tree diversity, with ALS-censused structure consistently related to tree_alpha across all scales. Among the six structural metrics, only VDR exhibited negative standardized effects (β_1: −0.28 to −0.18). As low VDR values correspond to top-heavy canopies, this result is consistent with the positive effects observed for other dimension-related metrics, with standardized slopes between 0.09 and 0.38. Despite some nonsignificant relationships at the smallest scales of aggregation where sample sizes were limited, tree BSRs derived from GEDI sample means exhibited a similar pattern in significance, sign, and magnitude, with the absolute value of standardized effect sizes (absβ_1) falling between 0.15 and 0.46 (appx. S7). Landscape variance in structural attributes, as estimated from the standard deviation of GEDI samples, showed a weaker overall relationship with tree diversity (absβ_1: 0.22-0.33) compared with mean values, though variance in spatial configuration (PAI_b2 and VDR) was a significant predictor across most scales, albeit negatively.
BSRs for vascular plant diversity exhibited a very different pattern, with spatially coincident census data from ALS (absβ_1: 0.07-0.22) outperforming GEDI sample aggregate means, which were not significant across scales. Importantly, unlike ALS samples, GEDI samples were spatially offset from the field plots where diversity was measured (figure 3). ALS-derived H_max, tPAI, and FHD were the most frequently significant (positive) structural attributes associated with plant_alpha. While no mean GEDI aggregates were significantly related to ground stratum-dominated plant diversity, landscape variance in GEDI-sampled VDR showed a significant, positive relationship with plant_alpha (absβ_1: 0.002-0.26), indicating that variance in bottom-heavy canopies is positively related with local plant diversity at landscape scales.

Alpha diversity and forest structure by ecoregion
Subsetting plot-level BSRs by ecoregion reveals that some relationships significant at the cross-biome level were not significant at the ecoregion level, and vice versa (figure 4; appx. S6). For example, despite the strong signal for cross-biome tree BSRs with both ALS and GEDI data (absβ_1: 0.09-0.46), they were not significant in Northern Forests. Further, H_max and PAI_b2 BSRs were inconsistent for NW Forests. For plant BSRs based on ALS data, H_max, tPAI, and FHD were only significant for Eastern Forests and the American Tropics, while tPAI was negatively related to plants in NW Forests. Conversely, where landscape means of GEDI-derived structure were not significant predictors of local plant diversity at the cross-biome scale, when subset by ecoregion, several of these relationships were significant (absβ_1: 0.05-0.34), especially for Deserts which indicated generally positive relationships with structural dimensions and heterogeneity. Unlike cross-biome results, ecoregion-specific PAI-based BSRs were negative in Northern Forests, and landscape FHD was negatively related to plant_alpha in NW Forests.

[Figure 4 caption, beginning lost in extraction] ... β_1) from a series of spatial GLMMs, split by ecoregion. ALS census and GEDI sample aggregate results left and right of vertical lines, respectively. Terms not significant (hollow point) when the credible interval includes zero. Note: some point estimates excluded when ecoregion-level sample size requirements (i.e. 10 plots per site and 3 sites per ecoregion) were not satisfied (see 2.5). Structural variables: height maximum (H_max), total PAI (tPAI), foliage height diversity (FHD), height evenness (H_J), vertical distribution ratio (VDR), and PAI of the bottom half of the canopy (PAI_b2).

Beta and gamma diversity BSRs
NEON site-level beta and gamma diversity BSRs, where site-level is defined as a 10 km buffer surrounding the convex hull of all basePlots, demonstrate distinctly different patterns from those observed with alpha diversity, especially for plant_alpha (figure 5). Whereas most landscape-scale aggregate structural profiles were not significantly related to plant alpha diversity (figure 3), most site-level mean structural profiles were significantly correlated with both plant_beta and plant_gamma, albeit negatively (absβ_1: 0.39-0.64, where significant). On the other hand, site-level variance (cv) in structure was a positive predictor of plant_beta and plant_gamma (absβ_1: 0.28-0.58, where significant), especially for H_max, tPAI, and FHD. Together, these results indicate that low-statured, sparse canopies with a relatively large degree of heterogeneity across the landscape are associated with high plant turnover and greater regional diversity. Tree_beta and tree_gamma BSRs, on the other hand, were less consistently significant, especially for tree_beta, which was not significantly correlated with any site-level structural attributes. Tree_gamma relationships performed better (absβ_1: 0.38-0.66, where significant), demonstrating a strong positive correlation with several mean site-wide structural indices, especially those positively correlated with tree_alpha, such as tPAI and FHD. In most cases, standardized effect sizes between GEDI and GEDI-coincident ALS censuses were not significantly different.

Scale dependency of GEDI variables
REA breakpoint analysis indicated a consistent pattern in the magnitude of scale dependency among structural variables (figure 6). While some individual plots exhibited long tails, extending the distribution of REA breakpoints between 1400 and 2300 m, median REA values ranged between 400 and 500 m (mean = 494 m; sd = 434 m) across all structural metrics. Ecoregion-level median values were relatively clustered near the cross-biome median, though some prominent differences arose. For example, Desert ecoregions exhibited a smaller REA range (139-220 m) for all metrics except VDR, while Eastern Forests had the largest range for PAI-related variables and indicators of structural heterogeneity.

[Figure 5 caption, beginning lost in extraction] ... β_1) of site-level structural attributes on beta and gamma diversity. Site structure is inferred from GEDI (brown) and GEDI-coincident ALS samples (blue), aggregated by mean and coefficient of variation (cv). The median posterior estimate (points) and 95% credible interval (error bars) are significant (filled point) when the credible interval does not include zero (dashed line). Structural variables: height maximum (H_max), total PAI (tPAI), foliage height diversity (FHD), height evenness (H_J), vertical distribution ratio (VDR), and PAI of the bottom half of the canopy (PAI_b2).

Sensor, spatial structure, and sampling density
We used tPAI and FHD (see appx. S8 for other structural variables) to demonstrate the effects of sensor type, spatial structure, and sampling density on tree_alpha BSRs. Sensor-platform comparisons indicate that while the increased resolution and precision of airborne point clouds resulted in slightly higher slope coefficient values (mean absβ_1 of 0.26 for ALS versus 0.25 for GEDI) and a greater number of significant BSRs compared with spaceborne waveforms (17 cases for ALS versus 14 for GEDI), the difference between the two was not significant (figure 7(a)). The role of sample spatial structure, assessed by comparing cross-hatched GEDI-coincident ALS footprints versus those evenly dispersed in concentric sampling extents, was likewise not a significant factor influencing the significance and magnitude of BSRs from aggregate GEDI samples (figure 7(b)). To test the influence of sample size (n), we compared two sets of evenly dispersed ALS samples: a sparse set where sample n equals that from GEDI footprints and a dense set where sample locations are randomized and sample n is four times that of the sparse set (table 2). Results show no significant difference between the two sets (figure 7(c)), indicating that GEDI sample density does not significantly affect the strength of local tree BSRs, at least in the North American latitudes and at scales of 400 m or above (appx. S7).

Coincident BSRs: alpha diversity
In this study, we used NEON data to assess spatially-coincident BSRs (where field-observed diversity measurements and ALS-derived structure spatially coincide at a single plot) for vascular plants and trees in 32 sites dispersed across the United States. Cross-biome analyses confirm macrosystem-wide BSR relationships, though divergence in the significance and magnitude of vascular plant versus tree BSRs reflects different hypothesized processes. Unlike understory plants, whose presence is indirectly inferred from complex BSRs, tree crowns are often larger than the common (1-2 m) grain size of ALS data, and thus can be directly observed. This direct relationship has important implications for the interpretation of the processes driving tree BSRs. For example, structural heterogeneity metrics were found to positively co-vary with tree diversity. This empirical phenomenon, described by the Height Variation Hypothesis (Torresani et al 2023), is a subset of the Diversity Begets Diversity Hypothesis, whereby diversity in one part of an ecological system may give rise to diversity in another (Palmer and Maurer 1997). Accordingly, tree diversity may be positively correlated with structural heterogeneity when species exhibit inter-specific variance in crown morphology and growth habit (Pretzsch 2014, Kunz et al 2019). Alternatively, structurally mediated habitat heterogeneity may, on its own, provide the conditions for niche partitioning that drives recruitment of a broader pool of species (Canham et al 1994, Ishii and Asano 2010).
The consistency of our results across structural heterogeneity metrics supports the macroecological consistency of the Height Variation Hypothesis, as well as GEDI's efficacy for predicting tree diversity (Marselis et al 2022). Importantly, these results generalize the hypothesis to include a broader set of structural metrics, including canopy dimensions and spatial configuration. While structural attributes like Hmax and tPAI were correlated with structural heterogeneity metrics (appx. S3), their relationship with tree diversity suggests different underlying mechanisms. For example, the More Individuals Hypothesis attributes the positive correlation between productivity and tree diversity to sampling effects when larger canopy dimensions are positively correlated with stem density (Srivastava and Lawton 1998). Alternatively, soil fertility and site index effects based on abundant resource availability may concurrently drive canopy size and the niche partitioning underlying increased diversity (Cardinale et al 2009).
For all vascular plants, where richness in temperate zones tends to be dominated by understory herbaceous species (Barbier et al 2008), coincident BSRs were significant across all structural metrics for at least one scale. At this cross-biome scale, structural variables related to stand age and site index, such as Hmax, tPAI and FHD, exhibited the strongest relationships with plant diversity indices. Taken together, these results suggest that larger and more complex canopy volumes sustain greater vascular plant diversity totals, providing a path to infer understory diversity from airborne remote sensing, even where it cannot be directly observed. Importantly, these patterns may also reflect larger ecoclimatic trends simultaneously driving both diversity and structure (Chu et al 2019). Supporting this assertion, studies have found that after controlling for the interactive role of temperature and precipitation, canopy dimensions were actually negatively correlated with plant diversity, likely due to the increased number of canopy gaps supporting understory recruitment (Hakkenberg and Goetz 2021). As observed in related studies where understories were sufficiently dense (i.e. high PAI b2 values), we found a negative relationship with plant diversity in light-constrained understories (Dormann et al 2020).

Characteristic BSRs: alpha diversity
Spatially-coincident BSRs assume that the spatial resolution of structural data aligns with the specific scale of the target organism; in this case, tree crowns and canopy gaps. This direct relationship between plot biodiversity and structure has long been described in the literature (MacArthur and MacArthur 1961). In this study, we tested whether spatially-coincident BSRs generalize to landscape scales using aggregate structural statistics to infer large-scale diversity patterns. Results generally support this hypothesis, as GEDI and GEDI-coincident ALS footprints (calculated as kriged sample means and variances in 200-5600 m radii sampling extents, corresponding to 0.1-98 km²) were significant across the majority of the bivariate relationships assessed.
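The correspondence between sampling radii and areas quoted above follows from the area of a circle. A short Python check, assuming circular sampling extents (the helper name `extent_area_km2` is introduced here for illustration):

```python
import math

def extent_area_km2(radius_m):
    """Area (km^2) of a circular sampling extent with the given radius (m)."""
    return math.pi * (radius_m / 1000.0) ** 2

smallest = extent_area_km2(200)    # roughly 0.13 km^2, quoted as 0.1 km^2
largest = extent_area_km2(5600)    # roughly 98.5 km^2, quoted as 98 km^2
print(f"{smallest:.3f} km^2 to {largest:.1f} km^2")
```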
We posit that this relationship emerges as the result of different underlying mechanisms than those driving plot-coincident BSRs. Rather than the direct resource constraints that structure imposes on community assembly and diversity (Cardinale et al 2009, Dormann et al 2020), our results instead suggest a robust relationship between biodiversity and landscape-scale characteristic profiles. Characteristic structural profiles, in this sense, denote general (versus specific) forest characteristics that can be represented through aggregate statistics like the mean and variance of structural attributes over a given area. Characteristic structure reflects underlying composition, as well as the complicated outcomes of traits, competition, disturbance, and dispersal mechanisms representative of a particular forest type and ecoclimatic context. Thus, to the extent that aggregate structural profiles are associated with characteristic diversity profiles, they may expedite biodiversity prediction over larger areas. While generic predictors from characteristic structural profiles may actually outperform specific, coincident predictors of plant diversity (such as with site-level structure and plant beta diversity), this approach has certain limitations for predicting inferred understory properties in specific locations. For example, plant alpha diversity was inconsistently related to characteristic structural profiles and instead tends to track idiosyncratic trends in micro-environmental gradients like topography, substrate, and disturbance (Barbier et al 2008, Fourrier et al 2015).
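As a rough illustration of what a characteristic structural profile looks like in practice, the Python sketch below aggregates footprint-level values of one structural metric within a circular extent around a plot. It is a simplification under stated assumptions: it uses unweighted sample statistics rather than the kriged means and variances used in this study, coordinates are assumed to be in a projected system with units of metres, and the function name `characteristic_profile` and the example values are hypothetical.

```python
import numpy as np

def characteristic_profile(plot_xy, footprint_xy, footprint_values, radius_m):
    """Unweighted mean and coefficient of variation of footprint values
    falling within radius_m of a plot centre (simplified; no kriging)."""
    plot_xy = np.asarray(plot_xy, dtype=float)
    footprint_xy = np.asarray(footprint_xy, dtype=float)
    values = np.asarray(footprint_values, dtype=float)

    # Euclidean distance from the plot centre to every footprint.
    dist = np.linalg.norm(footprint_xy - plot_xy, axis=1)
    inside = values[dist <= radius_m]

    if inside.size < 2:
        # Too few footprints to characterize variability.
        return {"n": int(inside.size), "mean": float("nan"), "cv": float("nan")}

    mean = float(inside.mean())
    sd = float(inside.std(ddof=1))          # sample standard deviation
    cv = sd / mean if mean != 0 else float("nan")
    return {"n": int(inside.size), "mean": mean, "cv": cv}

# Hypothetical footprints (x, y in metres) and canopy heights (m).
footprints = [(0, 100), (150, 0), (0, -300), (1000, 1000)]
heights = [10.0, 20.0, 30.0, 100.0]
profile = characteristic_profile((0, 0), footprints, heights, radius_m=400)
print(profile)
```

Repeating the call over a series of radii would yield one aggregate profile per sampling extent, which is the kind of predictor related to plot diversity in the characteristic BSRs described above.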

Ecoregional variation in BSRs: alpha diversity
The hierarchical design of spatial GLMMs accounts for large-scale spatial structure via site-level random effects and smaller-scale spatial effects represented with a spatial covariance matrix. However, this cross-biome analysis obscures inter-regional differences in the sign and magnitude of BSRs across disparate ecoregions. While little work has been done on the macroecology of BSRs (but see Hakkenberg and Goetz 2021), ecoregional differences likely reflect the interaction of observed patterns in biodiversity (Field et al 2009) and structure (Fahey et al 2019) along ecoclimatic gradients in precipitation and temperature (Ricklefs and He 2016, Keil and Chase 2019). For plot-coincident BSRs, the most pronounced difference between cross-biome and regional relationships occurred with structural metrics like Hmax, tPAI and FHD, where plant BSRs were significant (positive) for Eastern Forests and the American Tropics. Both Eastern Forests and the American Tropics span a large geographic area, where structurally heterogeneous and voluminous forests (e.g. Florida's Disney Wilderness Preserve; DSNY) tend to possess high plant diversity (appx. S1). In NW and Northern Forests, on the other hand, we found PAI-based BSRs to be nonsignificant, or negative in several instances. In these light-limited forests, tPAI and PAI b2 may actually serve to depress plant diversity, especially among ground stratum herbs which tend to dominate plant richness tallies (Fourrier et al 2015, Valladares et al 2016).
Conversely, at landscape scales of GEDI footprint aggregation (0.1-98 km²), we observed several nonsignificant cross-biome plant BSRs that were significant in individual ecoregions. This was especially the case for the Desert ecoregion where aggregations of structural metrics like Hmax, tPAI, HJ and PAI b2 were consistently related to plant diversity. This positive relationship between local plant diversity and canopy dimensions in the Desert ecoregion suggests multiple possible mechanisms including sampling effects, nurse plant effects, and microtopography. In the first case, GEDI-detected forest structure in sparsely-forested deserts may simply be detecting the presence of vegetation (versus non-vegetation) that serves as a correlate of plant diversity (Storch et al 2018). These sampling effects may be accentuated by the presence of nurse plants, where trees and shrubs provide a sheltered environment for colonizing plants (Madrigal-González et al 2020). Finally, GEDI's ground-finding algorithm has been observed to confound ground elevation and low-lying vegetation in the waveform signal where canopies are relatively short-statured (Fayad et al 2021). Thus, tPAI and PAI b2 in desert regions might partially reflect micro-topography which has independently been found to positively covary with plant diversity in arid systems (Zuo et al 2021).

Characteristic BSRs: beta and gamma diversity
Site-level analyses provide an example of how characteristic forest structural profiles correlate with macroecological diversity patterns like landscape-scale beta diversity (compositional turnover among plots per site) and regional-scale gamma diversity (site-level species richness). This test of BSRs for beta and gamma diversity is, to our knowledge, novel. One notable result from this analysis is that while characteristic forest structural profiles were not significantly related to plant alpha diversity, most mean landscape structural attributes were significantly related to plant beta and plant gamma diversity. For example, we found that variance in landscape structure was a consistent positive predictor of plant beta diversity, reflecting how structurally-determined habitat heterogeneity at site scales can be indicative of spatial turnover in species composition. Plant gamma BSRs exhibited similar patterns, being significant in characteristically low and sparse canopies, where high light levels support differential colonization and recruitment patterns versus closed-canopy, highly-shaded understories where understory recruitment is constrained for shade intolerant species (Ishii and Asano 2010).
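To make the three diversity levels concrete, the Python sketch below derives them from a plot-by-species presence matrix for a single site. Alpha is plot richness and gamma is site-level richness, following the definitions above. The beta index used here, Whittaker's multiplicative beta (gamma divided by mean alpha), is an assumption chosen for simplicity; this section does not state which turnover index the study used. The function name and the toy matrix are likewise hypothetical.

```python
import numpy as np

def diversity_partition(presence):
    """Alpha, beta and gamma diversity from a plots x species presence matrix.

    alpha: species richness per plot
    gamma: species richness pooled across all plots at the site
    beta:  Whittaker's multiplicative beta (gamma / mean alpha), an
           illustrative choice of turnover index
    """
    presence = np.asarray(presence, dtype=bool)
    alpha = presence.sum(axis=1)                    # richness per plot
    gamma = int(presence.any(axis=0).sum())         # pooled site richness
    mean_alpha = alpha.mean()
    beta = gamma / mean_alpha if mean_alpha > 0 else float("nan")
    return alpha, float(beta), gamma

# Toy site: 3 plots (rows) by 4 species (columns), hypothetical data.
site = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
alpha, beta, gamma = diversity_partition(site)
print(alpha.tolist(), beta, gamma)
```

In this toy site each plot holds two species but the site holds four, so turnover among plots doubles the pooled richness relative to the average plot.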
Tree gamma diversity exhibited a positive relationship with mean structural attributes and a negative relationship with landscape variance in those attributes. Together, these results support the interpretation that, as with plant gamma diversity, homogeneous and voluminous canopies are associated with greater regional tree diversity. Interestingly, where BSRs between tree alpha diversity and aggregate structural attributes were consistently significant across scales, no site-level structural attributes were significantly related to tree beta diversity. Unlike plant beta diversity, tree beta diversity patterns appear more aligned with turnover in environmental factors (e.g. habitat affinities) not directly captured by characteristic forest structural profiles (Vega et al 2020). Notably, beta and gamma diversity BSR analyses do not employ site-level random effects in the hierarchical design. Thus, rather than partitioning local effects, overall patterns are likely driven by macroecological patterns that underlie large ecoclimatic gradients in both diversity (Chu et al 2019) and structure (Scheffer et al 2018).

GEDI scale dependency and sampling considerations
Like ALS point clouds of discrete returns that can be gridded into raster maps of varying resolution, GEDI-based gridded products lack an inherent spatial grain. Optimized GEDI gridding seeks to maximize signal, while minimizing error for interpolated (kriged) estimates of structure between GEDI footprints. In this trade-off between spatial proximity and training set sample size, smaller sampling extents may best represent proximate structure, though smaller sample sizes may likewise bias mean estimates, especially in areas of high canopy heterogeneity (appx. S3). On the other hand, increasingly larger sampling extents allow for a regression towards mean characteristic structural profiles for a given landscape, albeit at the expense of representing local structural anomalies that could be impacting plot-level biodiversity dynamics like those affecting plant alpha diversity.
Ecological scale dependency analyses seek to assess the specific role of scale in constraining ecological relationships, either (1) as an inherent property of an individual variable, or (2) as a component of a relationship between two or more variables (Crawley and Harral 2001, Lechner et al 2012). In the former case, we found GEDI attributes to stabilize at REA distances of 450 m across variables, though long tails for some metrics extended to 2300 m, where heterogeneity in GEDI footprint data reflects underlying factors like spatial turnover in land cover, disturbance, and topographic heterogeneity at landscape scales. REAs represent the smallest grain size indicative of the mean and variance of a given variable, and thus have important implications for quantifying the size of neighborhoods requisite to define a 'characteristic' structural profile (Wood et al 1990). A similar approach based on local ALS data found expectedly smaller values owing to the sensor's finer resolution, with REA values between 25 and 75 m for canopy cover, arrangement, leaf area, and complexity variables, and up to 150 m for canopy heights (Atkins et al 2023b).
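The idea of an REA breakpoint (the distance at which the variance of a sampled metric levels off) can be sketched in Python as follows. This is a simple stabilization rule offered as an illustration, not the breakpoint procedure used in this study, which is not described in this section. The function name `rea_breakpoint`, the 5% tolerance, and the example variance curve are all assumptions.

```python
import numpy as np

def rea_breakpoint(radii_m, variances, tol=0.05):
    """Smallest radius beyond which every successive relative change in
    variance stays below tol. Returns NaN if the curve never stabilizes.
    (Illustrative rule only; tol is an arbitrary assumption.)"""
    radii = np.asarray(radii_m, dtype=float)
    var = np.asarray(variances, dtype=float)

    # rel_change[i] is the relative change in variance from radii[i] to radii[i + 1].
    rel_change = np.abs(np.diff(var)) / np.maximum(np.abs(var[:-1]), 1e-12)

    for i in range(len(rel_change)):
        if np.all(rel_change[i:] < tol):
            return float(radii[i])
    return float("nan")

# Hypothetical variance of a GEDI metric within growing radii around one plot.
radii = [100, 200, 300, 400, 500, 600, 700]
variances = [2.0, 3.5, 4.6, 5.0, 5.1, 5.05, 5.08]
stable_radius = rea_breakpoint(radii, variances)
print(stable_radius)
```

Applying a rule of this kind to every basePlot and metric would produce a distribution of breakpoints comparable in form to the one summarized in figure 6, although the actual values depend on the breakpoint method and tolerance chosen.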
In the latter case (the scaling relationship between biodiversity and structure), we found plot-coincident BSRs to peak at extents roughly 2-3 times the size of the measured plot (radii: 25-37.5 m). This is an important result that suggests resource acquisition dynamics outside of a given plot mediate plot-scale community assembly and competition. Factors influencing these resource dynamics include lateral root growth competition for water and nutrient uptake (Agee et al 2021), as well as asymmetric light competition where daily- and seasonally-varying incident angles constrain plot-level direct and diffuse light profiles (Schwinning and Weiner 1998). As for landscape-aggregate BSRs, cross-biome and ecoregion-level relationships tended to peak at the 2900 m sampling extent.
In terms of platform-sensor considerations, we found that while ALS census data provide better fine-scale characterizations of structure for predicting specific understory diversity profiles, at aggregate scales their performance did not differ significantly from that of GEDI waveforms. Further, neither GEDI's spatial structure (linear versus dispersed samples) nor its sampling density had a significantly detrimental effect on the strength of tree BSRs. Importantly, these results hold at spatial scales of 400 m and above, where aggregation areas possess sample sizes adequate to represent landscape characteristic structure. These results can inform BSR-based prediction by helping to determine the finest effective grain size for a given region or forest type.

Conclusion
Results from this study inform an enhanced biogeographic perspective on cross-biome BSRs (1) between plants and trees; (2) across alpha, beta, and gamma diversity; (3) among forest structural metrics; (4) across spatial scales and sensors; and (5) between specific, local relationships from spatially coincident ALS census data versus general, landscape relationships based on characteristic structural profiles from aggregated GEDI samples. Using a distributed, near-continental plot database, we found lidar-derived structural variables alone (i.e. without the incorporation of more standard predictors of biodiversity like climate, topography, or optical imagery) were significantly related to plant and tree diversity patterns. These results fill a critical gap in our understanding of how landscape-scale characteristic forest structural attributes predict specific field-observed biodiversity patterns, especially for understory communities that are not directly observable from remote sensors. Moreover, empirical BSRs corroborate expectations from the Height Variation Hypothesis, while extending them to include a larger set of structural attributes, as well as at landscape to regional scales with beta and gamma diversity prediction.
However, more work is needed to establish how biodiversity-structure habitat relationships extend to phylogenetic and functional diversity metrics, disparate taxa and in different geographic contexts. Gradients in forest structure are driven by some of the same factors that likewise influence patterns of diversity. As the existence of these multiple correlative relationships may obscure mechanistic processes, more work is needed to untangle this collinearity and uncover the unique role of structure in constraining multi-scale biodiversity patterns. Finally, this study focuses on inference into the significance, magnitude and consistency of BSRs and not predictive accuracy per se (sensu Lo et al 2015). Thus, while results provide empirical support for BSRs across a range of ecological contexts that sheds light on underlying mechanisms constraining community assembly, biodiversity predictive mapping applications would likely benefit from inclusion of ancillary data like climate, topography and optical imagery, as well as optimized, nonparametric predictive modeling procedures.
As of its March 2023 hiatus, GEDI had collected over 20 billion quality observations of 3D canopy structure. However, substantial gaps remain across the Earth surface, which require statistical interpolation of structural attributes in unsampled areas before continuous modeling of higher-level biodiversity properties can commence. This study bolsters the empirical basis for how both continuous, spatially-coincident and discrete, aggregate lidar-sampled structural metrics can facilitate biodiversity monitoring at global extents.

Figure 1. Gridded forest structure across the contiguous United States. A 6 km² resolution map of aggregated mean GEDI metrics, where pixels are represented as red-green-blue (RGB) combinations of maximum height (Hmax), foliage height diversity (FHD), and total plant area index (tPAI), respectively. Forested NEON sites used in this study are labeled in white (see appx. S1 for abbreviations). For map purposes, forested areas correspond to those RGB pixels possessing mean RH98 values ⩾4 m, while non-forest pixels (RH98 < 4 m) appear as off-black.

Figure 2. GEDI and ALS forest structure sample design. (a) TREE NEON site in WI, USA. (b) Concentric sampling zones for GEDI sample aggregation (thick white lines), GEDI-coincident NEON ALS (red circles), dispersed ALS samples (hollow white circles), and basePlots (blue-outlined squares). (c) Multi-scale ALS structural census in and outside of a basePlot.

Figure 3. Plot-level biodiversity-structure relationships (BSRs): ALS census versus aggregated GEDI structure across scales. Facets depict standardized effects of multi-scale ALS-censused (green) and GEDI aggregated structural attributes by the mean and standard deviation (blue and brown, respectively). ALS census data were extracted across 5 scales from basePlot to 50 m radius circular extent (points left of vertical lines), while GEDI samples were aggregated (kriged) in circular extents of 200-5600 m radii (points right of vertical lines). Standardized effects (β1) are shown as the median posterior estimate (points) and 95% credible interval (error bars) from each bivariate spatial GLMM. Terms are not significant (hollow point) when the credible interval includes zero. Structural variables: height maximum (Hmax), total PAI (tPAI), foliage height diversity (FHD), height evenness (HJ), vertical distribution ratio (VDR), and PAI of the bottom half of the canopy (PAI b2).

Figure 4. Plot-level BSRs: alpha diversity by ecoregion. Standardized slope estimates (β1) from a series of spatial GLMMs, split by ecoregion. ALS census and GEDI sample aggregate results are shown left and right of vertical lines, respectively. Terms are not significant (hollow point) when the credible interval includes zero. Note: some point estimates were excluded when ecoregion-level sample size requirements (i.e. 10 plots per site and 3 sites per ecoregion) were not satisfied (see 2.5). Structural variables: height maximum (Hmax), total PAI (tPAI), foliage height diversity (FHD), height evenness (HJ), vertical distribution ratio (VDR), and PAI of the bottom half of the canopy (PAI b2).

Figure 5. Site-level BSRs: beta and gamma diversity. Facets depict the standardized effect (β1) of site-level structural attributes on beta and gamma diversity. Site structure is inferred from GEDI (brown) and GEDI-coincident ALS samples (blue), aggregated by mean and coefficient of variation (cv). The median posterior estimate (points) and 95% credible interval (error bars) are significant (filled point) when the credible interval does not include zero (dashed line). Structural variables: height maximum (Hmax), total PAI (tPAI), foliage height diversity (FHD), height evenness (HJ), vertical distribution ratio (VDR), and PAI of the bottom half of the canopy (PAI b2).

Figure 6. Scale dependency in GEDI structural metrics. Violin plots depict the distribution and median (black line) of representative elementary area (REA) breakpoint values across all basePlots. Breakpoint values represent the distance (m) from each basePlot where variance in sampled GEDI metrics asymptotes, and thus a point of stationarity. Colored points represent [...]. Structural variables: height maximum (Hmax), total PAI (tPAI), foliage height diversity (FHD), height evenness (HJ), vertical distribution ratio (VDR), and PAI of the bottom half of the canopy (PAI b2).

Figure 7. Effects of sensor, sample spatial structure, and sampling density on tree alpha BSRs. Standardized slope estimates (β1) represent the effect of mean aggregate structural attributes on tree alpha diversity across sampling designs comparing roles of (a) sensor, (b) sample spatial structure, and (c) sampling density (table 2) based on a series of bivariate spatial GLMMs. Terms are not significant (open symbol) when the credible interval includes zero (dashed vertical line). Structural variables: total PAI (tPAI) and foliage height diversity (FHD).

Table 1. Biodiversity and forest structural variables. Biodiversity data were compiled from 400 m² NEON field plots, while forest structure was derived from multi-scale ALS and GEDI lidar samples.

Table 2. Forest structure census and sampling design. NEON basePlot (square with green highlight) surrounded by concentric sampling domains, each with GEDI and ALS samples (points).