Integrating spaceborne estimates of structural diversity of habitat into wildlife occupancy models

Vegetation structure is a crucial dimension of wildlife habitat, responsive to global changes in human activities and ecosystem processes. NASA’s recent Global Ecosystem Dynamics Investigation (GEDI) provides an exciting opportunity to explore how spaceborne waveform observations can improve our ability to measure wildlife habitat and advance animal ecology in the Anthropocene. We tested the utility of GEDI data in univariate occupancy models to estimate habitat use in a remote mountain system in central Idaho, USA. We collected data from 49 camera trap stations from two surveys in 2018–2019 and modeled the occupancy for each of seven mammal species representing different trophic levels and feeding strategies: American black bear (Ursus americanus), deer (Odocoileus hemionus), elk (Cervus canadensis), moose (Alces alces), coyote (Canis latrans), wolf (Canis lupus), and mountain lion (Puma concolor). We first derived structural diversity indices (richness, evenness, and divergence) of GEDI-derived canopy height, plant area index, and foliage height diversity to represent different dimensions of vegetation structure. This spatial aggregation is necessary due to gaps in GEDI footprints and parallels commonly used functional diversity metrics applied to biological communities that are calculated using trait probability densities. We measured these indices across three spatial scales that reflect different species movement and habitat selection patterns. We found the structural diversity indices of canopy height, foliage height diversity, and plant area index had the strongest effects on the occupancy of most mammals compared to two-dimensional (2D) variables (e.g. tree cover, normalized difference vegetation index). The spatial extent of these indices also influenced the strength of response, highlighting the importance of selecting a scale large enough to capture sufficient GEDI footprints but small enough to reflect site-level variance. 
Compared to 2D covariates, our results suggest that GEDI variables allow researchers to generate more detailed inference on the forms of habitat that wildlife use. We discuss the implications of these findings for habitat management and future wildlife research from local to global scales.


Introduction
Human activities alter land cover globally, often detrimentally impacting ecosystem functioning and biodiversity [1,2]. The three-dimensional (3D) structure of vegetation is one aspect of land cover that plays various critical roles in forest ecosystems including the provisioning of ecosystem services (e.g. carbon storage) and habitat for wildlife [3][4][5][6]. Until recently, sparse data availability on the 3D structure of vegetation has limited our inference on its role in ecosystems at large scales and across study systems. NASA's recent Global Ecosystem Dynamics Investigation (GEDI) now provides such light detection and ranging (LiDAR) data at near-global scales [7]. GEDI hosts a full-waveform LiDAR system on the International Space Station and provides measures of canopy height, plant area index, and vertical foliage profiles.
LiDAR sensors are commonly used to measure plant heights and densities from horizontal (e.g. terrestrial laser scanning) and vertical (e.g. airborne laser scanning) perspectives and are more efficient than manual measurements [8]. This technology has most commonly been applied to study avian and invertebrate species [6]. When used to estimate wildlife occupancy and distributions, LiDAR metrics have been shown to reduce model variance better than two-dimensional (2D) classifications of habitat [9][10][11] while more accurately differentiating animal behavior in community settings [12]. Land cover classifications have frequently used 2D habitat variables, but these measurements do not include information on plant density across strata, which provide various habitat functions across different species. For example, understory cover provides concealment for small herbivores [13] and heterogeneity in canopy height is preferred by ungulates during various ambient temperatures of the day [14,15]. However, due to limited coverage and high operating costs, current airborne LiDAR data and information on vegetation structure are rarely available where needed.
We used data from GEDI to derive different measures of vegetation structure and integrated those measures into models of wildlife occupancy in a mountain system of Idaho, USA. GEDI is the first spaceborne sensor specifically designed to map 3D terrestrial vegetation structure and presents an exciting opportunity to understand how this important dimension of ecosystems influences the ecology of animals. The use of high-resolution structural data has been limited in wildlife research, although the 3D arrangement of habitat is fundamental to how animals interact with the environment [16,17]. For example, the 3D structure of shrubs provides resources and shelter for mesocarnivores and can influence large carnivore hunting success [9][18][19][20]. Moreover, understanding how wildlife utilize various forms of structure could better target habitat management and improve conservation planning. For instance, the practice of retaining forest patches with certain structural characteristics after timber harvest can provide refuge for small mammal communities [21]. Yet, it is unknown which GEDI metrics of vegetation structure will be most useful to wildlife ecology and how to best utilize the vast information available.
GEDI comprises three lasers that produce eight parallel tracks of observations each with a 25 m footprint separated by 60 m along-track and 600 m across-track. Although GEDI's coverage is near global, these between-track data gaps require aggregation or fusion with other data products to be used at the site-level, where GEDI data may not perfectly overlap. For models of wildlife occupancy, complete coverage would be favorable to ensure the availability of structural measurements at every location [19]. Nevertheless, these models only require habitat data surrounding fixed sampling locations at a minimum and could be well suited for GEDI applications if aggregated with relevant indices and scales [22]. Because occupancy modeling is a common tool in wildlife science used around the world, determining how to aggregate GEDI data to create the desired coverage and spatial scales is a vital step towards leveraging these data products for future wildlife research and management.
We examine the contribution of GEDI data aggregated by diversity indices in estimating mammalian wildlife distribution for the first time, using occupancy analyses of camera trap data from our study system in the Rocky Mountains. Wildlife cameras (camera traps) are a reliable method of sampling wildlife populations and allow for long-term remote observations of medium to large-sized mammals [23][24][25]. We focused on the occupancy of seven mammal species representing different trophic levels and feeding strategies: American black bear (Ursus americanus), deer (Odocoileus hemionus), elk (Cervus canadensis), moose (Alces alces), coyote (Canis latrans), wolf (Canis lupus), and mountain lion (Puma concolor), and compared the performance of models using GEDI-derived variables to those that used traditional 2D variables. We also demonstrate a workflow to combine raw GEDI footprints to develop several structural diversity indices across scales: richness, evenness, and divergence. These indices are considered primary components of functional diversity used widely in biodiversity science and can also be applied to aggregate data on vegetation structure to provide detailed information on wildlife habitat selection and co-occurrence patterns [26]. Thus, our approach can widen the applicability of GEDI data for ecological research and conservation science and practice.

Study area
The study area is in the Big Wood River watershed in central Idaho, USA, and is 1161 km² bounded by 43.905 N, 43.485 S, −114.067 E, and −114.719 W (figure 1). Elevation ranges from 1515 to 3570 m and land cover is predominantly montane conifer forest with Douglas fir (Pseudotsuga menziesii), lodgepole pine (Pinus contorta), and ponderosa pine (Pinus ponderosa). Some areas are dominated by sagebrush (Artemisia tridentata) or aspen (Populus tremuloides), and riparian areas are covered by willow (Salix spp.). During our study period, temperatures ranged from −14 °C to 37 °C (mean average daily temperature 11.4 °C), and total mean precipitation was 99.9 cm [27]. Recreation is common and includes hiking, mountain-biking, and hunting.

Wildlife data collection
To observe spatial patterns of wildlife activity, we deployed infrared-triggered wildlife cameras (Bushnell TrophyCam Aggressor and Browning Strike Force Pro) across our study area from 7 July to 4 November 2018 and from 25 May to 27 October 2019. In 2018, we deployed 44 cameras for a total of 2704 camera days. In 2019, we deployed 49 cameras for a total of 5900 camera days.
We chose locations of cameras based on signs of wildlife activity (e.g. tracks, scat, or other sign) and spatially and topographically opportunistic locations (e.g. near game trails, saddles between drainages, and distance from roads). We separated cameras by at least 1 km, averaging 2.69 km apart. We deployed cameras on trees or large stumps approximately 1 m from the ground and 1-5 m from a location of interest.
We examined images twice for the presence of wildlife, and positive detections were examined a third time to confirm species identification. We processed these images using the camtrapR package in R software to create tables of detection events [28,29]. We considered a detection event independent if the previous detection of the species occurred more than two minutes prior [30]. The observation period for modeling was one week: if a species was detected at least once during a week, it was recorded as present for that week, which avoids zero inflation in the data for rare species [31]. Therefore, each species had a detection history with 88 cameras and 23 weeks. We also recorded human presence in camera detections.
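The two processing rules described above (a two-minute independence threshold and weekly presence/absence) can be sketched as follows. This is a minimal illustration in Python, not the camtrapR workflow used in the study; the function names `independent_events` and `weekly_history` are hypothetical, and the sketch assumes detections have already been grouped by species and camera.

```python
from datetime import datetime, timedelta


def independent_events(times, gap=timedelta(minutes=2)):
    """Keep detections that occur more than `gap` after the previous
    detection of the same species at the same camera."""
    events, previous = [], None
    for t in sorted(times):
        if previous is None or t - previous > gap:
            events.append(t)
        previous = t  # compare against the previous detection, kept or not
    return events


def weekly_history(events, start, n_weeks):
    """Collapse events into a 0/1 vector with one entry per survey week."""
    history = [0] * n_weeks
    for t in events:
        week = (t - start).days // 7
        if 0 <= week < n_weeks:
            history[week] = 1
    return history


# Illustrative timestamps only (assumed, not study data).
start = datetime(2019, 5, 25)
t0 = start + timedelta(hours=6)
times = [t0, t0 + timedelta(minutes=1), t0 + timedelta(minutes=2),
         t0 + timedelta(minutes=10), t0 + timedelta(days=8)]
events = independent_events(times)
history = weekly_history(events, start, n_weeks=3)
```

Stacking one such vector per camera gives the site-by-week detection matrix that an occupancy model takes as input.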

Environmental covariates
GEDI structural traits
We selected GEDI footprints within the study area collected in 2019, 2020, and 2021 and removed any data with a poor-quality flag. To represent vegetation characteristics at various strata, we mapped several structural traits from relative heights at 25, 50, 75, and 95 percentiles; and foliage height diversity, canopy cover, and plant area indices at 0-10 m, 10-20 m, 20-30 m, and 30-40 m. These traits have been shown to accurately represent ground conditions using inventory and dense airborne LiDAR data and represent a variety of forest characteristics [32,33].
Because GEDI data do not have complete coverage, we aggregated traits surrounding camera locations at grain sizes large enough to include sufficient GEDI footprints to capture landscape heterogeneity, but small enough to be ecologically meaningful to wildlife occupancy [16]. We developed buffers around each camera location at 250 m, 500 m, 750 m, and 1000 m and calculated the mean of each GEDI trait within the buffer (supporting figures S1 and S2).
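The buffer aggregation can be illustrated with a short sketch. It assumes footprint centers and camera locations share a projected coordinate system in meters; the function name `buffer_means` and the example coordinates are hypothetical. The footprint count is returned alongside the mean because small buffers may contain few or no footprints.

```python
import math


def buffer_means(camera_xy, footprints, radii=(250, 500, 750, 1000)):
    """Mean trait value and footprint count within each buffer radius.

    footprints: iterable of (x, y, value) tuples in projected meters.
    Returns {radius: (mean or None, n_footprints)}.
    """
    cx, cy = camera_xy
    summary = {}
    for r in radii:
        inside = [v for x, y, v in footprints
                  if math.hypot(x - cx, y - cy) <= r]
        mean = sum(inside) / len(inside) if inside else None
        summary[r] = (mean, len(inside))
    return summary


# Illustrative canopy heights (m) at three footprints; values are assumed.
footprints = [(100.0, 0.0, 10.0), (0.0, 400.0, 20.0), (900.0, 0.0, 30.0)]
summary = buffer_means((0.0, 0.0), footprints)
```

In this toy case the 250 m buffer holds a single footprint, which shows why the choice of grain has to balance footprint availability against site-level relevance.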

GEDI structural diversity indices
We also calculated structural diversity indices of each GEDI trait within the four buffer sizes, following the approach commonly used in assessing functional diversity of biological communities [32,34]. To calculate structural diversity indices, we first linearly scaled traits from 0 to 1 and calculated trait probability densities using Gaussian kernel density estimation following Carmona et al. using the TPD package in R software [32,34]. Trait probability densities rely on the concept of Hutchinsonian niches as probabilistic hypervolumes [35,36]. They can better handle varying sample sizes and gaps in trait distributions than convex hulls [34]. This is important as GEDI data are irregularly spaced and unevenly distributed across our camera locations [33]. We used trait probability densities to aggregate GEDI-derived structural metrics into richness, evenness, and divergence, the three primary components of functional diversity (figure 3 [26]).
Richness is the amount of functional space occupied by a trait or the sum of the hypervolume cells greater than 0 and is independent of trait abundance (i.e. an area with many different canopy heights would have high richness for this trait; figure 3 [34]). Evenness reflects the homogeneity in the distribution of abundances within the trait space (i.e. an area with the same amount of each canopy height would have high evenness). Finally, divergence measures the extent that trait abundances at the edges of the distribution are greater or less than those at the center (i.e. an area with many tall and short trees, but few medium sized trees would have high divergence).
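To make the three definitions concrete, the sketch below computes them for a single trait from a discretized Gaussian kernel density. It is a simplified, one-dimensional illustration written in Python and is not the TPD package implementation: the fixed bandwidth, the 101-cell grid, the 95% probability cutoff, and all function names are assumptions chosen for the example. Input values are assumed to be already scaled to 0-1.

```python
import math


def trait_probability_density(scaled, bandwidth=0.05, n_cells=101, alpha=0.95):
    """Discretized Gaussian kernel density on a 0-1 grid.

    Only the highest-density cells that together hold `alpha` of the
    probability are kept; the kept cells are renormalized to sum to 1.
    """
    grid = [i / (n_cells - 1) for i in range(n_cells)]
    dens = [sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in scaled)
            for g in grid]
    total = sum(dens)
    probs = [d / total for d in dens]
    order = sorted(range(n_cells), key=lambda i: probs[i], reverse=True)
    kept, mass = set(), 0.0
    for i in order:
        if mass >= alpha:
            break
        kept.add(i)
        mass += probs[i]
    probs = [probs[i] / mass if i in kept else 0.0 for i in range(n_cells)]
    return grid, probs


def richness(grid, probs):
    """Amount of trait space occupied (cell width times occupied cells)."""
    return (grid[1] - grid[0]) * sum(1 for p in probs if p > 0)


def evenness(probs):
    """Overlap between the density and a uniform density on occupied cells."""
    occupied = [p for p in probs if p > 0]
    uniform = 1.0 / len(occupied)
    return sum(min(p, uniform) for p in occupied)


def divergence(grid, probs):
    """Probability-weighted spread of occupied cells around their center."""
    cells = [(g, p) for g, p in zip(grid, probs) if p > 0]
    lo, hi = cells[0][0], cells[-1][0]
    span = (hi - lo) or 1.0
    pos = [(g - lo) / span for g, _ in cells]
    center = sum(pos) / len(pos)
    dist = [abs(x - center) for x in pos]
    mean_dist = sum(dist) / len(dist)
    dev = sum(p * (d - mean_dist) for (_, p), d in zip(cells, dist))
    abs_dev = sum(p * abs(d - mean_dist) for (_, p), d in zip(cells, dist))
    denom = abs_dev + mean_dist
    return (dev + mean_dist) / denom if denom > 0 else 0.0


# Illustrative scaled canopy heights (assumed values, not study data).
clustered = [0.48, 0.49, 0.50, 0.50, 0.51, 0.52]   # similar heights
spread = [0.1 * i for i in range(1, 10)]            # many different heights
bimodal = [0.2] * 5 + [0.8] * 5                     # short and tall, few medium

tpd = {name: trait_probability_density(vals)
       for name, vals in [("clustered", clustered), ("spread", spread),
                          ("bimodal", bimodal)]}
```

Under these assumptions, the spread sample yields higher richness and evenness than the clustered sample, and the bimodal sample yields higher divergence than the clustered one, which matches the verbal examples given above.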
Because GEDI provides many metrics, we checked for information redundancy and reduced the list of summarized traits and indices [37]. We calculated the Pearson's correlation coefficient for all variable combinations and removed variables that had a coefficient greater than 0.65 (supporting figures S3 and S4 [29]). The variables removed were commonly correlated with other metrics, representing similar ecological attributes, but less relevant to our species of interest.
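A correlation screen of this kind can be sketched as below. The sketch makes two assumptions not stated in the text: the threshold is applied to the absolute value of the coefficient, and variables are retained in a user-supplied priority order (standing in for the judgment of ecological relevance described above). The variable names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences.
    Assumes neither sequence is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)


def drop_correlated(columns, priority, threshold=0.65):
    """Walk variables in priority order and keep each one only if its
    absolute correlation with every variable already kept is at or
    below the threshold."""
    kept = []
    for name in priority:
        if all(abs(pearson(columns[name], columns[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Illustrative values only (assumed, not study data).
columns = {
    "rh95": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rh75": [2.0, 4.0, 6.0, 8.0, 10.5],
    "pai_0_10": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_correlated(columns, priority=["rh95", "rh75", "pai_0_10"])
```

Here the second variable is dropped because it is almost perfectly correlated with the first, while the third is retained.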

Traditional environmental covariates
We also measured 2D environmental variables traditionally used in wildlife research at 250 m, 500 m, 750 m and 1000 m grains, and extracted values at each camera location. Variables included abiotic factors of distance to nearest perennial stream, elevation, slope, aspect, terrain roughness, terrain ruggedness index, and topographic position index. We calculated topographic variables using the USGS 30 m digital elevation model and the raster package in R software [38,39]. We also collected mean daily temperatures from National Oceanic and Atmospheric Administration (NOAA) monitoring stations in the study area and used the NOAA station nearest to each camera to supply the mean weekly temperature [27]. We included biotic factors of vegetation productivity (normalized difference vegetation index (NDVI)) and tree, shrub, and annual forb cover. For each camera and week, we extracted the 16 day NDVI composite from the Moderate Resolution Imaging Spectroradiometer (MODIS) [40] that overlapped the majority of days in that week. We used vegetation cover data from the Rangeland Analysis Program [41], which provides annual percent cover at 30 m resolution.

Occupancy modeling
To test the contributions of each variable to the habitat use of our focal species we used univariate single-species occupancy models with the unmarked package in R [42][43][44]. We used species detections at a given camera and week to estimate both the probability of detecting that species and the probability of that species occupying the area. Occupancy models account for imperfect detection by separately modeling the ecological process of species occurrence and the observation process of species detection [45]. Detection is both spatially and temporally informed, with each unique observation (i.e. one week at one camera) being used to model what governs the detection process of a species in the study system. Occupancy probability was spatially informed and can be interpreted as an estimate of habitat use for each species.
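The separation of occurrence from detection can be illustrated with the standard single-season occupancy likelihood, in which a site with no detections is either unoccupied or occupied but missed on every survey. The Python sketch below is a deliberate simplification: it holds occupancy (psi) and detection (p) constant and uses a coarse grid search, whereas the models in this study were fit with the unmarked package in R and included covariates. The function names are hypothetical.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history.

    history: list of 1 (detected), 0 (not detected), or None (no survey).
    """
    obs = [y for y in history if y is not None]
    given_occupied = math.prod(p if y else 1.0 - p for y in obs)
    if any(obs):
        return psi * given_occupied          # site must be occupied
    return psi * given_occupied + (1.0 - psi)  # missed, or truly absent


def neg_log_likelihood(histories, psi, p):
    return -sum(math.log(site_likelihood(h, psi, p)) for h in histories)


def fit_grid(histories, steps=99):
    """Coarse grid-search maximum likelihood estimate of (psi, p)."""
    candidates = [i / (steps + 1) for i in range(1, steps + 1)]
    return min(((psi, p) for psi in candidates for p in candidates),
               key=lambda pair: neg_log_likelihood(histories, *pair))


# Illustrative histories only (assumed, not study data).
histories = [[1, 1, 1, 1]] * 10 + [[0, 0, 0, 0]] * 10
psi_hat, p_hat = fit_grid(histories)
```

In a covariate model, psi and p would each be replaced by a logit-linear function of site-level or survey-level variables, but the structure of the likelihood stays the same.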
We estimated occupancy for black bear, coyote, deer, elk, moose, mountain lion, and wolf. We chose these species as they had relatively high occurrences across our study system and represent different trophic levels and feeding strategies. We used backward step-selection to define a detection model specific to each species. All detection models included the year of detection to account for the two different surveys in our study. We tested the inclusion of additional variables (NDVI, human presence/absence, temperature) to the detection models and retained these variables if the likelihood ratio test was statistically significant (p < 0.05 [43]). These variables were then included in the detection function of univariate occupancy models testing all GEDI and 2D environmental variables [46].
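The retention rule for detection covariates can be sketched as a likelihood ratio test between nested models. The example assumes each candidate variable adds exactly one parameter, so the test statistic is compared against a chi-square distribution with one degree of freedom; the log-likelihood values are made up for illustration and the function name is hypothetical.

```python
import math


def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(statistic / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Illustrative log-likelihoods (assumed values).
statistic, p_value = likelihood_ratio_test(loglik_reduced=-100.0,
                                           loglik_full=-98.0)
retain = p_value < 0.05
```

With these values the statistic is 4.0 and the variable would be retained at the 0.05 level.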

Results
In total, we captured 694 independent detection events for elk ( (table 1). These four metrics represent different structural properties including maximum plant height, diversity of heights, density of ground to understory vegetation, and density of understory to canopy layer vegetation.
The effect of both GEDI and 2D variables on occupancy varied by species (figures 4 and 5). None of the variables had statistically significant (p < 0.05) effects on the occupancy of wolf, deer, or coyote. GEDI indices aggregated using structural diversity indices had overall larger effect sizes than the means of 3D metrics across all scales and 2D variables. NDVI and slope had the overall greatest effects for the 2D variable class (figure 5). For models with 2D variables of vegetation cover (tree, shrub, and forb cover), coefficients rarely fell far from 0. Structural diversity indices of GEDI variables were most influential in estimating elk, moose, mountain lion and bear occupancy. Canopy height evenness from 250 m to 750 m had strong negative effects for elk and moose, while mountain lions were most influenced by plant area index, especially in the understory (0-10 m; figure 4). Divergence of foliage height diversity had strong positive associations and plant area index had strong negative associations with bear and mountain lion occupancy.
Table 1. Final set of variables tested in wildlife occupancy models. For 3D variables, trait corresponds to the GEDI traits, and the aggregation type produces a GEDI structural diversity index. All variables were calculated within buffer sizes of 250 m, 500 m, 750 m, and 1000 m.

Discussion
GEDI-derived variables of 3D habitat structure had the strongest overall effects in estimating the occupancy of the focal mammals in our study. Although effect size varied by species and spatial grain, the aggregation of GEDI variables by structural diversity indices allows for a more comprehensive understanding of wildlife-habitat relationships than aggregating solely by the mean. For instance, canopy height evenness had strong negative effects on elk and moose occupancy despite canopy height mean having weak associations. This may be attributed to the foraging behavior of these large ungulates, which prefer landscapes with more structurally heterogeneous vegetation, including a matrix of forests and grasslands, to regulate temperature throughout the day [14,15]. Such relationships were not distinguished by conventionally used 2D variables including tree, shrub, and forb cover. Although 2D environmental variables had weak associations with the occupancy of species in our study, they can help infer the type of vegetation comprising the structural patterns of habitat. Thus, it remains important to include these variables to inform habitat management, especially when implementing multivariate occupancy models for extrapolation as commonly done in species distribution modeling [11].
Our findings highlight the importance of aggregating GEDI indices across various scales as some mammals had different magnitudes and directions of effect across our four grains. For example, mountain lions responded negatively to richness in canopy height at 1000 m but positively at 250 m. Responses were greatest for most mammals at medium-coarse grains (500 m-750 m), which is encouraging because the distances between GEDI footprints make it difficult to ensure enough footprints will be available immediately surrounding the sampling location. Yet, responses to scale are species specific, as others have shown that different species of the same taxa respond to vegetation structure across a gradient of scales [47]. Moreover, our study focused on medium to large ground-dwelling mammals and found the strongest associations with GEDI variables for larger species. Our results suggest GEDI data may be well suited for far-ranging mammals but potentially less so for species interacting with vegetation at finer scales [16].
When aggregating data using the mean, values may become increasingly similar at larger aggregation scales, especially in homogeneous landscapes. Structural diversity indices are better suited to capture landscape variation at larger scales [33], but should be paired with land cover data to specify the types of vegetation and their structural patterns to inform habitat management [48]. A combination of indices to capture broad scale structural patterns may have greater utility to land managers. The selection of single metrics should be informed by life histories of target species [49]. The overall effect of variables in our study on wildlife occupancy was small and may be an effect of high and ubiquitous occupancy across our study site. We believe GEDI data may play a larger role in other study systems where species' habitat use is responding more to environmental variation.
The utility of GEDI metrics to inform wildlife ecology and biodiversity conservation will rely on its integration with other technologies [22]. This integration will need to be mindful of the scales at which the environment and species responses are measured. Camera traps, compared to other wildlife tracking techniques (e.g. the Global Positioning System, GPS), can be purposefully placed in areas with sufficient GEDI availability and deployed in a systematic study design that covers sites with various forms of vegetation structure and composition. To match the near-global coverage of GEDI, Wildlife Insights, a web platform hosting camera trap data from around the world, is well suited to provide a systematic analysis at similar extents with insightful comparisons across contexts [50]. As we advance our abilities to measure the physical environment, we should also strive to consider intraspecific variation in animal behavior. For example, moose of different sexes and ages have been shown to prefer different vegetation structure [51]. Integrating other remotely sensed imagery to measure land cover types and physiological plant traits can also improve our understanding of how species rely on specific habitat features such as nutrients and water availability [33].

Conclusion
Reducing the degradation of habitat for biodiversity is a global conservation priority. Ensuring diversity in the structural elements of habitat can contribute to spatial assessments and planning for conservation across a range of taxonomic groups. We highlight the utility of aggregating GEDI data by structural diversity indices for the estimation of mammal occupancy for the first time. Using these indices to approximate structural patterns across scales is well suited for applying GEDI data to wildlife ecology. We recommend that this methodology be applied across additional sites and taxonomic groups to build a more comprehensive understanding of wildlife-habitat relationships.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.5281/zenodo.7826340. Data will be available from 28 March 2023.