Ecosystem functional types of the circumpolar Arctic tundra based on the seasonal dynamics of vegetation productivity

Biodiversity, when viewed through the combined lenses of compositional, structural, and functional attributes, provides for a holistic understanding of the complexities found within community assemblages and ecosystems. However, advancement in our understanding of how ecosystem functional diversity interacts with structural and compositional diversity metrics is lacking, in part because universally applied methodologies to derive ecosystem functional classifications are still under development and vary widely across scales, extents and biomes. This study presents a methodology to construct ecosystem functional types (EFTs), or areas of the land surface that function similarly, using the MODIS NDVI record, for the terrestrial circumpolar Arctic. EFTs were derived from the seasonal dynamics of NDVI, over the Arctic tundra at 250 m resolution and compared to bioclimate subzones and to structurally and compositionally defined vegetation units of the Circumpolar Arctic Vegetation Map (CAVM). Correspondence analyses of CAVM EFTs to previously delineated CAVM bioclimatic subzones, physiognomic (vegetation) units and floristic provinces revealed a general congruence, indicating convergence across composition, structure, and function; yet also demonstrated substantial functional variability even within bioclimate subzones and vegetation units. Strong latitudinal gradients in ecosystem function are present, with EFT richness ranging from low (34) in northernmost regions to high (45) in southernmost regions. Locally, the mountainous regions of northern Alaska, and eastern and western Siberia had high spatial variability in ecosystem functioning. Aside from these generalities, we found that EFTs varied widely within individual mapped vegetation units, successfully capturing the functional dimension of biodiversity across the circumpolar Arctic tundra.


Introduction
The diversity and heterogeneity of functional aspects of biological systems are central concerns in ecology and conservation.The functional component of biodiversity captures heterogeneity in ecosystem processes (Tilman 2001).Spatial heterogeneity of ecosystem functioning is prominent, through variability in the magnitudes of vertical fluxes of matter and energy, and in material flow across landscapes (e.g.horizontal transfers of nutrients, water, or even species) (Turner and Gardner 2015).Understanding the patterns, causes, and consequences of spatial diversity in ecosystem functioning is needed to better monitor and preserve all aspects of biodiversity (Meyer 1997, Lovett et al 2005, Turner and Gardner 2015, Cazorla et al 2021, Skidmore et al 2021), and to elucidate the links among biological diversity, ecosystem functioning (Diaz et al 2007, Chapin et al 2010, Cadotte et al 2011), ecosystem multifunctionality, ecosystem services (Manning et al 2018), and ecological stability (Oliver et al 2015).Functional diversity has traditionally been estimated by using morpho-functional species traits (e.g.Malaterre et al 2019, Serbin et al 2019) or by grouping species into functional types based on structural, phylogenetic, or metabolic strategies (e.g.Lavorel and Garnier 2002, Lavorel et al 2007, Diaz et al 2022).However, the ability of species traits and plant functional types to represent variations in ecosystem functioning at regional scales remains a challenge (Wright et al 2006, Pasari et al 2013, Reichstein et al 2014, Asner et al 2017, Malaterre et al 2019).More recently, coarse-scale remote-sensing assessments of the spatial heterogeneity in ecosystem function (e.g.Tuanmu and Jetz 2015, Higgins et al 2016, Rocchini et al 2018, Cazorla et al 2021) have been successfully used assess functional diversity.
Time-series of spectral indices can be used to assess the functional similarity of ecosystems, by using key ecosystem functional attributes (EFAs) that describe functioning in terms of carbon and water cycling, radiation and heat exchange, or ecosystem disturbances (Cazorla et al 2021, Regos et al 2022, Marcos et al 2023).For example, satellite-derived ecosystem functional types (EFTs), defined as areas that share similar dynamics of matter and energy exchanges (magnitude and timing) between biota and the physical environment (Paruelo et al 2001, Alcaraz-Segura et al 2006, 2013), have been used to group ecosystems based on common functionality without prior knowledge of vegetation composition and structure (Ivits et al 2013).Just as species can be grouped into plant functional types based on common morpho-functional traits, ecosystems can be grouped into EFTs, based on the common behavior of key EFAs (Alcaraz-Segura et al 2006).By categorizing continuous trait gradients into discrete units, the approach obtains functionally homogeneous groups with specific and coordinated responses to environmental factors.The EFT approach has been applied to the Iberian Peninsula (Alcaraz-Segura et al 2006, Pérez-Hoyos et al 2014), Mexico (Villarreal et al 2019, Cazorla et al 2021), China (Liu et al 2020b, Wang andHuang 2015), the conterminous United States (Villarreal et al 2018), temperate South America (Paruelo et al 2001, Alcaraz-Segura et al 2013, Lee et al 2013, Müller et al 2014, Lara et al 2018a), continental Europe (Ivits et al 2013, Rotllan-Puig et al 2021), and globally (Ivits et al 2013, Thompson et al 2017, Bagnato et al 2024).
The Arctic is now thought to be warming at approximately three times the rate of the global average (AMAP 2021).Combined with dramatic sea-ice reductions, changes in the hydrological cycle and the continentality of the Arctic biome (Bhatt et al 2021), and increased anthropogenic pressures, Arctic tundra ecosystems are experiencing substantial forcings for change.Compositional and structural characteristics of the Arctic tundra are changing in response to these drivers (e.g.Walker et al 2006, Elmendorf et al 2012, Frost and Epstein 2014, Myers-Smith et al 2015, Martin et al 2017, Bjorkman et al 2020) over years and decades.EFAs, for example those related to the dynamics of primary productivity, are changing as well (e.g.Zeng and Jia 2013, Bhatt et al 2017, Bjorkman et al 2018, Prevey et al 2019, Berner et al 2020).These functional changes may not be coincident with composition/structure dynamics, and they may be occurring more immediately than composition/structure changes, providing an early warning for potential species range shifts (e.g.Alcaraz-Segura et al 2017).Attention to the spatial heterogeneity and temporal dynamics of ecosystem functioning throughout the circumpolar Arctic has been limited, given remote locations and the challenges of measuring ecosystem functions in the field, indicated by the relatively sparse (but growing) network of eddy covariance flux towers and other ecosystem flux studies throughout the tundra (Virkkala et al 2018, El Sharif et al 2019, Pallandt et al 2022).Remote sensing provides an obvious and useful tool for assessing patterns and dynamics of ecosystem functioning and lends itself well to the development of EFTs.
This study presents a new conceptual framework for understanding rate and timing of energy exchange in the circumpolar Arctic tundra.Our main objective for this research was to use the seasonal dynamics of remotely sensed primary productivity to derive and classify the circumpolar Arctic tundra into EFTs.The satellite-derived EFTs, defined here as patches of the land surface that share similar magnitudes and timing of carbon gains, are then used to characterize the spatio-temporal heterogeneity of ecosystem functioning and to explore its correspondence with existing bioclimatic, compositional, and structural classifications of the Arctic tundra.Specifically, our study addressed the following research questions: 1) What are the most statistically and biologically meaningful variables (referred to here as EFAs) that characterize ecosystem functioning related to the seasonal dynamics of carbon gains in the circumpolar Arctic tundra?2) Once EFAs are used to identify EFTs, how does the EFT distribution inform us on the spatial patterns of functionality throughout the circumpolar Arctic tundra?3) How does the functional classification of the Arctic tundra biome compare to compositionally, structurally, and bio-climatically based classifications, such as those from the Circumpolar Arctic Vegetation Map (CAVM)?was then applied across our study domain at each stage of our analysis.It is openly accessible as a GEE asset and TIFF file (see supplement S1).

From EFAs to EFTs
To calculate EFTs, statistically and biologically meaningful EFAs were first determined from the mean seasonal NDVI curve of the MOD13Q1 product.Using the bi-weekly (16-day) composited NDVI values from 6 March to 31 October, we calculated 12 statistical metrics (listed in table 1) for each year from 2000 to 2020, and then summarized them into their interannual means.In parallel, a spatial principal components (PCs) analysis was performed on the pixel-level mean values (2000-2020) of the 15 bi-weekly (16-day) composites (March-October).To determine which metrics explained the greatest proportion of the overall NDVI variability, both spatially over the domain and temporally over the growing season, the resultant first three PCs were correlated with each of 12 EFA metrics using Pearson correlations.Spatial Pearson correlations were calculated separately (at the pixel scale) between each EFA and the values for PC1, PC2, and PC3.Because of the vast number of pixels, correlations were scaled to MODIS resolution ×2 (∼500 m).A test of how resolution scaling could affect correlations repudiated any scaling bias (see supplement S2).Subsequent correlations of the first three spatial PCs (which accounted for 99% of the variance) with the NDVI metrics revealed three key EFAs for the study domain (see results): Mean NDVI (Mean), Date of the maximum NDVI increase (SOS Date, start of season date), Date of maximum NDVI (Dmax).These three EFAs, as shown in figure 1, were then binned based on percentile classes or natural break points determined from frequency frequency distributions.The number of bins used to divide each of the EFAs was determined by the importance of each PC.Because the first PC explained the majority of the variation and correlated most closely with Mean NDVI, Mean NDVI was divided into five ranges by using quintiles (i.e.20th, 40th, 60th, and 80th percentiles) into five ranges, a number equal to the number of bioclimatic subzones identified in the CAVM.The second and third EFAs (SOS Date and DMax, descriptors of timings) were divided into three ranges each based on natural breaks in their frequency distributions, corresponding to early, middle, and late occurrence of these phenological events.Given the low temporal resolution of the NDVI composites (16 d) and the high concentration of pixels in just one 16 d composite (i.e. more than half of pixels in CAVM displayed their SOS Date and DMax phenologies within the same 16 d composite period), we chose as natural breaks the dates before (early) and after (late) the NDVI composite with the highest (middle) number of pixel values.
With each EFA binned into five, three and three ranges respectively, pixels were then re-classified into their corresponding EFT.To build a biologically meaningful EFT legend that kept the original EFA information, Mean NDVI (Mean) values were reassigned as 100, 200, 300, 400, or 500, depending on their quintile from low to high primary production.Similarly, the Start of Growing Season date (SOS date) was assigned to 30, 20, or 10, depending on early, middle, or late start of season date ranges (respectively).The date of the maximum NDVI (DMax) values were reassigned as 1, 2 or 3, corresponding to an early, middle, or late date of NDVI peak ranges.EFTs were then classified for each pixel by stacking the reassigned EFAs into one image and adding their values together so that each combination of the three EFAs could be expressed as one of 45 unique EFTs.With this naming convention, EFTs ranged from 111, which means, low Mean NDVI, late SOS date, and early DMax date, to 533, which means high Mean NDVI, early SOS date, and late DMax date.

EFT analysis
To explore differences in EFT composition across bioclimatic subzones, physiognomic units, and floristic provinces, EFT frequency distributions were drawn, and richness, evenness, and diversity metrics from EFT pixel counts (frequencies) were calculated and compared (using the 'tabula' package version 1.7.0 in R).Both the evenness and diversity were calculated using the Simpson Diversity index, which conveys the probability that two individuals (i.e.pixels) randomly picked from a finite sample belong to two different EFTs (Simpson 1949).The Diversity Index (D) was calculated using the equation: where n is the total number of pixels of each EFT in a particular bioclimatic subzone, physiognomic unit, or floristic province, and N is the total pixel number of each EFT across all bioclimatic subzones, physiognomic units, or floristic provinces.The value of D ranges between 0 and 1, which is interpreted as the lower the index number, the greater the diversity.In our results table below (table 3), we reported 1−D, whereby high values are indicative of greater diversity.This D value was also used to calculate evenness (e) following: where S is the number of EFTs found in a particular bioclimatic subzone, physiognomic unit, or floristic province.Results for richness, diversity, and evenness can be found in table 3 in the results section.
Finally, to examine whether the EFT distribution significantly varied across CAVM bioclimatic subzones, physiognomic units, and floristic provinces, we applied CAs.CA is typically used to identify and quantify associations between two multivariate sets of categorical variables, here EFTs and CAVM bioclimatic subzones, physiognomic units, and floristic provinces.In the CAs of EFT distributions across these subzones, units, and provinces, whether there is a significant dependency between the rows and columns was evaluated using a chi-square statistic.CAs were conducted in R using the 'FactoMineR' (version 2.4) and 'factoextra' (version 1.0.7)packages.

NDVI seasonal dynamics of Arctic tundra
The PC Analysis of the 16 day composites of NDVI, our index of primary production, for the growing season (6 March to 31 October, i.e. 15 images per year), averaged across the MODIS record , indicated that 99.14% of the variation was explained by the first three PCs.The first PC (PC1) explained 96.17% of the variation, whereas the second (PC2) and third (PC3) components accounted for 2.27% and 0.70%, respectively.As shown in table 2, correlation results indicated that the Mean growing season NDVI (Mean) had the highest correlation value with PC1 (−0.987), though standard deviation and Maximum NDVI (Max NDVI) also correlated strongly.For PC2, the start of growing season date (SOS date) had the highest correlation value (0.510), except for Mean NDVI (−0.535).The date of the maximum NDVI (DMax or Day of Max) had the strongest correlation with PC3 (−0.266).Accordingly, Mean NDVI, SOS Date, and DMax were the three EFAs that were most statistically relevant and independent with respect to explaining variance and the temporal differences in the eigenvector weights across the dataset.
In addition to correlating to EFA variables, the eigenvector values from the first three PCs related to the magnitude and timing of the NDVI seasonal curve (see S.2 in the supplement).The seasonal timing of the vector signs and magnitudes revealed that seasonal longevity of productivity and timing play an important role in defining spatial variation of NVDI intra-annual dynamics across the circumpolar Arctic.Maps of each of the EFAs were created from the same MODIS NDVI dataset that was used to calculate the principal components analysis (PCA) (figure 2).

EFTs map
The functional classification of the Arctic tundra by EFTs (figure 3) was comprised of combinations of five Mean NDVI quintiles (100 = lowest, 200, 300, 400, 500 = highest), three SOS date groups (10 = latest, 20,30 = earliest) and 3 DMax groups (1 = earliest, 2, 3 = latest).Of the 45 possible resultant EFTs, seven (531,532,322,422,122,432,222) were dominant, with >5 × 10 6 pixels, and seven additional (121,431,112,221,321,332,331) were represented by >10 6 pixels.The least abundant EFT classes (211,311,512,413,411,511,513) contained fewer than 1000 pixels each and were extremely rare or relatively absent, altogether occupying just 0.006% of the total CAVM MODIS 250 m pixels.These rare classes included functional combinations that are biologically unrealistic for the Arctic tundra vegetation, such as reaching high mean NDVI with a late start of the growing season and an early date of maximum NDVI.These rare EFTs tended to be scattered throughout the lower latitudes of the study domain (bioclimatic subzones D and E), adjacent to areas of rapid transition (e.g.hillslopes) and nearby to similar EFTs of the same NDVI quintile.For example, EFTs 511, 512 and 513 were most closely associated with subzone E and found in areas of high NDVI values within this subzone.

EFT richness and evenness across the CAVM regions
To understand how EFTs distribute across the Arctic tundra landscape, we calculated richness, evenness, and diversity metrics from pixel counts for EFTs within the CAVM bioclimatic subzones, physiognomic units, and floristic provinces.On average, EFT richness was high (mean 41 and median 42) for the three CAVM regionalizations, i.e. a majority of classes had more than 40 EFTs and just one physiognomic unit and one floristic province had less than 30 EFTs (table 3).Despite such relatively small differences in the high EFT richness, EFT Diversity strongly differed across classes mainly due to large differences in evenness.For instance, in some of the richest classes, despite containing all possible 45 EFT combinations, a majority of pixels belonged to a few dominant EFTs, while most EFTs had relatively low abundance.Two examples include Tussock Sedge and Low Shrub Tundra for physiognomic units, and Kanin and Beringian Alaska among the floristic provinces.A table of frequency values for all subzones, units and provinces can be found in supplement S4.
Across the five bioclimatic subzones, EFT richness increased from north (36 in subzone A) to south (45 in subzones D and E), though diversity peaked in subzone D (0.875), where evenness was highest (0.178).The average EFT richness (table 3) for bioclimatic subzones was 41 (median: 42), similar to that of both physiognomic units and floristic provinces, which had means of 42 (median: 43) and 41 (median: 42), respectively.Bioclimatic subzones mean evenness was 0.149 (median: 0.145) but was not the highest amongst the three CAVM regionalizations; floristic provinces had the highest mean evenness (0.168; median: 0.142), whereas the physiognomic units had the lowest, with a mean value of 0.138 (median: 0.125).Despite the low evenness values overall, the Simpson's Diversity Index (1-D) values were relatively high, with a mean of 0.835 (median: 0.824) for five bioclimatic subzones, 0.801 (median: 0.808) for the 16 physiognomic units, and 0.818 (median: 0.830) for the 28 floristic provinces.
In each of the five bioclimatic subzones, the Simpson's Diversity Index were all greater than 0.800, though all had relatively low evenness values and varying richness.Similar patterns were observed within both the 16 physiognomic units, and the 28 floristic provinces.Notably low evenness values predictably resulted in some of the lowest Simpson's Diversity Index values, but examples could be found at both limits of richness among the physiognomic units (e.g.Cryptogram Herb (1), Tussock Sedge Moss Tundra (8) and Low Shrub Tundra Table 3. Tables showing results of richness, evenness, Simpson Index (1-D) values and pixel counts for EFTs within CAVM bioclimatic subzones (top left), physiognomic units (bottom left) and floristic provinces (right), each sorted by richness from low to high.For table spacing, many of the physiognomic units and floristic province names were shortened or abbreviated.The number in parentheses after each unit and province name refer to the numeric labels assigned in previous CAVM Maps.The table columns are conditionally formatted, ranging from low numeric values in bright blue to high numeric values are in bright red, with shades in between corresponding to values closer to the middle of the range (shown in white).( 14)).This pattern was also noted at only some of the highest richness value floristic provinces (Kanin and Beringian Alaska for example).These results highlight the uneven distributions of EFTs within subzones, units and provinces.
Pixel counts were also included in table 3 to explore whether greater areas reach higher EFT diversity metrics.Although it was clear that EFT richness increased with area across bioclimate subzones, evenness and Simpsons Diversity Index did not exhibit this same trend.Across physiognomic units, a similar, although weaker, trend of increasing EFT richness with area was noted, however there were also numerous exceptions to the relationship between larger units and greater richness.However, no discernable trend was observed between floristic province size and the EFT richness found within it.

Correspondence of EFTs with CAVM bioclimatic subzones
Though richness and evenness values indicated that EFTs are well distributed across the CAVM domain, they do not reveal whether EFT spatial patterns are related to climate controls, or to the structure, or composition of vegetation.To assess such relationships, we ran three CA on the EFT distribution across bioclimatic subzones, physiognomic units, and floristic provinces.For bioclimatic subzones (figure 4), the cumulative percentage of variation explained by the first two CA axes combined was 92.96%.The Pearson's Chi-squared statistic value of ∼1.85 × 10 8 (p-value = 0) indicated a significant association between rows (bioclimatic subzones) and columns (EFTs).Within the first two dimensions, however, there were a few discernible trends.For instance, the east-west orientation of the EFTs in the bi-plot (figure 4) indicates a productivity gradient captured in the first dimension.The close proximity of Subzone E to the majority of the most productive EFTs (500s) and Subzone A and B to the lowest productivity EFTs highlights the dominance of these EFTs within these subzones.

Correspondence of EFTs with CAVM physiognomic units
The CA between EFTs and physiognomic units also showed a strong correspondence, though less than that of bioclimatic subzones and greater than the floristic provinces.The Pearson's Chi-squared statistic value of ∼2.97 × 10 8 also indicated a strongly significant association between rows (EFTs) and columns (physiognomic units).The variance explained by the first two dimensions was 74.45%, higher than that of the floristic provinces.Similar patterns as those seen in the floristic province CA were also notably present in the bi-plot of the physiognomic units CA (figure 5).In general, EFTs are distributed from low to high Mean NDVI across the first CA axis on the bi-plot.The physiognomic units were also well distributed across the  two axes, with some occurring close to EFTs, while others occurring away from clusters of EFTs or other physiognomic units (i.e.nunatak complex).The high mean NDVI EFTs (531-533) were closely associated with the three physiognomic units with the highest plant biomass: low shrub tundra, tussock sedge, and low shrub wetland.At the opposing limit of the first dimension, the lowest productivity vegetation types, such as 'nunatak complex,' 'carbonate mountain complex,' and 'cryptogam-herb' , were most closely related to the lowest mean NDVI EFTs (111-113).

Correspondence of EFTs with CAVM floristic provinces
The correspondence between floristic provinces and EFTs indicated a strong association, the Pearson's Chi-squared statistic value of ∼5.81 × 10 8 (p-value = 0) indicated a significant association between rows (EFTs) and columns (floristic provinces).The cumulative percentage of variation explained by the first two CA axes was 55.49%.EFTs were again distributed from low to high Mean NDVI across the first CA axis on the bi-plot (figure 6), but less distinct than that of the bioclimatic subzones and the physiognomic units.The second CA dimension did not have a similarly obvious association for EFTs or floristic provinces.Overall, the majority of both EFTs and floristic provinces were not clustered but distributed along the eigenspace on the bi-plot with few discernable patterns, suggesting that EFTs are, in general, well-distributed throughout the floristic provinces.Indeed, only a few associations were noted between particular EFTs and floristic provinces.For instance, the highest mean NDVI EFTs (500s) were closely associated with South Chukotka, Northern Beringian Islands, Beringian Alaska, and Kanin, while the lowest mean NDVI EFTs (100s) were associated with Greenland, Glaciers, and Svalbard.The floristic provinces that are dominated by these higher or lower NDVI EFTs and are found near the positive and negative limits of the first CA axis, and indicate that these provinces were outliers with low EFT richness, with the majority of the composition of each being the EFTs they are in close proximity to on the bi-plot.

Discussion
Our study presents a first classification of the Arctic tundra landscapes, not focused on bioclimate, nor on vegetation structure or composition, but in terms of EFTs, providing a novel metric for understanding patterns of ecosystem functioning diversity across the circumpolar high latitudes.Our results highlight a greater correspondence of EFT patterns with bioclimatic subzones, followed by physiognomic units and floristic provinces, which suggests a greater role of current ecological conditions followed by vegetation structure and composition in determining ecosystem functioning.Regarding EFT diversity, our results also show how the number of EFTs (EFT richness) increased with area and thermality, particularly across bioclimatic subzones, and in physiognomic units or floristic provinces characterized by mountainous landscapes or water-body shores.Functional diversity and ecosystem function, their relationship with biodiversity composition and structure, and their drivers have been topics of discussions concerning ecosystem vulnerability and resilience to environmental change, and in particular, to climate change (Alcaraz-Segura et al 2017, Schweiger et al 2018, Cavendar-Bares et al 2022).There is broad consensus within the scientific community that functional diversity is critical to the stability of ecosystem function and thus for biodiversity conservation (Tilman 1997, Dı ´az and Cabido 2001, Fargione et al 2007, Schweiger et al 2018).A number of strategies, both bottom-up and top-down methods, have been developed to target measurements of ecosystem function and functional diversity.
A bottom-up approach, which was first described by Shugart et al (1997), defined functional types by grouping organisms based on their common structural and functional properties in order to scale-up from leaf or organism level.More recently, numerous studies have been built on the 'spectral diversity hypothesis' which states that 'plant species within a community occupy unique spectral spaces delineated by their chemical, anatomical and morphological characteristics' (Asner and Martin 2008, Rocchini et al 2010, Ustin and Gamon 2010, Schweiger et al 2018).These methods include those of Wang et al (2020b), Serbin and Townsend (2020), Cavender-Bares et al (2020), Asner et al (2015), and others, which utilize imaging spectroscopy and other high resolution remote sensing methods for mapping functional traits to delineate functional variation and diversity on a continuous scale at the leaf and plant level.Several studies have scaled up emergent structural and functional properties from patches to regional estimates by applying hierarchical and patch dynamic theories to emergent structural and functional properties (Aber et al 1999, Reynolds and Wu 1999, Wu et al 2003).However, few studies have established widely accepted and validated methods for mapping multiple functional traits consistently across biomes and heterogeneous landscapes from leaf-level trait mapping (Wang et al 2020, Hauser et al 2021).In addition, the continuous scale at which leaf-and plant-level traits are measured, presents an inherent difficulty when trying to scale up to the ecosystem level, as individual traits tend to overlap on a continuous scale and can be hard to spectrally separate (Asner et al 2015, Schneider et al 2017), resulting in a profound dependency of functional diversity estimates on image resolution (Helfenstein et al 2022).
Describing functional units from a top-down perspective, our study offers a complementary approach to the leaf-organism level trait-mapping methodology.Ecology and conservation need spatially-explicit and straightforward ways to assess and incorporate heterogeneity of ecosystem functions (Turner and Chapin 2006), and to develop tools that complement bottom-up approaches and management actions at the regional scale (Possingham et al 2005).Our top-down methodology relies on the use of satellite-derived variables at moderate resolution, that document the dynamics of patches of the land surface in a common way with regard to matter and energy exchange (Paruelo et al 2001, Alcaraz-Segura et al 2013, Cazorla et al 2021).The approach relies on the assumption that functional units are those patches that are grouped together by showing coordinated and specified responses to environmental factors, regardless of their species composition or structure (Valentini et al 1999).Seasonal dynamics of net primary production was first suggested to identify functional units by Soriano and Paruelo (1992), but the use of ecosystem functional descriptions has been developed further in recent decades into a reliable method to characterize ecosystem functioning, with the potential to link ecosystem processes on the land-surface with climatic processes (e.g.Müller et al 2014) and biodiversity responses (Alcaraz-Segura et al 2017) toward improving understanding of the effects of environmental changes on biota (Marcos et al 2021, Rotllan-Puig et al 2021, Regos et al 2022).
Our study shows how satellite derived EFAs and EFTs of a focal ecosystem function (here primary production) offer tangible and biologically meaningful qualities of ecosystem functional heterogeneity (e.g.EFT richness) that complement traditional climatic, compositional and structural descriptions of the arctic tundra.Ellenberg and Mueller-Dombois (1974) first described the possibility of vegetation classification based on function by mapping similarities in the way vegetation responds to the environment across different ecosystems.By interpreting the relational exchange of matter and energy with the land surface over the growing season, functional types can be satellite-derived at the ecosystem level, not only based on the spatial heterogeneity of primary production dynamics, but also on other ecosystem processes and functions at different scales (Pettorelli et al 2018, Regos et al 2022).Paruelo et al (2001) first utilized this approach for temperate South America, where EFTs were defined separately from vegetation structure on a pixel-level.Their approach has since been applied and refined in numerous other systems, e.g. the Iberian Peninsula (Alcaraz-Segura et al 2006) and the contiguous U.S. (Villarreal et al 2018).Each study, including ours, first has to statistically and biologically define the most important EFAs in order to determine classifiers that are most appropriate for the systems of interest.

Functional attributes of the circumpolar Arctic
In our study, variables pertaining to productivity (annual mean) and phenology (timings of the maximum growth rate and of maximum interception of radiation) were used to define Arctic tundra EFTs.Previous work by Alcaraz-Segura et al (2006, 2013) has shown how these variables measured by remote sensing can identify key EFAs.As stated by Ellenberg and Mueller-Dombois (1974), the solar energy absorbed by living plants through photosynthesis constitutes an integrative and quantitative functional characterization of ecosystems.Traits derived from remote sensing data relating to greenness (NDVI) or carbon cycling are essentially describing patterns of interception of radiation by vegetation and can therefore be useful in defining EFTs.
In seeking to answer our objective question of: 'What are the essential variables that characterize ecosystem functioning, in this case the seasonal dynamics of primary productivity (i.e.carbon gains), throughout the circumpolar Arctic?' we followed the methodology of Alcaraz-Segura (2006) and Paruelo et al (2001).Conducting a spatial PCs analysis over the CAVM domain using 16 d composites over the growing season from March to October, averaged over the entire MODIS NDVI record from 2000-2020 provided us with a statistical means to reduce the variability of the NDVI seasonal dynamics across the study area for a few statistically informative axes of variation.Calculating Pearson correlations between the main axes of variation and all EFAs used in this study (pertaining to productivity, seasonality, and phenology; see table 1) proved to be a difficult undertaking, even for GEE.
The variables selected for EFAs provide a more holistic understanding of the heterogeneity of the Arctic tundra landscape than any one productivity measure alone.NDVI has been used as a reliable indicator of plant productivity for decades across the tundra biome (Stow et al 2007, Raynolds et al 2012, Gamon et al 2013, Lara et al 2018b, 2018c).However, NDVI alone does not provide the important temporal dimension that is required to fully depict the spatial heterogeneity of Arctic tundra vegetation dynamics.Variability in the occurrence and magnitude of the flow of energy through systems across the landscape are driven by seasonal, radiative, and climatic fluxes, as well as geologic and topographical patterns.The three statistically independent (non-self-correlated) EFAs that had the highest correlation with the PCs each represent the key drivers of spectral changes to landscapes across the tundra biome.Productivity (Mean) and phenology of maximum net growth rate (SOS Date) and of maximum interception of radiation (DMax) together captured the multi-dimensional complexities (spatial and temporal) of vegetation and ecosystem processes throughout the Arctic tundra.

Arctic tundra EFT distribution patterns
The response to our second research objective question of: 'What are the spatial patterns of the distribution of EFTs, classifications of functionality, throughout the circumpolar Arctic?' was made evident by examining our Circumpolar Arctic Tundra EFT map (figure 2).Our expectation of greater functional heterogeneity in the southernmost CAVM subzones was based on the paradigm of greater species diversity with decreasing latitude (Pianka 1966, Tilman et al 1998), and the spectral or optical diversity hypothesis, which states 'that species within a plant community occupy unique spectral spaces delineated by their chemical, anatomical and morphological characteristics' (Ustin and Gamon 2010, Rocchini et al 2010, Asner and Martin (2008).In viewing EFT frequency distribution by bioclimatic subzone, the range of EFT richness follows latitude.However, the distribution of EFTs across subzones differs in evenness, with the comparatively middle latitude Subzones C and D having more evenly distributed EFTs than Subzone E, which has the highest concentration of more productive EFTs (high NDVI).This supports Gravel et al's (2012) assertion that the relationship between phylogenetic diversity and ecosystem function can vary along a continuum.Our first hypothesis was supported based on EFT richness, as the longer growing season in the southernmost CAVM subzones favors a greater range of possible pathways of seasonal dynamics of primary productivity, i.e. both EFTs with greater mean productivity and with more variability in seasonalities and phenologies.Indeed, subzones D and E both contained all 45 possible EFTs.Our second hypothesis, which expected a substantial degree of functional heterogeneity near the center of the latitudinal temperature gradient of the tundra biome was also supported as Subzones C and D showed the highest Simpson's Index.
From our third hypothesis, we expected greater EFT heterogeneity in areas that occupy strong elevation gradients (e.g.Brooks Range, Ural Mountains, East Siberian Mountains), and in areas where there are ice and near-shore ocean conditions.Both of these landscape types have similar locally variable edaphic conditions, though they differ in the mechanism of their heterogeneity.The diversity of conditions found in mountainous landscapes is related to topography (elevation, slope, position, geology), while those in icy and near-shore areas are related to ocean currents, winds, and proximity to shore.Both sets of conditions have related strong climatic gradients leading to higher demographic diversity per unit area than other ecosystem types.In support of this hypothesis, EFT diversity hotspots are evident in the Brooks Range of northern Alaska, and the Chukotka and Ural Mountains of Russia (figure 3).EFTs in these areas appear to be strongly associated with elevation, with higher productivity EFTs in the foothills, and lower productivity at the peaks where year-round ice is not present (ice was masked out of the analysis).Similarly, the icy and near shore coastal zones also exhibit higher EFT diversity than that of adjacent inland areas.Southern Greenland, northern Iceland, coastal Siberia and coastal areas surrounding the southwest coast of Baffin Island all depict areas of high local EFT diversity.To illustrate patterns of how EFT richness varies spatially across the Arctic, supplement S5 shows richness values mapped across CAVM floristic provinces and physiognomic units.

Relating Arctic tundra EFTs to CAVM bioclimatic, phytogeographic, and physiognomic zonation
To answer our third question: 'How does the functional classification of the Arctic tundra biome compare to bioclimatically, compositionally, and structurally based classifications, such as those from the CAVM?', we conducted CAs to determine how our mapped EFTs were distributed across bioclimatic subzones, vegetation/physiognomic units, and floristic/phytogeographic provinces.Correspondence analysis (CA) is a dimension reduction technique similar to PCA; but where PCA aims to determine what causes variation, CA deals with two multi-variable datasets (here subzones, units, or provinces and EFTs), with the aim of finding what accounts for the most co-variance between the rows (CAVM zones or units) and columns (EFTs).In CA, similarity between two rows or two columns is completely symmetric, and two rows (CAVM zones or units) are considered to relate closely to each other if they associate with the columns (EFTs) in the same way.For instance, if two floristic provinces have very similar EFT frequency distributions, they will be more closely related than a third region with a very different frequency distribution.Thus, similar rows (in this example floristic provinces) can be grouped together and each group of rows is then characterized by the columns (EFTs) to which it is or is not associated.
In our fourth hypothesis, since our focal ecosystem function to identify EFTs was carbon gains dynamics, in particular, the rates and timing of primary production across the landscape, we expected to observe greater association with climatic drivers than with the physiognomic structure or the floristic composition of vegetation.Our reasoning for this is that, at the regional scale, ecosystem functioning is likely to have closer correspondence to climate (e.g.temperature, humidity and radiation), edaphic factors and vegetation structure than to evolutionary components (floristic composition) (Higgings et al 2016, Mucina 2019).To address this hypothesis, we first used the CA results to compare the degree to which each CAVM subdivision statistically associated with EFTs, and then we compared EFT richness and Simpson's Diversity Indices across the subzones, units, and provinces.
As part of the CAs, Chi-square tests were accomplished separately for each CA in order to investigate whether the rows (subzones, units and provinces) and columns (EFTs) were considered to be significantly associated.In all three, the chi-square statistic was very large (1.85 × 10 08 for subzones, 2.97 × 10 08 for units, and 4.03 × 10 08 for provinces), with a p-values essentially zero, rejecting the null hypotheses and signifying the statistically significant association between EFTs and subzones, units and provinces.However, the dimensionality of the subzones, units and provinces differed considerably; 93.0% of the variation could be explained in the first two dimensions of the bioclimatic subzone CA bi-plot, while 74.5% was explained for the physiognomic units, and only 55.5% was explained for floristic provinces.This finding suggests that the bi-plot ordination better captured the simultaneous variation of EFTs and regionalizations for bioclimates followed by vegetation structure and floristic provinces.In all three bi-plots there was no clear bi-directional clustering of EFTs and units or provinces, but there existed a clear orientation of the three regionalizations and EFTs, owing to productivity gradients.Some EFT orientation was exhibited across the first dimension in floristic provinces with EFTs decreasing in terms of NDVI mean along a negative to positive gradient.This was also true for physiognomic units, with some exceptions noted in both.For each of the three CA tests, lowest diversity subzones or provinces were associated with the highest and lowest number of EFTs in eigenspace, i.e, the highest and lowest NDVI mean, while the most heterogenous zones and provinces were oriented near to the origin (i.e. the center of the plot, with medium levels of NDVI mean).
In interpreting the degree of functional heterogeneity for each CAVM unit's EFT distribution, our analysis did not fully support our fourth hypothesis, i.e. finding more heterogeneity within floristic provinces followed by physiognomic units and bioclimatic subzones.In viewing the Simpson's Diversity indices results (table 3) for the subzones, units and provinces, the mean and median values were lowest for units (0.801 and 0.808, respectively) as compared to subzones (0.835 and 0.824) and provinces (0.818 and 0.803).However, the degree to which these differed was small, and there was a lot of overlap in values.In addition, the minimum Simpson's index of 0.580 was calculated for Beringian Alaska and the maximum (0.927) for Polar Ural, both floristic provinces, while for physiognomic units the minimum was 0.683 (Cryptogam Herb) and maximum was 0.926 (Non-Carbonate Mountain Complex).Thus, the full range of Simpson's Diversity (1-D) values for both units and subzones were within the bounds of the provinces.EFT richness showed the same pattern than diversity.

Sources of uncertainty and associated error
Our goal of developing Circumpolar Arctic Tundra EFTs was accomplished almost completely in GEE.We encountered numerous issues with calculations timing out or exceeding the maximum memory capacity.This was not completely unexpected given the enormity of our study domain, however, our methodology and code required optimization and refinement to cut down on these errors at each step in the process.We investigated the quality of our input datasets both through bit-masking testing, where we calculated and mapped the number of images included per pixel at different bitmask thresholds.We also encountered data coverage difficulties brought on by seasonal daylight hours.Because MODIS is a passive instrument, there are no data from northern latitudes during periods of darkness; thus, there is inconsistent coverage across the study domain at the beginning and end of the growing season.We therefore selected the start and end dates of our study growing season to coincide with time periods where there was daylight, and thus MODIS measurements, for the entire domain.This may have introduced some error in the lowest latitude areas of our study regions; if the growing season began prior to 6 March or ended after 31st October.However, our decision prevented increased uncertainty that would be introduced by averaging images with only partial coverage with images of full coverage of the domain.
In addition to data quality and coverage, water was another source of error that we addressed in our investigation.The Arctic tundra is a landscape of diverse and complex hydrology; the GlobCover water bodies permanent snow and ice mask did not address seasonal water.Initial versions of our EFT map had artifacts and anomalies (e.g.systematic increase in EFT richness) in the pixels that are located at the edges of water bodies, especially seasonal lakes/ponds and bogs.As our EFT map was borne out of terrestrial vegetation productivity metrics, we constructed a more robust water mask that combined the JRCs Global Surface Water Map with the GlobCover water bodies, permanent snow, and ice mask to remove both seasonal and permanent water over our study time period .To correctly scale seasonal water spatially, we created a threshold of >20% for inclusion in the EFT analysis.Thus, if a 250 m MODIS pixel contained greater than 20% of 30 m pixels classified as water by the JRC mask, then the 250 m pixel was masked out of the analysis.This exclusion of seasonal water decreased the appearance of anomalous pixels adjacent to water bodies and any artificial increase in EFT richness.
Due to the large number of pixels in our study domain that exceeded the computing limitations of GEE, the Pearson correlations were performed at quadruple the native MODIS resolution (∼500 m).To verify that decreasing resolution would not affect the outcome of these correlations, we also performed correlations at MODIS times 16 (∼1 km), 64 (∼2 km), and 256 (∼4 km).Results of this sensitivity analysis (see supplement S2) indicated that the coarsening of the resolution did not have a significant impact on the correlation results.Though many of our initial EFT calculations were accomplished in GEE, the CAs were accomplished using R. CAs are conducted from frequency distributions of EFTs calculated for each CAVM regionalization.Because of pixel-averaging that GEE does when calculating frequency distributions for large domains, we could not rely on the frequency distributions calculated for CAVM subdivision, so we decided to compute them in R.However, in exporting the full CAVM EFT map, we encountered an export error bug that exists in GEE and that GEE developers were not able to repair yet.When exporting datasets that cross the 180 • /0 • longitude line as with our circumpolar map, a large portion of Eastern Siberia was left out of the export.Our solution was to break the EFT map into pieces that did not cross the 180 • /0 • longitude line and export them separately.We then recombined the pieces in QGIS and calculated frequency distributions in R using the full EFT geotiff.

Conclusion
The development of an EFT classification for the circumpolar Arctic tundra provides an analogous effort for mapping ecosystem function throughout the Arctic tundra biome, and for comparing functional, compositional, and structural classifications.To our knowledge, our EFT map is the first map describing ecosystem functional diversity at the regional scale for the circumpolar Arctic tundra.Satellite-derived time series of NDVI were used to quantify the complex spatiotemporal heterogeneity of primary production dynamics as a focal ecosystem function in the Arctic tundra.As other studies first developed this methodology for other biomes and ecosystems, we were able to leverage a substantial body of previous work to tailor our analysis into a robust and reproducible framework in GEE, applicable for other ecosystems in future works.This datasets from this work will inform conservation efforts with respect to ecosystem functioning and is relevant for Arctic biodiversity monitoring efforts, such as the Group on Earth Observations Biodiversity Observation Network Arctic BON, which is synonymous with the Circumpolar Biodiversity Monitoring Program within the Conservation of Arctic Flora and Fauna group of the Arctic Council.
In statistically correlating a myriad of functional attributes with spatial PCAs, we revealed the most important drivers of variation in productivity across the Arctic tundra.These three essential EFAs, Mean NDVI (Mean), Date of Season Start (SOSdate), and Date of Peak Growing Season (Dmax), are explanatory continuous variables that are representative of the patterning of productivity across the landscape in magnitude (Mean) and timing (SOSdate and Dmax).The CAs calculated for the CAVM bioclimatic subzones, physiognomic units, and floristic provinces demonstrate that the EFT Map is complementary to existing compositional and structural descriptions of the circumpolar tundra vegetation, providing functional metrics toward a more holistic understanding of this complex and heterogeneous region of the planet.In future efforts, we aim to leverage this work to develop wall-to-wall ecosystem functional diversity metrics toward understanding of the drivers of spatial and temporal functional variability across the Arctic tundra.

Figure 1 .
Figure 1.Illustration of the three ecosystem functional attributes (EFAs) used to identify ecosystem functional types in the circumboreal Arctic tundra.Between parenthesis there is a short explanation.The EFAs are numbered corresponding to the principal component axis which each correlated highest to.NDVI (Normalized Difference Vegetation Index) was used as a surrogate of vegetation primary productivity and obtained from MODIS-Terra satellite sensor from 2000 to 2020 at 250 m/pixel.

Figure 3 .
Figure 3. (Above) The Ecosystem Functional Type Map of the Circumpolar Arctic Tundra depicts the distribution of functional type classes made from discrete functional attribute ranges.(Left) The pixel count frequency distribution indicates the distribution of EFTs as seen on the map.

Figure 4 .
Figure 4. Correspondence analysis results Bi-plot of Bioclimatic Subzones.The results plot shows EFTs (in blue) and subzones (in black), with 92.96% of the variation explained in the first two dimensions.Frequency distributions showing EFT distributions at the right show Subzone A in red, Subzone C in yellow and Subzone E in green.

Figure 5 .
Figure 5. Correspondence analysis results Bi-plot of Physiognomic Units.The results plot shows EFTs (in blue) and subzones (in black), with 74.45% of the variation explained in the first two dimensions.Frequency distributions showing EFT distributions at the right show Cryptogram Tundra in red, Low Shrub Wetland in yellow and Graminoid in green.The physiognomic unit names were abbreviated to conserve space.

Figure 6 .
Figure 6.Correspondence analysis results Bi-plot of floristic provinces.The results plot shows EFTs (in blue) and subzones (in black), with 55.49% of the variation explained in the first two dimensions.Frequency distributions showing EFT distributions at the right show Northern Alaska (AK) in red, Northwest (NW) Greenland in yellow and Taimyr in green.The Floristic Province names were abbreviated to conserve space.

Table 1 .
All potential Ecosystem Functional Attributes (EFAs) evaluated to determine their biological and statistical meaningfulness to identify Ecosystem Functional Types in the circumpolar Arctic tundra.The metrics (potential EFAs) are grouped by their main biological meaning: productivity, seasonality, and phenology.

Table 2 .
Results of 3 PC axes with all EFA correlations, including correlations among EFAs.
a All Correlations accomplished at Modis x2 resolution.Figure 2. Ecosystem Functional Attribute Maps for the Circumpolar Arctic Tundra (Lambert Equal Area Polar Projection).Top left: NDVI Mean; Top Right: Date of the start of the growing season (SOS Date) as defined by the date with the highest positive slope in NDVI; Bottom: Date of the maximum NDVI value.The Mean NDVI is shown with a continuous color scale; the DMax and SOS Date are both shown with discrete colors corresponding to the Julian date of the respective attribute.