A solution for perfect bioclimate envelopes that are imperfect for extirpated species

Present-day species distributions modeled with climate variables cannot provide potential future climate space for species that have contracted in range due to extirpations, regardless of abundant sample sizes within current ranges. My objective was to examine effects of range contractions on modeling of species to determine suitable space under climate change, exploring different approaches to modeling based on historical range maps. As examples of this issue, I estimated the current and future bioclimate envelopes of American bison (Bison bison) and elk (Cervus elaphus) from their current reduced ranges in the western United States compared to historical ranges immediately before extirpation. One solution for bioclimate envelope modeling is to generate presence samples from the historical range and pseudoabsence samples from outside of the historical range. By using the fullest climate space, the models identified areas of future suitable climate space that otherwise would be underpredicted (10%-27% of climate space, for these two examples) based on current ranges of species that have been extirpated from their historical range. Range contraction substantially reduced predictions of suitable climatic space under climate change. Therefore, species need to be evaluated for range extirpation before determining potential impacts of climate change on biodiversity conservation.


Introduction
Species that are declining may contract in range, resulting in local extinction, or extirpation from geographic extents (Lomolino and Channell 1998). Land use can result in habitat loss and fragmentation, constraining species to only a portion of historical ranges. Overexploitation directly can remove species from some part of the range. Around 25% of assessed animal and plant species are threatened with extinction, primarily due to land use and overexploitation (Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services [IPBES] 2019).
Bioclimate envelope models (i.e. bioclimatic models, climate envelope models) are distribution models based on correlations between species occurrences and climate variables. Bioclimate envelopes models are a relatively accessible and fruitful line of research, encompassing tens of thousands of publications. For example, Worldclim 2, a dataset which supplies bioclimatic variables, has 8000 citations currently, a rapidly increasing number (Fick and Hijmans 2017). Climate-only models are a beneficial tool that are explicit about the relationships of climate to species distributions, which are relatively reliable over time, unlike correlational models that include other variables or process-based models that incorporate dynamic processes.
Modeling of species to determine suitable space under climate change is one specific objective of bioclimate envelope modeling (Jeliazkov et al 2022). However, to produce a reliable model of climate space for a species, occurrence samples need to cover the full climate range of the species, regardless of sample size (Martínez-Freiría et al 2016). The number of samples is important to generate reliable models but primarily as a reflection of whether the samples have captured the climate space. The modeled climate space may be accurate if the species always has been a specialist with a narrow range or if it is established that the species Not subject to copyright in the USA. Contribution of USDA Forest Service lost traits for tolerance to climate of the historical range during the extirpation process (i.e. subspecies genetically suited to a different climate). Otherwise, it is not possible to model the climate space of species that have contracted in range, due to land use or overexploitation, unless the historical range is known or well-distributed historical fossil samples are documented (Lima-Ribeiro et al 2017, Faurby and Araújo 2018, Sales et al 2022. Two examples of species that have been extirpated from their historical ranges in North America, following Euro-American settlement, are American bison (Bison bison) and elk (Cervus elaphus canadensis). Elk and bison primarily are grazers of herbaceous vegetation, which are abundant in grasslands, shrublands, savannas, and woodlands. Bison consume predominantly grasses and sedges (Gates et al 2010), whereas elk generally may prefer graminoids but will consume forbs and browse based on relative spatiotemporal abundance (Christianson and Creel 2007). The diets of both species are principally graminoids for most of the year, although bison favor sedges if available while elk favor grasses, and overlap continues into the summer, when elk consume more forbs and woody plants than bison (Gates et al 2010). Greatest competition for forage is from domestic cattle, due to presence of 30 million beef cows, seasonally with calves, in the United States (USDA NASS 2023).
Historically, an estimated 10 million elk and 60 million bison were present in North America (Seton 1929). Populations decreased to 100 000 elk and fewer than 1000 Plains bison (B. bison bison) around year 1900 with current recovery to about 1 million elk and 500 000 bison (Seton 1929, Gates et al 2010, Popp et al 2014. However, most bison are owned in commercial herds, while about 20 000 animals are present in 62 conservation herds (Gates et al 2010). Disease may have contributed to near complete loss of bison (Holt et al 2018) but given extinction of most megaherbivores after initial contact with humans and extirpation of multiple large herbivores due to overexploitation by Euro-Americans, disease was not a necessary factor. Euro-American hunters killed herds of animals, often without taking any products, although sometimes hides and tongues were extracted (Seton 1929). The trait of large body size (>44 kg), or slow reproduction rates, may make species extremely vulnerable to extinction, regardless of historical range extent or abundance (González-Suárez et al 2012). Equally, the historical full suite of predators, consisting of wolf (Canis lupus), grizzly bear (Ursus arctos horribilis), black bear (Ursus americanus), wolverine (Gulo gulo), mountain lion (Puma concolor), lynx (Lynx canadensis), and coyote (Canis latrans) occurs in limited locations currently, resulting in high bison survival rates (>90% for calves; Gates et al 2010). For elk, neonate survival declines as the number of predator species increases, with bears and mountain lions as the dominant predators (Griffin et al 2011). Generally, high adult female survival and variable juvenile survival characterize ungulate population dynamics (Griffin et al 2011).
The ranges of bison and elk have contracted from nearly the full width of North America, albeit not filling the entire extent, to limited ranges in western central North America, within public lands or in mountainous private lands where land use is not intensive. Bison are highly managed, and wild Plains bison only range freely within the confines of limited public lands, particularly Yellowstone, Theodore Roosevelt, and Wind Cave National Parks and Custer State Park. Elk and bison have been re-introduced into a few public lands in the eastern U.S. (Gates et al 2010, Popp et al 2014. Due to their large size and impact on infrastructure, bison and elk usually are unwelcome on private lands and individuals that attempt to expand are removed (Fricke et al 2008).
Bioclimate envelope modeling requires completeness of species distributions to represent climate tolerances. While others have modeled North American mammal distributions (e.g. Deb et al 2020), and some with recognition that contracted ranges affect future predictions of climate space (Lima-Ribeiro et al 2017, Araújo 2018, Sales et al 2022), fossil records rather than historical range maps immediately before extirpation have been applied and different modeling approaches for historical ranges have not been examined. Therefore, research questions remain about (1) how bioclimate envelope models from species observations of reduced ranges fill the climate space of overexploited, extirpated species compared to historical ranges before exploitation and (2) what are different options to find the missing climate envelope for historical ranges? My objectives were to (1) demonstrate the influence of extirpation on bioclimate envelope models, for two example species with abundant samples in contracted ranges, and (2) provide a solution for modeling the lost climate space from three different options. Because general historical ranges are known for bison and elk, as examples of species with abundant samples but within a limited range, I explored different solutions for filling the lost climate space for these species. One solution was continuing to use current occurrence samples and simply avoiding sampling (pseudo)absence points from the historical range. The second solution was generating presence samples from historical ranges with background (pseudo)absence points. A third solution was a combination of presence samples from the historical range while avoiding the historical range for (pseudo)absence points. To detect any historical climate change effects on ranges, I also hindcasted to the 1900s, 1800s, and 1700s. This research will help improve evaluation of future distributions for species with contracted ranges, by demonstrating potential missing climate envelopes in models based on current observations for species that have range reductions and potential model improvement by using historic ranges to better predict future distribution of species that have extirpations across large parts of their historic distributions.

Climate variables
Sixteen total climate variables were from climatologies for 1981-2010 and additionally 2071-2100, 1900-1990, 1800-1899, and 1700-1799 for North America (resolution 30 arc s or <1 km 2 ; Karger et al 2017Karger et al , 2021. The general circulation models for 2071-2100 were MRI-ESM2-0 (Meteorological Research Institute, Japan), GFDL-ESM4 (Geophysical Fluid Dynamics Laboratory, USA), and IPSL-CM6A-LR (Institut Pierre Simon Laplace, France), which were modeled by groups representing different continents, out of five general circulation model options prioritized by the Intersectoral Impact Model Intercomparison Project, for the fossil-fueled ssp5-8.5 scenario of maximum temperature increase (Coupled Model Intercomparison Project Phase 6). Future mean annual temperatures for North America, excluding Greenland, were projected to increase 5.4 • C and 5.9 • C by the MRI and GFDL general circulation models, respectively, and 8.8 • C by the IPSL general circulation model. The 14 bioclimatic variables quantified temperature and precipitation at annual, seasonal, and monthly intervals: mean annual temperature, maximum temperature of the hottest month, minimum temperature of the coldest month, mean temperatures of the coldest and hottest three months (i.e. quarters), mean temperatures during the wettest and driest three months, annual precipitation, precipitation during the driest month and driest three months, precipitation during the wettest month and wettest three months, and precipitation during the warmest and coldest three months. Two more variables were number of snow cover days and the Koppen index, a coarse evapotranspiration measure to account for the relationship between temperature and precipitation.

Current occurrences and pseudoabsences
For current species occurrences of bison and elk, records were since 1990 during the summer months of June, July, and August, with unique coordinates and coordinate uncertainty less than 999 m (GBIF 2022; GBIF occurrence download https://doi.org/10.15468/dl.8tdqfp). Pseudoabsences were random points, equal in number to presence points, from the study extent, or background, which was the continent of North America. This process resulted in 1193 occurrences and 1193 absences for bison and 2223 occurrences and 2223 absences for elk (see figure 1 for workflow). Although no species-specific information is available to guide altering samples, a step of thinning samples is sometimes selected (Varela et al 2014). Therefore, samples were thinned to one for each pixel within the climate raster layers, resulting in 674 bison occurrences and 1249 elk occurrences (Varela et al 2014), with an equal number of pseudoabsences from the study extent of North America. Climate values were extracted at each of the species occurrence coordinates and absence coordinates.

Historical occurrences and pseudoabsences
To delineate historical ranges (see figure 2)  ecological units (i.e. ecoregions and ecological subsections; figure 2), which are landscapes of ecosystems defined by homogenous climate, vegetation, surficial geology, lithology, geomorphic processes, soils, and hydrology (Keys et al 2007, Omernik andGriffith 2014). That is, selected ecological units corresponded with the maps of historical distributions. However, the elk distribution in Popp et al (2014) extended into Mexico, which was excluded based on elk distribution maps from Seton (1929) and no evidence for elk presence in Mexico in a review by Carrera and Ballard (2003). The herbaceous vegetation present in pine and oak forests of the Western Sierra Madre Piedmont ecological unit likely would have been appropriate, but less so sparse vegetation in the Chihuahuan desert ecological unit.
To incorporate historical ranges, I tested three solutions as sampling options for historical ranges, generating an equal number of presence and pseudoabsence points for each option (figure 1). One solution was to use current occurrences, but to exclude the historical range from the randomly generated pseudoabsence points. The second solution was to add randomly generated points from the historical range to represent occurrences, along with randomly generated pseudoabsence points from the entire North American extent. One, 5, and 50 randomly points were generated for each ecoregion in the historical ranges; the bison range contained 60 ecoregions and the elk range contained 90 ecoregions. Additionally, I combined these two options together, of randomly generated presence points from the historical range and randomly generated pseudoabsence points avoiding the historical range.

Modeling framework
For modeling of each species, I applied an iterative process to winnow variables to the most influential with the random forests classifier in the caret package (figure 1, Kuhn 2008, R Core Team 2022). The caret package automatically tunes the model parameter of the number of predictors that will be randomly sampled at each split when creating tree models to maximize accuracy and otherwise applies default values of 500 trees, maximum possible number of terminal nodes, and minimum of one terminal nodes (Kuhn 2008, www. rdocumentation.org/packages/randomForest/versions/4.6-14/topics/randomForest). First, with ten-fold cross-validation, repeated three times, recursive feature elimination built a model based on all variables, rank-ordered variables by importance, and removed variables with the least importance based on model evaluation metrics (i.e. accuracy). Because all climate variables potentially are strong matches to species distributions, I limited the number of variables to the eight most optimal variables in terms of maximum accuracy. Then, the modeling process iterated again, retaining only variables that carried an importance value ⩾40 (out of 100 scale; note that other classifiers will assign weights differently). Although number of variables was reduced, climate variables are inherently correlated throughout time, which provides relative climate stability, rather than random weather, and reliable long-term associations with species (Hanberry 2023). Random forests consistently is able to identify the most important variables for models from correlated climate variables (Hanberry 2023).
With this final reduced set of variables, modeled variables went to two different pathways. With samples partitioned into training (75% of samples) and test (25% of samples), accuracy metrics of accuracy, true positive rates, and true negative rates were calculated. With all the samples, distributions and area (NAD83 Albers projection) were predicted for 2071-2100, 1900-1990, 1800-1899, and 1700-1799.

Models from current observations with pseudoabsences from North America
The models near perfectly matched current species observations with climate, resulting in lack of filling in the missing range for all time intervals (figure 2). Accuracy was 0.956 for bison models, with equal true positive and true negative rates. Accuracy was 0.938 for elk models, with a better true positive rate (0.957) than true negative rate (0.919). For thinned samples, accuracy was slightly reduced (−0.007 for bison and −0.031 for elk); without clear support for altering samples, I did not use the thinned samples for comparison.

Models from three options to incorporate historical ranges
The options for using current observation or presence samples from historical ranges produced different outcomes. For current occurrence points but pseudoabsence points outside of the historical range, the distribution increased but about half of the historical ranges remained missing. The options that most completely filled the historical ranges were the 50 presence points in each ecological unit of the historical range, with either the pseudoabsences from the entire North American extent or from outside of the historical range. The latter option more completely filled the historical range (figure 3); indeed, this option had greatly improved accuracy compared to the former option: 0.925 compared to 0.818 for bison and 0.886 compared to 0.812 for elk. Although most of the improvement arose from the true negative rate (0.917 compared to 0.774 for bison and 0.874 compared to 0.754 for elk), true positive rates also were greater by 0.07 for bison and 0.03 for elk (0.932 compared to 0.861 for bison and 0.898 compared to 0.869 for elk).

Comparison of model extents under future climate change
Models of current reduced ranges will predict smaller extents for the future under climate change than models of the historical full ranges (figure 4). For bison, areas of future bioclimate envelope based on current observations were 14%-27% of areas of future bioclimate envelope based on historical ranges. Under the most extreme general circulation model predictions (IPSL-CM6A-LR), for bison, the area of the future bioclimate envelope was 3.1 million km 2 based on current observations, whereas area of the future bioclimate envelope was 11.2 million km 2 based on historical ranges. For elk, areas of future bioclimate envelope based on current observations were 10%-16% of areas of future bioclimate envelope based on historical ranges. Under the most extreme general circulation model predictions (IPSL-CM6A-LR), for elk, the area of the future bioclimate envelope was 1.4 million km 2 based on current observations, whereas area of the future bioclimate envelope was 13.3 million km 2 based on historical ranges. As a check of range contraction that may have been due to climate, rather than extirpation, range extents stayed relatively stable during 1900-1990, 1800-1899, and 1700-1799 compared to 1981-2010.

Discussion
Due to human activities during past centuries, many species ranges have contracted into a reduced climate space, which may result in underestimation of climate tolerances and distributions under future climate change. To examine effects of range contractions on modeling of species to determine suitable space under climate change, I explored solutions for modeling bioclimate envelopes from historical ranges before extirpation. For two example species of bison and elk, which have abundant sample sizes, bioclimate envelopes from current observations had near perfect accuracy, despite the challenge of matching climate to limited reintroductions of extirpated species in the eastern U.S. Nevertheless, the perfect matches missed the historical ranges of bison and elk based on historical maps, similar to results from fossil records (Lima-Ribeiro et al 2017, Faurby and Araújo 2018, Sales et al 2022. Therefore, a successful solution to filling the missing envelopes was to reproduce the historical ranges and generate well-distributed presence samples from the historical range and pseudoabsence samples from outside of the historical range. This solution filled the missing climate envelopes and extended along the edges of the historical ranges, excepting the northernmost ecological region for each species. Typically, observed ranges are smaller than climate envelopes due to constraints on realized ranges such as biotic interactions, barriers to dispersal, and unidentified factors. For example, it is not clear which factors prevented elk and bison from being present historically throughout the entire eastern U.S. Conversely, species that are introduced to areas where constraints are lifted may expand beyond native bioclimate envelopes. Regarding the unfilled northernmost ecological region, additional samples in these ecological regions may push the model to fill these remaining spaces. However, historical ranges are not detailed enough to guide whether supplementary fitting was necessary. The 50 samples per ecological region equaled approximately one sample every 1600 km 2 -2400 km 2 , but more samples may be generated. While simply using background samples was not sufficient to reproduce the historical ranges, a two-step modeling process also may be possible of first producing a model and then selecting pseudoabsence samples with low predicted probability of species presence (Sales et al 2022). Two-step strategies typically over-predict species presence, due to too much environmental distance between the pseudoabsences and recorded presences (Hanberry et al 2012), but the historical ranges are too general to assess well. In any event, the one-step model with pseudoabsence samples from outside of the historical range was efficient.
The approaches to bioclimate envelopes produce different current and future extents, which is extremely relevant to consider for species with known extirpations. Models of current reduced ranges will predict smaller extents for the future under climate change than models of the historical ranges, whether based on fossil records or historical range maps (  Relatively recent range contractions also will reduce suitable areas predicted for the future (Martínez-Freiría et al 2016). For these two species, future predicted extents based on current observations were 10%-27% of future predicted extents based on historical ranges. While the future predicted extents may seem sufficient to meet the needs of species, species require more than just climate for habitat, which will consequently reduce the extent (Deb et al 2020, Sales et al 2022. Furthermore, if suitable habitat shifts and decreases, connectivity issues develop, which would appear very possible for free-ranging elk according to the future bioclimate envelope modeled only from current observations. Similarly, current concentrated locations of managed bison populations become partly unsuitable under the future bioclimate envelope modeled from current observations. In contrast, changing climate is a limited issue for the climate space of elk under the future bioclimate envelope modeled from the historical range. For bison, most of their historical stronghold of the Great Plains grasslands will no longer have optimal climate based on the future bioclimate envelope modeled from the historical range. Nonetheless, bison have limited presence in the Great Plains currently due to private lands and crop fields. For some species, current distributions are unknown, much less historical distributions before land use change and overexploitation. Historical ranges maps for elk and bison matched well with the ranges generated in the PHYLACINE dataset for 146 extant terrestrial large-bodied (>44 kg) mammals developed from climate suitability modeling and historical and paleoecological occurrence records . Therefore, the PHYLACINE dataset may be a potential source of historical ranges, particularly if historical range maps are inaccessible. However, evaluation is critical because current distributions of bison in public lands did not match well with the PHYLACINE current ranges, whereas the GBIF occurrences, used here to model current distributions, aligned with locations of bison in public lands.
One caveat is that subspecies may be genetically suited to the climate of their location and species-level modeling may be inadequate to provide relevant conservation and management information for subspecies (Jinga et al 2021). Here, I separated the historical range of the Plains bison from the more northern wood bison (B. bison athabascae; Gates et al 2010), albeit these subspecies have intermixed in intentional conservation efforts (Gates et al 2010). Six North American subspecies of elk have been described, based on different physical and behavioral characteristics. The eastern elk (Cervus canadensis canadensis) and the southwestern Merriam's elk (C. canadensis merriami) are extinct. Nevertheless, current observations of elk range from the Southwest, which is the hottest and driest location in the U.S., to cold Rocky Mountains and rainy Pacific Northwest locations; therefore, elk as a species currently are tolerating a wide range of climate even if they are not filling their historical climate space. However, modeled climate space based on current observations may be accurate if the species lost traits for tolerance to climate of the historical range during the extirpation process.

Conclusions
Species range contractions due to human influences may lead to underestimation of future potential species distributions under climate change. Here, I explored different options for modeling overexploited species with contracted ranges through use of historical ranges. To more accurately model future climate space, historical ranges are necessary, and one solution for bioclimate envelope modeling is to generate presence samples from the historical range and pseudoabsence samples from outside of the historical range. The current ranges of species that have been extirpated from their historical range will generate a conflicting outcome about the impact of climate change compared with the historical ranges, if species maintain historical climate tolerances. The current range models forecasted areas that were 10%-27% of the areas forecasted by historical ranges. Species need to be assessed for range extirpation before establishing potential effects of climate change on species distributions.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi. org/10.5281/zenodo.7278360.