Introduction

Global climate change is altering the suitable habitat distributions of many biotic communities (Thomas 2010). Species’ distributions on average have shifted 16.9 km poleward per decade (Chen et al. 2011), while climate-driven losses to suitable habitat area have been observed across taxa and geographies (Parmesan 2006; Pecl et al. 2017; Lenoir et al. 2020; Masson-Delmotte et al. 2021; Pörtner et al. 2022). Climate influences species distributions by imposing geographic limits based on species’ physiological tolerances to temperature and precipitation gradients (i.e., the fundamental niche) (Stevens 1989; Gaston and Spicer 2001). As climate change progresses, these geographic limits of tolerable climatic conditions shift in space, forcing species to move or face niche mismatch, which may increase their vulnerability. In northern latitudes, where temperatures are rising at nearly four times the global average rate (Rantanen et al. 2022), shifts in suitable habitat range are affecting proportionally more species at faster rates and over greater areas compared to global estimates (van Beest et al. 2021; Antão et al. 2022; Ramalho et al. 2023). Better understanding the influence of climate on changes to species’ distributions will be critical for planning for, and remaining resilient to, future climate change. Fortunately, advances in computing power, increasing access to species occurrence information, and dissemination of thoroughly tested modeling algorithms have allowed for easy implementation of species distribution models (SDMs) to project change under future climate scenarios and inform management action (Elith et al. 2006; Elith and Leathwick 2009; Sofaer et al. 2019).

Arctic vegetation community distributions are projected to experience significant change in line with arctic warming. Most notably, arctic vegetation communities are expected to undergo “shrubification,” in which traditionally grass- and sedge-dominated communities, especially in tundra areas, are overtaken by expanding woody shrubs and trees. This expansion is made possible by increasing soil temperatures, microbial activity, and growing season length, which allow shrub growth and recruitment in previously unsuitable northern habitats (Tape et al. 2006; Myers-Smith et al. 2011). Some estimates indicate tree lines could move as much as 100 km northward, expanding in range, while lower-lying graminoids’ ranges are expected to shrink (Pearson et al. 2013). The implications of such distribution shifts could be far-reaching. For example, vegetation shifts may alter the arctic carbon cycle, a critical component of global carbon budgets (McGuire et al. 2009), by decreasing regional albedo and increasing solar energy absorption as taller, woodier vegetation grows above snowpack (Mekonnen et al. 2021). Distribution shifts could also impact ecology and ecosystem functioning in several ways; for instance, encroaching woody vegetation could shade out lower-lying shrubs, altering competition dynamics (Myers-Smith et al. 2011), while cold-adapted specialist small mammal distributions could shrink (Baltensperger and Huettmann 2015), disrupting arctic food webs.

A key component of arctic food webs that could be impacted in this way is the group of wild, fleshy fruit-producing plants (hereafter, “berry plants”). Berry plants are ubiquitous in northern latitudes, occurring throughout the boreal, arctic, and polar regions of the northern hemisphere. They are vital components of arctic food webs, often functioning as the only viable food source during critical feeding times, such as when birds fatten prior to migration (Hupp et al. 2013) and when bears supplement their diets because salmon and other preferred sources are unavailable (Deacy et al. 2017). Berries have also been shown to influence the survival of small mammal populations like rodents (Krebs et al. 2010). Furthermore, berry plants are an important resource for human communities across the Arctic, especially Indigenous communities, where the harvest of berries is an important cultural practice to gather a healthy traditional wild food. Berries are also an economical food source in remote, rural areas where imported fruit alternatives are limited and expensive (Redwood et al. 2008; Hupp et al. 2015; Boulanger-Lapointe et al. 2019). Wild berries are a rich source of polyphenolic metabolites, acting as effective anti-obesity and anti-diabetes dietary components (Kellogg et al. 2010), while opportunities for physical activity afforded by berry harvesting are critical in Indigenous communities for exercise and community life (Redwood et al. 2008).

Climate change is already altering wild berry plant productivity and phenology. Extreme winter warming event simulations have demonstrated reductions in berry production, flowering, root growth, gross primary productivity, and increased shoot mortality (Bokhorst et al. 2008, 2011). Alaskan berry plant phenologies have been observed to shift in line with warming trends between seasons, flowering and producing fruit earlier during warmer years with earlier snowmelts compared to cooler years with later snowmelt (Mulder and Spellman 2019). Indigenous communities have been observing these and other climate-induced changes to berry plants for decades, citing decreasing abundance and/or increasing variability in berry production over time and observed phenological shifts of berries ripening earlier in the season (Hupp et al. 2015; Boulanger-Lapointe et al. 2019; Herman-Mercer et al. 2020).

However, less progress has been made in understanding potential suitable habitat shifts of high-latitude berry plant species in response to climate change. Predictions of habitat shifts will be critical in assessing the vulnerability of Alaskan communities to shifting accessibility of berries and in helping to target adaptive management prescriptions on the landscape (e.g., Anderson et al. 2018; Reich et al. 2018). A limited number of studies from other regions have broached this topic, finding minor changes to the suitable habitat distribution of Vaccinium uliginosum L. in the Khingan Mountains region of Northeast China (Li et al. 2022); net expansion of suitable habitat ranges for Vaccinium uliginosum, Vaccinium vitis-idaea L., Vaccinium macrocarpon Aiton, and Vaccinium oxycoccos L. in Northern Canada (Hirabayashi et al. 2022; Seider et al. 2022); and net contraction of the suitable habitat range of huckleberry (Vaccinium membranaceum Douglas ex Torr.) in the Pacific Northwest (Prevéy et al. 2020). These studies provide some insight to potential berry plant habitat shifts, but more targeted assessments are needed to understand what these shifts might look like in Alaska and for species important to Alaskan ecosystems and Indigenous communities.

To help address this need, here we aimed to (O1) identify dominant environmental drivers of berry plant habitat; (O2) predict current berry plant habitat distribution; and (O3) predict how berry plant habitat distribution might change under future climate conditions in the Bristol Bay region of southwestern Alaska. To meet these objectives, we built SDMs focusing on five berry species: Vaccinium uliginosum, Empetrum nigrum L., Rubus chamaemorus L., Vaccinium vitis-idaea, and Viburnum edule (Michx.) Raf. V. uliginosum, E. nigrum, and V. vitis-idaea are generalist shrub species occurring in boreal, arctic, and sub-arctic regions and tolerant of a range of soil conditions from very wet to very dry (Jacquemart 1996; Nestby et al. 2019). R. chamaemorus is a perennial herbaceous forb with a circumpolar distribution and is likely heat-limited, as temperatures exceeding 18 °C diminish its photosynthetic capacity (Marks and Taylor 1978). In general, it prefers acidic peat bog habitat types and wetland soils (Taylor 1971). V. edule is a shrub that grows taller than the other species included in this study, reaching heights of 1 to 2 m; it can be found throughout northern North America, often within wooded areas and at the margins of wetlands and streambanks, and prefers rich, moist soils (Gould et al. 2013).

We expected the variables our models identified as most influential in determining suitable habitat areas to be in line with habitat preferences described in the literature and expert communications (e.g., Jacquemart 1996; Taylor 1971; Nestby et al. 2019) (O1). Next, we expected our models to predict widespread suitable habitat distribution for the three generalist, tundra-associated species (V. uliginosum, E. nigrum, V. vitis-idaea), and restricted distributions for R. chamaemorus and V. edule, under current climate conditions (O2). Third, we expected to observe retractions of suitable habitat distributions in future model outputs for the three tundra species and expansion of suitable habitat distribution for the two specialist species (O3). By identifying key environmental and climatic drivers and generating spatially explicit maps of suitable habitat distributions, our results can offer an evidentiary basis for spatial adaptation planning for conservation of suitable habitat areas for these important food-producing plants under future climates. Furthermore, our results will contribute to further understanding how climate change may cause shifts in the distribution of shrubby plants and reorganization of vegetative communities in the Arctic.

Methods

Study area

We aimed to map the potential spatial distribution of suitable habitat for five key berry plant species across Alaska’s Bristol Bay and Togiak areas (Fig. 1). Located in southwest Alaska, this study region covers 47,561 km² and is home to 31 predominantly Indigenous Alaskan villages, 7259 people, and over 300 wildlife and 500 plant species (Bristol Bay Native Corporation 2016). Berries and other subsistence resources like salmon are integral components of local peoples’ culture and economy in Bristol Bay; its sockeye salmon run is the largest in the world, accounting for roughly half the global supply of sockeye salmon (Clark et al. 2006), while the rate of participation in berry-harvesting in Dillingham, the largest village in Bristol Bay, exceeds 80% with residents collectively harvesting over 6000 gallons of berries annually (Evans et al. 2010). In this way, many Indigenous Alaskans throughout Bristol Bay continue to practice traditional activities like hunting, fishing, and gathering and attribute great cultural significance to the subsistence lifestyle (Bristol Bay Native Corporation 2016).

Fig. 1

Map of the two NRCS survey areas which form the Study Area Boundaries. NRCS sample locations are displayed, overlaid on land cover classes derived from the National Land Cover Database (NLCD) and a hillshade relief derived from elevation DEM. The inset map highlights the Study Area’s position within the state of Alaska

The western Alaskan landscape is characterized as a semiarid treeless zone of discontinuous permafrost. The study area spans the Bristol Bay-Northern Alaska Peninsula Lowlands and the Ahklun Mountains (USDA 2022). In the Bristol Bay area, the physiography is primarily rolling plains, hills, and mountain footslopes while the Ahklun Mountains area primarily consists of low elevation, rugged mountains. Balsam poplar and white spruce forests are common in low-elevation valley bottoms and floodplains. Dwarf and low scrub communities dominated by ericaceous shrubs, feathermoss, and lichen are common on the drier soils of hills and plains, while wetter soils on these same slopes support low shrub and herbaceous peatlands dominated by scrub birch, ericaceous shrubs, sedges, and Sphagnum. Tall alder and low willow scrub are common on the lower half of mountain slopes, while alpine tundra dominates the upper half of mountain slopes. The climate is subarctic with long, cold winters and short, warm summers. Areas proximal to the coast have a maritime influence that quickly becomes continental as distance from the coast increases. Average temperatures range from -1 to 2 degrees C and average annual precipitation is 330 to 1,270 mm (USDA 2022).

Berry plant presence/absence data

We used presence and absence data for five berry plant species collected by the USDA Natural Resources Conservation Service (NRCS) to build SDMs. We selected these five species based on their appearance in the literature, prevalence in the environment, subsistence and cultural significance to human communities (Hupp et al. 2015), and through personal communications with experts and community members.

The NRCS data were collected between 2006 and 2013 as part of routine soil surveys conducted throughout the state. Plot locations for these surveys were selected using purposive sampling in which 405 m² plots were established across dominant ecotone strata throughout the survey areas. At each plot, surveyors recorded the occurrence and percent cover of every plant species in addition to soil metrics including soil horizons, presence of redoximorphic features, parent material, pH, texture, and structure. These data were collected primarily to develop a soil map that spatially illustrates the distribution of soil components present in the survey area and their associated ecological sites (USDA 2022). For our study, we used presence/absence data for 1627 plot-level data points within the study area for each of the five berry plant species.

Climate, soils, and topographic predictor data

For predictor variables, we used a suite of environmental and climate geospatial data layers found to influence berry plant distribution based on past vegetation SDM studies (e.g., Garamvölgyi 2013; Pearson et al. 2013; Niskanen et al. 2019), botanical literature detailing berry species preferred habitat conditions (Bell and Tallis 1973; Jacquemart 1996; Anderson et al. 2018; Nestby et al. 2019), and data availability. In total, we gathered 29 predictor variable layers across three categories of climate, soil, and topography from publicly available online GIS database repositories.

We sourced temperature and precipitation climate variables from the Scenarios Network for Alaska + Arctic Planning (SNAP) database (Scenarios Network for Alaska + Arctic Planning 2015) for current (2006–2013) and future (2081–2100) time periods. We used three dataset groups corresponding to three climate Representative Concentration Pathways (RCPs) to compare changes to berry species suitable habitat distributions under a range of potential future climates. The RCPs represent potential trajectories of radiative forcing agents throughout the twenty-first century and cover the span of trajectories detailed in the climate literature (van Vuuren et al. 2011). Here we selected three of the four RCPs (RCP4.5, RCP6, and RCP8.5) to cover a span of potential emissions scenarios. RCP4.5 and RCP6 are stabilization scenarios that assume collective international policy implementation to reduce greenhouse gas emissions, leading to radiative forcing stabilization at 4.5 W/m² and 6.0 W/m² by 2100, respectively (Thomson et al. 2011). RCP8.5 is considered a high emissions scenario that assumes high population and slow economic growth and failure to enact comprehensive international agreements to reduce greenhouse gas emissions, resulting in radiative forcing of 8.5 W/m² by 2100 (Riahi et al. 2011). SNAP produced these climate layers by averaging across five climate models (CCSM-4, GFDL-CM3, GISS-E2-R, IPSL-CM5A-LR, MRI-CGCM3) to account for model-specific variation and downscaling them to 771 m resolution using PRISM climatological datasets from 1971 to 2000 (Scenarios Network for Alaska + Arctic Planning 2015). For our SDMs, we used modeled RCP4.5 baseline conditions averaged for the present time period (2006–2013) because observation data were not available at this scale for this time period in this region and to improve consistency between present and future time period projections. We selected RCP4.5 for this purpose because it is the closest of our chosen scenarios to present-day (i.e., 2006–2013) observed conditions (Pielke et al. 2022).

We sourced soil variables including pH, organic matter content, clay content, and available water storage from the NRCS Gridded Soil Geographic (gSSURGO) database for Alaska (Soil Survey Staff 2022a). The NRCS developed these datasets for resource planning and soils analysis purposes by converting vector-based soil map component data from two soil surveys [the Bristol Bay-Northern Alaska Peninsula, North and Bordering Areas and Togiak National Wildlife Refuge-Ahklun Mountains Area surveys (Soil Survey Staff 2022b, c)] into gridded statewide data layers at 30 m spatial resolution using weighted averaging.

We retrieved elevation data from the ArcticDEM project of the Polar Geospatial Center (Porter et al. 2018) at 32 m resolution from which we produced slope and aspect layers using the “terrain” function from the raster package in R (Hijmans 2022).

We harmonized all spatial predictor layers to a common resolution (30 m), alignment, and coordinate reference system (NAD 1983 State Plane Alaska 6 FIPS) using the nearest neighbor method of the “resample” function from the raster package in R (Hijmans 2022). Then, we extracted these grid values to the berry plant presence/absence point dataset to be used in model training and to a common raster grid across the study area to be used in predictive distribution mapping. All spatial data manipulation was performed in R version 4.1.3 (R Core Team 2022) using packages raster (Hijmans 2022) and sf (Pebesma 2018).
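As an illustration of the nearest neighbor idea behind this harmonization step, the following is a minimal Python sketch, not the raster package's "resample" implementation. It assumes, for simplicity, that the source and target grids share the same origin and extent; the function name `resample_nearest` and the toy grid are hypothetical.

```python
def resample_nearest(grid, src_res, dst_res):
    """Resample a 2-D grid (list of rows) from src_res to dst_res cell size.

    Assumption: both grids share the same upper-left origin and extent,
    so each target cell takes the value of the source cell containing
    its centre (i.e., the nearest source cell centre).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out_rows = int(round(n_rows * src_res / dst_res))
    out_cols = int(round(n_cols * src_res / dst_res))
    out = []
    for r in range(out_rows):
        y = (r + 0.5) * dst_res                       # target cell centre (map units)
        src_r = min(int(y // src_res), n_rows - 1)    # source row containing it
        row = []
        for c in range(out_cols):
            x = (c + 0.5) * dst_res
            src_c = min(int(x // src_res), n_cols - 1)
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


coarse = [[1, 2], [3, 4]]                 # toy 2 x 2 grid with 60 m cells
fine = resample_nearest(coarse, src_res=60, dst_res=30)
for line in fine:
    print(line)
```

Nearest neighbor resampling copies existing cell values rather than interpolating, so a coarse climate layer resampled to a finer grid gains alignment with the other layers but no additional spatial detail.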

Model development and evaluation

We used the Random Forests (RF) algorithm (Breiman 2001) to build SDMs for five berry plant species. RF is an ensemble classification and regression tree (CART; see: Breiman et al. 1984) algorithm well suited to the development of SDMs because it is capable of handling many high-volume predictor datasets across scales and resolutions, and is able to resolve non-linear relationships between variables. Because species distribution patterns are determined by environmental structure across landscape gradients (Guisan and Zimmermann 2000), RF is an effective technique for its capacity to manage the many geospatial data inputs required for estimating landscape-scale distribution from point training data.

We tuned and ran SDMs in R version 4.1.3 (R Core Team 2022). To do so, we first specified input predictor variables and model hyperparameters for each species. We began variable selection with the “vifstep” function of the usdm package in R (Naimi et al. 2014) to remove highly collinear predictors. Although RF is a nonparametric technique that is relatively robust to problematic variable collinearity and overfitting, variable selection has nevertheless been shown to improve variable importance estimates and model interpretability (Biau and Scornet 2016; Genuer et al. 2010). “vifstep” removes collinear predictors according to the variance inflation factor (VIF) of each variable. A VIF above 10 is indicative of problematic predictor collinearity (Chatterjee and Hadi 2006). Thus, we ran “vifstep” with a VIF threshold of 10. This stepwise procedure removed the variable with the highest VIF above the threshold and then recalculated VIF for the remaining variables, repeating the process until no variable exceeded the threshold. Then, we used the VSURF package in R (Genuer et al. 2022) to select a subset of the remaining variables based on RF’s internal variable importance ranking functionality. This procedure facilitated clearer interpretation of results by removing weakly influential and confounding variables from the analysis. Variable subsets were determined for each species using the “Threshold” VSURF option, which retains variables highly related to the response, even if somewhat redundant.
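The stepwise VIF procedure can be sketched as follows. This is an illustrative pure-Python version, not the usdm implementation: it assumes the usual definition VIF = 1 / (1 − R²), with R² taken from an ordinary least squares regression of each predictor on all the others, and the predictor names and values are invented for the example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ZeroDivisionError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def vif(data, target):
    """VIF of `target`: 1 / (1 - R^2) from regressing it on the other columns."""
    others = [k for k in data if k != target]
    y = data[target]
    n = len(y)
    rows = [[1.0] + [data[k][i] for k in others] for i in range(n)]  # intercept + predictors
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(p)]
    try:
        beta = solve(xtx, xty)
    except ZeroDivisionError:        # perfectly collinear predictors
        return float("inf")
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[i] - sum(bk * rk for bk, rk in zip(beta, rows[i]))) ** 2 for i in range(n))
    r2 = 1.0 - ss_res / ss_tot
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


def vif_step(data, threshold=10.0):
    """Drop the highest-VIF predictor until none exceeds the threshold."""
    data = dict(data)
    while len(data) > 1:
        scores = {k: vif(data, k) for k in data}
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        del data[worst]
    return list(data)


# Hypothetical toy predictors: July temperature is almost a linear function of elevation.
predictors = {
    "elev": [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0],
    "temp_jul": [13.6, 12.9, 12.4, 12.1, 11.6, 10.9, 10.4, 10.1],
    "ph": [6.0, 6.0, 5.0, 5.0, 5.0, 5.0, 6.0, 6.0],
}
kept = vif_step(predictors, threshold=10.0)
print(kept)
```

In this toy case elevation and July temperature are nearly interchangeable, so one of the two is dropped and the uncorrelated soil pH column is retained.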

Following variable selection, we calibrated RF model hyperparameters using the “train” function of the caret package in R (Kuhn 2008). We used this function first to find the optimal “mtry” parameter, which specifies the number of predictor variables to be used at each tree split, using an out-of-bag (OOB) error estimation. Then, we employed this same procedure to find the optimal “ntree” parameter, which specifies the number of decision trees to be grown. For two species, R. chamaemorus and V. edule, we corrected for class imbalance during this procedure by downsampling the majority class (absence class) to equal the number of samples of the minority class (presence class) (Chen et al. 2004). Class imbalance is a common issue in species distribution modeling, especially when dealing with rare species and/or a high number of background (absence/pseudo-absence) data points (Valavi et al. 2021), and is important to correct for because it can result in biased classification accuracy due to an overrepresentation of the majority class in model fitting (Evans et al. 2011). This downsampling occurs at the tree level by randomly sampling from the majority class with replacement, and thus still makes use of information embedded within the majority class across the high number of trees grown. This technique has been shown to be effective in improving model accuracy over the default sampling regime for imbalanced data (Valavi et al. 2021). Finally, we ran all models, specifying custom hyperparameter values according to the outputs generated from the procedure above, using the ModelMap package (Freeman et al. 2018), which implements the RF algorithm through the randomForest package in R (Liaw and Wiener 2002).
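A minimal Python sketch of tree-level downsampling is given below. It is an assumption-laden illustration rather than the caret or randomForest code: here every minority-class row is kept and the majority class is drawn with replacement to the same size, once per tree; the function name and the toy labels are hypothetical.

```python
import random


def downsample_majority(labels, rng):
    """Return row indices for one tree: all minority rows plus an equally
    sized random draw (with replacement) from the majority class.

    Assumption: labels are coded 1 = presence, 0 = absence.
    """
    presence = [i for i, y in enumerate(labels) if y == 1]
    absence = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted([presence, absence], key=len)
    sampled = [rng.choice(majority) for _ in minority]
    return minority + sampled


labels = [1] * 3 + [0] * 17            # toy data: 3 presences, 17 absences
rng = random.Random(42)
per_tree = [downsample_majority(labels, rng) for _ in range(5)]   # one balanced sample per tree
for idx in per_tree:
    print(sorted(idx))
```

Because a fresh draw is made for each tree, different absence rows enter different trees, which is how the forest as a whole still uses information from the full majority class.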

We evaluated the effectiveness of our models using two metrics: a threshold-dependent classification error rate, expressed as the OOB error rate; and a threshold-independent metric, the area under the curve of the receiver operating characteristic (AUC ROC). The OOB error rate reflects model misclassification on a subset of data automatically withheld from the bootstrap sample used to build trees. The AUC ROC is a well-established metric for evaluating systems aiming to distinguish between two classes of events without relying on a probability threshold at which classes are separated (Swets 1988), and is given by the area under the curve of the plot of sensitivity (the true positive rate) against 1-specificity (i.e., the false positive rate). An AUC of 0.5 is indicative of a model unable to discriminate between classes better than random, while an AUC of 1.0 reflects a model with perfect discrimination capacity. AUCs above 0.7 are generally considered acceptable (Swets 1988). We report both of these metrics because of (1) the relevance of threshold-dependent binary classification accuracy given our interest in evaluating species’ potential distribution change, calculated by differencing binary classification maps; and (2) the robustness of the ROC metric for threshold-independent model evaluation.
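To make the two metrics concrete, the sketch below computes a misclassification rate at a fixed threshold and the AUC via its rank interpretation (the probability that a randomly chosen presence receives a higher score than a randomly chosen absence, with ties counted as one half). It is illustrative Python with invented labels and scores; in practice the scores would be out-of-bag predictions, which this example does not generate.

```python
def auc_roc(y_true, scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rate(y_true, scores, threshold=0.5):
    """Share of observations misclassified at the given probability threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    return sum(p != y for p, y in zip(pred, y_true)) / len(y_true)


y_true = [1, 1, 1, 0, 0, 0]                  # toy presence/absence labels
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]      # toy predicted probabilities
auc = auc_roc(y_true, scores)
err = error_rate(y_true, scores, threshold=0.5)
print(round(auc, 3), round(err, 3))
```

Here eight of the nine presence/absence pairs are ordered correctly, while two of the six observations fall on the wrong side of the 0.5 threshold, which shows how the two metrics can diverge for the same set of predictions.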

For one species, R. chamaemorus, where we expected soil variables to be particularly important relative to climate, we conducted an additional analysis (following Feng et al. 2020) building separate models for soil and climate variables, respectively (see Supplementary Materials S1.6 for Methods detailing the development of these separate models).

Analysis

We extracted ranked variable importance information from our RF models and constructed variable importance and partial dependence plots to make inference about the relationships between climate and environmental predictor variables and habitat suitability for each species. We reported relative variable importance for each model according to the mean decrease in accuracy metric which reflects the decrease in model predictive accuracy resulting from a stepwise removal of each predictor variable in the model (Liaw and Wiener 2002). We also constructed partial dependence plots for the top three ranked variables for each species. Partial dependence plots visualize the relationship between individual predictor variables and the predicted probabilities of classification in the positive class.
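The idea behind a partial dependence curve can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the plotting routine used in the study: `toy_predict` is a hypothetical stand-in for a fitted model's predicted probability of the suitable class, and the rows and grid values are invented.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average predicted probability over all rows with `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        # Overwrite the focal feature in every row; other predictors keep observed values.
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append(sum(preds) / len(preds))
    return curve


def toy_predict(row):
    """Hypothetical model: probability rises with elevation up to 200 m, falls with pH."""
    p = 0.2 + 0.003 * min(row["elev"], 200) - 0.05 * (row["ph"] - 5.0)
    return max(0.0, min(1.0, p))


rows = [{"elev": 50, "ph": 5.0}, {"elev": 300, "ph": 6.0}]
grid = [0, 100, 200, 400]
curve = partial_dependence(toy_predict, rows, "elev", grid)
for value, prob in zip(grid, curve):
    print(value, round(prob, 3))
```

The toy curve rises with elevation and then flattens beyond 200 m, the kind of plateau a partial dependence plot is designed to reveal.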

Model outputs reflect suitable habitat distributions (not occurrence) because the approach employed here (predictive RF modeling) does not account for dynamic ecological processes which factor into real-world spatial distribution of vegetation including dispersal potential, germination, recruitment, and other factors. They also do not reflect the suitable range for fruiting and potential harvesting range, as these species can all be present and reproduce asexually in suboptimal conditions for fruiting. Rather, this approach finds associations between environmental conditions and species occurrence (and absence) and is thus more descriptive of the suitability of a given location to support each species (i.e., the realized species niche vs. the fundamental species niche (Guisan and Zimmermann 2000)). Within this framework, we mapped suitable habitat across the study area corresponding to each of the five models we produced using the ModelMap package in R (Freeman et al. 2018).

To predict the distribution of suitable habitat under future climate scenarios, we substituted future projected climate layers into the five models during the map building process. This approach assumes a static relationship between climate variable influence and classification probability over time; thus, the future projection is based on extrapolation of this relationship from the present time period model. All other model predictors (soils, topography) were unchanged. This procedure produced 15 suitable habitat distribution probability maps corresponding to each of three future RCP projections and five berry species. To visualize change in the probability of classification between time periods, we subtracted the present time period probability surface from each future time period surface, resulting in probability change maps.
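The substitution step can be sketched as follows: the same fitted model is applied twice, once with present and once with future climate values, while static predictors are held fixed, and the two probability surfaces are differenced cell by cell. This is illustrative Python only; `toy_model`, the predictor names, and all values are hypothetical.

```python
def project(predict, static_cells, climate_cells):
    """Predict per cell from static predictors combined with one set of climate values."""
    return [predict({**s, **c}) for s, c in zip(static_cells, climate_cells)]


def toy_model(cell):
    """Hypothetical fitted relationship, assumed unchanged between time periods."""
    p = 0.5 + 0.05 * (cell["tjul"] - 12.0) - 0.1 * (cell["ph"] - 5.0)
    return max(0.0, min(1.0, p))


static_cells = [{"ph": 5.0}, {"ph": 6.0}]              # soils/topography: unchanged
present_climate = [{"tjul": 12.0}, {"tjul": 12.0}]
future_climate = [{"tjul": 16.0}, {"tjul": 16.0}]      # substituted future layer

present_prob = project(toy_model, static_cells, present_climate)
future_prob = project(toy_model, static_cells, future_climate)
change = [f - p for p, f in zip(present_prob, future_prob)]   # future minus present
print([round(c, 3) for c in change])
```

Positive values mark cells where the predicted probability of suitable habitat increases, and negative values mark cells where it decreases.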

To quantify suitable habitat across the study area, we next converted probabilities to binary classes of “suitable” and “unsuitable” habitat at a threshold probability value of 0.5. We selected this threshold value for consistency between models and because we found, through testing of four other thresholds using the PresenceAbsence package in R (Freeman and Moisen 2008), that some models’ classifications were sensitive to the selected threshold values (see Supplementary Materials S2). Because of this sensitivity, we present results and evaluate the accuracy of our predictions regarding patterns of suitable habitat distribution both through quantification of binary classification maps and through threshold-independent outputs including probability distribution and probability distribution change maps.

To specifically quantify suitable habitat change, we first calculated the amount of suitable habitat area for each species under each climate scenario by summing suitable habitat cells for each binary classification surface and converting to square kilometers. Then, we calculated raw change in suitable habitat area by subtracting the predicted suitable habitat area amount in the present time period from predicted suitable habitat area amount from three climate futures. We also estimated geographic patterns of distribution change across elevation and latitude gradients by extracting elevation and latitude values of suitable habitat cells for each binary classification surface, calculating the average value, and subtracting future predictions from the present time period estimate.
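The arithmetic behind these summaries can be illustrated with a short Python sketch. It assumes 30 m cells (the common resolution stated above), treats probabilities greater than or equal to 0.5 as suitable (whether the boundary value itself counts as suitable is an assumption), and uses invented cell values; the function names are hypothetical.

```python
CELL_SIZE_M = 30.0   # common raster resolution after harmonization


def classify(prob_cells, threshold=0.5):
    """Binary suitable/unsuitable classes from probabilities."""
    return [p >= threshold for p in prob_cells]


def suitable_area_km2(suitable, cell_size_m=CELL_SIZE_M):
    """Number of suitable cells times cell area, converted from m^2 to km^2."""
    return sum(suitable) * cell_size_m ** 2 / 1e6


def mean_of_suitable(values, suitable):
    """Mean of a gradient (e.g., elevation or latitude) over suitable cells."""
    picked = [v for v, s in zip(values, suitable) if s]
    return sum(picked) / len(picked) if picked else float("nan")


present_prob = [0.8, 0.6, 0.4, 0.2]        # toy probability surfaces
future_prob = [0.9, 0.3, 0.7, 0.6]
elevation_m = [100.0, 200.0, 300.0, 400.0]

present_bin = classify(present_prob)
future_bin = classify(future_prob)

area_change = suitable_area_km2(future_bin) - suitable_area_km2(present_bin)
elev_shift = mean_of_suitable(elevation_m, future_bin) - mean_of_suitable(elevation_m, present_bin)
print(round(area_change, 6), round(elev_shift, 2))
```

In this sketch, change is expressed as the future value minus the present value, so a positive area change indicates net expansion and a positive elevation shift indicates suitable habitat moving upslope on average.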

Results

Predictive performance and variable importance

The five SDMs all showed reliable predictive performance according to the threshold-independent AUC ROC metric (AUC > 0.7, Table 1). The threshold-dependent metric, OOB error, also indicated relatively reliable predictive performance, though error rates varied between models (Table 1).

Table 1 RF SDMs summary: five berry plant species models

Top predictors varied between the species modeled. Unexpectedly, elevation appeared within the top two most important variables for three of the five species (Fig. 2a, b, d). For each of these species, the probability of classification in the suitable habitat class increased with increasing elevation to ~200 m and then plateaued (Fig. 3a, b, d), and the mean decrease in accuracy for elevation ranged from 27.8% for V. uliginosum to 80.0% for E. nigrum.

Fig. 2

Variable importance rankings for five berry plant species. Rankings were determined by the “Mean Decrease in Accuracy” metric which reflects the decrease in model predictive accuracy resulting from a stepwise removal of each predictor variable in the model. Climate variables are colored purple, soils green, and topographic orange

Fig. 3

Partial dependence plots for top three variables for each of five species modeled. Partial dependence plots visualize the relationship between individual predictor variables and the predicted probabilities of classification in the positive class

Following elevation, soil variables including pH, available soil water storage, and soil organic matter content ranked highly across various models. For example, in line with our predictions, soil pH (0 to 30 cm depth) and soil water storage were the first and fourth most important variables for V. uliginosum, with the probability of classification in the suitable habitat class decreasing with increasing pH and increasing with increasing soil water storage (Fig. 3a) and showing a mean decrease in accuracy of 28.7% and 26.0% respectively (Fig. 2a). Organic matter content (0–30 cm depth), organic matter content (30–100 cm depth), and soil pH (0–30 cm depth) were the first, second, and third most important variables for R. chamaemorus, with the probability of classification in the suitable habitat class positively correlated with each of these variables (Fig. 3c), and showing a mean decrease in accuracy of 9.8%, 9.6%, and 9.2%, respectively (Fig. 2c).

Across models, January temperature, July temperature, and July precipitation appeared most frequently as relatively important climate predictors. July temperature was the second most important variable for V. edule (Fig. 3e), showing a mean decrease in accuracy of 8.2% (Fig. 2e), and the third most important for V. uliginosum and E. nigrum, showing mean decreases in accuracy of 26.2% and 54.1%, respectively (Fig. 2a, b). January temperature was the second most important variable for E. nigrum and the fourth most important for V. vitis-idaea, with the probability of classification in the suitable habitat class decreasing with increasing January temperature for both of these species (Fig. 3b, d) and mean decreases in accuracy of 61.4% and 46.97%, respectively (Fig. 2b, d). January temperature was also the highest ranked of all climate variables for R. chamaemorus (Fig. 2c).

Current suitable habitat

Current time period (2006–2013) probability distribution maps showed variability in predicted patterns of viable habitat across the study area for each of the species. For three species (V. uliginosum, E. nigrum, and V. vitis-idaea), the highest probabilities were generally located in the eastern and northern regions of the study area, while lower probabilities were in the western parts of the study region (Fig. 4a, b, d). Meanwhile, the map for R. chamaemorus showed high probability of suitable habitat in the low-lying wetland areas of the south and center of the study region, and low probability in the mountainous northeast and west (Fig. 4c). V. edule showed low probability of suitable habitat throughout the majority of the study region, with its highest probabilities concentrated in the boreal forested center-west and loamy flood plains along the Nushagak River (Fig. 4e).

Fig. 4

Maps depicting species-specific probability of suitable habitat classification under current climate conditions (2006–2013) based on Random Forests model outputs using presence/absence data collected in NRCS surveys during the same time period. Values range from 0 to 1 on a scale from yellow to green

Binary classification maps, which converted probabilities into classes of suitable and unsuitable habitat, highlighted these patterns further and revealed species-specific variability in predicted suitable habitat areas (Fig. 5). Under present climate conditions, these maps estimated a majority of the study region to be suitable habitat for three species (V. uliginosum: 65.59%, E. nigrum: 62.81%, V. vitis-idaea: 61.64%) and only a minority to be suitable for the other two (R. chamaemorus: 35.65%, V. edule: 7.31%), results that are in line with our expectations and current knowledge of the rarity of these species (pers. comm. J. Johanson & B. Spellman, NRCS). Three species, V. uliginosum, E. nigrum, and V. vitis-idaea, were predicted to have suitable habitat spread throughout much of the study region, concentrated especially in its northern and eastern halves (Fig. 5a, b, d) and at higher mean elevations (268 m, 291 m, and 275 m, respectively). In contrast, R. chamaemorus’ predicted suitable habitat area was concentrated in the lower elevation south, central, and north central regions along the Nushagak River, mapping closely to the distribution of soil organic matter content in the region (Fig. 5c). V. edule was predicted to have relatively little suitable habitat area throughout much of the region, with some small pockets in the west, in the forested center-west, and along the Nushagak River (Fig. 5e).

Fig. 5

Maps depicting species-specific suitable habitat classification under current climate conditions (2006–2013) based on Random Forests model outputs using presence/absence data collected in NRCS surveys during the same time period. Pixels were converted from probabilities to suitable/unsuitable habitat classes at a 0.5 probability threshold. Suitable habitat is depicted in green and unsuitable habitat in white
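The conversion from probabilities to habitat classes described in this caption, and the percentage of the region classed as suitable, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's workflow: the grid values are made up, all cells are assumed to cover equal area, and treating a probability of exactly 0.5 as suitable is an assumption.

```python
def classify(prob_grid, threshold=0.5):
    """Convert a grid of probabilities into 1 (suitable) / 0 (unsuitable).

    Assumption: a cell exactly at the threshold counts as suitable.
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_grid]

def percent_suitable(binary_grid):
    """Percentage of cells classed as suitable (equal-area cells assumed)."""
    cells = [c for row in binary_grid for c in row]
    return 100.0 * sum(cells) / len(cells)

# Made-up 3 x 4 probability surface.
probs = [[0.10, 0.40, 0.55, 0.80],
         [0.20, 0.50, 0.65, 0.90],
         [0.05, 0.30, 0.45, 0.70]]

binary = classify(probs)
print(binary)                    # [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]]
print(percent_suitable(binary))  # 50.0
```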

Future suitable habitat

Future climate models predicted all species to exhibit both positive and negative suitable habitat probability changes in response to climatic change into 2081–2100 across the study area (Fig. 6). E. nigrum and V. vitis-idaea exhibited increases in the central region of the study area, while increases appeared in the south for V. uliginosum, in the east, center-south, and northwest for R. chamaemorus, and across the vast majority of the study area for V. edule. Areas of decreasing probability appeared in the northern half of the region for most of the species (excluding V. edule), with other patterns of decreasing probability varying between species and RCPs.

Fig. 6

Maps depicting species-specific probability of suitable habitat classification for five berry plant species for the present time period (2006–2013; first row) and three future RCP scenarios (2081–2100)

Binary distribution change maps further highlighted that spatial patterns of species’ predicted habitat losses, gains, and persistence were variable across the landscape (Fig. 7). Broadly, E. nigrum and V. vitis-idaea were estimated to lose substantial suitable habitat area in the south and center-north of their predicted distributions. R. chamaemorus was predicted to lose suitable habitat area concentrated in the center-west of the study area but make gains in other locations, and V. edule was projected to lose most of its north-central habitat.

Fig. 7

Maps depicting species-specific change in suitable habitat classification based on a 0.5 probability threshold for five berry plant species between two time periods, 2006–2013 and 2081–2100, and under three RCP scenarios. Suitable habitat loss is depicted in red, while suitable habitat gains are depicted in blue

Quantification of these changes at the landscape scale using binary classifications revealed net retractions in suitable habitat area across the study area by the end of the century across RCPs for all species (Table 2), with the greatest declines predicted to occur under RCPs 6.0 and 8.5 and the least under RCP 4.5 (average: 26.01%, 23.62%, and 17.99%, respectively). Counter to our expectations, according to this metric V. edule and R. chamaemorus were not predicted to gain suitable habitat area across all RCPs; V. edule was predicted to lose roughly half of its predicted suitable habitat area across RCPs, while R. chamaemorus was predicted to experience a small net gain of suitable habitat area under RCP 4.5 (1.0%) and small net losses under RCPs 6.0 and 8.5 (9.3% and 4.4%, respectively).

Table 2 Suitable habitat change metrics: estimated suitable habitat area metrics for five berry species for two time periods (2006–2013 and 2081–2100) under 3 RCPs (4.5, 6.0, 8.5)
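The loss, gain, and persistence classes mapped in Fig. 7 and the net change summarized in Table 2 both follow from comparing a current and a future binary map cell by cell. The Python sketch below shows that comparison on made-up grids. Expressing net change as a percentage of the current suitable area is an assumption about the metric's definition, and the function names are hypothetical.

```python
from collections import Counter

def change_classes(current, future):
    """Label each cell by comparing current and future binary classes."""
    labels = {(1, 0): "loss", (0, 1): "gain", (1, 1): "persist", (0, 0): "absent"}
    return [[labels[(c, f)] for c, f in zip(crow, frow)]
            for crow, frow in zip(current, future)]

def net_change_percent(current, future):
    """Net change in suitable cells as a percentage of the current count.

    Assumption: equal-area cells, and change measured relative to the
    current suitable area.
    """
    now = sum(map(sum, current))
    later = sum(map(sum, future))
    return 100.0 * (later - now) / now

# Made-up binary maps (1 = suitable, 0 = unsuitable).
current = [[1, 1, 0, 0],
           [1, 1, 1, 0],
           [1, 1, 1, 0]]
future = [[0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 1, 1]]

classes = change_classes(current, future)
counts = Counter(label for row in classes for label in row)
print(dict(counts))
print(net_change_percent(current, future))  # -25.0
```

In this toy case four cells are lost and two are gained, so the map shows gains even though the net change is a retraction; the same distinction applies when reading a change map alongside a net-change table.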

Most species’ suitable habitat distributions were predicted to move little, on average, across the elevation gradient (see Supplementary Material, Table S1.5). Average movement among species was slightly upslope across RCPs 4.5, 6.0, and 8.5 (4.6 m, 26.2 m, and 18.3 m, respectively).

Average predicted suitable habitat distribution shifts across the latitudinal gradient exhibited similarly varied patterns, with two species’ habitat distributions predicted to move northward on average under RCPs 4.5, 6.0, and 8.5 (E. nigrum and V. vitis-idaea), one species moving slightly southward (V. edule) and two species with little net movement in either direction (V. uliginosum and R. chamaemorus).
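One simple way to quantify average movement along the elevation and latitudinal gradients is to compare the mean elevation (or latitude) of suitable cells between time periods. The Python sketch below does this on made-up grids; whether the study computed shifts in exactly this way is an assumption, and the grids and function name are hypothetical.

```python
def mean_over_suitable(binary_grid, value_grid):
    """Mean of value_grid over the cells classed as suitable (1)."""
    vals = [v for brow, vrow in zip(binary_grid, value_grid)
            for b, v in zip(brow, vrow) if b == 1]
    return sum(vals) / len(vals)

# Made-up elevation (m) and latitude (degrees N) for a 2 x 4 grid.
elevation = [[400, 300, 200, 100],
             [350, 250, 150, 50]]
latitude = [[60.0, 60.0, 60.0, 60.0],
            [59.0, 59.0, 59.0, 59.0]]

# Made-up current and future binary suitability maps.
current = [[0, 1, 1, 1],
           [0, 1, 1, 1]]
future = [[1, 1, 1, 0],
          [1, 1, 0, 0]]

elev_shift = (mean_over_suitable(future, elevation)
              - mean_over_suitable(current, elevation))
lat_shift = (mean_over_suitable(future, latitude)
             - mean_over_suitable(current, latitude))
print(f"elevation shift: {elev_shift:+.1f} m")   # positive = upslope
print(f"latitude shift: {lat_shift:+.2f} deg")   # positive = northward
```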

Discussion

We constructed SDMs for five important wild berry plant species across the Bristol Bay region of southwestern Alaska. Models identified key drivers influencing predicted habitat suitability in elevation, soil variables like soil pH and available water storage, and climate variables like January and July temperatures, mostly in line with our expectations surrounding each species’ habitat preferences. Current suitable habitat distribution maps aligned with our expectations and assessments from experts in terms of geographic patterns and areas. Using future climatic state projections drawn from Alaska-specific climate models, we predicted variable changes to the probability of suitable habitat among species across the study area and across RCP scenarios; quantification of predicted habitat changes revealed net retractions in suitable habitat area for all five species, aligning with our expectations for three species (V. uliginosum, E. nigrum, V. vitis-idaea), and running counter to them for two (R. chamaemorus, V. edule).

Drivers of berry plant habitat distribution

Elevation appeared as an important predictor of suitable habitat classification, ranking within the top two important variables for three SDMs (V. uliginosum, E. nigrum, and V. vitis-idaea). Elevation is commonly understood as an indirect variable in SDMs in that its effects typically have to do with correlative factors along the altitudinal gradient like temperature and moisture regimes (Guisan and Zimmermann 2000; McCain and Grytnes 2010; Oke and Thompson 2015). These correlated climate variables change more rapidly with elevation than with latitude, which may help to explain elevation’s strong influence in limiting suitable habitat areas in this study, especially considering the relatively narrow latitudinal breadth of the study area. In contrast, the range in elevation across the study area is wide (−19.93 to 1526.35 m) due to the presence of both very low-lying areas at the mouth of Bristol Bay and along the Nushagak River and very high elevation areas of the Kuskokwim Mountains in Togiak in the west and the westernmost parts of the Alaska Mountains in the east. These elevation extremes manifest as different habitat types on the landscape which could bind species to certain areas based on their tolerance for correlated elevation-adjacent conditions. Because we included elevation data in our models at a much finer spatial resolution than our climate variables, elevation may have better captured these realized microclimate conditions on the ground at a scale relevant to berry plant species compared to broad-scale climate layers. The three species for which elevation was an important variable all exhibited a similarly positive marginal relationship with elevation; this pattern could be reflective of this correlation between elevation and temperatures, with these species having an affinity for colder temperatures at higher elevations.

Soil characteristics have long been recognized to strongly influence plant population dynamics (Bever et al. 1997), but until recently they have been largely overlooked in SDM studies due to limited access to accurate datasets at the spatial scales needed to be relevant for analysis (Thuiller 2013; Roe et al. 2022; Xu et al. 2023). Increasingly, as these datasets have become more widely available, SDM studies have demonstrated the importance of soil variables in determining species distributions (Coudun et al. 2006; Beauregard and de Blois 2014; Chauvier et al. 2021; Roe et al. 2022; Xu et al. 2023). We leveraged one such dataset that included co-located soil and plant presence/absence training data and showed that soil variables were important predictors of suitable habitat, consistent with our predictions. For example, V. uliginosum’s most influential predictor was soil pH (0–30 cm depth), with the partial dependence of the probability of suitable habitat classification on this variable rising to a pH of ~5 before sharply declining (Fig. 3a). Meanwhile, its fourth most important variable was available water storage. These rankings are consistent with descriptions of V. uliginosum’s affinity for acidic, peaty soils and bog habitats (Jacquemart 1996; Holloway 2006; Nestby et al. 2019). Soil variables were also very important for R. chamaemorus, representing eight out of ten top variables in its model.
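Partial dependence curves such as those in Fig. 3 are obtained by fixing one predictor at each value along a grid, leaving all other predictors at their observed values, and averaging the model's predicted probabilities. The Python sketch below shows the calculation with a stand-in probability function. The function `toy_proba`, its peak at pH 5, and the data are made up for illustration and are not the fitted V. uliginosum model.

```python
def partial_dependence(predict_proba, rows, feature, grid):
    """Average predicted probability with `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        # Overwrite the feature for every row; other predictors stay as observed.
        preds = [predict_proba(dict(r, **{feature: value})) for r in rows]
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical stand-in for a fitted model's probability output.
def toy_proba(row):
    ph_term = max(0.0, 1.0 - abs(row["soil_ph"] - 5.0))   # peaks at pH 5
    water_term = min(1.0, row["water_storage"] / 20.0)    # saturates at 20
    return 0.5 * ph_term + 0.5 * water_term

# Made-up sites.
rows = [{"soil_ph": 4.5, "water_storage": 10.0},
        {"soil_ph": 5.5, "water_storage": 20.0},
        {"soil_ph": 6.0, "water_storage": 30.0}]

grid = [4.0, 4.5, 5.0, 5.5, 6.0]
curve = partial_dependence(toy_proba, rows, "soil_ph", grid)
for ph, p in zip(grid, curve):
    print(f"pH {ph:.1f}: mean probability {p:.3f}")
```

Because the other predictors are averaged over, the curve shows the marginal relationship for one variable and can hide interactions between predictors.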

Climate variables were ranked with middling relative importance among models, appearing as the top variable for only one model (V. edule), but within the top five for four models. January and July temperature in particular appeared in several models’ top variable importance rankings, suggesting that changes in average temperatures during these months may be the most significant climatic drivers of shifting berry plant habitat suitability in the future. Rising temperatures shift habitat distributions by delimiting zones of species-specific physiological tolerances; for berry plants, changes to temperature minima and maxima may cross thresholds relevant to life history processes like germination and bud break, which may have a cold temperature requirement (Bell and Tallis 1973; Jacquemart 1996). As minimum winter temperatures increase with climate change, some areas in the study region may move out of zones where this temperature threshold exists, limiting areas of suitable habitat. This could be the ecological explanation for the relationships that the V. uliginosum, E. nigrum, and V. vitis-idaea models predicted between the probability of suitable habitat and January temperatures, all of which show a stark decline in probability at ~ 9 °C (see Supplementary Material, Fig. S1.2).

Habitat distribution shifts

Binary classification maps projected net retractions in suitable habitat area for all five species we tested, slight northward and upslope movement for R. chamaemorus and V. vitis-idaea, northward and downslope movement for V. uliginosum and E. nigrum, and southward and downslope movement for V. edule. These results were in line with our expectations for the three tundra-associated species (V. uliginosum, E. nigrum, and V. vitis-idaea) and consistent with observations that show, on average, species will move poleward in latitude following shifting bioclimatic envelopes in those directions driven by climate change, especially at trailing edges of species distributions (Breshears et al. 2008; Thomas 2010). Meanwhile, these results ran counter to our expectations for the two specialist species, R. chamaemorus and V. edule, which we expected to benefit from warming temperatures and shifting precipitation regimes. However, when examining the probability distribution change maps for R. chamaemorus and V. edule, gains in probability of suitable habitat classification across the majority of the study region were evident (see Supplementary Material, Fig. S1.4), indicating that future climate conditions in Bristol Bay may in fact be more conducive to supporting these species. This interpretation is more in line with our expectations, particularly for potential habitat expansion for V. edule, given its more southerly, warm-adapted known distribution today, its affinity for forested habitats (which are projected to expand in Bristol Bay under future climates; Pearson et al. 2013), and its positive relationship with climate variables (July precipitation, July temperatures, January temperatures) illustrated in its partial dependence plots (Fig. 3e). This discrepancy between binary classification and probability maps is likely driven by the small number of positive observations of these species in our training data (low prevalence), leading to an underestimation of positive classifications at the default (0.5) probability threshold (Jiménez-Valverde and Lobo 2007; Valavi et al. 2021).
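The effect of low prevalence on binary classification can be illustrated by applying two thresholds to the same set of predicted probabilities. In the Python sketch below, the training labels and probabilities are made up, and using the training prevalence as an alternative threshold is shown only as one commonly discussed option; it is not a method applied in this study.

```python
def fraction_suitable(probs, threshold):
    """Share of cells classed as suitable at a given probability threshold."""
    return sum(1 for p in probs if p >= threshold) / len(probs)

def prevalence(labels):
    """Share of presences (1) among training observations."""
    return sum(labels) / len(labels)

# Made-up training labels for a rare species: 2 presences in 20 plots.
train_labels = [1] * 2 + [0] * 18

# Made-up predicted probabilities; models trained on rare species tend to
# produce values well below 0.5 even in the best available habitat.
probs = [0.02, 0.05, 0.08, 0.12, 0.15, 0.22, 0.30, 0.41, 0.48, 0.62]

at_default = fraction_suitable(probs, 0.5)
at_prevalence = fraction_suitable(probs, prevalence(train_labels))
print(f"threshold 0.5: {at_default:.0%} suitable")
print(f"threshold {prevalence(train_labels):.2f}: {at_prevalence:.0%} suitable")
```

With these toy numbers the default threshold classes one cell in ten as suitable, whereas the prevalence-based threshold classes seven in ten as suitable, which shows how strongly area estimates for rare species can depend on the cut-off.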

With regard to R. chamaemorus, we opted to conduct an additional analysis to test the sensitivity of our predictions of habitat change for this species. We did so because the R. chamaemorus model demonstrated an outsized importance of soil variables relative to climate variables, a dynamic which some recent SDM studies have suggested can produce unreliable predictions through a potential “masking” effect of the climate variables, especially when climate variables are the sole drivers of future habitat predictions (Feng et al. 2020). The results of this additional analysis showed much more substantial habitat loss predictions under RCPs 4.5, 6.0, and 8.5 compared to our original analysis. In our view, this outcome represents a possible future landscape of R. chamaemorus habitat distribution if, in the future, climate variables emerge as more influential drivers of distribution than they are now under current conditions. However, given ecological evidence of the importance of soils governing the distribution of R. chamaemorus (e.g., Taylor 1971; Karst and Turner 2011; Āboliņa et al. 2023), we view these predictions as less likely than those of our original analysis (see Supplementary Material S1.6 for methodological details, results, and discussion of this analysis).

Our SDM predictions reflect only potential changes to suitable habitat, not species occurrence; future realized distribution shifts will also be driven by how well species traits allow for movement and recolonization of newly suitable habitat, and thus are difficult to predict with precision. Exclusion of species traits important to distributional change, such as dispersal mode and seed mass, is a well-recognized limitation of correlative SDMs (e.g., Kearney and Porter 2009; Wang et al. 2018; Regos et al. 2019). Several studies have found that SDM accuracy can vary with vegetation species traits (Hanspach et al. 2010; Syphard and Franklin 2010), and traits like dispersal will be important in determining vegetation’s ability to colonize newly suitable habitat areas in the future. Species that reproduce clonally with high intraspecific genetic diversity like V. uliginosum, for example, may be better equipped to adapt to climate change (de Witte et al. 2012), while species reliant on insect-mediated pollination to set fruit, like R. chamaemorus (Thiem 2003), may have a harder time moving in line with shifting climate envelopes if their symbionts cannot move as well.

Similarly, distribution shifts are expected to be greatly influenced by emergent conditions of future novel ecological community organization, species organization, and abiotic landscape conditions (Gilman et al. 2010). Across the Arctic, the suitable habitat ranges of low-lying vegetation classes like shrubs and graminoids are largely predicted to contract while those of woody vegetation and tree classes are predicted to expand (Pearson et al. 2013); interactions provoked by these community reorganizations, such as increasing shade cover by taller, woodier plants, increased competition for soil nutrients, and introduction of new predators, could further reduce berry plant population size. On the other hand, positive feedbacks for shrubby vegetation could also occur and offset some of these losses (Myers-Smith et al. 2011), including positive changes to soils like increased nutrient availability and moisture, two factors that our models suggest can be important in determining habitat suitability.

Finally, there is some uncertainty in the future projections of habitat shifts presented here inherent to the RF algorithm. RF is a nonparametric machine learning approach which learns the relationship between variables using the original distribution of response values in the training dataset. Thus RF cannot, with certainty, project into future novel climatic domain space which may lie outside the bounds of the training data, because the algorithm cannot predict whether the relationship between the variables will change beyond those bounds (Hengl et al. 2018). Nevertheless, RF has been used extensively in SDM applications projecting into future novel climate space because it is a robust learner resistant to overfitting and produces highly accurate predictions (e.g., Drew et al. 2011; Billerman et al. 2016; Iverson et al. 2019; Mosebo Fernandes et al. 2020; Bagaria et al. 2021; Ebrahimi et al. 2022). These studies employ RF under the assumption of stationarity in the relationship between predictor variables and the response, even when projecting into potential no-analog future climates. We make this same assumption here.
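A simple way to gauge how much of a projection relies on extrapolation is to flag future cells whose predictor values fall outside the range seen in the training data. The Python sketch below shows such a check on made-up values. It is a univariate range test only, so it does not detect novel combinations of predictors that each lie within range, and it is not an analysis reported in this study; all names and numbers are hypothetical.

```python
def training_ranges(rows):
    """Minimum and maximum of each predictor across the training rows."""
    names = rows[0].keys()
    return {n: (min(r[n] for r in rows), max(r[n] for r in rows)) for n in names}

def is_novel(cell, ranges):
    """True if any predictor in the cell lies outside its training range."""
    return any(not (lo <= cell[n] <= hi) for n, (lo, hi) in ranges.items())

# Made-up training conditions (degrees C).
train = [{"jan_temp": -15.0, "jul_temp": 11.0},
         {"jan_temp": -9.0, "jul_temp": 14.0},
         {"jan_temp": -12.0, "jul_temp": 12.5}]

# Made-up future conditions for four cells.
future = [{"jan_temp": -10.0, "jul_temp": 13.0},
          {"jan_temp": -6.0, "jul_temp": 13.5},   # January warmer than any training site
          {"jan_temp": -8.0, "jul_temp": 15.5},   # both months outside the training range
          {"jan_temp": -11.0, "jul_temp": 14.0}]

ranges = training_ranges(train)
flags = [is_novel(cell, ranges) for cell in future]
share_novel = sum(flags) / len(flags)
print(flags)
print(f"{share_novel:.0%} of future cells fall outside the training range")
```

Predictions for flagged cells rest entirely on the stationarity assumption, so mapping them alongside the projections indicates where that assumption carries the most weight.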

Climate adaptation and management implications

Overall, the results presented here suggest that climate change may cause shifts in the habitat distribution of berry plant species in the Bristol Bay region, which could potentially restrict communities’ ability to access and make use of these plants as subsistence resources. Many Alaskan families harvest as much as 75 L of wild berries every year (Hupp et al. 2015), representing an important local fresh food source with positive health benefits (Kellogg et al. 2010). Thus, changes in the distribution or accessibility of berries could have cultural, food security, and health impacts for many small rural and remote Alaskan communities. Additionally, a reduction in access to berries with climate change is likely to occur in parallel with other climate-related food stressors and changes to the physical environment, like reduced sea ice and windier conditions making travel for resources more treacherous (Brinkman et al. 2016), amounting to an increasingly challenging landscape for subsistence food resources in Alaska under future climate change.

The predictive maps and analyses presented here can be applied to planning and management of berry resources in Bristol Bay. Our predictive maps may be able to help communities target climate adaptation efforts on the landscape, while our variable importance outputs could help to prioritize management actions for important habitat characteristics conducive to supporting these species. For instance, in addition to potential habitat loss, our maps also show areas of persistence, or no predicted change, as well as potential areas of expansion. These areas may remain or enter into berry plant viable habitat niche space under future climate conditions, and thus managers might consider these areas as starting points for targeting management action to facilitate plants’ movement and future survival in these locations. In addition, our models ranked environmental variables by their importance in predicting berry plant habitat preferences; managers could make use of these rankings to prioritize management actions which are likely to be the most impactful in supporting these species. For example, soil variables consistently ranked highly throughout several models with soil water storage in particular appearing highly important for V. uliginosum, E. nigrum, and R. chamaemorus. This ranking suggests that these species are highly associated with peat forests and/or low shrub and herbaceous peatlands, and thus managers may consider focusing protection efforts on these habitats throughout Bristol Bay.

Conclusion

Understanding the response of berry plant species to climate change is critical to mitigating potential loss in access to these important resources for both human and animal communities under future climate conditions. Here, we used SDMs to predict change in suitable habitat distribution for five berry plant species in southwestern Alaska. Our models suggest that all five species tested here would see shifts in suitable habitat area and geographic location within Bristol Bay under three RCP scenarios. We also identified key drivers of berry plant habitat distribution including elevation, soil variables like pH, water storage, and organic matter, and climate variables like mean January and July temperatures, and we mapped areas that are likely to lose or gain habitat suitability; these insights could guide management action to mitigate the impacts of climate change on berry plant habitat. Our methods could be applied to predict berry plant species distribution and shifts in other regions of the Arctic. In addition, future studies should explore relationships we identified here at the individual plant or plant population level, and develop more complex SDMs incorporating dynamic processes, and at a broader spatial extent, to better understand berry responses across the entire Arctic under future novel conditions.