Introduction

Human activities have introduced many species into new areas. Some of these species fail to establish (Blackburn and Duncan 2001), while others become invasive; that is, they reproduce and expand spatially and demographically in their newly introduced environment (Pyšek and Richardson 2010; Blackburn et al. 2011; Matheson and McGaughran 2022). Invasive species may negatively impact the native ecosystem, the economy, and human health, with the financial cost of invasive species between 1970 and 2020 estimated to exceed US$1.288 trillion globally (Zenni et al. 2021). Invasive species are also one of the leading causes of global biodiversity loss (Clavero and García-Berthou 2005; Bellard et al. 2016).

With increased global connectivity, habitat modification, and climate change, the impacts of invasive species are a growing concern as there has been a rise in the number and the distribution of invasive species in recent decades (Hulme 2009; Turbelin et al. 2017). There is an increasing use of species distribution models or ecological niche models (ENMs) to infer the realised niche and also potential distribution of introduced and not yet introduced species (Faulkner et al. 2014; Shivambu et al. 2020; Ndimuligo et al. 2022). These models have been successfully applied to infer areas at risk of future invasion, for example, potential areas under risk of house crow invasion in New Zealand (Fraser et al. 2015).

There are two main types of ENMs: mechanistic and correlative. Mechanistic ENMs use information about the species’ physiology to model the species’ potential distribution, but require physiological information from controlled experiments, which is difficult to obtain (Peterson et al. 2015). Correlative ENMs, on the other hand, correlate readily available species occurrences (e.g., eBird, GBIF, etc.) with environmental data (e.g., WorldClim, CHELSA, etc.) to model the species’ suitable environmental space or “niche” (Peterson et al. 2015). The modelled “niche” is assumed to be conserved, and it can be projected into the geographical space where the model has been trained, or transferred into a different area or time. ENMs can be built with different predictor variables and at different scales (Menke et al. 2009; Seo et al. 2009; Guevara et al. 2018). Comparisons between fundamental and realised niches can also be made, for example, with physiological data (Tingley et al. 2014). At large scales, ENMs informed by climate and environmental variables are particularly useful as a basis to predict suitable areas prior to introduction for conservation and invasive species assessment purposes, and with the increasing availability of future climate projections, the models can also be projected to predict species’ suitable habitat in future climates. ENMs have been used with future climate projections to assess climate change vulnerability of birds with different body sizes, life history traits, and from different habitats (Simmons et al. 2004).

In this study, correlative ENMs (hereafter, ENMs) were used as physiological data were not available. The ability of an ENM developed for a specific area and/or time to predict the species distribution in a different area or time defines its transferability (Sequeira et al. 2018; Yates et al. 2018). ENMs built on biologically important environmental variables for species with well-defined, conserved niches are expected to have good transferability. Niche transferability, conservatism or overlap assessment (hereafter, niche overlap analyses) can be performed prior to constructing an ENM (as in Fernández and Hamilton 2015). However, niche overlap analyses are based on environmental variables, and the incorporation of all possible environmental variables may lead to false inferences (Bates and Bertelsmeier 2021). For example, air surface temperature may be unimportant for subterranean species, and detecting a niche difference based on that variable will not have any biological meaning for the species. Therefore, ENMs can be used to help identify biologically important environmental variables for niche overlap analysis (Brown and Carnaval 2019).

Niche overlap analysis can help answer questions on invasive species evolution following their introductions. Has the invasive species merely invaded a proportion of its native range niche? Has the invasive species invaded a new niche, one that had not been previously occupied in the native range? Has the invasive species’ niche shifted or diverged since the introduction?

Species invasion therefore provides an excellent opportunity to validate ENM predictions and transferability and determine whether the species has exploited a new niche during invasion. Several bird species have been introduced to Aotearoa New Zealand and have become invasive (Duncan et al. 2006). Among the invasive bird species in Aotearoa, two species from the family Sturnidae, the common myna (Acridotheres tristis, hereafter myna) and the common starling (Sturnus vulgaris, hereafter starling), are two of the three bird species on the IUCN “100 of the World's Worst Invasive Alien Species” list (GISD 2021). Both species are omnivorous, generalist, cavity-nesting species and thrive near human-modified landscapes (Downs and Hart 2020; GISD 2021). The myna is non-migratory and generally regarded as a tropical species, native to Central to Southeast Asia (Kannan and James 2020). The starling, in contrast, is a migratory temperate species, native to temperate Eurasia and Northern Africa (Cabe 2020).

In the eighteenth century, mynas were introduced to Mauritius and Reunion Island to control insect pests (Cheke and Hume 2008), making this one of the world’s first attempts at biological pest control (Safford and Hawkins 2013). Starlings, on the other hand, were first introduced to Australia in the mid-nineteenth century (Long 1981; Pipek et al. 2019). To date, mynas have been intentionally introduced to at least 13 locations across the world and starlings to seven, and both species have also successfully self-introduced globally (Long 1981; Cabe 2020; Kannan and James 2020; Stuart et al. 2023). Mynas now exist on all continents except Antarctica and South America, and starlings on all continents except Antarctica (Long 1981; Cabe 2020; Kannan and James 2020). In their invasive range, both mynas and starlings are agricultural pests and compete with, and may predate, local fauna (Linz et al. 2007; Dhami and Nagle 2009; Downs and Hart 2020; Feare et al. 2021).

In the South Island of New Zealand, mynas were introduced and persisted for a few decades but disappeared by 1900, except for one Nelson population which persisted until the 1960s. In the North Island, the myna range expanded northwards from introduction sites until it reached the tip of the North Island by the end of the 1960s, while its southern range edge contracted to approximately its current position by the late 1970s. Mynas are now rare in some of their initial introduction locations (e.g., Wellington), with very few populations established south of Whanganui. However, there have been increasing records of mynas in the Wellington region in 2011–2021, which may indicate a change in their southern range edge (Thomson 1922; Cunningham 1948; Beesley et al. 2023).

Starlings were introduced and relocated across New Zealand and quickly became abundant in most parts of the country (Pipek et al. 2019). Within a few decades following introduction, they also started colonising offshore islands of New Zealand (e.g., Chathams, Kermadecs, Antipodes, Macquarie islands) (Long 1981). Both species have become agricultural pests in New Zealand (Dawson and Bull 1970; Nelson 1990), and positive impacts on native fauna have been observed following removal of mynas from a New Zealand island, although these effects may be confounded with the removal of other pest species (Tindall et al. 2007).

There are a handful of ENM studies focused on the two sturnid species. Myna distributions in Australia were modelled using one of the first ENMs, the BIOCLIM model (Martin 1996). A more recent study modelled global myna distributions (Magory Cohen et al. 2019). A further study included the myna in joint species distribution models built for common bird species in Israel (Magory Cohen and Dor 2019). Global ENMs for seven invasive species, including both the myna and the starling, were built as part of an impact assessment study for South Africa (Shivambu et al. 2020). However, none of these studies assessed model transferability or compared niches in the different ranges. Ecological niche transferability in starlings has been evaluated alongside 12 other species from the IUCN “100 of the World's Worst Invasive Alien Species” list, but no species distribution projections were made (Fernández and Hamilton 2015). Overall, these studies suggest that both impervious surface area and temperature parameters limit myna distribution (Martin 1996; Magory Cohen et al. 2019). While environmental predictors have not been assessed for starlings in an ENM framework, some evidence of niche expansion in starlings has been observed in their invasive range (Fernández and Hamilton 2015).

Here we build upon the knowledge and insights provided by previous studies to improve our understanding of the two species’ niches and better simulate the potential future distributions of the myna and the starling in New Zealand. We use myna and starling occurrence records and environmental data from the native range and New Zealand to first build ENMs for each species using five different ENM algorithms. We assess the ENMs’ transferability to select the most appropriate ENM algorithm, and use the results from the ENMs to help inform the choice of important environmental variables for defining the species’ niche. We then compare the niches of each species in New Zealand and the native range, and of the two species in New Zealand. We predict that niches are conserved within each species, and divergent between the species. We then project the best ENMs using future climate/environmental projections for New Zealand, and use these to highlight areas at risk of future invasion and range expansion.

Material and methods

Our analyses are divided into two main parts: (a) ENM, and (b) Niche overlap analysis (see Fig. S1.1 for more details). The steps in our analyses are as follows: (a1) perform ENM, (a2) assess model transferability, (a3) predict suitability under current and future projections, (b1) select important environmental variables for niche overlap analysis, and (b2) perform niche overlap analysis. All analyses were performed in R version 4.2.1 (R Core Team 2022). Spatial datasets were manipulated and processed using the following R packages: ‘terra’ version 1.6.17 (Hijmans 2022a), ‘raster’ version 3.6.3 (Hijmans 2022b), and ‘sf’ package version 1.0.7 (Pebesma 2018); tables were processed and manipulated using ‘tidyverse’ version 1.3.2 (Wickham et al. 2019) and ‘data.table’ version 1.14.2 (Dowle and Srinivasan 2021). All scripts used in data processing and analysis are available on GitHub (https://github.com/akamolphat/ENM_NZ_sturnid).

Occurrence data

Myna and starling occurrence data were downloaded from GBIF (GBIF.org 2022a; b). Only occurrence records with dates from 1970 to 2020, precision of ≤ 1000 m or with coordinate precision ≥ 4 decimal places were retained, and records were cleaned following standard procedures (Cobos et al. 2018). Outlier records appearing in unexpected locations were reviewed and removed as appropriate (see Supplementary S1.1.1 for more details).

Persistent localities were identified and retained. For mynas, these were localities with occurrences within 16 km spanning at least 5 years. For starlings, because they are generally migratory, these were breeding season occurrences within 32 km spanning at least 4 years. The thresholds were based on published myna and starling dispersal ranges, average lifespan, and breeding site fidelity (Kessel 1953; Wilson 1973; Kang 1992; Berthouly-Salazar et al. 2012; Homan et al. 2017).

The ‘spThin’ R package version 0.2.0 (Aiello-Lammens et al. 2015) was used to spatially thin occurrences so that myna occurrences were separated by a minimum distance of 16 km and starling occurrences by a minimum distance of 32 km. Only records with values for all environmental variables were retained. After all filtering and thinning processes, there were 4222 native range occurrences and 193 New Zealand occurrences for mynas, and 3015 native breeding season range (hereafter ‘native range’) occurrences and 156 New Zealand breeding season range (hereafter ‘New Zealand’) occurrences for starlings (Fig. 1) (see Supplementary S1.1 for more details on data filtering and thinning).
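The persistence filter and the distance-based thinning described above can be illustrated with a minimal Python sketch. This is not the study's R code and not the algorithm used by ‘spThin’: the function names are hypothetical, the thinning is a simple greedy pass rather than a randomised one, and "spanning at least N years" is assumed here to mean that the difference between the latest and earliest record year in a neighbourhood is at least N.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def persistent_records(records, radius_km, min_span_years):
    """Keep records whose neighbourhood (itself included) spans enough years.

    records: list of (lat, lon, year) tuples.
    Assumption: span = latest year - earliest year among neighbours.
    """
    kept = []
    for lat, lon, year in records:
        years = [y for la, lo, y in records
                 if haversine_km(lat, lon, la, lo) <= radius_km]
        if max(years) - min(years) >= min_span_years:
            kept.append((lat, lon, year))
    return kept


def thin_greedy(points, min_dist_km):
    """Greedy thinning: keep a point only if it lies at least min_dist_km
    from every point already kept. Order-dependent, unlike spThin."""
    kept = []
    for lat, lon in points:
        if all(haversine_km(lat, lon, la, lo) >= min_dist_km for la, lo in kept):
            kept.append((lat, lon))
    return kept
```

With the myna settings from the text, a call would look like `persistent_records(records, radius_km=16, min_span_years=5)` followed by `thin_greedy(points, min_dist_km=16)`; the starling settings would use 32 km and 4 years.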

Fig. 1

Maps of filtered and spatially thinned occurrences (black diamonds) from the myna A native range and B New Zealand, and the starling C native range and D New Zealand. Green areas represent the species distribution (breeding distribution for starlings), and orange outlines depict the ENM calibration area. Distribution data are from BirdLife International and Handbook of the Birds of the World (2016, 2019). Photo credits: K. Atsawawaranunt

Environmental variables

Twenty-seven environmental variables at 0.0083° resolution were downloaded and assessed (see Supplementary S1.2 for more details), including 19 bioclimatic variables averaged for 1981–2010 from CHELSA version 2.1 (Karger et al. 2017, 2018), four variables from the corresponding BIOCLIM+ dataset (Brun et al. 2022a, b), 2003 percentage tree cover (Geospatial Information Authority of Japan, Chiba University and collaborating organizations 2016) and a human-related variable, namely, downscaled human population density (Jones and O’Neill 2016; Gao 2017, 2020) for the year 2000. Grid cells in some datasets were not perfectly aligned and were slightly shifted to ensure alignment using the shift function from the ‘terra’ R package.

For each species, correlations between environmental variables were explored and variables with biological justification were chosen to produce ecologically realistic predictions and ecologically interpretable results (Ingenloff 2017; Guevara et al. 2018).

Five environmental variables were chosen for the myna and the starling ENMs: growing degree days above 5 °C (gdd5), annual precipitation (bio12), growing season length (gsl), percentage tree cover (tree_cover_pct), and human population density (pop_dens). gdd5 is highly correlated (Pearson’s r > 0.8) with several other temperature variables in both species and was chosen to reflect the energetic requirements during the breeding season. For mynas, low temperatures may decrease hatching success and limit their distribution in colder climates (Wilson 1973). For starlings, temperatures that are too low or too high during the breeding season may cause egg and nestling losses (Royall 1966; Gromadzki 1980), and may also negatively affect foraging ability (Clark 1987). bio12 was chosen to reflect the overall productivity of the land. Both species are generally not found in deserts except where there are humans (Cabe 2020; Khoury et al. 2021). For the starling, high precipitation and low temperatures may also cause nestling mortality (Gromadzki 1980). gsl was chosen to reflect the duration of food availability. Mynas are generally not migratory and may not be able to survive in areas with very short growing seasons. Starlings may migrate, but gsl must be long enough for successful breeding. tree_cover_pct was chosen to reflect the availability of natural nesting cavities; both species are also generally not found in thick forests (Higgins et al. 2006; Cabe 2020). Lastly, pop_dens was chosen to reflect the species’ interactions with human-altered environments, including anthropogenic nest sites and food sources (Cramp and Perrins 1994; Markula et al. 2009). See Supplementary S1.2 for more details on variable choice.

For future climate scenarios, we used the same three climate variables (gdd5, bio12 and gsl) from the downscaled CMIP6 CHELSA datasets for the year 2085 (2071–2100 average). The GFDL-ESM4 general circulation model was used under two combinations of Shared Socioeconomic Pathway (SSP) and Representative Concentration Pathway (RCP): SSP1-RCP2.6 and SSP3-RCP7. Human population projections under SSP1 and SSP3 for 2090 were used alongside the climate variables for future projections. To avoid biases due to cell size/area, human population counts were converted to population density using the cell area calculated with the cellSize function from the ‘terra’ R package. Percentage tree cover (tree_cover_pct) was kept constant because no future projections were available.
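The reason for converting counts to density is that cells on a regular latitude/longitude grid shrink towards the poles, so the same count implies a higher density at higher latitudes. The sketch below illustrates this with a spherical approximation of cell area, A = R² · Δλ · (sin φ_north − sin φ_south). It is an illustration only: the function names are hypothetical, and the values will differ slightly from those returned by terra's cellSize, which was the function actually used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def cell_area_km2(lat_south, lat_north, dlon_deg):
    """Area of a lat/lon grid cell on a sphere, in square km.

    lat_south, lat_north: cell edges in decimal degrees.
    dlon_deg: cell width in degrees of longitude.
    """
    dlmb = math.radians(dlon_deg)
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_KM ** 2 * dlmb * band


def population_density(count, lat_centre, res_deg):
    """Convert a population count in a square-degree cell to persons per km²."""
    half = res_deg / 2
    area = cell_area_km2(lat_centre - half, lat_centre + half, res_deg)
    return count / area
```

For example, `population_density(100, -41.0, 0.0083)` gives a higher density than `population_density(100, 0.0, 0.0083)` because the cell at 41° S covers less ground than the cell at the equator.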

Ecological niche modelling

Determining the calibration area, the accessible areas to the species during the time period of interest, is a crucial step in calibrating the ENMs. For mynas, the calibration area for the native range was chosen with the assumption that the desert and the cold weather in the Middle East, north and central Asia (steppe and tundra in Kazakhstan and Russia, and the Gobi Desert and the Tibetan plateau) and the Himalayas act as a barrier to dispersal. Regarding the barriers in south China (just north of Hainan), we were unable to detect any major barriers, but this outline was made on the assumption that there was a potential barrier here that has limited the dispersal in the past. There are now newly invaded areas in Hong Kong, first reported in 1952 (Leven and Corlett 2004) and Taiwan, likely introduced in the 1980s or 1990s (Lin 2001) but these populations were not natural range expansions and were founded by released caged birds (CAB International 2021).

For starlings, the calibration area was chosen as the area within 1500 km from the native breeding range as defined in BirdLife International and Handbook of the Birds of the World (2019). The 1500 km distance is the approximate maximum distance travelled by migratory starlings (Perdeck 1958). We have elected to use a distance-based buffer around the breeding range instead of dispersal barriers as the species is known to disperse great distances (e.g., records of vagrant starlings in South-east Asia).

For New Zealand, the entire North and South Islands were used as the calibration area as both islands were accessible to both species (Fig. 1). Starlings are found in both the North and South Islands (Fig. 1). Although mynas are currently not found in the South Island, they were introduced there but failed to establish (Beesley et al. 2023). Occasional strays, likely accidental human-mediated introductions (e.g., aviary escapees and cargo ships), are observed in the South Island (Tunnicliffe 1982; Bull et al. 1985; Higgins et al. 2006; BirdingNZ.net 2021). We therefore included the South Island as part of the calibration area for mynas, on the presumption that the South Island was accessible to mynas (see Fig. 1 for calibration area outline).

ENMs were performed using the ‘biomod2’ R package version 3.5.1 (Thuiller and Georges 2021). To select the most appropriate model for each species, we ran and evaluated five different ENM algorithms on the native range (native), New Zealand (NZ), and the combined native range and New Zealand (native + NZ) data with a spatial cross-validation approach. Models were also trained using all filtered localities (occurrences and pseudoabsences with no spatial cross-validation) for each region/range (i.e., native, NZ, and native + NZ) and evaluated against data from the native range and New Zealand, separately. For the native and NZ models, this involved the models being cross evaluated against data from the other range (model trained with native range data evaluated against data from New Zealand, and vice versa) to help assess model transferability.

Spatial cross-validation is especially important when aims involve model transferability (Roberts et al. 2017). Data from New Zealand and the native range were therefore partitioned using a five-fold spatial block cross-validation approach, using the ‘blockCV’ R package version 2.1.4 (Valavi et al. 2019). The sizes of the spatial blocks differed depending on the region in which the model was built, and were approximated based on the median value of the effective range of spatial autocorrelation estimated using a fitted variogram model for the five environmental variables. The myna data were partitioned into 200 × 200 km blocks for the NZ model, and 2000 × 2000 km blocks for the native and native + NZ models. The starling data were partitioned into 200 × 200 km blocks for the NZ model, and 4000 × 4000 km blocks for the native and native + NZ models. Each spatial block was iteratively assigned a fold value of one to five, 100 times, and the configuration with the most evenly dispersed number of records across folds was chosen (see Supplementary 1.3.1 for more details).
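The repeated fold assignment can be sketched as follows. This is a simplified Python illustration, not the ‘blockCV’ implementation: the function name is hypothetical, evenness is assumed to be measured by the standard deviation of the number of records per fold, and each random labelling is assumed to deal fold labels evenly across blocks.

```python
import random
import statistics


def assign_folds(block_counts, k=5, n_iter=100, seed=0):
    """Randomly assign spatial blocks to k folds n_iter times and keep the
    labelling whose per-fold record totals are most even.

    block_counts: dict mapping block id -> number of records in that block.
    Returns (folds, sd), where folds maps block id -> fold label (1..k) and
    sd is the population standard deviation of records per fold.
    """
    rng = random.Random(seed)
    block_ids = sorted(block_counts)
    best_folds, best_sd = None, float("inf")
    for _ in range(n_iter):
        shuffled = block_ids[:]
        rng.shuffle(shuffled)
        # Deal fold labels round-robin over the shuffled blocks.
        folds = {b: i % k + 1 for i, b in enumerate(shuffled)}
        totals = [0] * k
        for b, f in folds.items():
            totals[f - 1] += block_counts[b]
        sd = statistics.pstdev(totals)
        if sd < best_sd:  # keep the most even configuration seen so far
            best_folds, best_sd = folds, sd
    return best_folds, best_sd
```

Holding out whole blocks, rather than individual records, keeps spatially autocorrelated records in the same fold, which is what makes the cross-validation informative about transferability.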

We ran five algorithms: Maximum Entropy modelling (Maxent), the R alternative implementation of Maxent (maxnet), Generalised Linear Model (GLM), Generalised Additive Model (GAM), and Surface Range Envelope (SRE, also known as BIOCLIM) (Nelder and Wedderburn 1972; Hastie and Tibshirani 1986; Busby 1991; Phillips et al. 2004, 2006, 2017; Phillips 2021). Maxent, maxnet and GAM have been shown to perform well with presence-only data, but no single method is superior in all situations (Valavi et al. 2022). Maxent parameters were first tuned to optimise for our data (see Supplementary S1.3.2). GLM, maxnet and SRE were run using default settings in biomod2, and GAMs were fitted using the smoothing parameter estimation method following parameters for the ‘GAM DW’ method described in Valavi et al. (2022). Background points were used for all algorithms except SRE. Two sets of 50,000 background points were used for the native and NZ models. Two sets of 100,000 background points were used for the native + NZ model. Each environmental variable was permuted 10 times to estimate its importance. Model evaluation statistics and response curve plots (probability of species occurrence vs environmental variables) were recorded and assessed for each model.

For each training region (i.e., NZ, native, and native + NZ), the ranked performances across algorithms were compared for the cross-validation and the cross-evaluation runs using the area under the receiver operating characteristic curve (AUC). As AUC was calculated based on presence-background data, the AUC calculated in this study is the probability that a presence location has a higher predicted value than a background location. Different training regions consisted of different numbers and ratios of presences and background areas, and therefore only models of the same species from the same region were compared and ranked. As only presence data were used, different thresholds were explored for creating binary (i.e., presence/absence) maps, and the 10th percentile training presence (P10) threshold (Pearson et al. 2007) was chosen for both species as it produced distributions comparable to the species’ current known distributions.
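Both quantities in this paragraph have simple definitions that a short Python sketch can make concrete. The code below is illustrative only and is not how biomod2 computes them: the function names are hypothetical, ties are assumed to count as half a win in the AUC, and the P10 threshold is assumed to be the score of the training presence ranked at the 10% position (percentile conventions vary between tools).

```python
def auc_presence_background(presence_scores, background_scores):
    """Probability that a random presence scores higher than a random
    background point (ties count as half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def p10_threshold(training_presence_scores, percentile=10):
    """Suitability value below which the lowest-scoring `percentile` percent
    of training presences fall (simple rank-based convention)."""
    ranked = sorted(training_presence_scores)
    idx = len(ranked) * percentile // 100
    return ranked[idx]


def to_binary(scores, threshold):
    """Convert continuous suitability scores to 1 (suitable) or 0 (unsuitable)."""
    return [1 if s >= threshold else 0 for s in scores]
```

Under this convention, applying the P10 threshold to the training presences themselves classifies roughly 90% of them as suitable, which is why it is a more conservative choice than a minimum training presence threshold.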

Final models and future projections

The present-day projections of habitat suitability in New Zealand based on the native + NZ and NZ models were compared to the present-day species distribution in New Zealand. If both native + NZ and NZ models demonstrated similarly good fit for New Zealand, the native + NZ models were chosen, as native + NZ models were trained on more training data which included a larger range of environmental variables (Sequeira et al. 2018).

The chosen models were then projected using future environmental projections for environmental variables used in the ENMs. The projected model is the best overall performing algorithm trained with all training data for the range with no spatial cross-validation (see Table 1 for more details). Projections were made based on the second of the two sets of background points, as there were no visible differences between them. As no future projections were available for tree_cover_pct, this was kept constant for future scenarios. To account for uncertainties in future projections, three unique combinations of future climate projections (GFDL-ESM4 general circulation model) and human population projections were used: (1) SSP1-RCP2.6 and SSP1 (SSP126, hereafter), (2) SSP3-RCP7 and SSP3 (SSP370A), and (3) SSP3-RCP7 and present-day human population (SSP370B). Combination (1) represents an optimistic case scenario, (2) represents a mid-to-bad case scenario, and (3) uses the same mid-to-bad case scenario but without human population growth to assess the human density effect on the future projections.

Table 1 Average AUC scores and ranks from cross-validation (CV) and evaluations based on New Zealand (NZ) and native range data

Niche overlap analysis

To compare niches within species (e.g., native range vs New Zealand) and between species, we used the ‘humboldt’ R package version 1.0.0.420121 (Brown and Carnaval 2019). The source code for the ‘humboldt’ R package was modified to work on the remote computing cluster (see Supplementary S1.4.2 for more details). For the within-species comparison in mynas, the niche in the native range was compared to the niche in New Zealand. For starlings, the comparison was made between New Zealand and the United Kingdom (the source of introduction to New Zealand) instead of the entire native range due to computational limitations. Environmental variables with greater than 5% importance in the New Zealand ENMs were chosen (see Supplementary S2.1 and Fig. 2 for more details) and principal component analysis (PCA) was performed on these environmental variables. These were gdd5, gsl, and bio12 for both mynas and starlings. Because pop_dens is important in the myna native range, we also checked and confirmed that the interpretation of results was consistent with the inclusion of pop_dens for mynas when comparing the native range and New Zealand (results not shown). For the inter-species comparison within New Zealand, environmental variables important to either of the two species were used (i.e., gsl, gdd5, bio12). The environmental space (E-space) used to define the niches for comparisons was characterized by the first two principal components of the PCA of the important environmental variables across the study region of both populations (or species). In all comparisons, the first two principal components from the PCA explained > 83% of the total variance (see Supplementary S1.4.1 for more details).

Fig. 2

Environmental variable importance for A mynas, and B starlings, based on Maxent models. The solid black vertical line indicates 5%; variables with more than 5% are considered important for predicting species presence. Abbreviations used in text are as follows: tree cover percentage = tree_cover_pct; human population density = pop_dens; growing season length = gsl; growing degree days > 5 °C = gdd5; mean annual precipitation = bio12

To infer niche differences between populations (or species), Niche Equivalence statistics, measured by Schoener’s D, were calculated on the E-space. Schoener’s D takes values from 0 to 1; 1 signifies perfect niche equivalency, and 0 signifies perfect niche difference. We also used the Background statistic to assess the ability of the Niche Equivalence statistic to detect differences based on the available E-space. The significance of the Niche Equivalence and Background statistics is quantified by shuffling the occurrences and by shifting localities in geographic space, respectively. The statistics are then calculated for each iteration (see Brown and Carnaval (2019) for more details), and significance was calculated based on 200 iterations.
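For readers unfamiliar with the metric, Schoener’s D between two occupancy surfaces defined on the same grid of E-space cells is D = 1 − ½ Σ |p₁ᵢ − p₂ᵢ|, where each surface is first normalised to sum to one. The Python sketch below illustrates only this formula; the function name is hypothetical, and it does not reproduce the kernel density estimation or the randomisation tests performed by ‘humboldt’.

```python
def schoeners_d(density_a, density_b):
    """Schoener's D between two non-negative occupancy surfaces defined on
    the same sequence of E-space grid cells.

    Each surface is normalised to sum to 1 before comparison.
    Returns a value between 0 (no overlap) and 1 (identical).
    """
    if len(density_a) != len(density_b):
        raise ValueError("both surfaces must use the same grid cells")
    total_a, total_b = sum(density_a), sum(density_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each surface must have positive total density")
    diff = sum(abs(a / total_a - b / total_b)
               for a, b in zip(density_a, density_b))
    return 1.0 - 0.5 * diff
```

For instance, two populations that each occupy two of four E-space cells and share exactly one of them, as in `schoeners_d([1, 1, 0, 0], [0, 1, 1, 0])`, have an overlap of 0.5.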

Niche Equivalence and Background statistics performed on the total and the analogous accessible E-space are referred to as the Niche Overlap Test (NOT) and Niche Divergence Test (NDT), respectively. The combination of NOT and NDT helps deduce whether the niches differ due to divergent evolution or to asymmetries in habitat accessibility (see Table 2 in Brown and Carnaval (2019)).

Results

Model transferability

To select the most appropriate ENM algorithm for each species, we assessed each model’s transferability, that is, its ability to predict the species distribution both within the same area and in a different area. Model evaluation scores and their ranked performances across methods for the cross-validation runs and the cross-evaluation runs are shown in Table 1. For mynas, Maxent achieved the highest AUC scores in both cross-validation and cross-evaluation runs. For starlings, Maxent achieved the highest scores only in the native model cross-validation runs and the NZ model cross-evaluation runs. The R alternative implementation, maxnet, achieved the highest scores in the NZ model cross-validation runs, and SRE achieved the highest scores in the native model cross-evaluation runs.

Environmental variable importance

Maxent was chosen to assess the importance of the five environmental variables in predicting species presence because it performed well in most scenarios (Table 1), likely because it was tuned and because it allows for more complex relationships between the probability of occurrence and environmental variables than the other models (Elith et al. 2011; Merow et al. 2014). Variable importance varied slightly between the different model training areas (Fig. 2). For mynas, growing degree days above 5°C (gdd5) was the most important variable in the native and NZ models, and the second most important variable in the native + NZ model. Annual precipitation (bio12) was the most important variable in the myna native + NZ model, and contributed > 5% in both the native and NZ models. Human population density (pop_dens) contributed > 5% in the native and native + NZ models, but not in the NZ model. Growing season length (gsl) contributed > 5% in the NZ model, and 5% and 2% in the native and native + NZ models, respectively. Percentage tree cover (tree_cover_pct) contributed < 1% in all models. All variables except tree_cover_pct contributed > 5% in at least one of the NZ, native, or native + NZ models (see Table S2.1 for more details).

For starlings, gsl was consistently the most important variable in all models, followed by gdd5 (Fig. 2). bio12 was the third most important variable in the native and NZ models. In the native + NZ model, only gsl and gdd5 contributed > 1%. pop_dens contributed < 5% in all models (0.1–0.3%), and tree_cover_pct contributed between 0.5% and 5%. All variables except tree_cover_pct and pop_dens contributed > 5% in at least one of the NZ, native, or native + NZ models (see Table S2.1 for more details).

Response curves from ENMs can indicate the relationship between environmental variables and the probability of species presence. Although different models (NZ, native, and native + NZ) showed slightly differently shaped response curves, some generalisations can be drawn. For the four variables that contributed > 5% in any of the myna models, the response curves showed that the probability of species presence increased with increases in all four variables. For annual precipitation (bio12) and human population density (pop_dens), in all myna models, the probability of species presence increased to reach optimal values once environmental values exceeded a threshold (bio12 > ca. 800–1700 kg m⁻² and pop_dens > ca. 2000–3000 individuals km⁻²) (see Fig. S2.1 for more details).

For the three variables that contributed > 5% in the starling models, the trends differed slightly between models. In general, the probability of species presence increased with gsl. In the NZ model, the probability of species presence increased with increasing gdd5, while the starling native and native + NZ models suggest that an intermediate value of gdd5 optimised species presence. The probability of species presence decreased with bio12 in the NZ model (contribution = 10%) but was relatively flat in the native and native + NZ models (contributions ≤ 2%) (see Fig. S2.2 for more details).

ENM projections

To visualise and assess our model predictions of the current distribution of mynas and starlings in New Zealand, and to project their distributions under climate change, we projected our best overall-performing ENMs to the present and to future climate scenarios. For mynas, the Maxent native + NZ model was chosen, and for the starlings, the Maxent NZ model was chosen (see “Ecological niche modelling” section for more details). For mynas, the present-day binary habitat suitability map for New Zealand from the Maxent native + NZ model predicted suitable habitat in the North Island of New Zealand, excluding the central North Island volcanic plateau and mountainous regions of the North Island (Fig. 3A). In the South Island of New Zealand, only small coastal areas were predicted to be suitable for mynas. Future binary habitat suitability projections predicted an increase in suitable habitat in the central North Island volcanic plateau and the Canterbury plains in the central-east of the South Island (Fig. 3B, C). Future projections under SSP370 (SSP370A and SSP370B) predicted larger increases in suitable habitat than SSP126 (Fig. 3B, Table S2.2). SSP370 future projections also predicted suitable habitat in some areas in the south of the South Island.

Fig. 3

Binary (i.e., presence/absence) maps of suitable habitat for common mynas created using the Maxent native + NZ model 10th percentile training presence for New Zealand for A present-day (1981–2010 average), B future projections (2071–2100) under the SSP126, SSP370A, and SSP370B projections, and C sum of all layers in (A, B). SSP126 scenario is based on future climate projections under Shared Socioeconomic Pathway 1 (SSP1) and Representative Concentration Pathway 2.6 (RCP2.6), and human population density under SSP1. SSP370A scenario is based on future climate projections under SSP3 and RCP7, and human population density under SSP3. SSP370B scenario is based on future climate projections under SSP3 and RCP7, and present-day human population density

For starlings, the present-day binary habitat suitability map for New Zealand from the Maxent NZ model predicted suitable habitat across most of both the North and South Island of New Zealand, excluding the high mountain regions (Fig. 4A). Future binary habitat suitability projections predicted an increase in suitable habitats at higher elevations, with the largest increase in the Central Otago plateau in the south-central South Island (Fig. 4B, C). SSP370 (SSP370A and SSP370B) predicted larger increases in suitable habitat than SSP126 (Fig. 4C, Table S2.2).

Fig. 4

Binary (i.e., presence/absence) maps of suitable habitat for common starlings created using the Maxent NZ model 10th percentile training presence for New Zealand for A present-day (1981–2010 average), B future projections (2071–2100) under the SSP126, SSP370A, and SSP370B projections, and C sum of all layers in (A, B). SSP126 scenario is based on future climate projections under Shared Socioeconomic Pathway 1 (SSP1) and Representative Concentration Pathway 2.6 (RCP2.6), and human population density under SSP1. SSP370A scenario is based on future climate projections under SSP3 and RCP7, and human population density under SSP3. SSP370B scenario is based on future climate projections under SSP3 and RCP7, and present-day human population density

For both species, the use of different human population density (SSP370A vs SSP370B) did not yield large differences in suitable habitat areas (see Table S2.2 for more details). As applying different thresholds may generate different binary maps, continuous maps of probability of presence for current and future presences were also produced (Figs. S2.3 and S2.4). Clamping masks for future projections were also produced to help assess model extrapolation (Fig. S2.5). Clamping masks identify locations where predictions are based on environmental variable values outside the range used for model calibration (Thuiller and Georges 2021).
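As an illustration of how binary maps such as those in Figs. 3 and 4 can be derived from continuous suitability, the following minimal Python sketch applies a 10th percentile training presence threshold and then sums binary layers, as in panel C of both figures. The function names and toy values are hypothetical. The percentile convention used here (omit the lowest-scoring tenth of training presences, rounded down) is an assumption, and Maxent's own implementation may differ in detail.

```python
def tenth_percentile_threshold(presence_scores):
    """Suitability value that leaves out the lowest-scoring 10% of training presences."""
    ranked = sorted(presence_scores)
    n_omitted = len(ranked) // 10  # assumed convention: round down
    return ranked[n_omitted]


def to_binary(suitability_grid, threshold):
    """1 = suitable (score at or above threshold), 0 = unsuitable."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in suitability_grid]


def sum_layers(layers):
    """Cell-wise sum of equally shaped binary maps."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*layers)]


# Hypothetical suitability scores at ten training presence points.
training_scores = [0.05, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
threshold = tenth_percentile_threshold(training_scores)

# Hypothetical 2 x 2 suitability grids for the present day and one future scenario.
present = to_binary([[0.1, 0.3], [0.6, 0.15]], threshold)
future = to_binary([[0.25, 0.5], [0.7, 0.1]], threshold)
combined = sum_layers([present, future])
```

In the combined layer, a cell value equal to the number of layers marks habitat predicted suitable in every scenario, while intermediate values mark habitat that is suitable only in some of them.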

Niche overlap analysis

We explored whether there was evidence of niche divergence between the native and New Zealand ranges of mynas and starlings, and whether the two species occupy different niches within New Zealand. Regarding the comparisons of mynas in the native range vs New Zealand, the niche equivalency test for the entire calibration area (Niche Overlap Test; NOT) resulted in a nonsignificant niche equivalency statistic (D = 0.000, p = 0.318). Given that both background statistics were significant (p < 0.01), this suggests that there is no evidence of niche difference. The niche equivalency test within analogous areas (Niche Divergence Test; NDT) provided no evidence of divergence (D = 0.222, p = 1), with weak evidence in both background statistics significance (NZ → native p = 0.065, native → NZ p = 0.055). This also suggests that there is no evidence for niche difference, although power to detect differences may be low (see Fig. 5 and Table 2 for more details). Both NOT and NDT therefore indicate that there are no niche differences between mynas in New Zealand and their native range (see Table 2 of Brown and Carnaval (2019) for more details).

Fig. 5

Differences in analogous environmental space between A the native range versus New Zealand for mynas, B the UK versus New Zealand for starlings, and C mynas versus starlings in New Zealand. For (A, B), blue = native range less dense than New Zealand range, red = native range more dense than New Zealand range. For (C), blue = mynas less dense than starlings, red = mynas more dense than starlings. Principal components represent principal components analysis of environmental variables with greater than 5% importance in the ENMs

Table 2 Results from niche overlap analyses

Regarding the comparisons of starlings in the UK vs New Zealand, the NOT resulted in a significant niche equivalency statistic (D = 0.008, p < 0.01), but nonsignificant background statistics, suggesting some evidence for niche differences, but a low ability of the data to detect niche differences. The NDT resulted in a nonsignificant niche equivalency statistic (D = 0.159, p = 0.980), and nonsignificant background statistics, suggesting that there is no evidence for niche difference, but, again, limited ability to detect a difference (see Fig. 5 and Table 2 for more details). Therefore, for starlings, NOT suggests some evidence for niche differences while NDT does not. Together, these results support the hypothesis that their current niches are not equivalent, but that this is only due to access to different environments and not divergence. NOT environmental space (E-space) comparisons suggest that this is largely due to parts of the New Zealand E-space not found in the UK (Fig. S2.8).

When comparing the niche of mynas with starlings within New Zealand, both the NOT and NDT resulted in a significant niche equivalency statistic (NOT: D = 0.417, p = 0.00498; NDT: D = 0.514, p = 0.00498), and significant background statistics. This suggests that there is evidence of niche divergence between these invasive species in New Zealand.
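To make the logic of the equivalency statistic concrete, the sketch below computes an overlap value D between two sets of points in a two-dimensional environmental space and compares it against a permutation null. It assumes that D is Schoener's D, and it uses a coarse histogram in place of the kernel densities and availability corrections used by dedicated niche overlap software. The function names, grid size, replicate count, and toy points are all hypothetical.

```python
import random


def occupancy_density(points, bins=5, lo=0.0, hi=1.0):
    """Normalised 2D histogram of (PC1, PC2) points; assumes lo <= value <= hi."""
    width = (hi - lo) / bins
    grid = [[0.0] * bins for _ in range(bins)]
    for x, y in points:
        i = min(int((x - lo) / width), bins - 1)
        j = min(int((y - lo) / width), bins - 1)
        grid[i][j] += 1.0
    return [[count / len(points) for count in row] for row in grid]


def schoener_d(p, q):
    """Schoener's D = 1 - 0.5 * sum(|p - q|); 0 = no overlap, 1 = identical."""
    diff = sum(abs(a - b)
               for row_p, row_q in zip(p, q)
               for a, b in zip(row_p, row_q))
    return 1.0 - 0.5 * diff


def equivalency_test(a, b, reps=200, seed=1):
    """Permutation test: is the observed D lower than expected under pooling?"""
    rng = random.Random(seed)
    observed = schoener_d(occupancy_density(a), occupancy_density(b))
    pooled = list(a) + list(b)
    as_low = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # randomly reassign points to the two groups
        d = schoener_d(occupancy_density(pooled[:len(a)]),
                       occupancy_density(pooled[len(a):]))
        if d <= observed:
            as_low += 1
    return observed, (as_low + 1) / (reps + 1)


# Two hypothetical groups occupying opposite corners of the E-space.
group_a = [(0.1, 0.1)] * 20
group_b = [(0.9, 0.9)] * 20
d_obs, p_value = equivalency_test(group_a, group_b)
```

Under this convention, the smallest p-value attainable with 200 permutations is 1/201 ≈ 0.00498. The number of replicates used in the study is not stated in this section.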

Discussion

As species are introduced into a new environment and become invasive, their niches may or may not change from their niches in the native range. Understanding invasive species niches in their native and invasive ranges, and how they may have changed, will help inform species management, including helping to predict their potential future distributions (Faulkner et al. 2014; Briscoe Runquist et al. 2019; Shivambu et al. 2020).

Our ENMs suggest a potential increase in the myna and starling distributions in Aotearoa New Zealand by 2071–2100. This is particularly so for the common myna, for which suitable habitats are predicted in New Zealand's South Island, where it is not currently found; this supports early removal of the common myna if sighted in the South Island. Our niche overlap analyses found no evidence of niche divergence between New Zealand and the native range for either the common myna or the common starling, but found evidence for niche divergence between the myna and starling in New Zealand.

Ecological niche modelling

We found that models trained with data from New Zealand or the native range were best at describing the myna and starling distributions in these respective locations. This is consistent with findings from previous studies that local models are best at predicting local distributions (Osborne and Suárez-Seoane 2002; Barbosa et al. 2009), likely because species vary spatially in their response to the local environment. Hence, local models may be better at detecting local variations than global (or larger area) models, as global models must incorporate a larger variety of species–environment relationships. Maxent was the best performing algorithm for all myna models (native, NZ, and native + NZ) when spatially cross-validated locally, and the native and NZ myna models exhibited the best transferability when cross-evaluated between locations. However, there was no single best algorithm for the starling models, and the best performing cross-validated native and NZ starling models exhibited the worst transferability when cross-evaluated, supporting the view that there is no one superior algorithm (Valavi et al. 2022). Consequently, we chose the Maxent NZ starling model, which balanced cross-validation and cross-evaluation performance (Table 1).
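For readers unfamiliar with spatial cross-validation, the sketch below shows one common way of building spatially separated folds: occurrence points are assigned to folds according to the checkerboard block they fall in, so that test points are not interleaved with training points. This is only an illustration of the general idea. The block size, number of folds, function names, and coordinates are hypothetical and are not taken from the study's methods.

```python
import math


def block_fold(lon, lat, block_deg=1.0, k=2):
    """Assign a point to a fold from the checkerboard block it falls in."""
    bx = math.floor(lon / block_deg)
    by = math.floor(lat / block_deg)
    return (bx + by) % k


def spatial_folds(points, block_deg=1.0, k=2):
    """Yield (train, test) lists; each test set holds the points of one fold."""
    labels = [block_fold(lon, lat, block_deg, k) for lon, lat in points]
    for fold in range(k):
        train = [p for p, f in zip(points, labels) if f != fold]
        test = [p for p, f in zip(points, labels) if f == fold]
        yield train, test


# Hypothetical occurrence records as (longitude, latitude) pairs.
records = [(174.7, -36.9), (174.8, -36.8), (175.3, -36.9), (175.6, -37.8)]
splits = list(spatial_folds(records))
```

Cross-evaluation between locations follows the same pattern at a coarser level, with one whole region used for training and the other for testing.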

For both species, there was good agreement between the present-day projections of suitable habitats and the current species distribution in New Zealand (Figs. 3A and 4A vs 1B, D). While species-environment relationships are likely dynamic, the current distribution maps of both mynas and starlings are likely to be good representations of the available suitable habitats in New Zealand as both species are very conspicuous, New Zealand is relatively well surveyed, and both species have had the opportunity to occupy the available suitable habitats. Mynas have been introduced to cities in the South Island where they are no longer found (Beesley et al. 2023) and starlings were introduced in many locations across New Zealand (Pipek et al. 2019; Stuart et al. 2023) and can travel long distances (Perdeck 1958).

For the common myna, our ENMs identified gdd5, bio12, and pop_dens as the three most important environmental variables (see “Environmental variable importance” section). The probability of presence (PoP) increased with gdd5, reflecting the species’ known affinity for warmer climates and supporting the proposition that temperature is a key limiting factor in the species’ distribution (through its effects on energetic requirements (Martin 1996) and on hatching and fledging success (Wilson 1973)). Our model also supports a threshold response of PoP to bio12 and pop_dens, which can be explained by the myna’s adaptability: the myna can thrive as long as its minimum energetic requirements are met. The common myna has demonstrated that it can persist in less conducive conditions, especially in human-altered landscapes. This is evident from populations in desert cities (e.g., Dubai) and from the species’ persistence for decades in parts of the South Island following introduction (Thomson 1922; Beesley et al. 2023). Warming climates and increasing human impacts on the environment are therefore expected to increase the common myna’s suitable habitats in the future, as echoed in a global-scale study (Magory Cohen et al. 2019) and other invasive species studies (Fraser et al. 2015; Ndimuligo et al. 2022). The projected southward increase in the common myna’s future suitable habitats also echoes the projected southward shift in potential suitable habitat of several species found in New Zealand (Sheppard 2013; Watts and Watkins 2022; Germano et al. 2023).

For the common starling, gsl and gdd5 were identified as the two most important environmental variables. PoP increased with gsl in all models, likely reflecting a minimum breeding period length. The relationship between gdd5 and PoP differed slightly between the NZ model and the native and native + NZ models, likely due to the more limited range of gdd5 available in New Zealand. However, it is unclear whether the species’ (general) absence from tropical environments in the native range is due to competition or to the species’ temperate niche. The species can thrive, and has thrived, in warmer climates given the right opportunity (e.g., United Arab Emirates, Jamaica) (Long 1981; Campbell and Smiles 2019).

Our model cross-evaluation scores and projections suggest that our ENMs performed well, with general agreement between the present-day projections and the species’ distributions in New Zealand. Extrapolation of a model to environmental conditions beyond those used to calibrate the model can result in poor model transferability (Merow et al. 2014; Qiao et al. 2019). However, we were able to incorporate data from the native range into the myna models, which helped reduce the risk of extrapolation in future predictions for New Zealand. The models’ over- and underpredictions were also carefully assessed along with the environmental variables used (see “Model limitations” section). Our future projections predicted an increase in suitable habitats for both mynas and starlings in New Zealand as the climate warms.
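The extrapolation risk described here can be checked cell by cell. The following sketch flags projection cells in which any variable falls outside the range seen during calibration, which is the idea behind a clamping mask, and shows what clamping those values back to the calibration range looks like. The variable values are invented for illustration and the function names are hypothetical; this is not the procedure of any particular modelling package.

```python
def calibration_ranges(training_rows):
    """Minimum and maximum of each variable over the calibration data."""
    names = training_rows[0].keys()
    return {name: (min(row[name] for row in training_rows),
                   max(row[name] for row in training_rows))
            for name in names}


def is_extrapolated(row, ranges):
    """True if any variable lies outside its calibration range."""
    return any(not (lo <= row[name] <= hi)
               for name, (lo, hi) in ranges.items())


def clamp(row, ranges):
    """Pull each variable back to the nearest calibration bound."""
    return {name: min(max(row[name], lo), hi)
            for name, (lo, hi) in ranges.items()}


# Hypothetical calibration cells and future projection cells.
training = [
    {"gdd5": 1200.0, "bio12": 800.0},
    {"gdd5": 2600.0, "bio12": 1700.0},
    {"gdd5": 1900.0, "bio12": 1100.0},
]
future = [
    {"gdd5": 2400.0, "bio12": 900.0},   # within calibration range
    {"gdd5": 3100.0, "bio12": 1000.0},  # warmer than any calibration cell
]

ranges = calibration_ranges(training)
mask = [1 if is_extrapolated(row, ranges) else 0 for row in future]
clamped = [clamp(row, ranges) for row in future]
```

Adding training data from a wider environmental range, as was done with the native range for the myna models, widens the calibration ranges and so reduces the number of flagged cells.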

Given that present-day projections overpredicted suitable habitat for the myna at the southern limit of the species’ range (> 40°S) (see “Model limitations” section), the predicted southward increase in suitable myna habitat may similarly overpredict the potential future distribution. These projections may therefore represent a worst-case scenario rather than the most likely scenario for the spread of this invasive species. However, a precautionary approach to invasive species management is appropriate (Jiménez-Valverde et al. 2011; Cooney and Dickson 2012). The predicted increase in suitable habitats for starlings at higher elevations was expected, as increasing temperatures will allow starlings to move into environments that are currently too cold to inhabit.

Model limitations

While there was general agreement between the present-day projections and species distributions, there were also some minor disagreements which provide insights into our understanding of the species’ niches (and lack thereof). Regarding the myna, our ENM overpredicted suitable habitat at the southern limit of the species’ range in New Zealand (> 40°S), where it is occasionally recorded as present, but only in urban centres, i.e., in strong association with humans. We also overpredicted suitable habitat in a large native bush area (Whanganui National Park in the eastern-central North Island). There were also some underpredictions of suitable habitat along a main national highway (SH2) that runs west to east across the North Island axial mountain range, and in areas west and south of the central North Island volcanic plateau. Overall, these discrepancies imply that there may be other key environmental variables that modulate temperature (i.e., gdd5 and gsl) and limit the species’ energetic requirements (e.g., solar radiation, day length). In the native range, our ENM overpredicted suitable habitat in forests, and underpredicted suitable habitat in semi-arid regions in Central and South Asia (Fig. S3.1 vs Fig. 1A). This included areas in the species’ long-standing distribution (e.g., Pakistan and Rajasthan, India) and the newly invaded areas (Pakistan to Kazakhstan).

With regard to starlings, our ENM overpredicted suitable habitats in forested regions in New Zealand (e.g., the North Island axial mountain range). There was also some underprediction in the South Island’s Central Otago plateau. In the native range (native and native + NZ models), there was also underprediction of suitable habitats in colder regions and in the semi-arid region of Central Asia (Fig. S3.2 vs Fig. 1C). The native and native + NZ models projected into New Zealand overpredicted suitable habitat in forested areas, especially over the southwest temperate rainforest regions.

These small over- and underpredictions demonstrate how difficult it is for ENMs to model exactly the niches and distributions of these highly adaptable species. Firstly, ENMs assume equilibrium relationships between species occurrences and environmental variables across the calibration area. The degree to which this assumption holds in practice varies from case to case, and the relationships may also be altered in the presence of other environmental variables, especially biotic factors (e.g., presence of a food source, competitors, etc.). Biotic factors are undoubtedly important, and the strength of the relationship between species and these biotic factors likely varies in different environments. For example, in warm arid regions (e.g., Central Asia), anthropogenic variables such as human population density are likely very important in determining the myna’s distribution, as mynas are commensal with humans. On the other hand, in colder regions (e.g., New Zealand), temperature (e.g., represented by gdd5) may be a more important variable. Biotic factors are also difficult to include, as there are many species present in the environment and the strength of their interactions is often difficult to quantify. Our starling native and native + NZ models consistently predicted Iberia and North Africa as suitable habitats. However, the common starling (S. vulgaris) rarely breeds in these regions, presumably because Iberia and North Africa are occupied by the closely related spotless starling (Sturnus unicolor) (Ferrer et al. 1991). There are other species that may be in direct competition with mynas and starlings, but including these species in ENMs is very difficult as their interactions are often speculative (see Supplementary S3.1 for a list of species potentially in competition with mynas and starlings).

Secondly, environmental variables used by ENMs may not directly represent the variable which biologically limits the species’ distribution. In our ENMs, growing degree days above 5°C (gdd5), annual precipitation (bio12), and growing season length (gsl) were the main predictors of the species’ suitable habitat and represent the resource availability (food) and energetic requirements of the species in the environment. However, these variables are proxies rather than direct measurements and are therefore subject to inaccuracies. For instance, gsl was calculated using the TREELIM model (Paulsen and Körner 2014), which does not account for rivers, irrigation, and depressions where water accumulates. In Central Asia, both bio12 and gsl were very low, and the semi-arid region is known to rely heavily on rivers and irrigation for water, neither of which is captured by gsl or bio12. Rivers in this region (e.g., Indus, Amu Darya, Syr Darya) rely heavily on snow and glacial melt rather than local rainfall (Armstrong et al. 2019).

Similarly, the importance of the environmental variables to the species may be scale dependent. We included percentage tree cover (tree_cover_pct) as a predictor in our ENMs to represent both species’ strong preference for open and edge habitats (Markula et al. 2009; Heather and Robertson 2015; Kannan and James 2020), but it was not important in our myna and starling ENMs. We believe that at the 1 km2 grid cell resolution, tree_cover_pct was not representative of the species’ preference for open habitats. A 1 km2 grid cell covered in forest with a road passing through it will have a high tree_cover_pct value, while the two sturnid species will still be able to take advantage of the open habitats along the roadsides.
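The road example can be made concrete with a small calculation. The sketch below represents one 1 km2 grid cell as 50 × 50 sub-cells of 20 m, all forested except for a single open strip, and computes the percentage tree cover that would be reported for the whole cell. The 20 m corridor width and the function name are hypothetical choices made for illustration only.

```python
def tree_cover_pct(cell):
    """Percentage of sub-cells that are forested (1 = forest, 0 = open)."""
    flat = [value for row in cell for value in row]
    return 100.0 * sum(flat) / len(flat)


SIDE = 50  # 50 x 50 sub-cells of 20 m make up one 1 km2 grid cell

# Start from a fully forested cell, then open a strip one sub-cell wide.
cell = [[1] * SIDE for _ in range(SIDE)]
for row in cell:
    row[SIDE // 2] = 0  # hypothetical 20 m wide road corridor

cover = tree_cover_pct(cell)
has_open_habitat = any(value == 0 for row in cell for value in row)
```

The cell is summarised as 98% tree cover even though it contains a continuous 1 km strip of open roadside habitat, which is the mismatch between predictor resolution and habitat use described above.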

In previous studies, anthropogenic effects (represented by impervious surface area and population density) have been consistently found to be the most important for the mynas’ prevalence (Magory Cohen et al. 2019). However, we found bioclimatic variables (e.g., growing degree days above 5°C and annual precipitation) to be most important, with varying importance for population density. We did not use impervious surface area in our ENMs as there were no future projections for this variable. These differences in variable importance may be due to the differences in the datasets (e.g., training area, spatial resolution, occurrence points). However, despite the different environmental variables and their contributions in the studies, similar shortcomings were observed—overprediction of suitable myna habitat in forests, and underprediction of projected suitable myna habitat in the semi-arid regions in Central and South Asia (see Figs. 3 and 4 in Magory Cohen et al. (2019))—highlighting the challenges in modelling highly adaptable species’ niches where anthropogenic and other biotic factors are important, difficult to quantify, and may alter other species-environment relationships.

Niche overlap analysis

Our niche overlap analysis supports our hypothesis that the niches do not differ geographically within the species but are divergent between the species. Both niche equivalency statistics, based on the entire calibration area (NOT) and on analogous areas (NDT), failed to find evidence that myna niches differ between the native range and New Zealand. In starlings, only the NOT niche equivalency statistic was significant when comparing the species in the UK and New Zealand. This suggests that the niches in the UK and New Zealand were different, but only due to environmental availability (Brown and Carnaval 2019). This somewhat supports the niche expansion observed in previous studies despite the different scale of the study (Fernández and Hamilton 2015). Starlings appear to have invaded areas with warmer climates (e.g., Jamaica, New Zealand’s Northland). When niche overlap analysis was performed on analogous areas (NDT), the niches in the UK and New Zealand were not significantly different from each other. Nonsignificant background statistics are also suggestive of limited power to detect a difference if there was one. When comparing mynas and starlings in New Zealand, both NOT and NDT niche equivalency statistics were significant. Irrespective of the significance of the background statistic, this suggests niche divergence between the two species in New Zealand (see Table 2 in Brown and Carnaval (2019) for more details on interpretation of the results).

The lack of evidence for niche divergence between New Zealand and the native range for both the common myna and common starling supports the notion that the invasion success of the two species in New Zealand is mostly attributable to the presence of suitable bioclimatic conditions. This invasion success is likely aided by very large niches that cover a wide range of bioclimatic conditions. This supports previous findings that avian introduction success is largely attributable to the presence of suitable environmental conditions (Blackburn and Duncan 2001).

Implications and future analyses

Our study found that local ENMs were best at describing the species distribution in the local range, and algorithms which performed well in one region may not perform well in a different region. While these findings have been observed in previous studies (Osborne and Suárez-Seoane 2002; Barbosa et al. 2009; Valavi et al. 2022), invasive species provide an excellent opportunity for testing the generality of this conclusion and future analyses with other invasive species would be rewarding. Our ENMs based on five biologically justified variables performed well, with good agreement between predicted suitable habitats and present-day species distributions. Nonetheless, there were some disagreements between predicted suitable habitat and the current species distribution which allowed us to evaluate the shortcomings and reassess the environmental variables used.

Our finding that the niches are conserved within the two species suggests that Aotearoa New Zealand contains environments occupied by the species in the native range, a “home away from home”. Similar observations have been reported for a majority of invasive species (Aravind et al. 2022). This raises questions for future research regarding niche evolution: even if the niches are generally conserved, do mynas and starlings experience novel selection pressures in New Zealand following introduction? How long will it take these isolated invasive populations to diverge from those in the native range? Can we detect signals of selection through other approaches (e.g., genetic analyses)?

Our future projections provide management with direct tools to assess the invasion risks of both mynas and starlings in New Zealand. Starlings already occupy the majority of New Zealand, and the predicted increase in suitable habitats is small and may not carry much weight for management teams. However, the increase in suitable habitats for mynas in the South Island as the climate changes supports early removal of the species by management teams if sighted in the South Island. Mynas are currently restricted to the North Island, and Cook Strait (approximately 22 km at the narrowest point) provides a natural barrier/resistance to the species’ dispersal to the South Island.

This study provides an example of how invasive species may be used to assess ENM transferability and how niche comparisons may be performed. Our findings improve our understanding of mynas and starlings’ niches, and highlight areas at risk of future invasion and range expansion.