Introduction

It is well-known that protected areas are important in mitigating global threats to biodiversity, like land conversion and degradation, direct anthropogenic pressure and climate change and are thus fundamental conservation tools (Talbot 1984; Parr et al. 2009; Beale et al. 2013; Coetzee 2017). Some of the world’s largest protected areas include savannas located within a mosaic of diverse land-uses, driven by economic, social and environmental processes (UNEP-WCMC 2016). Within these systems, drivers of land-use change and, consequently, land cover change include (but are not limited to) agriculture, urbanisation, alien-invasive plant species and climate change (Eriksen and Watson 2009). In the savannas of north-eastern South Africa, much of the space is dedicated to conservation and maintaining biodiversity and ecosystem functioning. Here, the large protected area, the Kruger National Park (KNP), is situated in the semi-arid part of the Savanna Biome and constitutes ~ 2 M ha of protected land and hosts more than 2800 plant and animal species. The animal community per se equates to some 50% of all South African savanna species (von Maltitz and Scholes 2008; SANParks Scientific Services 2020) and is a stronghold for threatened birds such as vultures whilst also hosting a species-rich bird community. However, only 15 of 545 (2.8%) peer-reviewed publications that emanated from projects within the park between 2003 and 2013 were bird-related (Smit et al. 2017). Most of these publications focused on threatened or individual species’ ecology, with few studies looking at bird communities and the effects of various disturbances from inside (e.g., elephant impacts) and outside the park [e.g., agriculture, urbanisation, wood harvesting and bush encroachment; Wessels et al. 2011; Coetzee and Chown 2016; Loftie-Eaton 2018; Weideman et al. 2020)].

Bird projects like the Southern African Bird Atlas Project, a citizen science project with over 17 million records, allow questions to be asked about the biogeography of birds across large spatial and temporal scales, and communities’ or species’ responses to disturbances. Birds also encompass a wide variety of functional traits and responses to disturbance (Schulze et al. 2004; Şekercioğlu 2006). For example, ecosystem characteristics like increased vegetation greenness and structure, and disturbances like urbanisation may promote bird diversity or species richness (Lee et al. 2021), but diversity metrics like functional and phylogenetic differentiation (diversity accounting for the effect of species richness) might show opposing trends (Ehlers Smith et al. 2022). Also, components and metrics such as turnover, nestedness and its sum (i.e., beta diversity), may also be important but not often considered in conservation (Villéger et al. 2008; Baselga 2010; Mason et al. 2013; Socolar et al. 2016).

It is unknown how bird communities inside and adjacent to KNP are composed and what the regional drivers are. In this study, we investigated: (i) how species richness, functional and phylogenetic diversity, species rarity, and beta diversity (including turnover and nestedness components) differed across KNP and the adjacent mosaic; and (ii) whether any habitats, habitat heterogeneity and proximity to the protected area boundary, affected these metrics. We also (iii) investigated whether differences in the sample size of our citizen science dataset resulted in significant differences among the results of the analyses. We predicted that KNP would support lower species richness and nestedness, but higher functional, phylogenetic diversity, turnover and rarity compared with the surrounding mosaic which comprises multiple land-uses. Resources, like food, inside KNP are widely scattered; but in more urban environments, supplemental feeding at high-density supports more species, albeit generalists that are smaller-sized birds adapted to human-modified landscapes.

Since we were also interested in the habitats that potentially shape avian community composition, we predicted that an increase in certain habitats like anthropogenic infrastructure would promote species richness, but habitat heterogeneity and closer proximity to the KNP boundary would also promote species richness, and functional and phylogenetic diversity. However, we predicted that, for example, infrastructure will negatively influence functional and phylogenetic diversity, compared with the benefits that increased surface water and vegetation structure may offer to overall bird diversity. These predictions are based on observations that urban areas (typical in the mosaic) often host more species than adjacent land-uses like agriculture or more natural habitats; however, this comes at the expense of specialist species, which typically increase functional diversity and contribute species of different evolutionary ages to an area’s bird community.

Vegetation structure and composition affect bird species differently and consequently, are likely to have varying impacts on the different bird diversity metrics. Finally, we predicted that a difference in sample size of approximately 50% would not result in any strong, significant changes to the results of our analyses, which we tested by selecting a random subset of grid cells (sample sites) that were representative of the larger data set.

Methods

Study region

We conducted our study in the north-eastern part of the savanna biome of South Africa (between 22.3° South, 30.5° East and 25.8° South, 32.0° East; Fig. 1), which includes the KNP, and the surrounding mosaic that contains diverse land-uses and disturbances. Some land-uses in the surrounding mosaic are agriculture, peri-urban informal settlements, game farming, mining and wildlife conservation. The total area of the study region, of which ~ 60% consisted of the protected area (KNP) and ~ 40% mosaic, equates to approximately 1.85 M ha.

Fig. 1
figure 1

Map of the study region displaying filled grid cells (pentads of approximately 7800 ha) with > 10 checklists submitted from 2011 to 2020. The darkest cells represent the mosaic, and the other cells (i.e., the medium and light-coloured cells) represent the protected area. Medium-coloured cells represent a random subset (n = 73) of the protected area grid cells that we employed in our analyses. The extent of the Savanna Biome in South Africa per se, is indicated as the hatched area. The Kruger National Park’s boundary (protected area) is indicated with a black dashed line, and the inset map shows the location and extent of the study region in southern Africa

Bird data

We used species occurrence data from one of the world’s largest citizen science projects—the second Southern African Bird Atlas Project (SABAP2; Lee et al. 2022). The project ensued in 2007 with volunteers recording bird species by traversing grid cells (named pentads that are 5 min by 5 min coordinate grid cells of approximately 7800 ha; Fig. 1), and submitting checklists of recorded bird species within the cell boundaries while spending no less than 2 h observing. At the end of 2020, the project had collected approximately 250,000 checklists containing bird distribution records, mostly from across southern Africa. The project continues to grow as more people submit data at a rate of approximately 30,000 checklists per annum. (Lee et al. 2022). Moreover, since the initiation of the BirdLasser smartphone application in 2014, there has been an exponential increase in SABAP2 data submissions (Appendix Fig. S1; Lee and Nel 2020).

Of the 482 contiguous grid cells covering the study region, we analysed 139 grid cells from the KNP (including a random subset of 73) and 73 from the mosaic (Fig. 1). In order to account for the possible effects that mountainous areas and their unique habitats (like high tree cover) may have had on bird diversity, we excluded grid cells from the mosaic containing elevations higher than that of the protected area (i.e., > 854 m.a.s.l.). All grid cells had a minimum of 11 bird checklists submitted from 2011 to 2020. This minimum number of lists results in nearly all common species being recorded, according to Underhill and Brooks (2016). Moreover, we only used bird checklists from grid cells that complied with SABAP2 full protocol, data collection standards (Underhill 2016). This well-structured protocol requires observers to record species within a grid cell for at least 2 h, and the citizen scientists are encouraged to visit as many habitats as possible during this period. Despite potential caveats of data collected by citizen scientists [e.g., bias because of preference to visit habitats like urban and protected areas (Geldmann et al. 2016; Lee et al. 2021)], we feel confident in using data from this project. Specifically, we used pentads with double-figure checklist submissions (as explained above), we weighted or offset certain analyses by the number of checklists submitted, used SABAP2’s structured protocol and record vetting process, and relied on the generally good observer skills of citizen scientists participating in this project. Moreover, added workshops and tutorials offered by the project for anyone wanting to participate ensured adherence to this well-structured protocol.

Habitat data

We extracted remotely sensed habitat classifications for each of the 212 grid cells from the Copernicus Global Land Cover Layers: CGLS-LC100 collection 3, by employing Google Earth Engine (after Gorelick et al. 2017; Buchhorn et al. 2020). The spatial resolution of this product was 1 ha, with a temporal resolution of five years (2015–2019). This temporally-limited environmental data corresponded with the period when most bird data were collected [i.e., since the introduction of the BirdLasser smartphone application (Appendix Fig. S1)]. The mean cover values (%) were computed across the five years, and for each grid cell for the following habitat classifications: tree, shrub, grass, bare ground, anthropogenic infrastructure, seasonal water and permanent water. We also computed the proximity (distance) of each grid cell’s centre, to the boundary of the KNP to determine if edge effects influenced bird diversity. Lastly, we computed habitat diversity using cover values for all the habitat classes and the Shannon–Wiener index to test whether habitat heterogeneity significantly influenced bird diversity.

Analyses

Bird communities

To visualise how bird communities were composed across the region in ordination space, we employed nonmetric multidimensional scaling (NMDS; Kruskal 1964), computed from a Bray–Curtis dissimilarity matrix that was derived from observed presence/absence data. The latter was constructed using a grid cell (pentad) by species community matrix with study area (mosaic or protected) as a grouping variable. Once the NMDS found the best convergent solution (after a second run) using a maximum of 9999 iterations and random tries, weighted averages scores, and reached a stress value of ≤ 0.1, grid cells were plotted with different colours depending on the study area they were located in. Ellipses (95% confidence interval) were also drawn for each study area on the ordination plot to easily visualise how the mosaic and protected area grid cells were composed across the study region as a function of bird species composition. We employed the metaMDS function of the vegan package for ordination purposes. This was used to construct a dissimilarity matrix in order to ultimately perform nonmetric multidimensional scaling (Oksanen et al. 2020).

We calculated asymptotic species richness values (SR.asymp) for each grid cell using sample-size-based extrapolation using the observed sample of bird species incidence-frequency data (Hsieh et al. 2016). The sampling units (number of checklists submitted) of each grid cell were extrapolated to 1000 (that ensured an asymptotic value was reached) in knots of 10 for deriving estimated richness and ultimately comparing grid cells in the two respective study areas. The iNEXT package developed by Hsieh et al. (2016) was employed for these analyses.

Functional richness (FRic; Villéger et al. 2008) is one of the most popular metrics for measuring functional diversity (Díaz and Cabido 2001) and considered important for detecting trait-based assembly processes and their influence on species occurrences (Mason and de Bello 2013). In our case (i.e., because of a combination of numerical and categorical traits), FRic was typically computed using the existing convex hull volume index and the requirement is that each species is assigned various functional traits belonging to different functional groups that were, for this study, adapted from Hockey et al. (2005). The traits we included were body mass (g), primary diet, nest shape, primary nesting location and primary movement. Primary diet traits were aquatic (specifically plants and arthropods living in freshwater systems), carrion, carnivore, frugivore, granivore, herbivore, insectivore, nectarivore, omnivore or piscivore. We assigned the following nest shapes to species: ball, cavity, cup, non-breeding migrant, parasite or platform. Primary nest location traits were: burrow, cliff and infrastructure, grass, ground, obligate infrastructure, non-breeding migrant, parasite, shrub, tree or water. Primary movements were resident, nomad, vagrant, intra-African migrant, Palearctic migrant and Nearctic migrant.

Because of the trait combination we used (i.e., numerical and categorical traits) in the calculation of FRic, a Gower dissimilarity matrix was constructed, after which square root transformations were applied to achieve a Euclidian species-species distance matrix. Principal Coordinates Analysis (PCoA) was finally performed (in the backend as part of the FRic computations), and the resulting axes were used as the ‘new traits’ for computing FRic. We used an independent swap null model to compute a final FRic value for each grid cell after performing 99 permutations. This was used to compare our study area datasets with a randomised dataset (produced by a swap algorithm) because null models provide specificity and flexibility in data analysis, often not possible with conventional statistical tests (Williams 1947; Gotelli 2001). To minimise the effect of species richness on FRic, we finally derived the standardised effect size of functional richness (SES.FRic) for each grid cell (Plass-Johnson et al. 2016). The standardised effect size is a unitless metric and measures the number of standard deviations that observed FRic is above or below the mean of the simulated bird community derived from the randomised dataset. The null hypothesis was, thus, that the average SES, measured for our presence-absence matrix containing the study region’s grid cells, is considered zero (Gotelli and McCabe 2002). For calculating functional richness (FRic), the dbFD function [a distance-based framework from the FD package (Laliberte and Legendre 2010)] was employed.

We calculated Faith's (1992) phylogenetic diversity (PD) by sourcing 1000 trees using the Hackett et al. (2008) backbone from phylogenetic species data provided by Jetz et al. (2012; http://birdtree.org/). The trees were developed from a subset of species (n = 491) in the study region for which data existed, resulting in the omittance of 50 species (easily misidentified congeners, rare, vagrant or birds considered subspecies in southern Africa). For this species subset, we applied a single dichotomous tree using majority rules consensus (i.e., 95% confidence interval and ultrametric modifications). Once a single tree was developed for the region, the sum of the phylogenetic branch lengths for all bird species (i.e., PD in each grid cell), was computed. However, to minimise the effect of species richness on PD, we also computed the standardised effect size of PD (SES.PD) as per Kembel et al. (2010) using these parameters: number of runs was set to 999, 1000 iterations per run and a null model where the taxa labels were shuffled across the tips of the phylogenetic tree. The reason for computing SES is explained in the previous section. Phylogenetic tree outputs from Jetz et al. (2012) were processed using phytools (Revell 2012) and SES.PD values were computed using the picante package (Kembel et al. 2010).

To determine each species’ rarity (rr) in both study areas, we used the centre latitude and longitude coordinates in decimal degrees of each grid cell and the reporting rate of each species (percentage as a proxy of abundance) recorded in each cell. Rarity values ranged from 0 (most common) to 1 (most rare) and followed a recent species-level index developed by (Maciel 2021) that is based on Rabinowitz’s scheme. This rarity index (rr) is computed as follows: given a list of species for a study area, rr = median(gri + hsi + psi) for each species, namely, the inverse of the geographical range across a study area (gri), the maximum number of grid cells in the study area (i.e., sample size; hsi) and the maximum reporting rate in any grid cell, for each species (psi). We computed the rarity for each species in each study area using the rrindex package developed by (Maciel 2021).

We tested for significant differences in SR.asymp, SES.FRic, SES.PD and rr between the mosaic and protected area using simple Wilcoxon-Mann–Whitney tests (Wilcoxon 1945) and an asymptotic distribution for all four biodiversity metrics. We employed the coin package for performing the tests (Hothorn et al. 2008).

We also investigated dissimilarities in bird diversity among the region’s grid cells and, ultimately, area-specific beta diversity (including its turnover and nestedness components) using the Sørensen dissimilarity index. Thus, we computed beta diversity for the mosaic and the protected area, respective to one another. The three beta diversity components, as per Baselga (2010), that we computed were βSIM (spatial turnover), βSNE (nestedness) and βSOR (overall beta diversity, which is additively decomposed into turnover and nestedness). Turnover is the replacement of some species by others, whereas nestedness occurs when the communities of sites with smaller numbers of species are subsets of communities at species-richer sites (Wright and Reeves 1992; Qian et al. 2005; Ulrich and Gotelli 2007; Gaston and Blackburn 2007). We treated the number of grid cells covering each study area as the sample sizes for these analyses. Notwithstanding, since we could not weight these analyses by the number of checklists submitted thus, we treated the beta diversity results with caution; however, exploratory analyses did not produce a significant difference between the median number of checklists per grid cell, for either study area (Appendix Fig. S2). We used the package betapart to compute this metric and its resultant components (Baselga and Orme 2012). Program R (R Core Team 2020) functions and packages, as listed above, were employed to perform all the analyses.

Drivers of bird communities

To explore the potential effects habitat had on bird communities in each of the study areas (mosaic and protected), we employed a generalised linear mixed-effects model (GLMM) approach. This was fitted with multivariate normal random effects using a Template Model Builder and Laplace approximation. Despite being less flexible than other techniques, it is more accurate and relatively faster (Bolker et al. 2009). For the response fixed effects, we used SR.asymp, SES.FRic and SES.PD, as a function of eight predictor fixed effects (habitat classes, proximity to the KNP boundary and an offset). All models were set up to use the Gaussian family with the identity link (sigma2) because of the presence of negative values, a typical feature of the SES computation results. Unique grid cell (pentad) identities were set as the random effect because we expected them to vary independently based on their bird communities and habitat classifications. Finally, we used the number of bird checklists submitted for each grid cell (observer effort), as an offset predictor fixed effect. To test for model fit, we produced different models with different combinations of predictor fixed effects (all models had the offset included), for each of the three aforementioned metrics, with all other parameters kept the same. Due to large and strongly significant correlation coefficient values with other predictor fixed effects produced by generalised pairs plots, we excluded habitat diversity and grass cover from the final model. Thus, we selected the model that produced the smallest AIC value and excluded the latter predictor fixed effects for each bird metric. The glmmTMB package in Program R was used to perform TMB GLMMs (Kristensen et al. 2016; R Core Team 2020).

Results

Bird community patterns

Nonmetric multidimensional scaling, an ordination of the bird data, produced final grid cell scores (after a second and final run) using two dimensions, a minimum stress value of 0.069 and at 10052 random tries (i.e., there was no convergence after 9999 tries) after employing the Bray–Curtis dissimilarity distance matrix. The majority of the protected area’s bird communities were similar to ~ 50% of the communities in the mosaic, in ordination space (Fig. 2). Secondly, the majority of the mosaic grid cells (n = 73) spanned a somewhat smaller range on the first NMDS axis than the protected area (n = 139) and thirdly, the mosaic grid cells spanned a range more than twice the length, on the second axis. Thus, at least half of the bird community of the mosaic was distinct from the protected area, suggesting that the mosaic and the protected area were composed differently.

Fig. 2
figure 2

Nonmetric multidimensional scaling (NMDS) ordination plot showing grid cells coloured by respective study area (the dark colour indicates the mosaic, and the light colour indicates the protected area). Ellipses enclose 95% (confidence interval) of the associated grid cells in the respective study areas

Asymptotic species richnesses (SR.asymp) were not significantly different between the mosaic and the protected area (Z = 0.710, P = 0.477; Fig. 3a). However, despite this result, there were differences in the shapes of the violin plots we produced. Firstly, the frequency distributions differed (i.e., modal species richness in the mosaic peaked at ~ 250), whereas modal species richness in the protected area spanned from 225 to 275; however, these differences were subtle. A second, more evident result was the larger range of estimated species richness values (lower, upper quartiles and violin tails that include outliers) in the protected area compared with the mosaic area. At KNP, the grid cells ranged from those with as little as ~ 120 species to some with over 400 estimated species. The mosaic’s grid cells spanned a smaller range and contained between ~ 150 and ~ 370 estimated species across the area. The estimated species richness for each grid cell, corresponding standard error, confidence limits and observed species richness are indicated in Appendix Fig. S3. Although there was much variation in confidence intervals among the grid cells, these uncertainties were similar between the two study areas. The random sub-sample of the protected area median and those of the other study areas were not significantly different from one another, and frequency distributions were relatively similar to those of the larger protected area sample. A list of observed bird species, the study area they were recorded in, their functional traits and respective rarity values appear in Appendix Table S1.

Fig. 3
figure 3

Box-and-whisker plots and associated violin plots of a SR.asymp, b SES.FRic and c SES.PD showing differences in median values and frequency distributions between the two study areas and a subset of samples from the protected area (n = 73). The study area medians for SES.PD were significantly different from one another (i.e., the protected area produced larger values when a larger sample size was employed)

For FRic calculations, we chose to keep seven PCoA axes as the quality of the reduced-space representation was 0.702 (an R2-like ratio) and increasing or reducing the number of axes resulted in lower qualities of reduced space. After 99 independent swap null model permutations of FRic, we computed SES.FRic; however, values were neither significantly different between the two study areas (Z = −0.131, P = 0.896; Fig. 3b) nor between the random protected area sub-sample and the other two study areas. However, the violin frequency distributions were differently-shaped across the study areas.

We built an ultrametric, majority rules consensus phylogenetic tree using 491 bird species from the study region, 1000 trees, and then converted this to be dichotomous using a smoothing parameter of zero and random resolving of multichotomies. Weak but significant differences in median SES.PD values were produced between the mosaic and protected area (Z = -2.018, P = 0.044; Fig. 3c); however, when the protected area was sub-sampled, the differences were not significant. Specifically, the larger protected area sample produced older evolutionary-age bird assemblages (i.e., generally larger SES.PD values).

Bird species in the protected area were significantly more rare (median rr = 0.432, n = 500) than in the mosaic (median rr = 0.428, n = 482; Z = −4.164, P < 0.001). In the former, the three rarest species (all with rr = 1.000) were Fawn-coloured Lark (Calendulauda africanoides), Cape Crow (Corvus capensis) and Olive Woodpecker (Dendropicos griseocephalus). In the mosaic, the three rarest birds (rr = 0.833) were Sedge Warbler (Acrocephalus schoenobaenus), Caspian Plover (Charadrius asiaticus) and African Snipe (Gallinago nigripennis). Conversely, two of the three most common species were shared among the two study areas. In the protected area, the most common species (rr = 0.418) were Blue Waxbill (Uraeginthus angolensis), Arrow-marked Babbler (Turdoides jardineii) and Laughing Dove (Streptopelia senegalensis), whereas in the mosaic (rr = 0.404) these were Blue Waxbill, Long-billed Crombec (Sylvietta rufescens) and Laughing Dove. We did not perform this analysis using the subsetted sample from the protected area grid cells.

Beta diversity and its components produced results as per our predictions [i.e. smaller nestedness but larger species turnover values inside the protected area (Table 1)]. Also, the turnover component (βSIM) contributed the most for both study areas. Conversely, the generally smaller nestedness components (βSNE) were similar between the mosaic and protected area. Overall beta diversity of the two areas was relatively similar (Table 1).

Table 1 Beta diversity (βSOR sum of components) and its decomposed components: turnover (βSIM) and nestedness (βSNE) across the two study areas

Drivers of bird community patterns

For all the bird diversity metrics and across both study areas as well as the subsetted protected area, we selected the model that excluded grass cover because of its strong significant correlation with tree cover (Pearson correlation coefficient = −0.773). We also excluded habitat diversity because of multiple strong and significant correlations with most of the habitat classes. The cover and composition of habitats across the study areas and habitat diversity (i.e., Shannon–Wiener diversity index) are visible in Fig. 4. Note the near identical values between the full and sub-sample of the protected area.

Fig. 4
figure 4

Box-and-whisker plots of habitat classification cover values for each of the respective study areas, including a random sub-sample of the protected area (n = 73). Habitat diversity is also displayed, and we computed this using the Shannon–Wiener index. Overlapping boxplot notches indicate non-significant differences in the median values and vice versa between the two study areas. Vertical black lines represent lower and upper confidence limits (2.5–97.5%), with black dots representing outlier grid cells. To present a large range in cover values and accommodate the diversity index values, the y-axis scale was log-transformed

A couple of habitats strongly drove estimated species richness (SR.asymp; Fig. 5a) across the two study areas and the sub-sample of the protected area. In the protected area (KNP), infrastructure and seasonal water were strongly positive and highly significant drivers of species richness. Moreover, there were zero negative associations with any habitat in the protected area. On the contrary, permanent water sources had a near significant negative effect on estimated species richness in the mosaic, whereas seasonal water promoted the latter as it did inside KNP. Also, proximity to the KNP boundary was a weak but significant negative driver of species richness (i.e., the closer a grid was to the boundary, the more species were recorded). Functional richness (SES.FRic; Fig. 5b) was strong and significantly but negatively driven by infrastructure inside the protected area. Similarly, SES.PD (Fig. 5c) showed a strong negative association with infrastructure in the protected area, but also in the mosaic. However, the strongest effect on phylogenetic diversity was increased tree cover's negative impact on both study areas and the protected area’s sub-sample.

Fig. 5
figure 5

GLMM estimates from bird diversity metrics as a function of habitat classification and proximity to the KNP boundary. Effects are displayed for: a SR.asymp, b SES.FRic and c SES.PD and horizontal lines indicate the 95% confidence intervals for each estimate. Study area and differences in sample sizes are indicated by different shades of grey (not representative of other figures): mosaic (light shade; n = 73), random sub-sample of the protected area (medium shade; n = 73) and a full set of grid cells in the protected area (dark shade; n = 139). Habitat classifications and the proximity of a grid cell to the protected area (KNP) boundary, which showed significant effects, are indicated as: *** P ≤ 0.001; ** P ≤ 0.01; * P ≤ 0.05

Discussion

We found that human infrastructure, seasonal water sources and tree cover were the most significant drivers of bird diversity across the eastern part of the savanna biome of South Africa. Specifically, the KNP’s bird communities were filtered strongly by the presence of infrastructure, more so than any other habitat: a result found by others (Chown et al. 2003; Burgess et al. 2007). Grid cells containing infrastructure in this study area hosted a species-rich bird community; however, it came at the cost of functional and phylogenetic homogenisation, similar to other findings in the region (Evans et al. 2011; Coetzee and Chown 2016; Weideman et al. 2020). This meant that the infrastructure inside KNP, albeit it is limited, (ranging in cover from 0–1.2% compared with 0–24.4% in the mosaic, respectively) served as oases for certain species, likely generalists such as Laughing Dove, that were able to make use of the supplemental food, water and nest sites typically provided by small, scattered tourist rest camps across the park. This is shown in Appendix Fig. S4 where these types of infrastructure inside KNP drove species richness but to a lesser extent outside in the mosaic where the presence of seasonal water drove species richness. It is also evident that asymptotic species richness per se is related to the number of checklists submitted; however, when we accounted for this in the GLMM, the relationship inside KNP remained. In the mosaic, only phylogenetic diversity responded (negatively) to increases in infrastructure that typically took on the form of rural villages, small towns and a city (Mbombela). Contrary to our prediction, this pattern thus only supports some of the previous landscape-scale findings from our study region [i.e., our results do not show that infrastructure in the mosaic promoted species richness (Coetzee and Chown 2016; Weideman et al. 2020)]. Thus, across larger scales compared with that of typical landscape studies in the South African savanna, a small increase in infrastructure may result in substantial biodiversity increases but at the expense of specialist species that serve unique functions and comprise different evolutionary ages.

A second important driver of species richness, mostly in the protected area but also outside of this in the mosaic, was seasonal water; however, this habitat promoted phylogenetic diversity in the mosaic study area. We believe that aquatic habitats contribute a significant amount of bird species to the terrestrial-dominated landscapes of the region because of the unique resources these provide (e.g., food), which consequently led to the occupation of species with unique functions at the relevant grid cells. Specifically, seasonal water like large rivers that ran through the region (Appendix Fig. S4), were dynamic systems and provided a range of habitats that varied, for example, when flow affected water depth. This, in turn, allowed different species to use the same habitat across different seasons. For example, generally, wading birds require shallow water, which is typical during the latter part of the rainfall season, while piscivores require deeper waters that accommodate their food during earlier parts of the season. In contrast, permanent water bodies provided more stable habitats (e.g., less temporal habitat diversity because of more stable water depths throughout the year).

However, functional richness per se, was not affected by the presence of water, probably because of the small number of unique bird life history traits (e.g., primary diet: piscivore) that this habitat contributed to an area’s bird community. Surprisingly, the mosaic’s water bodies hosted bird communities of which the species pool comprised more diverse evolutionary ages. Such differences in patterns of bird diversity between the two study areas’ water bodies could have been because of water cover and physiognomy, shoreline composition or terrestrial vegetation cover, as Andrade et al. (2018) found. We could not determine with the spatial datasets that were implemented whether the latter two drivers were, in fact, different immediately around the water bodies, but when we explored the former further, it was clear that the mosaic contained more lentic systems (e.g., dams) that persisted across seasons, experienced increased human disturbance because of the proximity to and the magnitude of adjacent urban cover. The water bodies here were also closer to one another (i.e., highest density). This compared with the protected area, whose water bodies typically took the form of six major lotic systems (i.e., perennial rivers, of which most were highly seasonal) and widely spaced from one another. These rivers were the Limpopo’, Levhuvhu’, Letaba’, Olifants’, Sabie’ and the Crocodile River, all of which originate from outside the protected area but also pass through the mosaic. Basic statistics also showed us that the protected area was covered on average by much less seasonal water (0.2%) up to a maximum of 2.5% compared with the mosaic that on average (0.3%) contained larger areas covered by seasonal water and covering up to 3.5% of the study area’s grid cells. Thus, we found, at the least, that water per se is important for maintaining bird diversity despite its small contribution across the region. It is imperative that water across a large biogeographical area, such as our study region, needs to be considered and protected from processes like pollution and alien-invasive plants, whether these are local or threaten other parts of the catchment. The conservation of freshwater systems is important for biodiversity conservation as a whole, which is evident elsewhere (Dudgeon et al. 2006; Rolls et al. 2018; Tu et al. 2020).

Across the region, different processes drive vegetation structure; this ranges from wood harvesting (Wessels et al. 2011) to bush encroachment (Loftie-Eaton 2018) in the mosaic and a growing elephant population (Ferreira et al. 2017) in the protected area. However, despite these differences, tree cover was nearly identical among the two study areas (i.e., mosaic: \(\overline{x }\) = 17.6%, minimum = 7.8%, maximum = 33.8%; protected area: \(\overline{x }\) = 15.6%, minimum = 4.6%, maximum = 33.2%). Tree cover was strongly negatively related to phylogenetic diversity across the region, and negatively related to functional diversity in the mosaic mostly. When we inspected the spatial distributions of habitat classes, it was evident that the largest tree cover values were from riparian zones along rivers and drainage lines, mountainous terrain and lower-lying areas of the landscapes (which was linked to the former). Moreover, as our strong and negative tree-grass correlation result showed, these areas typically contained little or no grass cover. Ultimately, because the trees reduced grass cover in our study region, we would expect various bird functional traits, such as granivory (e.g., the region’s granivorous birds that also comprised of multiple other traits, like a diversity of nest shapes), and birds of various evolutionary ages to be excluded; and this was reflected in the negative relationship between tree cover and diversity metrics. Thus, our results are similar to other studies in savanna where increases in tree cover reduced bird diversity (Skowno and Bond 2003; Loftie-Eaton 2018). Despite the processes resulting in increased tree cover, we found that heterogeneity of vegetation structure is important across large spatial scales. This has implications for biodiversity conservation across protected and more human-modified landscapes.