1 Introduction

Knowledge of spatial patterns of genetic diversity is fundamental for the conservation of native forest tree species since it can inform the design of conservation strategies (Frankham 2003; Escudero et al. 2003; Davies et al. 2013). Fine-scale spatial genetic structure (FSGS), defined broadly as the non-random spatial distribution of genotypes, has been described in many plant populations (see reviews in Vekemans and Hardy 2004; Troupin et al. 2006; Hardy et al. 2006; Dick et al. 2008; Buzatti et al. 2012; Berens et al. 2014; Ewédjè et al. 2017). FSGS is shaped by the complex interplay of microevolutionary processes (i.e., genetic drift, gene flow, and natural selection) and mating system (Epperson 2003; Dick et al. 2008). In non-equilibrium populations, demographic history can also affect FSGS patterns: for example, locally co-occurring gene pools or allele frequency gradients can result from multiple waves of colonization and population establishment or from habitat-mediated selection (Hampe et al. 2010; Torroba-Balmori et al. 2017). Therefore, information on the demographic history is also essential to determine which mechanisms are responsible for generating the temporal dynamics of FSGS (Jones and Hubbell 2006).

The fragmentation of formerly continuous tree populations disrupts natural ecological and evolutionary processes, and can adversely modify their genetic composition (Hamrick 2004). Decrease in forest fragment size, together with increased fragment isolation, habitat loss, and reduction in tree densities, may lead to genetic erosion due to the effects of genetic drift and inbreeding, and lower gene flow among forest remnants (Young et al. 1996; Aguilar et al. 2008; Fageria and Rajora 2013). These effects are exacerbated when reductions in population size occur in populations that display spatial genetic autocorrelation (Nason and Hamrick 1997; Aldrich et al. 1998).

In long-lived tree species, changes in effective population size, defined as the average number of individuals that contribute offspring to the next generation (Ridley 2003), may be slow and difficult to detect. Moreover, the fact that genetic processes operate in time units of generations may introduce important time lags in the response of forest trees to landscape change, including fragmentation (Wagner and Fortin 2013). Time lags between landscape change and subsequent genetic change can vary from a few to several thousand generations depending on the type of landscape change and the specific genetic parameters studied (Epps and Keyghobadi 2015). For example, after a change in subpopulation connectivity, measures of inbreeding or heterozygosity move towards equilibrium much more slowly than measures of genetic differentiation (Epps and Keyghobadi 2015, and references therein). Nevertheless, the comparison of the spatial pattern and genetic structure across different life stages has proven to be a useful tool to study the ecological and evolutionary processes underlying population dynamics in forest trees (see examples in Jones and Hubbell 2006; Berens et al. 2014). In forest trees, saplings are usually spatially clustered and adult trees are more randomly distributed, suggesting assortative mating and/or density-dependent mortality (e.g., Janzen-Connell effects; Clark and Clark 1984; Comita et al. 2014).

Here, we studied the FSGS and the demographic history of a remnant population of the insect-pollinated and gravity-dispersed subtropical tree Anadenanthera colubrina var. cebil (Griseb.) Altschul, Fabaceae (Spanish: cebil or curupay; Portuguese: angico; also known as “vilca”, which is derived from the Quechua voice “wilka”, meaning “sacred”), in the Campo San Juan Nature Reserve (Misiones, northeastern Argentina). Anadenanthera colubrina var. cebil is a long-lived and semi-deciduous canopy tree that can reach up to 35 m in height. It has compound bipinnate leaves with specialized ant glands (extrafloral nectaries), hermaphroditic flowers in inflorescences, and long legume fruits (von Reis Altschul 1964; Justiniano and Fredericksen 1998; Cialdella 2000; Klitgård and Lewis 2010). The mating system of A. colubrina var. cebil is presumed to be mostly outcrossing (Cialdella 2000), although one study in Ribeirão Preto (São Paulo, Brazil) showed a mixed mating system (multi-locus outcrossing rate, tm = 0.619, Feres 2013). Bees are the main pollinators, and seeds are dispersed by autochory or anemochory after pod dehiscence (Justiniano and Fredericksen 1998; de Noir et al. 2002). Anadenanthera colubrina var. cebil occupies a wide distribution range (Fig. 6 in Annex 4) in the seasonally dry tropical forests (SDTFs), a relatively poorly known biome with a strongly fragmented distribution across Latin America and the Caribbean (Särkinen et al. 2011; Mogni et al. 2015; DRYFLOR 2016). Anadenanthera colubrina var. cebil has been proposed by Prado and Gibbs (1993) as the most important tree species in this biome, as it is present in 8 out of the 12 floristic groups described for Neotropical SDTFs (DRYFLOR 2016) and occurs in four major SDTF nuclei—Caatinga, Misiones, Chiquitanía, and Subandean Piedmont (Särkinen et al. 2011; Fig. 1a). 
The current distribution of SDTFs in South America is highly fragmented due to both natural factors (e.g., climatic fluctuations during the Pleistocene) and human disturbance (e.g., land use changes and deforestation) (Werneck et al. 2011; Sánchez-Azofeifa and Portillo-Quintero 2011).

Fig. 1

a Current distribution of seasonally dry tropical forests (SDTFs) in South America according to Särkinen et al. (2011). Four nuclei are identified: Caatinga (CA), Misiones (MI), Chiquitanía (CH), and Subandean Piedmont (SP). The location of the studied area, Campo San Juan Nature Reserve, is indicated. b Studied forest stand indicating adult trees (circles) and saplings (squares) of Anadenanthera colubrina var. cebil, and the four transects sampled (dashed lines)

In a previous study, Barrandeguy et al. (2014) detected genetic signatures of ancient fragmentation in both nuclear and chloroplast genomes for A. colubrina var. cebil and suggested (but did not test) that demographic instability may underlie current genetic structure at the large geographical scale. By studying local genetic structure and demographic history in one typical location of the species, we aimed to provide insights into the causes underlying these patterns, which is especially relevant for highly fragmented and poorly known biomes, such as the SDTF. The specific aims of our study are (i) to characterize patterns of genetic diversity and FSGS of A. colubrina var. cebil in a typical remnant population from the Misiones nucleus in the Paranaense biogeographic province, Argentina; (ii) to investigate one-generation changes in these patterns by comparing adults and saplings from the same population; and (iii) to provide insights into demographic history and the causes underlying population structure in this population. Current levels of protection of SDTFs are inadequate, as the protected area is very small relative to the high diversity, high endemism, and high floristic turnover found in the biome (DRYFLOR 2016). Knowledge of genetic diversity and population genetic structure, in relation to demographic processes in keystone SDTF species, such as A. colubrina var. cebil, can provide valuable information for in situ dynamic conservation and the establishment of sampling strategies for ex situ conservation and reforestation.

2 Material and methods

2.1 Study area

Campo San Juan (CSJ) is a Nature Reserve located in the Paranaense biogeographic province (Cabrera 1971) within the Misiones nucleus of the SDTFs (Fig. 1b). Native grasslands and forests cover 90% of the reserve, in approximately equal proportions (Falguera et al. 2015). Within CSJ Nature Reserve, we studied a forest stand of c. 20 ha located in the center of a fragmented forest extending into the grassland matrix (Fig. 1b). Besides Anadenanthera colubrina var. cebil, the main forest trees in the CSJ Nature Reserve are Myracrodruon balansae (Anacardiaceae), Handroanthus heptaphyllus (Bignoniaceae), Cedrela fissilis (Meliaceae), Parapiptadenia rigida, and Peltophorum dubium (Leguminosae) (Falguera et al. 2015). Four parallel transects were established (T4, T5, T6, and T7), crossing the stand from the southwestern to the northeastern forest edge (Fig. 1b). Transects were around 340 m long by 50 m wide, covering approximately 40% of the total forest area.

2.2 Plant material

Two life stages of A. colubrina var. cebil were sampled along the four transects. In total, young leaves from 119 individuals were collected, including all reproductively mature trees in the transects (60 individuals), and ~ 60% of the saplings (59 individuals). Trees were georeferenced by GPS (Global Positioning System) using a Garmin eTrex® 20x receiver (precision of ± 3 m) and identified by an individual code. Diameter at 1.30 m (DBH) was also measured in adult trees. Leaves were dried in salt (Carrió and Rosselló 2013) and stored at room temperature until DNA extraction.

2.3 DNA extraction and microsatellite genotyping

Total genomic DNA was extracted from dry leaves using the ROSE extraction protocol (Steiner et al. 1995) modified by García et al. (2007). Eight specific nuclear microsatellite markers (SSR) for A. colubrina var. cebil were used to genotype all individuals (Barrandeguy et al. 2012). The amplifications by polymerase chain reaction (PCR) were performed in a final volume of 14 μl containing 0.5 ng/μl of genomic DNA, 1× Buffer, 2.5 mM MgCl2, 0.175 mM of each dNTP, 0.75 U Hot Start DNA Polymerase, 0.33 pmol of reverse primer, 0.20 pmol of forward primer with M13 tail at the 5′-end, and 0.33 pmol of universal FAM- or HEX-labeled M13 primer, following the method proposed by Schuelke (2000). PCR was performed in a gradient cycler (Biometra) using a touchdown program or a fixed annealing temperature, depending on the locus, following Barrandeguy et al. (2012). The amplification cycles were divided into two phases: the first consisted of 30 cycles using the annealing temperature of the specific SSR primers and the second of 8 cycles using the M13 primer annealing temperature (53 °C). The capillary electrophoresis was carried out in an ABI 3730XL Genetic Analyzer (Applied Biosystems®) and fragment sizes were scored with Peak Scanner™ v1.0 (Applied Biosystems®) software using 400HD-ROX™ (Applied Biosystems®) as internal size standard.

2.4 Data analysis

2.4.1 Genetic diversity and inbreeding

Genetic variation in the population was characterized by number of alleles (NA), effective number of alleles (NE), number of private alleles (NPA), observed (HO) and expected (HE) heterozygosity, and allelic richness (R). Observed and expected heterozygosity and allelic richness were computed in SPAGeDi 1.5a (Hardy and Vekemans 2002) whereas GenAlEx 6.5 (Peakall and Smouse 2012) was used for all other genetic diversity estimates. Significant differences among life stages in R, HO, and HE were tested using a permutation procedure (10,000 iterations) in FSTAT 2.9.3.2 (Goudet 1995). A global population test of heterozygote deficit across loci was computed in GENEPOP 4.2.2 (Rousset 2008) to detect departures from Hardy–Weinberg equilibrium (HWE) with the following parameters: dememorization = 10,000; batches = 100; and iterations = 5000. The inbreeding coefficient (FIS) was calculated using SPAGeDi 1.5a (Hardy and Vekemans 2002). The presence of null alleles and genotyping errors was assessed at each locus using MicroChecker (van Oosterhout et al. 2004). The individual inbreeding model (IIM), implemented as a Bayesian approach, was used to partition out the influence of null alleles on FIS values using INest (Chybicki and Burczyk 2009).
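
As a rough illustration of how the single-locus quantities named above relate to one another, the following Python sketch computes NA, NE, HO, HE, and FIS from diploid genotypes using their textbook definitions. It is not the implementation used by SPAGeDi or GenAlEx, which may apply additional corrections; the function name `locus_diversity` and the toy genotypes are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Textbook diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Assumes no missing data (an assumption of this sketch).
    """
    n = len(genotypes)
    gene_copies = 2 * n
    counts = Counter(a for pair in genotypes for a in pair)
    freqs = [c / gene_copies for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)

    n_alleles = len(freqs)                      # NA
    n_effective = 1.0 / homozygosity            # NE = 1 / sum(p_i^2)
    h_obs = sum(1 for a, b in genotypes if a != b) / n   # HO
    # HE with the usual small-sample (unbiased) correction 2n / (2n - 1)
    h_exp = gene_copies / (gene_copies - 1) * (1.0 - homozygosity)
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")  # FIS

    return {"NA": n_alleles, "NE": n_effective,
            "HO": h_obs, "HE": h_exp, "FIS": f_is}


# Hypothetical genotypes (allele sizes in bp) for four individuals
toy = [(150, 150), (150, 152), (152, 152), (150, 152)]
stats = locus_diversity(toy)
```

A positive FIS, as returned here, indicates fewer heterozygotes than expected under Hardy–Weinberg proportions; multi-locus values would be obtained by combining HO and HE across loci.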

2.4.2 Genetic clusters and Wahlund effects

The existence of genetic clusters in the population was first explored with a Bayesian model-based cluster analysis using STRUCTURE 2.3.4 (Pritchard et al. 2000). This method estimates the number of genetic clusters (K) in the data and estimates the ancestry of individuals in these clusters (Pritchard et al. 2000). Bayesian analysis was performed using the admixture model with correlated allele frequencies and K from 1 to 8. The model was run separately for adults and saplings with 10 independent simulations for each K, and a burn-in length of 500,000 and a run length of 750,000 MCMC iterations. The optimal number of clusters was determined using the ad hoc ΔK statistic (Evanno et al. 2005), based on the second order of change in the log likelihood of data (ΔK) as a function of K, using Structure Harvester 0.6.92 web application (Earl and vonHoldt 2012). Second, principal component analysis (PCA) as implemented in the adegenet 2.1.1 package in R (Jombart 2008) was used to validate STRUCTURE genetic clusters. Geographic maps showing the distribution of genetic clusters were drawn using QGIS v2.18.13 (Quantum GIS Development Team 2017).
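
The ΔK criterion can be sketched in a few lines of Python. The version below follows the formulation in which the second-order difference is taken on the mean log likelihoods of replicate runs and divided by the standard deviation at K; it assumes at least two replicates per K, and the log-likelihood values in the example are invented for illustration only.

```python
import numpy as np


def evanno_delta_k(lnp_by_k):
    """Delta K for every K that has both K-1 and K+1 available.

    lnp_by_k: dict mapping K -> list of Ln P(D) values from replicate runs.
    """
    ks = sorted(lnp_by_k)
    mean = {k: float(np.mean(lnp_by_k[k])) for k in ks}
    sd = {k: float(np.std(lnp_by_k[k], ddof=1)) for k in ks}

    delta_k = {}
    for k in ks:
        if (k - 1) not in mean or (k + 1) not in mean:
            continue  # undefined at the smallest and largest K tested
        second_diff = abs(mean[k + 1] - 2.0 * mean[k] + mean[k - 1])
        delta_k[k] = second_diff / sd[k] if sd[k] > 0 else float("inf")
    return delta_k


# Invented log likelihoods for K = 1..4, two replicate runs each
example = {1: [-100.0, -102.0], 2: [-80.0, -82.0],
           3: [-78.0, -80.0], 4: [-77.0, -79.0]}
dk = evanno_delta_k(example)
best_k = max(dk, key=dk.get)
```

Because ΔK is undefined at the lowest and highest K tested, the statistic cannot by itself support K = 1, which is one reason a complementary method such as PCA is useful.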

Based on the optimal number of genetic clusters, individuals were assigned to the cluster in which they had the highest proportion of estimated ancestry, and genetic differentiation (i.e., the distribution of genetic variation within and among clusters) was examined using analysis of molecular variance (AMOVA) in Arlequin 3.5 (Excoffier and Lischer 2010). Statistical significance of AMOVA was calculated using 10,000 permutations. In addition, we estimated unbiased FST values using the excluding null allele (ENA) correction method in FreeNA software (Chapuis and Estoup 2007). Genetic differentiation was further tested using Jost’s D index (Jost 2008), which partitions heterozygosity within and among clusters (Jost et al. 2018). Statistical significance was tested by means of 1000 bootstraps using DEMEtics v0.8.7 package in R (Gerlach et al. 2010). In addition, to test for Wahlund effects, inbreeding coefficients (FIS), both corrected and uncorrected for the presence of null alleles, were computed for each genetic cluster using the methods described above. A Wahlund effect is observed when inbreeding at the population level is due to population substructure, thus, under a Wahlund effect, FIS within genetic clusters should be lower than FIS of the entire forest stand.

2.4.3 Fine-scale spatial genetic structure

Spatial genetic structure was first investigated using a spatial principal component analysis, sPCA, in adegenet 2.1.1 package in R (Jombart 2008). Strength of spatial genetic structure was estimated by the eigenvalue of the first sPCA axis, eig.sPCA. We tested for global structure, such as allele frequency gradients or spatial autocorrelation, using the G test, and for local structure, such as a spatial repulsion signal between genetic clusters, using the L test. Second, spatial autocorrelation analyses based on pairwise kinship coefficients (Fij) between individuals (Loiselle et al. 1995) against physical distance were done using SPAGeDi 1.5a (Hardy and Vekemans 2002). Mean multi-locus kinship coefficients (F(d)) were computed separately for each life stage by means of pairwise comparisons for eight distance classes with upper bounds of 10, 20, 50, 100, 150, 200, 300, and 500 m, following recommendations by Doligez et al. (1998). Standard errors for the kinship coefficients were estimated using a jackknife procedure over loci. To test for significance of fine-scale spatial genetic structure (FSGS), Fij for all pairs of individuals was plotted against the logarithm (for two-dimensional space) of pairwise spatial distance, and significance of the regression slope b was assessed by 10,000 permutations of the columns and rows of the distance matrix. To compare the extent of FSGS with that in other plant species, we also computed the Sp statistic, which is defined as Sp = −b/(1 − F1), where F1 is the mean Fij between neighboring individuals, i.e., for the first distance class (< 10 m), and b is the regression slope based on the logarithm of Euclidean geographic distance (Vekemans and Hardy 2004).
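
For illustration, the Sp statistic defined above can be obtained from pairwise kinship and distance values as in the following Python sketch. It is a simplified stand-in for the SPAGeDi computation: it takes precomputed Fij values as input, omits the permutation test and jackknife, and the function name and example data are hypothetical.

```python
import numpy as np


def sp_statistic(kinship, distance_m, first_class_max=10.0):
    """Return (Sp, b, F1) from pairwise kinship and distances.

    kinship:     pairwise kinship coefficients Fij (one value per pair)
    distance_m:  matching pairwise distances in metres (must be > 0)
    first_class_max: upper bound of the first distance class (10 m here)
    """
    kinship = np.asarray(kinship, dtype=float)
    distance_m = np.asarray(distance_m, dtype=float)

    # Slope b of Fij regressed on the natural log of distance
    b, _intercept = np.polyfit(np.log(distance_m), kinship, 1)
    # F1: mean kinship among pairs in the first distance class
    f1 = kinship[distance_m <= first_class_max].mean()
    return -b / (1.0 - f1), b, f1


# Hypothetical pairs whose kinship declines linearly with ln(distance)
d = np.array([5.0, 8.0, 20.0, 50.0, 100.0])
f = 0.10 - 0.02 * np.log(d)
sp, slope, f1 = sp_statistic(f, d)
```

Dividing the slope by (1 − F1) is what makes Sp comparable among studies that differ in sampling scheme and in the kinship of immediate neighbours.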

Finally, to further investigate the factors influencing FSGS, we used a Bayesian generalized linear mixed modelling approach implemented in the MCMCglmm package (Hadfield 2010) in R. We contrasted seven models using three predictor variables and combinations thereof (as fixed effects), to predict the inter-individual kinship matrix of adult trees only: 1. NULL, a null model without any predictor, 2. GEO, with a predictor matrix representing the Euclidean geographic distance between individuals, computed from latitude and longitude, 3. ALT, with a predictor matrix based on Euclidean distance in altitude only, 4. DBH, with a predictor matrix based on diameter differences at 1.30 m (DBH), assuming that DBH could be considered as proxy for distinct groups of individuals sharing microhabitat or growth strategy, 5. GEO_ALT, with both geographic and altitude predictor matrices, 6. GEO_DBH, with geographic and DBH predictor matrices, and 7. GEO_DBH_ALT, with the three predictor matrices. The number of MCMC iterations was set to 200,000, after a burn-in of 50,000 and using a thinning interval of 5000. The deviance information criterion (Spiegelhalter et al. 2002), DIC, of each model was recorded and used for model comparison. The model with the lowest DIC was considered to best describe the data.

2.4.4 Demographic history

Demographic history of the forest stand (based on all adult trees) was first explored by computing the “bottleneck” statistic, T2, in Bottleneck 1.2.02 (Cornuet and Luikart 1996; Piry et al. 1999). This statistic was estimated using a two-phase mutation (TPM) model, which is the most appropriate for SSRs (Di Rienzo et al. 1994; Piry et al. 1999), by means of 10,000 iterations, and considering 95% single-step mutations, 5% multiple-step mutations, and a variance of 12 among multiple steps. This method is suitable to detect recent bottlenecks of low magnitude (Williamson-Natesan 2005). Positive T2 values indicate a heterozygosity excess that can be associated with demographic bottlenecks (Piry et al. 1999). The method can also be used to detect population expansions, which are characterized by a heterozygosity deficiency (Cornuet and Luikart 1996). The significance of T2 was tested using a Wilcoxon signed-rank test. Additionally, the Garza–Williamson M ratio test was computed in Arlequin 3.5 (Excoffier and Lischer 2010), where M is the mean ratio of the number of alleles to the allele size range (Garza and Williamson 2001). A value of M < 0.68 is associated with ancient and severe bottleneck signals (Garza and Williamson 2001; Williamson-Natesan 2005).
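
The M ratio itself is simple to compute, as the following Python sketch shows. It assumes that the allele size range is counted in repeat units plus one, so that a locus whose alleles occupy every step of the size ladder gives M = 1; the implementation in Arlequin may differ in detail, and the allele sizes and repeat lengths below are invented.

```python
def m_ratio(allele_sizes_bp, repeat_length_bp):
    """M = k / r for one microsatellite locus.

    k: number of distinct alleles observed
    r: number of possible allelic states between the smallest and the
       largest allele, i.e. (max - min) / repeat length + 1
    """
    sizes = sorted(set(allele_sizes_bp))
    k = len(sizes)
    r = (sizes[-1] - sizes[0]) / repeat_length_bp + 1
    return k / r


def mean_m_ratio(loci):
    """loci: list of (allele_sizes_bp, repeat_length_bp) tuples."""
    values = [m_ratio(sizes, rep) for sizes, rep in loci]
    return sum(values) / len(values)


# Invented example: a gappy dinucleotide locus and a complete one
loci = [([100, 102, 104, 110], 2),   # 4 alleles over 6 states
        ([100, 102, 104], 2)]        # 3 alleles over 3 states
mean_m = mean_m_ratio(loci)
```

The rationale is that a bottleneck removes rare alleles faster than it shrinks the overall size range, leaving gaps in the allele ladder and lowering M.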

Approximate Bayesian computation (ABC) methods were used for more accurate demographic inference (Csilléry et al. 2010), using DIYABC v1.0.4.46beta (Cornuet et al. 2008). Based on historical records (Bauni and Homberg 2015; Werneck et al. 2011), two demographical scenarios were defined for the studied area, modelled through simple two-epoch models of population size change, without fixing the direction of change (i.e., each scenario allowed for both population expansion and population decline). The first scenario assumed a recent demographic change (one to five generations ago) corresponding to strong habitat transformation due to known human activity in the region (Bauni et al. 2013). The second scenario reflected an ancient demographic change (100 to 700 generations ago) associated with the expansion-contraction dynamics described in SDTFs (Prado and Gibbs 1993; Pennington et al. 2000; Werneck et al. 2011). These simple two-epoch models of population expansion/decline were compared with a null model of constant population size, using logistic regression to estimate the posterior probability for each scenario, and to estimate the following parameters: current (N1) and ancient population size (N0), and time of demographic change (t). Prior distributions, model parameters, and summary statistics are reported in Annex 1 (Table 4).

3 Results

3.1 Genetic diversity and inbreeding

All eight nuclear microsatellite loci were highly polymorphic in both adults and saplings (Table 1). Allelic richness (R), and both observed (HO) and expected (HE) heterozygosity did not show significant differences across life stages (p values of 0.12, 0.57, and 0.27 for R, HO, and HE, respectively). HO was significantly lower than HE resulting in a high inbreeding coefficient for both adults and saplings (FIS = 0.41 and 0.40, respectively). High FIS could be explained by the presence of null alleles (Chakraborty et al. 1994), or alternatively, by population substructure (Wahlund effect, see below), assortative mating, and/or selfing (Morand et al. 2002). Despite the high frequency of null alleles (r = 0.19), as estimated by the Brookfield equation (Brookfield 1996), corrected FIS values were similar to those estimated without correction: 0.39 (95% CIs of 0.35–0.44) in adults and 0.35 (95% CIs of 0.28–0.42) in saplings (Table 1).

Table 1 Genetic diversity and inbreeding in the studied population remnant of Anadenanthera colubrina var. cebil

3.2 Genetic clusters and Wahlund effects

Bayesian model-based cluster analysis suggested K = 3 and K = 5 as the optimal numbers of clusters for adults and saplings, respectively (Fig. 2a and b). PCAs based on genotypes of adults and saplings separately also supported the existence of differentiated genetic clusters in the stand, in particular for saplings (Fig. 2c and d). Genetic clusters did not show any apparent spatial pattern (Fig. 2). As expected, the AMOVA indicated larger genetic variation within clusters than among them (Table 2). Nevertheless, genetic differentiation among clusters was significant (p < 0.05). It was weak in adults (FST = 0.06) and relatively high in saplings (FST = 0.14), considering the short geographic distance between individuals from different genetic clusters (see Fig. 2a and b). Similar values were obtained when correcting for bias due to null alleles using the ENA method in FreeNA (FST = 0.05, CIs of 0.03–0.07, for adults, and FST = 0.12, CIs of 0.09–0.15, for saplings), indicating that null alleles had only a small effect on FST estimates. The harmonic mean of Jost’s D (Jost 2008) allowed the evaluation of the relative differentiation of allele frequencies within and among clusters. This parameter was significant for both life stages and higher in saplings (0.70) than in adults (0.31, Table 2). This means that the three adult clusters shared more than half of the alleles, while the five sapling clusters had marked allelic differentiation and a low number of shared alleles. Finally, levels of inbreeding (as estimated by FIS) remained high within each of the genetic clusters for both adults and saplings, allowing us to discard Wahlund effects (see Table 5 in Annex 2), leaving assortative mating and selfing as the main factors to explain the high levels of inbreeding.

Fig. 2

Geographic representation of Bayesian model-based cluster analysis of population genetic structure (STRUCTURE software) for a adults (K = 3) and b saplings (K = 5), and principal component analyses (PCAs) based on c adult and d sapling genotypes. ΔK vs. K distribution and the bar plot ordered by the admixture proportions for each individual (Q) are also shown in (a) and (b). In (c) and (d), STRUCTURE cluster membership is indicated using the same colors as in the bar plots (a) and (b), respectively

Table 2 Analyses of molecular variance (AMOVA), fixation indices, and allelic differentiation, for the genetic clusters of Anadenanthera colubrina var. cebil identified by STRUCTURE software

3.3 Fine-scale spatial genetic structure

The sPCA analysis revealed a significant global spatial genetic structure through the G test (p < 0.001), but absence of local genetic structure (non-significant L test). The individual scores on the first eigenaxis of the sPCA indicated strong spatial autocorrelation, suggesting allele frequency gradients in both life stages that corresponded approximately with a geographic North-South axis of variation (Fig. 3). Allele frequency gradients were stronger in saplings (eig.sPCA = 0.178) than in adults (eig.sPCA = 0.136). Supporting these results, spatial autocorrelation analyses using SPAGeDi showed significant FSGS for both life stages (p < 0.001), but with a more pronounced negative slope of pairwise kinship with distance for saplings, which was reflected in a 2.5-fold higher Sp statistic in saplings than in adults (Sp of 0.023 and 0.009, respectively; Table 3). Moreover, we found 5-fold greater pairwise kinship in the first distance class (up to 10 m) in saplings (F1 = 0.328) than in adults (F1 = 0.059), and the F1 of saplings was close to the value expected for full-sibs (i.e., 0.250) (Table 3, Fig. 4).

Fig. 3

Geographic representation of individual scores on the first sPCA axis of adults (a) and saplings (b), showing both raw values (lower panels) and their spatial interpolation (upper panels)

Table 3 Fine-scale spatial genetic structure (FSGS) in Anadenanthera colubrina var. cebil

Fig. 4

Fine-scale spatial genetic structure (FSGS) correlograms for a adults and b saplings from the studied remnant population of Anadenanthera colubrina var. cebil; dashed lines indicate 95% confidence intervals per distance class around the hypothesis of random genetic structure obtained by permuting individual spatial locations as implemented in the SPAGeDi software

The MCMCglmm analysis (Table 6 in Annex 3) revealed that the model with a geographic predictor matrix alone had the lowest DIC and best explained the pattern of genetic relatedness in the stand; thus, altitude or DBH did not significantly contribute to explaining genetic relatedness.

3.4 Demographic history

Recent bottleneck signals were not detected in adults using the Bottleneck program under a TPM model. On the contrary, a statistically significant heterozygosity deficit (p < 0.05) suggested a recent population expansion. The Garza–Williamson M ratio, which typically detects older population size declines, also showed no evidence of bottlenecks (M = 0.37 ± 0.12). Based on ABC, demographic scenarios considering recent population decline/expansion had little support when compared to a null scenario of constant population size (Fig. 5, left). However, we found that an ancient expansion scenario was supported by high and consistent posterior probabilities (over 0.8; Fig. 5, right), with parameters: current effective population size (N0) of 7110 (CIs 3020–9900), ancestral population size (N1) of 712 (CIs 48–4360), and time of expansion (t) of 357 (CIs 110–676) generations.

Fig. 5

Comparison of demographic scenarios (Sc) using ABC and considering: (1) constant population size, (2a) a recent population size change, and (2b) an ancient population size change. Posterior probabilities (P) of each scenario comparison (1 vs. 2a and 1 vs. 2b) are also shown

4 Discussion

Information on the historical and demographic events that shape the genetic structure of forest tree populations is relevant for understanding stand dynamics and their future evolution. We present here a case study that provides insights into the demographic population history of A. colubrina var. cebil in the Paranaense biogeographic province in northeastern Argentina. Thirty years ago, the United Nations Food and Agriculture Organization (FAO 1986) reported that the species was suffering a slow decline. More recently, in the light of its ample use in the region, especially for extraction of wood and tannins, as well as for firewood (Justiniano and Fredericksen 1998; Tortorelli 2009), A. colubrina was indicated as a high priority species for in situ conservation (Monteiro et al. 2006). Although A. colubrina cannot be considered a threatened species (Roskov et al. 2017), sustainable forest management of its populations in the Paranaense biogeographic province demands attention, as most of them risk local extinction, being located on private land that is exploited for logging.

In our study, we found high levels of genetic diversity and allelic richness in A. colubrina var. cebil from CSJ Nature Reserve, similar to those found in previous studies at wider geographical scales (Barrandeguy et al. 2014). High genetic diversity has also been detected in other Neotropical tree species (see Hardy et al. 2006) and is likely a consequence of life history traits of these trees, such as a long life span and a predominantly outcrossing mating system (Hamrick and Godt 1996; Petit and Hampe 2006). Despite the high levels of genetic diversity found in our study, we must be cautious when interpreting the intrinsic value of this remnant forest stand (Hedrick 1999). First, genetic diversity as estimated here only refers to neutral genetic diversity and not to variation associated with quantitative traits with adaptive value, and recent studies have shown little correlation between these two measures of diversity (e.g., Rodríguez-Quilón et al. 2015). Second, high genetic diversity is accompanied by high levels of inbreeding and differentiated genetic clusters, which also calls for caution in the use of this stand as seed source (Ellstrand and Elam 1993; Charlesworth 2003).

High inbreeding was detected for both adults and saplings at the stand level as well as within genetic clusters, even after accounting for null alleles [FIS(IIM) = 0.40 and 0.34, respectively, at the stand level]. These inbreeding coefficients are substantially higher than those displayed by other insect-pollinated forest tree species such as Dalbergia nigra (FIS = 0.08), which is mainly pollinated by bees (Buzatti et al. 2012), Cabralea canjerana (FIS = 0.06) pollinated by moths (de Oliveira Melo et al. 2014), or Prunus africana (FIS = 0.08–0.19), whose main pollinators are Hymenoptera and Diptera (Berens et al. 2014). Previous studies have also revealed a heterozygote deficit in A. colubrina var. cebil populations (Barrandeguy et al. 2014), although not as marked as in our study, and substantial levels of selfing and biparental inbreeding (tm = 0.619, tm-ts = 0.159; Feres 2013). Together, these results suggest that A. colubrina var. cebil displays a mixed mating system. Selfing is common in legumes, and studies on the proportion of selfed vs. outcrossed offspring in other Neotropical legume trees have revealed mixed mating systems, for instance in Senna multijuga (Ribeiro and Lovato 2004), Copaifera langsdorfii (Tarazi et al. 2013), and Dipteryx alata (Tambarussi et al. 2017). More detailed investigation on the genetic constitution of offspring would be needed for a detailed characterization of the mating system of A. colubrina var. cebil, as well as its variation among populations. Nevertheless, an approximate selfing rate, s, can be estimated from the inbreeding coefficient using s = 2FIS/(1 + FIS) (Hartl and Clark 2007), which suggests 51–56% selfing in the CSJ population. When species evolve to high proportions of selfing, this can decrease their effective population size, promote the accumulation of deleterious mutations, and hinder adaptation (Charlesworth 2003; Wright et al. 2013; Barrett et al. 2014). 
Small populations become inbred more rapidly than large populations, which, besides the mating system, is due to increased genetic drift and biparental inbreeding (Ellstrand and Elam 1993).
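
The relationship between the inbreeding coefficient and the selfing rate used above can be written as a short Python sketch. It assumes inbreeding equilibrium, with selfing as the only source of inbreeding, under which F = s/(2 − s) and therefore s = 2F/(1 + F); the function names are hypothetical, and the input values are the stand-level FIS estimates quoted in this section.

```python
def selfing_rate_from_fis(f_is):
    """Approximate selfing rate s = 2F / (1 + F).

    Valid only under inbreeding equilibrium and when selfing is the
    sole cause of inbreeding (assumptions of this sketch).
    """
    if not -1.0 < f_is <= 1.0:
        raise ValueError("FIS must lie in the interval (-1, 1]")
    return 2.0 * f_is / (1.0 + f_is)


def equilibrium_fis(selfing_rate):
    """Inverse relationship: F = s / (2 - s)."""
    return selfing_rate / (2.0 - selfing_rate)


# FIS values of the magnitude reported for saplings and adults
for f in (0.34, 0.40):
    s = selfing_rate_from_fis(f)
    print(f"FIS = {f:.2f} -> s = {s:.3f}")
```

Because biparental inbreeding and residual null alleles also raise FIS, a selfing rate obtained in this way is best read as an upper bound.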

Knowledge on the influence of ecological context and life history traits on FSGS can provide crucial information for conservation of forest tree species. The mating system has been shown to have a strong effect on FSGS, with the highest Sp described in selfing species and, on average, a three times higher Sp (Sp = 0.037) observed in mixed mating vs. outcrossing species (Sp = 0.013, Vekemans and Hardy 2004). The Sp statistic of A. colubrina var. cebil (Sp = 0.023 and 0.009, for saplings and adults, respectively) was close to other Neotropical tree species with mixed mating systems, e.g., Theobroma cacao (Sp = 0.018, Silva et al. 2011) or Dicorynia guianensis (Sp = 0.026; Hardy et al. 2006). The scale of pollen and seed movement also largely determines the level of kinship (Fij) observed among neighboring trees (Vekemans and Hardy 2004; Epperson 2003). In a recent meta-analysis for Neotropical tree species, Lowe et al. (2018) found that abiotically dispersed and pioneer species, such as A. colubrina var. cebil (Justiniano and Fredericksen 1998), have stronger FSGS than late successional species, whose seeds are dispersed by biotic vectors (see also Hardy et al. 2006).

In plant species, pollen and seeds often disperse at different scales (Anderson et al. 2010). In the case of A. colubrina var. cebil, while small bees are probably associated with pollen movement (e.g., Tetragonisca angustula; Flores and Sánchez 2010), seeds rely on autochory for dispersal. Flight distances for small bees can be as large as 621–951 m (Araújo et al. 2004), and thus they are probably a main agent effectively connecting A. colubrina var. cebil patches. Restricted seed dispersal can generate spatial clustering of full- and half-sibling cohorts localized close to the maternal plant (Nason and Hamrick 1997; Kalisz et al. 2001). This process, together with selfing, could then explain the high pairwise kinship in the first distance class for A. colubrina var. cebil saplings (F1 = 0.328), which was intermediate between the values expected for full-sibs (0.250) and selfs (0.500). High pairwise kinship values were not maintained in adults (F1 = 0.059). In addition, FSGS in adults was much weaker than in saplings (Sp of 0.009 vs. 0.023, respectively). While seed dispersal conditions the spatial configuration of sites available for establishment, early seedling recruitment represents a major demographic filter in the plant life cycle (Hampe et al. 2010). Thus, FSGS tends to decrease with age due to density-dependent mortality (Doligez et al. 1998; Epperson 2003; Leblois et al. 2004) and random demographic thinning (Hamrick et al. 1993; Schroeder et al. 2014), by which only a few seedlings survive to become adults. The reproduction of A. colubrina var. cebil is characterized by the production of many seeds with low dormancy, high germination rates (~70%) (Soldati and de Albuquerque 2010; de Souza et al. 2015), and high seedling mortality rates (~50%) due to abiotic factors, competition, and soil microorganisms (de Medeiros et al. 2016). Therefore, the concerted effects of density-dependent mortality (e.g., Janzen–Connell effects; see Comita et al. 2014) and random demographic thinning could indeed have resulted in a more regular spatial distribution pattern and in a decrease of the FSGS from younger to older life stages in the studied stand.
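As a rough illustration of how the two FSGS summaries quoted above relate to each other, the sketch below assumes the standard definition Sp = −b/(1 − F1) from Vekemans and Hardy (2004), where b is the regression slope of pairwise kinship on the logarithm of spatial distance. The slopes it prints are back-calculated from the reported Sp and F1 values purely for illustration; they are not estimates reported in this study, and the function names are hypothetical.

```python
def sp_statistic(slope: float, f1: float) -> float:
    """Sp = -b / (1 - F1), assuming the Vekemans and Hardy (2004) definition."""
    return -slope / (1.0 - f1)


def implied_slope(sp: float, f1: float) -> float:
    """Regression slope b implied by a given Sp and first-class kinship F1."""
    return -sp * (1.0 - f1)


# (Sp, F1) pairs reported in the text for each life stage.
reported = {"saplings": (0.023, 0.328), "adults": (0.009, 0.059)}

for stage, (sp, f1) in reported.items():
    b = implied_slope(sp, f1)  # illustrative back-calculation only
    print(f"{stage}: Sp = {sp:.3f}, F1 = {f1:.3f}, implied slope b = {b:.4f}")

# Relative strength of FSGS in saplings compared with adults.
sp_ratio = reported["saplings"][0] / reported["adults"][0]
print(f"Sp(saplings) / Sp(adults) = {sp_ratio:.2f}")
```

Under this definition, a steeper decline of kinship with distance or a higher kinship among nearest neighbours both increase Sp, which is consistent with the stronger structure described for saplings.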

In addition to significant FSGS, the existence of differentiated genetic clusters within the stand that are not explained by distribution in space alone is striking. Genetic differentiation among clusters was moderate, but still remarkable given the short separation distance between them (FST of 0.12 for saplings and of 0.05 for adults, after correction for null alleles). A similar genetic structure was found in African populations of Symphonia globulifera, an ancient tropical tree species (Torroba-Balmori et al. 2017). Some genetic differentiation at local scales has also been reported for Neotropical forest tree species, e.g., in Dicorynia guianensis (FST = 0.02–0.03), whose indehiscent flat pods are dispersed by anemochory (Latouche-Hallé et al. 2003). While in Dicorynia guianensis these patterns are explained by spatial segregation, evidence for aggregation following habitat features was found in Symphonia globulifera (Torroba-Balmori et al. 2017). In our case, however, the stand is relatively homogeneous and we did not find any evidence of habitat features underlying differentiated genetic clusters. This fact, together with only weak spatial segregation of genetic clusters, suggests assortative mating as the origin of differentiated genetic clusters in A. colubrina var. cebil (e.g., due to phenological differences among trees in the stand). Nevertheless, genetic diversity and allelic richness within clusters were still high for both adults and saplings, and not significantly different across generations, suggesting enough reproductive trees contributing to each genetic cluster.

Anthropogenic impact is usually considered a relevant factor in SDTF fragmentation and could have affected the demographic history of forest tree populations in this range. Recent data also indicated that A. colubrina is ecologically more labile than previously thought (Särkinen et al. 2011). However, assuming a seed-to-seed generation time of 50 years in A. colubrina var. cebil, our ABC approach suggested an ancient expansion dated to approximately 18,000 years BP as the best demographic scenario. Bottleneck tests also failed to show any signature of reduction in effective population size in our case study population from the CSJ Nature Reserve. Thus, anthropogenic disturbance and recent fragmentation do not seem to have caused detectable genetic effects in the studied forest remnant. However, it remains to be shown whether this particular case study is representative of other populations in the region, and to what degree human-induced forest fragmentation may have affected these patterns.
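The timing of the inferred expansion can also be restated in generations, which makes its dependence on the assumed generation time explicit. The sketch below applies the 50-year seed-to-seed generation time given in the text; the alternative generation times are hypothetical values included only to show how the conversion scales, and the function names are illustrative.

```python
def years_to_generations(years: float, generation_time: float) -> float:
    """Convert a time span in years to a number of generations."""
    if generation_time <= 0:
        raise ValueError("generation_time must be positive")
    return years / generation_time


def generations_to_years(generations: float, generation_time: float) -> float:
    """Convert a number of generations back to years."""
    return generations * generation_time


EXPANSION_YEARS_BP = 18_000  # approximate expansion time from the text
ASSUMED_GENERATION_TIME = 50  # seed-to-seed generation time from the text

generations = years_to_generations(EXPANSION_YEARS_BP, ASSUMED_GENERATION_TIME)
print(f"Expansion ~{generations:.0f} generations ago")

# Hypothetical alternative generation times (not from the study): the same
# number of generations maps to different calendar dates.
for alt_time in (25, 100):
    years = generations_to_years(generations, alt_time)
    print(f"generation time {alt_time} yr -> ~{years:.0f} years BP")
```

Because the calendar date scales linearly with the generation time, uncertainty in that value carries over directly to the dating of the expansion.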

Research on SDTFs has blossomed in the last decade (Hughes et al. 2013), revealing the biological value of these forests in terms of their highly endemic flora (DRYFLOR 2016), as well as providing deep insights into the biogeography of this biome and the processes that have shaped the historical assembly of SDTF species diversity (Hughes et al. 2013). Despite the fact that A. colubrina var. cebil is a key SDTF indicator species and notwithstanding the known human-induced fragmentation of the SDTFs, the studied remnant population located in the CSJ Nature Reserve conserves high genetic diversity. According to palaeodistribution modelling and biogeographic studies (Werneck et al. 2011; Pennington and Lavin 2016), SDTFs generally represent temporally stable habitats over large time scales that are believed to have maintained stable population sizes through climatic fluctuations. Moreover, in long-lived forest trees, especially those with long-distance pollen dispersal, the overlapping generations and longevity may act as a “buffer” against the effects of short-term fragmentation (Jones and Hubbell 2006).

5 Conclusion

Knowledge of patterns of genetic structure within populations and of demographic history can help to understand future population trajectories and establish genetic conservation strategies. Our study in a remnant forest of A. colubrina var. cebil in the south of the Paranaense biogeographic province (CSJ Nature Reserve) found a mixed mating system and high levels of genetic diversity. Despite the fragmentation of the SDTF in this region, we observed moderate genetic structure (involving both FSGS and differentiated genetic clusters) as well as signatures of historical demographic expansions since the late Pleistocene. Our results suggest long-term population viability for the studied forest remnant and the value of CSJ Nature Reserve as a genetic reservoir. However, the high inbreeding coefficients, probably due to selfing combined with increased mating between relatives resulting from assortative mating, while not detrimental to current variability, suggest caution when using this stand as a source of material for reforestation.