Edinburgh Research Explorer Population and conservation genetics using RAD sequencing in four endemic conifers from South America

The emblematic conifer flora of Chile has experienced wide-scale habitat loss and fragmentation. Coupled with the complex topography of the region, this leads to a prediction of high levels of genetic drift and isolation resulting in strong population differentiation, and the potential for negative genetic consequences. To address the degree to which these predictions are realised, we conducted a comparative population genetic survey in four endemic conifers from South America, each of conservation concern and with a restricted distribution area. Between seven and ten populations per species were sampled, covering their entire natural distribution in Chile. We used restriction site-associated DNA markers (RAD-seq), with de novo assembly and optimisation, to accommodate the large and complex genome of conifers. The main finding was low levels of genetic structure in all four conifer species (F ST = 0.017–0.062). This pattern was strikingly consistent across all four species. These results are likely due to the extreme longevity of individuals restricting divergence due to genetic drift. Thus, despite the currently highly fragmented ranges of these conifer species, we did not detect evidence for high levels of population differentiation and genetic isolation. This suggests some resilience to negative genetic impacts of habitat fragmentation, and the longevity of the individual trees may provide considerable buffering capacity. We recommend follow-up studies focusing on genotyping the seedling generation, to assess whether there is any evidence for the early warning of genetic isola tion and/or elevated inbreeding in the currently established cohorts. RAD-seq library and DNA was extracted using the DNeasy Plant Minikit (QIAGEN, Netherlands), following the manufacturer’s protocols. Library construction for RAD-seq was performed using the enzyme PstI, for all four species, following the protocol of Baird et al. ( 2008 ) by Floragenex, USA. Each sample was labelled with a unique barcoded adaptor. Pilot data for 18 samples were generated on one lane of an Illumina HiSeq 4000 to evaluate the performance of the library preparation protocol. Subsequently, a further 95 samples were barcoded, pooled and sequenced using the same protocol on an Illumina HiSeq 4000. Both sequencing runs generated 100bp single-end reads and were performed at the University of Oregon.


Introduction
Globally, about one-third of conifer species are classed as threatened (Farjon 2018). In Chile, nine species are present, all endemic to southern South America, with six species globally threatened and the remainder classed as nationally threatened in Chile or neighbouring Argentina (Donoso Zegers 2006). Major threats to these species include clearance of land for agriculture and exotic plantations, changes to river drainage and water availability due to hydro-electric dams, residential and commercial development, elevated frequency of fires, timber extraction and harvesting, and climate change (Carranza et al. 2020). In addition to anthropogenic habitat degradation and fragmentation, topographic complexity adds another dimension to the isolation of populations of Chilean conifers. The landscape of Chile shows considerable heterogeneity with significant barriers to gene flow created by the mountains of the Andes, the Coastal Cordillera, and the intervening low-lying plateau of the Central Valley (Armesto 1995).
Taken together, the barriers caused by mountains and valleys and the more recent humandriven population loss should lead to genetic drift among isolated populations. This leads to a prediction of low levels of variation within individual populations, and significant differences between them (Frankham et al. 2004). In contrast, some key traits of conifers, particularly being wind-pollinated and long-lived trees are typically associated with high genetic diversity and low population differentiation (Awad et al. 2014). In some cases, the longevity of individual species is extreme, with some individuals of Alerce (Fitzroya cupressoides, Cupressaceae) estimated to be more than 3500 years old (Lara and Villalba 1993). Such a long lifespan offers considerable buffering potential against the loss of genetic diversity and genetic drift.
There is thus a set of conflicting expectations as to the levels of genetic isolation among populations of Chilean conifer species. This leads to uncertainty as to whether these species are likely to encounter genetic problems such as inbreeding depression or loss of genetic diversity due to the current patchy distribution of their populations. To date, only a few population genetic studies have been undertaken which provide data that can be used to tackle these questions. These include studies on Araucaria araucana (Monkey puzzle, Araucariaceae) (Rafii and Dodd 1998;Bekessy et al. 2002;Ruiz et al. 2007;Marchelli et al. 2010;Martín et al. 2014;Varas-Myrik et al. 2022), Austrocedrus chilensis (Cupressaceae) (Pastorino et al. 2004;Arana et al. 2010;Souto et al. 2012;Colabella et al. 2014), Fitzroya cupressoides (Cupressaceae) (Veblen et al. 1976;Allnutt et al. 1999;Premoli et al. 2000aPremoli et al. , b, 2003, and Pilgerodendron uviferum (Cupressaceae) (Premoli et al. 2001Allnutt et al. 2003). These studies detected a range of insights into population genetic structure. Some studies detected patterns of genetic differentiation at a regional scale (Andes Mountains, Central Valley and Coastal Range); e.g. in F. cupressoides (Allnutt et al. 1999;Premoli et al. 2000b), or between high elevation northern Andean and central Andean populations and lower elevation populations from the Andes and the Coastal Mountains in A. araucana (Ruiz et al. 2007). Others detected different patterns, such as genetic differentiation along a north-south gradient along the Andes in A. araucana (Bekessy et al. 2002).
However, one challenge in interpreting data from these studies is that almost all are based on a set of molecular markers which have limitations in terms of understanding genomewide levels of genetic variation and differentiation. For instance, plastid DNA markers are useful for detecting phylogeographic history, but their slow mutation rates, uni-parentally inherited nature and linkage of all plastid markers restrict insights into general population structure (Choi et al. 2019). Likewise, although isozymes are bi-parentally inherited and include multiple independent loci, they are fundamentally conserved loci, potentially subject to selection, and limited to a small number of loci that can be easily assayed (typically, ten or less, Ghowsi 2012). Finally, several studies on southern South American conifers have involved Randomly Amplified Polymorphic DNA (RAPDs) (e.g. Allnutt et al. 1999Allnutt et al. , 2003Alluntt et al. 2001;Bekessy et al. 2002). This approach, although providing data from multiple nuclear loci, has many interpretation challenges in part due to the lack of reproducibility of the fragment profiles the technique produces (Penner et al. 1993). One exception is a recent study by Varas-Myrik et al. (2022) which used restriction-site associated DNA sequencing (RAD-Seq) to study genetic variation in A. araucana.
To gain additional detailed insights into the population genetic structure of Chilean conifers it is desirable to increase the number of studies using large numbers of unlinked nuclear DNA markers. The use of reduced representation sequencing methods, such as RAD-seq, which enables the recovery of hundreds or thousands of nuclear loci (Andrews et al. 2016) is potentially useful, although, for organisms with large and complex genomes, this often requires species-specific optimisation to avoid issues with paralogy (Mastretta-Yanes et al. 2015;Paris et al. 2017).
In this study, we have used RAD-seq to investigate the level of population genetic differentiation in four emblematic endemic conifer species from South America: Saxegothaea conspicua, Prumnopitys andina, Podocarpus salignus (all members of the Podocarpaceae), and Fitzroya cupressoides, from the Cupressaceae. Specifically, we assessed (a) general levels of population genetic structure, (b) whether there is differentiation associated with geographical location (e.g., the Andes vs. Coastal Range), and (c) whether population differentiation is evenly or unevenly distributed throughout the species range.

Study species
Saxegothaea conspicua is a diploid (2n = 24), monoecious tree (Zonneveld 2012), that can reach up to 39m tall and 230cm diameter at breast height (DBH), usually with multiple stems (Cano et al. 2014). It grows in Chile and Argentina, with most populations (90%) located in the temperate forest of Chile (35-46º S). It grows from sea level in the Coastal Range to ca. 1000m in the Andes, being more abundant alongside streams (Farjon 2013). The IUCN classification for the species is Near Threatened (NT), however, there is a possibility that it could be listed as Vulnerable (VU) under criteria A2, A3 or A4 (Thomas 2019). The primary current threats to this species are associated with the conversion of land for exotic plantations (Pinus, Pinaceae and Eucalyptus, Myrtaceae) and trampling and browsing of seedlings and saplings by livestock, with these pressures being mostly concentrated in Coastal populations (Gardner et al. 2006).
Prumnopitys andina is a dioecious (rarely monoecious) tree species with a chromosome number of 2n = 38 (Hair and Beuzenberg 1958). It is presumed to be a diploid (Ohri 2021). Prumnopitys can reach up to 15m tall with a diameter of up to 1m (Donoso Zegers 2006). It is endemic to Chile with its distribution mostly in the Andean Mountains from 35 to 39 ºS. Its altitudinal range varies between 400m (in Malleco) to 1300m above sea level at its northern distribution limit (personal observation). The species has been classified as Vulnerable (VU) by the IUCN. The only Coastal population that remains is severely threatened by afforestation with exotic plantations (Pinus radiata and Eucalyptus globulus) and livestock grazing. In the Andes, populations have been severely reduced by hydroelectric schemes, volcanic activity (e.g., in the Parque Nacional Conguillío) and illegal logging. The species only occurs in three protected areas (Parque Nacional Conguillío, Tolhuaca and the Reserva Nacional Ñuble), which represent a small proportion of the total population.
Podocarpus salignus is a dioecious tree species with a chromosome number first recorded as 2n = 38 (Hair and Beuzenberg 1958) and later as 2n = 40 (Hair 1966). It is presumed to be a diploid (Ohri 2021), although elsewhere in the genus, some Podocarpus species have fewer chromosomes (ranging from 2n = 20-24, Hair 1966), and it is possible that P. salignus is a paleopolyploid. Podocarpus salignus can reach up to 20m tall with a trunk up to 1m in width (Donoso Zegers 2006). It is endemic to Chile, distributed along both the Andean and Coastal Mountain Ranges from 35 to 40 ºS. The altitudinal range of the species is from sea level up to 1000m. The greatest differences in environmental conditions are between the warmer and drier Coastal Range and the cooler and wetter Andes. The species is listed as vulnerable (VU) by the IUCN. Podocarpus salignus has a naturally limited distribution which, combined with the conversion of its habitat to commercial plantations (mostly in the northern part of its range), and the devastating effects of fire, have resulted in a dramatic decrease over the last three generations (60 years) to < 30% of the original cover (Alluntt et al. 2001). The species is affected mostly by agricultural activities, forestry plantations and livestock (Cano 2020). In Chile, there are just a few populations within protected areas. In the Coastal Range, the species occurs within the Reserva Nacional Los Ruiles (Maule region) and Parque Oncol (Los Ríos region). In the Andes, the species is protected within the Reserva Nacional Malleco (Araucanía region).
Fitzroya cupressoides is a dioecious tetraploid species (2n = 4x = 44, Zonneveld 2012). The maximum age reported for individuals of Fitzroya is 3620 years (Lara and Villalba 1993). The species can reach up to 50m tall and 5m diameter (Donoso Zegers 2006). The species occurs in a variety of vegetation types at altitudes between 100m in the Coastal Range and 1200m above sea level in the Andes, with most of the forests occurring between 500 and 800m above sea level. The species also occurs in the Central Valley in Chile. Fitzroya is categorised as Endangered (EN) by the IUCN. The species is also internationally protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES), forbidding its commercialisation in any form. Similarly, Chile declared F. cupressoides as a Natural Monument in 1976 and together with Argentina prohibits the logging of the species. In Chile, where 90% of the population occurs, only 18% of its forests are inside the National System of Protected Wilderness Areas; whereas, in Argentina, 85% of the forests are inside the National System of Protected Wilderness Areas (Kitzberger et al. 2000;Landrum 2006). The biggest threat to the species has been from historical overexploitation, mainly due to its highly prized wood (Veblen and Ashton 1982), causing the loss of almost half of the original cover (Lara et al. 2008). Fire and the conversion of forests to pasture or exotic plantations are also significant current threats, mostly in the Central Valley and the Coastal Range populations .
A total of 113 individuals were sampled across the natural distribution of S. conspicua, P. andina, P. salignus, and F. cupressoides in Chile, and locations were chosen to include areas from the Andes, Coastal Range and the Central Valley. Between 7 and 9 populations were included per species (Fig.1) and between 2 and 5 samples per population. Each individual was randomly selected at least 50m from any other sampled individual to avoid sampling closely related individuals. Leaf tissue (needles) was stored in silica gel after collection.

DNA extraction, RAD-seq library construction and sequencing
DNA was extracted using the DNeasy Plant Minikit (QIAGEN, Netherlands), following the manufacturer's protocols. Library construction for RAD-seq was performed using the enzyme PstI, for all four species, following the protocol of Baird et al. (2008) by Floragenex, USA. Each sample was labelled with a unique barcoded adaptor. Pilot data for 18 samples were generated on one lane of an Illumina HiSeq 4000 to evaluate the performance of the library preparation protocol. Subsequently, a further 95 samples were barcoded, pooled and sequenced using the same protocol on an Illumina HiSeq 4000. Both sequencing runs generated 100bp single-end reads and were performed at the University of Oregon.

Data analysis
Raw reads were processed with STACKS version 2.4. Firstly, process_radtags was used to demultiplex reads based on unique barcode sequences and a suitable cut site. Raw reads were filtered to remove those with a Phred quality score below 10. Due to the lack of a reference genome, de novo assembly was performed for each species independently. The main parameters that control de novo assembly in STACKS (m, M and n) were optimised for each of the species separately following the protocol of Catchen et al. (2011) and Paris et al. (2017), to reduce biases and problems of under or over-merging reads (Catchen et al. 2013). Here, m is the minimum stack depth, which controls the minimum number of raw reads required to form a putative allele; M is the distance allowed between stacks, which controls the number of mismatches allowed between stacks to merge them into a putative locus (mismatches are allowed between the two alleles of a heterozygote sample) and n is the distance allowed between catalog loci, which controls the number of mismatches allowed between any two alleles of the population (Catchen et al. 2011;Paris et al. 2017). The optimised de novo assembly parameters for each species were as follows: S. conspicua and P. salignus: m5 M4 n4; P. andina: m4 M3 n3; F. cupressoides: m4 M2 n2 (Cano 2020).

Data filtering and calculation of population differentiation
Genotyping was carried out using the populations pipeline in STACKS. Applying the R filter, loci were recovered that were present in at least 60% of the individuals in each data set (R60). Next, using Tassel version 5.0 (Bradbury et al. 2007) the level of missing data was calculated for each data set. Individuals with a high level of missing data (> 50%) were removed (13 samples in total: 3 samples of S. conspicua, P. andina and F. cupressoides and 4 samples in P. salignus). The populations pipeline in STACKS was rerun on the remaining samples (Table1), recovering R60 loci. A minor allele frequency (MAF) filter of 0.02 and a maximum observed heterozygosity of 0.5 were also applied. These filtering steps are intended to discard rare alleles (which could be sequencing errors) and minimise potential paralogs, respectively. Using the write-single-snp function from the population pipeline in STACKS, only unlinked SNPs were retained. Finally, only loci that were present in at least one sample per population were selected. This is to ensure that genotype information is present in all populations to minimise potential biases due to missing data.
Global F ST and pairwise F ST were calculated with 1000 bootstrap replicates for global F ST , and 100 bootstrap replicates for pairwise F ST , using diveRsity (Keenan et al. 2013). A test for isolation by distance (IBD, Slatkin 1993, 1987 was undertaken for each species by investigating the correlation between F ST /(1-F ST ) and geographical distance (Km), with the distance estimated as the linear distance between sampling localities, with significance assessed by a Mantel test with 999 permutations implemented in GenAlEx. To assess the relationships between populations, a Discriminant Analysis of Principal Components (DAPC, Jombart et al. 2010) was performed using the R package adegenet (Jombart 2008;Jombart and Ahmed 2011). To establish the optimal number of principal components for the DAPC analysis, cross-validation was performed with 100 replicates. To further explore the genetic structure, clustering was undertaken using STRUCTURE version 2.3.4 (Pritchard et al. 2000). STRUCTURE was run 10 times for each K-value from 1 to 9, depending on the maximum number of populations included in each data set. These analyses used a burn-in of 10,000 followed by 100,000 MCMC replicates using the admixture and correlated allele frequency options. To detect the optimal K value, the Evanno method (Evanno et al. 2005) was used in the Clustering Markov Packager Across K program (CLUMPAK, Kopelman et al. 2015).

Genomic sequencing statistics
More than 95% of the total initial raw reads in each species were retained after running the process_radtags pipeline to demultiplex reads. Saxegothaea conspicua had on average 9 million reads per sample (representing 97.6% of the total initial raw reads); P. andina had 7.7 million filtered reads (98.2%); P. salignus 6million reads (96.9%), and F. cupressoides 6.8 million reads (96.5%). For the species S. conspicua the median coverage was 87X with a minimum of 32X and a maximum of 103X. For P. andina the median coverage was 90X with a minimum of 52X and a maximum of 127X. For P. salignus the median coverage was 88X with a minimum of 33X and a maximum of 111X and for F. cupressoides the median coverage was 89X with a minimum of 82X and a maximum of 95X.
The greatest number of filtered SNPs were retained in P. andina, with 4915 SNPs; followed by P. salignus with 4212 SNPs and S. conspicua with 3137 SNPs. The fewest SNPs were found in F. cupressoides, with a total of 853 SNPs (Table2).

Population structure
Low levels of genetic differentiation were observed in all four species, with global measures of population differentiation ranging from F ST =0.017 in F. cupressoides, F ST =0.045 in P.   Fig. S1). None of the pairwise comparisons between populations showed significant differences in any species. Moreover, the Mantel test for the four conifer species showed that there was no significant relationship between genetic distance (measured as F ST / (1-F ST )) and geographical distance (Supplementary information Fig. S2). When the four conifer species were analysed with DAPC, there was a weak signal of population clustering in three of the four species (S. conspicua, P. andina and F. cupressoides). The species with the widest distribution (S. conspicua) showed a slight division between the northern and the southern Andes populations and also with some Coastal populations (Fig.2a). The other species with a smaller distribution (P. andina, P. salignus and F. cupressoides) showed less clear division between populations and regions, such as P. salignus that retained only one discriminant function, grouping all populations as a single cluster (Fig.2c). The DAPC in F. cupressoides tended to divide the Coastal Range populations and the Andes and Central Valley populations (both the Andes and the Central Valley grouped together) (Fig.2d). However, the general pattern for all four species was that individuals from different regions and populations also cluster together, with no evidence for clear differentiation between them. The optimal K-value determined with the Evanno method was K = 6 for S. conspicua and P. salignus, K = 2 for P. andina and K = 5 for F. cupressoides (Supplementary information Fig. S3a, b). However, this clustering did not correspond to any grouping of individuals by their source population, or of proximal populations.

Discussion
This study generated RAD-seq data for four endemic southern South American conifer species, with our most notable finding being a consistent signature of low levels of population differentiation. Here, we briefly summarise the information content of the RAD data for the study species, before focusing on the potential biological and technical factors underlying this low level of genetic structure within species and discuss the potential implication for future conservation work.

Information content of the RAD-seq data
The RAD-seq approach recovered between three and four thousand SNPs from three of the four species (P. andina 4915 SNPs, P. salignus 4212 SNPs, and S. conspicua 3137 SNPs). Although this is a modest number of variable loci compared to some other RAD-seq studies, data from several hundred to a few thousand loci are appropriate for studies of geographic population structure and gene flow (Andrews et al. 2016). The number of SNPs retained in Fitzroya was somewhat lower (853 SNPs) and this may simply reflect a lower genetic diversity in this species, compared to the other three species. An alternative explanation is that our data filtering has removed comparatively more loci from Fitzroya. Of the four study species, Fitzroya is the one considered most likely to be a polyploid, and if there are duplicated regions of the genome, then our steps to remove paralogous loci would discard data from these duplicated regions and reduce the number of recovered SNPs in the dataset (see later for further discussion on polyploidy and Fitzroya). Genetic differentiation was low in all four species, as measured by global F ST (range: 0.017-0.062) or by the lack of population or regional clustering in the DAPC and STRUC-TURE analyses. Comparing these estimates of population structure with previous studies of southern South American conifers (Table3), the general picture was our data detected somewhat lower population differentiation. Allnutt et al. (1999), Allnut et al. (2001) and Martinez Araneda (2011) detected low but significant genetic differentiation among populations for F. cupressoides, P. salignus and P. andina, respectively. Using RAPD markers, Allnutt et al. (1999) and Allnut et al. (2001) reported mean Phi ST =0.143 in F. cupressoides and mean Phi ST =0.069 in P. salignus. In contrast, a study using isozyme markers observed a higher level of genetic differentiation in P. salignus with F ST =0.448 (Quiroga 2009). The same study reported F ST =0.082 in S. conspicua and F ST =0.140 in P. andina, the latter with an F ST value not statistically different from zero (p-value > 0.05). However, research using cpDNA markers by Martinez Araneda (2011) found G ST =0.184 in P. andina (p value < 0.05, significantly different from zero). A number of factors may explain the lower population structure detected in our study. Firstly, the differences could be associated with the different molecular data used. RAPD provides data from multiple nuclear loci; however, the lack of reproducibility of these markers could generate biases. In contrast, RAD-seq generates a large number of SNP markers across the genome, though these data are not without challenges related to bioinformatic filtering of potential paralogs (discussed more below). In addition, in the case of Fitzroya, Allnutt et al. (1999) had a much wider sampling incorporating populations from the eastern part of the Andes Range in Argentina. Therefore, the additional breadth of the sampling (including samples from either side of the Andes) might also be a factor involved in the higher levels of genetic differentiation they observed. The potential impacts of our sampling strategy and other technical artefacts are discussed below.

Potential non-biological reasons for low population structure
Before further considering biological interpretations of low population structure, we evaluate the potential influence of technical artefacts. For instance, it is possible that population differentiation is present, but that the sample sizes were too small to detect it, or that missing data or treatment of paralogous loci as orthologues obscured this population differentiation. In the case of small sample sizes (between 2 and 5 per pop), RAD-seq provides a huge number of SNPs, with the small sample size of individuals being compensated by the large sample size of loci from each individual (Jeffries et al. 2016;Nazareno et al. 2017;Qu et al. 2020). With this in mind, it is noteworthy that in individual-level analyses, no geographic or population-specific clustering is evident. This is shown in the DAPC analysis (and was also evident in basic PCA analysis and Neighbour-joining trees -unpublished results). Additional sampling of more individuals per population would be desirable, but this repeated lack of geographical structure in population and individual-based analyses suggest that the finding of low differentiation is robust to sample size effects.
In the case of missing data, we assessed the likelihood of this contributing to the observed low levels of population structure via a sensitivity analysis using STACKS software following the de novo optimisation protocol of Paris et al. (2017) and Catchen et al. (2011), testing  for biases and problems of under or over merging reads (Catchen et al. 2013). Adjustment of the parameters across a range of settings (e.g. m, M and N between 1 and 7) did not result in material changes to levels of inferred population structure (Cano 2020). We further examined the effects of changing the missing data threshold from 20 to 40% (R60 and R80) on individual-level analyses. Although there are impacts on the distribution of points in the DAPC plot, there is no material difference in population-specific clustering or clustering together of individuals from adjacent populations or regions (Cano, unpublished results). Finally, regarding the potential for undetected paralogy to influence the results, this is more difficult to exclude. This is particularly pertinent for F. cupressoides which has a chromosome number and genome size consistent with being a tetraploid (Farhat et al. 2021) and hence there is an enhanced likelihood of alleles and loci being confused. As a monotypic genus, with a closest extant relative that diverged c 45 mya (Crisp et al. 2018), it is quite possible that Fitzroya has experienced extensive diploidisation of the genome from an ancient polyploidy event and thus behaves like a diploid. On the other hand, previous isozyme studies (Premoli et al. 2000a) showed some evidence for loci behaving in a tetraploid fashion (unbalanced heterozygotes), consistent with tetrasomic inheritance and autoploidy at a small number of protein-encoding genes (although interestingly there was no evidence of excess heterozygosity compared to Hardy-Weinberg equilibrium as would be expected for an outcrossing autoploid). Although the precise nature of the ploidy status of Fitzroya remains unknown, we can say that in our data we saw no evidence for excess heterozygosity, and the data are consistent with the recovered loci behaving in a diploid fashion. This general principle held true in the other species -e.g., if our analyses contained many paralogous loci, we would expect to see highly inflated observed heterozygosity, where different loci were mistakenly treated as different alleles. No such heterozygosity excess was observed. It is also noteworthy that our estimates of F ST were robust to different assembly parameters suggesting that there is considerable stability in the data and an overriding signature of low population differentiation. In addition, RAD-seq has been successfully used in other large and presumably repeat-rich genomes such as cycads (Clugston et al. 2019).

Species
Area of study  Thus, taken together, we do not believe that undetected paralogy is acting as a major confounding variable in our analysis.

Biological factors that might impact population structure in Chilean conifers
Conifers are typically thought to show low genetic differentiation, even among geographically distant populations (O'Connell et al. 2007;Gamba and Muchhala 2020). One factor that is postulated as important here is very effective wind-borne pollen dispersal, which may contribute to maintaining common alleles over large geographical distances (Prunier et al. 2016). Indeed, Hamrick et al. (1992) in their review of 121 gymnosperms species revealed little genetic differentiation among populations, with an overall mean G ST =0.073 (based on allozyme data). However, these general observations are dominated by studies of northern hemisphere conifers that frequently occur in large continuous forest blocks. Thus, the high level of connectivity found in this investigation between populations is still somewhat unexpected, as the four species have a discontinuous distribution in Chile, with many populations separated by marked topographical barriers.
Most of the extant populations of Chilean conifers are restricted to the slopes of the Andean cordillera and to a lesser extent in the Coastal Range. This is mostly due to the impact of human activities, which have reduced a large proportion of the Central Valley and Coastal Range forests (Vergara et al. 2013). This range fragmentation coupled with natural barriers to gene flow stemming from the complex topography of Chile might be expected to lead to high levels of population differentiation. However, low levels of genetic differentiation (F ST range: 0.017-0.062) and weak geographical structure in clustering analyses were detected. This implies that the four species have not (yet) been affected by these geographical barriers and/or fragmentation. This might be associated with the intrinsically effective wind-borne pollen dispersal of these species. Evidence of long-distance pollen flow has been observed in the South American conifers A. chilensis (Colabella et al. 2014) and Araucana angustifolia (Bittencourt and Sebbenn 2007). Both studies detected moderate to high pollen movement among fragmented populations. However, an effective wind-borne pollen dispersal alone seems an unsatisfactory explanation for the populations of the Chilean conifers studied here, given the scale of the barriers and the distances between extant populations.
The sheer longevity of individuals of these species may be the critical factor in determining the observed levels of population differentiation. The four conifers involved in this research are long-lived trees, living for between 100 and 500 years for the Podocarpaceae species to over 3000 years in F. cupressoides (Lara and Villalba 1993). One of the Podocarpaceae species, S. conspicua, shows an additional adaptation to longevity namely adventitious roots (not seen in another Podocarp species yet). These adventitious roots grow down inside the hollowed-out bark of old trees of S. conspicua that allow the individuals to anchor themselves to the ground with the consequence of perpetuation in situ, extending their longevity. This is a process that might be repeated in individuals from generation to generation (Cano et al. 2014).
Against the timeline of this longevity, the major anthropogenic impacts on these species have been in the last 500 years for F. cupressoides and more recently for the Podocarpaceae species. During the 16-17th centuries, the Spanish colonist settlers burned large areas of forests (Gardner et al. 2006). They also began to over-exploit Fitzroya for its timber, an activity that became one of the primary economic resources in the south of Chile (Torrejón et al. 2011). The subsequent expansion of agricultural activities began in the 19th century and mostly affected the Central Valley in central southern Chile (Robles Ortiz 2003). After the 1970s the expansion of the forestry industry (Pinus and Eucalyptus) led to further losses, particularly in the Central Valley and the Coastal Range, and still affects a significant part of the native forests of Chile (Salas et al. 2016).
Over deeper timescales, species history might also have shaped the current patterns of genetic structure in these conifers. Palynological and glaciological evidence suggests that the last glaciation maximum (LGM) affected the distribution of the southern Andean forest, reducing its size (compared to the current distribution) due to extensive ice coverage (Villagrán 1991;Allnutt et al. 2003). Two main hypotheses have been proposed to explain the effect of the LGM on the existing tree species distributions in southern South America. The first is that tree species survived in multiple refugia and the second hypothesis is that they recolonised from a single refugium (Markgraf et al. 1995). Using isozyme markers Premoli et al. (2000b) assessed these hypotheses in F. cupressoides. They suggested that if the current populations originated from only one refugium, low levels of genetic differentiation should be detected across the range of F. cupressoides including between populations in the Andes and the Coastal Mountains. In contrast, multiple refugia could lead to a higher degree of genetic divergence between populations. Results by Premoli et al. (2000b) detected a degree of genetic variation between eastern and western populations (based mostly on DAPC analysis), leading them to suggest that the current F. cupressoides distribution would have originated from multiples refugia. Several other investigations have also suggested the existence of multiple small refugia for different species in the region, based on RAPD or isozyme nuclear markers (e.g. Bekessy et al. 2002;Marchelli and Gallo 2004;Pastorino and Gallo 2002) and plastid markers (e.g. Marchelli et al. 1998;Marchelli and Gallo 2006;Pastorino et al. 2009). In contrast, this investigation, based on SNPs arrays (RAD-seq) showed no signal of genetic structure in any of the four conifers. In this context, our results would suggest that the hypothesis of a forest expansion from a single refugium remains plausible. Another related hypothesis, from Marchelli et al. (2010) who observed no genetic differentiation between A. araucana (conifer species primarily found in the south of Chile) populations, suggested that the pre-Pleistocene distribution of the species could have been only partially reduced and it could be possible for there to have been multiple refugia without observing strong genetic differentiation among extant populations.
In summary, the low levels of genetic differentiation observed are associated with trees of extreme longevity which may have limited the impacts of recent anthropogenic fragmentation. The associated trait of wind pollination may also have served to maintain high levels of connectivity prior to contemporary habitat fragmentation. This lack of observed divergence may offer some intrinsic resilience to immediate loss of genetic diversity and population differentiation. This is potentially positive news from a conservation perspective, in that the highly fragmented distribution of these species may not yet be leading to genetic problems associated with genetic drift, inbreeding and divergence in isolated (and often small) populations. However, as noted previously, it seems highly unlikely that many of these populations will be receiving a contemporary influx of pollen or seed flow. Thus, to err on the side of caution, follow-up studies are recommended to genotype seedlings to check for an early warning of any sign of the erosion of neutral genetic diversity, and/or elevated inbreeding (Neto et al. 2014).